# Agentic vulnerability auditing

Snapshot the dependency tree a build actually used, diff plain-text manifests to
isolate exactly what changed since the last audited release, and spend expensive
agentic security review on that **delta** — while cheap advisory-database
scanners keep covering the whole tree.

## The problem

Deep dependency review does not scale by brute force. Agentic security
reviewers such as Anthropic's Mythos read dependency source the way a human
auditor would, and their cost grows with every file they are given — yet between
two releases most of a dependency tree is byte-for-byte unchanged, so reviewing
everything mostly re-reads code that was already audited. Targeting the review
is hard, though, because the usual inputs are weak: a lockfile *claims* what
should be installed, not what the build actually used, and diffing two installed
trees by hand is exactly the tedious work nobody does. And the security team
often operates in a restricted environment with no registry or package-manager
access, while still needing the exact bytes the build consumed.

## Why snapdir

- **Proof of what the build used.** Pushing the materialized tree right after
  dependency install records every file — direct and transitive dependencies
  alike — as bytes on disk under one content-addressed ID. This complements an
  SBOM rather than replacing it: an SBOM names the packages; the snapshot proves
  the bytes.
- **A plain-text audit surface.** The [manifest](../concepts/manifests.md) is
  one line per file or directory, with exactly five single-space-separated
  fields — `TYPE PERMS CHECKSUM SIZE PATH` — content checksums (BLAKE3 by
  default), sorted by path, comments starting with `#`. Standard Unix text
  tools can work on it directly.
- **Audit-once-per-content.** Review verdicts keyed on the `CHECKSUM` column
  transfer across paths, packages, and releases: content whose checksum appears
  in any previously audited manifest never needs re-reading by the deep
  reviewer, even after a rename. (Checking *unchanged* content against newly
  disclosed vulnerabilities remains the advisory-database scanners' job — see
  the two-tier pairing in the walkthrough.)
- **Verified hand-off to restricted environments.** `snapdir pull` re-hashes
  every object on the way in, so the analysis host gets a byte-identical copy
  of the exact tree without touching a registry.
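
Because the manifest is plain text, ordinary tools can answer audit questions about it without any snapdir-specific code. The helpers below are a minimal sketch: the function names are illustrative rather than snapdir commands, and they assume only the five-field layout described above.

```sh
# Illustrative helpers, not snapdir commands. They assume only the
# five-field layout `TYPE PERMS CHECKSUM SIZE PATH` described above.

# Count the file (F) entries in a manifest.
manifest_file_count() {
  grep -c '^F ' "$1"
}

# Sum the SIZE column over file entries. Treating SIZE as a byte count
# is an assumption made for this sketch.
manifest_file_bytes() {
  awk '$1 == "F" { total += $4 } END { print total + 0 }' "$1"
}

# List every path that carries a given content checksum.
paths_for_checksum() {
  awk -v sum="$2" '$1 == "F" && $3 == sum' "$1" | cut -d' ' -f5-
}
```

For example, `paths_for_checksum deps.manifest "$checksum"` would show every place a file with that content sits in the tree, which is the lookup that makes a verdict transferable across paths.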

## Walkthrough

**1. Capture the materialized tree in CI.** Right after dependency install,
snapshot the directory the package manager produced — `node_modules`, a cargo
vendor directory, a `pip install --target` directory — and save the plain-text
manifest as a build artifact. Snapshot a quiescent tree (no postinstall step
still writing to it), so the saved manifest and the pushed ID describe the same
bytes; the assertion makes that correspondence checkable:

```sh
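# Push the installed tree and capture the snapshot ID it reports.
deps_id=$(snapdir push --store s3://builds/deps ./vendor)
# Save the plain-text manifest as a build artifact.
snapdir manifest ./vendor > deps.manifest
# Fail the build if the tree no longer matches the pushed ID.
[ "$(snapdir id ./vendor)" = "$deps_id" ] || { echo "vendor changed mid-capture" >&2; exit 1; }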
```

**2. Compute the delta against the last audited release.** Here
`current.manifest` is this build's `deps.manifest`, and `audited.manifest` is
the same artifact saved by the last release that passed review. The set of
manifest lines new since that audit is one `comm -13` away. The explicit `sort`
matters: manifests are ordered by the path field, but `comm` compares whole
lines, so re-sort both sides first:

```sh
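# Process substitution <(...) needs bash, zsh, or ksh rather than plain sh.
# LC_ALL=C gives sort and comm the same byte-wise ordering.
export LC_ALL=C
comm -13 <(grep -v '^#' audited.manifest | sort) \
         <(grep -v '^#' current.manifest | sort) > delta.lines
cut -d' ' -f5- delta.lines   # paths only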
```

Comparing full lines is safe for paths that contain spaces — only the first
four columns are space-delimited, so everything after the fourth space is the
path. Parent directories appear in the delta as well, because a directory's
checksum is derived from its children; the `F` lines carry the content to
review.
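
Separating the `F` lines from the directory entries is a one-line filter. The sketch below wraps it in two small functions; their names are illustrative, and the only assumption is the five-field layout of the delta lines.

```sh
# Keep only file entries from a list of delta lines.
delta_files() {
  grep '^F ' "$1"
}

# Print just the paths of those entries. Paths with spaces survive
# because everything from field 5 onward is the path.
delta_file_paths() {
  delta_files "$1" | cut -d' ' -f5-
}
```

Running `delta_file_paths delta.lines` would then give the list of files a reviewer needs to open.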

**3. Skip content that is already audited.** Key past verdicts on the
`CHECKSUM` column: any delta line whose checksum appears in a previously
audited manifest is bytes the team has already read, whatever path, package, or
release it now lives in — a vendored file that merely moved is filtered out,
not re-reviewed. For an LLM-based reviewer this is the cost model: the input an
agent must read scales with the delta, not with the size of the tree. The
transfer is of the *bytes*, not the surrounding context — so the delta review
should still consider how new code calls into unchanged files, since that
interaction is only visible from the new side.
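
One way to apply that filter is a single `awk` pass that loads the checksums of every audited manifest and then prints only the delta entries it has not seen. This is a sketch under stated assumptions: `unaudited_entries` is a hypothetical helper name, and it presumes the team keeps the manifests of past audited releases as files.

```sh
# Print delta file entries whose CHECKSUM appears in no audited manifest.
# Usage: unaudited_entries delta.lines [audited.manifest ...]
# The helper name is hypothetical. Pass manifest paths such as
# ./r1.manifest so awk cannot mistake one for a name=value assignment.
unaudited_entries() {
  delta=$1; shift
  awk '
    /^#/ { next }                                 # skip manifest comments
    phase == "audited" {                          # first pass: remember checksums
      if ($1 == "F") seen[$3] = 1
      next
    }
    $1 == "F" && !($3 in seen)                    # second pass: print unseen content
  ' phase=audited "$@" phase=delta "$delta"
}
```

A call such as `unaudited_entries delta.lines ./audited/*.manifest` would drop a vendored file that merely moved, because its checksum is already in the audited set, and keep only entries with new bytes.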

**4. Pair the two review tiers.** Delta-only review has a blind spot it cannot
paper over: a newly disclosed vulnerability in an *unchanged* dependency never
shows up in the delta. So run cheap advisory-database scanners over the entire
tree on every build, and reserve the expensive agentic deep review for the
delta. The tiers answer different questions — "is anything here on a known-bad
list?" versus "does this new code do something hostile?"
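
The pairing can be expressed as a small CI gate. In the sketch below, `advisory_scan` and `deep_review` are placeholders, not real tools: they stand in for whatever scanner and agentic reviewer the team runs. The second argument is assumed to be a file listing the entries that still need deep review.

```sh
# Placeholders, not real tools: swap in the team's scanner and reviewer.
advisory_scan() { echo "advisory scan over: $1"; }
deep_review()   { echo "deep review of $(wc -l < "$1") file(s)"; }

# Tier 1 always covers the whole tree. Tier 2 runs only when there is
# unaudited content to read.
review_build() {
  tree=$1
  to_review=$2
  advisory_scan "$tree" || return 1
  if [ -s "$to_review" ]; then
    deep_review "$to_review" || return 1
  else
    echo "no unaudited content; deep review skipped"
  fi
}
```

Keeping the advisory scan unconditional is the point of the design: an empty delta skips the expensive tier but never the cheap one.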

**5. Pull the exact tree into the restricted analysis environment.**

```sh
snapdir pull --store s3://builds/deps --id "$deps_id" ./audit-tree
```

Every object is re-hashed against the manifest on
[fetch](../concepts/integrity.md), so the team audits a byte-identical copy of
what CI built — with no registry or package-manager access on the analysis
host. For the evidence and chain-of-custody side of that workflow, see
[vulnerability scanning in restricted environments](vulnerability-scanning-restricted-envs.md).

**6. Carry the verdict to the shipped artifact.** An audit of the inputs is
only worth as much as its link to the output. The manifest records each file's
path, permissions, and content checksum, and embeds no timestamps, so two
builds print the same snapshot ID exactly when their manifests are identical:
same file contents, same paths, same permissions. A clean rebuild from the
audited inputs that prints the same ID as the shipped artifact shows that those
inputs produce the shipped bits, and CI can gate on it:

```sh
ref=$(snapdir id ./dist)          # ID of the shipped build output
got=$(snapdir id ./dist-rebuild)  # ID of the clean rebuild
[ "$ref" = "$got" ] || { echo "build not reproducible: $ref != $got" >&2; exit 1; }
```
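
When the gate fails, the same manifest comparison used for the dependency delta can show which outputs differ. The helper below is a sketch with a hypothetical name; it assumes the two manifests were saved first, for example with `snapdir manifest ./dist > dist.manifest`.

```sh
# Print the entries that appear in only one of two manifests.
# Lines unique to the second manifest are indented with a tab (comm -3).
manifest_diff() {
  left=$(mktemp)
  right=$(mktemp)
  # Strip comments and sort both sides byte-wise so comm can compare them.
  grep -v '^#' "$1" | LC_ALL=C sort > "$left"
  grep -v '^#' "$2" | LC_ALL=C sort > "$right"
  LC_ALL=C comm -3 "$left" "$right"
  rm -f "$left" "$right"
}
```

An empty result means the two manifests describe the same files; any output names the paths whose content or permissions are not reproducible.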

## Outcome

Audit effort tracks what actually changed: each release, the delta falls out of
two text files CI already produces, content reviewed once is never read again
by the deep reviewer, and the agentic reviewers spend tokens only on genuinely
new bytes — while advisory-database scanners keep watching the unchanged
remainder for new disclosures. The same content-addressed IDs let the work
distribute. A fleet of agents on different machines (code review, triage,
pentest tooling) pulls one ID and gets identical bytes. A push transfers only
the objects absent from the store, content already in a host's local cache is
not fetched again, and [`--linked` checkouts](../guide/pushing-pulling.md) give
near-instant working copies. For fleets inside a private network,
[`ssh://` stores](../guide/stores.md#ssh-and-sftp-stores) serve the same
snapshots without a cloud bucket.