Snapshotting

This guide covers the local, offline half of snapdir: turning a directory into a content-addressed snapshot. Nothing here touches the network — you compute a manifest, derive its ID, and (optionally) stage the snapshot's objects into your local cache so they are ready to push.

A snapshot is described by a manifest (a UTF-8 text listing of every file, its mode, size, and BLAKE3 checksum) and identified by the snapshot ID, which is the BLAKE3 hash of that manifest. Identical content, described with the same options, always yields a byte-identical manifest and therefore the same ID on any machine. See Manifests and Content addressing for the underlying model.

Describe a directory with snapdir manifest

snapdir manifest walks a directory and prints its manifest to stdout without writing anything to a store. It is the quickest way to inspect exactly what a snapshot would contain:

snapdir manifest ./my-dir

By default, paths are emitted relative to the directory with a leading ./; pass --absolute for absolute paths. Symlinks are followed unless you pass --no-follow. To leave out generated or machine-local files, pass --exclude <PATTERN>. The pattern is an extended regular expression and understands the %system% and %common% macros for the usual noise files.

# Everything except VCS metadata and build output.
snapdir manifest --exclude '\.git/|target/' ./my-dir
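Before committing to a pattern, it can help to preview which paths an extended regex would drop. The sketch below uses find and grep -E as a stand-in. It assumes the pattern is matched anywhere in the ./-relative path, which may differ from snapdir's exact matching rules, and preview_excluded is a hypothetical helper, not a snapdir command.

```shell
#!/bin/sh
# preview_excluded PATTERN DIR
# Lists the files under DIR whose ./-relative path matches PATTERN.
# Hypothetical helper: it approximates --exclude with grep -E and does not
# expand snapdir's %system% / %common% macros.
preview_excluded() {
    pattern=$1
    dir=$2
    (
        # Run find from inside DIR so paths come out ./-relative.
        cd "$dir" || exit 1
        # grep exits 1 when nothing matches; "|| true" treats that as success.
        find . -type f | sort | grep -E -e "$pattern" || true
    )
}
```

Calling preview_excluded '\.git/|target/' ./my-dir prints the files the pattern above would match, one per line.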

snapdir hashes with BLAKE3 by default. If you need to interoperate with other tooling, select a different algorithm with --checksum md5 or --checksum sha256. The algorithm is part of the manifest, so it changes the resulting ID.

For large trees, tune walk parallelism with --walk-jobs <N> or the SNAPDIR_WALK_JOBS environment variable. A value of 0 or auto selects a capped CPU count. The setting is distinct from transfer concurrency (--jobs / SNAPDIR_JOBS). It is purely a performance knob: manifest bytes and snapshot IDs are identical for every --walk-jobs value.
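To check on your own tree that walk parallelism leaves the ID untouched, compare a serial walk against an automatic one. This is a minimal sketch: ids_match_across_walk_jobs is a hypothetical helper, and it assumes snapdir is on PATH and that snapdir id accepts --walk-jobs ahead of the path.

```shell
#!/bin/sh
# ids_match_across_walk_jobs DIR
# Computes the snapshot ID with one walk job and with the automatic setting,
# then succeeds only if both IDs are equal.
# Assumption: snapdir id accepts --walk-jobs placed before the path.
ids_match_across_walk_jobs() {
    dir=$1
    serial=$(snapdir id --walk-jobs 1 "$dir") || return 1
    auto=$(snapdir id --walk-jobs auto "$dir") || return 1
    [ "$serial" = "$auto" ]
}
```

The function returns 0 when the two IDs agree, so it can be used directly in an if statement or a CI step.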

The input path is lexically normalized before the walk, so foo, ./foo, foo/, and ./foo/ all describe the same directory and produce an identical manifest and snapshot ID. Normalization is purely lexical — symlinks and .. segments are preserved (snapdir does not canonicalize()), and the manifest format is unchanged.
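The idea of lexical normalization can be illustrated in plain shell. The helper below is hypothetical and is not snapdir's implementation. It only trims leading ./ prefixes and trailing slashes from a relative path, never touches the filesystem, and leaves symlinks and .. segments exactly as written.

```shell
#!/bin/sh
# normalize_input PATH
# Purely lexical cleanup of a directory argument: drops leading "./" prefixes
# and trailing slashes. Hypothetical helper for illustration only.
normalize_input() {
    p=$1
    # Strip every trailing slash, but leave a bare "/" alone.
    while [ "$p" != "/" ] && [ "${p%/}" != "$p" ]; do
        p=${p%/}
    done
    # Strip every leading "./", plus any extra slashes right after it,
    # so ".//foo" does not turn into the absolute path "/foo".
    while [ "${p#./}" != "$p" ]; do
        p=${p#./}
        while [ "${p#/}" != "$p" ]; do
            p=${p#/}
        done
    done
    # An input such as "./" or "" reduces to nothing; report it as ".".
    [ -n "$p" ] || p=.
    printf '%s\n' "$p"
}
```

With this helper, foo, ./foo, foo/, and ./foo/ all print foo, while ./a/../b/ prints a/../b because the .. segment is kept.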

snapdir assumes the tree is quiescent while it is being walked. If the tree is in flux, snapshotting fails instead of recording an incoherent manifest: files that grow, shrink, or are replaced report file changed during walk; files that disappear report file vanished during walk; and directory removals or shape changes report tree structure changed during walk. The error names the path and exits non-zero. On a static tree, these checks do not change the manifest or snapshot ID.
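Because a busy tree makes the command exit non-zero, a script can wait and try again once writers have settled. The wrapper below is a sketch under stated assumptions: id_when_quiet is a hypothetical helper, snapdir is assumed to be on PATH, and the retry policy is deliberately coarse (any non-zero exit is retried, not only the walk errors listed above).

```shell
#!/bin/sh
# id_when_quiet DIR [ATTEMPTS] [DELAY_SECONDS]
# Retries snapdir id until it succeeds or ATTEMPTS runs out, then prints the ID.
# Hypothetical helper; snapdir's own error message still reaches stderr.
id_when_quiet() {
    dir=$1
    attempts=${2:-3}
    delay=${3:-2}
    n=1
    while [ "$n" -le "$attempts" ]; do
        if id=$(snapdir id "$dir"); then
            printf '%s\n' "$id"
            return 0
        fi
        echo "attempt $n of $attempts failed for $dir" >&2
        n=$((n + 1))
        # Pause before the next attempt, but not after the last one.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

Retrying only makes sense when the writer is expected to finish. If the tree never settles, the helper returns 1 and no ID is printed.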

See snapdir manifest for the full option list.

Get just the ID with snapdir id

When you only want the identifier — for a cache key, a CI assertion, or a provenance record — use snapdir id. It prints the snapshot ID and nothing else:

snapdir id ./my-dir

Run it again on the same content (even on another machine) and you get the exact same ID. When PATH is omitted, snapdir id can read a manifest from stdin, so you can hash a manifest you produced earlier without re-walking the directory:

snapdir manifest ./my-dir | snapdir id

This pairs naturally with CI: capture the ID, then assert it later to prove a restored or rebuilt tree is byte-for-byte identical. See snapdir id.
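A CI assertion along those lines might look like the following sketch. assert_snapshot_id is a hypothetical helper, snapdir is assumed to be on PATH, and the expected ID is assumed to have been recorded by an earlier step.

```shell
#!/bin/sh
# assert_snapshot_id EXPECTED_ID DIR
# Recomputes the ID of DIR and fails if it differs from EXPECTED_ID.
# Returns 0 on a match, 1 on a mismatch, 2 if snapdir id itself fails.
assert_snapshot_id() {
    expected=$1
    dir=$2
    actual=$(snapdir id "$dir") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "snapshot ID mismatch for $dir" >&2
        echo "  expected: $expected" >&2
        echo "  actual:   $actual" >&2
        return 1
    fi
    echo "snapshot ID verified: $actual"
}
```

Keeping the walk failure (2) separate from a genuine mismatch (1) lets the pipeline tell a broken run apart from a changed tree.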

Stage a snapshot into the local cache with snapdir stage

snapdir manifest and snapdir id only read the directory. To actually save a snapshot's objects locally — so they survive edits to the source tree and are ready to push offline — use snapdir stage. It copies every object into the local cache and prints the snapshot ID:

id=$(snapdir stage ./my-dir)

echo "$id"

Staging is content-addressed and incremental: objects already in the cache are never copied again, so re-staging a slightly changed directory only writes the new objects. Local file-store and cache writes may use copy-on-write clone or reflink fast paths when the filesystem supports them, but staging does not create a linked checkout. Override the cache location with --cache-dir <DIR> or the SNAPDIR_CACHE_DIR environment variable. Staging uses the same directory walk as manifest and id, so --exclude, --walk-jobs, and the quiescent-tree requirement apply here too.
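For example, a project might keep its own cache rather than sharing the default one. The sketch below sets SNAPDIR_CACHE_DIR for a single invocation. stage_into is a hypothetical helper, snapdir is assumed to be on PATH, and creating the directory up front is a precaution rather than a documented requirement.

```shell
#!/bin/sh
# stage_into CACHE_DIR DIR
# Stages DIR into CACHE_DIR and prints the snapshot ID that snapdir reports.
stage_into() {
    cache=$1
    dir=$2
    # Precaution only: make sure the cache directory exists before staging.
    mkdir -p "$cache" || return 1
    # The variable applies to this one command, not to the rest of the script.
    SNAPDIR_CACHE_DIR="$cache" snapdir stage "$dir"
}
```

Running it a second time on an unchanged directory should print the same ID, since objects already in the cache are not copied again.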

Once a snapshot is staged, push it to a store when you want a shareable, store-verifiable copy. snapdir verify verifies a snapshot in a store by ID.

Where to go next