Snapshotting

This guide covers the local, offline half of snapdir: turning a directory into a content-addressed snapshot. Nothing here touches the network — you compute a manifest, derive its ID, and (optionally) stage the snapshot's objects into your local cache so they are ready to push or verify.

A snapshot is described by a manifest (a UTF-8 text listing of every file, its mode, size, and BLAKE3 checksum) and identified by the snapshot ID — the BLAKE3 hash of that manifest. Identical content always yields a byte-identical manifest and therefore the same ID on any machine. See Manifests and Content addressing for the underlying model.

Describe a directory with snapdir manifest

snapdir manifest walks a directory and prints its manifest to stdout without writing anything to a store. It is the quickest way to inspect exactly what a snapshot would contain:

snapdir manifest ./my-dir

By default, paths are emitted relative to the directory, prefixed with ./; pass --absolute for absolute paths. Symlinks are followed unless you pass --no-follow. To snapshot only part of a tree, filter with --paths <PATTERN> (include) or --exclude <PATTERN> (exclude). The exclude pattern is an extended regex and understands the %system% / %common% macros for the usual noise files.

# Everything except VCS metadata and build output.
snapdir manifest --exclude '\.git/|target/' ./my-dir
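Because the manifest is plain text on stdout with ./-relative paths, two trees can be compared with ordinary text tools. The following is a minimal sketch, assuming a POSIX shell with mktemp and diff available; manifest_diff is a hypothetical helper written for this guide, not a snapdir command.

```shell
# manifest_diff DIR_A DIR_B  (hypothetical helper, not part of snapdir)
# Prints a line diff of the two manifests.
# Returns 0 if they match, 1 if they differ, 2 if a manifest could not be produced.
manifest_diff() {
    tmp=$(mktemp -d) || return 2
    if snapdir manifest "$1" > "$tmp/a" && snapdir manifest "$2" > "$tmp/b"; then
        if diff "$tmp/a" "$tmp/b"; then
            status=0
        else
            status=1
        fi
    else
        status=2
    fi
    rm -rf "$tmp"
    return "$status"
}
```

For example, manifest_diff ./my-dir ./my-dir-restored prints nothing and returns 0 when both trees would produce the same snapshot.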

snapdir hashes with BLAKE3 by default. If you need to interoperate with other tooling, select a different algorithm with --checksum md5 or --checksum sha256. The algorithm is part of the manifest, so it changes the resulting ID.

The input path is lexically normalized before the walk, so foo, ./foo, foo/, and ./foo/ all describe the same directory and produce an identical manifest and snapshot ID. Normalization is purely lexical: symlinks and .. segments are preserved (snapdir does not call canonicalize() on the path), and the manifest format is unchanged.
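One way to see the normalization in practice is to hash several spellings of the same path and compare the results. This is only a sketch: check_spellings is a hypothetical helper, and it relies solely on snapdir id printing the ID and nothing else.

```shell
# check_spellings PATH...  (hypothetical helper, not part of snapdir)
# Prints the shared snapshot ID if every spelling yields the same one.
# Returns 1 on the first mismatch, 2 if snapdir id itself fails.
check_spellings() {
    first=""
    for spelling in "$@"; do
        current=$(snapdir id "$spelling") || return 2
        if [ -z "$first" ]; then
            first=$current
        elif [ "$current" != "$first" ]; then
            echo "mismatch for $spelling: $current != $first" >&2
            return 1
        fi
    done
    echo "$first"
}
```

Calling check_spellings foo ./foo foo/ ./foo/ should print a single ID if the four spellings are treated as the same directory.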

See snapdir manifest for the full option list.

Get just the ID with snapdir id

When you only want the identifier — for a cache key, a CI assertion, or a provenance record — use snapdir id. It prints the snapshot ID and nothing else:

snapdir id ./my-dir

Run it again on the same content (even on another machine) and you get the exact same ID. snapdir id can also read a manifest from stdin, so you can hash a manifest you produced earlier without re-walking the directory:

snapdir manifest ./my-dir | snapdir id

This pairs naturally with CI: capture the ID, then assert it later to prove a restored or rebuilt tree is byte-for-byte identical. See snapdir id.
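A CI assertion of this kind might look like the sketch below. assert_snapshot_id is a hypothetical helper, and the expected ID is whatever value you captured earlier; nothing here is a built-in snapdir feature beyond snapdir id.

```shell
# assert_snapshot_id EXPECTED_ID DIR  (hypothetical helper, not part of snapdir)
# Returns 0 if DIR hashes to EXPECTED_ID, 1 on mismatch, 2 if snapdir id fails.
assert_snapshot_id() {
    expected=$1
    dir=$2
    actual=$(snapdir id "$dir") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "snapshot ID mismatch: expected $expected, got $actual" >&2
        return 1
    fi
    echo "snapshot ID verified: $actual"
}
```

In a pipeline you would record the ID in an early step (for example, expected=$(snapdir id ./my-dir)) and call assert_snapshot_id "$expected" ./restored-dir after the restore or rebuild, letting the non-zero exit status fail the job.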

Stage a snapshot into the local cache with snapdir stage

snapdir manifest and snapdir id only read the directory. To save a snapshot's objects locally, so that they survive edits to the source tree and are ready to push later, use snapdir stage. It copies every object into the local cache and prints the snapshot ID:

id=$(snapdir stage ./my-dir)

echo "$id"

Staging is content-addressed and incremental: objects already in the cache are never copied again, so re-staging a slightly changed directory writes only the new objects. Pass --linked to hardlink objects into the cache instead of copying them (faster, but the cache then shares storage with the source tree). Override the cache location with --cache-dir <DIR> or the SNAPDIR_CACHE_DIR environment variable.
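To keep a project's staged objects separate from the default cache, the environment variable can be set for a single invocation. The sketch below assumes only what is described above (SNAPDIR_CACHE_DIR and snapdir stage printing the ID); stage_into is a hypothetical helper, and creating the directory up front is a precaution rather than a documented requirement.

```shell
# stage_into CACHE_DIR SOURCE_DIR  (hypothetical helper, not part of snapdir)
# Stages SOURCE_DIR into CACHE_DIR and prints the snapshot ID.
stage_into() {
    cache=$1
    dir=$2
    # Assumption: create the cache directory ourselves in case snapdir does not.
    mkdir -p "$cache" || return 2
    # The variable applies to this one command only.
    SNAPDIR_CACHE_DIR=$cache snapdir stage "$dir"
}
```

For example, id=$(stage_into ./.snapdir-cache ./my-dir) stages into a cache next to the project. Running it a second time on unchanged content should print the same ID and, per the incremental behavior described above, copy no objects.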

Once a snapshot is staged you can confirm its integrity at any time with snapdir verify, which re-hashes the staged objects against the manifest.

Where to go next