Snapshotting
This guide covers the local, offline half of snapdir: turning a directory into a content-addressed snapshot. Nothing here touches the network — you compute a manifest, derive its ID, and (optionally) stage the snapshot's objects into your local cache so they are ready to push.
A snapshot is described by a manifest (a UTF-8 text listing of every file, its mode, size, and BLAKE3 checksum) and identified by the snapshot ID — the BLAKE3 hash of that manifest. Identical content always yields a byte-identical manifest and therefore the same ID on any machine. See Manifests and Content addressing for the underlying model.
Describe a directory with snapdir manifest
snapdir manifest walks a directory and prints its manifest to stdout without
writing anything to a store. It is the quickest way to inspect exactly what a
snapshot would contain:
snapdir manifest ./my-dir
By default, paths are emitted relative to the directory with a leading ./; pass --absolute for absolute paths.
Symlinks are followed unless you pass --no-follow. To leave out generated or
machine-local files, pass --exclude <PATTERN>; the pattern is an extended regex
and understands the %system% / %common% macros for the usual noise files.
# Everything except VCS metadata and build output.
snapdir manifest --exclude '\.git/|target/' ./my-dir
snapdir hashes with BLAKE3 by default. If you need to interoperate with other
tooling, select a different algorithm with --checksum md5 or
--checksum sha256. The algorithm is part of the manifest, so it changes the
resulting ID.
For large trees, tune walk parallelism with --walk-jobs <N> or
SNAPDIR_WALK_JOBS. A value of 0 or auto selects a capped CPU count. This setting
is separate from transfer concurrency (--jobs / SNAPDIR_JOBS) and is purely a
performance knob: manifest bytes and snapshot IDs are the same for every
--walk-jobs value.
The input path is lexically normalized before the walk, so foo, ./foo, foo/,
and ./foo/ all describe the same directory and produce an identical manifest
and snapshot ID. Normalization is purely lexical — symlinks and .. segments are
preserved (snapdir does not canonicalize()), and the manifest format is unchanged.
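To make "purely lexical" concrete, here is a minimal sketch of that kind of normalization in POSIX shell. The helper name normalize_lexical is hypothetical and this is not snapdir's implementation; it only handles the spellings listed above (a leading ./ and trailing slashes), never touches the filesystem, and leaves .. segments alone.

```shell
# normalize_lexical is a hypothetical helper, not snapdir's implementation.
# It rewrites the path string only; it never resolves symlinks or "..".
normalize_lexical() {
    p=$1
    # Strip trailing slashes, but keep a lone "/" intact.
    while [ "$p" != "/" ] && [ "${p%/}" != "$p" ]; do
        p=${p%/}
    done
    # Strip leading "./" segments.
    while [ "${p#./}" != "$p" ]; do
        p=${p#./}
        # Drop any extra slashes that followed the "./" (as in ".//foo").
        while [ "${p#/}" != "$p" ]; do
            p=${p#/}
        done
    done
    # An empty result means the current directory.
    [ -n "$p" ] || p=.
    printf '%s\n' "$p"
}

# All four spellings print the same string: foo
for spelling in foo ./foo foo/ ./foo/; do
    normalize_lexical "$spelling"
done
```

Because the function works on the string alone, a path such as a/../b comes back unchanged, which matches the behavior described above.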
snapdir assumes the tree is quiescent while it is being walked. If the tree is
in flux, snapshotting fails instead of recording an incoherent manifest: files
that grow, shrink, or are replaced report file changed during walk; files that
disappear report file vanished during walk; and directory removals or shape
changes report tree structure changed during walk. The error names the path, and
snapdir exits non-zero. On a static tree, these checks do not change the manifest
or snapshot ID.
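If a writer is only briefly active, one way to cope with these failures is to retry the walk. The wrapper below is a sketch under that assumption. manifest_with_retry, its attempt count, and its delay are hypothetical and not part of snapdir. It retries on any non-zero exit, so it will also retry errors that are not transient.

```shell
# manifest_with_retry is a hypothetical wrapper, not a snapdir feature.
# Usage: manifest_with_retry DIR [ATTEMPTS] [DELAY_SECONDS]
manifest_with_retry() {
    dir=$1
    attempts=${2:-3}
    delay=${3:-2}
    tmp=$(mktemp) || return 1
    n=1
    while :; do
        # Buffer the output so a failed, partial walk never reaches stdout,
        # and so the manifest bytes are passed through unmodified.
        if snapdir manifest "$dir" >"$tmp"; then
            cat "$tmp"
            rm -f "$tmp"
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            rm -f "$tmp"
            echo "giving up on $dir after $n attempts" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

A call such as manifest_with_retry ./my-dir 5 1 would try up to five times, one second apart. Stopping the writer before snapshotting remains the more reliable fix.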
See snapdir manifest for the full option list.
Get just the ID with snapdir id
When you only want the identifier — for a cache key, a CI assertion, or a
provenance record — use snapdir id. It prints the snapshot ID and nothing else:
snapdir id ./my-dir
Run it again on the same content (even on another machine) and you get the exact
same ID. When PATH is omitted, snapdir id can read a manifest from stdin, so
you can hash a manifest you produced earlier without re-walking the directory:
snapdir manifest ./my-dir | snapdir id
This pairs naturally with CI: capture the ID, then assert it later to prove a
restored or rebuilt tree is byte-for-byte identical. See snapdir id.
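The capture-then-assert pattern could look like the sketch below. assert_snapshot_id is a hypothetical helper written for this guide, not a snapdir subcommand; the only snapdir call it makes is snapdir id <PATH>, as shown above.

```shell
# assert_snapshot_id is a hypothetical helper, not part of snapdir.
# Usage: assert_snapshot_id EXPECTED_ID DIR
# Returns 0 on a match, 1 on a mismatch, 2 if `snapdir id` itself fails.
assert_snapshot_id() {
    expected=$1
    dir=$2
    actual=$(snapdir id "$dir") || return 2
    if [ "$actual" != "$expected" ]; then
        printf 'snapshot ID mismatch for %s\n  expected: %s\n  actual:   %s\n' \
            "$dir" "$expected" "$actual" >&2
        return 1
    fi
}
```

In a pipeline, an early step might record expected=$(snapdir id ./my-dir), and a later step might run assert_snapshot_id "$expected" ./restored-dir and fail the job on a non-zero status. The directory name ./restored-dir is illustrative only.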
Stage a snapshot into the local cache with snapdir stage
snapdir manifest and snapdir id only read the directory. To actually save a
snapshot's objects locally — so they survive edits to the source tree and are
ready to push offline — use snapdir stage. It copies every object into the
local cache and prints the snapshot ID:
id=$(snapdir stage ./my-dir)
echo "$id"
Staging is content-addressed and incremental: objects already in the cache are
never copied again, so re-staging a slightly changed directory only writes the
new objects. Local file-store and cache writes may use copy-on-write clone or
reflink fast paths when the filesystem supports them, but staging does not create a linked checkout.
Override the cache location with --cache-dir <DIR> or the
SNAPDIR_CACHE_DIR environment variable. Staging uses the same directory walk as
manifest and id, so --exclude, --walk-jobs, and the quiescent-tree
requirement apply here too.
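As a small illustration of the cache override, the sketch below stages a directory into an explicit cache location. stage_into is a hypothetical helper; only snapdir stage and --cache-dir come from the text above. Creating the directory first is a precaution of this sketch, not a documented requirement.

```shell
# stage_into is a hypothetical helper, not part of snapdir.
# Usage: stage_into CACHE_DIR DIR
# Prints whatever `snapdir stage` prints (the snapshot ID).
stage_into() {
    cache_dir=$1
    dir=$2
    # Assumption: the cache directory may not exist yet, so create it.
    mkdir -p "$cache_dir" || return 1
    snapdir stage --cache-dir "$cache_dir" "$dir"
}
```

For example, id=$(stage_into "$HOME/.cache/my-project" ./my-dir) would capture the ID, where the cache path is purely illustrative. Setting SNAPDIR_CACHE_DIR in the environment is the equivalent way to apply the override to every command in a script.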
Once a snapshot is staged, push it to a store when you want a shareable,
store-verifiable copy. snapdir verify verifies a snapshot in a store by ID.
Where to go next
- Pushing and pulling: publish a staged snapshot to a store and restore it elsewhere.
- Stores: the file://, s3://, gs://, b2://, ssh://, and sftp:// backends.
- Creating a manifest without snapdir: what snapdir manifest computes, recreated by hand with b3sum.
- Quickstart: a full snapshot, push, and pull round-trip.
- Reference: snapdir manifest, snapdir id, snapdir stage.