Snapshotting
This guide covers the local, offline half of snapdir: turning a directory into a content-addressed snapshot. Nothing here touches the network — you compute a manifest, derive its ID, and (optionally) stage the snapshot's objects into your local cache so they are ready to push or verify.
A snapshot is described by a manifest (a UTF-8 text listing of every file with its mode, size, and checksum, BLAKE3 by default) and identified by the snapshot ID, which is the BLAKE3 hash of that manifest. Identical content, described with the same options, always yields a byte-identical manifest and therefore the same ID on any machine. See Manifests and Content addressing for the underlying model.
Describe a directory with snapdir manifest
snapdir manifest walks a directory and prints its manifest to stdout without
writing anything to a store. It is the quickest way to inspect exactly what a
snapshot would contain:
snapdir manifest ./my-dir
By default paths are emitted ./-relative; pass --absolute for absolute paths.
Symlinks are followed unless you pass --no-follow. To snapshot only part of a
tree, filter with --paths <PATTERN> (include) or --exclude <PATTERN>
(exclude); the exclude pattern is an extended regex and understands the
%system% / %common% macros for the usual noise files.
# Everything except VCS metadata and build output.
snapdir manifest --exclude '\.git/|target/' ./my-dir
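Before running a filter against a large tree, it can help to preview which paths a pattern would drop. The sketch below uses grep -E as a stand-in. It assumes that snapdir matches an exclude pattern the way an unanchored grep -E search would, against ./-relative paths; that behavior is not confirmed here, and the sample paths are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample of ./-relative paths, standing in for a real tree.
paths='./src/main.rs
./.git/HEAD
./target/debug/app
./docs/target/notes.md
./README.md'

# Same pattern as in the command above.
# Assumption: snapdir applies it like `grep -E`, unanchored, anywhere in the path.
pattern='\.git/|target/'

# Keep only the paths the pattern does NOT match.
kept=$(printf '%s\n' "$paths" | grep -Ev "$pattern")
printf '%s\n' "$kept"
```

Under that assumption only ./src/main.rs and ./README.md survive: because the pattern is unanchored, ./docs/target/notes.md is dropped along with the top-level target/ directory. If that is not what you want, a tighter pattern is worth testing the same way first.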
snapdir hashes with BLAKE3 by default. If you need to interoperate with other
tooling, select a different algorithm with --checksum md5 or
--checksum sha256. The algorithm is part of the manifest, so it changes the
resulting ID.
The input path is lexically normalized before the walk, so foo, ./foo, foo/,
and ./foo/ all describe the same directory and produce an identical manifest
and snapshot ID. Normalization is purely lexical — symlinks and .. segments are
preserved (snapdir does not canonicalize()), and the manifest format is unchanged.
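One way to check this on your own tree is to compute the ID for each spelling and count the distinct results. The helper below is a minimal sketch; count_distinct_ids is a hypothetical name, and it relies only on snapdir id accepting a directory path, as described in the next section.

```shell
#!/bin/sh
# Print how many distinct snapshot IDs the four equivalent spellings of a
# relative directory name produce. With lexical normalization as described
# above, the expected result is 1.
count_distinct_ids() {
    name=$1   # bare relative directory name such as "foo"
    for spelling in "$name" "./$name" "$name/" "./$name/"; do
        snapdir id "$spelling"
    done | sort -u | wc -l | tr -d ' '
}

# Usage:
#   count_distinct_ids my-dir
```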
See the snapdir manifest reference for the full option list.
Get just the ID with snapdir id
When you only want the identifier — for a cache key, a CI assertion, or a
provenance record — use snapdir id. It prints the snapshot ID and nothing else:
snapdir id ./my-dir
Run it again on the same content (even on another machine) and you get the exact
same ID. snapdir id can also read a manifest from stdin, so you can hash a
manifest you produced earlier without re-walking the directory:
snapdir manifest ./my-dir | snapdir id
This pairs naturally with CI: capture the ID, then assert it later to prove a
restored or rebuilt tree is byte-for-byte identical. See the snapdir id
reference.
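The capture-then-assert pattern could look like the following sketch. The function name assert_snapshot_id, its return codes, and the ./build-output path are hypothetical; the only snapdir call used is snapdir id with a directory argument.

```shell
#!/bin/sh
# Compare the snapshot ID of a directory against an expected value.
# Returns 0 on match, 1 on mismatch (both IDs go to stderr),
# and 2 if `snapdir id` itself fails.
assert_snapshot_id() {
    expected=$1
    dir=$2
    actual=$(snapdir id "$dir") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "snapshot ID mismatch: expected $expected, got $actual" >&2
        return 1
    fi
}

# Usage in a CI job:
#   expected=$(snapdir id ./build-output)    # captured earlier
#   ... restore or rebuild ./build-output ...
#   assert_snapshot_id "$expected" ./build-output || exit 1
```

Keeping the failure of snapdir id separate from a genuine mismatch makes it easier to tell a broken environment from a changed tree in the job log.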
Stage a snapshot into the local cache with snapdir stage
snapdir manifest and snapdir id only read the directory. To actually save a
snapshot's objects locally — so they survive edits to the source tree and are
ready to push offline — use snapdir stage. It copies every object into the
local cache and prints the snapshot ID:
id=$(snapdir stage ./my-dir)
echo "$id"
Staging is content-addressed and incremental: objects already in the cache are
never copied again, so re-staging a slightly changed directory only writes the
new objects. Pass --linked to hardlink objects into the cache instead of copying
them (faster, but the cache then shares storage with the source tree). Override
the cache location with --cache-dir <DIR> or the SNAPDIR_CACHE_DIR environment
variable.
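As a sketch of how these options combine, the wrapper below stages a directory into an explicit cache and optionally hardlinks. The name stage_into and the /tmp/snapdir-cache path are hypothetical, and placing --cache-dir and --linked before the directory argument is an assumption based on the option order used elsewhere in this guide.

```shell
#!/bin/sh
# Stage a directory into a specific cache directory and print its ID.
# Pass "linked" as the third argument to hardlink instead of copy.
stage_into() {
    dir=$1
    cache=$2
    mode=${3:-copy}
    if [ "$mode" = "linked" ]; then
        snapdir stage --cache-dir "$cache" --linked "$dir"
    else
        snapdir stage --cache-dir "$cache" "$dir"
    fi
}

# Usage:
#   id=$(stage_into ./my-dir /tmp/snapdir-cache)
#   id2=$(stage_into ./my-dir /tmp/snapdir-cache)   # objects already cached
#   [ "$id" = "$id2" ] && echo "same snapshot: $id"
```

Hardlinks generally require the cache and the source tree to live on the same filesystem, so the linked mode is most useful when the cache directory sits next to the data.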
Once a snapshot is staged you can confirm its integrity at any time with
snapdir verify, which re-hashes the staged
objects against the manifest.
Where to go next
- Pushing and pulling: publish a staged snapshot to a store and restore it elsewhere.
- Stores: the file://, s3://, gs://, and b2:// backends.
- Quickstart: a full snapshot, push, and pull round-trip.
- Reference: snapdir manifest, snapdir id, snapdir stage.