Stores and cache
A store is any backend that holds snapdir's content-addressed objects and
manifests. The local cache is a special store on your own disk that sits in
front of the remote ones. Both use the exact same on-disk layout, which is why a
cache populated by one snapdir and a bucket written by another remain mutually
readable. In the usual colocated shape, one store holds both manifests and
objects. In a split shape, --store names the manifest-side store and
--objects-store names a shared object pool.
Store backends
Snapdir routes a location to a backend purely by its URL scheme:
s3:// and b2:// are served by native, built-in adapters. gs:// maps to the
GCS adapter (the oracle's hardcoded gs→gcs special case). Any other scheme
xyz:// is dispatched to an external shim named snapdir-xyz-store found on
your PATH, so you can add a backend without modifying snapdir itself — a
webdav:// URL, for example, would call snapdir-webdav-store. The ssh://
and sftp:// schemes in the table above ship as two such first-party shims,
turning any SSH-reachable host into a store by driving the system OpenSSH
client; see SSH and SFTP stores.
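The routing rule can be sketched as a small lookup. This is an illustrative model only, not snapdir's actual code: the function name `resolve_backend`, the adapter table, and the treatment of `file` as a built-in scheme are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Schemes served by built-in adapters, mapped to the adapter name.
# Assumption: "file" is treated as built in here; the text above only
# names s3, b2 and the gs -> gcs special case explicitly.
BUILTIN_ADAPTERS = {"s3": "s3", "b2": "b2", "gs": "gcs", "file": "file"}


def resolve_backend(location: str) -> tuple:
    """Return ("builtin", adapter) or ("shim", executable) for a store URL."""
    scheme = urlsplit(location).scheme.lower()
    if not scheme:
        raise ValueError(f"store location has no URL scheme: {location!r}")
    if scheme in BUILTIN_ADAPTERS:
        return ("builtin", BUILTIN_ADAPTERS[scheme])
    # Any other scheme xyz:// is looked up on PATH as snapdir-xyz-store.
    return ("shim", f"snapdir-{scheme}-store")
```

Under this model, `resolve_backend("webdav://host/dir")` yields the shim name `snapdir-webdav-store`, and `ssh://` resolves to `snapdir-ssh-store` by the same rule.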
A store is selected per command with --store, e.g.:
snapdir push --store s3://my-bucket/snapshots ./my-dir
snapdir pull --store file:///srv/snapdir <snapshot-id> ./restore-here
See the stores guide and
snapdir locations for configuring named
stores.
Each cloud backend also has per-provider request and bandwidth limits that snapdir paces against; see Storage provider limits for the defaults and how to override them.
The on-disk (and in-bucket) layout
Every store — local cache included — keys both objects and manifests on their lowercase hex digest and shards them across three directory levels so no single directory accumulates millions of entries:
.objects/<h[0:3]>/<h[3:6]>/<h[6:9]>/<h[9:]>
.manifests/<id[0:3]>/<id[3:6]>/<id[6:9]>/<id[9:]>
- For an object, h is the file's content checksum.
- For a manifest, id is its snapshot ID.
For checksum/id 49dc870df1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92:
.objects/49d/c87/0df/1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92
.manifests/49d/c87/0df/1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92
Because the address determines the path, every store is content-addressed and naturally deduplicated: an object can exist at exactly one location, and a second write of the same content is a no-op.
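The sharding scheme above is simple enough to express in a few lines. The following is a minimal sketch; the helper name `sharded_path` is hypothetical, and the 64-hex-digit check is an assumption based on the example digest.

```python
import re

# Assumption: digests are 64 lowercase hex digits, like the example above.
_HEX64 = re.compile(r"[0-9a-f]{64}")


def sharded_path(kind: str, digest: str) -> str:
    """Map a digest to its store-relative path under .objects or .manifests."""
    if kind not in ("objects", "manifests"):
        raise ValueError(f"unknown tree: {kind!r}")
    if not _HEX64.fullmatch(digest):
        raise ValueError(f"not a lowercase 64-digit hex digest: {digest!r}")
    # Three 3-character directory levels, then the remaining 55 characters.
    return f".{kind}/{digest[0:3]}/{digest[3:6]}/{digest[6:9]}/{digest[9:]}"
```

Because the path is a pure function of the digest, two writers holding the same content always compute the same destination, which is what makes a repeated write a no-op.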
Split stores and shared pools
--objects-store <URI> (or SNAPDIR_OBJECTS_STORE) splits a transfer into two
store roots:
- manifests are read from or written to --store under .manifests/;
- content objects are read from or written to the object pool under .objects/.
That lets many manifest locations share one object pool. For example, a daily inventory can write each run to a date-specific manifest prefix while all runs reuse the same content-addressed object pool. Unchanged objects are skipped, and the snapshot ID remains the hash of the manifest content, not of the store layout that happens to hold it.
When a snapshot was pushed this way, readers need both halves: fetch and
pull use the manifest-side --store to find the manifest and
--objects-store to find the objects it names. Without --objects-store, the
store is colocated and both trees live below the same URI.
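As a rough model of the colocated and split shapes, the sketch below computes where each tree lives for a given pair of flags. The function name `store_roots` and the example URLs are hypothetical; only the `.manifests/` and `.objects/` placement comes from the description above.

```python
from typing import Optional


def store_roots(store: str, objects_store: Optional[str] = None) -> dict:
    """Return the manifest and object tree locations for a transfer."""
    split = bool(objects_store)
    manifest_root = store.rstrip("/")
    # Without an object pool, both trees live below the same URI.
    object_root = (objects_store if split else store).rstrip("/")
    return {
        "manifests": f"{manifest_root}/.manifests/",
        "objects": f"{object_root}/.objects/",
        "split": split,
    }
```

For a daily inventory, calling `store_roots("s3://bucket/daily/2024-06-01", "s3://bucket/pool")` on each run changes only the manifest location; every run resolves to the same object tree.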
For store-to-store replication, snapdir sync uses
explicit per-side pool flags, --from-objects and --to-objects. Those flags
are not aliases for the global --objects-store: each sync side is either split
by its own --*-objects flag or colocated when that flag is omitted.
The local cache
The local cache is a file://-shaped store rooted at
${XDG_CACHE_HOME:-$HOME/.cache}/snapdir/, using the same
.objects/.manifests sharded layout as every remote. It serves two purposes:
- A staging area: snapdir writes objects and manifests to the cache first, then transfers them to a remote store.
- A read-through accelerator: content already present in the cache does not need to be re-fetched from a remote.
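The cache root expression uses the shell's `:-` default, which applies when the variable is unset or empty. A small sketch of the same resolution follows; the function name `cache_root` is hypothetical, and the environment is passed in as a mapping so the logic is easy to check.

```python
import os
from pathlib import Path
from typing import Mapping


def cache_root(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve ${XDG_CACHE_HOME:-$HOME/.cache}/snapdir from an environment."""
    # ":-" falls back to the default when the variable is unset OR empty,
    # so an empty XDG_CACHE_HOME is treated the same as a missing one.
    base = env.get("XDG_CACHE_HOME") or str(Path(env["HOME"]) / ".cache")
    return Path(base) / "snapdir"
```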
You can inspect cache integrity with
snapdir verify-cache and reclaim space
with snapdir flush-cache.
Copy-on-write local copies
For local file stores and the local cache, snapdir opportunistically uses
copy-on-write clones when the source and destination are on a compatible
filesystem. macOS uses APFS clonefile(2), and Linux uses FICLONE reflinks on
filesystems that support them. These clones can make stage, push, fetch,
and checkout much faster and avoid allocating another full copy of large
objects.
The optimization is transparent: unsupported platforms, non-CoW filesystems, and
cross-device copies fall back to normal fs::copy. Set SNAPDIR_CLONEFILE=0 to
disable the fast path, or SNAPDIR_VERIFY_COPIES=1 to force the strict
write-time re-hash even after a clone succeeds. In all modes, object bytes,
manifest bytes, and snapshot IDs are byte-identical.
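The clone-then-fall-back pattern can be illustrated for the Linux case. This is a sketch of the general technique, not snapdir's implementation: the function name `clone_or_copy` is hypothetical, `shutil.copyfile` stands in for the normal copy path, and the FICLONE request number is the value used on x86 and ARM (other architectures may encode it differently).

```python
import os
import shutil
import sys

try:
    import fcntl  # POSIX only
except ImportError:
    fcntl = None

# Linux FICLONE ioctl request, _IOW(0x94, 9, int), as encoded on x86/ARM.
FICLONE = 0x40049409


def clone_or_copy(src: str, dst: str) -> str:
    """Copy src to dst, preferring a reflink. Returns "clone" or "copy"."""
    enabled = os.environ.get("SNAPDIR_CLONEFILE", "1") != "0"
    if enabled and fcntl is not None and sys.platform.startswith("linux"):
        try:
            with open(src, "rb") as s, open(dst, "wb") as d:
                # Ask the filesystem to share src's extents with dst.
                fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            return "clone"
        except OSError:
            pass  # non-CoW filesystem, cross-device copy, and so on
    # Fallback: an ordinary byte-for-byte copy (overwrites any partial dst).
    shutil.copyfile(src, dst)
    return "copy"
```

Either branch leaves `dst` with the same bytes as `src`, which is why the choice of path does not affect checksums or snapshot IDs.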
For read-only local views, checkout and pull can also use --linked. In
linked mode the destination files are symlinks into local content-addressed objects,
and those objects are hardened to mode 0444. This does not copy file
bytes into the destination, and writing through the symlink fails instead of
mutating the shared object. Linked mode requires local objects; remote object stores are refused.
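The mechanics of a linked view can be sketched with two standard library calls. The helper name `link_readonly` is hypothetical, and this only models the local-object case described above.

```python
import os


def link_readonly(object_path: str, dest_path: str) -> None:
    """Expose a local object at dest_path as a symlink to a read-only file."""
    # Harden the shared object first so nothing can write through the link.
    os.chmod(object_path, 0o444)
    # The destination holds no file bytes of its own, only a pointer.
    os.symlink(os.path.abspath(object_path), dest_path)
```

Note that a process running as root can still write to a 0444 file, so the protection applies to ordinary users.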
Push and fetch discipline
Transfers between the cache and a remote store follow a strict ordering so a store is never left referencing data it does not hold:
- Push: confirm the manifest is not already present, push objects before the manifest, and only push objects that are absent (skip-if-present). The manifest, written last, always points at objects that already exist.
- Fetch: download to a temporary path, verify the BLAKE3 checksum, retry on mismatch, then atomically rename into place. A half-written or corrupted object never becomes visible at its address.
This ordering, combined with content addressing, is what lets pushes and fetches be safely retried and resumed. See Integrity for the verification details.
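The two orderings can be modeled in a few lines. The sketch below is illustrative only: `MemoryStore`, `push`, and `fetch` are hypothetical names, the store is an in-memory stand-in for a remote, and SHA-256 stands in for BLAKE3 because BLAKE3 is not in the Python standard library.

```python
import hashlib
import os
import tempfile


def digest(data: bytes) -> str:
    # Stand-in hash: snapdir uses BLAKE3; SHA-256 keeps this example stdlib-only.
    return hashlib.sha256(data).hexdigest()


class MemoryStore:
    """Hypothetical in-memory store with separate object and manifest trees."""

    def __init__(self):
        self.objects = {}
        self.manifests = {}


def push(store, manifest_id, manifest, objects):
    """Push objects first, manifest last. Returns the number of objects sent."""
    if manifest_id in store.manifests:
        return 0  # snapshot already present, nothing to do
    sent = 0
    for h, data in objects.items():
        if h not in store.objects:  # skip-if-present
            store.objects[h] = data
            sent += 1
    store.manifests[manifest_id] = manifest  # written last
    return sent


def fetch(download, expected, dest, retries=3):
    """Download via download(), verify, then atomically rename into dest."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    os.makedirs(dest_dir, exist_ok=True)
    for _ in range(retries):
        fd, tmp = tempfile.mkstemp(dir=dest_dir)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(download())
            with open(tmp, "rb") as f:
                verified = digest(f.read()) == expected
            if verified:
                os.replace(tmp, dest)  # atomic within one filesystem
                return
        finally:
            if os.path.exists(tmp):
                os.remove(tmp)  # discard a mismatched download
    raise OSError(f"checksum mismatch after {retries} attempts: {dest}")
```

Interrupting `push` at any point leaves at worst some unreferenced objects, never a manifest that names missing ones, and a failed `fetch` leaves nothing at the destination path.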
Where to go next
- Content addressing: why the layout is keyed on hashes.
- Manifests: what a manifest contains and how its ID is formed.
- Integrity: end-to-end verification on transfer.
- Guides: stores, pushing and pulling, syncing.