Stores and cache

A store is any backend that holds snapdir's content-addressed objects and manifests. The local cache is a special store on your own disk that sits in front of the remote ones. Both use the exact same on-disk layout, which is why a cache populated by one snapdir and a bucket written by another remain mutually readable.

Store backends

Snapdir routes a location to a backend purely by its URL scheme:

Scheme    Backend
file://   A local (or mounted) directory. Also the shape of the cache.
s3://     Amazon S3 and S3-compatible object storage.
gs://     Google Cloud Storage.
b2://     Backblaze B2.

s3:// and b2:// are served by native, built-in adapters. gs:// maps to the GCS adapter (the oracle's hardcoded gs → gcs special case). Any other scheme xyz:// is dispatched to an external shim named snapdir-xyz-store found on your PATH, so you can add a backend without modifying snapdir itself. A webdav:// URL, for example, would call snapdir-webdav-store.
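The routing rule above can be sketched in a few lines of Python. This is only an illustration of the scheme-to-backend mapping, not snapdir's actual code: the function name resolve_backend, the BUILTIN_ADAPTERS table and its adapter labels are hypothetical, and treating file:// as built-in is an assumption.

```python
from urllib.parse import urlsplit

# Schemes handled without an external shim. The labels on the right are
# illustrative only; listing "file" here is an assumption.
BUILTIN_ADAPTERS = {"file": "file", "s3": "s3", "b2": "b2", "gs": "gcs"}


def resolve_backend(location: str) -> tuple[str, str]:
    """Return ("builtin", adapter) or ("external", shim_name) for a store URL."""
    scheme = urlsplit(location).scheme.lower()
    if not scheme:
        raise ValueError(f"store location has no URL scheme: {location!r}")
    if scheme in BUILTIN_ADAPTERS:
        return ("builtin", BUILTIN_ADAPTERS[scheme])
    # Any other scheme is handed to a program named snapdir-<scheme>-store,
    # which is expected to be found on PATH.
    return ("external", f"snapdir-{scheme}-store")


print(resolve_backend("gs://my-bucket/snapshots"))
print(resolve_backend("webdav://example.org/snapshots"))
```

A real dispatcher would additionally look the shim up on PATH (for instance with shutil.which) and report a clear error when it is missing.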

A store is selected per command with --store, e.g.:

snapdir push --store s3://my-bucket/snapshots ./my-dir
snapdir pull --store file:///srv/snapdir <snapshot-id> ./restore-here

See the stores guide and snapdir locations for configuring named stores.

Each cloud backend also has per-provider request and bandwidth limits that snapdir paces against; see Storage provider limits for the defaults and how to override them.

The on-disk (and in-bucket) layout

Every store — local cache included — keys both objects and manifests on their lowercase hex digest and shards them across three directory levels so no single directory accumulates millions of entries:

.objects/<h[0:3]>/<h[3:6]>/<h[6:9]>/<h[9:]>
.manifests/<id[0:3]>/<id[3:6]>/<id[6:9]>/<id[9:]>
  • For an object, h is the file's content checksum.
  • For a manifest, id is its snapshot ID.

For checksum/id 49dc870df1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92:

.objects/49d/c87/0df/1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92
.manifests/49d/c87/0df/1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92

Because the address determines the path, every store is content-addressed and naturally deduplicated: an object can exist at exactly one location, and a second write of the same content is a no-op.
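The layout rule and the no-op second write can be sketched as follows. The helper names shard_path and put_if_absent are hypothetical, and the example pairs the digest from the text with placeholder bytes rather than content that really hashes to it.

```python
import tempfile
from pathlib import Path


def shard_path(kind: str, digest: str) -> str:
    """Map a lowercase hex digest to its sharded relative path."""
    if kind not in ("objects", "manifests"):
        raise ValueError(f"unknown kind: {kind!r}")
    if len(digest) <= 9 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError(f"expected a lowercase hex digest, got {digest!r}")
    return f".{kind}/{digest[0:3]}/{digest[3:6]}/{digest[6:9]}/{digest[9:]}"


def put_if_absent(root: Path, kind: str, digest: str, data: bytes) -> bool:
    """Write data at its address; return False when it is already there."""
    target = root / shard_path(kind, digest)
    if target.exists():
        return False  # same address means same content, so nothing to do
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return True


digest = "49dc870df1de7fd60794cebce449f5ccdae575affaa67a24b62acb03e039db92"
print(shard_path("objects", digest))

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder bytes: they do not actually hash to `digest`.
    first = put_if_absent(Path(tmp), "objects", digest, b"example bytes")
    second = put_if_absent(Path(tmp), "objects", digest, b"example bytes")
    print(first, second)  # the second write is skipped
```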

The local cache

The local cache is a file://-shaped store rooted at ${XDG_CACHE_HOME:-$HOME/.cache}/snapdir/, using the same .objects/.manifests sharded layout as every remote. It serves two purposes:

  • A staging area: snapdir writes objects and manifests to the cache first, then transfers them to a remote store.

  • A read-through accelerator: content already present in the cache does not need to be re-fetched from a remote.

You can inspect cache integrity with snapdir verify-cache and reclaim space with snapdir flush-cache.
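To find the cache directory from a script, the ${XDG_CACHE_HOME:-$HOME/.cache}/snapdir/ expression given earlier can be mirrored in Python. This is a sketch of the shell expansion only; the function name cache_root is hypothetical and snapdir itself may resolve the path differently.

```python
import os
from pathlib import Path
from typing import Mapping


def cache_root(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve ${XDG_CACHE_HOME:-$HOME/.cache}/snapdir from an environment mapping."""
    # The shell's ":-" form falls back when the variable is unset OR empty,
    # which `or` reproduces here.
    base = env.get("XDG_CACHE_HOME") or f"{env.get('HOME', '')}/.cache"
    return Path(base) / "snapdir"


print(cache_root({"HOME": "/home/alice"}))
print(cache_root({"HOME": "/home/alice", "XDG_CACHE_HOME": "/var/cache/alice"}))
```

Passing the environment as a parameter keeps the function easy to test; calling cache_root() with no argument reads the real process environment.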

Push and fetch discipline

Transfers between the cache and a remote store follow a strict ordering so a store is never left referencing data it does not hold:

  • Push — confirm the manifest is not already present, push objects before the manifest, and only push objects that are absent (skip-if-present). The manifest, written last, always points at objects that already exist.

  • Fetch — download to a temporary path, verify the BLAKE3 checksum, retry on mismatch, then atomically rename into place. A half-written or corrupted object never becomes visible at its address.
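The push half of this discipline can be sketched with two in-memory dictionaries standing in for a remote store. Everything here (the push function, its parameters, the dictionary-backed store) is a hypothetical model of the ordering, not snapdir's implementation.

```python
def push(manifest_id, manifest, objects, remote_manifests, remote_objects):
    """Push one snapshot; return the checksums that were actually uploaded."""
    # 1. Nothing to do when the store already holds this manifest.
    if manifest_id in remote_manifests:
        return []
    uploaded = []
    # 2. Objects first, skipping any the store already holds.
    for checksum, data in objects.items():
        if checksum not in remote_objects:
            remote_objects[checksum] = data
            uploaded.append(checksum)
    # 3. Manifest last, so it only ever references objects that exist.
    remote_manifests[manifest_id] = manifest
    return uploaded


# Toy store that already contains object "aaa" from an earlier snapshot.
remote_manifests = {}
remote_objects = {"aaa": b"one"}
local_objects = {"aaa": b"one", "bbb": b"two"}

first = push("m1", b"manifest", local_objects, remote_manifests, remote_objects)
second = push("m1", b"manifest", local_objects, remote_manifests, remote_objects)
print(first, second)
```

If the process dies between steps 2 and 3, the store holds some extra objects but no manifest that points at missing data, so rerunning the push simply skips what is already there.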

This ordering, combined with content addressing, is what lets pushes and fetches be safely retried and resumed. See Integrity for the verification details.
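The fetch steps can be sketched in the same spirit. BLAKE3 is not in Python's standard library, so this example uses hashlib.sha256 as a stand-in digest; the fetch_object function, its parameters and the ".partial-" prefix are all hypothetical.

```python
import hashlib
import os
import tempfile


def fetch_object(download, expected_hex, dest_path, attempts=3, digest=hashlib.sha256):
    """Download to a temp file, verify it, then rename it into place."""
    dest_dir = os.path.dirname(dest_path) or "."
    os.makedirs(dest_dir, exist_ok=True)
    for _ in range(attempts):
        # Write next to the destination so the rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".partial-")
        try:
            with os.fdopen(fd, "wb") as tmp:
                tmp.write(download())
            with open(tmp_path, "rb") as f:
                actual = digest(f.read()).hexdigest()
            if actual == expected_hex:
                os.replace(tmp_path, dest_path)  # atomic on POSIX filesystems
                return True
        finally:
            # Remove the temp file unless it was renamed into place.
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
    return False


payload = b"hello snapdir"
expected = hashlib.sha256(payload).hexdigest()  # sha256 stands in for BLAKE3
responses = iter([b"corrupted", payload])       # first download is bad

with tempfile.TemporaryDirectory() as tmp:
    dest = os.path.join(tmp, "object")
    ok = fetch_object(lambda: next(responses), expected, dest)
    with open(dest, "rb") as f:
        stored = f.read()
    leftovers = [n for n in os.listdir(tmp) if n.startswith(".partial-")]

print(ok, stored == payload, leftovers)
```

Because the destination path only ever receives a verified file through the rename, an interrupted or corrupted download leaves at most a stray temporary file, never a bad object at its address.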

Where to go next