# Snapdir **Deterministic, cryptographically verified snapshots of any directory — distributed across local and cloud stores, deduplicated by content.** Snapdir hashes a directory into a deterministic snapshot ID, pushes it to object storage, and pulls it back byte-for-byte verified anywhere. Identical content always produces the same ID on any machine, so the same snapshot lands under the same content-addressed keys in every store — unchanged data is never re-uploaded, and every restore is re-hashed on fetch. It ships as a single static binary with zero runtime dependencies. All hashing and storage happen in-process; there is nothing to install alongside it. ```sh cargo install snapdir-cli ``` ## Why Snapdir - **Deterministic IDs.** A snapshot is a BLAKE3 manifest — one line per file and directory, sorted by path, with directory checksums computed as a merkle hash of their children. The snapshot ID is the BLAKE3 of that manifest, so the same content yields the same ID everywhere. Reproducible by construction. - **End-to-end verification.** Don't trust the store. Objects and manifests are stored at content-addressed keys, and every object is re-hashed and verified on fetch. `verify` re-checks an entire snapshot against its store on demand. - **Content-addressed deduplication.** Identical files and identical snapshots are stored once and interoperate across stores. Re-pushing after an edit uploads only the objects that actually changed. - **One binary, many backends.** A static musl build with no sidecar daemon and no bespoke environment variables. Cloud backends use native SDKs and standard credential chains. - **Portable layout.** Objects live under a sharded, content-addressed key layout, so a store written by one Snapdir is readable by any other — and by any tool that understands the format. ## Snapshot, verify, distribute The core arc is **snapshot → verify → distribute**: 1. **Snapshot** a directory into a deterministic ID with `id`, materialize objects with `stage`, and inspect the manifest with `manifest`. 2. **Verify** integrity end-to-end with `verify` and `verify-cache` — every object re-hashed against its content address. 3. **Distribute** with `push`, `fetch`, `pull`, and `checkout` across any store, while an embedded catalog tracks where snapshots live via `locations`, `ancestors`, and `revisions`. Pick a backend by `--store` URI scheme: | Scheme | Backend | Auth | | --------- | -------------------- | ------------------------------------------------- | | `file://` | Local filesystem | — | | `s3://` | Amazon S3 | AWS credential chain (env / profiles / SSO) | | `gs://` | Google Cloud Storage | ADC / `GOOGLE_APPLICATION_CREDENTIALS` | | `b2://` | Backblaze B2 | B2 application key over the S3-compatible API | Swap one scheme for another and the exact same commands push to the cloud. ## Try it in 60 seconds No cloud account needed — snapshot a directory and round-trip it through a local store: ```sh # A directory to snapshot mkdir -p demo && echo hello > demo/a.txt # Its content has a deterministic ID — same content, same ID, on any machine snapdir id ./demo # Push it to a store (here, a local directory). push prints the snapshot ID. id=$(snapdir push --store "file://$PWD/store" ./demo) # Pull it back somewhere else; every object is re-hashed and verified on fetch snapdir pull --store "file://$PWD/store" --id "$id" ./restored snapdir id ./restored # prints the same $id — byte-for-byte identical ``` Replicate the same snapshot across clouds without the original directory — only changed objects ever upload: ```sh # Snapshot once, push to S3 (push prints the id) id=$(snapdir push --store s3://my-bucket/snaps ./project) # Replicate S3 → GCS by pulling from one store and pushing to another snapdir pull --store s3://my-bucket/snaps --id "$id" /tmp/from-s3 snapdir push --store gs://my-bucket/snaps /tmp/from-s3 # Pull from any backend, anywhere — verified on fetch snapdir pull --store gs://my-bucket/snaps --id "$id" ./restored ``` ## Where to next - [Quickstart](./quickstart.md) — install and run your first snapshot. - [Use cases](./use-cases/) — backups, artifact caching, dataset versioning, and cross-cloud replication. - [Guide](./guide/) — stores, verification, catalog, and day-to-day workflows. - [Command reference](./reference/) — every subcommand and flag.