Deduplicated backup and restore
Back directories up to Backblaze B2 or S3 with content-addressed deduplication, so each backup only uploads what actually changed — and every restore is re-hashed and verified as it lands.
The problem
Routine backups are mostly redundant: day over day, only a small fraction of a directory changes, yet naive backup schemes re-upload everything or rely on fragile incremental chains that are painful to restore from. Worse, a backup you cannot verify is not really a backup — bit rot in cold object storage, a truncated upload, or silent corruption can leave you with an archive that fails exactly when you need it. Restores then become an act of faith.
Why snapdir
Backups are pushed to an object store, deduplicated by content, and verified on the way back:
- Incremental by content, not by chain. Objects live at content-addressed keys, so a backup only uploads objects whose bytes are new. Identical files across days, or across different machines pushing to the same bucket, are stored once.
- Every backup is self-contained. A snapshot is addressed by one ID and carries its full manifest; there is no incremental chain to walk or to break. Restore any backup directly.
- Verified restore. On pull every object is re-hashed against the manifest, so a successful restore is a proof of integrity, not an assumption.
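The idea behind these three properties can be sketched with a toy content-addressed store on the local filesystem. This is an illustration only: the `put_object` and `get_object` helpers, the `objects/` layout, and the choice of SHA-256 (via coreutils `sha256sum`) are assumptions made for the example, not snapdir's actual storage format or hash function.

```bash
# Toy content-addressed store (illustrative; not snapdir's real layout).

# Store a file under the hash of its bytes. If an object with that hash
# already exists, nothing is copied: identical content is stored once.
put_object() {
  local store=$1 file=$2 hash
  hash=$(sha256sum "$file" | cut -d' ' -f1)
  mkdir -p "$store/objects"
  if [ ! -e "$store/objects/$hash" ]; then
    cp "$file" "$store/objects/$hash"   # only new bytes are "uploaded"
  fi
  printf '%s\n' "$hash"
}

# Fetch an object into a temp file, re-hash it, and move it into place
# only if the hash matches the key it was requested by.
get_object() {
  local store=$1 hash=$2 dest=$3 tmp actual
  tmp=$(mktemp)
  cp "$store/objects/$hash" "$tmp"
  actual=$(sha256sum "$tmp" | cut -d' ' -f1)
  if [ "$actual" != "$hash" ]; then
    rm -f "$tmp"
    echo "hash mismatch for $hash" >&2
    return 1
  fi
  mv "$tmp" "$dest"
}
```

Calling `put_object` on two files with identical bytes returns the same hash and leaves a single object in the store, and `get_object` refuses to write a destination file if the stored bytes no longer match their key.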
Walkthrough
Back a directory up to a B2 bucket. The push command uploads only the objects not already present and prints the snapshot ID; record it as your restore point:
backup_id=$(snapdir push --store b2://acme-backups/home ./srv/data)
echo "$backup_id"
Run it again tomorrow. Because the store is content-addressed, only the objects that changed since the last backup are uploaded; everything unchanged is skipped:
snapdir push --store b2://acme-backups/home ./srv/data
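For a scheduled job, it may help to record each snapshot ID as it is produced. The wrapper below is a minimal sketch, not part of snapdir: `run_backup`, the `STORE`, `SRC`, and `ID_LOG` variables, and the log file name are hypothetical, and it assumes that `snapdir push` prints only the snapshot ID on stdout and exits non-zero on failure.

```bash
# Hypothetical daily-backup wrapper around the push command shown above.
STORE=${STORE:-b2://acme-backups/home}
SRC=${SRC:-./srv/data}
ID_LOG=${ID_LOG:-./backup-ids.log}   # hypothetical log of restore points

run_backup() {
  local id
  # Assumption: push prints the snapshot ID and fails with a non-zero status.
  id=$(snapdir push --store "$STORE" "$SRC") || return 1
  if [ -z "$id" ]; then
    echo "push returned no snapshot ID" >&2
    return 1
  fi
  # One line per restore point: UTC timestamp, then snapshot ID.
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$id" >> "$ID_LOG"
  printf '%s\n' "$id"
}
```

Calling `run_backup` from cron or a systemd timer appends one timestamped line per successful backup, and a failed push leaves the log untouched.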
To restore, pull a chosen restore point into a fresh directory. Each object is fetched, re-hashed, and only then written into place:
snapdir pull --store b2://acme-backups/home --id "$backup_id" ./restore
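Since the restore is meant to land in a fresh directory, a small guard can refuse to pull into a location that already holds files. This is a sketch only: `restore_fresh` is a hypothetical helper, and it uses nothing beyond the pull invocation shown above.

```bash
# Hypothetical guard: restore only into an empty or not-yet-existing directory.
restore_fresh() {
  local store=$1 id=$2 target=$3
  # Refuse if the target exists and contains anything (or is a regular file).
  if [ -e "$target" ] && [ -n "$(ls -A "$target" 2>/dev/null)" ]; then
    echo "refusing to restore into non-empty $target" >&2
    return 1
  fi
  mkdir -p "$target"
  snapdir pull --store "$store" --id "$id" "$target"
}
```

For example, `restore_fresh b2://acme-backups/home "$backup_id" ./restore` behaves like the plain pull on a new path and fails early if `./restore` already has content.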
Before relying on a cold backup, verify it end-to-end against the store without restoring it first:
snapdir verify --store b2://acme-backups/home --id "$backup_id"
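A routine check over several restore points could loop over recorded IDs. The sketch below assumes a plain text file with one snapshot ID per line (a hypothetical convention) and assumes that `snapdir verify` exits non-zero when verification fails; `verify_all` is not a snapdir command.

```bash
# Hypothetical routine check: verify every ID listed in a file.
verify_all() {
  local store=$1 id_file=$2 id failed=0
  while IFS= read -r id; do
    [ -n "$id" ] || continue   # skip blank lines
    # </dev/null keeps the command from consuming the loop's input.
    if snapdir verify --store "$store" --id "$id" </dev/null; then
      echo "ok   $id"
    else
      echo "FAIL $id" >&2
      failed=1
    fi
  done < "$id_file"
  return "$failed"
}
```

The function returns 1 if any snapshot fails, so a scheduler or monitoring hook can alert on the exit status while the remaining IDs are still checked.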
The same commands work against s3:// — swap the store URI and nothing else
changes. See Stores for backend authentication.
Outcome
Backups cost only the bytes that actually changed, deduplicated across days and across machines, while each one remains an independent, directly restorable snapshot rather than a fragile incremental chain. Restores are verified by construction, and a cold archive can be checked at any time with snapdir verify, so you find corruption on a routine check, not during a real recovery. Past restore points stay queryable via snapdir revisions.