Cross-cloud replication

Replicate a snapshot directly between S3, Google Cloud Storage, and Backblaze B2 with snapdir sync — streaming store-to-store, skipping objects the destination already has, never staging the data on your own disk.

The problem

Keeping a copy of critical data in a second cloud — for disaster recovery, to avoid lock-in, or to satisfy a data-residency requirement — usually means downloading everything from one provider and re-uploading it to another. That burns egress on the round trip, needs scratch space large enough to hold the whole dataset, and re-transfers data that is already present at the destination from a previous run. And once the copy exists, you still have to prove it is faithful to the source.

Why snapdir

snapdir sync copies a snapshot — its manifest and every object it references — directly from one store to another, streaming through memory with no local staging:

  • No round trip to disk. sync pipes the snapshot straight from the source store to the destination, instead of a fetch then push that would pull every object into your cache first.

  • Skips what's already there. The transfer is content-addressed, so objects already present at the destination are skipped — re-running a sync after a small change copies only the new objects, which makes it cheap as a recurring replication step.

  • Faithful by construction. Both endpoints address objects by content, so the replicated snapshot carries the same ID and the same integrity guarantee as the source.

Walkthrough

Replicate a snapshot from a primary S3 bucket to a GCS disaster-recovery bucket. sync needs the snapshot --id and both endpoints, and the two must differ:

snapdir sync \
  --id "$id" \
  --from s3://primary-bucket/snapshots \
  --to   gs://dr-bucket/snapshots

Preview a cross-cloud copy before committing to it with --dryrun, then run it rate-limited so it does not saturate the link:

snapdir sync --id "$id" --from s3://primary/snap --to b2://dr/snap --dryrun
snapdir sync --id "$id" --from s3://primary/snap --to b2://dr/snap --limit-rate 50M

Confirm the replica is complete by verifying the snapshot against the destination store — every object is re-hashed against the manifest:

snapdir verify --store gs://dr-bucket/snapshots --id "$id"

Both endpoints must be in-process stores (file://, s3://, gs://, b2://); to replicate through a custom backend, fall back to a fetch then push.

Outcome

A second-cloud copy is maintained with one command, no scratch disk, and minimal egress: each run copies only objects the destination is missing, so steady-state replication is nearly free. The replica is provably faithful — same content, same ID, re-verifiable on demand — giving you a real cross-cloud DR posture without vendor lock-in. Discover which snapshots and locations exist to replicate with snapdir locations.