Syncing
snapdir sync copies a snapshot — its manifest and every object it references —
directly between two stores, streaming through memory with no local staging.
This is how you replicate a snapshot from one bucket or region to another (or
between clouds) without ever materializing it on your own disk.
It is the right tool when the data does not need to land locally: a normal
fetch + push would pull every object into your cache
and push it back out, whereas sync pipes the snapshot straight from the source
store to the destination store.
Replicate a snapshot between stores
sync requires a snapshot --id and both endpoints, --from and --to:
snapdir sync \
--id "$id" \
--from s3://primary-bucket/snapshots \
--to gs://dr-bucket/snapshots
When omitted, --from defaults to $SNAPDIR_STORE, so you can set the source
store once via the SNAPDIR_STORE environment variable and only pass --to on
each sync. --to is always explicit — a sync needs two distinct stores, so the
destination is never inferred.
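As a minimal sketch of that default, the wrapper below passes only --id and --to and lets sync fall back to $SNAPDIR_STORE for the source. The name replicate_to is hypothetical and not part of snapdir; the guard simply makes a missing environment variable fail early.

```shell
# Hypothetical wrapper: rely on SNAPDIR_STORE as the implicit --from.
replicate_to() {
  rt_id="$1"
  rt_dest="$2"
  if [ -z "${SNAPDIR_STORE:-}" ]; then
    echo "SNAPDIR_STORE is unset; pass --from explicitly" >&2
    return 1
  fi
  # --from is omitted on purpose, so sync uses $SNAPDIR_STORE as the source.
  snapdir sync --id "$rt_id" --to "$rt_dest"
}
```

With SNAPDIR_STORE exported as s3://primary-bucket/snapshots, calling replicate_to "$id" gs://dr-bucket/snapshots would run the same sync as the explicit example above.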
The source and destination must differ. The transfer is content-addressed, so
objects already present at the destination are skipped — re-running a sync
after a small change only copies the new objects, which makes sync cheap to run
repeatedly as a replication step.
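Because repeat runs only copy what is missing, a replication step can loop over many snapshot IDs. The sketch below assumes the IDs arrive one per line on stdin; the helper name replicate_all and the input file are hypothetical. The endpoint comparison is a literal string check only, not the validation snapdir itself performs.

```shell
# Hypothetical helper: replicate every snapshot ID read from stdin.
replicate_all() {
  ra_from="$1"
  ra_to="$2"
  # sync needs two distinct stores, so fail fast on identical endpoints.
  if [ "$ra_from" = "$ra_to" ]; then
    echo "source and destination must differ" >&2
    return 1
  fi
  while IFS= read -r ra_id; do
    [ -n "$ra_id" ] || continue   # skip blank lines
    # </dev/null keeps snapdir from consuming the remaining IDs on stdin.
    snapdir sync --id "$ra_id" --from "$ra_from" --to "$ra_to" </dev/null || return 1
  done
}
```

A possible invocation is replicate_all s3://primary-bucket/snapshots gs://dr-bucket/snapshots < snapshot-ids.txt, where snapshot-ids.txt is a file you maintain yourself.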
Sync split object pools
sync has its own per-side split-store flags:
- --from-objects <URI> reads source objects from that object pool while source manifests come from --from.
- --to-objects <URI> writes destination objects to that object pool while destination manifests go to --to.
These flags are distinct from the global --objects-store /
SNAPDIR_OBJECTS_STORE used by push, fetch, and pull. For sync, name each
object pool explicitly on the side that is split; a side that omits its
--*-objects flag is treated as a normal colocated store. Passing the global
--objects-store to sync does not stand in for --from-objects or
--to-objects.
snapdir sync \
--id "$id" \
--from s3://inventory/manifests/host-a \
--from-objects s3://inventory/source-object-pool \
--to b2://dr/manifests/host-a \
--to-objects b2://dr/object-pool
The source and destination can use different object pools. Objects are read from
the source pool, written to the destination pool, and skipped when the
destination pool already has that content-addressed object. The manifest is
still published to the destination --to location last.
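Only one side needs to be split. The sketch below adds a --*-objects flag only for a side that has a separate object pool and leaves the other side as a normal colocated store. The helper name sync_split and its argument order are hypothetical; the flags are the ones described above.

```shell
# Hypothetical helper: pass an empty string for a side that is colocated.
# Arguments: id, from, from-objects, to, to-objects.
sync_split() {
  ss_id="$1"
  ss_from="$2"
  ss_from_objects="$3"
  ss_to="$4"
  ss_to_objects="$5"
  # Rebuild the argument list, adding the per-side pool flags only when set.
  set -- sync --id "$ss_id" --from "$ss_from"
  if [ -n "$ss_from_objects" ]; then
    set -- "$@" --from-objects "$ss_from_objects"
  fi
  set -- "$@" --to "$ss_to"
  if [ -n "$ss_to_objects" ]; then
    set -- "$@" --to-objects "$ss_to_objects"
  fi
  snapdir "$@"
}
```

For example, sync_split "$id" s3://inventory/snapshots/host-a "" b2://dr/manifests/host-a b2://dr/object-pool would sync from a colocated source (an illustrative URI) into a split destination.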
Both endpoints must be in-process stores (file://, s3://, gs://,
b2://), and the same restriction applies to --from-objects and
--to-objects. External snapdir-*-store URLs — including the first-party
ssh:// and sftp:// stores — have no
in-process streaming surface and are rejected by sync; to replicate through
such a backend, fall back to a fetch then push. See Stores for
the backend matrix and authentication.
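A script can check endpoints before calling sync and choose the fetch then push path when one of them is external. The sketch below only mirrors the scheme list given above; the function names are hypothetical, and snapdir's own validation remains authoritative.

```shell
# Return 0 when the URI scheme is one of the in-process stores listed above.
is_in_process_store() {
  case "$1" in
    file://*|s3://*|gs://*|b2://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical pre-flight: every endpoint passed in must be in-process.
can_sync() {
  for cs_uri in "$@"; do
    if ! is_in_process_store "$cs_uri"; then
      echo "not an in-process store: $cs_uri" >&2
      return 1
    fi
  done
}
```

Calling can_sync with the --from, --to, --from-objects, and --to-objects values returns non-zero as soon as one of them is, for example, an ssh:// URL.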
Mirror a local destination store
sync --delete keeps the destination store's manifest set aligned with the
source. It copies the requested snapshot first, then removes destination
manifests that are not present in the source. It does not delete objects, so
unreferenced object bytes are left for a separate garbage-collection policy.
snapdir sync \
--id "$id" \
--from gs://feature-snapshots \
--to file:///srv/snapdir/features \
--delete
Mirror deletion is intentionally local-only: --to must be a file:// store.
Object stores such as s3://, gs://, and b2://, and external stores such as
ssh://, are refused for sync --delete. Add --dryrun to preview the manifest
prune count before changing the destination.
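One way to make the preview the default is a small wrapper that runs with --dryrun unless told otherwise. This is a sketch under assumptions: the helper name mirror_local and its apply argument are hypothetical, and the file:// check only duplicates the restriction described above.

```shell
# Hypothetical helper: preview a mirror by default; pass "apply" to prune for real.
mirror_local() {
  ml_id="$1"
  ml_from="$2"
  ml_to="$3"
  ml_mode="${4:-preview}"
  case "$ml_to" in
    file://*) ;;
    *) echo "sync --delete needs a file:// destination: $ml_to" >&2; return 1 ;;
  esac
  if [ "$ml_mode" = "apply" ]; then
    snapdir sync --id "$ml_id" --from "$ml_from" --to "$ml_to" --delete
  else
    # Changes nothing; only reports what would be copied and pruned.
    snapdir sync --id "$ml_id" --from "$ml_from" --to "$ml_to" --delete --dryrun
  fi
}
```

Running mirror_local "$id" gs://feature-snapshots file:///srv/snapdir/features shows the preview; adding apply as a fourth argument performs the mirror.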
Tuning and dry runs
sync honors the shared transfer-tuning flags, applied to the single
store-to-store pipe:
- -j, --jobs <N>: concurrent object transfers (0/auto = CPUs, capped 16).
- --limit-rate <RATE>: cap aggregate bandwidth, e.g. 50M, 512K, 1G.
- --adaptive[=<FRACTION>]: adaptively tune concurrency/bandwidth toward a fraction (default 0.8) of measured capacity, backing off under contention.
- --dryrun: report what would be copied without writing anything to the destination. Useful for confirming a replication plan before it runs.
# Preview a cross-cloud replication, then run it rate-limited.
snapdir sync --id "$id" --from s3://primary/snap --to b2://dr/snap --dryrun
snapdir sync --id "$id" --from s3://primary/snap --to b2://dr/snap --limit-rate 50M
On a real sync the snapshot ID is printed to stdout and a human-readable summary
goes to stderr, consistent with push and stage. The summary counts
unique objects actually copied, not file references in the manifest. If a
manifest references the same content more than once, or the destination pool
already has an object, those duplicates do not inflate the copied count; a sync
that only publishes a missing manifest can correctly report 0 copied.
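Since the ID and the summary use different streams, a script can capture one and log the other. The sketch below assumes nothing about the summary's format; the helper name sync_and_record and the log path are hypothetical.

```shell
# Hypothetical helper: keep the snapshot ID from stdout, append the
# human-readable summary from stderr to a log file.
# Usage: sync_and_record <logfile> <sync arguments...>
sync_and_record() {
  sr_log="$1"
  shift
  # Only stderr is redirected, so the command substitution sees just the ID.
  sr_id=$(snapdir sync "$@" 2>>"$sr_log") || return 1
  printf '%s\n' "$sr_id"
}
```

For example, synced=$(sync_and_record sync.log --id "$id" --from s3://primary/snap --to b2://dr/snap) leaves the ID in $synced and the summary in sync.log.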
Where to go next
- Stores: the backends sync can connect to, and their auth.
- Pushing and pulling: the fetch/push path for custom backends.
- History: find which locations already hold a given snapshot.
- Reference: snapdir sync.