#
Snapdir
#
What is snapdir?
Snapdir is a tool for creating and restoring snapshots of directories.
Its main features are:
- Generating manifests and unique identifiers of the contents of directories and files.
- Saving data to and restoring it from pluggable storage backends such as Amazon S3 and Backblaze B2.
- Verifying the integrity of the data using cryptographic hashes.
- UNIX-style composability.
- Content-addressable local object cache.
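To make the manifest and identifier ideas in the list above concrete, here is a minimal sketch. It is not snapdir's implementation or manifest format: `sha256sum` from coreutils stands in for BLAKE3 (set `HASH_CMD=b3sum` if it is installed), and the function names `sketch_manifest` and `sketch_id` are hypothetical. It assumes GNU `find`, `sort` and `xargs`.

```shell
#!/usr/bin/env bash
# Illustrative only: NOT snapdir's manifest format or hashing scheme.
# sha256sum stands in for BLAKE3; set HASH_CMD=b3sum to use BLAKE3 instead.
HASH_CMD="${HASH_CMD:-sha256sum}"

# Print one "<hash>  <relative path>" line per file, sorted by path so that
# identical directory contents always yield identical output.
sketch_manifest() {
  local dir="$1"
  (
    cd "$dir" || exit 1
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r "$HASH_CMD"
  )
}

# Derive a single identifier for the directory by hashing its manifest.
sketch_id() {
  sketch_manifest "$1" | "$HASH_CMD" | cut -d ' ' -f 1
}
```

Under these assumptions, two directories with the same files and contents yield the same identifier, and changing any file changes it.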
Snapdir is a building block for applications that need one or more of the following characteristics:
- Storing data in untrusted environments.
- Conflict-free replicated data types (CRDTs).
- File-system based data replication.
- Data integrity verification.
- File deduplication.
- Multicloud file sharing.
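As a rough illustration of how content addressing supports the deduplication and integrity-verification items above, the sketch below stores each file under the hash of its contents. It does not reflect snapdir's cache layout: the flat `<store>/<digest>` layout, the function names `sketch_store_put` and `sketch_store_verify`, and the use of `sha256sum` in place of BLAKE3 are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Illustrative only: NOT snapdir's cache layout. sha256sum stands in for BLAKE3.

# Copy a file into "$store/<digest>" unless an object with that digest
# already exists, then print the digest. Identical files are stored once.
sketch_store_put() {
  local store="$1" file="$2" digest
  [ -f "$file" ] || return 1
  digest="$(sha256sum < "$file" | cut -d ' ' -f 1)"
  mkdir -p "$store"
  [ -e "$store/$digest" ] || cp -- "$file" "$store/$digest"
  printf '%s\n' "$digest"
}

# Re-hash a stored object and compare the result with its name.
# Returns non-zero if the object is missing or its contents have changed.
sketch_store_verify() {
  local store="$1" digest="$2"
  [ -f "$store/$digest" ] || return 1
  [ "$(sha256sum < "$store/$digest" | cut -d ' ' -f 1)" = "$digest" ]
}
```

Because an object's name is derived from its contents, storing the same file twice is a no-op, and any tampering with a stored object shows up as a mismatch between its name and its recomputed digest.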
#
Motivation
Snapdir was created as a prototype to explore an optimal workflow for consuming and generating files in ephemeral environments. At BermiLabs, we used it to replicate parquet files in our analytics pipelines and our distributed ETL workflows.
We decided to open source it so that others could use it to implement CRDT strategies in eventually consistent, read-heavy applications.
#
Usage
#
Prerequisites
Snapdir requires BLAKE3 (the b3sum command) for hashing and HMAC signing, and optionally sqlite (the sqlite3 command) to query local snapshots.
To verify that the dependencies are on your $PATH, run:
command -v b3sum
command -v sqlite3
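If you want this check in a script, it can be wrapped in a small helper. The function below is a hypothetical convenience, not part of snapdir; it only relies on the shell built-in `command -v`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of snapdir): report every command that is
# missing from $PATH and return non-zero if at least one is missing.
missing_commands() {
  local cmd status=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, `missing_commands b3sum || exit 1` stops a script early when the required dependency is absent, while `missing_commands sqlite3` can be treated as a warning because sqlite is optional.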
To install the dependencies on Debian-based distributions, you can run:
apt-get install -y wget sqlite3
wget -q "https://github.com/BLAKE3-team/BLAKE3/releases/download/1.3.1/b3sum_linux_x64_bin" -O /usr/local/bin/b3sum
chmod +x /usr/local/bin/b3sum
#
Installation
Snapdir is implemented as a set of independent, tested Bash scripts.
To install them all in /usr/local/bin/, you can run:
wget -O - https://raw.githubusercontent.com/bermi/snapdir/main/utils/install.sh | bash
At a minimum, snapdir requires the snapdir and snapdir-manifest scripts to be on your $PATH.
#
Via Docker
You can try snapdir using the Docker image bermi/snapdir:
target_dir=./ # specify a target directory
# The first -v mounts the target directory into the container;
# the second -v shares the local snapdir cache with the container.
docker run -it --rm \
  -v "$(realpath "$target_dir"):/target" \
  -v "${HOME}/.cache/snapdir:/root/.cache/snapdir" \
  bermi/snapdir manifest /target
#
Contributing
Snapdir is licensed under the MIT License and contributions are welcome! Please check the contributing guidelines and visit the GitHub repo for more information.
To check out the code and run the tests:
git clone https://github.com/bermi/snapdir.git
cd snapdir
./snapdir-test
The project includes a VS Code devcontainer configuration that you can use to develop snapdir in a containerized environment.
#
Alternatives
There are many other tools that might be better suited for your particular use case. For example: ostree, mtree, Git LFS, DVC, Syncthing, BitTorrent, DAT, git, HDF5, tar, Btrfs, ZFS, IPFS, Perkeep, SeaweedFS, Upspin, Keybase Filesystem and Sigstore.
We use Snapdir in conjunction with some of the tools mentioned above.
None of them met the simplicity, ergonomics and auditability goals we had in mind when defining Snapdir.