It is a command-line tool, available for Windows and Linux, that can be left running in the background.
A central server coordinates which Internet Archive items each client should back up. Each item can be given a priority score (currently, these priorities are assigned based on the size and "uniqueness" of the item type).
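The "uniqueness" part of the priority score can be illustrated with a minimal sketch. The actual scoring formula is not given on this page, so everything here is an assumption: the rule that a trailing "-YYYYMMDD" suffix marks one snapshot in a series, the 1/N score, and the function names are all hypothetical. Size is not modeled at all.

```python
import re
from collections import Counter

# Assumption: a trailing "-YYYYMMDD" suffix marks one snapshot in a series.
DATE_SUFFIX = re.compile(r"-\d{8}$")


def series_key(identifier: str) -> str:
    """Strip a trailing date suffix so dated snapshots share one key."""
    return DATE_SUFFIX.sub("", identifier)


def uniqueness_scores(identifiers: list[str]) -> dict[str, float]:
    """Hypothetical score: 1 / (number of items sharing the same series key)."""
    counts = Counter(series_key(i) for i in identifiers)
    return {i: 1.0 / counts[series_key(i)] for i in identifiers}


items = [
    "warc-example1.com",
    "warc-example2-20200623",
    "warc-example2-20200624",
    "warc-example2-20200625",
]
scores = uniqueness_scores(items)
```

With these identifiers, the single "warc-example1.com" item scores higher than each of the three dated "warc-example2" items, which share one series key.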
Currently implemented features
- User registration (optional)
- Retrieval of items from IA
- Hash consistency checks
- Disk space checks
- Coordination server and job assignment
- Download resume (file granularity)
- Run on startup (Windows only)
More info on the GitHub page: iabak-sharp
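Three of the features listed above (hash consistency checks, disk space checks, and file-granularity resume) can be sketched together. This is an illustration in Python, not the iabak-sharp code: the choice of MD5 as the hash, the function names, and the `reserve` parameter are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded into memory at once."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_consistent(path: Path, expected_md5: str) -> bool:
    """True if the file exists and its MD5 matches the expected hex digest."""
    return path.is_file() and file_md5(path) == expected_md5.lower()


def has_room_for(directory: Path, item_size: int, reserve: int = 0) -> bool:
    """True if the volume holding `directory` can take `item_size` bytes
    while keeping `reserve` bytes free (hypothetical safety margin)."""
    free = shutil.disk_usage(directory).free
    return free - reserve >= item_size


def files_to_fetch(directory: Path, expected: dict[str, str]) -> list[str]:
    """Resume at file granularity: return only the files that are
    missing or fail the hash check; verified files are skipped."""
    return [name for name, md5 in expected.items()
            if not is_consistent(directory / name, md5)]
```

After an interrupted run, a client built this way would call `files_to_fetch` with the item's expected hashes and download only the names it returns, rather than restarting the whole item.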
Comparison with git-annex implementation
- Written in a more maintainable language (as opposed to bash)
- No concept of shards: because we're not constrained by git repository size limits, each client only has to worry about the metadata of the files that they're actually storing on their drive. The server only stores a minimal amount of metadata (identifier, total size, and users having that item).
- We're free to implement features that don't perfectly match the git use cases (e.g. remote verification/challenges, encryption support, alternate distribution mechanisms such as IPFS)
- Supports Windows (in addition to Linux)
As an example of the priority scoring described earlier, "warc-example1.com" has a higher priority than any of the dated items "warc-example2-20200623", "warc-example2-20200624", "warc-example2-20200625", etc.
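The minimal server-side metadata mentioned in the comparison (identifier, total size, and users having that item) could be modeled roughly as below. This is a sketch under stated assumptions: the `ItemRecord` type, the `assign_item` function, and the "least-replicated item that fits" policy are hypothetical, and the real server may assign jobs differently (for instance, by priority score).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ItemRecord:
    """The three fields the page says the server stores per item."""
    identifier: str
    total_size: int                      # bytes
    users: set[str] = field(default_factory=set)


def assign_item(records: list[ItemRecord], user: str,
                free_bytes: int) -> Optional[ItemRecord]:
    """Hypothetical policy: give `user` the item with the fewest holders
    that they do not already store and that fits in their free space."""
    candidates = [r for r in records
                  if user not in r.users and r.total_size <= free_bytes]
    if not candidates:
        return None
    # Break ties on identifier so the result is deterministic.
    chosen = min(candidates, key=lambda r: (len(r.users), r.identifier))
    chosen.users.add(user)
    return chosen
```

Because the server only tracks these fields, per-file metadata stays on the clients that actually hold the files, which matches the "no shards" point above.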