BitTorrent DHT

From Archiveteam

Latest revision as of 10:42, 11 September 2019

The BitTorrent DHT (Kademlia) is a decentralized alternative to trackers for BitTorrent. However, it can also be used to discover torrents and build an index. While downloading the contents would be prohibitively expensive (and raise legal issues), the metadata is valuable and only 200-300 GB in size.

The following bash one-liner can be used to download the .torrent metadata for all torrents that coppersurfer.tk has peers for:

mkdir torrents
wget http://coppersurfer.tk/full_scrape_not_a_tracker.tar.gz -O - | tar --to-stdout -xz | xxd -ps -c1 | tr -d "\n" | LC_ALL=C grep --only-matching -P "32303a[0-9a-f]{40}64383a636f6d706c65746569(3[0-9])+6531303a646f776e6c6f6164656469(3[0-9])+6531303a696e636f6d706c65746569(3[0-9])+6565" |grep -v "646f776e6c6f6164656469306531303a696e636f6d706c65746569306565" | cut -c 7-46 | sed 's/^/magnet:?xt=urn:btih:/g' | sed 's/$/\&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969/g' | aria2c -d ./torrents -i - --bt-metadata-only=true --bt-save-metadata=true -j 100
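
The long hex constants in the command above are hex-encoded bencode fragments of a scrape file: 32303a is "20:" (the length prefix of a 20-byte info hash), followed by "d8:completei", "e10:downloadedi", "e10:incompletei" and the closing "ee". The second grep drops entries whose downloaded and incomplete counters are both 0. The following is a minimal sketch of the extraction step only, not a replacement for the one-liner. The function name scrape_to_hashes is made up for illustration, and the sketch swaps xxd for od and grep -P for grep -E (assuming a grep that supports -o), which should behave the same for this pattern.

```shell
# Hypothetical helper: read a decompressed scrape file on stdin and
# print one hex-encoded info hash per line.
scrape_to_hashes() {
  # Hex-dump the input as one long lowercase string without separators.
  od -An -v -tx1 |
    tr -d ' \n' |
    # Match "20:<20 bytes>d8:completeiNe10:downloadediNe10:incompleteiNee".
    LC_ALL=C grep -oE '32303a[0-9a-f]{40}64383a636f6d706c65746569(3[0-9])+6531303a646f776e6c6f6164656469(3[0-9])+6531303a696e636f6d706c65746569(3[0-9])+6565' |
    # Drop entries ending in "downloadedi0e10:incompletei0ee".
    grep -v '646f776e6c6f6164656469306531303a696e636f6d706c65746569306565' |
    # Characters 7-46 are the 40 hex digits after the "20:" prefix.
    cut -c 7-46
}
```

It could be called as scrape_to_hashes < scrape_file, where scrape_file stands for an already extracted scrape.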

The same approach works for other trackers; for an incomplete list of trackers sorted by number of indexed torrents, see [1]. Note that some trackers do not publish their scrape files, and some publish them in a non-standard format. See also [2] for another list and links to further lists.
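
Adapting the command to another tracker mostly means changing the scrape URL and the tr= parameter that the two sed calls append to each magnet link. A small sketch of the magnet-link part, assuming the tracker's announce URL is known: magnet_for and encode_tracker are hypothetical helper names, and the encoding only handles ':' and '/', which is all a plain udp://host:port or http://host:port/announce URL needs.

```shell
# Hypothetical helper: percent-encode ':' and '/' in a tracker URL.
# Other reserved characters are NOT handled in this sketch.
encode_tracker() {
  printf '%s' "$1" | sed -e 's/:/%3A/g' -e 's,/,%2F,g'
}

# Hypothetical helper: print a magnet link for a hex info hash ($1)
# that names the given tracker announce URL ($2).
magnet_for() {
  printf 'magnet:?xt=urn:btih:%s&tr=%s\n' "$1" "$(encode_tracker "$2")"
}
```

For example, magnet_for with an info hash and udp://tracker.coppersurfer.tk:6969 produces the same form of link that the one-liner feeds to aria2c.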

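DHT crawling

*http://labs.boramalper.org/magnetico/
*https://github.com/kevinlynx/dhtcrawler2
*https://github.com/FlyersWeb/dhtbay
*https://github.com/danfolkes/Magnet2Torrent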

DHT indexers

These have large databases that should be archived, as some of the torrent metadata is probably unavailable by now.
List: https://opentrackers.org/links/publicly-tracked-torrents/#searchengines
Note that most of the Chinese indexes are run by the same person/group/organization.

Vuze DHT

There are two competing BitTorrent DHTs: the one used in Vuze/Azureus (Vuze DHT) and the one used in all other clients (Mainline DHT/Kademlia).

There are no Vuze DHT indexing/archival projects. Indexing it should be easier, as Vuze DHT shares information more readily and has a pseudo-search engine built into the client. On the other hand, the only implementation is in Java.

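Data dumps

*[https://archive.org/details/publicbt.com publicbt.com daily dump from 2012-02 to 2012-08] ([https://torrentfreak.com/publicbt-tracker-set-to-patch-bittorrents-achilles-heel-090712/ background])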