|Web Roasting - Save all the web hosting sites!|
|Archiving status||Scraping in progress; downloading upcoming|
|IRC channel||(on EFnet)|
Web Roasting is a project to save old web hosting sites before they shut down. We are currently scraping for hosted websites; the grab is coming soon.
How can I help?
There are two ways you can help right now:
- Add more web hosting sites to the ISP Hosting or University Web Hosting pages.
- Scrape the following for hosted web sites:
  - Google (site:webhost.com)
  - Bing (site:webhost.com)
  - DuckDuckGo (site:webhost.com)
  - Yandex (site:webhost.com)
  - Baidu (site:webhost.com)
  - Twitter (litterapi preferred)
  - Reddit (http://www.reddit.com/domain/webhost.com/)
  - Links from MediaWiki wikis
  - The Open Directory Project
  - The Common Crawl Index
  - The Wayback Machine
  - URLTeam crawls
  - DNSdumpster.com (only for hosts that use subdomains)
  - pentest-tools.com (only for hosts that use subdomains)
  - Sitemaps or other types of indexes, if the web host provides any.
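
As a starting point for the scraping sources listed above, the following is a minimal sketch that builds one lookup URL per source for a given web host. The function name `build_discovery_urls` is hypothetical, and the search engine URL patterns and the Wayback Machine CDX query parameters are assumptions based on their public interfaces, not something this project prescribes. Only the `site:` query form and the Reddit domain URL come from the list itself. The sketch only builds URLs; it does not fetch anything, and most search engines rate-limit or block automated requests.

```python
from urllib.parse import urlencode


def build_discovery_urls(host: str) -> dict[str, str]:
    """Return one lookup URL per source for the given web host domain.

    Hypothetical helper: the URL patterns for the search engines and the
    Wayback Machine CDX endpoint are assumptions, not project policy.
    """
    # The "site:" operator restricts results to pages under the host.
    query = f"site:{host}"
    return {
        "google": "https://www.google.com/search?" + urlencode({"q": query}),
        "bing": "https://www.bing.com/search?" + urlencode({"q": query}),
        "duckduckgo": "https://duckduckgo.com/?" + urlencode({"q": query}),
        # Reddit lists submissions that link to a domain at /domain/<host>/.
        "reddit": f"http://www.reddit.com/domain/{host}/",
        # Assumed CDX query: all captures under the domain and its
        # subdomains, one row per unique URL, original URL column only.
        "wayback_cdx": "http://web.archive.org/cdx/search/cdx?"
        + urlencode(
            {
                "url": host,
                "matchType": "domain",
                "fl": "original",
                "collapse": "urlkey",
            }
        ),
    }


if __name__ == "__main__":
    # "webhost.com" is the placeholder host used in the list above.
    for source, url in build_discovery_urls("webhost.com").items():
        print(f"{source}: {url}")
```

Replace `webhost.com` with the host being scraped. Each URL can then be opened by hand or passed to whatever scraper is used for that source.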