WikiTeam

From Archiveteam
Revision as of 15:50, 6 December 2011 by Emijrp (Talk | contribs)
We save wikis, from Wikipedia to the tiniest wikis
130+ wikis saved to date
WikiTeam, a set of tools for wiki preservation and a repository of wikis
URL: http://code.google.com/p/wikiteam
Project status: Online!
Archiving status: In progress...
Project source: Unknown
Project tracker: Unknown
IRC channel: #wikiteam

Welcome to WikiTeam. A wiki is a website that allows the creation and editing of any number of interlinked web pages, generally used to store information on a specific subject or subjects. Editing is done in an ordinary web browser using a simplified markup language (wikitext, for example) or a WYSIWYG (what-you-see-is-what-you-get) text editor.

Examples of huge wikis:

There are also several wikifarms with hundreds of wikis.

Most of the wikis don't offer public backups. How bad!


Tools and source code

Official WikiTeam tools

Other


Wiki dumps

For a more detailed list, visit the download section on Google Code.

Another collection of MediaWiki dumps is hosted on Scott's Website. More dumps are available as a collection in the Internet Archive.

TODO lists:

Legend
     Good
     Could be better
     Bad
     Unknown
Wiki | Wiki is online? | Dumps available? (official or home-made) | Comments/Details | Saved by us? Who? Where?
Anarchopedias | Yes | Official: no. Home-made: yes | - | idiolect
Archive Team Wiki | Yes | Official: no. Home-made: yes | - | WikiTeam
Bulbapedia | Yes | Official: no. Home-made: no | - | dr-spangle is working on it with a self-built PHP downloader
Citizendium | Yes | Official: daily (no full history). Home-made: yes, April 2011 | No image dumps available | -
EditThis | Yes | Official: no. Home-made: in progress | - | -
enciclopedia.us.es | Yes | Official: no. Home-made: no | Sysop sent me page text SQL tables | emijrp
Encyclopedia Dramatica | No | Official: no. Home-made: partial | WebEcology Project Article Dump (~9000 articles); most of the images probably lost | -
Encyclopedia Dramatica.ch (new ED) | Yes | Official: ? Home-made: ? | Slowly being rebuilt from old sources. Should be up for a while, but who knows for how long? | -
Gentoo wikis | Yes | Official: no. Home-made: yes | - | WikiTeam
GNUpedia | No | Official: no. Home-made: no | No database. This "wiki encyclopedia" was only HTML pages. Only ~3 articles were sent to the mailing list. After that, the project was closed | -
MeatBall | Yes | Official: no. Home-made: yes (mirror) | No histories, no XML format | SDBoyd
Metapedia | Yes | Official: ? Home-made: no | - | -
Neoseeker aka Scout wikis | Yes | Official: ? Home-made: no | - | -
Nupedia | No | Official: ? Home-made: yes, saved from IA | - | -
OmegaWiki | Yes | Official: daily | - | -
OpenStreetMap | Yes | Official: yes. Home-made: no | - | -
OpenSUSE wikis | Yes | Official: no. Home-made: yes | - | Hydriz
OSDev | Yes | Official: weekly | - | Not yet
TV Tropes | Yes | Official: no. Unofficial: in progress | No dump mechanism, using wget -nc -r -p -l 0 -np -w 45 -E -k -T 10 -nv -x "http://tvtropes.org" | DoubleJ
Uncyclomedias | - | - | - | -
Wikanda | Yes | Official: no. Home-made: yes | - | emijrp
Wikia | Yes | Official: on demand | No image dumps available | Not yet
WikiFur | Yes | Official: yes | No image dumps available | Not yet
WikiHow | - | - | - | -
Wikimedia Commons | Yes | Official: periodically | No image dumps available | Not yet
Wikipedia | Yes | Official: periodically | No image dumps available. The English Wikipedia dump tends to be very old | Not yet
Wiki-site.com | - | - | - | -
WikiTravel | Yes | Official: not yet. Home-made: yes, another of 2010-06-14 | - | WikiTeam
WikiWikiWeb | Yes | Home-made: yes | - | Ca7
(o:forum | Yes | No | - | Not yet, to figure out how
WikiWiki.de | Yes | No | - | Not yet, to figure out how
GruenderWiki | Yes | No | - | Not yet, to figure out how
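
The wget invocation quoted in the TV Tropes row packs many short flags together. As a rough sketch, the Python snippet here rebuilds the same argument list with each flag annotated. The function name build_mirror_command and its parameters are hypothetical, chosen only for illustration; the flag meanings follow the GNU Wget manual.

```python
def build_mirror_command(url, wait_seconds=45, timeout_seconds=10):
    """Return the argv list for the wget mirror command quoted in the table.

    Hypothetical helper for illustration; it only builds the list and
    does not run wget.
    """
    return [
        "wget",
        "-nc",                       # no-clobber: skip files already downloaded
        "-r",                        # recursive retrieval
        "-p",                        # also fetch page requisites (images, CSS)
        "-l", "0",                   # recursion depth 0 = unlimited
        "-np",                       # never ascend to the parent directory
        "-w", str(wait_seconds),     # wait between requests, in seconds
        "-E",                        # add .html to HTML files lacking the extension
        "-k",                        # convert links for local browsing
        "-T", str(timeout_seconds),  # network timeout, in seconds
        "-nv",                       # non-verbose output
        "-x",                        # force creation of a directory hierarchy
        url,
    ]


if __name__ == "__main__":
    print(" ".join(build_mirror_command("http://tvtropes.org")))
```

The 45-second wait between requests is what keeps this kind of crawl gentle on a site that offers no dump mechanism; it also means a full mirror can take a very long time.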

Tips

Some tips:

  • When downloading Wikipedia/Wikimedia Commons dumps, pages-meta-history.xml.7z and pages-meta-history.xml.bz2 contain the same data, but the 7z file is usually smaller (better compression ratio), so prefer 7z.
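
If you do end up with the bz2 variant, it can be read without unpacking it to disk first. The following is a minimal sketch, assuming a MediaWiki XML dump compressed with bz2; the function name count_pages is hypothetical. Python's standard library has no 7z reader, so the 7z variant would need an external tool or a third-party package instead.

```python
import bz2
import xml.etree.ElementTree as ET


def count_pages(dump_path):
    """Count <page> elements in a bz2-compressed MediaWiki XML dump."""
    pages = 0
    # bz2.open decompresses on the fly, so the full XML is never written to disk.
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            # Dump elements carry an XML namespace; compare the local name only.
            if elem.tag.rsplit("}", 1)[-1] == "page":
                pages += 1
                elem.clear()  # drop the finished page's text and children
    return pages
```

Calling count_pages("pages-meta-history.xml.bz2") on a full-history dump will still take a long time, since every revision has to be decompressed and parsed.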

BitTorrent downloads

A feed of BitTorrent downloads is available for the latest files posted to the WikiTeam Google Code Downloads.

Files under 1 MB are blocked on the service generating these torrents (Burnbit.com), so not every file is available as a torrent. There may be some delay after a file is uploaded before the torrent appears on the feed. You can subscribe to this feed in your BitTorrent client for automatic downloads (this has been tested successfully in µTorrent on Windows).

Mirrors

  1. SourceForge (itself mirrored to 26 other mirrors)
  2. Internet Archive (direct link to directory)

Closing/In danger

External links

