Archiving status: In progress... (WARC files being uploaded)
Google Code allowed people to commit their code to a Subversion (SVN), Git or Mercurial repository. It had a downloads section where people could upload their software packages (with a 4 GB quota that could be increased upon request) and a wiki where projects could document their work. There was also an issue tracker for tracking bugs in a project's software.
The site went read-only on August 24, 2015, and officially shut down on January 25, 2016. Google left a public archive, though.
Archiving source code repositories is rather easy (and incremental): clone the Git or Mercurial repository, or check out the SVN repository. For SVN, make sure that you check out all branches, not just trunk. Ideally, for SVN one would use "svnrdump dump REPO", which dumps the complete history of the repository rather than only the latest revision.
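The steps above can be sketched as a small helper that picks the right command for each repository type. This is only an illustration under stated assumptions: the function names `archive_command` and `archive` are hypothetical, the repository URL and destination are supplied by the caller, and using `git clone --mirror` and `hg clone --noupdate` to fetch full history without a working copy is a choice made here, not something the text prescribes. `svnrdump dump` writes the dump stream to standard output, so it is redirected into a file.

```python
import subprocess


def archive_command(vcs, url, dest):
    """Return the argv list for a full-history copy of one repository.

    vcs  -- "git", "hg" or "svn"
    url  -- repository URL (supplied by the caller, not guessed here)
    dest -- target directory (git/hg) or dump file (svn)
    """
    if vcs == "git":
        # --mirror copies all refs, not just the default branch.
        return ["git", "clone", "--mirror", url, dest]
    if vcs == "hg":
        # --noupdate skips the working copy; history is still complete.
        return ["hg", "clone", "--noupdate", url, dest]
    if vcs == "svn":
        # svnrdump writes the dump to stdout; dest is used by archive().
        return ["svnrdump", "dump", url]
    raise ValueError("unsupported VCS: %r" % (vcs,))


def archive(vcs, url, dest):
    """Run the command; for SVN, redirect the dump stream into dest."""
    cmd = archive_command(vcs, url, dest)
    if vcs == "svn":
        with open(dest, "wb") as out:
            subprocess.run(cmd, stdout=out, check=True)
    else:
        subprocess.run(cmd, check=True)
```

Calling `archive("svn", url, "project.dump")` would then produce a dump file that `svnadmin load` can restore, while the Git and Mercurial variants leave a bare clone in the destination directory.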
Archiving the issue trackers, wikis and downloads will be a bit harder.
ArchiveTeam started to save Google Code on December 18, 2015, as a Warrior project.
The public archive that Google left after the closure is missing some of the original information. Although the original content was hidden from the public, ArchiveTeam obtained access and continued saving it, so that the Wayback Machine can receive a full copy.
Some seeds for site discovery:
- Underway: Scrape Google Code Search
- URLs from ArchiveTeam IRC logs
- List scraped from MediaWiki wikis
- List from FlossMole's data (sorted from a possibly-incomplete survey in November 2012: http://flossdata.syr.edu/data/gc/)
- Links from Open Directory Project
- Links from Kyan
- TODO: Scrape Google Search
- TODO: Scrape Bing
- TODO: Scrape Twitter
- TODO: Scrape the Common Crawl Index
- TODO: Scrape URLTeam dumps
- TODO: Ask Chris DiBona for a complete list of projects
- FlossMole provides a set of tools to spider projects from GC
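Seed lists from sources like these overlap heavily, so they need to be reduced to a set of unique project names before crawling. The sketch below shows one way to do that. It assumes project URLs take the form `code.google.com/p/<name>/` or `<name>.googlecode.com`; those patterns and the helper name `project_names` are assumptions for illustration, not part of any ArchiveTeam tooling.

```python
import re

# Assumed URL shapes (not stated in the text above):
#   code.google.com/p/<name>/...   and   <name>.googlecode.com/...
PROJECT_PATTERNS = [
    re.compile(r"code\.google\.com/p/([a-z0-9][a-z0-9-]*)", re.IGNORECASE),
    re.compile(r"\b([a-z0-9][a-z0-9-]*)\.googlecode\.com", re.IGNORECASE),
]


def project_names(urls):
    """Return the sorted, de-duplicated project names found in urls."""
    names = set()
    for url in urls:
        for pattern in PROJECT_PATTERNS:
            match = pattern.search(url)
            if match:
                # Normalise case so the same project is counted once.
                names.add(match.group(1).lower())
                break
    return sorted(names)


seeds = [
    "https://code.google.com/p/Example-Project/issues/list",
    "http://example-project.googlecode.com/svn/trunk/",
    "http://code.google.com/p/other/",
    "http://example.com/",  # not a Google Code URL, ignored
]
print(project_names(seeds))
```

Lowercasing the names is a design choice made here so that the same project reached through differently cased links is only queued once.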
Google Code archives are (being) uploaded to https://archive.org/details/archiveteam_googlecode, in WARC format.
"The Google Code Archive (https://code.google.com/archive/) contains the data found on the Google Code Project Hosting Service, which will be turned down in early 2016. This archive contains over 1.4 million projects, 1.5 million downloads, and 12.6 million issues."
- FAQ - support - Project Hosting on Google Code FAQ - User support for Google Project Hosting - Google Project Hosting
- Bidding farewell to Google Code
- Export to GitHub - Google Code