GeoCities Japan

From Archiveteam
Revision as of 05:31, 5 November 2018 by Hiroi (talk | contribs) (Adding some issues to keep track of.)
GeoCities Japan
Project status: Closing
Archiving status: In progress...
Project source: Unknown
Project tracker: Unknown
IRC channel: #notagain (on EFnet)
Project lead: Unknown

GeoCities Japan is the Japanese version of GeoCities. It survived the 2009 shutdown of the global platform.


On 2018-10-01, Yahoo! Japan announced that they would be closing GeoCities at the end of March 2019. (New accounts can still be created until 2019-01-10.)

Discovery Info


  • Hidden-entry sites (Importance: Low): A few sites do not use index.htm/index.html as their entry point, so requesting the first-level directory will fail to reach them.
    • However, as long as other GeoCities sites link to them, they should be discoverable by the crawler.
    • So the only problem is pages whose inlinks are all dead. There should be very few of those. If we want to be absolutely sure, we can diff IA's current CDX against the CDX from the crawl.
    • Note that this is not a problem for the neighborhood sites, since we can enumerate their URLs.
  • Deduplication (Importance: Low): If we are going to release a torrent as we did with GeoCities, then it may be worth deduplicating. It most likely won't make a major difference.
  • Final Snapshot (Importance: Moderate): Page contents may still change between now and March 31, 2019, so we need to do another crawl close to the shutdown date.
    • Note that a lot of users will be setting up 301/302 redirects before the server shuts down. According to Yahoo, we'll have until Sep 30, 2019 to record those redirects.
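
The CDX diff suggested under "Hidden-entry sites" could be sketched roughly as follows. This is only an illustration, not the project's actual tooling. It assumes both CDX files use the common space-separated layout in which the original URL is the third field, and that comparing host, path, and query (ignoring scheme and host case) is a good enough notion of "same URL". The function names are hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path + query so http/https and host case do not matter."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def urls_from_cdx(lines, url_field=2):
    """Collect normalized URLs from CDX lines.

    Assumption: the original URL sits in the third whitespace-separated
    field (index 2). Adjust url_field if the files use another layout.
    """
    urls = set()
    for line in lines:
        # Skip blank lines and the optional " CDX ..." header line.
        if not line.strip() or line.lstrip().startswith("CDX"):
            continue
        fields = line.split()
        if len(fields) > url_field:
            urls.add(normalize(fields[url_field]))
    return urls


def missing_from_crawl(ia_lines, crawl_lines):
    """Return URLs that IA's CDX knows about but the crawl's CDX does not."""
    return sorted(urls_from_cdx(ia_lines) - urls_from_cdx(crawl_lines))
```

Passing two open file objects (IA's CDX dump and the crawl's CDX) to missing_from_crawl yields the list of candidate pages the crawl did not reach, which can then be fed back to the crawler as extra seeds.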