Difference between revisions of "Fileplanet"

From Archiveteam
(updates)
Tag: Replaced
 
(9 intermediate revisions by 2 users not shown)
Line 13: Line 13:
}}
}}


[http://www.fileplanet.com FilePlanet] is no longer hosting new content, and "is in the process of being archived [by IGN]."
In 2012 [http://www.fileplanet.com FilePlanet] announced it was no longer hosting new content, and "is in the process of being archived [by IGN]."


FilePlanet hosted 87,190 download pages of game-related material (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients. We'll want all the arms we can for this one, since it gets harder the farther the archiving goes (files are numbered chronologically, and Skyrim mods are bigger than Doom ones).
FilePlanet hosted tens of thousands of game-related files (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.


===Current Situation===
===The archival===


We got direct access to the files from IGN, hooray! Mirroring is done; see https://archive.org/details/archiveteam-fileplanet for the tarballs. The ftp2 files cannot be shared publicly since there are private files mixed in; we save them to IA anyway, so maybe in the future we can sort them out. A detailed writeup and user-friendly interface will be available later. No help needed, everything below is outdated. Thanks for your interest!
After first downloading files [[Fileplanet/Status_of_by_id_grab|by iterating IDs on the public website fileplanet.com and uploading those in chunks to archive.org]] as well as [[Fileplanet/non-id-urls|scouting the web for other public URLs]], we got FTP access to the storage servers from the staff. Thanks!


There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/
https://archive.org/details/archiveteam-fileplanet is the collection.


===What We Need===
Unpacked and sorted, it amounts to ~120k files at ~10TB. The "ftp2" files (another ~300k files totaling ~1.2TB) cannot be shared publicly since there are private files mixed in; we saved them to IA anyway, so maybe in the future we can sort them out. If you are looking for files from Fileplanet that are not included in the public archives, contact [[User:Schbirid]] with archived URLs that prove their previous availability to the public, e.g. via archived fileplanet.com pages.


* More file URLs, see https://archiveteam.org/index.php?title=Fileplanet/non-id-urls
<gallery>
* Where do links like http://dl.fileplanet.com/dl/dl.asp?classicgaming/o2home/rtl.zip come from and can we rescue those too?
File:Fileplanet ftp structure.png|FTP structure
** The non-IDed files are stuck behind the download manager - any clever way past it?  URLs to the files are of the form [http://download.direct2drive.com/ftp2/planetannihilation/mercilesscreations/opflash/opflash_-_uber_editor_tutorial.pdf?clientid=781894158 http://download.direct2drive.com/ftp2/planetannihilation/mercilesscreations/opflash/opflash_-_uber_editor_tutorial.pdf?clientid=781894158] and seem to require a valid ID to fetch.
File:Fileplanet ftp restructured File Size Statistics.png|Size statistics
*** Those URLs are the ones we currently fetch too. The script "visits" the download page and extracts such a URL. The problem with these files is that they open the download link in a new window, and I have not yet found out how to "open" that window correctly with wget. Haven't really tried though. -Schbirid
File:Fileplanet ftp restructured File Age.png|Age statistics
* Files! (approx. ??% done 22 June 2012)
File:Fileplanet ftp restructured File Type Statistics.png|File type statistics
** The easy part (incrementing a fileID and downloading it) is pretty much done, we got ~7 Terabytes through that.
File:Fileplanet ftp restructured Largest Files.png|Largest files
* /fileinfo/ pages - get URLs from sitemaps (Schbirid is downloading these)
</gallery>
** Afterwards, extract all thumbnail image links and grab the full size images (strip _sm2 from the basename)
*** grep -hPo 'http.*?_sm2.jpg' fileinfo*/fileinfo.log | sed 's/_sm2//' > fileinfo_fullsizeimages_URLs; # wgot those
**** Done! https://archive.org/details/FileplanetFiles_fileinfo_pages_images
** Schbirid is re-downloading all the fileinfo pages by incrementing the ID, the sitemaps were missing URLs
* [http://blog.fileplanet.com http://blog.fileplanet.com]
** Done! https://archive.org/details/FileplanetBlogFileplanetCom
* Schbirid mirrored http://www.fileplanet.com/fileblog/archives/ (starting from a URL like http://www.fileplanet.com/fileblog/archives/10-24-2010_10-30-2010.shtml ).
** Done! https://archive.org/details/FileplanetFileblog
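
The thumbnail step in the list above (the grep/sed one-liner that strips _sm2 from each image URL) can also be sketched in Python. This is only an illustration of the same idea, not the script that was actually used; the function name <code>full_size_urls</code> and the sample hostnames are made up, and the log format is assumed to contain plain thumbnail URLs ending in _sm2.jpg.

```python
import re

# Same pattern as the grep call: shortest run from "http" up to "_sm2.jpg".
# (The dot is escaped here; the original one-liner left it unescaped.)
THUMB_RE = re.compile(r"http.*?_sm2\.jpg")


def full_size_urls(log_text):
    """Return full-size image URLs derived from thumbnail URLs in log_text."""
    urls = []
    for line in log_text.splitlines():
        for thumb in THUMB_RE.findall(line):
            # Like sed 's/_sm2//': drop only the first "_sm2" in each match.
            urls.append(thumb.replace("_sm2", "", 1))
    return urls


if __name__ == "__main__":
    # Hypothetical log excerpt; example.invalid is a placeholder host.
    sample = (
        "GET http://example.invalid/images/shot1_sm2.jpg 200\n"
        "no image on this line\n"
    )
    for url in full_size_urls(sample):
        print(url)
```

The resulting list could then be written to a file and handed to wget, as the one-liner's comment suggests.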


=== Grabbing files by iterating the IDs on the website ===
There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/
First we tried to archive all the files by iterating the IDs in the URLs on the website. See https://wiki.archiveteam.org/index.php?title=Fileplanet/Status_of_by_id_grab for info and what we achieved.
 
=== Grabbing files from the FTP ===
* Schbirid emailed FPOps@IGN.com and eventually got FTP access to archive the files.
* TODO


===Related items===
* /fileinfo/ pages and the embedded images/thumbnails from the grab by IDs: https://archive.org/details/FileplanetFiles_fileinfo_pages_images
* /download/ pages and download logs from the grab by IDs: https://archive.org/details/Fileplanet_index.htmls_and_logs_scraped_by_id
* http://blog.fileplanet.com: https://archive.org/details/FileplanetBlogFileplanetCom
* http://www.fileplanet.com/fileblog/archives/: https://archive.org/details/FileplanetFileblog


{{navigation box}}
{{navigation box}}

Latest revision as of 21:36, 28 December 2023

FilePlanet
Fileplanet logo
Website host of game content, 1999-2012
URL http://www.fileplanet.com
Status Special case (no longer being updated)
Archiving status Saved!
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)
(formerly #fireplanet (on EFnet))
Data[how to use] archiveteam-fileplanet

In 2012 FilePlanet announced it was no longer hosting new content, and "is in the process of being archived [by IGN]."

FilePlanet hosted tens of thousands of game-related files (demos, patches, mods, promo stuff, etc.). These tend to be larger files, ranging from 10MB patches to 3GB clients.

The archival

After first downloading files by iterating IDs on the public website fileplanet.com and uploading those in chunks to archive.org, as well as scouting the web for other public URLs, we got FTP access to the storage servers from the staff. Thanks!

https://archive.org/details/archiveteam-fileplanet is the collection.

Unpacked and sorted, it amounts to ~120k files at ~10TB. The "ftp2" files (another ~300k files totaling ~1.2TB) cannot be shared publicly since there are private files mixed in; we saved them to IA anyway, so maybe in the future we can sort them out. If you are looking for files from Fileplanet that are not included in the public archives, contact User:Schbirid with archived URLs that prove their previous availability to the public, e.g. via archived fileplanet.com pages.

There is a half-assed search interface available at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?filename=yourfilenamehere and a directory browser at https://www.quaddicted.com/stuff/fileplanet/fileplanet.php?directory=/some/dir/here/
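
The two query URLs above can also be built from a script. A minimal Python sketch, assuming the interface takes ordinary URL-encoded <code>filename</code> and <code>directory</code> query parameters as the example links suggest; the helper names <code>search_url</code> and <code>directory_url</code> are made up for illustration, and nothing here is confirmed beyond the two URL shapes shown.

```python
from urllib.parse import urlencode

# Base URL taken from the links above.
BASE = "https://www.quaddicted.com/stuff/fileplanet/fileplanet.php"


def search_url(filename):
    """Build the filename-search URL for the given file name."""
    return BASE + "?" + urlencode({"filename": filename})


def directory_url(directory):
    """Build the directory-browser URL; slashes are left unescaped."""
    # safe="/" keeps the path readable, matching the example link's form.
    return BASE + "?" + urlencode({"directory": directory}, safe="/")


if __name__ == "__main__":
    print(search_url("yourfilenamehere"))
    print(directory_url("/some/dir/here/"))
```

Spaces and other special characters in a file name are percent-encoded (or turned into <code>+</code>) by <code>urlencode</code>; whether the interface expects that form is an assumption.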

Related items