Talk:Internet Archive Census

From Archiveteam


The jq command line for parsing the census JSON was not obvious to me, so here are two examples to get you started. To print the id and total_size of each item on one line, separated by a space:

jq -r '[.id, " ", .total_size | tostring] | add'

To get the MD5 hash and name of each file, iterate over the "files" array and pull the fields from each element:

jq -r '.files | .[] | [.md5, " ", .name | tostring] | add'
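Both filters can also be written with jq's string interpolation, which converts numbers to text without an explicit tostring. The sketch below runs them against a single made-up record so the output shape is visible; the record's values are hypothetical, and only the field names (id, total_size, files, md5, name) come from the filters above.

```shell
# Hypothetical sample record: the field names match the filters above,
# but the values are invented for illustration.
record='{"id":"example_item","total_size":12345,"files":[{"name":"a.txt","md5":"0123456789abcdef0123456789abcdef"},{"name":"b.txt","md5":"fedcba9876543210fedcba9876543210"}]}'

# One line per item: id and total_size separated by a space.
items=$(printf '%s\n' "$record" | jq -r '"\(.id) \(.total_size)"')

# One line per file: md5 and name separated by a space.
files=$(printf '%s\n' "$record" | jq -r '.files[] | "\(.md5) \(.name)"')

printf '%s\n' "$items" "$files"
# example_item 12345
# 0123456789abcdef0123456789abcdef a.txt
# fedcba9876543210fedcba9876543210 b.txt
```

The original filters work for the same reason: in jq the pipe binds more loosely than the comma, so tostring is applied to every element of the list, not just the last one.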

--Sep332 10:01, 12 March 2015 (EDT)

2012 census

In August 2012 I did a "census" using the search engine's export capabilities. The Internet Archive had 4.9 million items at that time. Emijrp (talk) 06:42, 20 November 2016 (EST)