|Archiving status||Not saved yet|
|IRC channel||(on EFnet)|
Quizlet is a mobile and web-based study application that lets students learn material through flashcards, games, and tests. It is currently used by 1 in 2 high school students and 1 in 3 college students in the United States. As of April 30, 2018, Quizlet had over 200 million user-generated flashcard sets and more than 30 million active users. It now ranks among the top 50 websites in the U.S.
Quizlet ‘sets’ are numbered incrementally, with the earliest public set having the ID ‘173’ and some of the more recent sets having IDs above ‘300000000’. Quizlet has an open API (see https://quizlet.com/api/2.0/docs) that returns a JSON copy of each set. An example API result can be seen here. Back-of-the-napkin math suggests that 300,000,000 public sets would take about 400 GB to store uncompressed, which works out to roughly 1.3 KB per set.
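The storage estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic: the two figures come from the paragraph above, the helper name `avg_set_size` is made up for illustration, and "GB" is assumed to mean decimal gigabytes.

```python
def avg_set_size(total_bytes, n_sets):
    """Average size of one stored set, in bytes."""
    return total_bytes / n_sets


# Figures from the text; decimal gigabytes are an assumption.
PUBLIC_SETS = 300_000_000
TOTAL_BYTES = 400 * 10**9

print(round(avg_set_size(TOTAL_BYTES, PUBLIC_SETS)))  # 1333
```

Measuring the average size of a few thousand sets that have already been downloaded and multiplying by the set count would give a better-grounded total.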
Grabbing the Data
As of now, I have been unsuccessful in finding a reliable way to get everything downloaded. The initial Python script I wrote to incrementally grab all of the sets via the API and save them as txt files works, but it is painfully slow: after a week of running it on three machines, I had only about 3 million sets downloaded. At that rate, 300 million sets would take on the order of 100 weeks. I have tried multithreading and multiprocessing, but neither approach managed to download even that many. Maybe someone else will have more luck.
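For anyone who wants to pick this up, here is a minimal sketch of the incremental approach described above. It is not the original script. The endpoint URL and the `client_id` parameter are assumptions that should be checked against the API docs, and the names `fetch_set` and `grab_range` are made up for illustration. Which HTTP status codes mean "set not available" is also a guess.

```python
import os
import urllib.error
import urllib.request

# Assumption: endpoint shape and client_id parameter; verify against the API docs.
API_URL = "https://api.quizlet.com/2.0/sets/{set_id}?client_id={client_id}"


def fetch_set(set_id, client_id):
    """Return the raw JSON text for one set, or None if it is not available."""
    url = API_URL.format(set_id=set_id, client_id=client_id)
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Assumption: these codes mean the set is private, missing, or deleted.
        if err.code in (403, 404, 410):
            return None
        raise


def grab_range(start, stop, out_dir, fetch):
    """Save sets with IDs in [start, stop) as <id>.txt; return how many were saved."""
    os.makedirs(out_dir, exist_ok=True)
    saved = 0
    for set_id in range(start, stop):
        path = os.path.join(out_dir, f"{set_id}.txt")
        if os.path.exists(path):
            continue  # already downloaded on a previous run
        body = fetch(set_id)
        if body is None:
            continue  # nothing to save for this ID
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(body)
        saved += 1
    return saved
```

A call such as `grab_range(173, 1000, "sets", lambda i: fetch_set(i, "YOUR_CLIENT_ID"))` would cover the first IDs. Because files that already exist are skipped, an interrupted run can be restarted, and several machines can each take a different, non-overlapping ID range. This does nothing to fix the speed problem by itself.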