Constantin Rack c-rack


Here's the assignment:

Download this raw statistics dump from Wikipedia (360 MB unzipped):

http://dumps.wikimedia.org/other/pagecounts-raw/2014/2014-10/pagecounts-20141029-230000.gz

Write a simple script in your favourite programming language that:

  • Gets all page-view entries for the English Wikipedia (these lines are prefixed by "en ")
  • Limits those articles to the ones with at least 500 views
  • Sorts them by number of views, highest first, and prints the first ten articles
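
One possible way to approach the three steps above, as a minimal Python sketch rather than a reference solution. It assumes each line of the dump has four space-separated fields (project, page title, view count, bytes transferred) and that the file is read directly from the downloaded `.gz`. The function names `top_articles` and `top_articles_from_file` are made up for this illustration.

```python
import gzip
from typing import Iterable, List, Tuple


def top_articles(
    lines: Iterable[str], min_views: int = 500, limit: int = 10
) -> List[Tuple[str, int]]:
    """Return up to `limit` (title, views) pairs for "en " lines, most viewed first."""
    rows = []
    for line in lines:
        # Keep only English Wikipedia; "en.b", "de", etc. do not start with "en ".
        if not line.startswith("en "):
            continue
        # Assumed layout: "<project> <title> <views> <bytes>"
        parts = line.split(" ")
        if len(parts) != 4:
            continue  # skip malformed lines instead of failing
        try:
            views = int(parts[2])
        except ValueError:
            continue
        if views >= min_views:
            rows.append((parts[1], views))
    # Highest view counts first; ties keep their order of appearance.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def top_articles_from_file(
    path: str, min_views: int = 500, limit: int = 10
) -> List[Tuple[str, int]]:
    """Stream a gzip-compressed dump from `path` and apply top_articles to it."""
    # errors="replace" is a defensive assumption in case a title is not valid UTF-8.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as dump:
        return top_articles(dump, min_views=min_views, limit=limit)
```

With the dump saved locally, something like `for title, views in top_articles_from_file("pagecounts-20141029-230000.gz"): print(title, views)` would print the ten rows. Collecting the filtered rows and sorting them once keeps the sketch simple; since only lines with at least 500 views are kept, the list to sort should be small compared to the full dump.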