@jjjake
Created January 31, 2014 00:13
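#!/usr/bin/env python
"""Print the name and MD5 of each arc.gz file in the listed items."""
from __future__ import print_function

import sys

from internetarchive import get_data_miner


def get_arcs(item):
    """Yield files whose names end in 'arc.gz' (also matches 'warc.gz')."""
    for f in item.files():
        if f.name.endswith('arc.gz'):
            yield f


if __name__ == '__main__':
    # The last argument is a text file of item identifiers, one per line.
    with open(sys.argv[-1]) as fh:
        ids = [x.strip() for x in fh if x.strip()]
    miner = get_data_miner(ids)
    for i, item in miner:
        for f in get_arcs(item):
            # One tab-separated line per file: name, then MD5 checksum.
            print('\t'.join([f.name, f.md5]))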