@srgrn
Created November 17, 2016 08:55
Download a gzipped file line by line (for example, the files at http://csr.lanl.gov/data/cyber1/)
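import gzip
import urllib.request

N = 200000  # maximum number of lines to read


def read_line_by_line(url, handlefunc, **kwargs):
    # Wrap the HTTP response in gzip so lines are decompressed as they arrive,
    # instead of downloading the whole file first.
    with gzip.open(urllib.request.urlopen(url), 'r') as f:
        for i in range(N):
            line = f.readline()
            if not line:  # readline() returns b'' at end of file; it never raises StopIteration
                break
            arr = line.decode('ascii').strip().split(',')
            handlefunc(arr, kwargs)  # kwargs reaches the handler as a single dict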
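
A possible usage sketch, not part of the original gist. It restates the function in condensed form so the snippet runs on its own, and reads a small locally built gzipped file through a file:// URL so no network access is needed. The `limit` parameter, the `collect_rows` handler, and the sample data are hypothetical names chosen for illustration.

```python
import gzip
import tempfile
import urllib.request
from pathlib import Path


def read_line_by_line(url, handlefunc, limit=200000, **kwargs):
    # Condensed restatement of the function above; `limit` is a hypothetical
    # parameter standing in for the module-level N.
    with gzip.open(urllib.request.urlopen(url), 'r') as f:
        for _ in range(limit):
            line = f.readline()
            if not line:  # b'' signals end of file
                break
            handlefunc(line.decode('ascii').strip().split(','), kwargs)


def collect_rows(arr, options):
    # Hypothetical handler: append each parsed row to a caller-supplied list.
    options['rows'].append(arr)


# Build a small gzipped CSV locally so the sketch runs without network access.
tmp_dir = tempfile.mkdtemp()
sample_path = Path(tmp_dir) / 'sample.csv.gz'
with gzip.open(sample_path, 'wb') as out:
    out.write(b'1,alice,login\n2,bob,logout\n')

rows = []
read_line_by_line(sample_path.as_uri(), collect_rows, rows=rows)
print(rows)
```

Because the keyword arguments are handed to the handler as one dict, the handler takes two positional parameters (the parsed row and the options dict). Swapping `sample_path.as_uri()` for an http(s) URL of a .gz file should stream it the same way.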