@brianewing
Created May 26, 2011 23:02
Python MD5 of remote file (URL)
# Python 2 (uses urllib2 and the print statement)
import os, hashlib, urllib2, optparse


def get_remote_md5_sum(url, max_file_size=100*1024*1024):
    remote = urllib2.urlopen(url)
    hash = hashlib.md5()

    total_read = 0
    while True:
        data = remote.read(4096)
        # Count the bytes actually returned; read() may return fewer than 4096.
        total_read += len(data)

        # Stop at end of stream, or once the size cap is exceeded
        # (the digest then covers only the bytes hashed so far).
        if not data or total_read > max_file_size:
            break

        hash.update(data)

    return hash.hexdigest()


if __name__ == '__main__':
    opt = optparse.OptionParser()
    opt.add_option('--url', '-u', default='http://www.google.com')

    options, args = opt.parse_args()
    print get_remote_md5_sum(options.url)
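
The gist targets Python 2. A rough Python 3 equivalent might look like the sketch below, which uses `urllib.request` in place of `urllib2`. The helper name `md5_of_stream` and the `chunk_size` parameter are not part of the gist; they are introduced here so the hashing loop can be exercised without a network connection. The sketch also raises an error when the size cap is exceeded rather than returning a digest of a truncated download, which is a design assumption and not the gist's behavior.

```python
import hashlib
import io
import urllib.request


def md5_of_stream(stream, max_file_size=100 * 1024 * 1024, chunk_size=4096):
    """Return the MD5 hex digest of a binary file-like object, read in chunks.

    Hypothetical helper (not in the gist). Raises ValueError if more than
    max_file_size bytes are read, instead of hashing a truncated stream.
    """
    digest = hashlib.md5()
    total_read = 0
    while True:
        data = stream.read(chunk_size)
        if not data:  # empty bytes object means end of stream
            break
        total_read += len(data)  # count the bytes actually received
        if total_read > max_file_size:
            raise ValueError("stream exceeds max_file_size")
        digest.update(data)
    return digest.hexdigest()


def get_remote_md5_sum(url, max_file_size=100 * 1024 * 1024):
    """Open the URL and hash its response body chunk by chunk."""
    with urllib.request.urlopen(url) as remote:
        return md5_of_stream(remote, max_file_size)


if __name__ == "__main__":
    # Offline demonstration: an in-memory stream stands in for a response.
    print(md5_of_stream(io.BytesIO(b"hello")))
```

Separating the loop from the network call keeps the hashing logic testable against any object with a `read()` method, such as an open file or an `io.BytesIO` buffer.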
@lolobosse

Note to myself: urlopen doesn't work with umlauts in the URL (use request instead)
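
If the problem is non-ASCII characters in the path or query string, one standard-library workaround is to percent-encode them before calling `urlopen`. The sketch below assumes Python 3 and UTF-8 encoding; the function name `quote_url` is hypothetical, and non-ASCII host names (which would need IDNA encoding) are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_url(url):
    """Percent-encode non-ASCII characters in the path and query of a URL.

    Hypothetical helper. "%" is kept as a safe character so that parts
    which are already percent-encoded are not encoded a second time.
    The host name is left untouched.
    """
    parts = urlsplit(url)
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


if __name__ == "__main__":
    # The umlaut is encoded as its UTF-8 bytes, C3 A4.
    print(quote_url("http://example.com/käse.txt"))
```

The encoded URL can then be passed to `urlopen` as usual.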

@jhc36-duke-edu

If I run it on the same URL twice, it always produces a different hash, even though the file did not change. I can't figure out how to fix that. Any help would be appreciated. Or is this working as designed?
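
MD5 is deterministic, so identical bytes always give an identical digest. Differing hashes therefore suggest that the server returned different bytes on each request, which is common for dynamically generated pages (the script's default URL may well be one) but should not happen for a static file. One way to check, sketched below under the assumption of Python 3, is to download the body twice and locate the first byte that differs. The names `fetch` and `first_difference` are hypothetical.

```python
import hashlib
import urllib.request


def fetch(url):
    """Download the whole response body into memory (small files only)."""
    with urllib.request.urlopen(url) as remote:
        return remote.read()


def first_difference(a, b):
    """Return the index of the first differing byte, or None if a == b."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One body is a prefix of the other; they diverge where it ends.
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    # Same bytes in, same digest out, regardless of how often it is run.
    body = b"unchanged file contents"
    print(hashlib.md5(body).hexdigest() == hashlib.md5(body).hexdigest())
```

Calling `first_difference(fetch(url), fetch(url))` for the URL in question returns `None` when both downloads match; any other result is the offset where the two responses diverge, which helps show what the server is changing between requests.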
