@christophermanning
Last active September 27, 2015 21:28
Make Flickr shapes JSON parsable
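#!/usr/bin/env python
# Rewrites the Flickr shapes archive so that each file parses as JSON, by
# removing the trailing comma that precedes a closing brace.
import io
import os
import re
import tarfile
import urllib.request

flickr_shapes_file_name = 'flickr_shapes_public_dataset_2.0.tar.gz'
new_flickr_shapes_file_name = 'json_parsable_' + flickr_shapes_file_name

# Download the archive only if it is not already in the working directory.
if not os.path.exists(flickr_shapes_file_name):
    print('Downloading Flickr Shapes File')
    urllib.request.urlretrieve('http://www.flickr.com/services/shapefiles/2.0/', flickr_shapes_file_name)

# Matches a comma followed by whitespace and a closing brace.
p = re.compile(rb',(\s+})')

old_tar = tarfile.open(flickr_shapes_file_name)
new_tar = tarfile.open(new_flickr_shapes_file_name, 'w|gz')
for file_info in old_tar:
    print('Processing %s' % file_info.name)
    if not file_info.isfile():
        # Directories and other non-file members have no data to rewrite.
        new_tar.addfile(file_info)
        continue
    old_data = old_tar.extractfile(file_info).read()
    new_data = p.sub(rb'\1', old_data)
    # The header must carry the new length, since commas were removed.
    file_info.size = len(new_data)
    new_tar.addfile(file_info, io.BytesIO(new_data))
new_tar.close()
old_tar.close()
print('New archive saved at %s/%s' % (os.getcwd(), new_flickr_shapes_file_name))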
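
The substitution at the core of the script can be tried on its own, without downloading the archive. The following is a minimal sketch written for Python 3. The `sample` string and the helper name `strip_trailing_commas` are hypothetical and are not taken from the Flickr dataset or the script. It only shows that a comma followed by whitespace and a closing brace is dropped, which is enough for `json.loads` to accept the text.

```python
import json
import re

# Same pattern as the script: a comma followed by whitespace and a closing brace.
TRAILING_COMMA = re.compile(r',(\s+})')


def strip_trailing_commas(text):
    """Drop a comma that sits directly before whitespace and a closing brace."""
    return TRAILING_COMMA.sub(r'\1', text)


# Hypothetical sample with a trailing comma, which strict JSON rejects.
sample = '{\n  "name": "example",\n  "id": 1,\n}'

try:
    json.loads(sample)
    parsed_before = True
except ValueError:
    # json.JSONDecodeError is a subclass of ValueError.
    parsed_before = False

fixed = strip_trailing_commas(sample)
parsed = json.loads(fixed)
print(parsed_before, parsed)
```

The pattern requires at least one whitespace character between the comma and the brace, so `,}` with nothing in between is left alone, as are trailing commas before `]`. It also does not distinguish structure from string contents, so a string value that happens to contain a comma, whitespace and `}` would be altered as well.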