@asw456
Last active December 17, 2015 12:29
downloading all *.gpx route files from the Canterbury Tramping Club website
# commands to get the links (bash)
# $ wget -c -r --no-parent -k -U Mozilla http://www.ctc.org.nz/
# $ find ./www.ctc.org.nz/ -name "*gpx*" > gpxfiles.txt
# gpxfiles.txt has the numbers used in the script below
import re
import os
import requests
numberlist = [6,92,32,86,4,28,27,103,75,46,90,91,1,10,35,97,20,72,52,69,9,12,96,22,40,102,78,14,82,99,83,84,51,58,42,79,55,26,81,53,76,39,56,3,80,45,77,94,60,47,33,95,36,48,2,50,24,25,71,88,16,89,23,11,98,49,31,18,57,13,19,43,61,44,54,74,37,93,7,85,59,104,100,73,8,30,101,41,34,17,38,70,29,21,15,5,87]
for number in numberlist:
    # each id maps to one GPX download on the club site
    url = "http://www.ctc.org.nz/dbgpx.php?id=%s" % number
    r = requests.get(url)
    print(r.headers)
    # the server supplies the file name in the Content-Disposition header
    filename = re.findall("filename=\"(.+)\"", r.headers['content-disposition'])
    directory = os.getcwd()
    with open("%s/%s" % (directory, filename[0]), "wb") as some_file:
        some_file.write(r.content)
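
The comments at the top say gpxfiles.txt holds the numbers, but the list in the script is typed in by hand. A rough sketch of pulling the ids out of that file instead, assuming wget saved each link under a name containing `id=<number>` (for example `./www.ctc.org.nz/dbgpx.php?id=6`). The names `extract_ids` and `load_ids` are made up for this example and are not part of the original script.

```python
import re

# Assumption: every mirrored GPX link has "id=<number>" in its saved name,
# e.g. "./www.ctc.org.nz/dbgpx.php?id=6". Adjust the pattern if yours differ.
ID_PATTERN = re.compile(r"\bid=(\d+)")


def extract_ids(lines):
    """Return the unique ids found in an iterable of path strings, in first-seen order."""
    seen = set()
    ids = []
    for line in lines:
        match = ID_PATTERN.search(line)
        if match is None:
            continue  # not a GPX download link
        number = int(match.group(1))
        if number not in seen:
            seen.add(number)
            ids.append(number)
    return ids


def load_ids(path="gpxfiles.txt"):
    """Read the file written by the find command and extract its ids."""
    with open(path) as handle:
        return extract_ids(handle)
```

With this in place, `numberlist = load_ids()` could stand in for the hard-coded list, provided gpxfiles.txt is in the working directory.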