@davidbauer
Created April 18, 2014 17:22
Python script to download images from a CSV of image urls
#!/usr/bin/env python
# assuming a csv file with a name in column 0 and the image url in column 1
import urllib
filename = "images"
# open file to read
with open("{0}.csv".format(filename), 'r') as csvfile:
    # iterate on all lines
    i = 0
    for line in csvfile:
        splitted_line = line.split(',')
        # check if we have an image URL
        if splitted_line[1] != '' and splitted_line[1] != "\n":
            urllib.urlretrieve(splitted_line[1], "img_" + str(i) + ".png")
            print "Image saved for {0}".format(splitted_line[0])
            i += 1
        else:
            print "No result for {0}".format(splitted_line[0])
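
The script above splits each line on commas by hand, which breaks if a name itself contains a comma. As a minimal sketch, assuming Python 3 and the same layout (name in column 0, image URL in column 1), the standard csv module could do the parsing instead. The helper name read_rows and the sample data are made up for illustration and are not part of the gist.

```python
import csv
import io


def read_rows(csvfile):
    """Yield (name, url) pairs; url is '' when the column is missing or empty."""
    for row in csv.reader(csvfile):
        if not row:
            continue  # skip completely empty lines
        name = row[0].strip()
        url = row[1].strip() if len(row) > 1 else ""
        yield name, url


# Hypothetical sample input: a quoted name containing a comma, and a row without a URL.
sample = io.StringIO('"Doe, Jane",http://example.com/a.png\nNo Image,\n')
for name, url in read_rows(sample):
    print(name, "->", url or "(no URL)")
```

With a real file, the same function could be fed the object returned by open("images.csv", newline="").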
@francoispeyret

How do I download these images into a subfolder of the main directory?

You can create a subfolder and put its name in front of the file name, like this (the folder has to exist already, because urlretrieve does not create it):

        if splitted_line[1] != '' and splitted_line[1] != "\n":
            urllib.urlretrieve(splitted_line[1], "download/" + splitted_line[0] + ".jpg")
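
To avoid creating the folder by hand, the target path could be built by a small helper. This is only a sketch, assuming Python 3. The function name target_path, its defaults, and the character-replacement rule are illustrative assumptions, not something from the gist.

```python
import os


def target_path(name, folder="download", extension=".jpg"):
    """Create `folder` if needed and return a path for `name` inside it."""
    # exist_ok=True makes this a no-op when the folder is already there
    os.makedirs(folder, exist_ok=True)
    # replace characters that could act as path separators or confuse the file system
    safe_name = "".join(c if c.isalnum() or c in "-_ " else "_" for c in name).strip()
    return os.path.join(folder, safe_name + extension)


# Possible use inside the download loop (Python 3):
#     urllib.request.urlretrieve(url, target_path(name))
```

Replacing unusual characters matters here because the name comes straight from the CSV and may contain slashes.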

@scottblair

scottblair commented Aug 30, 2022

This version has been updated to work with Python 3, saves the files into an "images" subfolder (which must already exist), and sends a browser-like User-Agent header to help avoid 403 Forbidden errors. It reads its input from a file named images.csv.

#!/usr/bin/env python

# assuming a csv file with a name in column 0 and the image url in column 1

import ntpath
import urllib.request

opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent','Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1941.0 Safari/537.36')]
urllib.request.install_opener(opener)

def path_leaf(path):
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)

filename = "images"

# open file to read
with open("{0}.csv".format(filename), 'r') as csvfile:
    # iterate on all lines
    i = 0
    for line in csvfile:
        # drop the line ending before splitting so the URL column is clean
        splitted_line = line.rstrip("\r\n").split(',')
        # check if we have an image URL
        if len(splitted_line) > 1 and splitted_line[1] != '':
            img_filename = path_leaf(splitted_line[1])
            urllib.request.urlretrieve(splitted_line[1], "images/" + img_filename)
            print("Image saved for {0}".format(splitted_line[0]))
            print("Filename: " + img_filename)
            i += 1
        else:
            print("No result for {0}".format(splitted_line[0]))
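
One thing to watch with the approach above: taking the last path component of the raw URL keeps any query string (for example "?size=large") in the file name. The following sketch, assuming Python 3, shows one possible alternative that parses the URL first, plus a per-request way to set the User-Agent instead of installing a global opener. The names filename_from_url, build_request, and the USER_AGENT value are hypothetical.

```python
import posixpath
import urllib.request
from urllib.parse import urlsplit

# Hypothetical value; any header string the target server accepts would do.
USER_AGENT = "Mozilla/5.0 (compatible; image-downloader-sketch)"


def filename_from_url(url, default="image"):
    """Return the last component of the URL path, ignoring query and fragment."""
    path = urlsplit(url.strip()).path
    name = posixpath.basename(path.rstrip("/"))
    return name or default


def build_request(url):
    """Build a Request that carries the User-Agent header for this URL only."""
    return urllib.request.Request(url.strip(), headers={"User-Agent": USER_AGENT})


print(filename_from_url("http://example.com/pics/cat.png?size=large"))
```

A request built this way can be passed to urllib.request.urlopen, and the response body written to the chosen file name.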
