slavakurilyak/README.md

## README.md

      
Download images stored as URLs from a CSV file

Dealing with an image dataset? Dealing with CSVs instead of JPGs? Use these scripts to download the images that a CSV file stores as URLs.

Usage

To download full resolution images, pass the CSV file name without the .csv extension (the scripts append it):
$ python download-images-from-csv.py <csv_filename>

To download thumbnail images, type:
$ python download-thumbnails-from-csv.py <csv_filename>

Examples

$ python download-images-from-csv.py images

Assuming images.csv has the following columns:

Image Name (ImageID) in column 1
Full Resolution URL (OriginalURL) in column 3

$ python download-thumbnails-from-csv.py images

Assuming images.csv has the following columns:

Image Name (ImageID) in column 1
Thumbnail URL (Thumbnail300KURL) in column 11
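
The column numbers above are one-based, while the scripts index rows from zero (column 3 is line[2], column 11 is line[10]). A minimal sketch of that mapping follows; the helper name iter_image_urls, its parameters, and the sample rows are hypothetical and not part of the original scripts.

```python
import csv
import io

def iter_image_urls(csv_file, url_index, header_id="ImageID"):
    """Yield (image_id, url) pairs from an open CSV file.

    url_index is zero-based: 2 for column 3 (OriginalURL),
    10 for column 11 (Thumbnail300KURL).
    """
    for row in csv.reader(csv_file):
        # Skip blank rows and rows too short to contain the URL column
        if len(row) <= url_index:
            continue
        image_id, url = row[0], row[url_index]
        # Skip the header row and rows without a URL
        if image_id == header_id or url == "":
            continue
        yield image_id, url

# Made-up sample data with three columns, for illustration only
sample = io.StringIO(
    "ImageID,Subset,OriginalURL\n"
    "abc123,train,https://example.com/abc123.jpg\n"
    "def456,train,\n"
)
pairs = list(iter_image_urls(sample, url_index=2))
print(pairs)
```

Unlike the scripts, this sketch silently skips rows that are too short instead of raising an IndexError; whether that is desirable depends on how clean the CSV is.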

Results

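Full resolution images are stored in the fullres folder as <ImageID>.jpg
Thumbnail images are stored in the thumbnails folder as <ImageID>.jpg
Images that already exist in the target folder are skipped, so an interrupted run can be restarted.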
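
The scripts as listed write into fullres and thumbnails without creating them, so a missing folder would make the download fail. One possible way to handle this, sketched under the assumption of Python 3, is to create the folder before building the target path; target_path and its parameters are hypothetical names.

```python
import os
import tempfile

def target_path(base_dir, folder, image_id):
    """Return the .jpg path for image_id, creating the folder if needed."""
    out_dir = os.path.join(base_dir, folder)
    os.makedirs(out_dir, exist_ok=True)  # no error if it already exists
    return os.path.join(out_dir, image_id + ".jpg")

# Demonstrate in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    path = target_path(tmp, "fullres", "abc123")
    folder_exists = os.path.isdir(os.path.join(tmp, "fullres"))
    already_downloaded = os.path.isfile(path)

print(os.path.basename(path), folder_exists, already_downloaded)
```

The same os.path.isfile check on the returned path is what lets the scripts skip images that were saved by an earlier run.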

Inspired By

David Bauer
Open Images Dataset by Google


## download-images-from-csv.py
## Assuming a csv file has:
## Image Name (ImageID) in column 1 (line[0])
## Full Resolution URL (OriginalURL) in column 3 (line[2])

import os.path
import sys
from csv import reader
from urllib.request import urlretrieve  # Python 3 location of urlretrieve

# The CSV name is passed without the ".csv" extension
csv_filename = sys.argv[1]

with open(csv_filename + ".csv", newline="") as csv_file:
    for line in reader(csv_file):
        target = os.path.join("fullres", line[0] + ".jpg")
        if os.path.isfile(target):
            # Already downloaded on a previous run
            print("Image skipped for {0}".format(line[0]))
        elif line[2] != '' and line[0] != "ImageID":
            # Row has a URL and is not the header row
            urlretrieve(line[2], target)
            print("Image saved for {0}".format(line[0]))
        else:
            print("No result for {0}".format(line[0]))

## download-thumbnails-from-csv.py
## Assuming a csv file has:
## Image Name (ImageID) in column 1 (line[0])
## Thumbnail URL (Thumbnail300KURL) in column 11 (line[10])

import os.path
import sys
from csv import reader
from urllib.request import urlretrieve  # Python 3 location of urlretrieve

# The CSV name is passed without the ".csv" extension
csv_filename = sys.argv[1]

with open(csv_filename + ".csv", newline="") as csv_file:
    for line in reader(csv_file):
        target = os.path.join("thumbnails", line[0] + ".jpg")
        if os.path.isfile(target):
            # Already downloaded on a previous run
            print("Image skipped for {0}".format(line[0]))
        elif line[10] != '' and line[0] != "ImageID":
            # Row has a thumbnail URL and is not the header row
            urlretrieve(line[10], target)
            print("Image saved for {0}".format(line[0]))
        else:
            print("No result for {0}".format(line[0]))