Created
September 1, 2019 10:09
A small script that downloads flat images from openrent.co.uk
# Begin by executing "scrapy shell {link to flat}".
# Scrapy exposes a response object that is used in the
# following few lines of code. Run the lines
# individually, or simply execute the whole script by running
# "exec(open('imagedownloader.py').read())" in the
# scrapy shell.
import urllib.request

# Collect the data-src attribute of every div that has one.
divs = response.css('div')
divsWithDataSrc = []
for div in divs:
    try:
        dataSrc = div.attrib['data-src']
        divsWithDataSrc.append(dataSrc)
        print("Added a div")
    except KeyError:
        print("Skipped a div")

# Rebuild each link with an explicit http:// scheme, using the part
# that follows its first '//'.
downloadLinks = list(map((lambda x: "http://" + x.split('//')[1]), divsWithDataSrc))

# Save the images to the current directory as 0.jpg, 1.jpg, ...
for i, link in enumerate(downloadLinks):
    urllib.request.urlretrieve(link, str(i) + ".jpg")
    print(f"Successfully downloaded image #{i}.")
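
The link-rewriting step above can also be written as a small helper that is testable outside the scrapy shell. This is only a sketch, not part of the original script: the names `to_http_url` and `numbered_filename` are hypothetical, and the host `images.example.com` in the usage note is a placeholder. It assumes each `data-src` value is either protocol-relative (`//host/path`) or already carries a scheme, and, unlike the original lambda, it raises a `ValueError` rather than an `IndexError` when a value has no host.

```python
from urllib.parse import urlsplit, urlunsplit


def to_http_url(data_src: str) -> str:
    """Rebuild a data-src value as an explicit http:// URL.

    Hypothetical helper: mirrors the lambda in the script, but parses
    the value with urllib.parse instead of splitting on '//'.
    """
    parts = urlsplit(data_src)
    if not parts.netloc:
        # Assumption: a value without a host cannot be downloaded as-is.
        raise ValueError(f"no host in data-src value: {data_src!r}")
    return urlunsplit(("http", parts.netloc, parts.path, parts.query, parts.fragment))


def numbered_filename(index: int) -> str:
    """Return the file name the script uses for the image at this index."""
    return f"{index}.jpg"
```

For example, `to_http_url("//images.example.com/flat/1.jpg")` gives `"http://images.example.com/flat/1.jpg"`, and the result could be passed to `urllib.request.urlretrieve` together with `numbered_filename(i)`.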