@suriyadeepan
Created August 26, 2016 06:09
Scrape images from a wiki page using Beautiful Soup
from bs4 import BeautifulSoup
import requests
url = 'https://en.wikipedia.org/wiki/Transhumanism'
# get contents from url
content = requests.get(url).content
# get soup
soup = BeautifulSoup(content,'lxml') # choose lxml parser
# find the tag : <img ... >
image_tags = soup.find_all('img')
# print out image urls
for image_tag in image_tags:
    print(image_tag.get('src'))
@kchiran
kchiran commented Aug 11, 2020

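import urllib.request

# `image` (an absolute image URL) and `filename` (a base name) are not defined in the gist; set them first
with open(filename + ".jpeg", 'wb') as imagefile:
    imagefile.write(urllib.request.urlopen(image).read())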
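
One gap between the gist and the download snippet: the values returned by `image_tag.get('src')` can be protocol-relative (`//host/path`) or relative to the page, and `urllib.request.urlopen` needs an absolute URL. The following is a minimal sketch of one way to bridge that, using only the standard library. The helper names (`resolve_image_url`, `filename_for`, `download`) and the two example `src` values are hypothetical and not part of the original gist; `download` is defined but not called here, since it needs network access.

```python
import os
import urllib.request
from urllib.parse import urljoin, urlparse


def resolve_image_url(page_url, src):
    """Turn a raw `src` value into an absolute URL, using the page URL as base."""
    return urljoin(page_url, src)


def filename_for(image_url, default="image"):
    """Use the last path segment of the URL as the local file name."""
    name = os.path.basename(urlparse(image_url).path)
    return name or default


def download(image_url, directory="."):
    """Fetch one image and write it to `directory`; returns the local path."""
    path = os.path.join(directory, filename_for(image_url))
    with urllib.request.urlopen(image_url) as response, open(path, 'wb') as out:
        out.write(response.read())
    return path


page_url = 'https://en.wikipedia.org/wiki/Transhumanism'
# Made-up src values that only illustrate the two common relative forms.
example_srcs = [
    '//upload.wikimedia.org/example/thumb.png',  # protocol-relative
    '/static/images/example-logo.png',           # root-relative
]
resolved = [resolve_image_url(page_url, src) for src in example_srcs]
names = [filename_for(u) for u in resolved]
```

With the gist's loop, this would amount to calling `download(resolve_image_url(url, image_tag.get('src')))` for each tag that actually has a `src` attribute. Keeping the file name from the URL, rather than appending `.jpeg` to everything, avoids mislabeling PNG or SVG images.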
