@nukeop
Created November 4, 2017 22:13
import re
import urllib.request

import requests
import tqdm
from bs4 import BeautifulSoup

BASE_URL = 'http://mightandmagic.wikia.com'

# Fetch the category page that lists the Heroes V hero icons.
response = requests.get(BASE_URL + '/wiki/Category:Heroes_V_hero_icons')
soup = BeautifulSoup(response.text, 'lxml')

# Collect the links to each icon's "File:" page, dropping duplicates.
anchors = soup.find_all('a', href=re.compile('.*File:Hero.*png'))
links = list(set(BASE_URL + a.get('href') for a in anchors))

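for link in tqdm.tqdm(links):
    # Each "File:" page links to the original-size image; its URL contains 'vignette'.
    avatar_soup = BeautifulSoup(requests.get(link).text, 'lxml')
    originals = avatar_soup.find_all(
        'a', href=re.compile('.*vignette.*png.*format=original.*'))
    if not originals:
        continue  # No original-size link found on this page; skip it.
    avatar_url = originals[0].get('href')
    # The segment at index 7 of the '/'-split URL is used as the local file name.
    urllib.request.urlretrieve(avatar_url, avatar_url.split('/')[7])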