@Sinkmanu
Created March 15, 2019 07:00
Python scraping skeleton (requests + BeautifulSoup)
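import requests
from bs4 import BeautifulSoup
# Browser-like User-Agent; some sites reject the default one sent by requests.
user_agent = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0'}
# "url" is a placeholder. verify=False turns off TLS certificate checks; use it only on hosts you trust.
r = requests.get("url", verify=False, headers=user_agent)
# The html5lib parser is a separate package (pip install html5lib).
soup = BeautifulSoup(r.text, "html5lib")
# Print every <a> element found in the page.
print(soup.find_all('a'))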
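The skeleton stops at printing the raw `<a>` tags. A possible next step, sketched below, is to turn them into a list of absolute URLs. The function name `extract_links` and the sample HTML are hypothetical and not part of the original gist. The sketch uses the built-in `html.parser` backend so it runs without html5lib, and it parses a hard-coded string instead of making a live request.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute URLs for every <a href=...> in the given HTML."""
    # "html.parser" ships with Python; "html5lib" from the skeleton works too.
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute.
    for a in soup.find_all("a", href=True):
        # urljoin resolves relative paths against the page URL.
        links.append(urljoin(base_url, a["href"]))
    return links


# Hypothetical sample page, used here instead of a live request.
sample = '<a href="/about">About</a><a name="top">Top</a><a href="https://example.org/x">X</a>'
print(extract_links(sample, "https://example.com/index.html"))
```

With the skeleton's response object, the equivalent call would be `extract_links(r.text, r.url)`, so relative links resolve against the final URL after any redirects.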