@prykon
Created February 8, 2021 11:10
Scrape Top Stories URLs and titles from Google SERPs.
# Scrape Google SERPs for Top Stories URLs and titles
import requests
from bs4 import BeautifulSoup
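keyword = 'keyword goes here'
url = 'https://www.google.com/search'
# Pass the keyword via params so requests URL-encodes spaces and special characters
response = requests.get(url, params={'q': keyword}, timeout=10)
# Fail early on an HTTP error instead of parsing an error page
response.raise_for_status()
source = response.content
soup = BeautifulSoup(source, 'lxml')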
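# Class names come from Google's Top Stories markup; Google changes them
# periodically, so verify them against the live page before relying on them
result_url = soup.find_all("div", {"class": "HCUNre"})
result_text = soup.find_all("div", {"class": "nDgy9d"})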
output_urls = []
output_text = []
output_rows = []
# Collect the link target of each Top Stories block
for r in result_url:
    output_urls.append(r.find('a')['href'].strip())
# Collect the matching headline text
for r in result_text:
    output_text.append(r.text.strip())
# zip() stops at the shorter list, so a count mismatch cannot raise IndexError
for result_link, result_title in zip(output_urls, output_text):
    row = '%s;%s' % (result_link, result_title)
    output_rows.append(row)
    print(row)
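
The script joins each URL and title with a bare `;`, which gives ambiguous rows whenever a title itself contains a semicolon. A minimal sketch of one alternative, assuming the same kind of URL and title lists the script builds, uses the standard-library `csv` module so such fields are quoted. The helper name `rows_to_csv` and the sample data are hypothetical and not part of the original gist.

```python
import csv
import io


def rows_to_csv(urls, titles, delimiter=';'):
    """Return URL/title pairs as delimiter-separated text with proper quoting.

    Hypothetical helper, not part of the original gist.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator='\n')
    # zip() pairs items positionally and stops at the shorter list
    for url, title in zip(urls, titles):
        writer.writerow([url, title])
    return buffer.getvalue()


if __name__ == '__main__':
    # Made-up sample data standing in for output_urls / output_text
    sample_urls = ['https://example.com/a', 'https://example.com/b']
    sample_titles = ['Plain title', 'Title; with a semicolon']
    print(rows_to_csv(sample_urls, sample_titles), end='')
```

In the script itself, this would amount to calling `rows_to_csv(output_urls, output_text)` after the two collection loops. Fields that contain the delimiter are wrapped in double quotes, so a spreadsheet or `csv.reader` can split the rows back into exactly two columns.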