Scrape Google Scholar Co-Authors Results with Python
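from bs4 import BeautifulSoup
import requests, lxml, os

# Send a browser-like User-Agent so the request is less likely to be rejected
headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

# Optional proxy, read from the HTTP_PROXY environment variable
proxies = {
    'http': os.getenv('HTTP_PROXY')
}

# The profile URL is missing from this snippet; pass the Google Scholar profile URL here
html = requests.get('', headers=headers, proxies=proxies).text
soup = BeautifulSoup(html, 'lxml')

# Each .gsc_rsb_aa element holds one co-author entry
for container in soup.select('.gsc_rsb_aa'):
    author_name = container.select_one('#gsc_rsb_co a').text
    author_affiliations = container.select_one('.gsc_rsb_a_ext').text
    author_link = container.select_one('#gsc_rsb_co a')['href']
    print(f'{author_name}\n{author_affiliations}')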
# Part of the output:
Christoph Benzmüller
Professor, FU Berlin
Pascal Fontaine
LORIA, INRIA, Université de Lorraine, Nancy, France
Stephan Merz
Senior Researcher, INRIA
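
The href scraped for each co-author may be a site-relative path rather than a full URL. The sketch below assumes that it is, and shows one way to turn the scraped values into a record with an absolute profile link. `build_co_author`, `SCHOLAR_BASE_URL` and the `EXAMPLE_ID` href are hypothetical names introduced for illustration; they are not part of the original snippet.

```python
from urllib.parse import urljoin

# Assumption: co-author hrefs are site-relative paths on scholar.google.com.
SCHOLAR_BASE_URL = 'https://scholar.google.com'


def build_co_author(name, affiliations, href, base_url=SCHOLAR_BASE_URL):
    """Return one co-author record with an absolute profile link."""
    return {
        'name': name.strip(),
        'affiliations': affiliations.strip(),
        # urljoin leaves href unchanged if it is already an absolute URL
        'link': urljoin(base_url, href),
    }


# Hypothetical href value, used only to illustrate the call.
example = build_co_author(
    'Stephan Merz', 'Senior Researcher, INRIA', '/citations?user=EXAMPLE_ID&hl=en'
)
print(example['link'])
```

Inside the scraping loop, the same helper could be called as `build_co_author(author_name, author_affiliations, author_link)` and the results appended to a list.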