@hadware
Created January 19, 2018 00:35
Scraping the text of elements with a given class using lxml and XPath
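from urllib import request

from lxml import etree

# Download the raw page bytes (the URL is a placeholder).
page = request.urlopen("http://link.fr").read()
# Parse with lxml's lenient HTML parser; this returns the root <html> element.
root = etree.HTML(page)
# Direct text nodes of every <div> whose class attribute is exactly 'custom-class'.
items = root.xpath("//div[@class='custom-class']/text()")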
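The query above only matches a `class` attribute that is exactly `custom-class`, and `text()` returns only the direct text nodes of each `div`. A possible variation, sketched below, matches the class as one token among several and also collects text from nested tags. The `SAMPLE_HTML` string and the `texts_by_class` helper are hypothetical names introduced for illustration; they are not part of the original snippet, and the inline markup stands in for a downloaded page so the example runs without network access.

```python
from lxml import etree

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <div class="custom-class">First</div>
  <div class="custom-class highlighted">Second</div>
  <div class="other">Ignored</div>
  <div class="custom-class">  Third <b>bold</b> tail </div>
</body></html>
"""


def texts_by_class(html, class_name):
    """Return the whitespace-normalized text of every <div> carrying class_name."""
    root = etree.HTML(html)
    # Token-wise class match: pad the normalized attribute with spaces so that
    # "custom" does not match "custom-class". $cls is an XPath variable bound
    # through the keyword argument below.
    query = (
        "//div[contains(concat(' ', normalize-space(@class), ' '), "
        "concat(' ', $cls, ' '))]"
    )
    divs = root.xpath(query, cls=class_name)
    # itertext() also yields text inside nested elements, unlike a text() step.
    return [" ".join("".join(div.itertext()).split()) for div in divs]


print(texts_by_class(SAMPLE_HTML, "custom-class"))
```

To use this against a live page, pass the bytes returned by `request.urlopen(...).read()` in place of `SAMPLE_HTML`. Passing the class name as an XPath variable rather than formatting it into the query string avoids quoting problems if the name ever contains a quote character.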