@terryp
Created December 28, 2011 16:38
Beautiful Soup v. PyQuery
...
page = b.get_html()
soup = BeautifulSoup.BeautifulSoup(page)
first_headline = soup.find(
    'div', {"class": "news-stream-module headlines cf"}).findNext('ol').li
first_headline_bs = first_headline.a.contents
print first_headline_bs
assert first_headline_bs
contents = pq(page)
# first_headline_pq = contents("div").filter(".news-stream-module").find("ol").find("li").find("a").eq(0).text()
# or like Alex says!
first_headline_pq = contents("div.news-stream-module ol li a").eq(0).text()
assert first_headline_pq
print first_headline_pq
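
The snippet above uses the BeautifulSoup 3 module under Python 2. For comparison, here is a minimal sketch of the same lookup with `bs4`, the successor package, whose `select_one` accepts the same kind of CSS selector as the PyQuery line. The `SAMPLE_HTML` markup and the `first_headline` helper are assumptions made up for illustration, since the gist does not show the real page or what `b.get_html()` returns.

```python
from bs4 import BeautifulSoup  # bs4 is the successor to the BeautifulSoup 3 module

# Hypothetical markup standing in for the real page, which the gist does not show.
SAMPLE_HTML = """
<div class="news-stream-module headlines cf">
  <h2>Headlines</h2>
  <ol>
    <li><a href="/story-1">First headline</a></li>
    <li><a href="/story-2">Second headline</a></li>
  </ol>
</div>
"""


def first_headline(html):
    """Return the text of the first headline link, or None if there is none."""
    # "html.parser" is the stdlib parser, so no extra parser package is needed.
    soup = BeautifulSoup(html, "html.parser")
    # select_one takes a CSS selector, so matching on a single class works
    # even though the div carries three classes.
    link = soup.select_one("div.news-stream-module ol li a")
    return link.get_text(strip=True) if link else None


print(first_headline(SAMPLE_HTML))
```

Selecting on the single class `news-stream-module` avoids depending on the exact order and spelling of the full `class` attribute, which is what the dictionary lookup in the original `soup.find` call relies on.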