@datahutrepo
Created September 7, 2016 06:17
In [23]: for i in html_content.iterchildren():
   ....:     print(i)
   ....:
<Element head at 0x7f43a5737db8>
<Element body at 0x7f43a5737e10>
In [24]: news_stories = html_content.xpath('//h3[@data-analytics]/a/span/text()')
In [25]: news_links = html_content.xpath('//h3[@data-analytics]/a/@href')
In [26]: news_links
Out[26]:
['/2016/07/25/politics/democratic-convention-dnc-emails-russia/index.html',
'/2016/07/25/us/fort-myers-nightclub-shooting/index.html',
'/2016/07/24/world/ansbach-germany-blast/index.html',
'/2016/07/25/europe/germany-attacks-asylum-seekers-refugees/index.html',
'/2016/07/25/world/protests-boy-killed-bangladesh/index.html',
'/2016/06/15/politics/muslim-ban-maps-donald-trump/index.html',
'/2016/07/24/world/qandeel-baloch-death-father-azeem/index.html',
'/2016/07/24/aviation/tripadvisor-world-favorite-airlines/index.html',
'/2016/07/25/africa/koffi-olomide-dancer-kenya/index.html']
In [27]: news_stories
Out[27]:
['FBI launches investigation into suspected Russian email hack',
"Two dead, 14 injured at Florida 'Swimsuit Glow Party'",
'Suicide bomber was slated to be deported',
'German public questions refugee policy',
'Brutal killing of boy, 10, sparks protests',
"Mapped: Trump's Muslim travel ban ",
"Father of slain social star: 'I want revenge'",
"World's most-loved airline is...",
'Pop star apologizes for kicking dancer ']
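
The session above leaves the headlines and the links in two separate lists, with relative paths and some trailing whitespace in the titles. A minimal sketch of one way to combine them follows. The helper name `pair_stories` and the `BASE_URL` placeholder are hypothetical and not part of the original session; substitute the address of the site that was actually scraped.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

# Assumption: placeholder base URL, not taken from the original session.
BASE_URL = "http://www.example.com"


def pair_stories(titles, links, base_url=BASE_URL):
    """Return (title, absolute_url) tuples, one per story.

    Titles are stripped of surrounding whitespace and relative links
    are resolved against base_url.
    """
    # The two XPath queries run independently, so guard against a
    # headline without a <span> (or similar) shifting the alignment.
    if len(titles) != len(links):
        raise ValueError("titles and links differ in length")
    return [(t.strip(), urljoin(base_url, l)) for t, l in zip(titles, links)]


# Two entries copied from the session output above.
news_stories = ["Mapped: Trump's Muslim travel ban ",
                "World's most-loved airline is..."]
news_links = ['/2016/06/15/politics/muslim-ban-maps-donald-trump/index.html',
              '/2016/07/24/aviation/tripadvisor-world-favorite-airlines/index.html']

for title, url in pair_stories(news_stories, news_links):
    print("%s -> %s" % (title, url))
```

The length check matters because the title query requires an `a/span` under each `h3` while the link query only requires an `a`, so the two result lists are not guaranteed to line up on every page.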