Skip to content

Instantly share code, notes, and snippets.

@jorgebastida
Created November 29, 2010 15:17
Show Gist options
  • Save jorgebastida/720066 to your computer and use it in GitHub Desktop.
Save jorgebastida/720066 to your computer and use it in GitHub Desktop.
Using lxml, parse a wow blizzard forum to discover blue posts
# encoding: utf-8
import urllib
from lxml.html import fromstring
URL = 'http://eu.battle.net/wow/es/forum/975481/'
def is_blue_post(element):
return len(element.getparent().cssselect('.blizzard_icon')) > 0
if __name__ == '__main__':
content = urllib.urlopen(URL).read()
doc = fromstring(content)
for element in doc.find_class('post-title'):
if is_blue_post(element):
print u'[Blue] %s' % element.cssselect('a')[0].text.strip()
@BirkhoffLee
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment