@jtg567
Last active December 2, 2015 00:14
Scrape broadcaster_status from mixlr.com with Python (lxml and selenium)
from lxml import html
from selenium import webdriver
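# Scrape the Mixlr page after its JavaScript has run (the element won't exist otherwise).
# PhantomJS support was removed in Selenium 4; another webdriver works fine.
driver = webdriver.PhantomJS(executable_path='<replace path>/phantomjs.exe')
try:
    driver.get('http://mixlr.com/<replace username>')
    tree = html.fromstring(driver.page_source)
    # text nodes under the second <span> inside #broadcaster_status (a list of strings)
    air_status = tree.xpath('//*[@id="broadcaster_status"]/span[2]//text()')
finally:
    driver.quit()  # always shut the browser down, even if the scrape fails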
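
Because the XPath ends in `//text()`, `air_status` comes back as a list of text nodes, not a single string, and the list is empty if the element was not in the page. A minimal sketch of collapsing that list into one value might look like the following. The helper name `normalize_status` and the sample strings are hypothetical; they are not part of the gist and do not reflect Mixlr's actual markup.

```python
def normalize_status(text_nodes):
    """Join XPath text() results into one whitespace-normalized string.

    Hypothetical helper (not part of the original gist). Returns None when
    the list is empty or contains only whitespace.
    """
    # split() with no arguments drops leading, trailing, and repeated whitespace
    words = " ".join(text_nodes).split()
    return " ".join(words) if words else None


# Sample input made up for illustration; real values depend on the page.
sample = ["\n  on ", "air\n"]
print(normalize_status(sample))
print(normalize_status([]))
```

In the scraper above this would be called as `normalize_status(air_status)`, and a `None` result is a hint that the element had not been rendered when `page_source` was read.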