Using Selenium and Python to screenshot a javascript-heavy page

As websites become more JavaScript-heavy, it gets harder to automate things like screenshotting for archival purposes. I've seen examples and suggestions to use PhantomJS for visual testing and archiving of websites, but have run into issues such as webfonts not rendering. I had never tried Selenium until today, and while I'm not thinking about performance implications yet, Selenium seems far more accurate than PhantomJS, which makes sense since it drives a real browser. It's also not too hard to script complex interactions: here's an example of how to log in to Twitter, write a tweet, upload an image, and send the tweet via Selenium and DOM element selection. Obviously, you shouldn't automate Twitter through the browser when the API and tweepy are so much better for that, though it is fun to watch the browser go through the steps without you touching a thing.

The example snippet below, which is not particularly well coded, opens up YouTube's homepage and clunkily scrolls to the bottom, triggering the AJAX events that load video previews below the browser fold. It then "clicks" the Load more button, scrolls to the bottom, then scrolls back up before taking a screenshot of the entire page:

from time import sleep

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get("https://www.youtube.com")

# scroll down in steps (1/4, 1/3, 1/2, then the full page height) so the
# AJAX events that load content below the fold get triggered
for isec in (4, 3, 2, 1):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight / %s);" % isec)
    sleep(1)

# load more
sleep(2)
print("push Load more...")
# find_element_by_css_selector() was removed in Selenium 4; use find_element() with By
driver.find_element(By.CSS_SELECTOR, 'button.load-more-button').click()

print("wait a bit...")
sleep(2)

print("Jump to the bottom, work our way back up")
for isec in (1, 2, 3, 4, 5):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight / %s);" % isec)
    sleep(1)

print("Pausin a bit...")
sleep(2)
print("Scrollin to the top so that the nav bar isn't funny looking")
driver.execute_script("window.scrollTo(0, 0);")

sleep(1)
print("Screenshotting...")
# Selenium writes PNG data, so the file gets a .png name. With current Firefox
# drivers save_screenshot() captures only the visible viewport;
# save_full_page_screenshot() (Firefox only, Selenium 4) captures the whole page.
driver.save_full_page_screenshot("/tmp/youtube.com.png")
driver.quit()
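The fixed scroll steps above assume the page finishes loading within a known number of passes. A slightly less clunky alternative, sketched here under that caveat, is to keep scrolling to the bottom until the page height stops growing. The helper name `scroll_until_stable` and its `pause` and `max_rounds` parameters are hypothetical, not part of Selenium; the only driver method it relies on is `execute_script`.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=20):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object exposing Selenium's execute_script(script) method.
    Returns the last document height observed, in pixels.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give AJAX-loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so assume we reached the end
        last_height = new_height
    return last_height
```

With a live browser session this could stand in for the two scroll loops, for example `scroll_until_stable(driver, pause=1.0)` right after `driver.get(...)`. The `max_rounds` cap is there because an infinitely scrolling page would otherwise never settle.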

Result

image