This is a small sketch that loads and saves a dynamic webpage you'd like to scrape. It is useful when tools like wget
can only grab the HTML and JS the server sends, without running the JS that goes on to load or synthesize additional parts of the page.
You'll also need to install the development version of CasperJS (via `brew install casperjs --devel`) and run the scrape file via `casperjs test.js --ssl-protocol=any`. Notice that if the dynamic page needs cookies to load properly (e.g. if you're scraping content that relies on being logged in), you can invoke this with `casperjs test.js --ssl-protocol=any --cookies-file=cookies.txt`.
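
For orientation, here is a minimal sketch of what a scrape script along these lines might look like. It is not necessarily the contents of the actual test.js: the target URL, the fixed five-second wait, the `outputNameFor` helper, and the `typeof phantom` guard are all assumptions made for illustration.

```javascript
// Sketch of a CasperJS scrape script, written in ES5 syntax because
// PhantomJS (which CasperJS drives) predates ES6.
// TARGET_URL, WAIT_MS and outputNameFor are illustrative assumptions.
var TARGET_URL = 'https://example.com/';
var WAIT_MS = 5000; // crude pause to let the page's own JS finish loading content

// Turn a URL into a flat, filesystem-friendly file name.
function outputNameFor(url) {
    return url
        .replace(/^[a-z]+:\/\//i, '')    // drop the scheme (http://, https://)
        .replace(/[^a-z0-9.-]+/gi, '_')  // collapse other characters to "_"
        .replace(/^_+|_+$/g, '') + '.html';
}

// The `phantom` global only exists under PhantomJS/CasperJS, so the
// browser-driving part is skipped if this file is loaded anywhere else.
if (typeof phantom !== 'undefined') {
    var fs = require('fs');
    var casper = require('casper').create();

    casper.start(TARGET_URL);
    casper.wait(WAIT_MS);
    casper.then(function () {
        // getPageContent() returns the page as it stands after scripts have run.
        var path = outputNameFor(TARGET_URL);
        fs.write(path, this.getPageContent(), 'w');
        this.echo('Saved ' + path);
    });
    casper.run();
}
```

A fixed wait is the simplest thing that can work; if the page exposes a reliable marker element, waiting on that element instead would likely be more robust than guessing a delay.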