@jeroen
Created January 13, 2015 00:01
V8 cheerio rvest example
# Proof of concept of using V8 to parse HTML in R
# Example taken from rvest readme
# Jeroen Ooms, 2015
library(V8)
stopifnot(packageVersion("V8") >= "0.4")
# Get Document
html <- paste(readLines("http://www.imdb.com/title/tt1490017/"), collapse="\n")
# Initialize Cheerio in a new V8 context
ct <- new_context()
ct$source("https://raw.githubusercontent.com/jeroenooms/js/master/inst/lib/cheerio.min.js")
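# Parse HTML
# Copy the page source into the JavaScript context as a string
ct$assign('html', html)
# I() makes assign() evaluate the string as JavaScript instead of copying it as text
ct$assign('lego_movie', I('cheerio.load(html)'))
# Extract the text of the elements matching the CSS selector
ct$get('lego_movie("strong span").text()')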
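# A further sketch, not part of the original example: collect the text of
# every matched element into an R character vector instead of one string.
# The "h1" selector and the `headings` variable are assumptions for
# illustration only; the selector may match nothing on the live page.
ct$eval('var headings = lego_movie("h1").map(function(i, el) { return lego_movie(el).text(); }).get();')
ct$get('headings')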
# Play around some more:
ct$console()