@toyeiei
Created January 16, 2019 00:43
rvest final tutorial 4
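# load rvest for html_nodes(), html_text() and the %>% pipe
library(rvest)
# `imdb` is assumed to be the page already parsed with read_html() in an earlier part of this tutorial
# select movie name
movie_names <- imdb %>%
  html_nodes(".lister-item-header a") %>%
  html_text()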
# select year
years <- imdb %>%
  html_nodes(".text-muted.unbold") %>%
  html_text() %>%
  # extract numeric year
  stringr::str_extract('[0-9]+') %>%
  as.integer()
# select rating
ratings <- imdb %>%
  html_nodes(".ratings-imdb-rating strong") %>%
  html_text() %>%
  as.numeric()
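# data.frame() needs vectors of equal length; stop early with a clear error
# if a selector matched a different number of nodes (e.g. a title with no rating)
stopifnot(length(movie_names) == length(years),
          length(years) == length(ratings))
# create dataframe
imdb_data <- data.frame(movie_names, years, ratings)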
# export csv file (row.names = FALSE omits the unnamed row-number column)
write.csv(imdb_data, "imdb_data.csv", row.names = FALSE)