@dsilvadeepal
Last active May 4, 2018 03:13
Extracting the Top 10 Pop Artists of All Time
# Load the packages used for scraping and data handling
library(rvest)   # read_html(), html_nodes(), html_text()
library(dplyr)   # filter(), %>%
library(tibble)  # tibble()
# Identify the URL to extract data from
base_url <- "https://www.billboard.com/charts/greatest-of-all-time-pop-songs-artists"
webpage <- read_html(base_url)
# Get the artist names (html_text() already returns a character vector)
artist <- html_nodes(webpage, ".chart-row__artist")
artist <- html_text(artist)
# Get the artist ranks
rank <- html_nodes(webpage, ".chart-row__rank")
rank <- as.numeric(html_text(rank))
# Save it to a tibble, keeping only the top 10
top_artists <- tibble(Artist = gsub("\n", "", artist), # remove the \n characters in the artist's name
                      Rank = rank) %>%
  filter(Rank <= 10) # filter on the tibble column, not the global `rank` vector
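# A minimal offline sketch of the same pipeline, for checking the selectors and
# the cleaning step without hitting the network. Assumptions: the HTML below is
# a made-up sample with placeholder artist names, not Billboard data, and it
# only mimics the .chart-row__artist / .chart-row__rank classes targeted above;
# the live page's markup may differ.
library(rvest)
library(dplyr)
library(tibble)
sample_html <- '<div class="chart-row"><span class="chart-row__rank">1</span>
<a class="chart-row__artist">
Artist A
</a></div>
<div class="chart-row"><span class="chart-row__rank">11</span>
<a class="chart-row__artist">
Artist B
</a></div>'
# read_html() also accepts a literal HTML string
sample_page <- read_html(sample_html)
sample_artist <- html_text(html_nodes(sample_page, ".chart-row__artist"))
sample_rank <- as.numeric(html_text(html_nodes(sample_page, ".chart-row__rank")))
# Same cleaning and filtering as above: strip newlines, keep ranks 1 to 10
sample_top <- tibble(Artist = gsub("\n", "", sample_artist),
                     Rank = sample_rank) %>%
  filter(Rank <= 10)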