Last active
May 4, 2018 03:13
Extracting the Top 10 Pop Artists of All Time
# Load the packages used below: rvest for scraping, dplyr for %>% and filter(), tibble for tibble()
library(rvest)
library(dplyr)
library(tibble)

# Identify the URL to extract data from
base_url <- "https://www.billboard.com/charts/greatest-of-all-time-pop-songs-artists"
webpage <- read_html(base_url)

# Get the artist names
artist <- html_nodes(webpage, ".chart-row__artist")
artist <- as.character(html_text(artist))

# Get the artist ranks
rank <- html_nodes(webpage, ".chart-row__rank")
rank <- as.numeric(html_text(rank))

# Save the results to a tibble and keep only the top 10
top_artists <- tibble('Artist' = gsub("\n", "", artist), # remove the \n characters in the artist's name
                      'Rank' = rank) %>%
  filter(Rank <= 10) # filter on the tibble's Rank column rather than the global `rank` vector