@favstats
Created January 10, 2019 11:54
Twitter Followers per Day from Social Blade
## Helper function to preview ggplots
## thanks to @tjmahr for sharing!
ggpreview <- function (..., device = "png") {
  fname <- tempfile(fileext = paste0(".", device))
  ggplot2::ggsave(filename = fname, device = device, ...)
  ## "open" is the macOS command; other systems need a different viewer command
  system2("open", fname)
  invisible(NULL)
}
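## install pacman if you don't have it
if (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")
## note: emo is not on CRAN; install it from GitHub (hadley/emo) first
pacman::p_load(tidyverse, rvest, glue, ggrepel, emo)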
## my social blade data
social_html <- read_html("https://socialblade.com/twitter/user/favstats/monthly")
## function to scrape social blade data
get_social_data <- function(x) {
  date <- social_html %>%
    html_nodes(xpath = glue("/html/body/div[10]/div[1]/div[{x}]/div[1]")) %>%
    html_text()
  daily_follower <- social_html %>%
    html_nodes(xpath = glue("/html/body/div[10]/div[1]/div[{x}]/div[3]/div[1]/span")) %>%
    html_text()
  tibble(date, daily_follower)
}
## scraping and cleaning
social_data <- 5:34 %>%
  map_dfr(get_social_data) %>%
  ## treat "--" entries as 0; na = "--" avoids a parsing warning for them
  mutate(daily_follower = if_else(daily_follower == "--", 0,
                                  parse_number(daily_follower, na = "--"))) %>%
  mutate(date = lubridate::as_date(date)) %>%
  mutate(pos_neg = ifelse(daily_follower >= 0, "pos", "neg"))
## graph
ggtwitter <- social_data %>%
  ggplot(aes(date, daily_follower)) +
  geom_point() +
  geom_line(size = 0.5) +
  geom_text_repel(data = social_data %>% filter(daily_follower >= 5),
                  aes(label = daily_follower),
                  nudge_y = 1, nudge_x = 0.5,
                  direction = "y") +
  theme_minimal() +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey") +
  annotate(geom = "text", x = as.Date("2019-01-06"), y = 62,
           label = ji_glue(":open_mouth:Hadley Wickham retweets me:open_mouth:")) +
  annotate(geom = "text", x = as.Date("2019-01-04") + 0.35, y = 14,
           label = ji_glue("Tweet about Rstudio Cloud :cloud:")) +
  labs(x = "", y = "Twitter Followers per Day\n",
       title = "Welcome to all my new Twitter Followers!",
       subtitle = "Beware: You will encounter a lot of R. And Memes\n",
       caption = "Data from Social Blade") +
  scale_x_date(date_breaks = "4 day", date_labels = "%d %b %Y") +
  theme(plot.title = element_text(size = 14, face = "bold"),
        plot.subtitle = element_text(face = "italic"))
## preview plot with this awesome function
# ggpreview(width = 10, height = 6)
ggsave("ggtwitter.png", plot = ggtwitter, width = 10, height = 6)
## Tweet with Graph can be found here: https://twitter.com/favstats/status/1083330490915086336
@erinlynmclean

Howdy! I'm getting

> social_html <- read_html("https://socialblade.com/twitter/user/favstats/monthly")
Error in open.connection(x, "rb") : HTTP error 403.

When I try to run this. Any ideas? The preliminary research I did tells me I may not have permissions to scrape web data from this domain.

@favstats
Author

Yes, this indeed no longer works! It seems they have blocked retrieving data from their website this way. You can check out RSelenium (https://github.com/ropensci/RSelenium) to scrape the website anyway.
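
For anyone who wants to try the RSelenium route, here is a minimal sketch, not tested against Social Blade. It assumes RSelenium and a local Firefox are installed, and that the XPaths from the gist still match the rendered page, which may no longer be true. `fetch_rendered_html()` and `clean_daily_follower()` are hypothetical helper names made up for this example, not part of any package.

```r
## Hypothetical helper: load the page in a real browser via RSelenium and
## hand the rendered source to rvest. Assumes Firefox is installed locally.
fetch_rendered_html <- function(url, browser = "firefox") {
  driver <- RSelenium::rsDriver(browser = browser, verbose = FALSE)
  ## always shut down the browser and the Selenium server, even on error
  on.exit({
    driver$client$close()
    driver$server$stop()
  }, add = TRUE)
  driver$client$navigate(url)
  page_source <- driver$client$getPageSource()[[1]]
  rvest::read_html(page_source)
}

## Hypothetical helper: turn scraped strings such as "+5", "-3", "1,234"
## or "--" into numbers, treating "--" as 0 like the original gist does.
## Missing inputs (NA) stay NA.
clean_daily_follower <- function(x) {
  x <- trimws(x)
  out <- rep(NA_real_, length(x))
  out[x %in% "--"] <- 0
  has_value <- !is.na(x) & x != "--"
  ## drop everything except digits, decimal point and minus sign
  out[has_value] <- as.numeric(gsub("[^0-9.-]", "", x[has_value]))
  out
}

## Usage (not run): replace the read_html() call in the gist with
# social_html <- fetch_rendered_html("https://socialblade.com/twitter/user/favstats/monthly")
```

The rest of the script could then stay as it is, since `get_social_data()` only needs `social_html` to be a parsed HTML document. Whether a browser session actually gets past the 403 is something you would have to check yourself.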
