@mkearney · Created February 22, 2020
Collect (via Twitter's streaming API) tweets from the 2020 Nevada Caucus
## load rtweet
library(rtweet)
## store bounding box coordinates for state of nevada
nv <- c(-120.00647, 35.00186, -114.03965, 42.00221)
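## note: Twitter expects bounding boxes as c(sw_lng, sw_lat, ne_lng, ne_lat),
## i.e., southwest corner first. as a sketch, rtweet can also look up a
## bounding box for you (this call assumes a Google Maps API key is
## configured -- see ?lookup_coords):
#nv <- lookup_coords("nevada")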
## initialize output list and Midwestern 'ope' counter
s <- list()
ope <- 0L
## specify stop time as *noon tomorrow* (you can manually override too)
stop_time <- as.POSIXct(format(Sys.Date() + 1, "%Y-%m-%d 12:00:00"), tz = "UTC")
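## e.g., a manual override with an explicit stop time might look like:
#stop_time <- as.POSIXct("2020-02-23 12:00:00", tz = "UTC")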
## keep streaming tweets until noon (UTC) tomorrow
while (Sys.time() < stop_time) {
  ## connect to the stream API for one hour at a time; on error,
  ## store an empty data frame instead of crashing the loop
  s[[length(s) + 1L]] <- tryCatch(
    stream_tweets(nv, timeout = 60 * 60),
    error = function(e) data.frame()
  )
  ## if no tweets (probably due to error) then add to 'ope' counter
  ## otherwise print running total of collected tweets
  ## (dapr::vap_int() is a typed vapply from the {dapr} package)
  if (NROW(s[[length(s)]]) == 0L) {
    cat("Ope!\n")
    Sys.sleep(1)
    ope <- ope + 1L
  } else {
    cat("Collected", sum(dapr::vap_int(s, NROW)), "tweets!\n")
  }
  ## this is a safety valve; if your stream fails 101 times,
  ## you should probably shut it down and figure out why
  if (ope > 100L) {
    stop("Shut it down. That's too many 'opes'!")
  }
}
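## (optional) failed connections leave empty placeholder data frames in
## the list; a minimal base-R sketch for dropping them before combining:
#s <- s[vapply(s, NROW, integer(1)) > 0]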
## combine into single data frame
s <- do.call("rbind", s)
## save as CSV (and how to read it back in)
save_as_csv(s, "nevada-caucus-tweets-2020.csv")
#s <- rtweet::read_twitter_csv("nevada-caucus-tweets-2020.csv")
## save as serialized R data (and how to read it back in)
saveRDS(s, "nevada-caucus-tweets-2020.rds")
#s <- readRDS("nevada-caucus-tweets-2020.rds")
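## (optional) quick sanity check of tweet volume over time using
## rtweet's ts_plot() (assumes ggplot2 is installed):
#ts_plot(s, by = "hours")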