
@joelnitta
Created November 18, 2022 03:03
Archive a user's tweets
library(rtweet)
library(tidyverse)
# Initial authorization setup, only need to do once
# auth_setup_default() #nolint
# Authorize
auth_as("default")
# Set user name
user_name <- "PUT USER NAME HERE"
# Download all tweets from user (as many as possible anyways)
# The docs say it should work for up to 3200 tweets
# https://docs.ropensci.org/rtweet/reference/get_timeline.html
my_tweets <- get_timeline(
  user = user_name,
  n = Inf,
  retryonratelimit = TRUE
)
# And save them forever!
saveRDS(my_tweets, "my_tweets.RDS")
# Extract URLs of media (images etc)
# We can map back to the tweet the image came from by `tweet_id`, which
# matches `id` of my_tweets
media_url <-
  my_tweets %>%
  rename(tweet_id = id) %>%
  mutate(media = map(entities, "media")) %>%
  select(tweet_id, media) %>%
  unnest(media) %>%
  filter(!is.na(id)) %>%
  mutate(
    filename = str_match(media_url, "\\/([^\\/]*)$") |>
      magrittr::extract(, 2)
  )
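# Download media to "media" folder (created first if it doesn't exist yet)
dir.create("media", showWarnings = FALSE)
# mode = "wb" keeps binary files such as images intact on Windows
walk2(
  media_url$media_url,
  media_url$filename,
  ~ download.file(.x, file.path("media", .y), mode = "wb")
)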
@joelnitta
Author

@darachm glad it worked for you!

It should be possible to wrap this into a function that could be run from the command line - basically, the only input you need is the user name. But as you note they would still need to authorize using rtweet.
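A rough sketch of what such a wrapper might look like, assuming the file is saved as `archive_tweets.R` (a made-up name) and that `auth_setup_default()` has already been run once on the machine. `archive_tweets()` is a hypothetical helper, not part of rtweet; it just reuses the `get_timeline()` call from the gist.

```r
#!/usr/bin/env Rscript
# Sketch only: archive one user's timeline from the command line.
# Assumes rtweet is installed and authorization was set up beforehand.

# Hypothetical helper (not part of rtweet)
archive_tweets <- function(user_name,
                           out_file = paste0(user_name, "_tweets.RDS")) {
  rtweet::auth_as("default")
  tweets <- rtweet::get_timeline(
    user = user_name,
    n = Inf,
    retryonratelimit = TRUE
  )
  saveRDS(tweets, out_file)
  # Return the output path invisibly so callers can reuse it
  invisible(out_file)
}

# The only input is the user name, taken from the first argument
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 1) {
  archive_tweets(args[1])
} else {
  message("Usage: Rscript archive_tweets.R USER_NAME")
}
```

It would then be run as `Rscript archive_tweets.R USER_NAME`.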

However, there's another reason the current script wouldn't work well for non-useRs: the data are nested (they originally come from the Twitter API as JSON), so you can't write them out to a flat CSV. I save them as an RDS file, which you can only really work with in R. So if you wanted to make a "general purpose" script it might be better just to save the JSON.
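To illustrate the point about nesting, here is a minimal sketch using a toy data frame with a nested `media` list-column (the data are made up, not real rtweet output). It assumes the jsonlite package is installed. Note that this re-serializes a data frame to JSON, which is not the same thing as keeping the raw API response.

```r
library(jsonlite)

# Toy stand-in for the nested tweet data (made-up values)
toy_tweets <- data.frame(
  id = c("1", "2"),
  text = c("hello", "world"),
  stringsAsFactors = FALSE
)
# List-column: one data frame of media per tweet (the second tweet has none)
toy_tweets$media <- list(
  data.frame(media_url = "http://example.com/a.jpg", stringsAsFactors = FALSE),
  data.frame(media_url = character(0), stringsAsFactors = FALSE)
)

# JSON keeps the nesting that a flat CSV can't represent
json_file <- tempfile(fileext = ".json")
write_json(toy_tweets, json_file, pretty = TRUE)

# Read it back to check that the top-level fields survive
restored <- fromJSON(json_file)
```

With the real data, something like `write_json(my_tweets, "my_tweets.json")` would be the starting point, though some column types (dates, for example) would likely come back as plain strings.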
