@thoughtfulbloke
Last active October 26, 2018 02:47
library(rtweet)
library(dplyr)
library(tidytext)
# The query q is meant to be a single emoji; a Unicode spacing issue crept in
# when writing it, hence the leading space inside the string.
happy_face <- search_tweets(q = " 😀", n = 100000, token = bearer_token(),
                            retryonratelimit = TRUE)
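# Keep original tweets only (no retweets or quotes), split each one into single
# characters, and record each character's neighbours within its own tweet.
happy_centric <- happy_face %>%
  filter(is.na(retweet_status_id) & is.na(quoted_status_id)) %>%
  select(status_id, text) %>%
  unnest_tokens(char, text, token = "characters") %>%
  group_by(status_id) %>%
  mutate(before = lag(char), after = lead(char)) %>%
  ungroup() %>%
  select(before, char, after) %>%
  filter(char == "😀")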
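# Stack the neighbours into one long table labelled by side, dropping the NAs
# that arise when the emoji is the first or last character of a tweet.
either_side <- data.frame(
  char = c(happy_centric$before, happy_centric$after),
  place = c(rep("before", length(happy_centric$before)),
            rep("after", length(happy_centric$after))),
  stringsAsFactors = FALSE) %>%
  filter(!is.na(char))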
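# For each neighbouring character seen more than 30 times, compute the share of
# its occurrences that fall before the emoji (View() needs an interactive session).
either_side %>%
  count(place, char) %>%
  group_by(char) %>%
  summarise(prop_before = sum(n * (place == "before")) / sum(n), number = sum(n)) %>%
  filter(number > 30) %>%
  View()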
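
# A minimal sketch of the same lag()/lead() neighbour lookup on made-up data,
# so the idea can be tried without a Twitter token. The toy_tweets and
# toy_centric names and the two example texts are hypothetical; they are not
# part of the original analysis.
library(dplyr)     # loaded again so this snippet also runs on its own
library(tidytext)
toy_tweets <- data.frame(status_id = c("1", "2"),
                         text = c("ab😀cd", "😀x"),
                         stringsAsFactors = FALSE)
toy_centric <- toy_tweets %>%
  unnest_tokens(char, text, token = "characters") %>%
  group_by(status_id) %>%                              # neighbours never cross tweets
  mutate(before = lag(char), after = lead(char)) %>%
  ungroup() %>%
  filter(char == "😀")
# Where the emoji opens a tweet there is no previous character, so before is NA.
print(toy_centric)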