@gadenbuie
Created April 18, 2019 17:09
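library(tidyverse)
library(tidytext)

# Expects mueller_report.csv (from github.com/gadenbuie/mueller-report) in the working directory
mueller_report <- read_csv("mueller_report.csv")

# Tokenize the text column into one lowercase word per row
mueller_report_tidy <-
  mueller_report %>%
  tidytext::unnest_tokens(word, text)
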
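mueller_report_tidy %>%
  # Drop the first two lines (presumably page headers)
  filter(line > 2) %>%
  anti_join(tidytext::stop_words, by = "word") %>%
  # Keep only tokens that contain at least one letter
  filter(!is.na(word), str_detect(word, "[a-zA-Z]")) %>%
  # Strip possessives, fix an apparent OCR error, and merge variants of "russia"
  mutate(word = str_remove(word, "'s$")) %>%
  mutate(word = str_replace(word, "^corney$", "comey")) %>%
  mutate(word = str_replace(word, "russian?s?", "russia")) %>%
  count(word, sort = TRUE) %>%
  slice(1:25) %>%
  # Order the factor so the most frequent word ends up at the top after coord_flip()
  mutate(word = fct_rev(fct_inorder(word))) %>%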
  ggplot() +
  aes(word, n, fill = n) +
  geom_col() +
  coord_flip() +
  labs(
    y = "Number of Times Mentioned", x = NULL,
    title = "25 Most-Used Words in the Redacted Mueller Report",
    caption = "Chart: @grrrck\nData: github.com/gadenbuie/mueller-report\nSource: Steven Rich (Washington Post)"
  ) +
  scale_y_continuous(expand = c(0.01, 0), limits = c(0, 2500)) +
  # Hide the fill legend; bar length already encodes the count
  scale_fill_gradient(low = "#b7c6d6", high = "#445566", guide = "none") +
  # ggsci::scale_fill_material("deep-purple") +
  grkmisc::theme_grk(panel_background_color = NA) +
  theme(
    panel.border = element_blank(),
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    plot.caption = element_text(hjust = 0)
  )