@katerinabc
Created February 2, 2019 20:49
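# Packages used below: dplyr (%>%, filter, count), tidyr (separate),
# tidytext (unnest_tokens, stop_words), ggraph (plotting; also attaches ggplot2).
library(dplyr)
library(tidyr)
library(tidytext)
library(ggraph)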
# As before, tokenize the mission column, but this time into bigrams
# (pairs of consecutive words) instead of single words.
ms_bi <- ms %>% unnest_tokens(bigram, mission, token = "ngrams", n = 2)
ms_bi
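# Illustration only, on a hypothetical toy sentence (not part of the analysis):
# a minimal sketch of how bigram tokenization splits text into overlapping
# pairs of consecutive words. The names toy and toy_bigrams are made up here.
toy <- data.frame(id = 1, mission = "we build useful research tools",
                  stringsAsFactors = FALSE)
toy_bigrams <- tidytext::unnest_tokens(toy, bigram, mission,
                                       token = "ngrams", n = 2)
# A five-word sentence yields four bigrams, each sharing a word with the next.
toy_bigrams$bigram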
# Split the bigram column into two columns, one per word.
bigrams_separated <- ms_bi %>%
  separate(bigram, c("word1", "word2"), sep = " ")
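# Drop bigrams in which either word is a stop word.
# The is.na() check also drops the NA rows that unnest_tokens() can return
# for texts with fewer than two words.
bigrams_filtered <- bigrams_separated %>%
  filter(!is.na(word1), !is.na(word2)) %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word)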
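# Illustration only, on hypothetical toy rows (not part of the analysis):
# a bigram is dropped as soon as either of its words appears in tidytext's
# stop_words list. Base R subsetting is used here instead of dplyr::filter()
# so the sketch runs on its own; toy_pairs, stops and toy_kept are made-up names.
toy_pairs <- data.frame(word1 = c("the", "renewable", "renewable"),
                        word2 = c("energy", "of", "energy"),
                        stringsAsFactors = FALSE)
stops <- tidytext::stop_words$word
toy_kept <- toy_pairs[!toy_pairs$word1 %in% stops &
                        !toy_pairs$word2 %in% stops, ]
# "the energy" and "renewable of" each contain a stop word, so only the
# "renewable energy" row should remain.
toy_kept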
# Count how often each bigram occurs, most frequent first.
bigram_counts <- bigrams_filtered %>%
  count(word1, word2, sort = TRUE)
bigram_counts
# Keep only bigrams that occur more than once.
bigram_counts <- bigram_counts %>% filter(n > 1)
# Create the bigram network graph: word1 -> word2 become directed edges,
# and the count n is carried along as an edge attribute.
bigram_graph <- bigram_counts %>%
  igraph::graph_from_data_frame()
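# Illustration only, on hypothetical toy counts (not part of the analysis):
# a minimal sketch of what graph_from_data_frame() builds. The first two
# columns are read as edge endpoints and any remaining column (here n) is
# stored as an edge attribute, which is why aes(edge_alpha = n) works in the
# plot below. toy_counts and toy_graph are made-up names.
toy_counts <- data.frame(word1 = c("renewable", "energy", "renewable"),
                         word2 = c("energy", "policy", "power"),
                         n = c(3, 2, 2),
                         stringsAsFactors = FALSE)
toy_graph <- igraph::graph_from_data_frame(toy_counts)
igraph::vcount(toy_graph)   # number of distinct words (nodes)
igraph::ecount(toy_graph)   # number of bigrams (edges)
igraph::E(toy_graph)$n      # bigram counts stored on the edges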
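# Closed arrowheads show the direction from the first word to the second.
a <- grid::arrow(type = "closed", length = grid::unit(.15, "inches"))
# Plot with the Fruchterman-Reingold ("fr") layout; edge opacity reflects the
# bigram count. The layout is randomized, so call set.seed() beforehand if the
# plot needs to be reproducible.
ggraph(bigram_graph, layout = "fr") +
  geom_edge_link(aes(edge_alpha = n), show.legend = FALSE,
                 arrow = a, end_cap = circle(.07, "inches")) +
  geom_node_point(color = "lightblue", size = 5) +
  geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
  theme_void()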
# Save the most recent plot; mypath (the output directory) must already be defined.
ggsave("mission_text_bigram_more_than_1.png", path = mypath)