@JosiahParry
Created April 29, 2019 21:45
Example of using `calc_self_sim()` for self-similarity of song lyrics
library(genius)
library(tidytext)
library(tidyverse)
# get lyrics for album
since_96 <- genius_album("ALIX", "since 96")
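# calculate self-similarity by group (thank you dplyr v 0.8!!!)
# group_modify() is used here because, from dplyr 0.8.1 on, group_map()
# returns a list instead of a grouped data frame
self_sim_96 <- since_96 %>%
  group_by(track_title, track_n) %>%
  group_modify(~ calc_self_sim(.x, lyric))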
# filter to only the first four tracks,
# converting track_n to a factor
self_sim_96 %>%
  ungroup() %>%
  filter(track_n %in% 1:4) %>%
  mutate(track_n = as.factor(track_n)) %>%
  ggplot(aes(x = x_id, y = y_id, fill = identical)) +
  geom_tile() +
  scale_fill_manual(values = c("white", "black")) +
  theme_minimal() +
  theme(legend.position = "none",
        axis.text = element_blank()) +
  scale_y_continuous(trans = "reverse") +
  facet_wrap(~track_title, ncol = 2, scales = "free") +
  labs(x = "", y = "", title = "Since '96", subtitle = "Lyric self-similarity")
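# A minimal base-R sketch of the idea behind the plot above; this is NOT the
# genius implementation of calc_self_sim(). Assumption (inferred from the aes()
# call above): the result has one row per pair of word positions, with columns
# x_id, y_id and a logical `identical` flag. `toy_words` is a hypothetical,
# already-tokenized lyric used only for illustration.
toy_words <- c("la", "da", "la", "da", "dee")
toy_sim <- expand.grid(x_id = seq_along(toy_words),
                       y_id = seq_along(toy_words))
# a pair is "identical" when the words at the two positions match
toy_sim$identical <- toy_words[toy_sim$x_id] == toy_words[toy_sim$y_id]
# the diagonal (x_id == y_id) is always TRUE; off-diagonal TRUE cells mark
# repeated words, which is what shows up as black tiles away from the diagonal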
@JosiahParry
Author

Note that this code was broken as of dplyr 0.8.1
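
The breakage is most likely the `group_map()` change: from dplyr 0.8.1 on, `group_map()` returns a list with one element per group, and `group_modify()` is the verb that returns a grouped data frame. A minimal sketch of the difference in usage, assuming dplyr 0.8.1 or later; the `toy` tibble and the `n_lines` column are hypothetical stand-ins for the lyrics data:

```r
library(dplyr)

# hypothetical stand-in for the lyrics tibble: one row per lyric line
toy <- tibble(track_n = c(1, 1, 2),
              lyric   = c("a b", "b c", "d"))

# group_modify() expects a function that returns a data frame for each group
# and binds the results back together, keeping the grouping columns
line_counts <- toy %>%
  group_by(track_n) %>%
  group_modify(~ tibble(n_lines = nrow(.x)))
```

Because the result is still a grouped data frame, it can be piped on to `ungroup()` and `ggplot()` the same way the gist does.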
