This find-and-replace function was inspired by Daniel Mallory Ortberg's "Bible Verses Where 'Behold' Has Been Replaced With 'Look, Buddy'" <http://the-toast.net/2016/06/06/bible-verses-where-behold-has-been-replaced-with-look-buddy/>. It lets you create a dataframe of verses from the Bible with similar substitutions.
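# imports the necessary libraries (`library()` loads one package per call)
library(scriptuRs)  # Bible text via kjv_bible()
library(dplyr)      # %>%, select(), filter(), mutate(), summarize(), arrange()
library(tidyr)      # drop_na()
library(stringr)    # str_detect(), used in the commented-out filter
library(tidytext)   # unnest_tokens() and the stop_words list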
# creates a function that imports the Bible data, selects the important columns, keeps the verses that contain the first string passed to the function, and adds a new column in which that string is replaced by the second string.
# The function returns a dataframe of every verse that includes the first string. The `revText` column contains the revised text of each verse.
bibleEdit <- function(kjv, nkjv){
  bible <- kjv_bible() %>%
    select(chapter_number, verse_number, verse_title, book_title, text) %>%
    drop_na(text) %>%
    # keeps verses that match `kjv`, ignoring case (`kjv` is treated as a regular expression)
    filter(grepl(kjv, text, ignore.case = TRUE)) %>%
    # filter(str_detect(text, kjv)) %>%
    # swaps in the replacement string
    mutate(revText = gsub(kjv, nkjv, text, ignore.case = TRUE)) %>%
    # re-capitalizes the first letter of the verse and of any sentence that follows a period
    mutate(revText = gsub("(^[a-z]|\\. [a-z])", "\\U\\1", revText, perl = TRUE)) %>%
    # collapses double commas, e.g. when the replacement string itself ends in a comma
    mutate(revText = gsub(",,", ",", revText)) %>%
    # prefixes the verse reference and appends a blank line
    mutate(revText = paste0(verse_title, " ", revText, "\n\n"))
  return(bible)
}
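# A minimal sketch of the three text-cleanup steps used inside `bibleEdit`, applied to a plain character vector so they can be tried without loading the Bible data.
# `reviseText` and the sample sentence are hypothetical, used only for illustration; the sketch assumes nothing beyond base R.
reviseText <- function(text, find, replace){
  # case-insensitive substitution of `find` with `replace`
  out <- gsub(find, replace, text, ignore.case = TRUE)
  # upper-case a lower-case letter at the start of the string or after ". "
  out <- gsub("(^[a-z]|\\. [a-z])", "\\U\\1", out, perl = TRUE)
  # collapse any doubled comma produced by the substitution
  gsub(",,", ",", out)
}
reviseText("Behold, I stand at the door, and knock.", "behold", "look, buddy,")
# returns "Look, buddy, I stand at the door, and knock."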
# This line calls the function with the string to search for in the first quotation marks and its replacement in the second. The resulting dataframe is assigned to the `verses` variable.
verses <- bibleEdit("thou shalt", "it'd be great if you could")
# `cat` prints in a way that retains line breaks. This version prints just the revised text to the console; `sep = ""` keeps `cat` from adding a space before every verse after the first.
cat(verses$revText, sep = "")
# If you are unsure which phrases might be fun or interesting to substitute, the code below counts either words or ngrams in the `bible` dataframe. Control the size of the ngrams with the `n =` argument.
# `bible` inside `bibleEdit` is local to the function, so the full text is loaded here for counting.
bible <- kjv_bible()
# word counts, with common stop words removed using tidytext's built-in `stop_words` list
bible %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  group_by(word) %>%
  summarize(count = n()) %>%
  arrange(desc(count), word) %>%
  View()
# ngram counts
bible %>%
  unnest_tokens(ngram, text, token = "ngrams", n = 3) %>%
  group_by(ngram) %>%
  summarize(count = n()) %>%
  arrange(desc(count), ngram) %>%
  View()
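# A small sketch for trying the ngram count on a toy dataframe before running it on the full text.
# `toy`, its two rows, and `toyCounts` are made-up illustration names and data; the sketch assumes dplyr and tidytext are installed.
library(dplyr)
library(tidytext)
toy <- tibble(text = c("thou shalt not steal", "thou shalt not kill"))
toyCounts <- toy %>%
  unnest_tokens(ngram, text, token = "ngrams", n = 3) %>%
  group_by(ngram) %>%
  summarize(count = n()) %>%
  arrange(desc(count), ngram)
toyCounts
# the first row is "thou shalt not" with a count of 2; the other two trigrams each appear once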