Last active June 7, 2019 03:25
This find & replace function was inspired by Daniel Mallory Ortberg's "Bible Verses Where 'Behold' Has Been Replaced With 'Look, Buddy'". It lets you create a dataframe of Bible verses with similar substitutions.
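# imports the necessary libraries; library() accepts only one package per call
library(scriptuRs) # provides kjv_bible()
library(tidyverse) # provides %>%, select(), filter(), mutate(), drop_na(), and stringr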
# bibleEdit() imports the Bible data, selects the important columns, and keeps
# only the verses that contain the first string passed to it (`kjv`, matched
# case-insensitively). It returns a dataframe of those verses with a new
# `revText` column in which `kjv` is replaced by the second string (`nkjv`).
bibleEdit <- function(kjv, nkjv) {
  kjv_bible() %>%
    select(chapter_number, verse_number, verse_title, book_title, text) %>%
    drop_na(text) %>%
    filter(grepl(kjv, text, ignore.case = TRUE)) %>%
    # filter(str_detect(text, kjv)) %>%
    mutate(revText = gsub(kjv, nkjv, text, ignore.case = TRUE)) %>%
    # re-capitalize the first letter of the verse and of each new sentence
    mutate(revText = gsub("(^[a-z]|\\. [a-z])", "\\U\\1", revText, perl = TRUE)) %>%
    # collapse any doubled commas into one
    mutate(revText = gsub(",,", ",", revText)) %>%
    # prefix each verse with its title and follow it with a blank line
    mutate(revText = paste(verse_title, " ", revText, "\n\n", sep = ""))
}
# This line calls the function with the string to search for in the first quotation marks and its replacement in the second. The resulting dataframe is assigned to the `verses` variable.
verses <- bibleEdit("thou shalt", "it'd be great if you could")
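# The replace-then-recapitalize steps can be tried without downloading any
# scripture data. What follows is a minimal base R sketch, not part of the
# original function: `sample_text` is made-up input and `swap_phrase()` is a
# hypothetical helper that mirrors the grepl()/gsub() calls in bibleEdit().

```r
# Made-up sample verses (an assumption for illustration, not scriptuRs data)
sample_text <- c(
  "Thou shalt not steal.",
  "And he said, Thou shalt love thy neighbour.",
  "In the beginning."
)

# Hypothetical helper mirroring the replacement steps in bibleEdit()
swap_phrase <- function(text, old, new) {
  # keep only the strings that contain `old`, ignoring case
  hits <- text[grepl(old, text, ignore.case = TRUE)]
  # substitute `new` for every occurrence of `old`, ignoring case
  swapped <- gsub(old, new, hits, ignore.case = TRUE)
  # with perl = TRUE, \\U upper-cases the captured group: a lowercase letter
  # at the start of the string or directly after ". "
  gsub("(^[a-z]|\\. [a-z])", "\\U\\1", swapped, perl = TRUE)
}

revised <- swap_phrase(sample_text, "thou shalt", "it'd be great if you could")
cat(revised, sep = "\n")
```

# Only the first character and letters following ". " are re-capitalized, so a
# replacement that lands mid-sentence (after a comma, say) stays lowercase.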
# `cat` prints in a way that retains line breaks. This version prints just the revised text to the console.
cat(verses$revText, sep = "")
# If you are unsure which phrases might be fun or interesting to substitute, the
# code below counts either single words or ngrams in the `bible` dataframe.
# Control the size of the ngrams with the `n =` argument.
library(tidytext) # provides unnest_tokens() and the stop_words dataset
bible <- kjv_bible()
# most frequent words, excluding common stop words
bible %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  group_by(word) %>%
  summarize(count = n()) %>%
  arrange(desc(count), word)
# most frequent three-word phrases
bible %>%
  unnest_tokens(ngram, text, token = "ngrams", n = 3) %>%
  group_by(ngram) %>%
  summarize(count = n()) %>%
  arrange(desc(count), ngram)
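# To see what the ngram count is doing, here is a rough base R sketch that runs
# without tidytext. `count_ngrams()` and `toy_verses` are hypothetical names
# introduced for illustration. The lowercasing and punctuation stripping only
# approximate the tokenizer's behaviour, so counts on real text may differ.

```r
# Hypothetical helper: count every run of `n` consecutive words in `text`
count_ngrams <- function(text, n = 3) {
  ngrams <- unlist(lapply(text, function(verse) {
    # drop punctuation (keeping apostrophes), lowercase, split on spaces
    cleaned <- tolower(gsub("[^[:alpha:]' ]", "", verse))
    words <- strsplit(cleaned, " +")[[1]]
    words <- words[words != ""]
    # a verse shorter than n words contributes no ngrams
    if (length(words) < n) return(character(0))
    starts <- seq_len(length(words) - n + 1)
    vapply(starts, function(i) paste(words[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  # tabulate and put the most frequent ngrams first
  sort(table(ngrams), decreasing = TRUE)
}

# Made-up input (an assumption for illustration, not scriptuRs data)
toy_verses <- c("Thou shalt not kill.", "Thou shalt not steal.")
counts <- count_ngrams(toy_verses, n = 3)
print(counts)
```

# Each verse is tokenized on its own, so no ngram spans two verses.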