@leeper
Created August 9, 2017 08:11
Small script to build a BibTeX file from a collection of PDF files
library("pdftools")
# load files
files <- dir(pattern = "\\.pdf$")
# setup metadata
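# four-digit year following ", " in the file name; "" when there is none
# (regmatches() returns only the matches, so fill by position to stay aligned with files)
year_match <- regexpr("(?<=, )\\d{4}", files, perl = TRUE)
year <- rep("", length(files))
year[year_match != -1] <- regmatches(files, year_match)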
# journal name: text between "(" and ", <year>" in the file name; "" when there is none
journal <- unlist(lapply(files, function(x) {
  if (grepl("(?<=\\().+(?=, \\d{4})", x, perl = TRUE)) {
    regmatches(x, regexpr("(?<=\\().*?(?=, \\d{4})", x, perl = TRUE))
  } else {
    ""
  }
}))
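# authors: the file name with the " (Journal, Year)" part and the ".pdf" extension removed
authors <- gsub("( \\(.+,? ?[[:digit:]]?\\))|(\\)?\\.pdf)", "", files)
# citation key: first author plus "etal" for more than three authors, otherwise all author names joined
key <- unlist(lapply(authors, function(x) {
  x <- strsplit(x, ", ")[[1]]
  if (length(x) > 3) {
    paste0(head(x, 1), "etal", collapse = "")
  } else {
    paste0(head(x, 3), collapse = "")
  }
}))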
# title: the Title field from each PDF's metadata; "" when it is missing
title <- vapply(files, function(x) {
  title <- pdftools::pdf_info(x)$keys$Title
  if (is.null(title)) "" else title
}, character(1), USE.NAMES = FALSE)
# write to disk
cat(
  paste0("@Article{", paste0(key, year), ",\n",
         "author = {", gsub(", ", " and ", authors, fixed = TRUE), "},\n",
         "title = {", title, "},\n",
         "year = {", year, "},\n",
         "journal = {", journal, "},\n",
         "file = {:", files, ":PDF}\n",
         "}", collapse = "\n\n"), "\n",
  sep = "",  # avoid the stray space cat() would otherwise put before the final newline
  file = "tmp.bib")
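The regular expressions above appear to assume file names of the form "Author1, Author2 (Journal, Year).pdf". That convention is inferred from the patterns and is not stated in the gist. The sketch below runs the same parsing steps on three made-up file names, so the extracted fields can be checked without any PDFs or pdftools. The file names are hypothetical, and `year_match` and `journal_match` are illustrative helper variables.

```r
# Hypothetical file names following the assumed "Authors (Journal, Year).pdf" pattern
files <- c("Smith, Jones (Political Analysis, 2015).pdf",
           "Lee, Kim, Park, Choi (Journal of Politics, 2012).pdf",
           "Doe.pdf")

# Year: four digits following ", "; "" when the name has none
year_match <- regexpr("(?<=, )\\d{4}", files, perl = TRUE)
year <- rep("", length(files))
year[year_match != -1] <- regmatches(files, year_match)

# Journal: text between "(" and ", <year>"; "" when the name has none
journal_match <- regexpr("(?<=\\().*?(?=, \\d{4})", files, perl = TRUE)
journal <- rep("", length(files))
journal[journal_match != -1] <- regmatches(files, journal_match)

# Authors: drop the " (Journal, Year)" part and the ".pdf" extension
authors <- gsub("( \\(.+,? ?[[:digit:]]?\\))|(\\)?\\.pdf)", "", files)

# Key: first author plus "etal" for more than three authors, else all names joined
key <- vapply(strsplit(authors, ", ", fixed = TRUE), function(x) {
  if (length(x) > 3) paste0(x[1], "etal") else paste0(x, collapse = "")
}, character(1))

print(data.frame(key = paste0(key, year), authors, journal, year))
```

A file name with no parenthesised part, such as "Doe.pdf", yields an empty year and journal, so its BibTeX entry would need to be completed by hand.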