@ashenkin
Last active March 22, 2023 21:11
R script to generate co-author lists often required by funding agencies
# Alexander Shenkin 2023.
# License: CC BY 4.0: https://creativecommons.org/licenses/by/4.0/. TL;DR: share and adapt with attribution.
#
# This script reads in a csv file of publications and produces a list of coauthors.
# 1) To get the csv of publications, download a bibtex or other archive of desired pubs from ORCID or another publication search tool.
# Note: don't use Google Scholar! They limit the number of co-authors exported.
# 2) Import that bibtex or other format archive into Zotero
# 3) Export those imported references in a .csv format.
#
# Unfortunately, this does not include the coauthors' institutions. Those still have to be added manually.
# This could be improved by using the refsplitr R package to include the institutional affiliations of coauthors automatically.
library(tidyverse)
setwd("/your/directory/") # replace with the folder that holds your exported csv
pubs = readr::read_csv("your_zotero_exported_file.csv") # replace filename here
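coauthors = pubs %>%
  arrange(`Publication Year`) %>%
  filter(`Publication Year` > 2020 & !is.na(`Publication Year`)) %>% # keeps pubs after 2020; change the cutoff year as needed
  select(Author, `Publication Year`) %>%
  filter(!is.na(Author)) %>% # skip records with no author field
  separate_longer_delim(Author, delim = ';') %>% # one row per author (needs tidyr >= 1.3.0)
  # Split "Last, First". Single-name authors (e.g. consortia) get first = NA; extra commas stay in `first`.
  separate_wider_delim(Author, delim = ',', names = c("last", "first"),
                       too_few = "align_start", too_many = "merge") %>%
  mutate(across(c(last, first), trimws)) %>% # trim the names only, so the year stays numeric
  group_by(last, first) %>%
  summarize(last_pub_year = max(`Publication Year`), .groups = "drop") # most recent shared publication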
write.csv(coauthors, file = "COI.csv", row.names = FALSE) # copy and paste this data into your conflict of interest form. Remove your own name, and
# have a looksee for duplicates that weren't caught by the script.
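
# Optional follow-up (a sketch, not part of the original script): flag rows that may be
# the same person entered under different name forms, e.g. "Smith, J." and "Smith, John".
# The helper name `flag_possible_duplicates` and its matching rule (same lower-cased last
# name and same first initial) are assumptions for illustration. The rule will also flag
# genuinely different people, so treat the flag as a prompt for manual review only.
library(dplyr)

flag_possible_duplicates = function(df) {
  df %>%
    mutate(
      key_last = tolower(trimws(last)),                   # case-insensitive last name
      key_initial = tolower(substr(trimws(first), 1, 1))  # first initial only
    ) %>%
    group_by(key_last, key_initial) %>%
    mutate(possible_duplicate = n() > 1) %>%              # TRUE when the key is shared
    ungroup() %>%
    select(-key_last, -key_initial)
}

# Example with made-up names; on real data, call flag_possible_duplicates(coauthors).
example_authors = data.frame(
  last = c("Smith", "Smith", "Jones"),
  first = c("John", "J.", "Ana"),
  stringsAsFactors = FALSE
)
flagged = flag_possible_duplicates(example_authors)
print(flagged)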