@skonkiel
Created February 13, 2018 23:01
Useful Python packages for scholarly communication research

Here are a few useful Python packages I've found so far that can help with various aspects of scholarly communication (scholcomm) research. I make no claims about the quality of these packages.

Multi-purpose

  • pyOpenSci pyApiToolkit: 'Python 3 scripts to access, create, distribute and publish open research data or data about open science works.' Includes DOAJ, oaDOI (Unpaywall), ORCID, Zotero and Wikidata API wrappers.

Webpages as data

  • BeautifulSoup: Scrape webpages (including journal webpages, where permitted by the journal's terms and conditions). I've used this to scrape acknowledgements and conflict of interest data from clinical trials published in journals.
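As a rough illustration of that kind of scraping, here is a minimal sketch that pulls the paragraph following a named heading out of an article page. The HTML is made up for the example, and the assumption that each section is an <h2> followed by a <p> will not hold for every journal; section_text is a hypothetical helper name, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Made-up article markup, for illustration only. Real journal pages differ.
SAMPLE_HTML = """
<article>
  <h2>Results</h2>
  <p>Outcome data go here.</p>
  <h2>Acknowledgements</h2>
  <p>We thank the trial participants.</p>
  <h2>Conflict of interest</h2>
  <p>None declared.</p>
</article>
"""


def section_text(html, heading):
    """Return the text of the first <p> sibling after the <h2> matching heading.

    Hypothetical helper: assumes sections are marked up as <h2> + <p>.
    Returns None if the heading (or a following paragraph) is not found.
    """
    soup = BeautifulSoup(html, "html.parser")
    for h2 in soup.find_all("h2"):
        if h2.get_text(strip=True).lower() == heading.lower():
            para = h2.find_next_sibling("p")
            return para.get_text(strip=True) if para else None
    return None


print(section_text(SAMPLE_HTML, "Acknowledgements"))
```

For a real page you would fetch the HTML first (and check the site's terms), then adjust the tag names to match that journal's markup.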

Publication data

  • crossrefapi: Access the Crossref API for data on journal articles, journals, funding info, and more.
  • refextract: Given a link to a journal article, returns a scraped, structured list of the references included in that article.
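To give a sense of the data behind the Crossref entry above, here is a minimal sketch that talks to the public Crossref REST endpoint directly with the standard library, rather than through the crossrefapi wrapper. The helper names (works_url, fetch_work, summarize) and the sample record are my own assumptions for illustration; the sample is a trimmed, made-up stand-in for the "message" part of a real response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_ROOT = "https://api.crossref.org/works/"


def works_url(doi):
    """Build the Crossref REST URL for a single DOI (slashes are kept as-is)."""
    return API_ROOT + quote(doi.strip())


def fetch_work(doi):
    """Fetch one record over the network and return its 'message' payload."""
    with urlopen(works_url(doi), timeout=30) as resp:
        return json.load(resp)["message"]


def summarize(message):
    """Pick out the DOI, first title, and funder names from a work record."""
    titles = message.get("title") or [""]
    funders = [f.get("name", "") for f in message.get("funder", [])]
    return {"doi": message.get("DOI"), "title": titles[0], "funders": funders}


# Made-up record, shaped like a trimmed Crossref work, for illustration only.
SAMPLE_MESSAGE = {
    "DOI": "10.5555/12345678",
    "title": ["An example article title"],
    "funder": [{"name": "Example Funding Agency"}],
}

print(works_url("10.5555/12345678"))
print(summarize(SAMPLE_MESSAGE))
```

Calling summarize(fetch_work(doi)) would do the same against live data; the wrapper package is likely the more convenient route if you need paging or filtering.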

Persistent identifiers

  • idutils: 'Small library for validating and normalising persistent identifiers used in scholarly communication.' I haven't used this one yet, but it looks promising.
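To show what "validating and normalising" an identifier can involve, here is a standalone sketch for ORCID iDs using only the standard library. It is not idutils's code or API; the function names are hypothetical, and the check digit follows the ISO 7064 MOD 11-2 scheme that ORCID documents.

```python
import re


def normalize_orcid(value):
    """Strip whitespace and an orcid.org URL prefix; uppercase a trailing 'x'."""
    value = value.strip()
    for prefix in ("https://orcid.org/", "http://orcid.org/"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
    return value.upper()


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_orcid(value):
    """Return True if value has 16 characters and a matching check character."""
    compact = normalize_orcid(value).replace("-", "")
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


print(normalize_orcid("https://orcid.org/0000-0002-1825-0097"))
print(is_orcid("0000-0002-1825-0097"))
```

A library like idutils covers many more schemes (DOIs, ISBNs, and so on); this only illustrates the general idea for one of them.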

Altmetrics

  • pyAltmetric: Counts of altmetrics (tweets, news articles, policy citations, etc.) for more than 9 million research outputs (journal articles, books, etc.). Query the API by persistent identifier or by fixed time range (1d, 3d, 1m, etc.).
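The two query styles mentioned above (by identifier, or by time range) can be sketched as plain URL builders. This does not use pyAltmetric itself. The endpoint layout below is my assumption based on Altmetric's v1 API, and the set of identifier types is deliberately a small subset; check the current API documentation and access terms before relying on any of it.

```python
from urllib.parse import quote

# Assumed v1 API root; verify against Altmetric's current documentation.
API_ROOT = "https://api.altmetric.com/v1"

# A small, assumed subset of the identifier types the API accepts.
ID_TYPES = {"doi", "pmid", "arxiv"}


def by_identifier(id_type, value):
    """Build a lookup URL for one research output, e.g. by DOI or PubMed ID."""
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type!r}")
    return f"{API_ROOT}/{id_type}/{quote(value.strip())}"


def by_timeframe(timeframe):
    """Build a URL for outputs mentioned within a fixed window such as '1d'."""
    return f"{API_ROOT}/citations/{quote(timeframe.strip())}"


print(by_identifier("doi", "10.5555/12345678"))
print(by_timeframe("1d"))
```

Fetching and decoding the JSON response would work the same way as for any other REST API, subject to whatever rate limits or keys the service requires.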