@paultopia
Last active July 1, 2022 02:36
# PROBLEM: Zotero loves to grab the URL and last-accessed date of articles, books, and the like when you use the
# Zotero browser extensions to pull them into the citation manager. Those fields then show up, incorrectly, in your
# automatically generated citations, e.g., when using the Bluebook citation format.
# And it's really obnoxious to delete them all by hand, whether in the output or in Zotero.
# SOLUTION: use CSL JSON as your output for processing with pandoc or whatever (which you should be doing anyway)
# and then interpose this little script between the raw CSL JSON file and a cleaned-up CSL JSON file with the URLs
# and access dates stripped.
# USAGE: "python clean_cite_json.py YOUR_EXISTING_CITATION_FILE NEW_FILENAME_FOR_CLEAN_FILE"
# Assumes the citation file is CSL JSON produced by Zotero. Only items of type "chapter", "book", or
# "article-journal" are touched; everything else passes through unchanged.
# Written for Python 3. It will not run unmodified on Python 2, because the built-in open() there does not
# accept an encoding argument.
import json, sys
from copy import deepcopy
infile = sys.argv[1]
outfile = sys.argv[2]
with open(infile, encoding='UTF-8') as inf:
    # NOTE: Depending on your system, you might have to change that encoding, or leaving it off may be fine.
    # A mismatched encoding is a possible source of glitches where Unicode code point numbers show up in the
    # output instead of things like smart quotes. No warranties, etc.
    incites = json.load(inf)
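def clean_cite(ref):
    # Work on a copy so the original record is left untouched.
    newref = deepcopy(ref)
    # Only these item types get their URL and access date stripped.
    types = ["chapter", "book", "article-journal"]
    if newref['type'] in types:
        newref.pop("URL", None)
        newref.pop("accessed", None)
    return newref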
outcites = [clean_cite(reference) for reference in incites]
with open(outfile, mode='w') as outf:
    json.dump(outcites, outf, indent=4)
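
To see what the cleaning step does without touching any files, here is a minimal sketch that runs the same logic on two made-up records. The sample items, their `id` values, and the `example.com` URLs are hypothetical and exist only for illustration; `PRINT_TYPES` is just a name chosen here for the script's type list. The last two lines also show one likely reason escaped code points can appear in the output: `json.dumps` (like `json.dump`) escapes non-ASCII characters by default, and passing `ensure_ascii=False` keeps them as literal characters.

```python
import json
from copy import deepcopy

# Item types whose URL/accessed fields get dropped (same list as the script).
PRINT_TYPES = ["chapter", "book", "article-journal"]


def clean_cite(ref):
    """Return a copy of one CSL JSON item, minus URL/accessed for print types."""
    newref = deepcopy(ref)
    if newref["type"] in PRINT_TYPES:
        newref.pop("URL", None)
        newref.pop("accessed", None)
    return newref


# Hypothetical sample records, made up for illustration only.
sample = [
    {
        "id": "doe2020",
        "type": "book",
        "title": "A \u201cSmart\u201d Title",  # title with curly quotes
        "URL": "https://example.com/book",
        "accessed": {"date-parts": [[2022, 7, 1]]},
    },
    {
        "id": "roe2021",
        "type": "webpage",
        "title": "Some Page",
        "URL": "https://example.com/page",
        "accessed": {"date-parts": [[2022, 7, 1]]},
    },
]

cleaned = [clean_cite(item) for item in sample]

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(cleaned, indent=4)
# With ensure_ascii=False the curly quotes are kept as literal characters.
literal = json.dumps(cleaned, indent=4, ensure_ascii=False)
```

The book loses its `URL` and `accessed` fields, while the webpage keeps both, since a URL is a legitimate part of a web citation. Both `escaped` and `literal` are valid JSON; if you write the literal form to disk, open the output file with an explicit `encoding='UTF-8'` as well.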