@alanzchen
Created September 1, 2021 23:03
Convert a Word document with Zotero citations to Markdown.
Step 1: https://www.zotero.org/support/kb/moving_documents_between_word_processors
Step 2: Convert the docx to Markdown via pandoc with "--wrap=none".
Step 3: Use this script to process the Markdown file.
#!/usr/bin/env python3
import argparse
import re
import sqlite3


def convert(db, filename):
    con = sqlite3.connect(db)
    cur = con.cursor()
    with open(filename, "r", encoding="utf-8") as f:
        md = f.read()
    # Each Zotero citation that survives the pandoc conversion looks like
    # [ITEM CSL_CITATION {...}](...).
    raw_match = re.findall(r'\[ITEM CSL_CITATION .*?\]\(.*?\)', md)
    # Pull the numeric Zotero item IDs out of the escaped CSL JSON.
    # Depending on the document, "id" is followed by "uris" or by "type".
    citeid_match = [re.findall(r'\\"id\\":(\d+),\\"(?:uris|type)', i) for i in raw_match]
    for i in range(len(raw_match)):
        keys = []
        for j in citeid_match[i]:
            rows = list(cur.execute('SELECT * FROM citekeys WHERE itemID=?', (int(j),)))
            if not rows:
                # The document cites an item that is not in the database.
                raise SystemExit('No cite key found for itemID {}. '
                                 'Refresh the document with Zotero and try again.'.format(j))
            keys.append("@" + rows[0][3])  # column 3 holds the cite key
        s = "[" + "; ".join(keys) + "]"
        print(s)
        md = md.replace(raw_match[i], s)
    con.close()
    with open(filename, "w", encoding="utf-8") as f:
        f.write(md)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Convert Zotero-style citations in a Markdown file (e.g., one converted from MS Word via pandoc) to cite-key format. Requires better-bibtex.')
    parser.add_argument('db', metavar='db', type=str,
                        help='Your better-bibtex-search.sqlite path.')
    parser.add_argument('filename', metavar='filename', type=str,
                        help='The markdown file you converted with pandoc. Note that you will need to use pandoc option --wrap=none when converting it to markdown.')
    args = parser.parse_args()
    convert(args.db, args.filename)
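Step 2 of the description (the pandoc conversion) can also be scripted. The following is a minimal sketch and not part of the original script: the helper names `pandoc_command` and `docx_to_markdown` are hypothetical, and it assumes pandoc is installed and on the PATH. The `--wrap=none` option stops pandoc from wrapping lines, so a citation field is not broken across lines, which the script's regular expressions presumably rely on.

```python
import subprocess


def pandoc_command(docx_path, md_path):
    """Build the pandoc call for Step 2 (hypothetical helper)."""
    return ["pandoc", docx_path, "-t", "markdown", "--wrap=none", "-o", md_path]


def docx_to_markdown(docx_path, md_path):
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(pandoc_command(docx_path, md_path), check=True)
```

Calling `docx_to_markdown("transfer.docx", "transfer.md")` (file names are placeholders) would then produce the Markdown file that the script above expects as its second argument.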
@nacht-falter

Hi,

this is great! Thank you. Exactly what I was looking for. But unfortunately I get an error:

./convert.py /Users/bla/Zotero/better-bibtex-search.sqlite /Users/bla/Desktop/transfer.md
[@utz2012]
[@linke2018]
[@ligeti2007m]
[@ligeti2007m]
[@voss2008]
[@caduff2002]
[@ligeti2007aa]
[@ligeti2007ab]
[@mosch2016]
[@adorno1975]
[@adorno1975]
[@adorno1975]
[@adorno1975]
[@adorno1975]
Traceback (most recent call last):
  File "/Users/bla/./convert.py", line 32, in <module>
    convert(args.db, args.filename)
  File "/Users/bla/./convert.py", line 17, in convert
    s += "@" + list(cur.execute('SELECT * FROM citekeys WHERE itemID={}'.format(j)))[0][3] + "; "
IndexError: list index out of range

Any ideas how to solve this?

@alanzchen
Author

alanzchen commented Oct 3, 2021

@phaeton6680

> Hi,
>
> this is great! Thank you. Exactly what I was looking for. But unfortunately I get an error:
>
> Any ideas how to solve this?

Looks like there is a cite key that exists in your Word file but it is no longer in your Zotero database. Refresh your Word file with Zotero and try again?

@nacht-falter

> @phaeton6680
>
> > Hi,
> > this is great! Thank you. Exactly what I was looking for. But unfortunately I get an error:
> > Any ideas how to solve this?
>
> Looks like there is a cite key that exists in your Word file but it is no longer in your Zotero database. Refresh your Word file with Zotero and try again?

Thanks, I had manually edited some of the cite keys within the Word document. That seems to have caused the error. Thank you for your help and for this nice little script.
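To find out which citations trigger this error before running the conversion, one option is to list the item IDs that appear in the Markdown but have no row in the `citekeys` table. This is only a sketch: the helper name `missing_item_ids` is hypothetical, the two-column table below is a made-up stand-in for the real database (only the table name `citekeys` and the column `itemID` come from the script's query), and the sample citation text is invented.

```python
import re
import sqlite3


def missing_item_ids(con, md):
    """Return item IDs cited in md that have no row in the citekeys table."""
    # "id" may be followed by "uris" or by "type", depending on the document.
    ids = set(re.findall(r'\\"id\\":(\d+),\\"(?:uris|type)', md))
    missing = []
    for item_id in sorted(ids, key=int):
        row = con.execute(
            "SELECT 1 FROM citekeys WHERE itemID = ?", (int(item_id),)
        ).fetchone()
        if row is None:
            missing.append(int(item_id))
    return missing


# Demonstration with a made-up in-memory table; the real database has more columns.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE citekeys (itemID INTEGER, citekey TEXT)")
con.execute("INSERT INTO citekeys VALUES (101, 'utz2012')")
sample = r'[ITEM CSL_CITATION {\"id\":101,\"uris\":[]} {\"id\":999,\"uris\":[]}](x)'
print(missing_item_ids(con, sample))  # prints [999]
```

Against a real database, the connection would instead come from `sqlite3.connect` on the better-bibtex-search.sqlite path, and `md` from reading the converted Markdown file.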

@jordantgh

Thanks very much for this. Adding encoding='utf-8', errors='ignore' to the open() statements helped me overcome some issues.
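As an illustration of that change, here is a small sketch of the two `open()` calls with the extra arguments. The helper names `read_text` and `write_text` and the sample content are hypothetical. Note that `errors='ignore'` silently drops bytes that are not valid UTF-8, so some characters may be lost without any warning.

```python
import os
import tempfile


def read_text(path):
    # errors="ignore" skips bytes that cannot be decoded as UTF-8.
    with open(path, "r", encoding="utf-8", errors="ignore") as f:
        return f.read()


def write_text(path, text):
    with open(path, "w", encoding="utf-8", errors="ignore") as f:
        f.write(text)


# Round trip through a temporary file that ends with one invalid byte (0xff).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.md")
    with open(path, "wb") as f:
        f.write("Ligeti: \u00c9tudes".encode("utf-8") + b"\xff")
    text = read_text(path)   # the invalid byte is dropped here
    write_text(path, text)
```

Without `errors="ignore"`, reading the same file would raise `UnicodeDecodeError`, which may be the kind of issue described above.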

@FriederRodewald

Thank you ;D However, I had to adapt the script in line 13 to citeid_match = [re.findall(r'\\"id\\":(\d*?),\\"type', i) for i in raw_match] changing uris to type in the regex pattern. But with this small change the script worked like a charm :)
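A pattern that accepts either key would cover both cases without editing the script per document. The sketch below assumes the two layouts described in this thread; the JSON fragments are made up for illustration and the name `ITEM_ID` is hypothetical.

```python
import re

# Accept either key after the item ID, since documents differ.
ITEM_ID = re.compile(r'\\"id\\":(\d+),\\"(?:uris|type)')

# Made-up fragments of the escaped CSL JSON left in the Markdown.
old_style = r'{\"id\":42,\"uris\":[\"http://zotero.org/users/0/items/ABCD1234\"]}'
new_style = r'{\"id\":42,\"type\":\"book\"}'

print(ITEM_ID.findall(old_style))  # prints ['42']
print(ITEM_ID.findall(new_style))  # prints ['42']
```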

@alanzchen
Author

Hi all,

Thanks for all of your feedback. I would now recommend https://retorque.re/zotero-better-bibtex/citing/migrating/ over this script.
