@ilikepi
Forked from jsonbecker/WPFootnotesToMarkdown.py
Last active December 10, 2015 01:08
import re

# Match WordPress-style inline footnotes: ((footnote text))
p = re.compile(r"\(\(([^()]+)\)\)")
file_path = input('File Name >')  # Python 3; use raw_input() on Python 2
with open(file_path) as orig_file:
    orig_text = orig_file.read()
new_text = ""
coordinates = []
footnotes = []
# Record the span and text of each match
for match in p.finditer(orig_text):
    coordinates.append(match.span())
    footnotes.append(match.group(1))
next_start = 0
for i in range(len(coordinates)):
    new_text += orig_text[next_start:coordinates[i][0]] + "[^{0}]".format(i)
    next_start = coordinates[i][1]
# Ensure we capture everything after the final match.
new_text += orig_text[next_start:]
# Ensure a trailing newline before our references.
if not new_text.endswith('\n'):
    new_text += '\n'
for i in range(len(footnotes)):
    new_text += '[^{0}]: {1}\n'.format(i, footnotes[i])
# Opening with 'w' already truncates the file, so no explicit truncate() is needed.
with open(file_path, 'w') as new_file:
    new_file.write(new_text)
ghost commented Nov 9, 2014

Thanks, this saved me so much time! I did have to make one tweak to the regex, though, because I had a lot of footnotes with parentheses in them. For example:

((This is a footnote (so I say anyway) with parentheses in it.))

These don't get processed by your regex because it disallows any parenthesis inside a footnote. I wound up matching everything inside double parentheses and using a non-greedy quantifier to ensure we get the smallest match:

p = re.compile(r" \(\((.+?)\)\)")

It's a matter of personal preference, but I also matched the space before the start of the footnote. You can keep/remove that space as you like.
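
To make the difference concrete, here is a minimal sketch comparing the two patterns on the example footnote above. The names `strict`, `lazy`, `sample`, and `edge` are illustrative only and do not appear in the gist. The last case shows a limitation that seems worth knowing about: when a footnote ends with a parenthetical, the non-greedy match stops at the first `))` it finds.

```python
import re

# Pattern from the script: rejects any parenthesis inside a footnote.
strict = re.compile(r"\(\(([^()]+)\)\)")
# Pattern from the comment: leading space plus a non-greedy body.
lazy = re.compile(r" \(\((.+?)\)\)")

sample = ("Some text. ((This is a footnote (so I say anyway) "
          "with parentheses in it.)) More text.")

# The strict pattern finds nothing because of the inner parentheses.
strict_matches = strict.findall(sample)
# The non-greedy pattern captures the whole footnote body.
lazy_matches = lazy.findall(sample)

# Hypothetical edge case: the footnote itself ends with ")".
# The non-greedy body stops at the first "))", so the final ")" is left out.
edge = "Text ((see (this)))."
edge_matches = lazy.findall(edge)

print(strict_matches)
print(lazy_matches)
print(edge_matches)
```

Note also that `.` does not match a newline by default, so a footnote spanning several lines would be skipped unless the pattern is compiled with `re.DOTALL`.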
