Last active August 29, 2015 14:07
Python Pandoc filter for converting GitHub markdown to Python reST long_description for PyPI
#!/usr/bin/env python
"""
Python Pandoc filter [1] for converting a GitHub markdown file to a Python
reST long_description (suitable for display on PyPI).

Sample usage:

    $ pandoc --filter ./ --write=rst --output=long_description.rst

PyPI's reST rendering breaks on things like relative links (supported by
GitHub [2]) and anchor fragments. This filter converts these links
to links that will continue to work once on PyPI.

See also this PyPI bug report [3].
"""

import logging
import os
import sys
from urllib.parse import urljoin, urlparse, urlunparse
from pandocfilters import toJSONFilter, Link
log = logging.getLogger(os.path.basename(__file__))

def configure_logging():
    format_string = "%(name)s: [%(levelname)s] %(message)s"
    logging.basicConfig(format=format_string, level=logging.DEBUG)
    log.debug("Debug logging enabled.")

# This function can be used to create other Pandoc filters that
# transform URLs in hyperlinks.
def init_action(convert_url):
    """
    Return a Pandoc "action" suitable for passing to toJSONFilter.

    convert_url: a function that accepts a URL path and returns
      a new one, or None if the URL should be left unchanged.
    """
    def transform_url(key, value, format, meta):
        if key != 'Link':
            return None
        # Then value has the following form:
        # [[{'t': 'Str', 'c': 'Contributing'}], ['docs/', '']]
        # Extract the URL.
        url = value[1][0]
        new_url = convert_url(url)
        if new_url is None:
            # Leave the link untouched.
            return None
        log.info("converting URL:\n"
                 " %s\n"
                 "-->%s" % (url, new_url))
        value[1][0] = new_url
        return Link(*value)

    return transform_url


def convert_url(url):
    """Convert a URL appearing in a markdown file to a new URL.

    Returns None if the URL should remain the same. Relies on the
    module-level constants PYPI_URL and GITHUB_URL as base URLs.
    """
    parsed_url = urlparse(url)
    url_path = parsed_url[2]
    if not url_path:
        # Then we assume it is a fragment (e.g. "#license") that should
        # link back to a section on the same PyPI page.
        new_url = urlunparse(parsed_url)
        new_url = urljoin(PYPI_URL, new_url)
        return new_url
    if (not url_path.endswith(".md") and
            url_path != "LICENSE"):
        # Not a markdown file or the LICENSE file: keep the URL as is.
        return None
    # Otherwise, we link back to the original source GitHub page.
    new_url = urlunparse(parsed_url)
    new_url = urljoin(GITHUB_URL, new_url)
    return new_url


if __name__ == "__main__":
    configure_logging()
    action = init_action(convert_url)
    toJSONFilter(action)
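
The URL rules in convert_url() can be tried without Pandoc. The sketch below is a standalone copy of that logic using only the standard library. The two base URLs are hypothetical placeholders chosen for illustration, since the real values of PYPI_URL and GITHUB_URL are not shown in this listing.

```python
from urllib.parse import urljoin, urlparse, urlunparse

# Hypothetical base URLs, for illustration only (assumptions, not the
# values used by the original script).
PYPI_URL = "https://pypi.example.org/project/demo/"
GITHUB_URL = "https://github.example.com/user/demo/blob/master/"


def convert_url(url):
    """Return the rewritten URL, or None if the URL should stay as is."""
    parsed_url = urlparse(url)
    if not parsed_url.path:
        # Fragment-only link such as "#license": point at the PyPI page.
        return urljoin(PYPI_URL, urlunparse(parsed_url))
    if not parsed_url.path.endswith(".md") and parsed_url.path != "LICENSE":
        # Anything else (directories, images, external pages) is untouched.
        return None
    # Markdown files and LICENSE: point at the source file on GitHub.
    return urljoin(GITHUB_URL, urlunparse(parsed_url))


if __name__ == "__main__":
    samples = ["#license", "docs/contributing.md", "LICENSE", "docs/"]
    for sample in samples:
        print("%-24s --> %s" % (sample, convert_url(sample)))
```

Because urljoin() returns an absolute URL unchanged, a link to a markdown file on another host is left alone in practice even though it passes the ".md" check.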