Convert HTML to Markdown for Jekyll
# The Jekyll import tool creates HTML files. I'd like to use html2text
# to convert those to Markdown.
# The challenge is that Jekyll files have a YAML header at the top that gets
# mangled by the conversion. This strips the header, passes the remainder
# of the body into html2text, then adds the header back to the result.
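import os
import sys

import html2text

for filename in sys.argv[1:]:
    with open(filename, 'r', encoding='utf-8') as infile:
        contents = infile.read()
    # Find the end of the YAML header (the closing '---' line)
    endheader = contents.find('\n---\n')
    if endheader == -1:
        # No closing delimiter found: treat the whole file as body
        header, body = '', contents
    else:
        header = contents[:endheader + 5]
        body = contents[endheader + 5:]
    # Python 3 strings are already Unicode, so no .decode() call is needed
    new_body = html2text.html2text(body)
    # Write the untouched header followed by the converted body
    with open(os.path.splitext(filename)[0] + '.markdown', 'w', encoding='utf-8') as f:
        f.write(header + new_body)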
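
The header-handling step described in the comments at the top (strip the YAML header, convert the rest, put the header back) can also be pulled out into a small helper so it is easy to check on its own. What follows is only a sketch: the name `split_front_matter` is hypothetical, and it assumes the front matter begins on the first line with `---` and ends at the next line that is exactly `---`.

```python
def split_front_matter(text):
    """Split Jekyll-style text into (header, body).

    Hypothetical helper, not part of html2text or Jekyll. The header
    includes its closing '---' line, so header + body == text.
    """
    delimiter = '\n---\n'
    # Assumption: front matter must start on the very first line
    if not text.startswith('---\n'):
        return '', text
    # Start at index 3 so an empty header ('---\n---\n') is still found
    end = text.find(delimiter, 3)
    if end == -1:
        # Opening delimiter without a closing one: treat it all as body
        return '', text
    cut = end + len(delimiter)
    return text[:cut], text[cut:]
```

With this in place, the conversion would amount to `header + html2text.html2text(body)`, and a file without any front matter passes through as body only instead of losing its first few characters.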