Bash script to add smart quotes to Hugo Markdown files, since Goldmark sucks at it
#!/usr/bin/env bash
#
# Add smart quotes to Hugo Markdown source files, using cmark, the
# CLI tool of CommonMark's reference implementation:
# https://github.com/commonmark/commonmark-spec
# Notes:
# - assumes TOML front matter
# - converts footnote-style links to inline
# - normalizes ordered/unordered list formatting
#
# WARNING: possible site-breaking changes:
# ! rarely, cmark breaks *italic* and **bold** by backslashing
#   the asterisks
# ! breaks description/definition-list formatting by reflowing it
# ! cmark adds gratuitous backslashes before "[]*&!#_<>+"; stripping
#   them back out can break escapes in front matter
# - adds blank line before shortcode that starts a line
# - adds blank line after shortcode that ends a line
# - converts &nbsp;, &rsquo;, &#8203;, etc. into Unicode literals
# - probably won't handle a "+++" line in body content
# fail the whole pipeline if any stage fails (e.g. cmark is not installed),
# so a broken run never overwrites the original file
set -o pipefail
# keep the command in an array so its arguments survive word splitting
CMARK=(cmark --to commonmark --width 70 --smart --unsafe)
for file in "$@"; do
    # convert front matter to HTML comment, so it all gets ignored
    # (a plain "+" is literal in basic regular expressions)
    sed -e '1 s/^+++$/<!-- _FMPLUS_/' \
        -e 's/^+++$/_FMPLUS_ -->/' "$file" |
    # convert shortcodes to HTML comments, to keep cmark from
    # escaping their arguments
    sed -e 's/{{</<!-- _SC1OPEN_/g' \
        -e 's/>}}/_SC1CLOSE -->/g' \
        -e 's/{{%/<!-- _SC2OPEN_/g' \
        -e 's/%}}/_SC2CLOSE -->/g' |
    # pass through commonmark
    "${CMARK[@]}" |
    # restore shortcodes
    sed -e 's/<!-- _SC1OPEN_/{{</g' \
        -e 's/_SC1CLOSE -->/>}}/g' \
        -e 's/<!-- _SC2OPEN_/{{%/g' \
        -e 's/_SC2CLOSE -->/%}}/g' |
    # strip out mostly-gratuitous backslashes
    sed -e 's/\\\([][*&!#_<>+]\)/\1/g' |
    # restore front matter
    sed -e 's/^.*_FMPLUS_.*$/+++/' > "$file.new" &&
    # overwrite original, but only if every stage above succeeded
    # (you have source control, right?)
    mv "$file.new" "$file"
done
exit 0
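
The masking trick can be tried without cmark installed. The sketch below is not part of the original script: the function names `mask` and `unmask` and the `sample` text are hypothetical, and the cmark step is left out on purpose, so it only shows that hiding front matter and shortcodes inside HTML comments and restoring them afterwards gives back the input unchanged.

```shell
#!/usr/bin/env bash
# Sketch only: mask/unmask and the sample text are illustrative names,
# and no cmark call happens between the two steps.

# Hide the TOML front matter delimiters and Hugo shortcodes inside HTML
# comments. A plain "+" is literal in basic regular expressions.
mask() {
    sed -e '1 s/^+++$/<!-- _FMPLUS_/' \
        -e 's/^+++$/_FMPLUS_ -->/' \
        -e 's/{{</<!-- _SC1OPEN_/g' \
        -e 's/>}}/_SC1CLOSE -->/g' \
        -e 's/{{%/<!-- _SC2OPEN_/g' \
        -e 's/%}}/_SC2CLOSE -->/g'
}

# Undo mask(): restore the shortcodes, then the "+++" delimiters.
unmask() {
    sed -e 's/<!-- _SC1OPEN_/{{</g' \
        -e 's/_SC1CLOSE -->/>}}/g' \
        -e 's/<!-- _SC2OPEN_/{{%/g' \
        -e 's/_SC2CLOSE -->/%}}/g' \
        -e 's/^.*_FMPLUS_.*$/+++/'
}

sample='+++
title = "Test"
+++
Some "quoted" text {{< figure src="a.png" >}} here.'

masked=$(printf '%s\n' "$sample" | mask)
restored=$(printf '%s\n' "$masked" | unmask)

# Show what cmark would be handed, then confirm the round trip.
printf '%s\n' "$masked"
if [ "$restored" = "$sample" ]; then
    echo "round trip OK"
else
    echo "round trip changed the input" >&2
fi
```

Running this on a copy of a real content file, with `cat "$file"` in place of the `printf` of `sample`, is a cheap way to spot a body-content "+++" line or an unusual shortcode before letting the full script overwrite anything.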