@rattletat
Last active July 8, 2022 10:47
#!/bin/bash
# This is heavily based on this code here:
# https://gist.github.com/maikeldotuk/54a91c21ed9623705fdce7bab2989742
# Which is heavily based on this code here:
# https://gist.github.com/enpassant/0496e3db19e32e110edca03647c36541
# Special thank you to the user enpassant for starting it https://github.com/enpassant
# ARGUMENT PARSING
# Do not overwrite (0) or overwrite (1)
OVERWRITE="$1"
# Syntax chosen for the wiki
SYNTAX="$2"
# File extension for the wiki
EXTENSION="$3"
# Full path of the output directory
OUTPUTDIR="$4"
# Full path of the wiki page
INPUT="$5"
# Full path of the css file for this wiki
CSSFILENAME=$(basename "$6")
# Full path to the wiki's template
TEMPLATE_PATH="$7"
# The default template name
TEMPLATE_DEFAULT="$8"
# The extension of template files
TEMPLATE_EXT="$9"
# Count of '../' for pages buried in subdirs
ROOT_PATH="${10}"
# If file is in vimwiki base dir, the root path is '-'
[[ "$ROOT_PATH" = "-" ]] && ROOT_PATH=''
# Example: index.md
FILE=$(basename "$INPUT")
# Example: index
FILENAME=$(basename "$INPUT" ."$EXTENSION")
# Example: /home/rattletat/wiki/text/uni/
FILEPATH="${INPUT%"$FILE"}"
# Example: /home/rattletat/wiki/html/uni/index
OUTPUT="$OUTPUTDIR$FILENAME"
# PANDOC ARGUMENTS
# MathJax location. To load MathJax from a CDN instead of a local copy, use this:
# MATHJAX="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
# Local MathJax installation:
MATHJAX="/usr/share/mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
# PREPANDOC PROCESSING AND PANDOC
pandoc_template="pandoc \
--mathjax=$MATHJAX \
--template=$TEMPLATE_PATH$TEMPLATE_DEFAULT$TEMPLATE_EXT \
-f $SYNTAX \
-t html \
-c $CSSFILENAME \
-M root_path:$ROOT_PATH"
# Searches for markdown links (without extension or .md) and appends a .html
regex1='s/[^!()[]]*(\[[^]]+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g'
# [^!\[\])(]*(\[[^\]]+\])\(([^).]+)(\.md)?\)
# Removes placeholder title from vimwiki markdown file. Not needed if you use a
# correct YAML header.
# regex2='s/^%title (.+)$/---\ntitle: \1\n---/'
pandoc_input=$(sed -r "$regex1" "$INPUT")
pandoc_output=$(echo "$pandoc_input" | $pandoc_template)
# POSTPANDOC PROCESSING
# Removes "file:" from ![pic of sharks](file:../sharks.jpg)
regex3='s/file://g'
echo "$pandoc_output" | sed -r "$regex3" > "$OUTPUT.html"
# Older variant: with this you can have ![pic of sharks](file:../sharks.jpg) in your markdown file; it removes "file:"
# and the unnecessary ".html" that the previous command added to the image.
# sed 's/file://g' < /tmp/crap.html | sed 's/\(png\|jpg\|pdf\).html/\1/g' | sed -e 's/\(href=".*\)\.html/\1/g' > "$OUTPUT.html"
# Copy images to the output directory, keeping relative paths
# destination=$(cd -- "$4" && pwd) # make it an absolute path
# cd -- "/home/rattletat/wiki/text/" &&
# find . -type f -regex ".*\.\(jpg\|gif\|png\)" -exec cp {} "$destination/{}" \;
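The script stores the pandoc command in a single string and expands it unquoted, so any path containing a space (for example in TEMPLATE_PATH) is split into several arguments. A minimal sketch of an alternative, assuming bash: the helper name build_pandoc_cmd and the array name PANDOC_CMD are hypothetical, while the pandoc options are the ones the script already uses.

```shell
#!/bin/bash
# Hypothetical helper: fills the global array PANDOC_CMD from the
# variables the script defines above. Each array element stays one
# argument, even if it contains spaces.
build_pandoc_cmd() {
    PANDOC_CMD=(
        pandoc
        --mathjax="$MATHJAX"
        --template="$TEMPLATE_PATH$TEMPLATE_DEFAULT$TEMPLATE_EXT"
        -f "$SYNTAX"
        -t html
        -c "$CSSFILENAME"
        -M "root_path:$ROOT_PATH"
    )
}

# Usage inside the script (commented out so this sketch does not call pandoc):
# build_pandoc_cmd
# pandoc_output=$(printf '%s\n' "$pandoc_input" | "${PANDOC_CMD[@]}")
```

With this approach the later `$pandoc_template` expansion would be replaced by `"${PANDOC_CMD[@]}"`.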
@staffan7s

Would you please share your .vimrc settings?

@rattletat
Author

@staffan7s

Relevant vimrc configuration:

call dein#add('vimwiki/vimwiki', {'rev':'dev'})
let g:vimwiki_list = [{
            \ 'auto_export': 1,
            \ 'auto_header' : 1,
            \ 'automatic_nested_syntaxes':1,
            \ 'path_html': '$HOME/wiki/html',
            \ 'path': '$HOME/wiki/src',
            \ 'template_path': '$HOME/wiki/templates/',
            \ 'template_default':'GitHub',
            \ 'template_ext':'.html5',
            \ 'syntax': 'markdown',
            \ 'ext':'.md',
            \ 'custom_wiki2html': '/home/rattletat/scripts/wiki2html.sh',
            \ 'autotags': 1,
            \ 'list_margin': 0,
            \ 'links_space_char' : '_',
            \}]
let g:vimwiki_folding='expr'
let g:vimwiki_hl_headers = 1
let g:vimwiki_ext2syntax = {'.md': 'markdown'}

If you have any problems feel free to ask.

@staffan7s

Thanks a lot! I'll try this out and report back.
Btw, I tried adapting the regex in the script so it handles external links. As it is now, "www.dn.se" ends up as "www.dn.se.html".
This does the trick in gVim:
s/\(\.[a-z0-9]\{1,}\)\.html/\1/g
...but trying to do the same in sed fails. Something like
sed -r 's/se.html/se/g'
...works by specifying each case, which is a bit primitive.

@rattletat
Author

I don't see how "www.dn.se" gets transformed to "www.dn.se.html". At least the following command

echo "[Test](https://www.kaller.se)" | sed -r 's/(\[.+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g'
results in "[Test](https://www.kaller.se)".

Can you give me an example?

@staffan7s

Thanks again! I tried your vimrc settings (minus the "call dein#add" line, since I installed VimWiki using git clone). For some reason, however, I got an error:
VimWiki Error: conversion to HTML not supported for this syntax
I am using VimWiki 2.4.1 and gVim 8.0.1453.

This doesn't matter, though, as I got the link conversion working after I imported your sed syntax above. I was previously using syntax from one of the versions you forked. My wiki2html.sh now has this minimal version which suits me fine!

sed -r 's/(\[.+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g' <"$INPUT" | pandoc $MATH -s -f $SYNTAX -t html -c $CSSFILENAME | sed -r 's/file://g' >"$OUTPUT.html"

/Staffan

@rattletat
Author

@staffan7s Glad to hear that! I made a small change to the first regex, as I ran into problems when converting a line that includes two or more markdown links. Maybe you want to adopt that.

@darrennoble

I added the following right after the similar ROOT_PATH check. With it, you can set 'template_path': '' in your .vimrc to use a template from pandoc's default paths, i.e. ~/.local/share/pandoc/templates.

# Load a template in the default template directory
[[ "$TEMPLATE_PATH" = "-" ]] && TEMPLATE_PATH=''

see https://pandoc.org/MANUAL.html#option--data-dir for more info on default paths.

@staffan7s

@rattletat : I adopted your change, namely the 
s/[^!()[]]*(\[[^]]+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g
part, which works, but the linked words come out with no separating spaces. Thus:
Some [word](link1) which [looks](link2) like this
=>
Someword whichlooks like this. Any idea why?
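
A likely cause, assuming sed's POSIX bracket-expression rules: inside `[...]` a `]` is only literal when it comes first, so `[^!()[]]*` is read as the class `[^!()[]` (exactly one character that is not `!`, `(`, `)` or `[`) followed by `]*`. The match therefore always swallows one character in front of the link, here the space, and the replacement never puts it back. For the same reason a link at the very start of a line is not matched at all. A minimal sketch that captures the preceding character and keeps it; the function name rewrite_links is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: reads markdown on stdin, writes rewritten markdown.
rewrite_links() {
    # Group 1: start of line, or one character that is not '!' (kept in the output,
    #          and used to skip image links such as ![alt](target)).
    # Group 2: the [text] part.
    # Group 3: a link target without '.' or ')', optionally followed by .md.
    sed -E 's/(^|[^!])(\[[^]]+\])\(([^.)]+)(\.md)?\)/\1\2(\3.html)/g'
}

echo 'Some [word](link1) which [looks](link2) like this' | rewrite_links
# prints: Some [word](link1.html) which [looks](link2.html) like this
```

One known limitation of this sketch: two links written back to back with nothing between them share no preceding character, so only the first is rewritten.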

Another issue is with the diary pages: on doing \w\i in Vimwiki, a list of generated links is created. These are in this [[format]] and thus need to be converted somehow to [this](format).
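
One possible way to handle the diary links is a sed pass that runs before the .html rewrite. This is only a sketch: the function name wikilinks_to_markdown is hypothetical, and it assumes the links are either [[target]] or [[target|description]] with no nested brackets.

```shell
#!/bin/bash
# Hypothetical helper: converts vimwiki-style links on stdin to markdown links.
wikilinks_to_markdown() {
    sed -E \
        -e 's/\[\[([^]|]+)\|([^]]+)\]\]/[\2](\1)/g' \
        -e 's/\[\[([^]|]+)\]\]/[\1](\1)/g'
}

echo '[[2020-04-06]]' | wikilinks_to_markdown
# prints: [2020-04-06](2020-04-06)
```

The first expression handles links with a description, the second handles bare links and reuses the target as the link text.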


ghost commented May 13, 2020

@rattletat Thank you for this wonderful script. However, I'm stuck on an issue. When I use :VimwikiAll2HTML, it creates the HTML version of every markdown file recursively (which is what I would expect). However, it doesn't seem to link them properly.

For example, [foo](foo) in the original document points to just foo in the HTML file (instead of foo.html, which is already present in the relevant directory).

Similarly, relative links seem to be broken. [bar](foo/bar) points to foo/bar.md instead of foo/bar.html (even though the file exists).

Do you have any idea on what the issue might be? (I used your script/vimrc verbatim so as not to introduce any errors, except some minor changes such as MathJax loading, --quiet parameter in pandoc call, etc).
As for the HTML template, I just downloaded Github.html5 and placed it inside the relevant directory. Do I need to copy any other script too?

@rattletat
Author

@staffan7s
Sorry for the late response. Probably you already solved your problem.
I think you need to change the regex from
regex1='s/[^!()[]]*(\[[^]]+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g'
to
regex1='s/([^!()[]]*)(\[[^]]+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g'

But I haven't tested it yet, so maybe you could give me feedback. Unfortunately, I switched to wiki.vim.
You can always test your regexes on pages like this.

@rattletat
Author

@Just-A-Visitor
I am rather clueless. Since I'm not using vimwiki anymore (see last comment), some API changes could have happened, but this is rather unlikely. To debug this, I would replace echo "$pandoc_output" ... with echo "$pandoc_input" ... to see the parsed text before it is sent to pandoc. Otherwise, I would recommend trying the Python tool offered here (1).
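
The debugging step can also be done outside the script. A minimal sketch, assuming the sed expression and the wiki page are available on the command line; the function name preview_links is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: applies a sed expression to a markdown file and prints
# only the lines that still contain a markdown link, with their line numbers.
# $1: sed expression (for example the value of regex1), $2: markdown file.
preview_links() {
    sed -E "$1" "$2" | grep -n -F ']('
}

# Usage (commented out, the path is a placeholder):
# preview_links "$regex1" /path/to/page.md
```

If a link shows up without the expected .html suffix, the regex did not match it, and pandoc is not the culprit.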

Relevant links:

  1. Python Script
  2. Previous thread
  3. Relevant issue

@staffan7s

@rattletat
I'm sorry to hear you're abandoning this fork, but thanks again for your input. This now works reasonably well, I just had to add a space before the first \1:
regex1='s/([^!()[]]*)(\[[^]]+\])\(([^.)]+)(\.md)?\)/\1(\2.html)/g'
to
regex1='s/[^!()[]]*(\[[^]]+\])\(([^.)]+)(\.md)?\)/ \1(\2.html)/g'

In fact, my wiki2html.sh is now rather minimal: after the HAS_MATH bit, I just kept this:

sed -r 's/[^!()[]]*(\[[^]]+\])\(([^.)]+)(\.md)?\)/ \1(\2.html)/g' <"$INPUT" | pandoc $MATH -s -f $SYNTAX -t html -c $CSSFILENAME >"$OUTPUT.html"

This covers all internal links to files located in the same folder, but for links that change folder (e.g. to ../index) I have to revert to HTML links (<a href ... etc). All external links starting with "http" work as well.

[2020-04-06](2020-04-06) -- working
[indexpage](../index) -- not working
[DN1](www.dn.se) -- not working
[DN2](http://www.dn.se) -- working
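
The ../index case fails because the target pattern `[^.)]+` rejects every target that contains a dot, including the dots in `../`. A sketch of a pattern that allows leading `./` and `../` segments and subfolders, while still skipping anything with a dot in a name, a scheme colon, or a `#` anchor. The function name rewrite_relative_links is hypothetical, and folder names containing dots are not supported:

```shell
#!/bin/bash
# Hypothetical helper: reads markdown on stdin, writes rewritten markdown.
rewrite_relative_links() {
    # Group 1: start of line or one character that is not '!' (skips images).
    # Group 2: the [text] part.
    # Group 3: the target: any number of ./ or ../ prefixes, then folder names,
    #          then a page name; names may not contain . / ) : or #.
    # '~' is used as the sed delimiter so that '/' needs no escaping.
    sed -E 's~(^|[^!])(\[[^]]+\])\(((\.\.?/)*([^./):#]+/)*[^./):#]+)(\.md)?\)~\1\2(\3.html)~g'
}

echo '[indexpage](../index)' | rewrite_relative_links
# prints: [indexpage](../index.html)
```

A target such as www.dn.se is left unchanged by this sketch, so it still resolves as a relative link in the browser; writing the full http:// URL remains the fix for that case.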

@fcsm1922

fcsm1922 commented Jul 8, 2022

@rattletat Thank you for this wonderful script. However, I'm stuck on an issue. When I use :VimwikiAll2HTML, it creates the HTML version of every markdown file recursively (which is what I would expect). However, it doesn't seem to link them properly.

For example, [foo](foo) in the original document points to just foo in the HTML file (instead of foo.html, which is already present in the relevant directory).

Similarly, relative links seem to be broken. [bar](foo/bar) points to foo/bar.md instead of foo/bar.html (even though the file exists).

Do you have any idea on what the issue might be? (I used your script/vimrc verbatim so as not to introduce any errors, except some minor changes such as MathJax loading, --quiet parameter in pandoc call, etc). As for the HTML template, I just downloaded Github.html5 and placed it inside the relevant directory. Do I need to copy any other script too?

Hi, I faced the same problem as you. Did you find a way to solve it?
