@maybemkl
Created September 6, 2021 23:00
Remove markdown wiki-link brackets during pandoc exports
@tim-hilde

Do I have to put the file somewhere specific? If I call it I get the following error:

Traceback (most recent call last):
  File "/Users/tim/Documents/Wissensmanagement/Pandoc/remove_links.py", line 3, in <module>
    from pandocfilters import toJSONFilter, Str
ImportError: No module named pandocfilters
Error running filter /Users/tim/Documents/Wissensmanagement/Pandoc/remove_links.py:
Filter returned error status 1

@chrisgrieser

I think it's this package, although I am not entirely certain how to install it properly 🤔
https://pypi.org/project/pandocfilters/

@racng

racng commented Sep 19, 2022

You can install it using pip, but I tried installing it with conda instead in an isolated environment just in case.

conda create -n pandoc
conda activate pandoc
conda install -c conda-forge pandocfilters
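
The ImportError above usually means that the Python interpreter running the filter cannot see the package, which can happen when pip or conda installed it for a different interpreter. A minimal sketch for checking this, using only the standard library (the helper name `has_module` is made up for illustration):

```python
import importlib.util
import sys


def has_module(name):
    """Return True if `name` can be imported by the interpreter running this code."""
    return importlib.util.find_spec(name) is not None


# Shows which interpreter is in use and whether it can find the package.
print("interpreter:", sys.executable)
print("pandocfilters importable:", has_module("pandocfilters"))
```

Running this with the same interpreter that the filter's shebang line resolves to should show whether the install landed in the right place.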

Before any filtering is done, pandoc parses the markdown file into an abstract syntax tree (AST). I took a look at what the tree looks like for a simple markdown file with a single line: [[@citekey]]. The string is actually broken into three inline elements: [, [@citekey], and ]. So no string contains [[ or ]], which is why this pandocfilters script didn't work. Similarly for the Lua filters, replacing [[ or ]] doesn't work.

Both the pandocfilters script and the Lua filter would work if we replace [ and ] with ''.

Here is what the AST looks like:

List of 3
 |-pandoc-api-version:List of 4
 |  |-: int 1
 |  |-: int 22
 |  |-: int 2
 |  |-: int 1
 |-meta              : Named list()
 |-blocks            :List of 1
    |-:List of 2
       |-t: chr "Para"
       |-c:List of 3
          |-:List of 2
          |  |-t: chr "Str"
          |  |-c: chr "["
          |-:List of 2
          |  |-t: chr "Cite"
          |  |-c:List of 2
          |     |-:List of 1
          |     |  |-:List of 6
          |     |     |-citationId     : chr "citekey"
          |     |     |-citationPrefix : list()
          |     |     |-citationSuffix : list()
          |     |     |-citationMode   :List of 1
          |     |     |  |-t: chr "NormalCitation"
          |     |     |-citationNoteNum: int 1
          |     |     |-citationHash   : int 0
          |     |-:List of 1
          |        |-:List of 2
          |           |-t: chr "Str"
          |           |-c: chr "[@citekey]"
          |-:List of 2
             |-t: chr "Str"
             |-c: chr "]"
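
To make the point about the split concrete, here is a small sketch that walks a hand-written copy of the paragraph from the dump above and collects every Str value. The dictionary is transcribed from the dump rather than produced by pandoc, and the helper name `collect_str_values` is hypothetical.

```python
def collect_str_values(node):
    """Return the text of every Str element under `node`, in document order."""
    found = []
    if isinstance(node, dict):
        if node.get("t") == "Str":
            found.append(node["c"])
        else:
            for child in node.values():
                found.extend(collect_str_values(child))
    elif isinstance(node, list):
        for child in node:
            found.extend(collect_str_values(child))
    return found


# Hand-transcribed from the AST dump above (assumption: same shape as pandoc's JSON).
para = {
    "t": "Para",
    "c": [
        {"t": "Str", "c": "["},
        {
            "t": "Cite",
            "c": [
                [
                    {
                        "citationId": "citekey",
                        "citationPrefix": [],
                        "citationSuffix": [],
                        "citationMode": {"t": "NormalCitation"},
                        "citationNoteNum": 1,
                        "citationHash": 0,
                    }
                ],
                [{"t": "Str", "c": "[@citekey]"}],
            ],
        },
        {"t": "Str", "c": "]"},
    ],
}

strs = collect_str_values(para)
print(strs)
# No single Str holds a double bracket, so a check for "[[" never matches.
print(any("[[" in s or "]]" in s for s in strs))
```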

@aravindk100

Thanks for this great insight, @racng. I was able to make this change and get it to work, except that the closing end of the link was not filtered correctly.
It went from [[name]] to name]. I found it odd that it replaced [[ but only one of the ] characters.

This is the modified code I am using:
#!/usr/bin/env python3

from pandocfilters import toJSONFilter, Str
import re

def replace(key, value, format, meta):
    if key == 'Str':
        if '[' in value:
            new_value = value.replace('[', '')
            # Returning here means the ']' branch below never runs for this string.
            return Str(new_value)
        if ']' in value:
            new_value = value.replace(']', '')
            return Str(new_value)

if __name__ == '__main__':
    toJSONFilter(replace)
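
One possible explanation for the leftover bracket is that the filter returns as soon as it has handled '[', so a string holding both kinds of bracket never reaches the ']' branch. A minimal sketch of an action that strips both in one pass is below. It is not the gist author's code: `strip_brackets` is a made-up helper, and the function returns a plain dict in the `{"t": "Str", "c": ...}` shape seen in the AST dump earlier in the thread (which, as far as I know, is also what `pandocfilters.Str` builds) so that the logic can be tried without pandoc installed.

```python
def strip_brackets(text):
    """Remove every '[' and ']' from text in a single pass."""
    return text.replace("[", "").replace("]", "")


def replace(key, value, format, meta):
    """Filter action: rewrite Str elements that contain square brackets."""
    if key == "Str" and ("[" in value or "]" in value):
        # Same shape as the Str elements in the AST dump above.
        return {"t": "Str", "c": strip_brackets(value)}
    # Returning None leaves the element unchanged.
    return None


print(replace("Str", "[[name]]", "markdown", {}))
print(replace("Str", "]", "markdown", {}))
```

To use it as a real filter, pass `replace` to `toJSONFilter` as in the script above. Note that this removes every literal square bracket in the document, including the ones inside citation text such as [@citekey], so it is only a fit when that side effect is acceptable.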

@balaji-dutt

balaji-dutt commented Dec 21, 2022

Thanks for the original filter code, @maybemkl! I was hitting the same problem as @aravindk100 in that the filter would not find the closing ]] characters, so I modified the script to use a newer Python 3.8 feature (the := assignment expression), which also greatly simplifies the code. Here's my version:

#!/usr/bin/env python3

from pandocfilters import toJSONFilter, Str
import re

def replace(key, value, format, meta):
    if key == 'Str':
        if match := re.search(r'\[\[(.+)\]\]', value):
            new_value = match.group(1)
            return Str(new_value)

if __name__ == '__main__':
    toJSONFilter(replace)
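
One thing to be aware of with the search-and-return-group approach: any text in the same string outside the brackets, such as trailing punctuation, is dropped. A hedged variant that uses `re.sub` to unwrap the link and keep the surrounding text is sketched below. The names `WIKILINK` and `unwrap` are made up for illustration, and the non-greedy `.+?` is my choice so that two links in one string are handled separately.

```python
import re

# Matches [[target]] and captures the target; non-greedy so adjacent links stay separate.
WIKILINK = re.compile(r"\[\[(.+?)\]\]")


def unwrap(text):
    """Replace each [[target]] in text with target, keeping everything else."""
    return WIKILINK.sub(r"\1", text)


print(unwrap("[[name]],"))
print(unwrap("[[a]]/[[b]]"))
```

Inside the filter, `unwrap(value)` could replace `match.group(1)`. This still only helps when the whole link ends up in a single Str element. As the AST dump earlier in the thread shows, that is not the case for [[@citekey]], and I would expect links containing spaces to be split across several elements as well.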
