Last active July 31, 2023 22:17
Tinyscript tool for replacing text in files from a target folder and based on a JSON dictionary of replacement patterns


A simple Tinyscript-based tool that recursively replaces undesired text in the documents of a given folder, based on a JSON dictionary that maps regular expressions to the replacements to apply.

This can be installed using:

$ pip install tinyscript
$ tsm install doc-text-masker
$ wget


  • Recursive folder parsing
  • No filtering regarding the file format
  • Ask for confirmation before replacing
  • Execute without applying changes (test mode)
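
As a rough illustration of the recursive folder parsing, here is a minimal sketch that relies only on the standard library's pathlib rather than on Tinyscript's own Path class. The function name iter_target_files is hypothetical, and the default extensions are simply those listed for the -e option.

```python
from pathlib import Path


def iter_target_files(folder, extensions=("md", "mdtxt", "txt")):
    """Yield every file below `folder` whose suffix is one of `extensions`."""
    # normalise "md" and ".md" to the same ".md" form
    wanted = {"." + e.lstrip(".").lower() for e in extensions}
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            yield path
```

Calling list(iter_target_files("docs")) would return the matching files of a docs folder and of all its subfolders.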


This tool is useful for replacing particular strings, e.g. in a documentation folder. It lets you first test and then run the replacements, which are defined in a JSON dictionary holding all the (regex, replacement) pairs to be handled.
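
To make the shape of that dictionary concrete, the sketch below parses a JSON object whose values are [regex, replacement] lists and compiles each regex. It is an illustration under assumptions, not the tool's own loader: load_replacements and the "iso-date-year" entry are hypothetical.

```python
import json
import re


def load_replacements(text):
    """Parse a JSON object of name -> [regex, replacement]; compile the regexes."""
    table = {}
    for name, pair in json.loads(text).items():
        if not isinstance(pair, list) or len(pair) != 2:
            raise ValueError("{!r}: expected [regex, replacement]".format(name))
        regex, repl = pair
        table[name] = (re.compile(regex), repl)
    return table


# Hypothetical entry, for illustration only: the group captures the year
sample = '{"iso-date-year": ["([0-9]{4})-[0-9]{2}-[0-9]{2}", "{0}{0}{0}{0}"]}'
replacements = load_replacements(sample)
```

Rejecting malformed entries at load time gives an error that names the faulty key instead of an unpacking error later on.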

$ ./ -h
usage: ./ [-a] [-b] [-c {*,#,@,+,-,%,$}] [-e EXT [EXT ...]]
                            [-r REPLACEMENTS] [-t] [-h] [-v]

DocTextMasker v3.0
Author   : Alexandre D'Hondt

This tool parses all Markdown files in the specified folder and replaces
 multiple metadata by a hiding character. The purpose is to mask metadata in
 the tool outputs and sessions shown in the Markdown files.

positional arguments:
  folder              target folder

optional arguments:
  -a                  ask for confirmation (default: False)
  -b                  take a backup copy (default: False)
  -c {*,#,@,+,-,%,$}  hiding char (default: #)
  -e EXT [EXT ...]    extensions to be handled (default: ['md', 'mdtxt', 'txt'])
  -r REPLACEMENTS     replacements JSON file (default: replacements.json)
  -t                  display modifications but do not apply them (default: False)
                       NB: this ignores -a and -b

extra arguments:
  -h, --help          show this help message and exit
  -v, --verbose       verbose mode (default: False)

Usage examples:
  ./ -t
  ./ -r my-own-replacements.json
  ./ -f docs -c $


  1. Testing
$ ./doc-text-masker -t
12:34:56 [WARNING] Changes in 'src/trace.txt':
0: 12:34:56 [INFO] [0a:1b:2c:3d:4e:5f] -> [1b:2c:3d:4e:5f:0a]
   12:34:56 [INFO] [0a:1b:2c:##:##:##] -> [1b:2c:3d:##:##:##]
  2. Replacement
$ ./doc-text-masker -v
12:34:56 [DEBUG] Entering 'src'...
12:34:56 [DEBUG] Parsing 'src/trace.txt'...
12:34:56 [DEBUG] > Saving new file...
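
Test mode (-t) reports the modifications without writing them. A minimal sketch of such a preview, assuming a line-by-line comparison of the content before and after replacement, could look as follows. The name preview_changes is hypothetical, and the exact layout of the tool's report may differ.

```python
def preview_changes(before, after):
    """Return one 'index: old / new' pair per line that differs."""
    report = []
    lines = zip(before.split("\n"), after.split("\n"))
    for i, (old, new) in enumerate(lines):
        if old != new:
            # pad the second line so that 'new' lines up under 'old'
            pad = " " * (len(str(i)) + 2)
            report.append("{}: {}\n{}{}".format(i, old, pad, new))
    return "\n".join(report)
```

An empty result means that no line would change, so nothing needs to be written.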

Replacement creation

Use case: We want to show a session that illustrates the execution of a CLI tool, but without the dates and times of execution that appear in the tool's logging trace.

Example: Telnet trace

$ telnet

Connected to
Escape character is '^]'.
Last login: Thu Dec 29 23:58:00 UTC 2016 on tty1

We want to hide "Thu Dec 29 23:58:00 UTC 2016". The (Python-style) regular expression that matches such a line is:


The JSON item that can be added to the dictionary is thus:

    "telnet-datetime": [
        "{0}{0}{0} {0}{0}{0} {0}{0} {0}{0}:{0}{0}:{0}{0} {0}{0}{0}{0}"

Note that "{0}" is the format field that designates the first positional argument of str.format(), that is, the selected hiding char (by default, "#").
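
The following sketch shows how such a template can be filled in and applied. It assumes that the text to hide is captured by the first group of the regular expression and that only this group is substituted. The pattern below is hypothetical (it is not the one from the original dictionary), as is the helper name mask.

```python
import re

# Hypothetical pattern: the group captures the date that follows "Last login: "
pattern = re.compile(
    r"Last login: (\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \w{3} \d{4})")
# Replacement template taken from the "telnet-datetime" item
template = "{0}{0}{0} {0}{0}{0} {0}{0} {0}{0}:{0}{0}:{0}{0} {0}{0}{0}{0}"


def mask(text, regex, repl, char="#"):
    """Replace the first captured group of each match by the filled template."""
    filled = repl.format(char)  # every "{0}" becomes the hiding char
    return regex.sub(lambda m: m.group(0).replace(m.group(1), filled), text)


line = "Last login: Thu Dec 29 23:58:00 UTC 2016 on tty1"
print(mask(line, pattern, template))
# prints: Last login: ### ### ## ##:##:## #### on tty1
```

Since the template has no slot for the time zone, "UTC" disappears from the masked line along with the rest of the captured date.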

# -*- coding: UTF-8 -*-
import json
from tinyscript import *


__author__ = "Alexandre D'Hondt"
__version__ = "3.1"
__doc__ = """
This tool parses all Markdown files in the specified folder and replaces
 multiple metadata by a hiding character. The purpose is to mask metadata in
 the tool outputs and sessions shown in the Markdown files.
"""
__examples__ = ["", "-t", "-r my-own-replacements.json", "-f docs -c $"]

def apply_replacements(fp):
    """
    This function handles a text file for replacements according to a
     user-provided list of replacements formatted as pairs (regexp, replacement).

    :param fp: Path instance of file to be handled
    """
    logger.debug("Parsing '{}'...".format(fp))
    # retrieve file content; files that cannot be decoded as text are skipped
    try:
        content = contentm = fp.read_text()
    except UnicodeDecodeError:
        return
    # apply replacements to 'contentm' buffer
    for cat, (regex, repl) in args.replacements.items():
        # within each match, the first captured group is replaced by the mask
        # (each regex is thus assumed to define at least one group)
        contentm = regex.sub(lambda m: m.group(0).replace(m.group(1),
                             repl.format(args.mchar)), contentm)
    if content != contentm:
        # if testing mode, just display the modified lines
        if args.test:
            diff = []
            # i is the 0-based index of the modified line
            for i, (l1, l2) in enumerate(zip(content.split('\n'),
                                             contentm.split('\n'))):
                if l1 != l2:
                    diff.append("{}: {}\n{} {}"
                                .format(i, l1, len(str(i)) * ' ', l2))
            logger.warning("Changes in '{}':\n{}".format(fp, '\n'.join(diff)))
        # otherwise, apply the replacements (after confirmation if required)
        elif not args.ask or ts.confirm():
            # backup original file if required
            if args.backup:
                bf = fp.parent.joinpath("." + args.folder.basename, create=True)
                bfp = bf.joinpath(fp.basename + ".bak")
                if not bfp.exists():
                    logger.debug("> Saving backup copy...")
                    bfp.write_text(content)
            # overwrite original file with the replaced content
            logger.debug("> Saving new file...")
            fp.write_text(contentm)
    else:
        logger.debug("> No change")

if __name__ == '__main__':
    parser.add_argument("folder", type=ts.folder_exists_or_create,
                        help="target folder")
    parser.add_argument("-a", dest="ask", action="store_true",
                        help="ask for confirmation")
    parser.add_argument("-b", dest="backup", action="store_true",
                        help="take a backup copy")
    parser.add_argument("-c", dest="mchar", default='#', choices="*#@+-%$",
                        help="hiding char")
    parser.add_argument("-e", dest="ext", nargs="+",
                        default=["md", "mdtxt", "txt"],
                        help="extensions to be handled")
    parser.add_argument("-r", dest="replacements", default="replacements.json",
                        type=ts.file_exists, help="replacements JSON file")
    parser.add_argument("-t", dest="test", action="store_true",
                        help="display modifications but do not apply them",
                        note="this ignores -a and -b")
    # parse the arguments and set up the logger (populates the global 'args')
    initialize()
    # compile the regular expressions of the replacements dictionary
    with open(args.replacements) as f:
        args.replacements = {k: (re.compile(v[0]), v[1])
                             for k, v in json.load(f).items()}
    args.folder = Path(args.folder)
    # running the main stuff
    ffunc = lambda x: any(str(x).endswith(e) for e in args.ext)
    for p in args.folder.walk(filter_func=ffunc):
        apply_replacements(p)

    "whois-datetime": [
        "{0}{0}{0}, {0}{0} {0}{0}{0} {0}{0}{0}{0} {0}{0}:{0}{0}:{0}{0} {0}{0}{0}"
    "mac": [
    "patator-datatime": [
        "{0}{0}{0}{0}-{0}{0}-{0}{0} {0}{0}:{0}{0} {0}{0}{0}"
    "ssh-datetime": [
        "{0}{0}{0} {0}{0}{0} {0}{0} {0}{0}:{0}{0}:{0}{0} {0}{0}{0} {0}{0}{0}{0}"
    "ncrack-datetime": [
        "{0}{0}{0}{0}-{0}{0}-{0}{0} {0}{0}:{0}{0} {0}{0}{0}"
    "nmap-datetime": [
        "{0}{0}{0}{0}-{0}{0}-{0}{0} {0}{0}:{0}{0}"
    "telnet-datetime": [
        "{0}{0}{0} {0}{0}{0} {0}{0} {0}{0}:{0}{0}:{0}{0} {0}{0}{0}{0}"
    "hydra-info": [
        "{0}{0}{0}{0}-{0}{0}-{0}{0} {0}{0}:{0}{0}:{0}{0}"
    "patator-logging": [