Michael Wayne Goodman goodmami

@goodmami
goodmami / README.md
Last active February 23, 2023 17:39
Converting ACE's Subversion repository to Git

Converting ACE from Subversion to Git

The ace-svn-to-git.sh script uses git-svn to convert ACE's Subversion repository to Git. The --stdlayout flag tells git-svn to expect the conventional trunk/tags/branches layout, so those are handled mostly as expected (more below). The --prefix=svn/ option puts all of those tags and branches under the svn reference namespace, and the --authors-file option maps the Subversion author names to the current GitHub profiles of the three authors in ACE's history.
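The three options described above can be sketched as a single `git svn clone` invocation. This is a minimal illustration, not the contents of ace-svn-to-git.sh: the function name, the repository URL, the target directory, and the authors-file path are all placeholders assumed for the example. Only the flags `--stdlayout`, `--prefix=svn/`, and `--authors-file` come from the text.

```python
def git_svn_clone_command(svn_url, target_dir, authors_file):
    """Build the argument list for a git-svn conversion.

    All three parameters are hypothetical inputs for this sketch; the real
    script may obtain them differently.
    """
    return [
        "git", "svn", "clone",
        "--stdlayout",                     # expect trunk/, tags/, branches/
        "--prefix=svn/",                   # put remote refs under svn/
        f"--authors-file={authors_file}",  # map SVN usernames to Git authors
        svn_url,
        target_dir,
    ]


# Placeholder values, assumed for illustration only.
cmd = git_svn_clone_command(
    "https://svn.example.org/ace", "ace-git", "authors.txt"
)
print(" ".join(cmd))
# To actually run the conversion, pass the list to
# subprocess.run(cmd, check=True) on a machine with git-svn installed.
```

Each line of an authors file has the form `svnuser = Full Name <email@example.com>`, which is how git-svn knows which Git identity to record for each Subversion committer.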

@goodmami
goodmami / README.md
Last active September 8, 2023 04:10
Parsing JSON with regular expressions

Parsing JSON with Regular Expressions

When I learned of regular expression engines that support recursion, I thought I could write a recursive-descent parser in regex. Since I've written JSON parsers a few times and it's a simple spec, I chose that as the test case. In the end I created two versions.

version 1

@goodmami
goodmami / repp.md
Created November 25, 2019 14:35
REPP notes

Regular Expression Preprocessing (REPP)

Specification

Modules

Operators

Every operator must appear as the first character on a line (in column 0).
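A rough sketch of what the column-0 rule means for a line reader: a line counts as an operator line only when its very first character is an operator, so leading whitespace disqualifies it. The operator set and the function name below are assumptions made for illustration, since the list of operators is not shown here; consult the specification for the real set.

```python
# Hypothetical operator characters, assumed for this example only.
ASSUMED_OPERATORS = frozenset("!<>#:@;")


def split_operator(line):
    """Return (operator, operand) if an operator is in column 0, else None."""
    if line and line[0] in ASSUMED_OPERATORS:
        return line[0], line[1:].rstrip("\n")
    return None


print(split_operator("!foo\tbar"))   # operator in column 0: recognized
print(split_operator("  !foo\tbar")) # indented: not an operator line
```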

@goodmami
goodmami / lark-parsimonious.py
Created August 30, 2018 21:58
Comparing Lark and Parsimonious on JSON parsing
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# usage: python3 lark-parsimonious.py [TESTNUM]
#
# Where TESTNUM is one of:
#
# 1. Parsimonious with the faster grammar (tree-only)
# 2. Parsimonious with the faster grammar (transform data)
# 3. Parsimonious with the slower grammar (tree-only)
@goodmami
goodmami / nltk-bleu.py
Created June 27, 2017 01:03
Simple multi-bleu utility using the NLTK
#!/usr/bin/env python3
# Copyright 2017 Michael Wayne Goodman <goodman.m.w@gmail.com>
# Licensed under the MIT license: https://opensource.org/licenses/MIT
import sys
import os
import gzip
import docopt
@goodmami
goodmami / getargs.bash
Created August 14, 2016 04:50
Processing command-line arguments in Bash
#!/bin/bash
die() { echo "$1"; exit 1; }
usage() {
cat <<EOF
Usage: getargs [--help] [OPTION...] ARGUMENT...
Example usage of useful conventions for command-line argument parsing.
@goodmami
goodmami / quotes.py
Created February 16, 2016 20:53
List of Unicode quote symbols
# quote list: https://en.wikipedia.org/wiki/Quotation_mark
QUOTES = (
    '\u0022'  # quotation mark (")
    '\u0027'  # apostrophe (')
    '\u00ab'  # left-pointing double-angle quotation mark
    '\u00bb'  # right-pointing double-angle quotation mark
    '\u2018'  # left single quotation mark
    '\u2019'  # right single quotation mark
    '\u201a'  # single low-9 quotation mark
    '\u201b'  # single high-reversed-9 quotation mark
@goodmami
goodmami / ElementPath-xpath_tokenizer-original.py
Last active September 2, 2021 16:09
ElementPath with default namespace support
def xpath_tokenizer(pattern, namespaces=None):
    for token in xpath_tokenizer_re.findall(pattern):
        tag = token[1]
        if tag and tag[0] != "{" and ":" in tag:
            try:
                prefix, uri = tag.split(":", 1)
                if not namespaces:
                    raise KeyError
                yield token[0], "{%s}%s" % (namespaces[prefix], uri)
            except KeyError:
@goodmami
goodmami / make-preference.sh
Last active August 29, 2015 14:16
Make an [incr tsdb()] preference file with a specific result ID.
#!/bin/bash
if [ $# -ne 2 ]; then
    echo 'usage: make-preference.sh PROFILE RESULT-ID'
    exit 1
fi
awk -F@ -v RES="$2" \
    '{ if($2 == RES) { printf("%d@-1@%d\n", $1, $2) } }' \
    < "$1"/result
@goodmami
goodmami / README.md
Last active August 29, 2015 14:13
Arc Diagrams with Variably Spaced Nodes