8enmann / post_process.py
Created May 3, 2019 22:40 — forked from Smerity/post_process.py
WikiText: Python 2 post processing used on Moses tokenized input
# encoding=utf8
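import sys
# Python 2 only: site.py deletes sys.setdefaultencoding at startup, so
# reload(sys) is needed to bring it back. Setting it to UTF-8 makes implicit
# str/unicode conversions use UTF-8 instead of ASCII.
reload(sys)
sys.setdefaultencoding('utf8')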
import re
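# Matches tokens made up entirely of digit groups, each optionally followed
# by a comma or period, e.g. "2019", "1,000", "3.14".
number_match_re = re.compile(r'^([0-9]+[,.]?)+$')
# Captures a single comma or period so it can be reused as \1 in a substitution.
number_split_re = re.compile(r'([,.])')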
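The section shown here ends before the two patterns are used, so the following is only a minimal sketch of how they could be applied to a single token. The helper name `split_number_token` is hypothetical, and the ` @,@ ` / ` @.@ ` replacement format is an assumption based on the separator convention found in the WikiText data, not something confirmed by the lines above. The sketch runs under both Python 2 and Python 3.

```python
import re

# Same patterns as in the script above.
number_match_re = re.compile(r'^([0-9]+[,.]?)+$')
number_split_re = re.compile(r'([,.])')


def split_number_token(token):
    """Hypothetical helper: rewrite separators inside purely numeric tokens.

    Assumes the WikiText-style ' @,@ ' / ' @.@ ' convention; tokens that are
    not purely numeric are returned unchanged.
    """
    if number_match_re.match(token):
        # \1 is the comma or period captured by number_split_re.
        return number_split_re.sub(r' @\1@ ', token)
    return token


if __name__ == '__main__':
    for tok in ['1,000', '3.14', '2019', 'U.S.', 'hello']:
        print(repr(split_number_token(tok)))
```

Only tokens that pass `number_match_re` are rewritten, so abbreviations such as `U.S.` keep their periods. Because `number_split_re` uses a capturing group, the separator itself is available to the replacement string rather than being discarded.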