eiriks / OBT-stemmer.sh
Last active August 29, 2015 14:17 — forked from ljos/OBT-stemmer.sh
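#!/usr/bin/env bash
# Reads tagger output on stdin, joins every three non-blank lines into one
# record, and prints the third tab-separated field of each record, lowercased.
# Requires GNU sed: \s, \t, \L and \0 are GNU extensions.
sed '/^\s*$/d' \
  | paste -d '\t\0' - - - \
  | sed -e 's/\([^"]*\)$/\t\1/' \
        -e 's,<word>\(.*\)</word>,\1,' \
        -e 's/"<\(.*\)>"\t"\(.*\)"/\1\t\2/' \
  | cut -f3 \
  | sed 's/./\L\0/g'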
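A minimal usage sketch, assuming each token arrives as three lines: a `<word>` line, a quoted `"<form>"` line, and a tab-indented line whose first quoted string is the lemma. That input shape, the sample tokens, and the function name `obt_lemmas` are illustrative assumptions, not taken from the gist. The pipeline itself is unchanged and still needs GNU sed.

```shell
#!/usr/bin/env bash
# obt_lemmas is a hypothetical wrapper name; the body is the gist's pipeline.
obt_lemmas() {
  sed '/^\s*$/d' \
    | paste -d '\t\0' - - - \
    | sed -e 's/\([^"]*\)$/\t\1/' \
          -e 's,<word>\(.*\)</word>,\1,' \
          -e 's/"<\(.*\)>"\t"\(.*\)"/\1\t\2/' \
    | cut -f3 \
    | sed 's/./\L\0/g'
}

# Assumed sample input: two three-line records separated by a blank line.
printf '<word>Dette</word>\n"<dette>"\n\t"dette" pron\n\n<word>Oslo</word>\n"<oslo>"\n\t"Oslo" subst prop\n' \
  | obt_lemmas
# prints:
# dette
# oslo
```

Because `paste - - -` groups lines strictly in threes, a token with more than one analysis line would shift every record after it, so this sketch only holds for input with exactly one analysis line per token.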