Last active
August 29, 2015 14:17
OBT helper script
#!/usr/bin/env bash
input_file="${1:-/dev/stdin}"
sed '/^\s*$/d' "$input_file" \
| paste -d '\t\0' - - - \
| sed -e 's/\([^"]*\)$/\t\1/' \
      -e 's,<word>\(.*\)</word>,\1,' \
      -e 's/"<\(.*\)>"\t"\(.*\)"/\1\t\2/' \
| cut -f3 \
| sed 's/./\L\0/g'
What this script does:
- Sets input_file to either the first argument or, if none is given, stdin.
- Strips the <word></word> tag.
- Strips the "<>" and "" around the words.

You must first run the OBT and OBT-Stat programs on your text for this script to produce the correct output.
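
To make the steps above concrete, here is a small sketch that runs the same pipeline on a made-up sample. The sample is an assumption about what OBT followed by OBT-Stat emits: three lines per token (the <word> line, the quoted "<form>" line, and one tab-indented reading that starts with the quoted lemma), with blank lines between tokens. The function names obt_lemmas and sample_obt_output are hypothetical and exist only for this illustration. The pipeline relies on GNU sed for \s, \t and \L.

```shell
#!/usr/bin/env bash
# Sketch only: the record layout in sample_obt_output is an assumed
# OBT-Stat format, not real tagger output.

# The gist's pipeline wrapped in a function that reads stdin.
obt_lemmas() {
  sed '/^\s*$/d' \
    | paste -d '\t\0' - - - \
    | sed -e 's/\([^"]*\)$/\t\1/' \
          -e 's,<word>\(.*\)</word>,\1,' \
          -e 's/"<\(.*\)>"\t"\(.*\)"/\1\t\2/' \
    | cut -f3 \
    | sed 's/./\L\0/g'
}

# Hypothetical two-token sample: three lines per token, blank line between.
sample_obt_output() {
  printf '<word>Hunden</word>\n"<hunden>"\n\t"hund" subst appell mask be ent\n\n<word>Oslo</word>\n"<oslo>"\n\t"Oslo" subst prop\n'
}

# Prints one lowercased third field per token.
sample_obt_output | obt_lemmas
```

With input shaped like this, paste joins each group of three lines into one tab-separated record, the sed expressions strip the markup, and cut -f3 keeps the third field, which in this assumed layout is the lemma. If your tagger output has more than one reading per token, the three-line grouping no longer holds and the result will be misaligned.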