@greymd
Last active June 13, 2022 10:47
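Enumerating "poribaketsu" (ポリバケツ) words, i.e. words whose romanized reading contains each of the vowels a, i, u, e, o exactly once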
## Version that handles long vowels (note: `sort -un` keeps only one word per length)
nkf -w /usr/share/mecab/dic/ipadic/Noun*.csv | awk -F, '$0=$NF" "$NF' | teip -f 2 -- uconv -x latin | awk '/[aā]/&&/[iī]/&&/[uū]/&&/[eē]/&&/[oō]/' | sed -E -e'/('{[aā],[iī],[uū],[eē],[oō]}').*\1/d' | awk '{print length($1),$0}' | sort -un
## Ranking by length
nkf -w /usr/share/mecab/dic/ipadic/Noun*.csv | awk -F, '$0=$NF" "$NF' | teip -f 2 -- uconv -x latin | awk '/[aā]/&&/[iī]/&&/[uū]/&&/[eē]/&&/[oō]/' | sed -e/{'[aā].*[aā]','[iī].*[iī]','[uū].*[uū]','[eē].*[eē]','[oō].*[oō]'}/d | sort -u | awk '{print length($1),$0}' | sort -n
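The vowel filter at the core of both pipelines can be tried without the dictionary, `teip`, or `uconv`. The sketch below is an illustration only: the function name `each_vowel_once` and the sample words are hypothetical, and it spells out the five `sed` expressions that the brace expansion in the ranking pipeline is meant to produce. Matching the macron forms (ā, ī, ū, ē, ō) is assumed to need a UTF-8 locale with multibyte-aware `awk` and `sed`, so the sample input here is ASCII only.

```shell
# Hypothetical helper (not part of the original gist): reads romanized words
# on stdin, one per line, and prints those that contain each vowel class
# (a, i, u, e, o, with or without a macron) exactly once.
each_vowel_once() {
  # Step 1: keep only lines that contain all five vowel classes.
  awk '/[aā]/ && /[iī]/ && /[uū]/ && /[eē]/ && /[oō]/' |
    # Step 2: drop lines where any vowel class occurs more than once.
    sed -E \
      -e '/[aā].*[aā]/d' \
      -e '/[iī].*[iī]/d' \
      -e '/[uū].*[uū]/d' \
      -e '/[eē].*[eē]/d' \
      -e '/[oō].*[oō]/d'
}

# "sakura" lacks i, e and o; "aiueoa" repeats a; the other two pass.
printf '%s\n' poribaketsu sakura aiueoa education | each_vowel_once
# prints:
# poribaketsu
# education
```

Feeding the function the romanized column produced by `uconv -x latin` should give the same selection as the pipelines above, under the locale assumption stated earlier.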