@gmodecorp
Created January 29, 2015 09:37
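# Count word frequencies by part of speech in text read from STDIN, using MeCab.
use strict;
use warnings;
use MeCab;

# Parts of speech to count: noun, verb, adjective, adverb, interjection.
# The literals are compared as raw bytes, so this file must be saved in the
# same encoding as the MeCab dictionary output (typically UTF-8).
my @validHinshiList = qw(名詞 動詞 形容詞 副詞 感動詞);
my %wordlist = ();
my $mecab = MeCab::Tagger->new();

while (my $line = <STDIN>) {
    chomp $line;
    for (my $node = $mecab->parseToNode($line); $node; $node = $node->{next}) {
        next unless defined $node->{surface};
        my $midasi = $node->{surface};                # surface form of the token
        my ($hinsi) = split /,/, $node->{feature};    # first feature field: part of speech
        if (grep { $_ eq $hinsi } @validHinshiList) {
            my $key = $midasi . ',' . $hinsi;
            $wordlist{$key}++;    # ++ treats a missing entry as 0, so no "uninitialized" warning
        }
    }
}

# Print one "surface,part-of-speech,count" line per entry, sorted by key.
for my $key (sort keys %wordlist) {
    print "$key,$wordlist{$key}\n";
}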