@whitebell
Last active September 6, 2017 15:21
Rough conversion of pejv181u (from http://www.vastalto.com/jpn/#e-Dic) to a CSV file for PDIC
use strict;
use warnings;
use utf8;
use Text::CSV;
my $csv = Text::CSV->new({binary => 1, always_quote => 1}) or die Text::CSV->error_diag;
open my $rh, '<:encoding(UTF-8)', 'pejvo.txt' or die $!;
open my $wh, '>:raw:encoding(UTF-16LE)', 'esperanto.csv' or die $!;
print $wh "\x{feff}"; # BOM
print $wh "word,trans,level\r\n";
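while (my $line = <$rh>) {
    chomp $line;
    # Split on the first colon only, so colons inside the translation are kept
    my ($word, $trans) = split /:/, $line, 2;
    next unless defined $trans; # skip blank lines and lines without a colon
    my $level = 0;
    # Convert the {B} marker to level 1 and the {O} marker to level 2
    if ($trans =~ s/\{([BO])\}//) {
        $level = $1 eq 'B' ? 1 : 2;
    }
    # Append the slash-delimited form to the translation, then strip the slashes from the headword
    if ($word =~ m{/}) {
        $trans .= " $word";
        $word =~ s{/}{}g;
    }
    my @col = ($word, $trans, $level);
    $csv->print($wh, \@col);
    print $wh "\r\n";
}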
close $wh;
close $rh;
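
To see what the loop does to a single entry, here is a minimal sketch that isolates the per-line transformation in a helper. The name `convert_line` and the sample line are hypothetical and appear nowhere in the script or the dictionary file. The sketch assumes input lines of the form `word:translation`, and it splits on the first colon only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): mirrors the loop body
# for one "word:translation" line and returns the three CSV columns.
sub convert_line {
    my ($line) = @_;
    # Assumption: split on the first colon only
    my ($word, $trans) = split /:/, $line, 2;
    return unless defined $trans;    # no colon found: nothing to convert
    my $level = 0;
    # {B} => level 1, {O} => level 2; the marker is removed from the translation
    if ($trans =~ s/\{([BO])\}//) {
        $level = $1 eq 'B' ? 1 : 2;
    }
    # Keep the slash-delimited form in the translation, drop slashes from the headword
    if ($word =~ m{/}) {
        $trans .= " $word";
        $word =~ s{/}{}g;
    }
    return ($word, $trans, $level);
}

# Made-up sample line, for illustration only
my @col = convert_line('am/i:{B}to love');
print join('|', @col), "\n";    # prints "ami|to love am/i|1"
```

The headword loses its slashes so it can be looked up as a plain word, and the original segmented form is preserved at the end of the translation.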