@shiba-yu36
Created June 8, 2011 10:44
Example: when Encode::Guess fails to determine the encoding, try every candidate character encoding
use strict;
use warnings;
use utf8;
use Encode;
use Encode::Guess qw/ascii utf8 euc-jp shiftjis 7bit-jis/;
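# Words that mean "present" (出席 = attendance, plus short and kana forms)
my $presence_list = [qw(
    出席 出 しゅっせき
)];
# Words that mean "absent" (欠席 = absence, plus short and kana forms)
my $absence_list = [qw(
    欠席 欠 けっせき
)];
# Build alternation patterns, escaping any regex metacharacters
my $presence_re = join '|', map {quotemeta $_} @$presence_list;
my $absence_re = join '|', map {quotemeta $_} @$absence_list;
# Sample input: the Shift_JIS bytes of "出席します" ("I will attend")
my $str = encode('shiftjis', '出席します');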
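# Guess the decoder. On success guess() returns an encoding object;
# on failure it returns a message string such as "shiftjis or euc-jp".
my $decoders = [];
my $decoder = Encode::Guess->guess($str);
unless (ref($decoder)) {
    # Guess failed: collect every candidate encoding named in the message
    my @names = split(' or ', $decoder);
    for my $name (@names) {
        my $d = find_encoding($name);
        push @$decoders, $d if $d;
    }
}
else {
    push @$decoders, $decoder;
}
# No usable decoder: stop here (return is not allowed outside a subroutine)
exit if @$decoders == 0;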
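# 0 = undetermined, 1 = present, 2 = absent
my $rollcall_type = 0;
# Decode with each candidate in turn and look for a known keyword
for my $d (@$decoders) {
    my $decoded_str = $d->decode($str);
    $rollcall_type = 1 if $decoded_str =~ qr{$presence_re};
    $rollcall_type = 2 if $decoded_str =~ qr{$absence_re};
}
warn $rollcall_type;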
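
# A possible refactoring, offered only as a sketch and not part of the
# original gist: the same fallback wrapped in a reusable subroutine.
# The name candidate_decoders is hypothetical. It assumes the
# "use Encode" and "use Encode::Guess qw/.../" lines at the top of this
# file have already registered the suspect encodings.
sub candidate_decoders {
    my ($octets) = @_;
    my $guess = Encode::Guess->guess($octets);
    # A reference means the guess settled on a single encoding object.
    return ($guess) if ref $guess;
    # Otherwise $guess is a message; keep only the names that resolve
    # to real encodings (an unrelated error message resolves to none).
    return grep { defined } map { find_encoding($_) } split / or /, $guess;
}
# Usage, under the same assumptions:
#   my @decoders = candidate_decoders($str);
#   print $_->name, "\n" for @decoders;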