Skip to content

Instantly share code, notes, and snippets.

@hinnerk-a
Created August 31, 2011 14:41
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
Star You must be signed in to star a gist
Embed
What would you like to do?
Perl: handle malformed UTF-8 strings with Encode::encode
sub encode_utf_8 {
my $string = @_;
my $utf8_encoded = '';
eval {
$utf8_encoded = Encode::encode('UTF-8', $string, Encode::FB_CROAK);
};
if ($@) {
# sanitize malformed UTF-8
$utf8_encoded = '';
my @chars = split(//, $string);
foreach my $char (@chars) {
my $utf_8_char = eval { Encode::encode('UTF-8', $char, Encode::FB_CROAK) }
or next;
$utf8_encoded .= $utf_8_char;
}
}
return $utf8_encoded;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment