@csirac2
Created October 20, 2011 23:42
perl unisanity
#!/usr/bin/perl -CSAD
# Since perl 5.10.1, -C on the #! line must also be given on the command line,
# so run this as: perl -CSAD /tmp/test.pl
# This is a little exploration of http://perldoc.perl.org/perlunicode.html#The-"Unicode-Bug"
# See also http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default
use utf8;
use warnings;
use strict;
my $u = 'ü';
print "'$u' is utf8: " . utf8::is_utf8($u) . "\n";
print "$u =~ /\\w/: " . ($u =~ /\w/) . "\n";
my $ordu = ord($u);
printf("ord('$u') = 0x%x\n", $ordu);
my $xfc = "\xfc";
tests($xfc, $u);
print "utf8::upgrade('\\xfc')\n";
utf8::upgrade($xfc);
tests($xfc, $u);
# Compare a "\xfc" string ($raw) with the utf8-flagged literal ($chr):
# utf8 flag state, string equality, \w matching, and uppercasing.
sub tests {
    my ($raw, $chr) = @_;
    print "\nTesting '$chr':\n";
    print "'\\xfc' is utf8: " . (utf8::is_utf8($raw) || 0) . "\n";
    print "'\\xfc' eq '$chr'? " . ($raw eq $chr || 0) . "\n";
    print "'\\xfc' =~ /\\w/: " . ($raw =~ /\w/ || 0) . "\n";
    print "uc('\\xfc') eq uc('$chr')? " . (uc($raw) eq uc($chr) || 0) . "\n";
    return;
}