Get the IANA Language Tag registry, parse it, and write it to a tab-delimited file
#!perl -w
# Parse out "
# And write it to a tab-delimited file.
# Format described at
# TODO: use some Excel writer to split into sheets by Type
use LWP::Simple;
$_ = get("") or die;
s{\n }{}g; # continuation lines in Comments
open STDOUT,">iana-lang-tags.txt" or die "can't open STDOUT: $!\n";
binmode STDOUT, ':encoding(UTF-8)' or die "can't set binmode UTF-8: $!\n";
my @cols = qw(Type Scope Prefix Tag Subtag Suppress-Script Description Macrolanguage Added Deprecated Preferred-Value Comments);
print join("\t",@cols), "\n";
foreach (split/\n%%\n/) { # %% separated block
    next if /File-Date:/; # first block is date stamp, not data
    my %hash;
    foreach (split/\n/) {
        my ($key,$val) = split(/: /,$_,2);
        if ($hash{$key}) {$hash{$key} .= ", $val"} # 'Description', 'Comments', and 'Prefix' are multivalued
        else {$hash{$key} = $val}
    }
    # one tab-separated row per record; fields absent from a record are left empty
    foreach (@cols) {
        print $hash{$_} || "", "\t";
    }
    print "\n";
}
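
The record-parsing step can be tried without a network connection. The sketch below is only an illustration under stated assumptions: the sample text is made up (it imitates the registry's record layout but is not real IANA data), and `parse_registry` is a hypothetical helper name that does not appear in the script above. It folds continuation lines, splits on `%%`, and joins repeated fields with ", " as the script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample in the registry's record layout (assumption: NOT real IANA data).
my $sample = join "\n",
    'File-Date: 2000-01-01',
    '%%',
    'Type: language',
    'Subtag: xx',
    'Description: First name',
    'Description: Second name',
    'Added: 2000-01-01',
    'Comments: a comment that wraps',
    '  onto a second line',
    '%%',
    'Type: script',
    'Subtag: Yyyy',
    'Description: Example script',
    'Added: 2000-01-01',
    '';

# parse_registry is a hypothetical helper, not part of the original script.
# It returns a list of hash references, one per record.
sub parse_registry {
    my ($text) = @_;
    $text =~ s/\n[ \t]+/ /g;    # fold indented continuation lines into the previous line
    my @records;
    for my $block (split /\n%%\n/, $text) {
        next if $block =~ /^File-Date:/;    # first block is the date stamp, not data
        my %rec;
        for my $line (split /\n/, $block) {
            my ($key, $val) = split /:\s*/, $line, 2;
            next unless defined $val;       # skip lines without a "key: value" shape
            # repeated keys (e.g. Description) are joined with ", "
            $rec{$key} = exists $rec{$key} ? "$rec{$key}, $val" : $val;
        }
        push @records, \%rec;
    }
    return @records;
}

# Print a few columns of each record, tab-separated.
for my $rec (parse_registry($sample)) {
    print join("\t", map { defined $rec->{$_} ? $rec->{$_} : '' }
                     qw(Type Subtag Description Comments)), "\n";
}
```

Testing with `exists` rather than truthiness keeps a field whose value is "0", and building each row with `join` avoids a trailing tab; both are design choices of this sketch, not behavior of the original script.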