This is what I could write quickly from memory, as I don't have access to the actual code right now, but it should be almost identical:
#!/usr/bin/env perl6
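grammar G {
    # Skip the offset column, then capture hex byte pairs up to the ASCII column.
    token TOP { .*? ": " <hex_array>+ " " .* }
    # xxd prints lowercase hex digits by default, so accept both cases.
    token hex_array { $<hd>=<[0..9 a..f A..F]>**2 " " }
    #token ws { ' ' }
}
my Buf $buf = Buf.new;
class Ga {
    # Append each parsed pair to the buffer as one byte.
    method hex_array ($/) { $buf.append(:16($<hd>.Str)); }
}
my $ga = Ga.new;
# Parse the file one line at a time.
"hex.txt".IO.lines(enc => 'utf8-c8').map( { G.parse($_, actions => $ga ) } );
say $buf.elems;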
I used something like this to generate dummy data for hex.txt, and the script above takes around 45 seconds to a minute to process it:
for i in {1..100000}; do printf '1234567890\xff\xfe\xab\x00\x11\x121234567890' | xxd -g 1 -c 13 >>hex.txt ; done
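In case xxd or bash isn't handy, here is a rough Raku sketch that produces lines in the same layout as `xxd -g 1 -c 13` for the same 26 bytes. The `xxd-lines` sub is a hypothetical helper written for this post, not part of any library, and it does not pad a short final row the way xxd does (the payload here divides evenly into 13-byte rows, so that doesn't come up).

```raku
# Hypothetical stand-in for the shell loop; mimics `xxd -g 1 -c 13` output.
# Same 26 bytes as the printf payload above.
my $payload = Blob.new: flat '1234567890'.ords, 0xff, 0xfe, 0xab,
                             0x00, 0x11, 0x12, '1234567890'.ords;

sub xxd-lines(Blob $data, Int :$cols = 13) {
    $data.list.rotor($cols, :partial).kv.map: -> $i, @row {
        # Hex column: two lowercase digits per byte, separated by single spaces.
        my $hex   = @row.map(*.fmt('%02x')).join(' ');
        # ASCII column: printable characters as-is, everything else as '.'.
        my $ascii = @row.map({ 0x20 <= $_ <= 0x7e ?? .chr !! '.' }).join;
        sprintf '%08x: %s  %s', $i * $cols, $hex, $ascii;
    }
}

my @lines = xxd-lines($payload);
.say for @lines;
# 00000000: 31 32 33 34 35 36 37 38 39 30 ff fe ab  1234567890...
# 0000000d: 00 11 12 31 32 33 34 35 36 37 38 39 30  ...1234567890
```

To get a file of the same size as the shell loop produces, something like `spurt 'hex.txt', (@lines.join("\n") ~ "\n") x 100_000;` should do it.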
Let me know if you want me to post it on the sub or share it some other way.
Finally had some time to work on this. Here's what I've got:
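my $buf = Buf.new;
grammar G {
    token TOP { <.line>* }
    # Whitespace inside a rule calls <.ws>, which is overridden below as a single space.
    rule line { <t000etc> <hex_array> **13 <t123etc>\n?}
    # Offset column: eight hex digits and a colon.
    token t000etc { <xdigit>**8 ':' }
    # One byte: two hex digits, appended to the buffer as soon as they match.
    token hex_array { <[0..9 a..f A..F]>**2 { $buf.append: :16(~$/) } }
    # ASCII column: 13 characters, each a digit or a dot.
    token t123etc { <[. 0..9]>**13 }
    token ws { ' ' }
}
G.parse: slurp 'hex200K.txt';
# Seconds elapsed since the INIT phaser ran at program start.
say now - INIT now;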
Hopefully you can run the above with suitable input and let me know how it compares with whatever solution is your current fastest.
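Before comparing timings, it may be worth checking that both versions decode the same bytes. Here is a minimal sketch of a grammar-free reference decode, assuming input lines in the `xxd -g 1 -c 13` layout. The two sample lines and the `decode-line` helper are made up for illustration.

```raku
# Assumed sample: two lines in `xxd -g 1 -c 13` layout.
my @lines =
    '00000000: 31 32 33 34 35 36 37 38 39 30 ff fe ab  1234567890...',
    '0000000d: 00 11 12 31 32 33 34 35 36 37 38 39 30  ...1234567890';

# Hypothetical helper: take the text between the first ": " and the first
# double space, split it on single spaces, and parse each pair as base 16.
sub decode-line(Str $line) {
    my $hex = $line.split(': ', 2)[1].split('  ', 2)[0];
    Blob.new: $hex.split(' ').map(*.parse-base(16));
}

my $expected = Buf.new;
$expected.append: decode-line($_) for @lines;
say $expected.elems;   # 26
```

Running either grammar on the same lines and comparing with something like `$buf.list eqv $expected.list` should then show whether a speedup changed the result.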