@muraiki
Created September 3, 2015 14:07
Process wikidata using perl6, take 2
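use JSON::Fast;

sub MAIN() {
    # Read only the first 20 lines of the dump and process them in parallel batches.
    my $lines = open('wikidata-20150831-all.json', :r).lines(20).hyper(batch => 4, degree => 4);
    # Drop the lines holding the opening '[' and closing ']' of the top-level JSON array.
    my $filter_ends = $lines.grep: -> $line { $line !~~ / ^ [ '[' | ']' ] $ / };
    # Strip the trailing comma that separates entities; keep lines without one unchanged.
    my $remove_comma = $filter_ends.map: -> $line { $line.ends-with(',') ?? $line.chop !! $line };
    my $json = $remove_comma.map: &from-json;
    my $ids = $json.map: *<id>;
    say $ids.list;
}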
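The clean-up steps can be tried without the full dump. This is a minimal sketch, not the gist's own code: it assumes the dump's layout of one entity per line inside a top-level array, and `@sample` is a made-up three-entity stand-in for the real file. It runs sequentially, without `hyper`, so only the filtering logic is shown.

```perl6
use JSON::Fast;

# Hypothetical sample in the same one-entity-per-line layout as the dump.
my @sample = '[', '{"id":"Q1"},', '{"id":"Q2"},', '{"id":"Q3"}', ']';

my @ids = @sample
    .grep({ $_ ne '[' && $_ ne ']' })          # drop the array bracket lines
    .map({ .ends-with(',') ?? .chop !! $_ })   # strip the separator comma if present
    .map(&from-json)                           # parse each entity on its own
    .map(*<id>);                               # keep only the entity id

say @ids;
```

This should list the ids Q1, Q2 and Q3. The last entity has no trailing comma, so the comma step has to pass such lines through unchanged rather than drop them.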

muraiki commented Sep 3, 2015

batch 4 degree 4:

real    0m14.335s
user    0m41.268s
sys     0m0.396s

hyper with no args:

real    0m19.639s
user    0m19.402s
sys     0m0.216s


muraiki commented Sep 3, 2015

This is with 4 virtual cores at 2800 MHz (1 thread per core) and 16 GB RAM.


muraiki commented Sep 3, 2015

No hyper:

real    0m21.884s
user    0m21.759s
sys     0m0.107s
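
To compare the three runs, the timings above can be turned into a wall-clock speedup and a rough count of busy cores (user time divided by real time). The sketch below only does arithmetic on the numbers already posted. The run labels are mine, and the baseline is assumed to be the run without `hyper`.

```perl6
# Wall-clock ("real") and CPU ("user") seconds copied from the timings above.
my @runs =
    'no hyper'          => { real => 21.884, user => 21.759 },
    'hyper, no args'    => { real => 19.639, user => 19.402 },
    'batch 4, degree 4' => { real => 14.335, user => 41.268 };

# Assumption: the run without hyper is the baseline.
my $baseline = @runs[0].value<real>;

for @runs -> $run {
    my ($real, $user) = $run.value<real user>;
    printf "%-18s speedup %.2fx, user/real %.2f\n",
        $run.key, $baseline / $real, $user / $real;
}
```

By this measure the batch 4, degree 4 run is about 1.5 times faster than the run without `hyper` while using nearly 2.9 cores' worth of CPU time. `hyper` with no arguments gains about 1.1 times, and its user time stays below its real time.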
