@Skarsnik
Created December 2, 2015 19:11
use Gumbo::Parser; # replace with HTML::Parser::XML
use XML;
my $parser = Gumbo::Parser.new;
my $html = qqx{wget -o /dev/null -O - https://www.fimfiction.net/bookshelf/149291/};
say "Fetching and parsing a web page with a very large XML tree; make yourself some coffee if you use HTML::Parser::XML";
say "Web page is " ~ $html.chars ~ " characters long.";
my $xml = $parser.parse($html);
$xml.save('somelargetree.txt');
say "Parsing done: now searching the tree over and over";
for 1..1000 {
    my @characters = $xml.lookfor(:TAG<a>, :class<character_icon>); # shortcut for the elements method
    sleep 0.5;
}