@mattn
Created October 30, 2008 09:24
#!/usr/bin/perl
use strict;
use warnings;
use Web::Scraper;
use URI;
use YAML;

# Scrape the airline accident table: one hash per table row.
my $airlines_accident_scraper = scraper {
    process '//div[@class="entry-content"]//table/tr',
        'airlines[]' => scraper {
            # Each column of the row maps to one field.
            process '//td[1]', title              => 'TEXT';
            process '//td[2]', last_accident      => 'TEXT';
            process '//td[3]', flight_count       => 'TEXT';
            process '//td[4]', death_accident     => 'TEXT';
            process '//td[5]', death_rate         => 'TEXT';
            process '//td[6]', accident_incidence => 'TEXT';
            process '//td[7]', total_rank         => 'TEXT';
        };
    result 'airlines';
};

# Fetch the page and dump the scraped rows as YAML to STDERR.
my $list = $airlines_accident_scraper->scrape(
    URI->new('http://www.manji.com/jp/2007/08/post_22.html'));
warn Dump $list;