Created September 15, 2014 06:23
MarineTraffic scraper
Install Ruby
gem install nokogiri
gem install sqlite3
To open the db from the command line: sqlite3 ships.db
require "nokogiri"
require "open-uri"

# puts URI.open("http://www.marinetraffic.com/en/ais/details/ships/9256858").read

# Parse a locally saved copy of the ship list page.
page = Nokogiri::HTML(File.read("./list.html"))
table = page.xpath("/html/body/div/div[4]/div[2]/div[2]/div/div[1]/table/tbody")
ship_rows = table.xpath("tr")

# children[4] is the fifth child node of each row (text nodes included).
mmsi = ship_rows.map { |row| row.children[4].text }
mmsi.delete_at(0) # Delete label
mmsi = mmsi.map { |s| Integer(s).abs }
puts mmsi.inspect
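
The script above relies on `Integer(s)`, which raises an `ArgumentError` as soon as one cell is empty or holds a label, and on the label being exactly the first row. A more forgiving variant could look like the sketch below. `parse_mmsi` is a hypothetical helper, not part of the original gist, and it assumes Ruby 2.7 or newer (for `filter_map` and `exception: false`).

```ruby
# Hypothetical helper (not in the original gist): convert raw table-cell
# text into integers, silently skipping cells that are not plain numbers.
def parse_mmsi(cells)
  cells.filter_map do |text|
    # Integer(..., exception: false) returns nil instead of raising.
    value = Integer(text.to_s.strip, 10, exception: false)
    # Mirror the original script, which stores the absolute value.
    value&.abs
  end
end

# The header label and the empty cell are dropped; the sign is removed.
puts parse_mmsi(["MMSI", " 227532830 ", "", "-9193678"]).inspect
```

With a helper like this the separate `delete_at(0)` step is no longer needed, since the label fails the integer check and is skipped with the other non-numeric cells.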
require "nokogiri"
require "sqlite3"
require "net/http"

mmsi = [9193678, 9179608, 9147277, 227532830, 227532860, 227532880, 227533000, 227533190, 227533340, 227533480, 227533570, 227533740, 227533750, 413901270, 413901271, 413901272, 413901273, 413901274, 413901275, 413901276, 413901277, 413901278, 413901279, 413901281, 232004296, 232004304, 232004306, 232004315, 232004332, 232004336, 232004347, 232004349, 232004360, 423128100, 423131100, 423133100, 423134100, 423135100, 423136100, 423138100, 423143100, 367397940, 367397950, 367397990, 367398010, 367398020, 367398040, 367398050, 367398110, 316003142]

# SQLite3::Database.new creates ships.db if it does not exist yet.
db = SQLite3::Database.new "ships.db"
# db.execute "DROP TABLE IF EXISTS Ships"
db.execute "CREATE TABLE IF NOT EXISTS Ships(Id INTEGER PRIMARY KEY, Name TEXT, Flag TEXT, Type TEXT)"

mmsi.each.with_index(1) do |m, i|
  puts m
  url = "http://www.marinetraffic.com/en/ais/details/ships/#{m}"
  page = Nokogiri::HTML(Net::HTTP.get(URI.parse(url)))
  # Store the text of each node, not its HTML markup.
  flag = page.xpath('/html/body/div/div[4]/div[1]/div[1]/div/div/div[1]/div[4]/b').text.strip
  type = page.xpath('/html/body/div/div[4]/div[1]/div[1]/div/div/div[1]/div[5]/b').text.strip
  puts flag
  puts type
  name = 'LadyM' # Placeholder: the name is not scraped yet
  # Bind values through placeholders instead of interpolating them into the SQL.
  # OR REPLACE lets the script run again without a primary key conflict.
  db.execute "INSERT OR REPLACE INTO Ships VALUES(?, ?, ?, ?)", [i, name, flag, type]
end

result = db.execute "SELECT * FROM Ships"
puts result.inspect
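
`Net::HTTP.get` returns the body of whatever response comes back and does not follow redirects, so a redirected request would leave Nokogiri parsing an empty or stub page. The sketch below shows one way the fetch step could be made more defensive using only the standard library. `ship_url` and `fetch_html` are hypothetical helpers, not part of the original gist, and the one-second delay and the limit of three redirects are arbitrary assumptions.

```ruby
require "net/http"
require "uri"

# Base URL taken from the script above.
BASE_URL = "http://www.marinetraffic.com/en/ais/details/ships/"

# Hypothetical helper: build the details URL, rejecting non-numeric ids.
def ship_url(id)
  URI("#{BASE_URL}#{Integer(id)}")
end

# Hypothetical helper: fetch a page, following a few redirects and pausing
# between requests. Returns the body as a String, or nil on any other status.
def fetch_html(uri, redirects_left = 3, delay: 1.0)
  raise ArgumentError, "too many redirects" if redirects_left.negative?

  sleep delay # assumed pause so the site is not hit in a tight loop
  response = Net::HTTP.get_response(uri)
  case response
  when Net::HTTPSuccess
    response.body
  when Net::HTTPRedirection
    # The Location header may be relative, so resolve it against the request URI.
    target = URI.join(uri.to_s, response["location"])
    fetch_html(target, redirects_left - 1, delay: delay)
  end
end

# Usage (performs a network request, so it is left commented out):
# html = fetch_html(ship_url(9193678))
# page = Nokogiri::HTML(html) if html
```

Returning `nil` for failed requests lets the main loop skip a ship instead of inserting empty `Flag` and `Type` values.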