@nileshtrivedi
Created December 1, 2010 19:04
Scraping trains and stations list from indiantrains.org
require 'net/http'
require 'uri'
# Fetch list of all stations with their codes
File.open("all_stations.csv", "w") do |f|
  ('A'..'Z').each do |c|
    puts "Starting #{c}"
    sleep(3) # wait for 3 seconds between requests
    uri = URI.parse("http://www.indiantrains.org/station-list/?navigate=#{c}")
    response = Net::HTTP.get_response(uri)
    matches = response.body.scan(/http:\/\/www\.indiantrains\.org\/station-details\/\?code=([A-Z]+)&name=([A-Z+]+)"/).uniq
    matches.each do |m|
      name = m[1].gsub(/\+/, ' ')
      f.puts "\"#{m.first}\",\"#{name}\""
    end
  end
end
# Fetch list of all trains
File.open("all_trains.csv", "w") do |f|
  ('A'..'Z').each do |c|
    puts "Starting #{c}"
    sleep(3) # wait for 3 seconds between requests
    uri = URI.parse("http://www.indiantrains.org/train-list/?navigate=#{c}")
    response = Net::HTTP.get_response(uri)
    # The number class is [0-9a-zA-Z]; the earlier [1-9a-zA-Z] skipped every train number containing a 0
    matches = response.body.scan(/http:\/\/www\.indiantrains\.org\/train-details\/\?number=([0-9a-zA-Z]+)&name=([0-9a-zA-Z+()]+)"/).uniq
    matches.each do |m|
      name = m[1].gsub(/\+/, ' ')
      f.puts "\"#{m.first}\",\"#{name}\""
    end
  end
end
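The station and train loops above differ only in their URL and regex, so the parsing step could be pulled into one helper that can be checked without touching the network. The following is a minimal sketch, not part of the original gist: `extract_rows` and `to_csv_line` are hypothetical names, and the sample markup is made up for illustration rather than copied from indiantrains.org.

```ruby
# Hypothetical helper: scan a page body with the given pattern, drop
# duplicate matches, and turn '+' in the name capture into spaces.
def extract_rows(body, pattern)
  body.scan(pattern).uniq.map { |code, name| [code, name.tr('+', ' ')] }
end

# Hypothetical helper: quote every field and double any embedded quotes,
# producing the same "code","name" layout the script writes.
def to_csv_line(row)
  row.map { |field| %("#{field.gsub('"', '""')}") }.join(',')
end

# Same station pattern as the script above, written with %r{} to avoid
# escaping every slash.
station_pattern = %r{http://www\.indiantrains\.org/station-details/\?code=([A-Z]+)&name=([A-Z+]+)"}

# Made-up sample markup (assumption): the same link repeated twice.
sample = '<a href="http://www.indiantrains.org/station-details/?code=NDLS&name=NEW+DELHI">x</a>' * 2

rows = extract_rows(sample, station_pattern)
lines = rows.map { |row| to_csv_line(row) }
puts lines
```

With helpers like these, each loop body would reduce to fetching the page, calling `extract_rows` with the matching pattern, and writing one `to_csv_line` per row.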