@matth
Created March 16, 2011 16:51
Google Search Term > Page positions finder
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'cgi'
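# Returns the 1-based position of the first result whose href contains +url+
# on the results page that begins at offset +start+, or false if none matches.
def search(term, url, start = 0)
  # Kernel#open no longer fetches URLs on Ruby 3.0+; open-uri provides URI.open.
  doc = Nokogiri::HTML(URI.open("http://www.google.co.uk/search?q=#{CGI.escape(term)}&start=#{start}"))
  # This selector reflects Google's result markup from 2011 and may need updating.
  doc.xpath('//h3/a[@class="l"]').each_with_index do |link, index|
    # Plain substring match, so ".", "?" and "/" in the URL are taken literally.
    return start + index + 1 if link[:href].to_s.include?(url)
  end
  false
end

# Each entry is [search term, URL or URL fragment to look for].
searches = [['alf', 'www.hedweb.com/alffaq.htm'], ['bold food', 'http://dockets.justia.com/docket/new-york/nysdce/1:2009cv00440/338836/']]
searches.each do |term, url|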
  # Check up to 10 result pages and stop at the first page with a match.
  found = false
  10.times do |page|
    found = search(term, url, 10 * page)
    break if found
  end
  # CSV-style output: term,url,position (-1 when no match was found).
  puts "#{term},#{url},#{found || -1}"
end
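
The position arithmetic can be checked without querying Google. The following is a minimal sketch, assuming the result links have already been extracted into arrays of href strings. `rank_on_page`, `overall_rank` and the sample `pages` data are hypothetical and not part of the original script.

```ruby
# Hypothetical helper: 1-based rank of the first href containing `url`,
# offset by `start` (the value passed as Google's start= parameter).
# Returns nil when no href on this page matches.
def rank_on_page(hrefs, url, start = 0)
  index = hrefs.index { |href| href.to_s.include?(url) }
  index && start + index + 1
end

# Hypothetical helper: walk the pages in order and return the first rank found.
# `pages` is an array of href arrays, one per results page.
def overall_rank(pages, url, per_page = 10)
  pages.each_with_index do |hrefs, page|
    rank = rank_on_page(hrefs, url, page * per_page)
    return rank if rank
  end
  -1 # same "not found" value the script prints
end

# Made-up sample data standing in for two scraped result pages.
pages = [
  ['http://example.com/a', 'http://example.org/b'],
  ['http://example.net/c', 'http://www.hedweb.com/alffaq.htm']
]
puts overall_rank(pages, 'www.hedweb.com/alffaq.htm') # prints 12
```

The match is the second link on the second page, so with an offset of 10 the rank is 10 + 1 + 1 = 12.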