@kig
Created November 11, 2008 11:45
google_raw, a small Google command-line tool
#!/usr/bin/ruby
=begin
$ google_raw -n1 euribor
Euribor Homepage
http://www.euribor.org/
Euribor (Euro Interbank Offered Rate) is the rate at which euro interban
k term deposits within the euro zone are offered by one prime bank to ano
ther prime ...Historical DataAbout EuriborAbout EoniaPress ReleasesWhat's
newFAQsPanel BanksTechnical FeaturesMore results from euribor.org??
$ google_raw -n1 ピース
ピース - Wikipedia
http://ja.wikipedia.org/wiki/%E3%83%94%E3%83%BC%E3%82%B9
出典: フリー百科事典『ウィキペディア(Wikipedia)』. 移動: ナビゲーション, 検索. ピース. Peace(ピース)は、英語で「平和
」を意味する。 Piece(ピース)は、同じく英語で「欠片」を意味する。 ピース ( タバコ) - 日本の煙草の銘柄のひとつ。 ...
=end
# Ruby 1.8: handle strings and regexps as UTF-8.
$KCODE = 'u'
require 'hpricot'
require 'open-uri'
require 'cgi'
require 'optparse'
require 'iconv'
require 'stringio'
# Default result limit is Infinity (1.0/0.0), i.e. print every result found.
options = {:count => 1.0/0.0}
opts = OptionParser.new do |o|
  o.banner = "USAGE: #{$0} [options] keyword [keyword ...]"
  o.separator ""
  o.on("-h", "--help", "Show this message"){ puts o.help; exit!(0) }
  o.on("-n COUNT", Integer, "Maximum number of results to show"){|p|
    options[:count] = p
  }
end
begin
  opts.parse!(ARGV)
rescue => e
  STDERR.puts e.message
  STDERR.puts opts.help
  exit!(1)
end
if ARGV.empty?
  puts opts.banner
  exit!(1)
end
#STDERR.puts("Making query...")
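# Fetch the results page with a browser-like User-Agent.
open(
  "http://www.google.fi/search?q=" + CGI.escape(ARGV.join(" ")),
  "Accept-Charset" => "UTF-8;q=0.8",
  "User-Agent" => "Mozilla/5.0 (Windows; U; Windows NT 5.1; fi; rv:1.8.1.15)"+
                  " Gecko/20080623 Firefox/2.0.0.15"
){|f|
  # Meta#charset (from open-uri) returns the downcased charset parameter of
  # the Content-Type header; fall back to UTF-8 if none is reported.
  charset = f.charset || "utf-8"
  outstr = StringIO.new
  page = Hpricot.parse(f.read)
  result_num = 0
  # Each search result is expected in an <li class="g"> element.
  (page/"li.g").each{|li|
    break if result_num >= options[:count]
    a = (li/"a")[0]
    div = (li/"div")[0]
    next if a.nil? or div.nil?  # skip entries without a link or a snippet
    result_num += 1
    # Blank out cite/span/a children so only the snippet text is printed.
    div.children.each{|c|
      if c.respond_to?("etag") and
         ["cite","span","a"].include?(c.etag.inspect[2..-2])
        c.innerHTML = ""
      end
    }
    outstr.puts a.innerText
    outstr.puts a[:href]
    # Wrap the snippet into indented lines of roughly 72 characters.
    outstr.puts(div.innerText.scan(/.{1,72}\S{1,4}?\s*/u).map{|l|
      " " + l.strip })
    outstr.puts
  }
  outstr.rewind
  # Convert to UTF-8 only when the page was served in another encoding.
  if charset != "utf-8"
    puts(Iconv.iconv("UTF-8", charset, outstr.read))
  else
    puts outstr.read
  end
}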
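
The snippet-wrapping step is easier to follow in isolation. The sketch below pulls the script's `scan`/`map` expression into a standalone helper; the name `wrap_snippet` and the `width` parameter are assumptions for illustration and do not appear in the original script. Each chunk takes up to `width` characters, then at least one more non-space character and any trailing whitespace, so a line can come out slightly longer than `width` and may break in the middle of a word.

```ruby
# Hypothetical helper, not part of the original script: wraps text using
# the same regular expression as the script's snippet-printing step.
def wrap_snippet(text, width = 72)
  # Up to `width` characters, then one to four non-space characters (lazy,
  # so normally just one), then any whitespace that follows.
  text.scan(/.{1,#{width}}\S{1,4}?\s*/u).map { |l| " " + l.strip }
end

# Short input stays on one line, prefixed with a single space.
short = wrap_snippet("aaa bbb")

# 100 characters without spaces are split after 73 characters.
long = wrap_snippet("x" * 100)
```

Because the pattern needs at least two characters to match, a one-character remainder at the end of the text is dropped rather than printed on its own line.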