@yury
Created July 18, 2010 10:00
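# encoding: utf-8
require 'mechanize'

# Post-connect hook that transcodes the response body from the charset
# declared in the Content-Type header to Ruby's internal encoding.
# It assumes the hook API that passes a single params hash (only the
# :response and :response_body keys are used); other Mechanize releases
# may call hooks with different arguments.
class MechanizeEncodingHook
  def call(params)
    return if params[:response].nil? || params[:response_body].nil?

    response = params[:response]
    content_type = response['Content-Type']
    return if content_type.nil?

    internal_encoding = (Encoding.default_internal || "utf-8").to_s.downcase

    # Capture only the charset token, not any parameters that follow it.
    charset = content_type[/charset=["']?(?<charset>[^\s;"']+)/i, "charset"]
    return if charset.nil?

    response['Content-Type'] =
      content_type.sub(/charset=[^\s;]+/i, "charset=#{internal_encoding}")

    response_body = params[:response_body].
      force_encoding(charset).
      encode(internal_encoding)

    # Rewrite the first charset mention in the body (typically a <meta> tag).
    # sub is a no-op when nothing matches, whereas String#[]= with a regexp
    # raises IndexError.
    params[:response_body] =
      response_body.sub(/#{Regexp.escape(charset)}/i, internal_encoding)
  end
end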
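
KINOPOISK_SEARCH_URL = "http://kinopoisk.ru/index.php?kp_query="
query = 'терминатор' # "terminator"

agent = Mechanize.new
agent.post_connect_hooks << MechanizeEncodingHook.new

# The query is sent as windows-1251 bytes; percent-encode them so the URL is valid.
encoded_query = URI.encode_www_form_component(query.encode("windows-1251"))
agent.get "#{KINOPOISK_SEARCH_URL}#{encoded_query}"

# each rather than map: the block runs only for its output.
agent.page.search("td.news[width]").each do |section|
  link = section.at(".all")
  puts link.content if link
end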
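
The transcoding step inside the hook can be tried on its own, without Mechanize or a network request. The sketch below is an illustration only: `transcode_body` is a hypothetical helper name that does not appear in the gist, and the sample bytes are built locally rather than fetched from a server.

```ruby
# encoding: utf-8

# Hypothetical helper (not part of the gist): label raw bytes with the
# charset they were sent in, then convert them to the target encoding.
def transcode_body(raw_bytes, charset, target = "utf-8")
  # dup so the caller's string is not relabelled in place
  raw_bytes.dup.force_encoding(charset).encode(target)
end

# Simulate a windows-1251 response body arriving as untagged bytes.
raw = "терминатор".encode("windows-1251").force_encoding("binary")

text = transcode_body(raw, "windows-1251")
puts text.encoding
puts text
```

`force_encoding` only changes the label on the bytes, while `encode` actually converts them, which is why the hook calls both in that order.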