@dima4p
Created April 9, 2011 20:35
Monkey patch for Mechanize
class Mechanize
  class Page
    def initialize(uri=nil, response=nil, body=nil, code=nil, mech=nil)
      @encoding = nil

      # Look for a charset declaration in the HTTP response headers.
      method = response.respond_to?(:each_header) ? :each_header : :each
      re_test = /charset\s*=/i
      re_select = /charset\s*=\s*([^; "]+)/i
      response.send(method) do |header, v|
        next unless v =~ re_test
        encoding = v[re_select, 1]
        @encoding = encoding unless encoding == 'none'
      end

      # Force the encoding to be 8BIT so we can perform regular expressions.
      # We'll set it to the detected encoding later
      body.force_encoding('ASCII-8BIT') if body && body.respond_to?(:force_encoding)

      # Fall back to a <meta http-equiv="Content-Type"> tag in the body.
      if !@encoding && body =~ /<meta\s+http-equiv\s*=\s*"?Content-Type"?(.*?)>/i
        encoding = $1[re_select, 1]
        @encoding = encoding unless encoding == 'none'
      end

      # Last resort: ask Util.detect_charset to guess from the body.
      @encoding ||= Util.detect_charset(body)

      super(uri, response, body, code)
      @mech ||= mech

      # Group the alternatives so that ^ anchors both content types.
      raise Mechanize::ContentTypeError.new(response['content-type']) unless
        response['content-type'] =~ /^(text\/html|application\/xhtml\+xml)/i
      @parser = @links = @forms = @meta = @bases = @frames = @iframes = nil
    end
  end
end
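
The charset lookup in the patch can be tried in isolation, without Mechanize loaded. The sketch below reuses the patch's regular expressions; the helper names `charset_from` and `charset_from_meta` and the sample strings are made up for illustration and are not part of Mechanize.

```ruby
# The same patterns the patch uses for headers and <meta> tags.
RE_TEST   = /charset\s*=/i
RE_SELECT = /charset\s*=\s*([^; "]+)/i
META_RE   = /<meta\s+http-equiv\s*=\s*"?Content-Type"?(.*?)>/i

# Hypothetical helper: returns the charset named in a header value,
# or nil when none is declared (or it is declared as 'none').
def charset_from(value)
  return nil unless value =~ RE_TEST
  encoding = value[RE_SELECT, 1]
  encoding == 'none' ? nil : encoding
end

# Hypothetical helper: looks for the charset in a Content-Type <meta> tag.
# The copy is forced to ASCII-8BIT first, as in the patch, so the regexp
# runs on raw bytes regardless of the body's real encoding.
def charset_from_meta(body)
  binary = body.dup.force_encoding('ASCII-8BIT')
  binary =~ META_RE ? charset_from($1) : nil
end

p charset_from('text/html; charset = UTF-8; foo=bar')  # => "UTF-8"
p charset_from('text/html; charset=none')              # => nil
p charset_from('text/html')                            # => nil

# A made-up body: a meta tag followed by windows-1251 bytes.
sample = '<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">' +
         "\xCF\xF0\xE8\xE2\xE5\xF2"
charset = charset_from_meta(sample)
p charset                                              # => "windows-1251"

# Once the charset is known, the raw bytes can be relabelled and transcoded.
text = sample.dup.force_encoding(charset).encode('UTF-8')
puts text[-6..]                                        # prints Привет
```

Because the capture group stops at `;`, a space, or a double quote, trailing parameters and the closing quote of the `content` attribute are left out of the charset name.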