Keep fetched HTML from turning into mojibake (garbled encoding)
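require 'open-uri'
require 'nkf'
require 'nokogiri'

# `url` must be set by the caller to the page to fetch.
user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586'
# Kernel#open no longer handles URLs as of Ruby 3.0; use URI.open instead.
html = URI.open(url, 'User-Agent' => user_agent).read
# Skip transcoding when open-uri already tagged the body as UTF-8.
unless html.encoding.name == 'UTF-8'
  # NKF.guess inspects the bytes and returns an Encoding (tuned for Japanese text).
  html.encode!('UTF-8', NKF.guess(html).name, invalid: :replace, undef: :replace)
end
doc = Nokogiri::HTML(html)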
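
The transcoding step can be tried without any network access. The following is a minimal sketch, assuming a hypothetical helper named `to_utf8` (not part of the original snippet) and a hand-made Shift_JIS sample standing in for a fetched page:

```ruby
require 'nkf'

# Hypothetical helper (the name is an assumption for illustration):
# returns a UTF-8 copy of `html`, guessing the source encoding with NKF
# when the string is not already tagged as UTF-8.
def to_utf8(html)
  return html if html.encoding.name == 'UTF-8'

  # NKF.guess looks at the raw bytes and returns an Encoding object.
  guessed = NKF.guess(html)
  # Bytes that cannot be converted are replaced rather than raising.
  html.encode('UTF-8', guessed.name, invalid: :replace, undef: :replace)
end

# Sample input: Japanese text stored as Shift_JIS bytes with no encoding
# information attached, roughly what an unlabeled response body looks like.
raw = 'これは日本語のテキストです。'.encode('Shift_JIS').b
puts to_utf8(raw)
```

Because NKF is tuned for Japanese encodings, the guess is likely to be less reliable for pages in other languages.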