@kijun
Created February 17, 2010 02:24
irb(main):001:0> require 'open-uri'
=> true
irb(main):002:0> require 'nokogiri'
=> true
irb(main):003:0> f = open('http://211.254.102.90/web').read
=> "<html><head><title>\xBB\xF5\xB7\xCE\xBF\xEE \xBC\xBC\xBB\xF3\xC0\xBB \xBF\xA9\xB4\xC2 \xB9\xAE, G\xB8\xB6\xC4\xCF</title><meta http-equiv='Content-Type' content='text/html; charset=euc-kr'>\n</head>\n</body>\n</html>\n"
irb(main):011:0> Nokogiri::HTML(f).encoding
=> "euc-kr"
irb(main):012:0> Nokogiri::HTML(f).meta_encoding
=> "euc-kr"
irb(main):013:0> Nokogiri::HTML(f).xpath('//title').text
=> "\xBBõ·Î¿î ¼¼»óÀ» ¿©´Â ¹®, G¸¶ÄÏ" # wrong!
irb(main):014:0> Nokogiri::HTML(f).xpath('//title').text.encoding
=> #<Encoding:UTF-8>
irb(main):015:0> Nokogiri::HTML(f,nil,'euc-kr').encoding
=> "euc-kr"
irb(main):016:0> Nokogiri::HTML(f,nil,'euc-kr').meta_encoding
=> "euc-kr"
irb(main):017:0> Nokogiri::HTML(f,nil,'euc-kr').xpath('//title').text
=> "새로운 세상을 여는 문, G마켓" # correct!
irb(main):018:0> Nokogiri::HTML(f,nil,'euc-kr').xpath('//title').text.encoding
=> #<Encoding:UTF-8>
---------------------
Using ruby 1.9.1p243, Nokogiri 1.4.1, iconv (GNU libc) 2.5
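
A possible alternative to passing 'euc-kr' as the third argument, sketched here with the standard library only: tag the raw bytes with their real encoding and transcode to UTF-8 before parsing. This is not what the session above does, and it has not been checked against Nokogiri 1.4.1 or ruby 1.9.1 (String#b needs Ruby 2.0 or later). The helper name to_utf8 is hypothetical, and the byte string is the title text copied from the transcript.

```ruby
# Title bytes exactly as open-uri returned them in the transcript (EUC-KR).
# String#b gives a binary (ASCII-8BIT) copy, so no encoding is assumed yet.
raw = "\xBB\xF5\xB7\xCE\xBF\xEE \xBC\xBC\xBB\xF3\xC0\xBB \xBF\xA9\xB4\xC2 \xB9\xAE, G\xB8\xB6\xC4\xCF".b

# Hypothetical helper, not part of Nokogiri or open-uri:
# label the bytes with their actual encoding, then transcode to UTF-8.
def to_utf8(bytes, source_encoding)
  bytes.dup.force_encoding(source_encoding).encode('UTF-8')
end

title = to_utf8(raw, 'EUC-KR')

puts title                  # the decoded title text
puts title.encoding         # UTF-8
puts title.valid_encoding?  # true
```

If the whole page is transcoded this way, the charset=euc-kr meta tag in the markup no longer matches the bytes, so the parser may still need to be told the encoding explicitly (assumption: 'UTF-8' as the third argument to Nokogiri::HTML).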