@john1king
Created November 20, 2011 15:18
to utf-8
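def txt_to_utf(file)
  # Read in binary mode so the raw bytes are preserved without any encoding guess.
  File.open(file, "rb") do |f|
    s = f.read
    # shift_jis must come first: Shift_JIS text is very likely to be misdetected as gbk.
    x = %w(shift_jis gbk big5)
    begin
      u = s.dup.encode("utf-8", x.shift)
    rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
      # Try the next candidate encoding while any remain.
      retry unless x.empty?
      # All candidates failed; check the first three bytes for the UTF-8 BOM (EF BB BF).
      if s.unpack("C3") == [0xEF, 0xBB, 0xBF]
        # Handle UTF-8 with BOM files here.
      end
    else
      # Handle shift-jis, gbk, big5 files here; u holds the converted UTF-8 text.
    end
  end
end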
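
The trial-decoding idea above can also be sketched as a helper that works on a byte string and reports which encoding matched. This is only an illustrative variant, not the gist author's code: the name `to_utf8_with_guess` and its return shape are assumptions, and unlike the original it checks for the UTF-8 BOM before trying the legacy encodings.

```ruby
# Hypothetical helper (not part of the original gist).
# Returns [utf8_text, encoding_name], or nil when no candidate fits.
def to_utf8_with_guess(bytes, candidates = %w(shift_jis gbk big5))
  # Assumption: a leading EF BB BF means the data is UTF-8 with a BOM.
  if bytes.unpack("C3") == [0xEF, 0xBB, 0xBF]
    text = bytes.byteslice(3, bytes.bytesize - 3).force_encoding("utf-8")
    return [text, "utf-8"]
  end

  # Try each candidate in order; shift_jis stays first for the reason
  # given in the original comment.
  candidates.each do |name|
    begin
      return [bytes.encode("utf-8", name), name]
    rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
      next
    end
  end
  nil
end
```

A caller could pass it `File.binread(path)` and branch on the returned encoding name. Because the first candidate that decodes without error wins, the result is a guess rather than a guarantee.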