@cjbottaro
Created December 16, 2011 21:56
Convert to UTF-8, using Iconv as a fallback when String#encode doesn't work
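# On Ruby 2.0+, Iconv is no longer in the standard library; install the iconv gem.
require "iconv"

# Returns a UTF-8 version of string with unconvertible or invalid bytes replaced or dropped.
def sanitize_utf8(string)
  # First attempt: transcode, replacing invalid or undefined characters.
  string = string.encode("UTF-8", :invalid => :replace, :undef => :replace)
  begin
    # Probe for leftover invalid bytes. blank? comes from ActiveSupport and is
    # expected to raise ArgumentError when the string is not valid UTF-8.
    string.blank?
  rescue ArgumentError => e
    if e.message == "invalid byte sequence in UTF-8"
      # Fallback: Iconv with //IGNORE drops bytes that are not valid UTF-8.
      # One converter is cached per thread.
      Thread.current["iconv"] ||= Iconv.new('UTF-8//IGNORE', 'UTF-8')
      string = Thread.current["iconv"].iconv(string)
    else
      raise
    end
  end
  string
end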
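
For comparison, here is a minimal sketch of the same idea without Iconv or ActiveSupport, assuming Ruby 2.1 or later, where `String#scrub` is available. The method name `sanitize_utf8_scrub` is hypothetical and not part of the gist. It swaps the Iconv fallback for `scrub("")`, which drops invalid bytes much as `//IGNORE` does. Strings tagged with another encoding are still transcoded with `String#encode`.

```ruby
# Hypothetical alternative to the gist's sanitize_utf8 (assumes Ruby >= 2.1).
def sanitize_utf8_scrub(string)
  if string.encoding == Encoding::UTF_8
    # Already tagged UTF-8: drop any invalid bytes, similar to Iconv's //IGNORE.
    string.scrub("")
  else
    # Tagged with another encoding: transcode, replacing what cannot be converted.
    string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  end
end

# A UTF-8 tagged string that contains one invalid byte.
dirty = "abc\xFFdef"
clean = sanitize_utf8_scrub(dirty)

# A Latin-1 string, transcoded rather than scrubbed.
latin1 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
converted = sanitize_utf8_scrub(latin1)
```

Checking the tagged encoding up front means the result does not depend on how `encode` treats a string whose source and target encodings are both UTF-8.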