@iaingray
Created June 1, 2014 11:41
Ruby: convert strings to clean UTF-8
# Converting ASCII-8BIT to UTF-8 based on domain-specific guesses
if new_value.is_a? String
  begin
    # Try it as UTF-8 directly
    cleaned = new_value.dup.force_encoding('UTF-8')
    unless cleaned.valid_encoding?
      # Some of it might be old Windows code page
      cleaned = new_value.encode('UTF-8', 'Windows-1252')
    end
    new_value = cleaned
  rescue EncodingError
    # Force it to UTF-8, throwing out invalid bits
    new_value.encode!('UTF-8', invalid: :replace, undef: :replace)
  end
end
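
The snippet above works on a local variable named `new_value`. As a rough sketch of how the same three guesses could be wrapped in a reusable method, the version below tries UTF-8 first, then Windows-1252, then falls back to replacing whatever is left. The method name `to_clean_utf8` is hypothetical and not part of the gist. The fallback here also re-tags the bytes as binary before replacing them, which is an assumption about the intended behavior rather than something the original does.

```ruby
# Hypothetical helper (name not from the gist): returns a valid UTF-8 copy
# of +value+, or +value+ itself when it is not a String.
def to_clean_utf8(value)
  return value unless value.is_a?(String)

  # Guess 1: the bytes are already valid UTF-8 and only mislabelled.
  cleaned = value.dup.force_encoding('UTF-8')
  return cleaned if cleaned.valid_encoding?

  begin
    # Guess 2: the bytes come from the old Windows-1252 code page.
    value.encode('UTF-8', 'Windows-1252')
  rescue EncodingError
    # Last resort (assumption): treat the bytes as raw binary and swap
    # every byte that cannot be converted for a replacement character.
    value.dup.force_encoding('BINARY')
         .encode('UTF-8', invalid: :replace, undef: :replace)
  end
end
```

Unlike the in-place `encode!` in the original, this sketch never mutates its argument, so it can also be called on frozen strings.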