@AMekss
Created October 9, 2012 10:56
String#normalize_encoding! for working with differently encoded strings in Ruby 1.9.3
# -*- encoding : utf-8 -*-
class String
  # Re-encodes the string in place so that it is valid in Encoding.default_internal,
  # replacing any bytes that cannot be converted.
  # Handy for strings whose encoding may vary (imports of all kinds, free-text input).
  def normalize_encoding!
    return self unless defined?(Encoding) # encodings exist only in Ruby 1.9+
    # Assumption: fall back to UTF-8 when default_internal is unset (nil, Ruby's default),
    # because force_encoding(nil) would raise a TypeError.
    target = Encoding.default_internal || Encoding::UTF_8
    encoding_equal_to_target = (encoding == target)
    # Return unchanged if the encoding already is the target and the bytes are valid.
    return self if encoding_equal_to_target && valid_encoding?
    # Try relabeling the bytes as the target encoding; keep that if the result is valid.
    return force_encoding(target) if dup.force_encoding(target).valid_encoding?
    if encoding_equal_to_target
      # The string is tagged with the target encoding but contains invalid bytes.
      # Retag it with some other encoding so that String#encode! actually transcodes,
      # and replace unconvertible bytes instead of raising.
      non_target_encoding = Encoding.list.detect { |enc| enc != target }
      force_encoding(non_target_encoding).encode!(target, :undef => :replace, :invalid => :replace)
    else
      # Options are passed without braces so the call also works on Ruby 3.
      encode!(target, encoding, :undef => :replace, :invalid => :replace)
    end
  end
end
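
The three outcomes described in the comments above (bytes that only need relabeling, bytes that need transcoding, and bytes that are invalid in the target encoding) can be tried out with a small standalone sketch. It is not part of the gist: `normalized_copy` is a hypothetical, non-mutating helper, and it assumes UTF-8 as the target rather than reading `Encoding.default_internal`.

```ruby
# encoding: utf-8
# Hypothetical helper for illustration only; assumes UTF-8 as the target encoding.
def normalized_copy(str, target = Encoding::UTF_8)
  copy = str.dup
  # Case 0: already tagged with the target encoding and valid.
  return copy if copy.encoding == target && copy.valid_encoding?
  # Case 1: the bytes are valid in the target encoding, so relabeling is enough.
  relabeled = copy.dup.force_encoding(target)
  return relabeled if relabeled.valid_encoding?
  # Case 2 and 3: transcode. If the string is already tagged with the target,
  # treat it as binary so that encode does real work instead of being a no-op.
  source = copy.encoding == target ? Encoding::BINARY : copy.encoding
  copy.force_encoding(source).encode(target, :invalid => :replace, :undef => :replace)
end

# UTF-8 bytes that arrived tagged as binary: relabeling is sufficient.
mislabeled = "caf\xC3\xA9".b
# Genuine ISO-8859-1 bytes: these need transcoding.
latin1 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
# Tagged as UTF-8 but not valid UTF-8: the bad byte gets replaced.
broken = "caf\xE9".dup.force_encoding(Encoding::UTF_8)

RESULTS = [mislabeled, latin1, broken].map { |s| normalized_copy(s) }
RESULTS.each { |s| puts "#{s.encoding} valid=#{s.valid_encoding?} #{s.inspect}" }
```

Working on a copy keeps the caller's string untouched, which makes the cases easier to compare side by side; the gist's bang method mutates the receiver instead.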