Skip to content

Instantly share code, notes, and snippets.

@jrochkind
Created April 18, 2012 19:44
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jrochkind/2416043 to your computer and use it in GitHub Desktop.
Save jrochkind/2416043 to your computer and use it in GitHub Desktop.
sample ruby 1.9 validate_encoding hacky implementation
# Pass in a string, will raise an Encoding::InvalidByteSequenceError
# if it contains an invalid byte for it's encoding; otherwise
# returns an equivalent string.
#
# OR, like String#encode, pass in option `:invalid => :replace`
# to replace invalid bytes with a replacement string in the
# returned string. Pass in the
# char you'd like with option `:replace`, or will, like String#encode
# use the unicode replacement char if it thinks it's a unicode encoding,
# else ascii '?'.
#
# in any case, method will raise, or return a new string
# that is #valid_encoding?
def validate_encoding(str, options = {})
str.chars.collect do |c|
if c.valid_encoding?
c
else
unless options[:invalid] == :replace
# it ought to be filled out with all the metadata
# this exception usually has, but what a pain!
# Why isn't ruby doing this for us?
raise Encoding::InvalidByteSequenceError.new
else
options[:replace] || (
# surely there's a better way to tell if
# an encoding is a 'Unicode encoding form'
# than this? What's wrong with you ruby 1.9?
str.encoding.name.start_with?('UTF') ?
"\uFFFD" :
"?" )
end
end
end.join
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment