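# coding: utf-8

# Iconv is part of the standard library up to Ruby 1.9; from Ruby 2.0 on it is a separate gem.
require 'iconv'
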
class ApplicationController < ActionController::Base
  before_filter :normalise_param_encodings

  # Keep the filter helpers private so they cannot be routed to as actions.
  private

  # On M17N-aware VMs (Ruby 1.9+), ensure params from the user are marked with an appropriate
  # encoding.
  #
  # As of Rails 2.3, Rack returns all params tagged as ASCII-8BIT, which causes an exception
  # if a param is mixed with a UTF-8 string or ERB template. Hopefully that will be fixed
  # at some point and this won't be necessary any more.
  #
  # I've read in a few places that most browsers submit data to the server in the same
  # encoding as the last page they received from that server. My brief testing on
  # Firefox 3.0.x confirmed this (for Firefox at least). Firefox also doesn't seem to specify
  # the charset explicitly on either GET or POST requests (unless they're made via AJAX).
  #
  # Since we always serve UTF-8, I assume all data we receive is UTF-8 as well. If it isn't,
  # I sanitise it.
  #
  # In *theory*, request.content_charset would contain the charset of the request, but in
  # practice it never seems to.
  #
  # As well as marking the strings as UTF-8, I also ensure they contain valid UTF-8 data. The
  # iconv technique for doing this is based on
  # http://po-ru.com/diary/fixing-invalid-utf-8-in-ruby-revisited/
  #
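  def normalise_param_encodings
    # Ruby 1.8 strings carry no encoding, so there is nothing to do there.
    return unless String.method_defined?(:force_encoding)

    normalise_object_encoding(params)
  end

  # Recursively walks strings, arrays and hash values (keys are left as they are),
  # fixing each unfrozen string in place.
  def normalise_object_encoding(obj)
    case obj
    when String
      unless obj.frozen?
        obj.force_encoding(Encoding::UTF_8)
        # Round-trip through iconv with //IGNORE to drop invalid byte sequences. A space is
        # appended and then stripped again because iconv can miss invalid bytes at the very
        # end of its input.
        ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
        obj.replace(ic.iconv(obj + ' ')[0..-2])
      end
    when Array
      obj.each { |o| normalise_object_encoding(o) }
    when Hash
      obj.each_value { |v| normalise_object_encoding(v) }
    end
  end
end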
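
The same walk can be tried outside Rails. The sketch below is not part of the original filter: `normalise_utf8` is a hypothetical standalone helper, and it swaps the iconv round-trip for `String#scrub!`, which assumes Ruby 2.1 or later (where Iconv is no longer in the standard library). It is meant only to illustrate the mark-then-sanitise idea on a plain nested hash.

```ruby
# Hypothetical standalone helper, not the original filter.
# Assumes Ruby 2.1+ for String#scrub!; uses it in place of Iconv's //IGNORE.
def normalise_utf8(obj)
  case obj
  when String
    unless obj.frozen?
      # Re-tag the bytes as UTF-8 without changing them...
      obj.force_encoding(Encoding::UTF_8)
      # ...then drop any byte sequences that are not valid UTF-8.
      obj.scrub!('')
    end
  when Array
    obj.each { |o| normalise_utf8(o) }
  when Hash
    obj.each_value { |v| normalise_utf8(v) }
  end
  obj
end

# Simulated params: String#b returns an unfrozen ASCII-8BIT copy, much like
# the binary-tagged strings described in the comments above.
params = {
  'name'   => "caf\xC3\xA9".b,               # valid UTF-8 bytes, wrong tag
  'tags'   => ["ok".b, "bad\xFFbyte".b],     # stray invalid byte in the middle
  'nested' => { 'q' => "tail\xE2\x82".b }    # truncated sequence at the end
}

normalise_utf8(params)
```

After the call, every string in `params` is tagged as UTF-8 and passes `valid_encoding?`. Frozen strings are skipped, as in the filter above.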
