Skip to content

Instantly share code, notes, and snippets.

@bf4
Created July 8, 2014 15:58
Show Gist options
  • Star 11 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save bf4/d26259acfa29f3b9882b to your computer and use it in GitHub Desktop.
Save bf4/d26259acfa29f3b9882b to your computer and use it in GitHub Desktop.
invalid %-encoding error in application for malformed uri

Question re: should this be rack's job

# config.ru
run ->(env) {
  [ 200, { 'Content-Type' => 'text/html', }, [Rack::Request.new(env).params.inspect] ]
}

rackup

open http://localhost:9292/?utf8=%E2%9C%93&search=&page=2&commit=%u00ABscript%u00BBalert(209);%u00AB/script%u00BB

ArgumentError at /
invalid %-encoding (%u00ABscript%u00BBalert(209))

Ruby	$HOME/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/uri/common.rb: in decode_www_form_component, line 898
Web	GET localhost/

Would you argue that rackup is not an acceptable app server, then? If not, would you accept a PR to rescue the Argument error in rack, warn that it's the app server's job to fix the issue, and re-raise it?


edit: I forgot posted a similar comment in rack/rack#323 (comment)

require 'rack'
require 'logger'
class ExceptionApp
DEFAULT_CONTENT_TYPE = 'text/html'
DEFAULT_CHARSET = 'utf-8'
attr_reader :logger
def initialize(app, stdout=STDOUT)
@app = app
@logger = defined?(Rails.logger) ? Rails.logger : Logger.new(stdout)
end
def call(env)
begin
# calling env.dup here prevents bad things from happening
request = Rack::Request.new(env.dup)
# calling request.params is sufficient to trigger the error
# see https://github.com/rack/rack/issues/337#issuecomment-46453404
request.params
@app.call(env)
rescue ArgumentError => e
raise unless e.message =~ /invalid %-encoding/
message = "BAD REQUEST: Returning 400 due to #{e.message} from request with env #{request.inspect}"
logger.info message
content_type = env['HTTP_ACCEPT'] || DEFAULT_CONTENT_TYPE
status = 400
body = "Bad Request"
return [
status,
{
'Content-Type' => "#{content_type}; charset=#{DEFAULT_CHARSET}",
'Content-Length' => body.bytesize.to_s
},
[body]
]
end
end
end
require_relative 'exception_app'
describe ExceptionApp do
let(:app) do
ExceptionApp.new( ->(env){ [418, {}, ["I'm a teapot."]] }, '/dev/null')
end
let(:bad_param) { "commit=%u00ABscript%u00BBalert(209);%u00AB/script%u00BB" }
let(:env) do
{
"REQUEST_METHOD"=>"GET",
"QUERY_STRING"=> bad_param,
"REQUEST_URI"=>"/?#{bad_param}",
"PATH_INFO"=>"/",
"HTTP_ACCEPT"=>"application/json",
"rack.input" => ""
}
end
it "returns an early 'Bad Request' when given params with a bad encoding" do
status, headers, body = app.call(env)
expect(body.join).to eq("Bad Request")
expect(status).to eq(400)
expect(headers["Content-Type"]).to eq("application/json; charset=utf-8")
expect(headers["Content-Length"]).to eq("11")
end
it "does nothing when no encoding error if found" do
status, headers, body = app.call(env.merge("QUERY_STRING" => "", "REQUEST_URI" => "/"))
expect(body.join).to eq("I'm a teapot.")
expect(status).to eq(418)
expect(headers["Content-Type"]).to be_nil
expect(headers["Content-Length"]).to be_nil
end
end
@bf4
Copy link
Author

bf4 commented Jul 13, 2014

Same middleware with some Rails changes

require 'action_dispatch'
require 'logger'
class ExceptionApp
  DEFAULT_CONTENT_TYPE = 'text/html'
  DEFAULT_CHARSET      = ActionDispatch::Response.default_charset

  attr_reader :logger
  def initialize(app, stdout=STDOUT)
    @app = app
    @logger = defined?(Rails.logger) ? Rails.logger : Logger.new(stdout)
  end

  def call(env)
    begin
      # calling env.dup here prevents bad things from happening
      request = ActionDispatch::Request.new(env.dup)
      # calling request.params is sufficient to trigger the error
      # see https://github.com/rack/rack/issues/337#issuecomment-46453404
      request.params
      @app.call(env)
    rescue ArgumentError => e
      raise unless e.message =~ /invalid %-encoding/
      message = "BAD REQUEST: Returning 400 due to #{e.message} from request with env #{request.inspect}"
      logger.info message
      content_type = request.formats.first || DEFAULT_CONTENT_TYPE
      status = 400
      body   = "Bad Request"
      return [
        status,
        {
           'Content-Type' => "#{content_type}; charset=#{DEFAULT_CHARSET}",
           'Content-Length' => body.bytesize.to_s
        },
        [body]
      ]
    end
  end

end

@bf4
Copy link
Author

bf4 commented Jul 13, 2014

I chose to force the error via Rack::Request.new(env).params even though I know the 'failure' is ultimately from the URI module, since I want to trigger the exception at the Rack level, which might be fixed at some point. There are some other monkey patches people have made, mentioned in those issues, e.g. https://gist.github.com/psychocandy/3130349

@pioz
Copy link

pioz commented Jul 13, 2014

@bf4 Can you create a pull request with this fix to https://github.com/whitequark/rack-utf8_sanitizer ?

@rajesh38
Copy link

I found this solution which is more compact and working well:

class Utf8Sanitizer
  SANITIZE_ENV_KEYS = %w(
    HTTP_REFERER
    PATH_INFO
    REQUEST_URI
    REQUEST_PATH
    QUERY_STRING
  )

  def initialize(app)
    @app = app
  end

  def call(env)
    SANITIZE_ENV_KEYS.each do |key|
      string = env[key].to_s
      valid = URI.decode(string).force_encoding('UTF-8').valid_encoding?
      # Don't accept requests with invalid byte sequence
      return [ 400, { }, [ 'Bad request' ] ] unless valid
    end

    @app.call(env)
  end
end

link: http://dev.mensfeld.pl/2014/03/rack-argument-error-invalid-byte-sequence-in-utf-8/

Tell me which one is better.

@bschaeffer
Copy link

@bf4 What are the bad things env.dup prevents from happening?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment