Skip to content

Instantly share code, notes, and snippets.

@bf4 bf4/comment.md
Created Jul 8, 2014

Embed
What would you like to do?
invalid %-encoding error in application for malformed uri

https://github.com/rack/rack/issues/337#issuecomment-46453404

Question re: should this be rack's job

# config.ru
run ->(env) {
  [ 200, { 'Content-Type' => 'text/html', }, [Rack::Request.new(env).params.inspect] ]
}

rackup

open http://localhost:9292/?utf8=%E2%9C%93&search=&page=2&commit=%u00ABscript%u00BBalert(209);%u00AB/script%u00BB

ArgumentError at /
invalid %-encoding (%u00ABscript%u00BBalert(209))

Ruby	$HOME/.rvm/rubies/ruby-1.9.3-p484/lib/ruby/1.9.1/uri/common.rb: in decode_www_form_component, line 898
Web	GET localhost/

Would you argue that rackup is not an acceptable app server, then? If not, would you accept a PR to rescue the Argument error in rack, warn that it's the app server's job to fix the issue, and re-raise it?


edit: I forgot posted a similar comment in https://github.com/rack/rack/issues/323#issuecomment-46185906

require 'rack'
require 'logger'
class ExceptionApp
DEFAULT_CONTENT_TYPE = 'text/html'
DEFAULT_CHARSET = 'utf-8'
attr_reader :logger
def initialize(app, stdout=STDOUT)
@app = app
@logger = defined?(Rails.logger) ? Rails.logger : Logger.new(stdout)
end
def call(env)
begin
# calling env.dup here prevents bad things from happening
request = Rack::Request.new(env.dup)
# calling request.params is sufficient to trigger the error
# see https://github.com/rack/rack/issues/337#issuecomment-46453404
request.params
@app.call(env)
rescue ArgumentError => e
raise unless e.message =~ /invalid %-encoding/
message = "BAD REQUEST: Returning 400 due to #{e.message} from request with env #{request.inspect}"
logger.info message
content_type = env['HTTP_ACCEPT'] || DEFAULT_CONTENT_TYPE
status = 400
body = "Bad Request"
return [
status,
{
'Content-Type' => "#{content_type}; charset=#{DEFAULT_CHARSET}",
'Content-Length' => body.bytesize.to_s
},
[body]
]
end
end
end
require_relative 'exception_app'
describe ExceptionApp do
let(:app) do
ExceptionApp.new( ->(env){ [418, {}, ["I'm a teapot."]] }, '/dev/null')
end
let(:bad_param) { "commit=%u00ABscript%u00BBalert(209);%u00AB/script%u00BB" }
let(:env) do
{
"REQUEST_METHOD"=>"GET",
"QUERY_STRING"=> bad_param,
"REQUEST_URI"=>"/?#{bad_param}",
"PATH_INFO"=>"/",
"HTTP_ACCEPT"=>"application/json",
"rack.input" => ""
}
end
it "returns an early 'Bad Request' when given params with a bad encoding" do
status, headers, body = app.call(env)
expect(body.join).to eq("Bad Request")
expect(status).to eq(400)
expect(headers["Content-Type"]).to eq("application/json; charset=utf-8")
expect(headers["Content-Length"]).to eq("11")
end
it "does nothing when no encoding error if found" do
status, headers, body = app.call(env.merge("QUERY_STRING" => "", "REQUEST_URI" => "/"))
expect(body.join).to eq("I'm a teapot.")
expect(status).to eq(418)
expect(headers["Content-Type"]).to be_nil
expect(headers["Content-Length"]).to be_nil
end
end
@bf4

This comment has been minimized.

Copy link
Owner Author

commented Jul 13, 2014

Same middleware with some Rails changes

require 'action_dispatch'
require 'logger'
class ExceptionApp
  DEFAULT_CONTENT_TYPE = 'text/html'
  DEFAULT_CHARSET      = ActionDispatch::Response.default_charset

  attr_reader :logger
  def initialize(app, stdout=STDOUT)
    @app = app
    @logger = defined?(Rails.logger) ? Rails.logger : Logger.new(stdout)
  end

  def call(env)
    begin
      # calling env.dup here prevents bad things from happening
      request = ActionDispatch::Request.new(env.dup)
      # calling request.params is sufficient to trigger the error
      # see https://github.com/rack/rack/issues/337#issuecomment-46453404
      request.params
      @app.call(env)
    rescue ArgumentError => e
      raise unless e.message =~ /invalid %-encoding/
      message = "BAD REQUEST: Returning 400 due to #{e.message} from request with env #{request.inspect}"
      logger.info message
      content_type = request.formats.first || DEFAULT_CONTENT_TYPE
      status = 400
      body   = "Bad Request"
      return [
        status,
        {
           'Content-Type' => "#{content_type}; charset=#{DEFAULT_CHARSET}",
           'Content-Length' => body.bytesize.to_s
        },
        [body]
      ]
    end
  end

end
@bf4

This comment has been minimized.

Copy link
Owner Author

commented Jul 13, 2014

I chose to force the error via Rack::Request.new(env).params even though I know the 'failure' is ultimately from the URI module, since I want to trigger the exception at the Rack level, which might be fixed at some point. There are some other monkey patches people have made, mentioned in those issues, e.g. https://gist.github.com/psychocandy/3130349

@pioz

This comment has been minimized.

Copy link

commented Jul 13, 2014

@bf4 Can you create a pull request with this fix to https://github.com/whitequark/rack-utf8_sanitizer ?

@rajesh38

This comment has been minimized.

Copy link

commented Dec 21, 2014

I found this solution which is more compact and working well:

class Utf8Sanitizer
  SANITIZE_ENV_KEYS = %w(
    HTTP_REFERER
    PATH_INFO
    REQUEST_URI
    REQUEST_PATH
    QUERY_STRING
  )

  def initialize(app)
    @app = app
  end

  def call(env)
    SANITIZE_ENV_KEYS.each do |key|
      string = env[key].to_s
      valid = URI.decode(string).force_encoding('UTF-8').valid_encoding?
      # Don't accept requests with invalid byte sequence
      return [ 400, { }, [ 'Bad request' ] ] unless valid
    end

    @app.call(env)
  end
end

link: http://dev.mensfeld.pl/2014/03/rack-argument-error-invalid-byte-sequence-in-utf-8/

Tell me which one is better.

@bschaeffer

This comment has been minimized.

Copy link

commented Feb 9, 2017

@bf4 What are the bad things env.dup prevents from happening?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.