Land Mine named "Content-Disposition > filename"

TL;DR

  • I found vulnerabilities in 1 browser, 1 programming language, and 15 { Web Frameworks, HTTP Client libraries, Email libraries / Web Services, etc. }.
  • All of these vulnerabilities were found by looking from a single perspective (I investigated roughly 50-80 products).
  • The RFCs state the requirements that relate to this problem in a rather confusing way, while the WHATWG HTML Spec documents them well.
  • The problem specifically concerns the filename and filename* parameters of Content-Disposition.
  • This problem affects HTTP Request/Response/Email in different ways.
    • HTTP Request : request tampering (especially with file contents, tainting of other fields, etc.)
    • HTTP Response : Reflected File Download (RFD) vulnerability
    • Email : Attachment tampering (e.g., extension and filename tampering and potential file content tampering)
  • Not many people currently see Content-Disposition (filename, filename*) as an obvious attack vector for these attacks.
  • I haven't seen a single OWASP publication that summarizes this area properly. ASVS has an Issue on this.
  • Make sure to escape filename and filename* in Content-Disposition.
    • filename:
      • " --> \" or %22
      • \r --> \\r or %0D
      • \n --> \\n or %0A
    • filename*:
      • URL Encode with proper formatting
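The escaping rules above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any of the products discussed in this article. The function names escape_filename and encode_filename_star are made up for this example, the %XX variants are used for filename, and the characters left unencoded for filename* are assumed to be the attr-char set of RFC 5987.

```python
from urllib.parse import quote

# Punctuation allowed unencoded in an RFC 5987 ext-value (attr-char).
# Letters and digits are always left untouched by quote().
ATTR_CHAR_PUNCTUATION = "!#$&+-.^_`|~"


def escape_filename(name: str) -> str:
    """Escape the three characters listed above for a quoted filename="..." value."""
    return name.replace('"', "%22").replace("\r", "%0D").replace("\n", "%0A")


def encode_filename_star(name: str) -> str:
    """Build the value of filename* as UTF-8''<percent-encoded name>."""
    return "UTF-8''" + quote(name, safe=ATTR_CHAR_PUNCTUATION, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical attacker-controlled file name.
    crafted = 'a.txt"; dummy=".txt'
    print('filename="' + escape_filename(crafted) + '"')
    print("filename*=" + encode_filename_star(crafted))
```

Only the three characters named in the list are handled for filename; anything beyond that (for example, how a backslash should be treated) is outside the scope of this sketch.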

Introduction

This article covers the vulnerabilities I found through research conducted between 2018 and 2022. That period looks very long, but only because my motivation came and went and there were long stretches in which I did not investigate at all. The actual research time was about three to six months.

Research revealed vulnerabilities in the following products

  • 1 browser
  • 1 programming language
  • 15 {Web Framework / Library / Web Service } (The wording is deliberately vague because some of these have not been fixed yet.)

The problems we found fall into three main categories

  1. Content tampering vulnerability during payload generation due to insufficient escaping of filename on multipart/form-data > Content-Disposition in HTTP Request
  2. Reflected File Download vulnerability due to insufficient escaping of filename and filename* in Content-Disposition header in HTTP Response
  3. Content tampering vulnerability due to insufficient escaping of filename in the multipart > Content-Disposition of Email

The root cause is the same in all three categories; only the place where it occurs differs.

When we first discovered traces of the problem, we judged it to be an implementation error on the part of the Web Framework that did not comply with the RFC. However, after reporting the problem to several Web Frameworks, we received the comment, "It must be an implementation problem on the browser side." We reported the vulnerability to the browser (Firefox), and it was successfully fixed (after two years of inactivity).

This seemed to be the end of the project, but an article by one person led me to discover a new perspective, and I was motivated to do additional research after several years.

The vulnerabilities whose contents have already been disclosed are as follows

  • Firefox (No-CVE, https://bugzilla.mozilla.org/show_bug.cgi?id=1556711 )
  • Python (No-CVE, python/cpython#100612 )
  • apache/httpcomponents-client (Java, CVE None, Fixed)
  • Sinatra (Ruby, CVE-2022-45442, Fixed)
  • Ktor (Kotlin, CVE-2022-38179, Fixed)
  • Django (Python, CVE-2022-36359, Fixed)
  • iris (Golang, CVE None, Fixed)
  • httparty (Ruby, Github Advisory (GHSA-5pq7-52mg-hr42), Fixed)
  • Other (Not yet fixed)

In addition, the issues listed here are just the tip of the iceberg, and many blog posts that show how to set Content-Disposition do not mention the vulnerable implementation pattern this article deals with. Even the RFC is somewhat vague, and advice on countermeasures is not widely known. This suggests that the issue is understood by only a few people (perhaps the maintainers of well-known frameworks and some security researchers).

In fact, a discussion on this issue is currently underway in OWASP ASVS v5.

OWASP/ASVS - proposal/new requirement - served filename in content-disposition header must follow correct encoding

This article will summarize the results of these series of studies.

Notes

Quite a bit of time has passed, so some parts of my records can no longer be traced back. I have marked such uncertain passages with (vague).

I wondered whether I should wait for all the problems I found to be fixed before publishing this article, but I decided to publish it for the following reasons.

  • I made every effort I could think of, but in some cases I never heard back. I sent a Patch / Unit Test / Detailed Report / Reference to a Security Advisory for publication, etc., but either never heard from them or the fix was not merged.
  • Vulnerability was found in a library that is not maintained (we contacted them, but they did not reply).
  • Since there seems to be no public awareness of this issue of Content-Disposition in the first place, we decided that it would be better to make the issue known as soon as possible.

In fact, 99% of the implementations of Content-Disposition introduced in (Japanese) blogs etc. have no warning about this issue. Of course, there are cases where the Framework/Library implicitly performs escaping and the like. In other cases, they do nothing.

Therefore, we will provide a detailed report on the root cause of this problem. Of course, for libraries that have not been fixed, the article will not mention any names or problem areas.

Finding and Reporting Vulnerabilities

It was a long time ago that I first found signs of a vulnerability. At the time, we were looking for vulnerabilities in our web application (built with Spring Framework), and I was testing it with simple fuzzing files.

I was testing file uploads, etc. using files like the following

example";';.png

When testing file upload and other functions using such data, we found that some functions returned a 500 Error.

I was excited that there might be some kind of injection vulnerability, so we analyzed the behavior, but the results were disappointing. After reducing the input to a minimal proof of concept, it seemed that the error occurred whenever "; was included in the file name.

To be sure, I checked the error log on the server side to see if it was "just a bug" or if my Injection skills were insufficient.

I found the following error logs

IllegalArgumentException("Invalid content disposition format");

https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L316

Apparently, a formatting error occurred in Content-Disposition (described below) when uploading the file.
Content-Disposition is a header used in requests, responses, etc. In a request it appears inside multipart/form-data and carries information such as the file name.

So it is not a basic problem such as Injection, as I had originally imagined. That is unfortunate.
However, it is the kind of problem that cannot be judged at a glance, and it could lead to a major vulnerability if investigated further!

So let's investigate.

Content-Disposition and multipart/form-data

Before we begin, let us preliminarily discuss Content-Disposition and multipart/form-data, which are relevant to this article.

<form> & multipart/form-data

multipart/form-data is a MIME type. It is one of the encoding types available for forms (<form>) in HTML.

The encoding types that can be sent with <form> are as follows.

  • application/x-www-form-urlencoded
  • multipart/form-data
  • text/plain

Of these, only multipart/form-data can send binary data (using the <form> method).

For example, the following form sends its file input as multipart/form-data.

<form action="file_upload" method="post" enctype="multipart/form-data">
	<input type="text" name="name">
	<input type="file" name="avatar">
	<input type="submit">
</form>

<!-- Both method="post" and enctype="multipart/form-data" are required. -->
<!-- Without them the form is sent as application/x-www-form-urlencoded and only the file name is transmitted, not the file contents. -->

multipart/form-data format

multipart/form-data has the following format.

Note: These are excerpts from the HTTP Header / Body. Omitted parts are marked as ....

HTTP Header

Content-Type: multipart/form-data; boundary=----RandomValue123

HTTP Body

------RandomValue123
Content-Disposition: form-data; name="name"

Taro
------RandomValue123
Content-Disposition: form-data; name="profile"; filename="profile.txt"
Content-Type: text/plain

Hello
World!
------RandomValue123
Content-Disposition: form-data; name="avatar"; filename="avatar.png"
Content-Type: image/png

{Binary Data}
------RandomValue123--

As shown above, the Content-Type multipart/form-data is specified in the Header section, together with a boundary value such as boundary=----RandomValue123. In the Body section, each parameter is introduced by a delimiter line consisting of -- followed by the boundary value. Only the last line of the Body has the form -- + ----RandomValue123 + --, which indicates the end of the parameters.

This boundary value is used to separate the parameters in the Body section. Each separated parameter is called a part. In the part, data may be expressed using Content-Disposition, which is the subject of this issue.
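To make the layout concrete, here is a minimal Python sketch that serializes a list of parts into a multipart/form-data body. It is an illustration only: build_multipart, escape_param and the dictionary keys are names invented for this example, and the escaping applied to name and filename is the %22 / %0D / %0A replacement that the HTML specification requires of browsers.

```python
def escape_param(value: str) -> str:
    # Escaping required by the WHATWG HTML spec for field names and filenames.
    return value.replace("\n", "%0A").replace("\r", "%0D").replace('"', "%22")


def build_multipart(parts: list, boundary: str) -> bytes:
    """Serialize parts into a multipart/form-data body.

    Each part is a dict with "name" and "data" (bytes), plus optional
    "filename" and "content_type" keys for file fields.
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for part in parts:
        disposition = 'form-data; name="' + escape_param(part["name"]) + '"'
        if "filename" in part:
            disposition += '; filename="' + escape_param(part["filename"]) + '"'
        lines.append(delimiter)                      # start of a new part
        lines.append(("Content-Disposition: " + disposition).encode("utf-8"))
        if "content_type" in part:
            lines.append(("Content-Type: " + part["content_type"]).encode("ascii"))
        lines.append(b"")                            # blank line ends the part headers
        lines.append(part["data"])                   # part body
    lines.append(delimiter + b"--")                  # closing delimiter
    return b"\r\n".join(lines) + b"\r\n"


if __name__ == "__main__":
    body = build_multipart(
        [
            {"name": "name", "data": b"Taro"},
            {"name": "profile", "filename": "profile.txt",
             "content_type": "text/plain", "data": b"Hello\r\nWorld!"},
        ],
        boundary="----RandomValue123",
    )
    print(body.decode("utf-8"))
```

The boundary passed in is the value from the Content-Type header; the two extra hyphens in front of it are added when the body is written.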

About "Content-Disposition"

I took a detour, but now let's talk about Content-Disposition, the subject of this article.

To be honest, I feel that reading MDN is the most accurate and easiest way to understand the subject, so I will post the URL.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition

Content-Disposition is a header used in HTTP Response / Request (inside multipart/form-data) / Email.

In the case of Response, it controls what action is taken based on the MIME type of the target response file.
For example

  • If the MIME type is image/png and Content-Disposition: inline, the file is displayed in the browser.
  • If the MIME type is image/png and Content-Disposition: attachment, the file is downloaded.
  • If the MIME type is image/png and Content-Disposition: attachment; filename="abc.jpg", the file is downloaded as "abc.jpg".
  • If the MIME type is image/png and Content-Disposition: attachment; filename*=utf-8''{URL-encoded filename}.jpg, the URL-encoded value is decoded and the file is downloaded under the decoded name (non-ASCII characters are also supported).

Thus, Content-Disposition can control how the file is handled. Also, the filename* field appears only in the HTTP Response case.

On the other hand, when it appears inside the multipart/form-data of a Request, it expresses information about the target parameter (the field name of the form or the file name).

For example, Content-Disposition: form-data; name="picture"; filename="filename.jpg" means the following.

  • The parameter name is picture (equivalent to the name attribute of <input>).
  • The file name is filename.jpg.

The form-data part at the beginning of Content-Disposition: form-data; is fixed boilerplate. The first directive in multipart/form-data at request time is always this value.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition#as_a_header_for_a_multipart_body

As a header for a multipart body ... The first directive is always form-data

For reference, the grammar of Content-Disposition defined in RFC 2183 consists of the following elements.

  • disposition
  • disposition-type
  • disposition-parm

Each of these is further defined as follows.

  • disposition
    • disposition-type (e.g. "form-data")
    • disposition-parm (zero or more, separated by ;)
  • disposition-type
    • "inline"
    • "attachment"
    • extension-token
  • disposition-parm
    • filename
    • creation-date
    • modification-date
    • read-date
    • size
    • parameter

The above elements are defined in RFC 2183 [page 2].
https://datatracker.ietf.org/doc/html/rfc2183

Investigation of the cause of the error

Now the groundwork is in place. Let's look at why the parsing error occurred.

At the beginning of this article, I mentioned that I sent the file example";';.png to the upload function, but I did not show the payload of the HTTP Request sent at that time, so let's take a look.

The following is the request as sent by Firefox at that time. (Note: current versions of Firefox no longer send this kind of request, because the issue has been fixed.)

sent file:

example";';.png

HTTP Header of the multipart/form-data request sent:

POST /form HTTP/1
...
Content-Type: multipart/form-data; boundary=---------------------------277224600214918072883416139191

HTTP Body of multipart/form-data request sent:

-----------------------------277224600214918072883416139191
Content-Disposition: form-data; name="file"; filename="example";';.png"
Content-Type: image/png

{Binary Data}
-----------------------------277224600214918072883416139191--

Server-side errors encountered (Spring)
https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L316

As you may have already noticed, " is not escaped in the filename of the request HTTP Body.

part of HTTP Header:

Content-Disposition: form-data; name="file"; filename="example";';.png"



Problem Location:

filename="example";';.png"
                 ^
                 | Here

Most likely, this " is causing the parameters to be parsed incorrectly in the server-side application.
The " delimits the value of a parameter, so leaving it unescaped can fool the parser.

In other words, if (in some attack scenario) a crafted filename can be sent, it may be possible to tamper with or spoof parameters or fields and adversely affect a victim.

If this is the case, we need to look at the parser code in order to know how to create a payload that can trick the parser and which fields can be exploited. First, let's look at the parser function that threw the Exception.

Note: In the end it turns out that the case targeting this parser cannot be exploited in an attack, but it is relevant to what we discuss later, so it is included here for readers who want the details. You can skip this section if you wish.

Parser Process

The parse() that caused the Exception is the parser for the field defined in RFC 2183.

Parse a {@literal Content-Disposition} header value as defined in RFC 2183.
https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L252

This parse() returns a ContentDisposition object as the parsed result.

return new ContentDisposition(type, name, filename, charset, size, creationDate, modificationDate, readDate);

https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L319

  • type
  • name
  • filename
  • charset
  • size
  • creationDate
  • modificationDate
  • readDate

So the fields that the parser can be fooled into setting are limited to these. Let's look at the detailed implementation.

First, parse() takes a String containing the Content-Disposition header and splits it into a List<String> called parts using the tokenize() function.

public static ContentDisposition parse(String contentDisposition) {
		List<String> parts = tokenize(contentDisposition);

https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L257-L258

The tokenize() function scans for the delimiters [ ; " \ ] and splits the header into fields while maintaining a flag for each delimiter ("Is it currently inside a Quote?", "Is the current char an escape character?").

https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L322-L357

For this tokenize(), if you give the Content-Disposition header string from earlier (where the browser is sending the wrong data), it will be broken down into the following set of parts.

Input:
Content-Disposition: form-data; name="file"; filename="example";';.png"

Output:
(ArrayList String)
  [0] = Content-Disposition: form-data
  [1] = name="file"
  [2] = filename="example"
  [3] = '
  [4] = .png"

The Content-Disposition given as Input contains a " that is not escaped with \, so the parsing result is distorted as you can see.

Next, let's look at the result when it is properly escaped ( " converted to \" ).

(ArrayList String)
  [0] = Content-Disposition: form-data
  [1] = name="file"
  [2] = filename="example\";';.png"

If the escaping is done properly, we see that the result is consistent with our intuition.
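The two results above can be reproduced with a small Python approximation of a quote-aware tokenizer. This is a simplified sketch written for illustration, not Spring's actual tokenize(). It only assumes the behavior described above: split on ; outside of quotes, and let \ escape the character that follows it.

```python
def tokenize(header: str) -> list:
    """Split a Content-Disposition header on ';' that is not inside quotes."""
    parts = []
    current = []
    quoted = False    # are we currently inside "..."?
    escaped = False   # was the previous character an unescaped backslash?
    for ch in header:
        if ch == ";" and not quoted:
            parts.append("".join(current).strip())
            current = []
        else:
            if ch == '"' and not escaped:
                quoted = not quoted
            current.append(ch)
        escaped = (ch == "\\" and not escaped)
    parts.append("".join(current).strip())
    return [p for p in parts if p]   # drop empty fragments


if __name__ == "__main__":
    broken = 'Content-Disposition: form-data; name="file"; filename="example";\';.png"'
    fixed = 'Content-Disposition: form-data; name="file"; filename="example\\";\';.png"'
    for header in (broken, fixed):
        print(header)
        for index, part in enumerate(tokenize(header)):
            print(f"  [{index}] = {part}")
```

With the unescaped header the file name is cut at the first ", and the remainder becomes two stray parts; with the escaped header the whole file name stays in one part.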

Now, the next question is how far we can push this broken format (without triggering errors) to fake the values of predefined fields. At the moment we get an Exception, so we need to craft something that looks fine from the parser's point of view (although it is in fact broken).

In parse(), the following fields are extracted as (key, value) pairs from the result of tokenize().

  • type
  • name
  • filename
  • charset
  • size
  • creationDate
  • modificationDate
  • readDate

The detailed processing of parse() is as follows (the fixed leading part, Content-Disposition: form-data, is skipped).

https://github.com/spring-projects/spring-framework/blob/4f05da7fed7e55d0744a91e4ac384d8f5df6e665/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L268-L317

  1. Decompose the field (find = to split it into an attribute/value pair; Exception if there is no =)
    (Since there are two forms of value pairs (key=value, key="value"), the extraction is performed with this in mind).
  2. If attribute is name, store value in name variable
  3. If attribute is filename*, store value in filename variable, taking care of ' (because its format differs from the other attributes)
  4. If attribute is filename, store value in filename variable
  5. If attribute is size, store value in size variable
  6. If attribute is creation-date, store value in creationDate variable
  7. If attribute is modification-date, store value in modificationDate variable
  8. If attribute is read-date, store value in readDate variable

Now I finally figured out why the Exception was happening.
In the broken Content-Disposition described earlier, tokenize() created a part that did not contain a single =.

Output:
(ArrayList String)
  [0] = Content-Disposition: form-data
  [1] = name="file"
  [2] = filename="example"
  [3] = '
  [4] = .png"

So, it seemed to be caught in a place inside parse() that raises an Exception if there is no =.

In other words, the file name must satisfy the following.

  1. Produce only parts that contain =, by making full use of " and so on in the file name.
  2. Use key=value (or key="value") format.
  3. Set attribute to name, filename, creation-date, etc.

If we can create a file name that satisfies all of the above, it could (possibly) be used for some kind of attack. So I created the following file name.

file name:
a.txt"; dummy=".txt


Content-Disposition generated by Firefox
Content-Disposition: form-data; name="file"; filename="a.txt"; dummy=".txt"


The parts of the result parsed by tokenize():
[0] = Content-Disposition: form-data
[1] = name="file"
[2] = filename="a.txt"
[3] = dummy=".txt"

Great, it worked. At least I was able to fool tokenize() without any Exception.

Now let's turn the generated dummy into several separate fields and see the final parse() result.

file name:
a.txt"; name="dummy"; filename="dummy"; size=1234; dummy=".txt


Content-Disposition generated by Firefox:
Content-Disposition: form-data; name="file"; filename="a.txt"; name="dummy"; filename="dummy"; size=1234; dummy=".txt"


parse() result:

type = "Content-Disposition: form-data"
name = "dummy"
filename = "a.txt"
charset = null   (a value that is only set for filename*)
size = {Long@438} 1234
creationDate = null
modificationDate = null
readDate = null

It works! I was able to overwrite the form's name attribute from file to dummy, while the file name was parsed as a.txt.
I was also able to set the file size to 1234, and the dummy field, which is not defined in the RFC, was ignored without causing any problems, so it is perfect.
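The overwrite can be illustrated with a simplified Python sketch of the attribute handling. This is not Spring's code: parse_parts is a hypothetical helper that takes the already tokenized parts, handles only name, filename and size, and assumes that a later name overwrites an earlier one while the first filename is kept. That precedence is chosen only because it matches the parse() result shown above.

```python
def parse_parts(parts: list) -> dict:
    """Extract name, filename and size from tokenized Content-Disposition parts."""
    result = {"type": parts[0], "name": None, "filename": None, "size": None}
    for part in parts[1:]:
        eq_index = part.find("=")
        if eq_index == -1:
            # Same failure mode as the Exception seen in the server log.
            raise ValueError("Invalid content disposition format")
        attribute = part[:eq_index].strip()
        value = part[eq_index + 1:].strip()
        # Accept both key=value and key="value".
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        if attribute == "name":
            result["name"] = value          # assumption: the last name wins
        elif attribute == "filename" and result["filename"] is None:
            result["filename"] = value      # assumption: the first filename wins
        elif attribute == "size":
            result["size"] = int(value)
        # Unknown attributes such as "dummy" are silently ignored.
    return result


if __name__ == "__main__":
    crafted_parts = [
        "Content-Disposition: form-data",
        'name="file"',
        'filename="a.txt"',
        'name="dummy"',
        'filename="dummy"',
        "size=1234",
        'dummy=".txt"',
    ]
    print(parse_parts(crafted_parts))
```

Feeding it the five parts produced from the first broken header (which include ' and .png") raises the ValueError instead, because those parts contain no =.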

Now all we have to do is to look at how the fields we were able to trick are used inside the Spring Framework and come up with a scenario. But after looking at the internal code, we found that most of the fields are not used inside the Spring Framework.

The only two fields that were used internally were

  • fileName
  • name

Still, not giving up, we also followed up on the use of these fields, but could not find any clues to an attack that could be completed in a single request.

Having no choice, we sent a report of the vulnerability so far to Spring Framework, while also looking horizontally for similar problems in other frameworks. We also found reproductions of similar issues in Ruby on Rails and Ktor and sent them in. (At this point we were unaware that it was a Firefox issue...)

As a result, as I wrote at the beginning of the report, the response was "this is not a problem on the Framework side, but on the browser side (Firefox)".
This is obvious in hindsight, but it is an easy point of view to lose when you are busy verifying a problem you have just discovered.

In fact, when we tried to reproduce the problem in each browser, the problem was reproduced only in Firefox. So, we changed direction and considered whether the problem could be reported as a Firefox vulnerability.

Vulnerability Reporting to Firefox

However, at this stage the attack scenario was too weak to present to a browser vendor as a "vulnerability", because all we had were claims such as "potential problem" and "parameters can be overwritten".

Therefore, we devised the following clearer scenario from the materials we had.

1. The attacker sends a crafted value for a parameter that will be used as "filename" (in a subsequent request).
2. There exists a page where the value sent by the attacker in (1) is reflected in a form field.
3. The victim visits the form in (2) with Firefox.
4. The victim submits the form.
5. The victim's Firefox sends a "broken Content-Disposition".
6. A parser such as Spring's parses the corrupted Content-Disposition incorrectly.
7. As a result of (6), the form field's name is overwritten and the paired value is sent to a different field than the original one.

In other words, it is a scenario in which the attacker can contaminate another field of the form through the field that reflects the attacker's value.

One specific scenario would be as follows. (This is a somewhat contrived scenario...)

  1. The attacker sends a value that becomes the filename in the form to be submitted by the victim (crafted to overwrite the form field age).
  2. The victim submits the form on the profile page where the data from (1) is reflected.
    • The age field exists in the submitted form.
    • The content of the file is a number such as 1000.
  3. The victim's browser sends a multipart/form-data request containing the normal age field and a file part whose corrupted Content-Disposition carries two name parameters (avatar and age).
  4. When the same field appears more than once, the server gives priority to the later one (overwriting the earlier value).
  5. The server stores the user's age as 1000 (integrity impact).

The impact of the above scenario may seem very small, but it depends greatly on "what kind of system" is involved and "which parameters can be affected". Therefore, such cases cannot all be dismissed as "low risk".

For example, if there is a parameter such as profile_html, it has a Self XSS or similar flaw, and the attacker can place data there, the impact will be high.

Nevertheless, this problem requires a multipart/form-data form and occurs only on fairly rare pages where the attacker's data is reflected in the victim's fields. Therefore, we believe that the bar for the attack is high and the risk is low.


I filed an issue with Firefox containing the above information. As a result, it was classified as a security issue with priority P3. Congratulations, I'm happy.

https://bugzilla.mozilla.org/show_bug.cgi?id=1556711

...and things happened from there.

First of all, I did not hear from them at all for about two years after I reported it. The conditions under which the problem occurs are so strict that this is not hard to understand. (At the time, Mozilla was undergoing a major restructuring, so there may have been a lack of internal resources.)

Then one day, two years after I reported it, I thought, "Maybe it's fixed?" I tested it and, sure enough, it had been fixed silently. I then asked, "Will a CVE be issued?" but was ignored.

Now, as a side note, I personally found an interesting part of the post-report content.

As noted in the Bugzilla exchange, the whatwg specs list requirements for escaping. https://html.spec.whatwg.org/#multipart-form-data

For field names and filenames for file fields, the result of the encoding in the previous bullet point must be escaped by replacing any 0x0A (LF) bytes with the byte sequence %0A, 0x0D (CR) with %0D and 0x22 (") with %22. The user agent must not perform any other escapes.

And the reason for the (silent) fix was that the developer who handled it had changed the code to conform to the HTML specification.

This was fixed in bug 1686765, as part of changing the multipart/form-data encoding to follow the HTML specification and to be compatible with Chrome and Safari. When I was working on that bug, and on the specification change that enshrined Chrome/Safari's behavior, I did not know that this was an open security bug, nor did I realize that the fact that double quotes were not escaped in Firefox's previous implementation could be exploited. But this issue has now been fixed since Firefox 90.

For a system as large as a browser, it is not surprising that the explanation is "I changed it to follow the agreed specification, and that happened to fix this problem," or that someone says, "I didn't know that the part I was fixing was a vulnerability." I also realized once again that (although it is obvious) specifications contain requirements meant to prevent potential problems, and that ignoring them can lead to vulnerabilities.

Looking for further vulnerabilities

I was a little disappointed that Firefox didn't seem to be issuing CVEs, but then something happened that made me want to look further into Content-Disposition. It all started with a CVE. (I believe it was the following CVE. I'm a bit fuzzy on this one, as there are several similar CVEs out there).

https://nvd.nist.gov/vuln/detail/CVE-2020-5398

In Spring Framework, versions 5.2.x prior to 5.2.3, versions 5.1.x prior to 5.1.13, and versions 5.0.x prior to 5.0.16, an application is vulnerable to a reflected file download (RFD) attack when it sets a “Content-Disposition” header in the response where the filename attribute is derived from user supplied input.

At the time this CVE was published, I was skimming through most newly issued CVEs, and I was surprised to see Spring publish a new CVE related to "Content-Disposition".

I was surprised because I had reported the Content-Disposition parser issue to Spring as a vulnerability back when I still mistakenly believed that the original problem lay in the Web Framework.

At the time, I had no idea that this problem might lead to an RFD (Reflected File Download) vulnerability.

I could read the CVE as "This could be a vulnerability based on the Content-Disposition problem I reported!" I knew RFD at least by name from OWASP or other documents at the time, so I regretted my lack of knowledge and persistence.

One day, some time later, my motivation returned and I decided to look for a similar problem.

Before proceeding with the investigation, I asked myself again, "Where can we attack?", and considered the following.
First, as far as I could tell, Content-Disposition is used in the following three ways.

  • multipart in HTTP Body of HTTP Request
  • HTTP Header of HTTP Response
  • multipart in the HTTP Body of an Email

After considering several of these attack cases, we came up with some that could be used for attacks and some that are unlikely to be used.

The following are some of the things that can be used in an attack

  • Tainting of other fields, tampering with file contents, etc. by using filename when sending an HTTP Request
  • Changing file extensions by using filename, filename* when receiving HTTP Response (Reflected File Download vulnerability)
  • Contamination of other fields or tampering with file contents by using filename when sending an email.

On the other hand, we did not investigate the following cases because we judged that they could not be used for attacks.

  • Problems when parsing mail with crafted Content-Disposition headers in mail (mbox), etc.

In this case, we did not proceed with the investigation because we judged that the problem occurs when a mail receiver such as Mail Client parses a crafted Content-Disposition, and even if integrity is violated at that time, it is not the responsibility of the mail receiver. If there is an impact on availability, etc., it may be addressed, but it is outside the scope of this article's investigation.

The investigation proceeded as follows

  1. Determine the repository to be investigated.
  2. Search the repository for `disposition`.
  3. If the escaping seems insufficient in the code that generates Content-Disposition, investigate further.
  4. Look for sample code that exercises the code path generating Content-Disposition.
    • For a Web Framework, look for "File Download".
    • For an HTTP Client, look for `multipart`.
    • For an Email library, look for "Attach File" or similar.
  5. Build an application based on the samples and test several cases.

The search in (2), for example, looks as follows.

e.g. Spring: https://github.com/search?q=org%3Aspring-projects+Disposition&type=code

As a result, we found vulnerabilities in 17 products (including those that are still undisclosed), as described at the beginning of this report. I will pick out the most distinctive ones and provide details on these.

Analysis of each vulnerability

From here, we will analyze the problem category by category and discuss the most distinctive of the vulnerabilities we found.

HTTP Request Issues

This is the Firefox problem described in the first section.
When a value entered by someone else enters the HTTP Request > Content-Disposition > filename on the victim's device (where the vulnerability exists), the content is tampered with or the field is corrupted.

This "victim's device" includes the following.

  • Vulnerable browsers running on the victim's terminal
  • HTTP Client included in a Web App or other functionality

Possible ways to exploit this problem include

  • If " can be inserted, it forces the end of the filename delimiter and allows the attacker to add arbitrary fields
  • If \r, \n can be inserted, CRLF can be added in multipart/form-data.
    • Can add a new line in the header part of multipart/form-data's part (e.g., Content-Disposition can be appended).
    • Can terminate the header part of a multipart/form-data part and move to the body part (can insert content in the head / body)
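The two cases above can be illustrated with a deliberately naive part builder. This is a minimal sketch, not the code of any product mentioned here; `naive_part`, the boundary string, and the payloads are assumptions made up for the example.

```python
BOUNDARY = "----boundary"  # hypothetical fixed boundary, for readability only


def naive_part(field: str, filename: str, content: str) -> str:
    """Build one multipart/form-data part without escaping (the vulnerable pattern)."""
    return (
        f"--{BOUNDARY}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: text/plain\r\n"
        "\r\n"
        f"{content}\r\n"
    )


# 1. A double quote closes filename early and appends an extra parameter.
quote_payload = 'a.txt"; name="is_admin'
print(naive_part("file", quote_payload, "hello"))

# 2. CRLF CRLF ends the part headers, so the rest of the name lands in the body.
crlf_payload = 'a.sh"\r\n\r\n#!/bin/sh\r\n'
print(naive_part("file", crlf_payload, "hello"))
```

With the first payload the header line carries a second `name` parameter; with the second, the part's header section ends right after `filename="a.sh"` and the attacker's text becomes the start of the body.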

The impact of this problem will vary slightly depending on where it occurs (in the Client, such as a browser, or in a service, such as a Web App).

For example, in the case of a browser, the victim is always the browser operator, and the attacker is always an external party. In other words, the attacker tries to make the victim send a "broken Content-Disposition". I don't think this kind of form design is usually done, so the incidence is not high.

On the other hand, if the problem occurs in an HTTP Client inside a Web App, the attacker can supply the broken Content-Disposition himself (more precisely, have the victim's HTTP Client send it). I feel that this one is still "possible"; in this world where features such as Web Hook have become commonplace, you can probably find it if you look for it.

Another way to exploit this is to bypass a validator. For example, if a Web Application has a Form Validator that checks the file extension, that extension check can be bypassed.


It is also possible to tamper with the internal hidden parameters by overwriting them, which may lead to an impactful attack depending on how it is used.

This is a technique that is more suited for Penetration Test and CTF, but it depends on how it is used.

Case: httparty

httparty is an HTTP Client for Ruby. https://github.com/jnunemaker/httparty/

This case was exactly the same problem as Firefox. For this reason, we will not go into details.
Please refer to the report we sent to the maintainer here, as appropriate. https://github.com/jnunemaker/httparty/security/advisories/GHSA-5pq7-52mg-hr42

Case: httpcomponents-client

apache/httpcomponents-client is a Java HTTP Client. https://github.com/apache/httpcomponents-client

The httpcomponents-client issue had the same cause as Firefox and httparty. However, there is one difference: it was a regression from v4 to v5.

  • v4 had the vulnerability addressed.
  • v5 became vulnerable again when major changes were made (including changes to the policies of the APIs provided)

In v4, the fix is in place as shown in this commit. https://github.com/apache/httpcomponents-client/commit/6d583c7d8cc41a188a190218a6489541b79cf35a

HTTPCLIENT-1859: Encode header name, filename appropriately

The original point of this modification is as follows.

https://www.mail-archive.com/dev@hc.apache.org/msg18531.html

The ContentDisposition header, used in multi-part forms, has a name and filename subfield; these need to be escaped using unix-standard backslash character stuffing, but FormBodyPartBuilder does not currently do this. It should.

However, in v5, the library (supposedly) only provided the most core http functionality, and did not handle the lower(?) level interests such as multipart. This is because the multipart-related classes that existed up to v4 are no longer present. This means that, depending on how you look at it, this issue may be out of scope.

I believe that the maintainer has his/her own intentions on how to handle this area, so I contacted him about the matter and he was able to correct it.

HTTP Response Issues

The HTTP Response issue is the same kind of problem as the Spring RFD (Reflected File Download) issue that we found after reporting the Firefox issue (which inspired this research).

Reflected File Download is an attack vector that was first presented at Black Hat Europe 2014. https://www.blackhat.com/docs/eu-14/materials/eu-14-Hafif-Reflected-File-Download-A-New-Web-Attack-Vector.pdf

In a nutshell (this may not be exactly accurate), RFD is an attack in which values entered by the attacker are downloaded as a file on the victim's terminal. It becomes a problem especially when the attacker also controls the file extension, etc.

For example, suppose the following (good site) URL Path is accessed

/file_download?filename=abc.txt&contents=hello

In this case, the attacker creates a URL with filename=malicious.sh and contents=#! /bin/bash... and has the victim open it. The victim downloads the malicious file even though he/she has only accessed the (unproblematic) official site.

As the name RFD suggests, this attack "reflects" the parameters in the request back to the victim as a downloaded file.

In this article, a case is treated as an RFD issue when the file extension can be changed when uploaded content is downloaded, or when Content Injection is possible via CRLF Injection (starting with filename).


Case: django, sinatra

Django is a Web Framework for Python, and Sinatra is one for Ruby.
Both forgot to escape the " in the HTTP Response Content-Disposition > filename, resulting in an RFD.

https://security.snyk.io/vuln/SNYK-PYTHON-DJANGO-2968205 https://github.com/advisories/GHSA-2x8x-jmrp-phxw

Case: Iris

Iris is Golang's Web Framework.
https://github.com/kataras/iris

In this case, the header was generated not in the filename="..." format but in the unquoted filename=... format (probably an RFC violation).

Therefore, it was possible to insert another field simply by inserting the ; character.

Case: Ktor

Ktor is Kotlin's Web Framework. Unlike other RFDs, the problem with Ktor occurred with filename*, not filename.
https://security.snyk.io/vuln/SNYK-JAVA-IOKTOR-2980134

filename* is another Content-Disposition parameter alongside filename. It uses the following format to support non-ASCII filenames when downloading files.

Content-Disposition: attachment; filename*=utf-8''{PARAMETER}

(If filename / filename* are mixed, filename* takes precedence)

In the case of Ktor, URL Encoding was not performed where URL Encoding was originally required when using a file name like the following. This is why RFD was possible in some browsers (at least Firefox).

file name:
''malicious.sh%00'normal.txt

Generated Content-Disposition:
Content-Disposition: attachment; filename*=utf-8''malicious.sh%00'normal.txt

Since the above Content-Disposition is not in the normal format (the value should have been URL Encoded), some browsers judged it as an invalid Content-Disposition and did not read it (ignoring the file name).

Incidentally, this PoC file was created by the following process.

''malicious.sh%00'normal.txt
  1. Insert '' at the beginning so that the value takes the form filename*=''... (to be RFC compliant; in hindsight it is not actually compliant, because no charset such as UTF-8 is specified).
  2. Put ' in the middle. For some reason this made Firefox split the filename in a strange way (it just could not replace the extension properly; perhaps some char index was off).
  3. Insert %00 (a NULL byte character) to pin down the misaligned parse position from (2), i.e., to tell the parser "This is the end!".

As a result, it spits out a broken Content-Disposition, which the browser somehow tries to interpret, resulting in an RFD.
I tested for about 30 minutes to see if Chrome or Safari would also have the problem, but it didn't work.

Currently, when the file mentioned earlier is used as Input, the following Content-Disposition is generated.

Content-Disposition: attachment; filename*=utf-8''%27%27malicious.sh%2500%27normal.txt
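The fixed output above corresponds to percent-encoding the whole value, including ' and %. A minimal sketch in Python, assuming `urllib.parse.quote` as a stand-in for whatever encoder a framework uses internally (this is not Ktor's actual code):

```python
from urllib.parse import quote

# The PoC file name from this section.
name = "''malicious.sh%00'normal.txt"

# safe="" means nothing outside letters, digits and "_.-~" is left unencoded,
# so the quotes and the percent sign in the PoC are all escaped.
header = "Content-Disposition: attachment; filename*=utf-8''" + quote(
    name, safe="", encoding="utf-8"
)

print(header)
# Content-Disposition: attachment; filename*=utf-8''%27%27malicious.sh%2500%27normal.txt
```

Encoding this way escapes more characters than strictly required for filename*, which is the conservative choice here.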

Email Issues

Email attachments also use Content-Disposition, which is inserted in the multipart body.

Case: Python

There was a CRLF Injection problem in the Python Email module, starting with Content-Disposition > filename.
python/cpython#100612

In this problem, unlike the others, " was escaped, so simple field insertion seemed difficult. However, since CRLF Injection was possible in the multipart internal part (one parameter delimiter), there seemed to be a reasonable problem (possible content insertion).

However, when writing a file open process in Python, an Exception occurs when trying to load a file containing the \r\n character, so we determined that the impact is low.

with open("abc\r\n.txt") as f:
  ...

Vulnerable patterns

Having found multiple problems in this way, we now have some idea of the pattern of impact.

Content-Disposition: attachment; filename={PARAMETER};  # Probably RFC violation
Content-Disposition: attachment; filename="{PARAMETER}";
Content-Disposition: attachment; filename*=utf-8''{PARAMETER}
(utf-8 may be replaced by other Encode formats, etc.)

Most are implemented in these patterns. (In some cases, filename and filename* are written mixed together, but this is not a problem.)

However, all of the problems were caused by the following missing escapes with respect to the filename problem.

Case: filename:

  • "
  • \r
  • \n

Case: filename*:

  • URL Encode with proper formatting

Just to elaborate on the case of the incorrect escaping of filename*, I have seen about 50-80 services and it only happened on one Web Framework (Ktor). So I think this is a very rare problem.

How to fix

It should be encoded according to either RFC or WHATWG's HTML Spec(multipart/form-data).

In the case of RFCs, escape with \ (I'm not sure about this, because I can't find the latest version of the RFC that mentions it).

Golang's multipart module is of this form.
https://github.com/golang/go/blob/1e7e160d070443147ee38d4de530ce904637a4f3/src/mime/multipart/writer.go#L132-L136

The WHATWG, on the other hand, performs URL Encode.

  • " --> %22
  • \r --> %0D
  • \n --> %0A

https://html.spec.whatwg.org/#multipart-form-data
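The WHATWG-style replacement listed above can be sketched as follows. The function names `escape_multipart_filename` and `content_disposition` are hypothetical, and this covers only the three characters named above, not a full multipart encoder.

```python
def escape_multipart_filename(value: str) -> str:
    """Percent-escape the three characters named in the WHATWG multipart/form-data rules."""
    return (
        value.replace('"', "%22")
        .replace("\r", "%0D")
        .replace("\n", "%0A")
    )


def content_disposition(field: str, filename: str) -> str:
    """Build a part header with both the field name and the filename escaped."""
    return (
        f'Content-Disposition: form-data; name="{escape_multipart_filename(field)}"; '
        f'filename="{escape_multipart_filename(filename)}"'
    )


print(content_disposition("file", 'a.txt"; name="is_admin'))
```

Note that a literal % is left as-is in this scheme, so the receiver cannot tell an escaped quote apart from a filename that genuinely contains %22.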

Also, there is currently an Issue on whether to add this issue to OWASP ASVS v5, so if you are in the know, please comment.

[OWASP/ASVS#1390]

Conclusion

This time, we surveyed about 50-80 frameworks, libraries, etc., and reported them all so that they are generally problem-free, but this does not mean that "using a framework is safe".

This is because some languages and frameworks do not perform automatic escaping. For example, some Web frameworks provide methods for adding raw HTTP Response Header. If a file download function is implemented using such a method, the file name escaping must be implemented by the developer.

And, sadly, as of 2023, there are probably no (Japanese) articles mentioning filename escaping at all. You can look up "Web Framework name + file download" or something like that, but you won't really find any reference to escaping. So, unless the Web Framework or another layer fixes it implicitly, your implementation of Content-Disposition is probably vulnerable.

Also, it is not absolutely certain that you will find it even if you are doing a Web security audit, etc.
I am (or used to be) a security assessor myself, and I thought I had studied a little, but I had not even recognized this problem until I did this investigation.

Because of the above background, I believe that this problem will continue to occur like whack-a-mole problems like XSS, SQLI, etc. Therefore, I understand that there are still some products that have not been fixed yet, but I have published this article for the time being.

I will add those that have been fixed to my github repository for reporting.

https://github.com/motoyasu-saburi/reported_vulnerability

Summary

It was a long research project, but it is finally finished. It was hard work. My battle is not over yet. There are still a few products left to fix, but this is the end of it.

As written in the TLDR section, the summary is as follows.

  • I found 1 browser, 1 language, and 15 vulnerabilities in { Web Framework, HTTP Client library, Email library / Web Service, etc }
  • All the vulnerabilities I found were found from a single perspective (I investigated maybe 50-80 products).
  • The RFC description of the problem (rather confusingly) describes the requirements for this problem, while the WHATWG > HTML Spec is well documented.
  • The problem is clearly targeted at the Content-Disposition fields filename and filename*.
  • This problem affects HTTP Request/Response/Email in different ways.
    • HTTP Request : request tampering (especially with file contents, tainting of other fields, etc.)
    • HTTP Response : Reflect File Download vulnerability
    • Email : Attachment tampering (e.g., extension and filename tampering and potential file content tampering)
  • Not many people currently see Content-Disposition (filename, filename*) as an obvious attack vector for these attack vectors.
  • I haven't seen a single OWASP publication that summarizes this area properly. ASVS has an Issue on this.
  • Make sure to escape filename and filename* in Content-Disposition.
    • filename:
      • " --> \" or %22
      • \r --> \\r or %0D
      • \n --> \\n or %0A
    • filename*:
      • URL Encode with proper formatting

Appendix & Reference

Incidentally, in the process of writing this article, I came across someone (a GitHub employee) who has looked at the issue from a similar perspective.
https://securitylab.github.com/research/rfd-spring-mvc-CVE-2020-5398/

WHATWG HTML Spec - multipart/form-data
https://html.spec.whatwg.org/#multipart-form-data

RFC 6266 (Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)):
https://tools.ietf.org/html/rfc6266#section-5

RFC 2183 (Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field)
https://datatracker.ietf.org/doc/html/rfc2183

Escape Implementation in Golang: https://github.com/golang/go/blob/1e7e160d070443147ee38d4de530ce904637a4f3/src/mime/multipart/writer.go#L132-L136

Escape Implementation in Symfony: https://github.com/symfony/symfony/blob/123b1651c4a7e219ba59074441badfac65525efe/src/Symfony/Component/HttpFoundation/HeaderUtils.php#L187-L189

Escape Implementation in Spring: https://github.com/spring-projects/spring-framework/blob/4cc91e46b210b4e4e7ed182f93994511391b54ed/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L259-L267

https://github.com/spring-projects/spring-framework/blob/4cc91e46b210b4e4e7ed182f93994511391b54ed/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L605-L628
