A mechanism for packaging multiple files in a single HTTP response and referencing files inside of those packages as URLs.
- Ability to take a directory of web content, package it up, and use it without changes to the files.
- Ability to reference another file inside the package as a relative URL.
- Ability to deploy packages on existing servers that do not support configuration (no new headers, content types, etc.)
- Support graceful degradation: old browsers talking to unconfigurable servers fall back to requests for the individual files inside the package.
- Ability to provide information that would normally be provided by HTTP headers (notably charset, which has security implications)
- Doesn't introduce new security vulnerabilities for sites that allow arbitrary uploads (like Dropbox) and serve files as text/plain or application/octet-stream.
URL Syntax Extension
The delimiter must meet the following requirements:
To support graceful degradation:
- Is currently sent to the server by existing browsers
- Is not currently rejected by popular servers
- Is not currently used in web content
To support relative URLs from files inside the package:
- Must end in a /, so that relative URLs resolve against paths inside the package.
- Must not come after a ?, because relative URLs are not resolved relative to the query string.
- Must not simply be the fragment itself, because relative URLs are not resolved relative to the fragment.
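The relative-URL requirements can be checked against standard URL resolution. The following is a minimal sketch using Python's `urllib.parse.urljoin`. The delimiter `!/`, the host `example.com`, and the package name `app.pack` are placeholders chosen for illustration, not values proposed by this document.

```python
from urllib.parse import urljoin

# Hypothetical delimiter, for illustration only: it ends in "/" and
# sits in the path, not in the query string or the fragment.
DELIMITER = "!/"
PACKAGE = "https://example.com/app.pack"  # placeholder package URL


def resolve_from(base: str, relative: str) -> str:
    """Resolve a relative reference against a base URL (RFC 3986 rules)."""
    return urljoin(base, relative)


# A file inside the package refers to a file in a sibling directory.
in_path = resolve_from(PACKAGE + DELIMITER + "js/main.js", "../css/site.css")

# If the file path lived in the query string or the fragment, resolution
# would ignore it and the result would fall outside the package.
in_query = resolve_from(PACKAGE + "?/js/main.js", "site.css")
in_fragment = resolve_from(PACKAGE + "#/js/main.js", "site.css")

print(in_path)      # https://example.com/app.pack!/css/site.css
print(in_query)     # https://example.com/site.css
print(in_fragment)  # https://example.com/site.css
```

Only the first form keeps the resolved URL inside the package, which is what the requirements above are meant to guarantee.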
To support graceful degradation in a server's file system:
- Must not begin with a /, because it must be possible for <package>[[Delimiter]] to be represented on the file system as both a file (the package itself) and a directory of individual files. If the delimiter began with a /, <package> would have to do double duty as a file and a directory, which is impossible in file systems.
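The file-system constraint can be illustrated with a small sketch. It assumes a hypothetical delimiter `!/` and a placeholder package name `app.pack`, neither of which is proposed by this document. The helper tries to store the package file and a directory of its individual files side by side.

```python
import os
import tempfile

# Hypothetical delimiter, for illustration only.
DELIMITER = "!/"


def lay_out_package(root: str, delimiter: str) -> bool:
    """Store a package file and its unpacked files side by side.

    Returns True if both could be created under `root`.
    """
    package = os.path.join(root, "app.pack")
    with open(package, "wb") as f:
        f.write(b"(multipart package bytes)")
    # Directory holding the individual files for old browsers:
    # the package name plus the delimiter, minus its trailing "/".
    unpacked = package + delimiter.rstrip("/")
    try:
        os.makedirs(os.path.join(unpacked, "js"))
    except OSError:
        # The package path is already a regular file, so it cannot
        # also act as a directory.
        return False
    return True


with tempfile.TemporaryDirectory() as root:
    ok_with_delimiter = lay_out_package(root, DELIMITER)
with tempfile.TemporaryDirectory() as root:
    ok_with_slash = lay_out_package(root, "/")

print(ok_with_delimiter, ok_with_slash)  # True False
```

With a delimiter that does not begin with `/`, the fallback directory has a different name from the package file, so an unconfigured static server could serve both.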
Any delimiter that meets these requirements will work, but we
should try to find something that is aesthetically clear. One
Default Package Format
The proposed default package format is an HTTP Message with a body that is Multipart MIME. The encoding of headers and bodies is determined in the same way as normal web content (either provided in the body itself, such as via <meta charset> in HTML, or via headers included in the multipart file).
The only allowed header in the HTTP Message is Content-Type, which must be multipart/mixed with the required boundary parameter.
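As a rough illustration of this format, the sketch below parses a tiny package with Python's standard `email` module. Several details are assumptions made for the example and are not specified here: each part is named with a `Content-Location` header, the HTTP Message is shown as a header block plus body with no start line, and the boundary string and file names are invented.

```python
from email import message_from_bytes

# Invented sample package: one top-level Content-Type header, then the
# multipart body. Content-Location as the part name is an assumption.
PACKAGE = (
    b"Content-Type: multipart/mixed; boundary=pkg-boundary\r\n"
    b"\r\n"
    b"--pkg-boundary\r\n"
    b"Content-Location: index.html\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
    b"<link rel=stylesheet href=styles/main.css>\r\n"
    b"--pkg-boundary\r\n"
    b"Content-Location: styles/main.css\r\n"
    b"Content-Type: text/css; charset=utf-8\r\n"
    b"\r\n"
    b"body { margin: 0 }\r\n"
    b"--pkg-boundary--\r\n"
)


def read_package(raw: bytes) -> dict:
    """Return {path: (content_type, charset, body_bytes)} for a package."""
    msg = message_from_bytes(raw)
    # Only Content-Type is allowed at the top level.
    if [name.lower() for name in msg.keys()] != ["content-type"]:
        raise ValueError("package may only carry a Content-Type header")
    if msg.get_content_type() != "multipart/mixed" or not msg.get_boundary():
        raise ValueError("package must be multipart/mixed with a boundary")
    files = {}
    for part in msg.get_payload():
        path = part["Content-Location"]  # assumed naming header
        files[path] = (
            part.get_content_type(),
            part.get_content_charset(),
            part.get_payload(decode=True),
        )
    return files


files = read_package(PACKAGE)
print(sorted(files))  # ['index.html', 'styles/main.css']
```

The per-part headers carry the content type and charset that would otherwise come from HTTP response headers, so the packaged files themselves stay unchanged.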
The Multipart format elegantly handles several of the requirements.
It allows the inclusion of HTTP headers that are sometimes necessary to process web content (for example, CSS files MUST be served as text/css for security reasons). Requiring instead that this information be supplied by metadata in the file path would make it impossible to package up a directory without changing the bodies of the files (for example, if styles were renamed to carry its content type, the author would need to modify the body of the HTML file to reference the new location).
Additionally, charset information is normally provided by HTTP headers and cannot be sniffed reliably and securely. The packaging format must be able to include this metadata.
By using Multipart MIME, we also avoid the need to create a new manifest format, which would have to be placed at the beginning of the package to allow browsers to properly stream-process the package.
We embed the multipart message in an HTTP Message so that we can provide the boundary value without introducing new mandatory headers.
This part hasn't been fully fleshed out and is purely the opinion of the author, Yehuda Katz.
For the purpose of this section a "packaged URL" is a URL that contains the [[Delimiter]].
- If a packaged URL is used for top-level navigation, the origin for the page is a collision-free origin derived from the URL of the package.
- Any relative URLs fetched from inside such a context use the same derived origin.
- Otherwise, the packaged URL's origin is the same as the origin of the package's URL.
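The three rules above can be sketched as a small function. This is only an illustration under stated assumptions: the delimiter `!/` is a placeholder, and the document does not say how the collision-free origin is derived, so the hash-based `package-origin:` label below is invented for the example.

```python
import hashlib
from urllib.parse import urlsplit

# Hypothetical delimiter, for illustration only.
DELIMITER = "!/"


def url_origin(url: str) -> str:
    """Ordinary origin of a URL, written as scheme://host[:port]."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def effective_origin(url: str, top_level_navigation: bool) -> str:
    """Apply the origin rules for packaged URLs."""
    if DELIMITER not in url:
        return url_origin(url)  # not a packaged URL
    package_url = url.split(DELIMITER, 1)[0]
    if top_level_navigation:
        # Invented derivation: one opaque origin per package URL. Every
        # file fetched from inside this context shares it, because it
        # depends only on the package URL.
        digest = hashlib.sha256(package_url.encode("utf-8")).hexdigest()[:16]
        return f"package-origin:{digest}"
    # Referenced from other content: same origin as the package's URL.
    return url_origin(package_url)


page = effective_origin("https://example.com/app.pack!/index.html", True)
script = effective_origin("https://example.com/app.pack!/js/a.js", True)
embedded = effective_origin("https://example.com/app.pack!/js/a.js", False)

print(page == script)  # True
print(embedded)        # https://example.com
```

Deriving the boxed origin from the package URL alone keeps all files in one navigated package together while separating them from the hosting site and from other packages on it.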
This satisfies several competing use-cases:
- The ability to reference a module bundle from an HTML file on the same origin and have it be considered the same origin.
- Avoiding new vulnerabilities from top-level navigations on sites that support arbitrary uploads.
In essence, navigating into a package is navigating into an "origin in a box", but sites can use files in bundles as regular same-origin content if they are directly referenced from other same-origin content.
The potential vulnerability only exists for content that the application isn't aware of, so this separation makes sense.