@wycats
Created October 2, 2013 02:34

Goal

A mechanism for packaging multiple files in a single HTTP response and referencing files inside of those packages as URLs.

Meta-Syntax

<package><delimiter><file>

Constraints

  • Ability to take a directory of web content, package it up, and use it without changes to the files.
  • Ability to reference another file inside the package as a relative URL.
  • Ability to deploy packages on existing servers that do not support configuration (no new headers, content types, etc.)
  • Graceful degradation: when an old browser talks to an unconfigurable server, a packaged URL should turn into an ordinary request for the individual file inside the package.
  • Ability to provide information that would normally be provided by HTTP headers (notably charset, which has security implications)
  • Doesn't introduce new security vulnerabilities for sites that allow arbitrary uploads (like Dropbox) and serve files as text/plain or application/octet-stream.

Proposal

URL Syntax Extension

<scheme>://<host>[:<port>]<package>[[New Delimiter]]<file>#<fragment>

The delimiter must meet the following requirements:

To support graceful degradation:

  • Is currently sent to the server by existing browsers
  • Is not currently rejected by popular servers
  • Is not currently used in web content

To support relative URLs from files inside the package:

  • Must end in a /, to support . and ..
  • Must not come after a ?, because . and .. relative URLs are not resolved relative to the query string.
  • Must not simply be the fragment itself, because relative URLs are not resolved relative to the fragment.
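
These three requirements can be checked against an ordinary URL resolver. The following is a minimal sketch using Python's urllib.parse.urljoin; the host example.com, the package name app.pack, and the placeholder delimiter !/ are all assumptions chosen for illustration.

```python
from urllib.parse import urljoin

REL = "../img/logo.png"  # a relative reference found inside css/site.css

# Delimiter inside the path and ending in "/": ".." stays inside the package.
in_path = urljoin("http://example.com/app.pack!/css/site.css", REL)

# File carried in the query string: the query is dropped before the
# relative reference is resolved, so the package name is lost.
in_query = urljoin("http://example.com/app.pack?css/site.css", REL)

# File carried in the fragment: the fragment is dropped as well.
in_fragment = urljoin("http://example.com/app.pack#css/site.css", REL)

print(in_path)      # http://example.com/app.pack!/img/logo.png
print(in_query)     # no longer mentions app.pack
print(in_fragment)  # no longer mentions app.pack
```

Only the first form keeps the resolved URL pointing into the package, which is why the delimiter has to live in the path and end in a /.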

To support graceful degradation in a server's file system:

  • Must not begin with a /, because it must be possible for <package>[[Delimiter]] to be represented on the file system as both a file (the package itself) and a directory of individual files. If the delimiter began with a /, the <package> would have to do double duty as a file and a directory, which is impossible in file systems.
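
The file-system constraint can be illustrated with a short sketch. It assumes a package named app.pack and compares a hypothetical delimiter of !/ (unpacked files live in a directory called app.pack!) against a bare / (the directory would have to be called app.pack, the same name as the package file).

```python
import os
import tempfile


def lay_out(root, directory_name):
    """Write the package file, then try to create the directory that
    holds its unpacked files. Returns True if both can coexist."""
    with open(os.path.join(root, "app.pack"), "wb") as f:
        f.write(b"(package bytes)")
    try:
        os.mkdir(os.path.join(root, directory_name))
        return True
    except FileExistsError:
        # The name is already taken by the package file itself.
        return False


with tempfile.TemporaryDirectory() as root:
    bang_ok = lay_out(root, "app.pack!")  # delimiter "!/"

with tempfile.TemporaryDirectory() as root:
    slash_ok = lay_out(root, "app.pack")  # delimiter "/"

print(bang_ok, slash_ok)
```

With the !/ layout, an old browser's request for /app.pack!/css/site.css can be served straight from the app.pack! directory, while newer clients fetch /app.pack.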

Any delimiter that meets these requirements will work, but we should try to find something that is aesthetically clear. One straw man: !/.
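
As a rough sketch of how a client might take a packaged URL apart under the straw-man delimiter, consider the following. The function name split_packaged_url and the example URL are hypothetical and not part of the proposal.

```python
from urllib.parse import urlsplit

DELIMITER = "!/"  # the straw-man delimiter from the text


def split_packaged_url(url):
    """Return (package_url, file_path, fragment), or None if the URL
    does not contain the delimiter in its path."""
    parts = urlsplit(url)
    package_path, sep, file_path = parts.path.partition(DELIMITER)
    if not sep:
        return None
    package_url = f"{parts.scheme}://{parts.netloc}{package_path}"
    return package_url, file_path, parts.fragment


print(split_packaged_url("http://example.com/app.pack!/css/site.css#top"))
print(split_packaged_url("http://example.com/css/site.css"))
```

The first call yields the package URL http://example.com/app.pack, the file css/site.css, and the fragment top; the second returns None because the URL is not a packaged URL.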

Default Package Format

The proposed default package format is message/http with a body that is Multipart MIME. The encoding of headers and bodies is determined in the same way as normal web content (either provided in the body itself, such as via <meta charset> in HTML, or via headers included in the multipart file).

The only allowed header in the HTTP Message is Content-Type, which must be multipart/mixed with the required boundary parameter.
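
To make the format concrete, here is a hedged sketch that builds and reads such a package with Python's standard email parser. Several details are assumptions that the proposal does not specify: the HTTP/1.1 200 OK start line, the use of a Content-Location header to name each file, the boundary value, and the helper names build_package and read_package.

```python
from email import message_from_bytes

BOUNDARY = b"pkg-boundary"  # hypothetical boundary value


def build_package(files):
    """files: iterable of (path, content_type, body_bytes)."""
    out = [
        b"HTTP/1.1 200 OK\r\n",  # assumption: message/http start line
        b"Content-Type: multipart/mixed; boundary=" + BOUNDARY + b"\r\n\r\n",
    ]
    for path, content_type, body in files:
        out.append(b"--" + BOUNDARY + b"\r\n")
        # Assumption: Content-Location names the file inside the package.
        out.append(b"Content-Location: " + path.encode("ascii") + b"\r\n")
        out.append(b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n")
        out.append(body + b"\r\n")
    out.append(b"--" + BOUNDARY + b"--\r\n")
    return b"".join(out)


def read_package(raw):
    """Return {path: (content_type, charset, body_bytes)}."""
    _start_line, rest = raw.split(b"\r\n", 1)
    message = message_from_bytes(rest)
    if message.get_content_type() != "multipart/mixed":
        raise ValueError("package must be multipart/mixed")
    return {
        part["Content-Location"]: (
            part.get_content_type(),
            part.get_content_charset(),
            part.get_payload(decode=True),
        )
        for part in message.get_payload()
    }


package = build_package([
    ("index.html", "text/html; charset=utf-8", b"<link rel=stylesheet href=styles>"),
    ("styles", "text/css", b"body { color: red }"),
])
files = read_package(package)
print(files["styles"])
```

Note that the file named styles keeps its extensionless name and still carries text/css, because the type travels in the part headers and not in the path.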

Why?

The Multipart format elegantly handles several of the requirements.

It allows the inclusion of HTTP headers that are sometimes necessary to process web content (for example, CSS files MUST be served as text/css for security reasons).

Requiring that this information be supplied by metadata in the file path (such as a file extension) would make it impossible to package up a directory without changing the bodies of the files. For example, if a file named styles had to be renamed styles.css, you would need to modify the body of the HTML file to reference the new location.

Additionally, charset information is normally provided by HTTP headers and cannot be sniffed reliably and securely. The packaging format must be able to include this metadata.

By using Multipart MIME, we also avoid the need to create a new manifest format, which would have to be placed at the beginning of the package to allow browsers to properly stream-process the package.

We embed the multipart message in an HTTP Message so that we can provide the boundary value without introducing new mandatory headers.

Origins

This part hasn't been fully fleshed out and is purely the opinion of the author, Yehuda Katz.

For the purpose of this section a "packaged URL" is a URL that contains the [[Delimiter]].

  1. If a packaged URL is used for top-level navigation, the origin for the page is a collision-free origin derived from the URL of the package.
  2. Any relative URLs fetched from inside such a context use the same derived origin.
  3. Otherwise, the packaged URL's origin is the same as the origin of the package's URL.
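
The three rules can be read as a small decision procedure. The sketch below is only one interpretation: the proposal does not say how the collision-free origin is derived, so the SHA-256 based package: label, the function names, and the !/ delimiter are stand-ins for illustration.

```python
import hashlib
from urllib.parse import urlsplit

DELIMITER = "!/"  # straw-man delimiter, assumed here


def plain_origin(url):
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def derived_origin(package_url):
    # Stand-in derivation; a real design would define this precisely.
    digest = hashlib.sha256(package_url.encode("utf-8")).hexdigest()
    return f"package:{digest}"


def origin_of(url, *, top_level_navigation=False, context_origin=None):
    package_url, sep, _file = url.partition(DELIMITER)
    if not sep:
        return plain_origin(url)  # not a packaged URL
    boxed = derived_origin(package_url)
    # Rule 1: top-level navigation into a package.
    # Rule 2: fetches made from inside that boxed context.
    if top_level_navigation or context_origin == boxed:
        return boxed
    # Rule 3: referenced from ordinary content.
    return plain_origin(package_url)


page = origin_of("https://example.com/app.pack!/index.html", top_level_navigation=True)
inside = origin_of("https://example.com/app.pack!/app.js", context_origin=page)
embedded = origin_of("https://example.com/app.pack!/app.js", context_origin="https://example.com")
print(page == inside, embedded)
```

Here page and inside share the derived origin, while embedded is treated as ordinary https://example.com content.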

Why?

This satisfies several competing use-cases:

  1. The ability to reference a module bundle from an HTML file on the same origin and have it be considered the same origin.
  2. Avoiding new vulnerabilities on sites that support arbitrary uploads from top-level navigations.

In essence, navigating into a package is navigating into an "origin in a box", but sites can use files in bundles as regular same-origin content if they are directly referenced from other same-origin content.

The potential vulnerability only exists for content that the application isn't aware of, so this separation makes sense.

@dherman

dherman commented Oct 12, 2013

Unless I missed it, one thing that's missing is the semantics of .. relative links from within a package that go beyond the root of the package, i.e., whether that is an error, whether it is clamped to the package's root, or whether it is allowed to "escape" from the package.

@tracker1

tracker1 commented Dec 2, 2013

I think that using the multipart MIME format makes it harder for developers to actually build and work with modules. Most modular document packages (OpenOffice documents, MS-Office *X documents, JAR, XAPP, SCORM, and others) use a .zip format (with a custom extension) that includes a manifest file alongside the files in the package. With some sane defaults based on file extension, an optional manifest.json in the root of the package could tell the browser how to handle unknown types, or carry other bits of information.

Beyond this, you could work with an extracted my.pack! directory for development and zip the whole directory, or use any of a number of packaging tools that can create an archive while ignoring certain files. With multipart MIME you'd need custom tooling, as opposed to zipping your "package" with the command-line and GUI tools available on pretty much every modern platform in use today.

@pragmatic-programmer

Has this been taken forward? How will bundling work in ES6?
