@andrewdotn
Created February 12, 2021 23:06

Absolute URL Terminology

What does the term ‘absolute URL’ refer to?

The RFCs indicate that the scheme is required in an absolute URL, making https://example.com/foo/cat.gif an absolute URL.

But cat.gif and /foo/cat.gif are both URLs, the first with a relative path and the second with an absolute path, so /foo/cat.gif is in some sense an absolute URL, even though it’s not the ‘absolute URL’ of the RFCs.

Given the optionality of various parts of the URL syntax, the following are all valid URLs that resolve to https://example.com/foo/cat.gif if referenced from https://example.com/foo/:

  • https://example.com/foo/cat.gif
  • //example.com/foo/cat.gif
  • https:/foo/cat.gif
  • https:cat.gif
  • /foo/cat.gif
  • cat.gif
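As a quick check of the list above, here is a minimal sketch using Python's standard `urllib.parse.urljoin`. The names `BASE`, `TARGET`, `REFERENCES`, and `resolve_all` are made up for this illustration. Note that `urljoin` is a lenient resolver: a strict RFC 3986 resolver would leave a same-scheme reference such as https:cat.gif alone rather than resolving it against the base, so results can differ between libraries.

```python
from urllib.parse import urljoin

# Base URL and expected result, taken from the text above.
BASE = "https://example.com/foo/"
TARGET = "https://example.com/foo/cat.gif"

# The six reference forms listed above.
REFERENCES = [
    "https://example.com/foo/cat.gif",
    "//example.com/foo/cat.gif",
    "https:/foo/cat.gif",
    "https:cat.gif",
    "/foo/cat.gif",
    "cat.gif",
]


def resolve_all(base, references):
    """Resolve each reference against base; return {reference: resolved URL}."""
    return {ref: urljoin(base, ref) for ref in references}


if __name__ == "__main__":
    for ref, resolved in resolve_all(BASE, REFERENCES).items():
        marker = "ok" if resolved == TARGET else "DIFFERS"
        print(f"{ref:35} -> {resolved}  [{marker}]")
```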

Code sometimes needs a URL in one particular form out of these, and the terms ‘Absolute URL’ and ‘Relative URL’ aren’t enough to distinguish between the forms. While comments and documentation can give clarifying examples, it becomes much trickier with variable names in code that converts between these forms.

The WHATWG URL Spec defines terms for parts of URLs, but the terms are wordy (e.g., “path-relative-scheme-less-URL string”) and describe branches of a URL parsing algorithm, not classes of URLs.

We need a terminology for referring to classes of URLs in terms of which URL parts are explicitly specified.

Proposal

Setting aside the relationship between the terms ‘URL’ and ‘URI’, and, for simplicity, omitting usernames, passwords, port numbers, search queries, and fragments:

If we take subsequences of the acronym SHPF, denoting the

  • S: scheme,
  • H: host,
  • P: path, and
  • F: file,

parts of a URL, respectively, we get a nomenclature for not only the above examples:

  • SHPF: https://example.com/foo/cat.gif
  • HPF: //example.com/foo/cat.gif
  • SPF: https:/foo/cat.gif (uncommon)
  • SF: https:cat.gif (uncommon)
  • PF: /foo/cat.gif
  • F: cat.gif

but these additional ones as well:

  • SH: https://example.com
  • P: /
  • SHP: https://example.com/
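The classification above can be sketched mechanically. The function below is an illustration only, not part of the proposal: `shpf_class` is a hypothetical name, and the rule it uses (P is present when the path contains a `/`, F is present when something follows the last `/`) is an assumption about how to read the examples. It ignores usernames, ports, queries, and fragments, just as the proposal does, and it would label a relative path such as foo/cat.gif as PF without distinguishing it from /foo/cat.gif.

```python
from urllib.parse import urlsplit


def shpf_class(url: str) -> str:
    """Return the subsequence of 'SHPF' naming the parts written out in url.

    Assumed reading of the examples: P means the path contains a '/',
    F means there is a non-empty segment after the last '/'.
    """
    parts = urlsplit(url)
    label = ""
    if parts.scheme:
        label += "S"
    if parts.netloc:
        label += "H"
    if "/" in parts.path:
        label += "P"
    # rpartition yields ('', '', path) when there is no '/', so a bare
    # file name such as 'cat.gif' still counts as F.
    filename = parts.path.rpartition("/")[2]
    if filename:
        label += "F"
    return label


if __name__ == "__main__":
    for example in [
        "https://example.com/foo/cat.gif",
        "//example.com/foo/cat.gif",
        "https:/foo/cat.gif",
        "https:cat.gif",
        "/foo/cat.gif",
        "cat.gif",
        "https://example.com",
        "/",
        "https://example.com/",
    ]:
        print(f"{shpf_class(example):5} {example}")
```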

Because subsequences of this four-letter acronym name these 9 distinct, valid URL classes, the terminology is especially useful in code that converts among them. For example, the body of to_shpf_url() can contain pf_url = path_relativize(filename), and the variable name alone says which form the value is in.
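To show how the naming might read in practice, here is a small sketch of conversion code. Only the name `to_shpf_url` comes from the text above. Its signature, the helpers `pf_from_f` and `shpf_from_pf`, and the plain string concatenation are assumptions made for this example; real code would more likely build on a URL library.

```python
def pf_from_f(p_url: str, f_url: str) -> str:
    """Combine a P URL (a path ending in '/') with an F URL into a PF URL."""
    if not (p_url.startswith("/") and p_url.endswith("/")):
        raise ValueError(f"not a P URL: {p_url!r}")
    if "/" in f_url:
        raise ValueError(f"not an F URL: {f_url!r}")
    return p_url + f_url


def shpf_from_pf(sh_url: str, pf_url: str) -> str:
    """Combine an SH URL (scheme and host, no path) with a PF URL."""
    if sh_url.endswith("/"):
        raise ValueError(f"not an SH URL: {sh_url!r}")
    if not pf_url.startswith("/"):
        raise ValueError(f"not a PF URL: {pf_url!r}")
    return sh_url + pf_url


def to_shpf_url(sh_url: str, p_url: str, f_url: str) -> str:
    """Build an SHPF URL; each variable name states the form it holds."""
    pf_url = pf_from_f(p_url, f_url)
    shpf_url = shpf_from_pf(sh_url, pf_url)
    return shpf_url


if __name__ == "__main__":
    print(to_shpf_url("https://example.com", "/foo/", "cat.gif"))
```

The point of the sketch is the names rather than the string handling: a reader can tell from `sh_url`, `p_url`, `pf_url`, and `shpf_url` which parts each value carries without consulting comments.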
