@Yaffle
Last active September 5, 2022 02:19
parse URL + absolutize URL in javascript (URLUtils shim - http://url.spec.whatwg.org/#url)
/*jslint regexp: true, maxerr: 50, indent: 2 */
(function (global) {
  "use strict";
  function URLUtils(url, baseURL) {
    // Capture groups: 1 protocol, 2 username, 3 password, 4 host, 5 hostname,
    // 6 port, 7 pathname, 8 search, 9 hash.
    var m = String(url).replace(/^\s+|\s+$/g, "").match(/^([^:\/?#]+:)?(?:\/\/(?:([^:@\/?#]*)(?::([^:@\/?#]*))?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/);
    if (!m) {
      throw new RangeError();
    }
    var protocol = m[1] || "";
    var username = m[2] || "";
    var password = m[3] || "";
    var host = m[4] || "";
    var hostname = m[5] || "";
    var port = m[6] || "";
    var pathname = m[7] || "";
    var search = m[8] || "";
    var hash = m[9] || "";
    if (baseURL !== undefined) {
      var base = new URLUtils(baseURL);
      // True when url has neither a scheme nor an authority of its own.
      var flag = protocol === "" && host === "" && username === "";
      if (flag && pathname === "" && search === "") {
        search = base.search;
      }
      if (flag && pathname.charAt(0) !== "/") {
        // Relative path: resolve against the directory part of base.pathname.
        pathname = (pathname !== "" ? (((base.host !== "" || base.username !== "") && base.pathname === "" ? "/" : "") + base.pathname.slice(0, base.pathname.lastIndexOf("/") + 1) + pathname) : base.pathname);
      }
      // dot segments removal
      var output = [];
      pathname.replace(/^(\.\.?(\/|$))+/, "")
        .replace(/\/(\.(\/|$))+/g, "/")
        .replace(/\/\.\.$/, "/../")
        .replace(/\/?[^\/]*/g, function (p) {
          if (p === "/..") {
            output.pop();
          } else {
            output.push(p);
          }
        });
      pathname = output.join("").replace(/^\//, pathname.charAt(0) === "/" ? "/" : "");
      if (flag) {
        // Inherit the whole authority from the base URL.
        port = base.port;
        hostname = base.hostname;
        host = base.host;
        password = base.password;
        username = base.username;
      }
      if (protocol === "") {
        protocol = base.protocol;
      }
    }
    this.origin = protocol + (protocol !== "" || host !== "" ? "//" : "") + host;
    this.href = protocol + (protocol !== "" || host !== "" ? "//" : "") + (username !== "" ? username + (password !== "" ? ":" + password : "") + "@" : "") + host + pathname + search + hash;
    this.protocol = protocol;
    this.username = username;
    this.password = password;
    this.host = host;
    this.hostname = hostname;
    this.port = port;
    this.pathname = pathname;
    this.search = search;
    this.hash = hash;
  }
  global.URLUtils = URLUtils;
}(this));
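The dot-segment removal step is the least obvious part of the constructor, so it may help to try it in isolation. The sketch below copies that chain of replace calls into a standalone function; the name removeDotSegments is hypothetical and does not exist in the gist.

```javascript
// Hypothetical helper: the gist's dot-segment removal, extracted for experimentation.
function removeDotSegments(pathname) {
  var output = [];
  pathname.replace(/^(\.\.?(\/|$))+/, "")   // drop leading "./" and "../"
    .replace(/\/(\.(\/|$))+/g, "/")         // collapse "/./" and a trailing "/."
    .replace(/\/\.\.$/, "/../")             // treat a trailing "/.." like "/../"
    .replace(/\/?[^\/]*/g, function (p) {   // walk the path segment by segment
      if (p === "/..") {
        output.pop();                       // ".." removes the previous segment
      } else {
        output.push(p);
      }
    });
  // Keep a leading slash only if the input had one.
  return output.join("").replace(/^\//, pathname.charAt(0) === "/" ? "/" : "");
}

var absolutePath = removeDotSegments("/a/b/c/./../../g"); // "/a/g"
var trailingDots = removeDotSegments("/a/b/..");          // "/a/"
var relativePath = removeDotSegments("a/./b/../c");       // "a/c"
```

The last replace call is used only for its callback, which is why its return value is ignored.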
@Ciantic

Ciantic commented Sep 30, 2011

Great job. I'm going to give this a try.

Is this snippet in the public domain, like Douglas Crockford's json.js and json2.js? Public domain is great: no pressure to litter one's code with disclaimers.

@Yaffle
Author

Yaffle commented Sep 30, 2011

Of course.

P.S. All tests from http://skew.org/uri/uri_tests.html passed.

@johan

johan commented Oct 18, 2012

You're working too hard. :-) http://stackoverflow.com/a/12965135/1130377

@johan

johan commented Oct 18, 2012

(Unless you're using this in Node or similar, where you wouldn't already have a URL resolver built into the browser DOM, of course.)

@terinjokes

@johan: I don't believe that works in IE. More importantly, crossing the DOM boundary from JavaScript tends to be slower than a pure-JS solution, and requiring a DOM also prevents anyone from building a cross-environment library.

@pjt33

pjt33 commented Jun 23, 2014

Forked to fix a small bug: since pathname is guaranteed to start with /, adding an extra / before the leading part of base.pathname is unnecessary.

@guybedford

The following case breaks, as the @ gets detected as the auth part:

  https://service.domain.com/go?email=jim@jam.com

Suggested change is:

 var m = String(url).replace(/^\s+|\s+$/g, "").match(/^([^:\/?#]+:)?(?:\/\/(?:([^:@]*)(?::([^:@]*))?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/);

to

 var m = String(url).replace(/^\s+|\s+$/g, "").match(/^([^:\/?#]+:)?(?:\/\/(?:([^:@\/]*)(?::([^:@\/]*))?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/);

I'm not sure this is 100% correct though - better suggestions welcome.

@Yaffle
Author

Yaffle commented Nov 14, 2014

@guybedford,
Thanks, good catch. I have updated my code.
The specification, written by Anne van Kesteren (https://url.spec.whatwg.org/#authority-state), says that "/", "\", "?", and "#" should not appear in the authority.
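The effect of excluding those characters from the userinfo groups can be checked by running both patterns against the failing URL. This is only a sketch: the variable names are made up for illustration, the first pattern is the one quoted in the bug report, and the second is the one in the gist's current code.

```javascript
// Pattern quoted in the bug report: userinfo may contain "/", "?" and "#".
var before = /^([^:\/?#]+:)?(?:\/\/(?:([^:@]*)(?::([^:@]*))?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/;
// Pattern in the gist's current code: userinfo stops at "/", "?" and "#".
var after = /^([^:\/?#]+:)?(?:\/\/(?:([^:@\/?#]*)(?::([^:@\/?#]*))?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/;

var url = "https://service.domain.com/go?email=jim@jam.com";
var oldMatch = before.exec(url);
var newMatch = after.exec(url);

// Old pattern: everything up to "@" is taken as the username.
var oldUsername = oldMatch[2]; // "service.domain.com/go?email=jim"
var oldHostname = oldMatch[5]; // "jam.com"

// New pattern: no userinfo, and the "@" stays in the query string.
var newUsername = newMatch[2]; // undefined
var newHostname = newMatch[5]; // "service.domain.com"
var newSearch = newMatch[8];   // "?email=jim@jam.com"
```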

@guybedford

@Yaffle thanks so much for the quick response.

When updating to this new code in my tests, I've unfortunately hit two more issues:

  1. new URLUtils('asdf', 'http://example.org/test') gives an href of http://example.org//asdf instead of http://example.org/asdf
  2. new URLUtils('asdf', 'file:///example.org/test') gives an href of file:/example.org/asdf instead of file:///example.org/asdf

These issues didn't happen in the original code, which I'm still using, so I'll stick with that for now.

@Yaffle
Author

Yaffle commented Nov 14, 2014

@guybedford, thanks again.
I updated the code to fix those issues too. There may be other issues; I have not tested it thoroughly.
P.S. URLUtils tries to match the URL API available in Chrome and Firefox, e.g. new URL('asdf', 'file:///example.org/test')
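For reference, the cases discussed in this thread can be run against the built-in URL constructor that the shim aims to match. This sketch assumes an environment where URL is a global (current browsers and recent Node versions); it does not use URLUtils itself, and the variable names are illustrative.

```javascript
// Relative reference resolved against an http base.
var httpHref = new URL("asdf", "http://example.org/test").href;
// "http://example.org/asdf"

// Relative reference resolved against a file base with an empty host.
var fileHref = new URL("asdf", "file:///example.org/test").href;
// "file:///example.org/asdf"

// An "@" in the query string is not treated as a userinfo separator.
var query = new URL("https://service.domain.com/go?email=jim@jam.com");
var queryHost = query.hostname; // "service.domain.com"
var querySearch = query.search; // "?email=jim@jam.com"
```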

@guybedford

@Yaffle perhaps it is worth considering turning this into a polyfill repo?

@Yaffle
Author

Yaffle commented Nov 24, 2014

@guybedford

A minor performance optimization for line 21 (var base = new URLUtils(baseURL);) can be useful:

var base = baseURL instanceof URLUtils ? baseURL : new URLUtils(baseURL);
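The idea is to skip re-parsing when the caller already holds a parsed base. The sketch below shows the pattern with a stand-in constructor; Parsed, toParsed and parseCount are hypothetical names used only to make the saved work visible, not part of the gist.

```javascript
var parseCount = 0;

// Hypothetical stand-in for URLUtils: counts how often parsing happens.
function Parsed(value) {
  parseCount += 1;
  this.href = String(value);
}

// Reuse an existing instance; parse only when given something else.
function toParsed(value) {
  return value instanceof Parsed ? value : new Parsed(value);
}

var base = new Parsed("http://example.org/dir/");      // parseCount is 1
var reused = toParsed(base);                           // same object, no new parse
var fresh = toParsed("http://example.org/other");      // parseCount is 2
```

The gain only applies to callers that resolve many URLs against the same already-constructed base; string bases are parsed on every call as before.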
