URL parsing regex.js
/*
A single regex to parse and break up a full URL, including query parameters and anchors, e.g.
https://www.google.com/dir/1/2/search.html?arg=0-a&arg1=1-b&arg3-c#hash
*/
Url.regex = /^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/;
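/* After a successful Url.regex.test(str), the legacy static RegExp properties hold the parts: */
url: RegExp['$&'],      // the whole match
protocol: RegExp.$2,
host: RegExp.$3,
path: RegExp.$4,
file: RegExp.$6,
query: RegExp.$7,       // the greedy (.*) also swallows any trailing #hash
hash: RegExp.$8         // so this is always empty with the regex above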
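
As a quick check of what the groups capture, here is a minimal sketch that runs the same pattern with `exec` on the sample URL from the header comment, reading the match array rather than the legacy `RegExp.$n` statics. The names `urlRegex`, `sample`, and `parts` are made up for this illustration and are not part of the gist.

```javascript
// Same pattern as Url.regex above, held in a local (hypothetical) variable.
const urlRegex = /^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/;

const sample = "https://www.google.com/dir/1/2/search.html?arg=0-a&arg1=1-b&arg3-c#hash";
const m = urlRegex.exec(sample);

// exec returns null when the pattern does not match, so guard before indexing.
const parts = m && {
  url: m[0],       // the whole match
  protocol: m[2],  // "https"
  host: m[3],      // "www.google.com"
  path: m[4],      // "/dir/1/2/"
  file: m[6],      // "search.html"
  query: m[7],     // "?arg=0-a&arg1=1-b&arg3-c#hash" (greedy, includes the hash)
  hash: m[8]       // undefined: nothing is left for group 8 to match
};

console.log(parts);
```

Reading from the array returned by `exec` keeps the result tied to one call, whereas the `RegExp.$n` statics are overwritten by every later successful match.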
/*
Alternative from the Reverse HTTP JavaScript server: http://www.reversehttp.net/demos/httpd.js
*/
Url.regex =
/*12       3    45     6 7         8          9 A        B   C                   D  E        F 0      */
/* proto         user    pass      host         port     path                       query      frag   */
/^((\w+):)?(\/\/((\w+)?(:(\w+))?@)?([^\/\?:]+)(:(\d+))?)?(\/?([^\/\?#][^\?#]*)?)?(\?([^#]+))?(#(\w*))?/;
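var r = Url.regex.exec(url); /* url: the string being parsed (assumed name; this line is not in the excerpt) */
this.url = r[0];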
this.protocol = r[2];
this.username = r[5];
this.password = r[7];
this.host = r[8] || "";
this.port = r[10];
this.pathname = r[11] || "";
this.querystring = r[14] || "";
this.fragment = r[16] || "";
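
To make the group-to-field mapping concrete, the sketch below wraps the same pattern in a small function. `parseUrl` and `reverseHttpRegex` are hypothetical names chosen for this example; the original code assigns the fields to `this` inside a constructor instead of returning an object.

```javascript
// Same pattern as the Reverse HTTP regex above.
const reverseHttpRegex = /^((\w+):)?(\/\/((\w+)?(:(\w+))?@)?([^\/\?:]+)(:(\d+))?)?(\/?([^\/\?#][^\?#]*)?)?(\?([^#]+))?(#(\w*))?/;

// Hypothetical helper: returns a plain object with the same field names.
function parseUrl(url) {
  // Every group is optional and there is no `$` anchor, so exec always
  // returns a match for a string input (possibly an empty one).
  const r = reverseHttpRegex.exec(url);
  return {
    url: r[0],
    protocol: r[2],
    username: r[5],
    password: r[7],
    host: r[8] || "",
    port: r[10],
    pathname: r[11] || "",
    querystring: r[14] || "",
    fragment: r[16] || ""
  };
}

const parsed = parseUrl("http://user:secret@example.com:8080/a/b?x=1#top");
console.log(parsed);
```

Because the pattern has no end anchor, `url` (`r[0]`) is only the prefix that matched, so comparing it with the input is one way to detect trailing text the pattern could not consume.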

@skounis commented Aug 25, 2013

Hi,

I stumbled upon this gist while searching for a URL-parsing regexp. Very helpful. I noticed, however, that the first regexp does not match the following cases:

http://www.domain.org
http://www.domain.org/
http://www.domain.org/?foo=bar
http://www.domain.org/a
http://www.domain.org/a?foo=bar

In order to fix this I had to adjust it slightly:

^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)?([\w\-\.]*[^#?\s]+)?(.*)?(#[\w\-]+)?$
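
The difference can be checked with a short script that runs both patterns over the five URLs listed above. This is only a sketch; the names `original`, `adjusted`, `cases`, and `results` are made up for the illustration.

```javascript
// First regex from the gist: the path and file groups are mandatory.
const original = /^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$/;
// Adjusted regex from this comment: the path and file groups are optional.
const adjusted = /^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)?([\w\-\.]*[^#?\s]+)?(.*)?(#[\w\-]+)?$/;

const cases = [
  "http://www.domain.org",
  "http://www.domain.org/",
  "http://www.domain.org/?foo=bar",
  "http://www.domain.org/a",
  "http://www.domain.org/a?foo=bar"
];

// Neither pattern uses the g flag, so test() keeps no state between calls.
const results = cases.map((url) => ({
  url,
  original: original.test(url),
  adjusted: adjusted.test(url)
}));

console.table(results);
```

The original fails on these inputs because it requires a trailing-slash path followed by a file name of at least two characters; making both groups optional removes that requirement.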

@wehrstedt commented Apr 16, 2020

If anybody wants to match the port:
^((http[s]?|ftp):\/)?\/?([^:\/\s]+)(?::([0-9]+))?((\/\w+)*\/)?([\w\-\.]*[^#?\s]+)?(.*)?(#[\w\-]+)?$
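
Note that the extra port group shifts the later capture indices by one compared with the earlier patterns: port is group 4, path group 5, file group 7, query group 8, and hash group 9. A minimal sketch of reading them, with `portRegex`, `splitUrl`, and the test URLs as assumptions made for this example:

```javascript
// Port-aware pattern from this comment.
const portRegex = /^((http[s]?|ftp):\/)?\/?([^:\/\s]+)(?::([0-9]+))?((\/\w+)*\/)?([\w\-\.]*[^#?\s]+)?(.*)?(#[\w\-]+)?$/;

// Hypothetical helper: returns null when the pattern does not match.
function splitUrl(url) {
  const m = portRegex.exec(url);
  if (m === null) return null;
  return {
    protocol: m[2],
    host: m[3],
    port: m[4],   // undefined when the URL has no explicit port
    path: m[5],
    file: m[7],
    query: m[8]
  };
}

const withPort = splitUrl("http://www.domain.org:8080/a/b.html?x=1");
const withoutPort = splitUrl("http://www.domain.org/a");
console.log(withPort, withoutPort);
```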
