Skip to content

Instantly share code, notes, and snippets.

@winzig
winzig / Liberal Regex Pattern for URLs
Last active May 4, 2022 05:44 — forked from gruber/Liberal Regex Pattern for Web URLs
Updated @gruber's regex with a modified version that looks for 2-13 letters rather than trying to look for specific TLDs, and many other improvements. (UPDATE 2018-07-30: Support for IPv4 addresses, bare hostnames, naked domains, xn-- internationalized domains, and more... see comments for BREAKING CHANGE.)
# Single-line version:
(?i)\b(https?:\/{1,3})?((?:(?:[\w.\-]+\.(?:[a-z]{2,13})|(?<=http:\/\/|https:\/\/)[\w.\-]+)\/)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’])|(?:(?<!@)(?:\w+(?:[.\-]+\w+)*\.(?:[a-z]{2,13})|(?:(?:[0-9](?!\d)|[1-9][0-9](?!\d)|1[0-9]{2}(?!\d)|2[0-4][0-9](?!\d)|25[0-5](?!\d))[.]?){4})\b\/?(?!@)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))*(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’])?))
# Commented multi-line version:
(?xi)
\b
(https?:\/{1,3})? # Capture $1: (optional) URL scheme, colon, and slashes
( # Capture $2: Entire matched URL (other than optional protocol://)