Regular expressions (regex) are patterns used to match, search, and manipulate text. They are widely used across programming languages and tools for tasks such as data validation, pattern matching, and text processing. They are also hard to master, which is why I am writing a breakdown that everyone can understand.
The regex /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/
is a pattern that matches a URL. It is built from anchors, quantifiers, grouping constructs, bracket expressions, character classes, and character escapes. It does not use the OR operator (|) or any flags. This guide explains each component used in the pattern.
- Anchors
- Quantifiers
- Grouping Constructs
- Bracket Expressions
- Character Classes
- The OR Operator
- Flags
- Character Escapes
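Before the breakdown, here is a minimal sketch of the full pattern applied to a few strings. It assumes JavaScript, whose regex literal syntax matches the /.../ form used in this guide. The name urlPattern and the sample inputs are hypothetical and chosen only for illustration.

```javascript
// The URL pattern from this guide, written as a JavaScript regex literal.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;

// Hypothetical sample inputs, chosen only for illustration.
const samples = ["https://example.com", "example.com/docs/page", "not a url"];

// test() returns true only when the whole string fits the pattern,
// because the pattern is anchored at both ends.
const results = samples.map((s) => urlPattern.test(s));

console.log(results); // [ true, true, false ]
```

The last sample fails because the pattern requires a literal "." between the name and the ending, and "not a url" contains none.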
Anchors match a position within a string rather than a character. The ^ and $ symbols in this regex are anchors: ^ matches the start of the string, and $ matches the end of the string.
Ex: /^(https?:\/\/)?
Here the ^ requires the match to begin at the start of the string.
Ex: ([\/\w\s\.-]*)*\/?$
Here the $ requires the match to finish at the end of the string.
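To see what the anchors contribute, the sketch below compares anchored and unanchored versions of a simplified pattern. The shortened patterns and the sample text are assumptions made for illustration; they are not part of the URL regex itself.

```javascript
// Simplified, hypothetical patterns that only look for the protocol.
const unanchored = /https?:\/\//;
const anchored = /^https?:\/\//;

const text = "visit http://example.com";

// Without ^, the protocol may appear anywhere in the string.
const foundAnywhere = unanchored.test(text);
// With ^, the protocol must sit at the very start of the string.
const foundAtStart = anchored.test(text);

// $ works the same way at the other end: "com" must be the last characters.
const endsWithCom = /com$/.test(text);
const stillEndsWithCom = /com$/.test(text + "/page");

console.log(foundAnywhere, foundAtStart);    // true false
console.log(endsWithCom, stillEndsWithCom);  // true false
```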
Quantifiers specify how many times the preceding element may occur. This regex uses ? (zero or one occurrence), * (zero or more occurrences), + (one or more occurrences), and {2,6} (between 2 and 6 occurrences).
Ex: /^(https?
Here the ? makes the s optional, so both http and https match.
Ex: ([\/\w\s\.-]*)*
Here the first * matches zero or more characters from the bracket expression, in any order: a "/", a word character (a letter, a digit, or an underscore), a whitespace character, a ".", or a "-". The second * repeats the whole group zero or more times.
Ex: ([\da-z\.-]+)
Here the + matches one or more characters from the set: a digit (0-9), a lowercase letter (a-z), a ".", or a "-".
Ex: ([a-z\.]{2,6})
Here the {2,6} requires between 2 and 6 characters from the set, so this part cannot be shorter than 2 or longer than 6 characters.
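The sketch below isolates each quantifier in a small pattern of its own. The variable names and test strings are hypothetical, and the ^ and $ anchors are added here only so that each test covers the whole string.

```javascript
// Each small pattern isolates one quantifier from the URL regex.
const optionalS = /^https?$/;         // ? : zero or one "s"
const oneOrMore = /^[\da-z\.-]+$/;    // + : at least one character from the set
const zeroOrMore = /^[\/\w\s\.-]*$/;  // * : any number of characters, including none
const twoToSix = /^[a-z\.]{2,6}$/;    // {2,6} : between 2 and 6 characters

console.log(optionalS.test("http"), optionalS.test("https")); // true true
console.log(oneOrMore.test(""), zeroOrMore.test(""));         // false true
console.log(twoToSix.test("c"), twoToSix.test("com"), twoToSix.test("website")); // false true false
```

The empty string shows the difference between + and *: + needs at least one character, while * is satisfied by none.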
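Grouping constructs treat several characters or patterns as a single unit. The ( ) symbols in this regex are grouping constructs, and each pair also captures the text it matched.
Ex: (https?:\/\/)
Here the parentheses group the http or https and the "://" together, so the ? that follows the group in the full regex makes the entire protocol optional.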
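Because each pair of parentheses also captures what it matched, the groups can be read back after a match. This is a minimal sketch assuming JavaScript's String.prototype.match; the sample URLs and variable names are hypothetical.

```javascript
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;

// match() returns the full match at index 0 and each group's text after it.
const withProtocol = "https://example.com/docs".match(urlPattern);
console.log(withProtocol[1]); // "https://"  (protocol group)
console.log(withProtocol[2]); // "example"   (name before the final dot)
console.log(withProtocol[3]); // "com"       (ending after the dot)
console.log(withProtocol[4]); // "/docs"     (path group)

// The ? after the first group makes the whole protocol optional.
// A group that did not take part in the match is reported as undefined.
const withoutProtocol = "example.com".match(urlPattern);
console.log(withoutProtocol[1]); // undefined
```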
Bracket expressions match any single character from a set, which may include ranges. In this regex, the bracket expression [\da-z\.-] matches one digit, lowercase letter, dot, or hyphen.
Ex: [a-z\.]
Here the [ ] matches any one lowercase letter a-z or a period. The match is case sensitive, so uppercase letters do not match.
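The point that a bracket expression matches one character from a set, not a sequence, can be checked directly. In this sketch the name domainChar and the candidate strings are hypothetical, and the anchors are added so that each candidate must be exactly one character long.

```javascript
// A bracket expression matches exactly one character from its set.
const domainChar = /^[\da-z\.-]$/;

const candidates = ["a", "7", ".", "-", "A", "_", "ab"];

// Keep only the candidates that are a single character from the set.
const accepted = candidates.filter((c) => domainChar.test(c));

console.log(accepted); // [ 'a', '7', '.', '-' ]
```

"A" and "_" are rejected because they are not in the set, and "ab" is rejected because it is two characters long.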
Character classes are shorthand for common sets of characters. This regex uses \d for digits, \w for word characters, and \s for whitespace characters.
Ex: ([\da-z\.-]+)
Here \d matches any digit (0-9).
Ex: ([\/\w\s\.-]*)*\/?$/
Here \w matches any word character (a letter, a digit, or an underscore), and \s matches any whitespace character, such as a space or a tab, rather than only a literal space.
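Each shorthand class can be tried on its own, as in the sketch below. The variable names and sample characters are hypothetical, and the equivalences in the comments describe JavaScript's behavior without the u or i flags.

```javascript
const isDigit = /^\d$/; // same as [0-9]
const isWord = /^\w$/;  // same as [A-Za-z0-9_]
const isSpace = /^\s$/; // space, tab, line break, and other whitespace

console.log(isDigit.test("5"), isDigit.test("x"));                     // true false
console.log(isWord.test("_"), isWord.test("-"));                       // true false
console.log(isSpace.test(" "), isSpace.test("\t"), isSpace.test("a")); // true true false
```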
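Character escapes use a backslash to match a character literally when it would otherwise have a special meaning in a regex. In this regex, \/ matches a forward slash and \. matches a period.
Ex: (https?:\/\/)
Here the \ precedes each "/" so that it is read as a literal slash rather than as the end of the /.../ pattern.
Ex: ([\da-z\.-]+)\.
Here the \. after the group matches a literal period. Without the backslash, the . would match any character.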
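The effect of an escape is easiest to see by comparing a pattern with and without it. The patterns and strings in this sketch are hypothetical simplifications, not pieces taken verbatim from the URL regex.

```javascript
// An unescaped dot matches any character except a line break.
const anyChar = /^example.com$/;
// An escaped dot matches only a literal ".".
const literalDot = /^example\.com$/;

console.log(anyChar.test("exampleXcom"));    // true
console.log(literalDot.test("exampleXcom")); // false
console.log(literalDot.test("example.com")); // true

// \/ matches a literal forward slash inside a regex literal.
const protocol = /^https?:\/\/$/;
console.log(protocol.test("https://")); // true
```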
Travontaz Lowry