@Deiontre10
Last active January 30, 2023 00:50
A regex used to match a URL

Regex E for Everyone

Regex, or regular expressions, are powerful patterns used to match, search, and manipulate text. They are widely used across programming languages and tools for tasks such as data validation, pattern matching, and text processing. They are also hard to master, which is why I am writing a breakdown that everyone can understand.

Summary

The regex /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/ is a pattern that matches a URL: an optional http:// or https:// prefix, a host name, a dot, a top-level domain of 2 to 6 characters, and an optional path. It is built from anchors, quantifiers, grouping constructs, bracket expressions, character classes, and character escapes. It does not use the OR operator or any flags. This guide explains each component used in the pattern.
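To see the whole pattern in action before breaking it down, here is a minimal sketch. It assumes JavaScript, since the pattern is written between / delimiters, and the helper name looksLikeUrl and the example.com addresses are placeholders chosen for illustration.

```javascript
// The pattern from the summary, written as a JavaScript regex literal.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;

// Hypothetical helper: true when the entire string matches the pattern.
function looksLikeUrl(text) {
  return urlPattern.test(text);
}

console.log(looksLikeUrl("https://www.example.com/docs/index.html")); // true
console.log(looksLikeUrl("example.com"));                             // true (the prefix is optional)
console.log(looksLikeUrl("https://example"));                         // false (no dot and top-level domain)
console.log(looksLikeUrl("ftp://example.com"));                       // false (only http and https are allowed)
```

The pattern is fairly permissive, so a match means the text has the rough shape of a URL, not that the address exists.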

Table of Contents

Regex Components

Anchors

Anchors are special characters that match a position within a string rather than a character. The ^ and $ symbols in this regex are anchors: ^ matches the start of the string and $ matches the end. Together they force the pattern to match the entire string rather than a URL embedded in longer text.

Ex: /^(https?:\/\/)? Here the ^ requires the match to begin at the start of the string.
Ex: ([\/\w\s\.-]*)*\/?$/ Here the $ requires the match to finish at the end of the string.
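The effect of the anchors is easiest to see by comparing a pattern with and without them. The sketch below assumes JavaScript and uses a shortened version of the URL pattern (prefix, host, and top-level domain only) so the difference stays visible. The variable names and sample sentence are made up for illustration.

```javascript
// Shortened pattern with anchors: the whole string must be the URL.
const anchored = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Same pattern without anchors: a URL anywhere in the string is enough.
const unanchored = /(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})/;

const sentence = "visit example.com today";

console.log(anchored.test(sentence));      // false
console.log(unanchored.test(sentence));    // true
console.log(unanchored.exec(sentence)[0]); // "example.com"
console.log(anchored.test("example.com")); // true
```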

Quantifiers

Quantifiers specify how many times the preceding element must occur. In this regex, ? makes an element optional (zero or one occurrence), * matches zero or more occurrences, + matches one or more occurrences, and {2,6} matches between 2 and 6 occurrences.

Ex: /^(https? Here the ? makes the s optional, so both http and https match.
Ex: ([\/\w\s\.-]*)* Here the inner * matches zero or more characters, each of which may be a "/", a word character (alphanumeric or underscore), a whitespace character, a "." or a "-", in any order. The outer * repeats the whole group zero or more times.
Ex: ([\da-z\.-]+) Here the + matches one or more characters, each of which may be a digit (0-9), a lowercase letter (a-z), a "." or a "-".
Ex: ([a-z\.]{2,6}) Here the { } matches between 2 and 6 characters, so this part cannot be shorter than 2 or longer than 6.
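Each quantifier can be tried on its own. This is a small sketch assuming JavaScript. The fragments are lifted from the URL pattern and wrapped in ^ and $ so that each test covers the whole input, and the variable names are illustrative.

```javascript
// ? : the "s" may appear zero or one time.
const optionalS = /^https?$/;
console.log(optionalS.test("http"));   // true
console.log(optionalS.test("https"));  // true
console.log(optionalS.test("httpss")); // false

// + : at least one character from the set is required.
const oneOrMore = /^[\da-z\.-]+$/;
console.log(oneOrMore.test(""));           // false
console.log(oneOrMore.test("my-site.01")); // true

// * : zero characters is also acceptable.
const zeroOrMore = /^[\/\w\s\.-]*$/;
console.log(zeroOrMore.test(""));       // true
console.log(zeroOrMore.test("/docs/")); // true

// {2,6} : between 2 and 6 characters, inclusive.
const twoToSix = /^[a-z\.]{2,6}$/;
console.log(twoToSix.test("c"));       // false (1 character)
console.log(twoToSix.test("com"));     // true  (3 characters)
console.log(twoToSix.test("museum"));  // true  (6 characters)
console.log(twoToSix.test("academy")); // false (7 characters)
```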

Grouping Constructs

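Grouping constructs treat several characters or sub-patterns as a single unit, so that a quantifier can apply to the whole unit. The ( ) symbols in this regex are grouping constructs, and each pair also captures the text it matched.

Ex: (https?:\/\/)? Here the parentheses group http or https together with the "://", so the ? after the group makes the whole prefix optional.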
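Because each pair of parentheses captures what it matched, the groups can be read back after a match. The sketch below assumes JavaScript, where exec returns the captures by position. The variable names and the example.com addresses are placeholders.

```javascript
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;

// exec returns an array: index 0 is the full match, 1..n are the groups.
const withPrefix = urlPattern.exec("https://www.example.com/docs/");
console.log(withPrefix[1]); // "https://"    (group 1: the optional prefix)
console.log(withPrefix[2]); // "www.example" (group 2: the host name)
console.log(withPrefix[3]); // "com"         (group 3: the top-level domain)

// When an optional group takes no part in the match, its capture is undefined.
const withoutPrefix = urlPattern.exec("example.com");
console.log(withoutPrefix[1]); // undefined
console.log(withoutPrefix[2]); // "example"
```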

Bracket Expressions

Bracket expressions match any single character from the set listed between [ and ]. In this regex, [\da-z\.-] matches a digit, a lowercase letter, a dot, or a hyphen, and [a-z\.] matches a lowercase letter or a dot.

Ex: [a-z\.] Here the [ ] matches one character that is either a lowercase letter a-z or a period. Uppercase letters do not match because the pattern has no case-insensitive flag.
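A bracket expression is a set of allowed characters, not a sequence, which the following sketch tries to show. It assumes JavaScript, and the variable names are illustrative. The i flag in the last line is not part of the original pattern and is included only for contrast.

```javascript
// One or more characters, each a lowercase letter or a period.
const tldChars = /^[a-z\.]+$/;
console.log(tldChars.test("co.uk")); // true
console.log(tldChars.test(".."));    // true  (no required order, periods alone are fine)
console.log(tldChars.test("COM"));   // false (uppercase is not in the set)

// One or more characters, each a digit, lowercase letter, period, or hyphen.
const hostChars = /^[\da-z\.-]+$/;
console.log(hostChars.test("-.-"));     // true
console.log(hostChars.test("my_site")); // false (underscore is not in the set)

// For contrast only: adding the i flag makes a-z match uppercase as well.
console.log(/^[a-z\.]+$/i.test("COM")); // true
```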

Character Classes

Character classes are shorthand for a set of characters. In this regex, \d matches a digit, \w matches a word character (alphanumeric or underscore), and \s matches a whitespace character.

Ex: ([\/\w\s\.-]*)*\/?$/ Here \w matches any word character (alphanumeric or underscore).
Ex: ([\/\w\s\.-]*)*\/?$/ Here \s matches any whitespace character, such as a space or a tab, rather than only a literal space.
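The shorthand classes can be checked one character at a time. This sketch assumes JavaScript and the variable names are illustrative. The last line shows a side effect worth knowing: because \s is allowed in the path part, the full pattern accepts a path that contains a space.

```javascript
const digit = /^\d$/;      // exactly one digit
const wordChar = /^\w$/;   // exactly one letter, digit, or underscore
const whitespace = /^\s$/; // exactly one whitespace character

console.log(digit.test("7"));         // true
console.log(digit.test("a"));         // false
console.log(wordChar.test("_"));      // true
console.log(wordChar.test("-"));      // false
console.log(whitespace.test(" "));    // true
console.log(whitespace.test("\t"));   // true

// The path part of the full pattern allows whitespace.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w\s\.-]*)*\/?$/;
console.log(urlPattern.test("example.com/my file")); // true
```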

Character Escapes

Character escapes use a backslash to match a character literally when it would otherwise have a special meaning. In this regex, \/ matches a forward slash, which would otherwise end a regex written between / delimiters, and \. matches a literal period.

Ex: (https?:\/\/) Here each \ escapes the "/" that follows it, so the pattern matches the literal "://".
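The sketch below, assuming JavaScript, shows why the escapes matter. Outside a bracket expression an unescaped . matches any character, while \. matches only a period. The forward slash needs escaping only because the literal is delimited by slashes, so building the same pattern from a string with the RegExp constructor needs no escape. The variable names are illustrative.

```javascript
// Unescaped dot: matches any single character (except a line break).
const anyChar = /^a.c$/;
console.log(anyChar.test("abc")); // true
console.log(anyChar.test("a.c")); // true

// Escaped dot: matches a literal period only.
const literalDot = /^a\.c$/;
console.log(literalDot.test("abc")); // false
console.log(literalDot.test("a.c")); // true

// In a regex literal the slashes must be escaped...
const fromLiteral = /^https?:\/\/$/;
// ...but in a string passed to the RegExp constructor they need no escape.
const fromString = new RegExp("^https?://$");

console.log(fromLiteral.test("https://")); // true
console.log(fromString.test("https://"));  // true
```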

Author

Travontaz Lowry
