@Arankin7
Last active Jun 23, 2022
Understanding Regex

What is Regex?

A regex, short for regular expression, is a (seemingly random) string of characters that defines a text search pattern. For instance, you could use a regex to check that a string of text is a valid email address, or to enforce rules when a user chooses a username. Regular expressions are used in search engines, and most general-purpose programming languages support them.

Summary

This tutorial breaks down a regex that is used to match email addresses. This can be useful in an application where you want to validate the user's email address. The regex in question is the following string of characters. Don't worry if it looks daunting at first sight. We're going to break it down bit by bit so it is much easier to digest.

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/
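
As a rough sketch of how this pattern might be used in JavaScript, the snippet below wraps it in a small helper. The function name isValidEmail and the sample addresses are assumptions made for illustration; they are not part of the original regex.

```javascript
// The regex from this tutorial, written as a JavaScript regex literal.
const emailRegex = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Hypothetical helper: returns true when the whole string matches the pattern.
function isValidEmail(input) {
  return emailRegex.test(input);
}

console.log(isValidEmail("jane.doe@example.com")); // true
console.log(isValidEmail("jane.doe@example"));     // false: no dot and ending after the service
console.log(isValidEmail("Jane@example.com"));     // false: uppercase J is not in a-z
```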

Regex Components

Anchors

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

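In languages such as JavaScript, a regex literal is written between forward slashes /. The slashes delimit the pattern and are not part of it. After the opening forward slash is a caret ^, which anchors the match to the beginning of the string. At the end of the regex is a dollar sign $, which anchors the match to the end of the string. Together they require the entire string, not just part of it, to match the pattern.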
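
To see what the anchors contribute, here is a small JavaScript sketch that compares the pattern with and without ^ and $. The unanchored variant and the sample text are assumptions added purely for comparison.

```javascript
const anchored = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;
// Same pattern with the ^ and $ anchors removed, for comparison only.
const unanchored = /([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})/;

const text = "contact: jane@example.com today";

console.log(anchored.test(text));               // false: extra text surrounds the address
console.log(unanchored.test(text));             // true: the address is found inside the string
console.log(anchored.test("jane@example.com")); // true: the whole string is an address
```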

Quantifiers

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

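Quantifiers indicate how many times the preceding character, group, or character set must occur.
The + quantifier matches one or more characters from the preceding character set. Here it is applied to the set for the user's email name and to the set for the email service, so each must contain at least one character. The quantifier {2,6} matches between 2 and 6 characters from the preceding character set ([a-z\.]), which covers the email domain (.com/.org).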
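
The two quantifiers can be tried in isolation. This is a minimal JavaScript sketch; the smaller patterns oneOrMore and twoToSix are cut out of the full regex for illustration and anchored so that the whole test string must match.

```javascript
// + means "one or more" of the preceding character set.
const oneOrMore = /^[a-z0-9_\.-]+$/;
console.log(oneOrMore.test("jane_doe")); // true
console.log(oneOrMore.test(""));         // false: at least one character is required

// {2,6} means "between 2 and 6" of the preceding character set.
const twoToSix = /^[a-z\.]{2,6}$/;
console.log(twoToSix.test("io"));      // true: 2 characters
console.log(twoToSix.test("museum"));  // true: 6 characters
console.log(twoToSix.test("c"));       // false: too short
console.log(twoToSix.test("website")); // false: 7 characters is too long
```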

Character Classes

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

The character class \d matches any single digit and is equivalent to [0-9]. So in the string "5 Park Avenue", the digit 5 would be matched.
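
The "5 Park Avenue" example can be checked directly in JavaScript. The second sample string and the variable names are assumptions added for illustration.

```javascript
const address = "5 Park Avenue";

// \d and [0-9] find the same first digit.
const viaShorthand = address.match(/\d/)[0];
const viaRange = address.match(/[0-9]/)[0];
console.log(viaShorthand); // "5"
console.log(viaRange);     // "5"

// With the g flag, match() returns every digit in the string.
const allDigits = "Flat 12, 5 Park Avenue".match(/\d/g);
console.log(allDigits); // ["1", "2", "5"]
```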

Grouping and Capturing

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

Groups are captured between opening and closing parentheses. In this case the regex captures three separate subexpressions.

  • ([a-z0-9_\.-]+) is used to capture the user's email name (the part before the @).
  • ([\da-z\.-]+) is used to capture the email service.
  • ([a-z\.]{2,6}) is used to capture the email domain (such as com or org).
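
The three captured groups can be read back from a match result. This is a minimal JavaScript sketch; the variable names name, service, and domain are chosen here to mirror the list above and are not part of the regex itself.

```javascript
const emailRegex = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// match() returns an array: index 0 is the whole match,
// indexes 1 to 3 are the three captured groups in order.
const result = "jane.doe@example.com".match(emailRegex);
const [whole, name, service, domain] = result;

console.log(whole);   // "jane.doe@example.com"
console.log(name);    // "jane.doe"
console.log(service); // "example"
console.log(domain);  // "com"
```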

Bracket Expressions

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

Bracket expressions are the expressions enclosed in square brackets []. Each one matches a single character from the set it lists.

  • Take our first bracket expression: [a-z0-9_\.-].
    a-z denotes that the regex will match any letter from a to z, though this is case sensitive, so uppercase letters are not matched. 0-9 denotes that the regex will match any digit from 0 to 9. _\.- is slightly more complex, but this will match the three characters _, ., and -. The hyphen is treated as a literal character here because it comes last in the brackets rather than between two characters.

  • In our second bracket expression: [\da-z\.-].
    \d matches any single digit. a-z, as in our previous bracket expression, will match any letter from a to z, but it will be case sensitive. \.- will match the characters . and -.

  • Finally, our last bracket expression: [a-z\.].
    a-z, as in both of our previous bracket expressions, will match any letter from a to z in a case sensitive context. \. will match a literal period (.).
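
The behavior of the first bracket expression, including its case sensitivity, can be checked with a short JavaScript sketch. The i flag shown here is not part of the original regex; it is included only as one possible way to accept uppercase letters.

```javascript
// The first bracket expression, anchored so every character must be in the set.
const firstSet = /^[a-z0-9_\.-]+$/;
console.log(firstSet.test("jane_doe-99.x")); // true: letters, digits, _, -, and . are all allowed
console.log(firstSet.test("Jane"));          // false: uppercase J is not in a-z
console.log(firstSet.test("jane+doe"));      // false: + is not in the set

// Assumption for illustration: the i flag makes a-z match uppercase letters too.
const caseInsensitive = /^[a-z0-9_\.-]+$/i;
console.log(caseInsensitive.test("Jane"));   // true

// Inside brackets a dot is already literal, so [.] matches only a period.
const literalDot = /^[.]$/;
console.log(literalDot.test("."));           // true
console.log(literalDot.test("a"));           // false
```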

Greedy and Lazy Match

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

A greedy quantifier will try to match the longest text that matches a given pattern. A lazy quantifier will try to match the shortest possible string.

This regex contains only greedy quantifiers. The + and {2,6} quantifiers will try to match as many times as possible, giving characters back only when the rest of the pattern would otherwise fail to match.
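
The difference is easiest to see on a tiny input. In this JavaScript sketch, the lazy forms +? and {2,6}? do not appear in the email regex and are shown only for contrast; the sample strings are assumptions.

```javascript
const text = "aaaa";

// Greedy quantifiers take as many characters as they can.
const greedyPlus = text.match(/a+/)[0];
const greedyRange = text.match(/a{2,6}/)[0];
console.log(greedyPlus);  // "aaaa"
console.log(greedyRange); // "aaaa"

// Adding ? makes a quantifier lazy: it takes as few characters as it can.
const lazyPlus = text.match(/a+?/)[0];
const lazyRange = text.match(/a{2,6}?/)[0];
console.log(lazyPlus);  // "a"
console.log(lazyRange); // "aa"

// In the email regex, the greedy second group first grabs everything after the @,
// then gives back ".com" so that the rest of the pattern can match.
const emailRegex = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;
const groups = "jane@mail.example.com".match(emailRegex);
console.log(groups[2]); // "mail.example"
console.log(groups[3]); // "com"
```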

Author

Visit my GitHub for more info about me and to see all of my latest projects.
