@laurelthorburn
Last active December 23, 2021 18:39
regex-tutorial

Matching An Email

Regular expressions, or regex, define search patterns using special characters. One such pattern is "Matching an Email".

Summary

"Matching an Email" regex:

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

This search pattern checks whether a string is formatted like an email address by comparing the string to the requirements the pattern defines. The pattern can be broken down into three sections:

section 1: /^([a-z0-9_\.-]+)
section 2: @([\da-z\.-]+)
section 3: \.([a-z\.]{2,6})$/

Section one:

  • The string can contain any lowercase letter from a to z ([a-z0-9_\.-]+)

  • The string can contain any number from 0 to 9 ([a-z0-9_\.-]+)

  • The string can contain an underscore, a period, and/or a hyphen ([a-z0-9_\.-]+). The backslash before the period is an escape character and is not itself matched.

  • The first section of the code ends with a quantifier, see the section on quantifiers for more ([a-z0-9_\.-]+)

Section two:

  • The second section must start with an at sign @([\da-z\.-]+)

  • The string can contain any number from 0 to 9. Notice that this is written differently than in section one, which uses 0-9 instead of \d @([\da-z\.-]+)

  • The string can contain lowercase letters from a to z @([\da-z\.-]+)

  • The string can contain a period and/or a hyphen @([\da-z\.-]+)

Section three:

  • The third section must start with a period \.([a-z\.]{2,6})$/

  • The section may contain lowercase letters from a to z \.([a-z\.]{2,6})$/

  • The section may contain a period \.([a-z\.]{2,6})$/

  • The final portion of the code is covered in the sections on quantifiers and anchors \.([a-z\.]{2,6})$/
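As a rough check that the three sections really do add up to the full pattern, the sketch below joins them back together. The variable names and sample strings are made up for illustration and are not part of the original tutorial.

```javascript
// The three sections from the summary, each written as its own regex literal.
const section1 = /^([a-z0-9_\.-]+)/;
const section2 = /@([\da-z\.-]+)/;
const section3 = /\.([a-z\.]{2,6})$/;

// Joining the pattern text of each section rebuilds the complete regex.
const emailRegex = new RegExp(section1.source + section2.source + section3.source);

// A string must satisfy all three sections, in order, to match.
console.log(emailRegex.test("jane.doe@example.com")); // true
console.log(emailRegex.test("jane.doe@example"));     // false, section 3 is missing
```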

Regex Components

A regex literal must begin and end with a forward slash /. This is known as literal notation and is one of two ways to create a regex object. See the MDN Web Docs on the RegExp object to read more about the second way, the RegExp constructor.
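A minimal sketch of the two notations side by side, using the "Matching an Email" pattern. The variable names are illustrative only. In the constructor form the pattern is an ordinary string, so every backslash has to be doubled.

```javascript
// Literal notation: the pattern sits between two forward slashes.
const literalRegex = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Constructor notation: the pattern is passed as a string,
// so each backslash is written twice to survive string escaping.
const constructedRegex = new RegExp("^([a-z0-9_\\.-]+)@([\\da-z\\.-]+)\\.([a-z\\.]{2,6})$");

// Both objects hold the same pattern text and behave the same way.
console.log(literalRegex.source === constructedRegex.source); // true
console.log(literalRegex.test("user@example.com"));           // true
console.log(constructedRegex.test("user@example.com"));       // true
```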

The Regex Components include: Anchors, Quantifiers, Grouping Constructs, Bracket Expressions, Character Classes, the OR Operator, Flags, and Character Escapes.

This tutorial will focus on the Regex Components used in the "Matching an Email" code.

Anchors

The "Matching an Email" code contains two anchors, the ^ and $ symbols.

Anchors do not match characters. Instead, they require that the engine's current position match a well-determined location, such as the start or end of the string.

  • ^, or the caret anchor, matches the beginning of the string and pertains to the text that follows it. Note that this regex is case sensitive, so only lowercase letters match.

  • $, or the dollar anchor, matches the end of the string and pertains to the text that precedes it.

In the "Matching an Email" regex, the string must begin and end with text that matches each section: /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

Notice the {2,6} near the end of the pattern? It is not an anchor. This component is called a quantifier and is covered below.
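To see what the anchors contribute, the sketch below compares the pattern with and without ^ and $. The unanchored variant and the sample strings are assumptions added purely for illustration.

```javascript
// The tutorial's pattern, anchored at both ends.
const anchored = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Hypothetical variant with the anchors removed, for comparison only.
const unanchored = /([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})/;

const sentence = "Contact: user@example.com, thanks!";

// With anchors, the whole string has to be an email address.
console.log(anchored.test(sentence));   // false
// Without anchors, an email address anywhere inside the string is enough.
console.log(unanchored.test(sentence)); // true

// The pattern is case sensitive: an uppercase letter is outside [a-z].
console.log(anchored.test("user@example.com")); // true
console.log(anchored.test("User@example.com")); // false
```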

Bracket Expressions

The "Matching an Email" code contains three bracket expressions, which are outlined by brackets []. A bracket expression matches a single character from the set listed within the brackets; the quantifier that follows it allows those characters to repeat, in any order. The three bracket expressions are:

section 1: [a-z0-9_\.-]
section 2: [\da-z\.-]
section 3: [a-z\.]

  • [a-z]: Code may contain a range of lowercase letters.
  • [0-9] or [\d]: Code may contain a range of numbers.
  • [_\.-]: Code may contain any of the special characters explicitly listed within the brackets: an underscore, a period, or a hyphen. The backslash escapes the period and is not itself a permitted character.

All three examples contain a hyphen -. In the first two examples, the hyphen represents a range between two characters, as in [a-z] or [0-9]. In the third example, the hyphen is the last character inside the brackets, so it represents a literal hyphen that can be included in the string.

Some samples of strings that meet the bracket expression requirements:

  • "textexample"
  • "textexample12"
  • "textexample_1.2"
  • "12349"
  • "._--"
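The samples can be checked against the section 1 bracket expression with a short sketch. The helper regex below wraps the bracket expression in ^, +, and $ so that every character of the sample has to come from the set; that wrapper and the sample list are assumptions made for this example.

```javascript
// Section 1's bracket expression, repeated one or more times
// and anchored so the whole sample must consist of permitted characters.
const permittedChars = /^[a-z0-9_\.-]+$/;

const samples = ["textexample", "textexample12", "textexample_1.2", "12349", "._-"];
for (const sample of samples) {
  console.log(sample, permittedChars.test(sample)); // each sample prints true
}

// A literal backslash is NOT in the set; "\\" in source code is one backslash.
console.log(permittedChars.test("text\\example")); // false
// Uppercase letters are not in the set either.
console.log(permittedChars.test("TextExample"));   // false
```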

Quantifiers

A quantifier sets how many times the preceding pattern may repeat and can be applied to each individual section. The "Matching an Email" regex has the following quantifiers:

  • + indicates that the string must match the pattern one or more times.
  • {n,x} matches the pattern a minimum of n times to a maximum of x times. It is written without a space after the comma.

In the "Matching an Email" regex, the quantifier {2,6} indicates that the preceding bracket expression must match at least 2 times and up to 6 times. Recall from the bracket expressions section that the final section of code contains [a-z\.], which means this part of the string can contain any combination of lowercase letters and periods, and must be between 2 and 6 characters long. Examples include:

  • "com"
  • "ca"
  • "com.co"
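A small sketch of both quantifiers in isolation. The two helper patterns below pull the bracket expressions out of the full regex and anchor them, which is an assumption made here so that the length limits are easy to see.

```javascript
// Section 3's bracket expression with its {2,6} quantifier, anchored for the demo.
const twoToSix = /^[a-z\.]{2,6}$/;

console.log(twoToSix.test("com"));     // true, 3 characters
console.log(twoToSix.test("ca"));      // true, 2 characters
console.log(twoToSix.test("com.co"));  // true, 6 characters
console.log(twoToSix.test("c"));       // false, only 1 character
console.log(twoToSix.test("website")); // false, 7 characters

// Section 1's bracket expression with the + quantifier, anchored for the demo.
const oneOrMore = /^[a-z0-9_\.-]+$/;

console.log(oneOrMore.test("a")); // true, one character is enough
console.log(oneOrMore.test(""));  // false, + requires at least one match
```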

Grouping Constructs

Grouping constructs, indicated by (), break the code into sub-expressions. Recall the "Matching an Email" complete regex:

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/

In total, there are three groups, which correspond to the three sections used in both the summary and bracket expression sections. A group containing plain text, such as (abc), looks for a string that matches that text exactly; each of our groups, however, contains a bracket expression followed by a quantifier, so each one matches a run of permitted characters. See the bracket expression section for further details.
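Because each section sits inside parentheses, the text matched by each group can be read back after a successful match. A minimal sketch, with a made-up sample address:

```javascript
const emailRegex = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// String.prototype.match returns an array for a non-global regex:
// index 0 is the full match, indexes 1 to 3 are the three groups.
const result = "jane.doe@example.com".match(emailRegex);

console.log(result[0]); // "jane.doe@example.com"
console.log(result[1]); // "jane.doe" (section 1)
console.log(result[2]); // "example"  (section 2)
console.log(result[3]); // "com"      (section 3)

// When the string does not match, match returns null instead of an array.
console.log("not-an-email".match(emailRegex)); // null
```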

Character Classes

Character classes in a regex define a set of permissible characters. Our code contains:

  • \d which, as described above, is similar to writing [0-9] and matches one single digit in most regex grammar styles. See the bracket expression section for further details.
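In JavaScript, \d and [0-9] can be swapped without changing the result. The sketch below compares section 2's bracket expression written both ways; the anchored helper patterns and the inputs are assumptions for illustration.

```javascript
// Section 2's bracket expression as written in the tutorial, using \d.
const withShorthand = /^[\da-z\.-]+$/;
// The same set written with an explicit 0-9 range instead.
const withRange = /^[0-9a-z\.-]+$/;

const inputs = ["example", "mail-2", "123", "exam ple", "Example"];
for (const input of inputs) {
  // Both patterns give the same answer for every input.
  console.log(input, withShorthand.test(input), withRange.test(input));
}

// On its own, \d matches exactly one digit.
console.log(/^\d$/.test("7"));  // true
console.log(/^\d$/.test("x"));  // false
console.log(/^\d$/.test("42")); // false, two digits
```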

Character Escapes

The backslash character \ escapes the character that follows it. Placed before a special character such as the period, it causes that character to be interpreted literally, so \. matches an actual period rather than any character. The "Matching an Email" regex also contains backslashes within brackets []. Inside brackets the period already loses its special meaning, so \. still matches only a period and the backslash is optional. The backslash itself is not a character permitted in the string.
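A short sketch of what the escape changes, both outside and inside brackets. The tiny patterns here are made up for the demonstration and do not appear in the email regex.

```javascript
// Outside brackets, an unescaped period matches any single character.
const bareDot = /^a.b$/;
// Escaped, it matches only a literal period.
const escapedDot = /^a\.b$/;

console.log(bareDot.test("axb"));    // true
console.log(escapedDot.test("axb")); // false
console.log(escapedDot.test("a.b")); // true

// Inside brackets, the period is literal with or without the backslash.
const bracketEscaped = /^[\.]$/;
const bracketPlain = /^[.]$/;

console.log(bracketEscaped.test(".")); // true
console.log(bracketPlain.test("."));   // true
console.log(bracketPlain.test("x"));   // false

// The backslash itself is not part of the set ("\\" is one backslash).
console.log(bracketEscaped.test("\\")); // false
```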

Final Example

Now that we've dissected the code, let's look at it as a whole and create some email examples.

/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/
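As a sketch of how the full pattern behaves, the snippet below runs it against a handful of made-up addresses. The function name isEmailLike and the sample addresses are hypothetical. Note that the pattern rejects some addresses that real mail systems accept, such as ones containing uppercase letters or a plus sign.

```javascript
const emailRegex = /^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/;

// Hypothetical helper: true when the string fits the "Matching an Email" pattern.
function isEmailLike(text) {
  return emailRegex.test(text);
}

// Strings that satisfy all three sections.
console.log(isEmailLike("user@example.com"));            // true
console.log(isEmailLike("first.last_1@mail-server.co")); // true
console.log(isEmailLike("user@example.museum"));         // true, 6-letter ending

// Strings that break one of the rules.
console.log(isEmailLike("user@example"));            // false, no period section
console.log(isEmailLike("user@example.c"));          // false, ending shorter than 2
console.log(isEmailLike("user@example.technology")); // false, ending longer than 6
console.log(isEmailLike("user name@example.com"));   // false, space is not permitted
console.log(isEmailLike("User@Example.com"));        // false, uppercase is not permitted
console.log(isEmailLike("user+tag@example.com"));    // false, plus sign is not permitted
```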

Author

Laurel Thorburn is a full stack developer who enjoys eating, talking, and coding donuts. Follow her GitHub to learn more: https://github.com/laurelthorburn
