Regex, or regular expression, is a sequence of characters that specifies a match pattern in text. Usually, such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.
In this tutorial, we will break down how this regular expression:
^(?:\+1\s?)?(?:\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}$
is able to detect U.S. phone numbers in various common formats, including with or without the country code, hyphens, dots, parentheses, and spaces.
- Anchors
- Quantifiers
- OR Operator
- Character Classes
- Flags
- Grouping and Capturing
- Bracket Expressions
- Greedy and Lazy Match
- Boundaries
- Back-references
- Look-ahead and Look-behind
- Usage
- Recap
- Author
Anchors specify the position of the pattern within the string. In our phone number regex, we use the ^ anchor to match the start of the string and the $ anchor to match the end, ensuring that the entire string must match the pattern.
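To see what the anchors change, here is a minimal sketch that compares a shortened seven-digit pattern with and without them. The names `anchored` and `unanchored` and the sample strings are only illustrative.

```javascript
// The same short pattern, once with anchors and once without.
const anchored = /^\d{3}-\d{4}$/;
const unanchored = /\d{3}-\d{4}/;

console.log(anchored.test('555-1234'));            // true
console.log(anchored.test('call 555-1234 now'));   // false: extra text around the number
console.log(unanchored.test('call 555-1234 now')); // true: a substring is enough
```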
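Quantifiers determine how many times an element can occur in the string. In our phone number example, we use {} to define an exact number of occurrences: \d{3} matches exactly three digits and \d{4} matches exactly four. We also use ?, which makes the preceding element optional (zero or one occurrence), as in [-.\s]? for an optional separator.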
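A small sketch of both quantifier styles, using shortened patterns rather than the full phone regex. The variable names are assumptions made for this example.

```javascript
const threeDigits = /^\d{3}$/;         // {3}: exactly three digits
const optionalDash = /^\d{3}-?\d{4}$/; // ?: the dash may appear zero or one time

console.log(threeDigits.test('555'));        // true
console.log(threeDigits.test('55'));         // false: too few digits
console.log(threeDigits.test('5555'));       // false: too many digits
console.log(optionalDash.test('555-1234'));  // true
console.log(optionalDash.test('5551234'));   // true
console.log(optionalDash.test('555--1234')); // false: only one dash is allowed
```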
The | symbol is the OR operator (alternation) in regex. We use it once in the provided regex, inside a non-capturing group: (?:\(\d{3}\)|\d{3}) matches either three digits enclosed in parentheses or just three digits.
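The area-code alternation can be tried on its own. This sketch wraps it in anchors so each sample has to match completely. The name `areaCode` is just a label for this example.

```javascript
// Either "(ddd)" or "ddd", nothing else.
const areaCode = /^(?:\(\d{3}\)|\d{3})$/;

console.log(areaCode.test('(801)')); // true: first alternative
console.log(areaCode.test('801'));   // true: second alternative
console.log(areaCode.test('(801'));  // false: neither alternative allows a single parenthesis
```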
Character classes allow us to match any one character from a set of characters. In our phone number example, we use the shorthand class \d (equivalent to [0-9] in JavaScript) to match any digit, and we use [-.\s] to match a hyphen, dot, or whitespace character.
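As a quick check of what each class accepts, the sketch below tests single characters against the two classes. The names `digit` and `separator` are illustrative.

```javascript
const digit = /^\d$/;          // one digit
const separator = /^[-.\s]$/;  // one hyphen, dot, or whitespace character

console.log(digit.test('7'));      // true
console.log(digit.test('a'));      // false
console.log(['-', '.', ' '].every((ch) => separator.test(ch))); // true
console.log(separator.test('/'));  // false: slash is not in the set
```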
Flags in regex control how the pattern matching is performed. In our phone number example, we do not use any flags.
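Although the phone number pattern needs no flags, a short sketch shows what two common JavaScript flags do: g (find all matches) and i (ignore case). The sample text and variable names are made up for illustration.

```javascript
const text = 'Call 555-1234 or 555-9876';

// Without g, match() returns only the first match (plus match details).
const first = text.match(/\d{3}-\d{4}/);
// With g, match() returns an array of every match.
const all = text.match(/\d{3}-\d{4}/g);

console.log(first[0]); // '555-1234'
console.log(all);      // [ '555-1234', '555-9876' ]

// The i flag makes letters match regardless of case.
console.log(/utah/i.test('UTAH')); // true
```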
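In regex, we can use parentheses () to group parts of the pattern. This is useful for applying quantifiers to a group or for capturing the matched content. Our phone number example uses only non-capturing groups, written (?:...), which group without capturing: (?:\+1\s?)? makes the whole country-code prefix optional, and (?:\(\d{3}\)|\d{3}) limits the scope of the OR operator.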
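The difference between capturing and non-capturing groups shows up in the array that match() returns. This is a minimal sketch on a shortened pattern. The names `capturing` and `nonCapturing` are assumptions for the example.

```javascript
const capturing = /^(\d{3})-(\d{4})$/;
const nonCapturing = /^(?:\d{3})-(?:\d{4})$/;

const withGroups = '555-1234'.match(capturing);
console.log(withGroups[1], withGroups[2]); // '555' '1234': each () is stored

const withoutGroups = '555-1234'.match(nonCapturing);
console.log(withoutGroups.length); // 1: only the full match, nothing captured

// A quantifier after a group applies to the whole group.
const optionalPrefix = /^(?:\+1\s?)?\d{3}$/;
console.log(optionalPrefix.test('+1 801')); // true
console.log(optionalPrefix.test('801'));    // true
```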
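Bracket expressions, written with square brackets [], match one character from the set listed inside them. In our phone number example, [-.\s] is a bracket expression that matches a hyphen, dot, or whitespace character. The hyphen comes first so that it is treated as a literal character rather than as a range operator like the one in [0-9].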
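Two details of bracket expressions are easy to miss: a dot inside brackets is a literal dot, and a hyphen means "range" only when it sits between two characters. The sketch below illustrates both, with variable names chosen for this example.

```javascript
const sep = /^[-.\s]$/;     // leading hyphen is literal; the dot is literal too
const range = /^[0-9]$/;    // hyphen between 0 and 9 defines a range
const negated = /^[^0-9]$/; // a leading ^ inside brackets negates the set

console.log(sep.test('.'));     // true
console.log(sep.test('x'));     // false: inside brackets, "." does not mean "any character"
console.log(range.test('5'));   // true
console.log(range.test('-'));   // false: the hyphen itself is not part of the range
console.log(negated.test('a')); // true
console.log(negated.test('5')); // false
```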
Quantifiers are greedy by default, meaning they match as much as possible. To make a quantifier lazy (match as little as possible), add ? after it, as in *? or {2,4}?. Our phone number example does not use lazy matching.
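A short sketch of the difference, using a range quantifier on a run of digits. The sample string and names are illustrative.

```javascript
const digits = '5551234';

// Greedy: {2,4} takes as many digits as it can (up to four).
const greedy = digits.match(/\d{2,4}/)[0];
// Lazy: {2,4}? stops as soon as the minimum (two) is satisfied.
const lazy = digits.match(/\d{2,4}?/)[0];

console.log(greedy); // '5551'
console.log(lazy);   // '55'
```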
Boundaries specify a position where a match should occur, such as \b for a word boundary. Our phone number example does not use \b; it relies on the ^ and $ anchors instead.
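For comparison, here is a sketch of how \b could be used on a shortened pattern to find a number inside a longer sentence while rejecting one that is glued to extra digits. The names `bounded` and `unbounded` are assumptions for this example.

```javascript
const bounded = /\b\d{3}-\d{4}\b/;
const unbounded = /\d{3}-\d{4}/;

console.log(bounded.test('call 555-1234 now')); // true
console.log(bounded.test('5555-12345'));        // false: digits continue on both sides
console.log(unbounded.test('5555-12345'));      // true: matches '555-1234' inside the string
```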
Back-references, such as \1, match the same text that an earlier capturing group matched. Our phone number example does not use back-references.
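One place a back-reference could help is in requiring the same separator in both positions, which the tutorial's pattern does not do. The sketch below is a hypothetical stricter variant, not part of the original regex. The names `sameSeparator` and `original` are illustrative.

```javascript
// Hypothetical variant: \1 must repeat whatever the first ([-.\s]) captured.
const sameSeparator = /^\d{3}([-.\s])\d{3}\1\d{4}$/;
// The tutorial's pattern, for comparison.
const original = /^(?:\+1\s?)?(?:\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}$/;

console.log(sameSeparator.test('555-555-5555')); // true
console.log(sameSeparator.test('555.555.5555')); // true
console.log(sameSeparator.test('555-555.5555')); // false: separators differ
console.log(original.test('555-555.5555'));      // true: mixed separators are accepted
```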
Look-ahead ((?=...)) and look-behind ((?<=...)) check whether a pattern is followed by, or preceded by, another pattern without including that second pattern in the match. Their negative forms, (?!...) and (?<!...), check that it is not. Our phone number example does not use look-ahead or look-behind.
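A brief sketch of the three most common forms. The sample strings and variable names are illustrative, and look-behind assumes a JavaScript engine with ES2018 support, such as a current version of Node.js.

```javascript
// Look-ahead: three digits that are followed by a hyphen (the hyphen is not part of the match).
const beforeDash = '555-1234'.match(/\d{3}(?=-)/)[0];
console.log(beforeDash); // '555'

// Look-behind: three digits that are preceded by "+1" and a whitespace character.
const afterPrefix = '+1 801'.match(/(?<=\+1\s)\d{3}/)[0];
console.log(afterPrefix); // '801'

// Negative look-ahead: three digits, but not "000".
const notAllZeros = /^(?!000)\d{3}$/;
console.log(notAllZeros.test('801')); // true
console.log(notAllZeros.test('000')); // false
```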
In JavaScript, for example, we can create a function that uses our phone number regex as follows:
const isPhoneNumber = (number) => {
  // Regex literals in JavaScript are delimited by forward slashes
  const regex = /^(?:\+1\s?)?(?:\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}$/
  // test() returns true if the string matches the pattern, false otherwise
  return regex.test(number)
}
We can then call this function with an argument, and it will return either true or false:
isPhoneNumber('555-555-5555') // true
isPhoneNumber('5555555555') // true
isPhoneNumber('this is a test') // false
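To cover more of the formats mentioned at the start (country code, parentheses, dots, spaces), here is a small self-contained sketch that runs the same pattern over a list of samples. The `phonePattern` and `samples` names and the sample strings are assumptions for illustration.

```javascript
const phonePattern = /^(?:\+1\s?)?(?:\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}$/;

const samples = [
  '+1 (555) 555-5555', // country code, parentheses, space, hyphen
  '(555)555-5555',     // parentheses with no separator after them
  '555.555.5555',      // dots
  '+15555555555',      // country code with no separators at all
  '1-555-555-5555',    // rejected: country code without "+"
  '555-555-555',       // rejected: last group has only three digits
];

// Print each sample next to the result of the test.
for (const sample of samples) {
  console.log(sample, phonePattern.test(sample));
}
```

The first four samples match and the last two do not.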
In this gist, we went over anchors, quantifiers, the OR operator, character classes, flags, grouping and capturing, bracket expressions, greedy and lazy matching, boundaries, back-references, look-ahead and look-behind, and a way to use the regex to detect phone numbers.
Fun fact: I am actually using this same function in a private repo I am building, a Twitch chatbot for pro gamers. The isPhoneNumber(number) function takes the chat text string as an argument and returns whether the user posted a phone number in chat. If it returns true, the message is auto-deleted by the moderator chatbot. This helps protect the Twitch community's anonymity online.
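One caveat for chat moderation: because the pattern is anchored with ^ and $, it matches only when the whole message is a phone number. If a number inside a longer message should be caught too, an unanchored variant is one option. The sketch below is a hypothetical helper, not the code from the private repo. The name `containsPhoneNumber` and the digit look-arounds are assumptions added for this example.

```javascript
// Hypothetical unanchored variant of the tutorial's pattern.
// (?<!\d) and (?!\d) keep it from matching inside a longer run of digits.
const embeddedPhone = /(?<!\d)(?:\+1\s?)?(?:\(\d{3}\)|\d{3})[-.\s]?\d{3}[-.\s]?\d{4}(?!\d)/;

const containsPhoneNumber = (message) => embeddedPhone.test(message);

console.log(containsPhoneNumber('call me at 555-555-5555 tonight')); // true
console.log(containsPhoneNumber('this is a test'));                  // false
console.log(containsPhoneNumber('order 12345678901234'));            // false: too many digits
```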
You can test this phone number pattern here if you'd like. We used two test cases.
Hi, I'm Christian Martinez, a full stack web developer from Utah. You can reach me by email with any questions or check out my GitHub profile here 😊