A tutorial that explains how a specific regular expression, or regex, works by breaking down each part of the expression.
A regex is a tool for defining a search pattern. It lets us search text programmatically using ordinary characters combined with meta characters, which carry a special meaning. For example, the following regular expression helps us match phone numbers such as those found on a website:
^(?:(\+|00)\d{1,3}[-\s.])?(\()?\d{3,10}(\))?(?:[-\s.)]\d{2,7}([-\s.]\d{2,5})?([-\s.]\d{2})?)?$
/i
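To try the expression out, here is a minimal sketch in Python. The helper name is_phone_number and the sample strings are assumptions made for illustration, and Python spells the /i flag as re.IGNORECASE.

```python
import re

# The tutorial's pattern, split across two raw strings for readability.
PHONE_RE = re.compile(
    r"^(?:(\+|00)\d{1,3}[-\s.])?(\()?\d{3,10}(\))?"
    r"(?:[-\s.)]\d{2,7}([-\s.]\d{2,5})?([-\s.]\d{2})?)?$",
    re.IGNORECASE,  # Python's spelling of the /i flag
)


def is_phone_number(text: str) -> bool:
    """Return True when the string matches the anchored phone pattern."""
    return PHONE_RE.match(text) is not None


# Hypothetical sample inputs, chosen only for illustration.
for sample in ["+30 210 1234567", "(555) 123-4567", "555-1234", "hello", "12"]:
    print(sample, "->", is_phone_number(sample))
```

The first three samples match and the last two do not, because the pattern requires at least three digits in a row.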
- Anchors
- Quantifiers
- OR Operator
- Character Classes
- Flags
- Grouping and Capturing
- Bracket Expressions
- Greedy and Lazy Match
Our regular expression is made up of five components so that it can match many of the formats used around the world.
- The first component is the international prefix for the country code:
^(?:(\+|00)\d{1,3}[-\s.])?
- The next component is the area code:
(\()?\d{3,10}(\))?
- The last three components match the rest of the phone number, separated by whichever delimiter the user prefers:
- (?:[-\s.)]\d{2,7}
- ([-\s.]\d{2,5})?
- ([-\s.]\d{2})?)?$
- The $ asserts the position at the end of the string.
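The five components can be glued back together to check that they really form the full expression. The sketch below does this in Python; the variable names and the sample number are assumptions made for illustration. Note that the last three pieces are not valid regexes on their own, because the non-capturing group opened in the third piece is only closed in the fifth.

```python
import re

# The five components described above, in order.
COUNTRY_CODE = r"^(?:(\+|00)\d{1,3}[-\s.])?"
AREA_CODE = r"(\()?\d{3,10}(\))?"
REST_1 = r"(?:[-\s.)]\d{2,7}"   # opens a group that REST_3 closes
REST_2 = r"([-\s.]\d{2,5})?"
REST_3 = r"([-\s.]\d{2})?)?$"

pattern = COUNTRY_CODE + AREA_CODE + REST_1 + REST_2 + REST_3

# Hypothetical sample number, used only to show what each group captures.
m = re.match(pattern, "+1 (555) 123-4567")
print(m.groups())
```

For this sample the capture groups hold the + sign, the two parentheses, and the -4567 block; the final optional group stays empty (None).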
Anchors match a position in the string rather than a character, which pins the search pattern to a specific place.
Example: ^123$ where the ^ character indicates the start of a string, and the $ character indicates the end of a string. In our example, we use two anchors, one at the beginning and one at the end, so that the whole string must be a phone number.
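A small Python sketch of the ^123$ example shows the effect of the two anchors. The test strings are assumptions chosen for illustration.

```python
import re

# With both anchors, the pattern must cover the string from start to end.
anchored_exact = re.search(r"^123$", "123") is not None     # True
anchored_prefix = re.search(r"^123$", "0123") is not None   # False
anchored_suffix = re.search(r"^123$", "1234") is not None   # False

# Without anchors, the same digits are found anywhere inside the string.
unanchored = re.search(r"123", "0123") is not None           # True

print(anchored_exact, anchored_prefix, anchored_suffix, unanchored)
```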
? is the quantifier that makes the preceding element optional: it matches zero or one occurrence.
Example: (\()? includes the ( if it exists.
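As a quick illustration in Python, the sketch below applies (\()? in front of three digits. The shortened pattern and the inputs are assumptions made for this example, not part of the original expression.

```python
import re

# An optional opening parenthesis followed by exactly three digits.
pattern = re.compile(r"(\()?\d{3}")

with_paren = pattern.fullmatch("(555")
without_paren = pattern.fullmatch("555")

print(with_paren.group(1))     # the captured "("
print(without_paren.group(1))  # None, because the optional group was skipped
```

Both inputs match; the only difference is whether the optional group captured anything.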
The | meta character is the OR operator: it matches either the expression on its left or the one on its right.
Example: (\+|00) matches if a number starts with a + sign or with two zeros.
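The alternation can be checked in isolation with a short Python sketch. The sample prefixes are assumptions chosen for illustration.

```python
import re

# Either a literal plus sign or two zeros at the start of the string.
prefix = re.compile(r"(\+|00)")

plus_form = prefix.match("+30")
zeros_form = prefix.match("0030")
no_prefix = prefix.match("30")

print(plus_form.group(1), zeros_form.group(1), no_prefix)
```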
. = Any character except a newline
When combined with the \ escape character, \. = A literal period
[a-z] = Accepts any lowercase letter between a and z
[0-9] = Accepts any digit between 0 and 9
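The following Python sketch checks each of these classes against a single character or a short string. The inputs are assumptions chosen for illustration.

```python
import re

# An unescaped dot matches any single character except a newline.
dot_matches_letter = re.fullmatch(r".", "x") is not None         # True
dot_matches_newline = re.fullmatch(r".", "\n") is not None       # False

# An escaped dot matches only a literal period.
escaped_matches_dot = re.fullmatch(r"\.", ".") is not None       # True
escaped_matches_letter = re.fullmatch(r"\.", "x") is not None    # False

# Ranges pick out lowercase letters and digits respectively.
lowercase = re.findall(r"[a-z]", "aB3")
digits = re.findall(r"[0-9]", "aB3")

print(lowercase, digits)
```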
/i modifier: case-insensitive match (ignores the case of [a-zA-Z]).
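Python has no /i suffix; the equivalent is the re.IGNORECASE flag. A minimal sketch of its effect, using an assumed sample word:

```python
import re

# Without the flag, [a-z] rejects the uppercase "H".
sensitive = re.fullmatch(r"[a-z]+", "Hello")

# With the flag, letter case is ignored and the whole word matches.
insensitive = re.fullmatch(r"[a-z]+", "Hello", re.IGNORECASE)

print(sensitive, insensitive.group())
```

Since the phone number expression contains no letters, the flag does not change which strings it matches.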
Example: ([-\s.]) where the () groups the [-\s.] bracket expression and captures whatever it matches.
This regex matches a single dash, whitespace character, or period.
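To see the difference between a capturing group () and the non-capturing form (?:) used elsewhere in the expression, here is a short Python sketch. The surrounding digits and sample numbers are assumptions added for illustration.

```python
import re

# A capturing group remembers which separator was used.
captured = re.search(r"\d{3}([-\s.])\d{4}", "555-1234")
separator = captured.group(1)

# A non-capturing group still groups, but stores nothing.
not_captured = re.search(r"\d{3}(?:[-\s.])\d{4}", "555.1234")
stored_groups = not_captured.groups()

print(repr(separator), stored_groups)
```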
Bracket expressions list the characters that are allowed at a single position. Multiple character classes or ranges may be combined in a single bracket expression.
Example: [a-z0-9] where the regex would match any lowercase letter or digit.
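A brief Python sketch of both bracket expressions mentioned so far, run against assumed sample strings:

```python
import re

# Lowercase letters and digits are accepted; "B" and "-" are not.
letters_and_digits = re.findall(r"[a-z0-9]", "aB3-")

# The separator class from the phone expression: dash, whitespace, or period.
separators = re.findall(r"[-\s.]", "555-123 45.67")

print(letters_and_digits, separators)
```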
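{2,} = Matches the preceding element 2 or more times
{2,7} = Matches the preceding element between 2 and 7 times
Both forms are greedy by default, meaning they match as many repetitions as possible. Appending ? (for example {2,7}?) makes the match lazy, so it matches as few repetitions as possible.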
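In Python's re module, a counted quantifier is greedy unless a ? is appended to it. The sketch below compares the greedy and lazy forms on an assumed nine-digit string.

```python
import re

digits = "123456789"

# Greedy forms take as many digits as the quantifier allows.
greedy_bounded = re.match(r"\d{2,7}", digits).group()
greedy_open = re.match(r"\d{2,}", digits).group()

# Lazy forms (trailing ?) stop as soon as the minimum is satisfied.
lazy_bounded = re.match(r"\d{2,7}?", digits).group()
lazy_open = re.match(r"\d{2,}?", digits).group()

print(greedy_bounded, greedy_open, lazy_bounded, lazy_open)
```

The greedy forms return seven and nine digits respectively, while both lazy forms return only the first two.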
© 2022 Published by Vardis Sartzetakis