Regular expressions are patterns that can be matched against strings. In JavaScript, they are written with a / (a forward slash) at the beginning and the end of the pattern. To the average eye, a regular expression - or regex for short - might look like your cat walked across your computer's keyboard. It certainly left me baffled when I first saw one, but my love for patterns and logic prompted me to do some digging. After a little bit of research, I found that regular expressions can be decoded - admittedly, some more easily than others. But I realized that in the mess of characters I was looking at, there was a purpose to each piece. Let's break it down.
I could spend several days learning about the different components found in regular expressions, but let's start with some basics. In this tutorial, I will break down the regular expression below and explain how it validates a hexadecimal color code.
regex = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/
Anchors assert a position within a string. The caret anchor, ^, signifies the beginning of a string, while the dollar anchor, $, marks the end of a string. In our regex, the caret anchor asserts that in order to match a hexadecimal color code, the string must begin with #.
For example:
let regex = /^t/
let str1 = "notebook"
let str2 = "teammate"
str1 would not be a match because its "t" is not at the beginning of the string, but str2 would be.
The dollar anchor in our regex asserts that the string must end right after the last character matched by the preceding character class, which we'll get to in a bit. Nothing else may follow it.
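To see both anchors in action, here is a minimal sketch using JavaScript's built-in test() method, which returns true when the pattern is found in the string. The variable names and the /k$/ pattern are my own, chosen only for illustration.

```javascript
// ^t only matches a "t" at the very start of the string.
const startsWithT = /^t/;
const notebookStart = startsWithT.test("notebook"); // false: the "t" is in the middle
const teammateStart = startsWithT.test("teammate"); // true: the string begins with "t"

// k$ only matches a "k" at the very end of the string.
const endsWithK = /k$/;
const notebookEnd = endsWithK.test("notebook"); // true: the string ends with "k"
const teammateEnd = endsWithK.test("teammate"); // false: the string ends with "e"

console.log(notebookStart, teammateStart, notebookEnd, teammateEnd);
```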
The quantifier in our regex specifies the number of times the preceding pattern must be repeated. A quantifier can be a specific number, but it doesn't have to be. Because a hexadecimal color code can be either 3 or 6 characters long, our quantifiers are {3} and {6}. This dictates that our pattern must be repeated either 3 or 6 times, which would result in a string of 3 or 6 characters. Note that this specified quantity does not include the #. We know this because the # is outside of the grouping (the parentheses) while the quantifiers are within the grouping.
Let's look at an example using our regex:
let regex = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/
let str1 = "#00"
let str2 = "#000"
Not including the #, str1 only contains 2 characters from the given class within the brackets (the pattern was repeated 2 times), so it wouldn't be a match. Following the #, str2 contains 3 characters, so str2 would match.
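As a quick check of the quantifiers, this sketch runs our regex against strings with 2, 3, 4, and 6 characters after the #. The sample strings and variable names are assumptions made up for this example.

```javascript
const hexColorPattern = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;

// Each sample has a different number of characters after the "#".
const samples = ["#00", "#000", "#0000", "#000000"];

// test() returns true only when the whole string fits the pattern.
const quantifierResults = samples.map((sample) => hexColorPattern.test(sample));

console.log(quantifierResults); // [ false, true, false, true ]
```

Only the strings with exactly 3 or 6 characters after the # pass, because the anchors leave no room for extra characters.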
The OR operator, |, indicates that one of multiple provided patterns must match. When reading code, simply read it as the English word "or". In matching a hexadecimal color code using our regex, let's look at just the pattern within the parentheses. We see a character class and a quantifier of 6 "OR" a character class and a quantifier of 3. Although they look very similar, these are two different patterns separated by the OR operator. One of those 2 patterns must match in order to validate a given hexadecimal color code.
Here's another example:
let regex = /gr(e|a)y/
let str1 = "grey"
let str2 = "gray"
Both strings would match the above regex.
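Here is a small sketch that confirms this with test(). The third string, "groy", is a made-up counterexample I added to show what happens when neither option matches.

```javascript
const greyOrGray = /gr(e|a)y/;

const greyMatches = greyOrGray.test("grey"); // true: "e" satisfies the first option
const grayMatches = greyOrGray.test("gray"); // true: "a" satisfies the second option
const groyMatches = greyOrGray.test("groy"); // false: "o" is neither "e" nor "a"

console.log(greyMatches, grayMatches, groyMatches);
```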
A character class is the set of characters enclosed in square brackets. Let's look at our regex again.
let regex = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/
[A-Fa-f0-9] is our specified character class. Any one of the characters inside the brackets may be a match. It lists the characters that may appear in a valid hexadecimal color code. Since hex color codes only use the letters "a" through "f", this is the range of letters specified within the character class. Hex color codes are not case sensitive, so both uppercase and lowercase letters are accepted, which is also specified within the brackets. Lastly, you see "0-9", which, as you probably guessed, asserts that any digit from 0 to 9 may also be found in a hex color code.
Here's another example:
let regex = /[a-c]{3}/
let str1 = "cab"
let str2 = "car"
str2 contains "c" and "a", but because of the quantifier, a matching string must contain three characters from the class in a row. Because the character class specifies only "a" or "b" or "c", the "r" in str2 breaks the run, so str2 is not a match, but str1 is.
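The sketch below checks both strings with test(). I also added a third string, "taxicab", as my own example to show that without anchors the three characters can appear anywhere in the string.

```javascript
const threeFromAToC = /[a-c]{3}/;

const cabMatches = threeFromAToC.test("cab"); // true: "c", "a", "b" are all in the class
const carMatches = threeFromAToC.test("car"); // false: "r" is outside the class

// No ^ or $ here, so the run of three can sit inside a longer string.
const taxicabMatches = threeFromAToC.test("taxicab"); // true: it ends with "cab"

console.log(cabMatches, carMatches, taxicabMatches);
```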
A group in a regex is indicated with parentheses. In our regex, we have 1 group. Semantically named, like many components of a regular expression, a group groups a part of the regular expression to be remembered, or captured. In our example, the group's purpose is to contain the two patterns on either side of the OR operator, so that whichever one matches must come right after the # and right before the end of the string.
Let's look at an example:
let regex = /hello+/
let str1 = "helloo"
let str2 = "hellooooooo"
In the above regex, the + is another type of quantifier. It means "repeat the preceding character 1 or more times". This means that both str1 and str2 are matches.
Now, let's change that regex by adding parentheses, creating a grouping.
let regex = /(hello)+/
Now, the + quantifier is applied to the entire group instead of just the "o" like in the previous example. This means that matches would look more like "hellohello" or "hellohellohellohello".
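To make the difference visible, this sketch uses the string method match(), which returns an array whose first element is the text that was matched. The variable names and sample strings are my own assumptions for illustration.

```javascript
// Without a group, + applies only to the final "o".
const repeatLastLetter = /hello+/;
// With a group, + applies to the whole word "hello".
const repeatWholeWord = /(hello)+/;

const extraOsNoGroup = "hellooo".match(repeatLastLetter)[0];      // "hellooo"
const extraOsWithGroup = "hellooo".match(repeatWholeWord)[0];     // "hello"
const doubledNoGroup = "hellohello".match(repeatLastLetter)[0];   // "hello"
const doubledWithGroup = "hellohello".match(repeatWholeWord)[0];  // "hellohello"

console.log(extraOsNoGroup, extraOsWithGroup, doubledNoGroup, doubledWithGroup);
```

Both patterns find something in both strings, but the matched text shows which part the + is repeating.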
Let's circle back to our original regular expression and deconstruct it.
regex = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/
/ indicates the beginning of the regular expression
^ indicates the beginning of the string to be matched
# indicates the first character in the hexadecimal color code
( indicates the beginning of a group
[A-Fa-f0-9] indicates the characters that may be included in the hexadecimal color code
{3} and {6} indicate the number of times the preceding pattern is to be repeated
| indicates that one of the two patterns on either side must match the given string
) indicates the end of a group
$ indicates the end of the string to be matched
/ indicates the end of the regular expression
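Putting the pieces together, here is one way the regex might be wrapped in a small helper. The function name isHexColor and the sample inputs are hypothetical; this is a sketch of how the pattern could be used, not the only way to validate a color code.

```javascript
const HEX_COLOR = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;

// Hypothetical helper: returns true when value fits the pattern above.
function isHexColor(value) {
  return HEX_COLOR.test(value);
}

const checks = {
  shortForm: isHexColor("#FFF"),       // true: # followed by 3 valid characters
  mixedCase: isHexColor("#1a2B3c"),    // true: upper and lower case are both allowed
  missingHash: isHexColor("FFF"),      // false: ^# requires a leading #
  badLetter: isHexColor("#GGG"),       // false: "G" is outside A-F
  wrongLength: isHexColor("#12345"),   // false: 5 characters is neither 3 nor 6
  trailingText: isHexColor("#FFFFFF!") // false: $ forbids anything after the 6 characters
};

console.log(checks);
```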
My name is Mia Mauro. I am a student in DU's Full-Stack Coding Bootcamp. Click here to view my GitHub profile.