Regular expressions (Regex) are an important part of coding and server management. They allow you to run many checks in one command.
Examples include searching a file, several files, a directory, or your entire file system with grep; validating an email address on a web
form; and checking that user input meets requirements in a command line application. I want to break down Regex in this
simple tutorial using a password strength checker. Simply put, if a password meets all the requirements then it will pass, and if
not then it will fail. Simple?
I will provide explanations, definitions, and examples. All examples are formatted for JavaScript, so you can execute them in your browser console or the Node command line interface. The Regex pattern that we will be using today checks password strength. It defines a standard and checks that the password provided meets it.
/^(?!.*(?:\s|(.)\1))(?=(.*[\d]){1,})(?=(.*[a-z]){1,})(?=(.*[A-Z]){1,})(?=(.*[\W]){1,}).{7,30}$/
The pattern may look long, as if I just typed a bunch of random characters on a keyboard.
However, that is not the case.
It checks that the password contains no whitespace and no adjacent duplicate characters, that it contains at least one number, one special character, one uppercase letter, and one lowercase letter, and that it is between 7 and 30 characters long.
You can play around with this Regex pattern here
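To see the pattern in action, here is a minimal sketch that runs it against a few sample passwords. The sample passwords and the names PASSWORD_PATTERN and SAMPLES are my own assumptions for illustration; each failing sample is meant to break exactly one rule.

```javascript
// The password pattern from this tutorial, stored so it can be reused.
const PASSWORD_PATTERN = /^(?!.*(?:\s|(.)\1))(?=(.*[\d]){1,})(?=(.*[a-z]){1,})(?=(.*[A-Z]){1,})(?=(.*[\W]){1,}).{7,30}$/;

// Hypothetical sample passwords, chosen to trip one rule each.
const SAMPLES = [
  'Str0ng!Pas',  // meets every rule
  'Str0ng! Pas', // contains a space
  'Str0ng!Pass', // "ss" is an adjacent duplicate
  'Strong!Pas',  // no number
  'Str0ngPas1',  // no special character
  'S0!a',        // shorter than 7 characters
];

const results = SAMPLES.map((password) => PASSWORD_PATTERN.test(password));
console.log(results); // [ true, false, false, false, false, false ]
```

Only the first sample passes, because a password has to satisfy every rule at the same time.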
It is important to note that in most cases a delimiter is used to start and stop the pattern. In JavaScript the delimiter is the forward slash (/).
The first one is called Open and the last one is called Close.
Close does not represent the end of the whole expression.
Flags come after the Close.
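As a small sketch of Open, Close, and flags in JavaScript, the example below writes the same pattern twice: once as a literal with delimiters and once with the RegExp constructor, which takes the flags as a separate argument. The variable names and the sample text are assumptions made for this example.

```javascript
// A regex literal: the slashes are the Open and Close delimiters,
// and the "i" after the Close is a flag (case-insensitive matching).
const literal = /snake/i;

// The RegExp constructor takes the pattern and the flags as separate strings,
// so no delimiters are written.
const constructed = new RegExp('snake', 'i');

console.log(literal.source);                   // "snake"
console.log(literal.flags);                    // "i"
console.log(constructed.test('Sneaky SNAKE')); // true
console.log(/snake/.test('Sneaky SNAKE'));     // false, there is no "i" flag
```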
- Anchors
- Quantifiers & Alternations
- Character Classes
- Flags
- Grouping and Capturing
- Bracket Expressions
- Greedy and Lazy Match
- Boundaries
- Back-references
- Look-ahead and Look-behind
Anchors in Regex are just like a ship's anchor. They help define where we are stopping.
On a ship, an anchor does not define the location of the ship, just where it stops.
With Regex, we use them to define a starting point and, optionally, a stopping point.
We have four anchors to work with.
/**
^ - "Starting Anchor" Beginning of your line or string.
$ - "Ending Anchor" of your line or string.
\b - "Word Boundary" The start or end of a word inside the line or string.
\B - "Non Word Boundary" Any position that is not a word boundary, for example between two letters inside a word.
*/
/** @const {string} TESTING_STRING The string to test our Regex patterns against. */
const TESTING_STRING = 'I am a silly sneaky snake!';
/**
* @RegExp
* @returns true
* @define Pass - The very first four characters of the string are "I am"
*/
/^I am/.test(TESTING_STRING);
/**
* @RegExp
* @returns false
* @define Fail - The very first two characters are "I " and not "am"
*/
/^am/.test(TESTING_STRING);
/**
* @RegExp
* @returns true
* @define Pass - The last six characters of the string are "snake!"
*/
/snake!$/.test(TESTING_STRING);
/**
* @RegExp
* @returns false
* @define Fail - The last five characters of the string are "nake!" not "snake"
*/
/snake$/.test(TESTING_STRING);
/**
* @RegExp
* @returns true
* @define Pass - "silly" is found
*/
/\bsilly/.test(TESTING_STRING);
/**
* @RegExp
* @returns false
* @define Fail - "sill" is not a complete word in the string. The trailing \b is required here,
* because /\bsill/ on its own would match the start of "silly".
*/
/\bsill\b/.test(TESTING_STRING);
/**
* @RegExp
* @returns true
* @define Pass - "illy" is found and is not a complete word.
*/
/\Billy/.test(TESTING_STRING);
/**
* @RegExp
* @returns false
* @define Fail - "sill" the s character begins a word.
*/
/\Bsill/.test(TESTING_STRING);
Our Password Strength pattern uses the Starting Anchor and the Ending Anchor to define where the string starts and finishes:
/^...$/
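To show why both anchors matter for the length check, here is a minimal sketch comparing the length rule with and without them. The constant names and the 40-character input are assumptions made for this example.

```javascript
const LENGTH_ONLY = /.{7,30}/;       // no anchors
const LENGTH_ANCHORED = /^.{7,30}$/; // anchored at both ends

const tooLong = 'a'.repeat(40); // hypothetical 40-character input

// Without anchors, any run of 7 to 30 characters inside the string is enough.
console.log(LENGTH_ONLY.test(tooLong));            // true
// With anchors, the whole string has to fit the limit.
console.log(LENGTH_ANCHORED.test(tooLong));        // false
console.log(LENGTH_ANCHORED.test('a'.repeat(12))); // true
```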
Quantifiers define how many times the preceding token must match. The brace form takes one number as the minimum number of repetitions before the match is accepted. A second number is optional: if it is omitted, the match keeps going until another restriction is found. If a second number is provided, the token must repeat between the first number and the second number of times to be a positive match.
/**
+ - "plus" Matches 1 or more of the preceding token.
* - "star" Matches 0 or more of the preceding token.
{min,max} - "quantifier" Matches between min and max of the preceding token. max is optional: {min,} means min or more.
? - "optional" Matches 0 or 1 of the preceding token, effectively making it optional.
? - "lazy" When placed after another quantifier, makes that quantifier lazy.
| - "alternation" Matches either the expression before it or the expression after it.
*/
/** @const {string} TESTING_STRING The string to test our Regex patterns against. */
const TESTING_STRING = 'I am a silly sneaky snake!';
/**
* @RegExp
* @returns true
* Pass - "sill" is part of the string. However, "silly sneaky snake!" is what was matched. "." is a wildcard and +
* continued on until the end of the string.
*/
/sill.+/.test(TESTING_STRING)
/**
* @RegExp
* @returns true
* Pass - Matched "silly". The * allows zero or more "y" characters, so "sill" and "sillyy" would also match.
*/
/silly*/.test(TESTING_STRING)
/**
* @RegExp
* @returns true
* Pass - Matched "sneaky" because "." is a wildcard and {2,4} takes up to four of them: "eaky".
*/
/sn.{2,4}/.test(TESTING_STRING)
/**
* @RegExp
* @returns false
* Fail - Requires "snn" (an "s" followed by two or more "n" characters), which does not exist in the string.
*/
/sn{2,}/.test(TESTING_STRING)
/**
* @RegExp
* @returns true
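* Pass - Matches "sneaky". Each ? makes the letter before it optional, so the pattern would also match "snaky"
* or "snakey", but not "snake!" because the final "y" is required.
*/
/sne?a?ke?y/.test(TESTING_STRING)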
/**
* @RegExp
* @returns true
* Pass - Matches "sneak". This is most likely not the desired result. A lazy quantifier stops matching
* as soon as possible.
*/
/sneak.*?/.test(TESTING_STRING)
/**
* @RegExp
* @returns true
* Pass - Matches "sneaky" or "snake". Here "sneaky" is matched because it appears first in the string.
*/
/sneaky|snake/.test(TESTING_STRING)
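Because test() only returns true or false, it does not show which text was actually matched. The sketch below uses String.prototype.match to print the matched text for the patterns above. The helper name matched and the constant SAMPLE are assumptions made for this example.

```javascript
const SAMPLE = 'I am a silly sneaky snake!';

// Return the text matched by the pattern, or null when nothing matches.
const matched = (pattern) => {
  const result = SAMPLE.match(pattern);
  return result === null ? null : result[0];
};

console.log(matched(/sill.+/));       // "silly sneaky snake!"
console.log(matched(/silly*/));       // "silly"
console.log(matched(/sn.{2,4}/));     // "sneaky"
console.log(matched(/sn{2,}/));       // null
console.log(matched(/sneak.*?/));     // "sneak" (lazy)
console.log(matched(/sneak.*/));      // "sneaky snake!" (greedy)
console.log(matched(/sneaky|snake/)); // "sneaky"
```

Comparing the lazy and greedy versions of the same pattern side by side is a quick way to confirm which behavior you are getting.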
/**
5 Quantifiers
*/
// 1 Alternation - When checking for spaces OR duplicate characters that are adjacent.
/...?!.*(?:\s|(.)\1).../
// 4 Quantifiers are used for each character type check and one to check the length of the string.
/...{1,}...{1,}...{1,}...{1,}...{7,30}.../
Quantifiers & Alternations allow the Password Strength to function and focus on so many aspects at the same time.
1. Alternation: No spaces & No adjacent duplicate characters.
2. Quantifier: Has at least one number.
3. Quantifier: Has at least one lowercase letter.
4. Quantifier: Has at least one uppercase letter.
5. Quantifier: Has at least one special character.
6. Quantifier: At least 7 characters long and no more than 30 characters.
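The single pattern only answers pass or fail. As a rough sketch of the same requirements checked one at a time, the example below splits them into separate small patterns so you can see which rule a password breaks. The RULES table, the labels, and the failedRules helper are my own assumptions for illustration and are not part of the original pattern.

```javascript
// Hypothetical rule table: each entry pairs a label with a small pattern
// intended to mirror one requirement of the Password Strength pattern.
const RULES = [
  ['no spaces', /^(?!.*\s)/],
  ['no adjacent duplicate characters', /^(?!.*(.)\1)/],
  ['at least one number', /\d/],
  ['at least one lowercase letter', /[a-z]/],
  ['at least one uppercase letter', /[A-Z]/],
  ['at least one special character', /\W/],
  ['7 to 30 characters long', /^.{7,30}$/],
];

// Return the labels of every rule the password does not meet.
const failedRules = (password) =>
  RULES.filter(([, pattern]) => !pattern.test(password)).map(([label]) => label);

console.log(failedRules('Str0ng!Pas')); // []
console.log(failedRules('pass word'));
// [ 'no spaces', 'no adjacent duplicate characters',
//   'at least one number', 'at least one uppercase letter' ]
```

Note that 'pass word' does not fail the special character rule, because a space is matched by \W.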
Character Classes allow us to define exactly what type of character we are looking for, and they combine with a bunch of other moving parts. For example, changing a phone number using the Digit character class.
const MY_CELL = '555-555-5555'.replace(/\d\d\d-(\d\d\d-\d\d\d\d)/, '216-$1') // 216-555-5555
const MY_LOCAL_NUMBER = '555-555-5555'.replace(/\d\d\d-/, '') // 555-5555
/**
[] - "Sets" You can create character sets [y!], negated sets [^y!], and ranges [0-9] [A-Z] [a-z]
. - "Dot" This acts as a wildcard and matches any character except a line break.
\w - "Word" Will find uppercase letters, lowercase letters, numbers, and the _ character.
\W - "Not Word" Will find any character that \w does not.
\d - "Digit" Will look for only numbers.
\D - "Not Digit" Will look for anything but a number.
\s - "Whitespace" Looks for spaces, tabs, and line breaks.
\S - "Not Whitespace" Looks for everything that is not a space, tab, or line break.
\p{L} - "Unicode" Looks for a character in the specified unicode category. In JavaScript this requires the u flag.
\P{L} - "Not Unicode" Looks for any character that is not in the specified unicode category.
\p{Han} - "Unicode script" Looks for any character in the specified unicode script. In JavaScript this is written \p{Script=Han}.
\P{Han} - "Not Unicode script" Looks for any character that is not in the specified unicode script.
*/
const TEST_STRING = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*()`-=_+{}|[]\\;\':",./<>?~ \t\n'
I will not be going into Unicode or Unicode scripts. You can find more information here
/**
* @RegExp
* @returns true
* @define Pass - Matches "yz0"
* Looks for three different parts.
* 1. One character that is uppercase, lowercase, or a number.
* 2. One character that is lowercase or number.
* 3. One character that is a number.
*/
/[A-Za-z0-9][a-z0-9][0-9]/.test(TEST_STRING)
/**
* @RegExp
* @returns true
* @define Pass - Matches "z0"
* 1. Looks for one character that is anything except an uppercase letter.
* 2. Looks for one character that is anything except a lowercase letter.
*/
/[^A-Z][^a-z]/.test(TEST_STRING)
/**
* @RegExp
* @return true
* @define Pass - Matches "{" Looks for only the characters inside the brackets.
*/
/[{}]/.test(TEST_STRING)
/**
* @RegExp
* @return true
* @define Pass - Matches "z01" Looks for z wildcard 1.
*/
/z.1/.test(TEST_STRING)
/**
* @RegExp
* @return true
* @define Pass - Matches "789!@" \w\w\w\W\W
* 1. The first three characters are word characters: letters, numbers, or the _ character.
* 2. The last two characters are any characters that would not match \w
*/
/\w\w\w\W\W/.test(TEST_STRING)
/**
* @RegExp
* @return true
* @define Pass - Matches "z0123456789!"
* 1. \D Any character except numbers.
* 2. \d{10} next ten characters must be numbers.
* 3. \D Any character except numbers.
*/
/\D\d{10}\D/.test(TEST_STRING)
/**
* @RegExp
* @return true
* @define Pass - Matches "A". The brackets combine \s (Whitespace) and \S (Not Whitespace), so any character matches.
*/
/[\s\S]/.test(TEST_STRING)
/**
- 2 [] "Character Set"
- 2 [A-Z] "Ranged Set"
- 7 . "Wildcard"
- 1 \s "Whitespace"
- 1 \d "Digit"
- 1 \W "Non-Word Character"
*/
/.......\s...........[\d].......[a-z].......[A-Z].......[\W]......./
Knowing that, it becomes very obvious that the character class is a crucial part of Regex! Without them, our Password Strength pattern might not even be possible.
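One detail of the character classes is easy to miss: the _ character belongs to \w, so the \W check in the Password Strength pattern does not count it as a special character. Here is a minimal sketch of that behavior; the constant name PASSWORD_STRENGTH and the two sample passwords are assumptions made for this example.

```javascript
// The password pattern from this tutorial.
const PASSWORD_STRENGTH = /^(?!.*(?:\s|(.)\1))(?=(.*[\d]){1,})(?=(.*[a-z]){1,})(?=(.*[A-Z]){1,})(?=(.*[\W]){1,}).{7,30}$/;

console.log(/\w/.test('_')); // true, "_" is a word character
console.log(/\W/.test('_')); // false

// Hypothetical passwords that differ only in their last character.
console.log(PASSWORD_STRENGTH.test('Abcdefg1_')); // false, "_" is not matched by \W
console.log(PASSWORD_STRENGTH.test('Abcdefg1!')); // true
```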