@minprocess
Last active October 23, 2023 20:12
Testing a regex for validating password strings

Testing a regex for validating a password string

Summary

This tutorial shows the results of my testing of a regular expression (regex) that I found in the following stackoverflow.com question: https://stackoverflow.com/questions/2370015/regular-expression-for-password-validation

The user asked for a regular expression for validating a password with the following condition: "Password must contain 8 characters and at least one number, one letter and one unique character such as !#$%&? "

The proposed regex (which will only work with the Latin alphabet) is ^.*(?=.{8,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$

Because I have a web app that generates random passwords, I thought it would be interesting to generate 10 passwords with that app and test them with the proposed regex. I added a string that a stackoverflow commenter said was not matched when tested although it should have been. I tested a 12th string with no letters.

The screen capture at the end of this gist shows testing of the proposed expression in regex101.com. The strings that match are shaded with a blue background. The strings that do not match have the default white background. Examination of the strings shows that, for this limited set of test strings, the proposed regex works. This is discussed in the section Did the Expression Work.

While researching this gist, I found some interesting thoughts about passwords.

There are very good reasons to stop requiring passwords and let people log in with Google, Facebook, Twitter, Yahoo, or any other valid form of Internet driver's license. The reasons are given in the following https://stackoverflow.com article that answers the question "How should one properly validate passwords?"
https://stackoverflow.com/questions/48345922/reference-password-validation

The above article proposes a regex for validating a password using any Unicode script. If one is going to validate a password it seems a lot better to allow any script.

The following https://stackoverflow.com post describes the advantages and disadvantages of client-side and server-side validation: https://stackoverflow.com/questions/162159/javascript-client-side-vs-server-side-validation

The conclusion of that post is that it is better to validate on both the client side and the server side.

Table of Contents

Did the Expression Work

The attached screen capture of https://regex101.com shows 13 random strings and a 14th string that was not random. Four strings of length 10 were not matched: 'XAWjZXKPXi', 'b2NGCySJbH', 'Xi4dIr7oZx' and 'n9gmz78VfG'. None of them contains a special character. The string '12345%%%%' was not matched because it has no letter in it. Finally, the two strings that had only 6 characters were not matched.

Based upon the testing in https://regex101.com the proposed expression works.
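
The same check can be reproduced outside regex101 with a few lines of JavaScript. This is only a minimal sketch: the names `passwordRegex`, `samples` and `results` are mine, and `abc123!x` is a made-up string added so that at least one sample matches. The other strings are the ones quoted above.

```javascript
// The proposed expression, written as a JavaScript regex literal.
const passwordRegex = /^.*(?=.{8,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$/;

// Strings quoted in this section, plus one hypothetical string that meets every rule.
const samples = [
  "XAWjZXKPXi", // no digit, no special character
  "b2NGCySJbH", // no special character
  "Xi4dIr7oZx", // no special character
  "n9gmz78VfG", // no special character
  "12345%%%%",  // no letter
  "abc123!x",   // hypothetical: letter, digit, special character, 8 characters
];

// Pair each string with the result of the test.
const results = samples.map((s) => [s, passwordRegex.test(s)]);

for (const [s, ok] of results) {
  console.log(`${s} -> ${ok ? "match" : "no match"}`);
}
```

Only the last, made-up string should be reported as a match.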

Breakdown of the Expression

The proposed expression ^.*(?=.{8,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$ is well done and has a simple and logical structure, as shown in the table below, which was provided with the expression. It has an anchor for the start ^, four positive lookahead assertions (?=...) that enforce the constraints on length (8 or more), letters (at least one), digits (at least one) and special characters (at least one), and an anchor for the end $. The lookaheads only check what follows and consume nothing, so the .* terms before and after them are what actually match the characters of the password.

Sub-expression Meaning
^ Start
(?=.{8,}) Length: 8 or more characters must follow
(?=.*[a-zA-Z]) Letters
(?=.*\d) Digits
(?=.*[!#$%&? "]) Special characters
$ End
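
One way to see which row of the table rejects a given password is to test each lookahead on its own. The sketch below is illustrative only: the `rules` array and the `failedRules` helper are hypothetical names of mine and are not part of the original answer.

```javascript
// Each rule from the table, anchored at the start and tested separately.
const rules = [
  { name: "length of 8 or more", pattern: /^(?=.{8,})/ },
  { name: "at least one letter", pattern: /^(?=.*[a-zA-Z])/ },
  { name: "at least one digit", pattern: /^(?=.*\d)/ },
  { name: "at least one special character", pattern: /^(?=.*[!#$%&? "])/ },
];

// Return the names of the rules that a candidate password fails.
function failedRules(password) {
  return rules.filter((r) => !r.pattern.test(password)).map((r) => r.name);
}

console.log(failedRules("12345%%%%"));  // fails only the letter rule
console.log(failedRules("XAWjZXKPXi")); // fails the digit and special character rules
```

Splitting the rules like this also makes it possible to show a user a specific message instead of a single pass or fail.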

Breakdown of the term for minimum length (?=.{8,})

(?= ) is a positive lookahead. It asserts that the text after the current position matches .{8,} without consuming any characters. The term .{8,} applies the quantifier {8,} to . (any character except a line break), so 8 or more characters must follow. See the table in Quantifiers below.
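
The zero-width behaviour can be observed directly. In this small sketch (the variable names are mine) the lookahead succeeds on an 8-character string, yet the matched text is empty because nothing was consumed.

```javascript
// A lookahead succeeds or fails without consuming any characters.
const minLength = /^(?=.{8,})/;

const longEnough = minLength.exec("abcdefgh"); // 8 characters
const tooShort = minLength.exec("abc");        // 3 characters

console.log(longEnough[0].length); // 0
console.log(longEnough.index);     // 0
console.log(tooShort);             // null
```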

Breakdown of the term for at least one letter (?=.*[a-zA-Z])

(?= ) is again a positive lookahead, this time for .*[a-zA-Z]. The first part, .*, means zero or more of any character. The second part, [a-zA-Z], requires one lowercase or uppercase Latin letter. Together they assert that a letter appears somewhere ahead. See the table in Character Classes.

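At first glance it seems that the expression would not work with .* because this term allows zero characters. That is exactly what is needed: zero characters before the letter covers the case where the letter is the first character. Using .+, which means one or more of any character, also worked on my test strings, but it is stricter. It demands at least one character before the letter, so it can reject a password whose only letter comes first.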
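
The difference between .* and .+ in the letter term shows up with a password whose only letter is the first character. The two test strings below are made up for illustration, and `withStar` and `withPlus` are my own names for the two variants.

```javascript
// Original expression, with .* in the letter lookahead.
const withStar = /^.*(?=.{8,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$/;
// Variant with .+ in the letter lookahead.
const withPlus = /^.*(?=.{8,})(?=.+[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$/;

const letterFirst = "a1234567!"; // hypothetical: the only letter is the first character
const letterLater = "1a234567!"; // hypothetical: the only letter is the second character

console.log(withStar.test(letterFirst)); // true
console.log(withPlus.test(letterFirst)); // false
console.log(withStar.test(letterLater)); // true
console.log(withPlus.test(letterLater)); // true
```

The .+ variant fails on the first string because the length lookahead leaves no position from which a letter is still ahead with at least one character before it.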

Breakdown of the term for at least one digit (?=.*\d)

(?= ) is a positive lookahead for .*\d, where .*\d means zero or more of any character followed by a digit [0-9]. See the section Shorthand Character Classes.

Breakdown of the term for at least one special character (?=.*[!#$%&? "])

As described above, this term combines a positive lookahead (?= ) with a character class, [!#$%&? "], that lists the special characters. Note that the class as written also contains a space and a double quote.
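
A quick way to check which characters count as special is to test the character class by itself. The sample strings are made up, and `specialChar` is my own name.

```javascript
// The character class from the proposed expression, on its own.
const specialChar = /[!#$%&? "]/;

console.log(specialChar.test("pass word")); // true: the space is in the class
console.log(specialChar.test('a"b'));       // true: the double quote is in the class
console.log(specialChar.test("p@ssword"));  // false: @ is not in the class
```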

Components

The following tables of regex components use the definitions from https://regex101.com. A very good reference is https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

The following websites for learning, building and testing regular expressions were suggested by the Mozilla.org site: https://regex101.com https://regexr.com/ https://extendsclass.com/regex-tester.html

Anchors

Anchor Effect
\G Start of match
^ Start of string
$ End of string
\A Start of string
\Z End of string
\z Absolute end of string
\b A word boundary
\B Non-word boundary

Example: To match an integer use ^\d+$. Because “start of string” must be matched before the match of \d+, and “end of string” must be matched right after it, the entire string must consist of digits for ^\d+$ to be able to match.
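
The effect of the two anchors in this example is easy to confirm in JavaScript. The variable names below are mine. Note that, of the anchors in the table, JavaScript regular expressions support only ^, $, \b and \B.

```javascript
// With both anchors, the whole string must consist of digits.
const wholeInteger = /^\d+$/;
// Without anchors, a run of digits anywhere in the string is enough.
const anyDigits = /\d+/;

console.log(wholeInteger.test("12345")); // true
console.log(wholeInteger.test("12a45")); // false
console.log(anyDigits.test("12a45"));    // true
```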

Quantifiers

Quantifier Effect
a? Zero or one of a
a* Zero or more of a
a+ One or more of a
a{3} Exactly 3 of a
a{3,} 3 or more of a
a{3,6} Between 3 and 6 of a
a* Greedy quantifier
a*? Lazy quantifier
a*+ Possessive quantifier
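
The greedy and lazy rows are the easiest to confuse, so here is a small sketch with a made-up input string. Possessive quantifiers such as a*+ are left out because JavaScript regular expressions do not support them.

```javascript
const tags = "<a><b>";

// Greedy: .+ takes as much as it can, then backs off just enough to find a final ">".
const greedy = tags.match(/<.+>/)[0];
// Lazy: .+? takes as little as it can, so it stops at the first ">".
const lazy = tags.match(/<.+?>/)[0];

console.log(greedy); // <a><b>
console.log(lazy);   // <a>

// A counted quantifier: between 3 and 6 of "a", anchored to the whole string.
const threeToSix = /^a{3,6}$/;
console.log(threeToSix.test("aaaa")); // true
console.log(threeToSix.test("aa"));   // false
```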

Grouping Constructs

Grouping Effect
(?:...) Match everything enclosed
(...) Capture everything enclosed
(?>...) Atomic group (non-capturing)
(?|...) Duplicate/reset subpattern group number
(?=...) Positive lookahead (asserts a match ahead without consuming it)
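
The difference between capturing and non-capturing groups shows up in the array that `match` returns. This is a minimal sketch with a made-up date string. The atomic group (?>...) and the reset group (?|...) are not shown because JavaScript regular expressions do not support them.

```javascript
const input = "2023-10";

// Two capturing groups: the year and the month are both stored.
const capturing = input.match(/(\d{4})-(\d{2})/);
// The month is matched by (?:...) but not stored.
const nonCapturing = input.match(/(\d{4})-(?:\d{2})/);

console.log(capturing.length);    // 3: full match, year, month
console.log(capturing[2]);        // 10
console.log(nonCapturing.length); // 2: full match, year
console.log(nonCapturing[0]);     // 2023-10
```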

Character Classes

Class Effect
[abc] A single character of a, b or c
[^abc] A character except: a, b or c
[a-z] A character in the range: a-z
[^a-z] A character not in the range: a-z
[a-zA-Z] A character in the range: a-z or A-Z

Shorthand Character Classes

Character Short for
\d digit [0-9]
\b word boundary
\s white space
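
The three shorthands in the table can be tried out as follows. The input strings and variable names are made up for illustration.

```javascript
// \d: collect every run of digits.
const digitRuns = "a1 b22".match(/\d+/g);
// \s: collapse any run of white space (space, tab, ...) into one space.
const collapsed = "a \t b".replace(/\s+/g, " ");
// \b: match "cat" only as a whole word, not inside "concat".
const wholeWords = "cat concat".match(/\bcat\b/g);

console.log(digitRuns);  // [ '1', '22' ]
console.log(collapsed);  // a b
console.log(wholeWords); // [ 'cat' ]
```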

The OR Operator

The following expression uses the OR operator (|): ((^|, )(part1|part2|part3))+$

I found the above regex in a https://stackoverflow.com answer: https://stackoverflow.com/questions/8020848/how-is-the-and-or-operator-represented-as-in-regular-expressions

Matching: "part1"; "part2, part1"; "part1, part2, part3"

Not matching: "part1," (with and without trailing spaces); "part3, part2,"; "otherpart1"
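
The lists above can be checked with a short script. The sample strings are my reading of those lists, and the names `listRegex`, `shouldMatch` and `shouldNotMatch` are mine.

```javascript
// (^|, ) means "start of string OR a comma followed by a space".
// (part1|part2|part3) means "one of the three allowed words".
const listRegex = /((^|, )(part1|part2|part3))+$/;

const shouldMatch = ["part1", "part2, part1", "part1, part2, part3"];
const shouldNotMatch = ["part1,", "part1, ", "part3, part2,", "otherpart1"];

for (const s of shouldMatch) {
  console.log(JSON.stringify(s), listRegex.test(s));
}
for (const s of shouldNotMatch) {
  console.log(JSON.stringify(s), listRegex.test(s));
}
```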

Flags

According to https://mozilla.org, regular expressions have optional flags that allow for functionality like global and case-insensitive searching. The table below lists seven of them. Note that in the attached screen capture, "/mg" appears on the right side of the line that holds the regular expression. The m flag makes ^ and $ match at the start and end of every line, and the g flag finds all matches, so all of the test strings are searched.

Flag Description (corresponding property)
d Generate indices for substring matches (hasIndices)
g Global search (global)
i Case-insensitive search (ignoreCase)
m Multi-line search (multiline)
s Allows . to match newline characters (dotAll)
u "unicode"; treat a pattern as a sequence of Unicode code points (unicode)
y Perform a "sticky" search that matches starting at the current position in the target string (sticky)
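
The effect of the g and m flags on the password regex can be sketched as follows. The test lines are a mix of strings from this gist and two made-up ones (`abc123!x` and `Pass word1`), and the variable names are mine.

```javascript
// One candidate password per line, as in the regex101 test box.
const lines = ["abc123!x", "XAWjZXKPXi", "Pass word1", "12345%%%%"].join("\n");

// g: find every match. m: let ^ and $ match at the start and end of each line.
const multiLineRegex = /^.*(?=.{8,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$/gm;
const matchedLines = lines.match(multiLineRegex);

// Without m, ^ and $ only match at the ends of the whole input, and . cannot cross a newline.
const withoutM = lines.match(/^.*(?=.{8,})(?=.*[a-zA-Z])(?=.*\d)(?=.*[!#$%&? "]).*$/g);

console.log(matchedLines);         // [ 'abc123!x', 'Pass word1' ]
console.log(withoutM);             // null
console.log(multiLineRegex.flags); // gm
```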

Character Escapes

According to https://regular-expressions.info/quickstart.html there are 12 special characters. These are:

Symbol Description
\ the backslash
^ the caret
$ the dollar sign
. the period or dot
| the vertical bar
? the question mark
* the asterisk
+ the plus sign
( the opening parenthesis
) the closing parenthesis
[ the opening square bracket
{ the opening curly brace
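
To match one of these characters literally, it has to be escaped with a backslash. When the text comes from a variable, a small helper can do the escaping. The sketch below uses a hypothetical helper name, `escapeRegExp`, built on a commonly used pattern. It is not part of the original gist.

```javascript
// Put a backslash in front of every character that has a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Escaped: the dot only matches a literal dot.
const literalDot = new RegExp("^" + escapeRegExp("a.b") + "$");
// Unescaped: the dot matches any character.
const anyCharDot = new RegExp("^a.b$");

console.log(escapeRegExp("1+1=2?"));  // 1\+1=2\?
console.log(literalDot.test("a.b"));  // true
console.log(literalDot.test("axb"));  // false
console.log(anyCharDot.test("axb"));  // true
```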

A detailed description of Unicode Property Escapes is given at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Unicode_Property_Escapes
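
As a small illustration of why property escapes matter for passwords in any script, the sketch below compares the Latin-only class [a-zA-Z] with \p{L} (any letter) and \d with \p{Nd} (any decimal digit). Property escapes need the u flag. The sample characters and variable names are my own choices.

```javascript
const latinLetter = /[a-zA-Z]/;
const anyLetter = /\p{L}/u;   // any Unicode letter
const asciiDigit = /\d/;
const anyDigit = /\p{Nd}/u;   // any Unicode decimal digit

const cyrillicYa = "\u044F";  // Cyrillic small letter ya
const arabicThree = "\u0663"; // Arabic-Indic digit three

console.log(latinLetter.test(cyrillicYa)); // false
console.log(anyLetter.test(cyrillicYa));   // true
console.log(asciiDigit.test(arabicThree)); // false
console.log(anyDigit.test(arabicThree));   // true
```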

Author

I wrote this tutorial while taking a Coding Boot Camp for full stack web developers. Before taking this class I spent more than 30 years programming for simulation, estimation and optimization of mineral processing. At different times I used Fortran, C, C++, and C++ plus VBA for Excel. I think that the programming skills I am learning now, primarily those associated with HTML, CSS and JavaScript, are the core skills for modern web development, or at least extremely important ones.

My LinkedIn profile is https://www.linkedin.com/in/bill-pate/

My portfolio of web apps is at https://minprocess.github.io/yet-another-portfolio/
