@geekcoldhand
Last active November 4, 2022 20:35
Regular Expressions Demo

Introduction to Regular Expressions 📖

Summary

We often see patterns in everyday strings of characters and numbers, e.g., phone numbers, emails, usernames, etc. How can we search for only emails or only usernames on a page, no matter where they appear? In other words, how can we find out whether text on the page follows the pattern of a username?

We can teach the computer to recognize these patterns with regular expressions (regex for short). If we wanted to tell the computer "tell me whether this text is an email!", it would look something like this:

/^([a-z0-9]+)@([\da-z]+)\.([a-z]{2,6})$/

But it's saying much more than that. It's saying "I want lowercase letters or digits before the @, I want the middle part to be lowercase letters a-z or digits, and I want the last part to be between 2 and 6 letters long ..."
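
As a minimal sketch, a pattern like this can be tried out in JavaScript with `test`. The pattern here uses `+` (one or more) after the first two brackets so that each part may be longer than one character, and the helper name `isSimpleEmail` is hypothetical, chosen only for illustration. Real-world email addresses allow more characters than this simple pattern does.

```javascript
// Simple email shape: before @ middle . after
const simpleEmail = /^([a-z0-9]+)@([\da-z]+)\.([a-z]{2,6})$/;

// Hypothetical helper: does the WHOLE string look like a simple email?
function isSimpleEmail(text) {
  return simpleEmail.test(text);
}

console.log(isSimpleEmail("jane42@example.com")); // true
console.log(isSimpleEmail("jane42@example"));     // false (no dot part)
console.log(isSimpleEmail("Jane@example.com"));   // false (uppercase J is not in [a-z0-9])
```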

Regex Components

There are a few syntax rules to follow when writing a regex. First, in languages such as JavaScript, a regex literal is wrapped in two forward slashes: your entire expression sits between them, like /regex/. The email example above has these same components.

/regex here/

Anchors

Anchors tie a match to the beginning or the end of the text being searched. If we want our expression to expect certain criteria at the beginning or at the end, we can use the anchors caret ^ and dollar sign $. The caret goes right after the opening forward slash and the dollar sign goes right before the closing one, like so: /^(regex) (regex)$/.

I've chosen to put parentheses around two separate expressions to highlight this. The expression following the caret must match at the start of the text, whereas the expression before the dollar sign must match at the end.

/^(regex) (regex)$/
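
A small sketch of the two anchors in JavaScript. The variable names and sample strings are made up for illustration.

```javascript
// ^ ties the match to the start of the string, $ ties it to the end.
const startsWithHello = /^hello/;
const endsWithWorld = /world$/;

console.log(startsWithHello.test("hello there"));   // true
console.log(startsWithHello.test("oh, hello"));     // false (hello is not at the start)
console.log(endsWithWorld.test("brave new world")); // true
console.log(endsWithWorld.test("world peace"));     // false (world is not at the end)
```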

Quantifiers

Quantifiers tell the expression how many times something may repeat. We use curly brackets to write them as {min,max}, and you will see them placed directly after a group or set of numbers or letters. You can think of them as a minimum and a maximum count for whatever comes right before them. For instance, our emails have a couple of key components, the @ and the dot (.), so we can break an email into three parts: before @ middle . after.

In the email example we set a min and max on the after part: no fewer than 2 characters and no more than 6. Take another look at the expression from the beginning:

/^([a-z0-9]+)@([\da-z]+)\.([a-z]{2,6})$/
/^([regex]+)@([regex]+)\.([regex]{2,6})$/
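
To see the {2,6} quantifier on its own, here is a short sketch that checks only the after part. The variable name and sample strings are assumptions for illustration.

```javascript
// Between 2 and 6 lowercase letters, and nothing else in the string.
const twoToSixLetters = /^[a-z]{2,6}$/;

console.log(twoToSixLetters.test("io"));      // true  (2 letters)
console.log(twoToSixLetters.test("museum"));  // true  (6 letters)
console.log(twoToSixLetters.test("x"));       // false (only 1 letter)
console.log(twoToSixLetters.test("toolong")); // false (7 letters)
```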

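We can now identify our anchors and quantifiers, but what are the parentheses () and the square brackets []? The parentheses are grouping constructs and the square brackets are bracket expressions. It's important to note that a quantifier goes directly after the thing it repeats: in our example, {2,6} follows the bracket expression [a-z], and both sit inside the parentheses.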

Grouping Constructs

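Grouping constructs, written with parentheses (), bundle part of an expression into a single unit, such as the before, middle, and after parts of the email. Inside each group we give the expression a range to match: numbers, characters, or both. When you see numbers and characters mixed inside one set of square brackets, note that the order of the ranges does not matter, so [a-z0-9] and [0-9a-z] mean the same thing. These sets are called bracket expressions, and we denote them with square brackets [].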
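
In JavaScript, parentheses also capture the text they matched, so each part can be read back afterwards. A rough sketch, assuming a simple email pattern with `+` after the first two brackets; the variable names are chosen only for illustration.

```javascript
const emailParts = /^([a-z0-9]+)@([\da-z]+)\.([a-z]{2,6})$/;

// match() returns an array: index 0 is the whole match,
// indexes 1..3 are the three parenthesized groups.
const found = "jane42@example.com".match(emailParts);
const [, before, middle, after] = found;

console.log(before); // jane42
console.log(middle); // example
console.log(after);  // com
```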

Bracket Expressions

A bracket expression matches a single character from a set or range. For example, the bracket [a-b0-1] matches one character that is a lowercase a or b, or a digit 0 or 1. If we require the whole string to be made of such characters, as in /^[a-b0-1]+$/, we get the following valid results:

ab, b00, a1b0

And not the following:

A1pha, cat, K3
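
A short sketch of the bracket [a-b0-1] in JavaScript. Because a bracket matches one character at a time, this example adds `^`, `+`, and `$` so that the whole string must be made of those characters; the word list is made up for illustration.

```javascript
// Whole string must consist only of a, b, 0, or 1.
const onlyAb01 = /^[a-b0-1]+$/;

const words = ["ab", "b00", "A1pha", "cat", "K3"];
const matching = words.filter((word) => onlyAb01.test(word));

console.log(matching); // [ 'ab', 'b00' ]
```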

Character Classes

A character class is shorthand for a whole set of characters, such as any digit or any whitespace. You can also match anything that is not in a class (for example, \D matches any character that is not a digit), which is very useful for validation checking. A few examples:

\s : matches a single whitespace character (space, tab, or line break).
\t : matches a horizontal tab.
\v : matches a vertical tab.
\d : matches any digit.
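
A brief sketch of these classes in JavaScript. The sample strings and the variable name `digitsOnly` are assumptions for illustration.

```javascript
console.log(/\d/.test("room 101"));  // true  (contains a digit)
console.log(/\d/.test("no digits")); // false
console.log(/\s/.test("two words")); // true  (contains a space)
console.log(/\t/.test("a\tb"));      // true  (contains a tab)

// \D is the opposite of \d: strip everything that is not a digit.
const digitsOnly = "a1b22c".replace(/\D/g, "");
console.log(digitsOnly); // 122
```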

The OR Operator

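The OR operator | lets our expression match one alternative or another. It can be used inside or outside of a parenthesized group as well (inside square brackets, | is just a literal character). If we wanted to match the letter a or the letter g, we would write a|g, with no spaces around the |, since a space in a regex is matched literally.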
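
A small sketch of the OR operator in JavaScript, both on its own and inside a group. The variable names and sample words are made up for illustration.

```javascript
// Either the letter a or the letter g, anywhere in the string.
const aOrG = /a|g/;
console.log(aOrG.test("dog")); // true  (contains g)
console.log(aOrG.test("cub")); // false (neither a nor g)

// Inside a group: the whole string must be cat or dog.
const pet = /^(cat|dog)$/;
console.log(pet.test("dog")); // true
console.log(pet.test("cow")); // false

// Spaces count: this looks for "a " or " g", not for a or g.
console.log(/a | g/.test("dog")); // false
```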

Flags

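Flags are like settings for your search, and they go after the closing forward slash. You can use a single flag or combine several. They include finding all matches, making the search case-insensitive, and single-line and multiline matching.

/g: global. Finds every match instead of stopping at the first one. Without the global flag, subsequent searches will return the same match.
/i: case-insensitive
/s: single line (the dot . also matches line breaks)
/m: multiline (^ and $ also match at the start and end of each line)
/u: unicode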
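
A quick sketch of how some of these flags change a search in JavaScript. The sample text and variable names are assumptions for illustration.

```javascript
const sample = "Cat cat\ncat";

// g: collect every match instead of only the first one.
const allMatches = sample.match(/cat/g);
console.log(allMatches); // [ 'cat', 'cat' ]

// g + i: flags can be combined; i ignores upper/lower case.
const anyCase = sample.match(/cat/gi);
console.log(anyCase); // [ 'Cat', 'cat', 'cat' ]

// m: ^ and $ apply to each line, so only the second line matches.
const wholeLines = sample.match(/^cat$/gm);
console.log(wholeLines); // [ 'cat' ]

// s: lets the dot match a line break.
console.log(/a.b/.test("a\nb"));  // false
console.log(/a.b/s.test("a\nb")); // true
```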

Character Escapes

If you want to search for a character literally but the regex engine treats it as a special character, use a character escape. For example, [] normally defines a bracket expression. How can we look for something that actually contains square brackets, like hex[]? We escape the special meaning of each character by putting a backslash \ in front of it: hex\[\].
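
A minimal sketch of escaping in JavaScript, using the hex[] example. The variable name and sample strings are made up for illustration.

```javascript
// \[ and \] match literal square brackets.
const literalBrackets = /hex\[\]/;

console.log(literalBrackets.test("let hex[] = 1")); // true
console.log(literalBrackets.test("let hex = 1"));   // false

// The same idea applies to the dot:
console.log(/\./.test("no dots here")); // false (looks for a literal dot)
console.log(/./.test("no dots here"));  // true  (unescaped dot matches any character)
```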

Author

I'm a software engineer and designer based in Atlanta, focused on social networking, APIs, full-stack development, IoT devices, and everything in between.