Ever need to make sure some input given by a user follows a certain format? Need to specify exactly what the user can input? Well, RegEx (short for Regular Expressions) is what you're looking for! That's right! With this simple-to-understand concept, you (yes, you!) can check for your desired format in your code!
- Anchors
- Quantifiers
- Grouping Constructs
- Bracket Expressions
- Character Classes
- The OR Operator
- Flags
- Character Escapes
- Example
Denoted using a caret (`^`) or a dollar sign (`$`), anchors are used to ensure a string contains a specific substring at a certain position. The caret anchors a pattern to the start of the string, while the dollar sign anchors it to the end of the string.
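As a rough illustration of anchors in JavaScript, the sketch below tests a few sample strings. The patterns and the strings are hypothetical and chosen only for demonstration.

```javascript
// Hypothetical patterns: one anchored to the start, one to the end.
const startsWithHello = /^Hello/;
const endsWithWorld = /World$/;

console.log(startsWithHello.test("Hello World")); // true
console.log(startsWithHello.test("Say Hello"));   // false: "Hello" is not at the start
console.log(endsWithWorld.test("Hello World"));   // true
console.log(endsWithWorld.test("World peace"));   // false: "World" is not at the end
```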
Quantifiers are operators that determine how many times the preceding character or group may appear in a substring, using `+`, `*`, or `?`.
- `+` = Appears once or more.
- `*` = Appears zero or more times.
- `?` = Appears once or not at all.
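The three quantifiers can be compared side by side. This is a minimal sketch that applies each one to the letter "a", which is an arbitrary choice for illustration.

```javascript
// Each pattern is anchored so the whole string must consist of "a" characters.
const oneOrMore = /^a+$/;  // "a" must appear at least once
const zeroOrMore = /^a*$/; // "a" may appear any number of times, including none
const zeroOrOne = /^a?$/;  // "a" may appear once or not at all

console.log(oneOrMore.test("aaa")); // true
console.log(oneOrMore.test(""));    // false
console.log(zeroOrMore.test(""));   // true
console.log(zeroOrOne.test("a"));   // true
console.log(zeroOrOne.test("aa"));  // false
```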
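A grouping construct is a subsection of a Regular Expression, wrapped in parentheses, in which a substring must match a certain pattern. Groups are commonly combined with shorthand classes such as `\w`, `\s`, or `\W`, as in `(\w+)`.
- `(\w+)` = Match a group of one or more word characters (letters, digits, or underscores).
- `\s` = Match a whitespace character.
- `\W` = Match a non-word character, such as a space or punctuation mark (digits and underscores count as word characters).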
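To see how a group captures its part of a match, here is a small sketch. The sample text "Hello World" and the variable names are assumptions made for the example.

```javascript
// Two groups of word characters separated by one whitespace character.
const twoWords = /(\w+)\s(\w+)/;
const match = "Hello World".match(twoWords);

console.log(match[0]); // "Hello World" (the full match)
console.log(match[1]); // "Hello" (first group)
console.log(match[2]); // "World" (second group)

// \W looks for a character that is NOT a letter, digit, or underscore.
console.log(/\W/.test("abc_123")); // false
console.log(/\W/.test("abc!"));    // true
```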
A bracket expression is a group of values to be accepted, listed within square brackets (`[]`).
- `[abc]` = Matches a single a, b, or c (lowercase only).
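A quick way to try a bracket expression is to test it against a few words. The words below are hypothetical inputs picked for the sketch.

```javascript
const abc = /[abc]/;

console.log(abc.test("cat")); // true: contains "c" and "a"
console.log(abc.test("dog")); // false: no a, b, or c
console.log(abc.test("ABC")); // false: uppercase letters are not in the set

// With the g flag, match() collects every character from the set.
const found = "a1b2c3".match(/[abc]/g);
console.log(found); // ["a", "b", "c"]
```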
Character classes limit which characters are accepted and are listed inside bracket expressions. They can be written as ranges or lists, and combined with the caret (`^`) and dollar (`$`) anchors to constrain an entire string.
- `[a-z]` = Matches any one lowercase letter from a to z.
- `[A-Z0-9]` = Matches any one uppercase letter from A to Z or digit from 0 to 9.
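Combining a range with anchors and a quantifier restricts the whole input rather than a single character. The following sketch assumes that is the goal; the variable names are made up for the example.

```javascript
// ^ and $ force the class to cover the entire string; + requires at least one character.
const lowercaseOnly = /^[a-z]+$/;
const upperOrDigit = /^[A-Z0-9]+$/;

console.log(lowercaseOnly.test("regex"));  // true
console.log(lowercaseOnly.test("RegEx"));  // false: contains uppercase letters
console.log(upperOrDigit.test("ABC123"));  // true
console.log(upperOrDigit.test("abc123"));  // false: contains lowercase letters
```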
The OR operator, `|`, separates alternatives so that the expression matches any one of them.
- `HTML|CSS|JavaScript` = Matches HTML, CSS, or JavaScript.
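Here is a short sketch of the `HTML|CSS|JavaScript` alternation in use. The sentences being tested are hypothetical.

```javascript
const language = /HTML|CSS|JavaScript/;

console.log(language.test("I am learning CSS"));    // true
console.log(language.test("I am learning Python")); // false

// With the g flag, every alternative that appears is returned.
const mentioned = "HTML and JavaScript".match(/HTML|CSS|JavaScript/g);
console.log(mentioned); // ["HTML", "JavaScript"]
```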
Flags are additional options that alter how the Regular Expression is applied. Six commonly used flags are `i`, `g`, `m`, `s`, `u`, and `y`.
- `i` = Case-insensitive matching, which removes the need to specify both a-z and A-Z.
- `g` = Global matching: look for all matches, rather than just the first match.
- `m` = Multiline mode: the `^` and `$` anchors match at the start and end of each line, not only at the very beginning and very end of the string.
- `s` = Dotall mode: the dot (`.`) also matches the newline character `\n`.
- `u` = Unicode support, allowing correct handling of surrogate pairs.
- `y` = Sticky mode: match only at the position given by the expression's `lastIndex` property.
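Each flag's effect is easiest to see next to the same pattern without it. The sketch below uses hypothetical sample strings and JavaScript regex literals.

```javascript
// i: case-insensitive
console.log(/hello/i.test("HELLO")); // true

// g: all matches instead of just the first
console.log("a-a-a".match(/a/g).length); // 3

// m: ^ and $ apply to each line
console.log(/^\w+$/.test("one\ntwo"));    // false without m
const lines = "one\ntwo".match(/^\w+$/gm);
console.log(lines);                       // ["one", "two"]

// s: the dot also matches a newline
console.log(/a.b/.test("a\nb"));  // false
console.log(/a.b/s.test("a\nb")); // true

// u: a surrogate pair (here an emoji) is treated as one character
console.log(/^.$/.test("\u{1F600}"));  // false
console.log(/^.$/u.test("\u{1F600}")); // true

// y: match only at lastIndex
const sticky = /foo/y;
sticky.lastIndex = 4;
const atIndexFour = sticky.test("bar foo"); // "foo" starts at index 4
sticky.lastIndex = 0;
const atIndexZero = sticky.test("bar foo"); // "foo" does not start at index 0
console.log(atIndexFour, atIndexZero); // true false
```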
Character escapes use a backslash to write characters that cannot be typed directly in RegEx syntax or that would otherwise have a special meaning. Common character escapes include `\n`, `\\`, and `\t`.
- `\n` = New line.
- `\\` = Backslash, as `\` itself is used to begin escapes.
- `\t` = Tab.
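The escapes can be tried on a string that contains each of the special characters. The sample text here is an assumption for the sketch; note that the backslash must also be doubled inside the JavaScript string literal.

```javascript
// Contains a tab, a newline, and one literal backslash.
const text = "col1\tcol2\nC:\\temp";

console.log(/\t/.test(text)); // true: a tab is present
console.log(/\n/.test(text)); // true: a newline is present
console.log(/\\/.test(text)); // true: a backslash is present

// Splitting on the newline escape yields the two lines.
const parts = text.split(/\n/);
console.log(parts.length); // 2
```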
When given the example RegEx for a URL:
/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
- `^(https?:\/\/)` = Requires `http://` or `https://` to appear at the start of the URL (due to the `^` anchor); the `s?` makes the s optional. The `?` immediately following the group states that this first section can appear either once or not at all.
- `([\da-z\.-]+)` = One or more characters for the site name, where `\d` is a metacharacter for any digit, `a-z` is any lowercase letter, and a dot or hyphen is also allowed.
- `\.([a-z\.]{2,6})` = A literal dot, `.`, followed by between 2 and 6 lowercase letters (or dots) for the domain of the site.
- `([\/\w \.-]*)*\/?$` = Any directories from the main site (after the domain), where `\w` is a metacharacter for any letter (both cases), digit, or underscore. An optional trailing slash is allowed before the `$` anchor ends the string.
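To try the URL expression out, the sketch below wraps it in a small helper. The function name `isUrl` and the sample inputs are hypothetical, and the pattern is a permissive check for illustration rather than a complete URL validator.

```javascript
// The example expression from above, unchanged.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;

// Hypothetical helper: returns true when the whole string fits the pattern.
function isUrl(input) {
  return urlPattern.test(input);
}

console.log(isUrl("https://www.example.com/docs/page")); // true
console.log(isUrl("example.com"));                       // true: the protocol is optional
console.log(isUrl("not a url"));                         // false

// The first three groups capture the protocol, the site name, and the domain.
const [, protocol, host, tld] = "https://www.example.com".match(urlPattern);
console.log(protocol); // "https://"
console.log(host);     // "www.example"
console.log(tld);      // "com"
```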