To match a special character literally, precede it with a backslash (\). Because the backslash is also the escape character in Swift string literals, it must itself be escaped: the standard regular expression \. appears as \\. in Swift code.
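The escaping rule can be checked with a minimal sketch. The raw string literal (#"..."#) is an addition not mentioned in the text and assumes Swift 5 or later; the variable names are illustrative.

```swift
import Foundation

// In an ordinary string literal the backslash itself must be escaped.
let escaped = "\\."
// Assumption: Swift 5+ raw string literal, which takes the pattern as written.
let raw = #"\."#

// Both literals hold the same two characters: a backslash and a dot.
print(escaped == raw) // true

// The escaped dot matches only a literal period, not "any character".
let version = "v1.2"
let dotRange = version.range(of: escaped, options: .regularExpression)
let matched = dotRange.map { String(version[$0]) } ?? "no match"
print(matched) // .
```

Either spelling produces the same pattern, so the choice is mostly about readability.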
Capturing parentheses are used to group part of a pattern. For example, 3 (pm|am) matches the text "3 pm" as well as "3 am". The pipe character | acts like an OR operator.
A question mark after capturing parentheses makes whatever is inside the parentheses optional. For example, Nov(ember)? matches both "Nov" and "November".
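The parentheses are called capturing because the text they match is stored: $1 refers to the first captured group of the preceding pattern, for instance in a replacement template.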
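Grouping, alternation, the optional group and $1 can be sketched with NSRegularExpression. The sample sentence and the replacement template are made up for illustration.

```swift
import Foundation

let text = "Meet at 3 pm, not 3 am."
// Force-try is acceptable here because the pattern is a known-valid constant.
let regex = try! NSRegularExpression(pattern: "3 (pm|am)")
let whole = NSRange(text.startIndex..., in: text)

// Group 1 holds whichever alternative matched inside the parentheses.
let periods = regex.matches(in: text, range: whole).compactMap { match -> String? in
    guard let range = Range(match.range(at: 1), in: text) else { return nil }
    return String(text[range])
}
print(periods) // ["pm", "am"]

// In the template, $1 is replaced by the text captured by group 1.
let rewritten = regex.stringByReplacingMatches(
    in: text, range: whole, withTemplate: "three $1")
print(rewritten) // Meet at three pm, not three am.

// The optional group lets one pattern cover both spellings.
let months = ["Nov", "November"].map {
    $0.range(of: "^Nov(ember)?$", options: .regularExpression) != nil
}
print(months) // [true, true]
```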
Character classes represent a set of possible single-character matches and appear between square brackets [ ]. For example, t[aeiou] matches "ta", "te", "ti", "to" and "tu". Any one character in the set will match.
You can also define a range: 10[0-9] is the same as 10[0123456789] and matches the numbers from 100 to 109. Ranges work for letters too, as in [a-f].
What if you want to explicitly not match a character? Start the class with ^. For example, t[^o] matches any combination of "t" and one other character except "to".
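The three forms of character class (set, range, negation) can be tried out with a small sketch. hasMatch is a hypothetical helper written for this example, built on the Foundation call range(of:options:).

```swift
import Foundation

// Hypothetical helper: true when the pattern matches somewhere in the text.
func hasMatch(_ pattern: String, in text: String) -> Bool {
    return text.range(of: pattern, options: .regularExpression) != nil
}

// A set: "t" followed by any one vowel.
print(hasMatch("t[aeiou]", in: "te"))  // true
print(hasMatch("t[aeiou]", in: "tx"))  // false

// A range: "10" followed by any one digit.
print(hasMatch("10[0-9]", in: "105"))  // true
print(hasMatch("10[0-9]", in: "110"))  // false

// A negated class: "t" followed by anything except "o".
print(hasMatch("t[^o]", in: "ta"))     // true
print(hasMatch("t[^o]", in: "to"))     // false
```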
Examples:
. matches any single character. p.p matches pop, pup, pmp, p@p, ...
\w matches any word-like character: letters, digits and the underscore, but not punctuation or other symbols. hello\w matches "hello_" and "hello9" but not "hello!".
\d matches a numeric digit, which for ASCII text means [0-9]. For example, \d\d?:\d\d matches strings in time format, such as "9:30" and "12:45".
\b matches at a word boundary, such as the position next to a space or punctuation mark. to\b matches "to" in "to the moon" but does not match "tomorrow".
\s matches whitespace characters such as spaces, tabs and newlines. hello\s matches "hello " (including the trailing space) in "well, hello there".
^ matches at the beginning of a line. ^Hello matches the string "Hello there" but not "He said Hello".
$ matches at the end of a line. the end$ matches "It was the end" but not "the end was near".
* matches the previous element 0 or more times. 12*3 matches 13, 123 and 1222222223.
+ matches the previous element 1 or more times. 12+3 matches 123 but not 13.
Curly braces {} contain the minimum and maximum number of matches. For example, 10{1,2}1 matches both "101" and "1001" but not "10001", because the minimum number of matches is 1 and the maximum is 2. He[Ll]{2,}o matches "HeLLo" and "HellLLLllo": the minimum number of matches is 2 and no maximum is set. Remember that [Ll] means either of its characters is valid.
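The examples in this list can be run as a quick sanity check. This is a minimal sketch: firstMatch is a hypothetical helper, and the sample inputs are taken from the examples or chosen to illustrate them. Note that every backslash in a pattern is doubled in the Swift literal.

```swift
import Foundation

// Hypothetical helper: returns the first substring matched by the pattern, or nil.
func firstMatch(_ pattern: String, in text: String) -> String? {
    guard let range = text.range(of: pattern, options: .regularExpression) else {
        return nil
    }
    return String(text[range])
}

// (pattern, input) pairs based on the examples above.
let samples: [(String, String)] = [
    ("p.p", "p@p"),
    ("hello\\w", "hello9"),
    ("hello\\w", "hello!"),
    ("\\d\\d?:\\d\\d", "Lunch at 12:45"),
    ("to\\b", "to the moon"),
    ("to\\b", "tomorrow"),
    ("hello\\s", "well, hello there"),
    ("^Hello", "He said Hello"),
    ("the end$", "It was the end"),
    ("12*3", "13"),
    ("12+3", "13"),
    ("10{1,2}1", "10001"),
    ("He[Ll]{2,}o", "HellLLLllo"),
]

// Print the first match for each pair, or "no match" when the pattern fails.
for (pattern, input) in samples {
    let result = firstMatch(pattern, in: input) ?? "no match"
    print("\(pattern) on \"\(input)\": \(result)")
}
```

Returning the matched substring rather than a Boolean makes details visible that are easy to miss, such as the trailing space that hello\s includes in its match.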