@richimf
Created December 3, 2018 21:02
Regular expressions in Swift (notes).

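To match a special character literally, precede it with a backslash (\). In a regular Swift string literal the backslash must itself be escaped.

This means the standard regular expression \. will appear as \\. in Swift code. In Swift 5 and later, a raw string literal such as #"\."# avoids the doubling.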
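The escaping rule can be checked with a small sketch. It assumes Foundation's `range(of:options:)` with the `.regularExpression` option; the variable names are illustrative only.

```swift
import Foundation

// Both literals produce the same two-character pattern: backslash + dot.
let escaped = "\\."   // regular string literal: the backslash is doubled
let raw = #"\."#      // raw string literal (Swift 5+): no doubling needed

// An escaped dot matches only a literal "."; an unescaped dot matches any character.
let text = "v1.2"
let literalDot = text.range(of: escaped, options: .regularExpression)
let anyChar = text.range(of: ".", options: .regularExpression)

print(escaped == raw)                                      // true
print(literalDot.map { String(text[$0]) } ?? "no match")   // .
print(anyChar.map { String(text[$0]) } ?? "no match")      // v
```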

Capturing parentheses are used to group part of a pattern. For example, 3(pm|am) matches the text 3pm as well as 3am. The pipe character | acts like an OR operator.

A question mark after capturing parentheses means that whatever is inside the parentheses is optional. For example, Nov(ember)? matches both Nov and November.
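As a rough illustration of grouping, alternation and optional groups, the sketch below wraps each pattern in ^(?: ... )$ so that it must match the whole string. The helper `fullyMatches` is a hypothetical name introduced here, not part of Foundation.

```swift
import Foundation

// Hypothetical helper: true when `pattern` matches the whole of `text`.
// The pattern is wrapped in a non-capturing group and anchored at both ends.
func fullyMatches(_ pattern: String, _ text: String) -> Bool {
    let anchored = "^(?:\(pattern))$"
    return text.range(of: anchored, options: .regularExpression) != nil
}

// Alternation inside a group: either "pm" or "am" may follow the 3.
print(fullyMatches("3(pm|am)", "3pm"))          // true
print(fullyMatches("3(pm|am)", "3am"))          // true
print(fullyMatches("3(pm|am)", "3 pm"))         // false: the pattern has no space

// Optional group: "ember" may be present or absent.
print(fullyMatches("Nov(ember)?", "Nov"))       // true
print(fullyMatches("Nov(ember)?", "November"))  // true
print(fullyMatches("Nov(ember)?", "Novem"))     // false
```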

In a replacement template, $1 references the text captured by the first group of the pattern; $2 references the second group, and so on.
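A minimal sketch of capture-group references in a replacement, assuming Foundation's `replacingOccurrences(of:with:options:)` with `.regularExpression`. The sample sentence and pattern are made up for illustration.

```swift
import Foundation

let input = "Meet at 3pm or 11am"

// Group 1 captures the digits, group 2 captures "am" or "pm".
let pattern = #"(\d+)(am|pm)"#

// "$1 $2" re-emits both captured groups with a space between them.
let spaced = input.replacingOccurrences(
    of: pattern,
    with: "$1 $2",
    options: .regularExpression
)

print(spaced)   // Meet at 3 pm or 11 am
```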

Character classes represent a set of possible single-character matches and appear between square brackets [ ]. For example, t[aeiou] will match "ta", "te", "ti", "to" and "tu"; any one character in the set will match. You can also define a range: 10[0-9] is the same as 10[0123456789] and matches the numbers 100 to 109. Ranges work for letters too, for example [a-f].

What if you want to explicitly not match a character? Start the character class with ^. For example, t[^o] will match "t" followed by any one character except "o", so it matches "ta" and "t!" but not "to".
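The character-class rules above can be tried out with NSRegularExpression. This is only a sketch: `allMatches` is a hypothetical helper written for these notes, and the sample strings are invented.

```swift
import Foundation

// Hypothetical helper: returns every substring of `text` matched by `pattern`.
func allMatches(_ pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        // Convert the NSRange of each match back into a Swift substring.
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

// A set: "t" followed by any one vowel.
print(allMatches("t[aeiou]", in: "ta te tx to"))     // ["ta", "te", "to"]

// A range: 10 followed by any one digit.
print(allMatches("10[0-9]", in: "99 100 105 110"))   // ["100", "105"]

// A negated set: "t" followed by anything except "o".
print(allMatches("t[^o]", in: "to ta t! tomorrow"))  // ["ta", "t!"]
```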

Examples:

. matches any single character (by default, anything except a line break). p.p matches pop, pup, pmp, p@p, ...

\w matches any word-like character: letters, digits and the underscore, but not punctuation or other symbols. hello\w matches "hello_" and "hello9" but not "hello!".

\d matches a numeric digit, in most cases the same as [0-9]. For example, \d\d?:\d\d will match strings in time format, such as "9:30" and "12:45".

\b matches a word boundary: the position between a word character and a non-word character such as a space or punctuation mark, or the start or end of the text. to\b will match "to" in "to the moon" but will not match "tomorrow".

\s matches whitespace characters such as spaces, tabs and newlines. hello\s will match "hello " (including the trailing space) in "well, hello there".
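The shorthand classes above can be checked with a short sketch. `firstMatch` is a hypothetical helper built on Foundation's `range(of:options:)`; raw string literals are used so the backslashes do not need doubling.

```swift
import Foundation

// Hypothetical helper: the first substring of `text` matched by `pattern`, or nil.
func firstMatch(_ pattern: String, in text: String) -> String? {
    return text.range(of: pattern, options: .regularExpression)
        .map { String(text[$0]) }
}

// . matches any single character.
print(firstMatch(#"p.p"#, in: "p@p") ?? "nil")                  // p@p

// \w accepts letters, digits and underscore, but not "!".
print(firstMatch(#"hello\w"#, in: "hello_") ?? "nil")           // hello_
print(firstMatch(#"hello\w"#, in: "hello!") ?? "nil")           // nil

// \d matches a digit; \d? makes the second hour digit optional.
print(firstMatch(#"\d\d?:\d\d"#, in: "Lunch at 12:45") ?? "nil") // 12:45

// \b requires a word boundary right after "to".
print(firstMatch(#"to\b"#, in: "to the moon") ?? "nil")         // to
print(firstMatch(#"to\b"#, in: "tomorrow") ?? "nil")            // nil

// \s consumes the space after "hello", so the match includes it.
print(firstMatch(#"hello\s"#, in: "well, hello there") == "hello ")  // true
```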

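^ matches at the beginning of a line. ^Hello will match the string "Hello there", but not "He said Hello". By default this means the beginning of the whole input; with the anchorsMatchLines option of NSRegularExpression it also matches after each line break.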

$ matches at the end of a line. the end$ will match "It was the end" but not "the end was near".
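A small sketch of the two anchors, again assuming Foundation's `range(of:options:)`. The helper name `found` is hypothetical. The last two lines use the inline flag (?m), which turns on multi-line mode in the same way as the anchorsMatchLines option.

```swift
import Foundation

// Hypothetical helper: true when `pattern` matches somewhere in `text`.
func found(_ pattern: String, in text: String) -> Bool {
    return text.range(of: pattern, options: .regularExpression) != nil
}

// ^ anchors the match to the start.
print(found("^Hello", in: "Hello there"))         // true
print(found("^Hello", in: "He said Hello"))       // false

// $ anchors the match to the end.
print(found("the end$", in: "It was the end"))    // true
print(found("the end$", in: "the end was near"))  // false

// Without multi-line mode, ^ only matches at the start of the whole input.
print(found("^Hello", in: "He said\nHello"))      // false
print(found("(?m)^Hello", in: "He said\nHello"))  // true
```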

* matches the previous element 0 or more times. 12*3 will match the strings 13, 123 and 1222222223.

+ matches the previous element 1 or more times. 12+3 will match 123 but not 13.

Curly braces {} contain the minimum and maximum number of matches. For example, 10{1,2}1 will match both "101" and "1001" but not "10001": the minimum number of matches is 1 and the maximum is 2. He[Ll]{2,}o will match "HeLLo" and "HellLLLllo": the minimum number of matches is 2, but the maximum is not set. Remember, [Ll] means that either of its characters is valid.
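The quantifier examples above (*, + and {min,max}) can be verified with a sketch like the following. `matchesWhole` is a hypothetical helper that anchors each pattern so it has to cover the entire string.

```swift
import Foundation

// Hypothetical helper: true when `pattern` matches the entire `text`.
func matchesWhole(_ pattern: String, _ text: String) -> Bool {
    let anchored = "^(?:\(pattern))$"
    return text.range(of: anchored, options: .regularExpression) != nil
}

// * allows zero or more 2s.
print(matchesWhole("12*3", "13"))                  // true
print(matchesWhole("12*3", "1222222223"))          // true

// + requires at least one 2.
print(matchesWhole("12+3", "123"))                 // true
print(matchesWhole("12+3", "13"))                  // false

// {1,2} allows one or two 0s.
print(matchesWhole("10{1,2}1", "101"))             // true
print(matchesWhole("10{1,2}1", "1001"))            // true
print(matchesWhole("10{1,2}1", "10001"))           // false

// {2,} requires at least two of "L" or "l", with no upper limit.
print(matchesWhole("He[Ll]{2,}o", "HeLLo"))        // true
print(matchesWhole("He[Ll]{2,}o", "HellLLLllo"))   // true
print(matchesWhole("He[Ll]{2,}o", "Helo"))         // false
```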
