@jonahaung
Created November 23, 2019 14:46
import UIKit

/// Returns `string` with every match of `pattern` highlighted in yellow.
/// An invalid pattern yields the unstyled string.
func highlightMatches(for pattern: String, inString string: String) -> NSAttributedString {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return NSAttributedString(string: string)
    }
    let range = NSRange(string.startIndex..., in: string)
    let matches = regex.matches(in: string, options: [], range: range)
    let attributedText = NSMutableAttributedString(string: string)
    for match in matches {
        attributedText.addAttribute(.backgroundColor, value: UIColor.yellow, range: match.range)
    }
    // Hand back an immutable copy without a force cast.
    return NSAttributedString(attributedString: attributedText)
}
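
/// Returns every substring of `string` matched by `pattern`.
/// An invalid pattern yields an empty array.
func listMatches(for pattern: String, inString string: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return []
    }
    let range = NSRange(string.startIndex..., in: string)
    let matches = regex.matches(in: string, options: [], range: range)
    // compactMap skips any range that cannot be converted, instead of crashing on a force unwrap.
    return matches.compactMap { match in
        Range(match.range, in: string).map { String(string[$0]) }
    }
}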
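
/// Returns, for each match, the whole match (group 0) followed by its capture groups.
/// An invalid pattern yields an empty array.
func listGroups(for pattern: String, inString string: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return []
    }
    let range = NSRange(string.startIndex..., in: string)
    let matches = regex.matches(in: string, options: [], range: range)
    var groupMatches: [String] = []
    for match in matches {
        for group in 0..<match.numberOfRanges {
            // An optional group that did not take part in the match has location NSNotFound,
            // so skip it rather than force-unwrapping a nil range.
            guard let groupRange = Range(match.range(at: group), in: string) else { continue }
            groupMatches.append(String(string[groupRange]))
        }
    }
    return groupMatches
}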

func containsMatch(of pattern: String, inString string: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return false
    }
    let range = NSRange(string.startIndex..., in: string)
    return regex.firstMatch(in: string, options: [], range: range) != nil
}

func replaceMatches(for pattern: String, inString string: String, withString replacementString: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return string
    }
    let range = NSRange(string.startIndex..., in: string)
    return regex.stringByReplacingMatches(in: string, options: [], range: range, withTemplate: replacementString)
}

//: ## Basic Examples
//:
//: This first example is about as simple as regular expressions get! It matches the word "jump" in the sample text:
let quickFox = "The quick brown fox jumps over the lazy dog."
listMatches(for: "jump", inString: quickFox)
//: This next example uses some special characters that are available in regular expressions. The parentheses create a group, and the question mark says "match the previous element (the group in this case) 0 or 1 times". It matches either 'jump' or 'jumps':
listMatches(for:"jump(s)?", inString: quickFox)
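//: This one matches an HTML or XML element, from its opening tag through the matching closing tag. The `\1` refers back to the tag name captured by the first group:
let htmlString = "<p>This is an example <strong>html</strong> string.</p>"
listMatches(for:"<([a-z][a-z0-9]*)\\b[^>]*>(.*?)</\\1>", inString: htmlString)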
//: Wow, looks complicated, eh? :] Hopefully things will become a bit more clear as you look through the rest of the examples here!
//: ## Cheat Sheet
//:
//: **.** matches any character. `p.p` matches pop, pup, pmp, p@p, and so on.
let anyExample = "pip, pop, p%p, paap, piip, puup, pippin"
listMatches(for:"p.p", inString: anyExample)
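//: Because `.` is a metacharacter, matching a literal dot means escaping it. The following is a minimal sketch rather than part of the helpers above; it assumes `NSRegularExpression.escapedPattern(for:)` is acceptable for building the pattern, and the names `literalDot`, `escapedDot`, `dotSample`, and `literalDotCount` are made up for illustration.
import Foundation

let literalDot = "p.p"
// escapedPattern(for:) backslash-escapes metacharacters, turning p.p into p\.p
let escapedDot = NSRegularExpression.escapedPattern(for: literalDot)
let dotSample = "pip p.p pop"
var literalDotCount = 0
if let dotRegex = try? NSRegularExpression(pattern: escapedDot, options: []) {
    let dotRange = NSRange(dotSample.startIndex..., in: dotSample)
    // Only the literal "p.p" counts; "pip" and "pop" no longer match.
    literalDotCount = dotRegex.numberOfMatches(in: dotSample, options: [], range: dotRange)
}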
//: **\w** matches any 'word-like' character, which includes letters, digits, and the underscore, but not punctuation or other symbols. `hello\w+` will match "helloooooo" and "hello_1114" but not "hello," or "hello!"
let wordExample = "hello helloooooo hello_1114 hello, hello!"
listMatches(for:"hello\\w+", inString: wordExample)
//: **\d** matches a numeric digit, which in most cases means `[0-9]`. `\d?\d:\d\d` will match strings in time format, such as "9:30" and "12:45".
let digitExample = "9:30 12:45 df:24 ag:gh"
listMatches(for:"\\d?\\d:\\d\\d", inString: digitExample)
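//: **\b** matches a word boundary: the zero-width position between a word character and a non-word character such as a space or punctuation mark, or the start or end of the string. `to\b` will match the "to" in "to the moon" and "to!", but it will not match the "to" in "tomorrow". `\b` is handy for "whole word" type matching.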
let boundaryExample = "to the moon! when to go? tomorrow?"
listMatches(for:"to\\b", inString: boundaryExample)
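//: A trailing `\b` alone still accepts a "to" that ends a longer word. The sketch below compares it with a boundary on both sides. It is only an illustration: `wholeWordMatches` is a hypothetical stand-in for the helpers above so the snippet runs on its own, and `wholeWordSample` is an invented sample string.
import Foundation

let wholeWordSample = "to the moon, into tomorrow"

// Hypothetical helper for this sketch: returns every substring matched by `pattern`.
func wholeWordMatches(_ pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else { return [] }
    let range = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: range).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

// Finds the standalone "to" and also the "to" that ends "into".
let trailingBoundary = wholeWordMatches("to\\b", in: wholeWordSample)
// Finds only the standalone word "to".
let bothBoundaries = wholeWordMatches("\\bto\\b", in: wholeWordSample)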
//: **\s** matches whitespace characters such as spaces, tabs, and newlines. `hello\s` will match "hello " in "Well, hello there!".
let whitespaceExample = "Well, hello there!"
listMatches(for:"hello\\s", inString: whitespaceExample)
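//: **^** matches at the beginning of the string, or at the beginning of every line when the regular expression is created with the `.anchorsMatchLines` option. Note that this particular ^ is different from ^ inside of the square brackets! For example, `^Hello` will match against the string "Hello there", but not "He said Hello".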
let beginningExample = "Hello there! He said hello."
highlightMatches(for:"^Hello", inString: beginningExample)
//: **$** matches at the end of the string, or at the end of every line when the `.anchorsMatchLines` option is set. For example, `the end$` will match against "It was the end" but not "the end was near".
let endExample = "The end was near. It was the end"
highlightMatches(for:"end$", inString: endExample)
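//: The helpers above always pass `options: []`, so `^` and `$` anchor only at the ends of the whole string. As a rough sketch of the difference, the snippet below counts `^Hello` matches in a two-line string with and without `.anchorsMatchLines`. The names `twoLines` and `anchoredCount` are hypothetical and exist only for this example.
import Foundation

let twoLines = "Hello there\nHello again"

// Hypothetical helper for this sketch: counts matches of ^Hello under the given options.
func anchoredCount(in text: String, options: NSRegularExpression.Options) -> Int {
    guard let regex = try? NSRegularExpression(pattern: "^Hello", options: options) else { return 0 }
    return regex.numberOfMatches(in: text, options: [], range: NSRange(text.startIndex..., in: text))
}

// Without options, only the "Hello" at the very start of the string matches.
let stringAnchored = anchoredCount(in: twoLines, options: [])
// With .anchorsMatchLines, the "Hello" after the line break matches as well.
let lineAnchored = anchoredCount(in: twoLines, options: [.anchorsMatchLines])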
//: **\*** matches the previous element 0 or more times. `12*3` will match 13, 123, 1223, 122223, and 1222222223.
let zeroOrMoreExample = "13, 123, 1223, 122223, 1222222223, 143222343"
highlightMatches(for:"12*3", inString: zeroOrMoreExample)
//: **+** matches the previous element 1 or more times. `12+3` will match 123, 1223, 122223, 1222222223, but not 13.
let oneOrMoreExample = "13, 123, 1223, 122223, 1222222223, 143222343"
highlightMatches(for:"12+3", inString: oneOrMoreExample)
//: **?** matches the previous element 0 or 1 times. `12?3` will match 13 or 123, but not 1223.
let possibleExample = "13, 123, 1223"
highlightMatches(for:"12?3", inString: possibleExample)
//: Curly braces **{ }** contain the minimum and maximum number of matches. For example, `10{1,2}1` will match both "101" and "1001" but not "10001", as the minimum number of matches is 1 and the maximum number of matches is 2. `He[Ll]{2,}o` will match "HeLLo" and "HellLLLllo" and any such silly variation of "hello" with lots of L’s, since the minimum number of matches is 2 but the maximum is not set and is therefore unlimited!
let numberExample1 = "101 1001 10001"
let numberExample2 = "HeLLo HellLLLllo"
highlightMatches(for:"10{1,2}1", inString: numberExample1)
highlightMatches(for:"He[Ll]{2,}o", inString: numberExample2)
//: Capturing parentheses **( )** are used to group part of a pattern. For example, `3 (pm|am)` would match the text "3 pm" as well as the text "3 am". The pipe character here (|) acts like an OR operator.
let cinema = "Are we going to the cinema at 3 pm or 5 pm?"
listMatches(for:"\\d (am|pm)", inString: cinema)
listGroups(for:"(\\d (am|pm))", inString: cinema)
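//: Each match exposes its groups by index: group 0 is the whole match, and the numbered groups follow in the order of their opening parentheses. As a small sketch (not the only way to do this), the snippet below pulls out just the hour digit from each time. The pattern adds a group around `\d`, and `showtimes`, `timeRegex`, and `hours` are illustrative names.
import Foundation

let showtimes = "Are we going to the cinema at 3 pm or 5 pm?"
var hours: [String] = []
if let timeRegex = try? NSRegularExpression(pattern: "(\\d) (am|pm)", options: []) {
    let searchRange = NSRange(showtimes.startIndex..., in: showtimes)
    for match in timeRegex.matches(in: showtimes, options: [], range: searchRange) {
        // Group 1 is the digit; group 2 would be the am/pm marker.
        if let hourRange = Range(match.range(at: 1), in: showtimes) {
            hours.append(String(showtimes[hourRange]))
        }
    }
}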
//: You can include as many pipe characters in your regular expression as you would like. As an example, `(Tom|Dick|Harry)` is a valid pattern.
let greeting = "Hello Tom, Dick, Harry!"
listMatches(for:"(Tom|Dick|Harry)", inString: greeting)
replaceMatches(for:"(Tom|Dick|Harry)", inString: greeting, withString: "James")
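//: The replacement string is a template, so it can refer back to capture groups: `$1` stands for whatever group 1 matched. The following sketch calls `stringByReplacingMatches` directly to stay self-contained; `roster`, `nameRegex`, and `politeRoster` are names invented for the example.
import Foundation

let roster = "Hello Tom, Dick, Harry!"
var politeRoster = roster
if let nameRegex = try? NSRegularExpression(pattern: "(Tom|Dick|Harry)", options: []) {
    let rosterRange = NSRange(roster.startIndex..., in: roster)
    // Each name is kept and prefixed, rather than replaced by a fixed string.
    politeRoster = nameRegex.stringByReplacingMatches(in: roster, options: [], range: rosterRange, withTemplate: "Mr. $1")
}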
//: Character classes represent a set of possible single-character matches. Character classes appear between square brackets **[ ]**.
//: As an example, the regular expression `t[aeiou]` will match "ta", "te", "ti", "to", or "tu". You can have as many character possibilities inside the square brackets as you like, but remember that any single character in the set will match. `[aeiou]` _looks_ like five characters, but it actually means "a" or "e" or "i" or "o" or "u".
let theVowels = "ta te ti to tu th tj tk tm"
listMatches(for:"t[aeiou]", inString: theVowels)
//: You can also define a range in a character class if the characters appear consecutively. For example, to search for a number between 100 and 109, the pattern would be `10[0-9]`. This returns the same results as `10[0123456789]`, but using ranges makes your regular expressions much cleaner and easier to understand. Ranges can also be used for letters: for example, `[a-z]` matches any lowercase ASCII letter, and `[a-zA-Z]` matches any ASCII letter in either case.
let theNumbers = "100 101 105 121 229 816"
listMatches(for:"10[0-9]", inString: theNumbers)
listMatches(for:"t[a-h]", inString: theVowels)
//: Character classes usually contain the characters you _want_ to match, but what if you want to explicitly _not match_ a character? You can also define negated character classes, which use the `^` character. For example, the pattern `t[^o]` will match "t" followed by any single character _except_ "o". The example below excludes both "o" and "a":
let notClasses = "tim tam tum tom tem"
listMatches(for:"t[^oa]m", inString: notClasses)