@leommoore
Last active December 12, 2015 02:48

#JavaScript - Regular Expressions (Regex)

Regular expressions are a language for describing patterns in string data. They are available in many programming languages, including JavaScript.

Regular expressions are delimited by slashes (/) instead of quotes, ie /Hello/. Regular expressions are also objects in JavaScript and have a number of methods, including test, which returns true if the pattern is found in a string and false if it is not.
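To make the point about regular expressions being objects concrete, here is a minimal sketch. The variable names are made up for this illustration.

```javascript
// A regex literal is an object, so it can be stored in a variable.
var greeting = /Hello/;
var isRegExpObject = greeting instanceof RegExp;

// test() reports whether the pattern occurs anywhere in the string.
var found = greeting.test("Well, Hello there");
var missing = greeting.test("Goodbye");

console.log(isRegExpObject); // true
console.log(found);          // true
console.log(missing);        // false
```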

###Search

Search returns the index of the start of the matched text (like indexOf), if it is found. Remember, the first location (ie h) is 0. A result of -1 indicates that the pattern was not found.

"hello there".search(/th/);
//6

"hello there".search(/zz/);
//-1 

If you want to search for something containing a slash then you need to escape it with a backslash, or it will be taken to be the end of the pattern. For example, if we are looking for the text "a/c" we would write it as /a\/c/. Most punctuation characters that have a special meaning in regular expressions (ie {, *, etc) should be escaped in the same way.

var searchText = /a\/c/
"The following a/c's are in the list".search(searchText);
//14 
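When the text to search for is only known at run time, escaping by hand is not possible. The sketch below uses a hypothetical helper, escapeRegExp (not a built-in function), that puts a backslash in front of each special character before the text is passed to the standard RegExp constructor. The list of escaped characters is an assumption that covers the usual special characters plus the slash.

```javascript
// Hypothetical helper: put a backslash in front of every character that
// has a special meaning in a regular expression (and in front of slashes).
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

var escaped = escapeRegExp("a/c (old)");
console.log(escaped); // a\/c \(old\)

// Build a regular expression object from the escaped text.
var pattern = new RegExp(escaped);
var position = "Close a/c (old) today".search(pattern);
console.log(position); // 6
```

In the replacement string, $& stands for the text that was matched, so each special character is kept and simply gains a leading backslash.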

###Sets of Characters

If we want to match any one of several characters, we can enclose them in square brackets ([]). For example, [JS] matches either J or S.

var searchText = /[JS]/;
"The person called John is also called Smith".search(searchText);
//18 1st instance ie J

You can also use dot (.) to indicate any character that is not a line break character. An escaped d (\d) indicates any digit. An escaped w (\w) indicates any word character (ie alphanumeric or underscore). An escaped s (\s) matches any whitespace character (ie space, tab or newline).

You can replace \d, \w and \s with their capitals to negate their meanings. For example \S matches any character which is not a whitespace character. You can also invert a set by putting ^ as the first character inside the []. For example:

var searchText = /[^ABC]/;
"ABCBACCBBADABC".search(searchText);
//10 because D is the first character not to be either A,B or C
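The shorthand classes can be tried out in the same way. This is a small sketch with a made-up sample string. Each search returns the index of the first character that belongs to the class.

```javascript
var order = "Order 66 shipped";

var firstDigit = order.search(/\d/);    // 6, the first 6
var firstSpace = order.search(/\s/);    // 5, the space after "Order"
var firstNonWord = order.search(/\W/);  // 5, a space is not a word character

// A range such as A-Z covers every character between its two ends.
var firstOther = order.search(/[^A-Za-z ]/); // 6, not a letter or a space

var underscoreIsWord = /\w/.test("_");  // true
var dotMatchesNewline = /./.test("\n"); // false, dot skips line breaks

console.log(firstDigit, firstSpace, firstNonWord, firstOther);
console.log(underscoreIsWord, dotMatchesNewline);
```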

###Word and String Boundaries

The character ^ indicates the start of the string and $ indicates the end.

/a/.test("blah");
//true

/^a$/.test("blah");
//false because a would have to be at the start and end of the string.

/^a$/.test("a");
//true

The \b escape denotes a word boundary: a position with a word character (\w) on one side and, on the other, a non-word character such as punctuation or whitespace, or the start or end of the string.

/\bdog\b/.test("Our dog is the best doggy around");
//true

/\bdog\b/.test("Our doggy is the best doggy around");
//false because there has to be a word boundary on each side of the word dog
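A common use of ^ and $ together is checking that a whole string has a given shape, rather than merely containing it. The following sketch assumes a made-up requirement of exactly four digits, and also shows \b picking out a whole word next to punctuation.

```javascript
// Anchoring at both ends means the entire string must be four digits.
var fourDigits = /^\d\d\d\d$/;

var exact = fourDigits.test("2015");        // true
var embedded = fourDigits.test("Year 2015"); // false, text before the digits
var tooLong = fourDigits.test("20155");      // false, one digit too many

// A comma counts as a non-word character, so it forms a boundary.
var wholeWord = /\bcat\b/;
var standsAlone = wholeWord.test("The cat, sat."); // true
var insideWord = wholeWord.test("concatenate");    // false

console.log(exact, embedded, tooLong, standsAlone, insideWord);
```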

###Repeating Patterns

It is also possible to match repeating patterns by putting an asterisk (*) after an element, which means that it can occur zero or more times. The plus sign (+) is similar but requires the element to occur at least once. The question mark (?) means that the element can appear zero or one time, which makes it optional.

var parenthethicText = /\(.*\)/;
"Its (the sloth's) claws were gigantic!".search(parenthethicText);
//4 because it accepts any number of characters between two parentheses

Remember that .* will match any number of characters (other than line breaks). You can also use curly braces to specify an exact number of repetitions, ie .{5} will match any five consecutive characters.

/.{5}/.test("doggy");
//true

If two numbers are specified, ie {3,5}, then the first number is the minimum number of times the element must occur and the second number is the maximum. Following this, {3,} means that it must occur three or more times. The minimum cannot be omitted in JavaScript: {,5} is treated as literal text, so use {0,5} to allow up to five occurrences.
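The counting forms are easiest to see when the pattern is anchored with ^ and $, so that the whole string has to fit. This is a small sketch with made-up test strings.

```javascript
var threeToFive = /^a{3,5}$/;
var tooFew = threeToFive.test("aa");      // false, fewer than three
var minimum = threeToFive.test("aaa");    // true
var maximum = threeToFive.test("aaaaa");  // true
var tooMany = threeToFive.test("aaaaaa"); // false, more than five

var threeOrMore = /^a{3,}$/;
var many = threeOrMore.test("aaaaaaa");   // true, no upper limit

var upToFive = /^a{0,5}$/;
var none = upToFive.test("");             // true, zero occurrences allowed

// The question mark makes the u optional.
var american = /^colou?r$/.test("color");  // true
var british = /^colou?r$/.test("colour");  // true

console.log(tooFew, minimum, maximum, tooMany, many, none, american, british);
```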

###Subexpressions

It is possible to apply special characters like * and + to more than one character at a time by grouping those characters in parentheses. Note: you can apply options such as i (case insensitive) after the regular expression.

var cartoonCrying = /boo(hoo+)+/i;
cartoonCrying.test("Boohoooohoohooo");
//true

cartoonCrying.test("BOOHooooHOoHooo");
//true because the i indicates that the pattern is case insensitive
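The effect of the parentheses is clearer when the same pattern is written with and without them. In this sketch (made-up variable names), the + applies to the single character b in the first pattern and to the whole group ab in the second.

```javascript
var withoutGroup = /^ab+$/; // an a followed by one or more b
var withGroup = /^(ab)+$/;  // one or more repetitions of "ab"

var plainRepeatedB = withoutGroup.test("abbb");    // true
var plainRepeatedAb = withoutGroup.test("ababab"); // false

var groupRepeatedAb = withGroup.test("ababab");    // true
var groupRepeatedB = withGroup.test("abbb");       // false

// The i option applies to the whole pattern, including the group.
var mixedCase = /^(ab)+$/i.test("ABab");           // true

console.log(plainRepeatedB, plainRepeatedAb, groupRepeatedAb, groupRepeatedB, mixedCase);
```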

###Or

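There are times when you will want to match one of a defined number of alternatives. The pipe character (|) separates the alternatives, and parentheses limit how far each list of alternatives extends.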

//the dots are escaped so that they only match a literal full stop
var personName = /\b(Mr\.|Mrs\.|Ms\.) (Smith|Doe)\b/i;
personName.test("Ms. Doe");
//true
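The parentheses around a list of alternatives matter because | otherwise splits the entire pattern in two. The sketch below uses made-up patterns to show the difference, followed by a typical use: checking a file name against a short list of extensions.

```javascript
// Without parentheses this reads as "starts with cat" OR "ends with dog".
var loose = /^cat|dog$/;
// With parentheses the whole string must be either cat or dog.
var strict = /^(cat|dog)$/;

var looseCatalog = loose.test("catalog");   // true, it starts with cat
var strictCatalog = strict.test("catalog"); // false
var strictDog = strict.test("dog");         // true

// Alternatives combined with an escaped dot and the i option.
var imageFile = /\.(png|jpg|gif)$/i;
var photo = imageFile.test("photo.JPG");    // true
var notes = imageFile.test("notes.txt");    // false

console.log(looseCatalog, strictCatalog, strictDog, photo, notes);
```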

###Matching

If .match finds the pattern then it returns an array whose first element is the matched text. If it cannot find it then it returns null.

"foo".match(/bar/i);
//null - No match

"foobar".match(/bar/i);
//["bar"] - Match found
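The value returned by .match carries more than the matched text. The following sketch, using a made-up phone number string, shows the parts captured by parentheses, the index property, and the behaviour with the g option, where every match is returned and the captured parts are left out.

```javascript
var result = "Call 555-1234 now".match(/(\d{3})-(\d{4})/);

var whole = result[0];     // "555-1234", the full match
var prefix = result[1];    // "555", the first pair of parentheses
var line = result[2];      // "1234", the second pair of parentheses
var where = result.index;  // 5, where the match starts

// With the g option, match returns every match as a plain array.
var all = "one 1, two 22, three 333".match(/\d+/g); // ["1", "22", "333"]

// No match still gives null, so check before using the result.
var nothing = "no digits here".match(/\d+/g);       // null

console.log(whole, prefix, line, where, all, nothing);
```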

###Replace

You can also use .replace to replace the matched text in the string with another string. Note: the g option means global, so all instances of the pattern are replaced and not just the first.

"Foobar is the best bar in town.".replace(/bar/g, "buzz");
//"Foobuzz is the best buzz in town."

"Foobar is the best bar in town.".replace(/bar/, "buzz");
//"Foobuzz is the best bar in town."
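The replacement does not have to be fixed text. As a brief sketch with made-up strings: $1 and $2 in the replacement string refer to the parts captured by the first and second pair of parentheses, and a function can be passed instead of a string to compute each replacement.

```javascript
// Swap two captured words around.
var swapped = "Doe, Jane".replace(/(\w+), (\w+)/, "$2 $1");
console.log(swapped); // "Jane Doe"

// Use a function to upper-case the first letter of every word.
var titled = "the cat sat".replace(/\b\w/g, function (letter) {
  return letter.toUpperCase();
});
console.log(titled); // "The Cat Sat"

// The original string is never changed; replace returns a new one.
var original = "Foobar";
var changed = original.replace(/bar/, "buzz");
console.log(original, changed); // "Foobar" "Foobuzz"
```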

###Summary

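<table>
<tr><td><strong>Options:</strong></td><td colspan="3">i case insensitive, g global (all matches, not just the first), m multiline (^ and $ also match at line breaks)</td></tr>
<tr><td>[abc]</td><td>A single character of: a, b or c</td><td>.</td><td>Any single character except a line break</td></tr>
<tr><td>[^abc]</td><td>Any single character except: a, b or c</td><td>\s</td><td>Any whitespace character</td></tr>
<tr><td>[a-z]</td><td>Any single character in the range a-z</td><td>\S</td><td>Any non-whitespace character</td></tr>
<tr><td>[a-zA-Z]</td><td>Any single character in the range a-z or A-Z</td><td>\d</td><td>Any digit</td></tr>
<tr><td>^</td><td>Start of string (start of line with the m option)</td><td>\D</td><td>Any non-digit</td></tr>
<tr><td>$</td><td>End of string (end of line with the m option)</td><td>\w</td><td>Any word character (letter, number, underscore)</td></tr>
<tr><td>\A</td><td>Start of string (not supported in JavaScript, use ^ without the m option)</td><td>\W</td><td>Any non-word character</td></tr>
<tr><td>\z</td><td>End of string (not supported in JavaScript, use $ without the m option)</td><td>\b</td><td>Any word boundary</td></tr>
<tr><td>(...)</td><td>Capture everything enclosed</td><td>(a|b)</td><td>a or b</td></tr>
<tr><td>a?</td><td>Zero or one of a</td><td>a*</td><td>Zero or more of a</td></tr>
<tr><td>a+</td><td>One or more of a</td><td>a{3}</td><td>Exactly 3 of a</td></tr>
<tr><td>a{3,}</td><td>3 or more of a</td><td>a{3,6}</td><td>Between 3 and 6 of a</td></tr>
</table>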