Last active
February 18, 2017 17:33
Regex Cheat Sheet
# http://ryanstutorials.net/regular-expressions-tutorial/
. (dot)      Match any character.
[ ]          Match any one of the characters (or ranges) listed within the square brackets.
[^ ]         Match a character which is not one of those listed within the square brackets.
*            Match zero or more of the preceding item.
+            Match one or more of the preceding item.
?            Match zero or one of the preceding item.
{n}          Match exactly n of the preceding item.
{n,m}        Match between n and m of the preceding item.
{n,}         Match n or more of the preceding item.
\            Escape, or remove the special meaning of the next character.
String       A sequence of characters.
\s           Match any character which is considered whitespace (space, tab etc).
\S           Match any character which is not whitespace.
\d           Match any character which is a digit (0-9).
\D           Match any character which is not a digit.
\w           Match any character which is a word character (A-Z, a-z, 0-9 and _).
\W           Match any character which is not a word character.
\t           Match a tab.
\r           Match a carriage return.
\n           Match a line feed (or newline).
^ (caret)    An anchor which matches the beginning of the line.
$ (dollar)   An anchor which matches the end of the line.
\b           Matches the beginning or end of a word.
\<           Matches the beginning of a word (GNU-style, e.g. grep and Vim; not supported by every engine).
\>           Matches the end of a word (GNU-style, e.g. grep and Vim; not supported by every engine).
( )          Group part of the regular expression.
\1 \2 etc    Refer to something matched by a previous grouping.
|            Match what is on either the left or right of the pipe symbol.
(?=x)        Positive lookahead.
(?!x)        Negative lookahead.
(?<=x)       Positive lookbehind.
(?<!x)       Negative lookbehind.
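The backreference and lookaround entries above are the least obvious ones, so here is a minimal sketch of them using Python's `re` module. The sample strings and variable names are made up for illustration. `\b` is used for word boundaries because `\<` and `\>` are not word anchors in Python's `re`.

```python
import re

# Backreference: \1 must repeat the text captured by group 1 (a doubled word).
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
print(doubled.group(0))  # is is

prices = "10 USD, 20 EUR, 30 USD"
# Positive lookahead: whole numbers followed by " USD" (the " USD" is not consumed).
usd = re.findall(r"\b\d+\b(?= USD)", prices)
# Negative lookahead: whole numbers NOT followed by " USD".
not_usd = re.findall(r"\b\d+\b(?! USD)", prices)
print(usd, not_usd)  # ['10', '30'] ['20']

order = "cost $15, qty 3"
# Positive lookbehind: digits immediately preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", order)
# Negative lookbehind: whole numbers not preceded by a dollar sign.
no_dollar = re.findall(r"(?<!\$)\b\d+", order)
print(after_dollar, no_dollar)  # ['15'] ['3']
```

The `\b` around `\d+` in the lookahead examples matters: without it, the engine can backtrack to a shorter run of digits (for example `1` out of `10`) and still satisfy a negative lookahead.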
# https://docs.python.org/3.7/howto/regex.html
{0,} is the same as *, {1,} is equivalent to +, and {0,1} is the same as ?.
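A quick way to convince yourself of these equivalences is to compare the first match of each pair on a few inputs. This is only a spot check on hand-picked sample strings, not a proof; the helper name `spans` is made up for this sketch.

```python
import re

samples = ["", "a", "aaa", "b", "baa"]

def spans(pattern):
    """Return the span of the first match of `pattern` in each sample (None if no match)."""
    result = []
    for s in samples:
        m = re.search(pattern, s)
        result.append(m.span() if m else None)
    return result

# Each shorthand quantifier next to its explicit {n,m} spelling.
pairs = [(r"a*", r"a{0,}"), (r"a+", r"a{1,}"), (r"a?", r"a{0,1}")]
results = {short: spans(short) == spans(long) for short, long in pairs}
print(results)  # {'a*': True, 'a+': True, 'a?': True}
```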
# https://www.analyticsvidhya.com/blog/2015/06/regular-expression-python/
Operators    Description
.            Matches any single character except newline '\n'.
?            Matches 0 or 1 occurrence of the pattern to its left.
+            Matches 1 or more occurrences of the pattern to its left.
*            Matches 0 or more occurrences of the pattern to its left.
\w           Matches an alphanumeric character, whereas \W (upper case W) matches a non-alphanumeric character.
\d           Matches a digit [0-9], and \D (upper case D) matches a non-digit.
\s           Matches a single whitespace character (space, newline, carriage return, tab, form feed), and \S (upper case S)
             matches any non-whitespace character.
\b           Matches the boundary between a word and a non-word character; \B is the opposite of \b.
[..]         Matches any single character in the square brackets, and [^..] matches any single character not in the square brackets.
\            Escapes characters with a special meaning, e.g. \. to match a period or \+ to match a plus sign.
^ and $      ^ and $ match the start and the end of the string respectively.
{n,m}        Matches at least n and at most m occurrences of the preceding expression; written as {,m}, it
             matches from 0 up to m occurrences.
a|b          Matches either a or b.
( )          Groups part of a regular expression and captures the matched text.
\t, \n, \r   Matches tab, newline, carriage return.
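The two tables disagree on whether `^` and `$` anchor to the line or to the string. In Python's `re` module both are true, depending on the `re.MULTILINE` flag. The sketch below shows that, plus group capture and the `{,m}` form; the sample text is made up for illustration.

```python
import re

text = "first line\nsecond line"

# Without flags, ^ matches only at the very start of the string.
string_start = re.findall(r"^\w+", text)
# With re.MULTILINE, ^ also matches right after each newline.
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
print(string_start, line_starts)  # ['first'] ['first', 'second']

# Likewise, $ matches at the end of the string, or at each line end with re.MULTILINE.
string_end = re.findall(r"\w+$", text)
line_ends = re.findall(r"\w+$", text, re.MULTILINE)
print(string_end, line_ends)  # ['line'] ['line', 'line']

# Parentheses capture the text matched by each group.
m = re.search(r"(\w+)@(\w+)\.com", "contact: alice@example.com")
captured = m.groups()
print(captured)  # ('alice', 'example')

# {,2} means "from 0 up to 2 occurrences".
zero_ok = re.fullmatch(r"a{,2}", "") is not None
two_ok = re.fullmatch(r"a{,2}", "aa") is not None
three_ok = re.fullmatch(r"a{,2}", "aaa") is not None
print(zero_ok, two_ok, three_ok)  # True True False
```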
# https://pythonspot.com/regular-expressions/
Regex        Description
\d           Matches any decimal digit; this is equivalent to the class [0-9].
\D           Matches any non-digit character; this is equivalent to the class [^0-9].
\s           Matches any whitespace character; this is equivalent to the class [ \t\n\r\f\v].
\S           Matches any non-whitespace character; this is equivalent to the class [^ \t\n\r\f\v].
\w           Matches any alphanumeric character or underscore; this is equivalent to the class [a-zA-Z0-9_].
\W           Matches any non-alphanumeric character; this is equivalent to the class [^a-zA-Z0-9_].
\Z           Matches only at the end of the string.
[..]         Match a single character in the brackets.
[^..]        Match any single character not in the brackets.
.            Match any character except newline.
$            Match the end of the string.
*            Match 0 or more repetitions of the preceding RE.
+            Match 1 or more repetitions of the preceding RE.
{m}          Match exactly m copies of the preceding RE.
A|B          Match A or B.
?            Match 0 or 1 repetitions of the preceding RE.
[a-z]        Any lowercase letter.
[A-Z]        Any uppercase letter.
[a-zA-Z]     Any letter.
[0-9]        Any digit.
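To see the shorthand classes in action, here is a small sketch on a made-up sample string. One caveat worth knowing: for `str` patterns in Python 3, `\d`, `\w` and `\s` are Unicode-aware by default, so the ASCII equivalences in the table hold exactly only with the `re.ASCII` flag.

```python
import re

sample = "Room 42,\tFloor_3!"

digits = re.findall(r"\d+", sample)        # runs of digits
words = re.findall(r"\w+", sample)         # runs of word characters (letters, digits, _)
spaces = re.findall(r"\s", sample)         # individual whitespace characters
capitals = re.findall(r"[A-Z]", sample)    # uppercase ASCII letters
others = re.findall(r"[^\w\s]", sample)    # neither word characters nor whitespace

print(digits)    # ['42', '3']
print(words)     # ['Room', '42', 'Floor_3']
print(spaces)    # [' ', '\t']
print(capitals)  # ['R', 'F']
print(others)    # [',', '!']

# By default \w also accepts non-ASCII letters; re.ASCII restricts it to [a-zA-Z0-9_].
unicode_word = re.fullmatch(r"\w", "\u00e9") is not None
ascii_word = re.fullmatch(r"\w", "\u00e9", re.ASCII) is not None
print(unicode_word, ascii_word)  # True False
```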
Example        Regex
IP address     (([2][5][0-5]\.)|([2][0-4][0-9]\.)|([0-1]?[0-9]?[0-9]\.)){3}(([2][5][0-5])|([2][0-4][0-9])|([0-1]?[0-9]?[0-9]))
Email          [^@]+@[^@]+\.[^@]+
Date MM/DD/YY  (\d+/\d+/\d+)
Integer(+)     (?
Integer        [+-]?(?
Float          (?<=>)\d+.\d+|\d+
Hexadecimal    \s–([0-9a-fA-F]+)(?:–)?\s
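As a rough check, the IP address, Email and Date patterns from the table can be exercised with Python's `re` module. The test inputs are made up for illustration, and the Integer entries are left out because they are cut off in the table. `re.fullmatch` is used for validation so that the pattern must cover the whole input.

```python
import re

IP = re.compile(
    r"(([2][5][0-5]\.)|([2][0-4][0-9]\.)|([0-1]?[0-9]?[0-9]\.)){3}"
    r"(([2][5][0-5])|([2][0-4][0-9])|([0-1]?[0-9]?[0-9]))"
)
EMAIL = re.compile(r"[^@]+@[^@]+\.[^@]+")
DATE = re.compile(r"(\d+/\d+/\d+)")

# fullmatch requires the whole string to be an address, so 256 is rejected.
ip_ok = IP.fullmatch("192.168.0.1") is not None
ip_bad = IP.fullmatch("256.1.1.1") is not None
print(ip_ok, ip_bad)  # True False

# The email pattern only checks the overall shape: something@something.something
email_ok = EMAIL.fullmatch("user@example.com") is not None
email_bad = EMAIL.fullmatch("user@@example.com") is not None
print(email_ok, email_bad)  # True False

# The date pattern extracts digit groups separated by slashes; it does not
# check that the month or day is in a valid range.
found = DATE.search("Due 02/18/17").group(1)
print(found)  # 02/18/17
```

With `search` instead of `fullmatch`, the IP pattern would still find `56.1.1.1` inside `256.1.1.1`, so anchoring matters when the goal is validation rather than extraction.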