Last active
January 31, 2024 03:02
\d : a single digit, [0-9]                               \D : anything not matched by \d
\w : a single word character, [a-zA-Z0-9_]               \W : anything not matched by \w
\s : a single whitespace character (space, tab, newline) \S : anything not matched by \s
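A minimal sketch of the three shorthand classes in Python's re module. The sample string is made up for illustration.

```python
import re

# Hypothetical sample text: contains digits, a tab, an underscore and spaces.
sample = "Room 42,\tfloor_3 A"

digits = re.findall(r"\d", sample)           # ['4', '2', '3']
word_chars = re.findall(r"\w", "a_1 !")      # ['a', '_', '1'] (underscore counts as \w)
whitespace = re.findall(r"\s", sample)       # [' ', '\t', ' ']
non_space_runs = re.findall(r"\S+", sample)  # ['Room', '42,', 'floor_3', 'A']

print(digits, word_chars, whitespace, non_space_runs)
```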
Special Characters:
.        --> any character (including spaces), except a newline by default
*        --> 0 or more of the preceding item
+        --> 1 or more of the preceding item
x?       --> the character x is optional (0 or 1 occurrence)
[ ]      --> set of characters; matches any one character from the set
[a-zA-Z] --> any one character from a to z or A to Z
[^a-z]   --> any one character that is not a to z (includes whitespace & newline)
( )      --> groups characters
|        --> "or"; used with grouping
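The special characters above can be tried out with a short sketch like the following. All test strings are invented for illustration.

```python
import re

# . matches a space but, by default, not a newline
dot_matches_space = bool(re.fullmatch(r"a.c", "a c"))     # True
dot_matches_newline = bool(re.fullmatch(r"a.c", "a\nc"))  # False

star = re.findall(r"ab*", "a ab abbb")   # ['a', 'ab', 'abbb']  (0 or more b)
plus = re.findall(r"ab+", "a ab abbb")   # ['ab', 'abbb']       (1 or more b)
optional = re.findall(r"colou?r", "color colour colouur")  # ['color', 'colour']

# A negated set matches anything outside it, including whitespace and newline
negated = re.findall(r"[^a-z]", "ab C\n1")  # [' ', 'C', '\n', '1']

# Grouping with |; group(0) is the whole match, not just the group
either = [m.group(0) for m in re.finditer(r"(cat|dog)s", "cats dogs birds")]
# ['cats', 'dogs']

print(star, plus, optional, negated, either)
```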
Positioning characters:
^abc --> abc at the beginning
abc$ --> abc at the end
x{a,b} --> x repeated between a and b times
or
x{a} --> x repeated exactly a times
\b\w{a,b}\b --> whole word (between word boundaries) of length a to b
or
\b\w{a}\b --> whole word of length exactly a
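A small sketch of anchors, repetition counts and word boundaries. It assumes the count applies to \w (a {a,b} count always needs an item before it); the sample strings are made up.

```python
import re

text = "abc then an abc"
starts_with_abc = bool(re.search(r"^abc", text))    # True
ends_with_abc = bool(re.search(r"abc$", text))      # True
starts_with_then = bool(re.search(r"^then", text))  # False, "then" is not at the start

words = "a to the four seven"
three_to_four = re.findall(r"\b\w{3,4}\b", words)   # ['the', 'four']
exactly_five = re.findall(r"\b\w{5}\b", words)      # ['seven']

print(three_to_four, exactly_five)
```

Without the \b on both sides, \w{3,4} would also match the first four letters of "seven".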
Note:
Use \. when referring to a literal period (.)
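A quick sketch of why the backslash matters, using an invented sample string. re.escape is the standard-library helper that adds the backslashes for you.

```python
import re

sample = "3.14 3x14"

unescaped = re.findall(r"3.14", sample)   # ['3.14', '3x14'], the . matches any character
escaped = re.findall(r"3\.14", sample)    # ['3.14'], only a literal period matches

# re.escape builds the escaped pattern from a plain string
escaped_pattern = re.escape("3.14")       # the five characters 3\.14
print(unescaped, escaped, escaped_pattern)
```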
Complex Examples:
rainbow.* --> rainbowcat, rainbowdog, rainbow…
colou?r --> color, colour
l[yi]nk --> link, lynk
\d{3}[-.]\d{3}[-.]\d{4} --> 3 digits, then - or ., then 3 digits, then - or ., then 4 digits
                        --> 917-555-1234, 646.867-5309, 999-444.3333
[0-5] --> any one digit from 0 to 5
[^abc] --> any one character except a, b, or c --> d, x, 7
(cats|dogs) --> matches exactly cats or dogs
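Two of the examples above, checked in Python as a rough sketch. The last two phone candidates are invented to show non-matches.

```python
import re

phone = re.compile(r"\d{3}[-.]\d{3}[-.]\d{4}")
candidates = [
    "917-555-1234",
    "646.867-5309",
    "999-444.3333",
    "917 555 1234",   # spaces are not in the [-.] set
    "91-7555-1234",   # wrong digit grouping
]
# fullmatch requires the whole string to fit the pattern
matched = [c for c in candidates if phone.fullmatch(c)]
# ['917-555-1234', '646.867-5309', '999-444.3333']

links = re.findall(r"l[yi]nk", "link lynk lunk")  # ['link', 'lynk']
print(matched, links)
```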
Advanced Regex:
Obtain a substring from a string (capture it with a ( ) group)
Check out this site to brush up your skills:
https://regexone.com/lesson/end?
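One common way to obtain a substring is a capture group, sketched below. The file name and the date layout are assumptions made up for this example.

```python
import re

# Hypothetical input: a file name with an embedded date.
filename = "report_2024-01-31.csv"

# Each ( ) captures one piece of the match
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", filename)

whole = match.group(0) if match else None   # '2024-01-31', the full match
year, month, day = match.groups() if match else (None, None, None)

print(whole, year, month, day)  # 2024-01-31 2024 01 31
```

group(0) is always the whole match; group(1), group(2), ... are the pieces in the order their opening parentheses appear.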
Regex in Python
import re
1) Obtaining the whole matched string:
filtered_sheet_names = list(filter(expression.search, xls.sheet_names))  # expression is a compiled pattern
val = list(filter(exp.search, f.read().split("\n")))  # keeps every line in which exp matches anywhere
2) Obtaining only a particular matched part of the string:
re.findall(r"\d\d\d$", i)  # the last three characters of i, if they are digits
3) Replacing the matched string:
re.sub("re1", "string", i)  # replaces every match of re1 in i with "string"
4) Keeping only the values that match completely:
values = ['123', '234', 'foobar']
filtered_values = list(filter(lambda v: re.match(r'^\d+$', v), values))  # Note: the whole value must match to be kept
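The snippets above depend on objects that are not shown (xls, f, exp, i). A self-contained sketch of the same calls, with a made-up list standing in for the sheet names or file lines:

```python
import re

# Hypothetical stand-in for xls.sheet_names or the lines of a file.
items = ["Sheet_2021", "summary", "Sheet_2022", "notes 123"]
expression = re.compile(r"Sheet_\d{4}")

# 1) keep the whole item when the pattern is found anywhere in it
whole_items = list(filter(expression.search, items))
# ['Sheet_2021', 'Sheet_2022']

# 2) pull out only the matched part
last_three = [re.findall(r"\d\d\d$", item) for item in items]
# [['021'], [], ['022'], ['123']]

# 3) replace the matched part
replaced = [re.sub(r"\d+", "#", item) for item in items]
# ['Sheet_#', 'summary', 'Sheet_#', 'notes #']

# complete match only: fullmatch needs no ^ and $
values = ["123", "234", "foobar", "12a"]
filtered_values = [v for v in values if re.fullmatch(r"\d+", v)]
# ['123', '234']

print(whole_items, last_three, replaced, filtered_values)
```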
Example:
Matching one octet (0 to 255) of an IP address; the ( ) keep ^ and $ applied to every alternative
^([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$
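The octet alternatives cover a single number from 0 to 255. A possible extension to a full dotted IPv4 address is sketched below; the names OCTET, IPV4 and is_ipv4 are made up for this example. It also shows why the alternatives need grouping: in an ungrouped alternation, ^ binds only to the first alternative and $ only to the last.

```python
import re

# One octet, 0 to 255, in a non-capturing group so it can be repeated
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
# Four octets separated by literal periods
IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

def is_ipv4(text: str) -> bool:
    """Return True if text is a dotted IPv4 address with no leading zeros."""
    return IPV4.fullmatch(text) is not None

print(is_ipv4("192.168.0.1"))   # True
print(is_ipv4("256.1.1.1"))     # False, 256 is out of range
print(is_ipv4("1.2.3"))         # False, only three octets

# Pitfall: without grouping, "^[0-9]" alone matches the first 9 of "999"
ungrouped = r"^[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]$"
ungrouped_accepts_999 = bool(re.search(ungrouped, "999"))  # True
```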