Skip to content

Instantly share code, notes, and snippets.

@cpuuntery
Last active May 31, 2023 15:48
Show Gist options
  • Save cpuuntery/8455f715af75cc60b323feb1e9da7885 to your computer and use it in GitHub Desktop.
Save cpuuntery/8455f715af75cc60b323feb1e9da7885 to your computer and use it in GitHub Desktop.
[Very important. I repeat, this is very important]
YOU CAN NOT USE TWO REGEX EXPRESSIONS AT THE SAME TIME, BUT YOU CAN TRY TO BE AS SPECIFIC AS POSSIBLE. AND TRY TO BE FLEXIBLE AND MIX AND MATCH THE EXPRESSIONS
AND IF WORST COME TO WORST YOU CAN REGEX A REGEX OUTPUT
#If you use any regex and the regex is not matching anything, or it is not matching the expected behaviour. In that case, you must check if the regex is lazy or greedy and the regex flags
g - Global. Finds all matches instead of stopping after the first.
i - Ignore case. /[a-z]/i is equivalent to /[a-zA-Z]/.
m - Multiline. lets ^ and $ match the start and end of every single line in a string that contains multiple lines, instead of only the start and end of the string
s - single-line mode. Use it if you want to select multiple lines. It will most likely be used with [select everything between]
# The .* is greedy and matches a lot of strings, sometime it is desirable, and sometimes it is not desirable. So to make it non-greedy (lazy as it called in the world of regex) just add ? Like this .*?
EVERY SINGLE REGEX OPERATIONS CONSTRUCTOR CAN BE USED WITH THE OTHER CONSTRUCTORS. FROM THE QUANTIFIERS EX (+) TO LOOKBEHIND EX (?=PATTERN). YOU CAN MIX AND MATCH REGEX CONSTRUCTORS TO YOUR HEART'S CONTENT. I REPEAT, THE SKY'S THE LIMIT.
[Very important. I repeat, this is very important]
# select every text until the end of the line
TAM3(?<=).*$
# regex select everything between
# include the word in the result
word.*?word
# exclude the word in the result
(?<=This is).*?(?=sentence)
# regex include lines containing a specific word
^.*(?=composing|composed).*$
# regex exclude lines containing a specific word
^(?:(?!android|google|qualcomm).)*$
# match strings longer than 40 characters
.{40,}
# \d+ means a all digits after a digit. \w+ means a all characters after a character including digits
To match 2020-4-28
\d+-\d+-\d+
OR
# if you want to specify the exact number of digits. It will not match if the number of digit is different. The same goes for characters
To match 2020-4-28
\d\d\d\d-\d\d-\d\d
# [a-z] is better than \w . because it excludes digits. for the UPPER CASE [A-Z] and for both case [a-zA-Z] alternatively you can use the i flag. just like there are /w+ there are [A-Z]+
# match all lines containing a string
^.*the-string.*$
# select a number of lines after a string
string((.*\n){number of lines})
# remove duplicate lines (the bothe the orignal and the duplicate)
(^.*$\n)\1+
# The ? Is like an OR statement. It will match anything, whether it started with the character before it or not.
To match negative number with the sign if they have it and positive number but without the sign.
-?\d+
To match negetive number and positibe number with the sighn if they have it
\+?-?\d+
# If the greedy match is too much and the lazy match is too little. The solution maybe to use \b (word boundary)
it is the same as (^) and ($) in the Multiline mode, but the difference is \b also mean before and after a spice
\b.*\b
# You can put * after \d, and it will mean any character after a digit, just like \d+ means any number after a digit until you find no digits
\d*
# quantifiers are very useful if you repeat the exact same regex
instead of this
regexregexregex
use
regex{3}
# Replace only the last part of a matched string
Every () is a group, and there are two groups. $1 return the first group and anything after it, but everything after $1 will replace value of the second group if there is nothing after $1 the second group will be removed
match parameter
(.+)(.)
replace parameter
$1
# character classes
. [^\r\n] matches anything except for a newline.
\d [0-9] matches a digit
\D [^0-9] matches all but a digit
\w [a-zA-Z0-9_] matches a word character
\W [^a-zA-Z0-9_] matches all but a word character
\s Matches any whitespace character, including spaces and tabs.
\S Matches all but a whitespace character.
# Character set is anything inside [] and anything inside will be treated as a separate character, and it will match anything that contains at least one of the characters inside the []
regex special characters inside [] don't need to be escaped
the quantifiers do not work inside the [] you need to put the quantifiers outside the []
you can put (|) between two character sets
(|) is an OR statement, it means any character from the first set OR the second set
Negated sets is like an exclude/inverse search,
the result is everything but the specified regex
it starts with [^ and end with]
[^regex]
# Capture group is anything inside () you can use it for
1-make the quantifier treat anything inside () as a whole string
(?:regex)
2-replace part of a matched string but keep the other part intact
(regex)
# Lookahead and Lookbehind
Positive Lookahead (?=regex)
Negative Lookahead (?!regex)
Positive Lookbehind (?<=regex)
Negative Lookbehind (?<!regex)
Positive Lookahead is usually like this
the_other_regex(?=regex)
Positive Lookbehind is usually like this
(?<=regex)the_other_regex
lookahead is a word boundary BEFORE the regex
Lookbehind is a word boundary AFTER the regex
You can use the (|) with Lookahead and Lookbehind
(?!string1|string2|string3)
# You can use the | operator inside () and []
# Pipe regex into another regex
Special thanks to [Lookahead] without it . Piping would not be possible.
The first regex pattern is piped into the second regex pattern or the third if it exists
^.*(?=does contain).*(?=does contain).*(?=does contain).*$
MAKE SURE THE FOLLOWING REGEX PATTERNS DOES NOT CONTAIN SPACE AT THE END
############################################### [DOES CONTAIN] AND [DOES NOT CONTAIN] CAN BE EITHER REGEX PATTERN OR A PLAIN STRING ##############################################################
^.*(?=does contain)(?:(?!does not contain).)*$ The first regex pattern [does contain] is piped into the second regex pattern [does not contain]
^.*(?=does contain).*(?=does contain).*$ The first regex pattern [does contain] is piped into the second regex pattern [does contain]
^(?:(?!.*does not contain).)(?:(?!does not contain).)*$ The first regex pattern [does not contain] is piped into the second regex pattern [does not contain]
^(?:(?!.*does not contain).).*(?=does contain).*$ The first regex pattern [does not contain] is piped into the second regex pattern [does contain]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment