@sysnucleus
Last active January 30, 2022 12:00
Commonly used Regular Expressions, with WebHarvy
(.*)
Selects only the first line from a block of text or HTML, since . does not match a line feed.
[\s]*(.*)
Selects the first line, ignoring leading white-space (spaces, line feeds and carriage returns).
[\s]* matches all white-space up to the first visible character.
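As a rough illustration outside WebHarvy, the two patterns above can be tried with Python's re module. The sample text and variable names are invented for this sketch, and WebHarvy's own regex engine may differ in small details.

```python
import re

# Hypothetical sample: blank lines and spaces first, then two lines of text.
text = "\n\n   First line\nSecond line"

# (.*) stops at the first line feed. This sample starts with a line feed,
# so the capture is an empty string.
plain = re.search(r"(.*)", text).group(1)

# [\s]* consumes the leading white-space first, so (.*) starts at 'F'.
trimmed = re.search(r"[\s]*(.*)", text).group(1)

print(repr(plain))
print(repr(trimmed))
```

This is why the second form is usually the safer choice when the selected block begins with blank lines or indentation.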
href="([^"]*)
Gets the href link/URL from HTML. [^"]* matches up to the next " character.
src="([^"]*)
Gets the src link/URL from HTML.
The attribute name can be changed as required, as shown below.
zoom-image="([^"]*)
data-large-image="([^"]*)
mailto:([^"]*)
Gets the email address from a mailto: link in HTML.
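The attribute patterns above can be checked against a small fragment, sketched here with Python's re module. The HTML, the URLs and the helper name first_group are all made up for the example. The patterns are written with the straight double quotes that appear in real HTML source.

```python
import re

# Hypothetical HTML fragment, invented for this example.
html = (
    '<a href="https://example.com/item/42">Item</a>'
    '<img src="/img/small.jpg" data-large-image="/img/large.jpg">'
    '<a href="mailto:sales@example.com">Email us</a>'
)

def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

link = first_group(r'href="([^"]*)', html)
image = first_group(r'src="([^"]*)', html)
large = first_group(r'data-large-image="([^"]*)', html)
email = first_group(r'mailto:([^"]*)', html)

print(link, image, large, email)
```

Each pattern returns only the first occurrence in the fragment, so the href pattern picks up the item link rather than the mailto link.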
Alloy Wheels([\s\S]*?)<div class="icon">
Gets the string between 'Alloy Wheels' and <div class="icon">. This can be modified to match
any string which is guaranteed to appear between two other strings in HTML or in text.
[\s\S]*? matches any characters (white-space and non-white-space, including line breaks); the trailing ? makes it stop at the first occurrence of the ending string.
Starting Text([\s\S]*?)Ending Text
General format of the above case. Just place ([\s\S]*?) between the starting and ending portion
and the in-between text or HTML is matched and selected.
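A minimal sketch of the start/end technique in Python's re module follows. The HTML sample and the helper name between are assumptions made for illustration. The helper escapes both markers with re.escape so that characters such as . or $ in them are matched literally.

```python
import re

# Hypothetical HTML with two closing markers, to show lazy vs greedy matching.
html = (
    'Alloy Wheels</h3>\n<p>17 inch, set of 4</p>\n'
    '<div class="icon"></div><div class="icon"></div>'
)

# Lazy form from the text: stops at the first <div class="icon">.
lazy = re.search(r'Alloy Wheels([\s\S]*?)<div class="icon">', html).group(1)

# Greedy form (no ?): runs on to the last <div class="icon">.
greedy = re.search(r'Alloy Wheels([\s\S]*)<div class="icon">', html).group(1)

def between(start, end, text):
    """Return the text between two literal markers, or None if not found."""
    pattern = re.escape(start) + r"([\s\S]*?)" + re.escape(end)
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(repr(lazy))
print(repr(greedy))
```

The lazy quantifier is what keeps the capture from swallowing everything up to the last occurrence of the ending text.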
itemprop="name">([^<]*)<div class="line">
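Gets the text between itemprop="name"> and <div class="line">. [^<]* matches all characters up to the next <,
so this matches only when no other tag lies between the two strings.
itemprop="name">([\s\S]*?)<div class="line">
Same as above, but also matches when other HTML tags lie between the two strings.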
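The two itemprop patterns give the same result when only plain text sits between the markers. They part ways when a tag sits in between, as this Python sketch suggests. Both HTML samples and the variable names are invented for the example.

```python
import re

# Hypothetical samples: one with plain text, one with a nested <b> tag.
plain = 'itemprop="name">Blue Kettle<div class="line">'
nested = 'itemprop="name"><b>Blue</b> Kettle<div class="line">'

strict = r'itemprop="name">([^<]*)<div class="line">'
loose = r'itemprop="name">([\s\S]*?)<div class="line">'

strict_plain = re.search(strict, plain).group(1)
loose_plain = re.search(loose, plain).group(1)

# [^<]* cannot cross the < of <b>, so the strict pattern finds no match here.
strict_nested = re.search(strict, nested)
loose_nested = re.search(loose, nested).group(1)

print(strict_plain, loose_plain, strict_nested, loose_nested)
```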
(?=[^M]*MAP)[^M]*MAP: \$(.*)|List Price: \$(.*)
Conditional regular expression. Captures the MAP price if available (first group), else captures the List Price (second group).
RegEx special characters like $, ., ^ etc. must be escaped with \ to match them literally (example: \$, \.).
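To see which branch fires, the conditional pattern can be run in Python's re module. This is only a sketch: the price strings and the helper name price are made up, and how WebHarvy itself reports the two groups is not shown here.

```python
import re

pattern = r"(?=[^M]*MAP)[^M]*MAP: \$(.*)|List Price: \$(.*)"

def price(text):
    """Return the MAP price if the first branch matched, else the list price."""
    match = re.search(pattern, text)
    if match is None:
        return None
    # Group 1 is filled by the MAP branch, group 2 by the List Price branch.
    return match.group(1) if match.group(1) is not None else match.group(2)

# Hypothetical inputs.
both = "List Price: $120.00\nMAP: $99.00"
list_only = "List Price: $120.00"

print(price(both))
print(price(list_only))
```

One caveat of the [^M]* part: if a capital M appears between the start of the text and 'MAP', the first branch fails from that position and the List Price branch can win even though a MAP price exists.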
<img src="([^"]*)
Gets the first image URL in HTML.
<img src=[\s\S]*?<img src="([^"]*)
Gets the second image URL in HTML, i.e. the src value of the second img tag.
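The first-image and second-image patterns can be compared on a small fragment, sketched here in Python. The HTML and file names are invented. The re.findall call at the end is a Python-side convenience for listing every match, not something the text above describes.

```python
import re

# Hypothetical fragment with three images.
html = '<img src="/a.jpg"><p>text</p><img src="/b.jpg"><img src="/c.jpg">'

# First image: the first <img src="..."> in the text.
first = re.search(r'<img src="([^"]*)', html).group(1)

# Second image: skip lazily from the first <img src= to the next one.
second = re.search(r'<img src=[\s\S]*?<img src="([^"]*)', html).group(1)

# Python-only alternative: collect the src of every matching img tag.
all_urls = re.findall(r'<img src="([^"]*)', html)

print(first, second, all_urls)
```

Both patterns assume src is the first attribute directly after <img; a tag written as <img class="x" src="..."> would not match as written.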
(In Stock)
Matches and gives value 'In Stock', only if the selected HTML or TEXT has the text 'In Stock'.
This can be used to check if the selected HTML or TEXT contains a specific string.
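The presence check can be sketched in Python as follows. The function name in_stock and the sample strings are assumptions made for the example.

```python
import re

def in_stock(text):
    """Return 'In Stock' if the text contains it, else None."""
    match = re.search(r"(In Stock)", text)
    return match.group(1) if match else None

# Hypothetical inputs.
print(in_stock("Availability: In Stock"))
print(in_stock("Availability: Out of Stock"))
print(in_stock("availability: in stock"))
```

The match is case-sensitive by default, so the lower-case 'in stock' in the last call is not found.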
merch_name[^>]*>([^<]*)
Matches the string which comes between 2 HTML tags where the starting tag contains the text 'merch_name'.
[^>]*> matches till the next >
[^<]* matches till the next <
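A short Python sketch of the merch_name pattern shows how its three parts split the tag. The HTML sample is invented for the example.

```python
import re

# Hypothetical tag whose class attribute contains 'merch_name'.
html = '<span class="merch_name big" id="t1">Trail Backpack</span>'

# merch_name  finds the marker inside the opening tag
# [^>]*>      skips the rest of that tag, up to and including its >
# ([^<]*)     captures the text up to the next <
name = re.search(r"merch_name[^>]*>([^<]*)", html).group(1)

print(name)
```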