@Jeff-Russ
Last active March 3, 2017 20:36

Handy Regex Patterns

Here are some handy regex patterns and portions of patterns. I seem to keep coming back to these so it made sense to document them for the sake of memory.

Numerics

Signed integers or floats. Omitting the zero before the decimal point, as in .1, +.1 or -.1, is valid.

Must Match [-+]?\d*\.?\d+ or Require sign [-+]\d*\.?\d+
Optionally [-+]?\d*\.?\d* or Require sign [-+]\d*\.?\d*
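
To see what the "Must Match" patterns accept, here is a minimal sketch in PHP. The helper name is_signed_number and the added ^ and $ anchors are assumptions made for this demo (the patterns above are unanchored); the anchors just make the test apply to the string as a whole.

```php
<?php
// Hypothetical helper: true when the string as a whole is a signed
// integer or float. The ^ and $ anchors are added for this demo only.
function is_signed_number(string $s, bool $requireSign = false): bool
{
    $pattern = $requireSign
        ? '/^[-+]\d*\.?\d+$/'   // sign is mandatory
        : '/^[-+]?\d*\.?\d+$/'; // sign is optional
    return preg_match($pattern, $s) === 1;
}

foreach (['42', '-1.5', '+.1', '.1', '1.', 'abc'] as $s) {
    printf(
        "%-5s optional sign: %-3s  required sign: %s\n",
        $s,
        is_signed_number($s) ? 'yes' : 'no',
        is_signed_number($s, true) ? 'yes' : 'no'
    );
}
```

Note that '1.' is rejected by both patterns, because \d+ requires at least one digit after the optional dot.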

Starts with Numeric. This is useful in PHP, where arithmetic on strings, or casting strings to float or int, works if the string starts with a numeric.

Must Match ^[-+]?\d*\.?\d+.*$
w/captures ^([-+]?\d*\.?\d+)(.*)$
cap. space ^([-+]?\d*\.?\d+)(\s*)(.*)$

Using * instead of + for the space and the ending means the pattern won't fail just because they are missing; you will simply get an empty string for them. Here is that last one in PHP:

if (preg_match('/^([-+]?\d*\.?\d+)(\s*)(.*)$/', '-1.5 the ending', $c)) {
  // $c[0] holds the whole match; the capture groups start at index 1
  $num   = $c[1];
  $space = $c[2];
  $end   = $c[3];
} else $num = $space = $end = '';

You could skip the need to initialize the empty variables by accepting any string, even one that does not start with a numeric, by changing the pattern to ^([-+]?\d*\.?\d*)(\s*)(.*)$. Then if $c[1]==='' you know the string didn't start with a numeric.
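
A sketch of that any-string variant, wrapped in a helper. The function name split_leading_number and the fallback branch are assumptions for illustration, not part of the original notes; the fallback only covers inputs the pattern cannot match at all, such as strings containing an inner newline.

```php
<?php
// Hypothetical helper: split a string into [number, space, rest]
// using the optional-numeric variant of the pattern.
function split_leading_number(string $s): array
{
    if (!preg_match('/^([-+]?\d*\.?\d*)(\s*)(.*)$/', $s, $m)) {
        // Only reached when the pattern cannot match (e.g. an inner newline).
        return ['', '', $s];
    }
    // $m[0] is the whole match; the captures start at index 1.
    return [$m[1], $m[2], $m[3]];
}

[$num, $space, $end] = split_leading_number('-1.5 the ending');
// $num === '-1.5', $space === ' ', $end === 'the ending'

[$num2, $space2, $end2] = split_leading_number('no number here');
// $num2 === '' signals that the string did not start with a numeric
```

One caveat: because every part of the numeric is optional in this variant, a lone sign or dot (as in '-abc') also lands in the first group, so a non-empty first group is not a guarantee of a usable number.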

Delimiting Portions of Strings

If you know Bash, you'll know that wrapping a portion of a string in backticks or in $( ) causes it to be executed in a subshell (command substitution). This is like string interpolation but with execution. You might want to accept a string in which a portion, marked by wrapping it in backticks, is processed with its own syntax.

Capture All (`.*`)
Just Inner (?:`(.*)`)

That last one works because you can nest a capture group ( ) inside a non-capture group (?: ) and put the backticks in the non-capture group but just outside the capture.
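
Here is a small PHP sketch of the difference between the two patterns. The sample string is made up for the demo.

```php
<?php
$input = 'run `ls -la` now'; // made-up sample input

// Capture All: group 1 includes the backticks.
preg_match('/(`.*`)/', $input, $all);
// $all[1] === '`ls -la`'

// Just Inner: group 1 is only the text between the backticks.
preg_match('/(?:`(.*)`)/', $input, $inner);
// $inner[1] === 'ls -la'
```

Because .* is greedy, a string with several backticked portions is captured from the first backtick through to the last one.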

Get After ^(?:`(.*)`)?(.+)?$

The above would split '`cmd`after' into 'cmd' and 'after'. If you simply had 'after', it would be grabbed by the second capture. In other words, when there are no backticks the text defaults to the last group. If you want it to default to the first group instead, so that 'key' gives the same result as '`key`', use:

Default to 1st Group ^`?([^`]*)`?(.*)$

Keep in mind you lose any backtick in the captures, so if you feed it '`key' you would get back 'key'. In other words, nothing makes sure the text is wrapped in two backticks before the backticks are treated as delimiters.
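
The "Get After" and "Default to 1st Group" patterns can be compared side by side with a short PHP sketch. The helper names get_after and default_first are hypothetical, and the ?? '' fallbacks are an assumption added because preg_match leaves trailing unmatched groups out of the result array.

```php
<?php
// Hypothetical helper around the "Get After" pattern.
function get_after(string $s): array
{
    preg_match('/^(?:`(.*)`)?(.+)?$/', $s, $m);
    // Unmatched trailing groups are absent from $m, hence the fallbacks.
    return [$m[1] ?? '', $m[2] ?? ''];
}

// Hypothetical helper around the "Default to 1st Group" pattern.
function default_first(string $s): array
{
    preg_match('/^`?([^`]*)`?(.*)$/', $s, $m);
    return [$m[1] ?? '', $m[2] ?? ''];
}

print_r(get_after('`cmd`after'));     // 'cmd' and 'after'
print_r(get_after('after'));          // '' and 'after'
print_r(default_first('`cmd`after')); // 'cmd' and 'after'
print_r(default_first('key'));        // 'key' and ''
print_r(default_first('`key'));       // 'key' and '': the stray backtick is lost
```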
