Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save mcarbonneaux/ac4d105d80808de8b35c8d4a15e1b57f to your computer and use it in GitHub Desktop.
Save mcarbonneaux/ac4d105d80808de8b35c8d4a15e1b57f to your computer and use it in GitHub Desktop.
Bash regular expression match with groups including example to parse http_proxy environment variable

The newer versions of bash include a regex operator =~

Simple example

$ re='t(es)t'
$ [[ "test" =~ $re ]]
$ echo $?
0
$ echo ${BASH_REMATCH[1]}
es

Note

  • operator returns true if it's able to match
  • nested groups are possible (example below shows ordering)
    • 0 is the whole match
    • optional groups are counted even if not present and will be indexed, but be empty/null
  • global match isn't suported, so it only matches once
  • the regex must be provided as an unquoted variable reference to the re var

Match fails if re specified directly as string. Seems to want to be unquoted...

[[ "test" =~ 't(es)t' ]]; echo $?;
1

More complex example to parse the http_proxy env var

http_proxy_re='^https?://(([^:]{1,128}):([^@]{1,256})@)?([^:/]{1,255})(:([0-9]{1,5}))?/?'
if [[ "$http_proxy" =~ $http_proxy_re ]]; then
  # care taken to skip parent nesting groups 1 and 5
  user=${BASH_REMATCH[2]}
  pass=${BASH_REMATCH[3]}
  host=${BASH_REMATCH[4]}
  port=${BASH_REMATCH[6]}
fi

Examples of tricky issues with limitations:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment