Skip to content

Instantly share code, notes, and snippets.

@Janfy
Forked from JPvRiel/bash_regex_match_groups.md
Created January 28, 2017 18:35
Show Gist options
  • Save Janfy/e11464ecb47167d6f88609dc28265d03 to your computer and use it in GitHub Desktop.
Save Janfy/e11464ecb47167d6f88609dc28265d03 to your computer and use it in GitHub Desktop.
Bash regular expression match with groups including example to parse http_proxy environment variable

The newer versions of bash include a regex operator =~

Simple example

$ re='t(es)t'
$ [[ "test" =~ $re ]]
$ echo $?
0
$ echo ${BASH_REMATCH[1]}
es

Note

  • operator returns true if it's able to match
  • nested groups are possible (example below shows ordering)
    • 0 is the whole match
    • optional groups are counted even if not present and will be indexed, but be empty/null
  • global match isn't suported, so it only matches once
  • the regex must be provided as an unquoted variable reference to the re var

Match fails if re specified directly as string. Seems to want to be unquoted...

[[ "test" =~ 't(es)t' ]]; echo $?;
1

More complex example to parse the http_proxy env var

http_proxy_re='^https?://(([^:]{1,128}):([^@]{1,256})@)?([^:/]{1,255})(:([0-9]{1,5}))?/?'
if [[ "$http_proxy" =~ $http_proxy_re ]]; then
  # care taken to skip parent nesting groups 1 and 5
  user=${BASH_REMATCH[2]}
  pass=${BASH_REMATCH[3]}
  host=${BASH_REMATCH[4]}
  port=${BASH_REMATCH[6]}
fi

Examples of tricky issues with limitations:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment