
@Crystalh
Last active August 29, 2015 14:01
Bash capture groups

data:

locator: urn:asset:a1e2d1df-a741-6547-93e2-e84692c8e981
locator: urn:bbc:cps:asset:27510321

script:

#!/bin/bash

while read l
    do
        if [[ $l =~ [(\w*:\s\w*:\w*:)(\w*-.*)] ]]; then
            for i in "${BASH_REMATCH[@]}"
            do
                echo "Index: $i"
            done
            
            echo "${BASH_REMATCH[2]}"
        fi
done < data.txt
@pgchamberlin

I think Crystal's matches because surrounding the whole regex in [ ] makes Bash treat it as a single character class (a bracket expression). Inside the brackets, ( ) * and . are just literal members of the set, so the class matches any one character it contains, and both data lines contain at least one of them (the : for a start). So you're right: it is matching on an individual character, but capturing none of them, which is why ${BASH_REMATCH[2]} comes out empty.
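To make the bracket-versus-parentheses point concrete, here is a minimal sketch. The patterns `class_re` and `group_re` are illustrative assumptions, not the regex from the gist, and both are stored in variables so Bash passes them to the regex engine unchanged.

```bash
#!/bin/bash

line='locator: urn:bbc:cps:asset:27510321'

# Outer brackets: a bracket expression, i.e. ONE character out of the
# set { ( x : ) y }. The parentheses inside are literal characters.
class_re='[(x:)(y)]'
[[ $line =~ $class_re ]]
class_match=${BASH_REMATCH[0]}     # first character of the line that is in the set
class_slots=${#BASH_REMATCH[@]}    # only slot 0 is filled: nothing was captured

# No outer brackets: the parentheses now form two capture groups.
group_re='^(.*:)([^:]+)$'
[[ $line =~ $group_re ]]
prefix=${BASH_REMATCH[1]}          # everything up to and including the last colon
id=${BASH_REMATCH[2]}              # the segment after the last colon

printf 'class match: %s (%d slot)\n' "$class_match" "$class_slots"
printf 'prefix: %s\nid: %s\n' "$prefix" "$id"
```

With the bracketed pattern, `BASH_REMATCH` holds only the whole-match slot, so index 2 is unset and echoes as an empty string.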

Looking at mine again, of course I had forgotten that .+ matches spaces, so it matches all characters up until the final :, then captures everything up to the end.
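The pattern being described is not shown in the thread, so the following is only a sketch of the same effect using a hypothetical pattern, `greedy_re`. It is contrasted with `first_re`, which stops at the first colon because `[^:]` cannot cross one.

```bash
#!/bin/bash

line='locator: urn:asset:a1e2d1df-a741-6547-93e2-e84692c8e981'

# Hypothetical greedy pattern: .+ also matches spaces and colons, so it
# runs up to the LAST colon before handing over to the second group.
greedy_re='^(.+):(.*)$'
[[ $line =~ $greedy_re ]]
greedy_head=${BASH_REMATCH[1]}
greedy_tail=${BASH_REMATCH[2]}

# [^:]+ cannot cross a colon, so the first group ends at the FIRST one.
first_re='^([^:]+):(.*)$'
[[ $line =~ $first_re ]]
first_head=${BASH_REMATCH[1]}
first_tail=${BASH_REMATCH[2]}

printf 'greedy: [%s] [%s]\n' "$greedy_head" "$greedy_tail"
printf 'first:  [%s] [%s]\n' "$first_head" "$first_tail"
```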

@sthulb

sthulb commented May 23, 2014

Rubular uses Ruby's regex engine, which is Perl-style (close to PCRE), and Bash uses POSIX ERE. The differences are subtle, but they're there.
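One such difference is that `\w` and `\s` are not defined by POSIX ERE. Some regex libraries accept them as extensions, but the bracket classes `[[:alnum:]_]` and `[[:space:]]` are the portable spelling. As a sketch, here are the gist's two groups with the outer brackets dropped and the shorthands replaced; the helper variables `w` and `s` are assumptions for readability.

```bash
#!/bin/bash

# Perl-style shorthand      POSIX ERE bracket class
#   \w                      [[:alnum:]_]
#   \s                      [[:space:]]
w='[[:alnum:]_]'
s='[[:space:]]'

# (\w*:\s\w*:\w*:)(\w*-.*) rewritten with POSIX classes.
re="(${w}*:${s}${w}*:${w}*:)(${w}*-.*)"

line='locator: urn:asset:a1e2d1df-a741-6547-93e2-e84692c8e981'
if [[ $line =~ $re ]]; then
    prefix=${BASH_REMATCH[1]}   # "locator: " plus the two URN segments
    uuid=${BASH_REMATCH[2]}     # the hyphenated identifier
    printf 'prefix: %s\nuuid: %s\n' "$prefix" "$uuid"
fi
```

This pattern requires a hyphen in the second group, so the second data line (`locator: urn:bbc:cps:asset:27510321`) does not match it.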
