
Regular Expression - Match content between the nth and (n+1)th occurrences of a specified character pattern.

This regular expression is designed to work with character-delimited strings, and provides a means to specify the index of delimiter occurrence at which content is extracted.

Parameters

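  1. CHARACTER_PATTERN - Delimiter character pattern to search for within the target string. Regex metacharacters must be escaped (e.g. '\|' for a pipe, as in the examples below).
  2. INDEX - Zero-based index of the delimited section to extract. 0 returns the content before the first occurrence of CHARACTER_PATTERN; n returns the content between the nth and (n+1)th occurrences.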

Regular Expression (Single-line)

(?:(?:.*?CHARACTER_PATTERN\s?){INDEX})(.+?)(?:(?=\s?$)|(?=\s?CHARACTER_PATTERN))
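The two placeholders can also be filled in programmatically. The following is a minimal Python sketch, not part of the original gist: `build_pattern` and `extract` are hypothetical helper names, and `re.escape` is used on the assumption that the delimiter is a literal string rather than a regex fragment.

```python
import re

# Template copied from the single-line expression above.
TEMPLATE = (
    r"(?:(?:.*?CHARACTER_PATTERN\s?){INDEX})(.+?)"
    r"(?:(?=\s?$)|(?=\s?CHARACTER_PATTERN))"
)


def build_pattern(delimiter: str, index: int) -> str:
    """Fill in both placeholders (hypothetical helper, for illustration only)."""
    escaped = re.escape(delimiter)  # a literal '|' becomes '\|'
    # Substitute INDEX first so the delimiter text is never rewritten.
    return TEMPLATE.replace("INDEX", str(index)).replace("CHARACTER_PATTERN", escaped)


def extract(text: str, delimiter: str, index: int):
    """Return the captured section, or None when there is no match."""
    match = re.search(build_pattern(delimiter, index), text)
    return match.group(1) if match else None


pattern = build_pattern("|", 1)
segment = extract("String1 |String two is here| String 3 |[String-4]", "|", 1)
print(pattern)  # (?:(?:.*?\|\s?){1})(.+?)(?:(?=\s?$)|(?=\s?\|))
print(segment)  # String two is here
```

Returning None for a missing section is a design choice of this sketch; the expression itself simply fails to match when INDEX exceeds the number of delimiters.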

Regular Expression (Multi-line + Comments)

1. (?:(?:.*?CHARACTER_PATTERN\s?){INDEX})      // Skip past the first n occurrences of 'CHARACTER_PATTERN' (n = 'INDEX'), each
                                               // optionally followed by a single space.

2. (.+?)                                       // Match everything after the nth occurrence of 'CHARACTER_PATTERN'
                                               // (Lazy - i.e. match as few tokens as possible) - This is important, as it
                                               // ensures the next part only matches everything up to the NEXT (i.e. n+1) occurrence
                                               // of 'CHARACTER_PATTERN', rather than matching up to the FINAL occurrence, which would
                                               // be incorrect in this case.

3. (?:                                         // Look for either:

4.  (?=\s?$)                                   // An optional space, then the END OF LINE (Reqd. to handle the last delimited section,
                                               // which may not have the 'CHARACTER_PATTERN' at the end - see below for examples.)

5.           |                                 // OR..

6.  (?=\s?CHARACTER_PATTERN)                   // Look for the NEXT ('INDEX'+1) occurrence of 'CHARACTER_PATTERN'. This must be used in
                                               // conjunction with the LAZY search in (2) to ensure matching stops after hitting the
                                               // next occurrence (and doesn't continue to the VERY LAST occurrence).

)
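The commented layout above is annotation rather than valid regex syntax. As a rough sketch, assuming Python's `re` module, the same breakdown can be written as a real pattern with `re.VERBOSE`, which ignores whitespace and treats `#` as a comment marker. The delimiter `\|`, the INDEX value 1, and the greedy variant are illustrative choices; the greedy variant is included only to show why the lazy quantifier in (2) matters.

```python
import re

TEXT = "String1 |String two is here| String 3 |[String-4]"

# Assumed example values: delimiter '|' (escaped as \|) and INDEX = 1.
LAZY = re.compile(
    r"""
    (?:(?:.*?\|\s?){1})   # 1. skip past the first delimiter (INDEX = 1)
    (.+?)                 # 2. lazily capture the section that follows
    (?:                   # 3. stop before either:
        (?=\s?$)          # 4.   an optional space, then end of line
      |                   # 5.   or
        (?=\s?\|)         # 6.   an optional space, then the next delimiter
    )
    """,
    re.VERBOSE,
)

# Same expression with a greedy capture group, for comparison only.
GREEDY = re.compile(r"(?:(?:.*?\|\s?){1})(.+)(?:(?=\s?$)|(?=\s?\|))")

lazy_result = LAZY.search(TEXT).group(1)
greedy_result = GREEDY.search(TEXT).group(1)
print(lazy_result)    # String two is here
print(greedy_result)  # String two is here| String 3 |[String-4]
```

The greedy capture runs to the end of the line, where the end-of-line lookahead in (4) succeeds, so every later section is swallowed.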

Examples

Example 1

Regex Pattern: (?:(?:.*?\|\s?){0})(.+?)(?:(?=\s?$)|(?=\s?\|))

Test String: String1 |String two is here| String 3 |[String-4]

Result : 1 match: 'String1'

Working Example - Regex 101

Example 2

Regex Pattern: (?:(?:.*?\|\s?){1})(.+?)(?:(?=\s?$)|(?=\s?\|))

Test String: String1 |String two is here| String 3 |[String-4]

Result : 1 match: 'String two is here'

Working Example - Regex 101

Example 3

Regex Pattern: (?:(?:.*?\|\s?){3})(.+?)(?:(?=\s?$)|(?=\s?\|))

Test String: String1 |String two is here| String 3 |[String-4]

Result : 1 match: '[String-4]'

Working Example - Regex 101
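Examples 1 to 3 differ only in the INDEX value. The short Python check below is a sketch rather than part of the original gist: it runs the three patterns against the shared test string and compares each capture with a plain `str.split` plus `strip`, a swapped-in technique used here purely as a cross-check.

```python
import re

TEXT = "String1 |String two is here| String 3 |[String-4]"

# Patterns copied from Examples 1 to 3; only the {INDEX} quantifier changes.
PATTERNS = {
    0: r"(?:(?:.*?\|\s?){0})(.+?)(?:(?=\s?$)|(?=\s?\|))",
    1: r"(?:(?:.*?\|\s?){1})(.+?)(?:(?=\s?$)|(?=\s?\|))",
    3: r"(?:(?:.*?\|\s?){3})(.+?)(?:(?=\s?$)|(?=\s?\|))",
}

# re.search returns only the first match, mirroring the "1 match" results.
results = {index: re.search(p, TEXT).group(1) for index, p in PATTERNS.items()}

# Cross-check: split on the delimiter and trim surrounding whitespace.
split_sections = [part.strip() for part in TEXT.split("|")]

for index, value in results.items():
    print(index, repr(value), value == split_sections[index])
```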

Example 4 (Using square-brackets)

Regex Pattern: (?:(?:.*?\|\s?){2})\[(.+?)\](?:(?=\s?$)|(?=\s?\|))

Test String: [String1] |[String two is here]| [String 3] |[String-4]

Result : 1 match: 'String 3'

Working Example - Regex 101

Example 5 (Exclude full stops)

Regex Pattern: (?:(?:.*?\|\s?){4})(.+?)(?:(?=\.?\s?$)|(?=\s?\.?\|))

Test String: [String1] |[String two is here]| [String 3] |[String-4] | String 5.

Result : 1 match: 'String 5'

Working Example - Regex 101
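To see what the added `\.?` in the lookaheads changes, the sketch below (Python, illustrative only) runs the Example 5 test string through both the unmodified lookaheads from Examples 1 to 3 and the full-stop-aware version. The variable names are assumptions made for this comparison.

```python
import re

TEXT = "[String1] |[String two is here]| [String 3] |[String-4] | String 5."

# Lookaheads as used in Examples 1 to 3 (no full-stop handling).
BASE = r"(?:(?:.*?\|\s?){4})(.+?)(?:(?=\s?$)|(?=\s?\|))"
# Lookaheads from Example 5, which also allow an optional full stop.
NO_FULL_STOP = r"(?:(?:.*?\|\s?){4})(.+?)(?:(?=\.?\s?$)|(?=\s?\.?\|))"

with_stop = re.search(BASE, TEXT).group(1)
without_stop = re.search(NO_FULL_STOP, TEXT).group(1)
print(with_stop)     # String 5.
print(without_stop)  # String 5
```

Because the capture group is lazy, it stops at the first position where a lookahead succeeds, which is just before the full stop once `\.?` is permitted.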

Example 6 (Split sub-selected string by second delimiter, e.g. '::')

Regex Pattern: (?:(?:.*?\|\s?){6}.*::)(.+?)(?:(?=\.?\s?$)|(?=\s?\.?\|))

Test String: [String1] | [String two is here] ||| String 3 | String-4 | String::5

Result : 1 match: '5'

Working Example - Regex 101
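The same result can be reached in two steps, which may be easier to read when the second delimiter varies. The Python sketch below is an alternative to the single pattern rather than the gist's own method: it first extracts section 6 with the basic expression, then splits that section on '::'.

```python
import re

TEXT = "[String1] | [String two is here] ||| String 3 | String-4 | String::5"

# Single-pattern approach, copied from Example 6.
ONE_STEP = r"(?:(?:.*?\|\s?){6}.*::)(.+?)(?:(?=\.?\s?$)|(?=\s?\.?\|))"
one_step = re.search(ONE_STEP, TEXT).group(1)

# Two-step alternative: extract section 6, then split on the second delimiter.
SECTION = r"(?:(?:.*?\|\s?){6})(.+?)(?:(?=\s?$)|(?=\s?\|))"
section = re.search(SECTION, TEXT).group(1)
two_step = section.split("::")[-1]

print(one_step)  # 5
print(section)   # String::5
print(two_step)  # 5
```

Note that the three consecutive pipes in the test string each count as a delimiter occurrence, which is why INDEX is 6 here.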
