Many different applications claim to support regular expressions. But what does that even mean?
Well, there are lots of different regular expression engines, and they all have different feature sets and different time and space efficiency.
The information here is just copied from: http://regular-expressions.mobi/refflavors.html
But for some reason, it's not accessible unless you have a mobile phone user agent.
Go to the main site for lots more regular expression information, and for their commercial product, RegexBuddy.
- JGsoft: This flavor is used by the Just Great Software products, including PowerGREP and EditPad Pro.
- .NET: This flavor is used by programming languages based on the Microsoft .NET framework versions 1.x, 2.0 or 3.x. It is generally also the regex flavor used by applications developed in these programming languages.
- Java: The regex flavor of the java.util.regex package, available in Java 4 (JDK 1.4.x) and later. A few features were added in Java 5 (JDK 1.5.x) and Java 6 (JDK 1.6.x). It is generally also the regex flavor used by applications developed in Java.
- Perl: The regex flavor used in the Perl programming language, versions 5.6 and 5.8. Versions prior to 5.6 do not support Unicode.
- PCRE: The open source PCRE library. The feature set described here is available in PCRE 5.x and 6.x. PCRE is the regex engine used by the TPerlRegEx Delphi component and the RegularExpressions and RegularExpressionsCore units in Delphi XE and C++Builder XE.
- ECMA (JavaScript): The regular expression syntax defined in the 3rd edition of the ECMA-262 standard, which defines the scripting language commonly known as JavaScript.
- Python: The regex flavor supported by Python's built-in re module.
- Ruby: The regex flavor built into the Ruby programming language.
- Tcl ARE: The regex flavor developed by Henry Spencer for the regexp command in Tcl 8.2 and 8.4, dubbed Advanced Regular Expressions.
- POSIX BRE: Basic Regular Expressions as defined in the IEEE POSIX standard 1003.2.
- POSIX ERE: Extended Regular Expressions as defined in the IEEE POSIX standard 1003.2.
- GNU BRE: GNU Basic Regular Expressions, which are POSIX BRE with GNU extensions, used in the GNU implementations of classic UNIX tools.
- GNU ERE: GNU Extended Regular Expressions, which are POSIX ERE with GNU extensions, used in the GNU implementations of classic UNIX tools.
- XML: The regular expression flavor defined in the XML Schema standard.
- XPath: The regular expression flavor defined in the XQuery 1.0 and XPath 2.0 Functions and Operators standard.
- AceText: Version 2 and later use the JGsoft engine. Version 1 did not support regular expressions at all.
- awk: The awk UNIX tool and programming language uses POSIX ERE.
- C#: As a .NET programming language, C# can use the System.Text.RegularExpressions classes, listed as ".NET" below.
- Delphi for .NET: As a .NET programming language, the .NET version of Delphi can use the System.Text.RegularExpressions classes, listed as ".NET" below.
- Delphi for Win32: Delphi for Win32 does not have built-in regular expression support. Many free PCRE wrappers are available.
- EditPad Pro: Version 6 and later use the JGsoft engine. Earlier versions used PCRE, without Unicode support.
- egrep: The traditional UNIX egrep command uses the "POSIX ERE" flavor, though not all implementations fully adhere to the standard. Linux usually ships with the GNU implementation, which uses "GNU ERE".
- grep: The traditional UNIX grep command uses the "POSIX BRE" flavor, though not all implementations fully adhere to the standard. Linux usually ships with the GNU implementation, which uses "GNU BRE".
- Emacs: The GNU implementation of this classic UNIX text editor uses the "GNU ERE" flavor, except that POSIX classes, collations and equivalences are not supported.
- Java: The regex flavor of the java.util.regex package is listed as "Java" in the table below.
- JavaScript: JavaScript's regex flavor is listed as "ECMA" in the table below.
- MySQL: MySQL uses POSIX Extended Regular Expressions, listed as "POSIX ERE" in the table below.
- Oracle: Oracle Database 10g implements POSIX Extended Regular Expressions, listed as "POSIX ERE" in the table below. Oracle supports backreferences \1 through \9, though these are not part of the POSIX ERE standard.
- Perl: Perl's regex flavor is listed as "Perl" in the table below.
- PHP: PHP's ereg functions implement the "POSIX ERE" flavor, while the preg functions implement the "PCRE" flavor.
- PostgreSQL: PostgreSQL 7.4 and later uses Henry Spencer's "Advanced Regular Expressions" flavor, listed as "Tcl ARE" in the table below. Earlier versions used POSIX Extended Regular Expressions, listed as "POSIX ERE" in the table below.
- PowerGREP: Version 3 and later use the JGsoft engine. Earlier versions used PCRE, without Unicode support.
- PowerShell: PowerShell's built-in -match and -replace operators use the .NET regex flavor. PowerShell can also use the System.Text.RegularExpressions classes directly.
- Python: Python's regex flavor is listed as "Python" in the table below.
- R: The regular expression functions in the R language for statistical programming use either the POSIX ERE flavor (default), the PCRE flavor (perl = TRUE) or the POSIX BRE flavor (perl = FALSE, extended = FALSE).
- REALbasic: REALbasic's RegEx class is a wrapper around PCRE.
- RegexBuddy: Version 3 and later use a special version of the JGsoft engine that emulates all the regular expression flavors in this comparison. Version 2 supported the JGsoft regex flavor only. Version 1 used PCRE, without Unicode support.
- Ruby: Ruby's regex flavor is listed as "Ruby" in the table below.
- sed: The sed UNIX tool uses POSIX BRE. Linux usually ships with the GNU implementation, which uses "GNU BRE".
- Tcl: Tcl's Advanced Regular Expression flavor, the default flavor in Tcl 8.2 and later, is listed as "Tcl ARE" in the table below. Tcl's Extended Regular Expression and Basic Regular Expression flavors are listed as "POSIX ERE" and "POSIX BRE" in the table below.
- VBScript: VBScript's RegExp object uses the same regex flavor as JavaScript, which is listed as "ECMA" in the table below.
- Visual Basic 6: Visual Basic 6 does not have built-in support for regular expressions, but can easily use the "Microsoft VBScript Regular Expressions 5.5" COM object, which implements the "ECMA" flavor listed below.
- Visual Basic.NET: As a .NET programming language, VB.NET can use the System.Text.RegularExpressions classes, listed as ".NET" below.
- wxWidgets: The wxRegEx class supports 3 flavors. wxRE_ADVANCED is the "Tcl ARE" flavor, wxRE_EXTENDED is "POSIX ERE" and wxRE_BASIC is "POSIX BRE".
- XML Schema: The XML Schema regular expression flavor is listed as "XML" in the table below.
- XPath: The regex flavor used by XPath functions is listed as "XPath" in the table below.
- XQuery: The regex flavor used by XQuery functions is listed as "XPath" in the table below.
Characters | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
Backslash escapes one metacharacter | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
\Q...\E escapes a string of metacharacters | YES | no | Java 6 | YES | YES | no | no | no | no | no | no | no | no | no | no |
\x00 through \xFF (ASCII character) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no |
\n (LF), \r (CR) and \t (tab) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | YES | YES |
\f (form feed) and \v (vtab) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no |
\a (bell) | YES | YES | YES | YES | YES | no | YES | YES | YES | no | no | no | no | no | no |
\e (escape) | YES | YES | YES | YES | YES | no | no | YES | YES | no | no | no | no | no | no |
\b (backspace) and \B (backslash) | no | no | no | no | no | no | no | no | YES | no | no | no | no | no | no |
\cA through \cZ (control character) | YES | YES | YES | YES | YES | YES | no | no | YES | no | no | no | no | no | no |
\ca through \cz (control character) | YES | YES | no | YES | YES | YES | no | no | YES | no | no | no | no | no | no |
Character Classes or Character Sets [abc] | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
[abc] character class | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
[^abc] negated character class | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
[a-z] character class range | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
Hyphen in [\d-z] is a literal | YES | YES | YES | YES | YES | no | no | no | no | no | no | no | no | no | no |
Hyphen in [a-\d] is a literal | YES | no | no | no | YES | no | no | no | no | no | no | no | no | no | no |
Backslash escapes one character class metacharacter | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | YES | YES |
\Q...\E escapes a string of character class metacharacters | YES | no | Java 6 | YES | YES | no | no | no | no | no | no | no | no | no | no |
\d shorthand for digits | YES | YES | ascii | YES | ascii | ascii | option | ascii | YES | no | no | no | no | YES | YES |
\w shorthand for word characters | YES | YES | ascii | YES | ascii | ascii | option | ascii | YES | no | no | YES | YES | YES | YES |
\s shorthand for whitespace | YES | YES | ascii | YES | ascii | YES | option | ascii | YES | no | no | YES | YES | ascii | ascii |
\D, \W and \S shorthand negated character classes | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | YES | YES | YES | YES |
[\b] backspace | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no |
Dot | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
. (dot; any character except line break) | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
Anchors | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
^ (start of string/line) | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES |
$ (end of string/line) | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES |
\A (start of string) | YES | YES | YES | YES | YES | no | YES | YES | YES | no | no | no | no | no | no |
\Z (end of string, before final line break) | YES | YES | YES | YES | YES | no | no | YES | YES | no | no | no | no | no | no |
\z (end of string) | YES | YES | YES | YES | YES | no | \Z | YES | no | no | no | no | no | no | no |
\` (start of string) | no | no | no | no | no | no | no | no | no | no | no | YES | YES | no | no |
\' (end of string) | no | no | no | no | no | no | no | no | no | no | no | YES | YES | no | no |
Word Boundaries | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
\b (at the beginning or end of a word) | YES | YES | YES | YES | ascii | ascii | option | ascii | no | no | no | YES | YES | no | no |
\B (NOT at the beginning or end of a word) | YES | YES | YES | YES | ascii | ascii | option | ascii | no | no | no | YES | YES | no | no |
\y (at the beginning or end of a word) | YES | no | no | no | no | no | no | no | YES | no | no | no | no | no | no |
\Y (NOT at the beginning or end of a word) | YES | no | no | no | no | no | no | no | YES | no | no | no | no | no | no |
\m (at the beginning of a word) | YES | no | no | no | no | no | no | no | YES | no | no | no | no | no | no |
\M (at the end of a word) | YES | no | no | no | no | no | no | no | YES | no | no | no | no | no | no |
\< (at the beginning of a word) | no | no | no | no | no | no | no | no | no | no | no | YES | YES | no | no |
\> (at the end of a word) | no | no | no | no | no | no | no | no | no | no | no | YES | YES | no | no |
Alternation | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
| (alternation) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | \| | YES | YES | YES |
Quantifiers | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
? (0 or 1) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | \? | YES | YES | YES |
* (0 or more) | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES |
+ (1 or more) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | \+ | YES | YES | YES |
{n} (exactly n) | YES | YES | YES | YES | YES | YES | YES | YES | YES | \{n\} | YES | \{n\} | YES | YES | YES |
{n,m} (between n and m) | YES | YES | YES | YES | YES | YES | YES | YES | YES | \{n,m\} | YES | \{n,m\} | YES | YES | YES |
{n,} (n or more) | YES | YES | YES | YES | YES | YES | YES | YES | YES | \{n,\} | YES | \{n,\} | YES | YES | YES |
? after any of the above quantifiers to make it "lazy" | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | YES |
Grouping and Backreferences | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(regex) (numbered capturing group) | YES | YES | YES | YES | YES | YES | YES | YES | YES | \( \) | YES | \( \) | YES | YES | YES |
(?:regex) (non-capturing group) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no |
\1 through \9 (backreferences) | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | YES | YES | no | YES |
\10 through \99 (backreferences) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | n/a | no | no | n/a | YES |
Forward references \1 through \9 | YES | YES | YES | YES | YES | no | no | YES | no | no | n/a | no | no | n/a | no |
Nested references \1 through \9 | YES | YES | YES | YES | YES | YES | no | YES | no | no | n/a | no | no | n/a | no |
Backreferences to non-existent groups are an error | YES | YES | YES | YES | YES | no | YES | no | YES | YES | n/a | YES | YES | n/a | YES |
Backreferences to failed groups also fail | YES | YES | YES | YES | YES | no | YES | YES | YES | YES | n/a | YES | YES | n/a | YES |
Modifiers | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(?i) (case insensitive) | YES | YES | YES | YES | YES | /i only | YES | YES | YES | no | no | no | no | no | flag |
(?s) (dot matches newlines) | YES | YES | YES | YES | YES | no | YES | (?m) | no | no | no | no | no | no | flag |
(?m) (^ and $ match at line breaks) | YES | YES | YES | YES | YES | /m only | YES | always on | no | no | no | no | no | no | flag |
(?x) (free-spacing mode) | YES | YES | YES | YES | YES | no | YES | YES | YES | no | no | no | no | no | flag |
(?n) (explicit capture) | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no |
(?-ismxn) (turn off mode modifiers) | YES | YES | YES | YES | YES | no | no | YES | no | no | no | no | no | no | no |
(?ismxn:group) (mode modifiers local to group) | YES | YES | YES | YES | YES | no | no | YES | no | no | no | no | no | no | no |
Atomic Grouping and Possessive Quantifiers | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(?>regex) (atomic group) | YES | YES | YES | YES | YES | no | no | YES | no | no | no | no | no | no | no |
?+, *+, ++ and {m,n}+ (possessive quantifiers) | YES | no | YES | no | YES | no | no | no | no | no | no | no | no | no | no |
Lookaround | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(?=regex) (positive lookahead) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no |
(?!regex) (negative lookahead) | YES | YES | YES | YES | YES | YES | YES | YES | YES | no | no | no | no | no | no |
(?<=text) (positive lookbehind) | full regex | full regex | finite length | fixed length | fixed + alternation | no | fixed length | no | no | no | no | no | no | no | no |
(?<!text) (negative lookbehind) | full regex | full regex | finite length | fixed length | fixed + alternation | no | fixed length | no | no | no | no | no | no | no | no |
Continuing from The Previous Match | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
\G (start of match attempt) | YES | YES | YES | YES | YES | no | no | YES | no | no | no | no | no | no | no |
Conditionals | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(?(?=regex)then|else) (using any lookaround) | YES | YES | no | YES | YES | no | no | no | no | no | no | no | no | no | no |
(?(regex)then|else) | no | YES | no | no | no | no | no | no | no | no | no | no | no | no | no |
(?(1)then|else) | YES | YES | no | YES | YES | no | YES | no | no | no | no | no | no | no | no |
(?(group)then|else) | YES | YES | no | no | YES | no | YES | no | no | no | no | no | no | no | no |
Comments | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(?#comment) | YES | YES | no | YES | YES | no | YES | YES | YES | no | no | no | no | no | no |
Free-Spacing Syntax | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
Free-spacing syntax supported | YES | YES | YES | YES | YES | no | YES | YES | YES | no | no | no | no | no | YES |
Character class is a single token | YES | YES | no | YES | YES | n/a | YES | YES | YES | n/a | n/a | n/a | n/a | n/a | YES |
# starts a comment | YES | YES | YES | YES | YES | n/a | YES | YES | YES | n/a | n/a | n/a | n/a | n/a | no |
Unicode Characters | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
\X (Unicode grapheme) | YES | no | no | YES | option | no | no | no | no | no | no | no | no | no | no |
\u0000 through \uFFFF (Unicode character) | YES | YES | YES | no | no | YES | u"string" | no | YES | no | no | no | no | no | no |
\x{0} through \x{FFFF} (Unicode character) | YES | no | no | YES | option | no | no | no | no | no | no | no | no | no | no |
Unicode Properties, Scripts and Blocks | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
\pL through \pC (Unicode properties) | YES | no | YES | YES | option | no | no | no | no | no | no | no | no | no | no |
\p{L} through \p{C} (Unicode properties) | YES | YES | YES | YES | option | no | no | no | no | no | no | no | no | YES | YES |
\p{Lu} through \p{Cn} (Unicode property) | YES | YES | YES | YES | option | no | no | no | no | no | no | no | no | YES | YES |
\p{L&} and \p{Letter&} (equivalent of [\p{Lu}\p{Ll}\p{Lt}] Unicode properties) | YES | no | no | YES | option | no | no | no | no | no | no | no | no | no | no |
\p{IsL} through \p{IsC} (Unicode properties) | YES | no | YES | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{IsLu} through \p{IsCn} (Unicode property) | YES | no | YES | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{Letter} through \p{Other} (Unicode properties) | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{Lowercase_Letter} through \p{Not_Assigned} (Unicode property) | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{IsLetter} through \p{IsOther} (Unicode properties) | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{IsLowercase_Letter} through \p{IsNot_Assigned} (Unicode property) | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{Arabic} through \p{Yi} (Unicode script) | YES | no | no | YES | option | no | no | no | no | no | no | no | no | no | no |
\p{IsArabic} through \p{IsYi} (Unicode script) | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{BasicLatin} through \p{Specials} (Unicode block) | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{InBasicLatin} through \p{InSpecials} (Unicode block) | YES | no | YES | YES | no | no | no | no | no | no | no | no | no | no | no |
\p{IsBasicLatin} through \p{IsSpecials} (Unicode block) | YES | YES | no | YES | no | no | no | no | no | no | no | no | no | YES | YES |
Part between {} in all of the above is case insensitive | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
Spaces, hyphens and underscores allowed in all long names listed above (e.g. BasicLatin can be written as Basic-Latin or Basic_Latin or Basic Latin) | YES | no | Java 5 | YES | no | no | no | no | no | no | no | no | no | no | no |
\P (negated variants of all \p as listed above) | YES | YES | YES | YES | option | no | no | no | no | no | no | no | no | YES | YES |
\p{^...} (negated variants of all \p{...} as listed above) | YES | no | no | YES | option | no | no | no | no | no | no | no | no | no | no |
Named Capture and Backreferences | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
(?<name>regex) (.NET-style named capturing group) | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no |
(?'name'regex) (.NET-style named capturing group) | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no |
\k<name> (.NET-style named backreference) | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no |
\k'name' (.NET-style named backreference) | YES | YES | no | no | no | no | no | no | no | no | no | no | no | no | no |
(?P<name>regex) (Python-style named capturing group) | YES | no | no | no | YES | no | YES | no | no | no | no | no | no | no | no |
(?P=name) (Python-style named backreference) | YES | no | no | no | YES | no | YES | no | no | no | no | no | no | no | no |
multiple capturing groups can have the same name | YES | YES | n/a | n/a | no | n/a | no | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a |
XML Character Classes | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
\i, \I, \c and \C shorthand XML name character classes | no | no | no | no | no | no | no | no | no | no | no | no | no | YES | YES |
[abc-[abc]] character class subtraction | YES | 2.0 | no | no | no | no | no | no | no | no | no | no | no | YES | YES |
POSIX Bracket Expressions | |||||||||||||||
Feature | JGsoft | .NET | Java | Perl | PCRE | ECMA | Python | Ruby | Tcl ARE | POSIX BRE | POSIX ERE | GNU BRE | GNU ERE | XML | XPath |
[:alpha:] POSIX character class | YES | no | no | YES | ascii | no | no | YES | YES | YES | YES | YES | YES | no | no |
\p{Alpha} POSIX character class | YES | no | ascii | no | no | no | no | no | no | no | no | no | no | no | no |
\p{IsAlpha} POSIX character class | YES | no | no | YES | no | no | no | no | no | no | no | no | no | no | no |
[.span-ll.] POSIX collation sequence | no | no | no | no | no | no | no | no | YES | YES | YES | YES | YES | no | no |
[=x=] POSIX character equivalence | no | no | no | no | no | no | no | no | YES | YES | YES | YES | YES | no | no |
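
A few entries in the "Python" column of the table can be spot-checked from Python itself. What follows is only a rough sketch, assuming Python 3 and its built-in re module. The helper name `compiles` is made up for illustration, and it only tests whether the engine accepts a pattern, not how the pattern behaves. The table describes older releases, so newer Pythons may accept syntax the table marks as "no" (possessive quantifiers and atomic groups arrived in Python 3.11, for example). Only checks that should be stable across Python 3 versions are used here.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern, else False."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Python-style named group and named backreference (table: YES).
print(compiles(r"(?P<word>\w+) (?P=word)"))   # True

# .NET-style named group (table: no).
print(compiles(r"(?<word>\w+)"))              # False

# Lookbehind (table: "fixed length").
print(compiles(r"(?<=ab)c"))                  # True, fixed width
print(compiles(r"(?<=a+)c"))                  # False, variable width

# Conditional on a numbered group (table: YES).
print(compiles(r"(a)?(?(1)b|c)"))             # True

# Python's \Z means "very end of string", while $ also matches
# just before a final line break.
print(re.search(r"a\Z", "a\n") is None)       # True
print(re.search(r"a$", "a\n") is not None)    # True
```

The same try-to-compile trick could be pointed at any other row of the table, but a pattern that compiles is not proof that it means the same thing as in another flavor.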
Perhaps add "what characters does \w match for each language."
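
For Python at least, the \w question can be answered with a quick experiment. This is a minimal sketch, assuming Python 3, where str patterns are Unicode-aware by default and the re.ASCII flag restricts \w to [a-zA-Z0-9_]. That would fit the "option" entry in the table. The sample string is made up for illustration.

```python
import re

# a, e-acute, underscore, digit 1, hyphen, Arabic-Indic digit three, space
sample = "a\u00e9_1-\u0663 "

# Default for str patterns in Python 3: Unicode letters and digits plus "_".
unicode_words = re.findall(r"\w", sample)

# With re.ASCII, only [a-zA-Z0-9_] counts as a word character.
ascii_words = re.findall(r"\w", sample, flags=re.ASCII)

print(unicode_words)  # ['a', 'é', '_', '1', '٣']
print(ascii_words)    # ['a', '_', '1']
```

Other flavors would need their own experiments. The table only says whether \w is ASCII-only, Unicode-aware, or switchable.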