glv/regexp_syntax.md

## regexp_syntax.md

      
    Raw
  

              regexp_syntax.md
            
          
    Ruby and Postgres Regular Expression Syntaxes

Ruby's regular expressions are unusually powerful.
Postgres' regular expressions are not as powerful, but they come close;
close enough that it's possible to do many pattern-based queries and string transformations entirely in a query.
And sometimes, it's very useful to have a single regular expression that works
in both environments, so that one can reliably query for data in a particular
format and then transform the results in Ruby using the same pattern.
In any case, if you're working in a Postgres-based Rails app, you occasionally need
to work with both kinds of regular expression. Unfortunately, there are subtle differences between
the syntaxes that are hard to remember. Furthermore, the two systems organize their regexp
documentation is different ways, so it's hard to even map between them easily. This
table is aimed at making it easier.
(The Regular-Expressions.info site does support both of these flavors of regular
expressions, and lets you do side-by-side comparisons, although finding the Postgres flavor
isn't obvious. But I find it helpful to have all of the information on one page; and
the process of building this file taught me a lot about both flavors of regular expression.)
Operators and Functions


Ruby
Postgres
Explanation


=~
~
matches


~*
matches (case-independent)


!~
!~
does not match


!~*
does not match (case-independent)


string.match(re)

returns a MatchData instance or nil if re does not match string


re.match(string)

returns a MatchData instance or nil if re does not match string


substring(string from pattern)
returns NULL if no match, the match text for the first capture group (if any), or the entire matched substring


regexp_replace(source, pattern, replacement [, flags ])
returns source, transformed per pattern and replacement (and flags, which can include g)


regexp_matches(string, pattern [, flags ])
returns no rows (if no match); a row containing an array of capture matches (if the pattern uses capture groups), or a row containing the entire matched substring. (The g flag, if supplied, results in one returned row for each match in string.)


regexp_split_to_table(string, pattern [, flags ])
splits string using pattern as a delimiter, returning a row for each fragment (ignoring zero-length fragments)


regexp_split_to_array(string, pattern [, flags ])
same as regexp_split_to_table except it returns an array of strings rather than rows


Alternation


      Ruby
      Postgres
      Explanation
    
  
      |
      |
      combines two expressions into a single one that matches either of the expressions; each expression is an alternative
    
  
Atoms

Atoms match a sequence of one or more characters (or zero or more characters, in the case of subpatterns).


Ruby
Postgres
Explanation


(re)
(re)
a sub-pattern (capturing)


(?<name>re)

a sub-pattern (named capture)


(?"name"re)

a sub-pattern (named capture)


(?:re)
(?:re)
a sub-pattern (non-capturing)


(?>re)

a sub-pattern (atomic, non-capturing)


.
.
any single character


[chars]
[chars]
a character class


[^*chars*]
[^*chars*]
a negated character class


\k
\k
where k is non-alphanumeric: matches k


\c
\c
where c is alphanumeric: an escape


{
{
if followed by a digit, introduces a bound quantifier; otherwise matches {


x
x
other characters match themselves


(Additionally, backreferences and escapes function as atoms.)
Quantifiers

Quantifiers can follow atoms, and they change the number of occurrences of the atom that can be matched. (Without a subsequent quantifier, an atom will always match exactly one consecutive occurrence.)
Greedy quantifiers match as many occurrences as possible while still allowing the overall match to succeed. By contrast, lazy quantifiers match the fewest number of occurrences as possible for overall success.
A possessive quantifier does not backtrack once it has matched. It behaves like a greedy quantifier, but having matched it refuses to “give up” its match even if this jeopardises the overall match. This is sometimes helpful for optimizing the performance of a regular expression.


Ruby
Postgres
Explanation


*
*
zero or more (greedy)


+
+
one or more (greedy)


?
?
zero or one (greedy)


{m}
{m}
exactly m


{m,}
{m,}
at least m (greedy)


{,m}

at most m (greedy)


{m,n}
{m,n}
at least m but no more than n (greedy)


*?
*?
zero or more (lazy)


+?
+?
one or more (lazy)


??
??
zero or one (lazy)


{m}?
{m}?
exactly m


{m,}?
{m,}?
at least m (lazy)


{,m}?

at most m (lazy)


{m,n}?
{m,n}?
at least m but no more than n (lazy)


*+

zero or more (possessive)


++

one or more (possessive)


?+

zero or one (possessive)


{m}+

exactly m


{m,}+

at least m (possessive)


{,m}+

at most m (possessive)


{m,n}+

at least m but no more than n (possessive)


Constraints / Anchors


Ruby
Postgres
Explanation


^
^
beginning of line


\A
\A
beginning of string


$
$
end of line


\Z
\Z
end of string (just before a terminating newline, if any)


\z

end of string


(?=re)
(?=re)
empty string when following characters match re


(?!re)
(?!re)
empty string when following characters do not match re


(?<=re)

empty string when preceding characters match re


(?<!re)

empty string when preceding characters do not match re


\b
\y
word boundary


\B
\Y
non-word-boundary


\m
beginning of word


\M
end of word


Character Classes

The following syntaxes are available within character class (bracket) expressions to enumerate the characters included (or excluded, in the case of negated classes).


Ruby
Postgres
Explanation


c
c
a single character


a-z
a-z
(where a and z are any characters) the entire sequence of characters from a to z inclusive


[class]

embedded class (OR'd with neighboring characters)


&&

intersection operator on embedded classes : [a-w&&[^c-g]z] means "([a-w] AND ([^c-g] OR z))" (in other words, [abh-w])


[:alnum:]
[:alnum:]
all alphanumeric characters


[:alpha:]
[:alpha:]
all alphabetic characters


[:blank:]
[:blank:]
all space and tab characters


[:cntrl:]
[:cntrl:]
all control characters


[:digit:]
[:digit:]
all decimal digits


[:graph:]
[:graph:]
all non-blank characters


[:lower:]
[:lower:]
all lowercase characters


[:print:]
[:print:]
all printable characters (like [:graph:], but includes space)


[:punct:]
[:punct:]
all punctuation characters


[:space:]
[:space:]
all whitespace characters (except tab)


[:upper:]
[:upper:]
all uppercase characters


[:xdigit:]
[:xdigit:]
all hexadecimal digits (same as [:digit:]A-Fa-f)


[:word:]

all characters in the following Unicode general categories: Letter, Mark, Number, Connector_Punctuation


[:ascii:]

all ASCII characters


Backreferences


Ruby
Postgres
Explanation


\nnn
\nnn
(up to three digits, with no leading zeroes) a backreference if not greater than the number of capturing groups; octal escape otherwise


\k<name>

reference to a named capture


\g<name>

reference to a named subpattern (re-evaluates the subpattern, rather than matching the same text)


Substitution Backreferences

In Ruby, information from the match may be used in any of three ways: as global variables ($something), as substitution references in the second argument of sub or gsub (\\something) or by calling into the MatchData object returned from match (md.something or md[:something]).
In Postgres, these can be used in the second argument to regexp_replace.


Ruby
Postgres
Explanation


$-, Regexp.last_match

the MatchData object from the most recent match


$&, \&, md[0]
\&
the complete matched text


$`, \`, md.pre_match

the text of the string preceding the match


$', \', md.post_match

the text of the string after the match


$1, \1, md[1]
\1
the first capture group (and so on for other numbered capture groups)


$+, \+

the last capture group


\k<name>, md[:name]

the named capture group with name name


Escapes

Pattern Escapes


Ruby
Postgres
Explanation


\d
\d
[[:digit:]]


\D
\D
[^[:digit:]]


\h

[[:xdigit:]]


\H

[^[:xdigit:]]


\s
\s
[[:space:]]


\S
\S
[^[:space:]]


\w
\w
[[:alnum:]_] (note underscore is included)


\W
\W
[^[:alnum:]_] (note underscore is included)


\p{Alnum}

Alphabetic and numeric character


\p{Alpha}

Alphabetic character


\p{Blank}

Space or tab


\p{Cntrl}

Control character


\p{Digit}

Digit


\p{Graph}

Non-blank character (excludes spaces, control characters, and similar)


\p{Lower}

Lowercase alphabetical character


\p{Print}

Like \p{Graph}, but includes the space character


\p{Punct}

Punctuation character


\p{Space}

Whitespace character ([:blank:], newline, carriage return, etc.)


\p{Upper}

Uppercase alphabetical


\p{XDigit}

Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F)


\p{Word}

A member of one of the following Unicode general category Letter, Mark, Number, Connector_Punctuation


\p{ASCII}

A character in the ASCII character set


\p{Any}

Any Unicode character (including unassigned characters)


\p{Assigned}

An assigned character


\p{L}

Any character with Unicode General Category 'Letter'


\p{Ll}

Any character with Unicode General Category 'Letter: Lowercase'


\p{Lm}

Any character with Unicode General Category 'Letter: Mark'


\p{Lo}

Any character with Unicode General Category 'Letter: Other'


\p{Lt}

Any character with Unicode General Category 'Letter: Titlecase'


\p{Lu}

Any character with Unicode General Category 'Letter: Uppercase


\p{Lo}

Any character with Unicode General Category 'Letter: Other'


\p{M}

Any character with Unicode General Category 'Mark'


\p{Mn}

Any character with Unicode General Category 'Mark: Nonspacing'


\p{Mc}

Any character with Unicode General Category 'Mark: Spacing Combining'


\p{Me}

Any character with Unicode General Category 'Mark: Enclosing'


\p{N}

Any character with Unicode General Category 'Number'


\p{Nd}

Any character with Unicode General Category 'Number: Decimal Digit'


\p{Nl}

Any character with Unicode General Category 'Number: Letter'


\p{No}

Any character with Unicode General Category 'Number: Other'


\p{P}

Any character with Unicode General Category 'Punctuation'


\p{Pc}

Any character with Unicode General Category 'Punctuation: Connector'


\p{Pd}

Any character with Unicode General Category 'Punctuation: Dash'


\p{Ps}

Any character with Unicode General Category 'Punctuation: Open'


\p{Pe}

Any character with Unicode General Category 'Punctuation: Close'


\p{Pi}

Any character with Unicode General Category 'Punctuation: Initial Quote'


\p{Pf}

Any character with Unicode General Category 'Punctuation: Final Quote'


\p{Po}

Any character with Unicode General Category 'Punctuation: Other'


\p{S}

Any character with Unicode General Category 'Symbol'


\p{Sm}

Any character with Unicode General Category 'Symbol: Math'


\p{Sc}

Any character with Unicode General Category 'Symbol: Currency'


\p{Sc}

Any character with Unicode General Category 'Symbol: Currency'


\p{Sk}

Any character with Unicode General Category 'Symbol: Modifier'


\p{So}

Any character with Unicode General Category 'Symbol: Other'


\p{Z}

Any character with Unicode General Category 'Separator'


\p{Zs}

Any character with Unicode General Category 'Separator: Space'


\p{Zl}

Any character with Unicode General Category 'Separator: Line'


\p{Zp}

Any character with Unicode General Category 'Separator: Paragraph'


\p{C}

Any character with Unicode General Category 'Other'


\p{Cc}

Any character with Unicode General Category 'Other: Control'


\p{Cf}

Any character with Unicode General Category 'Other: Format'


\p{Cn}

Any character with Unicode General Category 'Other: Not Assigned'


\p{Co}

Any character with Unicode General Category 'Other: Private Use'


\p{Cs}

Any character with Unicode General Category 'Other: Surrogate'


\p{script}

Any character from the Unicode script, where script is one of Arabic, Armenian, Balinese, Bengali, Bopomofo, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Inherited, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lycian, Lydian, Malayalam, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Ol_Chiki, Old_Italic, Old_Persian, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Saurashtra, Shavian, Sinhala, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, and Yi


(Any of the above escapes of the form \p{something} can be negated by using the ^ character, as \p{^something}.)
Literal Character Escapes


Ruby
Postgres
Explanation


\\
\\
backslash


\a
\a
alert (bell) character, as in C


\b

backspace (only in character class)


\b
backspace, as in C


\B
synonym for backslash (\) to help reduce the need for backslash doubling


\cX
(where X is any character) the character whose low-order 5 bits are the same as those of X, and whose other bits are all zero


\e

the escape character


\e
the character whose collating-sequence name is ESC, or failing that, the character with octal value 033


\f
\f
form feed, as in C


\n
\n
newline, as in C


\r
\r
carriage return, as in C


\t
\t
horizontal tab, as in C


\uhhhh
(where hhhh is exactly four hexadecimal digits) the character whose hexadecimal value is 0xhhhh


\Uhhhhhhhh
(where hhhhhhhh is exactly eight hexadecimal digits) the character whose hexadecimal value is 0xhhhhhhhh


\v
\v
vertical tab, as in C


\xhhh
(where hhh is any sequence of hexadecimal digits) the character whose hexadecimal value is 0xhhh (a single character no matter how many hexadecimal digits are used)


\0
the character whose value is 0 (the null byte)


\oo
(where oo is exactly two octal digits, and is not a back reference) the character whose octal value is 0oo


\ooo
(where ooo is exactly three octal digits, and is not a back reference) the character whose octal value is 0ooo


Options

In Ruby, the single-letter version of an option can be specified after the closing delimiter of the regexp.  Options i, m, and x can also be embedded within the expression using the (?on-off:re) syntax, which turns on options on and turns off options off while interpreting subpattern re. The Regexp::CONSTANT version can be passed as the second parameter to Regexp.new (optionally combined with other constants with |).
In Postgres, the options can be included at the very start of an expression (possibly after an initial ***=) using the syntax (?opts); the options included in the string opts are in effect for the entire regular expression. The options can also be included in a string passed as a parameter to various pattern-related functions.
Do not assume that an option means the same thing in Ruby and Postgres just because it has the same letter! Consult the details of Ruby and Postgres options.


Ruby
Postgres
Explanation


i or Regexp::IGNORECASE

case-insensitive


m or Regexp::MULTILINE

multiline (treat newline as a character matched by .)


x or Regexp::EXTENDED

ignore whitespace and comments in pattern


o

perform #{} interpolation only once


u

regexp is encoded as UTF-8


e

regexp is encoded as EUC-JP


s

regexp is encoded as Windows-31J


n

regexp is encoded as ASCII-8BIT


***:
(at beginning of pattern) the rest of the pattern is an ARE (Postgres advanced regular expression)


***=
(at beginning of pattern) the rest of the pattern is a literal string


b
rest of RE is a BRE (POSIX basic regular expression)


c
case-sensitive matching (overrides operator type)


e
rest of RE is an ERE (POSIX extended regular experession)


i
case-insensitive matching (overrides operator type)


m
historical synonym for n


n
newline-sensitive matching


p
partial newline-sensitive matching


q
rest of RE is a literal ("quoted") string, all ordinary characters


s
non-newline-sensitive matching (default)


t
tight syntax (default)


w
inverse partial newline-sensitive ("weird") matching


x
expanded syntax


g
(with regexp_replace and regexp_matches only) operate on all matches, not just the first
Ruby	Postgres	Explanation
=~	~	matches
	~*	matches (case-independent)
!~	!~	does not match
	!~*	does not match (case-independent)
string.match(re)		returns a MatchData instance or nil if re does not match string
re.match(string)		returns a MatchData instance or nil if re does not match string
	substring(string from pattern)	returns NULL if no match, the match text for the first capture group (if any), or the entire matched substring
	regexp_replace(source, pattern, replacement [, flags ])	returns source, transformed per pattern and replacement (and flags, which can include `g`)
	regexp_matches(string, pattern [, flags ])	returns no rows (if no match); a row containing an array of capture matches (if the pattern uses capture groups), or a row containing the entire matched substring. (The `g` flag, if supplied, results in one returned row for each match in string.)
	regexp_split_to_table(string, pattern [, flags ])	splits string using pattern as a delimiter, returning a row for each fragment (ignoring zero-length fragments)
	regexp_split_to_array(string, pattern [, flags ])	same as `regexp_split_to_table` except it returns an array of strings rather than rows
Ruby	Postgres	Explanation
(re)	(re)	a sub-pattern (capturing)
(?<name>re)		a sub-pattern (named capture)
(?"name"re)		a sub-pattern (named capture)
(?:re)	(?:re)	a sub-pattern (non-capturing)
(?>re)		a sub-pattern (atomic, non-capturing)
.	.	any single character
[chars]	[chars]	a character class
[^chars]	[^chars]	a negated character class
\k	\k	where k is non-alphanumeric: matches k
\c	\c	where c is alphanumeric: an escape
{	{	if followed by a digit, introduces a bound quantifier; otherwise matches {
x	x	other characters match themselves
Ruby	Postgres	Explanation
*	*	zero or more (greedy)
+	+	one or more (greedy)
?	?	zero or one (greedy)
{m}	{m}	exactly m
{m,}	{m,}	at least m (greedy)
{,m}		at most m (greedy)
{m,n}	{m,n}	at least m but no more than n (greedy)
*?	*?	zero or more (lazy)
+?	+?	one or more (lazy)
??	??	zero or one (lazy)
{m}?	{m}?	exactly m
{m,}?	{m,}?	at least m (lazy)
{,m}?		at most m (lazy)
{m,n}?	{m,n}?	at least m but no more than n (lazy)
*+		zero or more (possessive)
++		one or more (possessive)
?+		zero or one (possessive)
{m}+		exactly m
{m,}+		at least m (possessive)
{,m}+		at most m (possessive)
{m,n}+		at least m but no more than n (possessive)
Ruby	Postgres	Explanation
^	^	beginning of line
\A	\A	beginning of string
$	$	end of line
\Z	\Z	end of string (just before a terminating newline, if any)
\z		end of string
(?=re)	(?=re)	empty string when following characters match re
(?!re)	(?!re)	empty string when following characters do not match re
(?<=re)		empty string when preceding characters match re
(?<!re)		empty string when preceding characters do not match re
\b	\y	word boundary
\B	\Y	non-word-boundary
	\m	beginning of word
	\M	end of word
Ruby	Postgres	Explanation
c	c	a single character
a-z	a-z	(where a and z are any characters) the entire sequence of characters from a to z inclusive
[class]		embedded class (OR'd with neighboring characters)
&&		intersection operator on embedded classes : `[a-w&&[^c-g]z]` means "(`[a-w]` AND (`[^c-g]` OR `z`))" (in other words, `[abh-w]`)
[:alnum:]	[:alnum:]	all alphanumeric characters
[:alpha:]	[:alpha:]	all alphabetic characters
[:blank:]	[:blank:]	all space and tab characters
[:cntrl:]	[:cntrl:]	all control characters
[:digit:]	[:digit:]	all decimal digits
[:graph:]	[:graph:]	all non-blank characters
[:lower:]	[:lower:]	all lowercase characters
[:print:]	[:print:]	all printable characters (like [:graph:], but includes space)
[:punct:]	[:punct:]	all punctuation characters
[:space:]	[:space:]	all whitespace characters (except tab)
[:upper:]	[:upper:]	all uppercase characters
[:xdigit:]	[:xdigit:]	all hexadecimal digits (same as `[:digit:]A-Fa-f`)
[:word:]		all characters in the following Unicode general categories: Letter, Mark, Number, Connector_Punctuation
[:ascii:]		all ASCII characters
Ruby	Postgres	Explanation
\nnn	\nnn	(up to three digits, with no leading zeroes) a backreference if not greater than the number of capturing groups; octal escape otherwise
\k<name>		reference to a named capture
\g<name>		reference to a named subpattern (re-evaluates the subpattern, rather than matching the same text)
Ruby	Postgres	Explanation
`$-`, `Regexp.last_match`		the MatchData object from the most recent match
`$&`, `\&`, `md[0]`	`\&`	the complete matched text
$`, \`, `md.pre_match`		the text of the string preceding the match
`$'`, `\'`, `md.post_match`		the text of the string after the match
`$1`, `\1`, `md[1]`	`\1`	the first capture group (and so on for other numbered capture groups)
`$+`, `\+`		the last capture group
`\k<name>`, `md[:name]`		the named capture group with name name
Ruby	Postgres	Explanation
\d	\d	[[:digit:]]
\D	\D	[^[:digit:]]
\h		[[:xdigit:]]
\H		[^[:xdigit:]]
\s	\s	[[:space:]]
\S	\S	[^[:space:]]
\w	\w	[[:alnum:]_] (note underscore is included)
\W	\W	[^[:alnum:]_] (note underscore is included)
\p{Alnum}		Alphabetic and numeric character
\p{Alpha}		Alphabetic character
\p{Blank}		Space or tab
\p{Cntrl}		Control character
\p{Digit}		Digit
\p{Graph}		Non-blank character (excludes spaces, control characters, and similar)
\p{Lower}		Lowercase alphabetical character
\p{Print}		Like \p{Graph}, but includes the space character
\p{Punct}		Punctuation character
\p{Space}		Whitespace character ([:blank:], newline, carriage return, etc.)
\p{Upper}		Uppercase alphabetical
\p{XDigit}		Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F)
\p{Word}		A member of one of the following Unicode general category Letter, Mark, Number, Connector_Punctuation
\p{ASCII}		A character in the ASCII character set
\p{Any}		Any Unicode character (including unassigned characters)
\p{Assigned}		An assigned character
\p{L}		Any character with Unicode General Category 'Letter'
\p{Ll}		Any character with Unicode General Category 'Letter: Lowercase'
\p{Lm}		Any character with Unicode General Category 'Letter: Mark'
\p{Lo}		Any character with Unicode General Category 'Letter: Other'
\p{Lt}		Any character with Unicode General Category 'Letter: Titlecase'
\p{Lu}		Any character with Unicode General Category 'Letter: Uppercase
\p{Lo}		Any character with Unicode General Category 'Letter: Other'
\p{M}		Any character with Unicode General Category 'Mark'
\p{Mn}		Any character with Unicode General Category 'Mark: Nonspacing'
\p{Mc}		Any character with Unicode General Category 'Mark: Spacing Combining'
\p{Me}		Any character with Unicode General Category 'Mark: Enclosing'
\p{N}		Any character with Unicode General Category 'Number'
\p{Nd}		Any character with Unicode General Category 'Number: Decimal Digit'
\p{Nl}		Any character with Unicode General Category 'Number: Letter'
\p{No}		Any character with Unicode General Category 'Number: Other'
\p{P}		Any character with Unicode General Category 'Punctuation'
\p{Pc}		Any character with Unicode General Category 'Punctuation: Connector'
\p{Pd}		Any character with Unicode General Category 'Punctuation: Dash'
\p{Ps}		Any character with Unicode General Category 'Punctuation: Open'
\p{Pe}		Any character with Unicode General Category 'Punctuation: Close'
\p{Pi}		Any character with Unicode General Category 'Punctuation: Initial Quote'
\p{Pf}		Any character with Unicode General Category 'Punctuation: Final Quote'
\p{Po}		Any character with Unicode General Category 'Punctuation: Other'
\p{S}		Any character with Unicode General Category 'Symbol'
\p{Sm}		Any character with Unicode General Category 'Symbol: Math'
\p{Sc}		Any character with Unicode General Category 'Symbol: Currency'
\p{Sc}		Any character with Unicode General Category 'Symbol: Currency'
\p{Sk}		Any character with Unicode General Category 'Symbol: Modifier'
\p{So}		Any character with Unicode General Category 'Symbol: Other'
\p{Z}		Any character with Unicode General Category 'Separator'
\p{Zs}		Any character with Unicode General Category 'Separator: Space'
\p{Zl}		Any character with Unicode General Category 'Separator: Line'
\p{Zp}		Any character with Unicode General Category 'Separator: Paragraph'
\p{C}		Any character with Unicode General Category 'Other'
\p{Cc}		Any character with Unicode General Category 'Other: Control'
\p{Cf}		Any character with Unicode General Category 'Other: Format'
\p{Cn}		Any character with Unicode General Category 'Other: Not Assigned'
\p{Co}		Any character with Unicode General Category 'Other: Private Use'
\p{Cs}		Any character with Unicode General Category 'Other: Surrogate'
\p{script}		Any character from the Unicode script, where script is one of Arabic, Armenian, Balinese, Bengali, Bopomofo, Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret, Devanagari, Ethiopic, Georgian, Glagolitic, Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew, Hiragana, Inherited, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Lao, Latin, Lepcha, Limbu, Linear_B, Lycian, Lydian, Malayalam, Mongolian, Myanmar, New_Tai_Lue, Nko, Ogham, Ol_Chiki, Old_Italic, Old_Persian, Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Saurashtra, Shavian, Sinhala, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh, Ugaritic, Vai, and Yi
Ruby	Postgres	Explanation
\\	\\	backslash
\a	\a	alert (bell) character, as in C
\b		backspace (only in character class)
	\b	backspace, as in C
	\B	synonym for backslash (\) to help reduce the need for backslash doubling
	\cX	(where X is any character) the character whose low-order 5 bits are the same as those of X, and whose other bits are all zero
\e		the escape character
	\e	the character whose collating-sequence name is ESC, or failing that, the character with octal value 033
\f	\f	form feed, as in C
\n	\n	newline, as in C
\r	\r	carriage return, as in C
\t	\t	horizontal tab, as in C
	\uhhhh	(where hhhh is exactly four hexadecimal digits) the character whose hexadecimal value is 0xhhhh
	\Uhhhhhhhh	(where hhhhhhhh is exactly eight hexadecimal digits) the character whose hexadecimal value is 0xhhhhhhhh
\v	\v	vertical tab, as in C
	\xhhh	(where hhh is any sequence of hexadecimal digits) the character whose hexadecimal value is 0xhhh (a single character no matter how many hexadecimal digits are used)
	\0	the character whose value is 0 (the null byte)
	\oo	(where oo is exactly two octal digits, and is not a back reference) the character whose octal value is 0oo
	\ooo	(where ooo is exactly three octal digits, and is not a back reference) the character whose octal value is 0ooo
Ruby	Postgres	Explanation
i or `Regexp::IGNORECASE`		case-insensitive
m or `Regexp::MULTILINE`		multiline (treat newline as a character matched by `.`)
x or `Regexp::EXTENDED`		ignore whitespace and comments in pattern
o		perform `#{}` interpolation only once
u		regexp is encoded as UTF-8
e		regexp is encoded as EUC-JP
s		regexp is encoded as Windows-31J
n		regexp is encoded as ASCII-8BIT
	***:	(at beginning of pattern) the rest of the pattern is an ARE (Postgres advanced regular expression)
	***=	(at beginning of pattern) the rest of the pattern is a literal string
	b	rest of RE is a BRE (POSIX basic regular expression)
	c	case-sensitive matching (overrides operator type)
	e	rest of RE is an ERE (POSIX extended regular experession)
	i	case-insensitive matching (overrides operator type)
	m	historical synonym for n
	n	newline-sensitive matching
	p	partial newline-sensitive matching
	q	rest of RE is a literal ("quoted") string, all ordinary characters
	s	non-newline-sensitive matching (default)
	t	tight syntax (default)
	w	inverse partial newline-sensitive ("weird") matching
	x	expanded syntax
	g	(with `regexp_replace` and `regexp_matches` only) operate on all matches, not just the first