@bvisness
Last active August 18, 2016 04:22
Verbal Expressions made better
//
// Chaining functions is kind of annoying, and seems unnecessary. Let's try
// building verbal expressions using arrays instead.
//
// (Plus let's throw some new features in, because regexes can do a lot!)
//
//
// Simple stuff can be done like this:
//
var myExpression = new VerbalExpression([
    'foo', // straightforward literal
    maybe('optional'), // equivalent to (optional)?
    oneOf('a-zA-Z0-9'), // equivalent to [a-zA-Z0-9]
    anythingBut('a-zA-Z0-9'), // equivalent to [^a-zA-Z0-9]
    atLeastOne('foo'), // equivalent to (foo)+
    zeroOrMore('foo'), // equivalent to (foo)*
    or([ // equivalent to (foo|bar|baz)
        'foo',
        'bar',
        'baz'
    ]),
    anyCharacter(), // equivalent to .
]);
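//
// As a rough illustration of how the helpers above might turn into regex
// source, here is a minimal sketch. Everything in it is an assumption:
// `Fragment`, `escapeLiteral`, and `compile` are hypothetical names, and a
// real implementation could look quite different.
//
```javascript
// Hypothetical sketch; none of these names come from a published library.

// Marks regex source that is already compiled, so it is not escaped again.
function Fragment(source) {
    this.source = source;
}

// Escape regex metacharacters so that plain strings match literally.
function escapeLiteral(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Strings are literals, arrays are sequences, Fragments pass through as-is.
function compile(part) {
    if (part instanceof Fragment) return part.source;
    if (Array.isArray(part)) return part.map(compile).join('');
    return escapeLiteral(String(part));
}

function maybe(part) { return new Fragment('(' + compile(part) + ')?'); }
function atLeastOne(part) { return new Fragment('(' + compile(part) + ')+'); }
function zeroOrMore(part) { return new Fragment('(' + compile(part) + ')*'); }
function oneOf(chars) { return new Fragment('[' + chars + ']'); }
function anythingBut(chars) { return new Fragment('[^' + chars + ']'); }
function anyCharacter() { return new Fragment('.'); }
function or(options) {
    return new Fragment('(' + options.map(compile).join('|') + ')');
}

// Returning an object from a constructor replaces `this`, so
// `new VerbalExpression(...)` hands back an ordinary RegExp.
function VerbalExpression(parts) {
    return new RegExp(compile(parts));
}

var sketch = new VerbalExpression([
    'foo',
    maybe('bar'),
    atLeastOne(oneOf('0-9'))
]);
console.log(sketch.source);           // foo(bar)?([0-9])+
console.log(sketch.test('foobar42')); // true
```
//
// Wrapping compiled pieces in `Fragment` is one way to tell "already regex"
// apart from "plain string that still needs escaping".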
//
// All of this can be nested, or arrays can be given to match larger patterns.
// This would match `http://google.com`, `https://www.google.com/`, etc.
//
// This would expand to (http(s)?://)?(www\.)?google\.com(/)?
// (Some parentheses could be omitted from that regex, but if we are not
// writing these by hand, what does it matter?)
//
var myNestedExpression = new VerbalExpression([
    maybe([
        'http',
        maybe('s'),
        '://'
    ]),
    maybe('www.'),
    'google.com',
    maybe('/')
]);
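//
// A quick sketch of how the nesting could work. It assumes plain strings are
// escaped, so each '.' in a literal only matches a real dot. `Fragment`,
// `escapeLiteral`, and `compile` are hypothetical names used for illustration.
//
```javascript
// Hypothetical sketch; the helper names below are assumptions.
function Fragment(source) {
    this.source = source;
}

function escapeLiteral(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Recursion is what makes nesting work: an array compiles to the
// concatenation of whatever its elements compile to.
function compile(part) {
    if (part instanceof Fragment) return part.source;
    if (Array.isArray(part)) return part.map(compile).join('');
    return escapeLiteral(String(part));
}

function maybe(part) {
    return new Fragment('(' + compile(part) + ')?');
}

var source = compile([
    maybe(['http', maybe('s'), '://']),
    maybe('www.'),
    'google.com',
    maybe('/')
]);
console.log(source); // (http(s)?://)?(www\.)?google\.com(/)?

// Anchored here only to make the checks strict.
var anchored = new RegExp('^' + source + '$');
console.log(anchored.test('https://www.google.com/')); // true
console.log(anchored.test('googleXcom'));              // false, the dot is literal
```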
//
// We can create new groups by nesting another array. This can be useful
// in some contexts, like this one, where we want to match .com, .co, .org, and
// .biz domains:
//
var myParenthesizedExpression = new VerbalExpression([
    '.',
    or([
        ['co', maybe('m')],
        'org',
        'biz'
    ])
]);
//
// Regex flags such as case-insensitivity, `.` matching newlines, and so on can
// be set with a flags parameter:
//
var flags = [caseInsensitive(), dotMatchesNewlines()];
var myFlaggedExpression = new VerbalExpression([/* fancy expression */], flags);
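//
// One possible way to handle flags, sketched under the assumption that each
// flag helper simply returns the matching JavaScript RegExp flag letter.
// `buildWithFlags` is a hypothetical stand-in that takes raw regex source, so
// the example stays short; it is not the proposed constructor.
//
```javascript
// Hypothetical sketch: flag helpers map onto built-in RegExp flag letters.
function caseInsensitive() { return 'i'; }
function dotMatchesNewlines() { return 's'; } // ES2018 "dotAll" flag

// Joins the flag letters and passes them to the RegExp constructor.
// `source` is raw regex source here, not an array of parts.
function buildWithFlags(source, flags) {
    return new RegExp(source, (flags || []).join(''));
}

var sketchFlags = [caseInsensitive(), dotMatchesNewlines()];
var flagged = buildWithFlags('a.b', sketchFlags);
console.log(flagged.flags);        // is
console.log(flagged.test('A\nB')); // true

// Without the flags, neither the case nor the newline would match.
console.log(buildWithFlags('a.b').test('A\nB')); // false
```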
//
// Also, groups and backreferences are a pain, so let's deal with those too!
// This example would match pairs of HTML tags: `<html></html>`,
// `<span>Hello there!</span>`, and so on.
//
var myGroupedExpression = new VerbalExpression([
    '<',
    group('tagName', [
        atLeastOne(oneOf('a-zA-Z'))
    ]),
    '>',
    zeroOrMore(anyCharacter()),
    '</',
    again('tagName'),
    '>'
]);
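//
// A sketch of how `group` and `again` might be built on named capture groups
// and named backreferences. This assumes an engine with ES2018 regex support;
// `Fragment`, `escapeLiteral`, and `compile` are hypothetical names. The
// closing tag is written as '</' so that pairs like `<span>...</span>` match.
//
```javascript
// Hypothetical sketch, assuming ES2018 named capture groups are available.
function Fragment(source) {
    this.source = source;
}

function escapeLiteral(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function compile(part) {
    if (part instanceof Fragment) return part.source;
    if (Array.isArray(part)) return part.map(compile).join('');
    return escapeLiteral(String(part));
}

function oneOf(chars) { return new Fragment('[' + chars + ']'); }
function anyCharacter() { return new Fragment('.'); }
function atLeastOne(part) { return new Fragment('(' + compile(part) + ')+'); }
function zeroOrMore(part) { return new Fragment('(' + compile(part) + ')*'); }

// group() emits a named capture group; again() emits a backreference to it.
function group(name, part) {
    return new Fragment('(?<' + name + '>' + compile(part) + ')');
}
function again(name) {
    return new Fragment('\\k<' + name + '>');
}

var tagPair = new RegExp(compile([
    '<',
    group('tagName', [atLeastOne(oneOf('a-zA-Z'))]),
    '>',
    zeroOrMore(anyCharacter()),
    '</',
    again('tagName'),
    '>'
]));
console.log(tagPair.test('<span>Hello there!</span>')); // true
console.log(tagPair.test('<span>Hello there!</div>'));  // false
```
//
// Using names instead of numbers means the backreference keeps working even
// when other helpers add their own parentheses around it.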