@floere
Created March 22, 2012 01:16
A regexp to match named groups in a URL
# I've looked at the case at hand which is:
# foo.bar -> foo: foo.bar
# foo.bar -> foo: foo, format: bar
# This roughly translates to:
# For each part (foo, format), match as many subexpressions as possible, each consisting of one or more word characters or at most one non-word character
# (in this specific case we could say \. explicitly). Do this lazily, except for the last part, since that one needs to gobble up the rest.
# And: if we have two or more parts, join them with an optional dot (\.?), which is not included in any named group.
# Examples
# 1 part:
#
p "foo".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo" foo:"foo">
p "foo.bar".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo.bar">
# 2 parts:
#
p "foo.bar".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo" bar:"bar">
p "foo.bar.bur".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+)/) # => #<MatchData "foo.bar.bur" foo:"foo" bar:"bar.bur">
# 3 parts:
#
p "foo.bar.bur".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+?)\.?(?<bur>(\w+|\W?)+)/) # => #<MatchData "foo.bar.bur" foo:"foo" bar:"bar" bur:"bur">
# (Note that the last expression is greedy, e.g. (?<bur>(\w+|\W?)+) in the last example, so that it gobbles up the rest, while the others are lazy :) If you don't want this, make the last one lazy as well.)
# So generally, what this means: If you have "pots", like foo, format etc. or let's say a, b, c, d, e, f… these regexps will distribute anything looking like:
# 1.2.3.4.5.6.7
# in these pots. If there's not enough to go around, e.g. with 1.2, then only the first two pots get filled (the remaining ones match the empty string).
# If there's too much to go around, it depends on whether the last expression is greedy or not:
#
p "foo.bar".match(/(?<foo>(\w+|\W?)+?)/) # => #<MatchData "foo" foo:"foo"> (lazy: only foo is matched)
p "foo.bar".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo.bar"> (greedy: foo.bar gobbled up)
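
The "pots" idea above can be sketched for any number of parts. This is only an illustration under assumptions: `pots_pattern`, `PART`, and the `greedy_last:` option are hypothetical names, not part of the gist or of Sinatra, and the inner group is written as `(?:...)`, which behaves the same here because Ruby does not capture unnamed groups once named groups are present.

```ruby
# Hypothetical helper: builds the gist's pattern for an arbitrary list of pot names.
# One "part": one or more word characters, or at most one non-word character.
PART = '(?:\w+|\W?)'

def pots_pattern(names, greedy_last: true)
  groups = names.each_with_index.map do |name, index|
    last = index == names.size - 1
    # Every pot is lazy, except (optionally) the last one, which gobbles up the rest.
    quantifier = last && greedy_last ? '+' : '+?'
    "(?<#{name}>#{PART}#{quantifier})"
  end
  # Pots are joined by an optional dot that belongs to no named group.
  Regexp.new(groups.join('\.?'))
end

# Exactly enough to go around: one value per pot.
m = pots_pattern(%w[foo bar bur]).match("foo.bar.bur")
p [m[:foo], m[:bar], m[:bur]] # => ["foo", "bar", "bur"]

# Not enough to go around: the last pot matches the empty string.
m = pots_pattern(%w[foo bar bur]).match("foo.bar")
p [m[:foo], m[:bar], m[:bur]] # => ["foo", "bar", ""]

# Too much to go around, greedy last pot: it takes the rest.
m = pots_pattern(%w[foo bar]).match("foo.bar.bur")
p [m[:foo], m[:bar]] # => ["foo", "bar.bur"]

# Too much to go around, lazy last pot: the rest is left unmatched.
m = pots_pattern(%w[foo bar], greedy_last: false).match("foo.bar.bur")
p [m[0], m[:foo], m[:bar]] # => ["foo.bar", "foo", "bar"]
```

The names passed in have to be valid Ruby group names, since they are interpolated straight into the pattern.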

floere commented Mar 23, 2012

Ok, almost got it. Cleaning up the code and preparing for a pull request :)

floere commented Mar 23, 2012

Note: I'm assuming this example should actually be nil. At least it looks like that to me. If I am wrong, please tell me.

"/:name.?:format?  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/ | /.bar | [.bar, nil]"

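For reference, here is a small sketch of what the regexp in the quoted row returns when run as-is in plain Ruby. The regexp is copied verbatim from the row; `ROUTE` and `route_captures` are hypothetical names used only for this illustration, and nothing here goes through Sinatra itself.

```ruby
# Regexp copied verbatim from the quoted row for "/:name.?:format?".
ROUTE = /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/

# Hypothetical helper: returns the captures, or nil when the path does not match.
def route_captures(path)
  match = ROUTE.match(path)
  match && match.captures
end

p route_captures("/.bar")    # => [".bar", nil]
p route_captures("/foo.bar") # => ["foo.bar", nil]
p route_captures("/")        # => nil
```

The first group is greedy and may contain dots, so it swallows the whole segment and the format group stays nil. That is the behaviour the named-group patterns in the gist try to avoid.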
floere commented Mar 23, 2012

Let's move this to: sinatra/sinatra#492 Cheers!
