# I've looked at the case at hand which is:
# foo.bar -> foo: foo.bar
# foo.bar -> foo: foo, format: bar
# This roughly translates to:
# For each part (foo, format), match as many subexpressions as possible, each consisting of one or more word characters or a single non-word character
# (we might say \. explicitly, in this specific case). Do this lazily, except for the last part, since that one needs to gobble up the rest.
# And: If we have two or more parts, join them by an optional dot (\.?), which is not included in the named group.
# Examples
# 1 part:
#
p "foo".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo" foo:"foo">
p "foo.bar".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo.bar">
# 2 parts:
#
p "foo.bar".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo" bar:"bar">
p "foo.bar.bur".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+)/) # => #<MatchData "foo.bar.bur" foo:"foo" bar:"bar.bur">
# 3 parts:
#
p "foo.bar.bur".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+?)\.?(?<bur>(\w+|\W?)+)/) # => #<MatchData "foo.bar.bur" foo:"foo" bar:"bar" bur:"bur">
# (Note that the last expression is greedy – in the last example: (?<bur>(\w+|\W?)+) <- greedy – while the others are not, to gobble up the rest :) If you don't want this, don't make it greedy)
# So generally, what this means: If you have "pots", like foo, format etc. or let's say a, b, c, d, e, f… these regexps will distribute anything looking like:
# 1.2.3.4.5.6.7
# in these pots. If there's not enough to go around, e.g. with 1.2, then it will only be distributed to the first two pots.
# If there's too much to go around, it depends on whether the last expression is greedy or not:
#
p "foo.bar".match(/(?<foo>(\w+|\W?)+?)/) # => foo matched
p "foo.bar".match(/(?<foo>(\w+|\W?)+)/) # => foo.bar gobbled up
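The "pots" rule above can be generated for any number of names. The following is only a sketch: `pots_regexp` is a hypothetical helper name, not something from Sinatra, and it simply assembles the same sub-expressions used in the examples.

```ruby
# Hypothetical helper (not part of Sinatra): builds the "pots" regexp
# described above for an arbitrary list of group names.
def pots_regexp(*names)
  part = '(\w+|\W?)+'
  groups = names.each_with_index.map do |name, index|
    last = (index == names.size - 1)
    # Every group is lazy except the last one, which gobbles up the rest.
    "(?<#{name}>#{part}#{last ? '' : '?'})"
  end
  # Groups are joined by an optional dot that stays outside the named groups.
  Regexp.new(groups.join('\.?'))
end

match = pots_regexp(:foo, :bar).match("foo.bar.bur")
puts match[:foo] # foo
puts match[:bar] # bar.bur
```

With three names and the same input, each pot should receive exactly one segment, as in the 3-part example above.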
OK, that just means we'll have to do special parsing (i.e. recognize that it's OK for :format? not to match the .). Might need to do proper parsing of the pattern then, instead of just some replacements + Regexp.new.
Unsure about that. The patterns I posted actually mimic a pattern that I've seen above, and which could be summarized as:
Match all in group except the separator character(s) up to the separator character(s), then continue matching all in group except the separator character(s).
E.g. /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/
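To make the behaviour of that example regexp concrete, here is a small sketch that runs it against a few paths. The constant and method names are made up for illustration; only the regexp itself comes from the line above.

```ruby
# The regexp quoted above: each capture excludes only "/", "?" and "#".
TWO_OPTIONAL_PARTS = /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/

# Returns the captures for a path, or nil when the path does not match.
def split_path(path)
  match = TWO_OPTIONAL_PARTS.match(path)
  match && match.captures
end

["/hello/world", "/hello", "/", "/hello.world"].each do |path|
  puts "#{path} -> #{split_path(path).inspect}"
end
```

"/hello/world" is split into "hello" and "world", while "/hello.world" ends up entirely in the first capture, because a dot is not among the excluded characters.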
Are these the routing tests? https://github.com/sinatra/sinatra/blob/v1.3.2/test/routing_test.rb Maybe we can add the nice table above to it? :) (If it's not in there already and I've overlooked it)
Yes, except that '.' is not a separator character. The issue is, at the moment we do a simple gsub for :format, so we don't know that :format is actually part of :name.?:format? and I got the feeling that in order to solve this properly we'd need a proper parser, as this is not describable with a regexp.
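The effect of a context-free replacement can be sketched as follows. `naive_compile` is a simplified stand-in written for this illustration, not Sinatra's actual compile step: it replaces every `:name` token with the same capture and every dot with an escaped alternative, without knowing what surrounds the token.

```ruby
# Simplified stand-in for a gsub-based compile step (an assumption for
# illustration, not Sinatra's real code).
def naive_compile(pattern)
  body = pattern.gsub(/\.|:\w+/) do |token|
    # Dots become a literal-or-encoded dot; every :name becomes the same capture.
    token == "." ? '(?:\.|%2E)' : '([^\/?#]+)'
  end
  Regexp.new("^#{body}$")
end

regexp = naive_compile("/:name.?:format?")
puts regexp.match("/foo.bar").captures.inspect # ["foo.bar", nil]
```

Because the first capture does not exclude the dot, it greedily takes "foo.bar" and the optional parts match nothing.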
> Are these the routing tests? https://github.com/sinatra/sinatra/blob/v1.3.2/test/routing_test.rb Maybe we can add the nice table above to it? :) (If it's not in there already and I've overlooked it)
All but the failing ones are actually in there already (that's where I got 'em from).
Good point with the "." not being a separator character. A parser feels like the way to go here, but I had a feeling it might be doable using regexps. I'll have a look at Base#compile.
Thanks, I believe the tests would benefit a lot from being in a tabular form, but that's just me :)
Yeah, I meant a parser for the pattern -> regexp step, not the request path -> route parsing; I would still use a regexp there. The real question would be how to generate it. I would also like :name(.:format)? to be possible.
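As a rough idea of what a pattern like :name(.:format)? could compile to, here is one possible target regexp. This is purely an assumption for illustration: the thread does not specify the output, and the sketch ignores the %2E encoding for brevity.

```ruby
# Hypothetical target for :name(.:format)? (an assumption, not Sinatra output).
# The dot and the format live in one optional group, so they appear together
# or not at all; the name capture additionally excludes the dot.
NAME_WITH_OPTIONAL_FORMAT = /^\/([^\/?#.]+)(?:\.([^\/?#]+))?$/

["/foo.bar", "/foo", "/foo."].each do |path|
  match = NAME_WITH_OPTIONAL_FORMAT.match(path)
  puts "#{path} -> #{(match && match.captures).inspect}"
end
```

Here "/foo.bar" yields "foo" and "bar", "/foo" yields "foo" with no format, and "/foo." does not match at all, since a trailing dot without a format is not allowed by the group.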
Yes, I got that! :)
Ok, rewriting the challenge as: Find a way to elegantly map the given patterns into their corresponding regexps, such that they work for all given examples.
Full Pattern - Regexp mapping:
/ | /^\/$/
/foo | /^\/foo$/
/f\u00F6\u00F6 | /^\/f%C3%B6%C3%B6$/
/:foo | /^\/([^\/?#]+)$/
/:foo/:bar | /^\/([^\/?#]+)\/([^\/?#]+)$/
/hello/:person | /^\/hello\/([^\/?#]+)$/
/?:foo?/?:bar? | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/
/* | /^\/(.*?)$/
/:foo/* | /^\/([^\/?#]+)\/(.*?)$/
/test.bar | /^\/test(?:\.|%2E)bar$/
/test$/ | /^\/test(?:\$|%24)\/$/
/te+st/ | /^\/te(?:\+|%2B)st\/$/
/test(bar)/ | /^\/test(?:\(|%28)bar(?:\)|%29)\/$/
/path with spaces | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/
/foo&bar | /^\/foo(?:&|%26)bar$/
/*/foo/*/* | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/
/:file.:ext | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/
/:name.?:format? | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/
/:user@?:host? | /^\/([^@%40\/?#]+)(?:@|%40)?([^@%40\/?#]+)?$/
/:name.?:format? | /^\/([^\.%2E\/?#]+)(?:\.|%2E)?([^\.%2E\/?#]+)?$/
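The two /:name.?:format? rows differ only in what the captures exclude, and a quick comparison shows the effect. The constant and method names below are made up for this sketch; the regexps are copied from the table.

```ruby
# Both regexps are copied from the two /:name.?:format? rows of the table.
GREEDY_NAME  = /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/
DOT_EXCLUDED = /^\/([^\.%2E\/?#]+)(?:\.|%2E)?([^\.%2E\/?#]+)?$/

# Returns the captures for a path, or nil when the path does not match.
def captures_for(regexp, path)
  match = regexp.match(path)
  match && match.captures
end

["/foo.bar", "/foo", "/.bar", "/v2.json"].each do |path|
  first  = captures_for(GREEDY_NAME, path).inspect
  second = captures_for(DOT_EXCLUDED, path).inspect
  puts "#{path}: #{first} vs #{second}"
end
```

For "/foo.bar" the first regexp returns ["foo.bar", nil] and the second ["foo", "bar"]. One thing to watch: inside a character class, %2E is read as the three separate characters "%", "2" and "E", so the second regexp also rejects names containing "2" or "E", which is why "/v2.json" does not match it.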
I wouldn't mind working on this, if you don't mind :)
Ok, almost got it. Cleaning up the code and preparing for a pull request :)
Note: I'm assuming this example should actually be nil. At least it looks like that to me. If I am wrong, please tell me.
"/:name.?:format? | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/ | /.bar | [.bar, nil]"
Let's move this to: sinatra/sinatra#492 Cheers!
The second one is probably
Can you check, @rkh?