Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
A regexp to match named groups in a URL
# I've looked at the case at hand which is:
# foo.bar -> foo: foo.bar
# foo.bar -> foo: foo, format: bar
# This roughly translates to:
# For each part (foo, format), match as many possible subexpressions consisting of multiple word characters or one non-word characters
# (we might say \. explicitly, in this specific case). Do this lazily, except for the last part, since that one needs to gobble up the rest.
# And: If we have two or more parts, join them by a lazy match of a dot (\.?), which is not included in the named group.
# Examples
# 1 part:
#
p "foo".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo" foo:"foo">
p "foo.bar".match(/(?<foo>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo.bar">
# 2 parts:
#
p "foo.bar".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+)/) # => #<MatchData "foo.bar" foo:"foo" bar:"bar">
p "foo.bar.bur".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+)/) # => #<MatchData "foo.bar.bur" foo:"foo" bar:"bar.bur">
# 3 parts:
#
p "foo.bar.bur".match(/(?<foo>(\w+|\W?)+?)\.?(?<bar>(\w+|\W?)+?)\.?(?<bur>(\w+|\W?)+)/) # => #<MatchData "foo.bar.bur" foo:"foo" bar:"bar" bur:"bur">
# (Note that the last expression is greedy – in the last example: (?<bur>(\w+|\W?)+) <- greedy – while the others are not, to gobble up the rest :) If you don't want this, don't make it greedy)
# So generally, what this means: If you have "pots", like foo, format etc. or let's say a, b, c, d, e, f… these regexps will distribute anything looking like:
# 1.2.3.4.5.6.7
# in these pots. If there's not enough to go around, e.g. with 1,2, then it will only be distributed to the first two pots.
# If there's too much to go around, it depends whether the last expression is greedy or not:
#
p "foo.bar".match(/(?<foo>(\w+|\W?)+?)/) # => foo matched
p "foo.bar".match(/(?<foo>(\w+|\W?)+)/) # => foo.bar gobbled up
@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I hope this is understandable. We could be more specific than \W if it's only \. that need to be matched, of course.

Owner

floere commented Mar 22, 2012

I hope this is understandable. We could be more specific than \W if it's only \. that need to be matched, of course.

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

No, \W is fine.

rkh commented Mar 22, 2012

No, \W is fine.

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Ok, so that causes issues, actually, as splats are non-greedy atm and making (parts of) the normal named patterns non-greedy makes splats eat up everything

rkh commented Mar 22, 2012

Ok, so that causes issues, actually, as splats are non-greedy atm and making (parts of) the normal named patterns non-greedy makes splats eat up everything

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

ah, wrong explenation, but it still doesn't work

rkh commented Mar 22, 2012

ah, wrong explenation, but it still doesn't work

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Failing example?

Owner

floere commented Mar 22, 2012

Failing example?

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

OK, so how do I apply this? The current regexp is /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/.

rkh commented Mar 22, 2012

OK, so how do I apply this? The current regexp is /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/.

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

regexp = /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/
def assert(cond) cond or fail("didn't match") end

assert "/foo" =~ regexp
assert $1 == "foo"

assert "/foo.bar" =~ regexp
assert $1 == "foo"
assert $2 == "bar"

assert "/foo." =~ regexp
assert $1 == "foo"

assert "/.bar" !~ regexp
assert "/foo/bar" !~ regexp

rkh commented Mar 22, 2012

regexp = /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/
def assert(cond) cond or fail("didn't match") end

assert "/foo" =~ regexp
assert $1 == "foo"

assert "/foo.bar" =~ regexp
assert $1 == "foo"
assert $2 == "bar"

assert "/foo." =~ regexp
assert $1 == "foo"

assert "/.bar" !~ regexp
assert "/foo/bar" !~ regexp
@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

In Sinatra:

diff --git a/test/routing_test.rb b/test/routing_test.rb
index 80b0a00..09983f0 100644
--- a/test/routing_test.rb
+++ b/test/routing_test.rb
@@ -950,6 +950,13 @@ class RoutingTest < Test::Unit::TestCase
     assert_equal 'looks good', body
   end

+  it "matches formats even if the format is potional" do
+    mock_app { p get('/:name.?:format?') { params[:format] } }
+    get '/foo.bar'
+    assert ok?
+    assert_equal "bar", body
+  end
+
   it 'raises an ArgumentError with block arity > 1 and too many values' do
     mock_app do
       get '/:foo/:bar/:baz' do |foo, bar|

rkh commented Mar 22, 2012

In Sinatra:

diff --git a/test/routing_test.rb b/test/routing_test.rb
index 80b0a00..09983f0 100644
--- a/test/routing_test.rb
+++ b/test/routing_test.rb
@@ -950,6 +950,13 @@ class RoutingTest < Test::Unit::TestCase
     assert_equal 'looks good', body
   end

+  it "matches formats even if the format is potional" do
+    mock_app { p get('/:name.?:format?') { params[:format] } }
+    get '/foo.bar'
+    assert ok?
+    assert_equal "bar", body
+  end
+
   it 'raises an ArgumentError with block arity > 1 and too many values' do
     mock_app do
       get '/:foo/:bar/:baz' do |foo, bar|
@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I'm trying to understand what the problem is. (I can assume things, but I'd rather ask)

So I am running a few examples:

r = /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/

p '/:foo'.match(r)            # => #<MatchData "/:foo" 1:":foo" 2:nil>
p '/:foo.:format'.match(r)    # => #<MatchData "/:foo.:format" 1:":foo.:format" 2:nil>
p '/:foo/:bar/:baz'.match(r)  # => nil
p '/:name.?:format?'.match(r) # => nil

Is the expectation that all of these are matched in groups? What should be the results in these cases?

Owner

floere commented Mar 22, 2012

I'm trying to understand what the problem is. (I can assume things, but I'd rather ask)

So I am running a few examples:

r = /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/

p '/:foo'.match(r)            # => #<MatchData "/:foo" 1:":foo" 2:nil>
p '/:foo.:format'.match(r)    # => #<MatchData "/:foo.:format" 1:":foo.:format" 2:nil>
p '/:foo/:bar/:baz'.match(r)  # => nil
p '/:name.?:format?'.match(r) # => nil

Is the expectation that all of these are matched in groups? What should be the results in these cases?

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Or better… what do you need extracted, actually?

Owner

floere commented Mar 22, 2012

Or better… what do you need extracted, actually?

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Your second example should behave differently (i.e. result in #<MatchData "/:foo.:format" 1:":foo" 2:":format">).

rkh commented Mar 22, 2012

Your second example should behave differently (i.e. result in #<MatchData "/:foo.:format" 1:":foo" 2:":format">).

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Ok.

Can we maybe get a huge ass list of various examples going? :) This would help tremendously in the search for a good regexp! What should happen if "/.bar" comes along? Fail?

Owner

floere commented Mar 22, 2012

Ok.

Can we maybe get a huge ass list of various examples going? :) This would help tremendously in the search for a good regexp! What should happen if "/.bar" comes along? Fail?

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Yes, :foo should match at least on character. I'll do a list.

rkh commented Mar 22, 2012

Yes, :foo should match at least on character. I'll do a list.

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Thanks!

Owner

floere commented Mar 22, 2012

Thanks!

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Working
=======

 Pattern             | Current Regexp                                           | Example                          | Should Be                            | Is Currently                        
---------------------|----------------------------------------------------------|----------------------------------|--------------------------------------|--------------------------------------
 "/"                 | /^\/$/                                                   | "/"                              | []                                   | []                                  
 "/foo"              | /^\/foo$/                                                | "/foo"                           | []                                   | []                                  
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo"                           | ["foo"]                              | ["foo"]                             
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo?"                          | nil                                  | nil                                 
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo/bar"                       | nil                                  | nil                                 
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo%2Fbar"                     | ["foo%2Fbar"]                        | ["foo%2Fbar"]                       
 "/:foo/:bar"        | /^\/([^\/?#]+)\/([^\/?#]+)$/                             | "/foo/bar"                       | ["foo", "bar"]                       | ["foo", "bar"]                      
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/"                              | nil                                  | nil                                 
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo/"                          | nil                                  | nil                                 
 "/f\u00F6\u00F6"    | /^\/f%C3%B6%C3%B6$/                                      | "/f%C3%B6%C3%B6"                 | []                                   | []                                  
 "/hello/:person"    | /^\/hello\/([^\/?#]+)$/                                  | "/hello/Frank"                   | ["Frank"]                            | ["Frank"]                           
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | "/hello/world"                   | ["hello", "world"]                   | ["hello", "world"]                  
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | "/hello"                         | ["hello", nil]                       | ["hello", nil]                      
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | "/"                              | [nil, nil]                           | [nil, nil]                          
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | ""                               | [nil, nil]                           | [nil, nil]                          
 "/*"                | /^\/(.*?)$/                                              | "/"                              | [""]                                 | [""]                                
 "/*"                | /^\/(.*?)$/                                              | "/foo"                           | ["foo"]                              | ["foo"]                             
 "/*"                | /^\/(.*?)$/                                              | "/"                              | [""]                                 | [""]                                
 "/*"                | /^\/(.*?)$/                                              | "/foo/bar"                       | ["foo/bar"]                          | ["foo/bar"]                         
 "/:foo/*"           | /^\/([^\/?#]+)\/(.*?)$/                                  | "/foo/bar/baz"                   | ["foo", "bar/baz"]                   | ["foo", "bar/baz"]                  
 "/:foo/:bar"        | /^\/([^\/?#]+)\/([^\/?#]+)$/                             | "/user@example.com/name"         | ["user@example.com", "name"]         | ["user@example.com", "name"]        
 "/:file.:ext"       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/                     | "/pony.jpg"                      | ["pony", "jpg"]                      | ["pony", "jpg"]                     
 "/:file.:ext"       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/                     | "/pony%2Ejpg"                    | ["pony", "jpg"]                      | ["pony", "jpg"]                     
 "/:file.:ext"       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/                     | "/.jpg"                          | nil                                  | nil                                 
 "/test.bar"         | /^\/test(?:\.|%2E)bar$/                                  | "/test.bar"                      | []                                   | []                                  
 "/test.bar"         | /^\/test(?:\.|%2E)bar$/                                  | "/test0bar"                      | nil                                  | nil                                 
 "/test$/"           | /^\/test(?:\$|%24)\/$/                                   | "/test$/"                        | []                                   | []                                  
 "/te+st/"           | /^\/te(?:\+|%2B)st\/$/                                   | "/te+st/"                        | []                                   | []                                  
 "/te+st/"           | /^\/te(?:\+|%2B)st\/$/                                   | "/test/"                         | nil                                  | nil                                 
 "/te+st/"           | /^\/te(?:\+|%2B)st\/$/                                   | "/teeest/"                       | nil                                  | nil                                 
 "/test(bar)/"       | /^\/test(?:\(|%28)bar(?:\)|%29)\/$/                      | "/test(bar)/"                    | []                                   | []                                  
 "/path with spaces" | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/ | "/path%20with%20spaces"          | []                                   | []                                  
 "/path with spaces" | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/ | "/path%2Bwith%2Bspaces"          | []                                   | []                                  
 "/path with spaces" | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/ | "/path+with+spaces"              | []                                   | []                                  
 "/foo&bar"          | /^\/foo(?:&|%26)bar$/                                    | "/foo&bar"                       | []                                   | []                                  
 "/:foo/*"           | /^\/([^\/?#]+)\/(.*?)$/                                  | "/hello%20world/how%20are%20you" | ["hello%20world", "how%20are%20you"] | ["hello%20world", "how%20are%20you"]
 "/*/foo/*/*"        | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/                           | "/bar/foo/bling/baz/boom"        | ["bar", "bling", "baz/boom"]         | ["bar", "bling", "baz/boom"]        
 "/*/foo/*/*"        | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/                           | "/bar/foo/baz"                   | nil                                  | nil                                 
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/foo"                           | ["foo", nil]                         | ["foo", nil]                        
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/.bar"                          | [".bar", nil]                        | [".bar", nil]                       

Broken
======

 Pattern             | Current Regexp                                           | Example                          | Should Be                            | Is Currently                        
---------------------|----------------------------------------------------------|----------------------------------|--------------------------------------|--------------------------------------
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/foo.bar"                       | ["foo", "bar"]                       | ["foo.bar", nil]                    
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/foo%2Ebar"                     | ["foo", "bar"]                       | ["foo%2Ebar", nil]                  
 "/:user@?:host?"    | /^\/([^\/?#]+)(?:@|%40)?([^\/?#]+)?$/                    | "/foo@bar"                       | ["foo", "bar"]                       | ["foo@bar", nil]                    
 "/:user@?:host?"    | /^\/([^\/?#]+)(?:@|%40)?([^\/?#]+)?$/                    | "/foo.foo@bar"                   | ["foo.foo", "bar"]                   | ["foo.foo@bar", nil]                
 "/:user@?:host?"    | /^\/([^\/?#]+)(?:@|%40)?([^\/?#]+)?$/                    | "/foo@bar.bar"                   | ["foo", "bar.bar"]                   | ["foo@bar.bar", nil]  

rkh commented Mar 22, 2012

Working
=======

 Pattern             | Current Regexp                                           | Example                          | Should Be                            | Is Currently                        
---------------------|----------------------------------------------------------|----------------------------------|--------------------------------------|--------------------------------------
 "/"                 | /^\/$/                                                   | "/"                              | []                                   | []                                  
 "/foo"              | /^\/foo$/                                                | "/foo"                           | []                                   | []                                  
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo"                           | ["foo"]                              | ["foo"]                             
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo?"                          | nil                                  | nil                                 
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo/bar"                       | nil                                  | nil                                 
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo%2Fbar"                     | ["foo%2Fbar"]                        | ["foo%2Fbar"]                       
 "/:foo/:bar"        | /^\/([^\/?#]+)\/([^\/?#]+)$/                             | "/foo/bar"                       | ["foo", "bar"]                       | ["foo", "bar"]                      
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/"                              | nil                                  | nil                                 
 "/:foo"             | /^\/([^\/?#]+)$/                                         | "/foo/"                          | nil                                  | nil                                 
 "/f\u00F6\u00F6"    | /^\/f%C3%B6%C3%B6$/                                      | "/f%C3%B6%C3%B6"                 | []                                   | []                                  
 "/hello/:person"    | /^\/hello\/([^\/?#]+)$/                                  | "/hello/Frank"                   | ["Frank"]                            | ["Frank"]                           
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | "/hello/world"                   | ["hello", "world"]                   | ["hello", "world"]                  
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | "/hello"                         | ["hello", nil]                       | ["hello", nil]                      
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | "/"                              | [nil, nil]                           | [nil, nil]                          
 "/?:foo?/?:bar?"    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/                         | ""                               | [nil, nil]                           | [nil, nil]                          
 "/*"                | /^\/(.*?)$/                                              | "/"                              | [""]                                 | [""]                                
 "/*"                | /^\/(.*?)$/                                              | "/foo"                           | ["foo"]                              | ["foo"]                             
 "/*"                | /^\/(.*?)$/                                              | "/"                              | [""]                                 | [""]                                
 "/*"                | /^\/(.*?)$/                                              | "/foo/bar"                       | ["foo/bar"]                          | ["foo/bar"]                         
 "/:foo/*"           | /^\/([^\/?#]+)\/(.*?)$/                                  | "/foo/bar/baz"                   | ["foo", "bar/baz"]                   | ["foo", "bar/baz"]                  
 "/:foo/:bar"        | /^\/([^\/?#]+)\/([^\/?#]+)$/                             | "/user@example.com/name"         | ["user@example.com", "name"]         | ["user@example.com", "name"]        
 "/:file.:ext"       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/                     | "/pony.jpg"                      | ["pony", "jpg"]                      | ["pony", "jpg"]                     
 "/:file.:ext"       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/                     | "/pony%2Ejpg"                    | ["pony", "jpg"]                      | ["pony", "jpg"]                     
 "/:file.:ext"       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/                     | "/.jpg"                          | nil                                  | nil                                 
 "/test.bar"         | /^\/test(?:\.|%2E)bar$/                                  | "/test.bar"                      | []                                   | []                                  
 "/test.bar"         | /^\/test(?:\.|%2E)bar$/                                  | "/test0bar"                      | nil                                  | nil                                 
 "/test$/"           | /^\/test(?:\$|%24)\/$/                                   | "/test$/"                        | []                                   | []                                  
 "/te+st/"           | /^\/te(?:\+|%2B)st\/$/                                   | "/te+st/"                        | []                                   | []                                  
 "/te+st/"           | /^\/te(?:\+|%2B)st\/$/                                   | "/test/"                         | nil                                  | nil                                 
 "/te+st/"           | /^\/te(?:\+|%2B)st\/$/                                   | "/teeest/"                       | nil                                  | nil                                 
 "/test(bar)/"       | /^\/test(?:\(|%28)bar(?:\)|%29)\/$/                      | "/test(bar)/"                    | []                                   | []                                  
 "/path with spaces" | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/ | "/path%20with%20spaces"          | []                                   | []                                  
 "/path with spaces" | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/ | "/path%2Bwith%2Bspaces"          | []                                   | []                                  
 "/path with spaces" | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/ | "/path+with+spaces"              | []                                   | []                                  
 "/foo&bar"          | /^\/foo(?:&|%26)bar$/                                    | "/foo&bar"                       | []                                   | []                                  
 "/:foo/*"           | /^\/([^\/?#]+)\/(.*?)$/                                  | "/hello%20world/how%20are%20you" | ["hello%20world", "how%20are%20you"] | ["hello%20world", "how%20are%20you"]
 "/*/foo/*/*"        | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/                           | "/bar/foo/bling/baz/boom"        | ["bar", "bling", "baz/boom"]         | ["bar", "bling", "baz/boom"]        
 "/*/foo/*/*"        | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/                           | "/bar/foo/baz"                   | nil                                  | nil                                 
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/foo"                           | ["foo", nil]                         | ["foo", nil]                        
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/.bar"                          | [".bar", nil]                        | [".bar", nil]                       

Broken
======

 Pattern             | Current Regexp                                           | Example                          | Should Be                            | Is Currently                        
---------------------|----------------------------------------------------------|----------------------------------|--------------------------------------|--------------------------------------
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/foo.bar"                       | ["foo", "bar"]                       | ["foo.bar", nil]                    
 "/:name.?:format?"  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/                   | "/foo%2Ebar"                     | ["foo", "bar"]                       | ["foo%2Ebar", nil]                  
 "/:user@?:host?"    | /^\/([^\/?#]+)(?:@|%40)?([^\/?#]+)?$/                    | "/foo@bar"                       | ["foo", "bar"]                       | ["foo@bar", nil]                    
 "/:user@?:host?"    | /^\/([^\/?#]+)(?:@|%40)?([^\/?#]+)?$/                    | "/foo.foo@bar"                   | ["foo.foo", "bar"]                   | ["foo.foo@bar", nil]                
 "/:user@?:host?"    | /^\/([^\/?#]+)(?:@|%40)?([^\/?#]+)?$/                    | "/foo@bar.bar"                   | ["foo", "bar.bar"]                   | ["foo@bar.bar", nil]  
@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Great! Does the result need to be an Array? Is #match used? Why not #split?

Owner

floere commented Mar 22, 2012

Great! Does the result need to be an Array? Is #match used? Why not #split?

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I'm asking this because:

r = /[\/\.%2E]+/
p "/foo.bar"    .split(r) # => ["", "foo", "bar"]
p "/foo%2Ebar"  .split(r) # => ["", "foo", "bar"]

r = /[\/@]+/
p "/foo@bar"    .split(r) # => ["", "foo", "bar"]
p "/foo.foo@bar".split(r) # => ["", "foo.foo", "bar"]
p "/foo@bar.bar".split(r) # => ["", "foo", "bar.bar"]

Which is very close to the expected behavior, while being relatively simple (to generate as well). Cheers!

Owner

floere commented Mar 22, 2012

I'm asking this because:

r = /[\/\.%2E]+/
p "/foo.bar"    .split(r) # => ["", "foo", "bar"]
p "/foo%2Ebar"  .split(r) # => ["", "foo", "bar"]

r = /[\/@]+/
p "/foo@bar"    .split(r) # => ["", "foo", "bar"]
p "/foo.foo@bar".split(r) # => ["", "foo.foo", "bar"]
p "/foo@bar.bar".split(r) # => ["", "foo", "bar.bar"]

Which is very close to the expected behavior, while being relatively simple (to generate as well). Cheers!

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

match is used. So you mean we should detect that scenario and "fix" it after matching?

rkh commented Mar 22, 2012

match is used. So you mean we should detect that scenario and "fix" it after matching?

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I wonder about the usage of match since using split seems a much better fit (in terms of generating the regexps and mental mapping), still assuming that an array of parts is needed as a result.
No, I am suggesting to use split if that is a possibility.

Owner

floere commented Mar 22, 2012

I wonder about the usage of match since using split seems a much better fit (in terms of generating the regexps and mental mapping), still assuming that an array of parts is needed as a result.
No, I am suggesting to use split if that is a possibility.

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

I don't get yet what the program flow would be like with split.

rkh commented Mar 22, 2012

I don't get yet what the program flow would be like with split.

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Can you point me to the match in the Sinatra code, please? :) (Then I am able to help much better)

Note: From your table I assumed you actually need an array of the parts as output, which is why I offered the idea of using split.

Owner

floere commented Mar 22, 2012

Can you point me to the match in the Sinatra code, please? :) (Then I am able to help much better)

Note: From your table I assumed you actually need an array of the parts as output, which is why I offered the idea of using split.

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Thanks! I see.

Ok then, match it is:

r = /^\/([^\/?#]+)(?:\.|%2E)+([^\/?#]+)?$/
p "/foo.bar".match(r)
p "/foo%2Ebar".match(r)

r = /^\/([^\/?#]+)(?:@|%40)+([^\/?#]+)?$/
p "/foo@bar".match(r)
p "/foo.foo@bar".match(r)
p "/foo@bar.bar".match(r)

(Both differ from the original in 1 position: ? -> +)

Cheers!

Owner

floere commented Mar 22, 2012

Thanks! I see.

Ok then, match it is:

r = /^\/([^\/?#]+)(?:\.|%2E)+([^\/?#]+)?$/
p "/foo.bar".match(r)
p "/foo%2Ebar".match(r)

r = /^\/([^\/?#]+)(?:@|%40)+([^\/?#]+)?$/
p "/foo@bar".match(r)
p "/foo.foo@bar".match(r)
p "/foo@bar.bar".match(r)

(Both differ from the original in 1 position: ? -> +)

Cheers!

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

That doesn't match "/foo".

rkh commented Mar 22, 2012

That doesn't match "/foo".

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I'm beginning to understand. Also: How about being more constructive? :)

Owner

floere commented Mar 22, 2012

I'm beginning to understand. Also: How about being more constructive? :)

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Haha, I'm sorry, I super thankful for your investigation so far. I just don't know if this is even possible.

rkh commented Mar 22, 2012

Haha, I'm sorry, I super thankful for your investigation so far. I just don't know if this is even possible.

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Yeah, my problem was that I didn't understand what Sinatra was actually doing – but now I do. Source code reading helps :)

Owner

floere commented Mar 22, 2012

Yeah, my problem was that I didn't understand what Sinatra was actually doing – but now I do. Source code reading helps :)

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I believe I got one. On to the next.

r = /^\/([^\.%2E\/?#]+)(?:\.|%2E)?([^\.%2E\/?#]+)?$/
p "/foo".match(r)
p "/foo.bar".match(r)
p "/foo%2Ebar".match(r)
Owner

floere commented Mar 22, 2012

I believe I got one. On to the next.

r = /^\/([^\.%2E\/?#]+)(?:\.|%2E)?([^\.%2E\/?#]+)?$/
p "/foo".match(r)
p "/foo.bar".match(r)
p "/foo%2Ebar".match(r)
@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

The second one is probably

r = /^\/([^@%40\/?#]+)(?:@|%40)?([^@%40\/?#]+)?$/
p "/foo".match(r)
p "/foo@bar".match(r)
p "/foo.foo@bar".match(r)
p "/foo@bar.bar".match(r)

Can you check, @rkh?

Owner

floere commented Mar 22, 2012

The second one is probably

r = /^\/([^@%40\/?#]+)(?:@|%40)?([^@%40\/?#]+)?$/
p "/foo".match(r)
p "/foo@bar".match(r)
p "/foo.foo@bar".match(r)
p "/foo@bar.bar".match(r)

Can you check, @rkh?

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

OK, that just means we'll have to do special parsing (i.e. recognize that it's of to have :format? not match the .). Might need to do proper parsing of the pattern then instead of just some replacements + Regexp.new.

rkh commented Mar 22, 2012

OK, that just means we'll have to do special parsing (i.e. recognize that it's of to have :format? not match the .). Might need to do proper parsing of the pattern then instead of just some replacements + Regexp.new.

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Unsure about that. The patterns I posted actually mimic a pattern that I've seen above, and which could be summarized as:
Match all in group except the separator character(s) up to the separator character(s), then continue matching all in group except the separator character(s).
E.g. /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/

Owner

floere commented Mar 22, 2012

Unsure about that. The patterns I posted actually mimic a pattern that I've seen above, and which could be summarized as:
Match all in group except the separator character(s) up to the separator character(s), then continue matching all in group except the separator character(s).
E.g. /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Are these the routing tests? https://github.com/sinatra/sinatra/blob/v1.3.2/test/routing_test.rb Maybe we can add the nice table above to it? :) (If it's not in there already and I've overlooked it)

Owner

floere commented Mar 22, 2012

Are these the routing tests? https://github.com/sinatra/sinatra/blob/v1.3.2/test/routing_test.rb Maybe we can add the nice table above to it? :) (If it's not in there already and I've overlooked it)

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Yes, except that '.' is not a separator character. The issue is, at the moment we do a simple gsub for :format, so we don't know that :format is actually part of :name.?:format? and I got the feeling that in order to solve this properly we'd need a proper parser, as this is not describable with a regexp.

rkh commented Mar 22, 2012

Yes, except that '.' is not a separator character. The issue is, at the moment we do a simple gsub for :format, so we don't know that :format is actually part of :name.?:format? and I got the feeling that in order to solve this properly we'd need a proper parser, as this is not describable with a regexp.

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Are these the routing tests? https://github.com/sinatra/sinatra/blob/v1.3.2/test/routing_test.rb Maybe we can add the nice table above to it? :) (If it's not in there already and I've overlooked it)

All but the failing ones are actually in there already (that's were I got em from).

rkh commented Mar 22, 2012

Are these the routing tests? https://github.com/sinatra/sinatra/blob/v1.3.2/test/routing_test.rb Maybe we can add the nice table above to it? :) (If it's not in there already and I've overlooked it)

All but the failing ones are actually in there already (that's were I got em from).

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Good point with the "." not being a separator character. A parser feels like the way to go here, but I had a feeling it might be doable using regexps. I'll have a look at Base#compile.

Thanks, I believe the tests would benefit a lot from being in a tabular form, but that's just me :)

Owner

floere commented Mar 22, 2012

Good point with the "." not being a separator character. A parser feels like the way to go here, but I had a feeling it might be doable using regexps. I'll have a look at Base#compile.

Thanks, I believe the tests would benefit a lot from being in a tabular form, but that's just me :)

@rkh

This comment has been minimized.

Show comment Hide comment
@rkh

rkh Mar 22, 2012

Yeah, I meant a parser for the pattern -> regexp step, not the request path -> route parsing, I would still use a regexp there. the real question would be how to generate it. I would also like :name(.:format)? to be possible.

rkh commented Mar 22, 2012

Yeah, I meant a parser for the pattern -> regexp step, not the request path -> route parsing, I would still use a regexp there. the real question would be how to generate it. I would also like :name(.:format)? to be possible.

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Yes, I got that! :)

Owner

floere commented Mar 22, 2012

Yes, I got that! :)

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

Ok, rewriting the challenge as: Find a way to elegantly map the given patterns into their corresponding regexps, such that they work for all given examples.

Full Pattern - Regexp mapping:

/                 | /^\/$/
/foo              | /^\/foo$/
/f\u00F6\u00F6    | /^\/f%C3%B6%C3%B6$/
/:foo             | /^\/([^\/?#]+)$/
/:foo/:bar        | /^\/([^\/?#]+)\/([^\/?#]+)$/
/hello/:person    | /^\/hello\/([^\/?#]+)$/
/?:foo?/?:bar?    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/
/*                | /^\/(.*?)$/
/:foo/*           | /^\/([^\/?#]+)\/(.*?)$/
/test.bar         | /^\/test(?:\.|%2E)bar$/
/test$/           | /^\/test(?:\$|%24)\/$/
/te+st/           | /^\/te(?:\+|%2B)st\/$/
/test(bar)/       | /^\/test(?:\(|%28)bar(?:\)|%29)\/$/
/path with spaces | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/
/foo&bar          | /^\/foo(?:&|%26)bar$/
/*/foo/*/*        | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/
/:file.:ext       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/
/:name.?:format?  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/
/:user@?:host?    | /^\/([^@%40\/?#]+)(?:@|%40)?([^@%40\/?#]+)?$/
/:name.?:format?  | /^\/([^\.%2E\/?#]+)(?:\.|%2E)?([^\.%2E\/?#]+)?$/
Owner

floere commented Mar 22, 2012

Ok, rewriting the challenge as: Find a way to elegantly map the given patterns into their corresponding regexps, such that they work for all given examples.

Full Pattern - Regexp mapping:

/                 | /^\/$/
/foo              | /^\/foo$/
/f\u00F6\u00F6    | /^\/f%C3%B6%C3%B6$/
/:foo             | /^\/([^\/?#]+)$/
/:foo/:bar        | /^\/([^\/?#]+)\/([^\/?#]+)$/
/hello/:person    | /^\/hello\/([^\/?#]+)$/
/?:foo?/?:bar?    | /^\/?([^\/?#]+)?\/?([^\/?#]+)?$/
/*                | /^\/(.*?)$/
/:foo/*           | /^\/([^\/?#]+)\/(.*?)$/
/test.bar         | /^\/test(?:\.|%2E)bar$/
/test$/           | /^\/test(?:\$|%24)\/$/
/te+st/           | /^\/te(?:\+|%2B)st\/$/
/test(bar)/       | /^\/test(?:\(|%28)bar(?:\)|%29)\/$/
/path with spaces | /^\/path(?:%20|(?:\+|%2B))with(?:%20|(?:\+|%2B))spaces$/
/foo&bar          | /^\/foo(?:&|%26)bar$/
/*/foo/*/*        | /^\/(.*?)\/foo\/(.*?)\/(.*?)$/
/:file.:ext       | /^\/([^\/?#]+)(?:\.|%2E)([^\/?#]+)$/
/:name.?:format?  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/
/:user@?:host?    | /^\/([^@%40\/?#]+)(?:@|%40)?([^@%40\/?#]+)?$/
/:name.?:format?  | /^\/([^\.%2E\/?#]+)(?:\.|%2E)?([^\.%2E\/?#]+)?$/
@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 22, 2012

I wouldn't mind working on this, if you don't mind :)

Owner

floere commented Mar 22, 2012

I wouldn't mind working on this, if you don't mind :)

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 23, 2012

Ok, almost got it. Cleaning up the code and preparing for a pull request :)

Owner

floere commented Mar 23, 2012

Ok, almost got it. Cleaning up the code and preparing for a pull request :)

@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 23, 2012

Note: I'm assuming this example should actually be nil. At least it looks like that to me. If I am wrong, please tell me.

"/:name.?:format?  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/ | /.bar | [.bar, nil]"
Owner

floere commented Mar 23, 2012

Note: I'm assuming this example should actually be nil. At least it looks like that to me. If I am wrong, please tell me.

"/:name.?:format?  | /^\/([^\/?#]+)(?:\.|%2E)?([^\/?#]+)?$/ | /.bar | [.bar, nil]"
@floere

This comment has been minimized.

Show comment Hide comment
@floere

floere Mar 23, 2012

Let's move this to: sinatra/sinatra#492 Cheers!

Owner

floere commented Mar 23, 2012

Let's move this to: sinatra/sinatra#492 Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment