@gussan
Created November 19, 2012 06:18
Fix Ruby's URI parser to accept subdomains containing underscores
--- a/uri/common.rb
+++ b/uri/rfc3986_common.rb
@@ -45,9 +45,9 @@ module URI
RESERVED = ";/?:@&=+$,\\[\\]"
# domainlabel = alphanum | alphanum *( alphanum | "-" ) alphanum
- DOMLABEL = "(?:[#{ALNUM}](?:[-#{ALNUM}]*[#{ALNUM}])?)"
+ DOMLABEL = "(?:[#{ALNUM}](?:[-_#{ALNUM}]*[#{ALNUM}])?)"
# toplabel = alpha | alpha *( alphanum | "-" ) alphanum
- TOPLABEL = "(?:[#{ALPHA}](?:[-#{ALNUM}]*[#{ALNUM}])?)"
+ TOPLABEL = "(?:[#{ALPHA}](?:[-_#{ALNUM}]*[#{ALNUM}])?)"
# hostname = *( domainlabel "." ) toplabel [ "." ]
HOSTNAME = "(?:#{DOMLABEL}\\.)*#{TOPLABEL}\\.?"
@@ -365,7 +365,7 @@ module URI
# hostname = *( domainlabel "." ) toplabel [ "." ]
# reg-name = *( unreserved / pct-encoded / sub-delims ) # RFC3986
unless hostname
- ret[:HOSTNAME] = hostname = "(?:[a-zA-Z0-9\\-.]|%\\h\\h)+"
+ ret[:HOSTNAME] = hostname = "(?:[a-zA-Z0-9\\-._]|%\\h\\h)+"
end
# RFC 2373, APPENDIX B:
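
The effect of the first hunk can be checked without patching the library. The following is a minimal sketch, not part of the patch: it rebuilds the hostname pattern outside of `URI`, assuming `ALPHA` and `ALNUM` are the plain ASCII classes used in `uri/common.rb`. The helper `hostname_pattern` and the constants `ORIGINAL_HOSTNAME` and `PATCHED_HOSTNAME` are hypothetical names introduced only for this illustration.

```ruby
# Assumed definitions of the character classes from uri/common.rb.
ALPHA = "a-zA-Z"
ALNUM = "#{ALPHA}\\d"

# Hypothetical helper: builds an anchored HOSTNAME regexp the same way the
# diff does, with `inner` as the extra characters allowed inside a label.
def hostname_pattern(inner)
  domlabel = "(?:[#{ALNUM}](?:[#{inner}#{ALNUM}]*[#{ALNUM}])?)"
  toplabel = "(?:[#{ALPHA}](?:[#{inner}#{ALNUM}]*[#{ALNUM}])?)"
  Regexp.new("\\A(?:#{domlabel}\\.)*#{toplabel}\\.?\\z")
end

ORIGINAL_HOSTNAME = hostname_pattern("-")   # before the patch
PATCHED_HOSTNAME  = hostname_pattern("-_")  # after the patch

puts ORIGINAL_HOSTNAME.match?("my_app.example.com")  # false
puts PATCHED_HOSTNAME.match?("my_app.example.com")   # true
puts PATCHED_HOSTNAME.match?("_dmarc.example.com")   # false
```

Because the first and last character of each label must still come from `ALNUM`, the patched pattern accepts underscores only in the interior of a label; names with a leading underscore, such as `_dmarc.example.com`, are still rejected.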