@jordansissel
Created December 17, 2011 21:20
Confusing unescaping when calling String#sub?
string = "hello world"
expected = DATA.read.chomp
# Replace the entire string with the expected value.
actual = string.sub(string, expected)
puts "Input: #{string}"
puts "Expected result: #{expected}"
puts "Actual result: #{actual}"
puts "Equal: #{actual == expected}"
__END__
Hurray for slashes! \\testing//

While debugging some Ruby grok bugs, I found something very strange: String#sub replaces every double backslash in the replacement string with a single backslash.

Input: hello world
Expected result: Hurray for slashes! \\testing//
Actual result: Hurray for slashes! \testing//
Equal: false

This is quite confusing. I said "replace this string with that string", and it took 'that string' and replaced all double backslashes with single backslashes. This is very strange and certainly a bug.

This is probably a side effect of String#sub supporting capture groups when used with regexps. From the Ruby docs (http://www.ruby-doc.org/core-1.9.3/String.html):

If replacement is a String it will be substituted for the matched text. It may contain back-references to the pattern’s capture groups of the form \d, where d is a group number, or \k<n>, where n is a group name. If it is a double-quoted string, both back-references must be preceded by an additional backslash. However, within replacement the special match variables, such as $&, will not refer to the current match.
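A small sketch of what the quoted documentation implies, using made-up inputs: even when the pattern is a plain String, the replacement still seems to be scanned for backslash sequences, so `\0` expands to the matched text and a doubled backslash collapses to one.

```ruby
# The pattern here is a plain String, not a Regexp, yet the replacement
# is still scanned for backslash sequences.

# '\0' in single quotes is a literal backslash followed by 0,
# which the replacement treats as "the whole match".
doubled = "hello world".sub("world", '\0 \0')

# 'a\\\\b' in single quotes is: a, backslash, backslash, b.
# The replacement collapses the backslash pair into one backslash.
collapsed = "x".sub("x", 'a\\\\b')

puts doubled    # hello world world
puts collapsed  # a\b
```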

I think this is a bug. If the first argument to String#sub() is a string, then there will be no capturing so there is no need to process backslashes and capture groups in the replacement string.

@jordansissel
Author

The reason I noticed this is that both arguments to my String#sub() call come from user input, and the user, blissfully unaware of the implementation oddities, has no reason to expect two consecutive backslashes to be turned into a single backslash.
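One possible workaround for the user-input case, sketched under the assumption that the replacement should always be copied verbatim: pass it through the block form of String#sub, whose return value is inserted without back-reference processing. The helper name `sub_literal` is hypothetical and not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): replace the first literal
# occurrence of `pattern` in `string` with `replacement`, verbatim.
def sub_literal(string, pattern, replacement)
  # The block's return value is substituted as-is; sequences such as
  # \\ or \1 in it are not interpreted.
  string.sub(pattern) { replacement }
end

# Single-quoted literal: the text contains two consecutive backslashes.
expected = 'Hurray for slashes! \\\\testing//'
actual = sub_literal("hello world", "hello world", expected)

puts "Actual result: #{actual}"
puts "Equal: #{actual == expected}"
```

With the block form the double backslash should survive, so the comparison from the original script comes out equal.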
