I like that names have to be unique. Some regexp flavours allow duplicates and unify the storage between them,
but that feels complicated and difficult to spec. For something like this:
/(?<foo>..)((?<foo>..))*/
normally you would reset the capture whenever you iterate the *-loop, but it would be strange to delete the
foo capture on entering the loop, or would it?
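For reference, the loop reset I mean can already be observed with plain numbered captures. A minimal sketch (the pattern and input are made up for illustration):

```javascript
// Numbered captures only, so this does not depend on named-group support.
// Group 1 sits inside a *-loop; the second iteration takes the "b" branch.
const loopMatch = /(?:(a)|b)*/.exec("ab");

console.log(loopMatch[0]); // "ab"
console.log(loopMatch[1]); // undefined: the capture from the first iteration was reset
```

With duplicate names, the open question is whether the outer foo would get the same treatment when the loop is entered.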
By putting the named captures on the match object as properties, you prevent any future standard from
ever adding a new property to the match object, since it might conflict with the name of a named capture.
Perhaps it makes more sense to add a .map property of type Map to the match object, with the capture names as string keys.
This also avoids the question of what happens if someone names a capture __proto__ or prototype.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map
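A rough sketch of the collision problem and of how a Map sidesteps it. The names plainStore and mapStore are purely illustrative and not part of any proposal:

```javascript
// A match result already carries properties that are not captures.
const match = /a/.exec("xa");
console.log(match.index); // 1
console.log(match.input); // "xa"

// Plain object as a capture store: "__proto__" hits the inherited accessor,
// and assigning a string to it is silently ignored.
const plainStore = {};
plainStore["__proto__"] = "captured text";
const storedOnPlain = Object.prototype.hasOwnProperty.call(plainStore, "__proto__");
console.log(storedOnPlain); // false

// Map as a capture store: keys are just strings, so nothing collides.
const mapStore = new Map();
mapStore.set("__proto__", "captured text");
mapStore.set("index", "another capture");
console.log(mapStore.get("__proto__")); // "captured text"
```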
You picked the .NET syntax for backreferences, \k<name>, instead of the Python syntax, (?P=name). I think the
Python one was probably better, because JS does not raise a syntax error for unknown alpha escapes, so \k<name>
would previously match the literal string "k<name>", whereas (?P...) would previously cause a syntax error.
If you switch to the Python syntax for backreferences, you should of course do the same for the captures
themselves.
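To make the compatibility point concrete, here is a small sketch of how the two syntaxes behave in a pattern without the u flag and without any named groups:

```javascript
// \k is treated as an identity escape here, so the pattern matches the text "k<name>".
const legacyBackref = new RegExp("\\k<name>");
console.log(legacyBackref.test("k<name>")); // true

// The Python-style backreference is rejected as invalid syntax.
let pythonStyleError = null;
try {
  new RegExp("(?P=name)");
} catch (e) {
  pythonStyleError = e;
}
console.log(pythonStyleError instanceof SyntaxError); // true
```

So existing code could in principle contain \k<name> and rely on the literal meaning, while nothing working can contain (?P=name).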
.NET numbers all the unnamed captures first, from left to right, then numbers the named ones from left to
right. Most other flavours number all captures from left to right regardless of whether they are named. I think
you went with the non-.NET version, which feels right.
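A short sketch of the left-to-right numbering, assuming an engine that implements named groups with the (?<name>...) syntax and exposes them on a groups sub-object (the group names are made up):

```javascript
// Named and unnamed groups share one numbering, in order of their opening parenthesis.
const numbered = /(?<first>x)(y)(?<last>z)/.exec("xyz");

console.log(numbered[1]); // "x" (the named group "first" is group 1)
console.log(numbered[2]); // "y" (the unnamed group is group 2)
console.log(numbered[3]); // "z" (the named group "last" is group 3)
console.log(numbered.groups.first, numbered.groups.last); // "x" "z"
```

Under the .NET scheme the unnamed (y) would be group 1 instead.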
Re referencing captures within a replacement closure: perhaps it's cleaner to just augment "replace" with a replaceMatch whose closure takes a single match object, like the array RegExp.prototype.exec returns.
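Roughly what I have in mind, written as a free function. This is only a sketch: replaceMatch does not exist anywhere, and the signature and the handling of the g flag are my assumptions.

```javascript
// Hypothetical helper: like String.prototype.replace with a closure, except the
// closure receives one exec-style match object instead of positional arguments.
function replaceMatch(input, regexp, callback) {
  // Work on a global copy so the caller's lastIndex is left untouched.
  const flags = regexp.flags.includes("g") ? regexp.flags : regexp.flags + "g";
  const re = new RegExp(regexp.source, flags);
  let out = "";
  let last = 0;
  let m;
  while ((m = re.exec(input)) !== null) {
    out += input.slice(last, m.index) + callback(m);
    last = m.index + m[0].length;
    if (m[0].length === 0) re.lastIndex++; // avoid looping forever on empty matches
    if (!regexp.global) break; // without g, replace only the first match
  }
  return out + input.slice(last);
}

const swapped = replaceMatch("2015-01-02", /(\d+)-(\d+)-(\d+)/, (m) => `${m[3]}/${m[2]}/${m[1]}`);
console.log(swapped); // "02/01/2015"

const wrapped = replaceMatch("a1b22", /\d+/g, (m) => `<${m[0]}>`);
console.log(wrapped); // "a<1>b<22>"
```

The closure then reads named captures off the match object in whatever way exec exposes them, so replace itself needs no new argument conventions.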
I guess you support $<foo>, like $1, when the replacement is a string rather than a closure?
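Something like the following is what I would expect. A sketch, assuming an engine where named groups and $<name> substitutions are implemented (the group names are made up):

```javascript
// $<name> in a replacement string refers to a named capture, the way $1 refers to a numbered one.
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const byName = "2015-01-02".replace(datePattern, "$<day>/$<month>/$<year>");
console.log(byName); // "02/01/2015"

// The numbered form keeps working on the same pattern.
const byNumber = "2015-01-02".replace(datePattern, "$3/$2/$1");
console.log(byNumber); // "02/01/2015"

// If the pattern has no named groups, "$<foo>" is left in the output as literal text.
const literal = "abc".replace(/b/, "$<foo>");
console.log(literal); // "a$<foo>c"
```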
I agree that adding named captures as properties on the match object is a bad idea, for the mentioned reasons.
One solution I could get behind is having a sub-object for the captures. Another way could be to change the match object to be a map, so that you could do something like:
/(?<foo>abc)/.exec("abc").get("foo")
I think using maps in any context here (either as the match object itself, or as the match.groups object) would be a pretty significant performance hit.
+1 for storing captures on a sub-object, where the sub-object is a plain JS object with one property per named capture, and the sub-object exists if and only if the regexp has named captures.
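For illustration, this is how the sub-object approach looks in engines that expose named captures on a groups property of the match object:

```javascript
// With named captures: the sub-object exists and has one property per name.
const withNames = /(?<foo>abc)/.exec("abc");
console.log(withNames.groups.foo); // "abc"

// Without named captures: there is no sub-object at all.
const withoutNames = /abc/.exec("abc");
console.log(withoutNames.groups); // undefined
```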
At some point we could introduce the /x syntax and that would also let us fix the other syntactic strangeness without forcing /u semantics. My favourite one is, what does /\c./ match?
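For anyone who wants to try the \c. puzzle, a small sketch of what I would expect from an engine following the web-compatibility (Annex B) grammar; treat the comments as my reading of that grammar:

```javascript
// Without the u flag, a \c that is not followed by a letter is not a control escape.
// The backslash matches itself, "c" matches "c", and "." matches any one character.
const controlDot = new RegExp("\\c.");
const hit = controlDot.exec("\\c!"); // input is the three characters \ c !
console.log(hit[0] === "\\c!"); // true
console.log(controlDot.test("c!")); // false: the backslash is required

// With the u flag the same pattern is rejected.
let unicodeError = null;
try {
  new RegExp("\\c.", "u");
} catch (e) {
  unicodeError = e;
}
console.log(unicodeError instanceof SyntaxError); // true
```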