JavaScript regex grammar quirk

This gist addresses some confusion arising from this StackOverflow question.

Refer to sections 15.5.4.10 (String.prototype.match) and 15.5.4.11 (String.prototype.replace) of the ECMAScript 5 spec - page 145ish.

Before any passes, the cursor is at the start of the string:

, " , f , o , o ,   , b , a , r , " ,
^

lastIndex = 0
thisIndex = undefined
previousLastIndex = 0
n = 0
A = []

After matching the start of string quote, the cursor is here:

, " , f , o , o ,   , b , a , r , " ,
    ^

lastIndex = 1
thisIndex = 1
previousLastIndex = 0
n = 1
A = ['"']

After matching the end of string quote, the cursor is here:

, " , f , o , o ,   , b , a , r , " ,
                                    ^

lastIndex = 9
thisIndex = 9
previousLastIndex = 1
n = 2
A = ['"', '"']

At this point, previousLastIndex does not equal thisIndex, so we try to match again from the cursor position. This succeeds, because '0 double-quotes followed by the end of string' is a valid, empty match. After that, the cursor has not moved:

, " , f , o , o ,   , b , a , r , " ,
                                    ^

lastIndex = 9
thisIndex = 9
previousLastIndex = 9
n = 3
A = ['"', '"', '']
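
The empty match at the end of the string can be reproduced directly with `exec`. This is only a minimal sketch: the gist never quotes the regex, so the pattern `/^"*|"*$/g` below is an assumption inferred from the description, and the input `"foo bar"` is taken from the diagrams.

```javascript
// Assumed pattern (not quoted in the gist): zero or more quotes at the
// start of the string, or zero or more quotes at the end.
const re = /^"*|"*$/g;
const input = '"foo bar"'; // 9 characters, indices 0..8

// Put the cursor where it sits after the closing quote has been matched.
re.lastIndex = 9;
const m = re.exec(input);

console.log(JSON.stringify(m[0])); // "" (an empty match)
console.log(m.index);              // 9
console.log(re.lastIndex);         // 9 (an empty match does not advance the cursor)
```

Note that `exec` itself leaves `lastIndex` at 9, so it is the caller's job to step past the empty match.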

Since previousLastIndex == thisIndex, we set lastIndex to thisIndex+1, which is 10. That is past the end of the 9-character string, so the next match attempt fails and the loop ends.
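
The whole walk-through can be replayed with a hand-written version of the loop. This is a rough sketch of the global branch of 15.5.4.10 as I read it, not the engine's actual code. `specMatchLoop` is a hypothetical helper name, and the pattern `/^"*|"*$/g` is an assumption, since the gist does not quote the regex.

```javascript
// Hypothetical helper: a plain-JS rendering of the global match loop.
function specMatchLoop(rx, S) {
  rx.lastIndex = 0;
  const A = [];
  let previousLastIndex = 0;
  let n = 0;
  let lastMatch = true;

  while (lastMatch) {
    const result = rx.exec(S);
    if (result === null) {
      lastMatch = false;
    } else {
      const thisIndex = rx.lastIndex;
      if (thisIndex === previousLastIndex) {
        // No progress since the previous pass: step past this position.
        rx.lastIndex = thisIndex + 1;
        previousLastIndex = thisIndex + 1;
      } else {
        previousLastIndex = thisIndex;
      }
      A[n] = result[0];
      n++;
    }
  }
  return A;
}

// Assumed pattern and the input from the diagrams.
const matches = specMatchLoop(/^"*|"*$/g, '"foo bar"');
console.log(matches); // three matches: '"', '"' and ''
```

The `thisIndex === previousLastIndex` branch only fires on the third pass, which is what lets the empty match be recorded once before the loop terminates.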

And so the replacement replaces each of our three matches, ['"', '"', ''], with '"', which leaves one double-quote at the beginning and two at the end.
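
The end result is easy to check in a console. As before, the pattern `/^"*|"*$/g` is an assumption, since the gist does not quote the regex. The input and the `'"'` replacement come from the text above.

```javascript
const input = '"foo bar"';
const pattern = /^"*|"*$/g; // assumed pattern, not quoted in the gist

// Each of the three matches is replaced by a single double-quote.
const output = input.replace(pattern, '"');

console.log(output);        // "foo bar""
console.log(output.length); // 10: one quote more than the 9-character input
```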
