@Xliff
Last active May 27, 2018 14:25

I have the following token in a grammar that is behaving very mysteriously:

our token ident_sys is export {
  <keyword>
  |
  $<o>=[ <:Letter + [ _ @ # ]> <:Letter + [ _ @ # $ ]>* ]
  |
  # YYY: Verify that this is IDENT_QUOTED
  '"' [ <keyword> | <~~> ] '"'
  |
  "'" [ <keyword> | <~~> ] "'"
}

Given that definition, you would expect each of the following inputs to fail to match:

!not
$dolla
0\@notavar
%caseabeer
'misquoted"
"misquoted2'

Yet each one produces a match. Here is the gist of each resulting match object:

# 「not」
#  ws => 「」
#  ident_sys => 「not」
#   o => 「not」
#  ws => 「」

# 「dolla」
#  ws => 「」
#  ident_sys => 「dolla」
#   o => 「dolla」
#  ws => 「」

# 「@notavar」
#  ws => 「」
#  ident_sys => 「@notavar」
#   o => 「@notavar」
#  ws => 「」

# 「caseabeer」
#  ws => 「」
#  ident_sys => 「caseabeer」
#   o => 「caseabeer」
#  ws => 「」

# 「misquoted」
#  ws => 「」
#  ident_sys => 「misquoted」
#   o => 「misquoted」
#  ws => 「」

The only input above that actually fails, rather than matching with its first character dropped, is: "misquoted2'.

What is gobbling up the first character?
