Created August 19, 2014 06:48
This problem involves parsing a *syntax* that defines a *block* of code
composed of two or more *statements*. The syntax supports 2 different ways to
define the block. The problem is how to determine when a statement ends.
Here is the example code written in Ruby:
    def f(x, y)
      a = x * y
      a + 1
    end
This is a function named `f` that has a block containing 2 statements.
In the problem syntax there are 2 main ways to write this:
* Indentation-based:
    \f(x, y)
      a = x * y
      a + 1
* Bracket-based:
    \f(x, y) { a = x * y; a + 1 }
There are many permutations of these. For example, the bracket-based one can be:
\f(x, y) { a
= x
* y
; a
+ 1
}
The driving principle here is that when you have braces, statements are
terminated by semicolons, or the closing brace. This gives you lots of
formatting freedom.
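To make the bracket rule concrete, here is a minimal Ruby sketch. `split_bracket_body` is a hypothetical helper, not part of any real parser for this syntax, and it assumes a single pair of braces with no nesting and no string literals. It only shows that newlines carry no meaning inside braces:

```ruby
# Hypothetical helper: split the text between '{' and '}' into statements.
# Assumption: one brace pair, no nested braces, no string literals.
# Newlines are plain whitespace here; only ';' and '}' end a statement.
def split_bracket_body(source)
  body = source[/\{(.*)\}/m, 1] or raise ArgumentError, "no braced body found"
  body.split(";").map { |stmt| stmt.split.join(" ") }.reject(&:empty?)
end

compact = '\f(x, y) { a = x * y; a + 1 }'
spread  = "\\f(x, y) { a\n= x\n* y\n; a\n+ 1\n}"

p split_bracket_body(compact)  # => ["a = x * y", "a + 1"]
p split_bracket_body(spread)   # => ["a = x * y", "a + 1"]
```

Both layouts yield the same two statements, which is the formatting freedom described above.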
In indentation-based scoping, statements are terminated by an EOL *after* a
complete construct. That is, you can split a statement across 2 lines if the
first line cannot be valid on its own:
    \f(x, y)
      a = x *
      y
      a + 1
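A rough sketch of that rule in Ruby follows. `join_incomplete_lines` is a hypothetical helper, and "cannot be valid on its own" is approximated here as "ends in a binary operator", which is a much cruder test than a real parser would use:

```ruby
# Hypothetical helper: group the lines of an indented body into statements.
# Assumption: a line is incomplete exactly when it ends in one of + - * / =
def join_incomplete_lines(lines)
  trailing_operator = %r{[-+*/=]\s*\z}
  statements = []
  pending = nil
  lines.each do |line|
    text = line.strip
    next if text.empty?
    pending = pending ? "#{pending} #{text}" : text
    next if pending.match?(trailing_operator) # EOL does not terminate yet
    statements << pending
    pending = nil
  end
  statements << pending if pending # dangling operator at end of input
  statements
end

p join_incomplete_lines(["a = x *", "y", "a + 1"])  # => ["a = x * y", "a + 1"]
p join_incomplete_lines(["a = x", "* y", "a + 1"])  # => ["a = x", "* y", "a + 1"]
```

The second call shows the limitation: a line that is already complete ends its statement, so a leading `* y` is left stranded as its own (invalid) statement.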
Let's look at Ruby, which is a typical syntax that tries to avoid semicolons
but is forced to support backslash continuation. Note: this is getting close to
the heart of this problem.
This is valid Ruby:
    def f(x, y)
      a = x *
        y
      a + 1
    end
The first statement can be split because the first half is not complete.
This is invalid Ruby:
    def f(x, y)
      a = x
        * y
      a + 1
    end
The first half is valid. The line ends, so the statement ends. The second part
is invalid, so we have a syntax error.
The fix here is a continuation marker:
    def f(x, y)
      a = x \
        * y
      a + 1
    end
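The three Ruby variants can be checked mechanically. This sketch uses `Ripper.sexp` from Ruby's standard library, which returns `nil` when the source fails to parse. The `valid_ruby?` wrapper and the variable names are only illustrative, and the indentation inside the strings is an assumption, since Ruby ignores it:

```ruby
require "ripper"

# Ripper.sexp returns nil when the source does not parse.
def valid_ruby?(source)
  !Ripper.sexp(source).nil?
end

trailing_operator = "def f(x, y)\n  a = x *\n    y\n  a + 1\nend\n"
leading_operator  = "def f(x, y)\n  a = x\n    * y\n  a + 1\nend\n"
backslash         = "def f(x, y)\n  a = x \\\n    * y\n  a + 1\nend\n"

puts valid_ruby?(trailing_operator)  # true:  first line is incomplete
puts valid_ruby?(leading_operator)   # false: "a = x" already ended
puts valid_ruby?(backslash)          # true:  explicit continuation
```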
This makes sense, because Ruby is a *bracketed* syntax with no mandatory
statement terminator (semicolon).
Here's the *new* idea that is not supported anywhere yet (afaik). If you are
using indentation scoping, you can also use indentation for statement
continuation:
    \f(x, y)
      a = x
        * y
      a + 1
The parser can look 1 token ahead for an indent and take that to mean a
continuation, provided an indent is not already valid there for some other
reason. This is a really nice addition to indentation-based scoping. Note: I've
suggested it to the CoffeeScript project, but they've not used it yet.
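As a sketch of the idea (not a real implementation), the hypothetical helper below groups body lines into statements and treats any line indented deeper than the statement's first line as a continuation. It assumes a deeper indent can never open a nested block, which sidesteps the "valid for some other reason" check entirely:

```ruby
# Hypothetical helper: group body lines into statements, treating a line
# indented deeper than the statement's first line as a continuation.
# Assumption: a deeper indent never opens a nested block in this sketch.
def join_indented_continuations(lines)
  statements = []
  base_indent = nil
  lines.each do |line|
    next if line.strip.empty?
    indent = line[/\A */].size
    if base_indent && indent > base_indent
      # Deeper than the statement start: glue onto the previous statement.
      statements[-1] = "#{statements[-1]} #{line.strip}"
    else
      statements << line.strip
      base_indent = indent
    end
  end
  statements
end

body = ["  a = x", "    * y", "  a + 1"]
p join_indented_continuations(body)  # => ["a = x * y", "a + 1"]
```

Here the leading `* y` joins the previous statement purely because of its indentation, with no backslash and no trailing operator.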
OK, that's the groundwork. The problem that vexes me is how to express this
cleanly in a grammar.
Here is a first draft:
func-def: '\' func-name func-args ( indent-body | bracket-body )
indent-body: indent i-statement* undent
bracket-body: '{' b-statement* %% SEMI '}'
i-statement: ondent i-expression EOL
b-statement: b-ws* b-expression b-ws*
The *-expression part is where it gets hard…