Created
December 9, 2011 19:40
sketch of CS line number support
# This is a sketch of how to add line number support to CoffeeScript.
# It only goes as far as getting line numbers into the AST output from
# nodes. Even that would be a huge step forward, and it would lay the
# groundwork for other features, such as line-number mappings.
#
# LEXER HOOKS (coffee-script.coffee)
#
# The first thing to do is update coffee-script.coffee. Jison has
# line-number support, but you have to play nice with Jison:
# The real Lexer produces a generic stream of tokens. This object provides a
# thin wrapper around it, compatible with the Jison API. We can then pass it
# directly as a "Jison lexer". The code below is toward the bottom of the file.
#
# (WARNING: Jison is a bit fiddly about when you initialize yylloc, so this seemingly
# simple patch took me a while to work out. Trust me, just use it!)
parser.lexer =
  lex: ->
    # Each token is [tag, value, lineno]; lineno from the Lexer is zero-based.
    [tag, @yytext, @yylineno] = @tokens[@pos] or ['']
    first_line = @yylineno + 1
    @yylloc =
      first_line: first_line
      last_line: first_line + 1
      # Note: this is the token index, not a true source column.
      first_col: @pos
    @pos++
    tag
  setInput: (@tokens) ->
    @pos = 0
  upcomingInput: ->
    ""
  # Initial location, present before the first call to lex.
  yylloc:
    first_line: 0
    first_col: 0
    last_line: 0
    last_col: 0
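# EXAMPLE (illustration only, not part of the patch). A minimal sketch of how
# the wrapper above fills in yylloc, assuming each token is a [tag, value, line]
# triple with a zero-based line, which is what the destructuring in lex relies
# on. demoLexer, demoTokens and demoLines are hypothetical names.
demoLexer =
  lex: ->
    [tag, @yytext, @yylineno] = @tokens[@pos] or ['']
    first_line = @yylineno + 1          # report one-based lines
    @yylloc = {first_line, last_line: first_line + 1, first_col: @pos}
    @pos++
    tag
  setInput: (@tokens) ->
    @pos = 0

demoTokens = [
  ['IDENTIFIER', 'x', 0]
  ['=', '=', 0]
  ['NUMBER', '1', 0]
  ['TERMINATOR', '\n', 0]
  ['IDENTIFIER', 'y', 1]
]
demoLexer.setInput demoTokens
# Collect the first_line reported for each token in turn.
demoLines = for token in demoTokens
  demoLexer.lex()
  demoLexer.yylloc.first_line
# demoLines is [1, 1, 1, 1, 2]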
# GRAMMAR HOOKS (SHIM)
#
# Once you give Jison the information it needs, then you can automatically
# update nodes within grammar.coffee. This is extremely ugly code, but it's
# useful for bootstrapping other line-number dev efforts. It gives every node
# in the AST at least an approximately correct line number, in a mostly non-invasive
# fashion. The code below is around line 33 or so of grammar.coffee.
o = (patternString, action, options) ->
  patternString = patternString.replace /\s{2,}/g, ' '
  return [patternString, '$$ = $1;', options] unless action
  action = if match = unwrap.exec action then match[1] else "(#{action}())"
  action = action.replace /\bnew /g, '$&yy.'
  action = action.replace /\b(?:Block\.wrap|extend)\b/g, 'yy.$&'
  # Number of symbols on the right-hand side of the rule.
  numTokens = (patternString.split ' ').length
  # Wrap the action so that the node it builds can record its line numbers.
  # arguments[6] is passed in as lstack, the location stack handed to the action.
  custom_js = """
    $$ = (function(lstack) {
      var node = #{action};
      if (node.updateJisonMetadata) {
        node.updateJisonMetadata(yylineno, lstack, #{numTokens});
      }
      return node;
    })(arguments[6])
  """
  [patternString, custom_js, options]
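# EXAMPLE (illustration only). numTokens is simply the number of space-separated
# symbols in the normalized pattern; updateJisonMetadata later uses it to pick
# that many entries off the end of the location stack. demoPattern and
# demoCount are hypothetical names.
demoPattern = 'CLASS SimpleAssignable   EXTENDS Expression Block'.replace /\s{2,}/g, ' '
demoCount = (demoPattern.split ' ').length
# demoCount is 5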
# NODES API, Part I
#
# Next, you will want to insert the following code in the Base class in
# nodes.coffee. I arbitrarily put this function around line 120 or so, but
# it can go anywhere in the Base class. This is the hook for line number
# support.
  # Update node w/metadata from Jison (called at parse time).
  # For nodes with children, the base class method does its best to
  # generate first/last line number ranges from its implicit children,
  # but this is less than robust, since its children get built during
  # parse time, and the children aren't necessarily objects. In practice,
  # many applications may only care about the leaf nodes in the AST.
  updateJisonMetadata: (yylineno, lstack, numTokens) ->
    n = lstack.length
    # The last numTokens entries on the location stack belong to this rule.
    tokens = lstack[(n - numTokens)...n]
    for subnode in tokens
      @firstLineNumber ||= subnode.first_line
      @lastLineNumber = subnode.last_line
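# EXAMPLE (illustration only). A sketch of the range computation on a
# hand-built location stack. DemoNode, demoStack and demoNode are hypothetical;
# real entries come from Jison and may carry more fields.
class DemoNode
  updateJisonMetadata: (yylineno, lstack, numTokens) ->
    n = lstack.length
    for loc in lstack[(n - numTokens)...n]
      @firstLineNumber ||= loc.first_line
      @lastLineNumber = loc.last_line
    this

demoStack = [
  {first_line: 1, last_line: 2}   # belongs to an earlier rule, ignored
  {first_line: 3, last_line: 4}
  {first_line: 3, last_line: 4}
  {first_line: 5, last_line: 6}
]
demoNode = (new DemoNode).updateJisonMetadata 0, demoStack, 3
# demoNode.firstLineNumber is 3 and demoNode.lastLineNumber is 6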
# NODES API, Part II
#
# Once nodes have members @firstLineNumber and @lastLineNumber, modify the
# toString() method in the Base class of nodes.coffee to output them as part of
# the nodes output. Once all this is in place, you are bootstrapped to write
# downstream tools, even if the line number support isn't perfect.
#
  toString: (idt = '', name = @constructor.name) ->
    lineno = if @firstLineNumber
      "#{@firstLineNumber} #{@lastLineNumber - 1} "
    else
      ' '
    tree = '\n' + lineno + idt + name
    tree += '?' if @soak
    @eachChild (node) -> tree += node.toString idt + TAB
    tree
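# EXAMPLE (illustration only). The prefix logic from toString in isolation.
# The "- 1" appears to undo the "+ 1" that the lexer wrapper adds when it sets
# last_line, so the printed range is inclusive. demoPrefix is a hypothetical
# helper, not something that exists in nodes.coffee.
demoPrefix = (first, last) ->
  if first then "#{first} #{last - 1} " else ' '
# demoPrefix 3, 6 is "3 5 "
# demoPrefix() is " "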
# LIMITATIONS:
#
# By hooking into the "o" function and the Base class of nodes, we get up
# and running quickly, but a long-term solution needs more precision, particularly
# for composite nodes. Some nodes follow this pattern, and basically get line
# number support for free, once you apply the patches above:
  Class: [
    o 'CLASS',                                           -> new Class
    o 'CLASS Block',                                     -> new Class null, null, $2
    o 'CLASS EXTENDS Expression',                        -> new Class null, $3
    o 'CLASS EXTENDS Expression Block',                  -> new Class null, $3, $4
    o 'CLASS SimpleAssignable',                          -> new Class $2
    o 'CLASS SimpleAssignable Block',                    -> new Class $2, null, $3
    o 'CLASS SimpleAssignable EXTENDS Expression',       -> new Class $2, $4
    o 'CLASS SimpleAssignable EXTENDS Expression Block', -> new Class $2, $4, $5
  ]

  # The condition portion of a while loop.
  WhileSource: [
    o 'WHILE Expression',                 -> new While $2
    o 'WHILE Expression WHEN Expression', -> new While $2, guard: $4
    o 'UNTIL Expression',                 -> new While $2, invert: true
    o 'UNTIL Expression WHEN Expression', -> new While $2, invert: true, guard: $4
  ]
# It's the nodes like the one below that require more specific surgery or some
# kind of rethink. The If node will get line numbers when it's parsed, but they
# won't include the ELSE block.
  IfBlock: [
    o 'IF Expression Block',              -> new If $2, $3, type: $1
    o 'IfBlock ELSE IF Expression Block', -> $1.addElse new If $4, $5, type: $3
  ]