@showell
Created December 9, 2011 19:40
sketch of CS line number support
# This is a sketch of how to add line number support to CoffeeScript.
# It only goes as far as getting line numbers into the AST output from
# nodes. Even that would be a huge step forward, and it would lay the
# groundwork for other features, such as line-number mappings.
#
# LEXER HOOKS (coffee-script.coffee)
#
# The first thing to do is update coffee-script.coffee. Jison has
# line-number support, but you have to play nice with it. The real Lexer
# produces a generic stream of tokens; this object provides a thin wrapper
# around it, compatible with the Jison API, which we can then pass directly
# as a "Jison lexer". The code below is toward the bottom of the file.
#
# (WARNING: Jison is a bit fiddly about when you initialize yylloc, so this seemingly
# simple patch took me a while to work out. Trust me, just use it!)
parser.lexer =
  lex: ->
    [tag, @yytext, @yylineno] = @tokens[@pos] or ['']
    first_line = @yylineno + 1
    @yylloc =
      first_line: first_line
      last_line: first_line + 1
      first_col: @pos
    @pos++
    tag
  setInput: (@tokens) ->
    @pos = 0
  upcomingInput: ->
    ""
  yylloc:
    first_line: 0
    first_col: 0
    last_line: 0
    last_col: 0
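# USAGE SKETCH (illustrative only, not part of the patch)
#
# To see what the wrapper above reports, it can help to drive a standalone
# copy by hand. Everything named here is an assumption made for the demo:
# `demoLexer` is a copy of the object above detached from `parser`, and the
# token triples are hand-written in the [tag, value, lineno] shape that the
# destructuring in `lex` expects, with 0-based line numbers.
demoLexer =
  lex: ->
    [tag, @yytext, @yylineno] = @tokens[@pos] or ['']
    first_line = @yylineno + 1      # report 1-based lines
    @yylloc =
      first_line: first_line
      last_line: first_line + 1     # stored one past the token's line
      first_col: @pos               # token index, not a true column
    @pos++
    tag
  setInput: (@tokens) ->
    @pos = 0

demoLexer.setInput [
  ['IDENTIFIER', 'x', 0]
  ['=', '=', 0]
  ['NUMBER', '1', 0]
  ['TERMINATOR', '\n', 0]
  ['IDENTIFIER', 'y', 1]
]

# lex() returns '' once the tokens run out, which ends the loop.
seen = []
while tag = demoLexer.lex()
  seen.push "#{tag} #{demoLexer.yylloc.first_line} #{demoLexer.yylloc.last_line}"
console.log seen.join '\n'
# First line printed: IDENTIFIER 1 2
# Last line printed:  IDENTIFIER 2 3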
# GRAMMAR HOOKS (SHIM)
#
# Once you give Jison the information it needs, then you can automatically
# update nodes within grammar.coffee. This is extremely ugly code, but it's
# useful for bootstrapping other line-number dev efforts. It gives every node
# in the AST at least an approximately correct line number, in a mostly non-invasive
# fashion. The code below is around line 33 or so of grammar.coffee.
o = (patternString, action, options) ->
  patternString = patternString.replace /\s{2,}/g, ' '
  return [patternString, '$$ = $1;', options] unless action
  action = if match = unwrap.exec action then match[1] else "(#{action}())"
  action = action.replace /\bnew /g, '$&yy.'
  action = action.replace /\b(?:Block\.wrap|extend)\b/g, 'yy.$&'
  numTokens = (patternString.split ' ').length
  custom_js = """
    $$ = (function(lstack) {
      var node = #{action};
      if (node.updateJisonMetadata) {
        node.updateJisonMetadata(yylineno, lstack, #{numTokens});
      };
      return node;
    })(arguments[6])
    """
  [patternString, custom_js, options]
# NODES API, Part I
#
# Next, you will want to insert the following code in the Base class in
# nodes.coffee. I arbitrarily put this function around line 120 or so, but
# it can go anywhere in the Base class. This is the hook for line number
# support.
  # Update node w/metadata from Jison (called at parse time).
  # For nodes with children, the base class method does its best to
  # generate first/last line number ranges from its implicit children,
  # but this is less than robust, since its children get built during
  # parse time, and the children aren't necessarily objects. In practice,
  # many applications may only care about the leaf nodes in the AST.
  updateJisonMetadata: (yylineno, lstack, numTokens) ->
    n = lstack.length
    tokens = lstack[(n - numTokens)...n]
    for subnode in tokens
      @firstLineNumber ||= subnode.first_line
      @lastLineNumber = subnode.last_line
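# USAGE SKETCH (illustrative only, not part of the patch)
#
# A minimal, self-contained illustration of how updateJisonMetadata slices
# the location stack. FakeNode and the hand-built demoStack are hypothetical
# stand-ins: in the real compiler the method lives on Base and the stack is
# whatever Jison passes to the shim (arguments[6] in the "o" function).
class FakeNode
  updateJisonMetadata: (yylineno, lstack, numTokens) ->
    n = lstack.length
    # Only the last numTokens entries belong to the rule being reduced.
    for loc in lstack[(n - numTokens)...n]
      @firstLineNumber ||= loc.first_line
      @lastLineNumber = loc.last_line
    this

# Pretend the parser is reducing a two-symbol rule such as 'WHILE Expression'.
# The first entry belongs to an earlier, unrelated symbol and is skipped.
demoStack = [
  {first_line: 1, last_line: 2}
  {first_line: 3, last_line: 4}
  {first_line: 5, last_line: 6}
]
demoNode = new FakeNode
demoNode.updateJisonMetadata 4, demoStack, 2
console.log demoNode.firstLineNumber, demoNode.lastLineNumber   # prints: 3 6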
# NODES API, Part II
#
# Once nodes have members @firstLineNumber and @lastLineNumber, modify the
# toString() method in the Base class of nodes.coffee to include them in
# the nodes output. Once all this is in place, you are bootstrapped to write
# downstream tools, even if the line number support isn't perfect.
  toString: (idt = '', name = @constructor.name) ->
    lineno = if @firstLineNumber
      "#{@firstLineNumber} #{@lastLineNumber - 1} "
    else
      ' '
    tree = '\n' + lineno + idt + name
    tree += '?' if @soak
    @eachChild (node) -> tree += node.toString idt + TAB
    tree
# LIMITATIONS:
#
# By hooking into the "o" function and the Base class of nodes, we get up
# and running quickly, but a long term solution needs more precision, particularly
# for composite nodes. Some nodes follow this pattern, and basically get line
# number support for free, once you apply the patches above:
  Class: [
    o 'CLASS', -> new Class
    o 'CLASS Block', -> new Class null, null, $2
    o 'CLASS EXTENDS Expression', -> new Class null, $3
    o 'CLASS EXTENDS Expression Block', -> new Class null, $3, $4
    o 'CLASS SimpleAssignable', -> new Class $2
    o 'CLASS SimpleAssignable Block', -> new Class $2, null, $3
    o 'CLASS SimpleAssignable EXTENDS Expression', -> new Class $2, $4
    o 'CLASS SimpleAssignable EXTENDS Expression Block', -> new Class $2, $4, $5
  ]
  # The condition portion of a while loop.
  WhileSource: [
    o 'WHILE Expression', -> new While $2
    o 'WHILE Expression WHEN Expression', -> new While $2, guard: $4
    o 'UNTIL Expression', -> new While $2, invert: true
    o 'UNTIL Expression WHEN Expression', -> new While $2, invert: true, guard: $4
  ]
# It's nodes like the one below that require more specific surgery or some kind of
# rethink. The If node will get line numbers when it's parsed, but they won't
# include the ELSE block.
  IfBlock: [
    o 'IF Expression Block', -> new If $2, $3, type: $1
    o 'IfBlock ELSE IF Expression Block', -> $1.addElse new If $4, $5, type: $3
  ]