the vscode-tree-sitter "adding a new language" recipe is a 10-step one (though not all steps appear to be necessary from a technical standpoint).
apparently verilog support was added via a PR so there is an example that might be inspected for reference.
there are 2 candidates for a clojure grammar:
on first glance, the former looks like it may have had more development (62 commits over a few months). the latter has 3 commits in 1 day.
for the moment, will try the former.
one notable aspect of vscode-tree-sitter is that it makes use of web-tree-sitter. haven't verified, but i think what this means is that the extension uses node's ability to execute wasm. tree-sitter is usually seen as requiring the use of C code compiled to native code. if vscode-tree-sitter is really just using wasm to achieve tree-sitter functionality, the extension doesn't need to have native code for each of the typical 3 platforms (i.e. linux, macos, and windows).
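as a rough illustration of the wasm point (not taken from vscode-tree-sitter or web-tree-sitter), the sketch below has node instantiate a tiny hand-assembled wasm module that exports an add function. the bytes and the name add are made up for this example. the only point is that the same bytes run wherever node's built-in WebAssembly support is available, with no per-platform native build.

```typescript
// A minimal hand-assembled wasm module (illustrative only, unrelated to tree-sitter).
// It exports one function: add(a: i32, b: i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add, end
])

// Compile and instantiate synchronously; no native addon is involved.
const wasmModule = new WebAssembly.Module(wasmBytes)
const wasmInstance = new WebAssembly.Instance(wasmModule)
const add = wasmInstance.exports.add as unknown as (a: number, b: number) => number

console.log(add(2, 3)) // prints 5
```

a real grammar .wasm is of course much larger, but presumably it gets loaded through the same kind of mechanism.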
steps 0, 1, 3, 4, 5, 6, 7, 8 completed. step 2 has a functioning piece, but needs work.
get tree-sitter-clojure set up locally.
this turns out to not work out-of-the-box for node 12 (at least not here).
after upgrading a couple of dependencies (tree-sitter-cli and nan) to their latest and not succeeding, downgrading node to 10 seemed to do the trick, though there seem to be some people who believe that shouldn't be necessary...
relevant issues include:
interestingly, the aforementioned tree-sitter-clojure dev setup instructions point to an issue (oakmac/tree-sitter-clojure#17) in which pedrorgirardi appears 🙂
add a dependency for tree-sitter-clojure to vscode-tree-sitter.
initial thoughts: iiuc, there is no npm package for tree-sitter-clojure, so adding a git url, a github url, or a local dependency in package.json might be a path forward.
decided to add:
"tree-sitter-clojure": "git+https://github.com/oakmac/tree-sitter-clojure",
to the "devDependencies" section of package.json.
npm install was successful with node v10.18.1.
on a side note, there was a vulnerability reported so ran npm audit fix.
add a color function to ./lib/colors.ts
initial thoughts: atm, there is no lib directory on the master branch, but the commit for adding verilog support added colorVerilog to ./src/color.ts, so maybe doing something analogous will work.
it turns out that though the description of this step is possibly the shortest, this is the most involved step. there doesn't appear to be any documentation about how to create a color function, though there are multiple examples.
a skeleton function was first created based on existing examples from color.ts. specifically, the colorTypescript function was copy-modified to return an appropriate but mostly empty Map<string, Range[]>.
a slightly filled in version of the function is:
export function colorClojure(root: Parser.Tree, visibleRanges: {start: number, end: number}[]) {
  const functions: Range[] = []
  const keywords: Range[] = []
  let visitedChildren = false
  let cursor = root.walk()
  let parents = [cursor.nodeType]
  while (true) {
    // Advance cursor
    if (visitedChildren) {
      if (cursor.gotoNextSibling()) {
        visitedChildren = false
      } else if (cursor.gotoParent()) {
        parents.pop()
        visitedChildren = true
        continue
      } else {
        break
      }
    } else {
      const parent = cursor.nodeType
      if (cursor.gotoFirstChild()) {
        parents.push(parent)
        visitedChildren = false
      } else {
        visitedChildren = true
        continue
      }
    }
    // Skip nodes that are not visible
    if (!visible(cursor, visibleRanges)) {
      visitedChildren = true
      continue
    }
    // Color tokens
    console.log(cursor.nodeType) // debug output: shows which node types the grammar emits
    switch (cursor.nodeType) {
      case 'function_name':
        functions.push({start: cursor.startPosition, end: cursor.endPosition})
        break
      case 'keyword':
        keywords.push({start: cursor.startPosition, end: cursor.endPosition})
        break
    }
  }
  cursor.delete()
  // Function names are reported under the 'variable' scope for now
  return new Map([
    ['variable', functions],
    ['keyword', keywords],
  ])
}
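since there is no documentation for writing a color function, here is a small self-contained sketch of just the cursor walk used above, run against a mock tree instead of a real parse tree. SketchNode, MockCursor, and collectTypes are hypothetical names invented for this example, and the mock only imitates the three cursor methods the function relies on (gotoFirstChild, gotoNextSibling, gotoParent). it is meant to show the visiting order, not how web-tree-sitter is implemented.

```typescript
// Hypothetical stand-in for a parse tree node.
interface SketchNode { type: string; children: SketchNode[] }

// Hypothetical cursor that imitates the navigation methods used by colorClojure.
class MockCursor {
  // Stack of ancestors, each with the index of the child currently selected.
  private path: { node: SketchNode; index: number }[] = []
  constructor(private node: SketchNode) {}
  get nodeType(): string { return this.node.type }
  gotoFirstChild(): boolean {
    if (this.node.children.length === 0) return false
    this.path.push({ node: this.node, index: 0 })
    this.node = this.node.children[0]
    return true
  }
  gotoNextSibling(): boolean {
    const top = this.path[this.path.length - 1]
    if (!top || top.index + 1 >= top.node.children.length) return false
    top.index += 1
    this.node = top.node.children[top.index]
    return true
  }
  gotoParent(): boolean {
    const top = this.path.pop()
    if (!top) return false
    this.node = top.node
    return true
  }
}

// Same loop shape as colorClojure: descend first, then siblings, then back up.
// Note that the root itself is never inspected, only its descendants.
function collectTypes(root: SketchNode, wanted: Set<string>): string[] {
  const found: string[] = []
  const cursor = new MockCursor(root)
  let visitedChildren = false
  while (true) {
    if (visitedChildren) {
      if (cursor.gotoNextSibling()) {
        visitedChildren = false
      } else if (cursor.gotoParent()) {
        visitedChildren = true
        continue
      } else {
        break
      }
    } else {
      if (cursor.gotoFirstChild()) {
        visitedChildren = false
      } else {
        visitedChildren = true
        continue
      }
    }
    if (wanted.has(cursor.nodeType)) found.push(cursor.nodeType)
  }
  return found
}

// Made-up tree; the shape is not what tree-sitter-clojure actually produces.
const leaf = (type: string): SketchNode => ({ type, children: [] })
const sampleTree: SketchNode = {
  type: 'program',
  children: [
    { type: 'list', children: [leaf('function_name'), leaf('keyword'), leaf('number')] },
    leaf('keyword'),
  ],
}

console.log(collectTypes(sampleTree, new Set(['function_name', 'keyword'])))
```

nodes come out in document order (pre-order), which is presumably why the real function can push ranges as it goes without sorting them afterwards.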
add a language to the dictionary at the top of ./lib/extension.ts
initial thoughts: similarly to step 2, emulating the verilog case of adding to src/extension.ts is one idea.
the following line was enough:
'clojure': {module: 'tree-sitter-clojure', color: colors.colorClojure},
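to make the shape of that entry a bit more concrete, here is a simplified sketch of such a dictionary. the types (SketchRange, LanguageEntry), the stub color function, and the wasmFileFor helper are all hypothetical and not copied from extension.ts. in particular, deriving the parser path from the module name is only an assumption, based on the generated parser being named tree-sitter-clojure.wasm and living in the parsers directory.

```typescript
// Hypothetical, simplified types; the real extension uses web-tree-sitter's types.
type SketchRange = { start: { row: number; column: number }; end: { row: number; column: number } }
type ColorFunction = () => Map<string, SketchRange[]>

interface LanguageEntry {
  module: string       // name of the grammar module
  color: ColorFunction // function that maps scopes to ranges
}

// Stub standing in for colors.colorClojure.
const colorClojureStub: ColorFunction = () => new Map<string, SketchRange[]>()

// Dictionary keyed by VSCode language id.
const languages: { [id: string]: LanguageEntry } = {
  'clojure': { module: 'tree-sitter-clojure', color: colorClojureStub },
}

// Assumption: the parser for a language lives at parsers/<module>.wasm.
function wasmFileFor(languageId: string): string | undefined {
  const entry = languages[languageId]
  return entry ? `parsers/${entry.module}.wasm` : undefined
}

console.log(wasmFileFor('clojure'))
```

if that assumption holds, it would explain why the dictionary entry alone is not enough and a matching .wasm file has to exist as well.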
add a simplified textmate grammar as ./textmate/clojure.tmLanguage.json
initial thoughts: maybe a simplified version of what calva uses, or just use that as-is?
technically, this step doesn't seem to be necessary to get tree-sitter working for clojure. if merging with vscode-tree-sitter is important, it may be worth doing. possibly it could be attended to later.
it turns out this step is necessary. added calva's clojure.tmLanguage.json.
add a reference to the grammar to the contributes.grammars section of package.json
initial thoughts: seems straight-forward
added the following to the contributes.grammars section of package.json:
{
"language": "clojure",
"scopeName": "source.clojure",
"path": "./textmate/clojure.tmLanguage.json"
},
add a reference to onLanguage:clojure to the activationEvents section of package.json
initial thoughts: seems straight-forward
added the following to the activationEvents section of package.json:
"onLanguage:clojure",
add an example to examples/clojure
initial thoughts: similar to step 4, this doesn't appear necessary to get things working, but for merging, it might be worth it at some point.
in any case, it seems doable
added example.clj to a newly created clojure subdirectory of the examples directory. example.clj was obtained from the tree-sitter-clojure repository.
try under VSCode to test / verify changes
initial thoughts: sounds like a good idea
turns out the official instructions appear incomplete. upon testing it became clear that an important piece was missing: the parser.
scripts/gen-parsers.sh suggests it might be done by executing:
./node_modules/.bin/tree-sitter build-wasm ./node_modules/tree-sitter-clojure
in the repository root. that should produce tree-sitter-clojure.wasm.
Then place the parser along with its friends in the directory parsers:
mv tree-sitter-clojure.wasm parsers/
Updating scripts/gen-parsers.sh to also contain the command to generate the clojure parser would probably increase the chances of a PR being accepted.
Note that the parser generation may fail if one does not have emcc (or docker -- ugh).
emcc can be obtained by one of the following methods:
- a trail of yak-shaving including the tweaked building of llvm and binaryen along with emscripten (involving close reading of instructions),
- the expedient but possibly less safe method of obtaining and setting up prebuilt emscripten, -OR-
- the intermediate option of learning to use and applying the emsdk command to build and install appropriate llvm, binaryen, and emscripten bits (see output of emsdk list for precisely which things to emsdk install -- after which source emsdk_env.sh appears necessary for activation)
take a screenshot comparing before and after
initial thoughts: if steps up to this point worked, this seems non-problematic
submit a PR
initial thoughts: even if the relevant steps up to this point have succeeded, it's not clear whether doing this is a good idea.
some other possible courses of action include:
- calva integration
- an independent clojure-specific extension
- clover integration
...but before any of that, probably want to observe what exactly was achieved (e.g. performance, completeness of grammar, bugginess, stability, etc.)