@sogaiu
Last active January 27, 2020 23:42
experience report: adding clojure support to vscode-tree-sitter

adding clojure support to vscode-tree-sitter

the vscode-tree-sitter "adding a new language" recipe is a 10-step one (though not all steps appear to be necessary from a technical standpoint).

apparently verilog support was added via a PR so there is an example that might be inspected for reference.

there are 2 candidates for a clojure grammar:

at first glance, the former looks like it may have had more development (62 commits over a few months). the latter has 3 commits in 1 day.

for the moment, will try the former.

one notable aspect of vscode-tree-sitter is that it makes use of web-tree-sitter. haven't verified, but i think what this means is that the extension uses node's ability to execute wasm. tree-sitter is usually seen as requiring the use of C-code compiled to native code. if vscode-tree-sitter is really just using wasm to achieve tree-sitter functionality, the extension doesn't need to have native code for each of the typical 3 platforms (i.e. linux, macos, and windows).
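
to make the wasm point a bit more concrete, here is a minimal sketch (not taken from vscode-tree-sitter) of node instantiating a wasm module with no native compilation step. the byte array is a hand-assembled stand-in for a real parser file; it only exports a hypothetical `add` function and has nothing to do with tree-sitter itself.

```typescript
// a tiny hand-assembled wasm module exporting add(i32, i32) -> i32.
// it stands in for a real grammar .wasm file; nothing here is tree-sitter specific.
const wasmBytes = new Uint8Array([
	0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm" + version 1
	0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
	0x03, 0x02, 0x01, 0x00, // function section: one function using type 0
	0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" -> function 0
	0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code: local.get 0, local.get 1, i32.add, end
])

// the same bytes can be loaded unchanged on linux, macos, and windows
const wasmModule = new WebAssembly.Module(wasmBytes)
const wasmInstance = new WebAssembly.Instance(wasmModule)
const add = wasmInstance.exports.add as (a: number, b: number) => number
```

calling `add(2, 3)` then runs code that was never compiled for the host platform, which is presumably the property that lets the extension ship one parser file per language instead of one per platform.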

current status

steps 0, 1, 3, 4, 5, 6, 7, 8 completed. step 2 has a functioning piece, but needs work.

step 0

get tree-sitter-clojure set up locally.

this turns out to not work out-of-the-box for node 12 (at least not here).

after upgrading a couple of dependencies (tree-sitter-cli and nan) to their latest and not succeeding, downgrading node to 10 seemed to do the trick, though there seem to be some people who believe that shouldn't be necessary...

relevant issues include:

interestingly, the aforementioned tree-sitter-clojure dev setup instructions point to an issue (oakmac/tree-sitter-clojure#17) in which pedrorgirardi appears 🙂

step 1

add a dependency for tree-sitter-clojure to vscode-tree-sitter.

initial thoughts: iiuc, there is no npm package for tree-sitter-clojure, so adding a git url, a github url, or a local dependency in package.json might be ways forward.

decided to add:

"tree-sitter-clojure": "git+https://github.com/oakmac/tree-sitter-clojure",

to the "devDependencies" section of package.json.

npm install was successful with node v10.18.1.

on a side note, there was a vulnerability reported so ran npm audit fix.

step 2

add a color function to ./lib/colors.ts

initial thoughts: atm, there is no lib directory on the master branch, but the commit for adding verilog support added colorVerilog to ./src/color.ts, so maybe doing something analogous will work.

it turns out that though the description of this step is possibly the shortest, this is the most involved step. there doesn't appear to be any documentation about how to create a color function, though there are multiple examples.

a skeleton function was first created based on existing examples from color.ts. specifically, the colorTypescript function was copy-modified to return an appropriate but mostly empty Map<string, Range[]>.
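
as a rough sketch of what such a skeleton might look like (the `Point` and `TokenRange` types are simplified stand-ins for whatever color.ts actually uses, and `colorSkeleton` is a hypothetical name):

```typescript
// simplified stand-ins for the position / range types used by the real color functions
interface Point { row: number, column: number }
interface TokenRange { start: Point, end: Point }

// hypothetical skeleton: ignores the tree entirely and returns an empty
// list of ranges for each scope name
function colorSkeleton(_root: unknown, _visibleRanges: {start: number, end: number}[]): Map<string, TokenRange[]> {
	const functions: TokenRange[] = []
	const keywords: TokenRange[] = []
	return new Map([
		['variable', functions],
		['keyword', keywords],
	])
}
```

a function of this shape is enough for the extension to load and call without coloring anything, which makes it a convenient starting point before any traversal logic is added.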

a slightly filled in version of the function is:

export function colorClojure(root: Parser.Tree, visibleRanges: {start: number, end: number}[]) {
	const functions: Range[] = []
	const keywords: Range[] = []
	let visitedChildren = false
	let cursor = root.walk()
	let parents = [cursor.nodeType]
	while (true) {
		// Advance cursor
		if (visitedChildren) {
			if (cursor.gotoNextSibling()) {
				visitedChildren = false
			} else if (cursor.gotoParent()) {
				parents.pop()
				visitedChildren = true
				continue
			} else {
				break
			}
		} else {
			const parent = cursor.nodeType
			if (cursor.gotoFirstChild()) {
				parents.push(parent)
				visitedChildren = false
			} else {
				visitedChildren = true
				continue
			}
		}
		// Skip nodes that are not visible
		if (!visible(cursor, visibleRanges)) {
			visitedChildren = true
			continue
		}
		// Color tokens
		console.log(cursor.nodeType)
		switch (cursor.nodeType) {
			case 'function_name':
				functions.push({start: cursor.startPosition, end: cursor.endPosition})
				break
			case 'keyword':
				keywords.push({start: cursor.startPosition, end: cursor.endPosition})
				break
		}
	}
	cursor.delete()
	return new Map([
		['variable', functions],
		['keyword', keywords],
	])
}
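
the traversal loop is the least obvious part of the function. to see the visiting order in isolation, here is a sketch that runs the same control flow against a hypothetical in-memory cursor. `FakeNode` and `FakeCursor` are made-up stand-ins that mimic only the cursor methods the loop relies on (gotoFirstChild, gotoNextSibling, gotoParent, nodeType); the visibility check and coloring are left out.

```typescript
// hypothetical stand-in for a syntax tree node
interface FakeNode { type: string, children: FakeNode[] }

// hypothetical cursor mimicking only the methods the loop relies on
class FakeCursor {
	// path[i] is the node at depth i; indexes[i] is its position among its siblings
	private path: FakeNode[]
	private indexes: number[] = [0]
	constructor(root: FakeNode) { this.path = [root] }
	get nodeType(): string { return this.path[this.path.length - 1].type }
	gotoFirstChild(): boolean {
		const node = this.path[this.path.length - 1]
		if (node.children.length === 0) return false
		this.path.push(node.children[0])
		this.indexes.push(0)
		return true
	}
	gotoNextSibling(): boolean {
		if (this.path.length < 2) return false
		const siblings = this.path[this.path.length - 2].children
		const next = this.indexes[this.indexes.length - 1] + 1
		if (next >= siblings.length) return false
		this.path[this.path.length - 1] = siblings[next]
		this.indexes[this.indexes.length - 1] = next
		return true
	}
	gotoParent(): boolean {
		if (this.path.length < 2) return false
		this.path.pop()
		this.indexes.pop()
		return true
	}
}

// same control flow as colorClojure, recording node types instead of coloring
function visitOrder(root: FakeNode): string[] {
	const seen: string[] = []
	const cursor = new FakeCursor(root)
	let visitedChildren = false
	while (true) {
		if (visitedChildren) {
			if (cursor.gotoNextSibling()) {
				visitedChildren = false
			} else if (cursor.gotoParent()) {
				visitedChildren = true
				continue
			} else {
				break
			}
		} else {
			if (cursor.gotoFirstChild()) {
				visitedChildren = false
			} else {
				visitedChildren = true
				continue
			}
		}
		seen.push(cursor.nodeType)
	}
	return seen
}

// a made-up tree; the node type names are illustrative, not taken from the grammar
const tree: FakeNode = {
	type: 'program',
	children: [
		{type: 'list', children: [
			{type: 'function_name', children: []},
			{type: 'keyword', children: []},
		]},
		{type: 'keyword', children: []},
	],
}
```

`visitOrder(tree)` yields `['list', 'function_name', 'keyword', 'keyword']`: a pre-order walk in which the root node itself is never reported, because the first iteration moves to the first child before anything is inspected.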

step 3

add a language to the dictionary at the top of ./lib/extension.ts

initial thoughts: similarly to step 2, emulating the verilog case of adding to src/extension.ts is one idea.

the following line was enough:

'clojure': {module: 'tree-sitter-clojure', color: colors.colorClojure},
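
for context, a sketch of how such a dictionary entry might be consumed. the `LanguageEntry` type, the placeholder `colorClojure`, and the `wasmPathFor` helper are all hypothetical; the only things taken from this report are the entry itself and the convention (see step 8) that the parser ends up as parsers/tree-sitter-clojure.wasm.

```typescript
interface Point { row: number, column: number }
interface TokenRange { start: Point, end: Point }
type ColorFunction = (root: unknown, visibleRanges: {start: number, end: number}[]) => Map<string, TokenRange[]>

// hypothetical shape of one dictionary entry
interface LanguageEntry { module: string, color: ColorFunction }

// placeholder standing in for colors.colorClojure
const colorClojure: ColorFunction = () => new Map<string, TokenRange[]>()

// keyed by vscode language id
const languages: {[id: string]: LanguageEntry} = {
	'clojure': {module: 'tree-sitter-clojure', color: colorClojure},
}

// assumption: each parser lives at parsers/<module>.wasm
function wasmPathFor(languageId: string): string | undefined {
	const entry = languages[languageId]
	return entry ? `parsers/${entry.module}.wasm` : undefined
}
```

under that assumption, the `module` field doubles as the base name of the parser file, so a mismatch between the two would only surface when the extension tries to load the parser.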

step 4

add a simplified textmate grammar as ./textmate/clojure.tmLanguage.json

initial thoughts: maybe a simplified version of what calva uses, or just use that as-is?

technically, this step doesn't seem to be necessary to get tree-sitter working for clojure. if merging with vscode-tree-sitter is important, it may be worth doing. possibly it could be attended to later.

it turns out this step is necessary. added calva's clojure.tmLanguage.json.

step 5

add a reference to the grammar to the contributes.grammars section of package.json

initial thoughts: seems straight-forward

added the following to the contributes.grammars section of package.json:

{
    "language": "clojure",
    "scopeName": "source.clojure",
    "path": "./textmate/clojure.tmLanguage.json"
},
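
package.json is edited in three places overall: the dependency from step 1, the grammar reference above, and the activation event in step 6. a quick way to confirm nothing was missed could be sketched as follows. `Manifest` and `missingPieces` are hypothetical names, the manifest object is a trimmed copy containing only the entries quoted in this report, and the check assumes the dependency is named tree-sitter-<language id>.

```typescript
// only the parts of package.json that this report touches
interface Manifest {
	devDependencies?: {[name: string]: string},
	activationEvents?: string[],
	contributes?: {grammars?: {language: string, scopeName: string, path: string}[]},
}

// returns the names of the package.json sections still lacking an entry for the language
function missingPieces(manifest: Manifest, languageId: string): string[] {
	const missing: string[] = []
	const deps = manifest.devDependencies || {}
	// assumption: the grammar dependency is named tree-sitter-<language id>
	if (!deps[`tree-sitter-${languageId}`]) missing.push('devDependencies')
	const events = manifest.activationEvents || []
	if (events.indexOf(`onLanguage:${languageId}`) === -1) missing.push('activationEvents')
	const grammars = (manifest.contributes || {}).grammars || []
	if (!grammars.some(g => g.language === languageId)) missing.push('contributes.grammars')
	return missing
}

const manifest: Manifest = {
	devDependencies: {'tree-sitter-clojure': 'git+https://github.com/oakmac/tree-sitter-clojure'},
	activationEvents: ['onLanguage:clojure'],
	contributes: {grammars: [
		{language: 'clojure', scopeName: 'source.clojure', path: './textmate/clojure.tmLanguage.json'},
	]},
}
```

`missingPieces(manifest, 'clojure')` returns an empty array once all three edits are in place; with an empty manifest it lists all three sections.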

step 6

add a reference to onLanguage:clojure to the activationEvents section of package.json

initial thoughts: seems straight-forward

added the following to the activationEvents section of package.json:

"onLanguage:clojure",

step 7

add an example to examples/clojure

initial thoughts: similar to step 4, this doesn't appear necessary to get things working, but for merging, it might be worth it at some point.

in any case, it seems doable

added example.clj to a newly created clojure subdirectory of the examples directory. example.clj was obtained from the tree-sitter-clojure repository.

step 8

try under VSCode to test / verify changes

initial thoughts: sounds like a good idea

turns out the official instructions appear incomplete. upon testing it became clear that an important piece was missing: the parser.

scripts/gen-parsers.sh suggests it might be done by executing:

./node_modules/.bin/tree-sitter build-wasm ./node_modules/tree-sitter-clojure

in the repository root. that should produce tree-sitter-clojure.wasm.

Then place the parser along with its friends in the directory parsers:

mv tree-sitter-clojure.wasm parsers/

Updating scripts/gen-parsers.sh to include the command that generates the clojure parser would likely improve the chances of a PR being accepted.
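
the two commands generalize to any grammar module, which can be sketched as a small helper. `parserCommands` is a hypothetical name; it only assembles the command strings quoted above and does not execute anything.

```typescript
// assembles the shell commands from this step for a given grammar module;
// running them is left to the caller (e.g. by pasting into a shell)
function parserCommands(moduleName: string): string[] {
	return [
		`./node_modules/.bin/tree-sitter build-wasm ./node_modules/${moduleName}`,
		`mv ${moduleName}.wasm parsers/`,
	]
}
```

for `'tree-sitter-clojure'` this reproduces the build and move commands shown earlier in this step, run from the repository root.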

Note that the parser generation may fail if one does not have emcc (or docker -- ugh).

emcc can be obtained by one of the following methods:

  • a trail of yak-shaving including the tweaked building of llvm and binaryen along with emscripten (involving close reading of instructions),

  • the expedient but possibly less safe method of obtaining and setting up prebuilt emscripten, -OR-

  • the intermediate option of learning to use and applying the emsdk command to build and install appropriate llvm, binaryen, and emscripten bits (see output of emsdk list for precisely which things to emsdk install -- after which source emsdk_env.sh appears necessary for activation)

step 9

take a screenshot comparing before and after

initial thoughts: if steps up to this point worked, this seems non-problematic

before

after

step 10

submit a PR

initial thoughts: even if the relevant steps up to this point have succeeded, it's not clear whether doing this is a good idea.

some other possible courses of action include:

  • calva integration
  • an independent clojure-specific extension
  • clover integration

...but before any of that, probably want to observe what exactly was achieved (e.g. performance, completeness of grammar, bugginess, stability, etc.)
