This format provides a language-agnostic way of representing any source code that can be parsed into an Abstract Syntax Tree (AST). It is expressed in JSON and is intended to capture the important properties of most code.
All code in these datasets is represented as abstract syntax trees (ASTs), stored in JSON format. Each JSON object represents a node in the AST and has the following properties:
type
[required]: The type of the node (e.g. "if-statement", "expression", "variable-declaration", etc.). In Snap, this could be the name of a built-in block (e.g. "forward", "turn"). The set of possible types is pre-defined by a given programming language, as they generally correspond to keywords. The possible types for a given language are defined in the grammar file for the dataset, discussed later.
value
[optional]: This contains any user-defined value for the node, such as the identifier for a variable or function, the value of a