Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save chrisseaton/90f8cbfd48ac7aef00c5d8e219f9c347 to your computer and use it in GitHub Desktop.
Save chrisseaton/90f8cbfd48ac7aef00c5d8e219f9c347 to your computer and use it in GitHub Desktop.

One Approach to Easy and High Performance VMs

This is a lab that walks you through one approach to the challenge of how we can have easy and high performance language virtual machines.

It's designed to come after Mario Wolczko's lecture on a Concise and Opinionated History of Virtual Machines, and illustrates practically some of the concepts and techniques introduced there.

This lab is about just one approach to easy and high performance VMs, and that is the Truffle and Graal approach. Truffle is a framework for developing self-specialising AST interpreters in Java, and Graal is a dynamic compiler, which Truffle uses on your behalf to give you a just-in-time compiler for your interpreter, using partial evaluation.

This lab of course won't teach you how to work on a production language or compiler in an hour and a half, but I hope it can concretely do is to show how Truffle makes it easy to implement a language, and how accessible the Graal compiler is, so you know you can reach for these tools if you need them in future.

What we will do in this lab

  • Refresh the big ideas of self-specialising AST interpreters and partial evaluation

  • Set up for Truffle development

  • Tour a simple language interpreter in Truffle

  • Add new functionality to the simple language

  • Combining Truffle with Graal for partial evaluation

  • Tour Graal

  • Stretch ideas

Tools

You will need:

  • Internet access

  • Some basic development tools like Python 2 and Git

  • A Java 8 JDK and some Java ecosystem tools like Maven 3

  • An IDE like Eclipse (Neon and Oxygen version both work), or a plain text editor and grep if you prefer

  • Probably in practice you need me to talk you through the tasks - it's supposed to be interactive

The two big ideas - specialising AST interpreters and partial evaluation

Mario talked about two techniques used in implementing high-performance language implementations easily - AST interpretation and partial evaluation. We can demonstrate these two key ideas to ourselves in order to make sure that they're clear.

Here's a simple AST interpreter in Python.

I haven't written a parser, as we'll leave parsers as a solved problem for this lab, and instead we'll create AST nodes directly by using their constructors.

You can see the essential parts of the AST interpreter - the node classes, and the execute methods. See how the execute method of the AddNode calls execute on the left and right operands.

class ConstantNode:
    def __init__(self, value):
        self.value = value

    def execute(self):
        return self.value

class AddNode:
    def __init__(self, left, right):
        self.left = left
        self.right = right
    
    def execute(self):
        return self.left.execute() + self.right.execute()
        
class SubNode:
    def __init__(self, left, right):
        self.left = left
        self.right = right
    
    def execute(self):
        return self.left.execute() - self.right.execute()

print AddNode(ConstantNode(6),
    SubNode(ConstantNode(4), ConstantNode(2))).execute()

Tasks:

  • Add MulNode and DivNode for a multiply and divide operation.

This just means adding a new class, and then you can see how you add the logic for the operation in the execute method. Notice how the code needed for the new operations are contained within the new classes - we don't have to modify a central execute method. This is an advantage of writing AST interpreters.

  • Manually partially evaluate your interpreter for the expression 1 + 2 - 3

Partial evaluation is a technique for compiling or executing an AST interpreter. What it means in practice is that we take the code for a call to an execute method and inline that code.

AddNode(ConstantNode(1), ConstantNode(2)).execute()
ConstantNode(1).execute() + ConstantNode(2).execute()
1 + 2
3

This is effectively what the Graal partial evaluator we're going to be using does.

Setting up for Truffle development

To look at a Truffle language we're going to need the latest release of GraalVM. You can download it for Linux from GitHub, or from OTN for macOS:

After extracting it, set the JAVA_HOME environment variable to point to it (to Contents/Home on macOS).

You then want to clone an example Truffle language, SimpleLanguage. SimpleLanguage is a bit like a simpler version of JavaScript and is heavily commented to show how Truffle is used to build a language. After cloning it you can build it using Maven - it's just a normal Java application.

$ export JAVA_HOME=`pwd`/graalvm-ee-1.0.0-rc3
$ git clone https://github.com/graalvm/simplelanguage.git
$ cd simplelanguage
$ mvn package

Tasks:

  • Compile SimpleLanguage

  • Run the SimpleLangauge HelloWorld.sl test program

Look at the sl launcher program, and the simplelanguage/language/tests directory.

  • Run some more test programs that look interesting.

Touring a simple language interpreter in Truffle

SimpleLanguage is designed to be easy to read. A good way to navigate it is to also use the Eclipse IDE. Install the m2e and m2e-apt plugins from Help, Eclipse Marketplace, and then do File, Import, Maven, Existing Maven Projects to import SimpleLanguage.

Tasks:

  • Find the AST node Java class that implements the addition operator

Use Navigate, Open Type to search for classes.

You'll see that it appears to have multiple execute methods, unlike our Python example, and that the results of executing child nodes are passed in as parameters to the method, rather than executing them yourself. Each overload matches different pairs of types from the results from the children. This reduces the number of if branches you need to write to match the types that your operators support.

Read the comments in this file.

  • Find where the parser emits the node

A companion class called SLAddNodeGen is created automatically by Truffle. Look at SLNodeFactory to see the node being created. Note how the parser directly creates a SLAddNode - there are no other intermediate representations.

  • Find the if node, and where the parser emits it

The SLIfNode looks different to SLAddNode and doesn't have child values passed in as parameters. Why do you think that is? What does this logic talking about profiling do?

  • Find the return node - who catches the exception it throws?

SLReturnNode throws an exception. Why is it doing that if there's no error. Who catches this exception? Use Eclipse's right click and Find References on the class name to find out.

Adding new functionality

  • Add support for the < operator comparing two strings for lexical ordering

Can you get this program working by modifying SLLessThanNode? You will need to run mvn package before running sl again.

function main() {
  println("a" < "b");
  println("b" < "a");
  println("a" < "a");
}

Combining Truffle with Graal for partial evaluation

  • Run the SimpleLanguage LoopCall.sl test program

LoopCall.sl runs in a loop, so we'd expect parts of it to get hot and get compiled. Run with the -J-Dgraal.TruffleBackgroundCompilation=false -J-Dgraal.TraceTruffleCompilation=true options to see this happening. root add is the root node, of the add function, being compiled.

  • Dump Graal graphs from LoopCall.sl to IGV

IGV is a graphical tool to understand what Graal and Truffle are doing. Download IGV from GitHub:

https://github.com/oracle/graal/releases

$ idealgraphvisualizer/bin/idealgraphvisualizer &

When you open IGV, click Remove State on the right-hand side, to make the graphs simpler.

Then run sl with the -dump option. Explore the tree on the left of the IGV screen.

  • Find the Truffle AST graph in IGV

  • Find the add node in the Graal graph for the add function in IGV

Look at After TruffleTier and see if you can make some sense of the compiler nodes you see there. Can you relate them back to the Java code for the AST interpreter?

  • Look at the graphics for the loop function. How does Graal represent the while loop? How is it different to how the while loop is represented in the Truffle AST?

  • Dump assembly from LoopCall.sl, and find the relevant add assembly instruction in the compiled add function

Use the -disassemble option.

  • Why is the add instruction followed by a jo instruction and what does this do?

  • Where does the jo instruction jump to?

Touring Graal

To build Graal in order to open it in an IDE, we need to install a special JVM with a feature called JVMCI. You can get this for Linux from GitHub or OTN for macOS.

Then clone our special build tool, mx, and then Graal (clone at depth 1 to save time).

$ git clone https://github.com/graalvm/mx.git
$ export PATH=`pwd`/mx:$PATH
$ git clone --depth 1 https://github.com/oracle/graal.git
$ cd graal/compiler
$ JAVA_HOME=~/Documents/labsjdk1.8.0_172-jvmci-0.46/Contents/Home mx eclipseinit

Open Eclipse, and open the graal directory as a workspace.

File, Import, General, Existing Projects Into Workspace and select the graal directory again.

  • Find the Graal IR graph node Java class that represents the add instruction we had in the add SimpleLanguage function

Like SimpleLanguage had SLAdd, Graal has a node in the compiler for IntegerAddExactNode, that we saw when we used IGV. How does this class differ from the AST node? What do you think the canonical method does?

  • Find where this node is created from Java bytecode

Can you find where this node is created? Is is created by a parser? What kind of parser is it?

  • Can you follow the code from IntegerAddExactNode to find where that jump on overflow branch is created, and what bytes get emitted?

You should end at the jcc method in AMD64Assembler. What is addPatchAt all about?

  • What happens in IntegerAddExactSplitNode if the operation cannot overflow?

  • Follow this code through to find what machine-code bytes are emitted for add-with-overflow on the AMD64 architecture

Stretch ideas

  • Find the Truffle JavaScript implementation, clone it and explore it

  • See if you can add a new language feature to JavaScript

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment