@Heath123
Last active August 13, 2022 13:19
w2c2 + AssemblyScript tutorial

This tutorial shows you how to make a simple Hello World work with AssemblyScript and w2c2. For this example, we will write a simple print function in C, expose it to AssemblyScript as an import, and explain the concepts behind how it works.

Step 1 - Write and compile the AssemblyScript file

Create a file called hello.ts with the following content:

print("Hello, world!")

Now if you try to compile it with asc -o hello.wasm hello.ts, you will get this error:

ERROR TS2304: Cannot find name 'print'.

This is because AssemblyScript does not know how to resolve the print function. We want to tell it to import this function from the environment. If you look at the AssemblyScript documentation, at https://www.assemblyscript.org/concepts.html#module-imports you can see:

In AssemblyScript, host functionality can be imported by utilizing the ambient context, that is using a declare statement:

// assembly/env.ts
export declare function logInteger(i: i32): void
// assembly/index.ts
import { logInteger } from "./env"

logInteger(42)

To keep it simple, let's keep it in the same file for now:

declare function print(str: string): void
print("Hello, world!")

Now if you run asc -o hello.wasm hello.ts, the compilation should succeed. If you want to check that this function has been added to the file as an import, run wasm2wat hello.wasm | grep import:

  (import "hello" "print" (func (;0;) (type 1)))
  (import "env" "abort" (func (;1;) (type 2)))

The first string is the module being imported from. In this case, since the declare function statement appeared in hello.ts, the module name is hello. The second string is the name of the function.

It worked, but what's this env.abort function? If you check https://www.assemblyscript.org/concepts.html#special-imports you can see that three functions need to be defined by the environment - env.abort ("Called on unrecoverable errors"), env.trace ("Called when trace is called in user code"), and env.seed ("Called when the random number generator needs to be seeded"). These last two functions are only present if they are needed, so in our simple example we can ignore them.

Step 2 - Convert to C

Now, assuming that you have w2c2 installed, you can call it on the result. We will use clang-format to properly indent the code so it's readable.

w2c2 hello.wasm | clang-format > hello.c

You will also need to make sure the w2c2_base.h file is present, as it provides some necessary declarations and functionality.

wget https://raw.githubusercontent.com/turbolent/w2c2/main/w2c2_base.h

Take a look at the resulting hello.c file. Some things to notice:

  • Imports are declared as function pointers. They are marked with extern, so their definitions are expected to be found in another file.
extern void (*f_hello_print)(U32);
extern void (*f_env_abort)(U32, U32, U32, U32);
  • There is an init function which initializes the module and runs any top-level code.
  • We declared print as taking in a string, but it takes in a 32-bit unsigned integer.

Why is the string argument a U32 rather than a char* or similar? WebAssembly is designed as a low-level compilation target rather than a high-level language with rich types, so it doesn't distinguish between pointers, unsigned integers and signed integers, as long as they are the same size. How a value is interpreted is determined only by the instructions that operate on it, similarly to assembly language. As WASM is currently 32-bit by default, all pointers can be represented as a U32 type. So, this U32 represents an offset into the virtual WASM memory where the string can be found.

As an aside, WebAssembly does not actually natively support any integer types with fewer than 32 bits. If you use an 8-bit or 16-bit integer in AssemblyScript, it actually simulates this with a 32-bit integer by cutting it down to the correct number of bits after certain operations like addition (yes, this can have a performance impact). The only integer types available in WASM are 32-bit and 64-bit integers.

Step 3 - Glue code

The next step is to write code to initialize the module and provide the needed functionality. Create a file called main.c. First, we need to include some headers:

#include <stdio.h>
#include <stdlib.h>

#include "w2c2_base.h"

This includes some standard library functions, and w2c2_base.h provides some definitions and macros that we need to interact with the module. Next, we need to declare some things that we need to access from the hello.c file:

extern wasmMemory *e_memory;
extern void init(); // extern is not strictly needed here but makes it more clear

We need to access the memory to be able to read the string, and we need to call init() to initialize the module. Now, let's define our print function. We need to parse and print the string passed as a memory offset to the C code. There is information on the memory layout of various data types such as strings and arrays at https://www.assemblyscript.org/runtime.html#class-layout.

References to an object always point at the start of the payload, with the header beginning 20 bytes before.

Strings always use class id 1, with their 16-bit char codes (UTF-16 code units, allowing isolated surrogates like JS) as the payload. For example, if rtSize is 8, the string's .length is 4.

So, the pointer we receive points to UTF-16 data in WASM memory. If only ASCII characters are used, as they are in this case, we can just skip every other byte and pretend it's ASCII, which we will do here for simplicity. Note that the pointer is not a C memory address, but an offset into the WASM memory. We can use the macro i32_load8_u provided by w2c2_base.h, which loads a single byte from the WASM memory.

The AssemblyScript documentation states that the string length is rtSize / 2, and that rtSize is at an offset of -4 in the header, so we first have to load that and halve it to get the string's length.

void print(U32 offset) {
  // The length is half of rtSize
  int length = i32_load(e_memory, offset - 4) / 2;
  for (int i = 0; i < length; i++) {
    // Load and print the character at the address
    // We multiply i by 2 as we are skipping every other byte
    printf("%c", i32_load8_u(e_memory, offset + i * 2));
  }
  // Add a newline at the end
  printf("\n");
}

Now we need to define the function pointer that hello.c declares as extern and expects to find, and set it to the address of the print function.

void (*f_hello_print)(U32) = &print;

Let's do a similar thing for f_env_abort. We could make this print useful error information, but for now this is just a minimal example.

// This cannot be called abort or it will clash with the standard library's abort function
void env_abort(U32 message, U32 filename, U32 line, U32 col) {
  printf("abort\n");
  exit(1);
}
void (*f_env_abort)(U32, U32, U32, U32) = &env_abort;

We also need to implement trap, which is similar except that trapping is a part of the WebAssembly specification itself rather than a detail of AssemblyScript, so it's not an import but just a normal function.

void trap(Trap trap) {
  printf("trap\n");
  exit(1);
}

And now all we need is the main function:

int main() {
  // init will automatically run the top level code, including our hello world
  init();
  return 0;
}

Your file should now look like this:

#include <stdio.h>
#include <stdlib.h>

#include "w2c2_base.h"

extern wasmMemory *e_memory;
extern void init(); // extern is not strictly needed here but makes it more clear

void print(U32 offset) {
  // The length is half of rtSize
  int length = i32_load(e_memory, offset - 4) / 2;
  for (int i = 0; i < length; i++) {
    // Load and print the character at the address
    // We multiply i by 2 as we are skipping every other byte
    printf("%c", i32_load8_u(e_memory, offset + i * 2));
  }
  // Add a newline at the end
  printf("\n");
}
void (*f_hello_print)(U32) = &print;

// This cannot be called abort or it will clash with the standard library's abort function
void env_abort(U32 message, U32 filename, U32 line, U32 col) {
  printf("abort\n");
  exit(1);
}
void (*f_env_abort)(U32, U32, U32, U32) = &env_abort;

void trap(Trap trap) {
  printf("trap\n");
  exit(1);
}

int main() {
  init();
  return 0;
}

Now compile and run:

$ gcc main.c hello.c 
$ ./a.out
Hello, world!

Congratulations, you should now understand how to use w2c2 with AssemblyScript. If I got anything wrong please comment so I can correct it.
