@axic
Last active May 11, 2024 03:13
EVM Assembly Language

Motivation

The goal is to specify a syntax for an EVM assembly language, which can be used across various tools.

The format should be human-readable, map to the EVM as closely as possible, allow comments, and avoid complex syntax.

Specification

  1. Opcodes are upper case only
  2. Opcodes are separated with white space (including, but not limited to, space, tab, new line)
  3. Every EVM instruction is a valid opcode
  4. With the exception of PUSH, none of the opcodes have an argument
  5. The argument of PUSH is also separated by white space
  6. The argument of PUSH is either a decimal or a hexadecimal number (prefixed with 0x)
  7. PUSH1..PUSH32 push data of exactly the stated length (1 to 32 bytes)
  8. PUSH is an alias to PUSH32
  9. Comments are denoted by ;; and the rest of the line is ignored
  10. PUSH accepts a special syntax for jump labels (PUSH [labelname])
  11. Labels are identifiers followed by a colon. When referenced in a push, their offset in the bytecode is pushed to the stack. Note: JUMPDEST needs to follow a label.
  12. Literal data, not to be processed by the assembler, must be hexadecimal digits following the pseudo opcode LIT

Rules 1 to 8 are already followed by many tools; even the standard Ethereum tests comply with them.
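
To make rules 1 to 9 concrete, here is a minimal Python sketch of an assembler for them. It is an illustration, not a reference implementation: the function names (`assemble`, `push_width`, `parse_number`) are hypothetical, the opcode table is deliberately partial, and labels and LIT are left out.

```python
# Partial opcode table: an assumption for brevity. Rule 3 implies that a real
# tool would list every EVM instruction here.
OPCODES = {
    "STOP": 0x00, "ADD": 0x01, "MUL": 0x02, "CODECOPY": 0x39,
    "MSTORE": 0x52, "JUMP": 0x56, "JUMPI": 0x57, "JUMPDEST": 0x5B,
    "DUP1": 0x80, "RETURN": 0xF3,
}


def push_width(token):
    """Return the data width in bytes for PUSH/PUSH1..PUSH32, else None."""
    if token == "PUSH":  # rule 8: PUSH is an alias for PUSH32
        return 32
    digits = token[4:]
    if token.startswith("PUSH") and digits.isdecimal() and str(int(digits)) == digits:
        width = int(digits)
        if 1 <= width <= 32:
            return width
    return None


def parse_number(text):
    """Rule 6: decimal, or hexadecimal when prefixed with 0x."""
    if text.startswith("0x"):
        return int(text[2:], 16)
    return int(text, 10)


def assemble(source):
    # Rule 9: drop everything from ";;" to the end of each line.
    lines = (line.split(";;", 1)[0] for line in source.splitlines())
    # Rule 2: any white space separates tokens.
    tokens = " ".join(lines).split()
    out = bytearray()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        width = push_width(token)
        if width is None:
            # Rule 1: the table holds upper-case names only, so "push1" fails here.
            if token not in OPCODES:
                raise ValueError(f"unknown opcode: {token}")
            out.append(OPCODES[token])
            i += 1
            continue
        if i + 1 == len(tokens):
            raise ValueError(f"{token} is missing its argument")
        value = parse_number(tokens[i + 1])
        try:
            data = value.to_bytes(width, "big")  # rule 7: exact length
        except OverflowError:
            raise ValueError(f"{tokens[i + 1]} does not fit in {token}") from None
        out.append(0x5F + width)  # PUSH1 is 0x60, PUSH32 is 0x7f
        out += data
        i += 2
    # Hex digits without a leading 0x.
    return out.hex()
```

In this sketch a value that is too large for the chosen PUSH is rejected rather than truncated; the rules above do not say which behaviour is intended, so that choice is an assumption.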

Grammar

TBD

Examples

List of opcodes, no comments (what the usual Ethereum tests look like):

PUSH1 0x60 PUSH1 0x40 MSTORE PUSH1 0x8 JUMP JUMPDEST PUSH1 0x2 JUMP

Using jump labels and comments:

  PUSH 0x60        ;; contract A {\n}
ErrorTag:
  PUSH 0x40        ;; contract A {\n}
  MSTORE           ;; contract A {\n}
  PUSH [tag1]      ;; contract A {\n}
  JUMP             ;; contract A {\n}
tag1:              ;; contract A {\n}
  JUMPDEST         ;; contract A {\n}
  PUSH [ErrorTag]  ;; contract A {\n}
  JUMP             ;; contract A {\n}
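
Label references such as `PUSH [tag1]` can point forward, so an assembler needs the offset of every label before it emits any push. A minimal two-pass sketch in Python follows. The name `assemble_with_labels` is hypothetical, the opcode table covers only what this example uses, and label arithmetic is not handled.

```python
# Only the opcodes used in the example; a real table would cover every instruction.
OPCODES = {"MSTORE": 0x52, "JUMP": 0x56, "JUMPDEST": 0x5B}


def push_width(token):
    """Return the data width in bytes for PUSH/PUSH1..PUSH32, else None."""
    if token == "PUSH":  # PUSH is an alias for PUSH32
        return 32
    digits = token[4:]
    if token.startswith("PUSH") and digits.isdecimal() and str(int(digits)) == digits:
        return int(digits) if 1 <= int(digits) <= 32 else None
    return None


def assemble_with_labels(source):
    lines = (line.split(";;", 1)[0] for line in source.splitlines())
    tokens = " ".join(lines).split()

    # Pass 1: record the byte offset of every label and queue the instructions.
    labels, items, offset, i = {}, [], 0, 0
    while i < len(tokens):
        token = tokens[i]
        width = push_width(token)
        if token.endswith(":"):
            labels[token[:-1]] = offset  # a label occupies no bytes itself
        elif width is not None:
            if i + 1 == len(tokens):
                raise ValueError(f"{token} is missing its argument")
            items.append((width, tokens[i + 1]))
            offset += 1 + width
            i += 1  # skip the argument
        else:
            items.append((0, token))
            offset += 1
        i += 1

    # Pass 2: emit bytes, now that forward references can be resolved.
    out = bytearray()
    for width, text in items:
        if width == 0:
            out.append(OPCODES[text])  # KeyError on an unknown opcode
            continue
        if text.startswith("[") and text.endswith("]"):
            value = labels[text[1:-1]]  # KeyError on an undefined label
        elif text.startswith("0x"):
            value = int(text[2:], 16)
        else:
            value = int(text, 10)
        out.append(0x5F + width)
        out += value.to_bytes(width, "big")
    return out.hex()
```

Because every push has a fixed width (PUSH always means PUSH32 here), the offsets from the first pass are final. Applied to the example above, this sketch resolves ErrorTag to offset 33 and tag1 to offset 101.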

Constructor code:

  PUSH 0x60           ;; contract A {\n    function a()...
  PUSH 0x40           ;; contract A {\n    function a()...
  MSTORE              ;; contract A {\n    function a()...
  PUSH [end - start]
  DUP1                ;; contract A {\n    function a()...
  PUSH [start]
  PUSH 0              ;; contract A {\n    function a()...
  CODECOPY            ;; contract A {\n    function a()...
  PUSH 0              ;; contract A {\n    function a()...
  RETURN              ;; contract A {\n    function a()...
start:
  LIT 60606040526000357c0100000000000000000000000000000000000000000000000000000000900480630dbe671f146039576035565b6002565b3460025760486004805050604a565b005b6001604a025b56
end:

(These examples are based on Solidity output.)
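
The LIT pseudo opcode can be handled by copying its argument into the output unchanged. The Python sketch below shows one way to validate and decode that argument; `encode_lit` is a hypothetical helper name. If a label difference such as `[end - start]` is read as a difference of byte offsets (an assumption, since the rules above do not define label arithmetic), its value in the constructor example would equal the length of the bytes returned here.

```python
import string


def encode_lit(digits):
    """Decode the argument of LIT into the raw bytes to copy into the output."""
    if not digits or any(c not in string.hexdigits for c in digits):
        raise ValueError("LIT expects hexadecimal digits only, without a 0x prefix")
    if len(digits) % 2 != 0:
        raise ValueError("LIT expects an even number of hexadecimal digits")
    return bytes.fromhex(digits)


# The first five bytes of the literal in the constructor example.
data = encode_lit("6060604052")
```

Rejecting an odd number of digits is a design choice of this sketch; the specification does not say whether such input should be padded or refused.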

Assembler output

Standard output of the assembler is the bytecode in hex digits, without a leading 0x.

Questions

Q: should lowercase opcodes be supported?

Q: should PUSH be an alias for the smallest PUSHn that the literal fits into?
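
To illustrate what this option would involve, here is a small Python sketch that picks the narrowest push for a numeric literal. The name `smallest_push` is hypothetical, and the sketch assumes that zero still takes one data byte.

```python
def smallest_push(value):
    """Return (mnemonic, data) for the narrowest PUSH1..PUSH32 that holds value."""
    if value < 0 or value >= 1 << 256:
        raise ValueError("value does not fit in an EVM word")
    # Round the bit length up to whole bytes; zero still needs one byte.
    width = max(1, (value.bit_length() + 7) // 8)
    return f"PUSH{width}", value.to_bytes(width, "big")
```

For example, `smallest_push(0x60)` gives `("PUSH1", b"\x60")` and `smallest_push(256)` gives `("PUSH2", b"\x01\x00")`. One consequence to weigh: if label references also used the smallest width, instruction sizes would depend on label offsets and the offsets on instruction sizes, so a single pass over the source would no longer be enough to fix them.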

Q: should a functional syntax also be supported? e.g.:

loop:
JUMPI(MUL(1, 2), [loop])

Q: should Solidity's PUSHLIB be supported or should it be a special syntax of PUSH?

Option A:

PUSHLIB LibraryName

Option B:

PUSH {LibraryName}
@gcolvin
gcolvin commented Oct 17, 2016

Whatever the details, I support cleaning up the gratuitous differences between our assembly syntaxes.