Skip to content

Instantly share code, notes, and snippets.

@tangentstorm
Created March 2, 2013 09:31
Show Gist options
  • Save tangentstorm/5070301 to your computer and use it in GitHub Desktop.
Save tangentstorm/5070301 to your computer and use it in GitHub Desktop.
# B4 - a bootstrap assembler for Ngaro/RetroVM
# --------------------------------------------------------------
# This is a tiny, pure-binary forth interpreter. Input consists
# of 4-byte sequences, which can either be ngaro instructions or
# space-terminated strings.
# Data area.
# --------------------------------------------------------------
# When ngaro boots up, the instruction pointer is set to zero, which
# means it will execute whatever instruction it finds in the first
# cell.
#
# In ngaro, instructions below 31 are executed directly, and higher
# numbers are treated as subroutine calls. This means that you can't
# call a procedure that resides before cell 31 without trickery or
# special treatment.
#
# Instead, we are going to use the first part of ram to store our
# initial dictionary, and other data the system needs to function.
#
# To prevent the system from treating this data as code, we therefore
# must start the image by jumping over this data. That means we have
# to decide up front how big it is.
#
# We will leave space for a few important variables for compatability
# with retroforth, followed by a proto-dictionary that holds just
# enough to allow retro to boot into forth.
# 00h - 01h : 2 cells for the jump
# ------------------------------------------------
JMP: next
# 02h - 04h : 3 cells for the "real" dictionary
# ------------------------------------------------
$last 0 # 02h most recently defined header
$heap 0 # 03h next free spot in ram
$which 0 # 04h most recently looked up header
# 05h - 0Ah : 6 cells to hold device metadata
# ------------------------------------------------
$mem 0 # 05h "memory" in retro. size of ram
$fb 0 # 06h do we have graphics? (short for framebuffer)
$fw 0 # 07h framebuffer width
$fh 0 # 08h framebuffer height
$cw 0 # 09h console width
$ch 0 # 0Ah console height
# 0Bh - 2Ah : 32 cells for asm dictionary + end mark
# ------------------------------------------------
# These are 4-byte strings, which fit in a single
# cell and can thus be treated as integers as far
# as we are concerned.
$ops
'NOP ' 'LIT:' 'DUP ' 'DROP' 'SWAP' # 0Bh - 0Fh
'PUSH' 'POP ' 'NXT:' 'JMP:' 'RET ' # 10h - 14h
'JLT:' 'JGT:' 'JNE:' 'JEQ:' 'GET ' # 15h - 19h
'PUT ' 'ADD ' 'SUB ' 'MUL ' 'DVM ' # 1Ah - 1Eh
'AND ' 'OR ' 'XOR ' 'SHL ' 'SHR ' # 1Fh - 23h
'ZRET' 'INC ' 'DEC ' 'IN ' 'OUT ' # 24h - 28h
'WAIT' # 29h
0 # 2Ah : mark the end of opcode, start of macros
# goal is to be able to let the assembler use a second
# vm to simplify the bootstrapping process.
#
# asm should be able to control this directly.
# possibly:
# - look the word up in the dictionary
# - output the word to stdout (so the assembler can read it)
# use cases for words:
# 1. add a new 4-char word to the dictionary.
# 2. compile a call to a word
# 3. execute a word immediately
# 4. for macros, compile a call to execute the word later
# 5. distingush words from numbers
# if we use the high bits on the 3 rightmost characters,
# then we can hold up to 8 colors. can't use the leftmost
# high bit because it's used to indicate negative numbers
# so, if token is in $7F000000 .. $7FFFFFFF then it's a
# word else it's a 24-bit literal.
:find
LIT: ops
DUP at
DUP JEQ:
NXT: find
# BIOS
# ------------------------------------------------
:here LIT: heap GET RET
:.new
:.def
:.for here RET
:wait LIT: 0 DUP OUT WAIT RET
:putc LIT: 1 LIT: 2 OUT wait RET
:getc LIT: 1 DUP DUP OUT wait IN RET
:here LIT: heap GET RET
:over PUSH DUP POP SWAP RET
:word # ( n -> d c b a ) convert word to 4 chars
LIT: 4 DUP
NXT: word
RET
# LIT: 16777216 # 1 << 24
# 4 .for 256 DIVM .nxt DROP
# 1684234849 should be abcd
:main
# ------------------------------------------------
# Placeholder for the instruction to execute.
# This is similar to revectoring in retro: we leave
# enough room to compile a 'jmp: xxx' instruction.
# (Or any other instruction.)
$ins0 NOP $ins1 NOP
# prompt for input
# ------------------------------------------------
# We have to be careful to (eventually) return to
# this loop no matter what happens, after executing
# the above instruction(s). For primitives, we just
# fall through.
98 putc 52 putc 62 putc 32 putc # "b4> "
# read instruction
# --------------------------------------------------
# Fetch the next character from the input device.
# Normally, this is a single ascii character from the
# keyboard, but it can actually be any 32 bit binary
# sequence.
#
# For bootstrapping, we will use this fact to transmit
# up to 4 characters at once.
#
# This operation will cause the VM to block (pause)
# until the instruction is received.
getc
#### scraps ####
# ------------------------------------------------
# Now that we're back, we will zero out those slots.
# Retro's normal behavior is to block.
LIT: 0 LIT: ins0 PUT
LIT: 0 LIT: ins1 PUT
# However, since we are bootstrapping, we don't yet know where to jump.
# One approach would be to hand-assemble this code on graph paper
# or something so we could pre-calculate all the jumps, but this is
# cumbersome and would have to be re-done every time we change our
# code.
#
# Instead, we are going to provide a way for the system to self-program.
#
# The trick is to start the image with an infinite loop, and leave
# the first two cells blank. The first time the loop runs, these will
# be no-ops, but somewhere inside the loop, we will overwrite them
# with other instructions. At the end of the loop, we will jump back
# to cell 0 and the instructions will be executed.
# this control loop cannot call routines via the return stack because
# we want to be able to use and inspect the return stack from outside.
# therefore, we can't use anything but goto statements (direct jumps).
# so as not to screw up the control stack.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment