Created
March 2, 2013 09:31
-
-
Save tangentstorm/5070301 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# B4 - a bootstrap assembler for Ngaro/RetroVM | |
# -------------------------------------------------------------- | |
# This is a tiny, pure-binary forth interpreter. Input consists | |
# of 4-byte sequences, which can either be ngaro instructions or | |
# space-terminated strings. | |
# Data area. | |
# -------------------------------------------------------------- | |
# When ngaro boots up, the instruction pointer is set to zero, which | |
# means it will execute whatever instruction it finds in the first | |
# cell. | |
# | |
# In ngaro, instructions below 31 are executed directly, and higher | |
# numbers are treated as subroutine calls. This means that you can't | |
# call a procedure that resides before cell 31 without trickery or | |
# special treatment. | |
# | |
# Instead, we are going to use the first part of ram to store our | |
# initial dictionary, and other data the system needs to function. | |
# | |
# To prevent the system from treating this data as code, we therefore | |
# must start the image by jumping over this data. That means we have | |
# to decide up front how big it is. | |
# | |
# We will leave space for a few important variables for compatability | |
# with retroforth, followed by a proto-dictionary that holds just | |
# enough to allow retro to boot into forth. | |
# 00h - 01h : 2 cells for the jump | |
# ------------------------------------------------ | |
JMP: next | |
# 02h - 04h : 3 cells for the "real" dictionary | |
# ------------------------------------------------ | |
$last 0 # 02h most recently defined header | |
$heap 0 # 03h next free spot in ram | |
$which 0 # 04h most recently looked up header | |
# 05h - 0Ah : 6 cells to hold device metadata | |
# ------------------------------------------------ | |
$mem 0 # 05h "memory" in retro. size of ram | |
$fb 0 # 06h do we have graphics? (short for framebuffer) | |
$fw 0 # 07h framebuffer width | |
$fh 0 # 08h framebuffer height | |
$cw 0 # 09h console width | |
$ch 0 # 0Ah console height | |
# 0Bh - 2Ah : 32 cells for asm dictionary + end mark | |
# ------------------------------------------------ | |
# These are 4-byte strings, which fit in a single | |
# cell and can thus be treated as integers as far | |
# as we are concerned. | |
$ops | |
'NOP ' 'LIT:' 'DUP ' 'DROP' 'SWAP' # 0Bh - 0Fh | |
'PUSH' 'POP ' 'NXT:' 'JMP:' 'RET ' # 10h - 14h | |
'JLT:' 'JGT:' 'JNE:' 'JEQ:' 'GET ' # 15h - 19h | |
'PUT ' 'ADD ' 'SUB ' 'MUL ' 'DVM ' # 1Ah - 1Eh | |
'AND ' 'OR ' 'XOR ' 'SHL ' 'SHR ' # 1Fh - 23h | |
'ZRET' 'INC ' 'DEC ' 'IN ' 'OUT ' # 24h - 28h | |
'WAIT' # 29h | |
0 # 2Ah : mark the end of opcode, start of macros | |
# goal is to be able to let the assembler use a second | |
# vm to simplify the bootstrapping process. | |
# | |
# asm should be able to control this directly. | |
# possibly: | |
# - look the word up in the dictionary | |
# - output the word to stdout (so the assembler can read it) | |
# use cases for words: | |
# 1. add a new 4-char word to the dictionary. | |
# 2. compile a call to a word | |
# 3. execute a word immediately | |
# 4. for macros, compile a call to execute the word later | |
# 5. distingush words from numbers | |
# if we use the high bits on the 3 rightmost characters, | |
# then we can hold up to 8 colors. can't use the leftmost | |
# high bit because it's used to indicate negative numbers | |
# so, if token is in $7F000000 .. $7FFFFFFF then it's a | |
# word else it's a 24-bit literal. | |
:find | |
LIT: ops | |
DUP at | |
DUP JEQ: | |
NXT: find | |
# BIOS | |
# ------------------------------------------------ | |
:here LIT: heap GET RET | |
:.new | |
:.def | |
:.for here RET | |
:wait LIT: 0 DUP OUT WAIT RET | |
:putc LIT: 1 LIT: 2 OUT wait RET | |
:getc LIT: 1 DUP DUP OUT wait IN RET | |
:here LIT: heap GET RET | |
:over PUSH DUP POP SWAP RET | |
:word # ( n -> d c b a ) convert word to 4 chars | |
LIT: 4 DUP | |
NXT: word | |
RET | |
# LIT: 16777216 # 1 << 24 | |
# 4 .for 256 DIVM .nxt DROP | |
# 1684234849 should be abcd | |
:main | |
# ------------------------------------------------ | |
# Placeholder for the instruction to execute. | |
# This is similar to revectoring in retro: we leave | |
# enough room to compile a 'jmp: xxx' instruction. | |
# (Or any other instruction.) | |
$ins0 NOP $ins1 NOP | |
# prompt for input | |
# ------------------------------------------------ | |
# We have to be careful to (eventually) return to | |
# this loop no matter what happens, after executing | |
# the above instruction(s). For primitives, we just | |
# fall through. | |
98 putc 52 putc 62 putc 32 putc # "b4> " | |
# read instruction | |
# -------------------------------------------------- | |
# Fetch the next character from the input device. | |
# Normally, this is a single ascii character from the | |
# keyboard, but it can actually be any 32 bit binary | |
# sequence. | |
# | |
# For bootstrapping, we will use this fact to transmit | |
# up to 4 characters at once. | |
# | |
# This operation will cause the VM to block (pause) | |
# until the instruction is received. | |
getc | |
#### scraps #### | |
# ------------------------------------------------ | |
# Now that we're back, we will zero out those slots. | |
# Retro's normal behavior is to block. | |
LIT: 0 LIT: ins0 PUT | |
LIT: 0 LIT: ins1 PUT | |
# However, since we are bootstrapping, we don't yet know where to jump. | |
# One approach would be to hand-assemble this code on graph paper | |
# or something so we could pre-calculate all the jumps, but this is | |
# cumbersome and would have to be re-done every time we change our | |
# code. | |
# | |
# Instead, we are going to provide a way for the system to self-program. | |
# | |
# The trick is to start the image with an infinite loop, and leave | |
# the first two cells blank. The first time the loop runs, these will | |
# be no-ops, but somewhere inside the loop, we will overwrite them | |
# with other instructions. At the end of the loop, we will jump back | |
# to cell 0 and the instructions will be executed. | |
# this control loop cannot call routines via the return stack because | |
# we want to be able to use and inspect the return stack from outside. | |
# therefore, we can't use anything but goto statements (direct jumps). | |
# so as not to screw up the control stack. | |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment