tangentstorm/lower.b4a

## lower.b4a
# B4 - a bootstrap assembler for Ngaro/RetroVM
# --------------------------------------------------------------
# This is a tiny, pure-binary forth interpreter. Input consists
# of 4-byte sequences, which can either be ngaro instructions or
# space-terminated strings.


# Data area.
# --------------------------------------------------------------
# When ngaro boots up, the instruction pointer is set to zero, which
# means it will execute whatever instruction it finds in the first
# cell.
#
# In ngaro, instructions below 31 are executed directly, and higher
# numbers are treated as subroutine calls. This means that you can't
# call a procedure that resides before cell 31 without trickery or
# special treatment.
#
# Instead, we are going to use the first part of ram to store our
# initial dictionary, and other data the system needs to function.
#
# To prevent the system from treating this data as code, we therefore
# must start the image by jumping over this data. That means we have
# to decide up front how big it is.
#
# We will leave space for a few important variables for compatability
# with retroforth, followed by a proto-dictionary that holds just
# enough to allow retro to boot into forth.


# 00h - 01h  : 2 cells for the jump
# ------------------------------------------------
JMP: next


# 02h - 04h  : 3 cells for the "real" dictionary
# ------------------------------------------------
$last   0  # 02h most recently defined header
$heap   0  # 03h next free spot in ram
$which  0  # 04h most recently looked up header


# 05h - 0Ah  : 6 cells to hold device metadata
# ------------------------------------------------
$mem  0  # 05h "memory" in retro. size of ram
$fb   0  # 06h do we have graphics? (short for framebuffer)
$fw   0  # 07h framebuffer width
$fh   0  # 08h framebuffer height
$cw   0  # 09h console width
$ch   0  # 0Ah console height


# 0Bh - 2Ah : 32 cells for asm dictionary + end mark
# ------------------------------------------------
# These are 4-byte strings, which fit in a single
# cell and can thus be treated as integers as far
# as we are concerned.

$ops

'NOP '    'LIT:'    'DUP '    'DROP'    'SWAP' # 0Bh - 0Fh
'PUSH'    'POP '    'NXT:'    'JMP:'    'RET ' # 10h - 14h
'JLT:'    'JGT:'    'JNE:'    'JEQ:'    'GET ' # 15h - 19h
'PUT '    'ADD '    'SUB '    'MUL '    'DVM ' # 1Ah - 1Eh
'AND '    'OR  '    'XOR '    'SHL '    'SHR ' # 1Fh - 23h
'ZRET'    'INC '    'DEC '    'IN  '    'OUT ' # 24h - 28h
'WAIT'                                         # 29h

0 # 2Ah : mark the end of opcode, start of macros

# goal is to be able to let the assembler use a second
# vm to simplify the bootstrapping process.
#
# asm should be able to control this directly.

# possibly:
# - look the word up in the dictionary
# - output the word to stdout (so the assembler can read it)

# use cases for words:
# 1. add a new 4-char word to the dictionary.
# 2. compile a call to a word
# 3. execute a word immediately
# 4. for macros, compile a call to execute the word later
# 5. distingush words from numbers

# if we use the high bits on the 3 rightmost characters,
# then we can hold up to 8 colors. can't use the leftmost
# high bit because it's used to indicate negative numbers
# so, if token is in $7F000000 .. $7FFFFFFF then it's a
# word else it's a 24-bit literal.

:find
  LIT: ops
  DUP at
  DUP  JEQ:
  NXT: find

# BIOS
# ------------------------------------------------

:here LIT: heap GET RET
:.new
:.def

:.for here RET
:wait  LIT: 0  DUP         OUT WAIT  RET
:putc  LIT: 1  LIT: 2      OUT wait  RET
:getc  LIT: 1  DUP     DUP OUT wait  IN RET
:here  LIT: heap GET RET
:over  PUSH DUP POP SWAP RET

:word # ( n -> d c b a ) convert word to 4 chars
  LIT: 4 DUP
  NXT: word
RET


# LIT: 16777216 # 1 << 24
# 4 .for 256 DIVM .nxt DROP
# 1684234849 should be abcd

:main
# ------------------------------------------------
# Placeholder for the instruction to execute.
# This is similar to revectoring in retro: we leave
# enough room to compile a 'jmp: xxx' instruction.
# (Or any other instruction.)

$ins0 NOP     $ins1 NOP


# prompt for input
# ------------------------------------------------
# We have to be careful to (eventually) return to
# this loop no matter what happens, after executing
# the above instruction(s). For primitives, we just
# fall through.

98 putc  52 putc   62 putc  32 putc # "b4> "


# read instruction
# --------------------------------------------------
# Fetch the next character from the input device.
# Normally, this is a single ascii character from the
# keyboard, but it can actually be any 32 bit binary
# sequence.
#
# For bootstrapping, we will use this fact to transmit
# up to 4 characters at once.
#
# This operation will cause the VM to block (pause)
# until the instruction is received.

getc


#### scraps ####

# ------------------------------------------------
# Now that we're back, we will zero out those slots.
# Retro's normal behavior is to block.
LIT: 0 LIT: ins0 PUT
LIT: 0 LIT: ins1 PUT

# However, since we are bootstrapping, we don't yet know where to jump.
# One approach would be to hand-assemble this code on graph paper
# or something so we could pre-calculate all the jumps, but this is
# cumbersome and would have to be re-done every time we change our
# code.

#
# Instead, we are going to provide a way for the system to self-program.
#
# The trick is to start the image with an infinite loop, and leave
# the first two cells blank. The first time the loop runs, these will
# be no-ops, but somewhere inside the loop, we will overwrite them
# with other instructions. At the end of the loop, we will jump back
# to cell 0 and the instructions will be executed.


# this control loop cannot call routines via the return stack because
# we want to be able to use and inspect the return stack from outside.
# therefore, we can't use anything but goto statements (direct jumps).
# so as not to screw up the control stack.
	# B4 - a bootstrap assembler for Ngaro/RetroVM
	# --------------------------------------------------------------
	# This is a tiny, pure-binary forth interpreter. Input consists
	# of 4-byte sequences, which can either be ngaro instructions or
	# space-terminated strings.


	# Data area.
	# --------------------------------------------------------------
	# When ngaro boots up, the instruction pointer is set to zero, which
	# means it will execute whatever instruction it finds in the first
	# cell.
	#
	# In ngaro, instructions below 31 are executed directly, and higher
	# numbers are treated as subroutine calls. This means that you can't
	# call a procedure that resides before cell 31 without trickery or
	# special treatment.
	#
	# Instead, we are going to use the first part of ram to store our
	# initial dictionary, and other data the system needs to function.
	#
	# To prevent the system from treating this data as code, we therefore
	# must start the image by jumping over this data. That means we have
	# to decide up front how big it is.
	#
	# We will leave space for a few important variables for compatability
	# with retroforth, followed by a proto-dictionary that holds just
	# enough to allow retro to boot into forth.


	# 00h - 01h : 2 cells for the jump
	# ------------------------------------------------
	JMP: next


	# 02h - 04h : 3 cells for the "real" dictionary
	# ------------------------------------------------
	$last 0 # 02h most recently defined header
	$heap 0 # 03h next free spot in ram
	$which 0 # 04h most recently looked up header


	# 05h - 0Ah : 6 cells to hold device metadata
	# ------------------------------------------------
	$mem 0 # 05h "memory" in retro. size of ram
	$fb 0 # 06h do we have graphics? (short for framebuffer)
	$fw 0 # 07h framebuffer width
	$fh 0 # 08h framebuffer height
	$cw 0 # 09h console width
	$ch 0 # 0Ah console height


	# 0Bh - 2Ah : 32 cells for asm dictionary + end mark
	# ------------------------------------------------
	# These are 4-byte strings, which fit in a single
	# cell and can thus be treated as integers as far
	# as we are concerned.

	$ops

	'NOP ' 'LIT:' 'DUP ' 'DROP' 'SWAP' # 0Bh - 0Fh
	'PUSH' 'POP ' 'NXT:' 'JMP:' 'RET ' # 10h - 14h
	'JLT:' 'JGT:' 'JNE:' 'JEQ:' 'GET ' # 15h - 19h
	'PUT ' 'ADD ' 'SUB ' 'MUL ' 'DVM ' # 1Ah - 1Eh
	'AND ' 'OR ' 'XOR ' 'SHL ' 'SHR ' # 1Fh - 23h
	'ZRET' 'INC ' 'DEC ' 'IN ' 'OUT ' # 24h - 28h
	'WAIT' # 29h

	0 # 2Ah : mark the end of opcode, start of macros

	# goal is to be able to let the assembler use a second
	# vm to simplify the bootstrapping process.
	#
	# asm should be able to control this directly.

	# possibly:
	# - look the word up in the dictionary
	# - output the word to stdout (so the assembler can read it)

	# use cases for words:
	# 1. add a new 4-char word to the dictionary.
	# 2. compile a call to a word
	# 3. execute a word immediately
	# 4. for macros, compile a call to execute the word later
	# 5. distingush words from numbers

	# if we use the high bits on the 3 rightmost characters,
	# then we can hold up to 8 colors. can't use the leftmost
	# high bit because it's used to indicate negative numbers
	# so, if token is in $7F000000 .. $7FFFFFFF then it's a
	# word else it's a 24-bit literal.

	:find
	LIT: ops
	DUP at
	DUP JEQ:
	NXT: find

	# BIOS
	# ------------------------------------------------

	:here LIT: heap GET RET
	:.new
	:.def

	:.for here RET
	:wait LIT: 0 DUP OUT WAIT RET
	:putc LIT: 1 LIT: 2 OUT wait RET
	:getc LIT: 1 DUP DUP OUT wait IN RET
	:here LIT: heap GET RET
	:over PUSH DUP POP SWAP RET

	:word # ( n -> d c b a ) convert word to 4 chars
	LIT: 4 DUP
	NXT: word
	RET



	# LIT: 16777216 # 1 << 24
	# 4 .for 256 DIVM .nxt DROP
	# 1684234849 should be abcd

	:main
	# ------------------------------------------------
	# Placeholder for the instruction to execute.
	# This is similar to revectoring in retro: we leave
	# enough room to compile a 'jmp: xxx' instruction.
	# (Or any other instruction.)

	$ins0 NOP $ins1 NOP


	# prompt for input
	# ------------------------------------------------
	# We have to be careful to (eventually) return to
	# this loop no matter what happens, after executing
	# the above instruction(s). For primitives, we just
	# fall through.

	98 putc 52 putc 62 putc 32 putc # "b4> "


	# read instruction
	# --------------------------------------------------
	# Fetch the next character from the input device.
	# Normally, this is a single ascii character from the
	# keyboard, but it can actually be any 32 bit binary
	# sequence.
	#
	# For bootstrapping, we will use this fact to transmit
	# up to 4 characters at once.
	#
	# This operation will cause the VM to block (pause)
	# until the instruction is received.

	getc








	#### scraps ####

	# ------------------------------------------------
	# Now that we're back, we will zero out those slots.
	# Retro's normal behavior is to block.
	LIT: 0 LIT: ins0 PUT
	LIT: 0 LIT: ins1 PUT

	# However, since we are bootstrapping, we don't yet know where to jump.
	# One approach would be to hand-assemble this code on graph paper
	# or something so we could pre-calculate all the jumps, but this is
	# cumbersome and would have to be re-done every time we change our
	# code.

	#
	# Instead, we are going to provide a way for the system to self-program.
	#
	# The trick is to start the image with an infinite loop, and leave
	# the first two cells blank. The first time the loop runs, these will
	# be no-ops, but somewhere inside the loop, we will overwrite them
	# with other instructions. At the end of the loop, we will jump back
	# to cell 0 and the instructions will be executed.


	# this control loop cannot call routines via the return stack because
	# we want to be able to use and inspect the return stack from outside.
	# therefore, we can't use anything but goto statements (direct jumps).
	# so as not to screw up the control stack.