- 16 registers -- 8 might be sufficient with rotating registers
- 24-bit addressing = 16MW memory max
- all code addresses relative (PIC)
- memory protection table: readable, writable, executable
- no "status" register: instead, individual flags that are set by comparisons, tested by conditional instructions, and saved (pushed) during a call
- rotating register space and spill, instead of a stack
- real register space of (64?) registers treated as a ring
- accessible registers are mapped to a point on that ring
- subroutine prolog (optional):
- indicates # of arguments + locals
- rotates register names by (previous) # of args + locals + 2, in effect saving them
- ensures that there are (# args + locals + 8) clean registers ready at r0+
- if there aren't enough unused registers to rotate in, "spills" some to spill space
- if spill space is full, trigger a fault
- assumes any subroutine arguments now exist in r0 and up
- copies LR and flags into 2 registers after locals
- subroutine epilog (optional):
- copies LR and flags out of frame registers
- un-rotates registers by the amount saved in the frame (previous # of args + locals + 2)
- un-spills if necessary
- "call":
- copies PC -> LR, then jumps to address
- "ret":
- copies LR -> PC
- "branch":
- updates PC by relative offset (16 bit)
- "long jump":
- set CS from a register, then branch
- idea: backward branches of < 16 words are fast (keep a rotating buffer that caches the last 16 instructions)
 frame 1                            (also: frame 2)
 3 args         2 locals  saved     2 args
+----+----+----+----+----+----+----+----+----+--
| r0 | r1 | r2 | r3 | r4 |  frame  | x0 | x1 | ...
+----+----+----+----+----+----+----+----+----+--
- 2 word frame
- flags word: 4 bits of flags (f0 - f3), 3 bits of # args + locals (0 - 7)
- LR
- 64 total rotating
- 8 args + locals (r0 - r7)
- 8 "scratch" out-args (x0 - x7)
- 4 flags (f0 - f3)
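
The rotation described above can be sketched as a toy model. This is only an illustration under several assumptions that the notes do not settle: the class and method names are hypothetical, spilling is not modelled, the flags are omitted, and the second frame word is assumed to hold the caller's # of args + locals so that the epilog knows how far to un-rotate.

```python
RING_SIZE = 64    # "64 total rotating" physical registers
FRAME_WORDS = 2   # LR word + flags/count word, stored after the locals


class RegisterRing:
    """Toy model of the rotating register names (no spill, no flags)."""

    def __init__(self):
        self.phys = [0] * RING_SIZE
        self.ro = 0      # RO: offset of the current frame within the ring
        self.count = 0   # args + locals of the current frame

    def _index(self, logical):
        # Logical register r<n> maps to physical slot (RO + n) mod 64.
        return (self.ro + logical) % RING_SIZE

    def read(self, logical):
        return self.phys[self._index(logical)]

    def write(self, logical, value):
        self.phys[self._index(logical)] = value

    def read_x(self, n):
        # x0.. sit right after the frame words of the current frame.
        return self.read(self.count + FRAME_WORDS + n)

    def write_x(self, n, value):
        self.write(self.count + FRAME_WORDS + n, value)

    def enter(self, count, lr):
        """Prolog: rotate by the previous count + 2, then save LR and count."""
        previous = self.count
        self.ro = (self.ro + previous + FRAME_WORDS) % RING_SIZE
        self.count = count
        self.write(count, lr)            # frame word 0: LR
        self.write(count + 1, previous)  # frame word 1: assumed caller count

    def leave(self):
        """Epilog: reload LR and the caller's count, then un-rotate."""
        lr = self.read(self.count)
        previous = self.read(self.count + 1)
        self.ro = (self.ro - (previous + FRAME_WORDS)) % RING_SIZE
        self.count = previous
        return lr
```

With this mapping, values a caller writes to x0 and x1 show up as r0 and r1 in the callee after the prolog, which is the "in effect saving them" behaviour of the rotation.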
- PC: current execution address within CS (16 bit)
- LR: return address within CS (16 bit)
- RO: offset (within 64 total) of the current register frame
- FL: 4 on/off bits
- SS: spill segment
- CS: code segment (for PC)
- DS: data segment (16 bits << 8)
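
A minimal sketch of how a segment register and a 16-bit offset might combine into a 24-bit word address. The function name is hypothetical, and wrapping at 24 bits (rather than faulting) is an assumption; the notes only give the "16 bits << 8" shift.

```python
ADDRESS_BITS = 24   # 24-bit addressing = 16MW
SEGMENT_SHIFT = 8   # segment registers hold (address >> 8)


def physical_address(segment, offset):
    """Combine a 16-bit segment value with a 16-bit offset (e.g. CS and PC)."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits")
    # Assumption: the sum wraps at 24 bits instead of raising a fault.
    return ((segment << SEGMENT_SHIFT) + offset) & ((1 << ADDRESS_BITS) - 1)
```

For example, segment 0x0001 with offset 0x0010 gives word address 0x000110, since segments start on 256-word boundaries.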
- each entry is 2 words:
- 16 bit segment (address << 8)
- 12 bit length in 1KW pages (<< 10) -- 0 = 2**12
- 4 bits of protection: undefined, executable, writable, readable
- smallest protectable region is one 10-bit page (1024 words)
- largest is 2**22 = 4MW
- to cover all of memory: 4 entries, 4MW each = 16MW (24 bits)
- maximum table size: 1 page, or 512 entries (might cover only 512KW)
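
The entry layout and the size arithmetic above can be checked with a short sketch. The names are hypothetical, and two details are assumptions: that the second word holds the 12-bit length in its upper bits and the protection in its low 4 bits, and that the protection bits are ordered undefined, executable, writable, readable from high to low.

```python
from dataclasses import dataclass

SEGMENT_SHIFT = 8   # 16-bit segment field is (address >> 8)
PAGE_SHIFT = 10     # 1KW pages

# Assumed bit order within the 4 protection bits (high bit is "undefined").
READ, WRITE, EXECUTE = 0b0001, 0b0010, 0b0100


@dataclass(frozen=True)
class Region:
    base: int        # first word address of the region
    length: int      # length in words
    protection: int  # 4 protection bits


def decode_entry(word0, word1):
    """Decode one 2-word table entry into a Region."""
    base = (word0 & 0xFFFF) << SEGMENT_SHIFT
    pages = (word1 >> 4) & 0xFFF
    if pages == 0:
        pages = 1 << 12  # a length field of 0 means 2**12 pages
    return Region(base, pages << PAGE_SHIFT, word1 & 0xF)


def allows(table, address, access):
    """True if some region contains the address and grants every access bit."""
    return any(
        r.base <= address < r.base + r.length
        and (r.protection & access) == access
        for r in table
    )
```

Four entries with a length field of 0 and bases 0x0000, 0x4000, 0x8000, 0xC000 each decode to 4MW, which together span the full 16MW.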
- "syscall" instruction should switch to a system table
- also faults (treat like syscall)
- entry
- current R, frame, and all 8 X are spilled into current SS
- new SS, CS, DS, XS loaded -- old ones stashed in X4-X7
- LR and flags preserved in frame
- PC -> LR
- newly-created frame is treated as if it had 8 locals & 8 scratch, all blank (including flags), all dirty
- exit
- unspilled registers are not saved
- LR -> PC
- LR and flags loaded from frame
- X4-X7 loaded into segment registers
- registers and frame (LR, flags) are "unspilled" from SS
- int vector must contain:
- new CS (DS, XS copied to be the same)
- new SS
- new address
- N: inline const, 1 - 32 (5 bits)
- J: jump offset, +/- 256 (9 bits)
- RX: any register (4 bits)
- R0-R7: 0nnn
- X0-X7: 1nnn
- S: segment register (2 bits)
- CS, DS, SS, XS
- F: flag F0-F3 (2 bits)
- C: conditional on (not?) F0-F3 (3 bits)
- L: load (1) or store (0)
- B: binary op (4 bits)
- add
- sub
- shl - shift left
- shr - shift right
- shrs - shift right with sign extension
- rotl - rotate left
- or
- and
- xor (7 more)
- T: test op (3 bits)
- cmpz: compare -> zero (=)
- cmpn: compare -> negative (< or >)
- test: a & b != 0
- addc: add with carry
- subc: subtract with carry (3 more)
- Q: 3-register op (3 bits)
- add
- mul (high into X7)
- div (remainder into X7)
- smul (signed)
- sdiv (signed)
- addmul (X1 += X2 * X3, high into X7) (2 more)
-
- copy between registers
- a. ( 8) RX <- RX
- b. (11) RX <- RX if C
- c. (11) RX <-> S:[RX]
- d. (11) RX <-> S:[RX++]
- e. ( 9) RX <- N
- f. ( 4) RX <- imm
- g. ( 7) RX <-> S
-
- binary operations
- a. (12) B RX, RX
- b. (13) B RX, N
- c. ( 8) B RX, imm
- d. (13) T F, RX, RX
-
- three-register operations
- a. (12) Q R, R, R
-
- branch
- a. (12) br C, J
- b. ( 9) br J
- c. ( 2) jump S:imm
- d. ( 9) loop J - x0--, if x0 > 0 then br J
-
- i/o
- a. ( 9) in RX, n
- b. ( 8) in RX, RX
- c. ( 9) out RX, n
- d. ( 8) out RX, RX
- e. ( 5) int N
- f. ( 4) int RX
-
- one-register / tiny params
- a. ( 4) int RX
- b. ( 4) setmpt RX, imm - set memory protection table (length in RX)
- c. ( 3) set/clear F
- d. ( 3) locals # (1 - 8)
-
- no params
- a. call - imm
- b. ret
- c. retl - discard locals
- d. halt
-
- 010 rrrr + 9
- a. 11 000 rrrr
- b. 10 ccc rrrr
- c. 00 L ss rrrr
- d. 01 L ss rrrr
- e. 11 11 nnnnn
- f. 11 1111 111 - imm
- g. 11 0010 L ss
-
- 1 rrrr + 11
- a. 0 bbbb 00 rrrr
- b. 0 bbbb 1 nnnnn
- c. 0 bbbb 011111 - imm
- d. 1 ttt 0 ff rrrr
-
- 0011 + 12
- a. qqq rrr rrr rrr
-
- 011 + 13
- a. 0 ccc jjjj jjjj j
- b. 1 000 jjjj jjjj j
- c. 1 111 1111 111 ss - imm
- d. 1 001 jjjj jjjj j
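
As a worked example of the branch layout, here is a sketch that packs and unpacks form a. (br C, J): prefix 011, a 0 bit, 3 condition bits, 9 offset bits. The function names are hypothetical, and treating J as 9-bit two's complement (-256 to +255) is an assumption; the notes only say "+/- 256".

```python
BRANCH_PREFIX = 0b011  # top 3 bits of every branch-group word


def encode_br_cond(cond, offset):
    """Pack 'br C, J' as 011 0 ccc jjjjjjjjj."""
    if not 0 <= cond <= 7:
        raise ValueError("condition must fit in 3 bits")
    if not -256 <= offset <= 255:
        raise ValueError("offset must fit in 9 signed bits")
    return (BRANCH_PREFIX << 13) | (cond << 9) | (offset & 0x1FF)


def decode_br_cond(word):
    """Unpack a word produced by encode_br_cond into (cond, offset)."""
    if (word >> 12) != 0b0110:
        raise ValueError("not a conditional branch")
    cond = (word >> 9) & 0b111
    raw = word & 0x1FF
    # Sign-extend the 9-bit field (assumed two's complement).
    offset = raw - 0x200 if raw & 0x100 else raw
    return cond, offset
```

Under these assumptions, condition 5 with an offset of -1 encodes to 0x6BFF and decodes back to (5, -1).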
-
- 00101 + 11
- a. 10 nnnnn rrrr
- b. 010 rrrr rrrr
- c. 11 nnnnn rrrr
- d. 011 rrrr rrrr
- e. 001 111 nnnnn
- f. 001 0000 rrrr
-
- 0010 0111 + 8
- a. 0000 rrrr
- b. 0001 rrrr
- c. 00100 fff
- d. 00101 xxx
-
- 0010 1111 1111 + 4
- in order (a = 0000, d = 0011)
- ????. 0000
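
The group prefixes listed above can be sanity-checked with a small classifier sketch. The names are hypothetical. Because the no-params prefix (0010 1111 1111) also begins with the i/o prefix (00101), the sketch assumes the longest prefix wins; the notes do not say how that overlap is resolved.

```python
# (group name, prefix bits, payload width), ordered longest prefix first.
GROUPS = [
    ("no params", "001011111111", 4),
    ("one-register / tiny params", "00100111", 8),
    ("i/o", "00101", 11),
    ("three-register operations", "0011", 12),
    ("copy between registers", "010", 9),
    ("branch", "011", 13),
    ("binary operations", "1", 11),
]


def classify(word):
    """Return the instruction group of a 16-bit word, or None if unassigned."""
    bits = format(word & 0xFFFF, "016b")
    for name, prefix, _payload in GROUPS:
        if bits.startswith(prefix):
            return name
    return None


def widths_ok():
    """Check that prefix plus payload (plus a register field) fills 16 bits."""
    # The copy and binary groups carry an extra 4-bit rrrr field after the prefix.
    extra = {"copy between registers": 4, "binary operations": 4}
    return all(
        len(prefix) + extra.get(name, 0) + payload == 16
        for name, prefix, payload in GROUPS
    )
```

For example, 0x2FF0 classifies as "no params" while 0x2800 classifies as "i/o", and words starting with 000 are left unassigned.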