Last active
July 28, 2016 22:43
First pass for a template bytecode / vm
{fn (p1 p2)
  {def c {fn (p3 p4)
    {* {+ p1 p3} {- p4 p2}}
  }}
  {c 1 2}
}
fns executed immediately in the scope that creates them do not need a special closure style;
using the stack works just fine.
also, fns which access nothing external, only params and locals,
can use the stack without any issues.
any function which reads from an identifier not local to the function itself
AND is not called inside the scope it is created in
(either called in a different scope OR passed as a parameter)
must use a closure.
closures are two pointers: one to the fn, one to the environment that gets created
inside the function that originated the closure.
a caller would have to know whether it is calling a closure or not.
to simplify, could always use the fat/double ptr,
even for standard function ptrs; they just have a NULL env ptr.
an example is a subcomponent being called with a function: how does it know if it's a closure or not?
it can't, and it shouldn't have to deal with the env ptr either. the env should just be part of the fn ptr.
fn ptrs will be double-wides:
  higher half: the env ptr (could be 0/null for non-closures or improperly called closures)
  lower half: the address of the function
debate: which way should it be? either way could be fine...
acc would have to fit an entire fnptr in order to operate on it...
unless an fnptr was really like a tuple internally;
then acc is just a pointer to the fnptr (probably on the stack).
in that case, it doesn't matter what the fnptr is.
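A minimal Python sketch of the "always use the fat ptr" idea, assuming the fnptr is modelled as a two-field tuple. The names `FnPtr`, `call`, `double`, `add_captured` and `make_adder` are hypothetical and only illustrate the point that the caller never needs to know whether it holds a closure:

```python
from typing import Callable, NamedTuple, Optional


class FnPtr(NamedTuple):
    """Hypothetical double-wide fn ptr: code address plus env ptr."""
    addr: Callable          # stands in for the function's address
    env: Optional[list]     # None plays the role of the NULL env ptr


def call(fnptr, *args):
    # The caller treats every fnptr the same way: it hands over the env
    # without inspecting it, so closures and plain fns look identical.
    return fnptr.addr(fnptr.env, *args)


def double(env, x):
    # A plain function: never touches env.
    return x * 2


def add_captured(env, x):
    # A closure body: reads a value captured in env.
    return env[0] + x


def make_adder(n):
    # The originating function creates the env and pairs it with the code.
    return FnPtr(add_captured, [n])


plain = FnPtr(double, None)
closure = make_adder(10)
print(call(plain, 4))
print(call(closure, 4))
```

The cost of this design is one wasted slot per plain function; the benefit is a single calling sequence for everything.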
still TODO:
  strings (string table with addresses?)
  linker? (Lisplate has no include per se, only passing execution to a subcomponent)
  reading from arrays
  reading from objects/associative arrays
  handling {data::value}
  handling {data::obj.path.to.value}
    or is it {get value {get to {get path obj}}}
    maybe internally, but really the dot notation is easier to work with. compiler could translate
  calling external functions in data, viewmodel, strings, or internal/runtime
  lookups (find {value} in some namespace)
  writing to the chunk to be returned
  handling async
strings table:
  each entry starts with the length, followed by the string itself
  a pointer/address in the code points to the length
storing strings at the end, after all the instructions:
  the downside? requires a step to go through and assign all addresses,
  because code gen cannot know the address of strings until after all code is generated
  or, string addresses are just table[offset] in the code; then only the table address needs to be known
storing strings at the beginning:
  a bytecode file will have to specify where it begins execution (skip the data/tables)
  doesn't need a linker step
  location of a string can be determined during the compile process
  if there were ever other tables, this doesn't pan out as nicely
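A rough Python sketch of the table[offset] variant with the table stored after the instructions. It assumes a one-byte length prefix and UTF-8 text, neither of which the notes specify; `build_string_table` and `read_string` are hypothetical helper names:

```python
def build_string_table(strings):
    """Pack strings as [length byte][bytes...] and record each offset."""
    table = bytearray()
    offsets = {}
    for s in strings:
        if s in offsets:
            continue                      # store each distinct string once
        data = s.encode("utf-8")
        if len(data) > 255:
            raise ValueError("one-byte length prefix is an assumption here")
        offsets[s] = len(table)           # offset points at the length byte
        table.append(len(data))
        table += data
    return bytes(table), offsets


def read_string(image, table_base, offset):
    """Resolve table[offset]: only the table's base address must be known."""
    start = table_base + offset
    length = image[start]
    return image[start + 1:start + 1 + length].decode("utf-8")


code = bytes([0x01, 0x02, 0x03])          # placeholder instruction bytes
table, offsets = build_string_table(["hello", "hi"])
image = code + table                      # table stored after the code
table_base = len(code)
print(read_string(image, table_base, offsets["hi"]))
```

Because the code only ever embeds offsets, code generation can finish before the table's final address is known.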
registers:
  acc (accumulator)
  sp (stack pointer)
  bp (base pointer)
  f (flags: carry, overflow, negative, zero)
  ep (env pointer?)
  ip (instruction pointer)
access modes:
reg
  access data in the register
reg[+-#]
  access indexed, assuming the register is a ptr to a list
const_literal
  just a literal number
address
  byte offset in the bytecode of where something is
address+#
  assumes an array starts at address, then just adds # to the address
address[+-#]
  could address be a pointer to somewhere else?
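As an illustration of the register-based modes above, here is a small Python sketch that reads an operand written in the notes' textual form. It covers only reg, reg[+-#] and const_literal; the address forms are left out because the notes do not yet fix a syntax that tells an address apart from a const. `read_operand`, `regs` and `memory` are hypothetical names, and memory is indexed in "units":

```python
import re

REGISTERS = ("acc", "sp", "bp", "f", "ep", "ip")


def read_operand(operand, regs, memory):
    """Return the value an operand refers to, for three of the modes."""
    indexed = re.fullmatch(r"(\w+)\[([+-]\d+)\]", operand)
    if indexed and indexed.group(1) in REGISTERS:
        # reg[+-#]: the register holds a pointer; add the signed index.
        base = regs[indexed.group(1)]
        return memory[base + int(indexed.group(2))]
    if operand in REGISTERS:
        # reg: the data is the register's own content.
        return regs[operand]
    # const_literal: just a literal number.
    return int(operand)


regs = {"acc": 7, "sp": 2, "bp": 4, "f": 0, "ep": 0, "ip": 0}
memory = list(range(100, 110))            # memory[i] == 100 + i
print(read_operand("bp[+3]", regs, memory))
```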
special, assembly-only things:
label
  not really an access mode, not used in the bytecode/VM at all
  just something for readable "assembly"; gets translated to what it should be (address or branch offset)
  for a branch, translates to an offset
  for everything else, translates to an address
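The label rule above could be sketched as a two-pass assembler in Python. Several things here are assumptions rather than part of the notes: the branch mnemonic `br`, one "unit" per instruction, and a branch offset measured from the instruction after the branch. `assemble` is a hypothetical name:

```python
def assemble(lines):
    """Resolve labels: offsets for branches, addresses for everything else."""
    labels = {}
    instrs = []
    # Pass 1: strip comments, record each label's address, collect instructions.
    for line in lines:
        line = line.split(";")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instrs)
            continue
        op, _, rest = line.partition(" ")
        operands = [o.strip() for o in rest.split(",") if o.strip()]
        instrs.append((op, operands))
    # Pass 2: replace label operands now that every address is known.
    resolved = []
    for addr, (op, operands) in enumerate(instrs):
        out = []
        for o in operands:
            if o in labels:
                target = labels[o]
                out.append(target - (addr + 1) if op == "br" else target)
            else:
                out.append(o)
        resolved.append((op, out))
    return resolved


program = [
    "start:",
    "  mov acc, 0",
    "loop:",
    "  add acc, 1   ; comment",
    "  br loop",
    "  call start",
]
for instr in assemble(program):
    print(instr)
```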
instructions
exec label? How to specify the function to be called
  execute an external function
call reg
call reg[]
call address
call address+#
call address[]
  source could be a register, or indexed. cannot be a label or const_literal
  already understands that acc should be a pointer to the fnptr (addr+env)
  sets up the ep, pushes return to stack, jumps
  consequently, the compiler must create fnptrs from just function addresses/labels
  it must also understand that an address must be shifted left ptr-size to become an fnptr
  (that applies if the address ends up in the higher half; with the address in the lower half, no shift is needed)
ret
  pop return address, jump to it
  the return of a function deals with the chunk system
push reg
push reg[]
push address
push address+#
push address[]
push const
pop
pop reg
mov reg, reg
mov reg, reg[]
mov reg, address
mov reg, address+#
mov reg, address[]
mov reg, const
mov reg[], reg
mov reg[], reg[]
mov reg[], address
mov reg[], address+#
mov reg[], address[]
mov reg[], const
in theory, addresses should be immutable, so we should not need these:
~~mov address, reg~~
~~mov address, reg[]~~
~~mov address, address~~
~~mov address, address+#~~
~~mov address, address[]~~
~~mov address, const~~
~~mov address+#, reg~~
~~mov address+#, reg[]~~
~~mov address+#, address~~
~~mov address+#, address+#~~
~~mov address+#, address[]~~
~~mov address+#, const~~
~~mov address[], reg~~
~~mov address[], reg[]~~
~~mov address[], address~~
~~mov address[], address+#~~
~~mov address[], address[]~~
~~mov address[], const~~
add reg, reg
add reg, reg[]
add reg, const
sub reg, reg
sub reg, reg[]
sub reg, const
outs reg[]
outs address
outs address+#
outs address[]
outn reg
outn reg[]
outn address
outn address+#
outn address[]
outn const
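The call and ret descriptions in the instruction list can be sketched in Python as follows. This assumes an fnptr is an (address, env) pair and that the return address is the unit after the call; the `VM` class and its method names are hypothetical. Whether ret should also restore the previous ep is not stated in the notes, so the sketch leaves ep untouched on return:

```python
class VM:
    """Tiny model of the ip/ep registers and the stack, nothing else."""

    def __init__(self):
        self.stack = []
        self.ip = 0
        self.ep = None

    def call(self, fnptr):
        addr, env = fnptr
        self.ep = env                     # sets up the ep
        self.stack.append(self.ip + 1)    # pushes return to stack
        self.ip = addr                    # jumps

    def ret(self):
        self.ip = self.stack.pop()        # pop return address, jump to it


vm = VM()
vm.ip = 5
vm.call((20, ["p1", "p2"]))
print(vm.ip, vm.ep, vm.stack)
vm.ret()
print(vm.ip)
```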
example:
Note: all stack/index references are in "units", not bytes.
Real ASM would use bytes, but a VM may not need to. Plus, "units" are easy to work with for an example.
; stack setup:
; p4              <- bp[+3]
; p3              <- bp[+2]
; return address  <- bp[+1]
; previous_base   <- bp[+0]
; **garbage**     <- sp
c:
; make_stack_frame
push bp
mov bp, sp
; we don't need scratch space
; assumes exec uses the same caller-cleans-up convention as make_env in myfn1
push ep[+1]    ; p2
push bp[+3]    ; p4
exec subtract  ; acc = p4 - p2
add sp, 2      ; drop subtract's args
push acc
push bp[+2]    ; p3
push ep[+0]    ; p1
exec addition  ; acc = p1 + p3
add sp, 2      ; drop addition's args
push acc       ; multiply needs the sum on the stack as well
exec multiply  ; acc = (p1 + p3) * (p4 - p2)
; pop_stack_frame
mov sp, bp
pop bp
ret ; acc is already the value we want
; stack setup:
; p2              <- bp[+3]
; p1              <- bp[+2]
; return address  <- bp[+1]
; previous_base   <- bp[+0]
; __c__env        <- bp[-1]
; __c__addr       <- bp[-2]
; **garbage**     <- sp
myfn1:
; make_stack_frame, need 2 slots: __c__env and __c__addr
push bp
mov bp, sp
sub sp, 2 ; 2 items, because fnptr is 2 items
; creating an env to attach to an fnptr
; create an entry in the env-segment (like a heap) for the env
; sets the values from the stack into the array in the env-segment
; sets the higher half of the fnptr to the ptr to that env array
push bp[+3] ; p2
push bp[+2] ; p1
push 2 ; because we have 2 items in the env
exec make_env ; acc is the ptr to the env in the heap
mov bp[-1], acc
mov bp[-2], c
; stack cleanup for cdecl call
; we make the caller do it, since the caller knows how many params it passed
add sp, 3
push 2
push 1
call bp[-2]
add sp, 2 ; clean up stack: 2 params (ret already popped the return address)
; pop_stack_frame
mov sp, bp
pop bp
; we don't return anything, so no worries on acc
ret
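To check the arithmetic that c is meant to perform, here is a Python sketch that mirrors its body on an explicit stack. It assumes each exec reads its first operand from the top of the stack, leaves the result in acc, and that the caller drops the arguments afterwards; `run_c` and `exec_builtin` are hypothetical names:

```python
import operator

BUILTINS = {
    "subtract": operator.sub,
    "addition": operator.add,
    "multiply": operator.mul,
}


def exec_builtin(name, stack):
    # First operand is the most recently pushed value, second is below it.
    return BUILTINS[name](stack[-1], stack[-2])


def run_c(env, p3, p4):
    """Evaluate {* {+ p1 p3} {- p4 p2}} with env = [p1, p2]."""
    stack = []
    stack.append(env[1])                  # push ep[+1]  (p2)
    stack.append(p4)                      # push bp[+3]  (p4)
    acc = exec_builtin("subtract", stack) # acc = p4 - p2
    del stack[-2:]                        # caller drops the two args
    stack.append(acc)                     # keep the difference for multiply
    stack.append(p3)                      # push bp[+2]  (p3)
    stack.append(env[0])                  # push ep[+0]  (p1)
    acc = exec_builtin("addition", stack) # acc = p1 + p3
    del stack[-2:]
    stack.append(acc)                     # the sum becomes the first operand
    return exec_builtin("multiply", stack)


print(run_c([3, 4], 1, 2))
```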