cotto/criteria

## criteria
Probably the biggest question is what language we want to use.  We'll be
spending a huge amount of time writing whatever it is.  We need some criteria:

 * how easy is the language to learn for people accustomed to C
 * does the language has an object system?  We'll be implementing cmop (6model)
   in it, so the object system needs to be either self-hosting or non-existent
 * how efficiently does the language map to M0?  We want to generate efficient
   M0 and to have a clear idea of what the M0 for a given snippet looks like
 * it shouldn't allow things that don't make sense in M0 (not sure what this means)
 * the language should allow CPS stuff, either directly or indirectly.
   Indirectly is probably easier.
 * need distinction between compile-time and runtime constructs
 * need typed variables
 * something struct-like
 * a way to define M1 ops composed from M0 ops
 * light syntax, easy to implement, no optimizations
 * if we want macros, they need to be way smaller than C's
 * function-like casts will make the language much easier to parse visually
 * int* x, y; means two int*
 * easy to parse
 * question: INSP only or something more fine-grained?
 *

## possibilities
We need some kind of overlay language for M0 that we can use to reimplement
Parrot.  Writing poke_caller was hard and writing a factorial program will be
harder.  Anything less trivial will require an actual compiler or code
generator.  We have the following options:
 1 (PIR/M0) emit M0 from PIR
 2 (nqp/M0) emit M0 from nqp
 3 (winxed/M0) emit M0 from winxed
 4 (new-nqp/M0) write a new compiler targeting M0 using nqp
 5 (new-custom/M0) write a new compiler targeting M0 using a separate toolchain
 6 (steal/retarget) take someone's existing compiler, retarget it to M0

PIR/M0 has the advantage that we'll need to do something similar later anyway.
Being able to translate from PIR to M0 will be necessary if we want to continue
to support PIR, and we do.  I'm not sure if we'll want to replace Parrot's
current C code with PIR, though.  This approach is worth considering.

I don't like the idea of nqp/M0.  nqp is already quite slow and I don't see it
being feasible to get a speed improvement by using it more internally.  It
might be the case that we don't need to generate inefficient code to deal with
lvalue semantics if the translation is well-designed.  There's also the concern
that using nqp almost universally means using a bunch of pir:: garbage, which
would make M0 translation less efficient.  Overall it's a fairly nice language,
but I'm not certain that nqp/M0 is the best way forward.

winxed/M0 is a nice option.  The compiler already exists and has an alternate
version (winxedxx) that targets C++.  Unfortunately winxed isn't designed to
support multiple codegen backends, so we'd have to either refactor codegen into
a separate step (probably slowing down performance) or just fork it and write a
new backend.  The language itself is quite pleasant, but the compiler needs
work.  I'm still not convinced that this is a bad way forward.

new-nqp/M0 brings with it the speed issues of nqp.  nqp is very much designed
to support a highly flexible compilation workflow, so using it to generate M0
is a reasonable approach.  I'm not a fan of the langauge's speed and quirks,
though.  This approach could be made to work but it doesn't sound like the best
approach.

new-custom/M0 is almost included for the sake of completeness.  I'm tired of
writing meta-things and want to get some real work done.  Writing a new
compiler from scratch is decidedly non-lazy.

steal/retarget is a generalization of the winxed/M0 approach.  Instead of
retargeting winxed, we'd take an unrelated compiler for some language (js comes
to mind) and target that at M0 using CPS for control flow.  This has the
advantage that we're not writing (and debugging) a whole new compiler from
scratch, but it depends on us finding an appropriately-licensed compiler for a
suitable language and constructed in a modular fashion.


## syntax
possible syntax for mole ("M0 Overlay LanguagE")
*******
*types*
*******

I propose that we have 5 types; INSP for registers and cs, which is a C-like
string.  (This probably won't be exactly like a C-string, but close enough that
C code can use it if needed.)

registers: I, N, S, P
primitive string: cs


*****************
*constant values*
*****************

This describes what kind of constants can be used in mole code.

int:  [1-9]\d*
float: ...
hex:  0[xX][0-9a-fA-F]+
octal: 0[0-7]+
string: "[...]" (with escapes)


************************
*compile-time constants*
************************


**********************
*working with strings*
**********************

Strings pretend to be 0-indexed.  They actually also store their length and
encoding as the first five values.  The length is stored as a 4B int and the
encoding is stored in one byte, with 3 unused bytes for padding.  The string
for "hello, worlds?" would look as follows in memory:

 0x0             0x4             0x8             0xA             0x10            0x14
---------------------------------------------------------------------------------------------
|0x0|0x0|0x0|0xC|0x0|0x0|0x0|0x1| h | e | l | l | o | , |   | w | o | r | l | d | s | ? |\0 |
---------------------------------------------------------------------------------------------
size             encoding        0x0             0x4             0x8             0xA


*********
*structs*
*********

Structs may be defined as below.  Once a struct is defined, it can be used
wherever any other type can be used.  If a register is of a struct type, it is
assumed to point to a region of memory with the specified layout.  Struct
members are accessed using the '->' notation, as in C.  sizeof() can be used to
determine the number of bytes required by the struct.  This is similar to C,
except that sizeof() is purely a compile-time construct and can not be used to
calculate the length of an array.

struct {
    I int_thingy;
    N n_thingy;
} struct_thingy;

var I quux;

var struct_thingy st;

st = m0::sys_alloc sizeof(struct_thingy);
st->int_thingy = 39292934;
st->n_thingy   = 332.66;


********
*chunks*
********

Chunks are similar to functions.  They have a constants table, a metadata table
and a bytecode segment.  Values can be added to the constants table by
declaring a value with the keyword "const".  Annotations may be added
automatically by the mole compiler and can also be added manually with the .ann
"key" "value" syntax.


chunk main (I a1, I a2, I a3) {
    const I stdout 1;
    const cs hello "ohai.  im in ur m0";

    // annotation for the right file will be added by m1 compiler
    m0::print_i stdout, hello;
    var I i_thingy;
    i_thingy += a3++;
    c::fprintf(stdout, "asdfw %d\n", i_thingy);
    call_chunk "chunk_name", arg_array;
}


*********************
*calling conventions*
*********************

I don't know.  There are a couple options:

1) The first is that all calling conventions need to be dealt with explicitly.
This isn't nearly as bad as it'd be under M0 because of composed ops and it
would allow a very high degree of control without requiring the management of
all the minutae of the calling conventions more than once.

2) The second option is to have a default set of calling conventions that are
used with a simple minimalist syntax, but to allow them to be overridden with
composed ops.

3) The third option is to say that only the builtins can be used for control
flow.  For a very experimental language like mole, this approach is probably
insane.


**************
*composed ops*
**************

mole supports syntax to create composed ops which behave similarly to built-in
M0 ops.  The syntax is similar to chunnks with a few differences.  Composed ops
are declared using the "composed" keyword and do not have return statements.
Any values that the composed op needs to modify should passed as arguments.
Using a return statement in composed op is a syntax error.  Composed ops may
take an arbitrary number of arguments.  Variables may be declared in composed
ops as in functions.  composed ops are similar to inlined functions in C.

composed init_cf(P new_cf, I retpc_label) {
alloc_cf:
    I cf_size = 256;
    I flags = 0;
    new_cf = m0::gc_alloc cf_size, flags;

init_cf_copy:
    new_cf[INTERP] = cf[INTERP];
    new_cf[CHUNK]  = cf[CHUNK];
    new_cf[CONSTS] = cf[CONSTS];
    new_cf[MDS]    = cf[MDS];
    new_cf[BCS]    = cf[BCS];
    new_cf[PCF]    = cf[CF];
    new_cf[CF]     = new_cf;

init_cf_zero:
    new_cf[EH] = 0;
    new_cf[RETPC] = 0;
    new_cf[SPILLCF] = 0;
    RETPC = retpc_label;

init_cf_pc:
    new_cf[PC] = post_set;
    CF = new_cf;

post_set:
}
	Probably the biggest question is what language we want to use. We'll be
	spending a huge amount of time writing whatever it is. We need some criteria:

	* how easy is the language to learn for people accustomed to C
	* does the language has an object system? We'll be implementing cmop (6model)
	in it, so the object system needs to be either self-hosting or non-existent
	* how efficiently does the language map to M0? We want to generate efficient
	M0 and to have a clear idea of what the M0 for a given snippet looks like
	* it shouldn't allow things that don't make sense in M0 (not sure what this means)
	* the language should allow CPS stuff, either directly or indirectly.
	Indirectly is probably easier.
	* need distinction between compile-time and runtime constructs
	* need typed variables
	* something struct-like
	* a way to define M1 ops composed from M0 ops
	* light syntax, easy to implement, no optimizations
	* if we want macros, they need to be way smaller than C's
	* function-like casts will make the language much easier to parse visually
	* int* x, y; means two int*
	* easy to parse
	* question: INSP only or something more fine-grained?
	*
	We need some kind of overlay language for M0 that we can use to reimplement
	Parrot. Writing poke_caller was hard and writing a factorial program will be
	harder. Anything less trivial will require an actual compiler or code
	generator. We have the following options:
	1 (PIR/M0) emit M0 from PIR
	2 (nqp/M0) emit M0 from nqp
	3 (winxed/M0) emit M0 from winxed
	4 (new-nqp/M0) write a new compiler targeting M0 using nqp
	5 (new-custom/M0) write a new compiler targeting M0 using a separate toolchain
	6 (steal/retarget) take someone's existing compiler, retarget it to M0

	PIR/M0 has the advantage that we'll need to do something similar later anyway.
	Being able to translate from PIR to M0 will be necessary if we want to continue
	to support PIR, and we do. I'm not sure if we'll want to replace Parrot's
	current C code with PIR, though. This approach is worth considering.

	I don't like the idea of nqp/M0. nqp is already quite slow and I don't see it
	being feasible to get a speed improvement by using it more internally. It
	might be the case that we don't need to generate inefficient code to deal with
	lvalue semantics if the translation is well-designed. There's also the concern
	that using nqp almost universally means using a bunch of pir:: garbage, which
	would make M0 translation less efficient. Overall it's a fairly nice language,
	but I'm not certain that nqp/M0 is the best way forward.

	winxed/M0 is a nice option. The compiler already exists and has an alternate
	version (winxedxx) that targets C++. Unfortunately winxed isn't designed to
	support multiple codegen backends, so we'd have to either refactor codegen into
	a separate step (probably slowing down performance) or just fork it and write a
	new backend. The language itself is quite pleasant, but the compiler needs
	work. I'm still not convinced that this is a bad way forward.

	new-nqp/M0 brings with it the speed issues of nqp. nqp is very much designed
	to support a highly flexible compilation workflow, so using it to generate M0
	is a reasonable approach. I'm not a fan of the langauge's speed and quirks,
	though. This approach could be made to work but it doesn't sound like the best
	approach.

	new-custom/M0 is almost included for the sake of completeness. I'm tired of
	writing meta-things and want to get some real work done. Writing a new
	compiler from scratch is decidedly non-lazy.

	steal/retarget is a generalization of the winxed/M0 approach. Instead of
	retargeting winxed, we'd take an unrelated compiler for some language (js comes
	to mind) and target that at M0 using CPS for control flow. This has the
	advantage that we're not writing (and debugging) a whole new compiler from
	scratch, but it depends on us finding an appropriately-licensed compiler for a
	suitable language and constructed in a modular fashion.
	possible syntax for mole ("M0 Overlay LanguagE")
	*******
	types
	*******

	I propose that we have 5 types; INSP for registers and cs, which is a C-like
	string. (This probably won't be exactly like a C-string, but close enough that
	C code can use it if needed.)

	registers: I, N, S, P
	primitive string: cs


	*****************
	constant values
	*****************

	This describes what kind of constants can be used in mole code.

	int: [1-9]\d*
	float: ...
	hex: 0[xX][0-9a-fA-F]+
	octal: 0[0-7]+
	string: "[...]" (with escapes)



	************************
	compile-time constants
	************************



	**********************
	working with strings
	**********************

	Strings pretend to be 0-indexed. They actually also store their length and
	encoding as the first five values. The length is stored as a 4B int and the
	encoding is stored in one byte, with 3 unused bytes for padding. The string
	for "hello, worlds?" would look as follows in memory:

	0x0 0x4 0x8 0xA 0x10 0x14
	---------------------------------------------------------------------------------------------
	\|0x0\|0x0\|0x0\|0xC\|0x0\|0x0\|0x0\|0x1\| h \| e \| l \| l \| o \| , \| \| w \| o \| r \| l \| d \| s \| ? \|\0 \|
	---------------------------------------------------------------------------------------------
	size encoding 0x0 0x4 0x8 0xA



	*********
	structs
	*********

	Structs may be defined as below. Once a struct is defined, it can be used
	wherever any other type can be used. If a register is of a struct type, it is
	assumed to point to a region of memory with the specified layout. Struct
	members are accessed using the '->' notation, as in C. sizeof() can be used to
	determine the number of bytes required by the struct. This is similar to C,
	except that sizeof() is purely a compile-time construct and can not be used to
	calculate the length of an array.

	struct {
	I int_thingy;
	N n_thingy;
	} struct_thingy;

	var I quux;

	var struct_thingy st;

	st = m0::sys_alloc sizeof(struct_thingy);
	st->int_thingy = 39292934;
	st->n_thingy = 332.66;


	********
	chunks
	********

	Chunks are similar to functions. They have a constants table, a metadata table
	and a bytecode segment. Values can be added to the constants table by
	declaring a value with the keyword "const". Annotations may be added
	automatically by the mole compiler and can also be added manually with the .ann
	"key" "value" syntax.


	chunk main (I a1, I a2, I a3) {
	const I stdout 1;
	const cs hello "ohai. im in ur m0";

	// annotation for the right file will be added by m1 compiler
	m0::print_i stdout, hello;
	var I i_thingy;
	i_thingy += a3++;
	c::fprintf(stdout, "asdfw %d\n", i_thingy);
	call_chunk "chunk_name", arg_array;
	}



	*********************
	calling conventions
	*********************

	I don't know. There are a couple options:

	1) The first is that all calling conventions need to be dealt with explicitly.
	This isn't nearly as bad as it'd be under M0 because of composed ops and it
	would allow a very high degree of control without requiring the management of
	all the minutae of the calling conventions more than once.

	2) The second option is to have a default set of calling conventions that are
	used with a simple minimalist syntax, but to allow them to be overridden with
	composed ops.

	3) The third option is to say that only the builtins can be used for control
	flow. For a very experimental language like mole, this approach is probably
	insane.


	**************
	composed ops
	**************

	mole supports syntax to create composed ops which behave similarly to built-in
	M0 ops. The syntax is similar to chunnks with a few differences. Composed ops
	are declared using the "composed" keyword and do not have return statements.
	Any values that the composed op needs to modify should passed as arguments.
	Using a return statement in composed op is a syntax error. Composed ops may
	take an arbitrary number of arguments. Variables may be declared in composed
	ops as in functions. composed ops are similar to inlined functions in C.

	composed init_cf(P new_cf, I retpc_label) {
	alloc_cf:
	I cf_size = 256;
	I flags = 0;
	new_cf = m0::gc_alloc cf_size, flags;

	init_cf_copy:
	new_cf[INTERP] = cf[INTERP];
	new_cf[CHUNK] = cf[CHUNK];
	new_cf[CONSTS] = cf[CONSTS];
	new_cf[MDS] = cf[MDS];
	new_cf[BCS] = cf[BCS];
	new_cf[PCF] = cf[CF];
	new_cf[CF] = new_cf;

	init_cf_zero:
	new_cf[EH] = 0;
	new_cf[RETPC] = 0;
	new_cf[SPILLCF] = 0;
	RETPC = retpc_label;

	init_cf_pc:
	new_cf[PC] = post_set;
	CF = new_cf;

	post_set:
	}