@dafrito
Created August 7, 2011 10:15
m0, mole thoughts
=pod
This is extremely extremely draft-ish. It is not intended for human consumption.
Register manipulation is the only means of working with data.
CPS is the only method of flow control.
Pointers and memory allocation are considered foreign, and their use should be
obvious.
=head2 Assumptions
Mole's purpose is to provide access to the underlying VM in a
programmer-friendly manner. It is intended to be written by programmers. It
should provide affordances to the programmer via syntax and semantics to
separate efficient operations from those that are inefficient.
While Mole is intended to be low-level, it should not necessarily be difficult
to write or read - Mole need not be hostile to developers.
Please ignore the exact choice of syntax and opcode in the following examples. I
provided the examples solely to reinforce concepts; they do not imply my
personal approval for their specific behavior.
=head2 Principle of Least Surprise
Mole syntax and semantics should mirror the behavior of the VM. Other languages
mask relative differences in performance by making all operations appear
similar. This encourages the developer to forget performance concerns and focus
instead on readability and design. However, Mole's purpose as a low-level
language means that performance is inseparable from behavior. As a result, Mole
should emphasize differences in performance visually and semantically.
Inefficient operations should be obvious in code. B<Operations should appear
consistent if and only if their behavior and performance are consistent>. In
short, fast operations should feel fast, while slow operations should feel slow.
To illustrate this point, assume in the following snippet that "read_hd" reads
some data from disk:
    add_i 2 3
    mul_i 2 2
    read_hd 2 # Throw speed out the window here.
    inc_i 2
There's no reason why read_hd should appear similar to the math opcodes. read_hd
is likely to be orders of magnitude slower than the other opcodes, but the
syntax does nothing to indicate this monumental difference in behavior. In fact,
the syntax implies that read_hd is "just another instruction". Mole should take
care to point out these lies.
Instead, slow operations should be made distinct in some form. I would go so far
as to give "slow" operations a different syntax than opcodes:
    add_i 2 3
    mul_i 2 2
    read_hd(2)
    inc_i 2
Observe that the slow operation now stands out visually from the opcodes. It no
longer aligns with the faster opcodes, and it uses parentheses to further
emphasize that it's a different beast. While this code is uglier than the first
example, it's much more useful to developers and maintainers.
Another benefit of this syntax choice is that the relative level of performance
is embedded in the syntax. When every operation looks alike, developers who are
not aware of the performance implications behind the read_hd opcode must find
out "the hard way." If these differences are explicit, however, developers can
assess the performance merely by understanding the convention.
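A convention like this can also be checked by a tool. The following Python
sketch is only an illustration and is not part of Mole. It assumes a
hypothetical rule, taken from the shape of the examples above, that fast
opcodes are written as a bare name followed by numeric operands and that slow
operations are written with parentheses. The names classify and slow_lines are
invented for this sketch.

```python
import re

# Assumed shapes, inferred from the examples (not a real Mole grammar):
#   fast opcode:    name 2 3
#   slow operation: name(2)
FAST = re.compile(r"^[a-z_]+(\s+\d+)+$")
SLOW = re.compile(r"^[a-z_]+\(\s*\d+(\s*,\s*\d+)*\s*\)$")

def classify(line):
    """Return 'fast', 'slow', or 'unknown' based only on the line's shape."""
    code = line.split("#", 1)[0].strip()  # drop trailing comments
    if FAST.match(code):
        return "fast"
    if SLOW.match(code):
        return "slow"
    return "unknown"

def slow_lines(source):
    """Return (line_number, text) for every slow operation in the source."""
    return [
        (number, text.strip())
        for number, text in enumerate(source.splitlines(), start=1)
        if classify(text) == "slow"
    ]
```

Because the cost is carried by the syntax, a reviewer or a script can list
every slow operation in a routine without knowing what any individual
operation does.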
=head3 Opcodes
The fastest bits of code are likely to be those that perform register
manipulation, such as mathematical operations, bitwise operations, and so forth.
To indicate their lightness, these instructions should be quick to write and
appear uniform.
Individual opcodes should be terse, but they need not conform to an exact
length. To allow easy alignment, no quick opcode should be longer than 7
characters.
    # Just right
    add_i 2 3
    sub_i 2 4
    mul_i 2 4
    div_i 2 4
    mod_i 2 4
    i_to_n 1 2
    i_to_p 1 2
In the above example, instructions and parameters can be easily aligned on the
page. It's ultimately the developer's choice to do this, but the convention is
impractical if opcodes vary dramatically in length.
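As a minimal sketch of how the length limit and column alignment might be
applied, here is a small Python helper, written purely for illustration. The
limit of 7 comes from the text above. The function names and the single-space
gap between columns are assumptions of this sketch.

```python
MAX_OPCODE_LEN = 7  # limit for quick opcodes, taken from the text above

def too_long(opcodes):
    """Return the opcodes that exceed the length limit."""
    return [op for op in opcodes if len(op) > MAX_OPCODE_LEN]

def align(lines):
    """Pad each opcode so that operands start in the same column."""
    rows = [line.split() for line in lines if line.strip()]
    width = max((len(row[0]) for row in rows), default=0)
    return [
        (row[0].ljust(width) + " " + " ".join(row[1:])).rstrip()
        for row in rows
    ]
```

With short opcodes of similar length the padding stays small. With opcodes of
widely varying length the operand column drifts far to the right, which is the
impracticality described above.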
Keep opcodes terse, and use common conventions where applicable. The following
opcodes are too long, and they use "num" instead of "n" to refer to a primitive
type:
    # Too long - these are some of the most efficient operations we can do!
    add_num 2 4
    sub_num 2 3
    div_num 2 4
Don't try to be too clever when naming opcodes. The following example sacrifices
too much readability for terseness.
    # Too short to be useful. While their speed is evident, it's not clear what
    # these operations do.
    adi 2 4
    sbi 2 3
    dvi 2 4
    mdi 2 4
Opcodes rarely exist alone; they're often provided as a set of related
operations. Related opcodes should appear related, and length symmetry is even
more important:
    # Don't needlessly give up symmetry, like we did here.
    add_i 2 3
    multi_i 2 4
    divi_i 2 3
    push_i 2 4
    ipop 3
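One possible way to make "symmetry" concrete is sketched below in Python. It is
an assumption of this sketch, not a Mole rule, that a family of related opcodes
is symmetric when every member has a mnemonic of the same length and the same
use of a one-letter type suffix such as "_i". The names shape, is_symmetric,
and odd_ones_out are invented for illustration.

```python
def shape(opcode):
    """Return (mnemonic length, has a one-letter type suffix) for an opcode."""
    mnemonic, sep, suffix = opcode.rpartition("_")
    if sep and len(suffix) == 1:
        return (len(mnemonic), True)
    return (len(opcode), False)

def is_symmetric(family):
    """True when every opcode in the family has the same shape."""
    return len({shape(op) for op in family}) <= 1

def odd_ones_out(family):
    """Return opcodes whose shape differs from the first opcode's shape."""
    if not family:
        return []
    expected = shape(family[0])
    return [op for op in family if shape(op) != expected]
```

Under this definition a family such as add_i, sub_i, and mul_i passes, while a
pair such as push_i and ipop fails because only one of them uses the type
suffix.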
=head3 Memory Manipulation
Conversely, memory operations are likely to be much slower than register
operations, so operations that manipulate memory should look distinct from
quicker operations. I/O operations are included in this category, as they are
also "slow".
As a corollary to the above, the use of pointers precludes many optimizations,
as constraints on their use are generally unenforceable. Pointer use should be
distinct from register use to indicate its undesirability.
=cut
vim: set tw=80 :