Created
August 7, 2011 10:15
m0, mole thoughts
=pod

This is extremely draft-ish. It is not intended for human consumption.

Register manipulation is the only means of working with data.

CPS (continuation-passing style) is the only method of flow control.

Pointers and memory allocation are considered foreign, and their use should be
obvious.

=head2 Assumptions

Mole's purpose is to provide access to the underlying VM in a
programmer-friendly manner. It is intended to be written by programmers. It
should provide affordances to the programmer via syntax and semantics to
separate efficient operations from those that are inefficient.

While Mole is intended to be low-level, it should not necessarily be difficult
to write or read - Mole need not be hostile to developers.

Please ignore the exact choice of syntax and opcodes in the following examples.
I provided the examples solely to reinforce concepts; they do not imply my
personal approval of their specific behavior.

=head2 Principle of Least Surprise

Mole syntax and semantics should mirror the behavior of the VM. Other languages
mask relative differences in performance by making all operations appear
similar. This encourages the developer to forget performance concerns and focus
instead on readability and design. However, Mole's purpose as a low-level
language means that performance is inseparable from behavior. As a result, Mole
should emphasize differences in performance visually and semantically.

Inefficient operations should be obvious in code. B<Operations should appear
consistent if and only if their behavior and performance are consistent>. In
short, fast operations should feel fast, while slow operations should feel slow.

To illustrate this point, assume in the following snippet that "read_hd" reads
some data from disk:

    add_i   2 3
    mul_i   2 2
    read_hd 2    # Throw speed out the window here.
    inc_i   2

There's no reason why read_hd should appear similar to the math opcodes. read_hd
is likely to be orders of magnitude slower than the other opcodes, but the
syntax does nothing to indicate this monumental difference in behavior. In fact,
the syntax implies that read_hd is "just another instruction". Mole should take
care to point out these lies.

Instead, slow operations should be made distinct in some form. I would go so far
as to give "slow" operations a different syntax than opcodes:

    add_i   2 3
    mul_i   2 2
    write_to_disk(2)
    inc_i   2

Observe that the slow operation now stands out visually from the opcodes. It no
longer aligns with the faster opcodes, and it uses parentheses to further
emphasize that it's a different beast. While this code is uglier than the first
example, it's much more useful to developers and maintainers.

Another benefit of this syntax choice is that the relative level of performance
is embedded in the syntax. With the first form, developers who are unaware of
the performance implications behind the read_hd opcode must find out "the hard
way." If these differences are explicit, however, developers can assess
performance merely by understanding the convention.

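To make "assessing performance by convention" concrete, here is a minimal
Python sketch. It assumes the purely hypothetical convention from the example
above: quick opcodes are bare C<name operands> lines, while slow operations use
call syntax with parentheses. The function name C<cost_class> and the labels
"fast" and "slow" are my own inventions for illustration; none of this is part
of Mole or the VM.

```python
import re

# Assumed convention: slow operations look like calls, e.g. write_to_disk(2).
SLOW_FORM = re.compile(r"^[A-Za-z_]\w*\(.*\)$")
# Assumed convention: quick opcodes are a bare name followed by operands.
FAST_FORM = re.compile(r"^[A-Za-z_]\w*(\s+\S+)*$")


def cost_class(line):
    """Classify one source line as 'fast' or 'slow' from its syntax alone."""
    code = line.split("#", 1)[0].strip()  # drop any trailing comment
    if not code:
        return None  # blank or comment-only line
    if SLOW_FORM.match(code):
        return "slow"
    if FAST_FORM.match(code):
        return "fast"
    raise ValueError(f"unrecognized syntax: {line!r}")


snippet = """\
add_i   2 3
mul_i   2 2
write_to_disk(2)
inc_i   2
"""

classes = [cost_class(line) for line in snippet.splitlines()]
print(classes)  # ['fast', 'fast', 'slow', 'fast']
```

Note that this sketch reports C<read_hd 2> as "fast". That is exactly the lie
the opcode-style syntax tells: a reader (or a tool) that trusts the convention
can only be as right as the syntax is honest.
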
=head3 Opcodes

The fastest bits of code are likely to be those that perform register
manipulation, such as mathematical operations, bitwise operations, and so forth.
To indicate their lightness, these instructions should be quick to write and
appear uniform.

Individual opcodes should be terse, but they need not conform to an exact
length. To allow easy alignment, no quick opcode should be longer than 7
characters:

    # Just right
    add_i   2 3
    sub_i   2 4
    mul_i   2 4
    div_i   2 4
    mod_i   2 4
    i_to_n  1 2
    i_to_p  1 2

In the above example, instructions and parameters can be easily aligned on the
page. It's ultimately the developer's choice to do this, but the convention is
impractical if opcodes vary dramatically in length.

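The payoff of a length cap can be sketched with a tiny formatter. The following
Python is only an illustration under my own assumptions: C<MAX_QUICK_LEN>
encodes the 7-character limit proposed above, and C<align> is a hypothetical
helper, not a Mole tool. Because every quick opcode fits in one fixed-width
column, alignment is a single pad operation, and an over-long name is rejected
instead of silently breaking the layout.

```python
MAX_QUICK_LEN = 7  # the limit proposed in the text for quick opcodes


def align(lines):
    """Pad each opcode to a fixed column so that operands line up.

    Expects non-empty lines of the form "opcode operand operand ...".
    """
    out = []
    for line in lines:
        opcode, *operands = line.split()
        if len(opcode) > MAX_QUICK_LEN:
            raise ValueError(
                f"{opcode!r} is longer than {MAX_QUICK_LEN} characters"
            )
        # One column of MAX_QUICK_LEN plus a separating space.
        row = opcode.ljust(MAX_QUICK_LEN + 1) + " ".join(operands)
        out.append(row.rstrip())
    return out


for row in align(["add_i 2 3", "i_to_n 1 2", "mod_i 2 4"]):
    print(row)
```

If opcode lengths were unbounded, the column width would have to be recomputed
for every block of code, which is the impracticality described above.
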
Keep opcodes terse, and use common conventions where applicable. The following
opcodes are too long, and they use "num" instead of "n" to refer to a primitive
type:

    # Too long - these are some of the most efficient operations we can do!
    add_num 2 4
    sub_num 2 3
    div_num 2 4

Don't try to be too clever when naming opcodes. The following example sacrifices
too much readability for terseness:

    # Too short to be useful. While their speed is evident, it's not clear what
    # these operations do.
    adi 2 4
    sbi 2 3
    dvi 2 4
    mdi 2 4

Opcodes rarely exist alone; they're often provided as a set of related
operations. Related opcodes should appear related, and length symmetry is even
more important:

    # Don't needlessly give up symmetry, like we did here.
    add_i 2 3
    multi_i 2 4
    divi_i 2 3
    push_i 2 4
    ipop 3

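One possible reading of "symmetry" can be checked mechanically. The Python
sketch below is an assumption on my part, not a rule of Mole: it treats a
family of opcodes as symmetric when every name has the same length and the same
type suffix (the text after the last underscore). C<is_symmetric> is a
hypothetical helper name.

```python
def is_symmetric(names):
    """True if all names share one length and one type suffix.

    The suffix is the text after the last '_'; a name without an
    underscore has no suffix and therefore breaks the symmetry.
    """
    lengths = {len(name) for name in names}
    suffixes = {
        name.rpartition("_")[2] if "_" in name else None for name in names
    }
    return len(lengths) == 1 and len(suffixes) == 1 and None not in suffixes


print(is_symmetric(["add_i", "sub_i", "mul_i", "div_i", "mod_i"]))  # True
print(is_symmetric(["add_i", "multi_i", "divi_i"]))                 # False
print(is_symmetric(["push_i", "ipop"]))                             # False
```

The second family fails on length alone, while the push/pop pair fails on both
length and suffix placement.
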
=head3 Memory Manipulation

Conversely, memory manipulation is likely to be much slower than register
manipulation, so operations that manipulate memory should look distinct from
quicker operations. I/O operations belong in this category, as they are also
"slow".

As a corollary to the above, the use of pointers precludes many optimizations,
as constraints on their use are generally unenforceable. Pointer use should be
distinct from register use to indicate its undesirability.

=cut

vim: set tw=80 :