@o11c
o11c / every-vm-tutorial-you-ever-studied-is-wrong.md
Last active May 8, 2024 10:26
Every VM tutorial you ever studied is wrong (and other compiler/interpreter-related knowledge)

Note: this was originally several Reddit posts, chained and linked. But now that Reddit is dying I've finally moved them out. Sorry about the mess.


URL: https://www.reddit.com/r/ProgrammingLanguages/comments/up206c/stack_machines_for_compilers/i8ikupw/
Summary: stack-based vs register-based in general.

There are a wide variety of machines that can be described as "stack-based" or "register-based", but not all of them are practical. And there are a lot of other decisions that affect that practicality: do variables have names, or only addresses/indexes? Are instructions fixed-width or variable-width? Are you interpreting the bytecode (and if so, are you using machine stack frames?) or turning it into machine code? How many registers are there, and how many are special? How do you represent multiple types of variable? How many scopes are there (various kinds of global, local, member, ...)? How much effort/complexity can you afford to put into your machine? And so on.

  • a pure stack VM can only access the top elemen
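To make the stack/register distinction concrete, here is a minimal sketch of both styles evaluating `a + b * c`. Everything in it (the opcode names, the tuple encoding, `run_stack`, `run_register`) is hypothetical and chosen only for illustration; it is not taken from any particular VM.

```python
# Hypothetical mini-VMs; opcode names and encodings are illustrative only.

def run_stack(code, env):
    """Pure stack machine: operands are implicit, always at the top of the stack."""
    stack = []
    for op, *args in code:
        if op == "push":
            # Push the value of a named variable.
            stack.append(env[args[0]])
        elif op == "add":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(code, regs):
    """Register machine: every instruction names its destination and operands."""
    regs = list(regs)
    for op, dst, a, b in code:
        if op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# a + b * c with a=2, b=3, c=4
stack_code = [("push", "a"), ("push", "b"), ("push", "c"), ("mul",), ("add",)]
register_code = [("mul", 3, 1, 2), ("add", 3, 0, 3)]  # r3 = r1*r2; r3 = r0+r3

stack_result = run_stack(stack_code, {"a": 2, "b": 3, "c": 4})
register_result = run_register(register_code, [2, 3, 4, 0])[3]
```

The stack version needs five small instructions, while the register version needs two wider ones. That is the usual shape of the trade-off, though real designs vary a lot along the other axes listed above.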
# TODO: variants of non-alphanumerics, greek letters, and others
!"#$%&'()*+,-./⓿❶❷❸❹❺❻❼❽❾:;<=>?@🅐🅑🅒🅓🅔🅕🅖🅗🅘🅙🅚🅛🅜🅝🅞🅟🅠🅡🅢🅣🅤🅥🅦🅧🅨🅩[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ # BlackCircle
!"#$%&'()*+,-./0123456789:;<=>?@🅰🅱🅲🅳🅴🅵🅶🅷🅸🅹🅺🅻🅼🅽🅾🅿🆀🆁🆂🆃🆄🆅🆆🆇🆈🆉[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ # BlackSquare
!"#$%&'()*+,-./𝟎𝟏𝟐𝟑𝟒𝟓𝟔𝟕𝟖𝟗:;<=>?@𝐀𝐁𝐂𝐃𝐄𝐅𝐆𝐇𝐈𝐉𝐊𝐋𝐌𝐍𝐎𝐏𝐐𝐑𝐒𝐓𝐔𝐕𝐖𝐗𝐘𝐙[\]^_`𝐚𝐛𝐜𝐝𝐞𝐟𝐠𝐡𝐢𝐣𝐤𝐥𝐦𝐧𝐨𝐩𝐪𝐫𝐬𝐭𝐮𝐯𝐰𝐱𝐲𝐳{|}~ # Bold
!"#$%&'()*+,-./0123456789:;<=>?@𝕬𝕭𝕮𝕯𝕰𝕱𝕲𝕳𝕴𝕵𝕶𝕷𝕸𝕹𝕺𝕻𝕼𝕽𝕾𝕿𝖀𝖁𝖂𝖃𝖄𝖅[\]^_`𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟{|}~ # BoldFraktur

List of intended memory-management policies (note: the actual policies are below, after both kinds of modifiers):

structure modifiers:

  • relative - own address is added to make effective pointer. Useful for realloc and memcpy, as well as shared memory.
  • middle_pointer - actually points to the middle of an object, with a known way to find the start
    • note particularly how this interacts with subclasses. Possibly that should be the only way to create such a pointer? It isn't too crazy to synthesize a subclass just for the CowString trick ...
  • (other arithmetic tricks possible: base+scale+offset, with each of these possibly hard-coded (which requires that there be multiple copies of some ownership policies! we can't just use an enum) or possibly embedded)
  • (but what about "object is stored in a file too large to mmap" and such? Or should those only satisfy "ChunkyRandomIterator" concept?)
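As a rough illustration of the `relative` modifier, the sketch below simulates memory with a `bytearray` and stores a pointer as an offset from its own slot. The 32-bit little-endian encoding and the helper names `store_relative` and `load_relative` are assumptions made for the example, not part of the design above.

```python
import struct

# Assumption: the offset is a signed 32-bit little-endian integer stored in the slot.
OFFSET = struct.Struct("<i")


def store_relative(mem, slot, target):
    """Write at `slot` a pointer to `target`, encoded relative to `slot` itself."""
    OFFSET.pack_into(mem, slot, target - slot)


def load_relative(mem, slot):
    """Effective address = the slot's own address + the stored offset."""
    (offset,) = OFFSET.unpack_from(mem, slot)
    return slot + offset


# A 16-byte object: a relative pointer at offset 0, payload at offset 8.
block = bytearray(16)
block[8:12] = b"data"
store_relative(block, 0, 8)

# Simulate memcpy/realloc: the whole object moves to a different base address.
moved = bytearray(32)
base = 12
moved[base:base + 16] = block

target = load_relative(moved, base)
payload = bytes(moved[target:target + 4])
```

Because only the distance between the slot and its target is stored, the copied object still resolves to its own payload without any fix-up, which is what makes this representation attractive for `realloc`, `memcpy`, and shared memory mapped at different addresses.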
@o11c
o11c / ctor.cpp
Created April 14, 2014 23:10
llvm global constructors
#include <cstdio>
struct Foo
{
    Foo()
    {
        puts("Hello, ");
    }
};
@o11c
o11c / test-annotations.md
Last active October 6, 2023 12:11
Possible test annotations and results

Things that a test can be annotated with:

  • XFAIL(cond): for tests that are known to be buggy.
  • FLAKY(cond): for tests that have nondeterministic bugs that have not been hunted down.
  • SKIP(cond): for tests not applicable to the current platform, that cannot be fixed by installing or configuring dependencies.
  • MISSING(cond): for tests that can't run because of uninstalled or unconfigured dependencies.
  • WIP: for tests you are implementing
  • TIME(cpumin, cpumax, realmax): min/max computation expected for a test. Also real time added in case you sleep or something.
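A minimal sketch of how a `TIME(cpumin, cpumax, realmax)` check might be evaluated, using only the standard `time` module. The helper name `check_time` and the choice of seconds as the unit are assumptions for illustration; the list above does not specify either.

```python
import time


def check_time(func, cpumin, cpumax, realmax):
    """Run func once and report whether its timing stays within the given bounds.

    Assumption: all limits are in seconds.
    """
    cpu_start = time.process_time()    # CPU time consumed by this process
    real_start = time.perf_counter()   # wall-clock time, includes sleeping
    func()
    cpu = time.process_time() - cpu_start
    real = time.perf_counter() - real_start
    return cpumin <= cpu <= cpumax and real <= realmax
```

Measuring both clocks matters because a test that sleeps uses almost no CPU time but can still blow past a wall-clock budget, which is why the annotation carries a separate `realmax`.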
# Significant care is taken to be sh-compatible; if bash or zsh could be
# required, it could be made simpler or more generic.
# Known source'rs:
# ~/.profile
# ~/.zshrc
# ~/.xprofile
# ~/.xsessionrc
# ~/.bashrc
# ~/.config/plasma-workspace/env/*.sh
@o11c
o11c / division.py
Created February 7, 2021 05:23
The 3 flavors of division.
#!/usr/bin/env python3
import functools
import gmpy2
# assuming 8-bit math
# all functions are written to take any input, and produce signed output
def make_unsigned(v):
    return v & 0xff

First, some notes:

  • Last checked for manpages-5.04 and linux-5.4, on Debian.
  • /proc/net is still used for documentation purposes, despite now being a symlink to /proc/self/net/
  • /proc/[pid]/task/[tid]/* is documented as including all of /proc/[pid]/* but this is not actually the case
  • Some files are actually documented in other man pages. But proc(5) still needs to mention them (possibly just the containing directory though).
  • Some files are actually documented in the kernel's Documentation/ tree (but that also is incomplete). Even if proc(5) mentions that, this is suboptimal. Further, many links to Documentation/ are broken since a recent reorganization.
  • The /proc/sys/net/ reference is quite vague, and incomplete besides.
  • - means the file exists but is not documented
  • + means either it is documented but does not exist, or it exists but its contents are not documented
  • I've removed some clutter by hand and added a few notes.
<ChancyValue:
0.0000000000% chance of: 104
0.0000000000% chance of: 105
0.0000000000% chance of: 106
0.0000000000% chance of: 107
0.0000000000% chance of: 108
0.0000000000% chance of: 109
0.0000000000% chance of: 110
0.0000000000% chance of: 111
0.0000000000% chance of: 112
aarch64-elf
aarch64-linux-gnu
aarch64-rtems
alpha-linux-gnu
alpha-freebsd6
alpha-netbsd
alpha-openbsd
alpha64-dec-vms
alpha-dec-vms
am33_2.0-linux