@wolfwood
Last active November 15, 2022 01:01
XOmB Activations: as they stand
============= Interrupt Primer =====================================
Computer: a deterministic monolith crunching away until infinity one
instruction after the other, as preordained by The Programmer.
If this is not your experience, it is because of interrupts (NB: it is
NOT because we don't write infinite loops; they are hidden down at the
bottom of most OSes and GUI and console applications). The idea
behind interrupts is that we can pause what the CPU is doing, handle
some new information (like key presses or the printer finishing a
document), and return to the initial task. Interrupts are the Great
Nondeterminism of the computing world (what they trigger is also
predetermined, but when is dynamic).
what causes interrupts: I/O and timers --- outside state changes
alternative: polling :(
Current hardware and OSes have taken the idea of interrupts to an
extreme: because they can restore the state of the CPU prior to the
interrupt without the interrupted process noticing, they do so with
extreme prejudice.
virtual CPU: a process never knows when it isn't running
alternative++: activations
============= How XOmB implements Activations ======================
An activation is a piece of memory used to communicate between the
kernel and userspace. Currently all environments are created with a
2MB segment at address 1GB - 2MB, the only memory accessible in the
first 1GB. XOmB currently manages allocating this memory, but if we
move to userspace allocation this will become quite tricky unless we
can guarantee that it is possible for a correct program to always have
a free page available for allocating activations (and then anything
that faults in this segment is labeled incorrect and killed :).
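The address arithmetic above can be sketched in C. The segment size and
base come from the text; the 128-byte slot size and all function names
are assumptions made for illustration, not XOmB's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024ULL
#define MiB (1024ULL * KiB)
#define GiB (1024ULL * MiB)

/* From the text: a 2MB segment placed at address 1GB - 2MB. */
#define ACTIVATION_SEGMENT_SIZE (2 * MiB)
#define ACTIVATION_SEGMENT_BASE (1 * GiB - ACTIVATION_SEGMENT_SIZE)

/* Assumption: each activation slot is padded to 128 bytes. */
#define ACTIVATION_SLOT_SIZE 128ULL

static_assert(ACTIVATION_SEGMENT_BASE == 0x3FE00000ULL, "1GB - 2MB");

/* Number of fixed-size slots that fit in the segment. */
static inline uint64_t activation_slot_count(void) {
    return ACTIVATION_SEGMENT_SIZE / ACTIVATION_SLOT_SIZE;
}

/* Address of slot `index`, or 0 if the index is out of range. */
static inline uint64_t activation_slot_address(uint64_t index) {
    if (index >= activation_slot_count())
        return 0;
    return ACTIVATION_SEGMENT_BASE + index * ACTIVATION_SLOT_SIZE;
}

/* Nonzero if `addr` lies inside the activation segment. */
static inline int in_activation_segment(uint64_t addr) {
    return addr >= ACTIVATION_SEGMENT_BASE &&
           addr < ACTIVATION_SEGMENT_BASE + ACTIVATION_SEGMENT_SIZE;
}
```

With these assumed numbers the segment holds 16384 slots. A check like
in_activation_segment is the kind of test a fault handler would need in
order to label a fault in this segment as incorrect.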
The Activation, for simplicity, contains an InterruptStack struct (to
store the CPU state that is suspended on an interrupt) along with some
additional information for unwinding activations that occur during the
restoration of an activation (chained activations are my primary worry
regarding correctness, races and allocation issues), and finally a bool
to indicate whether the activation is valid, so that the kernel can
find a free activation to use when needed.
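A rough C sketch of that layout follows. The register list is a guess
at typical x86-64 saved state, not XOmB's real InterruptStack, and it
comes out larger than the ~100 bytes these notes estimate. The field
names, the reading of `valid` as "holds live saved state", and the
chain-walking helper are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XOmB's InterruptStack. */
typedef struct InterruptStack {
    /* Registers pushed by the common interrupt stub (assumed order). */
    uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    uint64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    uint64_t int_number, err_code;
    /* Frame pushed by the hardware and consumed by iretq. */
    uint64_t rip, cs, rflags, rsp, ss;
} InterruptStack;

typedef struct Activation {
    InterruptStack stack;            /* suspended CPU state */
    struct Activation *interrupted;  /* assumed chain link: the activation
                                        whose restoration was interrupted */
    bool valid;                      /* assumed: true while in use */
} Activation;

static_assert(sizeof(InterruptStack) == 22 * sizeof(uint64_t),
              "22 saved 64-bit values, no padding");
static_assert(offsetof(Activation, stack) == 0,
              "saved state sits at the start of the activation");

/* Count an activation plus every activation it is chained onto. */
static inline size_t activation_chain_depth(const Activation *a) {
    size_t depth = 0;
    for (; a != NULL; a = a->interrupted)
        depth++;
    return depth;
}
```

Keeping the saved state at offset 0 means the activation address handed
to userspace can double as a pointer to the register frame.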
The underlying XOmB interrupt mechanic is not changed by activations:
the same templated code pushes registers to the interrupt stack, but
instead of calling an interrupt handler, the activation dispatcher (an
un-scheduler, if you will) is called. The dispatcher first finds a
free activation (XXX: in a lock-free manner that marks it as no longer
free) in the environment's activation segment. Then the saved state of
the InterruptStack is copied to the activation. The interrupt is
acknowledged to the local APIC (to prevent denial of service) and
userspace is reentered using the same mechanism as initial entry and
the yield system call, with 2 parameters: an entry index of 4 and the
address of the activation used.
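One way the XXX above could be met is a compare-and-swap scan over the
slots, sketched here with C11 atomics. This is an illustration, not
XOmB's dispatcher: Slot, SavedState, claim_slot, dispatch and
release_slot are hypothetical names, and the APIC acknowledgement and
the reentry into userspace are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the saved register state (size is an assumption). */
typedef struct SavedState {
    uint64_t regs[22];
} SavedState;

typedef struct Slot {
    SavedState state;
    atomic_bool in_use;   /* false = free, true = claimed */
} Slot;

/* Claim the first free slot by flipping in_use from false to true.
   The compare-and-swap guarantees that two CPUs scanning at the same
   time can never both claim one slot. Returns NULL if none is free. */
static Slot *claim_slot(Slot *slots, size_t count) {
    for (size_t i = 0; i < count; i++) {
        bool expected = false;
        if (atomic_compare_exchange_strong(&slots[i].in_use, &expected, true))
            return &slots[i];
    }
    return NULL;
}

/* Dispatcher core: claim a slot, then copy the interrupted state in. */
static Slot *dispatch(Slot *slots, size_t count, const SavedState *interrupted) {
    Slot *slot = claim_slot(slots, count);
    if (slot != NULL)
        slot->state = *interrupted;
    return slot;
}

/* Hypothetical release step, for illustration only. */
static void release_slot(Slot *slot) {
    atomic_store(&slot->in_use, false);
}
```

The scan is linear in the number of slots, which keeps the sketch short;
a per-CPU hint for where to start scanning would be one way to shorten
it.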
It may seem like it would be possible to avoid this copy by using the
interrupt stack AS the activation. The downsides of this approach are
a) we need a whole page or more as the activation instead of ~100
bytes, b) the activation must be read-only to prevent userspace code
running on an adjacent CPU from corrupting an in-use kernel stack, and
c) either the activation must remain kernel allocated, or we must edit
the interrupt stack pointer in the TSS (RSP0/IST) on context switch
and manage a race with any interrupts that occur after we enter an
environment but before we've located a preallocated activation page
that is free.
Because the only interrupt at the moment is a timer, userspace
currently uses the parameters passed from the kernel to call the
_entry function, which restores the registers saved by the common
interrupt handler and then uses iretq to restore the hardware-saved
registers, including the RSP and RIP, atomically. At this point it is
too late to mark the activation as free, so we currently leak
activations.
While it may seem like the way to cure leaks is to stop using iretq,
restoring RSP and RIP without iretq is nearly impossible: all
registers will be occupied with application data, and the application
stack cannot be assumed to be free below the stack pointer (red zone
optimization), yet an indirect jmp must be used to restore the RIP (a
preallocated address would risk being overwritten by other CPUs also
restoring activations). It is theoretically possible to work around
this by storing the activation address in the FS base register and
doing FS-relative addressing, but this adds the FS base register to
the state that must be preserved for userspace and complicates chained
activations.
So what about interrupts that we actually want to handle? For
throughput oriented workloads it may be reasonable to simply note that
the interrupt occurred, either by editing a bitmap that is checked
periodically by a 'process interrupts' thread (what if we want more
than one CPU to be able to process interrupts?) or by enqueuing a
preallocated thread to run the handler for the particular interrupt at
hand (what if we get two interrupts before the thread is scheduled?).
However, at least some interrupts will take priority over the
currently running thread. In this case we may need to allocate a new
thread to handle the interrupt and also to allocate a thread to
restore the activation (as the activation may have interrupted the
stackless thread scheduler code, or an activation recovery itself, we
cannot assume there is an existing thread to be added to the
scheduler, and in any case an alternate 'enter from activation' path
would need to be communicated).
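The bitmap option for throughput-oriented workloads might look like the
following C11 sketch. It assumes at most 64 interrupt sources so that
one 64-bit word suffices; note_interrupt, take_pending and
process_pending are hypothetical names, not part of XOmB.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per interrupt source (assumption: at most 64 sources). */
static _Atomic uint64_t pending_interrupts;

typedef void (*InterruptHandler)(unsigned vector);

/* Activation entry path: record that `vector` fired and return.
   Out-of-range vectors are ignored in this sketch. */
static void note_interrupt(unsigned vector) {
    if (vector < 64)
        atomic_fetch_or(&pending_interrupts, UINT64_C(1) << vector);
}

/* Atomically take every pending bit and clear the bitmap. */
static uint64_t take_pending(void) {
    return atomic_exchange(&pending_interrupts, 0);
}

/* 'Process interrupts' step: run `handler` (if non-NULL) once per
   pending vector. Returns the number of vectors that were pending. */
static unsigned process_pending(InterruptHandler handler) {
    uint64_t bits = take_pending();
    unsigned handled = 0;
    for (unsigned v = 0; v < 64; v++) {
        if (bits & (UINT64_C(1) << v)) {
            if (handler != NULL)
                handler(v);
            handled++;
        }
    }
    return handled;
}
```

Because take_pending hands each set bit to exactly one caller, several
CPUs could run process_pending concurrently, which is one possible
answer to the multi-CPU question. The cost is that two occurrences of
the same interrupt before processing collapse into one bit, so a handler
written this way has to tolerate coalesced interrupts.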
Either path is sticky, and complicated further by the fact that we
would ultimately like to pass the interrupt initially to the init
process, so that the interrupt may be routed to another environment
entirely for quick handling without denial of service by the present
environment. But of course an activation that is not immediately
communicated to the suspended environment is no better than a standard
UNIX 'virtual CPU' that can be revoked without warning.