## activations
============= Interrupt Primer =====================================

Computer: a deterministic monolith crunching away until infinity one
instruction after the other, as preordained by The Programmer.

If this is not your experience, it is because of interrupts (NB: it is
NOT because we don't write infinite loops; they are hidden down at the
bottom of most OSes and of GUI and console applications).  The idea
behind interrupts is that we can pause what the CPU is doing, handle
some new information (like key presses or the printer finishing a
document), and return to the initial task.  Interrupts are the Great
Nondeterminism of the computing world (what they trigger is still
predetermined, but when they fire is dynamic).

what causes interrupts: I/O and timers --- outside state changes

alternative: polling :(

Current hardware and OSes have taken the idea of interrupts to an
extreme: because they can restore the state of the CPU prior to the
interrupt without the interrupted process noticing, they do so with
extreme prejudice.

virtual CPU: you never know when you aren't running

alternative++: activations

============= How XOmB implements Activations ======================

An activation is a piece of memory used to communicate between the
kernel and userspace.  Currently all environments are created with a
2MB segment at address 1GB - 2MB, the only memory accessible in the
first 1GB.  XOmB currently manages allocating this memory, but if we
move to userspace allocation this will become quite tricky unless we
can guarantee that it is possible for a correct program to always have
a free page available for allocating activations (and then anything
that faults in this segment is labeled incorrect and killed :).
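
The address arithmetic above can be sketched in C.  The 2MB size and
the 1GB - 2MB base come from the text; the 256-byte slot size and all
of the names are hypothetical, chosen only to make the numbers
concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GB (1ULL << 30)
#define MB (1ULL << 20)

/* From the text: a 2MB segment that ends at the 1GB mark. */
#define ACTIVATION_SEGMENT_SIZE (2 * MB)
#define ACTIVATION_SEGMENT_BASE (GB - ACTIVATION_SEGMENT_SIZE)

/* Assumption: one activation per fixed-size slot; 256 is illustrative. */
#define ACTIVATION_SLOT_SIZE 256ULL

/* 1GB - 2MB is itself a multiple of 2MB, so the segment is 2MB aligned. */
static_assert(ACTIVATION_SEGMENT_BASE % ACTIVATION_SEGMENT_SIZE == 0,
              "activation segment is 2MB aligned");

/* How many slots fit if the whole segment held activations. */
static inline uint64_t activation_slot_count(void)
{
    return ACTIVATION_SEGMENT_SIZE / ACTIVATION_SLOT_SIZE;
}

/* Virtual address of slot `index` inside the segment. */
static inline uint64_t activation_slot_address(uint64_t index)
{
    assert(index < activation_slot_count());
    return ACTIVATION_SEGMENT_BASE + index * ACTIVATION_SLOT_SIZE;
}

/* True if `addr` falls inside the activation segment. */
static inline bool in_activation_segment(uint64_t addr)
{
    return addr >= ACTIVATION_SEGMENT_BASE && addr < GB;
}
```

A check like in_activation_segment() is roughly what a fault handler
would need in order to decide that a fault landed in this segment.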

The Activation, for simplicity, contains an InterruptStack struct (to
store the CPU state that is suspended on an interrupt) along with some
additional information for unwinding activations that occur during the
restoration of an activation (chained activations are my primary worry
regarding correctness, races and allocation issues), and finally a bool
to indicate whether the activation is valid, so that the kernel can
find a free activation to use when needed.
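
A rough C sketch of that layout follows.  The three parts
(InterruptStack, unwinding information, valid flag) come from the
text; the field names, the register order, and the use of a single
`chained` pointer for the unwinding information are assumptions for
illustration, not XOmB's actual definitions.  I read `valid` as "this
slot holds saved state", which is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed x86-64 layout, lowest address first: the registers pushed by
 * the common handler, then the frame the CPU pushes on an interrupt. */
typedef struct {
    uint64_t r15, r14, r13, r12, r11, r10, r9, r8;
    uint64_t rbp, rdi, rsi, rdx, rcx, rbx, rax;
    uint64_t int_number, error_code;
    uint64_t rip, cs, rflags, rsp, ss;   /* consumed by iretq */
} InterruptStack;

typedef struct Activation {
    InterruptStack stack;        /* suspended CPU state */
    struct Activation *chained;  /* hypothetical: activation that was being
                                    restored when this interrupt arrived */
    bool valid;                  /* assumed: true while the slot is in use */
} Activation;

/* 15 general registers + 2 bookkeeping words + 5 hardware-pushed words. */
static_assert(sizeof(InterruptStack) == 22 * sizeof(uint64_t),
              "InterruptStack has no padding");
```

Keeping the InterruptStack first means a pointer to the activation is
also a pointer to the saved state.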

The underlying XOmB interrupt mechanic is not changed by activations:
the same templated code pushes registers to the interrupt stack, but
instead of calling an interrupt handler, the activation dispatcher (an
un-scheduler, if you will) is called.  The dispatcher first finds a
free activation (XXX: in a lock-free manner that marks it as no longer
free) in the environment's activation segment.  Then the saved state of
the InterruptStack is copied to the activation.  The interrupt is
acknowledged to the local APIC (to prevent denial of service) and
userspace is reentered using the same mechanism as initial entry and
the yield system call, with 2 parameters: an entry index of 4 and the
address of the activation used.
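
One way the XXX above could be satisfied is a compare-and-swap on the
valid flag, so that finding a free activation and marking it taken
are a single atomic step.  This is a sketch using C11 atomics, not
the XOmB dispatcher: Slot, claim_slot, dispatch, and the sizes are
all hypothetical, and the saved state is treated as an opaque array
of words.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_COUNT  8    /* hypothetical; kept small for illustration */
#define STATE_WORDS 22   /* assumed size of the saved CPU state */

typedef struct {
    uint64_t saved_state[STATE_WORDS];
    atomic_bool valid;   /* false = free, true = claimed */
} Slot;

/* Claim the first free slot.  The compare-exchange flips valid from
 * false to true atomically, so two CPUs can never win the same slot. */
static Slot *claim_slot(Slot *slots, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        bool expected = false;
        if (atomic_compare_exchange_strong(&slots[i].valid, &expected, true))
            return &slots[i];
    }
    return NULL;   /* every slot is in use */
}

/* Dispatcher core: claim a slot, then copy the interrupt stack into it.
 * The real kernel would next acknowledge the local APIC and re-enter
 * userspace with (entry index 4, address of the slot). */
static Slot *dispatch(Slot *slots, size_t count, const uint64_t *interrupt_stack)
{
    assert(interrupt_stack != NULL);
    Slot *slot = claim_slot(slots, count);
    if (slot == NULL)
        return NULL;   /* what to do on exhaustion is not covered in the text */
    memcpy(slot->saved_state, interrupt_stack, sizeof slot->saved_state);
    return slot;
}
```

The linear scan is the simplest choice; it says nothing about how
slots become free again, which is the leak discussed below.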

It may seem like it would be possible to avoid this copy by using the
interrupt stack AS the activation.  The downsides of this approach are:
a) the activation needs a whole page or more instead of ~100 bytes;
b) the activation must be read-only, to prevent userspace code running
on an adjacent CPU from corrupting an in-use kernel stack; c) either
the activation must remain kernel allocated, or we must edit the
interrupt stack pointer in the TSS on context switch and manage a race
with any interrupts that occur after we enter an environment but
before we've located a preallocated activation page that is free.

Because the only interrupt at the moment is a timer, userspace
currently uses the parameters passed from the kernel to call the
_entry function, which restores the registers saved by the common
interrupt handler and then uses iretq to restore the hardware-saved
registers, including RSP and RIP, atomically.  At this point it is too
late to mark the activation as free, so we currently leak activations.

While it may seem like the way to cure leaks is to begin by not
using iretq, restoring RSP and RIP without iretq is nearly impossible:
all registers will be occupied with application data, and the
application stack cannot be assumed to be free below the stack pointer
(red zone optimization), yet an indirect jump must be used to restore
the RIP (and a preallocated address would risk being overwritten by
other CPUs that are also restoring activations).  It is theoretically
possible to work around this by storing the activation address in the
FS base register and doing FS-relative addressing, but this adds the
FS base to the state that must be preserved for userspace and
complicates chained activations.

So what about interrupts that we actually want to handle?  For
throughput-oriented workloads it may be reasonable to simply note that
the interrupt occurred, either by editing a bitmap that is checked
periodically by a 'process interrupts' thread (what if we want more
than one CPU to be able to process interrupts?) or by enqueuing a
preallocated thread to run the handler for the particular interrupt at
hand (what if we get two interrupts before the thread is scheduled?).
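
The bitmap option might look something like the following, again with
C11 atomics.  Everything here is hypothetical (the names, and the
limit of 64 interrupt sources so that one word suffices); it only
illustrates the idea and does not settle the questions above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Bit v set means interrupt source v has fired since the last check.
 * Assumption: at most 64 sources, so a single word is enough. */
static _Atomic uint64_t pending_interrupts;

/* Called on activation entry: note the interrupt and carry on. */
static void note_interrupt(unsigned source)
{
    assert(source < 64);
    atomic_fetch_or(&pending_interrupts, UINT64_C(1) << source);
}

/* Called periodically by a 'process interrupts' thread: atomically take
 * every pending bit and clear the bitmap in the same step. */
static uint64_t take_pending(void)
{
    return atomic_exchange(&pending_interrupts, 0);
}
```

Because take_pending() swaps the whole word out atomically, several
CPUs could call it without two of them receiving the same bit.  The
cost is that a bitmap coalesces: two interrupts from the same source
before the next check show up as a single bit, so the handler has to
tolerate being run once for several events.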

However, at least some interrupts will take priority over the
currently running thread.  In this case we may need to allocate a new
thread to handle the interrupt and also a thread to restore the
activation (the activation may be in the stackless thread scheduler
code, or in an interrupted activation recovery itself, so we cannot
assume there is an existing thread to be added to the scheduler, and
in any case an alternate 'enter from activation' would need to be
communicated).

Either path is sticky, and complicated further by the fact that we
would ultimately like to pass the interrupt initially to the init
process, so that the interrupt may be routed to another environment
entirely for quick handling without denial of service by the present
environment.  But of course an activation that is not immediately
communicated to the suspended environment is no better than a standard
UNIX 'virtual CPU' that can be revoked without warning.