Last active
August 29, 2015 14:07
-
-
Save pythonesque/35bd8277ae079d81ece6 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#![feature(unsafe_destructor)] | |
/* | |
For safety, here are some guarantees we need fulfilled for allocated types for | |
an allocator of size class <M, N, O>. | |
* <T>::Size < M | |
e.g. consider 1-sized struct aligned at 4096. | |
And suppose we're at the very end of the page here (but it's aligned to 4096): | |
[ 4096 bytes][4096 bytes] | |
[1 byte]_____[4096 bytes] | |
* assume alignment is a power of 2. | |
* Size is integer multiple of alignment. | |
MPROT |4096 |4096 | | |
PAGE |4096 |4096 |4096 | | |
GUARD NO |YES |? | |
ALLOC |2048|2048|4096 | | |
WRITE |1|__|1|__|_______|1| | |
PAGE |4096 |4096 |4096 | | |
GUARD NO |YES |? | |
ALLOC |2048|4096 |4096 | | |
WRITE |1|__|1|_______|_______|1| | |
PAGE |4096|4096 |4096 | | |
GUARD NO |YES |? | |
ALLOC |1|4096|4096 | | |
WRITE |1|1|__|_____|1| | |
* worst case: allocation completely clears the stack guard, can access memory | |
anywhere. | |
* worst case if allocations must be smaller than GUARD_SIZE and an allocation | |
must always have at least one byte accessed on initialization: | |
|1 |GUARD_SIZE - 1|1|GUARD_SIZE - 2|1 | | |
STACK |GUARD |? | |
|ALLOC |ALLOC | | |
|WRITE| |WRITE| | |
* The easy fix: set MAX_ALLOC to 1. Then GUARD_SIZE - 2 in this worst-case | |
would be negative, which is impossible :) This is not very practical, though. | |
* The slightly less stupid but still easy fix: rewrite the scenario: | |
|1 |ALLOC_SIZE - 1|GUARD_SIZE + 1 - ALLOC_SIZE|2 * (ALLOC_SIZE - 1) - GUARD_SIZE|1 | | |
STACK |GUARD |? | |
|ALLOC |ALLOC | | |
|WRITE| |WRITE| | |
Then we require: ALLOC_SIZE - 1 <= GUARD_SIZE <= 2 * (ALLOC_SIZE - 1) | |
So if ALLOC_SIZE = 0, GUARD_SIZE + 1 < ALLOC_SIZE, or 2 * (ALLOC_SIZE - 1) < GUARD_SIZE | |
this can't happen. Since we can construct other bad scenarios if GUARD_SIZE + 1 < ALLOC_SIZE, | |
and ALLOC_SIZE = 0 is useless, the logical requirement is 2 * (ALLOC_SIZE - 1) < GUARD_SIZE. | |
i.e. GUARD_SIZE = 2 * ALLOC - 1. | |
By taking alignment into account there may be / certainly are more optimizations than this that | |
can be made, but this is a good starting point. And the optimizations may not be that important. | |
Also note: practically speaking we'll want this to be a power of 2... force ALLOC to be a | |
power-of-two and we can do GUARD_SIZE = ALLOC << 1. | |
* The more complex fix: figure out how (and if) one can restrict alignment so | |
this can never happen. Also: consider preferential alignment, reordering allocations, coalescing | |
lookups (so we only have to bump the pointer once if you want n allocations), etc. Is there a way | |
to expose this information to LLVM for optimization? That would be sweet. | |
|1|2|4|8|16|32|1| | | |
|1| | | | |
Effectively, can think of all allocations as being of either (align size) or | |
(max alloc size) and writing to either (byte 0) or (byte max alloc size - 1) | |
//and all alignments as being (relative to the stack guard) either | |
// (STACK_GUARD - | |
Guard page MUST be page aligned and allocations MUST be aligned at min-align, | |
and all alignments MUST be powers of 2, so we have | |
BASE + (REAL_SIZE) = GUARD_0 | |
BASE is (k-aligned): | |
If k < PAGE_SIZE, then | |
so we have | |
// | |
0 G | |
ALLOC |4096| | |
Key insight: with power of 2 alignments, | |
You may not allocate a structure larger than the guard page size. | |
What if the struct is aligned to 1? | |
[4096 bytes][4096 bytes] | |
____[1 byte][4096 bytes] | |
Still a matter of not allocating a structure larger than the guard page size. | |
What if alignment is 8? | |
[4096 bytes][4096 bytes] | |
[8 bytes] | |
____[1 byte][4096 bytes] | |
But you | |
Next location is: | |
(foo - 1) + bar | |
4095 + 4096 = 8191 | |
e.g. | |
So your guard pages must be at least | |
* <T>::Alignment == N | |
* <T>:: | |
* | |
* The alignment must be the maximum of: | |
(Align<T>, Size<T>) | |
2 | |
1968 | |
aligned at 4 | |
MOTIVATION: | |
Rust is great at managing regions! And Rust is also great at letting you | |
stack-allocate memory! But when we put it on the heap, we lose a lot of the | |
advantages of the stack: | |
* Must allocate / deallocate (slowwwww). | |
* Different sized types => one thing at a time, have to deal with size classes | |
and runtime metadata. | |
* Grab more than one of a type (in a vector) to avoid this cost => | |
must bounds check for safety. | |
But when we put it on the stack, we lose a lot of the advantages of the heap: | |
* Can't return references to things allocated locally. | |
* Can use a TypedArena (similar to this), but then we can't box things since | |
we can't destroy them, leading to overly long-lived regions. | |
* TypedArenas are also kinda inefficient... can't use them for things in different | |
size classes. And if you do it's WAY slower (think allocator, not stack), | |
especially if you start to fragment. | |
* No way to perform ~safe~ dynamically sized allocations. | |
* If you don't use a stack guard, you have to pay for checking the SP in every | |
function call. | |
It would really be nice to have the convenience (from a lifetimes perspective) of allocating | |
on the heap, but all the performance of allocating on the stack. | |
INTRODUCING SECONDARY STACKS | |
Secondary stacks try to address the above problems by creating a "second" stack that works | |
almost exactly like a normal stack, with a few important differences: | |
* NOT guaranteed to live in registers, etc. The stack is managed out of band. Similarly, | |
LLVM probably won't be able to optimize it very well, so you'll likely have to do that | |
yourself. | |
* Frames are NOT automatically created and dropped at entry and exit points; this must be | |
managed manually. The upside of this is that you can "dynamically allocate" your frame | |
sizes if you so choose. | |
* You can box things! | |
* Frame entry and exit do not run destructors. The region system is relied upon to force | |
the user to do this. | |
* There are maximum limits on individual allocations (based on predefined user parameters) | |
and these are enforced at compile time. This is to prevent jumping the stack guard (see | |
above). | |
* Can't reference allocated things as a fixed offset from the stack pointer (maybe you could | |
if we added restrictions on dynamic allocations... maybe worth offering an alternative that | |
works like that). So you still use the "real stack" most of the time. | |
What we don't do: allocation or deallocation (except when the stack is created), require | |
same-sized types, or bounds checks. Or make it impossible to return an allocated value. | |
Or pay for dynamic runtime size classes. Or prevent you from destroying values early. Or | |
prevent you from doing dynamic size allocations. And did I mention, it's safe? :) | |
HOW DOES IT WORK? | |
* mmap, mprotect according to calculations above. | |
* There's basically a single SP at all times. | |
* Utilizes Rust's region system to ensure safety. Allocating an element returns a Box with the | |
lifetime of the region and bumps the pointer. No checks needed :) | |
* (Potentially?) investigate a saner implementation of segmented stacks on SEGV. "Saner" in this | |
context: perhaps the stack switches directions and starts growing up, instead of down, and the | |
old stack still sticks arond until you're done. Kind of wasteful of memory but not slow. Or | |
maybe just do the usual inefficient segmented stacks implementation and treat it as a last | |
resort. Anyway, it's safe to just abort. | |
* Likely uses macros: would require a very powerful type system that allowed me to reason on nmbers | |
to make it safe otherwise, since we have to know that e.g. min alignment is the same for all items. | |
So it's not QUITE stack-like in that sense (this requirement could probably be avoided somehow though, | |
I'll have to think about it. Are stack frames aligned anyway? I'd guess they at least might be). | |
HOPEFULLY this will provide speed and ergonomic improvements for the cases where I'm using TypedArenas | |
right now. | |
*/ | |
/*pub struct Region { | |
} | |
pub struct ObStack<'a> { | |
stack: [u8], | |
}*/ | |
pub struct | |
fn foo<T, U>(x: T) -> U { | |
unsafe { return ::std::mem::transmute::<T, U>(x) } | |
} | |
fn main() { | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment