|
Sharing data without a "shared heap" |
|
|
|
This is a proposal for a shared heap without additional GC barriers. It requires some |
|
support from the compiler, in enforcing that only certain kinds of values can be shared |
|
between threads. |
|
|
|
* suppose that the nimrod compiler can distinguish between const and non-const pointers |
|
and refs. I think constness can be inferred and need not be declared in most places. |
|
|
|
* Define a "rigid" value to be a heap-allocated type V s.t. every pointer reachable from V |
|
is constant. |
|
|
|
Put otherwise, these types are rigid: |
|
- value types (ints, floats, chars, enums) |
|
- const ref T and const ptr T, if T is rigid itself |
|
- arrays and records of rigid values |
|
|
|
* The desired property of rigid types is that as long as a rigid value is alive (not |
|
GC'd), every value that has ever been reachable from it is also alive. Thus, a reference |
|
to a rigid object can be passed to a thread other than the thread that owns it, and that |
|
other thread can safely traverse the object as long as the owning thread does not |
|
deallocate it. |
|
|
|
* Although rigidity may seem like a significant constraint, it is not that bad. Almost all |
|
data in Haskell is rigid: once a value is created, its memory layout is fixed. Fixed-size |
|
numeric arrays are also rigid. Combining functional data structures with fixed-size homogenous |
|
arrays can cover all use-cases adequately. |
|
|
|
* Any object can be viewed as rigid, if we treat the mutable references that it contains as unreadable. That is, if x contains a constant pointer <a> and a non-constant pointer <b>, then a rigid view of X can be created by casting it to a type that has a constant pointer <a> and a 4-byte unreadable gap where the <b> pointer is stored. |
|
|
|
* Define the "owner" of a heap value to be the thread in whose heap the value is |
|
stored. All other threads that can access that value shall be called the "borrowers" of |
|
that value. |
|
|
|
* To manage cross-thread references to the rigid object, each borrower will have a proxy |
|
object. The proxy must live as long as the borrowed object is accessible to the |
|
borrower. When the proxy is GC'd in the borrower thread, a message is sent to the owner, |
|
to inform it that one of the borrowers has relinquished access to the object. This |
|
message is sent after a memory barrier. |
|
|
|
* When an object X is borrowed, the borrower may extract from it the address of some other |
|
object Y. The borrower need not construct a proxy for Y. However, the borrower must keep |
|
X's proxy alive as long as it has access to Y. To this end, we shall define a |
|
BorrowedRef to be a pair (r,p) of pointers, where r points to a borrowed rigid object, |
|
and p points to a proxy on the borrower's heap. The proxy must itself point to the |
|
borrowed object t from which r was extracted. |
|
|
|
* When the owner lends the object, it increases its refcount. After a memory barrier, the |
|
address of the object is pushed into a channel, and the borrower pulls the address out |
|
of the channel and wraps it in a proxy. Due to the increased refcount, the object and |
|
anything reachable from it will continue to live until the proxy is GC'd. |
|
|
|
* Inter-thread cycle avoidance: |
|
|
|
** A sufficient rule to prevent cycles is that if P is a proxy, and P is reachable from X, |
|
then X cannot be borrowed. This is statically enforceable. |
|
|
|
** A more flexible cycle-avoidance rule is that the threads are ranked, and a low-rank |
|
thread may borrow from a high-rank thread but not vice-versa. To enforce this, each |
|
channel will know the rank of the receiving thread. At runtime, a write into a channel |
|
will compare the lender to the borrower's rank, and raise an exception if the lender |
|
has the lower rank. |