Skip to content

Instantly share code, notes, and snippets.

@micklat
Created March 21, 2014 16:38
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save micklat/9690318 to your computer and use it in GitHub Desktop.
Save micklat/9690318 to your computer and use it in GitHub Desktop.
Sharing data without a "shared heap"
This is a proposal for a shared heap without additional GC barriers. It requires some
support from the compiler, in enforcing that only certain kinds of values can be shared
between threads.
* suppose that the nimrod compiler can distinguish between const and non-const pointers
and refs. I think constness can be inferred and need not be declared in most places.
* Define a "rigid" value to be a heap-allocated type V s.t. every pointer reachable from V
is constant.
Put otherwise, these types are rigid:
- value types (ints, floats, chars, enums)
- const ref T and const ptr T, if T is rigid itself
- arrays and records of rigid values
* The desired property of rigid types is that as long as a rigid value is alive (not
GC'd), every value that has ever been reachable from it is also alive. Thus, a reference
to a rigid object can be passed to a thread other than the thread that owns it, and that
other thread can safely traverse the object as long as the owning thread does not
deallocate it.
* Although rigidity may seem like a significant constraint, it is not that bad. Almost all
data in Haskell is rigid: once a value is created, its memory layout is fixed. Fixed-size
numeric arrays are also rigid. Combining functional data structures with fixed-size homogenous
arrays can cover all use-cases adequately.
* Any object can be viewed as rigid, if we treat the mutable references that it contains as unreadable. That is, if x contains a constant pointer <a> and a non-constant pointer <b>, then a rigid view of X can be created by casting it to a type that has a constant pointer <a> and a 4-byte unreadable gap where the <b> pointer is stored.
* Define the "owner" of a heap value to be the thread in whose heap the value is
stored. All other threads that can access that value shall be called the "borrowers" of
that value.
* To manage cross-thread references to the rigid object, each borrower will have a proxy
object. The proxy must live as long as the borrowed object is accessible to the
borrower. When the proxy is GC'd in the borrower thread, a message is sent to the owner,
to inform it that one of the borrowers has relinquished access to the object. This
message is sent after a memory barrier.
* When an object X is borrowed, the borrower may extract from it the address of some other
object Y. The borrower need not construct a proxy for Y. However, the borrower must keep
X's proxy alive as long as it has access to Y. To this end, we shall define a
BorrowedRef to be a pair (r,p) of pointers, where r points to a borrowed rigid object,
and p points to a proxy on the borrower's heap. The proxy must itself point to the
borrowed object t from which r was extracted.
* When the owner lends the object, it increases its refcount. After a memory barrier, the
address of the object is pushed into a channel, and the borrower pulls the address out
of the channel and wraps it in a proxy. Due to the increased refcount, the object and
anything reachable from it will continue to live until the proxy is GC'd.
* Inter-thread cycle avoidance:
** A sufficient rule to prevent cycles is that if P is a proxy, and P is reachable from X,
then X cannot be borrowed. This is statically enforceable.
** A more flexible cycle-avoidance rule is that the threads are ranked, and a low-rank
thread may borrow from a high-rank thread but not vice-versa. To enforce this, each
channel will know the rank of the receiving thread. At runtime, a write into a channel
will compare the lender to the borrower's rank, and raise an exception if the lender
has the lower rank.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment