micklat/sharing data without a shared heap

## sharing data without a shared heap
Sharing data without a "shared heap"

This is a proposal for a shared heap without additional GC barriers. It requires some
support from the compiler, in enforcing that only certain kinds of values can be shared
between threads.

* suppose that the nimrod compiler can distinguish between const and non-const pointers
  and refs. I think constness can be inferred and need not be declared in most places.

* Define a "rigid" value to be a heap-allocated type V s.t. every pointer reachable from V
  is constant.

Put otherwise, these types are rigid:
- value types (ints, floats, chars, enums)
- const ref T and const ptr T, if T is rigid itself
- arrays and records of rigid values

* The desired property of rigid types is that as long as a rigid value is alive (not
  GC'd), every value that has ever been reachable from it is also alive. Thus, a reference
  to a rigid object can be passed to a thread other than the thread that owns it, and that
  other thread can safely traverse the object as long as the owning thread does not
  deallocate it.

* Although rigidity may seem like a significant constraint, it is not that bad. Almost all
data in Haskell is rigid: once a value is created, its memory layout is fixed. Fixed-size
numeric arrays are also rigid. Combining functional data structures with fixed-size homogenous
arrays can cover all use-cases adequately.

* Any object can be viewed as rigid, if we treat the mutable references that it contains as unreadable. That is, if x contains a constant pointer <a> and a non-constant pointer <b>, then a rigid view of X can be created by casting it to a type that has a constant pointer <a> and a 4-byte unreadable gap where the <b> pointer is stored.

* Define the "owner" of a heap value to be the thread in whose heap the value is
  stored. All other threads that can access that value shall be called the "borrowers" of
  that value.

* To manage cross-thread references to the rigid object, each borrower will have a proxy
  object. The proxy must live as long as the borrowed object is accessible to the
  borrower. When the proxy is GC'd in the borrower thread, a message is sent to the owner,
  to inform it that one of the borrowers has relinquished access to the object. This
  message is sent after a memory barrier.

* When an object X is borrowed, the borrower may extract from it the address of some other
  object Y. The borrower need not construct a proxy for Y. However, the borrower must keep
  X's proxy alive as long as it has access to Y. To this end, we shall define a
  BorrowedRef to be a pair (r,p) of pointers, where r points to a borrowed rigid object,
  and p points to a proxy on the borrower's heap. The proxy must itself point to the
  borrowed object t from which r was extracted.

* When the owner lends the object, it increases its refcount. After a memory barrier, the
  address of the object is pushed into a channel, and the borrower pulls the address out
  of the channel and wraps it in a proxy. Due to the increased refcount, the object and
  anything reachable from it will continue to live until the proxy is GC'd.

* Inter-thread cycle avoidance:

** A sufficient rule to prevent cycles is that if P is a proxy, and P is reachable from X,
   then X cannot be borrowed. This is statically enforceable.

** A more flexible cycle-avoidance rule is that the threads are ranked, and a low-rank
   thread may borrow from a high-rank thread but not vice-versa. To enforce this, each
   channel will know the rank of the receiving thread. At runtime, a write into a channel
   will compare the lender to the borrower's rank, and raise an exception if the lender
   has the lower rank.
	Sharing data without a "shared heap"

	This is a proposal for a shared heap without additional GC barriers. It requires some
	support from the compiler, in enforcing that only certain kinds of values can be shared
	between threads.

	* suppose that the nimrod compiler can distinguish between const and non-const pointers
	and refs. I think constness can be inferred and need not be declared in most places.

	* Define a "rigid" value to be a heap-allocated type V s.t. every pointer reachable from V
	is constant.

	Put otherwise, these types are rigid:
	- value types (ints, floats, chars, enums)
	- const ref T and const ptr T, if T is rigid itself
	- arrays and records of rigid values

	* The desired property of rigid types is that as long as a rigid value is alive (not
	GC'd), every value that has ever been reachable from it is also alive. Thus, a reference
	to a rigid object can be passed to a thread other than the thread that owns it, and that
	other thread can safely traverse the object as long as the owning thread does not
	deallocate it.

	* Although rigidity may seem like a significant constraint, it is not that bad. Almost all
	data in Haskell is rigid: once a value is created, its memory layout is fixed. Fixed-size
	numeric arrays are also rigid. Combining functional data structures with fixed-size homogenous
	arrays can cover all use-cases adequately.

	* Any object can be viewed as rigid, if we treat the mutable references that it contains as unreadable. That is, if x contains a constant pointer <a> and a non-constant pointer <b>, then a rigid view of X can be created by casting it to a type that has a constant pointer <a> and a 4-byte unreadable gap where the <b> pointer is stored.

	* Define the "owner" of a heap value to be the thread in whose heap the value is
	stored. All other threads that can access that value shall be called the "borrowers" of
	that value.

	* To manage cross-thread references to the rigid object, each borrower will have a proxy
	object. The proxy must live as long as the borrowed object is accessible to the
	borrower. When the proxy is GC'd in the borrower thread, a message is sent to the owner,
	to inform it that one of the borrowers has relinquished access to the object. This
	message is sent after a memory barrier.

	* When an object X is borrowed, the borrower may extract from it the address of some other
	object Y. The borrower need not construct a proxy for Y. However, the borrower must keep
	X's proxy alive as long as it has access to Y. To this end, we shall define a
	BorrowedRef to be a pair (r,p) of pointers, where r points to a borrowed rigid object,
	and p points to a proxy on the borrower's heap. The proxy must itself point to the
	borrowed object t from which r was extracted.

	* When the owner lends the object, it increases its refcount. After a memory barrier, the
	address of the object is pushed into a channel, and the borrower pulls the address out
	of the channel and wraps it in a proxy. Due to the increased refcount, the object and
	anything reachable from it will continue to live until the proxy is GC'd.

	* Inter-thread cycle avoidance:

	** A sufficient rule to prevent cycles is that if P is a proxy, and P is reachable from X,
	then X cannot be borrowed. This is statically enforceable.

	** A more flexible cycle-avoidance rule is that the threads are ranked, and a low-rank
	thread may borrow from a high-rank thread but not vice-versa. To enforce this, each
	channel will know the rank of the receiving thread. At runtime, a write into a channel
	will compare the lender to the borrower's rank, and raise an exception if the lender
	has the lower rank.