@JeffBezanson
Last active January 3, 2024 01:53

Julep: Redesigning Array using a new lower-level container type

History and motivation

From its inception, Julia has had an Array type intended to be used any time you need "a bunch of things together in a row". This is one of the most important and most unusual types in the language. When Array was designed, it was believed (possibly by multiple people, but Jeff is willing to take the blame) that the easiest-to-use interface would be a single array type that does most of what users need. Many Julia programmers now feel that the language and community have outgrown this trade-off, and problems are emerging:

  • Array has many features, with an unusually large amount of functionality written in C. This makes the code harder to maintain and harder for the compiler to optimize.
  • Constructing Arrays is slower than it should be, since it is implemented with branches in C instead of compiler specialization.
  • The Array struct layout is opaque, inhibiting optimizations and understanding.
  • The extra overhead of all the features of Array, even if tolerable in most cases, is a significant cost when arrays are used to implement other types like Dict.

Array has the following "fancy" features beyond the obvious function of storing a series of elements:

  • Multiple dimensions
  • 1-d arrays can be resized efficiently at either end
  • Multiple arrays can share the same underlying storage, mostly to support no-copy reshaping
  • Foreign pointers can be wrapped as Arrays to share data without copying. Optionally, we will call free on the pointer when the Array is garbage collected.
  • An Array can use a String as storage, in order to efficiently initialize it

Proposed design

We propose a new type with the following representation:

mutable struct MemoryBlock{T} <: AbstractArray{T,1}
    len::Int
    const ptr::Ptr{T}  # points to inline data or foreign memory
    # inline data ...
end

The intention of this abstraction is to represent "a range of memory beginning at a fixed location". The memory space can either be directly after the metadata fields (in which case the ptr field is redundant) or refer to foreign memory via the ptr field. In both cases ptr points to the data. When the data is stored in-line, we predict no extra overhead from loading the ptr field since it shares a cache line with the data.
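To illustrate the invariant described above (ptr always points at the data, whether it lives in-line or in foreign memory), here is a rough runnable mock; MockBlock and its Vector-backed storage are stand-ins for illustration only, since the real type would be implemented in Core with the in-line data directly following the len and ptr fields:

```julia
# Hypothetical mock of the proposed layout. A Vector stands in for the
# in-line (or foreign) storage; the invariant being shown is that all
# element access goes through `ptr`, wherever the memory lives.
mutable struct MockBlock{T}
    len::Int
    const ptr::Ptr{T}   # always points at the first element
    data::Vector{T}     # stand-in for in-line data or foreign memory
end

MockBlock(v::Vector{T}) where {T} = MockBlock{T}(length(v), pointer(v), v)

b = MockBlock([1, 2, 3])
# Loads go through ptr regardless of storage mode:
@assert unsafe_load(b.ptr, 2) == 2
```

In the real design, the load of ptr for the in-line case is the redundancy the text mentions; the prediction is that it is essentially free because it shares a cache line with the data itself.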

The name of this type is not fully settled. Names that have been considered include:

  • MemoryBlock{T}
  • Memory{T} --- this is what C# calls it
  • MemBlock{T}
  • Mem{T}
  • MemRegion{T}
  • Storage{T}
  • Buffer{T}

Buffer has been the preferred name in past proposals, but we feel it is not descriptive or intuitive.

Resizing

To handle very large data, it is necessary to expose the ability to grow a memory region by extending the OS kernel's mapping. Therefore MemoryBlock will have a resizing function that tries to grow the block in-place, incrementing the len field if it is possible and indicating failure if it is not possible. Trying to resize a block represented by a foreign pointer will fail. Growing blocks by moving, or any form of shrinking blocks, is left to higher-level abstractions to implement.

We feel there is no need for a variant of this type that does not support resizing. If truly necessary, it is possible to wrap a MemoryBlock in another type whose length is fixed. Because a block cannot shrink, assuming a fixed data size is sound.

Note this type does not and cannot implement resize! or related functions like push!, because it has different semantics. The only purpose of the resizing API here is to take advantage of extra space if the memory manager happens to have it available.

Several specific interface functions have been proposed here (names are provisional):

  • grow!(mem, n): Grow size to n elements if possible, otherwise do nothing. Returns new length.
  • grow_upto!(mem, n): Grow size to at most n elements, possibly fewer, returning the new length.
  • isgrowable(mem, n): Returns whether it is possible to grow to n elements. OR:
  • available(mem): Return total space available for this block, i.e. the maximum you could grow to.

Note the second two functions (the queries) are potential sources of race conditions, since the information might be stale by the time you use it.
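As an example of the intended division of labor, a higher-level push!-style operation could try the in-place path first and fall back to allocating a fresh block and copying. This is a sketch against the proposed API only: grow!, MemoryBlock, and the field names here are provisional names from this Julep, not existing functions.

```julia
# Hypothetical sketch; assumes grow!(mem, n) from the proposal, which
# grows in place if possible and otherwise leaves the length unchanged.
function push_grow!(a, x)
    if a.size == length(a.mem)
        # First, try to extend the kernel mapping in place.
        if grow!(a.mem, a.size + 1) < a.size + 1
            # Failure: grow-by-moving is this layer's job, not MemoryBlock's.
            bigger = MemoryBlock{eltype(a.mem)}(undef, max(2 * a.size, 4))
            copyto!(bigger, a.mem)
            a.mem = bigger
        end
    end
    a.size += 1
    a.mem[a.size] = x
    return a
end
```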

New Array definition

mutable struct Array{T,N} <: AbstractArray{T,N}
    data::MemoryBlock{T}
    axes::NTuple{N,Int}
    offset::Int  # for 1d only
end
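Under this layout, 1-d indexing presumably adds the offset before indexing into the block. A runnable mock of the intended arithmetic, using a plain Vector where MemoryBlock would go (names and fields mirror the proposed struct but are illustrative only):

```julia
# Mock of the proposed Array struct; `data` stands in for MemoryBlock{T}.
mutable struct MockArray{T,N} <: AbstractArray{T,N}
    data::Vector{T}
    axes::NTuple{N,Int}
    offset::Int   # meaningful for 1-d only
end

Base.size(a::MockArray) = a.axes
Base.getindex(a::MockArray{T,1}, i::Int) where {T} = a.data[a.offset + i]
Base.setindex!(a::MockArray{T,1}, x, i::Int) where {T} = (a.data[a.offset + i] = x)

# A 3-element "array" viewing elements 3:5 of a 10-element block:
v = MockArray{Int,1}(collect(10:19), (3,), 2)
@assert v[1] == 12
```

The offset field is what makes pushfirst! efficient (and, as discussed in the comments below, what could let some views share the parent's type).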

String-backed arrays

Base.StringVector returns a Vector whose storage is a String, allowing conversion via the String constructor without copying. This is used to initialize (immutable) strings efficiently. Declaring the data field of Array as Union{MemoryBlock, String} is probably not a good idea.

A possible approach is to add a new type (called perhaps StringVector or StringBuilder) for this purpose. StringVector is not exported, so changing it is technically allowed, though the impact on the ecosystem would have to be evaluated. If necessary, we can maintain StringVector returning a Vector, with the only issue being degraded performance of the subsequent String(::Vector) call.
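For reference, the current pattern under discussion: Base.StringVector exists today (internal, hence unexported) and returns a Vector{UInt8} whose storage is a String, so the final String constructor call takes ownership of the bytes without copying.

```julia
# Current internal API: the String(::Vector{UInt8}) call at the end is
# copy-free because the Vector's storage is already a String.
buf = Base.StringVector(5)
copyto!(buf, codeunits("hello"))
s = String(buf)    # takes ownership; buf is truncated afterwards
@assert s == "hello"
```

It is this zero-copy handoff that would need a new home (StringVector/StringBuilder or similar) if Array can no longer be backed by a String.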

Remaining questions

  • Should MemoryBlock have an extra type parameter to specify mutability?
  • Should MemoryBlock have an extra type parameter to specify atomic access?

Implementation considerations

Dimension-dependent const-ness

Compilers always want to know when a value is constant. Array has the complication that the size is constant only when N>1. Our layout system cannot express this, so it needs to be hacked into the compiler and will continue to be.

Batched allocation of Array and MemoryBlock

This design requires allocating two objects (an Array and a MemoryBlock) for most arrays, giving it a significant disadvantage over the current implementation. We believe this can be fixed by adding a feature to the GC to allocate two objects at once: (1) Space big enough to hold both objects is allocated, (2) A header bit in the first object indicates that it is followed by another, (3) The compiler recognizes this pattern:

Array(length) = new(Core.MemoryBlock(length))

and replaces it with a double allocation call.

Element layout access

It is important (especially in the GC) to have fast access to the layout metadata for an array's element type. One idea is to set MemoryBlock{T}.layout === T.layout to avoid an extra indirection.

Allocation optimizations

We need to consider how this design will affect advanced array optimizations including SROA of arrays, allocation elision, and allocation hoisting. The two-layer object structure might make this more difficult.

@Seelengrab commented May 24, 2023

A fixed-size vector type would also be useful for implementing data structures like a circular buffer, so having that only be internal would be a shame. A regular implementation of a circular buffer could work with MemoryBlock directly and doesn't need the resizeability of a mutable MemoryBlock. I'm not sure the stack-allocatability of that is enough of a justification to have it, though: the use cases where fixed size AND stack allocation matter at the same time (at least for a circular buffer) are rather niche, and ought to be possible to emulate by promoting the mutable allocation to the stack (provided we can prove that the buffer doesn't escape subsequent calls, doesn't end up in TLS, and isn't shared between threads). If I'm not mistaken, that should apply to sorting algorithms as well.
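To make the circular-buffer point concrete: such a structure only ever needs a fixed-capacity block plus head/count fields. A minimal runnable sketch, using a plain Vector where the proposed fixed-size block (or a fixed-length wrapper around a MemoryBlock) would go:

```julia
# Minimal ring buffer; `buf` never changes size after construction,
# which is exactly the fixed-size-storage use case being discussed.
mutable struct Ring{T}
    buf::Vector{T}   # stand-in for a fixed-size memory block
    head::Int        # 0-based index of the oldest element
    count::Int
end
Ring{T}(cap::Int) where {T} = Ring{T}(Vector{T}(undef, cap), 0, 0)

function Base.push!(r::Ring, x)
    cap = length(r.buf)
    tail = mod(r.head + r.count, cap) + 1
    r.buf[tail] = x
    # When full, overwrite the oldest element instead of growing.
    r.count < cap ? (r.count += 1) : (r.head = mod(r.head + 1, cap))
    return r
end
Base.getindex(r::Ring, i::Int) = r.buf[mod(r.head + i - 1, length(r.buf)) + 1]

r = Ring{Int}(3)
for x in 1:5; push!(r, x); end   # the two oldest entries get overwritten
@assert [r[1], r[2], r[3]] == [3, 4, 5]
```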

@Tokazama

Here's my exhaustive list of vector characteristics we should support in Julia

  1. Resizable size, mutable contents (e.g., Vector)
  2. Fixed size, mutable contents
  3. Fixed size, immutable contents
  4. Static size, mutable contents (e.g., StaticArrays.MVector, but with support for non-bits types)
  5. Static size, immutable contents (e.g., StaticArrays.SVector)

1, 4, and 5 are currently possible, but there are certain quirks in their implementations that we regularly need to work around, which hopefully this proposal can solve. I've hacked together implementations that fit under 2 and 3, but it's not currently possible to do these in a way that fully takes advantage of their corresponding attributes.

I'm not sure we need to explicitly define a variant for each of these types in Base, but we should have a design that enables them. Similarly, multidimensional variants such as FixedSizedArray might be just as good implemented in a separate package, as long as the underlying MemoryBlock type supports it.

@Seelengrab commented May 25, 2023

To make my comments from the triage call more concrete/written down for posterity:

I think having MemoryBlock be immutable (in regard to the len and pointer, not the underlying data) is preferable to having the type be mutable, forcing users of this API to handle reallocation with a type-level distinction rather than a branch on a runtime return value with no type distinction (which would make it easier to mishandle the "failure to allocate" case). For example, some API could return a Union{Nothing, MemoryBlock}, potentially giving back the same object with the same pointer, just with a different len. My reasoning is that "failure to allocate" should be handled as early as possible, not only at the time of trying to write to the memory block, which may be very disconnected from the reallocation, making it harder to debug. This doesn't mean the API would have to throw an actual exception that needs to be caught with try/catch; a type-level distinction is IMO enough with union splitting, leaving the decision to throw an error or not up to the user.

The general philosophy behind this is that the lower level an operation is, the earlier & louder (from a "notify me about doing things potentially wrong" POV) I want my error cases to be.

Another distinct reason I'd favor immutability is that (I think) this makes it easier for the compiler to hoist this to the stack, if possible, requiring fewer optimizations/allowing us to take more advantage of existing passes (though this is really just a gut feeling; I don't have metrics proving it). This is particularly appealing if no reallocation happens in the code at all, since then the object will never even potentially change.
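A sketch of what the proposed type-level API could look like (all names here are hypothetical, not part of the Julep): the grow function returns either nothing on failure or a block, possibly the same memory with a larger len, and callers must branch before they can touch the result.

```julia
# Hypothetical signature under the immutable-MemoryBlock design:
#   grow(mem::MemoryBlock{T}, n)::Union{Nothing, MemoryBlock{T}}
function grow_or_throw(mem, n)
    grown = grow(mem, n)
    # The failure case must be handled here, at the allocation site:
    grown === nothing && throw(OutOfMemoryError())
    return grown   # after union splitting, the type is concrete from here on
end
```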

@Tokazama

The general philosophy behind this is that the lower level an operation is, the earlier & louder (from a "notify me about doing things potentially wrong" POV) I want my error cases to be.

This seems backwards to me. Lower-level things throw fewer errors and have stricter assumptions about the values passed to them. Similarly, malloc, realloc, and calloc work with raw memory and return null pointers. Layers of safety are then added when wrapping this in something else.

I think it makes more sense to improve the resizing interface by adding a can_change_size method. Then errors from something like push! could be thrown earlier and be more informative.
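That suggestion could look something like the following (can_change_size is the name proposed in the comment above, not an existing Base function; the default and the opt-in shown here are guesses at the shape of such a trait):

```julia
# Hypothetical trait: fixed-size by default, opted into by resizable types.
can_change_size(::Type) = false
can_change_size(::Type{<:Vector}) = true

function checked_push!(v, x)
    # Fail fast, with a message naming the offending type, instead of a
    # MethodError deep inside push!.
    can_change_size(typeof(v)) ||
        throw(ArgumentError("$(typeof(v)) has a fixed size"))
    push!(v, x)
end

@assert checked_push!([1, 2], 3) == [1, 2, 3]
```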

@Seelengrab commented May 25, 2023

Similarly, alloc, realloc, calloc work with raw memory and return null pointers.

And the fact that they do that is the cause for a whole range of memory safety issues. There's a reason why Rust prefers to make that error explicit through an AllocError instead of a special sentinel value - if the allocation request returns, it is always a valid object (see also the actual API provided by the Rust runtime, which is what I think we'd mirror here with MemoryBlock). Having it return a Union is just another flavor of that sort of explicitness. I think we'd be wise to follow Rust here.

@Tokazama

@Seelengrab commented May 26, 2023

Which we're explicitly not doing, since we don't want a realloc, as has been repeated multiple times already. Not to mention that this seems to be a backwards compatibility thing and is not guaranteed, as the page also states that it's legal for allocators not to do that:

Implementations are encouraged to return null on memory exhaustion rather than panicking or aborting, but this is not a strict requirement. (Specifically: it is legal to implement this trait atop an underlying native allocation library that aborts on memory exhaustion.)

The reasoning for this should be clear - if a memory allocation fails, in a lot of usecases a user of this can't realistically recover by trying to allocate again (or if they can, they'd definitely need to know in which cases it's safe to do so, which "this is just a null pointer" is not fine grained enough for).

I'm not saying that we must do this, but I think it should at least be considered, seeing as others are seemingly thinking about being more stringent here than a sentinel value.

EDIT: Not to mention that the old interface returning a raw pointer (potentially null) is deprecated and will be removed: https://doc.rust-lang.org/alloc/alloc/trait.Allocator.html, https://doc.rust-lang.org/alloc/alloc/fn.alloc.html

@Seelengrab commented May 26, 2023

In particular, Rust RFC 1398 "Kinds of Allocators" as well as the extensive surrounding discussions in its PR and associated issue are very useful resources here. The allocators working group also has a lot of discussion about this and similar topics.

One question that came up while reading through the above - what are the semantics of a zero-length allocation of MemoryBlock? Is the pointer guaranteed to be unique/can that happen in the first place? What are the semantics of requesting such an allocation?

@oscardssmith

For size zero, I don't think the pointer has to be unique, unless it is a foreign pointer (but it likely will be unless we special-case it).

@Tokazama commented May 26, 2023

At the risk of being incredibly unpopular, I don't think Rust has actually solved safe memory issues. The allocator discussion even mentions enabling GC in future iterations of Rust at one point, but that approach doesn't seem to really support it with that sort of allocator. It's also an experimental API at this point. It doesn't actually panic; it can give you an Err type, which gets us into union splitting. What's the cost of doing that for every single allocation, given that Rust already has problems with compile times? I just don't think it's helpful to follow the Rust model to a T here.

@Seelengrab commented May 26, 2023

I think it's worthwhile to at least have that discussion based on some actual data instead of just dismissing the idea in its entirety without anything concrete going against it. All I've done so far is point to some prior art that chose to go that route, and suggested that we should probably read about why they did that. If you're worried about union splitting, throw an error, which you'd have to do in any wrapper around MemoryBlock anyway if the allocation can't be retried, no matter whether the "low level" API returns a sentinel value or not (which still begs the question of how to distinguish between "there isn't enough memory, free some" and "I couldn't resize this, allocate a new thing").

I personally believe that union splitting is favorable here (and if it isn't, that begs the question why such a tiny union is tripping the compiler up).

@Tokazama

But throwing an error will always slow things down, unless there's a way to guarantee that in certain cases we won't throw one.

@Seelengrab commented May 26, 2023

That's irrespective of what the low level API does though - any safe use of the low level API needs to check & potentially throw an error, no matter whether its done through checking a sentinel value (bad, easy to forget, makes invalid allocations representable), returning a Union (okay-ish, needs union splitting, but forces handling on the user side) or actively throwing an exception (okay-ish, needs to be caught to be handled, doesn't influence the return type, easy to forget & crash & burn instead). What I'm arguing for is the absolute minimum of type safety this operation can perform.

Not knowing whether that error will ultimately be thrown is what (conceptually) makes such heap allocations expensive (the cleanup of dead allocations in a GC aside) - if we know that the error can't possibly happen, we'd have eliminated the error path entirely because some other lifetime (& allocation size!) analysis allowed us to hoist the allocation from runtime to compile time, promoting it to a static allocation.

@chriselrod commented May 29, 2023

I know this would probably be considered breaking, but ---
it'd be great if we had many cases of views not changing the types of the parents.

sum(v) and sum(@view(v[i:j])) are basically doing the same thing; it'd be great if we could avoid compiling any more specializations than necessary. That is, it'd be great to avoid compiling both sum(::Vector{T}) and sum(::SubArray{...}).

With the current proposal

mutable struct Array{T,N} <: AbstractArray{T,N}
    data::MemoryBlock{T}
    axes::NTuple{N,Int}
    offset::Int  # for 1d only
end

that'd be possible in the 1d case via something like

function Base.view(v::Array{T,1}, r::UnitRange)
  checkbounds(v, r)
  Array(v.data, (length(r),), v.offset + first(r)-1)
end

If we used offset for higher-dimensional arrays, something like this could also be done as long as the first N-1 indices are Colon, e.g. A[:, j] or A[:, :, i:j] could return Arrays instead of SubArrays.
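Concretely, a leading-colon slice selects a contiguous range of the parent's memory, and when the result is 1-d only the existing 1-d offset field is needed. A hypothetical sketch (not runnable against current Base; it assumes the proposed data/offset fields and an Array constructor taking them):

```julia
# Hypothetical: a column view as a 1-d Array sharing the same MemoryBlock.
function colview(A::Array{T,2}, j::Integer) where {T}
    m = size(A, 1)
    # Column j occupies elements (j-1)*m + 1 : j*m of the block.
    Array{T,1}(A.data, (m,), A.offset + (j - 1) * m)
end
```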

A problem here of course is that it may make push! and pushfirst! more awkward?

@Tokazama

I think it's worth taking this slow. Implement the most basic components first, and then we can play around with this sort of stuff in packages before fully putting it into Array in one release. It would be nice if everything just used this from the get-go, but taking it slow won't stop us from getting work done with our own optimizations on it.

@LilithHafner

A problem here of course is that it may make push! and pushfirst! more awkward?

If the result has more than one dimension, then that should not be a problem, because higher-dimensional arrays are not resizable.

@chriselrod

If the result has more than one dimension, then that should not be a problem because higher dimensional arrays are not resizable

julia> x = collect(1:10);

julia> y = @view x[3:5];

julia> push!(y, 2)
ERROR: MethodError: no method matching resize!(::SubArray{Int64, 1, Vector{Int64}, Tuple{UnitRange{Int64}}, true}, ::Int64)

julia> pushfirst!(y, 2)
ERROR: MethodError: no method matching pushfirst!(::SubArray{Int64, 1, Vector{Int64}, Tuple{UnitRange{Int64}}, true}, ::Int64)

julia> y'
1×3 adjoint(view(::Vector{Int64}, 3:5)) with eltype Int64:
 3  4  5

julia> push!(x, 11);

julia> pushfirst!(x, 0);

julia> x'
1×12 adjoint(::Vector{Int64}) with eltype Int64:
 0  1  2  3  4  5  6  7  8  9  10  11

julia> y'
1×3 adjoint(view(::Vector{Int64}, 3:5)) with eltype Int64:
 2  3  4

I guess it can already lead to some unexpected behavior currently, with the contents of y shifting when you pushfirst! in the parent array.

We of course wouldn't get MethodErrors if the views also returned Vectors.

I don't think push!/pushfirst! are that much of a problem for my suggestion.

As an aside, I'm surprised this worked:

julia> for _ = 1:10; pop!(x); end;

julia> y'
1×3 adjoint(view(::Vector{Int64}, 3:5)) with eltype Int64:
 2  3  4

julia> x'
1×2 adjoint(::Vector{Int64}) with eltype Int64:
 0  1

@Tokazama

I think this is a Julia problem and not a problem with this proposal. We need a resizable trait/method so this can be formally tested for in Julia.

@chriselrod

I think this is a Julia problem and not a problem with this proposal. We need a resizable trait/method so this can be formally tested for in Julia.

I do still want to cut down on the number of unique types a program is likely to encounter, so I would not want a view and the parent array to have different types. Meaning I'd be opposed to traits/methods.

@Tokazama

Whether something can be resized or not doesn't require a type-system approach. If the memory block can't grow, we should have a way of determining this for users.

@JeffreySarnoff

All zero-length non-resizable memory allocations (of a given dimension?) can be, and probably should be, identical. All zero-length resizable memory allocations (of a given dimension?) must be expandable into unique places; whether they begin that way or begin assigned to some predetermined const place does not seem to matter much, although separate initializations may simplify the code slightly.

@timholy commented Jun 1, 2023

Small comment: as written, the axes field is actually a dims field. We might consider whether it would be time to allow arbitrary axes on Array itself. I'm not certain that's a good idea. If we don't allow it, we should call that field dims instead of axes.

@Tokazama commented Jun 1, 2023

All zero length non-resizable memory allocations (of a given dimension?) can be and probably should be identical. All zero length resizable memory allocations (of a given dimension?) must be expandable into unique places -- whether they begin that way or they begin assigned some to some predetermined const place does not seem to matter much, although separate initializations may simplify the code slightly.

Just to be clear, are you saying something like MemorySegment{T}(undef, 0) should end up being treated the same way Core.svec() and Nothing() are, i.e. permanently allocated?

@Seelengrab commented Jun 1, 2023

He's saying that MemorySegment{T}(undef, 0) !== MemorySegment{T}(undef, 0), and that growing either by 5 elements must preserve the inequality. So no, not like Nothing.

@oscardssmith

A size-0 MemorySegment won't be resizable.

@vtjnash commented Jun 7, 2023

@JeffBezanson Is there a place in Memory{T} for keeping the isshared bit, which is a hint to the Array that it should do copy-on-write instead of using memmove when resizing?

@oscardssmith

We can keep it in the foreign pointer, but I'm not sure we should.

@Tokazama commented Jun 7, 2023

we can keep it in the foreign pointer, but I'm not sure we should.

Could that potentially mess with foreign pointers that use those bits?

@oscardssmith

Pointers have bits to steal because CPUs only have 48 bits of the 64-bit address space wired up.

@Tokazama commented Jun 7, 2023

But if they're foreign pointers, how do we know those bits are free?

Sorry maybe that's a dumb question. Maybe just ignore me.
