From its inception, Julia has had an `Array` type intended to be used any time you need
"a bunch of things together in a row". This is one of the most important and most unusual
types in the language. When `Array` was designed, it was believed (possibly by multiple people,
but Jeff is willing to take the blame) that the easiest-to-use interface would be a
single array type that does most of what users need. Many Julia programmers now feel that
the language and community have outgrown this trade-off, and problems are emerging:

- `Array` has many features, with an unusually large amount of functionality written in C. This makes the code harder to maintain and harder for the compiler to optimize.
- Constructing `Array`s is slower than it should be, since it is implemented with branches in C instead of compiler specialization.
- The `Array` struct layout is opaque, inhibiting optimizations and understanding.
- The extra overhead of all the features of `Array`, even if tolerable in most cases, is a significant cost when arrays are used to implement other types like `Dict`.
`Array` has the following "fancy" features beyond the obvious function of storing a series
of elements:

- Multiple dimensions
- 1-d arrays can be resized efficiently at either end
- Multiple arrays can share the same underlying storage, mostly to support no-copy reshaping
- Foreign pointers can be wrapped as `Array`s to share data without copying. Optionally, we will call `free` on the pointer when the `Array` is garbage collected.
- An `Array` can use a `String` as storage, in order to efficiently initialize it
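For concreteness, here is how those features surface in today's `Array` API (this is all standard current Julia, not part of the proposal; any redesign has to keep supporting these behaviors):

```julia
v = [2, 3]
pushfirst!(v, 1)      # 1-d arrays resize efficiently at the front...
push!(v, 4)           # ...and at the back
m = reshape(v, 2, 2)  # shares v's underlying storage: no copy is made
m[1, 1] = 99          # writes through to v[1]

# Wrapping a foreign pointer; own = true asks Julia to call free on
# the pointer when the wrapping Array is garbage collected.
p = Libc.malloc(4 * sizeof(Int))
w = unsafe_wrap(Array, Ptr{Int}(p), 4; own = true)
fill!(w, 0)
```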
We propose a new type with the following representation:
```julia
mutable struct MemoryBlock{T} <: AbstractArray{T,1}
    len::Int
    const ptr::Ptr{T} # points to inline data or foreign memory
    # inline data ...
end
```
The intention of this abstraction is to represent "a range of memory beginning at a fixed
location". The memory space can either be directly after the metadata fields (in which
case the `ptr` field is redundant) or refer to foreign memory via the `ptr` field. In
both cases `ptr` points to the data. When the data is stored in-line, we predict no
extra overhead from loading the `ptr` field, since it shares a cache line with the data.
The name of this type is not fully settled. Names that have been considered include:

- `MemoryBlock{T}`
- `Memory{T}` --- this is what C# calls it
- `MemBlock{T}`
- `Mem{T}`
- `MemRegion{T}`
- `Storage{T}`
- `Buffer{T}`

`Buffer` has been the preferred name in past proposals, but we feel it is not descriptive
or intuitive.
To handle very large data, it is necessary to expose the ability to grow a memory region
by extending the OS kernel's mapping. Therefore `MemoryBlock` will have a resizing
function that tries to grow the block in place, incrementing the `len` field if that is
possible and indicating failure if it is not.
Trying to resize a block represented by a foreign pointer will fail.
Growing blocks by moving, or any form of shrinking blocks, is left to higher-level
abstractions to implement.
We feel there is no need for a variant of this type that does not support resizing. If truly necessary, it is possible to wrap a `MemoryBlock` in another type whose length is fixed. Because a block cannot shrink, assuming a fixed data size is sound.
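A minimal sketch of such a fixed-length wrapper (hypothetical: `MemoryBlock` does not exist yet, so a plain `AbstractVector` stands in for the block, and the type name is ours, not part of the proposal):

```julia
struct FixedSizeVector{T,A<:AbstractVector{T}} <: AbstractVector{T}
    data::A    # the (possibly growable) underlying block
    len::Int   # captured at construction; sound because blocks never shrink
end
FixedSizeVector(data::AbstractVector) = FixedSizeVector(data, length(data))

Base.size(v::FixedSizeVector) = (v.len,)
Base.getindex(v::FixedSizeVector, i::Int) = v.data[i]
Base.setindex!(v::FixedSizeVector, x, i::Int) = (v.data[i] = x; v)
```

Even if the wrapped block later grows, the wrapper keeps reporting the length it observed at construction, which is exactly the invariant a fixed-size consumer needs.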
Note this type does not and cannot implement `resize!` or related functions like `push!`,
because it has different semantics. The only purpose of the resizing API here is to
take advantage of extra space if the memory manager happens to have it available.
Several specific interface functions have been proposed here (names are provisional):

- `grow!(mem, n)`: Grow size to `n` elements if possible, otherwise do nothing. Returns the new length.
- `grow_upto!(mem, n)`: Grow size to at most `n` elements, possibly fewer, returning the new length.
- `isgrowable(mem, n)`: Returns whether it is possible to grow to `n` elements.
- OR: `available(mem)`: Return the total space available for this block, i.e. the maximum you could grow to.
Note that the latter two functions (the queries) are potential sources of race conditions, since the information might be stale by the time you use it.
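To illustrate the intended division of labor, here is a hypothetical sketch of how a higher-level `push!`-style operation could sit on top of `grow!`: try in-place growth first, and allocate-and-move only on failure. Neither `MemoryBlock` nor `grow!` exists yet, so a fixed-capacity `Vector` plus a length models the block, and all names below are ours:

```julia
mutable struct Block{T}
    data::Vector{T}   # stand-in for the raw memory region
    len::Int          # number of elements currently in use
end
capacity(b::Block) = length(b.data)

# Grow in place if the block happens to have room; otherwise do nothing.
function grow!(b::Block, n::Int)
    if b.len < n <= capacity(b)
        b.len = n
    end
    return b.len
end

# A higher-level push!: in-place growth when possible, move when not.
function push_via_grow!(b::Block{T}, x::T) where {T}
    n = b.len + 1
    if grow!(b, n) < n
        # In-place growth failed: allocate a bigger block and move.
        newdata = Vector{T}(undef, max(4, 2 * capacity(b)))
        copyto!(newdata, 1, b.data, 1, b.len)
        b.data = newdata
        b.len = n
    end
    b.data[b.len] = x
    return b
end
```

In the real design, the allocate-and-move branch is what `Array` would implement, while `grow!` is the only piece `MemoryBlock` itself provides.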
On top of `MemoryBlock`, `Array` can then be defined as:

```julia
mutable struct Array{T,N} <: AbstractArray{T,N}
    data::MemoryBlock{T}
    axes::NTuple{N,Int}
    offset::Int # for 1d only
end
```
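One payoff of the explicit `offset` field (used for 1-d only) is that removing elements from the front becomes a constant-time field update rather than a data move. A hypothetical model of that, with a plain `Vector` standing in for `MemoryBlock` and names chosen for illustration:

```julia
mutable struct SketchVector{T}
    data::Vector{T}   # stand-in for MemoryBlock{T}
    axes::Tuple{Int}
    offset::Int
end
Base.length(v::SketchVector) = v.axes[1]
Base.getindex(v::SketchVector, i::Int) = v.data[v.offset + i]

function popfirst_sketch!(v::SketchVector)
    x = v[1]
    v.offset += 1              # O(1): no elements are moved
    v.axes = (v.axes[1] - 1,)
    return x
end
```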
`Base.StringVector` returns a `Vector` whose storage is a `String`, allowing conversion
via the `String` constructor without copying. This is used to initialize (immutable)
strings efficiently. Declaring the data field of `Array` as `Union{MemoryBlock, String}`
is probably not a good idea.

A possible approach is to add a new type (called perhaps `StringVector` or `StringBuilder`)
for this purpose. `StringVector` is not exported, so changing it is technically allowed,
though the impact on the ecosystem would have to be evaluated. If necessary, we can
keep `StringVector` returning a `Vector`, with the only issue being degraded performance
of the subsequent `String(::Vector)` call.
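For reference, the current behavior under discussion: `Base.StringVector` hands out a byte vector, and `String(::Vector{UInt8})` takes over its buffer without copying, emptying the vector in the process.

```julia
buf = Base.StringVector(3)   # Vector{UInt8} whose storage is a String
buf[1] = UInt8('a')
buf[2] = UInt8('b')
buf[3] = UInt8('c')
s = String(buf)              # no copy: the buffer becomes the String,
                             # and buf is truncated to length 0
```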
- Should `MemoryBlock` have an extra type parameter to specify mutability?
- Should `MemoryBlock` have an extra type parameter to specify atomic access?
Compilers always want to know when a value is constant. `Array` has the complication
that the size is constant only when `N > 1`. Our layout system cannot express this, so
it needs to be hacked into the compiler and will continue to be.
This design requires allocating two objects (an `Array` and a `MemoryBlock`) for most arrays, giving it a significant disadvantage over the current implementation. We believe this can be fixed by adding a feature to the GC to allocate two objects at once:

1. Space big enough to hold both objects is allocated.
2. A header bit in the first object indicates that it is followed by another.
3. The compiler recognizes the pattern

```julia
Array(length) = new(Core.MemoryBlock(length))
```

and replaces it with a double-allocation call.
It is important (especially in the GC) to have fast access to the layout metadata
for an array's element type. One idea is to set `MemoryBlock{T}.layout === T.layout`
to avoid an extra indirection.
We need to consider how this design will affect advanced array optimizations, including scalar replacement of aggregates (SROA) for arrays, allocation elision, and allocation hoisting. The two-layer object structure might make these more difficult.
A fixed-size vector type would also be useful for implementing data structures like a circular buffer, so having that only be internal would be a shame. A regular implementation of a circular buffer could work with `MemoryBlock` directly and doesn't need the resizeability of a mutable `MemoryBlock`, but I'm not sure the stack-allocatability of that is enough of a justification to have it: the use cases where fixed size AND stack allocation matter at the same time (at least for a circular buffer) are rather niche, and ought to be possible to emulate by promoting the mutable allocation to the stack (provided we can prove that the buffer doesn't escape subsequent calls, doesn't end up in TLS, and isn't shared between threads). If I'm not mistaken, that should apply to sorting algorithms as well.