Skip to content

Instantly share code, notes, and snippets.

@JeffBezanson
Last active January 3, 2024 01:53
Show Gist options
  • Save JeffBezanson/a25dde3bebb5a734af87bb5ddcf31fb0 to your computer and use it in GitHub Desktop.
Save JeffBezanson/a25dde3bebb5a734af87bb5ddcf31fb0 to your computer and use it in GitHub Desktop.

Julep: Redesigning Array using a new lower-level container type

History and motivation

From its inception, Julia has had an Array type intended to be used any time you need "a bunch of things together in row". This is one of the most important and most unusual types in the language. When Array was designed, it was believed (possibly by multiple people, but Jeff is willing to take the blame) that the easiest-to-use interface would be a single array type that does most of what users need. Many Julia programmers now feel that the language and community have outgrown this trade-off, and problems are emerging:

  • Array has many features, with an unusually large amount of functionality written in C. This makes the code harder to maintain and harder for the compiler to optimize.
  • Constructing Arrays is slower than it should be, since it is implemented with branches in C instead of compiler specialization.
  • The Array struct layout is opaque, inhibiting optimizations and understanding.
  • The extra overhead of all the features of Array, even if tolerable in most cases, is a significant cost when arrays are used to implement other types like Dict.

Array has the following "fancy" features beyond the obvious function of storing a series of elements:

  • Multiple dimensions
  • 1-d arrays can be resized efficiently at either end
  • Multiple arrays can share the same underlying storage, mostly to support no-copy reshaping
  • Foreign pointers can be wrapped as Arrays to share data without copying. Optionally, we will call free on the pointer when the Array is garbage collected.
  • An Array can use a String as storage, in order to efficiently initialize it

Proposed design

We propose a new type with the following representation:

mutable struct MemoryBlock{T} <: AbstractArray{T,1}
    len::Int
    const ptr::Ptr{T}  # points to inline data or foreign memory
    # inline data ...
end

The intention of this abstraction is to represent "a range of memory beginning at a fixed location". The memory space can either be directly after the metadata fields (in which case the ptr field is redundant) or refer to foreign memory via the ptr field. In both cases ptr points to the data. When the data is stored in-line, we predict no extra overhead from loading the ptr field since it shares a cache line with the data.

The name of this type is not fully settled. Names that have been considered include:

  • MemoryBlock{T}
  • Memory{T} --- this is what C# calls it
  • MemBlock{T}
  • Mem{T}
  • MemRegion{T}
  • Storage{T}
  • Buffer{T}

Buffer has been the preferred name in past proposals, but we feel it is not descriptive or intuitive.

Resizing

To handle very large data, it is necessary to expose the ability to grow a memory region by extending the OS kernel's mapping. Therefore MemoryBlock will have a resizing function that tries to grow the block in-place, incrementing the len field if it is possible and indicating failure if it is not possible. Trying to resize a block represented by a foreign pointer will fail. Growing blocks by moving, or any form of shrinking blocks, is left to higher-level abstractions to implement.

We feel there is no need for a variant of this type that does not support resizing. If truly necessary, it is possible to wrap a MemoryBlock in another type whose length is fixed. Because a block cannot shrink, assuming a fixed data size is sound.

Note this type does not and cannot implement resize! or related functions like push!, because it has different semantics. The only purpose of the resizing API here is to take advantage of extra space if the memory manager happens to have it available.

Several specific interface functions have been proposed here (names are provisional):

  • grow!(mem, n): Grow size to n elements if possible, otherwise do nothing. Returns new length.
  • grow_upto!(mem, n): Grow size to at most n elements, possibly fewer, returning the new length.
  • isgrowable(mem, n): Returns whether it is possible to grow to n elements. OR:
  • available(mem): Return total space available for this block, i.e. the maximum you could grow to.

Note the second two functions (the queries) are potential sources of race conditions, since the information might be stale by the time you use it.

New Array definition

mutable struct Array{T,N} <: AbstractArray{T,N}
    data::MemoryBlock{T}
    axes::NTuple{N,Int}
    offset::Int  # for 1d only
end

String-backed arrays

Base.StringVector returns a Vector whose storage is a String, allowing conversion via the String constructor without copying. This is used to initialize (immutable) strings efficiently. Declaring the data field of Array as Union{MemoryBlock, String} is probably not a good idea.

A possible approach is to add a new type (called perhaps StringVector or StringBuilder) for this purpose. StringVector is not exported, so changing it is technically allowed, though the impact on the ecosystem would have to be evaluated. If necessary, we can maintain StringVector returning a Vector, with the only issue being degraded performance of the subsequent String(::Vector) call.

Remaining questions

  • Should MemoryBlock have an extra type parameter to specify mutability?
  • Should MemoryBlock have an extra type parameter to specify atomic access?

Implementation considerations

Dimension-dependent const-ness

Compilers always want to know when a value is constant. Array has the complication that the size is constant only when N>1. Our layout system cannot express this, so it needs to be hacked into the compiler and will continue to be.

Batched allocation of Array and MemoryBlock

This design requires allocating two objects (an Array and a MemoryBlock) for most arrays, giving it a significant disadvantage over the current implementation. We believe this can be fixed by adding a feature to the GC to allocate two objects at once: (1) Space big enough to hold both objects is allocated, (2) A header bit in the first object indicates that it is followed by another, (3) The compiler recognizes this pattern:

Array(length) = new(Core.MemoryBlock(length))

and replaces it with a double allocation call.

Element layout access

It is important (especially in the GC) to have fast access to the layout metadata for an array's element type. One idea is to set MemoryBlock{T}.layout === T.layout to avoid an extra indirection.

Allocation optimizations

We need to consider how this design will affect advanced array optimizations including SROA of arrays, allocation elision, and allocation hoisting. The two-layer object structure might make this more difficult.

@oscardssmith
Copy link

a size 0 MemorySegment won't be resizable.

@vtjnash
Copy link

vtjnash commented Jun 7, 2023

@JeffBezanson Is there a place in Memory{T} for keeping the isshared bit, which is a hint to the Array that it should do copy-on-write instead of using memmove when resizing?

@oscardssmith
Copy link

we can keep it in the foreign pointer, but I'm not sure we should.

@Tokazama
Copy link

Tokazama commented Jun 7, 2023

we can keep it in the foreign pointer, but I'm not sure we should.

Could that potentially mess with foreign pointers that use those bits?

@oscardssmith
Copy link

pointers have bits to steal because CPUs only have 48 bits of the 64 bit addresses wired up.

@Tokazama
Copy link

Tokazama commented Jun 7, 2023

But if they're foreign pointers how do we know thats bits are free?

Sorry maybe that's a dumb question. Maybe just ignore me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment