@nagisa
Last active January 23, 2016 09:41
  • Feature Name: flexible_layout
  • Start Date: 2016-01-17
  • RFC PR: (leave this empty)
  • Rust Issue: (leave this empty)

Summary

Introduce a #[layout] attribute which allows bit-precise control of field size and position in a structure.

Motivation

This is a non-invasive approach to solving two issues we have RFCs for. Namely:

  • union support for FFI interoperation;
  • bitfields for FFI interoperation.

Detailed design

A new attribute, #[layout], is implemented for fields in an arbitrary struct. The attribute allows for bit-precise positioning and sizing of the fields in the structure.

These 4 forms are the only valid #[layout] attributes (the exact behaviour of each is described at length later):

  • #[layout({usize}..)] – to change the field’s offset in bits within the structure. This, notably, would be the primary method of implementing C-like untagged unions under this RFC (see examples).
  • #[layout(..+{usize})] – to change the field’s size in bits. This, notably, would be the primary method of implementing C-like bitfields under this RFC.
  • #[layout({usize}..{usize})] and #[layout({usize}..+{usize})] – to change the range of bits into which the field is laid out. This combines the behaviour of the previous two attributes and allows the programmer to lay out fields in a bit-precise manner.

The presence of this attribute inside a struct makes accessing and inspecting fields in that struct unsafe. For the first iteration of the RFC, “access” means any of: taking a reference to, reading from, or writing to a field. Moreover, the compiler forbids taking references to fields whose starting bit is not equal to 0. This restriction may be loosened over time.

Calculation of field locations and structure size

The #[layout] attribute does not have any influence on the layout of fields declared earlier in the struct.

#[layout(..+x)] field: T, where x = ::mem::size_of::<T>() * BITS_PER_BYTE, is a no-op. The following two code snippets are, for all intents and purposes of struct layout, equivalent:

struct S {              // struct S {
    #[layout(..+32)]    //
    field1: u32,        //     field1: u32,
    #[layout(..+64)]    //
    field2: u64         //     field2: u64
}                       // }

This implies that the starting point of a “relative” layout declaration (one that specifies only a size) is determined by the compiler in the usual way. Fields in the struct are laid out starting at bit 0.

When x < ::mem::size_of::<T>() * BITS_PER_BYTE, it is guaranteed that accesses to the field will not expose more than x bits: when reading, the extra most significant bits of the destination are set to zero; when writing, the extra most significant bits of the operand are ignored. We’ll refer to this behaviour as truncation in the rest of the RFC.

When x > ::mem::size_of::<T>() * BITS_PER_BYTE, it is guaranteed that the field will take up at least the specified number of bits, but the extra bits won’t be used in any way for the purposes of the field itself. We’ll refer to this behaviour as extension in the rest of the RFC.
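The truncation rule can be illustrated with ordinary shifts and masks. The sketch below is not part of the proposal: `low_mask`, `read_field` and `write_field` are hypothetical helpers that emulate, on a plain `u64` backing word, what an access to a field narrowed to `width` bits might do under the rule above.

```rust
/// Mask with the low `width` bits set (hypothetical helper; `width` must be 1..=64).
fn low_mask(width: u32) -> u64 {
    assert!(width >= 1 && width <= 64);
    if width == 64 { u64::MAX } else { (1u64 << width) - 1 }
}

/// Read a `width`-bit field starting at bit `offset` of `storage`.
/// All bits above `width` in the result are zero.
fn read_field(storage: u64, offset: u32, width: u32) -> u64 {
    assert!(offset + width <= 64);
    (storage >> offset) & low_mask(width)
}

/// Write `value` into the `width`-bit field at `offset`.
/// Bits of `value` above `width` are ignored; neighbouring bits are preserved.
fn write_field(storage: u64, offset: u32, width: u32, value: u64) -> u64 {
    assert!(offset + width <= 64);
    let mask = low_mask(width) << offset;
    (storage & !mask) | ((value << offset) & mask)
}
```

For instance, writing `0xFF` into a 3-bit field keeps only the low three bits of the operand, and reading it back yields `0b111` with the upper bits cleared.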

#[layout(x..y)] lays out the field in the exact range of bits specified. If the original type does not fully fit inside the specified range, truncation behaviour is invoked, and the extension behaviour is invoked otherwise.

#[layout(x..+y)] is equivalent to #[layout(x..x+y)].

#[layout(x..)] is equivalent to #[layout(x..+y)], where y = ::mem::size_of::<T>() * BITS_PER_BYTE.
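The equivalences above reduce every form to an absolute bit range. As a rough sketch of that reduction (the `Layout` enum and `resolve` function are hypothetical names for illustration; the RFC does not define a compiler-internal representation):

```rust
use std::ops::Range;

/// The four attribute forms, modelled as data (hypothetical type).
enum Layout {
    From(u64),           // #[layout(x..)]
    Width(u64),          // #[layout(..+y)]
    Span(u64, u64),      // #[layout(x..y)]
    FromWidth(u64, u64), // #[layout(x..+y)]
}

/// Resolve a form to an absolute bit range.
/// `natural_start` is the bit at which the compiler would otherwise place
/// the field; `type_bits` stands for size_of::<T>() * BITS_PER_BYTE.
fn resolve(layout: &Layout, natural_start: u64, type_bits: u64) -> Range<u64> {
    match *layout {
        // x.. is x..+type_bits
        Layout::From(x) => x..x + type_bits,
        // ..+y keeps the usual starting point and changes only the size
        Layout::Width(y) => natural_start..natural_start + y,
        Layout::Span(x, y) => x..y,
        // x..+y is x..x+y
        Layout::FromWidth(x, y) => x..x + y,
    }
}
```

Comparing the width of the resolved range with `type_bits` then tells whether truncation or extension applies.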

When laying out the next field without a #[layout] attribute, the compiler finds the last “occupied” bit and lays out the new field according to the usual rules (i.e. using the usual alignment, sizing etc). Thus, the following assertion holds:

struct Example {
    #[layout(128..)]
    field1: u8,
    field2: u64
}
assert!(::mem::size_of::<Example>() >= (128 + 8 + 64) / 8);
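To see why the assertion is a lower bound rather than an equality, the placement rule can be sketched as follows. The helpers `align_up` and `place_next` are hypothetical, and the 64-bit alignment used for `u64` in the usage note is an assumption (it is 32 bits on some targets).

```rust
/// Round `bit` up to the next multiple of `align_bits` (must be non-zero).
fn align_up(bit: u64, align_bits: u64) -> u64 {
    (bit + align_bits - 1) / align_bits * align_bits
}

/// Place a field that has no #[layout] attribute after the last occupied bit,
/// respecting its usual alignment. Returns (start, end) in bits.
fn place_next(last_occupied_end: u64, size_bits: u64, align_bits: u64) -> (u64, u64) {
    let start = align_up(last_occupied_end, align_bits);
    (start, start + size_bits)
}
```

In `Example`, `field1` occupies bits 128..136. With a 64-bit alignment, `place_next(136, 64, 64)` puts `field2` at bits 192..256, so the struct would take 32 bytes, which satisfies the bound of (128 + 8 + 64) / 8 = 25 bytes.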

Along with #[repr(C)]

This RFC specifies only the behaviour of #[layout(0..)] and #[layout(..+{usize})] when used alongside #[repr(C)]. #[layout({usize}..)] for {usize} ≠ 0 and #[layout({usize}..{usize})] are left unspecified.

#[layout(..+{usize})] field: T is interpreted the same way as the bitfield declaration T field: {usize} is in C targeting the same platform.

When all the fields are #[layout(0..)], the structure behaves as if it were an untagged union (also known as a C-like union), honouring all of the ABI requirements (sizing, alignment, etc.) for C unions on the target machine.

Finally, a #[layout] attribute on a field will not, in any way, influence the layout of the fields that were declared before it (in terms of order in the source file).

Usecases

Untagged unions

The proposal allows implementing C-like untagged unions in a fairly painless manner:

#[repr(C)]              // // in C…
struct Union {          // union Union {
    #[layout(0..)]      //
    variant1: T,        //     T variant1;
    #[layout(0..)]      //
    variant2: TT,       //     TT variant2;
}                       // };
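Since #[layout] is only proposed here, the overlapping-at-offset-0 semantics cannot be run directly. As an approximation, the sketch below uses the `union` item available in current Rust (the direction taken by RFC #1444, listed under Alternatives) to show two fields sharing the same storage. `IntOrFloat` and `float_bits` are hypothetical names chosen for illustration.

```rust
/// Two views of the same four bytes, both starting at offset 0.
#[repr(C)]
union IntOrFloat {
    bits: u32,
    value: f32,
}

/// Reinterpret an f32 as its raw bits by writing one field and reading the other.
fn float_bits(x: f32) -> u32 {
    let u = IntOrFloat { value: x };
    // Reading a union field requires `unsafe`, much like the unsafe field
    // access this RFC prescribes for structs using #[layout].
    unsafe { u.bits }
}
```

The union is as large as its largest field, so `std::mem::size_of::<IntOrFloat>()` is 4 here.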

Bitfields

This proposal also allows implementing C-like bitfields in a fairly painless manner:

#[repr(C)]           // // in C…
struct FloatParts {  // struct floatparts {
    #[layout(..+1)]  //
    sign: bool,      //     uint8_t sign: 1;
    #[layout(..+8)]  //
    exponent: u8,    //     uint8_t exponent: 8;
    #[layout(..+23)] //
    mantissa: u32    //     uint32_t mantissa: 23;
}                    // };
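For comparison, this is roughly what has to be written by hand without the attribute. The sketch packs the three fields into one `u32`, allocating them from bit 0 upward in declaration order. That allocation order is an assumption made for illustration: in C it is decided by the platform ABI, and `PackedFloatParts` is a hypothetical name.

```rust
/// Hand-written stand-in for `FloatParts`: 1 + 8 + 23 bits in a single u32.
#[derive(Clone, Copy)]
struct PackedFloatParts(u32);

impl PackedFloatParts {
    fn new(sign: bool, exponent: u8, mantissa: u32) -> PackedFloatParts {
        // Mantissa bits above the low 23 are dropped (truncation).
        let mantissa = mantissa & 0x7F_FFFF;
        PackedFloatParts((sign as u32) | ((exponent as u32) << 1) | (mantissa << 9))
    }

    /// Bit 0.
    fn sign(self) -> bool {
        (self.0 & 1) != 0
    }

    /// Bits 1..9.
    fn exponent(self) -> u8 {
        ((self.0 >> 1) & 0xFF) as u8
    }

    /// Bits 9..32.
    fn mantissa(self) -> u32 {
        self.0 >> 9
    }
}
```

Each accessor repeats the offset and width of its field, which is the bookkeeping the #[layout(..+n)] form is meant to take over.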

Padding

struct Padded { // ::std::mem::size_of::<Padded>() == 8
    #[layout(..+64)]
    data: u8
}
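The same footprint can be approximated in today's Rust with an explicit filler field, which is a minimal sketch of what the extension behaviour is meant to replace. The `_reserved` field name is hypothetical.

```rust
/// Stand-in for `Padded`: one byte of data followed by seven unused bytes,
/// giving the 64-bit footprint requested by #[layout(..+64)].
#[repr(C)]
struct Padded {
    data: u8,
    _reserved: [u8; 7], // never read or written on behalf of `data`
}
```

With this definition `std::mem::size_of::<Padded>()` is 8, matching the comment in the snippet above.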

Drawbacks

...

Alternatives

RFCs #1449, #1444, #1450 (before redaction).

Unresolved questions

  • Behaviour of #[layout({usize}..)] for {usize} ≠ 0 and #[layout({usize}..{usize})] when used with #[repr(C)].
  • Bikeshedding.
@nikomatsakis

Some comments on the current version:

  • What does it mean to have both a #[repr] and #[layout] specified?
  • If you omit a #[layout], what does that mean?
  • I believe you wrote that including #[layout] implied that access to the structure was unsafe -- but it seems like a problem if access to bitfields is unsafe.
  • We probably want to think about how writing to bitfields relates to overflow checking.
  • We need to ensure that if we take the address of something, it is not only well aligned but also of adequate size.

(Sorry if I misunderstood some portion of what you wrote in the comments above.)
