@Fruneau
Created January 31, 2017 08:14
Clang Importer `_Ref` qualifier proposal

Introduction

Directly importing C APIs is a core feature of the Swift compiler. In that process, C pointers are systematically imported as Unsafe*Pointer Swift values. However, in C we distinguish between pointers that reference a single object and those that point to an array of objects. In the case of a single object of type T, the Swift compiler should be able to import the parameter T * as an inout T, and T const * as T. Since the compiler cannot make that distinction by itself, we propose adding a C pointer qualifier for that purpose.

Motivation

Consider the following C API:

typedef struct sb_t {
    char * _Nonnull data;
    int len;
    int size;
} sb_t;

/** Append the string \p str to \p sb. */
void sb_adds(sb_t * _Nonnull sb, const char * _Nonnull str);

/** Append the content of \p other to \p sb. */
void sb_addsb(sb_t * _Nonnull sb, const sb_t * _Nonnull other);

/** Returns the amount of available memory of \p sb. */
int sb_avail(const sb_t * _Nonnull sb);

This is imported in Swift as follows:

struct sb_t {
    var data: UnsafeMutablePointer<Int8>
    var len: Int32
    var size: Int32
}

func sb_adds(_ sb: UnsafeMutablePointer<sb_t>, _ str: UnsafePointer<Int8>)
func sb_addsb(_ sb: UnsafeMutablePointer<sb_t>, _ other: UnsafePointer<sb_t>)
func sb_avail(_ sb: UnsafePointer<sb_t>) -> Int32

sb_adds() takes two pointers: the first is supposed to point to a single object named sb, which is mutated in order to append the content of str, which points to a C string. So we have two kinds of pointers: the first points to a single object, the second to a buffer. Yet both are represented using Unsafe*Pointer. Swift cannot tell these two kinds of pointers apart, since the C language provides no way to express the difference.
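
To make the two roles concrete, here is a minimal sketch of how sb_adds() might be implemented. The body is an assumption made for illustration (the proposal only gives the prototype), and the Clang-specific _Nonnull qualifiers are left out so that the sketch compiles as portable C. The sb parameter is only ever dereferenced as one object, while str is walked as a NUL-terminated buffer of unknown length:

```c
#include <stdlib.h>
#include <string.h>

typedef struct sb_t {
    char *data;
    int len;
    int size;
} sb_t;

/* Hypothetical implementation, for illustration only. */
void sb_adds(sb_t *sb, const char *str)
{
    /* `str` is a buffer: strlen() reads an unknown number of chars. */
    int extra = (int)strlen(str);

    /* `sb` is a single object: only *sb is accessed, never sb[1]. */
    if (sb->len + extra + 1 > sb->size) {
        int new_size = 2 * (sb->len + extra + 1);
        char *p = realloc(sb->data, (size_t)new_size);

        if (p == NULL) {
            abort();
        }
        sb->data = p;
        sb->size = new_size;
    }

    /* Copy the string together with its terminating NUL. */
    memcpy(sb->data + sb->len, str, (size_t)extra + 1);
    sb->len += extra;
}
```

Nothing in the two parameter types tells a reader, or the Swift importer, that these access patterns differ.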

sb_addsb() takes two objects of type sb_t. The first is mutated by the function, which appends to it the content of the second one, which is const. The constness is properly reflected in Swift. However, using the imported API in Swift can be surprising, since Swift requires an inout argument in order to build an Unsafe*Pointer object:

var sb = sb_t(...)
let sb2 = sb_t(...)
sb_addsb(&sb, &sb2) // error: cannot pass immutable value as inout argument: 'sb2' is a 'let' constant
sb_addsb(&sb, sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'

var sb3 = sb_t(...)
sb_addsb(&sb, &sb3) // works
sb_avail(&sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'

However, Swift also provides the swift_name() attribute, which allows remapping a C function to a Swift method, including mapping one of the parameters to self:

__attribute__((swift_name("sb_t.add(self:string:)")))
void sb_adds(sb_t * _Nonnull sb, const char * _Nonnull str);
__attribute__((swift_name("sb_t.add(self:other:)")))
void sb_addsb(sb_t * _Nonnull sb, const sb_t * _Nonnull other);
__attribute__((swift_name("sb_t.avail(self:)")))
int sb_avail(const sb_t * _Nonnull sb);

struct sb_t {
    var data: UnsafeMutablePointer<Int8>
    var len: Int32
    var size: Int32

    mutating func add(string: UnsafePointer<Int8>)
    mutating func add(other: UnsafePointer<sb_t>)
    func avail() -> Int32
}

With that attribute used, there is no need to convert the parameter mapped to self: to an Unsafe*Pointer. As a consequence, we have an improved API:

sb2.avail() // This time it works!

But we also have some inconsistent behavior since only self: is affected by this:

sb.add(other: &sb2)  // error: cannot pass immutable value as inout argument: 'sb2' is a 'let' constant
sb.add(other: sb2) // cannot convert value of type 'sb_t' to expected argument type 'UnsafePointer<sb_t>!'

What we observe here is that mapping an argument to self: is enough for the compiler to be able to change its semantics. As soon as it knows the pointer is actually the pointer to a single object, it can deal with it without exposing it as an Unsafe*Pointer, making the API safer and less surprising.

Proposed solution

A new qualifier could be added to inform the compiler that a pointer points to a single object. The Swift compiler could then use that new piece of information to generate APIs that use the object type directly instead of the pointer type. We propose the introduction of a new qualifier named _Ref, semantically similar to a C++ reference. That is:

  • _Ref is applied with the same grammar as the _Nonnull/_Nullable family of qualifiers
  • A pointer tagged _Ref cannot be used to access more than the single pointed object.
  • A pointer tagged _Ref is non-owning
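
As a rough illustration of that contract, the sketch below uses a placeholder macro, SB_REF, that expands to nothing. The macro, sb_avail()'s body, and the sb_reset() helper are assumptions introduced here so the code compiles with today's compilers; none of them is part of the proposal. Both functions touch only the single object behind the qualified pointer and do not retain it:

```c
#include <stddef.h>

/* Hypothetical stand-in for the proposed _Ref qualifier. */
#define SB_REF

typedef struct sb_t {
    char *data;
    int len;
    int size;
} sb_t;

/* Conforming: reads only the single object behind `sb`. */
int sb_avail(const sb_t * SB_REF sb)
{
    return sb->size - sb->len;
}

/* Conforming: mutates only *sb and keeps no copy of the pointer. */
void sb_reset(sb_t * SB_REF sb)
{
    sb->len = 0;
    if (sb->data != NULL && sb->size > 0) {
        sb->data[0] = '\0';
    }
}

/*
 * These statements would break the contract inside such a function:
 *
 *   sb[1].len = 0;      accesses an object past the one pointed to
 *   saved_sb = sb;      stores the pointer beyond the call (owning use)
 */
```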

Parameters qualified with _Ref would then be imported in Swift as follows:

  • T * _Ref _Nonnull is imported as inout T
  • T * _Ref _Nullable is imported as inout T?
  • T const * _Ref _Nonnull is imported as T
  • T const * _Ref _Nullable is imported as T?

Example

In the context of the provided example from the motivation section:

typedef struct sb_t {
    char * _Nonnull data;
    int len;
    int size;
} sb_t;

/** Append the string \p str to \p sb. */
void sb_adds(sb_t * _Ref _Nonnull sb, const char * _Nonnull str);

/** Append the content of \p other to \p sb. */
void sb_addsb(sb_t * _Ref _Nonnull sb, const sb_t * _Ref _Nonnull other);

/** Returns the amount of available memory of \p sb. */
int sb_avail(const sb_t * _Ref _Nonnull sb);

would be imported as follows:

struct sb_t {
    var data: UnsafeMutablePointer<Int8>
    var len: Int32
    var size: Int32
}

func sb_adds(_ sb: inout sb_t, _ str: UnsafePointer<Int8>)
func sb_addsb(_ sb: inout sb_t, _ other: sb_t)
func sb_avail(_ sb: sb_t) -> Int32

Impact on existing code

This proposal has no impact on existing code, since it proposes additive changes only. However, opting in to the _Ref qualifier on APIs already exposed in Swift will change the imported declarations:

  • For const pointers, the change is always source-incompatible
  • For non-const pointers, the change will be source-compatible everywhere we use the &object syntax to pass the argument from a plain object, but will break sources that passed an Unsafe*Pointer as argument.

Alternatives considered

We considered using two families of qualifiers instead of _Ref:

  • one family to specify the kind of pointer: single object or array
  • one family to declare the ownership

This approach has the clear advantage of being more flexible; however, we found it to be less expressive. Since C APIs should already use nullability qualifiers on every pointer, requiring two additional qualifiers on each of them would be painful and would hurt the readability of the C APIs.

_Ref, on the other hand, is short and leverages a concept developers already know, but it is also more specific to a particular use case.

Discussion

  • Safety: won't this make developers think they are calling safe APIs from Swift while the API is actually unsafe?

There is certainly a risk that a C API makes improper use of _Ref (in particular, that it breaks the non-owning part of the contract). However, this kind of safety issue is already present when using the swift_name() attribute to map one of a function's pointer parameters to self:, or when using the nullability qualifiers.

  • What about pointers stored in structures? or pointers returned by functions?

As a qualifier, _Ref could also be used on pointers that are not arguments of a function:

typedef struct {
    sb_t * _Ref obj;
} sb_ptr_t;

sb_t * _Ref sb_get_singleton(void);

Swift, however, cannot import those as sb_t but will still be forced to use Unsafe*Pointer<sb_t> since sb_t is a structure and as such is not stored by reference.

We could also imagine a standard Reference<T> type that would wrap a pointer to a T (and could expose the API of T on it).

  • What about function pointers that take a _Ref object?

When an API takes a function pointer whose type includes a _Ref qualified parameter, the qualifier applies:

void take_cb(int (*a)(sb_t const * _Ref _Nonnull sb, sb_t * _Ref _Nonnull other));

func cb(sb: sb_t, other: inout sb_t) -> Int32 {
    ...
}

take_cb(cb)

Swift guarantees we cannot break the non-owning contract and that we respect the constness of the parameter. This is safer than using the Unsafe*Pointer-based alternative.

  • Other use cases than Swift's?

The _Ref qualifier could be used by static analysis to check that functions don't access memory they shouldn't: as long as some code manipulates memory through a _Ref-qualified pointer, it shouldn't access any address below that pointer, or at or above that pointer plus the stride of the type (an exception remains for types ending with a zero-length array).
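
The rule an analyzer would enforce can be sketched as a simple range check. The ref_access_ok() helper below is hypothetical and only illustrates the condition described above; it uses sizeof as the stride and ignores the zero-length array exception:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when the access [addr, addr + len) stays inside the
 * single object [ref, ref + stride) that a _Ref pointer designates.
 */
bool ref_access_ok(const void *ref, size_t stride,
                   const void *addr, size_t len)
{
    uintptr_t base = (uintptr_t)ref;
    uintptr_t a = (uintptr_t)addr;
    size_t off;

    if (a < base) {
        return false; /* below the referenced object */
    }
    off = (size_t)(a - base);

    /* The access must start and end within the object. */
    return off <= stride && len <= stride - off;
}
```

For a pointer to the first int of a two-element array, an access to that first element passes the check, while an access to the second element is rejected even though it is valid memory.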

  • What about pointers to arrays of objects?

This is another topic. We could imagine a _Array qualifier that could take an optional length.

/* The number of elements is statically known or passed as argument. */
int main(int argc, char ** _Array(argc) argv);

/* The number of elements is unknown. */
int puts(const char * _Array str);