@michelf
Last active February 6, 2018 14:58

Concurrent

Introduction

This proposal introduces the concept of concurrent functions to Swift. A concurrent function is guaranteed to be thread-safe by protecting access to shared mutable state.

Concurrent functions are not allowed to mutate the global state.

This document is a work in progress. There is no implementation at this time.

Motivation

Understanding code is how we avoid bugs. Code that is complex becomes even more complex when you consider that each function has an unlimited number of variables it can affect or depend on in the global scope. When code is complex, mistakes happen and bugs ensue, with varying consequences.

Thread safety with shared data is hard to prove and error-prone. Besides the unlimited supply of global variables each function could mistakenly touch, data passed in arguments to functions running in background queues often remains accessible to the queue that requested the function to be called in the first place, leaving open the possibility of a data race.

One way to limit data races without necessarily copying huge chunks of data when passing it to another queue is to use copy-on-write. Copy-on-write implements value semantics without the need to copy the data every time. A copy is created only when the data is mutated and only if there are other references to it (otherwise the copy is unnecessary). Standard library types like Array and String use copy-on-write under the hood. But copy-on-write is unintuitive to implement and unchecked by the compiler. An error in the implementation can cause races between threads and result in memory corruption.
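As an illustration of what hand-written copy-on-write looks like in current Swift, here is a minimal sketch. The names Storage, NumberList, and copyOnWriteDemo are made up for this example; only isKnownUniquelyReferenced comes from the standard library. Note that nothing forces the author to write the uniqueness check, which is the unchecked part described above.

```swift
// Reference-type storage, so several values can share one buffer.
final class Storage {
    var numbers: [Int]
    init(numbers: [Int]) { self.numbers = numbers }
}

// A value type with hand-written copy-on-write (illustrative names).
struct NumberList {
    private var storage: Storage

    init(_ numbers: [Int]) {
        storage = Storage(numbers: numbers)
    }

    var numbers: [Int] { return storage.numbers }

    mutating func append(_ value: Int) {
        // Copy only if another value still references the same storage.
        // Omitting this check compiles fine but breaks value semantics.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(numbers: storage.numbers)
        }
        storage.numbers.append(value)
    }
}

func copyOnWriteDemo() -> (original: [Int], copy: [Int]) {
    var a = NumberList([1, 2])
    var b = a      // shares storage with `a`; nothing is copied yet
    b.append(3)    // storage is shared, so `b` copies before writing
    a.append(4)    // `a` is unique again and mutates in place
    return (a.numbers, b.numbers)
}
```

Each value ends up with its own contents even though the assignment itself copied nothing.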

Whether the data is in a global variable or behind a class reference, sharing data is potentially risky. But even if you decide to deep copy the data structure before passing it to a function, the compiler can't check that every part was deeply copied either.

Proposed Solution

Concurrent functions guarantee provable thread safety for whatever data they deal with. Without using a blessed synchronization mechanism, they cannot affect the global state of the program or mutate shared data through class references. Concurrent functions must follow certain rules, all of them checked by the compiler.

A function, property, or subscript is made concurrent by prefixing it with the concurrent attribute:

struct S {
	concurrent func foo() {}
	concurrent var number: Int
	concurrent subscript(index: Int) -> Int { return 1 }
}

When a function has the concurrent attribute, the compiler enforces that the implementation of the function follows the rules below. As a shortcut, you can declare a type concurrent to make all of its members concurrent:

concurrent struct S {
	func foo() {}
	var number: Int
	subscript(index: Int) -> Int { return 1 }
}

Doing this will not prevent extensions from attaching extra non-concurrent methods.

Globals, locals, struct, and enum

(The rules for everything but class.)

The following rules make sure that concurrent functions cannot access global mutable state in the general case.

Functions & initializers

A concurrent function or initializer cannot access variables not passed as a parameter (note that self is an implicit parameter). Also, it can only call other functions, getters, setters for properties and subscripts, and initializers if they are themselves concurrent.

var globalVar = 1

concurrent func increment(_ param: Int) -> Int {
	var localVar = param
	localVar += 1 // this is fine, has no effect on the outside world
	localVar += globalVar // error: globalVar is not concurrent, cannot access from a concurrent function
	return localVar
}

A concurrent function or initializer can take inout arguments and it can throw. Those are just special ways of returning a value that do not affect the predictability of the output.

var globalVar = 1

concurrent func increment(_ param: inout Int) {
	param += 1 // allowed
}

increment(&globalVar) // the context here is non-concurrent, we are allowed to mutate globalVar

Default Arguments

A concurrent function can have default arguments like a regular function, but the expression for the default value must itself follow the rules of a concurrent function.

Properties & subscripts

The getter and setter for a concurrent property or subscript must be concurrent, following the rules for concurrent functions.

A concurrent stored property can be part of a struct, enum, local variable, or an immutable global variable (let). Its didSet and willSet blocks must be concurrent, if present, following the rules for concurrent functions. concurrent var is not allowed at global scope except for computed properties.

struct A {
	// a concurrent global constant
	static concurrent let defaultNumber = 8

	// a concurrent stored property
	concurrent var number: Int

	// a concurrent computed property
	concurrent var text: String {
		get {
			return String(number)
		}
		set {
			number = Int(newValue) ?? A.defaultNumber
		}
	}

	// didSet/willSet must be concurrent in a concurrent property
	concurrent var name: String {
		didSet {
			// error: NSApplication.shared is not concurrent, cannot access from concurrent function
			NSApplication.shared.mainWindow?.title = name
		}
	}
}

Class

Because classes are reference types and there can be multiple references pointing to the same object, special rules apply to them. Specifically, the criterion for concurrent is that writing to concurrent stored properties in a class requires the object to be uniquely referenced. This prevents mutating shared data and enables copy-on-write.

Copy on write

In a class, some functions can be cloning. Writing to a concurrent stored property inside a class is only possible from a cloning function. Property setters for concurrent properties are implicitly cloning. When a function is cloning, the self parameter is passed as inout, allowing the function to replace the object instance with another one when necessary.

This is similar to how mutating works for structs.
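For comparison, this is how mutating already behaves for structs in current Swift: self is effectively inout, so a method can replace the whole value. Counter and mutatingDemo are names invented for this sketch; a cloning method on a class would, under this proposal, get the same ability to replace self.

```swift
struct Counter {
    var count = 0

    // `mutating` passes `self` as inout, so the method may replace
    // the entire value rather than just one field.
    mutating func reset() {
        self = Counter()
    }

    mutating func increment() {
        count += 1
    }
}

func mutatingDemo() -> Int {
    var counter = Counter()
    counter.increment()
    counter.increment()
    counter.reset()      // replaces `self` with a fresh value
    counter.increment()
    return counter.count
}
```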

@objc is incompatible with cloning. An object derived from an Objective-C base class can have cloning members, but those members are only accessible from Swift since Objective-C does not support passing self as inout.

Functions

A concurrent function in a class follows the same rules as other concurrent functions.

A concurrent cloning function takes the class reference as inout. This will allow checking whether the reference is unique and replacing the object reference if needed. Calling a cloning function does not trigger this check until later when a stored property is written to, possibly from another function.

Properties & subscripts

A property or subscript can be concurrent if both its getter and setter follow the rules of concurrent functions.

A concurrent stored property has a concurrent getter and a concurrent cloning setter. Calling the setter of a concurrent stored property will automatically verify that the object is a known unique reference using isKnownUniquelyReferenced. If the reference is not unique and the object conforms to the CopyableObject protocol, a copy of the object is assigned to self before assigning the new value. If the reference is not unique and the object does not conform to CopyableObject, this is a fatal error. More on CopyableObject later.

The setter of a concurrent computed property or subscript in a class is assumed to be cloning (taking self as inout) unless explicitly marked as noncloning.

Deinitializers

Deinitializers can only be concurrent if the types of all the stored properties also have a concurrent deinitializer (or no deinitializer). Otherwise, deinitializing an object could release the last reference to other objects and end up triggering their non-concurrent deinitializers.

deinit is called whenever an object's reference count reaches zero. Even when not explicitly declared, deinitializers are automatically generated by the compiler for deinitializing the stored properties. An implicit deinitializer is concurrent when the class declaration has the concurrent attribute.

class NonconcurrentTestbed {
	// implicit deinit is not concurrent
	// this object cannot be passed to a concurrent function
}

concurrent class ConcurrentTestbed {
	// implicit deinit is deemed concurrent since the class is concurrent
}
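The cascade that motivates this rule can be observed in current Swift. In the sketch below (Log, Child, Parent, and deinitDemo are illustrative names, not part of the proposal), releasing the only reference to a Parent runs its deinit and then releases its stored properties, which in turn runs the deinit of Child.

```swift
// Collects deinit events so the order can be inspected afterwards.
final class Log {
    var entries: [String] = []
}

final class Child {
    let log: Log
    init(log: Log) { self.log = log }
    deinit { log.entries.append("Child") }
}

final class Parent {
    let log: Log
    let child: Child
    init(log: Log) {
        self.log = log
        self.child = Child(log: log)
    }
    // The body runs first; stored properties are released afterwards.
    deinit { log.entries.append("Parent") }
}

func deinitDemo() -> [String] {
    let log = Log()
    do {
        // The object is created and its only reference is dropped at once.
        _ = Parent(log: log)
    }
    return log.entries
}
```

If Child had a deinitializer that touched global state, destroying a Parent from a concurrent function would reach that state indirectly, which is what the rule above is meant to rule out.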

FIXME: Can we somehow prevent a non-concurrent deinit from being called from the wrong thread?

CopyableObject

Classes with concurrent cloning members may conform to the CopyableObject protocol. This protocol defines an initializer to be called when there is more than one reference to the object and we need a uniquely referenced copy to write to.

protocol CopyableObject: class {
	concurrent required init(copying object: Self)
}

A default implementation copying all the concurrent stored properties is synthesized by the compiler. It can't copy non-concurrent stored properties, however, because as a concurrent initializer it does not have access to them. In the presence of non-concurrent stored properties, no implementation is synthesized.

Note that there is generally no need for this to be a deep copy, as this is an implementation of copy-on-write.
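The behavior of a cloning setter combined with CopyableObject can be approximated by hand in current Swift. This is only a sketch under assumptions: the protocol is redeclared here without the concurrent attribute (which does not exist today), and Settings, setFontSize, and cloningSetterDemo are hypothetical names. The inout parameter plays the role of the inout self.

```swift
// Stand-in for the proposed protocol, minus the `concurrent` attribute.
protocol CopyableObject: AnyObject {
    init(copying object: Self)
}

final class Settings: CopyableObject {
    var fontSize: Int

    init(fontSize: Int) { self.fontSize = fontSize }

    // Shallow copy of the stored properties, as a synthesized default might do.
    init(copying object: Settings) { self.fontSize = object.fontSize }
}

// Roughly what a cloning setter would do; `reference` stands for inout `self`.
func setFontSize(_ newValue: Int, on reference: inout Settings) {
    if !isKnownUniquelyReferenced(&reference) {
        // Shared: swap the caller's reference for a private copy first.
        reference = Settings(copying: reference)
    }
    reference.fontSize = newValue
}

func cloningSetterDemo() -> (shared: Int, written: Int) {
    let original = Settings(fontSize: 12)
    var other = original          // two references to one object
    setFontSize(14, on: &other)   // not unique, so `other` now points to a copy
    return (original.fontSize, other.fontSize)
}
```

The object still visible through the first reference is left untouched, which is the property that makes sharing it with another thread safe.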

Note about the Ownership Manifesto

The Ownership Manifesto describes a Copyable protocol which gives an object the ability to copy its value from one variable to another. This is a different concept from the CopyableObject protocol described here, which is a means for the compiler to provide copy-on-write behavior automatically.

Overrides

Overriding a function or property cannot weaken the concurrent guarantee made in the base class, but it can strengthen it. For instance, you can override a non-concurrent function with a concurrent function. Note however that a concurrent function is not allowed to call its superclass's non-concurrent implementation.

Closures

A closure is implicitly concurrent if it only calls concurrent functions, getters, setters, or subscripts. You can make this explicit by prefixing the parameter list with concurrent:

let closure = { concurrent (a: Int) -> Int in
	return a + 1
}

Conditional concurrent and concurrent constraints

In some cases a function will be concurrent in itself, but will have to call other functions passed through parameters. The concurrent attribute of the function then depends on the concurrent attribute of the passed arguments. For instance, here is a function taking a closure provided by the caller:

concurrent? func callMeBack(callback: concurrent? () -> Void) {
	callback()
}

When the callback parameter is concurrent, the callMeBack function is concurrent too. The question mark after concurrent means that the function is conditionally concurrent depending on the argument passed to it. The syntax proposed here works similarly to throws and rethrows.
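For readers less familiar with the existing mechanism this is modeled on, here is how rethrows behaves in current Swift. The names ParseError, applyTwice, parsePositive, and rethrowsDemo are illustrative only. The function is throwing at a given call site only when the closure passed to it is throwing, just as a concurrent? function would be concurrent only when its argument is.

```swift
struct ParseError: Error {}

// `rethrows`: this function can throw only if `transform` throws.
func applyTwice(_ value: Int, _ transform: (Int) throws -> Int) rethrows -> Int {
    return try transform(try transform(value))
}

func parsePositive(_ value: Int) throws -> Int {
    guard value > 0 else { throw ParseError() }
    return value
}

func rethrowsDemo() -> (Int, Bool) {
    // Non-throwing closure: the call needs no `try`.
    let doubledTwice = applyTwice(3) { $0 * 2 }

    // Throwing argument: the same function now requires `try`.
    var didThrow = false
    do {
        _ = try applyTwice(-1, parsePositive)
    } catch {
        didThrow = true
    }
    return (doubledTwice, didThrow)
}
```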

If the function takes a generic type, it can constrain its concurrent attribute to match one or more members of that generic type. In the following example, whether altHashValue is a concurrent function depends on whether T.hashValue is concurrent:

concurrent? func altHashValue<T: Hashable>(_ a: T) -> Int where T.hashValue: concurrent? {
	return a.hashValue &+ 1
}

The where T.hashValue: concurrent? clause constrains the altHashValue function to be concurrent only if T.hashValue is a concurrent property. The compiler must check that the whole body of the function follows the rules for a concurrent function while assuming T.hashValue is concurrent. At the call site, the function is deemed concurrent only if T.hashValue is concurrent.

If we wanted altHashValue to always be concurrent instead of conditionally concurrent, the constraint can express that as well: remove the question mark from the concurrent attribute in front of altHashValue and from the constraint:

concurrent func altHashValue<T: Hashable>(_ a: T) -> Int where T.hashValue: concurrent {
	return a.hashValue &+ 1
}

Here the constraint tells us that T.hashValue must absolutely be concurrent for this function to be called.

Weak references

Weak references mutate automatically when the object at the other end vanishes. The mutation is operated in a thread-safe manner by the runtime. Stored properties containing weak object references are thus eligible to be concurrent.
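The automatic mutation described here is existing Swift behavior and can be seen in a few lines. Node, Observer, and weakDemo are names made up for this sketch.

```swift
final class Node {}

final class Observer {
    // The runtime sets this to nil when the Node is deallocated.
    weak var target: Node?
}

func weakDemo() -> (Bool, Bool) {
    let observer = Observer()
    var node: Node? = Node()
    observer.target = node

    // While `node` holds a strong reference, the weak reference sees the object.
    let sameObject = observer.target === node

    node = nil // the last strong reference is released here

    // Nobody assigned to `target`, yet it now reads as nil.
    let clearedAfterRelease = observer.target == nil
    return (sameObject, clearedAfterRelease)
}
```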

Unsafe concurrent

For cases where a function is known to be concurrent but the compiler can't prove that it is, the standard library provides a wrapper function for calling non-concurrent functions. This should be particularly useful when wrapping code from other languages:

let result = unsafeConcurrent { someExternalFunctionInC() }

Unsafe concurrent works with conditional concurrent too:

concurrent? func callMeBack(callback: concurrent? @convention(c) () -> Void) {
	unsafeConcurrent {
		someExternalFunctionInC(callback)
	}
}

Thread-safety limitations

Concurrent functions are automatically thread-safe since they do not have access to any shared mutable state. It is safe to pass an object to a concurrent function running in another thread while continuing to use it in the current thread. This is because only the concurrent (non-cloning) members of the object are accessible to the concurrent function. The current thread is free to mutate the non-concurrent portions of the object.

Encapsulation Patterns

Some uses of shared mutable state are safe but can't be verified by the compiler. unsafeConcurrent can be used to deal with that, but it's far better if you can encapsulate usages of unsafeConcurrent in types and functions that are themselves always safe to use. Here are some examples.

Dispatch to background task, get back result

class DispatchQueue {
	/// DispatchQueue.async requires the task to be concurrent for
	/// thread safety.
	concurrent func async(_ task: concurrent () -> ())
}

/// Dispatch a block to a concurrent queue, then pass the result
/// back to a completion block in the main thread. This function must
/// be called from the main thread only.
func dispatchFromMainThread<R>(queue: DispatchQueue, task: concurrent () -> R, completion: (R) -> ()) {
	precondition(Thread.current == .main)
	queue.async {
		// this closure is implicitly concurrent, as required by `async`
		let result = task()
		DispatchQueue.main.async {
			// we can safely call `completion` here because we know
			// we're in the same thread as before
			unsafeConcurrent {
				completion(result)
			}
		}
	}
}

label.text = "awaiting result..."
dispatchFromMainThread(queue: backgroundQueue, task: { // backgroundQueue: some DispatchQueue assumed to exist
	// this closure is concurrent, can't touch non-concurrent stuff
	// because we're in a background thread
	return "hello world"
}, completion: { result in
	// this closure is not concurrent, can deal with objects in
	// main thread
	label.text = result
})

TODO: add more examples

Library evolution

Changing a non-concurrent function to a concurrent one in a future version of a library is allowed. A concurrent function that becomes non-concurrent is a breaking change however, so this should be avoided.

There is an exception however for open members in classes: changing them from non-concurrent to concurrent is not allowed as this would break overrides defined in other modules.

ABI considerations

To be determined.

Migration Strategy

At the syntactic level, the word concurrent becomes reserved in certain contexts. Surrounding the keyword in backticks will make it possible to keep the name as an identifier. Backticks are not needed to refer to a member after a dot.
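Current Swift already applies the same two rules to its existing keywords, so the migration would presumably look like the sketch below, which uses default and class as stand-ins since concurrent is not a keyword today. Mode, describe, and backtickDemo are illustrative names.

```swift
enum Mode {
    // Backticks let a reserved word serve as an identifier in a declaration.
    case `default`
    case custom
}

func describe(_ mode: Mode) -> String {
    switch mode {
    // After a dot the word is unambiguous, so no backticks are required.
    case .default: return "default"
    case .custom: return "custom"
    }
}

func backtickDemo() -> String {
    // A local named after a keyword needs backticks at every use.
    let `class` = Mode.default
    return describe(`class`)
}
```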

While this feature is semantically purely additive, library authors can expect significant pressure to make APIs concurrent everywhere possible so that their user base can make use of concurrent themselves. This might be a significant burden, especially for libraries with downstream dependencies not yet annotated for concurrent or libraries with code in other languages.

Given that the compiler is able to check if concurrent is valid for every function, we should offer a tool capable of proposing adding concurrent to declarations whenever they are eligible, in other words when all their dependencies are concurrent. A flag causing the compiler to emit those suggestions would work perfectly. Every time a library dependency is updated for concurrent, the tool could be re-run to see if more things can become concurrent.

Because concurrent is a commitment when it comes to public APIs, library authors should review more carefully the annotations suggested by the tool for anything that is public or open.

APIs imported from other languages can't be automatically checked for concurrent eligibility by the compiler. They would need to be annotated using a mechanism similar to nullability. The user on the Swift side can use unsafeConcurrent to wrap any external call if they are confident the function respects concurrent semantics.

Future Directions

Concurrent actors

The Task-based concurrency manifesto discusses the concept of actors. Actors attempt to isolate some code to be run independently on a queue. Isolation is limited however, since actors operate in the same memory space as the rest of the program and have access to everything.

We could improve this by having a concurrent actor, an actor where every member is concurrent. Such an actor would benefit from compiler checked data isolation. We could also make all actors automatically concurrent.

Alternatives considered

nonconcurrent for open members

As a special case, open functions and properties in classes cannot become concurrent in a future version of a library; this would break compatibility with existing overrides outside of the module. We could make it a tristate where the default allows maximum compatibility for future versions of the library:

  1. open: calling is not concurrent, but overrides must be concurrent (maximum future compatibility, less usefulness)
  2. open concurrent: calling is concurrent, overrides must be concurrent
  3. open nonconcurrent: calling is not concurrent, overrides may or may not be concurrent

That would however be a major source-breaking change for libraries, as anything currently open would have to be relabeled open nonconcurrent for the current overrides to continue to work. Automatic migration is a possibility. Another would be to keep the current meaning of open and use another word for the maximum compatibility option:

  1. open strict: calling is not concurrent, but overrides must be concurrent (maximum future compatibility, less usefulness)
  2. open concurrent: calling is concurrent, overrides must be concurrent
  3. open: calling is not concurrent, overrides may or may not be concurrent

The utility of having strict when it is not the default is up for debate however.

Implicit concurrent

Concurrent could be inferred within the module for members that do not have an explicit concurrent attribute set using the following rules:

  1. Any let variable is implicitly concurrent.
  2. A function is implicitly concurrent if it does not call any non-concurrent function.
  3. A computed property is implicitly concurrent if its implementation does not call any non-concurrent function.
  4. A stored property is implicitly concurrent if it is immutable (let) or is a mutable (var) member of a struct, enum, or a local variable.
  5. Specifically not included in (4) are global mutable stored properties and mutable stored properties in class instances.

Implicit concurrent is not visible to other modules. Public functions must be explicitly marked concurrent as a commitment that the implementation will stay concurrent in future revisions of the module.

Implicit concurrent has some drawbacks. While it might improve progressive disclosure by allowing functions to be used where they need to be concurrent without explicit annotations, it also means that a small change in the implementation of a function can make it non-concurrent and cause a cascade of functions depending on the first one to become non-concurrent, causing hard-to-decipher errors far from where the change took place.

This proposal therefore applies implicit concurrent only to closures. The same rules as above could be used by a migration tool to annotate concurrent functions.

Deinitializer alternatives

Another possibility is to disallow classes with deinitializers from being used inside of concurrent functions. This would end up limiting us to final classes however, since you can't prove that all derived classes of a base class will have no deinitializer.
