Skip to content

Instantly share code, notes, and snippets.

@andymatuschak
Last active May 1, 2024 12:32
Show Gist options
  • Save andymatuschak/d5f0a8730ad601bcccae97e8398e25b2 to your computer and use it in GitHub Desktop.
Save andymatuschak/d5f0a8730ad601bcccae97e8398e25b2 to your computer and use it in GitHub Desktop.
A composable pattern for pure state machines with effects (draft v3)

A composable pattern for pure state machines with effects

State machines are everywhere in interactive systems, but they're rarely defined clearly and explicitly. Given some big blob of code including implicit state machines, which transitions are possible and under what conditions? What effects take place on what transitions?

There are existing design patterns for state machines, but all the patterns I've seen complect side effects with the structure of the state machine itself. Instances of these patterns are difficult to test without mocking, and they end up with more dependencies. Worse, the classic patterns compose poorly: hierarchical state machines are typically not straightforward extensions. The functional programming world has solutions, but they don't transpose neatly enough to be broadly usable in mainstream languages.

Here I present a composable pattern for pure state machiness with effects, meant to be idiomatic in a typical imperative context. The pure-impure separation is inspired by functional core, imperative shell; the modeling of novelty and effect by Elm; the hierarchical composition by Harel statecharts.

The solution is implemented in Swift because it's a typical imperative context, and because Swift's types help illustrate the structure I'm suggesting.

The basic types

The key idea is that states, transitions, and specifications of effect are modeled by a pure value type; a separate dumb object actuates those effects (as in Elm; or think of the traditional Command pattern).

protocol StateType {
	/// Events are effectful inputs from the outside world which the state reacts to, described by some
	/// data type. For instance: a button being clicked, or some network data arriving.
	associatedtype InputEvent

	/// Commands are effectful outputs which the state desires to have performed on the outside world.
	/// For instance: showing an alert, transitioning to some different UI, etc.
	associatedtype OutputCommand

	/// In response to an event, a state may transition to some new value, and it may emit a command.
	mutating func handleEvent(event: InputEvent) -> OutputCommand

	// If you're not familiar with Swift, the mutation semantics here may seem like a very big red
	// flag, destroying the purity of this type. In fact, because states have *value semantics*, 
	// mutation becomes mere syntax sugar. From a semantic perspective, any call to this method 
	// creates a new instance of StateType; no code other than the caller has visibility to the
	// change; the normal perils of mutability do not apply.
	//
	// If this is confusing, keep in mind that we could equivalently define this as a function
	// which returns both a new state value and an optional OutputCommand (it just creates some
	// line noise later):
	//   func handleEvent(event: InputEvent) -> (Self, OutputCommand)

	/// State machines must specify an initial value.
	static var initialState: Self { get }

	// Traditional models often allow states to specific commands to be performed on entry or
	// exit. We could add that, or not.
}

A basic state machine with effects

Here's an example based on A Pattern Language of Statecharts, figure 3:

Drawing of a simple turnstyle state machine, implemented below

/// First up, the StateType. Note that this type is isolated, pure, trivial to fully test with zero mocks/stubs, etc.
enum TurnstyleState: StateType {
	case Locked(credit: Int)
	case Unlocked
	indirect case Broken(oldState: TurnstyleState)

	enum Event {
		case InsertCoin(value: Int)
		case AdmitPerson
		case MachineDidFail
		case MachineRepairDidComplete
	}

	enum Command {
		case SoundAlarm
		case CloseDoors
		case OpenDoors
	}

	static let initialState = TurnstyleState.Locked(credit: 0)
	private static let farePrice = 50

	// Try to picture the boxes-and-arrows diagram of this state machine using this
	// method. Hopefully the structure makes this translation mostly straightforward.
	mutating func handleEvent(event: Event) -> Command? {
		switch (self, event) {
		case (.Locked(let credit), .InsertCoin(let value)):
			let newCredit = credit + value
			if newCredit >= TurnstyleState.farePrice {
				self = .Unlocked
				return .OpenDoors
			} else {
				self = .Locked(credit: newCredit)
			}
		case (.Locked, .AdmitPerson):
			return .SoundAlarm
		case (.Locked, .MachineDidFail):
			self = .Broken(oldState: self)

		case (.Unlocked, .AdmitPerson):
			self = .Locked(credit: 0)
			return .CloseDoors
		case (.Unlocked, .MachineDidFail):
			self = .Broken(oldState: self)

		case (.Broken, .MachineRepairDidComplete):
			self = .Locked(credit: 0)

		default: break
			// Or should we throw?
			// In a more ideal world, it should be impossible to write code which reaches this line (I discuss
			// how this might be achieved at the very end).
		}

		return nil
	}
}

/// Now, an imperative shell that hides the enums and delegates to actuators.
/// Note that it has no domain knowledge: it just connects object interfaces.
class TurnstyleController {
	private var currentState = TurnstyleState.initialState
	private let doorHardwareController: DoorHardwareController
	private let speakerController: SpeakerController

	init(doorHardwareController: DoorHardwareController, speakerController: SpeakerController) {
		self.doorHardwareController = doorHardwareController
		self.speakerController = speakerController

		self.doorHardwareController.opticalPersonPassingSensorSignal
			.takeUntilObjectDeallocates(self)
			.subscribeNext { [unowned self] in self.handleEvent(.AdmitPerson) }
	}

	func customerDidInsertCoin(value: Int) {
		self.handleEvent(.InsertCoin(value: value))
	}

	func mechanicDidCompleteRepair() {
		self.handleEvent(.MachineRepairDidComplete)
	}

	private func handleEvent(event: TurnstyleState.Event) {
		switch currentState.handleEvent(event) {
		case .SoundAlarm?:
			self.speakerController.soundTheAlarm()
		case .CloseDoors?:
			self.doorHardwareController.sendControlSignalToCloseDoors()
		case .OpenDoors?:
			self.doorHardwareController.sendControlSignalToOpenDoors()
		case nil:
			break
		}
	}
}

protocol DoorHardwareController {
	func sendControlSignalToOpenDoors()
	func sendControlSignalToCloseDoors()

	/// A signal which fires an event whenever the physical door hardware detects a person passing through.
	var opticalPersonPassingSensorSignal: Signal<(), NoError> { get }
	
	// If the idea of a Signal is not familiar, don't worry about it: it's not terribly
	// important here. This could be a callback or a delegate method instead.
}

protocol SpeakerController {
	func soundTheAlarm()
}

Hierarchical state machines

Now, if you have a lot of states, you might want to start grouping them into "child" state machines, especially if there's a significant shared semantic. Here I rephrase the example state machine hierarchically: an inner state machine and an outer one, following figure 6 from "A Pattern Language of Statecharts."

Drawing of the turnstyle state machine, depicted hierarchically

Critically, no other types must change: the code above can use HierarchicalTurnstyleState as a drop-in replacement. StateTypes are closed under hierarchical composition. This is a natural consequence of the data-oriented design, which takes advantage of the simpler fact that enums are closed over carrying instances of other enums as an associated value, and that functions are closed over calling other functions. Nothing more complicated is required.

enum HierarchicalTurnstyleState: StateType {
	case Functioning(FunctioningTurnstyleState)
	case Broken(oldState: FunctioningTurnstyleState)

	enum Event {
		case InsertCoin(value: Int)
		case AdmitPerson
		case MachineDidFail
		case MachineRepairDidComplete
	}

	typealias Command = FunctioningTurnstyleState.Command // Of course, this could be some more complicated composition.

	static let initialState = HierarchicalTurnstyleState.Functioning(FunctioningTurnstyleState.initialState)

	mutating func handleEvent(event: Event) -> Command? {
		switch (self, event) {
		case (.Functioning(let state), .MachineDidFail):
			self = .Broken(oldState: state)
		case (.Functioning(let state), _) where FunctioningTurnstyleState.Event(event) != nil:
			var newState = state
			let command = newState.handleEvent(FunctioningTurnstyleState.Event(event)!)
			self = .Functioning(newState)
			return command

		case (.Broken(let oldState), .MachineRepairDidComplete):
			self = .Functioning(oldState)

		default:
			break // or maybe throw, etc
		}

		return nil
	}
}

// Note that this could be private, or not. It's just as much of a StateType as HierarchicalTurnstyleState.
enum FunctioningTurnstyleState: StateType {
	case Locked(credit: Int)
	case Unlocked

	enum Event {
		case InsertCoin(value: Int)
		case AdmitPerson
	}

	enum Command {
		case SoundAlarm
		case CloseDoors
		case OpenDoors
	}

	static let initialState = FunctioningTurnstyleState.Locked(credit: 0)
	private static let farePrice = 50

	mutating func handleEvent(event: Event) -> Command? {
		// Now somewhat simpler because the turnstyle can't be broken.
		switch (self, event) {
		case (.Locked(let credit), .InsertCoin(let value)):
			let newCredit = credit + value
			if newCredit >= TurnstyleState.farePrice {
				self = .Unlocked
				return .OpenDoors
			} else {
				self = .Locked(credit: newCredit)
			}
		case (.Locked, .AdmitPerson):
			return .SoundAlarm

		case (.Unlocked, .AdmitPerson):
			self = .Locked(credit: 0)
			return .CloseDoors

		default:
			break // or maybe throw, etc
		}

		return nil
	}
}

// Conversion between the layers. You could approach this a variety of ways. 
extension FunctioningTurnstyleState.Event {
	init?(_ event: HierarchicalTurnstyleState.Event) {
		switch event {
		case .InsertCoin(let value):
			self = .InsertCoin(value: value)
		case .AdmitPerson:
			self = .AdmitPerson
		case .MachineDidFail, .MachineRepairDidComplete:
			return nil
		}
	}
}

Orthogonal state machines

The hierarchical state machines introduced above were always alternations: the parent machine is always in a state of one of its child machines. But instead the composition might be a Cartesian product, where the parent runs the child machines simultaneously, and its effective state is a tuple of the child states.

The relationship between this kind of composition (orthogonal) and the earlier kind of composition (alternative) is the same as the relationship between product types (structs, tuples) and sum types (enums), so we see that reflected in the declaration below.

For instance, a keyboard might have two different state machines running simultaneously: one handles the toggling behavior of the main keypad's caps lock key and its impact on keypad behavior; the other handles the same thing for the numpad and num lock.

Drawing of a keyboard's state machine, separating caps lock and num lock functionality

struct KeyboardState: StateType {
	var mainKeypadState: MainKeypadState
	var numericKeypadState: NumericKeypadState

	enum HardwareKeyEvent {
		// A quick and dirty definition.
		case Alpha(Character)
		case Number(Int)
		case NumLock
		case CapsLock
	}

	static let initialState = KeyboardState(mainKeypadState: MainKeypadState.initialState, numericKeypadState: NumericKeypadState.initialState)

	mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
		if let mainKeypadEvent = MainKeypadState.HardwareKeyEvent(event) {
			return mainKeypadState.handleEvent(mainKeypadEvent)
		} else if let numericKeypadEvent = NumericKeypadState.HardwareKeyEvent(event) {
			return numericKeypadState.handleEvent(numericKeypadEvent)
		} else {
			return nil
		}
	}
}

enum MainKeypadState: StateType {
	case CapsLockOff
	case CapsLockOn

	enum HardwareKeyEvent {
		case Alpha(Character)
		case CapsLock
	}

	static let initialState = MainKeypadState.CapsLockOff

	mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
		switch (self, event) {
		case (.CapsLockOff, .CapsLock):
			self = .CapsLockOn
		case (.CapsLockOff, .Alpha(let c)):
			return .AlphaNumeric(c)

		case (.CapsLockOn, .CapsLock):
			self = .CapsLockOff
		case (.CapsLockOn, .Alpha(let c)):
			return .AlphaNumeric(c.uppercaseCharacter)
		}

		return nil
	}
}

enum NumericKeypadState: StateType {
	case NumLockOff
	case NumLockOn

	enum HardwareKeyEvent {
		case Number(Int)
		case NumLock
	}

	static let initialState = NumericKeypadState.NumLockOn

	mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
		switch (self, event) {
		case (.NumLockOff, .NumLock):
			self = .NumLockOn
		case (.NumLockOff, .Number(let n)):
			if let arrowDirection = ArrowDirection(numericKeypadInput: n) {
				return .Arrow(arrowDirection)
			}

		case (.NumLockOn, .NumLock):
			self = .NumLockOff
		case (.NumLockOn, .Number(let n)):
			return .AlphaNumeric("\(n)".characters.first!)
		}

		return nil
	}
}

enum KeyboardOutputCommand {
	case AlphaNumeric(Character)
	case Arrow(ArrowDirection)
}

enum ArrowDirection {
	case Up, Down, Left, Right
}

extension ArrowDirection {
	// Mapping the numpad keys to arrow keys.
	init?(numericKeypadInput: Int) {
		switch numericKeypadInput {
		case 2: self = .Up
		case 4: self = .Left
		case 6: self = .Right
		case 8: self = .Down
		default: return nil
		}
	}

}

// We can various other constructions which might make this kind of glue lighter.
extension MainKeypadState.HardwareKeyEvent {
	init?(_ hardwareKeyEvent: KeyboardState.HardwareKeyEvent) {
		switch hardwareKeyEvent {
		case .Alpha(let c): self = .Alpha(c)
		case .CapsLock: self = .CapsLock
		case .Number, .NumLock: return nil
		}
	}
}

extension NumericKeypadState.HardwareKeyEvent {
	init?(_ hardwareKeyEvent: KeyboardState.HardwareKeyEvent) {
		switch hardwareKeyEvent {
		case .Alpha, .CapsLock: return nil
		case .Number(let n): self = .Number(n)
		case .NumLock: self = .NumLock
		}
	}
}

extension Character {
	var uppercaseCharacter: Character {
		return Character(String(self).uppercaseString)
	}
}

Or more generally (but it's ugly and non-idiomatic):

struct OrthogonalState<StateA: StateType, StateB: StateType>: StateType {
	var stateA: StateA
	var stateB: StateB

	static var initialState: OrthogonalState<StateA, StateB> {
		return OrthogonalState(stateA: StateA.initialState, stateB: StateB.initialState)
	}

	mutating func handleEvent(event: OrthogonalStateEvent<StateA, StateB>) -> OrthogonalStateCommand<StateA, StateB>? {
		// In some orthogonal state machines, certain events will want to be dispatched to both children. We'd need a somewhat more complex structure for that.
		switch event {
		case .A(let event):
			return stateA.handleEvent(event).map(OrthogonalStateCommand.A)
		case .B(let event):
			return stateB.handleEvent(event).map(OrthogonalStateCommand.B)
		}
	}
}

// Not allowed to nest types in generic types.
enum OrthogonalStateEvent<StateA: StateType, StateB: StateType> {
	case A(StateA.Event)
	case B(StateB.Event)
}

enum OrthogonalStateCommand<StateA: StateType, StateB: StateType> {
	case A(StateA.Command)
	case B(StateB.Command)
}

// But hey, I can now define the same KeyboardState interface as an Adapter on OrthogonalState. Not that this is particularly cleaner, but it does contain less domain knowledge:
struct KeyboardState2: StateType {
	private typealias OrthogonalStateType = OrthogonalState<MainKeypadState, NumericKeypadState>
	private var orthogonalState: OrthogonalStateType

	enum HardwareKeyEvent {
		// A quick and dirty definition.
		case Alpha(Character)
		case Number(Int)
		case NumLock
		case CapsLock

		private var orthogonalEvent: OrthogonalStateType.Event {
			switch self {
			case .Alpha(let c): return .A(.Alpha(c))
			case .Number(let n): return .B(.Number(n))
			case .NumLock: return .B(.NumLock)
			case .CapsLock: return .A(.CapsLock)
			}
		}
	}

	static let initialState = KeyboardState2(orthogonalState: OrthogonalStateType.initialState)

	mutating func handleEvent(event: HardwareKeyEvent) -> KeyboardOutputCommand? {
		switch orthogonalState.handleEvent(event.orthogonalEvent) {
		case .A(let c)?: return c
		case .B(let c)?: return c
		case nil: return nil
		}
	}
}

Alternative approaches, outstanding issues

Functions, not enum manipulations

The use of enums in this pattern is not terribly idiomatic. In a fundamentally imperative language, we usually describe events with methods. Unfortunately, methods aren't as flexible in use-vs-reference as enums, so state machines defined in this way don't compose quite so cleanly.

Another issue with using functions to specify events is that they make the machine harder to "read" by emphasizing transitions over states. Compare this definition of TurnstyleState to the initial one; I argue it's much harder to picture the boxes-and-diagrams graph. (note that it's impossible to implement OrthogonalStateType on top of this structure)

enum TurnstyleState2 {
	case Locked(credit: Int)
	case Unlocked
	indirect case Broken(oldState: TurnstyleState2)

	enum Command {
		case SoundAlarm
		case CloseDoors
		case OpenDoors
	}

	static let initialState = TurnstyleState2.Locked(credit: 0)
	private static let farePrice = 50

	func insertCoin(value: Int) -> (TurnstyleState2, Command?) {
		switch self {
		case .Locked(let credit):
			let newCredit = credit + value
			if newCredit >= TurnstyleState.farePrice {
				return (.Unlocked, .OpenDoors)
			} else {
				return (.Locked(credit: newCredit), nil)
			}
		case .Unlocked, .Broken:
			return (self, nil)
		}
	}

	func admitPerson() -> (TurnstyleState2, Command?) {
		switch self {
		case .Locked:
			return (self, .SoundAlarm)
		case .Unlocked:
			return (.Locked(credit: 0), .CloseDoors)
		case .Broken:
			return (self, nil)
		}
	}

	func machineDidFail() -> TurnstyleState2 {
		switch self {
		case .Locked, .Unlocked:
			return .Broken(oldState: self)
		case .Broken:
			return self // Or throw?
		}
	}

	func machineRepairDidComplete() -> TurnstyleState2 {
		switch self {
		case .Broken(let oldState):
			return oldState
		case .Locked, .Unlocked:
			return self // Or throw?
		}
	}
}

Type-safe state transitions

All those places where I wrote "or throw?"... can we do better?

One key issue is that event handlers are fairly indiscriminate. If we're listening to events from some hardware sensor, those events are going to arrive regardless of what our state is, so we have to branch somewhere to decide whether to handle it or not. Right now, we branch in handleEvent. If individual states had their own types (only some of which included certain events), we'd just kick the branch up to whatever component dispatches events to states.

So a rigorous solution must meaningfully affect dispatch of the input events which lead to state changes. One solution (similar to Elm's) would be to allow states to (directly or indirectly) specify which upstream subscriptions are actually active. Those specifications could include enough type information to directly enforce that states only receive events with corresponding transitions.

I'll save a detailed treatment on this topic for another day.

Acknowledgements

My thanks to Colin Barrett, Chris Eidhof, Justin Spahr-Summers, and Rob Rix for useful feedback on a first draft of this concept.

My thanks to Matthew Johnson for suggesting that optionality of OutputCommand in handleEvent() can itself be optional.

@AquaGeek
Copy link

AquaGeek commented Aug 11, 2016

@andymatuschak (and everyone else involved), thanks for sharing this!

What's the license?

@andymatuschak
Copy link
Author

andymatuschak commented Aug 17, 2016

@AquaGeek: Good question. This publication is licensed under a Creative Commons Attribution 4.0 International License.

@andymatuschak
Copy link
Author

I've gotten a couple requests, so: the source in this publication is also available under the MIT License.

Copyright (c) 2016 Andy Matuschak

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

@jmfayard
Copy link

Very nice!
I tried to reimplement it in kotlin, it maps very nicely as well!

https://gist.github.com/jmfayard/ac6a94df1cc2994ab5b59f510c98133f

@alexsullivan114
Copy link

Question for the audience - how do people feel about defining the states of the state machine as objects and routing the event to the object rather than having the encompassing class switch on both the current state and the provided event? One drawback I can see of that approach would be that the individual states now know about the next state in the state machine. I'm not sure if that's actually a bad thing, though. Thoughts?

@nogtini
Copy link

nogtini commented Jan 25, 2018

@alexsullivan114 That's a curious and thought-provoking observation! Normally a "state" might be considered the set or composition of all run-time dependencies or links within the system, and the "next state" simply the next discrete form of that set after any change occurs to any dependency, but in these cases these "state machines" are severe, severe abstractions aiding knowledge share by creating a reasonable mental model or abstraction.

With that in mind, and given the fact that statecharts (orthogonal, hierarchical state machines as described above) are canonical UML, let's presume that the arrows in the graphics not only represent event transitions, but, in code form, also represent a code dependency as you're suggesting. They will certainly represent a knowledge dependency if you use some sort of event bus or router, but given that these implemented-in-code state machines serve the abstraction, and not necessarily the system (the system de facto is a state machine at all times, regardless of how you represent it in code), then creating either dependency becomes a simple quality issue. By this I mean are there any external constraints that might cause conflict in creating a direct coupling that hinders you from representing this abstraction efficiently? I would argue no, as the amount of knowledge encapsulated within each state is simply the state itself, and not any larger system demands.

@cbarrett
Copy link

Now that Swift can better encode final rather than initial representations, I think it'd be great to apply that here. I'm aiming to fork this with an example, but no promises :)

@ktraunmueller
Copy link

ktraunmueller commented Nov 7, 2018

@andymatuschak Very nice. However, the design requires a lot of convenience initializers for converting between the Event types of compound states and substates. The same is probably true for the different Command types . One needs to write quite a bit of boilerplate code for the type-conversion initializers, and the conversion initializer calls also don't help with readability.

It would seem more practical if all events (and commands/actions) were of the same type (i.e., one top-level enum), where each state would accept only a subset of the events and ignore all other events (the accepted events defining the set of valid transitions). But maybe that goes against the idea of your design.

One thing that feels wrong, though, is that the handleEvent() method of the composite state KeyboardState dispatches events only to one of its parallel substates, not both. I think all parallel substates should have the chance to react to an event. You have this comment "In some orthogonal state machines, certain events will want to be dispatched to both children. We'd need a somewhat more complex structure for that." in your code – I think the structure should be such that this is always possible (basically going towards a full implementation of Harel statecharts).

In this regard, Harel's paper has this orthogonality example where both A and D respond to event α:

screen shot 2018-11-08 at 09 09 52

"If event α then occurs, it transfers B to C and F to G simultaneously, resulting in the new combined state (C, G)". This is probably such a common use case (and useful feature to have) that the structure should be able to handle it. A handleEvent() method would then maybe need to return a Set<Command> instead of Command?.

What also struck me as odd is that the StateType protocol isn't actually used anywhere (only adopted, but you could just leave it out and the code would compile just as fine). As a generic protocol, it would need to serve as a type constraint on the element type of some generic data structure or argument to a generic function. What's the point of introducing a protocol that's not used anywhere to enforce some constraints?

@mfaani
Copy link

mfaani commented Feb 9, 2022

The image links seem to be broken. Also the link for "A Pattern Language of Statecharts" is broken. Would you mind fixing them?

@jackrwright
Copy link

I found the referenced paper here: A Pattern Language of Statecharts)

@zzzgit
Copy link

zzzgit commented Apr 21, 2022

Is there a js or java implementation?

@jordanebelanger
Copy link

Great paper, we make state machines that works the exact same way but with just slightly less structure, good inspiration to get things perfect 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment