@karlgluck
Last active December 4, 2016 23:56
A novel control flow pattern for easily writing a game's primary state machine

I'm working on some side projects and have been toying with ideas for more efficient ways to write games in Unity with C#. One of the major hangups for some of my larger side projects is that once a project reaches sufficient complexity, it's hard to tell exactly what state the app is in. Even the simple wallet app I built has this issue. I've come up with the beginnings of a solution so I wanted to write it down before it gets lost.

The key problem is that you want your program to respond to inputs that require waiting for a response from something (a webserver, a user, etc). There are a number of options in Unity with C#.

These first 4 options are variations on a theme of control "elsewhere" in the program letting your code know that something has happened:

  • Virtual Callback: e.g. you derive from a class and override a method which gets called in response to an event
  • Reflection Callback: e.g. you declare OnMouseMove on a class and that function gets called whenever the mouse moves
  • Callback Property: using a delegate member on a class (with or without event for callback lists) to invoke a method on the class that wants to know about the event
  • Callback Chaining: calling a function and passing a delegate parameter that is called in order to return data later

The biggest issue with these that I wrestle with is that you have to maintain additional state to keep track of whether the callback is valid or not. You don't want the map-drag callback that responds to mouse movement to be invoked when the user is clicking and dragging on a scroll box. With callbacks, this happens all over the place. Each callback often tests different pieces of the state for indicators about whether it should or should not respond--whether some particular GameObject is enabled, whether a global flag is set, etc. The amalgamation of all these tests is the true indicator of what state the app is in. And sometimes, things come along that unexpectedly change the state, break these tests, and cause bugs.
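
To make the callback problem concrete, here is a minimal sketch in plain C# with no Unity types. The names MouseInput, MapView, IsDragging and ScrollBoxHasFocus are all made up for illustration. The point is only that the handler has to re-derive the app state from scattered flags before it dares to act.

```csharp
using System;

// Hypothetical input source: raises an event whenever the mouse moves.
public sealed class MouseInput
{
    public event Action<int, int> MouseMoved;

    public void Move (int dx, int dy)
    {
        if (MouseMoved != null)
        {
            MouseMoved (dx, dy);
        }
    }
}

// Hypothetical map view whose drag callback must guess whether it is "active".
public sealed class MapView
{
    public bool IsDragging;          // set when the user presses on the map
    public bool ScrollBoxHasFocus;   // set by an unrelated UI widget
    public int OffsetX;
    public int OffsetY;

    public MapView (MouseInput input)
    {
        input.MouseMoved += OnMouseMoved;
    }

    private void OnMouseMoved (int dx, int dy)
    {
        // The real state of the app is implied by the combination of these tests.
        if (!IsDragging || ScrollBoxHasFocus)
        {
            return;
        }
        OffsetX += dx;
        OffsetY += dy;
    }
}
```

Every new widget that can steal the drag adds another flag to that if statement, which is exactly the bookkeeping described above.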

From the user's point of view, an app either is or is not in a certain state--so why not do the same with code? After all, it's the natural way to write a program. Fortunately, Coroutines in Unity help here. However, coroutines have a different issue: while you're absolutely sure what state the program is in since only one line of a coroutine is executing at a time, it's harder to make non-hierarchical state transitions. For example, if you want to kick the user back to the main menu when an update is detected mid-game, you had better be ready to walk back up the call stack of coroutines-calling-coroutines and have all those code paths handle the "let's bail out" condition. This gets really annoying.

Plus, it's a bit of a pain to bring data from sub-coroutines back into the caller, given that iterator methods cannot have out parameters. You can use some pretty contrived constructions like passing a delegate callback as the parameter, which assigns results into local variables of the caller...yuck...or pass a class instance that sends data back. But even in that case, you need a data structure for every coroutine call you want to make, or some set of standard ones as well as custom ones for special cases. If that's too annoying you could always just pass a Dictionary<string,object> and be done with it, but that's pretty ugly too--and you still have that looming problem of state transitions being a royal pain.

  • Coroutine: yield return StartCoroutine (DoYourOtherThing ()) and wait for the coroutine to run.
  • Coroutine with Callback: yield return StartCoroutine (WaitForInputThenCall (delegate (string userInput) { localVar = userInput; }));
  • Coroutine with Dictionary:
string localVar;
{
   var retval = new Dictionary<string,object>();
   yield return StartCoroutine (WaitForInput (retval));
   localVar = (string)retval["userInput"];
}
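
For comparison, the "pass a class instance that sends data back" option could look something like the sketch below. CoroutineResult<T>, WaitForInput's extra parameter and RunToCompletion are hypothetical names. RunToCompletion only stands in for StartCoroutine so the snippet runs outside Unity.

```csharp
using System.Collections;

// Hypothetical reusable holder for a single value produced by a sub-coroutine.
public sealed class CoroutineResult<T>
{
    public T Value;
    public bool HasValue;
}

public static class ResultHolderDemo
{
    // Sub-coroutine: waits for `framesToWait` steps, then writes its result.
    public static IEnumerator WaitForInput (CoroutineResult<string> result, int framesToWait)
    {
        for (int i = 0; i < framesToWait; i++)
        {
            yield return null;
        }
        result.Value = "typed text";
        result.HasValue = true;
    }

    // Stand-in for StartCoroutine: steps the coroutine until it finishes
    // and reports how many times it yielded.
    public static int RunToCompletion (IEnumerator coroutine)
    {
        int steps = 0;
        while (coroutine.MoveNext ())
        {
            steps++;
        }
        return steps;
    }
}
```

It is type-safe, unlike the dictionary, but the caller still has to allocate a holder before every call and unpack it afterwards.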

Slightly cleaner if you want to pass data back is to poll shared state, but this gets messy when you want multiple fields simultaneously, or want to test for more complex things (e.g. all of user input sent, user clicked help or user closed input box).

  • Coroutine with Polling
string userInput = this.sharedUserInputVariable;
while (object.ReferenceEquals(userInput, this.sharedUserInputVariable))
{
  yield return null;
}

So what else can we do? Well, what all of these solutions are really trying to do is to get around a core problem: you're executing logic that you'd really just like to pause and come back to later when the code implementing that logic reaches a decision that it can't make without more info. In a single-purpose program like a calculator, this is actually super simple to solve. You don't need callbacks or polling or anything else fancy like that. You just stop the entire world and wait for the user to punch a button!

  • Blocking: Put everything on hold and wait for whatever you want to happen.

Why can't we do this in more complex game code? Well, Coroutines as they are normally used are almost this, but the above showcases the fragility of the approach: state transitions and managing return values both suck.

So here's the nugget of an idea I've come to: what if the normal coroutine behavior were inverted, such that the coroutine implementing the logic of the program was actually invoked by the code that implements those decisions?

Here's the basic idea.

In the main loop of the controller class, you iterate your code pointer to grab a return value. The controller does a dynamic dispatch of the returned type from the RunLogic method to invoke ProcessLogicQuery for whatever type is returned. But here's the trick: this processing doesn't invoke the logic function again until the query structure has been filled out. The ProcessLogicQuery methods themselves are state-aware coroutines with full control of their own execution--however, they have a singular purpose and can be easily stubbed and tested.

   IEnumerator codePointer = RunLogic ();
   ...
   // in the main loop: advance the logic to its next query, then service that query
   while (codePointer.MoveNext ())
   {
       object query = codePointer.Current;
       yield return StartCoroutine (DynamicDispatch ("ProcessLogicQuery", query));
   }

The RunLogic method executes the model-affecting process of the game. Whenever it can't make a decision or needs more info, it simply yield returns an appropriate class type to get the information it needs:

    IEnumerator RunLogic ()
    {
        ...
        string username;
        if (this.HasUserNameBeenSet())
        {
            username = this.GetUserName();
        }
        else
        {
            var queryUserName = new AskUserToSelectUsername ();
            yield return queryUserName;
            username = queryUserName.Username;
        }
        ...
    }
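
Here is a minimal end-to-end sketch of that round trip in plain C#, runnable without Unity. AskUserToSelectUsername and RunLogic follow the snippet above. The Drive loop, the report callback and the immediate-answer ProcessLogicQuery are assumptions made to keep the example small. A real handler would be a coroutine that shows UI and waits across frames.

```csharp
using System;
using System.Collections;

// Query object: the logic yields it, the controller fills in Username.
public sealed class AskUserToSelectUsername
{
    public string Username;
}

public static class InvertedCoroutineDemo
{
    // Logic coroutine: pauses on a query and resumes once it has been answered.
    public static IEnumerator RunLogic (Action<string> report)
    {
        var queryUserName = new AskUserToSelectUsername ();
        yield return queryUserName;
        report ("Hello, " + queryUserName.Username);
    }

    // Stand-in for a ProcessLogicQuery handler that answers immediately.
    public static void ProcessLogicQuery (AskUserToSelectUsername query, string answer)
    {
        query.Username = answer;
    }

    // Controller loop: step the logic, answer whatever it asked, step again.
    public static void Drive (IEnumerator codePointer, string answer)
    {
        while (codePointer.MoveNext ())
        {
            var query = codePointer.Current as AskUserToSelectUsername;
            if (query != null)
            {
                ProcessLogicQuery (query, answer);
            }
        }
    }
}
```

Note that RunLogic never learns how the username was obtained, which is what makes the handlers easy to stub in tests.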

The method of obtaining information is now flexible, but we can do better: we can also make the state transitions flexible. The state of the app is always defined as the location of the current code pointer in RunLogic. However, there is no need for RunLogic to be the only logic coroutine. If we make the caller recognize the Delegate type from yield return, we can define that to mean "switch to and start running this other logic coroutine". This rule gives us the flexibility to switch between any two states at any time. Finally, a particular game can recognize more return types from the logic coroutine in order to make the UI easier--for example, the logic coroutine could return hints about what it's going to do soon so that the UI can start asynchronous operations (downloading a file, for example) to acquire information before the logic coroutine explicitly asks for it.
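
The delegate rule can be sketched in a few lines of plain C#. MainMenu, InGame, Log and Drive are hypothetical names, and the driver is stripped down to handle only the "switch coroutine" case.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class StateSwitchDemo
{
    public static readonly List<string> Log = new List<string> ();

    public static IEnumerator MainMenu ()
    {
        Log.Add ("menu");
        // Yielding a delegate means "abandon this coroutine and run that one".
        yield return new Func<IEnumerator> (InGame);
        Log.Add ("never reached");
    }

    public static IEnumerator InGame ()
    {
        Log.Add ("game");
        yield break;
    }

    public static void Drive (IEnumerator codePointer)
    {
        while (codePointer.MoveNext ())
        {
            var next = codePointer.Current as Func<IEnumerator>;
            if (next != null)
            {
                // The old coroutine is simply dropped; nothing unwinds a call stack.
                codePointer = next ();
            }
        }
    }
}
```

Because the driver just replaces its code pointer, a jump from deep inside one state to any other costs the same as a jump between siblings.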

Implementation

This is an example WIP implementation.

using System.Collections;
using System;
using System.Reflection;

public sealed class GameMachine
{
    private IEnumerator codePointer;
    private Func<object, IEnumerator> dynamicDispatch;
    
    private GameMachine ()
    {
    }
    
    public static IEnumerator Start (IEnumerator codePointer, Func<object, IEnumerator> dynamicDispatch)
    {
        return new GameMachine ()
        {
            codePointer = codePointer,
            dynamicDispatch = dynamicDispatch,
        }.run();
    }
    
    public class GameMachineGoSub
    {
        internal IEnumerator CodePointer;
    }
    
    public static GameMachineGoSub GoSub (IEnumerator codePointer)
    {
        return new GameMachineGoSub
        {
            CodePointer = codePointer,
        };
    }
    
    public class GameMachineGoTo
    {
        public Func<IEnumerator> Code;
    }
    
    public static GameMachineGoTo GoTo (Func<IEnumerator> code)
    {
        return new GameMachineGoTo
        {
            Code = code
        };
    }

    IEnumerator run ()
    {
        while (codePointer.MoveNext())
        {
            object retval = codePointer.Current;
            var requestGoSub = retval as GameMachineGoSub;
            if (requestGoSub != null)
            {
                var sub = new GameMachine ()
                {
                    codePointer = requestGoSub.CodePointer,
                    dynamicDispatch = this.dynamicDispatch,
                };
                var subCodePointer = sub.run();
                while (subCodePointer.MoveNext())
                {
                    yield return subCodePointer.Current;
                }
                continue;
            }
            var requestGoTo = retval as GameMachineGoTo;
            if (requestGoTo != null)
            {
                codePointer = requestGoTo.Code.Invoke();
                continue;
            }
            if (retval != null)
            {
                var queryCodePointer = (IEnumerator)dynamicDispatch (retval);
                if (queryCodePointer == null)
                {
                    continue;
                }
                while (queryCodePointer.MoveNext())
                {
                    yield return queryCodePointer.Current;
                }
                continue;
            }
            else
            {
                yield return null;
            }
        }
    }

    public static Func<object, IEnumerator> CreateDynamicDispatchDelegate (object target, string methodName)
    {
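        return delegate (object parameter)
        {
            // Find the overload of methodName whose single parameter accepts the query's runtime type
            var method = target.GetType().GetMethod(
                methodName,
                BindingFlags.IgnoreCase | BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance,
                null,
                new Type[] { parameter.GetType() },
                null);
            // No handler for this type: return null, which run() treats as "nothing to wait for"
            if (method == null)
            {
                return null;
            }
            return (IEnumerator)method.Invoke (target, new object[] { parameter });
        };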
    }
}

Usage

using System.Collections;
using UnityEngine;

public class GameLogicRunner : MonoBehaviour
{
	void Start ()
	{
		this.StartCoroutine (GameMachine.Start (GameLogic.Singleton.Run (), GameMachine.CreateDynamicDispatchDelegate (GameLogic.Singleton, "Handle")));
	}
}

public class GameLogic
{
  public static GameLogic Singleton = new GameLogic ();
  
  public IEnumerator Run ()
  {
    Debug.Log ("GameLogic.Run");
    // WaitForSeconds is a YieldInstruction, so it is dispatched to Handle (YieldInstruction) below
    yield return new WaitForSeconds (1.0f);
    // GoTo takes a Func<IEnumerator>, so pass the method itself rather than calling it
    yield return GameMachine.GoTo (this.anotherState);
  }
  
  private IEnumerator anotherState ()
  {
    Debug.Log ("GameLogic.AnotherState");
    // an iterator method needs at least one yield statement to compile
    yield break;
  }
  
  public IEnumerator Handle (YieldInstruction request)
  {
    Debug.Log ("Handle(YieldInstruction)");
    yield return request;
  }
}

Implementation Notes

I based a new game prototype around this mechanism and am very happy with the results so far. I've observed some useful patterns that may help to extend it going forward:

  • Decoupling the controllers from each other via auto-bound delegates is really handy; the [Bind] annotation has made this trivial.
  • I frequently want a decision to have multiple satisfying conditions. I could create defined classes for each, but often what I really desire is to listen for a set of events and proceed when one is received. The main class has an event queue (an array) that methods push class instances into. Then, the logic coroutine uses wrapper classes (WaitForEventOfType and WaitForAnyEventInTypes) to pull events of required types out of the queue. The cool thing here is that events which are not being listened to just fall out of the queue nicely.
  • I'm finding that this system works best when it purely answers the question "what is the app doing right now?" and I don't try to hit all the nails with this one hammer. For example, I tried creating a second system to run a town simulation and tie the two state machines together via events. It was going to work, but was also going to explode in complexity. One of the first things I thought I would need was multiple event queues requiring event duplication or routing. And when I wanted to make it run the simulation, I realized I was still just coalescing a bunch of state machines (one for each NPC, one for the player, one for the state of the speech bubbles, etc) into a single "town state machine". So really, I should split it out and make more independent, communicating machines. Abiding by my "zero, one, or many" pattern, it seemed like this was rapidly becoming a generalized framework that I should think more about in order to standardize. Instead, I pulled back and forced this to be a "one" solution and I really like how that's worked out.
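
The event-queue pattern from the second note might be sketched roughly as follows. This is not the actual implementation: only the name WaitForAnyEventInTypes comes from the note, while EventQueue, Push, TryFulfill and the "discard everything on each check" policy are guesses at one way to get the fall-out-of-the-queue behavior.

```csharp
using System;
using System.Collections.Generic;

// Wrapper the logic coroutine would yield in order to wait for an event.
public sealed class WaitForAnyEventInTypes
{
    public readonly Type[] Types;
    public object Received;   // filled in by the controller

    public WaitForAnyEventInTypes (params Type[] types)
    {
        Types = types;
    }
}

// Hypothetical queue owned by the main class.
public sealed class EventQueue
{
    private readonly List<object> pending = new List<object> ();

    public void Push (object evt)
    {
        pending.Add (evt);
    }

    // Tries to satisfy the wait with the first queued event of a wanted type.
    // Everything queued so far is discarded either way, so events that
    // nobody is waiting for fall out of the queue.
    public bool TryFulfill (WaitForAnyEventInTypes wait)
    {
        object match = null;
        foreach (object evt in pending)
        {
            if (Array.Exists (wait.Types, t => t.IsInstanceOfType (evt)))
            {
                match = evt;
                break;
            }
        }
        pending.Clear ();
        wait.Received = match;
        return match != null;
    }
}
```

A handler for WaitForAnyEventInTypes would call TryFulfill once per frame and yield until it returns true.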