Skip to content

Instantly share code, notes, and snippets.

@jamesrcounts
Created July 7, 2012 01:32
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jamesrcounts/3063755 to your computer and use it in GitHub Desktop.
Save jamesrcounts/3063755 to your computer and use it in GitHub Desktop.
String.Contains, C# Finite State Machine Implementation

String.Contains C# Edition

I enjoy playing with state machines. They are simple enough to implement quickly, and complex enough to give the implementation language a little workout. Followers of this blog will know that I've enjoyed using Finite State Machines to explore CoffeeScript.

But, when I look at the search terms leading to my blog, I see that people want to see FSMs implemented in C#, C++, or even VB. About the only language I haven't seen people looking for is CoffeeScript. I'm not too suprised, since most people searching for FSM implementations are probably students trying to cheat on thier homework. I doubt any professors are accepting CoffeeScript implemenations.

When I see these search results, I consider quickly posting a solution in the requested language. I don't mind if students crib solutions from me. I'm a big believer in learning by example. Certainly if by looking at my FSMs I can save one student from writing a switch statement, then I'll consider it a public service. On the otherhand, I'm not really interested in producing solutions by language on demand. Its not so much that I'm worried about students reading my solutions, as much as it is that implenting FSMs over and over again in different languages isn't very interesting on its own merits.

While I was refactoring the String.Contains FSM for my previous post, I saved a few lines by taking advantage of closure. My inspiration for this refactoring came from a question one of the attendee's asked at my CoffeeScript Crash Course presentation at SoCalCodeCamp. If you watch the video you will hear him ask, "Does the @ imply closure, and if so why?" As I said in my presentation and on this blog, I'm no expert at CoffeeScript or JavaScript, so I can't claim to have instantly grasped the context of his question. However, as I listened to it again, and listened to his comments with the other attendees, I began to understand why he asked the question. The result was the new implementation I wrote about in my last post, which takes much more advantage of closure than the original implementation.

C# also supports closure, so I begain to wonder what a C# String.Contains implementation would look like if we wanted to use as much closure as possible. I've implemented this machine in C# and C++ before, but I'm intrigued enough by the closure idea to try C# again.

Getting Started

This sample should be simple, Visual Studio seems like overkill, so I'm going to code it in Sublime Text 2 and use the C# compiler on the command line.

csc .\StringCompare.cs

If StringCompare.cs contains this skeleton program, it should compile to a do-nothing program:

public class Program
{
    public static void Main(params string[] args)
    {
    }
}

Rules

In our CoffeeScript implementation we have a class called Rule with two subclasses. Lets see how this looks in C#:

public class Rule
{
    public Rule(int currentState, char inputChar, int nextState)
    {
        this.NextState = nextState;
        this.Match = (machineState, readChar) =>
        {
            return currentState == machineState && readChar == inputChar; 
        };
    }

    public int NextState { get; private set; }
    public Func<int, char, bool> Match { get; private set; }
}

Remember, our goal is to use as much closure as possible, so if we can avoid declarin data properties, we will. All of our data comes in through the constructor, so we can create a lambda expression that closes over the currentState and inputChar and use it as our Match function. We still need to declare the Match function and its signature, but there is no need to create instance properties for the data. For NextState we could define a Func<int>, but that would have no benefit compared to a simple int property, so we'll use a property.

That's all fine in theory, but does it work? Let's write a little test in the Main function.

public class Program
{
    public static void Main(params string[] args)
    {
        var myRule = new Rule(0, '1', 1);
        Console.WriteLine(myRule.Match(0,'1'));
        Console.WriteLine(myRule.Match(1,'1'));
    }
}

As expected, this test prints True followed by False. It's clear from this test that closure works just as well in constructors as anywhere else. Having satisfied my intuition, I can delete this test code If I want to and move on to implementing the specialized TrapRule and ReturnRule classes.

public class TrapRule : Rule
{
    public TrapRule(int currentState)
        : base(currentState, (char)0, currentState)
    {
        this.Match = (machineState, readChar) =>
        {
            return currentState == machineState;
        };
    }
}

Like the CoffeeScript implementation, I pass currentState to the base implementation twice. However, I can't pass null for a char value in C#. Instead, I pass in the null character 0. Its also interesting to note that I don't need to make Match virtual, and I don't need to override it. All I need to do is make the setter protected so that the subclass can replace the implementation. We can update the test in Main to verify that this subclass works.

public class Program
{
    public static void Main(params string[] args)
    {
        var myRule = new TrapRule(0);
        Console.WriteLine(myRule.Match(0,'1'));
        Console.WriteLine(myRule.Match(1,'1'));
    }
}

As expected, we see True printed, followed by False. Next, we can implement the ReturnRule.

public class ReturnRule : Rule
{
    public ReturnRule(int currentState, char inputChar)
        : base(currentState, inputChar, 0)
    {
        this.Match = (machineState, readChar) =>
        {
            return currentState == machineState && readChar != inputChar;
        };
    }
}

We actually don't need to pass currentState and inputChar to the base class, because it only needs to know the nextState. However, it doesn't hurt to pass it either. Once again, the child class replaces the Match implementation with one appropriate for the return rule. As before, we can update our test to verify the ReturnRule behavior.

public static void Main(params string[] args)
{
    var myRule = new ReturnRule(0, '0');
    Console.WriteLine(myRule.Match(0,'0'));
    Console.WriteLine(myRule.Match(0,'1'));
}

As expected, this prints False followed by True. In theory, we've implemented Rule classes that should work with a String.Contains implementation. We need to see how this works with a machine implementation.

Machine

In the CoffeeScript implementation, I saw no benefit in converting the createMachine function into a class. In C#, I might be able to do something similar with dynamic. Although this may not be as readable as creating a class, I'll try it anyway for kicks. It turns out that this doesn't work though. The C# compiler produces error CS0828 because you cannot assign a lambda expression to an anonymous type property. I don't use dynamic very often, so this is news to me. It seems I'm already learning more about C# through this exercise.

So, with this restriction in place, I will need to create a class to implement the machine.

public class Machine 
{
    public Machine(Rule[] transitionRules, int currentState, int[] acceptanceStates)
    {
        this.ReadAll = (input) =>
        {
            foreach(var inputChar in input)
            {
                Read(inputChar);
            }
        };

        this.Read = (inputChar) => 
        {
            foreach(var rule in transitionRules) 
            {
                if (rule.Match(currentState, inputChar))
                {
                    currentState = rule.NextState;
                    return;
                }
            }
            currentState = -1;
        };

        this.AcceptedInput = () => Array.IndexOf(acceptanceStates, currentState) >= 0;

    }

    public Action<string> ReadAll { get; private set; }
    public Action<char> Read { get; private set; }
    public Func<bool> AcceptedInput { get; private set; }
}

This implementation compiles, but does it work? We'll need some test code to determine for sure. To do that we'll need to implement createSearchMachine and testMachine. These implementations should be possible using Func<> instances in Main without having to create classes.

Here is createSearchMachine:

Func<string, Machine> createSearchMachine = (searchString) =>
{
    var source = searchString.ToCharArray();
    var states = Enumerable.Range(0, source.Length);
    var rules = states.Select(x => new Rule(x, source[x], x + 1))
        .Concat(states.Where(x => x != source.Length).Select(x => new ReturnRule(x, source[x])))
        .Concat(new []{ new TrapRule(source.Length) });
    return new Machine(rules.ToArray(), 0, new[]{ source.Length });
};

Its intresting that, while slightly more verbose and noisy, C# supports a nearly identical implementation as CoffeeScript. Here is testMachine:

Action<string, string> testMachine = (input, searchString) =>
{
    var machine = createSearchMachine(searchString);
    machine.ReadAll(input);
    Console.WriteLine("{0}--{1}", machine.AcceptedInput(), input);
};

Now we can finally run the machine and see how it works.

var target = "operating system";
testMachine("Windows is an operating system installed on many machines.", target);
testMachine("Welcome to the operating room, the Doctor is just finishing his martini.", target);

And our output is just what we expect:

C:\...\gist-3063755> .\StringContains.exe
True--Windows is an operating system installed on many machines.
False--Welcome to the operating room, the Doctor is just finishing his martini.
using System;
using System.Linq;
public class Rule
{
public Rule(int currentState, char inputChar, int nextState)
{
this.NextState = nextState;
this.Match = (machineState, readChar) =>
{
return currentState == machineState && readChar == inputChar;
};
}
public int NextState { get; private set; }
public Func<int, char, bool> Match { get; protected set; }
}
public class TrapRule : Rule
{
public TrapRule(int currentState)
: base(currentState, (char)0, currentState)
{
this.Match = (machineState, readChar) =>
{
return currentState == machineState;
};
}
}
public class ReturnRule : Rule
{
public ReturnRule(int currentState, char inputChar)
: base(currentState, inputChar, 0)
{
this.Match = (machineState, readChar) =>
{
return currentState == machineState && readChar != inputChar;
};
}
}
public class Machine
{
public Machine(Rule[] transitionRules, int currentState, int[] acceptanceStates)
{
this.ReadAll = (input) =>
{
foreach(var inputChar in input)
{
Read(inputChar);
}
};
this.Read = (inputChar) =>
{
foreach(var rule in transitionRules)
{
if (rule.Match(currentState, inputChar))
{
currentState = rule.NextState;
return;
}
}
currentState = -1;
};
this.AcceptedInput = () => Array.IndexOf(acceptanceStates, currentState) >= 0;
}
public Action<string> ReadAll { get; private set; }
public Action<char> Read { get; private set; }
public Func<bool> AcceptedInput { get; private set; }
}
public class Program
{
public static void Main(params string[] args)
{
Func<string, Machine> createSearchMachine = (searchString) =>
{
var source = searchString.ToCharArray();
var states = Enumerable.Range(0, source.Length);
var rules = states.Select(x => new Rule(x, source[x], x + 1))
.Concat(states.Where(x => x != source.Length).Select(x => new ReturnRule(x, source[x])))
.Concat(new []{ new TrapRule(source.Length) });
return new Machine(rules.ToArray(), 0, new[]{ source.Length });
};
Action<string, string> testMachine = (input, searchString) =>
{
var machine = createSearchMachine(searchString);
machine.ReadAll(input);
Console.WriteLine("{0}--{1}", machine.AcceptedInput(), input);
};
var target = "operating system";
testMachine("Windows is an operating system installed on many machines.", target);
testMachine("Welcome to the operating room, the Doctor is just finishing his martini.", target);
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment