@jehugaleahsa
Last active February 2, 2017 22:03
A Transition to Statelessness

Introduction

In 2012, I took a psychedelic journey into the world of functional programming. I started learning some ML-based languages, including F#, OCaml and Haskell. I also discovered Scala, which is now one of my all-time favorite languages.

I think too many people stop learning functional programming before they get into the really interesting parts. Filter/map/reduce is just the tip of the iceberg. Even if that is all you end up learning, you should at a minimum take to heart Command Query Separation. But the real meat of functional programming comes in the form of closures, partial application, currying, higher-order functions and immutability. Check out this C#:

Func<int, int> getAdder(int value)
{
    return i => i + value;
}

// ...

var adder = getAdder(2);
Console.Out.WriteLine(adder(4));  // 6
Console.Out.WriteLine(getAdder(2)(1));  // 3

Look carefully: I just called a method that returns a method. What's really interesting is that the returned method still knows about the value passed in to getAdder. This is known as a closure, and it is magic the compiler does on your behalf. Can you imagine the amount of code necessary to do this with classes? This simple pattern can be used to carry around an arbitrary amount of state, state that only the returned method can see. Sounds a lot like classes with private fields to me. In fact, any time you're dealing with a single method, delegate types (lambdas, etc.) can remove the need for interfaces and classes.

Around 2012 was also the time that front-end web development was exploding, with jQuery dominating the web. Believe it or not, JavaScript's unique combination of functional, dynamic and object-oriented programming allowed me to achieve a level of expressiveness I have yet to see matched. Here's the previous example in JavaScript:

function getAdder(value) {
    return (i) => i + value;
}

const adder = getAdder(2);
console.log(adder(4));  // 6
console.log(getAdder(2)(1));  // 3
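
For comparison, here is a rough sketch of what the closure saves you from writing by hand. The class name Adder and its add method are made up for illustration, and the private field syntax assumes a reasonably modern JavaScript runtime.

```javascript
// Hypothetical class that does by hand what the closure does for free.
class Adder {
    #value; // private field holding the captured state

    constructor(value) {
        this.#value = value;
    }

    add(i) {
        return i + this.#value;
    }
}

const classAdder = new Adder(2);
console.log(classAdder.add(4)); // 6

// The closure version needs no class at all.
const closureAdder = (value) => (i) => i + value;
console.log(closureAdder(2)(4)); // 6
```

Both versions hide the captured value from callers; the closure just gets there with far less ceremony.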

In JavaScript, writing {} creates an object. Thanks to inline objects and closures, I can create an entire object with access to outer scope variables. In fact, this pattern (Module Pattern) is how most JavaScript developers achieve truly private backing data. This is why some JavaScript developers want to do away with formal classes altogether.

function getCalculator(value) {
    return {
        adder: function (x) { return x + value; },
        subtractor: function (x) { return x - value; },
        multiplier: function (x) { return x * value; },
        divider: function (x) { return x / value; }
    };
}
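
To make the "truly private" claim concrete, here is a minimal sketch of the Module Pattern guarding a piece of data. The names createCounter, increment and current are invented for this example and do not come from any library.

```javascript
// Hypothetical example: `count` lives only inside the closure.
function createCounter(start) {
    let count = start; // private: not reachable from outside

    return {
        increment: function () { count += 1; return count; },
        current: function () { return count; }
    };
}

const counter = createCounter(10);
counter.increment();
counter.increment();

console.log(counter.current()); // 12
console.log(counter.count);     // undefined, there is no such property
```

The only way to touch count is through the two functions the object exposes, which is exactly what a private field gives you in a class.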

Statelessness

What turns most developers off from functional programming is its strong insistence that everything be stateless and immutable. Code starts looking awfully mathematical. However, it seems the community in the last decade has finally recognized that functional programming isn't all-or-nothing. Multi-threading, concurrency, etc. benefit from immutability. Processing huge amounts of data benefits from map/reduce. Higher-order functions and lazy evaluation lead to more reusable algorithm libraries. Lambdas, closures and partial application make languages more expressive. Most developers have started adopting functional programming practices without even realizing it. Sneaky...

Old examples of declarative/functional programming

When we write SQL, what we're really doing is old-school filter/map/reduce. SQL just happens to be a horrible, yet familiar, language for doing the job. We're so used to its syntax that it doesn't bother most of us until we start building up complex queries. Almost every time you use a temp table, table variable or CTE, guess what, you're working around SQL's lack of expressiveness. Listing columns before table names prevents auto-completion and other modern development conveniences from being possible.
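
One way to see the correspondence is to write a small query twice. The sketch below uses an invented employees table and column names; the SQL lives in a comment and the JavaScript underneath performs the same steps in the same order.

```javascript
// Hypothetical data standing in for an `employees` table.
const employees = [
    { name: "Ann", department: "Eng", salary: 120, active: true },
    { name: "Bob", department: "Eng", salary: 100, active: false },
    { name: "Cid", department: "Ops", salary: 90, active: true },
    { name: "Dee", department: "Eng", salary: 110, active: true }
];

// SELECT department, SUM(salary) AS total
// FROM employees
// WHERE active = 1
// GROUP BY department
const totals = employees
    .filter(e => e.active)                                      // WHERE
    .map(e => ({ department: e.department, salary: e.salary })) // SELECT (projection)
    .reduce((acc, row) => {                                     // GROUP BY + SUM
        acc[row.department] = (acc[row.department] || 0) + row.salary;
        return acc;
    }, {});

console.log(totals); // { Eng: 230, Ops: 90 }
```

Notice that the JavaScript names the source first and the projected columns later, which is the ordering that makes auto-completion possible.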

LINQ takes a more modern approach to doing the same thing. In the past, I even devised my own syntax for working with data (https://gist.github.com/jehugaleahsa/03888d13ef2745cb67d0) that supports far more than just SQL. I would guess something similar will take the industry by storm in the upcoming years. At some point, it will just make more sense to update databases to recognize a new syntax than trying to convert from new representations to SQL. Just like we may someday find browsers natively supporting CoffeeScript or TypeScript or SASS or Jade.

How functional programming affected my day-to-day programming

Delving deep into functional programming changed the way I program forever. I almost never write code like this anymore:

List<ViewModel> models = new List<ViewModel>();
foreach (var entity in entities)
{
    ViewModel model = getViewModel(entity);
    models.Add(model);
}
return models;

LINQ has been around for nearly a decade and people still write code like this; it's slowly driving me crazy. These days, once I have the code to create one model from an entity, I just use LINQ's Select to do the work for me:

return entities.Select(e => getViewModel(e)).ToList();

One thing you learn from functional programming is that the code above is equivalent to:

return entities.Select(getViewModel).ToList();

While the two snippets are functionally the same, the first creates an unnecessary lambda around getViewModel. In the second, the compiler automatically converts the method getViewModel to a delegate (Func<Entity, ViewModel>). While I would be more comfortable with the second snippet in a true functional programming language, I find it easier to read the first snippet in C#. I wouldn't assume one implementation is faster than the other without profiling.
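
The same equivalence mostly carries over to JavaScript, with one caveat worth sketching: Array.prototype.map passes three arguments (element, index, array) to its callback, so handing a function over directly only matches the wrapped version when that function ignores the extra arguments. The getViewModel function and entity shape below are invented for illustration.

```javascript
// Hypothetical mapping function that takes exactly one argument.
function getViewModel(entity) {
    return { label: entity.name.toUpperCase() };
}

const entities = [{ name: "a" }, { name: "b" }];

// These two are equivalent because getViewModel ignores extra arguments.
const wrapped = entities.map(e => getViewModel(e));
const direct = entities.map(getViewModel);

// They are NOT equivalent for functions with optional parameters:
// parseInt receives the element's index as its radix argument.
const viaLambda = ["10", "10", "10"].map(s => parseInt(s, 10)); // [10, 10, 10]
const viaDirect = ["10", "10", "10"].map(parseInt);             // [10, NaN, 2]

console.log(wrapped, direct);
console.log(viaLambda, viaDirect);
```

This is one practical argument for keeping the explicit lambda when the function being passed is not your own.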

While all I did was replace a foreach loop, switching to LINQ brings other benefits. If one or more of the steps are blocking, it is easier to incorporate async/await. Tools like Parallel LINQ (PLINQ) can run the filtering and mapping operations in parallel, as well. Finally, this model supports streaming data in from a database or other source rather than waiting for it to download completely into memory before processing. That's a lot of benefits, aside from reducing 5 lines down to 1.
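
The streaming benefit is easier to believe once you watch it happen. The sketch below imitates the idea with JavaScript generators; where, select, take and fetchRows are names I made up for the example, and it is only an analogy for how a lazy pipeline behaves, not a description of how LINQ is implemented.

```javascript
// Hypothetical source: yields rows one at a time and records each pull.
const pulled = [];
function* fetchRows() {
    for (let id = 1; id <= 1000; id++) {
        pulled.push(id);
        yield { id, isActive: id % 2 === 0 };
    }
}

// Lazy filter: passes items through only when the predicate holds.
function* where(source, predicate) {
    for (const item of source) {
        if (predicate(item)) yield item;
    }
}

// Lazy map: transforms each item as it is requested.
function* select(source, projection) {
    for (const item of source) {
        yield projection(item);
    }
}

// Stops pulling from the source once `count` items have been produced.
function* take(source, count) {
    if (count <= 0) return;
    let taken = 0;
    for (const item of source) {
        yield item;
        taken += 1;
        if (taken >= count) return;
    }
}

const pipeline = take(
    select(where(fetchRows(), r => r.isActive), r => r.id * 10),
    2
);

const firstTwo = Array.from(pipeline);
console.log(firstTwo);      // [ 20, 40 ]
console.log(pulled.length); // 4, only four rows were ever produced
```

Nothing runs until Array.from starts asking for values, and the source stops being read as soon as the consumer has what it needs.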

My use of LINQ has gone way beyond simple mapping exercises. It is not uncommon to find my code using Join, SelectMany and GroupBy. In many cases, the code I am writing is just transforming data from one representation to another and almost any filter/map/reduce operation can be coded with LINQ.
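
For readers who have not met those three operators, here is a rough JavaScript analogue. The customers and orders data is invented, and the join is a naive lookup rather than anything a real query provider would do.

```javascript
// Hypothetical data: customers and their orders.
const customers = [
    { id: 1, name: "Ann" },
    { id: 2, name: "Bob" }
];
const orders = [
    { customerId: 1, items: ["pen", "ink"] },
    { customerId: 2, items: ["cup"] },
    { customerId: 1, items: ["pad"] }
];

// Join: pair each order with its customer (a simple nested lookup).
const joined = orders.map(o => ({
    customer: customers.find(c => c.id === o.customerId).name,
    items: o.items
}));

// SelectMany: flatten one row per order into one row per item.
const rows = joined.flatMap(j =>
    j.items.map(item => ({ customer: j.customer, item }))
);

// GroupBy: collect items under each customer name.
const itemsByCustomer = rows.reduce((groups, row) => {
    (groups[row.customer] = groups[row.customer] || []).push(row.item);
    return groups;
}, {});

console.log(itemsByCustomer); // { Ann: [ 'pen', 'ink', 'pad' ], Bob: [ 'cup' ] }
```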

To be honest, when I first started, I didn't know many "tricks" for keeping my functional code clean. I ended up writing some very strange-looking LINQ. After a while, I found ways to avoid switching between method and query syntax, used let to avoid duplication and so on (that's a whole other post).

What's interesting about adopting this style of programming is that I found my classes looking a lot different. A pattern arose: dependencies became readonly backing fields, initialized in the constructor, and everything else I needed was passed to my methods. Only very rarely did my classes end up having any mutable state of their own (caching?). I had inadvertently stopped using state!

Believe it or not, this is a pretty natural thing when you work in a RESTful application. Unlike WebForms or the like, you no longer store information with the user's session. You get everything you need to process the request from the query string or POST data. Simple applications simply pull data from a database, transform it until it's serialized into JSON and then poof - it's like nothing ever happened. Saving user changes is the exact same thing, just in the opposite direction. It's just a massive exercise in filter/map/reduce.

A common pattern in stateless code

Even in such stateless environments, you still had state within the scope of your methods, of course. But they weren't like your classical variables that are mutating all throughout the method. Rather, they were typically just a way to avoid writing code like this:

return repository.GetData(id).Where(d => d.IsActive).Select(d => mapper.GetViewModel(d)).ToList();

Instead, you just broke that up into smaller steps that avoided relying on the horizontal scroll bar.

var data = repository.GetData(id);
var activeData = data.Where(d => d.IsActive);
var models = activeData.Select(d => mapper.GetViewModel(d));
return models.ToList();

What's sad is C# never adopted an equivalent to Java's final or JavaScript's const keyword. It would make clear that I'm not going to be reassigning these "variables" after initializing them. This would make it easier for the compiler to optimize these variables away or even do magic behind the scenes to parallelize the operation.
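
Here is roughly what that intent looks like in JavaScript, where const is available. The data is invented, and the sketch also shows a limit of the keyword: const only prevents rebinding the name, so Object.freeze is used to keep the array itself from changing.

```javascript
// Hypothetical rows standing in for the repository result.
const rawData = [
    { name: "first", isActive: true },
    { name: "second", isActive: false }
];
const activeRows = rawData.filter(d => d.isActive);
const viewModels = Object.freeze(activeRows.map(d => ({ title: d.name })));

// const blocks rebinding: the assignment below throws a TypeError.
let rebindError = null;
try {
    viewModels = [];
} catch (error) {
    rebindError = error;
}

// Object.freeze blocks mutation of the array itself.
let mutateError = null;
try {
    viewModels.push({ title: "extra" });
} catch (error) {
    mutateError = error;
}

console.log(rebindError instanceof TypeError); // true
console.log(mutateError instanceof TypeError); // true
console.log(viewModels.length);                // 1
```

Each intermediate name is assigned exactly once, which is the property I wish I could declare in C#.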

Now, let's put the previous snippet in the context of a class:

public class DataAdapter
{
    private readonly IDataRepository repository;
    private readonly IDataMapper mapper;

    public DataAdapter(IDataRepository repository, IDataMapper mapper)
    {
        this.repository = repository;
        this.mapper = mapper;
    }

    public IEnumerable<ViewModel> GetModels(int id)
    {
        var data = repository.GetData(id);
        var activeData = data.Where(d => d.IsActive);
        var models = activeData.Select(d => mapper.GetViewModel(d));
        return models.ToList();
    }
}

The placement of IDataRepository and IDataMapper as backing fields is due to their relationship to the lifetime of the DataAdapter instance. You can reuse the same repository or mapper regardless of which id you pass in. It doesn't make sense to force users of your library to pass these into GetModels. However, the values of data, activeData and models are tied directly to the lifetime of id, so it makes sense for these to only live within the method.

You might think it is strange that I pass IDataRepository and IDataMapper into the constructor. This is pretty common with dependency injection. This avoids having to new up the concrete classes in the constructor or later, where both might have a long list of dependencies of their own (e.g., an Entity Framework context, a connection string, other mappers, etc.).

A quick sign of someone who doesn't understand OOP or classes is the mismanagement of variable lifetimes. I have run across plenty of code that would reinitialize mapper for each call to GetModels or try to store one or more of the intermediate results in backing fields. Yuck! I make sure to mark my backing fields as readonly just to emphasize these are one-time initialized and not really state.

Lifetime comparison:
models < data < id < adapter <= repository == mapper
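
The same lifetimes can be sketched without a class at all. In the JavaScript below, createDataAdapter, fakeRepository and fakeMapper are hypothetical stand-ins for the C# types above; the point is only where each value lives.

```javascript
// Hypothetical stand-ins for IDataRepository and IDataMapper.
const fakeRepository = {
    getData: id => [
        { id, name: "first", isActive: true },
        { id, name: "second", isActive: false }
    ]
};
const fakeMapper = {
    getViewModel: d => ({ title: d.name })
};

// Dependencies are captured once and live as long as the adapter.
function createDataAdapter(repository, mapper) {
    return Object.freeze({
        getModels(id) {
            // Everything below lives only for the duration of this call.
            const data = repository.getData(id);
            const activeData = data.filter(d => d.isActive);
            return activeData.map(d => mapper.getViewModel(d));
        }
    });
}

const adapter = createDataAdapter(fakeRepository, fakeMapper);
console.log(adapter.getModels(7)); // [ { title: 'first' } ]
```

The repository and mapper outlive every call, while data, activeData and the returned models are created and discarded per id.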

This pattern is so common, especially with DI, that at one time C# 6.0 was to introduce primary constructors. This feature was ultimately dropped, but something even better might reemerge in a few years. Other languages, like Scala and TypeScript, already support this concept.

What happened to design patterns?

I vaguely remember early on in my career using a lot more design patterns in my day-to-day programming. Earlier last year, I ran into a situation where working directly with data coming from my backend was leading to a lot of if/else nonsense. I thought, "How great would it be if I could just spin up an implementation and let it do the right thing polymorphically?"

It turns out it is hard to combine factory methods/classes with dependency injection, without becoming excessively dependent on your DI framework. I came up with an entire pattern that makes working with factory classes easier in the context of DI frameworks. I go into more details about this approach in another article I wrote: https://gist.github.com/jehugaleahsa/036952422c6f5739684e.

A more functional approach?

Functional programmers might snicker at me struggling with factory classes. One of the biggest criticisms of proper object-oriented programming is the excessive use of factory classes. This is strange considering the proliferation of factory methods appearing in functional programming languages. Functions return other functions quite often, which, as I mentioned earlier, can simulate classes with private state.

Dependency injection is the main reason for all this contortion. It would seem DI in functional programming circles is a lot less common. It's not because they are building simple scripts, but that they have less painful ways of injecting dependencies manually. Consider currying: you basically convert a function taking N parameters into a chain of functions taking one parameter each:

let getModels (repository: IDataRepository) (mapper: IDataMapper) (id: int) : ViewModel list =
    id
    |> repository.getData
    |> Seq.filter (fun d -> d.IsActive)
    |> Seq.map mapper.getViewModel
    |> Seq.toList

As you can see from this F# example, ML-based languages support currying out of the box. How does this help us? Well, rather than working directly with getModels, you can initialize instances of IDataRepository and IDataMapper at the top of the application and bind them:

let boundGetModels = getModels(repository)(mapper)

Later on, boundGetModels can be called directly, just passing in id.

let models = boundGetModels(id)

I think this manual process is still tedious, requires you to make boundGetModels globally available somehow and ultimately it doesn't scale. However, without a bunch of classes getting in the way, operations just become a stream of function calls. Maybe the reason functional programs seem more "scripty" is because there isn't as much boilerplate. Without state, functional programming starts looking pretty advantageous.
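
Languages without built-in currying can fake it with nested arrow functions. The JavaScript sketch below mirrors the F# above; stubRepository and stubMapper are invented in-memory dependencies, not real services.

```javascript
// Curried by hand: one parameter per function, dependencies first.
const getModels = repository => mapper => id =>
    repository.getData(id)
        .filter(d => d.isActive)
        .map(d => mapper.getViewModel(d));

// Hypothetical in-memory dependencies, created once at startup.
const stubRepository = {
    getData: id => [
        { id, name: "kept", isActive: true },
        { id, name: "dropped", isActive: false }
    ]
};
const stubMapper = { getViewModel: d => ({ title: d.name }) };

// Bind the dependencies once...
const boundGetModels = getModels(stubRepository)(stubMapper);

// ...and call with just the id later.
console.log(boundGetModels(42)); // [ { title: 'kept' } ]
```

The bound function plays the role of the DataAdapter instance: the dependencies are baked in, and only the per-request value is left to supply.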

Conclusions

I think I learned functional programming at just the right time. Most of the latest tools are utilizing concepts shared in common with functional programming, e.g., Reactive programming. More functional language features are appearing in C#, Java, C++, etc. Moreover, it ultimately results in less code because you're reusing algorithms already written by someone else. Plus, since it's more declarative in nature, future versions of languages may even be able to automatically parallelize your code.

I highly recommend that anyone interested in learning functional programming watch MIT's Structure and Interpretation of Computer Programs (SICP) lectures. While at first they seem slow, old or silly, some of the simple concepts they cover ultimately become essential to understanding functional programming.

Finally, don't let some of the more complicated functional programming concepts scare you away. What you'll find is that many concepts show up in many different forms in many different languages. I feel sometimes the communities surrounding certain languages are too specialized. I recently got very frustrated when trying to learn about monads, and it wasn't until I watched a video starring Douglas Crockford that I realized I had been using them for years. I think as more people from different backgrounds are exposed to these ideas, we'll come up with better names and more concrete examples.
