@Shadowbeetle
Forked from kubischj/elixir-behaviour-protocol.md
Last active May 14, 2024 09:06

On Elixir protocols & behaviours for JavaScript/TypeScript developers

Coming from TypeScript, the difference between behaviours and protocols in Elixir might not be immediately obvious. Both look pretty much like an interface seen from two different angles. In this post we'll go into detail to clear up how they differ, and also how they are similar.

tldr

Behaviours:

Behaviours are a way to define a set of functions that a module must implement. Behaviours are defined with a list of functions and their return types, without providing implementations for them. Other modules can then declare that they implement this behaviour by providing implementations for all the functions specified. They are more of a concept and the language constructs around it merely facilitate its documentation, and provide hints for the implementation.

Protocols:

Protocols provide a way to implement the same functionality for different types of data. They allow you to define a set of functions once, together with separate implementations for each data type or struct of your choosing. You can extend them later to support new types without modifying existing implementations.

If you prefer a more allegorical explanation, I think LostKobrakai summarized it best on Elixirforum.

While most of you reading this will probably never define any Behaviours or Protocols, it's still worth understanding these concepts and the differences between them for two reasons:

  1. It's easier to understand the documentation of libraries you meet.
  2. When you design software with technologies you don't understand deeply and you meet a complex problem, the feeling creeps in that there is something you read about that might make it simpler, but you just don't have the time to fully understand and utilize it. Most of the time, that turns out to be wishful thinking. With a deeper understanding of the nuts and bolts of said technology, you can free yourself from these unhelpful thoughts.

But that's enough talk already, let's dive in.

Behaviours

Behaviours are defined by modules that expect another module as a parameter, most of the time passed in as an argument to the start_link function, or as a config field. This provides a form of inversion of control. Using the @callback and @macrocallback attributes, the defining module declares which functions the input module must implement for everything to work. These declarations mostly serve documentation and linting purposes. You can annotate your module with any number of callbacks that specify the functions your module will be calling, and indicate the expected argument and return types for them.

Let's take an example: we have a piece of software that calculates the price of real estate based on the shape of its floor area. It calculates the area first, then multiplies it by the price per square meter. The library only provides the logic, and lets us, the users, decide where we pull the prices from. So in the example below, the RealEstate module expects the module passed as the second parameter of its get_price function to define a get_price_per_square_meter function that returns an integer:

defmodule RealEstate do
  @callback get_price_per_square_meter() :: integer()

  @spec get_price(TwoDShape.t(), module()) :: number()
  def get_price(shape, module) do
    TwoDShape.area(shape) * module.get_price_per_square_meter()
  end
end

So we define a module that handles just that:

defmodule PricePerSquareMeter do
  @behaviour RealEstate
  @impl RealEstate
  def get_price_per_square_meter() do
    RealEstateAPIProvider.get_prices()
  end
end

Note that we state which behaviour we're implementing using the @behaviour module attribute, then annotate each function that implements it using @impl. This way the compiler can warn us if we don't implement the interface properly. However, it's up to us whether we actually annotate our functions: the code compiles without the annotations as well, we just lose the helpful warnings. That's why you'll find a lot of examples that omit these explicit markings.

With that done, you can call get_price, passing it the PricePerSquareMeter module as such:

defmodule BehaviourExample do
  @spec real_estate_price() :: :ok
  def real_estate_price() do
    land_price =
      RealEstate.get_price(%Circle{radius: 60}, PricePerSquareMeter)
    IO.puts("Land price #{land_price}")

    house_price =
      RealEstate.get_price(%Rectangle{width: 60, height: 40}, PricePerSquareMeter)
    IO.puts("House price #{house_price}")

    pyramid_price =
      RealEstate.get_price(%Triangle{base: 23, height: 55}, PricePerSquareMeter)
    IO.puts("Pyramid price #{pyramid_price}")
  end
end

(Of course, calculating real estate prices is a lot more complex, but hopefully this gives an idea how one can go about creating behaviours.)
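Since @behaviour is only checked at compile time, a library that receives a module at runtime may want to verify it before calling into it. The following is a minimal sketch of such a check, not something the RealEstate example requires. PriceSource, FixedPrice and BehaviourCheck are hypothetical names made up for this illustration; behaviour_info/1 is the function the compiler generates for modules that declare callbacks.

```elixir
defmodule PriceSource do
  # The behaviour: any price source must export this function
  @callback get_price_per_square_meter() :: integer()
end

defmodule FixedPrice do
  @behaviour PriceSource

  @impl PriceSource
  def get_price_per_square_meter(), do: 1_000
end

defmodule BehaviourCheck do
  # Returns true when `module` exports every callback declared by `behaviour`
  @spec implements?(module(), module()) :: boolean()
  def implements?(module, behaviour) do
    Code.ensure_loaded?(module) and
      Enum.all?(behaviour.behaviour_info(:callbacks), fn {name, arity} ->
        function_exported?(module, name, arity)
      end)
  end
end

IO.inspect(BehaviourCheck.implements?(FixedPrice, PriceSource)) # true
IO.inspect(BehaviourCheck.implements?(String, PriceSource))     # false
```

Note that this only checks names and arities. Whether the functions return what the callback spec promises is still up to the implementer.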

An example of a simple behaviour that everyone encounters sooner or later is the Swoosh Adapter for email delivery, letting you use Swoosh to send emails using whatever custom delivery method your setup needs. Swoosh comes with a lot of the common email services already available, while you can also implement the adapter behaviour it defines to conform to your custom setup. The only thing you need to do is provide a module that defines the deliver, deliver_many, validate_config and validate_dependency functions.

In a real-life scenario, you'd invoke the use Swoosh.Adapter macro, which inserts the necessary code for you, but for the sake of the example, we'll implement the behaviour explicitly.

# my_adapter.ex

defmodule MyApp.MyAdapter do
  @behaviour Swoosh.Adapter

  @impl Swoosh.Adapter
  def validate_config(config) do
    required_config = [:api_key]
    Swoosh.Adapter.validate_config(required_config, config)
  end

  @impl Swoosh.Adapter
  def validate_dependency do
    Swoosh.Adapter.validate_dependency([MyApp.MailApiClient])
  end

  @impl Swoosh.Adapter
  def deliver(email, config) do
    MyApp.MailApiClient.post!("https://my-service.org/deliver", Jason.encode!(email), config)
  end

  @impl Swoosh.Adapter
  def deliver_many(emails, config) do
    MyApp.MailApiClient.post!("https://my-service.org/deliver_many", Jason.encode!(emails), config)
  end
end

And then, in either your config.exs or runtime.exs, you tell Swoosh to use your implementation of its adapter behaviour.

# config.exs || runtime.exs

  config :my_app, MyApp.Mailer,
    adapter: MyApp.MyAdapter,
    api_key: api_key
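To give an idea of what a use macro can do for a behaviour, here is a rough sketch of the general pattern. It is not Swoosh's actual source: Notifier and ConsoleNotifier are hypothetical modules, and the only assumption is that the library wants to declare the behaviour and provide an overridable default for one of its callbacks.

```elixir
defmodule Notifier do
  @callback deliver(message :: String.t()) :: :ok | {:error, term()}
  @callback validate_config(config :: keyword()) :: :ok

  # `use Notifier` expands to the quoted code below inside the calling module
  defmacro __using__(_opts) do
    quote do
      @behaviour Notifier

      # Default implementation, so adapters only have to write deliver/1
      @impl Notifier
      def validate_config(_config), do: :ok

      # Allow adapters to replace the default if they need to
      defoverridable validate_config: 1
    end
  end
end

defmodule ConsoleNotifier do
  use Notifier

  @impl Notifier
  def deliver(message) do
    # IO.puts/1 returns :ok, which satisfies the callback spec
    IO.puts(message)
  end
end

ConsoleNotifier.validate_config(api_key: "secret")
ConsoleNotifier.deliver("hello")
```

The implementing module stays short, and the compiler still warns if deliver/1 is missing, because the macro sets @behaviour on its behalf.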

Behaviours in Elixir can be thought of as a blueprint for the functions a module needs in order to operate. The callbacks are also annotated with types, enhancing the documentation you can generate. This approach contrasts with JavaScript's inversion of control, where libraries usually require a single callback as a parameter. Before you flip out, I'm not talking about the dreaded callback hell from before the time of async-await, but rather the app.get(callback) or socket.on('message', callback) pattern.

However, we can find similar examples in JavaScript land too, as Vue implements pretty much the same concept for component data and lifecycle methods:

import { ref } from 'vue'

export default {
  setup () {
    // initialData is a placeholder for whatever the component starts with
    const data = ref(initialData)
    return {
      data
    }
  },
  mounted () {
    this.loadDataFromApi()
  },
  beforeUnmount () {
    this.cleanup()
  }
}

Here, we export an object for the library to consume. In Elixir the logic is reversed: functions are public unless they are marked as private with defp. The idea is the same, though: a collection of functions is passed to the library, providing instructions on what to do when certain events occur.

This is also how Phoenix LiveView implements its functionality. You could implement a view similar to the Vue setup above like this:

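defmodule AppWeb.PageLive do
  use Phoenix.LiveView

  @impl Phoenix.LiveView
  def mount(_params, _session, socket) do
    # load_data_from_db/0 stands in for your own data access code
    {:ok, assign(socket, data: load_data_from_db())}
  end

  @impl Phoenix.LiveView
  def handle_event("event", params, socket) do
    # apply_event/2 stands in for your own logic; it must return the updated socket
    {:noreply, apply_event(socket, params)}
  end
end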

You can see it's mostly the same idea, except LiveView expects you to provide the various handlers when implementing the behaviour.

Closing the discussion on Behaviours, there is one thing to note: when you browse the Elixir documentation looking for Behaviour, you'll find a deprecated module with that name. Don't be fooled though: behaviours themselves are not deprecated. The module is there because it used to hold macros for defining Behaviours, and it was deprecated in favour of the @callback and @macrocallback module attributes. This is noted in the documentation, but it might still cause some avoidable fright. You might also run into remnants of the deprecated module in code written for older versions.

Protocols

Protocols are all about data manipulation, and while they describe an interface, they achieve a lot more than the interface keyword in TypeScript. In a defprotocol ... do ... end block you define the functions one must implement for the protocol to be applicable to a given data type, just like interfaces do with classes. On top of that, protocols are also responsible for dispatching function calls to their respective implementations.

Let's revisit our first example, where we wanted to calculate the area of a 2D shape. In TypeScript, we can create an interface called TwoDShape that declares what methods a class must implement for it to be considered a TwoDShape. In our case we'll define a circle, a rectangle and a triangle.

interface TwoDShape {
  area(): number;
}

class Circle implements TwoDShape {
  constructor(public radius: number) {}
  area() {
    return Math.PI * Math.pow(this.radius, 2);
  }
}
class Rectangle implements TwoDShape {
  constructor(public width: number, public height: number) {}

  area() {
    return this.width * this.height;
  }
}

class Triangle implements TwoDShape {
  constructor(public base: number, public height: number) {}

  area() {
    return (1 / 2) * this.base * this.height;
  }
}

As we know, an interface merely provides type safety: we can declare that a given function expects a TwoDShape, which, thanks to TypeScript's structural typing, is any object with a matching area method, whether or not its class is explicitly marked as implementing the interface. So our RealEstate.getPrice method does not need to care what kind of shape it's working with, as long as it's a 2D shape.

class RealEstate {
  getPrice(shape: TwoDShape, pricePerSquareMeter: number) {
    return shape.area() * pricePerSquareMeter;
  }
}

Elixir's protocol looks similar to the interface definition.

# two_d_shape.ex
defprotocol TwoDShape do
  @spec area(t) :: float() | integer()
  def area(shape)
end

defimpl TwoDShape, for: Triangle do
  def area(%Triangle{base: base, height: height}) do
    1 / 2 * base * height
  end
end

# triangle.ex
defmodule Triangle do
  @enforce_keys [:base, :height]
  defstruct [:base, :height]

  # the type definition is not mandatory, it only helps generating the documentation
  @type t() :: %__MODULE__{
          base: integer(),
          height: integer()
        }
end

# circle.ex
defmodule Circle do
  @enforce_keys [:radius]
  defstruct [:radius]

  # the type definition is not mandatory, it only helps generating the documentation
  @type t() :: %__MODULE__{
          radius: integer()
        }

  defimpl TwoDShape do
    def area(%Circle{radius: radius}) do
      :math.pi() * :math.pow(radius, 2)
    end
  end
end

# rectangle.ex
defmodule Rectangle do
  @enforce_keys [:width, :height]
  defstruct [:width, :height]

  # the type definition is not mandatory, it only helps generating the documentation
  @type t() :: %__MODULE__{
          width: integer(),
          height: integer()
        }

  defimpl TwoDShape do
    def area(%Rectangle{width: width, height: height}) do
      width * height
    end
  end
end

There are two things to note here. One is that, while in TS the class itself declares that it implements a given interface, in Elixir the implementation can live either next to the data type (struct) or next to the protocol itself. This can be extremely useful if you want to create a Protocol that needs to be implemented for built-in types, but you also want to give your users the freedom to implement it for their own types. The other thing to note is that while we provided type specs, Elixir is not a statically typed language, so they are not mandatory and a violated spec won't break the compilation; at most, tools such as Dialyzer report it as a warning. So in essence we only get an error at runtime if a Protocol is not implemented for a given struct.
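To make the runtime error concrete, here is a small sketch. Area, Square, Hexagon and SafeArea are hypothetical names used only for this illustration; Protocol.UndefinedError is the exception Elixir raises when a protocol function is called with a value that has no implementation.

```elixir
defprotocol Area do
  def area(shape)
end

defmodule Square do
  defstruct [:side]
end

defmodule Hexagon do
  # Deliberately left without an Area implementation
  defstruct [:side]
end

defimpl Area, for: Square do
  def area(%Square{side: side}), do: side * side
end

defmodule SafeArea do
  # Wraps the protocol call and turns the runtime error into a tagged tuple
  def compute(shape) do
    {:ok, Area.area(shape)}
  rescue
    Protocol.UndefinedError -> {:error, :not_a_shape}
  end
end

IO.inspect(SafeArea.compute(%Square{side: 3}))  # {:ok, 9}
IO.inspect(SafeArea.compute(%Hexagon{side: 3})) # {:error, :not_a_shape}
```

Nothing complains at compile time about the missing Hexagon implementation, which is exactly the difference from a TypeScript interface.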

Now let's see how the Protocol is used in action!

defmodule Protocol.RealEstate do
  @spec get_price(TwoDShape.t(), integer()) :: number()
  def get_price(shape, price_per_square_meter) do
    TwoDShape.area(shape) * price_per_square_meter
  end
end

Let's see the interesting part from the TS and Elixir implementations side-by-side:

return shape.area() * pricePerSquareMeter;

In TypeScript, we call the area method that's attached to the object.

TwoDShape.area(shape) * price_per_square_meter

While in Elixir we call the area function of the TwoDShape module by passing it an arbitrary struct. The runtime then determines which specific area function to call based on the struct instance provided. In Elixir, structs do not have methods of their own. While functions related to a struct might be grouped in the same module, they are essentially standalone functions that operate on specific data types. This is where Protocols come into play, forming a system that associates certain structs with specific functions for effective dispatch when needed.
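The dispatch step can be observed directly, because every protocol also exposes an impl_for/1 function that returns the implementation module for a value, or nil if there is none. The sketch below uses a hypothetical Area protocol and Square struct rather than the TwoDShape modules, so that it runs on its own.

```elixir
defprotocol Area do
  def area(shape)
end

defmodule Square do
  defstruct [:side]
end

defimpl Area, for: Square do
  def area(%Square{side: side}), do: side * side
end

square = %Square{side: 2}

# The protocol looks up the implementation module for the given value...
impl = Area.impl_for(square)
IO.inspect(impl) # Area.Square

# ...and calling that module directly gives the same result as the protocol call
IO.inspect(impl.area(square)) # 4
IO.inspect(Area.area(square)) # 4

# Values without an implementation have no module to dispatch to
IO.inspect(Area.impl_for(:not_a_shape)) # nil
```

In other words, defimpl Area, for: Square defines an ordinary module named Area.Square, and the protocol is the piece that picks it for you.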

Probably the most commonly used Protocol in Elixir is Enumerable, which we use whenever we call functions from the Enum and Stream modules. Enumerable has four required functions: count, reduce, slice and member?, and allows iterating over values of the data types the protocol is implemented for.

defprotocol Enumerable do
  def reduce(enumerable, acc, fun)
  
  def count(enumerable)
  
  def member?(enumerable, element)
  
  def slice(enumerable)  
end

We can take a look at the Enumerable implementation for Maps:

def reduce(map, acc, fun) do
  Enumerable.List.reduce(:maps.to_list(map), acc, fun)
end

Reduce uses the implementation for lists, by first converting the map to a list, and forwarding the accumulator and the supplied function to Enumerable.List.reduce. Let's take a look at it in turn.

defimpl Enumerable, for: List do
  [...]
  
  def reduce(_list, {:halt, acc}, _fun), do: {:halted, acc}
  def reduce(list, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(list, &1, fun)}
  def reduce([], {:cont, acc}, _fun), do: {:done, acc}
  def reduce([head | tail], {:cont, acc}, fun), do: reduce(tail, fun.(head, acc), fun)
end

Let's start from the bottom. When reduce is called with a non-empty list (the [head | tail] pattern), it simply calls itself again with the tail of the list, applying the provided fun to the head and the accumulator to get the next tagged accumulator. This continues until the list is empty, at which point the function returns {:done, acc}. The two clauses at the top handle the cases where reduce is called with {:halt, acc} or {:suspend, acc} instead of {:cont, acc}. This makes sense if we think about how Enum.reduce or Array.prototype.reduce works, but what's with these extra tuples everywhere? They provide finer control over when the iteration ends. Take the Enum.any? function, for example.

  def any?(enumerable, fun) do
    Enumerable.reduce(enumerable, {:cont, false}, fn entry, _ ->
      if fun.(entry), do: {:halt, true}, else: {:cont, false}
    end)
    |> elem(1)
  end

The provided enumerable is reduced, until the provided handler function returns a truthy value, at which point a {:halt, true} tuple is returned, and then the second element of the tagged tuple is extracted with |> elem(1).
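The same halting trick works in our own code. As a sketch, the hypothetical EarlyExit.first_over/2 below returns the first element greater than a limit and stops iterating as soon as it finds one. In application code you would normally reach for Enum.find/2 or Enum.reduce_while/3 instead of calling the protocol directly; this only illustrates the tagged tuples.

```elixir
defmodule EarlyExit do
  # Returns the first element greater than `limit`, or nil if there is none
  def first_over(enumerable, limit) do
    enumerable
    |> Enumerable.reduce({:cont, nil}, fn entry, _acc ->
      # {:halt, value} stops the iteration, {:cont, value} moves on to the next element
      if entry > limit, do: {:halt, entry}, else: {:cont, nil}
    end)
    # The result is {:halted, value} or {:done, value}; we only need the value
    |> elem(1)
  end
end

IO.inspect(EarlyExit.first_over([1, 5, 10], 3))   # 5
IO.inspect(EarlyExit.first_over([1, 2], 3))       # nil
IO.inspect(EarlyExit.first_over(1..1_000_000, 6)) # 7
```

The last call shows why halting matters: the range has a million elements, but only the first seven are ever visited.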

Based on that, we can make sense of the other implementations as well.

defimpl Enumerable, for: Map do
  def count(map) do
    {:ok, map_size(map)}
  end

For getting the count, the implementation simply calls the map_size kernel function.

  def member?(map, {key, value}) do
    {:ok, match?(%{^key => ^value}, map)}
  end

  def member?(_map, _other) do
    {:ok, false}
  end

The member? function uses pattern matching to decide if the key and value pair is present in the Map or not.

  def slice(map) do
    size = map_size(map)
    {:ok, size, &:maps.to_list/1}
  end

As for slice, in the end it will convert the Map to a List and slice it using the implementation for Lists.
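Putting the four functions together, this is roughly what an implementation for a struct of our own could look like. Countdown is a hypothetical struct that enumerates the integers from a starting value down to zero. Its reduce mirrors the list implementation shown earlier, and slice returns {:error, __MODULE__}, which tells Enum to fall back to its default reduce-based algorithm.

```elixir
defmodule Countdown do
  defstruct [:from]
end

defimpl Enumerable, for: Countdown do
  def count(%Countdown{from: from}), do: {:ok, max(from + 1, 0)}

  def member?(%Countdown{from: from}, value) do
    {:ok, is_integer(value) and value >= 0 and value <= from}
  end

  # No efficient slicing here, so let Enum use its reduce-based default
  def slice(_countdown), do: {:error, __MODULE__}

  def reduce(_countdown, {:halt, acc}, _fun), do: {:halted, acc}

  def reduce(countdown, {:suspend, acc}, fun) do
    {:suspended, acc, &reduce(countdown, &1, fun)}
  end

  # Below zero there is nothing left to emit
  def reduce(%Countdown{from: from}, {:cont, acc}, _fun) when from < 0, do: {:done, acc}

  # Emit the current number, then continue with the next lower one
  def reduce(%Countdown{from: from}, {:cont, acc}, fun) do
    reduce(%Countdown{from: from - 1}, fun.(from, acc), fun)
  end
end

IO.inspect(Enum.to_list(%Countdown{from: 3}))          # [3, 2, 1, 0]
IO.inspect(Enum.map(%Countdown{from: 3}, &(&1 * 2)))   # [6, 4, 2, 0]
IO.inspect(Enum.take(%Countdown{from: 1_000_000}, 2))  # [1000000, 999999]
```

Once the protocol is implemented, every function in Enum and Stream works on the struct without any further code.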

Regarding protocols, general implementations, deriving and fallback to any still remain, but I think the documentation is satisfying on these topics, and I don't want to pointlessly rephrase it here just for the sake of spilling characters on a screen. The only thing that needs to be added, is that you'll probably want to take a look at Jason.Encoder, as most of the time you'll derive it for your specific structs when you want to send them over the wire using Phoenix.
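For a quick taste of the fallback mechanism without pulling in any dependency, here is a minimal sketch. Describe is a hypothetical protocol; @fallback_to_any and the special Any implementation target are the parts that come from Elixir itself.

```elixir
defprotocol Describe do
  # Values without a dedicated implementation are dispatched to the Any implementation
  @fallback_to_any true
  def describe(term)
end

defimpl Describe, for: Any do
  def describe(_term), do: "something"
end

defimpl Describe, for: Integer do
  def describe(number), do: "the integer #{number}"
end

IO.puts(Describe.describe(3))         # the integer 3
IO.puts(Describe.describe(:anything)) # something
```

Without @fallback_to_any true, the second call would raise Protocol.UndefinedError instead of reaching the Any implementation.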

Behaviours & Protocols

In summary, behaviours and protocols are pretty similar. The main differences are that behaviours are more of a documentation aid and are about modules, while protocols are about data. Both of them are mostly useful for library creators who wish to share their code with a wider audience, while users would most likely find themselves on the other side, creating modules to satisfy behaviours, calling functions that use protocols, and sometimes even implementing protocols for their own data.
