Skip to content

Instantly share code, notes, and snippets.

What would you like to do?
Basic considerations when creating a module in OCaml

Basic Module Design

First-In-First-Out Stack

module FIFO : sig
  exception Empty

  type 'a t
  val empty: 'a t
  val size : 'a t -> int
  val peek : 'a t -> 'a option
  val push : 'a -> 'a t -> 'a t
  val pop  : 'a t -> 'a t (* Empty *)


  • Keep representation abstract
  • Functions that introduce t values
  • Functions that consume or observe t values
  • Type design should exclude most illegal states
  • Think about client code using this module


Abstractions can be either

  • immmutable (like FIFO.t above), or
  • mutable (like Hashtbl.t)


The type system prevents most errors, we still have:

  • Checked runtime errors - leading to exceptions
  • Unchecked runtime errors
  • Choice how to report errors: exceptions or optional values


Immutable abstractions are easier to describe than mutable abstractions using invariants and algebraic equations:

size(empty) = 0
size(push(x,s)) = size(s) + 1
size(pop(s)) = size(s) - 1
pop(empty)= undefined
pop(push(x,empty)) = empty
pop(push(x,s)) = push(x,pop(s)), s != empty
pop(push(y,push(x,empty))) = push(y,empty)
peek(empty) = None
peek(push(x,empty)) = Some(x)
peek(push(x,s)) = peek(s), s != empty
peek(push(y,push(x,empty))) = Some(x)

This is an informal notation which is still very useful and concise to capture the interaction between functions. These equations can be translated into unit tests. (Some people might prefer to write tests rather than invariants.)


Equality in OCaml as defined by = and <> is structural and might not be the right choice to compare values. For example:

  • A set abstraction can have many internal representations for the same set. Hence, structural equality would be wrong to use for set equality.

  • The FIFO stack above may or may not have a canonical internal representation - it could depend on the order of operations. As it stands, probably only empty would be safe to use in =. It would be even safer to provide an is_empty predicate in the signature.

  • Using = on a FIFO.t value could be an unchecked runtime error.


If an abstraction consumes other values - what assumptions do we have to make over them?

  • If our implementation is agnostic over the values it consumes, we can use parametric polymorphism: easy and efficient.
  • If we have to make assumptions over values, we need access to their properties. This leads to an implementation as a functor such that the functor argument provides access to all functions that we need.

In the case of the FIFO stack we simply store values but never inspect them. This makes it impossible to print them, for example, in the implementation. We also should not compare or order them in the implementation because we can't assume that eqality and order are well defined.

Other Considerations:

  • Complexity
  • Type Variance (in combination with polymorphic variants)


The idea of an efficient, functional implementation is to use two lists:

  • type 'a t = 'a list * 'a list

  • We push to the front of the second list.

    # push 1 empty |> push 2 |> push 3;;
    - : int list * int list = ([1], [3; 2])
  • We pop from the front of the first list

  • If the first list becomes empty, the reversed second list becomes the first list

    # push 1 empty |> push 2 |> push 3 |> pop
    - : int list * int list = ([2; 3], [])
    # push 1 empty |> push 2 |> push 3 |> pop |> push 4 |> push 5;;
    - : int list * int list = ([2; 3], [5; 4])
  • Reversing the second list has complexity O(n) but this effort is amortized over all push/pop operations, which makes them O(1) on average.


exception Empty
type 'a t = 'a list * 'a list

let empty = ([], [])

let size (xs, ys) = List.length xs + List.length ys

(* invariant: the left list is never empty unless the 
   stack is empty *)

let peek = function
  | x::_  , _     -> Some x
  | []    , []    -> None
  | []    , _     -> assert false

let push y = function
  | []    , []    -> [y], []
  | xs    , ys    -> xs, y::ys
  | []    , _ys   -> assert false

let pop = function
  | []    , []    -> raise Empty
  | [_]   , ys    -> List.rev ys, []
  | _::xs , ys    -> xs         , ys
  | []    , ys    -> assert false

Note: the stack containing 1..5 could have two representations:

  • ([1; 2], [5; 4; 3])
  • ([1] , [5; 4; 3; 2])
Copy link

mseri commented Apr 25, 2019

@lindig nice summary:

  • there is a small typo: euqality -> equality

  • why not module type FIFO on the first block, and then module Fifo: FIFO ... in the last? So you can also show where abstraction and implementation details come into play?

Copy link

mseri commented Apr 25, 2019

How much are you going to say about type variance?

How do you deal with undefined behaviour (pop(empty)) across multiple implementations?

How do you test algebraic properties like the ones you presented?

Copy link

lindig commented Apr 26, 2019

Thanks for the feedback.

  • I think type variants demands its own session.
  • You are right that I should have used the signature to constrain the implementation to make the connection more obvious.
  • I'm not sure the algebraic properties are correct. I want to show that it makes sense to concisely document the interaction between functions in a module. This is not always easy and it is ad hoc because making this precise you end up writing theorems. If the abstraction is mutable, you can't use these but still probably want something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment