lindig/basic-module-design.md

## basic-module-design.md

      
    Raw
  

              basic-module-design.md
            
          
    Basic Module Design

First-In-First-Out Stack

module FIFO : sig
  exception Empty

  type 'a t
  val empty: 'a t
  val size : 'a t -> int
  val peek : 'a t -> 'a option
  val push : 'a -> 'a t -> 'a t
  val pop  : 'a t -> 'a t (* Empty *)
end

Signature


Keep representation abstract
Functions that introduce t values
Functions that consume or observe t values
Type design should exclude most illegal states
Think about client code using this module

Mutability

Abstractions can be either

immmutable (like FIFO.t above), or
mutable (like Hashtbl.t)

Errors

The type system prevents most errors, we still have:

Checked runtime errors - leading to exceptions
Unchecked runtime errors
Choice how to report errors: exceptions or optional values

Invariants

Immutable abstractions are easier to describe than mutable abstractions
using invariants and algebraic equations:
size(empty) = 0
size(push(x,s)) = size(s) + 1
size(pop(s)) = size(s) - 1
pop(empty)= undefined
pop(push(x,empty)) = empty
pop(push(x,s)) = push(x,pop(s)), s != empty
pop(push(y,push(x,empty))) = push(y,empty)
peek(empty) = None
peek(push(x,empty)) = Some(x)
peek(push(x,s)) = peek(s), s != empty
peek(push(y,push(x,empty))) = Some(x)

This is an informal notation which is still very useful and concise to
capture the interaction between functions. These equations can be
translated into unit tests. (Some people might prefer to write tests
rather than invariants.)
Equality

Equality in OCaml as defined by = and <> is structural and might not
be the right choice to compare values. For example:


A set abstraction can have many internal representations for the same
set. Hence, structural equality would be wrong to use for set
equality.


The FIFO stack above may or may not have a canonical internal
representation - it could depend on the order of operations. As it
stands, probably only empty would be safe to use in =.  It would
be even safer to provide an is_empty predicate in the signature.


Using = on a FIFO.t value could be an unchecked runtime error.


Polymorphism

If an abstraction consumes other values - what assumptions do we have to
make over them?

If our implementation is agnostic over the values it consumes, we can
use parametric polymorphism: easy and efficient.
If we have to make assumptions over values, we need access to their
properties.  This leads to an implementation as a functor such that
the functor argument provides access to all functions that we need.

In the case of the FIFO stack we simply store values but never inspect
them. This makes it impossible to print them, for example, in the
implementation. We also should not compare or order them in the
implementation because we can't assume that eqality and order are well
defined.
Other Considerations:


Complexity
Type Variance (in combination with polymorphic variants)

Implementation

The idea of an efficient, functional implementation is to use two lists:


type 'a t = 'a list * 'a list


We push to the front of the second list.
# push 1 empty |> push 2 |> push 3;;
- : int list * int list = ([1], [3; 2])


We pop from the front of the first list


If the first list becomes empty, the reversed
second list becomes the first list
# push 1 empty |> push 2 |> push 3 |> pop
- : int list * int list = ([2; 3], [])

# push 1 empty |> push 2 |> push 3 |> pop |> push 4 |> push 5;;
- : int list * int list = ([2; 3], [5; 4])


Reversing the second list has complexity O(n) but this effort is
amortized over all push/pop operations, which makes them O(1) on
average.


Code

exception Empty
type 'a t = 'a list * 'a list

let empty = ([], [])

let size (xs, ys) = List.length xs + List.length ys

(* invariant: the left list is never empty unless the 
   stack is empty *)

let peek = function
  | x::_  , _     -> Some x
  | []    , []    -> None
  | []    , _     -> assert false

let push y = function
  | []    , []    -> [y], []
  | xs    , ys    -> xs, y::ys
  | []    , _ys   -> assert false

let pop = function
  | []    , []    -> raise Empty
  | [_]   , ys    -> List.rev ys, []
  | _::xs , ys    -> xs         , ys
  | []    , ys    -> assert false

Note: the stack containing 1..5 could have two representations:

([1; 2], [5; 4; 3])
([1]   , [5; 4; 3; 2])