@brendanzab
Last active May 28, 2019 11:18
Experimenting with what category theory might look like in Pikelet (https://github.com/brendanzab/pikelet)
record {
    flip, always, (<|), (|>),
    Category, id, compose, (<<), (>>),
    Functor, map,
} where {
    ||| Flip the order of the first two arguments to a function
    flip : {a b c : Type} -> (a -> b -> c) -> (b -> a -> c);
    flip f x y = f y x;

    ||| Create a function that always returns the same value
    always : {a b : Type} -> a -> b -> a;
    always x _ = x;

    ||| Backward function application
    (<|) : {a b : Type} -> (a -> b) -> a -> b;
    (<|) f x = f x;

    ||| Forward function application
    (|>) : {a b : Type} -> a -> (a -> b) -> b;
    (|>) x f = f x;

    ||| A category is a very general structure that provides a common way of
    ||| composing units of functionality
    |||
    ||| The most common category programmers will be familiar with is the one
    ||| where `Type`s are the objects, and the functions between those types
    ||| are the arrows. Many other categories exist though, for example:
    |||
    ||| - nodes in a directed graph, and the edges between those nodes
    ||| - etc.
    Category = Record {
        ||| An object in the category
        Object : Type;
        ||| Arrows between the objects in the category
        Arrow : Object -> Object -> Type;
        ||| The identity arrow
        id : {a : Object} -> Arrow a a;
        ||| The composition of two arrows
        compose : {a b c : Object} -> Arrow b c -> Arrow a b -> Arrow a c;
        -- TODO: Laws via quickcheck or proofs?
    };

    id : {{C : Category}} {a : C.Object} -> C.Arrow a a;
    id {{C}} = C.id;

    compose : {{C : Category}} {a b c : C.Object} -> C.Arrow b c -> C.Arrow a b -> C.Arrow a c;
    compose {{C}} = C.compose;

    ||| Backward composition of arrows
    |||
    ||| This is most often used for function composition
    (<<) = compose;

    ||| Forward composition of arrows
    |||
    ||| This is most often used for function composition
    (>>) = flip compose;

    ||| Provides a mapping from objects-to-objects and arrows-to-arrows for two
    ||| categories, `Source` and `Target`
    |||
    ||| Mappings can be anything from applying a function to each element of a
    ||| collection, to compiling a source language to a target language.
    |||
    ||| Haskell programmers might find this definition a little foreign - this
    ||| is because we use general categories in the definition, rather than
    ||| specializing it to the category of Pikelet functions
    Functor = Record {
        ||| The source category
        Source : Category;
        ||| The target category
        Target : Category;
        ||| Maps an object in `Source` to an object in `Target`
        Map : Source.Object -> Target.Object;
        ||| Maps an arrow in `Source` into an arrow in `Target`
        map : {a b : Source.Object} -> Source.Arrow a b -> Target.Arrow (Map a) (Map b);
        -- TODO: Laws via quickcheck or proofs?
    };

    map : {{F : Functor}} {a b : F.Source.Object} -> F.Source.Arrow a b -> F.Target.Arrow (F.Map a) (F.Map b);
    map {{F}} = F.map;

    ||| A `Functor` from a category to itself
    EndoFunctor = Record {
        ..Functor,
        Target : (= Source), -- using a singleton type to 'pin' the definition to a single value
        Map : (= id), -- NOTE: requires a definition of category for Pikelet functions
    };
}

brendanzab commented Apr 4, 2018

Open questions:

  • How do we maintain performance in the face of strict evaluation?
  • How do we handle universe levels?
  • How should we handle multiplicity annotations on binders (i.e. unbounded, linear, erased)? Can we be polymorphic over these, or will we have to duplicate lots of APIs?
  • What does a Monad module look like?
  • What does an instance of Category for functions look like? (see the sketch after this list)
  • How good are the error messages? (objects and arrows and functors might be confusing at first!)
  • Could we automate the generation of the wrapper functions?
  • Could we parametrise these APIs by whether these have been proven correct, vs. whether they are tested by generative testing?
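
A first guess at what an instance of Category for functions might look like (speculative syntax - `Function` is an invented name, and `Object = Type` immediately runs into the universe-level question above):

Function : Category;
Function = record {
    Object = Type;
    Arrow = \a b => a -> b;
    id = \x => x;
    compose = \f g x => f (g x);
};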


ghost commented Apr 4, 2018

@brendanzab:

How do we maintain performance in the face of strict evaluation?

If the system is mostly intended to be used for theorem proving, you will be dealing with very large terms in many cases, and would probably benefit greatly from some form of hash-consing and memoization of core operations (substitution, normalization, etc.). Along similar lines, a call-by-need approach (this is what Agda does now) would probably be better than call-by-value.

If you are able to implement reduction via an abstract machine (perhaps starting with something simple like the Krivine machine) in the backend (perhaps with a translation from a higher-level core), you would likely have more flexibility as far as experimenting with different evaluation strategies.
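
To make that concrete, here is a minimal sketch of such a machine (all the names are invented for illustration, and the driver loop that repeatedly applies step is omitted):

open import Agda.Builtin.Nat

data Maybe (A : Set) : Set where
  nothing : Maybe A
  just    : A -> Maybe A

-- untyped lambda terms with de Bruijn indices
data Term : Set where
  var : Nat -> Term
  lam : Term -> Term
  app : Term -> Term -> Term

-- a closure pairs a term with the environment it should run in
mutual
  data Env : Set where
    nil  : Env
    cons : Closure -> Env -> Env

  record Closure : Set where
    inductive
    constructor clo
    field
      tm : Term
      en : Env

data Stack : Set where
  snil  : Stack
  scons : Closure -> Stack -> Stack

record State : Set where
  constructor state
  field
    focus : Closure
    stack : Stack

lookupEnv : Env -> Nat -> Maybe Closure
lookupEnv nil        _       = nothing
lookupEnv (cons c _) zero    = just c
lookupEnv (cons _ r) (suc n) = lookupEnv r n

-- one transition: variables force their closure, applications push the
-- argument closure, and lambdas pop one argument into the environment
step : State -> Maybe State
step (state (clo (var n) r) s) with lookupEnv r n
... | just c  = just (state c s)
... | nothing = nothing
step (state (clo (app t u) r) s) =
  just (state (clo t r) (scons (clo u r) s))
step (state (clo (lam t) r) (scons c s)) =
  just (state (clo t (cons c r)) s)
step (state (clo (lam t) r) snil) = nothing -- weak head normal form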

How do we handle universe levels?

Personally I would suggest ignoring universe levels until much later on. They're hard to get right, and not very interesting unless you intend for this system to be used for serious proofs that you must be able to trust. For most of the mathematics I do as a hobbyist, I simply turn off Agda's universe checking now because it's not worth the trouble. It's better in Coq and Idris, but there's a good deal of complexity in the implementations they've had to sort out in order to make it work, and there are still some weird corner cases.

What does a Monad module look like?

If you want to build up category theory this way, there are several things you'll need to consider:

First, doing this in an intensional, Martin-Löf style type theory (versus something like a cubical type theory) will require that you formulate not standard Categories but what are instead known as E-Categories. These are categories with an extra piece of data that defines an equivalence relation on the morphisms. Then the coherence laws are intended to hold not up to propositional equality but up to this specified equivalence relation. Example:

record Category : Set where
  field
    -- graph
    obj : Set
    hom : (a b : obj) -> Set
    -- operations
    idn : {a : obj} -> hom a a
    seq : {a b c : obj} -> hom a b -> hom b c -> hom a c
    -- equivalence relation
    rel : {a b : obj} (f g : hom a b) -> Set
    reflexivity : {a b : obj} {f : hom a b} -> rel f f
    transitivity : {a b : obj} {f g h : hom a b} (p : rel f g) (q : rel g h) -> rel f h
    symmetry : {a b : obj} {f g : hom a b} (p : rel g f) -> rel f g
    -- coherence laws
    id-left : {a b : obj} (f : hom a b) -> rel (seq idn f) f
    id-right : {a b : obj} (f : hom a b) -> rel (seq f idn) f
    assoc : {a b c d : obj} (f : hom a b) (g : hom b c) (h : hom c d) -> rel (seq (seq f g) h) (seq f (seq g h))
    seq-cong : {a b c : obj} {f0 f1 : hom a b} {g0 g1 : hom b c} (p : rel f0 f1) (q : rel g0 g1) -> rel (seq f0 g0) (seq f1 g1)

The reason you have to formulate things this way is that, for categorical structures involving things like functions or infinite structures (if you have codata), you will not be able to prove the laws using propositional equality in intensional type theory.
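
To see the problem in miniature, here are two functions that agree on every input, even though the equation between the functions themselves is not provable (a toy example; cong is the usual congruence lemma):

open import Agda.Builtin.Nat
open import Agda.Builtin.Equality

cong : {A B : Set} (f : A -> B) {x y : A} -> x ≡ y -> f x ≡ f y
cong f refl = refl

-- pointwise, adding zero on the right is the identity...
plus-zero : (n : Nat) -> n + 0 ≡ n
plus-zero zero    = refl
plus-zero (suc n) = cong suc (plus-zero n)

-- ...but the function-level statement (\ n -> n + 0) ≡ (\ n -> n)
-- is not provable without extensionality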

You can postulate functional extensionality (ext : {A B : Set} (f g : A -> B) -> ((x : A) -> f x = g x) -> f = g), but since it's a postulate it has no computational behavior, and eventually you'll get to a point where your proofs get stuck because they can't normalize further.

However, if you reason up to equivalence relations, things work out. For instance, you can use functional extensionality as the definition for the equivalence relation (requiring that it's provable) when you need something like that.

In any case, this pattern of reasoning up to an equivalence relation is what the literature calls setoids (see here).

As for what Monad would look like, you'd need to formalize it, as the saying goes, as a monoid in the e-category of endofunctors. It takes quite a bit of work to build up to that definition, even in something like Agda.
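
Just to give a flavor of where that starts (this is a heavily simplified, Kleisli-style sketch rather than the full monoid-in-endofunctors presentation, stated in terms of the Setoid, Map, and Rel records defined below; idMap and all the names here are invented, and most of the laws are omitted):

idMap : {A : Setoid} -> Map A A
idMap = record { ap-car = \ x -> x ; ap-rel = \ p -> p }

record Monad (M : Setoid -> Setoid) : Set where
  field
    pure : {A : Setoid} -> Map A (M A)
    bind : {A B : Setoid} -> Map A (M B) -> Map (M A) (M B)
    -- one of the identity laws, stated up to Rel rather than
    -- propositional equality
    bind-pure : {A : Setoid} -> Rel (bind (pure {A})) idMap
    -- (the other identity law and associativity are omitted)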

What does an instance of Category for functions look like?

Assuming you mean the category Set, you'll instead need to formalize the category of Setoids as long as you're working in an intensional Martin-Löf type theory.

Setoids are just sets equipped with an equivalence relation:

record Setoid : Set where
  field
    car : Set
    rel : (a b : car) -> Set
    reflexivity : {a : car} -> rel a a
    transitivity : {a b c : car} (p : rel a b) (q : rel b c) -> rel a c
    symmetry : {a b : car} (p : rel b a) -> rel a b

open Setoid -- so that car and rel can be applied to a Setoid directly

In fact, the e-categories I described above are categories with a setoid on the morphisms. So for the category of Setoids, the morphisms are setoid maps:

record Map (A B : Setoid) : Set where
  field
    ap-car : car A -> car B
    ap-rel : {a b : car A} -> rel A a b -> rel B (ap-car a) (ap-car b)

open Map -- likewise, so ap-car and ap-rel can be applied to a Map

Next, you can say that two setoid Maps are equivalent in the following sense:

record Rel {A B : Setoid} (F G : Map A B) : Set where
  field
    ext : {x : car A} -> rel B (ap-car F x) (ap-car G x)

Note this is in the spirit of the ext mentioned earlier, since you now have the following:

-- compare with {A B : Set} (f g : A -> B) -> ((x : A) -> f x = g x) -> f = g
mkExt : {A B : Setoid} (F G : Map A B) -> ({x : car A} -> rel B (ap-car F x) (ap-car G x)) -> Rel F G
mkExt F G p = record { ext = p }

Finally, the definition of the category of setoids:

SETOID : Category
SETOID = record
  { obj = Setoid
  ; hom = \ A B -> Map A B
  ; rel = \ F G -> Rel F G
  …
  }

Note that you can sometimes still define just the category SET (or TYPE if you prefer). I think it's doable in Agda. But most of the constructions from category theory that involve SET will fail quite early on due to the definition being "too strict", essentially (you run into the problem of extensionality). With SETOID you can prove most of the known results in category theory that don't rely on non-constructive principles or axioms.
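
For what it's worth, with pointwise propositional equality as the relation, SET might look something like this (a sketch that assumes --type-in-type to sidestep universe levels; sym, trans, and cong are the usual lemmas, and seq-cong-SET is an invented helper):

open import Agda.Builtin.Equality

sym : {A : Set} {x y : A} -> x ≡ y -> y ≡ x
sym refl = refl

trans : {A : Set} {x y z : A} -> x ≡ y -> y ≡ z -> x ≡ z
trans refl q = q

cong : {A B : Set} (f : A -> B) {x y : A} -> x ≡ y -> f x ≡ f y
cong f refl = refl

seq-cong-SET : {A B C : Set} {f0 f1 : A -> B} {g0 g1 : B -> C}
  -> ((x : A) -> f0 x ≡ f1 x) -> ((x : B) -> g0 x ≡ g1 x)
  -> (x : A) -> g0 (f0 x) ≡ g1 (f1 x)
seq-cong-SET {f1 = f1} {g0 = g0} p q x = trans (cong g0 (p x)) (q (f1 x))

SET : Category
SET = record
  { obj = Set
  ; hom = \ A B -> (A -> B)
  ; idn = \ x -> x
  ; seq = \ f g x -> g (f x)
  ; rel = \ f g -> (x : _) -> f x ≡ g x -- pointwise equality
  ; reflexivity = \ x -> refl
  ; transitivity = \ p q x -> trans (p x) (q x)
  ; symmetry = \ p x -> sym (p x)
  ; id-left = \ f x -> refl
  ; id-right = \ f x -> refl
  ; assoc = \ f g h x -> refl
  ; seq-cong = seq-cong-SET
  }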

TL;DR: doing category theory in type theory is still Very Hard and basically open research. It's one of the motivating factors behind completely new approaches to type theory. Not to discourage you, but if you are building this system specifically in order to do category theory there's a good chance you're going to be disappointed just due to the fact that it's really complicated and messy to formalize this stuff still (even in the newer approaches).

Could we parametrise these APIs by whether these have been proven correct, vs. whether they are tested by generative testing?

I don't think the testing approach will get you very far to be honest, at least not if your intention is to reason about more advanced constructions. Most of the time there won't be a way to even systematically generate the data that you need for testing.

There is some possibility you could push a ways in that direction with something called synthetic topology (see here and here). Synthetic topology can be viewed as a kind of more general and systematic theory about testing and observation of properties (even in some cases involving infinite structure), but that's definitely another rabbit hole and I still think you'd run into pretty serious limitations in practice.

If your intention is to mostly use this stuff for programming, I don't think you need to worry about the proofs. I'm not sure you could do anything useful programmatically with the fact that the laws were checked either, if you did that. It would probably be easier to just leave the proofs out in that case.

@brendanzab (Author)

Thanks for the really detailed reply! I'll have to pick it apart a bit more later.

Just to be a bit clearer about my intentions: this is more meant to be used as a systems programming language, for similar purposes as Rust. This was more of an attempt to see what a bunch of modules might look like in this system using records. It was also fun to try to encode some category theory stuff - I was trying to be a bit more explicit and approachable, since I find the Agda examples pretty slow to unpick (given they lack comments and much in the way of pronounceable names). Thanks for your warning about it being open research though!

The question about performance of composition is more a response to a problem I see in Rust - even if we had HKT, you still wouldn't get a huge amount from category-theory inspired code reuse, due to all the intermediate copies of data you would get. We want to be able to use the stack as much as possible, and do stuff in place - hence what we see in the Iterator and Future APIs. I've seen Ed Kmett say that laziness is the best way to get that kind of reuse, but it would be great if there was another way!
