@mebubo
Last active January 1, 2023 01:29

Manually overriding the typeclass dictionary in Purescript

The reifySymbol from the purescript-symbols package

In my effort to understand the recently added RowToList Purescript feature, I have been reading Liam Goodacre's post, which made me take a closer look at the purescript-symbols package.

I was puzzled by the implementation of the reifySymbol function. Here it is:

reifySymbol :: forall r. String -> (forall sym. IsSymbol sym => SProxy sym -> r) -> r
reifySymbol s f = coerce f { reflectSymbol: \_ -> s } SProxy where
  coerce
    :: (forall sym1. IsSymbol sym1              => SProxy sym1 -> r)
    -> { reflectSymbol :: SProxy "" -> String } -> SProxy ""   -> r
  coerce = unsafeCoerce

Now, the usage of unsafeCoerce is a clear sign that this code is somewhat special and not something that should be used often, but I was curious.

f is a function of one argument (of type SProxy sym), but is being called with two, the first of which is this strange record with the key reflectSymbol. What exactly is going on here?

The representation of typeclasses in Purescript

Like Haskell, Purescript has the concept of typeclasses. Very roughly speaking, they are a way to define an interface, have multiple implementations of that interface, and have the compiler pick the appropriate implementation for every usage.

Most often typeclasses are polymorphic: their definition contains a type variable, and then different instances can be defined for specific values of that variable.

For example, we could imagine a typeclass Size, which associates an Int to values of different types:

class Size a where
    size :: a -> Int

We could have an instance for String, returning the length of the string using a function from the purescript-strings package:

import Data.String as DS

instance sizeString :: Size String where
    size = DS.length

And an instance for Int, for which size is just an identity function:

instance sizeNumber :: Size Int where
    size n = n

We could even have an instance for Array a, for any a:

import Data.Array as DA

instance sizeArray :: Size (Array a) where
    size = DA.length

Now all of the following work, with the compiler picking the correct implementation based on the type of the argument to size:

s1 :: Int
s1 = size "foo"

s2 :: Int
s2 = size 1

s3 :: Int
s3 = size [1, 2, 3]

How is this represented at runtime? Let's look at the generated JavaScript. The class declaration itself generates some code:

var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};

So Size is a constructor that, given a value, constructs an object which stores the passed value under the key size. size, on the other hand, is a function that, given an object with the size key, returns the value associated to that key.
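To make this concrete, here is a small hand-written JavaScript sketch that reuses the two definitions above and builds a dictionary by hand. The name sizeBool and its behaviour are made up for illustration and are not part of the compiled output.

```javascript
// The two definitions, copied from the generated code above.
var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};

// Hypothetical dictionary, invented for this sketch: a boolean has size 0 or 1.
var sizeBool = new Size(function (b) {
    return b ? 1 : 0;
});

// size(dict) only looks up the stored function; it does not call it.
var lookedUp = size(sizeBool);
console.log(typeof lookedUp);  // "function"
console.log(lookedUp(true));   // 1
```

The accessor never inspects how the object was built. It only needs a size property to be present.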

Let's have a look at the code generated for the instance declarations:

var sizeString = new Size(Data_String.length);
var sizeNumber = new Size(function (n) {
    return n;
});
var sizeArray = new Size(Data_Array.length);

Each instance is an instantiation of the Size constructor, with the corresponding implementation of the size function passed as an argument. At runtime, sizeString, sizeNumber and sizeArray will be objects with a single own property size containing the corresponding function.

Now let's look at the code produced by the calls to size:

var s1 = size(sizeString)("foo");
var s2 = size(sizeNumber)(1);
var s3 = size(sizeArray)([ 1, 2, 3 ]);

They are all calls to the size function which, as we have seen above, just performs a lookup on the object passed to it, returning the value of its size key. The compiler passes a different dictionary to size in each case, matching the type of the argument in the Purescript code. The function returned by the lookup is finally called with the corresponding argument.

We can view size as a curried function of 2 arguments, the first of which is the typeclass dictionary containing the actual implementation of size, and the second is the argument to that function.
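The two-argument view can be sketched in plain JavaScript. This is an illustration only: stringLength and arrayLength are hand-written stand-ins for Data_String.length and Data_Array.length, and size2 is a hypothetical uncurried helper that the compiler does not generate.

```javascript
var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};

// Stand-ins for the library functions (assumptions made for this sketch).
var stringLength = function (s) { return s.length; };
var arrayLength = function (xs) { return xs.length; };

var sizeString = new Size(stringLength);
var sizeNumber = new Size(function (n) { return n; });
var sizeArray = new Size(arrayLength);

// Hypothetical uncurried view: dictionary first, value second.
var size2 = function (dict, x) {
    return size(dict)(x);
};

console.log(size2(sizeString, "foo"));     // 3
console.log(size2(sizeNumber, 1));         // 1
console.log(size2(sizeArray, [1, 2, 3]));  // 3
```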

Unsafely overriding the typeclass dictionary

It would seem that the compiler is in full control over which instance gets passed to size. Could we call size, but pass a dictionary of our choosing, potentially determined at runtime? It turns out that it is possible, using the same trick as in the implementation of the reifySymbol at the beginning of this article.

The trick is to lie to the compiler about the type of size, so that it does not try to supply a typeclass dictionary itself. We can then call size as a regular Purescript function of two arguments, passing our surrogate dictionary as the first argument and the real argument as the second.

import Unsafe.Coerce (unsafeCoerce)

s4 :: Int
s4 = coerce size { size: \n -> n + 7 } 35 where
 coerce :: (forall a. Size a => a -> Int) -> { size :: Int -> Int } -> Int -> Int
 coerce = unsafeCoerce

The lying part is achieved by using the unsafeCoerce function. It is declared as a function which can convert between 2 arbitrary types:

foreign import unsafeCoerce :: forall a b. a -> b

Its implementation is just a JavaScript identity function, i.e. it does nothing at runtime. It is there only to convince the type checker that a value of one type can be treated as a value of another type.
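As a rough sketch of the runtime side (an assumed shape, not copied from the source of the purescript-unsafe-coerce package), the foreign function can be pictured like this:

```javascript
// Assumed shape of the foreign implementation: a plain identity function.
var unsafeCoerce = function (x) {
    return x;
};

// A throwaway function to coerce, made up for this sketch.
var f = function (n) {
    return n + 1;
};

// The very same function object comes back; only its static type changed.
console.log(unsafeCoerce(f) === f);  // true
console.log(unsafeCoerce(f)(41));    // 42
```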

Here unsafeCoerce is used to tell the compiler that the size function can be seen as a function of the following type:

{ size :: Int -> Int } -> Int -> Int

The trick works because the above is compatible with the actual runtime representation of size.

Here is what s4 compiles to:

var s4 = Unsafe_Coerce.unsafeCoerce(function (dictSize) {
    return size(dictSize);
})({
    size: function (n) {
        return n + 7 | 0;
    }
})(35);

Substituting the identity function for unsafeCoerce, and replacing function(dictSize) {return size(dictSize)} with just size, we get:

var s4 = size({
    size: function (n) {
        return n + 7 | 0;
    }
})(35);

This is fully analogous to the code generated for s1, s2 and s3 above, but this time a custom typeclass dictionary is used.
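The equivalence can be checked with a self-contained script. This is a sketch under the assumption that unsafeCoerce is an identity function at runtime; customDict is a name introduced here for the surrogate dictionary.

```javascript
var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};

// Assumed runtime behaviour of unsafeCoerce: identity.
var unsafeCoerce = function (x) {
    return x;
};

// Surrogate dictionary: a plain object literal, not built with `new Size(...)`.
var customDict = {
    size: function (n) {
        return n + 7 | 0;
    }
};

// The compiled form of s4, and the simplified form after substitution.
var s4 = unsafeCoerce(function (dictSize) {
    return size(dictSize);
})(customDict)(35);
var s4Simplified = size(customDict)(35);

console.log(s4, s4Simplified);  // 42 42
```

Both forms perform the same lookup on customDict, so the surrogate implementation runs in each case.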

Note that when passing the dictionary, we are relying on the knowledge of the runtime representation which is normally hidden from the users — a reason in itself to use this hack with caution.
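Returning to reifySymbol from the beginning of the article, the same reasoning explains its two-argument call. The following is a hand-written sketch of what it amounts to at runtime, not actual compiler output. It assumes that the IsSymbol dictionary is an object with a single reflectSymbol member and that SProxy is an opaque value; shout is a hypothetical consumer written as the compiler might emit it.

```javascript
// Assumed runtime stand-in for the SProxy value; its contents are never inspected.
var SProxy = {};

// Assumed accessor, by analogy with `size`: look up reflectSymbol in the dictionary.
var reflectSymbol = function (dict) {
    return dict.reflectSymbol;
};

// reifySymbol with the coercion erased: f receives a surrogate dictionary, then the proxy.
var reifySymbol = function (s) {
    return function (f) {
        return f({ reflectSymbol: function (_) { return s; } })(SProxy);
    };
};

// Hypothetical consumer, roughly `\p -> reflectSymbol p <> "!"` compiled by hand.
var shout = function (dictIsSymbol) {
    return function (proxy) {
        return reflectSymbol(dictIsSymbol)(proxy) + "!";
    };
};

console.log(reifySymbol("hello")(shout));  // "hello!"
```

Seen this way, f is a one-argument function only at the type level. At runtime its IsSymbol constraint is an extra leading parameter, which is exactly what the record with the reflectSymbol key fills in.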
