In my effort to understand the recently added RowToList PureScript feature, I have been reading Liam Goodacre's post, which made me take a closer look at the purescript-symbols package. I was puzzled by the implementation of the reifySymbol function. Here it is:
reifySymbol :: forall r. String -> (forall sym. IsSymbol sym => SProxy sym -> r) -> r
reifySymbol s f = coerce f { reflectSymbol: \_ -> s } SProxy where
  coerce
    :: (forall sym1. IsSymbol sym1 => SProxy sym1 -> r)
    -> { reflectSymbol :: SProxy "" -> String } -> SProxy "" -> r
  coerce = unsafeCoerce
Now, the usage of unsafeCoerce is a clear sign that this code is somewhat special and not something that should be used often, but I was curious. f is a function of one argument (of type SProxy sym), but is being called with two, the first of which is this strange record with the key reflectSymbol. What exactly is going on here?
Like Haskell, PureScript has the concept of typeclasses. Very roughly speaking, they are a way to define an interface, have multiple implementations of that interface, and have the compiler pick the appropriate implementation for every usage.
Most often typeclasses are polymorphic: their definition contains a type variable, and then different instances can be defined for specific values of that variable.
For example, we could imagine a typeclass Size, which associates an Int to values of different types:
class Size a where
  size :: a -> Int
We could have an instance for String, returning the length of the string using a function from the purescript-strings package:
import Data.String as DS
instance sizeString :: Size String where
  size = DS.length
And an instance for Int, for which size is just the identity function:
instance sizeNumber :: Size Int where
  size n = n
We could even have an instance for Array a, for any a:
import Data.Array as DA
instance sizeArray :: Size (Array a) where
  size = DA.length
Now all of the following works, with the correct implementation picked by the compiler based on the type of the argument to size:
s1 :: Int
s1 = size "foo"
s2 :: Int
s2 = size 1
s3 :: Int
s3 = size [1, 2, 3]
How is this represented at runtime? Let's look at the generated JavaScript. The class declaration itself generates some code:
var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};
So Size is a constructor that, given a value, constructs an object which stores the passed value under the key size. The size function, on the other hand, takes an object with the size key and returns the value associated with that key.
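To make the two roles concrete, here is a small hand-written JavaScript sketch that reuses the generated Size constructor and size accessor. The names alwaysOne and exampleDict are made up for illustration; they are not compiler output.

```javascript
// The class declaration, as generated by the compiler.
var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};

// Hypothetical implementation, used only for this illustration.
var alwaysOne = function (x) {
    return 1;
};

// Constructing a dictionary stores the implementation under the key `size`.
var exampleDict = new Size(alwaysOne);

// The accessor hands back exactly the function that was stored.
var lookedUp = size(exampleDict);
console.log(lookedUp === alwaysOne);
console.log(lookedUp("anything"));
```

Nothing in the accessor depends on the object having been built with new Size; it only reads the size property.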
Let's have a look at the code generated for the instance declarations:
var sizeString = new Size(Data_String.length);
var sizeNumber = new Size(function (n) {
    return n;
});
var sizeArray = new Size(Data_Array.length);
Each instance is an instantiation of the Size constructor, with the corresponding implementation of the size function passed as an argument. At runtime, sizeString, sizeNumber and sizeArray will be objects with a single own property size containing the corresponding function.
Now let's look at the code produced by the calls to size:
var s1 = size(sizeString)("foo");
var s2 = size(sizeNumber)(1);
var s3 = size(sizeArray)([ 1, 2, 3 ]);
They are all calls to the size function which, as we have seen above, just performs a lookup on the object passed to it, returning the value of its size key. Different dictionaries are passed to size by the compiler, matching the type of the argument in the PureScript code. The function returned by the lookup is finally called with the corresponding argument.
We can view size as a curried function of two arguments, the first of which is the typeclass dictionary containing the actual implementation of size, and the second of which is the argument to that implementation.
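The curried view can be checked directly in plain JavaScript. The sketch below is hand-written in the style of the generated code; stringLength and arrayLength are stand-ins I am assuming for Data_String.length and Data_Array.length, not the real library functions.

```javascript
var Size = function (size) {
    this.size = size;
};
var size = function (dict) {
    return dict.size;
};

// Assumed stand-ins for Data_String.length and Data_Array.length.
var stringLength = function (s) {
    return s.length;
};
var arrayLength = function (xs) {
    return xs.length;
};

var sizeString = new Size(stringLength);
var sizeArray = new Size(arrayLength);

// First argument: the dictionary. The result is an ordinary one-argument function.
var sizeOfString = size(sizeString);

// Second argument: the value to measure.
var s1 = sizeOfString("foo");
var s3 = size(sizeArray)([1, 2, 3]);
console.log(s1, s3);
```

Applying size to a dictionary alone is already meaningful: sizeOfString is simply the implementation stored in sizeString.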
It would seem that the compiler is in full control over which instance gets passed to size. Could we call size, but pass a dictionary of our choosing, potentially determined at runtime? It turns out that it is possible, using the same trick as in the implementation of reifySymbol at the beginning of this article.
The trick is to lie to the compiler about the type of size, so that it does not try to call it with a typeclass dictionary, and then call it as a regular PureScript function of two arguments, passing our surrogate dictionary as the first argument, and the real argument as the second one.
import Unsafe.Coerce (unsafeCoerce)
s4 :: Int
s4 = coerce size { size: \n -> n + 7 } 35 where
  coerce :: (forall a. Size a => a -> Int) -> { size :: Int -> Int } -> Int -> Int
  coerce = unsafeCoerce
The lying part is achieved by using the unsafeCoerce function. It is declared as a function which can convert between 2 arbitrary types:
foreign import unsafeCoerce :: forall a b. a -> b
Its implementation is just an identity JavaScript function, i.e. it does not do anything at runtime; it is there just to convince the type checker that a value of one type can be treated as a value of another type.
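As a rough picture of the runtime side, the foreign implementation can be thought of as the one-liner below. This is a sketch of the idea, not a copy of the package's source file, and the record value is a made-up example.

```javascript
// Sketch: at runtime the coercion simply returns its argument unchanged.
var unsafeCoerce = function (x) {
    return x;
};

// Any value passes through untouched; only the PureScript type changes.
var record = {
    size: function (n) {
        return n + 7 | 0;
    }
};
var coerced = unsafeCoerce(record);
console.log(coerced === record);
```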
Here unsafeCoerce is used to tell the compiler that the size function can be seen as a function of the following type:
{ size :: Int -> Int } -> Int -> Int
The trick works because the above is compatible with the actual runtime representation of size.
Here is what s4 compiles to:
var s4 = Unsafe_Coerce.unsafeCoerce(function (dictSize) {
    return size(dictSize);
})({
    size: function (n) {
        return n + 7 | 0;
    }
})(35);
Substituting the identity function for unsafeCoerce, and replacing function(dictSize) {return size(dictSize)} with just size, we get:
var s4 = size({
    size: function (n) {
        return n + 7 | 0;
    }
})(35);
This is fully analogous to the code generated for s1, s2 and s3 above, but this time a custom typeclass dictionary is used.
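Because the dictionary is now an ordinary value, it can also be picked while the program runs, which is the "determined at runtime" case raised earlier. The following hand-written JavaScript sketch mimics the simplified output for s4; makeDict, its offset parameter and s5 are hypothetical names introduced for this example.

```javascript
var size = function (dict) {
    return dict.size;
};

// Hypothetical helper: builds a surrogate dictionary from a runtime value.
var makeDict = function (offset) {
    return {
        size: function (n) {
            return n + offset | 0;
        }
    };
};

// Same shape as the simplified s4, with the offset 7 supplied at runtime.
var s4 = size(makeDict(7))(35);

// A different dictionary gives a different result for the same argument.
var s5 = size(makeDict(100))(35);
console.log(s4, s5);
```

The surrogate is a plain object rather than the result of new Size, which is enough here because size only reads the size property.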
Note that when passing the dictionary, we are relying on knowledge of the runtime representation, which is normally hidden from users. That is a reason in itself to use this hack with caution.
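The same reading answers the question about reifySymbol from the beginning of the article: under this representation, f is a function that first takes an IsSymbol dictionary and then the SProxy argument, so calling it with two arguments is what the runtime expects. Below is a hand-written JavaScript sketch of that behaviour, in the style of the generated code shown earlier. It is an approximation, not actual compiler output, and the SProxy placeholder value and the describe function are assumptions made for illustration.

```javascript
// Accessor for the IsSymbol class, analogous to `size`.
var reflectSymbol = function (dict) {
    return dict.reflectSymbol;
};

// Placeholder for the SProxy value; its runtime contents are never inspected here.
var SProxy = {};

// Sketch of reifySymbol: build a surrogate IsSymbol dictionary from the
// runtime string and pass it, followed by the proxy, to `f`.
var reifySymbol = function (s) {
    return function (f) {
        return f({
            reflectSymbol: function (_) {
                return s;
            }
        })(SProxy);
    };
};

// Hypothetical function of type `forall sym. IsSymbol sym => SProxy sym -> String`,
// written in its dictionary-passing form.
var describe = function (dictIsSymbol) {
    return function (proxy) {
        return "symbol: " + reflectSymbol(dictIsSymbol)(proxy);
    };
};

console.log(reifySymbol("hello")(describe));
```

The surrogate record plays the same role as { size: ... } did for s4: it stands in for the instance the compiler would otherwise supply.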