Two canonical examples for specialization:
- Function1
- Tuple2
Specializing Function1 is a lot easier because functions do not have specialized fields.
Let's start with a simplified version of the Function1 trait that gets translated into a Java interface.
trait Function1[T,U] {
def apply(x: T): U
}
The corresponding Java interface looks like this:
public interface Function1<T, U> {
public U apply(T x);
public MethodHandle apply$sp(Class<?> tParam, Class<?> uParam);
}
Now, let's see what a Java implementation of the specialized function (x: Int) => x % 2 == 0 would look like:
public class IsEven implements Function1<Integer, Boolean> {
public boolean apply(int x) {
return x % 2 == 0;
}
public Boolean apply(Integer x) {
return apply(x.intValue());
}
public MethodHandle apply$sp(Class<?> tParam, Class<?> uParam) {
// pseudocode: a MethodHandle pointing at the specialized apply(int) above
return IsEven::apply(int);
}
}
If you want to call that function in a specialized context, you do:
class Foo {
boolean foo(Function1<Integer, Boolean> f) {
int x = 2;
MethodHandle mh = f.apply$sp(int.class, boolean.class);
return (boolean) mh.invokeExact(x);
}
}
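The Function1 translation above is written in pseudo-Java (for example, `IsEven::apply(int)` is not a valid method-handle expression). A minimal runnable sketch of the same idea follows, assuming the handle is obtained with `MethodHandles.lookup().findVirtual` and bound to the receiver. The names `Function1Sketch`, `IsEvenSketch`, and `Function1Caller` are hypothetical and exist only for this illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Simplified interface from the text; the name Function1Sketch is hypothetical.
interface Function1Sketch<T, U> {
    U apply(T x);

    // Returns a handle to the implementation specialized for the given classes.
    MethodHandle apply$sp(Class<?> tParam, Class<?> uParam);
}

final class IsEvenSketch implements Function1Sketch<Integer, Boolean> {
    // Specialized implementation: works on primitives, no boxing.
    public boolean apply(int x) {
        return x % 2 == 0;
    }

    // Generic entry point: unboxes and delegates to the specialized method.
    @Override
    public Boolean apply(Integer x) {
        return apply(x.intValue());
    }

    @Override
    public MethodHandle apply$sp(Class<?> tParam, Class<?> uParam) {
        try {
            // Look up the overload whose signature is (tParam) -> uParam,
            // then bind the receiver so callers pass only the argument.
            MethodType type = MethodType.methodType(uParam, tParam);
            return MethodHandles.lookup()
                    .findVirtual(IsEvenSketch.class, "apply", type)
                    .bindTo(this);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class Function1Caller {
    // Calls f in a specialized context: the call site has the exact type (int)boolean.
    static boolean callSpecialized(Function1Sketch<Integer, Boolean> f, int x) {
        MethodHandle mh = f.apply$sp(int.class, boolean.class);
        try {
            return (boolean) mh.invokeExact(x);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Note that `invokeExact` requires the call-site types to match the handle's type exactly, which is why the argument is an `int` and the result is cast to `boolean` rather than `Boolean`.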
case class Tuple2[T, U](val _1: T, val _2: U)
is expanded to (roughly) a Java class like this:
public class Tuple2<T, U> {
private T _1;
private U _2;
public Tuple2(T _1, U _2) {
this._1 = _1;
this._2 = _2;
}
public T get_1() {
return _1;
}
public U get_2() {
return _2;
}
}
If we mark Tuple2 as specialized:
@specialized
case class Tuple2[T, U](val _1: T, val _2: U)
the Java translation becomes:
public abstract class Tuple2<T, U> {
public static MethodHandle factory(Class<?> tParam, Class<?> uParam) {
return Metafactory.spin(Tuple2.class, tParam, uParam);
}
public abstract T get_1();
public abstract MethodHandle get_1(Class<?> tParam);
public abstract U get_2();
public abstract MethodHandle get_2(Class<?> uParam);
}
public final class Tuple2$noSp<T, U> extends Tuple2<T, U> {
private T _1;
private U _2;
public Tuple2$noSp(T _1, U _2) {
this._1 = _1;
this._2 = _2;
}
public T get_1() {
return _1;
}
public U get_2() {
return _2;
}
}
// Template, not a real class; to be rewritten by Metafactory and the asm library.
// We use the $sp suffix for type parameters to signal that after template expansion they become constants (and are not erased to Object).
// We also use the unspecialized type parameters, which are the boxed versions of the specialized ones, in case boxing is needed.
// E.g. if T$sp=int, then T=Integer; if T$sp=Object, then T=Object
public final class Tuple2$sp<T$sp, U$sp> extends Tuple2<T, U> {
private T$sp _1;
private U$sp _2;
public Tuple2$sp(T$sp _1, U$sp _2) {
this._1 = _1;
this._2 = _2;
}
public T get_1() {
return _1;
}
public U get_2() {
return _2;
}
public MethodHandle get_1(Class<?> tParam) {
return Tuple2$sp::_1;
}
public MethodHandle get_2(Class<?> uParam) {
return Tuple2$sp::_2;
}
}
Now, if you want to access an element of Tuple2 in a specialized context, you do:
class Foo<T$sp> {
T$sp access(Tuple2 t) {
MethodHandle mh = t.get_1(T$sp.class);
return (T$sp) mh.invokeExact();
}
}
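Since `Tuple2$sp` is a template rather than a real class, it may help to see one concrete expansion written by hand. The sketch below assumes T$sp = int and U$sp = int, and assumes the getter handles are produced with `MethodHandles.Lookup.findGetter`. The names `Tuple2Sketch`, `Tuple2IntInt`, and `Tuple2Caller` are hypothetical stand-ins, not the output of any actual Metafactory.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical stand-in for the abstract Tuple2 base class from the text.
abstract class Tuple2Sketch<T, U> {
    public abstract T get_1();
    public abstract MethodHandle get_1(Class<?> tParam);
    public abstract U get_2();
    public abstract MethodHandle get_2(Class<?> uParam);
}

// Hand-written guess at what the template could expand to for T$sp = int, U$sp = int.
final class Tuple2IntInt extends Tuple2Sketch<Integer, Integer> {
    private final int _1;
    private final int _2;

    Tuple2IntInt(int _1, int _2) {
        this._1 = _1;
        this._2 = _2;
    }

    // Generic accessors box the primitive fields.
    @Override public Integer get_1() { return _1; }
    @Override public Integer get_2() { return _2; }

    // Specialized accessors return a handle of type ()int bound to this instance.
    @Override public MethodHandle get_1(Class<?> tParam) { return getter("_1", tParam); }
    @Override public MethodHandle get_2(Class<?> uParam) { return getter("_2", uParam); }

    private MethodHandle getter(String field, Class<?> type) {
        try {
            // findGetter fails if `type` does not match the declared field type.
            return MethodHandles.lookup()
                    .findGetter(Tuple2IntInt.class, field, type)
                    .bindTo(this);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class Tuple2Caller {
    // Reads the first element without boxing, mirroring Foo.access for T$sp = int.
    static int first(Tuple2Sketch<?, ?> t) {
        MethodHandle mh = t.get_1(int.class);
        try {
            return (int) mh.invokeExact();
        } catch (Throwable e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, asking for a class that does not match the stored representation (for example `Integer.class`) makes the lookup fail; how a real implementation would handle such mismatches is not specified in the text.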
The short answer is: in situations where specialized generic code calls other specialized generic code.
Consider the following Scala code:
class C[@specialized T] {
def foo(length: Int): ArrayBuffer[T] = new ArrayBuffer[T](length)
}
val c = new C[Int]
val buf: ArrayBuffer[Int] = c.foo(10)
Let's assume that ArrayBuffer[T] is a specialized collection, so ArrayBuffer[Int] stores its elements in an internal array of Java type int[]. Let's also assume that a value of classOf[T] is enough to create an Array[T]. That implies that ArrayBuffer needs access to classOf[T], which it gets through a specialized constructor. You can think of Class[T] as being passed to the constructor as an implicit value, similar to how ClassTags are passed around, except for the argument order, which is an implementation detail.
Now we should note that the C.foo method needs access to classOf[T] so that it can pass it to ArrayBuffer[T]'s constructor. We're ready to show the full Java translation of the code above.
class C<T> {
ArrayBuffer<T> foo(Class<?> tParam, int length) {
return new ArrayBuffer<T>(length, tParam);
}
MethodHandle foo$sp(Class<?> tParam) {
// partially apply the `foo` method: bind tParam, leaving a handle that takes only `length`
return C::foo(Class, int).bindTo(tParam);
}
public static MethodHandle factory(Class<?> tParam) {
// return a MethodHandle to the default constructor
return C::new();
}
}
C c = (C) C.factory(int.class).invoke();
MethodHandle mh = c.foo$sp(int.class);
ArrayBuffer<Integer> buf = (ArrayBuffer<Integer>) mh.invoke(10);
Now you can see that $sp methods pass the runtime representation of a type parameter to the method that contains the actual implementation by partially applying a method handle. In cases where the implementation does not call specialized code, those values are not needed, so they are discarded in the $sp method and not passed further. This is a useful property that enables some optimizations and gives the JIT an easier time optimizing the bytecode. I'll elaborate on the last point some other time.
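The partial application described above can be sketched in runnable Java with `MethodHandles.insertArguments`. This is an illustration under assumptions, not the proposed translation itself: `CSketch` and `CSketchDemo` are hypothetical names, a plain array created with `java.lang.reflect.Array.newInstance` stands in for the specialized ArrayBuffer, and `twice` is a made-up method that calls no specialized code.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Array;

final class CSketch {
    // Implementation method: takes the runtime class of T first, then the real argument.
    Object foo(Class<?> tParam, int length) {
        // Stand-in for `new ArrayBuffer<T>(length, tParam)`: allocate the backing array.
        return Array.newInstance(tParam, length);
    }

    // Made-up implementation that never calls specialized code.
    int twice(int x) {
        return 2 * x;
    }

    MethodHandle foo$sp(Class<?> tParam) {
        try {
            MethodHandle impl = MethodHandles.lookup().findVirtual(
                    CSketch.class, "foo",
                    MethodType.methodType(Object.class, Class.class, int.class));
            // Bind the receiver and tParam; the resulting handle has type (int)Object.
            return MethodHandles.insertArguments(impl, 0, this, tParam);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    MethodHandle twice$sp(Class<?> tParam) {
        try {
            // tParam is not needed by the implementation, so it is discarded here.
            return MethodHandles.lookup().findVirtual(
                    CSketch.class, "twice",
                    MethodType.methodType(int.class, int.class)).bindTo(this);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class CSketchDemo {
    static int[] newIntArray(int length) {
        MethodHandle mh = new CSketch().foo$sp(int.class);
        try {
            // The handle returns Object, so invoke with that exact type, then downcast.
            Object result = (Object) mh.invokeExact(length);
            return (int[]) result;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static int twiceSpecialized(int x) {
        MethodHandle mh = new CSketch().twice$sp(int.class);
        try {
            return (int) mh.invokeExact(x);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Because `int.class` was bound into the handle, `newIntArray` gets back a genuine `int[]` rather than an `Integer[]`, which is the effect the specialized constructor is meant to achieve.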
This document does not discuss (yet) the following complex topics:
- mixing specialized and non-specialized code
- partial specialization (some type parameters are not specialized)
- binary compatibility concerns
- trait specialization for traits that carry implementation
- value class specialization
- all the details of interaction with arrays