gkossakowski/specialization-mh.md

## specialization-mh.md

      
    Raw
  

              specialization-mh.md
            
          
    Specialization with method handles

Two canonical examples for specialization:

Function1
Tuple2

Function1 specialization

Specializing Function1 is a lot easier because functions do not have specialized fields.
Let's start with simplified version of Function1 trait that gets transalted into Java interface.
trait Function1[T,U] {
  def apply(x: T): U
}
The corresponding Java interface looks like this:
public interface Function1<T, U> {
	public U apply(T x);
	public MethodHandle apply$sp(Class<?> tParam, Class<?> uParam);
}
Now, let's see how we could Java implementation of specialized function (x: Int) => x % 2 == 0 would look like:
public class IsEven implements Function1<Integer, Boolean> {
	public boolean apply(int x) {
		return x % 2 == 0;
	}
	
	public Boolean apply(Integer x) {
		return apply(x.intValue());
	}
	
	public MethodHandle apply$sp(Class<?> tParam, Class<?> uParam) {
		return IsEven::apply(int);
	}
}
If you want to call that function in specialized context you do:
class Foo {
	boolean foo(Function1<Integer, Boolean> f) {
		int x = 2;
		MethodHandle mh = f.apply$sp(int.class, boolean.class);
		return (boolean) mh.ivokeExact(x);
	}
}
Tuple2 specialization

case class Tuple2[T, U](val _1: T, val _2: U)
is expanded to (roughly) Java class like this:
public class Tuple2<T, U> {
  private T _1;
	private U _2;
	public Tuple2(T _1, U _2) {
		this._1 = _1;
		this._2 = _2;
	}

	public T get_1() {
		return _1;
	}

	public U get_2() {
		return _2;
	}
}
if we mark Tuple2 as specialized:
@specialized
case class Tuple2[T, U](val _1: T, val _2: U)
Java translation becomes:
public abstract class Tuple2<T, U> {
	public static MethodHandle factory(Class<?> tParam, Class<?> uParam) {
	 	return Metafactory.spin(Tuple2.class, tParam, uParam);
	}
	public T get_1();
	public MethodHandle get_1(Class<?> tParam);

	public U get_2();
	public MethodHandle get_2(Class<?> uParam);
}

public final class Tuple2$noSp<T, U> extends Tuple2<T, U> {
	private T _1;
	private U _2;
	public Tuple2$noSp(T _1, U _2) {
		this._1 = _1;
		this._2 = _2;
	}

	public T get_1() {
		return _1;
	}

	public U get_2() {
		return _2;
	}
}

// template, not real class, to be rewritten by Metafactory and asm library.
// we used $sp suffix for type parameters to signal that after template expansion they become constants (and are not erased to Object).
// Also, we use unspecialized type parameters which are boxed version of specialized one in case of boxing is needed.
// E.g. if T$sp=int, then T=Integer; if T$sp=Object, then T=Object

public final class Tuple2$sp Tuple2$sp<T$sp, U$sp> extends Tuple2<T, U> {
	private T$sp _1;
	private U$sp _2;
	
	public Tuple2$sp(T$sp _1, U$sp _2) {
		this._1 = _1;
		this._2 = _2;
	}

	public T get_1() {
		return _1;
	}

	public U get_2() {
		return _2;
	}
	
	public MethodHandle get_1(Class<?> tParam) {
		return Tuple2$sp::_1;
	}
	
	public MethodHandle get_2(Class<?> uParam) {
		return Tuple2$sp::_2;
	}
}
Now, if you want to access element from Tuple2 in specialized context you do:
class Foo<T$sp> {
  T$sp access(Tuple2 t) {
  	MethodHandle mh = t.get_1(T$sp.class);
  	return (T$sp) mh.invokeExact();
  }
}
Why $sp methods take runtime representation of type parameters?

The short answer is: in situation when specialized generic code calls another specialized generic code.
Consider the following Scala code:
class C[@specialized T] {
	def foo(length: Int): ArrayBuffer[T] = new ArrayBuffer[T](length)
}

val c = new C[Int]
val buf: ArrayBuffer[Int] = c.foo(10)
Let's assume that ArrayBuffer[T] is a specialized collection so ArrayBuffer[Int] has its internal array of Java's type int[]. Let's also assume that in order to create an Array[T] it's enough to have value classOf[T]. That implies that ArrayBuffer needs an access to classOf[T] and that's being achieved through specialized constructor. You could think that Class[T] is being passed to constructor as implicit value similarly to how ClassTags are passed around except for argument order which is an implementation detail.
Now we should note that C.foo method needs an access to classOf[T] so it can pass to ArrayBuffer[T]s constructor. We're ready to show full Java translation of above code.
class C<T> {
	ArrayBuffer<T> create(Class<?> tParam, int length) {
		return new ArrayBuffer[T](length, tParam);
	}
	MethodHandle create$sp(Class<?> tParam) {
		// partially apply `create` method
		return C::create(int, Class).bindTo(tParam);
	}
	public static MethodHandle factory(Class<?> tParam) {
		// return MethodHandle to defualt constructor
	 	return C::new()
	}
}

C c = (C) C.factory(int.class).invoke();
MethodHandle mh = c.create$sp(int.class);
ArrayBuffer<Int> buf = (ArrayBuffer<Int>) mh.invoke(10);
Now, you can see that $sp method pass runtime representation of type parameter to a method that contains actual implementation by partially applying a method handle. In cases where implementation does not call specialized code those values are not needed and are discarded in $sp method and not passed further. This is a useful property that enables some optimizations and gives JIT an easier time optimizing the bytecode. I'll elaborate on the last point some other time.
TODOs

This document does not discuss (yet) the following complex topics:

mixing specialized and non-specialized code
partial specialization (some type parameters are not specialized)
binary compatiblity concerns
trait specialization for traits that carry implementation
value class specialization
all the detials of interaction with arrays