Sunday, August 30, 2020

Type Erasure, Manifest & Specialization in Scala 2.x

Target audience: Intermediate
Estimated reading time: 4'

Both Scala and Java programming languages employ type erasure to eliminate generics during compilation. This mechanism upholds type constraints exclusively at compile time, removing the specific type details during runtime. Consequently, at runtime, the JVM fails to distinguish between Higher-Kinded types (i.e., List[T] and List[Int].
In this article, we delve into two strategies in Scala that counter this limitation: Manifests and specialization for primitive types.
Table of contents
Follow me on LinkedIn
Notes: 
  • For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted 
  • The code associated with this article is written using Scala 2.12.4

Type erasure

Type erasure ensures that no new classes are created for parameterized types; consequently, generics incur no runtime overhead and the generated bytecode contains only ordinary classes, interfaces, and methods.
The type parameters [U <: T]  are removed and replaced by their upper bound T or Any. The process involves boxing and un-boxing the primitives types if they are used in the code as type parameters, degrading performance. 

Implicit conversion

Let consider a class ListCompare that compare lists of parametric type U bounded by the type Item (line 3).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
case class Item()
 
class ListCompare[U <: Item](
  xs: List[U]
)(implicit f: U => Ordered[U]) {

  def compare(xso: List[U]): Boolean = xso match {
    case str: List[String] => 
      if(xs.size == xso.size) 
        xs.zip(xso).exists( x=> x._1.compareTo(x._2) != 0) 
      else false
    
    case n: List[Int] => 
      if(xs.size == xso.size) 
        xs.zip(xso).exists(x => x._1 != x._2) 
      else false
   
    case _ => false
  }
}

The class has to have an implicit conversion U => Ordered[T] (line 5) to support the comparison of strings. The code above will generate the following warning message: "non-variable type argument String in type pattern List[String] is unchecked since it is eliminated by erasure". The warning message is generated by the compiler because the two parameterized types, List[String] (line 7) and List[Int] (line 12) may not be available to the JVM at the time of the execution of the pattern matching.

Manifest

One solution is to use a Manifest. A Manifest[T] is an opaque descriptor for type T. It allows access to the erasure of the type as a class instance. The most common usage of manifest is related to the creation of native Arrays if the class is not known at compile time as illustrated in the code snippet below.

def myArray[T] = new Array[T](0)
def myArray[T](implicit m: Manifest[T]) = new Array[T](0)
def myArray[T: Manifest] = new Array[T](0)

  • The first line of code won't compile. 
  • The second function will maintain the erasure by passing the manifest as an implicit parameter. 
  • The last function defined the manifest as a context bound. 
Let's re-implement the compare method with a Manifest specified as an implicit parameter.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
class ListCompare[U <: Item](
  xs: List[U]
)(implicit f: U => Ordered[U]) {

  def compare(
    xso: List[U]
   )(implicit u: Manifest[List[U]]): Boolean = {
    if( u <:< manifest[List[String]] ) 
      if( xs.size == xso.size)
        xs.zip(xso)
          .exists( x=> x._1.compareTo(x._2) != 0) 
      else false
     
    else if(u <:< manifest[List[Int]])  
      if( xs.size == xso.size) 
        xs.zip(xso).exists(x => x._1!=x._2) 
      else false
     
    else false
  }
}

The code will compile and execute as expected. It relies on the "<:<" type operator, which is an old variant (lines 8 & 14). In any case, the code is far from being elegant and quite difficult for an inexperience Scala developer to grasp. There should be a better way.

Specialization

One better option is to generate a separate class for the primitive type, Int and String using the @specialized annotation. The annotation @specialized (line 1) forces the compiler to generate byte code for each primitive listed as specialized type. For instance the instruction @specialized(Int, Double) T generates two extra, efficient classes for the primitive types Int and Double. The original ListComp class and its method compare are re-written using the annotation as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
class ListComp[@specialized Int, String, U <: Item](
   xs:List[U]
)(implicit f: U =>Ordered[U]) {
   
  def compare(xso: List[U]): Boolean =
    if(xs.size == xso.size) 
      xs.zip(xso)
        .exists( x => x._1.compareTo(x._2)!=0)   
    else 
      false
}

The code above will not throw a warning or error. However there is not such a thing as a free lunch, as the compiler generates extra byte code for the methods associated to the specialized primitive types. The objective in this case is to trade a higher memory consumption for performance improvement.

Alternative mechanisms

There are alternative, granular options to avoid type erasure error when compiling Scala containing generics:
  • T: ClassTag: Given a Collection[Any] it provides an implicit conversion to Collection[T]. It is not as powerful as TypeTag
  • T: TypeTag: Allows to differentiate between Higher Kinds such as List[String] and List[Int] for instance. However, it cannot differentiate A[T] from A[U] if A is an abstract class.
  • T: WeakTypeTag: Allow differentiation between concrete and abstract classes.

Thank you for reading this article. For more information ...

References

  • Programming in Scala M. Odesky, L.Spoon, B. Venners - Artima 2008
  • Scala for the Impatient - Cay Horstman Addison-Wesley 2012
  • github.com/patnicolas

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.