Thursday, May 24, 2018

Streaming Moving Averages with Apache Spark

Target audience: Intermediate
Estimated reading time: 5'

Spark Streaming offers an abstraction called Discretized Stream, or DStream, which signifies a continuous flow of data. This can either stem from an initial source or emerge as a result of transforming the input data. Essentially, a DStream is characterized by a consistent sequence of Resilient Distributed Datasets (RDD) [1].


Important notes
  • This post describes the streaming features as implemented in Apache Spark 2.0.x 
  • DStreams should not be confused with structured streaming [2].

Use case

The creation of models through supervised learning from a given training set usually requires the application of pre-processing algorithms, such as smoothing, filtering, or extrapolating existing data to handle missing values. 

Computing the moving average of a time series is a simple and convenient technique used to filter out noise and reduce the impact of outliers on trend lines. For this post, we consider the simplest form of moving average over a period of p values, as defined in the following formula \[\tilde{x_{t}}= \frac{1}{p}\sum_{j=t-p+1}^{t}x_{j}\] 
The iterative (or recursive) form is defined as \[\tilde{x_{t}}=\tilde{x_{t-1}}+\frac{1}{p}(x_{t}-x_{t-p})\] 
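
Here is a minimal sketch of the iterative form in plain Scala (the function name movingAverage and the use of Seq[Double] are illustrative, not from the original post):

def movingAverage(series: Seq[Double], p: Int): Seq[Double] = {
   require(p > 0 && series.size >= p, s"Period $p should be within [1, ${series.size}]")

      // First average computed directly over the initial p values
   val first = series.take(p).sum / p

      // Recursive form: avg(t) = avg(t-1) + (x(t) - x(t-p))/p
   series.drop(p).zip(series).scanLeft(first) {
      case (prev, (xt, xtMinusP)) => prev + (xt - xtMinusP)/p
   }
}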

Spark DStreams

This section refers to discretized streams that were introduced in Spark 1.x. Structured streaming will be discussed in a future post.
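
As an illustration only, here is a minimal sketch of a windowed moving average over a DStream. The socket source on localhost:9999, the 10-second window and the 1-second batch interval are assumptions for this example, not values from the original post.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingMovingAverage extends App {
    val conf = new SparkConf().setAppName("MovingAverage").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

       // Each line received on the socket carries one numeric observation
    val values = ssc.socketTextStream("localhost", 9999).map(_.toDouble)

       // Average the observations over a 10-second window, sliding every second
    val smoothed = values
        .window(Seconds(10), Seconds(1))
        .transform(rdd => if (rdd.isEmpty) rdd else rdd.context.parallelize(Seq(rdd.mean)))

    smoothed.print()
    ssc.start()
    ssc.awaitTermination()
}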



Thank you for reading this article. For more information ...

References



---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3

Monday, April 23, 2018

Overload Operators in C++ & Scala

Target audience: Beginner
Estimated reading time: 4'

Operator overloading allows for a reinterpretation of existing operators by imparting a distinct meaning to them, all while preserving their original function. This is an example of compile-time polymorphism.
While C++ offers the capability to assign unique meanings to operators for specific data types, how does its approach align with that of Scala's operator overloading?



Overview

In object-oriented programming, operator overloading is a specific case of polymorphism, where different operators may have different implementations depending on their arguments. The main benefit of operator overloading is to allow the developer to program using notation relevant to the domain (e.g. the summation of vectors) and to give user-defined types the same level of syntactic support as built-in types.

Both C++ and Scala support the overloading, or redefinition, of operators.
There is, however, a distinction between the two languages: in Scala, operators are simply methods whose names the programmer chooses, whereas in C++ only a predefined set of built-in operators can be overloaded.

Notes
  • For the sake of readability, all non-essential code such as error checking, comments, exception handling, validation of class and method arguments, scoping qualifiers or imports is omitted.
  • The Scala version used is 2.11.8

C++ operator overloading

In C++, operator overloading instructs the compiler to perform a certain operation when its corresponding operator is used on one or more variables. However, a few operators, such as the scope resolution operator ::, member access ., the pointer-to-member operator .* and the ternary conditional ?:, cannot be overloaded.

#include <iostream>

class Point {
private:
   int x;
   int y;
public:
      // Constructor
   Point(int x = 0, int y = 0) : x(x), y(y) { }

      // Overloading the += operator (in-place addition)
   Point& operator+=(const Point& p) { 
       x += p.x;
       y += p.y;
       return *this;
    }
  
      // Overloading the + operator (returns a new Point)
   Point operator+(const Point& p) const {
       Point result;
       result.x = x + p.x;
       result.y = y + p.y;
       return result; 
   }

      // Overloading the stream insertion operator
   friend std::ostream& operator<<(std::ostream& out, const Point& p) {
        out << "(" << p.x << "," << p.y << ")";
        return out;
   }
};


int main() {
    Point p1(4, 5), p2(5, 11);
    std::cout << "p1=" << p1 << " p1 + p2=" << p1 + p2;
    return 0;
}

Scala overloading by proxy

Scala does not technically have operator overloading because it does not have operators in the usual sense: operators are simply method names chosen by the programmer. Scala treats all operators as methods and therefore supports operator overloading by proxy.

final class Point(val x: Int, val y: Int) {
 
   // Binary operator
   def +(p : Point) : Point = {
       require(p != null, "cannot add a null point")
       new Point(x + p.x, y + p.y)
    }
    
    override def toString : String = s"(${x.toString},${y.toString})"           
}
 
object Overload extends App {
    val p1 = new Point(2, 0)
    val p2 = new Point(4, -2)
       
    Console.println(s"p1=$p1 and p1+p2=${(p1 + p2)}")
}

The class Point is immutable because its coordinates x and y are declared as immutable values (val). It is highly recommended to create immutable objects and containers because doing so simplifies the object's state lifecycle. Immutability avoids complex and error-prone concurrent updates of an object's state (race conditions).
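
As a short illustration (a sketch, not part of the original post): because Point defines + and is immutable, the compiler gives the look and feel of C++'s += by rewriting p += q into p = p + q whenever p is declared as a var.

object OverloadByReassignment extends App {
      // 'p' must be a var for the compiler to desugar '+=' into a reassignment
    var p = new Point(2, 0)
    p += new Point(4, -2)       // equivalent to: p = p + new Point(4, -2)
    Console.println(p)          // prints (6,-2)
}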

Thank you for reading this article. For more information ...

References

  • Scala for the Impatient - C. Horstmann - Addison-Wesley 2012
  • The Annotated C++ Reference Manual - M. Ellis, B. Stroustrup - Addison-Wesley 1991
  • github.com/patnicolas

---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3

Friday, March 16, 2018

Fisher-Yates Shuffle in Scala

Target audience: Intermediate
Estimated reading time: 4'


This post describes the Fisher-Yates algorithm to shuffle randomly a sequence of objects.

Note: The code associated with this article is written using Scala 2.12.4

Overview

The generation of a sequence of randomly ordered elements from an existing sequence is a common problem. How can you be sure if the new sequence of items is actually randomly generated?
The Fisher-Yates shuffle algorithm (a.k.a. the Knuth shuffle) produces unbiased permutations: every permutation is equally likely.
One important benefit of Fisher-Yates is that it shuffles the elements of the sequence in place.


Note:
There are several variants of the shuffling algorithm. This post describes the implementation of the most common form of the Fisher-Yates algorithm.


Fisher-Yates

Here is the Fisher-Yates algorithm in a nutshell. Let's consider an array arr of n elements:
For each array index k from n-1 down to 1
    Select a random index m between 0 and k
    Swap arr[m] and arr[k]
import scala.collection.mutable
import scala.util.Random.{setSeed, nextInt}

def fisherYates[T](t: mutable.Seq[T]): Seq[T] = {
   require(t.nonEmpty, "Cannot shuffle an undefined sequence")
   setSeed(System.currentTimeMillis)
  
   (0 until t.size).foreach (n => {
       // Select a random index in the remaining range [n, t.size)
       val randomIdx = n + nextInt(t.size - n)
       // Swap the randomly selected element with the current one
       val tmp = t(randomIdx)
       t.update(randomIdx, t(n))
       t.update(n, tmp)
   })
   t
} 

The Fisher-Yates algorithm executes as an iteration with the following two steps:
  • Select a random item in the remaining portion of the sequence (index randomIdx)
  • Swap the randomly selected item with the current item

Fast Fisher-Yates implementation

The performance of the Fisher-Yates algorithm can be improved by implementing the swap of the randomly selected item with the current item in place, using the XOR bit manipulation operator.

def fisherYates2(seq: mutable.Seq[Int]): IndexedSeq[Int] = {
  require(seq.nonEmpty, "Undefined argument")
  setSeed(System.currentTimeMillis)

  (0 until seq.size).map(i => {
     val randomIdx: Int = i + nextInt(seq.size - i)
     // XOR swap, skipped when the two indices are identical
     // (XOR-ing an element with itself would zero it out)
     if (randomIdx != i) {
        seq(randomIdx) ^= seq(i)
        seq(i) = seq(randomIdx) ^ seq(i)
        seq(randomIdx) ^= seq(i)
     }
     seq(i)
  })
}

The following code snippet tests the two implementations of the Fisher-Yates shuffling algorithm.
  val input = Array[Char](
    'a', 't', '4', '&', '9', '2', 'z', 'y', 'r', '8', '*', '?'
  )
  println(fisherYates[Char](input).mkString(","))
  // t,&,*,2,a,r,4,z,?,y,8,9

  println(fisherYates2(Array.tabulate(34)(n => n)).mkString(","))
  // Prints a random, unbiased permutation of 0, 1, ..., 33
Note: The Scala standard library also provides a shuffling method, Random.shuffle, illustrated below.
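
A minimal usage sketch of the standard library method, reusing the input array defined above:

  import scala.util.Random

  val shuffled: Seq[Char] = Random.shuffle(input.toSeq)
  println(shuffled.mkString(","))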

Thank you for reading this article. For more information ...

---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3

Wednesday, February 7, 2018

Implementing AdaGrad Optimizer in Spark

Target audience: Intermediate
Estimated reading time: 5'

Have you ever desired a Gradient Descent approach with greater flexibility? If that's the case, Adaptive Gradient Descent (AdaGrad) could be a suitable alternative.


What you will learn: The workings of the Adaptive Gradient algorithm, how it improves upon Stochastic Gradient Descent, and its implementation using Apache Spark.



Notes
  • This post assumes that the reader has a rudimentary knowledge of the Scala API of Apache Spark and a basic understanding of machine learning.
  • The code associated with this article is written using Scala 2.12.11 and Spark 3.1.0

Background

The optimization algorithm known as Stochastic Gradient Descent (SGD) is frequently employed to minimize the loss function during the training of various machine learning models, including support vector machines, logistic regression, and back-propagation neural networks. Typically, in its most basic form, SGD calculates the gradient using a single learning rate.

It's quite usual to find that the features of a model exhibit significant variance across different observations. Under such circumstances, an adaptive gradient (AdaGrad) algorithm that assigns a unique learning rate to each feature could be an effective solution. There exist numerous methods to develop an algorithm that assigns a distinct learning rate to every feature. 
Incidentally, the Spark machine learning library, MLlib, does not include an implementation of AdaGrad.
This article discusses the AdaGrad algorithm and an implementation for Apache Spark.

Stochastic Gradient Descent (SGD)

Apache Spark stands out as a rapid and versatile cluster computing system, offering advanced APIs in Java, Scala, Python, and R, along with an enhanced engine capable of supporting general execution graphs.

Within the Apache Spark ecosystem lies MLlib, a machine learning library. The stochastic gradient descent optimizer within this library is a randomized variant of the (batched) gradient descent algorithm, which is employed to minimize a continuous, differentiable objective function. In supervised machine learning, this objective function typically takes the form of a loss function, such as the logistic loss or the sum of least squares. \[L(w)=\frac{1}{n}\sum_{i=1}^{n}(y_{i}-f(x_{i}|w))^{2}\] The objective function L is expressed as a summation of differentiable functions, where each term is the loss associated with a single observation: a continuous, differentiable, convex function. \[L(w)=\sum_{i=1}^{n}L_{i}(w)\] In supervised learning, the vector w represents the vector of weights (or model parameters). At each iteration of gradient descent, the weights are updated using the formula \[w_{t+1}=w_{t}-\eta\,\nabla L(w_{t}) = w_{t}-\eta \sum_{i=1}^{n}\nabla L_{i}(w_{t})\] Stochastic Gradient Descent (SGD) aims to reduce the loss function, which represents the difference between the model's predicted values and the expected values. At each iteration, SGD picks a small group of observations, referred to as a mini-batch, and approximates the full gradient with the gradient computed over that mini-batch. This repetitive procedure is designed to progressively approach the true global minimum of the loss function.
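
Before moving to the adaptive variant, here is a minimal, framework-free sketch of a single mini-batch update matching the formula above. The names (sgdStep, lossGradient) and the use of dense Array[Double] weights are assumptions for illustration, not MLlib APIs.

   // One SGD step: w(t+1) = w(t) - eta * average gradient over the mini-batch
def sgdStep(
    weights: Array[Double],
    miniBatch: Seq[(Array[Double], Double)],      // (features, label) pairs
    lossGradient: (Array[Double], (Array[Double], Double)) => Array[Double],
    eta: Double): Array[Double] = {

      // Gradient of the loss averaged over the observations of the mini-batch
   val gradient = miniBatch.map(obs => lossGradient(weights, obs))
       .transpose
       .map(_.sum/miniBatch.size)

      // Weight update
   weights.zip(gradient).map { case (w, g) => w - eta*g }
}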

Adaptive Gradient Descent

The core concept of AdaGrad revolves around the strategy of boosting the learning rate for sparse features (or model parameters) while reducing it for more dense features. As a result, AdaGrad enhances the efficiency of converging towards the minimum loss in models with sparse features, particularly when these sparse features hold significant information. \[w_{t+1,i}=w_{t,i} -\frac{\eta }{\sqrt{\sum_{\tau=1}^{t}\left ( \frac{\partial L}{\partial w_{\tau,i}} \right )^{2} + \varepsilon }}\;\frac{\partial L}{\partial w_{t,i}}\]
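
A minimal sketch of this per-feature update in plain Scala (the names and the default epsilon = 1e-8 are illustrative assumptions):

   // One AdaGrad step: returns the new weights and the updated history of squared gradients
def adaGradStep(
    weights: Array[Double],
    gradient: Array[Double],
    history: Array[Double],       // accumulated squared gradients, one entry per feature
    eta: Double,
    epsilon: Double = 1e-8): (Array[Double], Array[Double]) = {

      // Accumulate the squared gradient for each feature
   val newHistory = history.zip(gradient).map { case (h, g) => h + g*g }

      // Per-feature learning rate: eta / sqrt(accumulated squared gradient + epsilon)
   val newWeights = weights.indices.map(i =>
       weights(i) - eta/Math.sqrt(newHistory(i) + epsilon) * gradient(i)
   ).toArray
   (newWeights, newHistory)
}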

SGD in Apache Spark

The Apache Spark MLlib library provides two implementations of SGD:
  • Generic Gradient Descent and related classes in the mllib.optimization package
  • SGD bundled with classifier or regression algorithms such as LogisticRegressionWithSGD, LassoWithSGD, SVMWithSGD or RidgeRegressionWithSGD

We plan to utilize the optimization package to modify the stochastic gradient descent. Our goal is to use the mllib.optimization.GradientDescent template class from MLlib and create a custom Updater that implements an adaptive gradient with per-feature learning rates.

This Updater is responsible for "updating the model's weights" (in models like Logistic Regression or SVM) by multiplying the current learning rate with the partial derivative of the loss with respect to each weight, as previously described. We'll name our custom Updater AdaGradL2Updater, since it combines the adaptive gradient with L2 regularization; it handles the model weight updates using the adaptive gradient approach. Following this, we initialize the SGD optimizer in the following manner.
   val adaSGD = new GradientDescent(new LogisticGradient, new AdaGradL2Updater(dimension))
                    .setStepSize(0.01)
                    . .....
The class AdaGradL2Updater has to implement the generic compute method:
  Updater.compute(
      weightsOld: Vector, 
      gradient: Vector, 
      stepSize: Double, 
      iter: Int, 
      regParam: Double
  ): (Vector, Double)
The method returns the tuple (vector of new weights, loss). Let's implement the AdaGrad algorithm.


Implementation of AdaGrad

As mentioned earlier, the implementation of AdaGrad consists of overriding the method Updater.compute.

The computation of the learning rate requires us to record the past values of the squared gradient (from previous iterations) for each weight, in the array gradientsHistory. The method += updates this gradient history; its first invocation allocates and initializes the history array.
The next step consists of converting the existing (old) weights into a Breeze dense vector, brzWeights. The array of per-feature learning rates is computed by the inverseVelocity method.

The learning rates are zipped with the old weights to produce the new weights, newWeights, as a dense vector. The low-level linear algebra on the Spark worker nodes is delegated by Breeze to native BLAS/LAPACK routines through the calls to brzAxpy and brzNorm.

import breeze.linalg.{DenseVector, Vector => BV, axpy => brzAxpy, norm => brzNorm}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.optimization.Updater

// Note: the conversions between MLlib and Breeze vectors (toBreeze/fromBreeze) are
// package-private in MLlib, so this class has to reside under the org.apache.spark.mllib package.
final class AdaGradL2Updater(dimension: Int) extends Updater {
  var gradientsHistory: Array[Double] = _

  override def compute(
       weightsOld: Vector,
       gradient: Vector,
       stepSize: Double,
       iter: Int,
       regParam: Double
  ): (Vector, Double) = {

       // Accumulate the squared gradient for each weight
    this += gradient
    val brzWeights: BV[Double] = weightsOld.toBreeze.toDenseVector

       // Apply the per-feature learning rates to the L2-regularized old weights
    val sumSquareDerivative = inverseVelocity.zip(brzWeights.toArray)
    val newWeights: BV[Double] = 
       new DenseVector[Double](sumSquareDerivative.map {
          case (coef, weight) => weight * (1.0 - regParam * coef)
       })

       // newWeights = newWeights - gradient
    brzAxpy(-1.0, gradient.toBreeze, newWeights)
    val norm = brzNorm(newWeights, 2.0)

    (Vectors.fromBreeze(newWeights), 0.5 * regParam * norm * norm)
  }

      // Update the history of squared gradients, one entry per weight;
      // the first invocation allocates and initializes the history array
  private def +=(gradient: Vector): Unit = {
    val grad = gradient.toArray
    
    grad.view.zipWithIndex.foreach {
      case (g, index) =>
        if (gradientsHistory == null)
          gradientsHistory = Array.fill(grad.length)(0.0)

        val existingGradient = gradientsHistory(index)
        gradientsHistory.update(index, existingGradient + g*g)
    }
  }

      // Inverse of the square root of the accumulated squared gradients
      // (the epsilon term avoids a division by zero)
  def inverseVelocity: Array[Double] = gradientsHistory.map(g => 1.0/Math.sqrt(g + 1e-8))
}

Analysis

Given a logistic regression model used to classify time series patterns, we compute and compare the training binary cross-entropy loss, without regularization, using the default MLlib SGD and our implementation of AdaGrad.

The AdaGrad algorithm has its limitations: the accumulated sum of squared gradients keeps growing over time, causing the learning rate to shrink and become infinitesimally small, which effectively stops the parameters from updating.

AdaGrad is not the only alternative to traditional SGD: RMSProp and Adam are also worth investigating. 

Thank you for reading this article. For more information ...

---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3

Thursday, December 7, 2017

Category Theory in Scala 2.x

Target audience: Advanced
Estimated reading time: 6'

"Mathematics is security. Certainty. Truth. Beauty. Insight. Structure. Architecture. Whether it is differential topology, or functional analysis, or homological algebra, it is all one thing. ..."  Paul Halmos


Notes
  • This post requires a solid knowledge of functional programming as well as an understanding of differential geometry.
  • Implementation relies on Scala 2.12.10

Overview

Scala is a premier language that seamlessly integrates functional programming and object-oriented paradigms, supporting various advanced concepts including higher-kinded types, functors, and monads. This article demonstrates Scala's ability to utilize covariant and contravariant functors for tensor analysis, particularly in the context of vector field applications.

Many Scala developers are familiar with key functional programming principles such as monads, functors, and applicatives. However, these concepts extend beyond Scala and functional programming in general. They originate from areas in mathematics like topology and algebraic topology. In fields such as differential geometry or differential topology, tensors, which utilize covariant and contravariant functors, are extensively employed.

Category theory

Introduction

Category theory is a branch of mathematics that deals with abstract structures and relationships between them. It's a high-level, formal language that's used to unify concepts across various fields of mathematics [ref 1]. Here's a breakdown of its fundamental aspects:
  • Categories: At the heart of category theory are categories. A category consists of objects and morphisms (also called arrows or maps) between these objects. Objects can be anything (sets, spaces, groups, etc.), and morphisms are ways of transforming one object into another. Importantly, these transformations must be able to be composed together, and this composition must be associative. Additionally, for each object, there is an identity morphism that acts neutrally in composition.
  • Morphisms: These are critical in category theory. A morphism represents a relationship or a function between two objects. It's not just a function in the traditional sense; it's a more generalized concept.
  • Functors: These are mappings between categories. A functor respects the structure of categories; it maps objects to objects, morphisms to morphisms, and preserves the composition and identity of morphisms.
  • Natural Transformations: These provide a way to transform one functor into another while respecting their action on objects and morphisms.
  • Universal Properties: These are properties that define an object in terms of its relationships with other objects, typically describing the 'most efficient' solution to a particular problem.
  • Limits and Colimits: These are concepts that generalize constructions like products, coproducts, intersections, and unions.
Category theory is known for its level of abstraction, making it a powerful tool in unifying concepts across different mathematical disciplines. It has found applications in fields such as algebra, topology, and computer science, particularly in the study of functional programming languages. By providing a high-level language for describing and analyzing mathematical structures, category theory helps reveal deep connections between seemingly disparate areas of mathematics.

Covariant functors

A covariant functor F from the category of topological spaces T to some algebraic category C is a function which assigns to each space X an object F(X) in C and to each continuous map f: X → Y a homomorphism F(f): F(X) → F(Y) with \[F(f\circ g)=F(f)\circ F(g)\qquad F(id_{X})=id_{F(X)}\]

I assume the reader has a basic understanding of functors [ref 2]. Here is a short overview:
A category C is composed of objects x and morphisms f defined as
\[C= \left \{ x_{i},\, f \in C \;|\; f: x_{i} \mapsto x_{j} \right \}\] A functor F is a map between two categories C and D that preserves the structure of objects and morphisms:
\[x\in C \Rightarrow F(x)\in D \qquad f: x \mapsto y \;\Rightarrow\; F(f): F(x)\mapsto F(y)\] 
Let's look at the definition of a functor in Scala with the "preserving" mapping method, map

trait Functor[M[_]] {
     def map[U, V](m: M[U])(f: U => V): M[V]
}

Let's define the functor for a vector (or tensor) field. A vector field is defined as a sequence or list of fields (i.e. values or function values).
type VField[U] = List[U]

trait VField_Ftor extends Functor[VField] {
      override def map[U, V](vu: VField[U])(f: U => V): VField[V] = vu.map(f) 
}

This particular implementation relies on the fact that List is a category with its own functor. The next step is to define the implicit class conversion VField[U] => Functor[VField[U]] so the map method is automatically invoked for each VField instance.
implicit class vField2Functor[U](vu: VField[U]) extends VField_Ftor {  
    
  final def map[V](f: U => V): VField[V] = super.map(vu)(f) 
}

By default Covariant Functors (which preserve mapping) are known simply as Functors. Let's look at the case of Covector fields.

Contravariant functors

A contravariant functor G is similar to a covariant functor, except that G(f) is a homomorphism from G(Y) to G(X), with \[G(f\circ g)=G(g)\circ G(f)\qquad G(id_{X})=id_{G(X)}\] A contravariant functor is a map between two categories that reverses the direction of morphisms:
\[f: x \mapsto y \;\Rightarrow\; G(f): G(y)\mapsto G(x)\]
trait CoFunctor[M[_]] {
     
   def map[U, V](m: M[U])(f: V => U): M[V]
}

The map method of the CoFunctor implements the relation M[V => U] => (M[U] => M[V]).
Let's implement a co-vector field using a contravariant functor. A co-vector field is a linear map from a vector field V to its scalar field T, that is V*: V => T.
A morphism on the category V* is therefore a function V => T, where V is a vector field and T is a scalar function value.
type CoField[V, T] = Function1[V, T] 

The co-vector field type, CoField, is parameterized on the vector field type V, which is an input (function) parameter. Therefore the functor has to be contravariant.

The higher-kinded type M[_] takes a single type as parameter (i.e. M[V]), but a co-vector field requires two types:
  • V: the vector field
  • T: the scalar type, which is the result of the inner product <.,.>
Fortunately, the contravariant functor CoField_Ftor associated with the co-vector field needs to be parameterized only on the vector field type V. The solution is to pre-define (or 'fix') the scalar type T using a higher-kind projection for the type L[X] = CoField[X, T]:     ({type L[X] = CoField[X,T]})#L

trait CoField_Ftor[T]  extends CoFunctor[({type L[X] = CoField[X,T]})#L ] {

  override def map[U,V](vu: CoField[U,T])(f: V => U): CoField[V,T] =  
    (v: V) => vu(f(v))
}

As you can see, the morphism over the type V on the category CoField is defined as f: V => U instead of f: U => V. A kind parameterized on the return type (Function1) would require the 'default' (covariant) functor. Once again, we define an implicit class to convert a co-vector field of type CoField to its functor, CoField2Ftor.
implicit class CoField2Ftor[U,T](vu: CoField[U,T]) extends CoField_Ftor[T] {
 
  final def map[V](f: V => U): CoField[V,T] = super.map(vu)(f) 
}

Monads

A monad is a concept from category theory, a branch of mathematics, which has become particularly important in functional programming. It's a way to represent computations as a series of actions in a specific context, providing a framework for handling side effects, manipulating data, and structuring programs. 
A monad is a type of functor, a structure that applies a function to values wrapped in a context. 
In functional programming, monads offer a way to deal with side effects (such as state or I/O) while keeping the pure functional nature of the code. A monad can then be defined by its three methods: unit, map & flatMap [ref 3].

trait Monad[M[_]] {
  def unit[T](t: T): M[T]
  def map[T, U](m: M[T])(f: T => U): M[U]
  def flatMap[T, U](m: M[T])(f: T => M[U]): M[U]
}

Example for the List monad, sketched in the code below:
  • unit(str: String) = List[String](str)
  • map(f: (str: String) => str.toUpperCase) = List[String](str.toUpperCase)
  • flatMap(f: (str: String) => List[String](str.toUpperCase)) = List[String](str.toUpperCase)
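
Here is a minimal sketch of a Monad instance for List matching the trait above (the object name ListMonad is illustrative):

object ListMonad extends Monad[List] {
    override def unit[T](t: T): List[T] = List[T](t)
    override def map[T, U](m: List[T])(f: T => U): List[U] = m.map(f)
    override def flatMap[T, U](m: List[T])(f: T => List[U]): List[U] = m.flatMap(f)
}

   // ListMonad.map(ListMonad.unit("hello"))(_.toUpperCase) == List("HELLO")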

Application to differential geometry

Category theory is applicable to differential geometry and tensor analysis on manifolds [ref 4]

Vector fields

Let's consider a 3-dimensional Euclidean space with basis vectors {ei} and a vector field V = (f1, f2, f3) [Note: we follow the Einstein convention on tensor indices].

The vector field at the point P(x,y,z) is the tuple (f1(x,y,z), f2(x,y,z), f3(x,y,z)).
A vector field over a k-dimensional space can be formally defined as 
\[f: \mathbf{x} \in \mathbb{R}^{k} \mapsto f(\mathbf{x})=\sum_{i=1}^{k}{f^{i}}(\mathbf{x})\,\mathbf{e}_{i}\] Example: \[f(x,y,z) = (2x+z^{3})\boldsymbol{\mathbf{\overrightarrow{i}}} + (xy+e^{-y}-z^{2})\boldsymbol{\mathbf{\overrightarrow{j}}} + \frac{x^{3}}{y}\boldsymbol{\mathbf{\overrightarrow{k}}}\] Now, let's consider the same vector V in a second reference frame (origin O' and basis vectors e'i).

\[f(\mathbf{x})=\sum_{i=1}^{k}{f'^{i}}(\mathbf{x})\,\mathbf{e'}_{i}\]
The transformation matrix Sij converts the coordinate functions fi into f'i. The tuple f = (fi), with lower indices, is the co-vector field for the vector field V.
\[S_{ij}: \begin{Vmatrix} f^{1} \\ f^{2} \\ f^{3} \end{Vmatrix} \mapsto \begin{Vmatrix} f'^{1} \\ f'^{2} \\ f'^{3} \end{Vmatrix}\]
The scalar product of the co-vector f' and the vector v(f) is defined as
\[< f',v> = \sum f'_{i}.f^{i}\]
Given the scalar product, we can define the co-vector field f' as a linear map
\[\alpha (v) = < f',v> \] 


Functors for vector fields

Now we can apply functors to operations on vector fields  [ref 5].

Let's consider a field of function values FuncD in two dimensions: v(x,y) = f1(x,y).i + f2(x,y).j. The vector field VFieldD is defined as a list of two function values.
type DVector = Array[Double]
type FuncD = Function1[DVector, Double]
type VFieldD = VField[FuncD] 

The vector is computed by applying the vector field to a specific point, P(1.5, 2.0). The functor is applied to the vector field vField to generate a new vector field, vField2.
val f1: FuncD = (x: DVector) => x(0)*x(1)
val f2: FuncD = (x: DVector) => -2.0*x(1)*x(1)
  
val vField: VFieldD = List[FuncD](f1, f2)
val value: DVector = Array[Double](1.5, 2.0)  
   // Scale each function value of the vector field by 4.0
val vField2: VFieldD = vField.map((f: FuncD) => (x: DVector) => 4.0*f(x))


A co-vector field, coField, is computed as the sum of the function values of the vector field. Next, we compute the product of the co-vector field and the vector field (scalar field product) by simply applying the co-vector coField (a linear map) to the vector field. Once defined, the morphism _morphism is used to generate a new co-vector field, coField2, through the contravariant functor method CoField2Ftor.map.

Finally, a new product, newProduct, is generated by applying the new co-vector field coField2 to the vector field vField.

   // Sum of the two function values of the vector field
val coField: CoField[VFieldD, FuncD] = (vf: VFieldD) => (x: DVector) => vf(0)(x) + vf(1)(x)
val contraFtor: CoField2Ftor[VFieldD, FuncD] = coField
val product: FuncD = coField(vField)
   // Morphism that scales each function value of a vector field by 3.0
val _morphism: VFieldD => VFieldD = (vf: VFieldD) => vf.map((f: FuncD) => (x: DVector) => 3.0*f(x))

val coField2: CoField[VFieldD, FuncD] = contraFtor.map(_morphism)
val newProduct: FuncD = coField2(vField)

Note: This section on the application of functors to vector fields can be extended to tensor fields and smooth manifolds.

Higher kinds in Scala 3

The implementation of functors in Scala 2 relies on a type system based on class types (including traits and object classes), along with type projections of the form T#X. Path-dependent types are described as a combination of singleton types and type projections. 
     ({type R[+A] = Either[Int, A]})#R
     ({type L[B] = Either[B, Int]})#L
Scala 3 changes the way higher-kind projections are expressed: it uses the underscore symbol, _, as the placeholder in type lambdas [ref 6]. The new, simplified formalism is:
     Either[Int, +_]
     Either[_, Int]
The new formalism can be used with existing Scala 2 code through a migration library [ref 7].
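
For reference, here is a minimal sketch (an assumption based on Scala 3 syntax, not code from the original post) of the contravariant functor above rewritten with an explicit Scala 3 type lambda [X] =>> F[X], which replaces the Scala 2 type projection:

trait CoFunctor3[M[_]] {
    def map[U, V](m: M[U])(f: V => U): M[V]
}

type CoField3[V, T] = V => T

   // The type lambda [X] =>> CoField3[X, T] fixes the scalar type T
class CoField3Ftor[T] extends CoFunctor3[[X] =>> CoField3[X, T]] {
    override def map[U, V](vu: CoField3[U, T])(f: V => U): CoField3[V, T] = (v: V) => vu(f(v))
}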



Thank you for reading this article. For more information ...

References

[4] Tensor Analysis on Manifolds - Chap 3, Vector Analysis on Manifolds - R. Bishop, S. Goldberg - Dover Publications 1980 
[5] Differential Geometric Structures - Operations on Vector Bundles, pp 20-22 - W. Poor - Dover Publications 2007


---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3