Tuesday, February 19, 2013

Scrum & Distributed Teams

Target audience: Beginner
Estimated reading time: 3'

Table of contents
Follow me on LinkedIn

Overview

The power of Scrum is about instant decisions and collaborative work within the team and with the product owner.  Most of the books and articles on the subject assume, indirectly, that the entire team sits in the same building or vicinity. Unfortunately, an increasing number of software companies have either engineering teams spread around the world, or allow telecommuting or both. Such organizations create challenge for management and more specifically for the Scrum master.

Challenges

  • Different culture or ethics. Some culture have different rules about accessibility outside normal business hours and privacy. Most of western Europe has very strict labor regulations that may have an impact of the quality of the collaboration between teams.
  • Language barrier. Beside the obvious challenges of accommodating different regional accents, people are sometimes hesitant to ask the speaker to repeat himself or herself, leaving the listener guessing and making inaccurate assumptions. 
  • Confusing communication.  Companies with distributed teams offer multiple channels of communication in order to improve collaboration. However this strategy is not without risk as the same message, report or request may differ between channels of communication.  
  • Difficulty to build team Considering Scrum principles focus on high collaboration, it is quite a challenge to build the team dynamic and motivation. 
  • Interaction too formal. Team spirit and commitment is built through informal interaction. Such tactic is certainly more difficult to implement within a distributed team than a group of engineers sharing the same office.

Tools

The last few years have seen a significant increase of the number of options available to facilitate collaboration across continents. 

1. Video-conferencing: This approach is more suited for weekly status meeting, spring planning and retrospective. Skype is a low cost solution to create a bond between the team members and reduce noise in communication. Managers can observe the mood of the teams through non-verbal communication.

2. Instant Messaging
I have found Instant Messaging is perfectly suited for quick one-on-one exchange between engineers.  It is foremost a very effective tool because it allows non-formal, abbreviated, short messages with link to document, source code,  test cases or meeting minutes to be reviewed by a peer. Some IM solutions such as Meebo include extra features that are make the archiving and search through messages easier.

3. Sharing Documents
Although Scrum is not conducive to large amount of documentation, there is always a need for functional specifications, design documents or test plan to be drafted and shared. DropBox or Google Drive provide a simple and effective platform for engineers to share and update documents, especially when used in conjunction with a micro blog. Once a document has been updated, reviewed and approved, it can be converted into a Wiki page

4. Micro blogs
As long at the are used judiciously, Micro blogs such as Twitter are a great way to  notify a group of engineers or the entire team of changes in procedures, summary of daily sprint stand-up, update status or announce meetings or corporate events. However management needs to monitor the content, tone and frequency of those posts.  Employees because insensitive and unresponsive because they are constantly bombarded with large numbers of messages.

5. Wiki
 Wiki are still the best medium for well-defined and complete documents such as coding standard, list of user stories for an incoming sprint, minutes of a retrospective Scrum meeting.

6. Email
From my personal experience, email is an overused and abused communication medium. It should be reserved for formal and/or confidential information

7. Meetings
Because of time zone differences, we cannot expect everyone from any part of the world to attend every meetings. Meeting should be run effectively with a clearly defined agenda, time limit and detailed minutes so engineers do not feel compelled to drop more critical tasks to attend a meeting because of the fear to miss on important information. Meetings should be restricted to evaluate proposal, make recommendation, bring different point of view, and eventually make decisions.

Management

1. Managing communication
As I mentioned earlier, some of the communication tools can be easily abused or misused. It is the responsibility of the manager to monitor any team or company wide communication for sake of legal, ethical implication as well as the overall productivity of the engineering group.

2. Managing collaboration
Some tasks such as peer programming require to have  two or more engineers collaborate on a specific problem. I believe manager are responsible to organize schedule and set business hours of different teams to overlap to facilitate technical collaboration and brainstorming.


References


Monday, January 7, 2013

Naive Bayes Classifier in Java

Target audience: Advanced
Estimated reading time: 5'



Table of contents
Follow me on LinkedIn

Introduction

The Naive Bayes approach is a generative supervised learning method which is based on a simplistic hypothesis: it assumes that the existence of a specific feature of a class is unrelated to the existence of another feature. This condition of independence between model features is essential to the proper classification.
Mathematically, Bayes' theorem gives the relationship between the probabilities of A and B, p(A) and p(B), and the conditional probabilities of A given B and B given A, p(A|B) and p(B|A).
In its most common form the Naive Bayes Formula is defined for a proposition (or class) A and evidence (or observation) B with \[p(A|B)= \frac{p(B|A).p(A)}{p(B)}\]
   - p(A), the prior, is the initial degree of belief in A.
   - p(A|B), the posterior, is the degree of belief having accounted for B
   - p(B|A)/p(B) represents the support B provides for A
The case above can be extended to a network of cause-effect conditional probabilities p(X|Y).


In case of the features of the model are known to be independent. The probability of a observation x =( ...,x,...) to belong to a class C is computed as: \[p(C|\vec{x})=\frac{\prod (p(x_{i}|C).p(C)}{p(\vec{x})}\]. It is usually more convenient to compute the maximum likelihood of the probability of a new observation to belong to a specific class by converting the formula above. \[log\,p(C|\vec{x}) = \sum log\,p(x_{i}|C) + log\,p(C) - log\,p(\vec{x})\]

Note: For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted.

Software design

The class in the example below, implements a basic version Naive Bayes algorithm. The model and its feature is defined by the nested class NClass. This model class defines the features parameters (mean and variance of prior observations) and the class probability p(C).  The computation of the mean and variances of prior is implemented in the NClass.computeStats method.Some of the methods, setters, getters, comments and conditional test on arguments are omitted for the sake of clarity. The kernel function is to be selected at run-time. This implementation supports any number of features and classes.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
public final class NaiveBayes implements Classifier {
  
  public final class NClass {
    private double[] _params         = null;
    private double[] _paramsVariance = null;
    private double   _classProb = 0.0;
 
    public NClass(int numParms) { 
      _params = new double[numParams];  
    }
 
    private void add(double[] data) {
      int numObservations = 0;
             
      _paramsVariance = new double[_params.length];
      for(int j = 0; j < data.length; ) {
        j++;
        for( int k = 0; k < _params.length; k++, j++) {
          _params[k] += data[j];
          _paramsVariance[k] += data[j]*data[j];
        }
        numObservations++;
      }
      _classProb = numObservations;
    }
 
    private void computeStats() {
      double  inv = 1.0/_classProb;
      double  invCube = invClassProb*invClassProb*invClassProb;
 
      for( int k = 0; k < _params.length; k++) {
        _params[k] /= _classProb;
        _paramsVariance[k] = _paramsVariance[k]*inv -
                   _params[k]*_params[k]*invCube;
      }
      _classProb /= _numObservations;
    }
  }
}

Kernel functions can be used to improve the classification observations by increasing the distance between prior belonging to a class during the training phase. In the case of 2 classes (Bernoulli classification) C1, C2 the kernel algorithm increases the distance between the mean values m1 and m2 of all the prior observations for each of the two classes, adjusted for the variance.


As Java as does not support local functions or closures we need to create a classes hierarchy to implement the different kernel(discriminant) functions. The example below defines a simple linear and logistic (sigmoid function) kernel functions implemented by nested classes. \[y = \theta x \,\,and\,\,y =\frac{1}{1+e^{-x}}\]

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
public interface Discriminant {
   public double estimate(double value);
}
       
    //Nested class that implement a linear Discriminant 
public static class DiscriminantKernel 
              implements Discriminant  {
   private double _theta = 1.0;
   public DiscriminantKernel(double theta) { 
     _theta = theta; 
   }  
   public double estimate(double value) { 
     return value*_theta; 
   }
}
             
       // Nested class that implements a sigmoid function for kernel
public static class SigmoidKernel implements Kernel {
  public double estimate(double value) { 
    return 1.0/(1.0 + Math.exp(-value) 
  }
}

Ultimately, the NaiveBayes class implements the three key components of the learning algorithm:
  • Training: train
  • Run time classification: classify
A new observation is classify using the logarithmic version of the Naive Bayes formula, logP
First let's define the NaiveBayes class and its constructors.


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
public final class NaiveBayes implements Classifier {

   public final class NClass { }

   private CDoubleArray[] _valuesArray = null;
   private NClass[] _classes = null;
   private int _numObservations = 0;
   private int _step = 1;
   private Kernel _kF = null;
      
   public NaiveBayes() { this(0,0) }
   public NaiveBayes(int numParams, int numClasses) { 
     this(numParams, numClasses, new NLinearDiscriminant());
   }
            
   public NaiveBayes(
      int numParams, 
      int numClasses, 
      final Discriminant kf
   ) {
     _classes = new NClass[numClasses];
     _valuesArray = new CDoubleArray[numClasses];
 
     for( int k = 0; k < numClasses; k++) {
       _classes[k] = new NClass(numParams);
       _valuesArray[k] = new CDoubleArray();
     }
     _kF = kf;
     this.discretize(0,numClasses);
  }
   ..
} 

Training

Next the training method, train is defined. The method consists merely in computing the statistics on historical data, _valuesArray and assign them to predefined classes _classes

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
public int train() throws ClassifierException {
  double[] values =  null;
           
  for( int j = 0; j < _valuesArray.length; j++) {
    values = _valuesArray[j].currentValues();
    _classes[j].add(values);
  }
           
  for( int j = 0; j < _classes.length; j++) {
    _classes[j].computeStats();
  }
  return values.length;
}

Classification

The run-time classification method classify uses the prior conditional probability to assign a new observation to an existing class. It generate the class id for a set of values or observations.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
public int classify(double[] values) {
           
   // Compute the normalizing denominator value
  double[] normalizedPriorProb = new double[values.length],  
           prob = 0.0;

  for( int valueIndex = 0; valueIndex < values.length; valueIndex++) {

    for(int classid = 0; classid < _classes.length; classid++) {
      prob = Math.abs(values[valueIndex] - 
          _classes[classid]._parameters[valueIndex]);
      if( prob > normalizedPriorProb[valueIndex]){               
          normalizedPriorProb[valueIndex] = prob;
      }
    }
  }
  return logP(values, normalizedPriorProb);
}

A new observation values is assigned to the appropriate class croding to its likelihood or log of conditional probability, by the method logP.
logP computes the likelihood for each value and use the Naive Bayes formula for logarithm of prior probability and log of class probability


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
private int logP(double[] values, double[] denominator) {
  double score = 0.0, 
         adjustedValue = 0.0, 
         prior = 0.0,
         bestScore = -Double.MAX_VALUE;
  int bestMatchedClass = -1;
                
  // Walk through all the classes defined in the model
  for(int classid = 0; classid < _classes.length; classid++) {
    double[] classParameters = _classes[classid]._parameters;
                     
    score = 0.0;
    for( int k = 0; k < values.length; k++) {
       adjustedValue = _kF.estimate(values[k]);
       prior = Math.abs(adjustedValue - classParameters[k])/
               denominator[k];
       score += Math.log(1.0 - prior);
    }
    score += Math.log(_classes[classid]._classProb);
                    
    if(score > bestScore) {
        bestScore = score;
        bestMatchedClass = classid;
    }
  }
  return bestMatchedClass;
}

Some of the ancillary private methods are omitted for the sake of clarification. We will look at the implementation of the same classifier in Scala in a later post.


References

  • The Elements of Statistics Learning: Data mining, Inference & Prediction - Hastie, Tibshirani, Friedman - Springer
  • Machine Learning for Multimedia Content Analysis - Y. Gong, W, Xu - Springer
  • Effective Java - J Bloch - Addison-Wesley

Saturday, December 15, 2012

Introduction of Scala to Java 7 Developers

Target audience: Beginner
Estimated reading time: 4'


Table of contents

The purpose of this post is to introduce the Scala programming language from the perspective of an experienced Java developer. It is assumed that the reader has a basic understanding of design patterns and multi-threaded applications.
In a nutshell, Scala is:
-
Object-oriented
- Functional
- Scalable (Actors and immutability)

The elements of the Scala programming language presented in this post are a small subset of the rich set of features developers would benefit from the language.

Notes: 
  • This post relies on Scala 2.10
  • For the sake of readability of the implementation of algorithms, all non-essential code such as error checking, comments, exception, validation of class and method arguments, scoping qualifiers or import is omitted.

Immutability

Scala relies on immutable object and collections to reduce complexity associated with multi-threaded applications, eliminate race condition, simplify debugging.
The CRU (
Create, Read, Update) object lifecycle in Java is reduced to CR (Create, Read) in Scala.
The Scala standard library offers a large variety of immutable collections.

val x = 23.5  // Read only
val mp = new immutable.Map[String, String]
val xs = List[Any]()


Traits vs. interface

Java interfaces are purely declarative. Scala traits allows:
* method implementation
* member value initialization
* abstract and
lazy values (to be initialized through sub-class or on demand)

Scala traits cannot be instantiated and do not have constructors. Scala emulates multiple inheritance by stacking traits (Cake pattern).

trait MyObject[T] {
   val tobeDefined: Int
   def add(t: T): Option[T] = {}
   def head: T = { }
}

Lazy values

Scala supports lazy values that are declared as member of a class but instantiated once and only once when they are accessed

class MyObject[T] {
    lazy val n = (n<<2) + Random.next(n)
    ...
}

Implicits

3rd party libraries and frameworks cannot always be extended through composition or inheritance. Scala implicit class/types and methods allows a "behind the scene" type conversion. The following code snippet converts a double value into a vector/array of double of size 1. The implicit is automatically invoke whenever it is imported into another section of the code base: for instance, the vector v is constructed through the implicit conversion dbl2Vec.

type DblVector = Array[Double]
    // Definition of implicit
implicit def dbl2Vec(x: Double): DblVector = Array.fill(1)(x)
  ...
val v: DblVector = 4.5

The following example extends the concept of Exception handling, Try with a concept of Either converting any instance of Try to an instance of Either using toEither method of the implicit class Try2Either. The implicit class conversion is used in handling a rounding error in the normalize function.

implicit class Try2Either[T](_try: Try[T]) {
   def toEither[U](r: ()=>U)(f: T=>T): Either[U, T] = _try match {
      case Success(res) => Right(f(res))
      case Failure(e) => Console.println(e.toString); Left(r())  
   }
}

/**
 * Method fails and recover if the denominator is too small
*/
def normalize(x: DblVector): Either[Recover, DblVector] = Try {
    val sum = x.sum 
    if(sum > 0.01) x.map( _ / sum) 
    else throw new IllegalStateException("prob out of range")  
}
.toEither(() => new Recover)((v: DblVector) => v.map( Math.log(_)))


Tail recursion

Although convenient, traditional recursive implementation of computation can be ineffective, leaving developers to rely on iterative algorithms.
Scala's tail recursion is a very effective way to iterate through a data sequence without incurring a performance overhead because it avoids the creation of new stack frames during the recursion.
The following code snippet implements the Viterbi algorithm used to extract the most likely sequence of latent states in an hidden Markov model given a lambda model and a set of observations.

@scala.annotation.tailrec
private def viterbi(t: Int): Double = {

  if( t == 0) initial   
  else { 
    Range(0, lambda.getN).foreach( updateMaxDelta(t, _) )   
    if( t ==  obs.size-1) {
        val idxMaxDelta = Range(0, lambda.getN).map(i => 
                    (i, state.delta(t, i))).maxBy(_._2)
        state.QStar.update(t+1, idxMaxDelta._1)
        idxMaxDelta._2
    }
    else viterbi(t+1)
  }
}

Actor model

Traditional multi-threaded applications rely on accessing data located in shared memory. The mechanism relies on synchronization monitors such as locks, mutexes or semaphores to avoid deadlocks and inconsistent mutable states. Those applications are difficult to debug because of race condition and incur the cost of a large number of context switches.
The Actor model addresses those issues by using immutable data structures (messages) and asynchronous (non-blocking) communication.
The actors exchange messages and data using immutable messages as described in the example below.

sealed abstract class Message(val id: Int)
case class Activate(i: Int, xt: Array[Double]) extends Message(i)
case class Completed(i: Int, xt: Array[Double]) extends Message(i)
case class Start(i: Int =0) extends Message(i)


An actor processes messages queued in its mailbox asynchronously, FIFO. The method, receive which returns a PartialFunction is the message processing loop, which correspond to the Runnable.run() in Java

final class Worker(
     id: Int, 
     fct: DblSeries => DblSeries) 
extends Actor {
    // Actor event/message loop
  override def receive = {
    case msg: Activate => {
      val msgId = msg.id+id
      val output = fct(msg.xt)
      sender ! Completed(msgId, output)
  }
}

Syntactic similarities

Java developers will find comfort knowing that Scala preserves a lot of constructs and idioms which make Java 7 such a wide spread and supported programming language.

 Construct  Scala 2.10Java 7
Variable         var x: Float = 1.0F  float x = 1.0F;
Value val n = 4L
Constants final val MAX_TEMP: Float = 232.0F final float MAX_TEMP = 232.0F;
Class class Person { ...} public class Person { }
Constructor class Person(
  name: String,
  age: Int) {}
public class Person {
....private String name;
....private int age;
....public Person(String name, int age) {}
}
Setters Not applicable
Immutability
public class Person {
.. public void setName(String name) {}
}
Getters class Person(
  val name: String,
  val age: Int) {}

....//Public access to member values.
public class Person {
...public final String getName() {}
}
Method Arguments validation def add(index: Int): Unit = {
....require(index>=0 && index<size, ..
..    
s"$index out of bounds")
...
public void add(int index) {
...if( index < 0 || index >= size) {
... ...throw new IllegalArgumentException(...);
....
Singleton object Man extends Person {} public class Man extends Person {
....private static Man instance = new Man();
... private Man();
....public static Man getInstance() {
........ return instance;
... }
}
Overriding override def getName(id : Int) : String {} @override
public final getName(int id) {}
Interfaces trait Searchable {
..def search(id : Int) : String
}
interface Searchable {
...String search(int id);
}
Type Casting n : Int = o.asInstanceOf[Int] Integer n = (Integer)o;
Type Checking n.isInstanceOf[Int] n instanceOf Integer
Generics class Stack[T](val maxSize : Int) {
....def push(t : T) : Unit = {}
....def pop : T = {}
}
public class Stack <T> {
.....private int maxSize;
.....public Stack(int maxSize) {}
.....public void push(T t) {}
.....public <T> pop() {}
}
Abstract Class abstract
class Collection[T](max : Int) {

...def add(t : T) : Unit
...def get(index : Int) : Int
}
abstract
public class Collection <T> {

...abstract public void add(T t);
.. abstract public <T> get(int index);
}
Exception Try { .... }
match { 
....case Success(res) =>{}
... case Failure(e) => {}
}
try { .. }
catch( NullPointerException ex) {}
catch( Exception ex) {}
Bounded type
Parameters
class List[T <: Float] {}
class List[T >: Node] {}
public class List<T extend Float> {}
public class List <T super Node> {}
Wildcard
Parameters
class List[+T] {}
class List[-Node] {}
public class List<? extends T> {}
public class List<? super Node> {}
Index-based
For loop
for( i <= 0 until n) {} for( int i = 0; i < n; i++) {}
Iteration
Collection
for(index <- indices) {}for(int index : indices) {}


Key differences

There are quite a few important and tricky programmatic difference between Scala 2.11 and Java 7
  • == In Scala == compares instances similarly to equals. In Java == compares the reference to instances
  • Comparable integer java.lang.Integer supports comparable interface while scala.Int does not
  • Static method Java supports explicit static methods. Scala does not, although static methods are implicitly defined as method of objects (Singletons)
  • Inner class In Scala, Inner class are accessed through the reference Self to the outer class. Java accesses Inner class through the Outer class
  • Checked exceptions Scala does not support Checked exceptions, Java does.
  • Primitive types As a pure object oriented, Scala does not support primitive types. Java does
  • Expression as a value As a functional language, Scala defines expressions such as if ... else ... as value, that can be evaluated and assigned to. Java does not
  • Immutable collections Scala supports immutable collections by default. Java does not.
  • Localized imports and type aliases Scala has scoped import and type aliases. Java does not
  • Explicit operator overloading Scala allows developers to overload operator through the definition of new methods. Java does not.
  • Actor Scala provides full parallelism with actor and immutable messages. Java does not.
  • Futures Scala futures are asynchronous and can be updated thorough promises. Java does not support the concept
  • Higher order kind Scala has higher order kind, such as Monad. Java does not
  • Tail recursion (or elimination) Scala supports tail recursion in order to avoid stack overflow. Java does not
  • Monad and functors are core concept in Scala, but not in Java.

Note This comparative analysis relies on version 2.10.x of Scala language and Java 7. Future version of Scala and Java may produce a slightly different result.

So much more....

This post barely scratches the surface of the capabilities of Scala which include
  • Futures
  • Partial functions and partially applied functions
  • Dependency injection
  • Monadic representation
  • Macros
  • View bounded type parameterization
  • F-Bounded polymorphism
  • Closures
  • Delimited continuation
  • Method argument currying

References