Thursday, February 4, 2016

Revisit Newton-Raphson

Target audience: Intermediate
Estimated reading time: 3'

The Newton-Raphson method stands out for its straightforwardness, both conceptually and in execution. We'll delve into two distinct Scala implementations and wrap up the discussion by weighing the pros and cons of the Newton-Raphson approach.


Table of contents
Follow me on LinkedIn

The Newton-Raphson is a well-know technique to compute the root of a function, f(x) = 0 or the minimum/maximum of a function using its derivative f'(x) = 0.

First implementation

Let's dive into the computation of the root x of a function f. A function is defined by its Taylor series of derivatives as follow:
\[f(s)=\sum_{0}^{\infty }\frac{f^{(n)}(x)}{n!}(s-x)^{n}\] In case of the root f(s), the equation becomes
\[0=f(x)+f'(x)(x-s)+O((x-s)^{2}f"(s))\] resulting to
\[x_{n+1}=x_{n}- \frac{f(x_{n})}{f'(x_{n})}\]

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
final class NewtonRaphson(
   f:   Double => Double,
   df:  Double => Double,
   dff: Option[Double => Double] = None
) {
   final val EPS = 0.01

   @scala.annotation.tailrec
   def root(z: Double): Double = dff.map(_dff => {
       val nextX = x - f(x) / df(x)
       if (Math.abs(nextX - x) < EPS) nextX 
       else root(nextX)
    }
}

The constructor (lines 2 to 4) takes 3 arguments:
  • Function which root is to be computed
  • First order derivative
  • Optional second order derivative that is not used in this first implementation
The formula is implemented by the method root (lines 9 to 12). Let's apply the root method to a concrete example, with function f and its derivative ff:

1
2
3
4
5
val f = (x: Double) => Math.pow(x, 5.0) - 2.0*x
val ff = (x: Double) => 5.0*Math.pow(x,4.0) -2.0
 
val nr1 = new NewtonRaphson(f, ff)
val z1 = nr1.root(7.9)

So far so good. However it is assumed that the function has a single root, or in the case of a maximum/minimum, its derivative has single root, at least in the vicinity of the initial data point. What about computing the root of a function which may have no or several root?
Let's look at the following function g and its derivative gg


1
2
3
4
val g = (x: Double) => Math.exp(0.5 * x)
val gg = (x: Double) => 0.5 * g(x)

val nr3 = new NewtonRaphson(g, gg)

The call to the Newton-Raphson method nr5 will diverge. In this particular case, the computation generates a Double.NaN, at each iteration.
There are several options to guarantee that the method will properly exit in the case no root exists. Let's look at one solution that leverages on the second derivative.

Second derivative to the rescue

We need to go back to basics: in its simplest form, the Newton-Raphson does not take into account the derivative of order higher than 1. The second order derivative, f" can be used as rough approximation of the error on the approximation of the root. The error san be evaluated at each iteration against a convergence criteria EPS.
\[\Delta \varepsilon = \varepsilon _{n+1}- \varepsilon _{n} \sim \frac{f'(x_{n})}{f"(x_{n})} < EPS\] \[x_{n+1}=x_{n}-\frac{f(x_{n})}{f'(x_{n})}(1 + \Delta \varepsilon)\] The approximation of the error is also used to validate whether the algorithm converges toward a solution (root) or starts to diverge. Let's implement this variation of the Newton-Raphson using the second order derivative as a convergence criteria.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
@scala.annotation.tailrec
def root(z: Double): Double = dff.map(_dff => {

   @scala.annotation.tailrec
   def nr2ndOrder(xe: (Double, Double)): (Double, Double) = {
      val x = xe._1
      val eps = xe._2
      
      val nextEps = Math.abs(df(x) / _dff(x))
      val nextX = (x - f(x) *(1.0 + nextEps)/df(x))

      // rules to exit recursion
      if (eps > 0.0 && nextEps >= eps)
         (-1.0, -1.0)
      else if (Math.abs(nextEps - eps) < EPS) 
         (nextX, nextEps)
      else 
         nr2ndOrder((nextX, nextEps))
    }
  
    nr2ndOrder((z, 0.0))._1
}
 ).getOrElse(DIVERGE)

Once again, the computation of the root is implemented as an efficient tail recursion defined in the method nr2ndOrder (line 5)
The convergence criteria is implemented as follow:
  • if error, eps increases, exit the recursion => Diverge (lines 13 and 22)
  • if eps decreases with a value within the convergence criteria => Converged (line 15)
  • if eps decreases with a value higher than the convergence criteria => Recurse (line 18)

Summary

Let's compare the convergence of the simplest Newton-Raphson method with the convergence its variant using the 2nd order derivative, using the function f defined in the first paragraph.

Clearly, the adaptive step size using the second order derivative speeds up the convergence toward the root of the function f. The selection of the appropriate method comes down to the trade-off between the overhead of the computation of the second derivative and the larger number of iterations or recursions required to find the root within a predefined convergence criteria.

References