
Monday, October 23, 2023

Supercharge Python with DeepMind's JAX

Target audience: Intermediate
Estimated reading time: 5'
Ever hoped for NumPy to offer automatic differentiation and run math computations on a GPU? You might find DeepMind's JAX to be the solution.

In this piece, we'll delve into JAX's automatic differentiation capabilities and assess how its just-in-time execution compares to numpy.


Notes:
  • Library versions: python 3.11, JAX 0.4.18, Jax-metal 0.0.4 (Mac M1/M2), NumPy 1.26.0, matplotlib 3.8.0
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.
  • The performance evaluation (section Performance: JAX vs NumPy) relies on an AWS m4.2xlarge EC2 instance for the CPU tests and a p3.2xlarge instance, equipped with 8 virtual cores, 64GB of memory and an Nvidia V100 GPU, for the GPU tests.
  • JAX provides developers with a profiler to generate traces that can be visualized using the Perfetto visualizer; a minimal profiling sketch follows these notes.
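A minimal sketch of collecting a trace, assuming a scratch directory /tmp/jax-trace for the trace files:

import jax
import jax.numpy as jnp

# Collect a trace that can be loaded into the Perfetto UI (https://ui.perfetto.dev)
with jax.profiler.trace('/tmp/jax-trace', create_perfetto_link=True):
    x = jnp.arange(1_000_000)
    jnp.sin(x).block_until_ready()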

Introduction

As a quick recall, NumPy stands as a Python library for numerical and scientific computation. It equips data scientists and engineers with capabilities for working with multidimensional arrays, performing speedy array operations, and handling fundamental tasks in linear algebra and statistics [ref 1].

JAX [ref 2] is a numerical computing and machine learning library in Python, developed by DeepMind, that builds upon the foundation of NumPy. JAX offers the following capabilities, illustrated in the short sketch after this list:

  • Composable function transformations.
  • Auto-vectorization of data batches, enabling parallel processing.
  • First and second-order automatic differentiation for various numerical functions.
  • Just-in-time compilation for GPU execution [ref 3].
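As a brief illustration, here is a minimal sketch, with an arbitrary function f, that composes automatic differentiation, just-in-time compilation and auto-vectorization:

import jax
import jax.numpy as jnp

def f(x: float) -> float:
    return x**2 + jnp.sin(x)

df = jax.grad(f)                 # First order automatic differentiation
fast_df = jax.jit(df)            # Just-in-time compilation through XLA
batched_df = jax.vmap(fast_df)   # Auto-vectorization over a batch of inputs

print(batched_df(jnp.linspace(0.0, 1.0, 8)))

The three transformations compose freely, which is the core design idea behind JAX.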

Components

  • AutoGrad: An upgraded version of Autograd that improves the performance of automatic differentiation.
  • Accelerated Linear Algebra (XLA): JAX uses the XLA compiler to compile and run NumPy code on accelerators.
  • Just-in-time compilation (JIT): Runs computations through XLA for faster execution.
  • Perfetto: Visualization of profiler trace data.

Installation

Here is an overview of the basic steps for installing JAX. It is advisable to consult the installation guide, as each environment has specific requirements [ref 4]. A quick sanity check follows the installation commands.

CPU (MacOS, Linux)
  • pip install --upgrade "jax[cpu]"
GPU (Linux/CUDA)
  1. nvcc --version   # -> CUDA version to be used in the pip install below
  2. pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
GPU (MacOS/mps)
  • python3 -m venv ~/jax-metal
  • source ~/jax-metal/bin/activate
  • python -m pip install jax-metal
  • pip install ml_dtypes==0.2.0
Conda
  • conda install jax -c conda-forge
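Once JAX is installed, a quick sanity check confirms the version and which devices were picked up (the output shown in the comments is indicative only):

import jax

print(jax.__version__)    # e.g. 0.4.18
print(jax.devices())      # e.g. [CpuDevice(id=0)] on CPU or [cuda(id=0)] on GPU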

Automatic differentiation

Overview

Automatic differentiation is a tool that facilitates the automatic calculation of derivatives for a specified mathematical function [ref 5].

This technique efficiently determines precise derivatives by retaining details during the forward pass, which are then utilized during the backward pass. Essentially, 
  • It interprets code that calculates a function and leverages it to compute the function's derivative.
  • It crafts a software approach to efficiently determine the derivatives, bypassing the need for a closed-form solution.
This article focuses on forward mode automatic differentiation, which consists of replacing each primitive operation in the original program with its differential analogue.

To illustrate the concept, let's consider the function \[f(x,y,z)=2x^{2}-3xy+z \] and build its forward computation graph:
Fig 1. Simplified forward computation graph

Notes
  • This computation graph does not include data type conversion (Python values to JAX or NumPy arrays).
  • The limitation of the forward mode is that the gradient is computed by re-executing the program all over again. The solution is to store the derivatives so they can be chained and computed during a backward pass: reverse mode automatic differentiation. A minimal dual-number sketch of the forward mode follows these notes.
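To make the forward mode concrete, here is a minimal dual-number sketch, independent of JAX; the Dual class is ours, for illustration only. It propagates a value together with its derivative through the function f(x,y,z) = 2x^2 - 3xy + z:

class Dual:
    """ Value and derivative propagated together (forward mode) """
    def __init__(self, value: float, deriv: float):
        self.value = value
        self.deriv = deriv

    def __add__(self, other: 'Dual') -> 'Dual':
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __sub__(self, other: 'Dual') -> 'Dual':
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __mul__(self, other: 'Dual') -> 'Dual':
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def f(x: Dual, y: Dual, z: Dual) -> Dual:
    two = Dual(2.0, 0.0)
    three = Dual(3.0, 0.0)
    return two * x * x - three * x * y + z

# Seed dx/dx = 1 to obtain df/dx at (2, -1, 6); expected 4*2 - 3*(-1) = 11
result = f(Dual(2.0, 1.0), Dual(-1.0, 0.0), Dual(6.0, 0.0))
print(result.value, result.deriv)   # 20.0 11.0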

Single variable function

Let's implement a class, JaxDifferentiation, that wraps the computation of the first, second, ... order derivatives of a function of a single variable \[func: \mathbb{R} \rightarrow \mathbb{R}\]. The derivatives of the various orders are computed in the constructor and the method __call__ returns the list [f, f', f'', ...].
from typing import Callable, List
import jax

class JaxDifferentiation(object):
    """
        Create a set of derivatives of first, second, ..., order_derivative order
        :param func Differentiable function
        :param order_derivative Order of the derivatives
    """
    def __init__(self, func: Callable[[float], float], order_derivative: int):
        assert 0 <= order_derivative < 5, f'Order of derivatives {order_derivative} should be [0, 4]'

        # Build the list of derivatives f, f', f'', ...
        self.derivatives: List[Callable[[float], float]] = [func]
        temp = func
        for _ in range(order_derivative):
            # Compute the next order derivative of the single variable function
            temp = jax.grad(temp)
            self.derivatives.append(temp)

    def __call__(self, x: float) -> List[float]:
        """ Compute the derivatives of all orders for the value x """
        return [derivative(x) for derivative in self.derivatives]

Let's compute the derivatives of the following function: \[f(x)=2x^{4}+x^{3}, \quad \frac{\mathrm{d} f}{\mathrm{d} x}=8x^{3}+3x^{2}, \quad \frac{\mathrm{d}^{2}f}{\mathrm{d}x^{2}}=24x^{2}+6x\] The first and second order derivatives are provided for evaluation purposes (the oracle).
# Function definition
def func1(x: float) -> float:
   return 2.0*x**4 + x**3

# First order derivative
def dfunc1(x: float) -> float:
   return 8.0*x**3 + 3*x**2

# Second order derivative
def ddfunc1(x: float) -> float:
   return 24.0*x**2 + 6.0*x

funcs1 = [func1, dfunc1, ddfunc1]
jax_differentiation = JaxDifferentiation(func1, len(funcs1) - 1)

# Compare the oracle values with the JAX derivatives at x = 2.0
compared = [f'{oracle}, {jax_value}, {oracle - jax_value}'
                for oracle, jax_value in zip([func(2.0) for func in funcs1], jax_differentiation(2.0))]
print(compared)
Output
Oracle    JAX       Difference
40.0      40.0      0.0
76.0      76.0      0.0
108.0     108.0     0.0

Multi-variable function

The next step is to evaluate the computation of the partial derivatives of a multi-variable function f(x, y, ...), i.e. the Jacobian \[J=[\frac{\partial f}{\partial x_{1}}, \frac{\partial f}{\partial x_{2}}, ..., \frac{\partial f}{\partial x_{n}}]\]
Let's consider the following function, for which the first order partial derivatives (Jacobian vector) are provided: \[f(x,y,z)=2x^{2}-3xy+z, \quad \frac{\partial f}{\partial x} = 4x-3y, \quad \frac{\partial f}{\partial y} = -3x, \quad \frac{\partial f}{\partial z} = 1\]
# Function definition
def func2(x: List[float]) -> float:
  return 2.0*x[0]*x[0] - 3.0*x[0]*x[1] + x[2]

# Partial derivative over x
def dfunc2_x(x: List[float]) -> float:
  return 4.0*x[0] - 3.0*x[1]

# Partial derivative over y
def dfunc2_y(x: List[float]) -> float:
  return -3.0*x[0]

# Partial derivative over z
def dfunc2_z(x: List[float]) -> float:
  return 1.0

Let's compare the output of the direct computation of the symbolic derivatives (Oracle) dfunc2_x, dfunc2_y and dfunc2_z with the partial derivatives computed by JAX.
We use the forward mode automatic differentiation function jacfwd to compute the gradient [ref 6]. 
# Invoke the Jacobian vector forward function
dfunc2 = jax.jacfwd(func2)

y = [2.0, -1.0, 6.0]
derivatives = dfunc2(y)

print(f'df/dx: {derivatives[0]}, {dfunc2_x(y)}\ndf/dy: {derivatives[1]}, {dfunc2_y(y)}\ndf/dz: {derivatives[2]}, {dfunc2_z(y)}')
Output
          Oracle    JAX
df/dx:    11.0      11.0
df/dy:    -6.0      -6.0
df/dz:    1.0       1.0

Note: The reverse mode automatic differentiation JAX method, jacrev, would have produced the same result.
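A minimal check, reusing func2 and the input y from above (a sketch, not part of the original snippet):

from jax import jacrev

dfunc2_rev = jacrev(func2)
print(dfunc2_rev(y))    # [11.0, -6.0, 1.0], identical to the jacfwd output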

Performance: JAX vs NumPy

A significant drawback of the NumPy library is its absence of GPU support. The next objective is to measure the performance gains achieved by JAX, with and without its just-in-time compiler, on both CPU and GPU.

To facilitate this, we establish a class named JaxNumpyData containing two functions: np_func, which utilizes NumPy, and jnp_func, its JAX counterpart. These functions are applied to datasets of various sizes. The compare method extracts subsets of the initial dataset, in 5% increments, using a basic fraction-based approach.
class JaxNumpyData(object):
    """
        Initialize the NumPy and JAX functions to process data (arrays)
        :param np_func NumPy numerical function
        :param jnp_func Corresponding JAX numerical function
    """
    def __init__(self,
                 np_func: Callable[[np.array], np.array],
                 jnp_func: Callable[[jnp.array], jnp.array]):
        self.np_func = np_func
        self.jnp_func = jnp_func

    def compare(self, full_data_size: int, func_label: AnyStr):
        """
        Compare the execution of the NumPy, JAX and JAX-JIT functions on subsets of the dataset
        :param full_data_size Size of the original dataset used to extract the sub-datasets
        :param func_label Label used for performance results and plotting
        """
        for index in range(1, 20):
            fraction = 0.05 * index
            data_size = int(full_data_size*fraction)

            # Execute on the full_data_size*fraction elements using NumPy
            x_0 = np.linspace(0.0, 100.0, data_size)
            result1 = self.map_numpy(x_0, f'numpy_{func_label}')

            # Execute on the full_data_size*fraction elements using JAX and JAX-JIT
            x_1 = jnp.linspace(0.0, 100.0, data_size)
            result2 = self.map_jax(x_1, f'jax_{func_label}')
            result3 = self.map_jit(x_1, f'jit_{func_label}')

            del x_0, x_1, result1, result2, result3

    @time_it
    def map_numpy(self, np_x: np.array, label: AnyStr) -> np.array:
        """ Process the NumPy array, np_x, through the NumPy function np_func """
        return self.np_func(np_x)

    @time_it
    def map_jax(self, jnp_x: jnp.array, label: AnyStr) -> jnp.array:
        """ Process the JAX array, jnp_x, through the JAX function jnp_func """
        return self.jnp_func(jnp_x)

    @time_it
    def map_jit(self, jnp_x: jnp.array, label: AnyStr) -> jnp.array:
        """ Process the JAX array, jnp_x, through the JAX function jnp_func using JIT """
        from jax import jit
        return jit(self.jnp_func)(jnp_x)

The method map_numpy (resp. map_jax and map_jit) applies the NumPy function np_func (resp. the JAX function jnp_func) to the NumPy array np_x (resp. the JAX array jnp_x).

CPU

In this first performance test, we measure the time needed to compute \[f(x) = \sinh(x)+\cos(x)\] on 1,000,000,000 values using NumPy and JAX, with and without the just-in-time compiler.
def np_func1(x: np.array) -> np.array:
    return np.sinh(x) + np.cos(x)

def jnp_func1(x: jnp.array) -> jnp.array:
    return jnp.sinh(x) + jnp.cos(x)
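A minimal driver for this benchmark, assuming the JaxNumpyData class and time_it decorator defined earlier (the label 'sinh_cos' is an arbitrary choice):

data_comparison = JaxNumpyData(np_func1, jnp_func1)
data_comparison.compare(1_000_000_000, 'sinh_cos')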
JAX produces a seven-fold performance improvement over NumPy. The just-in-time compiler adds another 35% improvement.


The second latency test computes the mean value \[mean=\frac{1}{n} \sum_{i=1}^{n} x_{i}\] of a NumPy and a JAX array of 1,200,000,000 values.
def np_func2(x: np.array) -> np.array:
    return np.mean(x)

def jnp_func2(x: jnp.array) -> jnp.array:
    return jnp.mean(x)
The just-in-time compiler outperforms both NumPy and the native JAX library on CPU.


GPU

For this last test, we execute the function \[f(x)=e^{-x} + \cos(x)\] over 200,000,000 values on an Nvidia V100 GPU.
As anticipated, NumPy runs on the CPU of the EC2 instance and therefore cannot match the performance of JAX executing on the Nvidia processor.
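The functions under test follow the same pattern as the CPU benchmarks; a plausible sketch (np_func3 and jnp_func3 are our names, not from the original code):

def np_func3(x: np.array) -> np.array:
    return np.exp(-x) + np.cos(x)

def jnp_func3(x: jnp.array) -> jnp.array:
    return jnp.exp(-x) + jnp.cos(x)

JaxNumpyData(np_func3, jnp_func3).compare(200_000_000, 'exp_cos')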

Conclusion

In summary, JAX offers data scientists and machine learning engineers a high-performance GPU computing tool that significantly outperforms the NumPy library. Our exploration has only touched the surface of JAX's capabilities, and I encourage readers to delve deeper into features like Autobatching, Vectorization, Generalized convolutions, and its integration with PyTorch and TensorFlow.

Thank you for reading this article. For more information ...


References


Appendix

We include the decorator used for timing the execution of the various functions, for reference.
timing_stats = {}

def time_it(func):
    """ Decorator for timing the execution of methods """

    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        duration = time.time() - start

        # The third positional argument (args[2]) of the decorated methods is the label
        key: AnyStr = args[2]
        print(f'{key}\t{duration:.3f} secs.')

        # Accumulate the execution times recorded for each label
        cur_list = timing_stats.get(key)
        if cur_list is None:
            cur_list = [duration]
        else:
            cur_list.append(duration)
        timing_stats[key] = cur_list
        return result

    return wrapper



---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3

Sunday, September 17, 2023

Compare Python, NumPy and PyTorch Performance

Target audience: Beginner
Estimated reading time: 4'

Recently, I embarked on a healthcare project that involved extracting diagnostic information from Electronic Health Records. While fine-tuning a BERT model, I noticed some atypical latency behaviors. This prompted me to conduct a performance comparison between Python lists, NumPy arrays, and PyTorch tensors.
The implementation relies on a timer decorator to collect latency values.



Notes
  • The implementation uses Python 3.11
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.

Introduction

I assume that most readers are familiar with the various Python, NumPy and PyTorch containers used in this article. But just in case, here is a quick refresh:

Python lists: A list in Python is similar to an array in C, C++, Java or Scala, except that its elements can have different types.

Python arrays: An array is a container which holds a fixed number of items. Contrary to lists, the items of an array must be of the same type. Most data structures make use of arrays to implement their algorithms [ref 1].

NumPy arrays: A NumPy array represents a multidimensional, homogeneous array of fixed-size items. It is implemented as a static buffer of contiguous values of identical type, whose indexing can be dynamically modified to generate matrices, tensors or higher-dimension numerical structures [ref 2].

PyTorch tensors: Similarly to NumPy arrays, PyTorch tensors are multi-dimensional arrays containing elements of a single data type. Tensors share the same semantics and operators as NumPy arrays but also support automatic differentiation and GPU/CUDA math libraries [ref 3]. A short sketch of the four containers follows.
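As a minimal sketch of the four containers under evaluation (the values list is arbitrary):

import array as ar
import numpy as np
import torch

values = [1.0, 2.0, 3.0]

py_list = values                                # Python list: heterogeneous elements allowed
py_array = ar.array('d', values)                # Python array: single element type ('d' = double)
np_array = np.array(values, dtype=np.float32)   # NumPy array: contiguous homogeneous buffer
tensor = torch.tensor(np_array)                 # PyTorch tensor: NumPy-like semantics + autodiff/GPU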

Timing with decorator

Decorators are very powerful tools in Python since they allow programmers to modify the behavior of a function, method or even a class. Decorators wrap another function in order to extend the behavior of the wrapped function, without permanently modifying it [ref 4].

def timeit(func):
    ''' Decorator for timing execution of methods'''

    def wrapper(*args, **kwargs):
        start = time.time()
        func(*args, **kwargs)
        duration = '{:.3f}'.format(time.time() - start)
        logging.info(f'{args[1]}:{args[3]}\t{duration} secs.')
        return 0

    return wrapper


Benchmark implementation

The objective is to automate the comparison of the various frameworks and functions by creating a wrapper class, EvalFunction.
The evaluation class has two arguments:
  • The descriptive name of the function, func_name, used to evaluate the data structures
  • The signature of the function, func, used to evaluate the data structures
import array as ar
import time
import numpy as np
from random import Random
from typing import List, AnyStr, Callable, Any, NoReturn
import math
import torch
from dataclasses import dataclass
import logging
from matplotlib import pyplot as plt

collector = {}
@dataclass
class EvalFunction:
    """
        Data class for the evaluation of Python lists, arrays, NumPy arrays and torch tensors
        :param func_name Description of the function to execute
        :param func Lambda to be executed
    """
    func_name: AnyStr
    func: Callable[[Any], float]

    def compare(self, input_list: List[float], fraction: float = 0.0) -> NoReturn:
        input_max: int = \
            math.floor(len(input_list)*fraction) if 0.0 < fraction <= 1.0 \
            else len(input_list)
        input_data = input_list[:input_max]

        # Execute the lambda through a Python list
        self.__execute('python', input_data, 'list:   ')

        # Execute the lambda through a Python array
        input_array = ar.array('d', input_data)
        self.__execute('python', input_array, 'array:  ')

        # Execute the lambda through a NumPy array
        np_input = np.array(input_data, dtype=np.float32)
        self.__execute('python', np_input, 'lambda: ')

        # Execute the native NumPy methods
        self.__execute('numpy', np_input, 'native: ')

        # Execute the PyTorch method on CPU
        tensor = torch.tensor(np_input, dtype=torch.float32, device='cpu')
        self.__execute('pytorch', tensor, '(CPU):  ')

        # Execute the PyTorch method on GPU
        tensor = torch.tensor(np_input, dtype=torch.float32, device='cuda:0')
        self.__execute('pytorch', tensor, '(CUDA): ')


The implementation of the supporting private method, __execute, is described in the Appendix.

Evaluation

We've chosen a collection of mathematical transformations that vary in complexity and computational demand to evaluate the different frameworks. These transformations compute the mean value produced by the following functions:
\[x_{i}=1+rand[0, 1]\]
\[average(x)=\frac{1}{n}\sum_{i=1}^{n}x_{i}\]
\[sine(x) = \frac{1}{n}\sum_{i=1}^{n}\sin\left ( x_{i} \right )\]
\[sin.exp(x) = \frac{1}{n}\sum_{i=1}^{n}\sin\left ( x_{i} \right ) e^{-x_{i}}\]
\[sin.exp.log(x) = \frac{1}{n}\sum_{i=1}^{n}\left ( \sin\left ( x_{i} \right ) e^{-x_{i}} + \log(1 + x_{i}) \right )\]

# Functions to evaluate data structures
def average(x) -> float:
    return sum(x)/len(x)
def sine(x) -> float:
    return sum([math.sin(t) for t in x])/len(x)
def sin_exp(x) -> float:
    return sum([math.sin(t)*math.exp(-t) for t in x])/len(x)
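The last transformation, sin.exp.log, is not listed above; a plausible implementation following the same pattern (our sketch):

def sin_exp_log(x) -> float:
    return sum([math.sin(t)*math.exp(-t) + math.log(1.0 + t) for t in x])/len(x)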


# Random value generator
rand = Random(42)
num_values = 500_000_000

# Note: a single random draw replicated num_values times
my_list: List[float] = [1.0 + rand.uniform(0.0, 0.1)] * num_values

# Fraction of the original data set of 500 million data points
fractions = [0.2, 0.4, 0.6, 0.8, 1.0]

# Evaluate the latency for sub-datasets of size len(my_list)*fraction
for fraction in fractions:
    eval_func = EvalFunction('sin_exp', sin_exp)
    eval_func.compare(my_list, fraction)

# x-axis values as size=  len(my_list)*fraction
data_sizes = [math.floor(num_values*fraction) for fraction in fractions]

# Invoke the plotting method
plotter = Plotter(data_sizes, collector)
plotter.plot('Sin*exp 500M')

We conducted the test on an AWS EC2 instance of type p3.2xlarge, equipped with 8 virtual cores, 64GB of memory, and an Nvidia V100 GPU. A basic method for plotting the results is provided in the appendix.

Study 1
We compared the computation time required to compute the average of 500 million real numbers stored in a Python list, a Python array, a NumPy array and a PyTorch tensor.


We then compared the computation time required to apply the x → sin(x)·exp(-x) function to 500 million real numbers within a Python list, array, NumPy array, and PyTorch tensor.


Conclusion
  • The performance difference between executing on the GPU versus the CPU becomes more pronounced as the dataset size grows.
  • Predictably, the runtime for both the 'average' and 'sin_exp' functions scales linearly with the size of the dataset when using Python lists or arrays.
  • When executed on the CPU, PyTorch tensors show a 20% performance improvement over NumPy arrays.

Study 2
Let's compare the relative performance of the GPU and CPU during the processing of a large PyTorch tensor.


Conclusion
The size of the dataset has a very limited impact on the performance of processing a PyTorch tensor on the GPU, while the execution time increases linearly on the CPU.

Thank you for reading this article. For more information ...

References


Appendix

The __execute method takes three arguments:
  • The framework, used in the structural pattern match to select the library under evaluation
  • The input data to be processed
  • The label consumed by the timing decorator
The private method __numpy_func applies each of the functions (average, sine, ...) to a NumPy array, np_array, generated from the original list.
The method __pytorch_func applies each function to a torch tensor derived from np_array.


@timeit
def __execute(self, framework: AnyStr, input: Any, label: AnyStr) -> float:
    match framework:
        case 'python':
           return self.func(input)
        case 'numpy':
           return self.__numpy_func(input)
        case 'pytorch':
           return self.__pytorch_func(input)
        case _:
           return -1.0


def __numpy_func(self, np_array: np.array) -> float:
   match self.func_name:
      case 'average':
          return np.average(np_array).item()
      case 'sine':
          return np.average(np.sin(np_array)).item()
      case 'sin_exp':
          return np.average(np.sin(np_array)*np.exp(-np_array)).item()


def __pytorch_func(self, tensor: torch.tensor) -> float:
    match self.func_name:
       case 'average':
          return torch.mean(tensor).item()
       case 'sine':
          return torch.mean(torch.sin(tensor)).item()
       case 'sin_exp':
          return torch.mean(torch.sin(tensor) * torch.exp(-tensor)).item()


A simple class, Plotter, wraps the creation and display of plots using matplotlib.

class Plotter(object):
    markers = ['r--', '^-', '+--', '--', '*-']

    def __init__(self, dataset_sizes: List[int], results_map):
        self.sizes = dataset_sizes
        self.results_map = results_map

    def plot(self, title: AnyStr) -> NoReturn:
        index = 0
        np_sizes = np.array(self.sizes)
        for key, values in self.results_map.items():
            np_values = np.array(values)
            plt.plot(np_sizes, np_values, Plotter.markers[index % len(Plotter.markers)])
            index += 1

        plt.title(title, fontsize=16, fontstyle='italic')
        plt.xlabel('Dataset size', fontsize=13, fontstyle='normal')
        plt.ylabel('time secs', fontsize=13, fontstyle='normal')
        plt.legend(['Python list', 'Python array', 'NumPy w/ lambda', 'NumPy native', 'PyTorch CPU', 'PyTorch GPU'])
        plt.show()


---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3