Thursday, March 14, 2024

Geometric Learning in Python: Differential Operators

Target audience: Intermediate
Estimated reading time: 5'

The use of Riemannian manifolds and topological computing is gaining traction in machine learning circles as a means to surpass the existing constraints of data analysis within Euclidean space. 
This article aims to present the fundamental aspects of differential geometry.


Table of contents
      Why SymPy
      Gradient
      Divergence
      Curl
      Validation
      Laplace
      Fourier

What you will learn: Vector fields, differential operators, and integral transforms as key components of differential geometry applied to deep learning.

Notes
  • Environments: Python 3.10.10 and SymPy 1.12
  • The differential operators are used in a future post dedicated to differential geometry and manifold learning
  • This article assumes that the reader is familiar with differential calculus and basic concepts of geometry
  • Source code is available at github.com/patnicolas/Data_Exploration/diffgeometry
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.

Overview

The heavy dependence on elements from Euclidean space for data analysis and pattern recognition in machine learning might be approaching a threshold. Incorporating differential geometry and Riemannian manifolds has already shown beneficial effects on the accuracy of deep learning models, as well as on reducing their development costs [ref 1].

Why differential geometry

Conventional approaches in machine learning, pattern recognition, and data analysis often presuppose that input data can be effectively represented using elements from Euclidean space. Although this assumption has been sufficient for numerous applications in the past, there is a growing awareness among data scientists and engineers that most data in vision and pattern recognition actually reside on a differential manifold (feature space) embedded within the raw data's Euclidean space.
Leveraging this geometric information can result in a more precise representation of the data's inherent structure, leading to improved algorithms and enhanced performance in real-world applications [ref 2].

Why SymPy

SymPy is a Python library dedicated to symbolic mathematics. Its implementation is as simple as possible in order to be comprehensible and easily extensible. It is licensed under BSD. SymPy supports differential and integral calculus, matrix operations, algebraic and polynomial equations, differential geometry, probability distributions, 3D plotting, and discrete math [ref 3].
To install: pip install sympy
Source code for SymPy is available at github.com/sympy/sympy.git
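
As a quick illustration of symbolic computation, here is a minimal sketch, assuming only a standard SymPy installation, that differentiates and integrates an arbitrary expression exactly:

import sympy

x = sympy.symbols('x', real=True)
f = sympy.exp(-x)*sympy.sin(x)

# Exact symbolic derivative and antiderivative
print(sympy.diff(f, x))        # -exp(-x)*sin(x) + exp(-x)*cos(x)
print(sympy.integrate(f, x))   # -exp(-x)*sin(x)/2 - exp(-x)*cos(x)/2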

This article has 3 sections:
  1. Introduction of vector fields and implementation in SymPy
  2. Differential operators (Gradient, Divergence and Curl)
  3. Laplace and Fourier Transform

Vector fields

Let's start with an overview of vector fields as the basis of differential geometry [ref 4].
Consider a three-dimensional Euclidean space characterized by basis vectors {ei} and a vector field V, expressed as (f1, f2, f3). In this scenario, it is crucial to note that we are following the conventions of Einstein's tensor notation.

Visualization of a vector field in 3D Euclidean space

The vector field at the point P(x,y,z) is defined as the tuple (f1(x,y,z), f2(x,y,z), f3(x,y,z)). A vector field over an n-dimensional space with basis vectors {ei} can be formally defined as \[f: \boldsymbol{x} \in \mathbb{R}^{n} \mapsto \mathbb{R} \\ f(\mathbf{x})=\sum_{i=1}^{n}{f^{i}}(\mathbf{x}).\mathbf{e}_{i}\] Example for a 3-dimensional space: \[f(x,y,z) = (2x+z^{3})\boldsymbol{\mathbf{\overrightarrow{i}}} + (xy+e^{-y}-z^{2})\boldsymbol{\mathbf{\overrightarrow{j}}} + \frac{x^{3}}{y}\boldsymbol{\mathbf{\overrightarrow{k}}}\] Let's carry out the creation of this vector field, f, in SymPy and compute its value at f(1.0, 2.0, 0.2). To begin, we need to establish a 3D coordinate system, denoted as r, and then express the vector using r.

from sympy.vector import CoordSys3D
from sympy import exp

r = CoordSys3D('r')
f = (2*r.x + r.z**3)*r.i + (r.x*r.y + exp(-r.y) - r.z**2)*r.j + (r.x**3/r.y)*r.k

w = f.evalf(subs={r.x: 1.0, r.y: 2.0, r.z: 0.2})
print(w)                                 # 2.008*r.i + 2.09533528323661*r.j + 0.5*r.k

Now, let's consider the same vector V in a second frame of reference with origin O' and basis vectors e'i.
Visualization of two coordinate systems

\[f(\mathbf{x})=\sum_{i=1}^{n} f'^{i}(\mathbf{x}).\mathbf{e'}_{i}\] The transformation matrix Sij converts the coordinate value functions fi into f'i. The tuple f = (fi) is the co-vector field for the vector field V.
\[S_{ij}: \begin{Vmatrix} f^{1} \\ f^{2} \\ f^{3} \end{Vmatrix} \mapsto \begin{Vmatrix} f'^{1} \\ f'^{2} \\ f'^{3} \end{Vmatrix}\] The scalar product of the co-vector f' and the vector v(f) is defined as \[< f',v> = \sum f'_{i}.f^{i}\] Given the scalar product, we can define the co-vector field f' as a linear map \[\alpha (v) = < f',v> \ \ \ (1)\]
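
To make the pairing concrete, here is a minimal sketch that evaluates the scalar product <f', v> component-wise; the co-vector components used here are hypothetical, chosen purely for illustration:

import sympy

x, y, z = sympy.symbols('x y z', real=True)

# Components f^i of the vector field from the example above
f = [2*x + z**3, x*y + sympy.exp(-y) - z**2, x**3/y]
# Hypothetical co-vector components f'_i (illustration only)
f_prime = [y, 1, -z]

# Scalar product <f', v> = sum_i f'_i . f^i
alpha = sum(fp*fi for fp, fi in zip(f_prime, f))
print(sympy.simplify(alpha))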

Differential operators

Let's demonstrate how to calculate differential operators using the SymPy library within a 3D Cartesian coordinate system. To begin, we'll create a class named DiffOperators which will encapsulate the implementation of various differential operators and transforms.

Gradient

Consider a scalar field f in a 3-dimensional space. The gradient of this field is defined as the vector of the 3 partial derivatives of f with respect to x, y, and z [ref 5].\[\triangledown f= \frac{\partial f}{\partial x} \vec{i} + \frac{\partial f}{\partial y} \vec{j} + \frac{\partial f}{\partial z} \vec{k}\]

class DiffOperators(object):
    def __init__(self, expr: Expr):
        self.expr = expr

    def gradient(self) -> VectorZero:
        from sympy.vector import gradient

        return gradient(self.expr, doit=True)
Example of input function: \[f(x,y,z)=x^{2}yz\]
r = CoordSys3D('r')
this_expr = r.x*r.x*r.y*r.z

diff_operator = DiffOperators(this_expr)
grad_res = diff_operator.gradient()
\[\triangledown f(x,y,z) = 2xyz.\vec{i} + x^{2}z.\vec{j} + x^{2}y.\vec{k}\]
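
As a quick sanity check, the symbolic gradient can be evaluated at an arbitrary sample point, here (1.0, 2.0, 3.0):

# Evaluate the gradient numerically: 2xyz=12, x^2.z=3, x^2.y=2 at (1, 2, 3)
w = grad_res.evalf(subs={r.x: 1.0, r.y: 2.0, r.z: 3.0})
print(w)     # 12.0*r.i + 3.0*r.j + 2.0*r.k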

Divergence

Divergence is a vector operator used to quantify the strength of a vector field's source or sink at a specific point, producing a signed scalar value. When applied to a vector F, comprising components X, Y, and Z, the divergence operator consistently yields a scalar result [ref 6].
\[div(F)=\triangledown .F=\frac{\partial X}{\partial x}+\frac{\partial Y}{\partial y}+\frac{\partial Z}{\partial z}\]
def divergence(self, base_vec: Expr) -> VectorZero:
    from sympy.vector import divergence

    div_vec = self.expr*base_vec
    return divergence(div_vec, doit=True)
Using the same input expression as used for the gradient calculation:
div_res = diff_operator.divergence(r.i + r.j + r.k)
\[div(f(x,y,z)[\vec{i} + \vec{j}+ \vec{k}]) = 2xyz + x^{2}z + x^{2}y\]
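
Since div(f.[i+j+k]) reduces to the sum of the partial derivatives of f, a simple cross-check is to compare the result against sympy.diff directly:

from sympy import diff, simplify

# Sum of the partial derivatives of the scalar expression
manual_div = diff(this_expr, r.x) + diff(this_expr, r.y) + diff(this_expr, r.z)
print(simplify(div_res - manual_div))    # 0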

Curl

In mathematics, the curl operator represents the infinitesimal rotation of a vector field in three-dimensional space. The direction of the rotation follows the right-hand rule (aligned with the axis of rotation), while its magnitude is given by the extent of the rotation [ref 6]. Within a 3D Cartesian system, for a three-dimensional vector F, the curl operator is defined as follows:
\[ \triangledown \times \mathbf{F}=\left (\frac{\partial F_{z}}{\partial y}- \frac{\partial F_{y}}{\partial z} \right ).\vec{i} + \left (\frac{\partial F_{x}}{\partial z}- \frac{\partial F_{z}}{\partial x} \right ).\vec{j} + \left (\frac{\partial F_{y}}{\partial x}- \frac{\partial F_{x}}{\partial y} \right ).\vec{k} \]
def curl(self, base_vectors: Expr) -> VectorZero:
    from sympy.vector import curl

    curl_vec = self.expr*base_vectors
    return curl(curl_vec, doit=True)

If we use only the two base vectors j and k:
curl_res = diff_operator.curl(r.j+r.k)
\[curl(f(x,y,z)[\vec{j} + \vec{k}]) = x^{2}\left ( z-y \right ).\vec{i}-2xyz.\vec{j}+2xyz.\vec{k}\]

Validation

We can confirm the accuracy of the gradient, divergence, and curl operators implemented in SymPy by utilizing the following identities:  \[curl(\triangledown f)= \triangledown \times \left ( \triangledown f\right ) = 0\]  and  \[\triangledown . curl (F) = \triangledown .\left ( \triangledown \times F \right ) = 0\]
from sympy.vector import Vector

grad_res = diff_operator.gradient()
assert diff_operator.curl(grad_res) == Vector.zero       # Curl of a gradient is the zero vector

curl_res = diff_operator.curl(r.i + r.j + r.k)
assert diff_operator.divergence(curl_res) == 0           # Divergence of a curl is 0


Transforms

An integral transform maps a function in the time domain to its equivalent representation in the frequency domain, as depicted in the following diagram:

Illustration of conversion from time domain to frequency domains

The evaluation of SymPy functions follows a two-step process:
  1. Generate function F in frequency space
  2. Evaluate output of F for given input values.

Laplace

The Laplace transform is a type of integral transform that changes a function of a real variable (typically time, t) into a function of s, a variable in the complex frequency domain [ref 7]. Specifically, it transforms ordinary differential equations into algebraic ones and converts convolution operations into simpler multiplications. The formulation of the Laplace transform is:\[\mathfrak{L}[f(t)](s)=\int_{0}^{+\infty}f(t).e^{-st}dt\]
def laplace(self):
    from sympy import laplace_transform
        
    return laplace_transform(self.expr, t, s, noconds=True)


Let's create the Laplace transforms for two basic functions and then calculate the outcomes of these transformed functions in the frequency domain, using the values s=0.5 and a=0.2.
t, s = sympy.symbols('t, s', real=True)
a = sympy.symbols('a', real=True)
diff_operator = DiffOperators(sympy.exp(-a*t))
laplace_func = diff_operator.laplace()

print(laplace_func.evalf(subs={s: 0.5, a: 0.2}))   #  1.4285
\[\int_{0}^{+\infty}e^{-at}.e^{-st}dt = \frac{1}{a+s}\]
diff_operator = DiffOperators(sympy.sqrt(a*t))
laplace_func = diff_operator.laplace()

print(laplace_func.evalf(subs={s: 0.5, a: 0.2}))   #  1.1209
\[\int_{0}^{+\infty}\sqrt{at}.e^{-st}dt = \frac{\sqrt{a\pi}}{2s^{3/2}}\]

Fourier

Like the Laplace transform, the Fourier transform is another integral transform that translates a function into a version that represents the frequencies contained in the original function. The result of this transformation is a complex-valued function of frequency [ref 8]. Often referred to as the frequency domain representation of the original function, the Fourier transform is formulated as follows: \[F(k)=\int_{-\infty}^{+\infty}f(t).e^{-2\pi itk}dt\] The implementation uses the SymPy function fourier_transform, which takes the function (self.expr) and the x and k parameters as arguments.
def fourier(self):
   from sympy import fourier_transform
     
   return fourier_transform(self.expr, x, k)

Let's implement the Fourier transforms for two simple functions, exp and sin, and then calculate the outcomes of these transformed functions in the frequency domain with k=0.4.
x, k = sympy.symbols('x, k', real=True)
diff_operator = DiffOperators(sympy.exp(-x**2))

fourier_func = diff_operator.fourier()
print(fourier_func.evalf(subs={k: 0.4}))       # 0.3654
\[\int_{-\infty}^{+\infty}e^{-t^{2}}.e^{-2\pi itk}dt = \sqrt{\pi}.e^{-\pi^{2}k^{2}}\]
diff_operator = DiffOperators(sympy.sin(x))
diff_operator.fourier()
\[\int_{-\infty}^{+\infty}\sin(t).e^{-2\pi itk}dt = 0\]
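
As an optional round-trip check, here is a short sketch, assuming SymPy's inverse_fourier_transform with the same kernel convention, showing that the inverse transform recovers the original Gaussian:

from sympy import inverse_fourier_transform, exp, sqrt, pi

# Inverse transform of sqrt(pi).exp(-pi^2.k^2) should return exp(-x^2)
f_k = sqrt(pi)*exp(-(pi*k)**2)
print(inverse_fourier_transform(f_k, k, x))    # exp(-x**2)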


Thank you for reading this article. For more information ...

References




---------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning" Packt Publishing ISBN 978-1-78712-238-3










Friday, March 1, 2024

Geometric Learning in Python: Manifolds

Target audience: Intermediate
Estimated reading time: 7'


Intrigued by the idea of applying differential geometry to machine learning but feel daunted? 
Our second article in the Geometric Learning in Python series explores fundamental concepts such as manifolds, tangent spaces, and geodesics.
 

Table of contents
       Components
       Tangent vectors
       Geodesics

What you will learn: How to implement basic components of manifolds in Python


Geometric learning

Geometric learning addresses the difficulties of limited data, high-dimensional spaces, and the need for independent representations in the development of sophisticated machine learning models, including graph-based and physics-informed neural networks.

The following highlights the advantages of utilizing differential geometry to tackle the difficulties encountered by researchers in the creation and validation of generative models [ref 1].
  • Understanding data manifolds: Data in high-dimensional spaces often lie on lower-dimensional manifolds. Differential geometry provides tools to understand the shape and structure of these manifolds, enabling generative models to learn more efficient and accurate representations of data.
  • Improving latent space interpolation: In generative models, navigating the latent space smoothly is crucial for generating realistic samples. Differential geometry offers methods to interpolate more effectively within these spaces, ensuring smoother transitions and better quality of generated samples.
  • Optimization on manifolds: The optimization processes used in training generative models can be enhanced by applying differential geometric concepts. This includes optimizing parameters directly on the manifold structure of the data or model, potentially leading to faster convergence and better local minima.
  • Geometric regularization: Incorporating geometric priors or constraints based on differential geometry can help in regularizing the model, guiding the learning process towards more realistic or physically plausible solutions, and avoiding overfitting.
  • Advanced sampling techniques: Differential geometry provides sophisticated techniques for sampling from complex distributions (important for both training and generating new data points), improving upon traditional methods by considering the underlying geometric properties of the data space.
  • Enhanced model interpretability: By leveraging the geometric structure of the data and model, differential geometry can offer new insights into how generative models work and how their outputs relate to the input data, potentially improving interpretability.
  • Physics-informed neural networks: Projecting physical laws and boundary conditions, such as sets of partial differential equations, onto a surface manifold improves the optimization of deep learning models.
  • Innovative architectures: Insights from differential geometry can lead to the development of novel neural network architectures that are inherently more suited to capturing the complexities of data manifolds, leading to more powerful models.
Important note: In future articles, we will employ the Geomstats Python library and a use case involving the hypersphere to demonstrate some fundamental concepts of differential geometry.

Differential geometry basics

Differential geometry is an extensive and intricate area that exceeds what can be covered in a single article or blog post. There are numerous outstanding publications, including books [ref 2, 3, 4], papers [ref 5], and tutorials [ref 6], that provide foundational knowledge in differential geometry and tensor calculus, catering to both beginners and experts.

To refresh your memory, here are some fundamental elements of a manifold:

A manifold is a topological space that, around any given point, closely resembles Euclidean space. Specifically, an n-dimensional manifold is a topological space where each point is part of a neighborhood that is homeomorphic to an open subset of n-dimensional Euclidean space.
Examples of manifolds include one-dimensional circles, two-dimensional planes and spheres, and the four-dimensional space-time used in general relativity.

Differential manifolds are types of manifolds with a local differential structure, allowing for definitions of vector fields or tensors that create a global differential tangent space.

A Riemannian manifold is a differential manifold that comes with a metric tensor, providing a way to measure distances and angles.

Fig 1 Illustration of a Riemannian manifold with a tangent space


A vector field assigns a vector (often represented as an arrow) to each point in a space, lying in the tangent plane at that point. Operations like divergence, which measures the volume change rate in a vector field flow, and curl, which calculates the flow's rotation, can be applied to vector fields.

Fig 2 Visualization of vector fields on a sphere in 3D space

Given a vector R defined in a Euclidean space R^n and a set of coordinates c^i, a vector field along the variable lambda is defined as: \[\frac{\mathrm{d} \vec{R}}{\mathrm{d} \lambda}=\sum_{i=1}^{n}\frac{\mathrm{d} c^{i}}{\mathrm{d} \lambda}\frac{\partial \vec{R}}{\partial c^i }\] In the case of a 2-dimensional space, the vector field can be expressed in Cartesian (1) and polar (2) coordinates using the Einstein summation convention: \[\frac{\mathrm{d} \vec{R}}{\mathrm{d} \lambda}=\frac{\mathrm{d} x}{\mathrm{d} \lambda}\frac{\partial \vec{R}}{\partial x}+\frac{\mathrm{d} y}{\mathrm{d} \lambda}\frac{\partial \vec{R}}{\partial y} \ \ (1) = \frac{\mathrm{d} r}{\mathrm{d} \lambda}\frac{\partial \vec{R}}{\partial r}+\frac{\mathrm{d} \theta}{\mathrm{d} \lambda}\frac{\partial \vec{R}}{\partial \theta} \ \ (2)\]
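The chain-rule summation above can be verified symbolically; here is a small sketch with a hypothetical curve c(lambda) = (cos lambda, sin lambda) and the scalar component R = x^2 + y^2:

import sympy

lam = sympy.symbols('lambda', real=True)
x_l = sympy.cos(lam)        # hypothetical coordinate c1(lambda)
y_l = sympy.sin(lam)        # hypothetical coordinate c2(lambda)
R = x_l**2 + y_l**2         # scalar component restricted to the curve

# dR/d(lambda) computed through the chain rule of the summation above
print(sympy.simplify(sympy.diff(R, lam)))    # 0: R is constant on the unit circle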
Note: While crucial for grasping operations on manifold vector fields, the concepts of covariance and contravariance are outside the purview of this article.

The tangent space at a point on a manifold is the set of tangent vectors at that point, like a line tangent to a circle or a plane tangent to a surface.

Tangent vectors can act as directional derivatives, where you can apply specific formulas to characterize these derivatives.

Given a differentiable function f, a vector v in the Euclidean space R^n, and a point x on the manifold, the directional derivative in the direction v at x is defined as: \[\triangledown _{v} f(x)=\sum_{i=1}^{n}v_{i}\frac{\partial f}{\partial x_{i}}(x) \ with \ f: \mathbb{R}^{n} \rightarrow \mathbb{R}\] and the tangent vector at the point x is defined as \[v(f)(x)=(\triangledown _{v}f)(x)\]
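Here is a minimal sketch of the directional derivative formula; the function and direction are sample values chosen for illustration:

import sympy

x1, x2 = sympy.symbols('x1 x2', real=True)
f = x1**2 + sympy.sin(x2)       # sample differentiable function
v = [1.0, 2.0]                  # sample direction vector

# Directional derivative: sum_i v_i . partial f/partial x_i
dv_f = sum(vi*sympy.diff(f, xi) for vi, xi in zip(v, (x1, x2)))
print(dv_f.evalf(subs={x1: 1.0, x2: 0.0}))   # 1.0*2 + 2.0*cos(0) = 4.0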
A geodesic is the shortest path (arc) between two points in a Riemannian manifold.

Fig 3 Illustration of manifold geodesics with Fréchet mean


Given a Riemannian manifold M with a metric tensor g, the geodesic length L of a continuously differentiable curve f: [a, b] -> M is \[L(f)=\int _a^b \sqrt {g_{f(t)}\left (\frac{\mathrm{d} f}{\mathrm{d} t}(t), \frac{\mathrm{d} f}{\mathrm{d} t}(t)\right )}dt\]
An exponential map is a map from a subset of the tangent space of a Riemannian manifold onto the manifold itself. Given a tangent vector v at a point p on a manifold, there is a unique geodesic Gv that satisfies Gv(0)=p and G'v(0)=v. The exponential map is defined as expp(v) = Gv(1).
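
Both notions can be evaluated numerically with Geomstats; here is a minimal sketch on the unit 2-sphere, assuming a recent version of the library in which the manifold is equipped with its default metric:

import numpy as np
from geomstats.geometry.hypersphere import Hypersphere

sphere = Hypersphere(dim=2)
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])

# Geodesic distance between two orthogonal points on the unit sphere: pi/2
print(sphere.metric.dist(p, q))              # ~1.5708

# Exponential map: a tangent vector of norm pi/2 at p lands on q
v = np.array([0.0, np.pi/2, 0.0])
print(sphere.metric.exp(v, base_point=p))    # ~[0. 1. 0.]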


Intrinsic geometry involves studying objects, such as vectors, based on coordinates (or base vectors) intrinsic to the manifold's point. For example, analyzing a two-dimensional vector on a three-dimensional sphere using the sphere's own coordinates.

Extrinsic geometry studies objects relative to the ambient Euclidean space in which the manifold is situated, such as viewing a vector on a sphere's surface in three-dimensional space.

Geomstats library

Geomstats is a free, open-source Python library designed for conducting machine learning on data situated on nonlinear manifolds, an area known as Geometric Learning. This library offers object-oriented, thoroughly unit-tested features for fundamental manifolds, operations, and learning algorithms, compatible with various execution environments, including NumPy, PyTorch, and TensorFlow [ref 7].

The library is structured into two principal components:
  • geometry: This part provides an object-oriented framework for crucial concepts in differential geometry, such as exponential and logarithm maps, parallel transport, tangent vectors, geodesics, and Riemannian metrics.
  • learning: This section includes statistics and machine learning algorithms tailored for manifold data, building upon the scikit-learn framework.

Use case: Hypersphere

To enhance clarity and simplicity, we've implemented a unique approach that encapsulates the essential elements of a data point on a manifold within a data class.
A hypersphere S of dimension d, embedded in a Euclidean space of dimension d+1, is defined as: \[S^{d}=\left \{ x\in \mathbb{R}^{d+1} \ | \ \left \| x \right \| = 1\right \}\]
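
The membership condition ||x|| = 1 can be checked directly with the Geomstats Hypersphere class; a short sketch:

import numpy as np
from geomstats.geometry.hypersphere import Hypersphere

# The 2-sphere S^2 embedded in R^3
sphere = Hypersphere(dim=2)
print(sphere.belongs(np.array([0.0, 0.6, 0.8])))   # True: unit norm
print(sphere.belongs(np.array([1.0, 1.0, 1.0])))   # False: norm is sqrt(3)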

Components

First, we encapsulate the key components of a point on a manifold into a data class, ManifoldPoint, for convenience, with the following attributes:
  • id: A label for the point
  • location: An n-dimensional NumPy array
  • tgt_vector: An optional tangent vector, defined as a list of float coordinates
  • geodesic: A flag to specify whether the geodesic has to be computed
  • intrinsic: A flag to specify whether the coordinates are intrinsic (True) or extrinsic (False)

Fig. 4 Illustration of a ManifoldPoint instance


Note: A description of intrinsic and extrinsic coordinates is not required to understand basic manifold components and will be covered in a future article.

@dataclass
class ManifoldPoint:
    id: AnyStr
    location: np.array
    tgt_vector: List[float] = None
    geodesic: bool = False
    intrinsic: bool = False


Let's build a HypersphereSpace as a Riemannian manifold, defined as a spherical 3D manifold of type Hypersphere with a metric hypersphere_metric of type HypersphereMetric.

import geomstats.visualization as visualization
from geomstats.geometry.hypersphere import Hypersphere, HypersphereMetric
from typing import NoReturn, List
import numpy as np
import geomstats.backend as gs


class HypersphereSpace(GeometricSpace):
    def __init__(self, equip: bool = False, intrinsic: bool = False):
        dim = 2
        super(HypersphereSpace, self).__init__(dim, intrinsic)

        coordinates_type = 'intrinsic' if intrinsic else 'extrinsic'
        self.space = Hypersphere(dim=self.dimension, equip=equip, default_coords_type=coordinates_type)
        self.hypersphere_metric = HypersphereMetric(self.space)


    def belongs(self, point: List[float]) -> bool:
        return self.space.belongs(point)

    def sample(self, num_samples: int) -> np.array:
        return self.space.random_uniform(num_samples)


    def tangent_vectors(self, manifold_points: List[ManifoldPoint]) -> List[np.array]:
        ...    # Detailed below

    def geodesics(self,
                  manifold_points: List[ManifoldPoint],
                  tangent_vectors: List[np.array]) -> List[np.array]:
        ...    # Detailed below

    def show_manifold(self, manifold_points: List[ManifoldPoint]) -> NoReturn:
        ...    # Detailed in the Appendix


The first two methods, used to generate and validate data points on the manifold, are listed below; a short usage sketch follows the list.
  • belongs to test if a point belongs to the hypersphere
  • sample to generate points on the hypersphere using a uniform random generator
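
A minimal usage sketch for these two methods, assuming the HypersphereSpace skeleton above with its default arguments:

manifold = HypersphereSpace()

samples = manifold.sample(2)           # 2 points drawn uniformly on the hypersphere
print(samples.shape)                   # (2, 3): extrinsic coordinates in R^3
print(manifold.belongs(samples[0]))    # True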

Tangent vectors

The method tangent_vectors computes the tangent vectors for a set of manifold points defined by their id, location, vector, and geodesic flag. The implementation relies on a simple list comprehension invoking the nested function tangent_vector (#1). The tangent vectors are computed by projection onto the tangent plane using the exponential map associated with the metric hypersphere_metric (#2).

def tangent_vectors(self, manifold_points: List[ManifoldPoint]) -> List[np.array]:

    def tangent_vector(point: ManifoldPoint) -> (np.array, np.array):
        import geomstats.backend as gs

        vector = gs.array(point.tgt_vector)
        tangent_v = self.space.to_tangent(vector, base_point=point.location)
        end_point = self.hypersphere_metric.exp(            # 2
               tangent_vec=tangent_v,
               base_point=point.location)
        return tangent_v, end_point

    return [tangent_vector(point) for point in manifold_points]     # 1


This test generates 3 data points sampled on the hypersphere and constructs the manifold points through a list comprehension, with a given vector [0.5, 0.3, 0.5] in the Euclidean space and geodesics disabled.

manifold = HypersphereSpace(True)

# Uniform randomly select points on the hypersphere
samples = manifold.sample(3)
# Generate the manifold data points   
manifold_points = [
  ManifoldPoint(
    id=f'data{index}',
    location=sample,
    tgt_vector=[0.5, 0.3, 0.5],
    geodesic=False) for index, sample in enumerate(samples)]

# Display the tangent vectors
manifold.show_manifold(manifold_points)

The code for the method show_manifold is described in the Appendix. The execution of the code snippet produces the following plot using Matplotlib.
Fig. 5 Visualization of three random data points and 
their tangent vectors on Hypersphere
 

Geodesics

The geodesics method calculates the trajectory on the hypersphere for each data point in manifold_points, using the tangent_vectors. Similar to how tangent vectors are computed, the determination of geodesics for a group of manifold points relies on a Python list comprehension invoking the nested function geodesic.

def geodesics(self,
         manifold_points: List[ManifoldPoint],
         tangent_vectors: List[np.array]) -> List[np.array]:
  
    def geodesic(manifold_point: ManifoldPoint, tangent_vec: np.array) -> np.array:
          return self.hypersphere_metric.geodesic(
               initial_point=manifold_point.location,
               initial_tangent_vec=tangent_vec
           )

    return [geodesic(point, tgt_vec)
           for point, tgt_vec in zip(manifold_points, tangent_vectors) if point.geodesic]


The geodesic is visualized by plotting 40 intermediate infinitesimal exponential maps created by invoking the linspace function, as described in the Appendix.
Fig. 6 Visualization of two random data points with tangent vectors 
and geodesics on Hypersphere

References

[2] W. Poor, "Differential Geometric Structures", Dover Publications, New York, 1981
[3] R. Bishop, S. Goldberg, "Tensor Analysis on Manifolds", Dover Publications, New York, 1980
[4] J. Lee, "Introduction to Smooth Manifolds", Springer Science+Business Media, New York, 2013

-------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning", Packt Publishing ISBN 978-1-78712-238-3

Appendix

The implementation of the method show_manifold is shown for reference. It relies on the Geomstats visualization module. The various components of data points on the manifold (location, tangent vector, geodesics) are displayed according to the values of their attributes. Points in the 3-dimensional Euclidean space are optionally displayed for reference.

import geomstats.visualization as visualization


def show_manifold(self,
                  manifold_points: List[ManifoldPoint],
                  euclidean_points: List[np.array] = None) -> NoReturn:
    import matplotlib.pyplot as plt

    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(111, projection="3d")

    # Walk through the list of data points on the manifold
    for manifold_pt in manifold_points:
        ax = visualization.plot(
            manifold_pt.location,
            ax=ax,
            space="S2",
            s=100,
            alpha=0.8,
            label=manifold_pt.id)

        # If the tangent vector has to be extracted and computed
        if manifold_pt.tgt_vector is not None:
            tgt_vec, end_pt = self.__tangent_vector(manifold_pt)

            # Show the end point and the tangent vector arrow
            ax = visualization.plot(end_pt, ax=ax, space="S2", s=100, alpha=0.8, label=f'End {manifold_pt.id}')
            arrow = visualization.Arrow3D(manifold_pt.location, vector=tgt_vec)
            arrow.draw(ax, color="red")

            # If the geodesic is to be computed and displayed
            if manifold_pt.geodesic:
                geodesics = self.__geodesic(manifold_pt, tgt_vec)

                # Arbitrarily plot 40 data points for the geodesic from the tangent vector
                geodesics_pts = geodesics(gs.linspace(0.0, 1.0, 40))
                ax = visualization.plot(
                    geodesics_pts,
                    ax=ax,
                    space="S2",
                    color="blue",
                    label=f'Geodesic {manifold_pt.id}')

    # Display points in the Euclidean space of the hypersphere, if specified
    if euclidean_points is not None:
        for index, euclidean_pt in enumerate(euclidean_points):
            ax.plot(
                euclidean_pt[0],
                euclidean_pt[1],
                euclidean_pt[2],
                label=f'E-{index}',
                color='black',
                alpha=0.5)

    ax.legend()
    plt.show()