
Wednesday, May 1, 2024

K-means on Manifold

Target audience: Advanced
Estimated reading time: 7'

Traditional clustering models often fail on the complex datasets commonly found in advanced applications such as medical imaging, 3D shape analysis, and natural language processing, where data points are highly interrelated.
K-means on manifolds respects the intrinsic geometry of the data, such as curvature and metric.


Table of contents

       Setup
       Euclidean space
       Hypersphere

What you will learn: How to apply k-means clustering on a Riemannian manifold (hypersphere) using Geomstats, contrasted with its implementation in Euclidean space using the scikit-learn library.

Notes

  • Environments: Python 3.10.10, Geomstats 2.7.0, scikit-learn 1.4.2
  • This article assumes that the reader is somewhat familiar with differential and tensor calculus [ref 1]. Please refer to our previous articles related to geometric learning [ref 2, 3, 4].
  • Source code is available at  Github.com/patnicolas/Data_Exploration/manifolds
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.

Introduction

The primary goal of learning Riemannian geometry is to understand and analyze the properties of curved spaces that cannot be described adequately using Euclidean geometry alone. Riemannian geometry makes it possible to describe the geometric structure of manifolds equipped with a metric, which defines the concepts of distance and angle on these spaces.

This article is the seventh part of our ongoing series focused on geometric learning. In this installment, we utilize the Geomstats Python library [ref 5] and explore the ubiquitous K-means clustering algorithm on the hypersphere manifold. The hypersphere was introduced in a previous piece, Geometric Learning in Python: Manifolds - Hypersphere, and is detailed in the Geomstats API [ref 6].

I highly recommend watching the comprehensive series of 22 YouTube videos Tensor Calculus - Eigenchris to familiarize yourself with the fundamental concepts of differential geometry.
Summaries of my earlier articles on this topic can be found in the Appendix.

There are many benefits to clustering data on a manifold for complex datasets [ref 7]:
  • Grouping of dense, continuous non-linear data depends on the 'shape' of the data
  • Projection to Euclidean space may introduce distortion
  • Losses and distances are better assessed and computed along geodesics than with Euclidean metrics (e.g., on a sphere)

K-means


Among the array of unsupervised learning algorithms, K-means stands out as one of the most well-known. This algorithm has a straightforward goal: to divide the data space so that data points within the same cluster are as similar as possible (intra-cluster similarity), and data points in different clusters are as dissimilar as possible (inter-cluster similarity). K-means aims to identify a predetermined number of clusters in an unlabeled dataset. It employs an iterative approach to finalize the clustering, which depends on the number of clusters specified by the user (denoted by the variable K).

Given K clusters Ck, each with a centroid mk, the input data points xi are assigned to clusters so as to minimize the reconstruction error:\[R_{err}(K)=\min_{C_k}\sum_{k=1}^{K}\sum_{x_i\in C_k}\left \| x_i-m_k\right \|^2\]
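As an illustration, here is a minimal NumPy sketch of ours for the assignment and update steps that minimize this reconstruction error (a simplification; it is not the scikit-learn or Geomstats implementation):

import numpy as np

def assign(data: np.array, centroids: np.array) -> np.array:
    # Assign each point to its closest centroid (intra-cluster similarity)
    distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1)

def reconstruction_error(data: np.array, centroids: np.array, labels: np.array) -> float:
    # Sum of squared Euclidean distances to the assigned centroids
    return float(np.sum(np.square(data - centroids[labels])))

def update_centroids(data: np.array, labels: np.array, k: int) -> np.array:
    # Recompute each centroid as the mean of its assigned points
    return np.array([data[labels == j].mean(axis=0) for j in range(k)])

Iterating assign and update_centroids until the labels stabilize is exactly Lloyd's algorithm; the manifold version replaces the Euclidean distance and mean with geodesic distance and the Fréchet mean.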

Clustering data on a manifold


To assess and contrast the K-means model in both Euclidean space and on a hypersphere, it's necessary to create clustered data. This involves a two-step process:
  1. Generate a template cluster by employing a random generator on the manifold.
  2. Generate 4 clusters from the template using a special orthogonal Lie group in 3-dimensional space, SO(3).

Randomly generated manifold data

Let's evaluate and compare the following random generators for data points on the hypersphere we introduced in a previous article [ref 8]:
  • Uniform distribution
  • Uniform distribution with constraints
  • von Mises-Fisher distribution

Uniform distribution
We start with the basic random uniform generator over the interval [0, 1]:\[r=rand_{[0,1]}(x)\]The data points for the 4 clusters are visualized in the following plot.
4-Cluster random generation using uniform distribution
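For reference, uniform sampling on the unit sphere can be emulated by normalizing independent Gaussian draws; this short sketch of ours is equivalent in distribution to the Geomstats Hypersphere.random_uniform method used later, not its actual implementation:

import numpy as np

def random_uniform_on_sphere(n_samples: int, dim: int = 3) -> np.array:
    # Normalized i.i.d. Gaussian draws are uniformly distributed
    # on the unit sphere S^(dim-1)
    x = np.random.normal(size=(n_samples, dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)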


Constrained uniform random generator
In this scenario, we constrain the random values r on each of the 3 dimensions (or axes) within a sub-interval [ai, bi].\[r=rand_{[0,1]}(x)\ \ \ a_i < r_i < b_i\]

4-Cluster random generation using constrained uniform distribution

von Mises-Fisher random generator
This generator relies on a generative mixture model for clustering directional data based on the von Mises-Fisher distribution [ref 9].
Given a d-dimensional unit random vector x on a hypersphere of dimension d-1, the d-variate von Mises-Fisher distribution is defined by the following probability density function:\[f(x|\mu , \kappa )=C_d(\kappa)\,e^{\kappa \mu^Tx} \ \ \  \ C_d(\kappa)=\frac{\kappa^{\frac{d}{2}-1}}{(2\pi)^{\frac{d}{2}}I_{\frac{d}{2}-1}(\kappa)}\]
4-Cluster random generation using von Mises-Fisher distribution
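To make the density concrete, here is a small sketch of ours that evaluates the normalization constant Cd(κ) and the density, using scipy.special.iv for the modified Bessel function of the first kind:

import numpy as np
from scipy.special import iv  # Modified Bessel function of the first kind

def vmf_density(x: np.array, mu: np.array, kappa: float) -> float:
    # d-variate von Mises-Fisher density for unit vectors x and mu
    d = x.shape[0]
    c_d = kappa ** (d / 2 - 1) / ((2 * np.pi) ** (d / 2) * iv(d / 2 - 1, kappa))
    return float(c_d * np.exp(kappa * np.dot(mu, x)))

A large kappa concentrates the mass around the mean direction mu, which is why kappa=60 in the implementation below produces tight, well-separated clusters.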

As anticipated, using a pure uniform random generator distributes data evenly across the hypersphere, rendering it ineffective for evaluating KMeans.
Instead, we will employ the von Mises-Fisher distribution and a constrained uniform random generator to more effectively analyze the performance of KMeans on a Riemann manifold.

Synthetic clusters using SO(3)

We leverage the SO(3) Lie group to replicate the randomly generated cluster.

Although the discussion of Lie groups and special orthogonal group in 3-dimensional space is beyond the scope of this article, here is a short summary:
In differential geometry, Lie groups play a crucial role by connecting the concepts of algebra and geometry. A Lie group is a mathematical structure that is both a group and a differentiable manifold. This means that the group operations of multiplication and taking inverses are smooth (differentiable), and it allows the application of calculus within the group structure.

The Special Orthogonal Lie group in 3-dimensional space, SO(3), is simply the group of 3 x 3 orthogonal matrices with determinant = 1. These matrices represent rotations in 3-dimensional space and form a compact Lie group.
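As a quick sanity check (a sketch of ours, not part of the article's implementation), we can draw a random element of SO(3) with Geomstats and verify orthogonality and unit determinant before applying it to a cluster:

import numpy as np
from geomstats.geometry.special_orthogonal import SpecialOrthogonal

so3_lie_group = SpecialOrthogonal(3, equip=False)
rotation = so3_lie_group.random_uniform()   # A random 3 x 3 rotation matrix

assert np.allclose(rotation.T @ rotation, np.eye(3), atol=1e-6)  # Orthogonality
assert np.isclose(np.linalg.det(rotation), 1.0)                  # Determinant = 1

Because a rotation preserves norms, applying it to points on the hypersphere keeps them on the hypersphere, which is what makes SO(3) a convenient tool for replicating clusters.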

Implementation

Setup 

Let's wrap the random generators and KMeans training methods in a class, KMeansOnManifold.
The von Mises-Fisher generator for the data in the initial cluster is initialized with a mean, _mu, and an arbitrary kappa value. The constrained uniform random generator accepts random values in each dimension: x: [-1, -0.35], y: [0.3, 1] and z: [-1, -0.4].
The SO(3) Lie group is initialized without metric (equip=False) to generate the 4 synthetic clusters.

import numpy as np
from typing import AnyStr, List

from geomstats.geometry.hypersphere import Hypersphere
from geomstats.geometry.special_orthogonal import SpecialOrthogonal

class KMeansOnManifold(object):

   def __init__(self, num_samples: int, num_clusters: int, random_gen: AnyStr):
     # Step 1: Initialize the manifold
     self.hypersphere = Hypersphere(dim=2, equip=True)

     # Step 2: Generate a single cluster with random data points on hypersphere
     match random_gen:
        case 'random_von_mises_fisher':
             # Select a pivot or mean value
             _mu = self.hypersphere.random_uniform(n_samples=1)
             # Generate the cluster
             cluster = self.hypersphere.random_von_mises_fisher(
                      mu=_mu[0], 
                      kappa=60, 
                      n_samples=num_samples, 
                      max_iter=200)
        
        case 'random_riemann_normal':
             cluster = self.hypersphere.random_riemannian_normal(n_samples=num_samples, max_iter=300)
        
        case 'random_uniform':
            cluster = self.hypersphere.random_uniform(n_samples=num_samples)
        
        case 'constrained_random_uniform':
            # Generate random values with constraints on each dimension
            y = [x for x in self.hypersphere.random_uniform(n_samples=100000)
                 if x[0] <= -0.35 and x[1] >= 0.3 and x[2] <= -0.40]
            cluster = np.array(y)[0:num_samples]

        case _:
            raise ValueError(f'{random_gen} generator is not supported')
        
     # Step 3: Generate other clusters using SO(3) manifolds
     so3_lie_group = SpecialOrthogonal(3, equip=False)
     # Generate the clusters
     self.clusters = [cluster @ so3_lie_group.random_uniform() for _ in range(num_clusters)]



Data in Euclidean space

The data class, KMeansResult, encapsulates the output (centroid and label) of training KMeans on the synthetic clustered data.

@dataclass
class KMeansResult:
    center: np.array
    label: np.array

We rely on the k-means implementation in the scikit-learn library [ref 10] (class KMeans) to identify the clusters in Euclidean space, selecting the elkan algorithm and k-means++ initialization.

def euclidean_clustering(self) -> List[KMeansResult]:
   from sklearn.cluster import KMeans

   kmeans = KMeans(
       n_clusters=len(self.clusters), 
       init='k-means++', 
       algorithm='elkan',
       max_iter=140)
  
   # Create a data set from points in clusters
   data = np.concatenate(self.clusters, axis=0)
   kmeans.fit(data)

   # Extract centroids and labels
   centers = kmeans.cluster_centers_
   labels = kmeans.labels_

   return [KMeansResult(center, label) for center, label in zip(centers, labels)]

Output:
Cluster Center: [ 0.56035023 -0.4030522   0.70054776], Label: 0
Cluster Center: [-0.1997325  -0.38496744  0.8826764 ], Label: 2
Cluster Center: [0.04443849 0.86749237 0.46118632], Label: 3
Cluster Center: [-0.83876485 -0.45621187  0.23570083], Label: 1

Clearly, KMeans was not able to identify the proper clusters.

Data on hypersphere

The method to train k-means on the hypersphere uses the same semantics as its scikit-learn counterpart. It leverages the Geomstats RiemannianKMeans class.

def riemannian_clustering(self) -> List[KMeansResult]:
    from geomstats.learning.kmeans import RiemannianKMeans

    # Invoke the Geomstats Riemannian KMeans
    kmeans = RiemannianKMeans(space=self.hypersphere, n_clusters=len(self.clusters))

    # Build the data set from the clustered data points
    data = gs.concatenate(self.clusters, axis=0)

    kmeans.fit(data)

    # Extract centroids and labels
    centers = kmeans.centroids_
    labels = kmeans.labels_
    return [KMeansResult(center, label) for center, label in zip(centers, labels)]


Similar to k-means in Euclidean space, we identify the centroids for 4 clusters using 500 randomly generated samples.

num_samples = 500
num_clusters = 4
kmeans = KMeansOnManifold(num_samples, num_clusters, 'random_von_mises_fisher')
kmeans_result = kmeans.riemannian_clustering()


Output:
500 random samples on 4 clusters with von Mises-Fisher distribution
Cluster Center: [ 0.17772496 -0.36363422  0.91443097], Label: 2
Cluster Center: [ 0.44403679  0.06735507 -0.89347335], Label: 0
Cluster Center: [ 0.85407911 -0.50905801  0.10681211], Label: 3
Cluster Center: [ 0.90899637  0.02635062 -0.41597025], Label: 1

500 random samples on 4 clusters with constrained uniform distribution
Cluster Center: [-0.05344069 -0.91613807  0.3972847 ], Label: 1
Cluster Center: [ 0.6796575   0.39400079 -0.61873181], Label: 2
Cluster Center: [ 0.51799972 -0.67116261 -0.530299 ], Label: 0
Cluster Center: [ 0.49290501 -0.45790221 -0.73984473], Label: 3

Note: The labels are arbitrary indices assigned to each cluster for the purpose of visualization and validation against true labels.
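When ground-truth labels are available, this indexing ambiguity is commonly resolved with the Hungarian algorithm. The following sketch (our addition, relying on scipy.optimize.linear_sum_assignment, not code from this project) remaps predicted cluster indices to the true ones:

import numpy as np
from scipy.optimize import linear_sum_assignment

def align_labels(true_labels: np.array, pred_labels: np.array) -> np.array:
    # Contingency matrix: counts of (true cluster, predicted cluster) pairs
    k = int(max(true_labels.max(), pred_labels.max())) + 1
    counts = np.zeros((k, k))
    for t, p in zip(true_labels, pred_labels):
        counts[t, p] += 1
    # Maximize the number of matches (minimize negated counts)
    _, col_ind = linear_sum_assignment(-counts)
    mapping = {pred: true for true, pred in enumerate(col_ind)}
    return np.array([mapping[p] for p in pred_labels])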

References



--------------------------------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning", Packt Publishing ISBN 978-1-78712-238-3

Appendix

Here is the list of published articles related to geometric learning:
Geometric Learning in Python: Basics introduced differential geometry as applied to machine learning and its basic components.
Geometric Learning in Python: Manifolds described manifold components such as tangent vectors and geodesics, with implementation in Python for the hypersphere using the Geomstats library.
Geometric Learning in Python: Intrinsic Representation reviewed the various coordinate systems using extrinsic and intrinsic representations.
Geometric Learning in Python: Vector and Covector fields described vector and covector fields with Python implementations in 2 and 3-dimensional spaces.
Geometric Learning in Python: Vector Operators illustrated the differential operators gradient, divergence, curl and Laplacian using the SymPy library.
Geometric Learning in Python: Functional Data Analysis described the key elements of non-linear functional data analysis to analyze curves, images, or functions in very high-dimensional spaces.
Geometric Learning in Python: Riemann Metric & Connection reviewed the Riemannian metric tensor, Levi-Civita connection and parallel transport on the hypersphere.
Geometric Learning in Python: Riemann Curvature described the intricacies of the Riemannian curvature tensor and its implementation in Python using the Geomstats library.

#geometriclearning #riemanngeometry #manifold #ai #python #geomstats #Liegroups #kmeans

 

Sunday, April 7, 2024

Geometric Learning in Python: Functional Data Analysis

Target audience: Advanced
Estimated reading time: 7'

In the realms of healthcare and IT monitoring, I encountered the challenge of managing multiple data points across various variables, features, or observations. Functional data analysis (FDA) is well-suited for addressing this issue. 
This article explores how the Hilbert sphere can be used to conduct FDA in non-linear spaces.

Table of contents
        FDA methods
        Formal notation
        Manifold structure
        Inner product
        Exponential map
        Logarithm map
References

What you will learn: Basic concepts of functional data analysis in non-linear spaces through the use of manifolds, along with a hands-on application of the Hilbert sphere using Geomstats in Python.

Notes
  • Environments: Python 3.10.10, Geomstats 2.7.0
  • This article assumes that the reader is somewhat familiar with differential and tensor calculus [ref 1]. Please refer to the previous articles related to geometric learning [ref 2, 3].
  • Source code is available at  Github.com/patnicolas/Data_Exploration/manifolds
  • To enhance the readability of the algorithm implementations, we have omitted non-essential code elements like error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers, and import statements.

Introduction

This article provides a summary of functional data analysis, then introduces and implements a technique specific to non-linear manifolds: the Hilbert sphere.

This article is the 6th installment in our series on Geometric Learning in Python.

Functional data analysis

Functional data analysis (FDA) is a statistical approach designed for analyzing curves, images, or functions that exist within higher-dimensional spaces [ref 4].

Observation data types

Panel Data:
In fields like health sciences, data collected through repeated observations over time on the same individuals is typically known as panel data or longitudinal data. Such data often includes only a limited number of repeated measurements for each unit or subject, with varying time points across different subjects.

Time Series:
This type of data comprises single observations made at regular time intervals, such as those seen in financial markets.

Functional Data:
Functional data involves diverse measurement points across different observations (or subjects). Typically, this data is recorded over consistent time intervals and frequencies, featuring a high number of measurements per observational unit/subject.

FDA methods


In Functional Data Analysis (FDA), the primary subjects of study are random functions, which are elements in a function space representing curves, trajectories, or surfaces. The statistical modeling and inference occur within this function space. Due to its infinite dimensionality, the function space requires a metric structure, typically a Hilbert structure, to define its geometry and facilitate analysis.

When the function space is properly established, a data scientist can perform various analytical tasks, including:
  • Computing statistics such as mean, covariance, and mode
  • Conducting classification and regression analyses
  • Performing hypothesis testing with methods like T-tests and ANOVA
  • Executing clustering
  • Carrying out inference
The following diagram illustrates a set of random functions around a smooth function over the interval [0, 1]:\[\tilde{X}(t)=3e^{-t^{2}}\sin(25t)-2t^{2}+5 \ \ \ t\in [0, 1]\]

Fig. 1 Visualization of random functions in Hilbert space
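The random functions of Fig. 1 can be generated along these lines with a short sketch of ours: sample the smooth function on a grid and add independent Gaussian perturbations (the noise level sigma is an assumption):

import numpy as np

t = np.linspace(0.0, 1.0, 256)
x_smooth = 3 * np.exp(-t ** 2) * np.sin(25 * t) - 2 * t ** 2 + 5

# Random functions: the smooth curve plus independent Gaussian perturbations
num_functions = 8
sigma = 0.3   # Assumed noise level
random_functions = x_smooth + np.random.normal(0.0, sigma, size=(num_functions, len(t)))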

Methods in FDA are classified based on the type of manifold (linear or nonlinear) and the dimensionality or feature count of the space (finite or infinite). The categorization and examples of FDA techniques are demonstrated in the table below.

                     Manifold
Dimension    Linear                         Non-linear
Finite       Euclidean space Rn             Special orthogonal group SO(3)
Infinite     Square integrable functions    Hilbert sphere

Table 1: Categorization of FDA techniques

This article focuses on the Hilbert sphere, a specific function space equipped with a Riemannian metric (inner product).

Formal notation

Let's consider a sample of n observations xi generated by random functions Xi:\[x_{i}(t)=X_{i}(t)\in \mathbb{R},\ \ i=1,\dots,n\ \ \ \ t\in T \subset \mathbb{R}\]
The function space is a manifold of square integrable functions defined as\[\textit{L}^{2}(T)=\left \{ f: T\rightarrow \mathbb{R}\ |\ \int_{T} f(t)^{T}f(t)dt < \infty \right \}\]The Riemannian metric tensor for tangent vectors f, g is induced by, and equal to, the inner product:\[\left \langle f, g \right \rangle = \int_{T} f(t)^{T}g(t)dt\ \ \ \left \| f \right \| _{\mathit{L}^{2}}=\sqrt{\left \langle f, f \right \rangle} \ \ \ \  (1)\]
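Numerically, the inner product (1) over T = [0, 1] reduces to a quadrature on the sampled functions. Here is a minimal sketch of ours using the trapezoidal rule (a simplification of what the Geomstats metric computes on evenly spaced samples, not the library code itself):

import numpy as np

def l2_inner_product(f: np.array, g: np.array) -> float:
    # f and g are sampled on an evenly spaced grid over T = [0, 1]
    t = np.linspace(0.0, 1.0, len(f))
    return float(np.trapz(f * g, t))

def l2_norm(f: np.array) -> float:
    return float(np.sqrt(l2_inner_product(f, f)))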

Hilbert sphere

Hilbert space is a type of vector space that comes with an inner product, which establishes a distance function, making it a complete metric space. In the context of functional data analysis, attention is primarily given to functions that are square-integrable [ref 5].

Hilbert spaces have numerous important applications:
  • Probability theory: the space of random variables centered by the expectation
  • Quantum mechanics
  • Differential equations
  • Biological structures (protein structures, folds, ...)
  • Medical imaging (MRI, CT scan, ...)
  • Meteorology

The Hilbert sphere S, which is infinite-dimensional, has been extensively used for modeling density functions and shapes, surpassing its finite-dimensional equivalent. This spherical Hilbert geometry facilitates invariant properties and allows for the efficient computation of geometric measures.

The Hilbert sphere is a particular case of a function space, defined as:\[H(T)=\left \{ f: T\rightarrow \mathbb{R}\ | \ \left \| f \right \|_{L^{2}}= 1 \right \}\]The Riemannian exponential map at p, from the tangent space to the Hilbert sphere, preserves the distance to the origin and is defined as:\[exp_{p}(f)=\cos\left ( \left \| f \right \|_{E} \right )p+\sin\left ( \left \| f \right \|_{E} \right)\frac{f}{\left \| f \right \|_{E}} \ \ \ \ (2) \] where ||f||E is the norm of f in Euclidean space.
The logarithm (or inverse exponential) map at point p is defined as:\[log_{p}(f)=\arccos\left (\left \langle p, f \right \rangle_{p} \right )\frac{f}{\left \| f \right \|} \ \ \ \  (3) \]
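Formulas (2) and (3) translate almost verbatim into NumPy. The following sketch of ours assumes f is a tangent vector at p sampled on the same grid; it is a literal transcription of the formulas above, not the Geomstats implementation:

import numpy as np

def exp_map(f: np.array, p: np.array) -> np.array:
    # Formula (2): exponential map on the Hilbert sphere at point p
    norm_f = np.linalg.norm(f)
    return np.cos(norm_f) * p + np.sin(norm_f) * f / norm_f

def log_map(f: np.array, p: np.array) -> np.array:
    # Formula (3): logarithm (inverse exponential) map at point p
    return np.arccos(np.dot(p, f)) * f / np.linalg.norm(f)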

Implementation

We will illustrate the manifold operations on the Hilbert sphere, building on the hypersphere we introduced in a previous article, Geometric Learning in Python: Manifolds.
We leverage the class ManifoldPoint, introduced in our previous post, ManifoldPoint definition, and used across our series on geometric learning.
As a reminder:

@dataclass
class ManifoldPoint:
    id: AnyStr
    location: np.array
    tgt_vector: List[float] = None
    geodesic: bool = False
    intrinsic: bool = False

Manifold structure

Let's develop a wrapper class named FunctionSpace to facilitate the creation of points on the Hilbert sphere and to carry out the calculation of the inner product, as well as the exponential and logarithm maps related to the tangent space. 

Our implementation relies on the Geomstats library [ref 6], introduced in Geometric Learning in Python: Manifolds.

The function space will be constructed using num_domain_samples, which are evenly spaced real values within the interval [0, 1]. Points on a manifold can be generated using either the Geomstats HilbertSphere.random_point method or by specifying a base point, base_point, and a directional vector.

import numpy as np
import geomstats.backend as gs
from typing import AnyStr, List

from geomstats.geometry.functions import HilbertSphere, HilbertSphereMetric


class FunctionSpace(HilbertSphere):
  def __init__(self, num_domain_samples: int):
      domain_samples = gs.linspace(0, 1, num=num_domain_samples)
      super(FunctionSpace, self).__init__(domain_samples, True)

  def create_manifold_point(self, id: AnyStr, vector: np.array, base_point: np.array) -> ManifoldPoint:
     # Compute the tangent vector using the direction 'vector' and point 'base_point'
     tgt_vector = self.to_tangent(vector, base_point)
     return ManifoldPoint(id, base_point, tgt_vector)

  def random_manifold_points(self, n_samples: int) -> List[ManifoldPoint]: 
     return [ManifoldPoint(
           id=f'rand_{n+1}',
           location=random_pt) 
           for n, random_pt in enumerate(self.random_point(n_samples))]

Let's generate a point on the Hilbert sphere using a random base point on the manifold and a 4-dimensional vector.

num_samples = 4
function_space = FunctionSpace(num_samples)
random_base_pt = function_space.random_point()

vector = np.array([1.0, 0.5, 1.0, 0.0])
manifold_pt = function_space.create_manifold_point('id', vector, random_base_pt)

Output:
Manifold point: 
    Base point=[[0.13347 0.85738 1.48770 0.29235]], 
    Tangent Vector=[[ 0.91176 -0.0667 0.01656 -0.19326]],
    No Geodesic, 
    Extrinsic

Inner product

Let's wrap formula (1) into a method. We introduce the inner_product method to the FunctionSpace class, which encapsulates the call to self.metric.inner_product from the Geomstats HilbertSphere class.

This method requires two parameters:
  • vector_1: The first vector used in the computation of the inner product
  • vector_2: The second vector used in the computation of the inner product
The second method, manifold_point_inner_product, adds the base point on the manifold without the assumption of parallel transport. The base point is the origin of both the tangent vector associated with the base point, manifold_base_pt, and the tangent vector associated with the second point, manifold_pt.

def inner_product(self, tgt_vector1: np.array, tgt_vector2: np.array) -> np.array:
   return self.metric.inner_product(tgt_vector1, tgt_vector2)

def manifold_point_inner_product(
       self,
       manifold_base_pt: ManifoldPoint,
       manifold_pt: ManifoldPoint) -> np.array:

   return self.metric.inner_product(
               manifold_base_pt.tgt_vector,
               manifold_pt.tgt_vector,
               manifold_base_pt.location)

Let's calculate the inner product of two specific numpy vectors in an 8-dimensional space, using our class FunctionSpace, along with the Euclidean norm and the norm on the tangent space for one of the vectors.

num_Hilbert_samples = 8
functions_space = FunctionSpace(num_Hilbert_samples)
        
vector1 = np.array([0.5, 1.0, 0.0, 0.4, 0.7, 0.6, 0.2, 0.9])
vector2 = np.array([0.5, 0.5, 0.2, 0.4, 0.6, 0.6, 0.5, 0.5])
inner_prod = functions_space.inner_product(vector1, vector2)
print(f'Inner product of vectors 1 & 2: {str(inner_prod)}')
print(f'Euclidean norm of vector 1: {np.linalg.norm(vector1)}')
print(f'Norm of vector 1: {math.sqrt(functions_space.inner_product(vector1, vector1))}')

Output:
Inner product of vectors 1 & 2: 0.2700
Euclidean norm of vector 1: 1.7635
Norm of vector 1: 0.6071

Exponential map

Let's wrap formula (2) into a method. We introduce the exp method to the FunctionSpace class, which encapsulates the call to self.metric.exp from the Geomstats HilbertSphere class.

This method requires two parameters:
  • vector: The directional vector used in the computation of the exponential map
  • manifold_base_pt: The base point on the manifold.
def exp(self, vector: np.array, manifold_base_pt: ManifoldPoint) -> np.array:
   return self.metric.exp(tangent_vec=vector, base_point=manifold_base_pt.location)

Let's compute the exponential map at a random base point on the manifold for an 8-dimensional numpy vector, using the class FunctionSpace.

num_Hilbert_samples = 8
function_space = FunctionSpace(num_Hilbert_samples)

vector = np.array([0.5, 1.0, 0.0, 0.4, 0.7, 0.6, 0.2, 0.9])
assert num_Hilbert_samples == len(vector)
        
exp_map_pt = function_space.exp(vector, function_space.random_manifold_points(1)[0])
print(f'Exponential on Hilbert Sphere:\n{str(exp_map_pt)}')

Output:
Exponential on Hilbert Sphere: 
[0.97514 1.6356 0.15326 0.59434 1.06426 0.74871 0.24672 0.95872]

Logarithm map

Let's wrap formula (3) into a method. We introduce the log method to the FunctionSpace class, which encapsulates the call to self.metric.log from the Geomstats HilbertSphere class.

This method requires two parameters:
  • manifold_base_pt: The base point on the manifold.
  • target_pt: Another point on the manifold, used to produce the log map.

def log(self, manifold_base_pt: ManifoldPoint, target_pt: ManifoldPoint) -> np.array:
   return self.metric.log(point=target_pt.location, base_point=manifold_base_pt.location)

Let's compute the logarithm map between two random points on the manifold, using the class FunctionSpace.

num_Hilbert_samples = 8
function_space = FunctionSpace(num_Hilbert_samples)

random_points = function_space.random_manifold_points(2)
log_map_pt = function_space.log(random_points[0], random_points[1])
print(f'Logarithm from Hilbert Sphere {str(log_map_pt)}')

Output:
Logarithm from Hilbert Sphere 
[1.39182 -0.08986 0.32836 -0.24003 0.30639 -0.28862 -0.431680 4.15148]


References




-------------
Patrick Nicolas has over 25 years of experience in software and data engineering, architecture design and end-to-end deployment and support with extensive knowledge in machine learning. 
He has been director of data engineering at Aideo Technologies since 2017 and he is the author of "Scala for Machine Learning", Packt Publishing ISBN 978-1-78712-238-3