Monday, December 16, 2024
2024 Reviews of Papers on Geometric Learning
- ChatGPT for Computational Topology
- An introduction to Topological Data Analysis
- Synthetic Data Generation and Deep Learning for the Topological Analysis of 3D Data
- Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles
- Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems
- Reliable Malware Analysis and Detection using Topology Data Analysis
- Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning
- Categorical Foundations of Explainable AI
- Introduction to Geometric Learning in Python with Geomstats
- Machine Learning Algebraic Geometry for Physics
- Deep Learning in Asset Pricing
- Intrinsic and extrinsic deep learning on manifolds
- Kalman Filters on Differentiable Manifolds
- Manifold Matching via Deep Metric Learning for Generative Modeling
- Machine learning a manifold
- A Geometric Perspective on Variational Autoencoders
- Deep Hyperspherical Learning
- Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders
- Learning Manifold Dimensions with Conditional Variational Autoencoders
- Variational Transformer Autoencoder with Manifolds Learning
- Riemannian Score-Based Generative Modelling
- Riemannian Diffusion Models
- Convolutional Neural Networks on Manifolds: From Graphs and Back
- Riemannian Residual Neural Networks
- A singular Riemannian geometry approach to Deep Neural Networks
- Transformer with Hyperbolic Geometry
- Deep Extrinsic Manifold Representation for Vision Tasks
- Manifold Matching via Deep Metric Learning for Generative Modeling
- Pullback Flow Matching on Data Manifolds
- A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold
1. ChatGPT for Computational Topology
Through dialogue with ChatGPT, mathematicians can impart their knowledge of mathematical concepts and theories in plain English. ChatGPT then uses this guidance to convert the mathematical theory into executable algorithms and code, eliminating the need for mathematicians to be versed in the intricacies of coding.
The paper concentrates on directing ChatGPT to produce Python code, utilizing the 'networkx' and 'scipy' libraries, for several tasks:
- Computing Betti numbers for a simplicial complex
- Constructing a Vietoris-Rips complex
- Creating a boundary matrix for a specified simplicial complex
- Formulating a graph Laplacian matrix.
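To give a flavor of the code such prompts elicit, here is a minimal sketch of my own, assuming the same networkx and scipy libraries, that builds the 1-skeleton of a Vietoris-Rips complex, extracts its graph Laplacian, and reads off the Betti number b0 (the point cloud and epsilon are illustrative, not from the paper):

```python
import networkx as nx
import numpy as np
from scipy.spatial.distance import pdist, squareform

def vietoris_rips_graph(points: np.ndarray, epsilon: float) -> nx.Graph:
    """Connect any two points closer than epsilon: the 1-skeleton of a Vietoris-Rips complex."""
    dist = squareform(pdist(points))
    g = nx.Graph()
    g.add_nodes_from(range(len(points)))
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist[i, j] <= epsilon:
                g.add_edge(i, j)
    return g

points = np.random.rand(20, 2)
g = vietoris_rips_graph(points, epsilon=0.3)
laplacian = nx.laplacian_matrix(g).toarray()  # graph Laplacian L = D - A
b0 = nx.number_connected_components(g)        # Betti number b0 of the 1-skeleton
```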
2. An introduction to Topological Data Analysis
This article provides a comprehensive yet concise guide to applying Topological Data Analysis (TDA) in data analysis, offering just enough information without overwhelming the reader. TDA is a burgeoning field that utilizes novel tools from algebraic topology and computational geometry to extract significant features from complex data sets.
What I appreciate about the paper:
- Clear definition of the TDA pipeline, which interestingly doesn't rely on Euclidean metrics.
- Its accessibility to non-experts, introducing fundamental TDA concepts like metric spaces, simplicial complexes, and persistent homology in an understandable manner (although I feel that the sections on applying persistent homology to data analysis, feature engineering, and optimizing machine learning architectures could have been more detailed and extensive).
- The way it illustrates the concept of filtration with practical examples and visual aids
- The application of persistent homology to proteins, demonstrated with Python code and the Gudhi library.
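As a companion to the protein example, here is a minimal Gudhi sketch of the persistent-homology pipeline described above; the random point cloud is only a stand-in for real atomic coordinates:

```python
import numpy as np
import gudhi

points = np.random.rand(100, 3)  # illustrative stand-in for protein atom coordinates

# Build a Vietoris-Rips filtration and compute its persistence diagram.
rips = gudhi.RipsComplex(points=points, max_edge_length=0.5)
simplex_tree = rips.create_simplex_tree(max_dimension=2)
diagram = simplex_tree.persistence()  # list of (dimension, (birth, death)) pairs

print(simplex_tree.betti_numbers())   # Betti numbers, once persistence is computed
```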
3. Synthetic Data Generation and Deep Learning for the Topological Analysis of 3D Data
The paper explores the application and comparison of various networks for semantic segmentation of topology data, offering an enhanced TDA method. This includes: (1) a transformer block featuring self-attention for 3D point clouds, and (2) a manifold curvature estimator, both yielding similar outcomes.
4. Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles
The paper discusses how defining symmetry properties in the hidden layers of a deep learning model can streamline its training and make it easier to interpret. It introduces a neural network architecture that detects and classifies symmetric patterns in a labeled training dataset.
This method utilizes group theory to:
- Generate symmetry transformations that maintain the integrity of labeled data.
- Pinpoint infinitesimal transformations, known as symmetry generators.
- Modify the machine learning model's loss function to uncover subalgebras of the symmetry group's Lie algebra, formed as linear combinations of the symmetry generators (see the sketch after this list).
- Ascertain the complete symmetry group and Lie subgroups that optimize the number of generators.
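The following is a hedged PyTorch sketch of the kind of loss modification described above, not the authors' exact formulation: an invariance term keeps the labeled data unchanged under infinitesimal transformations, while normalization and orthogonality terms push the candidate generators to be non-trivial and distinct. The function and argument names are my own:

```python
import torch

def symmetry_loss(f, generators, x, eps=1e-3):
    """f: trained oracle/model; generators: list of candidate (d, d) matrices; x: (n, d) batch."""
    loss = 0.0
    for G in generators:
        x_t = x + eps * x @ G.T                        # infinitesimal transformation of the inputs
        loss = loss + ((f(x_t) - f(x)) ** 2).mean()    # invariance of the labeled data
    for i, Gi in enumerate(generators):
        loss = loss + (Gi.norm() - 1.0) ** 2           # keep each generator non-trivial
        for Gj in generators[i + 1:]:
            loss = loss + (Gi * Gj).sum() ** 2         # encourage mutually orthogonal generators
    return loss
```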
5. Autoencoders for discovering manifold dimension and coordinates in data from complex dynamical systems
The concept was put to the test for identifying 3-dimensional (and, in another case, 8-dimensional) manifolds within a 4-dimensional (respectively, infinite-dimensional) Euclidean space, using data derived from various nonlinear partial differential equations. The outcomes are remarkable: the proposed method uniquely succeeded in pinpointing the leading 3 (or 8) singular values/eigenvalues, outperforming both PCA and generic autoencoders.
The reader should have basic knowledge of unsupervised learning and dimensionality reduction.
The Python code is available on GitHub: https://github.com/mdgrahamwisc/IRMAE_WD
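A minimal sketch of the underlying idea, assuming (as in the paper's general setup) that the manifold dimension is read off the singular-value spectrum of the learned latent codes; the threshold is an illustrative choice of mine:

```python
import numpy as np

def estimate_dimension(latent_codes: np.ndarray, threshold: float = 1e-3) -> int:
    """Count significant singular values of the centered latent codes."""
    z = latent_codes - latent_codes.mean(axis=0)
    singular_values = np.linalg.svd(z, compute_uv=False)
    return int((singular_values / singular_values.max() > threshold).sum())
```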
6. Reliable Malware Analysis and Detection using Topology Data Analysis
This paper introduces and evaluates three topological data analysis (TDA) techniques to efficiently analyze and detect complex malware signatures:
- TDA mapper
- Persistent homology
- Topological model analysis tool
TDA is known for its robustness to noise and imbalanced datasets.
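For readers who want to experiment, here is a minimal Mapper sketch using the open-source kmapper library (not necessarily the implementation used by the authors); the random matrix stands in for malware feature vectors:

```python
import numpy as np
import kmapper as km
from sklearn.cluster import DBSCAN

data = np.random.rand(500, 10)                        # stand-in for malware feature vectors
mapper = km.KeplerMapper(verbose=0)
lens = mapper.fit_transform(data, projection=[0, 1])  # project onto two features as the lens
graph = mapper.map(
    lens,
    data,
    cover=km.Cover(n_cubes=10, perc_overlap=0.3),     # overlapping cover of the lens space
    clusterer=DBSCAN(eps=0.5, min_samples=3),         # cluster within each cover element
)
```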
8. Categorical Foundations of Explainable AI
Category Theory to Explain Model Prediction
Explainable AI (XAI), a vital research area addressing ethical and legal issues in AI, still lacks a solid mathematical foundation. Category theory offers a solution.
9. Introduction to Geometric Learning in Python with Geomstats
The paper highlights Geomstats, an open-source Python toolkit for handling data on non-linear manifolds. The package includes tutorials that combine theoretical concepts with practical Python implementations, such as:
- Statistics and Geometric Statistics, demonstrated through hypersphere and Frechet mean examples.
- Techniques for managing data on manifolds, utilizing the exponential map (tangent vectors) and the Riemannian metric tensor.
- Classifying Symmetric Positive Definite matrices, which allows for the application of standard learning algorithms, like the K-Nearest Neighbors classifier, to non-linear manifolds.
- Graph representations that involve embeddings in hyperbolic spaces.
This paper serves as an ideal entry point into the world of manifold learning, offering an easy-to-understand overview without requiring deep expertise in differential geometry.
Source code available at github.com/geomstats/geomstats
A quick tutorial is available at https://youtu.be/Ju-Wsd84uG0
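A minimal taste of the first tutorial, computing a Frechet mean on the hypersphere (note that, depending on the Geomstats version, FrechetMean is constructed from the space or from its metric):

```python
from geomstats.geometry.hypersphere import Hypersphere
from geomstats.learning.frechet_mean import FrechetMean

sphere = Hypersphere(dim=2)
data = sphere.random_uniform(n_samples=50)   # sample points on the 2-sphere

mean = FrechetMean(sphere)                   # recent versions take the space directly
mean.fit(data)
print(mean.estimate_)                        # point minimizing mean squared geodesic distance
```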
This paper is also a valuable companion to any discussion or teaching of Information Geometry (reference: Information Geometry: Near Randomness and Near Independence, K. Arwini and C.T.J. Dodson, Springer, 2008).
10. Machine Learning Algebraic Geometry for Physics
While it is common to apply laws of physics, such as partial differential equations, to constrain deep learning models, using machine learning combined with algebraic/differential geometry to process the large datasets produced by physics is relatively unusual.
The paper utilizes the Calabi-Yau manifold, derived from the string theory concept of a 10-dimensional spacetime landscape. It examines unsupervised models such as PCA, Topology Data Analysis, Clustering, and then explores neural networks used for data analysis on hypersurfaces. The authors delve into a variety of subjects including projecting to lower-dimensional spaces, Hilbert series, interactions, and equivalences of Branes, as well as cluster mutations.
The study concludes by discussing the optimal transport problem and Kahler geometry in the context of Generative Adversarial Networks.
11. Deep Learning in Asset Pricing
The assessment and comparison of various machine learning models were conducted through Monte Carlo simulations, with the mean absolute percentage error (MAPE) as the objective.
- Predicting stock prices is challenged by significant issues related to distribution and concept shift.
- Advanced deep learning models generally surpass traditional linear predictors in performance.
- Among these, Recurrent Neural Networks, especially LSTM and GRU featuring attention mechanisms and transformers, demonstrate superior effectiveness.
- The addition of skip connections in deep layers did not lead to performance enhancements.
- Convolutional Neural Networks were found to be less effective than even simple feedforward neural networks.
Note: This paper is accessible to those without financial expertise and with only a basic understanding of deep learning, as the models are explained in clear terms.
12. Intrinsic and extrinsic deep learning on manifolds
Extrinsic deep neural network on manifolds
This architecture maintains the geometric characteristics of manifolds by employing an equivariant embedding of the manifold into Euclidean space.
This method is applied in regression models or Gaussian processes on manifolds, where the idea is to construct the neural network based on the manifold's representation after embedding, while still maintaining its geometric properties. By adopting this strategy, it becomes possible to utilize stochastic gradient descent and backpropagation techniques from Euclidean space. This results in enhanced accuracy compared to conventional machine learning algorithms like SVM, random forest, etc.
Intrinsic deep neural network on manifolds
The objective is to embed the inherent geometric nature of Riemannian manifolds using exponential and logarithmic maps. This framework, which projects localized points from a Riemannian manifold onto a single tangent space, proves beneficial when embeddings cannot be determined. Each localized tangent space (or chart) is mapped (via exp/log functions) onto a neural network. This architectural approach achieves higher accuracy compared to deep models in Euclidean space and the Extrinsic architecture.
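To make the intrinsic construction concrete, here is a minimal sketch of the exponential and logarithmic maps for the simplest case of the unit sphere (the general Riemannian case replaces these closed forms with the manifold's own maps):

```python
import numpy as np

def log_map(p: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Tangent vector at p pointing toward q along the geodesic (unit sphere)."""
    theta = np.arccos(np.clip(p @ q, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros_like(p)
    return theta * (q - np.cos(theta) * p) / np.sin(theta)

def exp_map(p: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Follow the geodesic from p in direction v for length ||v|| (unit sphere)."""
    norm_v = np.linalg.norm(v)
    if np.isclose(norm_v, 0.0):
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v
```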
13. Kalman Filters on Differentiable Manifolds
The structure of the paper is as follows:
- A concise overview of differentiable manifolds, tangent spaces, and the exponential/logarithmic mappings.
- A discussion on modeling the error state and the covariance matrix.
- A detailed description of how to apply the predict and update steps of the Kalman filter from tangent spaces back to the manifold (a sketch of the update step follows this list).
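Here is a hedged sketch of such an update step, with the error state living in the tangent space and the correction retracted onto the manifold. The arguments exp_map and residual are placeholders for the manifold's own operations, not the paper's API:

```python
import numpy as np

def manifold_kalman_update(x, P, z, H, R, exp_map, residual):
    """x: state on the manifold; P: covariance of the tangent-space error state."""
    y = residual(z, x)                         # innovation, computed in the tangent space
    S = H @ P @ H.T + R                        # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)             # Kalman gain
    delta = K @ y                              # tangent-space correction
    x_new = exp_map(x, delta)                  # retract the correction onto the manifold
    P_new = (np.eye(P.shape[0]) - K @ H) @ P   # standard covariance update
    return x_new, P_new
```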
Note: The reader should have a foundational understanding of differential geometry and Lie algebra to fully grasp the content of this paper.
I would recommend "The SO(3) and SE(3) Lie Algebras of Rigid Body Rotations and Motions and their Application to Discrete Integration, Gradient Descent Optimization, and State Estimation" to get familiar with Lie algebras, manifolds, and extended Kalman filters.
14. Manifold Matching via Deep Metric Learning for Generative Modeling
This study proposes a novel method for identifying manifolds within Euclidean spaces for generative models such as variational autoencoders and GANs through two neural networks:
- Data generator sampling data on the manifold
- Metric generator learning geodesic distances.
The metric generator learns a pullback of the Euclidean metric, while the data generator produces a pushforward of the prior distribution. The algorithm is described with easy-to-follow pseudocode.
The method is tested on unconditional ResNet image creation and GAN-based image super-resolution, showing improved Frechet Inception Distance and perception scores.
This paper will be of special interest to engineers already familiar with GANs and the Frechet metric.
15. Machine learning a manifold
The paper employs an 8-layer feed-forward network, implemented with the Keras library, for interpolating data, predicting infinitesimally transformed fields, and identifying symmetries. The neural model is capable of recognizing symmetries in the Lie groups SU(3), which involves orientation preservation, and SO(8), which relates to Riemannian metric invariance.
Understanding this paper benefits from some background in differential geometry, Lie algebra, and manifolds.
16. A Geometric Perspective on Variational Autoencoders
The paper's modeling approach enables sampling of latent values for datasets such as MNIST, CIFAR, and CelebA. These samples are then assessed against a range of autoencoders, from Wasserstein to Hamiltonian, using evaluation metrics such as Frechet Inception Distance (FID) and Precision/Recall (PRD), specifically F1 scores.
17. Deep Hyperspherical Learning
This study addresses certain limitations of convolutional neural networks (CNNs) during training, such as overfitting and vanishing gradients. The suggested approach involves:
- Incorporating a geometric constraint by substituting the Euclidean inner product with the geodesic distance on a hypersphere.
- Replacing the softmax loss with a normalized softmax loss that is weighted by a metric tensor (dependent on the geodesic distance input and convolution weights) on the hypersphere.
The model's performance was assessed using the CIFAR-10/100 and ImageNet-2012 datasets, without ReLU activation, and compared to the ResNet-32 model. The proposed model achieves an accuracy comparable to the most efficient ResNet model but converges significantly faster.
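A minimal PyTorch sketch of the first idea, assuming a linear variant in which the response depends on the angle (geodesic distance on the unit sphere) between weights and inputs; the function name is mine:

```python
import torch
import torch.nn.functional as F

def sphere_linear(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Replace the inner product w.x with a linear function of the angle between w and x."""
    cos = F.normalize(x, dim=-1) @ F.normalize(weight, dim=-1).T
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))  # geodesic distance on the hypersphere
    return 1.0 - 2.0 * theta / torch.pi                  # response in [-1, 1]
```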
18. Learning Weighted Submanifolds with Variational Autoencoders and Riemannian Variational Autoencoders
A great paper from Nina Miolane, one of the contributors to Geomstats, a very useful Python library for geometric learning.
The author explores alternatives to Euclidean data representation, motivated by brain connectomes in MRI, and devises a Riemannian variational autoencoder (with the latent space as a manifold or Lie group).
The paper evaluates geodesic vs. non-geodesic subspaces, reviews the standard Euclidean variational autoencoder, and introduces the Riemannian VAE, including maximization of a modified Evidence Lower Bound. The generative function applied to the latent space is constructed from the Euclidean version through the Riemannian exponential map at a base point on the manifold. The model is evaluated using the 2-Wasserstein distance on the S3 and H2 manifolds.
19. Learning Manifold Dimensions with Conditional Variational Autoencoders
The authors seek to determine the dimension of the underlying manifold (a geometric characteristic) from the global minimum of a variational autoencoder (VAE) setup that includes an encoder, a decoder, and a Gaussian distribution prior.
The study highlights several trade-offs in the design of autoencoders, with these key findings:
- A conditioned prior is unnecessary: a normal distribution prior suffices.
- The initial variance of the Gaussian decoder impacts the loss function.
- Commonly used strategies like weight sharing to accelerate training may hinder convergence.
The evaluation of both Variational Autoencoders (VAE) and Conditional Variational Autoencoders (CVAE) was conducted using the standard ELBO metrics across three datasets:
- A synthetic dataset with 5-dimensional categorical data, consisting of 100,000 samples.
- The MNIST dataset implemented with ResNet blocks.
- The Fashion MNIST dataset.
The results indicate that the correct number of active dimensions was identified (5 for the synthetic dataset and 20 for MNIST), provided the latent dimension was sufficiently large.
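A hedged sketch of how active latent dimensions can be counted in practice, by thresholding the per-dimension KL divergence of the encoder's posterior to the standard normal prior (the tolerance is an illustrative choice, not from the paper):

```python
import torch

def active_dimensions(mu: torch.Tensor, logvar: torch.Tensor, tol: float = 1e-2) -> int:
    """mu, logvar: encoder outputs of shape (batch, latent_dim)."""
    # KL(q(z|x) || N(0, I)) per latent coordinate, averaged over the batch
    kl_per_dim = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean(dim=0)
    return int((kl_per_dim > tol).sum().item())
```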
20. Variational Transformer Autoencoder with Manifolds Learning
The paper illustrates how the Riemannian metric effectively addresses the challenges of modeling non-linear latent spaces by interpolating between input data samples along geodesics. The researchers develop a variational autoencoder that incorporates a spatial transformer as the encoder to map the latent space onto a Riemannian manifold.
The goal is to refine the variational autoencoder to compute geodesic distances between input data points, using the Riemannian metric tensor, and to delineate a semantic feature/latent space. The transformer component of the encoder is responsible for computing the prior in the latent space. Importantly, this new model requires no modification to the existing loss function (ELBO) or the training methodology.
Implemented in PyTorch, the model consists of four convolutional layers, two linear reshaping layers, and fifty latent variables. It has been evaluated using a variety of image sets, including grayscale images (such as MNIST and FashionMNIST) and color, natural images (such as CelebA and CIFAR-10). The proposed model demonstrates substantial improvements in image reconstruction.
Note: The paper assumes knowledge of differential geometry and generative models. The source code is available on GitHub.
21. Riemannian Score-Based Generative Modelling
Diffusion models, or score-based generative models, progressively add noise to data and learn to reverse the noise-addition process over time. Although effective, these models primarily operate in Euclidean space and often fail to capture the intrinsic geometry of complex data such as proteins, geological formations, or high-energy physics components. The authors therefore:
- Modify the noise addition process to suit manifolds using the Riemannian gradient.
- Incorporate Brownian motion into the Euclidean time-reversal formula.
- Implement geodesic random walks to approximate sampling from stochastic differential equations (sketched after this list).
- Approximate and evaluate the drift in the time-reversal process.
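As a sketch of the geodesic random walk mentioned above (the callables drift, exp_map, and tangent_noise are placeholders for the learned reverse-time drift, the manifold's exponential map, and a Gaussian sampler in the tangent space):

```python
import numpy as np

def geodesic_random_walk(x0, drift, exp_map, tangent_noise, n_steps=100, dt=1e-2):
    """Euler-Maruyama-style discretization of a manifold-valued SDE."""
    x = x0
    for _ in range(n_steps):
        noise = tangent_noise(x)                     # Gaussian sample in the tangent space at x
        step = drift(x) * dt + np.sqrt(dt) * noise   # SDE step, taken in the tangent space
        x = exp_map(x, step)                         # move along the geodesic
    return x
```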
Code is available on Github.
22. Riemannian Diffusion Models
Euclidean diffusion models often fail to capture the generative factors related to the geometry of the space in applications such as protein modeling or geoscience. This paper introduces a variational architecture for Riemannian manifolds, applying the Feynman-Kac conditional expectation to the Ito diffusion process.
The algorithm is evaluated on various manifolds, including Hyperbolic spaces, SO(n), Tori, and Hyperspheres. Notably, the authors include an appendix reviewing key elements of differential geometry for non-experts.
23. Convolutional Neural Networks on Manifolds: From Graphs and Back
This paper extends deep learning models to non-Euclidean spaces by introducing manifold convolution filters. The convolutional neural network is constructed by stacking convolution layers based on Laplace-Beltrami (LB) operators for the heat diffusion process. The construction proceeds through:
- The extraction of eigenvalues and orthonormal eigenfunctions from the LB operator
- The definition of the manifold filter and its frequency response using eigenfunctions
- The assembly of a bank of filters into a manifold neural network
- The sampling of space and time domains to approximate a continuous architecture with point clouds
- The discretization of the filter by reformulating the LB operator as a graph Laplacian (see the sketch after this list)
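A minimal numpy/scipy sketch of that last step, assuming a Gaussian affinity graph over the sampled points (the bandwidth and diffusion time are illustrative choices of mine):

```python
import numpy as np
from scipy.linalg import expm
from scipy.spatial.distance import pdist, squareform

def heat_filter(points: np.ndarray, signal: np.ndarray, t: float = 0.1, sigma: float = 0.2) -> np.ndarray:
    """Approximate the LB operator with a graph Laplacian, then diffuse a signal by e^{-tL}."""
    dist = squareform(pdist(points))
    W = np.exp(-dist ** 2 / (2 * sigma ** 2))   # Gaussian affinity between samples
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W              # graph Laplacian approximating the LB operator
    return expm(-t * L) @ signal                # heat diffusion applied to the signal
```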
24. Riemannian Residual Neural Networks
This paper introduces the extension of residual neural networks to Riemannian manifolds.
Residual Neural Networks
Residual networks were initially designed to mitigate the issues of vanishing and exploding gradients in deep neural networks. These networks achieve this by learning residual functions relative to the layer inputs using identity mappings.
Riemannian Residual Neural Networks
Projection and Implementation
The authors detail the method of projecting the residuals for each hidden layer from the local Euclidean tangent space onto manifold geodesics via the exponential map. The implementation focuses on hyperplanes and uses the Whitney embedding theorem to project vector fields onto the tangent space. The feature map, induced by these vector fields, is learned through a pullback mechanism, assuming a closed-form definition for the exponential map.
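In pseudocode-like Python, the resulting layer update is a one-liner, with exp_map standing in for the manifold's exponential map (the Euclidean special case recovers the usual residual block); this is my sketch, not the authors' code:

```python
def riemannian_residual_step(x, layer, exp_map):
    """One Riemannian residual block: the residual is a tangent vector at x."""
    v = layer(x)          # learned residual, valued in the tangent space at x
    return exp_map(x, v)  # follow the geodesic; exp_map(x, v) = x + v recovers x + f(x)
```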
Performance
The Riemannian residual networks show superior performance compared to various graph neural network variants on manifolds of symmetric positive definite matrices.
25. A singular Riemannian geometry approach to Deep Neural Networks
This paper models a deep neural network as a sequence of maps between manifolds, with a Riemannian metric applied to the final layer. This metric is propagated back through the layers or maps.
The authors' approach includes:
- Reviewing essential elements of Riemannian geometry.
- Defining the geometry for smooth layers (utilizing weights, biases, and activation functions) and for the entire smooth network (considered as a sequence of maps between manifolds).
- Applying this framework to analyze the equivalence of neural networks in classification problems, focusing on representation within the input manifold.
- Evaluating level curves and equivalence classes for non-linear regression, demonstrated with the Ackley function.
- Studying the space of weights and biases for a given input.
26. Transformer with Hyperbolic Geometry
The paper presents a novel transformer architecture that leverages both hyperbolic and Euclidean spaces. It incorporates a hyperbolic-to-linear transformation within the self-attention module.
To address performance degradation in high-dimensional settings, the authors generate hyperbolic keys and queries. They use the Poincaré ball with negative Riemannian curvature as the hyperbolic representation, projecting from the tangent space using the exponential map. Backpropagation is carried out through a Riemannian adaptation of stochastic gradient descent.
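A minimal PyTorch sketch of that projection, using the standard exponential map at the origin of the Poincare ball of curvature -c (a textbook formula, not necessarily the paper's exact parameterization):

```python
import torch

def expmap0(v: torch.Tensor, c: float = 1.0) -> torch.Tensor:
    """Map a tangent vector at the origin onto the Poincare ball of curvature -c."""
    sqrt_c = c ** 0.5
    norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-8)
    return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)
```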
The proposed model is evaluated using the F1 metric on two datasets: biomedical named-entity recognition (sequence labeling) and an English machine reading task. The results demonstrate that the model effectively reduces overfitting in very high-dimensional spaces.
27. Deep Extrinsic Manifold Representation for Vision Tasks
The proposed approach, termed Deep Extrinsic Manifold Representation (DEMR), involves externally embedding manifolds at the final regression layer of neural networks, such as ResNet and PointNet. This embedding is defined using a matrix Lie group, with Singular Value Decomposition (SVD) applied to extract orthogonal vectors.
The approach demonstrates significant improvements for SE(3) and related quotient Lie groups, particularly in scenarios involving:
- Relative affine motion within SE(3)
- Variations in illumination and pose in face recognition tasks using Grassmann manifolds.
The reader is expected to have basic knowledge of Lie groups and embedding manifolds.
28. Manifold Matching via Deep Metric Learning for Generative Modeling
This study advances the recent progress in blending geometry with statistics to enhance generative models. It proposes a novel method for identifying manifolds within Euclidean spaces for generative models like variational autoencoders and GANs through two neural networks:
- Data generator sampling data on the manifold
- Metric generator learning geodesic distances.
The metric generator learns a pullback of the Euclidean metric, while the data generator produces a pushforward of the prior distribution. The algorithm is described with easy-to-follow pseudocode.
The method is tested on unconditional ResNet image creation and GAN-based image super-resolution, showing improved Frechet Inception Distance and perception scores.
This paper will be of special interest to engineers already familiar with GANs and the Frechet metric.
29. Pullback Flow Matching on Data Manifolds
Generative modeling on data manifolds has proven to be more challenging than models that use the Euclidean metric. This paper addresses these challenges by proposing a method for accurate interpolation in latent space using neural ordinary differential equations (ODEs). The authors introduce a pullback geometry framework with Riemannian Flow Matching (RFM) to transform the input data manifold onto a latent manifold.
A solid understanding of differential geometry is recommended for readers.
30. A Survey of Geometric Optimization for Deep Learning: From Euclidean Space to Riemannian Manifold
Optimization on Riemannian manifolds leverages intrinsic geometric properties to uncover valuable information, addressing challenges like over-parameterization and feature redundancy. This approach reduces reliance on computationally intensive models while achieving faster convergence and mitigating issues like gradient explosion or vanishing.
Applications to machine learning problems, including dimensionality reduction and state-transition modeling, are thoroughly reviewed.
A reader with a basic understanding of differential geometry will find the material accessible and engaging. The inclusion of overviews for key Python libraries is a particularly appreciated touch.
He has been Director of Data Engineering at Aideo Technologies since 2017 and is the author of "Scala for Machine Learning" (Packt Publishing, ISBN 978-1-78712-238-3) and of the Geometric Learning in Python newsletter on LinkedIn.