riemannianGeometry.jl

This is the fundamental unit of PosDefManifold. It contains functions for manipulating points in the Riemannian manifold of Symmetric Positive Definite (SPD) or Hermitian Positive Definite (HPD) matrices. In Julia those are Hermitian matrices, see typecasting matrices.

The functions are divided into six categories:

Category | Output
1. Geodesic equations | interpolation, extrapolation,...
2. Distances | length of geodesics
3. Graphs and Laplacians | for spectral embedding, eigenmaps, system dynamics,...
4. Means | mid-points of geodesics, centers of mass of several points
5. Tangent Space operations | maps from the manifold to the tangent space and vice versa
6. Procrustes problems | for data matching, transfer learning,...

Geodesic equations

Function | Description
geodesic | Geodesic equations (weighted mean of two positive definite matrices) for any metric

(1) geodesic(metric::Metric, P::ℍ{T}, Q::ℍ{T}, a::Real) where T<:RealOrComplex
(2) geodesic(metric::Metric, D::𝔻{S}, E::𝔻{S}, a::Real) where S<:Real

(1) Move along the geodesic from point $P$ to point $Q$ (two positive definite matrices) with arc length $0≤a≤1$, using the specified metric, of type Metric::Enumerated type.

For all metrics,

  • with $a=0$ we stay at $P$,
  • with $a=1$ we move up to $Q$,
  • with $a=1/2$ we move to the mid-point of $P$ and $Q$ (mean).

Using the Fisher metric, argument a can be any real number, for instance:

  • with $0<a<1$ we move toward $Q$ (attraction),
  • with $a>1$ we move over and beyond $Q$ (extrapolation),
  • with $a<0$ we move back away from $Q$ (repulsion).

$P$ and $Q$ must be flagged by julia as Hermitian. See typecasting matrices.

Note that if $Q=I$, the Fisher geodesic move is simply $P^a$ (no need to call this function then).

Nota Bene

For the logdet zero and Jeffrey metrics no closed-form expression for the geodesic is available to the best of the authors' knowledge, so in these cases the geodesic is found as the weighted mean using mean. For the Von Neumann metric not even an expression for the mean is available, so in that case the geodesic is not provided and a warning is printed.

(2) Like in (1), but for two real positive definite diagonal matrices $D$ and $E$.

Maths

For points $P$, $Q$ and arclength $a$, letting $b=1-a$, the geodesic equations for the supported metrics are:

Metric | geodesic equation
Euclidean | $bP + aQ$
invEuclidean | $\big(bP^{-1} + aQ^{-1}\big)^{-1}$
ChoEuclidean | $TT^*$, where $T=bL_P + aL_Q$
logEuclidean | $\text{exp}\big(b\hspace{2pt}\text{log}(P) + a\hspace{2pt}\text{log}(Q)\big)$
logCholesky | $TT^*$, where $T=S_P+a(S_Q-S_P)+D_P\hspace{2pt}\text{exp}\big(a(\text{log}D_Q-\text{log}D_P)\big)$
Fisher | $P^{1/2} \big(P^{-1/2} Q P^{-1/2}\big)^a P^{1/2}$
logdet0 | uses weighted mean algorithm logdet0Mean
Jeffrey | uses weighted mean mean
VonNeumann | N.A.
Wasserstein | $b^2P+a^2Q +ab\big[(PQ)^{1/2} +(QP)^{1/2}\big]$

legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).
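As an illustrative sketch (not part of the original documentation, assuming only randP, geodesic and the ℍ alias for Hermitian exported by the package), the Fisher geodesic returned by geodesic can be compared with its closed-form expression in the table above:

using LinearAlgebra, PosDefManifold
P=randP(4)
Q=randP(4)
a=0.3
sqrtP=sqrt(P)      # principal square root of P
isqrtP=inv(sqrtP)  # P^{-1/2}
G1=geodesic(Fisher, P, Q, a)
G2=ℍ(sqrtP*(isqrtP*Q*isqrtP)^a*sqrtP) # P^{1/2}(P^{-1/2}QP^{-1/2})^a P^{1/2}
norm(G1-G2)/norm(G1) # ≈ 0 up to numerical error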

See also: mean.

Examples

using PosDefManifold
P=randP(10)
Q=randP(10)
# Wasserstein mean
M=geodesic(Wasserstein, P, Q, 0.5)
# extrapolate using the Fisher metric
E=geodesic(Fisher, P, Q, 2)
source

Distances

Function | Description
distanceSqr, distance² | Squared distance between positive definite matrices
distance | Distance between positive definite matrices

(1) distanceSqr(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distanceSqr(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distanceSqr(metric::Metric, D::𝔻{S}) where S<:Real
(4) distanceSqr(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real

alias: distance²

(1) Return $δ^2(P, I)$, the square of the distance (or divergence) of positive definite matrix $P$ from the identity matrix. See distance from the origin.

(2) Return $δ^2(P, Q)$, the square of the distance (or divergence) between two positive definite matrices $P$ and $Q$. See distance.

In both cases the distance function $δ$ is induced by the argument metric of type Metric::Enumerated type.

$P$ in (1) and $P$, $Q$ in (2) must be flagged by julia as Hermitian. See typecasting matrices.

(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite Diagonal matrices. See ℍVector type and 𝔻Vector type.

Maths

For point $P$ the squared distances from the identity for the supported metrics are:

Metric | Squared distance from the identity
Euclidean | $∥P-I∥^2$
invEuclidean | $∥P^{-1}-I∥^2$
ChoEuclidean | $∥L_P-I∥^2$
logEuclidean | $∥\textrm{log}P∥^2$
logCholesky | $∥S_P∥^2+∥\textrm{log}D_P∥^2$
Fisher | $∥\textrm{log}P∥^2$
logdet0 | $\textrm{logdet}\frac{1}{2}(P+I) - \frac{1}{2}\textrm{logdet}(P)$
Jeffrey | $\frac{1}{2}\textrm{tr}(P+P^{-1})-n$
VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-\textrm{log}P)$
Wasserstein | $\textrm{tr}(P+I) -2\textrm{tr}(P^{1/2})$

For points $P$ and $Q$ their squared distances for the supported metrics are:

Metric | Squared distance
Euclidean | $∥P-Q∥^2$
invEuclidean | $∥P^{-1}-Q^{-1}∥^2$
ChoEuclidean | $∥ L_P - L_Q ∥^2$
logEuclidean | $∥\textrm{log}P-\textrm{log}Q∥^2$
logCholesky | $∥S_P-S_Q∥^2+∥\textrm{log}D_P-\textrm{log}D_Q∥^2$
Fisher | $∥\textrm{log}(P^{-1/2}QP^{-1/2})∥^2$
logdet0 | $\textrm{logdet}\frac{1}{2}(P+Q) - \frac{1}{2}\textrm{logdet}(PQ)$
Jeffrey | $\frac{1}{2}\textrm{tr}(Q^{-1}P+P^{-1}Q)-n$
VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-P\textrm{log}Q+Q\textrm{log}Q-Q\textrm{log}P)$
Wasserstein | $\textrm{tr}(P+Q) -2\textrm{tr}(P^{1/2}QP^{1/2})^{1/2}$

legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).

See also: distanceSqrMat.

Examples (1)

using PosDefManifold
P=randP(10)
d=distanceSqr(Wasserstein, P)
e=distanceSqr(Fisher, P)
metric=Metric(Int(logdet0)) # or metric=logdet0
s=string(metric) # check what is the current metric
f=distance²(metric, P) #using the alias distance²

Examples (2)

using PosDefManifold
P=randP(10)
Q=randP(10)
d=distanceSqr(logEuclidean, P, Q)
e=distance²(Jeffrey, P, Q)
source
(1) distance(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distance(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distance(metric::Metric, D::𝔻{S}) where S<:Real
(4) distance(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real

(1) Return $δ(P, I)$, the distance between positive definite matrix $P$ and the identity matrix.

(2) Return $δ(P, Q)$, the distance between positive definite matrices $P$ and $Q$.

(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite Diagonal matrices.

This is the square root of distanceSqr and is invoked with the same syntax therein.

See also: distanceMat.
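For instance (an illustrative sketch, assuming only functions documented in this unit):

using PosDefManifold
P=randP(10)
Q=randP(10)
# the distance is the square root of the squared distance
d=distance(Fisher, P, Q)
d ≈ sqrt(distanceSqr(Fisher, P, Q)) # true
# distance of P from the identity matrix
dI=distance(logEuclidean, P)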

source

Graphs and Laplacians

Function | Description
distanceSqrMat, distance²Mat | Lower triangular matrix of all squared inter-distances
distanceMat | Lower triangular matrix of all inter-distances
laplacian | Laplacian of a squared inter-distances matrix
laplacianEigenMaps, laplacianEM | Eigen maps (eigenvectors) of a Laplacian
spectralEmbedding | Spectral Embedding (all the above functions run in series)

(1) distanceSqrMat(metric::Metric, 𝐏::ℍVector;
                                    <⏩=false>)
(2) distanceSqrMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
                                    <⏩=false>) where T<:AbstractFloat

alias: distance²Mat

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real LowerTriangular matrix comprising elements $δ^2(P_i, P_j)\textrm{, for all }i≥j$.

This is the lower triangular matrix holding all squared inter-distances (zero on diagonal), using the specified metric, of type Metric::Enumerated type, giving rise to distance function $δ$. See distanceSqr.

Only the lower triangular part is computed in order to optimize memory use.

By default, the result matrix is of type Float32. The type can be changed to another real type using method (2).

<optional keyword arguments>:

  • if ⏩=true the computation of inter-distances is multi-threaded.
Multi-Threading

Multi-threading is still experimental in Julia. You should check the result on each computer. Multi-threading is automatically disabled if k<4 or if Julia uses only one thread. See Threads.

See: distance.

See also: laplacian, laplacianEigenMaps, spectralEmbedding.

Examples

using PosDefManifold
# Generate a set of 8 random 10x10 SPD matrices
Pset=randP(10, 8) # or, using unicode: 𝐏=randP(10, 8)
# Compute the squared inter-distance matrix according to the log Euclidean metric.
# This is much faster than using the Fisher metric and in general
# it is a good approximation.
Dsqr=distanceSqrMat(logEuclidean, Pset)
# or, using unicode: Δ²=distanceSqrMat(logEuclidean, 𝐏)

# return a matrix of type Float64
Dsqr64=distanceSqrMat(Float64, logEuclidean, Pset)

# Multi-threaded
Dsqr=distanceSqrMat(Fisher, Pset; ⏩=true)
source
(1) distanceMat(metric::Metric, 𝐏::ℍVector;
                                <⏩=true>)
(2) distanceMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
                                <⏩=true>) where T<:AbstractFloat

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real LowerTriangular matrix comprising elements $δ(P_i, P_j)\textrm{, for all }i≥j$.

This is the lower triangular matrix holding all inter-distances (zero on diagonal), using the specified metric, of type Metric::Enumerated type, giving rise to distance $δ$. See distance.

Only the lower triangular part is computed in order to optimize memory use.

By default, the result matrix is of type Float32. The type can be changed to another real type using method (2).

The elements of this matrix are the square root of distanceSqrMat.

<optional keyword arguments>:

  • if ⏩=true the computation of inter-distances is multi-threaded.
Multi-Threading

Multi-threading is still experimental in Julia. You should check the result on each computer. Multi-threading is automatically disabled if k<4 or if Julia uses only one thread. See Threads.

See: distance.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
D=distanceMat(Fisher, Pset)
# or, using unicode: Δ=distanceMat(Fisher, 𝐏)

# return a matrix of type Float64
D64=distanceMat(Float64, Fisher, Pset)

# Multi-threaded
D64=distanceMat(Fisher, Pset; ⏩=true)
source
laplacian(Δ²::𝕃{S}) where S<:Real

Given a LowerTriangular matrix of squared inter-distances $Δ^2$, return the lower triangular part of the normalized Laplacian. The elements of the Laplacian are of the same type as the elements of $Δ^2$. The result is a LowerTriangular matrix.

First, a Gaussian radial basis function is applied to all elements of $Δ^2$, such that

$W_{ij} = exp(\frac{\displaystyle{-Δ^2_{ij}}}{\displaystyle{ε}})$,

where $ε$ is the Gaussian scale parameter chosen automatically as the median of the elements $Δ^2_{ij}$.

Finally, the normalized Laplacian is defined as

$Ω = D^{-1/2}WD^{-1/2}$,

where $D$ is the diagonal matrix holding on the main diagonal the sum of the rows (or columns) of $W$.
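The following is a minimal re-implementation sketch of the two steps just described (Gaussian kernel with the median of the squared distances as scale, then symmetric normalization). It is meant only to illustrate the computation; the package's own implementation may differ in details such as the exact handling of the diagonal:

using LinearAlgebra, Statistics, PosDefManifold
Pset=randP(10, 6)
Δ²=distanceSqrMat(Float64, logEuclidean, Pset)
k=size(Δ², 1)
# Gaussian scale: median of the strictly lower-triangular squared distances
ε=median([Δ²[i, j] for j=1:k for i=j+1:k])
Δ²full=Matrix(Δ²)+Matrix(Δ²)'   # full symmetric matrix of squared inter-distances
W=exp.(-Δ²full./ε)              # Gaussian radial basis function kernel
D=Diagonal(vec(sum(W, dims=2))) # degree matrix
Ω=inv(sqrt(D))*W*inv(sqrt(D))   # normalized Laplacian (full form)
# the laplacian function returns only the lower-triangular part:
Ωlow=laplacian(Δ²)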

Nota Bene

The normalized Laplacian as here defined can be requested for any input matrix of squared inter-distances, for example, those obtained on scalars or on vectors using appropriate metrics. In any case, only the lower triangular part of the Laplacian is taken as input. See typecasting matrices.

See also: distanceSqrMat, laplacianEigenMaps, spectralEmbedding.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Dsqr=distanceSqrMat(Fisher, Pset) # or: Δ²=distanceSqrMat(Fisher, 𝐏)
lap=laplacian(Dsqr) # or: Ω=laplacian(Δ²)
source
laplacianEigenMaps(Ω::𝕃{S}, q::Int;
                  <tol::Real=0, maxiter=300, ⍰=false>) where S<:Real

alias: laplacianEM

Given the lower triangular part of a normalized Laplacian $Ω$ (see laplacian), return the eigen maps in $q$ dimensions, i.e., the $q$ eigenvectors of the normalized Laplacian associated with its largest $q$ eigenvalues, excluding the first one (whose eigenvalue is always equal to 1.0). The eigenvectors are of the same type as $Ω$.

The eigenvectors of the normalized Laplacian are computed by the power iterations+modified Gram-Schmidt method, allowing the execution of this function for big Laplacian matrices.

Return the 4-tuple $(Λ, U, iterations, convergence)$, where:

  • $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
  • $U$ holds in columns the eigen maps, that is, the $q$ eigenvectors,
  • $iterations$ is the number of iterations executed by the power method,
  • $convergence$ is the convergence attained by the power method.

The columns of $U$ hold the coordinates of the points in a low-dimensional Euclidean space (typically two or three dimensions). This is done for, among other purposes, classifying the points and following their trajectories over time or other dimensions. For examples of applications see Rodrigues et al. (2018) 🎓 and references therein.

Arguments: (Ω::𝕃, q; <tol::Real=0, maxiter=300, ⍰=false>):

  • $Ω$ is a real LowerTriangular normalized Laplacian obtained by the laplacian function,
  • $q$ is the dimension of the Laplacian eigen maps;
  • The following are <optional keyword arguments> for the power iterations:
    • tol is the tolerance for convergence (see below),
    • maxiter is the maximum number of iterations allowed,
    • if ⍰ is true, the convergence at all iterations will be printed.
Nota Bene

The maximum value of $q$ that can be requested is $n-1$, where $n$ is the size of the Laplacian. In general, $q=2$ or $q=3$ is requested.

$tol$ defaults to the square root of Base.eps of the (real) type of $Ω$. This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.

See also: distanceSqrMat, laplacian, spectralEmbedding.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Dsqr=distanceSqrMat(Fisher, Pset) #or: Δ²=distanceSqrMat(Fisher, 𝐏)
lap= laplacian(Dsqr) # or: Ω=laplacian(Δ²)
evalues, maps, iterations, convergence=laplacianEM(lap, 2)
evalues, maps, iterations, convergence=laplacianEM(lap, 2; maxiter=100)
evalues, maps, iterations, convergence=laplacianEM(lap, 2; ⍰=true)
source
(1) spectralEmbedding(metric::Metric, 𝐏::ℍVector, q::Int;
                     <tol::Real=0, maxiter=300, ⍰=false>)

(2) spectralEmbedding(metric::Metric, 𝐏::ℍVector, q::Int, type::Type{T};
                     <tol::Real=0, maxiter=300, ⍰=false>) where T<:Real

Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ (real or complex), compute its eigen maps in $q$ dimensions.

This function runs, one after the other, the functions distanceSqrMat (compute the squared inter-distance matrix), laplacian (compute the normalized Laplacian) and laplacianEigenMaps (compute the eigen maps).

By default all computations above are done with Float32 precision. Another real type can be requested using method (2), where the type argument is defined.
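For instance, with default settings a call to spectralEmbedding should be roughly equivalent to chaining the three functions by hand, as in this illustrative sketch (the eigenvectors may differ by sign and by small numerical differences):

using PosDefManifold
Pset=randP(10, 6)
Λ, U, iterations, convergence = spectralEmbedding(logEuclidean, Pset, 2)
# roughly equivalent manual chain (default Float32 precision)
Δ²=distanceSqrMat(logEuclidean, Pset)
Ω=laplacian(Δ²)
Λ2, U2, iterations2, convergence2 = laplacianEigenMaps(Ω, 2)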

Return the 4-tuple (Λ, U, iterations, convergence), where:

  • $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
  • $U$ holds in columns the $q$ eigenvectors, i.e., the $q$ coordinates of the points in the embedded space,
  • $iterations$ is the number of iterations executed by the power method,
  • $convergence$ is the convergence attained by the power method.

Arguments (metric, 𝐏, q, <tol::Real=0, maxiter=300, ⍰=false>):

  • metric is the metric of type Metric::Enumerated type used for computing the inter-distances,
  • $𝐏$ is a 1d array of $k$ positive definite matrices of ℍVector type,
  • $q$ is the dimension of the Laplacian eigen maps;
  • The following are <optional keyword arguments> for the power method iterative algorithm:
    • tol is the tolerance for convergence of the power method (see below),
    • maxiter is the maximum number of iterations allowed for the power method,
    • if ⍰ is true the convergence at all iterations will be printed.
Nota Bene

$tol$ defaults to the square root of Base.eps of the Float32 type (1) or of the type passed as argument (2). This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.

See also: distanceSqrMat, laplacian, laplacianEigenMaps.

Examples

using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
evalues, maps, iterations, convergence=spectralEmbedding(logEuclidean, Pset, 2)
# show convergence information
evalues, maps, iterations, convergence=spectralEmbedding(logEuclidean, Pset, 2; ⍰=true)
# use Float64 precision.
evalues, maps, iterations, convergence=spectralEmbedding(logEuclidean, Pset, 2, Float64)
source

Means

Function | Description
mean | Weighted Fréchet mean (wFm) of a matrix set using any metric
means | As above for several sets at once
generalizedMean | Generalized wFm of a matrix set
geometricMean, gMean | wFm of a matrix set according to the Fisher metric (iterative)
logdet0Mean, ld0Mean | wFm of a matrix set according to the logdet0 metric (iterative)
wasMean | wFm of a matrix set according to the Wasserstein metric (iterative)
powerMean | Power wFm of a matrix set (iterative)

Statistics.mean (Function)
(1) mean(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(2) mean(metric::Metric, D::𝔻{T}, E::𝔻{T}) where T<:Real

(3) mean(metric::Metric, 𝐏::ℍVector;
        <w::Vector=[], ✓w=true>)
(4) mean(metric::Metric, 𝐃::𝔻Vector;
        <w::Vector=[], ✓w=true>)

(1) Mean of two positive definite matrices, passed in arbitrary order as arguments $P$ and $Q$, using the specified metric of type Metric::Enumerated type. The order is arbitrary as all metrics implemented in PosDefManifold are symmetric. This is the midpoint of the geodesic. For the weighted mean of two positive definite matrices use instead the geodesic function. $P$ and $Q$ must be flagged as Hermitian. See typecasting matrices.

(2) Like in (1), but for two real diagonal positive definite matrices $D$ and $E$.

(3) Fréchet mean of a 1d array $𝐏$ of $k$ positive definite matrices $𝐏={P_1,...,P_k}$ of ℍVector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified metric as in (1).

(4) Fréchet mean of a 1d array $𝐃$ of $k$ positive definite matrices $𝐃={D_1,...,D_k}$ of 𝔻Vector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified metric as in (1).

If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted mean.

If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

Math

For a set of $k$ matrices ${P_1, P_2,..., P_k}$ weighted by ${w_1, w_2,..., w_k}:\sum_{i=1}^{k}w_i=1$, the weighted Fréchet means for the supported metrics with a closed-form expression are:

Metric | weighted Fréchet mean
Euclidean | $\sum_{i=1}^{k}w_i P_i$
invEuclidean | $\big(\sum_{i=1}^{k}w_i P_i^{-1}\big)^{-1}$
ChoEuclidean | $TT^*$, where $T=\sum_{i=1}^{k}w_i L_{P_i}$
logEuclidean | $\textrm{exp}\big(\sum_{i=1}^{k}w_i\hspace{1pt} \textrm{log}P_i \big)$
logCholesky | $TT^*$, where $T=\sum_{i=1}^{k}w_iS_{P_i}+\sum_{i=1}^{k}w_i\textrm{log}D_{P_i}$
Jeffrey | $A^{1/2}\big(A^{-1/2}HA^{-1/2}\big)^{1/2}A^{1/2}$

and for those that verify an equation:

Metric | equation verified by the weighted Fréchet mean
Fisher | $\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0$
logdet0 | $\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}=G^{-1}$
VonNeumann | N.A.
Wasserstein | $G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$

legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$). $A$ and $H$ are the weighted arithmetic and weighted harmonic mean, respectively.
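As a small illustration of the closed-form cases (a sketch, not from the original documentation), the weighted Euclidean mean can be checked directly against its formula:

using LinearAlgebra, Statistics, PosDefManifold
Pset=randP(3, 4)
w=[1, 2, 3, 1]
M=mean(Euclidean, Pset; w=w) # weights are normalized internally (✓w=true)
wn=w./sum(w)
M2=sum(wn[i]*Pset[i] for i=1:4) # Σᵢ wᵢ Pᵢ with normalized weights
norm(M-M2)/norm(M) # ≈ 0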

See: geodesic, mean, Fréchet mean.

Examples

using LinearAlgebra, Statistics, PosDefManifold
# Generate 2 random 3x3 SPD matrices
P=randP(3)
Q=randP(3)
M=mean(logdet0, P, Q) # (1)

R=randP(3)
# passing several matrices and an associated weights vector, listing the matrices
# directly; the weights vector does not need to be normalized
mean(Fisher, ℍVector([P, Q, R]); w=[1, 2, 3])

# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
weights=[1, 2, 3, 1]
# passing a vector of Hermitian matrices (ℍVector type)
M=mean(Euclidean, Pset; w=weights) # (3) weighted Euclidean mean
M=mean(Wasserstein, Pset)  # (3) unweighted Wasserstein mean
# using unicode: M=mean(Wasserstein, 𝐏)
source
PosDefManifold.means (Function)
(1) means(metric::Metric, 𝒫::ℍVector₂)

(2) means(metric::Metric, 𝒟::𝔻Vector₂)

(1) Given a 2d array $𝒫$ of positive definite matrices as an ℍVector₂ type, compute the Fréchet mean for as many ℍVector type objects as are held in $𝒫$, using the specified metric of type Metric::Enumerated type. Return the means in a vector of Hermitian matrices, that is, as an ℍVector type.

(2) Given a 2d array $𝒟$ of real positive definite diagonal matrices as a 𝔻Vector₂ type, compute the Fréchet mean for as many 𝔻Vector type objects as are held in $𝒟$, using the specified metric of type Metric::Enumerated type. Return the means in a vector of Diagonal matrices, that is, as a 𝔻Vector type.

The weighted Fréchet mean is not supported in this function.

See also: mean.

Examples

 using PosDefManifold
 # Generate a set of 4 random 3x3 SPD matrices
 Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
 # Generate a set of 40 random 3x3 SPD matrices
 Qset=randP(3, 40) # or, using unicode: 𝐐=randP(3, 40)
 # listing directly ℍVector objects
 means(logEuclidean, ℍVector₂([Pset, Qset])) # or: means(logEuclidean, ℍVector₂([𝐏, 𝐐]))
 # note that [𝐏, 𝐐] is actually a ℍVector₂ type object

 # creating and passing an object of ℍVector₂ type
 sets=ℍVector₂(undef, 2) # or: 𝒫=ℍVector₂(undef, 2)
 sets[1]=Pset # or: 𝒫[1]=𝐏
 sets[2]=Qset # or: 𝒫[2]=𝐐
 means(logEuclidean, sets) # or: means(logEuclidean, 𝒫)
source
(1) generalizedMean(𝐏::ℍVector, p::Real;
               <w::Vector=[], ✓w=true>)

(2) generalizedMean(𝐃::𝔻Vector, p::Real;
               <w::Vector=[], ✓w=true>)

(1) Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the weighted generalized mean $G$ with real parameter $p$, that is,

$G=\big(\sum_{i=1}^{k}w_iP_i^p\big)^{1/p}$.

If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted generalized mean

$G=\big(\sum_{i=1}^{k}P_i^p\big)^{1/p}$.

(2) Like method (1), but for a 1d array $𝐃={D_1,...,D_k}$ of $k$ real positive definite diagonal matrices of 𝔻Vector type.

If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the weights each time.

Noteworthy special cases of the parameter $p$: with $p=1$ the generalized mean is the weighted arithmetic mean and with $p=-1$ it is the weighted harmonic mean.

Notice that when the matrices in $𝐏$ all pair-wise commute, for instance if the matrices are diagonal, the generalized means coincide with the power means for any $p∈[-1, 1]$ and for $p=0.5$ they coincide also with the Wasserstein mean. For this reason the generalized means are used as default initialization of both the powerMean and wasMean algorithms.
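As a quick illustration (a sketch, not from the original documentation), the two special cases mentioned above can be verified numerically:

using LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
w=[1, 2, 3, 1]
wn=w./sum(w)                     # normalized weights
A=generalizedMean(Pset, 1; w=w)  # weighted arithmetic mean
H=generalizedMean(Pset, -1; w=w) # weighted harmonic mean
A2=sum(wn[i]*Pset[i] for i=1:4)
H2=inv(sum(wn[i]*inv(Pset[i]) for i=1:4))
norm(A-A2)/norm(A), norm(H-H2)/norm(H) # both ≈ 0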

See: generalized means.

See also: powerMean, wasMean, mean.

Examples

using LinearAlgebra, Statistics, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# unweighted mean
G = generalizedMean(Pset, 0.25) # or: G = generalizedMean(𝐏, 0.25)

# weighted mean
G = generalizedMean(Pset, 0.5; w=weights)

# with weights previously normalized we can set ✓w=false
weights=weights./sum(weights)
G = generalizedMean(Pset, 0.5; w=weights, ✓w=false)
source
(1) geometricMean(𝐏::ℍVector;
                 <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false, ⏩=false>)

(2) geometricMean(𝐃::𝔻Vector;
                 <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

alias: gMean

(1) Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Fisher metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0.$

For estimating it, this function implements the well-known gradient descent algorithm, yielding iterations

$G ←G^{1/2}\textrm{exp}\big(\sum_{i=1}^{k}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$
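For illustration only, the update rule just described can be transcribed naively in a few lines; this is a sketch of the iteration, not the package's optimized implementation:

using LinearAlgebra, Statistics, PosDefManifold
Pset=randP(3, 4)
k=length(Pset)
w=fill(1/k, k)              # equal (normalized) weights
G=mean(logEuclidean, Pset)  # initialization, as in the function's default
for _=1:50
    sqrtG=sqrt(G)
    isqrtG=inv(sqrtG)
    M=sum(w[i]*log(ℍ(isqrtG*Pset[i]*isqrtG)) for i=1:k)
    global G=ℍ(sqrtG*exp(ℍ(Matrix(M)))*sqrtG)
end
# G should now be close to geometricMean(Pset)[1]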

If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted geometric mean.

If <optional keword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
  • tol is the tolerance for the convergence (see below).
  • if ⍰ is true, the convergence attained at each iteration is printed.
  • if ⏩=true the iterations are multi-threaded.
Multi-Threading

Multi-threading is still experimental in Julia. You should check the result on each computer. Multi-threading is automatically disabled if k<3 or if Julia uses only one thread. See Threads.

Nota Bene

In normal circumstances this algorithm converges monotonically. If the algorithm diverges a warning is printed indicating the iteration when this happened.

$tol$ defaults to 100 times the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.

(2) Like method (1), but for a 1d array $𝐃={D_1,...,D_k}$ of $k$ real positive definite diagonal matrices of 𝔻Vector type.

See: Fisher metric.

See also: powerMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = geometricMean(Pset) # or G, iter, conv = geometricMean(𝐏)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = geometricMean(Pset, w=weights)

# print the convergence at all iterations
G, iter, conv = geometricMean(Pset; w=weights, ⍰=true)

# now suppose Pset has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = geometricMean(Pset; w=weights, ✓w=false, ⍰=true, init=G)

# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(geometricMean(Pset)) # single-threaded
@benchmark(geometricMean(Pset; ⏩=true)) # multi-threaded
source
(1) logdet0Mean(𝐏::ℍVector;
               <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

(2) logdet0Mean(𝐃::𝔻Vector;
               <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

alias: ld0Mean

(1) Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the logdet zero metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}=G^{-1}$.

For estimating it, this function implements the fixed-point iteration algorithm suggested by (Moakher, 2012, p315)🎓, yielding iterations

$G ← \frac{1}{2}\big(\sum_{i=1}^{k}w_i(P_i+G)^{-1}\big)^{-1}$.
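As an illustrative check (not from the original documentation), the fixed-point property of the returned mean can be verified numerically:

using LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
G, iter, conv = logdet0Mean(Pset)
k=length(Pset)
# left-hand side of the equation above, with equal weights 1/k
lhs=sum((1/k)*inv(0.5*Pset[i]+0.5*G) for i=1:k)
norm(lhs-inv(G))/norm(inv(G)) # ≈ 0 at convergence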

If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted logdet zero mean.

If <optional keword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
  • tol is the tolerance for the convergence (see below).
  • if ⍰ is true, the convergence attained at each iteration is printed.
Nota Bene

In normal circumstances this algorithm converges monotonically. If the algorithm diverges a warning is printed indicating the iteration when this happened.

$tol$ defaults to 100 times the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.

(2) Like method (1), but for a 1d array $𝐃={D_1,...,D_k}$ of $k$ real positive definite diagonal matrices of 𝔻Vector type.

See: logdet zero metric, modified Bhattacharyya mean.

See also: powerMean, geometricMean, wasMean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = logdet0Mean(Pset) # or G, iter, conv = logdet0Mean(𝐏)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = logdet0Mean(Pset, w=weights)

# print the convergence at all iterations
G, iter, conv = logdet0Mean(Pset; w=weights, ⍰=true)

# now suppose Pset has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = logdet0Mean(Pset; w=weights, ✓w=false, ⍰=true, init=G)
source
(1) wasMean(𝐏::ℍVector;
        <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

(2) wasMean(𝐃::𝔻Vector;
        <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

(1) Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Wasserstein metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying

$G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$.

For estimating it, this function implements the fixed-point iterative algorithm proposed by (Álvarez-Esteban et al., 2016)🎓:

$G ← G^{-1/2}\big(\sum_{i=1}^{k} w_i(G^{1/2}P_i G^{1/2})^{1/2}\big)^2 G^{-1/2}$.
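Here too the fixed-point property can be checked numerically once the algorithm has converged (an illustrative sketch, not from the original documentation):

using LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
G, iter, conv = wasMean(Pset)
k=length(Pset)
sqrtG=sqrt(G)
rhs=sum((1/k)*sqrt(ℍ(sqrtG*Pset[i]*sqrtG)) for i=1:k)
norm(G-rhs)/norm(G) # ≈ 0 at convergence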

If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted Wasserstein mean.

If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and they should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with $p=0.5$ will be used,
  • tol is the tolerance for the convergence (see below).
  • if ⍰ is true, the convergence attained at each iteration is printed.
Nota Bene

In normal circumstances this algorithm converges monotonically. If the algorithm diverges a warning is printed indicating the iteration when this happened.

$tol$ defaults to 100 times the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.

(2) Like in (1), but for a 1d array $𝐃={D_1,...,D_k}$ of $k$ real positive definite diagonal matrices of 𝔻Vector type. In this case the solution is available in closed form, hence the <optional keyword arguments> init, tol and ⍰ have no effect and the 3-tuple $(G, 1, 0)$ is returned. See modified Bhattacharyya mean.

See: Wasserstein metric.

See also: powerMean, geometricMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = wasMean(Pset) # or: G, iter, conv = wasMean(𝐏)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = wasMean(Pset; w=weights)

# print the convergence at all iterations
G, iter, conv = wasMean(Pset; w=weights, ⍰=true)

# now suppose 𝐏 has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = wasMean(Pset; w=weights, ⍰=true, init=G)
source
(1) powerMean(𝐏::ℍVector, p::Real;
         <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

(2) powerMean(𝐃::𝔻Vector, p::Real;
         <w::Vector=[], ✓w=true, init=nothing, tol::Real=0, ⍰=false>)

(1) Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type, an optional non-negative real weights vector $w={w_1,...,w_k}$ and a real parameter $p∈[-1, 1]$, return the 3-tuple $(G, iter, conv)$, where $G$ is Lim and Palfia (2012)'s power means of order $p$ and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm, respectively. Mean $G$ is the unique positive definite matrix satisfying

$G=\sum_{i=1}^{k}(w_iG\textrm{#}_pP_i)$,

where $G\textrm{#}_pP_i$ is the Fisher geodesic equation. In particular:

  • with $p=-1$ this is the harmonic mean (see the inverse Euclidean metric),
  • with $p=+1$ this is the arithmetic mean (see the Euclidean metric),
  • at the limit of $p$ evaluated at zero from both sides this is the geometric mean (see Fisher metric).

For estimating power means for $p\in(-1, 1)$, this function implements the fixed-point iterative algorithm of (Congedo et al., 2017b)🎓. For $p=0$ (geometric mean) this algorithm is run two times with a small positive and negative value of $p$ and the geometric mean of the two resulting means is returned, as suggested in (Congedo et al., 2017b)🎓. This way of estimating the geometric mean of a set of matrices is faster as compared to the usual gradient descent algorithm.
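As an illustrative check of the defining equation (a sketch, not from the original documentation), the Fisher geodesic $G\textrm{#}_pP_i$ can be computed with the geodesic function and the fixed point verified numerically:

using LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
p=0.5
G, iter, conv = powerMean(Pset, p)
k=length(Pset)
# right-hand side of the defining equation, with equal weights 1/k
rhs=sum((1/k)*geodesic(Fisher, G, Pset[i], p) for i=1:k)
norm(G-rhs)/norm(G) # ≈ 0 at convergence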

If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted power mean.

If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.

The following are more <optional keyword arguments>:

  • init is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with parameter $p$ will be used.
  • tol is the tolerance for the convergence (see below).
  • if ⍰ is true, the convergence attained at each iteration is printed.
Nota Bene

In normal circumstances this algorithm converges monotonically. If the algorithm diverges a warning is printed indicating the iteration when this happened.

$tol$ defaults to 100 times the square root of Base.eps of the nearest real type of data input $𝐏$. This corresponds to requiring the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.

(2) Like in (1), but for a 1d array $𝐃={D_1,...,D_k}$ of $k$ real positive definite diagonal matrices of 𝔻Vector type. In this case the solution is available in closed form, hence the <optional keyword arguments> init, tol and ⍰ have no effect and the 3-tuple $(G, 1, 0)$ is returned. See generalized means.

See: power means, generalized means, modified Bhattacharyya mean.

See also: generalizedMean, wasMean, logdet0Mean, mean.

Examples

using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)

# unweighted mean
G, iter, conv = powerMean(Pset, 0.5) # or G, iter, conv = powerMean(𝐏, 0.5)

# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]

# weighted mean
G, iter, conv = powerMean(Pset, 0.5; w=weights)

# print the convergence at all iterations
G, iter, conv = powerMean(Pset, 0.5; w=weights, ⍰=true)

# now suppose 𝐏 has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = powerMean(Pset, 0.5; w=weights, ⍰=true, init=G)
source

Tangent Space operations

Function | Description
logMap | Logarithmic map (from manifold to tangent space)
expMap | Exponential map (from tangent space to manifold)
vecP | Vectorization of matrices in the tangent space
matP | Matrization of matrices in the tangent space (inverse of vecP)

PosDefManifold.logMap (Function)
logMap(metric::Metric, P::ℍ{T}, G::ℍ{T}) where T<:RealOrComplex

Logarithmic Map: map a positive definite matrix $P$ from the SPD or Hermitian manifold into the tangent space at base-point $G$ using the Fisher metric.

$P$ and $G$ must be flagged as Hermitian. See typecasting matrices.

The map is defined as

$Log_G(P)=S=G^{1/2}\textrm{log}\big(G^{-1/2}PG^{-1/2}\big)G^{1/2}$.

The result is an Hermitian matrix. The inverse operation is expMap.

Arguments (metric, P, G):

  • metric is a metric of type Metric::Enumerated type,
  • $P$ is the positive definite matrix to be projected onto the tangent space,
  • $G$ is the tangent space base point.

Currently only the Fisher metric is supported for tangent space operations.

See also: vecP.

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
metric=Fisher
G=mean(metric, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(metric, P, G)
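# the inverse operation, expMap, maps S back onto the manifold at the same base
# point and should recover P (illustrative check, not part of the original example)
expMap(metric, S, G) ≈ P # true up to numerical error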
source
PosDefManifold.expMap (Function)
expMap(metric::Metric, S::ℍ{T}, G::ℍ{T}) where T<:RealOrComplex

Exponential Map: map an Hermitian matrix $S$ from the tangent space at base point $G$ into the SPD or Hermitian manifold (using the Fisher metric).

$S$ and $G$ must be flagged as Hermitian. See typecasting matrices.

The map is defined as

$Exp_G(S)=P=G^{1/2}\textrm{exp}\big(G^{-1/2}SG^{-1/2}\big)G^{1/2}$.

The result is a positive definite matrix. The inverse operation is logMap.

Arguments (metric, S, G):

  • metric is a metric of type Metric::Enumerated type,
  • $S$ is a Hermitian matrix, real or complex, to be projected on the SPD or Hermitian manifold,
  • $G$ is the tangent space base point.

Currently only the Fisher metric is supported for tangent space operations.

Examples

using PosDefManifold, LinearAlgebra
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P on the tangent space at the Fisher mean base point G
S=logMap(Fisher, P, G)
# adding the identity in the tangent space and reprojecting back onto the manifold
H=expMap(Fisher, ℍ(S+I), G)
source
PosDefManifold.vecP (Function)
vecP(S::ℍ{T}) where T<:RealOrComplex

Vectorize a tangent vector (which is an Hermitian matrix) $S$: mat -> vec.

It gives weight $1$ to diagonal elements and √2 to off-diagonal elements (Barachant et al., 2012)🎓.

The result is a vector holding $n(n+1)/2$ elements, where $n$ is the size of $S$.

$S$ must be flagged as Hermitian. See typecasting matrices.

The inverse operation is provided by matP.

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
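# matP reverses the vectorization (and its √2 weighting), recovering S
# (illustrative check, not part of the original example)
matP(v) ≈ S # true up to numerical error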
source
PosDefManifold.matP (Function)
matP(ς::Vector{T}) where T<:RealOrComplex

Matricize a tangent vector (a vector) $ς$: vec -> mat.

This is the function reversing the vecP function, thus the weighting applied therein is reversed as well.

If $ς=vecP(S)$ and $S$ is an $n⋅n$ Hermitian matrix, $ς$ is a tangent vector of size $n(n+1)/2$. The result of calling matP(ς) is then the $n⋅n$ matrix $S$.

To Do: This function needs to be rewritten more efficiently

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P onto the tangent space at the Fisher mean base point
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
# Rotate the vector by a random orthogonal matrix
n=Int(size(S, 1)*(size(S, 1)+1)/2)
U=randU(n)
z=U*v
# Get the point in the tangent space
S=matP(z)
source

Procrustes problems

Function | Description
procrustes | Solution to the Procrustes problem in the manifold of positive definite matrices

procrustes(P::ℍ{T}, Q::ℍ{T}, extremum="min") where T<:RealOrComplex

Given two positive definite matrices $P$ and $Q$, return by default the solution of problem

$\textrm{argmin}_Uδ(P,U^*QU)$,

where $U$ varies over the set of unitary matrices and $δ(.,.)$ is a distance or divergence function.

In physics, $U^*QU$ is called the unitary orbit of $Q$.

If the argument extremum is passed as "max", it returns instead the solution of

$\textrm{argmax}_Uδ(P,U^*QU)$.

$P$ and $Q$ must be flagged as Hermitian. See typecasting matrices.

As it has been shown in Bhatia and Congedo (2019)🎓, using each of the Fisher, logdet zero, Wasserstein and the Kullback-Leibler divergence (see logdet α), the best approximant to $P$ from the unitary orbit of $Q$ commutes with $P$ and, surprisingly, has the same closed-form expression, namely

$U_Q^↓U_P^{↓*}$ for the argmin and $U_Q^↑U_P^{↓*}$ for the argmax,

where $U^↓$ denotes the eigenvector matrix of the subscript argument with eigenvectors in columns sorted by decreasing order of corresponding eigenvalues and $U^↑$ denotes the eigenvector matrix of the subscript argument with eigenvectors in columns sorted by increasing order of corresponding eigenvalues.

The same solutions have long been known from solving the above extremal problem using the Euclidean metric (Umeyama, 1988).
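As an illustrative check of the closed-form solution described above (a sketch assuming that, as LinearAlgebra does for Hermitian matrices, eigvecs returns eigenvectors sorted by ascending eigenvalues):

using LinearAlgebra, PosDefManifold
P=randP(3)
Q=randP(3)
U=procrustes(P, Q)
# reverse the columns to sort eigenvectors by descending eigenvalues
UPd=reverse(eigvecs(P), dims=2)
UQd=reverse(eigvecs(Q), dims=2)
V=UQd*UPd'
# the closed-form solution attains the same (minimal) distance
distance(Fisher, P, ℍ(U'*Q*U)) ≈ distance(Fisher, P, ℍ(V'*Q*V)) # true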

Examples

using PosDefManifold
P=randP(3)
Q=randP(3)
# argmin problem
U=procrustes(P, Q)
# argmax problem
V=procrustes(P, Q, "max")
source