riemannianGeometry.jl
This is the fundamental unit of PosDefManifold. It contains functions for manipulating points in the Riemannian manifold of Symmetric Positive Definite (SPD) or Hermitian Positive Definite (HPD) matrices. In Julia those are Hermitian matrices; see typecasting matrices.
The functions are divided into six categories:
Category | Output |
---|---|
1. Geodesic equations | interpolation, extrapolation, weighted mean of two matrices, ... |
2. Distances | length of geodesics |
3. Graphs and Laplacians | spectral embedding, eigenmaps, track system dynamics, ... |
4. Means | mid-points of geodesics, centers of mass of several points |
5. Tangent Space operations | maps from the manifold to the tangent space and vice versa |
6. Procrustes problems | data matching, transfer learning, ... |
⋅
Geodesic equations
Function | Description |
---|---|
geodesic | Geodesic equations (weighted mean of two positive definite matrices) for any metric |
⋅
PosDefManifold.geodesic
— Function.(1) geodesic(metric::Metric, P::ℍ{T}, Q::ℍ{T}, a::Real) where T<:RealOrComplex
(2) geodesic(metric::Metric, D::𝔻{S}, E::𝔻{S}, a::Real) where S<:Real
(1) Move along the geodesic from point $P$ to point $Q$ (two positive definite matrices) with arclength $0<=a<=1$, using the specified metric, of type Metric::Enumerated type.
For all metrics,
- with $a=0$ we stay at $P$,
- with $a=1$ we move up to $Q$,
- with $a=1/2$ we move to the mid-point of $P$ and $Q$ (mean).
Using the Fisher metric, argument $a$ can be any real number, for instance:
- with $0<a<1$ we move toward $Q$ (attraction),
- with $a>1$ we move over and beyond $Q$ (extrapolation),
- with $a<0$ we move back away from $Q$ (repulsion).
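For example, these properties can be verified numerically (a minimal sketch):
using Statistics, PosDefManifold
P=randP(3)
Q=randP(3)
geodesic(Fisher, P, Q, 0) ≈ P # true
geodesic(Fisher, P, Q, 1) ≈ Q # true
geodesic(Fisher, P, Q, 0.5) ≈ mean(Fisher, P, Q) # true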
$P$ and $Q$ must be flagged by Julia as Hermitian. See typecasting matrices.
Note that if $Q=I$, the Fisher geodesic move is simply $P^a$ (no need to call this function then).
For the logdet zero and Jeffrey metrics no closed-form expression for the geodesic is available to the best of the authors' knowledge, so in these cases the geodesic is found as the weighted mean using the mean function. For the Von Neumann metric not even an expression for the mean is available, so in this case the geodesic is not provided and a warning is printed.
(2) Like in (1), but for two real positive definite diagonal matrices $D$ and $E$.
(2) Like in (1), but for two real positive definite diagonal matrices $D$ and $E$.
Maths
For points $P$, $Q$ and arclength $a$, letting $b=1-a$, the geodesic equations for the supported metrics are:
Metric | geodesic equation |
---|---|
Euclidean | $bP + aQ$ |
invEuclidean | $\big(bP^{-1} + aQ^{-1}\big)^{-1}$ |
ChoEuclidean | $TT^*$, where $T=bL_P + aL_Q$ |
logEuclidean | $\text{exp}\big(b\hspace{2pt}\text{log}(P) + a\hspace{2pt}\text{log}(Q)\big)$ |
logCholesky | $TT^*$, where $T=S_P+a(S_Q-S_P)+D_P\hspace{2pt}\text{exp}\big(a(\text{log}D_Q-\text{log}D_P)\big)$ |
Fisher | $P^{1/2} \big(P^{-1/2} Q P^{-1/2}\big)^a P^{1/2}$ |
logdet0 | uses weighted mean algorithm logdet0Mean |
Jeffrey | uses weighted mean mean |
VonNeumann | N.A. |
Wasserstein | $b^2P+a^2Q +ab\big[(PQ)^{1/2} +(QP)^{1/2}\big]$ |
legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).
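For instance, the Fisher geodesic in the table above can be checked against a direct computation of its closed form (a minimal sketch):
using PosDefManifold, LinearAlgebra
P=randP(3)
Q=randP(3)
a=0.3
G=geodesic(Fisher, P, Q, a)
R=sqrt(P)  # P^{1/2}
Ri=inv(R)  # P^{-1/2}
F=ℍ(R*(Ri*Q*Ri)^a*R)
norm(G-F) # ≈ 0 up to numerical error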
See also: mean.
Examples
using PosDefManifold
P=randP(10)
Q=randP(10)
# Wasserstein mean
M=geodesic(Wasserstein, P, Q, 0.5)
# extrapolate using the Fisher metric
E=geodesic(Fisher, P, Q, 2)
Distances
Function | Description |
---|---|
distanceSqr , distance² | Squared distance between positive definite matrices |
distance | Distance between positive definite matrices |
⋅
PosDefManifold.distanceSqr
— Function.(1) distanceSqr(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distanceSqr(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distanceSqr(metric::Metric, D::𝔻{S}) where S<:Real
(4) distanceSqr(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real
alias: distance²
(1) Return $δ^2(P, I)$, the square of the distance (or divergence) of positive definite matrix $P$ from the identity matrix. See distance from the origin.
(2) Return $δ^2(P, Q)$, the square of the distance (or divergence) between two positive definite matrices $P$ and $Q$. See distance.
In both cases the distance function $δ$ is induced by the argument metric of type Metric::Enumerated type.
$P$ in (1) and $P$, $Q$ in (2) must be flagged by Julia as Hermitian. See typecasting matrices.
(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite Diagonal matrices. See ℍVector type and 𝔻Vector type.
Maths
For point $P$ the squared distances from the identity for the supported metrics are:
Metric | Squared Distance from the identity |
---|---|
Euclidean | $∥P-I∥^2$ |
invEuclidean | $∥P^{-1}-I∥^2$ |
ChoEuclidean | $∥L_P-I∥^2$ |
logEuclidean | $∥\textrm{log}P∥^2$ |
logCholesky | $∥S_P∥^2+∥\textrm{log}D_P∥^2$ |
Fisher | $∥\textrm{log}P∥^2$ |
logdet0 | $\textrm{logdet}\frac{1}{2}(P+I) - \frac{1}{2}\textrm{logdet}(P)$ |
Jeffrey | $\frac{1}{2}\textrm{tr}(P+P^{-1})-n$ |
VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-\textrm{log}P)$ |
Wasserstein | $\textrm{tr}(P+I) -2\textrm{tr}(P^{1/2})$ |
For points $P$ and $Q$ their squared distances for the supported metrics are:
Metric | Squared Distance |
---|---|
Euclidean | $∥P-Q∥^2$ |
invEuclidean | $∥P^{-1}-Q^{-1}∥^2$ |
ChoEuclidean | $∥ L_P - L_Q ∥^2$ |
logEuclidean | $∥\textrm{log}P-\textrm{log}Q∥^2$ |
logCholesky | $∥S_P-S_Q∥^2+∥\textrm{log}D_P-\textrm{log}D_Q∥^2$ |
Fisher | $∥\textrm{log}(P^{-1/2}QP^{-1/2})∥^2$ |
logdet0 | $\textrm{logdet}\frac{1}{2}(P+Q) - \frac{1}{2}\textrm{logdet}(PQ)$ |
Jeffrey | $\frac{1}{2}\textrm{tr}(Q^{-1}P+P^{-1}Q)-n$ |
VonNeumann | $\frac{1}{2}\textrm{tr}(P\textrm{log}P-P\textrm{log}Q+Q\textrm{log}Q-Q\textrm{log}P)$ |
Wasserstein | $\textrm{tr}(P+Q) -2\textrm{tr}(P^{1/2}QP^{1/2})^{1/2}$ |
legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$).
See also: distanceSqrMat.
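As a numerical illustration of the Fisher entry in the table above, the squared distance can also be obtained from the generalized eigenvalues of the matrix pair $(P, Q)$ (a minimal sketch):
using PosDefManifold, LinearAlgebra
P=randP(5)
Q=randP(5)
d²=distanceSqr(Fisher, P, Q)
# eigvals(P, Q) returns the eigenvalues of Q⁻¹P, which are the inverses
# of those of P⁻¹Q; their logarithms only change sign, so the sum of
# squared logarithms is the same
e²=sum(log.(eigvals(P, Q)).^2)
d² ≈ e² # true up to numerical error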
Examples (1)
using PosDefManifold
P=randP(10)
d=distanceSqr(Wasserstein, P)
e=distanceSqr(Fisher, P)
metric=Metric(Int(logdet0)) # or metric=logdet0
s=string(metric) # check what is the current metric
f=distance²(metric, P) #using the alias distance²
Examples (2)
using PosDefManifold
P=randP(10)
Q=randP(10)
d=distanceSqr(logEuclidean, P, Q)
e=distance²(Jeffrey, P, Q)
PosDefManifold.distance
— Function.(1) distance(metric::Metric, P::ℍ{T}) where T<:RealOrComplex
(2) distance(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(3) distance(metric::Metric, D::𝔻{S}) where S<:Real
(4) distance(metric::Metric, D::𝔻{S}, E::𝔻{S}) where S<:Real
(1) Return $δ(P, I)$, the distance between positive definite matrix $P$ and the identity matrix.
(2) Return $δ(P, Q)$, the distance between positive definite matrices $P$ and $Q$.
(3) and (4) are specialized methods of (1) and (2), respectively, for real positive definite Diagonal matrices.
This is the square root of distanceSqr and is invoked with the same syntax.
See also: distanceMat.
Graphs and Laplacians
Function | Description |
---|---|
distanceSqrMat , distance²Mat | Lower triangular matrix of all squared inter-distances |
distanceMat | Lower triangular matrix of all inter-distances |
laplacian | Laplacian of a squared inter-distances matrix |
laplacianEigenMaps , laplacianEM | Eigen maps (eigenvectors) of a Laplacian |
spectralEmbedding , spEmb | Spectral Embedding (all the above functions run in series) |
⋅
PosDefManifold.distanceSqrMat
— Function. (1) distanceSqrMat(metric::Metric, 𝐏::ℍVector;
<⏩=false>)
(2) distanceSqrMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
<⏩=false>) where T<:AbstractFloat
alias: distance²Mat
Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real LowerTriangular matrix comprising elements $δ^2(P_i, P_j)\textrm{, for all }i>=j$.
This is the lower triangular matrix holding all squared inter-distances (zero on diagonal), using the specified metric, of type Metric::Enumerated type, giving rise to distance function $δ$. See distanceSqr.
Only the lower triangular part is computed in order to optimize memory use.
By default, the result matrix is of type Float32. The type can be changed to another real type using method (2).
<optional keyword arguments>:
- if ⏩=true the computation of inter-distances is multi-threaded.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<2k$. See Threads.
See: distance.
See also: laplacian, laplacianEigenMaps, spectralEmbedding.
Examples
using PosDefManifold
# Generate a set of 8 random 10x10 SPD matrices
Pset=randP(10, 8) # or, using unicode: 𝐏=randP(10, 8)
# Compute the squared inter-distance matrix according to the log Euclidean metric.
# This is much faster as compared to the Fisher metric and in general
# it is a good approximation.
Δ²=distanceSqrMat(logEuclidean, Pset)
# return a matrix of type Float64
Δ²64=distanceSqrMat(Float64, logEuclidean, Pset)
# Multi-threaded
Δ²=distanceSqrMat(Fisher, Pset; ⏩=true)
# Get the full matrix of inter-distances
fullΔ²=Hermitian(Δ², :L)
PosDefManifold.distanceMat
— Function. (1) distanceMat(metric::Metric, 𝐏::ℍVector;
<⏩=true>)
(2) distanceMat(type::Type{T}, metric::Metric, 𝐏::ℍVector;
<⏩=true>) where T<:AbstractFloat
Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ of ℍVector type, create the $k⋅k$ real LowerTriangular matrix comprising elements $δ(P_i, P_j)\textrm{, for all }i>=j$.
This is the lower triangular matrix holding all inter-distances (zero on diagonal), using the specified metric, of type Metric::Enumerated type, giving rise to distance $δ$. See distance.
Only the lower triangular part is computed in order to optimize memory use.
By default, the result matrix is of type Float32. The type can be changed to another real type using method (2).
The elements of this matrix are the square roots of the elements computed by distanceSqrMat.
<optional keyword arguments>:
- if ⏩=true the computation of inter-distances is multi-threaded.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
See: distance.
Examples
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Δ=distanceMat(Fisher, Pset)
# return a matrix of type Float64
Δ64=distanceMat(Float64, Fisher, Pset)
# Multi-threaded
Δ64=distanceMat(Fisher, Pset; ⏩=true)
# Get the full matrix of inter-distances
fullΔ=Hermitian(Δ, :L)
PosDefManifold.laplacian
— Function.laplacian(Δ²::𝕃{S}) where S<:Real
Given a LowerTriangular matrix of squared inter-distances $Δ^2$, return the lower triangular part of the normalized Laplacian. The elements of the Laplacian are of the same type as the elements of $Δ^2$. The result is a LowerTriangular matrix.
First, a Gaussian radial basis function is applied to all elements of $Δ^2$, such that
$W_{ij} = \textrm{exp}\big(-Δ^2_{ij}/ε\big)$,
where $ε$ is the Gaussian scale parameter given by $μ*(σ+1)*epsilon$, where
- $μ$ is the geometric mean of the elements $Δ^2_{ij}$,
- $σ$ is the geometric standard deviation of the elements $Δ^2_{ij}$,
- $epsilon$ is a <keyword optional parameter> that defaults to 1.
Finally, the normalized Laplacian is defined as
$Ω = D^{-1/2}WD^{-1/2}$,
where $D$ is the diagonal matrix holding on the main diagonal the sum of the rows (or columns) of $W$.
The normalized Laplacian as here defined can be requested for any input matrix of squared inter-distances, for example, those obtained on scalars or on vectors using appropriate metrics. In any case, only the lower triangular part of the Laplacian is taken as input. See typecasting matrices.
By tuning $epsilon$ you can control both the rate of compression of the ensuing spectralEmbedding, that is, how many dimensions suffice to embed the data, and the spread of points in those embedding dimensions.
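For example (a minimal sketch using the epsilon keyword described above):
using PosDefManifold
Pset=randP(10, 4)
Δ²=distanceSqrMat(Float64, Fisher, Pset)
Ω=laplacian(Δ²) # default epsilon=1
Ω2=laplacian(Δ²; epsilon=2.0) # larger Gaussian scale parameter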
See also: distanceSqrMat, laplacianEigenMaps, spectralEmbedding.
Examples
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Δ²=distanceSqrMat(Fisher, Pset)
Ω=laplacian(Δ²)
PosDefManifold.laplacianEigenMaps
— Function. laplacianEigenMaps(Ω::𝕃{S}, q::Int;
<
tol::Real=0,
maxiter::Int=300,
⍰=false >) where S<:Real
alias: laplacianEM
Given the lower triangular part of a normalized Laplacian $Ω$ (see laplacian) return the eigen maps in $q$ dimensions, i.e., the $q$ eigenvectors of the normalized Laplacian associated with the largest $q$ eigenvalues, excluding the first (which is always equal to 1.0). The eigenvectors are of the same type as $Ω$.
The eigenvectors of the normalized Laplacian are computed by the power iterations + modified Gram-Schmidt method (see powerIterations), allowing the execution of this function for Laplacian matrices of very large size.
Return the 4-tuple $(Λ, U, iterations, convergence)$, where:
- $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
- $U$ holds in columns the eigen maps, that is, the $q$ eigenvectors,
- $iterations$ is the number of iterations executed by the power method,
- $convergence$ is the convergence attained by the power method.
The columns of $U$ hold the coordinates of the points in a low-dimensional Euclidean space (typically two or three dimensions for plotting). This is useful, among other purposes, for plotting data in low dimensions, classifying them and following their trajectories over time or other dimensions. For examples of applications see Rodrigues et al. (2018) 🎓 and references therein.
Arguments:
- $Ω$ is a real LowerTriangular normalized Laplacian obtained by the laplacian function,
- $q$ is the dimension of the Laplacian eigen maps;
- the following are <optional keyword arguments> for the power iterations:
  - tol is the tolerance for convergence (see below),
  - maxiter is the maximum number of iterations allowed,
  - if ⍰ is true, the convergence at all iterations will be printed.
The maximum value of $q$ that can be requested is $n-1$, where $n$ is the size of the Laplacian. In general, $q=2$ or $q=3$ is requested.
$tol$ defaults to the square root of Base.eps of the (real) type of $Ω$. This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.
See also: distanceSqrMat, laplacian, spectralEmbedding.
Examples
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
Dsqr=distanceSqrMat(Fisher, Pset) #or: Δ²=distanceSqrMat(Fisher, 𝐏)
lap= laplacian(Dsqr) # or: Ω=laplacian(Δ²)
evalues, maps, iterations, convergence=laplacianEM(lap, 2)
evalues, maps, iterations, convergence=laplacianEM(lap, 2; maxiter=100)
evalues, maps, iterations, convergence=laplacianEM(lap, 2; ⍰=true)
PosDefManifold.spectralEmbedding
— Function. (1) spectralEmbedding(metric::Metric, 𝐏::ℍVector, q::Int;
<
tol::Real=0,
maxiter::Int=300,
⍰=false,
⏩=false >)
(2) spectralEmbedding(type::Type{T}, metric::Metric, 𝐏::ℍVector, q::Int;
< same optional keyword arguments as in (1) >) where T<:Real
alias: spEmb
Given a 1d array $𝐏$ of $k$ positive definite matrices ${P_1,...,P_k}$ (real or complex), compute its eigen maps in $q$ dimensions.
This function runs one after the other the functions:
- distanceSqrMat (compute the squared inter-distance matrix),
- laplacian (compute the normalized Laplacian),
- laplacianEigenMaps (get the eigen maps).
By default all computations above are done with Float32 precision. Another real type can be requested using method (2), where the type argument is defined.
Return the 4-tuple $(Λ, U, iterations, convergence)$, where:
- $Λ$ is a $q⋅q$ diagonal matrix holding on diagonal the eigenvalues corresponding to the $q$ dimensions of the Laplacian eigen maps,
- $U$ holds in columns the $q$ eigenvectors, i.e., the $q$ coordinates of the points in the embedded space,
- $iterations$ is the number of iterations executed by the power method,
- $convergence$ is the convergence attained by the power method.
Arguments:
- metric is the metric of type Metric::Enumerated type used for computing the inter-distances,
- $𝐏$ is a 1d array of $k$ positive definite matrices of ℍVector type,
- $q$ is the dimension of the Laplacian eigen maps;
- the following are <optional keyword arguments> for the power method iterative algorithm:
  - tol is the tolerance for convergence of the power method (see below),
  - maxiter is the maximum number of iterations allowed for the power method,
  - if ⍰ is true the convergence at all iterations will be printed,
  - if ⏩=true the computation of inter-distances is multi-threaded.
Multi-threading is still experimental in julia. You should check the result on each computer. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<2k$. See Threads.
$tol$ defaults to the square root of Base.eps of the Float32 type (1) or of the type passed as argument (2). This corresponds to requiring equality for the convergence criterion over two successive power iterations of about half of the significant digits.
See also: distanceSqrMat, laplacian, laplacianEigenMaps.
Examples
using PosDefManifold
# Generate a set of 4 random 10x10 SPD matrices
Pset=randP(10, 4) # or, using unicode: 𝐏=randP(10, 4)
evalues, maps, iter, conv=spectralEmbedding(logEuclidean, Pset, 2)
# show convergence information
evalues, maps, iter, conv=spectralEmbedding(logEuclidean, Pset, 2; ⍰=true)
# use Float64 precision.
evalues, maps, iter, conv=spectralEmbedding(Float64, logEuclidean, Pset, 2)
# Multi-threaded
evalues, maps, iter, conv=spectralEmbedding(logEuclidean, Pset, 2; ⏩=true)
Means
Function | Description |
---|---|
mean | Weighted Fréchet mean (wFm) of a scalar or matrix set using any metric |
means | As above for several sets at once |
generalizedMean | Generalized wFm of a matrix set |
geometricMean , gMean | wFm of a matrix set minimizing the dispersion according to the Fisher metric (iterative) |
geometricpMean , gpMean | robust wFm of a matrix set minimizing the p-dispersion according to the Fisher metric (iterative) |
logdet0Mean , ld0Mean | wFm of a matrix set according to the logdet0 metric (iterative) |
wasMean | wFm of a matrix set according to the Wasserstein metric (iterative) |
powerMean | Power wFm of a matrix set (iterative) |
⋅
Statistics.mean
— Function. (1) mean(metric::Metric, P::ℍ{T}, Q::ℍ{T}) where T<:RealOrComplex
(2) mean(metric::Metric, D::𝔻{T}, E::𝔻{T}) where T<:Real
(3) mean(metric::Metric, 𝐏::ℍVector;
<
w::Vector=[],
✓w=true,
⏩=false >)
(4) mean(metric::Metric, 𝐃::𝔻Vector;
< same optional keyword arguments as in (3) >)
(1) Mean of two positive definite matrices, passed in arbitrary order as arguments $P$ and $Q$, using the specified metric of type Metric::Enumerated type. The order is arbitrary as all metrics implemented in PosDefManifold are symmetric. This is the midpoint of the geodesic. For the weighted mean of two positive definite matrices use instead the geodesic function. $P$ and $Q$ must be flagged as Hermitian. See typecasting matrices.
(2) Like in (1), but for two real diagonal positive definite matrices $D$ and $E$.
(3) Fréchet mean of a 1d array $𝐏$ of $k$ positive definite matrices $𝐏={P_1,...,P_k}$ of ℍVector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified metric as in (1).
(4) Fréchet mean of a 1d array $𝐃$ of $k$ real positive definite diagonal matrices $𝐃={D_1,...,D_k}$ of 𝔻Vector type, with optional non-negative real weights $w={w_1,...,w_k}$ and using the specified metric as in (1).
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted mean.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.
Adopting the Fisher, logdet0 and Wasserstein metric in (3) and the logdet0 metric in (4), the mean is computed by means of an iterative algorithm and information on its convergence is displayed in the REPL. For suppressing this information and for more options for computing these means call directly functions geometricMean, logdet0Mean and wasMean. See also the robust function geometricpMean.
For (3) and (4), if ⏩=true is passed as <optional keyword argument>, the computation of the mean is multi-threaded.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
Maths
For a set of $k$ matrices ${P_1, P_2,..., P_k}$ weighted by ${w_1, w_2,..., w_k}:\sum_{i=1}^{k}w_i=1$, the weighted Fréchet means for the supported metrics admitting a closed-form expression are:
Metric | weighted Fréchet mean |
---|---|
Euclidean | $\sum_{i=1}^{k}w_i P_i$ |
invEuclidean | $\big(\sum_{i=1}^{k}w_i P_i^{-1}\big)^{-1}$ |
ChoEuclidean | $TT^*$, where $T=\sum_{i=1}^{k}(w_iL_{P_i})$ |
logEuclidean | $\textrm{exp}\big(\sum_{i=1}^{k}w_i\hspace{1pt} \textrm{log}P_i \big)$ |
logCholesky | $TT^*$, where $T=\sum_{i=1}^{k}(w_iS_{P_i})+\textrm{exp}\big(\sum_{i=1}^{k}(w_i\textrm{log}D_{P_i})\big)$ |
Jeffrey | $A^{1/2}\big(A^{-1/2}HA^{-1/2}\big)^{1/2}A^{1/2}$ |
and for those that verify an equation:
Metric | equation verified by the weighted Fréchet mean |
---|---|
Fisher | $\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0.$ |
logdet0 | $\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}=G^{-1}$ |
VonNeumann | N.A. |
Wasserstein | $G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$ |
legend: $L_X$, $S_X$ and $D_X$ are the Cholesky lower triangle of $X$, its strictly lower triangular part and diagonal part, respectively (hence, $S_X+D_X=L_X$, $L_XL_X^*=X$). $A$ and $H$ are the weighted arithmetic and weighted harmonic mean, respectively.
See: geodesic, mean, Fréchet mean.
Examples
using LinearAlgebra, Statistics, PosDefManifold
# Generate 2 random 3x3 SPD matrices
P=randP(3)
Q=randP(3)
M=mean(logdet0, P, Q) # (1)
R=randP(3)
# passing several matrices and associated weights;
# the weights vector does not need to be normalized
mean(Fisher, ℍVector([P, Q, R]); w=[1, 2, 3])
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
weights=[1, 2, 3, 1]
# passing a vector of Hermitian matrices (ℍVector type)
M=mean(Euclidean, Pset; w=weights) # (3) weighted Euclidean mean
M=mean(Wasserstein, Pset) # (3) unweighted Wasserstein mean
# using unicode: M=mean(Wasserstein, 𝐏)
# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(mean(logEuclidean, Pset)) # single-threaded
@benchmark(mean(logEuclidean, Pset; ⏩=true)) # multi-threaded
mean(metric::Metric, ν::Vector{T}) where T<:RealOrComplex
Mean of $k$ real or complex scalars, using the specified metric of type Metric::Enumerated type. Note that using the Fisher, logEuclidean and Jeffrey metrics, the resulting mean is the scalar geometric mean. Note also that the code of this method is in unit statistics.jl, while the code for all the others is in unit riemannianGeometry.jl.
Examples
using PosDefManifold
# Generate 10 random numbers distributed as a chi-square with 2 df.
ν=[randχ²(2) for i=1:10]
arithmetic_mean=mean(Euclidean, ν)
geometric_mean=mean(Fisher, ν)
harmonic_mean=mean(invEuclidean, ν)
harmonic_mean<=geometric_mean<=arithmetic_mean # AGH inequality
PosDefManifold.means
— Function. (1) means(metric::Metric, 𝒫::ℍVector₂;
<⏩=false>)
(2) means(metric::Metric, 𝒟::𝔻Vector₂;
<⏩=false>)
(1) Given a 2d array $𝒫$ of positive definite matrices as an ℍVector₂ type, compute the Fréchet mean for as many ℍVector type objects as are held in $𝒫$, using the specified metric of type Metric::Enumerated type. Return the means in a vector of Hermitian matrices, that is, as an ℍVector type.
(2) Given a 2d array $𝒟$ of real positive definite diagonal matrices as an 𝔻Vector₂ type, compute the Fréchet mean for as many 𝔻Vector type objects as are held in $𝒟$, using the specified metric of type Metric::Enumerated type. Return the means in a vector of Diagonal matrices, that is, as a 𝔻Vector type.
The weighted Fréchet mean is not supported in this function.
If <optional keyword argument> ⏩=true the computation of the means is multi-threaded.
Multi-threading is still experimental in julia. For each mean to be computed, multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$, where $k$ is the number of matrices for which the mean is to be computed. See Threads.
See also: mean
.
Examples
using PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# Generate a set of 40 random 4x4 SPD matrices
Qset=randP(4, 40) # or, using unicode: 𝐐=randP(4, 40)
# listing directly ℍVector objects
means(logEuclidean, ℍVector₂([Pset, Qset])) # or: means(logEuclidean, ℍVector₂([𝐏, 𝐐]))
# note that [𝐏, 𝐐] is actually a ℍVector₂ type object
# creating and passing an object of ℍVector₂ type
sets=ℍVector₂(undef, 2) # or: 𝒫=ℍVector₂(undef, 2)
sets[1]=Pset # or: 𝒫[1]=𝐏
sets[2]=Qset # or: 𝒫[2]=𝐐
means(logEuclidean, sets) # or: means(logEuclidean, 𝒫)
# going multi-threaded
# first, create 20 sets of 200 50x50 SPD matrices
sets=ℍVector₂([randP(50, 200) for i=1:20])
# How much computing time do we save?
# (example: min time obtained with 4 threads & 4 BLAS threads)
using BenchmarkTools
# non multi-threaded, mean with closed-form solution
@benchmark(means(logEuclidean, sets)) # (6.196 s)
# multi-threaded, mean with closed-form solution
@benchmark(means(logEuclidean, sets; ⏩=true)) # (1.897 s)
sets=ℍVector₂([randP(10, 200) for i=1:10])
# non multi-threaded, mean with iterative solution
# wait a bit
@benchmark(means(Fisher, sets)) # (4.672 s )
# multi-threaded, mean with iterative solution
@benchmark(means(Fisher, sets; ⏩=true)) # (1.510 s)
PosDefManifold.generalizedMean
— Function. generalizedMean(𝐏::Union{ℍVector, 𝔻Vector}, p::Real;
<
w::Vector=[],
✓w=true,
⏩=false >)
Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the weighted generalized means $G$ with real parameter $p$, that is,
$G=\big(\sum_{i=1}^{k}w_iP_i^p\big)^{1/p}$.
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted generalized mean
$G=\big(\sum_{i=1}^{k}P_i^p\big)^{1/p}$.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the weights each time.
If <optional keyword argument> ⏩=true the computation of the generalized mean is multi-threaded.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
The following special cases for parameter $p$ are noteworthy:
- For $p=\frac{1}{2}$ the generalized mean is the modified Bhattacharyya mean.
- For $p=1$ the generalized mean is the Euclidean mean.
- For $p=-1$ the generalized mean is the inverse Euclidean mean.
- For (the limit of) $p=0$ the generalized mean is the log Euclidean mean, which is the Fisher mean when matrices in 𝐏 all pair-wise commute.
Notice that when the matrices in 𝐏 all pair-wise commute, for instance if the matrices are diagonal, the generalized means coincide with the power means for any $p∈[-1, 1]$, and for $p=0.5$ they coincide also with the Wasserstein mean. For this reason the generalized means are used as default initialization of both the powerMean and wasMean algorithms.
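A minimal sketch verifying two of the special cases listed above:
using Statistics, PosDefManifold
Pset=randP(3, 4)
generalizedMean(Pset, 1) ≈ mean(Euclidean, Pset) # true (Euclidean mean)
generalizedMean(Pset, -1) ≈ mean(invEuclidean, Pset) # true (inverse Euclidean mean)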
See: generalized means.
See also: powerMean, wasMean, mean.
Examples
using LinearAlgebra, Statistics, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# unweighted mean
G = generalizedMean(Pset, 0.25) # or: G = generalizedMean(𝐏, 0.25)
# weighted mean
G = generalizedMean(Pset, 0.5; w=weights)
# with weights previously normalized we can set ✓w=false
weights=weights./sum(weights)
G = generalizedMean(Pset, 0.5; w=weights, ✓w=false)
# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 160)
@benchmark(generalizedMean(Pset, 0.5)) # single-threaded
@benchmark(generalizedMean(Pset, 0.5; ⏩=true)) # multi-threaded
PosDefManifold.geometricMean
— Function. geometricMean(𝐏::Union{ℍVector, 𝔻Vector};
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0,
maxiter::Int=500,
⍰=false,
⏩=false,
adaptStepSize=true >)
alias: gMean
Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Fisher metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying
$\sum_{i=1}^{k}w_i\textrm{log}\big(G^{-1/2} P_i G^{-1/2}\big)=0.$
For estimating it, this function implements the well-known gradient descent algorithm, but with an exponentially decaying step size $ς$, yielding iterations
$G ←G^{1/2}\textrm{exp}\big(ς\sum_{i=1}^{k}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$
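A minimal sketch checking that the gradient (the sum inside the exponential above) vanishes at the returned mean, using uniform weights:
using PosDefManifold, LinearAlgebra
Pset=randP(3, 4)
G, iter, conv = geometricMean(Pset)
Gi=inv(sqrt(G)) # G^{-1/2}
grad=sum(log(ℍ(Gi*P*Gi)) for P in Pset)/length(Pset)
norm(grad) # ≈ 0, on the order of the convergence tolerance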
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted geometric mean.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.
The following are more <optional keyword arguments>:
- init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
- tol is the tolerance for the convergence (see below),
- maxiter is the maximum number of iterations allowed,
- if ⍰=true, the convergence attained at each iteration and the step size $ς$ are printed. Also, a warning is printed if convergence is not attained,
- if ⏩=true the iterations are multi-threaded (see below),
- if adaptStepSize=false the step size $ς$ is fixed to 1 at all iterations.
If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the log Euclidean mean, hence the <optional keyword arguments> init, tol and ⍰ have no effect and the 3-tuple $(G, 1, 0)$ is returned. See the log Euclidean metric.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
In normal circumstances this algorithm converges monotonically. If the algorithm diverges and ⍰ is true a warning is printed indicating the iteration when this happened.
The exponentially decaying step size features a faster convergence rate as compared to the fixed step size $ς=1$ that is usually adopted. The decaying rate is inversely proportional to maxiter, thus increase/decrease maxiter in order to set a slower/faster decaying rate. maxiter should not be set too low though.
$tol$ defaults to the square root of Base.eps of the nearest real type of the data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.
See: Fisher metric.
See also: geometricpMean, powerMean, wasMean, logdet0Mean, mean.
Examples
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = geometricMean(Pset) # or G, iter, conv = geometricMean(𝐏)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = geometricMean(Pset, w=weights)
# print the convergence at all iterations
G, iter, conv = geometricMean(Pset; ⍰=true)
# now suppose Pset has changed a bit, initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = geometricMean(Pset; w=weights, ✓w=true, ⍰=true, init=G)
# run multi-threaded when the number of matrices is high
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(geometricMean(Pset)) # single-threaded
@benchmark(geometricMean(Pset; ⏩=true)) # multi-threaded
# show the mean and the input points using spectral embedding
using Plots
k=80
Pset=randP(20, k)
G, iter, conv = geometricMean(Pset; ⏩=true)
push!(Pset, G)
Λ, U, iter, conv=spectralEmbedding(Fisher, Pset, 2; ⍰=true)
plot(U[1:k, 1], U[1:k, 2], seriestype=:scatter, title="Spectral Embedding", label="Pset")
plot!(U[k+1:k+1, 1], U[k+1:k+1, 2], seriestype=:scatter, label="mean")
PosDefManifold.geometricpMean
— Function. geometricpMean(𝐏::ℍVector, p::Real=goldeninv;
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0,
maxiter::Int=750,
⍰=false,
⏩=false,
adaptStepSize=true >)
alias: gpMean
Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type, a real parameter $0<p<1$ and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the geometric-p mean, i.e., the mean according to the Fisher metric minimizing the p-dispersion (see below) and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm.
This function implements the p-dispersion gradient descent algorithm with step-size $ς$ (to be published), yielding iterations
$G ←G^{1/2}\textrm{exp}\big(ς\sum_{i=1}^{k}pδ^2(G, P_i)^{p-1}w_i\textrm{log}(G^{-1/2} P_i G^{-1/2})\big)G^{1/2}.$
- if $p=1$ this yields the geometric mean (implemented with fixed step size in geometricMean),
- if $p=0.5$ this yields the geometric median,
- the default value of $p$ is the inverse of the golden ratio (0.61803...), yielding the geometric-golden mean.
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted geometric-p mean.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.
The following are more <optional keyword arguments>:
- init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
- tol is the tolerance for the convergence (see below),
- maxiter is the maximum number of iterations allowed,
- if ⍰=true, the step size and the convergence attained at each iteration are printed. Also, a warning is printed if convergence is not attained,
- if ⏩=true the iterations are multi-threaded (see below),
- if adaptStepSize=true (default) the step size $ς$ for the gradient descent is adapted at each iteration (see below).
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
If the algorithm diverges and ⍰ is true a warning is printed indicating the iteration when this happened. This algorithm may temporarily diverge and still reach convergence. Overall, while all the other iterative algorithms implemented in PosDefManifold are very stable, this one is not.
The smaller the parameter $p$ is, the slower and less likely the convergence is. If the algorithm does not converge, try increasing $p$, initializing the algorithm with the output of geometricMean and/or eliminating the outliers from the input set $𝐏$.
If adaptStepSize is true (default) the step size $ς$ is adapted at each iteration, otherwise a fixed step size $ς=1$ is used. Adapting the step size in general hastens convergence.
$tol$ defaults to the square root of Base.eps of the nearest real type of the data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.
See: Fisher metric.
See also: geometricMean, powerMean, wasMean, logdet0Mean, mean.
Examples
using LinearAlgebra, PosDefManifold, Plots
# This example shows that this algorithm is more robust to outliers
# as compared to the standard geometric mean algorithm
# Generate a set of 40 random 10x10 SPD matrices
Pset=randP(10, 40)
# Get the usual geometric mean for comparison
G, iter1, conv1 = geometricMean(Pset, ⍰=true, ⏩=true)
# change p to observe how the convergence behavior changes accordingly
# Get the golden p-mean (default)
H, iter2, conv2 = geometricpMean(Pset, ⍰=true, ⏩=true)
# Get the median (0.5-mean)
H, iter2, conv2 = geometricpMean(Pset, 0.5, ⍰=true, ⏩=true)
println(iter1, " ", iter2); println(conv1, " ", conv2)
# move the first matrix in Pset to create an outlier
Pset[1]=geodesic(Fisher, G, Pset[1], 3)
G1, iter1, conv1 = geometricMean(Pset, ⍰=true, ⏩=true)
H1, iter2, conv2 = geometricpMean(Pset, 0.5, ⍰=true, ⏩=true)
println(iter1, " ", iter2); println(conv1, " ", conv2)
# collect the geometric and p-means, before and after the
# introduction of the outlier, in a vector of Hermitian matrices S
S=ℍVector([G, G1, H, H1])
# check the inter-distance matrix Δ² to verify that the geometric mean
# after the introduction of the outlier (G1) moved farther away from
# the original geometric mean (G) than H1 moved from H,
# i.e., that element (4,3) is much smaller than element (2,1).
Δ²=distance²Mat(Float64, Fisher, S)
# how far are all these matrices from all the others?
fullΔ²=Hermitian(Δ², :L)
dist=[sum(fullΔ²[:, i]) for i=1:size(fullΔ², 1)]
# plot the matrices in `S` using spectral embedding.
using Plots
Λ, U, iter, conv = laplacianEM(laplacian(Δ²), 3; ⍰=true)
plot([U[1, 1]], [U[1, 2]], seriestype=:scatter, label="g-mean")
plot!([U[2, 1]], [U[2, 2]], seriestype=:scatter, label="g-mean outlier")
plot!([U[3, 1]], [U[3, 2]], seriestype=:scatter, label="p-mean")
plot!([U[4, 1]], [U[4, 2]], seriestype=:scatter, label="p-mean outlier")
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(geometricpMean(Pset)) # single-threaded
@benchmark(geometricpMean(Pset, ⏩=true)) # multi-threaded
PosDefManifold.logdet0Mean
— Function. logdet0Mean(𝐏::Union{ℍVector, 𝔻Vector};
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0,
maxiter::Int=500,
⍰=false,
⏩=false >)
alias: ld0Mean
Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the logdet zero metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying
$\sum_{i=1}^{k}w_i\big(\frac{1}{2}P_i+\frac{1}{2}G\big)^{-1}-G^{-1}=0$.
For estimating it, this function implements the fixed-point iteration algorithm suggested by (Moakher, 2012, p315)🎓, yielding iterations
$G ← \frac{1}{2}\big(\sum_{i=1}^{k}w_i(P_i+G)^{-1}\big)^{-1}$.
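A minimal sketch checking the fixed point at the returned mean, using uniform weights:
using PosDefManifold, LinearAlgebra
Pset=randP(3, 4)
G, iter, conv = logdet0Mean(Pset)
k=length(Pset)
F=0.5*inv(sum(inv(P+G) for P in Pset)/k)
norm(F-G)/norm(G) # small, on the order of the convergence tolerance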
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted logdet zero mean.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.
The following are more <optional keyword arguments>:
- init is a matrix to be used as initialization for the mean. If no matrix is provided, the log Euclidean mean will be used,
- tol is the tolerance for the convergence (see below),
- maxiter is the maximum number of iterations allowed,
- if ⍰=true, the convergence attained at each iteration is printed and a warning is printed if convergence is not attained,
- if ⏩=true the iterations are multi-threaded (see below).
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
In normal circumstances this algorithm converges monotonically. If the algorithm diverges and ⍰ is true a warning is printed indicating the iteration when this happened.
$tol$ defaults to 100 times the square root of Base.eps of the nearest real type of the data input $𝐏$. This corresponds to requiring the square root of the relative convergence criterion over two successive iterations to vanish for about half the significant digits minus 2.
See: logdet zero metric, modified Bhattacharyya mean.
See also: powerMean, wasMean, logdet0Mean, mean.
Examples
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = logdet0Mean(Pset) # or G, iter, conv = logdet0Mean(𝐏)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = logdet0Mean(Pset, w=weights)
# print the convergence at all iterations
G, iter, conv = logdet0Mean(Pset; w=weights, ⍰=true)
# suppose Pset has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = logdet0Mean(Pset; w=weights, ✓w=false, ⍰=true, init=G)
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(logdet0Mean(Pset)) # single-threaded
@benchmark(logdet0Mean(Pset; ⏩=true)) # multi-threaded
PosDefManifold.wasMean
— Function. wasMean(𝐏::Union{ℍVector, 𝔻Vector};
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0,
maxiter::Int=500,
⍰=false,
⏩=false >)
Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type and optional non-negative real weights vector $w={w_1,...,w_k}$, return the 3-tuple $(G, iter, conv)$, where $G$ is the mean according to the Wasserstein metric and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm. Mean $G$ is the unique positive definite matrix satisfying
$G=\sum_{i=1}^{k}w_i\big( G^{1/2} P_i G^{1/2}\big)^{1/2}$.
For estimating it, this function implements the fixed-point iterative algorithm proposed by (Álvarez-Esteban et al., 2016)🎓:
$G ← G^{-1/2}\big(\sum_{i=1}^{k} w_i(G^{1/2}P_i G^{1/2})^{1/2}\big)^2 G^{-1/2}$.
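A minimal sketch checking the defining equation at the returned mean, using uniform weights:
using PosDefManifold, LinearAlgebra
Pset=randP(3, 4)
G, iter, conv = wasMean(Pset)
R=sqrt(G) # G^{1/2}
F=sum(sqrt(ℍ(R*P*R)) for P in Pset)/length(Pset)
norm(F-G)/norm(G) # small, on the order of the convergence tolerance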
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted Wasserstein mean.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and they should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.
The following are more <optional keyword arguments>:
- init is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with $p=0.5$ will be used,
- tol is the tolerance for the convergence (see below),
- maxiter is the maximum number of iterations allowed,
- if ⍰=true, the convergence attained at each iteration is printed and a warning is printed if convergence is not attained,
- if ⏩=true the iterations are multi-threaded (see below).
If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the modified Bhattacharyya mean, hence the <optional keyword arguments> init, tol and ⍰ have no effect and the 3-tuple $(G, 1, 0)$ is returned. See modified Bhattacharyya mean.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
In normal circumstances this algorithm converges monotonically. If the algorithm diverges and ⍰ is true a warning is printed indicating the iteration when this happened.
$tol$ defaults to the square root of Base.eps of the nearest real type of the data input $𝐏$. This corresponds to requiring the norm of the satisfying matrix equation divided by the number of elements to vanish for about half the significant digits.
See: Wasserstein metric.
See also: powerMean, generalizedMean, logdet0Mean, mean.
Examples
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = wasMean(Pset) # or: G, iter, conv = wasMean(𝐏)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = wasMean(Pset; w=weights)
# print the convergence at all iterations
G, iter, conv = wasMean(Pset; w=weights, ⍰=true)
# suppose 𝐏 has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = wasMean(Pset; w=weights, ⍰=true, init=G)
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(wasMean(Pset)) # single-threaded
@benchmark(wasMean(Pset; ⏩=true)) # multi-threaded
PosDefManifold.powerMean
— Function. powerMean(𝐏::Union{ℍVector, 𝔻Vector}, p::Real;
<
w::Vector=[],
✓w=true,
init=nothing,
tol::Real=0,
maxiter::Int=500,
⍰=false,
⏩=false >)
Given a 1d array $𝐏={P_1,...,P_k}$ of $k$ positive definite matrices of ℍVector type or real positive definite diagonal matrices of 𝔻Vector type, an optional non-negative real weights vector $w={w_1,...,w_k}$ and a real parameter p $\in[-1, 1]$, return the 3-tuple $(G, iter, conv)$, where $G$ is Lim and Palfia (2012)'s power mean of order $p$ and $iter$, $conv$ are the number of iterations and convergence attained by the algorithm, respectively. Mean $G$ is the unique positive definite matrix satisfying
$G=\sum_{i=1}^{k}(w_iG\textrm{#}_pP_i)$,
where $G\textrm{#}_pP_i$ is the Fisher geodesic equation. In particular:
- with $p=-1$ this is the harmonic mean (see the inverse Euclidean metric),
- with $p=+1$ this is the arithmetic mean (see the Euclidean metric),
- at the limit of $p$ evaluated at zero from both sides this is the geometric mean (see the Fisher metric).
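A minimal sketch verifying the two closed-form special cases above:
using Statistics, LinearAlgebra, PosDefManifold
Pset=randP(3, 4)
G, iter, conv = powerMean(Pset, 1.0)
norm(G-mean(Euclidean, Pset))/norm(G) # ≈ 0 (arithmetic mean)
H, iter, conv = powerMean(Pset, -1.0)
norm(H-mean(invEuclidean, Pset))/norm(H) # ≈ 0 (harmonic mean)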
For estimating power means for $p\in(-1, 1)$, this function implements the fixed-point iterative algorithm of (Congedo et al., 2017b)🎓. For $p=0$ (geometric mean) this algorithm is run two times with a small positive and negative value of $p$ and the geometric mean of the two resulting means is returned, as suggested in (Congedo et al., 2017b)🎓. This way of estimating the geometric mean of a set of matrices is faster as compared to the usual gradient descent algorithm.
If you don't pass a weight vector with <optional keyword argument> $w$, return the unweighted power mean.
If <optional keyword argument> ✓w=true (default), the weights are normalized so as to sum up to 1, otherwise they are used as they are passed and should be already normalized. This option is provided to allow calling this function repeatedly without normalizing the same weights vector each time.
The following are more <optional keyword arguments>:
- init is a matrix to be used as initialization for the mean. If no matrix is provided, the instance of generalized means with parameter $p$ will be used,
- tol is the tolerance for the convergence (see below),
- maxiter is the maximum number of iterations allowed,
- if ⍰=true, the convergence attained at each iteration is printed and a warning is printed if convergence is not attained,
- if ⏩=true the iterations are multi-threaded.
If the input is a 1d array of $k$ real positive definite diagonal matrices the solution is available in closed-form as the generalized mean of order $p$, hence the <optional keyword arguments> init, tol and ⍰ have no effect and the 3-tuple $(G, 1, 0)$ is returned. See generalized means.
Multi-threading is still experimental in julia. Multi-threading is automatically disabled if the number of threads Julia is instructed to use is $<2$ or $<4k$. See Threads.
In normal circumstances this algorithm converges monotonically. If the algorithm diverges and ⍰ is true a warning is printed indicating the iteration when this happened.
$tol$ defaults to the square root of Base.eps of the nearest real type of the data input $𝐏$. This corresponds to requiring the norm of the difference of the matrix solution over two successive iterations divided by the number of elements in the matrix to vanish for about half the significant digits.
See: power means, generalized means, modified Bhattacharyya mean.
See also: generalizedMean, wasMean, logdet0Mean, mean.
Examples
using LinearAlgebra, PosDefManifold
# Generate a set of 4 random 3x3 SPD matrices
Pset=randP(3, 4) # or, using unicode: 𝐏=randP(3, 4)
# unweighted mean
G, iter, conv = powerMean(Pset, 0.5) # or G, iter, conv = powerMean(𝐏, 0.5)
# weights vector, does not need to be normalized
weights=[1, 2, 3, 1]
# weighted mean
G, iter, conv = powerMean(Pset, 0.5; w=weights)
# print the convergence at all iterations
G, iter, conv = powerMean(Pset, 0.5; w=weights, ⍰=true)
# suppose 𝐏 has changed a bit; initialize with G to hasten convergence
Pset[1]=ℍ(Pset[1]+(randP(3)/100))
G, iter, conv = powerMean(Pset, 0.5; w=weights, ⍰=true, init=G)
# estimate how much you gain running the algorithm in multi-threaded mode
using BenchmarkTools
Pset=randP(20, 120)
@benchmark(powerMean(Pset, 0.5)) # single-threaded
@benchmark(powerMean(Pset, 0.5; ⏩=true)) # multi-threaded
Tangent Space operations
Function | Description |
---|---|
logMap | Logarithmic map (from manifold to tangent space) |
expMap | Exponential map (from tangent space to manifold) |
vecP | vectorization of matrices in the tangent space |
matP | matrization of matrices in the tangent space (inverse of vecP) |
⋅
PosDefManifold.logMap
— Function.logMap(metric::Metric, P::ℍ{T}, G::ℍ{T}) where T<:RealOrComplex
Logarithmic Map: map a positive definite matrix $P$ from the SPD or Hermitian manifold into the tangent space at base-point $G$ using the Fisher metric.
$P$ and $G$ must be flagged as Hermitian. See typecasting matrices.
The map is defined as
$Log_G(P)=S=G^{1/2}\textrm{log}\big(G^{-1/2}PG^{-1/2}\big)G^{1/2}$.
The result is an Hermitian matrix. The inverse operation is expMap.
Arguments (metric, P, G):
- metric is a metric of type Metric::Enumerated type,
- $P$ is the positive definite matrix to be projected onto the tangent space,
- $G$ is the tangent space base point.
Currently only the Fisher metric is supported for tangent space operations.
See also: vecP.
Examples
using PosDefManifold
P=randP(3)
Q=randP(3)
metric=Fisher
G=mean(metric, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(metric, P, G)
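# expMap inverts logMap: the round trip returns the original matrix
# (a minimal check continuing the example above)
using LinearAlgebra
P2=expMap(metric, S, G)
norm(P-P2) # ≈ 0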
PosDefManifold.expMap
— Function.expMap(metric::Metric, S::ℍ{T}, G::ℍ{T}) where T<:RealOrComplex
Exponential Map: map an Hermitian matrix $S$ from the tangent space at base point $G$ into the SPD or Hermitian manifold (using the Fisher metric).
$S$ and $G$ must be flagged as Hermitian. See typecasting matrices.
The map is defined as
$Exp_G(S)=P=G^{1/2}\textrm{exp}\big(G^{-1/2}SG^{-1/2}\big)G^{1/2}$.
The result is a positive definite matrix. The inverse operation is logMap.
Arguments (metric, S, G):
- metric is a metric of type Metric::Enumerated type,
- $S$ is an Hermitian matrix, real or complex, to be projected on the SPD or Hermitian manifold,
- $G$ is the tangent space base point.
Currently only the Fisher metric is supported for tangent space operations.
Examples
using PosDefManifold, LinearAlgebra
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P on the tangent space at the Fisher mean base point G
S=logMap(Fisher, P, G)
# adding the identity in the tangent space and reprojecting back onto the manifold
H=expMap(Fisher, ℍ(S+I), G)
PosDefManifold.vecP
— Function.vecP(S::ℍ{T}) where T<:RealOrComplex
Vectorize a tangent vector (which is an Hermitian matrix) $S$: mat -> vec.
It gives weight $1$ to diagonal elements and √2 to off-diagonal elements (Barachant et al., 2012)🎓.
The result is a vector holding $n(n+1)/2$ elements, where $n$ is the size of $S$.
$S$ must be flagged as Hermitian. See typecasting matrices.
The inverse operation is provided by matP.
Examples
using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P at the base point given by the geometric mean of P and Q
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
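# matP reverses the vectorization (a minimal check continuing the example)
using LinearAlgebra
S2=matP(v)
norm(S-S2) # ≈ 0
length(v)==size(S, 1)*(size(S, 1)+1)÷2 # true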
PosDefManifold.matP
— Function.matP(ς::Vector{T}) where T<:RealOrComplex
Matrizize a tangent vector (vector) ς : vec -> mat.
This is the function reversing the vecP function, thus the weighting applied therein is reversed as well.
If $ς=vecP(S)$ and $S$ is an $n⋅n$ Hermitian matrix, $ς$ is a tangent vector of size $n(n+1)/2$. The result of calling matP(ς) is then the $n⋅n$ matrix $S$.
To Do: This function needs to be rewritten more efficiently.
Examples
using PosDefManifold
P=randP(3)
Q=randP(3)
G=mean(Fisher, P, Q)
# projecting P onto the tangent space at the Fisher mean base point
S=logMap(Fisher, P, G)
# vectorize S
v=vecP(S)
# Rotate the vector by an orthogonal matrix
n=Int(size(S, 1)*(size(S, 1)+1)/2)
U=randU(n)
z=U*v
# Get the point in the tangent space
S=matP(z)
Procrustes problems
Function | Description |
---|---|
procrustes | Solution to the Procrustes problem in the manifold of positive definite matrices |
⋅
PosDefManifold.procrustes
— Function.procrustes(P::ℍ{T}, Q::ℍ{T}, extremum="min") where T<:RealOrComplex
Given two positive definite matrices $P$ and $Q$, return by default the solution of problem
$\textrm{argmin}_Uδ(P,U^*QU)$,
where $U$ varies over the set of unitary matrices and $δ(.,.)$ is a distance or divergence function.
$U^*QU$ is named in physics the unitary orbit of $Q$.
If the argument extremum is passed as "max", it returns instead the solution of
$\textrm{argmax}_Uδ(P,U^*QU)$.
$P$ and $Q$ must be flagged as Hermitian. See typecasting matrices.
As it has been shown in Bhatia and Congedo (2019)🎓, using each of the Fisher, logdet zero, Wasserstein and the Kullback-Leibler divergence (see logdet α), the best approximant to $P$ from the unitary orbit of $Q$ commutes with $P$ and, surprisingly, has the same closed-form expression, namely
$U_Q^↓U_P^{↓*}$ for the argmin and $U_Q^↑U_P^{↓*}$ for the argmax,
where $U^↓$ denotes the eigenvector matrix of the subscript argument with eigenvectors in columns sorted by decreasing order of corresponding eigenvalues and $U^↑$ denotes the eigenvector matrix of the subscript argument with eigenvectors in columns sorted by increasing order of corresponding eigenvalues.
The same solutions have long been known; they are also obtained by solving the extremal problem here above using the Euclidean metric (Umeyama, 1988).
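As a quick sanity check of the argmin solution, its unitary orbit should bring $Q$ at least as close to $P$ as a random unitary does (a minimal sketch, using PosDefManifold's randU generator of random unitary matrices):
using PosDefManifold, LinearAlgebra
P=randP(3)
Q=randP(3)
U=procrustes(P, Q)
V=randU(3)
distance(Fisher, P, ℍ(U'*Q*U)) <= distance(Fisher, P, ℍ(V'*Q*V)) # expected true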
Examples
using PosDefManifold
P=randP(3)
Q=randP(3)
# argmin problem
U=procrustes(P, Q)
# argmax problem
V=procrustes(P, Q, "max")