GMMParameterEstimation.jl Documentation

GMMParameterEstimation.jl is a package for estimating the parameters of Gaussian k-mixture models using the method of moments. It works for general k with known mixing coefficients, and for k=2,3,4 for unknown mixing coefficients.

Example

The following code snippet will use the given moments to return an estimate of the parameters using the method of moments.

using GMMParameterEstimation
d = 3
k = 2
first_moments = [1.0, 0.980, 1.938, 3.478, 8.909, 20.526, 64.303]
diagonal_moments = [-0.580 5.682 -11.430 97.890 -341.161; -0.480 1.696 -2.650 11.872 -33.239]
off_diag_system = Dict{Vector{Int64}, Float64}([0, 1, 2] => -1.075, [1, 0, 1] => -0.252, [1, 2, 0] => 6.021, [1, 0, 2] => 1.117, [1, 1, 0] => -0.830, [0, 1, 1] => 0.884)
pass, (mixing_coefficients, means, covariances) = estimate_parameters(d, k, first_moments, diagonal_moments, off_diag_system)

$\\~\\$

Parameter estimation

The main functionality of this package stems from

GMMParameterEstimation.estimate_parametersFunction
estimate_parameters(d::Integer, k::Integer, first::Vector{Float64}, second::Matrix{Float64}, last::Dict{Vector{Int64}, Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the unknown mixing coefficient dense covariance matrix case, first should be a list of moments 0 through 3k for the first dimension, second should be a matrix of moments 1 through 2k+1 for the remaining dimensions, and last should be a dictionary of the indices as lists of integers and the corresponding moments. This requires k<= 4.

estimate_parameters(d::Integer, k::Integer, w::Array{Float64}, first::Vector{Float64}, second::Matrix{Float64}, last::Dict{Vector{Int64}, Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the known mixing coefficient dense covariance matrix case, w should be a vector of the mixing coefficients first should be a list of moments 0 through 3k for the first dimension, second should be a matrix of moments 1 through 2k+1 for the remaining dimensions, and last should be a dictionary of the indices as lists of integers and the corresponding moments.

estimate_parameters(d::Integer, k::Integer, first::Vector{Float64}, second::Matrix{Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the unknown mixing coefficient diagonal covariance matrix case, first should be a list of moments 0 through 3k for the first dimension, and second should be a matrix of moments 1 through 2k+1 for the remaining dimensions. This requires k<= 4.

estimate_parameters(d::Integer, k::Integer, w::Array{Float64}, first::Vector{Float64}, second::Matrix{Float64})

Compute an estimate for the parameters of a d-dimensional Gaussian k-mixture model from the moments.

For the known mixing coefficient diagonal covariance matrix case, w should be a vector of the mixing coefficients first should be a list of moments 0 through 3k for the first dimension, and second should be a matrix of moments 1 through 2k+1 for the remaining dimensions..

which computes the parameter recovery using Algorithm 1 from Estimating Gaussian Mixtures Using Sparse Polynomial Moment Systems. Note that the unknown mixing coefficient cases load a set of generic moments and the corresponding solutions to the first 1-D polynomial system from sys1_k2.jld2, sys1_k3.jld2, or sys1_k4.jld2 depending on k.

In one dimension, for a random variable $X$ with density $f$ we define the $i$th moment as $m_i=E[X^i]=\int xf(x)dx$. For a Gaussian mixture model, this results in a polynomial in the parameters. For a sample $\{y_1,y_2,\dots,y_N\}$, we define the $i$th sample moment as $\overline{m_i}=\frac{1}{N}\sum_{j=1}^N y_j^i$. The sample moments approach the true moments as $N\rightarrow\infty$, so by setting the polynomials equal to the empirical moments, we can then solve the polynomial system to recover the parameters.

For a multivariate random variable $X$ with density $f_X$ we define the moments as $m_{i_1,\dots,i_n} = E[X_1^{i_1}\cdots X_n^{i_n}] = \int\cdots\int x_1^{i_1}\cdots x_n^{i_n}f_X(x_1,\dots,x_n)dx_1\cdots dx_n$ and the empirical moments as $\overline{m}_{i_1,\dots,i_n} = \frac{1}{N}\sum_{j=1}^Ny_{j_1}^{i_1}\cdots y_{j_n}^{i_n}$. And again, by setting the polynomials equal to the empirical moments, we can then solve the system of polynomials to recover the parameters. However, choosing which moments becomes more complicated.

$\\~\\$

Generate and sample from Gaussian Mixture Models

Generation

Note that the entries of the resulting covariance matrices are generated from a normal distribution centered at 0 with variance 1.

$\\~\\$

GMMParameterEstimation.generateGaussiansFunction
generateGaussians(d::Integer, k::Integer, diagonal::Bool)

Generate means and covariances for k Gaussians with dimension d.

diagonal should be true for spherical case, and false for dense covariance matrices.

The parameters are returned as a tuple, with weights in a 1D vector, means as a k x d array, and variances as a k x d x d array if diagonal is false or as a list of Diagonal{Float64, Vector{Float64}} if diagonal is true to save memory. Note that each entry of each parameter is generated from a normal distribution centered at 0 with variance 1.

$\\~\\$

GMMParameterEstimation.getSampleFunction

getSample(numb::Integer, w::Vector{Float64}, means::Matrix{Float64}, covariances::Vector)

Generate a Gaussian mixture model sample with numb entries, mixing coefficients w, means means, and covariances covariances.

This relies on the Distributions package.

$\\~\\$

Computing moments

GMMParameterEstimation.sampleMomentsFunction
sampleMoments(sample::Matrix{Float64}, k; diagonal = false)

Use the sample to compute the moments necessary for parameter estimation using method of moments.

Returns moments 0 to 3k for the first dimension, moments 1 through 2k+1 for the other dimensions as a matrix, and a dictionary with indices and moments for the off-diagonal system if diagonal is false.

GMMParameterEstimation.diagonalPerfectMomentsFunction

diagonalPerfectMoments(d, k, w, truemeans, truecovariances)

Use the given parameters to compute the exact moments necessary for parameter estimation with diagonal covariance matrices.

Returns moments 0 to 3k for the first dimension, and moments 1 through 2k+1 for the other dimensions as a matrix.

GMMParameterEstimation.densePerfectMomentsFunction
densePerfectMoments(d, k, w, true_means, true_covariances)

Use the given parameters to compute the exact moments necessary for parameter estimation with dense covariance matrices.

Returns moments 0 to 3k for the first dimension, moments 1 through 2k+1 for the other dimensions as a matrix, and a dictionary with indices and moments for the off-diagonal system.

These expect parameters to be given with weights in a 1D vector, means as a k x d array, and covariances as a k x d x d array for dense covariance matrices or as a list of diagonal matrices for diagonal covariance matrices.

$\\~\\$

Build the polynomial systems

GMMParameterEstimation.build1DSystemFunction
build1DSystem(k::Integer, m::Integer)

Build the polynomial system for a mixture of 1D Gaussians where 'm'-1 is the highest desired moment and the mixing coefficients are unknown.

build1DSystem(k::Integer, m::Integer, a::Union{Vector{Float64}, Vector{Variable}})

Build the polynomial system for a mixture of 1D Gaussians where 'm'-1 is the highest desired moment, and a is a vector of the mixing coefficients.

This uses the usual recursive formula for moments of a univariate Gaussian in terms of the mean and variance, and then takes a convex combination with either variable mixing coefficients or the provided mixing coefficients.

$\\~\\$

GMMParameterEstimation.selectSolFunction
selectSol(k::Integer, solution::Result, polynomial::Expression, moment::Number)

Select a k mixture solution from solution accounting for polynomial and moment.

Sort out a k mixture statistically significant solutions from solution, and return the one closest to moment when polynomial is evaluated at those values.

Statistically significant means has positive variances here. This is used to select which solution from the parameter homotopy will be used.

$\\~\\$

As far as I am aware, the only closed form formula for the mixed dimensional moments of a multivariate Gaussian is that provided by Jo$\~{a}$o M. Pereira, Joe Kileel, and Tamara G. Kolda in Tensor Moments of Gaussian Mixture Models: Theory and Applications. However, the tensor moments are indexed in a different way than the multivariate moment notation we used. Let $m_{a_1\cdots a_n}$ be a d-th order multivariate moment and let $M_{i_1\cdots i_d}^{(d)}$ be an entry of the d-th order tensor moment. Then $m_{a_1\cdots a_n}=M_{i_1\cdots i_d}^{(d)}$ where $a_j=|\{i_k=j\}|$. Note that due to symmetry, the indexing of the tensor moment is non-unique. For example, $m_{102} = M_{133}^{(3)}=M_{331}^{(3)}=M_{313}^{(3)}=m_{102}$.

$\\~\\$

GMMParameterEstimation.mixedMomentSystemFunction
mixedMomentSystem(d, k, mixing, ms, vs)

Build a linear system for finding the off-diagonal covariances entries.

For a d dimensional Gaussian k-mixture model with mixing coefficients mixing, means ms, and covariances vs where the diagonal entries have been filled in and the off diagonals are variables.

The final step in our method of moments parameter recovery for non-diagonal covariance matrices is building and solving a system of $N:=\frac{k}{2}(d^2-d)$ linear equations in the same number of unknowns to fill in the off diagonal. The polynomial for $m_{a_1\cdots a_n}$ is linear if all but two $a_i=0$ and at least one $a_1=1$. There are $n^2-n$ of these for each order $\geq2$, so we need these equations for up to $\lceil \frac{k}{2}\rceil$-th order.

Note: the polynomial is still linear when 3 $a_i=1$ and the rest of the $a_i$ are 0 but this complicates generating the system so we did not include those.

Again referring back to Pereira et al. for a closed form method of generating the necessary moment polynomials, we generate the linear system using the already computed mixing coefficients, means, and diagonals of the covariances, and return it as a dictionary of index=>polynomial pairs that can then be matched with the corresponding moments.

Index