Partitioned Least Squares
Linear least squares is one of the most widely used regression methods among scientists in many fields. The simplicity of the model allows it to be used when data is scarce, and it is usually appealing to practitioners who need to gain insight into the problem by inspecting the values of the learnt parameters. PartitionedLS is a variant of the linear least squares model that allows practitioners to partition the input features into groups of variables that are required to contribute similarly to the final result.
The model
The Partitioned Least Squares model is formally defined as:
\[\begin{gather*} \text{minimize}_{\mathbf{\alpha}, \mathbf{\beta}} \| \mathbf{X} \times (\mathbf{P} \circ \mathbf{\alpha}) \times \mathbf{\beta} - \mathbf{y} \|_2^2 \\ \begin{aligned} \quad s.t.\quad &\mathbf{\alpha} \succeq 0\\ &\mathbf{P}^T \times \mathbf{\alpha} = \mathbf{1}. \end{aligned} \end{gather*}\]
where:
- $\mathbf{X}$ is the $N \times M$ data matrix;
- $\mathbf{P}$ is a user-defined partition matrix having $K$ columns (one for each element of the partition), $M$ rows, and containing $1$ in $P_{i,j}$ if the $i$-th attribute belongs to the $j$-th partition and $0$ otherwise;
- $\mathbf{\beta}$ is a vector weighting the importance of each set of attributes in the partition;
- $\mathbf{\alpha}$ is a vector weighting the importance of each attribute within one of the sets in the partition. Note that the constraints imply that for each set in the partition the weights of the corresponding $\alpha$ variables are all positive and sum to $1$.
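To make the roles of the parameters concrete, here is a minimal sketch (with made-up values) of how the model computes its predictions; in Julia, the Hadamard product $\mathbf{P} \circ \mathbf{\alpha}$ is the broadcast P .* α:
X = [1. 2. 3.; 3. 3. 4.]   # N = 2 examples, M = 3 features
P = [1 0; 1 0; 0 1]        # M = 3 features, K = 2 subsets
α = [0.3, 0.7, 1.0]        # non-negative, sums to 1 within each subset
β = [2.0, -1.0]            # one (signed) weight per subset
ŷ = X * (P .* α) * β       # the model's predictions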
The PartitionedLS problem is non-convex and NP-complete. The library provides two algorithms to solve it anyway: an iterative algorithm based on the Alternating Least Squares approach, and an optimal algorithm that is guaranteed to find the global optimum but requires time exponential in the cardinality of the partition (i.e., it is mainly useful when $K$ is small).
More details can be found in the paper Partitioned Least Squares.
To install this library
Just add it as a dependency to your Julia environment. Launch julia from the main directory of your project and enter the following commands:
# Opens the package manager REPL
]
# Activate your local environment (can be skipped if you want to install the library globally)
activate .
# Adds the library to the environment
add PartitionedLS
To use this library
You will need a matrix P describing the partitioning of your variables, e.g.:
P = [[1 0];
[1 0];
[0 1]]
specifies that the first and the second variables belong to the first partition, while the third variable belongs to the second.
You have then the choice to use either the standard interface or the MLJ interface.
Standard interface
The standard interface defines a fit function for each of the implemented algorithms. The function returns a tuple containing:
- a PartLSFitResult object containing the model and the parameters found by the algorithm;
- nothing (this is mandated by the MLJ interface, but it is not used in this case);
- a NamedTuple containing some additional information.
A complete example:
using PartitionedLS
X = [[1. 2. 3.];
[3. 3. 4.];
[8. 1. 3.];
[5. 3. 1.]]
y = [1.;
1.;
2.;
3.]
P = [[1 0];
[1 0];
[0 1]]
# fit using the optimal algorithm
result = fit(Opt, X, y, P, η = 0.0)
# Make predictions on the given data matrix. The function works
# with results returned by any of the solvers.
predict(result[1], X)
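The fitted parameters can be read off the returned PartLSFitResult (its fields are documented in the API section below):
model = result[1]
model.α    # per-feature weights, summing to 1 within each subset
model.β    # per-subset weights
model.t    # the intercept term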
MLJ interface
The MLJ interface allows you to use the library in a more MLJ-like fashion. It is defined by the PartLS model, which can be used in the MLJ framework in the same way as any other MLJ model.
A complete example:
using MLJ
using PartitionedLS
X = [[1. 2. 3.];
[3. 3. 4.];
[8. 1. 3.];
[5. 3. 1.]]
y = [1.;
1.;
2.;
3.]
P = [[1 0];
[1 0];
[0 1]]
# Define the model
model = PartLS(P=P, Optimizer=Opt, η=0.0)
# Fit the model
mach = machine(model, X, y)
fit!(mach)
# Make predictions
predict(mach, X)
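The learnt parameters can also be retrieved through MLJ's standard accessors; for instance (the exact shape of the returned named tuple depends on the library's MLJ glue code):
fitted_params(mach)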
API Documentation
PartitionedLS.PartLS — Type
mutable struct PartLS <: Deterministic
The PartLS struct represents a partitioned least squares model. Fields are:
- Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB.
- P: the partition matrix. It is a binary matrix where each row corresponds to a feature and each column corresponds to a partition. The element P_{i, k} = 1 if feature i belongs to partition k.
- η: the regularization parameter. It controls the strength of the regularization.
- ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.
- T: the maximum number of iterations. It is used to determine when to stop the Alt optimization algorithm. Only used by the Alt algorithm.
- rng: the random number generator to use.
  - If nothing, the global random number generator is used.
  - If an integer, the global random number generator is used after seeding it with the given integer.
  - If an object of type AbstractRNG, the given random number generator is used.
Example
model = PartLS(P=P, Optimizer=Alt, rng=123)
PartitionedLS.PartLSFitResult — Type
struct PartLSFitResult
The PartLSFitResult struct represents the solution of the partitioned least squares problem. It contains the values of the α and β variables, the intercept t and the partition matrix P.
Fields
- α::Vector{Float64}: the values of the α variables. For each partition $k$, the α variables satisfy $\sum_{i \in P_k} \alpha_i = 1$.
- β::Vector{Float64}: the values of the β variables. For each partition $k$, $\beta_k$ is the coefficient that multiplies the features in the $k$-th partition.
- t::Float64: the intercept term of the model.
- P::Matrix{Int64}: the partition matrix. It is a binary matrix where each row corresponds to a feature and each column corresponds to a partition. The element $P_{i, k} = 1$ if feature $i$ belongs to partition $k$.
MLJModelInterface.fit — Function
fit(
::Type{Alt},
X::Matrix{Float64},
y::Vector{Float64},
P::Matrix{Int64};
η,
ϵ,
T,
nnlsalg,
rng
) -> Tuple{PartLSFitResult, Nothing, NamedTuple{(:opt,), <:Tuple{Any}}}
Fits a PartitionedLS model by alternating the optimization of the α and β variables. This version uses an optimization strategy based on non-negative least squares solvers, which makes the alternating steps faster and more numerically stable.
Arguments
- X: $N × M$ matrix describing the examples.
- y: $N$ vector with the output values for each example.
- P: $M × K$ matrix specifying how to partition the $M$ attributes into $K$ subsets. $P_{m,k}$ should be 1 if attribute number $m$ belongs to partition $k$.
- η: regularization factor; higher values imply more regularized solutions. Default is 0.0.
- T: number of alternating loops to be performed. Default is 100.
- ϵ: minimum relative improvement in the objective function before stopping the optimization. Default is 1e-6.
- nnlsalg: specific flavour of the NNLS algorithm to be used; possible values are :pivot, :nnls, :fnnls. Default is :nnls.
Result
A tuple with the following fields:
- a PartLSFitResult object containing the fitted model;
- a nothing object;
- a NamedTuple with a field opt containing the optimal value of the objective function.
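Example
A sketch of a call spelling out the stopping parameters (the values are illustrative):
X = rand(100, 5)
y = rand(100)
P = [1 0; 1 0; 0 1; 0 1; 0 1]
result = fit(Alt, X, y, P, T = 100, ϵ = 1e-6)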
fit(
::Type{Opt},
X::Matrix{Float64},
y::Vector{Float64},
P::Matrix{Int64};
η,
nnlsalg,
returnAllSolutions
) -> Tuple{PartLSFitResult, Nothing, Any}
Fits a PartitionedLS Regression model to the given data and returns the learnt model (see the Result section). It uses a complete enumeration strategy which is exponential in $K$, but is guaranteed to find the optimal solution.
Arguments
- X: $N × M$ matrix describing the examples.
- y: $N$ vector with the output values for each example.
- P: $M × K$ matrix specifying how to partition the $M$ attributes into $K$ subsets. $P_{m,k}$ should be 1 if attribute number $m$ belongs to partition $k$.
- η: regularization factor; higher values imply more regularized solutions (default: 0.0).
- returnAllSolutions: if true, an additional output is appended to the resulting tuple containing all solutions found during the algorithm.
- nnlsalg: the kind of NNLS algorithm to be used during solving. Possible values are :pivot, :nnls, :fnnls (default: :nnls).
Example
X = rand(100, 5)   # M = 5 features, matching the 5 rows of P below
y = rand(100)
P = [1 0 0; 0 1 0; 0 0 1; 1 1 0; 0 1 1]
result = fit(Opt, X, y, P)
fit(
::Type{BnB},
X::Matrix{Float64},
y::Vector{Float64},
P::Matrix{Int64};
η,
nnlsalg
) -> Tuple{PartLSFitResult, Nothing, NamedTuple{(:opt, :nopen), <:Tuple{Any, Int64}}}
Implements the Branch and Bound algorithm to fit a Partitioned Least Squares model.
Arguments
- X: $N × M$ matrix describing the examples.
- y: $N$ vector with the output values for each example.
- P: $M × K$ matrix specifying how to partition the $M$ attributes into $K$ subsets. $P_{m,k}$ should be 1 if attribute number $m$ belongs to partition $k$.
- η: regularization factor; higher values imply more regularized solutions (default: 0.0).
- nnlsalg: the kind of NNLS algorithm to be used during solving. Possible values are :pivot, :nnls, :fnnls (default: :nnls).
Result
A tuple with the following fields:
- a PartLSFitResult object containing the fitted model;
- a nothing object;
- a NamedTuple with fields:
  - opt: the optimal value of the objective function;
  - nopen: the number of open nodes in the branch and bound tree.
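Example
A usage sketch, reusing X, y and P from the Opt example above:
result = fit(BnB, X, y, P)
result[3].opt     # optimal value of the objective function
result[3].nopen   # number of open nodes in the branch and bound tree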
fit(
m::PartLS,
verbosity,
X,
y
) -> Tuple{PartLSFitResult, Nothing, Any}
Fits a PartitionedLS Regression model to the given data and returns the learnt model (see the Result section). It conforms to the MLJ interface.
Arguments
- m: a PartLS model to fit.
- verbosity: the verbosity level.
- X: the data matrix.
- y: the target vector.
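Example
This method is normally invoked through MLJ's machine/fit! workflow (as in the MLJ example above); a direct call would look roughly like this:
model = PartLS(P = P, Optimizer = Opt)
result = fit(model, 0, X, y)   # verbosity = 0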
MLJModelInterface.predict — Function
predict(
α::Vector{Float64},
β::Vector{Float64},
t::Float64,
P::Matrix{Int64},
X::Matrix{Float64}
) -> Vector{Float64}
Result
The prediction for the partitioned least squares problem with solution α, β, t over the dataset X and partition matrix P.
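Example
A usage sketch, unpacking a previously fitted PartLSFitResult into its parameters:
m = result[1]                    # a PartLSFitResult
predict(m.α, m.β, m.t, m.P, X)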
predict(
model::PartLSFitResult,
X::Matrix{Float64}
) -> Vector{Float64}
Makes predictions for the dataset X using the PartitionedLS model model.
Arguments
- model: a PartLSFitResult.
- X: the data containing the examples for which the predictions are sought.
Return
The predictions of the given model on the examples in X.
PartitionedLS.homogeneousCoords — Function
Rewrites X and P in homogeneous coordinates. The result is a tuple (Xo, Po) where Xo is the homogeneous version of X and Po is the homogeneous version of P.
Arguments
- X: the data matrix.
- P: the partition matrix.
Return
- Xo: the homogeneous version of X.
- Po: the homogeneous version of P.
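Example
A usage sketch, assuming positional arguments and following the documented return tuple:
Xo, Po = homogeneousCoords(X, P)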
PartitionedLS.regularizeProblem — Function
Adds regularization terms to the problem. The regularization terms are added to the objective function as a sum of squares of the α variables. The regularization parameter η controls the strength of the regularization.
Arguments
- X: the data matrix.
- y: the target vector.
- P: the partition matrix.
- η: the regularization parameter.
Return
- Xn: the new data matrix.
- yn: the new target vector.
Main idea
K new rows are added to the data matrix X: row $k \in \{1, \dots, K\}$ is a vector of zeros except for the components that correspond to features belonging to the $k$-th partition, which are set to $\sqrt{η}$. The target vector y is extended with K zeros.
The point of this change is that when the objective function is evaluated as $\|Xw - y\|^2$, the new part of the matrix contributes to the loss with a term $η \sum_i w_i^2$. This is equivalent to adding a regularization term to the objective function.
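Example
A usage sketch, assuming all four arguments are positional (the η value is illustrative):
Xn, yn = regularizeProblem(X, y, P, 0.5)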