In [1]:
using Pkg
Pkg.activate("Algebra")
try 
    using Polynomials
catch
    Pkg.add("Polynomials")
end

using Polynomials

Polynomial Ring of Transformations

One goal of SemanticModels is to lift model augmentation into the world of category theory and algebra. This is analogous to a long tradition in the symbolification of mathematics, for example geometry is the study of shapes and their relations, which gets lifted to group theory by the algebraists who want to to make everything symbolic reasoning.

He we will represent a model class monomial regression with a data structure MonomialRegression and define a set group of transformations that can act on models from this class.

In [2]:
using SemanticModels
using SemanticModels.ModelTools
using SemanticModels.ModelTools.Transformations
using SemanticModels.ModelTools.MonomialRegressionModels

Monomial Regression

Here we take our class of models to be monomial regression and write a script that implements an experiment with this model.

  1. Define a function $f(a, x) = ax^p$ for some integer $p$
  2. Generate data from the distribution $x, f(a, x)+Normal(0,\sigma)$
  3. Given the sample data, use least squares to solve for $a$.

Note that the value of $p$ when we generate the data does not need to match the value of $p$ when we solve the least squares problem.

Here is an implementation of our model.

In [3]:
expr = quote
    module Regression
    using Random
    mid(a,b) = (a+b)/2
    function f(x, a)
        y = first(a)*x^0
        return y
    end

    function optimize(l, interval)
        left = interval[1]
        right = interval[2]
        if left > right - 1e-8
            return (interval)
        end
        midp = (left+right)/2
        if l(mid(left, midp)) > l(mid(midp, right))
            return optimize(l, (midp, right))
        else
            return optimize(l, (left, midp))
        end
    end

    function sample(g::Function, n::Int)
        x = randn(Float64, n)
        target = g.(x) .+ randn(Float64, n)./8
        return x, target
    end


    #setup

    Random.seed!(42)
    a = 1/2
    n = 10
    g(x) = a*(1 + x^1 + x^2)
    x, target = sample(g, n)
    loss(a) = sum((f.(x, a) .- target).^2)
    a = [-1, 1]
    # @show loss.([-1,-1/2,-1/4, 0, 1/4,1/3,1/2,2/3, 1])

    #solving
    ahat = optimize(loss, a)

    end
end;

Transforming the model

Valid transformations for this model include increasing and decreasing the power of the monomial by 1. And you can apply those transformations repeatedly.

Mathematically we can go from $f(a,x) = ax^p$ to $f(a,x) = ax^{p+1}$ or $f(a,x) = ax^{p-1}$. If we allow repeated application of these two generator transformations as a free monoid, we find that the set of transformations we get is isomorphic to the group $(Z, +, 0)$ which is the integers with addition. In Julia, we have decided to represent elements from this transformation group with a struct Pow{Int} <: Transformation. The composition operation for these transformations can be found in the source file groups.jl.

Composition of Pow{Int} maps naturally to addition of the power to transform with. You can see how transformations are applied below.

In [4]:
m = model(MonomialRegressionModel, deepcopy(expr))
@show bodyblock(m.f)
bodyblock(m.f) = Any[:(#= In[3]:6 =#), :(y = first(a) * x ^ 0), :(#= In[3]:7 =#), :(return y)]
Out[4]:
4-element Array{Any,1}:
 :(#= In[3]:6 =#)       
 :(y = first(a) * x ^ 0)
 :(#= In[3]:7 =#)       
 :(return y)            

With this machinery in place we can think about the action of $Z$ on a MonomialRegression model to create a family of models with different powers of monomoial. Each monomial model finds the best fit to a particular data set, and by leveraging this group action, we can systematically explore the space of all models in the class.

In [5]:
eval(m.expr);
 = Regression.ahat
Out[5]:
(0.9999999925494194, 1)
In [6]:
function (p::Poly)(m::MonomialRegressionModel)
    if m.expr.head == :module
        # we need to add the using statement to get polyval defined
        pushfirst!(m.expr.args[3].args, :(using Polynomials))
    end
    bodyblock(m.f)[end-2] = Parsers.replace(bodyblock(m.f)[end-2], :(x^0), :(polyval($p,x)))
end
Poly([1,0,1])(m)
m
Out[6]:
MonomialRegressionModel(
  f=:(function f(x, a)
      #= In[3]:6 =#
      y = first(a) * polyval(Poly(1 + x^2), x)
      #= In[3]:7 =#
      return y
  end),
  objective=:loss,
  coefficient=:a,
  interval=:(a₀ = [-1, 1]))
In [7]:
eval(m.expr);
 = Regression.ahat
WARNING: replacing module Regression.
Out[7]:
(0.42221833020448685, 0.42221833765506744)

Chosing the right polynomial

Given a fixed dataset, we want to loop over all the models in a class and fit the best coefficient, then we will decide which polynomial allows for the best fit to this data.

In [8]:
m = model(MonomialRegressionModel, deepcopy(expr))
# println(m)
# sol = eval(m.expr)
# @show sol.ahat
# @show sol.loss(sol.ahat[1])
results = Any[]
for i in 0:1
    for j in 0:1
        for k in 0:2
            m = model(MonomialRegressionModel, deepcopy(expr))
            (i*Poly([1]) + j*Poly([0, 1]) + k*Poly([0,0,1]))(m)
            @show m.f.args[2].args[2]
            sol = eval(m.expr)
            p = bodyblock(m.f)
            ahat = @show sol.ahat[1]
            lhat = @show sol.loss(sol.ahat[1])
            push!(results, (i, ahat, lhat, p))
        end   
    end
end
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(0), x))
sol.ahat[1] = -1
WARNING: replacing module Regression.
sol.loss(sol.ahat[1]) = 20.13038665673438
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(x^2), x))
sol.ahat[1] = 0.5020807608962059
sol.loss(sol.ahat[1]) = 4.592980186575482
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(2*x^2), x))
sol.ahat[1] = 
WARNING: replacing module Regression.
0.25104037672281265
sol.loss(sol.ahat[1]) = 4.592980186575492
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(x), x))
WARNING: replacing module Regression.
sol.ahat[1] = -0.13085374981164932
sol.loss(sol.ahat[1]) = 19.902665298586488
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(x + x^2), x))
sol.ahat[1] = 
WARNING: replacing module Regression.
0.6143628656864166
sol.loss(sol.ahat[1]) = 2.187454550179506
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(x + 2*x^2), x))
WARNING: replacing module Regression.
sol.ahat[1] = 0.29335421323776245
sol.loss(sol.ahat[1]) = 2.484606630021387
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(1), x))
sol.ahat[1] = 
WARNING: replacing module Regression.
0.9999999925494194
sol.loss(sol.ahat[1]) = 9.069784234818242
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(1 + x^2), x))
WARNING: replacing module Regression.
sol.ahat[1] = 0.42221833020448685
sol.loss(sol.ahat[1]) = 2.618319243042377
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(1 + 2*x^2), x))
sol.ahat[1] = 
WARNING: replacing module Regression.
0.23381679505109787
sol.loss(sol.ahat[1]) = 3.1968217851927285
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(1 + x), x))
WARNING: replacing module Regression.
sol.ahat[1] = 0.48516687750816345
sol.loss(sol.ahat[1]) = 15.865756420964225
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(1 + x + x^2), x))
sol.ahat[1] = 
WARNING: replacing module Regression.
0.5032734870910645
sol.loss(sol.ahat[1]) = 0.13228245782573908
((m.f).args[2]).args[2] = :(y = first(a) * polyval(Poly(1 + x + 2*x^2), x))
WARNING: replacing module Regression.
sol.ahat[1] = 0.26868781447410583
sol.loss(sol.ahat[1]) = 1.1389708980451678
WARNING: replacing module Regression.
In [9]:
using Printf
fmt(i::Int) = i
fmt(f::Real) = Printf.@sprintf "% .4f" round(f, sigdigits=5)
fmt(f::Any) = f[end-2].args[2].args[3].args[2]
println("\nResults:\n\n\ni\tâ         ℓ⋆ \t\tpoly\n----------------------------------------------")
for r in results
    println(join(fmt.(r), "     "))
end
best = sort(results, by=x->x[end-1])[1][2:end]
println("Model order $(fmt(best[end])) is the best with a=$(fmt(best[1])) and loss $(fmt(best[2]))")
Results:


i	â         ℓ⋆ 		poly
----------------------------------------------
0     -1      20.1300     Poly(0)
0      0.5021      4.5930     Poly(x^2)
0      0.2510      4.5930     Poly(2*x^2)
0     -0.1308      19.9030     Poly(x)
0      0.6144      2.1875     Poly(x + x^2)
0      0.2933      2.4846     Poly(x + 2*x^2)
1      1.0000      9.0698     Poly(1)
1      0.4222      2.6183     Poly(1 + x^2)
1      0.2338      3.1968     Poly(1 + 2*x^2)
1      0.4852      15.8660     Poly(1 + x)
1      0.5033      0.1323     Poly(1 + x + x^2)
1      0.2687      1.1390     Poly(1 + x + 2*x^2)
Model order Poly(1 + x + x^2) is the best with a= 0.5033 and loss  0.1323

Conclusions

We can see that this sweep over polynomials, recovers the right model. We can see some fundamental algebra at work here.

This example shows that you can represent the transformations on a model as a group and the action of the group on a model is the way to represent transformations on models. Insights from the algebraic structures homormorphic/isomorphic to the algebra of transformations provide insights on the class of models.

In [10]:
div(Poly([1,2,3])*Poly([0,1]), Poly([0,1]))
Out[10]:
1.0 + 2.0∙x + 3.0∙x^2
In [11]:
divrem((Poly([1,2,3])*Poly([0,1,2])), Poly([1,1,2,1]))
Out[11]:
(Poly(-5.0 + 6.0*x), Poly(5.0 + 8.0*x^2))
In [12]:
gcd((Poly([1,2,3])*Poly([1,3,4])), Poly([1,2,3]))
Out[12]:
1.0 + 2.0∙x + 3.0∙x^2