Grafting models together

Much of scientific thought and scholarship involves reading papers and textbooks with different models and synthesizing a novel model from components found in existing, well studied models.

We refer to this as model grafting, where a component is taking from one model and grafted onto another model. SemanticModels supports automating much of the low level code details from this task to free scientists to think about what features to combine instead of the mechanical aspects of changing the code by hand.

This notebook is an example based on the SEIR model and the ScalingModel examples in the epirecipes cookbook.

In [17]:
using Pkg
try
    using DifferentialEquations
catch
    Pkg.add("DifferentialEquations")
end
using DifferentialEquations
using SemanticModels.Parsers
using SemanticModels.ModelTools
using SemanticModels.ModelTools.ExpODEModels

Loading the original model

We use parsefile to load the model into an expression. The original model is an SEIR model which has 4 states suceptible, exposed, infected, and recovered. It has parameters $\beta, \gamma, \mu, \sigma$.

In [2]:
expr1 = parsefile("../examples/epicookbook/src/SEIRmodel.jl")
model1 = model(ExpODEModel, expr1)
matches = Expr[:(ODEProblem(seir_ode, init, tspan, pram))]
Out[2]:
ExpODEModel(
  calls=Expr[:(ODEProblem(seir_ode, init, tspan, pram))],
  funcs=Expr[:(function seir_ode(dY, Y, p, t)
      #= none:22 =#
      β = p[1]
      #= none:24 =#
      σ = p[2]
      #= none:26 =#
      γ = p[3]
      #= none:28 =#
      μ = p[4]
      #= none:31 =#
      S = Y[1]
      #= none:33 =#
      E = Y[2]
      #= none:35 =#
      I = Y[3]
      #= none:39 =#
      dY[1] = (μ - β * S * I) - μ * S
      #= none:40 =#
      dY[2] = β * S * I - (σ + μ) * E
      #= none:41 =#
      dY[3] = σ * E - (γ + μ) * I
  end)],
  variables=NamedTuple{(:state, :flux, :params),Tuple{Array{Expr,1},Array{Expr,1},Array{Expr,1}}}[(state = [:(S = Y[1]), :(E = Y[2]), :(I = Y[3])], flux = [:(dY[1] = (μ - β * S * I) - μ * S), :(dY[2] = β * S * I - (σ + μ) * E), :(dY[3] = σ * E - (γ + μ) * I)], params = [:(β = p[1]), :(σ = p[2]), :(γ = p[3]), :(μ = p[4])])],
  domains=Array{Expr,1}[[:(tspan = (0.0, 365.0))]],  values=Array{Expr,1}[[:(init = [0.8, 0.1, 0.1])]]
)
In [3]:
module1 = eval(model1.expr);

Running our baseline model

The code that defines the baseline model creates a module for that model to run in. This ensures that the code will not have unintended sideeffects when run in a julia process with other models. The entrypoint to this module is called main, which is a function that has no arguments and does the setup and execution of the model.

In [4]:
module1.main()
Out[4]:
retcode: Success
Interpolation: Automatic order switching interpolation
t: 36-element Array{Float64,1}:
   0.0                
   0.10908965469240953
   0.7788973072400299 
   2.0022043858052707 
   3.5836819370527637 
   5.630585941025178  
   8.113563799991052  
  11.002003751843638  
  14.108921347431746  
  17.26516081390395   
  20.382340482941302  
  23.434317772415014  
  26.421905487182087  
   ⋮                  
  98.68338628282227   
 111.73064653652324   
 125.97364242960741   
 141.73749114325807   
 159.38449250156185   
 179.5093858713896    
 203.0093666615227    
 231.2727928054646    
 265.23704493400595   
 301.6805130563528    
 340.11784104181737   
 365.0                
u: 36-element Array{Array{Float64,1},1}:
 [0.8, 0.1, 0.1]                     
 [0.787674, 0.112133, 0.0998292]     
 [0.716349, 0.181813, 0.0992542]     
 [0.602347, 0.290963, 0.10005]       
 [0.478968, 0.405116, 0.103913]      
 [0.34972, 0.518519, 0.112397]       
 [0.22959, 0.615065, 0.126132]       
 [0.131688, 0.681596, 0.144491]      
 [0.0664888, 0.710609, 0.164668]     
 [0.0304235, 0.709248, 0.183754]     
 [0.0130267, 0.690266, 0.200172]     
 [0.00535984, 0.663512, 0.213537]    
 [0.00217348, 0.634412, 0.224027]    
 ⋮                                   
 [0.000140107, 0.191784, 0.153901]   
 [0.000163973, 0.154595, 0.12976]    
 [0.000196473, 0.122252, 0.106471]   
 [0.000241558, 0.0943659, 0.0846786] 
 [0.000305431, 0.0707189, 0.0649601] 
 [0.00039883, 0.0510067, 0.0476593]  
 [0.000540922, 0.0349605, 0.033006]  
 [0.000767451, 0.0223529, 0.0211502] 
 [0.00113172, 0.0132328, 0.0124128]  
 [0.00164268, 0.00769449, 0.00707349]
 [0.00231435, 0.0044644, 0.00397959] 
 [0.00281559, 0.00319176, 0.00277793]

seir_ode is the name of the function we want to modify an ODEProblem is defined by the right hand side of the equation. $du/dt = f(u, t)$

The ScalingModel provides a population growth component that we want to graft onto the SEIR model to create an SEIR model with population dynamics. We load that model from its source file. You can inspect this file to see the definition of $dS/dt = r * (1 - S / K) * S - \beta * S * I$ which includes a population growth rate parameter $r$.

In [5]:
expr2 = parsefile("../examples/epicookbook/src/ScalingModel.jl")
Out[5]:
:(module ScalingModel
  #= none:16 =#
  #= none:17 =#
  using DifferentialEquations
  #= none:19 =#
  function micro_1(du, u, parms, time)
      #= none:27 =#
      (β, r, μ, K, α) = parms
      #= none:28 =#
      (S, I) = u
      #= none:29 =#
      dS = r * (1 - S / K) * S - β * S * I
      #= none:30 =#
      dI = β * S * I - (μ + α) * I
      #= none:31 =#
      du = [dS, dI]
  end
  #= none:39 =#
  function main()
      #= none:40 =#
      w = 1
      #= none:41 =#
      m = 10
      #= none:42 =#
      β = 0.0247 * m * w ^ 0.44
      #= none:43 =#
      r = 0.6 * w ^ -0.27
      #= none:44 =#
      μ = 0.4 * w ^ -0.26
      #= none:45 =#
      K = 16.2 * w ^ -0.7
      #= none:46 =#
      α = (m - 1) * μ
      #= none:49 =#
      parms = [β, r, μ, K, α]
      #= none:50 =#
      init = [K, 1.0]
      #= none:51 =#
      tspan = (0.0, 10.0)
      #= none:53 =#
      sir_prob = ODEProblem(micro_1, init, tspan, parms)
      #= none:55 =#
      sir_sol = solve(sir_prob)
      #= none:56 =#
      return sir_sol
  end
  end)

Once the ASTs are processed into a structured representation we can manipulate with regular julia code, we are able to write manipulations of the models that operate on a higher level than textual changes to the code.

In [6]:
model2 = model(ExpODEModel, expr2)
fluxes(x::ExpODEModel) = x.variables[1].flux
matches = Expr[:(ODEProblem(micro_1, init, tspan, parms))]
Out[6]:
fluxes (generic function with 1 method)

Find the expression we want to graft vital dynamics S rate expression

In [7]:
fluxvar = fluxes(model2)[1].args[2].args[1]
popgrowth = replacevar(findassign(model2.funcs[1], fluxvar)[1], :K, :N).args[2].args[2]
ex = model1.variables[1].flux[1]
ex.args[2] = :($(popgrowth)+$(ex.args[2]))
Out[7]:
:(r * (1 - S / N) * S + ((μ - β * S * I) - μ * S))

define N as the sum of the entries of Y ie. S+E+I

In [8]:
@assert argslist(:(function foo(x, y); return x+y; end)) == [:foo, :x, :y]
In [9]:
pushfirst!(bodyblock(model1.funcs[1]), :(N = sum(Y)))
Out[9]:
21-element Array{Any,1}:
 :(N = sum(Y))                                             
 :(#= none:22 =#)                                          
 :(β = p[1])                                               
 :(#= none:24 =#)                                          
 :(σ = p[2])                                               
 :(#= none:26 =#)                                          
 :(γ = p[3])                                               
 :(#= none:28 =#)                                          
 :(μ = p[4])                                               
 :(#= none:31 =#)                                          
 :(S = Y[1])                                               
 :(#= none:33 =#)                                          
 :(E = Y[2])                                               
 :(#= none:35 =#)                                          
 :(I = Y[3])                                               
 :(#= none:39 =#)                                          
 :(dY[1] = r * (1 - S / N) * S + ((μ - β * S * I) - μ * S))
 :(#= none:40 =#)                                          
 :(dY[2] = β * S * I - (σ + μ) * E)                        
 :(#= none:41 =#)                                          
 :(dY[3] = σ * E - (γ + μ) * I)                            

we need to add a new paramter to the function we are going to define this signature doesn't match the old signature so we are going to do some surgery on the main function to add that parameter this parameter instead could be added to the vector of parameters.

In [10]:
pusharg!(model1.funcs[1], :r)
# gensym gives us a unique name for the new function
g = gensym(argslist(model1.funcs[1])[1])
argslist(model1.funcs[1])[1] = g
Out[10]:
Symbol("##seir_ode#361")

Model Augmentations often require new parameters

When we add the population growth term to the SEIR model, we introduce a new parameter $r$ that needs to be supplied to the model. One problem with approaches that require scientists to modify source code is the fact that adding the new features necessitates changes to the APIs provided by the original author. SemanticModels.ModelTools provides a higher level API for making these changes that assist in propagating the necessary changes to the API.

For example, in this code we need to add an argument to the entrypoint function main and provide an anonymous function that conforms to the API that DifferentialEquations expects from its inputs.

In [11]:
mainx = findfunc(model1.expr, :main)[end]
pusharg!(mainx, :λ)
Out[11]:
:(function main(λ)
      #= none:46 =#
      pram = [520 / 365, 1 / 60, 1 / 30, 774835 / (65640000 * 365)]
      #= none:48 =#
      init = [0.8, 0.1, 0.1]
      #= none:49 =#
      tspan = (0.0, 365.0)
      #= none:51 =#
      seir_prob = ODEProblem(seir_ode, init, tspan, pram)
      #= none:53 =#
      sol = solve(seir_prob)
      #= none:54 =#
      return sol
  end)

An ODEProblem expects the user to provide a function $f(du, u, p, t)$ which takes the current fluxes, current system state, parameters, and current time as its arguments and updates the value of du. Since our new function g does not satisfy this interface, we need to introduce a wrapper function that does.

Here is an instance where having a smart compiler helps julia. In many dynamic languages where this kind of metaprogramming would be easy, the runtime is not smart enough to inline these anonymous functions, which means that there is additional runtime performance overhead to metaporgramming like this. Julia's compiler (and LLVM) can inline these functions which drastically reduces that overhead.

In [12]:
setarg!(model1.calls[end], :seir_ode, :((du,u,p,t)->$g(du,u,p,t,λ)))
@show model1.expr
NewModule = eval(model1.expr)
model1.expr = :(module SEIRmodel
  #= none:16 =#
  #= none:17 =#
  using DifferentialEquations
  #= none:20 =#
  function ##seir_ode#361(dY, Y, p, t, r)
      N = sum(Y)
      #= none:22 =#
      β = p[1]
      #= none:24 =#
      σ = p[2]
      #= none:26 =#
      γ = p[3]
      #= none:28 =#
      μ = p[4]
      #= none:31 =#
      S = Y[1]
      #= none:33 =#
      E = Y[2]
      #= none:35 =#
      I = Y[3]
      #= none:39 =#
      dY[1] = r * (1 - S / N) * S + ((μ - β * S * I) - μ * S)
      #= none:40 =#
      dY[2] = β * S * I - (σ + μ) * E
      #= none:41 =#
      dY[3] = σ * E - (γ + μ) * I
  end
  #= none:44 =#
  function main(λ)
      #= none:46 =#
      pram = [520 / 365, 1 / 60, 1 / 30, 774835 / (65640000 * 365)]
      #= none:48 =#
      init = [0.8, 0.1, 0.1]
      #= none:49 =#
      tspan = (0.0, 365.0)
      #= none:51 =#
      seir_prob = ODEProblem(((du, u, p, t)->begin
                      #= In[12]:1 =#
                      ##seir_ode#361(du, u, p, t, λ)
                  end), init, tspan, pram)
      #= none:53 =#
      sol = solve(seir_prob)
      #= none:54 =#
      return sol
  end
  end)
WARNING: replacing module SEIRmodel.
Out[12]:
Main.SEIRmodel

Modeling configuration

The following code sets up our modeling configuration with initial conditions and parameters. It represents the entry point to solving the model.

In [13]:
findfunc(model1.expr, :main)[end]
Out[13]:
:(function main(λ)
      #= none:46 =#
      pram = [520 / 365, 1 / 60, 1 / 30, 774835 / (65640000 * 365)]
      #= none:48 =#
      init = [0.8, 0.1, 0.1]
      #= none:49 =#
      tspan = (0.0, 365.0)
      #= none:51 =#
      seir_prob = ODEProblem(((du, u, p, t)->begin
                      #= In[12]:1 =#
                      ##seir_ode#361(du, u, p, t, λ)
                  end), init, tspan, pram)
      #= none:53 =#
      sol = solve(seir_prob)
      #= none:54 =#
      return sol
  end)

Solving the new model

Once we have changed the function seir_ode and adapted the API of main to suit we can do a parameter sweep over our new parameter by solving the problem with different values of $\lambda$.

In [14]:
newsol = NewModule.main(1)
Out[14]:
retcode: Success
Interpolation: Automatic order switching interpolation
t: 163-element Array{Float64,1}:
   0.0               
   0.1092377929325919
   0.5971081626945428
   1.4186733581080038
   2.4842675713561735
   3.912050145355999 
   5.651305946254539 
   7.522850303664864 
   9.477527594092548 
  11.475734003298248 
  13.506373079979856 
  15.003733501326096 
  16.462011147219222 
   ⋮                 
 332.03048615924314  
 334.3479146642967   
 336.74981679363106  
 339.248125496605    
 341.85694759349093  
 344.59311312133815  
 347.47698112331796  
 350.53353758565686  
 353.7938160297743   
 357.296626450204    
 361.0904045231589   
 365.0               
u: 163-element Array{Array{Float64,1},1}:
 [0.8, 0.1, 0.1]                
 [0.80543, 0.112287, 0.0998291] 
 [0.839048, 0.167952, 0.0993468]
 [0.927685, 0.267331, 0.0995929]
 [1.09971, 0.415349, 0.102038]  
 [1.43634, 0.67265, 0.10979]    
 [2.0368, 1.15226, 0.128791]    
 [2.95475, 2.07651, 0.168461]   
 [4.15675, 4.02508, 0.250589]   
 [5.06295, 8.23097, 0.424469]   
 [3.99389, 15.7968, 0.783853]   
 [1.80163, 21.2897, 1.20212]    
 [0.369301, 23.5185, 1.6834]    
 ⋮                              
 [2.11608e-5, 2.62107, 1.77906] 
 [2.1867e-5, 2.5217, 1.74222]   
 [2.26737e-5, 2.42268, 1.70311] 
 [2.35999e-5, 2.32382, 1.66167] 
 [2.46708e-5, 2.2249, 1.61781]  
 [2.59194e-5, 2.12568, 1.57141] 
 [2.7391e-5, 2.02591, 1.52232]  
 [2.91485e-5, 1.92527, 1.47034] 
 [3.12825e-5, 1.82345, 1.41522] 
 [3.39272e-5, 1.72007, 1.35666] 
 [3.7289e-5, 1.61473, 1.29428]  
 [4.1397e-5, 1.51294, 1.23143]  

Parameter Estimation

Adding a capability to a model usually introduces additional parameters that must be chosen. Analyzing a model requires developing a procedure for estimating those parameters or characterizing the effect of that parameter on the behavior of the modeled system.

Here we sweep over population growth rates to show what happens when the population growth rate changes.

In [15]:
scalegrowth(λ=1.0) = NewModule.main(λ)
Out[15]:
scalegrowth (generic function with 2 methods)
In [16]:
println("S\tI\tR")
for λ in [1.0,1.1,1.2,1.3,1.4,1.5]
    S,I,R = scalegrowth(λ)(365)
    println("$S\t$I\t$R")
end
S	I	R
4.139701921020379e-5	1.512940651801661	1.231428423912252
3.319429455761791e-5	1.7926581461029392	1.4394890713180382
2.7307348686254232e-5	2.096234031969224	1.6601782666955913
2.2812994101729267e-5	2.42499677172515	1.8937237260018192
1.9442941657092004e-5	2.778355142054837	2.1389525292315175
1.6952140781667204e-5	3.1545281626311956	2.39379350387796

It Works!

This simulation allows an epidemiologist to examine the effects of population growth on an SEIR disease outbreak. A brief analysis of this simulation shows that as you increase the population growth rate, you increase the final population of infected people. More sophisticated analysis could be employed to show something more interesting about this model.

We have shown how you can use SemanticModels.jl to combine features of various ODE systems and solve them with a state of the art solver to increase the capabilities of a code that implements a scientific model. We call this combination process grafting and believe that it supports a frequent use case of scientific programming.