Typegraphs: teaching the compiler to reason about models

In this notebook, we will be going through a tutorial on how to use the typegraph functionality to extract the relationships between types and functions within programs.

In [1]:
using SemanticModels
using SemanticModels.Parsers
using SemanticModels.ModelTools
import SemanticModels.ModelTools: CallArg, RetArg, Edge, Edges, @typegraph, typegraph

After loading @typegraph in to our workspace, we simply begin the extraction by calling the macro immediately followed by an Expr which will return an edge list which is easily passed through to MetaGraphs.jl to visualize the transformations that are taking place throughout the code.

*To learn more about Expr & metaprogramming, we recommend looking at the offcial Julia docs on the topic.

In [2]:
using SemanticModels.Parsers
import Base: ==

shorttype(T::Type) = T.name.name
Out[2]:
shorttype (generic function with 1 method)

Below we are using our parsefile function to take scripts and wrap them around in module headings that way they can be consumed by our other API's. We then pass off our Exprs to our typegraph function which collects all the type information we would like extracted from the code.

You can see the code that is being loaded here at agentbased2.jl

In [3]:
expr = parsefile("agentbased2.jl")
expr2 = ModelTools.typegraph(expr.args[end])
expr3 = :(module Foo
    using SemanticModels.ModelTools
    import SemanticModels.ModelTools: CallArg, RetArg
    $(expr2.args...) end);

In the above example, we have a simple agent based simulation where we defined new stucts to collect the singltonian type information for our simulation. To reduce the nosie from collecting the iterations throughout the runtime of our model we need to collect the unique calls through our our example which contain the relevant information.

In [4]:
Mod = eval(expr3)
Mod.main(10)
edgelist_symbol = Mod.edgelist

E = unique((f.func, f.args, f.ret) for f in Edges(edgelist_symbol))
E = unique(("\"$(f.func)\"", tuple(shorttype.(f.args)...), shorttype.(f.ret)) for f in Edges(edgelist_symbol))
@show E
newsam.agents = Symbol[:S, :I, :R, :S, :S, :S, :R, :S, :R, :I, :S, :R, :S, :S, :R, :S, :S, :R, :I, :I]
E = Tuple{String,Tuple{Symbol,Vararg{Symbol,N} where N},Symbol}[("\"count\"", (:StateModel, :Symbol), :Int64), ("\"count\"", (:StateModel,), :Int64), ("\"stateload\"", (:StateModel, :Symbol), :Float64), ("\"tick!\"", (:StateModel,), :Array), ("\"transition\"", (:StateModel, :Int64, :Symbol), :Symbol), ("\"describe\"", (:StateModel,), :Array), ("\"step!\"", (:StateModel, :Int64), :StateModel), ("\"main\"", (:Int64,), :Tuple)]
Out[4]:
8-element Array{Tuple{String,Tuple{Symbol,Vararg{Symbol,N} where N},Symbol},1}:
 ("\"count\"", (:StateModel, :Symbol), :Int64)              
 ("\"count\"", (:StateModel,), :Int64)                      
 ("\"stateload\"", (:StateModel, :Symbol), :Float64)        
 ("\"tick!\"", (:StateModel,), :Array)                      
 ("\"transition\"", (:StateModel, :Int64, :Symbol), :Symbol)
 ("\"describe\"", (:StateModel,), :Array)                   
 ("\"step!\"", (:StateModel, :Int64), :StateModel)          
 ("\"main\"", (:Int64,), :Tuple)                            

We are going to repeat the process again for another similar script to see if we are able to detect the differences between the two programs.

You can see the code that is being loaded here at agenttypes.jl

In [5]:
expr = parsefile("agenttypes.jl")
expr2 = ModelTools.typegraph(expr.args[end].args[end].args[end])
expr3 = :(module ModTyped
    using SemanticModels.ModelTools
    import SemanticModels.ModelTools: CallArg, RetArg
    $(expr2.args...) end);
In [6]:
Mod = eval(expr3)
Mod.main(10)
edgelist_typed = Mod.edgelist

E_typed = unique(("\"$(f.func)\"", tuple(shorttype.(f.args)...), shorttype.(f.ret)) for f in Edges(edgelist_typed))
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>17, Infected()=>2, Recovered()=>1]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>11, Infected()=>8, Recovered()=>1]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>2, Infected()=>14, Recovered()=>4]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>3, Infected()=>7, Recovered()=>10]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>5, Infected()=>7, Recovered()=>8]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>8, Infected()=>9, Recovered()=>3]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>4, Infected()=>8, Recovered()=>8]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>6, Infected()=>5, Recovered()=>9]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>7, Infected()=>6, Recovered()=>7]
describe(sm) = Pair{_1,Int64} where _1[Susceptible()=>6, Infected()=>7, Recovered()=>7]
Out[6]:
16-element Array{Tuple{String,Tuple{Symbol,Vararg{Symbol,N} where N},Symbol},1}:
 ("\"count\"", (:StateModel, :Susceptible), :Int64)                   
 ("\"count\"", (:StateModel,), :Int64)                                
 ("\"stateload\"", (:StateModel, :Susceptible), :Float64)             
 ("\"count\"", (:StateModel, :Infected), :Int64)                      
 ("\"stateload\"", (:StateModel, :Infected), :Float64)                
 ("\"count\"", (:StateModel, :Recovered), :Int64)                     
 ("\"stateload\"", (:StateModel, :Recovered), :Float64)               
 ("\"transition\"", (:StateModel, :Int64, :Susceptible), :Susceptible)
 ("\"transition\"", (:StateModel, :Int64, :Susceptible), :Infected)   
 ("\"transition\"", (:StateModel, :Int64, :Infected), :Recovered)     
 ("\"describe\"", (:StateModel,), :Array)                             
 ("\"transition\"", (:StateModel, :Int64, :Infected), :Infected)      
 ("\"transition\"", (:StateModel, :Int64, :Recovered), :Recovered)    
 ("\"transition\"", (:StateModel, :Int64, :Recovered), :Susceptible)  
 ("\"tick!\"", (:StateModel,), :StateModel)                           
 ("\"tick!\"", (:StateModel,), :Tuple)                                
In [7]:
println("=============\nSymbols Graph\n============")
for e in E
    println(join(e, ", "))
end
println("\n=============\nTypes Graph\n============")
for e in E_typed
    println(join(e, ", "))
end
=============
Symbols Graph
============
"count", (:StateModel, :Symbol), Int64
"count", (:StateModel,), Int64
"stateload", (:StateModel, :Symbol), Float64
"tick!", (:StateModel,), Array
"transition", (:StateModel, :Int64, :Symbol), Symbol
"describe", (:StateModel,), Array
"step!", (:StateModel, :Int64), StateModel
"main", (:Int64,), Tuple

=============
Types Graph
============
"count", (:StateModel, :Susceptible), Int64
"count", (:StateModel,), Int64
"stateload", (:StateModel, :Susceptible), Float64
"count", (:StateModel, :Infected), Int64
"stateload", (:StateModel, :Infected), Float64
"count", (:StateModel, :Recovered), Int64
"stateload", (:StateModel, :Recovered), Float64
"transition", (:StateModel, :Int64, :Susceptible), Susceptible
"transition", (:StateModel, :Int64, :Susceptible), Infected
"transition", (:StateModel, :Int64, :Infected), Recovered
"describe", (:StateModel,), Array
"transition", (:StateModel, :Int64, :Infected), Infected
"transition", (:StateModel, :Int64, :Recovered), Recovered
"transition", (:StateModel, :Int64, :Recovered), Susceptible
"tick!", (:StateModel,), StateModel
"tick!", (:StateModel,), Tuple

Visualizing the edges

Now that we have extracted the relevant type information, we want to visualize these transformations in a knowledge graph.

In [8]:
using MetaGraphs;
using LightGraphs;
# patch for https://github.com/JuliaGraphs/MetaGraphs.jl/pull/71/files
function escapehtml(i::AbstractString)
    # Refer to http://stackoverflow.com/a/7382028/3822752 for spec. links
    replace=Main.replace
    o = replace(i, "&" =>"&")
    o = replace(o, "\""=>""")
    o = replace(o, "'" =>"'")
    o = replace(o, "<" =>"&lt;")
    o = replace(o, ">" =>"&gt;")
    return o
end
WARNING: using LightGraphs.Edge in module Main conflicts with an existing identifier.
Out[8]:
escapehtml (generic function with 1 method)
In [9]:
function buildgraph(E)
    g = MetaDiGraph();
    set_indexing_prop!(g,:label);

    for e in E
        try
            g[e[2],:label]
        catch
            add_vertex!(g,:label,e[2]) # add ags
        end

        try
            g[e[3],:label]
        catch
            add_vertex!(g,:label,e[3]) # add rets
        end

        try
            add_edge!(g,g[e[2],:label],g[e[3],:label],:label,e[1])#escapehtml(string(e[1]))) # add func edges
        catch
            nothing
        end
    end
    return g
end

function projectors(g, key=:label)
    newedges = []
for v in vertices(g)
    args = g.vprops[v][key]
    if isa(args, Tuple)
        for (i, a) in enumerate(args)
            push!(newedges, ($i", a, args))
        end
    end
    end
        return newedges
end

function add_projectors!(g, key=:label)
    πs = projectors(g)
    for e in πs
        add_edge!(g,g[e[2],:label],g[e[3],:label],:label,e[1])
    end
    return g, πs
end
Out[9]:
add_projectors! (generic function with 2 methods)
In [10]:
function draw(g, filename)
    savegraph(filename, g, DOTFormat())
    try
        run(`dot -Tsvg -O $filename`)
    catch ex
        @warn "Problem running dot" error=ex
        try
            run(`dot --version`)
        catch
            @warn "Dot is not installed"
        end
    end
    display("image/svg+xml", read("$filename.svg", String))
end
Out[10]:
draw (generic function with 1 method)
In [11]:
g = buildgraph(E)
add_projectors!(g)
Out[11]:
({11, 16} directed Int64 metagraph with Float64 weights defined by :weight (default weight 1.0), Any[("π1", :StateModel, (:StateModel, :Symbol)), ("π2", :Symbol, (:StateModel, :Symbol)), ("π1", :StateModel, (:StateModel,)), ("π1", :StateModel, (:StateModel, :Int64, :Symbol)), ("π2", :Int64, (:StateModel, :Int64, :Symbol)), ("π3", :Symbol, (:StateModel, :Int64, :Symbol)), ("π1", :StateModel, (:StateModel, :Int64)), ("π2", :Int64, (:StateModel, :Int64)), ("π1", :Int64, (:Int64,))])
In [12]:
h = deepcopy(g)
g = buildgraph(E_typed)
g, πs = add_projectors!(g);
g
Out[12]:
{15, 32} directed Int64 metagraph with Float64 weights defined by :weight (default weight 1.0)
In [13]:
using Colors
cm = Colors.colormap("RdBu", 2nv(h))
color(v) = "#$(hex(cm[v + floor(Int, nv(h)/2)]))"
Out[13]:
color (generic function with 1 method)

We will draw the graph of types in initial model that uses symbols to represent the agent states. Remember that Symbol is type that represents "things that are like variable names" and in this case we are using the Symbol type to represent the agent states of :Susceptible, :Infected, and :Recovered.

In this drawing each vertex has its own color. These colors will be used again when drawing the next graph.

In [14]:
for v in vertices(h)
    h.vprops[v][:color] = color(v)
    h.vprops[v][:style] = "filled"
end
draw(h, "exampletypegraph.dot")
G 1 (:StateModel, :Symbol) 2 Int64 1->2 count 4 Float64 1->4 stateload 6 (:StateModel, :Int64, :Symbol) 2->6 π2 8 (:StateModel, :Int64) 2->8 π2 10 (:Int64,) 2->10 π1 3 (:StateModel,) 3->2 count 5 Array 3->5 tick! 7 Symbol 6->7 transition 7->1 π2 7->6 π3 9 StateModel 8->9 step! 9->1 π1 9->3 π1 9->6 π1 9->8 π1 11 Tuple 10->11 main

We then onstruct the typegraph for the program 2, which uses singleton types to represent the state of the agents. One of the central tennants of this project is that the more information you inject into the julia type system, the more the compiler can help you. Here we will see that they type system knows about the structure of the agents behavior now that the we have encoded their states as types.

now we draw the new, bigger type graph with the same color scheme as before. We define a graph homomorphism $\phi$ that maps every type to one of the vertices of the original graph show above. This homomorphism from $\phi: G \mapsto H$ shows how the semantics of the first program is embedded in the semantics of the second program.

In [15]:
ϕ(t) = begin
    d=Dict{Symbol,Symbol}(:Susceptible=>:Symbol,
        :Infected=>:Symbol,
        :Recovered=>:Symbol,
    )
    val = get(d, t, t)
    return val
end

for v in vertices(g)
    vname = ϕ.(g[v,:label])
    try
        vh = h[vname, :label]
        g.vprops[v][:fillcolor]=color(vh)
        g.vprops[v][:style]="filled"
    catch ex
        @warn ex
    end
end
In [16]:
draw(g, "typegraphmorphism.dot")
G 1 (:StateModel, :Susceptible) 2 Int64 1->2 count 4 Float64 1->4 stateload 7 (:StateModel, :Int64, :Susceptible) 2->7 π2 10 (:StateModel, :Int64, :Infected) 2->10 π2 13 (:StateModel, :Int64, :Recovered) 2->13 π2 3 (:StateModel,) 3->2 count 12 Array 3->12 describe 14 StateModel 3->14 tick! 15 Tuple 3->15 tick! 5 (:StateModel, :Infected) 5->2 count 5->4 stateload 6 (:StateModel, :Recovered) 6->2 count 6->4 stateload 8 Susceptible 7->8 transition 9 Infected 7->9 transition 8->1 π2 8->7 π3 9->5 π2 9->10 π3 10->9 transition 11 Recovered 10->11 transition 11->6 π2 11->13 π3 13->8 transition 13->11 transition 14->1 π1 14->3 π1 14->5 π1 14->6 π1 14->7 π1 14->10 π1 14->13 π1

Note that the type Symbol does not appear in the second program at all, but instead the types Susceptible, Infected, and Recovered play the role of the Symbol vertex in this new graph. The graph is bigger and harder to visualize, but it contains the same structure. You can chack that for every edge in the $g$ there is an edge in $h$ that has the same color vertices as endpoints.

In this new model the structure of the agents states is readily apparant in the type system.

In [17]:
tedges = filter((p) -> p[2][:label]=="\"transition\"", g.eprops)
tv = src.(keys(tedges))  dst.(keys(tedges))
DFA = g[tv]
draw(DFA, "type_DFA.dot")
G 1 (:StateModel, :Int64, :Recovered) 4 Susceptible 1->4 transition 5 Recovered 1->5 transition 2 (:StateModel, :Int64, :Infected) 2->5 transition 6 Infected 2->6 transition 3 (:StateModel, :Int64, :Susceptible) 3->4 transition 3->6 transition 4->3 π3 5->1 π3 6->2 π3

By contracting the edges labeled $\pi_3$ you identify a minor of the typegraph isomorphic to the discrete finite automata or finite state machine representation of the agents in our agent based simulation.

Since the typegraph contains this information, we can say that the julia compiler "understands" a semantic feature of the model. We can introduce compile time logic based on these properties of the model. ANy changes to the underlying code that changed the state space of the agents, or the possible transitions they undergo would affect the type graph of the model.

For the model version that used :Symbols, or categorical states, the julia type system is blissfully ignorant of the relationships between the states. However once we introduce types to the model, the compiler is able to represent the structure of the model. Any changes to the model will either preserve or disrupt this type graph and we will be able to identify and quantify that change to the structure of the model.

This example is an instance of a large phenomenon that we hope to advance in modeling. Programs that implement models can get more out of the compiler if they add more information. In this case we declared the states of our agents as types and the compiler was able to infer the state transition graph from the our code.

Conclusions

We can see from the examples presented here that similar models produce similar type graphs. In this case we have two programs that impement the same model and have homomorphic type graphs. This homomorphism is natural in the sense that every vertex that appears in both graphs satisfies $\phi(v) = v$. Adding information to the type system in the julia program allows the type graph of the model to understand more of the program semantics. This powerful insight can teach the machines to reason about the behaviour of computational models.