CausalELM
Most of the methods and structs here are private and not exported, so they should not be called by the user. They are documented here to aid in developing CausalELM and to facilitate understanding of the implementation.
Structs
CausalELM.InterruptedTimeSeries
— Type
Container for the results of an interrupted time series analysis.
CausalELM.GComputation
— Type
Container for the results of G-Computation.
CausalELM.DoubleMachineLearning
— Type
Container for the results of a double machine learning estimator.
CausalELM.SLearner
— Type
S-Learner for CATE estimation.
CausalELM.TLearner
— Type
T-Learner for CATE estimation.
CausalELM.RLearner
— Type
Container for the results of an R-learner.
CausalELM.CausalEstimator
— Type
Abstract type for GComputation and DoubleMachineLearning.
CausalELM.Metalearner
— Type
Abstract type for metalearners.
CausalELM.ExtremeLearningMachine
— Type
Abstract type that includes vanilla and L2-regularized Extreme Learning Machines.
CausalELM.ExtremeLearner
— Type
Struct to hold data for an Extreme Learning Machine.
CausalELM.RegularizedExtremeLearner
— Type
Struct to hold data for a regularized Extreme Learning Machine.
Activation Functions
CausalELM.binary_step
— Function
binary_step(x)
Apply the binary step activation function to a real number.
Examples
julia> binary_step(1)
1.0
binary_step(x)
Apply the binary step activation function to an array.
Examples
julia> binary_step([-1000, 100, 1, 0, -0.001, -3])
6-element Vector{Float64}:
0.0
1.0
1.0
1.0
0.0
0.0
CausalELM.σ
— Function
σ(x)
Apply the sigmoid activation function to a real number.
Examples
julia> σ(1)
0.7310585786300049
σ(x)
Apply the sigmoid activation function to an array.
Examples
julia> σ([1, 0])
2-element Vector{Float64}:
0.7310585786300049
0.5
Base.tanh
— Function
tanh(x)
Apply the tanh activation function to an array.
This is just a vectorized version of Base.tanh
Examples
julia> tanh([1, 0])
2-element Vector{Float64}:
0.7615941559557649
0.0
CausalELM.relu
— Function
relu(x)
Apply the ReLU activation function to a real number.
Examples
julia> relu(1)
1.0
relu(x)
Apply the ReLU activation function to an array.
Examples
julia> relu([1, 0, -1])
3-element Vector{Float64}:
1.0
0.0
0.0
CausalELM.leaky_relu
— Function
leaky_relu(x)
Apply the leaky ReLU activation function to a real number.
Examples
julia> leaky_relu(1)
1.0
leaky_relu(x)
Apply the leaky ReLU activation function to an array.
Examples
julia> leaky_relu([-0.01, 0, 1])
3-element Vector{Float64}:
-0.0001
0.0
1.0
CausalELM.swish
— Function
swish(x)
Apply the swish activation function to a real number.
Examples
julia> swish(1)
0.7310585786300049
swish(x)
Apply the swish activation function to an array.
Examples
julia> swish([1, 0, -1])
3-element Vector{Float64}:
0.7310585786300049
0.0
-0.2689414213699951
CausalELM.softmax
— Function
softmax(x)
Apply the softmax activation function to a real number.
Examples
julia> softmax(1)
2.718281828459045
softmax(x)
Apply the softmax activation function to a vector.
Examples
julia> softmax([1, 2, 3])
3-element Vector{Float64}:
0.09003057317038046
0.24472847105479767
0.6652409557748219
softmax(x)
Apply the softmax activation function to the rows of an array.
Examples
julia> x = rand(5, 3)
5×3 Matrix{Float64}:
0.482117 0.225359 0.615589
0.255572 0.165051 0.427035
0.387384 0.424856 0.369219
0.175362 0.172561 0.111878
0.508207 0.258347 0.591111
julia> softmax(x)
5×3 Matrix{Float64}:
0.342895 0.265248 0.391857
0.322529 0.294616 0.382855
0.331106 0.343749 0.325146
0.340635 0.339682 0.319682
0.348998 0.271838 0.379164
CausalELM.softplus
— Function
softplus(x)
Apply the softplus activation function to a real number.
Examples
julia> softplus(1)
1.3132616875182228
softplus(x)
Apply the softplus activation function to an array.
Examples
julia> softplus([1, -1])
2-element Vector{Float64}:
1.3132616875182228
0.31326168751822286
CausalELM.gelu
— Function
gelu(x)
Apply the GeLU activation function to a real number.
Examples
julia> gelu(1)
0.8411919906082768
gelu(x)
Apply the GeLU activation function to an array.
Examples
julia> gelu([-1, 0, 1])
3-element Vector{Float64}:
-0.15880800939172324
0.0
0.8411919906082768
CausalELM.gaussian
— Function
gaussian(x)
Apply the gaussian activation function to a real number.
Examples
julia> gaussian(1)
0.11443511435028261
gaussian(x)
Apply the gaussian activation function to an array.
Examples
julia> gaussian([1, -1])
2-element Vector{Float64}:
0.36787944117144233
0.36787944117144233
CausalELM.hard_tanh
— Function
hard_tanh(x)
Apply the hard_tanh activation function to a real number.
Examples
julia> hard_tanh(-2)
-1.0
hard_tanh(x)
Apply the hard_tanh activation function to an array.
Examples
julia> hard_tanh([-2, 0, 2])
3-element Vector{Float64}:
-1.0
0.0
1.0
CausalELM.elish
— Function
elish(x)
Apply the ELiSH activation function to a real number.
Examples
julia> elish(1)
0.7310585786300049
elish(x)
Apply the ELiSH activation function to an array.
Examples
julia> elish([-1, 1])
2-element Vector{Float64}:
-0.17000340156854793
0.7310585786300049
CausalELM.fourier
— Function
fourier(x)
Apply the Fourier activation function to a real number.
Examples
julia> fourier(1)
0.8414709848078965
fourier(x)
Apply the Fourier activation function to an array.
Examples
julia> fourier([-1, 1])
2-element Vector{Float64}:
-0.8414709848078965
0.8414709848078965
Cross Validation
CausalELM.generate_folds
— Function
generate_folds(X, Y, folds)
Creates folds for cross validation.
Examples
julia> xfolds, y_folds = generate_folds(zeros(20, 2), zeros(20), 5)
CausalELM.generate_temporal_folds
— Function
generate_temporal_folds(X, Y, folds)
Creates rolling folds for cross validation of time series data.
Examples
julia> xfolds, y_folds = generate_temporal_folds(zeros(20, 2), zeros(20), 5, temporal=true)
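For intuition, fold creation reduces to partitioning row indices. The sketch below contrasts random folds with rolling, time-ordered splits; the helper names are illustrative, not the package's internals.
using Random

# Randomly partition row indices into k folds (illustrative helper).
function random_fold_indices(n, k)
    idx = shuffle(1:n)
    return [idx[i:k:n] for i in 1:k]
end

# Rolling splits for time series: each split trains on all earlier observations.
function rolling_fold_indices(n, k)
    bounds = round.(Int, range(0, n; length=k + 1))
    return [(1:bounds[i], (bounds[i] + 1):bounds[i + 1]) for i in 2:k]
end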
CausalELM.validation_loss
— Function
validation_loss(xtrain, ytrain, xtest, ytest, nodes, metric; activation, regularized)
Calculate a validation metric for a single fold in k-fold cross validation.
Examples
julia> x = rand(100, 5); y = Float64.(rand(100) .> 0.5)
julia> validation_loss(x, y, 5, accuracy, 3)
0.0
CausalELM.cross_validate
— Function
cross_validate(X, Y, neurons, metric, activation, regularized, folds, temporal)
Calculate a validation metric for k folds using a single set of hyperparameters.
Examples
julia> x = rand(100, 5); y = Float64.(rand(100) .> 0.5)
julia> cross_validate(x, y, 5, accuracy)
0.0257841765251021
CausalELM.best_size
— Function
best_size(X, Y, metric, task, activation, min_neurons, max_neurons, regularized, folds, temporal, iterations, elm_size)
Compute the best number of neurons for an Extreme Learning Machine.
The procedure tests networks with numbers of neurons in a sequence whose length is given by iterations on the interval [min_neurons, max_neurons]. Then, it uses the network sizes and validation errors from the sequence to predict the validation error or metric for every network size between min_neurons and max_neurons using the function approximation ability of an Extreme Learning Machine. Finally, it returns the network size with the best predicted validation error or metric.
Examples
julia> best_size(rand(100, 5), rand(100), mse, "regression")
11
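As a rough sketch of this procedure, where network_loss stands in for cross-validating an ELM of a given size (it is not part of the package's API):
# Hypothetical network_loss(size) cross-validates an ELM with `size` neurons.
function best_size_sketch(network_loss, min_neurons, max_neurons, iterations)
    sizes = round.(Int, range(min_neurons, max_neurons; length=iterations))
    losses = [network_loss(s) for s in sizes]
    # CausalELM approximates the size-to-loss curve with an ELM; nearest-neighbor
    # interpolation is used here only as a stand-in for that step.
    candidates = collect(min_neurons:max_neurons)
    predicted = [losses[argmin(abs.(sizes .- c))] for c in candidates]
    return candidates[argmin(predicted)]
end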
CausalELM.shuffle_data
— Function
shuffle_data(X, Y, T)
Shuffles covariates, treatment vector, and outcome vector for cross validation.
Examples
julia> x, y, t = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> shuffle_data(x, y, t)
([0.6124923085225416 0.2713900065807924 … 0.6094796972512194 0.6067966603192685;
0.7186612932571539 0.8047878363606299 … 0.9787878554455594 0.885819212905816; … ;
0.773543733306263 0.10880091279797399 … 0.10525512055751185 0.6303472234021711;
0.10845217539341823 0.9911071602976902 … 0.014754069216096566 0.5256103389041187],
[0.4302689295553531, 0.2396683446618325, 0.7954433314513768, 0.7191098533903124,
0.8168563428651753, 0.7064320936729905, 0.048113106979693065, 0.3102938851371281,
0.6246380539228858, 0.3510284321966193 … 0.5324022501182528, 0.8354720951777901,
0.7526652774981095, 0.3639742621882005, 0.21030903031988923, 0.6936212944871928,
0.3910592143534404, 0.15152013651215301, 0.38891692138831735, 0.08827711410802941],
Float64[0, 0, 1, 1, 0, 1, 0, 0, 1, 0 … 0, 0, 1, 1, 1, 1, 0, 1, 0, 0])
Average Causal Effect Estimators
CausalELM.XLearner
— Type
X-Learner for CATE estimation.
CausalELM.estimate_effect!
— Function
estimate_effect!(DML, cate)
Estimate a treatment effect using double machine learning.
This method should not be called directly.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = DoubleMachineLearning(X, Y, T)
julia> estimate_effect!(m1)
0.31067439
CausalELM.predict_residuals
— Function
predict_residuals(DML, x_train, x_test, y_train, y_test, t_train, t_test)
Predict treatment and outcome residuals for double machine learning.
This method should not be called directly.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> x_train, x_test = X[1:80, :], X[81:end, :]
julia> y_train, y_test = Y[1:80], Y[81:end]
julia> t_train, t_test = T[1:80], T[81:end]
julia> m1 = DoubleMachineLearning(X, Y, T)
julia> predict_residuals(m1, x_train, x_test, y_train, y_test, t_train, t_test)
100-element Vector{Float64}:
0.6944714802199426
0.6102318624294397
0.9563033347529682
⋮
0.14016601301278353
0.2217194742841072
0.199372555924635
CausalELM.moving_average
— Function
moving_average(g)
Calculates a cumulative moving average.
Examples
julia> moving_average([1, 2, 3])
3-element Vector{Float64}:
1.0
1.5
2.0
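The cumulative moving average has a simple closed form, shown below as a one-liner that reproduces the example above (numerically equivalent, though not necessarily the package's implementation):
cma(x) = cumsum(x) ./ (1:length(x))
cma([1, 2, 3])  # returns [1.0, 1.5, 2.0]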
Metalearners
CausalELM.stage1!
— Function
stage1!(x)
Estimate the first stage models for an X-learner.
This method should not be called by the user.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = XLearner(X, Y, T)
julia> stage1!(m1)
CausalELM.stage2!
— Function
stage2!(x)
Estimate the second stage models for an X-learner.
This method should not be called by the user.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = XLearner(X, Y, T)
julia> stage1!(m1)
julia> stage2!(m1)
100-element Vector{Float64}:
0.6579129842054047
0.7644471766429705
0.5462780002052421
⋮
0.8755515354984005
0.947588000142362
0.29294343704001025
Common Methods
CausalELM.estimate_causal_effect!
— Function
estimate_causal_effect!(its)
Estimate the effect of an event relative to a predicted counterfactual.
Examples
julia> X₀, Y₀, X₁, Y₁ = rand(100, 5), rand(100), rand(10, 5), rand(10)
julia> m1 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁)
julia> estimate_causal_effect!(m1)
0.25714308
estimate_causal_effect!(g)
Estimate a causal effect of interest using G-Computation.
If treatments are administered in multiple time periods, the effect will be estimated as the average difference between the outcome of being treated in all periods and being treated in no periods. For example, given that individuals 1, 2, ..., i ∈ I received either a treatment or a placebo in p different periods, the model would estimate the average treatment effect as E[Yᵢ|T₁=1, T₂=1, ..., Tₚ=1, Xₚ] - E[Yᵢ|T₁=0, T₂=0, ..., Tₚ=0, Xₚ].
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m2 = GComputation(X, Y, T)
julia> estimate_causal_effect!(m2)
0.31067439
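Conceptually, the G-computation step above predicts every unit's outcome with the treatment set to one and then to zero and averages the difference. A minimal sketch, assuming generic fit and predict callbacks rather than CausalELM's internal learners:
# Sketch of the identity ATE = mean(E[Y|T=1, X] - E[Y|T=0, X]).
function g_computation_sketch(fit, predict, X, Y, T)
    model = fit(hcat(X, T), Y)                      # outcome model on covariates and treatment
    y1 = predict(model, hcat(X, ones(length(T))))   # predicted outcomes if everyone were treated
    y0 = predict(model, hcat(X, zeros(length(T))))  # predicted outcomes if no one were treated
    return sum(y1 .- y0) / length(T)
end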
estimate_causal_effect!(DML)
Estimate a causal effect of interest using double machine learning.
Unlike other estimators, this method does not support time series or panel data. This method also does not work as well with smaller datasets because it estimates separate outcome models for the treatment and control groups.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m3 = DoubleMachineLearning(X, Y, T)
julia> estimate_causal_effect!(m3)
0.31067439
estimate_causal_effect!(s)
Estimate the CATE using an S-learner.
For an overview of S-learning see:
Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for
estimating heterogeneous treatment effects using machine learning." Proceedings of the
national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m4 = SLearner(X, Y, T)
julia> estimate_causal_effect!(m4)
100-element Vector{Float64}:
0.20729633391630697
0.20729633391630697
0.20729633391630692
⋮
0.20729633391630697
0.20729633391630697
0.20729633391630697
estimate_causal_effect!(t)
Estimate the CATE using a T-learner.
For an overview of T-learning see:
Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for
estimating heterogeneous treatment effects using machine learning." Proceedings of the
national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m5 = TLearner(X, Y, T)
julia> estimate_causal_effect!(m5)
100-element Vector{Float64}:
0.0493951571746305
0.049395157174630444
0.0493951571746305
⋮
0.049395157174630444
0.04939515717463039
0.049395157174630444
estimate_causal_effect!(x)
Estimate the CATE using an X-learner.
For an overview of X-learning see:
Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for
estimating heterogeneous treatment effects using machine learning." Proceedings of the
national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = XLearner(X, Y, T)
julia> estimate_causal_effect!(m1)
100-element Vector{Float64}:
-0.025012644892878473
-0.024634294305967294
-0.022144246680543364
⋮
-0.021163590874553318
-0.014607310062509895
-0.022449034332142046
estimate_causal_effect!(R)
Estimate the CATE using an R-learner.
For an overview of R-learning see:
Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. "Metalearners for
estimating heterogeneous treatment effects using machine learning." Proceedings of the
national academy of sciences 116, no. 10 (2019): 4156-4165.
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = RLearner(X, Y, T)
julia> estimate_causal_effect!(m1)
100-element Vector{Float64}:
-0.025012644892878473
-0.024634294305967294
-0.022144246680543364
⋮
-0.021163590874553318
-0.014607310062509895
-0.022449034332142046
Inference
CausalELM.summarize
— Function
summarize(its, mean_effect)
Return a summary from an interrupted time series estimator.
p-values and standard errors are estimated using approximate randomization inference that permutes the time of the intervention.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> X₀, Y₀, X₁, Y₁ = rand(100, 5), rand(100), rand(10, 5), rand(10)
julia> m1 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁)
julia> estimate_causal_effect!(m1)
1-element Vector{Float64}:
0.25714308
julia> summarize(m1)
{"Task" => "Regression", "Regularized" => true, "Activation Function" => relu,
"Validation Metric" => "mse", "Number of Neurons" => 2,
"Number of Neurons in Approximator" => 10, "β" => [0.25714308],
"Causal Effect" => -3.9101138, "Standard Error" => 1.903434356, "p-value" => 0.00123356}
summarize(mod, n)
Return a summary from a CausalEstimator or Metalearner.
p-values and standard errors are estimated using approximate randomization inference.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = GComputation(X, Y, T)
julia> estimate_causal_effect!(m1)
0.3100468253
julia> summarize(m1)
{"Task" => "Regression", "Quantity of Interest" => "ATE", "Regularized" => "true",
"Activation Function" => "relu", "Time Series/Panel Data" => "false",
"Validation Metric" => "mse", "Number of Neurons" => "5",
"Number of Neurons in Approximator" => "10", "Causal Effect" => 0.00589761,
"Standard Error" => 5.12900734, "p-value" => 0.479011245}
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = RLearner(X, Y, T)
julia> estimate_causal_effect!(m1)
1-element Vector{Float64}:
0.5804032956
julia> summarize(m1)
{"Task" => "Regression", "Quantity of Interest" => "ATE", "Regularized" => "true",
"Activation Function" => "relu", "Validation Metric" => "mse", "Number of Neurons" => "5",
"Number of Neurons in Approximator" => "10", "Causal Effect" => 0.5804032956,
"Standard Error" => 2.129400324, "p-value" => 0.0008342356}
julia> X, Y, T = rand(100, 5), rand(100), [rand()<0.4 for i in 1:100]
julia> m1 = SLearner(X, Y, T)
julia> estimate_causal_effect!(m1)
100-element Vector{Float64}:
0.20729633391630697
0.20729633391630697
0.20729633391630692
⋮
0.20729633391630697
0.20729633391630697
0.20729633391630697
julia> summarize(m1)
{"Task" => "Regression", "Regularized" => "true", "Activation Function" => "relu",
"Validation Metric" => "mse", "Number of Neurons" => "5",
"Number of Neurons in Approximator" => "10",
"Causal Effect" => [0.20729633391630697, 0.20729633391630697, 0.20729633391630692,
0.20729633391630697, 0.20729633391630697, 0.20729633391630697, 0.20729633391630697,
0.20729633391630703, 0.20729633391630697, 0.20729633391630697 … 0.20729633391630703,
0.20729633391630697, 0.20729633391630692, 0.20729633391630703, 0.20729633391630697,
0.20729633391630697, 0.20729633391630692, 0.20729633391630697, 0.20729633391630697,
0.20729633391630697], "Standard Error" => 5.3121435085, "p-value" => 0.0632454855}
CausalELM.generate_null_distribution
— Function
generate_null_distribution(mod, n)
Generate a null distribution for the treatment effect of G-computation, double machine learning, or metalearning.
This method estimates the same model that is provided using random permutations of the treatment assignment to generate a vector of estimated effects under different treatment regimes. When mod is a metalearner, the null statistic is the difference in the ATE.
Note that lowering the number of iterations increases the probability of failing to reject the null hypothesis.
Examples
julia> x, y, t = rand(100, 5), rand(1:100, 100, 1), [rand()<0.4 for i in 1:100]
julia> g_computer = GComputation(x, y, t)
julia> estimate_causal_effect!(g_computer)
julia> generate_null_distribution(g_computer, 500)
500-element Vector{Float64}:
0.016297180690693656
0.0635928694685571
0.20004144093635673
⋮
24.739658523175912
25.30523686137909
28.07474553316176
generate_null_distribution(its, n, mean_effect)
Generate a null distribution for the treatment effect in an interrupted time series analysis. By default, this method generates a null distribution of mean differences. To generate a null distribution of cumulative differences, set the mean_effect argument to false.
Randomization is conducted by randomly assigning observations to the pre and post-intervention periods, reestimating the causal effect, and repeating n times. The null distribution is the set of n causal effect estimates.
Note that lowering the number of iterations increases the probability of failing to reject the null hypothesis.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = rand(1:100, 100, 5), rand(100), rand(10, 5), rand(10)
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> generate_null_distribution(its, 10)
10-element Vector{Float64}:
-0.5012456678829079
-0.33790650529972194
-0.2534340182760628
⋮
-0.06217013151235991
-0.05905529159312335
-0.04927743270606937
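The permutation scheme for the first method amounts to shuffling the treatment vector and re-estimating the effect n times; a minimal sketch with a generic estimator callback (not the package's internal API):
using Random

# Null distribution from random permutations of the treatment assignment.
null_distribution_sketch(estimator, X, Y, T, n) = [estimator(X, Y, shuffle(T)) for _ in 1:n]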
CausalELM.quantities_of_interest
— Function
quantities_of_interest(mod, n)
Generate a p-value and standard error through randomization inference.
This method generates a null distribution of treatment effects by reestimating treatment effects from permutations of the treatment vector and estimates a p-value and standard error from the generated distribution.
Note that lowering the number of iterations increases the probability of failing to reject the null hypothesis.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x, y, t = rand(100, 5), rand(1:100, 100, 1), [rand()<0.4 for i in 1:100]
julia> g_computer = GComputation(x, y, t)
julia> estimate_causal_effect!(g_computer)
julia> quantities_of_interest(g_computer, 1000)
(0.114, 6.953133617011371)
quantities_of_interest(mod, n)
Generate a p-value and standard error through randomization inference.
This method generates a null distribution of treatment effects by reestimating treatment effects from permutations of the treatment vector and estimates a p-value and standard error from the generated distribution. Randomization for event studies is done by creating time splits at even intervals and reestimating the causal effect.
Note that lowering the number of iterations increases the probability of failing to reject the null hypothesis.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = rand(1:100, 100, 5), rand(100), rand(10, 5), rand(10)
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> quantities_of_interest(its, 10)
(0.0, 0.07703275541001667)
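Given a null distribution, the p-value is the share of permuted estimates at least as extreme as the observed effect and the standard error is the dispersion of the null draws. A sketch under those definitions (the package's exact conventions may differ):
function inference_sketch(observed, null_dist)
    p = sum(abs.(null_dist) .>= abs(observed)) / length(null_dist)
    μ = sum(null_dist) / length(null_dist)
    se = sqrt(sum((null_dist .- μ) .^ 2) / (length(null_dist) - 1))
    return p, se
end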
Model Validation
CausalELM.validate
— Function
validate(its; n, low, high)
Test the validity of an estimated interrupted time series analysis.
This method conducts a Chow Test, a Wald supremum test, and tests the model's sensitivity to confounders. The Chow Test tests for structural breaks in the covariates between the time before and after the event. p-values represent the proportion of times the magnitude of the break in a covariate would have been greater due to chance. Lower p-values suggest a higher probability that the event affected the covariates, in which case they cannot provide unbiased counterfactual predictions.
The Wald supremum test finds the structural break with the highest Wald statistic. If this is not the same as the hypothesized break, it could indicate an anticipation effect, a confounding event, or that the intervention or policy took place in multiple phases. p-values represent the proportion of times we would see a larger Wald statistic if the data points were randomly allocated to pre and post-event periods for the predicted structural break. Ideally, the hypothesized break will be the same as the predicted break and it will also have a low p-value.
The omitted predictors test adds normal random variables with uniform noise as predictors. If the included covariates are good predictors of the counterfactual outcome, adding irrelevant predictors should not have a large effect on the predicted counterfactual outcomes or the estimated effect.
For more details on the assumptions and validity of interrupted time series designs, see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
Note that this method does not implement the second test in Baicker and Svoronos because the estimator in this package models the relationship between covariates and the outcome using an extreme learning machine instead of linear regression, so variance in the outcome across different bins is not much of an issue.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> X₀, Y₀, X₁, Y₁ = rand(100, 5), rand(100), rand(10, 5), rand(10)
julia> m1 = InterruptedTimeSeries(X₀, Y₀, X₁, Y₁)
julia> estimate_causal_effect!(m1)
[0.25714308]
julia> validate(m1)
{"Task" => "Regression", "Regularized" => true, "Activation Function" => relu,
"Validation Metric" => "mse", "Number of Neurons" => 2,
"Number of Neurons in Approximator" => 10, "β" => [0.25714308],
"Causal Effect" => -3.9101138, "Standard Error" => 1.903434356, "p-value" => 0.00123356}
validate(m; num_treatments, min, max)
This method tests the counterfactual consistency, exchangeability, and positivity assumptions required for causal inference. Consistency and exchangeability are not directly testable, so these tests do not provide definitive evidence of a violation of the assumptions.
To probe the counterfactual consistency assumption, we assume there were multiple levels of treatments and find them by binning the dependent variable for treated observations using Jenks breaks. The optimal number of breaks between 2 and num_treatments is found using the elbow method. Using these hypothesized treatment assignments, this method compares the MSE of linear regressions using the observed and hypothesized treatments. If the counterfactual consistency assumption holds, then the difference between the MSE with hypothesized treatments and the MSE with observed treatments should be positive because the hypothesized treatments should not provide useful information. If it is negative, that indicates there was more useful information in the hypothesized treatments than the observed treatments, or that there is an unobserved confounder.
Next, this method tests the model's sensitivity to a violation of the exchangeability assumption by calculating the E-value, which is the minimum strength of association, on the risk ratio scale, that an unobserved confounder would need to have with both the treatment and the outcome to fully explain away the estimated effect. Higher E-values imply the model is more robust to a violation of the exchangeability assumption.
Finally, this method tests the positivity assumption by estimating propensity scores. Rows in the returned matrix are levels of covariates that have a zero probability of treatment. If the matrix is empty, none of the observations have an estimated zero probability of treatment, which implies the positivity assumption is satisfied.
For a thorough review of causal inference assumptions see: Hernan, Miguel A., and James M. Robins. Causal Inference: What If. Boca Raton: Taylor and Francis, 2024.
For more information on the E-value test see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, y, t = rand(100, 5), vec(rand(1:100, 100, 1)),
Float64.([rand()<0.4 for i in 1:100])
julia> g_computer = GComputation(x, y, t, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> validate(g_computer)
2.7653668647301795
CausalELM.covariate_independence
— Function
covariate_independence(its; n)
Test for independence between covariates and the event or intervention.
This is a Chow Test for covariates with p-values estimated via randomization inference. The p-values are the proportion of times randomly assigning observations to the pre or post-intervention period would have a larger estimated effect on the slope of the covariates. The lower the p-values, the more likely it is that the event or intervention affected the covariates, in which case they cannot provide an unbiased prediction of the counterfactual outcomes.
For more information on using a Chow Test to test for structural breaks see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = (Float64.(rand(1:5, 100, 5)), randn(100), rand(1:5, (10, 5)),
randn(10))
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> covariate_independence(its)
Dict("Column 1 p-value" => 0.421, "Column 5 p-value" => 0.07, "Column 3 p-value" => 0.01,
"Column 2 p-value" => 0.713, "Column 4 p-value" => 0.043)
CausalELM.omitted_predictor
— Function
omitted_predictor(its; n)
See how an omitted predictor/variable could change the results of an interrupted time series analysis.
This method reestimates interrupted time series models with uniform random variables. If the included covariates are good predictors of the counterfactual outcome, adding a random variable as a covariate should not have a large effect on the predicted counterfactual outcomes and therefore the estimated average effect.
For more information on using a Chow Test to test for structural breaks see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = (Float64.(rand(1:5, 100, 5)), randn(100), rand(1:5, (10, 5)),
randn(10))
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> omitted_predictor(its)
Dict("Mean Biased Effect/Original Effect" => -0.1943184744720332, "Median Biased
Effect/Original Effect" => -0.1881814122689084, "Minimum Biased Effect/Original Effect" =>
-0.2725194360603799, "Maximum Biased Effect/Original Effect" => -0.1419197976977072)
CausalELM.sup_wald
— Function
sup_wald(its; low, high, n)
Check if the predicted structural break is the hypothesized structural break.
This method conducts Wald tests and identifies the structural break with the highest Wald statistic. If this break is not the same as the hypothesized break, it could indicate an anticipation effect, confounding by some other event or intervention, or that the intervention or policy took place in multiple phases. p-values are estimated using approximate randomization inference and represent the proportion of times we would see a larger Wald statistic if the data points were randomly allocated to pre and post-event periods for the predicted structural break.
For more information on using a Chow Test to test for structural breaks see: Baicker, Katherine, and Theodore Svoronos. Testing the validity of the single interrupted time series design. No. w26080. National Bureau of Economic Research, 2019.
For a primer on randomization inference see: https://www.mattblackwell.org/files/teaching/s05-fisher.pdf
Examples
julia> x₀, y₀, x₁, y₁ = (Float64.(rand(1:5, 100, 5)), randn(100), rand(1:5, (10, 5)),
randn(10))
julia> its = InterruptedTimeSeries(x₀, y₀, x₁, y₁)
julia> estimate_causal_effect!(its)
julia> sup_wald(its)
Dict{String, Real}("Wald Statistic" => 58.16649796321913, "p-value" => 0.005, "Predicted
Break Point" => 39, "Hypothesized Break Point" => 100)
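Structural-break tests like the Chow and sup-Wald tests above can be expressed as a regression with a post-break dummy interacted with a time trend. The sketch below (illustrative, not the package's implementation) recovers the slope shift for a single series via least squares:
# Slope shift for one covariate series y with a hypothesized break point.
function slope_shift_sketch(y, break_point)
    t = collect(1.0:length(y))
    post = Float64.(t .> break_point)
    X = hcat(ones(length(y)), t, post, post .* t)  # intercept, trend, level shift, slope shift
    β = X \ y                                      # ordinary least squares
    return β[4]                                    # change in slope after the break
end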
CausalELM.p_val
— Function
p_val(x, y, β; n, wald)
Estimate the p-value for the hypothesis that an event had a statistically significant effect on the slope of a covariate using randomization inference.
Examples
julia> x, y, β = reduce(hcat, (float(rand(0:1, 10)), ones(10))), rand(10), 0.5
julia> p_val(x, y, β)
0.98
julia> p_val(x, y, β; n=100, wald=true)
0.08534054
CausalELM.counterfactual_consistency
— Function
counterfactual_consistency(m; num_treatments)
Examine the counterfactual consistency assumption. First, this function generates Jenks breaks based on outcome values for the treatment group. Then, it replaces treatment statuses with the numbers corresponding to each group. Next, it runs two linear regressions for the treatment group, one with and one without the fake treatment assignments generated by the Jenks breaks. Finally, it subtracts the mean squared error of the regression with the real treatment statuses from the mean squared error of the regression with the fake treatment statuses. If this number is negative, it might indicate a violation of the counterfactual consistency assumption or omitted variable bias.
For a primer on G-computation and its assumptions see: Naimi, Ashley I., Stephen R. Cole, and Edward H. Kennedy. "An introduction to g methods." International journal of epidemiology 46, no. 2 (2017): 756-762.
Examples
julia> x, y, t = rand(100, 5), vec(rand(1:100, 100, 1)),
Float64.([rand()<0.4 for i in 1:100])
julia> g_computer = GComputation(x, y, t, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> counterfactual_consistency(g_computer)
2.7653668647301795
CausalELM.exchangeability
— Function
exchangeability(model)
Test the sensitivity of a G-computation estimator, doubly robust estimator, or metalearner to a violation of the exchangeability assumption.
For more information on the E-value test see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, y, t = rand(100, 5), vec(rand(1:100, 100, 1)),
Float64.([rand()<0.4 for i in 1:100])
julia> g_computer = GComputation(x, y, t, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> exchangeability(g_computer)
1.13729886008143832
CausalELM.e_value
— Function
e_value(model)
Test the sensitivity of an estimator to a violation of the exchangeability assumption.
For more information on the E-value test see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, y, t = rand(100, 5), vec(rand(1:100, 100, 1)),
Float64.([rand()<0.4 for i in 1:100])
julia> g_computer = GComputation(x, y, t, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> e_value(g_computer)
2.2555405766985125
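For reference, VanderWeele and Ding's E-value has a closed form in the risk ratio RR, taking the reciprocal first when RR < 1:
# E-value for an estimated risk ratio (VanderWeele and Ding, 2017).
function evalue_sketch(rr)
    rr = rr < 1 ? 1 / rr : rr
    return rr + sqrt(rr * (rr - 1))
end
evalue_sketch(2.0)  # ≈ 3.41: a confounder would need RR ≈ 3.41 with both treatment and outcome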
CausalELM.binarize
— Function
binarize(x, cutoff)
Convert a vector of counts or a continuous vector to a binary vector.
Examples
julia> binarize([1, 2, 3], 2)
3-element Vector{Int64}:
0
0
1
CausalELM.risk_ratio
— Function
risk_ratio(model)
Calculate the risk ratio for an estimated model.
If the treatment variable is not binary and the outcome variable is not continuous then the treatment variable will be binarized.
For more information on how other quantities of interest are converted to risk ratios see: VanderWeele, Tyler J., and Peng Ding. "Sensitivity analysis in observational research: introducing the E-value." Annals of internal medicine 167, no. 4 (2017): 268-274.
Examples
julia> x, y, t = rand(100, 5), vec(rand(1:100, 100, 1)),
Float64.([rand()<0.4 for i in 1:100])
julia> g_computer = GComputation(x, y, t, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> risk_ratio(g_computer)
2.5320694766985125
CausalELM.positivity
— Function
positivity(model, min, max)
Find likely violations of the positivity assumption.
This method uses an extreme learning machine or regularized extreme learning machine to estimate probabilities of treatment. The returned matrix, which may be empty, contains the covariates that have a (near) zero probability of treatment or a (near) zero probability of being assigned to the control group, with their entry in the last column being their estimated treatment probability. In other words, these rows likely violate the positivity assumption.
Examples
julia> x, y, t = rand(100, 5), vec(rand(1:100, 100, 1)),
Float64.([rand()<0.4 for i in 1:100])
julia> g_computer = GComputation(x, y, t, temporal=false)
julia> estimate_causal_effect!(g_computer)
julia> positivity(g_computer)
0×5 Matrix{Float64}
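The filtering step reduces to keeping rows whose estimated propensity score falls outside an acceptable range; a sketch assuming propensity scores ps have already been estimated:
# Covariate rows with (near) zero probability of treatment or control,
# with the propensity score appended as the last column.
function positivity_sketch(X, ps; lo=1e-6, hi=1 - 1e-6)
    mask = (ps .<= lo) .| (ps .>= hi)
    return hcat(X[mask, :], ps[mask])
end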
CausalELM.sums_of_squares
— Function
sums_of_squares(data, num_classes)
Calculate the minimum sum of squares for each data point and class for the Jenks breaks algorithm.
Examples
julia> sums_of_squares([1, 2, 3, 4, 5], 2)
5×2 Matrix{Real}:
0.0 0.0
0.25 0.25
0.666667 0.666667
1.25 1.16667
2.0 1.75
CausalELM.class_pointers
— Function
class_pointers(data, num_classes, sums_of_sqs)
Compute class pointers that minimize the sum of squares for Jenks breaks.
Examples
julia> sums_squares = sums_of_squares([1, 2, 3, 4, 5], 2)
5×2 Matrix{Float64}:
0.0 0.0
0.25 0.25
0.666667 0.666667
1.25 1.16667
2.0 1.75
julia> class_pointers([1, 2, 3, 4, 5], 2, sums_squares)
5×2 Matrix{Int64}:
1 0
1 1
1 1
1 1
1 1
CausalELM.backtrack_to_find_breaks
— Function
backtrack_to_find_breaks(data, num_classes, sums_of_sqs)
Determine break points from class assignments.
Examples
julia> data = [1, 2, 3, 4, 5]
5-element Vector{Int64}:
1
2
3
4
5
julia> ptr = class_pointers([1, 2, 3, 4, 5], 2, sums_of_squares([1, 2, 3, 4, 5], 2))
5×2 Matrix{Int64}:
1 28
1 1
1 1
1 1
1 1
julia> backtrack_to_find_breaks([1, 2, 3, 4, 5], ptr)
2-element Vector{Int64}:
1
4
CausalELM.variance
— Function
variance(data)
Calculate the variance of some numbers.
Note that this function does not use Bessel's correction.
Examples
julia> variance([1, 2, 3, 4, 5])
2.0
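Without Bessel's correction, the variance is just the mean squared deviation, which reproduces the example above:
variance_sketch(x) = sum((x .- sum(x) / length(x)) .^ 2) / length(x)
variance_sketch([1, 2, 3, 4, 5])  # returns 2.0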
CausalELM.best_splits
— Function
best_splits(data, num_classes)
Find the best number of splits for Jenks breaks.
This function uses the elbow method: it chooses the number of splits that produces the greatest decrease in the slope of the line between its own GVF and the GVF of the next higher number of splits.
Examples
julia> best_splits(collect(1:10), 5)
10-element Vector{Int64}:
1
3
3
⋮
3
4
CausalELM.group_by_class
— Function
group_by_class(data, classes)
Group data points into vectors such that data points assigned to the same class are in the same vector.
Examples
julia> group_by_class([1, 2, 3, 4, 5], [1, 1, 1, 2, 3])
3-element Vector{Vector{Real}}:
[1, 2, 3]
[4]
[5]
CausalELM.jenks_breaks
— Function
jenks_breaks(data, num_classes)
Generate Jenks breaks for a vector of real numbers.
Examples
julia> jenks_breaks([1, 2, 3, 4, 5], 3)
3-element Vector{Int64}:
1
3
4
CausalELM.fake_treatments
— Function
fake_treatments(data, num_classes)
Generate fake treatment statuses corresponding to the classes assigned by the Jenks breaks algorithm.
Examples
julia> fake_treatments([1, 2, 3, 4, 5], 4)
5-element Vector{Int64}:
1
2
3
4
4
CausalELM.sdam
— Function
sdam(x)
Calculate the sum of squared deviations from the mean of an array.
Examples
julia> sdam([5, 4, 9, 10])
26.0
CausalELM.scdm
— Function
scdm(x)
Calculate the sum of squared deviations from the class means for a set of subarrays.
Examples
julia> scdm([[4], [5, 9, 10]])
14.0
CausalELM.gvf
— Function
gvf(x)
Calculate the goodness of variance fit for a set of sub vectors.
Examples
julia> gvf([[4, 5], [9, 10]])
0.9615384615384616
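The three quantities above fit together as GVF = (SDAM - SCDM) / SDAM. The re-implementations below (for illustration only, not the package's code) re-derive the example value:
sdam_sketch(x) = sum((x .- sum(x) / length(x)) .^ 2)
scdm_sketch(groups) = sum(sdam_sketch(g) for g in groups)
function gvf_sketch(groups)
    flat = reduce(vcat, groups)
    return (sdam_sketch(flat) - scdm_sketch(groups)) / sdam_sketch(flat)
end
gvf_sketch([[4, 5], [9, 10]])  # returns 0.9615384615384616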
Validation Metrics
CausalELM.mse
— Function
mse(y, ŷ)
Calculate the mean squared error.
See also mae.
Examples
julia> mse([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
0.0
julia> mse([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0])
4.0
CausalELM.mae
— Function
mae(y, ŷ)
Calculate the mean absolute error.
See also mse.
Examples
julia> mae([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0])
2.0
julia> mae([1.0, 1.0, 1.0], [2.0, 2.0, 2.0])
1.0
CausalELM.accuracy
— Function
accuracy(y, ŷ)
Calculate the accuracy for a classification task.
Examples
julia> accuracy([1, 1, 1, 1], [0, 1, 1, 0])
0.5
julia> accuracy([1, 2, 3, 4], [1, 1, 1, 1])
0.25
Base.precision
— Function
precision(y, ŷ)
Calculate the precision for a classification task.
See also recall.
Examples
julia> precision([0, 1, 0, 0], [0, 1, 1, 0])
0.5
julia> precision([0, 1, 0, 0], [0, 1, 0, 0])
1.0
CausalELM.recall
— Function
recall(y, ŷ)
Calculate the recall for a classification task.
See also precision.
Examples
julia> recall([1, 2, 1, 3, 0], [2, 2, 2, 3, 1])
0.5
julia> recall([1, 2, 1, 3, 2], [2, 2, 2, 3, 1])
1.0
CausalELM.F1
— Function
F1(y, ŷ)
Calculate the F1 score for a classification task.
Examples
julia> F1([1, 2, 1, 3, 0], [2, 2, 2, 3, 1])
0.4
julia> F1([1, 2, 1, 3, 2], [2, 2, 2, 3, 1])
0.47058823529411764
CausalELM.confusion_matrix
— Function
confusion_matrix(y, ŷ)
Generate a confusion matrix.
Examples
julia> confusion_matrix([1, 1, 1, 1, 0], [1, 1, 1, 1, 0])
2×2 Matrix{Int64}:
1 0
0 4
julia> confusion_matrix([1, 1, 1, 1, 0, 2], [1, 1, 1, 1, 0, 2])
3×3 Matrix{Int64}:
1 0 0
0 4 0
0 0 1
Extreme Learning Machines
CausalELM.fit!
— Function
fit!(model)
Fit an Extreme Learner.
For more details see: Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and applications." Neurocomputing 70, no. 1-3 (2006): 489-501.
Examples
julia> m1 = ExtremeLearner(x, y, 10, σ)
Extreme Learning Machine with 10 hidden neurons
julia> f1 = fit!(m1)
10-element Vector{Float64}:
-4.403356409043448
-5.577616954029608
-2.1732800642523595
⋮
-2.4741301876094655
40.642730531608635
-11.058942121275233
fit!(model)
Fit a Regularized Extreme Learner.
For more details see: Li, Guoqiang, and Peifeng Niu. "An enhanced extreme learning machine based on ridge regression for regression." Neural Computing and Applications 22, no. 3 (2013): 803-810.
Examples
julia> m1 = RegularizedExtremeLearner(x, y, 10, σ)
Regularized Extreme Learning Machine with 10 hidden neurons
julia> f1 = fit!(m1)
10-element Vector{Float64}:
-4.403356409043448
-5.577616954029608
-2.1732800642523595
⋮
-2.4741301876094655
40.642730531608635
-11.058942121275233
CausalELM.predict
— Function
predict(model, X)
Use an ExtremeLearningMachine to make predictions.
For more details see: Huang G-B, Zhu Q-Y, Siew C. Extreme learning machine: theory and applications. Neurocomputing. 2006;70:489–501. https://doi.org/10.1016/j.neucom.2005.12.126
Examples
julia> m1 = ExtremeLearner(x, y, 10, σ)
Extreme Learning Machine with 10 hidden neurons
julia> f1 = fit!(m1)
10-element Vector{Float64}:
-4.403356409043448
-5.577616954029608
-2.1732800642523595
⋮
-2.4741301876094655
40.642730531608635
-11.058942121275233
julia> predict(m1, [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0])
4-element Vector{Float64}:
9.811656638113011e-16
0.9999999999999962
-9.020553785284482e-17
0.9999999999999978
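For orientation, the vanilla ELM in Huang et al. draws random hidden weights, applies the activation, and solves for the output weights with a Moore-Penrose pseudoinverse. A self-contained sketch of that algorithm (not CausalELM's internals):
using LinearAlgebra

# Minimal extreme learning machine: random hidden layer, closed-form output weights.
function elm_fit(X, y, n_hidden, activation)
    W = randn(size(X, 2), n_hidden)  # random input weights
    b = randn(1, n_hidden)           # random biases
    H = activation.(X * W .+ b)      # hidden-layer activations
    β = pinv(H) * y                  # least-squares output weights
    return W, b, β
end

elm_predict(X, W, b, β, activation) = activation.(X * W .+ b) * β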
CausalELM.predict_counterfactual!
— Function
predict_counterfactual!(model, X)
Use an ExtremeLearningMachine to predict the counterfactual.
This should be run with the observed covariates. To use synthetic data for what-if scenarios, use predict.
See also predict
.
Examples
julia> m1 = ExtremeLearner(x, y, 10, σ)
Extreme Learning Machine with 10 hidden neurons
julia> f1 = fit!(m1)
10-element Vector{Float64}:
-4.403356409043448
-5.577616954029608
-2.1732800642523595
⋮
-2.4741301876094655
40.642730531608635
-11.058942121275233
julia> predict_counterfactual!(m1, [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0])
4-element Vector{Float64}:
9.811656638113011e-16
0.9999999999999962
-9.020553785284482e-17
0.9999999999999978
CausalELM.placebo_test
— Function
placebo_test(model)
Conduct a placebo test.
This method makes predictions for the post-event or post-treatment period using data from the pre-event or pre-treatment period and the post-event or post-treatment period. If there is a statistically significant difference between these predictions, the study design may be flawed. Due to the multitude of significance tests for time series data, this function returns the predictions but does not test for statistical significance.
Examples
julia> m1 = ExtremeLearner(x, y, 10, σ)
Extreme Learning Machine with 10 hidden neurons
julia> f1 = fit!(m1)
10-element Vector{Float64}:
-4.403356409043448
-5.577616954029608
-2.1732800642523595
⋮
-2.4741301876094655
40.642730531608635
-11.058942121275233
julia> predict_counterfactual!(m1, [1.0 1.0; 0.0 1.0; 0.0 0.0; 1.0 0.0])
4-element Vector{Float64}:
9.811656638113011e-16
0.9999999999999962
-9.020553785284482e-17
0.9999999999999978
julia> placebo_test(m1)
([9.811656638113011e-16, 0.9999999999999962, -9.020553785284482e-17, 0.9999999999999978],
[0.5, 0.4, 0.3, 0.2])
CausalELM.ridge_constant
— Function
ridge_constant(model)
Calculate the L2 penalty for a regularized extreme learning machine.
For more information see: Li, Guoqiang, and Peifeng Niu. "An enhanced extreme learning machine based on ridge regression for regression." Neural Computing and Applications 22, no. 3 (2013): 803-810.
Examples
julia> m1 = RegularizedExtremeLearner(x, y, 10, σ)
Regularized Extreme Learning Machine with 10 hidden neurons
julia> ridge_constant(m1)
0.26789338524662887
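Given a penalty λ, the regularized output weights solve the ridge system β = (HᵀH + λI)⁻¹Hᵀy. A sketch assuming the hidden-layer matrix H is already computed (Li and Niu's rule for choosing λ is not reproduced here):
using LinearAlgebra

# Ridge solution for ELM output weights given hidden activations H and penalty λ.
ridge_beta(H, y, λ) = (H' * H + λ * I) \ (H' * y)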
CausalELM.set_weights_biases
— Function
set_weights_biases(model)
Calculate the weights and biases for an extreme learning machine or regularized extreme learning machine.
For details see: Huang, Guang-Bin, Qin-Yu Zhu, and Chee-Kheong Siew. "Extreme learning machine: theory and applications." Neurocomputing 70, no. 1-3 (2006): 489-501.
Examples
julia> m1 = RegularizedExtremeLearner(x, y, 10, σ)
Regularized Extreme Learning Machine with 10 hidden neurons
julia> set_weights_biases(m1)
Utility Functions
Missing docstring for CausalELM.mean. Check Documenter's build log for details.
Missing docstring for CausalELM.var. Check Documenter's build log for details.
Missing docstring for CausalELM.consecutive. Check Documenter's build log for details.
Missing docstring for CausalELM.one_hot_encode. Check Documenter's build log for details.