# Some examples of use-cases using the market models of the package

This short tutorial covers the very basic use cases to get you started with diseq. More usage details can be found in the documentation of the package.

## Setup the environment

Load the required libraries.

library(diseq)
library(Formula)

Prepare the data. Normally this step is long and it depends on the nature of the data and the considered market. For this example, we will use simulated data. Although we could simulate data independently from the package, we will use the top-level simulation functionality of diseq to simplify the process. See the documentation of simulate_data for more information on the simulation functionality. Here, we simulate data using a data generating process for a market in disequilibrium with stochastic price dynamics.

nobs <- 2000
tobs <- 5

alpha_d <- -0.3
beta_d0 <- 6.8
beta_d <- c(0.3, -0.02)
eta_d <- c(0.6, -0.1)

alpha_s <- 0.6
beta_s0 <- 2.1
beta_s <- c(0.9)
eta_s <- c(-0.5, 0.2)

gamma <- 1.2
beta_p0 <- 0.9
beta_p <- c(-0.1)

sigma_d <- 1
sigma_s <- 1
sigma_p <- 1
rho_ds <- 0.0
rho_dp <- 0.0
rho_sp <- 0.0

seed <- 443

alpha_d, beta_d0, beta_d, eta_d,
alpha_s, beta_s0, beta_s, eta_s,
gamma, beta_p0, beta_p,
sigma_d = sigma_d, sigma_s = sigma_s, sigma_p = sigma_p,
rho_ds = rho_ds, rho_dp = rho_dp, rho_sp = rho_sp,
seed = seed
)

## Estimate the models

Prepare the basic parameters for model initialization. The simulate_data call uses Q for the simulated traded quantity, P for the simulated prices, id for subject identification, and date for time identification. It automatically creates the demand specific variables Xd1 and Xd2, the supply specific variable Xs1, the common (i.e. both demand and supply) variables X1 and X2, and the price dynamics’ variable Xp1.

market_spec <-   Q | P | id | date ~ P + Xd1 + Xd2 + X1 + X2 | P + Xs1 + X1 + X2

The market specification has two be modified in two cases. For the diseq_directional, the price variable is removed from the supply equation because the separation rule of the model can only be used for markets with exclusively either inelastic demand or supply. For the diseq_stochastic_adjustment, the right hand side of the price dynamics equation is appended in the market specification.

By default the models are estimated by allowing the demand, supply, and price equations to have correlated error shocks. The default verbosity behavior is to display errors and warnings that might occur when estimating the models.

By default all models are estimated using full information maximum likelihood based on the "BFGS" optimization algorithm. The first equilibrium_model call modifies the estimation behavior and estimates the model using two stage least squares. The diseq_basic call modifies the default optimization behavior and estimates the model using the "Nelder-Mead" optimization methods.

Standard errors are by default assumed to be homoscedastic. The second equilibrium_model and diseq_deterministic_adjustment calls modify this behavior by calculating clustered standard errors based on the subject identifier, while the diseq_basic and diseq_directional calls modify it by calculating heteroscedastic standard errors via the sandwich estimator.

eqmdl_reg <- equilibrium_model(
estimation_options = list(method = "2SLS")
)
eqmdl_fit <- equilibrium_model(
estimation_options = list(standard_errors = c("id"))
)
bsmdl_fit <- diseq_basic(
estimation_options = list(
method = "Nelder-Mead", control = list(maxit = 1e+5),
standard_errors = "heteroscedastic"
)
)
drmdl_fit <- diseq_directional(
formula(update(Formula(market_spec), . ~ . | . - P)),
estimation_options = list(standard_errors = "heteroscedastic")
)
estimation_options = list(standard_errors = c("id"))
)
formula(update(Formula(market_spec), . ~ . | . | Xp1)),
estimation_options = list(control = list(maxit = 1e+5))
)

## Post estimation analysis

### Marginal effects

Calculate marginal effects on the shortage probabilities. Diseq offers two marginal effect calls out of the box. The mean marginal effects and the marginal effects ate the mean. Marginal effects on the shortage probabilities are state-dependent. If the variable is only in the demand equation, the output name of the marginal effect is the variable name prefixed by D_. If the variable is only in the supply equation, the name of the marginal effect is the variable name prefixed by S_. If the variable is in both equations, then it is prefixed by B_.

variables <- c("P", "Xd1", "Xd2", "X1", "X2", "Xs1")

bsmdl_mme <- sapply(variables,
function(v) shortage_probability_marginal(bsmdl_fit, v),
USE.NAMES = FALSE
)
drmdl_mme <- sapply(variables,
function(v) shortage_probability_marginal(drmdl_fit, v),
USE.NAMES = FALSE
)
damdl_mme <- sapply(variables,
function(v) shortage_probability_marginal(damdl_fit, v),
USE.NAMES = FALSE
)
samdl_mme <- sapply(variables,
function(v) shortage_probability_marginal(samdl_fit, v),
USE.NAMES = FALSE
)
bsmdl_mem <- sapply(variables,
function(v) {
shortage_probability_marginal(bsmdl_fit, v, aggregate = "at_the_mean")
},
USE.NAMES = FALSE
)
drmdl_mem <- sapply(variables,
function(v) {
shortage_probability_marginal(drmdl_fit, v, aggregate = "at_the_mean")
},
USE.NAMES = FALSE
)
damdl_mem <- sapply(variables,
function(v) {
shortage_probability_marginal(damdl_fit, v, aggregate = "at_the_mean")
},
USE.NAMES = FALSE
)
samdl_mem <- sapply(variables,
function(v) {
shortage_probability_marginal(samdl_fit, v, aggregate = "at_the_mean")
},
USE.NAMES = FALSE
)

cbind(
bsmdl_mme, drmdl_mme, damdl_mme, samdl_mme, bsmdl_mem, drmdl_mem, damdl_mem,
samdl_mem
)
#>          bsmdl_mme    drmdl_mme     damdl_mme    samdl_mme   bsmdl_mem
#> B_P   -0.058233417  0.074489805 -0.1651059977 -0.199290256 -0.06399195
#> D_Xd1  0.084990788  0.055358542  0.0641363526  0.067995494  0.09339528
#> D_Xd2  0.009720347  0.006209734  0.0004054397  0.004403883  0.01068156
#> B_X1   0.181968214  0.117482602  0.1923432239  0.238684040  0.19996252
#> B_X2  -0.042143710 -0.021743702 -0.0444738050 -0.056832061 -0.04631118
#> S_Xs1 -0.265232209 -0.132090709 -0.1899577770 -0.206401099 -0.29146025
#>          drmdl_mem     damdl_mem    samdl_mem
#> B_P    0.079827913 -0.1925007197 -0.252675118
#> D_Xd1  0.059325660  0.0747779862  0.086209782
#> D_Xd2  0.006654738  0.0004727111  0.005583573
#> B_X1   0.125901672  0.2242572017  0.302621508
#> B_X2  -0.023301905 -0.0518529889 -0.072055945
#> S_Xs1 -0.141556628 -0.2214759566 -0.261690778

### Shortages

Copy the disequilibrium model tibble and augment it with post-estimation data. The disequilibrium models can be used to estimate:

• Shortage probabilities. These are the probabilities that the disequilibrium models assign to observing a particular extent of excess demand.

• Normalized shortages. The point estimates of the shortages are normalized by the variance of the difference of the shocks of demand and supply.

• Relative shortages: The point estimates of the shortages are normalized by the estimated supplied quantity.

mdt <- tibble::add_column(
bsmdl_fit@model_tibble,
normalized_shortages = c(normalized_shortages(bsmdl_fit)),
shortage_probabilities = c(shortage_probabilities(bsmdl_fit)),
relative_shortages = c(relative_shortages(bsmdl_fit))
)

How is the sample separated post-estimation? The indices of the observations for which the estimated demand is greater than the estimated supply are easily obtained.

abs_fitsep <- c(
nobs = length(shortage_indicators(bsmdl_fit)),
nshortages = sum(shortage_indicators(bsmdl_fit)),
nsurpluses = sum(!shortage_indicators(bsmdl_fit))
)
print(abs_fitsep)
#>       nobs nshortages nsurpluses
#>      10000       5202       4798

rel_fitsep <- abs_fitsep / abs_fitsep["nobs"]
names(rel_fitsep) <- c("total", "shortages_share", "surpluses_share")
print(rel_fitsep)
#>           total shortages_share surpluses_share
#>          1.0000          0.5202          0.4798

if (requireNamespace("ggplot2", quietly = TRUE)) {
ggplot2::ggplot(mdt, ggplot2::aes(shortage_probabilities)) +
ggplot2::geom_density() +
ggplot2::ggtitle(paste0(
"Normalized shortages density (",
model_name(bsmdl_fit), ")"
)) +
ggplot2::theme(
panel.background = ggplot2::element_rect(fill = "transparent"),
plot.background = ggplot2::element_rect(fill = "transparent", color = NA),
)
}

### Summaries

All the model estimates support the summary function. The eq_2sls provides also the first stage estimation, but it is not included in the summary and has to be explicitly asked.

summary(eqmdl_reg@fit[[1]]$first_stage_model) #> #> Call: #> lm(formula = first_stage_formula, data = object@model_tibble) #> #> Residuals: #> Min 1Q Median 3Q Max #> -4.7599 -0.8768 0.0110 0.8875 4.0701 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 5.30801 0.14560 36.455 < 2e-16 *** #> Xd1 0.16085 0.02596 6.196 6.02e-10 *** #> Xd2 -0.01098 0.02570 -0.427 0.669 #> X1 0.51332 0.02579 19.907 < 2e-16 *** #> X2 -0.12027 0.02573 -4.674 3.00e-06 *** #> Xs1 -0.43902 0.02591 -16.943 < 2e-16 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> Residual standard error: 1.285 on 9994 degrees of freedom #> Multiple R-squared: 0.06945, Adjusted R-squared: 0.06898 #> F-statistic: 149.2 on 5 and 9994 DF, p-value: < 2.2e-16 summary(eqmdl_reg) #> Equilibrium Model for Markets in Equilibrium #> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2 #> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2 #> Market Clearing : Q = D_Q = S_Q #> Shocks : Correlated #> Nobs : 10000 #> Sample Separation : Not Separated #> Quantity Var : Q #> Price Var : P #> Key Var(s) : id, date #> Time Var : date #> #> systemfit results #> method: 2SLS #> #> N DF SSR detRCov OLS-R2 McElroy-R2 #> system 20000 19989 48358.5 4.6502 -2.12978 -0.739797 #> #> N DF SSR MSE RMSE R2 Adj R2 #> demand 10000 9994 27698.3 2.77149 1.66478 -2.58529 -2.58708 #> supply 10000 9995 20660.2 2.06706 1.43773 -1.67428 -1.67535 #> #> The covariance matrix of the residuals #> demand supply #> demand 2.77149 -1.03857 #> supply -1.03857 2.06706 #> #> The correlations of the residuals #> demand supply #> demand 1.000000 -0.433914 #> supply -0.433914 1.000000 #> #> #> 2SLS estimates for 'demand' (equation 1) #> Model Formula: Q ~ P + Xd1 + Xd2 + X1 + X2 #> <environment: 0x5649889a93b8> #> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1 #> <environment: 0x5649889a93b8> #> #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 9.7549133 0.3627929 26.88838 < 2.22e-16 *** #> P -0.9575689 0.0764469 -12.52594 < 2.22e-16 *** #> Xd1 0.3306545 0.0358080 9.23409 < 2.22e-16 *** #> Xd2 -0.0012847 0.0333156 -0.03856 0.9692408 #> X1 0.5778408 0.0516015 11.19815 < 2.22e-16 *** #> X2 -0.0893816 0.0344208 -2.59673 0.0094254 ** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> Residual standard error: 1.66478 on 9994 degrees of freedom #> Number of observations: 10000 Degrees of Freedom: 9994 #> SSR: 27698.280302 MSE: 2.771491 Root MSE: 1.66478 #> Multiple R-Squared: -2.585289 Adjusted R-Squared: -2.587082 #> #> #> 2SLS estimates for 'supply' (equation 2) #> Model Formula: Q ~ P + Xs1 + X1 + X2 #> <environment: 0x5649889a93b8> #> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1 #> <environment: 0x5649889a93b8> #> #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -1.0494025 1.0315906 -1.01727 0.30905 #> P 1.0884357 0.1800529 6.04509 1.5467e-09 *** #> Xs1 0.8985423 0.0842354 10.66703 < 2.22e-16 *** #> X1 -0.4723384 0.0968275 -4.87814 1.0874e-06 *** #> X2 0.1566904 0.0361420 4.33541 1.4691e-05 *** #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> Residual standard error: 1.437727 on 9995 degrees of freedom #> Number of observations: 10000 Degrees of Freedom: 9995 #> SSR: 20660.242409 MSE: 2.067058 Root MSE: 1.437727 #> Multiple R-Squared: -1.674279 Adjusted R-Squared: -1.675349 summary(eqmdl_fit) #> Equilibrium Model for Markets in Equilibrium #> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2 #> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2 #> Market Clearing : Q = D_Q = S_Q #> Shocks : Correlated #> Nobs : 10000 #> Sample Separation : Not Separated #> Quantity Var : Q #> Price Var : P #> Key Var(s) : id, date #> Time Var : date #> #> Maximum likelihood estimation #> Method : BFGS #> Convergence Status : success #> Starting Values : #> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P #> 0.13815 5.15047 0.15414 0.01746 0.01405 0.03411 0.17315 #> S_CONST S_Xs1 S_X1 S_X2 D_VARIANCE S_VARIANCE RHO #> 4.15459 0.49648 -0.00247 0.04558 1.00000 1.00000 0.00000 #> #> Coefficients #> Estimate Std. Error z value Pr(z) #> D_P -1.01247 0.09858 -10.2708 9.545e-25 #> D_CONST 10.89751 0.47798 22.7991 4.679e-115 #> D_Xd1 -0.01743 0.01240 -1.4053 1.599e-01 #> D_Xd2 -0.00890 0.01151 -0.7736 4.392e-01 #> D_X1 0.60912 0.06310 9.6530 4.776e-22 #> D_X2 -0.10000 0.03720 -2.6880 7.187e-03 #> S_P -0.41615 0.01485 -28.0313 6.748e-173 #> S_CONST 7.46659 0.03056 244.2868 0.000e+00 #> S_Xs1 0.25233 0.02374 10.6298 2.167e-26 #> S_X1 0.30159 0.02790 10.8096 3.102e-27 #> S_X2 -0.02675 0.02167 -1.2343 2.171e-01 #> D_VARIANCE 3.01149 0.39228 7.6770 1.629e-14 #> S_VARIANCE 1.25304 0.03285 38.1432 0.000e+00 #> RHO 0.94562 0.01006 93.9638 0.000e+00 #> #> -2 log L: 57914.5 summary(bsmdl_fit) #> Basic Model for Markets in Disequilibrium #> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2 #> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2 #> Short Side Rule : Q = min(D_Q, S_Q) #> Shocks : Correlated #> Nobs : 10000 #> Sample Separation : Not Separated #> Quantity Var : Q #> Price Var : P #> Key Var(s) : id, date #> Time Var : date #> #> Maximum likelihood estimation #> Method : Nelder-Mead #> Max Iterations : 100000 #> Convergence Status : success #> Starting Values : #> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P #> 0.13815 5.15047 0.15414 0.01746 0.01405 0.03411 0.17315 #> S_CONST S_Xs1 S_X1 S_X2 D_VARIANCE S_VARIANCE RHO #> 4.15459 0.49648 -0.00247 0.04558 1.00000 1.00000 0.00000 #> #> Coefficients #> Estimate Std. Error z value Pr(z) #> D_P 0.06505 0.02562 2.5387 1.113e-02 #> D_CONST 5.12574 0.21738 23.5795 6.252e-123 #> D_Xd1 0.29708 0.05483 5.4184 6.013e-08 #> D_Xd2 0.03398 0.03204 1.0604 2.889e-01 #> D_X1 0.32469 0.10101 3.2145 1.307e-03 #> D_X2 -0.03387 0.06251 -0.5418 5.879e-01 #> S_P 0.26860 0.02541 10.5717 4.031e-26 #> S_CONST 3.68315 0.25270 14.5749 4.056e-48 #> S_Xs1 0.92709 0.14013 6.6161 3.687e-11 #> S_X1 -0.31137 0.06729 -4.6271 3.708e-06 #> S_X2 0.11344 0.05179 2.1905 2.849e-02 #> D_VARIANCE 0.82064 0.03941 20.8252 2.557e-96 #> S_VARIANCE 0.99711 0.15989 6.2362 4.485e-10 #> RHO 0.11547 0.15802 0.7308 4.649e-01 #> #> -2 log L: 24362.3 summary(damdl_fit) #> Deterministic Adjustment Model for Markets in Disequilibrium #> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2 #> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2 #> Short Side Rule : Q = min(D_Q, S_Q) #> Separation Rule : P_DIFF analogous to (D_Q - S_Q) #> Shocks : Correlated #> Nobs : 8000 #> Sample Separation : Demand Obs = 2540, Supply Obs = 5460 #> Quantity Var : Q #> Price Var : P #> Key Var(s) : id, date #> Time Var : date #> #> Maximum likelihood estimation #> Method : BFGS #> Convergence Status : success #> Starting Values : #> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P #> 0.10164 5.23014 0.16768 0.01574 0.09549 0.01532 0.14235 #> S_CONST S_Xs1 S_X1 S_X2 P_DIFF D_VARIANCE S_VARIANCE #> 4.38416 0.43711 0.07416 0.02458 1.00000 1.00000 1.00000 #> RHO #> 0.00000 #> #> Coefficients #> Estimate Std. Error z value Pr(z) #> D_P -0.18260 0.01688 -10.81970 2.777e-27 #> D_CONST 6.69333 0.12727 52.59287 0.000e+00 #> D_Xd1 0.16451 0.02071 7.94236 1.984e-15 #> D_Xd2 0.00104 0.01802 0.05773 9.540e-01 #> D_X1 0.43124 0.02687 16.04867 5.841e-58 #> D_X2 -0.06011 0.02328 -2.58213 9.819e-03 #> S_P 0.24090 0.01519 15.86368 1.131e-56 #> S_CONST 4.06253 0.12682 32.03405 3.662e-225 #> S_Xs1 0.48725 0.02328 20.93164 2.758e-97 #> S_X1 -0.06213 0.02177 -2.85453 4.310e-03 #> S_X2 0.05397 0.01886 2.86180 4.212e-03 #> P_DIFF 0.52648 0.02202 23.91277 2.256e-126 #> D_VARIANCE 1.02178 0.03318 30.79873 2.725e-208 #> S_VARIANCE 0.69727 0.01339 52.06415 0.000e+00 #> RHO 0.60944 0.02733 22.29848 3.823e-110 #> #> -2 log L: 39800.3 summary(samdl_fit) #> Stochastic Adjustment Model for Markets in Disequilibrium #> Demand RHS : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2 #> Supply RHS : S_P + S_Xs1 + S_X1 + S_X2 #> Price Dynamics RHS: (D_Q - S_Q) + Xp1 #> Short Side Rule : Q = min(D_Q, S_Q) #> Shocks : Correlated #> Nobs : 8000 #> Sample Separation : Not Separated #> Quantity Var : Q #> Price Var : P #> Key Var(s) : id, date #> Time Var : date #> #> Maximum likelihood estimation #> Method : BFGS #> Max Iterations : 100000 #> Convergence Status : success #> Starting Values : #> D_P D_CONST D_Xd1 D_Xd2 D_X1 D_X2 S_P #> 0.10164 5.23014 0.16768 0.01574 0.09549 0.01532 0.14235 #> S_CONST S_Xs1 S_X1 S_X2 P_DIFF P_CONST P_Xp1 #> 4.38416 0.43711 0.07416 0.02458 14.18100 3.31715 -0.98643 #> D_VARIANCE S_VARIANCE P_VARIANCE RHO_DS RHO_DP RHO_SP #> 1.00000 1.00000 1.00000 0.00000 0.00000 0.00000 #> #> Coefficients #> Estimate Std. Error z value Pr(z) #> D_P -0.283818 0.02445 -11.60957 3.685e-31 #> D_CONST 6.670856 0.16883 39.51312 0.000e+00 #> D_Xd1 0.295357 0.02671 11.05871 1.989e-28 #> D_Xd2 0.019129 0.02557 0.74827 4.543e-01 #> D_X1 0.572856 0.03683 15.55230 1.535e-54 #> D_X2 -0.092642 0.02612 -3.54740 3.891e-04 #> S_P 0.581854 0.04519 12.87565 6.172e-38 #> S_CONST 2.178987 0.24712 8.81755 1.170e-18 #> S_Xs1 0.896560 0.05190 17.27636 7.088e-67 #> S_X1 -0.463933 0.05749 -8.06931 7.070e-16 #> S_X2 0.154224 0.02928 5.26759 1.382e-07 #> P_DIFF 1.126426 0.06007 18.75213 1.860e-78 #> P_CONST 0.915931 0.11227 8.15807 3.404e-16 #> P_Xp1 -0.122637 0.03249 -3.77465 1.602e-04 #> D_VARIANCE 0.986643 0.05123 19.25817 1.206e-82 #> S_VARIANCE 1.005088 0.08611 11.67199 1.772e-31 #> P_VARIANCE 0.898735 0.10383 8.65587 4.892e-18 #> RHO_DS 0.066835 0.09213 0.72546 4.682e-01 #> RHO_DP 0.005708 0.06837 0.08349 9.335e-01 #> RHO_SP -0.006082 0.06801 -0.08943 9.287e-01 #> #> -2 log L: 39505.1 The estimated demanded and supplied quantities can be calculated per observation. market <- cbind( demand = demanded_quantities(bsmdl_fit)[, 1], supply = supplied_quantities(bsmdl_fit)[, 1] ) summary(market) #> demand supply #> Min. :6.103 Min. :4.863 #> 1st Qu.:6.873 1st Qu.:6.638 #> Median :7.043 Median :7.010 #> Mean :7.042 Mean :7.003 #> 3rd Qu.:7.209 3rd Qu.:7.369 #> Max. :8.093 Max. :9.137 The package offers also basic aggregation functionality. aggregates <- c( demand = aggregate_demand(bsmdl_fit), supply = aggregate_supply(bsmdl_fit) ) aggregates #>$demand.date
#> [1] 1 2 3 4 5
#> Levels: 1 2 3 4 5
#>
#> $demand.D_Q #> [1] 13913.91 14052.51 14115.25 14152.94 14181.93 #> #>$supply.date
#> [1] 1 2 3 4 5
#> Levels: 1 2 3 4 5
#>
#> \$supply.S_Q
#> [1] 13277.40 13851.89 14140.81 14351.21 14412.80