Some examples of use-cases using the market models of the package

This short tutorial covers the very basic use cases to get you started with diseq. More usage details can be found in the documentation of the package.

Setup the environment

Load the required libraries.

library(diseq)
library(Formula)

Prepare the data. Normally this step is long and it depends on the nature of the data and the considered market. For this example, we will use simulated data. Although we could simulate data independently from the package, we will use the top-level simulation functionality of diseq to simplify the process. See the documentation of simulate_data for more information on the simulation functionality. Here, we simulate data using a data generating process for a market in disequilibrium with stochastic price dynamics.

nobs <- 2000
tobs <- 5

alpha_d <- -0.3
beta_d0 <- 6.8
beta_d <- c(0.3, -0.02)
eta_d <- c(0.6, -0.1)

alpha_s <- 0.6
beta_s0 <- 2.1
beta_s <- c(0.9)
eta_s <- c(-0.5, 0.2)

gamma <- 1.2
beta_p0 <- 0.9
beta_p <- c(-0.1)

sigma_d <- 1
sigma_s <- 1
sigma_p <- 1
rho_ds <- 0.0
rho_dp <- 0.0
rho_sp <- 0.0

seed <- 443

stochastic_adjustment_data <- simulate_data(
  "diseq_stochastic_adjustment", nobs, tobs,
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  gamma, beta_p0, beta_p,
  sigma_d = sigma_d, sigma_s = sigma_s, sigma_p = sigma_p,
  rho_ds = rho_ds, rho_dp = rho_dp, rho_sp = rho_sp,
  seed = seed
)

Estimate the models

Prepare the basic parameters for model initialization. The simulate_data call uses Q for the simulated traded quantity, P for the simulated prices, id for subject identification, and date for time identification. It automatically creates the demand specific variables Xd1 and Xd2, the supply specific variable Xs1, the common (i.e. both demand and supply) variables X1 and X2, and the price dynamics’ variable Xp1.

market_spec <-   Q | P | id | date ~ P + Xd1 + Xd2 + X1 + X2 | P + Xs1 + X1 + X2

The market specification has two be modified in two cases. For the diseq_directional, the price variable is removed from the supply equation because the separation rule of the model can only be used for markets with exclusively either inelastic demand or supply. For the diseq_stochastic_adjustment, the right hand side of the price dynamics equation is appended in the market specification.

By default the models are estimated by allowing the demand, supply, and price equations to have correlated error shocks. The default verbosity behavior is to display errors and warnings that might occur when estimating the models.

By default all models are estimated using full information maximum likelihood based on the "BFGS" optimization algorithm. The first equilibrium_model call modifies the estimation behavior and estimates the model using two stage least squares. The diseq_basic call modifies the default optimization behavior and estimates the model using the "Nelder-Mead" optimization methods.

Standard errors are by default assumed to be homoscedastic. The second equilibrium_model and diseq_deterministic_adjustment calls modify this behavior by calculating clustered standard errors based on the subject identifier, while the diseq_basic and diseq_directional calls modify it by calculating heteroscedastic standard errors via the sandwich estimator.

eqmdl_reg <- equilibrium_model(
  market_spec, stochastic_adjustment_data,
  estimation_options = list(method = "2SLS")
)
eqmdl_fit <- equilibrium_model(
    market_spec, stochastic_adjustment_data,
    estimation_options = list(standard_errors = c("id"))
)
bsmdl_fit <- diseq_basic(
    market_spec, stochastic_adjustment_data,
    estimation_options = list(
        method = "Nelder-Mead", control = list(maxit = 1e+5),
        standard_errors = "heteroscedastic"
    )
)
drmdl_fit <- diseq_directional(
    formula(update(Formula(market_spec), . ~ . | . - P)),
    stochastic_adjustment_data,
    estimation_options = list(standard_errors = "heteroscedastic")
)
damdl_fit <- diseq_deterministic_adjustment(
    market_spec, stochastic_adjustment_data,
    estimation_options = list(standard_errors = c("id"))
)
samdl_fit <- diseq_stochastic_adjustment(
    formula(update(Formula(market_spec), . ~ . | . | Xp1)),
    stochastic_adjustment_data,
    estimation_options = list(control = list(maxit = 1e+5))
)

Post estimation analysis

Marginal effects

Calculate marginal effects on the shortage probabilities. Diseq offers two marginal effect calls out of the box. The mean marginal effects and the marginal effects ate the mean. Marginal effects on the shortage probabilities are state-dependent. If the variable is only in the demand equation, the output name of the marginal effect is the variable name prefixed by D_. If the variable is only in the supply equation, the name of the marginal effect is the variable name prefixed by S_. If the variable is in both equations, then it is prefixed by B_.

variables <- c("P", "Xd1", "Xd2", "X1", "X2", "Xs1")

bsmdl_mme <- sapply(variables,
  function(v) shortage_probability_marginal(bsmdl_fit, v),
  USE.NAMES = FALSE
)
drmdl_mme <- sapply(variables,
  function(v) shortage_probability_marginal(drmdl_fit, v),
  USE.NAMES = FALSE
)
damdl_mme <- sapply(variables,
  function(v) shortage_probability_marginal(damdl_fit, v),
  USE.NAMES = FALSE
)
samdl_mme <- sapply(variables,
  function(v) shortage_probability_marginal(samdl_fit, v),
  USE.NAMES = FALSE
)
bsmdl_mem <- sapply(variables,
  function(v) {
    shortage_probability_marginal(bsmdl_fit, v, aggregate = "at_the_mean")
  },
  USE.NAMES = FALSE
)
drmdl_mem <- sapply(variables,
  function(v) {
    shortage_probability_marginal(drmdl_fit, v, aggregate = "at_the_mean")
  },
  USE.NAMES = FALSE
)
damdl_mem <- sapply(variables,
  function(v) {
    shortage_probability_marginal(damdl_fit, v, aggregate = "at_the_mean")
  },
  USE.NAMES = FALSE
)
samdl_mem <- sapply(variables,
  function(v) {
    shortage_probability_marginal(samdl_fit, v, aggregate = "at_the_mean")
  },
  USE.NAMES = FALSE
)

cbind(
  bsmdl_mme, drmdl_mme, damdl_mme, samdl_mme, bsmdl_mem, drmdl_mem, damdl_mem,
  samdl_mem
)
#>          bsmdl_mme    drmdl_mme     damdl_mme    samdl_mme   bsmdl_mem
#> B_P   -0.058233417  0.074489805 -0.1651059977 -0.199290256 -0.06399195
#> D_Xd1  0.084990788  0.055358542  0.0641363526  0.067995494  0.09339528
#> D_Xd2  0.009720347  0.006209734  0.0004054397  0.004403883  0.01068156
#> B_X1   0.181968214  0.117482602  0.1923432239  0.238684040  0.19996252
#> B_X2  -0.042143710 -0.021743702 -0.0444738050 -0.056832061 -0.04631118
#> S_Xs1 -0.265232209 -0.132090709 -0.1899577770 -0.206401099 -0.29146025
#>          drmdl_mem     damdl_mem    samdl_mem
#> B_P    0.079827913 -0.1925007197 -0.252675118
#> D_Xd1  0.059325660  0.0747779862  0.086209782
#> D_Xd2  0.006654738  0.0004727111  0.005583573
#> B_X1   0.125901672  0.2242572017  0.302621508
#> B_X2  -0.023301905 -0.0518529889 -0.072055945
#> S_Xs1 -0.141556628 -0.2214759566 -0.261690778

Shortages

Copy the disequilibrium model tibble and augment it with post-estimation data. The disequilibrium models can be used to estimate:

How is the sample separated post-estimation? The indices of the observations for which the estimated demand is greater than the estimated supply are easily obtained.

Summaries

All the model estimates support the summary function. The eq_2sls provides also the first stage estimation, but it is not included in the summary and has to be explicitly asked.

summary(eqmdl_reg@fit[[1]]$first_stage_model)
#> 
#> Call:
#> lm(formula = first_stage_formula, data = object@model_tibble)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -4.7599 -0.8768  0.0110  0.8875  4.0701 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  5.30801    0.14560  36.455  < 2e-16 ***
#> Xd1          0.16085    0.02596   6.196 6.02e-10 ***
#> Xd2         -0.01098    0.02570  -0.427    0.669    
#> X1           0.51332    0.02579  19.907  < 2e-16 ***
#> X2          -0.12027    0.02573  -4.674 3.00e-06 ***
#> Xs1         -0.43902    0.02591 -16.943  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.285 on 9994 degrees of freedom
#> Multiple R-squared:  0.06945,    Adjusted R-squared:  0.06898 
#> F-statistic: 149.2 on 5 and 9994 DF,  p-value: < 2.2e-16
summary(eqmdl_reg)
#> Equilibrium Model for Markets in Equilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 10000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> systemfit results 
#> method: 2SLS 
#> 
#>            N    DF     SSR detRCov   OLS-R2 McElroy-R2
#> system 20000 19989 48358.5  4.6502 -2.12978  -0.739797
#> 
#>            N   DF     SSR     MSE    RMSE       R2   Adj R2
#> demand 10000 9994 27698.3 2.77149 1.66478 -2.58529 -2.58708
#> supply 10000 9995 20660.2 2.06706 1.43773 -1.67428 -1.67535
#> 
#> The covariance matrix of the residuals
#>          demand   supply
#> demand  2.77149 -1.03857
#> supply -1.03857  2.06706
#> 
#> The correlations of the residuals
#>           demand    supply
#> demand  1.000000 -0.433914
#> supply -0.433914  1.000000
#> 
#> 
#> 2SLS estimates for 'demand' (equation 1)
#> Model Formula: Q ~ P + Xd1 + Xd2 + X1 + X2
#> <environment: 0x5649889a93b8>
#> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1
#> <environment: 0x5649889a93b8>
#> 
#>               Estimate Std. Error   t value   Pr(>|t|)    
#> (Intercept)  9.7549133  0.3627929  26.88838 < 2.22e-16 ***
#> P           -0.9575689  0.0764469 -12.52594 < 2.22e-16 ***
#> Xd1          0.3306545  0.0358080   9.23409 < 2.22e-16 ***
#> Xd2         -0.0012847  0.0333156  -0.03856  0.9692408    
#> X1           0.5778408  0.0516015  11.19815 < 2.22e-16 ***
#> X2          -0.0893816  0.0344208  -2.59673  0.0094254 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.66478 on 9994 degrees of freedom
#> Number of observations: 10000 Degrees of Freedom: 9994 
#> SSR: 27698.280302 MSE: 2.771491 Root MSE: 1.66478 
#> Multiple R-Squared: -2.585289 Adjusted R-Squared: -2.587082 
#> 
#> 
#> 2SLS estimates for 'supply' (equation 2)
#> Model Formula: Q ~ P + Xs1 + X1 + X2
#> <environment: 0x5649889a93b8>
#> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1
#> <environment: 0x5649889a93b8>
#> 
#>               Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept) -1.0494025  1.0315906 -1.01727    0.30905    
#> P            1.0884357  0.1800529  6.04509 1.5467e-09 ***
#> Xs1          0.8985423  0.0842354 10.66703 < 2.22e-16 ***
#> X1          -0.4723384  0.0968275 -4.87814 1.0874e-06 ***
#> X2           0.1566904  0.0361420  4.33541 1.4691e-05 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.437727 on 9995 degrees of freedom
#> Number of observations: 10000 Degrees of Freedom: 9995 
#> SSR: 20660.242409 MSE: 2.067058 Root MSE: 1.437727 
#> Multiple R-Squared: -1.674279 Adjusted R-Squared: -1.675349
summary(eqmdl_fit)
#> Equilibrium Model for Markets in Equilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 10000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : BFGS
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    0.13815    5.15047    0.15414    0.01746    0.01405    0.03411    0.17315 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    4.15459    0.49648   -0.00247    0.04558    1.00000    1.00000    0.00000 
#> 
#> Coefficients
#>            Estimate Std. Error  z value      Pr(z)
#> D_P        -1.01247    0.09858 -10.2708  9.545e-25
#> D_CONST    10.89751    0.47798  22.7991 4.679e-115
#> D_Xd1      -0.01743    0.01240  -1.4053  1.599e-01
#> D_Xd2      -0.00890    0.01151  -0.7736  4.392e-01
#> D_X1        0.60912    0.06310   9.6530  4.776e-22
#> D_X2       -0.10000    0.03720  -2.6880  7.187e-03
#> S_P        -0.41615    0.01485 -28.0313 6.748e-173
#> S_CONST     7.46659    0.03056 244.2868  0.000e+00
#> S_Xs1       0.25233    0.02374  10.6298  2.167e-26
#> S_X1        0.30159    0.02790  10.8096  3.102e-27
#> S_X2       -0.02675    0.02167  -1.2343  2.171e-01
#> D_VARIANCE  3.01149    0.39228   7.6770  1.629e-14
#> S_VARIANCE  1.25304    0.03285  38.1432  0.000e+00
#> RHO         0.94562    0.01006  93.9638  0.000e+00
#> 
#> -2 log L: 57914.5
summary(bsmdl_fit)
#> Basic Model for Markets in Disequilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 10000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : Nelder-Mead
#>   Max Iterations      : 100000
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    0.13815    5.15047    0.15414    0.01746    0.01405    0.03411    0.17315 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    4.15459    0.49648   -0.00247    0.04558    1.00000    1.00000    0.00000 
#> 
#> Coefficients
#>            Estimate Std. Error z value      Pr(z)
#> D_P         0.06505    0.02562  2.5387  1.113e-02
#> D_CONST     5.12574    0.21738 23.5795 6.252e-123
#> D_Xd1       0.29708    0.05483  5.4184  6.013e-08
#> D_Xd2       0.03398    0.03204  1.0604  2.889e-01
#> D_X1        0.32469    0.10101  3.2145  1.307e-03
#> D_X2       -0.03387    0.06251 -0.5418  5.879e-01
#> S_P         0.26860    0.02541 10.5717  4.031e-26
#> S_CONST     3.68315    0.25270 14.5749  4.056e-48
#> S_Xs1       0.92709    0.14013  6.6161  3.687e-11
#> S_X1       -0.31137    0.06729 -4.6271  3.708e-06
#> S_X2        0.11344    0.05179  2.1905  2.849e-02
#> D_VARIANCE  0.82064    0.03941 20.8252  2.557e-96
#> S_VARIANCE  0.99711    0.15989  6.2362  4.485e-10
#> RHO         0.11547    0.15802  0.7308  4.649e-01
#> 
#> -2 log L: 24362.3
summary(damdl_fit)
#> Deterministic Adjustment Model for Markets in Disequilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Separation Rule   : P_DIFF analogous to (D_Q - S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 8000
#>   Sample Separation : Demand Obs = 2540, Supply Obs = 5460
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : BFGS
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    0.10164    5.23014    0.16768    0.01574    0.09549    0.01532    0.14235 
#>    S_CONST      S_Xs1       S_X1       S_X2     P_DIFF D_VARIANCE S_VARIANCE 
#>    4.38416    0.43711    0.07416    0.02458    1.00000    1.00000    1.00000 
#>        RHO 
#>    0.00000 
#> 
#> Coefficients
#>            Estimate Std. Error   z value      Pr(z)
#> D_P        -0.18260    0.01688 -10.81970  2.777e-27
#> D_CONST     6.69333    0.12727  52.59287  0.000e+00
#> D_Xd1       0.16451    0.02071   7.94236  1.984e-15
#> D_Xd2       0.00104    0.01802   0.05773  9.540e-01
#> D_X1        0.43124    0.02687  16.04867  5.841e-58
#> D_X2       -0.06011    0.02328  -2.58213  9.819e-03
#> S_P         0.24090    0.01519  15.86368  1.131e-56
#> S_CONST     4.06253    0.12682  32.03405 3.662e-225
#> S_Xs1       0.48725    0.02328  20.93164  2.758e-97
#> S_X1       -0.06213    0.02177  -2.85453  4.310e-03
#> S_X2        0.05397    0.01886   2.86180  4.212e-03
#> P_DIFF      0.52648    0.02202  23.91277 2.256e-126
#> D_VARIANCE  1.02178    0.03318  30.79873 2.725e-208
#> S_VARIANCE  0.69727    0.01339  52.06415  0.000e+00
#> RHO         0.60944    0.02733  22.29848 3.823e-110
#> 
#> -2 log L: 39800.3
summary(samdl_fit)
#> Stochastic Adjustment Model for Markets in Disequilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Price Dynamics RHS: (D_Q - S_Q) + Xp1
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 8000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : BFGS
#>   Max Iterations      : 100000
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    0.10164    5.23014    0.16768    0.01574    0.09549    0.01532    0.14235 
#>    S_CONST      S_Xs1       S_X1       S_X2     P_DIFF    P_CONST      P_Xp1 
#>    4.38416    0.43711    0.07416    0.02458   14.18100    3.31715   -0.98643 
#> D_VARIANCE S_VARIANCE P_VARIANCE     RHO_DS     RHO_DP     RHO_SP 
#>    1.00000    1.00000    1.00000    0.00000    0.00000    0.00000 
#> 
#> Coefficients
#>             Estimate Std. Error   z value     Pr(z)
#> D_P        -0.283818    0.02445 -11.60957 3.685e-31
#> D_CONST     6.670856    0.16883  39.51312 0.000e+00
#> D_Xd1       0.295357    0.02671  11.05871 1.989e-28
#> D_Xd2       0.019129    0.02557   0.74827 4.543e-01
#> D_X1        0.572856    0.03683  15.55230 1.535e-54
#> D_X2       -0.092642    0.02612  -3.54740 3.891e-04
#> S_P         0.581854    0.04519  12.87565 6.172e-38
#> S_CONST     2.178987    0.24712   8.81755 1.170e-18
#> S_Xs1       0.896560    0.05190  17.27636 7.088e-67
#> S_X1       -0.463933    0.05749  -8.06931 7.070e-16
#> S_X2        0.154224    0.02928   5.26759 1.382e-07
#> P_DIFF      1.126426    0.06007  18.75213 1.860e-78
#> P_CONST     0.915931    0.11227   8.15807 3.404e-16
#> P_Xp1      -0.122637    0.03249  -3.77465 1.602e-04
#> D_VARIANCE  0.986643    0.05123  19.25817 1.206e-82
#> S_VARIANCE  1.005088    0.08611  11.67199 1.772e-31
#> P_VARIANCE  0.898735    0.10383   8.65587 4.892e-18
#> RHO_DS      0.066835    0.09213   0.72546 4.682e-01
#> RHO_DP      0.005708    0.06837   0.08349 9.335e-01
#> RHO_SP     -0.006082    0.06801  -0.08943 9.287e-01
#> 
#> -2 log L: 39505.1

The estimated demanded and supplied quantities can be calculated per observation.

The package offers also basic aggregation functionality.