An example of market-clearing assessment

This short tutorial gives an example of how one can statistically assess whether a market is in an equilibrium state. The tutorial assumes some familiarity with the concepts and the functionality of the package. The basic_usage vignette can be helpful in acquiring this familiarity.

Setup the environment

Load the required libraries.

library(diseq)
library(magrittr)

Prepare the data. Here, we simply simulate data using a data generating process for a market in equilibrium.

nobs <- 1000
tobs <- 5

alpha_d <- -3.9
beta_d0 <- 18.9
beta_d <- c(2.1, -0.7)
eta_d <- c(3.5, 6.25)

alpha_s <- 2.8
beta_s0 <- 3.2
beta_s <- c(2.65)
eta_s <- c(1.15, 4.2)

sigma_d <- 0.8
sigma_s <- 1.1
rho_ds <- 0.0

seed <- 42

eq_data <- simulate_data(
  "equilibrium_model", nobs, tobs,
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  NA, NA, c(NA),
  sigma_d = sigma_d, sigma_s = sigma_s, rho_ds = rho_ds,
  seed = seed
)

Estimate the models

Prepare the basic parameters for model initialization.

verbose <- 2
correlated_shocks <- TRUE
formula <-   Q | P | id | date ~ P + Xd1 + Xd2 + X1 + X2 | P + Xs1 + X1 + X2

Set the estimation parameters.

optimization_method <- "BFGS"
optimization_options <- list(maxit = 10000, reltol = 1e-8)

Using the above parameterization, construct and estimate the model objects. Here we estimate two equilibrium models and four disequilibrium models. All the models are constructed using the simulated data from a model of market in equilibrium.

eqmdl_reg <- equilibrium_model(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(method = "2SLS")
)
#> Info: This is 'Equilibrium with correlated shocks' model
#> Warning: Removing unobserved '1' level(s).
eqmdl_fit <- equilibrium_model(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info: This is 'Equilibrium with correlated shocks' model
#> Warning: Removing unobserved '1' level(s).
bsmdl_fit <- diseq_basic(
  formula, eq_data[eq_data$date != 1, ],
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info: This is 'Basic with correlated shocks' model
#> Warning: Removing unobserved '1' level(s).
damdl_fit <- diseq_deterministic_adjustment(
  formula, eq_data,
  correlated_shocks = correlated_shocks, verbose = verbose,
  estimation_options = list(
    control = optimization_options, method = optimization_method
  )
)
#> Info: This is 'Deterministic Adjustment with correlated shocks' model
#> Info: Dropping 1000 rows by generating 'LAGGED_P'.
#> Info: Sample separated with 1971 rows in excess supply and 2029 in excess demand state.

Post estimation analysis

Summaries

All the models provide estimates for the simulated data. Even with simulated data, it is difficult to assess which model performs better by examining only the summaries in separation or collectively.

summary(eqmdl_reg@fit[[1]]$first_stage_model)
#> 
#> Call:
#> lm(formula = first_stage_formula, data = object@model_tibble)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.88388 -0.14215  0.00305  0.14088  0.81967 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  2.356218   0.036603   64.37   <2e-16 ***
#> Xd1          0.314532   0.006576   47.83   <2e-16 ***
#> Xd2         -0.102046   0.006549  -15.58   <2e-16 ***
#> X1           0.347118   0.006523   53.21   <2e-16 ***
#> X2           0.306744   0.006445   47.60   <2e-16 ***
#> Xs1         -0.401540   0.006568  -61.14   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.2087 on 3994 degrees of freedom
#> Multiple R-squared:  0.7409, Adjusted R-squared:  0.7406 
#> F-statistic:  2284 on 5 and 3994 DF,  p-value: < 2.2e-16
summary(eqmdl_reg)
#> Equilibrium Model for Markets in Equilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> systemfit results 
#> method: 2SLS 
#> 
#>           N   DF     SSR  detRCov   OLS-R2 McElroy-R2
#> system 8000 7989 7765.49 0.838051 0.895447    0.90887
#> 
#>           N   DF     SSR      MSE    RMSE       R2   Adj R2
#> demand 4000 3994 2579.28 0.645789 0.80361 0.930546 0.930459
#> supply 4000 3995 5186.20 1.298174 1.13937 0.860348 0.860208
#> 
#> The covariance matrix of the residuals
#>            demand     supply
#> demand  0.6457891 -0.0171846
#> supply -0.0171846  1.2981739
#> 
#> The correlations of the residuals
#>            demand     supply
#> demand  1.0000000 -0.0187684
#> supply -0.0187684  1.0000000
#> 
#> 
#> 2SLS estimates for 'demand' (equation 1)
#> Model Formula: Q ~ P + Xd1 + Xd2 + X1 + X2
#> <environment: 0x56498eb4e920>
#> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1
#> <environment: 0x56498eb4e920>
#> 
#>               Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept) 18.7291651  0.1529666 122.4395 < 2.22e-16 ***
#> P           -3.8534666  0.0629681 -61.1971 < 2.22e-16 ***
#> Xd1          2.1102794  0.0323478  65.2372 < 2.22e-16 ***
#> Xd2         -0.7156455  0.0261205 -27.3978 < 2.22e-16 ***
#> X1           3.5151555  0.0331078 106.1731 < 2.22e-16 ***
#> X2           6.2431905  0.0314974 198.2126 < 2.22e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.80361 on 3994 degrees of freedom
#> Number of observations: 4000 Degrees of Freedom: 3994 
#> SSR: 2579.281574 MSE: 0.645789 Root MSE: 0.80361 
#> Multiple R-Squared: 0.930546 Adjusted R-Squared: 0.930459 
#> 
#> 
#> 2SLS estimates for 'supply' (equation 2)
#> Model Formula: Q ~ P + Xs1 + X1 + X2
#> <environment: 0x56498eb4e920>
#> Instruments: ~Xd1 + Xd2 + X1 + X2 + Xs1
#> <environment: 0x56498eb4e920>
#> 
#>              Estimate Std. Error  t value   Pr(>|t|)    
#> (Intercept) 2.7596055  0.3499559  7.88558 3.9968e-15 ***
#> P           2.8849781  0.1084928 26.59143 < 2.22e-16 ***
#> Xs1         2.7054083  0.0569035 47.54382 < 2.22e-16 ***
#> X1          1.1758638  0.0515231 22.82207 < 2.22e-16 ***
#> X2          4.1766642  0.0489108 85.39346 < 2.22e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.139374 on 3995 degrees of freedom
#> Number of observations: 4000 Degrees of Freedom: 3995 
#> SSR: 5186.204766 MSE: 1.298174 Root MSE: 1.139374 
#> Multiple R-Squared: 0.860348 Adjusted R-Squared: 0.860208
summary(eqmdl_fit)
#> Equilibrium Model for Markets in Equilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Market Clearing   : Q = D_Q = S_Q
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    -2.6871    17.1339     1.7372    -0.5889     3.1155     5.8838     0.1426 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    10.6921     1.5882     2.1173     5.0359     1.0000     1.0000     0.0000 
#> 
#> Coefficients
#>            Estimate Std. Error  z value      Pr(z)
#> D_P        -3.85348    0.06293 -61.2353  0.000e+00
#> D_CONST    18.73053    0.15336 122.1322  0.000e+00
#> D_Xd1       2.11014    0.03227  65.3909  0.000e+00
#> D_Xd2      -0.71602    0.02617 -27.3612 7.944e-165
#> D_X1        3.51516    0.03309 106.2424  0.000e+00
#> D_X2        6.24318    0.03147 198.3687  0.000e+00
#> S_P         2.88624    0.10846  26.6109 5.080e-156
#> S_CONST     2.75586    0.34985   7.8774  3.344e-15
#> S_Xs1       2.70593    0.05688  47.5696  0.000e+00
#> S_X1        1.17544    0.05150  22.8224 2.746e-115
#> S_X2        4.17628    0.04889  85.4181  0.000e+00
#> D_VARIANCE  0.64482    0.01899  33.9485 1.283e-252
#> S_VARIANCE  1.29703    0.05129  25.2886 4.267e-141
#> RHO        -0.01891    0.02353  -0.8034  4.217e-01
#> 
#> -2 log L: 6722.67
summary(bsmdl_fit)
#> Basic Model for Markets in Disequilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Not Separated
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    -2.6871    17.1339     1.7372    -0.5889     3.1155     5.8838     0.1426 
#>    S_CONST      S_Xs1       S_X1       S_X2 D_VARIANCE S_VARIANCE        RHO 
#>    10.6921     1.5882     2.1173     5.0359     1.0000     1.0000     0.0000 
#> 
#> Coefficients
#>            Estimate Std. Error z value      Pr(z)
#> D_P         -2.9119    0.07238 -40.233  0.000e+00
#> D_CONST     17.4679    0.21722  80.415  0.000e+00
#> D_Xd1        2.0264    0.04817  42.068  0.000e+00
#> D_Xd2       -0.7116    0.03279 -21.703 1.925e-104
#> D_X1         3.1680    0.04761  66.547  0.000e+00
#> D_X2         5.9840    0.04698 127.382  0.000e+00
#> S_P          0.9242    0.15942   5.797  6.755e-09
#> S_CONST      8.3532    0.56314  14.833  8.923e-50
#> S_Xs1        2.4675    0.11583  21.302 1.077e-100
#> S_X1         1.8785    0.09651  19.464  2.227e-84
#> S_X2         4.7120    0.10011  47.071  0.000e+00
#> D_VARIANCE   0.5960    0.02391  24.927 3.756e-137
#> S_VARIANCE   0.9703    0.07354  13.194  9.468e-40
#> RHO         -0.3355    0.05259  -6.379  1.788e-10
#> 
#> -2 log L: 8182.64
summary(damdl_fit)
#> Deterministic Adjustment Model for Markets in Disequilibrium
#>   Demand RHS        : D_P + D_Xd1 + D_Xd2 + D_X1 + D_X2
#>   Supply RHS        : S_P + S_Xs1 + S_X1 + S_X2
#>   Short Side Rule   : Q = min(D_Q, S_Q)
#>   Separation Rule   : P_DIFF analogous to (D_Q - S_Q)
#>   Shocks            : Correlated
#>   Nobs              : 4000
#>   Sample Separation : Demand Obs = 1971, Supply Obs = 2029
#>   Quantity Var      : Q
#>   Price Var         : P
#>   Key Var(s)        : id, date
#>   Time Var          : date
#> 
#> Maximum likelihood estimation
#>   Method              : BFGS
#>   Max Iterations      : 10000
#>   Relative Tolerance  : 1e-08
#>   Convergence Status  : success
#>   Starting Values     :
#>        D_P    D_CONST      D_Xd1      D_Xd2       D_X1       D_X2        S_P 
#>    -2.6871    17.1339     1.7372    -0.5889     3.1155     5.8838     0.1426 
#>    S_CONST      S_Xs1       S_X1       S_X2     P_DIFF D_VARIANCE S_VARIANCE 
#>    10.6921     1.5882     2.1173     5.0359     1.0000     1.0000     1.0000 
#>        RHO 
#>     0.0000 
#> 
#> Coefficients
#>             Estimate Std. Error  z value      Pr(z)
#> D_P        -3.848967    0.06636 -58.0023  0.000e+00
#> D_CONST    18.717272    0.16531 113.2287  0.000e+00
#> D_Xd1       2.110049    0.03227  65.3876  0.000e+00
#> D_Xd2      -0.716016    0.02617 -27.3624 7.681e-165
#> D_X1        3.515018    0.03309 106.2122  0.000e+00
#> D_X2        6.243126    0.03147 198.3693  0.000e+00
#> S_P         2.882241    0.11006  26.1867 3.763e-151
#> S_CONST     2.772435    0.35836   7.7364  1.022e-14
#> S_Xs1       2.705982    0.05689  47.5691  0.000e+00
#> S_X1        1.175343    0.05151  22.8185 3.001e-115
#> S_X2        4.176110    0.04890  85.4028  0.000e+00
#> P_DIFF      0.008358    0.03902   0.2142  8.304e-01
#> D_VARIANCE  0.644759    0.01900  33.9428 1.556e-252
#> S_VARIANCE  1.297088    0.05129  25.2875 4.389e-141
#> RHO        -0.018885    0.02353  -0.8025  4.223e-01
#> 
#> -2 log L: 6722.63

Model selection

The deterministic adjustment model has price dynamics that are analogous to excess demand and estimates one extra parameter. The directional model estimates one parameter less as the model does not have enough equations to identify prices in both demand and supply equations. The estimated parameters are summarized as follows.

sim_coef <- c(
  alpha_d, beta_d0, beta_d, eta_d,
  alpha_s, beta_s0, beta_s, eta_s,
  NA,
  sigma_d, sigma_s,
  rho_ds
)
names(sim_coef) <- names(coef(damdl_fit))

dm_inc <- coef(eqmdl_reg)[
  grep("demand", names(coef(eqmdl_reg)))
]
sp_inc <- coef(eqmdl_reg)[
  grep("supply", names(coef(eqmdl_reg)))
]
lm_coef <- c(
  dm_inc[2], dm_inc[-2], sp_inc[2], sp_inc[-2],
  NA,
  NA, NA,
  NA
)

eqmdl_coef <- append(
  coef(eqmdl_fit), c(NA),
  after = which(names(coef(eqmdl_fit)) ==
    prefixed_variance_variable(eqmdl_fit@system@demand)) - 1
)

bsmdl_coef <- append(
  coef(bsmdl_fit), c(NA),
  after = which(names(coef(bsmdl_fit)) ==
    prefixed_variance_variable(bsmdl_fit@system@demand)) - 1
)

damdl_coef <- coef(damdl_fit)

comp <- tibble::tibble(
  parameter = names(sim_coef),
  sim = sim_coef, lm = lm_coef, fi = eqmdl_coef,
  bm = bsmdl_coef, da = damdl_coef,
  lmerr = abs(lm_coef - sim_coef), fierr = abs(eqmdl_coef - sim_coef),
  bmerr = abs(bsmdl_coef - sim_coef), daerr = abs(damdl_coef - sim_coef)
)
comp
#> # A tibble: 15 × 10
#>    parameter    sim     lm      fi     bm       da    lmerr    fierr   bmerr
#>    <chr>      <dbl>  <dbl>   <dbl>  <dbl>    <dbl>    <dbl>    <dbl>   <dbl>
#>  1 D_P        -3.9  -3.85  -3.85   -2.91  -3.85     0.0465   0.0465   0.988 
#>  2 D_CONST    18.9  18.7   18.7    17.5   18.7      0.171    0.169    1.43  
#>  3 D_Xd1       2.1   2.11   2.11    2.03   2.11     0.0103   0.0101   0.0736
#>  4 D_Xd2      -0.7  -0.716 -0.716  -0.712 -0.716    0.0156   0.0160   0.0116
#>  5 D_X1        3.5   3.52   3.52    3.17   3.52     0.0152   0.0152   0.332 
#>  6 D_X2        6.25  6.24   6.24    5.98   6.24     0.00681  0.00682  0.266 
#>  7 S_P         2.8   2.88   2.89    0.924  2.88     0.0850   0.0862   1.88  
#>  8 S_CONST     3.2   2.76   2.76    8.35   2.77     0.440    0.444    5.15  
#>  9 S_Xs1       2.65  2.71   2.71    2.47   2.71     0.0554   0.0559   0.182 
#> 10 S_X1        1.15  1.18   1.18    1.88   1.18     0.0259   0.0254   0.729 
#> 11 S_X2        4.2   4.18   4.18    4.71   4.18     0.0233   0.0237   0.512 
#> 12 P_DIFF     NA    NA     NA      NA      0.00836 NA       NA       NA     
#> 13 D_VARIANCE  0.8  NA      0.645   0.596  0.645   NA        0.155    0.204 
#> 14 S_VARIANCE  1.1  NA      1.30    0.970  1.30    NA        0.197    0.130 
#> 15 RHO         0    NA     -0.0189 -0.335 -0.0189  NA        0.0189   0.335 
#> # … with 1 more variable: daerr <dbl>

Since we have used simulated data, we can calculate the average absolute error of the parameter estimation for each of the models. In practice, the population values are unknown and this calculation is impossible.

Moreover, the average absolute error cannot provide an overall estimation assessment as the market models have different parameter spaces. To assess the overall model performance one can instead use an information criterion.