Taking A Deep Dive Into systemfit

The systemfit package for R description says

Econometric estimation of simultaneous systems of linear and nonlinear equations using Ordinary Least Squares (OLS), Weighted Least Squares (WLS), Seemingly Unrelated Regressions (SUR), Two-Stage Least Squares (2SLS), Weighted Two-Stage Least Squares (W2SLS), and Three-Stage Least Squares (3SLS).

It provides convenient facilities for estimation of systems of equations using R. For a high-level overview of its use, see the vignette.

Let’s explore a bit the systemfit function. (Note the creative reuse of the package name as the name of the main function in the package!) Although we’ll work with the dataset provided by the authors of the package, we’ll dig deeper into its functionality than in the provided examples. That will remove its black box nature, giving us a better understanding of what’s going on.

Note that the dataset is quite short ($T=20$). There are two equations in the system: \[c_{t} = \alpha_{0} + \alpha_{1} p_{t} + \alpha_{2} y_{t} + \varepsilon_{dt}\] \[c_{t} = \beta_{0} + \beta_{1} p_{t} + \beta_{2} f_{t} + \beta_{3} T_{t} + \varepsilon_{st}\] $c$ is per capita food consumption, $p$ is the relative price of food, $y$ is disposable income, $f$ is the relative price of food received by farmers, and $T$ is a linear time trend. The first equation is the demand equation (because it includes income) and the second is the supply equation (because it includes the farm price and a time trend to account for technology).

Load the data and print it out:

library(systemfit)
data("Kmenta")
Kmenta

OLS estimation. Let’s start by estimating the model using equation-by-equation OLS.

fit.demand <- lm(consump ~ price + income, data=Kmenta)
fit.supply <- lm(consump ~ price + farmPrice + trend, data=Kmenta)
print(fit.demand)
print(fit.supply)

Now estimate the equations as one system with a call to systemfit. You can confirm that the estimates are the same. Strictly speaking, the method="OLS" argument is not needed, since that’s the default. I decided to include it anyway to be explicit.

eq.demand <- consump ~ price + income
eq.supply <- consump ~ price + farmPrice + trend
fit.sys <- systemfit(list(demand=eq.demand, supply=eq.supply), method="OLS", data=Kmenta)
print(fit.sys)

The first argument to systemfit defines the system. Note that eq.demand and eq.supply do not do any estimation - they are formulas that, when combined with a dataset, are used to do the estimation.

It’s perfectly acceptable to define our system of two equations as

list(eq.demand, eq.supply)

As a practical matter, you probably want to give the equations names so the output is easier to work with. That’s why we defined the system as

list(demand=eq.demand, supply=eq.supply)

Benefits of systemfit. What’s the benefit to estimating the system using systemfit rather than estimating the individual equations by OLS? If all you want is the coefficients of the equations, there is no benefit. The advantage is that you get other stuff for free if you use systemfit. Here are a few of the more useful options.

To get the full (stacked) vector of coefficients with proper names:

fit.sys$coefficients

To get just the first equation:

fit.sys$eq[[1]]

Although this prints out the equation as if it’s a regular call to the lm function, it’s a systemfit.equation object, with other information attached:

class(fit.sys$eq)
class(fit.sys$eq[[1]])
names(fit.sys$eq[[1]])
fit.sys$eq[[1]]$eqnLabel
fit.sys$eq[[1]]$eqnNo

I’m not sure why the names of the equations aren’t included, but a call to lapply can be used to print them out:

lapply(fit.sys$eq, function(z) { z$eqnLabel })

Residual covariance matrix. One reason to use systemfit for estimation is that you’re using it as the first step in SUR estimation. The residual covariance matrix used (the covariances of the residuals across equations) is

fit.sys$residCov

Note that this is not the same output as creating a matrix to hold the residuals of the two equations and calculating the covariance matrix using the cov function:

ols.residuals <- cbind(residuals(fit.demand), residuals(fit.supply))
cov(ols.residuals)

Although this approach is fine asymptotically, it matters a lot which degrees of freedom adjustment you apply with such a small sample. Looking at the source code (line 848 of file systemfit.R in the version currently installed on my computer), there’s a reference to a function named .calcResidCov, called using the code:

results$residCov <- .calcResidCov( resids, methodResidCov = control$methodResidCov,
  validObsEq = validObsEq, nCoefEq = nCoefLiEq, xEq = xMatEq,
  centered = control$centerResiduals, solvetol = control$solvetol )

The default value of systemfit.control$methodResidCov is "geomean" (geometric mean approach). Lines 40-42 of calcResidCov.R contain these lines to do the covariance calculations:

result[ i, j ] <-
  sum( resids[ validObsAll, i ] * resids[ validObsAll, j ] ) /
  sqrt( ( sum( validObsAll ) - nCoefEq[i] ) * ( sum( validObsAll ) - nCoefEq[j] ) )

The degrees of freedom adjustment for equation i is the number of estimated parameters in equation i. There are three estimated parameters in equation 1, and four estimated parameters in equation 2. Thus, the formulas for calculating the residual variance terms are \[\frac{\sum_{t=1}^{20} e_{1t}^{2}}{17}\] for equation 1, \[\frac{\sum_{t=1}^{20} e_{2t}^{2}}{16}\] for equation 2, and \[\frac{\sum_{t=1}^{20} e_{1t} e_{2t}}{\sqrt{(17)(16)}}\] for the covariance term.

After going through that long detour, let’s reproduce the elements of the residual covariance matrix:

sum(residuals(fit.demand)^2)/17
sum(residuals(fit.supply)^2)/16
sum(residuals(fit.demand)*residuals(fit.supply))/sqrt(17*16)
fit.sys$residCov

The cov function uses a denominator of 19 (which is $T-1$). Reproducing the output of the call to cov:

sum(residuals(fit.demand)^2)/19
sum(residuals(fit.supply)^2)/19
sum(residuals(fit.demand)*residuals(fit.supply))/19
cov(ols.residuals)

Of course, with even a moderate sample size of $T=500$, the exact nature of the degrees of freedom adjustment would not be important. The denominator used to compute the covariance term would be 499 vs $\sqrt{(497)(496)} = 496.5$, a difference of only 0.5%.

Whether or not it matters in practice, calling cov doesn’t account for the fact that the residuals are the output of an estimated regression model rather than variables in your dataset.

Coefficient covariance matrix. This is typically denoted something like $\widehat{V}_{OLS}$. The most common use would be as an input in a Wald or Hausman test statistic calculation.

fit.sys$coefCov

The non-zero elements are pulled directly from the individual estimated equations:

vcov(fit.demand)
vcov(fit.supply)

You can compute $\widehat{V}_{OLS}$ for the first equation using the formula $s^{2} \left(X^{\prime} X\right)^{-1}$, where $s^{2}$ is given by the formula above:

s2 <- sum(residuals(fit.demand)^2)/17
x1 <- model.matrix(fit.sys$eq[[1]])
s2 * solve(t(x1) %*% x1)
fit.sys$coefCov

The upper-left block of the latter matrix is equal to the matrix we calculated.

SUR estimation.

2SLS estimation.