
In the first simulation, we generate data with an interaction, fit the correct model, and then calculate both the usual and robust standard errors. We then check how often we correctly reject the null hypothesis of no interaction between x and g. This is an estimation of power for this particular hypothesis test.
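The first simulation described above can be sketched roughly as follows. This is only an illustration: the sample size, the coefficient values, the number of replications, and the 0.05 cutoff are all assumptions, not values taken from the original post.

```r
# Sketch: estimated power of the interaction test, usual vs. HC1 robust SEs.
# All numeric settings below are illustrative assumptions.
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()

set.seed(1)

sim_once <- function(n = 50) {
  x <- rnorm(n)
  g <- factor(sample(c("a", "b"), n, replace = TRUE))
  # The true model includes an x:g interaction; errors have constant variance
  y <- 1 + 0.5 * x + 0.5 * (g == "b") + 0.4 * x * (g == "b") + rnorm(n)
  m <- lm(y ~ x * g)  # the correctly specified model
  p_usual  <- coeftest(m)["x:gb", "Pr(>|t|)"]
  p_robust <- coeftest(m, vcov. = vcovHC(m, type = "HC1"))["x:gb", "Pr(>|t|)"]
  c(usual = p_usual, robust = p_robust)
}

# Each column holds the two p-values from one simulated data set
pvals <- replicate(500, sim_once())

# Proportion of simulations that reject "no interaction" at the 0.05 level
power <- rowMeans(pvals < 0.05)
power
```

The two entries of `power` estimate how often each type of standard error leads to a correct rejection of the null, so they can be compared directly.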

The sandwich package provides the vcovHC function that allows us to calculate robust standard errors. The type argument allows us to specify what kind of robust standard errors to calculate. “HC1” is one of several types available in the sandwich package and happens to be the default type in Stata 16. (We talk more about the different types and why it’s called the “sandwich” package below.)

coefci(m, vcov.

While not really the point of this post, we should note the results say that larger turn circles and bigger trunks are associated with lower gas mileage.

Now that we know the basics of getting robust standard errors out of Stata and R, let’s talk a little about why they’re robust by exploring how they’re calculated. The usual method for estimating coefficient standard errors of a linear model can be expressed with this somewhat intimidating formula:

\[\text{Var}(\hat{\beta}) = (X^TX)^{-1} X^T \Omega X (X^TX)^{-1}\]

For HC1, the diagonal elements of \(\Omega\) are \(\frac{n}{n-k}\hat{e}_i^2\), where \(\hat{e}_i^2\) refers to squared residuals, \(n\) is the number of observations, and \(k\) is the number of coefficients. In our simple model above, \(k = 2\), since we have an intercept and a slope. Let’s modify our formula above to substitute HC1 “meat” in our sandwich:

But it’s important to remember large residuals (or evidence of non-constant variance) could be due to a misspecified model. We may be missing key predictors, interactions, or non-linear effects. Because of this it might be a good idea to think carefully about your model before reflexively deploying robust standard errors. Zeileis (2006), the author of the sandwich package, also gives two reasons for not using robust standard errors “for every model in every analysis”:

First, the use of sandwich estimators when the model is correctly specified leads to a loss of power. Second, if the model is not correctly specified, the sandwich estimators are only useful if the parameter estimates are still consistent, i.e., if the misspecification does not result in bias.

We can demonstrate each of these points via simulation.
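The sandwich calculation discussed above, with HC1 “meat” substituted in, can be sketched by hand in base R. This is a minimal sketch rather than the post's own code: the built-in mtcars data and the model `mpg ~ wt` are stand-ins chosen only so the example runs on its own, and the object names are hypothetical.

```r
# Sketch: usual and HC1 standard errors computed by hand.
# Assumption: mtcars and mpg ~ wt stand in for the post's data and model.
m <- lm(mpg ~ wt, data = mtcars)

X <- model.matrix(m)  # n x k model matrix (intercept and slope, so k = 2)
e <- residuals(m)
n <- nrow(X)
k <- ncol(X)

bread <- solve(t(X) %*% X)  # (X'X)^{-1}

# HC1 "meat": X' Omega X, with Omega = diag(n / (n - k) * e_i^2)
meat_hc1 <- t(X) %*% diag(n / (n - k) * e^2) %*% X
vc_hc1 <- bread %*% meat_hc1 %*% bread

# Usual estimate for comparison: Omega = sigma^2 * I
sigma2 <- sum(e^2) / (n - k)
meat_usual <- t(X) %*% diag(sigma2, n) %*% X
vc_usual <- bread %*% meat_usual %*% bread

# Standard errors are the square roots of the diagonal
cbind(usual = sqrt(diag(vc_usual)), HC1 = sqrt(diag(vc_hc1)))
```

The only difference between the two estimates is what goes on the diagonal of \(\Omega\): a single constant variance in the usual case, and a separate scaled squared residual for each observation in the HC1 case.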
Then we load two more packages: lmtest and sandwich. The lmtest package provides the coeftest function that allows us to re-calculate a coefficient table using a different variance-covariance matrix.
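As a rough illustration of how coeftest recalculates a coefficient table, here is a sketch that passes it a robust variance-covariance matrix from the sandwich package's vcovHC function. The built-in mtcars data and the model `mpg ~ wt + hp` are assumptions standing in for the Stata data set used in this post.

```r
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()

# Assumption: mtcars and this model are stand-ins for the post's data
m <- lm(mpg ~ wt + hp, data = mtcars)

# Coefficient table with the usual standard errors
usual <- coeftest(m)
usual

# Same coefficients, but standard errors taken from the HC1 robust
# variance-covariance matrix
robust <- coeftest(m, vcov. = vcovHC(m, type = "HC1"))
robust
```

The estimates are identical in both tables. Only the standard errors, test statistics, and p-values change.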

I’m not getting in the weeds here, but according to this document, robust standard errors for linear models are calculated as described on page 6.

Yes, Word documents are still the standard format in the academic world. I conduct my analyses and write up my research in R, but typically I need to use Word to share with colleagues or to submit to journals, conferences, etc.

Notice the third column indicates “Robust” Standard Errors. To replicate the result in R takes a bit more work. First we load the haven package to use the read_dta function that allows us to import Stata data sets.


Not too different, but different enough to make a difference. That’s because (as best I can figure), when calculating the robust standard errors for a glm fit, Stata is using $n / (n - 1)$ rather than $n / (n - k)$, where $n$ is the number of observations and $k$ is the number of parameters.
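To see how much the choice of scaling factor matters, here is a small sketch that rescales the unadjusted (HC0) estimate from the sandwich package both ways. The logistic regression on the built-in mtcars data is an assumption made for illustration, and the $n / (n - 1)$ version reflects only the guess stated above about what Stata does, not documented behavior.

```r
library(sandwich)  # vcovHC()

# Assumption: a logistic regression on mtcars stands in for the glm fit
g <- glm(am ~ wt, data = mtcars, family = binomial)

n <- nobs(g)          # number of observations
k <- length(coef(g))  # number of parameters

# Unscaled robust estimate
hc0 <- vcovHC(g, type = "HC0")

# Scaling by n / (n - k) reproduces sandwich's HC1
se_n_minus_k <- sqrt(diag(hc0 * n / (n - k)))

# Scaling by n / (n - 1): the adjustment suggested above for Stata's glm
se_n_minus_1 <- sqrt(diag(hc0 * n / (n - 1)))

cbind(se_n_minus_k, se_n_minus_1)
```

Because $k > 1$ here, the $n / (n - 1)$ factor is the smaller of the two, so the second column is slightly smaller than the first.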
