Modeling commands are very similar between R and Stata. The primary difference is that R arguments go inside parentheses rather than just after the name of the command as in Stata. Stata options come after a comma and are separate from formulae, but all R arguments, including formulae, are treated the same and separated by commas. Default R output is simpler than Stata's: use `summary()` if you want Stata-like output. In Stata, you save model output using `estimates store`, while in R you just assign the model object to a name. Note that we specify which data we are using in R, because R can have many data sets loaded at the same time!

```
glm y x z, ///
family(gaussian) link(identity)
estimates store example_model
```

Note that in the Stata example above, a single command is spread over multiple lines using `///`.

```
example_model <-
glm(y ~ x + z,
family = gaussian(link = "identity"),
data = example_data)
summary(example_model)
```

In R, a command can span multiple lines so long as each line (other than the last) leaves the expression unfinished, for example by ending in an operator or comma (e.g. `,` as above, but also `+` like in `ggplot2` calls).
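The line-continuation rule can be checked directly in base R. The sketch below uses made-up object names (`total`, `partial`, `numbers`) and is only an illustration of how R decides whether a command has ended.

```r
# A trailing `+` leaves the expression unfinished, so R keeps reading.
total <- 1 +
  2

# Here the first line is already complete, so R runs `partial <- 1` and then
# treats `+ 2` as a separate expression that never touches `partial`.
partial <- 1
+ 2

# Inside open parentheses, R also keeps reading until they are closed.
numbers <- c(1,
             2,
             3)

print(total)            # total is 3
print(partial)          # partial is 1
print(length(numbers))  # numbers has 3 elements
```

This is why `ggplot2` code puts the `+` at the end of a line rather than at the start of the next one.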

If we assume `x` is a treatment indicator and `time` is a dummy indicating before and after, the linear model diff-in-diff estimators look as follows. The coefficient estimates would be identical using GLM with a Gaussian family and identity link instead.

```
gen x_time = x*time
regress y x time x_time
```

```
example_did <- lm(y ~ x + time + x*time,
data = example_data)
summary(example_did)
```
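To see what the interaction term estimates, the sketch below simulates a small data set (the data, the seed, and the true effect of 3 are assumptions for illustration only) and compares the `x:time` coefficient to the difference-in-differences of the four group means. It also checks that a Gaussian GLM returns the same coefficients.

```r
# Simulated 2x2 design: 50 observations in each treatment-by-time cell.
set.seed(42)
example_data <- expand.grid(unit = 1:50, x = 0:1, time = 0:1)
example_data$y <- 1 + 2 * example_data$x + 0.5 * example_data$time +
  3 * example_data$x * example_data$time + rnorm(nrow(example_data))

# `x * time` expands to x + time + x:time, so this matches the model above.
example_did <- lm(y ~ x * time, data = example_data)

# Difference-in-differences computed by hand from the four cell means
# (rows index x, columns index time).
cell_means <- with(example_data, tapply(y, list(x, time), mean))
did_by_hand <- (cell_means["1", "1"] - cell_means["1", "0"]) -
  (cell_means["0", "1"] - cell_means["0", "0"])

# Same model fit as a GLM with Gaussian family and identity link.
example_did_glm <- glm(y ~ x * time, family = gaussian(link = "identity"),
                       data = example_data)

print(coef(example_did)["x:time"])
print(did_by_hand)
print(all.equal(coef(example_did), coef(example_did_glm)))
```

Because both `x` and `time` are dummies, the model is saturated and the interaction coefficient equals the hand-computed difference-in-differences exactly, not just approximately.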

In Stata, you can set your fixed effects ID first using `xtset`, then use the `fe` option when you run a regression on the data.

```
xtset id
xtreg y x z, fe
```

In R, you can run fixed effects using dummies for each unit. If `id` is a factor, R will create K-1 dummies in your regression (where K is the number of units), leading to a fixed effects model.

`lm(y ~ x + z + id, data = your_data)`
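As a rough check that the dummy-variable approach really is a fixed effects model, the sketch below simulates a small panel (all names and values here are assumptions for illustration) and compares the dummy-variable regression to the "within" estimator obtained by demeaning each variable by unit.

```r
# Simulated panel: 30 units observed 5 times, with a unit effect that is
# correlated with x (the situation fixed effects are meant to handle).
set.seed(1)
n_units <- 30
n_periods <- 5
panel <- data.frame(id = factor(rep(seq_len(n_units), each = n_periods)))
unit_effect <- rnorm(n_units)[as.integer(panel$id)]
panel$x <- unit_effect + rnorm(nrow(panel))
panel$z <- rnorm(nrow(panel))
panel$y <- 1 + 2 * panel$x - panel$z + unit_effect + rnorm(nrow(panel))

# Fixed effects via one dummy per unit (minus a reference category).
dummy_fit <- lm(y ~ x + z + id, data = panel)

# Within transformation: subtract each unit's mean from every variable.
demean <- function(v) v - ave(v, panel$id)
within_fit <- lm(demean(y) ~ demean(x) + demean(z) - 1, data = panel)

print(coef(dummy_fit)[c("x", "z")])
print(coef(within_fit))

# Number of unit dummies in the first model: K - 1.
print(sum(grepl("^id", names(coef(dummy_fit)))))
```

The slopes on `x` and `z` agree between the two fits. The standard errors printed by `summary(within_fit)` would not, because that fit does not account for the degrees of freedom used by the unit means.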

You can also run fixed effects using the `lfe` package. Here `id` comes after `|`. The part of the formula after the first `|` lists the fixed effect indicators. Additional bars can be added to specify instrumental variables and clustered standard errors.

```
library(lfe)
felm(y ~ x + z | id, data = example_data)
```

If you want to fit a fixed effects panel model in the econometric style, as described in class, you might use the `plm` package. You’ll want to use this if you want to run a Hausman test between fixed and random effects. Note you’ll need to specify *indices*: a variable that uniquely identifies the *groups* (here, `"id"`) and one that identifies the time or unique observations *within* groups (here, `"time"`).

```
library(plm)
ex_plm_fe <- plm(y ~ z, index = c("id","time"),
model = "within",
data = example_data)
```

Note that adding `effect = "twoways"` to a `model = "within"` call will give fixed effects on *both indices*. `model = "random"` will give you random effects. Also, note that if you want to test constraints on a `plm()` model, such as comparing nested models, you cannot use a likelihood ratio test, because `plm()` fits models using (generalized) least squares, not maximum likelihood. Use `lmtest::waldtest()` instead.

In Stata, random effects models use the same base syntax as fixed effects, except with an `re` option.

```
xtset id
xtreg y x z, re
```

In R, random effects models are usually fit using the `lme4` package. The formula has two parts: the main formula (`y ~ x + z`) and the random effects (`(1|id)`). Note it will report normal parameters as “fixed effects” because coefficients which are the same for all units are called fixed effects outside of econometrics. This can be confusing! They’re just normal coefficients.

```
library(lme4)
lmer(y ~ x + z + (1|id), data = example_data)
```

The number `1` in front of `id` here represents a *constant*, so `(1|id)` gives each unit its own intercept. If you take a more advanced course on hierarchical models, you’ll learn about *random slopes*, which you can fit by putting variables in place of that `1`.
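For a sense of what a random slope looks like in practice, here is a minimal sketch. The simulated data are an assumption for illustration, and the model is only fit if `lme4` is installed. Writing `(1 + x | id)` asks for a unit-specific intercept and a unit-specific slope on `x`.

```r
# Simulated data: each unit has its own intercept and its own slope on x.
set.seed(7)
n_units <- 40
n_obs <- 10
example_data <- data.frame(
  id = factor(rep(seq_len(n_units), each = n_obs)),
  x  = rnorm(n_units * n_obs),
  z  = rnorm(n_units * n_obs)
)
unit_intercept <- rnorm(n_units, sd = 1)[as.integer(example_data$id)]
unit_slope     <- rnorm(n_units, sd = 0.5)[as.integer(example_data$id)]
example_data$y <- 1 + unit_intercept + (2 + unit_slope) * example_data$x -
  example_data$z + rnorm(nrow(example_data))

# Only fit the model when lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  random_slope_model <- lme4::lmer(y ~ x + z + (1 + x | id),
                                   data = example_data)
  print(lme4::fixef(random_slope_model))    # the "normal" coefficients
  print(lme4::VarCorr(random_slope_model))  # variances of intercepts and slopes
}
```

The fixed effects output still reports one coefficient each for the intercept, `x`, and `z`. The random effects output describes how much the intercepts and the slopes on `x` vary across units.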

You can also fit random effects models in the econometric style using `plm`. This is the easiest approach if you want to run a Hausman test, although outside of econometrics Hausman tests are a bit uncommon these days.

```
library(plm)
ex_plm_re <- plm(y ~ z, index = c("id","time"),
model = "random",
effect = "individual",
data = example_data)
```

Note you should specify `effect = "individual"` in the random effects model so that random effects are applied to the `id` variable but no random or fixed effects are applied to `time`.

Hausman tests allow you to compare the econometric fixed effects model (i.e. dummies for each unit) to the random effects model (i.e. random intercepts for each unit) or a basic model (no dummies or random intercepts).

To do the Hausman test in Stata, you need to store your fixed effects model and your random effects model, then compare them with `hausman`.

```
xtreg y z, fe
estimates store fixed_model
xtreg y z, re
estimates store random_model
hausman fixed_model random_model
```

The easiest Hausman test to use in R is the one built into `plm`. As in Stata, you’ll fit both models, then give them to `phtest()`.

```
library(plm)
ex_plm_fe <- plm(y ~ z, index = c("id","time"),
model = "within",
data = example_data)
ex_plm_re <- plm(y ~ z, index = c("id","time"),
model = "random",
effect = "individual",
data = example_data)
phtest(ex_plm_fe, ex_plm_re)
```