Reciprocal relationships, reverse causality, and temporal ordering

Testing theories with cross-lagged panel models

Charles C. Lanfear

University of Cambridge

Thiago R. Oliveira

University of Manchester

What is this?

We were invited to write an article on Cross-Lagged Panel Models (CLPMs) for the Journal of Developmental and Life Course Criminology

CLPMs are commonly used to examine reciprocality in developmental and life course research in Criminology, Psychology, and Sociology

Reciprocality

Classic criminological concerns

CLPMs are also often used thoughtlessly or unnecessarily

From Hell’s heart I stab at thee

~~This vexes me~~
This is an effort to provide accessible guidance

But first, what is a cross-lagged panel model?

The classic CLPM

Provided repeated observations, cross-lags “enforce” temporal order

Strength and direction often inferred from cross-lagged path estimates

It is Maslow’s Hammer for panel data
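
To make the cross-lagged paths concrete, here is a minimal sketch of a two-wave CLPM fitted to simulated data. It assumes equation-by-equation OLS rather than a full SEM (for this saturated two-wave setup, where both equations share the same regressors, the point estimates should match). The function names, sample size, and coefficient values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def simulate_two_wave(n=5000, a_x=0.5, a_y=0.5, b_xy=0.3, b_yx=0.0, seed=1):
    """Simulate two waves: x1 affects y2 (b_xy) and y1 affects x2 (b_yx)."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    y1 = 0.3 * x1 + rng.normal(size=n)  # wave-1 variables start out correlated
    x2 = a_x * x1 + b_yx * y1 + rng.normal(size=n)
    y2 = a_y * y1 + b_xy * x1 + rng.normal(size=n)
    return x1, y1, x2, y2

def fit_clpm(x1, y1, x2, y2):
    """Regress each wave-2 variable on both wave-1 variables."""
    design = np.column_stack([np.ones_like(x1), x1, y1])
    coef_x2, *_ = np.linalg.lstsq(design, x2, rcond=None)
    coef_y2, *_ = np.linalg.lstsq(design, y2, rcond=None)
    return {
        "x1_to_x2": coef_x2[1],  # autoregressive path for x
        "y1_to_x2": coef_x2[2],  # cross-lagged path y -> x
        "x1_to_y2": coef_y2[1],  # cross-lagged path x -> y
        "y1_to_y2": coef_y2[2],  # autoregressive path for y
    }

estimates = fit_clpm(*simulate_two_wave())
```

Here the simulated process really is lagged, so the cross-lagged estimates recover the assumed values. Nothing in the model itself checks whether that timing matches the theory being tested.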

An example: Disorder and fear

Does this make sense with yearly panel data?

When would we expect observing disorder to impact fear and vice versa?

Does this align with the theory we’re testing?

Contemporaneous effects

Can capture immediate effects… but it assumes the problem away

Strong theory might justify this, however

Three key messages

Good empirical research starts with a strong theoretical foundation

Panel survey data are not appropriate for answering every question

You should default to robust estimators

Our approach

The world is recursive

Non-recursive theory is a symptom of ignoring time or mechanisms
- E.g., supply and demand is a micro-macro model
Use directed acyclic graphs (DAGs) for theories

Data are imperfect

Estimators must often handle ambiguity, and assumptions should be clear
Use classical structural equation model (SEM) path diagrams for estimators
- The structural model should be recursive¹ and derived from the theoretical DAG

DAG for theory

Units of time should be clearly stated

No bidirectional paths

Include latent or missing variables

SEM for estimation

Encode ambiguities and assumptions, e.g., (in)dependence

The process

Clear theory is a prerequisite before specifying an estimator:

Derive recursive model from theory
- Use a DAG or recursive equation
- Include unobserved mechanisms when appropriate

Specify estimand
- Quantitative definition of the estimate of interest

Specify estimator
- Contemporaneous (fast) vs. lagged (slow) effects in structure
- Covariances to address ambiguity and relax assumptions

Theory, estimands, and time

When is a CLPM needed?

A cross-lagged panel model is only needed when reciprocal effects over time are presumed in the theoretical model. When the researcher can theoretically assume no effects of the dependent variable on the independent variable over time, other statistical models will be more appropriate than a cross-lagged panel model

We distinguish between two forms of reciprocality:

Theoretical reciprocality (substance)
- We care about both paths; e.g., which is stronger?
Reverse causality (nuisance)
- We care about one path but the other may bias our estimate

Two forms

Start with theory

CLPMs are frequently used without careful attention to the theoretical process they are meant to capture… [which] should inform… the definition of the causal quantity of interest—the estimand, which is often left implicit, determined not by theory but by the structure of the available data. When the starting point is not a clearly specified causal process, one risks arriving at an estimator that produces a correct answer to the wrong question.

E.g., “the effect of last year’s informal social control on this year’s crime”

The one true causal timing

Estimands connect your theoretical question to your data by specifying:

Value of exposure
- E.g., a counterfactual difference
Timing of effect

what is important is recognizing that there is no “true causal timing” waiting to be discovered because causal timing is a choice a researcher makes when defining their estimand of interest.

Drinking problems

what is the true causal timing of drinking an additional pint of beer on one’s perceived wellbeing? What is this effect at 20 seconds, 20 minutes, 20 hours, or 20 years? What about the effect of an additional pint per day after 10 years? In this context, the reader may know intimately which exposure and which lags between measurement of exposure and outcome correspond to detectable and substantively interesting causal effects. Strong theories should be specific about the role of time and thus able to inform what sort of causal timings—whether delay between exposure and outcome or aggregations of exposures—are of substantive interest.

An example

Theories¹ posit that employment reduces offending—with obvious reverse causality as offending threatens job loss—and we have yearly data

If we think employment reduces offending by keeping people involved in day-to-day activities… we probably want daily or weekly data instead!

But if we think employment reduces offending by gradually committing people to conventional life… our yearly data might be useful!

A feasible estimand: The expected reduction in number of self-reported offenses in the present year per additional month of full-time work in the past year
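
One way this estimand might be written down, as a sketch in potential-outcomes notation. The symbols are assumptions introduced here for illustration: \(Y_{it}(m)\) is the number of offenses person \(i\) would report in year \(t\) had they worked \(m\) months full-time in year \(t-1\).

\[
\tau = E\left[\, Y_{it}(m+1) - Y_{it}(m) \,\right]
\]

A negative \(\tau\) is the expected reduction. Writing it as a single number assumes the effect of one more month does not depend on \(m\), or that \(\tau\) averages over the observed values of \(m\).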

Good, an answerable question!

Now let’s see how everything can go terribly wrong

Panel data and composite variables

It comes and goes in waves

Observations rarely correspond perfectly to a theoretical timing of interest

We discretize for convenience: time is continuous!¹

In discrete time, missing data are infinite!

Composite period variables

Many common measures in panel studies are composites:

Time spent (un)employed
Number of crimes committed

When our observation periods do not correspond to the timing of our underlying theoretical causal model, these period measures become what we will call composite period variables. They are a composite because the measured value is a deterministic sum of values that would be obtained by dividing the observation period up into smaller intervals and aggregating across them.

Composite variables are tricky

Composite variables

Can be usefully represented in DAGs
Raise ~~horrible~~ unique causal identification issues

Composite \(X \rightarrow Y\)

\(X_A \Rightarrow X\) indicates deterministic child-parent relationship

Violated assumptions

Ignorability / Exchangeability: Treatment status is (conditionally) independent of potential outcomes

\(X_A\) and \(X_B\) may have different causes!

Consistency: No hidden versions of treatment

One value of \(X\) can correspond to many permutations of \(X_A\) and \(X_B\)!
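
The consistency problem can be shown with a toy enumeration. This is a sketch under invented assumptions: each component takes values 0 to 3, and the hypothetical outcome depends only on the later component \(X_B\).

```python
from itertools import product

def outcome(x_a, x_b):
    """Hypothetical outcome that responds only to the later component."""
    return 10 - 2 * x_b

# Map each composite value X = X_A + X_B to the set of outcomes it can produce.
versions = {}
for x_a, x_b in product(range(4), repeat=2):
    versions.setdefault(x_a + x_b, set()).add(outcome(x_a, x_b))
```

In this toy setup, `versions[3]` holds four different outcomes for the same composite value, so "the effect of \(X = 3\)" is not a single well-defined quantity.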

Oh no

Even worse when components aren’t simultaneously determined

Oh noes

I offer no solution

I just want it to haunt your dreams

But you should consider how it might impact your research design

Three common applied problems

Unobserved time-stable heterogeneity

While we may be interested in change over time within individuals, theory often leads us to expect relatively stable differences between individuals

Sometimes these are nuisances, other times they are substantively interesting

Unfortunately:

Autoregressive terms cannot fully capture time-stable confounding
Cross-lagged terms conflate within-unit change with between-unit differences
Can’t just toss in fixed effects due to Nickell bias from endogenous lag
- Bias is of order \(1/T\), so it shrinks as \(T\) grows, but most panel surveys have small \(T\)
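
A small simulation can illustrate the Nickell bias. This is a sketch under assumed values (true autoregressive coefficient 0.5, unit-variance errors, 5,000 units) using a within (fixed effects) estimator of a first-order autoregression. The function name is hypothetical.

```python
import numpy as np

def within_ar_estimate(n_units, n_periods, rho=0.5, seed=0):
    """Fixed-effects estimate of rho in y_it = alpha_i + rho * y_i,t-1 + e_it."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)
    y = np.empty((n_units, n_periods + 1))
    y[:, 0] = alpha / (1 - rho) + rng.normal(size=n_units)  # start near unit mean
    for t in range(1, n_periods + 1):
        y[:, t] = alpha + rho * y[:, t - 1] + rng.normal(size=n_units)
    lag, cur = y[:, :-1], y[:, 1:]
    # Demeaning within units removes alpha_i but correlates the lag with the error.
    lag_dm = lag - lag.mean(axis=1, keepdims=True)
    cur_dm = cur - cur.mean(axis=1, keepdims=True)
    return (lag_dm * cur_dm).sum() / (lag_dm ** 2).sum()

short_panel = within_ar_estimate(5000, 3)   # few waves, like most panel surveys
long_panel = within_ar_estimate(5000, 30)   # many waves
```

With these settings the short-panel estimate falls far below the true 0.5, while the long-panel estimate is much closer. Adding more units does not help. Only more waves do.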

Work, crime, and self-control

Confounded by an unmeasured time-stable trait of individuals

Solution 1: Allison et al.’s (2017) ML-SEM

One-sided Mundlak estimator for reverse causality and nuisance heterogeneity

Solution 2: Hamaker et al.’s (2015) RI-CLPM

Better for theoretical reciprocality and explicitly separating effects

Temporal misspecification

As illustrated by Vaisey & Miles (2017), if…

\(y_{it} = \beta x_{it} + \alpha_i + e_{it}\) is the “true” contemporaneous model¹
\(y_{it} = \beta^* x_{i,t-1} + \alpha_i + e_{it}\), the lagged model, is estimated instead

The resulting “bias” is: \(E(\beta^*) = -0.5\beta\)
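
One way to see where the \(-0.5\) comes from is a short derivation. It rests on assumptions not stated on the slide: three waves, unit effects \(\alpha_i\) removed by first differencing, and an exposure \(x_{it}\) that is serially uncorrelated with constant variance \(\sigma_x^2\) and independent of the errors. The subscripts \(i\) and \(t\) index units and waves.

\[
\begin{aligned}
\Delta y_{it} &= \beta\,\Delta x_{it} + \Delta e_{it} \\
E(\beta^*) &\approx \frac{\operatorname{Cov}(\Delta y_{it},\, \Delta x_{i,t-1})}{\operatorname{Var}(\Delta x_{i,t-1})}
= \beta\,\frac{\operatorname{Cov}(x_{it} - x_{i,t-1},\; x_{i,t-1} - x_{i,t-2})}{\operatorname{Var}(x_{i,t-1} - x_{i,t-2})}
= \beta\,\frac{-\sigma_x^2}{2\sigma_x^2} = -0.5\,\beta
\end{aligned}
\]

The shared term \(x_{i,t-1}\) enters the two differences with opposite signs, which is what flips the sign. If \(x\) is serially correlated, the factor will differ from \(-0.5\).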

Incorrect temporal order can reverse signs… which is bad:

False negatives: Confidently reject theories when true
Implies opposite effects
Lag-only is most people’s default specification
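
The sign reversal is easy to reproduce by simulation. This sketch assumes three waves, a purely contemporaneous true effect of \(\beta = 1\), a serially uncorrelated exposure, and a first-difference estimator of the lagged model. The function name and all values are illustrative.

```python
import numpy as np

def lagged_fd_estimate(n_units=20000, beta=1.0, seed=0):
    """Estimate a lagged effect by first differences when the truth is contemporaneous."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(n_units, 1))       # time-stable unit effects
    x = rng.normal(size=(n_units, 3))           # serially uncorrelated exposure
    y = beta * x + alpha + rng.normal(size=(n_units, 3))
    dy = y[:, 2] - y[:, 1]                      # outcome change, wave 2 to 3
    dx_lag = x[:, 1] - x[:, 0]                  # exposure change, wave 1 to 2
    return (dx_lag * dy).sum() / (dx_lag ** 2).sum()

beta_star = lagged_fd_estimate()
```

Under these assumptions `beta_star` lands near \(-0.5\) even though the true effect is \(+1\).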

Illustration

This is just a violated independence assumption

You’ll see it everywhere now

Be very suspicious of unexpected reversed signs

Robust estimators

Proper solution depends on what assumptions about temporal order theory allows us to make

Solutions

There is no “true” causal timing to discover; there is only a target estimand
- Motivate estimand with theory

Use robust estimators:
- Contemporaneous covariance when “fast” effects are ambiguous
- Contemporaneous effects when they’re not

If an ambiguous contemporaneous path is of substantive interest:
- Consider collecting better data, you fool
- As a last resort, non-recursive models

Here be dragons

IV assumptions are strong but a pint may make them believable

Low inter-temporal variation

Cross-lagged panel models are models of change over time

If things don’t change, you have nothing to explain

\(Var(Y_2|Y_1) \rightarrow 0\) as \(\rho(Y_1,Y_2) \rightarrow 1\)
Unstable and imprecise estimates
Measurement error becomes a proportionally larger component
Common with short observation periods and stable traits
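
The first point can be made explicit. As a sketch, assume a linear relationship between waves (for example, bivariate normality), with \(\sigma_{Y_2}^2\) denoting the marginal variance of \(Y_2\):

\[
\operatorname{Var}(Y_2 \mid Y_1) = \sigma_{Y_2}^2 \left(1 - \rho(Y_1, Y_2)^2\right)
\]

With \(\rho = 0.95\), for example, \(1 - \rho^2 = 0.0975\), so less than 10% of the variance in \(Y_2\) is left for anything other than \(Y_1\) to explain.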

Example paper

Sometimes a near-perfect multicollinearity problem:

What remains to explain when prior values account for 90% of the variation in present values?

Solutions

Data collection

Collect data over a longer period, you fool
Embed an experiment or look for exogenous shocks
- A more plausible IV estimator
Oversample for change
- Change may be rapid for subgroups

Estimation

Consider a cross-sectional analysis
Use longer lags
Different aggregations
- E.g., smaller spatial units
Measurement models for error
- Do this anyway because outcomes are regressors and random measurement error attenuates estimates
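
The attenuation from random measurement error can be sketched in a simulation. It assumes classical error (independent of the true value and of the outcome) in a single regressor. The function name and parameter values are hypothetical.

```python
import numpy as np

def attenuated_slope(reliability, beta=0.5, n=50000, seed=0):
    """OLS slope of y on an error-laden measure of x with the given reliability."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(size=n)
    # Choose the error variance so that Var(x_true) / Var(x_obs) = reliability.
    error_var = (1 - reliability) / reliability
    x_obs = x_true + rng.normal(scale=np.sqrt(error_var), size=n)
    y = beta * x_true + rng.normal(size=n)
    return np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)

slope_noisy = attenuated_slope(0.7)
slope_clean = attenuated_slope(1.0)
```

In this setup the estimated slope is roughly the true slope times the reliability, so a measure with reliability 0.7 recovers about 70% of the effect. In a CLPM the same variables are also outcomes at other waves, so the consequences can be harder to sign than in this one-regressor case.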

Or perhaps give up and go get a pint

Giving up

Consider doing something else

researchers should think carefully about whether the data they have are suitable for answering their questions to begin with… panel survey data are better equipped to test slow and lasting effects proposed by developmental and life course theories… than rapid or transient processes from cognitive and interactional theories… (22)

Consider experiments, momentary assessments, etc.

Panel data with narrow observation intervals are sometimes also poorly suited for testing slow processes; high intertemporal correlations and proportions of variance explained may be signals there is insufficient change over time to produce precise estimates.

Just because you have panel data doesn’t mean you should use it

Again, but louder

Separate theory from estimation
- Theory comes first
  - A strong estimation strategy cannot make up for a theoretical deficit
- Then a relevant estimand
- Then an appropriate estimator last
Panel survey data are not appropriate for answering every question
Default to robust estimators and be clear about assumptions
- Contemporaneous covariances or effects
- ML-SEM and RI-CLPM

Feedback and Questions

I am pleased to share the draft!

Contact:

Charles C. Lanfear
Institute of Criminology
University of Cambridge
cl948@cam.ac.uk