Mastering panel data in Stata transforms your empirical research. Moving from xtset to dynamic panel GMM builds one of the most valuable skill sets in modern quantitative social science. Stata’s coherent syntax, in which xt commands mirror their non-panel counterparts, makes learning efficient.
Remember the golden rules:
With this guide, you have a complete reference. Now load your data, type xtset, and let the analysis begin.
About the Author: [Your Name] is an applied econometrician specializing in longitudinal data analysis using Stata. This article is part of the Stata Mastery Series.
Last Updated: March 2025. Commands verified with Stata 18.5.
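Fixed effects (FE) removes time-invariant unobserved heterogeneity through the within transformation: each variable is demeaned by its unit-specific mean, so anything constant within a unit drops out.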
xtreg ln_wage hours age tenure, fe
Alternatively, use areg or the community-contributed reghdfe (for high-dimensional FE):
reghdfe ln_wage hours age tenure, absorb(idcode) vce(cluster idcode)
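To make the within transformation concrete, the sketch below demeans the variables by hand and compares the slope with the one from xtreg, fe. The data are simulated and every variable name (id, year, alpha, x, y) is a hypothetical placeholder, not taken from the wage example above.

```stata
* Simulated panel: 50 units, 6 years, unit effect alpha correlated with x
clear
set seed 12345
set obs 50
gen id = _n
gen double alpha = rnormal()
expand 6
bysort id: gen year = 2000 + _n
xtset id year
gen double x = 0.5*alpha + rnormal()
gen double y = 1 + 2*x + alpha + rnormal()

* Built-in fixed-effects estimator
xtreg y x, fe
scalar b_fe = _b[x]

* Within transformation by hand: subtract unit means, then run OLS
bysort id: egen double ybar = mean(y)
bysort id: egen double xbar = mean(x)
gen double y_w = y - ybar
gen double x_w = x - xbar
regress y_w x_w, noconstant
scalar b_within = _b[x_w]

display "xtreg, fe: " b_fe "   manual demeaning: " b_within
```

The two slopes should coincide up to rounding. The standard errors from the manual regression are not comparable, because regress does not adjust the degrees of freedom for the estimated unit means.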
Raw panel data often arrives messy. Prepare it systematically.
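One possible preparation routine is sketched below, assuming the raw file arrives in wide format with one column per year. The variable names (id, gdp2000, and so on) and the values are made up for illustration.

```stata
* Hypothetical wide-format data: one row per unit, one column per year
clear
input id gdp2000 gdp2001 gdp2002
1 100 110 121
2  50  55  66
end

* Reshape to long format: one row per unit-year
reshape long gdp, i(id) j(year)

* Confirm that id and year uniquely identify observations
isid id year

* Declare the panel structure and inspect participation patterns
xtset id year
xtdescribe
```

Here isid stops with an error if any unit-year combination is duplicated, which is usually cheaper to catch at this stage than after estimation.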
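When lagged dependent variables matter (e.g., the current wage depends on the prior wage), the standard FE estimator is biased in short panels, because the demeaned lagged outcome is correlated with the demeaned error (Nickell bias). Use Arellano-Bond GMM: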
xtabond wage experience union, lags(1) maxldep(2)
Or the more flexible xtdpdgmm:
xtdpdgmm wage L.wage experience union, model(diff) gmm(L.wage, lag(2 4)) iv(experience union)
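Caution: GMM is powerful but complex. After estimation, test the overidentifying restrictions with the Sargan or Hansen test (estat sargan after xtabond, estat overid after xtdpdgmm) and check that the differenced residuals show no second-order serial correlation (estat abond after xtabond, estat serial after xtdpdgmm).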
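The sketch below runs the estimate-then-diagnose sequence for xtabond on a simulated dynamic panel. The data-generating process (an AR(1) coefficient of 0.5 with one exogenous regressor x) and all variable names are assumptions chosen for illustration.

```stata
* Simulated dynamic panel: y depends on its own lag, x, and a unit effect
clear
set seed 2025
set obs 200
gen id = _n
gen double alpha = rnormal()
expand 8
bysort id: gen year = 2000 + _n
xtset id year
gen double x = rnormal()

* Initial period, then recursive fill (replace works observation by observation)
gen double y = alpha + x + rnormal() if year == 2001
bysort id (year): replace y = 0.5*y[_n-1] + x + alpha + rnormal() if _n > 1

* One-step Arellano-Bond with default (non-robust) standard errors
xtabond y x, lags(1) maxldep(2)

* Sargan test of overidentifying restrictions
estat sargan

* Arellano-Bond test for serial correlation in first-differenced errors
estat abond
```

Note that estat sargan is not available after vce(robust), so the diagnostic fit here uses the default VCE. In estat abond, first-order correlation is expected by construction; it is the second-order test that bears on instrument validity.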
Once the data are xtset, Stata's time-series operators (L., F., D.) respect the panel structure, so a lag or difference is never computed across two different entities.
* Create a lag variable (previous year's value)
gen lag_gdp = L.gdp
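A small check of this behavior, using a made-up two-country dataset (the id, year, and gdp values are hypothetical):

```stata
* Two units, three years each
clear
input id year gdp
1 2000 100
1 2001 110
1 2002 121
2 2000  50
2 2001  55
2 2002  66
end
xtset id year

* Lag and first difference, both computed within each unit
gen lag_gdp = L.gdp
gen d_gdp = D.gdp

* The first year of every unit is missing: no value is borrowed from the unit above
list, sepby(id)
```

By contrast, gdp[_n-1] without a by prefix would pull the last value of unit 1 into the first row of unit 2, which is the mistake the operators avoid.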
Pooled OLS ignores the panel structure and pools all data together. It is simple but biased when unobserved unit-specific characteristics are correlated with the regressors (omitted variable bias).
reg y x1 x2, vce(cluster panel_id)
Note: Standard errors are usually clustered at the unit level to account for correlation within units.
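To see the omitted variable problem in action, the following sketch simulates a panel in which the unit effect is correlated with the regressor and compares pooled OLS with FE. The true slope of 2 and all variable names are assumptions of the simulation.

```stata
* Simulated panel: unit effect alpha is correlated with x, true slope is 2
clear
set seed 2024
set obs 200
gen id = _n
gen double alpha = rnormal()
expand 5
bysort id: gen year = 2000 + _n
gen double x = 0.5*alpha + rnormal()
gen double y = 1 + 2*x + alpha + rnormal()

* Pooled OLS with unit-clustered standard errors: alpha sits in the error term
regress y x, vce(cluster id)
scalar b_pooled = _b[x]

* Fixed effects removes alpha before estimating the slope
xtset id year
xtreg y x, fe vce(cluster id)
scalar b_fe = _b[x]

display "pooled OLS: " b_pooled "   FE: " b_fe
```

Clustering corrects the standard errors for within-unit correlation, but it does nothing about the bias in the pooled coefficient itself.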
| Pitfall | Solution |
|---------|----------|
| Forgetting to xtset | Always start with xtset |
| Using RE when FE is needed | Run Hausman test |
| Ignoring serial correlation | Use xtreg, fe with vce(cluster id), or xtregar |
| Overlooking unbalanced panels | Check xtdescribe and consider the community-contributed xtbalance |
| Not reporting within/between R² | Report both from xtreg, fe |
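Several of these checks can be chained into one short routine. The sketch below is one way to do it on simulated, deliberately unbalanced data; the variable names and the data-generating process are assumptions.

```stata
* Simulated panel with some unit-years dropped to make it unbalanced
clear
set seed 42
set obs 200
gen id = _n
gen double alpha = rnormal()
expand 5
bysort id: gen year = 2000 + _n
gen double x = 0.5*alpha + rnormal()
gen double y = 1 + 2*x + alpha + rnormal()
drop if id <= 20 & year == 2005

* Declare the panel first, then inspect the participation pattern
xtset id year
xtdescribe

* Hausman test: fit FE and RE with default standard errors, then compare
xtreg y x, fe
estimates store fe
xtreg y x, re
estimates store re
hausman fe re

* Final FE fit with cluster-robust standard errors; output lists within and between R-squared
xtreg y x, fe vce(cluster id)
```

The Hausman comparison uses the default VCE because the classical test is not valid with cluster-robust standard errors; the clustered fit is run separately for reporting.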