
Data Analysis with Stata 15 Cheat Sheet
For more info see Stata’s reference manual (stata.com)

Results are stored as either r-class or e-class. See Programming Cheat Sheet.

Summarize Data
Examples use auto.dta (sysuse auto, clear) unless otherwise noted

univar price mpg, boxplot          (ssc install univar)
    calculate univariate summary, with box-and-whiskers plot
stem mpg
    return stem-and-leaf display of mpg
summarize price mpg, detail
    calculate a variety of univariate summary statistics
ci mean mpg price, level(99)       (for Stata 13: ci mpg price, level(99))
    compute standard errors and confidence intervals
correlate mpg price
    return correlation or covariance matrix
pwcorr price mpg weight, star(0.05)
    return all pairwise correlation coefficients with sig. levels
mean price mpg
    estimates of means, including standard errors
proportion rep78 foreign
    estimates of proportions, including standard errors for categories identified in varlist
ratio
    estimates of ratio, including standard errors
total price
    estimates of totals, including standard errors

Declare Data
By declaring data type, you enable Stata to apply data munging and analysis functions specific to certain data types

TIME SERIES                        webuse sunspot, clear
tsset time, yearly
    declare sunspot data to be yearly time series
tsreport
    report time series aspects of a dataset
generate lag_spot = L1.spot
    create a new variable of annual lags of sunspots
tsline spot
    plot time series of sunspots
    (Figure: tsline plot, number of sunspots over time)
arima spot, ar(1/2)
    estimate an auto-regressive model with 2 lags

TIME SERIES OPERATORS
L.     lag                            x_{t-1}
L2.    2-period lag                   x_{t-2}
F.     lead                           x_{t+1}
F2.    2-period lead                  x_{t+2}
D.     difference                     x_t - x_{t-1}
D2.    difference of difference       (x_t - x_{t-1}) - (x_{t-1} - x_{t-2})
S.     seasonal difference            x_t - x_{t-1}
S2.    lag-2 (seasonal difference)    x_t - x_{t-2}

USEFUL ADD-INS
tscollap        compact time series into means, sums and end-of-period values
carryforward    carry non-missing values forward from one obs. to the next
tsspell         identify spells or runs in time series

PANEL / LONGITUDINAL               webuse nlswork, clear
xtset id year
    declare national longitudinal data to be a panel
xtdescribe
    report panel aspects of a dataset
xtsum hours
    summarize hours worked, decomposing standard deviation into between and within components
xtline ln_wage if id <= 22, tlabel(#3)
    plot panel data as a line plot
    (Figure: xtline plot, wage relative to inflation, one panel per id)
xtreg ln_w c.age##c.age ttl_exp, fe vce(robust)
    estimate a fixed-effects model with robust standard errors

SURVEY DATA                        webuse nhanes2b, clear
svyset psuid [pweight = finalwgt], strata(stratid)
    declare survey design for a dataset
svydescribe
    report survey data details
svy: mean age, over(sex)
    estimate a population mean for each subpopulation
svy, subpop(rural): mean age
    estimate a population mean for rural areas
svy: tabulate sex heartatk
    report two-way table with tests of independence
svy: reg zinc c.age##c.age female weight rural
    estimate a regression using survey weights

SURVIVAL ANALYSIS                  webuse drugtr, clear
stset studytime, failure(died)
    declare data to be survival-time data
stsum
    summarize survival-time data
stcox drug age
    estimate a Cox proportional hazard model
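
The time-series operators listed in this section can be checked against hand-computed values. The following is a minimal sketch, assuming it is run as a do-file with internet access so that webuse can fetch the sunspot data; the variable names lag1, lead1, diff1, diff2 and sdiff2 are illustrative and not part of the original sheet.

```stata
* Sketch only: variable names below are illustrative assumptions
webuse sunspot, clear
tsset time, yearly

* Build lag, lead and differences with the operators
generate double lag1   = L.spot
generate double lead1  = F.spot
generate double diff1  = D.spot
generate double diff2  = D2.spot
generate double sdiff2 = S2.spot

* D. is the first difference: x_t - x_{t-1}
assert diff1 == spot - lag1 if !missing(diff1)

* S2. is the lag-2 seasonal difference: x_t - x_{t-2}
assert sdiff2 == spot - L2.spot if !missing(sdiff2)

* Operators can also be used directly inside an estimation command
regress spot L(1/2).spot

list time spot lag1 lead1 diff1 diff2 sdiff2 in 1/5
```

Operators only work after the data have been declared with tsset (or xtset for panels), which is why the declaration comes first.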

Statistical Tests

tabulate foreign rep78, chi2 exact expected
    tabulate foreign and repair record and return chi2 and Fisher’s exact statistic alongside the expected values
ttest mpg, by(foreign)
    estimate t test on equality of means for mpg by foreign
prtest foreign == 0.5
    one-sample test of proportions
ksmirnov mpg, by(foreign) exact
    Kolmogorov-Smirnov equality-of-distributions test
ranksum mpg, by(foreign)
    equality tests on unmatched data (independent samples)
anova systolic drug                (webuse systolic, clear)
    analysis of variance and covariance
pwmean mpg, over(rep78) pveffects mcompare(tukey)
    estimate pairwise comparisons of means with equal variances, including a multiple comparison adjustment

1 Estimate Models                  stores results as e-class

regress price mpg weight, vce(robust)
    estimate ordinary least squares (OLS) model of price on mpg and weight, apply robust standard errors
regress price mpg weight if foreign == 0, vce(cluster rep78)
    regress price only on domestic cars, cluster standard errors
rreg price mpg weight, genwt(reg_wt)
    estimate robust regression to limit the influence of outliers
probit foreign turn price, vce(robust)
    estimate probit regression with robust standard errors
logit foreign headroom mpg, or
    estimate logistic regression and report odds ratios
bootstrap, reps(100): regress mpg weight gear foreign
    estimate regression with bootstrapping
jackknife r(mean), double: sum mpg
    jackknife standard error of sample mean

ADDITIONAL MODELS
Some of these are built-in Stata commands; others are user-written and must be installed first (e.g. ssc install ivreg2).
pca                     principal components analysis
factor                  factor analysis
poisson • nbreg         count outcomes
tobit                   censored data
ivregress • ivreg2      instrumental variables
diff                    difference-in-difference
rd                      regression discontinuity
xtabond • xtdpdsys      dynamic panel estimator
teffects psmatch        propensity score matching
synth                   synthetic control analysis
oaxaca                  Blinder-Oaxaca decomposition

2 Diagnostics                      some are inappropriate with robust SEs

estat hettest
    test for heteroskedasticity
estat ovtest
    test for omitted variable bias
estat vif
    report variance inflation factor
dfbeta length
    calculate measure of influence
rvfplot, yline(0)
    plot residuals against fitted values
avplots
    plot all partial-regression leverage plots in one graph
Type help regress postestimation plots for additional diagnostic plots

3 Postestimation                   commands that use a fitted model

regress price headroom length
    used in all postestimation examples
display _b[length]
display _se[length]
    return coefficient estimate or standard error for length from most recent regression model
margins, dydx(length)
    return the estimated marginal effect for length (returns e-class information when post option is used)
margins, eyex(length)
    return the estimated elasticity of price with respect to length
predict yhat if e(sample)
    create predictions for sample on which model was fit
predict double resid, residuals
    calculate residuals based on last fit model
test headroom = 0
    test linear hypothesis that headroom estimate equals zero
lincom headroom - length
    test linear combination of estimates (headroom = length)

Estimation with Categorical & Factor Variables
more details at http://www.stata.com/manuals/u25.pdf

CONTINUOUS VARIABLES     measure something
CATEGORICAL VARIABLES    identify a group to which an observation belongs
INDICATOR VARIABLES      denote whether something is true or false (T / F)

OPERATOR   DESCRIPTION / EXAMPLE
i.         specify indicators
               regress price i.rep78
               specify rep78 variable to be an indicator variable
ib.        specify base indicator
               regress price ib(3).rep78
               set the third category of rep78 to be the base category
fvset      command to change base
               fvset base frequent rep78
               set the base to most frequently occurring category for rep78
c.         treat variable as continuous
               regress price i.foreign#c.mpg i.foreign
               treat mpg as a continuous variable and specify an interaction between foreign and mpg
o.         omit a variable or indicator
               regress price io(2).rep78
               set rep78 as an indicator; omit observations with rep78 == 2
#          specify interactions
               regress price mpg c.mpg#c.mpg
               create a squared mpg term to be used in regression
##         specify factorial interactions
               regress price c.mpg##c.mpg
               create all possible interactions with mpg (mpg and mpg^2)
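
The postestimation commands and factor-variable operators can be chained into one short session. The following is a minimal sketch, assuming the auto.dta example data shipped with Stata; the t statistic computed by hand and the variable names yhat and resid are illustrative, and ib3.rep78 is used here as the plain form of the base-level operator.

```stata
* Sketch only: a short estimate-then-postestimate session on auto.dta
sysuse auto, clear

* 1. Fit the model used in the postestimation examples
regress price headroom length

* 2. Stored coefficient and standard error for length
display _b[length]
display _se[length]
* t statistic for length, computed by hand from the stored results
display _b[length] / _se[length]

* 3. Fitted values and residuals for the estimation sample
predict double yhat if e(sample)
predict double resid, residuals

* 4. Hypothesis tests on the fitted coefficients
test headroom = 0
lincom headroom - length

* 5. Average marginal effect of length
margins, dydx(length)

* Factor-variable notation: rep78 as indicators with level 3 as base,
* plus mpg and its square via the factorial interaction operator
regress price ib3.rep78 c.mpg##c.mpg

* e-class results left behind by the last estimation command
ereturn list
```

Each postestimation command reads from the most recent fitted model, so running a new regress replaces the results that display, test, lincom and margins operate on.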
Tim Essam (tessam@usaid.gov) • Laura Hughes (lhughes@usaid.gov)
inspired by RStudio’s awesome Cheat Sheets (rstudio.com/resources/cheatsheets)
geocenter.github.io/StataTraining • updated June 2016
follow us @StataRGIS and @flaneuseks
Disclaimer: we are not affiliated with Stata. But we like it. CC BY 4.0
