Académique Documents
Professionnel Documents
Culture Documents
gi ing a gift
Search
e ed b
Stata FAQ
How can I anal e a subpopulation of m surve data in Stata?
NOTE: Thi age a c ea ed i g S a a 9. A
i h S a a 8 ( ea ie e i
f S a a).
f he c de
hi
age i
i h S a a 10. The c de
hi
age i
Whe a a i g
e da a, i i c
a
a ce ai e
de , e ha
e ,
e
de
age 50. Whe a a i g he e b
a i (AKA d ai ),
eed
e he a
ia e i . S a a 9 ha
b
ai
i
ha a e e f e ib e a d ea
e. U i g he b
ai
i ( )i e e e i
a
he a a
e da a. If he da a e i b e , ea i g ha b e a i
be i c ded i he b
a i a e de e ed f
he da a
he a da d e
f he e i a e ca
be ca c a ed c ec . Whe he b
ai
i ( ) i ed,
he ca e defi
b he b
a i a e ed i he ca c a i
f he e i a e, b a ca e a e ed i he ca c a i
f he a da d e
. F
ei f ai
hi i e, ea e ee Sa i g Tech i e , Thi d Edi i b Wi ia G. C ch a (1977) a d S a A ea
E i a i b J. N. K. Ra (2003).
F he a e f c i e c , e i
he a e f a sv c
a d.
e he mean c
a df a
e a
e. H
e e , he subpop a d over
e
ig
e,
ed
We i a b
i g a he ea f
c i
a iab e, ell. Ne , e i c ide
a iab e
e i h he subpop
i , r_rnd, hich i c ded 0/1, a d both, hich i c ded 1/2. A
i ee, he subpop i ha d e he e
a iab e
diffe e .
use http://www.ats.ucla.edu/stat/stata/seminars/sv _stata_intro/strsrs, clear
sv : mean ell
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
=
Population size =
Design df
=
620
6194
618
-------------------------------------------------------------Linearized
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
22.83578
.669696
21.52063
24.15094
-------------------------------------------------------------He e e ca ee ha r_rnd i c ded 0/1. (Thi missing i i ed he e
h
ha he e a e
i i g a e f hi
a iab e. We i a
hi a e .) N ice i he
f he sv : tab c
a d ha he e a e 789.6 ca e c ded 1. (I i
a h e
be beca e e a e e i a i g hi a e i g he
babii
eigh .) I he
f he sv : mean c
a d, e
a
ee ha 789.552 ca e a e i c ded i he b
ai .
sv : tab r_rnd, count nolabel missing
(running tabulate on estimation sample)
Number of strata
Number of PSUs
=
=
2
620
Number of obs
Population size
Design df
=
620
= 6193.9997
=
618
---------------------yr_rnd
count
----------+----------0
5404
www.ats.ucla.edu/stat/stata/faq/sv _stata_subpop.htm
1/7
03-02-12
789.6
Total
6194
---------------------Key: count
= weighted counts
sv , subpop( r_rnd): mean ell
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
Population size
Subpop. no. obs
Subpop. size
Design df
=
620
=
6194
=
79
= 789.552
=
618
-------------------------------------------------------------Linearized
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
43.50105
2.658549
38.28016
48.72193
-------------------------------------------------------------No le '
o e a a iable coded 1/2 in ead of 0/1. He e e can ee ha both i coded 1/2. (Thi missing op ion i ed he e o
ho ha he e a e no mi ing al e fo hi a iable. We ill an o kno hi la e on.) No ice in he o p of he sv : tab
command ha he e a e 1888 ca e coded 1. Ho e e , in he o p of he sv : mean command, e ee ha all of he ob e a ion ,
6194 ca e , a e incl ded in he bpop la ion. Thi i beca e he subpop op ion m ha e a e/fal e a iable. A a ed on page
39 of he S a a 9 S e man al, hen he subpop op ion i ed, he bpop la ion i ac all defined b he 0 (fal e), hich indica e
ho e ca e o be e cl ded f om he bpop la ion. Non-0 al e a e incl ded in he anal i , e cep fo mi ing al e , hich a e
e cl ded f om he anal i . Beca e e ha e no ca e coded a 0, all of he ca e a e incl ded in he bpop la ion, a e plained in
he no e in he o p .
sv : tab both, count nolabel missing
(running tabulate on estimation sample)
Number of strata
Number of PSUs
=
=
2
620
Number of obs
Population size
Design df
=
620
= 6193.9997
=
618
---------------------both
count
----------+----------1
1888
2
4306
Total
6194
---------------------Key: count
= weighted counts
sv , subpop(both): mean ell
(running mean on estimation sample)
Note: subpop() subpopulation is same as full population
subpop() = 1 indicates observation in subpopulation
subpop() = 0 indicates observation not in subpopulation
Survey: Mean estimation
Number of strata =
Number of PSUs =
www.ats.ucla.edu/stat/stata/faq/sv _stata_subpop.htm
2
620
Number of obs
Population size
Subpop. no. obs
Subpop. size
Design df
=
=
=
=
=
620
6194
620
6194
618
2/7
03-02-12
-------------------------------------------------------------Linearized
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
22.83578
.669696
21.52063
24.15094
-------------------------------------------------------------No le ' c ea e a cop of both and ecode he 1
al e in he bpop la ion a iable. The o p of
sv : mean command ho
ha he all of he ca e
bpop la ion. No ice he no e ha S a a p o ide
2
610
Number of obs
Population size
Subpop. no. obs
Subpop. size
Design df
=
610
= 6094.03
=
424
= 4235.65
=
608
-------------------------------------------------------------Linearized
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
22.03727
.894207
20.28116
23.79338
-------------------------------------------------------------Yo can al o e if hen defining o
bpop la ion. I ho ld be e ed ha hi i VERY diffe en f om ing if o emo e ca e
f om an anal i . U ing if in he subpop op ion doe no emo e ca e f om he anal i . The ca e e cl ded f om he bpop la ion
b he if a e ill ed in he calc la ion of he anda d e o , a he ho ld be.
sv , subpop( r_rnd if mobilit < 50): mean ell
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
www.ats.ucla.edu/stat/stata/faq/sv _stata_subpop.htm
2
620
Number of obs
Population size
Subpop. no. obs
Subpop. size
Design df
=
620
=
6194
=
78
= 779.555
=
618
3/7
03-02-12
-------------------------------------------------------------Linearized
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
43.86654
2.668957
38.62521
49.10786
-------------------------------------------------------------sv , subpop( r_rnd if mobilit < 50 & hsg < 80): mean ell
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
Population size
Subpop. no. obs
Subpop. size
Design df
=
620
=
6194
=
78
= 779.555
=
618
-------------------------------------------------------------Linearized
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
43.86654
2.668957
38.62521
49.10786
-------------------------------------------------------------Yo can e ei he subpop o over i h m l iple a iable o c ea e he bpop la ion ha o an . Le ' ee ome e ample ing
he over op ion. Fi , e ill e r_rnd, o 0/1 a iable, hen both, o 1/2 a iable. No ice ha he o p i diffe en f om he
o p
ing he subpop op ion in ha bo h ca ego ie of he a iable a e gi en, and he e i no no e hen a 1/2 a iable i ed.
Plea e no e ha he over op ion i onl a ailable fo he
e command mean, proportion, ratio and total.
sv : mean ell, over( r_rnd)
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
=
Population size =
Design df
=
620
6194
618
0: yr_rnd = 0
No: yr_rnd = No
-------------------------------------------------------------Linearized
Over
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
0
19.81673
.6771138
18.48701
21.14646
No
43.50105
2.658549
38.28016
48.72193
-------------------------------------------------------------sv : mean ell, over(both)
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
=
Population size =
Design df
=
620
6194
618
No: both = No
Yes: both = Yes
-------------------------------------------------------------Linearized
Over
Mean
Std. Err.
[95% Conf. Interval]
www.ats.ucla.edu/stat/stata/faq/sv _stata_subpop.htm
4/7
03-02-12
-------------+-----------------------------------------------ell
No
24.64196
1.329677
22.03073
27.25319
Yes
22.04363
.8854687
20.30473
23.78252
-------------------------------------------------------------Now let's use both r_rnd and both as the subpopulation variables. First we will use the sv : tab command to ensure that there are
cases in all four categories. Then we use the sv : mean command with the over option.
sv : tab r_rnd both, count
(running tabulate on estimation sample)
Number of strata
Number of PSUs
=
=
2
620
Number of obs
Population size
Design df
=
620
= 6193.9997
=
618
=
=
0.0807
0.0896
P = 0.7647
2
620
Number of obs
=
Population size =
Design df
=
620
6194
618
5/7
03-02-12
2
620
Number of obs
=
Population size =
Design df
=
620
6194
618
1: emergrp = 1
2: emergrp = 2
3: emergrp = 3
4: emergrp = 4
-------------------------------------------------------------Linearized
Over
Mean
Std. Err.
[95% Conf. Interval]
-------------+-----------------------------------------------ell
1
14.53282
.9707122
12.62652
16.43911
2
19.21555
1.594388
16.08447
22.34662
3
26.41136
1.815472
22.84612
29.97661
4
38.60681
1.926595
34.82334
42.39028
-------------------------------------------------------------sv : mean ell, over(emergrp r_rnd both)
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
=
Population size =
Design df
=
620
6194
618
6/7
03-02-12
_subpop_13
29.11945
2.818771
23.58392
34.65498
_subpop_14
32.95607
2.611229
27.82811
38.08403
_subpop_15
54.09091
6.870972
40.59763
67.58419
_subpop_16
57.8
3.730597
50.47382
65.12618
-------------------------------------------------------------The s bpop option can be combined with the o er option. This is hand because if cannot be used with the o er option. B
combining the options, ou can have "the best of both worlds."
sv , subpop( r_rnd if mobilit < 50 & hsg < 80): mean ell, over(emergrp both)
(running mean on estimation sample)
Survey: Mean estimation
Number of strata =
Number of PSUs =
2
620
Number of obs
Population size
Subpop. no. obs
Subpop. size
Design df
=
620
=
6194
=
78
= 779.555
=
618
The content of this web site should not be construed as an endorsement of an particular web site, book, or software product b the
Universit of California.
www.ats.ucla.edu/stat/stata/faq/sv _stata_subpop.htm
7/7