Converting Repeated Measures Mixed Model Formula From SAS To R - Stack Overflow

Converting Repeated Measures mixed model formula from SAS to R Ask Question
There are several questions and posts about mixed models for more complex experimental designs, so I thought this more simple model would help other
beginners in this process as well as I.
So, my question is I would like to formulate a repeated measures ancova in R from sas proc mixed procedure:
proc mixed data=df1;

FitStatistics=akaike
class GROUP person day;
model Y = GROUP X1 / solution alpha=.1 cl;
repeated / type=cs subject=person group=GROUP;
lsmeans GROUP;
run;
Here is the SAS output using the data created in R (below):
. Effect panel Estimate Error DF t Value Pr >

|t| Alpha Lower Upper
Intercept -9.8693 251.04 7 -0.04
0.9697 0.1 -485.49 465.75
panel 1 -247.17 112.86 7 -2.19
0.0647 0.1 -460.99 -33.3510
panel 2 0 . . . .
. . .
X1 20.4125 10.0228 7 2.04
0.0811 0.1 1.4235 39.4016
Below is how I formulated the model in R using 'nlme' package, but am not getting similar coefficient estimates:
## create reproducible example fake panel data set:

set.seed(94); subject.id = abs(round(rnorm(10)*10000,0))
set.seed(99); sds = rnorm(10,15,5);means = 1:10*runif(10,7,13);trends =

runif(10,0.5,2.5)
this = NULL; set.seed(98)

for(i in 1:10) { this = c(this,rnorm(6, mean = means[i], sd =
sds[i])*trends[i]*1:6)}
set.seed(97)
that = sort(rep(rnorm(10,mean = 20, sd = 3),6))
df1 = data.frame(day = rep(1:6,10), GROUP = c(rep('TEST',30),rep('CONTROL',30)),

Y = this,
X1 = that,
person = sort(rep(subject.id,6)))
## use package nlme

require(nlme)
## run repeated measures mixed model using compound symmetry covariance

structure:
summary(lme(Y ~ GROUP + X1, random = ~ +1 | person,
correlation=corCompSymm(form=~day|person), na.action = na.exclude,
data = df1,method='REML'))
Now, the output from R, which I now realize is similar to the output from lm() :
Value Std.Error DF t-value p-value

(Intercept) -626.1622 527.9890 50 -1.1859379 0.2413
GROUPTEST -101.3647 156.2940 7 -0.6485518 0.5373
X1 47.0919 22.6698 7 2.0772934 0.0764
I believe I'm close as to the specification, but not sure what piece I'm missing to make the results match (within reason..). Any help would be appreciated!
UPDATE: Using the code in the answer below, the R output becomes:
> summary(model2)
Scroll to bottom for the parameter estimates look! identical to SAS.
Linear mixed-effects model fit by REML

Data: df1
AIC BIC logLik
776.942 793.2864 -380.471
Random effects:
Formula: ~GROUP - 1 | person
Structure: Diagonal
GROUPCONTROL GROUPTEST Residual
StdDev: 184.692 14.56864 93.28885
Correlation Structure: Compound symmetry

Formula: ~day | person
Join Stack Overflow to learn, share knowledge, and build your career.
Parameter estimate(s): Email Sign Up OR SIGN IN WITH Google Facebook
Rho
-0.009929987
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | GROUP
Parameter estimates:
TEST CONTROL
1.000000 3.068837
Fixed effects: Y ~ GROUP + X1
Value Std.Error DF t-value p-value

(Intercept) -9.8706 251.04678 50 -0.0393178 0.9688
GROUPTEST -247.1712 112.85945 7 -2.1900795 0.0647
X1 20.4126 10.02292 7 2.0365914 0.0811
r sas mixedmodels
edited Aug 8 '12 at 1:23 asked Aug 5 '12 at 20:15
bahakev
1,544 4 27 31
What do you mean about not getting similar results? Do you mean that there is information that's missing, or that you're getting different estimates? If the latter, are you sure that
the input data is the same? – David Robinson Aug 5 '12 at 20:22
I'm getting different estimates. I have indeed checked that the input data is identical also, i.e. df1 in SAS = df1 in R. – bahakev Aug 5 '12 at 20:26
might it just be a difference in contrasts of the fixed effects? i.e. contrasts(df1$GROUP) <- contr.SAS(2) ? – Ben Bolker Aug 6 '12 at 1:07
Hi @BenBolker! Good to see you noticing this thread. I'll be curious if you agree with my assessment below or not. I think this is trickier than the OP hoped, but it would be nice if I
was wrong. – Aaron Aug 6 '12 at 1:38
@bahakev: If you add the statistical information and ask it more in terms of what model is appropriate, this would be a great question for stats.stackexchange. – Aaron Aug 6 '12
at 1:56
2 Answers
Please try below:
model1 <- lme(

Y ~ GROUP + X1,
random = ~ GROUP | person,
correlation = corCompSymm(form = ~ day | person),
na.action = na.exclude, data = df1, method = "REML"
)
summary(model1)
I think random = ~ groupvar | subjvar option with R lme provides similar result of repeated /

subject = subjvar group = groupvar option with SAS/MIXED in this case.
Edit:
SAS/MIXED
R (a revised model2)
model2 <- lme(

Y ~ GROUP + X1,
random = list(person = pdDiag(form = ~ GROUP - 1)),
correlation = corCompSymm(form = ~ day | person),
weights = varIdent(form = ~ 1 | GROUP),
na.action = na.exclude, data = df1, method = "REML"
)
summary(model2)
So, I think these covariance structures are very similar (σg1 = τg2 + σ1).
Edit 2:
Covariate estimates (SAS/MIXED):
Variance person GROUP TEST 8789.23

CS person GROUP TEST 125.79
Variance person GROUP CONTROL 82775
CS person GROUP CONTROL 33297
So
TEST group diagonal element
= 125.79 + 8789.23
= 8915.02
CONTROL group diagonal element
= 33297 + 82775
= 116072
where diagonal element = σk1 + σk2.
Covariate estimates (R lme):
Random effects:
Formula: ~GROUP - 1 | person
Structure: Diagonal
GROUP1TEST GROUP2CONTROL Residual
StdDev: 14.56864 184.692 93.28885
Correlation Structure: Compound symmetry

Formula: ~day | person
Parameter estimate(s):
Rho
-0.009929987
Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | GROUP
Parameter estimates:
1TEST 2CONTROL
1.000000 3.068837
So
TEST group diagonal element

= 14.56864^2 + (3.068837^0.5 * 93.28885 * -0.009929987) + 93.28885^2
= 8913.432
CONTROL group diagonal element
= 184.692^2 + (3.068837^0.5 * 93.28885 * -0.009929987) + (3.068837 *
93.28885)^2
= 116070.5
where diagonal element = τg2 + σ1 + σg2.
edited Aug 8 '12 at 11:12 answered Aug 6 '12 at 0:33
Triad sou.
2,749 2 16 27
I'm pretty sure that random = ~ GROUP | person doesn't change anything from the original code because each

person is only in one group. What that syntax would do is allow the covariance between levels of group to differ
within an individual. – Aaron Aug 6 '12 at 1:36
I still think the random effect isn't right. random = list(person = pdDiag(form = ~ GROUP - 1)) is still

allowing the covariance between levels of group to differ within an individual, but is forcing them not to be
correlated. – Aaron Aug 6 '12 at 17:15
Also, since R actually parameterizes the model in terms of the correlation and the variances, it's hard to see how
your matrix matches the R code you wrote. Not that it's necessarily wrong, but it would help if you used R's
parameterization to explain. – Aaron Aug 6 '12 at 17:17
This darn near perfectly matches the proc mixed output so I will mark as answered. Thanks Triad! – bahakev Aug

8 '12 at 0:51
@bahakev: That would be cool! But when I ran it I couldn't see how the variance terms agreed; can you share your
results by editing the original post? – Aaron Aug 8 '12 at 1:04
Oooh, this is going to be a tricky one, and if it's even possible using standard nlme functions, is going
to take some serious study of Pinheiro/Bates.
Before you spend the time doing that though, you should make absolutely sure that this is exact model
you need. Perhaps there's something else that might fit the story of your data better. Or maybe there's
something R can do more easily that is just as good, but not quite the same.
First, here's my take on what you're doing in SAS with this line:
repeated / type=cs subject=person group=GROUP;
This type=cs subject=person is inducing correlation between all the measurements on the same

person, and that correlation is the same for all pairs of days. The group=GROUP is allowing the
correlation for each group to be different.
In contrast, here's my take on what your R code is doing:
random = ~ +1 | person,
correlation=corCompSymm(form=~day|person)
This code is actually adding almost the same effect in two different ways; the random line is adding a

random effect for each person, and the correlation line is inducing correlation between all the
measurements on the same person. However, these two things are almost identical; if the correlation is
positive, you get the exact same result by including either of them. I'm not sure what happens when
you include both, but I do know that only one is necessary. Regardless, this code has the same
correlation for all individuals, it's not allowing each group to have their own correlation.
To let each group have their own correlation, I think you have to build a more complicated correlation
structure up out of two different pieces; I've never done this but I'm pretty sure I remember
Pinheiro/Bates doing it.
You might consider instead adding a random effect for person and then letting the variance be different
for the different groups with weights=varIdent(form=~1|group) (from memory, check my syntax,
please). This won't quite be the same but tells a similar story. The story in SAS is that the
measurements on some individuals are more correlated than the measurements on other individuals.
Thinking about what that means, the measurements for individuals with higher correlation will be closer
together than the measurements for individuals with lower correlation. In contrast, the story in R is that
the variability of measurements within individuals varies; thinking about that, measurements with higher
variability with have lower correlation. So they do tell similar stories, but come at it from opposite sides.
It is even possible (but I would be surprised) that these two models end up being different
parameterizations of the same thing. My intuition is that the overall measurement variability will be
different in some way. But even if they aren't the same thing, it would be worth writing out the
parameterizations just to be sure you understand them and to make sure that they are appropriately
describing the story of your data.
answered Aug 6 '12 at 1:34
Aaron
27.5k 4 50 105
A final thought though, changing the the correlation structure usually doesn't affect the estimates of the fixed effects,
so if that's what's different, there may be something else going too. – Aaron Aug 6 '12 at 1:54
Your answer sounds reasonable to me, but I really think we need to hear/see more from the OP about what the
output of each program looks like (I have access to SAS, but not conveniently) and where the key differences are ...
– Ben Bolker Aug 6 '12 at 4:01
Thanks, @BenBolker. I haven't tried running the OP's code either; I have access to SAS but not conveniently from
home. – Aaron Aug 6 '12 at 4:36
One thought I get errors when I delete the random= term while keeping the correlation= term. If, as you

say, they are redundant then I should be able to remove one. – bahakev Aug 8 '12 at 0:58
@bahakev: Be sure to switch to gls when you remove the random effects. – Aaron Aug 8 '12 at 1:03

Converting Repeated Measures Mixed Model Formula From SAS To R - Stack Overflow

Transféré par

Informations du document

Titre original

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

Converting Repeated Measures Mixed Model Formula From SAS To R - Stack Overflow

Transféré par

Droits d'auteur :

Formats disponibles

proc mixed data=df1;

. Effect panel Estimate Error DF t Value Pr >

## create reproducible example fake panel data set:

set.seed(99); sds = rnorm(10,15,5);means = 1:10*runif(10,7,13);trends =

this = NULL; set.seed(98)

df1 = data.frame(day = rep(1:6,10), GROUP = c(rep('TEST',30),rep('CONTROL',30)),

## use package nlme

## run repeated measures mixed model using compound symmetry covariance

Value Std.Error DF t-value p-value

Linear mixed-effects model fit by REML

Correlation Structure: Compound symmetry

Fixed effects: Y ~ GROUP + X1

Value Std.Error DF t-value p-value

I'm getting different estimates. I have indeed checked that the input data is identical also, i.e. df1 in SAS = df1 in R. – baha­kev Aug 5 '12 at 20:26

might it just be a difference in contrasts of the fixed effects? i.e. contrasts(df1$GROUP) <- contr.SAS(2) ? – Ben Bolker Aug 6 '12 at 1:07

model1 <- lme(

I think random = ~ groupvar | subjvar option with R lme provides similar result of repeated /

model2 <- lme(

Variance person GROUP TEST 8789.23

Correlation Structure: Compound symmetry

TEST group diagonal element

I'm pretty sure that random = ~ GROUP | person doesn't change anything from the original code because each

I still think the random effect isn't right. random = list(person = pdDiag(form = ~ GROUP - 1)) is still

This darn near perfectly matches the proc mixed output so I will mark as answered. Thanks Triad! – baha­kev Aug

repeated / type=cs subject=person group=GROUP;

This type=cs subject=person is inducing correlation between all the measurements on the same

This code is actually adding almost the same effect in two different ways; the random line is adding a

One thought ­­ I get errors when I delete the random= term while keeping the correlation= term. If, as you

@baha­kev: Be sure to switch to gls when you remove the random effects. – Aaron Aug 8 '12 at 1:03

Vous aimerez peut-être aussi

I'm getting different estimates. I have indeed checked that the input data is identical also, i.e. df1 in SAS = df1 in R. – bahakev Aug 5 '12 at 20:26

This darn near perfectly matches the proc mixed output so I will mark as answered. Thanks Triad! – bahakev Aug

One thought I get errors when I delete the random= term while keeping the correlation= term. If, as you

@bahakev: Be sure to switch to gls when you remove the random effects. – Aaron Aug 8 '12 at 1:03