
Madeline Lasell

SPH 245
Homework 1

PROBLEM 1:
(a) Normality: After concatenating the treatment variables into a single group label (with COMPRESS)
and sorting the data by treatment, I used proc univariate to test for normality within each treatment
combination. The Shapiro-Wilk tests show that none of the treatment groups violates the normality
assumption: every p-value is non-significant (p>0.05), so we cannot reject the null hypothesis that
the data are normally distributed at the 0.05 level. We can therefore treat the data as approximately
normal. Appendix A at the end of this report shows the p-values associated with each treatment
combination (drug, exercise, and high-fat diet).

Equal Variances: Because the data appear normally distributed, I used Bartlett's test (below) to check
for homogeneity of variances (Bartlett's test is parametric and sensitive to departures from
normality). The test gives p=0.9045 (p>0.05), so we cannot reject the null hypothesis of equal
variances and can assume that the variances are homogeneous.

***Because treatment assignment was randomized and both normality and equal variances can be
assumed, it DOES appear appropriate to use ANOVA for data analysis.

(b) Null Hypotheses:


1. H0: There is no interaction effect between exercise and drug on mean cholesterol level.
2. H0: There is no interaction effect between diet and drug on mean cholesterol level.
3. H0: There is no interaction effect between exercise and diet on mean cholesterol level.
4. H0: Controlling for exercise and drug assignment, there is no difference in mean cholesterol
between patients randomly assigned to a high-fat diet or not assigned to a high-fat diet.
5. H0: Controlling for exercise and diet assignment, there is no difference in mean cholesterol
between patients randomly assigned to a drug or not assigned to a drug.
6. H0: Controlling for diet and drug assignment, there is no difference in mean cholesterol between
patients randomly assigned to exercise or not assigned to exercise.

Alternative Hypotheses:
1. HA: There is an interaction effect between exercise and drug on mean cholesterol level.
2. HA: There is an interaction effect between diet and drug on mean cholesterol level.
3. HA: There is an interaction effect between exercise and diet on mean cholesterol level.
4. HA: Controlling for exercise and drug assignment, there is a difference in mean cholesterol
between patients randomly assigned to a high-fat diet or not assigned to a high-fat diet.
5. HA: Controlling for exercise and diet assignment, there is a difference in mean cholesterol
between patients randomly assigned to a drug or not assigned to a drug.
6. HA: Controlling for diet and drug assignment, there is a difference in mean cholesterol between
patients randomly assigned to exercise or not assigned to exercise.

Our three-way ANOVA (GLM) output gives an overall p-value below 0.05 (p<0.0001), so we reject
the null hypothesis that mean cholesterol levels are the same across all treatment combinations.

The following table shows that diet, exercise, and drug each have a significant p-value. This
indicates that each appears to have its own main effect when controlling for the others. In
contrast, the interactions are not significant (p-values greater than 0.05).

Results:
At a significance level of 0.05:

1) The interactions between diet and exercise (p=0.6649), diet and drug (p=0.0770), and
exercise and drug (p=0.4690) are not significant. This means we have no evidence that:
i. the effect of diet on cholesterol levels depends on exercise or the drug
ii. the effect of exercise on cholesterol levels depends on diet or the drug
iii. the effect of the drug on cholesterol levels depends on diet or exercise

2) Diet treatment had a significant main effect on reducing mean cholesterol levels
(p<0.0001).

3) Exercise treatment had a significant main effect on reducing mean cholesterol levels
(p=0.0009).

4) Drug treatment had a significant main effect on reducing mean cholesterol levels
(p<0.0001).

i. Because there are 3 types of drug, the significant main effect does not tell us which
drugs differ, so we compare them pairwise. According to Tukey’s post-hoc test, Drug A
differs significantly from both Drug B and Drug C, whereas Drugs B and C do not differ
significantly from one another. This pattern sets Drug A apart from the other two, and
we infer that Drug A had the greatest effect on reducing cholesterol level.
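
The inference about Drug A depends on the direction of the differences, not only on their significance. A minimal sketch of one way to check this is to print the raw mean cholesterol for each drug. It assumes the dataset CHOLESTEROL_HMK1 and the variable names DRUG and CHOLESTEROL used in the FULL CODE section of this report:

```sas
/* Sketch only: raw mean cholesterol by drug, to see the direction of the differences. */
/* Assumes CHOLESTEROL_HMK1 has already been imported as in the FULL CODE section.     */
PROC MEANS DATA=CHOLESTEROL_HMK1 N MEAN STD;
    CLASS DRUG;
    VAR CHOLESTEROL;
RUN;
```

In a balanced design these raw means coincide with the least-squares means. Whether this dataset is balanced is an assumption here, so the LSMEANS output remains the reference for the adjusted comparisons.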

PROBLEM 2:
(a) Normality: After concatenating age and process into a single group label and sorting the data by
that label, I again used proc univariate to test for normality. The Shapiro-Wilk test showed that one
of the age-by-process groups (Older, Imagery) violates the normality assumption, with a significant
p-value (p=0.0251). Therefore, we reject the null hypothesis that the data in this group are normally
distributed at the 0.05 level, and we cannot assume that the data are normally distributed. Appendix B
at the end of this report shows the p-values associated with each combination of age (younger or
older) and memorization process (5 different processes).

Equal Variances: Brown and Forsythe’s test (which is robust to departures from normality and
therefore appropriate for this non-normal dataset) gives a p-value greater than 0.05 (p=0.9045).
This means that we don’t have significant evidence to reject the null hypothesis that the variances
are homogeneous. So, we can assume that the data maintain equal variances.

***However, although the data have equal variances, they do not satisfy normality. Therefore, we
use a 2-way ANOVA only after taking the square root of the response variable (words), a
transformation intended to correct the departure from normality.
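
Whether the square-root transformation actually restores normality can be checked by repeating the Shapiro-Wilk test on the transformed response. The following is a minimal sketch, assuming the variables TECHNIQUES and SQRTWORDS have already been created in MEMORY_HMK1 as in step 2A of the FULL CODE section. This check is not part of the original analysis, and no results from it are reported here.

```sas
/* Sketch only: re-run the normality tests on the square-root-transformed response. */
/* Assumes MEMORY_HMK1 already contains TECHNIQUES and SQRTWORDS (see step 2A).     */
PROC SORT DATA=MEMORY_HMK1;
    BY TECHNIQUES;
RUN;

PROC UNIVARIATE DATA=MEMORY_HMK1 NORMAL;
    VAR SQRTWORDS;
    BY TECHNIQUES;
    /* Keep only the table of normality tests for each age-by-process group. */
    ODS SELECT TestsForNormality;
RUN;
```

If the Older, Imagery group still shows a significant Shapiro-Wilk p-value after the transformation, the normality assumption would remain in doubt for that group.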

(b) Null Hypotheses:


• H0: Controlling for age, process type has no effect on the mean number of words
memorized.
• H0: Controlling for process, age has no effect on the mean number of words memorized.
• H0: There is no interaction effect between age and process on the mean number of words
memorized.

Alternative Hypotheses:
• HA: Controlling for age, process type has an effect on the mean number of words
memorized.
• HA: Controlling for process, age has an effect on the mean number of words memorized.
• HA: There is an interaction effect between age and process on the mean number of words
memorized.

Our 2-way ANOVA (GLM) output gives an overall p-value below 0.05 (p<0.0001), so we reject the
null hypothesis that the mean number of words memorized is the same across all combinations of
age and process.

The following table shows that age and process each have a significant p-value. This indicates
that both appear to have an individual main effect when controlling for the other. In addition,
the interaction between them is significant (p=0.0014).

Results:
So, at a significance level of 0.05:

1) The interaction between age and process has a significant effect on the mean number of
words remembered (p=0.0014)

2) Age had a significant main effect on the mean number of words remembered
(p<0.0001)

3) Processes had a significant main effect on the mean number of words remembered
(p<0.0001)
a. Because there are 5 processes, the significant main effect does not tell us which processes
differ, so we compare them pairwise. Because the age-by-process interaction is also significant,
these overall comparisons should be interpreted with caution. Tukey’s post-hoc test grouped the
pairs of processes as follows:
Significantly different from one another:
1. Adjective, Counting
2. Adjective, Rhyming
3. Counting, Imagery
4. Counting, Intention
5. Imagery, Rhyming
6. Intention, Rhyming

Not significantly different from one another:
1. Adjective, Imagery
2. Adjective, Intention
3. Counting, Rhyming
4. Imagery, Intention

***In conclusion, we reject all 3 null hypotheses because all three p-values are significant at the 0.05 level.
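
Because the age-by-process interaction is significant, the effect of process may differ between the two age groups. One way to look at this is to test the process effect separately within each age group. The following is a minimal sketch using the SLICE= option of the LSMEANS statement in PROC GLM, assuming the same dataset and variable names as in step 2B of the FULL CODE section. It is offered as a possible follow-up, not as part of the analysis reported above.

```sas
/* Sketch only: simple effects of process within each age group.          */
/* Assumes MEMORY_HMK1 with AGE, PROCESS and SQRTWORDS, as in step 2B.    */
PROC GLM DATA=MEMORY_HMK1;
    CLASS AGE PROCESS;
    MODEL SQRTWORDS = AGE PROCESS AGE*PROCESS;
    /* SLICE=AGE tests the process effect separately at each level of AGE. */
    LSMEANS AGE*PROCESS / SLICE=AGE;
RUN;
QUIT;
```

The sliced tests are on the square-root scale, the same scale as the main analysis.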

FULL CODE:

*PROBLEM 1;
FILENAME REFFILE '/folders/myshortcuts/SAS_Scripts/Cholesterol.csv';

* Import the CSV file. The PROC IMPORT statement and its options end with a single semicolon;
PROC IMPORT DATAFILE=REFFILE
OUT=CHOLESTEROL_HMK1
DBMS=CSV
REPLACE;
GETNAMES=YES;
RUN;

PROC CONTENTS DATA=CHOLESTEROL_HMK1; RUN;

*1A;
DATA CHOLESTEROL_HMK1;
SET CHOLESTEROL_HMK1;
TREATMENTS = COMPRESS (DRUG||EXERCIES||DIET);
RUN;

PROC SORT DATA=CHOLESTEROL_HMK1;
BY TREATMENTS DESCENDING CHOLESTEROL;
RUN;

* Normality tests and Q-Q plots of cholesterol for each treatment combination;
PROC UNIVARIATE DATA=CHOLESTEROL_HMK1 NORMAL NEXTROBS=0;
VAR CHOLESTEROL;
QQPLOT CHOLESTEROL / NORMAL(MU=EST SIGMA=EST COLOR=BLUE L=1);
BY TREATMENTS;
RUN;

PROC SGPLOT DATA=CHOLESTEROL_HMK1;
VBOX CHOLESTEROL / CATEGORY=TREATMENTS;
RUN;

*1B;
PROC GLM DATA=CHOLESTEROL_HMK1;
CLASS TREATMENTS;
MODEL CHOLESTEROL=TREATMENTS;
* Request each homogeneity-of-variance test in its own MEANS statement;
MEANS TREATMENTS / HOVTEST=BARTLETT;
MEANS TREATMENTS / HOVTEST=BF;
RUN;

PROC GLM DATA=CHOLESTEROL_HMK1;
CLASS DIET EXERCIES DRUG;
MODEL CHOLESTEROL = DIET EXERCIES DRUG DIET*EXERCIES DIET*DRUG DRUG*EXERCIES;
LSMEANS DIET EXERCIES DRUG / CL ADJUST=TUKEY;
RUN;

*PROBLEM 2;
FILENAME REFFILE '/folders/myshortcuts/SAS_Scripts/MemoryA.csv';

* Import the CSV file. The PROC IMPORT statement and its options end with a single semicolon;
PROC IMPORT DATAFILE=REFFILE
OUT=MEMORY_HMK1
DBMS=CSV
REPLACE;
GETNAMES=YES;
RUN;

PROC CONTENTS DATA=MEMORY_HMK1; RUN;

*2A;
DATA MEMORY_HMK1;
SET MEMORY_HMK1;
TECHNIQUES = COMPRESS (AGE||PROCESS);
SQRTWORDS=SQRT(WORDS);
RUN;

PROC SORT DATA=MEMORY_HMK1;
BY TECHNIQUES DESCENDING WORDS;
RUN;

* Normality tests and Q-Q plots of words for each age-by-process group;
PROC UNIVARIATE DATA=MEMORY_HMK1 NORMAL NEXTROBS=0;
VAR WORDS;
QQPLOT WORDS / NORMAL(MU=EST SIGMA=EST COLOR=BLUE L=1);
BY TECHNIQUES;
RUN;

PROC ANOVA DATA=MEMORY_HMK1;
CLASS TECHNIQUES;
MODEL WORDS=TECHNIQUES;
* Request each homogeneity-of-variance test in its own MEANS statement;
MEANS TECHNIQUES / HOVTEST=BARTLETT;
MEANS TECHNIQUES / HOVTEST=BF;
RUN;

*2B;
PROC GLM DATA=MEMORY_HMK1;
CLASS AGE PROCESS;
MODEL SQRTWORDS=AGE PROCESS AGE*PROCESS;
LSMEANS AGE PROCESS AGE*PROCESS/CL ADJUST=TUKEY;
RUN;

APPENDIX A: 12 tables sorted in order by drug treatment type (A, B, C), exercise (yes, no), diet (yes, no)

Treatment A: No exercise, No diet

Treatment A: No exercise, Yes diet

Treatment A: Yes exercise, No diet

Treatment A: Yes exercise, Yes diet

Treatment B: No exercise, No diet



Treatment B: No exercise, Yes diet

Treatment B: Yes exercise, No diet

Treatment B: Yes exercise, Yes diet

Treatment C: No exercise, No diet

Treatment C: No exercise, Yes diet



Treatment C: Yes exercise, No diet

Treatment C: Yes exercise, Yes diet

APPENDIX B: 10 tables sorted by age (young, older), process (counting, rhyming, adjective, imagery,
intentional)

Older, Adjective

Older, Counting

Older, Imagery **significant, non-normal distribution

Older, Intentional

Older, Rhyming

Younger, Adjective

Younger, Counting

Younger, Imagery

Younger, Intentional

Younger, Rhyming
