
BASIC INFERENTIAL DATA ANALYSIS

PRAGATI GOEL

Here we are going to analyse the ToothGrowth data from the R datasets package.

Load the data and perform some basic exploratory data analyses.

RCODE:
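library(datasets)
# First rows of the data
head(ToothGrowth)
# Scatter plot of tooth length against dose
plot(ToothGrowth$dose,ToothGrowth$len)
# Number of observations per supplement/dose combination
table(ToothGrowth$supp,ToothGrowth$dose)
# Mean and standard deviation of dose
dmean=mean(ToothGrowth$dose)
dsd=sd(ToothGrowth$dose)
dmean
dsd
# Mean and standard deviation of tooth length
lmean=mean(ToothGrowth$len)
lsd=sd(ToothGrowth$len)
lmean
lsd
# Boxplots, two panels per row
par(mfrow=c(1,2))
boxplot(len~dose,data=ToothGrowth,xlab="dose",ylab="length")
boxplot(len~supp,data=ToothGrowth,xlab="supp",ylab="length")
boxplot(dose~supp,data=ToothGrowth,xlab="supp",ylab="dose")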

OUTPUT:
> library(datasets)
> head(ToothGrowth)
   len supp dose
1  4.2   VC  0.5
2 11.5   VC  0.5
3  7.3   VC  0.5
4  5.8   VC  0.5
5  6.4   VC  0.5
6 10.0   VC  0.5
> plot(ToothGrowth$dose,ToothGrowth$len)
> table(ToothGrowth$supp,ToothGrowth$dose)
     0.5  1  2
  OJ  10 10 10
  VC  10 10 10

> dmean=mean(ToothGrowth$dose)
> dsd=sd(ToothGrowth$dose)
> dmean
[1] 1.166667
> dsd
[1] 0.6288722
> lmean=mean(ToothGrowth$len)
> lsd=sd(ToothGrowth$len)
> lmean
[1] 18.81333
> lsd
[1] 7.649315
> par(mfrow=c(1,2))
> boxplot(len~dose,data=ToothGrowth,xlab="dose",ylab="length")
> boxplot(len~supp,data=ToothGrowth,xlab="supp",ylab="length")
> boxplot(dose~supp,data=ToothGrowth,xlab="supp",ylab="dose")

*THE FOLLOWING TWO PLOTS WILL BE USED LATER

Since every cell of the table above holds the same count (10), the design qualifies as a “balanced design”. The code above
reports the mean and standard deviation of dose and len and shows the relationships among the three variables visually.
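
As a minimal sketch of the balance check described above, the cell counts can be tested programmatically and the mean of len tabulated for each cell. The names counts, is_balanced and cell_means are illustrative and not part of the original analysis.

```r
library(datasets)

# Cell counts for each supplement/dose combination
counts <- table(ToothGrowth$supp, ToothGrowth$dose)

# A design is balanced when every cell holds the same number of observations
is_balanced <- length(unique(as.vector(counts))) == 1
is_balanced

# Mean tooth length within each supplement/dose cell
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
cell_means
```

Tabulating the cell means alongside the counts gives a numeric counterpart to the boxplots.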
Summary of the data
R code: summary(ToothGrowth)
Output: > summary(ToothGrowth)
      len        supp         dose
 Min.   : 4.20   OJ:30   Min.   :0.500
 1st Qu.:13.07   VC:30   1st Qu.:0.500
 Median :19.25           Median :1.000
 Mean   :18.81           Mean   :1.167
 3rd Qu.:25.27           3rd Qu.:2.000
 Max.   :33.90           Max.   :2.000

Tests to compare tooth growth by supp and dose.


For dose, we take the conventional two-sided 5% alpha test with n=60, null hypothesis mu=dmean and
alternative hypothesis mu=1.5, and compute the probability of rejecting the null hypothesis when the
alternative is true (the power of the test).
R command:
z<-qnorm(0.975)
n=60
pnorm(dmean+z*dsd/sqrt(n),mean=1.5,sd=dsd/sqrt(n),lower.tail=FALSE)
ans: 0.9840548
Therefore, if the true mean were 1.5, the test would reject the null hypothesis with a probability of about
98.4%. Because the null value equals the sample mean, the observed test statistic is zero, and hence we fail to
reject the null hypothesis.
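
The calculation above can be wrapped in a small helper to make each ingredient explicit. This is only a sketch: the function name upper_tail_power and its arguments are hypothetical, and it uses the same normal approximation with the upper rejection region only, rather than an exact power computation.

```r
library(datasets)

# Hypothetical helper: probability that the sample mean exceeds the upper
# critical value of a two-sided level-alpha z test when the true mean is mu_a.
upper_tail_power <- function(mu0, mu_a, sigma, n, alpha = 0.05) {
  se <- sigma / sqrt(n)                     # standard error of the mean
  crit <- mu0 + qnorm(1 - alpha / 2) * se   # upper critical value under the null
  pnorm(crit, mean = mu_a, sd = se, lower.tail = FALSE)
}

dmean <- mean(ToothGrowth$dose)
dsd <- sd(ToothGrowth$dose)

# Same inputs as the calculation in the text
pw <- upper_tail_power(mu0 = dmean, mu_a = 1.5, sigma = dsd, n = 60)
pw
```

Keeping the standard error in one variable ensures the same spread is used for both the critical value and the alternative distribution.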
The differences in tooth growth by supp and dose are clearly visible in the plots above: len vs supp and len vs
dose.
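
One way to back the visual comparison with a formal test is a two-sample t test. The sketch below uses Welch's t test through base R's t.test; this is an assumption about a reasonable follow-up, not the test used in the analysis above, and the object names are illustrative.

```r
library(datasets)

# Compare mean tooth length between the two supplement types (OJ vs VC)
supp_test <- t.test(len ~ supp, data = ToothGrowth)
supp_test$conf.int
supp_test$p.value

# Compare the lowest and highest dose; a t test needs exactly two groups
low_high <- subset(ToothGrowth, dose %in% c(0.5, 2))
dose_test <- t.test(len ~ dose, data = low_high)
dose_test$conf.int
dose_test$p.value
```

A confidence interval that excludes zero, or a p value below 0.05, would indicate a difference in mean length at the 5% level.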

THE CONCLUSIONS AND ASSUMPTIONS:

Conclusion:
We have carried out the analysis through a series of hypothesis tests, together with plots, histograms
and a comparison of means, and illustrated the value of accounting for the interactions between factor
variables. The plot of tooth growth by dose also shows that increases in dosage are associated
with increases in tooth growth.
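
The association between dose and tooth growth can also be checked numerically. The following is a rough sketch, assuming a simple linear fit is an acceptable summary; the names dose_means and fit are illustrative.

```r
library(datasets)

# Mean tooth length at each dose level
dose_means <- tapply(ToothGrowth$len, ToothGrowth$dose, mean)
dose_means

# Slope of a simple linear fit of length on dose;
# a positive slope means length rises with dose
fit <- lm(len ~ dose, data = ToothGrowth)
coef(fit)["dose"]
```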
ASSUMPTION:
Since we performed the normal test above, we assumed that the central limit theorem applies, so that the
sample mean is approximately normally distributed. We used the conventional 95% confidence level.
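
Under this normality assumption, a 95% confidence interval for the mean tooth length can be sketched as follows. The interval itself is not computed in the analysis above, and the name ci is illustrative.

```r
library(datasets)

n <- nrow(ToothGrowth)
lmean <- mean(ToothGrowth$len)
lsd <- sd(ToothGrowth$len)

# Normal-approximation 95% interval: mean +/- z * standard error
z <- qnorm(0.975)
ci <- lmean + c(-1, 1) * z * lsd / sqrt(n)
ci
```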
