Question 1.
(a). The scatter plot of the two variables, crate (violent crime rate) and educat (percentage of high school graduates), with crime rate treated as the dependent variable, is shown below. [from file crime.sav]

(b). After superimposing a straight line on the scatter plot using the linear fit method, the relationship looks roughly linear (R Sq Linear = .066), although it may curve up slightly or there may be an outlier. With the cubic fit method the curve follows the points more closely, because the R Sq value for the cubic fit (.242) is higher than that for the linear fit.
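(c) From the Pearson correlation we can say that there is a weak negative correlation between these variables (r = -.256, N = 51). It is not statistically significant at the 5% level, because the two-tailed significance value is .070, which is greater than .05. The coefficients of 1 in the table are only the correlation of each variable with itself.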
Correlations

                                           violent crime rate   pct hs graduates
violent crime rate   Pearson Correlation   1                    -.256
                     Sig. (2-tailed)                            .070
                     N                     51                   51
pct hs graduates     Pearson Correlation   -.256                1
                     Sig. (2-tailed)       .070
                     N                     51                   51
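
As a quick check on the correlation table, the t statistic implied by a Pearson correlation can be computed from r and N alone, using t = r * sqrt(N - 2) / sqrt(1 - r^2) with N - 2 degrees of freedom. The following is a minimal Python sketch; the function name is illustrative and is not part of the SPSS output.

```python
import math

def t_from_pearson_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Values read from the Correlations table: r = -.256, N = 51.
t_stat = t_from_pearson_r(-0.256, 51)
print(round(t_stat, 3))
```

The result is about -1.85, which agrees with the t value reported for the slope in the regression of part (d), as expected for a regression with a single predictor.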

(d) When the variable crate is regressed on the variable educat, the result of the regression is

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .256a   .066       .046                430.724
a. Predictors: (Constant), pct hs graduates

Coefficients(a)

                        Unstandardized Coefficients   Standardized Coefficients                     95% Confidence Interval for B
Model                   B           Std. Error        Beta                        t        Sig.     Lower Bound   Upper Bound
1  (Constant)           2152.347    832.477                                       2.585    .013     479.421       3825.273
   pct hs graduates     -20.197     10.893            -.256                       -1.854   .070     -42.087       1.693
a. Dependent Variable: violent crime rate
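
The coefficients give the fitted line: violent crime rate = 2152.347 - 20.197 x pct hs graduates. The short Python sketch below shows how the t value for the slope follows from the estimate and its standard error, and how the line is used for prediction. The input of 80% is a hypothetical value chosen for illustration, and the names are not from SPSS.

```python
# Coefficients read from the Coefficients table.
INTERCEPT = 2152.347
SLOPE = -20.197           # change in crime rate per 1-point rise in pct hs graduates
SLOPE_STD_ERROR = 10.893

def predict_crime_rate(pct_hs_graduates):
    """Predicted violent crime rate from the fitted straight line."""
    return INTERCEPT + SLOPE * pct_hs_graduates

# The t statistic for a coefficient is the estimate divided by its standard error.
slope_t = SLOPE / SLOPE_STD_ERROR

# Hypothetical case: 80% high school graduates.
predicted = predict_crime_rate(80)
print(round(slope_t, 3), round(predicted, 1))
```

Since the 95% confidence interval for the slope (-42.087 to 1.693) includes zero, the slope is not significantly different from zero at the 5% level, in line with Sig. = .070.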

(e) The standardized residuals are plotted against the predicted values in order to detect any outliers and to assess whether the relationship is linear and whether the residual variance is constant.

There is one standardized residual greater than 4, and the trend indicates that the relationship between crime rate and education is approximately linear. The scatter of the points tends to increase a little as the predicted value increases, which indicates that the assumption of constant variance may not be appropriate.
Question 2.
The data file for question 2 is country.sav, which contains demographic information on 122 countries.
(a). Explore the relationship between the variables using a scatter plot.
Dependent variable = lifeexpf
The resulting scatter plot matrix is shown below.

(b) The scatter plot matrix using the logarithms of the variables that do not have a linear relationship with the dependent variable is depicted below.
The log-transformed variables are lndocs, lnbeds, lngdp, and lnradio.

As the scatter plot matrix shows, the relationships are approximately linear after the transformation.
(c) Forward selection is used to find the subset of variables that best explains the dependent variable.
Dependent variable = lifeexpf

Coefficients(a)

                                           Unstandardized Coefficients   Standardized Coefficients
Model                                      B         Std. Error          Beta                        t        Sig.
1  (Constant)                              57.232    .688                                            83.233   .000
   Natural log of doctors per 10000        6.290     .318                .880                        19.792   .000
2  (Constant)                              42.138    3.206                                           13.143   .000
   Natural log of doctors per 10000        4.261     .513                .596                        8.307    .000
   Natural log of GDP                      2.493     .519                .345                        4.802    .000
3  (Constant)                              41.697    3.140                                           13.278   .000
   Natural log of doctors per 10000        4.123     .505                .577                        8.168    .000
   Natural log of GDP                      1.871     .566                .259                        3.306    .001
   Natural log of radios per 100 people    1.684     .679                .142                        2.482    .015
a. Dependent Variable: Female life expectancy 1992

The number of doctors, GDP, and the number of radios (all on the natural log scale) are each positively related to female life expectancy after controlling for the other variables.
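
Because the predictors are natural logs, each coefficient is the change in predicted life expectancy for a one-unit increase in the log of the predictor. A more readable quantity is the effect of doubling the raw predictor, which is the coefficient times ln(2). The Python sketch below works this out for Model 3; the dictionary keys are shorthand labels, and it is assumed that life expectancy is measured in years.

```python
import math

# Model 3 coefficients from the Coefficients table
# (dependent variable: female life expectancy 1992).
MODEL_3 = {
    "lndocs": 4.123,   # natural log of doctors per 10000
    "lngdp": 1.871,    # natural log of GDP
    "lnradio": 1.684,  # natural log of radios per 100 people
}

def effect_of_doubling(coefficient):
    """Change in predicted life expectancy when the raw predictor doubles,
    holding the other predictors fixed."""
    return coefficient * math.log(2)

doubling_effects = {name: effect_of_doubling(b) for name, b in MODEL_3.items()}
for name, change in doubling_effects.items():
    print(name, round(change, 2))
```

For example, doubling the number of doctors per 10000 is associated with an increase of roughly 2.9 in predicted female life expectancy, with GDP and radios held fixed.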
(d) Cook's distance plotted against the case sequence
Dependent variable = lifeexpf
As the plot of Cook's distance shows, the most influential countries are Chad, Afghanistan, and Guinea.
(e) The distribution of the standardized residuals is shown below.

Apart from some possible outliers, the standardized residuals appear to be approximately normally distributed.

Question 3.
(a) Independent-samples t-test
(b) Independent-samples t-test
(c) Paired-samples t-test
(d) Paired-samples t-test
(e) Independent-samples t-test
(f) Paired-samples t-test
(g) Independent-samples t-test
Question 4.
In the SPSS output, the Group Statistics box gives the mean and standard deviation for each of the groups, in this case for age. It also gives the number of people in each group (N). Always check these values first.

The first section of the Independent Samples Test output box gives the results of Levene's test for equality of variances. This tests whether the variance (variation) of ages for the two groups (populations) is the same. The outcome of this test determines which of the t-values that SPSS provides is the correct one to use.
In the output given in the question, the significance level for Levene's test is .82, which is larger than the cut-off of .05.
This means that the assumption of equal variances has not been violated; therefore, the t-value to report is the one in the first column of the output table, labelled Equal variances assumed.

In the output table, the value of Sig. (2-tailed) in the first column is .000, which is less than the required cut-off of .05, so there is a significant difference in the mean ages of the two populations.

The value of t in the Equal variances assumed column of the output table is 3.9, and N1 and N2 are both 100.
Substituting these values into eta squared = t^2 / (t^2 + (N1 + N2 - 2)) gives an eta squared of .0713.
The guidelines (proposed by Cohen, 1988) for interpreting this value are
.01=small effect
.06=moderate effect
.14=large effect
For this particular question, the effect size of .0713 lies between the moderate and large thresholds, so the effect is moderate.
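
The effect size calculation can be written out as a short Python sketch. It uses eta squared = t^2 / (t^2 + N1 + N2 - 2) for an independent-samples t-test and labels the result with the largest of Cohen's thresholds that it reaches. The function names and the "negligible" label for values below .01 are assumptions made for this illustration.

```python
def eta_squared_independent(t, n1, n2):
    """Eta squared for an independent-samples t-test."""
    return t ** 2 / (t ** 2 + n1 + n2 - 2)

def cohen_label(eta_sq):
    """Largest of Cohen's thresholds (.01, .06, .14) that the value reaches."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "moderate"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"

# t = 3.9 with 100 people in each group, as read from the output.
eta_sq = eta_squared_independent(3.9, 100, 100)
print(round(eta_sq, 4), cohen_label(eta_sq))
```

Because t enters only as t^2, the sign of t (3.9 or -3.9) does not affect the effect size.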

An independent-samples t-test was conducted to compare the average ages of people who buy and who do not buy a product. There is a significant difference in age between the two groups (M = 29.45, SD = 15.56 and M = 38, SD = 15.49; t(198) = -3.9, p < .001). The magnitude of the difference in the means was moderate (eta squared = .0713).

Question 14. The data file for this question is school.sav, and the aim is to check whether there are differences between the two groups of schools, those above and those below the median percentage of low income for all Chicago schools.
The group statistics for the percent low income are shown below.

Group Statistics

                             above or below median loinc           N    Mean     Std. Deviation   Std. Error Mean
Percent low income           above the median for low inc % 1993   32   73.219   8.7498           1.5468
                             below the median for low inc % 1993   32   39.706   13.5002          2.3865

Independent Samples Test

                                                        Levene's Test for       t-test for Equality of Means
                                                        Equality of Variances
                                                        F        Sig.           t         df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Percent low income        Equal variances assumed       5.793    .019           11.784    62       .000              33.5125           2.8439                  27.8276        39.1974
                          Equal variances not assumed                           11.784    53.138   .000              33.5125           2.8439                  27.8087        39.2163

From Levene's test for equality of variances we have a Sig. value of .019, which is below the cut-off value of .05. This means that the variances of the two groups (below and above the median) are not the same. Therefore the t value used is the one in the row labelled Equal variances not assumed. To find out whether there is a significant difference between the two groups, refer to the column labelled Sig. (2-tailed), which appears under the section labelled t-test for Equality of Means. In the Equal variances not assumed row this value is .000, which is below .05. Therefore there is a significant difference in the mean percent low income of the two groups.
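
The Equal variances not assumed row can be reproduced from the group statistics alone. The Python sketch below computes the unequal-variances (Welch) t statistic and the Welch-Satterthwaite degrees of freedom from the means, standard deviations, and group sizes. The function name is illustrative; SPSS is assumed to use the same standard formulas, which the matching values support.

```python
import math

def welch_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the unequal-variances t-test from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Group statistics for percent low income (above vs. below the median).
t_stat, df = welch_t_test(73.219, 8.7498, 32, 39.706, 13.5002, 32)
print(round(t_stat, 2), round(df, 1))
```

The result is close to the reported t = 11.784 with df = 53.138. The fractional degrees of freedom are what distinguish this row from the equal-variances row, where df = 62.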

Group Statistics

                             above or below median loinc           N    Mean     Std. Deviation   Std. Error Mean
average ACT score 1994       above the median for low inc % 1993   32   15.022   .8746            .1546
                             below the median for low inc % 1993   32   16.700   2.1506           .3802

Independent Samples Test

                                                        Levene's Test for       t-test for Equality of Means
                                                        Equality of Variances
                                                        F        Sig.           t         df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
average ACT score 1994    Equal variances assumed       16.501   .000           -4.089    62       .000              -1.6781           .4104                   -2.4985        -.8577
                          Equal variances not assumed                           -4.089    40.982   .000              -1.6781           .4104                   -2.5070        -.8493

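By a similar approach, Levene's test is significant here as well (Sig. = .000, below .05), so the Equal variances not assumed row is used. There is a significant difference between the mean ACT scores of the two groups, because the value in the Sig. (2-tailed) column is .000, which is below .05.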

Question 16. The data file for this question is buying.sav, and the aim is to test the following null hypotheses.
1. Ho: The family buying score is the same when pictures are shown and when they are not. The result is shown below with its interpretation.

Group Statistics

                             Picture Accompanied Question   N    Mean     Std. Deviation   Std. Error Mean
Family Buying Score          Pictures                       48   159.08   27.564           3.979
                             No Pictures                    50   168.00   21.787           3.081

Independent Samples Test

                                                        Levene's Test for       t-test for Equality of Means
                                                        Equality of Variances
                                                        F        Sig.           t         df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Family Buying Score       Equal variances assumed       1.382    .243           -1.780    96       .078              -8.917            5.008                   -18.858        1.025
                          Equal variances not assumed                           -1.772    89.429   .080              -8.917            5.032                   -18.915        1.081

From the Independent Samples Test, the significance value of Levene's test for equality of variances is .243, which is greater than the cut-off (.243 > .05). Levene's test tells us which t-test to use, the one that assumes equal variances or the one that does not. Since the Levene's test value is greater than .05, the t value is taken from the Equal variances assumed row. Once the t value is determined, the corresponding significance value can be read off. For this question the significance value (p) is .078. Since p is greater than .05 (.078 > .05), the null hypothesis is not rejected. Therefore there is no evidence that the family buying score differs between when the pictures are shown and when they are not.
An independent-samples t-test was conducted to compare the family buying scores with and without pictures. There was no significant difference in scores with pictures (M = 159.08, SD = 27.564) and without pictures (M = 168.00, SD = 21.787); t(96) = -1.78, p = .078. The magnitude of the difference in the means was small (eta squared = .03, below the .06 guideline for a moderate effect).
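
For comparison with the Equal variances assumed row, the pooled-variance t statistic can be recomputed from the group statistics. The Python sketch below pools the two sample variances, weighted by their degrees of freedom, and divides the mean difference by the resulting standard error. The function name is illustrative and is not part of the SPSS output.

```python
import math

def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, std_error) for the equal-variances independent-samples t-test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df, std_error

# Family buying score: pictures (n = 48) vs. no pictures (n = 50).
t_stat, df, std_error = pooled_t_test(159.08, 27.564, 48, 168.00, 21.787, 50)
print(round(t_stat, 2), df, round(std_error, 3))
```

This gives t of about -1.78 on 96 degrees of freedom with a standard error of about 5.008, matching the table up to rounding of the group means.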
2. Ho: The average buying score for the husband is the same with and without pictures. Ho is the
null hypothesis.
Group Statistics

                             Picture Accompanied Question   N    Mean     Std. Deviation   Std. Error Mean
Sum of husband's scores      Pictures                       48   80.12    14.258           2.058
                             No Pictures                    50   83.98    14.329           2.026

Independent Samples Test

                                                        Levene's Test for       t-test for Equality of Means
                                                        Equality of Variances
                                                        F        Sig.           t         df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Sum of husband's scores   Equal variances assumed       .036     .849           -1.335    96       .185              -3.855            2.889                   -9.589         1.879
                          Equal variances not assumed                           -1.335    95.874   .185              -3.855            2.888                   -9.588         1.878

Following the same approach as above, the significance value for this part, from the Sig. (2-tailed) column, is .185. Since it is greater than the cut-off (.05), the null hypothesis is not rejected: there is no evidence that the average buying score for husbands differs with and without pictures.
An independent-samples t-test was conducted to compare the husbands' buying scores with and without pictures. There was no significant difference in scores with pictures (M = 80.12, SD = 14.258) and without pictures (M = 83.98, SD = 14.329); t(96) = -1.335, p = .185. The magnitude of the difference in the means was small (eta squared = .018, below the .06 guideline for a moderate effect).
3. Ho: The average buying score for the wives is the same with and without pictures. Ho is the null hypothesis.
Group Statistics

                             Picture Accompanied Question   N    Mean     Std. Deviation   Std. Error Mean
Sum of wife's buying scores  Pictures                       49   78.98    16.033           2.290
                             No Pictures                    50   84.02    15.444           2.184

Independent Samples Test

                                                        Levene's Test for       t-test for Equality of Means
                                                        Equality of Variances
                                                        F        Sig.           t         df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Sum of wife's scores      Equal variances assumed       .025     .876           -1.593    97       .114              -5.040            3.164                   -11.319        1.239
                          Equal variances not assumed                           -1.593    96.677   .115              -5.040            3.165                   -11.322        1.241

Following the same approach as above, the significance value for this part, from the Sig. (2-tailed) column, is .114. Since it is greater than the cut-off (.05), the null hypothesis is not rejected: there is no evidence that the average buying score for wives differs with and without pictures.
An independent-samples t-test was conducted to compare the wives' buying scores with and without pictures. There was no significant difference in scores with pictures (M = 78.98, SD = 16.033) and without pictures (M = 84.02, SD = 15.444); t(97) = -1.593, p = .114. The magnitude of the difference in the means was small (eta squared = .025, below the .06 guideline for a moderate effect).
Question 18.
A manufacturer of high-performance automobiles produces disc brakes that must measure 322
millimeters in diameter. Quality control randomly draws 16 discs made by each of eight
production machines and measures their diameters.
The appropriate test to determine whether or not the mean diameter of the brakes in each sample differs significantly from 322 millimeters is the One-Sample T Test. The file for this question is brake.sav, and the confidence level is 90%.
The descriptive statistics table displays the sample size, mean, standard deviation, and standard error for each of the eight samples.
The sample means disperse around the 322 mm standard by what appears to be a small amount of variation.

One-Sample Statistics

Machine Number                   N    Mean       Std. Deviation   Std. Error Mean
1   Disc Brake Diameter (mm)     16   321.9985   .0111568         .0027892
2   Disc Brake Diameter (mm)     16   322.0143   .0106913         .0026728
3   Disc Brake Diameter (mm)     16   321.9983   .0104812         .0026203
4   Disc Brake Diameter (mm)     16   321.9954   .0069883         .0017471
5   Disc Brake Diameter (mm)     16   322.0042   .0092022         .0023005
6   Disc Brake Diameter (mm)     16   322.0025   .0086440         .0021610
7   Disc Brake Diameter (mm)     16   322.0062   .0093303         .0023326
8   Disc Brake Diameter (mm)     16   321.9967   .0077085         .0019271
Since the 90% confidence intervals for their mean differences from 322 mm lie entirely above 0.0, it is possible to say that machines 2, 5 and 7 are producing discs that are significantly wider than 322 mm on average. Similarly, because its confidence interval lies entirely below 0.0, machine 4 is producing discs that are not wide enough.
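
The confidence intervals behind this conclusion can be rebuilt from the One-Sample Statistics table. The Python sketch below forms, for each machine, the 90% interval for the difference between the sample mean and 322 mm as difference +/- t_crit x standard error. The critical value 1.753 (Student's t, 15 degrees of freedom, two-sided 90%) is taken from standard t tables and is an assumption of this sketch, not part of the SPSS output; the names are illustrative.

```python
TARGET_MM = 322.0
# Two-sided 90% critical value of Student's t with 15 degrees of freedom
# (from standard t tables; an assumption of this sketch).
T_CRIT_90_DF15 = 1.753

# machine number -> (mean, standard error of the mean), from One-Sample Statistics.
MACHINES = {
    1: (321.9985, 0.0027892),
    2: (322.0143, 0.0026728),
    3: (321.9983, 0.0026203),
    4: (321.9954, 0.0017471),
    5: (322.0042, 0.0023005),
    6: (322.0025, 0.0021610),
    7: (322.0062, 0.0023326),
    8: (321.9967, 0.0019271),
}

def mean_difference_ci(mean, std_error, target=TARGET_MM, t_crit=T_CRIT_90_DF15):
    """90% confidence interval for (mean - target)."""
    diff = mean - target
    margin = t_crit * std_error
    return diff - margin, diff + margin

too_wide, too_narrow = [], []
for machine, (mean, se) in MACHINES.items():
    lower, upper = mean_difference_ci(mean, se)
    if lower > 0:
        too_wide.append(machine)      # interval entirely above 0
    elif upper < 0:
        too_narrow.append(machine)    # interval entirely below 0

print(too_wide, too_narrow)
```

Machine 8 is a borderline case: its interval only just includes zero, so the conclusion for that machine is sensitive to rounding in the reported mean.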

Question 19.
A physician is evaluating a new diet for her patients with a family history of heart disease. To
test the effectiveness of this diet, 16 patients are placed on the diet for 6 months. Their weights
and triglyceride levels are measured before and after the study, and the physician wants to know
if either set of measurements has changed. The data are found in dietstudy.sav of SPSS sample
files. Use appropriate test to determine whether there is a statistically significant difference
between the pre- and post-diet weights and triglyceride levels of these patients
Paired Samples Statistics

                              Mean     N    Std. Deviation   Std. Error Mean
Pair 1   Weight               198.38   16   33.472           8.368
         Final weight         190.31   16   33.508           8.377
Pair 2   Triglyceride         138.44   16   29.040           7.260
         Final triglyceride   124.38   16   29.412           7.353

As the study was made to find out whether there is a statistically significant difference between the pre- and post-diet weights and triglyceride levels of these patients, a paired-samples t-test was the appropriate test.
1. There is a statistically significant decrease in weight from pre-diet (M = 198.38) to post-diet (M = 190.31), t(15) = 11.175. Since the probability value reported as .000, that is p < .0005 (two-tailed), is substantially smaller than our specified alpha value of .05, there is a significant difference in the weight of the patients between the pre- and post-diet measurements. The mean decrease in weight is 8.062 with a 95% confidence interval ranging from 6.525 to 9.600. The t value is used to calculate the effect size statistic (eta squared).
Eta squared = t^2 / (t^2 + (N - 1))

Paired Samples Correlations

                                             N    Correlation   Sig.
Pair 1   Weight & Final weight               16   .996          .000
Pair 2   Triglyceride & Final triglyceride   16   -.286         .283
Eta squared = 11.175^2 / (11.175^2 + (16 - 1)) = 0.893

According to Cohen's (1988, pp. 284-7) guidelines, eta squared values are interpreted as .01 = small effect, .06 = moderate effect, .14 = large effect.
Since the eta squared value obtained, 0.893, is greater than 0.14, there is a large effect, with a significant difference in the weight of the patients between the pre- and post-diet measurements.

2. There is no statistically significant decrease in triglyceride from pre-diet (M = 138.44) to post-diet (M = 124.38), t(15) = 1.200. Since the probability value p = .249 (two-tailed) is larger than our specified alpha value of .05, the difference in triglyceride of the patients between the pre- and post-diet measurements is not significant. The mean decrease in triglyceride is 14.062 with a 95% confidence interval ranging from -10.915 to 39.040, which includes zero.

The effect size statistic (eta squared):
Eta squared = 1.200^2 / (1.200^2 + (16 - 1)) = 0.0876

Since the eta squared value obtained, 0.0876, is between 0.06 and 0.14, the effect size is moderate, but the difference in triglyceride of the patients between the pre- and post-diet measurements is not statistically significant.
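
The reported t values and effect sizes can be cross-checked against the confidence intervals quoted above. The Python sketch below recovers the standard error of the mean difference from the width of each 95% interval, rebuilds t, and then applies eta squared = t^2 / (t^2 + (N - 1)). The critical value 2.131 (Student's t, 15 degrees of freedom, two-sided 95%) comes from standard t tables and is an assumption of this sketch; the function names are illustrative.

```python
# Two-sided 95% critical value of Student's t with 15 degrees of freedom
# (from standard t tables; an assumption of this sketch).
T_CRIT_95_DF15 = 2.131

def t_from_ci(mean_diff, lower, upper, t_crit=T_CRIT_95_DF15):
    """Recover the paired t statistic from the mean difference and its CI."""
    std_error = (upper - lower) / (2 * t_crit)
    return mean_diff / std_error

def eta_squared_paired(t, n):
    """Eta squared for a paired-samples t-test with n pairs."""
    return t ** 2 / (t ** 2 + (n - 1))

weight_t = t_from_ci(8.062, 6.525, 9.600)
trig_t = t_from_ci(14.062, -10.915, 39.040)

weight_eta = eta_squared_paired(weight_t, 16)
trig_eta = eta_squared_paired(trig_t, 16)
print(round(weight_t, 2), round(trig_t, 2))
print(round(weight_eta, 2), round(trig_eta, 2))
```

The recovered values agree with t(15) = 11.175 and t(15) = 1.200 to within rounding. The triglyceride interval includes zero, which is consistent with p = .249 being larger than .05, whereas the weight interval lies entirely above zero.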