
Yogesh .S. Mangela

LOGBOOK
CONTENT PAGE PAGE NO

Practical One: Binomial & Hypergeometric Distributions 3

1. Describe the shape of the Binomial distribution when p = 0.2. 3


2. What is the median value of X and the modal value when p = 0.2. 3
3. How would you calculate the mean of the distribution? 3
4. Calculate the mean and compare with the median and the mode
and discuss the skewness. 3
5. Repeat for the cases when p = 0.4, 0.6, 0.8 and comment 4
6. Give examples of situations where the Binomial distribution
might arise in a Forensic investigation. 4
7. Describe the shape of the Hypergeometric
distribution when n = 20, m = 12 , r = 6 5
8. What are the median value of X and the modal value. 5
9. Calculate the mean and compare with the median and
the mode and discuss the skewness 5
10. Give examples of situations where the Hypergeometric
distribution might arise in a Forensic investigation. 5

Practical Two: The Poisson Distribution 6


1. What are the relationships between the following
(i) The standard deviation and the variance of any distribution? 6
(ii) The mean and variance of the Poisson distribution? 6
(iii) Estimate the mean of the number of murders recorded in a police area per month. 6

2. Describe the shape of the distribution of the number of murders per month. What is the mean? What is the median? 6
Does the number of murders look to be normally distributed?
In what ways does the number of murders not satisfy the conditions necessary for a normal distribution?

3. Compare the expected frequencies with those observed. (too good to be true!!!). 7

4. Now describe the shape of the distribution of murders in December and discuss whether a Poisson distribution fits these data. 7

5. Now describe the shape of the distribution of sexual offences and discuss whether a Poisson distribution fits these data. 7

6. Describe, in words, the situations where a random variable with a Poisson distribution might be appropriate. What assumptions are necessary and are they likely to hold true for the number of murders? 7

7. Discuss whether the following might have the characteristics of a Poisson distribution…

Page.no: 1 out of 57
Yogesh .S. Mangela

(i) Number of cot deaths per month per hospital area. 7


(ii) Number of cases of childhood cancers per area per year. 7

Practical Three: Categorical Data and Contingency Tables

Data Set One 8

1. Estimate probability that a randomly selected offender is a member of a gang.


2. Estimate probability that a randomly selected offender has carried a weapon
3. Estimate probability that a randomly selected offender has carried
a weapon given he has never joined a gang and has no close friends in a gang.
4. Estimate probability that a randomly selected offender has carried
a weapon given he has never joined a gang.
5. Estimate probability that a randomly selected offender has carried
a weapon given he is a gang member.
6. Comment on the relationship between gang membership and 9
weapon carrying using the bar chart.
7. Use the Chi Square Output to test association between gang and weapon behaviour. 9

Data Set 2:
1. Estimate the probability that a randomly selected crime involved
a white attacker and a white victim
2. Estimate the probability that a randomly selected crime involved
a white attacker.
3. Given that both attacker and victim were white, estimate
the probability that a reported crime will involve a fatality.
4. If no injury was reported, what is the estimated probability that
a crime involves a non-white attacker and a white victim?

If no injury was reported, what is the estimated probability that a crime involves a white attacker
and a white victim?

5. Comment on the relationship between injury level and type of attacker/victim 11

Practical Four: The Normal Distribution 12

Practical 5: Regression 22

Practical 6: Analysis of Variance 38

Practical 7: Survival analysis: Kaplan-Meier 42


Bitrial 42
Kedney 46
HIV_azt 49
Practical 8: Survival analysis: Cox proportional hazards regression model 51

Practical 8: continued 53


Practical 1: Date: 08-02-2007

1. Shape of the binomial distribution when p = 0.2

[Figure: bar chart of Count against x (0 to 15), cases weighted by pdf_.2]

1. The binomial distribution when p = 0.2 shows the count increasing from x = 0 to x = 5. It reaches its highest peak at x = 5 and is steady from x = 6 to x = 15, where count = 1.0.

2. With p = 0.2 and n = 15, the median value of X is 8 and the modal value is 5.

Median position = (n + 1)/2 = (15 + 1)/2 = 16/2 = 8th position, where count = 1.0.

3. To calculate the mean of a distribution, read off the count at each value of x: for instance, when x = 0 the count is 0.05, when x = 1 the count is 0.3, and so on up to x = 15, where the count is 1.0.

The mean is then the sum of all the counts divided by n: mean = (C1 + C2 + … + C(n-1) + Cn)/n.

4. Mean = (0.05 + 0.3 + 0.6 + 0.9 + 1.0 + 1.01 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1)/15 = 13.860/15 = 0.924

The bar graph above gives mean = 0.924, mode = 1.0 and median = 1.0, which suggests that for p = 0.2 and n = 15 the three measures are close to each other.
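
For comparison with values read off a chart, the mean, median and mode of X can also be computed directly from the binomial probabilities, using mean = Σ x·P(X = x), which equals np. The following is a minimal sketch, assuming n = 15 as on the chart axis; the function names are hypothetical and are not part of the practical.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

def summarise(n, p):
    """Return (mean, median, mode) of X ~ Binomial(n, p)."""
    pmf = binomial_pmf(n, p)
    mean = sum(x * px for x, px in enumerate(pmf))   # equals n * p
    mode = max(range(n + 1), key=lambda x: pmf[x])    # most probable value of x
    cumulative, median = 0.0, None
    for x, px in enumerate(pmf):                      # smallest x with P(X <= x) >= 0.5
        cumulative += px
        if cumulative >= 0.5:
            median = x
            break
    return mean, median, mode

if __name__ == "__main__":
    for p in (0.2, 0.4, 0.6, 0.8):
        mean, median, mode = summarise(15, p)
        print(f"p = {p}: mean = {mean:.3f}, median = {median}, mode = {mode}")
```

Comparing the mean with the median and the mode for each p gives a quick indication of the direction of skew.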

p = 0.4

[Figure: bar chart of Count against x (0 to 15), cases weighted by pdf_0.4]

5. The bar graph for p = 0.4 indicates that the mean is 10.351/15 = 0.690.

Median position = (n + 1)/2 = (15 + 1)/2 = 16/2 = 8th position, where count = 1.0.

Mode = 1.0

The chart for pdf_0.4 shows a sudden rise from x = 1 to x = 9; from x = 9 onwards it becomes steady. It is positively skewed.
p = 0.6

[Figure: bar chart of Count against x (1 to 15), cases weighted by pdf_0.6]

Mean = 1.961/15 = 0.131; Median = 0.350; Mode = 0.1

The bar chart for p = 0.6 suggests that the binomial distribution is symmetric.

6. Suppose 20 people out of 50 are selected on accusation of a murder, and the probability that an accused person is the murderer is 0.25: what is the probability that the accused are murderers?

DNA samples are obtained to investigate whether or not each person smokes, with n = 50 and p = 0.75.

p = 0.8

[Figure: bar chart of Count against x (1 to 15), cases weighted by PDF_0.8]

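Mean = 4.911/15 = 0.327; Median = 0.005; Mode = 14

The distribution for p = 0.8 is negatively skewed: most of the probability lies at the high values of x, with the tail towards the low values.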

7. Hypergeometric distribution

[Figure: bar chart of Count against x (1 to 6), cases weighted by Pdf_2]

The distribution is skewed; from x = 4 to x = 6 it becomes steady.

8. Median position = (6 + 1)/2 = 3.5, so the median is the average of the values in the 3rd and 4th positions: (0.33 + 0.35)/2 = 0.680/2 = 0.340. The mode is 1.00.

9. Mean = 1.333/6 = 0.222 is the mean value of X.

10. 100 people are arrested for robbery, of whom only 70 are the real thieves.
25 people were accused and DNA samples were obtained to check whether each had really taken drugs; 10 were caught red-handed.
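
The hypergeometric probabilities themselves can be tabulated in a few lines, which gives a check on the mean, median and mode. This is only a sketch: it assumes that n = 20 is the population size, m = 12 the number of items of the type of interest, and r = 6 the sample size drawn without replacement. That reading of n, m and r is an assumption, as the practical does not spell it out here, and the function name is hypothetical.

```python
from math import comb

def hypergeom_pmf(n, m, r):
    """P(X = x) for x = 0..r when r items are drawn without replacement from
    n items of which m are of the type of interest (assumed reading of n, m, r)."""
    total = comb(n, r)
    return [comb(m, x) * comb(n - m, r - x) / total for x in range(r + 1)]

pmf = hypergeom_pmf(20, 12, 6)
mean = sum(x * px for x, px in enumerate(pmf))        # equals r * m / n
mode = max(range(len(pmf)), key=lambda x: pmf[x])     # most probable value of x

cumulative, median = 0.0, None
for x, px in enumerate(pmf):                          # smallest x with P(X <= x) >= 0.5
    cumulative += px
    if cumulative >= 0.5:
        median = x
        break

print(f"mean = {mean:.3f}, median = {median}, mode = {mode}")
```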

8. Different values of n, m, r, etc.

[Figure: bar chart of Count against x (1 to 6), cases weighted by Pdf]

[Figure: bar chart of Count against x (0 to 6), cases weighted by Pdf_3]
PRACTICAL TWO WEEK1

1. Relationships

i) The standard deviation and the variance

The standard deviation is the square root of the variance, so variance = σ × σ = σ².

ii) The mean and variance of the Poisson distribution

Variance = E(X²) − {E(X)}². For a Poisson distribution the variance is equal to the mean.

iii) The mean of the number of murders recorded in a police area per month is 1.21.

Move number of police areas with X murders (murder1)


Descriptive Statistics

N Minimum Maximum Mean Std. Deviation Variance


Number X 370 0 6 1.21 1.105 1.221
Valid N (listwise) 370
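
The expected frequencies under a Poisson model follow from the table: each is N × P(X = k), with the Poisson rate set to the sample mean. A minimal sketch, assuming N = 370 and mean = 1.21 as printed in the table; the observed frequencies are not reproduced in this logbook, so only the expected side is computed, and the names used are illustrative.

```python
from math import exp, factorial

N_AREAS = 370   # N from the Descriptive Statistics table
MEAN = 1.21     # sample mean, used as the Poisson rate (lambda)
MAX_X = 6       # largest count in the table

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Expected number of police areas recording exactly k murders.
expected = [N_AREAS * poisson_pmf(k, MEAN) for k in range(MAX_X + 1)]

for k, e in enumerate(expected):
    print(f"X = {k}: expected frequency = {e:.1f}")
```

These values can be set beside the observed counts from the data file to judge the fit.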

2) The mean of the number of murders per month is approximately 677/6 = 112.83, and the median position is (6 + 1)/2 = 3.5, so the median is the average of the values in the 3rd and 4th positions: (30 + 10)/2 = 20.

The numbers are not normally distributed. The bar graph suggests that the distribution is skewed, whereas a normal distribution is a bell-shaped curve; to be normally distributed, the data would have to be divided equally on both sides.

[Figure: bar chart of Count against Number X (0 to 6), cases weighted by number of police areas with X murders]

murder2
Descriptive Statistics

N Minimum Maximum Mean Std. Deviation Variance


Number X 370 0 11 1.22 1.905 3.629
Valid N (listwise) 370

3) The variance for the expected frequencies was 3.629, whereas the variance for the observed frequencies was 1.221.

4) The distribution is skewed; however, the data do not seem to fit a Poisson distribution.

[Figure: bar chart of Count against Number X (0 to 11), cases weighted by number of police areas with X murders in December]

Descriptive Statistics

N Minimum Maximum Mean Std. Deviation Variance


Number X 369 0 13 4.79 2.194 4.814
Valid N (listwise) 369
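
Because a Poisson distribution has equal mean and variance, the ratio variance/mean from each Descriptive Statistics table gives a quick check on fit. The sketch below uses the means and variances printed in the three tables; the labels are descriptive names chosen for this example, and the ratio is a rough indicator rather than a formal test.

```python
# (mean, variance) pairs copied from the three Descriptive Statistics tables.
# The dictionary keys are labels chosen for this sketch only.
tables = {
    "first table (murder1, maximum 6)": (1.21, 1.221),
    "second table (murder2, maximum 11)": (1.22, 3.629),
    "third table (maximum 13)": (4.79, 4.814),
}

def dispersion_index(mean, variance):
    """Variance-to-mean ratio; equal to 1 for an exact Poisson distribution."""
    return variance / mean

for label, (mean, variance) in tables.items():
    ratio = dispersion_index(mean, variance)
    print(f"{label}: variance / mean = {ratio:.2f}")
```

A ratio close to 1 is consistent with a Poisson model, while a ratio well above 1 indicates more spread than a Poisson model allows.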

5) The bar graph suggests that the distribution of the number of reported sexual offences across police areas is symmetric, or approximately normal.

[Figure: bar chart of Count against Number X (0 to 13), cases weighted by number of police areas with X reported sexual offences]

6) A Poisson distribution can be used where crimes happen every week or every month over a given period of time, and could occur on any day. The necessary assumptions are that murders happen every month or week in a particular area; however, the number of murders and their timing are unexpected, so it is not right to make these assumptions.

7) Characteristics of a Poisson distribution.

i) The number of cot deaths per month per hospital area could be described by a Poisson distribution. Under a Poisson distribution, a number of occurrences X is counted over some given time; in this case, cot deaths are counted per month per hospital area.

ii) The number of cases of childhood cancers per area per year: yes, this will also have the characteristics of a Poisson distribution.

Practical 3 Week 2

1. Estimate probability that a randomly selected offender is a member of a gang.

(1230 + 1307 + 688)/7488 = 3225/7488 = 0.431

2. Estimate probability that a randomly selected offender has carried a weapon

1438/7488 = 0.192

3. Estimate probability that a randomly selected offender has carried a weapon given he has never
joined a gang and has no close friends in a gang.

255/2551 = 0.100

4. Estimate probability that a randomly selected offender has carried a weapon given he has never
joined a gang.

516/4263= 0.1210

5. Estimate probability that a randomly selected offender has carried a weapon given he is a gang
member.

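Of the 1438 offenders who carried a weapon, 516 had never joined a gang, which leaves 922 among the 3225 gang members:

922/3225 = 0.286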
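
The estimates above are all relative frequencies, count divided by the size of the group being conditioned on. A minimal sketch using the counts quoted in the answers; the variable names are illustrative, and the last step assumes that every offender is either a gang member or has never joined a gang (the two group sizes, 3225 and 4263, do add up to 7488).

```python
# Counts as quoted in the answers above.
total_offenders = 7488
gang_members = 1230 + 1307 + 688       # the three gang categories
weapon_carriers = 1438
never_joined = 4263
never_joined_weapon = 516
no_gang_friends = 2551
no_gang_friends_weapon = 255

def proportion(count, out_of):
    """Relative-frequency estimate of a probability."""
    return count / out_of

p_gang = proportion(gang_members, total_offenders)
p_weapon = proportion(weapon_carriers, total_offenders)
p_weapon_no_friends = proportion(no_gang_friends_weapon, no_gang_friends)
p_weapon_never_joined = proportion(never_joined_weapon, never_joined)

# Assumption: gang members and never-joined offenders partition the sample,
# so carriers who are gang members = all carriers - never-joined carriers.
gang_weapon = weapon_carriers - never_joined_weapon
p_weapon_gang = proportion(gang_weapon, gang_members)

for name, value in [
    ("P(gang member)", p_gang),
    ("P(weapon)", p_weapon),
    ("P(weapon | never joined, no gang friends)", p_weapon_no_friends),
    ("P(weapon | never joined)", p_weapon_never_joined),
    ("P(weapon | gang member)", p_weapon_gang),
]:
    print(f"{name} = {value:.3f}")
```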


[DataSet1] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-3\Gang.sav GANG


Case Processing Summary

Cases
Valid Missing Total
N Percent N Percent N Percent
WeaponBehaviour *
GangType 42 100.0% 0 .0% 42 100.0%

6) Relationship between gang and weapon:

The bar graph shows that, among active gang members, the number who had not carried a weapon is equivalent to the number who had carried one. Among those who had never joined a gang and had no friends in one, the number who had not carried a weapon is likewise similar to the number who had.

Chi-Square Tests

                              Value     df  Asymp. Sig. (2-sided)
Pearson Chi-Square            .000(a)   5   1.000
Likelihood Ratio              .000      5   1.000
Linear-by-Linear Association  .000      1   1.000
N of Valid Cases              42

a 8 cells (66.7%) have expected count less than 5. The minimum expected count is 1.00.

7) Association between gang and weapon behaviour

H0: There is no association between gang and weapon behaviour.
H1: There is an association between gang and weapon behaviour.
The p-value is 1.000, which is greater than 0.05, so we fail to reject H0 and conclude that there is no significant evidence of an association between gang and weapon behaviour.

Dataset TWO
Notes

[DataSet2] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-3\Injury.sav

Case Processing Summary

Cases
Valid Missing Total
N Percent N Percent N Percent
Attacker_Victim *
Degree_of_Injury 40 100.0% 0 .0% 40 100.0%


Attacker_Victim * Degree_of_Injury Crosstabulation

                                                     Degree_of_Injury
Attacker_Victim                                      Fatal   Serious  Slight  None    Total
White/White              Count                       1       2        3       4       10
                         Expected Count              1.0     2.0      3.0     4.0     10.0
                         % within Degree_of_Injury   25.0%   25.0%    25.0%   25.0%   25.0%
                         % of Total                  2.5%    5.0%     7.5%    10.0%   25.0%
White/None White         Count                       1       2        3       4       10
                         Expected Count              1.0     2.0      3.0     4.0     10.0
                         % within Degree_of_Injury   25.0%   25.0%    25.0%   25.0%   25.0%
                         % of Total                  2.5%    5.0%     7.5%    10.0%   25.0%
None White / None White  Count                       1       2        3       4       10
                         Expected Count              1.0     2.0      3.0     4.0     10.0
                         % within Degree_of_Injury   25.0%   25.0%    25.0%   25.0%   25.0%
                         % of Total                  2.5%    5.0%     7.5%    10.0%   25.0%
Non White / White        Count                       1       2        3       4       10
                         Expected Count              1.0     2.0      3.0     4.0     10.0
                         % within Degree_of_Injury   25.0%   25.0%    25.0%   25.0%   25.0%
                         % of Total                  2.5%    5.0%     7.5%    10.0%   25.0%
Total                    Count                       4       8        12      16      40
                         Expected Count              4.0     8.0      12.0    16.0    40.0
                         % within Degree_of_Injury   100.0%  100.0%   100.0%  100.0%  100.0%
                         % of Total                  10.0%   20.0%    30.0%   40.0%   100.0%

Chi-Square Tests

                              Value     df  Asymp. Sig. (2-sided)
Pearson Chi-Square            .000(a)   9   1.000
Likelihood Ratio              .000      9   1.000
Linear-by-Linear Association  .000      1   1.000
N of Valid Cases              40

a 16 cells (100.0%) have expected count less than 5. The minimum expected count is 1.00.
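
The Pearson chi-square value in the output can be reproduced by hand from the crosstabulation: each expected count is (row total × column total)/grand total, and the statistic sums (observed − expected)²/expected over all cells. A minimal sketch using the counts shown in the crosstabulation above; the function name is hypothetical.

```python
# Observed counts from the Attacker_Victim * Degree_of_Injury crosstabulation
# (rows: attacker/victim groups; columns: Fatal, Serious, Slight, None).
observed = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

statistic, dof, expected = pearson_chi_square(observed)
print(f"chi-square = {statistic:.3f}, df = {dof}")
print("expected counts, first row:", expected[0])
```

Every row of this table has the same counts, so the observed and expected counts coincide and the statistic is zero, in agreement with the Value column of the output.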

1. Estimate the probability that a randomly selected crime involved a white attacker and a white victim
6521/11717 = 0.557

2. Estimate the probability that a randomly selected crime involved a white attacker.
716/1171 = 0.611

3. Given that both attacker and victim were white, estimate the probability that a reported crime will involve a fatality.
183/237 = 0.7722

4. If no injury was reported, what is the estimated probability that a crime involves a non-white
attacker and a white victim?
1801/3616= 0.4981

5. If no injury was reported, what is the estimated probability that a crime involves a white
attacker and a white victim?
1422/6521 = 0.2181

Bar Chart

[Figure: clustered bar chart of Count (0 to 4) by Attacker_Victim (White/White, White/None White, None White / None White, Non White / White), with bars for each Degree_of_Injury (Fatal, Serious, Slight, None)]

5) The bar graph indicates that the pattern of injury types is the same in every attacker/victim group.
For instance, the counts for white attackers with white victims equal those for white attackers with non-white victims.
Practical -4 Week 2

Explore
[DataSet1] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-4\measurements.sav TASK 4

Case Processing Summary

Cases
Valid Missing Total
N Percent N Percent N Percent
shoe 83 100.0% 0 .0% 83 100.0%

Descriptives

                                               Statistic  Std. Error
shoe  Mean                                     6.596      .2178
      95% Confidence Interval for Mean
          Lower Bound                          6.163
          Upper Bound                          7.030
      5% Trimmed Mean                          6.516
      Median                                   6.000
      Variance                                 3.936
      Std. Deviation                           1.9839
      Minimum                                  2.0
      Maximum                                  12.0
      Range                                    10.0
      Interquartile Range                      3.0
      Skewness                                 .706       .264
      Kurtosis                                 .329       .523

Tests of Normality

Kolmogorov-Smirnov(a) Shapiro-Wilk
Statistic df Sig. Statistic df Sig.
shoe .209 83 .000 .929 83 .000

a Lilliefors Significance Correction

The tests of normality (Kolmogorov-Smirnov and Shapiro-Wilk, both with Sig. = .000) indicate that shoe size departs significantly from a normal distribution. The sample has mean 6.596, variance 3.936 and S.D. 1.9839.
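
Two quick checks can be made from the Descriptives table alone: the standard error of the mean is s/√n, and dividing skewness and kurtosis by their standard errors gives rough z-values. The sketch below uses the values printed for shoe; treating |z| > 1.96 as a sign of non-normality is a common rule of thumb assumed here, not something stated in the SPSS output.

```python
from math import sqrt

# Values copied from the Descriptives table for shoe.
n = 83
std_dev = 1.9839
skewness, se_skewness = 0.706, 0.264
kurtosis, se_kurtosis = 0.329, 0.523

# Standard error of the mean: s / sqrt(n).
se_mean = std_dev / sqrt(n)

# Ratio of each statistic to its standard error (rough z-values).
z_skewness = skewness / se_skewness
z_kurtosis = kurtosis / se_kurtosis

print(f"standard error of the mean = {se_mean:.4f}")
print(f"skewness / SE = {z_skewness:.2f}")
print(f"kurtosis / SE = {z_kurtosis:.2f}")
```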

[Figure: histogram of shoe (Frequency against shoe size, 2.0 to 12.0); Mean = 6.596, Std. Dev. = 1.9839, N = 83]

shoe Stem-and-Leaf Plot

Frequency Stem & Leaf

.00 0 .
2.00 0 . 23
26.00 0 . 44445555555555555555555555
32.00 0 . 66666666666666666666667777777777
16.00 0 . 8888888899999999
6.00 1 . 001111
1.00 1 . 2

Stem width: 10.0


Each leaf: 1 case(s)

[Figure: Normal Q-Q plot of shoe (Expected Normal against Observed Value)]

[Figure: Detrended Normal Q-Q plot of shoe (Dev from Normal against Observed Value)]

[Figure: box-and-whisker plot of shoe]

The box-and-whisker plot shows that shoe size has no outliers and a median of 6; the minimum is 2 and the maximum is 12.

[DataSet1] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-4\measurements.sav TASK 5

gender
Case Processing Summary

Cases
gender
Valid Missing Total
N Percent N Percent N Percent
height in cms male 23 100.0% 0 .0% 23 100.0%
female 60 100.0% 0 .0% 60 100.0%

Descriptives

height in cms                                   Statistic  Std. Error
male    Mean                                    178.696    1.3766
        95% Confidence Interval for Mean
            Lower Bound                         175.841
            Upper Bound                         181.551
        5% Trimmed Mean                         178.565
        Median                                  178.000
        Variance                                43.585
        Std. Deviation                          6.6019
        Minimum                                 170.0
        Maximum                                 190.0
        Range                                   20.0
        Interquartile Range                     11.0
        Skewness                                .223       .481
        Kurtosis                                -1.201     .935
female  Mean                                    165.667    .9429
        95% Confidence Interval for Mean
            Lower Bound                         163.780
            Upper Bound                         167.553
        5% Trimmed Mean                         165.630
        Median                                  166.000
        Variance                                53.345
        Std. Deviation                          7.3037
        Minimum                                 150.0
        Maximum                                 180.0
        Range                                   30.0
        Interquartile Range                     10.0
        Skewness                                .209       .309
        Kurtosis                                -.596      .608

Tests of Normality

Kolmogorov-Smirnov(a) Shapiro-Wilk
gender
Statistic df Sig. Statistic df Sig.
height in cms male .113 23 .200(*) .927 23 .094
female .114 60 .049 .969 60 .136
* This is a lower bound of the true significance.
a Lilliefors Significance Correction

[Figure: histogram of height in cms for gender = male; Mean = 178.696, Std. Dev. = 6.6019, N = 23]

[Figure: histogram of height in cms for gender = female; Mean = 165.667, Std. Dev. = 7.3037, N = 60]

Stem-and-Leaf Plots
height in cms Stem-and-Leaf Plot for
gender= male

Frequency Stem & Leaf

8.00 17 . 00002344
4.00 17 . 5778
6.00 18 . 001224
4.00 18 . 7888
1.00 19 . 0

Stem width: 10.0


Each leaf: 1 case(s)

height in cms Stem-and-Leaf Plot for


gender= female

Frequency Stem & Leaf

2.00 15 . 02
9.00 15 . 557778899
16.00 16 . 0000000001123334
13.00 16 . 5566667778889
11.00 17 . 00000001114
7.00 17 . 5777889
2.00 18 . 00

Stem width: 10.0


Each leaf: 1 case(s)

Normal Q-Q Plots

[Figure: Normal Q-Q plot of height in cms for gender = male (Expected Normal against Observed Value)]

[Figure: Normal Q-Q plot of height in cms for gender = female (Expected Normal against Observed Value)]

Detrended Normal Q-Q Plots

[Figure: Detrended Normal Q-Q plot of height in cms for gender = male (Dev from Normal against Observed Value)]

[Figure: Detrended Normal Q-Q plot of height in cms for gender = female (Dev from Normal against Observed Value)]

[Figure: box-and-whisker plots of height in cms by gender (male, female)]

The box-and-whisker plot shows that males are taller than females: males had a median height of 178 cm, whereas females had a median height of 166 cm.
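
The size of the height difference can be gauged from the summary statistics alone. The sketch below computes Welch's t statistic from the means, variances and group sizes in the Descriptives table; Welch's test is a technique brought in here for illustration and was not part of the SPSS output in this practical.

```python
from math import sqrt

# Summary statistics copied from the Descriptives table (height in cms).
male = {"n": 23, "mean": 178.696, "variance": 43.585}
female = {"n": 60, "mean": 165.667, "variance": 53.345}

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    va = a["variance"] / a["n"]      # squared standard error of mean a
    vb = b["variance"] / b["n"]      # squared standard error of mean b
    t = (a["mean"] - b["mean"]) / sqrt(va + vb)
    dof = (va + vb) ** 2 / (va**2 / (a["n"] - 1) + vb**2 / (b["n"] - 1))
    return t, dof

t_stat, dof = welch_t(male, female)
print(f"difference in means = {male['mean'] - female['mean']:.3f} cm")
print(f"Welch t = {t_stat:.2f} on about {dof:.0f} degrees of freedom")
```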

Practical 5 Week 3

[Figure: scatterplot matrix of AGE, HEIGHT, WEIGHT, BMP, FEV1, RV, FRC, TLC and PEMAX, with points marked by SEX (0, 1)]

[Figure: scatterplot of FEV1 (2.50 to 5.50) against HEIGHT (160.00 to 185.00)]

Practical 5 Week3 b)

Regression
[DataSet1] H:\2nd Year\MA2012N\Week3\week-3.sav

Variables Entered/Removed(b)

Variables Variables
Model Entered Removed Method
1 HEIGHT(a) . Enter
a All requested variables entered.
b Dependent Variable: FEV1

Model Summary(b)

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .574(a)  .330      .293               .58792                      .330             8.858     1    18   .00
a Predictors: (Constant), HEIGHT
b Dependent Variable: FEV1

ANOVA(b)

Sum of
Model Squares df Mean Square F Sig.
1 Regression 3.062 1 3.062 8.858 .008(a)
Residual 6.222 18 .346
Total 9.283 19
a Predictors: (Constant), HEIGHT
b Dependent Variable: FEV1
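
The R Square, Adjusted R Square and F values in the output all follow from the sums of squares in the ANOVA table. A minimal sketch of that arithmetic, using the figures as printed; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, df_regression = 3.062, 1
ss_residual, df_residual = 6.222, 18
ss_total = 9.283

# Proportion of the variation in FEV1 explained by HEIGHT.
r_square = ss_regression / ss_total

# F = regression mean square / residual mean square.
f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)

# Adjusted R Square penalises for the number of predictors.
n_cases = df_regression + df_residual + 1
adjusted_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual

print(f"R Square = {r_square:.3f}")
print(f"Adjusted R Square = {adjusted_r_square:.3f}")
print(f"F = {f_ratio:.3f}")
```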

Coefficients(a)

                Unstandardized Coefficients  Standardized Coefficients                95% Confidence Interval for B
Model           B        Std. Error          Beta                       t       Sig.  Lower Bound  Upper Bound
1  (Constant)   -8.986   4.315                                          -2.082  .052  -18.051      .080
   HEIGHT       .073     .025                .574                       2.976   .008  .022         .125
a Dependent Variable: FEV1
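
The coefficients define the fitted line FEV1 = -8.986 + 0.073 × HEIGHT. The sketch below evaluates it for a few heights; the heights are hypothetical values within the plotted range, and because the coefficients are rounded as printed, the results are approximate.

```python
# Coefficients as printed (rounded) in the Coefficients table.
INTERCEPT = -8.986
SLOPE_HEIGHT = 0.073

def predict_fev1(height_cm):
    """Fitted FEV1 for a given height in cm, using the rounded coefficients."""
    return INTERCEPT + SLOPE_HEIGHT * height_cm

for height in (165, 170, 180):      # hypothetical heights, for illustration only
    print(f"HEIGHT = {height} cm: predicted FEV1 = {predict_fev1(height):.3f}")
```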

Residuals Statistics(a)

Minimum Maximum Mean Std. Deviation N


Predicted Value 3.0562 4.5027 3.8510 .40143 20
Std. Predicted Value -1.980 1.623 .000 1.000 20
Standard Error of Predicted
Value .133 .298 .180 .049 20

Adjusted Predicted Value 2.8894 4.4847 3.8396 .41374 20


Residual -1.10413 1.41930 .00000 .57224 20
Std. Residual -1.878 2.414 .000 .973 20
Stud. Residual -1.945 2.488 .009 1.014 20
Deleted Residual -1.18437 1.50721 .01139 .62132 20
Stud. Deleted Residual -2.127 2.985 .019 1.105 20
Mahal. Distance .023 3.920 .950 1.115 20
Cook's Distance .000 .192 .043 .057 20
Centered Leverage Value .001 .206 .050 .059 20
a Dependent Variable: FEV1

Charts

[Figure: histogram of Regression Standardized Residual, dependent variable FEV1; Mean = 1.01E-15, Std. Dev. = 0.973, N = 20]

[Figure: Normal P-P plot of Regression Standardized Residual (Expected Cum Prob against Observed Cum Prob), dependent variable FEV1]

[Figure: scatterplot of Regression Studentized Deleted (Press) Residual against Regression Standardized Predicted Value, dependent variable FEV1]

Practical 5 Week 3 / 2a-b)

Graph
[DataSet2] H:\2nd Year\MA2012N\Week3\cystic.sav

[Figure: scatterplot matrix of AGE, HEIGHT, WEIGHT, BMP, FEV1, RV, FRC, TLC and PEMAX, with points marked by SEX (0, 1)]

Regression (Stepwise Data modelling)


Notes

[DataSet2] H:\2nd Year\MA2012N\Week3\cystic.sav

Variables Entered/Removed(a)

Model  Variables Entered  Variables Removed  Method
1      WEIGHT             .                  Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).

a Dependent Variable: PEMAX

Model Summary(b)

Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .635(a)  .404      .378               26.380                      .404             15.559    1    23   .00
a Predictors: (Constant), WEIGHT
b Dependent Variable: PEMAX

ANOVA(b)

Sum of
Model Squares df Mean Square F Sig.
1 Regression 10827.159 1 10827.159 15.559 .001(a)
Residual 16005.481 23 695.890
Total 26832.640 24
a Predictors: (Constant), WEIGHT
b Dependent Variable: PEMAX

Coefficients(a)

                Unstandardized Coefficients  Standardized Coefficients               95% Confidence Interval for B
Model           B        Std. Error          Beta                       t      Sig.  Lower Bound  Upper Bound
1  (Constant)   63.546   12.702                                         5.003  .000  37.270       89.821
   WEIGHT       1.187    .301                .635                       3.944  .001  .564         1.809
a Dependent Variable: PEMAX
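
Each t value in the Coefficients table is the coefficient divided by its standard error. A short sketch of that check, using the B and Std. Error values as printed; the helper name is hypothetical.

```python
# (B, Std. Error) pairs copied from the Coefficients table for PEMAX on WEIGHT.
coefficients = {
    "(Constant)": (63.546, 12.702),
    "WEIGHT": (1.187, 0.301),
}

def t_ratio(b, std_error):
    """t statistic for testing that a coefficient is zero."""
    return b / std_error

for name, (b, std_error) in coefficients.items():
    print(f"{name}: t = {t_ratio(b, std_error):.3f}")
```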

Excluded Variables(b)

Model  Beta In  t  Sig.  Partial Correlation  Collinearity Statistics: Tolerance
1 AGE .212(a) .549 .588 .116 .179
HEIGHT .094(a) .224 .825 .048 .152
SEX -.174(a) -1.063 .299 -.221 .964
BMP -.361(a) -1.729 .098 -.346 .548
FEV1 .211(a) 1.179 .251 .244 .799
RV .129(a) .620 .542 .131 .614
FRC -.041(a) -.194 .848 -.041 .619
TLC .102(a) .567 .576 .120 .825
a Predictors in the Model: (Constant), WEIGHT
b Dependent Variable: PEMAX

Residuals Statistics(a)

Minimum Maximum Mean Std. Deviation N


Predicted Value 78.85 151.12 109.12 21.240 25
Std. Predicted Value -1.425 1.978 .000 1.000 25
Standard Error of
Predicted Value 5.288 11.884 7.212 1.952 25
Adjusted Predicted Value 76.85 147.59 108.62 20.773 25
Residual -44.305 48.408 .000 25.824 25
Std. Residual -1.680 1.835 .000 .979 25
Stud. Residual -1.733 1.954 .009 1.021 25
Deleted Residual -47.198 57.007 .505 28.144 25
Stud. Deleted Residual -1.818 2.093 .011 1.053 25
Mahal. Distance .005 3.911 .960 1.090 25
Cook's Distance .000 .426 .046 .085 25
Centered Leverage Value .000 .163 .040 .045 25
a Dependent Variable: PEMAX

Charts

[Figure: histogram of Regression Standardized Residual, dependent variable PEMAX; Mean = 2.78E-17, Std. Dev. = 0.979, N = 25]

[Figure: Normal P-P plot of Regression Standardized Residual (Expected Cum Prob against Observed Cum Prob), dependent variable PEMAX]

[Figure: scatterplot of Regression Studentized Deleted (Press) Residual against Regression Standardized Predicted Value, dependent variable PEMAX]

Regression (Backward Data Modelling)


[DataSet2] H:\2nd Year\MA2012N\Week3\cystic.sav

Variables Entered/Removed(b)

Model  Variables Entered                                     Variables Removed  Method
1      WEIGHT, SEX, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE(a)  .                  Enter
2      .                                                     SEX                Backward (criterion: Probability of F-to-remove >= .100).
3      .                                                     TLC                Backward (criterion: Probability of F-to-remove >= .100).
4      .                                                     FRC                Backward (criterion: Probability of F-to-remove >= .100).
5      .                                                     AGE                Backward (criterion: Probability of F-to-remove >= .100).
6      .                                                     HEIGHT             Backward (criterion: Probability of F-to-remove >= .100).
7      .                                                     RV                 Backward (criterion: Probability of F-to-remove >= .100).

a All requested variables entered.
b Dependent Variable: PEMAX

Model Summary(h)

Model  R  R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1 .798(a) .637 .420 25.471 .637 2.929 9 15 .03
2 .797(b) .636 .454 24.710 -.001 .058 1 15 .81
3 .795(c) .632 .480 24.114 -.004 .190 1 16 .66
4 .792(d) .627 .502 23.592 -.005 .229 1 17 .63
5 .788(e) .621 .522 23.128 -.005 .261 1 18 .61
6 .784(f) .614 .537 22.754 -.007 .357 1 19 .55
7 .755(g) .570 .509 23.440 -.044 2.286 1 20 .14
a Predictors: (Constant), WEIGHT, SEX, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
b Predictors: (Constant), WEIGHT, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
c Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, FRC, AGE
d Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, AGE
e Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT
f Predictors: (Constant), WEIGHT, BMP, RV, FEV1
g Predictors: (Constant), WEIGHT, BMP, FEV1
h Dependent Variable: PEMAX

ANOVA(h)

Sum of
Model Squares df Mean Square F Sig.
1 Regression 17101.390 9 1900.154 2.929 .032(a)
Residual 9731.250 15 648.750
Total 26832.640 24
2 Regression 17063.488 8 2132.936 3.493 .016(b)
Residual 9769.152 16 610.572
Total 26832.640 24
3 Regression 16947.546 7 2421.078 4.164 .008(c)
Residual 9885.094 17 581.476
Total 26832.640 24
4 Regression 16814.390 6 2802.398 5.035 .003(d)
Residual 10018.250 18 556.569
Total 26832.640 24
5 Regression 16669.053 5 3333.811 6.232 .001(e)
Residual 10163.587 19 534.926
Total 26832.640 24
6 Regression 16478.040 4 4119.510 7.957 .001(f)
Residual 10354.600 20 517.730
Total 26832.640 24
7 Regression 15294.452 3 5098.151 9.279 .000(g)
Residual 11538.188 21 549.438
Total 26832.640 24
a Predictors: (Constant), WEIGHT, SEX, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
b Predictors: (Constant), WEIGHT, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
c Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, FRC, AGE
d Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, AGE
e Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT
f Predictors: (Constant), WEIGHT, BMP, RV, FEV1
g Predictors: (Constant), WEIGHT, BMP, FEV1
h Dependent Variable: PEMAX
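
The Adjusted R Square column of the Model Summary can be recomputed from the residual sums of squares in the ANOVA table, which makes it easy to compare the seven backward-elimination models on one scale. A minimal sketch, assuming 25 cases (total df = 24); the names are illustrative.

```python
# Residual sum of squares and number of predictors for each model,
# copied from the ANOVA table; the total sum of squares is the same throughout.
SS_TOTAL = 26832.640
N_CASES = 25
models = {
    1: (9731.250, 9),
    2: (9769.152, 8),
    3: (9885.094, 7),
    4: (10018.250, 6),
    5: (10163.587, 5),
    6: (10354.600, 4),
    7: (11538.188, 3),
}

def adjusted_r_square(ss_residual, n_predictors):
    """1 - (residual mean square) / (total mean square)."""
    residual_ms = ss_residual / (N_CASES - n_predictors - 1)
    total_ms = SS_TOTAL / (N_CASES - 1)
    return 1 - residual_ms / total_ms

adjusted = {m: adjusted_r_square(*values) for m, values in models.items()}
best = max(adjusted, key=adjusted.get)

for m, value in adjusted.items():
    print(f"model {m}: Adjusted R Square = {value:.3f}")
print(f"highest Adjusted R Square: model {best}")
```

Backward elimination stops on the F-to-remove criterion rather than on Adjusted R Square, so the final model need not be the one with the highest adjusted value.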

Coefficients(a)

Model  Unstandardized Coefficients: B  Std. Error  Standardized Coefficients: Beta  t  Sig.  95% Confidence Interval for B: Lower Bound  Upper Bound
1 (Constant) 176.058 225.891 .779 .448 -305.417 657.534
AGE -2.542 4.802 -.385 -.529 .604 -12.777 7.693
HEIGHT -.446 .903 -.287 -.494 .628 -2.372 1.479
SEX -3.737 15.460 -.057 -.242 .812 -36.689 29.215
BMP -1.745 1.155 -.627 -1.510 .152 -4.207 .717
FEV1 1.081 1.081 .362 1.000 .333 -1.223 3.385
RV .197 .196 .507 1.004 .331 -.221 .615
FRC -.308 .492 -.403 -.626 .540 -1.358 .741
TLC .189 .500 .096 .377 .711 -.877 1.254
WEIGHT 2.993 2.008 1.602 1.490 .157 -1.287 7.273
2 (Constant) 153.039 198.715 .770 .452 -268.218 574.295
AGE -2.115 4.331 -.320 -.488 .632 -11.295 7.066
HEIGHT -.395 .852 -.254 -.464 .649 -2.200 1.411
BMP -1.742 1.121 -.625 -1.554 .140 -4.117 .634
FEV1 1.265 .743 .424 1.703 .108 -.310 2.840
RV .178 .174 .458 1.021 .323 -.192 .547
FRC -.248 .412 -.325 -.602 .555 -1.122 .626
TLC .208 .478 .106 .436 .669 -.805 1.222
WEIGHT 2.835 1.842 1.517 1.539 .143 -1.070 6.740
3 (Constant) 198.294 165.331 1.199 .247 -150.524 547.112
AGE -2.663 4.044 -.403 -.659 .519 -11.195 5.869
HEIGHT -.490 .804 -.315 -.609 .550 -2.185 1.206
BMP -1.963 .975 -.705 -2.012 .060 -4.020 .095
FEV1 1.248 .724 .418 1.724 .103 -.280 2.775
RV .160 .165 .411 .967 .347 -.189 .508
FRC -.176 .369 -.231 -.479 .638 -.954 .602
WEIGHT 3.156 1.648 1.689 1.915 .072 -.321 6.632
4 (Constant) 166.905 148.476 1.124 .276 -145.032 478.842
AGE -1.819 3.560 -.275 -.511 .616 -9.299 5.661
HEIGHT -.410 .769 -.264 -.533 .600 -2.026 1.206
BMP -1.949 .954 -.700 -2.043 .056 -3.953 .055
FEV1 1.412 .624 .473 2.263 .036 .101 2.723
RV .096 .095 .246 1.010 .326 -.103 .294
WEIGHT 2.874 1.506 1.539 1.908 .072 -.290 6.039
5 (Constant) 137.096 133.856 1.024 .319 -143.068 417.259
HEIGHT -.449 .751 -.288 -.598 .557 -2.020 1.122
BMP -1.641 .725 -.589 -2.265 .035 -3.158 -.124
FEV1 1.472 .601 .493 2.450 .024 .214 2.729
RV .110 .088 .283 1.245 .228 -.075 .295
WEIGHT 2.339 1.060 1.252 2.206 .040 .120 4.557
6 (Constant) 63.947 53.277 1.200 .244 -47.187 175.080
BMP -1.377 .565 -.494 -2.436 .024 -2.557 -.198
FEV1 1.548 .578 .518 2.679 .014 .343 2.753
RV .126 .083 .323 1.512 .146 -.048 .299
WEIGHT 1.749 .381 .936 4.595 .000 .955 2.543
7 (Constant) 126.334 34.720 3.639 .002 54.130 198.537
BMP -1.465 .579 -.526 -2.530 .019 -2.670 -.261
FEV1 1.109 .514 .371 2.155 .043 .039 2.178
WEIGHT 1.536 .364 .822 4.216 .000 .779 2.294
a Dependent Variable: PEMAX
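
Model 7, the last step of the backward elimination, gives the fitted equation PEMAX = 126.334 - 1.465 × BMP + 1.109 × FEV1 + 1.536 × WEIGHT. The sketch below evaluates it; the input values are hypothetical and chosen only to show the arithmetic, and the coefficients are rounded as printed.

```python
# Coefficients of model 7 as printed (rounded) in the Coefficients table.
MODEL_7 = {"(Constant)": 126.334, "BMP": -1.465, "FEV1": 1.109, "WEIGHT": 1.536}

def predict_pemax(bmp, fev1, weight):
    """Fitted PEMAX from model 7 for given BMP, FEV1 and WEIGHT values."""
    return (MODEL_7["(Constant)"]
            + MODEL_7["BMP"] * bmp
            + MODEL_7["FEV1"] * fev1
            + MODEL_7["WEIGHT"] * weight)

# Hypothetical patient values, for illustration only.
example = predict_pemax(bmp=80, fev1=30, weight=40)
print(f"predicted PEMAX = {example:.3f}")
```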

Excluded Variables(g)

Model  Beta In  t  Sig.  Partial Correlation  Collinearity Statistics: Tolerance
2 SEX -.057(a) -.242 .812 -.062 .441
3 SEX -.071(b) -.316 .756 -.079 .453
TLC .106(b) .436 .669 .108 .386
4 SEX .002(c) .009 .993 .002 .676
TLC .047(c) .217 .831 .052 .460
FRC -.231(c) -.479 .638 -.115 .093
5 SEX .006(d) .033 .974 .008 .677
TLC .081(d) .428 .674 .100 .576
FRC -.092(d) -.216 .831 -.051 .115
AGE -.275(d) -.511 .616 -.120 .071
6 SEX .011(e) .066 .948 .015 .679
TLC .107(e) .607 .551 .138 .644
FRC -.032(e) -.079 .938 -.018 .121
AGE -.303(e) -.577 .571 -.131 .072
HEIGHT -.288(e) -.598 .557 -.136 .086
7 SEX -.012(f) -.068 .947 -.015 .685
TLC .182(f) 1.100 .284 .239 .743
FRC .266(f) 1.201 .244 .259 .410
AGE -.517(f) -1.033 .314 -.225 .081
HEIGHT -.466(f) -.996 .331 -.217 .094
RV .323(f) 1.512 .146 .320 .422
a Predictors in the Model: (Constant), WEIGHT, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
b Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, FRC, AGE
c Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, AGE
d Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT
e Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1
f Predictors in the Model: (Constant), WEIGHT, BMP, FEV1
g Dependent Variable: PEMAX

Residuals Statistics(a)

Minimum Maximum Mean Std. Deviation N


Predicted Value 71.97 160.77 109.12 25.244 25
Std. Predicted Value -1.472 2.046 .000 1.000 25
Standard Error of
Predicted Value 6.062 15.174 9.160 2.040 25
Adjusted Predicted Value 69.56 159.23 108.48 25.272 25
Residual -42.388 40.373 .000 21.926 25
Std. Residual -1.808 1.722 .000 .935 25
Stud. Residual -1.962 1.937 .012 1.014 25
Deleted Residual -49.885 51.069 .637 25.843 25
Stud. Deleted Residual -2.118 2.086 .003 1.052 25
Mahal. Distance .645 9.098 2.880 1.824 25
Cook's Distance .001 .249 .045 .059 25
Centered Leverage Value .027 .379 .120 .076 25
a Dependent Variable: PEMAX

Charts

Page.no: 31 out of 57
Histogram
Yogesh .S. Mangela
Dependent Variable: PEMAX

6
Frequency

Mean =-5.55E-16
Std. Dev. =0.935
0 N =25
-2 0 2

Regression Standardized Residual

Normal P-P Plot of Regression Standardized Residual

Dependent Variable: PEMAX

1.0

0.8
Expected Cum Prob

0.6

0.4

0.2

0.0
0.0 0.2 0.4 0.6 0.8 1.0

Observed Cum Prob

[Figure: Scatterplot. Dependent Variable: PEMAX. x-axis: Regression Standardized Predicted Value; y-axis: Regression Studentized Deleted (Press) Residual]

Regression (Forward Data Modelling)


Notes

[DataSet2] H:\2nd Year\MA2012N\Week3\cystic.sav

Variables Entered/Removed(a)

Model  Variables Entered  Variables Removed  Method
1      WEIGHT             .                  Forward (Criterion: Probability-of-F-to-enter <= .050)
a Dependent Variable: PEMAX

Model Summary(b)

                                                                         Change Statistics
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .635(a)  .404      .378               26.380                      .404             15.559    1    23   .00
a Predictors: (Constant), WEIGHT
b Dependent Variable: PEMAX

ANOVA(b)

Model Sum of Squares df Mean Square F Sig.
1 Regression 10827.159 1 10827.159 15.559 .001(a)
Residual 16005.481 23 695.890
Total 26832.640 24
a Predictors: (Constant), WEIGHT
b Dependent Variable: PEMAX
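
The entries of the ANOVA and Model Summary tables are linked by simple arithmetic. The following is a minimal sketch, using only the sums of squares and degrees of freedom printed above, that recomputes the mean square, F, R Square, Adjusted R Square and the standard error of the estimate. The variable names are illustrative and are not part of the SPSS output.

```python
import math

# Values copied from the ANOVA table above (forward model, predictor WEIGHT).
ss_regression = 10827.159
ss_residual = 16005.481
df_regression = 1
df_residual = 23

ss_total = ss_regression + ss_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of the two mean squares.
f_statistic = ms_regression / ms_residual

# R Square is the share of the total sum of squares explained by the model.
r_square = ss_regression / ss_total

# Adjusted R Square penalises for the number of predictors.
n_cases = df_regression + df_residual + 1
adjusted_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual

# Std. Error of the Estimate is the square root of the residual mean square.
std_error_estimate = math.sqrt(ms_residual)

print(round(f_statistic, 3), round(r_square, 3))
print(round(adjusted_r_square, 3), round(std_error_estimate, 3))
```

The recomputed values agree with the tables to three decimal places, which is a quick way to check that a table has been copied correctly.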

Coefficients(a)

               Unstandardized Coefficients  Standardized Coefficients                95% Confidence Interval for B
Model          B        Std. Error          Beta                       t      Sig.   Lower Bound  Upper Bound
1 (Constant)   63.546   12.702                                         5.003  .000   37.270       89.821
  WEIGHT       1.187    .301                .635                       3.944  .001   .564         1.809
a Dependent Variable: PEMAX
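
The Coefficients table defines the fitted line PEMAX = 63.546 + 1.187 * WEIGHT. As a small sketch (the function name and the example weight of 40.0 are assumptions made for illustration only), the fitted value and the t ratio for the slope can be reproduced as follows.

```python
# Coefficients copied from the Coefficients table above.
INTERCEPT = 63.546
SLOPE = 1.187
SLOPE_STD_ERROR = 0.301

def predict_pemax(weight):
    """Fitted PEMAX for a given WEIGHT under the one-predictor model."""
    return INTERCEPT + SLOPE * weight

# The t statistic is the coefficient divided by its standard error.
t_ratio = SLOPE / SLOPE_STD_ERROR

print(round(t_ratio, 2))                # close to the tabulated t of 3.944
print(round(predict_pemax(40.0), 3))    # 40.0 is a hypothetical weight
```

Small differences in the last decimal place are expected because the table prints rounded coefficients.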

Excluded Variables(b)

Model Beta In t Sig. Partial Correlation Collinearity Statistics (Tolerance)
1 AGE .212(a) .549 .588 .116 .179
HEIGHT .094(a) .224 .825 .048 .152
SEX -.174(a) -1.063 .299 -.221 .964
BMP -.361(a) -1.729 .098 -.346 .548
FEV1 .211(a) 1.179 .251 .244 .799
RV .129(a) .620 .542 .131 .614
FRC -.041(a) -.194 .848 -.041 .619
TLC .102(a) .567 .576 .120 .825
a Predictors in the Model: (Constant), WEIGHT
b Dependent Variable: PEMAX

Residuals Statistics(a)

Minimum Maximum Mean Std. Deviation N


Predicted Value 78.85 151.12 109.12 21.240 25
Std. Predicted Value -1.425 1.978 .000 1.000 25
Standard Error of Predicted Value 5.288 11.884 7.212 1.952 25
Adjusted Predicted Value 76.85 147.59 108.62 20.773 25
Residual -44.305 48.408 .000 25.824 25
Std. Residual -1.680 1.835 .000 .979 25
Stud. Residual -1.733 1.954 .009 1.021 25
Deleted Residual -47.198 57.007 .505 28.144 25
Stud. Deleted Residual -1.818 2.093 .011 1.053 25
Mahal. Distance .005 3.911 .960 1.090 25
Cook's Distance .000 .426 .046 .085 25
Centered Leverage Value .000 .163 .040 .045 25
a Dependent Variable: PEMAX

Charts

[Figure: Histogram of Regression Standardized Residual (y-axis: Frequency). Dependent Variable: PEMAX. Mean = 2.78E-17, Std. Dev. = 0.979, N = 25]

[Figure: Normal P-P Plot of Regression Standardized Residual. Dependent Variable: PEMAX. x-axis: Observed Cum Prob; y-axis: Expected Cum Prob]

[Figure: Scatterplot. Dependent Variable: PEMAX. x-axis: Regression Standardized Predicted Value; y-axis: Regression Studentized Deleted (Press) Residual]

Practical 6 Week 4

Explore

[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_bi)

Group variable: 1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical

Case Processing Summary

                                Cases
                                Valid          Missing        Total
          Group                 N    Percent   N    Percent   N    Percent
cortisol  control               31   100.0%    0    .0%       31   100.0%
          major depression      14   100.0%    0    .0%       14   100.0%
          bipolar depression    8    100.0%    0    .0%       8    100.0%
          schizophrenia         14   100.0%    0    .0%       14   100.0%
          atypical              4    100.0%    0    .0%       4    100.0%

[Figure: Boxplot of cortisol by group (control, major depression, bipolar depression, schizophrenia, atypical). Outlying cases are labelled 31, 53, 66 and 67]

Graph
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Q1_bii

[Figure: Plot of cortisol (y-axis) against group (x-axis)]

Graph
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Q1_biii)

[Figure: Error bar chart of the 95% CI for cortisol by group (control, major depression, bipolar depression, schizophrenia, atypical)]

Oneway

[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav Prac-6 ci)


Test of Homogeneity of Variances

cortisol
Levene Statistic  df1  df2  Sig.
16.121            4    66   .000

ANOVA

cortisol
                Sum of Squares  df  Mean Square  F       Sig.
Between Groups  1405.594        4   351.399      21.588  .000
Within Groups   1074.295        66  16.277
Total           2479.889        70

[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Dii)

Case Processing Summary

Group variable: 1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical

                                 Cases
                                 Valid          Missing        Total
           Group                 N    Percent   N    Percent   N    Percent
LCORTISOL  control               31   100.0%    0    .0%       31   100.0%
           major depression      14   100.0%    0    .0%       14   100.0%
           bipolar depression    8    100.0%    0    .0%       8    100.0%
           schizophrenia         14   100.0%    0    .0%       14   100.0%
           atypical              4    100.0%    0    .0%       4    100.0%

[Figure: Boxplot of LCORTISOL by group (control, major depression, bipolar depression, schizophrenia, atypical). Outlying cases are labelled 31 and 67]

Graph
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Dii)
[Figure: Error bar chart of the 95% CI for LCORTISOL by group (control, major depression, bipolar depression, schizophrenia, atypical)]

[Figure: Plot of LCORTISOL (y-axis) against group (x-axis)]

Oneway

[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC-6_Diii

Test of Homogeneity of Variances

LCORTISOL
Levene Statistic  df1  df2  Sig.
2.507             4    66   .050

ANOVA

LCORTISOL
                Sum of Squares  df  Mean Square  F       Sig.
Between Groups  32.019          4   8.005        16.003  .000
Within Groups   33.014          66  .500
Total           65.032          70

Post Hoc Tests


Multiple Comparisons

Dependent Variable: LCORTISOL


Dunnett t (2-sided)
Groups: 1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical

                                                                         95% Confidence Interval
(I) Group           (J) Group  Mean Difference (I-J)  Std. Error  Sig.  Lower Bound  Upper Bound
major depression    control    1.57109(*)             .22774      .000  .9911        2.1511
bipolar depression  control    -.14395                .28047      .973  -.8582       .5703
schizophrenia       control    -.26084                .22774      .671  -.8408       .3192
atypical            control    -.19067                .37575      .974  -1.1476      .7663

* The mean difference is significant at the .05 level.


a Dunnett t-tests treat one group as a control, and compare all other groups against it.
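
The Std. Error column of the Dunnett table can be checked by hand. The sketch below assumes the usual pooled-variance formula, sqrt(MS_within * (1/n_group + 1/n_control)), and takes the within-groups sum of squares from the LCORTISOL ANOVA table and the group sizes from the Case Processing Summary. The function name is illustrative.

```python
import math

# Within-groups mean square from the LCORTISOL ANOVA table (33.014 on 66 df).
ms_within = 33.014 / 66

# Group sizes from the Case Processing Summary.
group_sizes = {
    "control": 31,
    "major depression": 14,
    "bipolar depression": 8,
    "schizophrenia": 14,
    "atypical": 4,
}

def std_error_vs_control(group):
    """Standard error of (group mean - control mean) using the pooled variance."""
    n_group = group_sizes[group]
    n_control = group_sizes["control"]
    return math.sqrt(ms_within * (1 / n_group + 1 / n_control))

for name in group_sizes:
    if name != "control":
        print(name, round(std_error_vs_control(name), 4))
```

The smaller the group, the larger the standard error, which is why the atypical group (n = 4) has the widest confidence interval in the table.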

Practical 7: Survival analysis: Kaplan-Meier

Kaplan-Meier
[DataSet1] H:\2nd Year\MA2012N\Week5\btrial\btrial.sav PRC-7 1c)

Case Processing Summary

                                Censored
ihc_type  Total N  N of Events  N    Percent
negative  36       16           20   55.6%
positive  9        8            1    11.1%
Overall   45       24           21   46.7%

Survival Table

Columns: ihc_type, case number, Time, Status, Cumulative Proportion Surviving at the Time (Estimate, Std. Error), N of Cumulative Events, N of Remaining Cases
negative 1 19.000 dead .972 .027 1 35
2 25.000 dead .944 .038 2 34
3 30.000 dead .917 .046 3 33
4 34.000 dead .889 .052 4 32
5 37.000 dead .861 .058 5 31
6 46.000 dead .833 .062 6 30
7 47.000 dead .806 .066 7 29
8 51.000 dead .778 .069 8 28
9 56.000 dead .750 .072 9 27
10 57.000 dead .722 .075 10 26
11 61.000 dead .694 .077 11 25
12 66.000 dead .667 .079 12 24
13 67.000 dead .639 .080 13 23
14 74.000 dead .611 .081 14 22
15 78.000 dead .583 .082 15 21
16 86.000 dead .556 .083 16 20
17 122.000 alive . . 16 19
18 123.000 alive . . 16 18
19 130.000 alive . . 16 17
20 130.000 alive . . 16 16
21 133.000 alive . . 16 15
22 134.000 alive . . 16 14
23 136.000 alive . . 16 13
24 141.000 alive . . 16 12
25 143.000 alive . . 16 11
26 148.000 alive . . 16 10
27 151.000 alive . . 16 9
28 152.000 alive . . 16 8
29 153.000 alive . . 16 7
30 154.000 alive . . 16 6
31 156.000 alive . . 16 5
32 162.000 alive . . 16 4
33 164.000 alive . . 16 3
34 165.000 alive . . 16 2
35 182.000 alive . . 16 1
36 189.000 alive . . 16 0

positive 1 22.000 dead .889 .105 1 8
2 23.000 dead .778 .139 2 7
3 38.000 dead .667 .157 3 6
4 42.000 dead .556 .166 4 5
5 73.000 dead .444 .166 5 4
6 77.000 dead .333 .157 6 3
7 89.000 dead .222 .139 7 2
8 115.000 dead .111 .105 8 1
9 144.000 alive . . 8 0
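
The Estimate and Std. Error columns of the Survival Table follow from the product-limit (Kaplan-Meier) formula with Greenwood's standard error. The following is a minimal sketch, not the SPSS implementation, applied to the nine positive cases listed above; the function name `kaplan_meier` is an assumption made for illustration.

```python
import math

def kaplan_meier(times, events):
    """Return (time, survival, std_error) at each distinct event time.

    times  : observed times
    events : 1 if the event (death) was observed, 0 if the case was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (u, e) in data if u == t]
        deaths = sum(tied)
        if deaths > 0:
            # Product-limit step: multiply by the proportion surviving time t.
            survival *= (at_risk - deaths) / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            steps.append((t, survival, survival * math.sqrt(greenwood_sum)))
        at_risk -= len(tied)
        i += len(tied)
    return steps

# Positive group, copied from the Survival Table (144 is censored).
positive_times = [22, 23, 38, 42, 73, 77, 89, 115, 144]
positive_events = [1, 1, 1, 1, 1, 1, 1, 1, 0]

steps = kaplan_meier(positive_times, positive_events)
for time, estimate, std_error in steps:
    print(time, round(estimate, 3), round(std_error, 3))
```

Censored cases contribute to the number at risk until they leave the study but never cause a step in the curve, which is why no estimate is printed on the "alive" rows.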

Means and Medians for Survival Time

          Mean(a)                                          Median
                                95% Confidence Interval                          95% Confidence Interval
ihc_type  Estimate  Std. Error  Lower Bound  Upper Bound  Estimate  Std. Error  Lower Bound  Upper Bound
negative  128.167   11.530      105.567      150.766      .         .           .            .
positive  69.222    13.257      43.239       95.206       73.000    46.212      .000         163.576
Overall   117.378   10.328      97.134       137.622      89.000    .           .            .
a Estimation is limited to the largest survival time if it is censored.

Mean of the survival time distribution

For negative IHC type the population mean survival time is estimated to be 128.167, with 95%
confidence interval (105.567, 150.766).
For positive IHC type the population mean survival time is estimated to be 69.222, with 95%
confidence interval (43.239, 95.206).

Median of the survival time distribution

For negative IHC type the population median survival time cannot be estimated, because the
cumulative proportion surviving never falls to 0.5 (its lowest value is .556); no confidence
interval is reported.

For positive IHC type the population median survival time is estimated to be 73.000, with 95%
confidence interval (0.000, 163.576).
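
The mean and median in the table can be read off the step-shaped survival curve: the median is the first event time at which the estimate is at or below 0.5, and the mean is the area under the curve up to the largest observed time. The sketch below is illustrative only. It relies on the fact that in both groups every censored case comes after the last death, so the survival estimate after the k-th death is simply (n - k) / n; the function names are assumptions.

```python
# Death times copied from the Survival Table.
negative_event_times = [19, 25, 30, 34, 37, 46, 47, 51, 56, 57, 61, 66, 67, 74, 78, 86]
positive_event_times = [22, 23, 38, 42, 73, 77, 89, 115]

def survivor_steps(event_times, n):
    """(time, number still surviving) pairs, valid when all censoring is after the last death."""
    return [(t, n - k) for k, t in enumerate(event_times, start=1)]

def median_survival(steps, n):
    """Smallest event time at which the survival estimate is <= 0.5, else None."""
    for time, survivors in steps:
        if survivors / n <= 0.5:
            return time
    return None

def restricted_mean(steps, n, t_max):
    """Area under the survival step function from 0 to t_max."""
    area, previous_time, survival = 0.0, 0, 1.0
    for time, survivors in steps:
        area += survival * (time - previous_time)
        previous_time, survival = time, survivors / n
    return area + survival * (t_max - previous_time)

negative_steps = survivor_steps(negative_event_times, 36)
positive_steps = survivor_steps(positive_event_times, 9)

# 189 and 144 are the largest observed (censored) times in each group.
print(round(restricted_mean(negative_steps, 36, 189), 3), median_survival(negative_steps, 36))
print(round(restricted_mean(positive_steps, 9, 144), 3), median_survival(positive_steps, 9))
```

A result of `None` for the median corresponds to the "." printed by SPSS when the curve never reaches 0.5.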

Percentiles

ihc_type 25.0% 50.0% 75.0%


Estimate Std. Error Estimate Std. Error Estimate Std. Error
negative . . . . 56.000 9.093
positive 89.000 23.697 73.000 46.212 38.000 11.314
Overall . . 89.000 . 51.000 8.899

Quartiles of the survival time distribution

For negative IHC type the lower quartile of survival time, Q1, is estimated to be 56.00; the
median Q2 and the upper quartile Q3 cannot be estimated. (Q1 is read from the 75.0% column, the
time by which the proportion surviving has fallen to 75%.)
For positive IHC type Q1 is estimated to be 38.00, Q2 to be 73.00 and Q3 to be 89.00.
Overall Comparisons

                                Chi-Square  df  Sig.
Log Rank (Mantel-Cox)           5.494       1   .019
Breslow (Generalized Wilcoxon)  4.351       1   .037
Tarone-Ware                     4.879       1   .027
Test of equality of survival distributions for the different levels of ihc_type.

H0: The survival time distributions of negative and positive IHC type are the same.

H1: The survival time distributions of negative and positive IHC type are different.

All three tests in the table above give p ≤ 0.05 (log rank p = .019), so H0 is rejected: there is evidence that the survival distributions of negative and positive IHC type are different.
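
The log rank statistic compares, at each death time, the observed number of deaths in one group with the number expected if both groups shared the same survival distribution. The following is a minimal sketch of the standard two-group calculation, written from the times listed in the Survival Table; it is not the SPSS code, and the function name `log_rank` is an assumption.

```python
import math

# Times copied from the Survival Table (deaths and censored cases by group).
negative_deaths = [19, 25, 30, 34, 37, 46, 47, 51, 56, 57, 61, 66, 67, 74, 78, 86]
negative_censored = [122, 123, 130, 130, 133, 134, 136, 141, 143, 148,
                     151, 152, 153, 154, 156, 162, 164, 165, 182, 189]
positive_deaths = [22, 23, 38, 42, 73, 77, 89, 115]
positive_censored = [144]

def log_rank(deaths_a, censored_a, deaths_b, censored_b):
    """Two-group log rank chi-square (1 df) and its p-value."""
    all_a = deaths_a + censored_a
    all_b = deaths_b + censored_b
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted(set(deaths_a + deaths_b)):
        n_a = sum(1 for u in all_a if u >= t)   # at risk in group a
        n_b = sum(1 for u in all_b if u >= t)   # at risk in group b
        d_a = deaths_a.count(t)
        d = d_a + deaths_b.count(t)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group a.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

chi_square, p_value = log_rank(positive_deaths, positive_censored,
                               negative_deaths, negative_censored)
print(round(chi_square, 3), round(p_value, 3))
```

The result agrees with the Log Rank (Mantel-Cox) row of the table to the precision printed there.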
Survival Functions

[Figure: Kaplan-Meier survival functions by ihc_type (negative, positive), with censored cases marked. x-axis: time; y-axis: Cum Survival]

The survival function plot suggests that the negative IHC type has the higher cumulative survival
probability for almost all of the follow-up, which runs to a largest observed time of 189. The
cumulative survival proportions for the two IHC types are clearly different. For instance, by time
115 the estimated survival for positive IHC type has fallen to about 11% (.111), whereas for
negative IHC type it is still about 56% (.556).

Hazard Function

[Figure: Cumulative hazard functions by ihc_type (negative, positive), with censored cases marked. x-axis: time; y-axis: Cum Hazard]

Kaplan-Meier

[DataSet2] H:\2nd Year\MA2012N\Week5\kidney\kidney.sav Prac 7 (2)

Case Processing Summary

                                      Censored
CATHATER        Total N  N of Events  N    Percent
SURGICALLY      43       15           28   65.1%
PERCUTANEOUSLY  76       11           65   85.5%
Overall         119      26           93   78.2%

The status variable is INFECTED and the event is coded 1 = YES; ideally the event occurs as late as possible.

Survival Table

Columns: CATHATER, case number, Time, Status, Cumulative Proportion Surviving at the Time (Estimate, Std. Error), N of Cumulative Events, N of Remaining Cases
SURGICALLY 1 1.500 NO .977 .023 1 42
2 2.500 0 . . 1 41
3 2.500 0 . . 1 40
4 3.500 NO .952 .033 2 39
5 3.500 0 . . 2 38
6 3.500 0 . . 2 37
7 3.500 0 . . 2 36
8 4.500 NO . . 3 35
9 4.500 NO .899 .048 4 34
10 4.500 0 . . 4 33
11 5.500 NO .872 .054 5 32
12 5.500 0 . . 5 31
13 6.500 0 . . 5 30
14 6.500 0 . . 5 29
15 7.500 0 . . 5 28
16 7.500 0 . . 5 27
17 7.500 0 . . 5 26
18 7.500 0 . . 5 25
19 8.500 NO . . 6 24
20 8.500 NO .802 .068 7 23
21 8.500 0 . . 7 22
22 9.500 NO .766 .074 8 21
23 9.500 0 . . 8 20
24 10.500 NO .728 .080 9 19
25 10.500 0 . . 9 18
26 11.500 NO .687 .085 10 17
27 11.500 0 . . 10 16
28 12.500 0 . . 10 15
29 12.500 0 . . 10 14
30 13.500 0 . . 10 13
31 14.500 0 . . 10 12
32 14.500 0 . . 10 11
33 15.500 NO .625 .098 11 10
34 16.500 NO .562 .106 12 9
35 18.500 NO .500 .111 13 8
36 21.500 0 . . 13 7
37 21.500 0 . . 13 6
38 22.500 0 . . 13 5
39 22.500 0 . . 13 4
40 23.500 NO .375 .137 14 3
41 25.500 0 . . 14 2
42 26.500 NO .187 .149 15 1
43 27.500 0 . . 15 0

PERCUTANEOUSLY 1 .500 NO . . 1 75
2 .500 NO . . 2 74
3 .500 NO . . 3 73
4 .500 NO . . 4 72
5 .500 NO . . 5 71
6 .500 NO .921 .031 6 70
7 .500 0 . . 6 69
8 .500 0 . . 6 68
9 .500 0 . . 6 67
10 .500 0 . . 6 66
11 .500 0 . . 6 65
12 .500 0 . . 6 64
13 .500 0 . . 6 63
14 .500 0 . . 6 62
15 .500 0 . . 6 61
16 .500 0 . . 6 60
17 1.500 0 . . 6 59
18 1.500 0 . . 6 58
19 1.500 0 . . 6 57
20 1.500 0 . . 6 56
21 2.500 NO . . 7 55
22 2.500 NO .888 .038 8 54
23 2.500 0 . . 8 53
24 2.500 0 . . 8 52
25 2.500 0 . . 8 51
26 2.500 0 . . 8 50
27 2.500 0 . . 8 49
28 3.500 NO .870 .041 9 48
29 3.500 0 . . 9 47
30 3.500 0 . . 9 46
31 3.500 0 . . 9 45
32 3.500 0 . . 9 44
33 3.500 0 . . 9 43
34 4.500 0 . . 9 42
35 4.500 0 . . 9 41
36 4.500 0 . . 9 40
37 5.500 0 . . 9 39
38 5.500 0 . . 9 38
39 5.500 0 . . 9 37
40 5.500 0 . . 9 36
41 5.500 0 . . 9 35
42 6.500 NO .845 .047 10 34
43 6.500 0 . . 10 33
44 7.500 0 . . 10 32
45 7.500 0 . . 10 31
46 7.500 0 . . 10 30
47 8.500 0 . . 10 29
48 8.500 0 . . 10 28
49 8.500 0 . . 10 27
50 9.500 0 . . 10 26
51 9.500 0 . . 10 25
52 10.500 0 . . 10 24
53 10.500 0 . . 10 23
54 10.500 0 . . 10 22
55 11.500 0 . . 10 21
56 11.500 0 . . 10 20
57 12.500 0 . . 10 19
58 12.500 0 . . 10 18
59 12.500 0 . . 10 17
60 12.500 0 . . 10 16
61 14.500 0 . . 10 15
62 14.500 0 . . 10 14

Means and Medians for Survival Time

CATHATER Mean(a) Median


95% Confidence Interval 95% Confidence In
Estimate Std. Error Estimate Std. Error
Lower Bound Upper Bound Lower Bound Uppe
SURGICALLY 18.527 1.659 15.275 21.778 18.500 4.149 10.367
PERCUTANEOUSLY 23.649 1.386 20.933 26.366 . . .
Overall 21.028 1.206 18.664 23.391 26.500 . .
a Estimation is limited to the largest survival time if it is censored.

Mean of the survival time distribution


For the SURGICALLY catheter type the population mean survival time is estimated to be 18.527, with
95% confidence interval (15.275, 21.778).
For the PERCUTANEOUSLY catheter type the population mean survival time is estimated to be 23.649,
with 95% confidence interval (20.933, 26.366).

Median of the survival time distribution


For the SURGICALLY catheter type the population median survival time is estimated to be 18.500, with
95% confidence interval (10.367, 26.633).
For the PERCUTANEOUSLY catheter type the population median survival time is not estimated: the
table reports no value, which indicates that the cumulative proportion surviving does not fall to 0.5.
Percentiles

CATHATER 25.0% 50.0% 75.0%


Estimate Std. Error Estimate Std. Error Estimate Std. Error
SURGICALLY 26.500 2.386 18.500 4.149 10.500 2.080
PERCUTANEOUSLY . . . . . .
Overall . . 26.500 . 15.500 3.301

Quartiles of the survival time distribution


For the SURGICALLY catheter type the lower quartile Q1 of survival time is estimated to be 10.500, Q2 is 18.500 and Q3
is 26.500.
For the PERCUTANEOUSLY catheter type none of Q1, Q2 and Q3 can be estimated.
Overall Comparisons

                                Chi-Square  df  Sig.
Log Rank (Mantel-Cox)           2.530       1   .112
Breslow (Generalized Wilcoxon)  .002        1   .964
Tarone-Ware                     .403        1   .526
Test of equality of survival distributions for the different levels of CATHATER.

H0: The Surgically and Percutaneously catheter types have the same survival distribution.
H1: The Surgically and Percutaneously catheter types have different survival distributions.

P ≥ 0.05 for all three tests (log rank p = .112), hence we fail to reject H0 and conclude that there is no evidence of a difference
in survival distribution between the Surgically and Percutaneously catheter types.

Survival Functions

[Figure: Kaplan-Meier survival functions by CATHATER (PERCUTANEOUSLY, SURGICALLY), with censored cases marked. x-axis: TIME; y-axis: Cum Survival]

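The survival function graph suggests that the PERCUTANEOUSLY catheter type has a higher
cumulative survival probability than the SURGICALLY type at later times (from about time 8.5
onwards), although it is lower at the earliest times. The tests above show that this difference
is not statistically significant.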

Hazard Function

[Figure: Cumulative hazard functions by CATHATER (PERCUTANEOUSLY, SURGICALLY), with censored cases marked. x-axis: TIME; y-axis: Cum Hazard]

Kaplan-Meier
[DataSet1] H:\2nd Year\MA2012N\Week5\HIV_azt\HIV_azt.sav 7.3.A)
Case Processing Summary

                               Censored
cd4      Total N  N of Events  N    Percent
No       7        3            4    57.1%
Yes      27       14           13   48.1%
Overall  34       17           17   50.0%

The factor is CD4 (0 = no, 1 = yes) and the status variable is Drug: AZT+zalcitabine (coded 1) is
treated as the event and AZT+zalcitabine+saquinavir as censored.
Survival Table

Estimate and Std. Error refer to the Cumulative Proportion Surviving at the Time.

cd4  #   Time     Status                      Estimate  Std. Error  N of Cumulative Events  N of Remaining Cases
No   1   4.000    AZT+zalcitabine             .857      .132        1                       6
     2   38.000   AZT+zalcitabine             .714      .171        2                       5
     3   51.000   AZT+zalcitabine+saquinavir  .         .           2                       4
     4   56.000   AZT+zalcitabine+saquinavir  .         .           2                       3
     5   94.000   AZT+zalcitabine+saquinavir  .         .           2                       2
     6   180.000  AZT+zalcitabine             .357      .267        3                       1
     7   180.000  AZT+zalcitabine+saquinavir  .         .           3                       0
Yes  1   6.000    AZT+zalcitabine             .958      .041        1                       23
     2   11.000   AZT+zalcitabine             .917      .056        2                       22
     3   12.000   AZT+zalcitabine             .875      .068        3                       21
     4   12.000   AZT+zalcitabine+saquinavir  .         .           3                       20
     5   22.000   AZT+zalcitabine+saquinavir  .         .           3                       19
     6   32.000   AZT+zalcitabine             .829      .078        4                       18
     7   35.000   AZT+zalcitabine             .783      .086        5                       17
     8   39.000   AZT+zalcitabine             .737      .093        6                       16
     9   45.000   AZT+zalcitabine             .691      .098        7                       15
     10  48.000   AZT+zalcitabine+saquinavir  .         .           7                       14
     11  49.000   AZT+zalcitabine             .641      .102        8                       13
     12  75.000   AZT+zalcitabine             .592      .106        9                       12
     13  80.000   AZT+zalcitabine             .543      .108        10                      11
     14  80.000   AZT+zalcitabine+saquinavir  .         .           10                      10
     15  84.000   AZT+zalcitabine             .488      .110        11                      9
     16  85.000   AZT+zalcitabine             .434      .110        12                      8
     17  85.000   AZT+zalcitabine+saquinavir  .         .           12                      7
     18  87.000   AZT+zalcitabine             .372      .111        13                      6
     19  90.000   AZT+zalcitabine+saquinavir  .         .           13                      5
     20  102.000  AZT+zalcitabine             .298      .111        14                      4
     21  160.000  AZT+zalcitabine+saquinavir  .         .           14                      3
     22  171.000  AZT+zalcitabine+saquinavir  .         .           14                      2
     23  180.000  AZT+zalcitabine+saquinavir  .         .           14                      1
     24  238.000  AZT+zalcitabine+saquinavir  .         .           14                      0

Means and Medians for Survival Time

         Mean(a)                                          Median
                               95% Confidence Interval                          95% Confidence Interval
cd4      Estimate  Std. Error  Lower Bound  Upper Bound  Estimate  Std. Error  Lower Bound  Upper Bound
No       134.571   33.515      68.881       200.261      180.000   105.992     .000         387.744
Yes      111.253   19.893      72.262       150.244      84.000    6.958       70.363       97.637
Overall  118.394   17.854      83.399       153.388      85.000    5.002       75.196       94.804
a Estimation is limited to the largest survival time if it is censored.

Mean of the survival time distribution

For CD4 No the population mean survival time is estimated to be 134.571, with 95% confidence interval (68.881, 200.261).
For CD4 Yes the population mean survival time is estimated to be 111.253, with 95% confidence interval (72.262, 150.244).

Median of the survival time distribution

For CD4 No the population median survival time is estimated to be 180.00, with 95% confidence interval (0.00, 387.744).
For CD4 Yes the population median survival time is estimated to be 84.00, with 95% confidence interval (70.363, 97.637).

Percentiles

cd4 25.0% 50.0% 75.0%


Estimate Std. Error Estimate Std. Error Estimate Std. Error
No . . 180.000 105.992 38.000 60.103
Yes . . 84.000 6.958 39.000 8.721
Overall . . 85.000 5.002 39.000 7.773

Quartiles of the survival time distribution

For CD4 No the Q1 survival time is estimated to be 38.00 and Q2 to be 180.00; Q3 is not estimated.

For CD4 Yes the Q1 survival time is estimated to be 39.00 and Q2 to be 84.00; Q3 is not estimated.
Overall Comparisons

                                Chi-Square  df  Sig.
Log Rank (Mantel-Cox)           .520        1   .471
Breslow (Generalized Wilcoxon)  .246        1   .620
Tarone-Ware                     .433        1   .510
Test of equality of survival distributions for the different levels of cd4.

H0: CD4 Yes and CD4 No have the same survival distribution.
H1: CD4 Yes and CD4 No have different survival distributions.

P ≥ 0.05 for all three tests (log rank p = .471), so we fail to reject H0: there is no evidence that the survival distributions of CD4 Yes and CD4 No differ.
Survival Functions

[Figure: Kaplan-Meier survival functions by cd4 (No, Yes), with censored cases marked. x-axis: time; y-axis: Cum Survival]

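The survival function graph above suggests that the CD4 No group has the higher cumulative survival from about time 45 onwards, although the tests above show that the difference is not statistically significant.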

Hazard Function

[Figure: Cumulative hazard functions by cd4 (No, Yes), with censored cases marked. x-axis: time; y-axis: Cum Hazard]

Cox Regression
Notes


[DataSet3] H:\2nd Year\MA2012N\Week6\Kidtran\kidtran.sav PRAC_8_i)

Case Processing Summary

                                                                                    N    Percent
Cases available in analysis  Event(a)                                               140  16.2%
                             Censored                                               721  83.5%
                             Total                                                  861  99.8%
Cases dropped                Cases with missing values                              0    .0%
                             Cases with negative time                               0    .0%
                             Censored cases before the earliest event in a stratum  2    .2%
                             Total                                                  2    .2%
Total                                                                               863  100.0%


a Dependent Variable: Measures the Time to death

Categorical Variable Codings(b,c)

                   Frequency  (1)
sex(a)   1=Male    524        1
         2=Female  339        0
race(a)  1=White   712        1
         2=Black   151        0
a Indicator Parameter Coding
b Category variable: sex
c Category variable: race

Block 0: Beginning Block


Variables not in the Equation(a)

Score df Sig.
sex .310 1 .578
race 1.111 1 .292
age 53.433 1 .000
a Residual Chi Square = 53.688 with 3 df Sig. = .000

Best fitted model: h(y) = h0(y) * exp{.138*Size}


Increasing Size by one unit multiplies the hazard by exp{.138} = 1.148 (3 d.p.), an increase of 14.8%.

Block 1: Method = Forward Stepwise (Likelihood Ratio)


Omnibus Tests of Model Coefficients(b,c)


-2 Log Overall (score) Change From Previous Step Change From Previou
Step Likelihood
Chi-square df Sig. Chi-square df Sig. Chi-square df
1(a) 1701.638 53.433 1 .000 56.732 1 .000 56.732 1
a Variable(s) Entered at Step Number 1: age
b Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 1758.370
c Beginning Block Number 1. Method = Forward Stepwise (Likelihood Ratio)

The only explanatory variable selected by forward stepwise selection is age.


Variables in the Equation

B SE Wald df Sig. Exp(B)


Step 1 age .051 .007 51.203 1 .000 1.052

Best fitted model: h(y) = h0(y) * exp{0.051*AGE}


As age increases, the hazard of death increases.

Each one-unit increase in AGE multiplies the hazard by exp{0.051} = 1.0523, an increase of 5.23%.
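
In a Cox model the baseline hazard h0(y) cancels when two hazards are divided, so the effect of a covariate is summarised by a hazard ratio, exp(B * increase). The short sketch below uses the age coefficient from the Variables in the Equation table; the function name and the ten-unit example are assumptions made for illustration.

```python
import math

AGE_COEFFICIENT = 0.051  # B for age from "Variables in the Equation"

def hazard_ratio(coefficient, increase=1.0):
    """Multiplicative change in hazard when the covariate rises by `increase`."""
    return math.exp(coefficient * increase)

one_unit = hazard_ratio(AGE_COEFFICIENT)
ten_units = hazard_ratio(AGE_COEFFICIENT, 10)

print(round(one_unit, 4))    # matches Exp(B) in the table to 3 decimal places
print(round(ten_units, 3))   # effect of a ten-unit difference in age
```

Because the effect is multiplicative, a ten-unit difference in age raises the hazard by about 67%, not by ten times 5.23%.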

Variables not in the Equation(a)

             Score  df  Sig.
Step 1 sex   .025   1   .875
       race  .305   1   .581
a Residual Chi Square = .328 with 2 df Sig. = .849

Model if Term Removed

Term Removed  Loss Chi-square  df  Sig.
Step 1 age    56.732           1   .000

Covariate Means

Mean
sex .607
race .825
age 42.835

[Figure: Hazard Function at mean of covariates. x-axis: Measures the Time to death; y-axis: Cum Hazard]

[Figure: Survival Function at mean of covariates. x-axis: Measures the Time to death; y-axis: Cum Survival]

Cox Regression

[DataSet1] H:\2nd Year\MA2012N\Week6\prostatic.sav

Case Processing Summary

                                                                                    N    Percent
Cases available in analysis  Event(a)                                               6    15.8%
                             Censored                                               30   78.9%
                             Total                                                  36   94.7%
Cases dropped                Cases with missing values                              0    .0%
                             Cases with negative time                               0    .0%
                             Censored cases before the earliest event in a stratum  2    5.3%
                             Total                                                  2    5.3%
Total                                                                               38   100.0%
a Dependent Variable: Survival time

Categorical Variable Codings(b)

                                    Frequency  (1)
TREATMENT(a)  1=placebo             18         1
              2=diethylstilbestrol  20         0
a Indicator Parameter Coding
b Category variable: TREATMENT (Treatment)

Block 0: Beginning Block


Variables not in the Equation(a)

Score df Sig.
TREATMENT 4.421 1 .035
AGE .082 1 .774
SERUM .151 1 .697
SIZE 9.644 1 .002
GLEASON 7.262 1 .007
a Residual Chi Square = 14.992 with 5 df Sig. = .010

Block 1: Method = Forward Stepwise (Likelihood Ratio)


Omnibus Tests of Model Coefficients(c,d)

-2 Log Overall (score) Change From Previous Step Change From Previou
Step Likelihood
Chi-square df Sig. Chi-square df Sig. Chi-square df
1(a) 29.042 9.644 1 .002 7.307 1 .007 7.307 1
2(b) 23.533 13.752 2 .001 5.508 1 .019 12.816 2
a Variable(s) Entered at Step Number 1: SIZE
b Variable(s) Entered at Step Number 2: GLEASON
c Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 36.349
d Beginning Block Number 1. Method = Forward Stepwise (Likelihood Ratio)
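
The "Change From Previous Step" chi-square in the Omnibus table is the drop in -2 Log Likelihood between successive models. As a sketch, assuming one degree of freedom per added variable, the changes and their p-values can be recomputed from the three -2 Log Likelihood values printed above; the function name is illustrative.

```python
import math

# -2 Log Likelihood values copied from the Omnibus table and its footnote c.
MINUS_2LL_BLOCK_0 = 36.349
MINUS_2LL_STEP_1 = 29.042   # SIZE entered
MINUS_2LL_STEP_2 = 23.533   # SIZE and GLEASON entered

def likelihood_ratio_1df(before, after):
    """Chi-square change and p-value (1 df) when one variable is added."""
    change = before - after
    p_value = math.erfc(math.sqrt(change / 2))  # upper tail, chi-square 1 df
    return change, p_value

change_1, p_1 = likelihood_ratio_1df(MINUS_2LL_BLOCK_0, MINUS_2LL_STEP_1)
change_2, p_2 = likelihood_ratio_1df(MINUS_2LL_STEP_1, MINUS_2LL_STEP_2)
total_change = MINUS_2LL_BLOCK_0 - MINUS_2LL_STEP_2

print(round(change_1, 3), round(p_1, 3))
print(round(change_2, 3), round(p_2, 3))
print(round(total_change, 3))
```

The step 2 change comes out as 5.509 rather than the tabulated 5.508 because the printed -2 Log Likelihood values are themselves rounded.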

Variables in the Equation

B SE Wald df Sig. Exp(B)


Step 1 SIZE .101 .037 7.360 1 .007 1.107
Step 2 SIZE .104 .045 5.250 1 .022 1.109
GLEASON .778 .362 4.620 1 .032 2.177

SIZE and GLEASON were selected for the best fitted model.


Best fitted model (Step 2): h(y) = h0(y) * exp{0.104*SIZE + 0.778*GLEASON}

For SIZE, a one-unit increase multiplies the hazard by Exp(B) = 1.109, an increase of 10.9%, holding GLEASON fixed.


As the tumour size increases, the hazard of death increases.

For GLEASON, a one-unit increase multiplies the hazard by Exp(B) = 2.177, an increase of 117.7%, holding SIZE fixed.
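
With two covariates in the model, the hazard ratio between two patients is exp of the sum of each coefficient times the difference in that covariate. The sketch below uses the Step 2 coefficients from the Variables in the Equation table; the function names and the example differences are assumptions made for illustration.

```python
import math

# Step 2 B values from "Variables in the Equation".
COEFFICIENTS = {"SIZE": 0.104, "GLEASON": 0.778}

def relative_hazard(changes):
    """Hazard ratio between two patients whose covariates differ by `changes`."""
    linear_predictor = sum(COEFFICIENTS[name] * delta for name, delta in changes.items())
    return math.exp(linear_predictor)

def percent_change(ratio):
    """Express a hazard ratio as a percentage increase in hazard."""
    return (ratio - 1) * 100

gleason_ratio = relative_hazard({"GLEASON": 1})
size_ratio = relative_hazard({"SIZE": 1})
both_ratio = relative_hazard({"SIZE": 1, "GLEASON": 1})

print(round(gleason_ratio, 3), round(percent_change(gleason_ratio), 1))
print(round(size_ratio, 3))
print(round(both_ratio, 3))
```

A hazard ratio of 2.177 is a 117.7% increase in hazard, since the percentage change is (ratio - 1) * 100. The SIZE ratio computed from the rounded coefficient .104 differs from the tabulated Exp(B) of 1.109 only in the third decimal place.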

Variables not in the Equation(a,b)

Score df Sig.
Step 1 TREATMENT 1.249 1 .264
AGE .177 1 .674
SERUM .023 1 .881
GLEASON 5.714 1 .017

Step 2 TREATMENT .926 1 .336
AGE .255 1 .613
SERUM .025 1 .875
a Residual Chi Square = 7.714 with 4 df Sig. = .103
b Residual Chi Square = 1.435 with 3 df Sig. = .697

Model if Term Removed

Term Removed    Loss Chi-square  df  Sig.
Step 1 SIZE     7.307            1   .007
Step 2 SIZE     5.594            1   .018
       GLEASON  5.508            1   .019

Covariate Means

Mean
TREATMENT .472
AGE 68.278
SERUM 14.000
SIZE 10.750
GLEASON 9.139

[Figure: Survival Function at mean of covariates. x-axis: Survival time; y-axis: Cum Survival]

[Figure: Hazard Function at mean of covariates. x-axis: Survival time; y-axis: Cum Hazard]