LogBook (V5)

Yogesh .S. Mangela
LOGBOOK
CONTENTS
Practical One: Binomial & Hypergeometric Distributions
1. Describe the shape of the Binomial distribution when p = 0.2.
2. What is the median value of X and the modal value when p = 0.2?
3. How would you calculate the mean of the distribution?
4. Calculate the mean and compare with the median and the mode and discuss the skewness.
5. Repeat for the cases when p = 0.4, 0.6, 0.8 and comment.
6. Give examples of situations where the Binomial distribution might arise in a forensic investigation.
7. Describe the shape of the Hypergeometric distribution when n = 20, m = 12, r = 6.
8. What are the median value of X and the modal value?
9. Calculate the mean and compare with the median and the mode and discuss the skewness.
10. Give examples of situations where the Hypergeometric distribution might arise in a forensic investigation.
Practical Two: The Poisson Distribution
1. What are the relationships between the following:
(i) The standard deviation and the variance of any distribution?
(ii) The mean and variance of the Poisson distribution?
(iii) Estimate the mean of the number of murders recorded in a police area per month.
2. Describe the shape of the distribution of the number of murders per month. What is the mean? What is the median? Does the number of murders look to be normally distributed? In what ways does the number of murders not satisfy the conditions necessary for a normal distribution?
3. Compare the expected frequencies with those observed (too good to be true!).
4. Now describe the shape of the distribution of murders in December and discuss whether a Poisson distribution fits these data.
5. Now describe the shape of the distribution of sexual offences and discuss whether a Poisson distribution fits these data.
6. Describe, in words, the situations where a random variable with a Poisson distribution might be appropriate. What assumptions are necessary and are they likely to hold true for the number of murders?
7. Discuss whether the following might have the characteristics of a Poisson distribution…
(i) Number of cot deaths per month per hospital area.
(ii) Number of cases of childhood cancers per area per year.
Practical Three: Categorical Data and Contingency Tables
Data Set One
1. Estimate the probability that a randomly selected offender is a member of a gang.
2. Estimate the probability that a randomly selected offender has carried a weapon.
3. Estimate the probability that a randomly selected offender has carried a weapon given he has never joined a gang and has no close friends in a gang.
4. Estimate the probability that a randomly selected offender has carried a weapon given he has never joined a gang.
5. Estimate the probability that a randomly selected offender has carried a weapon given he is a gang member.
6. Comment on the relationship between gang membership and weapon carrying using the bar chart.
7. Use the Chi Square Output to test association between gang and weapon behaviour.
Data Set Two
1. Estimate the probability that a randomly selected crime involved a white attacker and a white victim.
2. Estimate the probability that a randomly selected crime involved a white attacker.
3. Given that both attacker and victim were white, estimate the probability that a reported crime will involve a fatality.
4. If no injury was reported, what is the estimated probability that a crime involves a non-white attacker and a white victim?
5. If no injury was reported, what is the estimated probability that a crime involves a white attacker and a white victim?
6. Comment on the relationship between injury level and type of attacker/victim.
Practical Four: The Normal Distribution
Practical 5: Regression
Practical 6: Analysis of Variance
Practical 7: Survival analysis: Kaplan-Meier
Btrial
Kedney
HIV_azt
Practical 8: Survival analysis: Cox proportional hazards regression model
Practical 8: continued
Practical 1: Date: 08-02-2007
1. Shape of the binomial distribution when p = 0.2
[Figure: bar chart of Count (0.0 to 1.2) against x (0 to 15); cases weighted by pdf_.2]
1. The bar chart for the binomial distribution with p = 0.2 shows the count increasing from x = 0 to x = 5, where it reaches its highest peak; from x = 6 to x = 15 it is steady at a count of 1.0.
2. With p = 0.2 and n = 15, the median of X falls at the 8th position and the modal value is 5.
Median position = (n + 1)/2 = (15 + 1)/2 = 16/2 = 8th position, where the count is 1.0.
3. To calculate the mean of the distribution, read off the count at each value of x; for instance, when x = 0 the count is 0.05, when x = 1 the count is 0.3, and so on up to x = 15, where the count is 1.0. The counts are then summed and divided by n: mean = (C0 + C1 + ... + C(n-1) + Cn)/n.
4. Mean = (0.05 + 0.3 + 0.6 + 0.9 + 1.0 + 1.01 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1)/15 = 13.860/15 = 0.924
The bar graph above suggests that mean = 0.924, mode = 1.0 and median = 1.0. For comparison, the theoretical mean of a binomial distribution is np = 15 × 0.2 = 3, and with p < 0.5 the distribution is positively skewed.
5. p = 0.4
[Figure: bar chart of Count (0.0 to 1.2) against x (0 to 15); cases weighted by pdf_0.4]
The bar graph for p = 0.4 indicates that the mean is 10.351/15 = 0.690.
Median position = (n + 1)/2 = (15 + 1)/2 = 16/2 = 8th position, where the count is 1.0.
Mode = 1.0
The chart of pdf_0.4 shows a sudden rise from x = 1 to x = 9; from x = 9 onwards it becomes steady. The distribution is positively skewed.
p = 0.6
[Figure: bar chart of Count (0.0 to 0.5) against x (1 to 15); cases weighted by pdf_0.6]
Mean = 1.961/15 = 0.131; Median = 0.350; Mode = 0.1
The bar chart for p = 0.6 suggests that the binomial distribution is approximately symmetric.
6. Examples in a forensic investigation:
20 out of 50 people are selected on an accusation of murder; what is the probability that the accused are murderers? The probability of being a murderer is 0.25.
DNA samples were obtained to investigate whether or not a person smokes: n = 50, p = 0.75.
p = 0.8
[Figure: bar chart of Count (0.0 to 1.2) against x (1 to 15); cases weighted by PDF_0.8]
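Mean = 4.911/15 = 0.327, Median = 0.005, Mode = 14
The distribution for p = 0.8 is negatively skewed: with p > 0.5 most of the probability lies at high values of x, with the tail towards low values.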
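The binomial summaries discussed above for p = 0.2, 0.4, 0.6 and 0.8 can be cross-checked directly from the probability function. The following is a minimal sketch, assuming n = 15 as in the charts; the function names are illustrative and are not part of the SPSS workflow used in the practical. The median is taken here as the smallest x whose cumulative probability reaches 0.5.

```python
from math import comb


def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for X ~ Binomial(n, p)."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def summarise(n, p):
    """Return (mean, median, mode) of a Binomial(n, p) distribution."""
    pmf = binomial_pmf(n, p)
    # Mean: sum of each value multiplied by its probability (equals n * p).
    mean = sum(x * prob for x, prob in enumerate(pmf))
    # Median: smallest x with cumulative probability >= 0.5.
    cumulative = 0.0
    median = n
    for x, prob in enumerate(pmf):
        cumulative += prob
        if cumulative >= 0.5:
            median = x
            break
    # Mode: the value of x with the largest probability.
    mode = max(range(n + 1), key=lambda x: pmf[x])
    return mean, median, mode


for p in (0.2, 0.4, 0.6, 0.8):
    mean, median, mode = summarise(15, p)
    print(f"p = {p}: mean = {mean:.3f}, median = {median}, mode = {mode}")
```

Comparing the mean with the median and mode for each p gives a quick indication of the direction of skew.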
7. Hypergeometric distribution
[Figure: bar chart of Count (0.0 to 1.2) against x (1 to 6); cases weighted by Pdf_2]
The distribution is skewed. From x = 4 to x = 6 it becomes steady.
8. Median position = (6 + 1)/2 = 3.5, so the median is the average of the 3rd and 4th values: (0.33 + 0.35)/2 = 0.680/2 = 0.340. The mode is 1.00.
9. Mean = 1.333/6 = 0.222 is the mean value of X.

10. 100 people are arrested for robbery, of whom only 70 are the real thieves.
25 people were accused and DNA samples were obtained to check whether each had really taken drugs; 10 were caught red-handed.
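The hypergeometric answers above can be cross-checked from the probability function. The sketch below assumes one particular reading of the parameters in the question: a population of n = 20 items, of which m = 12 are of the type of interest, and a sample of r = 6 drawn without replacement. That reading is an assumption, and the function names are illustrative only.

```python
from math import comb


def hypergeometric_pmf(n, m, r):
    """Return [P(X = 0), ..., P(X = r)] where X counts items of interest
    in a sample of r drawn without replacement from n items, m of interest."""
    total = comb(n, r)
    return [comb(m, x) * comb(n - m, r - x) / total for x in range(r + 1)]


def summarise(pmf):
    """Return (mean, median, mode) for a pmf indexed from x = 0."""
    mean = sum(x * prob for x, prob in enumerate(pmf))
    cumulative = 0.0
    median = len(pmf) - 1
    for x, prob in enumerate(pmf):
        cumulative += prob
        if cumulative >= 0.5:  # smallest x with cumulative probability >= 0.5
            median = x
            break
    mode = max(range(len(pmf)), key=lambda x: pmf[x])
    return mean, median, mode


pmf = hypergeometric_pmf(20, 12, 6)  # assumed meaning of n, m, r
mean, median, mode = summarise(pmf)
print(f"mean = {mean:.3f}, median = {median}, mode = {mode}")
```

Under this reading the mean equals r × m / n, which gives a simple check on the value printed by the sketch.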
8. Different values of n, m, r etc.
[Figure: bar chart of Count (0.0 to 0.4) against x (1 to 6); cases weighted by Pdf]
[Figure: bar chart of Count (0.0 to 1.2) against x (0 to 6); cases weighted by Pdf_3]
PRACTICAL TWO WEEK 1
1) Relationships
i) The standard deviation and the variance

The standard deviation is the square root of the sample variance, where variance = σ × σ = σ².
ii) The mean and variance of the Poisson distribution

In general, Var(X) = E(X²) - {E(X)}². For a Poisson distribution the variance is equal to the mean.
iii) The mean of the number of murders recorded in a police area per month is 1.21.
Move number of police areas with X murders (murder1)

Descriptive Statistics
                    N    Minimum  Maximum  Mean  Std. Deviation  Variance
Number X            370  0        6        1.21  1.105           1.221
Valid N (listwise)  370
2) The mean of the number of murders per month is approximately 677/6 = 112.83, and the median position is (6 + 1)/2 = 3.5, so the median is the average of the 3rd and 4th values: (30 + 10)/2 = 20.
The numbers are not normally distributed: the bar graph suggests that the distribution is skewed, whereas a normal distribution is a bell-shaped curve in which the data divide equally on both sides of the centre.
[Figure: bar chart of Count (0 to 125) against Number X (0 to 6); cases weighted by number of police areas with X murders]
murder2

            N    Minimum  Maximum  Mean  Std. Deviation  Variance
Number X    370  0        11       1.22  1.905           3.629
3) The variance for the expected frequencies was 3.629; on the other hand, the observed variance was 1.221.
4) The distribution is skewed; however, it seems that the data do not fit a Poisson distribution, since the variance (3.629) is much larger than the mean (1.22).
[Figure: bar chart of Count (0 to 200) against Number X (0 to 11); cases weighted by number of police areas with X murders in December]

            N    Minimum  Maximum  Mean  Std. Deviation  Variance
Number X    369  0        13       4.79  2.194           4.814
5) The bar graph suggests that the distribution of the number of police areas reporting X sexual offences is symmetric, or approximately normally distributed. The mean (4.79) and variance (4.814) in the descriptive statistics are nearly equal, which is consistent with a Poisson distribution.
[Figure: bar chart of Count (0 to 60) against Number X (0 to 13); cases weighted by number of police areas with X reported sexual offences]
6) A Poisson distribution can be used where crimes happen over a given period of time, such as every week or every month, and could occur on any day. The assumptions would be, for example, that murders happen every month or every week in a particular area. The number of murders and their timing are unexpected, therefore it is not right to make these assumptions.
7) Characteristics of a Poisson distribution.
i) The number of cot deaths per month per hospital area could be described by a Poisson distribution. According to the Poisson distribution, the occurrences are counted over some given time. In this case the cot deaths are counted per month per hospital area.
ii) Number of cases of childhood cancers per area per year: yes, this will also have the characteristics of a Poisson distribution.
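One way to compare observed and expected frequencies in this practical is to compute the Poisson expected counts from the sample mean and to look at the ratio of variance to mean, which should be close to 1 for Poisson data. The sketch below uses the summary figures reported in the descriptive statistics (N, mean and variance); the function and variable names are illustrative, and the observed frequencies themselves are not reproduced here because they appear only in the charts.

```python
import math


def poisson_pmf(x, mean):
    """P(X = x) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**x / math.factorial(x)


# Expected number of police areas with x murders (murder1: N = 370, mean = 1.21).
n_areas = 370
mean_murders = 1.21
expected = [n_areas * poisson_pmf(x, mean_murders) for x in range(7)]
for x, freq in enumerate(expected):
    print(f"x = {x}: expected areas = {freq:.1f}")

# Variance-to-mean ratio for each data set (values from the SPSS output).
summaries = {
    "murders": (1.21, 1.221),
    "murders in December": (1.22, 3.629),
    "sexual offences": (4.79, 4.814),
}
dispersion = {name: var / mean for name, (mean, var) in summaries.items()}
for name, ratio in dispersion.items():
    print(f"{name}: variance / mean = {ratio:.2f}")
```

A ratio well above 1, as for the December data, indicates more spread than a Poisson model would produce.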
Practical 3 Week 2
1. Estimate probability that a randomly selected offender is a member of a gang.
(1230 + 1307 + 688)/7488 = 3225/7488 = 0.431
2. Estimate probability that a randomly selected offender has carried a weapon
1438/7488 = 0.192
3. Estimate probability that a randomly selected offender has carried a weapon given he has never
joined a gang and has no close friends in a gang.
255/2551 = 0.100
4. Estimate probability that a randomly selected offender has carried a weapon given he has never
joined a gang.
516/4263= 0.1210
5. Estimate probability that a randomly selected offender has carried a weapon given he is a gang
member.
1438/3225= 0.446
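The estimates above are all ratios of counts, so they can be re-checked in a few lines. This is a minimal sketch using exactly the numerators and denominators quoted in the answers; the dictionary keys are descriptive labels added for illustration, and the sketch does not verify that each count was read from the correct cell of the table.

```python
# (numerator, denominator) pairs as quoted in answers 1 to 5.
estimates = {
    "P(gang member)": (1230 + 1307 + 688, 7488),
    "P(carried weapon)": (1438, 7488),
    "P(weapon | never joined, no gang friends)": (255, 2551),
    "P(weapon | never joined)": (516, 4263),
    "P(weapon | gang member)": (1438, 3225),
}

probabilities = {label: num / den for label, (num, den) in estimates.items()}
for label, value in probabilities.items():
    print(f"{label} = {value:.3f}")
```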
[DataSet1] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-3\Gang.sav GANG

Case Processing Summary
                                      Cases
                            Valid          Missing        Total
                            N   Percent    N   Percent    N   Percent
WeaponBehaviour * GangType  42  100.0%     0   .0%        42  100.0%
6) Relationship between gang and weapon:

The bar graph shows that, among active gang members, the number who had not carried a weapon is equal to the number who had carried one. Likewise, among those who never joined a gang and had no friends in a gang, the number who had not carried a weapon is similar to the number who had carried weapons.
Chi-Square Tests
                              Value     df  Asymp. Sig. (2-sided)
Pearson Chi-Square            .000(a)   5   1.000
Likelihood Ratio              .000      5   1.000
Linear-by-Linear Association  .000      1   1.000
N of Valid Cases              42
a 8 cells (66.7%) have expected count less than 5. The minimum expected count is 1.00.
7) Association between gang and weapon behaviour

H0: There is no association between gang and weapon behaviour.
H1: There is an association between gang and weapon behaviour.
The p-value is 1.000, which is greater than 0.05, so we fail to reject H0 and conclude that there is no significant association between gang and weapon behaviour.
Dataset TWO
Notes
[DataSet2] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-3\Injury.sav
                                              Cases
                                    Valid          Missing        Total
                                    N   Percent    N   Percent    N   Percent
Attacker_Victim * Degree_of_Injury  40  100.0%     0   .0%        40  100.0%
Attacker_Victim * Degree_of_Injury Crosstabulation
                                                  Degree_of_Injury
Attacker_Victim                                   Fatal    Serious  Slight   None     Total
White/White          Count                        1        2        3        4        10
                     Expected Count               1.0      2.0      3.0      4.0      10.0
                     % within Degree_of_Injury    25.0%    25.0%    25.0%    25.0%    25.0%
                     % of Total                   2.5%     5.0%     7.5%     10.0%    25.0%
White/Non White      Count                        1        2        3        4        10
                     Expected Count               1.0      2.0      3.0      4.0      10.0
                     % within Degree_of_Injury    25.0%    25.0%    25.0%    25.0%    25.0%
                     % of Total                   2.5%     5.0%     7.5%     10.0%    25.0%
Non White/Non White  Count                        1        2        3        4        10
                     Expected Count               1.0      2.0      3.0      4.0      10.0
                     % within Degree_of_Injury    25.0%    25.0%    25.0%    25.0%    25.0%
                     % of Total                   2.5%     5.0%     7.5%     10.0%    25.0%
Non White/White      Count                        1        2        3        4        10
                     Expected Count               1.0      2.0      3.0      4.0      10.0
                     % within Degree_of_Injury    25.0%    25.0%    25.0%    25.0%    25.0%
                     % of Total                   2.5%     5.0%     7.5%     10.0%    25.0%
Total                Count                        4        8        12       16       40
                     Expected Count               4.0      8.0      12.0     16.0     40.0
                     % within Degree_of_Injury    100.0%   100.0%   100.0%   100.0%   100.0%
                     % of Total                   10.0%    20.0%    30.0%    40.0%    100.0%
Chi-Square Tests
                              Value     df  Asymp. Sig. (2-sided)
Pearson Chi-Square            .000(a)   9   1.000
Likelihood Ratio              .000      9   1.000
Linear-by-Linear Association  .000      1   1.000
N of Valid Cases              40
a 16 cells (100.0%) have expected count less than 5. The minimum expected count is 1.00.
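The Pearson chi-square value in the output above can be reproduced by hand from the crosstabulation counts. The sketch below is a plain-Python illustration of the calculation (expected count = row total × column total / grand total); it is not how SPSS is invoked, and the function name is illustrative.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Counts from the Attacker_Victim * Degree_of_Injury crosstabulation:
# columns are Fatal, Serious, Slight, None; one row per attacker/victim type.
observed_counts = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
statistic, df = pearson_chi_square(observed_counts)
print(f"chi-square = {statistic:.3f}, df = {df}")
```

Because every observed count equals its expected count in this table, the statistic is zero, which corresponds to the significance value of 1.000 in the output.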
1. Estimate the probability that a randomly selected crime involved a white attacker and a white victim.
6521/11717 = 0.557
2. Estimate the probability that a randomly selected crime involved a white attacker.
716/1171 = 0.611
3. Given that both attacker and victim were white, estimate the probability that a reported crime will involve a fatality.
183/237 = 0.7722
4. If no injury was reported, what is the estimated probability that a crime involves a non-white attacker and a white victim?
1801/3616 = 0.4981
5. If no injury was reported, what is the estimated probability that a crime involves a white attacker and a white victim?
1422/6521 = 0.2181
[Figure: bar chart of Count (0 to 4) against Attacker_Victim (White/White, White/Non White, Non White/Non White, Non White/White), with bars for Degree_of_Injury: Fatal, Serious, Slight, None]
6) The bar graph clearly indicates that the pattern of injury levels is the same for every attacker/victim type. For instance, the counts for white attackers with white victims and for white attackers with non-white victims are equal.
Practical -4 Week 2
Explore
[DataSet1] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-4\measurements.sav TASK 4
                Cases
      Valid          Missing        Total
      N   Percent    N   Percent    N   Percent
shoe  83  100.0%     0   .0%        83  100.0%
Descriptives
                                                  Statistic  Std. Error
shoe  Mean                                        6.596      .2178
      95% Confidence Interval for Mean
          Lower Bound                             6.163
          Upper Bound                             7.030
      5% Trimmed Mean                             6.516
      Median                                      6.000
      Variance                                    3.936
      Std. Deviation                              1.9839
      Minimum                                     2.0
      Maximum                                     12.0
      Range                                       10.0
      Interquartile Range                         3.0
      Skewness                                    .706       .264
      Kurtosis                                    .329       .523
Tests of Normality
      Kolmogorov-Smirnov(a)        Shapiro-Wilk
      Statistic  df  Sig.          Statistic  df  Sig.
shoe  .209       83  .000          .929       83  .000
a Lilliefors Significance Correction
Both tests of normality give Sig. = .000, which is below 0.05, so they indicate that shoe size is not normally distributed. The mean is 6.596, the variance is 3.936 and the S.D. is 1.9839.
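The standard error and confidence interval in the Descriptives output can be reproduced from the mean, standard deviation and sample size. The sketch below is illustrative only; the critical value 1.989 is an assumed approximation to the 97.5th percentile of the t distribution with 82 degrees of freedom and is hard-coded rather than computed.

```python
import math

# Values taken from the SPSS Descriptives output for shoe size.
n = 83
mean = 6.596
std_dev = 1.9839

# Assumption: approximate two-sided 95% t critical value for df = n - 1 = 82.
t_critical = 1.989

std_error = std_dev / math.sqrt(n)
lower = mean - t_critical * std_error
upper = mean + t_critical * std_error

print(f"standard error = {std_error:.4f}")
print(f"95% CI for the mean = ({lower:.3f}, {upper:.3f})")
```

Small differences in the last decimal place from the SPSS figures are expected, because the inputs here are already rounded.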
[Figure: histogram of Frequency (0 to 25) against shoe (2.0 to 12.0); Mean = 6.596, Std. Dev. = 1.9839, N = 83]
shoe Stem-and-Leaf Plot
Frequency Stem & Leaf
.00 0 .
2.00 0 . 23
26.00 0 . 44445555555555555555555555
32.00 0 . 66666666666666666666667777777777
16.00 0 . 8888888899999999
6.00 1 . 001111
1.00 1 . 2
Stem width: 10.0

Each leaf: 1 case(s)
[Figure: Normal Q-Q Plot of shoe; Expected Normal (-2.5 to 2.5) against Observed Value (2 to 12)]
[Figure: Detrended Normal Q-Q Plot of shoe; Dev from Normal (-0.25 to 0.50) against Observed Value (2 to 12)]
[Figure: box and whisker plot of shoe (2.0 to 12.0)]
The box and whisker plot shows that shoe size has no outliers and a median of 6. The minimum is 2 and the maximum is 12, with an interquartile range of 3.0.
[DataSet1] H:\2nd Year\MA2012N\Week1-2\Week-2\Prac-4\measurements.sav TASK 5
                                  Cases
                       Valid          Missing        Total
               gender  N   Percent    N   Percent    N   Percent
height in cms  male    23  100.0%     0   .0%        23  100.0%
               female  60  100.0%     0   .0%        60  100.0%
Descriptives
               gender                                        Statistic  Std. Error
height in cms  male    Mean                                  178.696    1.3766
                       95% Confidence Interval for Mean
                           Upper Bound                       181.551
                       Median                                178.000
                       Variance                              43.585
                       Minimum                               170.0
                       Maximum                               190.0
                       Range                                 20.0
                       Skewness                              .223       .481
                       Kurtosis                              -1.201     .935
               female  Mean                                  165.667    .9429
                       95% Confidence Interval for Mean
                           Upper Bound                       167.553
                       Median                                166.000
                       Variance                              53.345
                       Minimum                               150.0
                       Maximum                               180.0
                       Range                                 30.0
                       Skewness                              .209       .309
                       Kurtosis                              -.596      .608
Tests of Normality
                       Kolmogorov-Smirnov(a)        Shapiro-Wilk
               gender  Statistic  df  Sig.          Statistic  df  Sig.
height in cms  male    .113       23  .200(*)       .927       23  .094
               female  .114       60  .049          .969       60  .136
* This is a lower bound of the true significance.
a Lilliefors Significance Correction
[Figure: histogram for gender = male; Frequency (0 to 5) against height in cms (171.0 to 189.0); Mean = 178.696, Std. Dev. = 6.6019, N = 23]
[Figure: histogram for gender = female; Frequency (0 to 12) against height in cms (150.0 to 180.0); Mean = 165.667, Std. Dev. = 7.3037, N = 60]
Stem-and-Leaf Plots
height in cms Stem-and-Leaf Plot for
gender= male
8.00 17 . 00002344
4.00 17 . 5778
6.00 18 . 001224
4.00 18 . 7888
1.00 19 . 0
Stem width: 10.0

height in cms Stem-and-Leaf Plot for

gender= female
2.00 15 . 02
9.00 15 . 557778899
16.00 16 . 0000000001123334
13.00 16 . 5566667778889
11.00 17 . 00000001114
7.00 17 . 5777889
2.00 18 . 00
Stem width: 10.0

Normal Q-Q Plots
[Figure: Normal Q-Q Plot of height in cms for gender = male; Expected Normal against Observed Value (170 to 195)]
[Figure: Normal Q-Q Plot of height in cms for gender = female; Expected Normal against Observed Value (150 to 185)]
Detrended Normal Q-Q Plots
[Figure: Detrended Normal Q-Q Plot of height in cms for gender = male; Dev from Normal against Observed Value (171 to 189)]
[Figure: Detrended Normal Q-Q Plot of height in cms for gender = female; Dev from Normal against Observed Value (150 to 180)]
[Figure: box and whisker plot of height in cms by gender (male, female)]
The box and whisker plot shows that the males are taller than the females: males had a median height of about 178 cm, whereas females had a median height of approximately 166 cm.
Practical 5 Week 3
[Figure: scatterplot matrix of AGE, HEIGHT, WEIGHT, BMP, FEV1, RV, FRC, TLC and PEMAX, with points marked by SEX (0, 1)]
[Figure: scatterplot of FEV1 (2.50 to 5.50) against HEIGHT (160.00 to 185.00)]
Practical 5 Week3 b)
Regression
[DataSet1] H:\2nd Year\MA2012N\Week3\week-3.sav
Variables Entered/Removed(b)
Variables Variables
Model Entered Removed Method
1 HEIGHT(a) . Enter
a All requested variables entered.
b Dependent Variable: FEV1
Model Summary(b)
                                                                   Change Statistics
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .574(a)  .330      .293               .58792                      .330             8.858     1    18   .00
a Predictors: (Constant), HEIGHT
ANOVA(b)
Model  Sum of Squares  df  Mean Square  F  Sig.
1 Regression 3.062 1 3.062 8.858 .008(a)
Residual 6.222 18 .346
Total 9.283 19
a Predictors: (Constant), HEIGHT
Coefficients(a)
                Unstandardized Coefficients  Standardized Coefficients                  95% Confidence Interval for B
Model           B        Std. Error          Beta                       t       Sig.    Lower Bound  Upper Bound
1  (Constant)   -8.986   4.315                                          -2.082  .052    -18.051      .080
   HEIGHT       .073     .025                .574                       2.976   .008    .022         .125
a Dependent Variable: FEV1
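The figures in the regression output above are linked by simple arithmetic: R Square is the regression sum of squares divided by the total, F is the ratio of the two mean squares, and the coefficients define the fitted line. The sketch below re-derives these from the printed values; the height of 170 used in the example prediction is a hypothetical input chosen for illustration, and results are only as precise as the rounded SPSS figures.

```python
# Sums of squares and degrees of freedom from the ANOVA table (FEV1 on HEIGHT).
ss_regression, df_regression = 3.062, 1
ss_residual, df_residual = 6.222, 18

# Coefficients from the Coefficients table (rounded as printed).
intercept = -8.986
slope_height = 0.073

r_squared = ss_regression / (ss_regression + ss_residual)
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)


def predict_fev1(height):
    """Fitted FEV1 for a given HEIGHT using the printed coefficients."""
    return intercept + slope_height * height


print(f"R squared = {r_squared:.3f}")
print(f"F = {f_statistic:.3f}")
print(f"predicted FEV1 at HEIGHT 170 (hypothetical) = {predict_fev1(170):.3f}")
```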
Residuals Statistics(a)
Minimum Maximum Mean Std. Deviation N

Predicted Value 3.0562 4.5027 3.8510 .40143 20
Std. Predicted Value -1.980 1.623 .000 1.000 20
Standard Error of Predicted Value .133 .298 .180 .049 20
Adjusted Predicted Value 2.8894 4.4847 3.8396 .41374 20

Residual -1.10413 1.41930 .00000 .57224 20
Std. Residual -1.878 2.414 .000 .973 20
Stud. Residual -1.945 2.488 .009 1.014 20
Deleted Residual -1.18437 1.50721 .01139 .62132 20
Stud. Deleted Residual -2.127 2.985 .019 1.105 20
Mahal. Distance .023 3.920 .950 1.115 20
Cook's Distance .000 .192 .043 .057 20
Centered Leverage Value .001 .206 .050 .059 20
a Dependent Variable: FEV1
Charts
[Figure: histogram of Regression Standardized Residual; Dependent Variable: FEV1; Mean = 1.01E-15, Std. Dev. = 0.973, N = 20]
[Figure: Normal P-P Plot of Regression Standardized Residual; Expected Cum Prob against Observed Cum Prob]
[Figure: scatterplot of Regression Studentized Deleted (Press) Residual against Regression Standardized Predicted Value]

Practical 5 Week 3 / 2a-b)
Graph
[DataSet2] H:\2nd Year\MA2012N\Week3\cystic.sav
[Figure: scatterplot matrix of AGE, HEIGHT, WEIGHT, BMP, FEV1, RV, FRC, TLC and PEMAX, with points marked by SEX (0, 1)]
Regression (Stepwise Data modelling)

Notes
Variables Entered/Removed(a)
Model  Variables Entered  Variables Removed  Method
1      WEIGHT             .                  Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
a Dependent Variable: PEMAX
Model Summary(b)
                                                                   Change Statistics
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .635(a)  .404      .378               26.380                      .404             15.559    1    23   .00
a Predictors: (Constant), WEIGHT
b Dependent Variable: PEMAX
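ANOVA(b)
Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  10827.159       1   10827.159    15.559  .001(a)
   Residual    16005.481       23  695.890
   Total       26832.640       24
Coefficients(a)
                Unstandardized Coefficients  Standardized Coefficients                  95% Confidence Interval for B
Model           B        Std. Error          Beta                       t       Sig.    Lower Bound  Upper Bound
1  (Constant)   63.546   12.702                                         5.003   .000    37.270       89.821
   WEIGHT       1.187    .301                .635                       3.944   .001    .564         1.809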
Excluded Variables(b)
Model  Variable  Beta In  t  Sig.  Partial Correlation  Collinearity Statistics: Tolerance
1 AGE .212(a) .549 .588 .116 .179
HEIGHT .094(a) .224 .825 .048 .152
SEX -.174(a) -1.063 .299 -.221 .964
BMP -.361(a) -1.729 .098 -.346 .548
FEV1 .211(a) 1.179 .251 .244 .799
RV .129(a) .620 .542 .131 .614
FRC -.041(a) -.194 .848 -.041 .619
TLC .102(a) .567 .576 .120 .825
a Predictors in the Model: (Constant), WEIGHT

Residuals Statistics: Minimum, Maximum, Mean, Std. Deviation, N
Predicted Value 78.85 151.12 109.12 21.240 25
Standard Error of Predicted Value 5.288 11.884 7.212 1.952 25
Adjusted Predicted Value 76.85 147.59 108.62 20.773 25
Residual -44.305 48.408 .000 25.824 25
Std. Residual -1.680 1.835 .000 .979 25
Stud. Residual -1.733 1.954 .009 1.021 25
Deleted Residual -47.198 57.007 .505 28.144 25
Mahal. Distance .005 3.911 .960 1.090 25
Cook's Distance .000 .426 .046 .085 25
Charts
[Figure: histogram of Regression Standardized Residual; Dependent Variable: PEMAX; Mean = 2.78E-17, Std. Dev. = 0.979, N = 25]
[Figure: Normal P-P Plot; Expected Cum Prob against Observed Cum Prob]
[Figure: scatterplot of residuals (-2 to 2) against standardized predicted value (-2 to 2)]
Regression (Backward Data Modelling)
Variables Entered/Removed(b)
Model  Variables Entered                                      Variables Removed  Method
1      WEIGHT, SEX, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE(a)   .                  Enter
2      .                                                      SEX                Backward (criterion: Probability of F-to-remove >= .100).
3      .                                                      TLC                Backward (criterion: Probability of F-to-remove >= .100).
4      .                                                      FRC                Backward (criterion: Probability of F-to-remove >= .100).
5      .                                                      AGE                Backward (criterion: Probability of F-to-remove >= .100).
6      .                                                      HEIGHT             Backward (criterion: Probability of F-to-remove >= .100).
7      .                                                      RV                 Backward (criterion: Probability of F-to-remove >= .100).
a All requested variables entered.

Model Summary(h)
Model  R  R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1 .798(a) .637 .420 25.471 .637 2.929 9 15 .03
2 .797(b) .636 .454 24.710 -.001 .058 1 15 .81
3 .795(c) .632 .480 24.114 -.004 .190 1 16 .66
4 .792(d) .627 .502 23.592 -.005 .229 1 17 .63
5 .788(e) .621 .522 23.128 -.005 .261 1 18 .61
6 .784(f) .614 .537 22.754 -.007 .357 1 19 .55
7 .755(g) .570 .509 23.440 -.044 2.286 1 20 .14
a Predictors: (Constant), WEIGHT, SEX, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
b Predictors: (Constant), WEIGHT, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
c Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, FRC, AGE
d Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, AGE
e Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT
f Predictors: (Constant), WEIGHT, BMP, RV, FEV1
g Predictors: (Constant), WEIGHT, BMP, FEV1
h Dependent Variable: PEMAX
ANOVA(h)
Model  Sum of Squares  df  Mean Square  F  Sig.
1 Regression 17101.390 9 1900.154 2.929 .032(a)
Residual 9731.250 15 648.750
Total 26832.640 24
2 Regression 17063.488 8 2132.936 3.493 .016(b)
Residual 9769.152 16 610.572
Total 26832.640 24
3 Regression 16947.546 7 2421.078 4.164 .008(c)
Residual 9885.094 17 581.476
Total 26832.640 24
4 Regression 16814.390 6 2802.398 5.035 .003(d)
Residual 10018.250 18 556.569
Total 26832.640 24
5 Regression 16669.053 5 3333.811 6.232 .001(e)
Residual 10163.587 19 534.926
Total 26832.640 24
6 Regression 16478.040 4 4119.510 7.957 .001(f)
Residual 10354.600 20 517.730
Total 26832.640 24
7 Regression 15294.452 3 5098.151 9.279 .000(g)
Residual 11538.188 21 549.438
Total 26832.640 24
a Predictors: (Constant), WEIGHT, SEX, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
b Predictors: (Constant), WEIGHT, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
c Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, FRC, AGE
d Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, AGE
e Predictors: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT
f Predictors: (Constant), WEIGHT, BMP, RV, FEV1
g Predictors: (Constant), WEIGHT, BMP, FEV1
h Dependent Variable: PEMAX
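The R Square and Adjusted R Square columns of the backward-elimination Model Summary can be recovered from the ANOVA sums of squares above. The sketch below is an illustrative check using the printed values, with 25 cases (total df of 24 plus 1) and the number of predictors in each model; the variable and function names are not from SPSS.

```python
n_cases = 25
ss_total = 26832.640

# (model number, regression sum of squares, number of predictors)
models = [
    (1, 17101.390, 9),
    (2, 17063.488, 8),
    (3, 16947.546, 7),
    (4, 16814.390, 6),
    (5, 16669.053, 5),
    (6, 16478.040, 4),
    (7, 15294.452, 3),
]


def adjusted_r_squared(r2, n, k):
    """Adjusted R squared for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


results = {}
for model, ss_regression, k in models:
    r2 = ss_regression / ss_total
    results[model] = (r2, adjusted_r_squared(r2, n_cases, k))
    print(f"model {model}: R2 = {r2:.3f}, adjusted R2 = {results[model][1]:.3f}")
```

Adjusted R Square rises as weak predictors are dropped and falls only at the last step, which matches the pattern in the Model Summary.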
Coefficients(a)
Model  Variable  B  Std. Error  Beta  t  Sig.  95% Confidence Interval for B (Lower Bound, Upper Bound)
1 (Constant) 176.058 225.891 .779 .448 -305.417 657.534
AGE -2.542 4.802 -.385 -.529 .604 -12.777 7.693
HEIGHT -.446 .903 -.287 -.494 .628 -2.372 1.479
SEX -3.737 15.460 -.057 -.242 .812 -36.689 29.215
BMP -1.745 1.155 -.627 -1.510 .152 -4.207 .717
FEV1 1.081 1.081 .362 1.000 .333 -1.223 3.385
RV .197 .196 .507 1.004 .331 -.221 .615
FRC -.308 .492 -.403 -.626 .540 -1.358 .741
TLC .189 .500 .096 .377 .711 -.877 1.254
WEIGHT 2.993 2.008 1.602 1.490 .157 -1.287 7.273
2 (Constant) 153.039 198.715 .770 .452 -268.218 574.295
AGE -2.115 4.331 -.320 -.488 .632 -11.295 7.066
HEIGHT -.395 .852 -.254 -.464 .649 -2.200 1.411
BMP -1.742 1.121 -.625 -1.554 .140 -4.117 .634
FEV1 1.265 .743 .424 1.703 .108 -.310 2.840
RV .178 .174 .458 1.021 .323 -.192 .547
FRC -.248 .412 -.325 -.602 .555 -1.122 .626
TLC .208 .478 .106 .436 .669 -.805 1.222
WEIGHT 2.835 1.842 1.517 1.539 .143 -1.070 6.740
3 (Constant) 198.294 165.331 1.199 .247 -150.524 547.112
AGE -2.663 4.044 -.403 -.659 .519 -11.195 5.869
HEIGHT -.490 .804 -.315 -.609 .550 -2.185 1.206
BMP -1.963 .975 -.705 -2.012 .060 -4.020 .095
FEV1 1.248 .724 .418 1.724 .103 -.280 2.775
RV .160 .165 .411 .967 .347 -.189 .508
FRC -.176 .369 -.231 -.479 .638 -.954 .602
WEIGHT 3.156 1.648 1.689 1.915 .072 -.321 6.632
4 (Constant) 166.905 148.476 1.124 .276 -145.032 478.842
AGE -1.819 3.560 -.275 -.511 .616 -9.299 5.661
HEIGHT -.410 .769 -.264 -.533 .600 -2.026 1.206
BMP -1.949 .954 -.700 -2.043 .056 -3.953 .055
FEV1 1.412 .624 .473 2.263 .036 .101 2.723
RV .096 .095 .246 1.010 .326 -.103 .294
WEIGHT 2.874 1.506 1.539 1.908 .072 -.290 6.039
5 (Constant) 137.096 133.856 1.024 .319 -143.068 417.259
HEIGHT -.449 .751 -.288 -.598 .557 -2.020 1.122
BMP -1.641 .725 -.589 -2.265 .035 -3.158 -.124
FEV1 1.472 .601 .493 2.450 .024 .214 2.729
RV .110 .088 .283 1.245 .228 -.075 .295
WEIGHT 2.339 1.060 1.252 2.206 .040 .120 4.557
6 (Constant) 63.947 53.277 1.200 .244 -47.187 175.080
BMP -1.377 .565 -.494 -2.436 .024 -2.557 -.198
FEV1 1.548 .578 .518 2.679 .014 .343 2.753
RV .126 .083 .323 1.512 .146 -.048 .299
WEIGHT 1.749 .381 .936 4.595 .000 .955 2.543
7 (Constant) 126.334 34.720 3.639 .002 54.130 198.537
BMP -1.465 .579 -.526 -2.530 .019 -2.670 -.261
FEV1 1.109 .514 .371 2.155 .043 .039 2.178
WEIGHT 1.536 .364 .822 4.216 .000 .779 2.294
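The final backward model (model 7) keeps BMP, FEV1 and WEIGHT, so its fitted equation can be written out from the B column. The sketch below does this with the printed, rounded coefficients; the patient values passed to the function are hypothetical and are chosen only to show the arithmetic.

```python
# Unstandardized coefficients (B) for model 7, as printed in the output.
coefficients = {
    "constant": 126.334,
    "BMP": -1.465,
    "FEV1": 1.109,
    "WEIGHT": 1.536,
}


def predict_pemax(bmp, fev1, weight):
    """Fitted PEMAX from the model 7 coefficients."""
    return (
        coefficients["constant"]
        + coefficients["BMP"] * bmp
        + coefficients["FEV1"] * fev1
        + coefficients["WEIGHT"] * weight
    )


# Hypothetical input values, for illustration only.
example = predict_pemax(bmp=70, fev1=30, weight=40)
print(f"predicted PEMAX = {example:.3f}")
```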
Excluded Variables(g)
Model  Variable  Beta In  t  Sig.  Partial Correlation  Collinearity Statistics: Tolerance
2 SEX -.057(a) -.242 .812 -.062 .441
3 SEX -.071(b) -.316 .756 -.079 .453
TLC .106(b) .436 .669 .108 .386
4 SEX .002(c) .009 .993 .002 .676
TLC .047(c) .217 .831 .052 .460
FRC -.231(c) -.479 .638 -.115 .093
5 SEX .006(d) .033 .974 .008 .677
TLC .081(d) .428 .674 .100 .576
FRC -.092(d) -.216 .831 -.051 .115
AGE -.275(d) -.511 .616 -.120 .071
6 SEX .011(e) .066 .948 .015 .679
TLC .107(e) .607 .551 .138 .644
FRC -.032(e) -.079 .938 -.018 .121
AGE -.303(e) -.577 .571 -.131 .072
HEIGHT -.288(e) -.598 .557 -.136 .086
7 SEX -.012(f) -.068 .947 -.015 .685
TLC .182(f) 1.100 .284 .239 .743
FRC .266(f) 1.201 .244 .259 .410
AGE -.517(f) -1.033 .314 -.225 .081
HEIGHT -.466(f) -.996 .331 -.217 .094
RV .323(f) 1.512 .146 .320 .422
a Predictors in the Model: (Constant), WEIGHT, TLC, BMP, RV, FEV1, HEIGHT, FRC, AGE
b Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, FRC, AGE
c Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT, AGE
d Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1, HEIGHT
e Predictors in the Model: (Constant), WEIGHT, BMP, RV, FEV1
f Predictors in the Model: (Constant), WEIGHT, BMP, FEV1
g Dependent Variable: PEMAX

Residuals Statistics: Minimum, Maximum, Mean, Std. Deviation, N
Predicted Value 71.97 160.77 109.12 25.244 25
Standard Error of Predicted Value 6.062 15.174 9.160 2.040 25
Residual -42.388 40.373 .000 21.926 25
Std. Residual -1.808 1.722 .000 .935 25
Stud. Residual -1.962 1.937 .012 1.014 25
Mahal. Distance .645 9.098 2.880 1.824 25
Cook's Distance .001 .249 .045 .059 25
Charts
[Figure: histogram of standardized residuals; Mean = -5.55E-16, Std. Dev. = 0.935, N = 25]
[Figure: Normal P-P Plot; Expected Cum Prob against Observed Cum Prob]
[Figure: scatterplot of residuals (-2.5 to 2.5) against predicted value (-1 to 2)]
Regression (Forward Data Modelling)

Notes
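Variables Entered/Removed(a)
Model  Variables Entered  Variables Removed  Method
1      WEIGHT             .                  Forward (Criterion: Probability-of-F-to-enter <= .050)
Model Summary(b)
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .635(a)  .404      .378               26.380                      .404             15.559    1    23   .00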
ANOVA(b)
Model  Sum of Squares  df  Mean Square  F  Sig.
1 Regression 10827.159 1 10827.159 15.559 .001(a)
Residual 16005.481 23 695.890
Total 26832.640 24
Coefficients(a)
Model  Variable  B  Std. Error  Beta  t  Sig.  95% Confidence Interval for B (Lower Bound, Upper Bound)
1 (Constant) 63.546 12.702 5.003 .000 37.270 89.821
WEIGHT 1.187 .301 .635 3.944 .001 .564 1.809
Excluded Variables(b)
Model  Variable  Beta In  t  Sig.  Partial Correlation  Collinearity Statistics: Tolerance
1 AGE .212(a) .549 .588 .116 .179
HEIGHT .094(a) .224 .825 .048 .152
SEX -.174(a) -1.063 .299 -.221 .964
BMP -.361(a) -1.729 .098 -.346 .548
FEV1 .211(a) 1.179 .251 .244 .799
RV .129(a) .620 .542 .131 .614
FRC -.041(a) -.194 .848 -.041 .619
TLC .102(a) .567 .576 .120 .825
a Predictors in the Model: (Constant), WEIGHT

Residuals Statistics: Minimum, Maximum, Mean, Std. Deviation, N
Predicted Value 78.85 151.12 109.12 21.240 25
Standard Error of Predicted Value 5.288 11.884 7.212 1.952 25
Residual -44.305 48.408 .000 25.824 25
Std. Residual -1.680 1.835 .000 .979 25
Stud. Residual -1.733 1.954 .009 1.021 25
Mahal. Distance .005 3.911 .960 1.090 25
Cook's Distance .000 .426 .046 .085 25
Charts
[Figure: histogram of standardized residuals; Mean = 2.78E-17, Std. Dev. = 0.979, N = 25]
[Figure: Normal P-P Plot; Expected Cum Prob against Observed Cum Prob]
[Figure: scatterplot of Regression Studentized Deleted (Press) Residual; Dependent Variable: PEMAX]
Practical 6 Week 4
Explore
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_bi)
Group coding: 1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical
                                         Cases
                               Valid          Missing        Total
          Group                N   Percent    N   Percent    N   Percent
cortisol  control              31  100.0%     0   .0%        31  100.0%
          major depression     14  100.0%     0   .0%        14  100.0%
          bipolar depression   8   100.0%     0   .0%        8   100.0%
          schizophrenia        14  100.0%     0   .0%        14  100.0%
          atypical             4   100.0%     0   .0%        4   100.0%
[Figure: boxplot of cortisol (0.0 to 25.0) by group (control, major depression, bipolar depression, schizophrenia, atypical); labelled outlying cases 67, 31, 53, 66]
Graph
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Q1_bii
[Figure: plot of cortisol (0.0 to 25.0) against group]
Graph
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Q1_biii)
[Figure: error bar chart of 95% CI cortisol (-5.0 to 20.0) by group (1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical)]
Oneway
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav Prac-6 ci)

Test of Homogeneity of Variances
cortisol
Levene Statistic  df1  df2  Sig.
16.121 4 66 .000
ANOVA
cortisol
Source  Sum of Squares  df  Mean Square  F  Sig.
Between Groups 1405.594 4 351.399 21.588 .000
Within Groups 1074.295 66 16.277
Total 2479.889 70
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Dii)
Group coding: 1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical
                                          Cases
                                Valid          Missing        Total
           Group                N   Percent    N   Percent    N   Percent
LCORTISOL  control              31  100.0%     0   .0%        31  100.0%
           major depression     14  100.0%     0   .0%        14  100.0%
           bipolar depression   8   100.0%     0   .0%        8   100.0%
           schizophrenia        14  100.0%     0   .0%        14  100.0%
           atypical             4   100.0%     0   .0%        4   100.0%
[Figure: boxplot of LCORTISOL (-1.00 to 3.00) by group; labelled outlying cases 67, 31]
Graph
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC_6_Dii)
[Figure: error bar chart of 95% CI LCORTISOL by group]
[Figure: plot of LCORTISOL (-1.00 to 3.00) against group]
Oneway
[DataSet1] H:\2nd Year\MA2012N\Week4\cortisol.sav PRC-6_Diii
Test of Homogeneity of Variances
LCORTISOL
Levene Statistic  df1  df2  Sig.
2.507 4 66 .050
ANOVA
LCORTISOL
Source  Sum of Squares  df  Mean Square  F  Sig.
Between Groups 32.019 4 8.005 16.003 .000
Within Groups 33.014 66 .500
Total 65.032 70
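The F ratios in the two one-way ANOVA tables (raw cortisol and LCORTISOL) follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them from the printed values as a check; the function name is illustrative, and the p-values are not recomputed here.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values from the ANOVA tables in the SPSS output.
cortisol = anova_f(1405.594, 4, 1074.295, 66)
lcortisol = anova_f(32.019, 4, 33.014, 66)

print(f"cortisol:  MSB = {cortisol[0]:.3f}, MSW = {cortisol[1]:.3f}, F = {cortisol[2]:.3f}")
print(f"LCORTISOL: MSB = {lcortisol[0]:.3f}, MSW = {lcortisol[1]:.3f}, F = {lcortisol[2]:.3f}")
```

The Levene statistic drops from 16.121 for cortisol to 2.507 for LCORTISOL in the output, so the variances are more nearly equal on the LCORTISOL scale, where the equal-variance assumption behind the F test is more reasonable.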
Post Hoc Tests

Multiple Comparisons
Dependent Variable: LCORTISOL
Dunnett t (2-sided)(a)
Group coding: 1=control, 2=major depression, 3=bipolar depression, 4=schizophrenia, 5=atypical
                                                                             95% Confidence Interval
(I) group            (J) group  Mean Difference (I-J)  Std. Error  Sig.   Lower Bound  Upper Bound
major depression     control    1.57109(*)             .22774      .000   .9911        2.1511
bipolar depression   control    -.14395                .28047      .973   -.8582       .5703
schizophrenia        control    -.26084                .22774      .671   -.8408       .3192
atypical             control    -.19067                .37575      .974   -1.1476      .7663
* The mean difference is significant at the .05 level.
a Dunnett t-tests treat one group as a control, and compare all other groups against it.
Practical 7: Survival analysis: Kaplan-Meier
Kaplan-Meier
[DataSet1] H:\2nd Year\MA2012N\Week5\btrial\btrial.sav PRC-7 1c)
                                     Censored
ihc_type   Total N   N of Events   N     Percent
negative   36        16            20    55.6%
positive   9         8             1     11.1%
Overall    45        24            21    46.7%
Survival Table
Columns: ihc_type, case number, Time, Status, Cumulative Proportion Surviving at the Time (Estimate, Std. Error), N of Cumulative Events, N of Remaining Cases
negative 1 19.000 dead .972 .027 1 35
2 25.000 dead .944 .038 2 34
3 30.000 dead .917 .046 3 33
4 34.000 dead .889 .052 4 32
5 37.000 dead .861 .058 5 31
6 46.000 dead .833 .062 6 30
7 47.000 dead .806 .066 7 29
8 51.000 dead .778 .069 8 28
9 56.000 dead .750 .072 9 27
10 57.000 dead .722 .075 10 26
11 61.000 dead .694 .077 11 25
12 66.000 dead .667 .079 12 24
13 67.000 dead .639 .080 13 23
14 74.000 dead .611 .081 14 22
15 78.000 dead .583 .082 15 21
16 86.000 dead .556 .083 16 20
17 122.000 alive . . 16 19
18 123.000 alive . . 16 18
19 130.000 alive . . 16 17
20 130.000 alive . . 16 16
21 133.000 alive . . 16 15
22 134.000 alive . . 16 14
23 136.000 alive . . 16 13
24 141.000 alive . . 16 12
25 143.000 alive . . 16 11
26 148.000 alive . . 16 10
27 151.000 alive . . 16 9
28 152.000 alive . . 16 8
29 153.000 alive . . 16 7
30 154.000 alive . . 16 6
31 156.000 alive . . 16 5
32 162.000 alive . . 16 4
33 164.000 alive . . 16 3
34 165.000 alive . . 16 2
35 182.000 alive . . 16 1
36 189.000 alive . . 16 0
positive 1 22.000 dead .889 .105 1 8
2 23.000 dead .778 .139 2 7
3 38.000 dead .667 .157 3 6
4 42.000 dead .556 .166 4 5
5 73.000 dead .444 .166 5 4
6 77.000 dead .333 .157 6 3
7 89.000 dead .222 .139 7 2
8 115.000 dead .111 .105 8 1
9 144.000 alive . . 8 0
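The estimates in the survival table come from the Kaplan-Meier (product-limit) method: at each death time the current survival proportion is multiplied by (1 - deaths / number at risk). The sketch below is a minimal Python version, not SPSS's own code, applied to the positive group above. The standard error uses Greenwood's formula, which is an assumption about how SPSS computes it, although it reproduces the printed values here.

```python
import math

def kaplan_meier(times, events):
    """Product-limit estimate.
    events[i] is True for a death and False for a censored case.
    Returns (time, survival, std_error) at each distinct death time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, greenwood, out = 1.0, 0.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for (u, e) in data if u == t]
        deaths = sum(same_time)
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            if n_at_risk > deaths:
                greenwood += deaths / (n_at_risk * (n_at_risk - deaths))
            out.append((t, surv, surv * math.sqrt(greenwood)))
        n_at_risk -= len(same_time)   # deaths and censored cases both leave the risk set
        i += len(same_time)
    return out

# Positive ihc_type group, copied from the survival table (144 is censored).
times = [22, 23, 38, 42, 73, 77, 89, 115, 144]
events = [True] * 8 + [False]
km_positive = kaplan_meier(times, events)

for t, s, se in km_positive:
    print("%5.0f  %.3f  %.3f" % (t, s, se))
```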
Means and Medians for Survival Time
ihc_type   Mean(a)                                               Median
           Estimate   Std. Error   95% CI Lower   95% CI Upper   Estimate   Std. Error   95% CI Lower   95% CI Upper
negative   128.167    11.530       105.567        150.766        .          .            .              .
positive   69.222     13.257       43.239         95.206         73.000     46.212       .000           163.576
Overall    117.378    10.328       97.134         137.622        89.000     .            .              .
a Estimation is limited to the largest survival time if it is censored.
Mean of the survival time distribution
For negative IHC type the population mean survival time is estimated to be 128.167 with 95% confidence interval (105.567, 150.766).
For positive IHC type the population mean survival time is estimated to be 69.222 with 95% confidence interval (43.239, 95.206).
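The mean survival times reported by SPSS can be read as the area under the Kaplan-Meier step function up to the largest observed time (footnote a of the table). The Python sketch below recomputes both means from the death times in the survival table. The helper names are illustrative, and the shortcut survival = remaining / n is only valid here because no case is censored before the last death in either group.

```python
def restricted_mean(steps, last_time):
    """Area under a Kaplan-Meier step function from 0 to last_time.
    steps is a list of (death_time, survival_after_that_time) in time order."""
    area, prev_time, prev_surv = 0.0, 0.0, 1.0
    for t, s in steps:
        area += prev_surv * (t - prev_time)
        prev_time, prev_surv = t, s
    return area + prev_surv * (last_time - prev_time)

def steps_without_early_censoring(death_times, n):
    """Survival after the k-th death is (n - k) / n when nobody is censored earlier."""
    return [(t, (n - k) / n) for k, t in enumerate(death_times, start=1)]

# Death times copied from the survival table; largest times (189, 144) are censored.
neg_deaths = [19, 25, 30, 34, 37, 46, 47, 51, 56, 57, 61, 66, 67, 74, 78, 86]
pos_deaths = [22, 23, 38, 42, 73, 77, 89, 115]

mean_negative = restricted_mean(steps_without_early_censoring(neg_deaths, 36), 189)
mean_positive = restricted_mean(steps_without_early_censoring(pos_deaths, 9), 144)
print(round(mean_negative, 3), round(mean_positive, 3))
```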
Median of the survival time distribution
For negative IHC type the population median survival time cannot be estimated, because the cumulative survival proportion never falls to 0.5 (it is still .556 after the last death), so no confidence interval is given.
For positive IHC type the population median survival time is estimated to be 73.00 with 95% confidence interval (0.00, 163.576).
Percentiles
ihc_type   25.0%                    50.0%                    75.0%
           Estimate   Std. Error    Estimate   Std. Error    Estimate   Std. Error
negative   .          .             .          .             56.000     9.093
positive   89.000     23.697        73.000     46.212        38.000     11.314
Overall    .          .             89.000     .             51.000     8.899
Quartiles of the survival time distribution
For negative IHC type the population lower quartile survival time Q1 is estimated to be 56.00 (the time by which 25% have died, shown in the 75.0% column); the median Q2 and the upper quartile Q3 cannot be estimated.
For positive IHC type the population lower quartile survival time Q1 is estimated to be 38.00, the median Q2 is estimated to be 73.00 and the upper quartile Q3 is estimated to be 89.00.
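The percentile columns can be reproduced by reading the Kaplan-Meier curve: the entry under a given percentage is the first death time at which the estimated survival proportion has fallen to that percentage or below. The Python sketch below applies this rule to both groups. The function name is hypothetical, and the rule is stated as an assumption about the SPSS output that happens to match the table.

```python
def km_percentile(steps, surviving):
    """First death time at which survival <= surviving, or None if never reached."""
    for t, s in steps:
        if s <= surviving + 1e-12:   # small tolerance for floating-point ties
            return t
    return None

# (death time, survival after that time); valid because no case is censored
# before the last death in either group.
neg_deaths = [19, 25, 30, 34, 37, 46, 47, 51, 56, 57, 61, 66, 67, 74, 78, 86]
pos_deaths = [22, 23, 38, 42, 73, 77, 89, 115]
neg_steps = [(t, (36 - k) / 36) for k, t in enumerate(neg_deaths, start=1)]
pos_steps = [(t, (9 - k) / 9) for k, t in enumerate(pos_deaths, start=1)]

quartiles_positive = [km_percentile(pos_steps, p) for p in (0.75, 0.50, 0.25)]
quartiles_negative = [km_percentile(neg_steps, p) for p in (0.75, 0.50, 0.25)]
print("positive Q1, Q2, Q3:", quartiles_positive)
print("negative Q1, Q2, Q3:", quartiles_negative)
```

A value of None corresponds to a "." in the SPSS table.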
Overall Comparisons
                                 Chi-Square   df   Sig.
Log Rank (Mantel-Cox)            5.494        1    .019
Breslow (Generalized Wilcoxon)   4.351        1    .037
Tarone-Ware                      4.879        1    .027
Test of equality of survival distributions for the different levels of ihc_type.
H0: The survival time distributions of negative IHC type and positive IHC type are the same.
H1: The survival time distributions of negative IHC type and positive IHC type are different.
Since p ≤ 0.05 for all three tests in the table above, we reject H0 and conclude that the survival distributions of negative and positive IHC type are different.
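To show where the Log Rank (Mantel-Cox) statistic comes from, the sketch below implements the standard two-sample log-rank test in plain Python and applies it to the times in the survival table. It is a textbook version under the usual hypergeometric variance, not SPSS's own routine. The p-value uses the fact that a chi-square variable with 1 df is a squared standard normal.

```python
import math

def logrank(group_a, group_b):
    """Two-sample log-rank chi-square (1 df) and its p-value.
    Each group is a list of (time, died) pairs, died=True for an event."""
    death_times = sorted({t for t, d in group_a + group_b if d})
    obs_minus_exp, variance = 0.0, 0.0
    for t in death_times:
        n_a = sum(1 for u, _ in group_a if u >= t)   # at risk just before t
        n_b = sum(1 for u, _ in group_b if u >= t)
        d_a = sum(1 for u, d in group_a if u == t and d)
        d_b = sum(1 for u, d in group_b if u == t and d)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square, 1 df
    return chi2, p_value

# Times copied from the survival table (True = dead, False = alive/censored).
neg_dead = [19, 25, 30, 34, 37, 46, 47, 51, 56, 57, 61, 66, 67, 74, 78, 86]
neg_alive = [122, 123, 130, 130, 133, 134, 136, 141, 143, 148,
             151, 152, 153, 154, 156, 162, 164, 165, 182, 189]
pos_dead = [22, 23, 38, 42, 73, 77, 89, 115]
pos_alive = [144]

negative = [(t, True) for t in neg_dead] + [(t, False) for t in neg_alive]
positive = [(t, True) for t in pos_dead] + [(t, False) for t in pos_alive]

chi2, p_value = logrank(negative, positive)
print("chi-square = %.3f, p = %.3f" % (chi2, p_value))
```

The Breslow and Tarone-Ware statistics differ only in how each death time is weighted, which is why all three tests lead to the same decision here.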
[Figure: Survival Functions. Cum Survival (0.0 to 1.0) against time (0 to 200) by ihc_type; series: negative, positive, negative-censored, positive-censored]
The survival function plot suggests that the negative IHC type has a higher cumulative survival probability for almost the whole follow-up period (up to time 189). The cumulative survival proportions for the two IHC types are clearly different. For instance, by around time 145 the estimated survival proportion for positive IHC type has fallen to about 11% (.111 in the survival table), whereas for negative IHC type it is still about 56% (.556).
[Figure: Hazard Function. Cum Hazard (0.0 to 2.5) against time (0 to 200) by ihc_type; series: negative, positive, negative-censored, positive-censored]
Kaplan-Meier
[DataSet2] H:\2nd Year\MA2012N\Week5\kidney\kidney.sav Prac 7 (2)
                                           Censored
CATHATER         Total N   N of Events   N     Percent
SURGICALLY       43        15            28    65.1%
PERCUTANEOUSLY   76        11            65    85.5%
Overall          119       26            93    78.2%
The status variable is INFECTED and the event is coded 1 = YES; the later the event occurs, the better.
Survival Table
Columns: CATHATER, case number, Time, Status, Cumulative Proportion Surviving at the Time (Estimate, Std. Error), N of Cumulative Events, N of Remaining Cases
SURGICALLY 1 1.500 NO .977 .023 1 42
2 2.500 0 . . 1 41
3 2.500 0 . . 1 40
4 3.500 NO .952 .033 2 39
5 3.500 0 . . 2 38
6 3.500 0 . . 2 37
7 3.500 0 . . 2 36
8 4.500 NO . . 3 35
9 4.500 NO .899 .048 4 34
10 4.500 0 . . 4 33
11 5.500 NO .872 .054 5 32
12 5.500 0 . . 5 31
13 6.500 0 . . 5 30
14 6.500 0 . . 5 29
15 7.500 0 . . 5 28
16 7.500 0 . . 5 27
17 7.500 0 . . 5 26
18 7.500 0 . . 5 25
19 8.500 NO . . 6 24
20 8.500 NO .802 .068 7 23
21 8.500 0 . . 7 22
22 9.500 NO .766 .074 8 21
23 9.500 0 . . 8 20
24 10.500 NO .728 .080 9 19
25 10.500 0 . . 9 18
26 11.500 NO .687 .085 10 17
27 11.500 0 . . 10 16
28 12.500 0 . . 10 15
29 12.500 0 . . 10 14
30 13.500 0 . . 10 13
31 14.500 0 . . 10 12
32 14.500 0 . . 10 11
33 15.500 NO .625 .098 11 10
34 16.500 NO .562 .106 12 9
35 18.500 NO .500 .111 13 8
36 21.500 0 . . 13 7
37 21.500 0 . . 13 6
38 22.500 0 . . 13 5
39 22.500 0 . . 13 4
40 23.500 NO .375 .137 14 3
41 25.500 0 . . 14 2
42 26.500 NO .187 .149 15 1
43 27.500 0 . . 15 0
PERCUTANEOUSLY 1 .500 NO . . 1 75
2 .500 NO . . 2 74
3 .500 NO . . 3 73
4 .500 NO . . 4 72
5 .500 NO . . 5 71
6 .500 NO .921 .031 6 70
7 .500 0 . . 6 69
8 .500 0 . . 6 68
9 .500 0 . . 6 67
10 .500 0 . . 6 66
11 .500 0 . . 6 65
12 .500 0 . . 6 64
13 .500 0 . . 6 63
14 .500 0 . . 6 62
15 .500 0 . . 6 61
16 .500 0 . . 6 60
17 1.500 0 . . 6 59
18 1.500 0 . . 6 58
19 1.500 0 . . 6 57
20 1.500 0 . . 6 56
21 2.500 NO . . 7 55
22 2.500 NO .888 .038 8 54
23 2.500 0 . . 8 53
24 2.500 0 . . 8 52
25 2.500 0 . . 8 51
26 2.500 0 . . 8 50
27 2.500 0 . . 8 49
28 3.500 NO .870 .041 9 48
29 3.500 0 . . 9 47
30 3.500 0 . . 9 46
31 3.500 0 . . 9 45
32 3.500 0 . . 9 44
33 3.500 0 . . 9 43
34 4.500 0 . . 9 42
35 4.500 0 . . 9 41
36 4.500 0 . . 9 40
37 5.500 0 . . 9 39
38 5.500 0 . . 9 38
39 5.500 0 . . 9 37
40 5.500 0 . . 9 36
41 5.500 0 . . 9 35
42 6.500 NO .845 .047 10 34
43 6.500 0 . . 10 33
44 7.500 0 . . 10 32
45 7.500 0 . . 10 31
46 7.500 0 . . 10 30
47 8.500 0 . . 10 29
48 8.500 0 . . 10 28
49 8.500 0 . . 10 27
50 9.500 0 . . 10 26
51 9.500 0 . . 10 25
52 10.500 0 . . 10 24
53 10.500 0 . . 10 23
54 10.500 0 . . 10 22
55 11.500 0 . . 10 21
56 11.500 0 . . 10 20
57 12.500 0 . . 10 19
58 12.500 0 . . 10 18
59 12.500 0 . . 10 17
60 12.500 0 . . 10 16
61 14.500 0 . . 10 15
62 14.500 0 . . 10 14
CATHATER         Mean(a)                                               Median
                 Estimate   Std. Error   95% CI Lower   95% CI Upper   Estimate   Std. Error   95% CI Lower
SURGICALLY       18.527     1.659        15.275         21.778         18.500     4.149        10.367
PERCUTANEOUSLY   23.649     1.386        20.933         26.366         .          .            .
Overall          21.028     1.206        18.664         23.391         26.500     .            .
Means of the survival time distribution

For the SURGICALLY catheter type the population mean survival time is estimated to be 18.527 with 95% confidence interval (15.275, 21.778).
For the PERCUTANEOUSLY catheter type the population mean survival time is estimated to be 23.649 with 95% confidence interval (20.933, 26.366).
Median of the survival time distribution

For the SURGICALLY catheter type the population median survival time is estimated to be 18.500 with 95% confidence interval (10.367, 26.633).
For the PERCUTANEOUSLY catheter type the population median survival time is not estimated (the table shows no value, which happens when the cumulative survival proportion does not fall to 0.5), so no confidence interval is given.
Percentiles
CATHATER         25.0%                    50.0%                    75.0%
                 Estimate   Std. Error    Estimate   Std. Error    Estimate   Std. Error
SURGICALLY       26.500     2.386         18.500     4.149         10.500     2.080
PERCUTANEOUSLY   .          .             .          .             .          .
Overall          .          .             26.500     .             15.500     3.301

For the SURGICALLY catheter type the Q1 survival time is estimated to be 10.500, Q2 is 18.500 and Q3 is 26.500.
For the PERCUTANEOUSLY catheter type Q1, Q2 and Q3 cannot be estimated.
Overall Comparisons
                                 Chi-Square   df   Sig.
Log Rank (Mantel-Cox)            2.530        1    .112
Breslow (Generalized Wilcoxon)   .002         1    .964
Tarone-Ware                      .403         1    .526
Test of equality of survival distributions for the different levels of CATHATER.
H0: The SURGICALLY and PERCUTANEOUSLY catheter types have the same survival distribution.
H1: The SURGICALLY and PERCUTANEOUSLY catheter types have different survival distributions.
Since p ≥ 0.05 for all three tests, we do not reject H0 and conclude that there is no evidence of any difference between the survival distributions of the SURGICALLY and PERCUTANEOUSLY CATHATER types.
[Figure: Survival Functions. Cum Survival (0.0 to 1.0) against TIME (0.0 to 30.0) by CATHATER; series: PERCUTANEOUSLY, SURGICALLY, PERCUTANEOUSLY-censored, SURGICALLY-censored]
The survival function graph suggests that the PERCUTANEOUSLY placed catheter has a higher cumulative survival probability than the SURGICALLY placed catheter.
[Figure: Hazard Function. Cum Hazard (0.0 to 2.0) against TIME (0.0 to 30.0) by CATHATER; series: PERCUTANEOUSLY, SURGICALLY, PERCUTANEOUSLY-censored, SURGICALLY-censored]
Kaplan-Meier
[DataSet1] H:\2nd Year\MA2012N\Week5\HIV_azt\HIV_azt.sav 7.3.A)
                                    Censored
cd4       Total N   N of Events   N     Percent
No        7         3             4     57.1%
Yes       27        14            13    48.1%
Overall   34        17            17    50.0%
The grouping factor is CD4 (0 = no, 1 = yes) and the status variable is Drug, with AZT+zalcitabine (coded 1) as the event and AZT+zalcitabine+saquinavir treated as censored.
Survival Table
                                                 Cumulative Proportion
                                                 Surviving at the Time      N of Cumulative   N of Remaining
cd4   #    Time      Status                       Estimate   Std. Error     Events            Cases
No    1    4.000     AZT+zalcitabine              .857       .132           1                 6
      2    38.000    AZT+zalcitabine              .714       .171           2                 5
      3    51.000    AZT+zalcitabine+saquinavir   .          .              2                 4
      4    56.000    AZT+zalcitabine+saquinavir   .          .              2                 3
      5    94.000    AZT+zalcitabine+saquinavir   .          .              2                 2
      6    180.000   AZT+zalcitabine              .357       .267           3                 1
      7    180.000   AZT+zalcitabine+saquinavir   .          .              3                 0
Yes   1    6.000     AZT+zalcitabine              .958       .041           1                 23
      2    11.000    AZT+zalcitabine              .917       .056           2                 22
      3    12.000    AZT+zalcitabine              .875       .068           3                 21
      4    12.000    AZT+zalcitabine+saquinavir   .          .              3                 20
      5    22.000    AZT+zalcitabine+saquinavir   .          .              3                 19
      6    32.000    AZT+zalcitabine              .829       .078           4                 18
      7    35.000    AZT+zalcitabine              .783       .086           5                 17
      8    39.000    AZT+zalcitabine              .737       .093           6                 16
      9    45.000    AZT+zalcitabine              .691       .098           7                 15
      10   48.000    AZT+zalcitabine+saquinavir   .          .              7                 14
      11   49.000    AZT+zalcitabine              .641       .102           8                 13
      12   75.000    AZT+zalcitabine              .592       .106           9                 12
      13   80.000    AZT+zalcitabine              .543       .108           10                11
      14   80.000    AZT+zalcitabine+saquinavir   .          .              10                10
      15   84.000    AZT+zalcitabine              .488       .110           11                9
      16   85.000    AZT+zalcitabine              .434       .110           12                8
      17   85.000    AZT+zalcitabine+saquinavir   .          .              12                7
      18   87.000    AZT+zalcitabine              .372       .111           13                6
      19   90.000    AZT+zalcitabine+saquinavir   .          .              13                5
      20   102.000   AZT+zalcitabine              .298       .111           14                4
      21   160.000   AZT+zalcitabine+saquinavir   .          .              14                3
      22   171.000   AZT+zalcitabine+saquinavir   .          .              14                2
      23   180.000   AZT+zalcitabine+saquinavir   .          .              14                1
      24   238.000   AZT+zalcitabine+saquinavir   .          .              14                0
cd4       Mean(a)                                               Median
          Estimate   Std. Error   95% CI Lower   95% CI Upper   Estimate   Std. Error   95% CI Lower   95% CI Upper
No        134.571    33.515       68.881         200.261        180.000    105.992      .000           387.744
Yes       111.253    19.893       72.262         150.244        84.000     6.958        70.363         97.637
Overall   118.394    17.854       83.399         153.388        85.000     5.002        75.196         94.804
Mean of the survival time distribution
For CD4 No the population mean survival time is estimated to be 134.571 with 95% confidence interval (68.881, 200.261).
For CD4 Yes the population mean survival time is estimated to be 111.253 with 95% confidence interval (72.262, 150.244).
Median of the survival time distribution
For CD4 No the population median survival time is estimated to be 180.00 with 95% confidence interval (0.00, 387.744).
For CD4 Yes the population median survival time is estimated to be 84.00 with 95% confidence interval (70.363, 97.637).
Percentiles
cd4       25.0%                    50.0%                    75.0%
          Estimate   Std. Error    Estimate   Std. Error    Estimate   Std. Error
No        .          .             180.000    105.992       38.000     60.103
Yes       .          .             84.000     6.958         39.000     8.721
Overall   .          .             85.000     5.002         39.000     7.773
For CD4 No the Q1 survival time is estimated to be 38.00 and Q2 is 180.00 (standard error 105.992); Q3 is not estimated.
For CD4 Yes the Q1 survival time is estimated to be 39.00 and Q2 is 84.00 (standard error 6.958); Q3 is not estimated.
Overall Comparisons
                                 Chi-Square   df   Sig.
Log Rank (Mantel-Cox)            .520         1    .471
Breslow (Generalized Wilcoxon)   .246         1    .620
Tarone-Ware                      .433         1    .510
Test of equality of survival distributions for the different levels of cd4.
H0: CD4 Yes and CD4 No have the same survival distribution.
H1: CD4 Yes and CD4 No have different survival distributions.
Since p ≥ 0.05 for all three tests, we do not reject H0 and conclude that there is no evidence of a difference between the survival distributions of CD4 Yes and CD4 No.
[Figure: Survival Functions. Cum Survival (0.0 to 1.0) against time (0 to 250) by cd4; series: No, Yes, No-censored, Yes-censored]
The survival function graph above indicates that the CD4 No group has the higher cumulative survival.
[Figure: Hazard Function. Cum Hazard (0.0 to 1.2) against time (0 to 250) by cd4; series: No, Yes, No-censored, Yes-censored]
Cox Regression
Notes
[DataSet3] H:\2nd Year\MA2012N\Week6\Kidtran\kidtran.sav PRAC_8_i)
                                                                                     N     Percent
Cases available in analysis   Event(a)                                                140   16.2%
                              Censored                                                721   83.5%
                              Total                                                   861   99.8%
Cases dropped                 Cases with missing values                               0     .0%
                              Cases with negative time                                0     .0%
                              Censored cases before the earliest event in a stratum   2     .2%
                              Total                                                   2     .2%
Total                                                                                 863   100.0%

a Dependent Variable: Measures the Time to death
Categorical Variable Codings(b,c)
                     Frequency   (1)
sex(a)    1=Male     524         1
          2=Female   339         0
race(a)   1=White    712         1
          2=Black    151         0
a Indicator Parameter Coding
b Category variable: sex
c Category variable: race
Block 0: Beginning Block

Variables not in the Equation(a)
Score df Sig.
sex .310 1 .578
race 1.111 1 .292
age 53.433 1 .000
a Residual Chi Square = 53.688 with 3 df Sig. = .000
Best fitted model h(y) = h0(y) * exp{.138*Size}

For a one-unit increase in Size the hazard is multiplied by h(y | Size+1) / h(y | Size) = exp{.138} = 1.148 (3 d.p.), an increase of 14.8%.
Block 1: Method = Forward Stepwise (Likelihood Ratio)

Omnibus Tests of Model Coefficients(b,c)
Columns: Step, -2 Log Likelihood, Overall (score) [Chi-square, df, Sig.], Change From Previous Step [Chi-square, df, Sig.], Change From Previou [Chi-square, df]
Step   -2 Log Likelihood   Chi-square   df   Sig.   Chi-square   df   Sig.   Chi-square   df
1(a)   1701.638            53.433       1    .000   56.732       1    .000   56.732       1
a Variable(s) Entered at Step Number 1: age
b Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 1758.370
c Beginning Block Number 1. Method = Forward Stepwise (Likelihood Ratio)
The only explanatory variable selected by the forward stepwise procedure is age.

Variables in the Equation
B SE Wald df Sig. Exp(B)

Step 1 age .051 .007 51.203 1 .000 1.052
Best fitted model h(y) = h0(y) * exp{0.051*AGE}

As age increases, the hazard of death increases.
For a one-year increase in age the hazard is multiplied by h(y | AGE+1) / h(y | AGE) = exp{0.051} = 1.0523, an increase of about 5.23%.
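The hazard ratio for age, and a rough interval around it, can be recomputed from the B and SE columns of the Variables in the Equation table. The Python sketch below uses the rounded values B = .051 and SE = .007, so the results differ slightly from SPSS's Exp(B) = 1.052. The Wald interval exp(B ± 1.96*SE) and the ten-year comparison are illustrative additions, not part of the SPSS output.

```python
import math

def hazard_ratio(b, se, z=1.96):
    """Hazard ratio exp(B) with an approximate 95% Wald interval exp(B +/- z*SE)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Rounded estimates for age from the Variables in the Equation table.
hr, lower, upper = hazard_ratio(0.051, 0.007)
ten_year = math.exp(0.051 * 10)   # hazard ratio for a ten-year age difference

print("per year:     %.4f (%.4f, %.4f)" % (hr, lower, upper))
print("per 10 years: %.3f" % ten_year)
```

Because the coefficient acts on the log hazard, the effect of several years multiplies rather than adds: ten years corresponds to 1.0523 raised to the power 10, not 10 times 5.23%.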
Variables not in the Equation(a)
                Score   df   Sig.
Step 1   sex    .025    1    .875
         race   .305    1    .581
a Residual Chi Square = .328 with 2 df Sig. = .849
Model if Term Removed
Term Removed   Loss Chi-square   df   Sig.
Step 1   age   56.732            1    .000
Covariate Means
Mean
sex .607
race .825
age 42.835
[Figure: Hazard Function at mean of covariates. Cum Hazard (0.00 to 0.35) against Measures the Time to death (0 to 3500)]
[Figure: Survival Function at mean of covariates. Cum Survival (0.70 to 1.00) against Measures the Time to death (0 to 3500)]
Cox Regression
[DataSet1] H:\2nd Year\MA2012N\Week6\prostatic.sav
                                                                                     N    Percent
Cases available in analysis   Event(a)                                                6    15.8%
                              Censored                                                30   78.9%
                              Total                                                   36   94.7%
Cases dropped                 Cases with missing values                               0    .0%
                              Cases with negative time                                0    .0%
                              Censored cases before the earliest event in a stratum   2    5.3%
                              Total                                                   2    5.3%
Total                                                                                 38   100.0%
a Dependent Variable: Survival time
Categorical Variable Codings(b)
                                      Frequency   (1)
TREATMENT(a)   1=placebo              18          1
               2=diethylstilbestrol   20          0
a Indicator Parameter Coding
b Category variable: TREATMENT (Treatment)
Block 0: Beginning Block

Score df Sig.
TREATMENT 4.421 1 .035
AGE .082 1 .774
SERUM .151 1 .697
SIZE 9.644 1 .002
GLEASON 7.262 1 .007
Block 1: Method = Forward Stepwise (Likelihood Ratio)

Omnibus Tests of Model Coefficients(c,d)
Columns: Step, -2 Log Likelihood, Overall (score) [Chi-square, df, Sig.], Change From Previous Step [Chi-square, df, Sig.], Change From Previou [Chi-square, df]
Step   -2 Log Likelihood   Chi-square   df   Sig.   Chi-square   df   Sig.   Chi-square   df
1(a)   29.042              9.644        1    .002   7.307        1    .007   7.307        1
2(b)   23.533              13.752       2    .001   5.508        1    .019   12.816       2
a Variable(s) Entered at Step Number 1: SIZE
b Variable(s) Entered at Step Number 2: GLEASON
c Beginning Block Number 0, initial Log Likelihood function: -2 Log likelihood: 36.349
d Beginning Block Number 1. Method = Forward Stepwise (Likelihood Ratio)
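The change chi-squares in the omnibus table are differences in -2 log likelihood between successive models, which is the quantity forward stepwise (likelihood ratio) selection tests at each step. The Python sketch below recomputes them from the printed -2 log likelihood values. The helper names are illustrative, and the p-value formulas are the closed forms for chi-square with 1 and 2 degrees of freedom.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of chi-square with 1 df."""
    return math.erfc(math.sqrt(x / 2))

def chi2_sf_2df(x):
    """Upper-tail probability of chi-square with 2 df."""
    return math.exp(-x / 2)

# -2 log likelihood values copied from the output (block 0, step 1, step 2).
neg2ll_null, neg2ll_step1, neg2ll_step2 = 36.349, 29.042, 23.533

step1_change = neg2ll_null - neg2ll_step1    # SIZE enters
step2_change = neg2ll_step1 - neg2ll_step2   # GLEASON enters
block_change = neg2ll_null - neg2ll_step2    # both terms together, 2 df

print("step 1: %.3f, p = %.3f" % (step1_change, chi2_sf_1df(step1_change)))
print("step 2: %.3f, p = %.3f" % (step2_change, chi2_sf_1df(step2_change)))
print("block:  %.3f, p = %.4f" % (block_change, chi2_sf_2df(block_change)))
```

The step 2 change comes out as 5.509 rather than the printed 5.508 because the -2 log likelihood values in the table are themselves rounded to three decimals.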
Variables in the Equation
                   B      SE     Wald    df   Sig.   Exp(B)
Step 1   SIZE      .101   .037   7.360   1    .007   1.107
Step 2   SIZE      .104   .045   5.250   1    .022   1.109
         GLEASON   .778   .362   4.620   1    .032   2.177
Size and Gleason were selected for the best fitted model.

Best fitted model (step 2): h(y) = h0(y) * exp{0.104*SIZE + 0.778*GLEASON}
As tumour size increases, the hazard of death increases: a one-unit increase in SIZE multiplies the hazard by exp{0.104} = 1.109, an increase of about 10.9%.

As the Gleason score increases, the hazard of death also increases: a one-unit increase in GLEASON multiplies the hazard by exp{0.778} = 2.177, an increase of about 117.7%.
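With two covariates in the model, the hazards of two patients are compared by exponentiating the difference in their linear predictors; the baseline hazard h0(y) cancels. The Python sketch below uses the step 2 estimates from the Variables in the Equation table (SIZE .104, GLEASON .778). The patient profiles are hypothetical values chosen only for illustration.

```python
import math

# Step 2 coefficients copied from the Variables in the Equation table.
COEF = {"SIZE": 0.104, "GLEASON": 0.778}

def relative_hazard(profile_a, profile_b, coef=COEF):
    """Hazard of profile_a divided by hazard of profile_b; h0(y) cancels."""
    linear_diff = sum(coef[k] * (profile_a[k] - profile_b[k]) for k in coef)
    return math.exp(linear_diff)

# Hypothetical profiles, not taken from the data set.
baseline = {"SIZE": 10, "GLEASON": 9}
size_plus_one = relative_hazard({"SIZE": 11, "GLEASON": 9}, baseline)
gleason_plus_one = relative_hazard({"SIZE": 10, "GLEASON": 10}, baseline)
both = relative_hazard({"SIZE": 15, "GLEASON": 10}, baseline)

print("SIZE +1:              %.3f" % size_plus_one)
print("GLEASON +1:           %.3f" % gleason_plus_one)
print("SIZE +5, GLEASON +1:  %.3f" % both)
```

The combined ratio equals the SIZE ratio raised to the power 5 times the GLEASON ratio, which shows that effects multiply on the hazard scale.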
Variables not in the Equation(a,b)
Score df Sig.
Step 1 TREATMENT 1.249 1 .264
AGE .177 1 .674
SERUM .023 1 .881
GLEASON 5.714 1 .017
Step 2 TREATMENT .926 1 .336
AGE .255 1 .613
SERUM .025 1 .875
b Residual Chi Square = 1.435 with 3 df Sig. = .697
Model if Term Removed
Term Removed       Loss Chi-square   df   Sig.
Step 1   SIZE      7.307             1    .007
Step 2   SIZE      5.594             1    .018
         GLEASON   5.508             1    .019
Covariate Means
Mean
TREATMENT .472
AGE 68.278
SERUM 14.000
SIZE 10.750
GLEASON 9.139
[Figure: Survival Function at mean of covariates. Cum Survival (0.6 to 1.0) against Survival time (0 to 70)]
[Figure: Hazard Function at mean of covariates. Cum Hazard (0.0 to 0.4) against Survival time (10 to 70)]
