Quantitative Management - I
Submitted by:
Roll No:
P19093
Date of Submission:
QUESTION - A
1. Study the Lungcap data set and answer the following questions.
2. Suppose it is given that 20% of the male smokers and 15% of the female smokers were born
caesarean. With the help of the data, verify the above statements. Give enough reasons for your
answers.
3. Plot the histogram of the distribution of Lungcap amongst smokers.
4. Plot the histogram of the distribution of Height amongst smokers.
5. Are height and Lungcap independent?
6. Is the variation of Lungcap of male smokers and female smokers equal?
7. Is the average Lungcap of smokers and non-smokers equal?
8. Plot the histogram of the age amongst smokers.
9. What percentage of people below 16 years smoke?
10. What percentage of people above 17 years smoke?
11. Test if smoking habit and age are dependent.
12. Test if smoking habit and Lungcap are dependent.
13. Fit a suitable distribution to height and also to Lungcap. Test the goodness of fit.
QUESTION – B
Study the car data set and answer the following questions.
1. Find the average and variance of price and mileage separately. Comment on the results. How will
you interpret the result statistically?
2. Test if the mean mileage of different car manufacturers within some price range are equal.
Clearly specify all the assumptions and the null and alternative hypotheses.
3. Find a 90% confidence price range for the Chevrolet cars.
4. Find a 90% confidence interval for the variance of prices for Pontiac cars.
5. Calculate the correlation coefficient between mileage and Liter for each company.
6. Comment on the results.
7. Suppose a car has a Liter of 3.8. How sure will you be that its mileage is more than 20,000?
8. Is there any correlation between prices and mileage?
QUESTION – C
1. Let Ȳ be the mean of a random sample of size n1 from N(μ, σ² = 10). Find n1 such that the
probability that the random interval (Ȳ - 1/2, Ȳ + 1/2) includes μ is approximately 0.954.
2. Let Z̄ be the mean of a random sample of size n2 from N(μ, σ² = 9). Find n2 such that the
probability that the random interval (Z̄ - 1, Z̄ + 1) includes μ is approximately 0.90.
3. Draw 200 random samples each of size 𝑛1 (found above) from a normal distribution with mean 5
and variance 3.
1|Page
Surya Kant Prasad P19093 Sec-B
4. Write down the distribution of the sample mean. Test using the data obtained in Q3 above, if the
sample means follow that distribution.
5. Draw 200 random samples each of size 25 from a normal distribution with mean 7 and variance
3.
6. Compute 95% confidence interval for the difference of means from each of the 200 samples.
Draw a graph to show all 200 confidence intervals and comment.
QUESTION – D
1. Collect stock prices for 5 companies from 1st Jan 2016 to 30th June 2016.
2. Plot the histogram of the returns for each company. Describe the histograms.
3. Test whether the average returns for 5 companies are equal. State clearly the assumptions
required, null and alternative hypotheses.
4. Test whether the average returns for each pair of companies are equal.
5. Comment on the results.
QUESTION – E
1. The income distribution of a very large population is exponential with average income ₹40,000
per annum. Draw 500 samples (from the income distribution) of size 100 each. Sketch the
distribution of sample average income. Comment.
2. The age distribution of a very large population is given below:
Age Group (years)   15-18   18-21   21-23   23-25   25-27   27-29   29-31   31-33   33-35
Proportion          0.1     0.1     0.1     0.1     0.2     0.1     0.1     0.1     0.1
Draw 100 samples (from the age distribution) of size 50 each. Sketch the distribution of sample
average age. Comment.
Section-A
Q.1.i.
GENDER    SMOKES   NEVER SMOKED   TOTAL
MALE      33       334            367
FEMALE    44       314            358
TOTAL     77       648            725
Q.1.ii Given that one randomly selected person is a smoker, the probability that the person is female:
P(Female | Smoker) = (No. of female smokers) / (Total no. of smokers)
= 44/77
= 0.571
Q.1.iii. Let
H0: Null hypothesis that Gender and Smoking Habit are independent.
HA: Alternate hypothesis that Gender and Smoking Habit are dependent.
Reject H0 if the p-value of the chi-square statistic C is less than 5%.

Cell   Observed (Fij)   Expected (Eij)   Fij - Eij   (Fij - Eij)^2/Eij
11     33               38.98            -5.98       0.917
12     334              328.02           5.98        0.109
21     44               38.02            5.98        0.940
22     314              319.98           -5.98       0.112

Degrees of freedom = 1
C = 2.077
χ²(1, 0.05) = 3.841
Since C < χ²(1, 0.05), the p-value for C (>10%) is more than 5%. Hence there is not sufficient
evidence to reject H0, and we treat gender and smoking habit as independent.
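
The chi-square calculation above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the function name chi_square_independence is hypothetical, and the p-value shortcut via erfc is valid only for 1 degree of freedom, which is the case for this 2 x 2 table.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof, expected


# Rows: male, female. Columns: smokes, never smoked (counts from Q.1.i).
observed = [[33, 334], [44, 314]]
stat, dof, expected = chi_square_independence(observed)

# For 1 degree of freedom only: P(chi2 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(f"C = {stat:.3f}, dof = {dof}, p-value = {p_value:.3f}")
```

The printed statistic should agree with the hand calculation (C close to 2.077), and the p-value should land above 10%, in line with the conclusion drawn above.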
Q.2 Given: 20% (= m) of male smokers and 15% (= f) of female smokers were born caesarean.
a) As per the sample,
No. of male smokers = 33, No. of male smokers born caesarean = 10
Proportion of male smokers born caesarean, Pm = 10/33 = 30.3%
Sample size, Nm = 33
Since the sample size is > 30, as per the CLT the sample proportion is approximately normal with
mean Pm and standard deviation Sm.
Standard deviation, Sm = (Pm x (1 - Pm)/Nm)^0.5 = 0.08
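
The arithmetic for Pm and Sm can be checked as follows. The z-score and two-sided p-value at the end are an assumed next step that the surviving text does not show; they use the sample-based standard deviation Sm exactly as defined above, and the variable names are illustrative only.

```python
import math

n_m = 33            # male smokers in the sample
caesarean_m = 10    # male smokers born caesarean
claimed_m = 0.20    # claimed proportion for male smokers

p_m = caesarean_m / n_m                     # sample proportion
s_m = math.sqrt(p_m * (1 - p_m) / n_m)      # standard deviation of the sample proportion

# Assumed next step (not in the original text): distance from the claim in units of s_m.
z = (p_m - claimed_m) / s_m
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability

print(f"Pm = {p_m:.3f}, Sm = {s_m:.3f}, z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

With only 33 male smokers the standard deviation is large (about 0.08), so a sample proportion of 30.3% is not far from the claimed 20% in standard-deviation terms.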
Q3/4.
Q6. Since Fcal (= 1.19) < Fcrit (1.96), there is not enough evidence to reject H0. Hence we do not
reject the hypothesis and treat the Lungcap variances of male smokers and female smokers as equal.
Q7. Let μ1 and μ2 be the average Lungcap of smokers and non-smokers, and let σ1² and σ2² be
the variances of the respective populations.
x̄1 = random variable for the average Lungcap of the sample of smokers ~ N(μ1, σ1²/n1)
x̄2 = random variable for the average Lungcap of the sample of non-smokers ~ N(μ2, σ2²/n2)
As per data,
No. of smokers, n1 = 77; No. of non-smokers, n2 = 648
Average Lungcap of smokers, x̄1 = 8.645; average Lungcap of non-smokers, x̄2 = 7.77
Sample Lungcap variance of smokers, s1² = 3.545
Q.11. H0: Age and smoking habit are independent for the age ranges in the table.
HA: Age and smoking habit are dependent for the age ranges in the table.
Reject H0 if the p-value of the calculated chi-square statistic is less than 5%.

Smoking Habit   Age <15   Age >15   Total
Yes             42        35        77
No              506       142       648
Total           548       177       725
Q13. As per the data, we have the following descriptive statistics for Lungcap and Height:

Statistic            LungCap     Height
Mean                 7.863148    64.83628
Standard Deviation   2.662008    7.202144
Count                725         725
Section-B
Q1.
Price
Mean 21343.14
Standard Error 348.6119
Median 18025
Mode 10921.95
Standard Deviation 9884.853
Sample Variance 97710315
Kurtosis 3.291149
Skewness 1.575795
Range 62116.54
Minimum 8638.931
Maximum 70755.47
Sum 17159888
Count 804
Mileage
Mean 19831.93
Standard Error 289.0619
Median 20913.5
Mode 18910
Standard Deviation 8196.32
Sample Variance 67179657
Kurtosis 0.183909
Skewness -0.13125
Range 50121
Minimum 266
Maximum 50387
Sum 15944875
Count 804
We observe that the sample variance of price (97710315) is larger than that of mileage (67179657).
That means prices are spread more widely around their average than mileage is around its average.
So we can say that cars across a wide range of prices have mileage relatively close to the average
of 19831.93.
Q2.
Price Range   Average mileage (xi)   Variance (σi²)   Sample Size (ni)
<19K          20241.52               64394503         467
19K-41K       19759.26               65564947         297
>41K          15589.65               95651556         40

Let μ1, μ2 and μ3 be the average mileage of cars in the price ranges given in the table, and let
σ1², σ2² and σ3² be the variances for the respective price ranges.
Hypothesis Statement
H0: μ1 = μ2 = μ3
HA: not all of μ1, μ2 and μ3 are equal
We conducted a one-way ANOVA test. Since the p-value is less than 0.05, we reject H0 and conclude
that the average mileages of cars in the above price ranges are not all equal.
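
The ANOVA output itself is not reproduced above, but the F statistic can be rebuilt from the summary table alone. The sketch below is an illustration under that assumption, not the author's Excel run; the helper name anova_from_summary is hypothetical, and the group variances are treated as sample variances (divisor n - 1).

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from (mean, sample variance, size) per group."""
    n_total = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / n_total
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
    # Within-group sum of squares: recovered from each sample variance.
    ss_within = sum((n - 1) * var for _, var, n in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (average mileage, variance, sample size) for the three price ranges in the table.
price_groups = [
    (20241.52, 64394503, 467),
    (19759.26, 65564947, 297),
    (15589.65, 95651556, 40),
]
f_stat, df_between, df_within = anova_from_summary(price_groups)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

The 5% critical value of F with 2 and about 800 degrees of freedom is roughly 3.0, so an F statistic well above that is consistent with the rejection of H0 reported above.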
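Q3.
Price (Chevrolet)
Mean                        16427.6
Standard Deviation          6901.439
Sample Variance             47629867
Count                       320
Confidence Level (90.0%)    636.4364

Using the t-distribution, the standard error is s/sqrt(n) = 6901.439/sqrt(320) = 385.80.
The "Confidence Level (90.0%)" figure is the margin of error, t(0.05, 319) x 385.80 = 636.44,
with t(0.05, 319) = 1.6496. The 90% confidence interval for the mean price is 16427.6 ± 636.44:
Lower limit   15791.16
Upper limit   17064.04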
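
The interval arithmetic can be sketched as below. It assumes that the "Confidence Level(90.0%)" value from Excel's Descriptive Statistics output is the half-width (margin of error) of the interval, so the limits are the mean plus and minus that value; the t value is backed out from the table rather than taken from a library.

```python
import math

mean_price = 16427.6
sd_price = 6901.439
n = 320
margin = 636.4364  # "Confidence Level(90.0%)" from the table, assumed to be the half-width

standard_error = sd_price / math.sqrt(n)
t_implied = margin / standard_error  # t value implied by the table figures

lower = mean_price - margin
upper = mean_price + margin

print(f"SE = {standard_error:.2f}, implied t = {t_implied:.4f}")
print(f"90% CI for mean price: ({lower:.2f}, {upper:.2f})")
```

The implied t value comes out near 1.65, which is what a two-sided 90% interval with 319 degrees of freedom should use.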
Q.4.
n = 150
χ²(149, 0.1) = 171.507
CI Variance (90%) = (n-1)s²/χ²(149, 0.1) = 14515607
Section-C:
Q.4. Since each of the 200 samples, of size 159, is drawn from a normal distribution N(5, 3), the
sample mean follows a normal distribution N(5, 3/159), i.e. mean 5 and variance 3/159 ≈ 0.019.
This can be verified from the descriptive statistics of the 200 sample means and the histogram below.
Sample Mean = 5
Sample variance = 0.020
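
The sampling experiment can be imitated with the standard library. This is a sketch, not the author's Excel workbook: the seed is arbitrary, so the simulated figures will differ slightly from the ones reported above while staying close to the theoretical values.

```python
import random
import statistics

random.seed(19093)  # arbitrary seed, chosen only to make the run repeatable

N_SAMPLES = 200
SAMPLE_SIZE = 159
MU = 5.0
VARIANCE = 3.0
sigma = VARIANCE ** 0.5

# Draw 200 samples of size 159 from N(5, 3) and keep each sample mean.
sample_means = [
    statistics.fmean(random.gauss(MU, sigma) for _ in range(SAMPLE_SIZE))
    for _ in range(N_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)
theoretical_var = VARIANCE / SAMPLE_SIZE  # variance of the sample mean, 3/159

print(f"mean of sample means     = {mean_of_means:.3f} (theory: {MU})")
print(f"variance of sample means = {var_of_means:.4f} (theory: {theoretical_var:.4f})")
```

A histogram of sample_means should look approximately bell-shaped and centred on 5, matching the distribution N(5, 3/159).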
[Figures: two histograms (y-axis: Frequency); images not recovered]
Section D:
Q.1. I collected stock prices of Tata Motors, Maruti Suzuki, Mahindra & Mahindra, Hero and Ashok
Leyland from 01.01.2016 to 30.06.2016.
Q.3. H0: The average stock returns of the five companies are equal (R1 = R2 = R3 = R4 = R5).
HA: At least one average stock return differs from the others.
Reject H0 if p-value < 0.05.
Since the p-value (0.461) is greater than 0.05, we fail to reject H0 and conclude that the average
returns of the five companies do not differ significantly.
SUMMARY
Groups Count Sum Average Variance
Column 1 123 -0.08929 -0.00073 0.000389
Column 2 123 -0.32103 -0.00261 0.000528
Column 3 123 -0.25588 -0.00208 0.000457
Column 4 123 0.067323 0.000547 0.000298
Column 5 123 0.167521 0.001362 0.000262
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 0.001397 4 0.000349 0.90372 0.461334 2.38654
Within Groups 0.235817 610 0.000387
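
As a cross-check, the sums of squares in the ANOVA table can be rebuilt from the SUMMARY table. The sketch below assumes the Variance column holds sample variances (divisor n - 1); because the summary values are rounded, the results match the table only to about three significant figures.

```python
# (count, sum of returns, variance) for Column 1 to Column 5 of the SUMMARY table.
summary = [
    (123, -0.08929, 0.000389),
    (123, -0.32103, 0.000528),
    (123, -0.25588, 0.000457),
    (123, 0.067323, 0.000298),
    (123, 0.167521, 0.000262),
]

n_total = sum(count for count, _, _ in summary)
grand_mean = sum(total for _, total, _ in summary) / n_total

# Between groups: weighted squared distance of each group mean from the grand mean.
ss_between = sum(count * (total / count - grand_mean) ** 2 for count, total, _ in summary)
# Within groups: (n - 1) * sample variance, summed over groups.
ss_within = sum((count - 1) * var for count, _, var in summary)

df_between = len(summary) - 1
df_within = n_total - len(summary)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.6f}, SS within = {ss_within:.6f}")
print(f"F = {f_stat:.3f} on ({df_between}, {df_within}) degrees of freedom")
```

The rebuilt F statistic is well below the F crit of 2.38654 shown in the table, which agrees with not rejecting H0.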
Section E)
a)
Bin Frequency
3.2
3.4
2
0
3.6 3
3.8 3
4 1
4.2 2
4.4 2
4.6 2
4.8 0
5 3
5.2 1
More 0
[Figure: histogram "Ariel Compact"; x-axis: Bin, y-axis: Frequency]
Since the p-value 0.102 > 0.05, we cannot reject the null hypothesis at the 5% significance level,
and hence we can assume that the Ariel Compact "Image" values follow a normal distribution.
b) H0 : µ1=µ2=µ3=µ4=µ5
H1: Not all of the means are equal
α – level = 0.05 , R=5
Source of Variation SS df MS F P-value F crit
Rows 6.8504 19 0.360547 1.887215 0.027545 1.725029
Columns 45.1604 4 11.2901 59.09582 1.4E-22 2.492049
Error 14.5196 76 0.191047
Total 66.5304 99
For the Columns factor (the five means under test), F = 59.096 is greater than F crit = 2.492
(p-value = 1.4E-22). Hence, we can reject the null hypothesis at the 5% significance level.
c)
H0 : σ1² = σ2²
K = 2.5264.
d)
Brand Average Image Score
AC 4.16
AS 4.185
SE 4.585
SB 4.115
R 2.635
Highest average image score is for brand SE. It has the “Best Image”.
e) SE is the best image brand.
N = 20, Sample Mean = 4.585, Sample Standard Deviation S = 0.3232
Confidence Interval = (Sample Mean - t(α/2, N-1) x S/√N, Sample Mean + t(α/2, N-1) x S/√N)
(1-α) = 95% => α = 5% => α/2 = 2.5%, t(0.025, 19) = 2.093
Confidence Interval = (4.585 - 2.093 x 0.3232/√20, 4.585 + 2.093 x 0.3232/√20) = (4.4338, 4.7362)
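
The interval can be re-derived in a few lines. The critical value 2.093 is taken from the text (it corresponds to N - 1 = 19 degrees of freedom), and the variable names are illustrative only.

```python
import math

n = 20
sample_mean = 4.585
sample_sd = 0.3232
t_crit = 2.093  # two-sided 95% critical value quoted in the text, 19 degrees of freedom

margin = t_crit * sample_sd / math.sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"margin = {margin:.4f}")
print(f"95% CI for the mean image score of SE: ({lower:.4f}, {upper:.4f})")
```

Small differences in the fourth decimal place against the figures above come from rounding of the inputs.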
f) N = 100, S² = 0.672 (Total SS/(N-1) = 66.5304/99)
P(a < σ² < b) = 90% => P(1/b < 1/σ² < 1/a) = 90%
P((N-1)S²/b < (N-1)S²/σ² < (N-1)S²/a) = 90%
(N-1)S²/σ² follows a χ² distribution with N-1 = 99 degrees of freedom.
P((N-1)S²/b < χ²(99) < (N-1)S²/a) = 90%
Using Excel we find that
P(χ²(99) < 77.04633) = 0.05 => (N-1)S²/b = 77.04633 => b = 0.8635
P(χ²(99) < 123.2252) = 0.95 => (N-1)S²/a = 123.2252 => a = 0.5399
Confidence interval = (a, b) = (0.5399, 0.8635)
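
The bounds a and b can be reproduced from the two chi-square quantiles quoted above. This sketch assumes that (N-1)S² equals the Total sum of squares, 66.5304, from the ANOVA table in part b), since that value reproduces the stated limits; the quantiles are copied from the text rather than computed.

```python
n = 100
total_ss = 66.5304         # assumed to equal (N - 1) * S^2, taken from the ANOVA "Total" row
chi2_q05 = 77.04633        # P(chi2_99 < 77.04633) = 0.05, as quoted in the text
chi2_q95 = 123.2252        # P(chi2_99 < 123.2252) = 0.95, as quoted in the text

sample_variance = total_ss / (n - 1)

# 90% confidence interval for the population variance.
a = total_ss / chi2_q95    # lower limit uses the larger quantile
b = total_ss / chi2_q05    # upper limit uses the smaller quantile

print(f"S^2 = {sample_variance:.3f}")
print(f"90% CI for variance: ({a:.4f}, {b:.4f})")
```

Note that the larger quantile gives the lower limit and the smaller quantile gives the upper limit, because the quantiles sit in the denominator.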