
Answer 1

Using probability theory and statistics, one can calculate the number of pandemic A (H1N1) influenza cases in each country of the world, the death rate per country, and the frequency per case by country. The analysis can be extended to include demographic data (age, gender, etc.). In addition, time series methods applied to historical data can be used to predict future trends and seasonality of the pandemic A (H1N1) influenza.

Answer 2
Our term project topic was Application of Statistics in Total Quality Management. The project team had six people. We agreed to divide the project into three parts and, accordingly, three groups. Below are the groups and the people in each group.

Answer 3
ME = 3 and $z_{\alpha/2} = 1.645$ (that is, $\alpha/2 = 0.05$). No previous estimate of $\sigma$ is available, but we are given that the range of the observations is R = 84 - 48 = 36. A conservative estimate (based on Chebyshev's Rule) is $\sigma \approx R/4 = 9$.

$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{ME^{2}} = \frac{(1.645)^{2}(9)^{2}}{(3)^{2}} = 24.35$$

Rounding up, the number of height measurements needed is 25.
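The sample-size arithmetic above can be checked with a short script. This is a minimal sketch, assuming a two-sided 90% confidence level (the level that corresponds to z = 1.645); the function name `required_sample_size` is illustrative and not part of the original answer.

```python
import math
from statistics import NormalDist


def required_sample_size(margin_of_error, sigma, confidence):
    """Return (exact n, n rounded up) so that z * sigma / sqrt(n) <= margin_of_error."""
    alpha = 1.0 - confidence
    # Two-sided critical value z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z * sigma / margin_of_error) ** 2
    return n_exact, math.ceil(n_exact)


data_range = 84 - 48             # R = 36, as given in the answer
sigma_estimate = data_range / 4  # conservative estimate sigma = R/4 = 9

# Assumption: 90% confidence, matching z = 1.645 used in the answer.
n_exact, n_required = required_sample_size(3, sigma_estimate, 0.90)
print(f"exact n = {n_exact:.2f}, required n = {n_required}")
```

The result is always rounded up, since rounding down would give a margin of error slightly larger than the target.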

Answer 4
The population standard deviation $\sigma$ is unknown, so the decision rule is based on the t-statistic.

Applying the lower-tail test:

$H_0: \mu = 6.6$ (the mean weight is equal to or greater than 6.6)
$H_1: \mu < 6.6$ (the mean weight is less than 6.6)

Reject $H_0$ if

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} < -t_{n-1,\alpha}$$

n = 53. Suppose that $\alpha = 0.1$ (90% level of confidence) is chosen for this test. Then $t_{n-1,\alpha} = t_{52,\,0.1} = 1.298$ (from the t-table).

a - The test statistic is

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{6.3 - 6.6}{1.7/\sqrt{53}} = -1.285$$

Since $t = -1.285 > -t_{n-1,\alpha} = -1.298$, do not reject $H_0$. There is not sufficient evidence that the mean weight is lower than before.

b - Statistically, we cannot conclude that the mean weight is lower than before, so we cannot blame the power plant.

c - The 0.3 pound difference between the sample mean weight and the population mean weight before the construction of the plant does not give enough evidence to say that the population mean weight decreased after the nuclear plant's construction. The 0.3 pound difference is not statistically significant.
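The lower-tail t-test above can be reproduced numerically. This is a minimal sketch: the critical value 1.298 is taken from the t-table as quoted in the answer rather than computed, and the helper name `t_statistic` is illustrative.

```python
import math


def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic: (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))


# Critical value t_{52, 0.1} as read from the t-table in the answer.
T_CRITICAL = 1.298

t_value = t_statistic(sample_mean=6.3, mu0=6.6, sample_sd=1.7, n=53)

# Lower-tail test: reject H0 only if t falls below -t_{n-1, alpha}.
reject_h0 = t_value < -T_CRITICAL
print(f"t = {t_value:.3f}, reject H0: {reject_h0}")
```

The statistic lies just above the rejection boundary, so the conclusion is sensitive to the chosen significance level.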

Answer 5

The Excel output of the regression analysis of the customers who purchased mail-order products from Berk Company is shown in the following tables:

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.86185 |
| R Square | 0.74278 |
| Adjusted R Square | 0.74018 |
| Standard Error | 489.88404 |
| Observations | 1000 |

ANOVA

| Source | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 10 | 685395671.6 | 68539567 | 285.5977 | 1.9251E-283 |
| Residual | 989 | 237346524.1 | 239986.4 | | |
| Total | 999 | 922742195.7 | | | |

Coefficients

| Variable | Coefficient | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 177.99369 | 72.23200 | 2.46419 | 0.01390 | 36.24811 | 319.73928 |
| Age | 0.64619 | 1.10674 | 0.58387 | 0.55944 | -1.52563 | 2.81801 |
| Gender | -39.79210 | 32.85426 | -1.21117 | 0.22612 | -104.26417 | 24.67996 |
| OwnHome | 33.93841 | 35.43767 | 0.95769 | 0.33845 | -35.60326 | 103.48008 |
| Married | -15.67078 | 42.46078 | -0.36906 | 0.71216 | -98.99436 | 67.65280 |
| Close | -404.07382 | 38.37903 | -10.52851 | 0.00000 | -479.38749 | -328.76014 |
| Salary | 0.01765 | 0.00099 | 17.80052 | 0.00000 | 0.01570 | 0.01959 |
| Children | -151.63186 | 17.80031 | -8.51849 | 0.00000 | -186.56258 | -116.70114 |
| PrevCust | -541.27584 | 54.23935 | -9.97939 | 0.00000 | -647.71328 | -434.83841 |
| PrevSpent | 0.29038 | 0.04467 | 6.50012 | 0.00000 | 0.20272 | 0.37805 |
| Catalogs | 42.38590 | 2.44794 | 17.31493 | 0.00000 | 37.58215 | 47.18966 |

$$H_0: \beta_j = 0 \qquad H_1: \beta_j \neq 0$$

$\alpha = 0.05$ (confidence level 95%). Checking the P-value of each variable, the variables with P-values less than $\alpha = 0.05$ affect AmountSpent. These variables are Close, Salary, Children, PrevCust, PrevSpent and Catalogs. Repeating the multiple regression analysis with only the variables that affect AmountSpent gives the following tables.
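The variable screening described above amounts to comparing each P-value with the significance level. A minimal sketch, using the P-values reported in the Excel output for the full model (the intercept is left out because it is not a candidate predictor):

```python
ALPHA = 0.05

# P-values copied from the full-model Excel output above.
p_values = {
    "Age": 0.55944,
    "Gender": 0.22612,
    "OwnHome": 0.33845,
    "Married": 0.71216,
    "Close": 0.00000,
    "Salary": 0.00000,
    "Children": 0.00000,
    "PrevCust": 0.00000,
    "PrevSpent": 0.00000,
    "Catalogs": 0.00000,
}

# Keep a variable when H0: beta_j = 0 is rejected at level ALPHA.
significant = [name for name, p in p_values.items() if p < ALPHA]
dropped = [name for name, p in p_values.items() if p >= ALPHA]

print("kept:", significant)
print("dropped:", dropped)
```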
Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.861344 |
| R Square | 0.741914 |
| Adjusted R Square | 0.740354 |
| Standard Error | 489.7204 |
| Observations | 1000 |

ANOVA

| Source | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 6 | 6.85E+08 | 1.14E+08 | 475.7579 | 5.9731E-288 |
| Residual | 993 | 2.38E+08 | 239826.1 | | |
| Total | 999 | 9.23E+08 | | | |

Coefficients

| Variable | Coefficient | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 195.38431 | 58.96516 | 3.31356 | 0.00095 | 79.67368 | 311.09494 |
| Close | -400.95504 | 38.30076 | -10.46859 | 0.00000 | -476.11477 | -325.79532 |
| Salary | 0.01757 | 0.00076 | 23.20905 | 0.00000 | 0.01609 | 0.01906 |
| Children | -151.10013 | 17.22573 | -8.77177 | 0.00000 | -184.90314 | -117.29713 |
| PrevCust | -542.21226 | 54.02618 | -10.03610 | 0.00000 | -648.23086 | -436.19367 |
| PrevSpent | 0.29482 | 0.04447 | 6.62958 | 0.00000 | 0.20755 | 0.38208 |
| Catalogs | 42.30869 | 2.44382 | 17.31250 | 0.00000 | 37.51304 | 47.10434 |

Comparing both sets of tables (the model with all variables and the model with just 6 variables) shows that the coefficients are similar to each other. The Adjusted R Square shows that 74% of the variation in AmountSpent can be explained by the variation in Close, Salary, Children, PrevCust, PrevSpent and Catalogs.

The equation for current-year spending at Berk Company is:

AmountSpent = 195.38431 - 400.95504 (Close) + 0.01757 (Salary) - 151.10013 (Children) - 542.21226 (PrevCust) + 0.29482 (PrevSpent) + 42.30869 (Catalogs)
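The fitted equation can be applied to a single customer record as follows. This is a minimal sketch: the coefficients come from the reduced model above, while the function name `predict_amount_spent` and the example customer are hypothetical and not taken from the Berk Company data.

```python
INTERCEPT = 195.38431

# Coefficients of the reduced (6-variable) model reported above.
COEFFICIENTS = {
    "Close": -400.95504,
    "Salary": 0.01757,
    "Children": -151.10013,
    "PrevCust": -542.21226,
    "PrevSpent": 0.29482,
    "Catalogs": 42.30869,
}


def predict_amount_spent(customer):
    """Evaluate the fitted linear equation for one customer (a dict of predictor values)."""
    return INTERCEPT + sum(
        coefficient * customer[name] for name, coefficient in COEFFICIENTS.items()
    )


# Hypothetical customer, invented purely to illustrate the calculation.
example_customer = {
    "Close": 1,
    "Salary": 50000,
    "Children": 2,
    "PrevCust": 1,
    "PrevSpent": 1000,
    "Catalogs": 12,
}

predicted = predict_amount_spent(example_customer)
print(f"Predicted AmountSpent: {predicted:.2f}")
```

Because the model is linear, predictions for unusual combinations of inputs can be negative, which matches the negative fitted values visible on the line fit plots.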

Multi-regression Plots / Residuals

[Figures: residual plots for Close, Salary, Children, PrevSpent and Catalogs. Each plots Residuals (y-axis) against the named variable (x-axis).]

Multi-regression Plots / Line Fit

[Figures: line fit plots for Close, Salary, Children, PrevCust, PrevSpent and Catalogs. Each plots AmountSpent and Predicted AmountSpent (y-axis) against the named variable (x-axis).]

Answer 6

In this analysis, a real time series of the number of deaths in Mexico due to the pandemic A (H1N1) influenza was analyzed. The source data were taken from the Mexican Secretary of Health web site (http://portal.salud.gob.mx/contenidos/noticias/influenza/estadisticas.html) and regrouped by month (the original data were weekly). The report is dated 19/06/2010. The Minitab statistical program (version 15) was used for the analysis.

| Date | Number of Deaths (monthly) |
|---|---|
| Jan-09 | 0 |
| Feb-09 | 0 |
| Mar-09 | 3 |
| Apr-09 | 95 |
| May-09 | 43 |
| Jun-09 | 34 |
| Jul-09 | 75 |
| Aug-09 | 36 |
| Sep-09 | 304 |
| Oct-09 | 304 |
| Nov-09 | 174 |
| Dec-09 | 79 |
| Jan-10 | 75 |
| Feb-10 | 53 |
| Mar-10 | 43 |
| Apr-10 | 8 |
| May-10 | 3 |

Exponential smoothing, Winters' method and the autoregressive method will be used as three basic forecasting techniques.

a - Exponential Smoothing

[Minitab screenshot: single exponential smoothing dialog]

[Minitab screenshot: results]

[Figure: Smoothing Plot for Number of Deaths (monthly), Single Exponential Method. Series shown: Actual, Fits, Forecasts, 95.0% PI; x-axis: Month.]

Smoothing Constant: Alpha 0.2
Accuracy Measures: MAPE 321.57, MAD 66.28, MSD 9008.42

[Figure: Residual Plots for Number of Deaths (monthly). Panels: Normal Probability Plot, Versus Fits, Histogram, Versus Order.]
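Single exponential smoothing with a fixed constant is simple enough to sketch by hand. The version below is an illustration, not the Minitab implementation: it assumes the initial level is the mean of the first six observations, and it treats each fit as the one-step-ahead forecast made from the previous level. The function name and the `n_init` parameter are illustrative.

```python
# Monthly deaths, Jan-09 to May-10, from the table in Answer 6.
deaths = [0, 0, 3, 95, 43, 34, 75, 36, 304, 304, 174, 79, 75, 53, 43, 8, 3]


def single_exponential_smoothing(series, alpha, n_init=6):
    """Return (one-step-ahead fits, final level) for simple exponential smoothing.

    Assumption: the starting level is the mean of the first n_init observations.
    """
    level = sum(series[:n_init]) / n_init
    fits = []
    for y in series:
        fits.append(level)                       # forecast for this period
        level = alpha * y + (1 - alpha) * level  # update after observing y
    return fits, level


fits, forecast = single_exponential_smoothing(deaths, alpha=0.2)

errors = [y - f for y, f in zip(deaths, fits)]
mad = sum(abs(e) for e in errors) / len(errors)  # mean absolute deviation
msd = sum(e * e for e in errors) / len(errors)   # mean squared deviation

print(f"flat forecast = {forecast:.1f}, MAD = {mad:.2f}, MSD = {msd:.2f}")
```

Under this assumed initialization the final level comes out near 62 deaths per month, and the MAD and MSD land close to the accuracy measures reported above. Since single exponential smoothing has no trend or seasonal term, every future month receives the same forecast.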

b - Winters' Method

[Minitab screenshot: Winters' Method dialog]

[Minitab screenshot: results]

[Figure: Winters' Method Plot for Number of Deaths (monthly), Multiplicative Method. Series shown: Actual, Fits, Forecasts, 95.0% PI; x-axis: Month.]

Smoothing Constants: Alpha (level) 0.2, Gamma (trend) 0.2, Delta (seasonal) 0.2
Accuracy Measures: MAPE 284.6, MAD 66.0, MSD 10543.2

[Figure: Residual Plots for Number of Deaths (monthly). Panels: Normal Probability Plot, Versus Fits, Histogram, Versus Order.]

c - Autoregressive Models

[Minitab screenshot: Autoregressive Model (Autocorrelation) dialog]

[Minitab screenshot: results]

[Figure: Autocorrelation Function for Number of Deaths (monthly), with 5% significance limits for the autocorrelations. Axes: Autocorrelation (-1.0 to 1.0) against Lag (1 to 4).]
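The sample autocorrelations behind an ACF plot can be computed directly from the series. The sketch below uses the usual sample ACF formula and the n/4 default lag length. The ±1.96/√n band is a rough large-sample approximation chosen for illustration, and the limits drawn by Minitab may be computed differently.

```python
import math

# Monthly deaths, Jan-09 to May-10, from the table in Answer 6.
deaths = [0, 0, 3, 95, 43, 34, 75, 36, 304, 304, 174, 79, 75, 53, 43, 8, 3]


def autocorrelation(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag, using deviations from the overall mean."""
    n = len(series)
    mean = sum(series) / n
    deviations = [y - mean for y in series]
    denominator = sum(d * d for d in deviations)
    return [
        sum(deviations[t] * deviations[t + k] for t in range(n - k)) / denominator
        for k in range(1, max_lag + 1)
    ]


max_lag = len(deaths) // 4  # default lag length n/4, giving 4 lags for 17 months
acf = autocorrelation(deaths, max_lag)

# Assumption: approximate 5% limits of +/- 1.96 / sqrt(n), for illustration only.
limit = 1.96 / math.sqrt(len(deaths))

for lag, r in enumerate(acf, start=1):
    flag = "outside" if abs(r) > limit else "inside"
    print(f"lag {lag}: r = {r:+.3f} ({flag} the approximate +/-{limit:.3f} band)")
```

With only 17 observations the band is wide, so the comparison against the limits is indicative rather than conclusive.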

Conclusion: A small value of w = 0.2 was used as the exponential smoothing constant. This gives less weight to the current value of the series and yields a smoother series. Based on the available 17 months of data, a forecast for the next 7 months was made. According to the exponential smoothing method, there will be a constant 62 deaths per month over the forecast period.

For Winters' method, the multiplicative method was selected, and the smoothing weights were all set to 0.2. According to that method, the number of deaths per month over the next 7 months will fluctuate between 51 and 413, depending on the month.

In the autoregressive model, the default lag length of n/4 was used. The ACF for these data shows large positive, significant spikes at lags 1 and 2, with subsequent positive autocorrelations that do not die off quickly. This pattern is typical of an autoregressive process.
