
Assignment No. 2 Fall_11 (SPSS)
Quantitative Techniques in Project Management
Instructor: Dr. Basheer Ahmad Samim
Submitted by: Imran Masih
Registration No. 1167109
Submission Date: 31st December 2011

Question No. 1. Write a regression model with the help of the following tables. Interpret the model parameters properly by testing the relevant hypotheses. What is your understanding about R-Square? How will you explain the sig. value in ANOVA table?

Model Summary

Model   R        R Square   Adjusted R Square   Std. Error of the Estimate
1       .716 a   .513       .503                32.83959

This table displays R, R squared, adjusted R squared, and the standard error of the estimate. R (.716) is the multiple correlation coefficient, that is, the correlation between the observed and predicted values of the dependent variable. For a regression model with a constant, R ranges from 0 to 1, and larger values indicate a stronger relationship between the observed and predicted values.

R squared (.513) is the proportion of variation in the dependent variable explained by the regression model: about 51.3% of the variation in average monthly minutes is explained by the five predictors. The values of R squared range from 0 to 1, and small values indicate that the model does not fit the data well. The sample R squared tends to estimate optimistically how well the model fits the population. Adjusted R squared (.503) attempts to correct R squared so that it more closely reflects the goodness of fit of the model in the population.

We can use R squared to help determine which model is best: choose a model with a high value of R squared that does not contain too many variables. Models with too many variables are often overfitted and hard to interpret.
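As a quick check on the relationship between R squared and adjusted R squared, the short Python sketch below applies the standard adjustment formula to the reported values. It assumes n = 250 cases (Total df of 249 plus 1 in the ANOVA table) and k = 5 predictors (the Regression df); the function name is only illustrative.

```python
# Values read from the SPSS output in this assignment.
r_square = 0.513   # R Square from the Model Summary table
n = 250            # assumed sample size: Total df (249) + 1
k = 5              # assumed number of predictors: Regression df


def adjusted_r_square(r2: float, n: int, k: int) -> float:
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


adj = adjusted_r_square(r_square, n, k)
print(f"Adjusted R Square = {adj:.3f}")
```

The result agrees with the .503 reported by SPSS, which shows that the penalty for five predictors is small when there are 250 cases.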

ANOVA

Model 1      Sum of Squares   df    Mean Square   F        Sig.
Regression   276897.379       5     55379.476     51.352   .000 a
Residual     263139.033       244   1078.439
Total        540036.413       249

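The F statistic is the regression mean square (MSR) divided by the residual mean square (MSE). It tests the null hypothesis that all the population regression coefficients are 0 against the alternative that at least one of them is not 0. If the significance value of the F statistic is small (smaller than, say, 0.05), the null hypothesis is rejected and we conclude that the independent variables, taken together, explain a significant part of the variation in the dependent variable. If the significance value of F is larger than, say, 0.05, the null hypothesis cannot be rejected and there is no evidence that the independent variables explain the variation in the dependent variable. Here F = 51.352 with Sig. = .000 (that is, p < 0.001), so the null hypothesis is rejected and the model as a whole is significant.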
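The entries of the ANOVA table are linked by simple arithmetic, which the following Python sketch reproduces from the reported sums of squares and degrees of freedom. It is only a consistency check on the printed output, not a replacement for the SPSS run; the variable names are our own.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_regression, df_regression = 276897.379, 5
ss_residual, df_residual = 263139.033, 244

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of the two mean squares.
f_statistic = ms_regression / ms_residual

# R squared is the explained share of the total sum of squares.
r_square = ss_regression / (ss_regression + ss_residual)

# The standard error of the estimate is the square root of the residual mean square.
std_error_estimate = ms_residual ** 0.5

print(f"MSR = {ms_regression:.3f}, MSE = {ms_residual:.3f}")
print(f"F = {f_statistic:.3f}, R Square = {r_square:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.5f}")
```

These values match the F of 51.352, the R Square of .513, and the standard error of 32.83959 shown in the SPSS tables.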

Coefficients

                           Unstandardized Coefficients   Standardized Coefficients
Model 1                    B          Std. Error         Beta                        t        Sig.
(Constant)                 -15.534    14.262                                         -1.089   .277
Average monthly bill       .491       .129               .209                        3.820    .000
Pct used for business      .494       .273               .096                        1.812    .071
Years using our service    11.224     3.736              .146                        3.005    .003
Household income (1998)    .483       .200               .115                        2.415    .016
Propensity to leave        1.699      .167               .486                        10.151   .000

a. Dependent Variable: Avg. monthly minutes

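Here we can write the regression line as follows:

Avg. monthly minutes = -15.534 + 0.491*(Average monthly bill) + 0.494*(Pct used for business) + 11.224*(Years using our service) + 0.483*(Household income) + 1.699*(Propensity to leave)

Each unstandardized coefficient B is the expected change in average monthly minutes for a one-unit increase in that variable when the other variables are held constant. For each coefficient we test the null hypothesis that the population coefficient is 0 against the alternative that it is not 0. At the 0.05 level, average monthly bill (Sig. = .000), years using our service (.003), household income (.016), and propensity to leave (.000) are significant, while pct used for business (.071) and the constant (.277) are not.

The beta coefficient tells us how strongly the independent variable is associated with the dependent variable. In a simple regression with a single predictor it is equal to the correlation coefficient between the two variables. Often the independent variables are measured in different units, and the standardized coefficients, or betas, are an attempt to make the regression coefficients more comparable. If we transformed the data to z scores prior to our regression analysis, we would get the beta coefficients as the unstandardized coefficients. Here propensity to leave has the largest beta (.486), followed by average monthly bill (.209).

The t statistics can help us to determine the relative importance of each variable in the model. As a guide regarding useful predictors, we look for t values well below -2 or above +2.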
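The t statistic for each coefficient is the unstandardized coefficient divided by its standard error. The Python sketch below recomputes t from the reported B and Std. Error values and applies the 0.05 rule to the reported Sig. values. Because B and Std. Error are rounded to three decimals in the printout, the recomputed t values differ slightly from the SPSS column. The helper predict_minutes is a hypothetical convenience function for evaluating the fitted line; the units of its inputs are assumed to be those of the original data file.

```python
# (B, Std. Error, reported t, Sig.) copied from the Coefficients table.
coefficients = {
    "(Constant)": (-15.534, 14.262, -1.089, 0.277),
    "Average monthly bill": (0.491, 0.129, 3.820, 0.000),
    "Pct used for business": (0.494, 0.273, 1.812, 0.071),
    "Years using our service": (11.224, 3.736, 3.005, 0.003),
    "Household income (1998)": (0.483, 0.200, 2.415, 0.016),
    "Propensity to leave": (1.699, 0.167, 10.151, 0.000),
}

ALPHA = 0.05  # significance level assumed for the individual t tests


def predict_minutes(bill, pct_business, years, income, leave):
    """Evaluate the fitted regression line (hypothetical helper)."""
    b = {name: values[0] for name, values in coefficients.items()}
    return (b["(Constant)"]
            + b["Average monthly bill"] * bill
            + b["Pct used for business"] * pct_business
            + b["Years using our service"] * years
            + b["Household income (1998)"] * income
            + b["Propensity to leave"] * leave)


significant = []
for name, (b, se, t_reported, sig) in coefficients.items():
    t = b / se  # recomputed from rounded values, so only approximately equal
    decision = "reject H0" if sig < ALPHA else "do not reject H0"
    if sig < ALPHA:
        significant.append(name)
    print(f"{name:26s} t = {t:7.3f} (SPSS {t_reported:7.3f})  {decision}")
```

The list significant then holds the four predictors whose coefficients differ from 0 at the 0.05 level.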

Question No. 2. Test the interdependence of the given variables with the help of the following correlation matrix.
Correlations (N = 250 for every pair)

                                       Minutes   Bill      Business  Years     Income    Leave
Avg monthly minutes       Pearson r    1         .478**    .347**    .314**    .333**    .608**
                          Sig.                   .000      .000      .000      .000      .000
Average monthly bill      Pearson r    .478**    1         .505**    .303**    .213**    .312**
                          Sig.         .000                .000      .000      .001      .000
Pct used for business     Pearson r    .347**    .505**    1         .309**    .232**    .152*
                          Sig.         .000      .000                .000      .000      .016
Years using our service   Pearson r    .314**    .303**    .309**    1         .241**    .098
                          Sig.         .000      .000      .000                .000      .121
Household income (1998)   Pearson r    .333**    .213**    .232**    .241**    1         .238**
                          Sig.         .000      .001      .000      .000                .000
Propensity to leave       Pearson r    .608**    .312**    .152*     .098      .238**    1
                          Sig.         .000      .000      .016      .121      .000

Pearson r = Pearson Correlation; Sig. = Sig. (2-tailed). Column headings abbreviate the row variables in the same order.
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

The results of the analysis are presented in the form of a correlation matrix. Each variable entered is given a row and a column, so the six variables here give a 6 x 6 matrix. Each cell contains three rows: the correlation, the significance level, and N, the number of cases. To compare two variables, we find one variable in the left-hand column and the second variable in the top row; the cell in which that row and that column meet gives the data for that relationship. The top row in each cell is the correlation coefficient, and the second row is the significance level for the null hypothesis that the population correlation is 0. If the value is less than .05, the correlation is significant and there is a linear relationship between the two variables.

In this matrix all correlations are positive. Every pair is significant at the 0.01 level except two: pct used for business with propensity to leave (r = .152, Sig. = .016) is significant only at the 0.05 level, and years using our service with propensity to leave (r = .098, Sig. = .121) is not significant. The strongest relationship is between avg monthly minutes and propensity to leave (r = .608).
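The significance value for a Pearson correlation comes from a t test with n - 2 degrees of freedom, where t = r * sqrt(n - 2) / sqrt(1 - r^2). The Python sketch below applies this to three correlations from the matrix. As an assumption made to keep the example free of external libraries, the two-tailed p value is approximated with the standard normal distribution, which is reasonable for 248 degrees of freedom but is not exactly what SPSS computes.

```python
import math

N = 250  # number of cases in every cell of the correlation matrix


def correlation_t(r: float, n: int) -> float:
    """t statistic for H0: population correlation = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def approx_two_tailed_p(t: float) -> float:
    """Two-tailed p value using the normal approximation to the t distribution."""
    return math.erfc(abs(t) / math.sqrt(2.0))


pairs = {
    "Minutes vs. propensity to leave": 0.608,
    "Pct business vs. propensity to leave": 0.152,
    "Years of service vs. propensity to leave": 0.098,
}

p_values = {}
for label, r in pairs.items():
    t = correlation_t(r, N)
    p_values[label] = approx_two_tailed_p(t)
    print(f"{label}: r = {r:.3f}, t = {t:.3f}, approx. p = {p_values[label]:.3f}")
```

The approximate p values fall on the same side of the 0.05 and 0.01 thresholds as the Sig. values in the SPSS matrix.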

Question No. 3. Test the association between Estrogen Receptor Status and Pathological Tumor Size by explaining all the six steps in testing of hypothesis properly.
Estrogen Receptor Status * Pathological Tumor Size (Categories) Crosstabulation

                                             Pathological Tumor Size (Categories)
Estrogen Receptor Status                     <= 2 cm   2-5 cm   > 5 cm   Total
Negative                  Count              211       112      7        330
                          Expected Count     235.3     91.2     3.6      330.0
Positive                  Count              385       119      2        506
                          Expected Count     360.7     139.8    5.4      506.0
Total                     Count              596       231      9        836
                          Expected Count     596.0     231.0    9.0      836.0

The crosstabulation is based on the valid cases only; cases with missing values are excluded, and the Valid N (number of cases) is used in the table. The table arranges the 836 valid cases by estrogen receptor status and pathological tumor size. Of the 330 cases with negative estrogen receptor status, 211 have tumors of <= 2 cm, 112 have tumors of 2-5 cm, and 7 have tumors of > 5 cm. Of the 506 cases with positive status, the counts are 385, 119, and 2. Compared with the expected counts, the negative group has fewer small tumors (211 against 235.3) and more tumors of 2-5 cm (112 against 91.2) and > 5 cm (7 against 3.6). Our initial conclusion might be that the distribution of pathological tumor size differs between cases with negative and positive estrogen receptor status.
Chi-Square Tests

                               Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square             17.512 a   2    .000
Likelihood Ratio               17.369     2    .000
Linear-by-Linear Association   16.647     1    .000
N of Valid Cases               836

a. 1 cells (16.7%) have expected count less than 5. The minimum expected count is 3.55.

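The six steps of the hypothesis test are as follows.

Step 1. State the hypotheses. H0: estrogen receptor status and pathological tumor size are independent (there is no association). H1: the two variables are associated.
Step 2. Choose the level of significance, say 0.05.
Step 3. Choose the test statistic. Both variables are categorical, so we use the Pearson chi-square statistic, which compares the observed counts with the expected counts.
Step 4. State the decision rule. The table has 2 rows and 3 columns, so the degrees of freedom are (2 - 1)(3 - 1) = 2. Reject H0 if the significance value is smaller than 0.05.
Step 5. Compute the statistic. From the SPSS output, chi-square = 17.512 with 2 degrees of freedom and Sig. = .000 (that is, p < 0.001).
Step 6. Draw the conclusion. Since p is smaller than 0.05, H0 is rejected: there is a statistically significant association between pathological tumor size and estrogen receptor status.

One cell (16.7%) has an expected count below 5 (the minimum is 3.55), so the chi-square approximation should be read with some caution for the > 5 cm category. This example also shows that one or both variables may have more than two levels and that the variables do not have to have the same number of levels: estrogen receptor status has two levels (negative and positive), and pathological tumor size has three levels (<= 2 cm, 2-5 cm, and > 5 cm).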
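The Pearson chi-square value can be reproduced by hand from the observed counts in the crosstabulation. The Python sketch below computes the expected counts, the statistic, and the degrees of freedom. For the p value it uses the fact that a chi-square variable with exactly 2 degrees of freedom has upper-tail probability exp(-x/2); this shortcut is valid only for df = 2 and is used here to avoid depending on a statistics library.

```python
import math

# Observed counts from the crosstabulation.
# Columns: <= 2 cm, 2-5 cm, > 5 cm
observed = [
    [211, 112, 7],   # Estrogen receptor status: Negative
    [385, 119, 2],   # Estrogen receptor status: Positive
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence = row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square = sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this closed form holds only when df == 2.
p_value = math.exp(-chi_square / 2.0)

print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.5f}")
print(f"minimum expected count = {min(min(row) for row in expected):.2f}")
```

The computed statistic and minimum expected count agree with the SPSS output, and the p value is well below 0.001, which SPSS prints as .000.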

Question No. 4. For the following cases, specify which probability distribution to be used in testing of a given hypothesis and give reason in support of your choice:

Each of the above cases presents a different scenario.


For Case No. 01

We will use a two-tailed test with the Z distribution because the population is normal, the given value is 15, and n = 35 is greater than 30. Although sigma is unknown, a sample of this size justifies using the normal distribution.
For Case No. 02

We will use a two-tailed test with the t distribution because the population is normal, the given value is 9.9, n = 16 is less than 30, and sigma has to be estimated from the sample.
For Case No. 03

We will use a one-tailed (right-tailed) test with the t distribution because the population is normal, the given value is 42, n = 10 is less than 30, and sigma has to be estimated from the sample.
For Case No. 04

We will use a one-tailed (right-tailed) test with the t distribution because the population is normal, the given value is 148, n = 29 is less than 30, and sigma has to be estimated from the sample.
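The choice between the Z and t distributions in the four cases follows a common textbook rule of thumb, sketched below in Python. The function name, the cutoff of 30, and the sigma_known flags passed in the loop are assumptions made for illustration, since the case descriptions themselves are not reproduced in this document; only the sample sizes 35, 16, 10, and 29 come from the answers above.

```python
def choose_distribution(n: int, sigma_known: bool, population_normal: bool = True) -> str:
    """Rule-of-thumb choice of sampling distribution for a test about a mean."""
    if sigma_known and (population_normal or n >= 30):
        return "Z"
    if n >= 30:
        # Sigma is estimated from the sample, but the sample is large.
        return "Z (large-sample approximation)"
    if population_normal:
        # Small sample from a normal population with sigma estimated.
        return f"t with {n - 1} degrees of freedom"
    return "neither (consider a nonparametric test)"


# Sample sizes from the four cases; sigma is assumed to be estimated in each.
for case, n in enumerate([35, 16, 10, 29], start=1):
    print(f"Case {case}: n = {n} -> {choose_distribution(n, sigma_known=False)}")
```

Under these assumptions the rule gives the Z distribution for the first case and the t distribution for the other three, in line with the answers above.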
