
SW388R7 Data Analysis & Computers II

Slide 1

Hierarchical multiple regression

Hierarchical and standard multiple regression
Sample problem
Steps in hierarchical multiple regression

Slide 2

Hierarchical multiple regression

Standard multiple regression examines the relationship between a set of independent variables and a dependent variable.

Hierarchical multiple regression examines the relationship between a set of independent variables and the dependent variable, controlling for or taking into account the impact of a different set of independent variables on the dependent variable.

For example, we might ask whether there are differences between the average salary for male employees and female employees, even after we take into account differences between education levels and prior work experience.

In hierarchical regression, the variables are entered into the analysis in a sequence of blocks, or groups that may contain one or more variables. In the example above, education and work experience would be entered in the first block and sex would be entered in the second block.

Slide 3

Differences in statistical results

SPSS shows the statistical results (Model Summary, ANOVA, Coefficients, etc.) as each block of variables is entered into the analysis.

In addition (if requested), SPSS prints and tests the key statistic used in evaluating the hierarchical hypothesis: the change in R² for each additional block of variables.

The null hypothesis for the addition of each block of variables to the analysis is that the change in R² (contribution to the explanation of the variance in the dependent variable) is zero.

A statistically significant change in R² indicates that the variables in block 2 had a relationship to the dependent variable, after controlling for the relationship of the block 1 variables to the dependent variable.
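The R² change test just described can be sketched numerically. Below is a minimal NumPy illustration (not SPSS output; the function and variable names are our own): it fits the block 1 model, then the blocks 1+2 model, and forms the F statistic for the change in R².

```python
import numpy as np

def r_squared(X, y):
    """R-squared for an OLS fit of y on X (a constant column is added)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def f_change(X_block1, X_block2, y):
    """Change in R-squared when block 2 is added, and its F statistic."""
    n = len(y)
    k1, k2 = X_block1.shape[1], X_block2.shape[1]
    r2_1 = r_squared(X_block1, y)
    r2_2 = r_squared(np.column_stack([X_block1, X_block2]), y)
    df2 = n - k1 - k2 - 1
    f = ((r2_2 - r2_1) / k2) / ((1 - r2_2) / df2)
    return r2_2 - r2_1, f, (k2, df2)
```

The probability of the F statistic against the F(k2, n - k1 - k2 - 1) distribution is what SPSS reports as Sig. F Change.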

Slide 4

Variations in hierarchical regression - 1

Hierarchical regression supports any ordering of the independent variables, i.e. the analyst can specify a hypothesis that specifies an exact order of entry for variables.

A common hierarchical hypothesis involves two sets of variables: a set of control variables entered in the first block and a set of predictor variables entered in the second block.

Control variables are variables that we expect to make a difference in scores on the dependent variable.

Predictors are the variables in whose effect our research question is really interested, but whose effect we want to separate out from the control variables.

Slide 5

Variations in hierarchical regression - 2

Some hierarchical analyses require statistical significance for the addition of each block of variables before proceeding.

In the variation used here, each block is entered after the block of variables previously entered into the analysis, whether or not a previous block was statistically significant. The analysis is interested in obtaining the best indicator of the effect of the predictor variables. The statistical significance of previously entered variables is not interpreted.

This is the variation used in these problems.

Slide 6

Interpretation in these problems

In these problems, the change in R² when the predictor block is added to the analysis is interpreted, rather than the overall R² for the model with all variables entered.

An interpretation of the individual relationship between the predictors and the dependent variable is presented, after verifying the significance of the predictor variables.

Differences in control variables are ignored.

Slide 7

The research question raises the possibility of doing multiple regression to evaluate the relationships among these variables. The inclusion of the "controlling for" phrase indicates that this is a hierarchical multiple regression problem.

Multiple regression is feasible if the dependent variable is metric and the independent variables (both predictors and controls) are metric or dichotomous, and the available data is sufficient to satisfy the sample size requirements.

Slide 8

Hierarchical multiple regression requires that the dependent variable be metric and the independent variables be metric or dichotomous.

"Spouse's highest academic degree" [spdeg] is ordinal, satisfying the metric level of measurement requirement for the dependent variable, if we follow the convention of treating ordinal level variables as metric. Since some data analysts do not agree with this convention, a note of caution should be included in our interpretation.

"Age" [age] is interval, satisfying the metric or dichotomous level of measurement requirement for independent variables.

"Highest academic degree" [degree] is ordinal, satisfying the metric or dichotomous level of measurement requirement for independent variables, if we follow the convention of treating ordinal level variables as metric. Since some data analysts do not agree with this convention, a note of caution should be included in our interpretation.

"Sex" [sex] is dichotomous, satisfying the metric or dichotomous level of measurement requirement for independent variables.

True with caution is the correct answer.

Slide 9

The next question asks whether the data satisfies the sample size requirements for multiple regression.

To answer this question, we will run the initial or baseline multiple regression to obtain some basic data about the problem and solution.

Slide 10

After evaluating assumptions and outliers, we will make a decision whether we should interpret the model that includes the transformed variables and omits outliers (the revised model), or whether we will interpret the model that uses the untransformed variables and includes all cases including the outliers (the baseline model).

In order to make this decision, we run the baseline regression before we examine assumptions and outliers, and record the R² for the baseline model. If using transformations and omitting outliers substantially improves the analysis (a 2% or greater increase in R²), we interpret the revised model. If the increase is smaller, we interpret the baseline model.

To run the baseline model, select Regression | Linear from the Analyze menu.

Slide 11

First, move the dependent variable spdeg to the Dependent text box.

Second, move the independent variables to control for, age and sex, to the Independent(s) list box.

Third, click on the Next button to tell SPSS to add another block of variables to the regression analysis.

SPSS provides a choice of methods for entering the variables into the analysis from the drop-down Method menu. In this example, we accept the default of Enter for direct entry of all variables in the first block, which will force the controls into the regression.

Slide 12

SPSS identifies that we will now be adding variables to a second block.

First, move the predictor independent variable degree to the Independent(s) list box for block 2.

Second, click on the Statistics button to specify the statistics options that we want.

Slide 13

First, mark the checkbox for Estimates on the Regression Coefficients panel.

Second, mark the checkboxes for Model fit, Descriptives, and R squared change. The R squared change statistic will tell us whether or not the variables added after the controls have a relationship to the dependent variable.

Third, mark the checkbox for the Durbin-Watson statistic on the Residuals panel.

Fourth, mark the checkbox for Collinearity diagnostics to get tolerance values for testing multicollinearity.

Fifth, click on the Continue button to close the dialog box.

Slide 14

Click on the OK button to request the regression output.

Slide 15

The R² of 0.281 is the benchmark that we will use to evaluate the utility of transformations and the elimination of outliers.

Prior to any transformations of variables to satisfy the assumptions of multiple regression or the removal of outliers, the proportion of variance in the dependent variable explained by the independent variables (R²) was 28.1%. The relationship is statistically significant, though we would not stop if it were not significant, because the lack of significance may be a consequence of violation of assumptions or the inclusion of outliers.

Slide 16

Descriptive Statistics

                         Mean    Std. Deviation   N
SPOUSES HIGHEST DEGREE   1.78    1.281            136
AGE OF RESPONDENT        45.80   14.534           136
RESPONDENTS SEX          1.60    .491             136
RS HIGHEST DEGREE        1.65    1.220            136

Multiple regression requires that the minimum ratio of valid cases to independent variables be at least 5 to 1. The ratio of valid cases (136) to number of independent variables (3) was 45.3 to 1, which was equal to or greater than the minimum ratio. The requirement for a minimum ratio of cases to independent variables was satisfied.

In addition, the ratio of 45.3 to 1 satisfied the preferred ratio of 15 cases per independent variable.

The answer to the question is true.

Slide 17

Normality of the dependent variable - question

Having satisfied the level of measurement and sample size requirements, we turn our attention to conformity with three of the assumptions of multiple regression: normality, linearity, and homoscedasticity.

First, we will evaluate the assumption of normality for the dependent variable.

Slide 18

First, move the variables to the list boxes based on the role that the variable plays in the analysis and its level of measurement.

Second, click on the Normality option button to request that SPSS produce the output needed to evaluate the assumption of normality.

Third, mark the checkboxes for the transformations that we want to test in evaluating the assumption.

Fourth, click on the OK button to produce the output.

Slide 19

Normality of spouses highest degree

Descriptives: SPOUSES HIGHEST DEGREE

                                       Statistic   Std. Error
Mean                                   1.78        .110
95% Confidence Interval  Lower Bound   1.56
for Mean                 Upper Bound   2.00
5% Trimmed Mean                        1.75
Median                                 1.00
Variance                               1.640
Std. Deviation                         1.281
Minimum                                0
Maximum                                4
Range                                  4
Interquartile Range                    2.00
Skewness                               .573        .208
Kurtosis                               -1.051      .413

"Spouse's highest academic degree" [spdeg] did not satisfy the criteria for a normal distribution. The skewness of the distribution (0.573) was between -1.0 and +1.0, but the kurtosis of the distribution (-1.051) fell outside the range from -1.0 to +1.0.

The answer to the question is false.

Slide 20

Normality of the log of spouses highest degree

The "log of spouse's highest academic degree [LGSPDEG=LG10(1+SPDEG)]" satisfied the criteria for a normal distribution. The skewness of the distribution (-0.091) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.678) was between -1.0 and +1.0.

The "log of spouse's highest academic degree [LGSPDEG=LG10(1+SPDEG)]" was substituted for "spouse's highest academic degree" [spdeg] in the analysis.
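The ±1.0 screening rule applied above is easy to reproduce outside SPSS. Below is a rough NumPy sketch: SPSS applies small-sample corrections to skewness and kurtosis, so its values will differ slightly, and the deck's LG10(1+SPDEG) transformation corresponds to np.log10(1 + spdeg).

```python
import numpy as np

def skewness(x):
    """Third standardized moment (no small-sample correction)."""
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

def excess_kurtosis(x):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    z = (x - x.mean()) / x.std()
    return (z ** 4).mean() - 3.0

def normal_enough(x):
    """The deck's rule of thumb: skewness and kurtosis both in [-1.0, +1.0]."""
    return abs(skewness(x)) <= 1.0 and abs(excess_kurtosis(x)) <= 1.0
```

A variable that fails the rule can be re-checked after a candidate transformation, e.g. normal_enough(np.log10(1 + spdeg)).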

Slide 21

Next, we will evaluate the assumption of normality for the control variable, age.

Slide 22

Descriptives: AGE OF RESPONDENT

                                       Statistic   Std. Error
Mean                                   45.99       1.023
95% Confidence Interval  Lower Bound   43.98
for Mean                 Upper Bound   48.00
5% Trimmed Mean                        45.31
Median                                 43.50
Variance                               282.465
Std. Deviation                         16.807
Minimum                                19
Maximum                                89
Range                                  70
Interquartile Range                    24.00
Skewness                               .595        .148
Kurtosis                               -.351       .295

"Age" [age] satisfied the criteria for a normal distribution. The skewness of the distribution (0.595) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.351) was between -1.0 and +1.0.

Slide 23

Normality of highest academic degree

Next, we will evaluate the assumption of normality for the predictor variable, highest academic degree.

Slide 24

Normality of respondents highest academic degree

Descriptives: RS HIGHEST DEGREE

                                       Statistic   Std. Error
Mean                                   1.41        .071
95% Confidence Interval  Lower Bound   1.27
for Mean                 Upper Bound   1.55
5% Trimmed Mean                        1.35
Median                                 1.00
Variance                               1.341
Std. Deviation                         1.158
Minimum                                0
Maximum                                4
Range                                  4
Interquartile Range                    1.00
Skewness                               .948        .149
Kurtosis                               -.051       .297

"Highest academic degree" [degree] satisfied the criteria for a normal distribution. The skewness of the distribution (0.948) was between -1.0 and +1.0 and the kurtosis of the distribution (-0.051) was between -1.0 and +1.0.

Slide 25

Linearity - question

The independent variables satisfied the criteria for normality, but the dependent variable did not.

However, the logarithmic transformation of "spouse's highest academic degree" produced a variable that was normally distributed and will be tested as a substitute in the analysis.

The script for linearity will support our using the transformed dependent variable without having to add it to the data set.

Slide 26

When the Linearity option is selected, a default set of transformations to test is marked.

First, click on the Linearity option button to request that SPSS produce the output needed to evaluate the assumption of linearity.

Second, since we have decided to use the log transformation of the dependent variable, we mark the check box for the Logarithmic transformation and clear the check box for the Untransformed version of the dependent variable.

Third, click on the OK button to produce the output.

Slide 27

Linearity with respondents highest academic degree

The correlation between "highest academic degree" and the logarithmic transformation of "spouse's highest academic degree" was statistically significant (r=.519, p<0.001). A linear relationship exists between these variables.
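The linearity screen rests on the Pearson correlation. For reference, here is a minimal NumPy version (np.corrcoef returns the same value):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: mean product of the two standardized variables."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return (zx * zy).mean()
```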

Slide 28

Linearity with respondents age

The assessment of the relationship between the logarithmic transformation of "spouse's highest academic degree" [LGSPDEG=LG10(1+SPDEG)] and "age" [age] indicated that the relationship was weak, rather than nonlinear. Neither the correlation between the logarithmic transformation of "spouse's highest academic degree" and "age" nor the correlations with the transformations of age were statistically significant.

The correlation between "age" and the logarithmic transformation of "spouse's highest academic degree" was not statistically significant (r=.009, p=0.921). The correlations for the transformations were: the logarithmic transformation (r=.061, p=0.482); the square root transformation (r=.034, p=0.692); the inverse transformation (r=.112, p=0.194); and the square transformation (r=-.037, p=0.668).

Slide 29

"Sex" is the dichotomous independent variable in the analysis. We will test it for homogeneity of variance using the logarithmic transformation of the dependent variable, which we have already decided to use.

Slide 30

Testing homogeneity of variance

When the Homogeneity of variance option is selected, a default set of transformations to test is marked.

First, click on the Homogeneity of variance option button to request that SPSS produce the output needed to evaluate the assumption of homogeneity of variance.

Second, since we have decided to use the log transformation of the dependent variable, we mark the check box for the Logarithmic transformation and clear the check box for the Untransformed version of the dependent variable.

Third, click on the OK button to produce the output.

Slide 31

Homogeneity of variance evidence and answer

The variance in "log of spouse's highest academic degree [LGSPDEG=LG10(1+SPDEG)]" was homogeneous for the categories of "sex" [sex]. The probability associated with the Levene statistic (0.687) was p=0.409, greater than the level of significance for testing assumptions (0.01). The null hypothesis that the group variances were equal was not rejected.

The homogeneity of variance assumption was satisfied. The answer to the question is true.
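For reference, the Levene statistic reported above can be computed by hand. Below is a NumPy-only sketch (mean-centered, as in the classic Levene test; the p-value lookup against the F(k-1, N-k) distribution is left to a statistics library):

```python
import numpy as np

def levene_w(*groups):
    """Levene's W for equality of variances across groups (mean-centered)."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)
    n = np.array([len(g) for g in groups])
    N = n.sum()
    z = [np.abs(g - g.mean()) for g in groups]       # absolute deviations
    zbar_i = np.array([zi.mean() for zi in z])       # per-group means of z
    zbar = np.concatenate(z).mean()                  # grand mean of z
    num = (N - k) * (n * (zbar_i - zbar) ** 2).sum()
    den = (k - 1) * sum(((zi - m) ** 2).sum() for zi, m in zip(z, zbar_i))
    return num / den, (k - 1, N - k)
```

A large W relative to the F(k-1, N-k) distribution leads to rejecting the null hypothesis of equal group variances.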

Slide 32

We satisfied the assumption of normality for spouses highest academic degree with a logarithmic transformation. We need to add this transformed variable to the data set, so that we can incorporate it in our detection of outliers.

We can use the script to compute transformed variables and add them to the data set. We select an assumption to test (Normality is the easiest), mark the check box for the transformation we want to retain, and clear the check box "Delete variables created in this analysis."

The script will leave the transformed variable in the data set. To remove it, you can delete the column or close the data set without saving.

Slide 33

First, move the variable SPDEG to the list box for the dependent variable.

Second, click on the Normality option button to request that SPSS do the test for normality, including the transformation we will mark.

Third, mark the checkbox for the transformation we want to retain (Logarithmic) and clear the checkboxes for the other transformations.

Fourth, clear the check box for the option "Delete variables created in this analysis".

Fifth, click on the OK button.

Slide 34

Looking at the new column in the data editor, we see that the log of SPDEG is included in the data set.

Slide 35

Transformed variables in the script - 1

If we look at the list of variables in the script, we see that the log of SPDEG is not included in the list of available variables.

To add the log of SPDEG to the list of variables in the script, click on the Reset button. This will start the script over again, with a new list of variables from the data set.

Slide 36

Transformed variables in the script - 2

If we look at the list of variables now, we see that the log of SPDEG is included in the list of available variables.

Slide 37

An outlier can be defined as a case that has a large residual because the equation did a poor job of predicting its value.

We will run the regression again incorporating any transformations we have decided to test, and have SPSS compute the standardized residual for each case. Cases with a standardized residual larger than +/- 3.0 will be treated as outliers.
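The outlier screen above can be sketched in NumPy (illustrative names; SPSS's ZRESID divides each residual by the standard error of the estimate, which is what this mirrors):

```python
import numpy as np

def standardized_residuals(X, y):
    """OLS residuals divided by the standard error of the estimate."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    # standard error of the estimate: sqrt(SSE / (n - k - 1))
    see = np.sqrt(resid @ resid / (len(y) - X1.shape[1]))
    return resid / see

def outliers(X, y, cutoff=3.0):
    """Indices of cases whose standardized residual exceeds +/- cutoff."""
    z = standardized_residuals(X, y)
    return np.flatnonzero(np.abs(z) > cutoff)
```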

Slide 38

To run the regression to detect outliers, select the Linear Regression command from the menu that drops down when you click on the Dialog Recall button.

Slide 39

Substituting transformed variables

First, substitute the log of the variable, LGSPDEG, for the untransformed variable spdeg.

Second, click on the Statistics button to select statistics we will need for the analysis.

Slide 40

First, mark the checkbox for Estimates on the Regression Coefficients panel.

Second, mark the checkboxes for Model fit, Descriptives, and R squared change. The R squared change statistic will tell us whether or not the variables added after the controls have a relationship to the dependent variable.

Third, mark the checkbox for the Durbin-Watson statistic on the Residuals panel.

Fourth, mark the checkbox for Casewise diagnostics, which will be used to identify outliers.

Fifth, mark the checkbox for Collinearity diagnostics to get tolerance values for testing multicollinearity.

Sixth, click on the Continue button to close the dialog box.

Slide 41

Mark the checkbox for Standardized Residuals so that SPSS saves a new variable in the data editor. We will use this variable to omit outliers in the revised regression model.

Click on the Continue button to close the dialog box.

Slide 42

Click on the OK button to obtain the output for the revised model.

Slide 43

If cases have a standardized residual larger than +/- 3.0, SPSS creates a table titled Casewise Diagnostics, in which it lists the cases and values that result in their being an outlier. If there are no outliers, SPSS does not print the Casewise Diagnostics table. There was no table for this problem. The answer to the question is true.

We can verify that all standardized residuals were less than +/- 3.0 by looking at the minimum and maximum standardized residuals in the table of Residual Statistics. Both the minimum and maximum fell in the acceptable range.

Since there were no outliers, we can use the regression just completed to make our decision about which model to interpret.

Slide 44

Since there were no outliers to omit, we can use the regression just completed to make our decision about which model to interpret.

If the R² for the revised model is higher by 2% or more, we will base our interpretation on the revised model; otherwise, we will interpret the baseline model.

Slide 45

Prior to any transformations of variables to satisfy the assumptions of multiple regression and the removal of outliers, the proportion of variance in the dependent variable explained by the independent variables (R²) was 28.1%.

After substituting transformed variables, the proportion of variance in the dependent variable explained by the independent variables (R²) was 27.1%.

Since the revised regression model did not explain at least two percent more variance than explained by the baseline regression analysis, the baseline regression model with all cases and the original form of all variables should be used for the interpretation.

The transformations used to satisfy the assumptions will not be used, so cautions should be added for the assumptions violated.

False is the correct answer to the question.

Slide 46

Since we will use the baseline model for the interpretation of this analysis, the SPSS regression output was re-created.

To run the baseline regression again, select the Linear Regression command from the menu that drops down when you click on the Dialog Recall button.

Slide 47

First, remove the variable lgspdeg from the dependent variable textbox and add the variable spdeg.

Second, click on the Save button to remove the request to save standardized residuals to the data editor.

Slide 48

Clear the checkbox for Standardized Residuals so that SPSS does not save a new set of them in the data editor when it runs the new regression.

Click on the Continue button to close the dialog box.

Slide 49

Click on the OK button to request the regression output.

Slide 50

Next, we examine the assumption of independence of errors for the analysis we will interpret.

Slide 51

Independence of errors evidence and answer

Having a regression model for interpretation, we can now examine the final of the assumptions, independence of errors.

Model Summary

Model  R      R Square  Adj. R Square  Std. Error  R Sq. Change  F Change  df1  df2  Sig. F Change  Durbin-Watson
1      .014a  .000      -.015          1.290       .000          .013      2    133  .987
2      .531b  .281      .265           1.098       .281          51.670    1    132  .000           1.754

b. Predictors: (Constant), RESPONDENTS SEX, AGE OF RESPONDENT, RS HIGHEST DEGREE

The Durbin-Watson statistic is used to test for the presence of serial correlation among the residuals, i.e., the assumption of independence of errors, which requires that the residuals or errors in prediction do not follow a pattern from case to case.

The Durbin-Watson statistic ranges from 0 to 4. As a general rule of thumb, the residuals are not correlated if the Durbin-Watson statistic is approximately 2, and an acceptable range is 1.50 - 2.50.

The Durbin-Watson statistic for this problem is 1.754, which falls within the acceptable range.

If the Durbin-Watson statistic was not in the acceptable range, we would add a caution to the findings for a violation of regression assumptions.

The answer to the question is true.
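The Durbin-Watson statistic quoted above is simple to compute from the residuals; a NumPy sketch:

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); near 2 means no serial correlation."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```

Values well below 2 suggest positive serial correlation among the residuals; values well above 2 suggest negative serial correlation.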

Slide 52

Multicollinearity - question

The next issue that can have an impact on our interpretation is multicollinearity.

Slide 53

The tolerance values for all of the independent variables are larger than 0.10: "highest academic degree" [degree] (.990), "age" [age] (.954) and "sex" [sex] (.947). Multicollinearity is not a problem in this regression analysis.

True is the correct answer to the question.
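Tolerance itself is easy to reproduce: for each independent variable, regress it on the other independent variables and take 1 - R²; VIF is the reciprocal. A NumPy sketch with illustrative names:

```python
import numpy as np

def tolerances(X):
    """Tolerance (1 - R^2) of each column of X regressed on the other columns."""
    X = np.asarray(X, dtype=float)
    tol = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(X)), others])
        beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ beta
        ss_tot = ((X[:, j] - X[:, j].mean()) ** 2).sum()
        tol.append(resid @ resid / ss_tot)   # SSE / SST = 1 - R^2
    return np.array(tol)
```

Tolerances above 0.10 (equivalently, VIF = 1/tolerance below 10) indicate multicollinearity is not a problem, matching the rule used in the slide.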

Slide 54

Overall relationship between the dependent variable and independent variables - question

The first finding to confirm concerns the relationship between the dependent variable and the set of predictors after including the control variables in the analysis.

Slide 55

Overall relationship between the dependent variable and independent variables evidence and answer

Hierarchical multiple regression was performed to test the hypothesis that there was a relationship between the dependent variable "spouse's highest academic degree" [spdeg] and the predictor independent variable "highest academic degree" [degree] after controlling for the effect of the control independent variables "age" [age] and "sex" [sex]. In hierarchical regression, the interpretation for the overall relationship focuses on the change in R². If the change in R² is statistically significant, the overall relationship for all independent variables will be significant as well.

Slide 56

Overall relationship between the dependent variable and independent variables evidence and answer

When the block 2 variables were added (F(1, 132) = 51.670, p<0.001), the predictor variable, highest academic degree, did contribute to the overall relationship with the dependent variable, spouse's highest academic degree. Since the probability of the F statistic (p<0.001) was less than or equal to the level of significance (0.05), the null hypothesis that the change in R² was equal to 0 was rejected. The research hypothesis that highest academic degree reduced the error in predicting spouse's highest academic degree was supported.

Slide 57

Overall relationship between the dependent variable and independent variables evidence and answer

The R² with all variables ("highest academic degree") in the analysis was 0.281, not 0.241.

Using a proportional reduction in error interpretation for R², information provided by the predictor variables reduced our error in predicting "spouse's highest academic degree" [spdeg] by 28.1%, not 24.1%.

The answer to the question is false because the problem stated an incorrect statistical value.

Slide 58

Relationship of the predictor variable and the dependent variable - question

In these problems, we will focus the interpretation of individual relationships on the predictor variables and ignore the contribution of the control variables.

Slide 59

Relationship of the predictor variable and the dependent variable evidence and answer

Coefficients

Model / Variable          B      Std. Error  Beta    t       Sig.   Tolerance  VIF
1  (Constant)             1.781  .577                3.085   .002
   AGE OF RESPONDENT      .001   .008        .009    .100    .920   .956       1.046
   RESPONDENTS SEX        -.023  .231        -.009   -.100   .920   .956       1.046
2  (Constant)             .525   .521                1.007   .316
   AGE OF RESPONDENT      .003   .007        .037    .495    .622   .954       1.049
   RESPONDENTS SEX        .114   .198        .044    .575    .566   .947       1.056
   RS HIGHEST DEGREE      .559   .078        .533    7.188   .000   .990       1.010

Based on the statistical test of the b coefficient (t = 7.188, p<0.001) for the independent variable "highest academic degree" [degree], the null hypothesis that the slope or b coefficient was equal to 0 (zero) was rejected. The research hypothesis that there was a relationship between "highest academic degree" and "spouse's highest academic degree" was supported.

Slide 60

Relationship of the predictor variable and the dependent variable evidence and answer

Coefficients

Model / Variable          B      Std. Error  Beta    t       Sig.   Tolerance  VIF
1  (Constant)             1.781  .577                3.085   .002
   AGE OF RESPONDENT      .001   .008        .009    .100    .920   .956       1.046
   RESPONDENTS SEX        -.023  .231        -.009   -.100   .920   .956       1.046
2  (Constant)             .525   .521                1.007   .316
   AGE OF RESPONDENT      .003   .007        .037    .495    .622   .954       1.049
   RESPONDENTS SEX        .114   .198        .044    .575    .566   .947       1.056
   RS HIGHEST DEGREE      .559   .078        .533    7.188   .000   .990       1.010

The b coefficient for the relationship between the dependent variable "spouse's highest academic degree" [spdeg] and the independent variable "highest academic degree" [degree] was .559, which implies a direct relationship because the sign of the coefficient is positive. Higher numeric values for the independent variable are associated with higher numeric values for the dependent variable "spouse's highest academic degree" [spdeg].

The statement that respondents who had higher academic degrees had spouses with higher academic degrees is correct. The answer to the question is true with caution. Caution in interpreting the relationship should be exercised because of an ordinal variable treated as metric, and violation of the assumption of normality.

Slide 61

Next, we set the random number seed to use in the validation analysis.

Slide 62

Validation analysis: set the random number seed

Validate the results of your regression analysis by conducting a 75/25% cross-validation, using 998794 as the random number seed.

To set the random number seed, select the Random Number Seed command from the Transform menu.

Slide 63

First, click on the Set seed to option button to activate the text box.

Second, type in the random seed stated in the problem.

Third, click on the OK button to complete the dialog box. Note that SPSS does not provide you with any feedback about the change.

Slide 64

Validation analysis: compute the split variable

To compute the variable that will split the sample in two parts, click on the Compute command.

Slide 65

First, type the name for the new variable, split, into the Target Variable text box.

Second, the formula for the value of split is shown in the text box. The uniform(1) function generates a random decimal number between 0 and 1. The random number is compared to the value 0.75. If the random number is less than or equal to 0.75, the value of the formula will be 1, the SPSS numeric equivalent to true. If the random number is larger than 0.75, the formula will return a 0, the SPSS numeric equivalent to false.

Third, click on the OK button to complete the dialog box.
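The uniform(1) <= .75 device maps directly to NumPy. A sketch (note: NumPy's generator differs from SPSS's, so the same seed will not reproduce SPSS's case assignment; the seed below is the one from the problem, used only for repeatability):

```python
import numpy as np

def make_split(n_cases, fraction=0.75, seed=998794):
    """0/1 indicator per case: 1 marks the training sample (about `fraction` of cases)."""
    rng = np.random.default_rng(seed)   # plays the role of SPSS's random number seed
    return (rng.uniform(size=n_cases) <= fraction).astype(int)
```

Selecting the cases where the indicator equals 1 gives the 75% training sample; the remaining cases form the 25% validation sample.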

Slide 66

In the data editor, the split variable shows a random pattern of zeros and ones.

To select the cases for the training sample, we select the cases where split = 1.

Slide 67

To run the regression for the validation training sample, select the Linear Regression command from the menu that drops down when you click on the Dialog Recall button.

Slide 68

First, scroll down the list of variables and highlight the variable split.

Second, click on the right arrow button to move the split variable to the Selection Variable text box.

Slide 69

When split is moved to the Selection Variable text box, SPSS adds "=?" after the name to prompt us to enter a specific value for split.

Click on the Rule button to enter a value for split.

Slide 70

First, type the value for the training sample, 1, into the Value text box.

Second, click on the Continue button to complete the value entry.

Slide 71

Click on the OK button to request the output.

When the dialog box is closed, SPSS adds the value we entered after the equal sign. This specification now tells SPSS to include in the analysis only those cases that have a value of 1 for the split variable.

Slide 72

Validation analysis - 1

First, we verify that the overall relationship for the regression model for the 75% training sample replicates the pattern of statistical significance found for the full data set.

The overall relationship between the set of independent variables and the dependent variable was statistically significant in the training sample, F(3, 103) = 11.569, p<0.001, as was the overall relationship in the analysis of the full data set, F(3, 132) = 17.235, p<0.001.

Slide 73

Validation analysis - 2

The validation of a hierarchical regression model also requires that the change in R² demonstrate statistical significance in the analysis of the 75% training sample.

The training sample satisfied this requirement (F change(1, 103) = 34.319, p<0.001).

Slide 74

Validation analysis - 3

The pattern of significance for the individual relationships between the dependent variable and the predictor variable was the same for the analysis using the full data set and the 75% training sample.

The relationship between highest academic degree and spouse's highest academic degree was statistically significant in both the analysis using the full data set (t=7.188, p<0.001) and the analysis using the 75% training sample (t=5.484, p<0.001). The pattern of statistical significance of the independent variables for the analysis using the 75% training sample matched the pattern identified in the analysis of the full data set.

Slide 75

Validation analysis - 4

The proportion of variance explained by the model using the training sample was 25.2% (R = .502), compared to 40.6% (R = .637) for the validation sample. The value of R² for the validation sample was actually larger than the value of R² for the training sample, implying a better fit than obtained for the training sample. This supports a conclusion that the regression model would be effective in predicting scores for cases other than those included in the sample.

The validation analysis supported the generalizability of the findings of the analysis to the population represented by the sample in the data set.

The answer to the question is true.

Slide 76

Complete hierarchical multiple regression analysis

The following flow charts depict the process for solving the complete regression problem and determining the answer to each of the questions encountered in the complete analysis.

Text in italics (e.g. True, False, True with caution, Incorrect application of a statistic) represents the answers to each specific question.

Many of the steps in hierarchical regression analysis are identical to the steps in standard regression analysis. Steps that are different are identified with a magenta background, with the specifics of the difference underlined.

Slide 77

Complete hierarchical multiple regression analysis: level of measurement

Question: Do the variables included in the analysis satisfy the level of measurement requirements?

Is the dependent variable metric and the independent variables metric or dichotomous? (Examine all independent variables, controls as well as predictors.)
No: Incorrect application of a statistic
Yes: continue

Ordinal variables included in the relationship?
No: True
Yes: True with caution

Slide 78

Complete hierarchical multiple regression analysis: sample size

Question: Do the number of variables and cases satisfy the sample size requirements?

Compute the baseline regression in SPSS.

Ratio of cases to independent variables at least 5 to 1? (Include control variables, as well as predictors, in the count of independent variables.)
No: Inappropriate application of a statistic
Yes: continue

Ratio of cases to independent variables satisfies the preferred sample size of at least 15 to 1?
Yes: True
No: True with caution
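The two ratio checks above can be expressed as a short Python sketch; the case and variable counts used below are hypothetical:

```python
def sample_size_check(n_cases, n_independent_vars):
    """Check the minimum 5-to-1 and preferred 15-to-1 ratios of cases
    to independent variables (count controls as well as predictors)."""
    ratio = n_cases / n_independent_vars
    return {
        "ratio": ratio,
        "minimum_met": ratio >= 5,     # below this: inappropriate application
        "preferred_met": ratio >= 15,  # below this: true with caution
    }

# e.g. 138 cases and 4 independent variables (hypothetical numbers)
result = sample_size_check(138, 4)
```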

Slide 79

Complete hierarchical multiple regression analysis: assumption of normality

Question: Does each metric variable satisfy the assumption of normality?

Test the dependent variable and both control and predictor independent variables.

The variable satisfies the criteria for a normal distribution?
Yes: True
No: False; try transformations

Log, square root, or inverse transformation satisfies normality? (If more than one transformation satisfies normality, use the one with the smallest skew.)
Yes: Use the transformation in the revised model; no caution needed
No: Use the untransformed variable in the analysis and add a caution to the interpretation for violation of normality
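The "smallest skew" rule in the flow chart can be sketched in Python. This is an illustration with hypothetical data; it assumes |skewness| <= 1.0 as the normality rule of thumb, which may differ from the exact criterion used in the course materials:

```python
import numpy as np
from scipy.stats import skew

def pick_transformation(x):
    """Try the log, square root, and inverse transformations for a
    positively skewed variable and report the skewness of each.
    Per the flow chart, the transformation with the smallest
    absolute skew is preferred."""
    x = np.asarray(x, dtype=float)
    candidates = {
        "none": x,
        "log": np.log(x),
        "sqrt": np.sqrt(x),
        "inverse": 1.0 / x,
    }
    skews = {name: abs(skew(values)) for name, values in candidates.items()}
    best = min(skews, key=skews.get)
    return best, skews

# Hypothetical positively skewed variable (all values > 0)
data = [1, 1, 2, 2, 2, 3, 3, 4, 5, 9, 15, 30]
best, skews = pick_transformation(data)
```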

Complete hierarchical multiple regression analysis: assumption of linearity
Slide 80

Question: Does the relationship between the dependent variable and each metric independent variable satisfy the assumption of linearity?

Test both control and predictor independent variables. If the dependent variable was transformed for normality, use the transformed dependent variable in the test for linearity. If an independent variable was transformed to satisfy normality, skip the check for linearity.

Probability of the Pearson correlation (r) <= level of significance?
Yes: True
No: test transformations of the independent variable

Probability of the correlation (r) for the relationship with any transformation of the IV <= level of significance? (If more than one transformation satisfies linearity, use the one with the largest r.)
Yes: Use the transformation in the revised model; True
No: Weak relationship; no caution needed
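The flow-chart logic above can be sketched in Python with hypothetical data; the transformation set (log, square root, inverse) is an assumption, and this is an illustration rather than SPSS output:

```python
import numpy as np
from scipy.stats import pearsonr

def linearity_check(x, y, alpha=0.05):
    """Test the Pearson r for the untransformed IV first; if it is not
    significant, test transformations of the IV and keep the one with
    the largest absolute r that reaches significance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r, p = pearsonr(x, y)
    if p <= alpha:
        return "none", r          # linear as-is
    best_name, best_r = None, 0.0
    for name, xt in {"log": np.log(x), "sqrt": np.sqrt(x),
                     "inverse": 1.0 / x}.items():
        r_t, p_t = pearsonr(xt, y)
        if p_t <= alpha and abs(r_t) > abs(best_r):
            best_name, best_r = name, r_t
    return best_name, best_r      # (None, 0.0) -> weak relationship

# Hypothetical near-linear data
x = np.arange(1.0, 11.0)
y = 2.0 * x + np.array([0.1, -0.2, 0.05, 0.0, 0.1,
                        -0.1, 0.2, -0.05, 0.0, 0.1])
choice, r = linearity_check(x, y)
```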

Slide 81

Complete hierarchical multiple regression analysis: assumption of homogeneity of variance

Question: Is the variance in the dependent variable uniform across the categories of a dichotomous independent variable?

Test both control and predictor independent variables. If the dependent variable was transformed for normality, substitute the transformed dependent variable in the test for the assumption of homogeneity of variance.

Probability of the Levene statistic <= level of significance?
No: True
Yes: False; retain the dependent variable and add a caution to the interpretation for violation of homoscedasticity
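The Levene check can be sketched in Python with hypothetical group scores. Here the second group is a shifted copy of the first, so the spreads are equal and the test should not be significant:

```python
import numpy as np
from scipy.stats import levene

# Hypothetical DV scores for the two categories of a dichotomous IV
group_a = np.array([10., 12., 11., 13., 12., 11., 10., 13.])
group_b = group_a + 4.0  # same spread, different mean

stat, p = levene(group_a, group_b)  # default center='median'
homogeneous = p > 0.05  # fail to reject -> assumption satisfied
```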

Slide 82

Complete hierarchical multiple regression analysis: detecting outliers

Question: After incorporating any transformations, no outliers were detected in the regression analysis?

If any variables were transformed for normality or linearity, substitute the transformed variables in the regression for the detection of outliers.

Standardized residual for any case greater than +/-3.00?
Yes: False; remove the outliers and run the revised regression again
No: True
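The +/-3.00 screen can be sketched in Python. The residuals below are hypothetical; SPSS computes its standardized residuals from the regression itself, so this is only an illustration of the cutoff:

```python
import numpy as np

def flag_outliers(residuals, threshold=3.0):
    """Flag cases whose standardized residual exceeds +/-3.00."""
    e = np.asarray(residuals, dtype=float)
    z = (e - e.mean()) / e.std(ddof=1)
    return np.where(np.abs(z) > threshold)[0]

# Hypothetical residuals: 30 well-behaved cases plus one extreme case
res = np.concatenate([np.tile([0.5, -0.5], 15), [10.0]])
outliers = flag_outliers(res)  # indices of cases to remove
```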

Slide 83

Complete hierarchical multiple regression analysis: picking regression model for interpretation

Question: Is the interpretation based on the model that includes transformation of variables and removes outliers?

R for the revised regression greater than R for the baseline regression by 2% or more?
Yes: Pick the revised regression, with transformations and omitting outliers, for interpretation; True
No: Pick the baseline regression, with untransformed variables and all cases, for interpretation; False

Slide 84

Complete hierarchical multiple regression analysis: assumption of independence of errors

Question: Serial correlation of errors is not a problem in this regression analysis?

Residuals are independent (Durbin-Watson between 1.5 and 2.5)?
Yes: True
No: False (note: add a caution for violation of the assumption of independence of errors)
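The Durbin-Watson statistic has a simple closed form, so the check can be sketched directly; the residuals below are hypothetical:

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Hypothetical residuals with no strong serial pattern
e = [1., -1., 1., 1., -1., -1., 1., -1., -1., 1.]
dw = durbin_watson(e)
independent = 1.5 <= dw <= 2.5  # the rule of thumb used on the slide
```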

Slide 85

Complete hierarchical multiple regression analysis: multicollinearity

Question: Multicollinearity is not a problem in this regression analysis?

Tolerance for all independent variables greater than 0.10, indicating no multicollinearity?
Yes: True
No: False; halt the analysis until the problem is diagnosed
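Tolerance is 1 minus the R-squared from regressing each IV on all the other IVs, which can be computed directly. This is a sketch on hypothetical data, not a replica of the SPSS collinearity diagnostics:

```python
import numpy as np

def tolerances(X):
    """Tolerance for each IV: 1 - R^2 from regressing that IV on all
    the other IVs. Tolerance of 0.10 or less signals multicollinearity."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        ss_res = np.sum((y - A @ beta) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        out.append(ss_res / ss_tot)  # equals 1 - R^2
    return np.array(out)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))  # unrelated columns -> tolerances near 1
tol = tolerances(X)
no_multicollinearity = bool(np.all(tol > 0.10))
```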

Slide 86

Complete hierarchical multiple regression analysis: overall relationship

Question: Finding about the overall relationship between the dependent variable and the independent variables.

Probability of the F test of the R² change less than or equal to the level of significance?
No: False
Yes: continue

Increase in R² for the predictor variables interpreted correctly?
No: False
Yes: continue

Small sample, ordinal variables, or violation of an assumption in the relationship?
No: True
Yes: True with caution
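The F test of the R² change for the block of added predictors follows a standard formula, which can be sketched in Python; the R² values, case count, and variable counts below are hypothetical:

```python
from scipy.stats import f as f_dist

def f_change_test(r2_before, r2_after, n_cases, k_total, m_added):
    """F test of the R-squared change when a block of m_added predictors
    joins a model that then has k_total IVs.
    Degrees of freedom: (m_added, n_cases - k_total - 1)."""
    df1 = m_added
    df2 = n_cases - k_total - 1
    f = ((r2_after - r2_before) / df1) / ((1.0 - r2_after) / df2)
    p = f_dist.sf(f, df1, df2)
    return f, (df1, df2), p

# Hypothetical values: one added predictor raises R^2 from .20 to .35,
# with 105 cases and 3 IVs in the full model
f_val, df, p = f_change_test(0.20, 0.35, 105, 3, 1)
```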

Slide 87

Complete hierarchical multiple regression analysis: individual relationships

Question: Finding about the individual relationship between an independent variable and the dependent variable.

Probability of the t test between the predictors and the DV <= level of significance?
No: False
Yes: continue

Direction of the relationship between the predictors and the DV interpreted correctly?
No: False
Yes: continue

Small sample, ordinal variables, or violation of an assumption in the relationship?
No: True
Yes: True with caution

Slide 88

Complete hierarchical multiple regression analysis: individual relationships

Question: Finding about independent variable with largest

impact on dependent variable.

have the largest beta

coefficient (ignoring sign)

among predictors?

No

False

Yes

Small sample, ordinal

variables, or violation of

assumption in the

relationship?

No

True

Yes

Slide 89

Complete hierarchical multiple regression analysis: validation analysis - 1

Question: The validation analysis supports the generalizability of the findings?

Set the random seed and randomly split the sample into a 75% training sample and a 25% validation sample.

Probability of F for the overall relationship for the training sample <= level of significance?
No: False
Yes: continue

Probability of F for the R² change for the training sample <= level of significance?
No: False
Yes: continue
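The random 75/25 split can be sketched in Python. The case count and seed value below are hypothetical; in the course this step is done with an SPSS split variable:

```python
import numpy as np

def split_sample(n_cases, seed, train_fraction=0.75):
    """Set the random seed and split case indices into a training sample
    and a validation sample, mirroring the SPSS split-variable step."""
    rng = np.random.default_rng(seed)
    in_training = rng.random(n_cases) < train_fraction
    return np.where(in_training)[0], np.where(~in_training)[0]

# Hypothetical data set of 200 cases; the seed value is arbitrary
train_idx, valid_idx = split_sample(200, seed=20001026)
```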

Slide 90

Complete hierarchical multiple regression analysis: validation analysis - 2

Pattern of significance for the predictor variables in the training sample matches the pattern for the full data set?
No: False
Yes: continue

Shrinkage in R (R for training sample - R for validation sample) < 2%?
Yes: True
No: False
