
Statistics Spring 2008

Lab #4 Regression

Defined: A model for predicting one variable from other variable(s).
Variables: IVs are continuous; DV is continuous.
Relationship: Relationship amongst variables.
Example: Can we predict height from weight (or weight from height, or weight from multiple variables, etc.)?
Assumptions: Normality. Linearity. Multicollinearity.

1. Graphing - Scatterplot

The first step of any statistical analysis is to plot the data graphically. In terms of regression, if you are only conducting bivariate regression, then the scatterplot will be the same as for correlation. Thus, see "Lab 3 Correlation" for how to conduct a scatterplot. If you have three variables, you can conduct a 3D scatterplot. The instructions below are for a 3D scatterplot. If you have more than three variables, you can't conduct a scatterplot because it is impossible to see a scatterplot in 4D or 5D or 6D and so forth.

How do I graph a scatterplot?
1. Select Graphs --> Legacy Dialogs --> Scatter
2. Click "3D Scatter", and "Define"
3. Move appropriate variables into the "Y axis" and "X axis" and "Z axis"
4. Click OK. Output below is for "commit1" and "commit2" and "commit3".

2. Assumptions: Normality, Linearity, Multicollinearity

For "Normality" and "Linearity", see "Lab 3 Correlation". For Multicollinearity, see below.

KEEP IN MIND THAT MULTICOLLINEARITY ONLY APPLIES TO MULTIPLE REGRESSION.

What is Multicollinearity? When predictor variables are highly correlated with each other, you have Multicollinearity.

Why is Multicollinearity a problem? When variables are highly correlated in a multiple regression analysis, it is difficult to identify the unique contribution of each variable in predicting the dependent variable, because the highly correlated variables are predicting the same variance in the dependent variable. In this situation, the "overall" p-value may be significant but the p-value for each predictor may not be significant.

What is "highly" correlated? Some statisticians say correlations above .70 indicate Multicollinearity, and others use a higher cutoff such as .80 or .90.

What do you do when you have Multicollinearity?
a. Option 1: Leave as is, and conduct multiple regression analysis anyway. Multicollinearity only affects the results for the unique effect of each predictor, so if you are only interested in the "overall" effect of the combined predictors, then Multicollinearity is not an issue.
b. Option 2: Remove one of the variables from the analysis.
c. Option 3: Create a new "composite" of the highly correlated variables.

How to identify Multicollinearity? You have two approaches.
Approach #1: Run a correlational analysis and look at the correlations.
Approach #2: One of the options in SPSS is to calculate multicollinearity when you conduct multiple regression analysis, by clicking "Statistics" and clicking "collinearity diagnostics." The output is shown below for the multiple regression analysis that will be conducted later in this document. Multicollinearity exists when Tolerance is below .1 and VIF is greater than 10, or when the average VIF is much greater than 1. In this case, there is no multicollinearity.
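The two approaches above can also be sketched outside SPSS. The following is a minimal Python illustration, not the handout's procedure: the scores are made up, the function names are hypothetical, and the shortcut tolerance = 1 - r^2 holds only when the model has exactly two predictors (with more predictors, each tolerance comes from regressing that predictor on all the others).

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def collinearity_two_predictors(x1, x2):
    """Correlation, tolerance, and VIF for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2   # share of one predictor not explained by the other
    vif = 1 / tolerance      # variance inflation factor is the reciprocal of tolerance
    return r, tolerance, vif

# Made-up scores, for illustration only (not the lab dataset)
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

r, tolerance, vif = collinearity_two_predictors(x1, x2)
print(f"r = {r:.2f}, tolerance = {tolerance:.2f}, VIF = {vif:.2f}")
```

Because VIF is simply 1 / Tolerance, the cutoffs "Tolerance below .1" and "VIF above 10" describe the same condition.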

3. Bivariate Regression

Bivariate regression produces the same result as bivariate correlation. For example, in our dataset we have a variable called "threshold1" which asks: In order to convict a person for a crime, jurors should feel that it is at least _____% likely that the defendant is guilty of the crime. Is "age" related to this question? A correlational analysis produces the output below, r = .106, p = .06.

Now, let's answer the same question using regression.
1. Select Analyze --> Regression --> Linear
2. Move "threshold1" into the DV box; move "age" into the IV box.
3. Click OK. The output below is four boxes.
a. Variables Entered/Removed tells you the variables in the analysis and how they were entered into the analysis. This box will be helpful later when we do multiple regression. Since we are only looking at bivariate regression, this box will always give the IV name and say "enter" for Method.
b. Model Summary gives you R Square, which is the variance explained by the IV, R2 = .011.
c. ANOVA tells you whether the overall model is significant, p = .06.
d. Coefficients tells you the effect size, beta = .106.
e. Notice how this result is the same as for the correlational analysis above.
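The equivalence noted in step e can be checked numerically. This is a small sketch with invented ages and threshold scores (not the lab dataset) and a hypothetical function name; it shows that the standardized slope (beta) equals Pearson's r and that R Square equals r squared when there is one predictor.

```python
import math

def bivariate_regression(x, y):
    """Least-squares fit of y on x; returns slope, intercept, r, beta, R squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx                       # unstandardized coefficient (B)
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)          # Pearson correlation
    beta = slope * math.sqrt(sxx / syy)     # standardized coefficient (beta)

    # R squared computed from the residuals, independently of r
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - sse / syy
    return slope, intercept, r, beta, r_squared

# Invented data, for illustration only
age = [20, 30, 40, 50, 60]
threshold = [70, 90, 80, 95, 85]

slope, intercept, r, beta, r_squared = bivariate_regression(age, threshold)
print(f"r = {r:.3f}, beta = {beta:.3f}, R squared = {r_squared:.3f}")
```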

WRITE-UP:
a. There was a positive linear relationship between the predictor and outcome variable, β = .11, p = .06. The variance explained by the predictor was 1.1%.
b. Using regression analysis to predict the percentage of guilt jurors feel is necessary to convict a defendant, age was a positive predictor, β = .11, p = .06, that explained 1.1% of the variance.

EVALUATION
a. You evaluate bivariate regression similarly to correlation, such as looking at the direction of the relationship, the size of the relationship, and the p-value of the relationship. See "Lab 3 Correlation" for more information about interpreting effect size.

4. Multiple Regression

Here are the basic steps involved in multiple regression:
a. First conduct correlational analysis with all potential variables to find variables to enter into the analysis that are correlated with the DV, but not overly correlated with the other IVs (i.e., multicollinearity).
b. Instead of, or in addition to, the correlational analysis, some people will enter all potential variables simultaneously into multiple regression analysis to see which variables produce a unique effect upon the DV, and then conduct another multiple regression analysis with only those variables that produce a unique effect upon the DV.
c. If you have a hypothesis, you conduct multiple regression to test that hypothesis. This is called "confirmatory" analysis because you are determining whether or not your hypothesis is confirmed.
d. After testing your hypothesis, you can also do "exploratory" analysis to look at different permutations of the variables. It's called "exploratory" analysis because you are exploring the data beyond your initial hypothesis.

For our example, I want to look at the predictors of "threshold1". I am going to use three predictors: age, sex, and "commit1".
a. "age" is a predictor because I want to see if the older the age, the higher the probability of guilt people believe is necessary to convict a person.
b. "sex" is a predictor because I want to show you that you can enter "categorical" variables into the analysis. However, keep in mind that the categorical variables need to be dichotomous. If you have a categorical variable with more than 2 categories, you need to create "dummy codings" which reduce the categorical variable into a series of dichotomous variables. I explain later in this document how to "dummy code", but for right now I want to include "sex" as a predictor to show you how you can enter both continuous and dichotomous variables into the same analysis.
c. "commit1" is a predictor because it is theoretically interesting to see how "commit1" is related to "threshold1".
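To make the idea of "dummy codings" concrete, here is a minimal Python sketch. The variable `marital` and its categories are invented for illustration, and the helper `dummy_code` is hypothetical. It assumes the common convention of leaving out one reference category, so a variable with k categories becomes k - 1 dichotomous (0/1) variables.

```python
def dummy_code(values, reference):
    """Turn one categorical variable into 0/1 indicator columns.

    The reference category gets no column of its own: a case in the
    reference category scores 0 on every indicator.
    """
    categories = sorted(set(values) - {reference})
    return {
        f"is_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }

# Invented three-category variable, for illustration only
marital = ["single", "married", "divorced", "married", "single"]

dummies = dummy_code(marital, reference="single")
for name, column in dummies.items():
    print(name, column)
```

Each resulting column is dichotomous, so it can be entered as a predictor alongside continuous variables such as age.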
Multiple Regression
1. Select Analyze --> Regression --> Linear
2. Move "threshold1" into the DV box, and move the three predictors into the IV box.
3. Click "Statistics" and "collinearity diagnostics".
4. Click OK. The output is described below.
a. Variables Entered/Removed tells you the variables in the analysis and how they were entered into the analysis. We entered all three predictors simultaneously, so the method is "enter".
b. Model Summary gives you R Square, which is the variance explained by the IVs, R2 = .061. In other words, all three predictors together account for 6.1% of the variance in the DV. The "Adjusted R Square" corrects for the number of variables in the analysis. Each predictor explains some variance due to chance, so the more variables in the analysis, the higher the "R Square" due to chance. When you have many variables in the analysis you may want to look at Adjusted R Square instead of R Square.
c. ANOVA tells you whether the overall model is significant, p = .000. Also, if the overall model is significant, then at least 1 or more of the individual variables will most likely have a significant relationship to the DV.
d. Coefficients tells you the UNIQUE effect size for each variable. In this case, all three variables uniquely predict the DV.
Age, beta = .111, p = .047
Sex, beta = .172, p = .002
Commit1, beta = .133, p = .017
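For readers who want to see where the standardized betas, R Square, and Adjusted R Square come from, the sketch below computes them from correlations in plain Python. It is an illustration under stated assumptions, not SPSS's implementation: the eight cases are invented (they are not the lab dataset), the function names are hypothetical, and the betas are obtained by solving the system (predictor correlation matrix) x (betas) = (predictor-DV correlations).

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x

def standardized_regression(predictors, y):
    """Return (standardized betas, R squared, adjusted R squared)."""
    k, n = len(predictors), len(y)
    rxx = [[pearson_r(p, q) for q in predictors] for p in predictors]
    rxy = [pearson_r(p, y) for p in predictors]
    betas = solve(rxx, rxy)
    r2 = sum(b * r for b, r in zip(betas, rxy))
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra predictors
    return betas, r2, adj_r2

# Invented data, for illustration only
age = [22, 35, 47, 51, 29, 63, 40, 58]
sex = [0, 1, 0, 1, 1, 0, 1, 0]  # dichotomous predictor
commit1 = [60, 75, 80, 90, 70, 85, 65, 95]
threshold1 = [70, 85, 80, 95, 80, 90, 75, 99]

betas, r2, adj_r2 = standardized_regression([age, sex, commit1], threshold1)
for name, beta in zip(["age", "sex", "commit1"], betas):
    print(f"{name}: beta = {beta:.3f}")
print(f"R squared = {r2:.3f}, adjusted R squared = {adj_r2:.3f}")
```

Adjusted R Square is always below R Square whenever R Square is less than 1, and the gap widens as predictors are added relative to the number of cases.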

WRITE-UP:
a. Multiple regression analysis was conducted to predict the percentage of guilt jurors feel is necessary to convict a defendant. Three predictors were entered simultaneously into the analysis: age, gender, and a question asking what percent of defendants brought to trial did in fact commit the crime. The overall variance explained by the three predictors was 6.1%. Each predictor was positively related to the outcome variable: age (β = .11, p = .047), gender (β = .17, p = .002), and percent brought to trial that are in fact guilty (β = .13, p = .017).

EVALUATION
a. You evaluate multiple regression by first looking at the overall model and variance explained.
b. You then evaluate each predictor separately. You evaluate the effect size and p-value just as you would for correlation and bivariate regression, except that with multiple regression the outcome for each predictor is the UNIQUE effect while controlling for the other variables.
