
Chapter 7: Regression

Population Regression Model: 𝑦 = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝜀


y: response (dependent) variable
x: explanatory (independent) variable
β0: intercept
β1, β2, …: slopes
ε: random error term

Simple Linear Regression: 𝑦̂ = 𝑏0 + 𝑏1𝑥, where 𝑦̂ (y-hat) is the predicted value, b0 is the point estimate of β0, and b1 is the point estimate of β1
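The point estimates b0 and b1 can be computed directly with the least-squares formulas. A minimal sketch in Python; the x and y values are hypothetical, not from these notes:

```python
import numpy as np

# Hypothetical sample data (not from the notes)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8], dtype=float)

# Least-squares point estimates:
# slope b1 = Sxy / Sxx, intercept b0 = y-bar - b1 * x-bar
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x  # predicted values (y-hat)
```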

We must follow these steps to ensure a viable SLR:


1. Create a scatterplot and infer the slope: the relationship can be positive, negative, or absent. If there is no relationship, STOP! We picked the wrong x!
2. Correlation Matrix: a measure of the strength of the relationship between x and y.
−1 ≤ 𝑟 ≤ +1; we want |r| ≥ 70% (i.e., 0.70).
If the scatterplot is positive, r will be positive. If negative, r will be negative.
3. Finding the Determination Coefficient 𝑅2: the percentage of the variability in y being explained by x.
0 ≤ 𝑅2 ≤ 1, and we want 𝑅2 ≥ 70%.
***Will use Multiple R for multiple regression!
4. Standard Error: the SE of the estimate acts as a “standard deviation.”
The standard error measures the variability of the actual y values about the regression line. SE should have at least one fewer digit than the mean of the y’s (y-bar).
Ex: y-bar = 20 and SE = 2, therefore OK!
5. Significance F: tests the validity of the regression model.
Ho: the regression model with all x’s is not a valid model (don’t use it).
Reject Ho if the p-value is less than alpha.
Ex: 0.01898 < 0.05 would reject Ho.
6. P-value of each independent variable (individual x’s): use the P-value column.
Intercept: (not the focus)
x: Focus: must be less than alpha.
Ho: 𝛽1 = 0 (no relationship)
Ha: 𝛽1 ≠ 0 (relationship between x and y)
7. Set up a Confidence Interval:

CI: 𝑃𝑟𝑒𝑑𝑖𝑐𝑡𝑒𝑑 𝑉𝑎𝑙𝑢𝑒 ± 2 ∗ 𝑆𝐸
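The numeric checks in steps 2–4 and 7 (r, R², the standard error of the estimate, and the rough ±2·SE interval) can be sketched as follows; the data are the same hypothetical toy values, not from these notes:

```python
import numpy as np

# Hypothetical toy data (not from the notes)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8], dtype=float)
n = len(x)

# Fitted least-squares line
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
residuals = y - (b0 + b1 * x)

# Step 2: correlation coefficient r (want |r| >= 0.70)
r = np.corrcoef(x, y)[0, 1]

# Step 3: coefficient of determination R^2 (want >= 0.70)
r_squared = r ** 2

# Step 4: standard error of the estimate (n - 2 df in simple regression)
se = np.sqrt(np.sum(residuals ** 2) / (n - 2))

# Step 7: rough confidence interval around a prediction at x = 3
y_pred = b0 + b1 * 3
ci = (y_pred - 2 * se, y_pred + 2 * se)
```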

Differences in Multiple Regression:


1. Examine Adjusted R squared instead of R squared
2. Special Cases:
 Multicollinearity: high correlation between independent variables.
 Heteroskedasticity: the variance of the errors (residuals) is not constant (e.g., it increases).
 Autocorrelation: successive values of the dependent variable are correlated.
The regression model does not provide accurate predictions if we have these special cases.
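A minimal screen for the first special case, multicollinearity, checks the correlation between pairs of independent variables. The predictors x1 and x2 below are hypothetical (x2 is deliberately close to a multiple of x1, so the pair is collinear):

```python
import numpy as np

# Hypothetical predictors (not from the notes); x2 is roughly 2 * x1
x1 = np.array([1, 2, 3, 4, 5], dtype=float)
x2 = np.array([2.1, 4.0, 6.1, 7.9, 10.2], dtype=float)

# Correlation between the two independent variables
r12 = np.corrcoef(x1, x2)[0, 1]

# Rule-of-thumb flag: |r| above ~0.70 among predictors suggests
# multicollinearity, making the fitted coefficients unreliable
collinear = abs(r12) > 0.70
```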
