Bayes' Theorem
$$P(B_i \mid A) = \frac{P(A \mid B_i)\,P(B_i)}{P(A \mid B_1)\,P(B_1) + P(A \mid B_2)\,P(B_2) + \dots + P(A \mid B_k)\,P(B_k)}$$
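The formula can be checked by hand with a few lines of Python. This is a minimal sketch; the function name and the two-event numbers are illustrative assumptions, not part of the course material.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(B_i | A) for every event B_i.

    priors[i]      = P(B_i); the B_i are mutually exclusive and exhaustive
    likelihoods[i] = P(A | B_i)
    """
    # Numerator terms P(A | B_i) P(B_i)
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    # Denominator of Bayes' theorem, which equals P(A)
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical inputs: P(B1) = 0.4, P(B2) = 0.6, P(A|B1) = 0.8, P(A|B2) = 0.3
posteriors = bayes_posterior(priors=[0.4, 0.6], likelihoods=[0.8, 0.3])
print(posteriors)
```

The posteriors always sum to 1, which is a quick sanity check on the inputs.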
Poisson distribution:
PHStat | Probability & Prob. Distributions | Poisson, then check Cumulative Probabilities
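For checking the PHStat output outside Excel, the cumulative Poisson probability can be sketched as below. The function names are assumptions made for illustration; `lam` is the expected number of events.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(x, lam):
    """Cumulative probability P(X <= x): sum of P(X = 0) ... P(X = x)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


# Hypothetical example: mean of 3 events, probability of 2 or fewer
print(poisson_cdf(2, 3))
```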
Chapter 6 The Normal Distribution and Other Continuous Distributions
Normal Distribution
PHStat | Probability & Prob. Distributions | Normal then check the desired calculation
To check the normality assumption, construct a stem-and-leaf display, box-and-whisker plot, histogram, or
normal probability plot: PHStat | Probability & Prob. Distributions | Normal Probability Plot
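The Normal calculation itself can be reproduced with Python's standard library. This is a sketch only; the function name, dictionary keys and the example numbers are assumptions.

```python
from statistics import NormalDist


def normal_probabilities(x, mean, std_dev):
    """Return the Z value and the lower/upper tail probabilities for X = x."""
    dist = NormalDist(mu=mean, sigma=std_dev)
    p_le = dist.cdf(x)  # P(X <= x)
    return {"z": (x - mean) / std_dev, "p_le": p_le, "p_gt": 1 - p_le}


# Hypothetical example: mean 100, standard deviation 15, X = 115
result = normal_probabilities(115, 100, 15)
print(result)
```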
Uniform Distribution
$$\mu = \frac{a+b}{2} \qquad \sigma^2 = \frac{(b-a)^2}{12}$$
where a and b are the endpoints of the uniform distribution.
Exponential distribution
PHStat | Probability & Prob. Distributions | Exponential
PHStat only returns the probability for ≤ X. For > X use 1 − probability. For the probability between two
values, find the probability for each value and subtract the smaller from the larger.
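The three cases can be sketched as follows, assuming the usual exponential form P(X ≤ x) = 1 − e^(−λx) with λ the mean number of arrivals per unit. The function names are illustrative.

```python
import math


def exp_at_most(x, lam):
    """P(X <= x): the value the calculator returns."""
    return 1 - math.exp(-lam * x)


def exp_greater(x, lam):
    """P(X > x) = 1 - P(X <= x)."""
    return 1 - exp_at_most(x, lam)


def exp_between(x1, x2, lam):
    """Probability between two values: larger result minus smaller result."""
    return abs(exp_at_most(x2, lam) - exp_at_most(x1, lam))


# Hypothetical example: lambda = 1 arrival per unit of time
print(exp_at_most(1, 1), exp_greater(1, 1), exp_between(1, 2, 1))
```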
Sampling distribution of the mean
Calculate the standard deviation of the sampling distribution, also called the standard error of the
mean, then use the Normal Distribution calculator if the population is normally distributed, or
the sample size is > 30, or the population distribution is symmetrical and the sample size is > 15.
Infinite population: $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}$
Finite population: $\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}}$
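A small sketch of both standard error formulas. The function name and the example values are assumptions; passing `N` switches on the finite population correction.

```python
import math


def standard_error_mean(sigma, n, N=None):
    """Standard error of the mean; N (population size) is optional."""
    se = sigma / math.sqrt(n)
    if N is not None:
        # Finite population correction factor sqrt((N - n) / (N - 1))
        se *= math.sqrt((N - n) / (N - 1))
    return se


# Hypothetical example: sigma = 10, n = 25, population of 101
print(standard_error_mean(10, 25))
print(standard_error_mean(10, 25, N=101))
```

The result is then used as the standard deviation in the Normal Distribution calculator.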
Sampling distribution of the proportion:
Calculate the standard deviation of the sampling distribution (Standard Error of the Proportion), then,
if np > 5 and n(1-p) > 5, use the Normal Distribution calculator: PHStat | Probability & Prob.
Distributions | Normal
$$p_s = \frac{X}{n} = \frac{\text{number of successes}}{\text{sample size}}$$
ps = sample proportion, p = population proportion
Infinite population: $\sigma_{p_s} = \sqrt{\dfrac{p(1-p)}{n}}$
Finite population: $\sigma_{p_s} = \sqrt{\dfrac{p(1-p)}{n}} \sqrt{\dfrac{N-n}{N-1}}$
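The same calculation for proportions, including the np > 5 and n(1-p) > 5 check, might look like this. Function names and example values are illustrative assumptions.

```python
import math


def standard_error_proportion(p, n, N=None):
    """Standard error of the sample proportion; N (population size) is optional."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        # Finite population correction factor
        se *= math.sqrt((N - n) / (N - 1))
    return se


def normal_approximation_ok(p, n):
    """True when both np > 5 and n(1 - p) > 5."""
    return n * p > 5 and n * (1 - p) > 5


# Hypothetical example: p = 0.5, n = 100
print(standard_error_proportion(0.5, 100), normal_approximation_ok(0.5, 100))
```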
Chapter 7 Confidence Interval Estimation
Interval estimate of the population mean (µ x) with σ x unknown:
Chapter 8 Fundamentals of Hypothesis Testing: One-Sample Tests
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked).
Parentheses indicate information to be taken from the problem
One Sample Categorical Data
Hypothesis
Two-tail test: Ho: p = value, Ha: p ≠ value
Upper-tail test: Ho: p ≤ value, Ha: p > value
Lower-tail test: Ho: p ≥ value, Ha: p < value
Test Statistic Z
Procedure Summary Data: PHStat | One-Sample Tests | Z Test for the Proportion
Raw Data: No Tests available, calculate p and use PHStat
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
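When only raw counts are available, the Z statistic, p-value and decision rule can be sketched as below. This assumes the standard one-sample Z test for a proportion; the function names, the `tail` argument and the example counts are illustrative, not PHStat's interface.

```python
import math
from statistics import NormalDist


def z_test_proportion(x, n, p0, tail="two"):
    """Return (Z, p-value) for Ho about a proportion p0.

    x = number of successes, n = sample size,
    tail = "two", "upper" or "lower".
    """
    ps = x / n  # sample proportion
    z = (ps - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)
    if tail == "upper":
        p_value = 1 - cdf
    elif tail == "lower":
        p_value = cdf
    else:
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


def decide(p_value, alpha=0.05):
    """Decision rule: reject Ho only when the p-value is less than alpha."""
    return "Reject Ho" if p_value < alpha else "Fail to reject Ho"


# Hypothetical example: 60 successes in 100 trials, Ho: p = 0.5
z, p_value = z_test_proportion(60, 100, 0.5)
print(z, p_value, decide(p_value))
```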
Chapter 9 Two-Sample Tests
Procedure to determine the proper two-sample mean test for numerical data:
1. Are the data paired? If yes, use the Paired Data Model. If no, go to step 2.
2. Run the F test: are the σ²'s equal? If yes, use the σ² Equal Model. If no, use the σ² Unequal Model.
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
Two Sample test of Means with numerical data, σ²'s not proven unequal with the F test
Hypothesis Two-tail test: Ho: µ1 = µ2, Ha: µ1 ≠ µ2
Upper-tail test: Ho: µ1 ≤ µ2, Ha: µ1 > µ2
Lower-tail test: Ho: µ1 ≥ µ2, Ha: µ1 < µ2
Procedure Summary Data: PHStat | Two-Sample Tests | t Test for Differences in Two Means
Raw Data: Data Analysis | t Test: Two Sample Assuming Equal Variances
Test Statistic t
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
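For reference, the pooled-variance t statistic behind this test can be computed by hand as sketched below. The function name and sample data are assumptions; the p-value is left to Excel because the standard library has no t distribution.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical raw data
t_stat, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, df)
```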
Two Sample test of Means with numerical data, σ²'s proven unequal with the F test
Hypothesis Two-tail test: Ho: µ1 = µ2, Ha: µ1 ≠ µ2
Upper-tail test: Ho: µ1 ≤ µ2, Ha: µ1 > µ2
Lower-tail test: Ho: µ1 ≥ µ2, Ha: µ1 < µ2
Procedure Summary Data: Use spreadsheet downloaded from the Homework web page
Raw Data: Data Analysis | t Test: Two Sample Assuming Unequal Variances
Test Statistic t
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
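The separate-variance t statistic can be sketched the same way. This assumes the usual Satterthwaite approximation for the degrees of freedom, which may differ slightly from what a given spreadsheet rounds to; names and data are illustrative.

```python
import math
from statistics import mean, variance


def separate_variance_t(sample1, sample2):
    """Return (t, approximate degrees of freedom) without pooling variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # s1^2 / n1
    v2 = variance(sample2) / n2  # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    # Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical raw data
t_unequal, df_unequal = separate_variance_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_unequal, df_unequal)
```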
Chapter 10 Analysis of Variance (Multi (c) Sample tests with numerical data)
Equality of Variances
Hypothesis Ho: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$ a two tail test
Ha: not all σ²'s are equal
Procedure Raw data: PHStat | Multiple-Sample Tests | Levene’s Test
Test Statistic F
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected–There is not sufficient evidence that (Question asked)
One Factor ANOVA
Hypothesis Ho: µ 1 = µ 2 = µ 3 … = µ c c = the number of populations
Ha: not all µ ’s are equal
Procedure Tools | Data Analysis |Anova: Single Factor
Test Statistic F from the computer printout P-value = The Probability of F
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
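The F statistic on the Anova: Single Factor printout is the ratio of the among-group to the within-group mean square. A minimal sketch, with an assumed function name and made-up data:

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df among groups, df within groups) for a list of samples."""
    c = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of data points
    grand_mean = sum(sum(g) for g in groups) / n
    # Sum of squares among groups (SSA) and within groups (SSW)
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return msa / msw, c - 1, n - c


# Hypothetical data for three groups
f_stat, df_among, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_among, df_within)
```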
Tukey's multiple comparison method: (determines which of the c means are different from each other).
Procedure PHStat | Multiple-Sample Tests | Tukey-Kramer Procedure
Test Statistic Critical Range
Input Q found in the Studentized Range Table where column = c and row = n-c
c = number of groups n = total number of data points in all groups
Decision Rule If the absolute difference between a pair of means is greater than the critical range, the
pair is different.
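A sketch of the critical range comparison, assuming the usual Tukey-Kramer form Q·sqrt((MSW/2)(1/nj + 1/nj')). Q must still be read from the Studentized Range table; the value 4.34 and the means below are placeholders for illustration.

```python
import math


def critical_range(q, msw, n_j, n_j2):
    """Tukey-Kramer critical range for two groups of sizes n_j and n_j2."""
    return q * math.sqrt((msw / 2) * (1 / n_j + 1 / n_j2))


def differing_pairs(means, sizes, q, msw):
    """Return index pairs (i, j) whose means differ by more than the critical range."""
    pairs = []
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            cr = critical_range(q, msw, sizes[i], sizes[j])
            if abs(means[i] - means[j]) > cr:
                pairs.append((i, j))
    return pairs


# Placeholder inputs: Q from the table, MSW from the ANOVA printout
print(differing_pairs(means=[2, 3, 6], sizes=[3, 3, 3], q=4.34, msw=1.0))
```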
Two Factor With Replication
Hypothesis Ho1: µA1 = µA2 = µA3 … = µAr r = the number of levels in Factor A
Ha1: not all µA's are equal
Ho2: µB1 = µB2 = µB3 … = µBc c = the number of levels in Factor B
Ha2: not all µB's are equal
Ho3: No Interaction
Ha3: Interaction
Procedure Tools | Data Analysis |Anova: Two Factor With Replication
Test Statistic F from the computer printout. p-value = The Probability of F
For differences in rows see p-value for the Sample row of the ANOVA
For differences in columns see p-value for the Columns row of the ANOVA
For interaction between factors see p-value for the Interaction row of the ANOVA
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion H1 If rejected – There is sufficient evidence of a difference in (factor A)
H2 If rejected – There is sufficient evidence of a difference in (factor B)
H3 If rejected – There is sufficient evidence of an interaction term
If not rejected – There is not sufficient evidence to make a conclusion about …
Tukey's multiple comparison method for Two Factor ANOVA with replication:
No spreadsheet, hand calculate with the following formulas:
Chapter 11 Chi-Square Tests and Nonparametric Tests
Two Sample test of a Proportion with categorical data (Alternate Procedure)
Hypothesis Ho: p1 = p2 Ha: p1 ≠ p2 (No <, or > Hypothesis)
Procedure PHStat | Two-Sample Tests | Chi-Square Test for the Differences in Two Proportions
Test Statistic χ²
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
Multi (c) Sample test of Proportions with categorical data
Hypothesis Ho: p1 = p2 = p3 … = pc c = the number of samples
Ha: not all p’s are equal
Procedure PHStat | Multiple-Sample Tests | Chi-Square Test
Test Statistic χ²
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
Be sure to check the box for the Marascuilo Procedure to determine which proportions are different.
χ² Test of Independence
Hypothesis Ho: Two categorical variables are independent
Ha: Two categorical variables are related
Procedure PHStat | Multiple-Sample Tests | Chi-Square Test
Test Statistic χ²
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that the variables are related
If not rejected – There is not sufficient evidence that the variables are related.
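All three χ² procedures in this chapter compare observed and expected frequencies in a contingency table. A minimal sketch of the statistic follows; the function name and counts are assumptions, and the p-value is left to PHStat.

```python
def chi_square_statistic(observed):
    """Return (chi-square, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            # Expected frequency = row total * column total / overall total
            f_e = row_totals[i] * col_totals[j] / n
            chi2 += (f_o - f_e) ** 2 / f_e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2 x 2 table of counts
chi2, chi_df = chi_square_statistic([[10, 20], [20, 10]])
print(chi2, chi_df)
```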
Two Sample test of Medians with numerical data
Hypothesis Two-tail test: Ho: M1 = M2, Ha: M1 ≠ M2
Upper-tail test: Ho: M1 ≤ M2, Ha: M1 > M2
Lower-tail test: Ho: M1 ≥ M2, Ha: M1 < M2
Procedure Raw Data PHStat | Two-Sample Tests | Wilcoxon Rank Sum Test
Summary Data No Tests available.
Test Statistic Z
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
Kruskal-Wallis Rank Test for Differences Between c Medians
Hypothesis Ho: M1 = M2 = … = Mc
Ha: Not all Mj are equal (j = 1, 2, …, c)
Procedure Raw Data PHStat | Multiple-Sample Tests | Kruskal-Wallis Rank Test
Summary Data No PHStat or Excel calculation available
Test Statistic H
Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)
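The H statistic can be sketched from the pooled ranks as below. This assumes the basic formula H = [12 / (n(n+1))] Σ Tj²/nj − 3(n+1); tied values get average ranks, but no further tie correction is applied. Names and data are illustrative.

```python
def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of samples."""
    pooled = sorted(x for g in groups for x in g)
    # Assign each distinct value the average of the ranks it occupies
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    # Sum over groups of (rank total squared) / (group size)
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, len(groups) - 1


# Hypothetical data for three groups
h_stat, h_df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_stat, h_df)
```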
Chapter 12 Simple Linear Regression
Hypothesis Ho: β 1 =0
Ha: β 1 ≠ 0
Procedure PHStat | Regression | Simple Linear Regression or
Tools | Data Analysis | Regression
Test Statistic F
Decision Rule If the significant F (a p-value) is less than alpha Reject the Hypothesis
If the significant F is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence to accept the linear regression model
If not rejected – There is not sufficient evidence of a linear model (end the analysis)
The confidence interval estimate of β1 is found on the ANOVA output:
see the independent variable line under Lower 95% and Upper 95%.
For confidence interval estimates of the dependent variable, be sure to check the input box and insert a value.
For the Durbin-Watson statistic for autocorrelation, be sure to check the input box.
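The least squares coefficients and the F statistic reported by the regression tools can be reproduced by hand. This is a sketch with an assumed function name and made-up data; Significance F (the p-value) is not computed here.

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Return (b0, b1, F) for the least squares line Y = b0 + b1 X."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    # Total and error sums of squares; regression SS is their difference
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    f = (sst - sse) / (sse / (n - 2))
    return b0, b1, f


# Hypothetical data
b0, b1, f_reg = simple_linear_regression([1, 2, 3, 4], [2, 4, 5, 9])
print(b0, b1, f_reg)
```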
Chapter 13 Introduction to Multiple Regression
Multiple Regression Model: represented as $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + b_3 X_{3i} + \dots + b_k X_{ki}$
Determining if the multiple linear model is significant
Hypothesis Ho: β 1 = β 2 =…= β k =0 where k equals the number of variables
Ha: Not all β ’s = 0
Procedure PHStat | Regression | Multiple Regression or
Tools | Data Analysis | Regression
Test Statistic F
Decision Rule If the significant F (a p-value) is less than alpha Reject the Hypothesis
If the significant F is greater than or equal to alpha Fail to Reject the Hypothesis
Conclusion If rejected – There is sufficient evidence that all or part of the model is significant,
(proceed with the analysis)
If not rejected – There is not sufficient evidence of a linear model (end the analysis)
Determining which variables are significant
Hypothesis Ho: β 1 =0 Ho: β 2 =0 … Ho: β k =0
Ha: β 1 ≠ 0 Ha: β 2 ≠ 0 … Ha: β k ≠ 0
Coefficient of partial determination (r²Y): the contribution of each variable, holding the others constant
Dummy Variables Model used to include categorical variables.
Prepare a data matrix with Y, X1..Xn and dummy variables with 1 representing the characteristic and 0 its
absence
Y X1 … Xn XD1 … XDn
3.8 3 1
4.2 2 0
. . 0
. . 1
.
Follow the usual multiple regression procedures. Then test for an interaction term between the numerical and
categorical variables. If the interaction term is significant, you cannot use the dummy variable model.
Independent Variables Interactions Model used to check on interaction between numerical variables.
To test for an interaction term, prepare a data matrix with Y, X1..Xn and the product Xa*Xb of every pair of
numerical variables
Y X1 X2 Xa * Xb
3.8 3 11 33
4.2 2 15 30
. . 12
. . 18
Include all possible combinations
Follow the usual multiple regression procedures.
Chapter 14 Multiple Regression Model Building
The Square-Root Transformation model
$\hat{Y}_i = b_0 + b_1 \sqrt{X_{1i}}$
Prepare a data matrix with the dependent variable Y and the square root of the independent variable X
Y     X    √X
3.8   9    3
4.2   4    2
…
Do a simple linear regression with √X as the independent variable.
Model Building
Stepwise Regression – limited evaluation of alternative models
Procedure: PHStat | Regression | Stepwise Regression
1. Fit a model with all the independent variables and check the VIF box.
2. If all VIFs are ≤ 10, proceed to the next step;
otherwise eliminate the variable with the highest VIF and go back to step 1.
3. Sort the results by adjusted r² and select the model with the fewest variables if the r²'s are close. Or
sort the results by Cp, consider the models with Cp ≤ k+1 (k = total number of variables), and pick the best.
Chapter 15 Time-Series Forecasting and Index Numbers
Time-Series models use the same least squares technique as regression models. Only the data is different.
The Linear model
$\hat{Y}_i = b_0 + b_1 X_i$
b0 = estimated Y intercept
b1 = estimated linear effect on Y
Prepare a data matrix with the dependent variable Y and the independent variable X
Y X
3.8 1
4.2 2
…
Do a simple linear regression with X as the independent variable.
Forecast by plugging the next X value into the linear equation
The Exponential model
$\hat{Y}_i = b_0 b_1^{X_i}$
b0 = estimated Y intercept
b1 = is the compound growth factor where (b1 -1)*100% is the compound growth rate
Prepare a data matrix with the independent variable X and the common log of the dependent variable Y
Y log Y X
3.8 .5798 1
4.2 .6232 2
…
The data for the independent variable is often time series data where X is the year or month.
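Do a linear regression with X as the independent variable and the log of Y as the dependent variable.
The regression returns log b0 and log b1; take their antilogs to obtain b0 and b1, then forecast with
$\hat{Y}_i = b_0 b_1^{X_i}$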
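The log-transform fit and the antilog step can be sketched as follows, using common (base 10) logs as in the data matrix. Function names and the doubling series are illustrative assumptions.

```python
import math


def fit_exponential_trend(x, y):
    """Regress log10(Y) on X and return (b0, b1) after taking antilogs."""
    log_y = [math.log10(v) for v in y]
    n = len(x)
    x_bar = sum(x) / n
    ly_bar = sum(log_y) / n
    slope = (sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = ly_bar - slope * x_bar
    # intercept = log b0 and slope = log b1, so undo the log
    return 10 ** intercept, 10 ** slope


def forecast_exponential(b0, b1, x_next):
    """Plug the next X value into Y = b0 * b1^X."""
    return b0 * b1 ** x_next


# Hypothetical series that doubles every period
exp_b0, exp_b1 = fit_exponential_trend([1, 2, 3, 4], [2, 4, 8, 16])
growth_rate = (exp_b1 - 1) * 100  # compound growth rate in percent
print(exp_b0, exp_b1, growth_rate, forecast_exponential(exp_b0, exp_b1, 5))
```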
Autoregressive Models
Autoregressive models lag the dependent variable data by one or more periods to provide a weighted moving
average of the previous values of the variable Y.
For first-order autoregressive, do a simple linear regression with Y lag 1 as the independent variable and Y as the
dependent variable. Forecast by plugging the last Y value into the equation. Forecast additional periods into the
future by using the most recently forecast value as the independent variable.
For second-order autoregressive, do a multiple regression with Y lag 1, and Y lag 2 as the independent
variables. Forecast by plugging the last two Y values into the equation. Forecast additional periods into the
future by using the most recently forecast values and previous values of Y as needed for the independent
variables.
For third-order autoregressive, do a multiple regression with Y lag 1,Y lag 2, and Y lag 3 as the independent
variables. Forecast by plugging the last three Y values into the equation. Forecast additional periods into the
future by using the most recently forecast values and previous values of Y as needed for the independent
variables.
Choose the model with the best adjusted r²; where the r²'s are close, choose the simplest model.
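A first-order model illustrates the lagging and the step-ahead forecasting described above. This is a sketch under the assumption of an ordinary least squares fit of Y on Y lag 1; the function names and series are made up, and higher orders would need a multiple regression.

```python
def fit_ar1(series):
    """Regress Y on Y lag 1 and return (a0, a1)."""
    lagged = series[:-1]   # Y lag 1: independent variable
    current = series[1:]   # Y: dependent variable
    n = len(lagged)
    x_bar = sum(lagged) / n
    y_bar = sum(current) / n
    a1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(lagged, current))
          / sum((x - x_bar) ** 2 for x in lagged))
    a0 = y_bar - a1 * x_bar
    return a0, a1


def forecast_ar1(series, a0, a1, periods):
    """Forecast ahead, feeding each forecast back in as the next lag value."""
    last = series[-1]
    forecasts = []
    for _ in range(periods):
        last = a0 + a1 * last
        forecasts.append(last)
    return forecasts


# Hypothetical series
history = [1, 2, 4, 8, 16]
a0, a1 = fit_ar1(history)
print(forecast_ar1(history, a0, a1, periods=2))
```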