Laboratory Exercise No. 7
Paired t-Test
Course Code: IE 301 Program: BSIE
Course Title: Advanced Statistics for IE Date Performed: Sep. 12, 2015
Section: IE31FB3 Date Submitted: Sep. 12, 2015
Members: Madrigal, Dominic R. Instructor: Engr. Rica Navarro
Cruz II, Robert D.

1. Objective(s):
The activity aims to introduce basic ideas of power and sample size calculations for 2-sample t-Test.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 test for a difference between two population means using a 2-sample t-test
2.2 determine the sample size required to detect an effect of a given size with a given degree of
confidence.

3. Discussion:

A paired t-test helps determine whether the mean difference between paired observations is significant.
Statistically, the paired t-test is equivalent to performing a 1-sample t-test on the differences. A paired
t-test also helps you evaluate whether the mean difference is equal to a specific value.

Paired observations are related. Examples include:


1. Weights recorded for individuals before and after an exercise program
2. Measurements of the same part taken with two different measuring devices.
Use a paired t-test with a random sample of paired observations. The test also assumes that the paired
differences come from a normally distributed population. However, the test is robust to violations of this
assumption, provided the observations are collected randomly and the data are continuous, unimodal,
and reasonably symmetric.

Why use a paired t-test?


A paired t-test answers questions such as:
1. Does a new treatment result in a difference in the product?
2. Do two different instruments provide similar measurements for the same sample?
4. Resources:
MiniTab Software/Manual
Training Data Set
Textbooks
5. Procedure:
Practice Problem: A consumer group wants to determine whether drivers can park one car more quickly
than the other. Because the data are paired (each individual parked both cars), use a paired t-test to test
the following hypotheses:
H0: The mean difference between paired observations in the population is zero.
H1: The mean difference between paired observations in the population is not zero.
Use the default confidence level of 95%. Display individual value plots and boxplots to help visualize the
data.
1. Open CARCTL.MPJ
2. Choose Stat ►Basic Statistics ►Paired t.
3. Complete the dialog box as shown below.

4. Click Graphs
5. Check Individual value plot and Boxplots of differences.
6. Click OK in each dialog box
7. Interpret the results
8. Draw conclusions.

Part 2: Testing the normality: The paired t-test


1. Choose Stat ►Basic Statistics ►Normality Test
2. In Variable, enter SupplrA
3. Click OK.
4. Choose Stat ►Basic Statistics ►Normality Test
5. In Variable, enter SupplrB
6. Click OK.
7. Interpret the results
8. Draw conclusions.

Part 3: Checking for Normality: the paired t-test is actually a 1-sample t-test on the pairwise differences.
Therefore, the pairwise differences must satisfy the 1-sample t-test assumptions, including normality.

Before checking for normality, store the pairwise differences in the worksheet.
1. Choose Stat ►Basic Statistics ►2 Variances
2. Complete the dialog box as shown below.

3. Click OK.
4. Interpret the results.
5. Draw conclusions.
6. Data and Results:
Part 1:

The boxplot of differences shows that the mean lies within the hinges of the box, while the individual value plot of
differences shows that most of the red dots lie close to the mean; therefore, there is no evidence of a mean difference between
paired observations in the population.

7. Data Analysis and Conclusion:

Part 2:

The two graphs show that the data points approximately follow the straight line; therefore, both SupplrA and SupplrB
appear to be approximately normally distributed.

Part 3:

The individual value plot shows that the mean of SupplrA is higher than that of SupplrB. In the histogram, SupplrA has a
roughly normal shape while SupplrB is left-skewed. The individual value plots of SupplrA and SupplrB do not overlap and the
boxplot shows that the medians are close; therefore, the difference between the two is judged to be significant.

8. Assessment (Rubric for Laboratory Performance):
TIP-VPAA–054D
Revision Status/Date:0/2009 September 09

TECHNOLOGICAL INSTITUTE OF THE PHILIPPINES


RUBRIC FOR LABORATORY PERFORMANCE

CRITERIA | BEGINNER (1) | ACCEPTABLE (2) | PROFICIENT (3) | SCORE

Laboratory Skills
Manipulative Skills | Members do not demonstrate needed skills. | Members occasionally demonstrate needed skills. | Members always demonstrate needed skills. |
Experimental Set-up | Members are unable to set-up the materials. | Members are able to set-up the materials with supervision. | Members are able to set-up the material with minimum supervision. |
Process Skills | Members do not demonstrate targeted process skills. | Members occasionally demonstrate targeted process skills. | Members always demonstrate targeted process skills. |
Safety Precautions | Members do not follow safety precautions. | Members follow safety precautions most of the time. | Members follow safety precautions at all times. |

Work Habits
Time Management/Conduct of Experiment | Members do not finish on time with incomplete data. | Members finish on time with incomplete data. | Members finish ahead of time with complete data and time to revise data. |
Cooperative and Teamwork | Members do not know their tasks and have no defined responsibilities. Group conflicts have to be settled by the teacher. | Members have defined responsibilities most of the time. Group conflicts are cooperatively managed most of the time. | Members are on tasks and have responsibilities at all times. Group conflicts are cooperatively managed at all times. |
Neatness and Orderliness | Messy workplace during and after the experiment. | Clean and orderly workplace with occasional mess during and after the experiment. | Clean and orderly workplace at all times during and after the experiment. |
Ability to do independent work | Members require supervision by the teacher. | Members require occasional supervision by the teacher. | Members do not need to be supervised by the teacher. |
Other Comments/Observations:
TOTAL SCORE

RATING=

x 100%

Laboratory Exercise No. 8
Correlation
Course Code: IE 301 Program: BSIE
Course Title: Advanced Statistics for IE Date Performed: Sep. 12, 2015
Section: IE31FB3 Date Submitted: Sep. 12, 2015
Members: Madrigal, Dominic R. Instructor: Engr. Rica Navarro
Cruz II, Robert D.

2. Intended Learning Outcomes (ILOs):


The students shall be able to:
2.1 Evaluate the linear relationship between two variables using scatterplot, correlation, and fitted line
plot.
2.2 Analyze and interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:

The sample correlation coefficient, r, measures the degree of linear association between two variables
(the degree to which one variable changes with another). A positive correlation indicates that both
variables tend to increase or decrease together. A negative correlation indicates that, as one variable
increases, the other tends to decrease.
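
As a rough illustration of how r is obtained, the sketch below computes the sample correlation coefficient from its usual sum-of-products form. The function name `pearson_r` and the six data pairs are assumptions made for the example; they are not the SoftRev1 data and not Minitab's own routine.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, for illustration only.
sales_calls = [10, 12, 15, 18, 20, 24]
revenue = [31, 35, 38, 47, 49, 58]

r_up = pearson_r(sales_calls, revenue)                    # both increase together
r_down = pearson_r(sales_calls, [-v for v in revenue])    # one rises as the other falls
print(f"r_up = {r_up:.3f}, r_down = {r_down:.3f}")
```

Flipping the sign of one variable flips the sign of r without changing its magnitude, which matches the description of positive and negative correlation above.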

Use correlation when you have data for two continuous variables and wish to determine whether a linear
relationship exists between them. The correlation does not tell you whether the variables are related in a
nonlinear fashion.
Some statisticians argue that correlation should not be used if one variable is a dependent response of the
other.

Correlation can help answer questions such as

1. Are two variables related in a linear manner?
2. What is the strength of the relationship?
Example
A. Is there a linear relationship between dollars spent on training and customer satisfaction ratings?
B. What is the relationship between revenue and the number of sales calls made?

Additional Considerations
Correlation quantifies the degree of linear association between two variables.
A strong correlation does not imply a cause-and-effect relationship. For example, a strong correlation
between two variables may be due to the influence of a third variable not under consideration.
A correlation coefficient close to zero does not necessarily mean no association. The variables may have a
nonlinear association. Always plot the data so that you can identify nonlinear relationships when they are
present.
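
The point about near-zero correlation can be checked with a small made-up example. In the sketch below, y depends on x exactly (y = x squared), yet the correlation coefficient is zero because the relationship is not linear. The helper `correlation` and the data are hypothetical and written only for this illustration.

```python
import statistics

def correlation(x, y):
    """Pearson r computed as the average product of standardized values."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (len(x) * sd_x * sd_y)

x = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [v ** 2 for v in x]       # perfect quadratic relationship
y_linear = [2 * v + 1 for v in x]    # perfect linear relationship

r_curved = correlation(x, y_curved)
r_linear = correlation(x, y_linear)
print(f"quadratic: r = {r_curved:.3f}, linear: r = {r_linear:.3f}")
```

A scatterplot of the quadratic data would show the U-shaped pattern immediately, which is why plotting comes before computing r.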

Correlation assumes that the values of both variables are free to vary. Correlation is not appropriate if you
fix the values of one variable to study changes in another.

4. Resources:
MiniTab Software/Manual
Training Data Sets
Textbooks
5. Procedure:

Practice Problem: The sales department for a software company wants to determine whether a relationship
exists between the number of sales calls made and the revenue earned. Analysts record the number of
sales calls and the revenue earned each day for a period of 420 days.

Variable | Description
Revenue | Daily revenue in thousands of dollars, rounded to the nearest dollar
Sales Calls | Number of sales calls made each day

Part 1:
1. Open SoftRev1.MPJ
2. Choose Graph ►Scatterplot
3. Choose Simple, then click OK
4. Complete the dialog box as shown below.

5. Click OK.
6. Interpret the results

Part 2: Calculating the correlation


1. Choose Stat ►Basic Statistics ►Correlation
2. Complete the dialog box as shown below.

3. Click OK
4. Interpret the results
5. Draw conclusions.
6. Data and Results:

The plot shows the data values of variables x and y. As the number of sales calls increases, the
revenue also increases.
Therefore the two variables have a positive linear relationship and the correlation is positive.

7. Data Analysis and Conclusion:

The correlation is equal to 0.802. The relationship between revenue and sales calls is positive and fairly strong, but the
points do not fall exactly on a straight line, so the increase in revenue per sales call is not constant.

Laboratory Exercise No. 9
Simple Linear Regression
Course Code: IE 301 Program: BSIE
Course Title: Advanced Statistics for IE Date Performed: Sep. 12, 2015
Section: IE31FB3 Date Submitted: Sep. 12, 2015
Members: Madrigal, Dominic R. Instructor: Engr. Rica Navarro
Cruz II, Robert D.

1. Objective(s):
The activity aims to measure the degree of linear association between two variables using graphs and
correlation, and to model the relationship between a continuous response variable and one or more
predictor variables.
2. Intended Learning Outcomes (ILOs):
The students shall be able to:
2.1 Evaluate the linear relationship between two variables using scatterplot, correlation, and fitted line plot.
2.2 Analyze and interpret results and draw conclusions about the output provided by Minitab.
3. Discussion:

Simple linear regression examines the relationship between a continuous response variable (y) and one
predictor variable (x). The general equation for a simple linear regression model is:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

where Y is the response, X is the predictor, $\beta_0$ is the intercept (the value of Y when X equals zero),
$\beta_1$ is the slope, and $\varepsilon$ is the random error.

Use simple linear regression when you have a continuous y and one predictor, x. The following conditions
should also be met:
1. x can be ordinal or continuous.
2. In theory, x should be fixed by the investigator. In practice, however, it is often allowed to vary.
3. Any random variation in the measurement of x is assumed to be negligible compared to the range
in which x is measured.
The y-values obtained in your sample differ from those predicted by the regression model (unless all points
happen to fall on a perfectly straight line). These differences are called residuals.

To confirm that the analysis is valid, verify all assumptions about the model error term. Use residual plots to
check that the errors have the following characteristics:
1. Normally distributed
2. Constant variance for all fitted values
3. Random over time
Simple linear regression can help answer questions such as:
1. How important is x in predicting y?
2. What value can you expect for y when x is 5?
3. How much does y change if x increases by one unit?
For example,
Is the number of mistakes made in processing loans related to cycle time?
What salary can you expect to make with five years experience in a particular field?
How much does salary increase for every additional year of experience?
S
S is an estimate of the average variability about the regression line. S is the positive square root of the
mean square error (MSE). For a given problem, the better the equation predicts the response, the lower S
is.

$R^2$ (R-Sq)
$R^2$ is the proportion of variability in the response that is explained by the equation. Acceptable values for
$R^2$ vary depending on the study. For example, engineers studying chemical reactions may require an
$R^2$ of 90% or more. However, someone studying human behavior (which is more variable) may be
satisfied with much lower $R^2$ values.

$R^2$ adjusted (R-Sq(adj))
$R^2$ adjusted is sensitive to the number of terms in the model and is important when comparing models
with different numbers of terms.
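
The three summary statistics can be computed directly from the observed and fitted values. The sketch below uses the standard textbook definitions; the function name `fit_summary` and the numbers are hypothetical, chosen only to show the arithmetic, and the output is not taken from the lab's Minitab session.

```python
import math

def fit_summary(observed, fitted, n_predictors=1):
    """Return (S, R-squared, adjusted R-squared) for a fitted regression."""
    n = len(observed)
    mean_y = sum(observed) / n
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))   # unexplained variation
    sst = sum((y - mean_y) ** 2 for y in observed)              # total variation
    df_error = n - n_predictors - 1
    s = math.sqrt(sse / df_error)                               # square root of MSE
    r_sq = 1 - sse / sst
    # The adjusted version penalizes extra terms through the degrees of freedom.
    r_sq_adj = 1 - (sse / df_error) / (sst / (n - 1))
    return s, r_sq, r_sq_adj

observed = [31, 35, 38, 47, 49, 58]                 # hypothetical responses
fitted = [30.5, 34.3, 40.1, 45.9, 49.7, 57.5]       # hypothetical fitted values

s, r_sq, r_sq_adj = fit_summary(observed, fitted)
print(f"S = {s:.3f}, R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}")
```

With a single predictor the adjusted value is only slightly below the unadjusted one; the gap widens as more terms are added without a matching drop in the error sum of squares.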

The Least Squares regression line


The coefficients for the regression equation are chosen to minimize the sum of the squared differences
between the response values observed in the sample and those predicted by the equation.
In other words the squared vertical distances between the points and line are minimized. The result is
called the Least squares regression line.
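
A minimal sketch of the least squares calculation for one predictor is given below, using the closed-form expressions for the slope and intercept. The function names and the data are assumptions for illustration; Minitab's fitted line plot performs the equivalent calculation internally.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and a line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical data, for illustration only.
sales_calls = [10, 12, 15, 18, 20, 24]
revenue = [31, 35, 38, 47, 49, 58]

b0, b1 = least_squares_line(sales_calls, revenue)
best = sum_squared_residuals(sales_calls, revenue, b0, b1)
# Any other line, such as one with a shifted intercept or slope, fits worse.
shifted = sum_squared_residuals(sales_calls, revenue, b0 + 0.5, b1)
tilted = sum_squared_residuals(sales_calls, revenue, b0, b1 + 0.05)
print(f"revenue = {b0:.2f} + {b1:.3f} * calls; SSE = {best:.3f}")
```

Comparing `best` with `shifted` and `tilted` shows the defining property: no other line gives a smaller sum of squared residuals for these points.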

Confidence and prediction bands


Confidence bands provide the estimated range in which the mean response for a given value of the
predictor is expected to fall.

Prediction bands provide the estimated range in which a single new observation for a given value of the
predictor is expected to fall.
Analysts want to be confident that the mean and the individual points of the y-variable, Revenue, fall within
certain limits of variability.
Use the default confidence level of 95%

Confidence Interval
The 95% confidence interval defines a likely range of values for the population mean of y. For any given
value of x, you can be 95% confident that the population mean for y is between the indicated lines.

Prediction interval
The 95% prediction interval defines a likely range of y values for future individual observations. For any
given value of x, you can be 95% confident that the corresponding value of y for a single future observation
is between the indicated lines.
Note: The prediction interval is always wider than the confidence interval because of the added uncertainty
involved in predicting a single response versus the mean response.
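
The reason the prediction band is wider can be seen in the standard formulas for the two half-widths: the prediction interval adds an extra 1 under the square root for the variability of a single observation. The sketch below is an illustration with made-up data; the function name is hypothetical, and the critical value 2.776 is the tabulated two-sided 95% t value for 4 degrees of freedom, which applies only to this six-point example.

```python
import math

# Two-sided 95% t critical value for 4 degrees of freedom, taken from a t table.
T_CRIT_95_DF4 = 2.776

def interval_half_widths(x, y, x0, t_crit):
    """Return (fitted value, CI half-width, PI half-width) at predictor value x0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    s = math.sqrt(sse / (n - 2))                   # S, the square root of MSE
    spread = 1 / n + (x0 - mean_x) ** 2 / sxx      # grows as x0 moves away from the mean of x
    ci_half = t_crit * s * math.sqrt(spread)       # for the mean response
    pi_half = t_crit * s * math.sqrt(1 + spread)   # for a single new observation
    return intercept + slope * x0, ci_half, pi_half

# Hypothetical data, for illustration only.
sales_calls = [10, 12, 15, 18, 20, 24]
revenue = [31, 35, 38, 47, 49, 58]

for x0 in (12, 16.5, 24):
    fit, ci_half, pi_half = interval_half_widths(sales_calls, revenue, x0, T_CRIT_95_DF4)
    print(f"x = {x0}: fit = {fit:.1f}, CI = +/-{ci_half:.2f}, PI = +/-{pi_half:.2f}")
```

Both bands are narrowest at the mean of x and widen toward the ends of the data, which gives the curved shape seen on a fitted line plot.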

Residuals
The residual for each observation is the difference between the observed value of the response and the
value predicted by the model (the fitted value). For example, if the observed response value is 12 and the
model predicts 10, the residual is 2.
Assumptions
To confirm that the analysis is valid, verify all assumptions about the model error term. Use residual
plots to check that the errors have the following characteristics:
1. Normally distributed
2. Constant variance for all fitted values
3. Random over time

Normal Probability Plot


The normal probability plot should roughly follow a straight line. Use this plot to verify that the residuals do
not deviate substantially from a normal distribution.

Histogram
Use the normal probability plot to make decisions about the normality of the residuals. With a reasonably
large sample size, the histogram displays information compatible with the normal probability plot.
The histogram of the residuals should appear approximately bell-shaped with no unusual values or outliers.
Use the histogram as an exploratory tool to learn about the following characteristics of the data.
-Typical values, spread or variation, and shape
-Unusual values in the data

Residual versus fits


Use the plot of the residuals versus fits to verify that the residuals are scattered randomly about zero.

This pattern... | Indicates...
Curvilinear | A quadratic term may be missing from the model
Fanning or uneven spread of residuals across the different fitted values | Nonconstant variance of the residuals
Points far away from zero relative to other data points | Outliers exist

Residual versus order


The plot of the residuals versus order displays the residuals in the order of data collection (provided the
data were entered in the same order in which they were collected.)
If the data collection order affects the results, residuals near each other may be correlated and thus not
independent.

This pattern... | Indicates...
Residuals are not randomly scattered around zero | Residuals are not independent over time
Residuals are randomly scattered around zero | Residuals are independent
Points far away from zero | Outliers exist
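
A numerical companion to the residuals versus order plot is the Durbin-Watson statistic, which is not mentioned in this exercise and is introduced here only as an illustration. Values near 2 suggest no correlation between neighboring residuals, values near 0 suggest positive correlation, and values near 4 suggest negative correlation. The residual series below are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in data collection order."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    # Squared changes between consecutive residuals, relative to their total size.
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residual series, in time order.
drifting = [-3, -2, -1, 0, 1, 2, 3]      # steady trend: neighbors are alike
alternating = [1, -1, 1, -1, 1, -1]      # zig-zag: neighbors are opposite

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(f"drifting: {dw_drifting:.2f}, alternating: {dw_alternating:.2f}")
```

Either extreme corresponds to the first row of the table above: the residuals are not randomly scattered around zero over time.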

Additional Considerations
1. Be careful when using regression analysis to assert that changes in the predictor cause changes in
the response, unless the predictor values were fixed at predetermined levels in a controlled
experiment. If the values of the predictors are allowed to vary randomly, other factors may
influence both the predictors and the response.
2. Do not apply regression results to values of x that are outside the sample range. The relationship
between Sales calls and Revenue may be very different for sales calls above 168.
3. Be alert for outliers when using regression procedures. Some outliers (called high leverage points)
have a large effect on the calculation of the least squares regression line. In such cases, the line
may no longer represent the rest of the data very well.
4. Time order trends in the data can violate the assumption of independence. A run chart or individual
chart is a useful tool for detecting such effects.
4. Resources:

MiniTab Software/Manual
Training Data Sets
Textbooks
5. Procedure:
Practice Problem: The sales department for a software company wants to determine whether a relationship
exists between the number of sales calls made and the revenue earned. Analysts record the number of
sales calls and the revenue earned each day for a period of 420 days. Determine the effect of Sales Calls on
Revenue. Use a fitted line plot to calculate and plot the regression equation.

Variable | Description
Revenue | Daily revenue in thousands of dollars, rounded to the nearest dollar
Sales Calls | Number of sales calls made each day

Part 1: Fitted Line Plot


1. Open SoftRev1.MPJ
2. Choose Stat ►Regression ►Fitted Line Plot
3. Complete the dialog box as shown below.

4. Click OK.

5. Interpret the results.
6. Use the ANOVA results to evaluate whether the simple regression model is useful for predicting
revenue. State the hypotheses.
7. Interpret the p-value (P).
8. Make a conclusion.
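
The ANOVA evaluation in step 6 rests on splitting the total variation in the response into a part explained by the line and a residual part, then forming an F ratio. The sketch below shows that arithmetic on made-up data; the function name is hypothetical, and 7.71 is the tabulated 5% critical value of F with 1 and 4 degrees of freedom, which applies only to this six-point example.

```python
# Upper 5% critical value of F with 1 and 4 degrees of freedom, taken from an F table.
F_CRIT_05_1_4 = 7.71

def regression_anova(x, y):
    """Return (SSR, SSE, F) for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)    # total variation in the response
    slope = sxy / sxx
    ssr = slope * sxy                          # explained by the line (1 degree of freedom)
    sse = sst - ssr                            # residual variation (n - 2 degrees of freedom)
    f_stat = (ssr / 1) / (sse / (n - 2))
    return ssr, sse, f_stat

# Hypothetical data, for illustration only.
sales_calls = [10, 12, 15, 18, 20, 24]
revenue = [31, 35, 38, 47, 49, 58]

ssr, sse, f_stat = regression_anova(sales_calls, revenue)
reject_h0 = f_stat > F_CRIT_05_1_4             # H0: the slope is zero
print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, F = {f_stat:.1f}, reject H0: {reject_h0}")
```

In this setting the null hypothesis is that the slope is zero, so a large F (small p-value) indicates that the predictor is useful for predicting the response.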

Part 2: Adding confidence and prediction bands


1. Choose Stat ►Regression ►Fitted Line Plot or Press (Ctrl)+(E)
2. Click Options
3. Complete the dialog box as shown below.

4. Click OK
5. Click Graphs
6. Complete the dialog box shown below

7. Click OK in each dialog box.
8. Interpret the results:
   a. Normal probability plot
   b. Histogram
   c. Residuals versus fits
   d. Residuals versus order
9. Make conclusions

6. Data and Results:

7. Data Analysis and Conclusion:

Part 1:
H0: $\beta_1 = 0$ (the slope is zero; sales calls have no linear effect on revenue)
Ha: $\beta_1 \neq 0$

The p-value is reported as 0.000, which is less than 0.05, so we reject H0. Therefore, there is a statistically significant
linear relationship between the number of sales calls and the revenue earned.

Part 2:
Fitted Line Plot – we can see that as the number of sales calls increases, so does the revenue.

Normal Probability Plot – the data points follow the straight line; therefore, the normal distribution appears to fit the
residuals.

Histogram – the shape is normal so the data appear to be normally distributed.

Versus Fits – shows that the data have a constant variance.

Versus Order – shows that the data are correlated with each other.