CHAPTER IX

Chi-square Analysis

Learning Objectives:
Given the learning materials and activities of this chapter, the students will be able to:
- Distinguish the uses of measures of association in the description and analysis of
bivariate data.
- Perform the chi-square goodness-of-fit test and the chi-square test for independence to test the
significance of preferences and of associations between categorical variables.
- Interpret the results.

Introduction
Chi-square tests compare the observed frequencies in a one-way or two-way frequency
distribution table with the frequencies expected if the null hypothesis were true. These
tests play an important role in many problems where information is obtained by counting
rather than measuring. The method described here applies to two kinds of problems: the
chi-square goodness-of-fit test and the chi-square test for independence.
The formula for the chi-square test statistic is
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O denotes the observed frequency and E denotes the expected frequency. The degrees of
freedom are (number of categories – 1) for the goodness-of-fit test and (rows – 1)(columns – 1)
for the test for independence. The critical chi-square value is obtained from Appendix D,
the chi-square distribution table.
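
The formula and the two degrees-of-freedom rules can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names and the sample counts are assumptions made for the example and do not come from the chapter.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories or cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def df_goodness_of_fit(num_categories):
    """Degrees of freedom for the goodness-of-fit test: categories - 1."""
    return num_categories - 1


def df_independence(num_rows, num_cols):
    """Degrees of freedom for the test for independence: (rows - 1)(columns - 1)."""
    return (num_rows - 1) * (num_cols - 1)


# Hypothetical counts for illustration only (not from the chapter).
observed = [10, 20]
expected = [15, 15]
statistic = chi_square_statistic(observed, expected)  # 25/15 + 25/15
print(round(statistic, 3))
print(df_goodness_of_fit(2))
print(df_independence(4, 3))
```

The critical value is not computed here; as in the text, it would still be read from the chi-square distribution table.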

Goodness-of-Fit Test
This test can be used to see whether a frequency distribution fits a specific pattern.
For instance, a researcher wants to determine whether consumers have any preference among five
flavors of ice cream. If there were no preference, one would expect each flavor to be
selected with equal frequency.

Assumptions for the Chi-square Test:


1. The data are obtained from a random sample, and the observations are independent.
2. The expected frequency for each category must be at least 5.
The decision rule is as follows. If the observed and expected frequencies agree closely, the
chi-square value is small and the null hypothesis is not rejected. If there are large differences
between the observed and expected frequencies, the chi-square value is large and the null
hypothesis is rejected.
Example 1: A clothing manufacturer wants to determine whether customers prefer any specific
color over other colors in shirts. He selects a random sample of 102 shirts sold and notes the color.
Color     Number sold
White     43
Blue      22
Black     16
Red       10
Yellow     6
Green      5
Solution:
Steps:
1. State the null and alternative hypotheses.
$H_0$: Customers have no color preference.
$H_a$: Customers show a color preference.
2. Level of significance α = 0.10.
3. Select an appropriate test statistic.
The test statistic is the chi-square goodness-of-fit test, and the formula is
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
4. Determine the critical value and critical region.
With α = 0.10 and k – 1 = 6 – 1 = 5 degrees of freedom, where k is the number of color
categories, the critical value is 9.236.
Reject $H_0$ if the computed $\chi^2$ is greater than 9.236.
5. Compute the value of the test statistic:
If there were no preference, each color would be expected to account for 102 / 6 = 17 shirts.
Hence, the expected frequency for every color is 17, and the computed chi-square statistic is
$$\chi^2 = \frac{(43 - 17)^2}{17} + \frac{(22 - 17)^2}{17} + \cdots + \frac{(5 - 17)^2}{17} = 59.76$$
6. Decision: Since the computed $\chi^2$ = 59.76 is greater than 9.236, reject $H_0$ at the 0.10
level of significance.
7. Conclusion:
Therefore, there is enough evidence to reject the claim that customers show no
preference for the color of shirts.
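
The arithmetic in Example 1 can be checked with a short Python sketch. The variable names are illustrative assumptions, and the critical value is copied from the table lookup in step 4 instead of being computed.

```python
# Observed shirt sales by color, taken from Example 1.
observed = {"White": 43, "Blue": 22, "Black": 16, "Red": 10, "Yellow": 6, "Green": 5}

total = sum(observed.values())   # 102 shirts in the sample
k = len(observed)                # 6 color categories
expected = total / k             # 17 shirts per color if there is no preference

# Sum of (O - E)^2 / E over the six colors.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
degrees_of_freedom = k - 1

# Critical value for alpha = 0.10 and 5 degrees of freedom, as read from
# the chi-square table in the text (not computed here).
CRITICAL_VALUE = 9.236
reject_h0 = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
# prints: chi-square = 59.76, df = 5, reject H0: True
```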

Test for Independence


There are times when we might be interested in observing more than one variable on each
individual to find whether an association exists between these variables. Our goal is a test of
independence, that is, to find whether two observed characteristics of a member of a population
are independent.
Suppose we pick a sample of size n and classify the data in a two-way table on the basis of
the two variables. Such a table for determining whether the distribution according to one variable
is contingent on the distribution of the other is called a contingency table. This table is made up
of rows and columns. Each block in the table is called a cell and it is designated by its row and
column position.
The chi-square test for independence can be used to test the independence of two variables.
The hypotheses are stated as follows:
$H_0$: The first variable is independent of the second variable.
$H_a$: The first variable is dependent on the second variable.

Procedure for Computing Expected Frequencies


1. Find the sum of each row and each column, and then find the grand total.
2. For each cell, multiply the corresponding row sum by the column sum and divide by the
grand total to get the expected value.
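
The two-step procedure above can be sketched as a small Python function. The function name `expected_frequencies` and the 2 x 2 table are hypothetical and serve only to illustrate the rule.

```python
def expected_frequencies(table):
    """Return expected counts: (row total * column total) / grand total per cell."""
    row_totals = [sum(row) for row in table]          # step 1: row sums
    col_totals = [sum(col) for col in zip(*table)]    # step 1: column sums
    grand_total = sum(row_totals)                     # step 1: grand total
    # Step 2: one expected value per cell.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table for illustration only (not from the chapter).
counts = [
    [10, 20],
    [30, 40],
]
for row in expected_frequencies(counts):
    print(row)
# prints:
# [12.0, 18.0]
# [28.0, 42.0]
```

Each row and each column of expected values sums to the same total as the observed table, which is a quick way to check the computation by hand.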
Example 2: A total of n = 309 furniture defects were recorded and the defects were classified into
four types: A, B, C, or D. At the same time, each piece of furniture was identified by the production
shift in which it was manufactured. These counts are presented in a contingency table below:

                        Shift
Type of defect      1      2      3    Total
A                  15     26     33       74
B                  21     31     17       69
C                  45     34     49      128
D                  13      5     20       38
Total              94     96    119      309

Do the data provide sufficient evidence to indicate that the type of furniture defect varies with the
shift during which the piece of furniture is produced? Test at 0.005 level of significance.

Solution: Follow the steps in hypothesis testing.

1. State the null and alternative hypotheses.
$H_0$: Type of furniture defect does not vary from shift to shift.
$H_a$: Type of furniture defect varies from shift to shift.
2. Level of significance α = 0.005.
3. Select an appropriate test statistic.
The test statistic is the chi-square test for independence, and the formula is
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
4. Determine the critical value and critical region.
With α = 0.005 and (r – 1)(c – 1) = (4 – 1)(3 – 1) = 6 degrees of freedom, the critical value
from Appendix D, the chi-square distribution table, is 18.55.
Reject $H_0$ if the computed $\chi^2$ is greater than 18.55.
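5. Compute the value of the test statistic:
First calculate the estimated expected frequency for each cell.
$$
\begin{aligned}
E_{11} &= \frac{74 \times 94}{309} = 22.51 & E_{22} &= \frac{69 \times 96}{309} = 21.44 & E_{33} &= \frac{128 \times 119}{309} = 49.29 \\
E_{12} &= \frac{74 \times 96}{309} = 22.99 & E_{23} &= \frac{69 \times 119}{309} = 26.57 & E_{41} &= \frac{38 \times 94}{309} = 11.56 \\
E_{13} &= \frac{74 \times 119}{309} = 28.50 & E_{31} &= \frac{128 \times 94}{309} = 38.94 & E_{42} &= \frac{38 \times 96}{309} = 11.81 \\
E_{21} &= \frac{69 \times 94}{309} = 20.99 & E_{32} &= \frac{128 \times 96}{309} = 39.77 & E_{43} &= \frac{38 \times 119}{309} = 14.63
\end{aligned}
$$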

Substitute all expected values into the formula and calculate the chi-square statistic.
$$\chi^2 = \frac{(15 - 22.51)^2}{22.51} + \frac{(26 - 22.99)^2}{22.99} + \cdots + \frac{(20 - 14.63)^2}{14.63} = 19.18$$
6. Decision: Since the computed $\chi^2$ = 19.18 is greater than 18.55, reject $H_0$ at the 0.005
level of significance.
7. Conclusion:
Therefore, there is sufficient evidence to indicate that the proportions of defect types
of furniture vary from shift to shift.
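
Example 2 can be reproduced with the Python sketch below, which uses only the standard library. The variable names are illustrative assumptions, and the critical value is the one read from the chi-square table in step 4. Because the code keeps the expected frequencies unrounded, its result may differ from the hand calculation in the third decimal place.

```python
# Observed defect counts from Example 2: rows are defect types A-D,
# columns are shifts 1-3.
observed = [
    [15, 26, 33],
    [21, 31, 17],
    [45, 34, 49],
    [13, 5, 20],
]

row_totals = [sum(row) for row in observed]          # 74, 69, 128, 38
col_totals = [sum(col) for col in zip(*observed)]    # 94, 96, 119
grand_total = sum(row_totals)                        # 309

# Accumulate (O - E)^2 / E over all 12 cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        chi_square += (o - e) ** 2 / e

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical value for alpha = 0.005 and 6 degrees of freedom, as read from
# the chi-square table in the text (not computed here).
CRITICAL_VALUE = 18.55
reject_h0 = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```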
