"In constructing a new test (or shortening or lengthening an existing one), the final set of items is usually identified through a process known as item analysis." (Linda Crocker)
Both the validity and the reliability of any test depend ultimately on the characteristics of its items.
Quantitative Analysis
Qualitative Analysis
includes the consideration of content validity (content and form of items), as well as the evaluation of items in terms of effective item-writing procedures.
Quantitative Analysis
includes principally the measurement of item difficulty and item discrimination.
1 Item Difficulty
1. Definition
The item difficulty for item $i$, $p_i$, is defined as the proportion of examinees who get that item correct.
2. Estimation Methods
Method for Dichotomously Scored Items
Method for Polytomously Scored Items
Grouping Method
$$P = \frac{R}{N} \tag{7.1}$$
$P$ is the difficulty of a certain item.
$R$ is the number of examinees who get that item correct.
$N$ is the total number of examinees.
Example 1
Eighty high school students take a science achievement test. Of these, 61 students pass item 1 and 32 students pass item 10. Calculate the difficulty of item 1 and of item 10.
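The calculation in this example can be sketched in Python as follows. The function name `item_difficulty` is chosen here for illustration and does not come from the original slides; the sketch simply applies the proportion-correct definition of difficulty.

```python
def item_difficulty(num_correct, num_examinees):
    """Proportion of examinees who answer the item correctly (P = R / N)."""
    if num_examinees <= 0:
        raise ValueError("num_examinees must be positive")
    return num_correct / num_examinees


# Data from the example: 80 examinees, 61 pass item 1, 32 pass item 10.
p_item1 = item_difficulty(61, 80)
p_item10 = item_difficulty(32, 80)
print(p_item1, p_item10)
```

Under this definition item 1 has a difficulty of about .76 and item 10 a difficulty of .40, so item 10 is the harder of the two (a lower proportion correct means a harder item).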
$$P = \frac{\bar{X}}{X_{\max}} \tag{7.2}$$
$\bar{X}$ is the mean of all examinees' scores on the item.
$X_{\max}$ is the perfect (maximum) score of that item.
Example 2
The perfect score of one open-ended item is
Taking the top 27% and the bottom 27% of examinees as the extreme groups is the optimal split when the total test scores are normally distributed.
$$P = \frac{P_U + P_L}{2} \tag{7.3}$$
$P_U$ is the proportion of examinees in the upper group who get the item correct.
$P_L$ is the proportion of examinees in the lower group who get the item correct.
Example 3
There are 370 examinees taking a language test. In the upper 27% extreme group, 64 examinees pass item 5; in the lower 27% extreme group, 33 examinees pass the same item. Compute the difficulty of item 5. Key: .49
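A minimal sketch of the grouping method for this example is given below. It assumes that each extreme group contains 27% of the examinees rounded to the nearest whole person (100 of 370 here); that rounding rule and the function name are assumptions made for illustration.

```python
def extreme_group_difficulty(n_total, upper_correct, lower_correct, fraction=0.27):
    """Difficulty as the average of upper-group and lower-group pass rates."""
    # Assumption: group size is the given fraction of examinees, rounded.
    group_size = round(n_total * fraction)
    p_upper = upper_correct / group_size
    p_lower = lower_correct / group_size
    return (p_upper + p_lower) / 2


# Data from the example: 370 examinees, 64 upper-group and 33 lower-group passes.
p_item5 = extreme_group_difficulty(370, 64, 33)
print(p_item5)
```

With groups of 100 examinees the pass rates are .64 and .33, giving a difficulty of about .485, which agrees with the key of .49 after rounding.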
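$$CP = \frac{KP - 1}{K - 1} \tag{7.4}$$
$CP$ is the difficulty corrected for chance, $P$ is the uncorrected difficulty, and $K$ is the number of choices of the item.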
Example 4
The difficulty of one five-choice item is .50, and the difficulty of another four-choice item is .53. Which item is more difficult?
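ANSWER
$$CP_1 = \frac{KP - 1}{K - 1} = \frac{5 \times 0.5 - 1}{5 - 1} = 0.38$$
$$CP_2 = \frac{KP - 1}{K - 1} = \frac{4 \times 0.53 - 1}{4 - 1} = 0.37$$
Since $CP_2 < CP_1$, the four-choice item is slightly more difficult once guessing is taken into account.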
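The correction for chance used in this answer, CP = (KP - 1) / (K - 1), can be checked with a short sketch. The function name `corrected_difficulty` is illustrative only.

```python
def corrected_difficulty(p, k):
    """Difficulty corrected for guessing on a k-choice item: (k*p - 1) / (k - 1)."""
    if k < 2:
        raise ValueError("an item needs at least two choices")
    return (k * p - 1) / (k - 1)


cp1 = corrected_difficulty(0.50, 5)  # five-choice item
cp2 = corrected_difficulty(0.53, 4)  # four-choice item
print(cp1, cp2)
```

The corrected values are 0.375 and about 0.373, so although the four-choice item has the higher raw proportion correct, it is marginally the harder item after the correction.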
Discrimination and Difficulty
If there are 100 persons in a population, we can calculate the discrimination as follows:
[Figure: panels A and B]
2 Item Discrimination
When the test as a whole is to be evaluated by means of criterion-related validation, the items may themselves be evaluated and selected on the basis of their relationships to the external criterion.
When we identify an item that high-scoring examinees have a high probability of answering correctly and low-scoring examinees have a low probability of answering correctly, we say that the item discriminates, or differentiates, among the examinees.
1. Interpretation
Item discrimination refers to the degree to which an item differentiates correctly among test takers in the behavior that the test is designed to measure.
2. Estimation Methods
(1) Index of Discrimination (used for dichotomously scored items)
$$D = P_H - P_L \tag{7.5}$$
We need to set one or two cutting scores to divide the examinees into an upper-scoring group and a lower-scoring group. $P_H$ is the proportion in the upper group who answer the item correctly and $P_L$ is the proportion in the lower group who answer the item correctly. Values of $D$ may range from -1.00 to 1.00.
Example 1
There are 140 students taking a world history test. (1) If we use the 27% rule to determine the upper and lower groups, how many examinees are there in each group? (2) If 18 examinees in the upper group and 6 examinees in the lower group answer item 5 correctly, calculate the discrimination index for item 5.
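Both parts of this example can be sketched in Python. The sketch assumes the group size is 27% of the examinees rounded to the nearest whole person; that rounding rule and the function names are assumptions for illustration, not something the slides state.

```python
def extreme_group_size(n_total, fraction=0.27):
    """Number of examinees in each extreme group (assumed: rounded to nearest)."""
    return round(n_total * fraction)


def discrimination_index(upper_correct, lower_correct, group_size):
    """D = P_H - P_L, the difference between upper and lower pass rates."""
    return upper_correct / group_size - lower_correct / group_size


group_size = extreme_group_size(140)           # part (1)
d_item5 = discrimination_index(18, 6, group_size)  # part (2)
print(group_size, d_item5)
```

Under this assumption each group has 38 examinees, and the discrimination index of item 5 is 12/38, or roughly .32.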
Example 2
Test Data of 50 Examinees on an 8-Item Job Stress Scale
Item
PH PL D
.54 .32
.51 .10
.18 .23
.18 .25
.41 -.05
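(1) Product-Moment Correlation
$$r = \frac{\sum xy}{N s_X s_Y}$$
Here $x$ and $y$ are deviation scores from the respective means, and $s_X$ and $s_Y$ are the standard deviations of the item and criterion scores. This formula is commonly used to estimate the degree of the relationship between item and criterion scores.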
(2) Point Biserial Correlation
If we use the total test score as the criterion, and the test item is scored 0 or 1, then we can use the following formula:
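$$r_{pbi} = \frac{\bar{X}_p - \bar{X}_t}{s_t} \sqrt{\frac{p}{q}} \tag{7.6}$$
$\bar{X}_p$ is the mean test score of those who answer the item correctly.
$\bar{X}_t$ is the mean test score of all examinees.
$s_t$ is the standard deviation of the test scores of all examinees.
$p$ is the pass ratio of that item (difficulty).
$q$ is the fail ratio of that item ($q = 1 - p$).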
Example 3
Test Data of 15 Examinees

Examinee   Test score   Item score
1          90           1
2          81           0
3          80           1
4          78           1
5          77           1
6          70           1
7          69           1
8          65           0
9          55           0
10         50           0
11         49           1
12         42           0
13         35           1
14         31           0
15         10           0

Note:
$$s_t = \sqrt{\frac{\sum (X - \bar{X}_t)^2}{N}}$$
$$\bar{X}_t = \frac{90 + 81 + 80 + \cdots + 10}{15} = 58.80$$
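An equivalent form of the point biserial correlation is:
$$r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{s_t} \sqrt{pq} \tag{7.7}$$
$\bar{X}_q$ is the mean test score of those who answer that item incorrectly.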
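The point biserial correlation for the 15-examinee data can be sketched as below, computing both forms (the one based on the mean of all examinees and the one based on the mean of those who fail the item). The sketch assumes that the standard deviation of the test scores divides by N (the population form, `statistics.pstdev`), which is what makes the two forms agree with an ordinary product-moment correlation; the function name is illustrative.

```python
from math import sqrt
from statistics import mean, pstdev

# Test scores and item scores (1 = correct) of the 15 examinees.
test_scores = [90, 81, 80, 78, 77, 70, 69, 65, 55, 50, 49, 42, 35, 31, 10]
item_scores = [1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]


def point_biserial(test_scores, item_scores):
    """Return r_pbi computed two ways: via the overall mean and via the fail-group mean."""
    passed = [t for t, i in zip(test_scores, item_scores) if i == 1]
    failed = [t for t, i in zip(test_scores, item_scores) if i == 0]
    p = len(passed) / len(test_scores)  # pass ratio (difficulty)
    q = 1 - p                           # fail ratio
    s_t = pstdev(test_scores)           # assumption: SD with N in the denominator
    r_overall = (mean(passed) - mean(test_scores)) / s_t * sqrt(p / q)
    r_groups = (mean(passed) - mean(failed)) / s_t * sqrt(p * q)
    return r_overall, r_groups


r_overall, r_groups = point_biserial(test_scores, item_scores)
print(r_overall, r_groups)
```

Both forms give the same value, roughly .48 for this item, because the difference between the pass-group mean and the overall mean equals q times the difference between the pass-group and fail-group means.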
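(3) Biserial Correlation
$$r_b = \frac{\bar{X}_p - \bar{X}_t}{s_t} \cdot \frac{p}{Y}$$
or
$$r_b = \frac{\bar{X}_p - \bar{X}_q}{s_t} \cdot \frac{pq}{Y}$$
$Y$ is the height (ordinate) of the standard normal curve at the point that divides the area under the curve into the proportions $p$ and $q$.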
A      C      A+C
B      D      B+D
A+B    C+D
b) PHI Coefficient
$$r_{\phi} = \frac{BC - AD}{\sqrt{(A+B)(C+D)(A+C)(B+D)}} \tag{7.9}$$
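A minimal sketch of the phi coefficient follows. It assumes the cell labels A, B, C, D of the fourfold table above and takes the four marginal totals to be A+B, C+D, A+C, and B+D; the sign convention (BC - AD in the numerator) follows formula (7.9). The function name and the cell counts in the usage lines are made up for illustration.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table with cells A, B, C, D, using BC - AD as numerator."""
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every marginal total must be non-zero")
    return (b * c - a * d) / denominator


# Hypothetical counts, not taken from the slides.
phi = phi_coefficient(a=10, b=40, c=40, d=10)
print(phi)
```

For these hypothetical counts every marginal total is 50, so the coefficient is 1500 / 2500 = 0.6.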
$$s_i^2 = \frac{\sum_{j=1}^{N} (X_{ij} - \bar{X}_i)^2}{N} \tag{7.10}$$
2. Analysis Case
Number of Examinees on Each Choice

Item   Group   A    B    C    D    Omit
1      Upper   5    92   1    2    0
1      Lower   22   50   12   16   0
2      Upper   58   10   15   16   1
2      Lower   26   21   15   36   2
3      Upper   17   15   28   28   12
3      Lower   25   11   19   34   11
4      Upper   1    44   14   36   5
4      Lower   1    56   10   28   5

Key: A D C B
$r_b$: 0.52 0.33 -0.04 0.08
Choice Analysis
Whether more examinees choose the correct choice than choose the wrong choices
Whether more examinees in the upper group than in the lower group choose a wrong choice
Whether there is any choice that few examinees choose
Whether there is any item on which a considerable number of examinees make no choice
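Some of the checks listed above can be sketched in Python. The example uses the upper-group and lower-group counts as read from the case table for item 4; the 5% cut-off for a "rarely chosen" choice (`min_share`) and the function name are assumptions made for illustration, not thresholds given in the slides.

```python
def choice_report(upper, lower, min_share=0.05):
    """Summarise each choice: total picks, upper-minus-lower gap, rarity flag."""
    n_total = sum(upper.values()) + sum(lower.values())
    report = {}
    for choice in upper:
        total = upper[choice] + lower[choice]
        report[choice] = {
            "total": total,
            # Positive: favoured by the upper group; negative: by the lower group.
            "upper_minus_lower": upper[choice] - lower[choice],
            # Assumed cut-off: fewer than min_share of all examinees chose it.
            "rarely_chosen": total / n_total < min_share,
        }
    return report


# Counts for item 4 as read from the case table (100 examinees per group).
upper = {"A": 1, "B": 44, "C": 14, "D": 36, "Omit": 5}
lower = {"A": 1, "B": 56, "C": 10, "D": 28, "Omit": 5}

report = choice_report(upper, lower)
for choice, row in report.items():
    print(choice, row)
```

For this item the report flags choice A as rarely chosen (2 of 200 examinees) and shows that choice B attracts more lower-group than upper-group examinees, which are the kinds of patterns the checklist asks the analyst to look for.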