Verena Gonzales
Ann Creia Tupasi
Ramil Cabañesas
Try-out phase
Item analysis phase
Item revision phase
Item Analysis
Item 1       A    B*   C    D
Total        0    40   20   20
Upper 25%    0    15   5    0
Lower 25%    0    5    10   5
The correct response is B. Let us compute the difficulty index and index of
discrimination.
$$P = \frac{R_U + R_L}{T} \times 100$$
Where:
$R_U$ – the number in the upper group who answered the item correctly.
$R_L$ – the number in the lower group who answered the item correctly.
$T$ – the total number who tried the item.
Index of Item Discriminating Power
$$D = \frac{R_U - R_L}{\frac{1}{2}T}$$
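Writing $R = R_U + R_L$ for the number who answered the item correctly, the index of difficulty is:
$$P = \frac{R}{T} \times 100$$
Where:
P – percentage who answered the item correctly (index of difficulty)
R – number who answered the item correctly
T – total number who tried the item
For example, with R = 8 and T = 20:
$$P = \frac{8}{20} \times 100 = 40\%$$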
The smaller the percentage figure, the more difficult the
item.
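The difficulty calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `difficulty_index` and its parameter names are assumptions made for this example, not part of the original material.

```python
def difficulty_index(num_correct: int, num_tried: int) -> float:
    """Return the index of difficulty P as a percentage.

    num_correct -- R, the number who answered the item correctly
    num_tried   -- T, the total number who tried the item
    (Both names are hypothetical, chosen for this sketch.)
    """
    if num_tried <= 0:
        raise ValueError("num_tried must be positive")
    return 100 * num_correct / num_tried


# Worked example from the text: R = 8, T = 20.
print(difficulty_index(8, 20))  # 40.0
```

A lower return value means a harder item, since fewer examinees answered it correctly.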
Estimate the item discriminating power using the
formula below:
$$D = \frac{R_U - R_L}{\frac{1}{2}T} = \frac{6 - 2}{10} = .40$$
The discriminating power of an item is reported as
a decimal fraction; maximum discriminating power
is indicated by an index of 1.00.
Maximum discrimination is usually found at the 50
per cent level of difficulty.
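As a rough sketch, the discriminating power can be computed the same way in Python. The function name `discrimination_index` and its parameters are hypothetical; the numbers are the ones used in the worked example (6 correct in the upper group, 2 in the lower group, 20 examinees in total).

```python
def discrimination_index(upper_correct: int, lower_correct: int,
                         num_tried: int) -> float:
    """Return the discriminating power D as a decimal fraction.

    upper_correct -- R_U, correct answers in the upper group
    lower_correct -- R_L, correct answers in the lower group
    num_tried     -- T, total number who tried the item
    (All names are hypothetical, chosen for this sketch.)
    """
    if num_tried <= 0:
        raise ValueError("num_tried must be positive")
    # D = (R_U - R_L) / (T / 2)
    return (upper_correct - lower_correct) / (num_tried / 2)


# Worked example from the text: R_U = 6, R_L = 2, T = 20.
print(discrimination_index(6, 2, 20))  # 0.4
```

The index reaches 1.00 only when everyone in the upper group and no one in the lower group answers correctly, for example `discrimination_index(10, 0, 20)`.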
Interpreting the index of difficulty, expressed as a proportion:
0.00 – 0.20 = very difficult
0.21 – 0.80 = moderately difficult
0.81 – 1.00 = very easy
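The three bands can be turned into a small lookup, sketched below. The function name `classify_difficulty` is an assumption, and so is the handling of values that fall between the listed bands (for example 0.205): this sketch treats 0.20 and 0.80 as inclusive upper bounds.

```python
def classify_difficulty(p: float) -> str:
    """Map a difficulty index (proportion from 0 to 1) to a label.

    Assumption: 0.20 and 0.80 are inclusive upper bounds, so values
    that fall between the printed bands go to the next band up.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if p <= 0.20:
        return "very difficult"
    if p <= 0.80:
        return "moderately difficult"
    return "very easy"


# A difficulty index of 40% corresponds to a proportion of 0.40.
print(classify_difficulty(0.40))  # moderately difficult
```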
Validation
Test score   Excellent   Good   Needs improvement
High         20          10     5
Average      10          25     5
Low          1           10     14
The expectancy table shows that 20 students with high
test scores were subsequently rated excellent in terms
of their final grades; 25 students with average test
scores were later rated good; and finally, 14 students
with low test scores were later graded as needing
improvement.
The evidence for this particular test tends to
indicate that students getting high scores on it
would be graded excellent, students getting average
scores would be rated good, and students getting
low scores would be graded as needing
improvement.
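One way to read an expectancy table is to convert each row of counts into percentages and find the most frequent final rating for each test-score level. The sketch below does this for the counts given above. The column labels in `RATINGS` are inferred from the surrounding text, and the names `EXPECTANCY`, `row_percentages`, and `most_likely_rating` are hypothetical.

```python
# Column labels are an assumption inferred from the text
# (excellent, good, needing improvement).
RATINGS = ["Excellent", "Good", "Needs improvement"]

# Counts of students per test-score level, in RATINGS order.
EXPECTANCY = {
    "High": [20, 10, 5],
    "Average": [10, 25, 5],
    "Low": [1, 10, 14],
}


def row_percentages(counts):
    """Convert one row of counts to percentages rounded to 1 decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


def most_likely_rating(counts):
    """Return the rating label with the largest count in the row."""
    return RATINGS[counts.index(max(counts))]


for level, counts in EXPECTANCY.items():
    print(level, row_percentages(counts), most_likely_rating(counts))
```

For these counts, the most frequent rating is Excellent for high scorers, Good for average scorers, and Needs improvement for low scorers, which matches the interpretation given in the text.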
Reliability
.70 - .80 Good for a classroom test; in the range of most. There
are probably a few items which could be improved.
.60 - .70 Somewhat low. This test should be supplemented by
other measures (e.g., more tests) for grading.
.50 - .60 Suggests need for revision of test, unless it is quite
short (ten or fewer items). The test definitely needs to
be supplemented by other measures (e.g., more tests)
for grading.
.50 or below Questionable reliability. This test should not contribute
heavily to the course grade, and it needs revision.