# Current developments in testing: Item Response Theory (IRT)

Overview
What we will cover today

- Assumptions of Classical Test Theory (CTT)
- Item Response Theory (IRT)
- Similarities and differences between IRT and CTT
- Test bias
- Test fairness
- Test accommodations

Assumptions of Classical Test Theory (CTT)

There are three main assumptions in Classical Test Theory (CTT), in which the observed score is the sum of a true score and an error score, $X = T + E$:

1. The error and the true scores from the same test have a correlation of zero. Hence, the variance of the observed score is expected to equal the sum of the variances of the true and error scores (Lord, 1980):

   $\rho_{TE} = 0$, so $\sigma_X^2 = \sigma_T^2 + \sigma_E^2$
Source: Prof. 'Dibu Ojerinde, OON, Joint Admissions and Matriculation Board (JAMB), Abuja, Nigeria

2. The error term has an expected mean of zero. When the error is zero, the observed score equals the true score:

   $\mu_E = \frac{1}{n}\sum_{i=1}^{n} E_i = 0$, and $X = T$ when $E = 0$


3. The errors from parallel measurements are uncorrelated. For parallel measurements $X_1 \parallel X_2$, with $X_1 = T + E_1$ and $X_2 = T + E_2$:

   $\rho_{E_1 E_2} = 0$
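
The three assumptions can be checked numerically on simulated scores. The following is only a minimal sketch: the normal distributions, the means and standard deviations, the sample size, and the helper names (`pearson`, `variance`, `simulate_ctt`) are illustrative assumptions, not part of the source slides.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def variance(xs):
    """Population variance of a list."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def simulate_ctt(n=20000, seed=0):
    """Simulate X = T + E with independent, normally distributed T and E (assumed)."""
    rng = random.Random(seed)
    true = [rng.gauss(50, 10) for _ in range(n)]  # true scores T
    err1 = [rng.gauss(0, 5) for _ in range(n)]    # errors of measurement 1
    err2 = [rng.gauss(0, 5) for _ in range(n)]    # errors of a parallel measurement
    x1 = [t + e for t, e in zip(true, err1)]      # observed scores X1 = T + E1
    return {
        "corr_T_E": pearson(true, err1),            # assumption 1: close to 0
        "var_X": variance(x1),                      # assumption 1: close to var_T + var_E
        "var_T_plus_var_E": variance(true) + variance(err1),
        "mean_E": sum(err1) / n,                    # assumption 2: close to 0
        "corr_E1_E2": pearson(err1, err2),          # assumption 3: close to 0
    }


checks = simulate_ctt()
for name, value in checks.items():
    print(f"{name}: {value:.3f}")
```

Because the data are random, the correlations and the error mean come out near zero rather than exactly zero, and the two variance figures agree only approximately.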

Descriptions of IRT
• “IRT refers to a set of mathematical models that describe, in probabilistic terms, the relationship between a person’s response to a survey question/test item and his or her level of the ‘latent variable’ being measured by the scale” (Fayers and Hays, p. 55, Assessing Quality of Life in Clinical Trials, Oxford Univ Press: chapter on applying IRT for evaluating questionnaire item and scale properties).
• This latent variable is usually a hypothetical construct [trait/domain or ability] which is postulated to exist but cannot be measured by a single observable variable/item.
• Instead it is indirectly measured by using multiple items or questions in a multi-item test/scale.
Assumptions in IRT
• Unidimensionality
– Examinee performance is accounted for by a single ability.
• Response: dichotomous
– The relationship between examinee performance on each item and the ability measured by the test is described as monotonically increasing.
• Monotonicity of item performance and ability is typified in an item characteristic curve (ICC).
• Examinees with more ability have higher probabilities of giving correct answers to items than lower ability students (Hambleton, 1989).
• The ICC is a mathematical model that links dichotomously scored data (item performance) to the unobservable data (ability).
• P_i(θ) gives the probability of a correct response to item i as a function of ability (θ).
• b = item difficulty, a = item discrimination, c = pseudo-guessing parameter. At θ = b the probability of a correct response is (1 + c)/2.
• Two-parameter model: c = 0
• One-parameter model: c = 0, a = 1
[Figure: three items showing different item difficulties (b)]
[Figure: items showing different levels of item discrimination (a)]
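
The roles of a, b and c can be made concrete with a small sketch of an item characteristic curve. It assumes the common logistic form of the three-parameter model, P_i(θ) = c + (1 − c) / (1 + exp(−a(θ − b))), without the optional scaling constant D = 1.7; the function names and the item parameter values are hypothetical.

```python
import math


def icc(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the assumed logistic 3PL form."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def icc_2pl(theta, a, b):
    """Two-parameter model: pseudo-guessing fixed at c = 0."""
    return icc(theta, a=a, b=b, c=0.0)


def icc_1pl(theta, b):
    """One-parameter model: c = 0 and a = 1."""
    return icc(theta, a=1.0, b=b, c=0.0)


# Hypothetical item: moderately discriminating, slightly hard, some guessing.
item = {"a": 1.2, "b": 0.5, "c": 0.2}

thetas = [-3, -2, -1, 0, 1, 2, 3]
probs = [icc(t, **item) for t in thetas]  # rises with ability (monotonicity)
p_at_b = icc(item["b"], **item)           # equals (1 + c) / 2 at theta = b

for t, p in zip(thetas, probs):
    print(f"theta={t:+d}  P={p:.3f}")
print(f"P at theta=b: {p_at_b:.3f}")
```

In this form, changing b shifts the curve left or right, changing a alters its steepness around b, and c sets the lower asymptote.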
• IRT has almost completely replaced CTT as the method of choice.
• IRT has many advantages over CTT that have brought IRT into more frequent use.
• IRT allows for greater reliability.
• IRT can be used in computerized adaptive testing (CAT).
• IRT allows for difficulty and ability to be on the same scale.
• IRT can be analyzed using multi-level modeling.

Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, p. 28
3 basic components of IRT

1. Item Response Function (IRF) –

Mathematical function that relates the latent trait to the probability of endorsing an item.

2. Item Information Function –

An indication of item quality; an item’s ability to differentiate among respondents.

3. Invariance –

Position on the latent trait can be estimated by any items with known IRFs, and item characteristics are population independent within a linear transformation.

Source: Psy 427, Cal State Northridge, Andrew Ainsworth, PhD, slides.
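
How an item's ability to differentiate among respondents varies along the latent trait can be sketched with an item information function. This example assumes the two-parameter logistic model, for which the information is I(θ) = a² P(θ) (1 − P(θ)); the item names and parameter values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Item response function under the assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Ability grid from -4.0 to 4.0 in steps of 0.1.
grid = [x / 10 for x in range(-40, 41)]

# Hypothetical items given as (a, b) pairs.
items = {
    "low_discrimination": (0.5, 0.0),
    "high_discrimination": (2.0, 1.0),
}

peaks = {}
for name, (a, b) in items.items():
    best_theta = max(grid, key=lambda t: item_information(t, a, b))
    peaks[name] = (best_theta, item_information(best_theta, a, b))
    print(f"{name}: peak at theta={best_theta:.1f}, information={peaks[name][1]:.4f}")
```

Under this model each item is most informative at θ = b, where the information equals a²/4, so the more discriminating item separates respondents near its difficulty much more sharply.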
Differences between IRT and CTT
| Dimension | CTT | IRT |
|---|---|---|
| Definition | CTT is a theory about test scores that introduces 3 concepts: test score (often called observed score), true score and error score. | IRT is a general statistical theory about examinee item and test performance and how performance relates to the abilities that are measured by the items in the test. |
| Model | Linear | Nonlinear |
| Level of assumption | Weak (i.e. easy to meet with test data) | Strong (i.e. more difficult to meet with test data) |
| Test relationship | Parallel test | Non-parallel test |
| Invariance | Sample dependent | Sample independent |
| Error | Equal chance for everyone | Not equal chance |
| Performance | Predictable | Non-predictable |
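
The "sample dependent" versus "sample independent" contrast can be illustrated with one item administered to two groups. This is a sketch under stated assumptions: the item follows a two-parameter logistic model exactly, and the ability values, item parameters and function names are hypothetical.

```python
import math


def p_correct(theta, a=1.0, b=0.0):
    """Probability of a correct response under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def ctt_p_value(abilities, a, b):
    """Expected proportion correct in a group (the CTT item difficulty index)."""
    return sum(p_correct(t, a, b) for t in abilities) / len(abilities)


# One item with fixed IRT parameters (hypothetical values).
item_a, item_b = 1.0, 0.0

# Two hypothetical samples of examinee abilities.
low_group = [-2.0, -1.5, -1.0, -0.5, 0.0]
high_group = [0.0, 0.5, 1.0, 1.5, 2.0]

p_low = ctt_p_value(low_group, item_a, item_b)
p_high = ctt_p_value(high_group, item_a, item_b)

print(f"IRT difficulty b is {item_b} in both groups")
print(f"CTT p-value, low-ability group:  {p_low:.3f}")
print(f"CTT p-value, high-ability group: {p_high:.3f}")
```

The same item looks hard in the low-ability sample and easy in the high-ability sample when judged by proportion correct, while its b parameter is unchanged. In practice this invariance holds only to the extent that the IRT model fits the data.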

Test Bias
Test Fairness
Test Accommodations
TEST BIAS

DEFINITION

A test is considered biased when the scores of one group are significantly different from, and have higher predictive validity than, those of another group. Predictive validity is the extent to which a score on an assessment predicts future performance.
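
The two quantities in this definition, group score levels and predictive validity, can be computed side by side. The sketch below uses made-up numbers for two hypothetical groups, takes the Pearson correlation between test score and a later performance measure as the index of predictive validity, and does not include a significance test.

```python
def mean(xs):
    """Arithmetic mean of a list."""
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: test scores and a later performance measure per group.
groups = {
    "group_a": {"test": [55, 60, 65, 70, 75, 80], "later": [58, 61, 66, 69, 76, 79]},
    "group_b": {"test": [40, 45, 50, 55, 60, 65], "later": [62, 50, 66, 48, 70, 57]},
}

summary = {}
for name, data in groups.items():
    summary[name] = {
        "mean_score": mean(data["test"]),
        "predictive_validity": pearson(data["test"], data["later"]),
    }
    print(name, summary[name])
```

In this invented example group A scores higher on average and its scores predict later performance far better than group B's, which is the pattern the definition describes.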

SOURCES OF TEST BIAS
Types of Test Bias
Construct bias
• Construct bias occurs when the construct measured yields significantly different results for test-takers from the original culture for which the test was developed and test-takers from a new culture.
• A construct refers to an internal trait that cannot be directly observed but must be inferred from consistent behavior observed in people.

Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, pp. 41-43

Method bias
• Method bias refers to factors surrounding the administration of the test that may affect the results.
• The testing environment, the length of the test and the assistance provided by the teacher administering the test are all factors that may lead to method bias.
• For example, if a student from one culture is used to, and expects to, receive assistance on standardized tests, but is faced with a situation in which the teacher is unable to provide any guidance, this may lead to inaccurate test results.

Arsaythamby Veloo/Rosna Awang Hashim, Teori Ujian dan Pentaksiran Pendidikan, UUM, Sintok, 2016, p. 42

Item bias
• Item bias refers to problems that occur with individual items on the assessment. These biases may arise from poor grammar, the choice of culture-specific phrases and poorly written assessment items.

TEST FAIRNESS

DEFINITION

A fair test is one that yields comparably valid inferences from person to person and group to group.

Test Fairness

• Test fairness means that test items should not have any biases and should not be offensive to any examinee subgroup.
• A fair assessment provides all students with an equal opportunity to demonstrate achievement.

Test Accommodations

DEFINITION

• Any test accommodation that may enhance student performance beyond providing equal access is considered inappropriate and is therefore not permitted.

Students with Disabilities: Guidelines for Special Test Accommodations, August 2015, p.3
TEST ACCOMMODATIONS

• Timing/scheduling
• Setting
• Response
• Presentation

2015 – 2016 Test Implementation Manuals, Appendix C
Students with Disabilities: Guidelines for Special Test Accommodations, August 2015, p.9
Students with Disabilities: Guidelines for Special Test Accommodations, August 2015, p.10-14
