
CONTENTS
Measuring Reliability

Indicators of Quality

Factors Affecting Validity

Kinds of Validity

Content Validity

How to Establish Content Validity?

Table of Specification

Factors That Can Lower Validity

MEASURING RELIABILITY
TEST-RETEST

- Give the same test twice to the same group, with a time interval between the two administrations.
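
A minimal sketch of the computation, with invented scores and assuming SciPy is available: the test-retest coefficient is simply the Pearson correlation between the two administrations (the equivalent-forms coefficient below is computed the same way, correlating the two forms).

```python
# Hypothetical sketch: test-retest reliability as the Pearson correlation
# between two administrations of the same test to the same group.
from scipy.stats import pearsonr

# Illustrative scores only (not real data): six students, two occasions.
first_administration = [78, 85, 62, 90, 71, 88]
second_administration = [80, 83, 65, 92, 69, 90]

r, _ = pearsonr(first_administration, second_administration)
print(f"Test-retest reliability coefficient: r = {r:.2f}")
```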

EQUIVALENT FORMS (similar in content, difficulty level, arrangement, type of assessment, etc.)

- Give two forms of the test to the same group in close succession.

SPLIT-HALF

- The test is built from two equivalent halves. Give the test once, then score the two halves separately (odd items vs. even items).
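
A sketch of the split-half procedure under the same assumptions (illustrative data only; NumPy/SciPy available): correlate the odd-item and even-item half scores, then apply the Spearman-Brown correction to estimate the reliability of the full-length test.

```python
# Hypothetical sketch: split-half reliability from a single administration.
import numpy as np
from scipy.stats import pearsonr

# Illustrative item matrix (rows = students, columns = items; 1 = correct).
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1, 1, 1],
])

odd_half = items[:, 0::2].sum(axis=1)    # items 1, 3, 5, 7
even_half = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, 8

r_half, _ = pearsonr(odd_half, even_half)
r_full = 2 * r_half / (1 + r_half)       # Spearman-Brown correction
print(f"Half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```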

CRONBACH ALPHA (SPSS)

- Inter-item consistency: one test, one administration
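
The formula behind the SPSS output can be sketched directly; this is a minimal illustration with invented ratings, not a replacement for SPSS.

```python
# Hypothetical sketch of Cronbach's alpha:
#   alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total score))
import numpy as np

# Illustrative item matrix (rows = students, columns = items).
items = np.array([
    [4, 3, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 4, 5],
    [3, 3, 2, 3],
    [4, 5, 4, 4],
], dtype=float)

k = items.shape[1]                               # number of items
item_variances = items.var(axis=0, ddof=1)       # variance of each item
total_variance = items.sum(axis=1).var(ddof=1)   # variance of total scores

alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")
```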

INTER-RATER CONSISTENCY (SUBJECTIVE SCORING)

- Calculate the percentage of exact agreement, or compute Pearson's product-moment correlation and the resulting coefficient of determination (SPSS).
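
Both indices mentioned above can be sketched in a few lines (illustrative ratings only; SPSS reports the same Pearson statistics).

```python
# Hypothetical sketch: two indices of inter-rater consistency.
import numpy as np
from scipy.stats import pearsonr

# Illustrative data: two raters scoring the same eight essays (0-10).
rater_a = np.array([7, 5, 9, 6, 8, 4, 7, 6])
rater_b = np.array([7, 6, 9, 5, 8, 4, 7, 7])

# 1. Percentage of exact agreement.
exact_agreement = np.mean(rater_a == rater_b) * 100
print(f"Exact agreement: {exact_agreement:.0f}%")

# 2. Pearson product-moment correlation and coefficient of determination.
r, _ = pearsonr(rater_a, rater_b)
print(f"r = {r:.2f}, r-squared = {r ** 2:.2f}")
```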

INDICATORS OF QUALITY

- Validity

- Reliability

- Utility

- Fairness

[Figure: target diagrams contrasting the combinations of low validity, low reliability, and neither reliable nor valid]

VALIDITY

- The agreement between a test score or measure and the quality it is believed to measure

- Does the test measure what it is supposed to measure?

FACTORS AFFECTING VALIDITY

- Representativeness of the items

- Breadth and depth of coverage of the test

- Test length

- Item type

- Clarity & specificity of instruction

KINDS OF VALIDITY

- Face Validity

- Content Validity

- Construct Validity

- Criterion Validity (Predictive, Concurrent)

- Convergent Validity

- Discriminant Validity

FACE VALIDITY

- The items appear to be reasonably related to the perceived purpose of the test.

- This is not really a type of validity, because it does not offer evidence to support
conclusions drawn from test scores.

CONTENT VALIDITY

- It considers the adequacy of representation of the conceptual domain the test is designed to cover.

- It involves an attempt to assess the content of a test to ensure it includes a representative sample of all the questions that could be asked.

- ‘Your test did not give me an opportunity to demonstrate what I know,’ or ‘You assigned Chapters 1-5, but nearly all of the items came from Chapters 1-2. How can you evaluate whether we know anything about the other material we were supposed to read?’

- How well elements of the test relate to the content domain

- How closely the content of questions in the test relates to the content of the curriculum

- Directly relates to instructional objectives and their fulfillment!

- Can you test students on things they have not been taught?

CONSTRUCT UNDERREPRESENTATION

- Failure to capture important components of a construct

CONSTRUCT-IRRELEVANT VARIANCE

- Scores are influenced by factors irrelevant to the construct.

E.g. reading comprehension demands, test anxiety, and illness

HOW TO ESTABLISH CONTENT VALIDITY?

- Instructional objectives (looking at your list)

- Table of Specification

E.g.

At the end of the chapter, the student will be able to do the following:

1. Explain what ‘stars’ are

2. Discuss the type of stars and galaxies in our universe

3. Categorize different constellations by looking at the stars

4. Differentiate between our star, the Sun, and all other stars

TABLE OF SPECIFICATION

                     |  Categories of Performance (Mental Skills)
Content Areas        | Knowledge | Comprehension | Analysis | Total
---------------------+-----------+---------------+----------+------------
1. What are stars?   |           |               |          |
2. Our star, the Sun |           |               |          |
3. Constellations    |           |               |          |
4. Galaxies          |           |               |          |
Total                |           |               |          | Grand Total
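
In code, a table of specification is just a small mapping from content areas to item counts per mental skill; the counts below are invented purely to show how row, column, and grand totals can be checked against instructional emphasis.

```python
# Hypothetical sketch: a table of specification as a dictionary mapping
# content areas to item counts for (Knowledge, Comprehension, Analysis).
specification = {
    "What are stars?":   (3, 2, 1),
    "Our star, the Sun": (2, 2, 2),
    "Constellations":    (2, 3, 1),
    "Galaxies":          (3, 1, 2),
}

for area, counts in specification.items():
    print(f"{area:18s} row total = {sum(counts)}")

column_totals = [sum(col) for col in zip(*specification.values())]
grand_total = sum(column_totals)
print(f"Column totals (K, C, A) = {column_totals}, grand total = {grand_total}")
```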

CRITERION VALIDITY

- How well a test corresponds with a particular criterion; evidence is provided by high correlations between the test and a well-defined criterion measure.

- Criterion: the standard against which the test is compared

- It is the degree to which scores on a test (the predictor) correlate with performance on relevant criterion measures (a concrete criterion in the "real" world).

- If they correlate highly, the test (predictor) is a valid one (see the sketch below).

- Predictive Validity Evidence

- Concurrent Validity Evidence

PREDICTIVE VALIDITY EVIDENCE

- How well performance on a test predicts future performance on some valued measure (criterion)

- E.g. a reading readiness test might be used to predict students' later achievement in reading.

CONCURRENT VALIDITY EVIDENCE

- How well performance on a test estimates current performance on some valued measure (criterion)

- E.g. a test of dictionary skills can estimate students' current skill in actual dictionary use, checked by observation.
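
A minimal sketch of the validity coefficient described in this section, with invented scores: correlate the test (predictor) with the criterion measure, e.g. observed dictionary use for the concurrent case.

```python
# Hypothetical sketch: criterion-related validity as the correlation
# between test scores (predictor) and a criterion measure.
from scipy.stats import pearsonr

# Illustrative data only: a skills test vs. an observed criterion.
test_scores = [55, 70, 62, 85, 78, 90, 66, 73]
criterion_scores = [50, 68, 60, 88, 75, 86, 70, 71]

r, p = pearsonr(test_scores, criterion_scores)
print(f"Validity coefficient: r = {r:.2f} (p = {p:.3f})")
```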

CONSTRUCT VALIDITY

Construct: something built by mental synthesis

Each construct is broken down into its component parts.

E.g. ‘motivation’ can be broken down into:

- Interest

- Attention span

- Hours spent

- Assignments undertaken and submitted, etc.

All of these sub-constructs put together measure motivation.
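
One way to picture this, as a hedged sketch with invented component scores: standardize each sub-construct so the different scales are comparable, then combine them into a single motivation composite.

```python
# Hypothetical sketch: a construct score built from its components.
import numpy as np

# Illustrative component scores for four students (different raw scales).
interest = np.array([7.0, 4.0, 8.0, 5.0])
attention_span = np.array([6.0, 5.0, 9.0, 4.0])
hours_spent = np.array([10.0, 3.0, 12.0, 6.0])
assignments = np.array([8.0, 5.0, 9.0, 6.0])

def standardize(x):
    """Convert to z-scores so components on different scales are comparable."""
    return (x - x.mean()) / x.std(ddof=1)

components = [interest, attention_span, hours_spent, assignments]
motivation = np.mean([standardize(c) for c in components], axis=0)
print("Motivation composite (z-units):", np.round(motivation, 2))
```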

CONVERGENT EVIDENCE

- When a measure correlates well with other tests believed to measure the
same construct

- Used when there is no criterion that fully defines what we are attempting to measure.

DIVERGENT EVIDENCE

- Also known as discriminant validation. The test should have low correlations with measures of unrelated constructs; this provides evidence for what the test does not measure.
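
The convergent/discriminant pattern can be illustrated with simulated data (nothing here is a real instrument): a new motivation scale should correlate highly with an established motivation measure and only weakly with an unrelated construct.

```python
# Hypothetical sketch: convergent vs. discriminant evidence as a pattern
# of high and low correlations, using simulated scores.
import numpy as np

rng = np.random.default_rng(0)
new_scale = rng.normal(size=100)                  # hypothetical new measure
established = 0.8 * new_scale + rng.normal(scale=0.6, size=100)  # same construct
unrelated = rng.normal(size=100)                  # unrelated construct

convergent_r = np.corrcoef(new_scale, established)[0, 1]
discriminant_r = np.corrcoef(new_scale, unrelated)[0, 1]
print(f"Convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```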

FACTORS THAT CAN LOWER VALIDITY

- Unclear directions

- Difficult reading vocabulary and sentence structure

- Ambiguity in statements

- Inadequate time limits

- Inappropriate level of difficulty

- Poorly constructed test items

- Test items inappropriate for the outcomes being measured

- Tests that are too short

- Improper arrangement of items (e.g., complex items placed before easy ones)

- Identifiable patterns of answers

- Teaching

- Administration and scoring

- Nature of criterion

