
Polytechnic University of the Philippines

Department of Elementary and Secondary Education
Sta. Mesa, Manila


A Synthesis on:
Historical Development of Testing and Evaluation

Submitted by:
Amoyo, Shekinah F.
Boyo, Jy Allyra S.
Dolot, Dyea C.
Flores, Jacqueline S.
Libron, Maricris P.
Monderin, Camille P.
Patrolla, Danilo B.
Submitted to:
Prof. Jay-R A. Manamtam

July 5, 2015



A Little History

Early Period
The Boom Period
The First Period of Criticism
The Battery Period
The Second Period of Criticism
The Age of Accountability

2200 B.C.

The Chinese used competitive exams for civil service positions.

Exams covered civil law, military affairs, agriculture, revenue, and geography.
Testing was extremely rigorous.
Confucian classics were emphasized.
Only 3% of the group became eligible for public office.
The Chinese exams served as models for developing civil service exams in Europe and America in the 1800s.

The Chinese failed to validate the selection procedures.

Penmanship was at that time regarded as a relevant predictor of suitability for office.

Wundt, Galton, and Cattell laid the foundation for 20th-century testing.

Wilhelm Wundt
Studied conscious human experience using his psychological laboratory.
Acknowledged individual differences, but his inclination was toward the study of the human mind.
His legacy was the rigorous experimental control of procedures, which is very important in test administration under standardized conditions.

Sir Francis Galton
Studied individual differences, the most basic concept underlying psychological testing.
Concentrated on individual differences in sensory and motor functioning. In 10 years, he tested 17,000 individuals.
Pioneered the study of individual differences in mental ability.
Related intellectual ability to skills such as reaction time, sensitivity to physical stimuli, and body proportions.
Demonstrated that objective tests could be devised through standardized procedures.

Evolution of Intelligence and Standardized Achievement Tests:

Alfred Binet was on the verge of a major breakthrough in intelligence testing.

Binet developed his test to help identify children in the Paris school system who could not profit from ordinary instruction.
The Binet-Simon Scale was established, a major breakthrough in the creation of modern tests.

Boom Period (America's involvement in World War I)

15-year boom period

The new science of psychology was called on to play a part in military situations.
Yerkes used the Army Alpha (verbal) and Army Beta (nonverbal) tests to select individuals for military service.
Robert Yerkes, a Harvard psychology professor, convinced the Department of War that it should test the intelligence of all of its 1.75 million recruits so they could be classified and given appropriate assignments (Goddard and Terman also served on this committee).
Consequences of the Kallikak Family

The height of Goddard's success came at a time when America was experiencing a large influx of immigrants from Europe. The Immigration Restriction Act, passed in 1924 (which remained in effect until 1965), was influenced by American eugenics efforts. In 1913 Goddard was invited to Ellis Island to help detect "morons" in the immigrant population. In his "Intelligence Classification of Immigrants of Different Nationalities" (1917) he asserted that most of the Ellis Island immigrants were mentally deficient. For example, he indicated that 83% of all Jews tested were feeble-minded, as were 80% of the Hungarians, 79% of the Italians, and 87% of the Russians. The result was that many immigrants were turned away and sent back to Europe.

Measurement expanded in the 12 years after the war; vocational and personality tests were developed.
Personality Tests: 1920-1940 (WWII)
Structured personality tests: paper-and-pencil tests, e.g., the Woodworth Personal Data Sheet
Tests like the MMPI were published.

Criticism and Consolidation (1930s)

Test developers and users placed too much reliance on the correctness of test results regarding people's abilities and characteristics.

Early Abuses of Tests in America

Goddard (1906) began testing 378 residents and categorized them by mental age (MA) as idiot (MA below 2), imbecile (MA 3-7), feebleminded (MA 8-12), and moron ("foolish").
Goddard's desire was to separate people out.
Believed feeble-minded people were the cause of most social problems (thievery, laziness, alcoholism, prostitution, immorality).
Called for the colonization of "morons" to restrict their breeding. Further, he believed that many immigrants were feeble-minded.

Produced evidence that supported segregation. Sounded dire warnings that racial intermixture would inevitably cause a deterioration of American intelligence.
Later recanted: the findings were "without foundation" and probably the result of cultural and language differences.

Age of Discrimination: testing revealed large score differences between White Americans and minorities, who were labelled feeble-minded; people started to question the tests and the conclusions drawn from them.

First Period of Criticism

The 1930s saw a crash in the expectations of mental measurement.

Criticisms led a young psychologist to initiate the Mental Measurements Yearbook (MMY) to critically review tests.

Battery Period (1940s)

Psychological measurement was used again for military service, where batteries of tests were developed to measure several abilities.

This reduced failure rates and led to an emphasis on test batteries.

In the 1950s, educational and psychological testing grew and expanded not only in the field of education but also in other fields such as business, industry, and clinics.

The APA set guidelines for good testing practice.

2nd Period of Criticism

In 1965, the civil rights movement was in full swing; it reacted to tests' invasion of privacy.

Tests were seen as biased tools that discriminated against women and minorities in education and employment.

Age of Accountability

Despite criticisms, governments, and specifically educational institutions, were putting greater faith in testing to determine whether government and educational programs were achieving their objectives.

Despite failures, schools are held accountable for maximizing the learning of their students.



Segregation between/among minorities.
Created intellectual hierarchy between/among races.
Labelling: Americans superior over African Americans and other minorities.
Discrimination between men and women in employment.
Invasion of privacy

We started the class late on July 2, 2015, at 11 AM instead of our usual 9 AM, because our professor needed to attend to something. Before the reporting started, Group 5 gave the class some energizers. For the first one, each of us took a piece of paper and wrote information about ourselves, with a twist: we had to write three truths and a lie. We then passed the papers around the classroom, and whoever picked up a classmate's paper had to guess which statements were true and which was the lie. For the last energizer, we sent one of our classmates outside the classroom while those remaining inside agreed on what occupation he or she would have in the future; then, from our actions or gestures, the person sent out had to guess what it was. The first person chosen was Mozart, and he needed to guess "checker," but because we were so loud while discussing, I think he overheard us and guessed it right. The second person, Andrea, did not get hers right, though she came close to the answer, which was "astronaut."
The energizers succeeded in livening up the mood and had a positive effect on our class because we had a good time. They took about 20 to 25 minutes.
After the energizers, the reporter of Group 2, Rodrigo Espina, discussed their topic, Historical Development of Testing and Evaluation, and started his lesson by giving the class a timeline of the development of testing and evaluation. The reporter explained that the development of testing, measurement, and evaluation was slow and is difficult to trace because humans have used them for a very long time. Then he proceeded to give us a summary of the development:
2200 B.C.

The Chinese used competitive exams for civil service positions.

The exams consisted of both oral and written examinations, which were informal (until 1115 B.C., when the test procedures became formal).

Applicants were tested on their knowledge of civil law, military affairs, agriculture, revenue, and geography.

Only 3% of the applicants passed the exam.

The reporter explained that the Chinese exams served as models for developing civil service exams in Europe and America during the 1800s, and also discussed their weaknesses: 1) the Chinese failed to validate the selection procedures, and 2) penmanship was at that time regarded as a relevant predictor of suitability for office. Next, he discussed the introduction of formal measurement procedures in Western education systems in the 19th century:
Wilhelm Wundt

Studied conscious human experience using his psychological laboratory.

Acknowledged individual differences, but his inclination was toward the study of the human mind.

His legacy was the rigorous experimental control of procedures, which is very important in test administration under standardized conditions.


Sir Francis Galton

In 1863, Sir Francis Galton, a half cousin of Charles Darwin, worked on individual differences.
Concentrated on individual differences in sensory and motor functioning.
In the span of 10 years, he tested 17,000 individuals.
Pioneered the study of individual differences in mental ability.
Related intellectual ability to skills such as reaction time, sensitivity to physical stimuli, and body proportions.

Demonstrated that objective tests could be devised through standardized procedures.

James McKeen Cattell (1860-1944)

Transported brass instruments to the U.S. and conducted elaborate reaction-time studies; coined the term "mental test."

Some of his famous students are:
Thorndike (1898)
Woodworth (1899)
E.K. Strong (1911), whose Vocational Interest Blank, after many revisions, is still widely used.

The reporter explained that during the 1650s-1800s, people struggled to fit into society. He also added further information about Francis Galton, who was a half cousin of Charles Darwin. Galton worked on individual differences. In 1883 he published a book titled Inquiries into Human Faculty and Its Development. His work was regarded as the beginning of mental tests. The reporter then proceeded to explain in detail:

Time Period 1: The Age of Reform (1792-1900s)

The first documented formal use of evaluation took place in 1792, when William Farish utilized the quantitative mark to assess students.

The quantitative mark permitted objective ranking of examinees and the averaging and aggregating of scores.

Time Period 2: The Age of Efficiency and Testing (1900-1930)

Frederick W. Taylor's work on scientific management became influential among administrators in education.
Taylor's scientific management centered on measurement, analysis, and, most importantly, efficiency.

Objective-based tests were critical in determining the quality of instruction.
Tests were developed by departments set up to improve the efficiency of the educational district.

Time Period 3: The Tylerian Age (1930-1945)

Ralph Tyler, considered the father of educational evaluation, made considerable contributions to evaluation.

Tyler directed the Eight-Year Study (1932-1940), which assessed the outcomes of programs in 15 progressive high schools and 15 traditional high schools.

Time Period 4: The Age of Innocence (1946-1957)

Starting in the mid-1940s, Americans moved mentally beyond the war (World War II) and the Great Depression.

According to Madaus & Stufflebeam (1984), society experienced a period of great growth; there was an upgrading and expansion of educational offerings, personnel, and facilities.

Bloom, Engelhart, Furst, Hill, and Krathwohl (1956) advanced objective-based testing when they published the Taxonomy of Educational Objectives.
Facilities expanded, opening avenues to expand knowledge.

Knowledge branched from one discipline to another.

Time Period 5: Age of Development (1958-1972)

In 1957, the Russians' successful launch of Sputnik I sparked a national crisis.

As a result, legislation was passed to improve instruction in areas that were considered crucial to national defense and security.

In the early 1960s, another important factor in the development of evaluation was the emergence of criterion-referenced testing.

Much more advancement
Raw data for advancement
The emergence of norm-referenced and criterion-referenced tests

Time Period 6: The Age of Professionalization (1973-1983)

Before: Teaching as a job
Now: Teaching as a profession

During the 1970s, evaluation emerged as a profession.

A number of journals, including Educational Evaluation and Policy Analysis, Studies in Educational Evaluation, CEDR Quarterly, Evaluation Review, New Directions for Program Evaluation, Evaluation and Program Planning, and Evaluation News, were published.

Further, universities began to recognize the importance of evaluation by offering courses in evaluation methodology.

Time Period 7: The Age of Expansion and Integration (1983-Present)

In the early 1980s, evaluation struggled under the Reagan administration. Cutbacks in funding for evaluation took place, and an emphasis on cost cutting arose.

According to Weiss (1998), funding for new social initiatives was drastically cut. By the early 1990s, evaluation had rebounded.

The field expanded and became more integrated.
Professional associations were developed along with evaluation

Before ending his report, Rodrigo left the class with words to ponder:

"Sometimes it's the very people who no one imagines anything of who do the things that no one can imagine."
- Alan Turing
After Rodrigo's discussion, our professor discussed a few things. One is that memorizing is not bad; it is actually good. What students need is for information to be retained in their minds, and from there they can go on to understand and think about what the memorized material means. Our professor also gave us our tasks to finish, reminders for the next meeting, and notice of a quiz the week after that. That was the end of the class.
Testing can be very helpful if its use increases the learning and performance of children. This is why, as we have seen, the history of testing started very early and has grown from the testing of individual differences to almost all aspects of education and human life. Hence, there is hardly any aspect of life in which some form of measurement is not used. This is because tests form the best means of detecting characteristics in a reasonably objective fashion. They help us gain the kinds of information about learners and learning that we need to help students learn.


1. Researcher - Dyea C. Dolot and Maricris P. Libron
2. Keeper - Shekinah F. Amoyo

3. Manager - Jacqueline S. Flores
4. Synthesizer - Danilo B. Patrolla
5. Encoder - Camille P. Monderin
6. Presenter - Jy Allyra S. Boyo