Terence J. Crooks
reviews research that focuses primarily on the impact of various classroom evaluation practices on student learning activities and achievement. Nine specific areas of research are included. The third section reviews research on student motivation, examining the effects of different evaluation practices on motivation, and the consequences of resulting motivational tendencies for student learning. Five areas of motivational research are included. The final section draws together the major findings from the second and third sections and indicates the implications of these findings for the effective use of classroom evaluation in education.
The Nature, Role, and Impact of Classroom Evaluation: An Overview
This section includes three subsections. The first briefly summarizes the findings of research on existing patterns of classroom evaluation in elementary and secondary schools. The second discusses and categorizes the variables that are assessed through classroom evaluation. The third lists 17 specific ways (categorized as short, medium, or long term) in which classroom evaluation affects students.
Patterns of classroom evaluation. In the past few years, several research teams and individuals have examined classroom evaluation practices in elementary, junior high, and high schools in some detail (Dorr-Bremme & Herman, 1986; Fennessy, 1982; Fleming & Chambers, 1983; Gullickson, 1984, 1985; Gullickson & Ellwein, 1985; Haertel, 1986; Stiggins, 1985; Stiggins & Bridgeford, 1985; Stiggins, Conklin, & Bridgeford, 1986). Their findings are summarized below.
A substantial proportion of student time is involved in activities that are evaluated. In two studies (Dorr-Bremme & Herman, 1986; Haertel, 1986), tests occupied students for 5 to 15% of their time on average, with the lower figure being more typical for elementary school students and the higher figure for high school students. However, this was only the time spent on taking formal written tests. Much additional time is spent on other activities that are evaluated, formally or informally. Particular emphasis is placed on these nontest approaches at the elementary level (Gullickson, 1985).
A wide range of evaluative activities takes place in classrooms, with the pattern varying markedly at different grade levels and in different subject areas (Fennessy, 1982; Gullickson, 1985; Stiggins & Bridgeford, 1985). Activities include evaluation through teacher questioning and class or group discussion, marking or commenting on performances of various kinds, checklists, informal observation of learning activities, teacher-made written tests, and written exercises of various kinds (including projects, assignments, worksheets, text-embedded questions, and tests). Affective variables (e.g., aspects of motivation) are also assessed, usually in informal ways.
Teachers judge evaluative activities to be important aspects of teaching and learning and work at them accordingly, but are often concerned about the perceived inadequacies in their efforts (Gullickson, 1984; Stiggins & Bridgeford, 1985).
A substantial proportion of teachers have little or no formal training in educational measurement techniques, and many of those who do have such training find it of little relevance to their classroom evaluation activities (Gullickson, 1984; Gullickson & Ellwein, 1985; Haertel, 1986; Stiggins, 1985). This is especially true for elementary school teachers because of their heavy reliance on observation and other nontest means of evaluation. There are strong arguments for helping teachers to improve these nontest forms of evaluation (e.g., Shulman, 1980, pp. 69-70).
of description, in textbooks and courses, of routine exercises as "problems." At the other extreme, as Frederiksen and others have noted, many real problems are ill structured, but such problems are often avoided in our education systems because of their complexity and open-endedness.
Such terms as higher level questions, thinking skills, and problem solving are widely used in the research summarized here. In light of the discussion above, however, it is not surprising that there is much inconsistency in the way these terms have been defined (see, for instance, Carrier & Fautsch-Partridge, 1981, for a discussion of categories used in one research area). Careful attention to the particular definitions used in each research report is thus essential.
Several researchers have used coding schemes to analyze the cognitive levels of questions included in teacher-made tests, at grade levels ranging from elementary school to university (Ball et al., 1986; Black, 1968; Buckwalter et al., 1981; Crooks & Collins, 1986; Fleming & Chambers, 1983; Haertel, 1986; Milton, 1982; Rinchuse & Zullo, 1986; Stiggins, Griswold, Green, & associates, 1988). In general, these studies have revealed extensive use of questions at Bloom's lowest ("knowledge") level. For instance, after analyzing 8800 test questions from tests in 12 grade and subject area combinations (elementary to high school), Fleming and Chambers (1983) reported that almost 80% of all questions were at the knowledge level. Mathematics and French contributed most of the higher level items. Similarly, Haertel (1986) found that "classroom examinations often failed to reflect teachers' stated instructional objectives, frequently requiring little more than repetition of material presented in the textbook or class, or solution of problems much like those encountered during instruction" (p. 2).
This finding is not unexpected. Indeed, both proponents and critics of educational testing widely agree that teacher-made tests tend to give greater emphasis to lower cognitive levels than the teachers' stated objectives would justify. Several possible causes have been suggested. These include the difficulty of writing items (especially the widely used short answer and objective items) to assess comprehension and higher level skills, the greater ease with which teachers can defend their marking of questions involving recall or recognition and achieve tests with high reliability (Elton, 1982, pp. 115-116; Natriello, 1987, p. 158), and the belief of teachers that the use of higher level questions will result in confusion, anxiety, and significant levels of failure (Doyle, 1983, 1986). Nevertheless, this pattern is a cause for concern, both because it reduces the validity of teachers' evaluations of their students and because this review will present strong evidence that the use of higher level questions in evaluation enhances learning, retention, transfer, interest, and development of learning skills.
Other aspects of the case for reduced emphasis on testing recall and recognition of factual knowledge are presented by Broudy (1988), Cole (1986), DiSibio (1982), Ebel (1982), Glaser (1985), Linn (1983), Messick (1984a, 1984b), Quellmalz (1985), Rothkopf (1988), and Thorndike (1969). While they differ in focus and emphasis, they tend to agree that transfer is a very important quality of learning. Thorndike puts it particularly well:

The crucial indicator of a student's understanding of a concept, a principle, or a procedure is that he is able to apply it in circumstances that are different from those under which it was taught. Transferability is the key feature of meaningful
4. Influencing the students' self-perceptions, such as their perceptions of their self-efficacy as learners.

These effects have been listed very concisely here, but most of them will be discussed in considerable depth in the next two sections of this paper.
The Impact of Classroom Evaluation on Student Learning Activities and Achievement
This section consists of nine subsections, arranged in two groups. Each subsection presents a brief review of a particular field of research on classroom evaluation practices. Although motivational factors help explain some of the reported findings, and some of the evaluation arrangements discussed have marked effects on motivational and affective outcomes, the prime emphasis in this section is on how the implementation of classroom evaluation affects learning strategies and cognitive outcomes. Motivational influences and outcomes are more fully discussed in the next major section of this review.
The Impact of Normal Classroom Testing Practices
Effects related to expectations of what will be tested. The studying and learning practices of college students. Intensive research on the studying and learning approaches of college students over the past 20 years has identified consistent patterns in the learning strategies adopted by university students and in the relationships between these strategies and teaching arrangements (notably the evaluation approaches used). Although this research has been conducted with college students, the findings seem to have much wider application.
The research has been characterized by extensive use of interviews with students, although later researchers have developed questionnaires to gather data more economically. This research began in the United States with the sociological work of Becker, Geer, and Hughes (1968) and the insightful psychological investigations of the intellectual development of Harvard University students conducted by Perry (1970). Most of the more recent work, however, has been carried out in Europe and Australia. Marton, Saljo, and their colleagues in Sweden gave great impetus to this field with their work in the 1970s, and were the first to identify the patterns that have been verified repeatedly since then (although it should be noted that Perry's earlier work is highly related). This work has been extensively reviewed by Entwistle and Ramsden (1983), Ford (1981), Marton, Hounsell, and Entwistle (1984), Schmeck (1983), and Wilson (1981).
Marton and Saljo (1976a) reported that students' approaches to learning tasks could be categorized into two broad categories that they labeled as deep or surface approaches. Deep approaches involved an active search for meaning, underlying principles, structures that linked different concepts or ideas together, and widely applicable techniques. Surface approaches, in contrast, relied primarily on attempts to memorize course material, treating the material as if different facts and topics were unrelated. Similar categories have been found in many later studies (see Biggs, 1978; Entwistle & Ramsden, 1983; Marton, Hounsell, & Entwistle, 1984; Ramsden, 1985; and Watkins, 1984), although some researchers have identified subcategories within the surface and deep approaches (van Rossum, Deijkers, & Hamer, 1985).

After the initial study, follow-up studies by Marton and Saljo (1976b), Svensson
that students who generally use surface approaches have great difficulty adapting to evaluation requirements that favor deep approaches. On the other hand, these and other studies have demonstrated that students who on some occasions successfully use deep approaches can all too easily be persuaded to adopt surface approaches if evaluation or other factors suggest that these will be successful. For instance, if an examination consists entirely of detailed factual questions on lecture material, an effective strategy would be to attend all lectures, take detailed notes, and rely on last-minute cramming of the lecture notes in the days immediately before the examination (Crooks & Mahalski, 1986). Miller and Parlett (1974, p. 107) have suggested that such examinations may actually serve to clear from the student's memory the knowledge involved, rather than to strengthen it. Other research suggests this is unlikely, but certainly there is ample research to indicate that detailed factual knowledge decays rapidly unless it is used or restudied.
One interesting illustration of an apparent influence of curriculum and evaluation practices on students emerges from a study by Entwistle and Kozeki (1985). They examined the school motivation, approaches to studying, and attainment of high school students in Britain and Hungary. Using Entwistle's well-established Approaches to Studying Inventory, they identified substantial mean differences between British and Hungarian students on the deep and reproducing (surface) approach scales. Compared to the British students, the Hungarian students had higher scores on deep approach and lower scores on surface approach. They convincingly hypothesized that this reflected differences in teaching and examining in the two countries. As they interpreted it, the external examinations in Britain in the latter years of high school place a very heavy emphasis on the correct reproduction of information, and this influences the approaches adopted by both teachers and students. In Hungary, on the other hand, there has been a strong reaction against a former stress on rote learning in the schools, and the emphasis has recently been placed on attempting to foster creativity through helping students to think about relationships, with much reduced emphasis on factual knowledge or operation learning. If Entwistle and Kozeki's interpretation is correct, their findings are a vivid demonstration of the influence of what is emphasized and assessed in school on how students approach their learning.
On a smaller scale, Newble and Jaeger (1983) described the effects of a change in evaluation on students in a medical school. When ward ratings replaced an oral clinical examination, students found that ward ratings were almost always above the pass level. Given that their written theory examinations did produce failures, they started spending more time in the library and less in the wards. Instituting a different clinical examination shifted the balance back. Newble and Jaeger commented that the effect of the change was so great as to indicate that examinations may be the major factor influencing student learning in a medical school with a traditional curriculum. A number of similar examples are given by Milton (1982), in a book which critically analyzes college evaluation practices.

Ramsden, Beswick, and Bowden (1987) gave university students training intended to improve their learning skills, expecting the students to make more use of deep approaches as a result. They found, however, that the training actually led to an increase in use of surface approaches, because the training had made students more able to analyze the demands of their course evaluation procedures, which suited surface approaches (see also the comments of Schmeck, 1988, p. 180).
the two groups, with only the item format differing, differences between the groups were generally small (e.g., Hakstian, 1971, who used a mix of cognitive levels, and Kumar et al., 1979, who used only factual questions). Where statistically significant differences were found under these circumstances (e.g., d'Ydewalle et al., 1983; Meyer, 1935, 1936), they favored the group which prepared for a recall (as opposed to a recognition) task. The recall group tended to prepare more thoroughly and perform a little better.

On the whole, though, student expectations of the cognitive level and content of tasks probably exert much more influence on their study behavior and achievement than do their expectations of the task format (for given content and cognitive level). Thus I believe that there is no strong evidence from this research to support widespread adoption of any one item format or style of task. Instead, the basis for selecting item formats should be their suitability for testing the skills and content that are to be evaluated.
A few studies have examined the comparative merits of open book and closed book testing (see Boniface, 1985; Francis, 1982). These studies have shown that students tend to be less anxious about open book tests, and to prepare somewhat less thoroughly for them. Predictably, the students who rely most on using their notes and/or textbooks during the test tend to be among the lower achievers. Studies to date have demonstrated no clear benefit in levels of student achievement arising from open book tests. More research is needed, however, because most of the treatments have been very brief and thus have not allowed the students adequate opportunity to develop skills in handling the demands of open book tests. Also, more attention to the nature of the test is needed because the availability of resource materials is most likely to be meaningful and useful when tests are not speeded and consist of higher cognitive level questions.
Effects of frequency of testing. The substantial body of research on the effects on students of the frequency of classroom testing has been thoroughly reviewed in a meta-analysis by Bangert-Drowns, Kulik, and Kulik (1988). This review will draw heavily on their work.

The review by Bangert-Drowns et al. (1988) used data from 31 studies which: (a) were conducted in real classrooms, (b) had all groups receiving the same instruction except for varying frequencies of testing, (c) used conventional classroom tests, (d) did not have serious methodological flaws, and (e) used a summative end-of-course examination taken by all groups as a dependent variable. The course length ranged from 4 weeks to 18 weeks, but only 9 studies were of courses shorter than 10 weeks. Bangert-Drowns et al. reported their results in terms of effect size (difference in mean scores divided by the standard deviation of the less frequently tested group).
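Expressed as a formula (the subscripts here are illustrative labels, not notation from the original review), the effect size just defined is:

```latex
\[
\mathrm{ES} \;=\; \frac{\bar{X}_{\text{more frequent}} - \bar{X}_{\text{less frequent}}}
                       {s_{\text{less frequent}}}
\]
```

So an effect size of 0.25 indicates that the more frequently tested group scored, on average, a quarter of a standard deviation higher on the summative examination.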
Overall, they found an effect size of 0.25 favoring the frequently tested group, representing a modest gain in examination performance associated with frequent testing. However, the actual frequencies of testing varied dramatically, so the collection of studies was very heterogeneous. In 12 studies where the low frequency group received no testing prior to the summative examination, the effect size increased to 0.43. It seems reasonable to hypothesize that, in part, this large increase may have come about because students who had at least one experience of a test from the teacher before the summative examination were able to better judge what preparation would be most valuable for the summative examination. On average, effect sizes were smaller for longer treatments, probably because most longer
Natriello (1987) has suggested, there may well be a curvilinear relationship between the level of standards and student effort and performance, with some optimal level for each situation. This optimal level would probably depend on other aspects of the evaluation arrangements, such as whether or not students are given opportunities to get credit for correcting the deficiencies of evaluated work, or the nature of the feedback on their efforts. The weaker students, who are most at risk in high-demand classrooms, may need considerable practical support and encouragement if they are to avoid disillusionment.

Not surprisingly, Natriello and Dornbusch found that if students thought the evaluations of their work were not important or did not accurately reflect the level of their performance and effort, they were less likely to consider them worthy of effort. This conclusion is consistent with the results of research on student attributions of the reasons for success or failure in educational tasks (discussed later in this paper).
An important issue is whether the standards adopted are to be norm-referenced, criterion-referenced, or based on the effort and improvement of individual students (Natriello, 1987). This choice appears to differentially affect the motivation and learning of different categories of students. For instance, norm-referenced evaluation tends to undermine the learning and motivation of students who regularly score near the bottom of a class, while posing much less risk to the top students. No clear consensus emerges from the literature to date, but Natriello (1987) suggests that self-referenced standards may be optimal for most students. All students can improve their knowledge, skills, and attitudes and have this verified through evaluation, but only some can score above the class median on a measure.
When student performance on achievement tests is the criterion, research has generally shown that higher standards lead to higher performance (e.g., Rosswork, 1977), although again a curvilinear relationship may be predicted. Most of the relevant classroom-based research derives from studies of mastery learning, and these will be reviewed in a later section.
The Impact of Other Instructional Practices Involving Evaluation
Effects of adjunct questions in learning from text. In contrast to the research in the previous section, much of the research reviewed in this section has been conducted in laboratory settings. The findings of this research, however, converge with findings from research on the use of conventional tests in educational programs. Thus I believe that it is appropriate and valuable to include this extensive body of research in this review.

Adjunct questions are questions inserted before, during, or after a written passage that students are to study. Some studies have allowed the students to review earlier material after they encounter an adjunct question, whereas others have not permitted such looking back. The adjunct questions may be factual or higher level questions, although definitions of higher level vary markedly (Carrier & Fautsch-Partridge, 1981). Their effects have been studied by examining the pace and intensity of students' reading of portions of the passage, and by testing students in a variety of ways and at a variety of times on the content of the passage. These tests have looked at the students' grasp of the content or skill directly covered by the adjunct questions, their grasp of closely related material not directly addressed
questions decrease as the number of facts to be covered and the amount of searching required increases, whereas postquestions work well with long texts, especially if the number of adjunct questions is not too large.

The beneficial effects of factual adjunct questions are not due to greater study time of students receiving adjunct questions. Although it is true that the inclusion of adjunct questions tends to increase study time a little where study time is not controlled, the effect sizes from studies in which study time was controlled (identical for experimental and control groups) were generally higher than the effect sizes from studies in which study time was not controlled (Hamaker, 1986, Table IX).
Higher order adjunct questions. Studying the effects of higher order adjunct
toward active learning), that teachers need to practice phrasing questions in ways that communicate the task clearly, that the difficulty level should be such that the majority of questions receive satisfactory responses, and that responses to other than simple factual questions tend to be fuller and more appropriate if several seconds are allowed between question and response (see also M. B. Rowe, 1986). Feedback should include knowledge of results, but should make only limited use of praise (e.g., praise might be used mainly for correct responses from anxious or less capable students) and very little use of criticism.
Perhaps the most frequently researched aspect of teacher oral questions has been the cognitive level of the questions and the effects of different cognitive levels on student achievement. This has also been an area in which reviewers have reached markedly varied conclusions, although the reviewers have agreed that higher level questions are generally used much less than lower level questions (a ratio of 1 to 3 is typical of reported figures from research in school classrooms). Medley (1979) and Rosenshine (1979) both concluded that greater use of higher level questions led to lower student achievement. Winne (1979), in a review of relevant experimental studies, found no clear pattern of achievement change associated with greater use of higher level questions. Redfield and Rousseau (1981), however, used meta-analysis on a very similar collection of experimental studies and reported a mean effect size of 0.73 favoring use of higher cognitive level questions. More recently, Samson, Strykowski, Weinstein, and Walberg (1987) conducted another meta-analysis of experimental studies and found a mean effect size of 0.26 favoring use of higher level questions.
Several factors help to make sense of these contradictory findings (cf. Gall, 1984; Samson et al., 1987). First, studies in this area have been very inconsistent in their definitions of higher and lower level questions. Lower level has been defined to include the bottom one, two, or three categories from Bloom's taxonomy, and other taxonomies have also been used. Second, the difficulty of the questions has rarely been controlled, so that higher level questions may have been substantially more difficult on average than lower level questions, which could have reduced students' opportunity and motivation to learn effectively from these questions. Third, too little attention has been paid to the nature of the criterion achievement measures. The use of a criterion involving only factual recall or recognition could be predicted not to favor the use of higher level oral questions. For example, the studies reviewed by Medley (1979) and Rosenshine (1979) were predominantly conducted in junior elementary school classes with high proportions of disadvantaged children, where the teaching focused very much on basic knowledge and skills. These students may have had difficulty attending to and correctly interpreting the higher level questions, and the criterion measures used generally included few higher level questions. Fourth, many of the studies were of very brief duration. It could be predicted that higher cognitive level questions would be most effective when used consistently over substantial periods of time, especially if students had previously had little experience with such questions. This prediction is supported by an analysis included in the review by Samson et al. (1987). They found a mean effect size of 0.05 for 22 studies lasting 5 days or less, but a mean effect size of 0.83 for 4 studies lasting 20 days or more. Finally, it is interesting to note that the review by Samson et al. also reported markedly larger mean effect sizes in studies that were better designed (random assignment to treatments, sample size greater than
correct answers, thus helping students to "know what they know." There is very little evidence that such knowledge of correct responses acts by reinforcing the correct response, and indeed feedback on correct responses has little effect on subsequent performance, except perhaps in the special case where the student has grave doubts about the correctness of the initial answer.

The major benefit from feedback reported by Kulhavy is the identification of errors of knowledge and understanding, and assistance with correcting those errors. In most studies, such feedback clearly improved subsequent performance on similar questions. Feedback on incorrect responses has been shown to be most effective where the initial response was made with high confidence, probably because the student attends more to the feedback in such cases (due to the element of surprise and the initial desire to defend the correctness of the response).
It seems likely that the most effective form of feedback will depend on the correctness of the answer, the student's degree of confidence in the answer, and the nature of the task. If the answer is correct, simple confirmation of its correctness is sufficient. If the question was factual and the answer is incorrect, the most efficient form of feedback is probably simply to give the correct answer (Phye, 1979). If the question involves comprehension or higher cognitive skills, however, more detailed feedback is desirable. Students who answered such questions incorrectly with high confidence may need help to identify the source of their misunderstanding (Block & Anderson, 1975; Frederiksen, 1984b), whereas students who answered the question incorrectly with low confidence may need to be given conceptual help and advised to restudy the material.
There is little support from laboratory or classroom research for making praise a prominent part of feedback, but Page (1958) found that simple positive comments were beneficial, and harsh criticism is predictably counterproductive. Both the age and achievement level of the student may modify this conclusion: younger and less able students may benefit most from praise. Praise should be reserved for specific achievements that truly represent substantial accomplishments for the individual student. The motivational effects of different types of feedback are discussed in more detail in later sections of this paper.
Feedback can also play a very positive role in guiding students in their use of learning strategies (Pressley, Levin, & Ghatala, 1984). Pressley et al. found that explicit feedback on strategy use was especially valuable with young children, whereas adults who had tried several strategies and been tested on their learning were generally able to identify the most effective strategy.
The timing of feedback. Effects of the timing of feedback have received considerable attention. Kulik and Kulik (1988) used meta-analytic techniques to review 53 studies of the timing of feedback in verbal learning. They identified three different categories of study, finding quite different results for the three categories. A key factor that apparently influenced these differences was whether or not the criterion test questions were identical to the earlier feedback questions. Where different questions were used, most studies found a small advantage for immediate feedback (the mean effect size for 11 studies was 0.28). Where identical questions were used (e.g., Kulhavy & Anderson, 1972), however, most studies found a modest advantage for delayed feedback (the mean effect size for 14 studies was -0.36). Kulhavy and Anderson (1972) suggested that this effect arose because the memory of incorrect responses made during acquisition interfered with the learning of the
effect sizes were 0.36 and 0.67, suggesting that a major component of the effectiveness of mastery testing arises from the additional feedback that it usually provides.

The other statistically significant difference was between studies at varying levels of mastery criterion. In 17 studies where the criterion level for mastery was a score of 91% or higher on unit tests, the mean effect size was 0.73; in 15 studies with a criterion level of 81 to 90%, the mean effect size was 0.51; and in 17 studies with a criterion level below 81% the mean effect size was 0.38. This is a strong effect, demonstrating that under mastery testing conditions a higher criterion level generally produces greater learning (assessed on an end-of-course examination).

Thus the results of research on mastery testing suggest that the sizeable benefits observed largely represent the combined effects of the benefits described in earlier sections from more frequent testing, from giving detailed feedback on their progress on a regular basis, and from setting high but attainable standards. One further effect that is probably important is the benefit of allowing repeated opportunities to attain the standards set. This feature might have considerable benefits in increasing motivation and a sense of self-efficacy, while reducing the anxiety often associated with one-shot testing (Friedman, 1987). Kulik and Kulik (1987) reach a similar conclusion to Abbott and Falstrom (1977): the other features often included in courses based on mastery learning models do not appear to add significantly to the effects described above.
As in the section on frequency of testing, some caution must be expressed about the generalizability of the findings on mastery testing because the cognitive levels of the tests and examinations were not analyzed. Different effects may occur for courses and tests that heavily emphasize higher cognitive level outcomes, especially in relation to the benefits of more frequent testing. The benefits of feedback, of opportunities for extra attempts at tasks initially handled poorly, and of challenging standards seem more likely to apply to evaluation tasks at all cognitive levels.
Effects of competitive, individualistic, and cooperative learning structures. Many studies have examined the effects of different classroom learning and goal structures on students. In particular, considerable attention has been given to the effects and comparative merits of competitive, individualistic, and cooperative learning structures. In competitive structures, the success or failure of students is largely determined by their performance relative to other students. In individualistic structures, students are rewarded on the basis of their own work, independent of the work of other students. In cooperative structures, students work together in groups, and judgments of success are based on the overall achievements of each group. Ames (1984) has classified these situations according to the pattern of interdependence among students. Competitive structures involve negative interdependence because success for one student reduces the chances that other students will succeed. In individualistic structures, there is no interdependence among students. Finally, in cooperative structures, there is positive interdependence among students, since success for one student assists the success of all members of that student's group.
Effects on cognitive outcomes. Johnson, Maruyama, Johnson, Nelson, and Skon
(1981) conducted a meta-analysis of 122 studies that examined the comparative
and should thus probably be regarded as tentative. The 10 studies in which individual rewards were given based on individual performance showed no advantage for cooperative study over noncooperative approaches.
Slavin concluded that the use of group rewards based on the individual performance of group members is essential to the effectiveness of cooperative learning methods. Such a strong conclusion may not be justified on the basis of the data he reported, but this incentive structure does appear to be beneficial to group learning (see also Lew, Mesch, Johnson, & Johnson, 1986).
Effects on social outcomes. One widely cited benefit of cooperative learning structures is that they lead to increased cohesiveness among the students involved (Johnson, Johnson, & Maruyama, 1983; Slavin, 1983a). This can be especially beneficial in classes that are diverse in ethnic composition or ability level, or that include mainstreamed handicapped students. Johnson et al. (1983) conducted a meta-analysis of 98 studies of cooperative learning, with interpersonal attraction as the dependent variable. They found little difference between competitive and individualistic structures, but students in cooperative structures scored substantially higher in mean interpersonal attraction. Where the cooperative groups were not competitive with each other, the effect size was 1.11 (compared both to competitive and to individualistic structures). Where there was competition between groups, the mean effect size was smaller (0.79 compared to individualistic structures, 0.55 compared to competitive structures). Clearly, structures that encourage cooperation among students can have substantial beneficial effects on social relationships among students.
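The effect sizes cited above are standardized mean differences, the metric commonly used in meta-analyses of this kind: the gap between two group means expressed in pooled standard-deviation units. A minimal sketch of the computation, with invented numbers rather than data from any of the studies reviewed:

```python
import math

def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized mean difference between two groups, using a pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

# Invented interpersonal-attraction scores for a cooperative vs. an
# individualistic classroom (illustration only, not data from the studies).
d = cohens_d(mean_a=4.1, mean_b=3.2, sd_a=0.9, sd_b=0.9, n_a=30, n_b=30)
print(round(d, 2))  # a standardized difference of about 1.0
```

On this scale, the 1.11 reported for noncompetitive cooperative groups means the cooperative-group mean sat more than one full standard deviation above the comparison-group mean.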
Astin (1987) discussed the benefits of cooperative learning in higher education. Among other things, he emphasized that a key benefit could be an enhanced sense of mutual trust, both among students and between students and teacher. He noted that in competitive learning situations, students often work very hard to disguise their ignorance (from peers and from their teacher). This limits the availability and effectiveness of feedback, thus undermining learning. Astin sees cooperative structures helping to overcome this problem, while fostering interpersonal skills that are greatly needed in the community.
Motivational Aspects Relating to Classroom Evaluation
Research has repeatedly demonstrated that the responses of individual students to educational experiences and tasks are complex functions of their abilities and personalities, their past educational experiences, and their current attitudes, self-perceptions, and motivational states, together with the nature of the current experiences and tasks. Effective education requires the fusing of "skill and will" (Paris, 1988; Paris & Cross, 1983), and intrinsic interest and continuing motivation to learn are educational outcomes that should be regarded as at least as important as cognitive outcomes (Maehr, 1976; Paris, 1988). The importance of motivational factors has been vigorously stated by Howe (1987):
I have a strong feeling that motivational factors are crucial whenever a person
achieves anything of significance as a result of learning and thought, and I cannot
think of exceptions to this statement. That is not to claim that a high level of
motivation can ever be a sufficient condition for human achievements, but it is
undoubtedly a necessary one. And, conversely, negative motivational influences,
such as fear of failure, feelings of helplessness, lack of confidence, and having the
cognitive and metacognitive learning strategies, that they may use poor test-taking strategies, or that they may be particularly prone to distracting thoughts while taking a test (such as thoughts about failure or about difficult items yet to be completed). These proposed mechanisms are clearly not mutually exclusive. The first (weak learning strategies) would not explain the findings reported in the last paragraph, so it is not a sufficient explanation by itself. However, it does have empirical support (Naveh-Benjamin, McKeachie, & Lin, 1987). The other two mechanisms are more specific to the testing situation, and both have empirical support.
Several guidelines have been suggested for reducing the debilitating effects of test anxiety in classroom evaluation programs (Hill, 1984; Hill & Wigfield, 1984). These include: testing under "power" testing conditions (very generous time limits, so no student feels under significant time pressure); avoiding distinctive and stressful testing conditions; giving the students ample details of the nature, difficulty, and format of the test (with practice examples); setting tasks that allow each student a reasonable level of success; reducing emphasis on social comparison (Hill & Wigfield suggest avoiding the use of letter grades in elementary schools); and providing special training for students who may be victims of test anxiety.
Student self-efficacy. Self-efficacy, as defined by Bandura (1977, 1982), refers to students' perceptions of their capability to perform certain tasks or domains of tasks. Research on the role of self-efficacy in achievement behavior and classroom learning has been reviewed by Schunk (1984, 1985). Perceptions of self-efficacy in an area have been shown to correlate highly with achievement in that area. For instance, in a recent study by Thomas, Iventosch, and Rohwer (1987), self-efficacy was found to be a better predictor of school achievement than their selected measure of academic ability. They also found that students with high self-efficacy tended to make more use of deeper learning strategies (generative and selective activities) than other students did.
Perceptions of self-efficacy appear to have a strong influence on effort and persistence with difficult tasks, or after experiences of failure (Bandura, 1982; Schunk, 1984, 1985). Under such circumstances, students high in self-efficacy usually redouble their efforts, whereas students low in self-efficacy tend to make minimal efforts or avoid such tasks.
The main mechanism for building self-efficacy in a particular domain appears to be experiencing repeated success on tasks in that domain. Success at tasks perceived as difficult or challenging is more influential than success on easier tasks. On the other hand, of course, repeated failure leads to lowered self-efficacy. More than 40 years ago, E. L. Thorndike began a paper with these words:
It is a matter of common knowledge that a mind which for any reason becomes engaged in an activity and finds itself repeatedly and persistently failing therein, is impelled to intermit or abandon it. The person does abandon it unless this impulsion is counterbalanced by some contrary force, such as the hope of a turn of the tide toward success, or an inner sense of worth from maintaining the activity, or a fear that worse will befall him if he stops. (Thorndike & Woodyard, 1934, p. 241)
To foster self-efficacy, evaluations of task performance should emphasize performance (task mastery) rather than task engagement (Schunk, 1984). Thus, for
the findings of three studies. Lepper, Greene, and Nisbett (1973) found that students who had previously chosen to engage in an activity voluntarily, with apparent enjoyment, were less inclined to return to that activity after they had received a reward from a teacher for engaging in the activity. Maehr and Stallings (1972) studied students performing easy or hard tasks under extrinsic or intrinsic motivation conditions. They found that students who worked under the intrinsic motivation condition continued to be interested in working on difficult tasks, whereas students who worked under the extrinsic motivation condition lost interest in attempting difficult tasks, preferring to attempt only easy ones (see also Hughes, Sullivan, & Mosley, 1985). Finally, Condry and Chambers (1978) found that students in their extrinsic motivation group were more answer oriented, trying to take shortcuts to produce the desired answers, whereas students in the intrinsic motivation group tended to use deeper, more meaningful approaches to understanding the tasks.
These and other studies have repeatedly shown that where students are initially intrinsically motivated, attempting to stimulate learning through extrinsic motivation usually leads to decreased intrinsic motivation, especially on challenging tasks. Such a result is clearly not desirable. On the other hand, where students initially lack intrinsic motivation in a particular subject area, research reported in the last section suggests that a carefully planned program of positive educational experiences accompanied by extrinsic motivation can lead to the development of interest in the area, and thus to intrinsic motivation. Unfortunately, however, there is strong evidence that in most education systems such gains are usually outweighed by the losses. Many observers have commented on the contrast between the broad enthusiasm for learning demonstrated by most children in the first year or two of schooling and the jaded approach of many older students. Although some of this difference may relate to developmental factors, it is hard to escape the conclusion that for many students schooling tends to lower rather than increase interest in learning.
It is important to note that classroom evaluation procedures need not have the debilitating effects on intrinsic motivation noted above. Deci (1975) and others (Keller, 1983; Ryan, Connell, & Deci, 1985) have noted that the key factor seems to be whether students perceive the primary goal of the evaluation to be controlling their behavior or providing informative and helpful feedback on their progress in learning. Evaluation can be used as a bludgeon to make students learn, and in the short term this may produce significant learning, but the longer term consequences of such an approach appear to be most undesirable, especially for the less able students.
Attributions for success and failure. Extensive research has demonstrated that student self-perceptions of the factors influencing success or failure in learning tasks have a very significant influence on their motivation and behavior. Such attributions for success or failure are central to Weiner's theory of achievement motivation (Weiner, 1979, 1985, 1986), and many other researchers on motivation have also stressed their importance. Research on student attributions has been reviewed by Covington (1984, 1985), Dweck and Elliott (1983), Nicholls (1983, 1984), Paris and Cross (1983), and Weiner (1985, 1986), among others.
Weiner (1979) stated that success or failure could be attributed to four possible causes: ability, effort, luck, or task difficulty. The first two of these are internal to
able tasks, some individualization of tasks, use of tasks that are more intrinsically motivating or more gamelike in nature, opportunities for student autonomy in learning, little use of ability groups, use of cooperative learning approaches, provision of unambiguous performance feedback that emphasizes mastery and progress (rather than normative comparisons), and little emphasis on summative grading (Covington, 1985; Johnston & Winograd, 1985; Maehr, 1983; Nicholls, 1983; Rosenholtz & Simpson, 1984). Under such conditions, failure at a task is more likely to be constructive than destructive (Clifford, 1984). If such conditions could be fostered, perceived ability stratification would be reduced, with consequent reductions in the serious differential changes of self-esteem that occur from about the age of 10 (Kifer, 1977).
Motivational aspects of competitive, individualistic, and cooperative learning structures. Research on motivational aspects of competitive, individualistic, and cooperative task and incentive structures has been reviewed by Ames (1984), Johnson and Johnson (1985), and Slavin (1987). The motivational effects of competitive structures have been discussed in earlier sections, but will be briefly summarized here. Social comparison (norm referencing) is central to competitive structures. This tends to result in severe discouragement for the students who have few academic successes in competition with their peers. It discourages students from helping each other with their academic work, and also threatens peer relationships, encouraging an "us and them" mentality which tends to segregate the higher and lower achieving students (Deutsch, 1979). It does not encourage intrinsic motivation. Finally, it tends to encourage students to attribute success and failure to ability rather than to effort, which is especially harmful for the weaker students.
In individualistic structures, rewards are based on criterion-referenced evaluation. If all students are evaluated on the same tasks, using the same standards, this can simply become another type of competitive structure (Ames, 1984), but at least there is some possibility of all students meeting specified passing standards. The provision of repeated opportunities to meet the standards can be a key factor in reducing the competitiveness of such individualistic structures. If, on the other hand, students' programs of work are more individualized, and the emphasis in evaluation is placed on each student's progress in learning, competitiveness is minimized. Under these circumstances, students are more inclined to help each other, and success and failure on a task are more likely to be attributed to effort rather than to ability. This, in turn, generates conditions that support intrinsic motivation.
Cooperative structures encourage helping and within-group tutoring behaviors, especially when group rewards are based on the performance of all the individual group members. Webb (1985, 1988) has identified the giving or receiving of elaborated explanations as a key factor in student learning within groups, so conditions that favor such activities are desirable. Participation in cooperative learning tends to moderate the positive or negative influence of a student's own high or low performance, tempering both negative and positive self-perceptions resulting from performance, and reducing performance anxiety (Ames, 1984). This can help build both self-esteem and achievement for previously low-achieving students, especially if their group is successful reasonably consistently. Effort attributions are encouraged, partly because the different groups are usually com-
(see also Bloom, 1986; Bok, 1986; Cronbach, 1988; Lowell, 1926; Whitehead, 1929). This requires that we place emphasis on understanding, transfer of learning to untaught problems or situations, and other thinking skills, evaluating the development of these skills through tasks that clearly must involve more than recognition or recall.
These skills take time to develop, however, and are particularly difficult for some students (Lohman, 1986; Thomas, Iventosch, & Rohwer, 1987), so it is important that they be given steadily increasing emphasis from the earliest years of schooling. By the time students are in the upper grade levels or in college, there is a good case for arguing that factual knowledge should be subsumed under higher level objectives, so that students are expected to use factual knowledge in solving a problem or carrying out a process, but are not tested directly on their ability to recall the information.
Evaluation to assist learning. Too much emphasis has been placed on the grading function of evaluation, and too little on its role in assisting students to learn. The integral role of evaluation in teaching and learning needs to be grasped, and its certification function placed in proper perspective. It is hard to see any justification before the final year or so of high school for placing much emphasis on using classroom evaluation for normative grading of student achievement, given the evidence reviewed here that normative grading (with the social comparison and interstudent competition that accompany it) produces undesirable consequences for most students.
These undesirable effects include reduction of intrinsic motivation, debilitating evaluation anxiety, ability attributions for success and failure that undermine student effort, lowered self-efficacy for learning in the weaker students, reduced use and effectiveness of feedback to improve learning, and poorer social relationships among the students. Grading on a fixed curve is especially inappropriate because it emphasizes a comparative approach to grading particularly strongly. Strong emphasis on the grading function of evaluation has also led to overuse of features normally associated with standardized testing, such as very formal testing conditions, speeded tests with strict time limits, a restricted range of item types, and emphasis on the overall score rather than on what can be learned about strengths and weaknesses. These may be appropriate in psychological testing, but are rarely appropriate in educational testing (Wood, 1986).
Much of the evaluation activity in education might more profitably be directed solely to giving useful feedback to students, whereas the less frequent evaluations for summative purposes should focus on describing what students can or can't do (i.e., should be criterion referenced). The likely small reduction in reliability associated with counting fewer evaluations in the summative evaluation would be a modest penalty to pay for the benefits described above and the improved validity associated with greater emphasis on final competence (rather than on the mistakes made along the way).
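The likely size of that reliability penalty can be gauged with the Spearman-Brown formula, a standard psychometric result (the article itself does not invoke it, and the figures below are invented): a composite of k parallel assessments, each with reliability r, has reliability kr / (1 + (k - 1)r).

```python
def spearman_brown(r_single, k):
    """Reliability of a composite of k parallel measures,
    each with reliability r_single (Spearman-Brown formula)."""
    return k * r_single / (1 + (k - 1) * r_single)

# Hypothetical classroom where each assessment has reliability 0.60.
# Basing the summative grade on 4 assessments instead of 10 lowers
# composite reliability only from about 0.94 to about 0.86.
for k in (10, 4):
    print(k, round(spearman_brown(0.60, k), 2))
```

On these invented figures the drop is modest, which is consistent with the "likely small reduction in reliability" the text anticipates.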
Effective feedback. There are several ways in which the effectiveness of feedback could be enhanced. First, feedback is most effective if it focuses students' attention on their progress in mastering educational tasks. Such emphasis on personal progress enhances self-efficacy, encourages effort attributions, and reduces attention to social comparison. The approach that leads to the most valuable feedback is nicely captured by Easley and Zwoyer (1975):
Frequency of evaluation. Students should be given regular opportunities to
practice and use the skills and knowledge that are the goals of the program, and to
obtain feedback on their performance. Such evaluation fosters active learning,
consolidation of learning, and if appropriately arranged can also provide the
retention benefits associated with spaced practice. Much of this evaluation can be
quite informal, however, and certainly does not need to be conducted under testlike conditions. For higher level outcomes, in particular, it seems likely that too
much formal evaluation may be as bad as too little because conceptual understanding and skills do not develop overnight.
Selection of evaluation tasks. The nature and format of evaluation tasks should
be selected to suit the goals that are being assessed. In most courses this will lead
to substantial variety in tasks, with benefits in versatility of approach and development of transfer skills (Elton, 1982). If it is not inconsistent with program
objectives, students could be given some choice of tasks to be attempted. This
stimulates and takes advantage of intrinsic motivation, and helps provide suitable
challenges for all students.
What is evaluated. The most vital of all the messages emerging from this review
is that as educators we must ensure that we give appropriate emphasis in our
evaluations to the skills, knowledge, and attitudes that we perceive to be most
important. Some of these important outcomes may be hard to evaluate, but it is
important that we find ways to assess them. Cross (1987) sums up this point very
clearly:
It serves no useful purpose to lower our educational aspirations because we cannot
yet measure what we think is important to teach. Quite the contrary, measurement
and assessment will have to rise to the challenge of our educational aspirations. (p.
6)
Contemporary Educational Psychology, 2, 251-257.
Ames, C. (1984). Competitive, cooperative, and individualistic goal structures: A cognitivemotivational analysis. In R. E. Ames & C. Ames (Eds.), Research on motivation in
Plake & J. C. Witt (Eds.), Buros-Nebraska Symposium on Measurement and Testing: Vol. 2. The future of testing. Hillsdale, NJ: Erlbaum.
Condry, J. C., & Chambers, J. (1978). Intrinsic motivation and the process of learning. In M.
Psychologist, 30, 116-127.
Cronbach, L. J. (1988). Five perspectives on validity argument. In H. Wainer & H. I. Braun
(Eds.), Test validity. Hillsdale, NJ: Erlbaum.
Deci, E. L. (1975). Intrinsic motivation and self-determination in human behavior. New York: Irvington.
American Psychologist, 34, 391-401.
Dillon, J. T. (1982). Cognitive correspondence between question/statement and response. American Educational Research Journal, 19, 540-551.
DiSibio, M. (1982). Memory for connected discourse: A constructivist view. Review of Educational Research, 52, 149-174.
Francis, J. (1982). A case for open-book examinations. Educational Review, 34, 13-26.
Fredericksen, N. (1984a). The real test bias: Influences of testing on teaching and learning. American Psychologist, 39, 193-202.
Fredericksen, N. (1984b). Implications of cognitive theory for instruction in problem solving.
Educational Research, 77, 244-248.
Gullickson, A. R. (1985). Student evaluation techniques and their relationship to grade and
Research, 55, 23-46.
Haertel, E. (1986, April). Choosing and using classroom tests: Teachers' perspectives on assessment. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Hakstian, R. (1971). The effects of type of examination anticipated on test preparation and
Research, 56, 212-242.
Hamilton, R. J. (1985). A framework for the evaluation of the effectiveness of adjunct questions and objectives. Review of Educational Research, 55, 47-85.
Harter, S. (1985). Competence as a dimension of self-evaluation: Toward a comprehensive
model of self-worth. In R. Leahy (Ed.), The development of the self. New York: Academic
Press.
Hill, K. T. (1984). Debilitating motivation and testing: A major educational problem - possible solutions and policy applications. In R. E. Ames & C. Ames (Eds.), Research on
Entwistle (Eds.), The experience of learning. Edinburgh: Scottish Academic Press.
Lepper, M. R., Greene, D., & Nisbett, R. E. (1973). Undermining children's intrinsic interest with extrinsic rewards: A test of the "overjustification" hypothesis. Journal of Personality and Social Psychology, 28, 129-137.
Levin, J. R. (1982). Pictures as prose-learning devices. In A. Flammer & W. Kintsch (Eds.), Advances in psychology: Vol. 8. Discourse processing. Amsterdam: North-Holland.
Lew, M., Mesch, D., Johnson, D. W., & Johnson, R. (1986). Positive interdependence, academic and collaborative-skills group contingencies, and isolated students. American Educational Research Journal, 23, 476-488.
Linn, R. L. (1983). Testing and instruction: Links and distinctions. Journal of Educational Measurement, 20, 179-189.
Lohman, D. F. (1986). Predicting mathemathanic effects in the teaching of higher-order thinking skills. Educational Psychologist, 21, 191-208.
Lowell, A. L. (1926). The art of examination. Atlantic Monthly, 137, 58-66.
Madaus, G. F., & Airasian, P. W. (1977). Issues in evaluating student outcomes in competency-based graduation programs. Journal of Research and Development in Education, 10, 79-91.
Madaus, G. F., & McDonagh, J. T. (1979). Minimum competency testing: Unexamined assumptions and unexplored negative outcomes. In R. T. Lennon (Ed.), Impactive changes in measurement (New Directions for Testing and Measurement, No. 3). San Francisco: Jossey-Bass.
Maehr, M. L. (1976). Continuing motivation: An analysis of a seldom considered educational outcome. Review of Educational Research, 46, 443-462.
Maehr, M. L. (1983). Doing well in science: Why Johnny no longer excels; why Sarah never did. In S. G. Paris, G. M. Olson, & H. W. Stevenson (Eds.), Learning and motivation in the classroom. Hillsdale, NJ: Erlbaum.
Maehr, M. L., & Stallings, W. M. (1972). Freedom from external evaluation. Child Development, 43, 177-185.
Martin, E., & Ramsden, P. (1987). Learning skills and skill in learning. In J. T. E. Richardson, M. W. Eysenck, & D. W. Piper (Eds.), Student learning: Research in education and cognitive psychology. Milton Keynes, England: Open University Press & Society for Research into Higher Education.
Marton, F., Hounsell, D. J., & Entwistle, N. J. (Eds.). (1984). The experience of learning. Edinburgh: Scottish Academic Press.
Marton, F., & Saljo, R. (1976a). On qualitative differences in learning: 1. Outcome and process. British Journal of Educational Psychology, 46, 4-11.
Marton, F., & Saljo, R. (1976b). On qualitative differences in learning: 2. Outcome as a function of the learner's conception of the task. British Journal of Educational Psychology, 46, 115-127.
Mathews, J. (1980). The uses of objective tests (Teaching in Higher Education Series, No. 9). England: Lancaster University. (ERIC Document Reproduction Service No. ED 230 106)
Mayer, R. E. (1975). Forward transfer of different reading strategies evoked by testlike events in mathematics text. Journal of Educational Psychology, 67, 165-169.
McCombs, B. L. (1984). Processes and skills underlying continuing intrinsic motivation to learn: Toward a definition of motivational skills training interventions. Educational Psychologist, 19, 199-218.
McGlynn, R. P. (1982). A comment on the meta-analysis of goal structures. Psychological Bulletin, 92, 184-185.
McKeachie, W. J. (1974). The decline and fall of the laws of learning. Educational Researcher, 3, 7-11.
McKeachie, W. J. (1984). Does anxiety disrupt information processing or does poor information processing lead to anxiety? International Review of Applied Psychology, 33, 187-203.
Naveh-Benjamin, M., McKeachie, W. J., & Lin, Y.-G. (1987). Two types of test-anxious students: Support for an information processing model. Journal of Educational Psychology, 79, 131-136.
Newble, D. I., & Jaeger, K. (1983). The effect of assessments and examinations on the learning of medical students. Medical Education, 17, 25-31.
Nicholls, J. G. (1983). Conceptions of ability and achievement motivation: A theory and its implications for education. In S. G. Paris, G. M. Olson, & H. W. Stevenson (Eds.), Learning and motivation in the classroom. Hillsdale, NJ: Erlbaum.
Nicholls, J. G. (1984). Achievement motivation: Conceptions of ability, subjective experience, task choice, and performance. Psychological Review, 91, 328-346.
Nungester, R. J., & Duchastel, P. C. (1982). Testing versus review: Effects on retention. Journal of Educational Psychology, 74, 18-22.
O'Neill, M., Razor, R. A., & Bartz, W. R. (1976). Immediate retention of objective test answers as a function of feedback complexity. Journal of Educational Research, 70, 72-75.
Page, E. B. (1958). Teacher comments and student performance: A seventy-four classroom experiment in school motivation. Journal of Educational Psychology, 49, 173-181.
Paris, S. G. (1988, April). Fusing skill and will in children's learning and schooling. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.
Paris, S. G., & Cross, D. R. (1983). Ordinary learning: Pragmatic connections among children's beliefs, motives, and actions. In J. Bisanz, G. Bisanz, & R. Kail (Eds.), Learning in children (pp. 137-169). New York: Springer-Verlag.
Perry, W. G. (1970). Forms of intellectual and ethical development in the college years: A scheme. New York: Holt, Rinehart and Winston.
Phye, G. D. (1979). The processing of informative feedback about multiple-choice test performance. Contemporary Educational Psychology, 4, 381-394.
Pressley, M., Levin, J. R., & Ghatala, E. S. (1984). Memory strategy monitoring in adults and children. Journal of Verbal Learning and Verbal Behavior, 23, 270-288.
Quellmalz, E. S. (1985). Needed: Better methods for testing higher-order thinking skills. Educational Leadership, 43(2), 29-35.
Ramsden, P. (1984). The context of learning. In F. Marton, D. J. Hounsell, & N. J. Entwistle (Eds.), The experience of learning. Edinburgh: Scottish Academic Press.
Ramsden, P. (1985). Student learning research: Retrospect and prospect. Higher Education Research and Development, 4, 51-69.
Ramsden, P., Beswick, D., & Bowden, J. (1987). Learning processes and learning skills. In J. T. E. Richardson, M. W. Eysenck, & D. W. Piper (Eds.), Student learning: Research in education and cognitive psychology. Milton Keynes, England: Open University Press & Society for Research into Higher Education.
Ramsden, P., & Entwistle, N. J. (1981). Effects of academic departments on students' approaches to studying. British Journal of Educational Psychology, 51, 368-383.
Redfield, D. L., & Rousseau, E. W. (1981). A meta-analysis of experimental research on teacher questioning behavior. Review of Educational Research, 51, 237-245.
Rickards, J. P., & Friedman, F. (1978). The encoding versus the external storage hypothesis in notetaking. Contemporary Educational Psychology, 3, 136-143.
Rinchuse, D. J., & Zullo, J. (1986). The cognitive level demands of a dental school's predoctoral didactic examinations. Journal of Dental Education, 50, 167-171.
Rogers, E. M. (1969). Examinations: Powerful agents for good or ill in teaching. American Journal of Physics, 37, 954-962.
Rohm, R. A., Sparzo, F. J., & Bennett, C. M. (1986). College student performance under repeated testing and cumulative testing conditions: Reports on five studies. Journal of Educational Research, 80, 99-104.
Rohwer, W. D., & Thomas, J. W. (1987). The role of mnemonic strategies in study effectiveness: Theories, individual differences, and applications. In M. A. McDaniel & M. Pressley (Eds.), Imagery and related mnemonic processes. New York: Springer-Verlag.
Rosenholtz, S. J., & Simpson, C. (1984). Classroom organization and student stratification. Elementary School Journal, 85, 21-37.
Rosenshine, B. (1979). Content, time, and direct instruction. In P. L. Peterson & H. J. Walberg (Eds.), Research on teaching. Berkeley, CA: McCutchan.
Rosenshine, B., & Stevens, R. (1986). Teaching functions. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 376-391). New York: Macmillan.
Rosswork, S. G. (1977). Goal setting: The effects of an academic task with varying magnitudes of incentive. Journal of Educational Psychology, 69, 710-715.
Rothkopf, E. Z. (1988). Perspectives on study skills training in a realistic instructional economy. In C. E. Weinstein, E. T. Goetz, & P. A. Alexander (Eds.), Learning and study strategies: Issues in assessment, instruction, and evaluation. San Diego, CA: Academic Press.
Rowe, D. W. (1986). Does research support the use of "purpose questions" on reading comprehension tests? Journal of Educational Measurement, 23, 43-55.
Rowe, M. B. (1986). Wait time: Slowing down may be a way of speeding up! Journal of Teacher Education, 37(1), 43-50.
Rudman, H. E., Kelley, J. L., Wanous, D. S., Mehrens, W. A., Clark, C. M., & Porter, A. C. (1980). Integrating assessment with instruction: A review (1922-1980) (Research Series No.
skills through classroom assessment. Paper presented at the annual meeting of the National
Council on Measurement in Education, New Orleans.
Strang, H. R., & Rust, J. O. (1973). The effect of immediate knowledge of results and task
definition on multiple-choice answering. Journal of Experimental Education, 42, 77-80.
Svensson, L. (1977). On qualitative differences in learning: 3. Study skill and learning. British
Journal of Educational Psychology, 47, 233-243.
Terry, P. W. (1933). How students review for objective and essay tests. Elementary School
Journal, 33, 592-603.
Thomas, J. W., Iventosch, L., & Rohwer, W. D. (1987). Relationships among student characteristics, study activities, and achievement as a function of course characteristics. Contemporary Educational Psychology, 12, 344-364.
Thomas, J. W., & Rohwer, W. D. (1986). Academic studying: The role of learning strategies. Educational Psychologist, 21, 19-41.
Thorndike, E. L., & Woodyard, E. (1934). The influence of the relative frequency of success
and frustrations upon intellectual achievement. Journal of Educational Psychology, 25,
241-250.
Thorndike, R. L. (1969). Helping teachers use tests. NCME Measurement in Education, 1(1),
1-4.
Tobias, S. (1985). Test anxiety: Interference, defective skills, and cognitive capacity. Educational Psychologist, 20, 135-142.
van Rossum, E. J., Diejkers, R., & Hamer, R. (1985). Students' learning conceptions and
their interpretation of significant educational concepts. Higher Education, 14, 617-641.
van Rossum, E. J., & Schenk, S. M. (1984). The relationship between learning conception, study strategy, and learning outcome. British Journal of Educational Psychology, 54, 73-83.
Watkins, D. (1984). Students' perceptions of factors influencing tertiary learning. Higher
Author
TERENCE J. CROOKS, Senior Lecturer, Director, Higher Education Development Centre,
University of Otago, P.O. Box 56, Dunedin, New Zealand. Specializations: improvement
of tertiary education, research design, measurement and evaluation.