ScholarlyCommons
CPRE Policy Briefs
Consortium for Policy Research in Education (CPRE)
5-2006
Anthony Milanowski
Steven Kimball
Allan Odden
Recommended Citation
Heneman III, Herbert G.; Milanowski, Anthony; Kimball, Steven; and Odden, Allan. (2006). Standards-Based Teacher Evaluation as a
Foundation for Knowledge- and Skill-Based Pay. CPRE Policy Briefs.
Retrieved from http://repository.upenn.edu/cpre_policybriefs/33
The Teacher Compensation Group of the Consortium for Policy Research in Education (CPRE) has been
studying the design and effectiveness of such systems for nearly a decade. We initially focused on school-based
performance award programs, in which each teacher in a school receives a bonus for meeting or exceeding
schoolwide student achievement goals (Heneman, 1998; Heneman & Milanowski, 1999; Kelley, Heneman, &
Milanowski, 2002; Kelley, Odden, Milanowski, & Heneman, 2000). We then shifted our attention to
knowledge- and skill-based pay (KSBP) plans, an approach that provides teachers with base pay increases for
the acquisition and demonstration of specific knowledge and skills thought to be necessary for improving
student achievement.
Our initial research described a variety of experiments with KSBP plans (see Odden, Kelley, Heneman, &
Milanowski, 2001). We found plans that were rewarding numerous knowledge and skills, including (a)
additional licensure or certification, (b) participation in specific professional development activities, (c)
National Board Certification, (d) mastery of specific skill blocks such as technology or authentic assessment,
(e) leadership activities, and (f) teacher performance as measured by a standards-based teacher evaluation
system. We also found districts experimenting with standards-based teacher evaluation without an intended
pay link. As described below, in standards-based teacher evaluation systems, teachers' performance is
evaluated against a set of standards that define a competency model of effective teaching. Such systems replace
the traditional teacher evaluation system and seek to provide a more thorough description and accurate
assessment of teacher performance. Findings from our research on some of these systems are the focus of this
issue of CPRE Policy Briefs.
Disciplines
Education Economics | Education Policy | Teacher Education and Professional Development
Graduate School of Education
Reporting on Issues and Research in Education Policy and Finance
Level of Performance, by Element

Element: Directions and procedures
  Unsatisfactory: Teacher directions and procedures are confusing to students.
  Basic: Teacher directions and procedures are clarified after initial student confusion or are excessively detailed.
  Proficient: Teacher directions and procedures are clear to students and contain an appropriate level of detail.
  Distinguished: Teacher directions and procedures are clear to students and anticipate possible student misunderstanding.

Element: Oral and written language
  Unsatisfactory: Teacher’s spoken language is inaudible, or written language is illegible. Spoken or written language may contain grammar and syntax errors. Vocabulary may be inappropriate, vague, or used incorrectly, leaving students confused.
  Basic: Teacher’s spoken language is audible, and written language is legible. Both are used correctly. Vocabulary is correct, but limited or is not appropriate to students’ ages or background.
  Proficient: Teacher’s spoken and written language is clear and correct. Vocabulary is appropriate to students’ age and interests.
  Distinguished: Teacher’s spoken and written language is correct and expressive, with well-chosen vocabulary that enriches the lesson.
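As a rough illustration of how element-level rubric ratings like these could be combined into a single overall rating, the following Python sketch maps each performance level to a number and averages across elements. The 1-4 numeric coding and the function name `overall_rating` are assumptions made for illustration; the brief does not specify how any site coded or weighted its ratings.

```python
from statistics import fmean

# Ordered rubric levels from the exhibit. The 1-4 coding is an assumption,
# not something the brief specifies.
LEVEL_SCORES = {
    "Unsatisfactory": 1,
    "Basic": 2,
    "Proficient": 3,
    "Distinguished": 4,
}


def overall_rating(element_ratings):
    """Return the unweighted mean of the numeric codes for each rated element.

    element_ratings maps an element name to one of the four rubric levels.
    """
    if not element_ratings:
        raise ValueError("at least one element rating is required")
    return fmean(LEVEL_SCORES[level] for level in element_ratings.values())


# Hypothetical ratings for one teacher on the two elements shown in the exhibit.
example = {
    "Directions and procedures": "Proficient",
    "Oral and written language": "Distinguished",
}
print(overall_rating(example))  # 3.5
```

An unweighted mean is only one option; a district could just as well weight domains differently or require a minimum level on every element.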
The Framework for Teaching (with adaptation to the local context) can be used as the performance measure for a standards-based teacher evaluation system. Evaluators can gather evidence from various sources (e.g., classroom observation, portfolios, logs) about the teacher's performance and then rate the teacher's performance on each element. Written and verbal feedback can be provided, and action plans for improvement can be developed. Moreover, to integrate the evaluation system into a KSBP plan, the overall or average rating can be used to determine placement and movement on the KSBP plan salary schedule.

An example of such a schedule is shown in Exhibit II. Teachers are placed into five levels of increasing performance competency: apprentice, novice, career, advanced, and accomplished. Placement is based on performance ratings on the four domains from the Framework for Teaching. Movement into higher levels occurs as performance ratings improve, with the proviso that a teacher can remain an apprentice for only two years and a novice for only five years. Within
Evaluation procedures
  Column 1: Comprehensive evaluation on all domains for new teachers and veterans who are identified as needing improvement, at certain steps on the schedule, or desiring to become lead teachers. Original plan was for comprehensive evaluation once every 5 years. All others undergo less rigorous annual evaluation on one domain each year. Comprehensive evaluation consisted of 5-6 classroom observations and a teacher-prepared portfolio.
  Column 2: Nontenured teachers evaluated on all domains/elements, via 9 classroom observations. Tenured teachers evaluated on 1 domain (minor) or 2 domains (major) over 3-year cycle via at least 1 classroom observation. No portfolio required, but evaluators also look at artifacts like student work.
  Column 3: Nontenured teachers evaluated on a subset of the standards each year for 2 years, then receive a full evaluation in the 3rd year. Evaluation is based on at least 2 observations. Tenured teachers are evaluated on all domains every 2, 3, or 4 years depending on prior rating level. At least 1 observation is required.
  Column 4: All teachers evaluated annually, with 2 ratings per year on selected domains, primarily via classroom observations. Observations are conducted as many times as is necessary over a 2-week period each semester. No portfolio is required, but evaluators also look at artifacts like student work. All rated on 5 core domains and selected content-relevant domains.

Evaluators
  Column 1: Peer evaluators, principals, and assistant principals
  Column 2: Principals and assistant principals
  Column 3: Principals and department heads
  Column 4: Self, peer, and assistant principals
At all four sites we focused our research on the standards-based teacher evaluation system. The teachers in Cincinnati and Vaughn were aware of the possibility that evaluation scores would be linked to their future pay via a KSBP plan; the teachers in Washoe and Coventry knew there was no intent to link evaluation scores to pay. We used multiple research methods, including both qualitative and quantitative data collection and analysis. A summary of our research approach at each site can be found in the appendix.

Our research focused on four questions:

1. What is the relationship between teachers' standards-based teacher evaluation scores or ratings and the achievement of their students?

This question is fundamental, since there is no point in encouraging teachers to develop and use competencies that are not related to student achievement. Further, if the scores are to be linked with pay increases as part of a KSBP plan, the evaluation system needs to be able to distinguish those teachers who facilitate greater levels of student achievement in order to justify rewarding them. Significant positive relationships would provide evidence not only that the standards-based evaluation ratings can be used to identify good teaching, but also that the teacher competencies underlying the system do help facilitate student achievement.

2. How do teachers and administrators react to standards-based teacher evaluation as a measure of instructional expertise?

This question is important because administrator and teacher reactions are a major determinant of the willingness of administrators to use the system as designed, and of teachers to agree to link pay with assessments of performance. The initial acceptance and long-term survival of the evaluation and KSBP systems will be jeopardized if administrators and teachers believe the evaluation system is unfair, overly burdensome, and not useful in guiding teacher efforts to improve performance.

3. Is there evidence that standards-based teacher evaluation systems influence teacher practice?

This question is important in order to assess the potential of a KSBP plan to motivate teachers to improve instructional practice. The evaluation system must provide guidance for teachers about districts' performance expectations and feedback on the current level of performance. Even if the evaluation will not be linked to pay, feedback from evaluators and the desire for a favorable evaluation provide incentives for improvement. So we investigated the influence of evaluation systems on what teachers report they do in the classroom.

4. Do design and implementation processes make a difference?

These processes can affect the overall viability and impact of the system. The system is not likely to have a sustained impact on teacher skill development, or survive for long, if it is cumbersome, prone to implementation glitches, and unaligned with other human resource management programs that affect instructional capacity.

Relationship Between Teacher Evaluation Scores and Student Achievement

We assessed the relationship between teachers' performance evaluation scores and student achievement by correlating teachers' overall evaluation scores with estimates of the value-added academic achievement of the teachers' students. Our value-added measure was estimated controlling for prior student achievement and other student characteristics, such as socioeconomic status, that influence student learning. At each site, we have analyzed multiple years of data. Exhibit IV summarizes the results for reading and mathematics.

We found positive relationships between teacher evaluation scores and student achievement, though the average relationship varied across the four sites. At the Vaughn school, the relationship was substantial, with an average correlation over the three years we studied of 0.37 in reading and 0.26 in mathematics (see note 1). In Cincinnati, the relationship was similar, with a three-year average correlation of 0.35 in reading and 0.32 in mathematics. In Washoe, the relationships were somewhat smaller; the average correlations were 0.22 and 0.21 for reading and math achievement, respectively. In Coventry, the average correlation between teacher ratings and student achievement in reading was 0.23, and 0.11 with mathematics achievement (see note 2). Note that one would not expect to find a perfect or even near-perfect correlation between evaluation scores and student achievement, given the various other factors that influence both. On the student achievement side, tests are not perfect measures of student learning, nor is teacher behavior its only cause. Teacher evaluation scores are also not perfect representations of teachers' actual classroom behavior. Given the size of recent estimates of likely teacher effects on student achievement (Nye, Konstantopoulos, & Hedges, 2004; Rowan, Correnti, & Miller, 2002), the average correlations for Cincinnati and Vaughn are about what one might expect.

1 The correlation is a quantitative indicator of the degree of association between two variables. It ranges from -1.00 to +1.00, with a correlation of .00 indicating no association between the variables. Correlations of .20 to .40 are quite common in educational research, and correlations in this range are considered meaningful indicators of an association between variables.

We speculate that Cincinnati and Vaughn have higher average correlations in part due to the use of multiple evaluators. In addition, Cincinnati evaluators received intensive, high-quality training. Vaughn evaluators could draw on a strong shared culture and history of working on instruction that fostered agreement on what good teaching looks like. In contrast, at the two other sites, a single evaluator (the school principal or an assistant principal) made the ratings, and less training was provided to the evaluators. Measurement error, relatively small samples in some grades and subjects, differences in the quality and coverage of student tests, and idiosyncrasies in different evaluators' interpretations of teacher performance are likely explanations for the variation in the strength of the relationship across years within each of the four sites.

Overall, our results suggest that the scores from standards-based performance evaluation systems can have a substantial positive relationship with student achievement and that the instructional practices measured by these systems contribute to student learning. The evidence supports the potential usefulness of a well-designed and rigorously implemented standards-based teacher evaluation as a basis for a KSBP system for teachers.
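To make the correlation statistic concrete, here is a minimal Python sketch of a Pearson correlation between teachers' overall evaluation scores and a per-teacher value-added estimate. The data values and the function name `pearson_r` are hypothetical, and the sketch takes the value-added estimates as given: it does not reproduce the brief's value-added modeling, which controlled for prior achievement and student characteristics.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one overall evaluation score and one value-added
# estimate per teacher (not taken from the brief).
evaluation_scores = [2.1, 2.8, 3.0, 3.4, 3.9]
value_added = [-0.2, 0.1, -0.1, 0.3, 0.2]
print(round(pearson_r(evaluation_scores, value_added), 2))
```

The result always falls between -1.00 and +1.00, which is the range described in note 1.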
Site and school year    Grades        Reading    Mathematics

Coventry
  1999-2000             2, 3, 6       .17        .01
  2000-2001             2, 3, 4, 6    .24        -.20
  2001-2002             4             .29        .51
  3-year average:                     .23        .11

Vaughn
  2000-2001             2-5           .48        .20
  2001-2002             2-5           .58        .42
  2002-2003             2-5           .05        .17
  3-year average:                     .37        .26

Washoe
  2001-2002             3-5           .21        .19
  2002-2003             4-6           .25        .24
  2003-2004             3-6           .19        .21
  3-year average:                     .22        .21
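The three-year averages in the exhibit can be checked against the yearly correlations. The short Python sketch below assumes the averages are simple unweighted means rounded to two decimals (the brief does not state the method); under that assumption it reproduces the reported figures for the three sites whose yearly rows appear above. The dictionary layout and function name are illustrative only.

```python
# Yearly correlations copied from the exhibit rows above (reading, mathematics).
YEARLY_CORRELATIONS = {
    "Coventry": {"reading": [0.17, 0.24, 0.29], "mathematics": [0.01, -0.20, 0.51]},
    "Vaughn": {"reading": [0.48, 0.58, 0.05], "mathematics": [0.20, 0.42, 0.17]},
    "Washoe": {"reading": [0.21, 0.25, 0.19], "mathematics": [0.19, 0.24, 0.21]},
}


def three_year_average(values):
    """Unweighted mean rounded to two decimals (an assumed method)."""
    return round(sum(values) / len(values), 2)


for site, subjects in YEARLY_CORRELATIONS.items():
    reading = three_year_average(subjects["reading"])
    mathematics = three_year_average(subjects["mathematics"])
    print(f"{site}: reading {reading:.2f}, mathematics {mathematics:.2f}")
```

Note how much the yearly values move around the average (for example, Vaughn reading ranges from .05 to .58), which is consistent with the brief's remarks about small samples and measurement error.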
2 Our research is reported in several journal articles and a book chapter. Validity results can be found in
Kimball, White, Milanowski, and Borman (2004); Milanowski (2004); Gallagher (2004); and Milanowski,
Kimball, and Odden (2005). Research on the implementation of the systems and teacher reactions can be
found in Heneman and Milanowski (2003); Kimball (2002); and Milanowski and Heneman (2001).
Teacher and Administrator Reactions

In addition to producing ratings which correlate with student achievement, a standards-based teacher evaluation system must be accepted by those who use it if it is to survive and contribute to performance improvement. Accordingly, we assessed teacher reactions by interviewing hundreds of teachers and conducting multiple surveys at three of our sites. We assessed administrator reactions through interviews. Exhibit V summarizes the results for teachers.

The most positive and least varied reactions were to the performance competency model embedded within the evaluation system. Teachers generally understood the standards and rubrics comprising the evaluation systems, and agreed that the performance described at higher levels described good teaching. Many teachers told us that this was the first time they ever had a clear and concise understanding of the district's performance expectations for their instructional practice. Additionally, many reported that the use of the teaching standards helped improve dialogue with their principals about teaching and performance expectations.

For the other areas shown in Exhibit V, teacher reactions were more mixed. Numerous specific aspects of the new evaluation systems and their implementations contributed to the variety of reactions. This variety, including both positive and negative reactions, suggests that close attention to design and implementation issues is needed in order to maximize teacher acceptance.

Administrators also generally accepted the performance competency model and the evaluation system based on it. Like teachers, they often commented that evaluation dialogue was improved under the new system. Many principals valued the increased opportunity to discuss instruction with teachers and felt that the greater amount of evidence they collected, combined with the explicit rubrics describing the four levels of teacher performance, helped them do a better job as evaluators. However, principals also saw the
new evaluation procedures as more work to implement, requiring many of them to lengthen their work day and complete evaluations on weekends. Some principals attempted to shortcut the process by reducing the number and length of observations, providing relatively general written feedback to teachers, or focusing their time primarily on new or struggling teachers.

Impact on Teacher Practice

We assessed effects on teaching practice primarily through interviews with teachers and evaluators. Many teachers reported that the new system had positive impacts on their instructional practice. But the evidence indicated that the initial effects of standards-based teacher evaluation on teacher practice tended to be broad, but relatively shallow. Engaging in more reflection, improved lesson planning, and better classroom management were commonly cited impacts. These are basic but critical features of instructional practice that can create conditions for student learning. Teachers were less likely to report changing their instruction to a pedagogy characterized by student-initiated activities or empowerment, as emphasized in some of the “distinguished” levels of the Framework rubrics. This is not surprising since the level of feedback and assistance provided in most cases emphasized classroom management and general pedagogy. Nor was professional development highly focused on the teaching practices assessed by the evaluation systems. One interesting impact was that many teachers being evaluated began to take student standards more seriously, because of the emphasis on teaching to those standards in the evaluation systems. One principal we interviewed remarked that one year of standards-based evaluation had done more to motivate teachers to pay attention to the student standards than years of workshops.

Design and Implementation

At three of our sites, the design process included not only gathering input from representatives of teachers, principals, and central office administrators, but also a pilot test of the system prior to full implementation. These practices helped build support and uncover potential implementation problems. In many cases, these problems were addressed before all teachers were required to undergo the new evaluation process or shortly after full implementation. Despite the best intentions, however, there were still some aspects of the evaluation process or related systems that caused problems. And in the case of Cincinnati, the teacher association sought substantial revisions of the evaluation process, and the members ultimately voted to reject the new KSBP salary schedule, under which raises would have depended in part on the results of evaluation.

A problem most apparent at Cincinnati and Vaughn was the difficulty of ironing out all the glitches in the performance evaluation process. Because of the complexity of these systems and the accompanying KSBP plans, it proved hard to foresee all potential problems and address them to the satisfaction of teachers and administrators in the first year of implementation. Implementation glitches led teachers to perceive that “they are building the airplane while it flies.” This was unsettling to many teachers and provoked doubts about the validity and fairness of the system. Vaughn was more successful in addressing these concerns, being able to take quicker action on implementation problems and having a higher level of trust between teachers and administrators.

Training for teachers and administrators was also an issue. Principals were generally responsible for training teachers in the new system, resulting in uneven quality. Teacher training tended to be process oriented, with limited emphasis on understanding how to develop and demonstrate the performance competencies. Training for evaluators varied considerably. Cincinnati invested considerable resources in teaching all evaluators how to collect evidence and apply the rubrics, and produced good interrater agreement and a relatively strong relationship between evaluation scores and student achievement. Coventry and Washoe spent less time on training, and had weaker relationships. While all sites provided training in the first year of implementation, one site failed to train principals or teachers who joined the district after the initial implementation. At all sites, administrator training did not appear to put much emphasis on providing useable feedback, setting performance goals, and coaching.

One factor that limited effects of the new evaluation systems was that the competency model was not part of a coherent effort to drive the development of a new performance culture. Except at Vaughn, the systems were not sufficiently linked to broader strategies for improving instruction or student achievement, nor to other parts of the human resource management system. Although there was some attempt at alignment, districts generally aligned only one or two of their other human resource systems (i.e., recruitment, selection, induction, mentoring, professional
development, compensation, performance management, and instructional leadership) with the instructional vision in the competency model, forfeiting the opportunity to have these programs reinforce the teacher performance evaluation system and the vision of good instruction embodied in it.

At each site, there was at least one key district administrator who shepherded the new system through the design, pilot, and implementation phases, and helped rally others behind the cause. Despite their efforts, in two districts at least one of the leaders of other important functions (curriculum and instruction, principal supervision, professional development, or human resources) remained resistant or disengaged. This seemed to be due to the lack of active superintendent engagement with the performance competency model underlying the teacher evaluation system. So instead of tightly integrating the performance evaluation system with other efforts to improve teacher quality, these districts treated the performance evaluation or pay system as just another isolated reform.

Another issue was the lack of alignment between the teacher performance evaluation system and the performance expectations and evaluation process for school administrators. Most sites did not hold administrators accountable for the quality of their efforts to evaluate and support teachers through the new system, or even for completion of the evaluation process. Though most evaluators worked hard to accommodate the new system, lack of accountability led some administrators to minimize involvement in the new teacher evaluation system, fail to evaluate teachers in a timely way, or fail to provide the feedback teachers desired.

Using Standards-Based Teacher Evaluation as the Foundation for Knowledge- and Skill-Based Pay

Our results provide evidence that ratings from standards-based teacher evaluation systems can have a meaningful relationship with measures of student achievement. There is thus evidence that these evaluation systems are holding teachers accountable for competencies related to student achievement. These results also reassure us that holding teachers accountable for their performance is worthwhile, in terms of the outcome policymakers and much of the public feel is most important: improved student achievement. These findings suggest that standards-based teacher evaluation systems could be used as the foundation of a KSBP plan, but only if the evaluation system is designed and implemented properly to support this use.

Guidelines for Design and Implementation

Based on our findings, we suggest the following guidelines for designing and using a standards-based teacher evaluation system as the basis for a KSBP system.

1. Specify that performance improvement is a strategic imperative. This will first require identification of performance gaps (e.g., student learning relative to state proficiency standards, achievement gaps), followed by a conclusion that improvement in teachers' instructional practice will be a key lever for closing these gaps. In this way teacher performance competency becomes identified as a factor of strategic importance, and a standards-based evaluation system and a KSBP system will logically follow as key tools to be used in the drive for performance improvement. If these initiatives are not embedded within a strategy of performance improvement, they will likely be viewed as “just another program” by teachers and administrators. In turn, the evaluation system will be lost among other priorities that come along, not have a designated champion for its success, be underfunded in both monetary and time terms, meet resistance or rejection by teachers and administrators, and gradually lose its potency.

2. Develop a set of teaching standards and scoring rubrics (i.e., a competency model) that reflects what teachers need to know and be able to do to provide the kind of instruction needed to meet the district's student achievement goals. The model is the foundation of the program and everything about the program will flow from it. Active teacher participation in the construction and refinement of the model is essential. The Framework for Teaching represents one possible starting point, but other models such as those developed by the National Board for Professional Teaching Standards, the state of Connecticut (see http://www.state.ct.us/sde/dtl/t-a/index.htm), the National Council of Teachers of Mathematics, and the National Council of Teachers of English, as well as a new set of standards being developed by Allan Odden to reflect research-based instructional practice (see Odden & Wallace, forthcoming) should also be considered. The Connecticut and Odden models place more emphasis on specific instructional practices derived from the most current research on how students learn (e.g., Bransford, Brown, & Cocking, 1999; Donovan & Bransford, 2005a, 2005b, 2005c; Cunningham & Allington, 1994).

Our research also suggests that some additions to the content of Framework-based systems may be useful in improving instruction and may also yield evaluation scores with a stronger relationship to student achievement. First, while the generic teaching behaviors emphasized in the systems based on the Framework are important, it may also be useful to explicitly evaluate teachers on their skill in implementing specific instructional programs important to the jurisdiction's strategy to improve student achievement. For example, if models like Success for All or Direct Instruction, or a specific curriculum, are part of the strategy to achieve school or district goals, teachers should be evaluated on how well these are implemented in the classroom. Second, if more skill in content-specific pedagogy and higher levels of pedagogical content knowledge are needed to facilitate a major boost in student achievement, evaluation systems may need to place more emphasis on those approaches to instruction.

3. Be prepared for additional workload for teachers being evaluated and those doing the evaluation. System designers need to carefully review what is required of teachers to minimize burden. This is especially an issue if teachers will be required to prepare a portfolio as part of the evaluation. Perhaps some small reduction in other responsibilities while teachers are undergoing evaluation would decrease the perception of burden and sense of stress. Similarly, while evaluating teachers is already a part of school leaders' jobs, doing high-quality evaluations is often not rewarded by districts nor easy to do given the current structure of the jobs. While the addition of peer evaluators can reduce administrator workload, districts should review the design of administrators' jobs and consider incentives for them to allocate more time to teacher evaluation, feedback, and coaching.

4. Prepare teachers and administrators thoroughly. Simply communicating about the system is not enough. Training will be necessary for both teachers and administrators. For teachers, early training should focus on the nature of the performance competencies on which the system is based, the purposes and mechanics of the evaluation system, and knowledge and skills needed to function effectively within the new system. A key here will be providing guidance on the specifics of what good teaching looks like according to the evaluation system, so teachers have a clear idea of what they need to do to get a good evaluation before the process starts. Administrators will also need early training on the performance competency model and on the purposes and mechanics of the new evaluation system. In addition, it will be critical to provide training in observational skills and accuracy, as well as in providing timely, useful feedback and coaching. Further into the life of the new system, training can shift to broader issues of performance management and instructional leadership centered on the teacher competency model.

5. Consider using multiple evaluators if “live” observations are part of the system. The burden of effective standards-based evaluation may be too great for many school administrators to shoulder alone. Not only do they have many other demands on their time, but few can be experts in all grades and subjects, nor can all resist the temptation toward giving lenient ratings to preserve working relationships. Having a second evaluator provides expertise, reduces workload, and can help reduce leniency when scores have to be compared and discussed. Alternatively, systems could use curriculum-unit based instructional portfolios, with videos of teachers' instructional practice rather than live observations. This is the approach of the National Board, Connecticut, and the Odden and Wallace (forthcoming) proposal.

6. Provide evaluators with high-quality training. For example, Cincinnati began with a three-day session for evaluators on system goals, procedures, and rating pitfalls, and followed up with having raters view, discuss, and rate several videotapes of teaching at various performance levels. Raters had to meet a standard of agreement with the ratings of a set of experts, and received follow-up training to help them do so. The training should include the use of a structured scoring process to guide evaluator decision making and to discourage “gut-level” decisions. Clarify for evaluators issues such as what evidence is to be collected, how that evidence should be compared to the rubrics or rating scales, or how to deal with evidence that falls between two rubric categories.

7. Support teachers in acquiring the knowledge and skills needed to reach high performance. These efforts need to go beyond orienting teachers or training them in the new process to providing resources for improvement. Feedback needs to be concrete and specific, telling the teacher not only her/his rating but also exactly what prevented her/him from getting a higher score, and what specific behaviors or results would raise the score. This could be followed by information about relevant professional development, suggestions about techniques to try and whom to observe to see good performance exemplified, and even modeling of aspects of desired performance. This in turn requires that evaluators be trained in providing feedback and that teachers have a coach or mentor to go to for help. It may be necessary to get school leaders and teaching peers more involved in providing developmental feedback and coaching.

It may also be necessary to restructure professional development programs on the teaching knowledge and skills underlying the evaluation system, so that they provide the skills teachers need to do well in both the classroom and the evaluation process. The link between specific professional development activities and the performance competencies in the standards then needs to be made clear to teachers. Having the professional development system “look” like the evaluation system also can help bring alignment. Emphasizing the development and use of standards-based curriculum units, which has been shown to be a powerful form of professional development (Cohen & Hill, 2001), aligns nicely with a curriculum-unit approach in the evaluation system.

development needs to be aligned, so that teachers have the means to obtain the knowledge and skills rewarded by the pay system. Alignment reinforces the importance of the performance competencies, sends consistent messages about the district's vision of good teaching, and provides a framework for instructional leaders to use in helping teachers improve practice.

9. Work out details, pilot the system, and monitor implementation. We found that at least one pilot year was needed to work the glitches out of the evaluation systems. A single test year may not be enough. At some sites, going to scale after the pilot revealed implementation problems which in turn lowered the credibility of the system to teachers and reduced acceptance.

10. Conduct validity and interrater agreement (reliability) analyses. This will help assure all stakeholders that evaluation scores are based on observable and agreed-upon features of teacher performance and that higher evaluation scores are connected with important student achievement outcomes.

Many of these recommendations imply that a standards-based evaluation system should be designed, tested, and implemented before the link to pay is made. Pay change would thus follow the change in the performance evaluation system and the development of aligned human resource practices. While new pay and evaluation systems
8. Align the human resource management sys-
could be introduced all at once, before doing so
tem with the performance competency model
program designers need to realistically assess
underlying the teacher evaluation standards. To
their organization's capacity for implementing
reinforce the importance of the performance com-
change in a number of major human resource sys-
petency model and create a shared conception of
tems, and the readiness of teachers for major
competent instruction, the content of human
changes in how they are evaluated and paid.
resource programs should reflect the content of
the model (Heneman & Milanowski, 2004). This Guidelines Caveats
applies to all eight major human resource program Several caveats about our guidelines for using
areas: recruitment, selection, induction, mentor- standards-based teacher evaluation systems are
ing, professional development, performance man- pertinent. First, there are generalizability bounds
agement, compensation, and school leadership. on our research in terms of the teacher perfor-
For example, recruitment of new teachers can be mance competency model studied (the Frame-
targeted toward applicants likely to possess the work for Teaching), the heavy emphasis on class-
competencies in the model, and applicants can be room observation as the method of evidence gath-
informed of the competencies expected. Another ering, and the types of training provided to teach-
example is teacher induction. If induction pro- ers, administrators, and special evaluators. Alter-
grams are based on the performance competencies native design and delivery features should be
underlying the evaluation system, new teachers experimented with (for examples, see Odden &
are likely to have a better understanding of per- Wallace, forthcoming; Tucker & Stronge, 2005).
formance requirements and to be better prepared Different performance competency models could
for future evaluations, as well as to be less likely be tried, varying features such as the number and
to leave the district or profession in response to a types of standards (e.g., only ones that focus on
negative evaluation experience. Professional instructional practice), or using alternative intact
competency models such as those identified above. Greater usage and weight could be accorded to portfolios and videos of teacher instruction, as opposed to classroom instruction. More intensive training, emphasizing accuracy of evaluation and feedback and coaching skills, could be given to administrators or other evaluators. Whatever the specific nature of the experimentation, evaluation of its effectiveness is paramount.

Second, the systems we evaluated, as well as our recommendations for practice, entail increased administrative workload for teachers, administrators, and human resource staff. Such an effect is a natural byproduct of serious attempts to improve teacher quality. Ways to minimize the workload effect, or help incorporate it into standard practice, should be experimented with. Suggestions here include streamlining implementation and evidence-gathering processes, automating them using web-based technologies, and standardizing and sharing processes, such as through district consortia or state-funded and conducted activities (e.g., training for administrators).

Finally, using standards-based evaluation results for KSBP systems will likely continue to generate resistance from some teachers and administrators. For some teachers, familiarity and comfort with the single salary schedule, aversion to performance pay, fears of pay fluctuations and uncertainty, skepticism about the stability and survival of funding for the pay program, and lack of self-confidence and assistance for meeting high performance standards all combine to make a new KSBP program a less than welcome addition to their educational lives. Resistance among some administrators also may run deep, particularly due to a loathing to make significant performance differentiations among teachers that will lead to significant pay differences among them. Mechanisms for lessening resistance must be incorporated into the initial design of the plan. These include communicating extensively and continually with teachers and administrators about the plan, making the plan prospective so that current teachers have the option of staying with the old plan, guaranteeing that there will be no pay cuts as the new plan is implemented and that there will be no artificial limits on the amount of pay that can be earned, ensuring stability of funding for the plan, and showing teachers the actual dollar impacts of the plan on their individual pay. The Denver Pro Comp plan incorporated all of these elements and was voted on favorably by the teachers. Many of these elements were missing from the proposed Cincinnati plan, and it was voted down by teachers.

Conclusion

Our research shows the promise of standards-based teacher evaluation as a foundation for KSBP systems. In order to make the most of this approach, moving toward standards-based evaluation should be more than a fine-tuning of the existing evaluation system. Indeed, the system should be made an integral component of a general performance improvement strategy. Then a commitment to a transformation in how teacher performance is defined, measured, and supported is needed. Such commitment needs to extend not only to the teacher evaluation process, but also to aligning the human resource management system, linking the aligned system to state or district instructional strategies, and addressing teacher and administrator apprehensions about changing the pay system. This commitment is not for the faint of will, time, or budget; it is for those who want to invest in creating a high-quality teaching force with the competencies needed to help kids learn in a standards-based world.

About the Authors

Herbert G. Heneman III is the Dickson-Bascom Professor (Emeritus) in Business, Management, and Human Resources at the University of Wisconsin-Madison. He also serves as a Senior Research Associate in the Wisconsin Center for Education Research. His research is in the areas of staffing, performance management, union membership growth, work motivation, and compensation systems. Heneman is the senior author of four textbooks on human resource management.

Anthony Milanowski is an Assistant Scientist with CPRE at the University of Wisconsin-Madison. Since 1999 he has coordinated the CPRE Teacher Compensation Project's research on standards-based teacher evaluation and teacher performance pay. His research interests include performance evaluation, pay system innovations, teacher selection, and the teacher labor market. Before coming to CPRE, he worked for many years as a human resource management professional.

Steven M. Kimball is a researcher with CPRE at the University of Wisconsin-Madison and with the Wisconsin Center for Education Research. For the CPRE Teacher Compensation Project, he has researched the impact of school-based performance award programs, National Board Certification, and standards-based teacher evaluation and compensation systems. Before joining CPRE, Kimball held legislative analyst positions in the U.S. Congress and the Texas State Office in Washington, DC.

Allan Odden is a Professor of Educational Leadership and Policy Analysis at the University of Wisconsin-Madison. He is also a Co-Director of CPRE, where he directs the Education Finance Research Program. His research and policy emphases include school finance redesign and adequacy, effective resource allocation in schools, the costs of instructional improvement, and teacher compensation. Odden has published widely on his research. His newest book, New Directions in Teacher Pay, co-authored with Marc Wallace, will appear this year.

References

Bransford, J. D., Brown, A. L., & Cocking, R. (1999). How people learn: Brain, mind, experience, and school. Washington, DC: National Academy Press.

Cohen, D. K. (1996). Rewarding teachers for student performance. In S. H. Fuhrman & J. A. O'Day (Eds.), Rewards and reform: Creating educational incentives that work (pp. 60-112). San Francisco: Jossey-Bass.

Cohen, D. K., & Hill, H. C. (2001). Learning policy: When state education reform works. New Haven, CT: Yale University Press.

Corcoran, T., & Goertz, M. (1995). Instructional capacity and high performance schools. Educational Researcher, 24(9), 27-31.

Cunningham, P., & Allington, R. (1994). Classrooms that work: They can all read and write. New York: HarperCollins.

Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Alexandria, VA: Association for Supervision and Curriculum Development.

Donovan, M. S., & Bransford, J. D. (Eds.). (2005a). How students learn: History in the classroom. Washington, DC: National Academies Press.

Donovan, M. S., & Bransford, J. D. (Eds.). (2005b). How students learn: Mathematics in the classroom. Washington, DC: National Academies Press.

Donovan, M. S., & Bransford, J. D. (Eds.). (2005c). How students learn: Science in the classroom. Washington, DC: National Academies Press.

Floden, R. E. (1997). Reforms that call for teaching more than you understand. In N. C. Burbules & D. T. Hansen (Eds.), Teaching and its predicaments (pp. 11-28). Boulder, CO: Westview Press.

Gallagher, H. A. (2004). Vaughn Elementary's innovative teacher evaluation system: Are teacher evaluation scores related to growth in student achievement? Peabody Journal of Education, 79(4), 79-107.

Heneman, H. G., III. (1998). Assessment of the motivational reactions of teachers to a school-based performance award program. Journal of Personnel Evaluation in Education, 12(1), 43-59.

Heneman, H. G., III, & Milanowski, A. T. (1999). Teachers' attitudes about teacher bonuses under school-based performance award programs. Journal of Personnel Evaluation in Education, 12(4), 327-341.

Heneman, H. G., III, & Milanowski, A. T. (2003). Continuing assessment of teacher reactions to a standards-based teacher evaluation system. Journal of Personnel Evaluation in Education, 17(2), 173-195.

Heneman, H. G., III, & Milanowski, A. T. (2004). Alignment of human resource practices and teacher performance competency. Peabody Journal of Education, 79(4), 108-125.

Kelley, C., Heneman, H., III, & Milanowski, A. (2002). School-based performance rewards: Research findings and future directions. Educational Administration Quarterly, 38(3), 372-401.
Kelley, C., Odden, A., Milanowski, A., & Heneman, H. G., III. (2000). The motivational effects of school-based performance awards (CPRE Policy Brief No. RB-29). Philadelphia: University of Pennsylvania, Consortium for Policy Research in Education.

Kimball, S. M. (2002). Analysis of feedback, enabling conditions, and fairness perceptions of teachers in three school districts with new standards-based teacher evaluation systems. Journal of Personnel Evaluation in Education, 16(4), 241-268.

Kimball, S. M., White, B., Milanowski, A. T., & Borman, G. (2004). Examining the relationship between teacher evaluation and student assessment results in Washoe County. Peabody Journal of Education, 79(4), 54-78.

Milanowski, A. T. (2004). The relationship between teacher performance evaluation scores and student achievement: Evidence from Cincinnati. Peabody Journal of Education, 79(4), 33-53.

Milanowski, A. T., & Heneman, H. G., III. (2001). Assessment of teacher reactions to a standards-based teacher evaluation system: A pilot study. Journal of Personnel Evaluation in Education, 15(3), 193-212.

Milanowski, A. T., Kimball, S. M., & Odden, A. (2005). Teacher accountability measures and links to learning. In L. Stiefel, A. E. Schwartz, R. Rubenstein, & J. Zabel (Eds.), Measuring school performance and efficiency: Implications for practice and research. 2005 Yearbook of the American Education Finance Association (pp. 137-159). Larchmont, NY: Eye on Education.

Odden, A. (2000). New and better forms of teacher compensation are possible. Phi Delta Kappan, 81(5), 361-366.

Odden, A., & Kelley, C. (2002). Paying teachers for what they know and do: New and smarter compensation strategies to improve schools (2nd ed.). Thousand Oaks, CA: Corwin Press.

Odden, A., Kelley, C., Heneman, H., III, & Milanowski, A. (2001). Enhancing teacher quality through knowledge- and skills-based pay (CPRE Policy Brief No. RB-34). Philadelphia: University of Pennsylvania, Consortium for Policy Research in Education.

Odden, A., & Wallace, M. (forthcoming). New directions in teacher pay.

Peterson, K. (2006). Teacher pay reform challenges states. Stateline.org: Where policy & politics news click. Retrieved March 13, 2006 from www.stateline.org/live/viewpage.action?siteNodeld=137&languageID=15contentID=93346.

Rowan, B., Correnti, R., & Miller, R. J. (2002). What large-scale, survey research tells us about teacher effects on student achievement: Insights from the Prospects study of elementary schools (CPRE Research Report No. RR-051). Philadelphia: University of Pennsylvania, Consortium for Policy Research in Education.

Tucker, P. D., & Stronge, J. H. (2005). Linking teacher evaluation and student learning. Alexandria, VA: Association for Supervision and Curriculum Development.
COVENTRY
1. Teacher performance ratings as predictors of student achievement —value-added
analysis for reading and math test scores.
Graduate School of Education
University of Pennsylvania
3440 Market Street, Suite 560
Philadelphia, PA 19104-3325