Faculty Evaluation
Part A
There are several types of errors that can occur when evaluating performance, especially
in faculty evaluations. There are also recommendations for designing evaluations that will reduce these errors.
With any type of evaluation, error can occur. The first type of error is unlikely to occur
when evaluating faculty. Serial position error occurs when the person doing the rating remembers
the people they saw first and last better than the people in the middle. With faculty evaluation,
there is not a long list of faculty to be evaluated by one person, so there is no serial effect.
Contrast error can occur with faculty evaluations. A student rater or peer rater could
compare (contrast) the faculty member they are evaluating to a different faculty member, so the
faculty member gets a lower or higher rating based on how they stacked up against the other member.
This may be challenging to prevent, seeing as the rater is not always aware they are doing it.
Another type of error that may occur is halo/horn error. Halo error is where the rater
evaluates the faculty member high on all categories of performance despite only knowing they do
well in one. An example is rating a faculty member high on all dimensions when the rater only knows
they are good at communication. On the other hand, horn error occurs when the faculty member
is rated low in all categories despite only being low in one. One potential way to combat this
type of error could be to have the rater state a critical incident in each category that gives a rationale for the rating.
Leniency error is a common error that occurs with faculty evaluations. This is where
raters give faculty members scores that are higher or lower than their actual performance
warrants. When faculty are rated below their actual performance, it is called negative leniency.
Students will often rate professors as all good or all bad without truly evaluating how that
professor performs in each category. Again, listing critical incidents as a rationale for scores could help reduce this error.
The last type of error that may occur is the central-tendency error. The rater scores
faculty toward the middle of the scale, with no extreme positive or negative ratings. To try to reduce this type of
error, it is important to ensure the raters know and are familiar with the areas they are rating.
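As an illustration of how such patterns might be screened for, the following minimal Python sketch (with hypothetical scores and arbitrary thresholds chosen for illustration) flags rating profiles consistent with halo/horn or central-tendency error. A flag is only a prompt for a closer look, since a uniform set of ratings can also be accurate.

    from statistics import mean, pstdev

    MIDPOINT = 3  # midpoint of a 5-point scale

    def central_tendency_flag(ratings, threshold=0.8):
        """Flag a set of ratings clustered at the scale midpoint."""
        share_at_mid = sum(r == MIDPOINT for r in ratings) / len(ratings)
        return share_at_mid >= threshold

    def halo_horn_flag(dimension_scores, min_spread=0.5):
        """Flag one evaluation whose scores barely vary across dimensions
        and sit near the top or bottom of the scale."""
        m = mean(dimension_scores)
        return pstdev(dimension_scores) < min_spread and (m >= 4.0 or m <= 2.0)

    print(halo_horn_flag([5, 5, 5, 5, 4]))         # True: uniformly high, worth a second look
    print(central_tendency_flag([3, 3, 3, 3, 2]))  # True: clustered at the midpoint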
Muchinsky and Culbertson (2016) list several ways to improve performance appraisals in
their book Psychology Applied to Work. First, ratings should be objective, not subjective. For
example, asking how many times a student tried to meet with a professor and was unable to
would be better than asking whether the professor was available, because it is more objective.
Second, the evaluation should be job-related and based on a job analysis. Third, ratings should be
based on behaviors, not traits. An example of this would be asking how a professor made the
class enjoyable rather than asking if the professor is funny. Fourth, raters should only be
evaluating behaviors within the control of the faculty member. University policies are beyond the
control of faculty members and should not be considered when evaluating them. Lastly, the
dimensions should be related to a specific function, not a global one (Muchinsky & Culbertson,
2016).
Muchinsky and Culbertson (2016) also provide recommendations for conducting fair and
defensible evaluations. Evaluations should be standardized and communicated in writing. Faculty
members should be given feedback, including areas in which they can improve, so that they have
the opportunity to correct any deficiencies. The faculty members should also be given a chance
to provide their own feedback and any appeals. Multiple, diverse, and unbiased raters should be
used. There should be written instructions for how to train raters. Lastly, there should be
documentation in order to allow a system that checks for discriminatory practices (Muchinsky &
Culbertson, 2016).
Our Process
For our faculty evaluation, the professors will be evaluated twice per semester: once in
the middle of each semester and again when the semester is almost over. Multiple raters will be
used for this process, and we will do our best to make sure they are unbiased. Professors will not
be evaluated at the beginning of each semester because we feel that would be unnecessary. By
evaluating in the middle of the semester, we can see how well the professors have been doing
up to that point, and by evaluating again at the end, we can see whether they improved since
then. Since they will not get any teaching experience between the end of one semester and the
beginning of the next, there is no need to reevaluate them at the beginning. After the evaluations
are completed, they will be available for viewing by the faculty. The results will be
communicated to the professors, who will be told whether they are lacking in any categories and
given opportunities to improve. If a professor believes they have been rated unfairly, there will
be an appeal process in which they can argue their viewpoint (Muchinsky & Culbertson, 2016).
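To make the mid-to-end comparison concrete, here is a minimal Python sketch of how the change between the two evaluations could be computed; the dimension names and scores are hypothetical.

    def improvement(mid_scores, end_scores):
        """Per-dimension change from the midterm to the end-of-semester evaluation."""
        return {dim: end_scores[dim] - mid_scores[dim] for dim in mid_scores}

    mid = {"preparation": 3, "accessibility": 2, "timeliness": 4}
    end = {"preparation": 4, "accessibility": 4, "timeliness": 4}

    for dim, delta in improvement(mid, end).items():
        trend = "improved" if delta > 0 else "declined" if delta < 0 else "unchanged"
        print(f"{dim}: {trend} ({delta:+d})")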
It is important for the raters to collect performance information throughout the evaluation
period, as this will help give them a basis for their ratings. The raters can do this in a few
different ways. One way would be observing the professors for a time while they are
teaching their classes. The raters could also talk to random selections of students who have taken the
professors before, or speak to the department head to see whether the department head can
provide any information relevant to the evaluation. Several factors are relevant to the
evaluation. Professors' degree of knowledge in their subject matter is very important, because a
professor's job is to teach information to people, and we do not want them accidentally spreading
misinformation. It is also important for professors to arrive to class on
time and to finish grading students' work and other job-related activities in a timely manner.
Preparation is another relevant factor, because if a professor is well prepared for teaching each
class period, they will be able to provide the information in a way that can be understood by
students more easily. The way professors conduct themselves in front of students is also
important, as we do not want students to think the professors are rude or do not care about how
much they learn. It is also important that the professors not get too distracted during class,
because if they are off topic for too long they may not be able to get through the information they
plan to teach. Lastly, it is important for them to communicate deadlines and the dates on which
tests occur to the students, so the students can plan to get their work done on time.
The scales we will use for our faculty evaluation are graphic rating scales, one for each of
the specific aspects of the teacher being evaluated. We chose this because, compared to some of
the other methods, it is easier to do and less time consuming. Unfortunately, graphic rating
scales are often susceptible to rating errors, but we realize that no evaluation system is perfect,
and using this simpler method will make it easier to do two evaluations for the teachers each
semester. The scales will be standardized and uniform for each professor.
Our raters will need to be trained before they are able to rate the professors accurately.
This is especially important for the type of scale we are using, since it is more susceptible to rater
error than most other methods. Our training process will be a combination of rater error training
and frame-of-reference training. Written instructions regarding how to properly train the raters
will be used extensively during the training process. We will teach the raters about errors that
often occur when rating, including serial position errors, contrast errors, halo/horn errors,
leniency errors, and central-tendency errors. The raters will learn what each type of error looks
like and how to identify them and avoid them. However, we will also teach them that if they do
end up rating a professor in a way that looks like one of these errors, it may not necessarily be an
error. The raters should not necessarily always avoid giving a faculty member high ratings across
the board, as it is always possible that the faculty member is very good in each category. We will
also show the raters portrayals of made-up employee performances that are good, bad, and in the
middle, and provide feedback on how accurate the raters are in rating them (Muchinsky &
Culbertson, 2016). This will enable the raters to learn how to give ratings more accurately, as
they can be shown what degree of performance would constitute a high, medium, or low rating.
In addition, since the vignettes will be made up, this will teach the raters to provide those ratings
without bias.
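As one way the feedback step could be summarized, the minimal Python sketch below (with hypothetical vignette names and target scores) reports how far a trainee's ratings of the made-up performances fall from the targets.

    def mean_absolute_error(trainee_ratings, target_scores):
        """Average distance between a trainee's vignette ratings and the
        target scores assigned to those made-up performances."""
        diffs = [abs(trainee_ratings[v] - target_scores[v]) for v in target_scores]
        return sum(diffs) / len(diffs)

    targets = {"good_vignette": 5, "middle_vignette": 3, "poor_vignette": 1}
    trainee = {"good_vignette": 4, "middle_vignette": 3, "poor_vignette": 2}
    print(f"mean absolute error: {mean_absolute_error(trainee, targets):.2f}")  # 0.67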
Whenever anyone is evaluating anything, there is some bias involved. However, the
raters can use the knowledge they received from training to help them avoid biases in their
ratings. As they will have learned about the halo effect, leniency errors, central-tendency errors,
and contrast error, they can take those into account when doing the ratings. If the rater checks
back over a rating and sees that they have rated a professor in a way that looks like halo error,
for example, they can examine it further to see whether they made an error or whether the professor
is actually very good in each aspect being rated. The raters will also be discouraged from
committing conscious rating distortion, and told how important it is that they try to rate the
faculty as accurately as possible. In order to avoid this, the raters will be required to provide
extensive documentation of faculty performance, some of which should be based on the rater's
personal knowledge. If it is believed that a rater is purposely giving inflated or deflated ratings,
this will be looked into to make sure no discrimination or abuses of the system are occurring
(Muchinsky & Culbertson, 2016). If this does occur, the raters doing it will be severely punished.
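As a hedged illustration of how such a review might begin, the Python sketch below flags raters whose average ratings sit unusually far from the group average. The rater names, averages, and cutoff are all hypothetical, and a flagged rater would still need human review before any conclusion about bias.

    from statistics import mean, pstdev

    def flag_outlier_raters(rater_means, z_cutoff=1.5):
        """Flag raters whose average rating is far from the group average;
        the cutoff is arbitrary and flags are not proof of distortion."""
        values = list(rater_means.values())
        overall, spread = mean(values), pstdev(values)
        if spread == 0:
            return []
        return [name for name, m in rater_means.items()
                if abs(m - overall) / spread > z_cutoff]

    averages = {"rater_a": 3.1, "rater_b": 3.3, "rater_c": 4.9,
                "rater_d": 3.0, "rater_e": 3.2}
    print(flag_outlier_raters(averages))  # ['rater_c']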
We will be using graphic rating scales for our evaluation, each on a 5-point scale. Most
dimensions share the same anchors: consistently exceeds job requirements, frequently exceeds
job requirements, meets job requirements, frequently below job requirements, and consistently
below job requirements. Two of the scales instead run down to unacceptable at the low end. For
example:

Accessible - consistently exceeds job requirements, frequently exceeds job requirements,
meets job requirements, frequently below job requirements, consistently below job requirements
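To show how the standardized, uniform layout might be represented, here is a minimal Python sketch. The anchor wording comes from the scales above; any dimension label other than Accessible would be an assumption.

    # Shared anchors for most of our 5-point graphic rating scales.
    STANDARD_ANCHORS = {
        5: "consistently exceeds job requirements",
        4: "frequently exceeds job requirements",
        3: "meets job requirements",
        2: "frequently below job requirements",
        1: "consistently below job requirements",
    }

    def render_scale(dimension, anchors=STANDARD_ANCHORS):
        """Print one scale item in the same layout for every professor."""
        print(dimension)
        for score in sorted(anchors, reverse=True):
            print(f"  {score} - {anchors[score]}")

    render_scale("Accessible")  # the dimension label shown in the list above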
To address content-related recommendations, we tried to base our evaluation more
on behaviors than traits. We also made sure that the criteria were job-related and consisted of
behaviors that could be directly observed by the rater (Muchinsky & Culbertson, 2016).
Part C. Explanation
As stated earlier, we chose to use a uniform graphic rating scale so each professor is
evaluated based on a set standard. To ensure that the professors are rated fairly and efficiently,
we chose the graphic rating scale, seeing as it is the most commonly used type of performance
appraisal, and the other types would have issues that we thought were more detrimental to the
appraisal than the issues with graphic rating scales. We chose to use behaviors more than traits,
in addition to a mix of subjective and objective measures in order to make sure that all unique
types of teaching are evaluated fairly, while at the same time, making sure that the different types
of teaching are suitably effective. Some criteria lend themselves to being more subjective than
others, such as "promotes critical thinking" and "approachable," seeing as some raters'
personalities will affect how they rate the evaluatee, even with the training that they receive.
Some of the scales used for the criteria are simply able to discriminate between good job
performance and unacceptable job performance, while other scales detail whether job
performance is consistently exceeded or failed. These scales were chosen because some criteria
are critical enough to the job that not meeting expectations for them is a huge issue, while
other criteria are looser, and meeting them is indicative of going above and beyond the job
requirements.