The Outcome Effect and Professional Skepticism

Joseph F. Brazel*
North Carolina State University
jfbrazel@ncsu.edu
Scott B. Jackson
University of South Carolina
scott.jackson@moore.sc.edu
Tammie J. Schaefer
University of Missouri-Kansas City
schaefertj@umkc.edu
Bryan W. Stewart
Brigham Young University
bstewart@byu.edu
January 2016
*Corresponding Author
Professor of Accounting
Poole College of Management
Campus Box 8113
Raleigh, NC 27608
We thank Donald Moser (editor), two anonymous referees, Mark Beasley, Jean Bedard, Frank
Buckless, Aasmund Eilifsen, Scott Fleming, Jeremy Griffin, Blake Hetrick, Kip Holderness,
Kathy Hurtt, Annette Kohler, Kathleen Linn, Marlys Lipe, Bob Lipe, Tim Louwers, Jeffrey
Miller, Carmen Olsen, Joel Owen, Luc Quadackers, David Ricchiute, Dick Riley, Katherine
Schipper, Scott Showalter, Donna Street, Eileen Taylor, Scott Vandervelde, Justin Vaughan,
Sandra Vera-Munoz, Aaron Zimbelman, and workshop participants at the 2013 BYU
Accounting Research Symposium, 7th EARNet Symposium, the 2013 and 2014 Fall Meetings of
the Institute for Fraud Prevention (IFP), the 2013 and 2014 Meetings of the International
Association for Accounting Education and Research (IAAER)'s Informing the International
Auditing and Assurance Standards Board's Standard Setting Process, the University of Notre
Dame, North Carolina State University, the University of South Carolina, the University of
Missouri-Kansas City, and West Virginia University for many helpful comments. This research
was supported by grants from the IAAER and the IFP. All results, interpretations, and
conclusions expressed are those of the authors alone, and do not necessarily represent the views
of the IAAER or the IFP. We also thank the individuals who participated in our experiments and
surveys.
Abstract
Despite the importance placed on professional skepticism by the accounting profession and
regulators, the failure of auditors to exercise an appropriate level of skepticism continues to be a
global issue. We experimentally test a potential barrier to skepticism. We find that outcome
knowledge biases supervisors' evaluations of skeptical behavior. Holding a staff member's
skeptical judgments and acts constant, superiors on the engagement team evaluate the staff's
skeptical behavior based on whether the staff's investigation of an issue ultimately identifies a
misstatement. Our evidence suggests that evaluators penalize auditors who employ an
appropriate level of skepticism, but do not identify a misstatement. Although consultation with
their superior while exercising skepticism improved staff auditors' performance evaluations,
consultation did not effectively mitigate the outcome effect on their evaluations. Last, we
observe that auditors in the field anticipate that their superiors will be influenced by outcome
knowledge when they evaluate their skeptical behavior. Collectively, our results depict an
evaluation system that may inadvertently discourage skepticism amongst auditors in the field.
Keywords: audit, evaluation, hindsight bias, outcome effect, professional skepticism
Data Availability: Contact the authors.

I. INTRODUCTION
The level of audit quality attained on audit engagements hinges on the amount of
professional skepticism exercised by auditors (KPMG 2012). This viewpoint has led to a
renewed focus on addressing auditors' failure to exercise sufficient levels of skepticism, a
fundamentally important global audit issue (Public Company Accounting Oversight Board
(PCAOB) 2011; International Auditing and Assurance Standards Board (IAASB) 2012).1 In
order to effectively identify how to increase professional skepticism in auditors, the underlying
cause(s) of insufficient skepticism must first be understood. Highly skeptical auditors increase
the likelihood that material misstatements are detected, which is important in promoting investor
confidence and global financial stability. However, exercising skepticism may also come at a
cost (e.g., budget overruns and potential conflicts with management) when additional work is
performed to obtain sufficient and appropriate evidence (Nelson 2009; PCAOB 2012a). We
consider an unexplored barrier to skepticism: the effect of the profession's culture, as manifest
in auditor evaluations.
We experimentally test whether outcome effects exist in supervisors' evaluations of
skeptical behavior.2 Specifically, in light of the costs of skepticism, does outcome information
(i.e., whether or not a misstatement is identified) affect the evaluation of an auditor's decision to
investigate a matter as though the auditor should have "known all along" whether a misstatement
existed (e.g., Fischhoff 1975; Fiske and Beyth 1975; Blank and Nestler 2007)? We further
investigate whether consultation during the process of exercising skepticism can alleviate the
outcome effect bias. In addition, while the cost to the auditor of going over budget has been well
researched (e.g., Houston 1999), we further explore the cost of impaired management relations
by examining how corporate managers view auditors who employ professional skepticism on the
audits of their companies' financial statements. Finally, our study examines whether auditors
perceive and anticipate outcome effects in the field when they perform testing. If auditors do
perceive and anticipate this bias, there may be circumstances (e.g., investigating an ambiguous
red flag) where auditors perceive that the personal costs of being skeptical outweigh the personal
benefits, which may dissuade them from exercising appropriate levels of skepticism.
1
The International Auditing and Assurance Standards Board (IAASB) has released a staff publication to highlight
the importance of exercising professional skepticism (IAASB 2012). In addition, the IAASB's Work Plan for
2015-2016 has emphasized an interest in understanding how professional skepticism might be enhanced on audits,
particularly in relation to behavioral and training issues, as well as quality control (IAASB 2014). Similarly, the
PCAOB (2012a) has issued a Staff Audit Practice Alert to remind auditors to maintain and apply professional
skepticism on audits (PCAOB SAPA No. 10, http://pcaobus.org/Standards/QandA/12-04-2012_SAPA_10.pdf).
2
Hindsight bias is a similar phenomenon to outcome effects, as noted by Lipe (1993) and Mertins, Salbador, and
Long (2013). However, although both result from the possession of outcome knowledge and can be examined
jointly (e.g., Baron and Hershey 1988; Emby, Gelardi, and Lowe 2002), the mechanism through which the outcome
knowledge affects evaluations of decision quality differs. The direct impact of outcome knowledge on evaluations is
referred to as outcome effects, whereas the effect of outcome knowledge on the judged probability of outcomes (and
indirectly evaluations) is referred to as hindsight bias. Throughout this study, we reference the effects we examine as
outcome effects. In Section III we analyze measures to confirm the mechanism at work in our study.
In an experimental setting, practicing audit seniors were asked to evaluate the
performance of a hypothetical staff auditor on their engagement. They were provided with
information about the engagement, the staff auditor, and the staff auditor's performance on a
substantive analytical procedure related to revenue. Nelson (2009) and Hurtt, Brown-Liburd,
Earley, and Krishnamoorthy (2013) describe how exercising skepticism consists of two key
components: skeptical judgment (issue identification) and skeptical action (additional
investigation). In all conditions, the staff auditor encountered financial data consistent with the
revenue balance. However, non-financial measures (NFMs), such as number of employees and
square footage of facilities, were inconsistent with the revenue balance. The staff auditor chose
to exercise an appropriate level of professional skepticism by investigating the inconsistency in
all conditions.3 As such, we hold the staff auditor's skeptical judgment and action constant
between conditions. The investigation of the inconsistency caused the staff auditor to go over
budget and strained relations with management (i.e., the staff auditor incurred the costs of
skepticism).
We manipulated the outcome of the staff auditor's investigation (as misstatement found
versus no misstatement found) and the level of consultation with the staff auditor's superior (as
no consultation, moderate consultation, or high consultation). The no consultation condition
reflects the option to rely solely on one's own judgment rather than seek guidance. The moderate
consultation condition reflects the option to inform their supervisor about a situation and allow
the supervisor to provide guidance if needed (this amounts to keeping the supervisor in the
loop). The high consultation condition reflects the option to inform their supervisor about a
situation and get their approval before incurring the costs associated with investigating the
inconsistency. Based on the information presented in the experiment, participants were asked to
evaluate the performance of the staff auditor.
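The two manipulated factors described above can be laid out as a minimal sketch, assuming they are fully crossed so that each outcome is paired with each consultation level. The factor levels come from the text; the variable names and the tuple representation are illustrative assumptions, not part of the experimental materials.

```python
from itertools import product

# Factor levels as described in the text; the identifiers are illustrative assumptions.
OUTCOMES = ("misstatement found", "no misstatement found")
CONSULTATION = ("none", "moderate", "high")

# Assumed fully crossed design: every outcome paired with every consultation level.
conditions = list(product(OUTCOMES, CONSULTATION))

for number, (outcome, consultation) in enumerate(conditions, start=1):
    print(f"Condition {number}: outcome={outcome}, consultation={consultation}")
```

Under that assumption the design has six cells, and in every cell the staff auditor's skeptical judgment and action are the same.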
Overall, we find strong support for our prediction that the outcome of an investigation
will affect auditors' performance evaluations. Despite the fact that the staff auditor exhibited the
same skeptical judgments and actions, the outcome effect causes their evaluators to provide
lower performance evaluations to staff who do not identify a misstatement versus staff who do
identify a misstatement. Participants in all conditions indicated a strong and equal belief that the
inconsistency should have been investigated. Thus, even though evaluators are charged with
ensuring their subordinates exercise an appropriate level of skepticism and, in this case, agreed
that the matter under consideration warranted additional investigation, the staff auditors that
found no misstatement still received lower evaluations. To better understand why the outcome
(i.e., misstatement versus no misstatement found) influences participants' performance
evaluations, we test the psychological process model proposed by Lipe (1993). We observe that
the outcome of the investigation influences the perceived benefit of the investigation, which, in
turn, influences whether the evaluator frames the cost of the investigation as lost time or a
normal cost of the audit. In turn, we find that the decision frame adopted by the evaluator
influences the overall performance evaluation.
3
As discussed in Section III (see footnote 19), participants in our study believed the inconsistency should have been
investigated. The results of Brazel, Jones, and Zimbelman (2009) and Dechow, Ge, Larson, and Sloan (2011)
identify inconsistencies between financial measures and NFMs as a red flag for fraud, and a survey by Brazel, Jones,
and Prawitt (2014) of audit partners and managers finds that they rank the NFM red flag to be a higher fraud risk
than several other common red flags (e.g., high accruals, CFO turnover in the current year, management being
extremely reluctant to record any audit adjustments). The professional literature related to professional skepticism
stresses the importance of evaluating inconsistent evidence (e.g., IAASB 2004). Thus, it is reasonable to assume that
an auditor identifying and investigating the NFM inconsistency is exercising an appropriate level of professional
skepticism.
Contrary to our expectations, consulting with the superior did not effectively mitigate the
outcome effects in auditor evaluations. Although we provide some evidence that the skeptical
auditor may be better off keeping his evaluator in the loop, any positive effect associated with
consultation is overwhelmed by the large negative evaluative effect of not finding a
misstatement. The persistence of the bias despite the involvement of the superior in the process is
consistent with prior research in psychology, which finds that this bias is difficult to correct
(Guilbault, Bryant, Brockway, and Posavac 2004).
In order to confirm the costs of skepticism faced by auditors in the field, we also
performed an experiment to explore how corporate managers view auditors who exercise
professional skepticism on the audits of their financial statements. We test the same
psychological process model (Lipe 1993) and find that corporate managers, similar to audit
evaluators, view the time spent by their employees in response to a skeptical auditor as lost
time if the investigation does not identify a misstatement. In turn, managers are more likely to
convey negative information about the audit staff to the audit partner. We therefore empirically
demonstrate that a cost of skepticism is likely to be impaired management relations. Hence,
auditors who exercise appropriate professional skepticism may face consequences from both
their supervisor and client management.
Finally, we administered a survey to investigate auditors' perceptions of how the
outcome of investigating an inconsistency affects how they are evaluated. Our focus here is on
the perceptions of audit staff, not supervisors who evaluate audit staff. Two groups participated
in the survey: Masters of Accounting (MACC) students with audit experience and MACC
students with no audit experience. Participants provided responses to questions related to the two
potential outcomes of an investigation: finding a misstatement and not finding a misstatement.
These questions allow us to better understand whether any bias identified in our experimental
context is experienced and/or expected by staff performing audit tests in the field. This is an
important complement to our experiment involving audit seniors because, while the experiment
demonstrates outcome effects in the evaluation process, those outcome effects are only a barrier
to skeptical behavior to the extent staff perceive they are subject to such bias.
Consistent with our findings that outcome knowledge influences auditor evaluations, our
survey results suggest that MACC students anticipate that outcome effects will be present in the
evaluations of auditors who engage in skeptical behavior. When exercising skepticism, survey
participants reported that finding a misstatement, relative to not finding a misstatement,
influences their performance evaluations and several other related measures of evaluator and
client approval. These perceptions did not vary by experience level, suggesting that auditors are
likely aware of the potential bias in the evaluation process at the very beginning of their careers.
In their review of research on professional skepticism, Hurtt et al. (2013, 56) pose the
following empirical question: "Are the audit firms currently evaluating or rewarding skeptical
inquiries, regardless of outcomes?" Our evidence suggests that the answer is "no" and that a
primary factor leading to this condition is the presence of outcome effects in the evaluations of
skeptical behavior. While the evaluation process has been posited as both a contributor and a
threat to skepticism (Nelson 2009; Glover and Prawitt 2014), our study provides an initial
examination of the evaluation of skepticism in the context of a supervisor-subordinate
relationship.
We apply the well-known psychological tendencies related to outcome effects to an issue
of global importance and identify a substantial barrier that is likely inhibiting auditor skepticism in
the field. As such, our findings contribute to the theoretical understanding of why a lack of
skepticism may exist in some audit settings. Further, our findings demonstrate that auditors learn
quickly how they will be evaluated and likely respond to the incentives they face (e.g., engage in
less skeptical behavior in order to improve their evaluations). Our findings also demonstrate that
evaluators' frame of mind changes based on the outcomes associated with skeptical behavior.
The time spent investigating an inconsistency is considered a waste of time if no misstatement is
found, but it is considered an investment of time that produces a benefit if a misstatement is
found. Furthermore, client management appears to be subject to similar framing effects, which
likely exacerbates the problem of insufficient skepticism.
Our study should be of interest globally to accounting firms, regulators, standards setters,
and the investing community as they attempt to find ways to increase the professional skepticism
exercised by auditors in the field. Although the profession calls for more skepticism, the
underlying culture may inhibit such behavior if auditors are punished for being skeptical when it
turns out there is no misstatement. In addition, although one might think that consultation could
provide an easy fix for outcome bias in evaluations, we find that even when subordinates consult
with their supervisor prior to engaging in skeptical behavior and receive permission to proceed,
supervisors are unable to purge outcome bias from their evaluations of audit staff. Consequently,
an internal, firm-level training that makes evaluators aware of this bias (and its potential
consequences) may be more effective than a subordinate-driven solution (e.g., increased
consultations). Additionally, our findings should inform human resource policies that shape the
evaluation processes employed by audit firms.
This paper is organized as follows. Section II provides the theory and hypotheses. Section
III describes the design and results of an experiment involving experienced audit seniors. Section
IV describes the design and results of an experiment involving corporate managers. Section V
provides the design and results of a survey involving MACC students, and the final section
provides conclusions.
II. THEORY AND HYPOTHESES
Professional skepticism
Professional skepticism requires the auditor to maintain a questioning mind and critically
assess audit evidence throughout the planning and performance of an audit (PCAOB 2006;
IAASB 2004). As auditors exercise more professional skepticism, they may require more
evidence to justify their audit opinions (Nelson 2009), reducing the chance that they fail to detect
a misstatement. Auditors should not be satisfied with less-than-persuasive audit evidence
(IAASB 2004), and superiors should ensure that their subordinates exercise an appropriate level
of skepticism.
Despite the recognized importance of professional skepticism by the accounting
profession and regulators, auditors' failure to exercise a sufficient level of skepticism continues
to be a global issue (PCAOB 2008; AIU 2010; ASIC 2010; EC 2010; IAASB 2012; PCAOB
2012a). A lack of skepticism has been cited as a cause of audit failures, SEC and PCAOB
enforcement actions, and malpractice claims filed against auditors (e.g., Carmichael and Craig
1996; Beasley, Carcello, and Hermanson 2001; Anderson and Wolfe 2002; Messier, Kozloski,
and Kochetova-Kozloski 2010; PCAOB 2012b). In the current economy, enhanced skepticism
may be particularly important when auditing clients that are experiencing financial distress, as
these clients may choose to engage in fraud to mask poor operating performance (Beasley,
Carcello, Hermanson, and Neal 2010). While most parties agree that efforts to improve
professional skepticism are needed (e.g., KPMG 2012; PCAOB 2012a; PwC 2012), the
underlying causes and potential solutions for insufficient skepticism remain unclear.
The underlying cause(s) of insufficient skepticism must first be understood in order to
effectively identify how to increase professional skepticism in auditors. Most of the extant
accounting literature focuses on auditor-specific traits (e.g., knowledge, innate characteristics)
that may result in insufficient skepticism (e.g., Hurtt, Eining, and Plumlee 2011). However, it is
possible that the profession's culture may also influence the level of professional skepticism
exercised on engagements, as auditors respond to the rewards and incentives they face on
engagements (Nelson 2009). Research has yet to fully identify how firms' evaluation systems
may be either encouraging or discouraging skeptical behavior (IAASB 2008; Hurtt et al. 2013).
An unexplored barrier to professional skepticism: Outcome effect bias in auditor evaluations
Outcome effects refer to situations where the knowledge of outcomes influences
evaluators' judgments in the direction of the outcome (Tan and Lipe 1997). Although outcome
knowledge can be diagnostic, incorporating it into evaluations of decision quality effectively
biases the evaluation when the outcome is not diagnostic of the quality of decision making
(Mertins et al. 2013). The effect of outcome knowledge on evaluations has been found to be
large and pervasive across many studies (Guilbault et al. 2004; Mertins et al. 2013). Given a
costly event (e.g., medical matters, failure to meet targets, insolvency), evaluators exhibiting
outcome bias will focus on outcomes, rather than the uncertainty inherent in a decision at the
time that decision was made (e.g., Baron and Hershey 1988; Emby et al. 2002; Ghosh and Lusch
2000).
While outcome effects have not been studied in the context we examine, Emby et al.
(2002) also examined outcome effects in an accounting context. In that study, participants
assume the role of a member of a peer review team who is evaluating work performed on an
audit engagement. The experiment manipulated whether the client of the firm subject to peer
review remained solvent or became insolvent during the subsequent year. Emby et al. (2002) find
that outcome knowledge about solvency had a significant influence on the evaluation judgments
of participants, such that participants evaluate their peers more negatively when they fail to
modify an audit opinion and the client becomes insolvent, punishing them for a lack of
skepticism. We extend the work of Emby et al. (2002) in several important ways. First, we
consider the impact of outcome effects in a different accounting context, demonstrating that
outcome effects may also result in auditors being punished for exercising appropriate levels of
professional skepticism and helping to explain why auditors in the field may be hesitant to act
skeptically. In addition, we examine how client management evaluates skeptical auditors and
whether staff auditors anticipate that outcome effects will influence the evaluation of their
skeptical behavior.4
To understand how outcome effects in auditor evaluations may negatively impact
skepticism, we must consider the associated costs of skeptical behavior.5 The obvious costs
associated with not being skeptical and failing to identify a misstatement make it seem intuitive
that more skepticism is always more beneficial than less skepticism. For example, the failure to
identify a material misstatement in the financial statements may result in a restatement. In turn,
the audit firm may lose the client, the auditor may lose their job, and investors may bring
lawsuits against the firm (Nelson 2009).
The costs associated with being skeptical, on the other hand, are less obvious. In the
current economic environment, companies face increased pressure to reduce expenses, including
the fees they pay to the audit firm. Public disclosure of audit fees has also created downward
pressure on audit fees across industries (Reason 2010). The decreasing trend in audit fees may, in
turn, put pressure on audit firms to cut costs and/or increase the efficiency of their audits in order to
maximize profitability.
Highly skeptical auditors may decrease the risk to the firm by reducing the risk that
material misstatements will go undetected, but their skepticism may result in additional inquiries
or procedures and budget overruns that the audit firm may not be able to recover. Additionally,
performing unplanned or atypical audit procedures (compared to prior year audits) may anger or
frustrate the client, as they are required to respond to unanticipated inquiries and evidence
requests. While either of these costs may lead to consequences for the firm (e.g., less profitability
and/or a strained relationship with the client), they may also lead to direct consequences for the
staff auditor.
4
There are a number of important differences between our study and Emby et al. (2002). For example, they focus on
whether outcome knowledge causes partners to evaluate a peer more negatively when the peer appears to lack
sufficient skepticism. We focus on whether outcome knowledge causes a senior to evaluate a staff auditor more
negatively when the staff auditor is appropriately skeptical. Despite these differences, there is one general message
that our study and Emby et al. (2002) jointly suggest: the way in which professionals respond to outcome
knowledge and evaluate skepticism may critically depend on the context of the evaluation. That is, skepticism may
be evaluated positively in one context and negatively in another.
5
Although prior accounting research has found that evaluators in managerial positions (e.g., Brown and Solomon 1987;
Brown and Solomon 1993; Lipe 1993; Frederickson, Peffer, and Pratt 1999), judge/juror situations (Anderson,
Jennings, Lowe, and Reckers 1997; Kadous 2000), and peer reviews (Emby et al. 2002) are impacted by outcome
knowledge, extant accounting literature has not contemplated the effects of outcome knowledge as an inhibitor of
professional skepticism as we do in this study. See Mertins et al. (2013) for a review of the outcome effects
literature in accounting.
Professional skepticism is a behavior that, although encouraged by the profession, does
not always produce the same outcome (e.g., sometimes it leads to the identification of a
misstatement and other times it may not). Consider a situation where an auditor observes a red
flag or inconsistency when assessing audit evidence and exercises an appropriate level of
skepticism by performing additional testing. Conducting an investigation would be consistent
with exercising professional skepticism; however, it requires added effort from the auditor and
the client and does not ensure that a misstatement will be found. It is possible that the
investigation leads to an acceptable explanation for an unusual pattern of facts observed, such
that no audit adjustment is necessary. In short, the auditor has incurred the costs associated with
skepticism (went over budget and/or strained client relations), but has not experienced the
benefit of identifying a misstatement.6
When assessing an auditor's decision making (e.g., evaluating their skeptical
judgments and actions), the outcome of the investigation is not diagnostic of decision
quality given the ex ante uncertainty in outcome and the observable nature of the
auditor's decision process (Fisher and Selling 1993). The appropriate criterion for
evaluating the auditor is not what they found, but what the evidence suggested they might
find (Lipshitz and Barak 1995; Mertins et al. 2013). Still, research on outcome effects
suggests that when costs are incurred, an auditor's performance evaluation may be
influenced more by the outcome of their skeptical behavior (i.e., whether or not a
misstatement is found) than by whether they engaged in the appropriate level of skeptical
behavior (i.e., appropriately identified and investigated a red flag or inconsistency). As
such, skeptical judgments and acts that incur costs, but identify no misstatement, may be
viewed less favorably by superiors. On the other hand, skeptical behavior that results in
the detection of a misstatement will likely be viewed more favorably by superiors as there
is a tangible benefit from the investigation. That is to say, when a misstatement is found,
the end justifies the means.
6
Our discussions with auditors from countries outside the U.S. and with the IAASB suggest that the main cost of
professional skepticism in non-U.S. settings is likely a delayed audit committee meeting and/or a delayed audit
report. This situation brings up the interesting question of whether non-U.S. auditors are more apt to work additional
days when exercising skepticism versus U.S. auditors simply working substantial overtime to not spend additional
days when exercising skepticism.
Drawing on the theory of outcome effects, and in light of the practical costs associated
with exercising professional skepticism, we expect that the skeptical auditor that fails to find a
misstatement will be penalized, whereas the skeptical auditor that identifies a misstatement will
not. In other words, evaluators of auditors will fall prey to the outcome effect bias and give lower
evaluations to auditors that do not find a misstatement, even though both auditors encountered
the same circumstances and exercised the same level of professional skepticism. Hypothesis 1,
stated in alternative form, is as follows:
H1: Superiors will evaluate skeptical auditors more negatively (positively) when they do
not (do) identify a misstatement.
Auditor consultation: Removing the bias of outcome effects from the evaluation process
If Hypothesis 1 is supported and superiors' evaluations of skeptical auditors are biased by
outcome effects, then potential solutions to debias evaluations are a relevant consideration. The
accounting literature suggests that a primary source of outcome effects stems from cognitive
reconstruction that occurs during the evaluation process (e.g., Brown and Solomon 1993).
Evaluators start with the outcome knowledge and work backwards to connect the causal links
that led to the outcome, increasing the salience of outcome-congruent information (Mertins et
al. 2013). As mentioned above, when there is a perceived benefit as a result of incurring costs,
Hypothesis 1 predicts that evaluators will not penalize the decision maker for incurring the costs.
However, when there is no benefit to match with the costs, the decision maker may be penalized
as cognitive reconstruction increases the perceived inevitability of the outcome, making it seem
as though the decision maker should have known it all along.
Accordingly, one potential solution to debias superiors' evaluations of skeptical auditors
would be to involve the evaluator in the decision-making process in order to reduce the
perceived inevitability of the outcome should the outcome be negative. Auditors can do this
through informal consultations with their superior before engaging in the skeptical action.7
Although the frequency of consultation varies considerably in practice, Glover and Prawitt
(2014) suggest that creating a culture of consultation may enhance the appropriate application
of skepticism in the field.8 In addition to reducing the perceived inevitability of the outcome
(Brown and Solomon 1987), consulting a superior might also mitigate outcome effects by
reducing information asymmetry and, thus, decreasing the superior's perception of the
diagnosticity of the outcome information (Fisher and Selling 1993; Tan and Lipe 1997).9
7
Recall that the literature on professional skepticism makes a distinction between the skeptical judgment (i.e.,
identifying inconsistencies) and the skeptical action (i.e., investigating inconsistencies) (Nelson 2009). We are
suggesting that consultation between the skeptical judgment and the skeptical action may reduce outcome effects in
the evaluation.
8
Depending on the circumstances, an auditor may or may not choose to consult with their supervisor before
investigating an inconsistency. When confronted with the question of whether or not auditors routinely engage in
consultation, we had the opportunity to insert a question into the post-experimental questionnaire of a separate study.
Of the 104 senior auditors who were given an analytical procedure task similar to that used in this study, 52 percent
reported that they would not consult with their manager or would consult with their manager after performing the
investigation.
Negative outcomes may result in poor evaluations of auditors' skeptical behavior because
the superiors perceive the skeptical action to be unnecessary or inconsistent with the audit plan.
In this case, involving the superior in the decision-making process through informal consultation
may also increase the appraisal of the auditor's performance through a shift in responsibility. If
the auditor engages in a course of action agreed upon by the superior, then, consistent with
escalation of commitment theory (Staw 1976), the superior has motivation to look upon skeptical
behavior more favorably (Brown and Solomon 1993). Accordingly, we predict that consulting
with the superior prior to engaging in skeptical action will reduce outcome effects in auditor
evaluations. Hypothesis 2, stated in alternative form, is as follows:
H2: When subordinate auditors consult with their superiors during the course of
exercising skepticism, the outcome effect in auditor evaluations is reduced.
Auditor awareness of outcome effects in evaluations
In Hypothesis 1, we predict that outcome effects will result in evaluators penalizing
skeptical behavior when no misstatement is found. The immediacy of this negative consequence
may affect staff behavior in the face of uncertain outcomes. An auditor in the field may perceive
that they would be better off not being skeptical when the likelihood of detecting a misstatement
is uncertain. Glover and Prawitt (2014) posed this as a possible threat to professional skepticism
at the individual auditor and firm level. This reasoning also suggests that audit firms' evaluation
and reward systems may be inadvertently discouraging skepticism amongst auditors in the field.
9
This conjecture is consistent with Wu, Shimojo, Wang, and Camerer (2012) who observe in a visual exercise that
sharing information with evaluators about the process by which subordinates made their decisions can reduce
hindsight bias (similar to outcome effects) in the evaluation process. Specifically, when evaluators followed the
same gaze patterns in evaluating images as their subordinates, their evaluations exhibited lower hindsight bias.
In order for outcome effects in the evaluation process to affect the level of professional
skepticism in the field, auditors must perceive that their evaluators are likely to exhibit this bias
when evaluating skeptical behavior. In other words, they must believe ex ante that their behavior
will be evaluated differently depending on the outcome. In other settings, the anticipation of
outcome effects leads to overregulation, defensive medicine, and a refusal to litigate high-risk
cases (Studdert et al. 2005; Harley 2007; Terry 2011). From the audit perspective, such
anticipation may cause auditors in the field to avoid investigating red flags/inconsistencies unless
they are obvious or the identification of a misstatement is a sure thing.
We previously noted that reward and evaluation systems make up an important part of an
organization's culture and norms (Schneider, Ehrhart, and Macey 2013). Prior literature has
recognized that employees respond to and act in accordance with organizational tone (Treadway
1987). This conclusion is consistent with research related to self-monitoring, the process by
which individuals learn how they are evaluated (by what criteria) and anticipate how evaluators
will react to their behavior (Fiske 2009). If outcome effects frequently occur in the performance
evaluation systems of audit firms (as Hypothesis 1 suggests), then auditors in the field will
become aware that such a bias exists, expect it, and adapt to the bias. We predict that auditors in
the field perceive that outcome effects influence how their skeptical behavior is evaluated in the
following manner: skeptical behavior that identifies a misstatement is considered acceptable and
skeptical behavior that identifies no misstatement is punished. Formally, we hypothesize:
H3: Auditors perceive that their skeptical behavior is evaluated more negatively
(positively) when they do not (do) identify a misstatement.
III. AUDITOR EXPERIMENT
Purpose
The primary purposes of this experiment were to test (1) whether audit seniors evaluate
skeptical staff auditors more negatively (positively) when they do not (do) identify a
misstatement (Hypothesis 1) and (2) whether the outcome effect predicted by Hypothesis 1 is
reduced (or eliminated) when staff auditors consult with their supervisors during the course of
exercising skepticism (Hypothesis 2). The secondary purpose of this experiment was to test a
psychological process model that may help explain why the outcome effect exists in our setting.
Participants
The participants consisted of 96 audit seniors from an international accounting firm. We
administered the experiment while participants attended a training session. On average, our
participants had 30 months of audit experience and had conducted three performance evaluations
of staff auditors under their supervision.10
Description of experimental context
The experimental materials placed participants in the position of the lead senior on the
hypothetical audit engagement of Madison, Inc., a publicly traded manufacturing company with
multiple divisions. Participants learned that Madison had been an audit client for ten years and
had received an unqualified audit opinion each year. The materials also provided information
about the auditor-client relationship, including: (1) Madison was a large audit client in the office,
(2) the budget was very tight and there was pressure to keep fees down, (3) there had been few
10
There were 30 participants who had no experience evaluating staff auditors under their supervision. The likely
reason for some participants not having such experience is that the experiment was administered shortly after
promotions to senior at the participating audit firm. Our inferences and conclusions are not influenced by removing
participants who had no experience evaluating staff auditors. See also footnote 18 for information about how similar
results were obtained in an earlier experiment using more experienced participants.
historical audit adjustments, and (4) Madison expected the audit to run smoothly and asked for
explanations when the nature, timing, and extent of audit procedures changed.
Participants were informed that their task was to evaluate the performance of a third-year
staff member, Sam, who worked under their supervision. Sam's responsibilities on the audit
included, among other tasks, performing substantive analytical procedures related to the revenue
account for the Sporting Goods division of Madison.11 In the past, analytical procedures had
incorporated prior year Madison financial information and industry financial trends to develop an
expectation for the division's revenues.
The literature related to professional skepticism stresses the importance of evaluating
inconsistent evidence (e.g., IAASB 2004). In our experimental setting, the information sources
used in prior year testing (Madison's own past financial performance and industry financial data)
were consistent with the revenue account reported by the division in the current year. If Sam
sought confirming or consistent evidence, he would have likely used the same information
sources as the prior year. Non-financial measures (NFMs) for Madison, such as number of
employees and square footage of facilities, were not considered in prior years. In all conditions,
participants were informed that Sam incorporated NFMs into his analytical procedures in the
current year and noticed an inconsistency between the revenue account and related NFMs.12
Thus, Sam's decision to collect and consider alternative, inconsistent evidence by itself can be
considered skeptical behavior. In other words, if Sam had simply repeated the testing performed
in the prior year, then the inconsistency would not have been identified.
11
To be clear, participants were not asked to perform analytical procedures themselves. Instead, they were asked to
evaluate the performance of an audit staff member who had performed those procedures.
12
Post-experimental questions indicated that (1) 89 percent of participants had performed analytical procedures
related to revenues, (2) 70 percent of participants had used NFMs at least once when performing analytical
procedures related to revenues (the number of times averaged 4.77 and ranged from zero to 100), and (3) 52 percent
had reviewed workpapers documenting analytical procedures related to revenues. These responses suggest that the
experimental context was reasonably well known to participants.
Sam then chose to investigate the inconsistency, which is a manifestation of professional
skepticism. As such, we hold the staff auditor's skeptical judgment (issue identification) and
skeptical action (additional investigation) constant between conditions. The investigation of the
inconsistency caused Sam to go over budget and strain relations with management (i.e., incur the
costs of skepticism as described by Nelson 2009) in all conditions.
Manipulated variables
The first manipulated variable was whether Sam's investigation uncovered a
misstatement (this variable is referred to as OUTCOME below).13 In the no misstatement
condition, participants were told the following:
Sam found that the inconsistency described above was a result of the Sporting Goods
division outsourcing some operations overseas. Sam made several inquiries into the
matter and collected additional audit evidence, which eventually led to a conclusion that
there were no misstatements in this revenue account.
In the misstatement condition, participants were told the following:
Sam found that the inconsistency described above was a result of the Sporting Goods
division outsourcing some operations overseas. Sam made several inquiries into the
matter and collected additional audit evidence, which eventually led to a conclusion that
a significant misstatement existed in this revenue account as revenues were being
recognized prematurely at the overseas operation.
This variable reflects a common feature of the real-world audit environment: the exercise of
professional skepticism may or may not result in the identification of a misstatement. Knowing
this, auditors may attempt to mitigate an adverse evaluation by consulting with their supervisor
prior to investigating evidence inconsistencies. Investigating inconsistencies takes time for the
auditor and client, so some auditors may consult with their supervisor before doing so. See
footnote 8 for information about how such consultation varies between auditors in practice.
13
In the actual research instrument, parts of the manipulated variables were shown in red to make them more salient
to participants. We italicize the specific parts of the manipulations that were shown in red.
The second independent variable manipulated consultation at three levels (this variable is
referred to as CONSULT below). In the no consultation condition, participants were told the
following:
Without consulting with you, Sam chose to investigate the inconsistency between the
growth in Sporting Goods revenues and the decreases in the number of employees and
production space instead of relying on other sources like the industry trends that support
the reported revenue growth, which is what was done in prior years.
In the moderate consultation condition, participants were told the following:
Sam consulted with you about investigating the inconsistency between the growth in
Sporting Goods revenues and the decreases in the number of employees and the square
footage of production facilities. Your response to Sam's consultation about investigating
the inconsistency was "use your professional judgment." Sam chose to investigate the
inconsistency between the growth in revenues and the decreases in the number of
employees and production space instead of relying on other sources like the industry
trends that support the reported revenue growth, which is what was done in prior years.
In the high consultation condition, participants were told the following:
Sam consulted with you about investigating the inconsistency between the growth in
Sporting Goods revenues and the decreases in the number of employees and the square
footage of production facilities. Your response to Sam's consultation about investigating
the inconsistency was "I approve of you investigating the inconsistency." After receiving
your approval, Sam investigated the inconsistency between the growth in revenues and
the decreases in the number of employees and production space instead of relying on
other sources like the industry trends that support the reported revenue growth, which is
what was done in prior years.
We manipulated consultation at three levels because each level reflects a different option open to
auditors in practice.14 The no consultation condition reflects the option to rely solely on one's
own judgment rather than seek guidance. The moderate consultation condition reflects the option
to inform their supervisor about a situation and allow the supervisor to provide guidance if
needed (this amounts to keeping the supervisor in the loop). The high consultation condition
14
The experimental materials used in this study were thoroughly reviewed by partners from the participating firm to
ensure that the materials were realistic and the task was reasonable for participants. We are especially appreciative
of the feedback we received in relation to our consultation manipulation.
reflects the option to inform their supervisor about a situation and get their approval before
incurring the costs associated with investigating the inconsistency.15
Dependent variable
Our dependent measure was participants' evaluation of Sam, which was elicited using the
following question:
Based on the information presented on the prior pages, how would you evaluate Sam's
overall performance?
Participants responded on an 11-point response scale ranging from -5 to +5, with the left
endpoint labeled "Below Expectations," the right endpoint labeled "Above Expectations," and
the midpoint labeled "Met Expectations."
Discussions with representatives at the participating firm revealed an important cultural
factor related to performance evaluations at the firm. The firm's performance evaluations are
positively skewed. In practice, the majority of auditors are evaluated as performing "Above
Expectations." Auditors are evaluated as "Met Expectations" far less often, and only a small
number are evaluated as falling "Below Expectations."16 To confirm this anecdotal information,
as part of a separate study, we asked 62 audit seniors where they would rank in their peer group
if they consistently received "Met Expectations" ratings. Of the 62 audit seniors, approximately
76 percent indicated that they would be in the bottom half of their peer group. Given this cultural
factor, evaluations indicating that Sam has met expectations should be viewed as an indication of
relatively poor performance.
15
We had manipulation checks for both OUTCOME and CONSULT. Only four participants failed the OUTCOME
manipulation check (92 answered it correctly) and only seven participants failed the CONSULT manipulation check
(89 participants answered it correctly). None of our inferences or conclusions change when participants who failed
either manipulation check are excluded from our analyses.
16
The participating firm deals with evaluation inflation by ranking auditors relative to their peer group at the end of
each year. Auditors whose evaluations indicate that they "Met Expectations" are typically ranked low.
Primary results
Table 1 reports means for the dependent variable in our six experimental conditions and
Figure 1 graphs those means. Visual inspection of Figure 1 suggests that OUTCOME may have a
large influence on participants' performance evaluations, consistent with Hypothesis 1. Contrary
to Hypothesis 2, Table 1 and Figure 1 reveal that when no misstatement is found compared to
when a misstatement is found, Sam is evaluated less favorably at every level of consultation
(0.35 versus 2.61 in the no consultation condition, 1.69 versus 3.18 in the moderate consultation
condition, and 1.19 versus 2.67 in the high consultation condition).
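As a rough illustration of this comparison, the outcome effect at each consultation level can be expressed as the gap between the misstatement and no misstatement cell means quoted above. The sketch below is not part of the original analysis; the dictionary layout and the function name are assumptions made purely for illustration.

```python
# Cell means for the performance evaluation, as quoted in the text above,
# stored as (no misstatement, misstatement). The layout is an illustrative assumption.
CELL_MEANS = {
    "no consultation": (0.35, 2.61),
    "moderate consultation": (1.69, 3.18),
    "high consultation": (1.19, 2.67),
}


def outcome_effects(cell_means):
    """Return the misstatement minus no-misstatement mean for each consultation level."""
    return {
        level: round(found - not_found, 2)
        for level, (not_found, found) in cell_means.items()
    }


if __name__ == "__main__":
    for level, gap in outcome_effects(CELL_MEANS).items():
        print(f"{level}: evaluation gap = {gap:.2f}")
```

The gap is positive at every level of consultation, which is the pattern the text describes: Sam is rated lower whenever the investigation finds no misstatement.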
Insert Figure 1 about here.
Insert Table 1 about here.
Table 2 provides the analysis of variance (ANOVA) results to test Hypotheses 1 and 2.
The ANOVA results in Panel A reveal that the overall performance evaluation is strongly
influenced by OUTCOME (F-statistic = 25.74, p-value < 0.001), which supports Hypothesis 1
(Panel B is discussed below). The ANOVA also reveals that the overall performance evaluation
is only weakly influenced by CONSULT (F-statistic = 2.59, p-value = 0.081). The interaction
between OUTCOME and CONSULT is not significant (F-statistic = 0.59, p-value = 0.557),
which implies that the outcome effect is not mitigated by consultation. Hypothesis 2 is not
supported. Even when Sam kept his evaluator in the loop or received approval for the
investigation, he was still penalized if he did not identify a misstatement.17, 18
17
We use the phrase "penalized" because an evaluation of "Met Expectations," as noted above, ranks the auditor in
the bottom half of their peer group. Also, in the no misstatement condition approximately 20 percent of participants
gave Sam an overall performance evaluation of below "Met Expectations," while in the misstatement condition
approximately 4 percent of participants gave Sam an overall performance evaluation of below "Met Expectations."
18
We performed an earlier experiment that was very similar to the no consultation condition and manipulated audit
committee support rather than consultation. The expectation was that higher (versus lower) audit committee support
would insulate the auditor from the costs of skepticism (budget overruns and strained management relations) and
alleviate the outcome effect. We found that the audit committee support manipulation did not mitigate the outcome
CONSULT is a three-level variable and the ANOVA could obscure a significant
difference between levels of CONSULT. The marginal means for CONSULT in Table 1 are 1.51
(no consultation), 2.38 (moderate consultation), and 1.90 (high consultation). The marginal mean
for the no consultation condition is lower than the marginal mean for the moderate consultation
condition (non-tabulated t-statistic = 1.79, p-value = 0.08), while the marginal mean for the no
consultation condition is not significantly different than the marginal mean for the high consultation
condition (non-tabulated t-statistic = 0.81, p-value = 0.42). Similarly, the marginal mean for
moderate consultation is not significantly different than the marginal mean for high consultation
condition (non-tabulated t-statistic = 1.08, p-value = 0.29).
Although we provide some modest evidence that the skeptical auditor may be better off
keeping his evaluator in the loop, some caution is warranted. Visual inspection of Figure 1
suggests that OUTCOME has a comparatively large effect on participants' overall performance
evaluations and that CONSULT has a comparatively small effect. To formally evaluate the
treatment magnitude, we follow the guidance of Tabachnick and Fidell (2001) and examine η²
(not tabulated). The treatment magnitude of OUTCOME is over four times greater than the
treatment magnitude of CONSULT. Thus, any positive effect associated with consultation is
overwhelmed by the large negative evaluative effect of not finding a misstatement.19
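The relative treatment magnitudes can be approximated from the F-statistics reported for Panel A. The sketch below is a back-of-the-envelope check under stated assumptions, not the authors' non-tabulated computation: it assumes a full-factorial ANOVA with 96 participants in six cells (hence 90 error degrees of freedom), 1 degree of freedom for OUTCOME, and 2 for CONSULT. Because both effects are tested against the same error mean square, the ratio of classical η² values equals the ratio of F times its effect degrees of freedom, while partial η² can be recovered as F·df_effect / (F·df_effect + df_error).

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta-squared recovered from an F-statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


def eta_squared_ratio(f_a, df_a, f_b, df_b):
    """Ratio of classical eta-squared for two effects tested against the same error term."""
    return (f_a * df_a) / (f_b * df_b)


# Reported F-statistics; the degrees of freedom are ASSUMED (96 participants, 6 cells).
DF_ERROR = 96 - 6
OUTCOME_F, OUTCOME_DF = 25.74, 1
CONSULT_F, CONSULT_DF = 2.59, 2

if __name__ == "__main__":
    p_out = partial_eta_squared(OUTCOME_F, OUTCOME_DF, DF_ERROR)
    p_con = partial_eta_squared(CONSULT_F, CONSULT_DF, DF_ERROR)
    ratio = eta_squared_ratio(OUTCOME_F, OUTCOME_DF, CONSULT_F, CONSULT_DF)
    print(f"partial eta^2: OUTCOME={p_out:.3f}, CONSULT={p_con:.3f}, ratio={p_out / p_con:.2f}")
    print(f"classical eta^2 ratio: {ratio:.2f}")
```

Under these assumptions both versions of the ratio exceed four, which is in line with the statement that the treatment magnitude of OUTCOME is over four times that of CONSULT.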
effect on evaluations. However, the lack of mitigation may have been due to a mismatch between the level of
participants in the earlier experiment (seniors) and the level of auditor that deals directly with the audit committee
(managers and partners), amongst other possible explanations. Participants in the earlier experiment (vis-à-vis the
experiment reported here) had more general experience (mean of 45 months of experience) and were more
experienced with evaluations (only 16 percent of the sample had never performed an evaluation). Importantly, the
mean evaluations obtained in the earlier experiment in the no misstatement and misstatement conditions (0.85 and
2.61, respectively) were very similar to those reported in Table 1 for the no consultation condition.
19
Our instrument holds constant that Sam collects and uses NFMs and our objective is not to determine whether
NFMs should or should not be collected. However, in addition to our primary dependent variable (Sam's
One potential reason for the modest effect of CONSULT on participants' performance
evaluations is that high consultation appears to have unintended consequences for staff. Notice
that the mean performance evaluations in the high consultation conditions are slightly lower than
the mean performance evaluations in the moderate consultation conditions. To gain insights into
why the performance evaluations are slightly lower, we examined qualitative data obtained from
participants after they responded to the performance evaluation question. Specifically, we asked
participants to "briefly describe your rationale for this rating." In the high consultation condition,
four participants (13 percent) indicated that Sam should have been more capable of making an
independent judgment. In the moderate consultation condition, no similar views were expressed
by participants. Aside from these critical sentiments, we also observed positive statements
related to the exercise of appropriate skepticism. In the high consultation condition, only 13
participants (42 percent) praised Sam for exercising appropriate skepticism. However, in the
moderate consultation condition, 25 participants (81 percent) praised Sam for exercising
appropriate skepticism. It appears that the evaluation-related benefits of high consultation may
have limits in terms of insulating auditors from outcome bias. Obtaining a superior's approval
may actually backfire when staff auditors are perceived to use consultation as a substitute for
their own professional judgment.
evaluation), we did collect participants' perceptions of Sam's skeptical judgment (whether Sam should have
collected the information related to the number of employees and production space) and skeptical act (whether Sam
should have investigated the inconsistency) to determine whether supervisors second-guess Sam's judgment/action
depending on the outcome of the investigation. Once Sam identifies the inconsistency, all evaluators want it
investigated (regardless of outcome). However, the outcome of the investigation (whether or not a misstatement is
identified) and whether Sam consulted his superior interact to affect how evaluators perceived Sam's skeptical
judgment (whether Sam should have collected the NFMs) (non-tabulated F-statistic = 3.54, p-value = 0.03). In short,
if Sam does not consult and the outcome is that no misstatement is identified, then the evaluator questions whether Sam
should have searched for the inconsistent NFMs in the first place (his initial skeptical judgment that identified the
inconsistency).
This discussion suggests that the ANOVA results in Panel A of Table 2 may change to
some degree if the high consultation condition is omitted. Excluding the high consultation
condition, Panel B of Table 2 reveals that the overall performance evaluation is again strongly
influenced by OUTCOME (F-statistic = 19.47, p-value < 0.001). The ANOVA also reveals that
the overall performance evaluation is now significantly influenced by CONSULT (F-statistic =
5.01, p-value = 0.029), whereas CONSULT was only marginally significant previously. The
interaction between OUTCOME and CONSULT continues to be insignificant (F-statistic = 0.81,
p-value = 0.370), which implies that, although consultation may help to some extent, outcome
bias is not mitigated by either form of consultation in our context.
Additional analyses
To better understand why OUTCOME influences participants' performance evaluations,
we utilize data obtained from our auditor experiment to test the psychological process model
proposed by Lipe (1993).20, 21 Although Lipe (1993) conceived of the process model in a
managerial accounting context, it likely applies to our setting as it demonstrates that ex post
knowledge of a decision outcome may change the frame in which the costs of an investigation
20
Alternatively, the effect of OUTCOME on EVAL could have been mediated by how evaluators felt about (1)
Sam's decision to collect and incorporate NFMs into his analytical procedure, (2) Sam's decision to investigate the
inconsistency (Sam's skeptical judgment and action (Nelson 2009)), (3) the likelihood that a misstatement was
present (based on the information available to Sam prior to his investigation), or (4) how important NFMs should
have been to Sam when performing his analytical procedure (Emby et al. 2002). Non-tabulated mediation analyses
do not find that any of these factors mediate the relation between OUTCOME and EVAL.
21
Given that the phenomena of hindsight bias and outcome effects are so closely related (Lipe 1993), we also
perform an analysis to determine which of these phenomena we were observing. Recall the difference between the
two relates to the underlying mechanism through which outcome knowledge impacts judgments. Hindsight bias
describes situations where outcome knowledge indirectly influences judgments by making an outcome appear more
probable, whereas outcome effects occur when outcome knowledge impacts judgments without revising perceived
probabilities of the outcome. To determine which is at work in our study, we analyze the responses to the following
question: "Based on the information available to Sam prior to his investigation, how likely is it that a misstatement
was present in the Sporting Goods Division's revenue account?" If hindsight bias is operating, we would expect the
response to this question to differ between the misstatement and no misstatement conditions. That is, evaluators falling prey to hindsight
bias would believe a misstatement was more (less) likely present in the account if a misstatement was (was not)
found. However, we do not find a significant difference between conditions (non-tabulated t-statistic = 0.25, p-value
= 0.80). Thus, the effect we are observing is most likely an outcome effect (rather than hindsight bias).
are viewed and, in turn, bias an evaluator's judgments about a subordinate's performance. Figure
2 provides the three-path mediation model. In this model, the outcome of the investigation
(OUTCOME) influences the perceived benefit of the investigation (BENEFIT), which, in turn,
influences whether the evaluator frames the cost of the investigation as lost time or a normal cost
of the audit (FRAME). In turn, the decision frame adopted by the evaluator may influence the
overall performance evaluation (EVAL). Figure 2 also contains two interior paths.
We test this three-path mediation model by estimating the regressions described in
Taylor, MacKinnon, and Tein (2008). These regressions are specified as follows:
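\begin{align}
\mathit{EVAL} &= \alpha_1 + \beta_1\,\mathit{OUTCOME} + \varepsilon \tag{1} \\
\mathit{BENEFIT} &= \alpha_2 + \beta_2\,\mathit{OUTCOME} + \varepsilon \tag{2} \\
\mathit{FRAME} &= \alpha_3 + \beta_3\,\mathit{OUTCOME} + \beta_4\,\mathit{BENEFIT} + \varepsilon \tag{3} \\
\mathit{EVAL} &= \alpha_4 + \beta_5\,\mathit{OUTCOME} + \beta_6\,\mathit{BENEFIT} + \beta_7\,\mathit{FRAME} + \varepsilon \tag{4}
\end{align}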
EVAL is participants' performance evaluation (defined above); OUTCOME is the outcome of
Sam's investigation, which was manipulated between participants as either (1) Sam found that
there was no misstatement (coded as 1) or (2) Sam found that there was a significant
misstatement (coded as 0); BENEFIT is measured as participants' responses to the question "Do
you feel that the audit team got some benefit from the time that Sam spent to investigate the
inconsistency between the growth in revenues and the nonfinancial measures?" (responses are
provided on an 11-point scale with the left endpoint labeled "there was no benefit" and the right
endpoint labeled "there was a benefit"); FRAME is measured as participants' responses to the
question "Do you view the time that Sam spent investigating the inconsistency between the
growth in revenues and the nonfinancial measures as lost time or a normal cost of an
audit?" (responses are provided on an 11-point scale with the left endpoint labeled "lost time"
and the right endpoint labeled "normal cost").
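To make the estimation steps concrete, the following is a minimal sketch of how the four regressions could be fit by ordinary least squares and the total effect of OUTCOME decomposed into a direct effect and three indirect paths. It is an illustration under stated assumptions, not the authors' code: the data are synthetic (every coefficient in `simulate` is invented), the helper names are hypothetical, the variables `b1` through `b7` stand for the seven slope coefficients of Equations (1) through (4) in order, and the bootstrap significance test is not implemented.

```python
import random


def ols(y, predictors):
    """Ordinary least squares of y on an intercept plus the predictor columns.

    Returns [intercept, slope_1, ..., slope_k], solved from the normal equations.
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Build the augmented normal-equation matrix [X'X | X'y].
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, k):
            factor = aug[r][c] / aug[c][c]
            for j in range(c, k + 1):
                aug[r][j] -= factor * aug[c][j]
    # Back substitution.
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        tail = sum(aug[r][j] * coef[j] for j in range(r + 1, k))
        coef[r] = (aug[r][k] - tail) / aug[r][r]
    return coef


def three_path_mediation(outcome, benefit, frame, evaluation):
    """Estimate Equations (1)-(4) and decompose the total effect of OUTCOME."""
    _, b1 = ols(evaluation, [outcome])                          # Eq. (1): total effect
    _, b2 = ols(benefit, [outcome])                             # Eq. (2)
    _, b3, b4 = ols(frame, [outcome, benefit])                  # Eq. (3)
    _, b5, b6, b7 = ols(evaluation, [outcome, benefit, frame])  # Eq. (4): direct effect
    return {
        "total": b1,
        "direct": b5,
        "three_path": b2 * b4 * b7,   # OUTCOME -> BENEFIT -> FRAME -> EVAL
        "via_benefit_only": b2 * b6,  # interior path through BENEFIT alone
        "via_frame_only": b3 * b7,    # interior path through FRAME alone
    }


def simulate(n=94, seed=0):
    """Generate SYNTHETIC data; every coefficient here is invented for illustration."""
    rng = random.Random(seed)
    outcome = [i % 2 for i in range(n)]  # 1 = no misstatement, 0 = misstatement
    benefit = [8.0 - 3.0 * o + rng.gauss(0, 1) for o in outcome]
    frame = [2.0 + 0.6 * b - 0.5 * o + rng.gauss(0, 1) for o, b in zip(outcome, benefit)]
    evaluation = [
        -1.0 - 1.0 * o + 0.2 * b + 0.2 * f + rng.gauss(0, 1)
        for o, b, f in zip(outcome, benefit, frame)
    ]
    return outcome, benefit, frame, evaluation


if __name__ == "__main__":
    effects = three_path_mediation(*simulate())
    for name, value in effects.items():
        print(f"{name}: {value:+.3f}")
```

With least-squares estimates on a common sample, the total effect equals the direct effect plus the sum of the three indirect paths, so the difference between the OUTCOME coefficients in Equations (1) and (4) captures everything transmitted through BENEFIT and FRAME.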
Table 3 provides regressions for Equations (1) through (4) from which the path estimates
in Figure 2 are obtained.22 Notice that the coefficient for OUTCOME in Equation (1) is -1.71,
the coefficient for OUTCOME in Equation (4) is -1.15, and the mediated effect is 0.56 (the
mediated effect is the difference between the total effect and the direct effect). The proportion of
the total effect that is mediated is approximately 33 percent, and the mediated effect shown in
Figure 2 is statistically significant (p-value < 0.05).23, 24 Thus, the three-path mediation model
shown in Figure 2 provides support for the descriptive validity of the Lipe (1993) model in our
context. The evaluation penalty that occurs when no misstatement is found arises because
evaluators perceive that there is no benefit with which to match the cost of the investigation. The
absence of a benefit causes evaluators to view the time devoted to the investigation as lost time
rather than a normal cost of an audit. In turn, the low perceived benefit and loss frame both
contribute to the comparatively low performance evaluations for auditors who investigate an
inconsistency, but fail to find a misstatement.
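The arithmetic behind the mediated share can be checked directly from the two OUTCOME coefficients quoted above. This is a small illustrative sketch with a hypothetical function name; it only reproduces the subtraction and division described in the text and does not re-estimate anything.

```python
def mediated_share(total_effect, direct_effect):
    """Return (mediated effect, share of the total effect that is mediated)."""
    mediated = total_effect - direct_effect
    return mediated, mediated / total_effect


if __name__ == "__main__":
    # Coefficients on OUTCOME quoted in the text for Equations (1) and (4).
    mediated, share = mediated_share(total_effect=-1.71, direct_effect=-1.15)
    print(f"mediated effect = {mediated:.2f}, share mediated = {share:.1%}")
```

Because OUTCOME is coded 1 when no misstatement is found, the computed difference is negative (-0.56); the text reports its magnitude, and the share works out to roughly one third of the total effect.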
IV. CORPORATE MANAGER EXPERIMENT
Purpose
The three-path mediation model in Section III explores whether and why outcome bias
occurs when supervisors evaluate an audit staff member who acts skeptically by identifying and
22
Our mediation analyses use 94 participants rather than the 96 participants used in the ANOVA because two
participants did not respond to all of the questions.
23
We estimate the significance of the mediated effect using Hayes' (2013) bootstrap procedure.
24
When specifying the mediation model shown in Figure 2 (and the related mediation model shown in Figure 3), we
recognize that there are almost certainly multiple causal variables involved, only some of which we could hope to
identify and measure. Beyond the possibility of unmeasured mediators, there are various other factors including
mood, incentives, fatigue, attention, and individual idiosyncrasies that potentially influence whether full or partial
mediation is discovered. Further, our mediators measure internal psychological variables, which are almost certainly
measured with error. In the presence of measurement error, the likelihood of finding full mediation is reduced.
Finally, Baron and Kenny (1986, p. 1176) note that because most social phenomena have multiple causes, an
expectation of full mediation may not be realistic in many contexts.
investigating an evidence inconsistency. As previously noted, exercising skepticism is likely
costly to the auditor (e.g., budget overruns and/or conflicts with management) when additional
work is performed to obtain sufficient, appropriate evidence (Nelson 2009; PCAOB 2012a).
While the act of going over budget on an audit has been well researched (e.g., Houston 1999),
we are unaware of any research that has investigated how corporate managers view auditors who
employ professional skepticism on the audits of their companies' financial statements. Thus, we
performed an experiment with corporate managers to explore how they evaluate auditors whose
skeptical behavior requires additional time on the part of company personnel.
Participants
The email addresses of 3,000 corporate managers with accounting and/or finance
specializations were randomly selected from the Lexis/Nexis Academic Executive List. The
selected individuals come from public and private firms and all have titles indicating that they
currently hold managerial-level positions. An initial request to participate was emailed to the
managers and then a follow-up email was sent two weeks later. A total of 100 individuals
completed the online experiment.25 The mean age of participants was approximately 52 years
and the mean amount of work experience was approximately 29 years (the majority of which
was in the accounting and finance functions). On average, participants had (1) interacted with
external auditors 220 times, (2) provided feedback about audit team members to company
personnel 17 times, and (3) provided feedback to audit firm personnel 10 times. This
demographic information suggests participants were well suited to participate in our experiment.
25 The experimental instrument was completed on Qualtrics. Our response rate of 3.3 percent is comparable to the
5.4 percent response rate of Dichev, Graham, Harvey, and Rajgopal (2013). Dichev et al. (2013) describe the
challenges associated with obtaining higher response rates from corporate managers with online instruments (e.g.,
spam filters). However, Dichev et al. (2013) also note that the responses received via online instruments are
typically of high quality. Similar to Dichev et al. (2013), we did not compensate participants.
Description of experimental context, manipulated variable, and measured variables
Participants assumed the role of a senior manager in a division of a publicly traded
company where their work responsibilities required them to interact with the company's external
auditor. Participants learned that the time spent by division employees assisting auditors and
answering their questions in the current year was similar to the time spent in prior years, with
one exception. The auditor responsible for testing the division's revenue account decided to
investigate an inconsistency between the growth in revenue and related NFMs. The auditor's
investigative efforts required the division's employees to incur significant unplanned time. As a
consequence, certain division employees had difficulty meeting regular deadlines and worked
extended hours during the audit. The outcome of the auditor's investigation was manipulated at
two levels: the investigation either did not or did result in the identification of a misstatement
(OUTCOME_2). We then measured two process variables (BENEFIT_2 and FRAME_2) and the
dependent variable (EVAL_2). We define the aforementioned variables in Table 4 and Figure 3
(they are similar to their companion variables defined in Section III).
Results
Table 4 provides the regressions from which the path estimates in Figure 3 are obtained.
We find that managers provide the partner on the engagement with a lower performance
evaluation of the audit staff member who investigated the inconsistency when no misstatement is
found than when a misstatement is found (see total effect path in Figure 3, which has a
coefficient of -1.91 and a t-statistic of -6.01). To understand why the outcome of the auditor's
investigation has such a powerful influence on managers' performance evaluation of the audit
staff member, we again test the psychological process model developed by Lipe (1993), as
shown in Figure 3.
Consistent with our mediation findings in Section III, we observe that the outcome of
investigating the inconsistency influences whether managers perceive that a benefit arises from
the investigation. In turn, the perceived benefit of the investigation influences how the time spent
on the investigation is framed (i.e., as lost time or a normal cost), which, in turn, influences
how favorably managers evaluate the auditor who investigated the inconsistency. Notice that the
total effect is -1.91, the direct effect is -1.14, and the mediated effect is the difference of 0.77.
The proportion of the total effect that is mediated is approximately 40 percent. The mediated
effect shown in Figure 3 is statistically significant (p-value < 0.01).26 Thus, we provide
additional evidence that a skeptical auditor is penalized (management conveys a less favorable
evaluation to the partner on the engagement) when their skeptical behavior does not identify a
misstatement.27
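The decomposition reported above can be checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original analysis: the variable names are illustrative, and the only inputs are the total and direct effects reported for the manager experiment.

```python
# Effects reported in the text for the manager experiment (Figure 3).
total_effect = -1.91   # OUTCOME_2 -> EVAL_2 without the mediators in the model
direct_effect = -1.14  # OUTCOME_2 -> EVAL_2 with the mediators in the model

# In a linear mediation model, the total effect is the sum of the direct
# effect and the mediated (indirect) effect.
mediated_effect = total_effect - direct_effect
proportion_mediated = mediated_effect / total_effect

print(f"mediated effect: {mediated_effect:.2f}")
print(f"proportion mediated: {proportion_mediated:.1%}")
```

The mediated effect carries the same negative sign as the total effect (0.77 in magnitude), and its share of the total effect is roughly 40 percent, in line with the figures in the text.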
Additional analyses
Tables 3 and 4 suggest that there are some differences across audit seniors and corporate
managers in terms of their responses to the measured variables in the mediation models. To
explore this further, Panel A of Table 5 provides the mean values for the three measured
variables for audit seniors (EVAL, BENEFIT, and FRAME). Panel B provides the mean values
for the three measured variables for corporate managers (EVAL_2, BENEFIT_2, and
26 Again, we estimate the significance of the indirect effect using the Hayes (2013) bootstrap procedure.
27 The mediation models for audit seniors and corporate managers have similar overall patterns of results, but there
are some differences in the magnitude and significance of the path estimates. Two possible reasons for these
differences stand out, one of which relates to the instruments used to elicit responses and the other of which relates
to the participants. First, the instruments place participants in different contextual situations and ask them to assume
different roles, so their responses may vary according to the context and role. Second, there are vast differences in
work experience between the participant groups. On average, the audit senior participants have just a few years of
work experience while the corporate manager participants have almost three decades of work experience. As
business professionals advance in their careers, their judgments and decisions may mature and evolve.
FRAME_2). Panels A and B also report p-values for tests of whether the means are significantly
different from zero.
Panel A reveals that the means for EVAL, BENEFIT, and FRAME are significantly
above zero for all auditors combined and for auditors in each of the two conditions (p-values ≤
0.013). Panel B reveals slightly different results. Although the means for EVAL_2, BENEFIT_2,
and FRAME_2 are significantly above zero for all managers combined and for managers in the
misstatement condition (p-values < 0.001), the mean values are mixed for managers in the no
misstatement condition. In that condition, EVAL_2 is significantly above zero (p-value < 0.001),
but BENEFIT_2 and FRAME_2 are indistinguishable from zero (p-values ≥ 0.500). So, it
appears that when there is no misstatement, corporate managers perceive that there is little or no
benefit to investigating the inconsistency and they are ambivalent about whether the time spent
helping the auditor investigate the inconsistency is lost time or a normal cost of the audit.
Panel C reports p-values for tests of whether mean values for EVAL, BENEFIT, and
FRAME for audit seniors differ from mean values for EVAL_2, BENEFIT_2, and FRAME_2
for corporate managers, respectively. In terms of the overall evaluation (EVAL versus EVAL_2)
for all participants combined, we find that corporate managers evaluate Sam somewhat more
favorably than audit seniors (p-value = 0.090). This difference is driven by participants in the
misstatement condition. Notice that while there is no difference between EVAL and EVAL_2 for
participants in the no misstatement condition (p-value = 0.490), there is a significant difference
between EVAL and EVAL_2 for participants in the misstatement condition (p-value = 0.058).
Interestingly, when a misstatement is identified by the audit staff, corporate managers provide
higher evaluations than the audit staff's own superior.
The most significant difference between the participant groups relates to the perceived
benefit of investigating the inconsistency. On a combined basis and in each of the conditions, we
find that audit seniors perceived the benefit of investigating the inconsistency (BENEFIT) as
significantly greater than corporate managers perceived the benefit of investigating the
inconsistency (BENEFIT_2) (p-values ≤ 0.001). One reason for this significant difference may
lie in the fact that auditors better understand the value of investigating an inconsistency even
when the investigation does not uncover an error. Finally, there are no significant differences
between the perceived cost/loss frame of investigating the inconsistency for audit seniors
(FRAME) and the perceived cost/loss frame of investigating the inconsistency for corporate
managers (FRAME_2) (p-values ≥ 0.162).
V. SUPPLEMENTAL SURVEY
Purpose
While Sections III and IV demonstrate that outcome effects are present in the evaluations
of skeptical behavior, the primary purpose of this survey is to test whether staff auditors expect
their skeptical behavior will be evaluated more negatively (positively) when they do not (do) find
a misstatement (Hypothesis 3). The secondary purpose of this survey is to understand why staff
auditors expect their skeptical behavior to be evaluated in this manner.
Participants
The participants consisted of 136 Master of Accounting (MACC) students from two large
public universities. Of these participants, 93 had no public accounting work experience and 43
had public accounting work experience (usually through an internship). The MACC participants
who had audit experience had worked in public accounting for an average of six months and they
had all received at least one performance evaluation.
Description of survey context and results
The context of the survey does not involve a particular audit issue or task. Instead, we
had participants assume the role of an audit staff in the following general situation:
As an audit staff, you likely make judgments about whether to follow-up with clients
about apparent deviations, inconsistencies, and discrepancies. Of course, follow-up may
require additional time and effort on your part and on the part of the client.
Suppose that you went significantly over budget on certain audit work because you
decided to follow-up on an apparent discrepancy. The follow-up work not only required
you to spend a significant amount of time, but also caused the client to incur significant
time and disrupted their normal activities.
There are two possible outcomes that may result from your follow-up efforts: (1) a
misstatement is found or (2) no misstatement is found.
After reading this brief context, participants then responded to the primary question of interest:
In which situation would your evaluator on the engagement team judge you most
positively?
Participants responded to this question on an 11-point scale, ranging from -5 to +5, with the left
endpoint labeled "If I Did Not Find a Misstatement," the right endpoint labeled "If I Did Find a
Misstatement," and the midpoint labeled "Neither Would Affect How I am Evaluated." The
mean responses (not tabulated) to this question were 2.36 and 1.95 for the MACC participants
without and with experience, respectively. These mean values are significantly greater than zero
(p-values < 0.001), suggesting that participants believe that their performance evaluations are
significantly higher when a misstatement is found than when a misstatement is not found. This
finding supports Hypothesis 3.
To further illuminate responses to our primary question of interest, we categorize
responses into the following groups: negative responses, midpoint responses, and positive
responses. For inexperienced MACC participants, the tabulation for the three groups is 12 (12.90
percent), 8 (8.60 percent) and 73 (78.50 percent), respectively. For experienced MACC
participants, the tabulation for the three groups is 3 (6.98 percent), 10 (23.26 percent) and 30
(69.77 percent), respectively. By a wide margin, the most common group is positive responses,
and there is no significant difference between participant groups in terms of the frequency of
positive responses (χ² = 1.21, p-value = 0.271). These tabulations indicate that MACC
participants, whether experienced or not, expect their skeptical behavior to be evaluated more
positively when a misstatement is found.
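The tabulation and the comparison of positive-response frequencies can be reproduced from the counts given above. The sketch below is illustrative only: the function names are hypothetical, and it assumes a Pearson chi-square test on a 2 x 2 table (positive versus non-positive responses, by experience group) without a continuity correction, since the text does not specify the exact test.

```python
import math

# Response counts reported in the text: (negative, midpoint, positive).
inexperienced = (12, 8, 73)  # n = 93
experienced = (3, 10, 30)    # n = 43


def percentages(counts):
    """Share of each response group, in percent, rounded to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Positive versus non-positive responses for each participant group.
stat, p_value = chi_square_2x2(
    inexperienced[2], sum(inexperienced[:2]),
    experienced[2], sum(experienced[:2]),
)

print("inexperienced (%):", percentages(inexperienced))
print("experienced (%):", percentages(experienced))
print(f"chi-square = {stat:.2f}, p-value = {p_value:.3f}")
```

Under these assumptions the statistic is about 1.22 with a p-value of about 0.27, which is close to the reported values; small differences may reflect rounding.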
To understand why participants hold the aforementioned view, we had them respond to a
series of six follow-up questions, which are shown in Table 6. Responses are also graphed in
Figure 4. Each question is answered twice, once under the assumption that a misstatement is
found and once under the assumption that no misstatement is found. The questions focus on (1)
the perceived benefits of investigating the inconsistency (Questions 1 and 2), (2) the perception
that the time incurred is viewed as lost time versus a normal cost of the audit (Questions 3 and
4), and (3) the evaluative consequences for the time the auditor and the client spent (Questions 5 and
6). Responses were provided on an 11-point scale, ranging from -5 to +5. For Questions 1 and 2,
the left endpoint is labeled "No, There Was No Benefit" and the right endpoint is labeled "Yes,
There Was a Benefit." For Questions 3 and 4, the left endpoint is labeled "Lost Time" and the
right endpoint is labeled "Normal Cost." For Questions 5 and 6, the left endpoint is labeled
"Lower Evaluation" and the right endpoint is labeled "Increase Evaluation."
When a misstatement is found, Table 6 reveals that the mean responses of inexperienced
MACC participants (those participants who are least likely to be aware of the bias) to Questions
1-6 are 3.32, 1.68, 2.88, 1.80, 2.72, and 1.69, respectively. These responses are significantly
positive in all six instances (p-values < 0.001). Similarly, when a misstatement is found, the
mean responses of experienced MACC participants to Questions 1-6 are 3.86, 1.77, 3.72, 1.95,
3.35, and 2.05, respectively. These responses are significantly positive in all six instances (p-
values < 0.001).
When no misstatement is found, Table 6 reveals that the mean responses of inexperienced
MACC participants to Questions 1-6 are -0.42, -2.39, -0.92, -2.55, -0.89, and -1.76,
respectively. These responses are significantly negative in five of six instances (p-values ≤ 0.008).
Likewise, when no misstatement is found, the mean responses of experienced MACC
participants to Questions 1-6 are 0.47, -2.56, -0.36, -3.26, 0.00, and -2.38, respectively. These
responses are significantly negative in three of six instances (p-values < 0.001). Finally, we also
examine whether responses are significantly higher when a misstatement is found than when no
misstatement is found. For both participant groups and for all questions, Table 6 illustrates that
responses are significantly higher when a misstatement is found than when no misstatement is
found (p-values < 0.001).28
Our results suggest that finding a misstatement, relative to not finding a misstatement,
influences (1) whether the audit team and client perceive that they got something in return for the
time spent, (2) whether the audit team and client perceive that the time spent was a normal cost
of conducting the audit, (3) the auditors' perceptions about the favorability of their performance
28 A related question that may also arise is whether there are differences between inexperienced MACC participants
and experienced MACC participants. MANOVA results indicate a marginally significant overall experience effect
(F-statistic = 1.71, p-value = 0.073). To provide a clearer picture of the experience effect, we also estimate two
separate MANOVAs, one with the six responses when a misstatement is found and the other with the six responses
when no misstatement is found. When there is a misstatement found, the MANOVA indicates a marginally
significant overall experience effect (F-statistic = 2.01, p-value = 0.070). When there is no misstatement found, the
MANOVA indicates an insignificant overall experience effect (F-statistic = 1.30, p-value = 0.262). Thus, any
experience effect seems to be manifest when there is a misstatement, and experienced MACC participants, relative
to inexperienced MACC participants, tend to believe that they will be evaluated more favorably when they find a
misstatement.
evaluation, and (4) the auditors' perceptions about the client's evaluation of the audit team. Our
results arise with both inexperienced and experienced MACC participants, which suggests that
auditors understand from the outset of their careers that skeptical behavior will be evaluated with
outcome knowledge in mind.
VI. CONCLUSIONS
Skepticism is a fundamentally important auditor attribute, yet auditors continue to face
criticism for failing to be sufficiently skeptical (e.g., PCAOB 2011). Although there are a
number of potential reasons why auditors may not be sufficiently skeptical, this study identifies a
new and potentially crucial barrier to professional skepticism: the profession's culture as
manifest in auditor evaluations. We employ two experiments and a survey to examine the extent
to which outcome effects influence auditor evaluations, client evaluations of their auditor, and
whether auditors anticipate that their own evaluations will be affected by the outcome of an
investigation.
The results of our experiment confirm our main prediction. In our experimental setting,
auditor participants take the role of a supervisor who must evaluate a staff auditor. We find that
the outcome effect biases these participants' evaluations of skeptical behavior. They evaluate the
performances of skeptical auditors as significantly lower when a misstatement is not found
(versus when a misstatement is found). Interestingly, consulting with the superior did not
mitigate the outcome effects in auditor evaluations.
We also illustrate that corporate managers view skeptical auditors who require additional
client resources as a source of lost time for their personnel if the additional investigation does
not identify a misstatement. In such cases, the manager is more likely to convey negative
information about the audit staff person to the audit partner. Last, the results of our survey reveal
that staff auditors believe that they will be evaluated less favorably when a misstatement is not
found than when a misstatement is found. Thus, when deciding whether to engage in a skeptical
manner, auditors appear to be aware that their actions will be evaluated in light of the outcome of
the investigation.
One very important implication of our overall findings is that the anticipation of outcome
bias may, at times, cause auditors to forgo skeptical behavior. Our study may serve to help
firms and standard setters ensure that evaluation and reward systems are more effective via
training and reviews of firms' quality controls (i.e., evaluation systems). Because the outcome
effect appears to be driven by the frame through which evaluators view the costs of skeptical
behavior (versus the actual skeptical judgments and actions of staff), it is possible that training
highlighting the presence and consequences of the bias may be effective.
Our findings point to potentially fruitful areas for future research. In our setting, we
examine how skeptical behavior is evaluated depending on the outcome. Future studies could
examine the extent to which the anticipation of bias in the evaluation process reduces the
likelihood that auditors investigate ambiguous red flags and evidence inconsistencies. Do the
presence and anticipation of outcome effects lead auditors to incur the costs of skepticism only
when it relates to a "sure thing"? From a pedagogical perspective, many business schools are
currently emphasizing critical thinking skills in the classroom. Does repeated exposure to, and
the anticipation of, the bias reduce the likelihood that audit staff critically evaluate audit
evidence in the field? Finally, we find that outcome effects in evaluations may be a barrier to
auditor skepticism. Future studies could consider whether outcome effects in PCAOB
inspections may also lead to suboptimal behavior.
REFERENCES
Anderson, J. C., M. M. Jennings, D. J. Lowe, and P. M. J. Reckers. 1997. The mitigation of
hindsight bias in judges' evaluation of auditor decisions. Auditing: A Journal of Practice
& Theory, 16 (2): 20-39.
Anderson, S., and J. Wolfe. 2002. A perspective on audit malpractice claims. Journal of
Accountancy, 194 (3): 59-66.
Audit Inspection Unit of the UK's Professional Oversight Board (AIU). 2010. Audit Inspection
Unit 2009/10 Annual Report. Available online at:
http://www.frc.org.uk/FRC-Documents/POB/AIU-2009-10-Annual-Report.aspx.
Australian Securities & Investment Commission (ASIC). 2010. Report 192: Audit Inspection
Program Public Report for 2008-09. Available at:
http://www.asic.gov.au/asic/pdflib.nsf/LookupByFileName/rep192.pdf/$file/rep192.pdf
Baron, R., and D. Kenny. 1986. The moderator-mediator variable distinction in social
psychology research: Conceptual, strategic, and statistical considerations. Journal of
Personality and Social Psychology, 51 (6): 1173-1182.
Baron, J., and J. Hershey. 1988. Outcome bias in decision evaluation. Journal of Personality and
Social Psychology, 54 (4): 569-579.
Beasley, M. S., J. V. Carcello, and D. R. Hermanson. 2001. Top 10 audit decisions. Journal of
Accountancy, 191 (4): 63-66.
Beasley, M. S., J. V. Carcello, D. Hermanson, and T. Neal. 2010. Fraudulent Financial Reporting,
1987-1997: An Analysis of U.S. Public Companies. Committee of Sponsoring
Organizations of the Treadway Commission.
Blank, H., and S. Nestler. 2007. Cognitive process models of hindsight bias. Social Cognition, 25
(1): 132-147.
Brazel, J. F., K. L. Jones, and M. F. Zimbelman. 2009. Using nonfinancial measures to assess
fraud risk. Journal of Accounting Research, 47 (December): 1135-1166.
Brazel, J. F., K. L. Jones, and D. Prawitt. 2014. Auditors' reactions to inconsistencies between
financial and nonfinancial measures: The interactive effects of fraud risk assessment and
a decision prompt. Behavioral Research in Accounting, 26 (1): 131-156.
Brown, C. E., and I. Solomon. 1987. Effects of outcome information on evaluations of
managerial decisions. The Accounting Review, 62 (3): 564-577.
Brown, C. E., and I. Solomon. 1993. An experimental investigation of explanations for outcome
effects on appraisals of capital-budgeting decisions. Contemporary Accounting Research,
10 (1): 83-111.
Carmichael, D. R., and J. L. Craig, Jr. 1996. Proposal to say the F word in auditing
standards. The CPA Journal, 66 (6): 22.
Dechow, P. M., W. Ge, C. R. Larson, and R. G. Sloan. 2011. Predicting material accounting
misstatements. Contemporary Accounting Research, 28 (1): 17-82.
Dichev, I., J. Graham, C. R. Harvey, and S. Rajgopal. 2013. Earnings quality: Evidence from
the field. Journal of Accounting and Economics, 56 (2): 1-33.
Emby, C., A. M. G. Gelardi, and D. J. Lowe. 2002. A research note on the influence of outcome
knowledge on audit partners' judgments. Behavioral Research in Accounting, 14: 87-103.
European Commission (EC). 2010. Audit policy: Lessons from the crisis. European
Commission. Brussels, Belgium.
Fischhoff, B. 1975. Hindsight is not equal to foresight: The effect of outcome knowledge on
judgment under uncertainty. Journal of Experimental Psychology: Human Perception
and Performance, 1 (3): 288-299.
Fischhoff, B., and R. Beyth. 1975. I knew it would happen: Remembered probabilities of once-future
things. Organizational Behavior and Human Performance, 13 (1): 1-16.
Fisher, J., and T. I. Selling. 1993. The outcome effect in performance evaluation: Decision
process observability and consensus. Behavioral Research in Accounting, 5: 58-77.
Fiske, S. T. 2009. Social beings: Core motives in social psychology. Hoboken, NJ: John Wiley &
Sons, Inc.
Frederickson, J. R., S. A. Peffer, and J. Pratt. 1999. Performance evaluation judgments: Effects
of prior experience under different performance evaluation schemes and feedback
frequencies. Journal of Accounting Research, 37 (1): 151-165.
Ghosh, D., and R. Lusch. 2000. Outcome effect, controllability, and performance evaluation on
managers: Some field evidence from multi-outlet businesses. Accounting, Organizations
and Society, 25: 411-425.
Glover, S. M., and D. F. Prawitt. 2014. Enhancing auditor professional skepticism: The professional
skepticism continuum. Current Issues in Auditing, 8 (2): 1-10.
Guilbault, R. L., F. B. Bryant, J. H. Brockway, and E. J. Posavac. 2004. A meta-analysis of
research on hindsight bias. Basic and Applied Social Psychology, 26: 103-117.
Harley, E. M. 2007. Hindsight bias in legal decision making. Social Cognition, 25: 48-63.
Hayes, A. 2013. Introduction to Mediation, Moderation, and Conditional Process Analysis. New
York, NY: The Guilford Press.
Houston, R. 1999. The effects of fee pressure and client risk on audit seniors' time budget
decisions. Auditing: A Journal of Practice & Theory, 18 (2): 70-86.
Hurtt, R. K., M. Eining, and R. D. Plumlee. 2011. Linking professional skepticism to auditors'
behaviors. Working Paper, Baylor University.
Hurtt, R. K., H. Brown-Liburd, C. E. Earley, and G. Krishnamoorthy. 2013. Research on auditor
professional skepticism: Literature synthesis and opportunities for future research.
Auditing: A Journal of Practice & Theory, 32 (Supplement): 45-97.
International Auditing and Assurance Standards Board (IAASB). 2004. Objective and general
principles governing an audit of financial statements. International Standard on Auditing
200 (ISA 200). New York, NY: IFAC.
International Auditing and Assurance Standards Board (IAASB). 2008. Quality control for firms
that perform audits and reviews of financial statements, and other assurance and related
services engagements. International Standard on Quality Control 1 (ISQC 1). New York, NY:
IFAC.
International Auditing and Assurance Standards Board (IAASB). 2012. Professional Skepticism in
an Audit of Financial Statements. New York, NY: IFAC. Available at:
http://www.ifac.org/sites/default/files/publications/files/IAASB%20Professional%20Skepticism%20QandA-final.pdf
International Auditing and Assurance Standards Board (IAASB). 2014. The IAASB's Work Plan for
2015-2016. New York, NY: IFAC.
Kadous, K. 2000. The effects of audit quality and consequence severity on juror evaluation of
auditor responsibility for plaintiff losses. The Accounting Review, 75 (3): 327-341.
KPMG LLP (KPMG). 2012. Comment letter on the board's concept release on auditor
independence and audit firm rotation. Delivered at PCAOB public meeting. March 21.
Available at: http://pcaobus.org/Rules/Rulemaking/Docket037/ps_veihmeyer.pdf.
Lipe, M. G. 1993. Analyzing the variance investigation decision: The effects of outcomes,
mental accounting, and framing. The Accounting Review, 68 (4): 748-764.
Lipshitz, R., and D. Barak. 1995. Hindsight wisdom: Outcome knowledge and the evaluation of
decisions. Acta Psychologica, 88 (2): 105-125.
Mertins, L., D. Salbador, and J. Long. 2013. The outcome effect - A review and implications for
future research. Journal of Accounting Literature, 31 (1): 2-30.
Messier, W., T. Kozloski, and N. Kochetova-Kozloski. 2010. An analysis of SEC and PCAOB
enforcement actions against engagement quality reviewers. Auditing: A Journal of
Practice & Theory, 29 (2): 233-252.
Nelson, M. 2009. A model and literature review of professional skepticism in auditing. Auditing:
A Journal of Practice & Theory, 28 (2): 1-34.
PricewaterhouseCoopers LLP (PwC). 2012. Written testimony of Robert E. Moritz. Delivered at
PCAOB public meeting. March 21. Available at:
http://pcaobus.org/Rules/Rulemaking/Docket037/ps_moritz.pdf.
Public Company Accounting Oversight Board (PCAOB). 2006. Due professional care in the
performance of work. AU Section 230. Washington, D.C.: PCAOB.
Public Company Accounting Oversight Board (PCAOB). 2008. Report on the PCAOB's 2004, 2005,
2006, and 2007 inspections of domestic annually inspected firms. Washington, D.C.
Public Company Accounting Oversight Board (PCAOB). 2011. Concept release on auditor
independence and audit firm rotation. Available at:
http://pcaobus.org/rules/rulemaking/docket037/release_2011-006.pdf
Public Company Accounting Oversight Board (PCAOB). 2012a. Maintaining and applying
professional skepticism in audits. Staff Audit Practice Alert No. 10 (SAPA 10). Washington,
D.C.: PCAOB.
Public Company Accounting Oversight Board (PCAOB). 2012b. Statement on public meeting on
auditor independence and audit firm rotation.
Delivered at PCAOB public meeting. March 21. Available at:
http://pcaobus.org/News/Speech/Pages/03212012_HarrisStatement.aspx.
Reason, T. 2010. Auditing your auditor. CFO.com. April 2010. Available at:
http://www.cfo.com/printable/article.cfm/14485723.
Schneider, B., M. G. Ehrhart, and W. H. Macey. 2013. Organizational climate and culture.
Annual Review of Psychology, 64: 361-388.
Staw, B. M. 1976. Knee-deep in the Big Muddy: A study of escalating commitment to a chosen
course of action. Organizational Behavior and Human Performance, 16: 27-44.
Studdert, D. M., M. M. Mello, W. M. Sage, C. M. DesRoches, J. Peugh, K. Zapert, and T. A.
Brennan. 2005. Defensive medicine among high-risk specialist physicians in a volatile
malpractice environment. Journal of the American Medical Association, 293: 2609-2617.
Tabachnick, B. G., and L. S. Fidell. 2001. Using Multivariate Statistics. 4th ed. Boston, MA:
Allyn and Bacon.
Tan, H. T., and M. G. Lipe. 1997. Outcome effects: The impact of decision process and outcome
controllability. Journal of Behavioral Decision Making, 10: 315-325.
Taylor, A., D. MacKinnon, and J. Tein. 2008. Tests of the three-path mediated effect.
Organizational Research Methods, 11 (2): 241-269.
Terry, K. 2011. What is smart sex offender policy? Criminology & Public Policy, 10: 275-282.
Treadway Commission. 1987. Report of the national commission on fraudulent financial
reporting. National Commission on Fraudulent Financial Reporting.
Wu, D-A, S. Shimojo, S. W. Wang, and C. F. Camerer. 2012. Shared visual attention reduces
hindsight bias. Psychological Science, 23 (12): 1524-1533.
FIGURE 1
Graph of Cell Means for Performance Evaluation in Experimental Condition
[Graph. Y-axis: Overall Performance Evaluation (0.0 to 3.5). X-axis: No consultation, Moderate
consultation, High consultation. Series: No misstatement, Misstatement.]
The dependent variable is participants' overall performance evaluation, which is their response to
the question "How would you evaluate Sam's overall performance?" (responses are provided on an
11-point scale ranging from -5 to +5 with the left endpoint labeled "below expectations," the right
endpoint labeled "above expectations," and the midpoint labeled "met expectations"). The
manipulated variables are defined as follows: OUTCOME (manipulated between participants as
either (1) Sam found that there was no misstatement or (2) Sam found that there was a significant
misstatement) and CONSULT (manipulated between participants at three levels as either (1) Sam
investigated the inconsistency without consulting with the senior, which is referred to as "no
consultation," (2) Sam investigated the inconsistency after consulting with the senior and being told
to use professional judgment, which is referred to as "moderate consultation," or (3) Sam investigated
the inconsistency after consulting with the senior and receiving approval, which is referred to as
"high consultation"). See Section III for a full description of the experiment.
FIGURE 2
Three-Path Mediation Model for Auditors' Performance Evaluations (n = 94)
Path                           Coefficient    t-statistic
OUTCOME -> BENEFIT                -1.51          -3.60
BENEFIT -> FRAME                   0.83           7.83
FRAME -> EVAL                      0.17           2.15
OUTCOME -> EVAL (total)*          -1.71          -4.88
OUTCOME -> EVAL (direct)**        -1.15          -3.43
OUTCOME is the independent variable, which is manipulated between participants as either Sam found that there
was no misstatement (coded as 1) or Sam found that there was a significant misstatement (coded as 0); BENEFIT
is measured as participants' response to the question "Do you feel that the audit team got some benefit from the
time that Sam spent to investigate the inconsistency between the growth in revenues and the nonfinancial
measures?" (responses are provided on an 11-point scale with the left endpoint labeled "there was no benefit" and
the right endpoint labeled "there was a benefit"); FRAME is measured as participants' response to the question
"Do you view the time that Sam spent investigating the inconsistency between the growth in revenues and the
nonfinancial measures as lost time or a normal cost of an audit?" (responses are provided on an 11-point scale
with the left endpoint labeled "lost time" and the right endpoint labeled "normal cost"); EVAL is the dependent
variable, which is measured as participants' response to the question "How would you evaluate Sam's overall
performance?" (responses are provided on an 11-point scale with the left endpoint labeled "below expectations,"
the right endpoint labeled "above expectations," and the midpoint labeled "met expectations"). Table 3 reports the
regression results from which the path estimates are obtained.
* Total effect (relationship between OUTCOME and EVAL without indirect paths included in model).
** Direct effect (relationship between OUTCOME and EVAL with indirect paths included in model).
FIGURE 3
Three-Path Mediation Model for Managers' Performance Evaluations (n = 100)
[Path diagram linking OUTCOME_2, BENEFIT_2, FRAME_2, and EVAL_2. Path estimates shown:]
BENEFIT_2 to FRAME_2: Coefficient = 0.55, t-statistic = 7.31
OUTCOME_2 to EVAL_2: Coefficient = -1.91* (t-statistic = -6.01); Coefficient = -1.14** (t-statistic = -3.90)
OUTCOME_2 is the independent variable, which is manipulated between participants as either (1) the auditor
found that there was no misstatement (coded as 1) or (2) the auditor found that there was a significant
misstatement (coded as 0); BENEFIT_2 is measured as participants' response to the question "Do you feel that the
Sporting Goods Division got some benefit from the time that the Sporting Goods Division personnel spent to help
investigate the inconsistency between the growth in revenues and nonfinancial measures?" (responses are
provided on an 11-point scale ranging from -5 to +5, with the left endpoint labeled "There Was No Benefit" and
the right endpoint labeled "There Was a Benefit"); FRAME_2 is measured as participants' response to the question
"Do you view the time that Sporting Goods Division personnel spent helping Sam investigate the inconsistency
between the growth in revenues and the non-financial measures as lost time or a normal cost of an audit?"
(responses are provided on an 11-point scale ranging from -5 to +5, with the left endpoint labeled "Lost Time" and
the right endpoint labeled "Normal Cost"); EVAL_2 is the dependent variable, which is measured as participants'
response to the question "What overall evaluation of Sam's performance would you communicate to the partner on
the audit engagement?" (responses are provided on an 11-point scale ranging from -5 to +5, with the left endpoint
labeled "Poor Performance," the right endpoint labeled "Excellent Performance," and the midpoint labeled
"Average Performance"). Table 4 reports the regression results from which the path estimates are obtained.
* Total effect (relationship between OUTCOME_2 and EVAL_2 without indirect paths included in model).
** Direct effect (relationship between OUTCOME_2 and EVAL_2 with indirect paths included in model).
FIGURE 4
Mean Responses to Evaluation Questions
[Chart: mean responses to Questions 1 through 6 under the "Miss." and "No miss." assumptions, plotted separately for inexperienced MACC participants and experienced MACC participants.]
Questions and responses are provided in Table 6. Each question is answered twice: once under the
assumption that a misstatement is found (the "Miss." column) and once under the assumption that
no misstatement is found (the "No miss." column). See Section V for a full description of the
survey.
TABLE 1
Performance Evaluation in Experimental Conditions

Misstatement                       Consultation conditions (CONSULT)
conditions          No                 Moderate           High
(OUTCOME)           consultation       consultation       consultation       Marginal means
No misstatement     Mean = 0.35        Mean = 1.69        Mean = 1.19        Mean = 1.06
                    Std. dev. = 2.23   Std. dev. = 1.99   Std. dev. = 1.56   Std. dev. = 1.99
                    n = 17             n = 16             n = 16             n = 49

Misstatement        Std. dev. = 1.29   Std. dev. = 0.88   Std. dev. = 1.68   Std. dev. = 1.32
                    n = 18             n = 14             n = 15             n = 47

Marginal means      Std. dev. = 2.12   Std. dev. = 1.72   Std. dev. = 1.76   Std. dev. = 1.90
                    n = 35             n = 30             n = 31             n = 96
The dependent variable is participants' performance evaluation, which is their response to the question "How
would you evaluate Sam's overall performance?" (responses are provided on an 11-point scale ranging from -5 to
+5 with the left endpoint labeled "below expectations," the right endpoint labeled "above expectations," and the
midpoint labeled "met expectations"). The manipulated variables are defined as follows: OUTCOME is the
outcome of Sam's investigation, which is manipulated between participants as either (1) Sam found that there was
no misstatement (coded as 1) or (2) Sam found that there was a significant misstatement (coded as 0); CONSULT
is consultation with the senior prior to investigating the inconsistency, which is manipulated between participants
as either (1) Sam investigated the inconsistency without consulting with the senior ("no consultation" condition), (2)
Sam investigated the inconsistency after consulting with the senior and being told to use professional judgment
("moderate consultation" condition), or (3) Sam investigated the inconsistency after consulting with the senior and
receiving approval ("high consultation" condition). See Section III for a full description of the experiment.
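
As a quick arithmetic check on Table 1, the marginal mean of the no misstatement row should equal the sample-size-weighted average of its three cell means. The sketch below is illustrative only: the helper name weighted_mean is our own, and the inputs are the rounded values reported in the table, so agreement is expected only to about two decimal places.

```python
def weighted_mean(means, counts):
    """Sample-size-weighted average of cell means."""
    total_n = sum(counts)
    return sum(m * n for m, n in zip(means, counts)) / total_n


# No misstatement row of Table 1: no, moderate, and high consultation cells.
cell_means = [0.35, 1.69, 1.19]
cell_counts = [17, 16, 16]

row_mean = weighted_mean(cell_means, cell_counts)
print(f"Weighted row mean: {row_mean:.2f}")  # Table 1 reports 1.06
```

The same check cannot be run on the misstatement row here, because its cell means are not available in the table as reproduced above.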
TABLE 2
ANOVA Results for Performance Evaluation

Source of variation     DF     SS         F-statistic     p-value
Panel A: ANOVA results using all levels of consultation (n = 96)
OUTCOME                 1      72.42      25.74           < 0.001
CONSULT                 2      14.57      2.59            0.081
OUTCOME*CONSULT         2      3.32       0.59            0.557
Error                   90     253.17
R2 (%) = 26.30
Model F-statistic = 6.42 (p-value < 0.001)

Panel B: ANOVA results excluding the high consultation condition (n = 65)
OUTCOME                 1      56.61      19.47           < 0.001
CONSULT                 1      14.57      5.01            0.029
OUTCOME*CONSULT         1      2.37       0.81            0.370
Error                   61     177.40
R2 (%) = 29.26
Model F-statistic = 8.41 (p-value < 0.001)
The dependent variable is participants' performance evaluation, which is their response to the
question "How would you evaluate Sam's overall performance?" (responses are provided on an
11-point scale ranging from -5 to +5 with the left endpoint labeled "below expectations," the
right endpoint labeled "above expectations," and the midpoint labeled "met expectations"). The
manipulated variables are defined as follows: OUTCOME is the outcome of Sam's
investigation, which is manipulated between participants as either (1) Sam found that there was
no misstatement (coded as 1) or (2) Sam found that there was a significant misstatement (coded
as 0); CONSULT is consultation with the senior prior to investigating the inconsistency, which
is manipulated between participants as either (1) Sam investigated the inconsistency without
consulting with the senior ("no consultation" condition), (2) Sam investigated the inconsistency
after consulting with the senior and being told to use professional judgment ("moderate
consultation" condition), or (3) Sam investigated the inconsistency after consulting with the
senior and receiving approval ("high consultation" condition). See Section III for a full
description of the experiment.
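
The F-statistics in Table 2 can be reproduced from the reported sums of squares and degrees of freedom, since each F is the effect mean square divided by the error mean square. The following is a minimal sketch using the rounded Panel A values; the function name f_statistic and the dictionary layout are our own illustrative choices, not part of the original analysis.

```python
def f_statistic(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Panel A of Table 2: (SS, DF) for each source of variation.
panel_a = {
    "OUTCOME": (72.42, 1),
    "CONSULT": (14.57, 2),
    "OUTCOME*CONSULT": (3.32, 2),
}
ERROR_SS, ERROR_DF = 253.17, 90

f_values = {
    source: f_statistic(ss, df, ERROR_SS, ERROR_DF)
    for source, (ss, df) in panel_a.items()
}
for source, f in f_values.items():
    print(f"{source}: F = {f:.2f}")  # Table 2 reports 25.74, 2.59, 0.59
```

Substituting the Panel B values (error SS = 177.40, DF = 61) gives the Panel B F-statistics in the same way.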
TABLE 3
Regressions to Test Three-Path Mediation Model for Auditors' Performance Evaluations (n = 94)

                                        Dependent variables
                                  Equation (1)   Equation (2)   Equation (3)   Equation (4)
Independent variables             EVAL           BENEFIT        FRAME          EVAL
Intercept       Coefficient       2.80           3.51           -0.36          1.61
                t-statistic       11.28          11.82          -0.75          4.57
                p-value           < 0.001        < 0.001        0.46           < 0.001
OUTCOME         Coefficient       -1.71          -1.51          -0.17          -1.15
                t-statistic       -4.88          -3.60          -0.38          -3.43
                p-value           < 0.001        < 0.001        0.71           < 0.001
BENEFIT         Coefficient                                     0.83           0.22
                t-statistic                                     7.83           2.18
                p-value                                         < 0.001        0.032
FRAME           Coefficient                                                    0.17
                t-statistic                                                    2.15
                p-value                                                        0.034
Model F-statistic                 23.84          12.92          36.22          18.55
Model p-value                     < 0.001        < 0.001        < 0.001        < 0.001
Adjusted R2 (%)                   19.72          11.36          43.10          36.15
Variables are defined as follows: EVAL is participants' performance evaluation, which is their response to the
question "How would you evaluate Sam's overall performance?" (responses are provided on an 11-point scale
ranging from -5 to +5 with the left endpoint labeled "below expectations," the right endpoint labeled "above
expectations," and the midpoint labeled "met expectations"); OUTCOME is the outcome of Sam's investigation,
which is manipulated between participants as either (1) Sam found that there was no misstatement (coded as 1) or
(2) Sam found that there was a significant misstatement (coded as 0); BENEFIT is measured as participants'
response to the question "Do you feel that the audit team got some benefit from the time that Sam spent to
investigate the inconsistency between the growth in revenues and the nonfinancial measures?" (responses are
provided on an 11-point scale with the left endpoint labeled "there was no benefit" and the right endpoint labeled
"there was a benefit"); FRAME is measured as participants' response to the question "Do you view the time that
Sam spent investigating the inconsistency between the growth in revenues and the nonfinancial measures as lost
time or a normal cost of an audit?" (responses are provided on an 11-point scale with the left endpoint labeled
"lost time" and the right endpoint labeled "normal cost"). See Section III for a full description of the experiment
and mediation model.
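
To illustrate how the Table 3 path estimates fit together, the sketch below applies the standard ordinary least squares decomposition for a model with two serial mediators: the total effect equals the direct effect plus the sum of the three indirect paths. This is a textbook identity offered as a consistency check, not necessarily the test the authors ran; the variable names (a1, a2, d21, b1, b2) are our own labels, and because the inputs are rounded, the implied total matches the reported total only approximately.

```python
# Rounded coefficients as reported in Table 3.
a1 = -1.51      # OUTCOME -> BENEFIT (Equation 2)
a2 = -0.17      # OUTCOME -> FRAME, controlling for BENEFIT (Equation 3)
d21 = 0.83      # BENEFIT -> FRAME (Equation 3)
b1 = 0.22       # BENEFIT -> EVAL (Equation 4)
b2 = 0.17       # FRAME -> EVAL (Equation 4)
direct = -1.15  # OUTCOME -> EVAL with mediators included (Equation 4)
total = -1.71   # OUTCOME -> EVAL without mediators (Equation 1)

via_benefit = a1 * b1      # OUTCOME -> BENEFIT -> EVAL
via_frame = a2 * b2        # OUTCOME -> FRAME -> EVAL
via_both = a1 * d21 * b2   # OUTCOME -> BENEFIT -> FRAME -> EVAL (three-path)

indirect = via_benefit + via_frame + via_both
implied_total = direct + indirect

print(f"Three-path indirect effect: {via_both:.2f}")
print(f"Sum of indirect effects:    {indirect:.2f}")
print(f"Implied total effect:       {implied_total:.2f} (reported: {total})")
```

The small gap between the implied and reported totals is what one would expect from multiplying coefficients that have been rounded to two decimals.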
TABLE 4
Regressions to Test Three-Path Mediation Model for Managers' Performance Evaluations (n = 100)

                                        Dependent variables
Independent variables             EVAL_2         BENEFIT_2      FRAME_2        EVAL_2
Intercept       Coefficient       3.28           2.15           1.11           2.45
                t-statistic       15.22          5.70           3.43           11.08
                p-value           < 0.001        < 0.001        < 0.001        < 0.001
OUTCOME_2       Coefficient       -1.91          -2.30          -0.75          -1.14
                t-statistic       -6.01          -4.14          -1.66          -3.90
                p-value           < 0.001        < 0.001        0.100          < 0.001
BENEFIT_2       Coefficient                                     0.55           0.09
                t-statistic                                     7.31           1.53
                p-value                                         < 0.001        0.129
FRAME_2         Coefficient                                                    0.27
                t-statistic                                                    4.20
                p-value                                                        < 0.001
Model F-statistic                 36.10          17.11          38.53          31.38
Model p-value                     < 0.001        < 0.001        < 0.001        < 0.001
Adjusted R2 (%)                   26.17          14.00          43.12          47.93
Variables are defined as follows: EVAL_2 is the dependent variable, which is measured as participants' response
to the question "What overall evaluation of Sam's performance would you communicate to the partner on the audit
engagement?" (responses are provided on an 11-point scale ranging from -5 to +5, with the left endpoint labeled
"Poor Performance," the right endpoint labeled "Excellent Performance," and the midpoint labeled "Average
Performance"); OUTCOME_2 is the independent variable, which is manipulated between participants as either (1)
the auditor found that there was no misstatement (coded as 1) or (2) the auditor found that there was a significant
misstatement (coded as 0); BENEFIT_2 is measured as participants' response to the question "Do you feel that the
Sporting Goods Division got some benefit from the time that the Sporting Goods Division personnel spent to help
investigate the inconsistency between the growth in revenues and nonfinancial measures?" (responses are provided
on an 11-point scale ranging from -5 to +5, with the left endpoint labeled "There Was No Benefit" and the right
endpoint labeled "There Was a Benefit"); FRAME_2 is measured as participants' response to the question "Do you
view the time that Sporting Goods Division personnel spent helping Sam investigate the inconsistency between the
growth in revenues and the non-financial measures as lost time or a normal cost of an audit?" (responses are
provided on an 11-point scale ranging from -5 to +5, with the left endpoint labeled "Lost Time" and the right
endpoint labeled "Normal Cost"). See Section IV for a full description of the experiment.
TABLE 5
Mean Values and Tests of Differences for Measured Variables Used in Mediation Models

Variable                          Combined     No misstatement     Misstatement
Panel A: Audit seniors (n = 94)
(a) EVAL                          1.94         1.09                2.80
    p-value for mean = 0          < 0.001      < 0.001             < 0.001
(b) BENEFIT                       2.76         2.00                3.51
    p-value for mean = 0          < 0.001      < 0.001             < 0.001
(c) FRAME                         1.84         1.12                2.55
    p-value for mean = 0          < 0.001      0.013               < 0.001
Panel B: Corporate managers (n = 100)
(d) EVAL_2                        2.40         1.37                3.28
    p-value for mean = 0          < 0.001      < 0.001             < 0.001
(e) BENEFIT_2                     1.09         -0.15               2.15
    p-value for mean = 0          < 0.001      0.730               < 0.001
(f) FRAME_2                       1.37         0.28                2.30
    p-value for mean = 0          < 0.001      0.500               < 0.001
Panel C: Tests of differences
p-value for (a) versus (d)        0.090        0.490               0.058
p-value for (b) versus (e)        < 0.001      < 0.001             0.001
p-value for (c) versus (f)        0.234        0.162               0.580
Variables are defined as follows: EVAL is participants' performance evaluation, which is their
response to the question "How would you evaluate Sam's overall performance?" (responses are
provided on an 11-point scale ranging from -5 to +5 with the left endpoint labeled "below
expectations," the right endpoint labeled "above expectations," and the midpoint labeled "met
expectations"); BENEFIT is measured as participants' response to the question "Do you feel that
the audit team got some benefit from the time that Sam spent to investigate the inconsistency
between the growth in revenues and the nonfinancial measures?" (responses are provided on an
11-point scale with the left endpoint labeled "there was no benefit" and the right endpoint labeled
"there was a benefit"); FRAME is measured as participants' response to the question "Do you
view the time that Sam spent investigating the inconsistency between the growth in revenues and
the nonfinancial measures as lost time or a normal cost of an audit?" (responses are provided
on an 11-point scale with the left endpoint labeled "lost time" and the right endpoint labeled
"normal cost"); EVAL_2 is the dependent variable, which is measured as participants' response
to the question "What overall evaluation of Sam's performance would you communicate to the
partner on the audit engagement?" (responses are provided on an 11-point scale ranging from -5
to +5, with the left endpoint labeled "Poor Performance," the right endpoint labeled "Excellent
Performance," and the midpoint labeled "Average Performance"); BENEFIT_2 is measured as
participants' response to the question "Do you feel that the Sporting Goods Division got some
benefit from the time that the Sporting Goods Division personnel spent to help investigate the
inconsistency between the growth in revenues and nonfinancial measures?" (responses are
provided on an 11-point scale ranging from -5 to +5, with the left endpoint labeled "There Was
No Benefit" and the right endpoint labeled "There Was a Benefit"); FRAME_2 is measured as
participants' response to the question "Do you view the time that Sporting Goods Division
personnel spent helping Sam investigate the inconsistency between the growth in revenues and
the non-financial measures as lost time or a normal cost of an audit?" (responses are
provided on an 11-point scale ranging from -5 to +5, with the left endpoint labeled "Lost Time"
and the right endpoint labeled "Normal Cost").
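
The group means in Table 5 can be cross-checked against the single-predictor regressions in Tables 3 and 4. When the only regressor is a 0/1 indicator, the intercept is the mean of the group coded 0 (misstatement) and the intercept plus the coefficient is the mean of the group coded 1 (no misstatement). The sketch below is illustrative only; the function name implied_group_means is our own, and the inputs are the rounded estimates from the tables.

```python
def implied_group_means(intercept, coefficient):
    """Group means implied by a regression on a single 0/1 indicator."""
    return {
        "misstatement": intercept,                   # OUTCOME coded 0
        "no_misstatement": intercept + coefficient,  # OUTCOME coded 1
    }


# (intercept, OUTCOME coefficient) from Tables 3 and 4, Equations 1 and 2.
regressions = {
    "EVAL": (2.80, -1.71),
    "BENEFIT": (3.51, -1.51),
    "EVAL_2": (3.28, -1.91),
    "BENEFIT_2": (2.15, -2.30),
}

implied = {
    name: implied_group_means(b0, b1) for name, (b0, b1) in regressions.items()
}
for name, means in implied.items():
    print(
        f"{name}: no misstatement = {means['no_misstatement']:.2f}, "
        f"misstatement = {means['misstatement']:.2f}"
    )
```

FRAME and FRAME_2 are omitted because their regressions (Equation 3) also include BENEFIT or BENEFIT_2, so the intercept is no longer a simple group mean.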
TABLE 6
Mean Responses to Evaluation Questions

                                                                   Inexperienced MACC              Experienced MACC
                                                                   participants (n = 93)           participants (n = 43)
                                                                   Misstatement      p-value       Misstatement      p-value
Questions                                                          Yes      No       for diff.     Yes      No       for diff.
1. Do you feel that individuals who evaluate your performance
   would think that the audit team got something in return for
   the time that you spent following-up on the discrepancy?        3.32     -0.42    < 0.001       3.86     0.47     < 0.001
2. Do you feel that client management would think that they got
   something in return for the time that client personnel spent
   related to your follow-up on the discrepancy?                   1.68     -2.39    < 0.001       1.77     -2.56    < 0.001
3. Do you feel that individuals who evaluate your performance
   will view the time you and the client spent on follow-up as a
   normal cost of conducting an audit or as lost time?             2.88     -0.92    < 0.001       3.72     -0.36    < 0.001
4. Do you feel that client management will view that time as a
   normal cost of conducting an audit or as lost time?             1.80     -2.55    < 0.001       1.95     -3.26    < 0.001
5. What effect will the time that you and the client spent on the
   follow-up have on your performance evaluation?                  2.72     -0.89    < 0.001       3.35     0.00     < 0.001
6. What effect will the time that client personnel spent on
   follow-up have on their evaluation of your audit engagement
   team?                                                           1.69     -1.76    < 0.001       2.05     -2.38    < 0.001
Responses are provided on an 11-point scale, ranging from -5 to +5. For Questions 1 and 2, the left endpoint is labeled "No, There
Was No Benefit" and the right endpoint is labeled "Yes, There Was a Benefit." For Questions 3 and 4, the left endpoint is labeled
"Lost Time" and the right endpoint is labeled "Normal Cost." For Questions 5 and 6, the left endpoint is labeled "Lower
Evaluation" and the right endpoint is labeled "Increase Evaluation." Each question is answered twice: once under the assumption
that a misstatement is found (the "Yes" column) and once under the assumption that no misstatement is found (the "No" column).
See Section V for a full description of the survey.


