SSRN Id1772250

An Empirical Analysis of Judging Bias in Competitive Academic Debate
Clifford Chad Henson*
College of Law
College of Business, Department of Finance
University of Illinois
Champaign, IL 61820
henson1@law.illinois.edu

Paul R. Dorasil
Department of Economics
University of Florida
Gainesville, FL 32611
paul.dorasil@cba.ufl.edu
Abstract: Conventional wisdom among those involved in competitive academic debate holds
that, despite an emphasis on objective decision-making, factors other than skill affect the
outcomes of rounds. This study examines all debate rounds at the Lincoln Douglas Debate
Tournament of Champions from 2004-2009. We develop a panel logit model with fixed effects
across tournaments to estimate the marginal effects of various biases. In particular, we find
evidence of statistical bias related to sex, regional affiliation, and topic side. These factors may
explain the significant number of non-transitive outcomes in the data. Finally, we suggest some
policy remedies to mitigate the impact of biases.
*The authors would like to thank Andrew P. Morriss for supervising an earlier version of this
research, Ari Parker of Walt Whitman High School for his valuable coding assistance, and Aaron
Timmons of the Greenhill School and Jon Cruz of the Bronx High School of Science for their
assistance in obtaining data and feedback. We would also like to thank Craig Depkin III, Gary
Alan Fine, Scott Robertson, and Leslie Wexler for their valuable feedback. Finally, we thank
Victory Briefs Daily for hosting the authors’ “Debate by the Numbers” column and providing an
avenue for debaters and coaches to comment on the results and methodology. No external
funding was provided.
Clifford Chad Henson is a third-year graduate student pursuing a J.D. from the University of
Illinois College of Law and a Ph.D. in Finance from the University of Illinois College of
Business, where he is also the head coach of the intercollegiate policy debate team.
Paul R. Dorasil is a second-year graduate student pursuing a Ph.D. in Economics from the
University of Florida.
Table of Contents
I. INTRODUCTION
II. PREVIOUS LITERATURE
  A. ECONOMIC THEORY AND EMPIRICAL STUDIES OF JOB DISCRIMINATION
  B. OUTCOME PREFERENCES OF THIRD-PARTY DECISIONMAKERS
  C. STUDIES OF DISCRIMINATION IN DEBATE
III. HIGH SCHOOL LINCOLN DOUGLAS DEBATE
  A. COMPETITION
  B. A COMMUNITY OF DEBATERS
  C. THE TOURNAMENT OF CHAMPIONS
IV. DATA
  A. OBSERVATIONS
  B. SKILL PROXY
  C. INDEPENDENT VARIABLES
V. MODEL AND ESTIMATION
VI. RESULTS
VII. CONCLUSION
  A. DISCUSSION
  B. POLICY RECOMMENDATIONS
REFERENCES
APPENDIX
I. INTRODUCTION
High school Lincoln Douglas Debate rounds consist of two debaters, an affirmative who
argues in favor of a resolution and a negative who argues in opposition to a resolution.
Preliminary debate rounds in competitive academic debate are evaluated by a single judge, who
awards each debater a win or a loss and a number of speaker points. While the “better debater”
should win each round, conventional wisdom among those familiar with the activity has long
held that sex, region, and side biases play a substantial role in the outcomes of individual debate
rounds and tournaments. Whatever “better debater” means (some objective and subjective
elements of the term are discussed infra in Section III), it undoubtedly does not include these three factors.
While a number of studies have indicated that statistical disparities in participation and success
rates exist in debate across multiple lines, no previous work has employed rigorous empirical
methods to predict and explain biases.
The purpose of this research is to identify factors that influence judge decision-making at
the Lincoln Douglas Debate Tournament of Champions (TOC). High school competitors and
coaches place enormous importance on the outcomes of these rounds, often spending several
years of training and preparation in order to increase the probability of participation and success
at the TOC. Success at the TOC (and other nationally-competitive tournaments) can often lead
to lucrative coaching opportunities for debaters and coaches. Additionally, the TOC provides an
interesting and unique dataset for investigating discrimination due to the abundance of round-
level data available and the one-on-one nature of competition. We investigate the ways that side
bias, region bias, and sex bias can be used to predict and explain competitor success. To do this,
we collect data for all competitors and judges in every round of the Lincoln Douglas Debate
Tournament of Champions for years 2004-2009. We develop a panel logit model with fixed
effects across tournaments to estimate the marginal effects of side, regional, and sex biases.
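As a purely illustrative sketch (the authors' own specification appears in Section V), a logit model with tournament fixed effects is commonly written as below. The notation is an assumption introduced here and is not taken from the paper: $y_{rt}$ equals 1 if a given debater wins round $r$ of tournament $t$, $x_{rt}$ collects the side, region, and sex indicators, and $\alpha_t$ is a tournament-specific intercept.

```latex
% Generic fixed-effects logit; all symbols are illustrative assumptions.
\[
  \Pr(y_{rt} = 1 \mid x_{rt})
    = \Lambda(\alpha_t + x_{rt}'\beta),
  \qquad
  \Lambda(z) = \frac{e^{z}}{1 + e^{z}}
\]
% Marginal effect of a continuous regressor x_k:
\[
  \frac{\partial \Pr(y_{rt} = 1 \mid x_{rt})}{\partial x_{rt,k}}
    = \beta_k \, \Lambda(\alpha_t + x_{rt}'\beta)
      \bigl[ 1 - \Lambda(\alpha_t + x_{rt}'\beta) \bigr]
\]
```

For a 0/1 indicator such as side or sex, the corresponding effect would instead be the difference between the predicted probabilities with the indicator set to 1 and to 0.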
The remainder of this article is structured as follows: Section II reviews the previous
literature on the economics of discrimination, empirical studies of economic
discrimination, and empirical research related to competitive debate. Section III explains
high school Lincoln Douglas Debate and the Tournament of Champions. Section IV
describes the data. Section V explains the theoretical model and estimation method. Section
VI reports the results of the estimation. Section VII concludes and offers some potential
remedies for judging bias.
II. PREVIOUS LITERATURE
A. ECONOMIC THEORY AND EMPIRICAL STUDIES OF JOB DISCRIMINATION
Gary Becker (1957) authored the first theoretical economic explanation of discrimination,
positing that employment discrimination was based on taste differences. Arrow (1973) expands
on Becker’s work, focusing on discrimination in the labor market where employers face a
tradeoff between profits and discriminatory preferences. Coate and Loury (1993) expand on
Arrow’s model and find that affirmative action policies may perpetuate negative stereotypes
even if productivity is homogeneous among groups when the program is repealed. While there
exist a number of studies on economic discrimination, most of these focus on wage gaps. Few
studies exist examining discrimination in hiring, and those that do seek to investigate such
discrimination either investigate discriminatory responses to applications only (e.g. Riach &
Rich, 1995) or are extremely small-n studies (e.g. Neumark, Bank & Van Nort, 1996). Blau and
Kahn (2000) investigate the gender pay gap in the United States. They find that the gap is
declining. Additionally, they find that much of the gap is attributable to gender-specific factors
such as differences in qualifications. Kahn (1991) provides a survey of the literature on
discrimination in professional sports. He reports evidence of race and gender discrimination in a
number of sports. Levitt (2004) investigates various forms of discrimination on the game show
Weakest Link. He finds evidence of information-based discrimination against Hispanics and
taste-based discrimination against older contestants. He also finds evidence that contestants tend
to avoid voting for other contestants with whom they share a common gender or race (which
protects them from being eliminated). Ayres and Siegelman (1995) find evidence that car
dealerships charge higher prices to blacks and women than to white males. Bertrand and
Mullainathan (2002) find that applicants with distinctly black names are less likely to be given
interviews than other candidates.
B. OUTCOME PREFERENCES OF THIRD-PARTY DECISIONMAKERS
Traditional economic models of discrimination typically assume that the decision-maker
is affected by the outcome of his or her choice, as when an employer seeking to hire an employee
subsequently bears the cost of the decision to hire (or not hire) a given person. More
analogous to the decision-making of debate judges is a situation where the decision-maker need
not bear the full cost of a wrong choice, such as when judges1 and juries decide cases in the legal
system. Fortunately, a substantial literature regarding these situations exists.
A number of juror studies have examined the effect of the race or sex of participants in
the justice system on court outcomes, with decidedly mixed results. The first study (Bridgeman
and Marlowe, 1979) found no effect of juror sex on either first-ballot voting or final verdict.2 In
that study, all of the defendants were male and little statistical information was reported. A later
study (Mills and Bohannon, 1980) found that women were more likely than men to initially vote
to convict in criminal trials, but were no more likely to vote to convict at the last stage of voting
for any crime other than robbery. Subsequent studies of gender effects on criminal (Helgeson
and Shaver, 1990; Fitzgerald and Ellsworth, 1984; Thompson et al., 1984) and civil (Green,
1968; Goodman et al., 1994; Denove and Imwinkelried, 1995) juries have also been mixed.
1
Judge studies are more applicable than juror studies, given that judges may care about their reputations (Kozinski,
1993) while there is little reason to believe that similar incentives apply to jurors, who are not repeat-players.
Neither
2
All of the defendants were male, and little statistical information was reported.
There also exists a robust empirical literature on the factors that influence judicial
decisions, with judges and scholars offering their respective views of what motivates judicial
behavior. (Kozinski, 1993). Cross (2003) provides a typical example of such studies and a
thorough review of the early literature. The “standard pattern” of empirical courts scholarship is
some variation of Republican appointees preferring more conservative outcomes than Democratic
appointees, tempered by other effects. (Miles & Sunstein, 2008). In perhaps the most
comprehensive examination of the effect of judicial background and demographic characteristics
upon judicial decision-making, it was discovered that race (but not sex) significantly impacted
the outcome of judicial decisions, and certain background factors related to the education and
employment history of judges also played a role in decisions (Sisk et al., 1998). One recent
comparative study (Henson, 2010) has found that in certain areas of law, while extra-legal
factors do influence outcomes, they add little explanatory power when controlling for case-
specific factors that should be legally significant and legal factors such as standard of review.
C. STUDIES OF DISCRIMINATION IN DEBATE
Competitive speech and debate has its origins in a time when women were not permitted
to vote or participate in higher education, much less expected to contribute valuably to the
analysis of pressing issues of the day. (Greenstreet, 1989). Participation and success in speech and
debate have never been equal; differential participation has been a cause of concern for over 70
years (Knee, 1939; Cole, 1957), and early forensics competitions separated men and women into
separate divisions because of “the belief that the male is generally superior to the female in
forensic endeavors[.]” (Hensley & Strother, 1968). The attitude that women are not (at least
potentially) the competitive equal of men has diminished over the years, and men and women no
longer compete in separate tournaments or divisions.3
Nonetheless, participation and success rates of men and women in debate are not equal.
“In general, it appears that regardless of the forensic activity, male domination ranges from
‘slight’ to ‘overwhelming’.” (Friedley and Manchester, 1985). This is widely recognized by
debate organizations; recommendations from a conference jointly sponsored by the American
Forensic Association and Speech Communication Association (McBath, 1975) included
investigating the causes of low female participation rates. The 1984 National Developmental
Conference at Northwestern University endorsed a resolution that would investigate barriers to
participation by women and other underrepresented groups (Friedley and Manchester, 1985).
The American Forensic Association funded research into diversity in U.S. forensics, resulting in
a 2004 report (Allen et al., 2004).
Those involved in speech and debate have certainly answered the call for investigations
into this phenomenon. Leaving aside dozens of articles in popular and academic forensic
publications that examine female participation and success from an anecdotal perspective,
offering their own explanations of and solutions to this problem (Beattie, 1996; Basinger, 1996;
Bile, 1999; Bartanen, 1995; Crenshaw, 1996),4 empirical studies on the subject have been carried
out for nearly 50 years. One of the earliest studies found that mixed-sex two-person debate teams
out-performed their two-male and two-female counterparts over five years at a tournament that
did not have any divisions (either skill- or sex-related), while all-male and all-female teams
performed about equally, both in college (Hensley and Strother, 1968) and in high school (Rosen et
al., 1978).
3
Periodically, all-women’s tournaments, debate camps, or other debate-oriented events have been held as
networking and consciousness-raising opportunities.
Later studies report participation, retention, and success rates of debaters. An early study
of results (Friedley and Manchester, 1985) at the 1984 National Debate Tournament (the
collegiate equivalent of the high school TOC) found that female participation at the tournament
began at 15% of all competitors and declined steadily except for the final round, where a single
female comprised 25% of the four remaining competitors. This same study revealed that at
comparable intercollegiate speech tournaments, though participation rates by women were
higher, women were less successful than men as indicated by decreased participation levels in
later elimination rounds – particularly in the more intellectually rigorous original speaking and
limited preparation events. Another study examining high school and intercollegiate national
championship participation rates (Martin, 1988) found that the National Catholic Forensic
League (high school) had female participation rates of between 15-30% while the National
Debate Tournament had female participation rates between 14-20%. This has been confirmed in
more recent years (Stepp, 1997).
4
A large, separate literature on sexual harassment in debate also exists, but is not cited here. The purpose of this
article is to report on judge behavior rather than attempt to explain student participation. The authors’ own
experience at the TOC and with the coaches and judges present there indicates that it is highly unlikely that the
number of rounds hinging on sexual quid pro quo is large enough to be driving our results.
Success (or even qualifying to compete) at national tournaments is just the end result,
however. Even at the earliest levels of participation, participation in debate and certain speech
events is dominated by men, while participation in other speech events tends to be dominated by
women (Martin, 1980). A decade-old study seeking to explain different participation and success
rates (Stepp and Gardner, 2001) found that retention of women lagged behind retention of men.
Furthermore, this study found that while women made some significant gains in participation
rates as competitors during the 1990s, their representation among coaches and directors
decreased. While Allen et al (2004) found that women constituted a slight majority of all those
involved in all aggregated competitive forensics activities (including individual events, debate,
mock trial, etc.), women are involved in competitive debate at lower levels than their
representation in the undergraduate population might predict (Stepp, 1997).
At the high school level, there are more women than men serving as debate coaches, as is
true of all high school teachers. (Fine, 2001). Though no empirical study of coaches on the
national circuit at the high school level exists, the authors have personally observed that the vast
majority of high school coaches at the TOC are male.
Debate judge behavior has also been empirically investigated. Early attempts at ballot
studies occurred decades earlier, but the different behavior of male and female judges was first
studied in the 1980s in speech events. One such study (Kay and Aden, 1984) found that for
some events women judges varied less than men judges in the rankings given to a set of
competitors, while the opposite was true in other speech events. The study reasonably concluded
that with respect to their gender findings, “Further study is needed to determine the relationship
between judging standards, event, and gender.” The same study examined regional differences in
judging, finding regional patterns in agreements among judges. Region and sex variables were
never interacted, and the study merely reported descriptive statistics rather than significance
levels.
Bruschke and Johnson (1994) perform the most rigorous analysis of discrimination in
competitive debate to date. They analyze results from the NDT, and find that female judges tend
to award male negatives more speaker points than female negatives, and find that male judges
tend to award more points to same-sex teams. However, their results may be suspect due to their
controlling for the winner of the round as an independent variable. This creates an endogeneity
problem because any preferences that lead a judge to discriminate in allocating speaker points
between debaters may also lead the judge to discriminate when deciding the winner of the round.
Our analysis here is unique in that it involves data from high school Lincoln-Douglas debate,
exploits information on regional affiliation, and focuses on discrimination in wins rather than
speaker points. Wins are more important to debaters than speaker points because they determine
which debaters will qualify for elimination rounds. Following the economic literature on
discrimination, we provide a more rigorous statistical analysis.
III. HIGH SCHOOL LINCOLN DOUGLAS DEBATE
A. COMPETITION
High school Lincoln-Douglas debate (LD) is a one-on-one debate format. Debaters are
assigned a topic,5 known prior to the tournament, and are required to debate each side at some
point during the preliminary rounds of the tournament. Topics are worded as statements, and in
each round one debater is assigned to debate the “affirmative” while the other debater is assigned
to debate the “negative.” Debaters give alternating speeches designed to persuade the judge to
vote for them, constrained by strictly-enforced time limits,6 with the affirmative speaking first
and last.
A single judge adjudicates each debate round.7 The judge is a supposedly neutral party,
hired by neither team, but often affiliated with some other party competing at the tournament
(such as the coach of another debater) and sometimes hired by the tournament. The judge awards
one debater the win and provides each debater with speaker points. This information is
submitted, along with comments about the debaters’ respective performances and a reason for
the judge’s decision, on a ballot that is turned in to the tournament director. In addition to
comments on the ballot, it is customary for the judge to offer some comments to the debaters
orally and provide (and sometimes defend) their reason for decision. The required elements of a
successful advocacy are themselves subject to debate in the round, making debate a highly fluid
game (Snider, 1984), and there is substantial disagreement (Rowland, 1982; Zarefsky, 1982;
Lichtman & Rohrer, 1982; Ulrich, 1982) over how the game should be played by the debaters
and evaluated by the judges.
5
The National Forensic League (NFL) Topic Wording Committee releases 10 potential topics each year at their
National Tournament. Member schools vote on these topics, and the top 5 are selected to be debated in the calendar
year following the vote. The 3rd-place topic is debated in January and February, the 2nd-place topic is debated in
March and April, the 1st-place topic is debated at NFL Nationals, the 4th-place topic is debated in September and
October, and the 5th-place topic is debated in November and December. Other leagues release topics as well (e.g.
the University Interscholastic League for some local debate in Texas, the National Catholic Forensic League for
their Grand Finals), but the NFL topics are widely used and the Tournament of Champions uses the
January/February topic. During the period of the sample, these were:
2004: Resolved: A government’s obligation to protect the environment ought to take precedence over its obligation
to promote economic development.
2005: Resolved: Democracy is best served by strict separation of church and state.
2006: Resolved: The use of the state’s power of eminent domain to promote private enterprise is unjust.
2007: Resolved: The actions of corporations ought to be held to the same moral standards as the actions of
individuals.
2008: Resolved: It is just for the United States to use military force to prevent the acquisition of nuclear weapons by
nations that pose a military threat.
2009: Resolved: The United States ought to submit to the jurisdiction of an international court designed to
prosecute crimes against humanity.
The topic in 2010 was: Resolved: Economic sanctions ought not be used to achieve foreign policy objectives.
6
In LD, the first speech is the “Affirmative Constructive” (AC), a 6-minute speech offered by the affirmative
speaker (Aff) advocating that the judge vote for them. Following the AC, the negative speaker (Neg) asks the Aff
cross-examination questions (CXes) for 3 minutes. Subsequently, the Neg gives a 7-minute “Negative Constructive”
(NC), and the Aff CXes the Neg for 3 minutes. The Aff then gives a 4-minute rebuttal (1AR), followed by a
6-minute rebuttal by the Neg (NR), followed by a 3-minute rebuttal by the Aff (2AR).
7
Preliminary rounds are adjudicated by a single judge. For elimination rounds, it is customary for odd-numbered
panels of judges to decide the outcome. Speaker points in elimination rounds are not relevant.
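The speech lengths given in footnote 6 can be tallied to check that the format gives each side equal speaking time even though the affirmative speaks three times and the negative twice. The following is a minimal sketch: the data layout and the function names are assumptions made for illustration, and only the durations come from the footnote.

```python
# Speech and cross-examination lengths in minutes, as listed in footnote 6.
# The tuple layout (label, speaker, minutes) is an illustrative assumption.
ROUND_FORMAT = [
    ("AC", "Aff", 6),   # Affirmative Constructive
    ("CX", "Neg", 3),   # Neg cross-examines the Aff
    ("NC", "Neg", 7),   # Negative Constructive
    ("CX", "Aff", 3),   # Aff cross-examines the Neg
    ("1AR", "Aff", 4),  # first affirmative rebuttal
    ("NR", "Neg", 6),   # negative rebuttal
    ("2AR", "Aff", 3),  # second affirmative rebuttal
]


def speaking_minutes(side):
    """Total speech time for one side, excluding cross-examination."""
    return sum(
        minutes
        for label, speaker, minutes in ROUND_FORMAT
        if speaker == side and label != "CX"
    )


def round_minutes():
    """Total length of all speeches and cross-examinations in a round."""
    return sum(minutes for _, _, minutes in ROUND_FORMAT)


print(speaking_minutes("Aff"), speaking_minutes("Neg"), round_minutes())
# prints: 13 13 32
```

Under these figures each debater speaks for 13 minutes, and the listed speeches and cross-examinations together take 32 minutes.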
B. A COMMUNITY OF DEBATERS
These competitions and discussions about competition take place within a social context.
An excellent sociological study of the high school debate community has already been conducted
(Fine, 2001) and is beyond the scope of this article. This section provides only that information
useful for understanding the motivations and abilities of the participants in the round. In addition
to the aforementioned study, the authors draw largely on their own observations, as both have
been involved in debate at various levels for over a decade.8
The community of debaters and judges is a non-random group. Fine (2001) indicates that,
relative to their peers, high school debaters are from wealthier families and communities, more
likely to be white or Asian, more likely to be male, and more academically successful (in terms
of grades, standardized test scores, and college matriculation). Moreover, the community of
debaters and coaches tends to be highly invested in the activity, with intra-community social
status corresponding primarily to competitive success. Though the rise of Urban Debate Leagues
is increasing racial and socio-economic diversity in high school (and thus intercollegiate) policy
debate, there has not been any appreciable effect in Lincoln Douglas debate.
8
Both were high school Lincoln Douglas debaters in Texas. Chad Henson competed in policy debate in college,
coached a national circuit high school team, became a private coach and judge for hire (in both policy and Lincoln-
Douglas) on the local and national high school circuits, and later coached intercollegiate policy debate at the
University of Illinois. Paul Dorasil left debate for several years after high school, and then became a prominent
judge at local and national circuit tournaments in Texas before coaching a high school transitioning from the local to
national circuits in Florida. It is this familiarity with the debate community that permitted the efficient collection and
coding of data; without it, the lack of publicly available information would have made gathering and coding the data
a much more expensive proposition.
Despite these commonalities among all aspects of the debate community, there are
significant regional differences. Some states (e.g. Texas) have strong local debate cultures,
significant mingling of local- and national-circuit teams, a large number of TOC tournaments
available in the state, and a history of national success. Other states have few or none of these
things. The local debate culture can influence debater and judge behavior. Some arguments are
considered more acceptable in some regions than others. Local judges in Texas, for example,
tend to be very comfortable with a high rate of speed and technical evaluation of debates, while
local judges in most other regions tend to resist this trend.
Moreover, the national circuit is a community all its own. While most (but not all)
tournaments on the “national circuit” have large contingents of local debaters who attend that
tournament but do not generally travel nationally, the elimination rounds of those tournaments
are typically dominated by debaters who travel nationally and adjudicated by judges with a
national reputation. Even on the national circuit, argumentation differences by region exist. For
example, the use of continental philosophy to critique the discourse of an opposing debater
would be considered an acceptable argument by more judges at the Victory Briefs Tournament
(California) than the Crestian (Florida). Texas debaters are known for being particularly fast and
technical. California debaters were among the first to experience success with “debate theory”
arguments.9 Some, but not all, judges on the east coast tend to dislike debate theory or critical
argument, even on the national circuit.
9
These are arguments about what sort of arguments competitors are permitted to make and how judges should
evaluate them. Whether a negative debater is permitted to advocate two conflicting positions would be an issue that
is resolved by debate theory, with the negative debater attempting to persuade the judge that debate is better (i.e.
more fair and educational) when this is permitted, and the affirmative saying debate is better when this is prohibited.
At the national level, and in some local circuits, coaching and judging is a highly
lucrative activity for college students. A student so inclined could earn upwards of $10,000 in a
year and engage in what many would consider a leisure activity at others’ expense, all working
part-time while enjoying high status among a close-knit group of elite participants. Though the
per-hour rate for debate coaching may or may not be particularly high, depending on the types of
coaching or judging activities in which the student engages, it is difficult for young people to
find higher-status work or work better-suited to a student’s schedule. Debate coaching and
judging is also an activity of the young, with much of the day-to-day non-administrative work
being done by current college students or those recently graduated from college, augmented by a
small cadre of elite senior coaches (often involved heavily in tournament administration) and
another group of “lifers” who have been in debate for the entirety of their post-pubescent lives.
In addition to the monetary rewards, there are significant social incentives facing debaters and
judges. A substantial portion of the friends of judges and debaters are themselves fellow
debaters. Loss of one’s social place in the community without the ability to quickly or easily
replicate a similar social place in another community can itself be damaging – one reason why
some young coaches remain coaches, forgoing the opportunities in law, financial services, or
entrepreneurship that frequently attract debaters.
As a community norm, judges are bound to adjudicate the round based on the arguments
made by the debaters to the extent possible, avoiding biases.10 Though there is some allowance
made for a diversity of viewpoints about how rounds should be evaluated, judges who deviate
from this norm to an unacceptable degree are punished through loss of social capital and
decreased ability to access lucrative judging and coaching contracts. Judges are also supposed to
make “correct” decisions, meaning that they should vote for the “better debater” in the round.
Since outside observers are not typically in the round, this generally means making decisions
consistent with other judges, voting for debaters that end up doing well and against debaters who
end up doing poorly.
10
A judge who evaluates rounds with a detectable bias is known as a “hack,” and the act of doing so as “hacking.”
These are not kind terms in the debate community.
C. THE TOURNAMENT OF CHAMPIONS
The sine qua non of elite debate is participation (and success) at the Tournament of
Champions. The Tournament of Champions (TOC) is held at the University of Kentucky each
year in late April or early May. This tournament is the national championship of the national
circuit debate community. There are two days of preliminary round competition, where all
debaters compete in seven preliminary rounds against seven different opponents. Each
competitor will affirm and negate at least three times, with the side for the 7th round being randomly
assigned. The top sixteen11 debaters advance to single-elimination competition, like the Sweet
Sixteen in NCAA Basketball, until a champion is determined.
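The “bubble round” procedure described in footnote 11 can be sketched in code. The footnote confirms only that, with twenty debaters on at least five wins, the 13th seed faces the 20th seed. The pairing of the remaining bubble seeds (highest against lowest), the function name, and the parameter names below are assumptions made for illustration.

```python
def bubble_pairings(num_qualified, bracket_size=16):
    """Pair the lowest seeds so that the field shrinks to bracket_size.

    Assumption: within the bubble, the highest seed meets the lowest seed,
    the second highest meets the second lowest, and so on. Footnote 11
    states only the 13-vs-20 pairing for a field of twenty.
    """
    extra = num_qualified - bracket_size
    if extra <= 0:
        return []  # no bubble rounds needed
    if extra > bracket_size:
        raise ValueError("one bubble round cannot reduce the field enough")
    # The last `extra` bracket slots are contested by 2 * extra debaters.
    first_bubble_seed = bracket_size - extra + 1
    seeds = list(range(first_bubble_seed, num_qualified + 1))
    return [(seeds[i], seeds[-1 - i]) for i in range(extra)]


print(bubble_pairings(20))
# prints: [(13, 20), (14, 19), (15, 18), (16, 17)]
```

With twenty qualifiers, seeds 1 through 12 advance directly and four bubble rounds fill the remaining four slots.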
In order to qualify to compete at the TOC, debaters must earn two “bids” at designated
tournaments throughout the year.12 Approximately 70-75 debaters each year qualify to compete.
Competing at these tournaments is an expensive and time-consuming enterprise, and most
competitors have access to substantial resources through private means or through their school.
11
Every debater who wins at least five of their seven preliminary rounds will debate at least one more round.
Seeding is determined by number of wins, followed by various measures of the number of speaker points awarded
by judges. If more than sixteen debaters won five rounds, which occurred each year in our sample, then there is a
round held among the lowest seeds to fill the contested slots. For example, if twenty debaters won at least five
rounds, then the 13th-seeded debater would face the 20th-seed to determine who would compete in further rounds.
The results of these “bubble rounds” are not included in our sample, as they are evaluated by three-judge panels.
12
A debater who advanced to elimination rounds the previous year automatically qualifies. Under certain conditions,
a debater who earns a single bid may qualify.
College coaches frequently attend the TOC and TOC-qualifying tournaments to identify
potential recruits, sometimes serving as judges.
Each debater is responsible for fulfilling his or her judge obligation, which requires that
he or she provide a person willing and able to judge four preliminary rounds and remain
available to judge until one round past that debater’s elimination from the tournament. If a judge
is willing to judge every round, that judge can fulfill the judging obligation for two debaters.
Debaters then select the judges who will be eligible to adjudicate the rounds in which they
compete.13 A judge considered roughly equally suitable by both competitors will evaluate a
round between them. Judges for this tournament tend to be highly qualified, having years of
experience as successful competitors and/or coaches. The vast majority make some information
about how they evaluate debates available to debaters in advance, so debaters can seek to adapt
their strategy to the judge’s predilections.
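The idea of a judge being "roughly equally suitable" to both competitors can be illustrated with a minimal sketch. It assumes the A/B/C/Strike ranking scheme described in footnote 13; the function name pick_judge, the numeric mapping of ranks, and the tie-breaking rule are hypothetical and are not a description of the tournament's actual assignment procedure.

```python
RANK = {"A": 1, "B": 2, "C": 3}  # "Strike" is deliberately absent from this map


def pick_judge(prefs_aff, prefs_neg, available):
    """Return the available judge whose ranks from the two debaters are closest.

    prefs_aff / prefs_neg: dict mapping judge name -> "A", "B", "C", or "Strike".
    Ties go to the better (lower) combined rank, then to the alphabetically
    first name. Both tie-breaks are illustrative assumptions.
    """
    candidates = []
    for judge in available:
        rank_aff = prefs_aff.get(judge)
        rank_neg = prefs_neg.get(judge)
        if rank_aff not in RANK or rank_neg not in RANK:
            continue  # struck (or left unranked) by at least one debater
        gap = abs(RANK[rank_aff] - RANK[rank_neg])
        candidates.append((gap, RANK[rank_aff] + RANK[rank_neg], judge))
    return min(candidates)[2] if candidates else None


aff_prefs = {"Smith": "A", "Jones": "B", "Lee": "Strike"}
neg_prefs = {"Smith": "C", "Jones": "B", "Lee": "A"}
chosen = pick_judge(aff_prefs, neg_prefs, ["Smith", "Jones", "Lee"])
```

Here Jones is chosen: both debaters rank Jones "B", whereas Smith is ranked "A" by one debater and "C" by the other, and Lee is struck by the affirmative.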
IV. DATA
A. OBSERVATIONS
We collect data from all preliminary rounds of the Lincoln-Douglas Debate Tournament
of Champions held in years 2004-2009 that meet certain conditions.14 Our data contain
13
During three years in the sample (2004, 2005, 2009), debaters assigned judges an “A”, “B”, “C”, or “Strike”
ranking. Judges struck by a debater were absolutely barred from judging that debater. Debaters were instructed to
assign a certain number of eligible judges to each rank. This system has been explained and critiqued in some detail
by Decker and Morello (1984). During the other three years (2006-2008), debaters were provided a larger quantity
of strikes but not permitted to rank un-struck judges.
14
Originally, all rounds were coded. Rounds were subsequently excluded if there were not two competitors (i.e.
there was a bye) or consistent results were not reported for both competitors (e.g. one competitor dropped from the
competition prior to completion and did not appear in the results packet for the tournament). These rounds were
excluded for four reasons: (1) quality control was made more difficult when one competitor’s data could not be
checked against the other’s; (2) speaker point differential could not be calculated due to the absence of speaker point
information for these competitors; (3) controlling for the characteristics of the missing debater would have been
possible only by utilizing private information (e.g. asking the competitor who s/he debated against and coding that
person); and (4) the performance of a debater incapable of completing the tournament was likely affected by
whatever caused them to drop from the tournament (e.g. severe illness).
information on the side (affirmative/negative), outcome (win/loss), and the sex and region of the
affirmative, negative, and judge. Summary statistics are found in Table A1. We excluded the
first and second rounds at each tournament in order to ensure that the results only include rounds
between evenly-matched competitors. In rounds 3-7, debaters are paired against each other based
on their previous record and speaker points (“power matched”); the first two rounds are
randomly paired. Each debate generates a single observation.15
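The sample restrictions described above and in footnote 14 can be sketched as a simple filter. The Round record and its field names are hypothetical stand-ins for however the coded data are actually stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Round:
    year: int
    round_no: int             # preliminary round number, 1 through 7
    aff: Optional[str]        # None marks a bye
    neg: Optional[str]
    aff_won: Optional[bool]   # None if results were not reported consistently


def in_sample(r: Round) -> bool:
    """Apply the exclusions: byes, inconsistent results, and rounds 1 and 2."""
    if r.aff is None or r.neg is None:
        return False          # bye: there were not two competitors
    if r.aff_won is None:
        return False          # no consistent result for both competitors
    return r.round_no >= 3    # keep only the power-matched rounds 3-7


coded = [
    Round(2004, 1, "X", "Y", True),    # randomly paired round: excluded
    Round(2004, 3, "X", None, None),   # bye: excluded
    Round(2004, 4, "X", "Z", False),   # kept
]
sample = [r for r in coded if in_sample(r)]
```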
The dependent variable in our sample is the round outcome. WinAff is an outcome
dummy variable that takes the value of 1 if the affirmative debater won, 0 if the affirmative
debater lost. This is the ultimate measure for the outcome of the round because wins are the sole
determinant of who advances to elimination rounds at this tournament, and speaker points are
only relevant to debaters that (1) have sufficient wins to be concerned about seeding in
elimination rounds or (2) have sufficient speaker points to be in the running for a top speaker
award. Furthermore, judges can only award one win per round, making this the only direct
competition for a scarce “resource.”
B. SKILL PROXY
In order to control for the relative skill of each debater, we develop a proxy we call
skillprox. While it is impossible to perfectly quantify skill, we exploit as much information as is
available in order to reach the best approximation. Ideally, there would exist an independent
rating of each debater that would be developed throughout the season, similarly to how chess
players are rated.16 Since such a rating system is unavailable, we use our own data to
15
Our dataset includes 1567 rounds of competition prior to the exclusion of rounds 1 and 2, and 1099 rounds after
those two rounds are excluded.
16
There is now, in fact, such a system created by Fantasy Debate (fantasydebate.com). It did not exist during our
sample period. Efforts to obtain sufficient data to retroactively create this system failed.
approximate skill. Skillprox consists of two components: the “transitivity” component and the
“other-wins” component. Each of these presents a different way of predicting a round’s outcome
by exploiting information from other rounds. The transitivity component increases by 1 for every
opponent (excluding the current one) the affirmative defeated who defeated the negative in a
different round, and decreases by 1 for every opponent the negative defeated who defeated the
affirmative in a different round. This ranges between +/- 4 prior to normalization. The second
component, other-wins, is the raw number of affirmative wins (excluding the current round)
minus the raw number of the negative wins (also excluding the current round). Skillprox is
calculated as a weighted sum of the two components, where each component is normalized to
have a mean of 0 and a standard deviation of 1. The transitivity component is given twice as
much weight as the other-wins component.17 The variable femskillprox takes the value of
skillprox if the judge is female and zero otherwise. We include this interaction term to control
for potential differences in how male and female judges evaluate skill.
We expect a positive parameter estimate associated with skill because better debaters
should win more often. We expect the parameter estimate associated with femskillprox to be
insignificant as there is no reason to think that female judges evaluate skill differently than male
judges.
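The construction of skillprox described above can be sketched in code. This is an illustration under stated assumptions rather than the authors' implementation: it assumes the components are built from all reported rounds of a single tournament, that normalization uses the population standard deviation across the rounds passed in, and that the weights are 2 and 1 (the text fixes only their ratio). The function names are hypothetical.

```python
from statistics import mean, pstdev


def raw_components(rounds):
    """rounds: list of (aff, neg, aff_won) tuples from a single tournament.

    Returns one (transitivity, other_wins) pair per round, before normalization.
    Assumes no two debaters meet twice, as in the TOC preliminary rounds.
    """
    beaten = {}  # debater -> set of opponents that debater defeated
    for aff, neg, aff_won in rounds:
        winner, loser = (aff, neg) if aff_won else (neg, aff)
        beaten.setdefault(winner, set()).add(loser)
        beaten.setdefault(loser, set())

    components = []
    for aff, neg, aff_won in rounds:
        # +1 for each third party the affirmative beat who in turn beat the negative
        plus = sum(1 for x in beaten[aff] if x != neg and neg in beaten[x])
        # -1 for each third party the negative beat who in turn beat the affirmative
        minus = sum(1 for x in beaten[neg] if x != aff and aff in beaten[x])
        # Win totals excluding the current round
        aff_other = len(beaten[aff]) - (1 if aff_won else 0)
        neg_other = len(beaten[neg]) - (0 if aff_won else 1)
        components.append((plus - minus, aff_other - neg_other))
    return components


def zscore(values):
    """Normalize to mean 0 and standard deviation 1 (population formula)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def skillprox(rounds, w_trans=2.0, w_other=1.0):
    """Weighted sum of the normalized components; the 2:1 ratio is from the text."""
    comps = raw_components(rounds)
    z_trans = zscore([c[0] for c in comps])
    z_other = zscore([c[1] for c in comps])
    return [w_trans * t + w_other * o for t, o in zip(z_trans, z_other)]


# Toy tournament: A beat B, B beat C, and A beat C.
toy = [("A", "B", True), ("B", "C", True), ("A", "C", True)]
raw = raw_components(toy)
scores = skillprox(toy)
```

In the toy data only the round between A and C has a nonzero transitivity component: A defeated B, who defeated C, so the component is +1 in favor of the affirmative.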
C. INDEPENDENT VARIABLES
Each debater and judge was identified by sex.18 PF is a dummy variable taking the value
of 1 if P is female, 0 otherwise, where P = AFF for Affirmative, NEG for Negative, and JDG for
judge. Female competitors constituted approximately 26% of the sample, while female judges
constituted approximately 13% of the sample. We also include sex interaction terms for each
combination of affirmative, negative, and judge.
Each competitor, opponent, and judge was also identified as belonging to a particular
state.19 States were aggregated into seven regions. Table A2 reports the regional aggregations.
Debaters were coded as belonging to the region where the school for which they were competing
that year is located. The coding for judges was somewhat complicated. A judge’s regional
affiliation is important because it may indicate both paradigmatic differences and political
affiliations. The authors associated each judge and debater in each year with a primary state
affiliation (e.g. Texas for Henson in 2005 and 2006, but Illinois if we were to code rounds for
2010). We then coded each judge as affiliated with a particular region in the year they judged.
There is a somewhat high degree of turnover among coaches and judges, with college-aged
judges often switching school affiliations. Full-time school-affiliated coaches were identified as
belonging to the region where their school was located. All other judges were identified as
belonging to the region where they were engaged in most of their debate-related coaching or
judging activities or, if that was unclear, where they were physically located. If no information
on the judge’s location or employment was available, they were coded as affiliated with the
region where they competed as students.20
17
We assume the transitivity component contains more information than the other-wins component.
18
There were no transsexual or transgender debaters or judges in the sample, as far as we know. On a similar note,
we do not attempt to identify the sexual orientation or gender identification of the (largely adolescent and young-
adult) competition and judging pool. This coding represents our best efforts at identifying the biological sex of each
person represented in the sample, many of whom are personally known to the authors and those who assisted them.
To the extent that sex is a proxy for gender, our findings apply to gender.
19
All competitors and judges are from the contiguous United States.
20
Despite the care taken to get precise judge affiliation information, we recognize that some cases of regional
affiliation may be open to question. To minimize the likelihood of outright error and maximize our available
information, we enlisted the assistance of a regionally diverse group of coaches and ex-debaters to review our initial
coding.
Pr is a dummy variable that takes the value of 1 if P is affiliated with region r, where P =
AFF for affirmative, NEG for negative, and JDG for judge and r takes the value of the region
number. We have no a priori expectation regarding the value of the parameters associated with
region. Region 1 is used as the constant. We also include interaction variables for each
combination of affirmative, negative, and judge region.
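The sex and region indicators described in this section can be sketched as follows. The encoding is a hypothetical illustration: the names of the pairwise sex interactions, the treatment of region 1 as an omitted baseline for the main effects, and the sparse dictionary representation are assumptions. The region interaction names follow the labels used in Table 2 (e.g. J1N1, A6N4).

```python
def encode_round(aff, neg, judge):
    """Each argument is a dict with keys 'female' (bool) and 'region' (int, 1-7).

    Returns a sparse dict of indicator variables; any name not present is 0.
    """
    af, nf, jf = int(aff["female"]), int(neg["female"]), int(judge["female"])
    x = {
        "AF": af, "NF": nf, "JF": jf,
        "JF*AF": jf * af, "JF*NF": jf * nf, "AF*NF": af * nf,
        "JF*AF*NF": jf * af * nf,
    }
    a, n, j = aff["region"], neg["region"], judge["region"]
    # Region main effects; region 1 is treated as the omitted baseline.
    for prefix, region in (("A", a), ("N", n), ("J", j)):
        if region != 1:
            x[f"{prefix}{region}"] = 1
    # Pairwise region interactions, named like the labels in Table 2.
    x[f"J{j}A{a}"] = 1
    x[f"J{j}N{n}"] = 1
    x[f"A{a}N{n}"] = 1
    return x


# A female affirmative from region 6 meets a male negative from region 1
# in front of a male judge from region 1.
x = encode_round(
    aff={"female": True, "region": 6},
    neg={"female": False, "region": 1},
    judge={"female": False, "region": 1},
)
```

In this example the indicators J1N1, J1A6, and A6N1 are switched on, along with AF and the region main effect A6.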
Our expected parameter estimate is ambiguous for all regional interactions other than
those where the competitor and judge are from the same region. For those, we expect that judges
may favor debaters from their own regions. For example, if the judge is from the same region as
the negative, then the coefficient should be negative, since the dependent variable records an
affirmative win. The reason for this expectation is threefold. First, non-controversially, a competitor from the same region as his or her judge
presumably has more familiarity with the particular predilections of his or her judge and can
adapt his or her behavior to conform more closely to that judge’s desires. Second, non-
controversially, a competitor from the same region as his or her judge has some affinity for the
same debate style as that judge, to the extent that region is partially responsible for stylistic
preferences. Third, controversially, judges from a given region may be inclined to favor (or
disfavor) competitors from that region, either consciously or subconsciously, because of their
frequent interaction with the competitors and those competitors’ coaches. A judge might prefer
to dictate an outcome favorable to those with whom that judge frequently associates. It is
important to note that because we cannot distinguish among these three motivations for bias, it is
not possible to directly deduce from a general finding of region bias whether the biases are licit
biases (i.e. debaters from a given region “debate better” in front of judges from that region)
or illicit biases (i.e. debaters do not “debate better,” but rather the judge rewards them,
consciously or subconsciously, for reasons external to their manifest debate skill).

V. MODEL AND ESTIMATION
We employ a simple model motivated by Arrow (1973). A judge’s utility is a function of
his or her own characteristics and the characteristics of each of the debaters

where J, A, and N are vectors of judge characteristics, affirmative characteristics, and negative
characteristics, respectively. S represents the relative skill of the chosen debater while B
represents the relative judging bias associated with the chosen debater. S and B are each
functions of V, which debater the judge votes for. Judging bias is defined as anything that
influences a judge’s decision other than the skill of the debaters and is a function of the
characteristics of the judge, the affirmative, and the negative. Reputation, R, is a function of
being perceived as a good (accurate) judge, G, minus being perceived as a biased judge, H. The
probability that bias is detected is θ, which is distributed between 0 and 1. F, making friends, is
the value the judge receives from exercising bias without detection.21 Judges make decisions to
maximize utility constrained only by the requirement that they award exactly one win. We
assume the following first order conditions:

21
This could potentially take the form of any benefit the judge could receive indirectly by making the winner or the
winner’s coach happy. These might include future judging contracts or a position as a camp instructor. A judge
whose bias is detected would not thereafter be rewarded for this bias.

As such, judges may face a trade-off between maintaining a good reputation and making friends
by exercising bias. Thus, we can simplify the estimation considerably.
Our model deviates from Arrow’s in that there is no explicit production function. A
judge does not benefit directly, in monetary terms, from a debater’s arguments or from a
debater’s success later in the tournament. As such, a judge faces no direct trade-off between bias
and profit. However, judges develop reputations as good or bad judges based on the decisions
they render. First, if a judge consistently votes for debaters who are later successful, his or her
reputation will improve, and vice versa. Second, a judge’s reputation will decline if he or she
becomes known as a biased judge. Judges have significant incentives to improve their reputation,
detailed supra at III.B. To this extent, the analogue of the traditional production function is the
reputation function.
A debater’s in-round performance provides a judge with the best indication of how a vote
for the debater will affect the judge’s reputation. As with the productivity of an individual
worker, we do not observe debaters’ in-round performance. However, we use skillprox,
described earlier, to serve as a proxy for the potential contribution of each debater to the judge’s
reputation. Bias, therefore, can be interpreted as a factor other than skill influencing a judge’s
decision. Due to the nature of our data, we are unable to distinguish between taste-based
discrimination and information-based discrimination.
The model is estimated as follows:
$$
\mathit{WinAff}_{yr} = S_{yr}\beta + J_{yr}\delta + A_{yr}\zeta + N_{yr}\eta + (J_{yr} * A_{yr})\kappa + (J_{yr} * N_{yr})\lambda + (A_{yr} * N_{yr})\mu + \gamma_{y} + \varepsilon_{yr}
$$
where the dependent variable, WinAff, takes the value of 1 if the affirmative won the round. S is
the skill proxy described earlier. J, A, and N are vectors of sex and region characteristics for the
judge, affirmative, and negative. γ is a vector of fixed effects across years. ε is a zero-mean
error term. The subscripts y and r denote the year and round. β, δ, ζ, η, κ, λ, and μ are vectors
of parameters to be estimated. We include fixed effects across years in order to control for the
effects from unobservable factors that vary from tournament to tournament. For example, these
fixed effects help account for minor rule changes22 and differences in policy regarding
competitor qualification.23
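For readers who want the link between the estimating equation and the marginal effects reported below spelled out, the following sketch assumes a standard logit specification, as the abstract's description of a panel logit model suggests. The index $\eta_{yr}$ (the right-hand side of the estimating equation without the error term), the generic regressor $x_k$, and its coefficient $\beta_k$ are notation introduced here for illustration, not the authors'.

```latex
% Logit link: probability of an affirmative win given the index \eta_{yr}
\[
\Pr(\mathit{WinAff}_{yr} = 1) = \Lambda(\eta_{yr}) = \frac{1}{1 + e^{-\eta_{yr}}}
\]
% Marginal effect of a continuous regressor x_k (e.g. skillprox)
\[
\frac{\partial \Pr(\mathit{WinAff}_{yr} = 1)}{\partial x_k}
  = \beta_k \, \Lambda(\eta_{yr}) \bigl(1 - \Lambda(\eta_{yr})\bigr)
\]
% Discrete change for a dummy regressor x_k (e.g. a sex or region indicator)
\[
\Delta_k = \Lambda(\eta_{yr} \mid x_k = 1) - \Lambda(\eta_{yr} \mid x_k = 0)
\]
```

Because most regressors in the model are dummy variables, the discrete-change form is the natural one for them. Which form, and which evaluation point, underlies the reported margins is not stated in this section.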
VI. RESULTS
Table A3 displays the full results of the estimation of the model. Table A4 displays the
full results of the marginal effects. Table 2 displays just those marginal effects significant at the
p < 0.11 level.
Table 2. Significant Marginal Effects
Variable        N      Margin   Standard Error   p-value
Neg (const.)           -0.22    0.09             0.01
skillprox               0.04    0.01             0.00
AF              297    -0.09    0.05             0.06
JF              151     0.12    0.06             0.07
JF*AF*NF        12      0.65    0.40             0.10
J1N1            61     -1.22    0.31             0.00
J1A6            77     -0.58    0.27             0.03
J2N2            40     -0.84    0.36             0.02
J3A7            7      -1.37    0.44             0.00
J5N1            11     -0.70    0.39             0.07
J5N5            2      -0.58    0.36             0.11
A2N1            39     -0.62    0.36             0.09
A4N1            42     -0.85    0.37             0.02
A6N4            51     -1.15    0.34             0.00
A7N6            36     -1.03    0.40             0.01
22
E.g. in three years of the sample, competitors used mutual judge preference, while in three other years,
competitors had an increased ability to strike judges but no ratings of judges.
23
The qualifying tournaments change from year to year, as does the number of competitors entitled to compete and
the number of at-large bids awarded.
The marginal effect of the constant term is -0.22. We interpret this as side bias in favor
of the negative since the dependent variable is WinAff. Side bias in favor of the negative is well-
known in high school Lincoln-Douglas Debate. The marginal effect of skillprox is 0.04,
indicating that more skilled debaters are more likely to win. The marginal effect of AF is -0.09,
providing marginal evidence that female affirmatives lose more often in front of male judges
than in front of female judges. The marginal effect of JF is 0.12, which provides marginal evidence that
female judges are more likely to vote affirmative than male judges. The marginal effect on the
interaction term JF*AF*NF is 0.65, which provides marginal evidence that rounds involving a
female affirmative, negative, and judge are more likely to result in an affirmative victory than
those with a male affirmative, negative, and judge. However, this result is questionable as only
12 of these rounds appear in the sample.
The marginal effects of J1N1 and J1A6 are -1.22 and -0.58, respectively, indicating that
Texas judges are more likely to vote for Texas negatives and against Northeast affirmatives.
The marginal effect of J2N2 is -0.84, indicating that California judges are more likely to vote for
California negatives. The marginal effects of J3A7, J5N1, and J5N5 are all significant or
marginally significant with values of -1.37, -0.70, and -0.58 respectively. This indicates that
Southwest judges are more likely to vote against Southeast affirmatives, and Northeast judges
prefer Texas negatives and Northeast negatives. However, the low cell counts associated with
these interaction terms (7, 11, and 2, respectively) make these results unreliable.
The marginal effects of A2N1 and A4N1 are -0.62 and -0.85, respectively, indicating that
Texas negatives perform particularly well against affirmatives from California and the Midwest.
The marginal effect of A6N4 is -1.15, indicating that Midwest negatives perform particularly well
against Northeast affirmatives. The marginal effect of A7N6 is -1.03, indicating that Northeast
negatives perform particularly well against Southeast affirmatives.
VII. CONCLUSION
A. DISCUSSION
We use six years of annual tournament data to investigate potential side, sex, and region
biases at the high school Lincoln-Douglas Tournament of Champions. Taken together, these
results provide significant evidence of judging bias at this tournament. Judges tend to prefer
debaters of their own sex and region. There is also strong and significant evidence of side bias in
favor of the negative. Additionally, we find evidence that male judges are more likely to vote
against female affirmatives, Texas judges are more likely to vote for Texas negatives and against
Northeast affirmatives, and California judges are more likely to vote for California negatives.
Taken as a whole, these results may indicate that a known negative bias allows judges to
express other biases without detection when voting for the debater representing the negative side.
It has long been well-known in the debate community that debating on the negative is
advantageous. This known side bias may provide judges with a means of expressing bias with a
lower probability of detection. The evidence of bias shows up almost exclusively when judges
vote negative. Each of the sex and region effects noted above (male judges voting against female
affirmatives, Texas judges voting for Texas negatives and against Northeast affirmatives, and
California judges voting for California negatives) corresponds to a negative ballot.
These results may indicate that debate judges feel more comfortable expressing biases
when voting negative because a negative win is the more expected outcome and negative ballots
will be less likely to be called into question by anyone other than the losing debater. This
increased expression of illicit preferences when detection is least likely is consistent with the
findings of scholars of judicial decision-making, who find increased willingness of appellate
court judges to pursue their preferred outcomes when detection becomes less likely due to the
absence of a judge with differing political preferences signaling the majority’s deviation from
legal rules (Cross & Tiller, 1998). It does not, however, indicate that the biases are conscious. An
illicit desire to vote for debaters of one’s own region and sex coupled with conscious evasion of
detection is certainly a possible explanation for this finding. However, it is also possible that judges
are more conscientious when they vote affirmative than when they vote negative, taking greater care to
justify their decisions and thus being less subject to subconscious biases.
This avoidance of detection is also compatible with both taste- and information-based
discrimination – at least in the case of region. It is possible that there is something unique about
negating that enables a negative debater to take advantage of a judge being from the same region
where an affirmative debater could not. Since the negative debater speaks second, and thus
knows the affirmative debater’s initial strategy position, it may be that choosing the correct
response strategy against a position for a particular judge is more important than choosing the
correct initial strategy. If judges publicly reported on the strategies of both sides, as has become
the norm at some tournaments in intercollegiate policy debate, this hypothesis could be tested
empirically. It seems difficult to reconcile information-based discrimination with a sex bias
that only appears when females affirm in front of male judges.24
B. POLICY RECOMMENDATIONS
These results indicate that substantial side, region, and sex bias exist at the TOC. Because
these biases make the outcomes of debates less dependent upon the skill of the debaters,
permitting bias to exist in debate may undermine the goals of competitive fairness and education
that are central to debate as a competitive academic activity. Debaters invest a great deal of
effort into the activity of debate largely in order to gain competitive success and the attendant
social and financial rewards that come from that success. If success in debate is determined by
something other than skill, those who do not possess that something (e.g. maleness or residency
in a region that sends a large contingent of judges to the TOC) may be less likely to compete, and
the incentive for research and intense training diminishes when the marginal effect of a small
change in ability to debate is less important than the results of a coin flip. In this way, bias may
reduce both the number of people participating in debate and its value to those who do
participate, in addition to harming those against whom discrimination directly occurs.
Loss of participation in debate is important not only for the debate community, but for
society as a whole. Competitive debate conveys a large number of benefits to its participants:
research skills, speaking skills, the ability to think quickly on one’s feet, knowledge of current
events, and the attendant personal empowerment that the acquisition of these brings
24
In order for this to be information-based discrimination, it would have to be true that: (1) female debaters are
worse at affirming than male debaters are, and (2) male judges are better at identifying this deficiency than female
judges are. Though this is possible, it is unlikely. Whatever sex differences may occur in the debate population as a
whole, it is unlikely they would be replicated in an elite pool where no effort is made to ensure equal gender
representation among either the competitors or judges. In fact, females display less of a side bias than males do;
female judges would thus appear to be superior to male judges from the standpoint of competitive equity.
(Edwards, 2008). Participation in competitive debate also positively contributes to participants’
ability to successfully pursue careers in many prestigious and remunerative fields such as
academia, business management, and law (Zompetti & Driscoll, 2002). In more immediate
terms, scholarships for competitive debate are attached largely to high school success. Sex
bias in debate, reducing the success and participation rates of female participants, may thus be
contributing to sex disparities later in life.
The most prevalent form of bias in high school Lincoln Douglas debate is the side bias in
favor of the negative. The effects of negative side bias are difficult to overstate. First, while the
estimated marginal effect of the side bias is not the largest, it is the most pervasive: side bias is
the only form of bias that is potentially present in every round. Second, the marginal effect of
side bias is large enough to be problematic on its own; debating on the negative is substantially
preferable to a small but positive skill proxy advantage, which may be why the transitivity study
indicates that competent judging does not produce results statistically different from random
decisions. In other words, given side bias, it is better to be lucky (through random assignment to
the negative) than good. Additionally, the prevalent negative bias may provide a means for
judges to shield other expressed biases from detection. In this way, side bias may exacerbate
both imbalances in competitive equity at national tournaments and anti-female sex bias. Side bias
must be corrected in order to make debate a fair activity – especially for female debaters and
those less familiar with the judges at any given tournament, but also for the community as a
whole.
One potential source of side bias relates to speech times. The affirmative is allowed three
short speeches (6, 4, and 3 minutes) while the negative is allowed two long speeches (7 and 6
minutes). There are two intuitive explanations for the effect of speech times on side bias. First,
because any argument not refuted is considered true for the purpose of the debate, affirmative
debaters are required to “cover” 7 minutes of negative speech time in 4 minutes, then cover 6
minutes of negative speech time in 3 minutes. Due to the large number of arguments presented in
each round, the negative may possess a substantial advantage over the affirmative due to this
convention. Second, judges only consider arguments discussed in each debater’s last speech,
ignoring any that both sides “drop.” The negative has more time to bring up relevant and helpful
arguments and explain their significance, while the affirmative must attempt to diminish the
importance of the negative’s arguments and bring up their own in half the time due to the 2:1
ratio of NR time to 2AR time.25 Another potential source of side bias relates to organizational
difficulty stemming from the number of speeches. Because each side is only allotted a limited
amount of “prep time,” or free time to prepare their next speech, organization becomes more
difficult with a greater number of speeches.26
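The arithmetic behind the speech-time explanation can be made explicit with a short worked example. The speech lengths are those given above; matching them to the labels 1AC, NC, 1AR, NR, and 2AR follows the abbreviations used in the footnotes, and the "coverage burden" ratio is a descriptive device introduced here, not a measure used in the estimation.

```python
AFF_SPEECHES = {"1AC": 6, "1AR": 4, "2AR": 3}   # minutes
NEG_SPEECHES = {"NC": 7, "NR": 6}               # minutes

aff_total = sum(AFF_SPEECHES.values())
neg_total = sum(NEG_SPEECHES.values())

# Minutes of negative speech the affirmative must answer per minute available.
burden_1ar = NEG_SPEECHES["NC"] / AFF_SPEECHES["1AR"]
burden_2ar = NEG_SPEECHES["NR"] / AFF_SPEECHES["2AR"]

print(aff_total, neg_total)      # 13 13
print(burden_1ar, burden_2ar)    # 1.75 2.0
```

Total speaking time is equal at 13 minutes per side, so any structural disadvantage comes from how the time is divided: the affirmative must answer 1.75 minutes of negative speech per minute of the 1AR and 2 minutes per minute of the 2AR.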
One potential remedy would be to allow each debater to give the same number of
speeches with equal speech times. Another remedy would be to provide an even number of
preliminary debate rounds to ensure that each debater represents the affirmative and negative
sides an equal number of times. With seven rounds, debaters who represent the negative in four
25
The first of these explanations is more popular among debaters. An analysis of the time differences in other
formats of debate (such as the National Debate Tournament style) where side bias is not known to exist, however,
indicates that side equality can survive time disparities in middle speeches (e.g. the 15-minute Neg Block vs. the 6-
minute 1AR in NDT format or the 13-minute Neg Block vs. the 5-minute 1AR in CX). In each of these formats,
however, the 2NR and 2AR (functional equivalents of the NR and 2AR in LD) are of equal time. NDT and CX
debaters frequently consider the 2NR the most difficult speech. Interestingly, every form of competitive debate of
which the authors are aware has the same speaker (or team) speak first and last, leaving the team that speaks second
with larger blocks of time. This may be intended to counterbalance the advantage gained from speaking first (and
thus setting the terms of the debate) and last (and thus having the only opportunity to “move” after all of the other
relevant “moves” have taken place).
26
Obviously, the affirmative need not expend any prep time prior to the 1AC. The NC, however, often contains
substantial pre-scripted elements that do not vary substantially based on the affirmative’s advocacy. Thus, the NC is
easier to prep than the 1AR.
rounds have an advantage over those who represent the negative in only three rounds. Of course,
this would do nothing to solve for potential side bias in elimination rounds.27 A final potential
remedy may be the most effective, but is also the most difficult to implement. If debate judges
were to substantially alter their accepted norms to be more affirmative-friendly, then competitive
equity could be achieved without the need to wait for policy change. For example, a norm for
evaluating arguments that gives the affirmative substantial leeway regarding what constitutes
addressing an argument, or that makes the affirmative’s interpretation of how the round should be
evaluated the one a judge accepts by default (subject, of course, to the negative’s ability to show
that this interpretation is unreasonable) may give the affirmative a competitive advantage to
offset the structural disadvantage imposed by the time limits.
Second, we find significant evidence of region bias. Specifically, Texas and California
judges prefer negative debaters from their regions.28 A simple remedy for region bias might be to
require that rounds be judged by someone unaffiliated with either of the debaters’ regions.29
This is done successfully at the National Forensic League National Tournament, which defines
regions at the state level. Our analysis suggests that a wider aggregation may be appropriate.
27
We do not study side bias in elimination rounds. Anecdotal evidence indicates that side bias is actually worse in
elimination rounds than it is in preliminary rounds.
28
While these results may mistakenly be understood to indicate that only Texas and California judges express
regional bias, it is important to remember that Texas and California send large numbers of competitors to the TOC
and, unlike the Northeast, are net suppliers of judging. This means that (1) Texas and California had the most
opportunities to display this bias and (2) despite evidence of this bias, debaters outside Texas and California tend to
prefer Texas and California judges. Further, though statistical significance could not always be inferred, the
marginal effect on affirmative wins of the negative debater and judge being from the same region was either
negative, zero, or non-estimable for all regions.
29
Currently, mutual judge preference/strikes and conflicts of interest are the sole constraints on judges evaluating
debaters. This would be an additional constraint; it cannot be a viable replacement for mutual preference or strikes.

Alternatively, since the most significant evidence of bias appears when the negative debater and
judge are from the same region, the tournament could merely prohibit judges from evaluating
negative debaters from their region, while permitting judges to evaluate debaters from their own
regions on the affirmative. This bias may be particularly pernicious at non-national tournaments,
such as the TOC-qualifying tournaments, where the region in which the tournament is located is
likely to have significantly more judges and the cost of discriminating against debaters or teams
one may not see again is low.
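The two region-based remedies discussed in this paragraph can be stated as a simple eligibility check applied when judges are assigned. The Python sketch below is illustrative only: the `Pairing` type, the `eligible` function, and the `strict` flag are hypothetical names introduced here, not part of any tournament's tabulation software, and the region codes are placeholders for whatever aggregation a tournament adopts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    """One debate round, described only by the debaters' regions (hypothetical type)."""
    aff_region: str
    neg_region: str


def eligible(judge_region: str, pairing: Pairing, strict: bool = False) -> bool:
    """Return True if a judge from judge_region may hear this pairing.

    strict=True  : the judge must be unaffiliated with both debaters' regions.
    strict=False : the judge is barred only when sharing the negative's region.
    """
    if judge_region == pairing.neg_region:
        return False
    if strict and judge_region == pairing.aff_region:
        return False
    return True


example = Pairing(aff_region="TX", neg_region="CA")
print(eligible("CA", example))               # judge shares the negative's region
print(eligible("TX", example))               # judge shares only the affirmative's region
print(eligible("TX", example, strict=True))  # stricter rule bars this judge as well
```

Such a check would sit alongside mutual preference, strikes, and conflict-of-interest rules rather than replace them.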
Finally, we find moderately significant evidence that male judges tend to vote against
female affirmatives. This is troubling given that females judge only 13% of rounds in our
sample. Clearly, more needs to be done to promote equal participation among debaters and
judges. Some empirical examination of the judge preference forms filled out by debaters under
different rules would be useful, as certain systems of judge preference may systematically
exclude female judges (or judges of certain regions). Furthermore, the TOC might benefit from
actively recruiting qualified female judges to avoid a male-dominated judging pool that may
produce decisions biased against female debaters. Reversal of its policy of refusing to hire judges
would enable the TOC to ensure that it obtains qualified female judges from underrepresented
regions, simultaneously helping resolve problems with sex bias and region bias.
REFERENCES
Allen, M., Trejo, M., Bartanen, M., Schroeder, A., and Ulrich, T. (2004). Diversity in United
States forensics: A report on research conducted for the American Forensic Association.
Argumentation and Advocacy, 40(3), 173-184.
Arrow, K. J. (1973). The Theory of Discrimination. In O. Ashenfelter and A. Rees (Eds.),
Discrimination in Labor Markets (pp. 3-33). Princeton, NJ: Princeton University Press.
Ayres, I., and Siegelman, P. (1995). Race and gender discrimination in bargaining for a new car,
American Economic Review, 85(3), 304-321.
Bartanen, K. (1995). Developing student voices in academic debate through a feminist
perspective of learning, knowing, and arguing, Contemporary Argumentation and
Debate, 16, 1-13.
Basinger, J. (1996). Response to Beattie: a female participant’s view of Cross Examination
Debate Association debate, Speech and Theatre Association of Missouri Journal, 27,
106-109.
Beattie, S. (1996). Because we don’t like it: a feminist non-participant’s view of the Cross
Examination Debate Association debate, Speech and Theatre Association of Missouri
Journal, 27, 96-105.
Becker, G.S. (1957). The Economics of Discrimination, Chicago, IL: University of Chicago
Press.
Bertrand, M. & Mullainathan, S. (2002). Are Emily and Brendan more employable than Latoya
and Tyrone? Evidence on racial discrimination in the labor market from a large
randomized experiment. Unpublished manuscript, University of Chicago Graduate
School of Business, Chicago, IL.

Bile, J. (1999). Toward the transformation of the dominant social paradigm of argument:
Reflections of solving the "women and minority" problem in tournament debate.
International Journal of Forensics, 2, 116-127.
Blau, F.D., and Kahn, L.M. (2000). Gender differences in pay, Journal of Economic Perspectives,
14(4), 75–99.
Bornstein, B.H., and Rajki, M. (1994). Extra-Legal factors and product liability: the influence of
mock jurors' demographic characteristics and intuitions about the cause of an injury,
Behavioral Science and Law, 12(2), 127-147.
Bridgeman, D. L., and Marlowe, D. (1979). Jury decision making: an empirical study based on
actual felony trials. Journal of Applied Psychology, 64, 91-98.
Bruschke, J. and Johnson, A. (1994). An analysis of differences in success rates of male and
female debaters. Argumentation and Advocacy, 30, 162-173.
Coate, S., and Loury, G.C. (1993). Will affirmative-action policies eliminate negative
stereotypes? The American Economic Review, 83(5), pp. 1220-1240.
Cole, N. (1957). Trials and tribulations of a woman debater. Gavel, 39(3), 69-70.
Crenshaw, C. (1996). Dominant forms and marginalized voices: argumentation about
feminism(s), CEDA Yearbook, 14, 72-79.
Cross, F. (2003). Decisionmaking in the U.S. Circuit Courts of Appeals, California Law Review,
91, 1457-1515.
Cross, F. & Tiller, E. (1998). Judicial partisanship and obedience to legal doctrine:
whistleblowing on the federal courts of appeals. Yale Law Journal, 107, 2155-2176.
Denove, C.F., and Imwinkelried, E.J. (1995). Jury selection: an empirical investigation of
demographic bias, American Journal of Trial Advocacy, 19, 285-336.

Decker, W. D. and Morello, J.T. (1984). Some educational difficulties associated with mutual
preference debate judging systems, Journal of the American Forensic Association, 20(3),
154-161.
Edwards, R. (2008). Competitive debate: the official guide. Penguin Group: New York.
Fine, G. (2001). Gifted Tongues: high school debate and adolescent culture. Princeton
University Press: Princeton, NJ.
Fitzgerald, R., and Ellsworth, P.C. (1984). Due process vs. crime control: death qualification and
jury attitudes, Law and Human Behavior, 8, 31-51.
Friedley, S.A., and Manchester, B.B. (1985). An analysis of male/female participation at select
national championships, National Forensic Journal 3(1), 1-12.
Goodman, J., et al. (1990). Matters of money: voir dire in civil cases, Forensic Reports, 3, 303-
329.
Green, E. (1968). The reasonable man: legal fiction or psychosocial reality?, Law and Society
Review, 2, pp. 241-257.
Helgeson, V.S., and Shaver, K.G. (1990). Presumption of innocence: congruence bias induced
and overcome, Journal of Applied Social Psychology, 20(4), 276-302.
Hensley, W.E. and Strother, D. B. (1968). Success in debate, Speech Teacher 17, 235-237.
Henson, C. (2010). Judging CERCLA: an empirical analysis of circuit court decision-making,
Journal of Applied Economy, 4, 69-93.
Kahn, L.M. (1991). Discrimination in professional sports: a survey of the literature. Industrial
and Labor Relations Review. 44(3), 395-418.
Kay, J., and Aden, R. (1984). The relationship of judging panel composition to scoring at the
1984 NFA nationals. National Forensic Journal, 2(2), 85-97.

Knee, R. C. (1939). What happens to a woman debater? Speaker, 23(3), 4.
Kozinski, A. (1993). What I ate for breakfast and other mysteries of judicial decision making,
Loyola of Los Angeles Law Review, 26, 993-999.
Levitt, S.D. (2004). Testing theories of discrimination: evidence from Weakest Link. Journal of
Law and Economics, 47(2), 431-452.
Lichtman, A.J. and Rohrer, D.M. (1982). Policy dispute and paradigm evaluation: a response to
Rowland, Journal of the American Forensic Association, 18(3), 145-150.
Martin, S.P. (June 1988). An analysis of the participation of women in competitive debate,
Unpublished sociology honors thesis, Dartmouth College, Hanover, NH.
McBath, J. H. (Ed.). (1975) Forensics as communication. Skokie, IL: National Textbook Co.
Mills, C.J., and Bohannon, W.E. (1980). Juror characteristics: to what extent are they related to
jury verdicts?, Judicature, 64, 22-31.
Neumark, D., Bank, R. and Van Nort, K. (1996). Sex discrimination in restaurant hiring: An
audit study. The Quarterly Journal of Economics, 111(3), 915-941.
Posner, R. (1993). What do judges and justices maximize? (The same thing everybody else
does), Supreme Court Economic Review, 3, 1-41.
Raich, P., and Rich, J. (1995). An investigation of gender discrimination in labor hiring, Eastern
Economic Journal, 21(3), 343-356.
Rosen, N., Dean, L., and Willis, F. (1978). The outcome of debate in relation to gender, side, and
position, Journal of the American Forensic Association, 15(1), 17-21.
Rowland, R.C. (1982). Standards for paradigm evaluation, Journal of the American Forensic
Association, 18(3), 133-140.
Segal, J. & Spaeth, H. (2002). The Supreme Court and the Attitudinal Model Revisited.
Cambridge University Press: Cambridge.
Sisk, G., Heise, M., & Morriss, A. (1998). Charting the influences on the judicial mind: an
empirical study of judicial reasoning. New York University Law Review, 73, 1377-1500.
Snider, A.C. (1984). Games without frontiers: a design for communication scholars and forensic
educators, Journal of the American Forensic Association, 20(3), 162-170.
Stepp, P. (1997). Can we make intercollegiate debate more diverse? Argumentation and
Advocacy, 33(4), 176-191.
Stepp, P., and Gardner, B. (2001). Ten years of demographics: who debates in America,
Argumentation and Advocacy, 38(2), 69-82.
Thompson, W.C., Cowan, C., Ellsworth, P., and Harrington, J., (1984). Death penalty attitudes
and conviction proneness: the translation of attitudes into verdicts, Law and Human
Behavior, 8, 95-113.
Ulrich, W. (1982). Flexibility in paradigm evaluation, Journal of the American Forensic
Association, 18(3), 151-153.
Zarefsky, D. (1982). The perils of assessing paradigms, Journal of the American Forensic
Association, 18(3), 141-144.
Zompetti, J. and Driscoll, W. (2002). Discovering the world through debate: a practical guide to
educational debate for debaters, coaches, and judges. Central European University Press:
Budapest.
APPENDIX
Table A1: Summary Statistics

Continuous
Variable Obs Mean SD Min Max
skillprox 1116 -0.05 2.22 -11.38 10.33
femskillprox 1116 0.02 0.71 -8.07 6.26
Binary
Variable N Mean
WinAff 493 0.44
AF 297 0.27
NF 292 0.26
JF 151 0.13
JF*AF 46 0.04
JF*NF 34 0.03
AF*NF 71 0.06
JF*AF*NF 12 0.01
Table A1 continued.
Variable N Mean Variable N Mean Variable N Mean
A1 225 0.20 N1 226 0.20 J1 280 0.25
A2 183 0.16 N2 184 0.16 J2 233 0.21
A3 54 0.05 N3 51 0.05 J3 37 0.03
A4 191 0.17 N4 197 0.18 J4 212 0.19
A5 33 0.03 N5 34 0.03 J5 43 0.04
A6 290 0.26 N6 283 0.25 J6 184 0.16
A7 143 0.13 N7 144 0.13 J7 130 0.12
J1*A1 62 0.06 J1*N1 61 0.05 A1*N1 49 0.04
J1*A2 34 0.03 J1*N2 36 0.03 A1*N2 33 0.03
J1*A3 14 0.01 J1*N3 15 0.01 A1*N3 12 0.01
J1*A4 42 0.04 J1*N4 51 0.05 A1*N4 36 0.03
J1*A5 13 0.01 J1*N5 9 0.01 A1*N5 5 0.00
J1*A6 77 0.07 J1*N6 78 0.07 A1*N6 63 0.06
J1*A7 38 0.03 J1*N7 30 0.03 A1*N7 27 0.02
J2*A1 51 0.05 J2*N1 48 0.04 A2*N1 39 0.03
J2*A2 43 0.04 J2*N2 40 0.04 A2*N2 27 0.02
J2*A3 14 0.01 J2*N3 11 0.01 A2*N3 8 0.01
J2*A4 47 0.04 J2*N4 33 0.03 A2*N4 30 0.03
J2*A5 6 0.01 J2*N5 10 0.01 A2*N5 8 0.01
J2*A6 50 0.04 J2*N6 59 0.05 A2*N6 49 0.04
J2*A7 22 0.02 J2*N7 32 0.03 A2*N7 22 0.02
J3*A1 5 0.00 J3*N1 7 0.01 A3*N1 12 0.01
J3*A2 9 0.01 J3*N2 2 0.00 A3*N2 11 0.01
J3*A3 4 0.00 J3*N3 1 0.00 A3*N3 1 0.00
J3*A4 6 0.01 J3*N4 8 0.01 A3*N4 6 0.01
J3*A5 (empty) J3*N5 2 0.00 A3*N5 (empty)
J3*A6 6 0.01 J3*N6 11 0.01 A3*N6 17 0.02
J3*A7 7 0.01 J3*N7 6 0.01 A3*N7 7 0.01
J4*A1 35 0.03 J4*N1 41 0.04 A4*N1 42 0.04
J4*A2 38 0.03 J4*N2 43 0.04 A4*N2 31 0.03
J4*A3 7 0.01 J4*N3 11 0.01 A4*N3 7 0.01
J4*A4 36 0.03 J4*N4 34 0.03 A4*N4 36 0.03
J4*A5 2 0.00 J4*N5 5 0.00 A4*N5 10 0.01
J4*A6 61 0.05 J4*N6 51 0.05 A4*N6 36 0.03
J4*A7 33 0.03 J4*N7 27 0.02 A4*N7 29 0.03
J5*A1 12 0.01 J5*N1 11 0.01 A5*N1 8 0.01
J5*A2 6 0.01 J5*N2 9 0.01 A5*N2 5 0.00
J5*A3 (empty) J5*N3 2 0.00 A5*N3 3 0.00
J5*A4 8 0.01 J5*N4 6 0.01 A5*N4 10 0.01
J5*A5 1 0.00 J5*N5 2 0.00 A5*N5 1 0.00
J5*A6 11 0.01 J5*N6 10 0.01 A5*N6 4 0.00
J5*A7 5 0.00 J5*N7 3 0.00 A5*N7 2 0.00
J6*A1 35 0.03 J6*N1 35 0.03 A6*N1 54 0.05
J6*A2 31 0.03 J6*N2 35 0.03 A6*N2 52 0.05
J6*A3 9 0.01 J6*N3 5 0.00 A6*N3 13 0.01
J6*A4 32 0.03 J6*N4 38 0.03 A6*N4 51 0.05
J6*A5 9 0.01 J6*N5 5 0.00 A6*N5 5 0.00
J6*A6 50 0.04 J6*N6 39 0.03 A6*N6 78 0.07
J6*A7 18 0.02 J6*N7 27 0.02 A6*N7 37 0.03
J7*A1 25 0.02 J7*N1 23 0.02 A7*N1 22 0.02
J7*A2 22 0.02 J7*N2 19 0.02 A7*N2 25 0.02
J7*A3 6 0.01 J7*N3 6 0.01 A7*N3 7 0.01
J7*A4 20 0.02 J7*N4 27 0.02 A7*N4 28 0.03
J7*A5 2 0.00 J7*N5 1 0.00 A7*N5 5 0.00
J7*A6 35 0.03 J7*N6 35 0.03 A7*N6 36 0.03
J7*A7 20 0.02 J7*N7 19 0.02 A7*N7 20 0.02
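The "Mean" column for the binary variables in Table A1 appears to be each count divided by the 1,116 observations reported for skillprox. That reading is an inference from the table, not something the table states, so the short Python check below treats it as an assumption; the `share` helper is a hypothetical name.

```python
OBS = 1116  # assumed denominator: observations reported for skillprox in Table A1

# Counts and reported means copied from Table A1 (a subset of rows).
counts = {"WinAff": 493, "AF": 297, "NF": 292, "JF*AF": 46, "AF*NF": 71}
reported = {"WinAff": 0.44, "AF": 0.27, "NF": 0.26, "JF*AF": 0.04, "AF*NF": 0.06}


def share(n: int, obs: int = OBS) -> float:
    """Fraction of observations for which the indicator equals one, to two decimals."""
    return round(n / obs, 2)


for name, n in counts.items():
    print(name, share(n), reported[name])
```

Under this assumption, an interaction such as JF*AF is simply the share of rounds in which both indicators equal one.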
Table A2: Regional Aggregation of States
Region 1 (note 30) Oklahoma, Texas
Region 2 California
Region 3 (note 31) Arizona, Colorado, New Mexico, Nevada, Utah
Region 4 Illinois, Indiana, Iowa, Michigan, Minnesota, Missouri, Nebraska, Ohio, Wisconsin
Region 5 Idaho, Montana, North Dakota, Oregon, South Dakota, Washington, Wyoming
Region 6 Connecticut, Delaware, Maine, Maryland, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont
Region 7 Alabama, Arkansas, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, South Carolina, Tennessee, Virginia, West Virginia
30
Despite the strong local debate culture in Texas, Texas and Oklahoma were grouped together because there were
few competitors or judges from Oklahoma, and those who were affiliated with Oklahoma also frequently competed
and/or judged in Texas. There is no TOC-qualifying tournament located in Oklahoma, and Texas tournaments are
closest for those teams.
31
In regions 3-7, it is not the case that every state sent a competitor to the TOC. We have included the entire
contiguous United States in order to facilitate efforts to replicate results with different datasets. In these instances,
the authors are unaware of the “debate cultures” of the states sending no competitors nationally, and group them
based entirely on geography. These groupings should not be read to imply that debate in Georgia (a state well
known for its successful policy debate) and West Virginia (which is responsible for no observations in our sample),
for example, are at all similar.
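For replication with other datasets, the aggregation in Table A2 can be encoded as a lookup from state to region. The Python sketch below is one possible encoding, not the authors' code; `REGIONS`, `build_lookup`, `STATE_TO_REGION`, and `DUPLICATES` are hypothetical names. Maryland is listed under both Region 6 and Region 7 in the table as printed; the sketch keeps the first listing and reports the duplicate rather than resolving it.

```python
# Region membership copied from Table A2 as printed.
REGIONS = {
    1: ["Oklahoma", "Texas"],
    2: ["California"],
    3: ["Arizona", "Colorado", "New Mexico", "Nevada", "Utah"],
    4: ["Illinois", "Indiana", "Iowa", "Michigan", "Minnesota", "Missouri",
        "Nebraska", "Ohio", "Wisconsin"],
    5: ["Idaho", "Montana", "North Dakota", "Oregon", "South Dakota",
        "Washington", "Wyoming"],
    6: ["Connecticut", "Delaware", "Maine", "Maryland", "Massachusetts",
        "New Hampshire", "New Jersey", "New York", "Pennsylvania",
        "Rhode Island", "Vermont"],
    7: ["Alabama", "Arkansas", "Florida", "Georgia", "Kentucky", "Louisiana",
        "Maryland", "Mississippi", "North Carolina", "South Carolina",
        "Tennessee", "Virginia", "West Virginia"],
}


def build_lookup(regions):
    """Map each state to the first region that lists it; collect repeated states."""
    lookup, duplicates = {}, []
    for region, states in regions.items():
        for state in states:
            if state in lookup:
                duplicates.append(state)  # keep the first listing, flag the repeat
            else:
                lookup[state] = region
    return lookup, duplicates


STATE_TO_REGION, DUPLICATES = build_lookup(REGIONS)
print(STATE_TO_REGION["Texas"], STATE_TO_REGION["Georgia"])
print("Listed in more than one region:", DUPLICATES)
```

Anyone reusing the mapping would need to decide where Maryland belongs before coding region indicators.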
Table A3: Regression Results
Variable Coefficient Standard Error
Negative -0.964 0.443
Skillprox 0.146 0.035
Femskillprox 0.049 0.108
AF -0.375 0.202
NF 0.219 0.195
JF 0.473 0.272
JF*AF -0.067 0.513
JF*NF 0.195 0.604
AF*NF 0.603 0.391
JF*AF*NF -0.935 1.016
Affirmative Region
TX Constant
CA -0.52 0.66
SW -0.19 0.92
MW -0.66 0.61
NW -0.57 1.23
NE -0.45 0.55
SE 0.30 0.69
Negative Region
TX Constant
CA 0.91 0.62
SW 1.31 0.99
MW 0.90 0.59
NW -1.26 1.67
NE 1.26 0.52
SE 1.16 0.73
Table A3 continued.
Texas Judge Neg Region Texas Judge Aff Region
Texas Aff TX Constant Texas Neg TX Constant
CA 0.91 0.62 CA -0.52 0.66
SW 1.31 0.99 SW -0.19 0.92
MW 0.90 0.59 MW -0.66 0.61
NW -1.26 1.67 NW -0.57 1.23
NE 1.26 0.52 NE -0.45 0.55
SE 1.16 0.73 SE 0.30 0.69
California Judge Neg Region California Judge Aff Region
CA -1.34 0.68 CA 0.84 0.67
SW -0.78 1.05 SW -0.51 0.97
MW -0.71 0.66 MW 0.20 0.63
NW -0.44 2.00 NW 0.07 1.37
NE -0.75 0.58 NE -0.03 0.58
SE -0.34 0.73 SE 0.20 0.71
Southwest Judge Neg Region Southwest Judge Aff Region
CA 23.81 68488.23 CA -1.43 1.70
SW -25.32 100003.80 SW -2.15 2.07
MW -2.00 1.39 MW 0.77 1.95
NW -2.23 2.24 NW (empty)
NE 0.05 1.41 NE -1.70 1.96
SE -2.43 1.91 SE -2.65 1.80
Midwest Judge Neg Region Midwest Judge Aff Region
CA -0.63 0.68 CA 1.15 0.70
SW -0.05 1.04 SW 2.09 1.25
MW -1.26 0.69 MW 0.81 0.68
NW 0.92 2.00 NW -21.48 76517.24
NE -1.26 0.61 NE 0.36 0.60
SE -0.99 0.77 SE -0.65 0.72
Northwest Judge Neg Region Northwest Judge Aff Region
CA -0.08 1.26 CA 1.80 1.41
SW -23.13 71424.33 SW (empty)
MW 0.66 1.49 MW 1.13 1.21
NW -0.23 3.43 NW 23.86 104073.10
NE 0.75 1.20 NE -2.01 1.35
SE -0.89 1.64 SE 0.05 1.30
Northeast Judge Neg Region Northeast Judge Aff Region
CA 0.18 0.72 CA 0.07 0.76
SW -0.62 1.29 SW -0.06 1.08
MW 0.33 0.68 MW 0.50 0.71
NW 3.05 1.81 NW 0.67 1.30
NE -1.09 0.66 NE 0.66 0.61
SE -0.28 0.78 SE -0.23 0.82
Southeast Judge Neg Region Southeast Judge Aff Region
CA -0.57 0.86 CA -0.19 0.83
SW -23.48 46281.02 SW 1.16 1.27
MW 0.32 0.77 MW -0.22 0.83
NW 21.95 110290.20 NW -45.56 87171.84
NE -0.41 0.71 NE -0.02 0.70
SE 0.29 0.88 SE -0.25 0.80
Table A3 continued.
Texas Judge
Aff Region Neg Region Coefficient Standard Error
CA CA -0.33 0.74
CA SW -0.80 1.22
CA MW 0.66 0.73
CA NW 3.69 2.01
CA NE -0.34 0.64
CA SE -0.39 0.80
SW CA -1.58 1.20
SW SW -2.32 116863.10
SW MW -0.46 1.23
SW NW (empty)
SW NE -0.26 0.94
SW SE 0.45 1.24
MW CA 0.49 0.73
MW SW -0.65 1.43
MW MW 0.16 0.70
MW NW 0.61 2.04
MW NE 0.74 0.66
MW SE 0.06 0.77
NW CA 0.17 1.66
NW SW 0.37 1.75
NW MW -0.63 1.18
NW NW -20.04 108897.40
NW NE -0.43 1.77
NW SE 22.85 62094.02
NE CA -0.23 0.66
NE SW -0.45 1.07
NE MW -0.79 0.66
NE NW 2.64 2.28
NE NE 0.26 0.57
NE SE 0.33 0.73
SE CA 0.02 0.80
SE SW -1.35 1.31
SE MW -0.23 0.81
SE NW 0.49 1.84
SE NE -1.26 0.73
SE SE -0.93 0.91
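As a reading aid for Table A3, a logit coefficient and its standard error can be turned into a Wald z-statistic, a two-sided p-value under a normal approximation, and an odds ratio. The Python sketch below does this for a few rows copied from the table. The `wald` helper is a hypothetical name, and the normal approximation is an assumption of this sketch, so these figures need not match the marginal-effect p-values reported in Table A4.

```python
import math


def wald(coef: float, se: float):
    """Return (z, two-sided p-value) using a standard normal approximation."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# (coefficient, standard error) pairs copied from Table A3.
rows = {
    "Negative": (-0.964, 0.443),
    "Skillprox": (0.146, 0.035),
    "AF": (-0.375, 0.202),
    "Southwest judge x CA negative": (23.81, 68488.23),
}

for name, (b, se) in rows.items():
    z, p = wald(b, se)
    print(f"{name}: z = {z:.2f}, p = {p:.3f}, odds ratio = {math.exp(b):.3g}")
```

Rows such as the last one, with coefficients above 20 in absolute value and standard errors in the tens of thousands, carry essentially no information. This pattern is commonly a symptom of complete or quasi-complete separation in sparsely populated cells, which would be consistent with the "Not Estimable" entries in Table A4.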
Table A4. Marginal Effects
Variable Margin p-value
Neg -0.22 0.01
Skillprox 0.04 0.00
Femskillprox 0.01 0.65
AF -0.09 0.057
NF 0.05 0.256
JF 0.12 0.072
JF*AF -0.14 0.279
JF*NF 0.06 0.8
AF*NF 0.07 0.759
JF*AF*NF 0.65 0.099
Affirmative Region Negative Region
TX -0.16 1 TX -0.79 0.997
CA -0.27 0.999 CA 0.35 1
SW Not Estimable SW -4.83 0.999
MW -0.21 1 MW -0.37 0.999
NW Not Estimable NW Not Estimable
NE -0.58 0.999 NE -0.17 0.999
SE -0.61 0.999 SE 0.59 1
Judge Region
TX -0.4552561 0.998
CA -0.1770847 0.999
SW Not Estimable
MW -1.029753 1
NW Not Estimable
NE -0.0988201 1
SE -1.804766 1
Table A4 continued.
Texas Judge Neg Region Texas Judge Aff Region
TX -1.22 0 TX -0.12 0.665
CA -0.41 0.271 CA -0.64 0.117
SW -0.54 1 SW Not Estimable
MW -0.46 0.133 MW -0.49 0.161
NW Not Estimable NW 1.40 1
NE -0.01 0.976 NE -0.58 0.031
SE 0.55 1 SE -0.34 0.326
California Judge Neg Region California Judge Aff Region
TX -0.31 0.308 TX -0.01 0.969
CA -0.84 0.021 CA 0.31 0.36
MW -0.27 0.495 MW -0.18 0.58
NE 0.15 0.603 NE -0.50 0.119
SE 1.11 1 SE -0.03 0.951
Southwest Judge Neg Region Southwest Judge Aff Region
TX Not Estimable TX 4.56 1
CA Not Estimable CA 2.61 1
SW Not Estimable SW Not Estimable
MW Not Estimable MW 4.96 1
NW Not Estimable NW Not Estimable
NE Not Estimable NE 2.40 1
SE Not Estimable SE 1.68 1
Midwest Judge Neg Region Midwest Judge Aff Region
TX -1.05 1 TX -0.50 0.164
CA -0.87 1 CA 0.13 0.708
MW -1.55 0.999 MW -0.06 0.867
NW Not Estimable NW -20.45 1
NE -1.09 1 NE -0.59 0.042
SE -0.28 1 SE -1.37 0.002
Northwest Judge Neg Region Northwest Judge Aff Region
TX Not Estimable TX -1.08 1
CA Not Estimable CA 0.20 1
SW Not Estimable SW Not Estimable
MW Not Estimable MW -0.33 1
NE Not Estimable NE -3.55 0.999
SE Not Estimable SE -1.26 1
Northeast Judge Neg Region Northeast Judge Aff Region
TX -0.70 0.068 TX -0.02 0.956
CA 0.29 0.424 CA -0.47 0.263
MW 0.38 0.283 MW 0.11 0.777
NE -0.58 0.107 NE 0.18 0.557
SE 0.78 1 SE -0.47 0.419
Southeast Judge Neg Region Southeast Judge Aff Region
TX -2.06 0.999 TX -0.07 1
CA -1.82 0.999 CA -0.79 1
MW -0.98 1 MW -0.66 1
NW Not Estimable NW -44.11 1
NE -1.26 1 NE -0.55 1
SE 0.00 1 SE -0.54 1
Table A4 continued.
Texas Judge
Aff Region Neg Region Margin p-value
TX TX -0.5039513 0.112
TX CA 0.73 1
TX SW -3.92 1
TX MW 0.06 0.868
TX NW 1.30 1
TX NE 0.16 0.561
TX SE 0.27 0.538
CA TX -0.62 0.088
CA CA 0.29 1
CA SW -4.83 0.999
CA MW 0.60 0.14
CA NW 4.86 1
CA NE -0.29 0.357
CA SE -0.24 0.623
SW TX Not Estimable
SW CA Not Estimable
SW SW Not Estimable
SW MW Not Estimable
SW NW Not Estimable
SW NE Not Estimable
SW SE Not Estimable
MW TX -0.85 0.021
MW CA 0.88 1
MW SW -4.91 0.999
MW MW -0.12 0.725
MW NW 1.56 1
MW NE 0.56 0.116
MW SE -0.02 0.96
NW TX Not Estimable
NW CA Not Estimable
NW SW Not Estimable
NW MW Not Estimable
NW NW Not Estimable
NW NE Not Estimable
NW SE Not Estimable
NE TX -0.91 0.004
NE CA 0.10 1
NE SW -4.77 0.999
NE MW -1.15 0.001
NE NW 3.53 1
NE NE 0.01 0.952
NE SE 0.19 0.616
SE TX -0.43 0.361
SE CA 0.82 1
SE SW -5.19 0.999
SE MW -0.10 0.817
SE NW 1.85 1
SE NE -1.03 0.01
SE SE -0.59 0.271
