
Predicting Major League Baseball

Salaries through Offensive Statistics

Devin Ensing
ECON 375
Professor Melick
December 7, 2010
Ensing 2

I. Introduction

Until the 1970s, baseball was believed to be a sport best understood through

observation. In 1977, statistician Bill James challenged this belief, theorizing that

baseball is best understood through numbers. He released annual books detailing his

exploration into the world of baseball through statistics, and quickly gained a large

following. However, major league baseball was still filled with old baseball men who

believed in scouting over numbers, ratings over statistics. This finally changed in the

1990s, when Sandy Alderson took over as general manager of the Oakland Athletics and

hired Billy Beane as a scout. Beane soon became the GM of Oakland, and changed how

baseball teams were built by relying on statistics instead of scouting reports. The book

Moneyball followed Beane and the A’s during 2002, and revolutionized the game of

baseball by changing the way it was viewed by both fans and those directly involved in

the game. This paper tests whether the market for baseball players is

significantly different after Moneyball than before the book was released.

II. Review of the Literature

Extensive statistical analysis with baseball salaries could not be reasonably

researched until the 1970s. Until 1975, major league baseball teams owned their players

through the reserve clause, which essentially bound players to their team for life.

MacDonald and Reynolds (1994) noted that the reserve clause kept salaries below what

they would be in a competitive market. Scully (1974) wrote that there was a high level of

monopsonistic exploitation in baseball at the time, finding that an average player in

baseball received about 20 percent of his net marginal revenue product over his career.

He concluded that the exploitation was of “considerable magnitude”. In 1975, the reserve

clause was removed, and free agency was available to players. After the removal of the

reserve clause, players began to earn more and more money, with MacDonald and

Reynolds finding average salaries climbing from $29,000 in 1970 to $150,000 in 1980.

Vrooman (1996) shows that by 1985-1987, roughly 80 percent of players that were

eligible for free agency were overpaid because of their “artificial monopoly power”

(347). He points out monopolistic inefficiencies in the free agent market, and concludes

by arguing that as of 1987, the labor market in baseball still involved “lower-tier

monopsonistic exploitation and upper-tier monopolistic inefficiency” (358). That is,

rookies and players with very few years of experience are not paid what they are worth,

while players who have already hit free agency and have earned large contracts are

overpaid.

One of the problems for both players and general managers was that nobody

really knew how much a player was “worth”. Many articles tried to estimate the best

indicator of an offensive player’s performance, but there were contrasting results. In

1974, Scully argued that slugging percentage is the best indicator of the ability of hitters,

as it showed the highest correlation with hitting ability. But by 1994, MacDonald and

Reynolds argued that a player’s value is based on his contribution to team winning

percentage, as team winning percentage was significantly correlated with team revenue,

so owners will want players that most increase the team’s revenue. When they ran their

regressions, they found that “mean runs scored arguably is the best indicator of an

offensive player’s production” (447), as opposed to Scully’s earlier claim that slugging

percentage was the best indicator of worth.



The labor market for baseball players showed little change from 1986 up to the

early 2000s, when the general manager of the Oakland Athletics, Billy Beane, began to

exploit the inefficiencies. Lewis (2003) followed the Athletics in the 2002 season in their

pursuit of winning a championship. Moneyball showed how the small-market Athletics

could compete with large budget teams such as the Boston Red Sox and New York

Yankees. The central premise of Beane’s theory to winning was to exploit inefficiencies

in the labor market for baseball players. Hakes and Sauer (2006) note the “valuation of

skills in the market for baseball players was grossly inefficient” (173). Certain offensive

statistics were overvalued, such as batting average and runs batted in, and some were

undervalued, such as on-base-percentage (OBP) and slugging percentage. Hakes and

Sauer showed that the “ability to get on base was undervalued” (175). Beane’s critical

principle was that players who were most valuable to their team were those with the

highest on-base percentages, and those players were grossly underpaid. Beane believed

that OBP was the most important offensive statistic because outs are the “currency” for a

baseball game, so players that get on base more should be worth more. But in 2002,

baseball valued players who could hit massive home runs or steal an excess of bases

much more than they valued the player who could get on base any way possible, be it by

a hit, a walk, or a hit-by-pitch.

Beane concluded that a team of players with high OBPs would be both very cheap

and compete very well. Lewis stated that the overall goal of the front office was to build a

team with the minimum payroll required to successfully contend for a playoff spot. As

Hakes and Sauer show, the A’s executed this strategy so well that they were able to

substitute new, cheaper players in for individual superstars, such as Jason Giambi, and

still maintain team success, in some cases becoming an even more successful team. The

A’s made the playoffs four straight years with some of the lowest payrolls in baseball,

and the improved performance on the field increased attendance, which in turn increased

the revenue of the team. Hakes and Sauer concluded that the “Oakland strategy for

winning games was a successful exploitation of a profit opportunity” (183).

After Lewis published Moneyball in 2003, people around baseball vigorously

denied the A’s claims, but Hakes and Sauer argue that “market adjustments were in

motion” (174) and that the labor market no longer underpaid the ability to get on base, all

within a year of the book’s publication. They state that the market adjustments from

Moneyball caused the labor market to move further away from being a monopsonistic

market and more closely resemble a perfectly competitive market. Scully agrees, and

concludes that the labor market is not perfectly competitive. As time has passed, the

market has been moving closer to perfect competition as information is

accumulated.

III. Economic Theory

We assume that in the labor market for baseball players, offensive statistics

determine the value of each player to their team. It should be noted that the market for

pitchers is beyond the scope of this paper. The goal is to find which statistics best allow

for an accurate measurement of the marginal product of labor (MPL), which is part of the

demand curve in the labor market. As discussed in the review of the literature, many

articles have already tried to estimate MPL, using statistics such as runs and slugging

percentage. This paper is going to use many different offensive performance statistics to

estimate MPL. In general, a player with a better offensive skill set, or a higher MPL, will

increase the probability of his team scoring runs, which will consequently increase the

probability of his team winning. That will increase the revenue of the team, as

MacDonald and Reynolds showed that a team with a better record would earn higher

revenue from ticket sales. If we can identify which offensive statistics accurately measure

a player’s value, then we can determine if and why certain players with different

skill sets are paid differently, both before and after Moneyball.

We need to look at both the perfectly competitive market and the monopsony

market, as the baseball labor market is presumably somewhere between the two. In a

perfectly competitive labor market, a worker is paid the value of his MPL. In a

monopsony, owners can pay players less than the value of their MPL, so they will be able

to turn a larger profit while still staying competitive. As there are 30 teams competing for

players, the labor market should resemble a competitive market. However, as the A’s

showed, some players still receive less than they produce. But the decrease in

asymmetric information means that more and more players are now receiving the value

of their marginal product of labor. This means that the market is moving closer towards

perfect competition.

To determine the expected value of the salaries in a perfectly competitive market,

we must derive the supply and demand curves in the general model. The demand curve,

or the value of the marginal product of labor (VMPL), is simply equal to the marginal

product of labor multiplied by the output price, as the cost of hiring one more worker

should be less than or equal to the revenue that the hired worker can generate. So as hours

of labor increase, the wage for each worker should decrease, as their productivity is

experiencing diminishing marginal returns. Therefore, the VMPL curve is downward

sloping, as can be seen in figure 1. The model will be mainly focused on this curve, as we

want to determine how wage changes as MPL changes. If we hold everything but MPL

constant, including the output price, we can determine how much a change in MPL will

change wage, or in this scenario, salary. If MPL increases, then we should see an increase

in salary, and if MPL decreases we should see a decrease in salary. As a result, we can see

which statistics affect salary the most, and whether or not there is a difference before and

after Moneyball.
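
The relationship this paragraph relies on can be written compactly. This is only a sketch: the symbols $P$ (output price) and $w$ (wage) are notation assumed here for illustration, since the text itself names only MPL and VMPL.

```latex
\[
  \mathrm{VMPL}(L) = P \cdot \mathrm{MPL}(L), \qquad
  w = \mathrm{VMPL}(L) \quad \text{(perfect competition)}
\]
\[
  \Delta w = P \cdot \Delta \mathrm{MPL} \quad \text{(holding } P \text{ constant)}
\]
```

Under these assumptions, a change in MPL passes through to salary scaled by the output price, which is the comparison the regressions are meant to approximate.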

The labor supply curve is determined by looking at budget constraints and

indifference curves. A worker has a choice between two “goods”, income and leisure

(figure 2), and the budget constraint will have a y-intercept of 24*wage, if the worker

worked all 24 hours a day. The x-intercept will be if the worker does not work at all,

which is located at the point (24,0). So we can see that the slope of the budget constraint

is equal to the negative of the wage. When we change the wage, the budget constraint pivots

and the worker chooses a new optimal point depending on their preferences; the optimal points for

the different wages are then plotted to determine the worker’s supply curve.
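
The budget constraint described in this paragraph can be stated as an equation. The notation is assumed for illustration ($w$ for the hourly wage, leisure measured in hours per day).

```latex
\[
  \text{Income} = w\,(24 - \text{Leisure}), \qquad 0 \le \text{Leisure} \le 24
\]
\[
  \text{Leisure} = 0 \;\Rightarrow\; \text{Income} = 24w, \qquad
  \text{Leisure} = 24 \;\Rightarrow\; \text{Income} = 0, \qquad
  \frac{d\,\text{Income}}{d\,\text{Leisure}} = -w
\]
```

The two endpoints are the intercepts named in the text, and the derivative is the slope of the budget line.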

There are two possibilities for the supply curve, depending on the size of the

income and substitution effects. An increase in wage has two effects. According to the

income effect, it can cause workers to work less, as they can earn the same amount of

income in less time, therefore leading to a decline in hours worked and an increase in

leisure. At the same time, according to the substitution effect, an increase in the wage

also causes leisure to become more expensive, as more income could be made instead of

consuming leisure, therefore causing workers to demand less leisure and work more. If

the income effect dominates the substitution effect, then the worker’s supply curve will

be backward bending. If the substitution effect dominates the income effect, then the

supply curve will be upward sloping, which can be seen in figure 3.

Now that we have determined the equilibrium graph for a perfectly competitive

market, which can be seen in figure 4, we need to look at the monopsony model. In this

model, workers are paid less than the value of their work. The supply curve stays the

same, and is used to determine the wage. But there is now the marginal cost of labor (MCL)

curve, which is steeper than the supply curve, as seen in figure 5. It is used to determine the

quantity of labor hired. The demand curve is still the value of the marginal product of labor

curve, which is equal to the output price multiplied by MPL, just as in the perfectly

competitive market. Labor in the monopsony model is found at the intersection of the MCL

and VMPL curves. The wage is then read off the supply curve at that quantity of

labor (figure 6). Our goal in the monopsony model is the same as in the competitive

model; we want to set everything constant and then shift the demand curve, by shifting

MPL, to determine how the wages will shift.
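
A small numerical sketch may make the two equilibria concrete. It assumes linear supply and VMPL curves with made-up parameters `a`, `b`, `c`, `d`; none of these values come from the paper's data, and the linear form is chosen only because it gives a closed-form answer.

```python
# Hypothetical linear curves (assumed for illustration only):
#   supply:        w = a + b*L
#   demand (VMPL): w = c - d*L
#   marginal cost of labor for this supply curve: MCL = a + 2*b*L


def competitive(a, b, c, d):
    """Labor and wage where the supply curve meets VMPL."""
    labor = (c - a) / (b + d)
    return labor, a + b * labor


def monopsony(a, b, c, d):
    """Labor where MCL meets VMPL; the wage is read off the supply curve."""
    labor = (c - a) / (2 * b + d)
    wage = a + b * labor
    vmpl = c - d * labor
    return labor, wage, vmpl


a, b, c, d = 1.0, 0.5, 10.0, 0.5  # made-up parameters
lp, wp = competitive(a, b, c, d)
lm, wm, vmpl_m = monopsony(a, b, c, d)
print(f"competitive: L={lp:.2f}, W={wp:.2f}")
print(f"monopsony:   L={lm:.2f}, W={wm:.2f}, VMPL={vmpl_m:.2f}")
```

With these numbers the monopsony outcome has both less labor and a lower wage than the competitive one, and the wage sits below VMPL at the quantity hired.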

We can see that in a monopsony model, a worker will receive a lower wage than

the value of his marginal product of labor while also working less than in a perfectly

competitive market (figure 7). Given this, while the monopsony labor market is great for

owners as they can increase profits, players are receiving a lower salary than the value

they are producing. The Oakland A’s were operating as if the baseball labor market was a

monopsony by paying players with a high OBP less than they were worth to the

franchise. It helped them win more games at a cheaper cost than their competition, and

although the market quickly adjusted, it has still not become perfectly competitive, which

shows there still may be players who are receiving less than they are producing. Before

Moneyball, there were arbitrage opportunities for teams, but after the book, the market

should have become more of a perfectly competitive market, removing the arbitrage

opportunities and causing salaries to reflect the actual value of a player’s labor. This

should be reflected in the regressions, as some variables will have a much different effect

on salary before and after Moneyball.

IV. Data

I collected data on offensive performance for hitters from 1995 to 2010, from both

http://www.baseball-reference.com and http://www.thebaseballcube.com. I chose these

years because they represent a fairly large sample size, beginning in 1995 after the

baseball strike in 1994, which would have skewed the data, until last year. I collected

data on position players – no pitchers, only players that played defense and hit – who

qualified for the batting title. To qualify for the batting title, a player must have at least

3.1 plate appearances per game over an entire season. A plate appearance (PA) is every

time the batter gets into the batter’s box and a play occurs, whether the outcome is an at-

bat, walk, sacrifice, or anything else. In 1995, there were 144 games played, as the

beginning of the season was slightly delayed due to the strike, which means that the

minimum plate appearances to qualify for the batting title were 446. In every other year

in the data set, there were 162 games played, so a player must have at least 502 PAs to be

included in the study. For each player, I collected their offensive statistics, such as home

runs and runs batted in, their salary for the year, their team and the league their team

plays in, their age (as of June 30th of the year), and the position they played. There are a

total of 603 different players and 2474 player years included in the study. Table 1

presents summary statistics on all of the variables used in my regressions.
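
The qualification cutoffs quoted above follow from the 3.1 plate appearances per game rule. The sketch below reproduces them; rounding down is an assumption made here because it matches the 446 and 502 figures in the text, and the function name is hypothetical.

```python
import math

PA_PER_GAME = 3.1  # qualification rate stated in the text


def min_plate_appearances(games):
    # Rounding down is an assumption; it reproduces the cutoffs quoted in the text.
    return math.floor(PA_PER_GAME * games)


print(min_plate_appearances(144))  # 1995 season
print(min_plate_appearances(162))  # every other season in the sample
```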

The minimum salary was $109,000, earned by five different players in 1995 and

1996, and the maximum was $33 million, earned by Alex Rodriguez in both 2009 and

2010. The mean salary was $4.3 million. There is a large gap between the majority of the

salaries and the salaries of superstars, so to negate this, I use the natural log of the salaries

in my regressions. Unfortunately, salary information is not entirely accurate, as some

salaries include earned bonuses, while others do not, and some salaries depend on the

team that the player is on. In general, though, baseball has been more transparent about

salary information than other major sports, which will make it much easier to try and

estimate the effect of different variables on salary. Although there is a minimum salary in

baseball, which was $400,000 in 2009, it is binding in very few cases, so we can

disregard it in our models.

I collected a total of 23 different offensive measures for each player. Many of the

total statistics, like hits, at-bats, and walks, were used to calculate percentage statistics

such as batting average and on-base percentage, so I am going to ignore those statistics in

my regressions. I ran regressions involving many of the statistics in my data set, and

found that there were many that were insignificant in all regressions, so I have also

removed those statistics from my regressions. Finally, there are some variables that we

cannot include in regressions because we could not imagine increasing a variable while

holding another one constant. Home runs provide a good example of this. Unfortunately,

we cannot imagine an increase in home runs without an increase in both runs and RBIs,

so we are not able to include both home runs and runs in our regressions as the coefficient

on home runs would not accurately reflect the value of home runs on salary. I decided on

five variables to use in my regressions: Wins Above Replacement (WAR), runs, runs

batted in (RBI), on-base percentage (OBP), and slugging percentage (SLG).

Wins Above Replacement measures how much better a player is than an average

minor league replacement player, using offensive, baserunning, and defensive statistics (see

Appendix for details). It has a minimum value of -3.5 wins (values of WAR can be both

positive and negative; a negative value means the player is costing his team wins), which

belonged to Jose Guillen in 1997, and a maximum value of 12.5 wins, which belonged to

Barry Bonds in 2001. The mean WAR was 2.77 wins.

Runs are measured by the number of times a player scores a run. The minimum

runs scored was 31, by Rey Ordonez in 2001, the maximum runs scored was 152 by Jeff

Bagwell in 2000, and the mean number of runs scored is 83.4. Runs batted in are

measured by the number of times a player causes a player on his team to score a run. The

minimum RBI was 17, by Luis Castillo in 2000, the maximum RBI was 165 by Manny

Ramirez in 1999, and the mean number of RBI is 79.4. On-base percentage is calculated

as the number of hits, walks, and hit-by-pitches divided by the number of at-bats, walks,

hit-by-pitches and sacrifice flies. The minimum OBP was .259, by Angel Berroa in 2006,

the maximum OBP was .609 by Barry Bonds in 2004, and the mean OBP is .355.

Slugging percentage is calculated as the total number of bases (1 base for a single, 2 for a

double, 3 for a triple, and 4 for a home run) divided by the number of at-bats. The

minimum SLG was .268, by Cesar Izturis in 2010, the maximum SLG was .863 by Barry

Bonds in 2001, and the mean SLG is .462.
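
The two percentage statistics defined above translate directly into code. The sketch below follows the definitions in this paragraph; the function names and the sample season line are invented for illustration and do not describe a real player.

```python
def on_base_percentage(hits, walks, hbp, at_bats, sac_flies):
    """(H + BB + HBP) / (AB + BB + HBP + SF), as defined in the text."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Made-up season line: 150 hits = 100 singles + 30 doubles + 2 triples + 18 home runs.
obp = on_base_percentage(hits=150, walks=60, hbp=5, at_bats=500, sac_flies=5)
slg = slugging_percentage(singles=100, doubles=30, triples=2, home_runs=18, at_bats=500)
print(f"OBP={obp:.3f}, SLG={slg:.3f}")
```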



V. Regressions

Now that we have all of the statistics needed, we can run regressions to try and

predict which performance statistics affect salary. Our hypothesis is that after Moneyball,

some performance statistics will be rewarded differently than before Moneyball.

Moneyball was written during the summer of 2002, published in March 2003, and Hakes

and Sauer argue that the baseball labor market had adjusted itself within a year of the

book’s publication. If Hakes and Sauer are correct, coefficient estimates from 1995-2003

should be different than those from 2004-2010, as the market should have adjusted.

Salaries in baseball are often determined by long-term contracts, so past

production better explains current salary. Meltzer (2005) found the average contract

length in baseball to be 1.79 years, which had risen from 1.31 years in 1993. We can

assume that it has risen since then, but that contract average is for all players in baseball,

while our data set contains only those players who qualified for the batting title. As such,

we should expect that these players are generally better players, so they should be

rewarded with longer contracts. This means that we expect the proper lag time to be

about three years. The best way to determine the most representative lag time (e.g. one,

two, or three years) is to run a single regression with all variables in the regression lagged

for several years. When we run regressions for one, two, and three year lags, we find that

the best lags to use are three year lags. We are assuming that each contract is

approximately three years long, so a player’s current salary will reflect their performance

from three years earlier.



For each regression I am running in this paper, the dependent variable is the

natural log of salary, and the independent variables are the five performance statistics. I

have also created a "Moneyball" dummy variable, which takes on the value of 1 if the

year is greater than or equal to 2003, and 0 if it is before 2003. Although there are apparent

differences between salary determination before and after Moneyball, the question of

whether these differences are statistically significant remains. We use the interaction

terms involving the “Moneyball” indicator variable to determine if the payment for some

statistics was significantly different before and after Moneyball. To determine if the

salaries would be different, we can test the “variable*MB” coefficients for each of the

offensive variables by performing t-tests on each “variable*MB”. If any of the interaction

terms has a p-value below .05, we can conclude that there were differing payments.

To calculate the effect of the variable on salary before Moneyball, we simply look at the

coefficient on just the variable as the MB dummy would be equal to 0. To calculate the

effect of the variable on salary after Moneyball, we add the coefficient on the variable

and the variable*MB, as the MB dummy now equals 1. The five “variable*MB”

measures will be included in each regression, and each regression can be found in table 2.
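
One way to write the specification described in this paragraph is shown below. It is a sketch under stated assumptions, not a formula taken from the paper: the symbols $\beta_k$, $\delta_k$, $X_k$, and $u_{i,t}$ are introduced here, every statistic is assumed to enter with the three-year lag discussed earlier, and no standalone $MB_t$ term is shown because Table 2 reports none.

```latex
\[
  \ln(\text{Salary}_{i,t}) = \beta_0
    + \sum_{k=1}^{5} \beta_k X_{k,i,t-3}
    + \sum_{k=1}^{5} \delta_k \left( X_{k,i,t-3} \times MB_t \right)
    + u_{i,t}
\]
\[
  \text{effect of } X_k \text{ before Moneyball} = \beta_k, \qquad
  \text{effect after} = \beta_k + \delta_k, \qquad
  H_0 : \delta_k = 0
\]
```

Here $X_1,\dots,X_5$ stand for WAR, runs, RBI, OBP, and SLG, and the t-test on each $\delta_k$ is the test of differing payments.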

The first regression to run is a simple OLS regression. Running this regression

will not take advantage of the fact that we observe the same players over several years. When we run the

regression, we find that every variable but SLG is significant before Moneyball, but none of the

Moneyball interaction terms is significant. I am going to interpret each coefficient by

increasing the variable by roughly one standard deviation, which would mean an average player

becoming an above-average player. These increments produce much more meaningful

changes in salary than simple one-unit changes. Estimates for WAR indicate that every

extra two wins (2.0 WAR) a player adds to his team before 2004 is associated with a 5.26

percent increase in the player's salary, and every additional two wins a player adds to his

team after 2003 is associated with a 6.36 percent increase in the player's salary. This is

found by adding the coefficient for WAR and WAR*MB. Every additional 19 runs

scored by a player for a season before 2004 is associated with a 9.48 percent increase in

the player’s salary, and every additional 19 runs scored by a player for a season after

2003 is associated with a 9.42 percent increase in the player’s salary. Although the

difference is not significant, this regression shows that players were rewarded less for

runs after Moneyball was published. Every additional 25 RBIs in a season before 2004 is

associated with a 17.4 percent increase in the player’s salary, and every additional 25

RBIs in a season after 2003 is associated with a 25.0 percent increase in the player’s

salary. Every additional 37 points of OBP (.037, e.g. from .355 to .392) for

a player in a season before 2004 is associated with a 6.76 percent increase in the player’s

salary, and every additional 37 points of OBP for a player in a season

after 2003 is associated with a 10.05 percent increase in the player’s salary.
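
The percentages in this paragraph can be checked from the OLS column of Table 2. The sketch below does this for RBI and OBP; the dictionary and function names are hypothetical, and the calculation uses the usual linear approximation for a log-salary model (coefficient times change, times 100).

```python
# OLS coefficients copied from Table 2: (coefficient, coefficient on variable*MB)
OLS = {
    "RBI": (0.0069603, 0.0030393),
    "OBP": (1.828273, 0.8882631),
}
# Increments used in the text (roughly one standard deviation each).
STEP = {"RBI": 25, "OBP": 0.037}


def pct_effect(name, after_moneyball):
    base, interaction = OLS[name]
    # After Moneyball the MB dummy equals 1, so the interaction coefficient is added.
    slope = base + interaction if after_moneyball else base
    # Linear approximation for a log-salary model: 100 * slope * change.
    return 100 * slope * STEP[name]


for name in OLS:
    print(name, round(pct_effect(name, False), 2), round(pct_effect(name, True), 2))
```

For changes this small the approximation is close to the exact figure, 100 * (exp(slope * change) - 1), but the gap widens as the implied percentage grows.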

Next, we can check if there is any advantage to exploiting the panel nature of our

data. We have data over fifteen years, and we have many players that are in the dataset

for more than one year, so we can run a fixed effect estimator. We are worried that the

covariance between our variables and the error term does not equal zero (for example, $\mathrm{Cov}(OBP_{i,t}, a_i) \neq 0$),

which would mean that the OLS regression is biased. Here $a_i$ is the part of the error term that

takes into account information about an individual that does not vary over time. An

example of this would be the intangibles of Derek Jeter. Jeter is the captain of the New

York Yankees, and is in the data set every year since 1996. He is known as a very

intelligent player, a leader for young players, and someone who handles the New York

media well, but unfortunately these traits are immeasurable and cannot be included in a

regression. As such, they show up in the error term, and we are afraid that the OLS

regression may severely underestimate Jeter’s salary because it will not know whether the

difference in salary is due to the error term or the actual variables because of the

covariance between the two. Fixed effects will take this into account and will more

correctly estimate the regression if there is covariance between any of the statistics and

the error term.
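
A toy example can show why this matters. The sketch below is not the paper's estimation: it uses invented data with one regressor and two players, where each player has a fixed unobserved component that is correlated with the regressor, and compares a pooled OLS slope with the slope after subtracting each player's own means (the within transformation that a one-way fixed effects estimator relies on).

```python
from collections import defaultdict


def slope(xs, ys):
    """Simple one-regressor OLS slope."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def demean_by_player(rows):
    """Subtract each player's own mean from x and y (within transformation)."""
    groups = defaultdict(list)
    for player, x, y in rows:
        groups[player].append((x, y))
    xs, ys = [], []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        xs += [x - mx for x, _ in obs]
        ys += [y - my for _, y in obs]
    return xs, ys


# Invented data: y = a_i + 2*x, with a_A = 10 and a_B = 0.
rows = [("A", 1, 12), ("A", 2, 14), ("A", 3, 16),
        ("B", 4, 8), ("B", 5, 10), ("B", 6, 12)]

pooled = slope([x for _, x, _ in rows], [y for _, _, y in rows])
within = slope(*demean_by_player(rows))
print(f"pooled OLS slope: {pooled:.3f}, fixed-effects slope: {within:.3f}")
```

In this constructed case the pooled slope even has the wrong sign, while the within slope recovers the value of 2 used to generate the data.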

We can run both one-way fixed effects, which do not take advantage of the yearly

data, and two-way fixed effects, which also include coefficients for the years. Those year coefficients are not

important for the hypothesis of this paper, but because the two-way model uses the yearly data, two-way effects are preferred,

and when we run the regression, we find that only runs and OBP before Moneyball are

significant at the 5% level. We also find that RBI before Moneyball as well as OBP and

SLG after Moneyball are significant at the 10% level. The estimate for runs means that

every additional 19 runs scored by a player for a season before 2004 is associated with a

6.26 percent increase in the player’s salary. Also, every additional 37 points (.037)

of OBP for a player in a season before 2004 is associated with a 14.66 percent

increase in the player’s salary. The coefficient on OBP*MB is actually negative, which

means that players were rewarded less for higher OBP after Moneyball than before,

which does not agree with our hypothesis.

We can also run a random effects estimator, which would be more appropriate

than the two-way fixed effects estimator if the covariance between our variables and the

error term does equal zero. This would be the case if there were no variables we were

omitting (or could not include) that would significantly affect the covariance between the

performance statistics and the error term.

When we run random effects, we find that only RBI and OBP before Moneyball

are significant at the 5% level. Every additional 25 RBIs in a season before 2004 is associated with a

15.14 percent increase in the player’s salary. Every additional 37 points (.037)

of OBP for a player in a season before 2004 is associated with a 13.63 percent

increase in the player’s salary. With random effects, the coefficient on OBP*MB is

positive, although only slightly, but this now agrees with our hypothesis.

The regression that I believe most accurately estimates the influence of

performance statistics on salary is the fixed effects estimator. Although fixed effects

and random effects have very similar coefficients for most of the variables, I believe that

there is a covariance between a player’s statistics and the error term, as was discussed

above with Derek Jeter. There are many reasons why a player could be getting paid

differently (usually higher) than his statistics indicate. He could have intangibles, like

Jeter, that make him more valuable to his team, or he could simply be well-liked in his

home city and command a higher salary because of his popularity. For these reasons, I

believe that fixed effects show the true coefficients for predicting salary.

VI. Conclusions

In the regressions that we have run, only two of the statistics are consistently

significant: runs batted in and on-base percentage before Moneyball. Runs batted in was

significant at the 1% level in three of the four regressions and significant at the 10% level

in the two-way fixed effects regression. On-base percentage was always significant at the

5% level. We hypothesized that players with high OBP would be paid more after

Moneyball, but we get mixed results. The OLS and random effects regressions give us

positive values for OBP*MB, but the one-way and two-way fixed effects regressions give

us negative values for OBP*MB.

Some of the statistics, while not statistically significant, are economically

significant. One such example would be a player increasing his slugging percentage from

average to above average (one standard deviation) before Moneyball. This would result in

the player increasing his salary by 5.12 percent. Given that the league average salary is

$4,834,683, on average this would be an increase of $247,729.16. Although SLG is not

statistically significant, it is definitely economically significant. Many of the variables

that are not statistically significant are economically significant, which means that

although we found no variables that were significantly different statistically before and

after Moneyball, there could still be monopsonistic exploitation.
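
The slugging example above can be reproduced approximately from the tables. The sketch assumes the 5.12 percent figure comes from the random effects SLG coefficient in Table 2, since that is the column that matches it; the small gap from the dollar amount in the text appears to come from rounding the percentage before multiplying.

```python
MEAN_SALARY = 4_834_683  # mean salary, Table 1
SLG_COEF = 0.6742775     # SLG, random effects column of Table 2 (assumed source)
SLG_SD = 0.0760041       # standard deviation of SLG, Table 1

pct = SLG_COEF * SLG_SD        # proportional change in salary
dollars = MEAN_SALARY * pct    # implied change at the mean salary
print(f"{100 * pct:.2f} percent, about ${dollars:,.0f}")
```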

We can see that although spending patterns were altered after the book was

released, they are not significantly different in a statistical sense. This is probably due to

the fact that many contracts were signed before the book came out that ran for years past

2004, and those contracts reward players for pre-Moneyball statistics. If we were to run

this regression again in a few years, we may be able to see both a statistically and

economically significant change in the spending habits of teams after the release of the

book.

Figures and Tables

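Figure 1: Perfect Competition Demand Model
[Graph: wage (W) on the vertical axis, labor (L) on the horizontal axis; downward-sloping VMPL curve.]

Figure 2: Budget Constraint and Indifference Curve for a Worker
[Graph: Income on the vertical axis, Leisure (hr/day) on the horizontal axis; budget constraint (BC) and indifference curve (IC); optimum marked at W* and H*.]

Figure 3: Perfect Competition Supply Model
[Graph: wage (W) on the vertical axis; labor supply curve (SL).]

Figure 4: Perfect Competition Supply and Demand Model
[Graph: wage (W) against labor (L); curves SL and VMPL; equilibrium marked at WP and LP.]

Figure 5: Monopsony Supply Model
[Graph: wage (W) on the vertical axis; curves MCL and SL.]

Figure 6: Monopsony Supply and Demand Model
[Graph: wage (W) against labor (L); curves MCL, SL, and VMPL; monopsony outcome marked at WM and LM.]

Figure 7: Perfect Competition vs. Monopsony
[Graph: wage (W) against labor (L); curves MCL, SL, and VMPL; outcomes marked at WP, WM, LM, and LP.]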

Table 1 – Descriptive Statistics

Variable    Mean          Std. Deviation   Minimum        Maximum
Salary      $4,834,683    $4,737,942       $146,366.40    $33,000,000
LnSalary    14.74953      1.304495         11.89387       17.31202
WAR         2.769725      2.247303         -3.5           12.5
Runs        83.4232       19.10701         31             152
RBI         79.39814      25.5787          17             165
OBP         .3546892      .0370931         .259           .609
SLG         .4617033      .0760041         .268           .863

Table 2 – Regressions

Variable    OLS              One-Way          Two-Way          Random
            Regression       Fixed Effects    Fixed Effects    Effects
WAR         .0263652         .0033822         -.0040468        -.0055655
            (.0137541)*      (.0157241)       (.0160457)       (.0142906)
WAR*MB      .0054741         -.0270061        -.0004383        -.0064985
            (.0218059)       (.0225376)       (.0283512)       (.026163)
Runs        .0049915         .0022994         .0032931         .0025642
            (.0014698)***    (.0016533)       (.0016333)**     (.0014693)*
Runs*MB     -.0000303        -.0010778        -.001663         .0004244
            (.0026485)       (.0026647)       (.0026562)       (.0024679)
RBI         .0069603         .004546          .0030786         .0060543
            (.001441)***     (.0016781)***    (.0016506)*      (.0014295)***
RBI*MB      .0030393         -.0007809        -.0013886        .0004685
            (.0026261)       (.0028024)       (.0027512)       (.0025023)
OBP         1.828273         4.002033         3.963249         3.683718
            (.7808365)**     (1.08716)***     (1.106998)***    (.9028859)***
OBP*MB      .8882631         -.9280347        -2.969955        .219621
            (1.135569)       (1.16554)        (1.715506)*      (1.536049)
SLG         .7642924         -.450428         -.1546544        .6742775
            (.6034425)       (.6929237)       (.6772447)       (.591939)
SLG*MB      -.8940868        1.89428          1.964169         .551901
            (1.082093)       (1.116424)*      (1.114851)*      (1.032082)
Intercept   13.4791          13.75574         14.68708         13.03296
            (.2123302)       (.3135692)       (.5487836)       (.4425238)
N           1033             1033             1033             1033
R-Squared   0.3717           0.3020           0.2281           0.3701

Standard errors in parentheses.
*Significant at the 10% level, **Significant at the 5% level, ***Significant at the 1% level

Appendix

To calculate Wins Above Replacement:

Calculate a hitter’s weighted on-base average (wOBA), which is a statistic that combines

on-base percentage and slugging percentage. It is calculated by (0.72*NIBB + 0.75*HBP

+ 0.90*1B + 0.92*RBOE + 1.24*2B + 1.56*3B + 1.95*HR) / PA. NIBB = non-

intentional walks, and RBOE = reach base on an error.

These coefficients are the run values of each event relative to an out. To convert wOBA

to wins, we must compare the hitter’s wOBA to the league wOBA. Wins = (wOBA –

League wOBA) / 1.15 * 700 / 10.5

The league wOBA is usually around 0.338. 1.15 is the relationship between wOBA and

runs. The average player will get 700 plate appearances per 162 games, and the ratio of

runs to wins is 10.5. So the formula compares the number of runs above average a player

is per PA, through wOBA, multiplies it by the number of PAs in a season, and divides by

the runs-wins ratio to calculate WAR.

Then, you must add in the positional adjustment, the replacement level of the player, and

the park factor for the player’s home stadium. Positional adjustments are defined as: +1.0

wins for a catcher, +0.5 wins for a SS or CF, no wins for a 2B or 3B, -0.5 wins for a LF,

RF, or PH, -1.0 win for a 1B, and -1.5 wins for a DH. The replacement level is how much

the player played that year, so how hard he would be to replace. The park factor is the

number of runs above or below average the player’s home park is, so how conducive it is

to runs being scored. Once you have added in all adjustments, you have calculated Wins

Above Replacement.
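
The steps in this appendix can be sketched in code. The weights, the 0.338 league wOBA, the 1.15 scale, the 700 plate appearances, the 10.5 runs-per-win ratio, and the positional adjustments are the values given above. The function names are hypothetical, and `replacement_wins` and `park_wins` are placeholder inputs assumed to be already expressed in wins, since the appendix does not give formulas for them.

```python
WEIGHTS = {"NIBB": 0.72, "HBP": 0.75, "1B": 0.90, "RBOE": 0.92,
           "2B": 1.24, "3B": 1.56, "HR": 1.95}

POSITION_ADJ = {"C": 1.0, "SS": 0.5, "CF": 0.5, "2B": 0.0, "3B": 0.0,
                "LF": -0.5, "RF": -0.5, "PH": -0.5, "1B": -1.0, "DH": -1.5}


def woba(events, plate_appearances):
    """Weighted on-base average from a dict of event counts."""
    return sum(WEIGHTS[k] * events.get(k, 0) for k in WEIGHTS) / plate_appearances


def batting_wins(player_woba, league_woba=0.338):
    """Convert wOBA above league average into wins."""
    return (player_woba - league_woba) / 1.15 * 700 / 10.5


def war(events, plate_appearances, position, replacement_wins=0.0, park_wins=0.0):
    # replacement_wins and park_wins are hypothetical inputs, assumed to be in wins.
    wins = batting_wins(woba(events, plate_appearances))
    return wins + POSITION_ADJ[position] + replacement_wins + park_wins


# Made-up season line, not a real player.
season = {"NIBB": 60, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 2, "HR": 18}
print(f"wOBA = {woba(season, 600):.3f}")
print(f"WAR as a shortstop = {war(season, 600, 'SS'):.2f}")
```

Keeping the adjustments as separate arguments makes it easy to see how much of a player's total comes from hitting and how much from position.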

References

Frank, Robert H. Microeconomics and Behavior (7th ed). New York: McGraw-Hill, 2008.

Hakes, Jahn K., and Sauer, Raymond D. “An Economic Evaluation of the Moneyball
Hypothesis.” Journal of Economic Perspectives, Vol. 20, No. 3 (Summer 2006), pp. 173-186.

Lewis, Michael. Moneyball. New York: W.W. Norton & Company, Inc., 2003.

MacDonald, Don N., and Reynolds, Morgan O. “Are Baseball Players Paid their
Marginal Products?” Managerial and Decision Economics, Vol. 15, No. 5
(September-October 1994), pp. 443-457.

Meltzer, Josh. “Average Salary and Contract Length in Major League Baseball: When do
they Diverge?” May 2005.

Rottenberg, Simon. “The Baseball Players’ Labor Market.” The Journal of Political
Economy, Vol. 64, No. 3 (June 1956), pp. 242-258.

Scully, Gerald W. “Pay and Performance in Major League Baseball.” The American
Economic Review, Vol. 64, No. 6 (December 1974), pp. 915-930.

Tango, Tom M., Lichtman, Mitchel G., and Dolphin, Andrew E. The Book: Playing the
Percentages in Baseball. Dulles: Potomac Books Inc., 2007.

Vrooman, John. “The Baseball Players’ Labor Market Reconsidered.” Southern
Economic Journal, Vol. 63, No. 2 (October 1996), pp. 339-360.

Wooldridge, Jeffrey M. Introductory Econometrics (4th ed). Mason: Cengage Learning,
2009.
