
Washington University in St Louis

Senior Design Project Report

NBA Scheduling with Binary Programming

Author:
Mark Jajeh

Professors:
Zachary Feinstein
Jason Trobaugh

August 26, 2016

Mark Jajeh

NBA Scheduling

Abstract
In this paper we apply binary programming techniques to the problem of sports scheduling, with a focus on the NBA. We analyze all phases of producing a feasible solution with binary programming. We then show how to implement complex features of a realistic schedule.

Introduction and Background

As the NBA grows in both popularity and size, it becomes increasingly evident that there is a need to remove all outside influences from the competitive balance between individual teams. This was the main reason that the NBA recently moved from a manually constructed schedule to a schedule created by an autonomous algorithm. Along with ensuring equality among teams' schedules, this algorithm should be able to maximize interest in the NBA by picking the right teams to play at the right times. Linear programming seems like a natural fit for this problem. This project is focused on finding elements of a linear programming algorithm that can be used to achieve the NBA's goals in an efficient and robust manner.

Objectives

The purpose of this project is to show how binary programming can be used as an efficient method for sports schedule creation, with a focus on the NBA. This project is as much a study of the methodology of binary programming as a possible solution as it is of the results it provides. We have two main objectives:
- Produce a feasible NBA schedule with binary programming.
- Explore the merits of binary programming as a solution to complex problems.

Methods

3.1 Linear Programming Review

A binary program can be used to solve any problem that can be posed as optimizing a linear function of binary variables subject to linear constraints. All binary programs share the same basic core structure and are written in the following standard form:

$$\min_{x} \; f^{T} x \quad \text{s.t.} \quad Ax \le b, \quad x \ge 0, \quad x \in \{0, 1\}$$

where the symbols represent:
f: The objective function. Its coefficients define the linear function to be minimized subject to our set of constraints.
x: The decision variables. A set of quantities that need to be determined in order to solve the problem.
A, b: Constraint matrix and constants. These define the possible values the variables of the problem can take.

We can create a binary program to produce an NBA schedule with notation no more complicated than the form above. Once we do this, we can use standard linear programming algorithms to find solutions to our problem.
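To make the standard form concrete, here is a minimal sketch that solves a tiny binary program by enumerating every 0/1 vector. The function name `solve_binary_program` and the toy numbers are assumptions made for illustration. Enumeration is only workable for a handful of variables and is not the method used in this project, which relies on a dedicated solver.

```python
from itertools import product

def solve_binary_program(f, A, b):
    """Brute-force min f^T x subject to A x <= b with x binary (tiny n only)."""
    best_x, best_val = None, None
    for x in product((0, 1), repeat=len(f)):
        # Keep x only if every row of A x <= b holds.
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= rhs
            for row, rhs in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(c * xi for c, xi in zip(f, x))
        if best_val is None or val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Toy instance (hypothetical numbers): choose at least two of three items at least cost.
f = [3, 1, 2]
A = [[-1, -1, -1]]   # -(x1 + x2 + x3) <= -2  is the same as  x1 + x2 + x3 >= 2
b = [-2]
x_opt, cost = solve_binary_program(f, A, b)
```

Note how a "greater than or equal" requirement is written in the `Ax <= b` form by negating both sides.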

3.2

NBA Structure/Constraints

Below are the aspects of a sports schedule that are specific to the NBA. A schedule must meet all of the requirements below to be considered feasible.

- There are 30 total teams in the NBA.
- There are 2 conferences, each consisting of 3 divisions with 5 teams each.
- There are 170 days in an NBA season (see footnote 1).
- Each team plays 82 total games (41 home and 41 away):
    - 4 games against each of the other 4 division opponents [4x4=16 games]
    - 4 games against each of 6 (out-of-division) conference opponents [4x6=24 games]
    - 2 games against each team in the opposing conference [2x15=30 games]
    - 3 games against each of the remaining 4 conference teams [3x4=12 games] (see footnote 2)
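As a quick arithmetic check, the matchup counts above can be tallied to confirm that they add up to 82 games against 29 opponents. This is only a sketch; the dictionary keys are labels chosen for illustration.

```python
# (games per opponent, number of such opponents) for a single team
matchups = {
    "division": (4, 4),
    "conference, 4 games": (4, 6),
    "conference, 3 games": (3, 4),
    "opposite conference": (2, 15),
}

total_games = sum(games * count for games, count in matchups.values())
total_opponents = sum(count for _, count in matchups.values())

# Every game involves two teams, so the league-wide total is half the sum over teams.
league_games = 30 * total_games // 2
```

The league-wide total is the number of decision variables that must equal 1 in any feasible schedule.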

3.3 Implementation

We will now discuss the creation of our BP model.

3.3.1 Decision Variables

We use binary decision variables in our model to solve the NBA scheduling problem. We define the following decision variables:
$$x_{i,j,k} = \begin{cases} 1 & \text{if team } i \text{ plays at team } j \text{ on day } k \\ 0 & \text{otherwise} \end{cases}$$

Defining the decision variables in this manner not only simplifies the notation of the problem; it also creates an intuitive way to visualize our constraints and program. We can think of an NBA schedule as a 30 × 30 × 170 binary matrix. We will see that this simplicity comes at a cost: we cannot model all of the complex aspects of an NBA schedule with these decision variables alone.

Footnote 1: I could not find an exact number for this, so I used the length of the 2015-2016 season, which was 170 days.
Footnote 2: A five-year rotation determines which out-of-division conference teams are played only three times. I could not find any information on the exact rotation, so my implementation assumes the same games that were played in the 2015-2016 season. See the appendix for more details.
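The following sketch shows one way the 30 × 30 × 170 array could be flattened into the single vector x that a solver expects. The row-major ordering and the helper name `var_index` are assumptions made for illustration, not necessarily the layout used in the project's source code.

```python
N_TEAMS, N_DAYS = 30, 170

def var_index(i, j, k):
    """Flat 0-based position of x[i, j, k], with i, j in 1..30 and k in 1..170."""
    return ((i - 1) * N_TEAMS + (j - 1)) * N_DAYS + (k - 1)

n_vars = N_TEAMS * N_TEAMS * N_DAYS

# A schedule can be stored sparsely as the set of triples whose variable equals 1.
# Hypothetical example: team 1 plays at team 2 and team 3 plays at team 4, both on day 5.
schedule = {(1, 2, 5), (3, 4, 5)}
x = [0] * n_vars
for i, j, k in schedule:
    x[var_index(i, j, k)] = 1
```

Under these assumptions the model has 153,000 binary variables, of which only 1,230 are nonzero in a feasible schedule, so a sparse representation is a natural fit.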

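3.4 Basic Constraints

To meet the requirements of an NBA schedule we impose the following constraints on our decision variables.

1. Teams cannot play themselves.

$$x_{i,i,k} = 0, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 170 \tag{1}$$

2. Teams can play at most one game per day.

$$\sum_{j=1}^{30} (x_{ijk} + x_{jik}) \le 1, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 170 \tag{2}$$

3. Teams must play 82 games per year, 41 home and 41 away. Because x_{ijk} places team i at team j's arena, summing over j counts the away games of team i and summing over i counts the home games of team j.

$$\sum_{k=1}^{170} \sum_{j=1}^{30} x_{ijk} = 41, \qquad i = 1, \dots, 30 \quad \text{(away)} \tag{3}$$

$$\sum_{k=1}^{170} \sum_{i=1}^{30} x_{ijk} = 41, \qquad j = 1, \dots, 30 \quad \text{(home)} \tag{4}$$

4. 4 games against each division opponent (2 home and 2 away). Here div_i denotes the set of teams that share a division with team i. Each constraint applies to one ordered pair (i, j), so the sum runs over days only.

$$\sum_{k=1}^{170} x_{ijk} = 2, \qquad j \in \mathrm{div}_i, \quad i = 1, \dots, 30 \tag{5}$$

$$\sum_{k=1}^{170} x_{ijk} = 2, \qquad i \in \mathrm{div}_j, \quad j = 1, \dots, 30 \tag{6}$$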


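5. 4 games against each of 6 (out-of-division) conference opponents, 2 home and 2 away (see footnote 3). Here conf4_i denotes the set of six out-of-division conference opponents that team i plays four times.

$$\sum_{k=1}^{170} x_{ijk} = 2, \qquad j \in \mathrm{conf4}_i, \quad i = 1, \dots, 30 \tag{7}$$

$$\sum_{k=1}^{170} x_{ijk} = 2, \qquad i \in \mathrm{conf4}_j, \quad j = 1, \dots, 30 \tag{8}$$

6. 3 games against each of the remaining 4 (out-of-division) conference opponents. Because there is not an even home/away split, we need two additional constraints to force at least 1 home game for each team (see footnote 4).

$$\sum_{k=1}^{170} (x_{ijk} + x_{jik}) = 3, \qquad j \in \mathrm{conf3}_i \ \text{OR} \ i \in \mathrm{conf3}_j \tag{9}$$

$$\sum_{k=1}^{170} x_{ijk} \le 2, \qquad i \in \mathrm{conf3}_j \tag{10}$$

$$\sum_{k=1}^{170} x_{ijk} \le 2, \qquad j \in \mathrm{conf3}_i \tag{11}$$

Since each such pair meets three times in total and neither team may host more than twice, each team hosts at least once.

7. 2 games against each opposite-conference opponent (1 home and 1 away). Here conf_i denotes the set of all teams in team i's conference.

$$\sum_{k=1}^{170} x_{ijk} = 1, \qquad j \notin \mathrm{conf}_i, \quad i = 1, \dots, 30 \tag{12}$$

$$\sum_{k=1}^{170} x_{ijk} = 1, \qquad i \notin \mathrm{conf}_j, \quad j = 1, \dots, 30 \tag{13}$$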
Footnote 3: There is a five-year rotation that determines which conference opponents are played 4 times and which are played 3 times. I could not find any documentation on the specifics of this rule (I even went so far as to email the NBA directly and ask). For the purposes of this project we use the rotation from the 2015-2016 NBA season.
Footnote 4: Define conf3_j as the set of opponents that team j plays 3 times.
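The basic constraints can be sanity-checked on a candidate schedule without a solver. The sketch below checks analogues of constraints (1) through (4) on a small, made-up four-team league; the function name, the sparse triple representation, and the toy schedule are all assumptions for illustration.

```python
from collections import Counter

def check_basic_constraints(schedule, n_teams, games_per_side):
    """Check analogues of constraints (1)-(4) on a sparse schedule.

    schedule: iterable of (i, j, k) triples meaning team i plays at team j on day k.
    games_per_side: required number of home games (and of away games) per team.
    """
    away, home, per_day = Counter(), Counter(), Counter()
    for i, j, k in schedule:
        if i == j:                      # constraint (1): no team plays itself
            return False
        away[i] += 1
        home[j] += 1
        per_day[(i, k)] += 1
        per_day[(j, k)] += 1
    if any(count > 1 for count in per_day.values()):   # constraint (2)
        return False
    teams = range(1, n_teams + 1)                      # constraints (3) and (4)
    return all(away[t] == games_per_side and home[t] == games_per_side for t in teams)

# Hypothetical 4-team double round robin over 6 days (every ordered pair meets once).
rounds = [
    [(1, 4), (2, 3)], [(1, 3), (4, 2)], [(1, 2), (3, 4)],
    [(4, 1), (3, 2)], [(3, 1), (2, 4)], [(2, 1), (4, 3)],
]
toy = [(i, j, day) for day, games in enumerate(rounds, start=1) for i, j in games]
toy_is_feasible = check_basic_constraints(toy, n_teams=4, games_per_side=3)
```

For the full league the same check would be called with `n_teams=30` and `games_per_side=41`; the matchup-count constraints (5) through (13) would need the division and conference sets as extra inputs.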

3.5 Additional Simple Constraints

Many features of an NBA schedule are not explicit rules or constraints. It is important that these are still included in our model and reflected in our solution. We limit the number of games teams can play over stretches of days, and we set a minimum number of games that must be played league-wide per day.

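3.5.1 X games per day

We would like to ensure that either at least X games or at most X games are played per day throughout the entire league. Each game appears in exactly one variable x_{ijk} (with i the visiting team and j the host), so summing over all pairs (i, j) counts the games played on day k. We can impose either bound as follows.

At most X games per day:

$$\sum_{i,j} x_{ijk} \le X, \qquad k = 1, \dots, 170 \tag{14}$$

At least X games per day:

$$\sum_{i,j} x_{ijk} \ge X, \qquad k = 1, \dots, 170 \tag{15}$$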

3.5.2 X games in 7 days

We want to ensure an even distribution of games over the course of a season. We can do this by constraining the number of games played over a given number of days. There are two ways to do this.

1. We constrain the number of games played in each individual week of the season. If we want to ensure that teams play at most W games per week, we can write

$$\sum_{d=k}^{k+6} \left( \sum_{j} x_{ijd} + \sum_{j} x_{jid} \right) \le W, \qquad i = 1, \dots, 30, \quad k = 1, 8, 15, \dots, 169 \tag{16}$$

where the weeks start on days with k mod 7 = 1 and the final week is truncated at day 170.

2. We constrain the number of games played over every 7-day stretch in the season. We can write this as

$$\sum_{d=k}^{k+6} \left( \sum_{j} x_{ijd} + \sum_{j} x_{jid} \right) \le W, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 164 \tag{17}$$

Although these two formulations appear very similar and can often produce similar results, they have very different effects on the size of our model. Constraining all 7-day stretches imposes 30 × 164 = 4920 additional constraints, while the weekly constraint imposes only 30 × 25 = 750 constraints. The tradeoff between model complexity and effect on the solution is what we analyze in our results.
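The difference between the two formulations can be seen on a single team's list of game days. The sketch below assumes days are numbered from 1 and that the weekly blocks start on days 1, 8, 15, and so on; the helper name and the example days are hypothetical.

```python
N_DAYS, WINDOW = 170, 7

# Formulation 1: disjoint weeks starting on days 1, 8, 15, ... (last week truncated).
weekly_starts = list(range(1, N_DAYS + 1, WINDOW))
# Formulation 2: every full 7-day stretch, starting on days 1 through 164.
sliding_starts = list(range(1, N_DAYS - WINDOW + 2))

def max_games_in_windows(play_days, starts):
    """Largest number of games a team plays in any window [k, k+6]."""
    days = set(play_days)
    return max(sum(1 for d in range(k, k + WINDOW) if d in days) for k in starts)

# Hypothetical team that plays on days 4-7 and again on days 8-11.
play_days = [4, 5, 6, 7, 8, 9, 10, 11]
weekly_max = max_games_in_windows(play_days, weekly_starts)
sliding_max = max_games_in_windows(play_days, sliding_starts)
```

In this example each calendar week contains only 4 games, so the weekly formulation with W = 4 is satisfied, yet the stretch from day 4 to day 10 contains 7 games. The sliding-window formulation is therefore strictly stronger, at the cost of roughly six times as many constraints.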

3.6 Additional Complex Constraints

Although we can model many extra features by adding basic constraints to the model we currently have, many features require added complexity. Some of the things we want to incorporate are non-linear by nature. We can still model these with our linear system, but they require us to define additional decision variables along with added constraints, increasing the complexity of our model.

3.7 Optimization

One of the key aspects of our model is the objective function. Traditionally an objective function represents some type of cost or quantity. In our case, there is no obvious quantity that we would like to optimize. The most important thing to note about an NBA schedule in particular is that there is no control over which teams play each other. For a given year a team's opponents are predetermined. The only thing we are changing is when the games are played.

There is no obvious way to assign a score or value to given days and matchups, especially with the basic decision variables we have defined. We do want to note that although the objective function does not change the feasible set of the model, if it is not designed correctly it can greatly hurt our solution. We will test and analyze the following objective functions to illustrate what effect an objective function has on a solution.

3.7.1 Assigning Value

Assume that we have given some value V_t to each team in the league. We have also assigned a value V_d to each day of the season. To create an objective function, we define for each decision variable x_{i,j,k} a function f based on these three values. It is not obvious what this function should be. We look at several basic approaches to this problem (see footnote 5).

$$f = V_t(i) + V_t(j) + V_d(k) \tag{18}$$

$$f = V_t(i) \, V_t(j) + V_d(k) \tag{19}$$

$$f = V_t(i) \, V_t(j) \, V_d(k) \tag{20}$$

Any realistic approach to the optimization problem (of when specific games between teams are played) would value teams and days based on some specific criteria. Ideally you would look at previous seasons and find some prior probability distribution that is assumed to positively affect the net interest in games over the course of a season based on the timing of certain games. Finding this distribution, or something similar, is beyond the scope of this project. We instead focus on generalizations that help illustrate the effect of objective functions on results.
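The three candidate functions can be tabulated as one coefficient per decision variable. The sketch below assumes that f is used as the coefficient multiplying x_{i,j,k} in the objective, which the text does not state explicitly; the function name, the `kind` labels, and the toy values are hypothetical.

```python
def objective_coefficients(team_value, day_value, kind):
    """Coefficient c[(i, j, k)] for each x_{i,j,k}, following (18)-(20)."""
    combine = {
        "sum": lambda vi, vj, vd: vi + vj + vd,                # (18)
        "product_plus_day": lambda vi, vj, vd: vi * vj + vd,   # (19)
        "product": lambda vi, vj, vd: vi * vj * vd,            # (20)
    }[kind]
    return {
        (i, j, k): combine(vi, vj, vd)
        for i, vi in team_value.items()
        for j, vj in team_value.items() if i != j
        for k, vd in day_value.items()
    }

# Hypothetical values: a team's value is its index and a day's value is its number.
team_value = {i: i for i in range(1, 5)}
day_value = {k: k for k in range(1, 4)}
c = objective_coefficients(team_value, day_value, "product")
```

With team and day values equal to their indices, these reduce to coefficients of the form i + j + k, i*j + k, and i*j*k.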

3.8 Adding Complexity

We now explore how we can incorporate complex features into our model by adding new decision variables.

Footnote 5: We move to non-linear optimization functions. This has repercussions for the theoretical guarantee of optimality of our solutions. For our purposes this effect is negligible.

3.8.1 Consecutive days played

Although there is no strict rule regarding games played on consecutive days, it is commonly accepted that any good schedule will have as few back-to-backs as possible. There is no obvious way to achieve this with only decision variables for games played. To regulate consecutive games we define new decision variables in our model as follows:

$$d_{ik} = \begin{cases} 1 & \text{team } i \text{ plays on day } k \\ 0 & \text{otherwise} \end{cases} \tag{21}$$

$$s_{ik} = \begin{cases} 1 & \text{team } i \text{ plays on day } k \text{ AND day } k+1 \\ 0 & \text{otherwise} \end{cases} \tag{22}$$

We now need constraints that define these variables in our model. We can define d and s as follows.

$$\sum_{j} (x_{ijk} + x_{jik}) - d_{ik} = 0, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 170 \tag{23}$$

$$s_{ik} \ge d_{ik} + d_{i(k+1)} - 1, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 169 \tag{24}$$

$$s_{ik} \le d_{i(k+1)}, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 169 \tag{25}$$

$$s_{ik} \le d_{ik}, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 169 \tag{26}$$

Once we have these constraints in our model we can limit the number of consecutive games played by imposing constraints on s_{ik} as we see fit or, better yet, by designing the objective function in a way that discourages consecutive games. We do not examine what this process would entail.
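Constraints (24) through (26) are the standard way to express a logical AND of two binary variables with linear inequalities. The short sketch below checks this by enumeration and shows what s_{ik} marks for one team; the function names and the example days are chosen for illustration.

```python
from itertools import product

def feasible_s(d_k, d_next):
    """Binary values of s allowed by constraints (24)-(26) for given d_ik and d_i(k+1)."""
    return [
        s for s in (0, 1)
        if s >= d_k + d_next - 1 and s <= d_next and s <= d_k
    ]

# For every combination, the only feasible s is the logical AND of the two d values.
linearization_ok = all(
    feasible_s(a, b) == [a & b] for a, b in product((0, 1), repeat=2)
)

def back_to_backs(play_days):
    """Days k on which a team plays both day k and day k+1 (where s_ik = 1)."""
    days = set(play_days)
    return sorted(k for k in days if k + 1 in days)

# Hypothetical team playing on days 1, 2, 3, 7, 9 and 10.
example = back_to_backs([1, 2, 3, 7, 9, 10])
```

Because s is pinned to a single value for every combination of d values, summing s_{ik} over k gives the number of back-to-backs for team i, which can then be bounded or penalized.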

3.8.2 Tracking Location

We ultimately would like to be able to optimize a schedule based on the distance travelled by teams. This is no easy task for two main reasons:
1. Teams can travel between any combination of locations on any given day.
2. Rest days. Teams can have some number of days between games, exponentially increasing the complexity of accounting for travel paths.

We unfortunately could not find a way to accurately account for distance travelled in our model. A proposed solution that still needs some work can be found in the appendix. We instead focused on the simpler problem of removing rest days altogether and then determining teams' locations. We are able to do this by adding the following decision variable and constraints. A visiting team is located at its host's arena, and team i is located at its own arena exactly when some team j plays at team i.

$$z_{ijk} = \begin{cases} 1 & \text{team } i \text{ is located at arena } j \text{ on day } k \\ 0 & \text{otherwise} \end{cases} \tag{27}$$

$$z_{iik} = \sum_{j=1}^{30} x_{jik}, \qquad i = 1, \dots, 30, \quad k = 1, \dots, 82 \tag{28}$$

$$z_{ijk} = x_{ijk}, \qquad i, j = 1, \dots, 30 : i \ne j, \quad k = 1, \dots, 82 \tag{29}$$
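With rest days removed, the location variables follow directly from the games played on a given day. The sketch below derives each team's arena from a list of games, under the assumption that every team plays on every day; the function name and the four-team example are hypothetical.

```python
def team_locations(games_on_day, n_teams):
    """Map each team to the arena it occupies on a day with no rest days.

    games_on_day: list of (i, j) pairs meaning team i plays at team j.
    Returns {team: arena}, mirroring (28) for hosts and (29) for visitors.
    """
    location = {}
    for away, home in games_on_day:
        location[away] = home    # visitor sits at the host's arena, as in (29)
        location[home] = home    # host sits on the main diagonal, as in (28)
    if len(location) != n_teams:
        raise ValueError("every team must play when rest days are removed")
    return location

# Hypothetical 4-team day: team 1 visits team 4 and team 3 visits team 2.
loc = team_locations([(1, 4), (3, 2)], n_teams=4)
```

In matrix terms, the location matrix for a day is the game matrix for that day plus a 1 on the diagonal for every host.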

Results

We first show how our solution changes as we add constraints (see footnote 6), ultimately producing a feasible NBA schedule. Once we know we can find a feasible solution, we look at the various ways our model is updated and changed as we add the features mentioned above.

Footnote 6: Constraints are added in the same order as they appear in 3.4.

4.1 Constraint Results

Figure 1: Progression of solution as constraints are added to BP model. Panels: (a) Total game constraint; (b) Division games; (c) 4 OOD games; (d) 3 OOD games; (e) Feasible schedule; (f) NBA schedule for 2015 season.

Looking at Figure 1 we see the progression of our solution as we continue to add constraints to our BP model. We can see how our model constraints are translated into the final output. We also see that our model does not care about anything we do not tell it to care about, as shown by the change from Figure 1(a) to Figure 1(e). This emphasizes the need to account for every aspect we deem necessary to the feasibility of our solution. It also emphasizes one of the most powerful aspects of binary programming: ambiguity can be used to your benefit when you want it to be. We see this in further detail when we discuss optimization.

Looking at Figures 1(e) and 1(f) we see that we do indeed produce a feasible NBA schedule. Each pair of teams meets the same number of times in both schedules. The only reason the two are not identical is that the games against the 3-game and 4-game OOD conference opponents are not arranged in exactly the same way: although we use the same rotation as the 2015 NBA season, we do not use the same home/away split.

The problem is that feasibility does not imply realism. There is still a very large gap between the schedule we created and one you would see in a real NBA season. The difference between a feasible schedule and a realistic one can be seen in Figure 2.

Figure 2: Basic Feasible NBA Schedule vs. Actual NBA Schedule. Panels: (a) Basic feasible NBA schedule; (b) NBA schedule for 2015 season.

With a quick look we can see that the real schedule is much more organized than our own. There is a balance between games and rest days, as well as an order to the locations in which the games are played.

4.2 Basic Optimization

To try to improve our solution, as well as our understanding of objective functions and their effect on our results, we optimize over the functions (18)-(20).

Figure 3: Optimization of solution with changing objective functions. Panels: (a) f = i + j + k; (b) f = i*j + k; (c) f = i*j*k.

In Figure 3(a) we see a trivial result: all games are played in the second half of the season. This is technically feasible but completely impractical. In this case our solution was actually made worse by adding an objective function. Figure 3(b) shows the same result as (a) although it was created with a different objective. It is important to remember that the effect of these functions depends on where their optima lie relative to the constraints of the model. If two functions are maximal or minimal at the same place, for all intents and purposes they do the same thing to our model. In Figure 3(c) we first see how we can use an objective function to our benefit. Although the result is again impractical, we now see a result that seems to follow logically from our objective. We use the objective function seen in Figure 3(c), f = i*j*k, in all the figures that follow.

4.3 Game Distribution

Figure 4: Constraint by individual week, 4 games per week. Panels: (a) Constraint by individual week; (b) Constraint by 7-day window.

Above we see the results of limiting the number of games played in a 7-day stretch to 4 using both methods mentioned previously. Both methods produce very similar results. The solutions have very uniform distributions, with games occurring mostly in sets of 4. They still favor playing games as late as possible, but now games are forced to be more spread out throughout the season. Although the result is not perfect, we can see how adding constraints to our basic model improves its viability.

To emphasize the different ways we can affect the solution, we show how a similar result can be achieved through optimization. We optimize our model based on the day of the week a game is played (where later days are more valuable).

Figure 5: Optimization over day of the week

This result is remarkably similar to those we found by adding constraints to our model. This is important because it highlights how similar features can be achieved in different ways with linear programming.

To make the result more realistic we ensure that games are played every day of the season, along with enforcing a maximum number of games over a certain number of days. Below is the schedule we found when we limited the number of games per 7-day stretch to 4 and forced there to be at least 12 games per day.

Figure 6: Constraint by week and day

We can see that games are now more spread out over the course of the season while still preserving some of the optimal features we would want in our schedule (in this test case, games between teams with higher indices are played on later days). The main problem we still have is that individual teams' games are closely grouped together.

Consecutive Games

Once we updated our model with the variables and constraints in section 3.8.1 we found the following:

Figure 7: Solution when accounting for consecutive days played. Panels: (a) Schedule found when including consecutive games variables; (b) Consecutive games played over full season.

This is exactly what we expected from the new decision variables, and our resulting solution is even better than we expected. We can see that our model can accurately track the back-to-back games played in a season as well as use this to decide where to place games (see footnote 7). More importantly, we have shown how additional decision variables can be used to model non-linear relationships. This is a key insight moving forward, and in general when attempting to solve a complex problem with binary programming.

Footnote 7: This solution was found with a 0 objective function. It is not immediately clear why the solution looks as it does without optimizing for fewer consecutive games.

4.4 Tracking team location

When we added decision variables to track location we were able to produce a feasible solution.

Figure 8: Games and location for an individual day. Panels: (a) Games played on day 4; (b) Team locations on day 4.

Looking at the two plots above we immediately notice the similarity between the variables. Looking carefully, we see that team locations are the same as games, with additional points along the main diagonal. This is just as we expect. All away teams will be located at the arena of the home team that they play, and all home teams will be at their own arena (found on the main diagonal).

Discussion

We have shown how linear programming, binary programming in particular, can be an efficient and straightforward method of simple NBA schedule design. We also saw how binary programming can be used in complex problem solving. Although we did not find a perfect NBA schedule or analyze all aspects of linear programming, we did meet our goal of finding a feasible NBA schedule while also learning more about advanced binary programming.

5.1 Schedule Analysis

We were able to find a feasible NBA schedule with little difficulty. As we saw in Figure 2, the schedule we found would not be viable in any realistic setting. This was because we had not told our model what makes a realistic schedule; any binary programming solution can only be as comprehensive as the model that built it. With additional constraints and more complex optimization criteria we were able to create stronger schedules. At the end of our analysis we realized there were two criteria of an NBA schedule that we did not account for: distance minimization and home/away trip length. These appear to be the two things that force a schedule to have the smooth structure that was evident in the schedule from the 2015 NBA season.

5.2 Binary Programming Implementation

This project showed both the strengths and weaknesses of binary programming. Binary programming is known for how it can reduce complex problems to simple models solvable with known algorithms, which we showed to be the case for schedule creation. We also showed that this requisite simplicity can be a weakness. Although we were easily able to find a feasible solution, problems arose when implementing advanced constraints and optimizations (consecutive days played, distance optimization). Some of our proposed implementations, although technically correct on paper, became too complex for us to implement (see footnote 8). Overall we feel we gave a comprehensive overview of approaches to creating binary programming models and discussed the pros and cons of each.

Footnote 8: See source code for full implementation.

5.2.1 Decision Variables

We saw that decision variables are the backbone of a BP model. The choice of decision variables was shown to be very important. Looking at our initial model, we see how a simple and concise set of decision variables can make problem formulation very easy. On the other hand, as we saw when we tried to account for consecutive games and distance travelled, our basic decision variable definitions were not adequate for more complex constraints. We needed to define additional decision variables in our model to even attempt a solution.

We saw how to add non-linear features to a model by implementing creative decision variables. This was likely the most exciting part of our analysis, with the most room for future work. The features we added were very simple and had very simple non-linear relationships to our basic decision variables. Although any future or improved model would incorporate more complex features, we provided a solid overview of how that can be accomplished.

5.2.2 Optimization

Throughout this analysis we saw that the objective function used in a BP model can have a huge effect on the overall solution. We first saw that if the function is not chosen properly, the solution you find, although technically feasible, can be quite poor. We then saw how a well-designed objective function can be used to shape solutions as you see fit.

One of the most important lessons was that ambiguity is not always a bad thing. Sometimes we do not want to, and should not, set strict constraints on our model. Ambiguity can be a good thing if we want to tell our model that something is important but not necessary. This approach lets the solver figure out the intricacies of a good solution rather than requiring us to model one.

5.2.3 Complexity

We were limited in our results by the complexity of realistic scheduling. We saw that it is not always easy to account for seemingly standard features, such as distance travelled or previous games played. This is due to the linearity we must have in our model. The fact is that some relationships are not linear; they require more detailed associations. This tradeoff between complexity and linearity is fundamental to the problems you can solve with binary programming.

Conclusion

Overall we can confidently conclude the project to be a success. The application of binary programming to the NBA scheduling problem was shown to be possible as well as somewhat efficient. Although a complex and realistic model was not achieved, that was never the goal of this project. Any such implementation would require thorough knowledge of not only binary programming formulation but also programming complexity. This work illustrates the complexity of schedule creation as well as both the positives and negatives of using linear programming to solve such problems.

Future Work

Any future work on this project should focus on finishing the distance optimization that we could not complete. We did not realize it at the outset of this project, but this problem is an advanced version of a problem commonly known as the traveling tournament problem. The traveling tournament problem would be a good starting point for finishing distance optimization as well as home/away game patterns. Future work would also benefit from a comprehensive analysis of BP solution methodology, an area we did not cover.

References
[1] Celso C. Ribeiro, Sports Scheduling: Problems and Applications. International Transactions in Operational Research, January 2012
[2] Matt Winick, NBA Scheduling Formula. www.nbastuffer.com, September 2014
[3] Gurobi Optimization, Gurobi Optimizer Reference Manual. https://www.gurobi.com/documentation/6.5/refman.pdf
[4] Michael Trick, Formulations and Reformulations in Integer Programming. Proceedings of the Second International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems, 2005

Appendices
Source code
Source code is too large to include here. See https://lpnba.weebly.com for full source code and all relevant files.

7.0.1 Gurobi

We used Gurobi optimization software on top of our MATLAB implementation. Using Gurobi did not require any change of syntax in our implementation, although we did use Gurobi-specific syntax when we implemented some advanced constraints. Gurobi speeds up the solve compared with MATLAB's standard integer linear programming solver, intlinprog.

7.1 Unfinished Work on Distance Optimization

The amount of control you have with your objective function is constrained by which decision variables are defined in the model. We would like to have some way to limit the distance travelled in a season, but the simplicity of our model's decision variables does not allow for that. We propose the following as a way to optimize distance.

We introduce two new decision variables:

$$z_{ijk} = \begin{cases} 1 & \text{team } i \text{ is located at arena } j \text{ on day } k \\ 0 & \text{otherwise} \end{cases} \tag{30}$$

$$y_{ijk} = \begin{cases} 1 & \text{some team goes from arena } i \text{ to arena } j \text{ on day } k \\ 0 & \text{otherwise} \end{cases} \tag{31}$$

Because there are days in a season on which teams do not play, it is very hard for the model to track the location of a team over the course of a season.

The idea is for each team to start at its home arena and, any time it moves, to update z, the variable we define to keep track of a team's location. To do this we need to impose a constraint for each day of the season for each team. We need to check whether a team plays on a given day and, if so, where it plays. If it plays, we update z accordingly; if not, we want z to remain the same as it was on the previous day. This can be achieved with the following relation:



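$$z_{i,j,k} = \begin{cases}
1 & i \ne j \ \text{and} \ \left[ x_{ijk} = 1 \ \text{OR} \ \left( z_{i,j,k-1} = 1 \ \text{AND} \ \sum_{m=1}^{30} (x_{imk} + x_{mik}) = 0 \right) \right] \\
1 & i = j \ \text{and} \ \left[ \sum_{m=1}^{30} x_{mik} = 1 \ \text{OR} \ \left( z_{i,j,k-1} = 1 \ \text{AND} \ \sum_{m=1}^{30} (x_{imk} + x_{mik}) = 0 \right) \right] \\
0 & \text{else}
\end{cases} \tag{32}$$

This implies z_{ijk} = 1 if team i plays at arena j on day k, or if team i was located at arena j on the previous day and did not play a game on day k. This is summarized in the following truth table.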

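| z_{ijk} | x_{ijk} | z_{ij(k-1)} | \sum_{m=1}^{30} (x_{imk} + x_{mik}) |
| 1 | 1 | - | - |
| 1 | 0 | 1 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 0 | 0 | 1 |
| 0 | 0 | 0 | 0 |

A dash marks an entry that does not affect the outcome.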

We propose to enforce this relation with the following constraint:

$$0 \le 2 z_{ijk} - x_{ijk} - 0.5 \, z_{ij(k-1)} + 0.5 \sum_{m=1}^{30} (x_{imk} + x_{mik}) \le 1 \tag{33}$$
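The logical rule behind the truth table can be written as a small function, which is useful for checking any candidate linear constraint against the intended behaviour. This sketch covers only the case i ≠ j from (32); the function name and argument names are chosen for illustration.

```python
def update_location(x_ijk, z_prev, games_played):
    """z_{ijk} for i != j under rule (32).

    x_ijk: 1 if team i plays at arena j on day k.
    z_prev: z_{ij(k-1)}, 1 if team i was at arena j on the previous day.
    games_played: number of games team i plays on day k (0 or 1).
    """
    if x_ijk == 1:
        return 1
    return 1 if (z_prev == 1 and games_played == 0) else 0

# Rows (z, x, z_prev, games_played); the x = 1 case is listed for both prior states.
rows = [
    (1, 1, 0, 1), (1, 1, 1, 1),
    (1, 0, 1, 0), (0, 0, 1, 1), (0, 0, 0, 1), (0, 0, 0, 0),
]
table_ok = all(update_location(x, zp, g) == z for z, x, zp, g in rows)
```

Any proposed linear encoding of this rule can be tested by enumerating the same rows and confirming that exactly the listed value of z is feasible in each one.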
