Authors:
Mark Jajeh
Professor:
Zachary Feinstein
Jason Trobaugh
Mark Jajeh
NBA Scheduling
Abstract
In this paper we apply binary programming techniques to the problem of
sports scheduling, with a focus on the NBA. We analyze all phases of
producing a feasible solution with binary programming. We then show how
to implement complex features of a realistic schedule.
As the NBA grows in both popularity and size, it becomes increasingly evident
that outside influences need to be removed from the competitive balance
between individual teams. This was the main reason that the NBA recently
moved from a manually constructed schedule to a schedule created by an
autonomous algorithm. Along with ensuring equality among teams' schedules,
this algorithm should be able to maximize interest in the NBA by picking the
right teams to play at the right times. Linear programming seems like a
natural fit for this problem. This project focuses on finding elements of a
linear programming algorithm that can be used to achieve the NBA's goals in
an efficient and robust manner.
Objectives
Methods
3.1
A binary program can be used to solve any problem that can be organized
so that the goal is to optimize a linear function of binary variables subject
to linear constraints on those variables. All binary programs share a very
basic core structure and are written in the following standard format.
$$
\begin{aligned}
\min_{x} \quad & f^{\top} x \\
\text{s.t.} \quad & A x \le b \\
& x \ge 0 \\
& x \in \{0, 1\}
\end{aligned}
$$
where the symbols represent
f: The objective function. A function to be minimized relative to our
set of constraints.
x: The decision variables. A set of quantities that need to be determined
in order to solve the problem.
A, b: The constraint matrix and constants. These define the possible values
the variables of the problem can take.
We can create a binary program to produce an NBA schedule with
notation no more complicated than the form above. Once we do this, we can
use common linear programming algorithms to find solutions to our
problem.
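To make the standard form concrete, here is a minimal sketch in Python that solves a tiny binary program by enumerating every 0/1 assignment. The function name `solve_binary_program` and the toy data are illustrative assumptions and not part of the model in this paper. Enumeration is only workable for a handful of variables; a realistic instance would be handed to a solver.

```python
from itertools import product


def solve_binary_program(f, A, b):
    """Minimize f.x subject to A x <= b over x in {0, 1}^n by enumeration."""
    n = len(f)
    best_x, best_val = None, None
    for x in product((0, 1), repeat=n):
        # Check every row of A x <= b for this candidate assignment.
        feasible = all(
            sum(a * xj for a, xj in zip(row, x)) <= bi
            for row, bi in zip(A, b)
        )
        if not feasible:
            continue
        val = sum(fj * xj for fj, xj in zip(f, x))
        if best_val is None or val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Toy instance (hypothetical): choose at least two of three items at minimum cost.
# The requirement x1 + x2 + x3 >= 2 is written as -x1 - x2 - x3 <= -2.
f = [3, 1, 2]
A = [[-1, -1, -1]]
b = [-2]
best_x, best_val = solve_binary_program(f, A, b)
```

Note that a "greater than or equal" requirement fits the standard form once both sides are negated, which is the same trick needed for any lower-bound constraint in a schedule model.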
3.2
NBA Structure/Constraints
Below are the aspects of a sports schedule that are unique to the NBA. We
must meet all of the below requirements to consider any schedule feasible.
3.3
Implementation
3.3.1
Decision Variables
We use integer binary variables in our model to solve the NBA scheduling
problem. We define the following decision variables:
$$
x_{i,j,k} =
\begin{cases}
1 & \text{if team } i \text{ plays at team } j \text{ on day } k \\
0 & \text{else}
\end{cases}
$$
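As a rough illustration of the size of this variable set, the sketch below counts the variables and maps a triple (i, j, k) to a flat position, which is one common way to line variables up with the columns of a constraint matrix. It assumes 30 teams and the 170-day season length used in this paper. The helper name `var_index` is hypothetical, and the count includes the entries with i = j, which a model would presumably fix to zero.

```python
N_TEAMS = 30
N_DAYS = 170  # season length assumed in this paper (2015-2016 season)


def var_index(i, j, k):
    """Map 1-based (team i, team j, day k) to a 0-based flat index."""
    return ((i - 1) * N_TEAMS + (j - 1)) * N_DAYS + (k - 1)


# One binary variable per ordered pair of teams and per day.
n_vars = N_TEAMS * N_TEAMS * N_DAYS
```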
Not only does the definition of decision variables in this manner
simplify the problem notationally; this definition creates an intuitive way to
1. I could not find an exact number for this, so I used the length of the
2015-2016 season, which was 170 days.
2. A five-year rotation determines which out-of-division conference teams are
played only three times... I could not find any information on the exact
rotation, so my implementation assumes the same games that were played in the
2015-2016 season. See the appendix for more details.
3.4
Basic Constraints
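\begin{align}
\sum_{j=1}^{30} \left( x_{ijk} + x_{jik} \right) &\le 1, && i = 1,\dots,30,\ k = 1,\dots,170 \tag{2} \\
\sum_{k=1}^{170} \sum_{j=1}^{30} x_{ijk} &= 41, && i = 1,\dots,30 \quad \text{(away games of team } i\text{)} \tag{3} \\
\sum_{k=1}^{170} \sum_{i=1}^{30} x_{ijk} &= 41, && j = 1,\dots,30 \quad \text{(home games of team } j\text{)} \tag{4} \\
\sum_{k=1}^{170} x_{ijk} &= 2, && j \in \mathrm{div}_i,\ i = 1,\dots,30 \tag{5} \\
\sum_{k=1}^{170} x_{ijk} &= 2, && i \in \mathrm{div}_j,\ j = 1,\dots,30 \tag{6} \\
\sum_{k=1}^{170} x_{ijk} &= 2, && j \in \mathrm{conf}_i,\ i = 1,\dots,30 \tag{7} \\
\sum_{k=1}^{170} x_{ijk} &= 2, && i \in \mathrm{conf}_j,\ j = 1,\dots,30 \tag{8} \\
\sum_{k=1}^{170} \left( x_{ijk} + x_{jik} \right) &= 3, && j \in \mathrm{conf3}_i \ \text{or}\ i \in \mathrm{conf3}_j \tag{9} \\
\sum_{k=1}^{170} x_{ijk} &\le 2, && i \in \mathrm{conf3}_j \tag{10} \\
\sum_{k=1}^{170} x_{ijk} &\le 2, && j \in \mathrm{conf3}_i \tag{11} \\
\sum_{k=1}^{170} x_{ijk} &= 1, && j \notin \mathrm{conf}_i,\ i = 1,\dots,30 \tag{12} \\
\sum_{k=1}^{170} x_{ijk} &= 1, && i \notin \mathrm{conf}_j,\ j = 1,\dots,30 \tag{13}
\end{align}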
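The per-team game counts implied by these constraints can be checked with simple arithmetic. The sketch below assumes the published NBA format of 4 division opponents, 6 conference opponents played four times, 4 conference opponents played three times, and 15 non-conference opponents. These group sizes are not stated in the text above, and the even split of the three-game opponents between "two at home" and "two away" is an assumption needed to reach 41 home games.

```python
# Each entry: (number of opponents, home games vs each, away games vs each).
groups = {
    "division": (4, 2, 2),
    "conference, four games": (6, 2, 2),
    "conference, three games, home-heavy": (2, 2, 1),  # assumed split
    "conference, three games, away-heavy": (2, 1, 2),  # assumed split
    "non-conference": (15, 1, 1),
}

opponents = sum(n for n, _, _ in groups.values())
home_games = sum(n * home for n, home, _ in groups.values())
away_games = sum(n * away for n, _, away in groups.values())
total_games = home_games + away_games
```

Under these assumptions every team meets 29 opponents and plays 41 home and 41 away games, which matches the two season-total constraints.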
3.5
3.5.1
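We would like to ensure that there are either at least X games played per day
or at most X games played per day throughout the entire league. We can do
this as follows.
At most X games per day:
$$
\frac{1}{2} \sum_{i,j} x_{ijk} \le X, \qquad k = 1, \dots, 170 \tag{14}
$$
At least X games per day:
$$
\frac{1}{2} \sum_{i,j} x_{ijk} \ge X, \qquad k = 1, \dots, 170 \tag{15}
$$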
3.5.2
X games in 7 days
(16)
2. We can set a constraint on the number of games played over all 7 day
stretches in a season. We can write this as
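$$
\sum_{d=k}^{k+6} \left( \sum_{j=1}^{30} x_{ijd} + \sum_{j=1}^{30} x_{jid} \right) \le W,
\qquad i = 1, \dots, 30,\ k = 1, \dots, 164 \tag{17}
$$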
Although these two formulations appear very similar and can often produce
similar results, they have very different effects on our model. The main
thing to note is that constraining all 7-day stretches imposes
30 × 164 = 4920 additional constraints, while the weekly constraint imposes
only 30 × 25 = 750 constraints. The tradeoff between model complexity and
effect on the solution is what we analyze in our results.
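The constraint counts, and the difference in what the two formulations actually restrict, can be illustrated with a short Python sketch. It assumes that "weekly" means consecutive non-overlapping 7-day blocks starting on day 1 (the last block being shorter); the function names are hypothetical.

```python
import math

N_TEAMS = 30
N_DAYS = 170
WINDOW = 7

sliding_windows = N_DAYS - WINDOW + 1       # windows starting on days 1..164
weekly_blocks = math.ceil(N_DAYS / WINDOW)  # 24 full weeks plus one partial week


def max_games_sliding(game_days, n_days=N_DAYS, window=WINDOW):
    """Largest number of games in any stretch of `window` consecutive days."""
    days = set(game_days)
    return max(
        sum(1 for d in range(k, k + window) if d in days)
        for k in range(1, n_days - window + 2)
    )


def max_games_weekly(game_days, window=WINDOW):
    """Largest number of games in any fixed block (days 1-7, 8-14, ...)."""
    counts = {}
    for d in game_days:
        block = (d - 1) // window
        counts[block] = counts.get(block, 0) + 1
    return max(counts.values(), default=0)


# A team playing on six consecutive days that straddle two calendar weeks.
game_days = [5, 6, 7, 8, 9, 10]
```

For this example the weekly view sees at most 3 games in a block, while the sliding view sees 6 games in a single 7-day stretch, so a weekly limit can be satisfied by a schedule that the sliding limit would reject.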
3.6
3.7
Optimization
not affect the model, if it is not designed correctly it can greatly hurt our
solution. We will test and analyze the following optimization functions to
illustrate what effect an objective function has on a solution.
3.7.1
Assigning Value
Assume that we have given some value $V_t$ to each team in the league. We
have also assigned a value $V_d$ to each day of the season. To create an
objective function $f$, for each decision variable $x_{i,j,k}$ we define a
function based on these three values. It is not obvious what this function
should be. We look at several basic approaches to this problem.
\begin{align}
f &= V_t(i) + V_t(j) + V_d(k) \tag{18} \\
f &= V_t(i)\, V_t(j) + V_d(k) \tag{19} \\
f &= V_t(i)\, V_t(j)\, V_d(k) \tag{20}
\end{align}
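Because each of these functions is evaluated on fixed indices, it yields one constant coefficient per decision variable, so the objective stays linear in x even when the values are multiplied together. The following Python sketch builds those coefficients for a toy league. The function name, the rule names, and the choice of values equal to the team and day indices are assumptions made for illustration.

```python
def objective_coefficients(team_value, day_value, rule):
    """Return {(i, j, k): f} for every ordered pair of distinct teams and every day."""
    rules = {
        "sum": lambda vi, vj, vd: vi + vj + vd,               # equation (18)
        "product_plus_day": lambda vi, vj, vd: vi * vj + vd,  # equation (19)
        "product": lambda vi, vj, vd: vi * vj * vd,           # equation (20)
    }
    f = rules[rule]
    return {
        (i, j, k): f(vi, vj, vd)
        for i, vi in team_value.items()
        for j, vj in team_value.items()
        if i != j
        for k, vd in day_value.items()
    }


# Toy data (hypothetical): three teams and two days, valued by their index.
team_value = {i: i for i in range(1, 4)}
day_value = {k: k for k in range(1, 3)}

c_sum = objective_coefficients(team_value, day_value, "sum")
c_mixed = objective_coefficients(team_value, day_value, "product_plus_day")
c_prod = objective_coefficients(team_value, day_value, "product")
```

The resulting dictionary can be read as the vector f of the standard form, with one entry per variable.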
3.8
Adding Complexity
We now explore how we can incorporate complex features into our model by
adding new decision variables.
3.8.1
(24)
(25)
(26)
3.8.2
Tracking Location
2. Rest days. Teams can have some number of days between games,
exponentially increasing the complexity of accounting for travel paths.
Unfortunately, we could not find a way to accurately account for distance
travelled in our model. We have a proposed solution that still needs some
work, which can be found in the appendix. We instead focused on the simpler
problem of removing rest days altogether and then determining teams'
locations. We are able to do this by adding the following decision variable
and constraints.
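$$
z_{ijk} =
\begin{cases}
1 & \text{if team } i \text{ is located at arena } j \text{ on day } k \\
0 & \text{else}
\end{cases}
\tag{27}
$$
$$
z_{iik} = \sum_{j=1}^{30} x_{jik} \tag{28}
$$
$$
z_{ijk} = x_{ijk}, \qquad i, j = 1, \dots, 30 : i \ne j,\ k = 1, \dots, 82 \tag{29}
$$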
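With rest days removed, every team plays exactly one game on each of the 82 days, so its location on a given day is simply the arena of the host in that game. The Python sketch below derives locations from a toy schedule under that assumption. The data layout (a dictionary from day to a list of (visitor, host) pairs) and the function name are hypothetical.

```python
def locations_without_rest_days(games_by_day, teams):
    """Return {(team, day): arena}, assuming every team plays once per day."""
    location = {}
    for day, games in games_by_day.items():
        for visitor, host in games:
            for team in (visitor, host):
                if (team, day) in location:
                    raise ValueError(f"team {team} plays twice on day {day}")
                # Both the visitor and the host are at the host's arena.
                location[(team, day)] = host
        idle = [t for t in teams if (t, day) not in location]
        if idle:
            raise ValueError(f"teams {idle} have a rest day on day {day}")
    return location


# Toy schedule (hypothetical): four teams, two days, no rest days.
teams = [1, 2, 3, 4]
games_by_day = {
    1: [(1, 2), (3, 4)],  # team 1 plays at team 2, team 3 plays at team 4
    2: [(2, 3), (4, 1)],
}
location = locations_without_rest_days(games_by_day, teams)
```

The check for idle teams makes the no-rest-day assumption explicit: as soon as a team has a day off, this simple rule no longer determines where it is.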
Results
4.1
Constraint Results
With a quick look we can see that the real schedule is much more
organized than our own. There is a balance between games and rest days as
well as an order in which locations the games are played.
4.2
Basic Optimization
[Figure: schedules obtained with the objective functions (a) f = i + j + k, (b) f = i*j + k, (c) f = i*j*k]
In figure (a) we see a trivial result: all games are played in the second
half of the season. This is technically feasible but completely impractical.
In this case our solution was actually made worse by adding an objective
function. Figure (b) shows the same result as (a), although it was created
with a different objective. It is important to remember that the effect of
these functions depends on where their optimum lies relative to the
constraints of the model. If two functions are maximal or minimal at the same
place, for all intents and purposes they do the same thing to our model. In
figure (c) we first see how we can use an objective function to our benefit.
Although the result is again impractical, we now see a result that seems to
logically follow our objective. It is assumed that we are using the objective
function seen in figure (c), f = i*j*k, in all the figures that follow.
4.3
Game Distribution
This result is remarkably similar to those we found by adding constraints to our model. This is important as it highlights how similar features
can be achieved in different ways with linear programming.
To make the result more realistic, we ensure that games are played every day
of the season and enforce a maximum number of games over a certain number of
days. Below is the schedule we found when we limited the number of games per
7-day stretch to 4 and required at least 12 games per day.
We can see that games are now more spread out over the course of the season
while still preserving some of the optimal features we would want in our
schedule (in this test case, games between teams with higher indices being
played on later days). The main problem we still have is that individual
teams' games are closely grouped together.
Consecutive Games
Once we updated our model with the variables and constraints in section
3.6.1 we found the following:
This solution was found with a zero objective function. It is not immediately
clear why the solution looks as it does without optimizing for fewer
consecutive games.
4.4
Discussion
also saw how binary programming can be used in complex problem solving.
Although we did not find a perfect NBA schedule or analyze all aspects of
linear programming, we did meet our goal of finding a feasible NBA schedule
while also learning more about advanced binary programming.
5.1
Schedule Analysis
We were able to find a feasible NBA schedule with little difficulty. As we
saw in figure 2, the schedule we found would not be feasible in any realistic
setting. We saw that this was because we had not told our model what makes a
realistic schedule; any binary programming solution can only be as
comprehensive as the model that built it. With additional constraints and
more complex optimization criteria we were able to create stronger schedules.
At the end of our analysis we realized there were two criteria of an NBA
schedule that we did not account for: distance minimization and home/away
trip length. These appear to be the two things that force a schedule to have
the smooth structure that was evident in the schedule from the 2015 NBA
season.
5.2
This project showed both the strengths and weaknesses of binary programming.
Binary programming is known for how it can reduce complex problems to
simple models solvable with known algorithms, which we showed to be the
case for schedule creation. We also showed that this requisite simplicity
can be a weakness. Although we were easily able to find a feasible solution,
problems arose when implementing advanced constraints and optimizations
(consecutive days played, distance optimization). Some of our proposed
implementations, although technically correct on paper, became too complex
for us to implement. Overall we feel we gave a comprehensive overview of
approaches to creating binary programming models and discussed the pros and
cons of each.
5.2.1
Decision Variables
We saw that decision variables are the backbone of a BP model. The choice
of decision variables was shown to be very important. Looking at our initial
model, we see how a simple and concise set of decision variables can make
problem formulation very easy. On the other hand, as we saw when we tried
to account for consecutive games and/or distance travelled, our basic decision
variable definitions were not adequate for more complex constraints. We
needed to define additional decision variables in our model to even attempt a
solution.
We saw how to add nonlinear features to a model by introducing creative
decision variables. This was likely the most exciting part of our analysis
and the one with the most room for future work. The features we added were
very simple and had very simple nonlinear relationships to our basic decision
variables. Although any future or improved model would incorporate more
complex features, we provided a solid overview of how that can be
accomplished.
5.2.2
Optimization
5.2.3
Complexity
linearity we must have in our model. The fact is that some relationships are
not linear; they require more detailed associations. This tradeoff between
complexity and linearity is fundamental to the problems you can solve with
binary programming.
Conclusion
Overall we can confidently conclude the project to be a success. The
application of binary programming to the NBA scheduling problem was shown to
be possible as well as somewhat efficient. Although a complex and realistic
model was not achieved, that was never the goal of this project. Any such
implementation would require thorough knowledge of not only binary
programming formulation but also programming complexity. This work
illustrates the complexity of schedule creation as well as both the positives
and negatives of using linear programming to solve such problems.
Future Work
Any future work on this project should focus on completing the distance
optimization that we could not finish. We did not realize it at the outset of
this project, but this problem is an advanced version of a problem commonly
known as the traveling tournament problem. The traveling tournament problem
would be a good starting point for finishing distance optimization as well as
home/away game patterns. Future work would also benefit from a comprehensive
analysis of BP solution methodology, an area we did not cover.
Appendices
Source code
Source code is too large to include here. See https://lpnba.weebly.com for
the full source code and all relevant files.
7.0.1
Gurobi
7.1
The amount of control you have with your objective function is constrained
by which decision variables are defined in the model. We would like to have
some way to limit the distance travelled in a season, but the simplicity of
our model's decision variables does not allow for that. We propose the
following as a way to optimize distance.
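We introduce two new decision variables:
$$
z_{ijk} =
\begin{cases}
1 & \text{if team } i \text{ is located at arena } j \text{ on day } k \\
0 & \text{else}
\end{cases}
\tag{30}
$$
$$
y_{ijk} =
\begin{cases}
1 & \text{if some team goes from arena } i \text{ to arena } j \text{ on day } k \\
0 & \text{else}
\end{cases}
\tag{31}
$$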
Because there are days in a season on which teams do not play, it is very
hard for the model to track the location of a team over the course of a
season. The idea is for each team to start at its home arena and, any time it
moves, to update z, which is the variable we define to keep track of a team's
location. To do this we need to impose a constraint for each day of the
season for each team. We need to check whether a team plays on a given day
and, if so, where it plays. If it does, we update z accordingly; if not, we
want z to remain the same as it was on the previous day. This can be achieved
with the following relation:
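$$
z_{i,j,k} =
\begin{cases}
1 & \text{if } i \ne j \text{ and } \Big( x_{ijk} = 1 \ \text{OR}\ \big( z_{i,j,k-1} = 1 \ \text{AND}\ \sum_{m=1}^{30} (x_{imk} + x_{mik}) = 0 \big) \Big) \\
1 & \text{if } i = j \text{ and } \Big( \sum_{m=1}^{30} x_{mik} = 1 \ \text{OR}\ \big( z_{i,j,k-1} = 1 \ \text{AND}\ \sum_{m=1}^{30} (x_{imk} + x_{mik}) = 0 \big) \Big) \\
0 & \text{else}
\end{cases}
\tag{32}
$$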
This implies $z_{ijk} = 1$ if team i plays at arena j on day k, or if team i
was located at arena j on the previous day and did not play a game on
day k. This is summarized in the following truth table.
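$$
\begin{array}{c|c|c|c}
z_{ijk} & x_{ijk} & z_{ij(k-1)} & \sum_{m=1}^{30} (x_{imk} + x_{mik}) \\
\hline
1 & 1 & & \\
1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{array}
$$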
$$
\sum_{j=1}^{30} (x_{ijk} + x_{jik}) \le 1 \tag{33}
$$
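The update rule behind the truth table can be sketched as a small Python function, together with a tracker that follows one team through a schedule that includes rest days. This is only an illustration of the logic for an arena j other than the team's own, evaluated outside of any solver; it does not show how to encode the rule as linear constraints. The function names and the schedule layout (day mapped to a list of (visitor, host) pairs) are hypothetical.

```python
def location_flag(x_ijk, z_prev, games_played):
    """z_ijk for j != i, given x_ijk, z_ij(k-1) and team i's game count on day k."""
    if x_ijk == 1:
        return 1  # team i plays at arena j today
    if z_prev == 1 and games_played == 0:
        return 1  # team i was at arena j and rests today, so it stays
    return 0


def track_location(team, games_by_day, n_days):
    """Arena occupied by `team` on days 1..n_days, starting at its home arena."""
    arena = team
    path = []
    for day in range(1, n_days + 1):
        for visitor, host in games_by_day.get(day, []):
            if team in (visitor, host):
                arena = host  # the team moves only when it plays
        path.append(arena)
    return path


# Toy schedule (hypothetical): team 1 visits team 3 on day 2, hosts team 2 on day 4.
games_by_day = {2: [(1, 3)], 4: [(2, 1)]}
path = track_location(1, games_by_day, 5)
```

On the rest day (day 3) the team remains at arena 3, which is the "previous location carries over" row of the truth table.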