
Algorithms Project 1, Report 1

Tyler Percy, Geoffrey Cline, Alex Marey


Abstract
In this initial report we begin by thoroughly
investigating the history and relevance of the Knapsack
problem, revealing our motivation for analyzing available
algorithms. We investigate the brute-force, greedy, and
dynamic programming approaches to this problem, detailing
pseudocode and considering time complexity. Our next
report will focus on experiments with these three
approaches, producing analytical data on their efficiency.

Introduction & Motivation


The Knapsack problem exemplifies Randall Munroe's comic:
"In [Computer Science], it can be hard to explain the
difference between the easy and the virtually impossible"
[1]. The setup of the knapsack problem is easily understood,
and even appears practical: given access to a limited set of
items, each with a value and a weight, select the items that
maximize the value of your knapsack while still obeying a
weight limit [2]. Initially, the problem appears to have
roughly the same difficulty as problems like scheduling a
classroom or finding the shortest path that visits all the
landmarks in a city. However, careful inspection reveals a
number of subtleties that complicate the situation greatly.
The general concept of the problem was first recorded in 1897, with G. B. Mathews' "On the
partition of numbers" [3]. Less defined versions of the problem exist in earlier historical records,
presenting as thought problems or exercises, because the Knapsack problem tends to crop up
naturally in physical applications, like the utilization of raw resources or the structuring of
investment portfolios. It puzzles mathematicians and computer scientists still, ranking, as of
September 1999, as the 4th most needed solution [4].
Over time, the subtleties of the problem have been explored. Consider the question: can a value
of at least V be achieved without exceeding the weight W? This decision problem is NP-Complete, and
research has shown that, barring a breakthrough in mathematics and computer science, it cannot be
answered correctly with a time cost that grows at a rate bounded by a polynomial [5]. This question is a
key part of determining whether a solution is correct, meaning that it also appears impossible to
determine correctness quickly. This property appears to stem from the fact that each item can only be
selected entirely or left unselected, leading to this specific variant being called Knapsack 0-1.
Again, Knapsack 0-1's designation as NP-Complete means that there is no known efficient and
correct solution to this problem. However, approximate solutions can be generated faster than
brute-force methods. For example, a greedy approach produces an approximate solution
in O(n log n) time, where n is the number of items available.
Other variants exist that are more approachable, such as those where items can be selected multiple
times, or broken down and selected partially. Further, the quandary of Knapsack 0-1 relates to
similar difficulties in other problems, for example the also NP-Complete Traveling Salesman Problem,
which asks for the most efficient path between a list of cities that returns to the starting city [6].
Overall, the Knapsack Problem, specifically the 0-1 variant, is a problem that appears trivial but
actually has a deep history focused on revealing its subtleties. Focus remains on the problem due to its
practicality and wide range of applications.

References
[1] R. Munroe, "XKCD: Tasks," 24 August 2014. [Online]. [Accessed 26 October 2014].
[2] E. W. Weisstein, "Knapsack Problem," MathWorld--A Wolfram Web Resource, [Online]. Available:
http://mathworld.wolfram.com/KnapsackProblem.html. [Accessed 26 October 2014].
[3] G. B. Mathews, "On the partition of numbers," Proceedings of the London Mathematical Society, no. 28, pp.
486-490, 1897.
[4] S. S. Skiena, "Who is Interested in Algorithms and Why?," ACM SIGACT, vol. 30, no. 3, pp. 65-74,
1999.
[5] D. J. Cook, "KNAPSACK is NP-Complete," Washington State University, [Online]. Available:
http://www.eecs.wsu.edu/~cook/aa/lectures/l27/node24.html. [Accessed 26 October 2014].
[6] E. W. Weisstein, "Traveling Salesman Problem," MathWorld--A Wolfram Web Resource, [Online].
Available: http://mathworld.wolfram.com/TravelingSalesmanProblem.html. [Accessed 26 October
2014].

Proposed Solutions
Dynamic Programming
DP_Knapsack(A, W, n)
{
Create arrays S[0..n, 0..W] and keep[1..n, 0..W]
for (j = 0 to W)
{
S[0,j] = 0
}
for (i = 1 to n)
{
for (j = 0 to W)
{
if ( (wi <= j) and (vi + S[i-1,j-wi] > S[i-1,j]) )
{
S[i,j] = vi + S[i-1,j-wi]
keep[i,j] = True
}
else
{
S[i,j] = S[i-1,j]
keep[i,j] = False
}
}
}
// trace back through keep to output the chosen items
K = W
for (i = n downto 1)
{
if (keep[i,K] == True)
{
output i
K = K - wi
}
}
return S[n,W]
}

Complexity
At its greatest complexity, this algorithm uses a for loop and a nested for loop. The
complexity of a for loop is the sum of the complexity inside of it times the number of times it is
executed. The upper bounds for the two loops are the number of items and the maximum weight, and the
complexity inside them is O(1). Therefore the complexity of the two loops is O(nW). The algorithm
contains one more for loop, the trace-back, but it only contributes O(n) since it executes n times.
Overall the complexity of the dynamic programming approach to Knapsack is O(nW). Note that this is
pseudo-polynomial rather than truly polynomial: W is a numeric value whose magnitude can grow
exponentially in the number of bits used to encode the input, so the NP-Completeness of Knapsack
0-1 is not contradicted.

Description
Let A = {a1, ..., an} be a set of n items, where each item ai has a weight wi and a value vi.
Constructing an array S allows the algorithm to store the maximum combined value of a
combination of items under a max weight W. After computing the entries for every spot in the
array, the entry at location S[n,W] is the maximum value that can fit into the knapsack.
As stated in the complexity analysis, the dynamic approach uses a for loop and a nested for loop to
fill the entire table. The table has n+1 rows and W+1 columns, where n is the total number of items
and W is the max weight the knapsack can hold. The entry S[i,j] holds the maximum value achievable
using the first i items under a weight limit of j. In order to output the optimal set of items,
the algorithm uses a second table called keep. This table stores boolean values that record
whether item i was taken at capacity j, which is what allows the optimal solution to be recovered.
In the nested for loop, the algorithm tests whether wi is at most j and whether vi + S[i-1,j-wi] > S[i-1,j].
If this is true, it sets S[i,j] = vi + S[i-1,j-wi] and marks that position true in the boolean table.
Otherwise, S[i,j] is set to S[i-1,j] and the position in the boolean table is set to false. Once all
positions have a value, S[n,W] is the maximum value for the given set A.
However, we still do not know which items in A make up this solution. To find this, we
must look at the table of true/false markings. The algorithm starts from keep[n,W] and uses a for
loop to go from i = n down to 1, keeping a running capacity K that starts at W. Each time a true value
is found, item i is output and K is decreased by that item's weight. After the for loop finishes, the
set of chosen items has been output in reverse order.
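
To make the tabulation and trace-back described above concrete, the following is a minimal runnable sketch in Python. The function name, the representation of items as (weight, value) pairs, and the example instance are our own illustrative choices, not part of the pseudocode.

def dp_knapsack(items, W):
    """0-1 Knapsack by dynamic programming. items is a list of
    (weight, value) pairs with integer weights; W is the capacity.
    Returns (maximum value, indices of the chosen items)."""
    n = len(items)
    # S[i][j] = best value using the first i items at capacity j
    S = [[0] * (W + 1) for _ in range(n + 1)]
    keep = [[False] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = items[i - 1]
        for j in range(W + 1):
            if wi <= j and vi + S[i - 1][j - wi] > S[i - 1][j]:
                S[i][j] = vi + S[i - 1][j - wi]
                keep[i][j] = True
            else:
                S[i][j] = S[i - 1][j]
    # trace back through keep to recover the chosen items
    chosen, K = [], W
    for i in range(n, 0, -1):
        if keep[i][K]:
            chosen.append(i - 1)
            K -= items[i - 1][0]
    return S[n][W], chosen

# Example: with capacity 5, taking the first two items is optimal.
print(dp_knapsack([(2, 3), (3, 4), (4, 5)], 5))  # (7, [1, 0])

As in the pseudocode, the trace-back emits the chosen items in reverse order.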

Greedy Algorithm
GreedyKS(A, w)
// A is the given set of available items, w is the max weight
{
D = A    // working copy of the set of available items
S = {}   // the solution set to build
j = 1    // index for moving through the array of available items

// determine value-to-weight ratios for all elements
for (i = 1 to |D|)
D[i].ratio = D[i].value / D[i].weight

// Sort D based on ratio, greatest to least
// Assume comparison operator acts on ratio
Quicksort(D, 1, |D|)

// while the sum of the weights of all elements in S,
// added to the weight of the next element, will fit
// in the knapsack, and elements remain to select from D
while (j <= |D| && weight(S) + D[j].weight <= w)
{
S = S ∪ { D[j] }
j = j + 1
}
return S
}

Complexity
The complexity of setting up the algorithm's variables is constant. Each iteration of the loop that
computes the ratio for each element has a constant cost, and it loops once per element of the
initial set A, so this stage is O(|A|). Assuming the sorting algorithm used is QuickSort, the complexity
of the sorting stage is O(|A| log |A|) on average. The while loop that selects elements while space
remains in the knapsack can loop a maximum of |A| times, in the case where every available item is
selected, so its complexity is O(|A|). The most significant of these sub-complexities is
O(|A| log |A|), thus it governs the entire function.

Description
The algorithm closely follows the general form of a greedy algorithm. To select the best
available choice, the algorithm first determines the ratio of value to weight of each item. This is an
effective way to rank the items because the goal is to maximize the value of the selected items while
still obeying the weight restriction. It then continues to select the best available item that keeps the
knapsack under the weight restriction, until no more items can be selected.
This algorithm would be correct if we could select items partially, keeping the ratio intact while
exactly meeting the weight limit. Suppose we had a gold bar worth $80 that occupied 51% of the
knapsack, and two silver bars that each had a value of $50 and occupied 50% of the
knapsack. The algorithm would first select the gold bar for its better ratio, and then be unable to
select another item. However, the optimal solution is to select both silver bars, for a value of $100.
Now consider an equivalent value and weight of both the silver and the gold in liquid form: it would
be optimal to select the whole amount of gold before considering the silver, and then select just a
portion of the silver. It is not the ratio of value to weight alone, but the combinatorics of fitting
whole items into the remaining space, that prevents the greedy approach from being correct for this
problem, since handling it would require decisions to be reconsidered, the opposite of a greedy
algorithm.
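
Below is a runnable sketch of this greedy strategy in Python, together with the gold and silver counterexample from the paragraph above. The function name, item representation, and the exact capacity of 100 units are our own illustrative choices.

def greedy_knapsack(items, W):
    """Approximate 0-1 Knapsack: take items in decreasing order of
    value-to-weight ratio, stopping at the first item that no longer
    fits, as in the pseudocode above. items is a list of
    (weight, value) pairs; returns (total value, chosen indices)."""
    order = sorted(range(len(items)),
                   key=lambda i: items[i][1] / items[i][0],
                   reverse=True)
    total_w = total_v = 0
    chosen = []
    for i in order:
        w, v = items[i]
        if total_w + w > W:
            break  # next-best item no longer fits; stop as the pseudocode does
        total_w += w
        total_v += v
        chosen.append(i)
    return total_v, chosen

# The counterexample: one gold bar (51% of capacity, $80) and two
# silver bars (50% of capacity, $50 each), with capacity 100.
items = [(51, 80), (50, 50), (50, 50)]
print(greedy_knapsack(items, 100))  # (80, [0]) -- but the optimum is 100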

Brute Force
bruteforce(item[1..n], A[1..n], W)
{
// A[] is initialized to 0s and corresponds to item[]. If an element
// A[k] == 1, then item[k] is in the current combination.
bestChoice = empty array of size n      // the final array to be returned
bestValue = 0
bestWeight = 0
for (i = 1 to 2^n - 1)                  // enough iterations for all non-empty combos
{
// reset the process for each new combo
currentChoice = empty array of size n   // the items of the current solution
currentWeight = 0
currentValue = 0
m = 0                                   // used to increment item placement into currentChoice
// generate the next combo by incrementing A as a binary counter
j = n
while (j > 0 and A[j] != 0)
{
A[j] = 0
j = j - 1
}
A[j] = 1
// add the items in the current combo
for (k = 1 to n)
{
if (A[k] == 1)
{
currentWeight = currentWeight + item[k].weight
currentValue = currentValue + item[k].value
currentChoice[m] = item[k]
m = m + 1
}
}
// check if the current combo is better than the best so far
if ((currentValue > bestValue) and (currentWeight <= W))
{
bestValue = currentValue
bestWeight = currentWeight
bestChoice = currentChoice
}
}
return bestChoice   // the highest-value combination within the weight limit
}

Complexity
The outermost loop executes 2^n times in order to generate each possible combination of
items. The while loop inside executes at most n times if it has to walk through all the items, and the
for loop inside always executes n times, so both of these inner loops have a complexity of O(n).
The few statements of constant complexity inside each loop are dominated by the higher complexities
around them. All together, the outer loop executes two O(n) loops 2^n times, for a total complexity
of O(2n * 2^n). As n gets asymptotically large, the constant 2 is absorbed, resulting in a final
complexity of O(n * 2^n).

Description
The brute force method of solving the knapsack problem makes up for its inefficiency with its
simplicity. The idea is to generate every possible subset of the items and see which combination that
stays within the capacity of the knapsack has the highest value. This method uses an array of binary
elements (each equal to 1 or 0) to enumerate all of the different subsets of items. The inner for loop
reads off the binary elements to build the actual array of items to be analyzed for weight and value,
placing them in currentChoice. Finally, we check whether currentChoice has a valid weight and a
higher value than the best set of items so far. If it does, we set bestChoice equal to currentChoice;
if not, we continue. At the end we return bestChoice, the subset with the highest value and a weight
at or under the capacity.
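
For comparison, here is a compact runnable sketch of the same exhaustive search in Python. It uses itertools.product to enumerate the binary assignment vectors rather than incrementing a counter by hand, but it evaluates exactly the same 2^n combinations; the function name and item representation are our own.

from itertools import product

def brute_force_knapsack(items, W):
    """Exhaustive 0-1 Knapsack: score every subset, encoded as a
    vector of 0/1 flags, and keep the best feasible one. items is
    a list of (weight, value) pairs; returns (best value, indices)."""
    best_value, best_choice = 0, []
    for A in product((0, 1), repeat=len(items)):  # 2^n combinations
        weight = sum(w for a, (w, v) in zip(A, items) if a)
        value = sum(v for a, (w, v) in zip(A, items) if a)
        if value > best_value and weight <= W:
            best_value = value
            best_choice = [k for k, a in enumerate(A) if a]
    return best_value, best_choice

# Same example instance as before: capacity 5, optimum is value 7.
print(brute_force_knapsack([(2, 3), (3, 4), (4, 5)], 5))  # (7, [0, 1])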

Plan of Experiments
Methodology
For our experiment we are going to run three different algorithms and compare both the time it
takes each one to solve the knapsack problem and the value of the solution produced by each
approach. To reduce the noise in our outputs, we will generate the values and weights
of our items randomly according to a Gaussian distribution. We will use four graphs to show the
differences between the methods clearly. The first graph will show the execution time as we
increase the size of the input while keeping the mean and variance of the weights and values of our
randomly generated objects constant. The second graph will show the execution time as we increase
the variance while keeping the input size and mean constant. The third graph will show the value of
the solution as we increase the size of the input while again keeping the mean and variance constant.
The last graph will show the value of the solution as we increase the variance while keeping the input
size and mean constant.
Constant Values
For our arbitrary mean we will choose 50, and for our arbitrary variance we will use 10. The
capacity of our knapsack will be 5000, so that on average the knapsack will be able to hold 100 items.
For graphs that require a fixed input size we will use 300 items. Each of the graphs will have 10 data
points for each method, and each data point will be the average of 15 executions under the same
circumstances, to minimize the randomness that comes with using the Gaussian distribution.
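
The sketch below shows how the instance generation and timing could be implemented in Python, using the constants above. The function names, the rounding, and the clamp to a minimum weight of 1 are our own assumptions, needed so that the dynamic programming table stays integer-indexed.

import random
import statistics
import time

MEAN, VARIANCE, CAPACITY, INPUT_SIZE, RUNS = 50, 10, 5000, 300, 15
STD_DEV = VARIANCE ** 0.5  # random.gauss takes a standard deviation

def generate_items(n, mean=MEAN, std=STD_DEV):
    """Draw n (weight, value) pairs from a Gaussian distribution,
    rounded to integers and clamped to at least 1."""
    return [(max(1, round(random.gauss(mean, std))),
             max(1, round(random.gauss(mean, std))))
            for _ in range(n)]

def time_algorithm(alg, n=INPUT_SIZE, runs=RUNS):
    """Average wall-clock time of alg over `runs` fresh instances,
    matching the plan's 15-execution data points."""
    times = []
    for _ in range(runs):
        items = generate_items(n)
        start = time.perf_counter()
        alg(items, CAPACITY)
        times.append(time.perf_counter() - start)
    return statistics.mean(times)

# e.g. avg = time_algorithm(greedy_knapsack), using the sketch above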

Significance
We are using the graphs to show which algorithm would be good to use in a given situation. It is
important to know roughly which methods perform faster or give a better solution when you know the
size of the input and the variance, so that you can find the best method for your specific problem.
One of the biggest goals in computer science is to find the best solution to your problem, whether you
define best as fastest, most efficient, or the absolute highest value possible. The results we obtain
will give us a better idea of how the algorithms perform in a given scenario.

Expected results
Our expectations for the results of our experiment are based on our asymptotic complexities,
and are as follows. The brute force method will take the longest as the input size increases; dynamic
programming will find the result second slowest, while greedy will solve the problem in the fastest
time as the input size grows. For the graphs where we increase the variance, we expect the brute
force execution time to stay about the same, as it does not take the values into account when
enumerating subsets. The greedy time will also stay roughly the same, as the values of the ratios will
not meaningfully affect the runtime. The time of dynamic programming might increase, because it does
take the weights and values into account when running through its code. We think that as we increase
the size of the input, the value of the solution will increase for all of the methods, because there is
a better chance of drawing higher values. As we increase the variance, the value of the solution will
be less reliable; by less reliable, we mean that the increase in variance creates a higher chance of a
lower or higher solution value for all of the methods.
