
Advanced Analysis of Algorithms

Chapter 1
Dr. M. Sikander Hayat Khiyal Chairperson Department of Computer Science/Software Engineering, Fatima Jinnah Women University, Rawalpindi, PAKISTAN m.sikandarhayat@yahoo.com

EVERY CASE TIME COMPLEXITY


For a given algorithm, T(n) is defined as the number of times the algorithm does its basic operation for an instance of size n. T(n) is called the every case time complexity of the algorithm, and the determination of T(n) is called an every case time complexity analysis.

WORST CASE TIME COMPLEXITY


For a given algorithm, W(n) is defined as the maximum number of times the algorithm will ever do its basic operations for an input of size n. W(n) is called the worst case time complexity of the algorithm, and the determination of W(n) is called the worst case time complexity analysis. If T(n) exists then clearly: W(n) = T(n)

AVERAGE CASE TIME COMPLEXITY


For a given algorithm, A(n) is defined as the average (or expected value) number of times the algorithm does its basic operations for an input of size n. A(n) is called the average case time complexity of the algorithm, and the determination of A(n) is called the average case time complexity analysis. If T(n) exists then clearly: A(n) = T(n)

BEST CASE TIME COMPLEXITY


For a given algorithm, B(n) is defined as the minimum number of times the algorithm will ever do its basic operations for an input of size n. B(n) is called the best case time complexity of the algorithm, and the determination of B(n) is called the best case time complexity analysis. If T(n) exists then clearly: B(n) = T(n)
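These four quantities can be illustrated with sequential search, whose basic operation is the comparison of the search key with an array item: B(n) = 1, W(n) = n, and, when the key is present and all slots equally probable, A(n) = (n + 1)/2. A small counting sketch (Python, not part of the original notes; the helper name is mine):

```python
def seq_search_comparisons(S, x):
    """Sequential search; return the number of key comparisons performed."""
    count = 0
    for item in S:
        count += 1              # one basic operation: compare item with x
        if item == x:
            break
    return count

S = list(range(1, 101))         # n = 100 distinct keys

best = seq_search_comparisons(S, S[0])     # B(n) = 1
worst = seq_search_comparisons(S, S[-1])   # W(n) = n = 100
avg = sum(seq_search_comparisons(S, k) for k in S) / len(S)   # A(n) = 50.5
```

Because B(n) differs from W(n) here, no every case complexity T(n) exists for this algorithm.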

ANALYSIS SUMMARY OF SORTING ALGORITHMS

Algorithm          Comparisons of keys              Assignments of records         Extra space usage
Exchange Sort      T(n) = n²/2                      W(n) = 3n²/2, A(n) = 3n²/4     In place
Insertion Sort     W(n) = n²/2, A(n) = n²/4         W(n) = n²/2, A(n) = n²/4       In place
Selection Sort     T(n) = n²/2                      T(n) = 3n                      In place
Merge Sort         W(n) = n lg n, A(n) = n lg n     T(n) = 2n lg n                 Θ(n) records
Merge Sort (D.P.)  W(n) = n lg n, A(n) = n lg n     T(n) = 0                       Θ(n) links
Quick Sort         W(n) = n²/2, A(n) = 1.38 n lg n  A(n) = 0.69 n lg n             Θ(lg n) indices
Heap Sort          W(n) = 2n lg n, A(n) = 2n lg n   W(n) = n lg n, A(n) = n lg n   In place
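The Insertion Sort entries above can be checked by direct counting; a sketch (Python, not from the original notes) that counts key comparisons and hits the worst case W(n) = n(n−1)/2 ≈ n²/2 on a reversed array:

```python
def insertion_sort_comparisons(a):
    """Insertion sort a in place; return the number of key comparisons."""
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1            # basic operation: compare a[j] with key
            if a[j] > key:
                a[j + 1] = a[j]         # record assignment (shift right)
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 50
a = list(range(n, 0, -1))               # reversed input: the worst case
worst = insertion_sort_comparisons(a)   # n(n-1)/2 = 1225 for n = 50
```

On an already sorted array the same counter reports only n − 1 comparisons, the best case.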

COMPUTATIONAL COMPLEXITY
Computational complexity is the study of all possible algorithms that can solve a given problem. A computational complexity analysis tries to determine a lower bound on the efficiency of all algorithms for a given problem. It is a field that goes hand in hand with algorithm design and analysis.

LOWER BOUND FOR SORTING ALGORITHMS


In general, we are concerned with sorting n distinct keys that come from any ordered set. However, without loss of generality, we can assume that the keys to be sorted are simply the positive integers 1, 2, ..., n, because we can substitute 1 for the smallest key, 2 for the second-smallest key, and so on. For example, the alphabetic input [Ralph, Clyde, Dave] corresponds to [3, 1, 2] if we associate 1 with Clyde, 2 with Dave, and 3 with Ralph. Any algorithm that sorts these integers only by comparison of keys would have to do the same number of comparisons to sort the three names.

A permutation of the first n positive integers can be thought of as an ordering of those integers; because there are n! permutations of the first n positive integers, there are n! different orderings. For example, for the first three positive integers there are the six permutations
[1,2,3] [1,3,2] [2,1,3] [2,3,1] [3,1,2] [3,2,1]
This means that there are n! different inputs containing n distinct keys; these six permutations are the different inputs of size 3. A permutation is denoted [k1, k2, ..., kn]. An inversion in a permutation is a pair (ki, kj) such that i < j and ki > kj.

For example, the permutation [3,2,4,1,6,5] contains the inversions (3,2), (3,1), (2,1), (4,1), and (6,5). Clearly a permutation contains no inversions if and only if it is the sorted ordering [1, 2, ..., n]. This means that the task of sorting n distinct keys is the removal of all inversions in a permutation.
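The inversions of a permutation can be enumerated directly; a brief sketch (Python, helper name mine) reproducing the example above:

```python
from itertools import combinations

def inversions(perm):
    """Return the set of inversions (ki, kj) with i < j and ki > kj."""
    return {(perm[i], perm[j])
            for i, j in combinations(range(len(perm)), 2)
            if perm[i] > perm[j]}

inv = inversions([3, 2, 4, 1, 6, 5])
# {(3, 2), (3, 1), (2, 1), (4, 1), (6, 5)}: five inversions

# The sorted ordering has none:
assert inversions([1, 2, 3, 4, 5, 6]) == set()
```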

THEOREM 1: Any algorithm that sorts n distinct keys only by comparison of keys and removes at most one inversion after each comparison must in the worst case do at least n(n−1)/2 comparisons of keys, and on the average do at least n(n−1)/4 comparisons of keys.
PROOF: For the worst case, consider the input [n, n−1, ..., 2, 1]: every pair is an inversion, so it contains (n−1) + (n−2) + ··· + 1 = n(n−1)/2 inversions (an arithmetic sum), and removing them one comparison at a time requires at least n(n−1)/2 comparisons of keys.
For the average case, pair each permutation [kn, ..., k2, k1] with the permutation [k1, k2, ..., kn]; the former is called the transpose of the original permutation. Let r and s be integers between 1 and n such that s > r.

Given a permutation, the pair (s, r) is an inversion in either the permutation or its transpose, and never in both. Because there are n(n−1)/2 pairs (s, r) with 1 ≤ r < s ≤ n, a permutation and its transpose together contain exactly n(n−1)/2 inversions. Thus the average number of inversions in a permutation and its transpose is (1/2)(n(n−1)/2) = n(n−1)/4. Therefore, if we consider all permutations equally probable for the input, the average number of inversions in the input is also n(n−1)/4. Because we assumed that the algorithm removes at most one inversion after each comparison, on the average it must do at least this many comparisons to remove all inversions and thereby sort the input.
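The transpose pairing and the resulting average of n(n−1)/4 inversions can be verified exhaustively for small n; a brute-force sketch (Python, helper name mine), assuming all n! permutations equally probable:

```python
from itertools import combinations, permutations

def count_inversions(perm):
    """Number of pairs (i, j) with i < j and perm[i] > perm[j]."""
    return sum(1 for i, j in combinations(range(len(perm)), 2)
               if perm[i] > perm[j])

n = 4
perms = list(permutations(range(1, n + 1)))    # all n! = 24 orderings

# Each pair (s, r) is an inversion in a permutation or in its transpose
# (reversal), never both, so the two together hold n(n-1)/2 inversions.
assert all(count_inversions(p) + count_inversions(p[::-1]) == n * (n - 1) // 2
           for p in perms)

avg = sum(count_inversions(p) for p in perms) / len(perms)   # n(n-1)/4 = 3.0
```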

Lower bounds for sorting only by comparison of keys
Decision tree for a sorting algorithm: We can associate a binary tree with a procedure that sorts three keys a, b, c by placing the comparison a < b at the root, taking the left branch when a < b and the right branch when a ≥ b.
[Figure: a binary decision tree for three keys. The root compares a < b; second-level nodes compare b < c; third-level nodes compare a < c; the six leaves give the sorted orders a,b,c; a,c,b; c,a,b; b,a,c; b,c,a; c,b,a.]

Fig. 1

This tree is called a decision tree, because at each node a decision must be made as to which node to visit next. A decision tree is called Valid for sorting n keys if, for each permutation of the n keys, there is a path from the root to a leaf that sorts that permutation; that is, it can sort every input of size n. The tree above is valid, but it would no longer be valid if we removed any branch from it. Now consider the decision tree for Exchange Sort when sorting three keys (Fig. 2).

[Figure: the decision tree for Exchange Sort on three keys. The root compares b < a; deeper nodes compare c < b, c < a, and a < b; the six leaves give the sorted orders c,b,a; b,c,a; c,a,b; a,c,b; a,b,c; b,a,c.]

Fig. 2

A decision tree is Pruned if every leaf can be reached from the root by making some consistent sequence of decisions. Note that for Exchange Sort the comparison c < b means that the current value of c is compared with the current value of b, not S[3] with S[2].

LEMMA 1: To every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid, binary decision tree containing exactly n! leaves.
PROOF: There is a pruned, valid decision tree corresponding to any deterministic algorithm for sorting n keys. When all the keys are distinct, the result of a comparison is always < or >. Therefore, each comparison node in that tree has at most two children, which means that it is a binary tree. Because there are n! different inputs containing n distinct keys, and because a decision tree is valid for sorting n distinct keys only if it has a leaf for every input, the tree has at least n! leaves. Because there is a unique path in the tree for each of the n! different inputs, and because every leaf in a pruned tree must be reachable, the tree can have no more than n! leaves. Therefore, the tree has exactly n! leaves.

LOWER BOUND FOR WORST CASE BEHAVIOUR LEMMA 2: The worst case number of comparisons done by a decision tree is equal to its depth. PROOF: Given some input, the number of comparisons done by a decision tree is the number of internal nodes on the path followed for that input. The number of internal nodes is the same as the length of the path. Therefore, the worst case number of comparisons done by a decision tree is the length of the longest path to a leaf, which is the depth of the decision tree.

LEMMA 3: If m is the number of leaves in a binary tree and d is its depth, then d ≥ ⌈lg m⌉.
PROOF: By induction on d, we show that 2^d ≥ m.
Induction base: A binary tree with depth 0 has one node that is both the root and the only leaf. Therefore, for such a tree the number of leaves m equals 1, and 2^0 ≥ 1.
Induction hypothesis: Assume that for every binary tree with depth d, 2^d ≥ m, where m is the number of leaves.
Induction step: We need to show that, for a binary tree with depth d + 1, 2^(d+1) ≥ m, where m is the number of leaves. If we remove all the leaves from such a tree, we have a tree with depth d whose leaves include the parents of the leaves in the original tree. If m' is the number of these parents, then by the induction hypothesis 2^d ≥ m'. Because each parent has at most two children, 2m' ≥ m. Thus 2^(d+1) = 2·2^d ≥ 2m' ≥ m. Proof completed.
Now taking lg of 2^d ≥ m gives d ≥ lg m; because d is an integer, d ≥ ⌈lg m⌉.

THEOREM 2: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must in the worst case do at least ⌈lg(n!)⌉ comparisons of keys.
PROOF: By Lemma 1, to any such algorithm there corresponds a pruned, valid, binary decision tree with n! leaves, and by Lemma 3 its depth d ≥ ⌈lg(n!)⌉. The theorem then follows from Lemma 2, because the worst case number of comparisons done by a decision tree equals its depth.

LEMMA 4: For any positive integer n, lg(n!) ≥ n lg n − 1.45n.
PROOF:
lg(n!) = lg[n(n−1)(n−2)···(2)(1)] = Σ (i = 2 to n) lg i        (since lg 1 = 0)
Bounding the sum from below by an integral,
lg(n!) ≥ ∫ (1 to n) lg x dx = (1/ln 2)[n ln n − n + 1] ≥ n lg n − 1.45n
(note that 1/ln 2 ≈ 1.443 < 1.45). Proof completed.

THEOREM 3: Any deterministic algorithm that sorts n distinct keys only by comparison of keys must in the worst case do at least n lg n − 1.45n comparisons of keys.
PROOF: Immediate from Theorem 2 and Lemma 4.
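Lemma 4's bound can be checked numerically; a quick sketch (Python, helper name mine):

```python
import math

def lg_factorial(n):
    """lg(n!) computed directly as a sum of logarithms (lg 1 = 0)."""
    return sum(math.log2(i) for i in range(2, n + 1))

# Lemma 4: lg(n!) >= n lg n - 1.45n  (note 1/ln 2 ~ 1.443 < 1.45).
for n in (10, 100, 1000):
    assert lg_factorial(n) >= n * math.log2(n) - 1.45 * n
```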

LOWER BOUNDS FOR AVERAGE CASE BEHAVIOUR
Definition: A binary tree in which every nonleaf contains exactly two children is called a 2-tree.
LEMMA 5: To every pruned, valid, binary decision tree for sorting n distinct keys, there corresponds a pruned, valid decision 2-tree that is at least as efficient as the original tree.
PROOF: If the pruned, valid, binary decision tree corresponding to a deterministic algorithm for sorting n distinct keys contains any comparison nodes with only one child, we can replace each such node by its child and prune the result to obtain a decision tree that sorts using no more comparisons than the original (Fig. 2). Every nonleaf in the new tree then contains exactly two children.

Definition: The external path length (EPL) of a tree is the total length of all paths from the root to the leaves. For Figure 1, EPL = 2 + 3 + 3 + 3 + 3 + 2 = 16. Because the EPL is the total number of comparisons the decision tree does over all n! different inputs of size n, the average number of comparisons is EPL/n!. Let min EPL(m) denote the minimum EPL of any 2-tree with m leaves.
LEMMA 6: Any deterministic algorithm that sorts n distinct keys only by comparison of keys must on the average do at least min EPL(n!)/n! comparisons of keys.

PROOF: By Lemma 1, to every deterministic algorithm for sorting n distinct keys there corresponds a pruned, valid, binary decision tree containing n! leaves. By Lemma 5, we can convert that binary tree to a 2-tree that is at least as efficient. Because the original tree has n! leaves, so must the 2-tree we obtain from it, and its EPL is therefore at least min EPL(n!). Hence proved.
LEMMA 7: Any 2-tree that has m leaves and whose EPL equals min EPL(m) must have all of its leaves on at most the bottom two levels.
PROOF: Suppose that some 2-tree does not have all its leaves on the bottom two levels. Let d be the depth of the tree, let A be a leaf in the tree that is not on one of the bottom two levels, and let k be the depth of A.

Because nodes at the bottom level have depth d, k ≤ d − 2. We now show that the tree cannot minimize the EPL among trees with the same number of leaves, by constructing a 2-tree with the same number of leaves and a lower EPL. Choose a nonleaf B at level d − 1 in the original tree, remove its two children, and give two children to A. Clearly the new tree has the same number of leaves as the original tree.

[Figure: the original 2-tree with m leaves, in which A is a leaf at level k ≤ d−2 and B is a nonleaf at level d−1 with two children at level d; and the new 2-tree with m leaves and decreased EPL, in which B's two children have been moved to A (now at level k+1) and B has become a leaf.]
In the new tree neither A nor the children of B are leaves, although they are leaves in the old tree. Therefore, we have decreased the EPL by the length of the path to A and by the lengths of the two paths to B's children, that is, by k + d + d = k + 2d. In the new tree, B and the two new children of A are leaves, although they are not leaves in the old tree. Therefore, we have increased the EPL by the length of the path to B and the lengths of the two paths to A's new children, that is, by (d − 1) + (k + 1) + (k + 1) = d + 2k + 1. Because k ≤ d − 2, the net change in the EPL is
(d + 2k + 1) − (k + 2d) = k − d + 1 ≤ (d − 2) − d + 1 = −1.

As the net change in the EPL is negative, the new tree has a smaller EPL. Thus the old tree cannot minimize the EPL among trees with the same number of leaves.

LEMMA 8: Any 2-tree that has m leaves and whose EPL equals min EPL(m) must have 2^d − m leaves at level d − 1 and 2m − 2^d leaves at level d, and have no other leaves, where d is the depth of the tree.
PROOF: By Lemma 7, all leaves are at the bottom two levels, and because nonleaves in a 2-tree must have two children, there must be 2^(d−1) nodes at level d − 1. Therefore, if r is the number of leaves at level d − 1, the number of nonleaves at that level is 2^(d−1) − r. Because nonleaves in a 2-tree have exactly two children, for every nonleaf at level d − 1 there are two leaves at level d. Because there are only leaves at level d, the number of leaves at level d is 2(2^(d−1) − r). Because Lemma 7 says that all leaves are at level d or d − 1,
r + 2(2^(d−1) − r) = m, which gives r = 2^d − m leaves at level d − 1.
Therefore, the number of leaves at level d is m − r = m − (2^d − m) = 2m − 2^d.

LEMMA 9: For any 2-tree that has m leaves and whose EPL equals min EPL(m), the depth d is given by d = ⌈lg m⌉.
PROOF: Consider the case in which m is a power of 2, so m = 2^k for some integer k. Let d be the depth of a minimizing tree. By Lemma 8, the number of leaves at level d − 1 is r = 2^d − m = 2^d − 2^k. Because r ≥ 0, we must have d ≥ k. We show that assuming d > k leads to a contradiction. If d > k, then
r = 2^d − 2^k ≥ 2^(k+1) − 2^k = 2^k(2 − 1) = 2^k = m.
Because r ≤ m, this means r = m, and all leaves are at level d − 1. But a tree of depth d must have some leaves at level d. This contradiction implies that d = k, which means r = 0 and thus 2^d − m = 0, that is, 2^d = m and d = lg m. Because ⌈lg m⌉ = lg m when m is a power of 2, d = ⌈lg m⌉. The case in which m is not a power of 2 is handled similarly. Proof completed.

LEMMA 10: For all integers m ≥ 1, min EPL(m) ≥ m ⌊lg m⌋.
PROOF: By Lemma 8, any 2-tree that minimizes the EPL must have 2^d − m leaves at level d − 1, have 2m − 2^d leaves at level d, and have no other leaves. Therefore,
min EPL(m) = (2^d − m)(d − 1) + (2m − 2^d)d = md + m − 2^d.
By Lemma 9, d = ⌈lg m⌉, so
min EPL(m) = m⌈lg m⌉ + m − 2^⌈lg m⌉.
If m is a power of 2, then ⌈lg m⌉ = ⌊lg m⌋ = lg m and m − 2^lg m = 0, so min EPL(m) = m ⌊lg m⌋.
If m is not a power of 2, then ⌈lg m⌉ = ⌊lg m⌋ + 1, so
min EPL(m) = m(⌊lg m⌋ + 1) + m − 2^⌈lg m⌉ = m⌊lg m⌋ + 2m − 2^⌈lg m⌉ > m⌊lg m⌋,
because 2m > 2^⌈lg m⌉. Therefore min EPL(m) ≥ m ⌊lg m⌋.
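Lemmas 8 and 9 give a closed form for min EPL(m) that can be checked numerically against the bound of Lemma 10; a sketch (Python, helper name mine):

```python
import math

def min_epl(m):
    """min EPL(m) = m*ceil(lg m) + m - 2**ceil(lg m), from Lemmas 8 and 9."""
    d = math.ceil(math.log2(m))            # depth of a minimizing 2-tree
    return m * d + m - 2 ** d

# Lemma 10: min EPL(m) >= m * floor(lg m), with equality when m is a power of 2.
for m in range(1, 512):
    assert min_epl(m) >= m * math.floor(math.log2(m))
```

For example, min_epl(6) = 16: a minimizing 2-tree with 6 leaves has 2 leaves at level 2 and 4 leaves at level 3, giving EPL = 2·2 + 4·3 = 16.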

THEOREM 4: Any deterministic algorithm that sorts n distinct keys only by comparisons of keys must on the average do at least n lg n − 1.45n comparisons of keys.
PROOF: By Lemma 6, any such algorithm must on the average do at least min EPL(n!)/n! comparisons of keys. By Lemma 10,
min EPL(n!)/n! ≥ n! ⌊lg(n!)⌋ / n! = ⌊lg(n!)⌋.
By Lemma 4, lg(n!) ≥ n lg n − 1.45n. Proved.

Lower bounds for searching only by comparison of keys
The problem of searching for a key can be described as follows: given an array S containing n keys and a key x, find an index i such that x = S[i] if x equals one of the keys; if x does not equal one of the keys, report failure. For a lower bound, we can associate a decision tree with every deterministic algorithm that searches for a key x in an array of n keys. Figure 1 below shows a decision tree corresponding to binary search when searching seven keys, and Figure 2 shows a decision tree corresponding to the sequential search algorithm. In these trees, each large node represents a comparison of an array item with the search key x, and each small node (leaf) contains the result that is reported. When x is in the array, we report an index of the item it equals; when x is not in the array, we report an F for failure. In the figures we write S[1] = S1, S[2] = S2, ..., S[7] = S7. Each leaf in a decision tree for searching n keys for a key x represents a point at which the algorithm stops and reports either an index i such that x = Si or failure. Every internal node represents a comparison.

[Figure: each internal node compares x with an array item using a three-way (<, =, >) comparison; the root compares x with S4, its children compare x with S2 and S6, and the bottom level compares x with S1, S3, S5, and S7. Each leaf reports either an index 1..7 or F for failure.]

Figure 1: The decision tree corresponding to binary search when searching seven keys

[Figure: a chain of comparison nodes testing x = S1, x = S2, ..., x = S7 in turn; each = branch leads to a leaf reporting the matching index, and the final unequal branch leads to a leaf reporting F.]

Figure 2: The decision tree corresponding to sequential search when searching seven keys

A decision tree is called Valid for searching n keys for a key x if, for each possible outcome, there is a path from the root to a leaf that reports that outcome. That is, there must be a path for x = Si for each 1 ≤ i ≤ n and a path that leads to failure. The decision tree is called Pruned if every leaf is reachable. Every algorithm that searches for a key x in an array of n keys has a corresponding pruned, valid decision tree.

Lower bounds for worst case behavior


LEMMA 11: If n is the number of nodes in a binary tree and d is its depth, then d ≥ ⌊lg n⌋.
PROOF: We have n ≤ 1 + 2 + 2² + 2³ + ··· + 2^d, because there can be only one root, at most two nodes with depth 1, at most 2² nodes with depth 2, ..., and at most 2^d nodes with depth d. Applying the geometric-series sum, n ≤ 2^(d+1) − 1, which means that n < 2^(d+1), so lg n < d + 1 and therefore ⌊lg n⌋ ≤ d.

LEMMA 12: To be a pruned, valid decision tree for searching n distinct keys for a key x, the binary tree consisting of the comparison nodes must contain at least n nodes.
PROOF: Let Si for i = 1, 2, 3, ..., n be the values of the n keys. First we show that every Si must be in at least one comparison node. Suppose that for some i this is not the case. Take two inputs that are identical for all keys except the ith key and different for the ith key, and let x have the value of Si in one of the inputs. Because Si is not involved in any comparison and all the other keys are the same in both inputs, the decision tree must behave the same for both inputs. However, it must report i for one of the inputs and must not report i for the other. This contradiction shows that every Si must be in at least one comparison node.

Because every Si must be in at least one comparison node, the only way we could have fewer than n comparison nodes would be for at least one key Si to be involved only in comparisons with other keys; that is, some Si that is never compared with x. Suppose we have such a key. Take two inputs that are equal everywhere except for Si, with Si being the smallest key in both inputs, and let x be the ith key in one of the inputs. A path from a comparison node containing Si must go in the same direction for both inputs, and all other keys are the same in both inputs. Therefore the decision tree must behave the same for the two inputs. However, it must report i for one of them and must not report i for the other. This contradiction proves the lemma.

THEOREM 5: Any deterministic algorithm that searches for a key x in an array of n distinct keys only by comparison of keys must in the worst case do at least ⌊lg n⌋ + 1 comparisons of keys.
PROOF: Corresponding to the algorithm, there is a pruned, valid decision tree for searching n distinct keys for a key x. The worst case number of comparisons is the number of nodes in the longest path from the root to a leaf in the binary tree consisting of the comparison nodes in that decision tree. This number is the depth of that binary tree plus 1. Lemma 12 says that this binary tree has at least n nodes. Therefore, by Lemma 11, its depth is greater than or equal to ⌊lg n⌋. This proves the theorem.
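Theorem 5's bound is met exactly by binary search; a counting sketch (Python, using three-way comparisons as in the decision trees above; helper name mine):

```python
import math

def binary_search_comparisons(S, x):
    """Binary search on sorted S; count one three-way comparison per node."""
    low, high, count = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        count += 1                    # one comparison node on the search path
        if x == S[mid]:
            return mid, count
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return -1, count                  # failure

n = 100
S = list(range(n))
worst = max(binary_search_comparisons(S, x)[1] for x in S)
assert worst == math.floor(math.log2(n)) + 1      # = 7 for n = 100
```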

Lower bounds for average case behavior


A binary tree of depth d is called a nearly complete binary tree if it is complete down to depth d − 1; that is, every level except possibly the bottom one is full. Every essentially complete binary tree is nearly complete, but not every nearly complete binary tree is essentially complete.

Fig(a): An Essentially Complete Binary Tree

Fig(b): A Nearly Complete Binary Tree but not Essentially Complete BT.

LEMMA 13: The tree consisting of the comparison nodes in the pruned, valid decision tree corresponding to binary search is a nearly complete binary tree.

Definition: The total node distance (TND) of a binary tree is the sum, over all nodes, of the number of nodes in the path from the root to that node; min TND(n) denotes the minimum TND of any binary tree with n nodes.

LEMMA 14: The TND of a binary tree containing n nodes is equal to min TND(n) if and only if the tree is nearly complete.
PROOF: First we show that if a tree's TND equals min TND(n), the tree is nearly complete. Suppose that some binary tree is not nearly complete. Then there must be some node, not at one of the bottom two levels, that has at most one child. We can remove any node A from the bottom level and make it a child of that node. The resulting tree is a binary tree containing n nodes. The number of nodes in the path to A in the new tree is at least 1 less than the number of nodes in the path to A in the original tree, while the number of nodes in the path to every other node is the same. Therefore, we have created a binary tree containing n nodes with a TND smaller than that of our original tree, which means that our original tree did not have the minimum TND. Conversely, the TND is the same for all nearly complete binary trees containing n nodes, so every such tree must have the minimum TND.

LEMMA 15: Suppose that we are searching n keys, the search key x is in the array, and all array slots are equally probable. Then the average case time complexity of binary search is given by min TND(n)/n.
PROOF: The proof follows from Lemmas 13 and 14.

LEMMA 16: If we assume that x is in the array and that all array slots are equally probable, the average case time complexity of any deterministic algorithm that searches for a key x in an array of n distinct keys is bounded below by min TND(n)/n.
PROOF: By Lemma 12, every array item Si must be compared with x at least once in the decision tree corresponding to the algorithm. Let Ci be the number of nodes in the shortest path to a node containing a comparison of Si with x. Because each key has the same probability 1/n of being the search key x, a lower bound on the average case time complexity is given by
C1(1/n) + C2(1/n) + ··· + Cn(1/n) = (1/n) Σ (i = 1 to n) Ci.
The Ci are the node distances of n distinct nodes in a binary tree, so
Σ (i = 1 to n) Ci ≥ min TND(n),
and the average number of comparisons is therefore at least min TND(n)/n.

THEOREM 6: Among deterministic algorithms that search for a key x in an array of n distinct keys only by comparison of keys, binary search is optimal in its average case performance if we assume that x is in the array and that all array slots are equally probable. Under these assumptions, any such algorithm must on the average do at least approximately lg n − 1 comparisons of keys.
PROOF: The proof follows from Lemmas 15 and 16.
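The quantity min TND(n)/n can be evaluated directly; a sketch (Python, helper name mine) using the fact that node i of a nearly complete binary tree, numbered 1..n level by level, sits at depth ⌊lg i⌋:

```python
import math

def min_tnd(n):
    """min TND(n): the path to node i contains floor(lg i) + 1 nodes."""
    return sum(math.floor(math.log2(i)) + 1 for i in range(1, n + 1))

n = 2 ** 20 - 1
avg_bound = min_tnd(n) / n            # lower bound on average comparisons
# For large n this is approximately lg(n) - 1.
```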
