
Algorithms for Data Science

CSOR W4246
Eleni Drinea
Computer Science Department
Columbia University
Tuesday, September 15, 2015

Outline

1 Recap

2 The running time of Mergesort and solving recurrences

3 Binary search

4 Integer multiplication

5 Fast matrix multiplication


Recap

- Introduced asymptotic notation
- The divide & conquer principle; application: Mergesort
  - Divide the problem into a number of subproblems that are smaller instances of the same problem.
  - Conquer the subproblems by solving them recursively.
  - Combine the solutions to the subproblems into the solution for the original problem.
- Analyzed Mergesort
  1. Correctness: by induction on the size of the array
  2. Space: extra Θ(n) space
     - Input to Merge is stored in auxiliary arrays
     - Unlike insertion-sort, Mergesort does not sort in place


Mergesort: pseudocode

Mergesort(A, left, right)
    if right == left then return
    end if
    middle = left + ⌊(right − left)/2⌋
    Mergesort(A, left, middle)
    Mergesort(A, middle + 1, right)
    Merge(A, left, middle, right)

Remarks

- Mergesort is a recursive procedure (why?)
- Initial call: Mergesort(A, 1, n)
- Subroutine Merge merges two sorted lists of sizes ⌊n/2⌋, ⌈n/2⌉ into one sorted list of size n. How can we accomplish this?
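One standard way to accomplish this (a Python sketch, not the course's reference code, and returning a new list rather than writing back into A): walk both sorted halves with two pointers, always copying the smaller head to the output, for O(n) total work.

```python
def merge(left_half, right_half):
    """Merge two sorted lists into one sorted list in O(n) time."""
    merged = []
    i = j = 0
    while i < len(left_half) and j < len(right_half):
        if left_half[i] <= right_half[j]:
            merged.append(left_half[i])
            i += 1
        else:
            merged.append(right_half[j])
            j += 1
    # One half is exhausted; the remainder of the other is already sorted.
    merged.extend(left_half[i:])
    merged.extend(right_half[j:])
    return merged

def mergesort(A):
    """Sort A by recursively sorting the two halves and merging them."""
    if len(A) <= 1:
        return A
    mid = len(A) // 2
    return merge(mergesort(A[:mid]), mergesort(A[mid:]))
```

Note that `merge` allocates auxiliary output storage, which is exactly the extra Θ(n) space charged to Mergesort in the recap.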

Running time of Mergesort

The running time of Mergesort satisfies:

T(n) = 2T(n/2) + cn, for n ≥ 2, constant c > 0
T(1) = c

This structure is typical of recurrence relations:
- an inequality or equation bounds T(n) in terms of an expression involving T(m) for m < n
- a base case generally says that T(n) is constant for small constant n

Remarks

- We ignore floor and ceiling notations.
- A recurrence does not provide an asymptotic bound for T(n): to this end, we must solve the recurrence.

Solving recurrences, method 1: recursion trees

The technique consists of three steps:
1. Analyze the first few levels of the tree of recursive calls
2. Identify a pattern
3. Sum over all levels of recursion

Example: analysis of running time of Mergesort

T(n) = 2T(n/2) + cn, n ≥ 2
T(1) = c

The recursion tree of a generic recurrence relation

The running times of many recursive algorithms can be expressed by the following recurrence:

T(n) = aT(n/b) + cn^k, for a, c > 0, b > 1, k ≥ 0

What is the tree of recursive calls for this recurrence?
- a is the branching factor
- b is the factor by which the size of each subproblem shrinks
- at level i, there are a^i subproblems, each of size n/b^i
- each subproblem at level i requires c(n/b^i)^k work
- the height of the tree is log_b n levels

Total work:

Σ_{i=0}^{log_b n} a^i · c(n/b^i)^k = cn^k · Σ_{i=0}^{log_b n} (a/b^k)^i
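The level sums form a geometric series in the ratio a/b^k; evaluating it in each regime is where the three cases of the Master theorem (stated next) come from. A sketch of the calculation:

```latex
\sum_{i=0}^{\log_b n} a^i \, c\left(\frac{n}{b^i}\right)^k
  = c\,n^k \sum_{i=0}^{\log_b n} \left(\frac{a}{b^k}\right)^i
  = \begin{cases}
      O(n^k), & a < b^k \text{ (decreasing series, sums to } O(1)\text{)}\\
      O(n^k \log n), & a = b^k \text{ (} \log_b n + 1 \text{ equal terms)}\\
      O(n^{\log_b a}), & a > b^k \text{ (last term } c\,a^{\log_b n} = c\,n^{\log_b a} \text{ dominates)}
    \end{cases}
```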

Solving recurrences, method 2: the Master theorem

Theorem 1 (Master theorem).
If T(n) = aT(⌈n/b⌉) + O(n^k) for some constants a > 0, b > 1, k ≥ 0, then

T(n) = O(n^(log_b a))  if a > b^k
T(n) = O(n^k log n)    if a = b^k
T(n) = O(n^k)          if a < b^k

Example: running time of Mergesort
- T(n) = 2T(n/2) + cn: a = 2, b = 2, k = 1, b^k = 2 = a ⇒ T(n) = O(n log n)

Solving recurrences, method 3: the substitution method

The technique consists of two steps:
1. Guess a bound
2. Use (strong) induction to prove that the guess is correct

Remark 1 (simple vs strong induction).
- Simple induction: the induction step at n requires that the inductive hypothesis holds at step n − 1.
- Strong induction: the induction step at n requires that the inductive hypothesis holds at all steps 1, 2, ..., n − 1.

Strong induction is most useful when several instances of the hypothesis are required to show the inductive step.

Exercise: show inductively that Mergesort runs in time O(n log n).

What about...

1. T(n) = 2T(n − 1) + 1, T(1) = 2
2. T(n) = 2T²(n − 1), T(1) = 4
3. T(n) = T(2n/3) + T(n/3) + cn
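For the first recurrence, the substitution method starts with a guess; a quick way to form and sanity-check one is to evaluate the recurrence directly for small n. The closed form 3 · 2^(n−1) − 1 below is my guess, to be proved by induction as in method 3:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = 2T(n-1) + 1 with T(1) = 2, directly from the recurrence."""
    return 2 if n == 1 else 2 * T(n - 1) + 1

# Guessed bound (an assumption, not from the slides): T(n) = 3 * 2**(n-1) - 1.
# Checking small cases does not prove it; induction does.
for n in range(1, 20):
    assert T(n) == 3 * 2 ** (n - 1) - 1
```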



Example: A = {0, 2, 3, 5, 6, 7, 9, 11, 13}, n = 9, x = 7

Searching a sorted array

- Input: sorted list A of n integers, integer x
- Output:
  1. index j s.t. 1 ≤ j ≤ n and A[j] = x; or
  2. "no" if x is not in A

Example: A = {0, 2, 3, 5, 6, 7, 9, 11, 13}, n = 9, x = 7

Idea: use the fact that the array is sorted and probe specific entries in the array.

Binary search

First, probe the middle entry. Let mid = ⌈n/2⌉.
- If x == A[mid], return mid.
- If x < A[mid] then look for x in A[1, mid − 1];
- Else if x > A[mid] look for x in A[mid + 1, n].

Initially, the entire array is active, that is, x might be anywhere in the array.

[Figure: the array, with entries ≤ A[mid] to the left of position mid and entries ≥ A[mid] to its right]

Suppose x > A[mid]. Then the active area of the array, where x might be, is to the right of mid.

[Figure: the active area now starts at position mid + 1]

Binary search pseudocode

binarysearch(A, left, right)
    if right == left then
        if A[left] == x then
            return left
        else
            return no
        end if
    else
        mid = left + ⌈(right − left)/2⌉
        if A[mid] == x then
            return mid
        else
            if A[mid] < x then left = mid + 1
            else right = mid − 1
            end if
            return binarysearch(A, left, right)
        end if
    end if
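The same algorithm as an iterative Python sketch (0-indexed, and returning None in place of "no"; the value x is passed explicitly rather than treated as a global as in the pseudocode):

```python
def binary_search(A, x):
    """Return an index j with A[j] == x in sorted list A, or None if absent."""
    left, right = 0, len(A) - 1
    while left <= right:
        mid = left + (right - left) // 2  # this form avoids overflow in fixed-width languages
        if A[mid] == x:
            return mid
        elif A[mid] < x:
            left = mid + 1    # active region is to the right of mid
        else:
            right = mid - 1   # active region is to the left of mid
    return None
```

For example, binary_search([0, 2, 3, 5, 6, 7, 9, 11, 13], 7) returns 5, the (0-indexed) position of 7.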


Binary search running time

Observation: At each step there is a region of A where x could be, and we shrink the size of this region by a factor of 2 with every probe:
- If n is odd, then we are throwing away ⌈n/2⌉ elements.
- If n is even, then we are throwing away at least n/2 elements.

Hence the recurrence for the running time is

T(n) ≤ T(n/2) + O(1)

Sublinear running time

Here are two ways to analyze the recurrence:
1. Master theorem: a = 1, b = 2, k = 0 ⇒ T(n) = O(log n).
2. We can reason as follows:
   - Starting with an array of size n, after k probes we are left with an array of size at most n/2^k (since every time we probe an entry the active portion of the array halves).
   - Hence after k = log n probes, we are left with an array of constant size (i.e., O(1)). At that time we can just search linearly for x in the constant-size array.

Concluding remarks on binary search

1. The right data structure can improve the running time of the algorithm significantly.
   - What if we used a linked list to store the input?
   - Arrays allow for random access of their elements: given an index, we can read any entry in an array in time O(1) (constant time).
2. In general, we obtain running time O(log n) when the algorithm does a constant amount of work to throw away a constant fraction of the input.


Integer multiplication

- How do we multiply two integers x and y?
- Elementary school method: compute a partial product by multiplying every digit of y separately with x and then add up all the partial products.
- Remark: this method works the same in base 10 or base 2.

Examples: (12)₁₀ × (11)₁₀ and (1100)₂ × (1011)₂


      12              1100
    × 11            × 1011
    ————            ——————
      12              1100
     12              1100
    ————            0000
     132           1100
                ——————————
                  10000100

Elementary algorithm running time

A more reasonable model of computation: a single operation on a pair of digits (bits) is a primitive computational step. Assume we are multiplying n-digit (bit) numbers.
- O(n) time to compute a partial product.
- O(n) time to combine it into a running sum of all partial products so far.
- There are n partial products, each consisting of O(n) bits, hence the total number of operations is O(n^2).

Can we do better?

A first divide & conquer approach

Consider n-digit decimal numbers x, y:

x = x_{n−1} x_{n−2} ... x_0
y = y_{n−1} y_{n−2} ... y_0

Idea: rewrite each number as the sum of the n/2 high-order digits and the n/2 low-order digits:

x = x_{n−1} ... x_{n/2} x_{n/2−1} ... x_0 = x_H · 10^(n/2) + x_L
y = y_{n−1} ... y_{n/2} y_{n/2−1} ... y_0 = y_H · 10^(n/2) + y_L

where each of x_H, x_L, y_H, y_L is an n/2-digit number: x_H consists of the high-order digits x_{n−1} ... x_{n/2}, x_L of the low-order digits x_{n/2−1} ... x_0, and similarly for y.

Examples

- n = 2, x = 12, y = 11:
  12 = 1 · 10^1 + 2   (x_H = 1, x_L = 2)
  11 = 1 · 10^1 + 1   (y_H = 1, y_L = 1)

- n = 4, x = 1000, y = 1110:
  1000 = 10 · 10^2 + 0    (x_H = 10, x_L = 0)
  1110 = 11 · 10^2 + 10   (y_H = 11, y_L = 10)


A first divide & conquer approach

x · y = (x_H · 10^(n/2) + x_L) · (y_H · 10^(n/2) + y_L)
      = x_H y_H · 10^n + (x_H y_L + x_L y_H) · 10^(n/2) + x_L y_L

In words, we reduced the problem of solving 1 instance of size n (i.e., one multiplication between two n-digit numbers) to the problem of solving 4 instances, each of size n/2 (i.e., computing the products x_H y_H, x_H y_L, x_L y_H and x_L y_L).

This is a divide and conquer solution!
- Recursively solve the 4 subproblems.
- Multiplication by 10^n is easy (shifting): O(n) time.
- Combine the solutions from the 4 subproblems into an overall solution using 3 additions on O(n)-digit numbers: O(n) time.


Karatsuba's observation

Running time: T(n) ≤ 4T(n/2) + cn
- by the Master theorem: T(n) = O(n^2), so no improvement

However, if we only needed three n/2-digit multiplications, then by the Master theorem

T(n) ≤ 3T(n/2) + cn = O(n^1.59) = o(n^2).

Recall that

x · y = x_H y_H · 10^n + (x_H y_L + x_L y_H) · 10^(n/2) + x_L y_L

Key observation: we do not need each of x_H y_L, x_L y_H.
We only need their sum, x_H y_L + x_L y_H.


Gauss's observation on multiplying complex numbers

A similar problem: multiply two complex numbers a + bi, c + di

(a + bi)(c + di) = ac + (ad + bc)i + bd·i^2

Gauss's observation: this can be done with just 3 multiplications

(a + bi)(c + di) = ac + ((a + b)(c + d) − ac − bd)i + bd·i^2,

at the cost of a few extra additions and subtractions.
Unlike multiplications, additions and subtractions of n-digit numbers are cheap: O(n) time!
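Gauss's trick in a few lines of Python (a sketch; returns the real and imaginary parts of the product, using i^2 = −1):

```python
def gauss_complex_multiply(a, b, c, d):
    """Compute (a + bi)(c + di) with 3 multiplications instead of 4."""
    ac = a * c
    bd = b * d
    cross = (a + b) * (c + d) - ac - bd  # equals ad + bc
    return (ac - bd, cross)              # (real part, imaginary part)
```

For example, gauss_complex_multiply(1, 2, 3, 4) gives (-5, 10), matching (1 + 2i)(3 + 4i) = −5 + 10i.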

Karatsuba's algorithm

x · y = (x_H · 10^(n/2) + x_L) · (y_H · 10^(n/2) + y_L)
      = x_H y_H · 10^n + (x_H y_L + x_L y_H) · 10^(n/2) + x_L y_L

Similarly to Gauss's method for multiplying two complex numbers, compute only the three products

x_H y_H, x_L y_L, (x_H + x_L)(y_H + y_L)

and obtain the sum x_H y_L + x_L y_H from

(x_H + x_L)(y_H + y_L) − x_H y_H − x_L y_L = x_H y_L + x_L y_H.

Combining requires O(n) time, hence

T(n) ≤ 3T(n/2) + cn = O(n^(log_2 3)) = O(n^1.59)

Pseudocode

Let k be a small constant.

Integer-Multiply(x, y)
    if n == k then
        return x · y
    end if
    write x = x_H · 10^(n/2) + x_L, y = y_H · 10^(n/2) + y_L
    compute x_H + x_L, y_H + y_L
    product = Integer-Multiply(x_H + x_L, y_H + y_L)
    x_H y_H = Integer-Multiply(x_H, y_H)
    x_L y_L = Integer-Multiply(x_L, y_L)
    return x_H y_H · 10^n + (product − x_H y_H − x_L y_L) · 10^(n/2) + x_L y_L
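The pseudocode as a runnable Python sketch. The base case here is "a single-digit factor" rather than a tunable constant k, and `divmod` performs the split into high- and low-order digits; both are choices of this sketch, not of the slides:

```python
def karatsuba(x, y):
    """Multiply nonnegative integers x, y using three half-size products."""
    if x < 10 or y < 10:                 # base case: a single-digit factor
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    xH, xL = divmod(x, 10 ** half)       # x = xH * 10^half + xL
    yH, yL = divmod(y, 10 ** half)       # y = yH * 10^half + yL
    high = karatsuba(xH, yH)
    low = karatsuba(xL, yL)
    cross = karatsuba(xH + xL, yH + yL) - high - low  # = xH*yL + xL*yH
    return high * 10 ** (2 * half) + cross * 10 ** half + low
```

For example, karatsuba(12, 11) computes the three products 1·1, 2·1 and 3·2 and combines them into 132.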

Concluding remarks

- To reduce the number of multiplications we do a few more additions/subtractions: these are fast compared to multiplications.
- There is no reason to continue with recursion once n is small enough: the conventional algorithm is probably more efficient since it uses fewer additions.
- When we recursively compute (x_H + x_L)(y_H + y_L), each of x_H + x_L, y_H + y_L might be an (n/2 + 1)-digit integer. This does not affect the asymptotics.


Fast matrix multiplication

- Matrix multiplication: a fundamental primitive in numerical linear algebra, scientific computing, machine learning and large-scale data analysis.
- Input: m × n matrix A, n × p matrix B
- Output: m × p matrix C = AB

Example:

A = [ 1 0 ]   B = [ 1 1 ]   C = AB = [ 1 1 ]
    [ 0 1 ]       [ 1 1 ]            [ 1 1 ]

- Lower bounds on matrix multiplication algorithms for m, p = Θ(n)?

Conventional matrix multiplication

for 1 ≤ i ≤ m do
    for 1 ≤ j ≤ p do
        c_{i,j} = Σ_{k=1}^{n} a_{i,k} · b_{k,j}
    end for
end for

- Running time?
- Can we do better?
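The triple loop above as a short Python sketch (matrices as lists of rows); it performs m·n·p scalar multiplications, i.e. Θ(n^3) for square matrices:

```python
def matmul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B, both lists of rows."""
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            # c[i][j] = sum over k of a[i][k] * b[k][j]
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```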

A first divide & conquer approach: 8 subproblems

Assume square A, B where n = 2^k for some k > 0.
Idea: express A, B as 2 × 2 block matrices, with n/2 × n/2 blocks, and use the conventional algorithm to multiply the two block matrices.

[ A11 A12 ] [ B11 B12 ]   [ C11 C12 ]
[ A21 A22 ] [ B21 B22 ] = [ C21 C22 ]

where

C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22

Running time?

Strassen's breakthrough: 7 subproblems suffice (part 1)

Compute the following ten n/2 × n/2 matrices:
1. S1 = B12 − B22
2. S2 = A11 + A12
3. S3 = A21 + A22
4. S4 = B21 − B11
5. S5 = A11 + A22
6. S6 = B11 + B22
7. S7 = A12 − A22
8. S8 = B21 + B22
9. S9 = A11 − A21
10. S10 = B11 + B12

Running time?

Strassen's breakthrough: 7 subproblems suffice (part 2)

Compute the following seven products of n/2 × n/2 matrices:
1. P1 = A11 · S1
2. P2 = S2 · B22
3. P3 = S3 · B11
4. P4 = A22 · S4
5. P5 = S5 · S6
6. P6 = S7 · S8
7. P7 = S9 · S10

Compute C as follows:
1. C11 = P4 + P5 + P6 − P2
2. C12 = P1 + P2
3. C21 = P3 + P4
4. C22 = P1 + P5 − P3 − P7

Running time?
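The two parts above can be assembled into a compact Python sketch for n a power of 2 (the helpers `add`, `sub` and `blocks` are mine, not from the slides; a practical implementation would switch to the conventional algorithm below some cutoff size):

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    """Multiply n x n matrices (n a power of 2) with 7 recursive products."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def blocks(M):  # split M into its four n/2 x n/2 blocks
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
    A11, A12, A21, A22 = blocks(A)
    B11, B12, B21, B22 = blocks(B)
    P1 = strassen(A11, sub(B12, B22))            # A11 * S1
    P2 = strassen(add(A11, A12), B22)            # S2 * B22
    P3 = strassen(add(A21, A22), B11)            # S3 * B11
    P4 = strassen(A22, sub(B21, B11))            # A22 * S4
    P5 = strassen(add(A11, A22), add(B11, B22))  # S5 * S6
    P6 = strassen(sub(A12, A22), add(B21, B22))  # S7 * S8
    P7 = strassen(sub(A11, A21), add(B11, B12))  # S9 * S10
    C11 = add(sub(add(P4, P5), P2), P6)
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P1, P5), P3), P7)
    # Stitch the four blocks back into one n x n matrix.
    return ([r11 + r12 for r11, r12 in zip(C11, C12)] +
            [r21 + r22 for r21, r22 in zip(C21, C22)])
```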

Strassen's running time and concluding remarks

- Recurrence: T(n) = 7T(n/2) + cn^2
- By the Master theorem: T(n) = O(n^(log_2 7)) = O(n^2.81)
- Recently, there is renewed interest in Strassen's algorithm for high-performance computing: thanks to its lower communication cost (number of bits exchanged between machines in the network or data center), it is better suited than the traditional algorithm for multi-core processors.
