
Algorithm Analysis and Complexity

Agenda

Algorithm Analysis
What is complexity?
Types of complexities
Methods of measuring complexity

Algorithm Analysis (1/5)
Measures the efficiency of an algorithm or its
implementation as a program as the input size
becomes very large
We evaluate a new algorithm by comparing its
performance with that of previous approaches
Comparisons are asymptotic analyses of classes of
algorithms
We usually analyze the time required for an algorithm
and the space required for a data structure

Algorithm Analysis (2/5)

Many criteria affect the running time of an algorithm,
including
speed of CPU, bus and peripheral hardware
design, programming and debugging time
language used and coding efficiency of the
programmer
quality of input (good, bad or average)

Algorithm Analysis (3/5)

A comparison of two algorithms for solving the
same problem should be
Machine independent
Language independent
Environment independent (load on the system, ...)
Amenable to mathematical study
Realistic

Algorithm Analysis (4/5)


In lieu of some standard benchmark conditions under
which two programs can be run, we estimate the
algorithm's performance based on the number of key
and basic operations it requires to process an input of
a given size
For a given input size n we express the time T to run
the algorithm as a function T(n)
Concept of growth rate allows us to compare running
time of two algorithms without writing two programs
and running them on the same computer

Algorithm Analysis (5/5)

Analysis is performed with respect to a
computational model
We will usually use a generic uni-processor
random-access machine (RAM)
All memory equally expensive to access
No concurrent operations
All reasonable instructions take unit time
(except, of course, function calls)
Constant word size
(unless we are explicitly manipulating bits)

Complexity
The complexity of an algorithm is simply the amount
of work the algorithm performs to complete its task.

Complexity
A measure of the performance of an algorithm
An algorithm's performance depends on
internal factors
external factors

External Factors
Speed of the computer on which it is run
Quality of the compiler
Size of the input to the algorithm

Internal Factors
The algorithm's efficiency, in terms of:
Time required to run
Space (memory storage) required to run

Note:
Complexity measures the internal factors (usually more interested in time
than space)

Two ways of finding complexity


Experimental study
Theoretical Analysis

Experimental study
Write a program implementing the algorithm
Run the program with inputs of varying size
and composition
Get an accurate measure of the actual running
time
Plot the results
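
As a rough sketch of this experimental approach, the following C++ example times one run per input size with std::chrono (insertionSort and the input sizes are illustrative choices, not prescribed by the slides):

#include <chrono>
#include <iostream>
#include <random>
#include <vector>

// Algorithm under test (illustrative): insertion sort on a vector.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); i++) {
        int t = a[i];
        std::size_t j = i;
        while (j > 0 && t < a[j - 1]) { a[j] = a[j - 1]; j--; }
        a[j] = t;
    }
}

int main() {
    std::mt19937 rng(42);                      // fixed seed, repeatable inputs
    for (int n : {1000, 2000, 4000, 8000}) {   // inputs of varying size
        std::vector<int> a(n);
        for (int& x : a) x = static_cast<int>(rng() % 100000);
        auto start = std::chrono::steady_clock::now();
        insertionSort(a);
        auto stop = std::chrono::steady_clock::now();
        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
        std::cout << "n=" << n << "  time=" << ms.count() << " ms\n";
    }
}

Plotting these (n, time) pairs reveals the growth trend; the next slide lists why such measurements alone are not sufficient.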

Limitations of Experiments
It is necessary to implement the algorithm,
which may be difficult
Results may not be indicative of the running
time on other inputs not included in the
experiment.
In order to compare two algorithms, the same
hardware and software environments must be
used
Experimental data, though important, are not
sufficient

Theoretical Analysis
Uses a high-level description of the algorithm
instead of an implementation
Characterizes running time as a function of the
input size, n.
Takes into account all possible inputs
Allows us to evaluate the speed of an
algorithm independent of the
hardware/software environment

Measure Running Time


Basically, counting the number of basic
operations required to solve a problem,
depending on the problem/instance size
What are basic operations?
How to measure instance size?

Basic Operations
Operations that can be done in constant time,
regardless of the size of the input
Basic arithmetic operations, e.g. add, subtract, shift
Relational operations, e.g. <, >, <=
Logical operations, e.g. and, or, not
More complex arithmetic operations, e.g.
multiply, divide, log
Etc.

Non-basic Operations
When are arithmetic, relational and logical
operations not basic operations?
When they depend on the input size, e.g.
- Arithmetic operations on very large numbers
- Logical operations on very long bit strings
- Etc.

Instance Size
Number of elements
Arrays, matrices
Graphs (nodes or vertices)
Sets
Number of bits/bytes
Numbers
Bit strings
Maximum values
Numbers

Components of Program Space


Program space = Instruction space + data
space + stack space
The instruction space is dependent on several
factors.
the compiler that generates the machine code
the target computer

Components of Program Space


Data space
very much dependent on the computer architecture and
compiler
The magnitude of the data that a program works with is
another factor

Type         Size (bytes)
char         1
short        2
int          2
long         4
float        4
double       8
long double  10
pointer      2

(sizes shown are for a 16-bit compiler; they vary by platform)

Components of Program Space


Data space
Choosing a smaller data type has an effect on the overall
space usage of the program.
Choosing the correct type is especially important when working
with arrays.

Environment Stack Space


Every time a function is called, the following data are saved on
the stack.
1. the return address
2. the values of all local variables and value formal parameters
3. the binding of all reference and const reference parameters

Space Complexity
The space needed by an algorithm is the sum
of a fixed part and a variable part
The fixed part includes space for

Instructions
Simple variables
Fixed size component variables
Space for constants
Etc..

Space Complexity (cont.)
The variable part includes space for
Component variables whose size is dependent on
the particular problem instance being solved
Recursion stack space
Etc.
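
To make the recursion-stack contribution concrete, here is a small C++ sketch (the function names are illustrative): both functions compute the same sum, but the recursive one adds a stack frame per call, so its variable space is O(n), while the iterative one uses O(1) extra space.

// Recursive: each of the n calls stores a return address and
// the parameter n on the stack, so stack space grows as O(n).
long sumRecursive(int n) {
    if (n == 0) return 0;
    return n + sumRecursive(n - 1);
}

// Iterative: a fixed number of variables, so O(1) extra space.
long sumIterative(int n) {
    long s = 0;
    for (int i = 1; i <= n; i++) s += i;
    return s;
}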

Time Complexity
The time complexity of a problem is
the number of steps that it takes to solve an
instance of the problem as a function of the size of
the input (usually measured in bits), using the most
efficient algorithm.

The exact number of steps will depend on
exactly what machine or language is being used.
To avoid that problem, asymptotic notation is
generally used.

Time Complexity
How do we measure?
1. Count a particular operation (operation counts)
2. Count the number of steps (step counts)
3. Asymptotic complexity

Running Example: Insertion Sort

// n is the number of elements in array a
for (int i = 1; i < n; i++)
{
    // insert a[i] into a[0:i-1]
    int t = a[i];
    int j;
    for (j = i - 1; j >= 0 && t < a[j]; j--)
        a[j + 1] = a[j];
    a[j + 1] = t;
}
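
For readers who want to run the fragment, a minimal driver could look like this (the array contents are arbitrary example values):

#include <iostream>

int main() {
    int a[] = {5, 2, 9, 1, 7};
    int n = 5;
    for (int i = 1; i < n; i++) {
        // insert a[i] into a[0:i-1]
        int t = a[i];
        int j;
        for (j = i - 1; j >= 0 && t < a[j]; j--)
            a[j + 1] = a[j];
        a[j + 1] = t;
    }
    for (int i = 0; i < n; i++)
        std::cout << a[i] << ' ';   // prints: 1 2 5 7 9
    std::cout << '\n';
}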

Operation Count
for (int i = 1; i < n; i++)
    for (j = i - 1; j >= 0 && t < a[j]; j--)
        a[j + 1] = a[j];
How many comparisons are made?
The number of compares depends on the values
in a[] and on t, as well as on n.

Operation Count
Worst case count = maximum count
Best case count = minimum count
Average count

Worst Case Operation Count

for (j = i - 1; j >= 0 && t < a[j]; j--)
    a[j + 1] = a[j];

a = [1,2,3,4] and t = 0      => 4 compares
a = [1,2,...,i] and t = 0    => i compares

Worst Case Operation Count

for (int i = 1; i < n; i++)
    for (j = i - 1; j >= 0 && t < a[j]; j--)
        a[j + 1] = a[j];

total compares = 1 + 2 + 3 + ... + (n-1)
               = (n-1)n/2
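
As a sanity check of the (n-1)n/2 bound, this sketch counts the compares on a worst-case (reverse-sorted) input; the explicit counter is added only for illustration:

#include <iostream>

int main() {
    const int n = 8;
    int a[n];
    for (int i = 0; i < n; i++) a[i] = n - i;   // reverse sorted: worst case
    long compares = 0;
    for (int i = 1; i < n; i++) {
        int t = a[i];
        int j;
        for (j = i - 1; j >= 0; j--) {
            compares++;                          // count each t < a[j] test
            if (!(t < a[j])) break;
            a[j + 1] = a[j];
        }
        a[j + 1] = t;
    }
    std::cout << compares << " compares, (n-1)n/2 = "
              << static_cast<long>(n - 1) * n / 2 << '\n';   // both print 28
}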

Step Count
The operation-count method omits accounting for the
time spent on all but the chosen operation
The step-count method accounts for all the time spent
in all parts of the program
A program step is loosely defined to be a
syntactically or semantically meaningful segment of a
program for which the execution time is independent
of the instance characteristics.
Note, however, that n adds cannot be counted as one
step, because their time depends on n.

Step Count

Statement                                   steps/execution (s/e)
for (int i = 1; i < n; i++)                 1
{                                           0
  // insert a[i] into a[0:i-1]              0
  int t = a[i];                             1
  int j;                                    0
  for (j = i - 1; j >= 0 && t < a[j]; j--)  1
    a[j + 1] = a[j];                        1
  a[j + 1] = t;                             1
}                                           0

Step Count

Statement                                   s/e  frequency
for (int i = 1; i < n; i++)                 1    n-1
{                                           0    0
  // insert a[i] into a[0:i-1]              0    0
  int t = a[i];                             1    n-1
  int j;                                    0    0
  for (j = i - 1; j >= 0 && t < a[j]; j--)  1    (n-1)n/2
    a[j + 1] = a[j];                        1    (n-1)n/2
  a[j + 1] = t;                             1    n-1
}                                           0    n-1

Step Count
Total step count
= (n-1) + 0 + 0 + (n-1) + 0 + (n-1)n/2 + (n-1)n/2 + (n-1) + (n-1)
= n² + 3n - 4
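
The n² + 3n - 4 total can be sanity-checked by instrumenting the code with a counter that mirrors the s/e column (stepCount and the worst-case input below are illustrative):

#include <iostream>

int main() {
    const int n = 10;
    int a[n];
    for (int i = 0; i < n; i++) a[i] = n - i;   // reverse sorted: worst case
    long stepCount = 0;
    for (int i = 1; i < n; i++) {
        stepCount++;                  // outer for statement  (frequency n-1)
        int t = a[i];
        stepCount++;                  // int t = a[i];        (frequency n-1)
        int j;
        for (j = i - 1; j >= 0 && t < a[j]; j--) {
            stepCount++;              // inner for statement  ((n-1)n/2 here)
            a[j + 1] = a[j];
            stepCount++;              // shift                ((n-1)n/2 here)
        }
        a[j + 1] = t;
        stepCount++;                  // insert               (frequency n-1)
        stepCount++;                  // closing brace        (frequency n-1)
    }
    std::cout << "steps = " << stepCount
              << ", n^2 + 3n - 4 = " << n * n + 3 * n - 4 << '\n';   // both 126
}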

Asymptotic Complexity

Two important reasons to determine operation and step
counts
1. To compare the time complexities of two programs that
compute the same function
2. To predict the growth in run time as the instance
characteristic changes
Neither of the two yields a very accurate measure
Operation counts: focus on key operations and ignore
all others
Step counts: the notion of a step is itself inexact
Asymptotic complexity provides meaningful statements
about the time and space complexities of a program

Introduction
Why are asymptotic notations important?
They give a simple characterization of an algorithm's
efficiency.
They allow the comparison of the performances of various
algorithms.
For large inputs, the multiplicative constants and
lower-order terms of an exact running time are
dominated by the effects of the input size (the number
of components).

Best, Average, Worst-case Complexity
In some cases, it is important to consider the
best, worst and/or average (or typical)
performance of an algorithm.
For example, when sorting a list into order, if
it is already in order then the algorithm may
have very little work to do (best case).
The worst-case analysis gives a bound for all
possible inputs (and may be easier to calculate
than the average case).

Analysis
Worst case
Provides an upper bound on running
time
An absolute guarantee

Average case
Provides the expected running time
Very useful, but treat with care: what is
average?
Random (equally likely) inputs
Real-life inputs

Asymptotic Overview
A way to describe behavior of functions in the limit.
Describe growth of functions.
Focus on what's important by abstracting away low-order
terms and constant factors.
Indicate running times of algorithms.
A way to compare sizes of functions.
Examples:
n steps vs. n+5 steps
n steps vs. n² steps
Running time of an algorithm as a function of input size n for large n.
Expressed using only the highest-order term in the expression for
the exact running time.

Asymptotic Notations

O-notation (big-oh, upper bound)
Ω-notation (big-omega, lower bound)
Θ-notation (theta, tight bound)
o-notation (little-oh, upper bound, not tight)
ω-notation (little-omega, lower bound, not tight)

Θ-notation (Average Case)

For function g(n), we define Θ(g(n)),
big-Theta of n, as the set:
Θ(g(n)) = { f(n) : ∃ positive constants c1, c2, and n0
such that ∀ n ≥ n0,
0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) }
Intuitively: the set of all functions that
have the same rate of growth as g(n).

g(n) is an asymptotically tight bound for f(n).

Θ-notation (Average Case)

For function g(n), we define Θ(g(n)),
big-Theta of n, as the set:
Θ(g(n)) = { f(n) : ∃ positive constants c1, c2, and n0
such that ∀ n ≥ n0,
0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) }
Technically, f(n) ∈ Θ(g(n)).

f(n) and g(n) are nonnegative for large n.

Example
T(n) = f(n) = Θ(g(n))
if c1·g(n) <= f(n) <= c2·g(n) for all n > n0, where c1, c2 and n0 are
constants > 0
[Figure: f(n) sandwiched between c1·g(n) and c2·g(n) to the right of n0]

(1/2)n² - 3n = Θ(n²) with c1 = 1/14, c2 = 1/2 and n0 = 7
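
As a worked check of these constants (each step follows directly from the definition of Θ):

\frac{1}{2}n^2 - 3n \le \frac{1}{2}n^2 = c_2 n^2 \quad \text{for all } n \ge 0

\frac{1}{2}n^2 - 3n \ge \frac{1}{14}n^2 \iff \Big(\frac{1}{2} - \frac{1}{14}\Big)n \ge 3 \iff \frac{3}{7}n \ge 3 \iff n \ge 7

so both inequalities hold for all n ≥ n0 = 7.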

O-notation (Worst Case)

For function g(n), we define O(g(n)),
big-O of n, as the set:
O(g(n)) = { f(n) : ∃ positive constants c and n0
such that ∀ n ≥ n0,
0 ≤ f(n) ≤ c·g(n) }
Intuitively: the set of all functions
whose rate of growth is the same as
or lower than that of g(n).

g(n) is an asymptotic upper bound for f(n).

f(n) = Θ(g(n)) implies f(n) = O(g(n)).
Θ(g(n)) ⊆ O(g(n)).
Example: 2n² = O(n³) with c = 1 and n0 = 2
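
A quick check of the example: 2n^2 \le 1 \cdot n^3 \iff 2 \le n, so the bound holds with c = 1 for every n \ge n_0 = 2.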

Ω-notation (Best Case)
For function g(n), we define Ω(g(n)),
big-Omega of n, as the set:
Ω(g(n)) = { f(n) : ∃ positive constants c and n0
such that ∀ n ≥ n0,
0 ≤ c·g(n) ≤ f(n) }
Intuitively: the set of all functions
whose rate of growth is the same
as or higher than that of g(n).

g(n) is an asymptotic lower bound for f(n).

f(n) = Θ(g(n)) implies f(n) = Ω(g(n)).
Θ(g(n)) ⊆ Ω(g(n)).

Ω-Notation: Asymptotic Lower Bound

T(n) = f(n) = Ω(g(n))
if f(n) >= c·g(n) for all n > n0, where c and n0 are
constants > 0
[Figure: f(n) lying above c·g(n) to the right of n0]

Example: T(n) = 2n + 5 is Ω(n). Why?
2n + 5 >= 2n, for all n > 0
T(n) = 5n² - 3n is Ω(n²). Why?
5n² - 3n >= 4n², for all n >= 4
√n = Ω(log n) with c = 1 and n0 = 16

Relations Between Θ, O, Ω
>= : Ω(g(n)), functions that grow at least as fast as g(n)
=  : Θ(g(n)), functions that grow at the same rate as g(n)
<= : O(g(n)), functions that grow no faster than g(n)

L'Hôpital's rule and Stirling's formula
L'Hôpital's rule: If lim_{n→∞} f(n) = lim_{n→∞} g(n) = ∞ and the
derivatives f′, g′ exist, then
lim_{n→∞} f(n)/g(n) = lim_{n→∞} f′(n)/g′(n)
Example: log n vs. n

Stirling's formula: n! ≈ (2πn)^{1/2} (n/e)^n
Example: 2^n vs. n!
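
Worked versions of the two examples (logs base 2; each uses only the stated tool):

\lim_{n\to\infty}\frac{\log_2 n}{n} = \lim_{n\to\infty}\frac{1/(n\ln 2)}{1} = 0, \quad \text{so } \log n \text{ grows more slowly than } n

\frac{n!}{2^n} \approx \sqrt{2\pi n}\,\Big(\frac{n}{2e}\Big)^n \to \infty, \quad \text{so } 2^n \text{ grows more slowly than } n!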

Orders of growth of some important functions
All logarithmic functions log_a n belong to the same class
Θ(log n) no matter what the logarithm's base a > 1 is,
because log_a n = log_b n / log_b a
All polynomials of the same degree k belong to the same class:
a_k n^k + a_{k-1} n^{k-1} + ... + a_0 ∈ Θ(n^k)
Exponential functions a^n have different orders of growth for
different a's
order log n < order n^α (α > 0) < order a^n < order n! < order n^n

o-notation

For a given function g(n), the set little-o:
o(g(n)) = { f(n) : ∀ c > 0, ∃ n0 > 0 such that
∀ n ≥ n0, 0 ≤ f(n) < c·g(n) }
f(n) becomes insignificant relative to g(n) as n
approaches infinity:
lim_{n→∞} [f(n) / g(n)] = 0

g(n) is an upper bound for f(n) that is not
asymptotically tight.
Observe the difference in this definition from previous
ones: here the inequality must hold for every positive c,
whereas O(g(n)) only requires it for some c.

ω-notation
For a given function g(n), the set little-omega:
ω(g(n)) = { f(n) : ∀ c > 0, ∃ n0 > 0 such that
∀ n ≥ n0, 0 ≤ c·g(n) < f(n) }
f(n) becomes arbitrarily large relative to
g(n) as n approaches infinity:
lim_{n→∞} [f(n) / g(n)] = ∞

g(n) is a lower bound for f(n) that is not
asymptotically tight.
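
Two quick examples applying the limit tests above:

\lim_{n\to\infty}\frac{2n}{n^2} = 0 \;\Rightarrow\; 2n \in o(n^2), \qquad \lim_{n\to\infty}\frac{n^2}{2n} = \infty \;\Rightarrow\; n^2 \in \omega(n)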

Visualization of Asymptotic Growth
[Figure: for n ≥ n0, functions in ω(f(n)) and Ω(f(n)) lie above f(n),
functions in o(f(n)) and O(f(n)) lie below it, and Θ(f(n)) tracks f(n)]

Need for Asymptotic Notation

For example, suppose the exact run-time T(n) of an algorithm
on an input of size n is T(n) = 5n² + 6n + 25 seconds. Then,
since n ≥ 0, we have 5n² ≤ T(n) ≤ 6n² for all n ≥ 9.
Thus we can say that T(n) is roughly proportional to
n² for sufficiently large values of n.
We write this as T(n) ∈ Θ(n²), or say that T(n) is in the exact
order of n².

Contd..
Generally, an algorithm with a run-time of Θ(n log n)
will perform better than an algorithm with a run-time of order
Θ(n²), provided that n is sufficiently large.
However, for small values of n, the Θ(n²) algorithm may run
faster due to having smaller constant factors.
The value of n at which the Θ(n log n) algorithm first
outperforms the Θ(n²) algorithm is called the break-even
point for the Θ(n log n) algorithm.
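
The sketch below finds such a break-even point numerically; the cost models T1(n) = 20·n·log2(n) and T2(n) = 2·n² use made-up constants chosen only for illustration:

#include <cmath>
#include <iostream>

int main() {
    // Hypothetical cost models with different constant factors.
    for (int n = 2; n <= 200; n++) {
        double t1 = 20.0 * n * std::log2(n);   // Θ(n log n) algorithm
        double t2 = 2.0 * n * n;               // Θ(n²) algorithm
        if (t1 <= t2) {                        // first n where n log n wins
            std::cout << "break-even at n = " << n << '\n';   // prints 59
            break;
        }
    }
}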

Examples
Determine whether the following is true or false:
100n + 5 ∈ O(n²)
Solution:
100n + 5 ≤ 100n + n for all n ≥ 5
= 101n ≤ 101n²
where c = 101 and n0 = 5. True.

Determine whether the following is true or false:
n² + 10n ∈ O(n²)
Solution:
With c = 2 and n0 = 10, it is true, because
n² + 10n ≤ c·g(n) for all n ≥ n0:
n² + 10n ≤ 2n² for all n ≥ 10

Show that 5n² ∈ O(n²)
Solution:
We can take c = 5 and n0 = 0:
5n² ≤ 5n² for n ≥ 0
Show that n(n-1)/2 ∈ O(n²)
Solution:
For n ≥ 0, n(n-1)/2 ≤ n·n/2 = (1/2)n², so c = 1/2 and n0 = 0.

Show that n² ∈ O(n² + 10n)
Solution:
For n ≥ 0,
n² ≤ 1 · (n² + 10n)
so c = 1 and n0 = 0.

Find the order of the function 3n + 2
Solution:
t(n) = 3n + 2
t(n) ≤ c·g(n)
3n + 2 ≤ cn
3n + 2 ≤ 4n where c = 4
If n = 1: 3 × 1 + 2 > 4, i.e. 5 > 4
If n = 2: 3 × 2 + 2 ≤ 8, i.e. 8 ≤ 8
If n = 3: 3 × 3 + 2 ≤ 12, i.e. 11 ≤ 12
t(n) ∈ O(n) where c = 4 and n0 = 2, which is called the break-even point.

Find the order of the function t(n) = 10n² + 4n + 2
Solution:
t(n) ≤ c·g(n) for all n ≥ n0
10n² + 4n + 2 ≤ cn² where g(n) = n²
10n² + 4n + 2 ≤ 11n² where c = 11
If n = 1: 16 > 11
If n = 2: 50 > 44
If n = 3: 104 > 99
If n = 4: 178 > 176
If n = 5: 272 ≤ 275
If n = 6: 386 ≤ 396
So the given function 10n² + 4n + 2 ∈ O(n²) with c = 11
and n0 = 5, which is the break-even point.

Ω(n²):
4n²
6n² + 9
5n² + 2n
4n³ + 3n²
6n⁶ + 3n⁴

Θ(n²):
4n²
6n² + 9
5n² + 2n

O(n²):
3 log n + 8
5n + 7
2 log n
4n²
6n² + 9
5n² + 2n

Typical Running Time Functions

1 (constant running time):
Instructions are executed once or a few times
log N (logarithmic):
A big problem is solved by cutting the original problem
in smaller sizes, by a constant fraction at each step
N (linear):
A small amount of processing is done on each input
element
N log N:
A problem is solved by dividing it into smaller problems,
solving them independently and combining the solutions


Performance Classification

f(n)      Classification
1         Constant: run time is fixed and does not depend on n. Most instructions
          are executed once, or only a few times, regardless of the amount of
          information being processed.
log n     Logarithmic: when n increases, so does run time, but much more slowly.
          Common in programs which solve large problems by transforming them into
          smaller problems.
n         Linear: run time varies directly with n. Typically, a small amount of
          processing is done on each element.
n log n   When n doubles, run time slightly more than doubles. Common in programs
          which break a problem down into smaller sub-problems, solve them
          independently, then combine solutions.
n²        Quadratic: when n doubles, run time increases fourfold. Practical only
          for small problems; typically the program processes all pairs of input
          (e.g. in a double nested loop).
n³        Cubic: when n doubles, run time increases eightfold.
2^n       Exponential: when n doubles, run time squares. This is often the result
          of a natural, brute-force solution.

Common growth rates

Time complexity   Name          Example
O(1)              constant      Adding to the front of a linked list
O(log N)          log           Finding an entry in a sorted array
O(N)              linear        Finding an entry in an unsorted array
O(N log N)        n-log-n       Sorting n items by divide-and-conquer
O(N²)             quadratic     Shortest path between two nodes in a graph
O(N³)             cubic         Simultaneous linear equations
O(2^N)            exponential   The Towers of Hanoi problem

Growth rates
[Figure: running time as a function of the number of inputs N, comparing
O(N²) and O(N log N); for a short time (small N), N² is better than N log N]

Standard Analysis Techniques


Constant time statements

Analyzing Loops
Analyzing Nested Loops
Analyzing Sequence of Statements
Analyzing Conditional Statements

Constant time statements

Simplest case: O(1) time statements
Assignment statements of simple data types:
int x = y;
Arithmetic operations:
x = 5 * y + 4 - z;
Array referencing:
A[j] = 5;
Array assignment:
∀ j, A[j] = 5;
Most conditional tests:
if (x < 12) ...

Analyzing Loops[1]
Any loop has two parts:
How many iterations are performed?
How many steps per iteration?
int sum = 0,j;
for (j=0; j < N; j++)
sum = sum +j;
Loop executes N times (0..N-1)
O(1) steps per iteration

Total time is N * O(1) = O(N*1) = O(N)

Analyzing Loops[2]
What about this for loop?
int sum = 0, j;
for (j=0; j < 100; j++)
sum = sum + j;
Loop executes 100 times
O(1) steps per iteration
Total time is 100 * O(1) = O(100 * 1) = O(100) = O(1)

Analyzing Nested Loops[1]


Treat just like a single loop and evaluate each level of
nesting as needed:
int j,k;
for (j=0; j<N; j++)
for (k=N; k>0; k--)
sum += k+j;

Start with outer loop:


How many iterations? N
How much time per iteration? Need to evaluate inner loop

Inner loop uses O(N) time

Total time is N * O(N) = O(N*N) = O(N²)

Analyzing Nested Loops[2]

What if the number of iterations of one loop
depends on the counter of the other?
int j,k;
for (j=0; j < N; j++)
for (k=0; k < j; k++)
sum += k+j;

Analyze inner and outer loop together:
Number of iterations of the inner loop is:
0 + 1 + 2 + ... + (N-1) = N(N-1)/2 = O(N²)

Analyzing Sequence of Statements

For a sequence of statements, compute their
complexity functions individually and add them
up
for (j=0; j < N; j++)          // O(N²)
    for (k=0; k < j; k++)
        sum = sum + j*k;
for (l=0; l < N; l++)          // O(N)
    sum = sum - l;
cout << "Sum=" << sum;         // O(1)

Total cost is O(N²) + O(N) + O(1) = O(N²)

SUM RULE

Analyzing Conditional Statements


What about conditional statements such as
if (condition)
statement1;
else
statement2;
where statement1 runs in O(N) time and statement2 runs in O(N²)
time?
We use "worst case" complexity: among all inputs of size N,
what is the maximum running time?
The analysis for the example above is O(N²)

Best Case
The best case is the input of size n that is cheapest
among all inputs of size n.
"The best case for my algorithm is n = 1 because
that is the fastest." WRONG!
This is a misunderstanding: the best case fixes the input
size n and asks which input of that size runs fastest.

Some Properties of Big O

Transitive property:
If f is O(g) and g is O(h), then f is O(h)
Product of upper bounds is an upper bound for the product:
If f is O(g) and h is O(r), then fh is O(gr)
Exponential functions grow faster than polynomials:
n^k is O(b^n), ∀ b > 1 and k ≥ 0
e.g. n^20 is O(1.05^n)
Logarithms grow more slowly than powers:
log_b n is O(n^k), ∀ b > 1 and k > 0
e.g. log2 n is O(n^0.5)
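
A one-line proof sketch of the transitive property, straight from the definition of O:

f(n) \le c_1 g(n) \ (n \ge n_1) \text{ and } g(n) \le c_2 h(n) \ (n \ge n_2) \;\Rightarrow\; f(n) \le c_1 c_2\, h(n) \text{ for all } n \ge \max(n_1, n_2)

so f is O(h) with constant c1·c2.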

Comparison of two algorithms

Consider two algorithms, A and B, for solving a
given problem.
TA(n), TB(n) are the time complexities of A and B
respectively (where n is a measure of the problem
size).
One possibility arises if we know the problem
size a priori.
For example, suppose the problem size is n0 and
TA(n0) < TB(n0). Then clearly algorithm A is better than
algorithm B for problem size n0.
In the general case,
we have no a priori knowledge of the problem size.

Cont..
Limitation:
we don't know the problem size beforehand
it is not true that one of the functions is less than or
equal to the other over the entire range of problem
sizes.
Therefore, we consider the asymptotic behavior of the
two functions for very large problem sizes.

Time Complexity vs Space Complexity
Achieving both low time and low space complexity
is difficult; it happens only in the best case
There is always a trade-off
If memory available is large
Need not compromise on time complexity
If fastness of execution is not the main concern and
memory available is less
Can't compromise on space complexity

Example
Size of data = 10 MB
Check if a word is present in the data or not
Two ways
Better Space Complexity
Better Time Complexity

Contd..
Load the entire data into main memory and check
one by one
Faster process but takes a lot of space
Load data word-by-word into main memory and
check
Slower process but takes less space

Run these algorithms

For loop:
Sum = 0;
for (i = 0; i < N; i++)
    for (j = 0; j < i*i; j++)
        for (k = 0; k < j; k++)
            Sum++;
Compare the above for loops for different inputs
(a runnable version of this and the next fragment
appears after the next slide)

Example
Conditional statements:
Sum = 0;
for (i = 1; i < N; i++)
    for (j = 1; j < i*i; j++)
        if (j % i == 0)
            for (k = 0; k < j; k++)
                Sum++;
Analyze the complexity of the above algorithm for
different inputs
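
A runnable version of both exercise fragments; the final value of Sum doubles as an exact operation count, and the test sizes for N are illustrative only:

#include <iostream>

int main() {
    for (long N : {10, 20, 40}) {
        long sum1 = 0;                         // plain triple loop
        for (long i = 0; i < N; i++)
            for (long j = 0; j < i * i; j++)
                for (long k = 0; k < j; k++)
                    sum1++;                    // executes about N^5/10 times
        long sum2 = 0;                         // version with the if-condition
        for (long i = 1; i < N; i++)
            for (long j = 1; j < i * i; j++)
                if (j % i == 0)
                    for (long k = 0; k < j; k++)
                        sum2++;                // executes about N^4/8 times
        std::cout << "N=" << N << "  Sum1=" << sum1
                  << "  Sum2=" << sum2 << '\n';
    }
}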

Summary
Analysis of algorithms
Complexity
Even with a high-speed processor and large
memory, an asymptotically inefficient algorithm
is not efficient
Trade-off between time complexity and space
complexity

References
Fundamentals of Computer Algorithms,
Ellis Horowitz, Sartaj Sahni, Sanguthevar Rajasekaran
Algorithm Design,
Michael T. Goodrich, Roberto Tamassia
Analysis of Algorithms,
Jeffrey J. McConnell

Thank You
