
Program Efficiency and Complexity

What will we be looking at:

What is an "efficient" program?


How can we measure efficiency?
The Big O, Big Theta and Big Omega Notation
Asymtotic Analysis



Why Measure Efficiency
There are many ways to solve a problem
- but which way (i.e. which algorithm) is better?

Moore's Law:
- The number of transistors on a CPU doubles every year
  (well, more precisely, every 18 months)
- i.e. system performance (almost) doubles every 18 months or so.

If this is so, and given the speed of today's CPUs, why bother worrying about
how efficient our code is?



A Classic Optimisation Problem
Many optimisation problems can be formulated as a Travelling Salesman
problem.

The Travelling Salesman problem:

- Stated: a travelling salesman has to visit 100 different locations in a town;
  what is the shortest route that he can take?
- Total number of distinct routes possible: 100! ≈ 9.3 × 10¹⁵⁷

What does this mean in terms of running time?

- A supercomputer capable of checking 100 billion routes per second can
  check roughly 3 × 10¹⁸ routes in the space of one year.
- Many millions of years would be needed to check all routes!
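
To put numbers on this, here is a quick back-of-the-envelope check (using
100! ≈ 9.3 × 10¹⁵⁷ and roughly 3.15 × 10⁷ seconds per year):

    100! / (10¹¹ routes/s × 3.15 × 10⁷ s/year)
    ≈ 9.3 × 10¹⁵⁷ / (3 × 10¹⁸ routes/year)
    ≈ 3 × 10¹³⁹ years

So "millions of years" is, if anything, a vast understatement.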



How do we Analyse an Algorithm?
A simple example:

// Input: int A[N], array of N integers
// Output: sum of all numbers in array A

int Sum(int A[], int N)
{
    int s = 0;
    for (int i = 0; i < N; i++)
        s = s + A[i];
    return s;
}

How should we analyse this?



Analysis of Sum Method
Niklaus Wirth (who created Pascal and Modula) once said that any
algorithm can be written using only three programming statement constructs:
- Sequence
- Selection
- Loop

Sequence:
- A series of statements that do not alter the path of execution within the algorithm.
- A call to another method is also considered a sequence statement.

Selection:
- Our if-else statements.

Loop:
- Our familiar for, while and do-while loops.



Analysis of Sum Method (2)
The efficiency of an algorithm is a function of the number of elements to be
processed.

If an algorithm contains no loops, its code executes in a straight line, so its
efficiency is simply a function of the number of instructions.

Algorithms with loops vary in efficiency.

Let's get back to our Sum method:

- Describe the size of the input in terms of one or more parameters:
  + The input to Sum is an array of N ints, so the size is N.

- Then, count how many steps are used for an input of that size:
  + A step is an elementary operation such as +, <, =, or A[i]


Analysis of Sum Method (3)
// Input: int A[N], array of N integers
// Output: sum of all numbers in array A

int Sum(int A[], int N)
{
    int s = 0;                   // step 1
    for (int i = 0; i < N; i++)  // steps 2 (init), 3 (test), 4 (increment)
        s = s + A[i];            // steps 5, 6, 7  (+, =, A[i])
    return s;                    // step 8
}

Steps 1, 2 and 8 (i.e. the sequence statements): executed once each.
Steps 3, 4, 5, 6 and 7: executed once per iteration of the for loop, N iterations.

Total: 5N + 3

The complexity function of the algorithm is: f(n) = 5n + 3



How 5n+3 Grows
Estimated running time for different values of n:

n = 10 => 53 steps
n = 100 => 503 steps
n = 1,000 => 5003 steps
n = 1,000,000 => 5,000,003 steps

As n grows, the number of steps grows in linear proportion to n for this
Sum function.

This makes sense, since f(n) = 5n + 3 is a linear function of n.
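
As a quick sanity check, here is a minimal instrumented sketch (my own, not
from the original slides) that tallies the elementary steps exactly the way
the slide counts them, reproducing 5n + 3:

public class SumSteps {
    static long steps;

    static int sum(int[] a, int n) {
        steps += 2;              // steps 1-2: int s = 0 and int i = 0
        int s = 0;
        for (int i = 0; i < n; i++) {
            steps += 5;          // steps 3-7 per iteration: i < N, i++, +, =, A[i]
            s = s + a[i];
        }
        steps += 1;              // step 8: return s
        return s;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000, 1000000}) {
            steps = 0;
            sum(new int[n], n);
            System.out.println("n = " + n + ": " + steps
                + " steps (5n+3 = " + (5L * n + 3) + ")");
        }
    }
}

(Like the slide's tally, this charges the final failed i < N test to no
iteration, so the counts match the table above exactly.)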



Asymptotic Complexity
What term in the previous complexity function dominates?

What about the 5 in 5n+3? What about the +3?

As n gets large, the +3 becomes insignificant

The 5 is inaccurate as different operations require varying amounts of time

What is fundamental is that the time is linear in n.

Asymptotic Complexity: As n gets large, ignore all lower order terms and
concentrate on the highest order term only:

i.e. :
- Drop lower order terms such as +3
- Drop the constant coefficient of the highest order term.
Asymptotic Complexity (2)
The 5n+3 time bound is said to "grow asymptotically" like n.

This gives us an approximation of the complexity of the algorithm (i.e. f(n) ~ n).

Ignores lots of (machine dependent) details, concentrates on the bigger picture.

Why is this useful?
- As inputs get larger, any algorithm of a smaller order will be more efficient
  than an algorithm of a larger order.

[Figure: time (steps) against input size, plotting 0.05n² = O(n²) against
3n = O(n); the quadratic curve overtakes the linear one at n = 60.]



Big O Notation
What do the O(n) and O(n²) on the previous slide mean?

This is known as the Big O notation.

It is used to express an upper bound on a function.

If f(n) and g(n) are two complexity functions then we can say:

f(n) = O(g(n)) if there exist constants c > 0 and n₀ such that
0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀

The above is read "f(n) is order g(n)", or "f(n) is big-O of g(n)".

[Figure: the function c·g(n) always dominates f(n) to the right of n₀.]
Big O Notation (2)
Think of f(n) = O(g(n)) as:
- "f(n) grows at most like g(n)", or
- "f grows no faster than g"
(ignoring constant factors, for large enough n)

Important:
- Big-O is not a function!
- Never read = as "equals"
- Examples:
  5n + 3 = O(n)
  7n² - 2n + 1 = O(n²)
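
For the first example we can exhibit concrete witnesses (one valid choice
among many): take c = 6 and n₀ = 3. Then 5n + 3 ≤ 6n holds exactly when
n ≥ 3, so 0 ≤ 5n + 3 ≤ 6·n for all n ≥ 3, and hence 5n + 3 = O(n).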

Big O is used as an upper-bound estimate, so it is OK to say something like
"the running time is at worst O(n)".

It is not OK to say the running time is "at least O(n)".



Big Omega Notation
If we want to say that a running time is at least some bound, we use Big Omega.

Big Omega notation, Ω, is used to express a lower bound on a function.

If f(n) and g(n) are two complexity functions then we can say:

f(n) = Ω(g(n)) if there exist constants c > 0 and n₀ such that
0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀

[Figure: in this instance the function c·g(n) is dominated by f(n) to the
right of n₀.]

Example: 3n + 2 = Ω(n)
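
Witnesses for this example (again, one choice among many): take c = 3 and
n₀ = 1. Then 0 ≤ 3·n ≤ 3n + 2 for all n ≥ 1, hence 3n + 2 = Ω(n).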



Big Theta Notation
If we wish to express tight bounds we use the Theta notation, Θ.

f(n) = Θ(g(n)) means that f(n) = O(g(n)) and f(n) = Ω(g(n))

[Figure: f(n) lies between c₁·g(n) below and c·g(n) above, to the right of
some n₀.]



What does this all mean?
If f(n) = Θ(g(n)) we say that f(n) and g(n) grow at the same rate,
asymptotically.

If f(n) = O(g(n)) but f(n) ≠ Θ(g(n)), then we say that f(n) is
asymptotically slower growing than g(n).

If f(n) = Ω(g(n)) but f(n) ≠ O(g(n)), then we say that f(n) is
asymptotically faster growing than g(n).

Mathematically we can express these in terms of limits as n tends to infinity.



Limit as n tends to Infinity

1. If lim_{n→∞} f(n)/g(n) = 0, then f(n) = O(g(n))

2. If lim_{n→∞} f(n)/g(n) = ∞, then f(n) = Ω(g(n))

3. If lim_{n→∞} f(n)/g(n) = c (a constant, 0 < c < ∞), then f(n) = Θ(g(n))
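
For instance, with f(n) = 5n + 3 and g(n) = n from earlier:
lim_{n→∞} (5n + 3)/n = 5, a constant with 0 < 5 < ∞, so by rule 3 we get
5n + 3 = Θ(n) (and therefore also O(n) and Ω(n)).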



Which Notation do we use?

To express the efficiency of our algorithms, which of the three notations
should we use?

As computer scientists we generally like to express our algorithms in big O,
since we would like to know the upper bounds of our algorithms.

Why?

If we know the worst case then we can aim to improve it and/or avoid it.



Common Orders of Growth
Let n be the input size, and b and k be constants.

In order of increasing complexity:

O(k) = O(1)            Constant time
O(log_b n) = O(log n)  Logarithmic time
O(n)                   Linear time
O(n log n)
O(n²)                  Quadratic time
O(n³)                  Cubic time
...
O(kⁿ)                  Exponential time
O(n!)                  Factorial time



Why Avoid Exponential Time Algorithms?

Suppose a program has run time O(n!) and the run time for
n = 10 is 1 second.

- For n = 12, the run time is 2 minutes
- For n = 14, the run time is 6 hours
- For n = 16, the run time is 2 months
- For n = 18, the run time is 50 years
- For n = 20, the run time is 200 centuries!



Comparing Complexity

What happens if we double the input size n?

n     log₂ n   5n     n log₂ n   n²       2ⁿ
8     3        40     24         64       256
16    4        80     64         256      65536
32    5        160    160        1024     ~10⁹
64    6        320    384        4096     ~10¹⁹
128   7        640    896        16384    ~10³⁸
256   8        1280   2048       65536    ~10⁷⁷
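
Reading down each column as n doubles: log₂ n grows by just 1, 5n doubles,
n log₂ n slightly more than doubles, n² quadruples, and 2ⁿ squares
(note that 65536 = 256²). That squaring is what makes the last column explode.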



Constant Time Statements
Simplest case: O(1) time statements

Assignment statements of simple data types:
    int x = y;
Arithmetic operations:
    x = 5 * y + 4 - z;
Array referencing:
    A[j] = 5;
Most conditional tests:
    if (x < 12) ...



Analysing Loops: Linear Loops
Example (have a look at this code segment):

int i = 1;                              // executed once
while (i < n)                           // about n comparisons
{
    <<list of sequence statements>>     // presume a constant c steps here,
                                        // giving c*(n-1) steps in total
    i++;                                // executed n-1 times
}

Efficiency is proportional to the number of iterations.

The efficiency time function is:

f(n) = 1 + (n-1) + c*(n-1) + (n-1)
     = (c+2)*(n-1) + 1
     = (c+2)*n - (c+2) + 1

Asymptotically, efficiency is O(n).



Analysing Loops: Logarithmic Loops
Example (have a look at this code segment):

int i = 1;                              // executed once
while (i < 1000)                        // 1000 comparisons? not quite!
{                                       // it depends on the variable i
    <<list of sequence statements>>     // again presume a constant c steps here;
    i = i * 2;                          // the number of executions also
}                                       // depends on the variable i

What is i for each iteration?

In the code segment above we cannot say that we iterate 1000 times, as the
number of times we iterate is governed by the variable i.

The variable i does not change in a linear fashion in this case.

Let's have a look at what happens to i during each iteration.



Analysing Loops: Logarithmic Loops (2)
i is initially 1.

Iteration   i in the (i < 1000) check   i after the statement i = i * 2;
1           1                           1*2 = 2
2           2                           2*2 = 4
3           4                           4*2 = 8
4           8                           8*2 = 16
5           16                          16*2 = 32
6           32                          32*2 = 64
7           64                          64*2 = 128
8           128                         128*2 = 256
9           256                         256*2 = 512
10          512                         512*2 = 1024
11          1024                        EXIT LOOP!

On inspection we can see that after x iterations the value of i is 2 to the
power of x (i.e. i = 2ˣ).

Expressing this in terms of logarithms, for a loop bound of n we can say that
there are log₂ n iterations.
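
As a quick check, here is a throwaway sketch (mine, not from the slides) that
counts the iterations for any bound n; for n = 1000 it reports 10, matching
the table above:

public class LogLoop {
    public static void main(String[] args) {
        int n = 1000;
        int iterations = 0;
        int i = 1;
        while (i < n) {          // same loop shape as above
            i = i * 2;           // i doubles, so i = 2^x after x iterations
            iterations++;
        }
        // prints: iterations = 10, log2(1000) ~ 9.97
        System.out.println("iterations = " + iterations
            + ", log2(" + n + ") ~ " + (Math.log(n) / Math.log(2)));
    }
}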
Analysing Loops: Logarithmic Loops (3)

So, we know that we loop log₂ n times.

This means that all the steps within our loop get done log₂ n times.

So our time complexity function is:

f(n) = 1 + log₂ n + c*log₂ n + log₂ n
     = 1 + (c+2)*log₂ n

Expressing this asymptotically, efficiency is O(log₂ n).



Analysing Loops: Nested Loops
Example (have a look at this code segment; the slide's annotations analyse it
for a general bound n rather than the literal constant 20):

int i = 1;               // executed once
while (i <= n)           // n comparisons
{
    int j = 1;           // initialisation done n times
    while (j <= n)       // n comparisons, done n times over
    {
        j++;             // executed n times, n times over
    }
    i++;                 // executed n times
}

Treat it just like a single loop and evaluate each level of nesting as needed.

The total number of iterations is the product of the number of inner loop
iterations and the number of outer loop iterations.
Analysing Loops: Nested Loops (2)
Our time complexity function for this bit of code is then:

f(n) = 1 + 3n + n*(2n)
       (outer loop: comparison, j initialisation and i++, n times each;
        inner loop: 2n steps, repeated for each of the n outer iterations)
     = 1 + 3n + 2n²

Asymptotically, efficiency is O(n²).



Analysing Loops: Nested Loops (3)
What if the number of iterations of one loop depends on the counter of the
other?

int j, k;
for (j = 0; j < N; j++)
    for (k = 0; k < j; k++)
        sum += k + j;

Solution:
Analyse the inner and outer loop together:
- Number of iterations of the outer and inner loop together:
  0 + 1 + 2 + ... + (N-1) = O(N²)



How Did We Get This Answer?
When doing Big-O analysis, we sometimes have to compute
a series like: 1 + 2 + 3 + ... + (n-1) + n

i.e. the sum of the first n numbers. What is the complexity of this?

Gauss figured out that the sum of the first n numbers is always:

1 + 2 + ... + n = n*(n+1)/2 = (n² + n)/2 = O(n²)

If we had analysed each loop separately in the previous code, we would have
worked out that the outer loop iterates n times and the inner loop iterates
(n-1)/2 times on average, so the total number of iterations is n*(n-1)/2:
Gauss's equation with n-1 in place of n, and still O(n²).
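
As an empirical check, here is a small sketch (mine, not from the slides)
that counts how many times the inner statement of the previous slide's loop
executes and compares it with N*(N-1)/2:

public class GaussCheck {
    public static void main(String[] args) {
        int N = 1000;
        long count = 0, sum = 0;
        for (int j = 0; j < N; j++)
            for (int k = 0; k < j; k++) {
                sum += k + j;    // the work from the slide
                count++;         // tally inner-loop executions
            }
        // prints: count = 499500, N*(N-1)/2 = 499500
        System.out.println("count = " + count
            + ", N*(N-1)/2 = " + ((long) N * (N - 1) / 2));
    }
}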
Sequence of Statements
For a sequence of statements, compute their complexity functions
individually and add them up
for (j=0; j < N; j++)
for (k =0; k < j; k++) O(N2)
sum = sum + j*k;
for (l=0; l < N; l++)
sum = sum -l; O(N)
System.out.print("sum is now+sum); O(1)

Total cost is O(n2) + O(n) +O(1) = O(n2)



Conditional Statements
What about conditional statements such as
if (condition)
    statement1;
else
    statement2;

where statement1 runs in O(n) time and statement2 runs in O(n²) time?

We use "worst case" complexity: among all inputs of size n, what is the
maximum running time?

The analysis for the example above is O(n²).



Recurrence Relations
So far, all the algorithms we have been analysing have been non-recursive.

The running time of a recursive algorithm is naturally given by a recurrence
relation.

Recurrence relation: expresses the running time of a recursive algorithm for
inputs of size N in terms of smaller sized inputs.

We must solve the recurrence to derive the running time.

There are many ways to solve a recurrence equation once derived. We will look
at three here:
- Iteration method
- Recursion tree
- Master method



Deriving A Recurrence Equation
Let's have a look at a simple example of deriving a recurrence equation.

Example: a recursive power method

double power(double x, int n) {
    if (n == 0)
        return 1.0;                // base case
    return power(x, n-1) * x;      // recursive case
}

If N = 0, the running time T(N) is 2 steps (the test and the return),
i.e. T(0) = 2.

However, if N ≥ 1, the running time T(N) is the constant cost of the current
call (again 2 steps) plus the time required to compute power(x, n-1),
i.e. T(N) = 2 + T(N-1) for N ≥ 1.

How do we solve this? One way is to use the iteration method.



Iteration Method
This is sometimes known as Back Substituting.

Involves expanding the recurrence in order to see a pattern.

Solving formula from previous example using the iteration method :

Solution: expand and apply to itself, remembering that T(0) = 2:

T(N) = 2 + T(N-1)
     = 2 + 2 + T(N-2)
     = 2 + 2 + 2 + T(N-3)
     = 2 + 2 + 2 + ... + 2 + T(0)    (N twos)
     = 2N + 2

So T(N) = 2N + 2, which is O(N), for the last example.

There are some common recurrence relations that appear time and time again which we will
introduce on the next slide along with the actual solution.



Common Recurrence Relations
The actual solving of the recurrence relations is left as an exercise for you to do in your tutorial.

T(1) = 1 for N = 1
T(N) = T(N-1) + N for N ≥ 2
This recurrence arises for a recursive algorithm that loops through the input
to eliminate one item.
Solution: T(N) = N*(N+1)/2 = O(N²)

T(1) = 1 for N = 1
T(N) = T(N/2) + 1 for N ≥ 2
This recurrence arises for a recursive algorithm that halves the input in one
step.
Hint for solving this: assume that N = 2ⁿ, so that the recurrence is always
defined (note that this means n = log₂ N).
Solution: T(N) = lg N + 1 = O(lg N)

T(1) = 0 for N = 1
T(N) = T(N/2) + N for N ≥ 2
This recurrence arises for a recursive program that halves the input, but
perhaps must examine every item in the input.
Hint for solving this: expand out to a geometric series and reason it out.
Solution: T(N) = 2N = O(N)

T(1) = 0 for N = 1
T(N) = 2T(N/2) + N for N ≥ 2
This recurrence applies to a family of standard divide-and-conquer algorithms.
Solution: T(N) = N lg N = O(N lg N)
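
To make the second recurrence concrete, here is a sketch (my own, not from
the slides) of the earlier power method rewritten to halve n at each step:
each call does a constant number of steps plus one recursive call on an input
of half the size, so its running time satisfies T(N) = T(N/2) + c, i.e. O(lg N).

// Divide-and-conquer power: uses x^n = (x^(n/2))^2, times x when n is odd
double power(double x, int n) {
    if (n == 0)
        return 1.0;                  // base case
    double half = power(x, n / 2);   // one recursive call on half the input
    double result = half * half;
    if (n % 2 == 1)
        result = result * x;         // odd n needs one extra factor of x
    return result;
}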



Recursion Tree
The last formula is an example of an algorithm that splits the problem
into two halves and seeks to solve both.

These algorithms sometimes lend themselves to a pictorial method of solution
using recursion trees.

T(1) = 0 for N = 1
T(N) = 2T(N/2) + N for N ≥ 2

                                     Size of remaining argument    Additions

T(N)                                 N                             => N

T(N/2)      T(N/2)                   N/2   N/2                     => N

T(N/4) T(N/4) T(N/4) T(N/4)          N/4   N/4   N/4   N/4         => N



Recursion Tree Solution
In this case we can see that every level of the tree contributes N additions,
and since the tree has height lg N the solution is:

T(N) = N lg N

With all of these methods we are seeking a pattern which makes the solution
simple.

If applying a method does not lead to a simplification of the solution then
we may need to apply a different one.
The Master Method
The Master Theorem: in general, the Master Theorem says that the recurrence

T(1) = d
T(n) = a*T(n/b) + c*n

has solution:

T(n) = O(n)             if a < b
T(n) = O(n lg n)        if a = b
T(n) = O(n^(log_b a))   if a > b



Example
Take the following recurrence equation:

T(1) = 0 for n = 1
T(n) = 4T(n/2) + c*n for n ≥ 2

This gives a = 4, b = 2.

Therefore a > b, and:

T(n) = O(n^(log₂ 4))
     = O(n²)
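
For comparison (an example of the a = b case, not on the slide): mergesort's
recurrence T(n) = 2T(n/2) + c*n has a = b = 2, so the theorem gives
T(n) = O(n lg n), agreeing with the divide-and-conquer recurrence solved
earlier by the recursion tree.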

