
Data Structures 1st module

Data Structures and programming methodologies

Module 1

Principles of programming - System Life Cycle - Algorithm Specification - Recursive Algorithms -
Documentation - Performance Analysis and Measurements - Time and Space complexity - Complexity
calculation of simple algorithms.

Introduction
Until now we have been learning how to write simple programs in a particular programming
language. We studied some programming language constructs and used them to solve simple
problems. At the next level we have to consider the solution of large, complex problems and
the tools needed to implement large-scale computer systems.
For large, complex problems, simply taking the problem and coding it in a programming
language will not work. We need a solid foundation in data abstraction and encapsulation,
algorithm specification, and performance analysis and measurement.

System Life cycle

Large-scale computer systems can be treated as systems that contain many complex interacting
parts. Such programs undergo a development process called the system life cycle. The phases
of this cycle are requirements, analysis, design, refinement and coding, and verification.
i. Requirements
When a large programming project is assigned to a group of developers, they are also
given a set of specifications that define the purpose of the project. These requirements
describe the input information the developers are given and the output results they must
produce. The specifications are often stated loosely, so the developers must work out
rigorous descriptions of the inputs and outputs.
ii. Analysis
In this phase the developers break the problem down into a number of manageable
pieces. There are two approaches: bottom-up and top-down. The top-down approach is usually
the better one for our problems. In the bottom-up approach a programmer starts from the
bottom, i.e. builds the pieces individually and then tries to connect them. This often
leads to deficiencies in the solution, since there is no master plan for the project.
In top-down analysis, a high-level plan for dividing the problem into manageable
segments is developed first. This plan is then refined to take lower-level details into
account. Several alternative solutions to the problem are developed and compared during
this phase. Diagrams that show the flow of data through the system, called dataflow
diagrams, are used to represent the system. The top-down approach is the preferred one for
developing complex software systems.

iii. Design
In this phase the programmer introduces the data objects and the operations on them to the
system. The first leads to the creation of data structures, for which we nowadays use the
St. Joseph’s College of Engineering & Technology Palai Department of Computer Science & Engineering
term abstract data types. The second leads to the specification of algorithms and of methods
for designing them. We postpone coding these algorithms to the next phase, because code
depends on a particular programming language. Keeping the design language-independent lets
us use any of several programming languages to build the system and select the most
efficient implementation of each algorithm.
iv. Refinement and coding
In this phase we choose representations for our data objects and write algorithms
for the operations on them. A number of representations are available, for example
stacks, queues and linked lists.
In the refinement part we consult with others or look for possibilities of
developing a better system. If the design of the system changes, the original design
should be able to absorb these changes quickly. This is the reason why we postpone the
coding part to the last.
v. Verification
In this phase we develop correctness proofs for the program, test the program
with a variety of input data, and remove errors.
Correctness proofs
Mathematical techniques can be used to prove the correctness of programs, but this
is very time consuming and difficult for large projects. Selecting algorithms that
have already been proved correct reduces the number of errors. This is the reason why we
study many algorithms in this subject.
Testing
Testing requires the working code and a set of test data. The data should be developed
carefully so that it covers all cases. Good test data should verify that every piece of code
runs correctly. We run the code with the test data as input and check whether it produces
the correct results. A number of software testing tools are now available.
Error removal
Correctness proofs and testing reveal errors in the system. How easily these errors
can be removed depends on the design decisions made in the earlier phases.

Data abstraction and encapsulation

These concepts can be explained by considering a real-world example. Consider a television
set. All interaction with the TV is done through the switches on its outside, so we never
interact with the circuitry inside. The internal representation is hidden from the user.
This is the principle of data encapsulation.
We know what will happen when we press a particular button of the TV, but we do not know
how it is implemented inside, i.e. we know what a switch does but not how it does it.
This is the principle of abstraction.

Data encapsulation is the hiding of the implementation details of a data object from the
outside world.
Data abstraction is the separation between the specification of a data object and its
implementation.

Abstract data types

A data type is a collection of objects and a set of operations that act on these objects.
For example, int is a built-in data type. Its set of objects is {…, -2, -1, 0, 1, 2, 3, …}.
Some operations on these objects are addition, subtraction, etc.

Nowadays we use the term abstract data type rather than data structure.
An abstract data type is a data type organized in such a way that the specification of the
objects and of the operations on these objects is separated from the representation of the
objects and the implementation of the operations.
Take the case of the natural numbers. They can be treated as an abstract data type.
The set of objects is the numbers 0, 1, 2, …
The set of operations on these numbers is addition, subtraction, etc.
In a computer system these numbers may be represented, for example, as two bytes per number,
and the operations may be implemented by some combination of software and hardware
circuitry. Neither detail is part of the specification.
The ADT specification of the natural numbers is as follows.

ADT Natural Number

Objects: integers 0, 1, 2, … up to some maximum value

Functions or operations:
For all x, y ∈ Natural Number,
where +, -, *, <, == etc. are the usual integer operations.
Add(x, y): Natural Number : Add = x + y
Equal(x, y): Boolean : if x == y then Equal = true
else Equal = false
Subtract(x, y): Natural Number : Subtract = x - y

End Natural Number
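
The specification above can be sketched in C as follows. This is only an illustration: the type name Natural, the nat_ function names and the choice of unsigned int as the representation are assumptions of this sketch, not part of the specification. The specification also does not say what happens when a result falls outside the range 0 to the maximum value, so the sketch simply requires the caller to avoid those cases.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical representation: the text only says "some maximum value". */
typedef unsigned int Natural;
#define NATURAL_MAX UINT_MAX

/* Add(x, y). Assumption of this sketch: the sum must not exceed NATURAL_MAX. */
Natural nat_add(Natural x, Natural y)
{
    assert(x <= NATURAL_MAX - y);
    return x + y;
}

/* Equal(x, y): true when both arguments denote the same number. */
bool nat_equal(Natural x, Natural y)
{
    return x == y;
}

/* Subtract(x, y). Assumption of this sketch: x >= y, so the result is natural. */
Natural nat_subtract(Natural x, Natural y)
{
    assert(x >= y);
    return x - y;
}
```

A caller uses only nat_add, nat_equal and nat_subtract. If the representation were later changed, code written against these three operations would not need to change, which is the point of separating specification from implementation.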

Algorithm Specification

An algorithm is a finite set of instructions that accomplishes a particular task.

All algorithms should have the following properties:
1. Input: zero or more quantities are supplied from outside.
2. Output: at least one quantity is produced.
3. Definiteness: each instruction is clear and unambiguous.
4. Finiteness: the algorithm terminates after a finite number of steps.
5. Effectiveness: every instruction must be basic enough to be carried out.

An algorithm can be distinguished from a program: a program need not be finite, that is, it
may be non-terminating. For example, an operating system is a program that waits
indefinitely in a loop.

Examples of algorithms
1. Selection sort
Statement
From those integers that are currently unsorted, find the smallest and place it next in the
sorted list.

Algorithm

void sort(int *a, int n)
// sort the integers a[0] to a[n-1] in ascending order
{
    for (int i = 0; i < n; i++)
    {
        int j = i; /* j is the index of the smallest element found so far */
        // find the smallest integer in a[i] to a[n-1]
        for (int k = i + 1; k < n; k++)
            if (a[k] < a[j]) j = k;
        // interchange a[i] and a[j]
        int temp = a[i]; a[i] = a[j]; a[j] = temp;
    }
}

2. Binary search

Here we search for a particular element in a sorted array.

We divide the list into two halves and compare the element with the middle element.
Depending on the comparison, we move to the lower or the upper half and repeat the
above steps until the element is found or no part of the list is left to search.
The algorithm returns the index of the element in the array if it is found; otherwise it
returns -1.

int compare(int x, int y);

int binarysearch(int *a, int x, int n)
// search the sorted array a[0] to a[n-1] for the element x
{
    int lower = 0;
    int upper = n - 1;
    while (lower <= upper)
    {
        int mid = (lower + upper) / 2;
        int r = compare(x, a[mid]);
        switch (r)
        {
            case -1: upper = mid - 1; break; /* x is smaller: search in lower part */
            case 0: return mid; /* item is found */
            case +1: lower = mid + 1; break; /* x is larger: search in upper part */
        }
    }
    return -1; /* item not found */
}

int compare(int x, int y)
/* compare x and y: return -1 if x < y, +1 if x > y, 0 if x == y */
{
    if (x > y) return +1;
    else if (x < y) return -1;
    else return 0;
}

Recursive algorithms

One effective way to make a program readable is to make it modular, i.e. to divide the
program into a number of functions. A function is invoked from one point, is executed,
and returns its result to the invoking point.
Recursion is the method by which a function calls itself. There are two forms.
One is direct recursion, in which a function calls itself before it is done. The other is
indirect recursion, in which a function calls other functions that in turn invoke the
calling function.
A computer system treats a recursive call as an ordinary function call.
To solve a problem using recursion, the programmer first has to formulate the method.
Any problem that can be solved with a loop (for or while) can also be solved using
recursion, but the recursive version is not always easier to write.
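
The two forms of recursion, and the claim that a loop can be rewritten recursively, can be sketched as follows. The functions sum_loop, sum_rec, is_even and is_odd are illustrative names chosen for this sketch; they do not appear elsewhere in these notes.

```c
#include <assert.h>
#include <stdbool.h>

/* Loop version: sum of the integers 1..n. */
int sum_loop(int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Direct recursion: the function calls itself on a smaller n. */
int sum_rec(int n)
{
    assert(n >= 0);
    if (n == 0) return 0;          /* nothing left to add */
    return n + sum_rec(n - 1);
}

/* Indirect recursion: is_even calls is_odd, which calls is_even again. */
bool is_odd(int n);

bool is_even(int n)
{
    assert(n >= 0);
    return n == 0 ? true : is_odd(n - 1);
}

bool is_odd(int n)
{
    assert(n >= 0);
    return n == 0 ? false : is_even(n - 1);
}
```

Both sum_loop(10) and sum_rec(10) compute the same value. The recursive version replaces the loop variable by the shrinking argument n.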
Example using recursion:
To find the factorial of a number

int fact(int n)
{
    if (n == 0) return 1;
    else
        return n * fact(n - 1);
}

Every recursive program consists of 2 parts:

1. A smallest, base case that is processed without recursion.
2. A general method that reduces a particular case to one or more smaller cases,
thereby making progress toward eventually reducing the problem all the way to the base
case.

Designing recursive algorithms.

Some important steps are


Find the key step

In the above example, n*fact(n-1) is the key step.

Find a stopping rule
The stopping rule indicates that the problem, or a suitable part of it, is done.
In the above example, if (n == 0) return 1 is the stopping rule.

Outline the algorithm
Combine the stopping rule and the key step to formulate a suitable algorithm.
Here we have combined them using an if-else statement.

Check termination
Check that the algorithm terminates after a finite number of steps.

3. Fibonacci numbers

int Fibonacci(int n)
{
    if (n <= 0)
        return 0;
    else if (n == 1)
        return 1;
    else
        return Fibonacci(n - 1) + Fibonacci(n - 2);
}

Performance analysis and measurement

Performance analysis means analysing a program before it is executed on a system.
In performance measurement, we measure the actual time taken to run the program by
executing it on a system.

Performance analysis of a program is done by analysing its time and space requirements.

There are usually different methods to solve a particular problem. For example, for
searching we have two algorithms so far: sequential search and binary search (in a sorted
array). So we have to find out which program is more efficient. We judge this in terms of
time and memory space: how much space does each program occupy in memory, and how much
time does each program need to run? In the analysis part we therefore compare algorithms
that solve the same problem in terms of space and time.
Performance analysis is also called a priori estimation, since it is done prior to execution.

Performance analysis
Space Complexity
Here we find out how much memory space a program occupies.
We do not calculate the exact space required by a program; we only take an estimate.
The space needed by a program is the sum of the following components.
1. A fixed part that is independent of the instance characteristics (the number and size of
the inputs and outputs). It includes the space for the program code, simple variables,
constants, etc.
2. A variable part that consists of the space needed for variables whose size depends on the
particular problem instance being solved, the space needed by referenced variables,
and the recursion stack space.
The space requirement of a program is thus written as c + Sp,
where c is the fixed part and Sp is the part that depends on the instance characteristics.
Since we are taking only an estimate of the space complexity, we consider only Sp.
Some examples illustrate how to find the space complexity.
1.
void sum(int x, int y)
{
    int a, b;
    a = x + y;
    b = 2 * x;
}

In this case the space needed by the program is fixed, i.e. the space for a, b, x and y
is fixed. Whatever instance of the problem we execute, its space does not change, so
Sp = 0.

2.
void nsum(int *a, int n)
/* array a consisting of n elements */
{
    for (int i = 0; i < n; i++)
    {
        a[i] = 5;
    }
}

In this example program nsum() we pass an array with n elements. It may seem that the
program's space depends on the value of n, but this is not the case. The program requires
space only for the variables n, a and i, where a is just the address of the first array
element. Its space requirement is therefore constant for every instance on which it is
executed, although its time changes with the instance. So the instance characteristics
give Sp = 0.

3.
int fact(int n)
{
    if (n == 0) return 1;
    else return n * fact(n - 1);
}

This is a recursive program to find the factorial of a number. The function is called
recursively a number of times that depends on n, so it uses stack space in memory for each
of the nested function calls. The stack stores return values, return addresses and local
variables, and it grows as n increases. So this program's space depends on the particular
problem instance. Each call requires 3 words (space for n, the return value and the return
address). Since the depth of recursion is n+1, the recursion stack space needed is 3(n+1).
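
The depth of recursion can be observed directly. In the sketch below, the counters depth and max_depth and the helper fact_depth are additions made for illustration; fact_iter is an iterative alternative, shown only to contrast a function whose space does not grow with n.

```c
#include <assert.h>

static int depth = 0;      /* number of calls currently active */
static int max_depth = 0;  /* deepest nesting observed so far */

/* Recursive factorial that records how deep the calls nest. */
long fact_rec(int n)
{
    assert(n >= 0);
    depth++;
    if (depth > max_depth) max_depth = depth;
    long r = (n == 0) ? 1 : n * fact_rec(n - 1);
    depth--;
    return r;
}

/* Iterative factorial: a fixed set of locals, no recursion stack. */
long fact_iter(int n)
{
    assert(n >= 0);
    long r = 1;
    for (int i = 2; i <= n; i++)
        r *= i;
    return r;
}

/* Returns the maximum recursion depth reached by fact_rec(n). */
int fact_depth(int n)
{
    depth = 0;
    max_depth = 0;
    (void)fact_rec(n);
    return max_depth;
}
```

For n = 5 the calls are fact_rec(5), fact_rec(4), and so on down to fact_rec(0), which is 6 = n+1 nested calls, matching the depth used in the estimate 3(n+1).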

Time complexity

The time taken by a program is the sum of its compile time and its run time. Compile time
does not depend on the instance characteristics, and a compiled program can be run many
times without recompilation. So we consider only the run time when analysing time complexity.
The run time depends on the time taken to execute each statement of the program, and
statements differ in complexity. The time for a program therefore depends on the number of
additions, subtractions, multiplications, assignments, etc.
For example:
a = b + c
d = a + b * c / r + s + f

The second statement takes longer than the first. So it is not possible to evaluate the
actual time just by looking at the program. Instead we assume that each meaningful step of
the program takes one unit of time to execute, and we count the number of steps in the
program to estimate its time.

The step count assigned to each kind of statement is as follows.
1. Comments: step count is zero.
2. Declarative statements: count is zero.
3. Expressions and assignment statements: count is 1.
4. Iteration statements (for, do-while and while): the count depends on the expressions
present in the statement.
5. Switch statement: it consists of a header followed by a number of cases. The switch
header is given a count of 1 since it contains an expression; the count for the cases
depends on the condition.
6. If-else statement: the count also depends on the expressions present in the statement.
7. Function invocation: given a step count of one.
8. Memory management statements (e.g. malloc, calloc): given a step count of one.
9. Function statements: given a step count of zero, since they are counted at their
invocation as described in rule 7.
10. Jump statements (e.g. continue, break, return): given a step count of one.
Using the above rules we can find out how many steps are present in a program.
There are two ways to find the step count of a program: the step count method and the step
table method.

Step count method

Here we introduce a variable count into the program. Each time a meaningful statement is
executed, count is incremented by one.

For example

int sum(int *a, int n)
{
    int i;
    int s = 0;
    for (i = 0; i < n; i++)
    {
        a[i] = i;
    }
    return i;
}

We can introduce the variable count into the program as

int sum(int *a, int n)
{
    int i;
    int s = 0;
    count++; /* for the statement int s = 0 */
    for (i = 0; i < n; i++)
    {
        count++; /* for the for statement */
        a[i] = i;
        count++; /* for a[i] = i */
    }
    count++; /* for the last test of the for statement */
    count++; /* for return */
    return i;
}
After executing the above program we get the value 2n+3 in the variable count: one step for
the initialization, two steps for each of the n iterations, one for the last test of the
for statement, and one for the return. The number of steps in this program depends on n.

Eg 2:
int fact(int n)
{
    if (n == 0)
        return 1;
    else
        return n * fact(n - 1);
}

int fact(int n)
{
    count++; /* for if */
    if (n == 0)
    {
        count++; /* for return statement */
        return 1;
    }
    else
    {
        count++; /* for return statement */
        return n * fact(n - 1);
    }
}

We get the value 2n + 2 in count after the execution.

This program uses recursion, so we use a recurrence relation to find the count.
When n = 0, the count is 2 (only the then part is executed), so t(0) = 2.
For n > 0, each call contributes 2 steps plus the steps of the recursive call:
t(n) = 2 + t(n-1)
     = 2 + 2 + t(n-2)
     ----
     ----
     = 2n + t(0)
     = 2n + 2

So the count is 2n+2.

This method is useful for telling how the run time of a program changes with changes in
the instance characteristics. As n doubles, the step count also approximately doubles.

Eg 3:
void add(int **a, int **b, int **c, int m, int n)
{
    for (int i = 0; i < m; i++)
    {
        for (int j = 0; j < n; j++)
        {
            c[i][j] = a[i][j] + b[i][j];
        }
    }
}

void add(int **a, int **b, int **c, int m, int n)
{
    for (int i = 0; i < m; i++)
    {
        count++; /* for the outer for loop */
        for (int j = 0; j < n; j++)
        {
            count++; /* for the inner for loop */
            count++; /* for the statement below */
            c[i][j] = a[i][j] + b[i][j];
        }
        count++; /* for the last test of the inner for loop */
    }
    count++; /* for the last test of the outer for loop */
}

If we count the steps we get 2mn+2m+1 for count.
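
The three counts obtained by the step count method (2n+3, 2n+2 and 2mn+2m+1) can be checked by running instrumented versions. The sketch below makes some assumptions of its own: the counter is named step_count, the matrices are stored as flat arrays with element (i, j) at index i*n + j, and the steps_ helper functions and their size limits exist only to reset the counter and supply storage.

```c
#include <assert.h>

static long step_count = 0;   /* global step counter */

int sum(int *a, int n)
{
    int i;
    int s = 0; step_count++;          /* initialization */
    (void)s;
    for (i = 0; i < n; i++) {
        step_count++;                 /* loop test that succeeded */
        a[i] = i; step_count++;       /* assignment */
    }
    step_count++;                     /* last, failing loop test */
    step_count++;                     /* return */
    return i;
}

int fact(int n)
{
    step_count++;                     /* if test */
    step_count++;                     /* return statement, either branch */
    if (n == 0) return 1;
    return n * fact(n - 1);
}

/* Flat m x n matrices (layout assumed for this sketch). */
void add(const int *a, const int *b, int *c, int m, int n)
{
    for (int i = 0; i < m; i++) {
        step_count++;                 /* outer loop test */
        for (int j = 0; j < n; j++) {
            step_count++;             /* inner loop test */
            c[i * n + j] = a[i * n + j] + b[i * n + j]; step_count++;
        }
        step_count++;                 /* last inner loop test */
    }
    step_count++;                     /* last outer loop test */
}

long steps_sum(int n)
{
    static int buf[256];
    assert(n >= 0 && n <= 256);
    step_count = 0;
    sum(buf, n);
    return step_count;
}

long steps_fact(int n)
{
    assert(n >= 0 && n <= 12);        /* 12! still fits in a 32-bit int */
    step_count = 0;
    fact(n);
    return step_count;
}

long steps_add(int m, int n)
{
    static int a[64], b[64], c[64];
    assert(m >= 0 && n >= 0 && m * n <= 64);
    step_count = 0;
    add(a, b, c, m, n);
    return step_count;
}
```

For instance, steps_sum(10) gives 23 = 2(10)+3, steps_fact(5) gives 12 = 2(5)+2, and steps_add(3, 4) gives 31 = 2(3)(4)+2(3)+1.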

Step table method


For example

void sum(int *a, int n)
1 {
2     int i;
3     int s = 0;
4     for (i = 0; i < n; i++)
5     {
6         a[i] = i;
7     }
8 }

line   s/e   frequency   total steps
1      0     1           0
2      0     1           0
3      1     1           1
4      1     n+1         n+1
5      0     n           0
6      1     n           n
7      0     n           0
8      0     1           0
                         total = 2n+2

Step table for sum()

s/e indicates the number of steps per execution of the statement. The total steps for a
line is its s/e multiplied by its frequency.


Eg 2
int fact(int n)
1 {
2     if (n == 0)
3         return 1;
4     else
5         return n * fact(n - 1);
6 }

line   s/e        frequency      total steps
                  n=0    n>0     n=0    n>0
1      0          1      1       0      0
2      1          1      1       1      1
3      1          1      0       1      0
4      0
5      1+t(n-1)   0      1       0      1+t(n-1)
6      0

total =                          2      2+t(n-1)
                                 (for n=0)  (for n>0)

Step table for fact()

Eg 3
void add(int **a, int **b, int **c, int m, int n)
1 {
2     for (int i = 0; i < m; i++)
3     {
4         for (int j = 0; j < n; j++)
5         {
6             c[i][j] = a[i][j] + b[i][j];
7         }
8     }
9 }

line   s/e   frequency   total steps
1      0     1           0
2      1     m+1         m+1
3      0     m           0
4      1     m(n+1)      m(n+1)
5      0     mn          0
6      1     mn          mn
7      0     mn          0
8      0     m           0
9      0     1           0
                         -------
             total steps = 2mn+2m+1

The time complexity of a program is given by the number of steps the program takes to
compute the function it was written for. The number of steps is itself a function of the
instance characteristics. In the sum example the step table gives 2n+2 steps, so the time
depends on the variable n, which is an instance characteristic. For the add program the
count is 2mn+2m+1, i.e. the time depends on the instance characteristics m and n.

The programs discussed above are simple enough that the time complexity depends only on
some variable present in the program. For the binary search program, however, the time does
not depend only on the number of elements n in the sorted array; it also depends on the
position in the array of the element being searched for. As the index of the item varies,
the step count also changes. So we introduce the worst-case, average-case and best-case
step counts when we analyse our programs.
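
The dependence on the position of the item can be seen by counting loop iterations in binary search. The function binarysearch_counted and its iterations output parameter are introduced for this sketch only; the search logic follows the binary search described earlier.

```c
#include <assert.h>

/* Binary search over sorted a[0..n-1] that also reports, through
   *iterations, how many times the loop body ran. Returns the index
   of x, or -1 if x is not present. */
int binarysearch_counted(const int *a, int n, int x, int *iterations)
{
    assert(n >= 0);
    int lower = 0, upper = n - 1;
    *iterations = 0;
    while (lower <= upper) {
        int mid = lower + (upper - lower) / 2;
        (*iterations)++;
        if (x < a[mid]) upper = mid - 1;        /* continue in lower part */
        else if (x > a[mid]) lower = mid + 1;   /* continue in upper part */
        else return mid;                        /* found */
    }
    return -1;                                  /* not found */
}
```

With an array of 15 elements holding 0 to 14, searching for 7 (the middle element) takes 1 iteration, which is the best case, while searching for a value that is absent, such as 100, takes 4 iterations, which is the worst case for this size.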

Asymptotic notation (O, Ω, θ )


(ref: Fundamentals of Data Structures by Horowitz & Sahni;
Introduction to Algorithms by Cormen et al.)

We have studied how to find the number of steps in a program. We also saw that determining
the exact step count is a difficult task. Since we allot a step count of one to statements
of quite different complexity, the exact step count is in any case not very meaningful.
Suppose there are 2 programs p1 and p2 for doing a task, both depending on a variable n.
Suppose p1 takes less time than p2 for larger values of n; there p1 is more efficient than
p2. Suppose that for smaller values of n p2 takes less time than p1; there p2 is more
efficient. So at first p2 is faster, and beyond a particular value of n p1 becomes faster.
The value of n at which p1 becomes faster than p2 is called the break-even point.
For example, consider binary search and sequential search. For a small number of elements
sequential search is faster, but for a large number of elements binary search is faster.

Big Oh notation

f(n) = O(g(n)) (read as f of n is big oh of g of n) iff there exist two positive constants c
and n0 such that f(n) <= c.g(n) for all n, where n >= n0.
Suppose we have a program with step count 3n+2. We can write f(n) = 3n+2. We say that
3n+2 = O(n), that is, of the order of n, because f(n) <= 4n for all n >= 2.

This can be explained by a graph in which both 3n+2 and 4n (that is, c.g(n)) are plotted
against n.

[Graph: 3n+2 and 4n plotted against n]

From the graph we see that the value of f(n) is never greater than c.g(n) once n >= 2.
That is, g(n) is an upper bound on the function f(n).

Another example: suppose f(n) = 10n**2+4n+2. We say that f(n) = O(n**2), since
10n**2+4n+2 <= 11n**2 for n >= 5. Here we cannot say that f(n) = O(n), since no constant c
makes 10n**2+4n+2 less than or equal to cn for all large n. But we can say that
f(n) = O(n**3), since f(n) is less than or equal to cn**3 for a suitable c and all
sufficiently large n.
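
The constants in these examples can be checked numerically. The sketch below tests f(n) <= c.g(n) over a finite range only, so it illustrates the definition rather than proving anything; the function names and the helper bounded_above are made up for this sketch. Note that for f(n) = 10n**2+4n+2 with c = 11 the inequality first holds at n = 5 and still fails at n = 4.

```c
#include <assert.h>
#include <stdbool.h>

typedef long (*Func)(long);

long f_linear(long n) { return 3 * n + 2; }
long g_linear(long n) { return n; }
long f_quad(long n)   { return 10 * n * n + 4 * n + 2; }
long g_quad(long n)   { return n * n; }

/* True if f(n) <= c * g(n) for every n in n0..limit (a finite check only). */
bool bounded_above(Func f, Func g, long c, long n0, long limit)
{
    assert(n0 <= limit);
    for (long n = n0; n <= limit; n++)
        if (f(n) > c * g(n))
            return false;
    return true;
}
```

For example, bounded_above(f_linear, g_linear, 4, 2, 10000) holds, while the same call with n0 = 1 does not, because 3(1)+2 = 5 > 4. Trying to bound f_quad by c.n fails for any fixed c once n is large enough, which is why f_quad is not O(n).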

If a program has a constant step count, say 5, we say that it is of order
O(constant), written O(1).
If f(n) = 3n+5 then f(n) = O(n), and also O(n**2), O(n**3), and so on.
But we cannot express its time complexity as O(1).

If f(n) = 5n**2+8n+2 then f(n) = O(n**2), and also O(n**3), and so on.
But we cannot express its time complexity as O(n) or O(1).

If f(n) = 6n**3+3n+2 then f(n) = O(n**3), and also O(n**4), and so on.
But we cannot express its time complexity as O(n**2) or O(n) or O(1).

If f(n) = n log n+6n+9 then f(n) = O(n log n) (log to the base 2).

If f(n) = log n+6 then f(n) = O(log n).

Omega notation
f(n) = Ω(g(n)) (read as f of n is omega of g of n) iff there exist two positive constants c
and n0 such that f(n) >= c.g(n) for all n, where n >= n0.

Suppose we have a program with step count 3n+2. We can write f(n) = 3n+2. We say that
3n+2 = Ω(n), because f(n) >= 3n for all n >= 1.

This can be explained by a graph in which both 3n+2 and 3n (that is, c.g(n)) are plotted
against n.

[Graph: 3n+2 and 3n plotted against n]

From the graph we see that the value of f(n) is always greater than c.g(n) after a
particular value of n.
That is, g(n) is a lower bound on the function f(n).

If a program has a constant step count, say 5, we say that it is of order
Ω(constant), written Ω(1).
If f(n) = 3n+5 then f(n) = Ω(n), and also Ω(1).
But we cannot express its time complexity as Ω(n**2).

If f(n) = 5n**2+8n+2 then f(n) = Ω(n**2), and also Ω(n) and Ω(1).
But we cannot express its time complexity as Ω(n**3) or Ω(n**4).

If f(n) = 6n**3+3n+2 then f(n) = Ω(n**3), and also Ω(n**2), Ω(n) and Ω(1).
But we cannot express its time complexity as Ω(n**4) or any higher order.

If f(n) = n log n+6n+9 then f(n) = Ω(n log n) (log to the base 2).

If f(n) = log n+6 then f(n) = Ω(log n).

Theta notation
f(n) = θ(g(n)) (read as f of n is theta of g of n) iff there exist positive constants c1, c2
and n0 such that c1.g(n) <= f(n) <= c2.g(n) for all n, where n >= n0.

Suppose we have a program with step count 3n+2. We can write f(n) = 3n+2. We say that
3n+2 = θ(n), because c1.n <= 3n+2 <= c2.n after a particular value of n; for example,
c1 = 3, c2 = 4 and n0 = 2 satisfy the definition.
But we cannot say that 3n+2 = θ(1) or θ(n**2) or θ(n**3).
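
The two-sided bound can also be tried out numerically for f(n) = 3n+2 and g(n) = n. As before, this is a finite check over a range, not a proof, and the function name sandwiched is invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* True if c1*n <= 3n+2 <= c2*n for every n in n0..limit (a finite check only). */
bool sandwiched(long c1, long c2, long n0, long limit)
{
    assert(c1 > 0 && c2 > 0 && n0 <= limit);
    for (long n = n0; n <= limit; n++) {
        long f = 3 * n + 2;
        if (c1 * n > f || f > c2 * n)
            return false;
    }
    return true;
}
```

The constants c1 = 3 and c2 = 4 work from n0 = 2 onwards. Starting at n0 = 1 the upper bound fails, since 3(1)+2 = 5 > 4, and choosing c1 = 4 makes the lower bound fail from n = 3 onwards, since 4n > 3n+2 there.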

This can be explained by a graph in which 3n+2, c1.n and c2.n are all plotted against n.

[Graph: 3n+2, c1.n and c2.n plotted against n]

From the graph we see that the value of f(n) always lies between c1.g(n) and c2.g(n)
after a particular value of n.

If a program has a constant step count, say 5, we say that it is of order
θ(constant), written θ(1).
If f(n) = 3n+5 then f(n) = θ(n).
But we cannot express its time complexity as θ(n**2) or θ(1).

If f(n) = 5n**2+8n+2 then f(n) = θ(n**2).
But we cannot express its time complexity as θ(n**3) or θ(n**4) or θ(n) or θ(1).

If f(n) = 6n**3+3n+2 then f(n) = θ(n**3).
But we cannot express its time complexity as θ(n**4) or θ(n**2) or θ(n) or θ(1).

If f(n) = n log n+6n+9 then f(n) = θ(n log n) (log to the base 2).

If f(n) = log n+6 then f(n) = θ(log n).

From this we may conclude that θ is the most precise notation.

For the binary search program, the worst-case time is of the order log n (log to the base 2).

Performance measurement

Performance measurement is concerned with obtaining the actual time and space requirements
of a program. We do not discuss measuring the space. It is also called a posteriori testing,
since the program is executed on a system and we measure the time it takes to execute. We
measure the run time of the program, not the compile time.
We record the time at which the program starts to execute and the time at which it stops.
The difference between the two is the actual time taken to execute the program. For example:

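#include <stdio.h>
#include <time.h>

void seqsearch(int *a, int n, int item);

int main(void)
{
    // declare and read a, n, item
    clock_t start = clock();   /* processor time when the search starts */
    seqsearch(a, n, item);
    clock_t stop = clock();    /* processor time when the search stops */
    double execution_time = (double)(stop - start) / CLOCKS_PER_SEC;
}

void seqsearch(int *a, int n, int item)
{
    int x = n;
    while (x > 0)
    {
        if (a[x - 1] == item) { printf("found"); return; }
        else
            x--;
    }
}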

Here we would get the time for the sequential search in the execution time variable. In
practice, though, we are likely to read zero: the processor completes the search so quickly
that the clock is not precise enough to register a usable value.
We know that sequential search is of order n, so we expect a straight line when we plot t
against n for various values of n.
To obtain the plot, we take an array and call the sequential search procedure for various
values of n, i.e. we change the number of array elements. We measure the worst-case time,
which occurs when the element is not present in the array, so we search for an item that is
not in the array.
In order to get a measurable time, we repeat the search a number of times, measure the total
time and divide it by the number of repetitions. This gives the time for one run. We vary n
from 0 to 1000 and record the corresponding times. If we plot a graph of t against n, we get
a straight line, from which we can find the time for other values of n.

The algorithm for this is as follows

Measure()
{
    for n = 0 to 1000   // increment n by 10 up to 100 and then by 100 up to 1000
    {                   // i.e. n takes the values 0, 10, 20, 30, --- 100, 200, 300, ---- 1000
        time(start)
        for i = 1 to 10000
        {
            seqsearch(a, n, item);   // item should not be in a (worst case)
        }
        time(stop)
        total = stop - start;
        actual = total / 10000;
    }
}

The sequential search has a time complexity of theta(n) in the worst case. That is, it is
linear. If we plot a graph of t against n, we get a straight line indicating that the
program is linear.

[Figure: graph of t against n]
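
A runnable version of the measurement idea might look like the sketch below. It assumes the standard C clock() function as the timer, which reports processor time; the names measure, MAX_N and REPEATS, and the choice of -1 as an item that is never stored, are assumptions of this sketch. This seqsearch returns an index instead of printing, so that the timed loop does no output.

```c
#include <assert.h>
#include <time.h>

#define MAX_N   1000     /* largest array size measured */
#define REPEATS 10000    /* searches per measurement */

/* Sequential search; returns the index of item, or -1 if it is absent. */
int seqsearch(const int *a, int n, int item)
{
    for (int x = n; x > 0; x--)
        if (a[x - 1] == item)
            return x - 1;
    return -1;
}

/* Average processor time, in seconds, of one worst-case search over n elements. */
double measure(int n)
{
    static int a[MAX_N];
    assert(n >= 0 && n <= MAX_N);
    for (int i = 0; i < n; i++)
        a[i] = i;                         /* values 0..n-1, all non-negative */

    volatile int sink = 0;                /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (int r = 0; r < REPEATS; r++)
        sink += seqsearch(a, n, -1);      /* -1 is never stored: worst case */
    clock_t stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC / REPEATS;
}
```

Calling measure(n) for n = 0, 10, 20, ..., 100, 200, ..., 1000 and printing each result gives the points for the graph of t against n. The resolution of clock() varies between systems, so REPEATS may need to be raised if the measured times come out as zero.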

Since the curve is a straight line, we can find an equation for it of the form
y = mx + c.
That is,
t = mn + c.
When n = 0, we get the value of c.
As a next step, we get the value of the slope m from the graph.

Suppose we got the equation as

t = .0034 n + .00082

From this equation we can find the time required to execute the sequential search program
for any value of n. For example, n = 500 gives t = .0034 x 500 + .00082 = 1.70082, in the
units used for the measurement. Remember that the time we are getting is the worst-case time.
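
Reading m and c off the graph amounts to fitting a line through measured points. A minimal sketch, assuming just two points are used and using invented names (Line, fit_line, predict), is:

```c
#include <assert.h>

/* A straight line t = m*n + c. */
typedef struct {
    double m;   /* slope */
    double c;   /* intercept: the time at n = 0 */
} Line;

/* Line through two measured points (n1, t1) and (n2, t2), with n1 != n2. */
Line fit_line(double n1, double t1, double n2, double t2)
{
    assert(n1 != n2);
    Line l;
    l.m = (t2 - t1) / (n2 - n1);
    l.c = t1 - l.m * n1;
    return l;
}

/* Estimated time for a given n. */
double predict(Line l, double n)
{
    return l.m * n + l.c;
}
```

With the points (0, .00082) and (1000, 3.40082), which lie on the example line above, fit_line recovers m = .0034 and c = .00082 up to rounding. Real measurements are noisy, so in practice a fit over all the measured points would be more reliable than a line through two of them.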
