
Problem Statement:

Given a matrix M of size n x m, write a program using C++ that computes the sum of
each column, so that the result vector V of size m is defined like so:
V[j] = sum from i=0 to n-1 of M(i,j), for j in [0,m)
Example:
The matrix has size 4x8, hence n=4 and m=8.
The resultant vector of size m=8 holds the column sums
V[j] with j in [0,8).
For j=0:
V[0]
= sum from i=0 to n-1 of M(i,j)
i.e. sum from i=0 to 3 of M(i,0), where j=0 and n=4
i.e. M(0,0) + M(1,0) + M(2,0) + M(3,0)
i.e. 2 + 9 + 3 + 5 = 19.
V[0] = 19.
For j=1:
V[1]
= sum from i=0 to n-1 of M(i,j)
i.e. sum from i=0 to 3 of M(i,1), where j=1 and n=4
i.e. M(0,1) + M(1,1) + M(2,1) + M(3,1)
i.e. 3 + 8 + 4 + 1 = 16.
V[1] = 16.
Similarly, all 8 values (for j from 0 through 7) can be computed as shown in the figure.
Task 1: Sequential and scalar implementation.
Partial sum is implemented in C++ with single-precision floating point values using the
float data type. A two-dimensional matrix M of size n x m is used to store the input and a
one-dimensional array of size m is used to store the output. These data structures were
chosen to keep the solution simple.
The program is written for and targeted at 64-bit Linux on an Intel Core i3 with the Nehalem
architecture (Nehalem is the code name for an Intel processor micro-architecture).
The program has three parts:
1. Accept input.
Input is stored in a two-dimensional array named M of size rowlength x collength.
All the elements are floats. Values are stored at row indices 0 through row-1 and column
indices 0 through col-1; the last row and column are empty. Indexing starts from 0, so the
element in the 1st row and 1st column is given by [0][0], and so forth, as indicated in the
table below.
7 (0,0)   5 (0,1)   3 (0,2)   0 (0,3)
0 (1,0)   3 (1,1)   2 (1,2)   1 (1,3)
2 (2,0)   1 (2,1)   1 (2,2)   0 (2,3)
2. Calculate partial sum.
This is the computation part of the algorithm. It solves the problem sequentially by
computing the values of V[j] for j from 0 through collength-1.
for (int i = 0; i < rowlength; ++i)
{
    for (int j = 0; j < collength; ++j)
    {
        V[j] += M[i][j];
    }
}
The loop works as follows:
First i is set to 0 and j loops from 0 through 3 (as collength is 4 in our example).
When j < collength evaluates to false, the inner loop exits and i is incremented.
Now i is set to 1 and j loops again from 0 through 3. This continues until the outer loop
condition fails.
The vector V during the execution is as follows:
Initially, when it is declared, V is initialized to all zeroes. Therefore, before the loop is
entered, V = {0,0,0,0}.
When the inner loop is executed with i=0 fixed, V becomes equal to the first row elements
(0,0), (0,1), (0,2), (0,3): V = {0,0,0,0} + {7,5,3,0} = {7,5,3,0}.
When the inner loop is executed with i=1 fixed, the second row elements (1,0), (1,1), (1,2),
(1,3) are added to V: V = {7,5,3,0} + {0,3,2,1} = {7,8,5,1},
and so on.
In other words,
i=0
j=0  V[0] = V[0]+M[0][0] = 0+7 = 7
j=1  V[1] = V[1]+M[0][1] = 0+5 = 5
j=2  V[2] = V[2]+M[0][2] = 0+3 = 3
j=3  V[3] = V[3]+M[0][3] = 0+0 = 0
i=1
j=0  V[0] = V[0]+M[1][0] = 7+0 = 7
j=1  V[1] = V[1]+M[1][1] = 5+3 = 8
j=2  V[2] = V[2]+M[1][2] = 3+2 = 5
j=3  V[3] = V[3]+M[1][3] = 0+1 = 1
i=2
j=0  V[0] = V[0]+M[2][0] = 7+2 = 9
j=1  V[1] = V[1]+M[2][1] = 8+1 = 9
j=2  V[2] = V[2]+M[2][2] = 5+1 = 6
j=3  V[3] = V[3]+M[2][3] = 1+0 = 1
All these steps are performed sequentially, one after the other.
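The computation step traced above can be sketched as a small self-contained function. This is only an illustration under stated assumptions: the name column_sums and the use of std::vector instead of a fixed-size array are not from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original program): returns the column
// sums of a matrix stored as a vector of equally sized rows.
std::vector<float> column_sums(const std::vector<std::vector<float>>& M)
{
    assert(!M.empty());                        // at least one row is expected
    const std::size_t collength = M[0].size();
    std::vector<float> V(collength, 0.0f);     // V starts as all zeroes

    for (std::size_t i = 0; i < M.size(); ++i) {
        assert(M[i].size() == collength);      // every row must have the same width
        for (std::size_t j = 0; j < collength; ++j) {
            V[j] += M[i][j];                   // add row i into the running sums
        }
    }
    return V;
}
```

Called with the three rows of the table, {7,5,3,0}, {0,3,2,1} and {2,1,1,0}, the function returns {9,9,6,1}, which agrees with the last step of the trace.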
3. Printing the output.
This is simply done by looping through the output vector V from 0 through m-1.
Sample output:
Task 2: Parallel Implementation
A parallel implementation of partial sum is done using the thread support built into the
C++11 standard library (std::thread, which GCC on Linux implements on top of pthreads).
Advantages:
Parallel processing can easily be achieved using the <thread> header file. Constructing
thread(funP, arguments) creates a thread which executes the funP function
in parallel with the parent thread executing the main() function.
Limitations:
At the time of writing, the C++11 standard thread library is not implemented on all
compilers. It is available in GCC on Linux, but not in MinGW on Windows. TDM-GCC is one of
the few compilers on Windows, if not the only one, to implement this library.
If a two-dimensional array needs to be passed as an argument, its size must be known
and specified before compilation. A workaround was to define M and V as global
variables.
The data structure for holding matrix M is changed from a two-dimensional array to a
one-dimensional array of size n*m. M[i][j] in the 2D layout is equivalent to M[i*col+j].
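The index arithmetic can be illustrated with a minimal accessor. The function name element_at and the use of std::vector are assumptions made for this sketch; the original program indexes a global array directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accessor: element (i, j) of a matrix with `collength` columns
// that is stored row by row in one contiguous vector.
float& element_at(std::vector<float>& M, std::size_t collength,
                  std::size_t i, std::size_t j)
{
    assert(j < collength);                 // column index must be in range
    assert(i * collength + j < M.size());  // flattened index must be in range
    return M[i * collength + j];           // same element as M[i][j] in the 2D layout
}
```

For the 3 x 4 table from Task 1 stored as {7,5,3,0, 0,3,2,1, 2,1,1,0}, element (1,2) is found at flat index 1*4+2 = 6, which holds the value 2.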
Parallel computation part of the algorithm:
Instead of sequentially computing the values of V[j] one after another, we have one thread
for every element in V.
If there are m columns in the matrix, we create m threads, each of which derives the partial
sum for its column.
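vector<thread> threads;                                     // requires <vector> and <thread>
for (int j = 0; j < collength; ++j)
{
    threads.emplace_back(addcol, j, rowlength, collength);  // start one thread per column
}
for (thread& T : threads)
{
    T.join();                                               // wait only after all threads have started
}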
This loop creates one thread per column, each of which calls the addcol function. join() is
used so that main() waits until all the threads complete their execution. This ensures
that main does not exit prematurely.
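void addcol(int j, int rowlength, int collength)
{
    // Sum column j of the flattened matrix M into V[j] (M and V are globals).
    for (int i = 0; i < rowlength; i++)
    {
        V[j] += M[i * collength + j];
    }
    // Output lines from different threads may interleave on cout.
    cout << "thread " << j << " computed V[" << j << "] as " << V[j] << endl;
}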
The addcol function adds up the values of one column and produces one element of V.
The long output is to demonstrate that the values are generated by individual threads.
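As a hedged sketch, the thread-per-column idea could also be written without global variables by passing M and V by reference. The function parallel_column_sums, the reference-taking signature of addcol and the use of std::vector are assumptions for illustration, not the original code. All threads are started first and joined afterwards, and each thread writes only its own V[j], so no locking is needed. With GCC on Linux the program may need to be built with the -pthread option.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical variant of addcol: sums column j of the flattened matrix M
// and stores the result in V[j]. No other thread touches V[j].
void addcol(const std::vector<float>& M, std::vector<float>& V,
            std::size_t j, std::size_t rowlength, std::size_t collength)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < rowlength; ++i) {
        sum += M[i * collength + j];
    }
    V[j] = sum;
}

// Hypothetical driver: one thread per column, joined after all are started.
std::vector<float> parallel_column_sums(const std::vector<float>& M,
                                        std::size_t rowlength,
                                        std::size_t collength)
{
    assert(M.size() == rowlength * collength);   // M must hold n*m elements
    std::vector<float> V(collength, 0.0f);
    std::vector<std::thread> threads;
    threads.reserve(collength);

    for (std::size_t j = 0; j < collength; ++j) {
        // std::cref / std::ref pass M and V by reference to the new thread.
        threads.emplace_back(addcol, std::cref(M), std::ref(V),
                             j, rowlength, collength);
    }
    for (std::thread& t : threads) {
        t.join();                                // wait for every column to finish
    }
    return V;
}
```

For the 3 x 4 table from Task 1 stored in flattened form, parallel_column_sums(M, 3, 4) gives the same result as the sequential trace, {9,9,6,1}.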
When n is large, parallel computing has a huge advantage over the sequential solution.
When both n and m are small, the sequential solution is better because of the overhead of
creating threads. For moderate-size inputs, the run times of both solutions are
comparable.
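One way to check these run-time observations on a particular machine is to time both versions over the same input. The helper below is a minimal sketch; the name average_ms is hypothetical and no measurements from the original report are reproduced here.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical timing helper: runs `work` `repeats` times and returns the
// average wall-clock time per run in milliseconds.
template <typename Work>
double average_ms(Work work, int repeats)
{
    assert(repeats > 0);                                  // avoid division by zero
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        work();                                           // the code being measured
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

The sequential and threaded computations can each be wrapped in a lambda and passed to average_ms with the same matrix, so that the two averages can be compared for small, moderate and large values of n and m.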
Sample output:
