
Collective Communication

Collective Communication
Collective communication is defined as communication that involves a group of processes.
It is more restrictive than point-to-point communication:
The data sent must match the data received in both type and amount.
All processes involved make the same single call; there is no tag to match the operation.
Each process may return only when its part of the operation is complete.
Blocking communication only.
Standard mode only.
Collective Functions
Barrier synchronization across all group members
Broadcast from one member to all members of a group
Gather data from all group members to one member
Scatter data from one member to all members of a group
A variation on Gather where all members of the group receive
the result. (allgather)
Scatter/Gather data from all members to all members of a
group (also called complete exchange or all-to-all) (alltoall)
Global reduction operations such as sum, max, min, or user-
defined functions, where the result is returned to all group
members and a variation where the result is returned to only
one member
A combined reduction and scatter operation
Scan across all members of a group (also called prefix)
Collective Functions MPI_BARRIER
blocks the caller until all group members
have called it
returns at any process only after all group
members have entered the call
C
int MPI_Barrier(MPI_Comm comm )
Input Parameter
comm: communicator (handle)
Fortran
MPI_BARRIER(COMM, IERROR)
INTEGER COMM, IERROR
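A minimal C sketch (not from the original slides) showing MPI_Barrier separating two phases of output; the variable names are illustrative:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int my_rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    printf("Proc %d: before the barrier\n", my_rank);

    /* No process can pass this point until all of them have reached it */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("Proc %d: after the barrier\n", my_rank);

    MPI_Finalize();
    return 0;
}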

Collective Functions MPI_BCAST
broadcasts a message from the process with rank root to all
processes of the group, itself included
C
int MPI_Bcast(void* buffer, int count, MPI_Datatype datatype, int
root, MPI_Comm comm )
Input Parameters
count: number of entries in buffer (integer)
datatype: data type of buffer (handle)
root: rank of broadcast root (integer)
comm: communicator (handle)
Input / Output Parameter
buffer: starting address of buffer (choice)
Fortran
MPI_BCAST(BUFFER, COUNT, DATATYPE, ROOT, COMM, IERROR)
<type> BUFFER(*)
INTEGER COUNT, DATATYPE, ROOT, COMM, IERROR
Collective Functions MPI_BCAST
[Figure: before the call only the root holds A; after MPI_BCAST every process holds A.]
Collective Functions MPI_GATHER
Each process (root process included) sends the contents of its
send buffer to the root process.
The root process receives the messages and stores them in
rank order
C
int MPI_Gather(void* sendbuf, int sendcount, MPI_Datatype
sendtype, void* recvbuf, int recvcount, MPI_Datatype recvtype,
int root, MPI_Comm comm)
Input Parameters
sendbuf: starting address of send buffer (choice)
sendcount: number of elements in send buffer (integer)
sendtype: data type of send buffer elements (handle)
recvcount: number of elements for any single receive (integer,
significant only at root)
recvtype: data type of recv buffer elements (significant only at root)
(handle)
root: rank of receiving process (integer)
comm: communicator (handle)
Collective Functions MPI_GATHER
Output Parameter
recvbuf: address of receive buffer (choice,
significant only at root)
Fortran
MPI_GATHER(SENDBUF, SENDCOUNT,
SENDTYPE, RECVBUF, RECVCOUNT, RECVTYPE,
ROOT, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE,
RECVCOUNT, RECVTYPE, ROOT, COMM,
IERROR
Collective Functions MPI_GATHER
[Figure: processes 0-3 hold A, B, C and D respectively; after MPI_GATHER the root holds A B C D in rank order.]
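A minimal C sketch (illustrative, not from the course files) in which each process contributes its rank and the root gathers the values in rank order; the buffer size of 64 is an assumed upper bound on the number of processes:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int my_rank, nprocs, i;
    int sendval;
    int recvbuf[64];            /* assumes at most 64 processes */
    int root = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    sendval = my_rank;          /* one element per process */

    /* recvbuf is significant only at the root */
    MPI_Gather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT, root, MPI_COMM_WORLD);

    if (my_rank == root)
    {
        printf("Root gathered:");
        for (i = 0; i < nprocs; i++)
            printf(" %d", recvbuf[i]);
        printf("\n");
    }

    MPI_Finalize();
    return 0;
}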
Collective Functions MPI_SCATTER
MPI_SCATTER is the inverse operation to MPI_GATHER
C
int MPI_Scatter(void* sendbuf, int sendcount,
MPI_Datatype sendtype, void* recvbuf, int recvcount,
MPI_Datatype recvtype, int root, MPI_Comm comm)
Input Parameters
sendbuf: address of send buffer (choice, significant only at
root)
sendcount: number of elements sent to each process (integer,
significant only at root)
sendtype: data type of send buffer elements (significant only
at root) (handle)
recvcount: number of elements in receive buffer (integer)
recvtype: data type of receive buffer elements (handle)
root: rank of sending process (integer)
comm: communicator (handle)
Collective Functions MPI_SCATTER
Output Parameter
recvbuf: address of receive buffer (choice)
Fortran
MPI_SCATTER(SENDBUF, SENDCOUNT,
SENDTYPE, RECVBUF, RECVCOUNT,
RECVTYPE, ROOT, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE,
RECVCOUNT, RECVTYPE, ROOT, COMM,
IERROR
Collective Functions MPI_SCATTER
[Figure: the root holds A B C D; after MPI_SCATTER processes 0-3 receive A, B, C and D respectively.]
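A minimal C sketch (illustrative) in which the root scatters one integer to each process; the MAXPROCS bound is an assumption of this sketch:

#include <stdio.h>
#include <mpi.h>

#define MAXPROCS 64

int main(int argc, char** argv)
{
    int my_rank, nprocs, i;
    int sendbuf[MAXPROCS];      /* significant only at the root */
    int recvval;
    int root = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (my_rank == root)
        for (i = 0; i < nprocs; i++)
            sendbuf[i] = 10 * i;    /* element i goes to process i */

    MPI_Scatter(sendbuf, 1, MPI_INT, &recvval, 1, MPI_INT, root,
                MPI_COMM_WORLD);

    printf("Proc %d received %d\n", my_rank, recvval);

    MPI_Finalize();
    return 0;
}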
Collective Functions
MPI_ALLGATHER
MPI_ALLGATHER can be thought of as MPI_GATHER, but
where all processes receive the result, instead of just the root.
The jth block of data sent from each process is received by
every process and placed in the jth block of the buffer recvbuf.
C
int MPI_Allgather(void* sendbuf, int sendcount, MPI_Datatype
sendtype, void* recvbuf, int recvcount, MPI_Datatype recvtype,
MPI_Comm comm)
Input Parameters
sendbuf: starting address of send buffer (choice)
sendcount: number of elements in send buffer (integer)
sendtype: data type of send buffer elements (handle)
recvcount: number of elements received from any process (integer)
recvtype: data type of receive buffer elements (handle)
comm: communicator (handle)
Collective Functions
MPI_ALLGATHER
Output Parameter
recvbuf: address of receive buffer (choice)
Fortran
MPI_ALLGATHER(SENDBUF, SENDCOUNT,
SENDTYPE, RECVBUF, RECVCOUNT, RECVTYPE,
COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE,
RECVCOUNT, RECVTYPE, COMM, IERROR
Collective Functions
MPI_ALLGATHER
[Figure: processes 0-3 hold A, B, C and D respectively; after MPI_ALLGATHER every process holds A B C D.]
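A minimal C sketch (illustrative) in which every process contributes its rank and every process receives the complete list; the buffer size of 64 is an assumed bound:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int my_rank, nprocs, i;
    int sendval;
    int recvbuf[64];            /* assumes at most 64 processes */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    sendval = my_rank;

    /* Unlike MPI_Gather there is no root: recvbuf is filled everywhere */
    MPI_Allgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Proc %d received:", my_rank);
    for (i = 0; i < nprocs; i++)
        printf(" %d", recvbuf[i]);
    printf("\n");

    MPI_Finalize();
    return 0;
}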
Collective Functions MPI_ALLTOALL
Extension of MPI_ALLGATHER to the case where each
process sends distinct data to each of the receivers.
The jth block sent from process i is received by process
j and is placed in the ith block of recvbuf
C
int MPI_Alltoall(void* sendbuf, int sendcount,
MPI_Datatype sendtype, void* recvbuf, int recvcount,
MPI_Datatype recvtype, MPI_Comm comm)
Input Parameters
sendbuf: starting address of send buffer (choice)
sendcount: number of elements sent to each process (integer)
sendtype: data type of send buffer elements (handle)
recvcount: number of elements received from any process
(integer)
recvtype: data type of receive buffer elements (handle)
comm: communicator (handle)
Collective Functions MPI_ALLTOALL
Output Parameter
recvbuf: address of receive buffer (choice)
Fortran
MPI_ALLTOALL(SENDBUF, SENDCOUNT,
SENDTYPE, RECVBUF, RECVCOUNT,
RECVTYPE, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER SENDCOUNT, SENDTYPE,
RECVCOUNT, RECVTYPE, COMM,
IERROR
Collective Functions MPI_ALLTOALL
[Figure: before the call Rank 0 holds A B C D, Rank 1 holds E F G H, Rank 2 holds I J K L, Rank 3 holds M N O P; after MPI_ALLTOALL Rank 0 holds A E I M, Rank 1 holds B F J N, Rank 2 holds C G K O, Rank 3 holds D H L P (block j from process i ends up as block i on process j).]
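A minimal C sketch (illustrative) in which each process sends one distinct integer to every process, so the receive buffers form a transpose of the send buffers; MAXPROCS is an assumed bound:

#include <stdio.h>
#include <mpi.h>

#define MAXPROCS 64

int main(int argc, char** argv)
{
    int my_rank, nprocs, i;
    int sendbuf[MAXPROCS];
    int recvbuf[MAXPROCS];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Block j of sendbuf (here, a single int) is destined for process j */
    for (i = 0; i < nprocs; i++)
        sendbuf[i] = 100 * my_rank + i;

    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Proc %d received:", my_rank);
    for (i = 0; i < nprocs; i++)
        printf(" %d", recvbuf[i]);    /* element i came from process i */
    printf("\n");

    MPI_Finalize();
    return 0;
}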
Collective Functions MPI_REDUCE
MPI_REDUCE combines the elements provided in the input
buffer (sendbuf) of each process in the group, using the
operation op, and returns the combined value in the output
buffer (recvbuf) of the process with rank root
C
int MPI_Reduce(void* sendbuf, void* recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
Input Parameters
sendbuf: address of send buffer (choice)
count: number of elements in send buffer (integer)
datatype: data type of elements of send buffer (handle)
op: reduce operation (handle)
root: rank of root process (integer)
comm: communicator (handle)
Output Parameter
recvbuf: address of receive buffer (choice, significant only at root)
Collective Functions MPI_REDUCE
Fortran
MPI_REDUCE(SENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM,
IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER COUNT, DATATYPE, OP, ROOT, COMM, IERROR
Predefined Reduce Operations
MPI_MAX: maximum
MPI_MIN: minimum
MPI_SUM: sum
MPI_PROD: product
MPI_LAND: logical and
MPI_BAND: bit-wise and
MPI_LOR: logical or
MPI_BOR: bit-wise or
MPI_LXOR: logical xor
MPI_BXOR: bit-wise xor
MPI_MAXLOC: max value and location (returns the max and an integer giving the rank that stores the max value)
MPI_MINLOC: min value and location
Collective Functions MPI_REDUCE
[Figure: Rank 0 holds A B C D, Rank 1 holds E F G H, Rank 2 holds I J K L, Rank 3 holds M N O P. With root = 1 and reduction operator o, the root's recvbuf receives AoEoIoM in its first element; if count = 2, its second element receives BoFoJoN.]
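A minimal C sketch (illustrative) that sums the ranks of all processes into the root with MPI_SUM:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int my_rank;
    int sum = 0;
    int root = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* recvbuf (&sum) is significant only at the root */
    MPI_Reduce(&my_rank, &sum, 1, MPI_INT, MPI_SUM, root, MPI_COMM_WORLD);

    if (my_rank == root)
        printf("Sum of all ranks = %d\n", sum);

    MPI_Finalize();
    return 0;
}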
Collective Functions
MPI_ALLREDUCE
Variants of the reduce operations where
the result is returned to all processes in
the group
The all-reduce operations can be
implemented as a reduce, followed by a
broadcast. However, a direct
implementation can lead to better
performance.
C
int MPI_Allreduce(void* sendbuf, void* recvbuf,
int count, MPI_Datatype datatype, MPI_Op op,
MPI_Comm comm)
Collective Functions
MPI_ALLREDUCE
Input Parameters
sendbuf: starting address of send buffer (choice)
count: number of elements in send buffer (integer)
datatype: data type of elements of send buffer
(handle)
op: operation (handle)
comm: communicator (handle)
Output Parameter
recvbuf: starting address of receive buffer (choice)
Fortran
MPI_ALLREDUCE(SENDBUF, RECVBUF, COUNT,
DATATYPE, OP, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER COUNT, DATATYPE, OP, COMM, IERROR
Collective Functions
MPI_ALLREDUCE
[Figure: Rank 0 holds A B C D, Rank 1 holds E F G H, Rank 2 holds I J K L, Rank 3 holds M N O P. After MPI_ALLREDUCE every rank's recvbuf holds AoEoIoM in its first element.]
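A minimal C sketch (illustrative) in which every process obtains the maximum rank in the group:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int my_rank, max_rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Same result as MPI_Reduce followed by MPI_Bcast, but in one call */
    MPI_Allreduce(&my_rank, &max_rank, 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);

    printf("Proc %d: maximum rank is %d\n", my_rank, max_rank);

    MPI_Finalize();
    return 0;
}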
Collective Functions
MPI_REDUCE_SCATTER
Variants of each of the reduce operations where the result is
scattered to all processes in the group on return.
MPI_REDUCE_SCATTER first does an element-wise reduction on a vector of count = Σi recvcounts[i] elements in the send buffer defined by sendbuf, count and datatype.
Next, the resulting vector of results is split into n disjoint
segments, where n is the number of members in the group.
Segment i contains recvcounts[i] elements.
The ith segment is sent to process i and stored in the receive
buffer defined by recvbuf, recvcounts[i] and datatype.
The MPI_REDUCE_SCATTER routine is functionally equivalent to an MPI_REDUCE operation with count equal to the sum of recvcounts[i], followed by an MPI_SCATTERV with sendcounts equal to recvcounts. However, a direct implementation may run faster.
Collective Functions
MPI_REDUCE_SCATTER
C
int MPI_Reduce_scatter(void* sendbuf, void* recvbuf, int
*recvcounts, MPI_Datatype datatype, MPI_Op op, MPI_Comm
comm)
Input Parameters
sendbuf: starting address of send buffer (choice)
recvcounts: integer array specifying the number of elements in result
distributed to each process. Array must be identical on all calling processes.
datatype: data type of elements of input buffer (handle)
op: operation (handle)
comm: communicator (handle)
Output Parameter
recvbuf: starting address of receive buffer (choice)
Fortran
MPI_REDUCE_SCATTER(SENDBUF, RECVBUF, RECVCOUNTS,
DATATYPE, OP, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER RECVCOUNTS(*), DATATYPE, OP, COMM, IERROR

Collective Functions
MPI_REDUCE_SCATTER
[Figure: Rank 0 holds A B C D, Rank 1 holds E F G H, Rank 2 holds I J K L, Rank 3 holds M N O P, and recvcounts = {1, 2, 0, 1}. The element-wise reduction yields AoEoIoM, BoFoJoN, CoGoKoO, DoHoLoP; Rank 0 receives the first element, Rank 1 the next two, Rank 2 none, and Rank 3 the last one.]
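A minimal C sketch (illustrative) in which each process supplies one value per process, recvcounts[i] = 1 for every i, and each process keeps one element of the element-wise sum; MAXPROCS is an assumed bound:

#include <stdio.h>
#include <mpi.h>

#define MAXPROCS 64

int main(int argc, char** argv)
{
    int my_rank, nprocs, i;
    int sendbuf[MAXPROCS];
    int recvcounts[MAXPROCS];
    int result;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    for (i = 0; i < nprocs; i++)
    {
        sendbuf[i] = my_rank + i;   /* arbitrary data */
        recvcounts[i] = 1;          /* one reduced element per process */
    }

    MPI_Reduce_scatter(sendbuf, &result, recvcounts, MPI_INT, MPI_SUM,
                       MPI_COMM_WORLD);

    printf("Proc %d: reduced element = %d\n", my_rank, result);

    MPI_Finalize();
    return 0;
}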
Collective Functions MPI_SCAN
MPI_SCAN is used to perform a prefix reduction
on data distributed across the group. The
operation returns, in the receive buffer of the
process with rank i, the reduction of the values in
the send buffers of processes with ranks 0,...,i
(inclusive). The type of operations supported,
their semantics, and the constraints on send and
receive buffers are as for MPI_REDUCE.
C
int MPI_Scan(void* sendbuf, void* recvbuf, int
count, MPI_Datatype datatype, MPI_Op op,
MPI_Comm comm )
Collective Functions MPI_SCAN
Input Parameters
sendbuf: starting address of send buffer (choice)
count: number of elements in input buffer (integer)
datatype: data type of elements of input buffer
(handle)
op: operation (handle)
comm: communicator (handle)
Output Parameter
recvbuf: starting address of receive buffer (choice)
Fortran
MPI_SCAN(SENDBUF, RECVBUF, COUNT, DATATYPE,
OP, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER COUNT, DATATYPE, OP, COMM, IERROR
Collective Functions MPI_SCAN
[Figure: Rank 0 holds A B C D, Rank 1 holds E F G H, Rank 2 holds I J K L, Rank 3 holds M N O P. After MPI_SCAN the first element of recvbuf is A on Rank 0, AoE on Rank 1, AoEoI on Rank 2, and AoEoIoM on Rank 3.]
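A minimal C sketch (illustrative) computing a prefix sum of the ranks, so process i receives 0 + 1 + ... + i:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv)
{
    int my_rank, prefix;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    MPI_Scan(&my_rank, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("Proc %d: prefix sum = %d\n", my_rank, prefix);

    MPI_Finalize();
    return 0;
}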
Example MPI_BCAST
To demonstrate how to use
MPI_BCAST to distribute an array to
the other processes
Example MPI_BCAST (C)
/* root broadcasts the array to all processes */

#include<stdio.h>
#include<mpi.h>

#define SIZE 10

int main(int argc, char** argv)
{
int my_rank; // the rank of each proc
int array[SIZE];
int root = 0; // the rank of root
int i;
MPI_Comm comm = MPI_COMM_WORLD;

MPI_Init(&argc, &argv);
MPI_Comm_rank(comm, &my_rank);

if (my_rank == 0)
{
for (i = 0; i < SIZE; i ++)
{
array[i] = i;
}
}
else
{
for (i = 0; i < SIZE; i ++)
{
array[i] = 0;
}
}

printf("Proc %d: (Before Broadcast) ", my_rank);
for (i = 0; i < SIZE; i ++)
{
printf("%d ", array[i]);
}
printf("\n");

MPI_Bcast(array, SIZE, MPI_INT, root, comm);

printf("Proc %d: (After Broadcast) ", my_rank);
for (i = 0; i < SIZE; i ++)
{
printf("%d ", array[i]);
}
printf("\n");

MPI_Finalize();
return 0;
}
Example MPI_BCAST (Fortran)
C     root broadcasts the array to all processes

PROGRAM main
INCLUDE 'mpif.h'

INTEGER SIZE
PARAMETER (SIZE = 10)
INTEGER my_rank, ierr, root, i
INTEGER array(SIZE)
INTEGER comm
INTEGER arraysize

root = 0
comm = MPI_COMM_WORLD
arraysize = SIZE

CALL MPI_INIT(ierr)
CALL MPI_COMM_RANK(comm, my_rank, ierr)

IF (my_rank.EQ.0) THEN
DO i = 1, SIZE
array(i) = i
END DO
ELSE
DO i = 1, SIZE
array(i) = 0
END DO
END IF

WRITE(6, *) "Proc ", my_rank, ": (Before Broadcast)", (array(i), i=1, SIZE)
CALL MPI_Bcast(array, arraysize, MPI_INTEGER, root, comm, ierr)
WRITE(6, *) "Proc ", my_rank, ": (After Broadcast)", (array(i), i=1, SIZE)

call MPI_FINALIZE(ierr)
end
Case Study 1 MPI_SCATTER and
MPI_REDUCE
The master distributes (scatters) an array across the processes. Each process adds up its elements, and the partial sums are then combined in the master through a reduction operation.
Step 1
Proc 0 initializes an array of 16 integers
Proc 0: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16}
Case Study 1 MPI_SCATTER and
MPI_REDUCE
Step 2
Scatter array among all processes
Proc 0: {1, 2, 3, 4}
Proc 1: {5, 6, 7, 8}
Proc 2: {9, 10, 11, 12}
Proc 3: {13, 14, 15, 16}
Step 3
Each process does some calculations
Case Study 1 MPI_SCATTER and
MPI_REDUCE
Step 4
Reduce to Proc 0
Proc 0: Total Sum
C
mpi_scatter_reduce01.c
Compilation:
mpicc mpi_scatter_reduce01.c -o mpi_scatter_reduce01
Run:
mpirun -np 4 mpi_scatter_reduce01
Fortran
mpi_scatter_reduce01.f
Compilation:
mpif77 mpi_scatter_reduce01.f -o mpi_scatter_reduce01
Run:
mpirun -np 4 mpi_scatter_reduce01
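The course files mpi_scatter_reduce01.c and mpi_scatter_reduce01.f are not reproduced in these notes. The following is a minimal C sketch of steps 1 to 4, assuming exactly 4 processes; it is an illustration, not the original program.

#include <stdio.h>
#include <mpi.h>

#define N 16
#define NPROCS 4

int main(int argc, char** argv)
{
    int my_rank, i;
    int array[N];
    int chunk[N / NPROCS];
    int local_sum = 0, total_sum = 0;
    int root = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Step 1: root initializes the 16-integer array 1..16 */
    if (my_rank == root)
        for (i = 0; i < N; i++)
            array[i] = i + 1;

    /* Step 2: scatter 4 elements to each of the 4 processes */
    MPI_Scatter(array, N / NPROCS, MPI_INT,
                chunk, N / NPROCS, MPI_INT, root, MPI_COMM_WORLD);

    /* Step 3: each process adds its own elements */
    for (i = 0; i < N / NPROCS; i++)
        local_sum += chunk[i];

    /* Step 4: reduce the partial sums into the root */
    MPI_Reduce(&local_sum, &total_sum, 1, MPI_INT, MPI_SUM, root,
               MPI_COMM_WORLD);

    if (my_rank == root)
        printf("Total sum = %d\n", total_sum);

    MPI_Finalize();
    return 0;
}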
Case Study 2 MPI_GATHER
Matrix Multiplication

[Figure: the multiplication performed in this case study]

| 1  5  9 13 |   | 17 |   | 538 |
| 2  6 10 14 | x | 18 | = | 612 |
| 3  7 11 15 |   | 19 |   | 686 |
| 4  8 12 16 |   | 20 |   | 760 |
Algorithm:
{4x4 matrix A} x {4x1 vector x} = product
Each process stores a row of A and a single entry of
x
Use 4 gather operations to place a full copy of x in
each process, then perform multiplications
Case Study 2 MPI_GATHER
Matrix Multiplication
Step 1:
Initialization
Proc 0: {1 5 9 13}, {17}
Proc 1: {2 6 10 14}, {18}
Proc 2: {3 7 11 15}, {19}
Proc 3: {4 8 12 16}, {20}
Step 2:
Perform MPI_GATHER four times (once with each process as root) to gather the column vector x into every process
Proc0: {1 5 9 13}, {17 18 19 20}
Proc1: {2 6 10 14}, {17 18 19 20}
Proc2: {3 7 11 15}, {17 18 19 20}
Proc3: {4 8 12 16}, {17 18 19 20}
Case Study 2 MPI_GATHER
Matrix Multiplication
Step 3:
Perform multiplication
Proc 0: 1x17+5x18+9x19+13x20=538
Proc 1: 2x17+6x18+10x19+14x20=612
Proc 2: 3x17+7x18+11x19+15x20=686
Proc 3: 4x17+8x18+12x19+16x20=760
Step 4:
Gather the inner products from all processes into the master process and display the result
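A minimal C sketch of steps 1 to 4 (an illustration, not the original mpi_gather01.c), assuming exactly 4 processes:

#include <stdio.h>
#include <mpi.h>

#define N 4

int main(int argc, char** argv)
{
    int my_rank, i, root;
    int row[N];                 /* this process's row of A */
    int xpart;                  /* this process's entry of x */
    int x[N];                   /* full copy of x after the gathers */
    int inner = 0;
    int result[N];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* Step 1: row i of A is {i+1, i+5, i+9, i+13}, x entry is 17+i */
    for (i = 0; i < N; i++)
        row[i] = my_rank + 1 + 4 * i;
    xpart = 17 + my_rank;

    /* Step 2: 4 gathers, one per root, so every process gets all of x */
    for (root = 0; root < N; root++)
        MPI_Gather(&xpart, 1, MPI_INT, x, 1, MPI_INT, root, MPI_COMM_WORLD);

    /* Step 3: local inner product of this row with x */
    for (i = 0; i < N; i++)
        inner += row[i] * x[i];

    /* Step 4: gather the inner products into process 0 and print them */
    MPI_Gather(&inner, 1, MPI_INT, result, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (my_rank == 0)
        for (i = 0; i < N; i++)
            printf("result[%d] = %d\n", i, result[i]);

    MPI_Finalize();
    return 0;
}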
Case Study 2 MPI_GATHER
Matrix Multiplication
C
mpi_gather01.c
Compilation:
mpicc mpi_gather01.c -o mpi_gather01
Run:
mpirun -np 4 mpi_gather01
Fortran
mpi_gather01.f
Compilation:
mpif77 mpi_gather01.f -o mpi_gather01
Run:
mpirun -np 4 mpi_gather01
END
