Constructs - I
Parallel construct
Work-sharing constructs: loop construct, sections construct, single construct
Constructs - II
These constructs enable the programmer to orchestrate the actions of different threads.
Barrier construct, critical construct, atomic construct, locks, master construct
Terminology
OpenMP directive: in C/C++, a #pragma that specifies OpenMP program behaviour.
Executable directive: an OpenMP directive that is not declarative; that is, it may be placed in an executable context.
Construct: an OpenMP executable directive together with the associated statement, loop, or structured block (the lexical extent of an executable directive).
Parallel construct
This is specified as
#pragma omp parallel clause1 clause2
structured block
At the end of the parallel region there is an implied barrier, which makes all threads wait until the work inside the region is completed. Only the initial thread continues execution after the end of the parallel region.
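A minimal sketch of a parallel region follows. The function name count_team_members is hypothetical, and the atomic construct is used only so that the shared counter is updated safely.

```c
/* Hypothetical helper: counts how many threads executed the region. */
int count_team_members(void)
{
    int members = 0;              /* shared by all threads in the team */

    #pragma omp parallel
    {
        /* Every thread in the team executes this block once. */
        #pragma omp atomic
        members++;
    }   /* implied barrier: all threads finish before execution continues */

    return members;               /* only the initial thread reaches this line */
}
```

If OpenMP is not enabled at compile time, the pragmas are ignored and the function returns 1.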
Loop construct
#pragma omp for
for (init-expr; var relop b; incr-expr)
init-expr must initialize the loop variable var with an integer expression. b must also be an integer expression. incr-expr must change var by an integer amount using ++, +=, --, or -=; alternatively it can be written as var = var + expr.
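The loop form described above can be sketched as follows. The function name fill_doubles and its parameters are assumptions made for illustration.

```c
/* Hypothetical helper: stores 2 * i into out[i] for i = 0 .. n-1. */
void fill_doubles(int *out, int n)
{
    #pragma omp parallel
    {
        /* The iterations are divided among the threads of the team.
           i = 0 is the init-expr, i < n the test, i++ the incr-expr. */
        #pragma omp for
        for (int i = 0; i < n; i++) {
            out[i] = 2 * i;
        }
    }
}
```

Each iteration writes a different element, so no synchronization is needed inside the loop.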
section/sections construct
Using the sections construct, we can assign different threads to carry out different kinds of work. It specifies separate code regions, each of which is executed by one of the threads. There are two directives:
#pragma omp sections, which encloses the whole construct
#pragma omp section, which marks each individual code region
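A small sketch of the two directives working together, assuming a hypothetical function sum_and_max that needs two unrelated results from the same array (n must be at least 1):

```c
/* Hypothetical helper: computes the sum and the maximum of data[0..n-1]. */
void sum_and_max(const int *data, int n, long *sum, int *max)
{
    long s = 0;
    int m = data[0];

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            {   /* one thread computes the sum */
                for (int i = 0; i < n; i++) s += data[i];
            }
            #pragma omp section
            {   /* possibly another thread computes the maximum */
                for (int i = 1; i < n; i++) if (data[i] > m) m = data[i];
            }
        }
    }
    *sum = s;
    *max = m;
}
```

Each section writes a different variable, so the two code regions do not interfere.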
Single construct
The single construct specifies that exactly one thread must execute the associated part of the code. We do not care which thread actually executes it, and the executing thread can differ from run to run. A typical use is the initialization of shared variables.
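The initialization use case might look like the sketch below. The function name init_once_then_use and the counter init_count are hypothetical; the counter exists only to show that the block runs once.

```c
/* Hypothetical helper: one thread fills the table, then all threads use it.
   Returns how many times the initialization block was executed. */
int init_once_then_use(int *table, int n)
{
    int init_count = 0;

    #pragma omp parallel
    {
        #pragma omp single
        {
            for (int i = 0; i < n; i++) table[i] = i * i;
            init_count++;
        }   /* the other threads wait here until the table is ready */

        #pragma omp for
        for (int i = 0; i < n; i++) table[i] += 1;
    }
    return init_count;
}
```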
Master construct
This is similar to the single construct, but it guarantees that the work will be done by the master thread. The master construct does NOT have an implied barrier at entry or exit. This may create problems, because the other threads can run ahead and use data that the master thread has not yet written. The solution is to add an explicit barrier.
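A sketch of the master construct followed by an explicit barrier. The function name master_then_barrier and the value 42 are illustrative assumptions.

```c
/* Hypothetical helper: the master thread sets a shared seed,
   then all threads use it to fill out[0..n-1]. */
int master_then_barrier(int *out, int n)
{
    int seed = 0;                 /* shared */

    #pragma omp parallel
    {
        #pragma omp master
        {
            seed = 42;            /* executed by the master thread only */
        }
        /* Without this barrier another thread could read seed too early. */
        #pragma omp barrier

        #pragma omp for
        for (int i = 0; i < n; i++) out[i] = seed + i;
    }
    return seed;
}
```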
Shared clause
The shared clause specifies which data will be shared among the threads executing the region it is associated with. There is a unique instance of each shared variable, and each thread can freely read and modify its value. A note of caution: multiple threads may try to update the same variable simultaneously. Synchronization constructs are available to resolve this issue. A good use of shared is when the threads only read the variable.
Private clause
The private clause ensures that each thread is given a private copy of the variable. Each variable in the private list is replicated so that each thread gets its own copy. The private copies are not initialized on entry to the region.
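A short sketch of a private scratch variable. The names squares_plus_one and tmp are hypothetical.

```c
/* Hypothetical helper: out[i] = in[i] * in[i] + 1. */
void squares_plus_one(const int *in, int *out, int n)
{
    int tmp;                          /* scratch variable */

    #pragma omp parallel private(tmp)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            tmp = in[i] * in[i];      /* each thread writes its own copy */
            out[i] = tmp + 1;         /* ...and reads it back safely */
        }
    }
}
```

If tmp were shared instead, one thread could overwrite it between another thread's write and read.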
Firstprivate clause
This is used when private variables need to be initialized before the region in which they will be used. Variables listed in firstprivate are private variables, but each copy is initialized to the value that the variable of the same name has just before entry into the parallel region.
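A sketch of firstprivate, assuming a hypothetical function add_offset. Each thread starts with its own copy of offset already holding the caller's value.

```c
/* Hypothetical helper: adds twice the given offset to every element. */
void add_offset(int *data, int n, int offset)
{
    #pragma omp parallel firstprivate(offset)
    {
        /* With OpenMP enabled this changes the thread's private copy only;
           every copy started out equal to the value passed in. */
        offset *= 2;

        #pragma omp for
        for (int i = 0; i < n; i++) data[i] += offset;
    }
}
```

With a plain private clause the copies would be uninitialized and the doubling would operate on garbage.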
Lastprivate clause
If the value of a private variable is needed after the region is over, this clause is used. In the case of a work-shared loop, the variable gets the value from the iteration of the loop that would be last in a sequential execution. In the case of a sections construct, the variable gets the value that it has at the end of the lexically last section.
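The loop case can be sketched as below. The function name last_iteration_value is hypothetical, and n is assumed to be at least 1.

```c
/* Hypothetical helper: returns the value assigned in the final iteration. */
int last_iteration_value(int n)
{
    int last = -1;

    #pragma omp parallel
    {
        #pragma omp for lastprivate(last)
        for (int i = 0; i < n; i++) {
            last = 10 * i;        /* private inside the loop */
        }
    }
    /* last now holds the value from iteration i == n - 1,
       regardless of which thread executed that iteration. */
    return last;
}
```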
Default clause
The default clause gives variables a default data-sharing attribute. In C/C++, its argument is either none or shared. If default(shared) is given, all variables not explicitly listed as private are shared. If default(none) is given, the programmer is forced to think about every variable and to list each one in a private or shared clause. default(none) is recommended.
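A sketch of default(none) in use, with a hypothetical dot-product function. Every variable that comes from outside the region has to appear in a clause, otherwise the compiler rejects the code.

```c
/* Hypothetical helper: dot product of a[0..n-1] and b[0..n-1]. */
long dot(const int *a, const int *b, int n)
{
    long total = 0;

    #pragma omp parallel default(none) shared(a, b, n, total)
    {
        long partial = 0;         /* declared inside the region: private */

        #pragma omp for
        for (int i = 0; i < n; i++) partial += (long)a[i] * b[i];

        /* One thread at a time adds its partial result to the shared total. */
        #pragma omp critical
        total += partial;
    }
    return total;
}
```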
Nowait clause
The nowait clause allows the programmer to fine-tune a program's performance. When this clause is added to a construct, the barrier at the end of that construct is suppressed: a thread that finishes its share of the work continues without waiting for the others to complete theirs. Usage: once a parallel program runs correctly, identify the places where a barrier is not necessary and introduce this clause there. Example: #pragma omp for nowait
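A sketch of a case where the barrier is not needed, assuming a hypothetical function fill_two whose two loops touch unrelated arrays:

```c
/* Hypothetical helper: fills a[i] = i and b[i] = -i. */
void fill_two(int *a, int *b, int n)
{
    #pragma omp parallel
    {
        /* The second loop never reads a[], so threads need not wait here. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) a[i] = i;

        /* The barrier at the end of this loop is kept. */
        #pragma omp for
        for (int i = 0; i < n; i++) b[i] = -i;
    }
}
```

If the second loop read elements of a written by other threads, removing the barrier would introduce a race.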
Schedule clause
This is supported on the loop construct only, as follows:
#pragma omp for schedule(kind, chunk_size)
There are four kinds of scheduling: static, dynamic, guided, and runtime.
Dynamic
Iterations are assigned to the threads as the threads request them. A thread executes a chunk of iterations, whose size is controlled through the chunk_size parameter, and then requests another chunk until there are no more chunks to work on. The last chunk may have fewer iterations. When no chunk size is specified, it defaults to one.
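Dynamic scheduling tends to help when iterations have uneven cost. A sketch, assuming a hypothetical function triangular_numbers in which iteration i does work proportional to i, and an arbitrary chunk size of 4:

```c
/* Hypothetical helper: out[i] = 0 + 1 + ... + i. */
void triangular_numbers(long *out, int n)
{
    #pragma omp parallel
    {
        /* Threads grab 4 iterations at a time and come back for more. */
        #pragma omp for schedule(dynamic, 4)
        for (int i = 0; i < n; i++) {
            long sum = 0;
            for (int j = 0; j <= i; j++) sum += j;   /* cost grows with i */
            out[i] = sum;
        }
    }
}
```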
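Runtime
The scheduling kind and chunk size are decided at run time, taken from the OMP_SCHEDULE environment variable.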
Ordered construct
#pragma omp ordered
This allows a structured block within a parallel loop to be executed in sequential order.
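A sketch of the ordered construct, with a hypothetical function ordered_log. Note that the enclosing loop construct must also carry the ordered clause.

```c
/* Hypothetical helper: appends i * i to log in iteration order.
   Returns the number of entries written. */
int ordered_log(int *log, int n)
{
    int next = 0;                     /* shared write position */

    #pragma omp parallel
    {
        #pragma omp for ordered
        for (int i = 0; i < n; i++) {
            int value = i * i;        /* this part may run in parallel */

            #pragma omp ordered
            {
                log[next++] = value;  /* this part runs in loop order */
            }
        }
    }
    return next;
}
```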
Critical construct
The critical construct provides a means to ensure that multiple threads do not attempt to update the same shared data simultaneously. The code protected in this way is called a critical region. An optional name may be given; names are global, so each distinct critical region should have a globally unique name. When a thread reaches a critical region, it waits until no other thread is executing a critical region with the same name.
#pragma omp critical (name)
structured block
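A sketch of a named critical region, assuming a hypothetical function find_max and the illustrative name update_best (n must be at least 1):

```c
/* Hypothetical helper: returns the largest element of data[0..n-1]. */
int find_max(const int *data, int n)
{
    int best = data[0];               /* shared result */

    #pragma omp parallel
    {
        int local = data[0];          /* per-thread running maximum */

        #pragma omp for
        for (int i = 1; i < n; i++) {
            if (data[i] > local) local = data[i];
        }

        /* Only one thread at a time may compare and update best. */
        #pragma omp critical (update_best)
        {
            if (local > best) best = local;
        }
    }
    return best;
}
```

Keeping the per-thread maximum outside the critical region means each thread enters it only once.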
OpenMP Locks
These behave like binary semaphores (mutual-exclusion locks).
OpenMP provides a set of low-level, general-purpose locking routines. They provide greater flexibility for synchronization. Nested locks are also possible.
Declaration: omp_lock_t var1; (the routines below take a pointer to the lock, &var1)
Routines for simple locks
Initialization: omp_init_lock(&var1)
Set lock: omp_set_lock(&var1)
Test lock: omp_test_lock(&var1)
Unset lock: omp_unset_lock(&var1)
Destroy lock: omp_destroy_lock(&var1)
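The life cycle of a simple lock can be sketched as below. The function name locked_sum is hypothetical. The serial stand-ins in the #else branch are an assumption added so that the sketch also builds when OpenMP is not enabled; they are not part of OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumption: serial stand-ins, used only when OpenMP is not enabled. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Hypothetical helper: returns 1 + 2 + ... + n using a lock-protected total. */
long locked_sum(int n)
{
    long total = 0;
    omp_lock_t lock;

    omp_init_lock(&lock);             /* must be initialized before use */

    #pragma omp parallel
    {
        #pragma omp for
        for (int i = 1; i <= n; i++) {
            omp_set_lock(&lock);      /* blocks until the lock is free */
            total += i;
            omp_unset_lock(&lock);    /* lets the next thread in */
        }
    }

    omp_destroy_lock(&lock);
    return total;
}
```

Locking on every iteration is slow; it is done here only to keep the sketch short.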
Environment variables
OMP_NUM_THREADS
omp_set_num_threads() omp_get_num_threads()
OMP_DYNAMIC (boolean)
omp_set_dynamic() omp_get_dynamic()
OMP_NESTED
omp_set_nested() omp_get_nested()
OMP_SCHEDULE
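A sketch of the runtime routines paired with OMP_NUM_THREADS. The function name team_size_with is hypothetical, and the #else stand-ins are an assumption added so that the sketch builds without OpenMP. omp_get_num_threads() reports the size of the current team, so it is called inside the parallel region.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumption: serial stand-ins, used only when OpenMP is not enabled. */
static void omp_set_num_threads(int n) { (void)n; }
static int  omp_get_num_threads(void)  { return 1; }
#endif

/* Hypothetical helper: requests a team size and reports what was granted. */
int team_size_with(int requested)
{
    int size = 0;

    omp_set_num_threads(requested);   /* takes precedence over OMP_NUM_THREADS */

    #pragma omp parallel
    {
        #pragma omp single
        size = omp_get_num_threads(); /* one thread records the team size */
    }
    return size;
}
```

The granted team may be smaller than requested, for example when dynamic adjustment of threads is enabled.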