

OpenMP
Author: Blaise Barney, Lawrence Livermore National Laboratory

UCRL-MI-133316

Table of Contents
1. Abstract
2. Introduction
3. OpenMP Programming Model
4. OpenMP API Overview
5. Compiling OpenMP Programs
6. OpenMP Directives
   1. Directive Format
   2. C/C++ Directive Format
   3. Directive Scoping
   4. PARALLEL Construct
   5. Exercise 1
   6. Work-Sharing Constructs
      1. DO / for Directive
      2. SECTIONS Directive
      3. WORKSHARE Directive
      4. SINGLE Directive
   7. Combined Parallel Work-Sharing Constructs
   8. TASK Construct
   9. Exercise 2
   10. Synchronization Constructs
      1. MASTER Directive
      2. CRITICAL Directive
      3. BARRIER Directive
      4. TASKWAIT Directive
      5. ATOMIC Directive
      6. FLUSH Directive
      7. ORDERED Directive
   11. THREADPRIVATE Directive
   12. Data Scope Attribute Clauses
      1. PRIVATE Clause
      2. SHARED Clause
      3. DEFAULT Clause
      4. FIRSTPRIVATE Clause
      5. LASTPRIVATE Clause
      6. COPYIN Clause
      7. COPYPRIVATE Clause
      8. REDUCTION Clause
   13. Clauses / Directives Summary
   14. Directive Binding and Nesting Rules
7. Run-Time Library Routines
8. Environment Variables
9. Thread Stack Size and Thread Binding
10. Monitoring, Debugging and Performance Analysis Tools for OpenMP
11. Exercise 3
12. References and More Information
13. Appendix A: Run-Time Library Routines

Abstract
OpenMP is an Application Program Interface (API), jointly defined by a group of major computer hardware and software vendors. OpenMP provides a portable, scalable model for developers of shared memory parallel applications. The API supports C/C++ and Fortran on a wide variety of architectures. This tutorial covers most of the major features of OpenMP 3.1, including its various constructs and directives for specifying parallel regions, work sharing, synchronization and data environment. Runtime library functions and environment variables are also covered. This tutorial includes both C and Fortran example codes and a lab exercise.

Level/Prerequisites: This tutorial is ideal for those who are new to parallel programming with OpenMP. A basic understanding of parallel programming in C or Fortran is required. For those who are unfamiliar with parallel programming in general, the material covered in EC3500: Introduction to Parallel Computing would be helpful.

Introduction
What is OpenMP?

OpenMP Is:
- An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism.
- Comprised of three primary API components:
  - Compiler Directives
  - Runtime Library Routines
  - Environment Variables
- An abbreviation for: Open Multi-Processing

OpenMP Is Not:
- Meant for distributed memory parallel systems (by itself)
- Necessarily implemented identically by all vendors
- Guaranteed to make the most efficient use of shared memory
- Required to check for data dependencies, data conflicts, race conditions, deadlocks, or code sequences that cause a program to be classified as non-conforming
- Designed to handle parallel I/O. The programmer is responsible for synchronizing input and output.

Goals of OpenMP:
- Standardization:
  - Provide a standard among a variety of shared memory architectures/platforms
  - Jointly defined and endorsed by a group of major computer hardware and software vendors
- Lean and Mean:
  - Establish a simple and limited set of directives for programming shared memory machines.
  - Significant parallelism can be implemented by using just 3 or 4 directives.
  - This goal is becoming less meaningful with each new release, apparently.
- Ease of Use:
  - Provide the capability to incrementally parallelize a serial program, unlike message passing libraries which typically require an all-or-nothing approach
  - Provide the capability to implement both coarse-grain and fine-grain parallelism
- Portability:
  - The API is specified for C/C++ and Fortran
  - Public forum for API and membership
  - Most major platforms have been implemented, including Unix/Linux platforms and Windows
History:
- In the early 90's, vendors of shared memory machines supplied similar, directive-based, Fortran programming extensions:
  - The user would augment a serial Fortran program with directives specifying which loops were to be parallelized
  - The compiler would be responsible for automatically parallelizing such loops across the SMP processors
- Implementations were all functionally similar, but were diverging (as usual)
- First attempt at a standard was the draft for ANSI X3H5 in 1994. It was never adopted, largely due to waning interest as distributed memory machines became popular.
- However, not long after this, newer shared memory machine architectures started to become prevalent, and interest resumed.
- The OpenMP standard specification started in the spring of 1997, taking over where ANSI X3H5 had left off.
- Led by the OpenMP Architecture Review Board (ARB). Original ARB members and contributors are shown below. (Disclaimer: all partner names derived from the OpenMP website)

ARB Members
- Compaq / Digital
- Hewlett-Packard Company
- Intel Corporation
- International Business Machines (IBM)
- Kuck & Associates, Inc. (KAI)
- Silicon Graphics, Inc.
- Sun Microsystems, Inc.
- U.S. Department of Energy ASCI program

Endorsing Application Developers
- ADINA R&D, Inc.
- ANSYS, Inc.
- Dash Associates
- Fluent, Inc.
- ILOG CPLEX Division
- Livermore Software Technology Corporation (LSTC)
- MECALOG SARL
- Oxford Molecular Group PLC
- The Numerical Algorithms Group Ltd. (NAG)

Endorsing Software Vendors
- Absoft Corporation
- Edinburgh Portable Compilers
- GENIAS Software GmBH
- Myrias Computer Technologies, Inc.
- The Portland Group, Inc. (PGI)
For more news and membership information about the OpenMP ARB, visit: openmp.org/wp/about-openmp.

Release History
- OpenMP continues to evolve: new constructs and features are added with each release.
- Initially, the API specifications were released separately for C and Fortran. Since 2005, they have been released together.
- The table below chronicles the OpenMP API release history.
  Date       Version
  Oct 1997   Fortran 1.0
  Oct 1998   C/C++ 1.0
  Nov 1999   Fortran 1.1
  Nov 2000   Fortran 2.0
  Mar 2002   C/C++ 2.0
  May 2005   OpenMP 2.5
  May 2008   OpenMP 3.0
  Jul 2011   OpenMP 3.1
  Jul 2013   OpenMP 4.0
  Nov 2015   OpenMP 4.5
This tutorial refers to OpenMP version 3.1. Syntax and features of newer releases are not currently covered.

References:
- OpenMP website: openmp.org
  - API specifications, FAQ, presentations, discussions, media releases, calendar, membership application and more...
- Wikipedia: en.wikipedia.org/wiki/OpenMP

OpenMP Programming Model
Shared Memory Model:
- OpenMP is designed for multi-processor/core, shared memory machines. The underlying architecture can be shared memory UMA or NUMA.

[Figure: Uniform Memory Access]    [Figure: Non-Uniform Memory Access]

Thread Based Parallelism:
- OpenMP programs accomplish parallelism exclusively through the use of threads.
- A thread of execution is the smallest unit of processing that can be scheduled by an operating system. The idea of a subroutine that can be scheduled to run autonomously might help explain what a thread is.
- Threads exist within the resources of a single process. Without the process, they cease to exist.
- Typically, the number of threads matches the number of machine processors/cores. However, the actual use of threads is up to the application.

Explicit Parallelism:
- OpenMP is an explicit (not automatic) programming model, offering the programmer full control over parallelization.
- Parallelization can be as simple as taking a serial program and inserting compiler directives...
- Or as complex as inserting subroutines to set multiple levels of parallelism, locks and even nested locks.

Fork-Join Model:
- OpenMP uses the fork-join model of parallel execution:

[Figure: fork-join model of parallel execution]

- All OpenMP programs begin as a single process: the master thread. The master thread executes sequentially until the first parallel region construct is encountered.
- FORK: the master thread then creates a team of parallel threads.
- The statements in the program that are enclosed by the parallel region construct are then executed in parallel among the various team threads.
- JOIN: When the team threads complete the statements in the parallel region construct, they synchronize and terminate, leaving only the master thread.
- The number of parallel regions and the threads that comprise them are arbitrary.

Compiler Directive Based:
- Most OpenMP parallelism is specified through the use of compiler directives which are embedded in C/C++ or Fortran source code.

Nested Parallelism:
- The API provides for the placement of parallel regions inside other parallel regions.
- Implementations may or may not support this feature.

Dynamic Threads:
- The API provides for the runtime environment to dynamically alter the number of threads used to execute parallel regions. Intended to promote more efficient use of resources, if possible.
- Implementations may or may not support this feature.

I/O:
- OpenMP specifies nothing about parallel I/O. This is particularly important if multiple threads attempt to write/read from the same file.
- If every thread conducts I/O to a different file, the issues are not as significant.
- It is entirely up to the programmer to ensure that I/O is conducted correctly within the context of a multi-threaded program.
Memory Model: FLUSH Often?
- OpenMP provides a "relaxed consistency" and "temporary" view of thread memory (in their words). In other words, threads can "cache" their data and are not required to maintain exact consistency with real memory all of the time.
- When it is critical that all threads view a shared variable identically, the programmer is responsible for ensuring that the variable is FLUSHed by all threads as needed.
- More on this later...

OpenMP API Overview
Three Components:
- The OpenMP API is comprised of three distinct components:
  - Compiler Directives (44)
  - Runtime Library Routines (35)
  - Environment Variables (13)
- The application developer decides how to employ these components. In the simplest case, only a few of them are needed.
- Implementations differ in their support of all API components. For example, an implementation may state that it supports nested parallelism, but the API makes it clear that may be limited to a single thread - the master thread. Not exactly what the developer might expect?

Compiler Directives:
- Compiler directives appear as comments in your source code and are ignored by compilers unless you tell them otherwise - usually by specifying the appropriate compiler flag, as discussed in the Compiling section later.
- OpenMP compiler directives are used for various purposes:
  - Spawning a parallel region
  - Dividing blocks of code among threads
  - Distributing loop iterations between threads
  - Serializing sections of code
  - Synchronization of work among threads
- Compiler directives have the following syntax:

    sentinel  directive-name  [clause, ...]

- For example:

    Fortran:  !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(BETA,PI)
    C/C++:    #pragma omp parallel default(shared) private(beta,pi)

- Compiler directives are covered in detail later.
Run-time Library Routines:
- The OpenMP API includes an ever-growing number of run-time library routines.
- These routines are used for a variety of purposes:
  - Setting and querying the number of threads
  - Querying a thread's unique identifier (thread ID), a thread's ancestor's identifier, the thread team size
  - Setting and querying the dynamic threads feature
  - Querying if in a parallel region, and at what level
  - Setting and querying nested parallelism
  - Setting, initializing and terminating locks and nested locks
  - Querying wall clock time and resolution
- For C/C++, all of the run-time library routines are actual subroutines. For Fortran, some are actually functions, and some are subroutines. For example:

    Fortran:  INTEGER FUNCTION OMP_GET_NUM_THREADS()
    C/C++:    #include <omp.h>
              int omp_get_num_threads(void)

- Note that for C/C++, you usually need to include the <omp.h> header file.
- Fortran routines are not case sensitive, but C/C++ routines are.
- The run-time library routines are briefly discussed as an overview in the Run-Time Library Routines section, and in more detail in Appendix A.
Environment Variables:
- OpenMP provides several environment variables for controlling the execution of parallel code at run-time.
- These environment variables can be used to control such things as:
  - Setting the number of threads
  - Specifying how loop iterations are divided
  - Binding threads to processors
  - Enabling/disabling nested parallelism; setting the maximum levels of nested parallelism
  - Enabling/disabling dynamic threads
  - Setting thread stack size
  - Setting thread wait policy
- Setting OpenMP environment variables is done the same way you set any other environment variables, and depends upon which shell you use. For example:

    csh/tcsh:  setenv OMP_NUM_THREADS 8
    sh/bash:   export OMP_NUM_THREADS=8

- OpenMP environment variables are discussed in the Environment Variables section later.
Example OpenMP Code Structure:

Fortran General Code Structure

           PROGRAM HELLO

           INTEGER VAR1, VAR2, VAR3

    !      Serial code
           .
           .
           .

    !      Beginning of parallel region. Fork a team of threads.
    !      Specify variable scoping

    !$OMP PARALLEL PRIVATE(VAR1, VAR2) SHARED(VAR3)

    !      Parallel region executed by all threads
    !      Other OpenMP directives
    !      Run-time Library calls
           .
           .
           .

    !      All threads join master thread and disband

    !$OMP END PARALLEL

    !      Resume serial code
           .
           .
           .

           END

C/C++ General Code Structure

    #include <omp.h>

    main ()  {

      int var1, var2, var3;

      /* Serial code */
      .
      .
      .

      /* Beginning of parallel region. Fork a team of threads.
         Specify variable scoping */

      #pragma omp parallel private(var1, var2) shared(var3)
      {

        /* Parallel region executed by all threads */
        /* Other OpenMP directives */
        /* Run-time Library calls */
        .
        .
        .

        /* All threads join master thread and disband */

      }

      /* Resume serial code */
      .
      .
      .

    }

Compiling OpenMP Programs
LC OpenMP Implementations:
- As of June 2016, the documentation sources for LC's default compilers claim the following OpenMP support:

    Compiler                           Version   Supports
    Intel C/C++, Fortran               14.0.3    OpenMP 3.1
    GNU C/C++, Fortran                 4.4.7     OpenMP 3.0
    PGI C/C++, Fortran                 8.0.1     OpenMP 3.0
    IBM BlueGene C/C++                 12.1      OpenMP 3.1
    IBM BlueGene Fortran               14.1      OpenMP 3.1
    IBM BlueGene GNU C/C++, Fortran    4.4.7     OpenMP 3.0

- OpenMP 4.0 Support (according to vendor and openmp.org documentation):
  - GNU: supported in 4.9 for C/C++ and 4.9.1 for Fortran
  - Intel: 14.0 has "some" support; 15.0 supports "most features"; version 16 supported
  - PGI: not currently available
  - IBM BG/Q: not currently available
- OpenMP 4.5 Support:
  - Not currently supported on any of LC's production cluster compilers.
  - Supported in a beta version of the Clang compiler on the non-production rzmist and rzhasgpu clusters (June 2016).
- To view all LC compiler versions, use the command use -l compilers to view compiler packages by version.
- To view LC's default compiler versions see: https://computing.llnl.gov/?set=code&page=compilers
- Best place to view OpenMP support by a range of compilers: http://openmp.org/wp/openmp-compilers/.
Compiling:
- All of LC's compilers require you to use the appropriate compiler flag to "turn on" OpenMP compilations. The table below shows what to use for each compiler.

    Compiler / Platform       Compiler                    Flag
    Intel                     icc                         -openmp
    Linux Opteron/Xeon        icpc
                              ifort

    PGI                       pgcc                        -mp
    Linux Opteron/Xeon        pgCC
                              pgf77
                              pgf90

    GNU                       gcc                         -fopenmp
    Linux Opteron/Xeon        g++
    IBM Blue Gene             g77
                              gfortran

    IBM                       bgxlc_r, bgcc_r             -qsmp=omp
    Blue Gene                 bgxlC_r, bgxlc++_r
                              bgxlc89_r
                              bgxlc99_r
                              bgxlf_r
                              bgxlf90_r
                              bgxlf95_r
                              bgxlf2003_r

    * Be sure to use a thread-safe compiler - its name ends with _r

Compiler Documentation:
- IBM BlueGene: www-01.ibm.com/software/awdtools/fortran/ and www-01.ibm.com/software/awdtools/xlcpp
- Intel: www.intel.com/software/products/compilers/
- PGI: www.pgroup.com
- GNU: gnu.org
- All: See the relevant man pages and any files that might relate in /usr/local/docs

OpenMP Directives
Fortran Directives Format

Format: (case insensitive)

    sentinel  directive-name  [clause ...]

    sentinel: All Fortran OpenMP directives must begin with a sentinel. The accepted sentinels depend upon the type of Fortran source. Possible sentinels are: !$OMP, C$OMP, *$OMP
    directive-name: A valid OpenMP directive. Must appear after the sentinel and before any clauses.
    [clause ...]: Optional. Clauses can be in any order, and repeated as necessary unless otherwise restricted.

Example:

    !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(BETA,PI)

Fixed Form Source:
- !$OMP, C$OMP, *$OMP are accepted sentinels and must start in column 1
- All Fortran fixed form rules for line length, white space, continuation and comment columns apply for the entire directive line
- Initial directive lines must have a space/zero in column 6.
- Continuation lines must have a non-space/zero in column 6.

Free Form Source:
- !$OMP is the only accepted sentinel. Can appear in any column, but must be preceded by white space only.
- All Fortran free form rules for line length, white space, continuation and comment columns apply for the entire directive line
- Initial directive lines must have a space after the sentinel.
- Continuation lines must have an ampersand as the last non-blank character in a line. The following line must begin with a sentinel and then the continuation directives.

General Rules:
- Comments can not appear on the same line as a directive
- Only one directive-name may be specified per directive
- Fortran compilers which are OpenMP enabled generally include a command line option which instructs the compiler to activate and interpret all OpenMP directives.
- Several Fortran OpenMP directives come in pairs and have the form shown below. The "end" directive is optional but advised for readability.

    !$OMP  directive

        [ structured block of code ]

    !$OMP end  directive

OpenMP Directives
C/C++ Directives Format

Format:

    #pragma omp  directive-name  [clause, ...]  newline

    #pragma omp: Required for all OpenMP C/C++ directives.
    directive-name: A valid OpenMP directive. Must appear after the pragma and before any clauses.
    [clause, ...]: Optional. Clauses can be in any order, and repeated as necessary unless otherwise restricted.
    newline: Required. Precedes the structured block which is enclosed by this directive.

Example:

    #pragma omp parallel default(shared) private(beta,pi)
General Rules:
- Case sensitive
- Directives follow conventions of the C/C++ standards for compiler directives
- Only one directive-name may be specified per directive
- Each directive applies to at most one succeeding statement, which must be a structured block.
- Long directive lines can be "continued" on succeeding lines by escaping the newline character with a backslash ("\") at the end of a directive line.

OpenMP Directives
Directive Scoping

Do we do this now... or do it later? Oh well, let's get it over with early...

Static (Lexical) Extent:
- The code textually enclosed between the beginning and the end of a structured block following a directive.
- The static extent of a directive does not span multiple routines or code files

Orphaned Directive:
- An OpenMP directive that appears independently from another enclosing directive is said to be an orphaned directive. It exists outside of another directive's static (lexical) extent.
- Will span routines and possibly code files

Dynamic Extent:
- The dynamic extent of a directive includes both its static (lexical) extent and the extents of its orphaned directives.
Example:

          PROGRAM TEST
          ...
    !$OMP PARALLEL
          ...
    !$OMP DO
          DO I=...
          ...
          CALL SUB1
          ...
          ENDDO
    !$OMP END DO
          ...
          CALL SUB2
          ...
    !$OMP END PARALLEL

          SUBROUTINE SUB1
          ...
    !$OMP CRITICAL
          ...
    !$OMP END CRITICAL
          END

          SUBROUTINE SUB2
          ...
    !$OMP SECTIONS
          ...
    !$OMP END SECTIONS
          ...
          END

STATIC EXTENT: The DO directive occurs within an enclosing parallel region.

ORPHANED DIRECTIVES: The CRITICAL and SECTIONS directives occur outside an enclosing parallel region.

DYNAMIC EXTENT: The CRITICAL and SECTIONS directives occur within the dynamic extent of the DO and PARALLEL directives.

Why Is This Important?
- OpenMP specifies a number of scoping rules on how directives may associate (bind) and nest within each other
- Illegal and/or incorrect programs may result if the OpenMP binding and nesting rules are ignored
- See Directive Binding and Nesting Rules for specific details

OpenMP Directives
PARALLEL Region Construct

Purpose:
- A parallel region is a block of code that will be executed by multiple threads. This is the fundamental OpenMP parallel construct.

Format:

Fortran:

    !$OMP PARALLEL [clause ...]
                   IF (scalar_logical_expression)
                   PRIVATE (list)
                   SHARED (list)
                   DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)
                   FIRSTPRIVATE (list)
                   REDUCTION (operator: list)
                   COPYIN (list)
                   NUM_THREADS (scalar-integer-expression)

       block

    !$OMP END PARALLEL

C/C++:

    #pragma omp parallel [clause ...]  newline
                         if (scalar_expression)
                         private (list)
                         shared (list)
                         default (shared | none)
                         firstprivate (list)
                         reduction (operator: list)
                         copyin (list)
                         num_threads (integer-expression)

       structured_block

Notes:
- When a thread reaches a PARALLEL directive, it creates a team of threads and becomes the master of the team. The master is a member of that team and has thread number 0 within that team.
- Starting from the beginning of this parallel region, the code is duplicated and all threads will execute that code.
- There is an implied barrier at the end of a parallel region. Only the master thread continues execution past this point.
- If any thread terminates within a parallel region, all threads in the team will terminate, and the work done up until that point is undefined.

How Many Threads?
- The number of threads in a parallel region is determined by the following factors, in order of precedence:
  1. Evaluation of the IF clause
  2. Setting of the NUM_THREADS clause
  3. Use of the omp_set_num_threads() library function
  4. Setting of the OMP_NUM_THREADS environment variable
  5. Implementation default - usually the number of CPUs on a node, though it could be dynamic (see next bullet).
- Threads are numbered from 0 (master thread) to N-1
Dynamic Threads:
- Use the omp_get_dynamic() library function to determine if dynamic threads are enabled.
- If supported, the two methods available for enabling dynamic threads are:
  1. The omp_set_dynamic() library routine
  2. Setting of the OMP_DYNAMIC environment variable to TRUE

Nested Parallel Regions:
- Use the omp_get_nested() library function to determine if nested parallel regions are enabled.
- The two methods available for enabling nested parallel regions (if supported) are:
  1. The omp_set_nested() library routine
  2. Setting of the OMP_NESTED environment variable to TRUE
- If not supported, a parallel region nested within another parallel region results in the creation of a new team, consisting of one thread, by default.
Clauses:
- IF clause: If present, it must evaluate to .TRUE. (Fortran) or non-zero (C/C++) in order for a team of threads to be created. Otherwise, the region is executed serially by the master thread.
- The remaining clauses are described in detail later, in the Data Scope Attribute Clauses section.
Restrictions:
- A parallel region must be a structured block that does not span multiple routines or code files
- It is illegal to branch (goto) into or out of a parallel region
- Only a single IF clause is permitted
- Only a single NUM_THREADS clause is permitted
- A program must not depend upon the ordering of the clauses

Example: Parallel Region
- Simple "Hello World" program
- Every thread executes all code enclosed in the parallel region
- OpenMP library routines are used to obtain thread identifiers and total number of threads

Fortran Parallel Region Example

           PROGRAM HELLO

           INTEGER NTHREADS, TID, OMP_GET_NUM_THREADS,
         +   OMP_GET_THREAD_NUM

    !      Fork a team of threads with each thread having a private TID variable
    !$OMP PARALLEL PRIVATE(TID)

    !      Obtain and print thread id
           TID = OMP_GET_THREAD_NUM()
           PRINT *, 'Hello World from thread ', TID

    !      Only master thread does this
           IF (TID .EQ. 0) THEN
             NTHREADS = OMP_GET_NUM_THREADS()
             PRINT *, 'Number of threads = ', NTHREADS
           END IF

    !      All threads join master thread and disband
    !$OMP END PARALLEL

           END

C/C++ Parallel Region Example

    #include <omp.h>

    main (int argc, char *argv[])  {

      int nthreads, tid;

      /* Fork a team of threads with each thread having a private tid variable */
      #pragma omp parallel private(tid)
      {

        /* Obtain and print thread id */
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d\n", tid);

        /* Only master thread does this */
        if (tid == 0)
          {
          nthreads = omp_get_num_threads();
          printf("Number of threads = %d\n", nthreads);
          }

      }  /* All threads join master thread and terminate */

    }

OpenMP Exercise 1
Getting Started

Overview:
- Login to the workshop cluster using your workshop username and OTP token
- Copy the exercise files to your home directory
- Familiarize yourself with LC's OpenMP environment
- Write a simple "Hello World" OpenMP program
- Successfully compile your program
- Successfully run your program
- Modify the number of threads used to run your program

GO TO THE EXERCISE HERE

Approx. 20 minutes

OpenMP Directives
Work-Sharing Constructs

- A work-sharing construct divides the execution of the enclosed code region among the members of the team that encounter it.
- Work-sharing constructs do not launch new threads
- There is no implied barrier upon entry to a work-sharing construct; however, there is an implied barrier at the end of a work-sharing construct.

Types of Work-Sharing Constructs:
NOTE: The Fortran workshare construct is not shown here, but is discussed later.
- DO / for: shares iterations of a loop across the team. Represents a type of "data parallelism".
- SECTIONS: breaks work into separate, discrete sections. Each section is executed by a thread. Can be used to implement a type of "functional parallelism".
- SINGLE: serializes a section of code

Restrictions:
- A work-sharing construct must be enclosed dynamically within a parallel region in order for the directive to execute in parallel.
- Work-sharing constructs must be encountered by all members of a team or none at all
- Successive work-sharing constructs must be encountered in the same order by all members of a team

OpenMP Directives
Work-Sharing Constructs
DO / for Directive

Purpose:
- The DO / for directive specifies that the iterations of the loop immediately following it must be executed in parallel by the team. This assumes a parallel region has already been initiated; otherwise it executes in serial on a single processor.
Format:

Fortran:

    !$OMP DO [clause ...]
             SCHEDULE (type [,chunk])
             ORDERED
             PRIVATE (list)
             FIRSTPRIVATE (list)
             LASTPRIVATE (list)
             SHARED (list)
             REDUCTION (operator | intrinsic : list)
             COLLAPSE (n)

       do_loop

    !$OMP END DO [ NOWAIT ]

C/C++:

    #pragma omp for [clause ...]  newline
                    schedule (type [,chunk])
                    ordered
                    private (list)
                    firstprivate (list)
                    lastprivate (list)
                    shared (list)
                    reduction (operator: list)
                    collapse (n)
                    nowait

       for_loop

Clauses:
- SCHEDULE: Describes how iterations of the loop are divided among the threads in the team. The default schedule is implementation dependent. For a discussion on how one type of scheduling may be more optimal than others, see http://openmp.org/forum/viewtopic.php?f=3&t=83.
  - STATIC: Loop iterations are divided into pieces of size chunk and then statically assigned to threads. If chunk is not specified, the iterations are evenly (if possible) divided contiguously among the threads.
  - DYNAMIC: Loop iterations are divided into pieces of size chunk, and dynamically scheduled among the threads; when a thread finishes one chunk, it is dynamically assigned another. The default chunk size is 1.
  - GUIDED: Iterations are dynamically assigned to threads in blocks as threads request them until no blocks remain to be assigned. Similar to DYNAMIC except that the block size decreases each time a parcel of work is given to a thread. The size of the initial block is proportional to:

        number_of_iterations / number_of_threads

    Subsequent blocks are proportional to:

        number_of_iterations_remaining / number_of_threads

    The chunk parameter defines the minimum block size. The default chunk size is 1.
  - RUNTIME: The scheduling decision is deferred until runtime by the environment variable OMP_SCHEDULE. It is illegal to specify a chunk size for this clause.
  - AUTO: The scheduling decision is delegated to the compiler and/or runtime system.
- NOWAIT / nowait: If specified, then threads do not synchronize at the end of the parallel loop.
- ORDERED: Specifies that the iterations of the loop must be executed as they would be in a serial program.
- COLLAPSE: Specifies how many loops in a nested loop should be collapsed into one large iteration space and divided according to the schedule clause. The sequential execution of the iterations in all associated loops determines the order of the iterations in the collapsed iteration space.
- Other clauses are described in detail later, in the Data Scope Attribute Clauses section.
Restrictions:
- The DO loop can not be a DO WHILE loop, or a loop without loop control. Also, the loop iteration variable must be an integer and the loop control parameters must be the same for all threads.
- Program correctness must not depend upon which thread executes a particular iteration.
- It is illegal to branch (goto) out of a loop associated with a DO / for directive.
- The chunk size must be specified as a loop invariant integer expression, as there is no synchronization during its evaluation by different threads.
- ORDERED, COLLAPSE and SCHEDULE clauses may appear once each.
- See the OpenMP specification document for additional restrictions.

Example: DO / for Directive
- Simple vector-add program
- Arrays A, B, C, and variable N will be shared by all threads.
- Variable I will be private to each thread; each thread will have its own unique copy.
- The iterations of the loop will be distributed dynamically in CHUNK sized pieces.
- Threads will not synchronize upon completing their individual pieces of work (NOWAIT).

Fortran DO Directive Example

           PROGRAM VEC_ADD_DO

           INTEGER N, CHUNKSIZE, CHUNK, I
           PARAMETER (N=1000)
           PARAMETER (CHUNKSIZE=100)
           REAL A(N), B(N), C(N)

    !      Some initializations
           DO I = 1, N
             A(I) = I * 1.0
             B(I) = A(I)
           ENDDO
           CHUNK = CHUNKSIZE

    !$OMP PARALLEL SHARED(A,B,C,CHUNK) PRIVATE(I)

    !$OMP DO SCHEDULE(DYNAMIC,CHUNK)
           DO I = 1, N
             C(I) = A(I) + B(I)
           ENDDO
    !$OMP END DO NOWAIT

    !$OMP END PARALLEL

           END

C/C++ for Directive Example

    #include <omp.h>
    #define N 1000
    #define CHUNKSIZE 100

    main (int argc, char *argv[])  {

      int i, chunk;
      float a[N], b[N], c[N];

      /* Some initializations */
      for (i=0; i < N; i++)
        a[i] = b[i] = i * 1.0;
      chunk = CHUNKSIZE;

      #pragma omp parallel shared(a,b,c,chunk) private(i)
      {

        #pragma omp for schedule(dynamic,chunk) nowait
        for (i=0; i < N; i++)
          c[i] = a[i] + b[i];

      }  /* end of parallel region */

    }

OpenMP Directives
Work-Sharing Constructs
SECTIONS Directive

Purpose:
- The SECTIONS directive is a non-iterative work-sharing construct. It specifies that the enclosed section(s) of code are to be divided among the threads in the team.
- Independent SECTION directives are nested within a SECTIONS directive. Each SECTION is executed once by a thread in the team. Different sections may be executed by different threads. It is possible for a thread to execute more than one section if it is quick enough and the implementation permits such.

Format:

Fortran:

    !$OMP SECTIONS [clause ...]
                   PRIVATE (list)
                   FIRSTPRIVATE (list)
                   LASTPRIVATE (list)
                   REDUCTION (operator | intrinsic : list)

    !$OMP SECTION

       block

    !$OMP SECTION

       block

    !$OMP END SECTIONS [ NOWAIT ]

C/C++:

    #pragma omp sections [clause ...]  newline
                         private (list)
                         firstprivate (list)
                         lastprivate (list)
                         reduction (operator: list)
                         nowait
      {

      #pragma omp section   newline

         structured_block

      #pragma omp section   newline

         structured_block

      }
Clauses:
There is an implied barrier at the end of a SECTIONS directive, unless the NOWAIT/nowait clause is used.
Clauses are described in detail later, in the Data Scope Attribute Clauses section.

Questions:
What happens if the number of threads and the number of SECTIONs are different? More threads than SECTIONs? Fewer threads than SECTIONs?
Which thread executes which SECTION?

Restrictions:
It is illegal to branch (GOTO) into or out of section blocks.
SECTION directives must occur within the lexical extent of an enclosing SECTIONS directive (no orphaned SECTIONs).

Example: SECTIONS Directive
Simple program demonstrating that different blocks of work will be done by different threads.

Fortran SECTIONS Directive Example

      PROGRAM VEC_ADD_SECTIONS

      INTEGER N, I
      PARAMETER (N=1000)
      REAL A(N), B(N), C(N), D(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.5
        B(I) = I + 22.35
      ENDDO

!$OMP PARALLEL SHARED(A,B,C,D), PRIVATE(I)

!$OMP SECTIONS

!$OMP SECTION
      DO I = 1, N
         C(I) = A(I) + B(I)
      ENDDO

!$OMP SECTION
      DO I = 1, N
         D(I) = A(I) * B(I)
      ENDDO

!$OMP END SECTIONS NOWAIT

!$OMP END PARALLEL

      END

C/C++ sections Directive Example

#include <omp.h>
#define N     1000

main(int argc, char *argv[]) {

int i;
float a[N], b[N], c[N], d[N];

/* Some initializations */
for (i=0; i < N; i++) {
  a[i] = i * 1.5;
  b[i] = i + 22.35;
  }

#pragma omp parallel shared(a,b,c,d) private(i)
  {

  #pragma omp sections nowait
    {

    #pragma omp section
    for (i=0; i < N; i++)
      c[i] = a[i] + b[i];

    #pragma omp section
    for (i=0; i < N; i++)
      d[i] = a[i] * b[i];

    }  /* end of sections */

  }  /* end of parallel region */

}

OpenMP Directives

Work-Sharing Constructs
WORKSHARE Directive

Purpose:
Fortran only.
The WORKSHARE directive divides the execution of the enclosed structured block into separate units of work, each of which is executed only once.
The structured block must consist of only the following:
   array assignments
   scalar assignments
   FORALL statements
   FORALL constructs
   WHERE statements
   WHERE constructs
   atomic constructs
   critical constructs
   parallel constructs
See the OpenMP API documentation for additional information, particularly for what comprises a "unit of work".
Format:

Fortran:

!$OMP WORKSHARE

   structured block

!$OMP END WORKSHARE [ NOWAIT ]

Restrictions:
The construct must not contain any user-defined function calls unless the function is ELEMENTAL.

Example: WORKSHARE Directive
Simple array and scalar assignments shared by the team of threads. A unit of work would include:
   Any scalar assignment
   For array assignment statements, the assignment of each element is a unit of work

Fortran WORKSHARE Directive Example

      PROGRAM WORKSHARE

      INTEGER N, I, J
      PARAMETER (N=100)
      REAL AA(N,N), BB(N,N), CC(N,N), DD(N,N), FIRST, LAST

!     Some initializations
      DO I = 1, N
        DO J = 1, N
          AA(J,I) = I * 1.0
          BB(J,I) = J + 1.0
        ENDDO
      ENDDO

!$OMP PARALLEL SHARED(AA,BB,CC,DD,FIRST,LAST)

!$OMP WORKSHARE
      CC = AA * BB
      DD = AA + BB
      FIRST = CC(1,1) + DD(1,1)
      LAST = CC(N,N) + DD(N,N)
!$OMP END WORKSHARE NOWAIT

!$OMP END PARALLEL

      END

OpenMP Directives

Work-Sharing Constructs
SINGLE Directive

Purpose:
The SINGLE directive specifies that the enclosed code is to be executed by only one thread in the team.
May be useful when dealing with sections of code that are not thread safe (such as I/O).
Format:

Fortran:

!$OMP SINGLE [clause ...]
      PRIVATE (list)
      FIRSTPRIVATE (list)

   block

!$OMP END SINGLE [ NOWAIT ]

C/C++:

#pragma omp single [clause ...]  newline
                   private (list)
                   firstprivate (list)
                   nowait

     structured_block
Clauses:
Threads in the team that do not execute the SINGLE directive wait at the end of the enclosed code block, unless a NOWAIT/nowait clause is specified.
Clauses are described in detail later, in the Data Scope Attribute Clauses section.

Restrictions:
It is illegal to branch into or out of a SINGLE block.

OpenMP Directives

Combined Parallel Work-Sharing Constructs

OpenMP provides three directives that are merely conveniences:
   PARALLEL DO / parallel for
   PARALLEL SECTIONS
   PARALLEL WORKSHARE (Fortran only)
For the most part, these directives behave identically to an individual PARALLEL directive being immediately followed by a separate work-sharing directive.
Most of the rules, clauses and restrictions that apply to both directives are in effect. See the OpenMP API for details.
An example using the PARALLEL DO / parallel for combined directive is shown below.

Fortran PARALLEL DO Directive Example

      PROGRAM VECTOR_ADD

      INTEGER N, I, CHUNKSIZE, CHUNK
      PARAMETER (N=1000)
      PARAMETER (CHUNKSIZE=100)
      REAL A(N), B(N), C(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.0
        B(I) = A(I)
      ENDDO
      CHUNK = CHUNKSIZE

!$OMP PARALLEL DO
!$OMP& SHARED(A,B,C,CHUNK) PRIVATE(I)
!$OMP& SCHEDULE(STATIC,CHUNK)

      DO I = 1, N
         C(I) = A(I) + B(I)
      ENDDO

!$OMP END PARALLEL DO

      END

C/C++ parallel for Directive Example

#include <omp.h>
#define N         1000
#define CHUNKSIZE  100

main(int argc, char *argv[])  {

int i, chunk;
float a[N], b[N], c[N];

/* Some initializations */
for (i=0; i < N; i++)
  a[i] = b[i] = i * 1.0;
chunk = CHUNKSIZE;

#pragma omp parallel for           \
   shared(a,b,c,chunk) private(i)  \
   schedule(static,chunk)
  for (i=0; i < N; i++)
    c[i] = a[i] + b[i];
}

OpenMP Directives

TASK Construct

Purpose:
The TASK construct defines an explicit task, which may be executed by the encountering thread, or deferred for execution by any other thread in the team.
The data environment of the task is determined by the data-sharing attribute clauses.
Task execution is subject to task scheduling - see the OpenMP 3.1 specification document for details.
Also see the OpenMP 3.1 documentation for the associated taskyield and taskwait directives.
Format:

Fortran:

!$OMP TASK [clause ...]
      IF (scalar logical expression)
      FINAL (scalar logical expression)
      UNTIED
      DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)
      MERGEABLE
      PRIVATE (list)
      FIRSTPRIVATE (list)
      SHARED (list)

   block

!$OMP END TASK

C/C++:

#pragma omp task [clause ...]  newline
                 if (scalar expression)
                 final (scalar expression)
                 untied
                 default (shared | none)
                 mergeable
                 private (list)
                 firstprivate (list)
                 shared (list)

     structured_block
Clauses and Restrictions:
Please consult the OpenMP 3.1 specifications document for details.

OpenMP Exercise 2

Work-Sharing Constructs

Overview:
   Login to the LC workshop cluster, if you are not already logged in
   Work-Sharing DO/for construct examples: review, compile and run
   Work-Sharing SECTIONS construct example: review, compile and run

GO TO THE EXERCISE HERE

Approx. 20 minutes

OpenMP Directives

Synchronization Constructs

Consider a simple example where two threads on two different processors are both trying to increment a variable x at the same time (assume x is initially 0):

THREAD 1:                        THREAD 2:

increment(x)                     increment(x)
{                                {
   x = x + 1;                       x = x + 1;
}                                }

THREAD 1:                        THREAD 2:

10  LOAD A, (x address)          10  LOAD A, (x address)
20  ADD A, 1                     20  ADD A, 1
30  STORE A, (x address)         30  STORE A, (x address)

One possible execution sequence:
1. Thread 1 loads the value of x into register A.
2. Thread 2 loads the value of x into register A.
3. Thread 1 adds 1 to register A.
4. Thread 2 adds 1 to register A.
5. Thread 1 stores register A at location x.
6. Thread 2 stores register A at location x.

The resultant value of x will be 1, not 2 as it should be.
To avoid a situation like this, the incrementing of x must be synchronized between the two threads to ensure that the correct result is produced.
OpenMP provides a variety of Synchronization Constructs that control how the execution of each thread proceeds relative to other team threads.

OpenMP Directives

Synchronization Constructs
MASTER Directive

Purpose:
The MASTER directive specifies a region that is to be executed only by the master thread of the team. All other threads in the team skip this section of code.
There is no implied barrier associated with this directive.

Format:

Fortran:

!$OMP MASTER

   block

!$OMP END MASTER

C/C++:

#pragma omp master  newline

   structured_block
Restrictions:
It is illegal to branch into or out of a MASTER block.

OpenMP Directives

Synchronization Constructs
CRITICAL Directive

Purpose:
The CRITICAL directive specifies a region of code that must be executed by only one thread at a time.

Format:

Fortran:

!$OMP CRITICAL [ name ]

   block

!$OMP END CRITICAL [ name ]

C/C++:

#pragma omp critical [ name ]  newline

   structured_block
Notes:
If a thread is currently executing inside a CRITICAL region and another thread reaches that CRITICAL region and attempts to execute it, it will block until the first thread exits that CRITICAL region.
The optional name enables multiple different CRITICAL regions to exist:
   Names act as global identifiers. Different CRITICAL regions with the same name are treated as the same region.
   All CRITICAL sections which are unnamed are treated as the same section.

Restrictions:
It is illegal to branch into or out of a CRITICAL block.
Fortran only: The names of critical constructs are global entities of the program. If a name conflicts with any other entity, the behavior of the program is unspecified.

Example: CRITICAL Construct
All threads in the team will attempt to execute in parallel; however, because of the CRITICAL construct surrounding the increment of x, only one thread will be able to read/increment/write x at any time.

Fortran CRITICAL Directive Example

      PROGRAM CRITICAL

      INTEGER X
      X = 0

!$OMP PARALLEL SHARED(X)

!$OMP CRITICAL
      X = X + 1
!$OMP END CRITICAL

!$OMP END PARALLEL

      END

C/C++ critical Directive Example

#include <omp.h>

main(int argc, char *argv[]) {

int x;
x = 0;

#pragma omp parallel shared(x)
  {

  #pragma omp critical
  x = x + 1;

  }  /* end of parallel region */

}

OpenMP Directives

Synchronization Constructs
BARRIER Directive

Purpose:
The BARRIER directive synchronizes all threads in the team.
When a BARRIER directive is reached, a thread will wait at that point until all other threads have reached that barrier. All threads then resume executing in parallel the code that follows the barrier.

Format:

Fortran:

!$OMP BARRIER

C/C++:

#pragma omp barrier  newline

Restrictions:
All threads in a team (or none) must execute the BARRIER region.
The sequence of work-sharing regions and barrier regions encountered must be the same for every thread in a team.

OpenMP Directives

Synchronization Constructs
TASKWAIT Directive

Purpose:
OpenMP 3.1 feature.
The TASKWAIT construct specifies a wait on the completion of child tasks generated since the beginning of the current task.

Format:

Fortran:

!$OMP TASKWAIT

C/C++:

#pragma omp taskwait  newline

Restrictions:
Because the taskwait construct does not have a C language statement as part of its syntax, there are some restrictions on its placement within a program. The taskwait directive may be placed only at a point where a base language statement is allowed. The taskwait directive may not be used in place of the statement following an if, while, do, switch, or label. See the OpenMP 3.1 specifications document for details.

OpenMP Directives

Synchronization Constructs
ATOMIC Directive

Purpose:
The ATOMIC directive specifies that a specific memory location must be updated atomically, rather than letting multiple threads attempt to write to it. In essence, this directive provides a mini-CRITICAL section.

Format:

Fortran:

!$OMP ATOMIC

   statement_expression

C/C++:

#pragma omp atomic  newline

   statement_expression

Restrictions:
The directive applies only to a single, immediately following statement.
An atomic statement must follow a specific syntax. See the most recent OpenMP specs for this.

OpenMP Directives

Synchronization Constructs
FLUSH Directive

Purpose:
The FLUSH directive identifies a synchronization point at which the implementation must provide a consistent view of memory. Thread-visible variables are written back to memory at this point.
There is a fair amount of discussion on this directive within OpenMP circles that you may wish to consult for more information. Some of it is hard to understand. Per the API:
   If the intersection of the flush-sets of two flushes performed by two different threads is non-empty, then the two flushes must be completed as if in some sequential order, seen by all threads.
Say what?
To quote from the openmp.org FAQ:
   Q17: Is the !$omp flush directive necessary on a cache coherent system?
   A17: Yes, the flush directive is necessary. Look in the OpenMP specifications for examples of its uses. The directive is necessary to instruct the compiler that the variable must be written to/read from the memory system, i.e. that the variable cannot be kept in a local CPU register over the flush "statement" in your code.
   Cache coherency makes certain that if one CPU executes a read or write instruction from/to memory, then all other CPUs in the system will get the same value from that memory address when they access it. All caches will show a coherent value. However, in the OpenMP standard there must be a way to instruct the compiler to actually insert the read/write machine instruction and not postpone it. Keeping a variable in a register in a loop is very common when producing efficient machine language code for a loop.
Also see the most recent OpenMP specs for details.

Format:

Fortran:

!$OMP FLUSH  (list)

C/C++:

#pragma omp flush (list)  newline

Notes:
The optional list contains a list of named variables that will be flushed in order to avoid flushing all variables. For pointers in the list, note that the pointer itself is flushed, not the object it points to.
Implementations must ensure any prior modifications to thread-visible variables are visible to all threads after this point; i.e. compilers must restore values from registers to memory, hardware might need to flush write buffers, etc.
The FLUSH directive is implied for the directives shown in the table below. The directive is not implied if a NOWAIT clause is present.
Fortran                            C/C++

BARRIER                            barrier
END PARALLEL                       parallel - upon entry and exit
CRITICAL and END CRITICAL          critical - upon entry and exit
END DO                             ordered - upon entry and exit
END SECTIONS                       for - upon exit
END SINGLE                         sections - upon exit
ORDERED and END ORDERED            single - upon exit

OpenMP Directives

Synchronization Constructs
ORDERED Directive

Purpose:
The ORDERED directive specifies that iterations of the enclosed loop will be executed in the same order as if they were executed on a serial processor.
Threads will need to wait before executing their chunk of iterations if previous iterations haven't completed yet.
Used within a DO/for loop with an ORDERED clause.
The ORDERED directive provides a way to "fine tune" where ordering is to be applied within a loop. Otherwise, it is not required.

Format:

Fortran:

!$OMP DO ORDERED [clauses...]
   (loop region)

!$OMP ORDERED

   (block)

!$OMP END ORDERED

   (end of loop region)
!$OMP END DO

C/C++:

#pragma omp for ordered [clauses...]
   (loop region)

#pragma omp ordered  newline

   structured_block

   (end of loop region)

Restrictions:
An ORDERED directive can only appear in the dynamic extent of the following directives:
   DO or PARALLEL DO (Fortran)
   for or parallel for (C/C++)
Only one thread is allowed in an ordered section at any time.
It is illegal to branch into or out of an ORDERED block.
An iteration of a loop must not execute the same ORDERED directive more than once, and it must not execute more than one ORDERED directive.
A loop which contains an ORDERED directive must be a loop with an ORDERED clause.

OpenMP Directives

THREADPRIVATE Directive

Purpose:
The THREADPRIVATE directive is used to make global file scope variables (C/C++) or common blocks (Fortran) local and persistent to a thread through the execution of multiple parallel regions.

Format:

Fortran:

!$OMP THREADPRIVATE (/cb/, ...)     (cb is the name of a common block)

C/C++:

#pragma omp threadprivate (list)

Notes:
The directive must appear after the declaration of listed variables/common blocks. Each thread then gets its own copy of the variable/common block, so data written by one thread is not visible to other threads. For example:

Fortran THREADPRIVATE Directive Example

      PROGRAM THREADPRIV

      INTEGER A, B, I, TID, OMP_GET_THREAD_NUM
      REAL*4 X
      COMMON /C1/ A

!$OMP THREADPRIVATE(/C1/, X)

!     Explicitly turn off dynamic threads
      CALL OMP_SET_DYNAMIC(.FALSE.)

      PRINT *, '1st Parallel Region:'
!$OMP PARALLEL PRIVATE(B, TID)
      TID = OMP_GET_THREAD_NUM()
      A = TID
      B = TID
      X = 1.1 * TID + 1.0
      PRINT *, 'Thread', TID, ':   A,B,X=', A, B, X
!$OMP END PARALLEL

      PRINT *, '************************************'
      PRINT *, 'Master thread doing serial work here'
      PRINT *, '************************************'

      PRINT *, '2nd Parallel Region: '
!$OMP PARALLEL PRIVATE(TID)
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Thread', TID, ':   A,B,X=', A, B, X
!$OMP END PARALLEL

      END

Output:

 1st Parallel Region:
 Thread 0 :   A,B,X= 0 0 1.000000000
 Thread 1 :   A,B,X= 1 1 2.099999905
 Thread 3 :   A,B,X= 3 3 4.300000191
 Thread 2 :   A,B,X= 2 2 3.200000048
 ************************************
 Master thread doing serial work here
 ************************************
 2nd Parallel Region:
 Thread 0 :   A,B,X= 0 0 1.000000000
 Thread 2 :   A,B,X= 2 2 3.200000048
 Thread 3 :   A,B,X= 3 3 4.300000191
 Thread 1 :   A,B,X= 1 1 2.099999905

C/C++ threadprivate Directive Example

#include <omp.h>

int  a, b, i, tid;
float x;

#pragma omp threadprivate(a, x)

main(int argc, char *argv[]) {

/* Explicitly turn off dynamic threads */
  omp_set_dynamic(0);

  printf("1st Parallel Region:\n");
#pragma omp parallel private(b,tid)
  {
  tid = omp_get_thread_num();
  a = tid;
  b = tid;
  x = 1.1 * tid + 1.0;
  printf("Thread %d:   a,b,x= %d %d %f\n",tid,a,b,x);
  }  /* end of parallel region */

  printf("************************************\n");
  printf("Master thread doing serial work here\n");
  printf("************************************\n");

  printf("2nd Parallel Region:\n");
#pragma omp parallel private(tid)
  {
  tid = omp_get_thread_num();
  printf("Thread %d:   a,b,x= %d %d %f\n",tid,a,b,x);
  }  /* end of parallel region */

}

Output:

1st Parallel Region:
Thread 0:   a,b,x= 0 0 1.000000
Thread 2:   a,b,x= 2 2 3.200000
Thread 3:   a,b,x= 3 3 4.300000
Thread 1:   a,b,x= 1 1 2.100000
************************************
Master thread doing serial work here
************************************
2nd Parallel Region:
Thread 0:   a,b,x= 0 0 1.000000
Thread 3:   a,b,x= 3 3 4.300000
Thread 1:   a,b,x= 1 1 2.100000
Thread 2:   a,b,x= 2 2 3.200000

On first entry to a parallel region, data in THREADPRIVATE variables and common blocks should be assumed undefined, unless a COPYIN clause is specified in the PARALLEL directive.
THREADPRIVATE variables differ from PRIVATE variables (discussed later) because they are able to persist between different parallel regions of a code.

Restrictions:
Data in THREADPRIVATE objects is guaranteed to persist only if the dynamic threads mechanism is "turned off" and the number of threads in different parallel regions remains constant. The default setting of dynamic threads is undefined.
The THREADPRIVATE directive must appear after every declaration of a thread private variable/common block.
Fortran: only named common blocks can be made THREADPRIVATE.

OpenMP Directives

Data Scope Attribute Clauses

Also called Data-sharing Attribute Clauses.
An important consideration for OpenMP programming is the understanding and use of data scoping.
Because OpenMP is based upon the shared memory programming model, most variables are shared by default.
Global variables include:
   Fortran: COMMON blocks, SAVE variables, MODULE variables
   C: File scope variables, static
Private variables include:
   Loop index variables
   Stack variables in subroutines called from parallel regions
   Fortran: Automatic variables within a statement block
The OpenMP Data Scope Attribute Clauses are used to explicitly define how variables should be scoped. They include:
   PRIVATE
   FIRSTPRIVATE
   LASTPRIVATE
   SHARED
   DEFAULT
   REDUCTION
   COPYIN
Data Scope Attribute Clauses are used in conjunction with several directives (PARALLEL, DO/for, and SECTIONS) to control the scoping of enclosed variables.
These constructs provide the ability to control the data environment during execution of parallel constructs.
   They define how and which data variables in the serial section of the program are transferred to the parallel regions of the program (and back).
   They define which variables will be visible to all threads in the parallel regions and which variables will be privately allocated to all threads.
Data Scope Attribute Clauses are effective only within their lexical/static extent.
Important: Please consult the latest OpenMP specs for important details and discussion on this topic.
A Clauses/Directives Summary Table is provided for convenience.

PRIVATE Clause

Purpose:
The PRIVATE clause declares variables in its list to be private to each thread.

Format:

Fortran:    PRIVATE (list)

C/C++:      private (list)

Notes:
PRIVATE variables behave as follows:
   A new object of the same type is declared once for each thread in the team
   All references to the original object are replaced with references to the new object
   Variables declared PRIVATE should be assumed to be uninitialized for each thread

Comparison between PRIVATE and THREADPRIVATE:

                   PRIVATE                              THREADPRIVATE

Data Item          C/C++: variable                      C/C++: variable
                   Fortran: variable or common block    Fortran: common block

Where Declared     At start of region or                In declarations of each routine using
                   work-sharing group                   block or global file scope

Persistent?        No                                   Yes

Extent             Lexical only - unless passed as      Dynamic
                   an argument to subroutine

Initialized        Use FIRSTPRIVATE                     Use COPYIN

SHARED Clause

Purpose:
The SHARED clause declares variables in its list to be shared among all threads in the team.

Format:

Fortran:    SHARED (list)

C/C++:      shared (list)

Notes:
A shared variable exists in only one memory location and all threads can read or write to that address.
It is the programmer's responsibility to ensure that multiple threads properly access SHARED variables (such as via CRITICAL sections).

DEFAULT Clause

Purpose:
The DEFAULT clause allows the user to specify a default scope for all variables in the lexical extent of any parallel region.

Format:

Fortran:    DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)

C/C++:      default (shared | none)

Notes:
Specific variables can be exempted from the default using the PRIVATE, SHARED, FIRSTPRIVATE, LASTPRIVATE, and REDUCTION clauses.
The C/C++ OpenMP specification does not include private or firstprivate as a possible default. However, actual implementations may provide this option.
Using NONE as a default requires that the programmer explicitly scope all variables.

Restrictions:
Only one DEFAULT clause can be specified on a PARALLEL directive.

FIRSTPRIVATE Clause

Purpose:
The FIRSTPRIVATE clause combines the behavior of the PRIVATE clause with automatic initialization of the variables in its list.

Format:

Fortran:    FIRSTPRIVATE (list)

C/C++:      firstprivate (list)

Notes:
Listed variables are initialized according to the value of their original objects prior to entry into the parallel or work-sharing construct.

LASTPRIVATE Clause

Purpose:
The LASTPRIVATE clause combines the behavior of the PRIVATE clause with a copy from the last loop iteration or section to the original variable object.

Format:

Fortran:    LASTPRIVATE (list)

C/C++:      lastprivate (list)

Notes:
The value copied back into the original variable object is obtained from the last (sequentially) iteration or section of the enclosing construct.
For example, the team member which executes the final iteration for a DO section, or the team member which does the last SECTION of a SECTIONS context, performs the copy with its own values.

COPYIN Clause

Purpose:
The COPYIN clause provides a means for assigning the same value to THREADPRIVATE variables for all threads in the team.

Format:

Fortran:    COPYIN (list)

C/C++:      copyin (list)

Notes:
List contains the names of variables to copy. In Fortran, the list can contain both the names of common blocks and named variables.
The master thread variable is used as the copy source. The team threads are initialized with its value upon entry into the parallel construct.

COPYPRIVATE Clause

Purpose:
The COPYPRIVATE clause can be used to broadcast values acquired by a single thread directly to all instances of the private variables in the other threads.
Associated with the SINGLE directive.
See the most recent OpenMP specs document for additional discussion and examples.

Format:

Fortran:    COPYPRIVATE (list)

C/C++:      copyprivate (list)

REDUCTION Clause

Purpose:
The REDUCTION clause performs a reduction on the variables that appear in its list.
A private copy for each list variable is created for each thread. At the end of the reduction, the reduction operation is applied to all private copies of the shared variable, and the final result is written to the global shared variable.

Format:

Fortran:    REDUCTION (operator | intrinsic : list)

C/C++:      reduction (operator : list)

Example: REDUCTION - Vector Dot Product:
Iterations of the parallel loop will be distributed in equal-sized blocks to each thread in the team (SCHEDULE STATIC).
At the end of the parallel loop construct, all threads will add their values of "result" to update the master thread's global copy.

Fortran REDUCTION Clause Example

      PROGRAM DOT_PRODUCT

      INTEGER N, CHUNKSIZE, CHUNK, I
      PARAMETER (N=100)
      PARAMETER (CHUNKSIZE=10)
      REAL A(N), B(N), RESULT

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.0
        B(I) = I * 2.0
      ENDDO
      RESULT = 0.0
      CHUNK = CHUNKSIZE

!$OMP PARALLEL DO
!$OMP& DEFAULT(SHARED) PRIVATE(I)
!$OMP& SCHEDULE(STATIC,CHUNK)
!$OMP& REDUCTION(+:RESULT)

      DO I = 1, N
         RESULT = RESULT + (A(I) * B(I))
      ENDDO

!$OMP END PARALLEL DO

      PRINT *, 'Final Result= ', RESULT
      END

C/C++ reduction Clause Example

#include <omp.h>

main(int argc, char *argv[])  {

int   i, n, chunk;
float a[100], b[100], result;

/* Some initializations */
n = 100;
chunk = 10;
result = 0.0;
for (i=0; i < n; i++) {
  a[i] = i * 1.0;
  b[i] = i * 2.0;
  }

#pragma omp parallel for      \
  default(shared) private(i)  \
  schedule(static,chunk)      \
  reduction(+:result)

  for (i=0; i < n; i++)
    result = result + (a[i] * b[i]);

printf("Final result= %f\n",result);

}

Restrictions:
Variables in the list must be named scalar variables. They cannot be array or structure type variables. They must also be declared SHARED in the enclosing context.
Reduction operations may not be associative for real numbers.
The REDUCTION clause is intended to be used on a region or work-sharing construct in which the reduction variable is used only in statements which have one of the following forms:

Fortran                                      C/C++

x = x operator expr                          x = x op expr
x = expr operator x (except subtraction)     x = expr op x (except subtraction)
x = intrinsic(x, expr)                       x binop = expr
x = intrinsic(expr, x)                       x++
                                             ++x
                                             x--
                                             --x

x is a scalar variable in the list           x is a scalar variable in the list
expr is a scalar expression that does        expr is a scalar expression that does
not reference x                              not reference x
intrinsic is one of MAX, MIN, IAND,          op is not overloaded, and is one of
IOR, IEOR                                    +, *, -, /, &, ^, |, &&, ||
operator is one of +, *, -, .AND.,           binop is not overloaded, and is one of
.OR., .EQV., .NEQV.                          +, *, -, /, &, ^, |

OpenMP Directives

Clauses / Directives Summary

The table below summarizes which clauses are accepted by which OpenMP directives.

Clause          Accepted by directive(s)

IF              PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS
PRIVATE         PARALLEL, DO/for, SECTIONS, SINGLE, PARALLEL DO/for, PARALLEL SECTIONS
SHARED          PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS
DEFAULT         PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS
FIRSTPRIVATE    PARALLEL, DO/for, SECTIONS, SINGLE, PARALLEL DO/for, PARALLEL SECTIONS
LASTPRIVATE     DO/for, SECTIONS, PARALLEL DO/for, PARALLEL SECTIONS
REDUCTION       PARALLEL, DO/for, SECTIONS, PARALLEL DO/for, PARALLEL SECTIONS
COPYIN          PARALLEL, PARALLEL DO/for, PARALLEL SECTIONS
COPYPRIVATE     SINGLE
SCHEDULE        DO/for, PARALLEL DO/for
ORDERED         DO/for, PARALLEL DO/for
NOWAIT          DO/for, SECTIONS, SINGLE

The following OpenMP directives do not accept clauses:
   MASTER
   CRITICAL
   BARRIER
   ATOMIC
   FLUSH
   ORDERED
   THREADPRIVATE

Implementations may (and do) differ from the standard in which clauses are supported by each directive.

OpenMP Directives

Directive Binding and Nesting Rules
This section is provided mainly as a quick reference on rules which govern OpenMP directives and binding. Users should consult their implementation documentation and the OpenMP standard for other rules and restrictions.

Unless indicated otherwise, rules apply to both Fortran and C/C++ OpenMP implementations.
Note: the Fortran API also defines a number of Data Environment rules. Those have not been reproduced here.

Directive Binding:
The DO/for, SECTIONS, SINGLE, MASTER and BARRIER directives bind to the dynamically enclosing PARALLEL, if one exists. If no parallel region is currently being executed, the directives have no effect.
The ORDERED directive binds to the dynamically enclosing DO/for.
The ATOMIC directive enforces exclusive access with respect to ATOMIC directives in all threads, not just the current team.
The CRITICAL directive enforces exclusive access with respect to CRITICAL directives in all threads, not just the current team.
A directive can never bind to any directive outside the closest enclosing PARALLEL.
Directive Nesting:
A worksharing region may not be closely nested inside a worksharing, explicit task, critical, ordered, atomic, or master region.
A barrier region may not be closely nested inside a worksharing, explicit task, critical, ordered, atomic, or master region.
A master region may not be closely nested inside a worksharing, atomic, or explicit task region.
An ordered region may not be closely nested inside a critical, atomic, or explicit task region.
An ordered region must be closely nested inside a loop region (or parallel loop region) with an ordered clause.
A critical region may not be nested (closely or otherwise) inside a critical region with the same name. Note that this restriction is not sufficient to prevent deadlock.
parallel, flush, critical, atomic, taskyield, and explicit task regions may not be closely nested inside an atomic region.

Run-Time Library Routines

Overview:
The OpenMP API includes an ever-growing number of run-time library routines.
These routines are used for a variety of purposes, as shown in the table below:
Routine                       Purpose
OMP_SET_NUM_THREADS           Sets the number of threads that will be used in the next parallel region
OMP_GET_NUM_THREADS           Returns the number of threads that are currently in the team executing the parallel region from which it is called
OMP_GET_MAX_THREADS           Returns the maximum value that can be returned by a call to the OMP_GET_NUM_THREADS function
OMP_GET_THREAD_NUM            Returns the thread number of the thread, within the team, making this call
OMP_GET_THREAD_LIMIT          Returns the maximum number of OpenMP threads available to a program
OMP_GET_NUM_PROCS             Returns the number of processors that are available to the program
OMP_IN_PARALLEL               Used to determine if the section of code which is executing is parallel or not
OMP_SET_DYNAMIC               Enables or disables dynamic adjustment (by the run-time system) of the number of threads available for execution of parallel regions
OMP_GET_DYNAMIC               Used to determine if dynamic thread adjustment is enabled or not
OMP_SET_NESTED                Used to enable or disable nested parallelism
OMP_GET_NESTED                Used to determine if nested parallelism is enabled or not
OMP_SET_SCHEDULE              Sets the loop scheduling policy when "runtime" is used as the schedule kind in the OpenMP directive
OMP_GET_SCHEDULE              Returns the loop scheduling policy when "runtime" is used as the schedule kind in the OpenMP directive
OMP_SET_MAX_ACTIVE_LEVELS     Sets the maximum number of nested parallel regions
OMP_GET_MAX_ACTIVE_LEVELS     Returns the maximum number of nested parallel regions
OMP_GET_LEVEL                 Returns the current level of nested parallel regions
OMP_GET_ANCESTOR_THREAD_NUM   Returns, for a given nested level of the current thread, the thread number of the ancestor thread
OMP_GET_TEAM_SIZE             Returns, for a given nested level of the current thread, the size of the thread team
OMP_GET_ACTIVE_LEVEL          Returns the number of nested, active parallel regions enclosing the task that contains the call
OMP_IN_FINAL                  Returns true if the routine is executed in the final task region; otherwise it returns false
OMP_INIT_LOCK                 Initializes a lock associated with the lock variable
OMP_DESTROY_LOCK              Disassociates the given lock variable from any locks
OMP_SET_LOCK                  Acquires ownership of a lock
OMP_UNSET_LOCK                Releases a lock
OMP_TEST_LOCK                 Attempts to set a lock, but does not block if the lock is unavailable
OMP_INIT_NEST_LOCK            Initializes a nested lock associated with the lock variable
OMP_DESTROY_NEST_LOCK         Disassociates the given nested lock variable from any locks
OMP_SET_NEST_LOCK             Acquires ownership of a nested lock
OMP_UNSET_NEST_LOCK           Releases a nested lock
OMP_TEST_NEST_LOCK            Attempts to set a nested lock, but does not block if the lock is unavailable
OMP_GET_WTIME                 Provides a portable wall clock timing routine
OMP_GET_WTICK                 Returns a double-precision floating point value equal to the number of seconds between successive clock ticks
For C/C++, all of the run-time library routines are actual subroutines. For Fortran, some are actually functions, and some are subroutines. For example:

Fortran    INTEGER FUNCTION OMP_GET_NUM_THREADS()

C/C++      #include <omp.h>
           int omp_get_num_threads(void)

Note that for C/C++, you usually need to include the <omp.h> header file.

Fortran routines are not case sensitive, but C/C++ routines are.

For the Lock routines/functions:
    The lock variable must be accessed only through the locking routines.
    For Fortran, the lock variable should be of type integer and of a kind large enough to hold an address.
    For C/C++, the lock variable must have type omp_lock_t or type omp_nest_lock_t, depending on the function being used.
Implementation notes:
    Implementations may or may not support all OpenMP API features. For example, if nested parallelism is supported, it may be only nominal, in that a nested parallel region may only have one thread.
    Consult your implementation's documentation for details - or experiment and find out for yourself if you can't find it in the documentation.
The run-time library routines are discussed in more detail in Appendix A.

Environment Variables
OpenMP provides the following environment variables for controlling the execution of parallel code.
All environment variable names are uppercase. The values assigned to them are not case sensitive.

OMP_SCHEDULE
Applies only to DO, PARALLEL DO (Fortran) and for, parallel for (C/C++) directives which have their schedule clause set to RUNTIME. The value of this variable determines how iterations of the loop are scheduled on processors.
For example:
    setenv OMP_SCHEDULE "guided, 4"
    setenv OMP_SCHEDULE "dynamic"
OMP_NUM_THREADS
Sets the maximum number of threads to use during execution. For example:
    setenv OMP_NUM_THREADS 8
OMP_DYNAMIC
Enables or disables dynamic adjustment of the number of threads available for execution of parallel regions. Valid values are TRUE or FALSE. For example:
    setenv OMP_DYNAMIC TRUE
Implementation notes:
    Your implementation may or may not support this feature.
OMP_PROC_BIND
Enables or disables threads binding to processors. Valid values are TRUE or FALSE. For example:
    setenv OMP_PROC_BIND TRUE
Implementation notes:
    Your implementation may or may not support this feature.
OMP_NESTED
Enables or disables nested parallelism. Valid values are TRUE or FALSE. For example:
    setenv OMP_NESTED TRUE
Implementation notes:
    Your implementation may or may not support this feature. If nested parallelism is supported, it is often only nominal, in that a nested parallel region may only have one thread.
OMP_STACKSIZE
Controls the size of the stack for created (non-Master) threads. Examples:
    setenv OMP_STACKSIZE 2000500B
    setenv OMP_STACKSIZE "3000 k "
    setenv OMP_STACKSIZE 10M
    setenv OMP_STACKSIZE " 10 M "
    setenv OMP_STACKSIZE "20 m "
    setenv OMP_STACKSIZE " 1G"
    setenv OMP_STACKSIZE 20000
Implementation notes:
    Your implementation may or may not support this feature.
OMP_WAIT_POLICY
Provides a hint to an OpenMP implementation about the desired behavior of waiting threads. A compliant OpenMP implementation may or may not abide by the setting of the environment variable. Valid values are ACTIVE and PASSIVE. ACTIVE specifies that waiting threads should mostly be active, i.e., consume processor cycles, while waiting. PASSIVE specifies that waiting threads should mostly be passive, i.e., not consume processor cycles, while waiting. The details of the ACTIVE and PASSIVE behaviors are implementation defined. Examples:
    setenv OMP_WAIT_POLICY ACTIVE
    setenv OMP_WAIT_POLICY active
    setenv OMP_WAIT_POLICY PASSIVE
    setenv OMP_WAIT_POLICY passive
Implementation notes:
    Your implementation may or may not support this feature.
OMP_MAX_ACTIVE_LEVELS
Controls the maximum number of nested active parallel regions. The value of this environment variable must be a non-negative integer. The behavior of the program is implementation defined if the requested value of OMP_MAX_ACTIVE_LEVELS is greater than the maximum number of nested active parallel levels an implementation can support, or if the value is not a non-negative integer. Example:
    setenv OMP_MAX_ACTIVE_LEVELS 2
Implementation notes:
    Your implementation may or may not support this feature.
OMP_THREAD_LIMIT
Sets the number of OpenMP threads to use for the whole OpenMP program. The value of this environment variable must be a positive integer. The behavior of the program is implementation defined if the requested value of OMP_THREAD_LIMIT is greater than the number of threads an implementation can support, or if the value is not a positive integer. Example:
    setenv OMP_THREAD_LIMIT 8
Implementation notes:
    Your implementation may or may not support this feature.

Thread Stack Size and Thread Binding

Thread Stack Size:
The OpenMP standard does not specify how much stack space a thread should have. Consequently, implementations will differ in the default thread stack size.
Default thread stack size can be easy to exhaust. It can also be non-portable between compilers. Using past versions of LC compilers as an example:

Compiler              Approx. Stack Limit   Approx. Array Size (doubles)
Linux icc, ifort      4 MB                  700x700
Linux pgcc, pgf90     8 MB                  1000x1000
Linux gcc, gfortran   2 MB                  500x500

Threads that exceed their stack allocation may or may not seg fault. An application may continue to run while data is being corrupted.
Statically linked codes may be subject to further stack restrictions.
A user's login shell may also restrict stack size.
If your OpenMP environment supports the OpenMP 3.0 OMP_STACKSIZE environment variable (covered in the previous section), you can use it to set the thread stack size prior to program execution. For example:
    setenv OMP_STACKSIZE 2000500B
    setenv OMP_STACKSIZE "3000 k "
    setenv OMP_STACKSIZE 10M
    setenv OMP_STACKSIZE " 10 M "
    setenv OMP_STACKSIZE "20 m "
    setenv OMP_STACKSIZE " 1G"
    setenv OMP_STACKSIZE 20000
Otherwise, at LC, you should be able to use the method below for Linux clusters. The example shows setting the thread stack size to 12 MB and, as a precaution, setting the shell stack size to unlimited.

csh/tcsh:
    setenv KMP_STACKSIZE 12000000
    limit stacksize unlimited

ksh/sh/bash:
    export KMP_STACKSIZE=12000000
    ulimit -s unlimited
Thread Binding:
In some cases, a program will perform better if its threads are bound to processors/cores.
"Binding" a thread to a processor means that the thread will be scheduled by the operating system to always run on the same processor. Otherwise, threads can be scheduled to execute on any processor and "bounce" back and forth between processors with each time slice.
Also called "thread affinity" or "processor affinity".
Binding threads to processors can result in better cache utilization, thereby reducing costly memory accesses. This is the primary motivation for binding threads to processors.
Depending upon your platform, operating system, compiler and OpenMP implementation, binding threads to processors can be done in several different ways.
The OpenMP version 3.1 API provides an environment variable to turn processor binding "on" or "off". For example:
    setenv OMP_PROC_BIND TRUE
    setenv OMP_PROC_BIND FALSE
At a higher level, processes can also be bound to processors.
Detailed information about process and thread binding to processors on LC Linux clusters can be found HERE.

Monitoring, Debugging and Performance Analysis Tools for OpenMP

Monitoring and Debugging Threads:
Debuggers vary in their ability to handle threads. The TotalView debugger is LC's recommended debugger for parallel programs. It is well suited for both monitoring and debugging threaded programs.
An example screenshot from a TotalView session using an OpenMP code is shown below.
    1. Master thread Stack Trace Pane showing original routine
    2. Process/thread status bars differentiating threads
    3. Master thread Stack Frame Pane showing shared variables
    4. Worker thread Stack Trace Pane showing outlined routine
    5. Worker thread Stack Frame Pane
    6. Root Window showing all threads
    7. Threads Pane showing all threads plus selected thread

See the TotalView Debugger tutorial for details.
The Linux ps command provides several flags for viewing thread information. Some examples are shown below. See the man page for details.

% ps -Lf
UID      PID   PPID   LWP   C NLWP STIME TTY      TIME     CMD
blaise   22529 28240  22529  0   5 11:31 pts/53 00:00:00 a.out
blaise   22529 28240  22530 99   5 11:31 pts/53 00:01:24 a.out
blaise   22529 28240  22531 99   5 11:31 pts/53 00:01:24 a.out
blaise   22529 28240  22532 99   5 11:31 pts/53 00:01:24 a.out
blaise   22529 28240  22533 99   5 11:31 pts/53 00:01:24 a.out

% ps -T
  PID  SPID TTY      TIME     CMD
22529 22529 pts/53 00:00:00 a.out
22529 22530 pts/53 00:01:49 a.out
22529 22531 pts/53 00:01:49 a.out
22529 22532 pts/53 00:01:49 a.out
22529 22533 pts/53 00:01:49 a.out

% ps -Lm
  PID   LWP TTY      TIME     CMD
22529     - pts/53 00:18:56 a.out
    - 22529 -      00:00:00 -
    - 22530 -      00:04:44 -
    - 22531 -      00:04:44 -
    - 22532 -      00:04:44 -
    - 22533 -      00:04:44 -
LC's Linux clusters also provide the top command to monitor processes on a node. If used with the -H flag, the threads contained within a process will be visible. An example of the top -H command is shown below. The parent process is PID 18010, which spawned three threads, shown as PIDs 18012, 18013 and 18014.

Performance Analysis Tools:
There are a variety of performance analysis tools that can be used with OpenMP programs. Searching the web will turn up a wealth of information.
At LC, the list of supported computing tools can be found at: computing.llnl.gov/code/content/software_tools.php.
These tools vary significantly in their complexity, functionality and learning curve. Covering them in detail is beyond the scope of this tutorial.
Some tools worth investigating, specifically for OpenMP codes, include:
    Open|SpeedShop
    TAU
    PAPI
    Intel VTune Amplifier
    ThreadSpotter

OpenMP Exercise 3
Assorted

Overview:
    Log into the workshop cluster, if you are not already logged in
    Orphaned directive example: review, compile, run
    Get OpenMP implementation environment information
    Check out the "bug" programs

GO TO THE EXERCISE HERE

This completes the tutorial.
Please complete the online evaluation form - unless you are doing the exercise, in which case please complete it at the end of the exercise.


References and More Information
Author: Blaise Barney, Livermore Computing.
The OpenMP web site, which includes the C/C++ and Fortran Application Program Interface documents: www.openmp.org

Appendix A: Run-Time Library Routines

OMP_SET_NUM_THREADS

Purpose:
Sets the number of threads that will be used in the next parallel region. Must be a positive integer.

Format:

Fortran    SUBROUTINE OMP_SET_NUM_THREADS(scalar_integer_expression)

C/C++      #include <omp.h>
           void omp_set_num_threads(int num_threads)

Notes & Restrictions:
The dynamic threads mechanism modifies the effect of this routine.
    Enabled: specifies the maximum number of threads that can be used for any parallel region by the dynamic threads mechanism.
    Disabled: specifies the exact number of threads to use until the next call to this routine.
This routine can only be called from the serial portions of the code.
This call has precedence over the OMP_NUM_THREADS environment variable.

OMP_GET_NUM_THREADS

Purpose:
Returns the number of threads that are currently in the team executing the parallel region from which it is called.

Format:

Fortran    INTEGER FUNCTION OMP_GET_NUM_THREADS()

C/C++      #include <omp.h>
           int omp_get_num_threads(void)

Notes & Restrictions:
If this call is made from a serial portion of the program, or a nested parallel region that is serialized, it will return 1.
The default number of threads is implementation dependent.

OMP_GET_MAX_THREADS

Purpose:
Returns the maximum value that can be returned by a call to the OMP_GET_NUM_THREADS function.

Format:

Fortran    INTEGER FUNCTION OMP_GET_MAX_THREADS()

C/C++      #include <omp.h>
           int omp_get_max_threads(void)

Notes & Restrictions:
Generally reflects the number of threads as set by the OMP_NUM_THREADS environment variable or the OMP_SET_NUM_THREADS() library routine.
May be called from both serial and parallel regions of code.

OMP_GET_THREAD_NUM

Purpose:
Returns the thread number of the thread, within the team, making this call. This number will be between 0 and OMP_GET_NUM_THREADS-1. The master thread of the team is thread 0.

Format:

Fortran    INTEGER FUNCTION OMP_GET_THREAD_NUM()

C/C++      #include <omp.h>
           int omp_get_thread_num(void)

Notes & Restrictions:
If called from a nested parallel region that is serialized, or a serial region, this function will return 0.

Examples:
Example 1 is the correct way to determine the number of threads in a parallel region.
Example 2 is incorrect - the TID variable must be PRIVATE.
Example 3 is incorrect - the OMP_GET_THREAD_NUM call is outside the parallel region.
Fortran - determining the number of threads in a parallel region

Example 1: Correct

      PROGRAM HELLO

      INTEGER TID, OMP_GET_THREAD_NUM

!$OMP PARALLEL PRIVATE(TID)
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread ', TID
      ...
!$OMP END PARALLEL

      END
Example 2: Incorrect

      PROGRAM HELLO

      INTEGER TID, OMP_GET_THREAD_NUM

!$OMP PARALLEL
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread ', TID
      ...
!$OMP END PARALLEL

      END
Example 3: Incorrect

      PROGRAM HELLO

      INTEGER TID, OMP_GET_THREAD_NUM

      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread ', TID

!$OMP PARALLEL
      ...
!$OMP END PARALLEL

      END

OMP_GET_THREAD_LIMIT

Purpose:
Returns the maximum number of OpenMP threads available to a program.

Format:

Fortran    INTEGER FUNCTION OMP_GET_THREAD_LIMIT()

C/C++      #include <omp.h>
           int omp_get_thread_limit(void)

Notes:
Also see the OMP_THREAD_LIMIT environment variable.

OMP_GET_NUM_PROCS

Purpose:
Returns the number of processors that are available to the program.

Format:

Fortran    INTEGER FUNCTION OMP_GET_NUM_PROCS()

C/C++      #include <omp.h>
           int omp_get_num_procs(void)

OMP_IN_PARALLEL

Purpose:
May be called to determine if the section of code which is executing is parallel or not.

Format:

Fortran    LOGICAL FUNCTION OMP_IN_PARALLEL()

C/C++      #include <omp.h>
           int omp_in_parallel(void)

Notes & Restrictions:
For Fortran, this function returns .TRUE. if it is called from the dynamic extent of a region executing in parallel, and .FALSE. otherwise. For C/C++, it will return a non-zero integer if parallel, and zero otherwise.

OMP_SET_DYNAMIC

Purpose:
Enables or disables dynamic adjustment (by the run-time system) of the number of threads available for execution of parallel regions.

Format:

Fortran    SUBROUTINE OMP_SET_DYNAMIC(scalar_logical_expression)

C/C++      #include <omp.h>
           void omp_set_dynamic(int dynamic_threads)

Notes & Restrictions:
For Fortran, if called with .TRUE., then the number of threads available for subsequent parallel regions can be adjusted automatically by the run-time environment. If called with .FALSE., dynamic adjustment is disabled.
For C/C++, if dynamic_threads evaluates to non-zero, then the mechanism is enabled; otherwise it is disabled.
The OMP_SET_DYNAMIC subroutine has precedence over the OMP_DYNAMIC environment variable.
The default setting is implementation dependent.
Must be called from a serial section of the program.

OMP_GET_DYNAMIC

Purpose:
Used to determine if dynamic thread adjustment is enabled or not.

Format:

Fortran    LOGICAL FUNCTION OMP_GET_DYNAMIC()

C/C++      #include <omp.h>
           int omp_get_dynamic(void)

Notes & Restrictions:
For Fortran, this function returns .TRUE. if dynamic thread adjustment is enabled, and .FALSE. otherwise.
For C/C++, non-zero will be returned if dynamic thread adjustment is enabled, and zero otherwise.

OMP_SET_NESTED

Purpose:
Used to enable or disable nested parallelism.

Format:

Fortran    SUBROUTINE OMP_SET_NESTED(scalar_logical_expression)

C/C++      #include <omp.h>
           void omp_set_nested(int nested)

Notes & Restrictions:
For Fortran, calling this function with .FALSE. will disable nested parallelism, and calling with .TRUE. will enable it.
For C/C++, if nested evaluates to non-zero, nested parallelism is enabled; otherwise it is disabled.
The default is for nested parallelism to be disabled.
This call has precedence over the OMP_NESTED environment variable.

OMP_GET_NESTED

Purpose:
Used to determine if nested parallelism is enabled or not.

Format:

Fortran    LOGICAL FUNCTION OMP_GET_NESTED()

C/C++      #include <omp.h>
           int omp_get_nested(void)

Notes & Restrictions:
For Fortran, this function returns .TRUE. if nested parallelism is enabled, and .FALSE. otherwise.
For C/C++, non-zero will be returned if nested parallelism is enabled, and zero otherwise.

OMP_SET_SCHEDULE

Purpose:
This routine sets the schedule type that is applied when the loop directive specifies a runtime schedule.

Format:

Fortran    SUBROUTINE OMP_SET_SCHEDULE(KIND, MODIFIER)
           INTEGER (KIND=OMP_SCHED_KIND) KIND
           INTEGER MODIFIER

C/C++      #include <omp.h>
           void omp_set_schedule(omp_sched_t kind, int modifier)

OMP_GET_SCHEDULE

Purpose:
This routine returns the schedule that is applied when the loop directive specifies a runtime schedule.

Format:

Fortran    SUBROUTINE OMP_GET_SCHEDULE(KIND, MODIFIER)
           INTEGER (KIND=OMP_SCHED_KIND) KIND
           INTEGER MODIFIER

C/C++      #include <omp.h>
           void omp_get_schedule(omp_sched_t * kind, int * modifier)

OMP_SET_MAX_ACTIVE_LEVELS

Purpose:
This routine limits the number of nested active parallel regions.

Format:

Fortran    SUBROUTINE OMP_SET_MAX_ACTIVE_LEVELS(MAX_LEVELS)
           INTEGER MAX_LEVELS

C/C++      #include <omp.h>
           void omp_set_max_active_levels(int max_levels)

Notes & Restrictions:
If the number of parallel levels requested exceeds the number of levels of parallelism supported by the implementation, the value will be set to the number of parallel levels supported by the implementation.
This routine has the described effect only when called from the sequential part of the program. When called from within an explicit parallel region, the effect of this routine is implementation defined.

OMP_GET_MAX_ACTIVE_LEVELS

Purpose:
This routine returns the maximum number of nested active parallel regions.

Format:

Fortran    INTEGER FUNCTION OMP_GET_MAX_ACTIVE_LEVELS()

C/C++      #include <omp.h>
           int omp_get_max_active_levels(void)

OMP_GET_LEVEL

Purpose:
This routine returns the number of nested parallel regions enclosing the task that contains the call.

Format:

Fortran    INTEGER FUNCTION OMP_GET_LEVEL()

C/C++      #include <omp.h>
           int omp_get_level(void)

Notes & Restrictions:
The omp_get_level routine returns the number of nested parallel regions (whether active or inactive) enclosing the task that contains the call, not including the implicit parallel region. The routine always returns a non-negative integer, and returns 0 if it is called from the sequential part of the program.

OMP_GET_ANCESTOR_THREAD_NUM

Purpose:
This routine returns, for a given nested level of the current thread, the thread number of the ancestor or the current thread.

Format:

Fortran    INTEGER FUNCTION OMP_GET_ANCESTOR_THREAD_NUM(LEVEL)
           INTEGER LEVEL

C/C++      #include <omp.h>
           int omp_get_ancestor_thread_num(int level)

Notes & Restrictions:
If the requested nest level is outside the range of 0 and the nest level of the current thread, as returned by the omp_get_level routine, the routine returns -1.

OMP_GET_TEAM_SIZE

Purpose:
This routine returns, for a given nested level of the current thread, the size of the thread team to which the ancestor or the current thread belongs.

Format:

Fortran    INTEGER FUNCTION OMP_GET_TEAM_SIZE(LEVEL)
           INTEGER LEVEL

C/C++      #include <omp.h>
           int omp_get_team_size(int level);

Notes & Restrictions:
If the requested nested level is outside the range of 0 and the nested level of the current thread, as returned by the omp_get_level routine, the routine returns -1. Inactive parallel regions are regarded like active parallel regions executed with one thread.

OMP_GET_ACTIVE_LEVEL

Purpose:
The omp_get_active_level routine returns the number of nested, active parallel regions enclosing the task that contains the call.

Format:

Fortran    INTEGER FUNCTION OMP_GET_ACTIVE_LEVEL()

C/C++      #include <omp.h>
           int omp_get_active_level(void);

Notes & Restrictions:
The routine always returns a non-negative integer, and returns 0 if it is called from the sequential part of the program.

OMP_IN_FINAL

Purpose:
This routine returns true if the routine is executed in a final task region; otherwise, it returns false.

Format:

Fortran    LOGICAL FUNCTION OMP_IN_FINAL()

C/C++      #include <omp.h>
           int omp_in_final(void)

OMP_INIT_LOCK
OMP_INIT_NEST_LOCK

Purpose:
This subroutine initializes a lock associated with the lock variable.

Format:

Fortran    SUBROUTINE OMP_INIT_LOCK(var)
           SUBROUTINE OMP_INIT_NEST_LOCK(var)

C/C++      #include <omp.h>
           void omp_init_lock(omp_lock_t *lock)
           void omp_init_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
The initial state is unlocked.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_DESTROY_LOCK
OMP_DESTROY_NEST_LOCK

Purpose:
This subroutine disassociates the given lock variable from any locks.

Format:

Fortran    SUBROUTINE OMP_DESTROY_LOCK(var)
           SUBROUTINE OMP_DESTROY_NEST_LOCK(var)

C/C++      #include <omp.h>
           void omp_destroy_lock(omp_lock_t *lock)
           void omp_destroy_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
It is illegal to call this routine with a lock variable that is not initialized.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_SET_LOCK
OMP_SET_NEST_LOCK

Purpose:
This subroutine forces the executing thread to wait until the specified lock is available. A thread is granted ownership of a lock when it becomes available.

Format:

Fortran    SUBROUTINE OMP_SET_LOCK(var)
           SUBROUTINE OMP_SET_NEST_LOCK(var)

C/C++      #include <omp.h>
           void omp_set_lock(omp_lock_t *lock)
           void omp_set_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
It is illegal to call this routine with a lock variable that is not initialized.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_UNSET_LOCK
OMP_UNSET_NEST_LOCK

Purpose:
This subroutine releases the lock from the executing subroutine.

Format:

Fortran    SUBROUTINE OMP_UNSET_LOCK(var)
           SUBROUTINE OMP_UNSET_NEST_LOCK(var)

C/C++      #include <omp.h>
           void omp_unset_lock(omp_lock_t *lock)
           void omp_unset_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
It is illegal to call this routine with a lock variable that is not initialized.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_TEST_LOCK
OMP_TEST_NEST_LOCK

Purpose:
This subroutine attempts to set a lock, but does not block if the lock is unavailable.

Format:

Fortran    SUBROUTINE OMP_TEST_LOCK(var)
           SUBROUTINE OMP_TEST_NEST_LOCK(var)

C/C++      #include <omp.h>
           int omp_test_lock(omp_lock_t *lock)
           int omp_test_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
For Fortran, .TRUE. is returned if the lock was set successfully, otherwise .FALSE. is returned.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.
For C/C++, non-zero is returned if the lock was set successfully, otherwise zero is returned.
It is illegal to call this routine with a lock variable that is not initialized.

OMP_GET_WTIME

Purpose:
Provides a portable wall clock timing routine.
Returns a double-precision floating point value equal to the number of elapsed seconds since some point in the past. Usually used in "pairs", with the value of the first call subtracted from the value of the second call to obtain the elapsed time for a block of code.
Designed to be "per thread" times, and therefore may not be globally consistent across all threads in a team - depends upon what a thread is doing compared to other threads.

Format:

Fortran    DOUBLE PRECISION FUNCTION OMP_GET_WTIME()

C/C++      #include <omp.h>
           double omp_get_wtime(void)

OMP_GET_WTICK

Purpose:
Provides a portable wall clock timing routine.
Returns a double-precision floating point value equal to the number of seconds between successive clock ticks.

Format:

Fortran    DOUBLE PRECISION FUNCTION OMP_GET_WTICK()

C/C++      #include <omp.h>
           double omp_get_wtick(void)

https://computing.llnl.gov/tutorials/openMP/
Last Modified: 06/07/2016 01:08:02 blaiseb@llnl.gov
UCRL-MI-133316
