

OpenMP
Author: Blaise Barney, Lawrence Livermore National Laboratory

UCRL-MI-133316

Table of Contents
1. Abstract
2. Introduction
3. OpenMP Programming Model
4. OpenMP API Overview
5. Compiling OpenMP Programs
6. OpenMP Directives
   1. Directive Format
   2. C/C++ Directive Format
   3. Directive Scoping
   4. PARALLEL Construct
   5. Exercise 1
   6. Work-Sharing Constructs
      1. DO / for Directive
      2. SECTIONS Directive
      3. WORKSHARE Directive
      4. SINGLE Directive
   7. Combined Parallel Work-Sharing Constructs
   8. TASK Construct
   9. Exercise 2
   10. Synchronization Constructs
      1. MASTER Directive
      2. CRITICAL Directive
      3. BARRIER Directive
      4. TASKWAIT Directive
      5. ATOMIC Directive
      6. FLUSH Directive
      7. ORDERED Directive
   11. THREADPRIVATE Directive
   12. Data Scope Attribute Clauses
      1. PRIVATE Clause
      2. SHARED Clause
      3. DEFAULT Clause
      4. FIRSTPRIVATE Clause
      5. LASTPRIVATE Clause
      6. COPYIN Clause
      7. COPYPRIVATE Clause
      8. REDUCTION Clause
   13. Clauses / Directives Summary
   14. Directive Binding and Nesting Rules
7. Run-Time Library Routines
8. Environment Variables
9. Thread Stack Size and Thread Binding
10. Monitoring, Debugging and Performance Analysis Tools for OpenMP
11. Exercise 3
12. References and More Information
13. Appendix A: Run-Time Library Routines

Abstract
OpenMP is an Application Program Interface (API), jointly defined by a group of major computer hardware and software vendors. OpenMP provides a portable, scalable model for developers of shared memory parallel applications. The API supports C/C++ and Fortran on a wide variety of architectures. This tutorial covers most of the major features of OpenMP 3.1, including its various constructs and directives for specifying parallel regions, work sharing, synchronization and data environment. Runtime library functions and environment variables are also covered. This tutorial includes both C and Fortran example codes and a lab exercise.

Level/Prerequisites: This tutorial is ideal for those who are new to parallel programming with OpenMP. A basic understanding of parallel programming in C or Fortran is required. For those who are unfamiliar with Parallel Programming in general, the material covered in EC3500: Introduction to Parallel Computing would be helpful.

Introduction
What is OpenMP?
OpenMP Is:
An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism.
Comprised of three primary API components:
    Compiler Directives
    Runtime Library Routines
    Environment Variables
An abbreviation for: Open Multi-Processing
OpenMP Is Not:
Meant for distributed memory parallel systems (by itself)
Necessarily implemented identically by all vendors
Guaranteed to make the most efficient use of shared memory
Required to check for data dependencies, data conflicts, race conditions, deadlocks, or code sequences that cause a program to be classified as non-conforming
Designed to handle parallel I/O. The programmer is responsible for synchronizing input and output.
Goals of OpenMP:
Standardization:
    Provide a standard among a variety of shared memory architectures/platforms
    Jointly defined and endorsed by a group of major computer hardware and software vendors
Lean and Mean:
    Establish a simple and limited set of directives for programming shared memory machines.
    Significant parallelism can be implemented by using just 3 or 4 directives.
    This goal is becoming less meaningful with each new release, apparently.
Ease of Use:
    Provide capability to incrementally parallelize a serial program, unlike message passing libraries which typically require an all or nothing approach
    Provide the capability to implement both coarse-grain and fine-grain parallelism
Portability:
    The API is specified for C/C++ and Fortran
    Public forum for API and membership
    Most major platforms have been implemented including Unix/Linux platforms and Windows
History:
In the early 90's, vendors of shared memory machines supplied similar, directive-based, Fortran programming extensions:
    The user would augment a serial Fortran program with directives specifying which loops were to be parallelized
    The compiler would be responsible for automatically parallelizing such loops across the SMP processors
Implementations were all functionally similar, but were diverging (as usual)
The first attempt at a standard was the draft for ANSI X3H5 in 1994. It was never adopted, largely due to waning interest as distributed memory machines became popular.
However, not long after this, newer shared memory machine architectures started to become prevalent, and interest resumed.
The OpenMP standard specification started in the spring of 1997, taking over where ANSI X3H5 had left off.
Led by the OpenMP Architecture Review Board (ARB). Original ARB members and contributors are shown below. (Disclaimer: all partner names derived from the OpenMP web site)

ARB Members:
    Compaq / Digital
    Hewlett-Packard Company
    Intel Corporation
    International Business Machines (IBM)
    Kuck & Associates, Inc. (KAI)
    Silicon Graphics, Inc.
    Sun Microsystems, Inc.
    U.S. Department of Energy ASCI program

Endorsing Application Developers:
    ADINA R&D, Inc.
    ANSYS, Inc.
    Dash Associates
    Fluent, Inc.
    ILOG CPLEX Division
    Livermore Software Technology Corporation (LSTC)
    MECALOG SARL
    Oxford Molecular Group PLC
    The Numerical Algorithms Group Ltd. (NAG)

Endorsing Software Vendors:
    Absoft Corporation
    Edinburgh Portable Compilers
    GENIAS Software GmBH
    Myrias Computer Technologies, Inc.
    The Portland Group, Inc. (PGI)

For more news and membership information about the OpenMP ARB, visit: openmp.org/wp/about-openmp/.
Release History
OpenMP continues to evolve - new constructs and features are added with each release.
Initially, the API specifications were released separately for C and Fortran. Since 2005, they have been released together.
The table below chronicles the OpenMP API release history.
    Date        Version
    Oct 1997    Fortran 1.0
    Oct 1998    C/C++ 1.0
    Nov 1999    Fortran 1.1
    Nov 2000    Fortran 2.0
    Mar 2002    C/C++ 2.0
    May 2005    OpenMP 2.5
    May 2008    OpenMP 3.0
    Jul 2011    OpenMP 3.1
    Jul 2013    OpenMP 4.0
    Nov 2015    OpenMP 4.5
This tutorial refers to OpenMP version 3.1. Syntax and features of newer releases are not currently covered.

References:
OpenMP web site: openmp.org
    API specifications, FAQ, presentations, discussions, media releases, calendar, membership application and more...
Wikipedia: en.wikipedia.org/wiki/OpenMP

OpenMP Programming Model
Shared Memory Model:
OpenMP is designed for multi-processor/core, shared memory machines. The underlying architecture can be shared memory UMA or NUMA.

    Uniform Memory Access

    Non-Uniform Memory Access

Thread Based Parallelism:
OpenMP programs accomplish parallelism exclusively through the use of threads.
A thread of execution is the smallest unit of processing that can be scheduled by an operating system. The idea of a subroutine that can be scheduled to run autonomously might help explain what a thread is.
Threads exist within the resources of a single process. Without the process, they cease to exist.
Typically, the number of threads matches the number of machine processors/cores. However, the actual use of threads is up to the application.
Explicit Parallelism:
OpenMP is an explicit (not automatic) programming model, offering the programmer full control over parallelization.
Parallelization can be as simple as taking a serial program and inserting compiler directives....
Or as complex as inserting subroutines to set multiple levels of parallelism, locks and even nested locks.
Fork-Join Model:
OpenMP uses the fork-join model of parallel execution:

All OpenMP programs begin as a single process: the master thread. The master thread executes sequentially until the first parallel region construct is encountered.
FORK: the master thread then creates a team of parallel threads.
The statements in the program that are enclosed by the parallel region construct are then executed in parallel among the various team threads.
JOIN: When the team threads complete the statements in the parallel region construct, they synchronize and terminate, leaving only the master thread.
The number of parallel regions and the threads that comprise them are arbitrary.
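The fork-join sequence above can be sketched in C. This is a minimal sketch of our own, not one of the tutorial's examples; the function name fork_join_demo and the serial fallback stubs are our additions so the file also builds when OpenMP is not enabled (the pragma is then simply ignored and everything runs on the master thread):

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (our addition) so this also compiles without OpenMP enabled */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Runs one fork-join cycle and returns the team size observed inside it */
int fork_join_demo(void)
{
    int team = 1;

    printf("serial part: master thread only\n");

    /* FORK: the master creates a team; every team member executes this block */
    #pragma omp parallel
    {
        printf("parallel part: thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
        if (omp_get_thread_num() == 0)
            team = omp_get_num_threads();
    }   /* JOIN: implied barrier; only the master continues past this point */

    printf("serial part again: master thread only\n");
    return team;
}
```

Compiled with an OpenMP flag (e.g. -fopenmp for GNU, as shown in the Compiling section later), the parallel block prints once per team thread; compiled without the flag, only the master thread runs.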
Compiler Directive Based:
Most OpenMP parallelism is specified through the use of compiler directives which are embedded in C/C++ or Fortran source code.
Nested Parallelism:
The API provides for the placement of parallel regions inside other parallel regions.
Implementations may or may not support this feature.
Dynamic Threads:
The API provides for the run-time environment to dynamically alter the number of threads used to execute parallel regions. Intended to promote more efficient use of resources, if possible.
Implementations may or may not support this feature.
I/O:
OpenMP specifies nothing about parallel I/O. This is particularly important if multiple threads attempt to write/read from the same file.
If every thread conducts I/O to a different file, the issues are not as significant.
It is entirely up to the programmer to ensure that I/O is conducted correctly within the context of a multi-threaded program.
Memory Model: FLUSH Often?
OpenMP provides a "relaxed consistency" and "temporary" view of thread memory (in their words). In other words, threads can "cache" their data and are not required to maintain exact consistency with real memory all of the time.
When it is critical that all threads view a shared variable identically, the programmer is responsible for ensuring that the variable is FLUSHed by all threads as needed.
More on this later...

OpenMP API Overview
Three Components:
The OpenMP API is comprised of three distinct components:
    Compiler Directives (44)
    Runtime Library Routines (35)
    Environment Variables (13)
The application developer decides how to employ these components. In the simplest case, only a few of them are needed.
Implementations differ in their support of all API components. For example, an implementation may state that it supports nested parallelism, but the API makes it clear that may be limited to a single thread - the master thread. Not exactly what the developer might expect?
Compiler Directives:
Compiler directives appear as comments in your source code and are ignored by compilers unless you tell them otherwise - usually by specifying the appropriate compiler flag, as discussed in the Compiling section later.
OpenMP compiler directives are used for various purposes:
    Spawning a parallel region
    Dividing blocks of code among threads
    Distributing loop iterations between threads
    Serializing sections of code
    Synchronization of work among threads
Compiler directives have the following syntax:

    sentinel   directive-name   [clause, ...]

For example:

    Fortran:   !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(BETA,PI)

    C/C++:     #pragma omp parallel default(shared) private(beta,pi)
Compiler directives are covered in detail later.
Runtime Library Routines:
The OpenMP API includes an ever-growing number of runtime library routines.
These routines are used for a variety of purposes:
    Setting and querying the number of threads
    Querying a thread's unique identifier (thread ID), a thread's ancestor's identifier, the thread team size
    Setting and querying the dynamic threads feature
    Querying if in a parallel region, and at what level
    Setting and querying nested parallelism
    Setting, initializing and terminating locks and nested locks
    Querying wall clock time and resolution
For C/C++, all of the runtime library routines are actual subroutines. For Fortran, some are actually functions, and some are subroutines. For example:
    Fortran:   INTEGER FUNCTION OMP_GET_NUM_THREADS()

    C/C++:     #include <omp.h>
               int omp_get_num_threads(void)

Note that for C/C++, you usually need to include the <omp.h> header file.


Fortran routines are not case sensitive, but C/C++ routines are.
The runtime library routines are briefly discussed as an overview in the Run-Time Library Routines section, and in more detail in Appendix A.
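A brief C sketch of a few of these queries (the function name query_runtime and the serial stand-ins are our own additions; the omp_* calls themselves are standard OpenMP routines):

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (our addition) so this compiles without OpenMP support */
static int    omp_get_thread_num(void)  { return 0; }
static int    omp_get_num_threads(void) { return 1; }
static int    omp_in_parallel(void)     { return 0; }
static double omp_get_wtime(void)       { return 0.0; }
#endif

/* Queries a few runtime routines and returns the elapsed wall-clock time */
double query_runtime(void)
{
    double t0 = omp_get_wtime();

    printf("outside a region: in_parallel = %d\n", omp_in_parallel());

    #pragma omp parallel
    {
        printf("thread %d of %d, in_parallel = %d\n",
               omp_get_thread_num(), omp_get_num_threads(),
               omp_in_parallel());
    }

    return omp_get_wtime() - t0;   /* seconds spent in the code above */
}
```

Wall-clock timing via omp_get_wtime() is the usual way to time OpenMP code, since CPU-time clocks sum the work of all threads.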
Environment Variables:
OpenMP provides several environment variables for controlling the execution of parallel code at run-time.
These environment variables can be used to control such things as:
    Setting the number of threads
    Specifying how loop iterations are divided
    Binding threads to processors
    Enabling/disabling nested parallelism; setting the maximum levels of nested parallelism
    Enabling/disabling dynamic threads
    Setting thread stack size
    Setting thread wait policy
Setting OpenMP environment variables is done the same way you set any other environment variables, and depends upon which shell you use. For example:

    csh/tcsh:   setenv OMP_NUM_THREADS 8

    sh/bash:    export OMP_NUM_THREADS=8

OpenMP environment variables are discussed in the Environment Variables section later.
Example OpenMP Code Structure:

Fortran General Code Structure

           PROGRAM HELLO

           INTEGER VAR1, VAR2, VAR3

    !      Serial code
                 .
                 .
                 .

    !      Beginning of parallel region. Fork a team of threads.
    !      Specify variable scoping

    !$OMP PARALLEL PRIVATE(VAR1, VAR2) SHARED(VAR3)

    !      Parallel region executed by all threads
    !      Other OpenMP directives
    !      Run-time Library calls
    !      All threads join master thread and disband

    !$OMP END PARALLEL

    !      Resume serial code
                 .
                 .
                 .

           END

C/C++ General Code Structure

    #include <omp.h>

    main ()  {

      int var1, var2, var3;

      /* Serial code */
            .
            .
            .

      /* Beginning of parallel region. Fork a team of threads. */
      /* Specify variable scoping */

      #pragma omp parallel private(var1, var2) shared(var3)
      {

        /* Parallel region executed by all threads */
        /* Other OpenMP directives */
        /* Run-time Library calls */
        /* All threads join master thread and disband */

      }

      /* Resume serial code */
            .
            .
            .

    }

Compiling OpenMP Programs
LC OpenMP Implementations:
As of June 2016, the documentation sources for LC's default compilers claim the following OpenMP support:
    Compiler                           Version   Supports
    Intel C/C++, Fortran               14.0.3    OpenMP 3.1
    GNU C/C++, Fortran                 4.4.7     OpenMP 3.0
    PGI C/C++, Fortran                 8.0.1     OpenMP 3.0
    IBM BlueGene C/C++                 12.1      OpenMP 3.1
    IBM BlueGene Fortran               14.1      OpenMP 3.1
    IBM BlueGene GNU C/C++, Fortran    4.4.7     OpenMP 3.0

OpenMP 4.0 Support (according to vendor and openmp.org documentation):
    GNU: supported in 4.9 for C/C++ and 4.9.1 for Fortran
    Intel: 14.0 has "some" support; 15.0 supports "most features"; version 16 supported
    PGI: not currently available
    IBM BG/Q: not currently available
OpenMP 4.5 Support:
    Not currently supported on any of LC's production cluster compilers.
    Supported in a beta version of the Clang compiler on the non-production rzmist and rzhasgpu clusters (June 2016).
To view all LC compiler versions, use the command use -l compilers to view compiler packages by version.
To view LC's default compiler versions see: https://computing.llnl.gov/?set=code&page=compilers
Best place to view OpenMP support by a range of compilers: http://openmp.org/wp/openmp-compilers/.
Compiling:
All of LC's compilers require you to use the appropriate compiler flag to "turn on" OpenMP compilations. The table below shows what to use for each compiler.
    Compiler / Platform     Compiler                  Flag

    Intel                   icc                       -openmp
    Linux Opteron/Xeon      icpc
                            ifort

    PGI                     pgcc                      -mp
    Linux Opteron/Xeon      pgCC
                            pgf77
                            pgf90

    GNU                     gcc                       -fopenmp
    Linux Opteron/Xeon      g++
    IBM Blue Gene           g77
                            gfortran

    IBM                     bgxlc_r, bgcc_r           -qsmp=omp
    Blue Gene               bgxlC_r, bgxlc++_r
                            bgxlc89_r, bgxlc99_r
                            bgxlf_r, bgxlf90_r
                            bgxlf95_r, bgxlf2003_r

    * Be sure to use a thread-safe compiler - its name ends with _r

Compiler Documentation:
    IBM BlueGene: www-01.ibm.com/software/awdtools/fortran/ and www-01.ibm.com/software/awdtools/xlcpp
    Intel: www.intel.com/software/products/compilers/
    PGI: www.pgroup.com
    GNU: gnu.org
    All: See the relevant man pages and any files that might relate in /usr/local/docs

OpenMP Directives
Fortran Directives Format
Format: (case insensitive)

    sentinel   directive-name   [clause ...]

sentinel: All Fortran OpenMP directives must begin with a sentinel. The accepted sentinels depend upon the type of Fortran source. Possible sentinels are:
    !$OMP
    C$OMP
    *$OMP

directive-name: A valid OpenMP directive. Must appear after the sentinel and before any clauses.

[clause ...]: Optional. Clauses can be in any order, and repeated as necessary unless otherwise restricted.
Example:

    !$OMP PARALLEL DEFAULT(SHARED) PRIVATE(BETA,PI)

Fixed Form Source:
!$OMP, C$OMP and *$OMP are accepted sentinels and must start in column 1
All Fortran fixed form rules for line length, white space, continuation and comment columns apply for the entire directive line
Initial directive lines must have a space/zero in column 6.
Continuation lines must have a non-space/zero in column 6.
Free Form Source:
!$OMP is the only accepted sentinel. Can appear in any column, but must be preceded by white space only.
All Fortran free form rules for line length, white space, continuation and comment columns apply for the entire directive line
Initial directive lines must have a space after the sentinel.
Continuation lines must have an ampersand as the last non-blank character in a line. The following line must begin with a sentinel and then the continuation directives.
General Rules:
Comments can not appear on the same line as a directive
Only one directive-name may be specified per directive
Fortran compilers which are OpenMP enabled generally include a command line option which instructs the compiler to activate and interpret all OpenMP directives.
Several Fortran OpenMP directives come in pairs and have the form shown below. The "end" directive is optional but advised for readability.

    !$OMP directive

        [ structured block of code ]

    !$OMP end directive

OpenMP Directives
C/C++ Directives Format
Format:

    #pragma omp   directive-name   [clause, ...]   newline

#pragma omp: Required for all OpenMP C/C++ directives.

directive-name: A valid OpenMP directive. Must appear after the pragma and before any clauses.

[clause, ...]: Optional. Clauses can be in any order, and repeated as necessary unless otherwise restricted.

newline: Required. Precedes the structured block which is enclosed by this directive.
Example:

    #pragma omp parallel default(shared) private(beta,pi)

General Rules:
Case sensitive
Directives follow conventions of the C/C++ standards for compiler directives
Only one directive-name may be specified per directive
Each directive applies to at most one succeeding statement, which must be a structured block.
Long directive lines can be "continued" on succeeding lines by escaping the newline character with a backslash ("\") at the end of a directive line.
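The continuation rule can be seen in a small sketch of our own (the function name is hypothetical). It also uses the reduction clause and the combined parallel for construct, both covered later; without an OpenMP compiler flag the pragma is ignored and the loop simply runs serially:

```c
/* Sums n integers in parallel; note the directive is split across two
   lines by escaping the newline with a trailing backslash */
int sum_with_continued_directive(const int *a, int n)
{
    int i, total = 0;

    #pragma omp parallel for default(shared) \
                reduction(+:total)
    for (i = 0; i < n; i++)
        total += a[i];

    return total;
}
```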

OpenMP Directives
Directive Scoping
Do we do this now... or do it later? Oh well, let's get it over with early...
Static (Lexical) Extent:
The code textually enclosed between the beginning and the end of a structured block following a directive.
The static extent of a directive does not span multiple routines or code files
Orphaned Directive:
An OpenMP directive that appears independently from another enclosing directive is said to be an orphaned directive. It exists outside of another directive's static (lexical) extent.
Will span routines and possibly code files
Dynamic Extent:
The dynamic extent of a directive includes both its static (lexical) extent and the extents of its orphaned directives.
Example:

          PROGRAM TEST
          ...
    !$OMP PARALLEL
          ...
    !$OMP DO
          DO I=...
          ...
          CALL SUB1
          ...
          ENDDO
          ...
          CALL SUB2
          ...
    !$OMP END PARALLEL

          SUBROUTINE SUB1
          ...
    !$OMP CRITICAL
          ...
    !$OMP END CRITICAL
          END

          SUBROUTINE SUB2
          ...
    !$OMP SECTIONS
          ...
    !$OMP END SECTIONS
          ...
          END

STATIC EXTENT
The DO directive occurs within an enclosing parallel region

ORPHANED DIRECTIVES
The CRITICAL and SECTIONS directives occur outside an enclosing parallel region

DYNAMIC EXTENT
The CRITICAL and SECTIONS directives occur within the dynamic extent of the DO and PARALLEL directives.
Why Is This Important?
OpenMP specifies a number of scoping rules on how directives may associate (bind) and nest within each other
Illegal and/or incorrect programs may result if the OpenMP binding and nesting rules are ignored
See Directive Binding and Nesting Rules for specific details

OpenMP Directives
PARALLEL Region Construct
Purpose:
A parallel region is a block of code that will be executed by multiple threads. This is the fundamental OpenMP parallel construct.
Format:

Fortran:

    !$OMP PARALLEL [clause ...]
                   IF (scalar_logical_expression)
                   PRIVATE (list)
                   SHARED (list)
                   DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)
                   FIRSTPRIVATE (list)
                   REDUCTION (operator: list)
                   COPYIN (list)
                   NUM_THREADS (scalar-integer-expression)

       block

    !$OMP END PARALLEL

C/C++:

    #pragma omp parallel [clause ...]  newline
                         if (scalar_expression)
                         private (list)
                         shared (list)
                         default (shared | none)
                         firstprivate (list)
                         reduction (operator: list)
                         copyin (list)
                         num_threads (integer-expression)

       structured_block

Notes:
When a thread reaches a PARALLEL directive, it creates a team of threads and becomes the master of the team. The master is a member of that team and has thread number 0 within that team.
Starting from the beginning of this parallel region, the code is duplicated and all threads will execute that code.
There is an implied barrier at the end of a parallel region. Only the master thread continues execution past this point.
If any thread terminates within a parallel region, all threads in the team will terminate, and the work done up until that point is undefined.
How Many Threads?
The number of threads in a parallel region is determined by the following factors, in order of precedence:

1. Evaluation of the IF clause
2. Setting of the NUM_THREADS clause
3. Use of the omp_set_num_threads() library function
4. Setting of the OMP_NUM_THREADS environment variable
5. Implementation default - usually the number of CPUs on a node, though it could be dynamic (see next bullet).
Threads are numbered from 0 (master thread) to N-1
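The precedence list can be sketched in C (our own illustration; team_size_with_clause is a hypothetical name, and the serial stubs are our addition). The NUM_THREADS clause (factor 2) overrides the omp_set_num_threads() call (factor 3), though an implementation may still deliver fewer threads, for instance when dynamic threads are enabled:

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins (our addition) so this compiles without OpenMP support */
static void omp_set_num_threads(int n) { (void)n; }
static int  omp_get_num_threads(void)  { return 1; }
#endif

/* Returns the team size actually observed inside the parallel region */
int team_size_with_clause(void)
{
    int team = 1;

    omp_set_num_threads(4);             /* factor 3 in the precedence list */

    #pragma omp parallel num_threads(2) /* factor 2: overrides the call above */
    {
        #pragma omp master              /* MASTER is covered later */
        team = omp_get_num_threads();
    }
    return team;    /* at most 2 with OpenMP enabled; 1 when compiled serially */
}
```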
Dynamic Threads:
Use the omp_get_dynamic() library function to determine if dynamic threads are enabled.
If supported, the two methods available for enabling dynamic threads are:
1. The omp_set_dynamic() library routine
2. Setting of the OMP_DYNAMIC environment variable to TRUE
Nested Parallel Regions:
Use the omp_get_nested() library function to determine if nested parallel regions are enabled.
The two methods available for enabling nested parallel regions (if supported) are:
1. The omp_set_nested() library routine
2. Setting of the OMP_NESTED environment variable to TRUE
If not supported, a parallel region nested within another parallel region results in the creation of a new team, consisting of one thread, by default.
Clauses:
IF clause: If present, it must evaluate to .TRUE. (Fortran) or non-zero (C/C++) in order for a team of threads to be created. Otherwise, the region is executed serially by the master thread.
The remaining clauses are described in detail later, in the Data Scope Attribute Clauses section.
Restrictions:
A parallel region must be a structured block that does not span multiple routines or code files
It is illegal to branch (goto) into or out of a parallel region
Only a single IF clause is permitted
Only a single NUM_THREADS clause is permitted
A program must not depend upon the ordering of the clauses

Example: Parallel Region
Simple "Hello World" program
Every thread executes all code enclosed in the parallel region
OpenMP library routines are used to obtain thread identifiers and total number of threads

Fortran Parallel Region Example

           PROGRAM HELLO

           INTEGER NTHREADS, TID, OMP_GET_NUM_THREADS,
         +   OMP_GET_THREAD_NUM

    !      Fork a team of threads with each thread having a private TID variable
    !$OMP PARALLEL PRIVATE(TID)

    !      Obtain and print thread id
           TID = OMP_GET_THREAD_NUM()
           PRINT *, 'Hello World from thread = ', TID

    !      Only master thread does this
           IF (TID .EQ. 0) THEN
              NTHREADS = OMP_GET_NUM_THREADS()
              PRINT *, 'Number of threads = ', NTHREADS
           END IF

    !      All threads join master thread and disband
    !$OMP END PARALLEL

           END

C/C++ Parallel Region Example

    #include <omp.h>

    main (int argc, char *argv[])  {

      int nthreads, tid;

      /* Fork a team of threads with each thread having a private tid variable */
      #pragma omp parallel private(tid)
      {

        /* Obtain and print thread id */
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d\n", tid);

        /* Only master thread does this */
        if (tid == 0)
          {
          nthreads = omp_get_num_threads();
          printf("Number of threads = %d\n", nthreads);
          }

      }  /* All threads join master thread and terminate */

    }

OpenMP Exercise 1
Getting Started
Overview:
Login to the workshop cluster using your workshop username and OTP token
Copy the exercise files to your home directory
Familiarize yourself with LC's OpenMP environment
Write a simple "Hello World" OpenMP program
Successfully compile your program
Successfully run your program
Modify the number of threads used to run your program

GO TO THE EXERCISE HERE

Approx. 20 minutes

OpenMP Directives
Work-Sharing Constructs
A work-sharing construct divides the execution of the enclosed code region among the members of the team that encounter it.
Work-sharing constructs do not launch new threads
There is no implied barrier upon entry to a work-sharing construct, however there is an implied barrier at the end of a work-sharing construct.
Types of Work-Sharing Constructs:
NOTE: The Fortran workshare construct is not shown here, but is discussed later.

DO / for - shares iterations of a loop across the team. Represents a type of "data parallelism".

SECTIONS - breaks work into separate, discrete sections. Each section is executed by a thread. Can be used to implement a type of "functional parallelism".

SINGLE - serializes a section of code
Restrictions:
A work-sharing construct must be enclosed dynamically within a parallel region in order for the directive to execute in parallel.
Work-sharing constructs must be encountered by all members of a team or none at all
Successive work-sharing constructs must be encountered in the same order by all members of a team

OpenMP Directives
Work-Sharing Constructs
DO / for Directive

Purpose:
The DO / for directive specifies that the iterations of the loop immediately following it must be executed in parallel by the team. This assumes a parallel region has already been initiated, otherwise it executes in serial on a single processor.
Format:
! $ O M P

D O

Fortran

[ c
S C
O R
P R
F I
L A
S H
R E
C O

l a
H E
D E
I V
R S
S T
A R
D U
L L

u s
D U
R E
A T
T P
P R
E D
C T
A P

. . . ]
( t y p e

L E

[ , c h u n k ] )

D
E

(
R I V
I V A
( l
I O N
S E

l i
A T
T E
i s
(
( n

s t )
E ( l i s t )
( l i s t )
t )
o p e r a t o r
)
|

i n t r i n s i c

l i s t )

d o _ l o o p
! $ O M P
# p r a g m a

E N D

D O
o m p

[
f o r

C/C++

N O W A I T
[ c
s c
o r
p r
f i
l a
s h
r e
c o
n o

l a
h e
d e
i v
r s
s t
a r
d u
l l
w a

u s
d u
r e
a t
t p
p r
e d
c t
a p
i t

]
e
l e

. . . ]
n e w l i n e
( t y p e [ , c h u n k ] )

d
e

(
r i v
i v a
( l
i o n
s e

l i
a t
t e
i s
(
( n

s t )
e ( l i s t )
( l i s t )
t )
o p e r a t o r :
)

l i s t )

f o r _ l o o p

Clauses:
SCHEDULE: Describes how iterations of the loop are divided among the threads in the team. The default schedule is implementation dependent. For a discussion on how one type of scheduling may be more optimal than others, see http://openmp.org/forum/viewtopic.php?f=3&t=83.
STATIC
    Loop iterations are divided into pieces of size chunk and then statically assigned to threads. If chunk is not specified, the iterations are evenly (if possible) divided contiguously among the threads.
DYNAMIC
    Loop iterations are divided into pieces of size chunk, and dynamically scheduled among the threads; when a thread finishes one chunk, it is dynamically assigned another. The default chunk size is 1.
GUIDED
    Iterations are dynamically assigned to threads in blocks as threads request them until no blocks remain to be assigned. Similar to DYNAMIC except that the block size decreases each time a parcel of work is given to a thread. The size of the initial block is proportional to:

        number_of_iterations / number_of_threads

    Subsequent blocks are proportional to:

        number_of_iterations_remaining / number_of_threads

    The chunk parameter defines the minimum block size. The default chunk size is 1.
RUNTIME
    The scheduling decision is deferred until runtime by the environment variable OMP_SCHEDULE. It is illegal to specify a chunk size for this clause.
AUTO
    The scheduling decision is delegated to the compiler and/or runtime system.
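A brief C sketch (our own, with a hypothetical function name): whichever schedule is chosen only changes which thread executes which iterations, never the computed result. The schedule(static, 10) below is an arbitrary choice; dynamic or guided could be swapped in without affecting the output:

```c
/* Fills c[i] = a[i] + b[i]; iterations are handed out in static
   chunks of 10 per thread when OpenMP is enabled */
void add_static_chunks(const float *a, const float *b, float *c, int n)
{
    int i;   /* the loop variable is automatically private here */

    #pragma omp parallel for schedule(static, 10)
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```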
NOWAIT / nowait: If specified, then threads do not synchronize at the end of the parallel loop.
ORDERED: Specifies that the iterations of the loop must be executed as they would be in a serial program.
COLLAPSE: Specifies how many loops in a nested loop should be collapsed into one large iteration space and divided according to the schedule clause. The sequential execution of the iterations in all associated loops determines the order of the iterations in the collapsed iteration space.
Other clauses are described in detail later, in the Data Scope Attribute Clauses section.
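A short C sketch of COLLAPSE (our own illustration; init_matrix is a hypothetical name): the two perfectly nested loops below are merged into one iteration space of n*m iterations, which is then divided among the threads according to the schedule:

```c
/* Initializes an n-by-m matrix stored row-major in a flat array;
   collapse(2) merges both loops into a single n*m iteration space */
void init_matrix(float *mat, int n, int m)
{
    int i, j;   /* both loop variables become private under collapse(2) */

    #pragma omp parallel for collapse(2)
    for (i = 0; i < n; i++)
        for (j = 0; j < m; j++)
            mat[i * m + j] = (float)(i * m + j);
}
```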
Restrictions:
TheDOloopcannotbeaDOWHILEloop,oraloopwithoutloopcontrol.Also,theloopiterationvariablemustbean
integerandtheloopcontrolparametersmustbethesameforallthreads.
Programcorrectnessmustnotdependuponwhichthreadexecutesaparticulariteration.
Itisillegaltobranch(goto)outofaloopassociatedwithaDO/fordirective.
Thechunksizemustbespecifiedasaloopinvarientintegerexpression,asthereisnosynchronizationduringits
evaluationbydifferentthreads.
ORDERED,COLLAPSEandSCHEDULEclausesmayappearonceeach.
SeetheOpenMPspecificationdocumentforadditionalrestrictions.

Example: DO / for Directive
Simple vector-add program
Arrays A, B, C, and variable N will be shared by all threads.
Variable I will be private to each thread; each thread will have its own unique copy.
The iterations of the loop will be distributed dynamically in CHUNK sized pieces.
Threads will not synchronize upon completing their individual pieces of work (NOWAIT).

Fortran DO Directive Example

           PROGRAM VEC_ADD_DO

           INTEGER N, CHUNKSIZE, CHUNK, I
           PARAMETER (N=1000)
           PARAMETER (CHUNKSIZE=100)
           REAL A(N), B(N), C(N)

    !      Some initializations
           DO I = 1, N
             A(I) = I * 1.0
             B(I) = A(I)
           ENDDO
           CHUNK = CHUNKSIZE

    !$OMP PARALLEL SHARED(A,B,C,CHUNK) PRIVATE(I)

    !$OMP DO SCHEDULE(DYNAMIC,CHUNK)
           DO I = 1, N
             C(I) = A(I) + B(I)
           ENDDO
    !$OMP END DO NOWAIT

    !$OMP END PARALLEL

           END

C/C++ for Directive Example

    #include <omp.h>
    #define N 1000
    #define CHUNKSIZE 100

    main (int argc, char *argv[])  {

      int i, chunk;
      float a[N], b[N], c[N];

      /* Some initializations */
      for (i=0; i < N; i++)
        a[i] = b[i] = i * 1.0;
      chunk = CHUNKSIZE;

      #pragma omp parallel shared(a,b,c,chunk) private(i)
      {
        #pragma omp for schedule(dynamic,chunk) nowait
        for (i=0; i < N; i++)
          c[i] = a[i] + b[i];

      }   /* end of parallel region */

    }

OpenMP Directives
Work-Sharing Constructs
SECTIONS Directive
Purpose:
The SECTIONS directive is a non-iterative work-sharing construct. It specifies that the enclosed section(s) of code are to be divided among the threads in the team.
Independent SECTION directives are nested within a SECTIONS directive. Each SECTION is executed once by a thread in the team. Different sections may be executed by different threads. It is possible for a thread to execute more than one section if it is quick enough and the implementation permits this.
Format:

Fortran:

!$OMP SECTIONS [clause ...]
      PRIVATE (list)
      FIRSTPRIVATE (list)
      LASTPRIVATE (list)
      REDUCTION (operator | intrinsic : list)

!$OMP SECTION

      block

!$OMP SECTION

      block

!$OMP END SECTIONS [ NOWAIT ]

C/C++:

#pragma omp sections [clause ...]  newline
      private (list)
      firstprivate (list)
      lastprivate (list)
      reduction (operator : list)
      nowait
  {

  #pragma omp section  newline

      structured_block

  #pragma omp section  newline

      structured_block

  }
Clauses:
There is an implied barrier at the end of a SECTIONS directive, unless the NOWAIT/nowait clause is used.
Clauses are described in detail later, in the Data Scope Attribute Clauses section.
Questions:
What happens if the number of threads and the number of SECTIONs are different? More threads than SECTIONs? Fewer threads than SECTIONs?
Which thread executes which SECTION?
Restrictions:
It is illegal to branch (goto) into or out of section blocks.
SECTION directives must occur within the lexical extent of an enclosing SECTIONS directive (no orphan SECTIONs).

Example: SECTIONS Directive
Simple program demonstrating that different blocks of work will be done by different threads.

Fortran SECTIONS Directive Example

      PROGRAM VEC_ADD_SECTIONS

      INTEGER N, I
      PARAMETER (N=1000)
      REAL A(N), B(N), C(N), D(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.5
        B(I) = I + 22.35
      ENDDO

!$OMP PARALLEL SHARED(A,B,C,D), PRIVATE(I)

!$OMP SECTIONS

!$OMP SECTION
      DO I = 1, N
         C(I) = A(I) + B(I)
      ENDDO

!$OMP SECTION
      DO I = 1, N
         D(I) = A(I) * B(I)
      ENDDO

!$OMP END SECTIONS NOWAIT

!$OMP END PARALLEL

      END
C/C++ sections Directive Example

#include <omp.h>
#define N     1000

main(int argc, char *argv[])  {

int i;
float a[N], b[N], c[N], d[N];

/* Some initializations */
for (i=0; i < N; i++) {
  a[i] = i * 1.5;
  b[i] = i + 22.35;
  }

#pragma omp parallel shared(a,b,c,d) private(i)
  {

  #pragma omp sections nowait
    {

    #pragma omp section
    for (i=0; i < N; i++)
      c[i] = a[i] + b[i];

    #pragma omp section
    for (i=0; i < N; i++)
      d[i] = a[i] * b[i];

    }  /* end of sections */

  }  /* end of parallel region */

}
OpenMP Directives
Work-Sharing Constructs
WORKSHARE Directive
Purpose:
Fortran only
The WORKSHARE directive divides the execution of the enclosed structured block into separate units of work, each of which is executed only once.
The structured block must consist of only the following:
  array assignments
  scalar assignments
  FORALL statements
  FORALL constructs
  WHERE statements
  WHERE constructs
  atomic constructs
  critical constructs
  parallel constructs
See the OpenMP API documentation for additional information, particularly for what comprises a "unit of work".
Format:

Fortran:

!$OMP WORKSHARE

      structured block

!$OMP END WORKSHARE [ NOWAIT ]
Restrictions:
The construct must not contain any user-defined function calls unless the function is ELEMENTAL.

Example: WORKSHARE Directive
Simple array and scalar assignments shared by the team of threads. A unit of work would include:
  Any scalar assignment
  For array assignment statements, the assignment of each element is a unit of work

Fortran WORKSHARE Directive Example

      PROGRAM WORKSHARE

      INTEGER N, I, J
      PARAMETER (N=100)
      REAL AA(N,N), BB(N,N), CC(N,N), DD(N,N), FIRST, LAST

!     Some initializations
      DO I = 1, N
        DO J = 1, N
          AA(J,I) = I * 1.0
          BB(J,I) = J + 1.0
        ENDDO
      ENDDO

!$OMP PARALLEL SHARED(AA,BB,CC,DD,FIRST,LAST)

!$OMP WORKSHARE
      CC = AA * BB
      DD = AA + BB
      FIRST = CC(1,1) + DD(1,1)
      LAST = CC(N,N) + DD(N,N)
!$OMP END WORKSHARE NOWAIT

!$OMP END PARALLEL

      END
OpenMP Directives
Work-Sharing Constructs
SINGLE Directive
Purpose:
The SINGLE directive specifies that the enclosed code is to be executed by only one thread in the team.
May be useful when dealing with sections of code that are not thread safe (such as I/O).
Format:
! $ O M P

Fortran

S I N G L E

[ c l a u s e . . . ]
P R I V A T E ( l i s t )
F I R S T P R I V A T E ( l i s t )

b l o c k
! $ O M P

E N D

# p r a g m a

S I N G L E
o m p

C/C++

s i n g l e

N O W A I T
[ c
p r
f i
n o

l a
i v
r s
w a

u s e . . . ]
n e w l i n e
a t e ( l i s t )
t p r i v a t e ( l i s t )
i t

s t r u c t u r e d _ b l o c k

Clauses:
Threads in the team that do not execute the SINGLE directive wait at the end of the enclosed code block, unless a NOWAIT/nowait clause is specified.
Clauses are described in detail later, in the Data Scope Attribute Clauses section.
Restrictions:
It is illegal to branch into or out of a SINGLE block.
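A minimal sketch of the SINGLE directive (the function and variable names here are illustrative, not from the tutorial): one thread performs a shared initialization, and the implied barrier at the end of the SINGLE block guarantees every thread sees it before continuing.

```c
/* Sketch: only one thread executes the single block (e.g. I/O or a
   shared setup step); the rest wait at the implied barrier and then
   all of them see the initialized data. */
static int shared_init;

static int run_with_single(void) {
    int uses = 0;
    #pragma omp parallel
    {
        #pragma omp single
        shared_init = 42;          /* executed by exactly one thread */

        /* implied barrier: every thread now sees shared_init */
        #pragma omp critical
        uses += shared_init;
    }
    return uses;                   /* 42 * number_of_threads */
}
```

Adding NOWAIT/nowait to the SINGLE would remove that barrier, and the other threads could read `shared_init` before it is set.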

OpenMP Directives
Combined Parallel Work-Sharing Constructs
OpenMP provides three directives that are merely conveniences:
  PARALLEL DO / parallel for
  PARALLEL SECTIONS
  PARALLEL WORKSHARE (Fortran only)
For the most part, these directives behave identically to an individual PARALLEL directive being immediately followed by a separate work-sharing directive.
Most of the rules, clauses and restrictions that apply to both directives are in effect. See the OpenMP API for details.
An example using the PARALLEL DO / parallel for combined directive is shown below.

Fortran PARALLEL DO Directive Example

      PROGRAM VECTOR_ADD

      INTEGER N, I, CHUNKSIZE, CHUNK
      PARAMETER (N=1000)
      PARAMETER (CHUNKSIZE=100)
      REAL A(N), B(N), C(N)

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.0
        B(I) = A(I)
      ENDDO
      CHUNK = CHUNKSIZE

!$OMP PARALLEL DO
!$OMP& SHARED(A,B,C,CHUNK) PRIVATE(I)
!$OMP& SCHEDULE(STATIC,CHUNK)

      DO I = 1, N
         C(I) = A(I) + B(I)
      ENDDO

!$OMP END PARALLEL DO

      END
C/C++ parallel for Directive Example

#include <omp.h>
#define N       1000
#define CHUNKSIZE   100

main(int argc, char *argv[])  {

int i, chunk;
float a[N], b[N], c[N];

/* Some initializations */
for (i=0; i < N; i++)
  a[i] = b[i] = i * 1.0;
chunk = CHUNKSIZE;

#pragma omp parallel for \
   shared(a,b,c,chunk) private(i) \
   schedule(static,chunk)
  for (i=0; i < N; i++)
    c[i] = a[i] + b[i];

}
OpenMP Directives
TASK Construct
Purpose:
The TASK construct defines an explicit task, which may be executed by the encountering thread, or deferred for execution by any other thread in the team.
The data environment of the task is determined by the data sharing attribute clauses.
Task execution is subject to task scheduling; see the OpenMP 3.1 specification document for details.

Also see the OpenMP 3.1 documentation for the associated taskyield and taskwait directives.
Format:
! $ O M P

T A S K

[ c l a
I F
F I
U N
D E
M E
P R
F I
S H

Fortran

u s
(
N A
T I
F A
R G
I V
R S
A R

e
s c
L
E D
U L
E A
A T
T P
E D

. . . ]
a l a r l o g i c a l e x p r e s s i o n )
( s c a l a r l o g i c a l e x p r e s s i o n )
T

( P R
B L E
E ( l i
R I V A T
( l i s

I V A T E

F I R S T P R I V A T E

S H A R E D

s t )
E ( l i s t )
t )

b l o c k
! $ O M P

E N D

# p r a g m a

T A S K
o m p

t a s k

C/C++

[ c l a
i f
f i
u n
d e
m e
p r
f i
s h

u s
(
n a
t i
f a
r g
i v
r s
a r

e
s c
l
e d
u l
e a
a t
t p
e d

. . . ]
n e w l i n e
a l a r e x p r e s s i o n )
( s c a l a r e x p r e s s i o n )
t

( s h
b l e
e ( l i
r i v a t
( l i s

a r e d

n o n e )

s t )
e ( l i s t )
t )

s t r u c t u r e d _ b l o c k

Clauses and Restrictions:
Please consult the OpenMP 3.1 specifications document for details.
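A common illustration of the TASK construct is a recursive computation where each recursive call becomes a task. The sketch below (the helper names `fib` and `fib_parallel` are mine, not from the tutorial) spawns two child tasks per call and uses taskwait to collect their results:

```c
/* Sketch: recursive Fibonacci with explicit tasks.  One thread seeds
   the task tree from inside a single region; idle threads in the team
   pick up the deferred tasks. */
static long fib(int n) {
    long x, y;
    if (n < 2) return n;
    #pragma omp task shared(x) firstprivate(n)
    x = fib(n - 1);
    #pragma omp task shared(y) firstprivate(n)
    y = fib(n - 2);
    #pragma omp taskwait       /* wait for both child tasks to finish */
    return x + y;
}

static long fib_parallel(int n) {
    long result;
    #pragma omp parallel
    #pragma omp single         /* one thread generates the initial task */
    result = fib(n);
    return result;
}
```

In a real program the recursion would be cut off below some threshold (or the FINAL/IF clauses used), since tiny tasks cost more to schedule than to compute.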

OpenMP Exercise 2
Work-Sharing Constructs
Overview:
  Log into the LC workshop cluster, if you are not already logged in
  Work-Sharing DO/for construct examples: review, compile and run
  Work-Sharing SECTIONS construct example: review, compile and run

GO TO THE EXERCISE HERE

Approx. 20 minutes

OpenMP Directives
Synchronization Constructs
Consider a simple example where two threads on two different processors are both trying to increment a variable x at the same time (assume x is initially 0):

THREAD 1:                     THREAD 2:

increment(x)                  increment(x)
{                             {
    x = x + 1;                    x = x + 1;
}                             }

THREAD 1:                     THREAD 2:

10  LOAD A, (x address)       10  LOAD A, (x address)
20  ADD A, 1                  20  ADD A, 1
30  STORE A, (x address)      30  STORE A, (x address)

One possible execution sequence:
1. Thread 1 loads the value of x into register A.
2. Thread 2 loads the value of x into register A.
3. Thread 1 adds 1 to register A.
4. Thread 2 adds 1 to register A.
5. Thread 1 stores register A at location x.
6. Thread 2 stores register A at location x.
The resultant value of x will be 1, not 2 as it should be.
To avoid a situation like this, the incrementing of x must be synchronized between the two threads to ensure that the correct result is produced.
OpenMP provides a variety of Synchronization Constructs that control how the execution of each thread proceeds relative to other team threads.

OpenMP Directives
Synchronization Constructs
MASTER Directive
Purpose:
The MASTER directive specifies a region that is to be executed only by the master thread of the team. All other threads on the team skip this section of code.
There is no implied barrier associated with this directive.
Format:

Fortran:

!$OMP MASTER

      block

!$OMP END MASTER

C/C++:

#pragma omp master  newline

      structured_block

Restrictions:
It is illegal to branch into or out of a MASTER block.
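A minimal sketch (the names here are illustrative): the increment runs exactly once, on the master thread, no matter how many threads are in the team. Note there is no barrier at the end of the MASTER block itself; the visibility of the update below relies on the implied barrier at the end of the parallel region.

```c
/* Sketch: only the master thread (thread 0) executes the master block. */
static int master_count;

static void count_master_executions(void) {
    master_count = 0;
    #pragma omp parallel
    {
        #pragma omp master
        master_count++;     /* executed once, by the master thread only;
                               the other threads do NOT wait here */
    }   /* the parallel region's closing barrier makes the update visible */
}
```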

OpenMP Directives
Synchronization Constructs
CRITICAL Directive
Purpose:
The CRITICAL directive specifies a region of code that must be executed by only one thread at a time.
Format:

Fortran:

!$OMP CRITICAL [ name ]

      block

!$OMP END CRITICAL [ name ]

C/C++:

#pragma omp critical [ name ]  newline

      structured_block

Notes:
If a thread is currently executing inside a CRITICAL region and another thread reaches that CRITICAL region and attempts to execute it, it will block until the first thread exits that CRITICAL region.
The optional name enables multiple different CRITICAL regions to exist:
  Names act as global identifiers. Different CRITICAL regions with the same name are treated as the same region.
  All CRITICAL sections which are unnamed are treated as the same section.
Restrictions:
It is illegal to branch into or out of a CRITICAL block.
Fortran only: The names of critical constructs are global entities of the program. If a name conflicts with any other entity, the behavior of the program is unspecified.

Example: CRITICAL Construct
All threads in the team will attempt to execute in parallel; however, because of the CRITICAL construct surrounding the increment of x, only one thread will be able to read/increment/write x at any time.

Fortran CRITICAL Directive Example

      PROGRAM CRITICAL

      INTEGER X
      X = 0

!$OMP PARALLEL SHARED(X)

!$OMP CRITICAL
      X = X + 1
!$OMP END CRITICAL

!$OMP END PARALLEL

      END
C/C++ critical Directive Example

#include <omp.h>

main(int argc, char *argv[])  {

int x;
x = 0;

#pragma omp parallel shared(x)
  {

  #pragma omp critical
  x = x + 1;

  }  /* end of parallel region */

}
OpenMP Directives
Synchronization Constructs
BARRIER Directive
Purpose:
The BARRIER directive synchronizes all threads in the team.
When a BARRIER directive is reached, a thread will wait at that point until all other threads have reached that barrier. All threads then resume executing in parallel the code that follows the barrier.
Format:

Fortran:

!$OMP BARRIER

C/C++:

#pragma omp barrier  newline

Restrictions:
All threads in a team (or none) must execute the BARRIER region.
The sequence of work-sharing regions and barrier regions encountered must be the same for every thread in a team.
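A typical use of an explicit barrier is separating two phases of work inside one parallel region (the function and array names below are illustrative): phase 1 uses nowait to skip the loop's implied barrier, so an explicit barrier is needed before phase 2 reads phase 1's results.

```c
/* Sketch: the explicit barrier guarantees that every thread's stage-1
   updates are complete and visible before any thread starts stage 2. */
#define N 64
static int stage1[N], stage2[N];

static void two_phase(void) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        stage1[i] = i;

    #pragma omp parallel
    {
        #pragma omp for nowait          /* skip the loop's implied barrier */
        for (int i = 0; i < N; i++)
            stage1[i] += 1;

        #pragma omp barrier             /* ...so an explicit one is required */

        #pragma omp for
        for (int i = 0; i < N; i++)
            stage2[i] = stage1[i] * 2;  /* safely reads other threads' work */
    }
}
```

Without the barrier, a fast thread could read a stage1 element that another thread has not yet incremented.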

OpenMP Directives
Synchronization Constructs
TASKWAIT Directive
Purpose:
OpenMP 3.1 feature
The TASKWAIT construct specifies a wait on the completion of child tasks generated since the beginning of the current task.
Format:

Fortran:

!$OMP TASKWAIT

C/C++:

#pragma omp taskwait  newline

Restrictions:
Because the taskwait construct does not have a C language statement as part of its syntax, there are some restrictions on its placement within a program. The taskwait directive may be placed only at a point where a base language statement is allowed. The taskwait directive may not be used in place of the statement following an if, while, do, switch, or label. See the OpenMP 3.1 specifications document for details.

OpenMP Directives
Synchronization Constructs
ATOMIC Directive
Purpose:
The ATOMIC directive specifies that a specific memory location must be updated atomically, rather than letting multiple threads attempt to write to it. In essence, this directive provides a mini-CRITICAL section.
Format:

Fortran:

!$OMP ATOMIC

      statement_expression

C/C++:

#pragma omp atomic  newline

      statement_expression

Restrictions:
The directive applies only to a single, immediately following statement.
An atomic statement must follow a specific syntax. See the most recent OpenMP specs for this.
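A minimal sketch of ATOMIC protecting a shared counter (the function name is illustrative); because only the single update statement is protected, this is typically cheaper than wrapping it in a full CRITICAL section:

```c
/* Sketch: each thread atomically increments a shared counter; the
   final count equals the number of loop iterations. */
#define ITER 10000
static long counter;

static long count_atomically(void) {
    counter = 0;
    #pragma omp parallel for
    for (int i = 0; i < ITER; i++) {
        #pragma omp atomic
        counter++;          /* x++ is one of the permitted atomic forms */
    }
    return counter;
}
```

Removing the atomic directive would reintroduce exactly the lost-update race described in the Synchronization Constructs introduction above.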

OpenMP Directives
Synchronization Constructs
FLUSH Directive
Purpose:
The FLUSH directive identifies a synchronization point at which the implementation must provide a consistent view of memory. Thread-visible variables are written back to memory at this point.
There is a fair amount of discussion on this directive within OpenMP circles that you may wish to consult for more information. Some of it is hard to understand. Per the API:
  If the intersection of the flush-sets of two flushes performed by two different threads is non-empty, then the two flushes must be completed as if in some sequential order, seen by all threads.
Say what?
To quote from the openmp.org FAQ:
  Q17: Is the !$omp flush directive necessary on a cache coherent system?
  A17: Yes the flush directive is necessary. Look in the OpenMP specifications for examples of its uses. The directive is necessary to instruct the compiler that the variable must be written to/read from the memory system, i.e. that the variable can not be kept in a local CPU register over the flush "statement" in your code.
  Cache coherency makes certain that if one CPU executes a read or write instruction from/to memory, then all other CPUs in the system will get the same value from that memory address when they access it. All caches will show a coherent value. However, in the OpenMP standard there must be a way to instruct the compiler to actually insert the read/write machine instruction and not postpone it. Keeping a variable in a register in a loop is very common when producing efficient machine language code for a loop.
Also see the most recent OpenMP specs for details.
Format:

Fortran:

!$OMP FLUSH (list)

C/C++:

#pragma omp flush (list)  newline
Notes:
The optional list contains a list of named variables that will be flushed in order to avoid flushing all variables. For pointers in the list, note that the pointer itself is flushed, not the object it points to.
Implementations must ensure any prior modifications to thread-visible variables are visible to all threads after this point; i.e. compilers must restore values from registers to memory, hardware might need to flush write buffers, etc.
The FLUSH directive is implied for the directives shown in the table below. The directive is not implied if a NOWAIT clause is present.

Fortran                            C/C++

BARRIER                            barrier
END PARALLEL                       parallel - upon entry and exit
CRITICAL and END CRITICAL          critical - upon entry and exit
END DO                             ordered - upon entry and exit
END SECTIONS                       for - upon exit
END SINGLE                         sections - upon exit
ORDERED and END ORDERED            single - upon exit

OpenMP Directives
Synchronization Constructs
ORDERED Directive
Purpose:
The ORDERED directive specifies that iterations of the enclosed loop will be executed in the same order as if they were executed on a serial processor.
Threads will need to wait before executing their chunk of iterations if previous iterations haven't completed yet.
Used within a DO/for loop with an ORDERED clause.
The ORDERED directive provides a way to "fine tune" where ordering is to be applied within a loop. Otherwise, it is not required.
Format:

Fortran:

!$OMP DO ORDERED [clauses...]
   (loop region)

!$OMP ORDERED

   (block)

!$OMP END ORDERED

   (end of loop region)
!$OMP END DO

C/C++:

#pragma omp for ordered [clauses...]
   (loop region)

#pragma omp ordered  newline

   structured_block

   (end of loop region)

Restrictions:
An ORDERED directive can only appear in the dynamic extent of the following directives:
  DO or PARALLEL DO (Fortran)
  for or parallel for (C/C++)
Only one thread is allowed in an ordered section at any time.
It is illegal to branch into or out of an ORDERED block.
An iteration of a loop must not execute the same ORDERED directive more than once, and it must not execute more than one ORDERED directive.
A loop which contains an ORDERED directive must be a loop with an ORDERED clause.
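A minimal sketch (the names are illustrative): the loop iterations may run on any thread, but the ordered block inside executes strictly in iteration order, so the buffer is filled exactly as a serial loop would fill it.

```c
/* Sketch: the ordered region runs once per iteration, in sequential
   iteration order, so consecutive slots get consecutive values even
   though the surrounding loop is parallel. */
#define N 16
static int order[N];

static void record_in_order(void) {
    int pos = 0;
    #pragma omp parallel for ordered shared(pos)
    for (int i = 0; i < N; i++) {
        /* ...arbitrary unordered parallel work could happen here... */
        #pragma omp ordered
        order[pos++] = i;   /* serialized in iteration order, so pos == i */
    }
}
```

Note the loop carries both the `ordered` clause (on the for directive) and the ORDERED directive (inside the loop body), as the restrictions above require.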

OpenMP Directives
THREADPRIVATE Directive
Purpose:
The THREADPRIVATE directive is used to make global file scope variables (C/C++) or common blocks (Fortran) local and persistent to a thread through the execution of multiple parallel regions.
Format:

Fortran:

!$OMP THREADPRIVATE (/cb/, ...)      cb is the name of a common block

C/C++:

#pragma omp threadprivate (list)

Notes:
The directive must appear after the declaration of listed variables/common blocks. Each thread then gets its own copy of the variable/common block, so data written by one thread is not visible to other threads. For example:

Fortran THREADPRIVATE Directive Example

      PROGRAM THREADPRIV

      INTEGER A, B, I, TID, OMP_GET_THREAD_NUM
      REAL*4 X
      COMMON /C1/ A

!$OMP THREADPRIVATE(/C1/, X)

!     Explicitly turn off dynamic threads
      CALL OMP_SET_DYNAMIC(.FALSE.)

      PRINT *, '1st Parallel Region:'
!$OMP PARALLEL PRIVATE(B, TID)
      TID = OMP_GET_THREAD_NUM()
      A = TID
      B = TID
      X = 1.1 * TID + 1.0
      PRINT *, 'Thread',TID,':   A,B,X=',A,B,X
!$OMP END PARALLEL

      PRINT *, '************************************'
      PRINT *, 'Master thread doing serial work here'
      PRINT *, '************************************'

      PRINT *, '2nd Parallel Region: '
!$OMP PARALLEL PRIVATE(TID)
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Thread',TID,':   A,B,X=',A,B,X
!$OMP END PARALLEL

      END

Output:

1st Parallel Region:
Thread 0 :   A,B,X= 0 0 1.000000000
Thread 1 :   A,B,X= 1 1 2.099999905
Thread 3 :   A,B,X= 3 3 4.300000191
Thread 2 :   A,B,X= 2 2 3.200000048
************************************
Master thread doing serial work here
************************************
2nd Parallel Region:
Thread 0 :   A,B,X= 0 0 1.000000000
Thread 2 :   A,B,X= 2 0 3.200000048
Thread 3 :   A,B,X= 3 0 4.300000191
Thread 1 :   A,B,X= 1 0 2.099999905

C/C++ threadprivate Directive Example

#include <omp.h>
#include <stdio.h>

int  a, b, i, tid;
float x;

#pragma omp threadprivate(a, x)

main(int argc, char *argv[])  {

/* Explicitly turn off dynamic threads */
  omp_set_dynamic(0);

  printf("1st Parallel Region:\n");
#pragma omp parallel private(b,tid)
  {
  tid = omp_get_thread_num();
  a = tid;
  b = tid;
  x = 1.1 * tid + 1.0;
  printf("Thread %d:   a,b,x= %d %d %f\n",tid,a,b,x);
  }  /* end of parallel region */

  printf("************************************\n");
  printf("Master thread doing serial work here\n");
  printf("************************************\n");

  printf("2nd Parallel Region:\n");
#pragma omp parallel private(tid)
  {
  tid = omp_get_thread_num();
  printf("Thread %d:   a,b,x= %d %d %f\n",tid,a,b,x);
  }  /* end of parallel region */

}

Output:

1st Parallel Region:
Thread 0:   a,b,x= 0 0 1.000000
Thread 2:   a,b,x= 2 2 3.200000
Thread 3:   a,b,x= 3 3 4.300000
Thread 1:   a,b,x= 1 1 2.100000
************************************
Master thread doing serial work here
************************************
2nd Parallel Region:
Thread 0:   a,b,x= 0 0 1.000000
Thread 3:   a,b,x= 3 0 4.300000
Thread 1:   a,b,x= 1 0 2.100000
Thread 2:   a,b,x= 2 0 3.200000
On first entry to a parallel region, data in THREADPRIVATE variables and common blocks should be assumed undefined, unless a COPYIN clause is specified in the PARALLEL directive.
THREADPRIVATE variables differ from PRIVATE variables (discussed later) because they are able to persist between different parallel regions of a code.
Restrictions:
Data in THREADPRIVATE objects is guaranteed to persist only if the dynamic threads mechanism is "turned off" and the number of threads in different parallel regions remains constant. The default setting of dynamic threads is undefined.
The THREADPRIVATE directive must appear after every declaration of a thread private variable/common block.
Fortran: only named common blocks can be made THREADPRIVATE.

OpenMP Directives
Data Scope Attribute Clauses
Also called Data-sharing Attribute Clauses
An important consideration for OpenMP programming is the understanding and use of data scoping.
Because OpenMP is based upon the shared memory programming model, most variables are shared by default.
Global variables include:
  Fortran: COMMON blocks, SAVE variables, MODULE variables
  C: File scope variables, static
Private variables include:
  Loop index variables
  Stack variables in subroutines called from parallel regions
  Fortran: Automatic variables within a statement block
The OpenMP Data Scope Attribute Clauses are used to explicitly define how variables should be scoped. They include:
  PRIVATE
  FIRSTPRIVATE
  LASTPRIVATE
  SHARED
  DEFAULT
  REDUCTION
  COPYIN
Data Scope Attribute Clauses are used in conjunction with several directives (PARALLEL, DO/for, and SECTIONS) to control the scoping of enclosed variables.

These constructs provide the ability to control the data environment during execution of parallel constructs.
  They define how and which data variables in the serial section of the program are transferred to the parallel regions of the program (and back).
  They define which variables will be visible to all threads in the parallel regions and which variables will be privately allocated to all threads.
Data Scope Attribute Clauses are effective only within their lexical/static extent.
Important: Please consult the latest OpenMP specs for important details and discussion on this topic.
A Clauses/Directives Summary Table is provided for convenience.

PRIVATE Clause
Purpose:
The PRIVATE clause declares variables in its list to be private to each thread.
Format:

Fortran:    PRIVATE (list)
C/C++:      private (list)

Notes:
PRIVATE variables behave as follows:
  A new object of the same type is declared once for each thread in the team.
  All references to the original object are replaced with references to the new object.
  Variables declared PRIVATE should be assumed to be uninitialized for each thread.
Comparison between PRIVATE and THREADPRIVATE:

                   PRIVATE                              THREADPRIVATE

Data Item          C/C++: variable                      C/C++: variable
                   Fortran: variable or common block    Fortran: common block

Where Declared     At start of region or work-sharing   In declarations of each routine using block
                   group                                or global file scope

Persistent?        No                                   Yes

Extent             Lexical only - unless passed as an   Dynamic
                   argument to subroutine

Initialized        Use FIRSTPRIVATE                     Use COPYIN

SHARED Clause
Purpose:
The SHARED clause declares variables in its list to be shared among all threads in the team.
Format:

Fortran:    SHARED (list)
C/C++:      shared (list)

Notes:
A shared variable exists in only one memory location and all threads can read or write to that address.

It is the programmer's responsibility to ensure that multiple threads properly access SHARED variables (such as via CRITICAL sections).

DEFAULT Clause
Purpose:
The DEFAULT clause allows the user to specify a default scope for all variables in the lexical extent of any parallel region.
Format:

Fortran:    DEFAULT (PRIVATE | FIRSTPRIVATE | SHARED | NONE)
C/C++:      default (shared | none)

Notes:
Specific variables can be exempted from the default using the PRIVATE, SHARED, FIRSTPRIVATE, LASTPRIVATE, and REDUCTION clauses.
The C/C++ OpenMP specification does not include private or firstprivate as a possible default. However, actual implementations may provide this option.
Using NONE as a default requires that the programmer explicitly scope all variables.
Restrictions:
Only one DEFAULT clause can be specified on a PARALLEL directive.
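A sketch of DEFAULT(NONE) in use (the function name is illustrative): every variable referenced in the region must be scoped explicitly, so forgetting `shared(n)` or the reduction clause below would be a compile-time error rather than a silent data race.

```c
/* Sketch: default(none) forces explicit scoping of every variable
   that appears inside the parallel region. */
static int sum_explicit_scope(int n) {
    int total = 0;
    int i;
    #pragma omp parallel for default(none) shared(n) private(i) \
            reduction(+:total)
    for (i = 0; i < n; i++)
        total += i;       /* each variable's scope is declared above */
    return total;
}
```

Many practitioners recommend default(none) as a defensive habit, precisely because it turns scoping mistakes into compiler diagnostics.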

FIRSTPRIVATE Clause
Purpose:
The FIRSTPRIVATE clause combines the behavior of the PRIVATE clause with automatic initialization of the variables in its list.
Format:

Fortran:    FIRSTPRIVATE (list)
C/C++:      firstprivate (list)

Notes:
Listed variables are initialized according to the value of their original objects prior to entry into the parallel or work-sharing construct.

LASTPRIVATE Clause
Purpose:
The LASTPRIVATE clause combines the behavior of the PRIVATE clause with a copy from the last loop iteration or section to the original variable object.
Format:

Fortran:    LASTPRIVATE (list)
C/C++:      lastprivate (list)

Notes:
The value copied back into the original variable object is obtained from the last (sequentially) iteration or section of the enclosing construct.
For example, the team member which executes the final iteration for a DO section, or the team member which does the last SECTION of a SECTIONS context, performs the copy with its own values.
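A minimal sketch (illustrative names): after the loop, the original `x` holds the value computed in the sequentially last iteration (i == 9), regardless of which thread happened to execute that iteration.

```c
/* Sketch: lastprivate(x) copies the value from the sequentially last
   iteration back into the original x after the loop. */
static int last_value(void) {
    int x = -1;
    #pragma omp parallel for lastprivate(x)
    for (int i = 0; i < 10; i++)
        x = i * i;
    return x;          /* value from iteration i == 9, i.e. 81 */
}
```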

COPYIN Clause
Purpose:
The COPYIN clause provides a means for assigning the same value to THREADPRIVATE variables for all threads in the team.
Format:

Fortran:    COPYIN (list)
C/C++:      copyin (list)

Notes:
List contains the names of variables to copy. In Fortran, the list can contain both the names of common blocks and named variables.
The master thread variable is used as the copy source. The team threads are initialized with its value upon entry into the parallel construct.

COPYPRIVATE Clause
Purpose:
The COPYPRIVATE clause can be used to broadcast values acquired by a single thread directly to all instances of the private variables in the other threads.
Associated with the SINGLE directive.
See the most recent OpenMP specs document for additional discussion and examples.
Format:

Fortran:    COPYPRIVATE (list)
C/C++:      copyprivate (list)

REDUCTION Clause
Purpose:
The REDUCTION clause performs a reduction on the variables that appear in its list.
A private copy for each list variable is created for each thread. At the end of the reduction, the reduction operation is applied to all private copies of the shared variable, and the final result is written to the global shared variable.
Format:

Fortran:    REDUCTION (operator | intrinsic : list)
C/C++:      reduction (operator : list)

Example: REDUCTION - Vector Dot Product:
Iterations of the parallel loop will be distributed in equal sized blocks to each thread in the team (SCHEDULE STATIC).
At the end of the parallel loop construct, all threads will add their values of "result" to update the master thread's global copy.

Fortran REDUCTION Clause Example

      PROGRAM DOT_PRODUCT

      INTEGER N, CHUNKSIZE, CHUNK, I
      PARAMETER (N=100)
      PARAMETER (CHUNKSIZE=10)
      REAL A(N), B(N), RESULT

!     Some initializations
      DO I = 1, N
        A(I) = I * 1.0
        B(I) = I * 2.0
      ENDDO
      RESULT = 0.0
      CHUNK = CHUNKSIZE

!$OMP PARALLEL DO
!$OMP& DEFAULT(SHARED) PRIVATE(I)
!$OMP& SCHEDULE(STATIC,CHUNK)
!$OMP& REDUCTION(+:RESULT)

      DO I = 1, N
        RESULT = RESULT + (A(I) * B(I))
      ENDDO

!$OMP END PARALLEL DO

      PRINT *, 'Final Result= ', RESULT
      END
C/C++ reduction Clause Example

#include <omp.h>
#include <stdio.h>

main(int argc, char *argv[])  {

int   i, n, chunk;
float a[100], b[100], result;

/* Some initializations */
n = 100;
chunk = 10;
result = 0.0;
for (i=0; i < n; i++) {
  a[i] = i * 1.0;
  b[i] = i * 2.0;
  }

#pragma omp parallel for      \
  default(shared) private(i)  \
  schedule(static,chunk)      \
  reduction(+:result)

  for (i=0; i < n; i++)
    result = result + (a[i] * b[i]);

printf("Final result= %f\n",result);

}
Restrictions:
Variables in the list must be named scalar variables. They can not be array or structure type variables. They must also be declared SHARED in the enclosing context.
Reduction operations may not be associative for real numbers.
The REDUCTION clause is intended to be used on a region or work-sharing construct in which the reduction variable is used only in statements which have one of the following forms:

Fortran:

    x = x operator expr
    x = expr operator x (except subtraction)
    x = intrinsic(x, expr)
    x = intrinsic(expr, x)

    x is a scalar variable in the list
    expr is a scalar expression that does not reference x
    intrinsic is one of MAX, MIN, IAND, IOR, IEOR
    operator is one of +, *, -, .AND., .OR., .EQV., .NEQV.

C/C++:

    x = x op expr
    x = expr op x (except subtraction)
    x binop = expr
    x++
    ++x
    x--
    --x

    x is a scalar variable in the list
    expr is a scalar expression that does not reference x
    op is not overloaded, and is one of +, *, -, /, &, ^, |, &&, ||
    binop is not overloaded, and is one of +, *, -, /, &, ^, |
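As a concrete illustration of the C/C++ forms, the sketch below combines two of the allowed statement shapes, "x binop= expr" and "x++", in one reduction loop. The function and variable names are hypothetical, and the code should be compiled with OpenMP support enabled (e.g. -fopenmp); without it, the pragma is simply ignored and the loop runs serially.

```c
#include <omp.h>

/* Sum and count an array with two reduction variables. Each update
   matches an allowed reduction statement form. */
int sum_and_count(const int *v, int n, int *count_out) {
    int sum = 0, count = 0;
    int i;
    #pragma omp parallel for reduction(+:sum,count)
    for (i = 0; i < n; i++) {
        sum += v[i];   /* x binop= expr */
        count++;       /* x++           */
    }
    *count_out = count;
    return sum;
}
```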

OpenMP Directives
Clauses / Directives Summary
The table below summarizes which clauses are accepted by which OpenMP directives.

    Clause         PARALLEL  DO/for  SECTIONS  SINGLE  PARALLEL  PARALLEL
                                                       DO/for    SECTIONS
    IF                x                                   x         x
    PRIVATE           x        x        x        x        x         x
    SHARED            x                                   x         x
    DEFAULT           x                                   x         x
    FIRSTPRIVATE      x        x        x        x        x         x
    LASTPRIVATE                x        x                 x         x
    REDUCTION         x        x        x                 x         x
    COPYIN            x                                   x         x
    COPYPRIVATE                                  x
    SCHEDULE                   x                          x
    ORDERED                    x                          x
    NOWAIT                     x        x        x

The following OpenMP directives do not accept clauses:
MASTER
CRITICAL
BARRIER
ATOMIC
FLUSH
ORDERED
THREADPRIVATE
Implementations may (and do) differ from the standard in which clauses are supported by each directive.

OpenMP Directives
Directive Binding and Nesting Rules
This section is provided mainly as a quick reference on rules which govern OpenMP directives and binding. Users should consult their implementation documentation and the OpenMP standard for other rules and restrictions.
Unless indicated otherwise, rules apply to both Fortran and C/C++ OpenMP implementations.
Note: the Fortran API also defines a number of Data Environment rules. Those have not been reproduced here.
Directive Binding:
The DO/for, SECTIONS, SINGLE, MASTER and BARRIER directives bind to the dynamically enclosing PARALLEL, if one exists. If no parallel region is currently being executed, the directives have no effect.
The ORDERED directive binds to the dynamically enclosing DO/for.
The ATOMIC directive enforces exclusive access with respect to ATOMIC directives in all threads, not just the current team.
The CRITICAL directive enforces exclusive access with respect to CRITICAL directives in all threads, not just the current team.
A directive can never bind to any directive outside the closest enclosing PARALLEL.
Directive Nesting:
A worksharing region may not be closely nested inside a worksharing, explicit task, critical, ordered, atomic, or master region.
A barrier region may not be closely nested inside a worksharing, explicit task, critical, ordered, atomic, or master region.
A master region may not be closely nested inside a worksharing, atomic, or explicit task region.
An ordered region may not be closely nested inside a critical, atomic, or explicit task region.
An ordered region must be closely nested inside a loop region (or parallel loop region) with an ordered clause.
A critical region may not be nested (closely or otherwise) inside a critical region with the same name. Note that this restriction is not sufficient to prevent deadlock.
parallel, flush, critical, atomic, taskyield, and explicit task regions may not be closely nested inside an atomic region.

Run-Time Library Routines
Overview:
The OpenMP API includes an ever-growing number of run-time library routines.
These routines are used for a variety of purposes as shown in the table below:

    Routine                       Purpose
    OMP_SET_NUM_THREADS           Sets the number of threads that will be used in the next parallel region
    OMP_GET_NUM_THREADS           Returns the number of threads that are currently in the team executing the parallel region from which it is called
    OMP_GET_MAX_THREADS           Returns the maximum value that can be returned by a call to the OMP_GET_NUM_THREADS function
    OMP_GET_THREAD_NUM            Returns the thread number of the thread, within the team, making this call
    OMP_GET_THREAD_LIMIT          Returns the maximum number of OpenMP threads available to a program
    OMP_GET_NUM_PROCS             Returns the number of processors that are available to the program
    OMP_IN_PARALLEL               Used to determine if the section of code which is executing is parallel or not
    OMP_SET_DYNAMIC               Enables or disables dynamic adjustment (by the run time system) of the number of threads available for execution of parallel regions
    OMP_GET_DYNAMIC               Used to determine if dynamic thread adjustment is enabled or not
    OMP_SET_NESTED                Used to enable or disable nested parallelism
    OMP_GET_NESTED                Used to determine if nested parallelism is enabled or not
    OMP_SET_SCHEDULE              Sets the loop scheduling policy when "runtime" is used as the schedule kind in the OpenMP directive
    OMP_GET_SCHEDULE              Returns the loop scheduling policy when "runtime" is used as the schedule kind in the OpenMP directive
    OMP_SET_MAX_ACTIVE_LEVELS     Sets the maximum number of nested parallel regions
    OMP_GET_MAX_ACTIVE_LEVELS     Returns the maximum number of nested parallel regions
    OMP_GET_LEVEL                 Returns the current level of nested parallel regions
    OMP_GET_ANCESTOR_THREAD_NUM   Returns, for a given nested level of the current thread, the thread number of the ancestor thread
    OMP_GET_TEAM_SIZE             Returns, for a given nested level of the current thread, the size of the thread team
    OMP_GET_ACTIVE_LEVEL          Returns the number of nested, active parallel regions enclosing the task that contains the call
    OMP_IN_FINAL                  Returns true if the routine is executed in the final task region; otherwise it returns false
    OMP_INIT_LOCK                 Initializes a lock associated with the lock variable
    OMP_DESTROY_LOCK              Disassociates the given lock variable from any locks
    OMP_SET_LOCK                  Acquires ownership of a lock
    OMP_UNSET_LOCK                Releases a lock
    OMP_TEST_LOCK                 Attempts to set a lock, but does not block if the lock is unavailable
    OMP_INIT_NEST_LOCK            Initializes a nested lock associated with the lock variable
    OMP_DESTROY_NEST_LOCK         Disassociates the given nested lock variable from any locks
    OMP_SET_NEST_LOCK             Acquires ownership of a nested lock
    OMP_UNSET_NEST_LOCK           Releases a nested lock
    OMP_TEST_NEST_LOCK            Attempts to set a nested lock, but does not block if the lock is unavailable
    OMP_GET_WTIME                 Provides a portable wall clock timing routine
    OMP_GET_WTICK                 Returns a double-precision floating point value equal to the number of seconds between successive clock ticks

For C/C++, all of the run-time library routines are actual functions. For Fortran, some are actually functions, and some are subroutines. For example:

    Fortran   INTEGER FUNCTION OMP_GET_NUM_THREADS()

    C/C++     #include <omp.h>
              int omp_get_num_threads(void)

Note that for C/C++, you usually need to include the <omp.h> header file.
Fortran routines are not case sensitive, but C/C++ routines are.
For the Lock routines/functions:
The lock variable must be accessed only through the locking routines.
For Fortran, the lock variable should be of type integer and of a kind large enough to hold an address.
For C/C++, the lock variable must have type omp_lock_t or type omp_nest_lock_t, depending on the function being used.
Implementation notes:
Implementations may or may not support all OpenMP API features. For example, if nested parallelism is supported, it may be only nominal, in that a nested parallel region may only have one thread.
Consult your implementation's documentation for details, or experiment and find out for yourself if you can't find it in the documentation.
The run-time library routines are discussed in more detail in Appendix A.

Environment Variables
OpenMP provides the following environment variables for controlling the execution of parallel code.
All environment variable names are uppercase. The values assigned to them are not case sensitive.
OMP_SCHEDULE
Applies only to DO, PARALLEL DO (Fortran) and for, parallel for (C/C++) directives which have their schedule clause set to RUNTIME. The value of this variable determines how iterations of the loop are scheduled on processors. For example:

    setenv OMP_SCHEDULE "guided, 4"
    setenv OMP_SCHEDULE "dynamic"

OMP_NUM_THREADS
Sets the maximum number of threads to use during execution. For example:

    setenv OMP_NUM_THREADS 8

OMP_DYNAMIC
Enables or disables dynamic adjustment of the number of threads available for execution of parallel regions. Valid values are TRUE or FALSE. For example:

    setenv OMP_DYNAMIC TRUE

Implementation notes:
Your implementation may or may not support this feature.
OMP_PROC_BIND
Enables or disables threads binding to processors. Valid values are TRUE or FALSE. For example:

    setenv OMP_PROC_BIND TRUE

Implementation notes:
Your implementation may or may not support this feature.
OMP_NESTED
Enables or disables nested parallelism. Valid values are TRUE or FALSE. For example:

    setenv OMP_NESTED TRUE

Implementation notes:
Your implementation may or may not support this feature. If nested parallelism is supported, it is often only nominal, in that a nested parallel region may only have one thread.
OMP_STACKSIZE
Controls the size of the stack for created (non-Master) threads. Examples:

    setenv OMP_STACKSIZE 2000500B
    setenv OMP_STACKSIZE "3000 k"
    setenv OMP_STACKSIZE 10M
    setenv OMP_STACKSIZE "10M"
    setenv OMP_STACKSIZE "20m"
    setenv OMP_STACKSIZE "1G"
    setenv OMP_STACKSIZE 20000

Implementation notes:
Your implementation may or may not support this feature.
OMP_WAIT_POLICY
Provides a hint to an OpenMP implementation about the desired behavior of waiting threads. A compliant OpenMP implementation may or may not abide by the setting of the environment variable. Valid values are ACTIVE and PASSIVE. ACTIVE specifies that waiting threads should mostly be active, i.e., consume processor cycles, while waiting. PASSIVE specifies that waiting threads should mostly be passive, i.e., not consume processor cycles, while waiting. The details of the ACTIVE and PASSIVE behaviors are implementation defined. Examples:

    setenv OMP_WAIT_POLICY ACTIVE
    setenv OMP_WAIT_POLICY active
    setenv OMP_WAIT_POLICY PASSIVE
    setenv OMP_WAIT_POLICY passive

Implementation notes:
Your implementation may or may not support this feature.
OMP_MAX_ACTIVE_LEVELS
Controls the maximum number of nested active parallel regions. The value of this environment variable must be a non-negative integer. The behavior of the program is implementation defined if the requested value of OMP_MAX_ACTIVE_LEVELS is greater than the maximum number of nested active parallel levels an implementation can support, or if the value is not a non-negative integer. Example:

    setenv OMP_MAX_ACTIVE_LEVELS 2

Implementation notes:
Your implementation may or may not support this feature.
OMP_THREAD_LIMIT
Sets the number of OpenMP threads to use for the whole OpenMP program. The value of this environment variable must be a positive integer. The behavior of the program is implementation defined if the requested value of OMP_THREAD_LIMIT is greater than the number of threads an implementation can support, or if the value is not a positive integer. Example:

    setenv OMP_THREAD_LIMIT 8

Implementation notes:
Your implementation may or may not support this feature.

Thread Stack Size and Thread Binding
Thread Stack Size:
The OpenMP standard does not specify how much stack space a thread should have. Consequently, implementations will differ in the default thread stack size.
Default thread stack size can be easy to exhaust. It can also be non-portable between compilers. Using past versions of LC compilers as an example:

    Compiler              Approx. Stack Limit   Approx. Array Size (doubles)
    Linux icc, ifort      4 MB                  700 x 700
    Linux pgcc, pgf90     8 MB                  1000 x 1000
    Linux gcc, gfortran   2 MB                  500 x 500

Threads that exceed their stack allocation may or may not seg fault. An application may continue to run while data is being corrupted.
Statically linked codes may be subject to further stack restrictions.
A user's login shell may also restrict stack size.
If your OpenMP environment supports the OpenMP 3.0 OMP_STACKSIZE environment variable (covered in previous section), you can use it to set the thread stack size prior to program execution. For example:
    setenv OMP_STACKSIZE 2000500B
    setenv OMP_STACKSIZE "3000 k"
    setenv OMP_STACKSIZE 10M
    setenv OMP_STACKSIZE "10M"
    setenv OMP_STACKSIZE "20m"
    setenv OMP_STACKSIZE "1G"
    setenv OMP_STACKSIZE 20000

Otherwise, at LC, you should be able to use the method below for Linux clusters. The example shows setting the thread stack size to 12 MB, and as a precaution, setting the shell stack size to unlimited.

    csh/tcsh      setenv KMP_STACKSIZE 12000000
                  limit stacksize unlimited

    ksh/sh/bash   export KMP_STACKSIZE=12000000
                  ulimit -s unlimited

Thread Binding:
In some cases, a program will perform better if its threads are bound to processors/cores.
"Binding" a thread to a processor means that a thread will be scheduled by the operating system to always run on the same processor. Otherwise, threads can be scheduled to execute on any processor and "bounce" back and forth between processors with each time slice.
Also called "thread affinity" or "processor affinity".
Binding threads to processors can result in better cache utilization, thereby reducing costly memory accesses. This is the primary motivation for binding threads to processors.
Depending upon your platform, operating system, compiler and OpenMP implementation, binding threads to processors can be done several different ways.
The OpenMP version 3.1 API provides an environment variable to turn processor binding "on" or "off". For example:

    setenv OMP_PROC_BIND TRUE
    setenv OMP_PROC_BIND FALSE

At a higher level, processes can also be bound to processors.
Detailed information about process and thread binding to processors on LC Linux clusters can be found HERE.

Monitoring, Debugging and Performance Analysis Tools for OpenMP
Monitoring and Debugging Threads:
Debuggers vary in their ability to handle threads. The TotalView debugger is LC's recommended debugger for parallel programs. It is well suited for both monitoring and debugging threaded programs.
An example screenshot from a TotalView session using an OpenMP code is shown below.
1. Master thread Stack Trace Pane showing original routine
2. Process/thread status bars differentiating threads
3. Master thread Stack Frame Pane showing shared variables
4. Worker thread Stack Trace Pane showing outlined routine
5. Worker thread Stack Frame Pane
6. Root Window showing all threads
7. Threads Pane showing all threads plus selected thread

See the TotalView Debugger tutorial for details.
The Linux ps command provides several flags for viewing thread information. Some examples are shown below. See the man page for details.
    % ps -Lf
    UID      PID   PPID    LWP  C NLWP STIME TTY      TIME     CMD
    blaise 22529  28240  22529  0    5 11:31 pts/53   00:00:00 a.out
    blaise 22529  28240  22530 99    5 11:31 pts/53   00:01:24 a.out
    blaise 22529  28240  22531 99    5 11:31 pts/53   00:01:24 a.out
    blaise 22529  28240  22532 99    5 11:31 pts/53   00:01:24 a.out
    blaise 22529  28240  22533 99    5 11:31 pts/53   00:01:24 a.out

    % ps -T
      PID  SPID TTY      TIME     CMD
    22529 22529 pts/53   00:00:00 a.out
    22529 22530 pts/53   00:01:49 a.out
    22529 22531 pts/53   00:01:49 a.out
    22529 22532 pts/53   00:01:49 a.out
    22529 22533 pts/53   00:01:49 a.out

    % ps -Lm
      PID   LWP TTY      TIME     CMD
    22529     - pts/53   00:18:56 a.out
        - 22529 -        00:00:00 -
        - 22530 -        00:04:44 -
        - 22531 -        00:04:44 -
        - 22532 -        00:04:44 -
        - 22533 -        00:04:44 -
LC's Linux clusters also provide the top command to monitor processes on a node. If used with the -H flag, the threads contained within a process will be visible. An example of the top -H command is shown below. The parent process is PID 18010, which spawned three threads, shown as PIDs 18012, 18013 and 18014.

Performance Analysis Tools:
There are a variety of performance analysis tools that can be used with OpenMP programs. Searching the web will turn up a wealth of information.
At LC, the list of supported computing tools can be found at: computing.llnl.gov/code/content/software_tools.php.
These tools vary significantly in their complexity, functionality and learning curve. Covering them in detail is beyond the scope of this tutorial.
Some tools worth investigating, specifically for OpenMP codes, include:
Open|SpeedShop
TAU
PAPI
Intel VTune Amplifier
ThreadSpotter

OpenMP Exercise 3
Assorted
Overview:
Login to the workshop cluster, if you are not already logged in
Orphaned directive example: review, compile, run
Get OpenMP implementation environment information
Check out the "bug" programs

GO TO THE EXERCISE HERE

This completes the tutorial.
Please complete the online evaluation form - unless you are doing the exercise, in which case please complete it at the end of the exercise.


References and More Information
Author: Blaise Barney, Livermore Computing.
The OpenMP web site, which includes the C/C++ and Fortran Application Program Interface documents:
www.openmp.org

Appendix A: Run-Time Library Routines

OMP_SET_NUM_THREADS
Purpose:
Sets the number of threads that will be used in the next parallel region. Must be a positive integer.
Format:

    Fortran   SUBROUTINE OMP_SET_NUM_THREADS(scalar_integer_expression)

    C/C++     #include <omp.h>
              void omp_set_num_threads(int num_threads)

Notes & Restrictions:
The dynamic threads mechanism modifies the effect of this routine.
Enabled: specifies the maximum number of threads that can be used for any parallel region by the dynamic threads mechanism.
Disabled: specifies exact number of threads to use until next call to this routine.
This routine can only be called from the serial portions of the code.
This call has precedence over the OMP_NUM_THREADS environment variable.

OMP_GET_NUM_THREADS
Purpose:
Returns the number of threads that are currently in the team executing the parallel region from which it is called.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_NUM_THREADS()

    C/C++     #include <omp.h>
              int omp_get_num_threads(void)

Notes & Restrictions:
If this call is made from a serial portion of the program, or a nested parallel region that is serialized, it will return 1.
The default number of threads is implementation dependent.

OMP_GET_MAX_THREADS
Purpose:
Returns the maximum value that can be returned by a call to the OMP_GET_NUM_THREADS function.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_MAX_THREADS()

    C/C++     #include <omp.h>
              int omp_get_max_threads(void)

Notes & Restrictions:
Generally reflects the number of threads as set by the OMP_NUM_THREADS environment variable or the OMP_SET_NUM_THREADS() library routine.
May be called from both serial and parallel regions of code.

OMP_GET_THREAD_NUM
Purpose:
Returns the thread number of the thread, within the team, making this call. This number will be between 0 and OMP_GET_NUM_THREADS-1. The master thread of the team is thread 0.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_THREAD_NUM()

    C/C++     #include <omp.h>
              int omp_get_thread_num(void)

Notes & Restrictions:
If called from a nested parallel region, or a serial region, this function will return 0.
Examples:
Example 1 is the correct way to determine the number of threads in a parallel region.
Example 2 is incorrect - the TID variable must be PRIVATE.
Example 3 is incorrect - the OMP_GET_THREAD_NUM call is outside the parallel region.
Fortran - determining the number of threads in a parallel region

Example 1: Correct

      PROGRAM HELLO

      INTEGER TID, OMP_GET_THREAD_NUM

!$OMP PARALLEL PRIVATE(TID)
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread ', TID
      ...
!$OMP END PARALLEL

      END

Example 2: Incorrect

      PROGRAM HELLO

      INTEGER TID, OMP_GET_THREAD_NUM

!$OMP PARALLEL
      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread ', TID
      ...
!$OMP END PARALLEL

      END

Example 3: Incorrect

      PROGRAM HELLO

      INTEGER TID, OMP_GET_THREAD_NUM

      TID = OMP_GET_THREAD_NUM()
      PRINT *, 'Hello World from thread ', TID

!$OMP PARALLEL
      ...
!$OMP END PARALLEL

      END

OMP_GET_THREAD_LIMIT
Purpose:
Returns the maximum number of OpenMP threads available to a program.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_THREAD_LIMIT()

    C/C++     #include <omp.h>
              int omp_get_thread_limit(void)

Notes:
Also see the OMP_THREAD_LIMIT environment variable.

OMP_GET_NUM_PROCS
Purpose:
Returns the number of processors that are available to the program.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_NUM_PROCS()

    C/C++     #include <omp.h>
              int omp_get_num_procs(void)

OMP_IN_PARALLEL
Purpose:
May be called to determine if the section of code which is executing is parallel or not.
Format:

    Fortran   LOGICAL FUNCTION OMP_IN_PARALLEL()

    C/C++     #include <omp.h>
              int omp_in_parallel(void)

Notes & Restrictions:
For Fortran, this function returns .TRUE. if it is called from the dynamic extent of a region executing in parallel, and .FALSE. otherwise. For C/C++, it will return a non-zero integer if parallel, and zero otherwise.

OMP_SET_DYNAMIC
Purpose:
Enables or disables dynamic adjustment (by the run time system) of the number of threads available for execution of parallel regions.
Format:

    Fortran   SUBROUTINE OMP_SET_DYNAMIC(scalar_logical_expression)

    C/C++     #include <omp.h>
              void omp_set_dynamic(int dynamic_threads)

Notes & Restrictions:
For Fortran, if called with .TRUE. then the number of threads available for subsequent parallel regions can be adjusted automatically by the run-time environment. If called with .FALSE., dynamic adjustment is disabled.
For C/C++, if dynamic_threads evaluates to non-zero, then the mechanism is enabled; otherwise it is disabled.
The OMP_SET_DYNAMIC subroutine has precedence over the OMP_DYNAMIC environment variable.
The default setting is implementation dependent.
Must be called from a serial section of the program.

OMP_GET_DYNAMIC
Purpose:
Used to determine if dynamic thread adjustment is enabled or not.
Format:

    Fortran   LOGICAL FUNCTION OMP_GET_DYNAMIC()

    C/C++     #include <omp.h>
              int omp_get_dynamic(void)

Notes & Restrictions:
For Fortran, this function returns .TRUE. if dynamic thread adjustment is enabled, and .FALSE. otherwise.
For C/C++, non-zero will be returned if dynamic thread adjustment is enabled, and zero otherwise.

OMP_SET_NESTED
Purpose:
Used to enable or disable nested parallelism.
Format:

    Fortran   SUBROUTINE OMP_SET_NESTED(scalar_logical_expression)

    C/C++     #include <omp.h>
              void omp_set_nested(int nested)

Notes & Restrictions:
For Fortran, calling this function with .FALSE. will disable nested parallelism, and calling with .TRUE. will enable it.
For C/C++, if nested evaluates to non-zero, nested parallelism is enabled; otherwise it is disabled.
The default is for nested parallelism to be disabled.
This call has precedence over the OMP_NESTED environment variable.

OMP_GET_NESTED
Purpose:
Used to determine if nested parallelism is enabled or not.
Format:

    Fortran   LOGICAL FUNCTION OMP_GET_NESTED()

    C/C++     #include <omp.h>
              int omp_get_nested(void)

Notes & Restrictions:
For Fortran, this function returns .TRUE. if nested parallelism is enabled, and .FALSE. otherwise.
For C/C++, non-zero will be returned if nested parallelism is enabled, and zero otherwise.

OMP_SET_SCHEDULE
Purpose:
This routine sets the schedule type that is applied when the loop directive specifies a runtime schedule.
Format:

    Fortran   SUBROUTINE OMP_SET_SCHEDULE(KIND, MODIFIER)
              INTEGER (KIND=OMP_SCHED_KIND) KIND
              INTEGER MODIFIER

    C/C++     #include <omp.h>
              void omp_set_schedule(omp_sched_t kind, int modifier)

OMP_GET_SCHEDULE
Purpose:
This routine returns the schedule that is applied when the loop directive specifies a runtime schedule.
Format:

    Fortran   SUBROUTINE OMP_GET_SCHEDULE(KIND, MODIFIER)
              INTEGER (KIND=OMP_SCHED_KIND) KIND
              INTEGER MODIFIER

    C/C++     #include <omp.h>
              void omp_get_schedule(omp_sched_t * kind, int * modifier)

OMP_SET_MAX_ACTIVE_LEVELS
Purpose:
This routine limits the number of nested active parallel regions.
Format:

    Fortran   SUBROUTINE OMP_SET_MAX_ACTIVE_LEVELS(MAX_LEVELS)
              INTEGER MAX_LEVELS

    C/C++     #include <omp.h>
              void omp_set_max_active_levels(int max_levels)

Notes & Restrictions:
If the number of parallel levels requested exceeds the number of levels of parallelism supported by the implementation, the value will be set to the number of parallel levels supported by the implementation.
This routine has the described effect only when called from the sequential part of the program. When called from within an explicit parallel region, the effect of this routine is implementation defined.

OMP_GET_MAX_ACTIVE_LEVELS
Purpose:
This routine returns the maximum number of nested active parallel regions.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_MAX_ACTIVE_LEVELS()

    C/C++     #include <omp.h>
              int omp_get_max_active_levels(void)

OMP_GET_LEVEL
Purpose:
This routine returns the number of nested parallel regions enclosing the task that contains the call.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_LEVEL()

    C/C++     #include <omp.h>
              int omp_get_level(void)

Notes & Restrictions:
The omp_get_level routine returns the number of nested parallel regions (whether active or inactive) enclosing the task that contains the call, not including the implicit parallel region. The routine always returns a non-negative integer, and returns 0 if it is called from the sequential part of the program.

OMP_GET_ANCESTOR_THREAD_NUM
Purpose:
This routine returns, for a given nested level of the current thread, the thread number of the ancestor or the current thread.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_ANCESTOR_THREAD_NUM(LEVEL)
              INTEGER LEVEL

    C/C++     #include <omp.h>
              int omp_get_ancestor_thread_num(int level)

Notes & Restrictions:
If the requested nest level is outside the range of 0 and the nest level of the current thread, as returned by the omp_get_level routine, the routine returns -1.

OMP_GET_TEAM_SIZE
Purpose:
This routine returns, for a given nested level of the current thread, the size of the thread team to which the ancestor or the current thread belongs.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_TEAM_SIZE(LEVEL)
              INTEGER LEVEL

    C/C++     #include <omp.h>
              int omp_get_team_size(int level);

Notes & Restrictions:
If the requested nested level is outside the range of 0 and the nested level of the current thread, as returned by the omp_get_level routine, the routine returns -1. Inactive parallel regions are regarded like active parallel regions executed with one thread.

OMP_GET_ACTIVE_LEVEL
Purpose:
The omp_get_active_level routine returns the number of nested, active parallel regions enclosing the task that contains the call.
Format:

    Fortran   INTEGER FUNCTION OMP_GET_ACTIVE_LEVEL()

    C/C++     #include <omp.h>
              int omp_get_active_level(void);

Notes & Restrictions:
The routine always returns a non-negative integer, and returns 0 if it is called from the sequential part of the program.

OMP_IN_FINAL
Purpose:
This routine returns true if the routine is executed in a final task region; otherwise, it returns false.
Format:

    Fortran   LOGICAL FUNCTION OMP_IN_FINAL()

    C/C++     #include <omp.h>
              int omp_in_final(void)

OMP_INIT_LOCK
OMP_INIT_NEST_LOCK
Purpose:
This subroutine initializes a lock associated with the lock variable.
Format:

    Fortran   SUBROUTINE OMP_INIT_LOCK(var)
              SUBROUTINE OMP_INIT_NEST_LOCK(var)

    C/C++     #include <omp.h>
              void omp_init_lock(omp_lock_t *lock)
              void omp_init_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
The initial state is unlocked.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_DESTROY_LOCK
OMP_DESTROY_NEST_LOCK
Purpose:
This subroutine disassociates the given lock variable from any locks.
Format:

    Fortran   SUBROUTINE OMP_DESTROY_LOCK(var)
              SUBROUTINE OMP_DESTROY_NEST_LOCK(var)

    C/C++     #include <omp.h>
              void omp_destroy_lock(omp_lock_t *lock)
              void omp_destroy_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
It is illegal to call this routine with a lock variable that is not initialized.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_SET_LOCK
OMP_SET_NEST_LOCK
Purpose:
This subroutine forces the executing thread to wait until the specified lock is available. A thread is granted ownership of a lock when it becomes available.
Format:

    Fortran   SUBROUTINE OMP_SET_LOCK(var)
              SUBROUTINE OMP_SET_NEST_LOCK(var)

    C/C++     #include <omp.h>
              void omp_set_lock(omp_lock_t *lock)
              void omp_set_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
It is illegal to call this routine with a lock variable that is not initialized.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_UNSET_LOCK
OMP_UNSET_NEST_LOCK
Purpose:
This subroutine releases the lock from the executing thread.
Format:

    Fortran   SUBROUTINE OMP_UNSET_LOCK(var)
              SUBROUTINE OMP_UNSET_NEST_LOCK(var)

    C/C++     #include <omp.h>
              void omp_unset_lock(omp_lock_t *lock)
              void omp_unset_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
It is illegal to call this routine with a lock variable that is not initialized.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.

OMP_TEST_LOCK
OMP_TEST_NEST_LOCK
Purpose:
This routine attempts to set a lock, but does not block if the lock is unavailable.
Format:

    Fortran   LOGICAL FUNCTION OMP_TEST_LOCK(var)
              INTEGER FUNCTION OMP_TEST_NEST_LOCK(var)

    C/C++     #include <omp.h>
              int omp_test_lock(omp_lock_t *lock)
              int omp_test_nest_lock(omp_nest_lock_t *lock)

Notes & Restrictions:
For Fortran, .TRUE. is returned if the lock was set successfully, otherwise .FALSE. is returned.
For Fortran, var must be an integer large enough to hold an address, such as INTEGER*8 on 64-bit systems.
For C/C++, non-zero is returned if the lock was set successfully, otherwise zero is returned.
It is illegal to call this routine with a lock variable that is not initialized.

OMP_GET_WTIME
Purpose:
Provides a portable wall clock timing routine.
Returns a double-precision floating point value equal to the number of elapsed seconds since some point in the past.
Usually used in "pairs", with the value of the first call subtracted from the value of the second call to obtain the elapsed time for a block of code.
Designed to be "per thread" times, and therefore may not be globally consistent across all threads in a team - depends upon what a thread is doing compared to other threads.
Format:

    Fortran   DOUBLE PRECISION FUNCTION OMP_GET_WTIME()

    C/C++     #include <omp.h>
              double omp_get_wtime(void)

OMP_GET_WTICK
Purpose:
Provides a portable wall clock timing routine.
Returns a double-precision floating point value equal to the number of seconds between successive clock ticks.
Format:

    Fortran   DOUBLE PRECISION FUNCTION OMP_GET_WTICK()

    C/C++     #include <omp.h>
              double omp_get_wtick(void)

https://computing.llnl.gov/tutorials/openMP/
Last Modified: 06/07/2016 01:08:02 blaiseb@llnl.gov
UCRL-MI-133316
