Compaq Fortran
User Manual for
Tru64 UNIX and
Linux Alpha Systems




Appendix D
Parallel Library Routines

Note

This appendix applies only to Compaq Fortran on Tru64 UNIX systems.

This appendix contains the following sections:

  • Section D.1, OpenMP Fortran API Run-Time Library Routines
  • Section D.2, Other Parallel Threads Routines

This appendix summarizes the library routines available for use with directed parallel decomposition requested by the -mp and -omp compiler options.

Where applicable, new applications should call run-time parallel library routines using the OpenMP Fortran API format. (See Section D.1, OpenMP Fortran API Run-Time Library Routines.) For compatibility with existing programs, the Compaq Fortran compiler recognizes equivalent routines of the formats described in Section D.2. Thus, for example, if your program calls _OtsGetNumThreads, the Compaq Fortran compiler interprets that as a call to omp_get_num_threads.

D.1 OpenMP Fortran API Run-Time Library Routines

This section describes:

  • Library routines that control and query the parallel execution environment (Section D.1.1)
  • General-purpose lock routines (Section D.1.2)

Table D-1 lists the supported OpenMP Fortran API run-time library routines. These routines are all external procedures.

Table D-1 OpenMP Fortran API Run-Time Library Routines
Routine Name Usage
Library Routines That Control and Query the Parallel Execution Environment
omp_get_dynamic Inform if dynamic thread adjustment is enabled. See Section D.1.1.1, omp_get_dynamic.
omp_get_max_threads Get the maximum value that can be returned by calls to the omp_get_num_threads() function. See Section D.1.1.2, omp_get_max_threads.
omp_get_nested Inform if nested parallelism is enabled. See Section D.1.1.3, omp_get_nested.
omp_get_num_procs Get the number of processors that are available to the program. See Section D.1.1.4, omp_get_num_procs.
omp_get_num_threads Get the number of threads currently in the team executing the parallel region from which the routine is called. See Section D.1.1.5, omp_get_num_threads.
omp_get_thread_num Get the thread number, within the team, in the range from zero to omp_get_num_threads() - 1. See Section D.1.1.6, omp_get_thread_num.
omp_in_parallel Inform whether or not a region is executing in parallel. See Section D.1.1.7, omp_in_parallel.
omp_set_dynamic Enable or disable dynamic adjustment of the number of threads available for execution of parallel regions. See Section D.1.1.8, omp_set_dynamic.
omp_set_nested Enable or disable nested parallelism. See Section D.1.1.9, omp_set_nested.
omp_set_num_threads Set the number of threads to use for the next parallel region. See Section D.1.1.10, omp_set_num_threads.
General-Purpose Lock Routines
omp_destroy_lock Disassociate a lock variable from any locks. See Section D.1.2.1.
omp_init_lock Initialize a lock to be used in subsequent calls. See Section D.1.2.2.
omp_set_lock Make the executing thread wait until the specified lock is available. See Section D.1.2.3.
omp_test_lock Try to set the lock associated with a lock variable. See Section D.1.2.4.
omp_unset_lock Release the executing thread from ownership of a lock. See Section D.1.2.5.

D.1.1 Library Routines That Control and Query the Parallel Execution Environment

These routines are described in detail in the following sections.

D.1.1.1 omp_get_dynamic

Determines the status of dynamic thread adjustment.

Syntax:


INTERFACE 
    LOGICAL FUNCTION omp_get_dynamic () 
    END FUNCTION omp_get_dynamic 
END INTERFACE 
LOGICAL result 
result = omp_get_dynamic () 

Return Values:

This function returns TRUE if dynamic thread adjustment is enabled; otherwise it returns FALSE . The function always returns FALSE if dynamic adjustment of the number of threads is not implemented.

See Also:

Section D.1.1.8, omp_set_dynamic

D.1.1.2 omp_get_max_threads

Returns the maximum value that can be returned by calls to the
omp_get_num_threads() function.

Syntax:


INTERFACE 
    INTEGER FUNCTION omp_get_max_threads () 
    END FUNCTION omp_get_max_threads 
END INTERFACE 
INTEGER result 
result = omp_get_max_threads () 

Description:

If your program uses omp_set_num_threads() to change the number of threads, subsequent calls to omp_get_max_threads() return the new value. When dynamic thread adjustment is enabled (see the omp_set_dynamic() routine), you can use omp_get_max_threads() to allocate data structures that are maximally sized for each thread.

This function has global scope.

Return Values:

This function returns the maximum value whether executing from a serial region or from a parallel region.

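As an illustrative sketch (the program and variable names are hypothetical, not from this manual), omp_get_max_threads() can size a workspace with one slot per possible thread before the parallel region starts:

```fortran
      PROGRAM MAX_THREADS_EXAMPLE
      IMPLICIT NONE
      INTEGER omp_get_max_threads, omp_get_thread_num
      EXTERNAL omp_get_max_threads, omp_get_thread_num
      REAL, ALLOCATABLE :: WORKSPACE(:)
      INTEGER NMAX
      NMAX = omp_get_max_threads()
      ALLOCATE (WORKSPACE(NMAX))      ! one slot per possible thread
!$OMP PARALLEL
      ! each thread writes only its own slot (thread numbers start at 0)
      WORKSPACE(omp_get_thread_num()+1) = 0.0
!$OMP END PARALLEL
      DEALLOCATE (WORKSPACE)
      END
```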

See Also:

Section D.1.1.10, omp_set_num_threads
Section D.1.1.8, omp_set_dynamic

D.1.1.3 omp_get_nested

Determines the status of nested parallelism.

Syntax:


INTERFACE 
    LOGICAL FUNCTION omp_get_nested () 
    END FUNCTION omp_get_nested 
END INTERFACE 
LOGICAL result 
result = omp_get_nested () 

Return Values:

This function returns TRUE if nested parallelism is enabled. If nested parallelism is disabled it returns FALSE . The function always returns FALSE if nested parallelism is not implemented.

See Also:

Section D.1.1.9, omp_set_nested

D.1.1.4 omp_get_num_procs

Returns the number of processors that are available to the program.

Syntax:


INTERFACE 
    INTEGER FUNCTION omp_get_num_procs () 
    END FUNCTION omp_get_num_procs 
END INTERFACE 
INTEGER result 
result = omp_get_num_procs () 

Return Values:

This function returns an integer value indicating the number of processors your program has available.

D.1.1.5 omp_get_num_threads

Returns the number of threads currently in the team executing the parallel region from which it is called.

Syntax:


INTERFACE 
    INTEGER FUNCTION omp_get_num_threads () 
    END FUNCTION omp_get_num_threads 
END INTERFACE 
INTEGER result 
result = omp_get_num_threads () 

Description:

This function interacts with the omp_set_num_threads call and the OMP_NUM_THREADS environment variable that control the number of threads in a team. If the number of threads has not been explicitly set by the user, the default is implementation dependent.

The omp_get_num_threads function binds to the closest enclosing PARALLEL directive (see Chapter 6, Parallel Compiler Directives and Their Programming Environment). It returns 1 if the call is made from the serial portion of a program, or from a nested parallel region that is serialized.
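The binding rule can be seen in a short sketch (program name is illustrative): the same call returns 1 from serial code and the team size from inside a parallel region.

```fortran
      PROGRAM NUM_THREADS_EXAMPLE
      IMPLICIT NONE
      INTEGER omp_get_num_threads
      EXTERNAL omp_get_num_threads
      PRINT *, 'SERIAL: ', omp_get_num_threads()      ! prints 1
!$OMP PARALLEL
!$OMP MASTER
      ! inside the region, returns the size of the executing team
      PRINT *, 'PARALLEL: ', omp_get_num_threads()
!$OMP END MASTER
!$OMP END PARALLEL
      END
```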

See Also:

Section D.1.1.10, omp_set_num_threads
OMP_NUM_THREADS environment variable in Table 6-4, OpenMP Fortran API Environment Variables

D.1.1.6 omp_get_thread_num

Returns the thread number within the team.

Syntax:


INTERFACE 
    INTEGER FUNCTION omp_get_thread_num () 
    END FUNCTION omp_get_thread_num 
END INTERFACE 
INTEGER result 
result = omp_get_thread_num () 

Description:

This function binds to the closest enclosing PARALLEL directive (see Chapter 6, Parallel Compiler Directives and Their Programming Environment). The master thread of the team is thread zero.

Return Values:

The value returned ranges from zero to omp_get_num_threads() - 1. The function returns zero when called from a serial region or from within a nested parallel region that is serialized.
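A minimal sketch (names are illustrative) distinguishing the master thread from the slave threads by thread number:

```fortran
      PROGRAM THREAD_NUM_EXAMPLE
      IMPLICIT NONE
      INTEGER ID
      INTEGER omp_get_thread_num
      EXTERNAL omp_get_thread_num
!$OMP PARALLEL PRIVATE(ID)
      ID = omp_get_thread_num()
      IF (ID .EQ. 0) THEN
          PRINT *, 'MASTER THREAD'           ! thread zero is the master
      ELSE
          PRINT *, 'SLAVE THREAD ', ID
      END IF
!$OMP END PARALLEL
      END
```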

See Also:

Section D.1.1.5, omp_get_num_threads
Section D.1.1.10, omp_set_num_threads

D.1.1.7 omp_in_parallel

Returns whether or not a region is executing in parallel.

Syntax:


INTERFACE 
    LOGICAL FUNCTION omp_in_parallel () 
    END FUNCTION omp_in_parallel 
END INTERFACE 
LOGICAL result 
result = omp_in_parallel() 

Description:

This function has global scope.

Return Values:

This function returns TRUE if it is called from the dynamic extent of a region executing in parallel, even if nested regions exist that may be serialized; otherwise it returns FALSE . A parallel region that is serialized is not considered to be a region executing in parallel.
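Because the function has global scope, a routine can use it to adapt its behavior to the calling context. A sketch (the subroutine name is hypothetical):

```fortran
      SUBROUTINE CHECKED_WORK
      IMPLICIT NONE
      LOGICAL omp_in_parallel
      EXTERNAL omp_in_parallel
      ! TRUE anywhere in the dynamic extent of a parallel region
      IF (omp_in_parallel()) THEN
          PRINT *, 'CALLED FROM A PARALLEL REGION'
      ELSE
          PRINT *, 'CALLED FROM SERIAL CODE'
      END IF
      END
```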

D.1.1.8 omp_set_dynamic

Enables or disables dynamic adjustment of the number of threads available for execution in a parallel region.

Syntax:


INTERFACE 
    SUBROUTINE omp_set_dynamic (enable) 
    LOGICAL enable 
    END SUBROUTINE omp_set_dynamic 
END INTERFACE 
LOGICAL scalar_logical_expression 
CALL omp_set_dynamic (scalar_logical_expression) 

Description:

To obtain the best use of system resources, certain run-time environments automatically adjust the number of threads that are used for executing subsequent parallel regions. This adjustment is enabled only if the value of the scalar logical expression to omp_set_dynamic is TRUE . Dynamic adjustment is disabled if the value of the scalar logical expression is FALSE .

When dynamic adjustment is enabled, the number of threads specified by the user becomes the maximum thread count. The number of threads remains fixed throughout each parallel region and is reported by the omp_get_num_threads() function.

A call to omp_set_dynamic overrides the OMP_DYNAMIC environment variable.

The default for dynamic thread adjustment is implementation dependent. Code that depends on a specific number of threads for correct execution should explicitly disable dynamic threads. Implementations are not required to provide the ability to dynamically adjust the number of threads, but they are required to provide the interface in order to support portability across platforms.
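For example, a program that requires an exact thread count might disable dynamic adjustment before requesting a team size (a sketch; the count of 4 is arbitrary):

```fortran
      PROGRAM FIXED_TEAM
      IMPLICIT NONE
      ! disable dynamic adjustment so the requested count is honored
      CALL omp_set_dynamic(.FALSE.)
      CALL omp_set_num_threads(4)
!$OMP PARALLEL
      ! with dynamic adjustment disabled, the team has the requested size
!$OMP END PARALLEL
      END
```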

See Also:

Section D.1.1.1, omp_get_dynamic
Section D.1.1.5, omp_get_num_threads
OMP_DYNAMIC environment variable in Table 6-4, OpenMP Fortran API Environment Variables

D.1.1.9 omp_set_nested

Enables or disables nested parallelism.

Syntax:


INTERFACE 
    SUBROUTINE omp_set_nested (enable) 
    LOGICAL enable 
    END SUBROUTINE omp_set_nested 
END INTERFACE 
LOGICAL scalar_logical_expression 
CALL omp_set_nested (scalar_logical_expression) 

Description:

If the value of the scalar logical expression is FALSE , nested parallelism is disabled, and nested parallel regions are serialized and executed by the current thread. This is the default. If the value of the scalar logical expression is set to TRUE , nested parallelism is enabled, and parallel regions that are nested can deploy additional threads to form the team.

A call to omp_set_nested overrides the OMP_NESTED environment variable.

When nested parallelism is enabled, the number of threads used to execute the nested parallel regions is implementation dependent. This allows implementations that comply with the OpenMP standard to serialize nested parallel regions, even when nested parallelism is enabled.
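A sketch of enabling nested parallelism (program name is illustrative); note that, as described above, a conforming implementation may still serialize the inner region:

```fortran
      PROGRAM NESTED_EXAMPLE
      IMPLICIT NONE
      CALL omp_set_nested(.TRUE.)     ! allow inner regions their own teams
!$OMP PARALLEL
!$OMP PARALLEL
      ! with nesting enabled, this inner region may deploy additional
      ! threads; the implementation may also choose to serialize it
!$OMP END PARALLEL
!$OMP END PARALLEL
      END
```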

See Also:

Section D.1.1.3, omp_get_nested
OMP_NESTED environment variable in Table 6-4, OpenMP Fortran API Environment Variables

D.1.1.10 omp_set_num_threads

Sets the number of threads to use for the next parallel region.

Syntax:


INTERFACE 
    SUBROUTINE omp_set_num_threads (number_of_threads) 
    INTEGER number_of_threads 
    END SUBROUTINE omp_set_num_threads 
END INTERFACE 
INTEGER scalar_integer_expression 
CALL omp_set_num_threads (scalar_integer_expression) 

Description:

The compiler evaluates the scalar integer expression and interprets its value as the number of threads to use. This routine takes effect only when called from serial portions of the program. The behavior is undefined if the routine is called from a portion of the program where the omp_in_parallel function returns TRUE .

A call to omp_set_num_threads sets the maximum number of threads to use for the next parallel region when dynamic adjustment of the number of threads is enabled. A call to omp_set_num_threads overrides the OMP_NUM_THREADS environment variable.
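A minimal sketch (the value 8 is arbitrary): the call must be made from serial code, before the parallel region it is meant to affect.

```fortran
      PROGRAM SET_THREADS
      IMPLICIT NONE
      INTEGER N
      N = 8                          ! any scalar integer expression
      CALL omp_set_num_threads(N)    ! legal only from serial code
!$OMP PARALLEL
      ! the next parallel region uses at most N threads
!$OMP END PARALLEL
      END
```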

See Also:

Section D.1.1.5, omp_get_num_threads
Section D.1.1.7, omp_in_parallel
OMP_NUM_THREADS environment variable in Table 6-4, OpenMP Fortran API Environment Variables

D.1.2 General-Purpose Lock Routines

The OpenMP run-time library includes a set of general-purpose locking routines. Your program must not attempt to access any lock variable, var, except through the routines described in this section. The var lock variable is an integer of a KIND large enough to hold an address. On Compaq Tru64 UNIX systems, var should be declared as INTEGER(KIND=8).

The lock control routines must be called in a specific sequence:

  1. The lock to be associated with the lock variable must first be initialized.
  2. The associated lock is made available to the executing thread.
  3. The executing thread is released from lock ownership.
  4. When finished, the lock must always be disassociated from the lock variable.

A simple SET_LOCK and UNSET_LOCK combination satisfies this requirement. If you want your program to do useful work while waiting for the lock to become available, you can use the combination of TRY_LOCK and UNSET_LOCK instead. For example:


      PROGRAM LOCK_USAGE 
      implicit none 
      integer(kind=4) ID 
      include 'forompdef'       ! It's in /usr/include after installation 
      INTEGER(KIND=8) LCK       ! This variable should be of size POINTER 
      CALL OMP_INIT_LOCK(LCK) 
!$OMP PARALLEL SHARED(LCK) PRIVATE(ID) 
      ID = OMP_GET_THREAD_NUM() 
      CALL OMP_SET_LOCK(LCK) 
      PRINT *, 'MY THREAD ID IS ', ID 
      CALL OMP_UNSET_LOCK(LCK) 
      DO WHILE (.NOT. OMP_TEST_LOCK(LCK)) 
      CALL SKIP(ID) ! Do not yet have lock, do something else 
      END DO 
      CALL WORK(ID) ! Have the lock, now do work 
      CALL OMP_UNSET_LOCK(LCK) 
!$OMP END PARALLEL 
      CALL OMP_DESTROY_LOCK(LCK) 
      END 

The lock control routines are described in detail in the following sections.

D.1.2.1 omp_destroy_lock

Disassociates a given lock variable from any locks.

Syntax:


INTERFACE 
    SUBROUTINE omp_destroy_lock (var) 
    INTEGER(KIND=8) var 
    END SUBROUTINE omp_destroy_lock 
END INTERFACE 
INTEGER(KIND=8) v 
CALL omp_destroy_lock (v) 

Restriction:

Attempting to call this routine with a lock variable that has not been initialized is an invalid operation and will cause a run-time error.

D.1.2.2 omp_init_lock

Initializes a lock associated with a given lock variable for use in subsequent calls.

Syntax:


INTERFACE 
    SUBROUTINE omp_init_lock (var) 
    INTEGER(KIND=8) var 
    END SUBROUTINE omp_init_lock 
END INTERFACE 
INTEGER(KIND=8) v 
CALL omp_init_lock (v) 

Description:

The initial state of the lock variable v is unlocked.

Restriction:

Attempting to call this routine with a lock variable that is already associated with a lock is an invalid operation and will cause a run-time error.

D.1.2.3 omp_set_lock

Makes the executing thread wait until the specified lock is available.

Syntax:


INTERFACE 
    SUBROUTINE omp_set_lock (var) 
    INTEGER(KIND=8) var 
    END SUBROUTINE omp_set_lock 
END INTERFACE 
INTEGER(KIND=8) v 
CALL omp_set_lock (v) 

Description:

When the lock becomes available, the thread is granted ownership.

Restriction:

Attempting to call this routine with a lock variable that has not been initialized is an invalid operation and will cause a run-time error.

D.1.2.4 omp_test_lock

Tries to set the lock associated with the lock variable var.

Syntax:


INTERFACE 
    LOGICAL FUNCTION omp_test_lock (var) 
    INTEGER(KIND=8) var 
    END FUNCTION omp_test_lock 
END INTERFACE 
INTEGER(KIND=8) v 
LOGICAL result 
result = omp_test_lock (v) 

Return Values:

If the attempt to set the lock specified by the variable succeeds, the function returns TRUE ; otherwise it returns FALSE . In either case, the routine does not wait for the lock to become available.

Restriction:

Attempting to call this routine with a lock variable that has not been initialized is an invalid operation and will cause a run-time error.

D.1.2.5 omp_unset_lock

Releases the executing thread from ownership of the lock.

Syntax:


INTERFACE 
    SUBROUTINE omp_unset_lock (var) 
    INTEGER(KIND=8) var 
    END SUBROUTINE omp_unset_lock 
END INTERFACE 
INTEGER(KIND=8) v 
CALL omp_unset_lock (v) 

Description:

If the thread does not own the lock specified by the variable, the behavior is undefined.

Restriction:

Attempting to call this routine with a lock variable that has not been initialized is an invalid operation and will cause a run-time error.

D.2 Other Parallel Threads Routines

Note

Compaq Fortran supports the set of parallel thread routines described in this section for existing programs. For creating new programs, use the set of routines described in Section D.1, OpenMP Fortran API Run-Time Library Routines.

Table D-2, Other Parallel Threads Routines shows additional parallel threads routines. The _Otsxxx (Compaq spelling) and the mpc_xxx (compatibility spelling) routine names are equivalent. For example, calling _OtsGetNumThreads is the same as calling mpc_numthreads .

Table D-2 Other Parallel Threads Routines
Routine Name Description
_OtsGetMaxThreads
mpc_maxnumthreads
Return the number of threads that would normally be used for parallel processing in the current environment. This is affected by the environment variable MP_THREAD_COUNT , by the number of processes in the current process's processor set, and by any call to _OtsInitParallel . Invoke as an integer function. See Section D.2.1.
_OtsGetNumThreads
mpc_numthreads
Return the number of threads that are being used in the current parallel region (if running within one), or the number of threads that have been created so far (if not currently within a parallel region). Invoke as an integer function. See Section D.2.2.
_OtsGetThreadNum
mpc_my_threadnum
Return a number that identifies the current thread. The main thread is 0, and slave threads are numbered densely from 1. Invoke as an integer function. See Section D.2.3.
_OtsInitParallel Start slave threads for parallel processing if they have not yet been started implicitly (normally, the threads have been started by default at the first parallel region). Call as a subroutine with two arguments (see Section D.2.4):
  • The total number of threads desired (or specify zero to allow use of the environment variable MP_THREAD_COUNT or maximum number of processors).
  • A pointer to a pthreads attribute block, which can be used to control the attributes of the slave threads.
_OtsInParallel
mpc_in_parallel_region
Return 1 if you are currently within a parallel region, or 0 if not. Invoke as an integer function. See Section D.2.5.
_OtsSetNumThreads Set the number of threads to use for the next parallel region. See Section D.2.6.
_OtsStopWorkers
mpc_destroy
Stop any slave threads created by parallel library support. This routine cannot be called from within a parallel region. After this call, new slave threads will be implicitly created the next time a parallel region is encountered, or can be created explicitly by calling _OtsInitParallel . Call as a subroutine. See Section D.2.7.

To call the _Otsxxx or mpc_xxx routines, use the cDEC$ ALIAS directive (described in the Compaq Fortran Language Reference Manual) to handle the mixed-case naming convention and missing trailing underscore.

For example, to call the _OtsGetThreadNum routine with an alias of OtsGetThreadNum , use the following code:


      integer a(10) 
      INTERFACE 
          INTEGER FUNCTION OtsGetThreadNum () 
!DEC$     ALIAS OtsGetThreadNum, '_OtsGetThreadNum' 
          END FUNCTION OtsGetThreadNum 
      END INTERFACE 
 
!$par parallel do 
      do i = 1,10 
      print *, "i=",i, "  thread=", OtsGetThreadNum () 
      enddo 
 
      end 

Fortran INTERFACE blocks for all of the _Otsxxx routines are in a file named forompdef.f in /usr/include . Add the following line to your program and you can use the Fortran name otsxxx to call any of the _Otsxxx routines:


INCLUDE 'forompdef.f' 

Alternatively, to use the compatibility naming convention of mpc_my_threadnum :


      integer a(10) 
      INTERFACE 
          INTEGER FUNCTION mpc_my_threadnum () 
!DEC$     ALIAS mpc_my_threadnum, 'mpc_my_threadnum' 
          END FUNCTION mpc_my_threadnum 
      END INTERFACE 
 
!$par parallel do 
      do i = 1,10 
         print *, "i=",i, "  thread=", mpc_my_threadnum () 
      enddo 
 
      end 

These parallel threads routines are described in detail in the following sections.

See Also:

Section 6.1.3, Parallel Processing Thread Model

D.2.1 _OtsGetMaxThreads or mpc_maxnumthreads

Returns the maximum number of threads for the current environment.

Syntax:


      INTERFACE 
          INTEGER FUNCTION otsgetmaxthreads () 
!DEC$     ALIAS otsgetmaxthreads, '_OtsGetMaxThreads' 
          END FUNCTION otsgetmaxthreads 
      END INTERFACE 
      INTEGER result 
      result = otsgetmaxthreads () 

Description:

Returns the number of threads that would normally be used for parallel processing in the current environment. This is affected by the environment variable MP_THREAD_COUNT , by the number of processes in the current process's processor set, and by any call to _OtsInitParallel .

D.2.2 _OtsGetNumThreads or mpc_numthreads

Returns the number of threads being used (in a parallel region) or created so far (if not in a parallel region).

Syntax:


      INTERFACE 
          INTEGER FUNCTION otsgetnumthreads () 
!DEC$     ALIAS otsgetnumthreads, '_OtsGetNumThreads' 
          END FUNCTION otsgetnumthreads 
      END INTERFACE 
      INTEGER result 
      result = otsgetnumthreads () 

Description:

Returns the number of threads that are being used in the current parallel region (if running within one), or the number of threads that have been created so far (if not currently within a parallel region). You can use this call to decide how to partition a parallel loop. For example:


      nt = otsgetnumthreads () 
c$par parallel do 
      do i = 0,nt-1 
            work(i) = 0 
            k0 = 1+(i*n)/nt 
            k1 = ((i+1)*n)/nt 
            do j = 1,m 
                  do k = k0,k1 
                     ! use work(i) 
                  enddo 
            enddo 
      enddo 

D.2.3 _OtsGetThreadNum or mpc_my_threadnum

Returns the number of the current thread.

Syntax:


      INTERFACE 
          INTEGER FUNCTION otsgetthreadnum () 
!DEC$     ALIAS otsgetthreadnum, '_OtsGetThreadNum' 
          END FUNCTION otsgetthreadnum 
      END INTERFACE 
      INTEGER result 
      result = otsgetthreadnum () 

Description:

Returns a number that identifies the current thread. The main thread is 0, and slave threads are numbered densely from 1.

D.2.4 _OtsInitParallel

Starts slave threads.

Syntax:


      INTERFACE 
          SUBROUTINE otsinitparallel (nthreads, attr) 
!DEC$     ALIAS otsinitparallel, '_OtsInitParallel' 
          INTEGER              nthreads 
          INTEGER(KIND=8)      attr 
!DEC$     ATTRIBUTES VALUE :: nthreads, attr 
          END SUBROUTINE otsinitparallel 
      END INTERFACE 

Description:

Starts slave threads for parallel processing if they have not yet been started implicitly (normally, the threads are started by default at the first parallel region). Use this routine if you want to start the slave threads explicitly or to control their attributes.

The arguments are:

  • nthreads: The total number of threads desired (or specify zero to allow use of the environment variable MP_THREAD_COUNT or the maximum number of processors).
  • attr: A pointer to a pthreads attribute block, which can be used to control the attributes of the slave threads.

D.2.5 _OtsInParallel or mpc_in_parallel_region

Returns the current status of processing activity in a parallel region.

Syntax:


      INTERFACE 
          INTEGER FUNCTION otsinparallel () 
!DEC$     ALIAS otsinparallel, '_OtsInParallel' 
          END FUNCTION otsinparallel 
      END INTERFACE 
      INTEGER result 
      result = otsinparallel () 

Description:

The routine returns 1 if the program is currently running within a parallel region; otherwise it returns 0.

D.2.6 _OtsSetNumThreads

Sets the number of threads to use for the next parallel region.
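The manual gives no syntax for this routine. By analogy with the other _Otsxxx interfaces in this section, it can presumably be declared as follows; this sketch is an assumption, not taken from this manual, and the argument-passing convention should be verified against forompdef.f:

```fortran
      INTERFACE
          ! assumed interface, by analogy with the other _Otsxxx routines
          SUBROUTINE otssetnumthreads (nthreads)
!DEC$     ALIAS otssetnumthreads, '_OtsSetNumThreads'
          INTEGER nthreads
          END SUBROUTINE otssetnumthreads
      END INTERFACE
      CALL otssetnumthreads (4)      ! the count of 4 is arbitrary
```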

D.2.7 _OtsStopWorkers or mpc_destroy

Stops slave threads.

Syntax:


      INTERFACE 
          SUBROUTINE otsstopworkers () 
!DEC$     ALIAS otsstopworkers, '_OtsStopWorkers' 
          END SUBROUTINE otsstopworkers 
      END INTERFACE 
      CALL otsstopworkers () 

Description:

Stop any slave threads created by parallel library support. Use this routine if you need to perform some operation, such as a call to fork() , that cannot tolerate extra threads running in the process. This routine cannot be called from within a parallel region. After this call, new slave threads will be implicitly created the next time a parallel region is encountered, or can be created explicitly by calling _OtsInitParallel .

