CHAPTER 3

Getting Started

This chapter explains how to develop, compile and link, execute, and debug a Sun MPI program. The chapter focuses on what is specific to the Sun MPI implementation and, for the most part, does not repeat information that can be found in related documents. Information about programming with the Sun MPI I/O routines is in Chapter 4.


Header Files

Include syntax must be placed at the top of any program that calls Sun MPI routines. For C and C++ programs, use:

#include <mpi.h>

For Fortran programs, use:

INCLUDE 'mpif.h'

These lines enable the program to access the Sun MPI version of the mpi header file, which contains the definitions, macros, and function prototypes required when compiling the program. Ensure that you are referencing the Sun MPI include file.

The include files are usually found in /opt/SUNWhpc/include/ or /opt/SUNWhpc/include/v9/. If the compiler cannot find them, verify that they exist and are accessible from the machine on which you are compiling your code. The location of the include file is specified by a compiler option (see Compiling and Linking).
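
For example, if you compile with cc directly rather than with the utilities described under Compiling and Linking, you might point the compiler at the default include directory like this (the file name myprog.c is only a placeholder):

% cc -c myprog.c -I/opt/SUNWhpc/include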

Sample Code

Two simple Sun MPI programs are available in /opt/SUNWhpc/examples/mpi and are included here in their entirety. In the same directory you will find the Readme file, which provides instructions for using the examples, and the make file Makefile.

CODE EXAMPLE 3-1 Simple Sun MPI Program in C: connectivity.c
/*
 * Test the connectivity between all processes.
 */
 
#pragma ident "@(#)connectivity.c 1.1 99/02/02"
 
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <netdb.h>
#include <unistd.h>
 
#include <mpi.h>
 
int
main(int argc, char **argv)
{
    MPI_Status  status;
    int         verbose = 0;
    int         rank;
    int         np;	                 /* number of processes in job */
    int         peer;
    int         i;
    int         j;
 
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
 
    if (argc>1 && strcmp(argv[1], "-v")==0)
        verbose = 1;
 
    for (i=0; i<np; i++) {
        if (rank==i) {
            /* rank i sends to and receives from each higher rank */
            for(j=i+1; j<np; j++) {
                if (verbose)
                  printf("checking connection %4d <-> %-4d\n", i, j);
                MPI_Send(&rank, 1, MPI_INT, j, rank, MPI_COMM_WORLD);
               MPI_Recv(&peer, 1, MPI_INT, j, j, MPI_COMM_WORLD, &status);
            }
        } else if (rank>i) {
            /* receive from and reply to rank i */
          MPI_Recv(&peer, 1, MPI_INT, i, i, MPI_COMM_WORLD, &status);
            MPI_Send(&rank, 1, MPI_INT, i, rank, MPI_COMM_WORLD);
        }
    }
 
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank==0)
        printf("Connectivity test on %d processes PASSED.\n", np);
 
    MPI_Finalize();
    return 0;
}
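
As a rough illustration (the Readme file documents the supported procedure), this example might be compiled and run with the utilities described later in this chapter, with an arbitrary process count:

% mpcc -o connectivity connectivity.c -lmpi
% mprun -np 4 connectivity -v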

CODE EXAMPLE 3-2 Simple Sun MPI Program in Fortran: monte.f
!
! Estimate pi via Monte-Carlo method.
! 
! Each process sums how many of samplesize random points generated 
! in the square (-1,-1),(-1,1),(1,1),(1,-1) fall in the circle of 
! radius 1 and center (0,0), and then estimates pi from the formula
! pi = (4 * sum) / samplesize.
! The final estimate of pi is calculated at rank 0 as the average of 
! all the estimates.
!
        program monte
 
        include 'mpif.h'
 
        double precision drand
        external drand
 
        double precision x, y, pi, pisum
        integer*4 ierr, rank, np
        integer*4 incircle, samplesize
 
        parameter(samplesize=2000000)
 
        call MPI_INIT(ierr)
        call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
        call MPI_COMM_SIZE(MPI_COMM_WORLD, np, ierr)
 
!       seed random number generator
        x = drand(2 + 11*rank)
 
        incircle = 0
        do i = 1, samplesize
           x = drand(0)*2.0d0 - 1.0d0     ! generate a random point
           y = drand(0)*2.0d0 - 1.0d0
 
           if ((x*x + y*y) .lt. 1.0d0) then
              incircle = incircle+1       ! point is in the circle
           endif
        end do
 
        pi = 4.0d0 * DBLE(incircle) / DBLE(samplesize)
 
!       sum estimates at rank 0
        call MPI_REDUCE(pi, pisum, 1, MPI_DOUBLE_PRECISION, MPI_SUM,
     &       0, MPI_COMM_WORLD, ierr)
 
        if (rank .eq. 0) then
!          final estimate is the average
           pi = pisum / DBLE(np)
           print '(A,I4,A,F8.6,A)','Monte-Carlo estimate of pi by ',np,
     &        ' processes is ',pi,'.'
        endif
 
        call MPI_FINALIZE(ierr)
        end
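
This example might similarly be compiled and run as follows (again, an illustrative sketch only; the process count is arbitrary and the Readme file documents the supported procedure):

% mpf77 -dalign -o monte monte.f -lmpi
% mprun -np 4 monte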


Compiling and Linking

Sun MPI programs are compiled with ordinary C, C++, or Fortran compilers, just like any other C, C++, or Fortran program, and linked with the Sun MPI library.

The mpf77, mpf90, mpcc, and mpCC utilities can be used to compile Fortran 77, Fortran 90, C, and C++ programs, respectively. For example, you might use the following entry to compile a Fortran 77 program that uses Sun MPI:

% mpf77 -fast -xarch=v8plusa -o a.out a.f -lmpi

See the man pages for more information on these utilities.

For performance, the single most important compilation switch is -fast. This macro expands to settings that give good performance under a general set of circumstances. Because its expansion varies from one compiler release to another, you might prefer to specify the underlying switches explicitly. To see what -fast expands to, use -v for "verbose" compilation output in Fortran, and -# for C. Also, -fast assumes native compilation, so you should compile on UltraSPARC™ processors.
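
For example, you could examine the expansion of -fast with commands such as the following (the source file names are placeholders):

% f77 -v -fast -c a.f
% cc -# -fast -c a.c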

The next important compilation switch is -xarch. Sun ONE Studio 7 Compiler Collection compilers set -xarch by default when you select -fast for native compilations. If you plan to compile on one type of processor and run the program on another type (nonnative compilation), be sure to use the -xarch flag. Also use it to compile in 64-bit mode. For UltraSPARC II processors, specify:

-xarch=v8plusa 

or

-xarch=v9a 

after -fast for 32-bit or 64-bit binaries, respectively. (The 64-bit version is supported only on Solaris 8 software.) For UltraSPARC III processors, specify:

-xarch=v8plusb 

or

-xarch=v9b

The v8plusb and v9b flags apply only to programs that run on UltraSPARC III processors; binaries built with them do not work on UltraSPARC II processors. For more information, see the Sun HPC ClusterTools Software Performance Guide and the documents that came with your compiler.
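
For example, hypothetical compile lines for 32-bit and 64-bit binaries targeted at UltraSPARC III processors, modeled on the earlier example, might be:

% mpf77 -fast -xarch=v8plusb -o a.out a.f -lmpi
% mpf77 -fast -xarch=v9b -o a.out a.f -lmpi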

Sun MPI programs compiled using the Sun ONE Studio 7 Compiler Collection Fortran compiler should be compiled with -xalias=actual. The
-xalias=actual workaround requires patch 111718-01 (which requires 111714-01).
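
For example, a hypothetical Fortran compile line with the workaround added to the options shown earlier might be:

% mpf90 -fast -xarch=v8plusa -xalias=actual -o a.out a.f -lmpi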

This recommendation arises because the MPI Fortran binding is inconsistent with the Fortran 90 standard in several respects. Specifically, this is documented in the MPI 2 standard, which you can find on the World Wide Web:

http://www-unix.mcs.anl.gov/mpi/mpi-standard/mpi-report-2.0/node19.htm#Node19

This recommendation matters most at high levels of compiler optimization: a highly optimizing Fortran compiler could break MPI codes that use nonblocking operations.

The failure modes are varied and insidious and include the following:

  • Silently incorrect answers
  • Intermittent and mysterious floating-point exceptions
  • Intermittent and mysterious hangs

If you will be using the Prism debugger, you must compile your program with compilers from the Forte™ 6 update 2 or Sun ONE Studio 7 (formerly Forte Development 7 software) Compiler Collections (see Debugging).

TABLE 3-1 Compile and Link Line Options for Sun MPI and Sun MPI I/O

Program: C (nonthreaded example)
Options: Use mpcc (below), or if you prefer:
         % cc filename.c -o filename \
           -I/opt/SUNWhpc/include -L/opt/SUNWhpc/lib \
           -R/opt/SUNWhpc/lib -lmpi

Program: C++
Options: Note that x.0 represents the version of your C++ compiler (6.0 for Forte 6 update 2, and 7.0 for Sun ONE Studio 7).
         Use mpCC (below), or if you prefer:
         % CC filename.cc -o filename \
           -I/opt/SUNWhpc/include -L/opt/SUNWhpc/lib \
           -R/opt/SUNWhpc/lib -L/opt/SUNWhpc/lib/SCx.0 \
           -R/opt/SUNWhpc/lib/SCx.0 -mt -lmpi++ -lmpi

Program: mpcc, mpCC
Options: % mpcc -o filename filename.c -lmpi
         % mpCC -o filename filename.cc -mt -lmpi

Program: Fortran 77 (nonthreaded example)
Options: Use mpf77 (below), or if you prefer:
         % f77 -dalign filename.f -o filename \
           -I/opt/SUNWhpc/include -L/opt/SUNWhpc/lib \
           -R/opt/SUNWhpc/lib -lmpi

Program: Fortran on a 64-bit system
Options: % f77 -dalign filename.f -o filename \
           -I/opt/SUNWhpc/include/v9 \
           -L/opt/SUNWhpc/lib/sparcv9 \
           -R/opt/SUNWhpc/lib/sparcv9 -lmpi

Program: Fortran 90
Options: Replace mpf77 with mpf90, or f77 with f90:

Program: mpf90, mpf95
Options: % mpf90 -o filename -dalign filename.f -lmpi
         % mpf95 -o filename -dalign filename.f -lmpi

Program: Multithreaded programs and programs containing nonblocking MPI I/O routines
Options: To support multithreaded code, replace -lmpi with -lmpi_mt. This change also supports programs with nonblocking MPI I/O routines.
         Note that -lmpi can be used for programs containing nonblocking MPI I/O routines, but -lmpi_mt must be used for multithreaded programs.




Note - For the Fortran interface, the -dalign option is necessary to avoid the possibility of bus errors. (The underlying C or C++ routines in Sun MPI internals assume that parameters and buffer types passed as REALs are double-aligned.)





Note - If your program has previously been linked to any static libraries, you must relink it to libmpi.so before executing it.



Choosing a Library Path

The paths for the MPI libraries, which you must specify when you are compiling and linking your program, are listed in TABLE 3-2.

TABLE 3-2 Sun MPI Libraries

Category           Description                               Path: /opt/SUNWhpc/lib/...
32-bit libraries   Default, not thread-safe                  libmpi.so
                   C++ (in addition to libmpi.so)            SC6.0/libmpi++.so
                   Thread-safe                               libmpi_mt.so
64-bit libraries   Default, not thread-safe                  sparcv9/libmpi.so
                   C++ (in addition to sparcv9/libmpi.so)    sparcv9/SC6.0/libmpi++.so
                   Thread-safe                               sparcv9/libmpi_mt.so

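For example, a hypothetical link line for a 64-bit, multithreaded C program, combining the 64-bit paths above with the -lmpi_mt library described in TABLE 3-1 (the -xarch value assumes an UltraSPARC II processor, as discussed earlier), might be:

% cc -mt filename.c -o filename -xarch=v9a \
  -I/opt/SUNWhpc/include/v9 -L/opt/SUNWhpc/lib/sparcv9 \
  -R/opt/SUNWhpc/lib/sparcv9 -lmpi_mt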

Stubbing Thread Calls

The libthread.so libraries are automatically linked into the respective libmpi.so libraries. This means that any thread-function calls in your program can be resolved by the libthread.so library. Simply omitting libthread.so from the link line does not cause thread calls to be stubbed out; you must remove the thread calls yourself. For more information about the libthread.so library, see its man page. (For the location of Solaris man pages at your site, see your system administrator.)


Profiling With mpprof

If you plan to extract MPI profiling information from the execution of a job, you must set the MPI_PROFILE environment variable to 1 before you start the job:

% setenv MPI_PROFILE 1

If you want to set any other mpprof environment variables, you must also set them before starting the job. See Appendix B for detailed descriptions of the mpprof environment variables.
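
For example, a typical csh session might enable profiling and then launch the job (a.out is a placeholder for your program):

% setenv MPI_PROFILE 1
% mprun -np 4 a.out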


Basic Job Execution

The CRE environment provides close integration with batch-processing systems, also known as resource managers. You can launch parallel jobs from a batch system to control resource allocation, and continue to use the CRE environment to monitor job status. For a list of currently supported resource managers, see TABLE 3-3.

TABLE 3-3 Currently Supported Resource Managers

Resource Manager   Name Used With -x Option to mprun   Version         Man Page
Sun Grid Engine    sge                                 SGE 5.3         sge_cre.1
PBS                pbs                                 PBS 2.3.15      pbs_cre.1
                   pbs                                 PBS Pro 5.x.x   pbs_cre.1
LSF                lsf                                 LSF 4.x         lsf_cre.1


To enable the integration between the CRE environment and the supported resource managers, you must call mprun from a script in the resource manager. Use the -x flag to specify the resource manager, and the -np and -nr flags to specify the resources you need. Instructions and examples for each resource manager are provided in the Sun HPC ClusterTools Software User's Guide.
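
For example, a minimal batch script for Sun Grid Engine might contain little more than the mprun line itself (the process count and the program name a.out are placeholders; see the Sun HPC ClusterTools Software User's Guide for complete, supported examples):

#!/bin/csh
mprun -x sge -np 4 a.out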

Before starting your job, you might want to set one or more environment variables, which are also described in Appendix B and in the Sun HPC ClusterTools Software Performance Guide.

Executing With CRE

When using CRE software, parallel jobs are launched with the mprun command. For example, to start a six-process job running the program mpijob, use this command:

% mprun -np 6 mpijob

Executing With LSF Suite

Parallel jobs can be either launched by the LSF Parallel Application Manager (PAM) or submitted in queues configured to run PAM as the parallel job starter. LSF's bsub command launches both parallel interactive and batch jobs. For example, to start a batch job named mpijob on four CPUs, use this command:

% bsub -n 4 pam mpijob

To launch an interactive job, add the -I argument to the command line. For example, to launch an interactive job named earth on a single CPU in the queue named sun, which is configured to launch jobs with PAM, use this command:

% bsub -q sun -Ip -n 1 earth


Debugging

Debugging parallel programs is notoriously difficult, because you are in effect debugging a program potentially made up of many distinct programs executing simultaneously. Even if the application is an SPMD (single-program, multiple-data) application, each instance can be executing a different line of code at any instant. The Prism development environment eases the debugging process considerably and is recommended for debugging with Sun HPC ClusterTools software.

Debugging With the Prism Environment



Note - To run the graphical version of the Prism environment, you must be running the Solaris 8 operating environment with either OpenWindows™ software or the Common Desktop Environment (CDE), and with your DISPLAY environment variable set correctly. See the Prism Software User's Guide for information.



This section provides a brief introduction to the Prism development environment.

You can use a Prism session to debug more than one Sun MPI job at a time. To debug a child or client program, it is necessary to launch an additional Prism session. If the child program is spawned using calls to MPI_Comm_spawn() or MPI_Comm_spawn_multiple(), Prism can (if enabled) debug the child program as well.

However, if an MPI job connects to another job, the current Prism session has control only of the parent or server MPI job. It cannot debug the children or clients of that job. This might occur, for example, when an MPI job sets up a client/server connection to another MPI job with MPI_Comm_accept or MPI_Comm_connect.

Except for programs that call MPI_Comm_spawn() or MPI_Comm_spawn_multiple(), a Sun MPI program must be written in the SPMD style to be debugged with the Prism environment. In other words, all processes that make up the Sun MPI program must be running the same executable.

MPI_Comm_spawn_multiple can launch multiple executables under a single job ID. Therefore, you can use the Prism environment to debug jobs whose various executables have been spawned with this routine.
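
As a minimal sketch (not one of the installed Sun MPI examples), the following C fragment shows how a parent program might spawn two different executables under one job ID with MPI_Comm_spawn_multiple. The executable names worker_a and worker_b and the process counts are hypothetical.

#include <mpi.h>

int main(int argc, char **argv)
{
    /* Hypothetical example: spawn two processes each of two different
     * MPI executables. The spawned programs must themselves call MPI_Init. */
    char     *commands[2] = { "worker_a", "worker_b" };
    int       maxprocs[2] = { 2, 2 };
    MPI_Info  infos[2]    = { MPI_INFO_NULL, MPI_INFO_NULL };
    MPI_Comm  intercomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_spawn_multiple(2, commands, MPI_ARGVS_NULL, maxprocs, infos,
                            0, MPI_COMM_WORLD, &intercomm,
                            MPI_ERRCODES_IGNORE);
    /* ... communicate with the spawned processes over intercomm ... */
    MPI_Finalize();
    return 0;
}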

Starting Up Prism

To start Prism with a Sun MPI program, launch it from within the mprun command.

For example,

% mprun -np 4 -x lsf prism -np 4 foo

launches Prism on executable foo with four processes.

This starts up a graphical version of Prism with your program loaded. You can then debug and visualize data in your Sun MPI program.

You can also attach Prism to running processes. First determine the job ID, also called the jobname or jid (not an individual process ID), using mpps. (See the Sun HPC ClusterTools Software User's Guide for further information about mpps.) Then specify the jid on the command line:

% prism foo 12345

This launches Prism and attaches it to the processes running in job 12345.

One important feature of the Prism environment is that it enables you to debug the Sun MPI program at any level of detail. You can look at the program as a whole, or you can look at subsets of processes within the program (for example, those that have an error condition) or at individual processes, all within the same debugging session. For complete information, see the Prism Software User's Guide.

Debugging With TotalView

TotalView is a third-party multiprocess debugger from Etnus that runs on many platforms. Support for using the TotalView debugger on Sun MPI applications includes:

  • Making Sun HPC ClusterTools software compatible with the Etnus debugger TotalView
  • Allowing Sun MPI jobs to be debugged by TotalView using the Sun Grid Engine (SGE), the Portable Batch System (PBS), and Platform Computing's Load Sharing Facility (LSF)
  • Displaying Sun MPI message queues
  • Allowing multiple instantiations of TotalView on a single cluster
  • Supporting TotalView in Sun HPC ClusterTools software

The following sections provide a brief description of how to use the TotalView debugger with Sun MPI applications, including:

  • Starting a new job using TotalView
  • Attaching to an mprun job
  • Launching Sun MPI batch jobs using TotalView

Refer to your TotalView documentation for more information about using TotalView.

Limitations

  • The size of a debuggable job is restricted according to the license with Etnus. Contact the system administrator who installed TotalView for more details.
  • Supports the SPARC platform only.
  • TotalView 5 supports 32-bit application debugging only. However, TotalView 6 supports both 32-bit and 64-bit application debugging.
  • Does not support MPI_Comm_spawn and MPI_Comm_spawn_multiple function calls. Use Prism to debug these function calls.
  • Displays MPI_COMM_WORLD in the message queue graph only after MPI_Init() has occurred. Displays neither collective communicators nor MPI_COMM_SELF. Refer to the Etnus TotalView user's guide for more information.
  • Does not display any buffer contents for unexpected messages in the message queue window.

Related Documentation

For more information, refer to the following related documentation:

  • Sun HPC ClusterTools Software User's Guide
  • Sun HPC ClusterTools Software Administrator's Guide
  • Sun HPC ClusterTools software man pages
    • lsf_cre(1)
    • lsf_cre_admin(1M)
    • mpps(1M)
    • mprun(1M)
    • pbs_cre(1)
    • pbs_cre_admin(1M)
    • sge_cre(1)
    • sge_cre_admin(1M)
    • totalview_mprun(1M)
  • Etnus TotalView documentation

Starting a New Job Using TotalView

You can start a new job from the TotalView Graphical User Interface (GUI) using:

  • GUI method 1
  • GUI method 2
  • Command-line interface (CLI)

To Start a New Job Using GUI Method 1

1. Type:

% totalview mprun [totalview args] -a [mprun args]

For example:

% totalview mprun -bg blue -a -np 4 /opt/SUNWhpc/mpi/conn.x

2. When the GUI appears, type g for go, or click Go in the TotalView window.

TotalView may display a dialog box:

Process mprun is a parallel job. Do you want to stop the job now?

3. Click Yes. TotalView opens the debugger window with the Sun MPI source window (if the program was compiled with the -g option) and leaves all processes in a traced state.


To Start a New Job Using GUI Method 2

1. Type:

% totalview

2. Select the menu option File and then New Program.

3. Type mprun as the executable name in the dialog box.

4. Click OK.

TotalView displays the main debug window.

5. Select the menu option Process and then Startup Parameters, and enter the mprun args as the startup parameters.


To Start a New Job Using the CLI

1. Type:

% totalviewcli mprun [totalview args] -a [mprun args]

For example:

% totalviewcli mprun -a -np 4 /opt/SUNWhpc/mpi/conn.x

2. When the job starts, type dgo.

TotalView displays this message:

Process mprun is a parallel job. Do you want to stop the job now?

3. Type y to start the MPI job, attach TotalView, and leave all processes in a traced state.

Attaching to an mprun Job

This section describes how to attach to an already running mprun job from both the TotalView GUI and CLI.


To Attach to a Running Job From the GUI

1. Find the host name and process identifier (PID) of the mprun job by typing:

% mpps -b

mpps displays the PID and host name in a manner similar to this example:

JOBNAME   MPRUN_PID   MPRUN_HOST
cre.99    12345       hpc-u2-9
cre.100   12601       hpc-u2-8

For more information, refer to the mpps(1M) man page, option -b.

2. In the TotalView GUI, select File and then New Program.

3. Type the PID in Process ID.

4. Type mprun in the field Executable Name.

5. Do one of the following:

  • Leave Remote Host blank if TotalView is running on the same node as the mprun job.
  • Enter the host name in Remote Host.

6. Click OK.


To Attach to a Running Job From the CLI

1. Find the process identifier (PID) of the launched job.

See the example under the preceding GUI procedure. For more information, refer to the mpps(1M) man page, option -b.

2. Start totalviewcli by typing:

% totalviewcli

3. Attach to the mprun process by specifying the executable name and the mprun PID:

% dattach mprun mprun_pid

For example:

% dattach mprun 12601 

Launching Sun MPI Batch Jobs Using TotalView

This section describes how to launch Sun MPI batch jobs, including:

  • Batch mode startup under TotalView control, using the GUI only
  • Interactive sessions using the GUI
  • Interactive sessions using the CLI

This section provides examples of launching batch jobs in Sun Grid Engine (SGE). Refer to Chapter 5 of the Sun HPC ClusterTools Software User's Guide for descriptions of launching batch jobs in the Load Sharing Facility (LSF) and the Portable Batch System (PBS).


To Execute Startup in Batch Mode for the TotalView GUI

Executing startup in batch mode for the TotalView CLI is not practical, because there is no controlling terminal for input and output. This procedure describes executing startup in batch mode for the TotalView GUI:

1. Write a batch script, which contains a line similar to the following:

% totalview mprun -a -x SGE /opt/SUNWhpc/mpi/conn.x

2. Then submit the script to SGE for execution with a command similar to the following:

% qsub -l cre 4 batch_script

The TotalView GUI appears upon successful allocation of resources and execution of the batch script in SGE.


To Use the Interactive Mode

The interactive mode creates an xterm window for your use, so you can use either the TotalView GUI or the CLI.

1. Submit an interactive mode job to SGE with a command similar to the following:

% qsh -l cre 4

The system displays an xterm window.

2. Run the following, or an equivalent path, to source the SGE environment:

% source /opt/sge/default/common/settings.csh

3. Execute a typical totalview or totalviewcli command.

TotalView GUI example:

% totalview mprun -a -x SGE /opt/SUNWhpc/mpi/conn.x

TotalView CLI example:

% totalviewcli mprun -a -x SGE /opt/SUNWhpc/mpi/conn.x

Debugging With MPE

The multiprocessing environment (MPE) available from Argonne National Laboratory includes a debugger that can also be used for debugging at the thread level. For information about obtaining and building MPE, see MPE: Extensions to the Library.