

Sun HPC ClusterTools™ 5 Software User's Guide

817-0084-10



Contents

Preface

1. Introduction to Sun HPC ClusterTools Software

Supported Configurations

Sun HPC ClusterTools Runtime Environment (CRE)

Executing Programs With mprun

Killing Programs

Displaying Job Information

Displaying Node Information

Integration With Distributed Resource Management Systems

Sun MPI and MPI I/O

Prism

Support for TotalView

Sun S3L

MPProf

2. Fundamental Concepts

Clusters and Nodes

Partitions

How Partitions Are Enabled and Selected

Load Balancing

Processes

Jobs

How the CRE Environment Is Integrated With Distributed Resource Management Systems

How Programs Are Launched

How Distributed Resource Managers Work

3. Before You Begin

Prerequisites

Command and Man Page Paths

Authentication Methods

Core Files

4. Running Programs With mprun

Syntax

Controlling Where the Program Runs

Precedence for Program Execution

How to Run a Program With Default Settings

How to Run on a Different Cluster (-c)

How to Run on a Different Partition (-p)

How to Run as Multiple Processes (-np)

How to Share Nodes (-j)

How to Enable Process Spawning (-Ys)

How to Disable Process Spawning (-Ns)

How to Wrap Multiple Processes (-W)

How to Settle for Available Processes (-S)

How to Include Independent Nodes (-u)

How to Combine Process Placement Options

Mapping MPI Processes to Nodes

How to Distribute Processes Among Nodes (-l)

How to Distribute Processes by Block (-Z and -Zt)

How to Distribute Processes by Rank Map (-m)

How to Reserve Resources for Spawning or Multithreading (-nr)

How to Select Nodes by Resource Requirement (-R)

Controlling Input/Output

How to Redirect Output to mprun (-D)

How to Redirect Output to Individual Files (-B)

How to Shut Off All Standard I/O (-N)

How to Redirect With an Argument Vector (-A)

How to Read Standard Input From /dev/null (-n)

How to Redirect With a Custom Configuration (-I)

Controlling Other Job Attributes

How to Include Shell-Specific Actions

How to Move a Process to the Background

How to Change the Working Directory (-C)

How to Use a Different User Name (-U)

How to Use a Different Group Name (-G)

How to Run a Job on a Different Project (-P)

How to Specify Verbose Output (-v)

How to Display Command Help (-h)

How to Display the Command's Version (-V)

How to Display Job Status Information (-J)

How to Store Job Name in a File (-d)

How to Tag Output With Its Rank Number (-o)

Command Reference (mprun)

5. Running Programs With mprun in Distributed Resource Management Systems

mprun Options for DRM Integration

Improper Flag Combinations for Batch Jobs

Running Parallel Jobs in the PBS Environment

How to Run an Interactive Job in PBS

How to Run a Script Job in PBS

Running Parallel Jobs in the LSF Environment

How to Run an Interactive Job in LSF

How to Run a Script Job in LSF

How to Run an LSF Job in Compatibility Mode

Running Parallel Jobs in the SGE Environment

How to Run an Interactive Job in SGE

How to Run a Script Job in SGE

6. Killing or Sending Signals to Programs With mpkill

What You Can Do

Return Values

How to Kill a Running Program

How to Remove All Traces of a Job

How to Display a List of Supported Signals (-l -d)

How to Send a Signal to a Job

7. Displaying Program Information With mpps

What You Can Do

How to Display Job Status

How to Display Information About Individual Jobs (-J)

How to Display Job Name, PID, and Host of Current Job (-b)

How to Display Information About All Jobs (-e)

How to Display a Job's Start Time (-f)

How to Display Job Information by Partition (-A -a)

How to Display Job Information by Process (-p -P)

Command Reference (mpps)

8. Profiling Programs With MPPROF

Enabling MPI Profiling

Controlling Data Collection

MPI_PROFDATADIR

MPI_PROFINDEXDIR

MPI_PROFINTERVAL

MPI_PROFMAXFILESIZE

Using mpprof to Generate Reports

mpprof Command Syntax

Generating a Message Passing Report

Reporting on Specific Processes

Reporting Processes That Occur After a Specified Time Interval

To Save Report Output for Later Use

A Sample Report

Using mpdump to Convert Intermediate Binary Files to ASCII Files

The mpdump Command Syntax

A Sample mpdump File

9. Displaying Information With mpinfo

What You Can Do

How to Display Information About Published Names (-T)

How to Display Information About Any Cluster (-c)

How to Display Information About the Current Cluster (-C)

How to Display Information About Individual Partitions (-p)

How to Display Information About All Partitions (-P)

How to Display Information About Individual Nodes (-n)

How to Display Information About All Nodes (-N)

How to Display an Online List of Valid Attributes (-lc, -lp, -ln)

How to Restrict Output to Individual Attributes (-A)

How to Display Information in Verbose Mode (-v)

Command Reference (mpinfo)

Index