CHAPTER 4
Running Programs With mprun
The mprun command controls several aspects of program execution. This chapter describes what you can do with the command. The command has this syntax:
% mprun [ options ] [ - ] program-name [ program-arguments ] |
The options control the behavior of the command. The tasks they perform are summarized in the diagram on the previous page. TABLE 4-5 lists the options in alphabetical order, with a brief description.
The runtime environment applies the options to the mprun command according to useful program logic rather than sequential order. Some options override conflicting options that appear earlier in the command line or in the MPRUN_FLAGS environment variable. In some cases, the presence of one option causes other options in the command line to be ignored, even if they appear later in the command line. As a result, option precedence varies by task. A table at the beginning of each group of tasks lists precedence order for the options used in those tasks.
If program-name conflicts with the name of an mprun option, use the - (dash) symbol to separate the program name from the option list. Be sure to add a space between the - symbol and the program name. For example:
% mprun -np 4 - myprogram |
Enter any required program-arguments after the program-name.
You can pre-enter options to the mprun command by setting the MPRUN_FLAGS environment variable. Because the MPRUN_FLAGS variable affects only default behavior, you can override those options by entering different ones when you enter the mprun command itself.
The MPRUN_FLAGS environment variable uses the same options as the mprun command. (For a complete list, see TABLE 4-5.) If you use more than one word, enclose the list in quotation marks.
For example, to make part2 the default partition, enter the first command in the C shell or the second in the Bourne shell:
% setenv MPRUN_FLAGS "-p part2"
# MPRUN_FLAGS="-p part2"; export MPRUN_FLAGS
You can check the current setting of MPRUN_FLAGS by issuing the command printenv.
% printenv MPRUN_FLAGS |
Three environment variables related to mprun are available for your scripts:
Each variable is automatically set by the mprun command at execution time. For example, this instance of mprun...
% mprun -np 6 a.out |
... would set the value of the variables to:
The same jid that can be displayed by mpps (see Chapter 7) |
To run the program with default settings, enter the command and program name, followed by any required arguments to the program:
% mprun program-name |
By default, a program runs on your login cluster. To run a program on a different cluster, use the -c option:
% mprun -c cluster-name program-name |
To find the name of a cluster, use the mpinfo command with the -C option, as described in How to Display Information About the Current Cluster (-C). Note case sensitivity.
To run the program on a partition other than your login partition, use the -p option:
% mprun -p partition-name program-name |
The partition must be enabled. If it is not enabled, the job fails. (As described in Partitions, if a node is included in multiple partitions, only one partition can be enabled at a time.)
By default, an MPI program started with mprun runs as one process. To run the program as multiple processes, use the -np option:
% mprun -np process-count program-name |
When you request multiple processes, CRE attempts to start one process per CPU. If you request more processes than the number of available CPUs, you must use either the -W option (see How to Wrap Multiple Processes (-W)) or the -S option (see How to Settle for Available Processes (-S)) to prevent mprun from failing.
If you enter 0 as the number of processes, the runtime environment starts one process per available CPU. For example:
% mprun -np 4 a.out
% mprun -p partition2 -np 0 a.out
The first example runs four copies of the program a.out on the login partition. The second example runs the job on partition2, which has six CPUs. Because the second command specifies "0" processes, the runtime environment runs six copies of a.out, one for each available CPU.
% mprun -np process-count x threads program-name |
When launching a multi-threaded program, use the x threads syntax to specify the number of threads per process. Although the job requires a number of resources equal to process-count multiplied by threads, only process-count processes are started. The ranks are numbered from 0 (zero) to process-count minus 1. The processes are allocated across nodes so that each node provides a number of CPUs equal to or greater than threads. If threading requirements cannot be met, the job fails with diagnostic messages. As with a process-count value of 0, a thread value of 0 requests all available resources on the node. In this way it is equivalent to the -Ns option.
The syntax -np process-count is equivalent to the syntax -np process-countx1. The default is -np 1x1.
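The following Python sketch illustrates how a process-count x threads value breaks down into its two parts and how many resources it implies. It is only a model of the rules described above, not part of CRE; the function names parse_np and resources_required are hypothetical.

```python
import re

def parse_np(spec="1x1"):
    """Split a value such as '4', '4x2', or '4 x 2' into (processes, threads).

    A missing thread count means 1, matching the documented default of 1x1.
    """
    match = re.fullmatch(r"\s*(\d+)\s*(?:x\s*(\d+))?\s*", spec)
    if match is None:
        raise ValueError(f"not a process-count[xthreads] value: {spec!r}")
    processes = int(match.group(1))
    threads = int(match.group(2)) if match.group(2) is not None else 1
    return processes, threads

def resources_required(spec):
    """Return process-count multiplied by threads.

    Returns None when either part is 0, because 0 asks for whatever
    is available rather than for a fixed amount.
    """
    processes, threads = parse_np(spec)
    if processes == 0 or threads == 0:
        return None
    return processes * threads
```

For instance, a value of 4x2 starts four processes but requires eight resources.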
Note - If a batch job calls MPI_Comm_spawn(3SunMPI) or MPI_Comm_spawn_multiple(3SunMPI), be sure to use the -nr option to reserve the additional resources. |
To run a program on the same node(s) as another program, use the -j option:
% mprun -j jid [ mprun-options ] program-name |
The jid argument is the program's job ID (described in Jobs).
Place additional mprun-options, if any, after the -j option. Here are two examples.
% mprun -j cre.85 a.out
% mprun -j cre.85 -Ns a.out
Both of the examples above run the program a.out on the same node as the program identified by the jid cre.85. The second example includes the -Ns option to disable process spawning (see How to Disable Process Spawning (-Ns)).
To enable a program that runs on a node with multiple CPUs to spawn processes, use the -Ys option:
% mprun -Ys program-name |
To limit the number of processes a program uses to one per node, use the -Ns option:
% mprun -Ns program-name |
The -Ns option prevents nodes that have multiple CPUs from spawning additional processes.
When you have more processes than available CPUs, use the -W option to wrap the processes:
% mprun -np process-count -W program-name |
Without the -W option, excess processes would make the job fail. The -W option assigns as many processes as required to each CPU, and executes the processes one at a time. (To include independent nodes in the wrap, use the -u option, described in How to Include Independent Nodes (-u).)
% mprun -p part2 -np 10 -W a.out |
If the partition part2 had six available CPUs and you specified 10 wrapped processes, CRE would distribute the processes among the CPUs according to load-balancing rules.
(The -S option, described below, provides a different solution to the same problem.)
When you have more processes than available CPUs, use the -S option to settle for the number of available CPUs.
% mprun -np process-count -S program-name |
Without the -S option, excess processes would make the job fail. The -S option assigns one process to each CPU, and when it runs out of CPUs, it ignores the remaining processes. (To assign the remaining processes to independent nodes, use the -u option, described below.)
% mprun -p part2 -np 10 -S a.out |
If the partition part2 had six available CPUs and you specified 10 processes with the -S option, CRE would assign one process to each of the six CPUs, and discard the remaining four processes.
(The -W option, described in How to Wrap Multiple Processes (-W), provides a different solution to the same problem.)
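The difference between wrapping and settling can be modeled in a few lines of Python. This is a sketch under stated assumptions: the function name place is hypothetical, and the even spread used for wrapping merely stands in for CRE's load-balancing rules, which are not described here.

```python
def place(process_count, cpus, mode=None):
    """Return the number of processes given to each CPU.

    mode is None (no option), "W" (wrap), or "S" (settle).
    A process_count of 0 means one process per available CPU.
    """
    if cpus <= 0:
        raise ValueError("at least one CPU is required")
    if process_count == 0:
        process_count = cpus
    if process_count <= cpus:
        return [1] * process_count + [0] * (cpus - process_count)
    if mode == "S":
        # Settle: one process per CPU, the excess is discarded.
        return [1] * cpus
    if mode == "W":
        # Wrap: every process runs; an even spread is assumed here.
        base, extra = divmod(process_count, cpus)
        return [base + 1 if i < extra else base for i in range(cpus)]
    raise RuntimeError("more processes than CPUs and neither -W nor -S given")
```

With ten processes and six CPUs, wrapping keeps all ten processes, while settling keeps six.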
When a partition does not have enough CPUs to handle all the processes of a job, and you select either the -S option or the -W option, you can use the -u option to assign the extra processes to independent nodes outside the partition:
% mprun -np process-count -W -u program-name
% mprun -np process-count -S -u program-name
To be eligible, an independent node must satisfy three requirements:
2. It cannot belong to another partition that is currently enabled.
3. It must be running the same version of the Solaris operating environment as the nodes in the partition. For the current release of Sun HPC ClusterTools software, this OS must be Solaris 8.
For example, assume the partition part2 had six available CPUs and the cluster had two independent nodes. If you specified 10 wrapped processes and added the -u option...
% mprun -p part2 -np 10 -W -u a.out |
... CRE would distribute the ten processes among the 8 CPUs, and use load-balancing rules to assign the remaining two processes.
If you specified 10 processes with the -S option and added the -u option...
% mprun -p part2 -np 10 -S -u a.out |
... CRE would assign one process to each of the six CPUs, one to each independent node, and discard the remaining two processes.
As described in How to Run as Multiple Processes (-np), you can request x processes, if as many as x processors are available, using the -np option. For example,
% mprun -np x a.out |
If you specify 0 as the number of processes, the runtime environment starts one process per available CPU.
However, if you combine the -np option with the -Ns option (assign one process per node) or the -W option (assign processes to the available nodes until the -np argument is satisfied),
If you assign to a node a number of processes that is greater than the number of CPUs on that node, the runtime environment complies with your request unless the value of total_max_procs prevents it.
Four primary mprun options affect rank placement: -l, -m, -Z, and -Zt. Four ancillary options also influence rank placement: -W, -S, -np, and -u. The following table summarizes an interaction matrix for these options:
To distribute processes among individual nodes, use the -l option following the -np option:
% mprun -np process-count -l rank-spec program-name |
The -np option (described in How to Run as Multiple Processes (-np)) specifies the number of processes the program uses.
The rank-specs specify how many processes go to each node. Be sure to enclose the set of rank-specs with one set of quotation marks, and use commas to separate them from each other:
The number of rank-specs you use must be a factor of the number of processes you specify with the -np option. For example:
% mprun -np 1 -l "node0" a.out
% mprun -np 2 -l "node0, node1" a.out
% mprun -np 4 -l "node0, node1, node2, node3" a.out
The examples above use one rank-spec for one process, two rank-specs for two processes, and four rank-specs for four processes. You cannot use three rank-specs with four processes, for instance, because four processes cannot be evenly distributed across three nodes.
Each rank-spec identifies one node and the number of processes that run on it:
The node-name can be a name or an IP address. The process-count argument is optional. If you omit it, as in the examples above, one process is assigned to each node. If you have more processes than nodes, you must include the process-count argument to indicate how many processes are assigned to each node. For example:
% mprun -np 2 -l "node0 2" a.out |
In the example above, the program runs with two processes on one node, node0, so you must indicate that both processes are assigned to node0.
In the following example, the program runs with four processes on two nodes, so you must indicate how those processes are assigned to the nodes. Three combinations are possible:
% mprun -np 4 -l "node0 2, node1 2" a.out
% mprun -np 4 -l "node0 1, node1 3" a.out
% mprun -np 4 -l "node0 3, node1 1" a.out
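To see how a set of rank-specs translates into a placement, consider this Python sketch. It only models the notation shown in the examples above. The function names are hypothetical, and the assumption that ranks are handed out in the order the rank-specs are listed is an illustration, not a statement about CRE internals.

```python
def parse_rank_specs(text):
    """Turn 'node0 2, node1 2' into [('node0', 2), ('node1', 2)].

    The process count is optional and defaults to 1.
    """
    specs = []
    for item in text.split(","):
        fields = item.split()
        if not fields:
            continue
        if len(fields) > 2:
            raise ValueError(f"unexpected rank-spec: {item!r}")
        count = int(fields[1]) if len(fields) == 2 else 1
        specs.append((fields[0], count))
    return specs

def ranks_to_nodes(text):
    """Expand the rank-specs into a list indexed by rank (assumed order)."""
    return [node for node, count in parse_rank_specs(text) for _ in range(count)]
```

Under that assumption, the rank-specs "node0 1, node1 3" place rank 0 on node0 and ranks 1 through 3 on node1.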
You can arrange a job's processes into blocks. The blocks of processes are then distributed among the nodes. The -Z option distributes the blocks among the available nodes using load balancing. In other words, two blocks may be assigned to the same node if that is the most efficient way to execute the job. To force each block to be assigned to a separate node instead, use the -Zt option. Use the -Z or -Zt option ahead of the -np option:
% mprun -Z block-count -np process-count program-name
% mprun -Zt block-count -np process-count program-name
% mprun -Z 2 -np 4 a.out
% mprun -Zt 2 -np 4 a.out
In the example above, the -Z option specifies two blocks. Because the total number of processes is four (-np 4), each block has two processes. They are distributed among available nodes as efficiently as possible. The -Zt option also creates two blocks, each with two processes, but they are distributed to two separate nodes.
% mprun -Z 3 -np 8 a.out
% mprun -Zt 3 -np 8 a.out
Both examples above create three blocks, two with three processes each, and one with two processes.
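The block sizes in these examples follow from dividing the processes as evenly as possible among the blocks. A minimal Python sketch of that arithmetic, using the hypothetical function name block_sizes and following the description above in which the option argument is the number of blocks:

```python
def block_sizes(process_count, block_count):
    """Split process_count into block_count blocks of nearly equal size.

    Larger blocks come first, so 8 processes in 3 blocks gives [3, 3, 2].
    """
    if block_count <= 0:
        raise ValueError("block_count must be positive")
    base, extra = divmod(process_count, block_count)
    return [base + 1] * extra + [base] * (block_count - extra)
```

The sketch says nothing about which node receives each block; that is the part the -Z and -Zt options decide differently.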
To distribute processes among nodes with a rank map file, use the -m option:
% mprun -np process-count -m rankmap-file program-name |
Use the -m rankmap-file option to assign processes to nodes as specified in the file rankmap-file. The rankmap in the file is specified as one or more nodenames, each followed optionally by the number of processes to assign to that node (in rank order); the default is one. The rankmap file can also accept IP addresses instead of nodenames.
Multiple nodenames (or IP addresses) may be separated by newlines; if multiple nodenames appear on the same line, they are separated by commas.
You can obtain the names and IP addresses of nodes using the -Nv option to the mpinfo command.
% mpinfo -Nv |
The rank map specified with the -m option will be rejected if any of the following conditions are true:
If the process-count used with the -np option is greater than the number of ranks specified in the rank map, you must use either -S (to settle for the available number of ranks in the rank map) or -W (to wrap the requested processes on the specified nodes). Otherwise your job will fail.
If the value specified in the -np option is less than the number of ranks specified in the rank map, the rank assignment will be limited to the value of -np.
If you use -np 0, the number of processes will be derived from the number of ranks described in the rank map.
A rank map file has this syntax:
rank-map file--> node-name [ , ]node-name [ , ]node-name [ , ] ...
A node-name can be a name or an IP address. Since commas can be used to separate node names in a file, you could simply place the contents of an inline rank map in a file. However, new-line characters (\n) are also recognized as separators in rank map files, so you will probably find it easier to list each node on its own line. For example:
mars 2
venus 2
jupiter 2
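The interaction between the -np value and the number of ranks in a rank map can be modeled as follows. This Python sketch is illustrative only: the function names are hypothetical, and the order in which wrapped processes revisit the nodes is an assumption.

```python
def read_rank_map(text):
    """Parse rank map text; entries are separated by newlines or commas."""
    entries = []
    for item in text.replace("\n", ",").split(","):
        fields = item.split()
        if not fields:
            continue
        count = int(fields[1]) if len(fields) > 1 else 1
        entries.append((fields[0], count))
    return entries

def assign_ranks(text, process_count, mode=None):
    """Return the node for each rank, given -np and an optional "S" or "W"."""
    slots = [node for node, count in read_rank_map(text) for _ in range(count)]
    if not slots:
        raise ValueError("the rank map is empty")
    if process_count == 0:
        return slots                     # -np 0: use every rank in the map
    if process_count <= len(slots):
        return slots[:process_count]     # fewer processes: limited to -np
    if mode == "S":
        return slots                     # settle for the ranks in the map
    if mode == "W":
        # Wrap: reuse the map from the start (assumed order).
        return [slots[rank % len(slots)] for rank in range(process_count)]
    raise RuntimeError("-np exceeds the rank map and neither -S nor -W given")
```

With the three-line rank map shown above, -np 0 yields six ranks, two on each node.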
How to Reserve Resources For Spawning or Multithreading (-nr) |
This syntax reserves a number of resources equal to numprocs x threads. These resources are held in reserve over and above the number of resources specified by the -np option. Use this option when the batch job contains calls to MPI_Comm_spawn(3SunMPI) or MPI_Comm_spawn_multiple(3SunMPI). Specify a number of resources equal to or greater than the total number of processes that will be spawned. For example,
% mprun -nr numprocs [ x threads ]... |
In a multithreaded environment, use the xthreads syntax to specify the number of threads per process. The syntax -nr numprocs is equivalent to the syntax -nr numprocsx1. The default is -nr 0x1.
A threads setting of 0 allocates the processes among all available reserved resources. It is equivalent to the -Ns option.
To distribute processes among nodes by resource requirement, use the -R option:
% mprun -np process-count -R resource-requirement-spec program-name |
The processes are distributed among the nodes that satisfy the criteria in the resource requirement spec (RRS).
The RRS accommodates computing requirements that are more complex than those accepted by rank maps. It has this syntax:
RRS --> "resource-requirement [& | | resource-requirement ]..."
The & symbol is a logical AND operation. In other words, a node must satisfy all the criteria in the spec. The | symbol is a logical OR operation. A node must satisfy either of the criteria in the spec. Use them alone or in combination:
resource-requirement & resource-requirement
resource-requirement | resource-requirement
Each individual resource-requirement has this syntax:
resource-requirement --> resource [ operator value ]
The resource argument identifies the resource whose requirement is specified. For a list of resources, see TABLE 4-2.
The operator argument is an arithmetic or logical symbol such as = or > that indicates the relationship between the resource and its value. For example:
"name=node0"
In the example above, the processes are distributed to a node whose name resource is equal to node0. For a list of operators, see TABLE 4-3.
The value argument is simply the value of the resource that must be met. Although the operator and value are optional, they are used in the great majority of cases.
The runtime environment parses the attribute settings in the order in which they are listed in the RRS, along with other options you specify. It then merges these results with the results of an internally specified RRS that controls load-balancing.
The result is an ordered list of CPUs that meet your requirements. If a job uses only one process, the process is sent to the first CPU on the list. If a job uses n processes, they are distributed among the first n CPUs, wrapping if necessary.
Note - Unless -Ns is specified, the RRS specifies node resources but generates a list of CPUs. If -Ns is specified, the list refers only to nodes. |
TABLE 4-2 lists the predefined resources you can use. Your system administrator may have defined additional resources for your particular cluster. To display them, use the mpinfo command described in Chapter 9.
Maximum number of processes allowed on the node, including cluster daemons. |
The operators have the following precedence, from strongest to weakest:
unary -
*, /
+, binary -
=, !=, >=, <=, >, <, <<, >>
!
&, |
?
Here are some examples of resource requirement specifiers in use.
% mprun -R "name = hpc-demo" a.out
% mpinfo -N -R "partition.name=part1"
% mprun -R "load5 < 4" a.out
The last example specifies that you only want nodes whose individual load averages over the previous five minutes were less than four.
When the value of an attribute contains a floating point number or a string decimal number, you must enclose the number in single quotes. For example:
% mpinfo -R "os_release='5.8'" |
Attributes that use either << or >> take no value. For example:
% mprun -R "mem_total>>" a.out |
The example above specifies that you prefer nodes with the largest physical memory available.
If you use the << or >> operator, CRE does not provide load-balancing. In the previous example, CRE would choose the node with the most total physical memory, regardless of its load. If you use << or >> more than once, only the last use has any effect; it overrides the previous uses. For example:
% mprun -R "mem_free>> swap_free>>" a.out |
The example above initially selects the nodes that have the most free memory, but then selects nodes that have the largest amount of available swap space. The second selection may yield a different set of nodes than were selected initially.
You can also use arithmetic expressions for numeric attributes anywhere. For example:
% mprun -R "load1 / load5 < 2" a.out |
specifies that the ratio between the one-minute load average and the five-minute load average must be less than two. In other words, the load average on the node must not be growing too fast.
You can use standard arithmetic operators as well as the C conditional operator.
Boolean attributes are either true or false. If you want the attribute to be true, simply list the attribute in the RRS. For example, if your system administrator has defined an attribute called ionode, you can request a node with that attribute:
% mprun -R "ionode" a.out |
If you want the attribute to be false (that is, you do not want a resource with that attribute), precede the attribute's name with !. (Precede this with a backslash in the C shell; the backslash is an escape character to prevent the shell from interpreting the exclamation point as a "history" escape.) For example:
% mprun -R "\!ionode" a.out |
% mprun -R "mem_free > 256" a.out |
The example above specifies that the node must have over 256 Mbytes of available RAM.
% mprun -R "swap_free >>" a.out |
The example above specifies that the node picked must have the highest available swap space.
The following example specifies that the program must run on a node in the partition with 512 Mbytes of memory:
% mprun -p part2 -R "mem_total=512" a.out |
The following example specifies that you want to run on any of the three nodes listed:
% mprun -R "name=node1 | name=node2 | name=node3" a.out |
The following example chooses nodes with over 300 Mbytes of free swap space. Of these nodes, it then chooses the one with the most total physical memory:
% mprun -R "swap_free > 300 & mem_total>>" a.out |
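The two-stage selection in this last example (filter first, then prefer the largest value) can be pictured with a short Python sketch. The function name choose_node, the node names, and the attribute values are all hypothetical; the sketch models the description above and is not how CRE evaluates an RRS internally.

```python
def choose_node(nodes, requirement, prefer):
    """Keep the nodes that meet the requirement, then pick the one that
    maximizes the preferred attribute. Returns None if no node qualifies."""
    eligible = [node for node in nodes if requirement(node)]
    if not eligible:
        return None
    return max(eligible, key=prefer)

# Hypothetical cluster, values in Mbytes.
nodes = [
    {"name": "node1", "swap_free": 250, "mem_total": 2048},
    {"name": "node2", "swap_free": 400, "mem_total": 512},
    {"name": "node3", "swap_free": 900, "mem_total": 1024},
]

# Model of "swap_free > 300 & mem_total>>"
best = choose_node(
    nodes,
    requirement=lambda node: node["swap_free"] > 300,
    prefer=lambda node: node["mem_total"],
)
print(best["name"])  # node3
```

Note that node1 has the most memory overall but is excluded by the swap requirement before the preference is applied.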
The following example assumes that your system administrator has defined an attribute called framebuffer, which is set (TRUE) on any node that has a frame buffer attached to it. You could then request such a node via this command:
% mprun -R "framebuffer" a.out |
By default, mprun handles standard output and standard error the way rsh does: the output and error streams are merged and are displayed on your terminal screen. Note that this behavior is slightly different from the standard Solaris behavior when you are not executing remotely; in that case, the stdout and stderr streams are separate. You can obtain this behavior with mprun via the -D option.
Likewise, the mprun standard input (stdin) is sent to the standard input of all the processes.
You can redirect the mprun standard input, output, and error using the standard shell syntax. For example,
% mprun -np 4 echo hello > hellos |
You also can change what happens to the standard input, output, and error of each process in the job. For example,
% mprun echo hello > message |
The example above sends hello across the network from the echo process to the mprun process, which writes it to a file called message.
The set of mprun options that control stdio handling cannot be combined. These options override one another. If more than one is given on a command line, the last one overrides all of the rest. The relevant options are: -D, -N, -B, -n, -i, -o, and -I.
To redirect a job's stdout and stderr to those of the mprun command, use the -D option:
% mprun -D program-name |
You can merge the standard output and standard error streams from each process and direct them to individual files by using the -B option.
% mprun -B program-name |
The -B option writes one file for each process. The filename has this nomenclature:
out.jid.rank
The jid is the program's job ID. The rank is the rank of the process. The files are stored in the job's working directory.
To shut off all standard I/O to all processes, use the -N option:
% mprun -N program-name |
This option closes all stdin, stdout, and stderr connections for the job. Use it, for instance, to avoid the overhead of establishing standard I/O connections for each remote process and then closing those connections as each process ends.
By default, mprun passes the vector of a program's command-line arguments to the program in the standard way. In cluster-level programming, it is sometimes useful to specify a first argument that is not the name of the program. You can use the -A option to do this.
% mprun -A program-name argument... |
The argument to -A is the name of the program to be executed. After the program name you can add the argument of your choice. For example, if you issue the command:
% mprun a.out arg1 arg2 |
mprun passes an array in which the name of the program, a.out, is the first element and arg1 and arg2 are the second and third elements. Or, to pass newarg as the first argument to the program a.out, along with arg1 and arg2, you could issue the command:
% mprun -A a.out newarg arg1 arg2 |
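The effect of -A on the argument vector can be summarized with a small Python model. The function name build_exec is hypothetical, and the sketch reflects only the behavior described in the two examples above.

```python
def build_exec(tokens, use_A=False):
    """Return (program, argv) as the started process would see them.

    Without -A, the program name doubles as the first element of argv.
    With -A, the first token names the program to execute and the
    remaining tokens form argv, so argv[0] can differ from the program.
    """
    if not tokens:
        raise ValueError("a program name is required")
    program = tokens[0]
    if use_A:
        if len(tokens) < 2:
            raise ValueError("-A needs at least one argument after the program")
        return program, list(tokens[1:])
    return program, list(tokens)
```

Python's own os.execv(path, args) separates the executable path from the argument list in the same way.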
To read stdin from /dev/null, use the -n option:
% mprun -n program-name |
Reading input from /dev/null can be useful when running mprun in the background, either directly or through a script. Without -n, mprun would block in this situation, even if no reads were posted by the remote job. With -n, the user process encounters an EOF if it attempts to read from stdin. This behavior is similar to the behavior of the -n option to rsh.
To redirect output with a custom configuration, use the -I option:
% mprun -I custom-configuration program-name |
A custom configuration tells the runtime environment how to handle each job's I/O streams (standard input, output, and error). It has this syntax:
Each file-descriptor provides handling instructions for one stream. It has this syntax:
file-descriptor --> stream-number attribute
Quotation marks are optional. You can place the file-descriptors in any order. A custom configuration can include a file-descriptor for each stream associated with a job; if any file-descriptor is omitted, its stream is not connected to any device.
If you include strings to redirect both standard output and standard error, you must also redirect standard input. If the job has no standard input, you can redirect file descriptor 0 to /dev/null.
The stream identifies the input, output, or error stream. The standard I/O streams are assigned these numbers: 0 for standard input, 1 for standard output, and 2 for standard error.
The handling instructions for each stream are specified by the attribute.
You must specify either r or w for each file descriptor -- that is, whether the file descriptor is to be written to or read from. Thus, the string
5w
means that the stream associated with file descriptor 5 is to be written. And
0rp
means that the standard input is to be read from the pseudo-terminal.
If you use the p (pty) attribute, you must have one rp and one wp in the complete series of file descriptor strings. In other words, you must specify both reading from and writing to the pty. No other attributes can be associated with rp and wp.
For example, you can make each process send its standard output or standard error to a file on its own node. In the following example, each node will write hello to a local file called message:
% mprun -I "1w=message" echo hello |
Use the l attribute in combination with the w attribute to line-buffer the output of multiple processes. This takes care of the situation in which output from one process arrives in the middle of output from another process. For example:
% mprun -np 2 echo "Hello"
HelHello
lo
With the l attribute, you ensure that processes do not intrude on each other's output. The following example shows how using the l attribute could prevent the problem illustrated in the previous example:
% mprun -np 2 -I "0r, 1wl" echo "Hello"
[Return]
Hello
Hello
Be sure to press the Return or Enter key to begin the output.
Use the t attribute in place of l to force line-buffering and, additionally, to prefix each line with the rank of the process producing the output. For example:
% mprun -np 2 -I "0r, 1wt" echo "Hello"
[Return]
r0:Hello
r1:Hello
As with the l attribute, be sure to press the Return or Enter key to begin the output.
The b attribute is input-related and thus can be used only in combination with r. In multiprocess jobs, the b attribute specifies that input is to go only to the first process, rather than to all processes, which is the default behavior.
The m attribute pertains to reading from a pseudo-terminal and thus can be used only with rp. The m attribute in combination with rp causes keystrokes to be echoed multiple times when multiple processes are running. The default is to display multiple keystrokes only once.
You can direct one file descriptor's output to the same location as that specified by another file descriptor by using the syntax:
fd attr=@other_fd
For example, 2w=@1 means that the standard error is to be sent wherever the standard output is going. You cannot do this for a file descriptor string that uses the p attribute.
If the behavior of the second file descriptor in this syntax is changed later in the -I argument list, the change does not affect the earlier reference to the file descriptor. That is, the -I argument list is parsed from left to right.
You can tie a file descriptor's output to a file by using the syntax
fd attr=filename
For example, 10w=output means that the stream associated with file descriptor 10 is to be written to the file output. Once again, however, you cannot use this feature for a file descriptor defined with the p attribute.
In the following example, the standard input is read from the pty, the standard output is written to the pty, and the standard error is sent to the file named errors:
% mprun -I "0rp,1wp,2w=errors" a.out |
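To make the structure of a custom configuration concrete, the following Python sketch breaks an -I string into its file descriptor strings and checks the rules stated above (exactly one of r or w, no target on a pty stream, rp and wp used together). The function name parse_io_config and the returned dictionary layout are hypothetical; this is not the parser CRE uses.

```python
import re

# stream number, attribute letters, optional "=target"
_ITEM = re.compile(r"(\d+)([a-z]+)(?:=(.+))?")

def parse_io_config(config):
    """Parse an -I style string into {stream number: {"attrs", "target"}}."""
    streams = {}
    for item in config.split(","):       # items are read left to right
        item = item.strip()
        if not item:
            continue
        match = _ITEM.fullmatch(item)
        if match is None:
            raise ValueError(f"malformed file descriptor string: {item!r}")
        number, attrs, target = match.groups()
        if ("r" in attrs) == ("w" in attrs):
            raise ValueError(f"{item!r} needs exactly one of r or w")
        if "p" in attrs and target is not None:
            raise ValueError(f"{item!r}: a pty stream cannot be tied to a target")
        streams[int(number)] = {"attrs": attrs, "target": target}

    # Collect the direction (r or w) of every stream that uses the pty.
    pty_modes = {"r" if "r" in s["attrs"] else "w"
                 for s in streams.values() if "p" in s["attrs"]}
    if pty_modes and pty_modes != {"r", "w"}:
        raise ValueError("the p attribute requires both an rp and a wp stream")
    return streams
```

Applied to "0rp,1wp,2w=errors", the sketch reports streams 0 and 1 on the pty and stream 2 written to the target errors.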
If you use the w attribute without specifying a file, the file descriptor's output is written to the corresponding output stream of the parent process; the parent process is typically a shell, so the output is typically written to the user's terminal.
For multiprocess jobs, each process creates its own file; the file is opened on the node on which the process runs.
Note - If output is redirected such that multiple processes open the same file over NFS, the processes will overwrite each other's output. |
In specifying the individual file names for processes, you can use the following symbols:
The symbols will be replaced by the actual values. For example, assuming the job ID is 15, this file descriptor string
1w=myfile.&J.&R
redirects standard output from a multiprocess job to a series of files named myfile.15.0, myfile.15.1, myfile.15.2, and so on, one file for each rank of the job.
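The substitution can be pictured with a one-line Python function. The name expand_name is hypothetical, and the sketch handles only the two symbols that appear in the example above, &J and &R.

```python
def expand_name(template, job_id, rank):
    """Replace &J with the job ID and &R with the rank of the process."""
    return template.replace("&J", str(job_id)).replace("&R", str(rank))

# File names for the first three ranks of job 15.
names = [expand_name("myfile.&J.&R", 15, rank) for rank in range(3)]
print(names)  # ['myfile.15.0', 'myfile.15.1', 'myfile.15.2']
```

Because each rank produces a distinct name, no two processes of the job write to the same file.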
In the following example, there is no standard input (it comes from /dev/null), and the standard output and standard error are written to the files out.job.rank:
% mprun -I "0r=/dev/null,1w=out.&J.&R,2w=@1" a.out |
This is the behavior of the -B option. Note the inclusion in this example of a file descriptor string for standard input even though the job has none. This is required because both standard output and standard error are redirected.
By default, the maximum number of file descriptors that a process can have open is 1024. This is because CRE enforces only the hard limit for file descriptors and ignores any file descriptor soft limit that may be set.
Note - CRE enforces soft limits for all other kernel parameters. |
The default, per-process limit of 1024 file descriptors is likely to be more than enough for all but the most extreme MPI job execution requirements. You can, however, easily accommodate exceptional file descriptor demands by taking the following steps:
For example, to increase the file descriptor hard limit to 2048, add the following line to the /etc/system file on each node in the cluster:
set rlim_fd_max=2048 |
You can also increase the file descriptor hard limit in a 32-bit Solaris 8 environment. However, this approach is not recommended because the 32-bit environment has a kernel-level limit of 1024. Consequently, you would also have to define the C pre-processor symbol FD_SETSIZE in your application to be at least as large as the new rlim_fd_max value, and then recompile/relink the application.
The default I/O behavior of mprun (merged standard error and standard output) is equivalent to:
% mprun -I "0rp,1wp,2w=@1" a.out |
The -D option provides separate standard output and standard error streams; it is equivalent to:
% mprun -I "0rp,1wp,2w" a.out |
You can use the -o option to force each line of output to be prepended with the rank of the process writing it. This is equivalent to:
% mprun -I "0rp,1wt,2w=@1" a.out |
If you redirect output to a shared file, you must use standard shell redirection rather than the equivalent -I formulation (-I "1wt=outfile"). The same restriction also applies to the linebuffer formulation (-I "1wt=outfile").
For example, the following command line concatenates the outputs of the individual processes of a job and writes them to outfile.dat:
% mprun -np 4 myprogram > outfile.dat |
The following command line concatenates the outputs of the individual processes and appends them to the previous content of the output file:
% mprun -np 4 myprogram >> outfile.dat |
The following table describes three mprun command-line options that provide the same control over standard I/O as some -I constructs, but are much simpler to express. Their -I equivalents are also shown.
Use the -i option to mprun with caution, since the -i option provides only one stdin connection (to rank 0). If that connection is closed, keyboard signals are no longer forwarded to those remote processes. To signal the job, you must go to another window and issue the mpkill command. For example, if you issue the command mprun -np 2 -i cat and then type the Ctrl-d character (which causes cat to close its stdin and exit), rank 0 will exit. However, rank 1 is still running, and can no longer be signaled from the keyboard.
These shortcuts are not exact substitutions. CRE uses ptys correctly, whether the -I option is present or absent. Also, CRE merges standard error with standard output when it is appropriate. If either stderr or stdout is redirected (but not both), ptys are not used and stderr and stdout are separated. If both stderr and stdout are redirected, ptys are still not used, but stderr and stdout are combined.
To perform actions that are shell specific, such as executing compound commands, invoke the appropriate shell as part of the mprun command:
% mprun shell-command shell-options |
% mprun csh -c 'echo $USER'
% mprun csh -c 'cd /foo ; bar'
To move either a process started with mprun or a script that issues mprun commands to the background, redirect stdin from /dev/null, like this:
% mprun < /dev/null |
You can also use the -n option to mprun so that standard input is read from /dev/null. See How to Read Standard Input From /dev/null (-n).
% mprun -n |
When mprun stops, whether via Control-Z or in terminal output, the job under control of mprun is stopped.
Use the -C option to specify the path of an alternative working directory to be used by the processes spawned when you run your program:
% mprun -C working-directory program-name |
Setting a path with -C does not affect where the runtime environment looks for executables. If you do not specify -C, the default is the current working directory. For example:
% mprun -C /home/collins/bin a.out |
The syntax above changes the working directory for a.out to /home/collins/bin.
To start a program with a different user name or ID, use the -U option:
% mprun -U username program-name
% mprun -U userid program-name
If you are not the user identified by username, you must have superuser privileges.
To start a program with a different group name or ID, use the -G option:
% mprun -G group-name program-name
% mprun -G groupid program-name
You must belong to the group you use, or be the superuser.
For accounting purposes, any job you run is part of your current project. You can set a default project by changing the value of the variable SUNHPC_PROJECT. That value overrides your current project. However, you can override both values by adding the -P option to the mprun command:
% mprun -P project-name |
Use this syntax to specify verbose output. For example,
% mprun -v |
To display a list of mprun options, use the -h option (alone):
% mprun -h
To display the command's version number, use the -V (upper case) option (alone):
% mprun -V |
To display information about the job after it finishes executing, add the -J option to the command:
% mprun options -J program-name |
In this example, the job ID (jid), cluster name, and number of processes are displayed after the job finishes executing:
% mprun -np 4 -J a.out |
To store the job name in a user-specified file for later access, use the -d option:
% mprun options -d output-file hostname |
To precede each output line with the number of the rank that wrote it, use the -o option:
% mprun options -o program-name |
Copyright © 2003, Sun Microsystems, Inc. All rights reserved.