CHAPTER 4
Using the CLI Tools on Sun HPC Cluster Nodes
This chapter explains how to use the Sun HPC ClusterTools software installation utilities: ctinstall, ctact, ctdeact, ctremove, ctstartd, and ctstopd.
If you are installing the software in an NFS cluster configuration, see Chapter 5 for instructions on setting up and installing software on NFS servers.
When you want to activate Sun HPC ClusterTools software in an NFS configuration, you must ensure that the activation tool is able to locate the software mount point. You can do this in either of the following ways:
Change directory to the installation utilities under the software mount point:
mount_point/*/SUNWhpc/HPC5.0/bin/Install_Utilities/bin
Or mount the exported software directory on the node before activation:
# mount server:/export/apps mount_point
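For example, a manual mount followed by a change to the activation tool directory might look like the following sketch. The export path server:/export/apps comes from the example above; the mount point /mnt/hpc is purely illustrative.
# mkdir -p /mnt/hpc
# mount server:/export/apps /mnt/hpc
# cd /mnt/hpc/*/SUNWhpc/HPC5.0/bin/Install_Utilities/bin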
The steps below are common to any operation in which you would use CLI commands.
1. Load a CD-ROM containing the Sun HPC ClusterTools software in each cluster node. If using a central host for command initiation, do this on the central host as well.
2. Log in as superuser. If you are using a central command initiation host, do this step on the central host. If operating in direct local mode, log in as superuser on the cluster node.
3. If the Sun HPC ClusterTools software has not already been installed, change directory to distribution/hpc/Product/Install_Utilities/bin, where distribution is the location of the Sun HPC ClusterTools software CD-ROM. Otherwise, go to Step 4.
4. If the software was previously installed and you intend to perform other tasks, such as activation, deactivation, or removal, change directory to $INSTALL_LOC/SUNWhpc/HPC5.0/bin/Install_Utilities/bin, where $INSTALL_LOC is the location where the software was installed.
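For illustration only, if the CD-ROM is automounted at /cdrom/cdrom0 and the software was installed under /opt (both locations are assumptions; substitute your own paths), Steps 3 and 4 might look like this:
# cd /cdrom/cdrom0/hpc/Product/Install_Utilities/bin
or, for previously installed software:
# cd /opt/SUNWhpc/HPC5.0/bin/Install_Utilities/bin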
You can now start using the CLI commands. They are described separately below, with examples of common applications given for each.
For usage information on any command, enter the command either without options or with the -h option:
# ./command
# ./command -h
Use the ctinstall command to install Sun HPC ClusterTools software on cluster nodes. See TABLE 4-1 for a summary of the ctinstall options. The sections that follow explain their use in centralized and local installations, in both NFS and non-NFS configurations.
This section shows examples of software installations in which the ctinstall command is initiated from a central host in a non-NFS configuration.
# ./ctinstall -n node1,node2 -r rsh
CODE EXAMPLE 4-1 installs the full Sun HPC ClusterTools software suite (root and non-root packages) on node1 and node2 from a central host. The node list is specified on the command line. The remote connection method is rsh. This requires a trusted hosts setup.
The software will not be ready for use when the installation process completes. It must be activated by hand before it can be used.
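For instance, a later manual activation from the central host could reuse the same node list and connection method. The ctact options shown here are described later in this chapter, and the choice of node2 as master node is only an example:
# ./ctact -n node1,node2 -r rsh -m node2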
# ./ctinstall -n node1,node2 -r ssh
CODE EXAMPLE 4-2 is the same as CODE EXAMPLE 4-1, except that the remote connection method is ssh. This method requires that the initiating node be able to log in as superuser to the target nodes without being prompted for any interaction, such as a password.
# ./ctinstall -N /tmp/nodelist -r telnet
CODE EXAMPLE 4-3 installs the full Sun HPC ClusterTools software suite (root and non-root packages) on the set of nodes listed in the file /tmp/nodelist from a central host. A node list file is particularly useful when you have a large set of nodes or you want to run operations on the same set of nodes repeatedly.
The node list file has the following contents:
# Node list for CODE EXAMPLE 4-3
node1
node2
The remote connection method is telnet. All cluster nodes must share the same password. If some nodes do not use the same password as others, install the software in groups, each group consisting of nodes that use a common password.
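As a sketch of such a grouped installation, suppose node1 and node2 share one password and node3 and node4 share another; the two node list file names below are hypothetical:
# ./ctinstall -N /tmp/nodelist-groupA -r telnet
# ./ctinstall -N /tmp/nodelist-groupB -r telnet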
The software will not be ready for use when the installation process completes. It must be activated by hand before it can be used.
# ./ctinstall -N /tmp/nodelist -r telnet -k /tmp/cluster-logs -g
CODE EXAMPLE 4-4 is the same as CODE EXAMPLE 4-3, except it includes the -k and -g options.
In this example, the -k option causes the local log files of all specified nodes to be saved in /tmp/cluster-logs on the central host.
The -g option causes a pair of node list files to be created on the central host in /var/sadm/system/logs/hpc/nodelists. One file, ctinstall.pass$$, contains a list of the nodes on which the installation was successful. The other file, ctinstall.fail$$, lists the nodes on which the installation was unsuccessful. The $$ symbol is replaced by the process number associated with the installation.
These generated node list files can then be used for command retries or in subsequent operations using the -N switch.
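For example, a retry limited to the nodes that failed might look like the following, where 12345 stands in for the actual process number appended to the file name:
# ./ctinstall -N /var/sadm/system/logs/hpc/nodelists/ctinstall.fail12345 -r telnet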
# ./ctinstall -N /tmp/nodelist -r telnet -p SUNWcremn,SUNWmpimn
CODE EXAMPLE 4-5 installs the packages SUNWcremn and SUNWmpimn on the set of nodes listed in the file /tmp/nodelist. No other packages are installed. The remote connection method is telnet.
The -p option can be useful if individual packages were not installed on the nodes by ctinstall.
# ./ctinstall -N /tmp/nodelist -r rsh -a -m node2
CODE EXAMPLE 4-6 installs the full Sun HPC ClusterTools software suite (root and non-root packages) on the nodes listed in the file /tmp/nodelist. The remote connection method is rsh.
The software will be activated automatically as soon as the installation is complete. Because activation is automatic, a master node must be specified for the cluster in advance. This is node2 in CODE EXAMPLE 4-6. If a master node is not specified, an error message is displayed.
This section shows examples of software installations in which the ctinstall command is initiated from a central host in an NFS configuration.
# ./ctinstall -c -n node1,node2 -r rsh
CODE EXAMPLE 4-7 is the same as CODE EXAMPLE 4-1, except that node1 and node2 are NFS client nodes. The -c option causes only root packages to be installed on these nodes. If the NFS server is to be used as a cluster node, run this command on it as well.
Use ctnfssvr to set up the NFS server and install the non-root packages on it.
# ./ctinstall -c -n node1,node2 -r rsh -a -m node2
CODE EXAMPLE 4-8 is the same as CODE EXAMPLE 4-7, except it includes the options -a and -m, which cause the software to be activated automatically and specify the cluster's master node, respectively.
Note - Since this command will activate the software on NFS client nodes as soon as the installation completes, the NFS server must be properly installed and enabled before this operation is performed. See Chapter 5 for details on NFS server setup operations.
This section shows examples of software installations in which the ctinstall command is initiated on the local node in non-NFS configurations.
Note - The options -g, -k, -n, -N, -r, and -S are incompatible with local (non-centralized) installations. If the -l option is used with any of these options, an error message is displayed.
# ./ctinstall -l
CODE EXAMPLE 4-9 installs the full Sun HPC ClusterTools software suite (root and non-root packages) on the local node only.
# ./ctinstall -l -p SUNWcremn,SUNWmpimn
CODE EXAMPLE 4-10 installs the packages SUNWcremn and SUNWmpimn on the local node.
# ./ctinstall -l -a -m node2
CODE EXAMPLE 4-11 installs the full Sun HPC ClusterTools software suite (root and non-root packages) on the local node and causes it to be activated as soon as the installation is complete. It also specifies the cluster master node as node2.
Note - The local node needs to be told which cluster node is the master node.
This section shows examples of software installations in which the ctinstall command is initiated on the local node in NFS configurations.
# ./ctinstall -c -l
CODE EXAMPLE 4-12 installs the Sun HPC ClusterTools software root packages on the local node.
# ./ctinstall -c -l -a -m node2
CODE EXAMPLE 4-13 is the same as CODE EXAMPLE 4-12, except the software is activated as soon as the installation completes. The NFS server must be installed and enabled before this step can be taken.
Use the ctact command to activate Sun HPC ClusterTools software on cluster nodes. See TABLE 4-2 for a summary of the ctact options.
This section shows examples of software activation in which the ctact command is initiated from a central host.
# ./ctact -n node1,node2 -r rsh -m node2
CODE EXAMPLE 4-14 activates the software on node1 and node2 and specifies node2 as the master node. The remote connection method is rsh.
# ./ctact -n node1,node2 -r rsh -m node2 -k /tmp/cluster-logs -g
CODE EXAMPLE 4-15 is the same as CODE EXAMPLE 4-14, except it specifies the options -k and -g.
In this example, the -k option causes the local log files of all specified nodes to be saved in /tmp/cluster-logs on the central host.
Note - Specify a directory that is local to the central host rather than an NFS-mounted directory. This will avoid unnecessary network traffic and will result in faster execution of the operation.
The -g option causes files ctact.pass$$ and ctact.fail$$ to be created on the central host in /var/sadm/system/logs/hpc/nodelists. ctact.pass$$ lists the cluster nodes on which software activation was successful and ctact.fail$$ lists the nodes on which activation was unsuccessful. The $$ symbol is replaced by the process number associated with the activation.
These generated node list files can then be used for command retries or in subsequent operations using the -N switch.
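For example, a retry of the failed activations might look like this, with 12345 standing in for the actual process number:
# ./ctact -N /var/sadm/system/logs/hpc/nodelists/ctact.fail12345 -r rsh -m node2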
This section shows an example of software activation on the local node.
# ./ctact -l -m node2
CODE EXAMPLE 4-16 activates the software on the local node and specifies node2 as the master node.
Use the ctdeact command to deactivate Sun HPC ClusterTools software on cluster nodes. See TABLE 4-3 for a summary of the ctdeact options.
This section shows examples of software deactivation in which the ctdeact command is initiated from a central host.
# ./ctdeact -N /tmp/nodelist -r rsh
CODE EXAMPLE 4-17 deactivates the software on the nodes listed in /tmp/nodelist. The remote connection method is rsh.
# ./ctdeact -N /tmp/nodelist -r rsh -k /tmp/cluster-logs -g
CODE EXAMPLE 4-18 is the same as CODE EXAMPLE 4-17, except it specifies the options -k and -g.
In this example, the -k option causes the local log files of all specified nodes to be saved in /tmp/cluster-logs on the central host.
The -g option causes files ctdeact.pass$$ and ctdeact.fail$$ to be created on the central host. ctdeact.pass$$ lists the cluster nodes where software deactivation was successful. ctdeact.fail$$ lists the nodes where deactivation was unsuccessful. The $$ symbol is replaced by the process number associated with the software deactivation.
These generated node list files can then be used for command retries or in subsequent operations using the -N switch.
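For example, a retry of the failed deactivations might look like this, with 12345 standing in for the actual process number:
# ./ctdeact -N /var/sadm/system/logs/hpc/nodelists/ctdeact.fail12345 -r rsh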
This section shows software deactivation on the local node.
# ./ctdeact -l
CODE EXAMPLE 4-19 deactivates the software on the local node.
Use the ctremove command to remove Sun HPC ClusterTools software from cluster nodes. See TABLE 4-4 for a summary of the ctremove options.
Note - If the nodes are active at the time ctremove is initiated, they will be deactivated automatically before the removal process begins.
This section shows examples of software removal in which the ctremove command is initiated from a central host.
# ./ctremove -N /tmp/nodelist -r rsh
CODE EXAMPLE 4-20 removes the software from the nodes listed in /tmp/nodelist. The remote connection method is rsh.
# ./ctremove -N /tmp/nodelist -r rsh -k /tmp/cluster-logs -g
CODE EXAMPLE 4-21 is the same as CODE EXAMPLE 4-20, except it specifies the options -k and -g.
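Assuming ctremove follows the same pass and fail node list convention as the other commands (ctremove.pass$$ and ctremove.fail$$ in /var/sadm/system/logs/hpc/nodelists), a retry of the failed removals might look like this, with 12345 standing in for the process number:
# ./ctremove -N /var/sadm/system/logs/hpc/nodelists/ctremove.fail12345 -r rsh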
# ./ctremove -N /tmp/nodelist -r rsh -p SUNWcremn,SUNWmpimn
CODE EXAMPLE 4-22 removes the packages SUNWcremn and SUNWmpimn from the nodes listed in /tmp/nodelist. The remote connection method is rsh.
This section shows software removal from the local node.
# ./ctremove -l
CODE EXAMPLE 4-23 removes the software from the local node.
# ./ctremove -l -p SUNWcremn,SUNWmpimn
CODE EXAMPLE 4-24 removes the packages SUNWcremn and SUNWmpimn from the local node.
Use the ctstartd command to start all Sun HPC ClusterTools software daemons on the cluster nodes. Once the Sun HPC ClusterTools 5 software is activated, ctstartd is available in /opt/SUNWhpc/sbin.
See TABLE 4-5 for a summary of the ctstartd options.
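Because ctstartd is installed in /opt/SUNWhpc/sbin once the software is activated, the examples below assume you have first changed to that directory (or placed it on your PATH); for example:
# cd /opt/SUNWhpc/sbin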
This section shows how to start Sun HPC ClusterTools software daemons from a central host.
# ./ctstartd -N /tmp/nodelist -r rsh
CODE EXAMPLE 4-25 starts the Sun HPC ClusterTools software daemons on the nodes listed in /tmp/nodelist. The remote connection method is rsh.
# ./ctstartd -N /tmp/nodelist -r rsh -k /tmp/cluster-logs -g
CODE EXAMPLE 4-26 is the same as CODE EXAMPLE 4-25, except it specifies the options -k and -g to gather log information centrally and to generate pass and fail node lists.
# ./ctstartd -l
CODE EXAMPLE 4-27 starts the Sun HPC ClusterTools software daemons on the local node.
Use the ctstopd command to stop all Sun HPC ClusterTools software daemons on the cluster nodes. Once the Sun HPC ClusterTools 5 software is activated, ctstopd is available in /opt/SUNWhpc/sbin.
See TABLE 4-6 for a summary of the ctstopd options.
This section shows how to stop Sun HPC ClusterTools software daemons from a central host.
# ./ctstopd -N /tmp/nodelist -r rsh
CODE EXAMPLE 4-28 stops the Sun HPC ClusterTools software daemons on the nodes listed in /tmp/nodelist. The remote connection method is rsh.
# ./ctstopd -N /tmp/nodelist -r rsh -k /tmp/cluster-logs -g
CODE EXAMPLE 4-29 is the same as CODE EXAMPLE 4-28, except it specifies the options -k and -g to gather log information centrally and to generate pass and fail node lists.
# ./ctstopd -l
CODE EXAMPLE 4-30 stops the Sun HPC ClusterTools software daemons on the local node.