CHAPTER 3

Administering Your System

You administer your system using the distributed management card command-line interface (CLI), and through the MOH and PMS applications.

The distributed management card CLI works with the MOH and PMS applications, and supports Simple Network Management Protocol (SNMP) and Remote Method Invocation (RMI) interfaces. MOH provides the SNMP and RMI interfaces to manage the system and to send out events and alerts. The CLI provides a set of commands that overlaps with MOH, plus commands specific to the distributed management card itself; unlike MOH, the CLI does not send out events and alerts.

This chapter contains the following sections:

- Using the Distributed Management Card CLI
- Security
- Updating the Distributed Management Cards' Flash Images
- Running Scripts on the Distributed Management Cards
- Displaying Board State Information
- Booting Node Boards

Using the Distributed Management Card CLI

The distributed management card CLI provides commands to control power of the system, control the node boards, administer the system, show status, and set configuration information. See Accessing the Distributed Management Cards for information on how to access the distributed management card.

All CLI commands can be used on the active distributed management card. A subset of CLI commands can be used on the standby distributed management card.

CLI Commands

TABLE 3-1 lists the active distributed management card command-line interface commands by type, command name, default permission required to use the command, and command description. TABLE 3-2 lists the subset of the CLI commands available for the standby distributed management card.

Default permission levels are:

The permission level for a user can be changed with the userperm command.

A -h option with a command indicates that help is available for that command.

TABLE 3-1 Active Distributed Management Card Command-Line Interface Commands

Command
Type

Command

Permission

Description

Configuration

setipmode
-b port_num
rarp|config|none

a

Set the IP mode of the specified Ethernet port. Choose the IP mode according to the services available in the network (rarp, config, or none). The default for both the external Ethernet port (1) and the internal Ethernet port (2) is none; that is, no services are available on these ports. You must reset the distributed management card for the changes to take effect.

 

setipaddr
-b port_num addr

a

Set the IP address of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the ipmode is set to config. You must reset the distributed management card for the changes to take effect.

 

setipnetmask
-b port_num mask

a

Set the IP netmask of the specified Ethernet port. The default is 255.255.255.0. This command is only used if the ipmode is set to config. You must reset the distributed management card for the changes to take effect.

 

setipgateway addr

a

Set the IP gateway of the distributed management card. The default is 0.0.0.0. You must reset the distributed management card for the changes to take effect.

 

sethostname hostname

a

Set the hostname to be used in the CLI prompt. The default is netract. The maximum length is 32 characters.

 

setservicemode true|false

a

Set the service mode of the distributed management card. When service mode is true, MOH and PMS services are not started automatically after a reboot. The default is false, meaning that these services are started automatically.

 

setmohsecurity [-h] true|false

a

Set whether authentication is required on the distributed management card RMI interface. The default is false. Refer to MOH Configuration and RMI for more information.

 

showmohsecurity [-h]

 

Display the value for setmohsecurity.

 

setdmcrecovery [-h] on|off

a

Set whether the active distributed management card should try to reset the other, failed distributed management card. The default is off (a reset is not tried).

 

showdmcrecovery [-h]

 

Display the value for setdmcrecovery.

 

setdefaults

a

Initialize the distributed management card system configuration variables, for example, the Ethernet variables and the hostname, to the defaults.

To return all configuration information to the defaults, use the setdefaults command on both distributed management cards, and then press the Reset button on both distributed management cards at the same time or power cycle the system.

 

showipmode
-b port_num

 

Display the value of ip_mode for the specified port number.

 

showipaddr
-b port_num

 

Display the value of ip_addr for the specified port number.

 

showipnetmask
-b port_num

 

Display the value of ip_netmask for the specified port number.

 

showipgateway

 

Display the value of ip_gateway for the distributed management card.

 

showhostname

 

Display the value of the hostname used in the CLI prompt.

 

showservicemode

 

Display the value of the distributed management card service mode.

 

setntpserver addr|none

a

Configure the distributed management card to be an NTP client, and optionally an NTP server. The default is none.

 

showntpserver

 

Display the IP address of the NTP server.

 

setfailover on|off|force

a

Enable (on), disable (off), or force an immediate failover (force) from the active distributed management card to the standby distributed management card. The default is off. Refer to Configuring the Distributed Management Cards for Failover for more information.

 

showfailover

 

Display the distributed management card failover mode.

 

setetherfailover -b 1 enable|disable

a

Enable or disable failover of the distributed management card if its external Ethernet interface fails. The default is disable.

 

showetherfailover -b 1

 

Display the distributed management card external Ethernet interface failover mode.

 

setipalias -b port_num addr

a

Set the alias IP address for the specified Ethernet port. For the external Ethernet port, set the port_num to 1. For the internal Ethernet port, set the port_num to 2. The default is 0.0.0.0. You must reset the active distributed management card for the changes to take effect.

 

showipalias -b port_num

 

Display the alias IP address for the specified Ethernet port.

 

setipaliasnetmask -b port_num mask

a

Set the IP alias netmask for the specified Ethernet port. The default is 255.255.255.0.

 

showipaliasnetmask -b port_num

 

Display the IP alias netmask for the specified Ethernet port.

 

settimezone time_zone

a

Set the time zone value for the date. Refer to Setting the Date and Time on the Distributed Management Cards for more information.

 

showtimezone

 

Display the current time zone value.

Power control

poweroff cpu_node

r

Power off the specified node slot, where cpu_node can be 2 through 21. This command is also supported on third-party node boards.

 

poweron cpu_node

r

Power on the specified node slot, where cpu_node can be 2 through 21. This command is also supported on third-party node boards.

 

powersupply
n on|off

r

Switch the specified power supply unit on or off, where n can be 1 through 8.

CPU control

reset [-h] [dmc|1A|1B|cpu_node]
[-x
cpu_node]

r

Reset (reboot) a specified node.
reset [dmc|1A|1B|cpu_node] produces a soft reset (reboots the operating system), where dmc is the distributed management card the command is issued on; 1A is the top distributed management card; 1B is the bottom distributed management card; and cpu_node can be 3 through 20.
reset -x produces a hard reset (resets the board), where cpu_node can be 2 through 21.
reset -x is also supported on third-party node boards.

 

pmsd

a

Display help information on starting, stopping, and controlling the PMS daemon on the distributed management card. See also the PMS daemon control commands at the end of TABLE 3-1; Enabling the Processor Management Service Application; and Using the PMS Application for Recovery and Control of Node Boards for more information.

 

console cpu_node

c

Enter console mode and connect to the specified node board, where cpu_node can be 3 through 20.

 

break cpu_node

c

Put the server in debug mode, where cpu_node can be 3 through 20.

 

showhealth [-b cpu_node]

 

Show the health information of a node, where cpu_node can be 0 through 21.

 

showcpustate

 

Display the board type, power state, and boot state for each slot in the system. Refer to Displaying Board State Information for more information. This command is also supported on third-party node boards.

System status

showenvironment

 

Display a summary of current environmental information, such as fantray and power supply status.

 

shownetwork

 

Display the current network configuration of the distributed management card.

 

showdate

 

Display the system date.

Administration

setpanicdump [all|cpu_node] [true|false]

a

Set whether a panic dump is generated when a node is reset, where all means all nodes 3 through 20, and cpu_node can be a specific node 3 through 20.

 

setescapechar value

 

Set the escape character to end a console session. The default is a tilde (~).

 

useradd [-h] username

u

Add a user account. The default user account is netract. The distributed management card supports 16 accounts.

 

userdel [-h] username

u

Delete a user account.

 

usershow [-h] [username]

 

Show user accounts.

 

userpassword [-h] username

u

Set or change the password of a specified user account.

 

userperm [-h] username
[c|u|a|r]

u

Set or change the permission levels for a specified user account.

 

showusers

 

Show the number of users logged in to the distributed management card.

 

logout

 

Log out of the current session.

 

password [-h]

u

Change the existing password.

 

flashupdate -d cmsw|bcfw|bmcfw|rpdf|scdf -f path

a

Flash update the distributed management card software, where cmsw represents the chassis management software; bcfw represents the boot control firmware; bmcfw represents the BMC firmware; rpdf represents the system configuration repository; and scdf initializes the system configuration variables to their defaults. Refer to Updating the Distributed Management Cards' Flash Images for more information.

 

help

 

Display a list of supported commands.

 

setdate [-h] [mmdd][HHMM][ccyy][:ss]

a

Set the current date.

 

setfru [-h] fru_name instance fru_property value

a

Set FRU ID information. Refer to Specifying Netra CT Server FRU ID Information for more information.

This command is also supported on third-party node boards. Refer to To Configure a Chassis Slot for a Third-Party Node Board for more information.

 

showfru fru_name instance fru_property

 

Display FRU ID information. Refer to Displaying Netra CT Server FRU ID Information for more information.

This command is also supported on third-party node boards. Refer to To Display FRU ID Information for a Third-Party Node Board for more information.

 

showescapechar

a

Show the escape character used to end a console session.

 

showpanicdump [all|cpu_node]

 

Show whether or not a panic dump has been set for all nodes 3 through 20 or for a specific node 3 through 20.

 

version

 

Display the versions of various software and firmware.

 

snmpconfig [-h] add|del|show access|trap community [readonly|readwrite] [addr]

a

Configure the distributed management card SNMP interface. Refer to MOH Configuration and SNMP for more information.

PMS daemon control

pmsd start [-p port_num] [-e server_admin_state] [-d]

a

Start PMS on the distributed management card.

 

pmsd stop [-p port_num]

a

Stop PMS on the distributed management card.

 

pmsd slotaddressset -s slot_num -i ip_addr

a

Set the IP address for the distributed management card to control and monitor a node board.

 

pmsd slotaddressshow
-s slot_num|all

a

Print the IP address set with the pmsd slotaddressset command.

 

pmsd slotrndaddressadd -s slot_num|all -n ip_addr
-d ip_addr -r slot_num

a

Add address information for a node board to control other node boards.

 

pmsd slotrndaddressdelete
-s slot_num|all
-i index_num|all

a

Delete address information added with the pmsd slotrndaddressadd command.

 

pmsd slotrndaddressshow
-s slot_num|all
-i index_num|all

a

Print address information added with the pmsd slotrndaddressadd command.

 

pmsd operset -s slot_num|all -o maint_config|
oper_config|
none_config|
graceful_reboot

a

Enable automatic recovery of a node board.

 

pmsd infoshow -s slot_num|all

a

Print PMS system information.

 

pmsd historyshow -s slot_num|all

a

Print a log of PMS system events and time stamps.

 

pmsd recoveryoperset
-s slot_num|all
-o pc|rst|rstpc|pd|rb

a

Manually recover a board in case of fault.

 

pmsd recoveryautooperset -s slot_num|all
-o pc|rst|rstpc|pd|rb|
rbpc
|none|trg [-d startup_delay] [-f on|off] [-r retries] [-n inter_op_delay] [-p reset_power-cycle_delay]

a

Automatically recover a board in case of fault.

 

pmsd recoveryautoinfoshow
-s slot_num|all

a

Print the configuration information affected by the recoveryautooperset command.

 

pmsd hwoperset -s slot_num|all -o powerdown|powerup|
reset|mon_enable|
mon_disable
[-f]

a

Perform operations on the hardware of a node board.

 

pmsd hwinfoshow -s slot_num|all

a

Print PMS system information on the hardware.

 

pmsd hwhistoryshow -s slot_num|all

a

Print a log of PMS hardware events and time stamps.

 

pmsd osoperset -s slot_num|all -o reboot|mon_enable|
mon_disable [-f]

a

Perform operations on the operating system of a node board.

 

pmsd osinfoshow -s slot_num|all

a

Print PMS system information on the operating system.

 

pmsd oshistoryshow -s slot_num|all

a

Print a log of PMS operating system events and time stamps.

 

pmsd appoperset -s slot_num|all -o force_offline|
vote_active|
force_active

a

Perform operations on node board applications.

 

pmsd appinfoshow -s slot_num|all

a

Print PMS system information on the applications.

 

pmsd apphistoryshow -s slot_num|all

a

Print a log of PMS application events and time stamps.

 

pmsd version

a

Print the PMS version.

 

pmsd usage

a

Print a synopsis of the pmsd commands.


Information on configuring distributed management card ports, setting up user accounts, specifying FRU ID information, and starting the PMS daemon using the distributed management card CLI is provided in Chapter 2. The PMS daemon commands are described in Using the PMS Application for Recovery and Control of Node Boards.

TABLE 3-2 lists the commands valid on the standby distributed management card.

TABLE 3-2 Standby Distributed Management Card Command-Line Interface Commands

Command
Type

Command

Permission

Description

Configuration

setipmode
-b port_num
rarp|config|none

a

Set the IP mode of the specified Ethernet port. Choose the IP mode according to the services available in the network (rarp, config, or none). The default for both the external Ethernet port (1) and the internal Ethernet port (2) is none; that is, no services are available on these ports. You must reset the distributed management card for the changes to take effect.

 

setipaddr
-b port_num addr

a

Set the IP address of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the ipmode is set to config. You must reset the distributed management card for the changes to take effect.

 

setipnetmask
-b port_num mask

a

Set the IP netmask of the specified Ethernet port. The default is 255.255.255.0. This command is only used if the ipmode is set to config. You must reset the distributed management card for the changes to take effect.

 

setipgateway addr

a

Set the IP gateway of the distributed management card. The default is 0.0.0.0. You must reset the distributed management card for the changes to take effect.

 

sethostname hostname

a

Set the hostname to be used in the CLI prompt. The default is netract. The maximum length is 32 characters.

 

setservicemode true|false

a

Set the service mode of the distributed management card. When service mode is true, MOH and PMS services are not started automatically after a reboot. The default is false, meaning that these services are started automatically.

 

showmohsecurity [-h]

 

Display the value for setmohsecurity.

 

showdmcrecovery [-h]

 

Display the value for setdmcrecovery.

 

setdefaults

a

Initialize the distributed management card system configuration variables, for example, the Ethernet variables and the hostname, to the defaults.

To return all configuration information to the defaults, use the setdefaults command on both distributed management cards, and then press the Reset button on both distributed management cards at the same time or power cycle the system.

 

showipmode
-b port_num

 

Display the value of ip_mode for the specified port number.

 

showipaddr
-b port_num

 

Display the value of ip_addr for the specified port number.

 

showipnetmask
-b port_num

 

Display the value of ip_netmask for the specified port number.

 

showipgateway

 

Display the value of ip_gateway for the distributed management card.

 

showhostname

 

Display the value of the hostname used in the CLI prompt.

 

showservicemode

 

Display the value of the distributed management card service mode.

 

showntpserver

 

Display the IP address of the NTP server.

 

showfailover

 

Display the distributed management card failover mode.

 

showetherfailover -b 1

 

Display the distributed management card external Ethernet interface failover mode.

 

showipalias -b port_num

 

Display the alias IP address for the specified Ethernet port.

 

showipaliasnetmask -b port_num

 

Display the alias IP netmask for the specified Ethernet port.

 

showtimezone

 

Display the current time zone value.

CPU control

reset [-h] [dmc|1A|1B]

r

Reset (reboot) a distributed management card.
reset [dmc|1A|1B] produces a soft reset (reboots the operating system), where dmc is the distributed management card the command is issued on; 1A is the top distributed management card; and 1B is the bottom distributed management card.

 

showhealth [-b cpu_node]

 

Show the health information of a node, where cpu_node can be 0 through 21.

 

showcpustate

 

Display the board type, power state, and boot state for each slot in the system. Refer to Displaying Board State Information for more information. This command is also supported on third-party node boards.

System status

showenvironment

 

Display a summary of current environmental information, such as fantray and power supply status.

 

shownetwork

 

Display the current network configuration of the distributed management card.

 

showdate

 

Display the system date.

Administration

usershow [-h] [username]

 

Show user accounts.

 

showusers

 

Show the number of users logged in to the distributed management card.

 

logout

 

Log out of the current session.

 

password [-h]

u

Change the existing password.

 

flashupdate -d cmsw|bcfw|bmcfw|rpdf|scdf -f path

a

Flash update the distributed management card software, where cmsw represents the chassis management software; bcfw represents the boot control firmware; bmcfw represents the BMC firmware; rpdf represents the system configuration repository; and scdf initializes the system configuration variables to their defaults. Refer to Updating the Distributed Management Cards' Flash Images for more information.

 

help

 

Display a list of supported commands.

 

setdate [-h] [mmdd][HHMM][ccyy][:ss]

a

Set the current date.

 

showfru fru_name instance fru_property

 

Display FRU ID information. Refer to Displaying Netra CT Server FRU ID Information for more information.

This command is also supported on third-party node boards. Refer to To Display FRU ID Information for a Third-Party Node Board for more information.

 

showescapechar

a

Show the escape character used to end a console session.

 

showpanicdump [all|cpu_node]

 

Show whether or not a panic dump has been set for all nodes 3 through 20 or for a specific node 3 through 20.

 

version

 

Display the versions of various software and firmware.

 

snmpconfig [-h] show access|trap

a

Display SNMP access or trap information for the distributed management card SNMP interface. Refer to MOH Configuration and SNMP for more information.


 

Security

A remote command-line session or a console session automatically disconnects after 10 minutes of inactivity.

Security is also provided through the permission levels and passwords set for each account.


Updating the Distributed Management Cards' Flash Images

The primary boot device for the distributed management card is always the flash. You can update the distributed management card flash images over the network using NFS or TFTP. TABLE 3-3 shows the distributed management card flash options.

TABLE 3-3 Distributed Management Card Flash Options

Option   Description

cmsw     Updates the chassis management software, which includes the Chorus software, the MOH application, and the PMS application.

bcfw     Updates the boot control firmware.

bmcfw    Updates the BMC firmware.

rpdf     Updates the system configuration repository, which contains information used internally by the CLI in the flash, reinitializes it to a default minimum, and resets the distributed management card.

scdf     (Optional) Initializes the system configuration variables, for example, the Ethernet variables and the hostname, to the defaults.


You must flash update the standby distributed management card first, then fail over from the active to the standby distributed management card, and then flash update the new standby distributed management card. There is no required sequence for updating the individual flash images, although the following order is recommended: cmsw, bcfw, bmcfw, and rpdf. You can update individual images if desired.

During a flash update of the BMC firmware, the BMC is not able to respond to communication requests, and the following messages may display on the console:

sysif_xfer_msg: kcs driver xfermsg returns -1
read_evt_buffer: sysif_xfer_msg returns -1
poll_evt_handler: read_evt_buffer returns -1
listner_thread: poll_evt_handler returns -1

These messages can be safely ignored during a flash update.


To Update All the Distributed Management Cards' Flash Images

1. Log in to the standby distributed management card.

2. Flash update the standby distributed management card images:



Note - The scdf option is not mandatory. Use it only if you want to initialize the system configuration variables to the defaults.



 

hostname cli> flashupdate -d cmsw -f path
hostname cli> flashupdate -d bcfw -f path
hostname cli> flashupdate -d bmcfw -f path
hostname cli> flashupdate -d scdf
hostname cli> flashupdate -d rpdf -f path

where path can be nfs://nfs.server.ip.address/directory/filename or tftp://tftp.server.ip.address/directory/filename where the software to use in the flash is installed. If you are using the NFS option, make sure that the path is a shared NFS mount.

After you update rpdf, the distributed management card resets itself. If you do not update rpdf, you must reset the distributed management card manually, with the reset dmc command.
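Because flashupdate accepts only nfs:// or tftp:// paths, it can be worth checking the path format before starting an update. The following sketch is illustrative only; the helper name and example addresses are hypothetical, not part of the distributed management card CLI:

```shell
#!/bin/sh
# Hypothetical pre-check: confirm that a flash image path uses one of the
# two schemes the flashupdate command accepts (nfs:// or tftp://).
check_flash_path() {
    case "$1" in
        nfs://*|tftp://*) echo "ok: $1"; return 0 ;;
        *) echo "error: unsupported path: $1"; return 1 ;;
    esac
}

check_flash_path "tftp://10.0.0.5/images/cmsw.flash"    # example address
check_flash_path "/export/images/cmsw.flash" || true    # rejected: no scheme
```

Run this on the host you manage the system from; it does not verify that the file exists or that an NFS path is a shared mount.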

3. Log in to the active distributed management card.

4. Force a failover from the active distributed management card to the standby distributed management card:

hostname cli> setfailover force

The active distributed management card becomes the new standby distributed management card.

5. Repeat the instructions in Step 2 to flash update the new standby distributed management card.


To Update an Individual Distributed Management Card Flash Image

1. Log in to the standby distributed management card.

2. Flash update a distributed management card image:

hostname cli> flashupdate -d option
hostname cli> reset dmc

where option can be cmsw -f path, bcfw -f path, bmcfw -f path, or scdf, and path can be nfs://nfs.server.ip.address/directory/filename or tftp://tftp.server.ip.address/directory/filename where the software to use in the flash is installed. If you are using the NFS option, make sure that the path is a shared NFS mount.

If you update rpdf, the distributed management card resets itself after finishing the rpdf update.

3. Log in to the active distributed management card.

4. Force a failover from the active distributed management card to the standby distributed management card:

hostname cli> setfailover force

The active distributed management card becomes the new standby distributed management card.

5. Repeat the instructions in Step 2 to flash update the new standby distributed management card.

 


Running Scripts on the Distributed Management Cards

This section describes the Netra CT server distributed management card scripting feature.

Using Scripting

Normally, the distributed management card cannot execute batch commands. The distributed management card scripting feature allows you to write scripts to execute distributed management card CLI commands in batch mode, similar to using scripting in the Solaris OS. You run the scripts from a node board in the same system as the distributed management card.

As an example, using the scripting feature, you can write a script to configure an Ethernet port on the distributed management card, and then check to be sure it is configured the way you want. This sample script runs the version command, and the setipmode, setipaddr, showipmode, and showipaddr commands for Ethernet port 2 on the distributed management card:

rsh DMC_SysMgmt_ipaddress version
rsh DMC_SysMgmt_ipaddress setipmode -b 2 config
rsh DMC_SysMgmt_ipaddress setipaddr -b 2 addr
rsh DMC_SysMgmt_ipaddress showipmode -b 2
rsh DMC_SysMgmt_ipaddress showipaddr -b 2

The script includes the rsh command, the distributed management card System Management Network IP address, and the CLI command(s) to run. You can use the IP address for the distributed management card in slot 1A, the IP address for the distributed management card in slot 1B, or the alias IP address to stay connected to the active distributed management card. For information on the System Management Network IP address, refer to Configuring the System Management Network. For information on the CLI commands, refer to TABLE 3-1.
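Collected into a single file, the sample above might look like the following sketch. The IP addresses are placeholders, and the RSH variable defaulting to echo (so the script prints the commands instead of sending them) is a convenience of this sketch, not a documented feature; set RSH=rsh to actually send the commands:

```shell
#!/bin/sh
# Batch several distributed management card CLI commands, as described
# above. DMC_ADDR is a placeholder System Management Network IP address.
# RSH defaults to echo, making this a dry run; set RSH=rsh to actually
# send the commands over the System Management Network.
DMC_ADDR=${DMC_ADDR:-192.168.100.10}
RSH=${RSH:-echo}

run_cli() {
    $RSH "$DMC_ADDR" "$@"
}

run_cli version
run_cli setipmode -b 2 config
run_cli setipaddr -b 2 192.168.100.20
run_cli showipmode -b 2
run_cli showipaddr -b 2
```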

Scripting Limitations

All the active distributed management card CLI commands in TABLE 3-1 are supported in a script except the interactive commands userpassword, password, console, and break, and the command logout.

You cannot use scripting on the standby distributed management card.

For security reasons, you must be a root user on a node board in the same system as the distributed management card. The commands can only be run over the System Management Network interface.


To Run a Script on a Distributed Management Card

1. Log in to the server.

2. Create a script:

rsh DMC_SysMgmt_ipaddress CLI_command
rsh DMC_SysMgmt_ipaddress CLI_command
rsh DMC_SysMgmt_ipaddress CLI_command
rsh DMC_SysMgmt_ipaddress CLI_command
...

where DMC_SysMgmt_ipaddress is the System Management Network IP address of the distributed management card, and CLI_command is the CLI command you want to run.

3. Save the script to a file.

4. As root, run the script:

# /path/filename

where path is the path to the script and filename is the name of the script.

Before executing the commands in the script, the distributed management card verifies that the commands are being run by a root user on a node board in the same system as the distributed management card, and that the commands have been received over the System Management Network.


Displaying Board State Information

Board information, including the board type, the power state of the board, and the boot state of the board, can be displayed for each slot in the system using the CLI showcpustate command.

Sample output from this command is:

hostname cli> showcpustate
-------------------------------------------------------
   Slot No   : Board Type   : Power_State : Boot_State
-------------------------------------------------------
      1A     :    DMC Board :          On :      Ready
      1B     :    DMC Board :          On :      Ready
       2     : SWITCH Board :          On :      Ready
       3     :    CPU Board :         Off :
       4     :        Empty :             :
       5     :    CPU Board :          On :      Ready
       6     :    CPU Board :          On :    Offline
       7     :    CPU Board :          On :    Offline
       8     :    CPU Board :          On :      Ready
       9     :    CPU Board :          On :    Offline
      10     :    CPU Board :          On :    Unknown
 
Press 'q' + Return to quit, hit Return to continue

TABLE 3-4 contains the various state descriptions.

TABLE 3-4 Board State Information

State        Value     Description

Power_State  On        The slot is powered on

Power_State  Off       The slot is powered off

Boot_State   Online    The board boot sequence has started

Boot_State   Ready     The board boot sequence has finished, and the board is ready to use

Boot_State   Offline   The board may be running its power-on self-test (POST), the board may be at the OpenBoot PROM level, or the boot may have failed

Boot_State   Unknown   The distributed management card cannot determine the current boot state of the board


For third-party node boards, the showcpustate command returns a boot state of Unknown.
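One use of this output in a script is to flag boards that are powered on but not yet Ready. The following sketch filters showcpustate-style rows with awk; the function name is hypothetical, and in practice the rows would come from the command itself (for example, via the scripting feature described in Running Scripts on the Distributed Management Cards):

```shell
#!/bin/sh
# Print the slot numbers of boards whose Power_State is On but whose
# Boot_State is Offline or Unknown, given showcpustate-style rows.
flag_not_ready() {
    awk -F: '$3 ~ /On/ && ($4 ~ /Offline/ || $4 ~ /Unknown/) {
        gsub(/ /, "", $1); print $1
    }'
}

# Sample rows taken from the output shown above:
flag_not_ready <<'EOF'
      1A     :    DMC Board :          On :      Ready
       3     :    CPU Board :         Off :
       6     :    CPU Board :          On :    Offline
      10     :    CPU Board :          On :    Unknown
EOF
```

With the sample rows above, the filter prints slots 6 and 10.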


Booting Node Boards

Node boards can boot from a local disk or over the network.

Board Power-on Sequence

When you power on the Netra CT 820 system by pressing the power switch on the rear of the system to the On (|) position, the boards power on in this sequence:

1. The two distributed management cards are powered on. The top card in slot 1A is designated as the active distributed management card, and the bottom card in slot 1B is designated as the standby distributed management card. Once the cards have booted and are ready for use, the Ready LED is solid green on the active card and blinking green on the standby card.

2. The active distributed management card powers on the switching fabric boards in slots 2 and 21. While the switching fabric boards are powering on, the blue LED state is solid. Once the boards have booted and are ready for use, the blue LED turns off.

3. The active distributed management card looks at the Boot_Mask field in the midplane FRU ID for boot servers and performs one of the following actions:

or

A midplane FRU ID fault is a system fault, and no boards can be powered on.

Boot Device Variables

By default, the OpenBoot PROM NVRAM boot-device configuration variable is set to disk net, disk being an alias for the path to the local disk, and net being an alias for the path of the primary network. You can set the boot device for node boards through the distributed management card CLI setfru command. Refer to Configuring a Chassis Slot for a Board for information on using the setfru command to specify a boot device for a board.

For example, you might want to change the node board in slot 3 to boot first from its PMC disk. To do this, check the current OpenBoot PROM boot-device variable:

ok printenv boot-device
boot-device =     disk net
ok

On the active distributed management card, check and change the boot_devices setting:

hostname cli> showfru slot 3 boot_devices
showfru: Boot_Devices:
hostname cli> setfru slot 3 boot_devices pmc0/disk net
hostname cli> showfru slot 3 boot_devices
showfru: Boot_Devices: pmc0/disk net
hostname cli>

After you power cycle the system, check the OpenBoot PROM boot-device variable:

ok printenv boot-device
boot-device =     pmc0/disk net
ok

When a node board is hot-swapped, power cycled, rebooted, or reset, the OpenBoot PROM firmware checks with the distributed management card for a boot device for that slot. The distributed management card sends the value from the Boot_Devices field in FRU ID to the OpenBoot PROM firmware; the value is either the boot device list for that slot you set using the setfru command or a null string if you did not set a boot device list for that slot. The value overwrites the NVRAM boot-device value. The board boots from the value in the boot-device variable if its diag-switch? variable is set to false (the default). The board boots from the value in the diag-device variable (the default is net) if its diag-switch? variable is set to true.

Booting with a DHCP Server

You can configure Netra CT node boards to boot over DHCP. This process includes setting the node board boot device for DHCP, forming the node board DHCP client ID, and configuring the DHCP server.

On the Netra CT system, the DHCP client ID is a combination of the system's midplane Sun part number (7 bytes), the system's midplane Sun serial number (6 bytes), and the board's geographical address (slot number) (2 bytes). The parts are separated by a colon (:).


procedure icon  To Configure a Node Board to Boot Over DHCP

1. Log in to the active distributed management card.

2. Set the boot device for the board to dhcp with the setfru command:

hostname cli> setfru slot instance Boot_Devices network_devicename:dhcp

where instance is the slot number of the board to be configured for DHCP and network_devicename is a path or alias to a network device. For example, to set the boot device to dhcp for the node board in slot 4, enter the following:

hostname cli> setfru slot 4 Boot_Devices net:dhcp

3. Get the Netra CT system part number and the system serial number with the showfru command:

hostname cli> showfru midplane 1 Sun_Part_No
...
hostname cli> showfru midplane 1 Sun_Serial_No
...

4. Form the three-part client ID from the system part number (with the hyphen removed), the system serial number, and the two-digit slot number, separated by colons. Then convert each character of the client ID to its hexadecimal ASCII code.

For example, if the output from the showfru commands in Step 3 is 375-4335 (Sun part number) and 000001 (Sun serial number), and you want to form the client ID for the node board in slot 4, the client ID is 3754335:000001:04.

Translate the client ID to its ASCII equivalent. For example:

Client ID part    ASCII Representation

3754335           33 37 35 34 33 33 35
:                 3A
000001            30 30 30 30 30 31
:                 3A
04                30 34


Thus, the example client ID in ASCII is:

33 37 35 34 33 33 35 3A 30 30 30 30 30 31 3A 30 34.

5. Configure the DHCP server.

Refer to the Solaris DHCP Administration Guide on the web site http://docs.sun.com for information on how to configure the DHCP server for remote boot and diskless boot clients.

The client ID is retained across a node board power cycle, reboot, or reset. The distributed management card updates the client ID during a first-time power on or a hot-swap of a node board.
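The client ID formation and ASCII conversion in Step 4 can be sketched in a few lines of Python. The helper name is ours, not part of the product:

```python
def dhcp_client_id(part_no: str, serial_no: str, slot: int) -> str:
    """Form the Netra CT DHCP client ID and hex-encode it, as in Step 4."""
    # Drop the hyphen from the Sun part number (375-4335 -> 3754335), then
    # join part number, serial number, and two-digit slot with colons.
    client_id = f"{part_no.replace('-', '')}:{serial_no}:{slot:02d}"
    # Each ASCII character becomes its two-digit uppercase hex code.
    return " ".join(f"{ord(ch):02X}" for ch in client_id)

print(dhcp_client_id("375-4335", "000001", 4))
# -> 33 37 35 34 33 33 35 3A 30 30 30 30 30 31 3A 30 34
```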


Connecting to Node Board Consoles from the Distributed Management Card

The Netra CT system provides the capability to connect to node boards and open console sessions from the active distributed management card.

You begin by logging in to the distributed management card through either the serial port or the Ethernet port. Once a console session with a node board is established, you can run Solaris system administration commands, such as passwd, read status and error messages, or halt the board in that particular slot.

Configuring Your System for Multiple Console Use

To enable your system to use multiple consoles, you set several variables, either at the Solaris level or at the OpenBoot PROM level. Set these variables on each node board to enable console use.


procedure icon  To Configure Your System for Multiple Consoles

1. Log in as root to the node board, using the on-board console port ttya.

2. Enter either set of the following commands to enable multiple consoles:

or

Establishing Console Sessions Between the Distributed Management Card and Node Boards

Once you have configured your system for multiple console use, you can log in to the active distributed management card and open a console for a slot. The Netra CT system allows four console users per node board slot.

TABLE 3-5 shows the distributed management card CLI console-related commands that can be executed from the current login session on the distributed management card.

TABLE 3-5 Distributed Management Card CLI Console-Related Commands

Command                Description

console cpu_node       Enter console mode and connect to a specified node
                       board, where cpu_node can be 3 through 20.

break cpu_node         Put the specified node board in debug mode, where
                       cpu_node can be 3 through 20. Debug mode uses the
                       OpenBoot PROM level.

setescapechar value    Set the escape character to be used in all future
                       console sessions. The default is tilde (~). Refer to
                       TABLE 3-6 for escape character use.

showescapechar         Show the current escape character.


Most node board consoles use the system management bus, but a board at the OpenBoot PROM level connects over the IPMI bus. There can be only one console user on the IPMI bus at any one time.

For example, if the board in slot 4 is at the OpenBoot PROM level, the user opening a console session connects to it over the IPMI bus. This causes the IPMI bus to be fully occupied and no other users can connect over that bus. If they try, an error message displays. However, other users can connect to boards in other slots over the system management bus. The system management bus is faster than the IPMI bus, while the IPMI bus is typically a more stable communication channel than the system management bus.

Once you have a console connection with a node board, you can issue normal Solaris commands. There are several escape character sequences to control the current session. TABLE 3-6 shows these sequences.

TABLE 3-6 Node Board Console-Related Escape Character Sequences

Sequence    Description

~b          Break from the Solaris level and enter the OpenBoot PROM
            (debug) level.

~.          End the console session.

~g          Determine the status (system management bus or IPMI) of the
            current console.

~t          Toggle between the system management bus and IPMI.
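As an illustration only, the following sketch models the line-start escape convention used by many console tools. This is a generic example, not the distributed management card's actual implementation:

```python
def scan(stream, escape="~"):
    """Classify console input into data and escape commands.

    Sketch of the common convention: an escape sequence is recognized only
    when the escape character appears at the start of a line; elsewhere it
    passes through as ordinary data.
    """
    at_line_start = True
    pending = False
    for ch in stream:
        if pending:                      # character after a line-start escape
            pending = False
            yield ("cmd", ch)
            at_line_start = False
        elif at_line_start and ch == escape:
            pending = True               # hold the escape, wait for command
        else:
            yield ("data", ch)
            at_line_start = (ch == "\n")
```

For example, `list(scan("~t"))` yields a single `("cmd", "t")` event, while a `~` in the middle of a line is passed through as data.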



procedure icon  To Start a Console Session From the Distributed Management Card

1. Log in to the active distributed management card.

You can log in to the distributed management card through a terminal attached to either the serial port connection or the Ethernet port connection.

2. Open a console session to a board in a slot:

hostname cli> console cpu_node

where cpu_node is 3 through 20. For example, to open a console to the board in slot 4, enter the following:

hostname cli> console 4

You now have access to the board in slot 4. Depending on the state of the board in that particular slot, and whether the previous user logged out of the shell, you see one of several prompts:


procedure icon  To Determine the Status of the Current Console

At the prompt, enter the escape sequence ~g. A message displays, indicating the current state of the console connection. The message is either:

Console mode is IPMI

This means the console is in Solaris mode or OpenBoot PROM mode.

Or the message might be:

Console mode is NET

This means the console is in Solaris mode.


procedure icon  To Toggle Between the System Management Bus and IPMI

Toggling between the system management bus and IPMI could be useful for troubleshooting. For example, if the console stops working for some reason, you could try toggling to IPMI (the more reliable communication channel).

1. If the node board is in Solaris mode, enter the escape sequence ~t:

# ~t
New console mode is IPMI
#

The console switches between the system management bus and IPMI mode. The console now fully occupies the IPMI bus. No other console may be at the OpenBoot PROM level at the same time. If another user attempts to access a board that is occupying the IPMI bus, the console connection fails.

2. To return to the system management bus mode, enter ~t again:

# ~t
New console mode is NET
#


procedure icon  To Break Into OpenBoot PROM From the Console

At the prompt, enter the escape sequence ~b. The console mode switches to IPMI:

New console mode is IPMI
Type `go' to resume
ok

You can now debug from the OpenBoot PROM level.


procedure icon  To End the Console Session

1. (Optional) Log out of the Solaris shell.

2. At the prompt, disconnect from the console by entering the escape sequence ~. (tilde period):

prompt ~.
hostname cli>

Disconnecting from the console does not automatically log you out from the remote host. Unless you log out from the remote host, the next console user who connects to that board sees the shell prompt of your previous session.


procedure icon  To Show the Current Escape Character

At the prompt, enter the showescapechar command. The current escape character is displayed:

hostname cli> showescapechar
escape_char: value


procedure icon  To Change the Default Escape Character

At the prompt, enter the setescapechar command:

hostname cli> setescapechar value

where value is any printable character. For example, to change the default escape character from tilde (~) to pound sign (#), enter the following:

hostname cli> setescapechar #

The pound sign is now the escape character for all future console sessions.


Using the PMS Application for Recovery and Control of Node Boards

This section describes specifying recovery operations and controlling node boards through the active distributed management card PMS CLI commands.

Recovery Configuration of a Node Board From the Distributed Management Card

You specify the recovery configuration of a node board by using the pmsd operset -s slot_num|all command (where slot_num is a single slot number and all means all slots in the Netra CT system containing a node board), together with the recovery mode for the specified slot or slots.

The recovery configuration can be maintenance mode, operational mode, or none mode. Maintenance mode means the distributed management card's automatic recovery of a node board is disabled, and PMS applications are started in an offline state, so that you can use manual maintenance operations. Operational mode means the distributed management card's automatic recovery of a node board is enabled; the distributed management card recovers the node board in the event of a monitoring fault, and starts PMS applications in an active state. None mode means the distributed management card's automatic recovery mode may be manually enabled or disabled; PMS application states are not enforced.

The mode is stored in persistent storage. You specify the operation to be performed on the specified slot by using the option -o with one of the following parameters: maint_config (set the hardware, operating system, and applications into maintenance mode), oper_config (set the hardware, operating system, and applications into operational mode), none_config (set the hardware, operating system, and applications into no enforcement mode), or graceful_reboot (bring the applications offline if needed and then reboot the operating system).


procedure icon  To Specify the Recovery Configuration of a Node Board

1. Log in to the active distributed management card.

2. Configure the automatic recovery mode with the operset command:

hostname cli> pmsd operset -s slot_num|all -o maint_config|oper_config|none_config|graceful_reboot

where slot_num can be a slot number from 3 to 20, and all specifies all slots containing node boards. For example, to make PMS recovery operational for the entire Netra CT server, enter:

hostname cli> pmsd operset -s all -o oper_config

Printing PMS Recovery Configuration Information

The pmsd infoshow -s slot_num|all command can be used to print the recovery configuration and alarm status for the recovery configuration.

The pmsd historyshow -s slot_num|all command can be used to print a recovery configuration and runtime message log. The log is printed to the ChorusOS terminal performing the operation.

Detailed Recovery of a Board in Case of Fault

You can perform detailed, manual recovery operations on a board or instruct PMS to perform detailed, automatic recovery operations on a board using the CLI. The operations are performed across the hardware, the operating system, and the applications.

For manual recovery, use the pmsd recoveryoperset -s slot_num|all command. This command can only be run when the board is in maintenance mode or none mode (PMS applications are offline). You specify the recovery operation to be performed on the specified slot by using the option -o with one of the following parameters: pc (power cycle), rst (reset), rstpc (reset, then power cycle), pd (power down), or rb (reboot).

For automatic recovery, use the pmsd recoveryautooperset -s slot_num|all command. This command sets how PMS responds to a fault when the board is in operational mode (PMS applications are active).

You specify the automatic recovery operation to be performed on the specified slot by using the option -o with one of the following parameters: pc (power cycle), rst (reset), rstpc (reset, then power cycle), pd (power down), rb (reboot), rbpc (reboot, then power cycle), none (no recovery), or trg (manually simulate a fault to trigger a recovery). Optional parameters for automatic recovery include:

-d startup_delay: the time in deciseconds between a fault occurrence and the start of a recovery operation (in this CLI, 1 decisecond equals 10 milliseconds); the default is 0 deciseconds.

-f off|on: whether a power-down operation occurs if the recovery operation fails; on specifies that power down occurs, and off specifies that it does not; the default is off.

-r retries: the number of times a recovery operation can occur and fail before it is terminated; the default is one try.

-n inter_op_delay: the time in deciseconds between one operation and the next for an operation with multiple retries; the default is 0 deciseconds. You should change the default to a number other than 0 (for example, 4000, which equals 40 seconds) to allow time between operations.

-p reset_power-cycle_delay: the time in deciseconds to wait between the reset and power-cycle portions of the recovery operation before a failed reset is declared and the power-cycle portion starts; the default is 0 deciseconds.
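A quick sketch makes the delay arithmetic concrete. The conversion follows the manual's note that one delay unit equals 10 milliseconds; the function name is ours:

```python
MS_PER_UNIT = 10  # the CLI's delay unit, as stated above: 10 milliseconds

def delay_to_seconds(units: int) -> float:
    """Convert a -d/-n/-p delay value to seconds."""
    return units * MS_PER_UNIT / 1000

print(delay_to_seconds(4000))  # the suggested -n value -> 40.0 (seconds)
```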


procedure icon  To Manually Recover a Board

1. Log in to the active distributed management card.

2. Perform manual recovery operations on a board with the recoveryoperset command:

hostname cli> pmsd recoveryoperset -s slot_num|all -o pc|rst|rstpc|pd|rb

where slot_num can be a slot number from 3 to 20, and all specifies all slots containing node boards. For example, to manually reboot the board in slot 5, enter the following:

hostname cli> pmsd recoveryoperset -s 5 -o rb


procedure icon  To Automatically Recover a Board

1. Log in to the active distributed management card.

2. Perform automatic recovery operations on a board with the recoveryautooperset command:

hostname cli> pmsd recoveryautooperset -s slot_num|all -o pc|rst|rstpc|pd|rb|rbpc|none|trg [-d startup delay][-f on|off][-r retries][-n inter_op_delay][-p reset_power-cycle_delay]

where slot_num can be a slot number from 3 to 20, and all specifies all slots containing node boards. For example, to instruct PMS to automatically reboot slot 5 after a fault, with the default delays, retries, and failure power state, enter the following:

hostname cli> pmsd recoveryautooperset -s 5 -o rb

Printing PMS Automatic Recovery Information

The pmsd recoveryautoinfoshow -s slot_num|all command can be used to print information showing the configuration information affected by the recoveryautooperset command.

Monitoring and Controlling a Node Board's Resources From the Distributed Management Card

PMS can perform operations on a board's hardware, operating system, and applications. You can specify that PMS perform operations on only one of these, rather than all three.

Hardware Operations

The pmsd hwoperset -s slot_num|all command performs operations on the hardware. The operations can only be performed in maintenance or none mode unless the optional -f parameter is used. You specify the operation to be performed on the specified slot by using the option -o with one of the following parameters: powerdown (set the hardware to the power-off state), powerup (set the hardware to the power-on state), reset (reset the hardware), mon_enable (enable health monitoring of the hardware), or mon_disable (disable health monitoring of the hardware). The optional -f parameter can be used to perform the operation even if applications are in the active state, and the slot is in operational mode.

The pmsd hwinfoshow -s slot_num|all command can be used to print PMS system information on the hardware state, monitoring status, and alarm status (whether an alarm was generated).

The pmsd hwhistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the hardware's operation. The log is printed to the ChorusOS terminal performing the operation.

Operating System Operations

The pmsd osoperset -s slot_num|all command performs operations on the operating system. The operations can only be performed in maintenance or none mode unless the optional -f parameter is used. You specify the operation to be performed on the specified slot by using the option -o with one of the following parameters: reboot (reboot the operating system), mon_enable (enable health monitoring of the operating system), or mon_disable (disable health monitoring of the operating system). The optional -f parameter can be used to perform the operation even if applications are in the active state, and the slot is in operational mode.

The pmsd osinfoshow -s slot_num|all command can be used to print PMS system information on the operating system state, monitoring status, and alarm status (whether an alarm was generated).

The pmsd oshistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the operating system's operation. The log is printed to the ChorusOS terminal performing the operation.

Application Operations

The pmsd appoperset -s slot_num|all command performs operations on the applications. The operations can only be performed in the none mode. You specify the operation to be performed on the specified slot by using the option -o with one of the following parameters: force_offline (force the applications to an offline state), vote_active (move the group of applications to the active state only if all of the applications agree to be moved), or force_active (force the applications to the active state).

The pmsd appinfoshow -s slot_num|all command can be used to print PMS system information on the applications' state and alarm status (whether an alarm was generated).

The pmsd apphistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the applications' operation. The log is printed to the ChorusOS terminal performing the operation.

Printing Other PMS Information

The pmsd version command prints the current version of pmsd.

The pmsd usage command prints a synopsis of the pmsd commands.


Monitoring Your System

This section describes various ways to monitor your system.

CLI Information

The distributed management card CLI provides many commands to display system status. Refer to the distributed management card CLI commands in the section, Using the Distributed Management Card CLI, in particular the show commands, to view system status.

The MOH Application

The MOH collects information about individual field replaceable units (FRUs) in your system and monitors their operational status. MOH can also monitor certain daemons. For example, if you installed the Netra High Availability Suite, MOH monitors daemons through that application.

Starting and Stopping MOH

You must start the MOH application as root. To start MOH, enter the following commands (if you installed the Solaris patches for MOH in a directory other than the default directory, specify that path instead):

# cd /opt/SUNWnetract/mgmt3.0/bin
# ./ctmgx start [option]

Refer to TABLE 2-7 for the options available with ctmgx start.

To stop MOH, enter the following commands as root:

# cd /opt/SUNWnetract/mgmt3.0/bin
# ./ctmgx stop

Once MOH is running, it interfaces with your SNMP or RMI application to discover network elements, monitor the system, and provide status messages. Refer to the Netra CT Server Software Developer's Guide for information on writing applications to interface with the MOH application.

Additional Troubleshooting Information

In the event of an active distributed management card fault, hot-swapping is not supported.

For additional troubleshooting information, refer to the Netra CT 820 Server Service Manual.