CHAPTER 3

Administering Your System

You administer your system using the distributed management card command-line interface, and through the MOH and PMS applications.

The distributed management card CLI works with the MOH and PMS applications, and supports Simple Network Management Protocol (SNMP) and Remote Method Invocation (RMI) interfaces. MOH provides the SNMP and RMI interfaces to manage the system and send out events and alerts. The CLI provides a subset of commands that overlaps with MOH and also provides commands for the distributed management card itself; the CLI does not send out events and alerts.

This chapter contains the following sections:


Using the Distributed Management Card Command-Line Interface

The distributed management card command-line interface provides commands to control power of the system, control the node boards, administer the system, show status, and set configuration information. (See Accessing the Distributed Management Cards for information on how to access the distributed management card.)

All CLI commands can be used on the active distributed management card; a subset of CLI commands can be used on the standby distributed management card.

CLI Commands

TABLE 3-1 lists the active distributed management card command-line interface commands by type, command name, default permission required to use the command, and command description. TABLE 3-2 lists the subset of the CLI commands available for the standby distributed management card.

Default permission levels are:

The permission level for a user can be changed with the userperm command.

A -h option with a command indicates that help is available for that command.

TABLE 3-1 Active Distributed Management Card Command-Line Interface Commands

Command Type

Command

Permission

Description

Status

showenvironment

 

Display a summary of current environmental information, such as fantray and power supply status.

 

shownetwork

 

Display the current network configuration of the distributed management card.

 

showipmode -b port_num

 

Display the value of ip_mode for the specified port number.

 

showipaddr -b port_num

 

Display the value of ip_addr for the specified port number.

 

showipnetmask -b port_num

 

Display the value of ip_netmask for the specified port number.

 

showipgateway

 

Display the value of ip_gateway for the distributed management card.

 

showdate

 

Display the system date.

 

showntpserver

 

Display the IP address of the NTP server.

 

showfru target instance field

 

Display FRU ID information. Refer to Displaying Netra CT Server FRU ID Information for more information.

This command is also supported on third-party node boards. Refer to To Display FRU ID Information for a Third-Party Node Board for more information.

 

showhostname

 

Display the value of the hostname used in the CLI prompt.

 

showservicemode

 

Display the value of the distributed management card service mode.

 

showcpustate

 

Display the board type, power state, and boot state for each slot in the system. Refer to Displaying Board State Information for more information. This command is also supported on third-party node boards.

Power control

poweroff cpu_node

r

Power off the specified node slot, where cpu_node can be 2 through 21. This command is also supported on third-party node boards.

 

poweron cpu_node

r

Power on the specified node slot, where cpu_node can be 2 through 21. This command is also supported on third-party node boards.

 

powersupply n on|off

r

Switch on or off the specified power supply unit, where n can be 1 through 8.

CPU control

console cpu_node

c

Enter console mode and connect to the specified node board, where cpu_node can be 3 through 20.

 

break cpu_node

c

Put the server in debug mode, where cpu_node can be 3 through 20.

 

reset [-h] [dmc|1A|cpu_node] [-x cpu_node]

r

Reset (reboot) a specified node.
reset [dmc|1A|cpu_node] produces a soft reset (reboots the operating system), where dmc and 1A both refer to the top distributed management card, and cpu_node can be 3 through 20.
reset -x produces a hard reset (resets the board), where cpu_node can be 2 through 21.
reset -x is also supported on third-party node boards.

 

setpanicdump [all|cpu_node] [true|false]

a

Set whether a panic dump is generated when a node is reset, where all means all nodes 3 through 20, and cpu_node can be a specific node 3 through 20.

 

showpanicdump [all|cpu_node]

 

Show whether or not a panic dump has been set for all nodes 3 through 20 or for a specific node 3 through 20.

 

setescapechar value

 

Set the escape character to end a console session. The default is a ~ (tilde).

 

showhealth [-b cpu_node]

 

Show the health information of a node, where cpu_node can be 0 through 21.

 

pmsd

a

Display help information on starting, stopping, and controlling the PMS daemon on the distributed management card. Refer to Enabling the Processor Management Service Application and to Using the PMS Application for Recovery and Control of Node Boards for more information.

Administration

useradd [-h] username

u

Add a user account. The default user account is netract. The distributed management card supports 16 accounts.

 

userdel [-h] username

u

Delete a user account.

 

usershow [-h] [username]

 

Show user accounts.

 

userpassword [-h] username

u

Set or change the password of a specified user account.

 

userperm [-h] username [c|u|a|r]

u

Set or change the permission levels for a specified user account.

 

logout

 

Log out of the current session.

 

password [-h]

u

Change the existing password.

 

flashupdate -d cmsw|bcfw|bmcfw|rpdf|scdf -f path

a

Flash update the distributed management card software, where cmsw represents the chassis management software; bcfw represents the boot control firmware; bmcfw represents the BMC firmware; rpdf represents the system configuration repository; and scdf initializes the system configuration variables to their defaults. Refer to Updating the Distributed Management Card Flash Images for more information.

 

setdefaults

a

Initialize the distributed management card system configuration variables, for example, the external Ethernet port variables, to the defaults.

 

help

 

Display a list of supported commands.

 

version

 

Display the versions of various software and firmware.

 

setdate [-h] mmddHHMMccyy

a

Set the current date.

 

setntpserver addr|none

a

Configure the distributed management card to be an NTP client. The NTP server IP address must be on the same subnet as the distributed management card. The default is none.

 

setfru [-h] target instance field value

a

Set FRU ID information. Refer to Specifying Netra CT Server FRU ID Information for more information.

This command is also supported on third-party node boards. Refer to To Configure a Chassis Slot for a Third-Party Node Board for more information.

 

showescapechar

a

Show the escape character used to end a console session.

Configuration (Ethernet ports)

setipmode -b port_num rarp|config|none

a

Set the IP mode of the specified Ethernet port. Choose the IP mode according to the services available in the network (rarp, config, or none). The default for the external Ethernet port is none; the default for the internal Ethernet port is none, that is, no services are available on this port. You must reset the server for the changes to take effect.

 

setipaddr -b port_num addr

a

Set the IP address of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the ipmode is set to config. You must reset the server for the changes to take effect.

 

setipnetmask -b port_num mask

a

Set the IP netmask of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the ipmode is set to config. You must reset the server for the changes to take effect.

 

setipgateway addr

a

Set the IP gateway of the distributed management card. The default is 0.0.0.0. You must reset the server for the changes to take effect.

Configuration (Other)

sethostname hostname

a

Set the hostname to be used in the CLI prompt. The default is netract. The maximum length is 32 characters.

 

setservicemode true|false

a

Set whether MOH and PMS services are started automatically on the distributed management card after a reboot. The default is false, meaning that these services are automatically started.

PMS daemon control

pmsd start [-p port_num] [-e server_admin_state] [-d]

a

Start PMS on the distributed management card.

 

pmsd stop [-p port_num]

a

Stop PMS on the distributed management card.

 

pmsd slotaddressset -s slot_num -i ip_addr

a

Set the IP address for the distributed management card to control and monitor a node board.

 

pmsd slotaddressshow -s slot_num|all

a

Print the IP address set with the pmsd slotaddressset command.

 

pmsd slotrndaddressadd -s slot_num|all -n ip_addr -d ip_addr -r slot_num

a

Add address information for a node board to control other node boards.

 

pmsd slotrndaddressdelete -s slot_num|all -i index_num|all

a

Delete address information added with the pmsd slotrndaddressadd command.

 

pmsd slotrndaddressshow -s slot_num|all -i index_num|all

a

Print address information added with the pmsd slotrndaddressadd command.

 

pmsd operset -s slot_num|all -o maint_config|oper_config|none_config|graceful_reboot

a

Enable automatic recovery of a node board.

 

pmsd infoshow -s slot_num|all

a

Print PMS system information.

 

pmsd historyshow -s slot_num|all

a

Print a log of PMS system events and time stamps.

 

pmsd recoveryoperset -s slot_num|all -o pc|rst|rstpc|pd|rb

a

Manually recover a board in case of fault.

 

pmsd recoveryautooperset -s slot_num|all -o pc|rst|rstpc|pd|rb|rbpc|none|trg [-d startup_delay] [-f on|off] [-r retries] [-n inter_op_delay] [-p reset_power-cycle_delay]

a

Automatically recover a board in case of fault.

 

pmsd recoveryautoinfoshow -s slot_num|all

a

Print the configuration information affected by the recoveryautooperset command.

 

pmsd hwoperset -s slot_num|all -o powerdown|powerup|reset|mon_enable|mon_disable [-f]

a

Perform operations on node board hardware.

 

pmsd hwinfoshow -s slot_num|all

a

Print PMS system information on the hardware.

 

pmsd hwhistoryshow -s slot_num|all

a

Print a log of PMS hardware events and time stamps.

 

pmsd osoperset -s slot_num|all -o reboot|mon_enable|mon_disable [-f]

a

Perform operations on the operating system of a node board.

 

pmsd osinfoshow -s slot_num|all

a

Print PMS system information on the operating system.

 

pmsd oshistoryshow -s slot_num|all

a

Print a log of PMS operating system events and time stamps.

 

pmsd appoperset -s slot_num|all -o force_offline|vote_active|force_active

a

Perform operations on node board applications.

 

pmsd appinfoshow -s slot_num|all

a

Print PMS system information on the applications.

 

pmsd apphistoryshow -s slot_num|all

a

Print a log of PMS application events and time stamps.

 

pmsd version

a

Print the PMS version.

 

pmsd usage

a

Print a synopsis of the pmsd commands.


Information on configuring distributed management card ports, setting up user accounts, specifying FRU ID information, and starting the PMS daemon using the distributed management card CLI is provided in Chapter 2. The PMS daemon commands are described in Using the PMS Application for Recovery and Control of Node Boards.
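The slot and unit ranges in TABLE 3-1 differ by command, which is easy to get wrong when scripting against the CLI. The following is a minimal sketch of a pre-check, assuming only the ranges stated in the table; the names ARG_RANGES and check_argument are hypothetical and are not part of the distributed management card software.

```python
# Node-slot and unit ranges as stated in TABLE 3-1. The dictionary and
# function names are illustrative only; they are not part of the CLI.
ARG_RANGES = {
    "poweroff": (2, 21),
    "poweron": (2, 21),
    "powersupply": (1, 8),
    "console": (3, 20),
    "break": (3, 20),
    "showhealth": (0, 21),
}


def check_argument(command: str, value: int) -> bool:
    """Return True if value lies in the documented range for command."""
    low, high = ARG_RANGES[command]
    return low <= value <= high
```

For example, check_argument("console", 2) is false because slot 2 holds a switching fabric board, while check_argument("poweron", 2) is true.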

 

TABLE 3-2 Standby Distributed Management Card Command-Line Interface Commands

Command Type

Command

Permission

Description

Status

shownetwork

 

Display the current network configuration of the distributed management card.

 

showdate

 

Display the system date.

 

showhostname

 

Display the value of the hostname used in the CLI prompt.

 

showipmode -b port_num

 

Display the value of ip_mode for the specified port number.

 

showipaddr -b port_num

 

Display the value of ip_addr for the specified port number.

 

showipnetmask -b port_num

 

Display the value of ip_netmask for the specified port number.

 

showipgateway

 

Display the value of ip_gateway for the distributed management card.

 

showservicemode

 

Display the value of the distributed management card service mode.

CPU control

reset [-h] [dmc|1B]

r

Reset (reboot) the distributed management card, where dmc is the bottom distributed management card and 1B is the bottom distributed management card.

Administration

logout

 

Log out of the current session.

 

password [-h]

u

Change the existing password.

 

flashupdate -d cmsw|bcfw|bmcfw|rpdf|scdf -f path

a

Flash update the distributed management card software, where cmsw represents the chassis management software; bcfw represents the boot control firmware; bmcfw represents the BMC firmware; rpdf represents the system configuration repository; and scdf initializes the system configuration variables to their defaults. Refer to Updating the Distributed Management Card Flash Images for more information.

 

help

 

Display a list of supported commands.

 

version

 

Display the versions of various software and firmware.

 

setdate [-h] mmddHHMMccyy

a

Set the current date.

Configuration (Ethernet ports)

setipmode -b port_num rarp|config|none

a

Set the IP mode of the specified Ethernet port. Choose the IP mode according to the services available in the network (rarp, config, or none). The default for the external Ethernet port is none; the default for the internal Ethernet port is none, that is, no services are available on this port. You must reset the server for the changes to take effect.

 

setipaddr -b port_num addr

a

Set the IP address of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the ipmode is set to config. You must reset the server for the changes to take effect.

 

setipnetmask -b port_num mask

a

Set the IP netmask of the specified Ethernet port. The default is 0.0.0.0. This command is only used if the ipmode is set to config. You must reset the server for the changes to take effect.

 

setipgateway addr

a

Set the IP gateway of the distributed management card. The default is 0.0.0.0. You must reset the server for the changes to take effect.

Configuration (Other)

sethostname hostname

a

Set the hostname to be used in the CLI prompt. The default is netract. The maximum length is 32 characters.

 

setservicemode true|false

a

Set whether MOH and PMS services are started automatically on the distributed management card after a reboot. The default is false, meaning that these services are automatically started.


Security Provided

A remote command-line session or a console session automatically disconnects after 10 minutes of inactivity.

Security is also provided through the permission levels and passwords set for each account.


Updating the Distributed Management Card Flash Images

The primary boot device for the distributed management card is always the flash. You can update the distributed management card flash images over the network using nfs or tftp. TABLE 3-3 shows the distributed management card flash options.

TABLE 3-3 Distributed Management Card Flash Options

Option

Description

cmsw

Updates the chassis management software, which includes the Chorus operating system, the MOH application, and the PMS application.

bcfw

Updates the boot control firmware.

bmcfw

Updates the BMC firmware.

rpdf

Updates the system configuration repository, which contains information used internally by the CLI in the flash, reinitializes it to a default minimum, and resets the distributed management card.

scdf

(Optional) Initializes the system configuration variables, for example, the external Ethernet port variables, to the defaults.


There is no required sequence for flashing the distributed management card, although the following order is recommended: cmsw, bcfw, bmcfw, and rpdf. You can update individual images if you want.

During a flash update of the BMC firmware, the BMC is not able to respond to communication requests, and the following messages may display on the console:

sysif_xfer_msg: kcs driver xfermsg returns -1
read_evt_buffer: sysif_xfer_msg returns -1
poll_evt_handler: read_evt_buffer returns -1
listner_thread: poll_evt_handler returns -1


To Update All the Distributed Management Card Flash Images

1. Log in to the distributed management card.

2. Flash update the distributed management card images:



Note - The scdf option is not mandatory. Use it only if you want to initialize the system configuration variables to the defaults.



 

hostname cli> flashupdate -d cmsw -f path
hostname cli> flashupdate -d bcfw -f path
hostname cli> flashupdate -d bmcfw -f path
hostname cli> flashupdate -d scdf
hostname cli> flashupdate -d rpdf -f path

where path is the location of the image to flash, in the form nfs://nfs.server.ip.address/directory/filename or tftp://tftp.server.ip.address/directory/filename.

After you update rpdf, the distributed management card resets itself. If you do not update rpdf, you must reset the distributed management card manually.
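The ordering rules in this procedure can be sketched in a few lines. This is an illustration under stated assumptions, not a supplied tool: the function name flashupdate_commands and the example file locations are hypothetical, and the sketch only builds command strings for an operator to review.

```python
# Recommended order from the text; scdf is optional and takes no path.
RECOMMENDED_ORDER = ["cmsw", "bcfw", "bmcfw", "rpdf"]


def flashupdate_commands(paths: dict, reset_defaults: bool = False) -> list:
    """Build flashupdate command lines in the recommended order.

    paths maps an image option (cmsw, bcfw, bmcfw, rpdf) to its nfs:// or
    tftp:// location; options missing from paths are skipped.
    """
    commands = []
    for option in RECOMMENDED_ORDER:
        if option == "rpdf" and reset_defaults:
            # Mirrors the procedure, which runs scdf just before rpdf.
            commands.append("flashupdate -d scdf")
        if option in paths:
            commands.append(f"flashupdate -d {option} -f {paths[option]}")
    if "rpdf" not in paths:
        # Without an rpdf update the card does not reset itself.
        commands.append("reset dmc")
    return commands
```

Calling the function with only some images supplied yields the individual-image case, ending in reset dmc whenever rpdf is not among them.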


To Update an Individual Distributed Management Card Flash Image

1. Log in to the distributed management card.

2. Flash update a distributed management card image:

hostname cli> flashupdate -d option
hostname cli> reset dmc

where option can be cmsw -f path, bcfw -f path, bmcfw -f path, rpdf -f path, or scdf, and path is the location of the image to flash, in the form nfs://nfs.server.ip.address/directory/filename or tftp://tftp.server.ip.address/directory/filename. If you update rpdf, the distributed management card resets itself after finishing the rpdf update, so the manual reset is not needed.


Setting the Date and Time on the Distributed Management Card

The distributed management card does not support battery backup time-of-day because battery life cannot be monitored to predict end of life, and drift in system clocks can be common. To provide a consistent system time, set the date and time on the distributed management card using one of these methods:


To Set the Distributed Management Card Date and Time Manually

1. Log in to the distributed management card.

2. Set the date and time manually:

hostname cli> setdate mmddHHMMccyy

where mm is the current month; dd is the current day of the month; HH is the current hour of the day; MM is the current minutes past the hour; cc is the current century minus one (the first two digits of the year); and yy is the last two digits of the current year.
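As a quick check of the mmddHHMMccyy layout, the argument can be generated from a timestamp. This is a minimal sketch; the function name setdate_argument is hypothetical.

```python
from datetime import datetime


def setdate_argument(when: datetime) -> str:
    """Format a datetime as the mmddHHMMccyy string that setdate expects."""
    # %Y yields the four-digit year, which is ccyy (for 2004: cc=20, yy=04).
    return when.strftime("%m%d%H%M%Y")
```

For 2:30 p.m. on March 15, 2004, the function returns 031514302004, which would be entered as setdate 031514302004.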


To Set the Distributed Management Card Date and Time as an NTP Client

1. Log in to the distributed management card.

2. Set the date and time as an NTP client:

hostname cli> setntpserver addr

where addr is the IP address of the NTP server.
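TABLE 3-1 notes that the NTP server IP address must be on the same subnet as the distributed management card. The sketch below shows one way to verify that before running setntpserver, using the address and netmask reported by showipaddr and showipnetmask. The function name on_same_subnet and the example addresses are assumptions for illustration.

```python
import ipaddress


def on_same_subnet(dmc_addr: str, netmask: str, ntp_addr: str) -> bool:
    """Return True if ntp_addr falls inside the subnet of the DMC port."""
    # strict=False accepts a host address rather than a network address.
    network = ipaddress.ip_network(f"{dmc_addr}/{netmask}", strict=False)
    return ipaddress.ip_address(ntp_addr) in network
```

With a card at 192.168.1.10 and netmask 255.255.255.0, a server at 192.168.1.200 passes the check and one at 192.168.2.1 does not.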


Displaying Board State Information

Board information, including type of board, power state of the board, and boot state of the board can be displayed for each slot in the system using the CLI showcpustate command.

Sample output from this command is:

hostname cli> showcpustate
-------------------------------------------------------
   Slot No   : Board Type   : Power_State : Boot_State
-------------------------------------------------------
      1A     :    DMC Board :          On :      Ready
      1B     :    DMC Board :          On :      Ready
       2     : SWITCH Board :          On :      Ready
       3     :    CPU Board :         Off :
       4     :        Empty :             :
       5     :    CPU Board :          On :      Ready
       6     :    CPU Board :          On :    Offline
       7     :    CPU Board :          On :    Offline
       8     :    CPU Board :          On :      Ready
       9     :    CPU Board :          On :    Offline
      10     :    CPU Board :          On :    Unknown
 
Press 'q' + Return to quit, hit Return to continue

TABLE 3-4 contains the various state descriptions.

TABLE 3-4 Board State Information

State

Value

Description

Power_State

On

The slot is powered on

Power_State

Off

The slot is powered off

Boot_State

Online

The board boot sequence has started

Boot_State

Ready

The board boot sequence has finished, and the board is ready to use

Boot_State

Offline

The board may be running its power-on self-test (POST), the board may be at the OpenBoot PROM level, or the boot may have failed

Boot_State

Unknown

The distributed management card cannot determine the current boot state of the board


For third-party node boards, the showcpustate command returns a state of Unknown.
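Because the showcpustate output is a colon-separated table, it is straightforward to turn into structured data. The following sketch assumes the column layout shown in the sample output; the function name parse_cpustate is hypothetical.

```python
def parse_cpustate(output: str) -> dict:
    """Map each slot label to its board type, power state, and boot state."""
    slots = {}
    for line in output.splitlines():
        fields = [field.strip() for field in line.split(":")]
        # Data rows have exactly four colon-separated columns; the header
        # row has four as well and is skipped by name.
        if len(fields) != 4 or fields[0] == "Slot No":
            continue
        slot, board, power, boot = fields
        slots[slot] = {"board": board, "power": power, "boot": boot}
    return slots
```

Rows for powered-off or empty slots produce empty strings for the missing columns, matching the blank cells in the sample output.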


Booting Node Boards

Node boards can boot from a local disk or over the network.

Board Power-on Sequence

When you power on the Netra CT 820 system by pressing the power switch on the rear of the system to the On (|) position, the boards power on in this sequence:

1. The two distributed management cards are powered on; the top card in slot 1A is designated as the active distributed management card, and the bottom card in slot 1B is designated as the standby distributed management card. Once the cards have booted and are ready for use, the "Ready" LED is solid green on the active card and blinking green on the standby card.

2. The active distributed management card powers on the switching fabric boards in slots 2 and 21. While the switching fabric boards are powering on, the blue LED state is solid; once the boards have booted and are ready for use, the blue LED is off.

3. The active distributed management card looks at the Boot_Mask field in the midplane FRU ID for boot servers.

a. If one or more boot servers are designated in the Boot_Mask field, the active distributed management card powers on the boot servers first. Once those boards have booted and are ready for use, the active distributed management card powers on the rest of the node boards together; these boards boot from the boot servers. The method of booting depends first on the value in the Boot_Devices field in the midplane FRU ID and second on the value in the OpenBoot PROM NVRAM boot-device configuration variable. After a board has booted and is ready for use, the "Ready" LED is solid green and the blue LED is off.

or

b. If no boot server is designated in the Boot_Mask field, the active distributed management card powers on all the node slots together. Once the node boards are powered on, the method of booting depends first on the value in the Boot_Devices field in the midplane FRU ID and second on the value in the OpenBoot PROM NVRAM boot-device configuration variable. After a board has booted and is ready for use, the "Ready" LED is solid green and the blue LED is off.

A midplane FRU ID fault is a system fault, and no boards can be powered on.
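The power-on sequence above can be modeled as a list of stages. This is a simplified sketch, assuming node boards occupy slots 3 through 20 and ignoring empty slots and fault handling; the function name power_on_order is hypothetical.

```python
def power_on_order(boot_servers: set) -> list:
    """Return the power-on stages as lists of slot labels.

    boot_servers holds the node slots designated in Boot_Mask (may be empty).
    """
    node_slots = list(range(3, 21))        # node slots 3 through 20
    stages = [["1A", "1B"], ["2", "21"]]   # DMCs, then switching fabric boards
    first = [str(s) for s in node_slots if s in boot_servers]
    rest = [str(s) for s in node_slots if s not in boot_servers]
    if first:
        stages.append(first)               # boot servers come up first
    stages.append(rest)                    # remaining node boards together
    return stages
```

With boot servers in slots 3 and 5, the result has four stages; with no boot servers, the node boards form a single third stage.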

Boot Device Variables

By default, the OpenBoot PROM NVRAM boot-device configuration variable is set to disk net, disk being an alias for the path to the local disk, and net being an alias for the path of the primary network. You can set the boot device for node boards through the distributed management card CLI setfru command. Refer to Configuring a Chassis Slot for a Board for information on using the setfru command to specify a boot device for a board.

For example, you might want to change the node board in slot 3 to boot first from its PMC disk. To do this, check the current OpenBoot PROM boot-device variable:

ok printenv boot-device
boot-device =     disk net
ok

On the distributed management card, check and change the boot_devices setting:

hostname cli> showfru slot 3 boot_devices
showfru: Boot_Devices:
hostname cli> setfru slot 3 boot_devices pmc0/disk net
hostname cli> showfru slot 3 boot_devices
showfru: Boot_Devices: pmc0/disk net
hostname cli>

After you power cycle the system, check the OpenBoot PROM boot-device variable:

ok printenv boot-device
boot-device =     pmc0/disk net
ok

When a node board is hot swapped, power cycled, rebooted, or reset, the OpenBoot PROM firmware checks with the distributed management card for a boot device for that slot. The distributed management card sends the value from the Boot_Devices field in FRU ID to the OpenBoot PROM firmware; the value is either the boot device list for that slot you set using the setfru command or a null string if you did not set a boot device list for that slot. The value overwrites the NVRAM boot-device value. The board will boot from the value in the boot-device variable if its diag-switch? variable is set to false (the default); the board will boot from the value in the diag-device variable (the default is net) if its diag-switch? variable is set to true.

In the event of a distributed management card fault, a node board hot swap, power cycle, reboot or reset will cause the OpenBoot PROM firmware to default to the value set in the boot-device variable.
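The selection rules in this section can be summarized as a small decision function. It is a sketch under one explicit assumption: an empty Boot_Devices value leaves the NVRAM boot-device setting unchanged, which is how the earlier example (empty Boot_Devices, boot-device still disk net) reads. The function name effective_boot_list is hypothetical.

```python
from typing import Optional


def effective_boot_list(nvram_boot_device: str,
                        diag_device: str,
                        diag_switch: bool,
                        fru_boot_devices: Optional[str]) -> str:
    """Return the device list a node board would try at its next boot.

    fru_boot_devices is the Boot_Devices value sent by the distributed
    management card, or None if the card is faulted and sends nothing.
    """
    # Assumption: an empty Boot_Devices string leaves NVRAM untouched.
    if fru_boot_devices:
        nvram_boot_device = fru_boot_devices
    # diag-switch? selects between diag-device and boot-device.
    return diag_device if diag_switch else nvram_boot_device
```

For the slot 3 example, a Boot_Devices value of pmc0/disk net replaces disk net, unless diag-switch? is true, in which case the board uses diag-device.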

Booting with a DHCP Server

You can configure Netra CT node boards to boot over DHCP. This process includes setting the node board boot device for DHCP, forming the node board DHCP client ID, and configuring the DHCP server.

On the Netra CT system, the DHCP client ID is a combination of the system's midplane Sun part number (7 bytes), the system's midplane Sun serial number (6 bytes), and the board's geographical address (slot number) (2 bytes). The parts are separated by a : (colon).


To Configure a Node Board to Boot Over DHCP

1. Log in to the distributed management card.

2. Set the boot device for the board to dhcp with the setfru command:

hostname cli> setfru slot fru_instance Boot_Devices network_devicename:dhcp

where fru_instance is the slot number of the board to be configured for DHCP and network_devicename is a path or alias to a network device. For example, to set the boot device to dhcp for the node board in slot 4, enter the following:

hostname cli> setfru slot 4 Boot_Devices net:dhcp

3. Get the Netra CT system part number and the system serial number with the showfru command:

hostname cli> showfru midplane 1 Sun_Part_No
...
hostname cli> showfru midplane 1 Sun_Serial_No
...

4. Form the three-part client ID by using the system part number, the system serial number, and the slot number, separated by colons. Then, convert the client ID to its ASCII representation (the hexadecimal code of each character).

For example, if the output from the showfru commands in Step 3 is 375-4335 (Sun part number) and 000001 (Sun serial number), and you want to form the client ID for the node board in slot 4, the client ID is: 3754335:000001:04.

Translate the client ID to its ASCII equivalent. For example:

Client ID part

ASCII Representation

3754335

33 37 35 34 33 33 35

:

3A

000001

30 30 30 30 30 31

:

3A

04

30 34


Thus, the example client ID in ASCII is:

33 37 35 34 33 33 35 3A 30 30 30 30 30 31 3A 30 34.

5. Configure the DHCP server.

Refer to the Solaris DHCP Administration Guide on the web site docs.sun.com for information on how to configure the DHCP server for remote boot and diskless boot clients.

The client ID is retained across a node board power cycle, reboot, or reset; the distributed management card updates the client ID during a first-time power on or a hot swap of a node board. In the event of a distributed management card fault, a node board reboot or reset will retrieve the previously written client ID.
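The client ID construction in Step 4 can be reproduced programmatically. The sketch below follows the worked example (part number 375-4335, serial number 000001, slot 4); the function names dhcp_client_id and ascii_hex are hypothetical.

```python
def dhcp_client_id(part_no: str, serial_no: str, slot: int) -> str:
    """Join part number, serial number, and two-digit slot with colons."""
    # The part number is used without its hyphen (375-4335 -> 3754335).
    return f"{part_no.replace('-', '')}:{serial_no}:{slot:02d}"


def ascii_hex(client_id: str) -> str:
    """Return the space-separated hexadecimal ASCII codes of client_id."""
    return " ".join(f"{byte:02X}" for byte in client_id.encode("ascii"))
```

Applied to the example values, dhcp_client_id gives 3754335:000001:04, and ascii_hex of that string gives the same sequence of hexadecimal codes listed in Step 4.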


Connecting to Node Board Consoles from the Distributed Management Card

The Netra CT system provides the capability to connect to node boards and open console sessions from the active distributed management card.

You begin by logging in to the distributed management card through either the serial port or the Ethernet port. Once a console session with a node board is established, you can run Solaris system administration commands, such as passwd, read status and error messages, or halt the board in that particular slot.

Configuring Your System for Multiple Console Use

To enable your system to use multiple consoles, you set several variables, either at the Solaris level or at the OpenBoot PROM level. Set these variables on each node board to enable console use.


To Configure Your System for Multiple Consoles

1. Log in as root to the node board, using the on-board console port ttya.

2. Enter either of the following sets of commands to enable multiple consoles:

From the Solaris level:

# eeprom "multiplexer-output-devices=ttya ssp-serial"
# eeprom "multiplexer-input-devices=ttya ssp-serial"
# eeprom input-device=input-mux
# eeprom output-device=output-mux
# reboot

or

From the OpenBoot PROM level:

ok setenv multiplexer-output-devices ttya ssp-serial
ok setenv multiplexer-input-devices ttya ssp-serial
ok setenv input-device input-mux
ok setenv output-device output-mux
ok reset-all

Establishing Console Sessions Between the Distributed Management Card and Node Boards

Once you have configured your system for multiple console use, you can log in to the distributed management card and open a console for a slot. The Netra CT system allows four console users per node board slot.

TABLE 3-5 shows the distributed management card CLI console-related commands that can be executed from the current login session on the distributed management card.

TABLE 3-5 Distributed Management Card CLI Console-Related Commands

Command

Description

console cpu_node

Enter console mode and connect to a specified node board, where cpu_node can be 3 through 20.

break cpu_node

Put the specified node board in debug mode, where cpu_node can be 3 through 20. Debug mode can use OpenBoot PROM or kadb, depending on server configuration.

setescapechar value

Set the escape character to be used in all future console sessions. The default is ~ (tilde). Refer to TABLE 3-6 for escape character use.

showescapechar

Show the current escape character.


Most node board consoles use the system management bus, but a board at the OpenBoot PROM level connects over the IPMI bus. There can be only one console user on the IPMI bus at any one time.

For example, if the board in slot 4 is at the OpenBoot PROM level, the user opening a console session will connect to it over the IPMI bus. This will cause the IPMI bus to be fully occupied and no other users can connect over that bus. If they try, an error message displays. However, other users can connect to boards in other slots over the system management bus. The system management bus is faster than the IPMI bus, while the IPMI bus is typically a more stable communication channel than the system management bus.

Once you have a console connection with a node board, you can issue normal Solaris commands. There are several escape character sequences to control the current session. TABLE 3-6 shows these sequences.

TABLE 3-6 Node Board Console-Related Escape Character Sequences

Sequence

Description

~b

Break from the Solaris level and enter the OpenBoot PROM (debug) level.

~.

End the console session.

~g

Determine the status (system management bus or IPMI) of the current console.

~t

Toggle between system management bus and IPMI.



To Start a Console Session from the Distributed Management Card

1. Log in to the distributed management card.

You can log in to the distributed management card through a terminal attached to either the serial port connection or the Ethernet port connection.

2. Open a console session to a board in a slot:

hostname cli> console cpu_node

where cpu_node is 3 through 20. For example, to open a console to the board in slot 4, enter the following:

hostname cli> console 4

You now have access to the board in slot 4. Depending on the state of the board in that particular slot, and whether the previous user logged out of the shell, you see one of several prompts:


To Determine the Status of the Current Console

At the console prompt, enter the escape sequence ~g.

A message displays, indicating the current state of the console connection. The message is either:

Console mode is IPMI

This means the console is in Solaris mode or OpenBoot PROM mode.

Or the message might be:

Console mode is NET

This means the console is in Solaris mode.


To Toggle Between the System Management Bus and IPMI

Toggling between the system management bus and IPMI could be useful for troubleshooting. For example, if the console stops working for some reason, you could try toggling to IPMI (the more reliable communication channel).

1. If the node board is in Solaris mode, enter the escape sequence ~t:

# ~t
New console mode is IPMI
#

The console switches between the system management bus and IPMI mode. The console now fully occupies the IPMI bus. No other console may be at the OpenBoot PROM level at the same time. If another user attempts to access a board that is occupying the IPMI bus, the console connection will fail.

2. To return to system management bus mode, enter ~t again and press Enter:

# ~t
New console mode is NET
#


To Break into OpenBoot PROM from the Console

At the Solaris prompt, enter the escape sequence ~b (see TABLE 3-6).

The console mode switches to IPMI:

New console mode is IPMI
Type `go' to resume
ok

You can now debug from the OpenBoot PROM level.


To End the Console Session

1. (Optional) Log out of the Solaris shell.

2. At the prompt, disconnect from the console by entering the escape sequence ~. (tilde period):

prompt ~.
hostname cli>

Disconnecting from the console does not automatically log you out from the remote host. Unless you log out from the remote host, the next console user who connects to that board sees the shell prompt of your previous session.


To Show the Current Escape Character

The current escape character is displayed:

hostname cli> escape_char: value


To Change the Default Escape Character

Use the setescapechar command with the new escape character as its argument:

hostname cli> setescapechar value

where value is any printable character. For example, to change the default escape character from ~ (tilde) to # (pound sign), enter the following:

hostname cli> setescapechar #

The pound sign is now the escape character for all future console sessions.


Using the PMS Application for Recovery and Control of Node Boards

This section describes specifying recovery operations and controlling node boards through the distributed management card PMS CLI commands.

Recovery Configuration of a Node Board From the Distributed Management Card

You specify the recovery configuration of a node board by using the command pmsd operset -s slot_num|all, where slot_num is a single slot number and all selects every slot in the Netra CT system that contains a node board, together with the recovery mode for the specified slot or slots.

The recovery configuration can be maintenance mode, operational mode, or none mode. Maintenance mode means the distributed management card's automatic recovery of a node board is disabled, and PMS applications are started in an offline state, so that you can use manual maintenance operations. Operational mode means the distributed management card's automatic recovery of a node board is enabled; the distributed management card will recover the node board in the event of a monitoring fault, and start PMS applications in an active state. None mode means the distributed management card's automatic recovery mode may be manually enabled or disabled; PMS application states are not enforced.
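
The three modes differ in two respects: whether automatic recovery is enabled, and which state PMS applications are started in. The Python sketch below simply restates that paragraph as data; the class and field names are hypothetical and do not correspond to any PMS interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecoveryMode:
    auto_recovery: Optional[bool]    # True or False when enforced; None means set manually
    app_start_state: Optional[str]   # enforced PMS application state; None means not enforced


MODES = {
    "maintenance": RecoveryMode(auto_recovery=False, app_start_state="offline"),
    "operational": RecoveryMode(auto_recovery=True, app_start_state="active"),
    "none": RecoveryMode(auto_recovery=None, app_start_state=None),
}

for name, mode in MODES.items():
    print(name, mode)
```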

The mode is stored in persistent storage. You specify the operation to be performed on the specified slot by using the option -o with the parameter maint_config (set the hardware, operating system, and applications into maintenance mode), oper_config (set the hardware, operating system, and applications into operational mode), none_config (set the hardware, operating system, and applications into no enforcement mode), or graceful_reboot (bring the applications offline if needed and then reboot the operating system).


To Specify the Recovery Configuration of a Node Board

1. Log in to the distributed management card.

2. Configure the automatic recovery mode with the operset command:

hostname cli> pmsd operset -s slot_num|all -o maint_config|oper_config|none_config|graceful_reboot

where slot_num can be a slot number from 3 to 20, and all specifies all slots containing node boards. For example, to make PMS recovery operational for the entire Netra CT server, enter:

hostname cli> pmsd operset -s all -o oper_config
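
If you generate these commands from a script, it can help to validate the arguments before sending them to the CLI. The following Python sketch only builds the command string; it does not contact the distributed management card. The function names are hypothetical, and the operation names follow the spelling used in the descriptive text of this section.

```python
OPERSET_OPERATIONS = ("maint_config", "oper_config", "none_config", "graceful_reboot")


def slot_argument(slot):
    """Validate a slot selector: the string 'all' or a slot number from 3 to 20."""
    if slot == "all":
        return "all"
    if isinstance(slot, int) and 3 <= slot <= 20:
        return str(slot)
    raise ValueError(f"slot must be 'all' or 3 through 20, got {slot!r}")


def operset_command(slot, operation):
    """Build a 'pmsd operset' command line without running it."""
    if operation not in OPERSET_OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    return f"pmsd operset -s {slot_argument(slot)} -o {operation}"


print(operset_command("all", "oper_config"))   # pmsd operset -s all -o oper_config
print(operset_command(4, "maint_config"))      # pmsd operset -s 4 -o maint_config
```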

Printing PMS Recovery Configuration Information

The pmsd infoshow -s slot_num|all command can be used to print the recovery configuration and alarm status for the recovery configuration.

The pmsd historyshow -s slot_num|all command can be used to print a recovery configuration and runtime message log. The log is printed to the ChorusOS terminal performing the operation.

Detailed Recovery of a Board in Case of Fault

You can perform detailed, manual recovery operations on a board or instruct PMS to perform detailed, automatic recovery operations on a board using the CLI. The operations are performed across the hardware, the operating system, and the applications.

For manual recovery, use the pmsd recoveryoperset -s slot_num|all command. This command can only be run when the board is in maintenance mode or none mode (PMS applications are offline). You specify the recovery operation to be performed on the specified slot by using the option -o with the parameters: pc (power cycle), rst (reset), rstpc (reset, then power cycle), pd (power down), or rb (reboot).

For automatic recovery, use the pmsd recoveryautooperset -s slot_num|all command. This command tells PMS what to do in response to a fault when the board is in operational mode (PMS applications are active).

You specify the automatic recovery operation to be performed on the specified slot by using the option -o with one of the parameters: pc (power cycle), rst (reset), rstpc (reset, then power cycle), pd (power down), rb (reboot), rbpc (reboot, then power cycle), none (no recovery), or trg (manually simulate a fault to trigger a recovery).

Optional parameters for automatic recovery are:

-d startup delay: the time in deciseconds between a fault occurrence and the start of a recovery operation. The default is 0 deciseconds.

-f off|on: whether a power down operation occurs if the recovery operation fails. on specifies that power down occurs and off specifies that it does not. The default is off.

-r retries: the number of times a recovery operation can occur and fail before it is terminated. The default is one try.

-n inter_op_delay: the time in deciseconds between one operation and the next for an operation with multiple retries. The default is 0 deciseconds. In this command, 1 decisecond equals 10 milliseconds. Change the default to a nonzero value, for example 4000 (40 seconds), to allow time between the operations.

-p reset_power-cycle_delay: the time in deciseconds to wait between the reset and power cycle portions of the recovery operation before a failed reset is declared and the power cycle portion of the operation starts. The default is 0 deciseconds.
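
The delay arithmetic can be checked with a short conversion. This Python sketch uses the unit stated in the text, where one delay unit is 10 milliseconds and 4000 units is 40 seconds. Note that a standard decisecond is 100 milliseconds, so treat the UNIT_MS constant as an assumption to confirm on your system. The function names are hypothetical.

```python
UNIT_MS = 10  # assumption taken from the text: one delay unit is 10 milliseconds


def seconds_to_delay_units(seconds):
    """Convert a delay in seconds to the value passed to -d, -n, or -p."""
    return round(seconds * 1000 / UNIT_MS)


def delay_units_to_seconds(units):
    """Convert a -d, -n, or -p value back to seconds."""
    return units * UNIT_MS / 1000


print(seconds_to_delay_units(40))     # 4000
print(delay_units_to_seconds(4000))   # 40.0
```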


To Manually Recover a Board

1. Log in to the distributed management card.

2. Perform manual recovery operations on a board with the recoveryoperset command:

hostname cli> pmsd recoveryoperset -s slot_num|all -o pc|rst|rstpc|pd|rb

where slot_num can be a slot number from 3 to 20, and all specifies all slots containing node boards. For example, to instruct PMS to reboot the board in slot 5, enter the following:

hostname cli> pmsd recoveryoperset -s 5 -o rb


To Automatically Recover a Board

1. Log in to the distributed management card.

2. Configure automatic recovery operations on a board with the recoveryautooperset command:

hostname cli> pmsd recoveryautooperset -s slot_num|all -o pc|rst|rstpc|pd|rb|rbpc|none|trg [-d startup delay][-f on|off][-r retries][-n inter_op_delay][-p reset_power-cycle_delay]

where slot_num can be a slot number from 3 to 20, and all specifies all slots containing node boards. For example, to instruct PMS to automatically reboot slot 5 after a fault, with the default delays, retries, and failure power state, enter the following:

hostname cli> pmsd recoveryautooperset -s 5 -o rb
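
Because recoveryautooperset takes several optional parameters, a script that issues it may benefit from a helper that adds only the options you set and leaves the rest at their defaults. The Python sketch below is one way to do that; it only assembles the command string, the function and argument names are hypothetical, and the option letters are those listed in the syntax above.

```python
AUTO_OPERATIONS = ("pc", "rst", "rstpc", "pd", "rb", "rbpc", "none", "trg")


def recoveryautooperset_command(slot, operation, startup_delay=None,
                                power_down_on_failure=None, retries=None,
                                inter_op_delay=None, reset_power_cycle_delay=None):
    """Build a 'pmsd recoveryautooperset' command line without running it."""
    if slot != "all" and not (isinstance(slot, int) and 3 <= slot <= 20):
        raise ValueError("slot must be 'all' or 3 through 20")
    if operation not in AUTO_OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    parts = ["pmsd", "recoveryautooperset", "-s", str(slot), "-o", operation]
    # Options left as None are omitted so that the documented defaults apply.
    if startup_delay is not None:
        parts += ["-d", str(startup_delay)]
    if power_down_on_failure is not None:
        parts += ["-f", "on" if power_down_on_failure else "off"]
    if retries is not None:
        parts += ["-r", str(retries)]
    if inter_op_delay is not None:
        parts += ["-n", str(inter_op_delay)]
    if reset_power_cycle_delay is not None:
        parts += ["-p", str(reset_power_cycle_delay)]
    return " ".join(parts)


print(recoveryautooperset_command(5, "rb"))
print(recoveryautooperset_command(5, "rstpc", retries=3, inter_op_delay=4000))
```

The first call reproduces the example above; the second adds a retry count and the nonzero inter-operation delay that the text recommends when retries are used.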

Printing PMS Automatic Recovery Information

The pmsd recoveryautoinfoshow -s slot_num|all command can be used to print the configuration information set by the recoveryautooperset command.

Monitoring and Controlling a Node Board's Resources From the Distributed Management Card

PMS can perform operations on a board's hardware, the operating system, and applications. You can specify that PMS performs operations on one of these, rather than all.

Hardware Operations

The pmsd hwoperset -s slot_num|all command performs operations on the hardware. The operations can only be performed in maintenance or none mode unless the optional -f parameter is used. You specify the operation to be performed on the specified slot by using the option -o with the parameters: powerdown (set the hardware to the power-off state), powerup (set the hardware to the power-on state), reset (reset the hardware), mon_enable (enable health monitoring of the hardware), or mon_disable (disable health monitoring of the hardware). The optional -f parameter can be used to perform the operation even if applications are in the active state, and the slot is in operational mode.

The pmsd hwinfoshow -s slot_num|all command can be used to print PMS system information on the hardware state, monitoring status, and alarm status (whether an alarm was generated).

The pmsd hwhistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the hardware's operation. The log is printed to the ChorusOS terminal performing the operation.

Operating System Operations

The pmsd osoperset -s slot_num|all command performs operations on the operating system. The operations can only be performed in maintenance or none mode unless the optional -f parameter is used. You specify the operation to be performed on the specified slot by using the option -o with the parameters: reboot (reboot the operating system), mon_enable (enable health monitoring of the operating system), or mon_disable (disable health monitoring of the operating system). The optional -f parameter can be used to perform the operation even if applications are in the active state, and the slot is in operational mode.

The pmsd osinfoshow -s slot_num|all command can be used to print PMS system information on the operating system state, monitoring status, and alarm status (whether an alarm was generated).

The pmsd oshistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the operating system's operation. The log is printed to the ChorusOS terminal performing the operation.

Application Operations

The pmsd appoperset -s slot_num|all command performs operations on the applications. The operations can only be performed in the none mode. You specify the operation to be performed on the specified slot by using the option -o with the parameters: force_offline (force the applications to an offline state), vote_active (move the group of applications to the active state only if all of the applications agree to be moved), or force_active (force the applications to the active state).

The pmsd appinfoshow -s slot_num|all command can be used to print PMS system information on the applications' state and alarm status (whether an alarm was generated).

The pmsd apphistoryshow -s slot_num|all command can be used to print a short log (one-line descriptions) of messages pertaining to changes in the applications' operation. The log is printed to the ChorusOS terminal performing the operation.
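
The mode restrictions on the operset commands described in this section can be collected into one lookup. The Python sketch below restates those rules for quick reference; it is not how PMS enforces them, and the function and variable names are hypothetical.

```python
# Modes in which each command may run without -f, as described in the text.
ALLOWED_MODES = {
    "recoveryoperset": {"maintenance", "none"},
    "hwoperset": {"maintenance", "none"},
    "osoperset": {"maintenance", "none"},
    "appoperset": {"none"},
}
# Commands for which the text describes an optional -f (force) parameter.
FORCEABLE = {"hwoperset", "osoperset"}


def is_permitted(command, mode, force=False):
    """Return True if the text allows the command in the given recovery mode."""
    if mode in ALLOWED_MODES[command]:
        return True
    return force and command in FORCEABLE


print(is_permitted("hwoperset", "operational"))               # False
print(is_permitted("hwoperset", "operational", force=True))   # True
print(is_permitted("appoperset", "maintenance"))              # False
```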

Printing Other PMS Information

The pmsd version command prints the current version of pmsd.

The pmsd usage command prints a synopsis of the pmsd commands.


Monitoring Your System

This section describes various ways to monitor your system.

Command-line Interface Information

The distributed management card CLI provides many commands to display system status. To view system status, refer to the CLI commands described in Using the Distributed Management Card Command-Line Interface, in particular the show commands.

The MOH Application

The MOH collects information about individual field replaceable units (FRUs) in your system and monitors their operational status. MOH can also monitor certain daemons; for example, if you installed the Netra High Availability Suite, MOH monitors daemons through that application.

Starting and Stopping MOH

If you installed the Solaris patches for MOH in a directory other than the default directory, specify that path instead. You must start the MOH application as root.

# cd /opt/SUNWnetract/mgmt3.0/bin
# ./ctmgx start [option]

Refer to TABLE 2-4 for the options available with ctmgx start.

# cd /opt/SUNWnetract/mgmt3.0/bin
# ./ctmgx stop
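
If you start and stop MOH from a script, the two commands above can be wrapped in a small helper. The Python sketch below mirrors the cd and ./ctmgx steps by returning a working directory and an argument list; it does not run anything itself. The function name is hypothetical, and the directory is the default shown above, so substitute your own path if you installed MOH elsewhere.

```python
DEFAULT_BIN_DIR = "/opt/SUNWnetract/mgmt3.0/bin"   # default path shown in this section


def ctmgx_call(action, options=(), bin_dir=DEFAULT_BIN_DIR):
    """Return (working_directory, argv) mirroring 'cd bin_dir' then './ctmgx action'."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    if action == "stop" and options:
        raise ValueError("options are only described for 'start'")
    return bin_dir, ["./ctmgx", action, *options]


cwd, argv = ctmgx_call("start")
print(cwd, argv)
cwd, argv = ctmgx_call("stop")
print(cwd, argv)
# To execute as root: subprocess.run(argv, cwd=cwd, check=True)
```

Any start options must come from TABLE 2-4; the helper passes them through unchanged and does not check them.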

Once MOH is running, it interfaces with your SNMP or RMI application to discover network elements, monitor the system, and provide status messages. Refer to the Netra CT Server Software Developer's Guide for information on writing applications to interface with the MOH application.

Additional Troubleshooting Information

In the event of an active distributed management card fault, hot swap is not supported.

For additional troubleshooting information, refer to the Netra CT Server Service Manual.