CHAPTER 11
Operating Lights Out Management from Solaris
This chapter explains how to use the LOMlite2-specific commands available in Solaris 8 for monitoring and managing a Netra 20 server.
For an introduction to the LOMlite2 device and a description of an alternative user interface to it, see Chapter 10.
To use the Lights Out Management (LOM) facilities, either remotely or locally, you need a terminal connection to the LOM console port on the Netra 20 server.
There are two ways of interrogating the LOMlite2 device or of sending it commands to perform:
From the lom> shell prompt. For information about how to do this, see Chapter 10.
By using the LOMlite2-specific Solaris commands at the UNIX # prompt. These commands are described in this chapter.
The Solaris commands described in this section, which are all available from the UNIX # prompt, run the /usr/sbin/lom utility.
Where appropriate, the command lines given in this section are accompanied by typical output from the commands.
To view the manual pages for the LOMlite2 utility, type:
To check that the input lines and the output line for the power supply unit are working normally, type:
Note - If there are any failures of the PSU that affect more than just the input or output lines, Solaris will not run. However, if standby power is present, you can still use the LOMlite2 shell commands described in Chapter 10.
To check whether the PSU LEDs are on or off, type:
# lom -L
LOMlite led states:
1 on Power
2 off Fault
3 off Supply A
4 off Supply B
5 on PSU ok
6 off PSU fail
#
Note - The above example is taken from an AC system, so the Supply A and Supply B status LEDs, which relate to the DC power supply, are both reported as off.
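If you want a script to act on the LED states, the output can be filtered. The following is a minimal sketch, not part of the lom utility. It assumes that lom -L prints one LED per line in the form number, state, name, as the example in this section suggests, and that a POSIX-style shell and awk are available. The function name leds_on and the sample variable are illustrative only; the sample text is copied from the example rather than read from a live system.

```shell
#!/bin/sh
# Sketch only: list the LEDs reported as "on".
# Assumed input format (one LED per line): <number> <on|off> <LED name>
leds_on() {
    awk '$2 == "on" {
        name = $3
        for (i = 4; i <= NF; i++) name = name " " $i
        print name
    }'
}

# Sample copied from the example output in this section (not a live query).
sample='LOMlite led states:
1 on Power
2 off Fault
3 off Supply A
4 off Supply B
5 on PSU ok
6 off PSU fail'

printf '%s\n' "$sample" | leds_on
```

On a running system you would pipe the real command into the function instead, for example lom -L | leds_on.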
To check the status of the fans, type:
To identify each fan, see Fan Identification. If you need to replace a fan, contact your local Sun sales representative and quote the part number of the component you need. For information, see Appendix A and the Netra 20 Service and System Reference Manual.
The -v option displays the status of the Netra 20 server's internal circuit breakers. For any that have been tripped, the status will read faulty. The system contains two circuit breakers: one for the PSU and one for the System Configuration Card reader. If there is a problem with the circuit breakers, remove the device connected to the relevant port. When you do this, the circuit breakers will automatically reset. If there is a problem with the circuit breaker for the System Configuration Card, it means that you do not have a valid System Configuration Card inserted. Insert one.
To check the status of the supply rails and internal circuit breakers, type:
# lom -v
To check the internal temperature of the system and also the system's warning and shutdown threshold temperatures, type:
To check whether the Fault LED and alarms are on or off, type:
Alarms 1, 2, and 3 are software flags. They are associated with no specific conditions but are available to be set by your own processes or from the command line (see Turning Alarms On and Off (lom -A)).
For full information about enabling and using the LOMlite2's watchdog process, see Configuring the LOMlite2 to Restart the Server Automatically After a Lockup.
To find out how the LOMlite2 watchdog is currently configured, type:
The LOMlite2 watchdog is enabled by default when Solaris boots. This means that if the watchdog does not receive a "pat" for 40 seconds, it will turn on the Fault LED on the front and back panels of the system, generate a LOM event report, and, if configured to do so, perform an automatic server restart. However, although the watchdog is enabled by default when Solaris boots, the Hardware reset option is not. This means that the LOMlite2 device does not, by default, automatically restart the server after a lockup.
To configure the LOMlite2 device to perform an automatic server restart (ASR) after a lockup, you must enable the Hardware reset option as well as the Watchdog option. For more information, see Configuring the LOMlite2 to Restart the Server Automatically After a Lockup.
To view the settings of all the configurable variables for the LOMlite2 device, type:
To view all the status data stored by the LOMlite2 device plus the details of the device's own configuration, type:
where n is the number of reports (up to 128) that you want to see and x specifies the level of reports you are interested in. There are four levels of events:
If you specify a level, you will see reports for that level and above. For example, if you specify level 2, you will see reports of level 2 and level 1 events. If you specify level 3, you will see reports of level 3, level 2, and level 1 events.
If you do not specify a level, you will see reports of level 3, level 2, and level 1 events.
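The level rule can be written out as a short shell sketch. It is an illustration of the rule described above, not a lom feature: the function name levels_shown is hypothetical, and the sketch assumes a POSIX-style shell. Levels are given by number only.

```shell
#!/bin/sh
# levels_shown [x]: print the event levels reported when level x is requested.
# With no argument, the default described above (levels 1 through 3) applies.
levels_shown() {
    level=${1:-3}
    case $level in
        1|2|3|4) ;;
        *) echo "level must be 1, 2, 3, or 4" >&2; return 1 ;;
    esac
    i=1
    out=
    while [ "$i" -le "$level" ]; do
        out="$out${out:+ }$i"
        i=$((i + 1))
    done
    echo "$out"
}

levels_shown 2   # prints: 1 2
levels_shown     # prints: 1 2 3
```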
CODE EXAMPLE 11-1 shows a sample event log display. Note that the first event is the oldest and that each event has a date-stamp indicating the days, hours and minutes since the system was last booted.
You can configure the LOMlite2 device to restart the server automatically after a lockup. The LOMlite2 device has a watchdog process which, by default, expects to be patted every 10000 milliseconds (every 10 seconds). If it does not receive a pat within 40000 milliseconds (40 seconds, the default timeout), the LOMlite2 device turns on the front and back Fault LEDs and generates a LOM event report. However, it does not automatically restart the system unless you have configured it to do so.
Remove the hash (#) from the following line in the script file /etc/rc2.d/S25lom to enable the LOMlite2 watchdog process:
When you have done this, the LOMlite2 device will restart the server whenever the watchdog times out.
You can turn the option on and off from the UNIX # prompt. For more information, see Setting the Hardware Reset Option From a Script or Command (lom -R on).
However, as long as you have the -R on option set in /etc/rc2.d/S25lom, the Hardware Reset option will always be enabled when you start the system.
Note - You do not normally need to do this. If you want to configure the LOMlite2 device to perform an automatic server restart after a lockup, see Configuring the LOMlite2 to Restart the Server Automatically After a Lockup. Only use the lom -W on option on the command line or in another script file if for some reason you have removed the /etc/rc2.d/S25lom script.
The LOMlite2 watchdog process is disabled by default. To enable the watchdog process, type:
The number 40000 on this command line indicates the watchdog's timeout period in milliseconds; you can specify a different number. The number 10000 indicates its pat interval in milliseconds; again, you can specify a different number.
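If you choose different numbers, it can help to check how they relate before using them. The sketch below only does the arithmetic and does not call lom. The function name watchdog_summary is hypothetical, and the check that the timeout must be longer than the pat interval is a sanity assumption of this sketch, not a documented requirement. A POSIX-style shell is assumed.

```shell
#!/bin/sh
# watchdog_summary TIMEOUT_MS PAT_MS: describe a timeout / pat-interval pair.
# Assumption of this sketch: the timeout should exceed a positive pat interval.
watchdog_summary() {
    timeout_ms=$1
    pat_ms=$2
    if [ "$pat_ms" -le 0 ] || [ "$timeout_ms" -le "$pat_ms" ]; then
        echo "timeout must be longer than a positive pat interval" >&2
        return 1
    fi
    echo "timeout $((timeout_ms / 1000)) s, pat every $((pat_ms / 1000)) s, $((timeout_ms / pat_ms)) pat intervals per timeout"
}

# Default values quoted in this chapter.
watchdog_summary 40000 10000
```

With the default values, the watchdog would have to miss four consecutive pat intervals before it times out.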
If the watchdog process times out (in other words, if it does not receive its expected pat), the LOMlite2 device will turn on the server's front and back Fault LEDs and generate a LOM event report. However, it will not automatically reset the system. To make it reset the system, you must use the -R option.
If you have no LOMlite2 watchdog process running already and you want the process to run, type the following, or add it to another script file:
If you want the LOMlite2 device to perform an automatic server restart after a lockup, you must include the -R on option in the command, as follows:
To force the LOMlite2 watchdog to trigger an automatic server restart (ASR) after a lockup, add the -R on option to the command in the /etc/rc2.d/S25lom script file. This is the script that runs the watchdog. For full instructions about how to do this, see Configuring the LOMlite2 Watchdog to Restart the System After a Lockup.
However, if for any reason you are not using the script file provided with your system (/etc/rc2.d/S25lom) but have instead enabled the watchdog from the command line or from another script file, you can turn the Hardware reset option on by typing the following at the command line:
# lom -R on
To turn the Hardware reset option off from the command line, type:
# lom -R off
This section explains how to turn the alarms and Fault LEDs on and off by using the lom command. It also explains how to:
There are three alarms associated with the LOMlite2 device. They are associated with no specific conditions but are software flags available to be set by your own processes or from the command line.
To turn an alarm on from the command line, type:
where n is the number of the alarm you want to set: 1, 2, or 3.
To turn the alarm off again, type:
where n is the number of the alarm you want to turn off: 1, 2, or 3.
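Because the alarms are intended to be set by your own processes, a script that drives them may want to validate its arguments first. The following is a sketch under stated assumptions: the function names valid_alarm and alarm_request are hypothetical, a POSIX-style shell is assumed, and the sketch deliberately prints the request instead of running lom. Substitute the lom -A command line from this section where the final echo is.

```shell
#!/bin/sh
# valid_alarm N: succeed only if N names one of the three LOMlite2 alarms.
valid_alarm() {
    case $1 in
        1|2|3) return 0 ;;
        *) return 1 ;;
    esac
}

# alarm_request STATE N: check the arguments and print the request.
# Dry run only: this sketch does not invoke lom.
alarm_request() {
    state=$1
    n=$2
    case $state in
        on|off) ;;
        *) echo "state must be on or off" >&2; return 1 ;;
    esac
    if ! valid_alarm "$n"; then
        echo "alarm number must be 1, 2, or 3" >&2
        return 1
    fi
    echo "alarm $n -> $state"
}

alarm_request on 2
```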
To turn the Fault LED on, type:
To turn the Fault LED off again, type:
The character sequence #. (hash, dot) enables you to escape from Solaris to the lom> prompt.
To change the first character of this default lom escape sequence, type:
where x is the alphanumeric character you want to use instead of #.
LOMlite2 event reports can interfere with information you are attempting to send or receive on the LOM console port.
To stop the LOMlite2 device from sending reports to the LOM console port, type:
# lom -E off
By default, the LOM console port is shared by the console and the LOMlite2 device. The LOMlite2 interrupts the console whenever it needs to send an event report. To prevent the LOMlite2 from interrupting the console on Serial A/LOM, turn serial event reporting off.
To turn serial event reporting on again, type:
# lom -E on
If you want to dedicate the LOM console port to the LOMlite2 device and to use the Serial B port as your console port, see Separating LOMlite2 From the Console on the LOM Console Port.
By default, the LOMlite2 driver cannot be unloaded. This is because the driver is required by the watchdog process and event reporting. If you unload the driver and you have configured the system to restart when the watchdog times out, the watchdog will time out, causing a system reset. For information about configuring the system to restart automatically after a lockup, see Configuring the LOMlite2 to Restart the Server Automatically After a Lockup.
To remove driver protection from the LOMlite2 driver so that you can unload the driver:
1. Turn the watchdog process off by typing:
2. Unload the driver by typing:
If you have scripts written to the LOMlite interface on the Netra t1 Model 100/105 server or the Netra t 1400/1405 server and you want to use these scripts on the Netra 20 server, you can add file system links that make this possible. To do so, simply type:
When you have done this, you will be able to use the old scripts on the new system.
To upgrade the firmware on the LOMlite2 device, obtain the new firmware package from SunSolve(SM) or from your local Sun sales representative, and type the following:
where filename is the name of the file containing the new firmware.
Note - LOMlite2 firmware upgrades will be released as patches and will include detailed installation instructions.
Copyright © 2003, Sun Microsystems, Inc. All rights reserved.