CHAPTER 3
Advanced Lights Out Manager
This chapter gives an overview of the Sun Advanced Lights Out Manager (ALOM) software. The chapter covers the following topics:
The Netra 240 server ships with the Sun Advanced Lights Out Manager installed. By default, the system console is directed to ALOM and is configured to show server console information on startup.
ALOM enables you to monitor and control your server over either a serial connection (using the SERIAL MGT port) or an Ethernet connection (using the NET MGT port). For information on configuring an Ethernet connection, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (817-3174).
Note - The ALOM serial port, labeled SERIAL MGT, is for server management only. If you need a general-purpose serial port, use the serial port labeled 10101.
You can configure ALOM to send email notification of hardware failures and other events related to the server or to ALOM.
The ALOM circuitry uses standby power from the server, with the following results:
TABLE 3-1 lists the components that are monitored by ALOM and the information that the software provides for each component.
Presence, temperature, and any thermal warning or failure conditions
Ambient temperature and any thermal warning or failure conditions
The default management port is labeled SERIAL MGT. This port uses an RJ-45 connector and is for server management only; it supports only ASCII connections to an external console. Use this port the first time you operate the server.
Another serial port, labeled 10101, is available for general-purpose serial data transfer. This port uses a DB-9 connector. For information about pinouts, refer to the Netra 240 Server Installation Guide (part number 817-2698).
In addition, the server has one 10BASE-T Ethernet management domain interface, labeled NET MGT. You must configure ALOM before you can use this port. For information, see the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174).
When you switch to the ALOM software after initial power-on, you see the sc> prompt. At this point, you can execute commands that require no user permissions. (Refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server, part number 817-3174, for a list of commands.) When you attempt to execute any command that requires user permissions, you are prompted to set a password for user admin.
If you are prompted to do so, set a password for the admin user.
The password must contain the following:
Once the password is set, the admin user has full permissions and can execute all ALOM CLI commands. The user is prompted to log in with the admin password when subsequently switching to ALOM.
This section covers some basic ALOM functions. For comprehensive documentation, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174) and the Netra 240 Server Release Notes (817-3142).
To Switch to the ALOM Prompt
At a command prompt, type the #. (pound sign followed by a period) keystroke sequence.
Note - When you switch to the ALOM prompt, you are logged in with the userid admin. See Setting the admin Password.
To Switch to the Server Console Prompt
More than one ALOM user can be connected to the server console at a time, but only one user is permitted to type input characters to the console.
If another user is logged in and has write capability, you see the following message after typing the console command:
To Take Console Write Capability Away From Another User
Note - Automatic System Recovery (ASR) is not the same as Automatic Server Restart, which the Netra 240 server also supports.
Automatic Server Restart is a component of ALOM. It monitors the Solaris OS while the OS is running and, by default, syncs the file systems and restarts the server if the OS fails.
ALOM uses a watchdog process to monitor the kernel only. ALOM does not restart the server if a process hangs and the kernel is still running. The ALOM watchdog parameters for the watchdog patting interval and the watchdog timeout are not user-configurable.
If the kernel hangs and the watchdog times out, ALOM reports and logs the event and performs one of three user-configurable actions:
For more information, see the sys_autorestart section of the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174).
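The watchdog behavior described above can be illustrated with a small model. This is a minimal sketch and not ALOM's implementation: the class name, the function names, and the interval and timeout values are all hypothetical, and the real parameters are fixed inside ALOM. The model shows why a timeout occurs only when the kernel itself stops patting the watchdog.

```python
class Watchdog:
    """Toy model of a watchdog timer (illustrative only, not ALOM code)."""

    def __init__(self, timeout_s):
        # timeout_s is a hypothetical value; ALOM's real timeout is fixed.
        self.timeout_s = timeout_s
        self.last_pat = 0.0

    def pat(self, now):
        """Record that the kernel reset ("patted") the watchdog at time `now`."""
        self.last_pat = now

    def expired(self, now):
        """Return True if no pat has arrived within the timeout window."""
        return (now - self.last_pat) > self.timeout_s


def simulate(pat_times, check_time, timeout_s=60.0):
    """Replay a list of pat timestamps, then report expiry at check_time."""
    wd = Watchdog(timeout_s)
    for t in pat_times:
        wd.pat(t)
    return wd.expired(check_time)


# Healthy kernel: pats every 10 s up to t=100, checked at t=105.
healthy = simulate([10.0 * i for i in range(11)], check_time=105.0)

# Hung kernel: pats stop at t=30, checked at t=105.
hung = simulate([0.0, 10.0, 20.0, 30.0], check_time=105.0)

print("healthy kernel expired:", healthy)
print("hung kernel expired:", hung)
```

In this model, a hung user process never affects the result, because only the kernel's pats are tracked. That matches the statement that ALOM monitors the kernel only.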
For instructions on using Automatic System Recovery (ASR), see Chapter 1.
The Netra 240 server features an environmental monitoring subsystem designed to protect the server and its components against the following:
Monitoring and control capabilities are handled by the ALOM firmware, which ensures that monitoring capabilities remain operational even if the system has halted or is unable to boot. Also, monitoring the system from the ALOM firmware frees the system to dedicate CPU and memory resources to the operating system and application software.
The environmental monitoring subsystem uses an industry-standard I2C bus. The I2C bus is a simple two-wire serial bus used throughout the system to enable the monitoring and control of temperature sensors, fans, power supplies, status LEDs, and the front panel system control rotary switch.
The server contains three temperature sensors that monitor the ambient temperature of the server and the die temperature of the two CPUs. The monitoring subsystem polls each sensor and uses the sampled temperatures to report and respond to any overtemperature or undertemperature conditions. Additional I2C devices detect component presence and component faults.
The hardware and software together ensure that the temperatures within the enclosure do not exceed predetermined "safe operation" ranges. If the temperature observed by a sensor falls below a low-temperature warning threshold or rises above a high-temperature warning threshold, the monitoring subsystem software lights the system Service Required LEDs on the front and back panels. If the temperature condition persists and reaches a high or low soft shutdown temperature threshold, the system initiates a graceful system shutdown. If the temperature reaches a high or low hard temperature threshold, the system initiates a forced system shutdown.
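The three-tier response can be summarized as a simple classification of each temperature sample. The following is a minimal sketch under stated assumptions: the threshold values, the Thresholds type, and the classify function are hypothetical and are not the actual Netra 240 values or firmware logic. It also ignores the "condition persists" timing and looks at a single sample only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical thresholds in degrees Celsius (not actual Netra 240 values)."""
    low_hard: float
    low_soft: float
    low_warn: float
    high_warn: float
    high_soft: float
    high_hard: float


def classify(temp_c, th):
    """Map one temperature sample to the response described in the text."""
    # Hard thresholds are checked first because they are the most severe.
    if temp_c <= th.low_hard or temp_c >= th.high_hard:
        return "forced shutdown"
    if temp_c <= th.low_soft or temp_c >= th.high_soft:
        return "graceful shutdown"
    # A warning is raised only when the sample is outside the warning band.
    if temp_c < th.low_warn or temp_c > th.high_warn:
        return "warning: light Service Required LEDs"
    return "ok"


# Example values chosen only for illustration.
example = Thresholds(low_hard=-10, low_soft=-5, low_warn=0,
                     high_warn=50, high_soft=55, high_hard=60)

for sample in (25, 52, 55, 61, -1, -5):
    print(sample, "->", classify(sample, example))
```

Checking the most severe condition first keeps the function short and guarantees that a sample beyond a hard threshold is never reported as a mere warning.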
Error and warning messages are sent to the system console and are logged in the /var/adm/messages file. The Service Required LEDs remain lit after an automatic system shutdown to aid in problem diagnosis.
The types of messages that are sent to the system console and are logged in the /var/adm/messages file depend on how you set the sc_clieventlevel and sys_eventlevel ALOM user variables. For information about setting these variables, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (817-3174).
The monitoring subsystem is also designed to detect failures of the four system blowers. If any blower fails, the monitoring subsystem detects the failure and generates an error message to the system console, logs the message in the /var/adm/messages file, and lights the Service Required LEDs.
The power subsystem is monitored in a similar manner. The monitoring subsystem periodically polls the power supply status and reports the status of each supply's outputs, inputs, and presence.
If a power supply problem is detected, an error message is sent to the system console and is logged in the /var/adm/messages file. Additionally, LEDs located on each power supply light to indicate failures. The system Service Required LED lights to indicate a system fault. The ALOM console alerts record power supply failures.
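Polling of this kind amounts to comparing each new status snapshot with the previous one and reporting what changed. The sketch below illustrates that idea only. The supply names (PS0, PS1), the field names, and the diff_status function are hypothetical, and the code does not read any real hardware or ALOM interface.

```python
def diff_status(previous, current):
    """Return (supply, field, old, new) tuples for every field that changed."""
    changes = []
    for supply in sorted(current):
        old_fields = previous.get(supply, {})
        for field, new_value in sorted(current[supply].items()):
            old_value = old_fields.get(field)
            if old_value != new_value:
                changes.append((supply, field, old_value, new_value))
    return changes


# Two hypothetical polling snapshots: PS1 loses its output between polls.
first_poll = {
    "PS0": {"present": True, "input_ok": True, "output_ok": True},
    "PS1": {"present": True, "input_ok": True, "output_ok": True},
}
second_poll = {
    "PS0": {"present": True, "input_ok": True, "output_ok": True},
    "PS1": {"present": True, "input_ok": True, "output_ok": False},
}

for supply, field, old, new in diff_status(first_poll, second_poll):
    print(f"{supply}: {field} changed from {old} to {new}")
```

A monitoring loop built on this idea would raise a console message and log entry for each reported change, in the way the text describes for power supply faults.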
Use the showenvironment ALOM command to view the warning thresholds of the power subsystem and the fan speeds. For instructions on using this command, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174).
Copyright © 2004, Sun Microsystems, Inc. All rights reserved.