CHAPTER 1

Diagnostic Tools Overview

The Netra 440 server and its accompanying software and firmware contain many diagnostic tools and features that can help you:

This chapter introduces the diagnostic tools you can use on the server.

If you want comprehensive background information about diagnostic tools, read this chapter and then read Chapter 2 to find out how the tools fit together.

If you only want instructions for using diagnostic tools, skip the first two chapters and turn to:

You may also find it helpful to turn to the Netra 440 Server System Administration Guide for information about the system console.


A Spectrum of Tools

Sun provides a wide spectrum of diagnostic tools for use with the Netra 440 server. These tools range from the SunVTS™ software, a comprehensive validation test suite, to log files that may contain clues helpful in narrowing down the possible sources of a problem.

The diagnostic tool spectrum also ranges from standalone software packages, to firmware-based power-on self-test (POST), to hardware LEDs that tell you when the power supplies are operating.

Some diagnostic tools enable you to examine many systems from a single console; others do not. Some diagnostic tools stress the system by running tests in parallel, while other tools run sequential tests, enabling the system to continue its normal functions. Some diagnostic tools function on standby power or when the system is offline, while others require the operating system to be up and running.

TABLE 1-1 summarizes the full palette of tools. Most of these tools are discussed in depth in this manual; some are discussed in greater detail in the Netra 440 Server System Administration Guide (817-3884-xx). Some tools also have their own comprehensive documentation sets. See the Preface for more information.

TABLE 1-1 Summary of Diagnostic Tools

| Diagnostic Tool | Type | What It Does | Accessibility and Availability | Remote Capability |
|---|---|---|---|---|
| Advanced Lights Out Manager (ALOM) | Hardware, software, and firmware | Monitors environmental conditions, generates alerts, performs basic fault isolation, and provides remote console access. | Can function on standby power and when the operating system is not running. | Designed for remote access |
| LEDs | Hardware | Indicate status of overall system and particular components. | Accessed from system chassis. Available anytime system power is available. | Local, but can be accessed through ALOM |
| POST | Firmware | Tests core components of system: CPUs, memory, and motherboard I/O bridge integrated circuits. | Can be run on startup, but default is no POST. Available when the operating system is not running. | Local, but can be accessed through ALOM |
| OpenBoot Diagnostics | Firmware | Tests system components, focusing on peripherals and I/O devices. | Can be run automatically at startup, but the default is no diagnostics. Can also be run interactively. Available when the operating system is not running. | Local, but can be accessed through ALOM |
| OpenBoot commands | Firmware | Display various kinds of system information. | Available when the operating system is not running. | Local, but can be accessed through ALOM |
| Solaris commands | Software | Display various kinds of system information. | Requires operating system. | Local, and over network |
| SunVTS | Software | Exercises and stresses the system, running tests in parallel. | Requires operating system. You may need to install SunVTS software separately. | View and control over network |
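The availability column of TABLE 1-1 is often the first thing to check when deciding which tool to reach for. As an illustration only, the following Python sketch encodes that column as data and lists the tools you could use in a given system state. The names `Availability`, `Tool`, `TOOLS`, and `usable_tools` are hypothetical and are not part of any Sun software. Treating ALOM and the LEDs as usable in every state is a simplifying assumption drawn from the table's wording, and the sketch assumes the system has at least standby power.

```python
from dataclasses import dataclass
from enum import Enum


class Availability(Enum):
    """Simplified reading of the 'Accessibility and Availability' column."""
    ANYTIME = "anytime"   # assumed usable whether or not the OS is running
    OS_DOWN = "os_down"   # firmware level: operating system not running
    OS_UP = "os_up"       # requires the operating system


@dataclass(frozen=True)
class Tool:
    name: str
    kind: str
    availability: Availability


# Rows transcribed from TABLE 1-1; the Availability values are this
# sketch's own simplification of the table text.
TOOLS = [
    Tool("ALOM", "Hardware, software, and firmware", Availability.ANYTIME),
    Tool("LEDs", "Hardware", Availability.ANYTIME),
    Tool("POST", "Firmware", Availability.OS_DOWN),
    Tool("OpenBoot Diagnostics", "Firmware", Availability.OS_DOWN),
    Tool("OpenBoot commands", "Firmware", Availability.OS_DOWN),
    Tool("Solaris commands", "Software", Availability.OS_UP),
    Tool("SunVTS", "Software", Availability.OS_UP),
]


def usable_tools(os_running: bool) -> list[str]:
    """Return the names of tools usable in the given operating system state."""
    wanted = Availability.OS_UP if os_running else Availability.OS_DOWN
    return [
        tool.name
        for tool in TOOLS
        if tool.availability in (Availability.ANYTIME, wanted)
    ]


if __name__ == "__main__":
    print("OS not running:", usable_tools(os_running=False))
    print("OS running:    ", usable_tools(os_running=True))
```

With the operating system down, the sketch lists ALOM, the LEDs, and the three firmware-based tools. With the operating system running, it lists ALOM, the LEDs, Solaris commands, and SunVTS.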


Why are there so many different diagnostic tools?

There are a number of reasons for the lack of a single all-in-one diagnostic test, starting with the complexity of the server.

Consider the bus repeater circuit built into every Netra 440 server. This circuit interconnects all CPUs and high-speed I/O interfaces (see FIGURE 1-1), sensing and adapting its communications depending on how many CPU modules are present. This sophisticated high-speed interconnect represents just one facet of the Netra 440 server's advanced architecture.

FIGURE 1-1 Simplified Schematic View of a Netra 440 Server

This figure is a block diagram showing the major subsystems and buses of a Netra 440 server, focusing mostly on motherboard components.

Consider also that some diagnostics must function even when the system fails to boot. Any diagnostic capable of isolating problems when the system fails to boot must be independent of the operating system. But any diagnostic that is independent of the operating system will also be unable to make use of the operating system's considerable resources for getting at the more complex causes of failures.

Another complicating factor is that different sites have differing diagnostic requirements. You may be administering a single computer or a whole data center full of equipment in racks. Alternatively, your systems may be deployed remotely, perhaps in areas that are physically inaccessible.

Finally, consider the different tasks you expect to perform with your diagnostic tools:

Not every diagnostic tool can be optimized for all these varied tasks.

Instead of one unified diagnostic tool, Sun provides a palette of tools, each of which has its own specific strengths and applications. To best appreciate how each tool fits into the larger picture, it is necessary to have some understanding of what happens when the server starts up, during the so-called boot process. This is discussed in the next chapter.