CHAPTER 5

DR Domain Procedures

This chapter describes how you use the cfgadm(1M) command on the domain to perform DR operations. It also describes attachment points and procedures for displaying the status of system boards.


Attachment Points

Before you use the cfgadm(1M) command, make sure you understand the syntax for attachment points on the Sun Fire high-end system platform. There are physical and logical attachment points. In addition, single attachment points are used for board slots, and dynamic attachment points are used for components. Attachment points created by the DR driver have a physical and logical path.

Physical attachment points for system boards take the following form:

/devices/pseudo/dr@0:SBx (for CPU/memory boards)
-OR-
/devices/pseudo/dr@0:IOx (for I/O boards)

where x represents the expander board number (for example, 0 through 17 on a Sun Fire 15K system, and 0 through 8 on a Sun Fire 12K system).

Logical attachment points for system boards take the following form:

SBx (for CPU/memory boards)
-OR-
IOx (for I/O boards)

where x represents the board number (for example, 0 through 17 on a Sun Fire 15K system, and 0 through 8 on a Sun Fire 12K system).
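
Because the physical and logical forms differ only in their prefix, a logical board name can be mapped to its physical path mechanically. The following is a minimal sketch, not part of the DR software: the function name ap_physical_path is hypothetical, and the 0 through 17 range assumes a Sun Fire 15K system.

```shell
# Hypothetical helper: map a logical board attachment point (SBx or IOx)
# to the physical form /devices/pseudo/dr@0:SBx or /devices/pseudo/dr@0:IOx.
# Assumes board numbers 0 through 17 (Sun Fire 15K).
ap_physical_path() {
    case "$1" in
        SB[0-9]|SB1[0-7]|IO[0-9]|IO1[0-7])
            printf '/devices/pseudo/dr@0:%s\n' "$1"
            ;;
        *)
            # Dynamic attachment points (for example SB0::cpu0) are rejected.
            echo "not a board attachment point: $1" >&2
            return 1
            ;;
    esac
}
```

For example, ap_physical_path SB3 prints /devices/pseudo/dr@0:SB3.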

Dynamic attachment points refer to components (CPUs and memory) on system boards and I/O devices on I/O boards. The attachment points are created by the DR driver. Refer to the dr(7D) man page for more details.


Displaying Board Status

The cfgadm(1M) command displays information about boards and slots. Refer to the cfgadm_sbd(1M) man page for options to this command.

Basic Status Display

Many operations require that you specify system board names. To obtain these names, type:

# cfgadm -a -s "select=class(sbd)"

The cfgadm(1M) command displays information only about boards that are assigned to the domain, and about boards that appear in the available component list for the domain and are not assigned to any other domain.

The following output is typical:

Ap_Id                Type       Receptacle       Occupant        Condition
SB0                  CPU        connected        configured      ok
SB0::cpu0            cpu        connected        configured      ok
SB0::memory          memory     connected        configured      ok
IO1                  PCI        connected        configured      ok
IO1::pci0            io         disconnected     unconfigured    failed
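
When a domain has many attachment points, it can help to filter the listing for anything that needs attention. The following sketch assumes the five-column layout shown in the typical output; the function name ap_not_ok is hypothetical.

```shell
# Hypothetical filter: read `cfgadm -a -s "select=class(sbd)"` output on
# stdin and print the Ap_Id and Condition of every attachment point whose
# Condition column is not "ok". Assumes the five-column layout, with a
# header line whose first field is "Ap_Id".
ap_not_ok() {
    awk '$1 != "Ap_Id" && NF >= 5 && $5 != "ok" { print $1, $5 }'
}

# Possible usage on a domain:
#   cfgadm -a -s "select=class(sbd)" | ap_not_ok
```

Applied to the output shown above, the filter would print only the IO1::pci0 entry with its failed condition.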

Detailed Status Display

For a more detailed status report, use the cfgadm(1M) command with its -v (verbose) option, which turns on expanded descriptions. In addition to basic information such as the attachment point ID, receptacle and occupant states, and board status, the expanded status report also includes the date when the board was configured into the domain, the type of board, the activity state, and the physical attachment point.


Removing a Board

This section describes how to remove a CPU/memory board and an I/O board.


To Remove a CPU/Memory Board

To perform the following steps, you must have domain administrator privileges.

1. Log in to the domain.

2. Use the cfgadm(1M) command with the -l option to determine the attachment point for the board.

3. Stop all activity on the board.

Using the appropriate Solaris commands, halt all accesses to the board by other CPU/memory boards and prevent any further use of the board until it is replaced.

4. Verify that the board does not have bound processes running.

If a process is bound to a CPU, the board cannot be removed until the process is unbound. Refer to the pbind(1M) man page for more information.

5. Unconfigure and disconnect the board using the following single command:

# cfgadm -v -c disconnect SBx

where x represents the board number (for example, 0 through 17 on a Sun Fire 15K system, or 0 through 8 on a Sun Fire 12K system).



Note - Do not remove a board until it is disconnected. Otherwise the board will be damaged.




To Remove an I/O Board

To remove an I/O board, you must first stop all usage of the board. To complete the steps in this procedure, you must have domain administrator privileges.

1. Log in to the domain.

2. Check the status of the board.

# cfgadm -a -s "select=class(sbd)"

3. If the system is using multipathing software:

a. Switch all board functions to the alternate board.

b. Remove any multipathing databases and/or private regions.

c. Wait until all of the alternate paths are functioning before proceeding.

4. Unmount file systems, including metadevices that have a board-resident partition (for example: umount /partition).



Caution - Unmounting file systems may affect NFS client systems.



5. If the board contains Sun RSM Array™ 2000 controllers, take the controllers offline by using the rm6 or rdacutil commands.

6. Remove disk partitions from the swap configuration.

7. If any process directly opens a device or raw partition, either kill the process or direct it to close the open device on the board.

8. If a detach-unsafe device is present on the board, close all instances of the device and use modunload(1M) to unload the driver.

9. Disconnect the board.

# cfgadm -v -c disconnect IOx

where x represents the board number (for example, 0 through 17 on a Sun Fire 15K system, or 0 through 8 on a Sun Fire 12K system).



Note - If the cfgadm(1M) command fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you will need to reboot the domain to use the board.




Adding a Board

Before installing a board, consider the following points:



Note - Whenever you use DR to add a COD board into a domain, make sure that enough COD right-to-use (RTU) licenses are available to the target domain to enable each active CPU on the COD board. If there are not enough RTU licenses available to a target domain when you add a COD board to the domain, DR displays an error message for each CPU that cannot be enabled in the domain. For more information about the COD option, see the System Management Services (SMS) Administrator Guide.




To Install a Board

To perform a board installation from the domain, the board must already be assigned to the domain, or must be in the available component list. Refer to the System Management Services (SMS) Administrator Guide for information on how to assign boards or to update the available component list.

1. Verify that the selected board slot can accept a board.

# cfgadm -a -s "select=class(sbd)"

The states and conditions should be:

-OR-

2. Connect and configure the board using a single command.

# cfgadm -v -c configure SBx (CPU/memory board)
-OR-
# cfgadm -v -c configure IOx (I/O board)

where x represents the board number (for example, 0 through 17 on a Sun Fire 15K system, or 0 through 8 on a Sun Fire 12K system).

After a short delay during which the system tests the board, a message appears in the domain console log indicating that the components have been configured. The states and conditions for a connected and configured attachment point should be:

Now the system is aware of the usable devices on the board and the devices can be used.



Note - If the cfgadm(1M) command fails during a DR operation, the board does not return to its original state. A dxs or dca error message is logged to the domain. If the error is recoverable, you can retry the command. If the error is unrecoverable, you will need to reboot the domain to use the board.
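
To confirm the result of a configure operation from a script, the status listing can be reduced to the three state columns for a single board. The following is a sketch only: board_state is a hypothetical name, and the column positions assume the basic status display layout (Ap_Id, Type, Receptacle, Occupant, Condition).

```shell
# Hypothetical helper: read `cfgadm -a -s "select=class(sbd)"` output on
# stdin and print "receptacle occupant condition" for the attachment point
# named in $1. Returns non-zero if the attachment point is not listed.
board_state() {
    awk -v ap="$1" '
        $1 == ap { print $3, $4, $5; found = 1 }
        END      { exit !found }
    '
}

# Possible usage on a domain:
#   state=$(cfgadm -a -s "select=class(sbd)" | board_state SB2)
#   [ "$state" = "connected configured ok" ] && echo "SB2 is ready"
```

Matching on the exact Ap_Id means that component entries such as SB2::cpu0 do not affect the result for SB2.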




DR Using cfgadm(1M) - Examples

Displaying Help

To display help text for commands, use the -h option. If the -h option is followed by an attachment point identifier, help messages (syntax) related to the hardware-specific library of the attachment point are shown.

TABLE 5-1 Commands that Display Help

Command              Description
# cfgadm -h          Shows general syntax.
# cfgadm -h pci      Shows PCI hotplug-specific commands.
# cfgadm -h SB2      Shows help related to system board-specific commands and options.


Displaying Verbose Messages

The -v option displays detailed messages as DR operations proceed. For example:

To configure the memory on system board 2 (SB2) with the verbose option, use the following command:

# cfgadm -v -c configure SB2::memory

To unconfigure CPU 3 (cpu3) on system board 0 (SB0) with the verbose option, use the following command:

# cfgadm -v -c unconfigure SB0::cpu3

Suppressing User Confirmation

Certain cfgadm operations, such as unconfiguring permanent memory, prompt the user to confirm the operation with a yes or no answer. For example, the following command unconfigures the memory on system board 6 (SB6), which holds permanent system memory, and prompts the user for confirmation:

# cfgadm -c unconfigure SB6::memory
System may be temporarily suspended, proceed (yes/no)?

You can suppress the confirmation prompt by using the -y or -n option on the command line. The -y option automatically responds with "yes" and the -n option responds with "no." The following example performs exactly the same operation as the previous command, but uses the -y option to bypass user confirmation:

# cfgadm -y -c unconfigure SB6::memory
#
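
In unattended scripts it can be useful to make the choice of -y explicit and to preview a command before running it. The following wrapper is a sketch under stated assumptions: the function name dr_unconfigure and the environment variables DR_ASSUME_YES and DR_DRY_RUN are hypothetical and are not part of cfgadm(1M).

```shell
# Hypothetical wrapper around `cfgadm -c unconfigure`.
#   DR_ASSUME_YES=1  adds -y, so confirmation prompts are answered "yes"
#   DR_DRY_RUN=1     prints the command line instead of running it
dr_unconfigure() {
    ap=$1
    set -- cfgadm
    if [ "${DR_ASSUME_YES:-0}" = 1 ]; then
        set -- "$@" -y
    fi
    set -- "$@" -c unconfigure "$ap"
    if [ "${DR_DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With both variables set to 1, dr_unconfigure SB6::memory prints cfgadm -y -c unconfigure SB6::memory without touching the board.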

Power Control When Disconnecting Boards

To disconnect system board 6 (SB6), which holds the domain's permanent memory, unassign it from the domain, and leave it powered on, while answering confirmation prompts with "yes" and displaying verbose messages, use the following command:

# cfgadm -y -v -c disconnect -o unassign,nopoweroff SB6

To disconnect I/O board 12 (IO12), but leave it powered-off and assigned to the same domain, use:

# cfgadm -c disconnect IO12

Power Control of Disconnected Boards

To power-on system board 2 (SB2), use the following command:

# cfgadm -x poweron SB2

To power-off system board 2 (SB2), use the following command:

# cfgadm -x poweroff SB2

Connecting and Configuring Boards

When DR configures a board into a domain, it first connects the board electrically to the system, putting it into the connected state. DR then configures the system board so that it is fully available to all applications running in the domain, putting it into the configured state.

When DR removes a board from a domain, it first unconfigures the system board so that it is no longer available to any application running in the domain, putting it into the unconfigured state. DR then disconnects the board electrically from the system, putting it into the disconnected state.
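
The ordering described above can be modeled as a small state ladder. The sketch below is illustrative only and does not call cfgadm: it collapses the receptacle and occupant states into three levels (disconnected, connected but unconfigured, configured), and the function names are hypothetical.

```shell
# Rank the three simplified board states used in this sketch.
dr_state_rank() {
    case "$1" in
        disconnected) echo 0 ;;
        connected)    echo 1 ;;   # connected but not yet configured
        configured)   echo 2 ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# Print, one per line, the transitions needed to move a board from
# state $1 to state $2, in the order DR performs them.
dr_steps() {
    from=$(dr_state_rank "$1") || return 1
    to=$(dr_state_rank "$2") || return 1
    while [ "$from" -lt "$to" ]; do
        case "$from" in
            0) echo connect ;;
            1) echo configure ;;
        esac
        from=$((from + 1))
    done
    while [ "$from" -gt "$to" ]; do
        case "$from" in
            2) echo unconfigure ;;
            1) echo disconnect ;;
        esac
        from=$((from - 1))
    done
}
```

For example, dr_steps disconnected configured prints connect and then configure, while dr_steps configured disconnected prints unconfigure and then disconnect.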

Hot Plugging PCI Adapter Cards

Each hotplug slot on an I/O board can be individually connected, configured, unconfigured, and disconnected. Each attachment point for a hotplug slot, which identifies both the slot and the adapter card that is plugged into the slot, is created when the I/O board is configured into the domain.

To connect, but not configure, an adapter at slot 1 of I/O board 1 into a domain, use a command such as the following:

# cfgadm -c connect pcisch0:e01b1slot1

To configure the adapter at slot 1 of I/O board 1 into the domain, use a command such as the following:

# cfgadm -c configure pcisch0:e01b1slot1

To disconnect an adapter at slot 1 of I/O board 1 before unplugging the adapter, use a command such as the following:

# cfgadm -c disconnect pcisch0:e01b1slot1

To unconfigure the adapter at slot 1 of I/O board 1 out of the domain, use a command such as the following:

# cfgadm -c unconfigure pcisch0:e01b1slot1

For more information, see cfgadm_pci(1M).

Testing a Board

The -t option causes a board to be tested. Prior to running the following command, system board 2 (SB2) must be disconnected, assigned and powered-on. The following command includes the verbose option:

# cfgadm -vt SB2

The board is tested using the diagnostic level specified for the domain in the .postrc file; the default is 16.

Displaying Attachment Point Information

This section includes several examples of commands that you can use to display system information about attachment points. See the cfgadm(1M) man page for additional information.

To list the state, status, and condition of all attachment points with the verbose option, use the following command:

# cfgadm -val

To list the state and condition of an adapter at slot 1 of I/O board 3, use the following command:

# cfgadm -al pcisch13:e03b1slot1

The following command displays, in columnar format, the logical name of each attachment point along with its condition, its status time in both calendar and parsable formats, and other information:

# cfgadm -s "cols=ap_id:condition:status_time:status_time_p:info"

The following command displays in columnar format the logical name and physical ID of each attachment point:

# cfgadm -s "cols=ap_id:physid"

The following command displays, in columnar format, the logical name of each attachment point along with its receptacle state, occupant state, occupant type, busy status, and class:

# cfgadm -s "cols=ap_id:r_state:o_state:type:busy:class"
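
A cols= string is easy to mistype, for example by leaving a space after a colon. The following sketch builds the string from a list of field names and rejects anything that is not a field name documented in cfgadm(1M). The function name cfgadm_cols is hypothetical.

```shell
# Hypothetical helper: join field names into a "cols=..." listing option.
# Accepts only the field names documented in cfgadm(1M).
cfgadm_cols() {
    spec=
    for field in "$@"; do
        case "$field" in
            ap_id|physid|r_state|o_state|condition|type|busy|status_time|status_time_p|class|info)
                ;;
            *)
                echo "unknown field: $field" >&2
                return 1
                ;;
        esac
        # Append with a ":" separator, except before the first field.
        spec="${spec:+$spec:}$field"
    done
    printf 'cols=%s\n' "$spec"
}

# Possible usage on a domain:
#   cfgadm -s "$(cfgadm_cols ap_id r_state o_state type busy class)"
```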

Tracking Memory Unconfigure Operations

When you unconfigure a system board that contains the domain's permanent memory, you can use the following command to track the memory delete process:

# cfgadm -a -s "select=type(memory),cols=ap_id:o_state:info"
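
Because a memory delete can take some time, the listing is typically repeated until the operation completes. The loop below is one possible sketch: the function name watch_memory, its count and interval arguments, and the CFGADM override variable are all hypothetical conveniences, not features of cfgadm(1M).

```shell
# Hypothetical polling loop: run the memory listing $1 times (default 5),
# sleeping $2 seconds (default 10) between runs. Set CFGADM to substitute
# another command for cfgadm, for example when testing off-domain.
watch_memory() {
    count=${1:-5}
    interval=${2:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        ${CFGADM:-cfgadm} -a -s "select=type(memory),cols=ap_id:o_state:info"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, watch_memory 12 5 would repeat the listing twelve times at five-second intervals.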

Finding the Board Containing Permanent Memory

To find the system board that contains the domain's permanent memory, use the following command:

# cfgadm -val | grep permanent
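
If only the board name is needed, the matching line can be reduced to the part of the Ap_Id before the :: separator. This sketch assumes that the Ap_Id is the first field on the line that mentions permanent memory. The function name permanent_memory_board is hypothetical.

```shell
# Hypothetical filter: read `cfgadm -val` output on stdin and print the
# board portion of the Ap_Id (the text before "::") for every line that
# contains the word "permanent". Assumes Ap_Id is the first field.
permanent_memory_board() {
    awk '/permanent/ { split($1, part, "::"); print part[1] }'
}

# Possible usage on a domain:
#   cfgadm -val | permanent_memory_board
```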