CHAPTER 14

Troubleshooting the Solaris x86 PXE Boot Installation

This chapter describes problems that can occur during or after a PXE boot installation of the Solaris x86 operating system. Each problem is introduced by a Synopsis heading, which is followed by the cause of the problem and its solution.

Synopsis: prom_panic: Could not mount filesystem

The following error can appear at startup when the blade is attempting to perform a PXE boot:

Broadcom UNDI PXE-2.1 (build 082) v6.2.11                                       
Copyright (C) 2000-2003 Broadcom Corporation                                    
Copyright (C) 1997-2000 Intel Corporation                                       
All rights reserved.                                                            
                                                                                
CLIENT MAC ADDR: 00 03 BA 29 F0 DE  GUID: 00000000 0000 0000 0000 000000000000  
CLIENT IP: 123.123.123.172  MASK: 255.255.255.0  DHCP IP: 123.123.123.163       
SunOS Secondary Boot version 3.00                                               
                                                                                
prom_panic: Could not mount filesystem.                                         
Entering boot debugger:.                                                        
[136039]: 

Cause:

The secondary bootstrap program was unable to mount the file system for the Solaris x86 install image.

Solution:

Check that the SrootPTH macro has been entered correctly as displayed by the add_install_client output (see FIGURE 10-7 in Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade).

Synopsis: Cannot Read SUNW.i86pc File for Blade

The following error can appear at startup when the blade is attempting to perform a PXE boot and Jumpstart installation:

Broadcom UNDI PXE-2.1 (build 082) v6.2.11                                 
Copyright (C) 2000-2003 Broadcom Corporation                              
Copyright (C) 1997-2000 Intel Corporation                                 
All rights reserved.                                                      
     
CLIENT MAC ADDR: 00 03 BA 29 F0 DE  GUID: 00000000 0000 0000 0000 000000000000  
CLIENT IP: 123.123.123.172  MASK: 255.255.255.0  DHCP IP: 123.123.123.163 
GATEWAY IP: 123.123.123.8                                                 
 
     
Solaris network boot ...                                                  
     
Cannot read file 123.123.123.163:/tftpboot/SUNW.i86pc.                    
Type <ENTER> to retry network boot or <control-C> to try next boot device 

where 123.123.123.163 is the IP address of the Network Install Server containing the Solaris x86 image for the blade.

Cause:

The data structures used by DHCP to transfer the DHCP option strings currently impose a limit of 255 characters on the length of these strings. If this limit is exceeded, one of the option strings will be truncated. If the truncated string happens to be the value of the Bootfile option, then the PXE boot protocol will attempt to perform a non-client-specific PXE boot by reading the file SUNW.i86pc. This file is not suitable for booting B100x and B200x blades and, in any case, it will not normally exist in the /tftpboot directory on the Network Install Server.

Solution:

When configuring the DHCP option strings (see Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade), you need to take into account that long names for the install server path and the root server path will quickly use up the available option string space of 255 characters. For a screen shot of the window in the DHCP Manager's GUI where the path for the option string is specified, see FIGURE 10-8.

If you have encountered this problem, reduce the length of the SrootPTH and SinstPTH option strings. You can achieve this by creating a link to the full path stored in the Network Install Server's file system. For example, supposing the paths for SrootPTH and SinstPTH are:

SrootPTH=/export/install/media/b100xb200x/solaris9-install/Solaris_9/Tools/Boot
SinstPTH=/export/install/media/b100xb200x/solaris9-install

You can reduce the length of these specified paths by creating a link to the solaris9-install image on the Network Install Server. To do this:

1. Log in as root to the Network Install Server and type the following command:

# ln -s /export/install/media/b100xb200x/solaris9-install /export/s9-install

2. Adjust the macros in the DHCP server as follows:

SrootPTH=/export/s9-install/Solaris_9/Tools/Boot
SinstPTH=/export/s9-install

In this example, the link reduces the total length of these two DHCP option strings by 62 characters (31 characters for each string).
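The arithmetic behind this saving can be checked with a short shell sketch. It assumes a POSIX-compatible shell (for example ksh or bash); the variable names are illustrative only and are not part of any Solaris tool.

```shell
#!/bin/sh
# Option string values from the example above (variable names are illustrative).
old_root=/export/install/media/b100xb200x/solaris9-install/Solaris_9/Tools/Boot
old_inst=/export/install/media/b100xb200x/solaris9-install
new_root=/export/s9-install/Solaris_9/Tools/Boot
new_inst=/export/s9-install

# ${#var} expands to the number of characters held in var.
old_total=$(( ${#old_root} + ${#old_inst} ))
new_total=$(( ${#new_root} + ${#new_inst} ))
saved=$(( old_total - new_total ))

echo "before: $old_total  after: $new_total  saved: $saved"
```

Substituting your own SrootPTH and SinstPTH values shows how much of the available option string space a shorter link name would recover.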

Synopsis: PXE Access Violation Before Primary Bootstrap Has Loaded

The following error can appear at startup when the blade is attempting to perform a PXE boot:

Broadcom UNDI PXE-2.1 (build 082) v6.2.11                                       
Copyright (C) 2000-2003 Broadcom Corporation                                    
Copyright (C) 1997-2000 Intel Corporation                                       
All rights reserved.                                                            
                                                                                
CLIENT MAC ADDR: 00 03 BA 29 F0 DE  GUID: 00000000 0000 0000 0000 000000000000  
CLIENT IP: 123.123.123.172  MASK: 255.255.255.0  DHCP IP: 123.123.123.163       
GATEWAY IP: 123.123.123.8                                                       
TFTP.                                                                           
PXE-T02: Access violation                                                       
PXE-E3C: TFTP Error - Access Violation                                          
                                                                                
PXE-M0F: Exiting Broadcom PXE ROM. 

Cause:

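This error message indicates that, during the PXE boot process, the blade was unable to download the primary bootstrap program from the install server's /tftpboot area. The likely reasons, each of which is covered by the checks in the solution below, are that the add_install_client command has not been run for the blade, that it was run for an install image that does not support client-specific PXE booting, or that the files it creates in /tftpboot are missing or unreadable.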

Solution:

If you think you did not execute the add_install_client command, then execute it now (see Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade). When you have done so, check that the files for the primary bootstrap, the secondary bootstrap, and the client-specific boot settings exist in the /tftpboot area on the Network Install Server.

If any of them do not exist there (or do not have read permissions), you will encounter access violation errors during the PXE boot process.
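As a rough sketch of this check, the function below tests whether the three client-specific files are present and readable. The function name check_boot_files is hypothetical, a POSIX-compatible shell is assumed, and the file naming pattern is taken from the sample listing later in this section.

```shell
#!/bin/sh
# check_boot_files DIR CLIENT_ID   (hypothetical helper, not a Solaris utility)
# Reports each client-specific file that is missing or unreadable in DIR:
#   nbp.CLIENT_ID          primary bootstrap
#   CLIENT_ID              secondary bootstrap
#   CLIENT_ID.bootenv.rc   client-specific boot settings
# Returns 0 if all three are readable, 1 otherwise.
check_boot_files() {
    dir=$1
    id=$2
    status=0
    for f in "nbp.$id" "$id" "$id.bootenv.rc"; do
        # -r follows symbolic links, so a dangling link is reported too.
        if [ ! -r "$dir/$f" ]; then
            echo "missing or unreadable: $dir/$f"
            status=1
        fi
    done
    return $status
}
```

For the blade used in this chapter the call would be check_boot_files /tftpboot 010003BA29F0DE. When run as root, the -r test succeeds regardless of the permission bits, so the ls -l output should still be inspected to confirm that the files are readable by other users.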

To check you have the correct client-specific files in the /tftpboot area, do the following:

1. Search for all the files that contain the blade's MAC address in their filename.

Assuming a blade MAC address of 00:03:BA:29:F0:DE, you would type the following command (remembering that in these filenames the MAC address is preceded by 01 and has its colon characters removed):

# cd /tftpboot 
# ls -l *010003BA29F0DE*
lrwxrwxrwx   1 root   other         26 Oct 29 12:35 010003BA29F0DE -> inetboot.I86PC.Solaris_9-1
-rw-r--r--  1 root   other        639 Oct 29 12:35 010003BA29F0DE.bootenv.rc
lrwxrwxrwx  1 root   other         21 Oct 29 12:35 nbp.010003BA29F0DE -> nbp.I86PC.Solaris_9-1
-rw-r--r--  1 root   other        568 Oct 29 12:35 rm.010003BA29F0DE

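The output from this command shows the client-specific files for the blade: the secondary bootstrap (010003BA29F0DE), the client-specific boot settings (010003BA29F0DE.bootenv.rc), and the primary bootstrap (nbp.010003BA29F0DE).

The files listed in the above output with an arrow (->) after them are links. The filename after the arrow is the file that they link to.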
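The filename pattern used in Step 1 can be derived from the MAC address with a small shell function. This is only a sketch: the name mac_to_client_id is hypothetical, and the conversion to upper case is an assumption based on the filenames in the sample listing.

```shell
#!/bin/sh
# mac_to_client_id MAC   (hypothetical helper, not a Solaris utility)
# Prints "01" followed by the MAC address with its colons removed.
# Lower-case hex digits are converted to upper case to match the listing.
mac_to_client_id() {
    printf '01%s\n' "$(printf '%s' "$1" | tr -d ':' | tr 'abcdef' 'ABCDEF')"
}

mac_to_client_id 00:03:BA:29:F0:DE
```

The printed value can then be used in the ls pattern, for example ls -l *"$(mac_to_client_id 00:03:BA:29:F0:DE)"* from within /tftpboot.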

2. Use the ls command to check that the required copies of the install image's original bootstrap files do in fact exist in the /tftpboot area:

# ls -l nbp.I86PC.Solaris_9-1
-rwxr-xr-x   1 root     other      14596 Oct 29 12:35 nbp.I86PC.Solaris_9-1
#
# ls -l inetboot.I86PC.Solaris_9-1
-rwxr-xr-x   1 root     other     401408 Oct 29 12:35 inetboot.I86PC.Solaris_9-1

The copies of the install image's bootstrap files in /tftpboot are created by the add_install_client utility (which you ran in Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade).

If they do not exist in /tftpboot, then either you have not run the add_install_client utility, or you have run it for a network install image that does not support client-specific PXE booting.

In either case run the add_install_client utility for the correct install image, following the instructions in Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade.

3. If the bootstrap files pointed to by the links do exist in /tftpboot (in other words, if they are listed by the ls command that you ran in Step 2), then check they are the same size as the original bootstrap programs belonging to the Solaris x86 install image that you intend to use for the blade or blades.

To do this, run the ls commands for the original bootstrap files belonging to the install image you intend to use, and compare their file sizes with the file sizes reported in Step 2 for the client-specific files in /tftpboot.

In the sample commands provided in Chapter 10, the Solaris x86 install image was located in the directory /export/s9x on the Network Install Server. The sample commands below assume the same path:

# cd /export/s9x/Solaris_9/Tools/Boot
 
# ls -l usr/platform/i86pc/lib/fs/nfs/inetboot
-rw-r--r--   1 root     sys       401408 Oct  7 23:55 usr/platform/i86pc/lib/fs/nfs/inetboot
 
# ls -l boot/solaris/nbp
-rw-r--r--   1 root     sys        14596 Sep 23 15:45 boot/solaris/nbp

4. If the necessary files did not exist in the /tftpboot directory on the Network Install Server, or if they were not identical to the bootstrap files belonging to the install image you have been intending to use for the blade or blades, then run the add_install_client utility again for the correct image (see Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade).

If the files did appear to exist and to be the correct files, a final check is to compare the checksums for the different files using the sum(1) command. If the checksum for the client-specific copy matches the checksum for the original file belonging to the install image, then the files are identical. If not, run the add_install_client utility again, making sure you run it for the correct Solaris x86 install image.
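The size comparison in Step 3 and the checksum comparison described above can be combined in one sketch. The function names are hypothetical, a POSIX-compatible shell is assumed, and sum(1) is read from standard input so that its output does not include the filename.

```shell
#!/bin/sh
# file_size FILE: prints the size of FILE in bytes (symbolic links are followed).
file_size() {
    wc -c < "$1" | tr -d ' '
}

# compare_bootstrap COPY ORIGINAL   (hypothetical helper)
# Succeeds only if both files have the same size and the same sum(1) output.
compare_bootstrap() {
    if [ "$(file_size "$1")" -ne "$(file_size "$2")" ]; then
        echo "size differs: $1 $2"
        return 1
    fi
    if [ "$(sum < "$1")" != "$(sum < "$2")" ]; then
        echo "checksum differs: $1 $2"
        return 1
    fi
    echo "match: $1 $2"
}
```

With the sample paths used in this chapter, the two calls would be compare_bootstrap /tftpboot/nbp.I86PC.Solaris_9-1 /export/s9x/Solaris_9/Tools/Boot/boot/solaris/nbp and compare_bootstrap /tftpboot/inetboot.I86PC.Solaris_9-1 /export/s9x/Solaris_9/Tools/Boot/usr/platform/i86pc/lib/fs/nfs/inetboot. If a stricter check is wanted, cmp(1) compares the two files byte for byte.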

Synopsis: Cannot Read Secondary Bootstrap

The following error can appear at startup when the blade is attempting to perform a PXE boot:

Broadcom UNDI PXE-2.1 (build 082) v6.2.11                                       
Copyright (C) 2000-2003 Broadcom Corporation                                    
Copyright (C) 1997-2000 Intel Corporation                                       
All rights reserved.                                                            
                                                                                
CLIENT MAC ADDR: 00 03 BA 29 F0 DE  GUID: 00000000 0000 0000 0000 000000000000  
CLIENT IP: 123.123.123.172  MASK: 255.255.255.0  DHCP IP: 123.123.123.163       
GATEWAY IP: 123.123.123.8                                                       
                                                                                
                                                                                
                                                                                
Solaris network boot ...                                                        
                                                                                
Cannot read file 123.123.123.163:/tftpboot/010003BA29F0DE.                      
Type <ENTER> to retry network boot or <control-C> to try next boot device ...

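Cause:

The blade was unable to read the secondary bootstrap program (the file 010003BA29F0DE in this example) from the /tftpboot area on the Network Install Server.

Solution: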

Carry out the same checks as were recommended in the solution to the following problem: Synopsis: PXE Access Violation Before Primary Bootstrap Has Loaded.

Synopsis: Blade Appears to Hang After Primary Bootstrap is Loaded

The following error can appear at startup when the blade is attempting to perform a PXE boot:

Broadcom UNDI PXE-2.1 (build 082) v6.2.11                                       
Copyright (C) 2000-2003 Broadcom Corporation                                    
Copyright (C) 1997-2000 Intel Corporation                                       
All rights reserved.                                                            
                                                                                
CLIENT MAC ADDR: 00 03 BA 29 F0 DE  GUID: 00000000 0000 0000 0000 000000000000  
CLIENT IP: 123.123.123.172  MASK: 255.255.255.0  DHCP IP: 123.123.123.163       
GATEWAY IP: 123.123.123.8                                                       
                                                                                
                                                                                
                                                                                
Solaris network boot ... 

Cause:

Possible causes include:

Solution:

The first thing to check is that you have run the add_install_client command correctly (see Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade). If you are not sure, you can simply run the command again. Then carry out the same checks as were recommended in the solution to the problem: Synopsis: PXE Access Violation Before Primary Bootstrap Has Loaded.

Synopsis: Secondary Boot Program Aborts to > Prompt

The following error can appear at startup when the blade is attempting to perform a PXE boot:

Broadcom UNDI PXE-2.1 (build 082) v6.2.11                                       
Copyright (C) 2000-2003 Broadcom Corporation                                    
Copyright (C) 1997-2000 Intel Corporation                                       
All rights reserved.                                                            
                                                                                
CLIENT MAC ADDR: 00 03 BA 29 F0 DE  GUID: 00000000 0000 0000 0000 000000000000  
SunOS Secondary Boot version 3.00 255.255.255.0  DHCP IP: 123.123.123.163       
GATEWAY IP: 123.123.123.8                                                       
/dev/diskette0: device not installed, unknown device type 0                     
                                                                                
                                                                                
             Solaris Intel Platform Edition Booting System                      
                                                                                
                                                                                
> 

Cause:

Possible causes include:

Solution:

The first thing to check is that you have run the add_install_client command correctly (see Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade). If you are not sure, you can simply run the command again. Then carry out the same checks as were recommended in the solution to the problem: Synopsis: PXE Access Violation Before Primary Bootstrap Has Loaded.

Synopsis: Malformed Bootpath

The following error can appear at startup when the blade is attempting to perform a PXE boot:

 Error: Malformed bootpath
 
 Property The bootpath property:
 
 /pci@0,0/pci78887,7
 
 is badly formed, and will be ignored.
 
 Press Enter to  Continue.
 
 
 
 
 
 
 
 Enter_Continue

Cause:

Possible causes include:

Solution:

The first thing to check is that you have run the add_install_client command correctly (see Section 10.4, Configuring the Install Server and the DHCP Server to Install Solaris x86 Onto Each Blade). If you are not sure, you can simply run the command again. Then carry out the same checks as were recommended in the solution to the problem: Synopsis: PXE Access Violation Before Primary Bootstrap Has Loaded.

Synopsis: Installation Stops at Screen Called 'Solaris Device Configuration Assistant'

The following screen can appear at startup when the blade is attempting to perform a PXE boot:

 Solaris Device Configuration Assistant
 
  The Solaris(TM) (Intel Platform Edition) Device Configuration Assistant
  scans to identify system hardware, lists identified devices, and can
  boot the Solaris software from a specified device. This program must be
  used to install the Solaris operating environment, add a driver, or
  change the hardware on the system.
 
 
 
  > To perform a full scan to identify all system hardware, choose Continue.
 
  > To diagnose possible full scan failures, choose Specific Scan.
 
  > To add new or updated device drivers, choose Add Driver.
 
  About navigation... 
      - The mouse cannot be used.
      - If the keyboard does not have function keys or they do not respond,
        press ESC. The legend at the bottom of the screen will change to show
        the ESC keys to use for navigation.
      - The F2 key performs the default action.
 
 
 
  F2_Continue    F3_Specific Scan    F4_Add Driver    F6_Help

Cause:

Possible causes include:

Solution:

For information about setting up Jumpstart correctly for your requirements, refer to the Solaris 9 Installation Guide, and see Section 10.9, Preparatory Steps for Setting up a Jumpstart Installation for a Blade, and Section 10.10, Configuring a Jumpstart Installation.

 

Synopsis: Blade Boots to Device Configuration Assistant on Every Reboot After an Interactive Network Installation

The following screen can also appear when you are performing an interactive network installation of Solaris x86 on a blade that has previously had Solaris x86 or Linux running on it but that has a disk partition table that does not contain separate Boot and Solaris partitions.

 Solaris Device Configuration Assistant
 
  The Solaris(TM) (Intel Platform Edition) Device Configuration Assistant
  scans to identify system hardware, lists identified devices, and can
  boot the Solaris software from a specified device. This program must be
  used to install the Solaris operating environment, add a driver, or
  change the hardware on the system.
 
 
 
  > To perform a full scan to identify all system hardware, choose Continue.
 
  > To diagnose possible full scan failures, choose Specific Scan.
 
  > To add new or updated device drivers, choose Add Driver.
 
  About navigation... 
      - The mouse cannot be used.
      - If the keyboard does not have function keys or they do not respond,
        press ESC. The legend at the bottom of the screen will change to show
        the ESC keys to use for navigation.
      - The F2 key performs the default action.
 
  F2_Continue    F3_Specific Scan    F4_Add Driver    F6_Help

Cause:

The blade's hard disk partition table does not define separate Boot and Solaris partitions. As a result, the bootpath property was not set in the file /a/boot/solaris/bootenv.rc at the end of the install process.

Solution:

If you want to install the blade using a single Solaris disk partition, follow the instructions in Chapter 10 to perform a Jumpstart installation. In particular, make sure you use the x86-finish script as described in Section 10.9, Preparatory Steps for Setting up a Jumpstart Installation for a Blade. This will ensure that, before the blade is rebooted, the bootpath property is correctly set in the file /a/boot/solaris/bootenv.rc.

Alternatively, you can step through the DCA screens by pressing [F2] and [ENTER], and then select the hard disk as the boot device. When Solaris has booted, you can use an editor to add the correct bootpath property to the file /a/boot/solaris/bootenv.rc.
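Before and after editing the file, a quick test can confirm whether a bootpath property is present. The sketch below assumes that properties in bootenv.rc are written as lines of the form "setprop name value"; that format, and the function name has_bootpath, are assumptions rather than details taken from this chapter.

```shell
#!/bin/sh
# has_bootpath FILE   (hypothetical helper)
# Succeeds if FILE contains a line of the assumed form:
#   setprop bootpath <value>
has_bootpath() {
    awk '$1 == "setprop" && $2 == "bootpath" && NF >= 3 { found = 1 }
         END { exit !found }' "$1"
}
```

For example, has_bootpath /a/boot/solaris/bootenv.rc || echo "bootpath is not set" reports the condition described in the cause above. The sketch only tests for the presence of the property; it does not check that the device path is correct for the blade.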

To prevent this problem from occurring when you reboot after a future interactive network installation, perform the installation as described in Chapter 10, and follow the instructions in Section 10.8.6, Removing the Entire Disk Partition Table Before Restarting the Solaris Install Program.