Systems with a UDWIS/SBus Host Adapter card may log warning messages stating "SCSI transport failed". When these messages appear, I/O operations can slow down drastically and the system can eventually hang, at which point it must be rebooted.
The affected UDWIS (Differential Ultra/Wide SCSI/S) cards have part numbers:
Storage devices connected to the affected UDWIS card are vulnerable, for example:
Because this problem lies in the UDWIS card itself, any attached third-party SCSI devices are also vulnerable.
This problem can occur on Ultra 3x00/4x00/5x00/6x00 systems, including Enterprise E10000s, under heavy I/O activity.
The problem is caused by SCSI bus arbitration. The condition reported by these messages causes timeouts on the SCSI bus, which drastically slow down I/O operations and can eventually hang the system.
The following SCSI warning messages were produced under very heavy I/O loads when using the UDWIS SBus Host Adapter connected to the StorEdge D1000, A1000, A3000, and A3500 in the desktop and/or server configurations:
Oct 8 11:16:32 malin unix: WARNING: /sbus@2,0/QLGC,isp@2,10000/sd@2,0 (sd17):
Oct 8 11:16:32 malin unix: SCSI transport failed: reason 'incomplete': retrying command
Oct 8 11:16:32 malin unix:
Oct 8 11:16:32 malin unix: WARNING: /sbus@2,0/QLGC,isp@2,10000/sd@4,0 (sd19):
Oct 8 11:16:32 malin unix: SCSI transport failed: reason 'incomplete': retrying command
As a workaround, set sd_max_throttle = 15 so that no more than 190 SCSI commands are queued in the UDWIS host adapter memory, even with 12 disk drives in a D1000, A1000, or A3X00 disk array (15 x 12 = 180, which is below 190).
To set the maximum throttle value to 15 SCSI commands per disk, add the following line to /etc/system:

    set sd:sd_max_throttle=15
Reboot the system for the above change to take effect. (The reboot may take a long time on large systems.)
For configurations with multiple A1000s or A3X00s daisy chained to one UDWIS host adapter and supporting more than 12 LUNs on the SCSI bus, use a new sd_max_throttle value determined by dividing 190 by the total number of LUNs on the SCSI bus.
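The division rule above can be sketched as a small calculation. This is only an illustration, under three assumptions that the text does not state: the result is rounded down to a whole number, the value is capped at 15 when the bus carries 12 or fewer LUNs, and the /etc/system entry uses the usual Solaris form "set sd:sd_max_throttle=N". The function names are hypothetical.

```python
# Total queued-command figure quoted in the text for the UDWIS host adapter.
MAX_QUEUED_COMMANDS = 190
# Workaround value from the text for a single 12-drive array.
DEFAULT_THROTTLE = 15


def sd_max_throttle_for(total_luns: int) -> int:
    """Per-LUN throttle: 190 divided by the LUN count.

    Assumptions (not stated in the text): the quotient is rounded down,
    and the result never exceeds the workaround value of 15.
    """
    if total_luns < 1:
        raise ValueError("total_luns must be at least 1")
    return min(DEFAULT_THROTTLE, MAX_QUEUED_COMMANDS // total_luns)


def etc_system_line(total_luns: int) -> str:
    # Assumed /etc/system syntax for an sd driver tunable.
    return f"set sd:sd_max_throttle={sd_max_throttle_for(total_luns)}"


if __name__ == "__main__":
    # Example LUN counts: one 12-drive array, then larger daisy-chained buses.
    for luns in (12, 16, 24):
        print(f"{luns} LUNs: {etc_system_line(luns)}")
```

Rounding down keeps the product of the throttle and the LUN count at or below 190; for example, 24 LUNs give a throttle of 7, for at most 168 queued commands.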
Note: Please refer your Sun Field/Account Team to the appropriate FIN document.
This problem is addressed with UDWIS card part number 370-2443-03.
The problem described in this Sun(sm) Alert document may or may not be experienced by your particular system(s). The information in this Sun(sm) Alert document may be based upon information received from third-parties. It is being provided to you "as is," for informational purposes only. Sun does not make any representations, warranties, or guaranties as to the quality, suitability, truth, accuracy or completeness of any of the information. Sun shall not be liable for any losses or damages suffered as a result of Customer's use or non-use of the information.