Identifying Configuration Problems Using Storage Expert

Storage Expert provides a large number of rules that help you to diagnose configuration issues that might cause problems for your storage environment. Each rule describes the issues involved, and suggests remedial actions.

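The rules help you to diagnose problems in the following categories:

Recovery Time
Disk Groups
Disk Striping
Disk Sparing and Relocation Management
Hardware Failures
Rootability
System Hostname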

A full list of Storage Expert rules, listed in numerical order, can be found in Rule Definitions and Attributes.

Recovery Time

Several "best practice" rules enable you to check that your storage configuration has the resilience to withstand a disk failure or a system failure.

Checking for Multiple RAID-5 Logs on a Physical Disk (vxse_disklog)

To check whether more than one RAID-5 log exists on the same physical disk, run rule vxse_disklog.

RAID-5 log mirrors for the same RAID-5 volume should be located on separate physical disks to ensure redundancy. More than one RAID-5 log on a disk also makes the recovery process longer and more complicated.
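The condition that vxse_disklog looks for can be illustrated with a minimal sketch. The input layout (pairs of volume name and disk name, one per RAID-5 log) and the function name are hypothetical and chosen only for illustration; this is not how Storage Expert itself is implemented, and VxVM does not produce data in this form.

```python
from collections import defaultdict


def disks_with_multiple_raid5_logs(log_placements):
    """Return {disk: [volumes]} for disks that hold more than one RAID-5 log.

    log_placements is a hypothetical list of (volume, disk) pairs,
    one pair per RAID-5 log.
    """
    logs_by_disk = defaultdict(list)
    for volume, disk in log_placements:
        logs_by_disk[disk].append(volume)
    # Keep only the disks that carry two or more logs.
    return {disk: vols for disk, vols in logs_by_disk.items() if len(vols) > 1}


placements = [("vol01", "disk01"), ("vol01", "disk02"), ("vol02", "disk02")]
print(disks_with_multiple_raid5_logs(placements))
# prints {'disk02': ['vol01', 'vol02']}
```

In this example, disk02 holds logs for two volumes, so it is the disk that would need attention.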

Checking for Large Mirrored Volumes Without a DRL (vxse_drl1)

To check whether large mirrored volumes (larger than 1GB) have an associated dirty region log (DRL), run rule vxse_drl1.

Creating a DRL speeds recovery of mirrored volumes after a system crash. A DRL tracks those regions that have changed and uses the tracking information to recover only those portions of the volume that need to be recovered. Without a DRL, recovery is accomplished by copying the full contents of the volume between its mirrors. This process is lengthy and I/O intensive.
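The difference between the two recovery paths can be sketched as follows. This is a simplified model rather than VxVM's implementation: the volume is assumed to be divided into numbered regions, and the function name and arguments are hypothetical.

```python
def regions_to_resync(total_regions, dirty_regions=None):
    """Return the region indexes that recovery must copy between mirrors.

    With a DRL, only the regions marked dirty are copied. Without one
    (dirty_regions is None), every region of the volume is copied.
    """
    if dirty_regions is None:
        return list(range(total_regions))
    # Ignore duplicates and any index outside the volume.
    return sorted(r for r in set(dirty_regions) if 0 <= r < total_regions)


without_drl = regions_to_resync(1024)
with_drl = regions_to_resync(1024, dirty_regions={3, 17, 900})
print(len(without_drl), len(with_drl))
# prints 1024 3
```

In this model, recovery with a DRL touches three regions instead of all 1024, which is why the log shortens recovery and reduces I/O.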

For information on adding a DRL to a mirrored volume, see Preparing a Volume for DRL and Instant Snapshots.

Checking for Large Mirrored Volumes Without a Mirrored DRL (vxse_drl2)

To check whether a large mirrored volume has a mirrored DRL, run rule vxse_drl2.

Mirroring the DRL provides added protection in the event of a disk failure.

For information on adding a mirror to a DRL, see Preparing a Volume for DRL and Instant Snapshots.

Checking for RAID-5 Volumes Without a RAID-5 Log (vxse_raid5log1)

To check whether a RAID-5 volume has an associated RAID-5 log, run rule vxse_raid5log1.

In the event of both a system failure and a failure of a disk in a RAID-5 volume, data that is not involved in an active write could be lost or corrupted if there is no RAID-5 log.

For information about adding a RAID-5 log to a RAID-5 volume, see Adding a RAID-5 Log.

Checking Minimum and Maximum RAID-5 Log Sizes (vxse_raid5log2)

To check that the size of RAID-5 logs falls within the minimum and maximum recommended sizes, run rule vxse_raid5log2.

The recommended minimum and maximum sizes are 64MB and 1GB respectively. If vxse_raid5log2 reports that the size of the log is outside these boundaries, adjust the size by replacing the log.
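The size check can be sketched as a simple range test. The function and parameter names are hypothetical, and treating both boundaries as inclusive is an assumption; the text above does not say how a log of exactly 64MB or 1GB is reported.

```python
MB = 1024 * 1024
GB = 1024 * MB


def raid5_log_size_ok(size_bytes, min_bytes=64 * MB, max_bytes=1 * GB):
    """Return True if a RAID-5 log size lies within the recommended range.

    Both boundaries are treated as inclusive (an assumption).
    """
    return min_bytes <= size_bytes <= max_bytes


for size in (32 * MB, 64 * MB, 512 * MB, 2 * GB):
    print(size // MB, "MB:", raid5_log_size_ok(size))
```

Only the 32MB and 2GB logs in this example fall outside the recommended range.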

Checking for Non-Mirrored RAID-5 Logs (vxse_raid5log3)

To check that the RAID-5 log of a large volume is mirrored, run rule vxse_raid5log3.

A mirror of the RAID-5 log protects against loss of data due to the failure of a single disk. You are strongly advised to mirror the log if vxse_raid5log3 reports that the log of a large RAID-5 volume does not have a mirror.

For information on adding a RAID-5 log mirror, see Adding a RAID-5 Log.

Disk Groups

Disk groups are the basis of VxVM storage configuration, so it is critical that the integrity and resilience of your disk groups are maintained. Storage Expert provides a number of rules that enable you to check the status of disk groups and associated objects.

Checking Whether a Configuration Database Is Too Full (vxse_dg1)

To check whether the disk group configuration database has become too full, run rule vxse_dg1.

By default, this rule suggests a limit of 250 for the number of disks in a disk group. If one of your disk groups exceeds this figure, you should consider creating a new disk group. The number of objects that can be configured in a disk group is limited by the size of the private region which stores configuration information about every object in the disk group. Each disk in the disk group that has a private region stores a separate copy of this configuration database.

For information on creating a new disk group, see Creating a Disk Group.

Checking Disk Group Configuration Copies and Logs (vxse_dg2)

To check whether a disk group has too many or too few disk group configuration copies, and whether a disk group has too many or too few copies of the disk group log, run rule vxse_dg2.

Checking "on disk config" Size (vxse_dg3)

To check whether a disk group has the correct "on disk config" size, run rule vxse_dg3.

Checking Version Number of Disk Groups (vxse_dg4)

To check the version number of a disk group, run rule vxse_dg4.

For optimum results, your disk groups should have the latest version number that is supported by the installed version of VxVM.

If a disk group is not at the latest version number, see the section Upgrading a Disk Group for information about upgrading it.

Checking the Number of Configuration Copies in a Disk Group (vxse_dg5)

To find out whether a disk group has only a single VxVM configured disk, run rule vxse_dg5.

See Creating and Administering Disk Groups for more information.

Checking for Non-Imported Disk Groups (vxse_dg6)

To check for disk groups that are visible to VxVM but not imported, run rule vxse_dg6.

For information on importing a disk group, see Importing a Disk Group.

Checking for Initialized VM Disks that are not in a Disk Group (vxse_disk)

To find out whether there are any initialized disks that are not a part of any disk group, run rule vxse_disk. This prints out a list of disks, indicating whether they are part of a disk group or unassociated.

For information on how to add a disk to a disk group, see Adding a Disk to a Disk Group.

Checking Volume Redundancy (vxse_redundancy)

To check whether a volume is redundant, run rule vxse_redundancy.

This rule displays a list of volumes together with the number of mirrors that are associated with each volume. If vxse_redundancy shows that a volume does not have an associated mirror, your data is at risk in the event of a disk failure, and you should rectify the situation by creating a mirror for the volume.

See Adding a Mirror to a Volume for information on adding a mirror to a volume.

Checking States of Plexes and Volumes (vxse_volplex)

To check whether your disk groups contain unused objects (such as plexes and volumes), run rule vxse_volplex. In particular, this rule notifies you if any of the following conditions exist:

disabled plexes
detached plexes
stopped volumes
disabled volumes
disabled logs
failed plexes
volumes needing recovery

If any of these conditions exist, see the following for information on correcting the situation:

To re-enable a disabled or detached plex, see Reattaching Plexes.
To re-enable a stopped or disabled volume, see Starting a Volume.
To recover a volume, see the chapter "Recovery from Hardware Failure" in the VERITAS Volume Manager Troubleshooting Guide.

Disk Striping

Striping enables you to enhance your system's performance. Several rules enable you to monitor important parameters such as the number of columns in a stripe plex or RAID-5 plex, and the stripe unit size of the columns.

Checking the Configuration of Large Mirrored-Stripe Volumes (vxse_mirstripe)

To check whether large mirrored-stripe volumes should be reconfigured as striped-mirror volumes, run rule vxse_mirstripe.

A large mirrored-stripe volume should be reconfigured, using relayout, as a striped-mirror volume to improve redundancy and enhance recovery time after failure.

To convert a mirrored-stripe volume to a striped-mirror volume, see Converting Between Layered and Non-Layered Volumes.

Checking the Number of Columns in RAID-5 Volumes (vxse_raid5)

To check whether RAID-5 volumes have too few or too many columns, run rule vxse_raid5.

By default, this rule assumes that a RAID-5 plex should have more than 4 columns and fewer than 8 columns.
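Because the default is stated as "more than 4" and "fewer than 8", both limits are exclusive, which leaves 5, 6, and 7 as the acceptable column counts. A minimal sketch of that test follows; the function and parameter names are hypothetical and are not the rule's own attribute names.

```python
def raid5_column_count_ok(columns, too_narrow=4, too_wide=8):
    """Return True if the column count is more than too_narrow
    and fewer than too_wide (both limits exclusive)."""
    return too_narrow < columns < too_wide


acceptable = [n for n in range(1, 12) if raid5_column_count_ok(n)]
print(acceptable)
# prints [5, 6, 7]
```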

See Performing Online Relayout for information on changing the number of columns.

Checking the Stripe Unit Size of Striped Volumes (vxse_stripes1)

By default, rule vxse_stripes1 reports a violation if a volume's stripe unit size is not set to an integer multiple of 8KB.
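The multiple-of-8KB test can be sketched with a remainder check. The function and parameter names are hypothetical, and rejecting a zero or negative size is an added assumption.

```python
KB = 1024


def stripe_unit_ok(stripe_unit_bytes, base_bytes=8 * KB):
    """Return True if the stripe unit size is a positive integer
    multiple of base_bytes (8KB by default)."""
    return stripe_unit_bytes > 0 and stripe_unit_bytes % base_bytes == 0


for size_kb in (8, 12, 64, 100):
    print(size_kb, "KB:", stripe_unit_ok(size_kb * KB))
```

Here 8KB and 64KB pass, while 12KB and 100KB would be reported as violations.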

See Performing Online Relayout for information on changing the stripe unit size.

Checking the Number of Columns in Striped Volumes (vxse_stripes2)

The default minimum and maximum numbers of columns in a striped plex are 3 and 16 respectively. By default, rule vxse_stripes2 reports a violation if a striped plex in your volume has fewer than 3 columns or more than 16 columns.
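Because a violation is reported only for fewer than 3 or more than 16 columns, both limits are themselves acceptable values. A minimal sketch of that test follows; the function and parameter names are hypothetical and are not the rule's own attribute names.

```python
def stripe_column_count_ok(columns, min_columns=3, max_columns=16):
    """Return True unless the plex has fewer than min_columns
    or more than max_columns columns (both limits inclusive)."""
    return min_columns <= columns <= max_columns


for n in (2, 3, 16, 17):
    print(n, "columns:", stripe_column_count_ok(n))
```

In this example, 3 and 16 columns pass, while 2 and 17 would be reported as violations.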

See Performing Online Relayout for information on changing the number of columns in a striped volume.

Disk Sparing and Relocation Management

The hot-relocation feature of VxVM uses spare disks in a disk group to recreate volume redundancy after disk failure.

Checking the Number of Spare Disks in a Disk Group (vxse_spares)

This "best practice" rule assumes that between 10% and 20% of disks in a disk group should be allocated as spare disks. By default, vxse_spares checks that a disk group falls within these limits.
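The ratio test can be sketched as follows. The function and parameter names are hypothetical, and treating exactly 10% and exactly 20% as acceptable is an assumption. Integer arithmetic is used so that the boundary cases are not affected by floating-point rounding.

```python
def spare_ratio_ok(total_disks, spare_disks, min_percent=10, max_percent=20):
    """Return True if spare disks make up between min_percent and
    max_percent (inclusive, by assumption) of the disks in a disk group."""
    if total_disks <= 0:
        raise ValueError("total_disks must be positive")
    if not 0 <= spare_disks <= total_disks:
        raise ValueError("spare_disks must be between 0 and total_disks")
    # Compare spare_disks / total_disks with the percentages without dividing.
    return min_percent * total_disks <= 100 * spare_disks <= max_percent * total_disks


for spares in (1, 2, 4, 5):
    print(spares, "of 20:", spare_ratio_ok(20, spares))
```

For a disk group of 20 disks, 2 to 4 spares fall within the default limits; 1 spare is too few and 5 are too many.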

See Administering Hot-Relocation for information on managing the pool of spare disks.

Hardware Failures

Checking for Disk and Controller Failures (vxse_dc_failures)

Rule vxse_dc_failures can be used to discover if the system has any failed disks or disabled controllers.

Rootability

Checking the Validity of Root Mirrors (vxse_rootmir)

Rule vxse_rootmir can be used to confirm that the root mirrors are set up correctly.

System Hostname

Checking the System Name (vxse_host)

Rule vxse_host can be used to confirm that the system name (hostname) in the file /etc/vx/volboot is the same as the name that was assigned to the system when it was booted.



Product: Volume Manager Guides
Manual: Volume Manager 4.1 Administrator's Guide
VERITAS Software Corporation www.veritas.com