Product: Volume Manager Guides   
Manual: Volume Manager 4.1 Administrator's Guide   

Handling Conflicting Configuration Copies in a Disk Group

If an incomplete disk group is imported on several different systems, inconsistencies can arise between the disk group configuration copies, and you may need to resolve them manually. This section and the following sections describe how such a condition can occur and how to correct it. (When the condition occurs in a cluster that has been split, it is usually referred to as a serial split brain condition.)


Note    The procedures given here require a disk group version number of at least 110. These procedures cannot be applied to SAN disk groups.

Example of a Serial Split Brain Condition in a Campus Cluster


Note    This section presents an example of how a serial split brain condition might occur for a shared disk group in a cluster. For more information about shared disk groups in clusters, see Administering Cluster Functionality. Conflicts between configuration copies can also occur for private disk groups in clustered and non-clustered configurations where the disk groups have been partially imported on different systems. The procedure in Correcting Conflicting Configuration Information describes how to correct such problems.

A campus cluster (also known as a stretch cluster) typically consists of a 2-node cluster where each component (server, switch and storage) of the cluster exists in a separate building. This is illustrated in Typical Arrangement of a 2-node Campus Cluster, which shows a 2-node cluster with node 0, a fibre channel switch and disk enclosure enc0 in building A, and node 1, another switch and enclosure enc1 in building B. The fibre channel connectivity is multiply redundant to implement redundant-loop access between each node and each enclosure. As usual, the two nodes are also linked by a redundant private network.

Typical Arrangement of a 2-node Campus Cluster


A serial split brain condition typically arises in a cluster when a private (non-shared) disk group is imported on Node 0 with Node 1 configured as the failover node.

If the network connections between the nodes are severed, both nodes think that the other node has died. (This is the usual cause of the split brain condition in clusters.) If a disk group is spread across both enclosures, enc0 and enc1, each portion loses connectivity to the other portion of the disk group. Node 0 continues to update the disks in the portion of the disk group that it can access. Node 1, operating as the failover node, imports the other portion of the disk group (with the -f option set), and starts updating the disks that it can see.

When the network links are restored, attempting to reattach the missing disks to the disk group on Node 0, or to re-import the entire disk group on either node, fails. This serial split brain condition arises because VxVM increments the serial ID in the disk media record of each imported disk in all the disk group configuration databases on those disks, and also in the private region of each imported disk. The value that is stored in the configuration database represents the serial ID that the disk group expects a disk to have. The serial ID that is stored in a disk's private region is considered to be its actual value.

If some disks went missing from the disk group (due to physical disconnection or power failure) and those disks were imported by another host, the serial IDs for the disks in their copies of the configuration database, and also in each disk's private region, are updated separately on that host. When the disks are subsequently re-imported into the original shared disk group, the actual serial IDs on the disks do not agree with the expected values from the configuration copies on other disks in the disk group.
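The bookkeeping described above can be illustrated with a toy model. The following Python sketch is not VxVM code: the Disk class, the function names, and the rule of one increment per partial import are assumptions made purely for illustration. It keeps an actual serial ID for each disk (standing in for the private region) and an expected ID for every disk in each configuration copy, and shows how separate imports on different hosts make the two disagree.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical stand-in for one disk in a disk group."""
    name: str
    actual: int = 0                                # serial ID in the private region
    expected: dict = field(default_factory=dict)   # this disk's config copy: name -> expected ID


def make_group(names):
    """Create a disk group in which every ID starts at 0."""
    return {n: Disk(n, 0, {m: 0 for m in names}) for n in names}


def partial_import(group, visible):
    """Model a host importing only the disks it can see (assumed: one increment per import)."""
    for name in visible:
        group[name].actual += 1
    # Only the configuration copies on the visible disks are updated.
    for name in visible:
        for other in visible:
            group[name].expected[other] += 1


def mismatches(group, copy_on):
    """Disks whose actual ID differs from what the copy stored on `copy_on` expects."""
    cfg = group[copy_on].expected
    return sorted(n for n, d in group.items() if d.actual != cfg[n])


group = make_group(["disk_a", "disk_b"])
partial_import(group, ["disk_a"])      # one host keeps using disk_a only
print(mismatches(group, "disk_a"))     # the copy on disk_a still agrees with every disk
partial_import(group, ["disk_b"])      # a second host imports disk_b on its own
print(mismatches(group, "disk_a"), mismatches(group, "disk_b"))
```

In this model, after only one side has been imported, the configuration copy on that side still matches every disk. Once both sides have been imported separately, every copy disagrees with at least one disk, which corresponds to the situation where no copy can be treated as definitive.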

Depending on what happened to the different portions of the split disk group, there are two possibilities for resolving inconsistencies between the configuration databases:

  • If the other disks in the disk group were not imported on another host, VxVM resolves the conflicting values of the serial IDs by using the version of the configuration database from the disk with the greatest value for the updated ID (shown as update_tid in the output from the vxdg list diskgroup command). This case is illustrated below.
    Example of a Serial Split Brain Condition that Can Be Resolved Automatically

  • If the other disks were also imported on another host, no disk can be considered to have a definitive copy of the configuration database. The figure below illustrates how this condition can arise for two disks.
    Example of a True Serial Split Brain Condition that Cannot Be Resolved Automatically

    This is a true serial split brain condition, which VxVM cannot correct automatically. In this case, the disk group import fails, and the vxdg utility outputs error messages similar to the following before exiting:

    VxVM vxconfigd NOTICE V-5-0-33 Split Brain. da id is 0.1, while dm id is 0.0 for DM mydg01
    VxVM vxdg ERROR V-5-1-587 Disk group newdg: import failed: Serial Split Brain detected. Run vxsplitlines

    The import does not succeed even if you specify the -f flag to vxdg.

    Although it is usually possible to resolve this conflict by choosing the version of the configuration database with the highest-valued configuration ID (shown as config_tid in the output from the vxdg list diskgroup command), this may not be the correct thing to do in all circumstances.

    The following section, Correcting Conflicting Configuration Information, describes how to fix this condition.

Correcting Conflicting Configuration Information


Note    This procedure requires that the disk group has a version number of at least 110. See Upgrading a Disk Group for more information about disk group version numbers.

To resolve conflicting configuration information, you must decide which disk contains the correct version of the disk group configuration database. To assist you in doing this, you can run the vxsplitlines command to show the actual serial ID on each disk in the disk group and the serial ID that was expected from the configuration database. For each disk, the command also shows the vxdg command that you must run to select the configuration database copy on that disk as being the definitive copy to use for importing the disk group.

The following is sample output from running vxsplitlines on the disk group newdg:


vxsplitlines -g newdg
The following splits were found in disk group newdg
They are listed in da(dm) name pairs. 

Pool 0.
  c2t5d0 ( c2t5d0 ), c2t6d0 ( c2t6d0 ),
The configuration from any of the disks in this split should appear to be be the same.
To see the configuration from any of the disks in this split, run:
  /etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/dmp/c2t5d0
To import the dg with the configuration from this split, run:
  /usr/sbin/vxdg -o selectcp=1045852127.32.olancha import newdg
To get more information about this particular configuration, run:
  /usr/sbin/vxsplitlines -g newdg -c c2t5d0

Split 1.
  c2t7d0 ( c2t7d0 ), c2t8d0 ( c2t8d0 ),
The configuration from any of the disks in this split should appear to be be the same.
To see the configuration from any of the disks in this split, run:
  /etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/dmp/c2t7d0
To import the dg with the configuration from this split, run:
  /usr/sbin/vxdg -o selectcp=1045852127.33.olancha import newdg
To get more information about this particular configuration, run:
  /usr/sbin/vxsplitlines -g newdg -c c2t7d0

In this example, the disk group has four disks, and is split so that two disks appear to be on each side of the split.
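When a disk group has many disks, it can help to collect the splits and their suggested import commands programmatically. The following Python sketch parses an abridged copy of the sample output above. It assumes that the output format matches this sample (other releases may differ), and it accepts both "Pool" and "Split" as section labels because both appear in the sample. The names SPLITS_SAMPLE and parse_splits are hypothetical.

```python
import re

# Abridged copy of the sample vxsplitlines output shown in this section.
SPLITS_SAMPLE = """\
Pool 0.
  c2t5d0 ( c2t5d0 ), c2t6d0 ( c2t6d0 ),
To import the dg with the configuration from this split, run:
  /usr/sbin/vxdg -o selectcp=1045852127.32.olancha import newdg

Split 1.
  c2t7d0 ( c2t7d0 ), c2t8d0 ( c2t8d0 ),
To import the dg with the configuration from this split, run:
  /usr/sbin/vxdg -o selectcp=1045852127.33.olancha import newdg
"""


def parse_splits(text):
    """Return one dict per split: its index, its (da, dm) name pairs, and its selectcp value."""
    splits = []
    for raw in text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"(?:Pool|Split)\s+(\d+)\.", line)
        if header:
            splits.append({"index": int(header.group(1)), "disks": [], "selectcp": None})
            continue
        if not splits:
            continue  # ignore anything before the first split header
        # Lines such as "c2t5d0 ( c2t5d0 ), c2t6d0 ( c2t6d0 )," list da(dm) pairs.
        pairs = re.findall(r"([^\s(),]+)\s*\(\s*([^\s(),]+)\s*\)", line)
        if pairs and not splits[-1]["disks"]:
            splits[-1]["disks"] = pairs
        chosen = re.search(r"-o\s+selectcp=(\S+)\s+import", line)
        if chosen:
            splits[-1]["selectcp"] = chosen.group(1)
    return splits


for split in parse_splits(SPLITS_SAMPLE):
    names = ", ".join(da for da, _dm in split["disks"])
    print(f"split {split['index']}: {names} -> selectcp={split['selectcp']}")
```

The script only reports what vxsplitlines printed. Deciding which configuration copy is correct still depends on your knowledge of how the split occurred.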

You can specify the -c option to vxsplitlines to print detailed information about each of the disk IDs from the configuration copy on a disk specified by its disk access name:


vxsplitlines  -g newdg -c c2t6d0
DANAME(DMNAME)   || Actual SSB || Expected SSB
c2t5d0( c2t5d0 ) || 0.1        || 0.0 ssb ids don't match
c2t6d0( c2t6d0 ) || 0.1        || 0.1 ssb ids match
c2t7d0( c2t7d0 ) || 0.1        || 0.1 ssb ids match
c2t8d0( c2t8d0 ) || 0.1        || 0.0 ssb ids don't match

Please note that even though some disks ssb ids might match
that does not necessarily mean that those disks' config copies
have all the changes. From some other configuration copies, those
disks' ssb ids might not match. 
To see the configuration from this disk, run
/etc/vx/diag.d/vxprivutil dumpconfig /dev/vx/dmp/c2t6d0
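The per-disk table printed with the -c option can also be checked by a script. The following Python sketch reads the table rows from the sample output above and lists the disks whose actual and expected serial IDs differ. It assumes the "||" column separator and the column order shown in the sample. The names SSB_SAMPLE and parse_ssb_table are hypothetical.

```python
# Table rows copied from the sample `vxsplitlines -g newdg -c c2t6d0` output.
SSB_SAMPLE = """\
DANAME(DMNAME)                 || Actual SSB         || Expected SSB
c2t5d0( c2t5d0 ) || 0.1                        || 0.0 ssb ids don't match
c2t6d0( c2t6d0 ) || 0.1                        || 0.1 ssb ids match
c2t7d0( c2t7d0 ) || 0.1                        || 0.1 ssb ids match
c2t8d0( c2t8d0 ) || 0.1                        || 0.0 ssb ids don't match
"""


def parse_ssb_table(text):
    """Return (da_name, actual_id, expected_id, ids_match) for each table row."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("||")]
        if len(parts) != 3 or parts[0].startswith("DANAME"):
            continue  # skip the header and any line that is not a table row
        da_name = parts[0].split("(")[0].strip()
        actual = parts[1]
        expected = parts[2].split()[0]  # drop the trailing "ssb ids ..." remark
        rows.append((da_name, actual, expected, actual == expected))
    return rows


mismatched = [name for name, _a, _e, ok in parse_ssb_table(SSB_SAMPLE) if not ok]
print("disks with mismatched ssb ids:", ", ".join(mismatched))
```

As the command output itself warns, matching IDs in one configuration copy do not guarantee that the copy contains all changes, so treat the result as a diagnostic aid only.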

Based on your knowledge of how the serial split brain condition came about, you must choose one disk's configuration to be used to import the disk group. For example, the following command imports the disk group using the configuration copy that is on side 0 of the split:


/usr/sbin/vxdg -o selectcp=1045852127.32.olancha import newdg

When you have selected a preferred configuration copy, and the disk group has been imported, VxVM resets the serial IDs to 0 for the imported disks. The actual and expected serial IDs for any disks in the disk group that are not imported at this time remain unaltered.
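The effect of the reset can be sketched as follows. This is a simplified illustration and not VxVM code: serial IDs are represented as plain integers rather than in the 0.1 form shown by vxsplitlines, and the function name import_selected is hypothetical.

```python
def import_selected(actual, expected, imported):
    """Return copies of both ID maps with the IDs of imported disks reset to 0."""
    new_actual = {d: 0 if d in imported else v for d, v in actual.items()}
    new_expected = {d: 0 if d in imported else v for d, v in expected.items()}
    return new_actual, new_expected


# Hypothetical state before the import; c2t7d0 is not available at import time.
actual = {"c2t5d0": 1, "c2t6d0": 1, "c2t7d0": 1}
expected = {"c2t5d0": 0, "c2t6d0": 1, "c2t7d0": 1}

new_actual, new_expected = import_selected(actual, expected, {"c2t5d0", "c2t6d0"})
print(new_actual)
print(new_expected)
```

The IDs for the two imported disks become 0 in both maps, while the values recorded for the disk that was not imported are left as they were.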

VERITAS Software Corporation
www.veritas.com