Product: Cluster Server Guides   
Manual: Cluster Server 4.1 User's Guide   

Setting Up a Fire Drill

The Disaster Recovery Fire Drill procedure tests the fault-readiness of a configuration by mimicking a failover from the primary site to the secondary site. The procedure runs without stopping the application at the primary site or disrupting user access.

The initial steps create a fire drill service group on the secondary site. This group closely follows the configuration of the original application service group and contains a point-in-time copy of the production data in the Replicated Volume Group (RVG). Bringing the fire drill service group online on the secondary site demonstrates the ability of the application service group to fail over and come online at the secondary site, should the need arise. Fire drill service groups do not interact with outside clients or with other instances of resources, so they can safely come online even when the application service group is online. Conduct a fire drill only at the secondary site; do not bring the fire drill service group online on the node hosting the original application.


Note    You can conduct fire drills only on regular VxVM volumes; volume sets (vset) are not supported.

Configuring the Fire Drill Service Group

Use the RVG Secondary Fire Drill Wizard to set up the fire drill configuration.

The wizard performs the following specific tasks:

    Prepares all data volumes with FMR 4.0 technology, which enables space-optimized snapshots.

    Creates a Cache object to store changed blocks during the fire drill, which minimizes the disk space and disk spindles required to perform the fire drill.

    Configures a VCS service group that resembles the real application group.

    Schedules the fire drill and the notification of results.

  To run the wizard

  1. Start the RVG Secondary Fire Drill wizard on the VVR secondary site, where the service group is not online:
       # /opt/VRTSvcs/bin/fdsetup
  2. Read the information on the Welcome screen and press the Enter key.
  3. The wizard identifies the global service groups. Enter the name of the service group for the fire drill.
  4. The wizard lists the volumes in the disk group that could be used for a space-optimized snapshot. Enter the volumes to be selected for the snapshot. Typically, all volumes used by the application, whether replicated or not, should be prepared; otherwise, a snapshot might not succeed.

    Press the Enter key when prompted.

  5. Enter the cache size to store writes while the snapshot exists. The cache must be large enough to store the number of blocks expected to change during the fire drill; however, the cache is configured to grow automatically if it fills up. Then enter the disks on which to create the cache.

    Press the Enter key when prompted.

  6. The wizard starts running commands to create the fire drill setup. Press the Enter key when prompted.

    The wizard creates the application group with its associated resources. It also creates a fire drill group with resources for the application (Oracle, for example), the Mount, and the RVGSnapshot types.

    The application resources in both service groups define the same application (in this example, the same database). The wizard sets the FireDrill attribute for the application resource to 1 to prevent the agent from reporting a concurrency violation when the actual application instance and the fire drill service group are online at the same time.

Verifying a Successful Fire Drill

Bring the fire drill service group online on a node that does not have the application running. Verify that the fire drill service group comes online. This action validates that your disaster recovery solution is configured correctly and that the production service group can fail over to the secondary site in the event of an actual failure (disaster) at the primary site.

If the fire drill service group does not come online, review the VCS engine log to troubleshoot the issues so that corrective action can be taken as necessary in the production service group. You can also view the fire drill log, located at /tmp/fd-servicegroup.pid.
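
As a starting point for troubleshooting, the two logs mentioned above can be pulled up together. The following is a minimal sketch, not part of the product. The function name show_drill_logs and the variables ENGINE_LOG and FD_LOG_DIR are hypothetical. It assumes that the engine log is at /var/VRTSvcs/log/engine_A.log and that the fire drill log name expands to fd-<group>.<pid>; adjust both if your installation differs.

```shell
# Assumed default location of the VCS engine log; override if yours differs.
ENGINE_LOG="${ENGINE_LOG:-/var/VRTSvcs/log/engine_A.log}"
# Directory that holds the fire drill logs.
FD_LOG_DIR="${FD_LOG_DIR:-/tmp}"

# show_drill_logs GROUP
# Print the last 50 lines of the engine log, then list any fire drill
# logs for GROUP, newest first.
show_drill_logs() {
    group="$1"
    if [ -r "$ENGINE_LOG" ]; then
        tail -n 50 "$ENGINE_LOG"
    else
        echo "engine log not readable: $ENGINE_LOG" >&2
    fi
    # Assumption: log files are named fd-<group>.<pid>.
    ls -t "$FD_LOG_DIR"/fd-"$group".* 2>/dev/null ||
        echo "no fire drill log found for $group" >&2
}
```

For example, `show_drill_logs appsg` would show the recent engine log entries and any matching fire drill logs for a service group named appsg (a placeholder name).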


Caution    Remember to take the fire drill offline once its functioning has been validated. Failing to take the fire drill offline could cause failures in your environment. For example, if the application service group were to fail over to the node hosting the fire drill service group, there would be resource conflicts, resulting in both service groups faulting.
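
The online, verify, and offline cycle can be scripted so that the offline step is never forgotten. The following is a minimal sketch, not part of the product. The function names wait_for_state and run_fire_drill and the POLL_SECONDS variable are hypothetical. It assumes that hagrp -state with the -sys option prints a single state word such as ONLINE, and it strips a surrounding | character in case the output is wrapped that way.

```shell
# wait_for_state GROUP NODE STATE [TRIES]
# Poll hagrp until GROUP reports STATE on NODE, or give up after TRIES polls.
wait_for_state() {
    group="$1"; node="$2"; want="$3"; tries="${4:-30}"
    while [ "$tries" -gt 0 ]; do
        state=$(hagrp -state "$group" -sys "$node")
        # Tolerate output wrapped as |STATE| (an assumption, see lead-in).
        state=${state#|}
        state=${state%|}
        [ "$state" = "$want" ] && return 0
        tries=$((tries - 1))
        sleep "${POLL_SECONDS:-10}"
    done
    return 1
}

# run_fire_drill GROUP NODE
# Bring the fire drill group online on NODE, check that it came online,
# then take it offline again whether or not the check passed.
run_fire_drill() {
    group="$1"; node="$2"
    hagrp -online "$group" -sys "$node" || return 1
    if wait_for_state "$group" "$node" ONLINE; then
        echo "fire drill $group is ONLINE on $node"
        result=0
    else
        echo "fire drill $group did not come online on $node" >&2
        result=1
    fi
    # Always take the drill offline so it cannot collide with a real failover.
    hagrp -offline "$group" -sys "$node"
    return "$result"
}
```

A call such as `run_fire_drill appsg_fd node2` uses placeholder names; substitute your fire drill service group and a secondary-site node that is not running the application.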

Scheduling a Fire Drill

Schedule the fire drill for the service group by adding the file /opt/VRTSvcs/bin/fdsched to your crontab. You can make fire drills highly available by adding the file to every node in the cluster.

The scheduler runs the command hagrp -online firedrill_group -any at periodic intervals.
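
A crontab entry for the scheduler might be built as follows. This is a sketch under stated assumptions: the helper name fdsched_cron_entry is hypothetical, the weekly schedule in the usage example is only an illustration, and fdsched is assumed to run without arguments, as the text above implies.

```shell
FDSCHED=/opt/VRTSvcs/bin/fdsched

# fdsched_cron_entry "MIN HOUR DOM MON DOW"
# Print a crontab line that runs fdsched on the given schedule.
# Fails if the schedule is not exactly five fields.
fdsched_cron_entry() {
    set -f          # keep * in the schedule from expanding to file names
    set -- $1       # split the schedule into its fields
    set +f
    if [ "$#" -ne 5 ]; then
        echo "schedule must have five fields: min hour dom mon dow" >&2
        return 1
    fi
    printf '%s %s\n' "$*" "$FDSCHED"
}
```

For example, `fdsched_cron_entry '0 2 * * 0'` prints a line that runs the fire drill at 02:00 every Sunday. Add the printed line with `crontab -e` on each node where the drill should be schedulable.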

VERITAS Software Corporation
www.veritas.com