Product: Cluster Server Guides   
Manual: Cluster Server 4.1 User's Guide   

Preonline IP Check

You can enable a preonline check of a failover IP address to protect against network partitioning. The check pings a service group's configured IP address to verify that it is not already in use. If the address is in use, the service group is not brought online. A second check verifies that the system is connected to its public and private networks. If the system receives no response from a broadcast ping to the public network and none from a check of the private networks, it concludes that it is isolated and does not bring the service group online.
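
The decision described above can be sketched as a small shell function. This is only an illustration of the logic, not the contents of the shipped trigger script: the function name preonline_decision and its yes/no arguments are hypothetical, and the probes themselves (the ping of the failover address, the broadcast ping, and the private network check) are assumed to have been performed elsewhere.

```shell
#!/bin/sh
# Hypothetical helper illustrating the two preonline checks.
# It does not probe the network; each argument is the "yes" or "no"
# result of a probe that is assumed to have been run already.
preonline_decision() {
    ip_answers=$1     # did the failover IP address answer a ping?
    public_reply=$2   # did the public-network broadcast ping get a reply?
    private_ok=$3     # did the private-network check succeed?

    # First check: the address must not already be in use.
    if [ "$ip_answers" = yes ]; then
        echo "refuse: address already in use"
        return 1
    fi

    # Second check: no response on the public network and none on the
    # private networks means this system is treated as isolated.
    if [ "$public_reply" = no ] && [ "$private_ok" = no ]; then
        echo "refuse: system appears isolated"
        return 1
    fi

    echo "allow"
    return 0
}
```

For example, preonline_decision no yes yes prints "allow", while preonline_decision yes yes yes refuses because the address already responds.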

  To enable the preonline IP check

  1. Copy the preonline trigger script from the sample triggers directory into the triggers directory:
    cp /opt/VRTSvcs/bin/sample_triggers/preonline_ipc \
      /opt/VRTSvcs/bin/triggers/preonline
  2. Change the file permissions to make it executable.
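
The two steps above can be combined in a short script. The sketch below is one possible way to do it, not a VERITAS-supplied tool: the function name install_preonline_ipc and the VCS_BIN variable are assumptions introduced here, and VCS_BIN defaults to the directory used in step 1. Note that cp overwrites any preonline trigger already present in the triggers directory.

```shell
#!/bin/sh
# Hypothetical wrapper around steps 1 and 2 above.
# VCS_BIN is an assumed variable; it defaults to the path used in step 1
# and can be overridden to try the function outside a cluster node.
VCS_BIN="${VCS_BIN:-/opt/VRTSvcs/bin}"

install_preonline_ipc() {
    src="$VCS_BIN/sample_triggers/preonline_ipc"
    dst="$VCS_BIN/triggers/preonline"

    # Stop early if the sample trigger is not where step 1 expects it.
    if [ ! -f "$src" ]; then
        echo "missing: $src" >&2
        return 1
    fi

    # Step 1: copy the sample trigger to the name used in the triggers directory.
    cp "$src" "$dst" || return 1

    # Step 2: make the copied trigger executable.
    chmod +x "$dst" || return 1

    echo "installed: $dst"
}
```

Calling install_preonline_ipc as a user with write access to the triggers directory performs both steps and reports the installed path.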

Network Partitions and the UNIX Boot Monitor

Most UNIX systems provide a console-abort sequence that enables you to halt and continue the processor. Continuing operations after the processor has stopped may corrupt data and is therefore unsupported by VCS. Specifically, when a system is halted with the abort sequence, it stops producing heartbeats. The other systems in the cluster then consider the system failed and take over its services. If the system is later resumed with another console sequence, it continues writing to shared storage as before, even though its applications have been restarted on other systems where available.

If a write operation was pending when the console-abort sequence was processed, the write occurs when processing resumes. Halting a system by this method appears to all other nodes as a complete system fault because all heartbeats disappear simultaneously. Another node takes over services for the missing node. After the halted system resumes, several seconds pass before the return of its formerly missing heartbeat causes a system panic. During this time, the write that was pending on the halted node completes, leading to data corruption.

VERITAS recommends rebooting the system if it was halted using the console-abort sequence.

VERITAS Software Corporation
www.veritas.com