Product: Cluster Server Guides   
Manual: Cluster Server 4.1 Agent Developer's Guide   

State Transition Diagrams

This chapter illustrates the state transitions within the agent framework. State transition diagrams are shown separately for the following behaviors:

  • Opening a resource
  • Resource in a steady state
  • Onlining a resource
  • Offlining a resource
  • Resource fault (without automatic restart)
  • Resource fault (with automatic restart)
  • Monitoring of persistent resources
  • Closing a resource

In addition, state transitions are shown for the handling of resources with respect to the ManageFaults service group attribute. By default, ManageFaults is set to NONE, in which case the clean entry point is not called by VCS. See ManageFaults. The diagrams cover the following conditions:

  • Onlining a resource when the ManageFaults attribute is set to NONE
  • Offlining a resource when the ManageFaults attribute is set to NONE
  • Resource fault when ManageFaults attribute is set to ALL
  • Resource fault (unexpected offline) when ManageFaults attribute is set to NONE
  • Resource fault (monitor is hung) when ManageFaults attribute is set to ALL
  • Resource fault (monitor is hung) when ManageFaults attribute is set to NONE

The states shown in these diagrams are associated with each resource by the agent framework. These states are used only within the agent framework and are independent of the IState resource attribute values indicated by the engine.

The VCS agent writes resource state transition information into the agent log file when the LogDbg parameter, a static resource type attribute, is set to the value DBG_AGINFO. Agent developers can make use of this information when debugging agents.

When the agent starts up, each resource starts with the initial state of Detached. In the Detached state (Enabled=0), the agent rejects all commands to online or offline the resource.

When the resource is enabled (Enabled=1), the open entry point is invoked. Periodic Monitoring begins after open times out or succeeds. Depending on the return value of monitor, the resource transitions to either the Online or the Offline state. In the unlikely event that monitor times out or returns unknown, the resource stays in a Probing state.
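
The opening sequence above can be sketched as a small transition function. This is an illustrative model only, not the agent framework's API: the `State` enum, the function names, and the string labels for monitor results are all hypothetical, and the assumption that an enabled resource sits in Probing until monitor gives a definite answer is inferred from the text.

```python
from enum import Enum


class State(Enum):
    DETACHED = "Detached"
    PROBING = "Probing"
    ONLINE = "Online"
    OFFLINE = "Offline"


def enable(state: State) -> State:
    """Enabled=1 on a Detached resource: open is invoked, then monitoring begins.

    Assumption: the resource is treated as Probing until monitor reports
    a definite status.
    """
    return State.PROBING if state is State.DETACHED else state


def state_after_monitor(monitor_result: str) -> State:
    """Map the monitor result that follows open to the next state.

    monitor_result is a hypothetical label: "online", "offline",
    "unknown", or "timeout".
    """
    if monitor_result == "online":
        return State.ONLINE
    if monitor_result == "offline":
        return State.OFFLINE
    # Unknown or timed out: the resource stays in Probing.
    return State.PROBING
```

For example, `state_after_monitor("timeout")` leaves the resource in `State.PROBING`, matching the "unlikely event" described above.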

When resources are in a steady state of Online or Offline, they are monitored at regular intervals. The intervals are specified by the MonitorInterval parameter for a resource in the Online state and the OfflineMonitorInterval parameter for a resource in the Offline state. An Online resource that is unexpectedly detected as Offline is considered to be faulted. Refer to diagrams describing faulted resources.
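
As a rough illustration of the steady-state behavior, the sketch below picks the monitoring interval from the resource state and flags an unexpected offline report. The function names and string labels are hypothetical, and the interval values in the usage note are arbitrary examples rather than documented defaults.

```python
def next_monitor_interval(state: str, monitor_interval: int,
                          offline_monitor_interval: int) -> int:
    """Return the interval to wait before the next monitor call."""
    if state == "Online":
        return monitor_interval
    if state == "Offline":
        return offline_monitor_interval
    raise ValueError(f"not a steady state: {state}")


def is_unexpected_offline(state: str, monitor_result: str) -> bool:
    """True when an Online resource is reported offline by monitor."""
    return state == "Online" and monitor_result == "offline"
```

With example values, `next_monitor_interval("Offline", 60, 300)` selects the OfflineMonitorInterval value, and `is_unexpected_offline("Online", "offline")` signals that fault handling applies.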

When the agent receives a request from the VCS engine to online the resource, the resource enters the Going Online state, where the online entry point is invoked. If online completes successfully, the resource enters the Going Online Waiting state where it waits for the next monitor cycle.

If monitor returns a status of online, the resource moves to the Online state.

If, however, the monitor times out, or returns a status of "not Online" (that is, unknown or offline), the agent returns the resource to the Going Online Waiting state and waits for the next monitor cycle.

When the OnlineWaitLimit is reached, the clean entry point is invoked.

  • If clean times out or fails, the resource again returns to the Going Online Waiting state and waits for the next monitor cycle. The agent again invokes the clean entry point if the monitor reports a status of "not Online."
  • If clean succeeds and the OnlineRetryLimit has been reached, and the subsequent monitor reports a status of offline, the resource transitions to the Offline state and is marked FAULTED.
  • If clean succeeds and the OnlineRetryLimit has not been reached, the resource transitions to the Going Online state, where the online entry point is retried.
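
The onlining transitions described above can be summarized as two decision functions, one for the monitor cycle and one for clean. This is a minimal sketch under stated assumptions: all names are hypothetical, and the limits are passed in as booleans because the text does not spell out exactly how the agent counts toward OnlineWaitLimit and OnlineRetryLimit.

```python
ONLINE = "Online"
GOING_ONLINE = "Going Online"
GOING_ONLINE_WAITING = "Going Online Waiting"
INVOKE_CLEAN = "invoke clean"
# Hypothetical label: wait for monitor; Offline and FAULTED if it reports offline.
AWAIT_FAULT_CONFIRMATION = "await monitor, then Offline (FAULTED) if offline"


def after_monitor(result: str, online_wait_limit_reached: bool) -> str:
    """Next step after a monitor cycle in Going Online Waiting."""
    if result == "online":
        return ONLINE
    # Timed out, unknown, or offline all count as "not Online".
    if online_wait_limit_reached:
        return INVOKE_CLEAN
    return GOING_ONLINE_WAITING


def after_clean(clean_succeeded: bool, online_retry_limit_reached: bool) -> str:
    """Next step after the clean entry point returns or times out."""
    if not clean_succeeded:
        return GOING_ONLINE_WAITING   # clean failed or timed out
    if online_retry_limit_reached:
        return AWAIT_FAULT_CONFIRMATION
    return GOING_ONLINE               # the online entry point is retried
```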

Upon receiving a request from the VCS engine to offline a resource, the agent places the resource in a Going Offline state and invokes the offline entry point.

If offline succeeds, the resource enters the Going Offline Waiting state where it waits for the next monitor.

If monitor returns a status of offline, the resource is marked Offline.

If the monitor times out or returns a status of "not offline," the agent invokes the clean entry point. Also, if the offline entry point times out while the resource is in the Going Offline state, the agent invokes the clean entry point.

  • If clean fails or times out, the resource is placed in the Going Offline Waiting state and monitored. If monitor reports "not offline," the agent invokes the clean entry point, where the sequence of events repeats.
  • If clean returns success, the resource is placed in the Going Offline Waiting state and monitored. If monitor times out or reports "not offline," the resource returns to the Going Offline Waiting state. The UNABLE_TO_OFFLINE flag is sent to the engine.
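
A compact way to read the offlining rules above is as a function of the monitor result and of whether clean has already succeeded. The sketch below is a hypothetical model, not framework code: the names are invented for illustration, and the case of the offline entry point failing outright is left out because the text does not describe it.

```python
from typing import Optional, Tuple

OFFLINE = "Offline"
GOING_OFFLINE_WAITING = "Going Offline Waiting"
INVOKE_CLEAN = "invoke clean"
UNABLE_TO_OFFLINE = "UNABLE_TO_OFFLINE"


def after_offline_entry_point(timed_out: bool) -> str:
    """offline timed out: clean is invoked; offline succeeded: wait for monitor."""
    return INVOKE_CLEAN if timed_out else GOING_OFFLINE_WAITING


def after_monitor(result: str,
                  clean_already_succeeded: bool) -> Tuple[str, Optional[str]]:
    """Return (next step, flag sent to the engine) after a monitor cycle."""
    if result == "offline":
        return OFFLINE, None
    if clean_already_succeeded:
        # Clean worked, yet the resource is still not offline.
        return GOING_OFFLINE_WAITING, UNABLE_TO_OFFLINE
    # No successful clean so far: clean is invoked (or invoked again).
    return INVOKE_CLEAN, None
```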

This diagram describes the activity that occurs when a resource faults and the RestartLimit is reached. When the monitor entry point times out successively and FaultOnMonitorTimeout is reached, or monitor returns offline and the ToleranceLimit is reached, the agent invokes the clean entry point.

Note: For VCS on Windows, the FaultOnMonitorTimeout attribute has no significance. Instead, the monitor entry point is allowed to continue to completion.

If clean fails, or if it times out, the agent places the resource in the Online state as if no fault occurred.

If clean succeeds, the resource is placed in the Going Offline Waiting state, where the agent waits for the next monitor.

  • If monitor reports Online, the resource is placed back Online as if no fault occurred. If monitor reports Offline, the resource is placed in an Offline state and marked as FAULTED.
  • If monitor reports unknown or times out, the agent places the resource back into the Going Offline Waiting state, and sets the UNABLE_TO_OFFLINE flag in the engine.
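
The fault path without automatic restart can be modeled in the same style. The following is a sketch only, with hypothetical names and labels; it covers what happens after clean is invoked and after the monitor cycle that follows a successful clean.

```python
from typing import Optional, Tuple

ONLINE = "Online"
OFFLINE = "Offline"
GOING_OFFLINE_WAITING = "Going Offline Waiting"
UNABLE_TO_OFFLINE = "UNABLE_TO_OFFLINE"


def after_clean(clean_succeeded: bool) -> str:
    """Clean failed or timed out: back to Online, as if no fault occurred."""
    return GOING_OFFLINE_WAITING if clean_succeeded else ONLINE


def after_monitor(result: str) -> Tuple[str, bool, Optional[str]]:
    """Return (next state, marked FAULTED, flag set in the engine)."""
    if result == "online":
        return ONLINE, False, None
    if result == "offline":
        return OFFLINE, True, None
    # Unknown or timed out: keep waiting and raise the flag.
    return GOING_OFFLINE_WAITING, False, UNABLE_TO_OFFLINE
```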

This diagram describes the activity that occurs when a resource faults and the RestartLimit is not reached. When the monitor entry point times out successively and FaultOnMonitorTimeout is reached, or monitor returns offline and the ToleranceLimit is reached, the agent invokes the clean entry point.

Note: For VCS on Windows, the FaultOnMonitorTimeout attribute has no significance. Instead, the monitor entry point is allowed to continue to completion.

  • If clean succeeds, the resource is placed in the Going Online state and the online entry point is invoked to restart the resource; refer to the diagram, "Onlining a resource."
  • If clean fails or times out, the agent places the resource in the Online state as if no fault occurred.

Refer to the diagram "Resource fault without automatic restart," for a discussion of activity when a resource faults and the RestartLimit is reached.
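
The difference between the two fault diagrams comes down to one branch after clean succeeds. A minimal sketch, assuming the RestartLimit check is available as a boolean (the function name and the state labels as strings are hypothetical):

```python
def after_clean_on_fault(clean_succeeded: bool,
                         restart_limit_reached: bool) -> str:
    """Next state once clean returns or times out after a detected fault."""
    if not clean_succeeded:
        # Clean failed or timed out: treated as if no fault occurred.
        return "Online"
    if restart_limit_reached:
        # No automatic restart: wait for monitor to confirm the fault.
        return "Going Offline Waiting"
    # Automatic restart: the online entry point is invoked again.
    return "Going Online"
```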

For a persistent resource in the Online state, if monitor returns a status of offline and the ToleranceLimit is not reached, the resource stays in an Online state. If monitor returns offline and the ToleranceLimit is reached, the resource is placed in an Offline state and noted as FAULTED. If monitor returns "not offline," the resource stays in an Online state.

Likewise, for a persistent resource in an Offline state, if monitor returns offline, the resource remains in an Offline state. If monitor returns a status of online, the resource is placed in an Online state.
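
The persistent-resource rules above fit in a single function. This sketch uses hypothetical names; it also assumes that a persistent resource in the Offline state stays Offline when monitor reports anything other than online, since the text only describes the online and offline results for that state.

```python
from typing import Tuple


def monitor_persistent(state: str, result: str,
                       tolerance_limit_reached: bool) -> Tuple[str, bool]:
    """Return (next state, newly marked FAULTED) for a persistent resource."""
    if state == "Online":
        if result == "offline" and tolerance_limit_reached:
            return "Offline", True
        # Offline within tolerance, or any "not offline" result.
        return "Online", False
    if state == "Offline":
        if result == "online":
            return "Online", False
        # Assumption: other results leave the resource Offline.
        return "Offline", False
    raise ValueError(f"unexpected state: {state}")
```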

When the resource is disabled (Enabled=0), the agent stops Periodic Monitoring and the close entry point is invoked. When the close entry point succeeds or times out, the resource is placed in the Detached state.

VERITAS Software Corporation
www.veritas.com