Oracle® Data Guard Broker 10g Release 1 (10.1) Part Number B10822-01
This chapter describes how the broker manages databases during switchover and failover.
When the primary database (or all instances of the RAC primary database) fails, such as when a system or software failure occurs, you may need to transition one of its corresponding standby databases to take over the primary role by performing a failover. Even in the absence of such a failure, you may have reason (for example, planned hardware maintenance) to perform a switchover to direct one of the standby databases to assume the role of being the primary database, while the former primary database assumes the role of being a standby database.
Without the broker, failover and switchover are manual processes that can be automated only by using script-based solutions. For example, if a physical standby database is in read-only mode (log apply services are offline) when a failure occurs on the primary database, you must change the standby database to Redo Apply mode, apply archived redo log files that have not yet been applied to the standby database, and fail over the standby database to the primary role.
The broker simplifies switchover and failover by letting you invoke either one through a single command; it then coordinates the role transitions on all databases in the configuration.
You can switch a database from the primary role to the standby role, as well as from standby to primary, without resetting the online redo log files of the associated new primary database. This is known as a database switchover, because the standby database that you specify becomes the primary database, and the original primary database becomes a standby database. There is no loss of application data, and the data is consistent between the original and the new primary database after the switchover completes.
Whenever possible, you should switch over to a physical standby database:
If the switchover transitions a physical standby database to the primary role, then the original primary database will be switched to a physical standby role. The online redo log files are continuously archived from the new primary database to all standby databases in the configuration.
If the switchover transitions a logical standby database to the primary role, then the original primary database will be switched to a logical standby role. If there are physical standby databases in the configuration not involved in the switchover, they will not be able to serve as standby databases to the new primary database, because a new incompatible online redo log stream has started.
Warning: Switching over to a logical standby database results in the physical standby databases in the broker configuration being disabled by the broker. These are no longer viable as standby databases. Section 4.2.5 describes how to restore their viability as standby databases.
If the switchover transitions a physical standby database to the primary role, then both the primary database and the target standby database will be restarted after the switchover completes.
If the switchover transitions a logical standby database to the primary role, neither the primary database nor the logical standby database needs to be restarted after the switchover completes.
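The rules above can be condensed into a small lookup. The following is only an illustrative sketch, not broker code: the `SwitchoverOutcome` type, the `switchover_outcome` function, and the dictionary input shape are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchoverOutcome:
    old_primary_new_role: str     # role the original primary takes on
    restart_required: bool        # whether primary and target are restarted
    disabled_bystanders: tuple    # standbys the broker disables


def switchover_outcome(target_type, bystanders):
    """Summarize the effects of a switchover, per the rules listed above.

    target_type: "physical" or "logical" (type of the standby taking over).
    bystanders:  dict mapping standby name -> type, for standbys NOT
                 involved in the switchover (hypothetical input shape).
    """
    if target_type == "physical":
        # Old primary becomes a physical standby; both databases restart.
        return SwitchoverOutcome("physical standby", True, ())
    if target_type == "logical":
        # No restart, but uninvolved physical standbys are disabled.
        disabled = tuple(sorted(
            name for name, kind in bystanders.items() if kind == "physical"
        ))
        return SwitchoverOutcome("logical standby", False, disabled)
    raise ValueError(f"unknown standby type: {target_type!r}")
```

For a configuration with two uninvolved physical standbys and one logical standby, a switchover to a logical target would report both physical standbys as disabled, which is the main reason to prefer a physical target.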
Consider the following points before you begin a switchover:
When you start a switchover, the broker verifies that at least one standby database (including the primary database that is about to be transitioned to the standby role) is configured to support the overall protection mode (maximum protection, maximum availability, or maximum performance).
You should prepare the primary database in advance for its possible future role as a standby database in the context of the overall protection mode (see Section 3.6). The preparation includes:
Setting up standby redo log files on the primary database if you intend to use SYNC or ASYNC log transport mode to the database after the switchover.
Presetting log transport services related properties, such as LogXptMode, NetTimeout, StandbyArchiveLocation, and AlternateLocation. For more details about managing log transport services using configurable properties, see Section 3.4.
Presetting log apply services related properties, such as RealTimeApply and ApplyParallel. For more details about managing log apply services using configurable properties, see Section 3.5.
Note that the broker does not use these properties to set up log transport and log apply services until you actually switch over the primary database to the standby role. Thus, the validity of the values of these properties is not verified until after the switchover. Once you set these properties, their values persist through role changes during switchover and failover.
After a switchover completes, the overall Data Guard protection mode (maximum protection, maximum availability, or maximum performance) remains at the same protection level it was before the switchover. Also, the log transport mode (SYNC, ASYNC, or ARCH) of other standby databases not involved in the switchover does not change after a switchover. Log apply services for all other standby databases not involved in the switchover automatically begin applying archived redo log files from the new primary database.
If there are both logical and physical standby databases in the configuration and the switchover occurs to a logical standby database, you need to re-create all physical standby databases, as described in Section 4.2.5.
The act of switching roles should be a well-planned activity. The primary and standby databases involved in the switchover should have as small a transactional lag as possible. Oracle highly recommends that you consider performing a full, consistent backup of the primary database before starting the switchover. (Oracle Data Guard Concepts and Administration provides detailed information about setting up the databases in preparation for a switchover.)
To start a switchover using the Data Guard GUI, select the standby database that you want to change to the primary role and click Switchover. When using the CLI, you need to issue only one SWITCHOVER command to specify the name of the standby database that you want to change into the primary role.
The broker controls the rest of the switchover, as described in Section 4.1.3.
Once you start the switchover, the broker:
Verifies that the primary and the target standby databases are in the following states:
The primary database must be enabled and in the ONLINE state.
The participating standby database must be enabled and in the ONLINE state.
The broker allows the switchover to proceed as long as there are no errors for the primary database and the standby database that you selected to participate in the switchover operation. Errors occurring for any other standby databases not involved in the switchover will not impede the switchover.
Shuts down all instances except one.
If the primary database is a RAC database, the broker will keep only one instance running and shut down all other instances before it continues the switchover. If the standby database you want to switch to the primary role is a RAC database, the broker will shut down all instances except the apply instance before it continues the switchover. If those other instances cannot be shut down, the switchover fails. In this case, you must manually shut down those other instances and issue the switchover command again. It is also important that you do not start any new instances during the switchover.
Switches roles between the primary and standby databases.
The broker first converts the original primary database to run in the standby role. Then, the broker transitions the target standby database to the primary role. If any errors occur during either conversion, the broker stops the switchover. See Section 9.4 for more information.
Updates the broker configuration file to record the change in roles.
Because the configuration file profiles all database objects in the configuration, this ensures that each database will run in the correct role should it be restarted later for any reason.
Restarts the new standby database if the switchover operation occurs with a physical standby database, and log apply services begin applying archived redo log files transmitted from the new primary database. If this is a RAC database, the broker restarts the instances that it shut down prior to the switchover.
Restarts the new primary database if it was a physical standby database, opens it in read/write mode, and starts log transport services transmitting redo data to the archived redo log files for the standby databases, including to the former primary database. If the switchover occurs to a logical standby database, there is no need to restart any databases. If this is a RAC database and a restart was necessary, the broker restarts the instances that it shut down prior to the switchover.
The broker verifies the state and status of the databases to ensure that the switchover transitioned them to their new roles correctly. Standby databases not involved in the switchover and not disabled by the broker after the switchover will continue operating in the state they were in before the switchover. For example, if a physical standby database was in read-only mode, it will remain in that mode after switchover completes. Log apply services for all other standby databases not involved in the switchover automatically begin applying archived redo log files from the new primary database.
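The first step in the sequence above, the check on the two participating databases, can be sketched as follows. This is a simplified model rather than the broker's implementation: the `Database` record, its `has_errors` flag, and the `switchover_blockers` function are hypothetical, and the database names used afterward are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    enabled: bool
    state: str                # "ONLINE" is the state the broker requires
    has_errors: bool = False  # hypothetical flag for a reported error


def switchover_blockers(primary, target):
    """Return a list of reasons the switchover cannot start (empty = OK).

    Only the primary and the chosen target are inspected; problems on
    other standby databases do not impede the switchover.
    """
    blockers = []
    for role, db in (("primary", primary), ("target standby", target)):
        if not db.enabled:
            blockers.append(f"{role} {db.name} is not enabled")
        if db.state != "ONLINE":
            blockers.append(f"{role} {db.name} is not in the ONLINE state")
        if db.has_errors:
            blockers.append(f"{role} {db.name} reports errors")
    return blockers
```

Called with an enabled, ONLINE primary named `North_Sales` and a disabled target named `DR_Sales`, the function returns the reasons that apply to the target only. Standbys that are not taking part are deliberately not passed in.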
Database failover transitions one of the standby databases to the role of primary database. You should perform a failover only when a catastrophic failure occurs on the primary database, and there is no possibility of recovering the primary database in a timely manner. The failed primary database is discarded, and the target standby database assumes the primary role.
The broker supports two methods of failover:
Complete failover (FAILOVER TO database-name;)
This is the recommended and default failover option. It automatically recovers the maximum amount of application data for the protection mode of the original primary database, and it attempts to bring along any standby databases not involved in the failover so that they continue serving as standby databases to the new primary database:
After failover to a physical standby database, the original primary database must be re-created to act as a standby database for the new primary database. In addition, some standby databases may be disabled by the broker during the failover if the broker detects that they have applied log sequences beyond the new primary database. Any database that was disabled by the broker must be re-created using the steps described in Section 4.2.5.
After failover to a logical standby database, the original primary database and any physical standby databases in the configuration must be re-created to act as a standby database for the new primary database. Additionally, if there is a gap in the log sequence, the logical standby databases not involved in the failover cannot finish applying all of the redo data that the target logical standby database applied before the failover. This results in the logical standby database being disabled. Any database that was disabled by the broker must be re-created using the steps described in Section 4.2.5.
Immediate failover (FAILOVER TO database-name IMMEDIATE;)
Caution: Do not perform an immediate failover to a standby database except in an emergency.
A consequence of the immediate failover is that you must re-create the original primary database and all other standby databases not involved in the failover before they can serve as standby databases to the new primary database. Section 4.2.5 describes how to do this. Another consequence is that there may be lost application data.
Depending on the destination attributes of log transport services, the result of a complete failover may provide no data loss or minimal data loss. An immediate failover may result in data loss. Always try to perform a complete failover. Only when a complete failover is unsuccessful should you perform an immediate failover.
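The recommendation to try a complete failover first, and to use an immediate failover only as a last resort, can be expressed as a small wrapper. This is a sketch under assumptions: `run_command` is a hypothetical callable that submits one command to the CLI and returns True on success, and the command strings simply follow the two forms shown above without addressing any quoting rules for database names.

```python
def failover_commands(database_name):
    """Build the two CLI command strings in the form shown above."""
    complete = f"FAILOVER TO {database_name};"
    immediate = f"FAILOVER TO {database_name} IMMEDIATE;"
    return complete, immediate


def fail_over(database_name, run_command, allow_immediate=False):
    """Try a complete failover first; fall back to immediate only on request.

    run_command is a hypothetical callable that submits one command to the
    CLI and returns True on success. Returns "complete", "immediate", or
    "failed".
    """
    complete, immediate = failover_commands(database_name)
    if run_command(complete):
        return "complete"
    # Immediate failover may lose data, so it must be explicitly allowed.
    if allow_immediate and run_command(immediate):
        return "immediate"
    return "failed"
```

Making `allow_immediate` default to False reflects the caution above: the data-losing path is never taken unless the caller explicitly opts in.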
Note: After a failover (complete or immediate), the overall Data Guard protection mode is always reset to the maximum performance mode. The log transport mode (SYNC, ASYNC, or ARCH) of the other standby databases not involved in the failover does not change. You can subsequently upgrade the protection mode as described in Section 3.6.1.
If the standby database you want to fail over to the primary role is a RAC database, the broker will shut down all instances except the apply instance before it continues the failover. If the broker cannot shut down the instances, the failover fails. In this case, you must manually shut down all instances except the apply instance and issue the FAILOVER command again. It is also important that you do not start any new instances during the failover. The broker restarts instances that it shut down prior to the failover.
There are many considerations when selecting a standby database to be the next primary database in a failover. Each production site might have unique requirements. This section discusses various considerations of which you need to be aware. Data Guard broker does not enforce a particular choice of standby database as the next primary database in the failover. You will need to make the decision and should be able to get the information you need from Data Guard broker to assist with this decision.
If you have a configuration that only has one type of standby database (either all physical or all logical standby databases), your choice of the target standby database for the failover should be decided based on the following criteria:
Which standby database has the most data archived to it. Failover to this standby database will result in the least amount of (or no) data loss.
Which standby database has the least amount of online redo log files to be applied before it can become the primary. (You can check this with the CLI by connecting to one of the standby databases in the configuration and reading the value of the RecvQEntries property on different standby databases. Using the Data Guard GUI, you can view the value of the LastAppliedLog column for each standby database in the Standby Databases section of the overview page.) This affects the downtime of your primary database: the longer it takes to fail over to the target standby, the longer the downtime.
Logistically, which standby database is a more desirable database to assume the role of primary.
If you have a configuration that has both physical and logical standby databases, consider choosing one of your physical standby databases as the failover target. Failing over to a logical standby database has the side effect of invalidating all your physical standby databases. Therefore, it is generally better to choose a physical standby database as the failover target. This avoids the need to re-create the other physical standby databases not involved in the failover, which would be required after a failover to a logical standby database.
However, there may be exceptions to this rule when one of the previous criteria is an important consideration for your failover. For example, if your physical standby database is lagging behind your logical standby database (the physical standby database was in READ-ONLY state) and your business requires minimum downtime, you may want to consider selecting your logical standby database as the failover target.
Once you decide which standby database is your target, you can use Data Guard broker to execute the failover.
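Because the broker does not choose the target for you, some sites encode their own policy. The sketch below shows one possible ranking based on the criteria above; it is an assumption-laden illustration, not a recommendation. The `Candidate` fields `last_received_seq` and `unapplied_logs` are hypothetical values that you would have to gather yourself (for example, from the properties mentioned above), and the ordering of the criteria is a choice, not something the broker mandates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    kind: str                # "physical" or "logical"
    last_received_seq: int   # hypothetical: highest log sequence received
    unapplied_logs: int      # hypothetical: received logs not yet applied


def pick_failover_target(candidates, prefer_physical=True):
    """Rank candidates: most data received, then fewest logs left to apply.

    With prefer_physical=True, logical standbys are considered only when no
    physical standby is available, which avoids invalidating the physical
    standbys. Pass False to rank purely on data loss and downtime.
    """
    pool = list(candidates)
    if not pool:
        raise ValueError("no standby databases to choose from")
    if prefer_physical:
        physical = [c for c in pool if c.kind == "physical"]
        pool = physical or pool
    # Smaller key wins: higher received sequence first, then fewer unapplied.
    return min(pool, key=lambda c: (-c.last_received_seq, c.unapplied_logs))
```

Setting `prefer_physical=False` models the exception described above, where a logical standby that is further ahead is chosen to minimize downtime despite the cost of re-creating the physical standbys.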
To start a failover using the Data Guard GUI, select the standby database that you want to change to the primary role and click Failover. When using the CLI, you issue a FAILOVER command that specifies the name of the standby database that you want to change into the primary role.
After the failover, the overall protection mode of the new configuration (maximum protection, maximum availability, or maximum performance) is reset to the maximum performance mode.
The broker controls the failover steps described in Section 4.2.3. However, the failover results in eliminating the previous primary database as an active participant in the configuration. Depending on whether or not there are other valid standby databases in the configuration, you may need to perform the additional recovery steps described in Section 4.2.5 to maintain a viable disaster recovery solution in the event of another disaster.
After determining that there is no possibility of recovering the primary database in a timely manner, ensure that the primary database is shut down and then begin the failover operation.
Once you start a complete failover, the broker:
Checks to see if the primary database is still available and, if so, issues a warning message asking whether you want to continue with the failover operation. If you choose to continue, the primary database is shut down.
Verifies that the target standby database is enabled. If the database is not enabled, you will not be able to perform a failover to this database. If the target is a RAC standby database, the broker shuts down all instances except the apply instance.
Waits for the target standby database to finish applying any remaining archived redo log files before stopping log apply services on it.
Updates the broker configuration file to record the change in role.
If a standby database not involved in the failover is not disabled by the broker during this failover, it will remain in the state it was in before the failover. For example, if a physical standby database was operating in read-only mode, it will remain in read-only mode.
Note: Standby databases not involved in the failover may be disabled by the broker during a failover, and they must be re-created in the configuration before they can serve as standby databases to the new primary database. A failover to a logical standby database will result in all physical standby databases being disabled by the broker, and it may result in all logical standby databases being disabled by the broker. See Section 4.2.5 for additional information.
Transitions the target standby database into the primary role, opens the new primary database in read/write mode, determines whether or not any standby databases that did not participate in the failover operation have applied log sequences beyond the new primary database and thus need to be re-created, and starts log transport services to begin transmitting redo data to all standby databases not involved in the failover and not required to be re-created.
If the target is a RAC standby database, the broker restarts instances that it shut down prior to the failover.
The broker allows the failover to proceed as long as there are no errors for the standby database that you selected to participate in the failover. Errors occurring for any standby databases not involved in the failover will not stop the failover. If you initiated a complete failover and it fails, you might need to restart it as an immediate failover.
After determining that there is no possibility of recovering the primary database in a timely manner, ensure that the primary database is shut down and then begin the failover operation.
Once you start an immediate failover, the broker:
Verifies that the target standby database is enabled. If the standby database is not enabled for management by the broker, then the failover cannot occur.
Stops log apply services on the standby database immediately, without waiting for log apply services to finish applying the available archived redo log files. Note that this may result in some data loss.
Updates the broker configuration file to record the change in role.
Transitions the target standby database into the primary role, opens the new primary database in read/write mode, and starts log transport services.
Because an immediate failover starts a new incompatible redo branch from the new primary database, all the standby databases in the configuration are disabled by the broker and must be re-created. See Section 4.2.5.
The broker allows the failover to proceed as long as there are no errors for the standby database that you selected to participate in the failover.
You must perform recovery steps after the failover completes:
After a failover operation completes, the original, failed primary database is disabled by the broker until such time as the database can be re-created as a standby to the new primary database.
After a complete failover finishes, any of the standby databases not involved in the failover that are determined to be unviable as a standby for the new primary database will be disabled by the broker.
For instance, this could happen if a standby database not involved in the failover finds that it has applied more log files than the new primary database itself has applied. This standby database must be re-created before it can serve as a standby for the new primary database.
After a complete failover to a logical standby database finishes, the broker disables all of the physical standby databases in the configuration. They must be re-created before they can serve as standby to the new primary database.
After an immediate failover completes, the new primary database starts a new incompatible redo branch, causing all the standby databases in the configuration, regardless of their type, to be disabled. They must be re-created before they can serve as standby to the new primary database.
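The four cases above can be combined into one function that predicts which databases end up disabled. This is a simplified model for planning purposes only: the function name, the dictionary keys, and especially the `unviable` flag are hypothetical. The flag stands in for the broker's own determination (for example, that a standby applied more redo than the new primary), which this sketch does not attempt to reproduce.

```python
def disabled_after_failover(failover_kind, target_type, old_primary, bystanders):
    """Return the sorted names of databases the broker disables.

    failover_kind: "complete" or "immediate".
    target_type:   "physical" or "logical" (type of the new primary).
    bystanders:    list of dicts with keys "name", "kind", and "unviable".
                   "unviable" is a hypothetical flag standing in for the
                   broker's own viability check on that standby.
    """
    if failover_kind not in ("complete", "immediate"):
        raise ValueError(f"unknown failover kind: {failover_kind!r}")
    disabled = {old_primary}  # the failed primary is always disabled
    for db in bystanders:
        if failover_kind == "immediate":
            disabled.add(db["name"])   # new redo branch: all are disabled
        elif target_type == "logical" and db["kind"] == "physical":
            disabled.add(db["name"])   # physical standbys are invalidated
        elif db["unviable"]:
            disabled.add(db["name"])   # judged unviable by the broker
    return sorted(disabled)
```

Every name in the returned list needs the recovery steps that follow before it can serve as a standby again.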
A database that has been disabled by the broker can be brought back to broker operation by:
Re-creating the database, either through flashback instantiation or by creating the database as described in Oracle Data Guard Concepts and Administration.
Enabling broker management of the re-created standby database. For example, use the CLI ENABLE DATABASE command.
Note: Databases can be flashback instantiated only if the Flashback Database feature was enabled and flashback logs are available. If failover was to a physical standby database, all databases can be flashback instantiated. If failover was to a logical standby database, the old primary database and any other logical standby databases can be flashback instantiated. The physical standby databases will need to be re-created as described in Oracle Data Guard Concepts and Administration.
The newly re-created standby database will begin serving as standby to the new primary database.
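The note on flashback instantiation amounts to a short decision table, sketched below. The function and parameter names are hypothetical, and the sketch covers only the conditions stated in the note; it does not check anything against a real database.

```python
def reinstatement_method(failover_target_type, role_before_failover,
                         flashback_available):
    """Return "flashback" or "re-create" for a database the broker disabled.

    failover_target_type: "physical" or "logical" (type of the new primary).
    role_before_failover: "primary", "physical", or "logical".
    flashback_available:  True only if Flashback Database was enabled and
                          flashback logs are still available.
    """
    if not flashback_available:
        return "re-create"
    if failover_target_type == "physical":
        return "flashback"  # all databases qualify
    if failover_target_type == "logical":
        # The old primary and other logical standbys qualify; physical
        # standbys must be re-created.
        if role_before_failover == "physical":
            return "re-create"
        return "flashback"
    raise ValueError(f"unknown standby type: {failover_target_type!r}")
```

Whichever path applies, the database still has to be re-enabled for broker management afterward, as described in the steps above.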