Quick Reference Guide for Managing Physical Standby Databases

Using a Physical Standby database (aka Data Guard) is one of the most effective ways to protect your database and improve its availability.  My colleague Jay Stanley wrote a great blog on this subject: http://dbspecialists.wpengine.com/blog/database-backups/need-uptime-use-and-oracle-physical-standby-database/.

The Standby database can be activated in the event of a disaster or it can be done for maintenance reasons, such as upgrading the host OS.  In this case, you can do a switchover to the Standby, upgrade the Primary host and then switch back to the Primary, all with potentially just a few minutes of downtime and zero data loss.  A failover occurs when the Primary becomes unavailable and the Standby needs to be used for production operations.  This is not typically a planned exercise and is done after some sort of disaster renders the Primary unavailable.  This could result in data loss, depending on the protection mode that has been configured. 

In this blog, I will demonstrate how to accomplish several common tasks for maintaining Physical Standby databases, including how to do both a switchover (for maintenance) and a failover (for disaster).  There are actually several different ways to accomplish these tasks: Grid Control, Data Guard command line (DGMGRL) and SQLPlus.  Because of the nature of the remote DBA work that I do, I like to use SQLPlus as it is the one tool that I can count on at any client site.

I highly recommend that you practice all of these commands in a non-production environment.  All of this was done using 11.2.0.2 Enterprise Edition on 64-bit linux.

Start a Standby Database in Redo Apply Mode with no delay

startup mount;
alter database recover managed standby database disconnect;

Start the Standby Database in Redo Apply Mode with a 120 minute delay

startup mount;
alter database recover managed standby database DELAY 120 disconnect;

Start the Standby Database in Real-Time Apply Mode

In order to use Real-Time Apply, the Standby database must have Standby Redo Logs.  For more details, see Oracle Support Note: 828274.1 Data Guard Real-Time Apply FAQ.

startup mount;
alter database recover managed standby database using current logfile disconnect;

Stop Redo Apply and open the Standby Database for Read Only access

alter database recover managed standby database cancel;
alter database open read only;

Perform a Switchover to a Physical Standby database.

Step 1: Make sure everything is ready

Query the SWITCHOVER_STATUS column of the V$DATABASE view on the primary database.

Select switchover_status from v$database;
SWITCHOVER_STATUS
-----------------
 TO STANDBY
 1 row selected

A value of TO STANDBY or SESSIONS ACTIVE indicates that the primary database can be switched to the standby role. If neither of these values is returned, a switchover is not possible because redo transport is either misconfigured or is not functioning properly.  Shutdown the application, disconnect all sessions and stop job processing.  You want to ensure that there are no active transactions occurring in order to minimize the time it takes to switchover.  If you have a delay configured, cancel and restart Redo Apply without it.

select delay_mins from v$managed_standby;

Note:  According to the docs, this should show the delay but in my testing, this value was always zero.  However, this message in the alert log indicates that there is a delay:

Media Recovery Delayed for 120 minute(s) (thread 1 sequence 23)
alter database recover managed standby database cancel;
alter database recover managed standby database NODELAY disconnect;

Check to see that the Standby has received and processed the latest archive log from the Primary.  Run this query on the Standby and compare against the latest sequence generated from the Primary

select al.thrd "Thread",
almax "Last Seq Received",
lhmax "Last Seq Applied"
from (select thread# thrd, max(sequence#) almax
     from v$archived_log
     where resetlogs_change#=(select resetlogs_change# from v$database)
     group by thread#) al,
    (select thread# thrd, max(sequence#) lhmax
     from v$log_history
     where first_time=(select max(first_time) from v$log_history)
     group by thread#) lh
where al.thrd = lh.thrd;
Thread Last Seq Received Last Seq Applied
---------- ----------------- ----------------
1                74               74

Run this on the Primary to see the latest sequence generated.  The last sequence applied to the Standby should be within 1 or 2 of this number:

SQL> select thread#, sequence# from v$thread;
THREAD#  SEQUENCE#
---------- ----------
1           75

If there is big difference between the last log generated and the last one applied, it could be because the Standby is not processing fast enough or there could be a gap in the Archive logs.  You can check for this by querying v$archive_gap.  You should resolve any such issues prior to doing the switchover.

Step 2: Convert the Primary to Standby:

alter database commit to switchover to physical standby with session shutdown;
shutdown abort;
startup mount;

Step 3: Convert the standby to Primary:

Query the SWITCHOVER_STATUS column of the V$DATABASE view on the standby database.

SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
-----------------
TO_PRIMARY

A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is returned, verify that Redo Apply is active and that redo transport is configured and working properly. Continue to query this column until the value returned is either TO PRIMARY or SESSIONS ACTIVE.

alter database commit to switchover to primary with session shutdown;
alter database open;

Step 4: Start Redo Apply on the new physical standby database and any others that exist.

alter database recover managed standby database using current logfile disconnect;

At this point, the databases have switched roles.  Confirm that the new Standby is receiving and applying logs from the new Primary.  You can do the same steps again to reverse them back to their original roles.  You should take a backup of the new Primary now.

Perform a Failover to a Physical Standby Database

Step 1:  Flush any unsent redo from the primary database to the target standby database.

If the primary database can be mounted, it may be possible to flush any unsent archived and current redo from the primary database to the standby database. If this operation is successful, a zero data loss failover is possible even if the primary database is not in a zero data loss data protection mode. Mount, but do not open the primary database. If the primary database cannot be mounted, go to Step 2.

Issue the following SQL statement at the primary database:

alter system flush redo to <Standby DB Name>;

For <Standby DB Name>, specify the DB_UNIQUE_NAME of the standby database that is to receive the redo flushed from the primary database. This statement flushes any unsent redo from the primary database to the standby database, and waits for that redo to be applied to the standby database. If this statement completes without any errors, go to Step 3. If the statement completes with any error, or if it must be stopped because you cannot wait any longer for the statement to complete, continue with Step 2.

Step 2:   Verify that the standby database has the most recently archived redo log file for each primary database redo thread and that there are no gaps. 

 Query the V$ARCHIVED_LOG  and v$archive_gap view on the target standby database to obtain the highest log sequence number for each redo thread.  If necessary, manually copy archive logs to the Standby and register them.

 alter database register physical logfile '<full path and file name>';

 Step 3:  Stop Redo Apply and finish applying all received redo data.

alter database recover managed standby database cancel;
alter database recover managed standby database finish;

Step 4: Open the new Primary database

4a) If  the “alter database…finish” statement in Step 3 fails, try to resolve the problem.  If it cannot be resolved a failover can still be performed (with some data loss) by issuing the following SQL statement on the target standby database:

alter database activate physical standby database;
alter database open; 

4b) If the “alter database…finish” completes without errors, then you can do a switchover with no data loss.  This allows you the option of switching back to the original primary after you resolve whatever caused it to become unavailable.

alter database commit to switchover to primary with session shutdown;
alter database open;

At this point, you should take a backup of the new Primary database.  If you have other Physical Standby databases, then you can restart the redo apply now.

Flashback a Failed Primary Database into a Physical Standby Database

In the event of a failover, the original Primary database will need to be rebuilt or flashed back before it can become a Standby.  The following steps assume that a failover has been performed to a physical standby database and that Flashback Database was enabled on the old primary database at the time of the failover. This procedure brings the old primary database back into the Data Guard configuration as a physical standby database.

Step 1: On the new primary database, issue the following query to determine the SCN at which the old standby database became the new primary database:

Select to_char(standby_became_primary_scn) from v$database;

Step 2:   Flashback and convert the database to a physical standby database.

Issue the following statements on the old primary database:

shutdown immediate;
startup mount;
flashback database to scn <SCN from Step 1>;
alter database convert to physical standby;
shutdown immediate;
startup mount;
alter database recover managed standby database using current logfile disconnect;

 Step 3:   Start transporting redo to the new physical standby database.

Configure the log_archive_dest_2 parameter and enable it:

ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_n=ENABLE;

On the new standby database, you may also need to change the LOG_ARCHIVE_DEST_n initialization parameters so that redo transport services do not transmit redo data to other databases.  Confirm that the new Standby is receiving and applying logs from the new Primary.

Here at Database Specialists, we all have many years of experience working with Standby databases.  If you have questions or problems with your database, feel free to contact us and we will give you a hand.

Rack Blogger is our catchall blog byline, subbed in when a Racker author moves on, or used when we publish a guest post. You can email Rack Blogger at blog@rackspace.com.

4 COMMENTS

  1. Hi,
    I have issued an “alter database recover managed standby finish;”. Is there any way to go back to a standby from this? I keep getting error ORA-16157.

    Thanks.

LEAVE A REPLY

Please enter your comment!
Please enter your name here