Standby database creation for VLDBs

Introduction

I have mentioned before that the best way to ensure Oracle RDBMS uptime is to create and maintain a physical standby database. There simply is no more reliable way to ensure uptime for the Oracle RDBMS.

Creating a standby database involves these steps:

  • Creating a standby controlfile from the primary database and transferring it to the standby,
  • Modifying parameters on the primary database to ship logs to the standby if using Enterprise Edition, or setting up shell scripts to do the same,
  • Setting up a special parameter file for the standby database that will accept redo from the primary database (a sketch of these settings follows this list),
  • Copying all datafiles from the primary database to the standby database,
  • Starting the standby database, and enabling log shipping from the primary database,
  • Optionally, if using Data Guard, adding in the Data Guard configuration and enabling it,
  • Monitoring the standby database to ensure that it is applying logs from the primary,
  • Testing failover to the standby database to ensure all the pieces are working.
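As a hedged illustration of the log-shipping and parameter-file settings above, assuming a primary with DB_UNIQUE_NAME prod and a standby with DB_UNIQUE_NAME prodsb reachable through TNS aliases PROD and PRODSB (all names here are hypothetical), the Enterprise Edition setup might look roughly like this:

    -- On the primary: define the standby redo destination, but leave it
    -- deferred until the standby is ready to receive redo
    ALTER SYSTEM SET log_archive_config = 'DG_CONFIG=(prod,prodsb)' SCOPE=BOTH;
    ALTER SYSTEM SET log_archive_dest_2 = 'SERVICE=PRODSB ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=prodsb' SCOPE=BOTH;
    ALTER SYSTEM SET log_archive_dest_state_2 = 'DEFER' SCOPE=BOTH;

    -- Key entries in the standby parameter file so it can accept redo
    *.db_name='prod'
    *.db_unique_name='prodsb'
    *.log_archive_config='DG_CONFIG=(prod,prodsb)'
    *.fal_server='PROD'
    *.standby_file_management='MANUAL'   -- datafiles will be added by hand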

The steps above are well-documented in the Oracle Data Guard Concepts and Administration manual. For most setups, creating a standby database is not too difficult. However, for very large databases with high transaction/redo rates, a serious issue can become evident.

The problem is that for a standby database to stay in sync with the primary after creation, it needs access to all of the redo that has been generated from the time the files were backed up on the primary. For very large databases (i.e. 10 TB+), it’s not uncommon for it to take a number of days to back up and transfer the datafiles from the primary to the standby. When those datafiles are finally restored on the standby, all the redo generated over those days must be applied to the standby before it can accept new redo from the primary, and this can be a very large quantity of redo!

For very large databases that generate lots of redo (say 1 TB/day), the newly created standby may need to apply several TB of redo, and it needs to do this quickly; a four-day copy at 1 TB/day of redo, for example, leaves roughly 4 TB of archived logs to stage and apply. Unless you have a LOT of disk space to stage those redo logs, this is going to be very difficult to do.

This article describes one way to get around this issue.

The secret

So, what’s the secret? The main secret is to create the standby incrementally, datafile by datafile, over an extended period of time, keeping each datafile synced with the primary from soon after it is transferred.

It turns out that Oracle physical standby databases manage controlfiles differently than primary databases do. When you drop a datafile on a primary database (for example with ‘alter tablespace mytbs drop datafile mydatafile.dbf’, or ‘drop tablespace mytbs including contents and datafiles’), the controlfile is updated and the history of that datafile is wiped clean; there really is no way to restore a datafile after it has been dropped.

However, when you issue an ‘alter database datafile mydatafile.dbf offline drop’ on a standby database, the history of that datafile actually doesn’t go away! The file is simply marked with a ‘delete’ flag in memory, which causes the Oracle recovery process on the standby database to skip that datafile.
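As a quick sketch (the file name below is hypothetical), finding and marking a missing datafile on a mounted standby looks roughly like:

    -- On the mounted standby: list datafiles the controlfile knows about
    -- but which have not been copied over yet
    SELECT file#, status, error, name
      FROM v$datafile_header
     WHERE error IS NOT NULL;

    -- Mark a missing file so managed recovery will skip it
    ALTER DATABASE DATAFILE '/u01/oradata/prod/mydatafile.dbf' OFFLINE DROP;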

Using this information, you can create a standby database datafile-by-datafile, over a period of days, weeks, or even months, by:

  1. Create a standby controlfile from the primary and ship it to the standby,
  2. Modify parameters on the primary database to ship logs to the standby if using Enterprise Edition,
  3. Set up a special parameter file for the standby database that will accept redo from the primary database,
  4. Copy the first datafile from the primary to the standby (or restore it using RMAN),
  5. Start the standby database in ‘mount’ mode (startup nomount; alter database mount standby database). You will notice that in v$datafile_header, the first datafile will have a status of ONLINE, and all the other datafiles will show an error.
  6. For every datafile that shows an error, issue an ‘alter database datafile datafile_name offline drop;’ on the standby database.
  7. Initiate standby recovery on the standby database (alter database recover managed standby database…),
  8. Initiate redo transfer from the primary to the standby (i.e. set LOG_ARCHIVE_DEST_STATE_n to ENABLE on the primary),
  9. Ensure that the standby is applying redo from the primary database (e.g. check v$dataguard_status on the standby).
  10. This is now stable; the first datafile will be kept up-to-date. You can’t really use (or open) the standby database yet.
  11. When you are ready for the next datafile, transfer the datafile from the primary to the standby,
  12. Shut down the standby (shutdown immediate). Then, start it back up (startup nomount; alter database mount standby database).
  13. Again, note which datafiles show an error in v$datafile_header. For each of those, re-issue the ‘alter database datafile datafile_name offline drop’ – the ‘delete’ flags live only in memory, so they do not survive the restart.
  14. Begin redo application on the standby (alter database recover managed standby database…).
  15. The database is again ‘stable’ – the first two datafiles will be kept in sync with the primary.
  16. Repeat steps 11-15 for each datafile in turn (a consolidated sketch of one iteration follows this list). If you like, you can copy a few files at a time; it depends on how big they are, and how much redo you can keep around.
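Pulling steps 11-15 together, one iteration of the loop might look like the following sketch, run on the standby (the file names are hypothetical; redo shipping from the primary can stay enabled throughout):

    -- Stop redo apply and shut down before adding the next datafile
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
    SHUTDOWN IMMEDIATE

    -- Copy the next datafile into place at the OS level, then:
    STARTUP NOMOUNT
    ALTER DATABASE MOUNT STANDBY DATABASE;

    -- Re-drop every file that is still missing (the 'delete' flags were
    -- in memory only and did not survive the restart); the newly copied
    -- file should now show up as ONLINE
    SELECT file#, name FROM v$datafile_header WHERE error IS NOT NULL;
    ALTER DATABASE DATAFILE '/u01/oradata/prod/still_missing.dbf' OFFLINE DROP;

    -- Resume redo apply
    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;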

One note: when you’re using the process above, remember that the view v$datafile_header on the standby is very, very useful; it shows not only which datafiles are present and online, but also the last system change number (SCN) checkpointed in each datafile, so you can see how they are syncing.
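For example, a pair of queries along these lines shows the per-file apply progress and confirms that redo is arriving (the views and columns are standard; the queries themselves are just a sketch):

    -- How far each datafile on the standby has been recovered
    SELECT file#, status, checkpoint_change#, checkpoint_time
      FROM v$datafile_header
     ORDER BY file#;

    -- Confirm redo is being received and applied
    SELECT timestamp, message
      FROM v$dataguard_status
     ORDER BY timestamp;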

The steps above allow you to create a standby database incrementally, datafile-by-datafile, over an extended period of time.

Conclusion

It is not often that a standby database needs to be created in this way; it is warranted only for very large databases with high redo rates, or for setups that have virtually no extra disk space on the standby to hold redo. It is obviously a lot more work to create one incrementally. However, if you run into a situation like this, knowing that there is a way around the issue can be very useful.

Author: Jay Stanley of Database Specialists <jstanley@dbspecialists.com>

Date: December 22, 2010


2 COMMENTS

  1. Good.
    What are your thoughts on incremental sync using an RMAN backup from a particular SCN onwards, applied to the standby, so that you do not need to apply all the redo?

    Regards,

    • That’s a good question. The main disadvantage of using RMAN to do incremental block recovery is that it’s bulk-based; it won’t keep the standby database as close in time to the primary as applying archived (or, in the case of Data Guard, online) redo fairly constantly does. The place where I’ve seen it most useful is when you have a very large database that takes many hours or days to back up: you can do one backup (with the block change tracking file active), then just do incremental updates to that backup. So, it’s more of a backup solution than a standby solution per se.
      -jay
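      For reference, the SCN-based roll-forward being discussed looks roughly like this (the SCN and staging path are placeholders):

        -- On the primary: back up all blocks changed since the standby's
        -- current SCN (1234567 is a placeholder)
        RMAN> BACKUP INCREMENTAL FROM SCN 1234567 DATABASE FORMAT '/stage/forstandby_%U';

        -- On the standby: catalog the transferred pieces, then roll the
        -- datafiles forward without applying any redo
        RMAN> CATALOG START WITH '/stage/forstandby';
        RMAN> RECOVER DATABASE NOREDO;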
