
Incremental Recovery of Standby (ASM and RMAN)

December 20, 2011


I had another interesting recovery operation recently. A company I was working with had a very
large (~4TB) database generating anywhere from several hundred MB to over a TB of archive logs
each day. They had RMAN backups configured, but due to a slight misconfiguration of their backups,
they accidentally deleted an archive log that they needed. As a consequence, they had a gap in their
standby, and they did not have a backup of the required archive log. They had also created additional
data files in the time that the standby was out of sync.
In the past, I had always recovered this situation by a complete re-instantiation of the standby, which
we really did not want to do because of the very large size of the source database. Therefore, I was
happy to have someone mention this process of recovery using an incremental backup from SCN. This
is the process I used. After recovering the customer, I went back and refined the process and
identified some unneeded steps.
First I created a primary and a standby database in VMware Workstation. The primary was called
dgsrc and the standby dgsrcsb. I then created a table called mytest_table, and inserted some rows in
dgsrc. I then opened the standby in read only and verified that the table was in the standby. At that
point I shut down the standby and its listener.
Then I inserted a large number of rows in the primary, forcing a log switch after each large insert. I
then added a datafile to the primary.
Next, I started rman and deleted several archive logs so they could not be transported. I then started
up the standby and verified that there was a gap:
FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 85-86
 DBID 586313788 branch 769276415
FAL[client]: All defined FAL servers have been attempted.
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.

After verifying the gap, I added a datafile to the primary, and inserted more rows into mytest_table. I
then took a count of the rows in the table:
SQL> select count(1) from akerber.mytest_table;

  COUNT(1)
----------
   3809968
I then proceeded to attempt the recovery process as documented below. The basic reference
document for this process is Metalink note 836986.1. However, that document does not cover the steps
to add missing datafiles, and also contains unnecessary steps. The steps below were validated on
both a database using a file system, and a database using ASM. Also, in order for the process below
to work properly, the parameters DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT must be
set correctly on the standby.
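For reference, the convert parameters can be checked and, if needed, set on the standby. This is a hedged sketch: the diskgroup names +DATA_PRI and +DATA_SBY are illustrative placeholders, not values from this environment, and changes made with scope=spfile take effect at the next restart of the standby instance.

```sql
-- Show the current settings (SQL*Plus, on the standby)
SQL> show parameter file_name_convert

-- Illustrative only: substitute your own primary/standby paths or diskgroups
SQL> alter system set db_file_name_convert='+DATA_PRI','+DATA_SBY' scope=spfile;
SQL> alter system set log_file_name_convert='+DATA_PRI','+DATA_SBY' scope=spfile;
```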
The basic steps are as follows:
1. Shut down the standby database.
2. Identify the last applied SCN on the standby.
3. Make a list of files currently on the primary and the standby.
4. Run an incremental backup from SCN, and create a standby controlfile, both on the primary database.
5. Move the backup files and the backup controlfile to the standby server.
6. Restore the standby controlfile to the standby.
7. Create missing datafiles.
8. Switch datafiles to copy (required whether or not new files are created).
9. Recover using the incremental SCN backup.
10. Restart the standby.

The steps in detail are below. This is a much quicker method to re-instantiate the standby database
than I have used in the past. I have tried it out on databases running on both file systems and ASM.
Step 1: Once you have determined you have a log gap, and do not have a copy of the archivelog to
register, cancel the recovery process on the standby:

alter database recover managed standby database cancel;
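Before cancelling recovery, the gap itself can be confirmed from the standby. A minimal check, assuming the standby is mounted; note that v$archive_gap reports only the gap currently blocking recovery:

```sql
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
```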

Step 2: On the STANDBY, run these commands (NOTE: the CURRENT_SCN can be a very large
number, and if you do not use the COLUMN format command, the output may appear in unreadable
exponential form):

SQL> column current_scn format 999999999999999
SQL> select current_scn from v$database;
SQL> select min(fhscn) current_scn from x$kcvfh;

Use the lesser of the two numbers above, and then subtract 1000 to get the SCN for the incremental
backup operation. Note that the lesser of the two numbers should be sufficient, but choosing
an earlier SCN cannot hurt, and will add a safety margin.
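If preferred, the comparison and subtraction can be done in a single query. This is a sketch, assuming SYS access for x$kcvfh; note that fhscn is stored as a character string, so it is converted to a number before comparison:

```sql
column backup_scn format 999999999999999
select least(
         (select current_scn from v$database),
         (select min(to_number(fhscn)) from x$kcvfh)
       ) - 1000 as backup_scn
from dual;
```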

Step 3: Check the number and creation date of datafiles on the STANDBY:

select count(1), max(file#), to_char(max(creation_time),'dd-Mon-yyyy hh24:mi') from v$datafile;

Run the same query on the primary database, and compare the two.

If the number of files differs between the two queries, or the creation dates are substantially
different, save a listing of datafiles from both the standby and the primary. Use the query below to get
the listing. Run it on both the primary and standby:

select file#, name from v$datafile;
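One simple way to save the listings for later comparison is to spool them from SQL*Plus on each side (the file path is illustrative):

```sql
SQL> spool /tmp/datafile_list.txt
SQL> select file#, name from v$datafile order by file#;
SQL> spool off
```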

Step 4: On the PRIMARY start RMAN, and run the backup command:

run {
allocate channel t1 device type disk;
allocate channel t2 device type disk;
allocate channel t3 device type disk;
allocate channel t4 device type disk;
BACKUP as compressed backupset INCREMENTAL FROM SCN <scn from step 2 goes here>
DATABASE FORMAT '<path to backup destination>/filename_%U';
}

E.g.:

RMAN> run {
allocate channel t1 device type disk;
allocate channel t2 device type disk;
allocate channel t3 device type disk;
allocate channel t4 device type disk;
BACKUP as compressed backupset INCREMENTAL FROM SCN 604900565
DATABASE FORMAT '/home/oracle/bkup/orc_%U.bak';
}

Backup the current control file on the primary for the standby. Note that this command can also be
included as part of the backup script above. If you choose to do this, make sure it is run after the
incremental backup:

RMAN> backup current controlfile for standby format '/home/oracle/bkup/control01.ctl';
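If you prefer a single script, here is a sketch of the combined run block, using the same example SCN and paths as above, with the controlfile backup placed after the incremental:

```sql
run {
allocate channel t1 device type disk;
allocate channel t2 device type disk;
BACKUP as compressed backupset INCREMENTAL FROM SCN 604900565
DATABASE FORMAT '/home/oracle/bkup/orc_%U.bak';
BACKUP CURRENT CONTROLFILE FOR STANDBY FORMAT '/home/oracle/bkup/control01.ctl';
}
```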

Step 5: Copy the standby controlfile backup and the incremental backup files to the standby server,
using SCP or FTP.

It saves time to store the incremental backup and controlfile backup in a location with the same
path name on the standby server.

Step 6: Shutdown the STANDBY, start it up in nomount mode, and restore the standby
controlfile. Remember to use the STANDBY keyword so that Oracle understands that this is a standby
controlfile that is being restored:

RMAN> shutdown immediate;
RMAN> startup nomount;
RMAN> restore standby controlfile from '/home/oracle/bkup/control01.ctl';
RMAN> alter database mount;

Catalog the backup files on the standby. This is not necessary if the files are in the same location as
on the primary:

RMAN> catalog start with '<directory containing files>';

E.g.:

RMAN> catalog start with '/home/oracle/bkup';

RMAN will ask you if you really want to catalog the files; enter YES to catalog the files.

Now switch to SQLPLUS and compare the list of the files from v$datafile with the list you saved earlier
from the standby:
SQL> select file#, name from v$datafile;

Step 7: If any datafiles are missing, create them using these commands:
SQL> alter system set standby_file_management=manual scope=memory;
SQL> alter database create datafile <file#>;
E.g.: SQL> alter database create datafile 12;

After all datafiles are created, run this command:


SQL> alter system set standby_file_management=auto scope=memory;
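Files that need to be created are easy to identify, because the restored standby controlfile records them under placeholder names. A hedged check; the assumption here is that the placeholders contain the string UNNAMED, as they typically do:

```sql
SQL> select file#, name from v$datafile where name like '%UNNAMED%';
```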

Step 8: Switch back to RMAN and switch to the new datafiles (note: this step is required whether or
not new files are created; it directs the control file to the correct set of datafiles):

RMAN> switch database to copy;

The switch database to copy command will force a clear of the redo logs. If you are monitoring the
alert log you will see a series of errors about "failed to open online log". These can be ignored.

NOTE: I did not test this aspect, however I am fairly confident that if the DB_FILE_NAME_CONVERT
and LOG_FILE_NAME_CONVERT parameters are not set, the CATALOG command must be run for each
diskgroup (or directory) on the STANDBY containing datafiles prior to the switch database to copy
command. E.g.:

RMAN> catalog start with '+DATA1/orcl/datafile';
RMAN> catalog start with '+DATA2/orcl/datafile';

Once again, after each catalog command you will be asked if you want to catalog the files, enter YES.
After all catalog commands are complete, run the command:
RMAN> switch database to copy;

Step 9: Recover the standby database. This is done with the command below:
RMAN> recover database noredo;
The NOREDO option forces RMAN to use a backup rather than the archive logs, thus bypassing the
gaps. This apply process will often take substantial time.
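While the noredo recovery runs, its progress can be watched from another session. A sketch using v$session_longops; the filter on opname is an assumption about how RMAN labels its rows there:

```sql
select sid, serial#, opname, sofar, totalwork,
       round(100 * sofar / totalwork, 1) as pct_done
from v$session_longops
where opname like 'RMAN%'
  and totalwork > 0
  and sofar <> totalwork;
```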

Step 10: After the recovery process is complete, resume the managed recovery process:

SQL> alter database recover managed standby database disconnect from session;

Monitor the alert log on the standby to verify that logs are applying. Entries in the alert log beginning
with "Media Recovery Log <file name>" indicate that logs are being applied.
"Archived Log entry <number> added for thread <#> sequence <#>" indicates that logs are being
received.

A sample from a running standby is below:

RFS[1]: Selected log 7 for thread 1 sequence 102 dbid 586313788 branch 769276415
Sun Dec 18 13:25:31 2011
Archived Log entry 35 added for thread 1 sequence 101 ID 0x22f2f73c dest 1:
Sun Dec 18 13:25:33 2011
Media Recovery Log +DATA1X/dgsrc1/archivelog/2011_12_18/thread_1_seq_101.474.770217929
Media Recovery Waiting for thread 1 sequence 102 (in transit)
The last line indicates that the standby has applied the previous log and is waiting for the next log
(102) to arrive so that it can apply that log.
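Log receipt and apply can also be confirmed from SQL on the standby; a minimal sketch:

```sql
SQL> select sequence#, applied from v$archived_log order by sequence#;
```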

To verify that the process was successful, I opened the standby in read only mode, and counted the
rows in mytest_table:

SQL> select count(1) from akerber.mytest_table;

  COUNT(1)
----------
   3809968

The recovery process was successful.

PRE-REQUISITES:

Primary database can be single node or RAC and running OK.
No downtime of the Primary is required.
All Data Guard settings are intact.
Step-by-step:
1. Get the latest SCN from standby:

select to_char(current_scn) from v$database;

10615562421
2. Create incremental backup on Primary for all the changes since SCN on standby:

[oracle@primary backup]$ rman target /

connected to target database: PRIMARY (DBID=720063942)

RMAN> run
2> {
3> allocate channel d1 type disk;
4> allocate channel d2 type disk;
5> backup incremental from scn 10615562421 database format
6> '/tmp/backup/primary_%U';
7> release channel d1;
8> release channel d2;
9> }
3. Create copy of control file on Primary:

alter database create standby controlfile as '/tmp/backup/stby.ctl';

4. SCP the backup files and standby control file to the standby server. A little tip: if
you copy the backup files to a directory with the same name (like /tmp/backup here),
your controlfile will know about them and you can bypass the registration bit later on.

5. The next step is to replace the standby control file with the new one. It may sound
simple, but this proved to be the trickiest part due to the fact that the standby controlfile is
OMF and in ASM. You will need to use RMAN for the restore operation.

Switch the database to nomount, then:

restore controlfile from '/tmp/backup/stby.ctl';

Mount the database.

At this point you have the controlfile with the information about the files as they are on the Primary side, so
the next step is to register everything we have on the Standby side:

catalog start with '+data/standby/';

Check the output and answer YES to register any reported standby files.
Shutdown immediate your standby instance.

RMAN> switch database to copy;
RMAN> report schema;

At this stage you should have a nice and clean list of actual standby files.

Now we are ready to apply our differential backup to bring the standby in line with Primary:

RMAN> recover database noredo;


Because the online redo logs are lost, you must specify the NOREDO option in the RECOVER command.
You must also specify NOREDO if the online logs are available but the redo cannot be applied to the
incrementals.
If you do not specify NOREDO, then RMAN searches for redo logs after applying the incremental backup, and
issues an error message when it does not find them.
