
Incremental Recovery of Standby (ASM and RMAN)

December 20, 2011


I had another interesting recovery operation recently. A company I was working with had a very
large (~4TB) database generating anywhere from several hundred MB to over a TB of archive logs
each day. They had RMAN backups configured, but due to a slight misconfiguration of their backups,
they accidentally deleted an archive log that they needed. As a consequence, they had a gap in their
standby, and they did not have a backup of the required archive log. They had also created additional
data files in the time that the standby was out of sync.
In the past, I had always recovered this situation by a complete re-instantiation of the standby, which
we really did not want to do because of the very large size of the source database. Therefore, I was
happy to have someone mention this process of recovery using an incremental backup from SCN. This
is the process I used. After recovering the customer, I went back and refined the process and
identified some unneeded steps.
First I created a primary and a standby database in VMware Workstation. The primary was called
dgsrc and the standby dgsrcsb. I then created a table called mytest_table, and inserted some rows in
dgsrc. I then opened the standby in read only and verified that the table was in the standby. At that
point I shut down the standby and its listener.
Then I inserted a large number of rows in the primary, forcing a log switch after each large insert. I
then added a datafile to the primary.
Next, I started rman and deleted several archive logs so they could not be transported. I then started
up the standby and verified that there was a gap:
FAL[client]: Failed to request gap sequence
 GAP - thread 1 sequence 85-86
 DBID 586313788 branch 769276415
FAL[client]: All defined FAL servers have been attempted.
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.

After verifying the gap, I added a datafile to the primary, and inserted more rows into mytest_table. I
then took a count of the rows in the table:
SQL> select count(1) from akerber.mytest_table;

  COUNT(1)
----------
   3809968
I then proceeded to attempt the recovery process as documented below. The basic reference
document for this process is Metalink note 836986.1. However, that document does not cover the steps
to add missing datafiles, and also contains unnecessary steps. The steps below were validated on
both a database using a file system, and a database using ASM. Also, in order for the process below
to work properly, the parameters DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT must be
set correctly on the standby.
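For reference, the convert parameters can be checked and, if needed, set on the standby. This is a hedged sketch: the diskgroup names +DATA_PRI and +DATA_SBY are illustrative placeholders, not values from this environment, and changes made with scope=spfile take effect at the next restart of the standby instance.

```sql
-- Show the current settings (SQL*Plus, on the standby)
SQL> show parameter file_name_convert

-- Illustrative only: substitute your own primary/standby paths or diskgroups
SQL> alter system set db_file_name_convert='+DATA_PRI','+DATA_SBY' scope=spfile;
SQL> alter system set log_file_name_convert='+DATA_PRI','+DATA_SBY' scope=spfile;
```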
The basic steps are as follows:
1. Shut down the standby database.
2. Identify the last applied SCN on the standby.
3. Make a list of files currently on the primary and the standby.
4. Run an incremental backup from SCN, and create a standby controlfile, both on the primary database.
5. Move the backup files and the backup controlfile to the standby server.
6. Restore the standby controlfile to the standby.
7. Create missing datafiles.
8. Switch datafiles to copy (required whether or not new files are created).
9. Recover using the incremental SCN backup.
10. Restart the standby.

The steps in detail are below. This is a much quicker method to re-instantiate the standby database
than I have used in the past. I have tried it out on databases running on both file systems and ASM.
Step 1: Once you have determined you have a log gap, and do not have a copy of the archivelog to
register, cancel the recovery process on the standby:

alter database recover managed standby database cancel;
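Before cancelling recovery, the gap itself can be confirmed from the standby. A minimal check, assuming the standby is mounted; note that v$archive_gap reports only the gap currently blocking recovery:

```sql
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
```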

Step 2: On the STANDBY, run these commands (NOTE: the CURRENT_SCN can be a very large
number, and if you do not use the COLUMN format command, the output may appear in unreadable
exponential form):

SQL> column current_scn format 999999999999999
SQL> select current_scn from v$database;
SQL> select min(fhscn) current_scn from x$kcvfh;

Use the lesser of the two numbers above, and then subtract 1000 to get the SCN for the incremental
backup operation. Note that the lesser of the two numbers should be sufficient, but choosing
an earlier SCN cannot hurt, and will add a safety margin.
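If preferred, the comparison and subtraction can be done in a single query. This is a sketch, assuming SYS access for x$kcvfh; note that fhscn is stored as a character string, so it is converted to a number before comparison:

```sql
column backup_scn format 999999999999999
select least(
         (select current_scn from v$database),
         (select min(to_number(fhscn)) from x$kcvfh)
       ) - 1000 as backup_scn
from dual;
```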

Step 3: Check the number and creation date of datafiles on the STANDBY:

select count(1), max(file#), to_char(max(creation_time),'dd-Mon-yyyy hh24:mi') from v$datafile;

Run the same query on the primary database, and compare the two.

If the number of files differs between the two queries, or the creation dates are substantially
different, save a listing of datafiles from both the standby and the primary. Use the query below to get
the listing. Run it on both the primary and standby:

select file#, name from v$datafile;
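One simple way to save the listings for later comparison is to spool them from SQL*Plus on each side (the file path is illustrative):

```sql
SQL> spool /tmp/datafile_list.txt
SQL> select file#, name from v$datafile order by file#;
SQL> spool off
```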

Step 4: On the PRIMARY start RMAN, and run the backup command:

run {
allocate channel t1 device type disk;
allocate channel t2 device type disk;
allocate channel t3 device type disk;
allocate channel t4 device type disk;
BACKUP as compressed backupset INCREMENTAL FROM SCN <scn from step 2 goes here>
DATABASE FORMAT '<path to backup destination>/filename_%U';
}

E.g.:

RMAN> run {
allocate channel t1 device type disk;
allocate channel t2 device type disk;
allocate channel t3 device type disk;
allocate channel t4 device type disk;
BACKUP as compressed backupset INCREMENTAL FROM SCN 604900565
DATABASE FORMAT '/home/oracle/bkup/orc_%U.bak';
}

Backup the current control file on the primary for the standby. Note that this command can also be
included as part of the backup script above. If you choose to do this, make sure it is run after the
incremental backup:

RMAN> backup current controlfile for standby format '/home/oracle/bkup/control01.ctl';
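If you prefer a single script, here is a sketch of the combined run block, using the same example SCN and paths as above, with the controlfile backup placed after the incremental:

```sql
run {
allocate channel t1 device type disk;
allocate channel t2 device type disk;
BACKUP as compressed backupset INCREMENTAL FROM SCN 604900565
DATABASE FORMAT '/home/oracle/bkup/orc_%U.bak';
BACKUP CURRENT CONTROLFILE FOR STANDBY FORMAT '/home/oracle/bkup/control01.ctl';
}
```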

Step 5: Copy the standby controlfile backup and the incremental backup files to the standby server,
using SCP or FTP.

It saves time to store the incremental backup and controlfile backup in a location with the same
path name on the standby server.

Step 6: Shutdown the STANDBY, start it up in nomount mode, and restore the standby
controlfile. Remember to use the STANDBY keyword so that Oracle understands that this is a standby
controlfile that is being restored:

RMAN> shutdown immediate;
RMAN> startup nomount;
RMAN> restore standby controlfile from '/home/oracle/bkup/control01.ctl';
RMAN> alter database mount;

Catalog the backup files on the standby. This is not necessary if the files are in the same location as
on the primary:

RMAN> catalog start with '<directory containing files>';

E.g.:

RMAN> catalog start with '/home/oracle/bkup';

RMAN will ask you if you really want to catalog the files; enter YES to catalog the files.

Now switch to SQLPLUS and compare the list of the files from v$datafile with the list you saved earlier
from the standby:
SQL> select file#, name from v$datafile;

Step 7: If any datafiles are missing, create them using these commands:
SQL> alter system set standby_file_management=manual scope=memory;
SQL> alter database create datafile <file#>;
E.g.: SQL> alter database create datafile 12;

After all datafiles are created, run this command:


SQL> alter system set standby_file_management=auto scope=memory;
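Files that need to be created are easy to identify, because the restored standby controlfile records them under placeholder names. A hedged check; the assumption here is that the placeholders contain the string UNNAMED, as they typically do:

```sql
SQL> select file#, name from v$datafile where name like '%UNNAMED%';
```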

Step 8: Switch back to RMAN and switch to the new datafiles (note: this step is required whether or
not new files are created; it directs the control file to the correct set of datafiles):

RMAN> switch database to copy;

The switch database to copy command will force a clear of the redo logs. If you are monitoring the
alert log you will see a series of errors about "failed to open online log". These can be ignored.

NOTE: I did not test this aspect, however I am fairly confident that if the DB_FILE_NAME_CONVERT
and LOG_FILE_NAME_CONVERT parameters are not set, the CATALOG command must be run for each
diskgroup (or directory) on the STANDBY containing datafiles prior to the switch database to copy
command. E.g.:

RMAN> catalog start with '+DATA1/orcl/datafile';
RMAN> catalog start with '+DATA2/orcl/datafile';

Once again, after each catalog command you will be asked if you want to catalog the files, enter YES.
After all catalog commands are complete, run the command:
RMAN> switch database to copy;

Step 9: Recover the standby database. This is done with the command below:
RMAN> recover database noredo;
The NOREDO option forces RMAN to use a backup rather than the archive logs, thus bypassing the
gaps. This apply process will often take substantial time.
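While the noredo recovery runs, its progress can be watched from another session. A sketch using v$session_longops; the filter on opname is an assumption about how RMAN labels its rows there:

```sql
select sid, serial#, opname, sofar, totalwork,
       round(100 * sofar / totalwork, 1) as pct_done
from v$session_longops
where opname like 'RMAN%'
  and totalwork > 0
  and sofar <> totalwork;
```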

Step 10: After the recovery process is complete, resume the managed recovery process:

SQL> alter database recover managed standby database disconnect from session;

Monitor the alert log on the standby to verify that logs are applying. Entries in the alert log beginning
with "Media Recovery Log <file name>" indicate that logs are being applied.
"Archived Log entry <number> added for thread <#> sequence <#>" indicates that logs are being
received.

A sample from a running standby is below:

RFS[1]: Selected log 7 for thread 1 sequence 102 dbid 586313788 branch 769276415
Sun Dec 18 13:25:31 2011
Archived Log entry 35 added for thread 1 sequence 101 ID 0x22f2f73c dest 1:
Sun Dec 18 13:25:33 2011
Media Recovery Log +DATA1X/dgsrc1/archivelog/2011_12_18/thread_1_seq_101.474.770217929
Media Recovery Waiting for thread 1 sequence 102 (in transit)
The last line indicates that the standby has applied the previous log and is waiting for the next log
(102) to arrive so that it can apply that log.
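Log receipt and apply can also be confirmed from SQL on the standby; a minimal sketch:

```sql
SQL> select sequence#, applied from v$archived_log order by sequence#;
```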

To verify that the process was successful, I opened the standby in read only mode, and counted the
rows in mytest_table:

SQL> select count(1) from akerber.mytest_table;

  COUNT(1)
----------
   3809968

The recovery process was successful.

PRE-REQUISITES:

Primary database can be single node or RAC and running OK.
No downtime of the Primary is required.
All Data Guard settings are intact.
Step-by-step:
1. Get the latest SCN from standby:

select to_char(current_scn) from v$database;

10615562421
2. Create incremental backup on Primary for all the changes since SCN on standby:

[oracle@primary backup]$ rman target /

connected to target database: PRIMARY (DBID=720063942)

RMAN> run
2> {
3> allocate channel d1 type disk;
4> allocate channel d2 type disk;
5> backup incremental from scn 10615562421 database format
6> '/tmp/backup/primary_%U';
7> release channel d1;
8> release channel d2;
9> }
3. Create copy of control file on Primary:

alter database create standby controlfile as '/tmp/backup/stby.ctl';

4. SCP the backup files and standby control file to the standby server. A little tip: if
you copy the backup files to a directory with the same name (like /tmp/backup here),
your controlfile will know about them and you can bypass the registration bit later on.

5. The next step is to replace the standby control file with the new one. It may sound
simple, but this proved to be the trickiest part due to the fact that the standby controlfile is
OMF and in ASM. You will need to use RMAN for the restore operation.

Switch the database to nomount, then:

restore controlfile from '/tmp/backup/stby.ctl';

Mount the database.

At this point you have the controlfile with the information about the files as they are on the Primary side, so
the next step is to register everything we have on the Standby side:

catalog start with '+data/standby/';

Check the output and answer YES to register any reported standby files.
Shutdown immediate your standby instance.

RMAN> switch database to copy;
RMAN> report schema;

At this stage you should have a nice and clean list of actual standby files.

Now we are ready to apply our differential backup to bring the standby in line with Primary:

RMAN> recover database noredo;


Because the online redo logs are lost, you must specify the NOREDO option in the RECOVER command.
You must also specify NOREDO if the online logs are available but the redo cannot be applied to the
incrementals.
If you do not specify NOREDO, then RMAN searches for redo logs after applying the incremental backup, and
issues an error message when it does not find them.
