List of Tables ... i
List of Figures ... i
Acknowledgments ... iii
Abstract ... iv
What's New in the Oracle9i Maximum Availability Architecture ... vi
1 Introduction ... 1-9
2 Architecture Overview ... 2-1
2.1 Application Tier ... 2-2
2.2 Network Infrastructure ... 2-3
2.3 Storage Infrastructure ... 2-4
2.4 Real Application Clusters ... 2-5
2.5 Data Guard and the Secondary Site ... 2-6
2.6 Operational Best Practices ... 2-7
3 System Configuration ... 3-1
3.1 Storage ... 3-1
3.2 Server Hardware ... 3-6
3.3 Server Software ... 3-8
4 Network Configuration ... 4-1
4.1 Use WAN traffic managers to provide site failover capabilities ... 4-1
4.2 Use load balancers to distribute incoming requests ... 4-1
4.3 Choose and implement appropriate application failover ... 4-1
4.4 Ensure all network components are redundant ... 4-3
5 Database Configuration ... 5-1
5.1 Real Application Clusters ... 5-1
5.2 Oracle Data Guard ... 5-8
5.3 Backups and Recovery Manager ... 5-34
6 MAA within the Data Center ... 6-1
7 Monitoring and Managing with Enterprise Manager ... 7-4
7.1 Monitoring ... 7-4
7.2 Managing ... 7-11
7.3 Enterprise Manager Architecture for High Availability ... 7-12
7.4 EM Configuration ... 7-15
8 Outages ... 8-1
8.1 Unscheduled Outages ... 8-1
8.2 Scheduled Outages ... 8-4
9 Recovering from Outages ... 9-1
9.1 Recovery Operations Overview ... 9-1
9.2 Client Tier - Site Failover (Includes All Tiers) ... 9-4
9.3 Application Server Tier - Application Server Failover ... 9-7
9.4 Database Server Tier - Object Reorganization ... 9-8
9.5 Database Server Tier - Object Recovery ... 9-13
9.6 Database Server Tier - RAC Failover and Transparent Application Failover ... 9-22
9.7 Database Server Tier - Data Guard Switchover ... 9-23
9.8 Database Server Tier - Data Guard Failover ... 9-35
9.9 Database Server Tier - Standby Instance Failover ... 9-45
10 Restoring Database Fault Tolerance ... 10-1
10.1 Restoring Failed Nodes or Instances within Real Application Clusters ... 10-1
10.2 Restoring the Standby Database after a Failover ... 10-7
10.3 Restoring Fault Tolerance after Secondary Site or Cluster-Wide Scheduled Outage ... 10-11
10.4 Restoring Fault Tolerance after Standby Database Data Failure ... 10-13
10.5 Instantiation of the Standby Database ... 10-16
10.6 Restoring Fault Tolerance after Dual Failures ... 10-17
11 Conclusion ... 11-1
12 Appendix A - Risk Assessment of Major Architectural Changes ... 12-1
13 Appendix B - Operational Best Practices ... 13-1
13.1 Logistical Best Practices ... 13-1
13.2 Technical Best Practices ... 13-13
14 Appendix C - Database SPFILE and Oracle Net Configuration File Samples ... 14-1
14.1 Database System Parameter File (SPFILE) Sample ... 14-2
14.2 Oracle Net Configuration File Samples ... 14-8
15 Appendix D - Test Configurations ... 15-1
15.1 RAC Active/Active Test Configuration ... 15-1
15.2 Data Guard Primary Site/Secondary Site Test Configuration ... 15-2
15.3 SAME Test Configuration ... 15-2
16 Appendix E - Object Reorganization and Recovery Examples ... 16-1
16.1 Object Reorganization Examples ... 16-1
16.2 Object Recovery Examples ... 16-5
17 Appendix F - Server Parameter File (SPFILE) ... 17-1
17.1 SPFILE Creation ... 17-1
17.2 SPFILE Administration ... 17-3
List of Tables
Table 3-1: Contents of each SAME set per site ... 3-3
Table 4-1: Application Failover Methods ... 4-3
Table 4-3: Redundant Network Components ... 4-3
Table 5-1: RAC Initialization Parameter Best Practices ... 5-2
Table 5-2: Standby Database Type ... 5-8
Table 5-3: Determining the Standby Database Type ... 5-8
Table 5-4: Initialization Parameter Settings for Oracle Data Guard ... 5-12
Table 5-6: Archiving Rules ... 5-13
Table 5-7: Protection Modes ... 5-19
Table 5-8: Determining the Protection Mode ... 5-19
Table 5-9: 200 User Test Effect on Primary ... 5-22
Table 5-10: Logical Standby MAX_SERVERS Example ... 5-31
Table 7-1: Unscheduled Outages for EM ... 7-14
Table 8-1: Unscheduled Outages ... 8-1
Table 8-2: Unscheduled Outages on the Production Site ... 8-3
Table 8-3: Unscheduled Outages on the Secondary Site ... 8-3
Table 8-4: Scheduled Outages on the Primary Site ... 8-5
Table 8-5: Scheduled Outages on the Secondary Site ... 8-6
Table 8-6: Preparing for Scheduled Secondary Site Maintenance ... 8-7
Table 9-1: Recovery Operation ... 9-1
Table 9-2: Summary of Object Reorganization Solutions ... 9-8
Table 9-3: Decision Matrix for Object Recovery of a Table or Index ... 9-16
Table 9-5: Object Recovery Decision Criteria Descriptions ... 9-16
Table 9-7: Local Object Recovery Solutions ... 9-17
Table 9-8: Decision Matrix for Data Guard Switchover ... 9-23
Table 9-9: Decision Matrix for Data Guard Failover ... 9-35
Table 9-10: Using Failover for Different Types of Outages ... 9-36
Table 9-11: Using Failover for Different Types of Outages ... 9-42
Table 10-1: Performance Impact of Restarting or Rejoining a Node or Instance ... 10-1
Table 10-2: Restoration and Connection Failback ... 10-3
Table 10-3: Re-Creating the Production and Standby Databases ... 10-17
Table 12-1: MAA vs. Alternative Architectures ... 12-1
Table 13-1: Prevention, Detection, and Repair of Completely Unavailable Objects ... 13-15
Table 13-2: Prevention, Detection, and Repair of Partially Unavailable or Corrupted Objects ... 13-18
Table 13-3: Prevention and Detection of Unscheduled Outages ... 13-19
Table 17-1: Effect of SCOPE Option ... 17-3
List of Figures
Figure 1-1: The HA Puzzle ... 1-9
Figure 1-3: Oracle HA Solutions ... 1-10
Figure 2-1: Maximum Availability Architecture ... 2-1
Figure 2-3: Maximum Availability Network Configuration ... 2-4
Figure 3-1: Detailed View of the Storage Array ... 3-4
Figure 3-2: Server Hardware Configuration ... 3-6
Figure 4-1: Network Configuration (hb=heartbeat) ... 4-4
Figure 5-1: Primary and Secondary Archive Destinations ... 5-15
Figure 5-3: 200 User Tests ... 5-23
Figure 5-4: LGWR SYNC ... 5-25
Figure 6-1: One Cluster with One Application Server in Each Data Center ... 6-2
Figure 6-3: Two Clusters in Separate Data Centers that Share an Application Server ... 6-2
Figure 7-1: EM MAA Architecture ... 7-13
Figure 9-1: Three-Tier Maximum Availability Architecture ... 9-2
Figure 9-2: Network Routes Before Site Failover ... 9-4
Figure 9-3: Network Routes After Site Failover ... 9-5
Figure 9-4: Application Server Failure in an Application Server Farm ... 9-7
Figure 9-6: Standby Instance Failover ... 9-49
Figure 10-1: Partitioned 2-Node RAC Database ... 10-4
Figure 10-3: Failure of a RAC Instance in a Partitioned Database ... 10-4
Figure 10-5: Nonpartitioned RAC Instances ... 10-5
Figure 10-7: Failure of One RAC Instance with Nonpartitioned Application ... 10-5
Figure 10-9: Failover Timeline ... 10-9
Figure 12-1: RAC Production and Data Guard Single Instance ... 12-2
Figure 13-1: Change Management Flow ... 13-4
Figure 15-1: RAC Active/Active Test Configuration ... 15-1
Figure 15-2: Data Guard Test Configuration ... 15-2
Figure 15-3: SAME Test Configuration ... 15-3
Acknowledgments
We wish to thank the following Oracle partners and people for their valuable contribution to the Maximum Availability Architecture.
Abstract
Choosing and implementing the architecture that best fits your availability requirements can be a daunting task. This architecture must encompass redundancy across all components, achieve fast client failover for all types of outages, provide consistently high performance, and protect against user errors, corruptions, and site disasters, while being easy to deploy, manage, and scale. This paper describes a technical architecture that removes the complexity of designing a highly available (HA) architecture for your business. Maximum Availability Architecture (MAA) is a simple, redundant, and robust architecture that prevents, detects, and recovers from different outages within a small mean time to recovery (MTTR), and prevents or minimizes downtime for maintenance. This architecture is a complete solution built on proven Oracle HA technology and exemplifies Oracle's Unbreakable architecture.

As more system capabilities become available, IT managers, architects, system administrators, and database administrators often find it difficult to integrate a suitable set of features into one unified high availability solution that fits all of their business requirements, or to incorporate and leverage a new HA feature in their existing architecture. The purpose of MAA is to remove the complexity of designing the correct highly available architecture. Instead of focusing on features and point solutions, MAA incorporates Oracle's proven HA technologies and best practice recommendations to maximize system availability and make it easy to set up and configure, while providing the following benefits:
- MAA gives you the ability to control the length of time needed to recover from an outage and the amount of acceptable data loss under disaster conditions, allowing MTTR to be tailored to business requirements.
- MAA reduces the implementation cost of a highly available Oracle system by providing detailed configuration guidelines. The results of performance impact studies for different configurations are highlighted to ensure that your highly available architecture continues to perform and scale according to your needs.
- MAA provides best practices and recovery steps to eliminate or minimize the downtime that can result from scheduled and unscheduled outages such as human errors, system faults and crashes, maintenance, data failures, corruptions, and disasters.
Organization
Although this document focuses on building MAA, it can also be used as a reference guide for answering your toughest HA questions or for configuring individual parts of MAA such as Data Guard or Real Application Clusters (RAC). MAA illustrates Oracle's best, easy-to-manage HA architecture, while the configuration best practices and the outage and repair sections can help any DBA or system administrator build and test part or all of MAA. This paper is targeted toward IT managers, database administrators, system administrators, consultants, and architects.

Part I: Getting Started with MAA: The Introduction and Architecture Overview provide an executive view of the MAA goals, architecture, and components.

Part II: MAA Configuration Best Practices: Describes the configuration best practices for building the Maximum Availability Architecture (MAA). These sections focus on what needs to be implemented and why, and are primarily intended for database administrators, system administrators, and consultants interested in implementing all or part of the MAA configuration best practices.

Part III: Monitoring and Managing Highly Available Environments: The Monitoring and Managing with Enterprise Manager section describes how Enterprise Manager provides a sophisticated monitoring and management framework for any highly available architecture, including MAA.

Part IV: Outages, Recovering from Outages, and Restoring Fault Tolerance: The Outages, Recovering from Outages, and Restoring Fault Tolerance sections justify this architecture by providing the best solutions for a list of
unscheduled and scheduled outages, and provide details for restoring high availability after an outage. These sections are directed at all audiences, but go into great detail for those who need to implement or test MAA's different HA solutions and recovery steps.

Part V: Appendix: These sections contain further information on best practices, risk assessments, and test configurations.
Knowledge of Oracle Server, Real Application Clusters and Data Guard terminology is required to understand the configuration and implementation details. Please refer to Oracle9i Database Concepts, the Oracle Data Guard documentation set, and the Real Application Clusters documentation set for related information. MAA is designed, tested, and validated by the Oracle Server Technologies High Availability Systems Group and is being validated and deployed in numerous customer sites around the world.
Oracle Data Guard - Added a recommendation to use the database parameter _LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE' when using the ARCH process to transmit redo data over slow network links in Oracle9i Release 9.2.0.5 or later.
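As a minimal sketch of how a recommendation like this is typically applied (assuming an SPFILE-managed instance; underscore parameters should only be set when Oracle recommends them, as here):

```sql
-- Sketch: enable LOCAL_FIRST archiving behavior for ARCH-based redo
-- transmission over slow network links (Oracle9i Release 9.2.0.5 or later).
-- Underscore parameters must be quoted in ALTER SYSTEM.
ALTER SYSTEM SET "_LOG_ARCHIVE_CALLOUT" = 'LOCAL_FIRST=TRUE' SCOPE=SPFILE;
-- Restart the instance for the change to take effect.
```

With LOCAL_FIRST enabled, the archiver completes the local archival before transmitting redo to the remote destination, so a slow link does not delay local archiving.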
July 2003
This section highlights the changes that have been made to the Maximum Availability Architecture in the July 2003 revision.
Enterprise Manager - Added the Monitoring and Managing with Enterprise Manager chapter. Added more details on building a high availability architecture for the Enterprise Manager repository. Moved all Enterprise Manager related topics under this chapter.
Logical Standby Database - Incorporated logical standby as a Data Guard option within MAA by validating and testing the logical standby database, and enhanced the Data Guard configuration, Outages, Recovering from Outages, and Restoring Fault Tolerance sections to include logical standby.
Oracle Data Guard - Added best practices derived from Data Guard (physical and logical standby) performance tests and customer cases. Added a decision matrix to help choose the best standby configuration, and adjusted the protection mode section.
February 2003
This section highlights the changes that have been made to the Maximum Availability Architecture in the February 2003 revision.
Added the What's New section.

Enterprise Manager - Added a high availability architecture for the Oracle Enterprise Manager (OEM) repository and updated the outage matrix to describe failure cases and solutions for all three tiers of OEM (agent, Oracle Management Server (OMS), and repository).
Real Application Clusters - Updated the Real Application Clusters Configuration Best Practices section to include control file size and CONTROL_FILE_RECORD_KEEP_TIME recommendations.
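For illustration, CONTROL_FILE_RECORD_KEEP_TIME is set like any other dynamic initialization parameter; the value below is an example only, not the paper's recommendation:

```sql
-- Keep reusable control file records (e.g., RMAN backup records) for
-- 30 days before they can be overwritten; the parameter is in days,
-- and larger values grow the control file rather than recycle records.
ALTER SYSTEM SET CONTROL_FILE_RECORD_KEEP_TIME = 30 SCOPE=BOTH;
-- Verify the current setting in SQL*Plus:
SHOW PARAMETER control_file_record_keep_time
```

Choose a value at least as long as your backup retention window so RMAN metadata is not aged out of the control file prematurely.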
Updated the Restoring Full Database Fault Tolerance section to clarify the steps involved when restoring the standby database after a failover or forced failover.
Backup and Recovery - Updated the Backup and Recovery Configuration Best Practices section. Removed Appendix E - Fast-Start Checkpointing Overview and Recommendation; the information in this appendix is now available as a separate paper on the Oracle Technology Network (OTN) (http://otn.oracle.com) titled Oracle9i Fast-Start Checkpointing Best Practices. Removed Appendix I - Oracle Recovery Tuning Best Practices; the information in this appendix is now available as a separate paper on OTN titled Oracle9i Media Recovery Best Practices. Removed Appendix H - RMAN Considerations for Backups; this material has been included in the Backup and Recovery Configuration Best Practices section.
1 Introduction
The MAA goals are:

- A complete out-of-the-box HA solution and experience
- Unbreakable architecture
- Integrated best practices to prevent, detect, and repair every outage
This paper and project will be a continual investment until HA can fully satisfy the above criteria for all customers. This paper starts by providing the database and system blueprint using RAC and Data Guard as the foundation. Future releases and revisions will integrate Oracle Application Server, Collaboration Suite, and E-Business Suite. Furthermore, the paper will continue to be updated with validated new features, results from different performance studies, and more material on manageability. For most customers, this document provides an HA reference for configuring HA solutions by providing a roadmap to Maximum Availability Architecture and all its components. As Figure 1-1 illustrates, MAA tackles the problem of what is required for HA, how to configure the HA architecture and its components, and how to recover from outages.
MAA addresses even your most demanding outages. Figure 1-2 illustrates how Oracle has solutions for different outages. MAA goes beyond listing Oracle features by providing detailed steps for recovering from different outages. The steps are derived from in-house testing and from collaborative customer testing, validation, and discussions. The outage and solution section provides more details.
This paper is divided into the following parts and sections:

Getting Started with MAA
- Introduction
- Architecture Overview

MAA Configuration Best Practices
- System Configuration
- Network Configuration
- Database Configuration
- MAA in the Data Center

Monitoring and Managing with Enterprise Manager

Outages, Recovering from Outages, and Restoring Fault Tolerance
- Outages
- Recovering from Outages
- Restoring Fault Tolerance
The Architecture Overview section provides an executive view of the architecture and its components. Under MAA Configuration Best Practices, configuration guidelines are given for the system, network, and database, as well as for bringing MAA into an existing data center. The Monitoring and Managing with Enterprise Manager section identifies which significant events need to be monitored in an HA environment and what can be managed using Enterprise Manager. The Outages and Recovering from Outages sections justify this architecture by providing the best solutions for a list of scheduled and unscheduled outages. After a failover operation, or after resolving a database outage, use the Restoring Fault Tolerance section to restore complete high availability to the database. This paper provides detailed configuration best practices and solutions to help prevent and repair a wide range of outages across the entire architecture. The core content focuses on configuring and maintaining a highly available database within a three-tier architecture.
2 Architecture Overview
MAA provides a simple, redundant, and robust architecture that prevents different outages or recovers from an outage within a small mean time to recovery (MTTR). The goal is for most outages to have no or minimal impact on availability, while catastrophic outages can be repaired in less than 30 minutes. MAA encompasses the following main components:
- Redundant middle or application tier
- Redundant network infrastructure
- Redundant storage infrastructure
- Real Application Clusters (RAC) to protect from host and instance failures
- Oracle Data Guard (DG) to protect from human errors and data failures and to recover from site failures
- Sound operational practices
[Figure 2-1: Maximum Availability Architecture - a primary site (Oracle9iAS application servers and a RAC database) connected by Data Guard to an identically configured secondary site (Oracle9iAS application servers and a RAC standby database).]
Figure 2-1 provides an overview of MAA. MAA recommends identically configured sites. Each site consists of redundant components and redundant routing mechanisms, so that requests can always be serviced even in the event of a failure. Most outages are resolved locally. Client requests are always routed to the site playing the production role. Initially, as shown in Figure 2-1, the primary site contains the production database and plays the production role, while the secondary site contains a physical standby database, a logical standby database, or both, and plays the standby role. The roles switch after a scheduled switchover
operation or an unplanned failover operation. Even though the roles can change, the primary and secondary site labels are constant. After a catastrophic outage, client requests are routed to the other site, which assumes the production role. Each site contains a set of application servers or mid-tier servers. The site playing the production role contains a production database that uses RAC to protect from host and instance failures. The site playing the standby role contains one or two standby databases managed by Data Guard. The Data Guard switchover and failover functions allow the roles to be traded between sites; a detailed description of these operations and when they should occur appears in the Recovering from Outages section. We advocate identical site configurations for the primary and secondary sites to ensure that performance is not sacrificed after a failover or switchover, although special considerations may be required if you plan to deploy both physical and logical standby databases. Symmetric sites allow processes and procedures to be kept the same between sites, making operational tasks much easier to maintain and execute. Furthermore, ensure that upgrades and software changes made on the primary site are also applied on the secondary site, and vice versa. In all cases, remote software synchronization must be maintained manually or with third-party solutions. The following sections give a brief overview of each component of MAA:
- Application tier
- Network infrastructure
- Storage infrastructure
- Real Application Clusters
- Data Guard and the secondary site
- Operational best practices
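In Oracle9i, the Data Guard switchover mentioned above is driven by a short sequence of SQL*Plus statements. The following is only a rough sketch of a physical standby switchover without the Data Guard broker; the complete, configuration-specific procedures are given later in this paper:

```sql
-- On the current primary: convert it to a physical standby.
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;
SHUTDOWN IMMEDIATE;
STARTUP NOMOUNT;
ALTER DATABASE MOUNT STANDBY DATABASE;

-- On the old standby (after it receives the end-of-redo indicator):
-- convert it to the new primary and open it for service.
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
SHUTDOWN IMMEDIATE;
STARTUP;
```

Because a switchover involves no data loss, the original primary can immediately resume the standby role, which is what keeps the symmetric-site model described above intact.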
Because each server farm is redundant across multiple sets of machines, problems and outages of individual application-tier hosts within a site are transparent to the client. Automatic detection by the monitoring infrastructure and, where applicable, restart of a failed application-tier component ensure near-uninterrupted availability of application services.
[Figure 2-1: Maximum Availability Architecture. Clients connect over the Internet through a WAN traffic manager to the primary site or, after a failover, to the secondary site. Each site is built in three tiers: Tier 1 contains redundant routers and firewalls; Tier 2 contains active and standby hardware-based load balancers, redundant switches, and firewalled app/web, IMAP, and LDAP servers; Tier 3 contains redundant switches and routers in front of the RAC instances and the database. Heartbeats (hb) link the redundant components and connect the primary and secondary sites.]
- Full redundancy for all hardware components
- Online parts replacement (hot-swappable parts)
- Online patch application
- Hardware mirroring and striping capabilities
- Mirrored write cache with battery backup
- Load balancing and failover capabilities across all host bus adapters
- Availability: provides near-continuous access to data, with minimal interruption from hardware and software component failures
- Scalability: allows nodes to be added to the cluster to increase processing capability without having to redistribute data or alter the user application
- Manageability: provides a single system image to manage
RAC allows continuous data availability in the event of component, instance, or node failure. If an instance or node fails, the surviving instances automatically perform recovery for the failed instance and continue to provide database service. User data remains accessible as long as at least one instance is running in the cluster. Along with effectively handling unscheduled outages (such as instance or node failures), RAC gives the administrator the ability to perform scheduled maintenance on a subset of nodes or cluster components while continuing to provide service to users. RAC automatically harnesses the processing power of additional nodes as they are brought into the cluster, providing scalability, potentially without downtime. With RAC's Cache Fusion architecture, it is not necessary to re-partition data or modify an application to take advantage of the additional CPU power or additional I/O and network bandwidth made available when nodes are added to or removed from the cluster. RAC can also automatically balance new database connection requests among the available instances, based on lowest processing load and fewest connections. Because an instance can provide load data to listeners and cross-register with remote listeners, each listener is aware of all services, instances, dispatchers, and their current loads, regardless of their location. Thus a listener can send an incoming client request for a specific service to the least-loaded node, instance, or dispatcher. A key component of RAC availability and scalability is the private interconnect. The interconnect is a communication facility that links the nodes in the cluster, routing messages, data, and other cluster communications traffic to coordinate each node's access to shared resources. For high availability, the interconnect must be redundant, so that a single link failure from a failed adapter, cable, or switch does not isolate a node from the rest of the cluster.
To ensure scalability, particularly with the Cache Fusion architecture, the interconnect must be a high-bandwidth, low-latency link. Ideally, the cluster can fully utilize the redundant links and balance load across the multiple interconnect paths. Because a RAC environment is a single database accessed by multiple instances, a single system image is preserved across the cluster for all database operations, which simplifies manageability: DBAs perform configuration, HA operations, recovery, and monitoring functions once, and Oracle automatically distributes the management functions to the appropriate nodes. In effect, the DBA manages one virtual server.
The Real Application Clusters Guard (RACG) feature of RAC provides an enhanced HA solution by coupling the availability advantages of RAC with integrated monitoring, connection failover, and hardware clustering. RACG is intended for environments with the strictest availability requirements.
- Redundant middle or application tier
- Redundant network infrastructure
- Redundant storage infrastructure
- An identical Real Application Clusters environment to protect from host and instance failures when in the production role
- Sound operational practices
Data Guard with a physical standby database provides the following benefits:

- Availability: provides protection from human errors, data failures, primary site failures, and disasters; provides switchover operations for primary site maintenance; and provides different database protection modes, up to and including no-data-loss environments
- Manageability: provides a management and monitoring framework for log transport services and managed standby recovery; role management services, such as switchover and failover; and the ability to offload backups and read-only activities from the production database

Data Guard with a logical standby database provides the following additional benefit:

- Reporting and read-write accessibility: allows the logical standby database to be leveraged for reporting or decision support as well as for disaster recovery. The logical standby's SQL apply engine transforms the production database's redo into SQL and applies it to an opened database. Additionally, the logical standby database supports adding objects, such as indexes and materialized views.
A delay in redo application at the standby database should be configured so that a logical corruption or error, such as dropping a table, can be detected before the change is applied to the standby database. In addition, adequate monitoring and detection must be in place to ensure that errors are caught within the specified delay interval. Zero data loss (no lost transactions) can be configured for physical standby databases only; however, architectural design decisions must be considered to minimize the performance impact. Using the standby database, most database failures are resolved faster than by using on-disk backups, since database recovery time is dramatically reduced. Refer to the Data Guard configuration section to determine whether physical, logical, or both standby databases fit your requirements, and for best practices and recommendations that minimize the performance impact of all protection modes.
- Preventing outages
- Detecting potential problems
- Recovering from outages within a tolerated MTTR
Operational best practices have been categorized into logistical and technical components. The logistical component includes those practices that are the foundation of managing the IT infrastructure and are geared towards process and policy management. Some of the logistical best practices include having sound change management policies, backup and recovery planning, disaster recovery planning, scheduled outage planning, adequate staff training, thorough documentation practices, and sound security policies and procedures. These processes and policies allow IT to prevent most problems from occurring and provide recovery plans when a problem does occur. The technical component covers the specific technical detail and infrastructure used to prevent, detect, and resolve a problem. Technical best practices include the following:
- QA and test systems to allow thorough testing before deployment
- A redundant, secure system stack to prevent single points of failure and malicious acts from causing downtime
- A monitoring infrastructure to quickly detect, prevent, notify about, and possibly resolve problems
- An automated recovery infrastructure to resolve the most common outages
For more information on the logistical and technical best practice components, please refer to the Operational Best Practices appendix.

Built upon open Internet standards, the Oracle Enterprise Manager (OEM) suite of products provides the first comprehensive management framework designed to support multiple, heterogeneous environments. The OEM console is the primary interface for performing all management tasks, and a reliable and scalable multi-administrator repository offers the option of cooperative management. All OEM areas of management can also be accessed from anywhere using a standard web browser. Using the OEM product family, database administrators and IT managers can increase productivity, deliver better services, and reduce the total cost of information systems. OEM features:
- Unified architecture for managing the complete Oracle environment
- Centralized console for single point of management
- Real-time monitoring
Oracle Enterprise Manager is Oracle's single, integrated solution for administering and monitoring global enterprises. High availability demands continuous service availability, scalability, simplified management, service performance optimization, and reporting. To meet these requirements, Oracle Enterprise Manager offers real-time event monitoring, distributed database administration, collection and analysis of performance and availability data, automated tuning of the Oracle environment, and well-integrated reporting capabilities. It provides effective systems management that enables administrators to centrally view their systems; associate and organize disparate but linked services, such as databases and applications; and effectively monitor and respond to the health of these systems on a 24x7 basis. For more information on Enterprise Manager, please refer to the Monitoring and Managing with Enterprise Manager section.
Part II describes the configuration best practices in building the Maximum Availability Architecture (MAA). It focuses on what needs to be implemented and why. The following configuration best practices sections are included:
- Section 3, System Configuration
- Section 4, Network Configuration
- Section 5, Database Configuration
- Section 6, MAA in the Data Center
3 System Configuration
The overall goal of configuring an MAA environment is to create a redundant, reliable system and database without sacrificing simplicity and performance. This section provides recommendations for configuring the sub-components that make up the MAA database server tier. This section is divided into the following subsections:

- Storage
- Server Hardware
- Server Software (including Oracle software)
Note: The following sections highlight some of the key aspects of maintaining high availability with regard to server hardware and software. However, it is important to reference your vendor's high availability architecture, best practices, and recommendations for the hardware, operating system, and cluster configurations.
3.1 Storage
Electronic data is one of the most important assets of any business. Storage arrays that house this data must protect it and keep it accessible to ensure the success of the company. This section describes the characteristics of a fault-tolerant storage subsystem that protects data while providing manageability and performance. The following storage recommendations are expanded below:

- Ensure that all hardware components are fully redundant and fault-tolerant
- Use an array that can be serviced online
- Use hardware-level RAID capabilities
- Load balance I/O across all physical interfaces
- Configure disks using the Stripe and Mirror Everything (SAME) methodology for new configurations
- Evaluate the need to implement Hardware Assisted Resilient Data (HARD)
Note: The following sections highlight some of the key aspects of a highly available storage subsystem. However, it is important to reference your storage vendor's high availability architecture, best practices, and recommendations for the storage array hardware, operating system, and cluster configurations.
3.1.1 Ensure that all hardware components are fully redundant and fault-tolerant
All hardware components of a storage array must be fully redundant, from physical interfaces to physical disks, including redundant power supplies and connectivity to the array itself. The storage array should contain one or more spare disks (often called hot spares). When a physical disk starts to report errors to the monitoring infrastructure, or fails suddenly, the firmware should immediately restore fault tolerance by mirroring the contents of the failed disk onto a spare disk.
Connectivity to the storage array must be fully redundant (referred to as multipathing) so that the failure of any single component in the data path from any node to the shared disk array (such as controllers, interface cards, cables, and switches) is transparent and keeps the array fully accessible. Multipathing allows the same logical device to be addressed through multiple physical paths; a host-based logical volume manager is responsible for reissuing the I/O on one of the surviving physical interfaces. If the storage array includes a write cache, the cache must be mirrored to guard against memory board failures, and it should be protected by multiple battery backups to guard against a failure of all external power supplies. The cache must be able to survive on battery backup either until external power returns or until all dirty blocks in the cache are guaranteed to be fully flushed to physical disk.
3.1.5 Configure disks using the Stripe and Mirror Everything (SAME) methodology
The SAME configuration provides a simple, efficient, and highly available storage configuration. Its basic idea is to make extensive use of striping across large sets of disks. The paper Optimal Storage Configuration Made Easy (http://otn.oracle.com/deploy/availability/pdf/oow2000_same.pdf) discusses the SAME configuration in detail. An internal Oracle study validated the assertions made in that paper; testing details that provide further validation of the SAME methodology can be found on OTN at http://otn.oracle.com/deploy/availability/pdf/SAME_HP_WP_112002.pdf. When considering SAME for MAA, evaluate the following:
- One shared data area SAME set for the shared raw volumes containing the Oracle data files, control files, and online and standby redo log files.
- One backup area SAME set. This area contains files that are used for recovery and for testing recovery procedures. The backup area consists of the UNIX file system containing the database backups. The backup area is local to one node but must be accessible and mountable by all other nodes in the cluster in case the owning node fails.
- A local data area SAME set per node for the file systems (for Oracle software, configuration files, and archived redo log files) mounted on each node in the cluster. The file systems should be striped over as many disks as required to support the file system capacity requirements.
- Allocate a SAME set for each node's local data area file systems
- Allocate a SAME set for the backup area, sized to accommodate all database backups
- Allocate the remaining disks in the storage array to a SAME set for the shared data area
The SAME sets are summarized below:

Shared data area
  Contents: control files, data files, online and standby redo log files
  Quantity: one, shared by all nodes
  Comments: stored on a shared disk group addressable by all hosts in the cluster

Backup area
  Contents: RMAN backup sets or database backups
  Quantity: one, shared by all nodes
  Comments: local to one node, but must be accessible and mountable by all other nodes in the cluster in case the owning node fails

Local data area
  Contents: archived redo log files and configuration files
  Quantity: one on each node in the cluster
  Comments: stored on an exclusive disk group and mounted by a single host
The following diagram provides a sample breakdown of the contents of each disk in a storage array.
[Storage array diagram: the shared data area is striped over 20 disks, and the backup area is striped over 20 disks.]
3.1.6 Evaluate the need to implement Hardware Assisted Resilient Data (HARD)
The HARD initiative (described in End-to-End Validation of Oracle Database Blocks, Wei Hu, J. Bill Lee, and Juan Loaiza, Oracle Corporation, November 2001) is a program designed to prevent data corruptions before they happen. Data corruptions are very rare, but when they do occur they can have a catastrophic effect on a business. Under the HARD initiative, Oracle's storage partners implement Oracle's data validation algorithms inside their storage devices, making it possible to prevent corrupted data from being written to permanent storage. The goal of HARD is to eliminate a class of failures that the computer industry has so far been powerless to prevent. RAID has gained a wide following in the storage industry by ensuring the physical protection of data; HARD takes data protection to the next level by going beyond protecting physical data to protecting business data. Oracle has worked with leading storage vendors to
implement Oracle's data validation and checking algorithms in the storage devices themselves. The classes of data corruptions that Oracle addresses with HARD include:
- Writes of physically or logically corrupt data file, control file, and log file blocks
- Writes of Oracle blocks to incorrect locations
- Erroneous writes to Oracle data by programs other than Oracle
- Writes of partial or incomplete blocks
The key approach is end-to-end block validation, whereby the operating system or storage subsystem validates the Oracle data block contents. By validating Oracle data in the storage devices, corruptions are detected and eliminated before they can be written to permanent storage. This goes beyond the current Oracle block validation features, which do not detect a stray, lost, or corrupted write until the next physical read. Storage vendors implement the validation checks based on a specification, and a vendor's implementation may offer unique features specific to its storage technology. Oracle maintains a website that compares each vendor's solution, broken down by product and Oracle version; for the most recent information, see http://otn.oracle.com/deploy/availability/htdocs/HARD.html. In some current implementations of HARD, there is a constraint that the online redo log files must reside in a different SAME set from the Oracle data files and control files, which conflicts with the recommendation that all Oracle data files, redo log files, and control files reside in a single SAME set. Therefore, if the storage uses this type of HARD implementation, an additional SAME set for the online redo log files is needed. When configuring the number of disks in the SAME set for online redo log files, verify that this SAME set can support the redo log generation rate at peak throughput.
The recommendations for server hardware are:

- Use a supported clustered system to run RAC
- Use similar hardware for every node in the cluster
- Choose the proper cluster interconnect
- Use fewer, faster, and denser components
- Use redundant hardware components that are hot-swappable
- Use systems that can automatically detect failures and provide alternate paths around, or fence off, subsystems that have failed
- Protect the boot disk and keep a backup copy
3.2.6 Use systems that can automatically detect failures and provide alternate paths around or isolate subsystems that have failed
Choose a system that continues to run despite a component failure, and automatically works around the failed component without incurring a full outage. For example, find a system that can use an alternate path for I/O requests if an adapter fails or can avoid a bad physical memory location if a memory board fails.
3.3.2 Use the same operating system version, patch level, one-off patches, and driver versions on all nodes
Consistency with OS versions and patch levels reduces the likelihood of encountering incompatibilities or small inconsistencies between the software on all nodes. It is impossible for any vendor to test each and every piece of new software with every combination of software that has been previously released.
4 Network Configuration
In MAA, two sites are created to provide a fault-tolerant environment: the primary site and the secondary site. The two sites are identical, each composed of redundant routing systems, multiple connections to the Internet, redundant load balancers, application server farms, and RAC databases. The secondary site is idle during normal operation and does not take any client requests; traffic is directed to the secondary site only when the primary site cannot provide service due to an outage. The network configuration recommendations are:

- Use WAN traffic managers to provide site failover capabilities
- Use load balancers to distribute incoming requests
- Choose and implement appropriate application failover
- Ensure all network components are redundant
A RAC instance or node failure can be handled by one of the following:
- Transparent Application Failover (TAF)
- Virtual IP address failover
- Application-specific exception handling
Each of these methods is described below. Following the descriptions, each method's advantages and disadvantages are contrasted in Table 4-1, Choosing the Right Application Failover, to assist with selecting the appropriate RAC failover method or combination of methods. MAA focuses on TAF.
Cluster. 2. Reissue any interrupted SQL queries or uncommitted transactions that were rolled back during recovery.
Table 4-1: Choosing the Right Application Failover

Method: Virtual IP address failover
Advantages:
- Any client that can connect to the database via TCP/IP will work with virtual IP addresses
Disadvantages:
- A failed client session must be recovered by the application, which requires application-specific exception handling
- Requires deployment of the application in a cluster

Method: Transparent Application Failover (TAF)
Advantages:
- Automatic failover: clients are automatically reconnected to another node in the cluster
- Provides failover with minimal or no custom coding of the application
- TAF-related callbacks provided by OCI can be used to make an application aware of failovers
- Provides a total failover and failback solution when used with Oracle Net connect-time failover and load balancing
Disadvantages:
- Must predefine the application failover and recovery policy
- Must use failover-aware API calls that are built into OCI
- If a node goes down, the application may not notice it, and TAF may not be triggered until the application attempts to execute another SQL statement

Method: Application-specific exception handling
Advantages:
- Gives more control to the application
Disadvantages:
- Failover logic must be programmed and tested in the application
- Must predefine the application failover and recovery policy
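As a sketch of how TAF is commonly enabled on the client side, a tnsnames.ora entry can include a FAILOVER_MODE clause in its CONNECT_DATA section. The host names, service name, and retry values below are illustrative assumptions, not values from this paper:

```
SALES =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
      (LOAD_BALANCE = yes)
      (FAILOVER = yes))
    (CONNECT_DATA =
      (SERVICE_NAME = sales)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 20)
        (DELAY = 5))))
```

TYPE=SELECT lets in-progress queries resume after reconnection, while METHOD=BASIC establishes the backup connection only at failover time rather than preconnecting.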
Component: Exterior connectivity (including remote IT staff)
Fault tolerance: Multiple Internet service providers (ISPs) ensure that an ISP failure does not make the site unavailable. [Note: The cables should not be housed within the same trunk from the street into the facility; otherwise, a runaway saw or shovel can cut both simultaneously.]

Component: WAN traffic manager
Fault tolerance: Use a primary and a backup WAN traffic manager on each site.

Component: Load balancer
Fault tolerance: Implement a redundant load balancer on each site. The secondary load balancer pair should be identically configured, with a heartbeat to the primary pair; if the primary load balancer pair fails, the secondary initiates a takeover.

Component: Firewall
Fault tolerance: Implement redundant firewalls and load balance across the pair. Because connections through a firewall tend to be long-lived (socket connections rather than HTTP requests), the load-balancing decision needs to be made when the session is initiated and kept for the whole session. Check with your firewall vendor for the functionality available.

Component: Application server farm
Fault tolerance: Implement application server farms, which provide fault tolerance and scalability.
The following diagram depicts a single site in an MAA environment, highlighting the redundant network components.
[Figure: A single MAA site. Redundant routers and firewalls, linked by heartbeats, feed redundant switches serving the app/web, IMAP, and LDAP servers; a second firewall and switch layer fronts two RAC instances, joined by redundant interconnect heartbeats, that access the database.]
5 Database Configuration
This section provides configuration best practices for the following Oracle products and features:
- Real Application Clusters
- Oracle Data Guard
- Backups and Recovery Manager
- Use a server parameter file (SPFILE) for initialization parameters
- Use two control files
- Size the volumes that contain the control files (both primary and standby) to allow for sufficient growth
- Set CONTROL_FILE_RECORD_KEEP_TIME to a value that allows all on-disk backup information to be retained in the control file
- Size production and standby redo logs and groups appropriately
- Multiplex production and standby redo logs
- Enable ARCHIVELOG mode and use automatic archiving
- Enable block checksums
- Enable database block checking
- Log checkpoints to the alert log
- Enable fast-start checkpointing
- Capture timing-related performance statistics
- Use automatic undo management
- Use locally managed tablespaces
- Use automatic segment-space management
- Use temporary tablespaces and specify a default temporary tablespace
- Use resumable space allocation
- Be consistent when setting the THREAD, INSTANCE_NUMBER, and INSTANCE_NAME database parameters
For configuration best practices that are specific to Data Guard, see the next section Oracle Data Guard.
Some of the recommendations listed and described in this section correspond with one or more database parameter settings, which are shown in the table below. Database-wide parameters are prefixed with *. and instance-specific parameters are prefixed with the ORACLE_SID value, as they would be specified in the SPFILE. These sample settings are for a database named SALES that has 2 instances with ORACLE_SID values of SALES1 and SALES2. The recommendations pertaining to the parameters in the table, as well as the recommendations that do not have a corresponding database parameter, are detailed in the text below.
Table 5-1: RAC Initialization Parameter Best Practices
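The prefix notation described above can be sketched as follows. The parameter names and values here are illustrative examples for the two-instance SALES database, not the actual contents of Table 5-1:

```
*.undo_management=AUTO        # database-wide: applies to all instances
SALES1.instance_number=1      # instance-specific: applies to SALES1 only
SALES2.instance_number=2
SALES1.thread=1
SALES2.thread=2
```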
5.1.3 Size the volumes that contain the control files (both primary and standby) to allow for sufficient growth
As archive logs are generated and RMAN backups are made, Oracle adds new records to the reusable section of the control file. If no records are available for reuse (because all records are still within the number of days specified by CONTROL_FILE_RECORD_KEEP_TIME), the control file is expanded and new records are added to the control file. The maximum control file size is 20000 database blocks. If db_block_size=8192, the maximum control file size is 156MB. If the control files are stored in pre-created volumes (instead of a cluster filesystem), the volumes that contain the production and standby control files should be sized to accommodate a control file of maximum size. If the control file volume is too small and cannot be extended, existing records in the control file will be overwritten before their intended reuse. This behavior is indicated by the following message in the alert log:
krcpwnc: following controlfile record written over:

Monitor the alert log for this message; its appearance indicates that the control file volume is not correctly sized.
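The maximum size quoted above can be sanity-checked with simple arithmetic:

```python
# Verify the maximum control file size stated above:
# 20000 database blocks at a db_block_size of 8192 bytes.
MAX_CONTROL_FILE_BLOCKS = 20000
DB_BLOCK_SIZE = 8192  # bytes

max_bytes = MAX_CONTROL_FILE_BLOCKS * DB_BLOCK_SIZE
max_mb = max_bytes / (1024 * 1024)
print(f"max control file size: {max_mb:.2f} MB")  # 156.25 MB, i.e. roughly 156MB
```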
5.1.4 Set CONTROL_FILE_RECORD_KEEP_TIME to a value that allows all on-disk backup information to be retained in the control file
CONTROL_FILE_RECORD_KEEP_TIME specifies the number of days that records are kept in the control file before becoming candidates for reuse. Set CONTROL_FILE_RECORD_KEEP_TIME to a value slightly longer than the age of the oldest backup file that you intend to keep on disk, as determined by the size of the backup area. Records older than this value may be reused; however, the backup metadata will still be available in the RMAN recovery catalog. Reference: the Backup and Recovery Best Practices section and Oracle9i Recovery Manager User's Guide
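For example, if the backup area holds seven days of backups, a setting of eight days would satisfy the guideline above. This statement is a sketch; the value is an assumption for illustration:

```sql
-- Illustrative only: keep control file records one day longer than the
-- oldest on-disk backup (assumed to be 7 days old in this sketch).
ALTER SYSTEM SET CONTROL_FILE_RECORD_KEEP_TIME=8 SCOPE=BOTH SID='*';
```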
5.1.5 Size production and standby redo logs and groups appropriately
All production and standby redo logs should be the same size and should be sized to switch logs approximately once per hour during normal activity, and at most every 20 minutes during peak activity. However, the MTTR for an outage that requires failing over to the secondary site is directly affected by the standby database recovery time, and therefore is governed by the redo log file size. Larger redo logs may result in longer recovery times, which must be factored into deciding the size of redo logs. Refer to the Oracle Data Guard section for details. There should be a minimum of four production log groups to prevent LGWR from waiting for a group to be available following a log switch because a checkpoint has not yet completed or the group has not yet been archived. Reference: Oracle9i Database Administrator's Guide and Oracle9i Data Guard Concepts and Administration
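The sizing rule above can be turned into a back-of-the-envelope calculation. The peak redo rate below is an assumed example, not a figure from this paper:

```python
# Size each redo log so that at peak activity a log switch occurs
# at most every 20 minutes (per the recommendation above).
peak_redo_mb_per_minute = 10          # assumed peak redo generation rate
max_switch_interval_minutes = 20      # recommended maximum at peak activity

redo_log_size_mb = peak_redo_mb_per_minute * max_switch_interval_minutes
print(f"minimum redo log size: {redo_log_size_mb} MB")
```

Remember that larger logs lengthen standby recovery time, so the result should be checked against the failover MTTR requirement.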
You should also perform off-peak block checking by using one of the following detection tools:

- RMAN VALIDATE
- DBMS_REPAIR
- DBVERIFY
- ANALYZE TABLE <tablename> VALIDATE STRUCTURE CASCADE
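As a sketch, the ANALYZE form might be run against a specific table during off-peak hours; the schema and table names here are hypothetical:

```sql
-- Hypothetical schema.table; validates the table and its indexes.
ANALYZE TABLE hr.employees VALIDATE STRUCTURE CASCADE;
```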
5.1.10 Log checkpoints to the alert log

Checkpoint activity should be logged to the alert log by setting LOG_CHECKPOINTS_TO_ALERT=TRUE. Checkpoint activity should be monitored to ensure that a current checkpoint completes before the next checkpoint starts. Reference: Oracle9i Database Reference
5.1.11 Enable fast-start checkpointing

Fast-start checkpointing refers to the periodic writes by the database writer (DBWn) processes that flush changed data blocks from the Oracle buffer cache to disk and advance the thread checkpoint. Setting the database parameter FAST_START_MTTR_TARGET to a value greater than zero seconds enables fast-start checkpointing. If the time to recover from an instance or node failure in a RAC environment is not critical to meeting required service levels, set FAST_START_MTTR_TARGET=3600. If controlling the time to recover from an instance failure is necessary to reach required service levels, set FAST_START_MTTR_TARGET to the desired MTTR in seconds (for example, FAST_START_MTTR_TARGET=300). Fast-start checkpointing should always be enabled, for the following reasons:

- It reduces the time required for cache recovery and makes instance recovery time-bounded and predictable.
- If the system is not already near or at its maximum I/O capacity, fast-start checkpointing has a negligible impact on performance.
- It eliminates the bulk writes and corresponding I/O spikes that occur with traditional interval-based checkpoints.
When enabling fast-start checkpointing, the following initialization parameters should be removed or disabled (set to 0): LOG_CHECKPOINT_INTERVAL, LOG_CHECKPOINT_TIMEOUT, and FAST_START_IO_TARGET. Reference: Oracle9i Database Performance Tuning Guide and Reference and the Oracle9i Fast-Start Checkpointing Best Practices paper on the Oracle Technology Network at http://otn.oracle.com.
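A sketch of the combined change, with an assumed 300-second recovery target:

```sql
-- Illustrative settings: enable fast-start checkpointing and disable
-- the older checkpoint parameters, as recommended above.
ALTER SYSTEM SET FAST_START_MTTR_TARGET=300 SCOPE=BOTH SID='*';
ALTER SYSTEM SET LOG_CHECKPOINT_INTERVAL=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET LOG_CHECKPOINT_TIMEOUT=0 SCOPE=BOTH SID='*';
ALTER SYSTEM SET FAST_START_IO_TARGET=0 SCOPE=BOTH SID='*';
```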
5.1.12 Capture timing-related performance statistics
Set TIMED_STATISTICS=TRUE to capture Oracle event timing data to properly monitor performance and diagnose performance problems. This parameter is TRUE by default per the default setting for the STATISTICS_LEVEL database parameter, which is TYPICAL. Effective data collection and analysis is essential for identifying and correcting system performance problems. Oracle provides a number of tools that allow a performance engineer to gather information regarding instance and database performance. Setting TIMED_STATISTICS=TRUE is essential to effectively using the Oracle tools. See the Performance Tuning Guide and Reference documentation in the section on Oracle Tools to Gather Database Statistics for more detail.
5.1.13 Use automatic undo management
With automatic undo management, the Oracle server effectively and efficiently manages undo space, leading to lower administrative complexity and cost. When Oracle internally manages undo segments, undo block and consistent read contention are essentially eliminated by automatically adjusting the size and number of undo segments to meet the current workload requirement. To use automatic undo management, set the following parameters:
UNDO_MANAGEMENT = AUTO
UNDO_RETENTION = <the desired time to retain undo data>
UNDO_TABLESPACE = <a unique undo tablespace for each instance>
Some advanced object recovery features, such as Flashback Query for object-level recovery, require automatic undo management. See Flashback Query in the Recovering from Outages appendix for more details.
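A concrete sketch for the two-instance SALES database used in this paper's examples (the undo tablespace names UNDOTBS1 and UNDOTBS2 and the 900-second retention are hypothetical values):

```
*.UNDO_MANAGEMENT='AUTO'
*.UNDO_RETENTION=900                 # seconds of undo to retain; size to your read-consistency and flashback needs
SALES1.UNDO_TABLESPACE='UNDOTBS1'    # one undo tablespace per instance
SALES2.UNDO_TABLESPACE='UNDOTBS2'
```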
5.1.14 Use locally managed tablespaces
Locally managed tablespaces perform better than dictionary-managed tablespaces, are easier to manage, and eliminate space fragmentation concerns. Locally managed tablespaces use bitmaps stored in the data file headers and, unlike dictionary-managed tablespaces, do not contend for centrally managed resources for space allocations and de-allocations. Reference: Oracle9i Database Administrator's Guide
5.1.15 Use automatic segment-space management
Automatic segment-space management simplifies space administration tasks, thus reducing the chance of human error. An added benefit is the elimination of performance tuning related to space management. It facilitates management of free space within objects such as tables or indexes, improves space utilization, and provides significantly better out-of-the-box performance and scalability. The automatic segment-space management feature is available only with permanent locally managed tablespaces. Reference: Oracle9i Database Concepts
5.1.17 Use resumable space allocation
Resumable space allocation provides a means of suspending, and later resuming, database operations in the event of space allocation failures. Instead of the database returning an error when a space allocation failure occurs, the affected operation is suspended. When the space issue is cleared, the suspended operation is automatically resumed. To automatically enable resumable mode for all sessions, a database level LOGON trigger can be registered. Reference: Oracle9i Database Administrator's Guide
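As an illustrative sketch (the trigger name and the 7200-second timeout are hypothetical), such a database-level LOGON trigger might look like the following; note that sessions also need the RESUMABLE system privilege for this to take effect:

```sql
-- Enable resumable space allocation for every new session
CREATE OR REPLACE TRIGGER enable_resumable
  AFTER LOGON ON DATABASE
BEGIN
  EXECUTE IMMEDIATE 'ALTER SESSION ENABLE RESUMABLE TIMEOUT 7200';
END;
/
```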
5.1.18 Be consistent when setting THREAD, INSTANCE_NUMBER, and INSTANCE_NAME database parameters
Use a consistent naming convention for the THREAD and INSTANCE_NUMBER parameters. INSTANCE_NAME should consist of the database name (DB_NAME parameter) followed by the INSTANCE_NUMBER, and should match the ORACLE_SID environment variable. See the example server parameter file in the appendix. Reference: Oracle9i Real Application Clusters Administration
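A sketch of this convention for the SALES database used in this paper's examples (the thread and instance numbers shown are illustrative):

```
SALES1.THREAD=1
SALES1.INSTANCE_NUMBER=1
SALES1.INSTANCE_NAME='SALES1'   # DB_NAME + INSTANCE_NUMBER, matching ORACLE_SID
SALES2.THREAD=2
SALES2.INSTANCE_NUMBER=2
SALES2.INSTANCE_NAME='SALES2'
```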
Standby Type: Physical Standby (redo apply)
Advantages:
o minimal resource overhead on the primary and the standby databases
o redo apply is the fastest and most efficient approach to apply changes to the standby database
o can be used to offload backups, which can be used to recover the primary database
Considerations:
o physically identical copy of the primary database
o when open in READ ONLY mode, no redo is applied

Standby Type: Logical Standby (SQL apply)
Advantages:
o allows the standby database to be open for normal operations
o allows additional objects to be built and maintained
o performs all of its operations on the standby node, and requires minimal additional processing on the primary nodes
Considerations:
o physical organization and structure of the data can be different
o if used with a physical standby and a failover is done to the physical standby, the logical standby must be reinstantiated
o uses more resources than a physical standby
Questions and Recommendations

1. Do you require strict zero data loss without any data divergence? (For more information, see Determine the proper database protection mode.)
   Yes - use a physical standby database
   No - go to the next question

2. Do you have any unsupported logical standby data types? Run this query:
   SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED ORDER BY OWNER, TABLE_NAME;
   Rows returned - use a physical standby or investigate switching to a supported data type
   No rows returned - go to the next question

3. Do you need to have the standby database open for read and/or write access?
   Yes - go to the next question
   No - use a physical standby

4. Can a logical standby keep up with your primary redo rate? (For more information, see the Oracle9i Data Guard: SQL Apply Best Practices paper at http://otn.oracle.com/deploy/availability/htdocs/maa.htm.)
   Yes - use a logical standby and/or a physical standby
   No - use a physical standby or increase available system resources on the logical standby
Data Guard setup is essential to ensuring that, at the time of switchover or failover operations, both databases work properly and perform their roles within service levels. The following is a list of configuration recommendations, categorized as General (applies to either type of standby), Physical Standby Database Only, and Logical Standby Only. If both a physical standby and a logical standby are used, then the apply processes (Managed Recovery Process (MRP) on a physical standby and the logical standby apply process (LSP) on a logical standby) should run on separate nodes in the RAC, and all of the recommendations below apply.

General
o Use a simple, robust archiving strategy and configuration
o Enable FORCE LOGGING mode
o Establish a recovery delay
o Configure multiple standby instances
o Disable the time-based thread advance feature (ARCHIVE_LAG_TARGET=0) for LGWR
o Unset DB_CREATE_ONLINE_LOG_DEST
o Use Data Guard instead of remote mirroring technology
o Determine the proper database protection mode
o Conduct a performance assessment with your proposed network configuration
o Configure the database and listener for dynamic service registration
o Configure connect-time failover for the Data Guard network service descriptors
o Set up an alternate standby connection
o Evaluate tuning the network in a WAN environment
o For no data loss protection modes, utilize a LAN or MAN network environment
o For no data loss protection modes, set the SYNC=NOPARALLEL attribute
o Use the archive transport for the greatest performance throughput
o In Maximum Performance mode, over a WAN, evaluate implementing SSH port forwarding with compression
o Set the OS network TCP send and receive buffer sizes to the bandwidth-delay product
Physical Standby Database Only
o Use standby redo logs
o In Maximum Performance mode, use the ASYNC attribute with a 10 MB buffer size
o Tune the standby database and host for optimal recovery performance
o Set parallel recovery to 2 times the number of CPUs on one standby host
o Set the parallel recovery buffer size to 4K
o Clear online redo log groups on the standby
Logical Standby Only
o Use supplemental logging and primary key constraints on all production tables
o Set the logical standby MAX_SERVERS SQL Apply parameter
o Increase the initialization parameter PARALLEL_MAX_SERVERS
o Use the default for the MAX_SGA SQL Apply parameter
o Set the _EAGER_SIZE SQL Apply parameter to 1000
o Set the APPLY_DELAY SQL Apply parameter
o Set the TRANSACTION_CONSISTENCY SQL Apply parameter
o Skip SQL Apply for unnecessary objects
o Create database links in both directions
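A hedged sketch of how several of these logical standby recommendations might be applied (parameter values and the SCRATCH schema are illustrative; DBMS_LOGSTDBY.APPLY_SET should be run on the logical standby while SQL Apply is stopped):

```sql
-- On the primary: supplemental logging so SQL Apply can identify rows
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY, UNIQUE INDEX) COLUMNS;

-- On the logical standby: tune SQL Apply (example values)
EXECUTE DBMS_LOGSTDBY.APPLY_SET('MAX_SERVERS', 9);
EXECUTE DBMS_LOGSTDBY.APPLY_SET('APPLY_DELAY', 30);
EXECUTE DBMS_LOGSTDBY.APPLY_SET('TRANSACTION_CONSISTENCY', 'READ_ONLY');

-- Skip apply for objects that are not needed on the standby
EXECUTE DBMS_LOGSTDBY.SKIP('DML', 'SCRATCH', '%');
```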
Table 5-4 lists each of the recommended initialization parameter settings for Data Guard and its options. Database-wide parameters are prefixed with *. as they would be specified in the SPFILE. These sample settings are for a database named SALES that has two instances with ORACLE_SID values of SALES1 and SALES2. The recommendations pertaining to the parameters in the table, as well as the recommendations that do not have a corresponding database parameter, are detailed in the text below; to be thorough, review the complete recommendation list. For corresponding descriptions of the recommended settings, follow the links in the table, which lead to the detailed configuration recommendations. The settings below apply to both the primary and secondary site unless otherwise specified. These settings also use the following premise for the log archive destinations:
Destination 1 always points to the local archive destination.
Destination 2 always points to the physical standby destination, if used, and only applies when in the primary role.
Destination 3 always points to the local alternate destination.
Destination 4 always points to the logical standby destination, if used, and only applies when in the primary role.
Appendix C contains a complete SPFILE example, including parameter settings for RAC, as well as Oracle Net Services file examples. For this recommended-setting overview, the following network services are defined:
SALES_PRIM - points to the primary site database
SALES_SEC - points to the secondary site database (physical standby initially, if used)
SALES_LOG_SEC - points to the secondary site logical database
For additional database parameter details, consult the Oracle9i Data Guard Concepts and Administration and Oracle9i Database Reference documentation.
Database Configuration - Oracle Data Guard

Table 5-4: Initialization Parameter Settings for Oracle Data Guard
*.LOG_ARCHIVE_DEST_STATE_1
  enable

*.LOG_ARCHIVE_DEST_2 (maximum protection or maximum availability modes; only used when a physical standby is used)
  primary site: service=SALES_SEC LGWR sync=noparallel affirm reopen=15 max_failure=10 delay=30
  secondary site: service=SALES_PRIM LGWR sync=noparallel affirm reopen=15 max_failure=10 delay=30

*.LOG_ARCHIVE_DEST_2 (maximum performance mode only; only used when a physical standby is used)
  primary site: service=SALES_SEC LGWR ASYNC=20480 reopen=15 max_failure=10 delay=30 net_timeout=30
  secondary site: service=SALES_PRIM LGWR ASYNC=20480 reopen=15 max_failure=10 delay=30 net_timeout=30

*.LOG_ARCHIVE_DEST_STATE_2
  enable (if role is production); defer (if role is standby)

*.LOG_ARCHIVE_DEST_3
  primary site or secondary site: location=/arch2/SALES arch
  logical standby database: location=/arch2/SALES_LOG arch

*.LOG_ARCHIVE_DEST_STATE_3
  alternate

*.LOG_ARCHIVE_DEST_4 (maximum availability mode; only used when a logical standby is used)
  primary site, if using both physical and logical: service=SALES_LOG_SEC dependency=log_archive_dest_2 reopen=15 max_failure=10 optional delay=30
  primary site, if using only logical: service=SALES_LOG_SEC LGWR sync=noparallel affirm reopen=15 max_failure=10 delay=30 optional
  secondary site, if using logical only: service=SALES_PRIM LGWR sync=noparallel affirm reopen=15 max_failure=10 delay=30 optional
  secondary site, if using both physical and logical (set dest_4 on the physical): service=SALES_LOG_SEC dependency=log_archive_dest_1 reopen=15 max_failure=10 optional delay=30

*.LOG_ARCHIVE_DEST_4 (maximum performance mode; only used when a logical standby is used)
  primary site, if using both physical and logical: service=SALES_LOG_SEC dependency=log_archive_dest_2 reopen=15 max_failure=10 optional delay=30
  primary site, if using only logical: service=SALES_LOG_SEC LGWR async=20480 reopen=15 max_failure=10 delay=30 net_timeout=30 optional
  secondary site, if using logical only: service=SALES_PRIM LGWR async=20480 reopen=15 max_failure=10 delay=30 net_timeout=30 optional
  secondary site, if using both physical and logical (set dest_4 on the physical): service=SALES_LOG_SEC dependency=log_archive_dest_1 reopen=15 max_failure=10 optional delay=30

*.FAL_SERVER
  SALES_PRIM
*.FAL_CLIENT
  SALES_SEC
*.REMOTE_ARCHIVE_ENABLE
  TRUE
*._LOG_ARCHIVE_CALLOUT (Oracle9i version 9.2.0.5 or later with ARCH transport over a very slow network)
  LOCAL_FIRST=TRUE
*.ARCHIVE_LAG_TARGET
  0
*.DB_CREATE_ONLINE_LOG_DEST
  unset
*.LOCAL_LISTENER
  SALES_lsnr
*.SERVICE_NAMES
  SALES
Every instance archives locally to /arch1 and has an alternate archive destination /arch2 The production instances will remotely archive to the same net service name pointing to only one Oracle standby instance. The net service name is the one that connects to the primary standby instance that is normally running managed recovery or the logical apply.
The following table describes the main rules of the archiving strategy and rationale behind each rule.
Table 5-5: Archiving Rules
Archiving Rules
Archiving must be started and remote archiving enabled.
Use a consistent log format (LOG_ARCHIVE_FORMAT).
Local archiving (LOG_ARCHIVE_DEST_1) is done by the archiver process (ARCH).
Create a local alternate archive destination (LOG_ARCHIVE_DEST_3).
The archive directory structure is identical across all production and standby nodes.
LGWR or archiver should archive remotely to only one standby instance and node per standby RAC.
When using a slow WAN, configure ARCH to complete archiving locally before starting to archive remotely.
In Oracle9i version 9.2.0.5 or later, when the parameter _LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE' is set, the ARCH process completes archiving an online log locally and then transfers the archived log to remote destinations. To prevent primary database stalls, LOCAL_FIRST should be enabled in an environment where the redo generation rate exceeds the rate at which redo can be transferred to the standby system.
For simplicity, the standby archive destination, STANDBY_ARCHIVE_DEST, should use the /arch1 directory, the same as the local archive destination (LOG_ARCHIVE_DEST_1) directory. When standby redo logs (SRLs) are present, the standby's ARCH process writes to the local archive destination. If there is a gap, the fetch archive log (FAL) process writes to the standby archive destination. STANDBY_ARCHIVE_DEST must be the same as LOG_ARCHIVE_DEST_1 (the local archive directory) on a physical standby. When using both a physical and a logical standby, the logical standby's setting must match the physical standby's LOG_ARCHIVE_DEST_1. When using a logical standby only, this setting must be different from the logical standby's LOG_ARCHIVE_DEST_1 setting to avoid file name collisions.
Archive destinations should be sized to hold all the archived redo log files since the last on-disk backup.
For standby database instantiation, the on-disk backup accompanied with the local archived redo log files can be leveraged to re-create a new standby database. Any node can play the role of the primary standby node since switchover or node failures can occur.
The following example illustrates the recommended initialization parameters for a configuration with only a physical standby. There are two instances, SALES1 and SALES2, running in maximum protection mode.
*.LOG_ARCHIVE_DEST_1='location=/arch1/SALES arch noreopen max_failure=0 mandatory alternate=LOG_ARCHIVE_DEST_3'
*.LOG_ARCHIVE_DEST_2='service=SALES_SEC lgwr sync=noparallel affirm reopen=15 max_failure=10 delay=30'
*.LOG_ARCHIVE_DEST_3='location=/arch2/SALES arch'
*.LOG_ARCHIVE_DEST_STATE_1='enable'
*.LOG_ARCHIVE_DEST_STATE_2='enable'  # defer for the standby instances
*.LOG_ARCHIVE_DEST_STATE_3='alternate'
*.standby_archive_dest='/arch1/SALES'

Note the following observations for this example:
o The PARALLEL attribute is utilized when there are multiple standby destinations. When SYNC is set to PARALLEL, the LGWR process initiates an I/O operation to each standby destination at the same time. Since there is a single standby destination here, the NOPARALLEL option is set to reduce overhead.
o The REOPEN=15 MAX_FAILURE=10 setting denotes that if there is a connection failure, the network server attempts to reopen the connection after 15 seconds and retries up to 10 times.
o The recovery apply delay, DELAY=30, implies that recovery apply is delayed for 30 minutes from the time the log is archived on the physical standby, but the redo transfer to the standby is not delayed. The DELAY setting is discussed further under the Establish a recovery delay recommendation.
o The NET_TIMEOUT=30 setting designates that if there is no reply for a network operation within 30 seconds, the network server errors out due to the network timeout instead of stalling for the default network timeout period (the TCP timeout value).
o If the network operations are being serviced, the maximum retry time for all failed operations is calculated as REOPEN times MAX_FAILURE. In the above case, it is 15 seconds until a reopen; 15 seconds times 10 retries is 150 seconds, or 2.5 minutes.
o The LOG_ARCHIVE_DEST_2 state, LOG_ARCHIVE_DEST_STATE_2, depends on the database role. In a production role, the state is enabled (ENABLE). When the database is in a physical standby role, the state is deferred (DEFER).
Figure 5-1 shows an archive infrastructure based on the settings described above. The primary and secondary environments have the same archive destination structure; however, only the first node of the secondary site or the site with the standby role contains all the archived redo log files. If LOG_ARCHIVE_DEST_1 fills up, ARCH automatically archives to the alternate archive destination defined with LOG_ARCHIVE_DEST_3.
Make the archive destinations accessible to any node within the cluster: either leverage a shared file system technology, such as a cluster file system, global file system, or highly available network file system (HA NFS), or have a manual procedure to quickly mount the file system on any node within the cluster. This is needed when media recovery or standby recovery is required and all archived redo log files must be accessible on the production database nodes. However, media recovery on the primary site will rarely be the first option, since failing over to the standby is generally quicker and more efficient.
On the standby database nodes, media recovery or standby recovery from a different node is required when node1 crashes and cannot be restarted. In that case, any of the existing standby instances residing on a different node can initiate managed recovery. In the worst case when the standby archived redo log files are inaccessible, the new Managed Recovery Process (MRP) on the different node will fetch the archived redo log files using the FAL server to retrieve from the production nodes directly. When configuring hardware vendor shared file system technology, verify the performance and availability implications. The following issues should be investigated before adopting this strategy:
Is the shared file system accessible by any node regardless of the number of node failures?
What is the performance impact when implementing a shared file system?
Is there any impact on the interconnect traffic?
MTTR service level for site failures
Detection and reaction time for user errors and corruptions
When the standby database is in managed recovery mode and using SRLs, archived redo is automatically applied when a log switch occurs. However, to prevent corrupted or erroneous changes passing from the production database to the standby database, you may want to create a time lag between archiving a redo log at the production host and applying that archived redo log file on the standby database. This delay mechanism actually starts from the time the standby database has archived the log locally, rather than from the time that the production database archives its redo log. We recommend setting the delay to 30 minutes or less, but effectively utilizing this delay is possible only if you have a monitoring infrastructure that detects problems and stops the standby database managed recovery process. The recovery delay setting is critical for standby configurations regardless of the protection mode. You can set the delay by modifying the LOG_ARCHIVE_DEST_N parameter. In MAA, the LOG_ARCHIVE_DEST_2 parameter represents the remote standby destination. To set a lag of 30 minutes, set *.LOG_ARCHIVE_DEST_2 as follows:
*.LOG_ARCHIVE_DEST_2='service=SALES_SEC lgwr sync=noparallel affirm reopen=15 max_failure=10 delay=30' The actual delay option setting must be determined by the following factors:
What is the maximum time for detecting a user error or corruption on the production database?
Once detection is achieved, how quickly can you cancel recovery on the standby database?
What is the expected MTTR for outages that involve user errors?
If the total detection time plus repair time exceeds the desired MTTR, then you need to optimize detection time or reduce the repair time. Reducing the delay directly reduces repair time by reducing the number of archived redo log files required for standby recovery. Repair time can also be reduced by reducing the size of the online and standby redo logs, as described in Size redo logs appropriately. The most common detection mechanism is to monitor the alert log for critical errors, such as ORA-600 or ORA-1578, and to alert and react when the application detects a logical corruption, like a missing table. In our test environments, a 30-minute delay was configured between the production and standby databases.
o Allows for transparent failover to the secondary standby instance if connectivity to the primary standby instance fails. This works only in maximum availability or maximum performance database modes. In this scenario, the MRP session needs to be restarted if it was running on the primary standby instance. Refer to Listener and Oracle Net Configuration.
o Provides a scheduled maintenance solution whenever the primary standby instance and host need to be shut down for maintenance. The secondary standby can take over, and the new Oracle Net service pointing to the secondary standby instance can be leveraged. You can issue the following command on the production database to change the service:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='service=SALES_SEC_ALT lgwr sync=noparallel affirm reopen=15 max_failure=10 delay=30';
SALES_SEC_ALT is the alternative Oracle Net alias pointing to the secondary standby instance. This ALTER SYSTEM statement is necessary only when using maximum protection mode to avoid a production database downtime. Note: Having multiple standby instances is NOT the same as having multiple standby databases. The multiple standby instances are sharing the same database. Only one instance has the managed recovery process (MRP).
o Synchronize non-database files, such as Oracle binaries or other software, between the primary and secondary sites. Note that this may require special consideration for handling user error, such as accidentally deleted software, or software upgrades.
o Synchronize important flat or binary files, such as SPFILEs or init.ora files, between the primary and secondary sites.
Data Guard and physical standby databases should always be chosen over a third-party remote mirroring solution because of the following benefits:
o Protection from user error or data corruption
o Reduced network utilization, because only the redo traffic is transferred
o Role management facilities that provide simple and integrated switchover and failover procedures
o Simplified support and certification by using an Oracle-based solution
o Maximum Protection mode with the LGWR SYNC AFFIRM option for an environment that requires no data loss. Performance overhead is possible. This protection mode is only for a physical standby database.
o Maximum Availability mode with the LGWR SYNC AFFIRM option for an environment that needs no data loss but tolerates data divergence when sites are temporarily inaccessible. Performance overhead is possible. This protection mode is for a physical or a logical standby.
o Maximum Performance mode with the LGWR ASYNC option for an environment that tolerates minimal data loss and data divergence when sites are temporarily inaccessible. Performance overhead is minimized. Use of ARCH instead of LGWR ASYNC provides the least performance overhead on the production database; however, ARCH also has the greatest data loss potential. This protection mode is for a physical or a logical standby.
For further considerations and more detail on this topic, see the Oracle9i Data Guard: Primary Site and Network Configuration Best Practices paper at http://otn.oracle.com/deploy/availability/htdocs/maa.htm.
Protection Modes

Maximum Protection (log writer synchronous I/O)
Advantages:
o Guarantees that all logged transactions are available to the standby database at all times.
Considerations:
o Production database will abort if no standby database is accessible.
o Possible performance degradation, since each log writer I/O must complete both locally and remotely to the online logs before a transaction commit is considered successful.
o An improperly configured network will impact performance.
o Cannot be used with a logical standby.

Maximum Availability (log writer synchronous I/O)
Advantages:
o Guarantees that all logged transactions are available to the standby database while connectivity is established.
o Production database is not aborted when the standby database is inaccessible; however, the standby database is required to be accessible during the startup of a RAC production.
Considerations:
o Potential data divergence between databases while the network or standby database is inaccessible.
o Possible performance degradation, since each log writer I/O must complete locally and remotely to the online redo logs.
o An improperly configured network will impact performance.
o After reestablishing connectivity, the standby is initially data divergent until Oracle completes the automatic gap resolution.
o A RAC production requires accessibility to a standby database at startup.

Maximum Performance (log writer asynchronous network I/O or archiver)
Advantages:
o Minimizes performance degradation due to asynchronous network writes.
o Limits logged transaction loss when the primary site fails by controlling the ASYNC buffer size.
o Production database is not aborted when the standby database is inaccessible.
Considerations:
o Data loss is likely in the case of a primary site failure.
o An improperly configured network will impact performance.
Questions and Recommendations

1. Is data loss acceptable with a primary site failure?
   Yes - Use any protection mode.
   No - Use maximum protection or maximum availability modes only.

2. How much data loss is acceptable in case of a site loss?
   None - Use maximum protection or maximum availability modes only.
   Some - Use maximum performance mode with ASYNC=<blocks>. The value for the number of blocks determines the projected amount of possible redo data loss in the case of a site failure. The current upper limit is 10 MB. For best performance, ASYNC=20480 is recommended. If the production throughput is high and the network round trip is slow, Oracle may revert to the archiver to ensure that the performance impact is minimized.

3. Is potential data loss between the production and standby databases allowed when a standby host or network connection is temporarily unavailable?
   Yes - Use maximum performance or maximum availability modes. Exception: for maximum availability mode, a RAC production requires accessibility to a standby database at startup.
   No - Use maximum protection mode only.

4. What is the minimum distance required to eliminate dual site failures?
   The distance between sites and the network infrastructure between the sites determines network latency. In general, a greater distance implies greater latency. Determine the minimum distance between sites to provide for outage isolation and minimal performance impact to the primary site due to network latency. See the Oracle9i Data Guard: Primary Site and Network Configuration Best Practices paper at http://otn.oracle.com/deploy/availability/htdocs/maa.htm.

5. What is the current or proposed network bandwidth and latency between sites?
   For acceptable performance of a Data Guard configuration, the practical network throughput must be greater than the maximum redo generation rate. Insufficient network throughput can cause increased data loss in the event of a primary site failure or unanticipated impact to primary database performance. The larger the network round-trip time (RTT) between the primary and standby sites, the more impact it has on the Maximum Protection and Maximum Availability protection modes. With maximum performance mode, using an ASYNC log transport setting or using the archiver, the performance impact can be mitigated or hidden, even in high-throughput environments. Besides distance and bandwidth, network throughput can also be affected by various other factors, such as processor limitations, network congestion, buffering inefficiencies, transmission errors, traffic loads, number of network hops, or inadequate hardware designs. When determining the bandwidth, use the maximum bandwidth of the least-capable hop of all of the router hops between the primary and secondary hosts.
   Yes - Use maximum protection or maximum availability modes. Test to determine what acceptable performance degradation is, and validate that your environment complies with that requirement.
   No - Performance tests are required to investigate whether there is 1) a performance impact, and 2) ways to eliminate the impact, such as reducing latency, when implementing different protection modes.
PROTECTION];
Note: The above command can be executed only when the production database is mounted in exclusive mode. This implies that any database protection mode change requires an outage and restart of the production database.
5. After setting the protection mode in exclusive mode, reset the SPFILE parameter for RAC mode, CLUSTER_DATABASE=TRUE, before restarting the instance.
   ALTER SYSTEM SET CLUSTER_DATABASE=TRUE SCOPE=SPFILE;
6. You must choose the correct protection mode initially to avoid production database outages. After a failover to the standby database, the protection mode will automatically downgrade to maximum performance mode. Switchover operations do not change the protection mode setting.
The network between the primary and secondary sites should have:
o Sufficient throughput to accommodate the maximum redo generation rate
o Minimal latency to reduce the performance impact on the production database
o Multiple network paths for network redundancy
o A configuration tuned to provide the most efficient network response time
The required throughput of a dedicated network connection is determined by the production database's maximum redo rate. Network throughput is influenced by bandwidth, latency, and network efficiency, all of which must be accounted for in determining available throughput. Additionally, the chosen database protection mode entails best practice and performance considerations specific to that mode. As described in the Database protection modes section above, the Maximum Protection and Maximum Availability protection modes require the LGWR SYNC transport, and the Maximum Performance protection mode uses the ASYNC transport option or the archiver (ARCHn) instead of the log writer to transfer the redo. Additional best practices and performance considerations related to the chosen protection mode are listed below. These best practices were derived from an Oracle internal performance study that measured the impact of network latency on primary database throughput for each Data Guard transport option: ARCH, LGWR ASYNC, and LGWR SYNC. The details of this study can be found in a paper on the Oracle Technology Network at http://otn.oracle.com/deploy/availability/htdocs/maa.htm. The network infrastructure between the primary and secondary sites must be able to accommodate the redo traffic, since the production database redo data is updating the physical standby database. If your maximum redo traffic at peak load is 8 MB/sec, then your network infrastructure must have sufficient bandwidth to handle this load. Furthermore, network latency can affect overall throughput and response time for OLTP and batch operations.
The database maximum redo rate can be retrieved from Oracle's Statspack reports. Take snapshots during peak redo generation intervals. Redo rate = redo generated (the redo size statistic) / elapsed time.
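If Statspack is not convenient, an approximation can be taken directly from V$SYSSTAT (a sketch: sample the cumulative 'redo size' statistic twice and divide the delta by the elapsed seconds):

```sql
-- Cumulative bytes of redo generated since instance startup
SELECT value FROM v$sysstat WHERE name = 'redo size';

-- Sample again after a known interval; then:
--   redo rate (bytes/sec) = (value_t2 - value_t1) / elapsed_seconds
```

In a RAC environment, repeat this on each instance and sum the per-instance rates to size the network for the whole database.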
When deciding whether to configure Maximum Protection or Maximum Availability mode with LGWR SYNC operations compared to Maximum Performance protection mode with LGWR ASYNC operations, you need to measure whether performance or throughput will degrade due to the incurred latency. You should also check whether the new throughput and response time are within your application performance requirements. Distance and the network configuration directly influence latency, while high latency may slow down your potential transaction throughput and increase response time. The network configuration, number of repeaters, the overhead of protocol conversions, and the number of routers will also impact the overall network latency and transaction response time. Furthermore, TCP receive and send buffer sizes and Oracle's Session Data Unit (SDU) settings can dramatically influence overall application response time. The table and chart below are an excerpt from the Oracle9i Data Guard: Primary Site and Network Configuration Best Practices paper, which can be found on the Oracle Technology Network website at http://otn.oracle.com/deploy/availability/htdocs/maa.htm. These results illustrate a piece of a performance assessment. This test was conducted on a LAN with emulated network round-trip times (RTT) to assess the impact of the different transport options and the network RTT on the primary database throughput. The redo rate (KB/second) was the metric used to quantify the primary database throughput. The redo rate was captured through Statspack (redo size statistic per second) snapshots on each primary instance. The ARCH transport option was used as the baseline to compare the impact of the other transport options. In addition to these statistics, the operating system resources (CPU, memory, I/O) and network volumes were also monitored. These tests showed no bottlenecks associated with OS resources.
Table 5-8 200 User Test Effect on Primary
RTT | ARCH Redo Rate | ASYNC Redo Rate / Scale | SYNC Redo Rate / Scale
(table data not reproduced)
5.2.10
The LOG_ARCHIVE_DEST_2 SERVICE, FAL_SERVER, and FAL_CLIENT initialization parameter settings depend on a proper Oracle Net configuration. For the Data Guard transport service and the gap resolution feature to work, the SPFILE, listener.ora, tnsnames.ora, and sqlnet.ora files must be consistent. Complete samples of these files are in Appendix C. These settings apply only to Oracle Net services between the primary and secondary sites. The remote archive destination, FAL_CLIENT, and FAL_SERVER parameters each require an Oracle Net service, represented as a net service name entry in the local tnsnames.ora file. FAL_CLIENT and FAL_SERVER are used only by a physical standby. Furthermore, we recommend using dynamic service registration instead of a static SID list in the listener configuration. To ensure service registration works properly, the server parameter file should contain the following parameters:
SERVICE_NAMES for the database service name
INSTANCE_NAME for the instance name
LOCAL_LISTENER to specify a non-default listener address
PMON dynamically registers a database service with the listener. PMON will attempt to resolve LOCAL_LISTENER using some naming method. In our case, PMON will find the corresponding name in the local tnsnames.ora file. Example:
SALES1.INSTANCE_NAME=SALES1
SALES2.INSTANCE_NAME=SALES2
*.LOG_ARCHIVE_DEST_2='service=SALES_SEC lgwr sync=noparallel affirm reopen=15 max_failure=10 delay=30'
*.local_listener='SALES_lsnr'
*.service_names='SALES' # required for service registration
*.FAL_SERVER='SALES_PRIM'
*.FAL_CLIENT='SALES_SEC'
The listener.ora file should be identical for each primary and secondary host except for HOST settings. Since service registration is used, there is no need for statically configured information. The local tnsnames.ora file should contain the net service names and the local listener name translation. To use the same service name on each node, you must use a locally managed tnsnames.ora file for the production and standby databases. On the primary cluster, the tnsnames.ora entry, SERVICE_NAME, should equal the setting of the SPFILE SERVICE_NAMES parameter. If the listener is started after the instance, then service registration does not happen immediately. In this case, issue the ALTER SYSTEM REGISTER statement on the database to instruct the PMON background process to register the instance with the listeners immediately. A helpful note on setting up service registration is MetaLink Note: 76636.1.
5.2.11 Configure connect time failover for the Data Guard network service descriptors
Connect-time failover occurs when a connection request is forwarded to another listener if a listener is not responding. Connect-time failover is enabled by service registration, because the listener knows if an instance is running when attempting a connection. In the sample tnsnames.ora file, notice that the SALES net service name contains multiple address lists (two since this refers to two-node clusters) for the production and standby clusters. The second address list allows for connect-time failover in case the first connection fails. This works for both maximum performance and maximum availability modes but does not work for maximum protection mode.
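A minimal sketch of such a net service name follows. This is not the Appendix C sample; host names and ports are assumptions, following the SALES naming used elsewhere in this paper:

```
# tnsnames.ora -- hypothetical SALES entry with both standby cluster
# nodes in one address list; FAILOVER=ON enables connect-time failover
# to the second address if the first listener does not respond.
SALES_SEC =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = standby_host1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = standby_host2)(PORT = 1521))
      (FAILOVER = ON)
    )
    (CONNECT_DATA = (SERVICE_NAME = SALES))
  )
```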
5.2.12
If the standby host on node 1 needs to be shut down for maintenance, you can alter the LOG_ARCHIVE_DEST_2 setting on the production database to point to an alternate service name. In the example tnsnames.ora file, this is the SALES_SEC_ALT net service name. This allows the production database to remain open in all the protection modes. This approach is covered further in the standby maintenance discussion later in this paper.
5.2.13
Set the TCP socket send and receive buffer sizes equal to the bandwidth-delay product: tcp_xmit_hiwat = tcp_recv_hiwat = bandwidth * RTT (example on Solaris).4

Reducing the number of round trips across the network is key to optimizing the performance of transporting redo log data to a standby site. With Oracle Net Services it is possible to control data transfer by adjusting the size of the Oracle Net session data unit (SDU). In a WAN environment, setting the SDU to 32KB can improve performance. The SDU parameter designates the size of the Oracle Net buffer into which data is placed before each buffer is delivered to the TCP/IP network layer for transmission across the network. Oracle Net sends the data in the buffer either when requested or when the buffer is full. Oracle internal Data Guard testing has demonstrated that the maximum setting of 32KB (32768) performs best on a WAN. The primary performance gain from setting the SDU comes from the reduced number of calls needed to packet the data. The SDU parameter must be set at both the listener and connection levels, i.e., in tnsnames.ora and listener.ora. Prior to Oracle 9.2.0.4, dynamic instance registration cannot be used with a non-default SDU: dynamic registration overrides any SDU setting back to the default of 2048 bytes (2KB).
4 These parameter names may differ on each platform. If enabled, these parameters should be set on both the primary and standby machines, and the socket connections need to be restarted. However, it is imperative that the memory, network, and performance impact be re-evaluated.
Likewise, do not use the default port of 1521, since that port automatically attempts dynamic registration. The proper configuration of SDU is illustrated in the last example of Appendix C. In addition to setting the SDU parameter, increasing the TCP send and receive window sizes can improve performance. Use caution, however, because this may adversely affect networked applications that do not exhibit the same characteristics as archiving, and this method consumes more memory per connection. Do not set the TCP window sizes higher than 32KB. For further information on TCP window sizing, contact your network administrator and/or refer to the operating system documentation.
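The listener- and connection-level SDU settings described above can be sketched as follows. This is not the Appendix C sample; service names, hosts, and paths are assumptions. The static SID entry and non-default port reflect the pre-9.2.0.4 restriction on dynamic registration with SDU:

```
# tnsnames.ora (connection level) -- hypothetical standby entry with SDU=32768
SALES_SEC =
  (DESCRIPTION =
    (SDU = 32768)
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby_host1)(PORT = 1526))
    (CONNECT_DATA = (SERVICE_NAME = SALES))
  )

# listener.ora (listener level) -- static SID entry with matching SDU,
# listening on a non-default port
SID_LIST_SALES_lsnr =
  (SID_LIST =
    (SID_DESC =
      (SDU = 32768)
      (SID_NAME = SALES1)
      (ORACLE_HOME = /u01/app/oracle/product/9.2.0)
    )
  )
```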
5.2.14 For no data loss protection modes, use a LAN or MAN network environment
As previously described, the maximum protection and maximum availability protection modes require the Data Guard transport service to use the LGWR SYNC transport option. Network latency is an additional overhead for each LGWR SYNC I/O operation. With LGWR SYNC, the log writer writes both locally to the online redo log and remotely, via the network to the RFS process, to the standby redo logs. These simple formulas highlight that the remote write is always slower than the local write and is the limiting factor when LGWR synchronous writes are occurring.
Local write = local write I/O time
Remote write = network round trip time (RTT) + local write I/O time (on the standby machine)
Using an example where the network round trip time (RTT) is 20ms, every LGWR synchronous write and every transaction increases by 20ms. This overhead impacts response time and may affect primary database throughput. Thus, due to the additional overhead incurred by the RTT, a local area network (LAN) or a metropolitan area network (MAN) with an RTT less than or equal to ten milliseconds should be used. Determining whether to use a LAN or MAN is dependent on the results of the performance assessment. Refer to the Oracle9i Data Guard: Primary Site and Network Configuration Best Practices paper at http://otn.oracle.com/deploy/availability/htdocs/maa.htm.
5.2.15
With only one remote standby destination within a LAN with sufficient bandwidth and low latency, we recommend LGWR SYNC=NOPARALLEL AFFIRM for the best performance with maximum data protection capabilities. When no data loss is required and there is only one remote archive destination, SYNC=NOPARALLEL performed better than SYNC=PARALLEL (the default). If SYNC=PARALLEL is used, the network I/O is initiated asynchronously, so that I/O to multiple destinations can be initiated in parallel; however, once the I/O is initiated, the log writer process waits for each I/O operation to complete before continuing. This is, in effect, the same as performing multiple synchronous I/O operations simultaneously. SYNC=PARALLEL should be used only if there is more than one standby destination. When the standby protection mode is set to maximum performance with the LGWR ASYNC configuration, the LGWR request is buffered if sufficient space is available in the network buffer. In this case, performance is degraded only if the network bandwidth is insufficient or if the network buffer is filled faster than network I/Os are being processed. If the production database crashes and is inaccessible, you will lose the data in the network buffer.
5.2.16
Based on the results detailed in the Oracle9i Data Guard: Physical Standby Performance Best Practices paper, available on the Oracle Technology Network website at http://otn.oracle.com/deploy/availability/htdocs/maa.htm, the archive transport provides the greatest performance throughput coupled with the greatest data loss potential. The ARCH transport does not affect primary performance as latency increases, as long as the redo logs are configured correctly as described under "Size production and standby redo logs and groups appropriately". The effects of latency on primary throughput are also detailed in the white paper. In Oracle9i version 9.2.0.5 or later, when using a network link that is unable to sustain a throughput greater than the primary's highest redo generation rate, configure ARCH to complete archiving locally before starting to archive remotely by setting the parameter _LOG_ARCHIVE_CALLOUT='LOCAL_FIRST=TRUE'. Setting this parameter prevents the primary database from stalling due to slow remote archiving by causing the ARCH process to first complete archiving an online log locally, then transfer the archived log to remote destinations.
5.2.17 In Maximum Performance mode over a WAN, evaluate implementing SSH port forwarding with compression
In addition to using the maximum async buffer size to reduce the chance of the "Timing out - ASYNC buffer full" error condition, using SSH port forwarding with compression also decreases the chance of this error and reduces network traffic. SSH port forwarding with compression also provides encryption. SSH shows the most performance benefit over a WAN with a round trip time of 50ms or more. See MetaLink note 225633.1 for how to set up and use SSH with compression. The SSH testing results are detailed in the Oracle9i Data Guard: Physical Standby Performance Best Practices paper (see http://otn.oracle.com/deploy/availability/htdocs/maa.htm).
5.2.18 Set the OS network TCP send and receive buffer sizes to the bandwidth delay product
TCP uses a sliding window algorithm to process data as it sends and receives. The details of this algorithm are discussed in Request for Comments (RFC) 793 and 1323. This sliding window method causes inefficiency when there is a large bandwidth delay product (BDP) (the product of the estimated minimum bandwidth and the round trip time between two machines). This inefficiency can be reduced by overriding the default TCP buffer size settings. The default buffer sizes
must be changed on both the sender and the receiver. TCP buffer sizes should be set to the BDP to achieve maximum throughput. Increasing the TCP send and receive buffer size is typically done system-wide, which affects all connections. Some platforms, however, allow buffer size settings for specific host pairs. Increasing TCP send and receive buffer sizes uses additional system memory, so this must be considered from a system resource management point of view. Here is an example of a system-wide setting on Solaris 2.8:
1. Environment: Primary and secondary connected by a T3 (44.736 Mbps) link with a network RTT of 50 ms.
2. BDP = 44.736 Mbps * 50 ms = 2,236,800 bits = 279,600 bytes
3. Check the TCP settings on both hosts by using the ndd command:
echo tcp_xmit_hiwat = `/usr/sbin/ndd /dev/tcp tcp_xmit_hiwat`
echo tcp_recv_hiwat = `/usr/sbin/ndd /dev/tcp tcp_recv_hiwat`
4. Both hosts have the following settings:
tcp_xmit_hiwat = 16384
tcp_recv_hiwat = 24576
5. Change the settings on both hosts to the BDP of 279,600. This requires root privilege:
/usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 279600
/usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 279600
To change the buffer settings for just a single host pair on Solaris (primary host = PRIM, standby host = STBY):
1. On PRIM: ndd -set /dev/tcp tcp_host_param STBY sendspace 279600 recvspace 279600
2. On STBY: ndd -set /dev/tcp tcp_host_param PRIM sendspace 279600 recvspace 279600
These new settings take effect immediately for new connections and no reboot is required; however, existing connections are not affected. Thus, following these changes, the databases and listeners should also be restarted to obtain the new TCP buffer settings. To ensure these settings are maintained across system restarts, these commands should also be put in a system startup script, e.g. /etc/rc2.d/S69inet on Solaris.
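The BDP arithmetic in the example above can be sketched as a small helper, using the same T3 link figures:

```shell
#!/bin/sh
# Sketch of the bandwidth-delay product (BDP) arithmetic used above.
# BDP (bytes) = bandwidth (bits/sec) * RTT (sec) / 8

bdp_bytes() {
  # $1 = bandwidth in bits per second, $2 = RTT in milliseconds
  echo $(( $1 * $2 / 1000 / 8 ))
}

# T3 link (44.736 Mbps) with a 50 ms RTT, as in the Solaris example
bdp_bytes 44736000 50
```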
5.2.19
Standby redo logs (SRLs) should be used on both sites. Follow this formula for the number of SRLs:
# of SRLs = (sum of all online log groups per thread) + 1
Having one more log group than the production database's online redo log groups reduces the likelihood that the production instance's LGWR is blocked because an SRL cannot be allocated on the standby. For example, if a primary database has two instances (threads) and each thread has four online log groups, then you should have nine SRL groups.
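The formula above can be sketched as follows, reproducing the two-thread, four-group example:

```shell
#!/bin/sh
# Sketch of the standby redo log (SRL) count formula from this section:
#   # of SRLs = (sum of all online log groups across threads) + 1

srl_count() {
  # $1 = number of threads (instances), $2 = online log groups per thread
  echo $(( $1 * $2 + 1 ))
}

# Two-instance primary with four online log groups per thread
srl_count 2 4
```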
Create the same number of SRLs for both production and standby databases. All of the production database's online redo logs and SRLs should be the same size.
The SRLs should exist on both production and standby databases with the same size and names.
The standby database's remote file server (RFS) process writes only to an SRL whose size is identical to the production database's online redo log. If it cannot find an appropriately sized SRL, RFS creates an archived redo log file directly instead and logs the following message in the alert log: No standby redo log files of size <#> blocks available. SRLs with the LGWR transport option also provide better data protection than the ARCH transport, since changes are contained on the standby rather than waiting for a log switch to occur as in the case of ARCH. Additionally, using SRLs speeds up switchover, since a complete archived log doesn't have to be transferred when the switchover starts. Only the local SRL has to be archived on the standby, which avoids the network transfer overhead that ARCH would incur.
5.2.20 In Maximum Performance mode, use the ASYNC attribute with a 10MB buffer size
The largest LGWR async buffer size of 10MB (ASYNC=20480) performs best in a WAN. In a LAN, the different async buffer sizes did not impact primary database throughput, as illustrated above in Table 5-8. Additionally, using the maximum buffer size also increased the chance of avoiding the timing-out messages caused by an async buffer full condition in a WAN; the smaller the buffer, the greater the chance of the buffer filling up as latency increases. The Oracle9i Data Guard: Physical Standby Performance Best Practices paper (see http://otn.oracle.com/deploy/availability/htdocs/maa.htm) was based on Oracle release 9.2.0.3, which has performance improvements for asynchronous transport services. Additionally, starting with release 9.2.0.3, if the network buffer becomes full and remains full for 5 seconds, then the transport will time out and convert to the ARCH transport. This condition indicates that the network to the standby destination cannot keep up with the redo generation rate on the primary database. It is indicated in the alert.log by the following message: "Timing out on NetServer %d prod=%d,cons=%d,threshold=%d". This message indicates that the standby destination configured with the LGWR ASYNC attributes encountered an "ASYNC buffer full" error condition. When this occurs, log transport services automatically stop using the network server process to transmit the redo data and convert to using the archiver process (ARCn) until a log switch occurs. The next log transmission reverts to the ASYNC transport. This change occurs automatically; no specific user action is necessary to make this happen. The largest async network buffer, 10MB, increased the chance of avoiding the conversion to ARCH: with the 10MB async buffer, the 100-millisecond RTT test converted to ARCH, and the 10 and 50 millisecond RTT tests converted to ARCH on only one node, whereas the 2MB buffer tests converted to ARCH for all RTT tests (10, 50, and 100 ms). When using SSH with compression, no conversion to ARCH occurred with the 10MB async buffer in a 100ms RTT test. The figure below illustrates the architecture when the standby protection mode is set to maximum performance with the LGWR ASYNC configuration. The LGWR request is buffered if sufficient space is available in the network buffer. In this case, performance can be degraded if the network bandwidth is insufficient or if the network buffer is filled faster than network I/Os are being processed. Therefore, inadequate network bandwidth and high latency can reduce application throughput and application response time. If the production database crashes and is inaccessible, you will lose the data in the network buffer.
5.2.21 Tune the standby database and host for optimal recovery performance
In order to leverage Data Guard with physical standby or leverage any media recovery operation effectively, you need to tune your database recovery. Follow the recommendations provided in an Oracle internal study that was conducted to identify Oracle media recovery tuning best practices. The paper, Oracle9i Media Recovery Best Practices, can be found on the Oracle Technology Network at http://otn.oracle.com/deploy/availability/htdocs/maa.htm.
5.2.22
This practice was identified from the Oracle internal recovery testing study; see the Oracle9i Media Recovery Best Practices paper on the Oracle Technology Network at http://otn.oracle.com/deploy/availability/htdocs/maa.htm for details. By utilizing parallel media recovery or parallel standby recovery, you may experience three times the performance gains over serial recovery for OLTP runs. In our testing, minimal gains were found in large batch runs. The recommended degree of parallelism is 2 times the number of CPUs on the host. The database parameter PARALLEL_MAX_SERVERS needs to be set to at least the degree of parallelism, since the value of PARALLEL_MAX_SERVERS is the maximum number of parallel recovery processes allowed regardless of what you supply to the RECOVER command. You should compare several serial and parallel recovery runs to determine your optimal recovery performance. Below are some examples of how to set recovery parallelism.
RECOVER MANAGED STANDBY DATABASE PARALLEL <#CPUs * 2>; # for standby managed recovery
RECOVER STANDBY DATABASE PARALLEL <#CPUs * 2>; # for manual standby recovery
RECOVER DATABASE PARALLEL <#CPUs * 2>; # for manual media recovery
Ensure that you have applied the fix for bug 2555509, which is contained in Oracle9i Release 2 patchset 3.
5.2.23
This practice was identified from the Oracle internal recovery testing study; see the Oracle9i Media Recovery Best Practices paper on the Oracle Technology Network at http://otn.oracle.com/deploy/availability/htdocs/maa.htm for details. When utilizing parallel media recovery or parallel standby recovery, increasing the PARALLEL_EXECUTION_MESSAGE_SIZE database parameter to 4K (4096) improved parallel recovery by as much as 20%. This should be set on both the primary and the standby in preparation for switchover operations. Increasing this parameter will require more memory from the shared pool by each parallel execution slave process. This parameter is also used by parallel query operations and should be tested with any parallel query operations to ensure there is sufficient memory on the system. A large number of parallel query slaves on a 32-bit installation may reach memory limits and prohibit increasing the PARALLEL_EXECUTION_MESSAGE_SIZE from the default 2K (2048) to 4K.
5.2.24
Clearing any online redo log groups on the standby following standby creation or following a switchover relieves any subsequent switchover or failover from having to perform the log clearing. Internal testing has shown this can save up to 15 seconds in switchover or failover time. There is no definitive way to validate that online redo log groups on the standby have been cleared other than checking the alert log for messages indicating so. The best method is to make it the last step of a switchover or standby instantiation. The command is: ALTER DATABASE CLEAR LOGFILE GROUP <group# from V$LOG>; Execute this for each online redo log group shown in the V$LOG view.
5.2.25 Use supplemental logging and primary key constraints on all production tables
It is recommended that all tables have primary key constraints defined on them, which automatically means the key columns are defined as NOT NULL. For any table where a primary key constraint cannot be defined, an index should be defined on an appropriate column that is defined as NOT NULL. If a suitable column does not exist on the table, then the table should be reviewed and, if possible, skipped by the SQL Apply engine.
5.2.26
If the SQL Apply database is being used for the purpose of offloading reporting or decision support operations from the primary database, then it is likely that you will want to reserve some of the parallel query slaves for such operations. Since the SQL Apply engine by default uses all the parallel query slaves, setting the logical standby MAX_SERVERS parameter allows for a certain number of parallel query slaves to be held back for such operations. The parameter specifies the number of parallel query servers that the SQL Apply engine will reserve when started.
Table 5-9 Logical Standby MAX_SERVERS Example
It is recommended that this parameter initially be set to the larger of 9 or 3 x CPU: MAX_SERVERS = max(9, 3 x CPU).
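That starting value can be sketched as follows (CPU counts are hypothetical):

```shell
#!/bin/sh
# Sketch of the recommended initial logical standby MAX_SERVERS value:
#   MAX_SERVERS = max(9, 3 * CPU)

max_servers() {
  # $1 = number of CPUs on the standby host
  v=$(( 3 * $1 ))
  if [ "$v" -lt 9 ]; then v=9; fi
  echo "$v"
}

max_servers 2   # small host: the floor of 9 applies
max_servers 8   # larger host: 3 x CPU
```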
5.2.27
Increase the initialization parameter PARALLEL_MAX_SERVERS by the larger of 9 or 3 x CPU on both the primary and standby instances: PARALLEL_MAX_SERVERS = current value + max(9, 3 x CPU). The PARALLEL_MAX_SERVERS parameter specifies the maximum number of parallel query processes that can be created on the database instance. With the exception of the coordinator process, all the processes that constitute the SQL Apply engine are created from the pool of parallel query processes. The SQL Apply engine, by default, uses all the parallel query processes available on the database instance; this behavior can be overridden using the logical standby parameters. It is recommended that PARALLEL_MAX_SERVERS be increased by the value of the SQL Apply parameter MAX_SERVERS.
5.2.28
The MAX_SGA parameter allows the default allocation for the T-LCR Struct of 1/4th the Shared Pool to be overridden. This parameter is specified in terms of megabytes and must not exceed the 1/4th of the Shared Pool limit. It is recommended that this parameter be left unset, and be allowed to default to 1/4th the Shared Pool.
5.2.29
The _EAGER_SIZE parameter specifies the minimum number of rows that can be modified by a single transaction before the transaction is deemed an eager transaction. An eager transaction differs from the typical transaction processed by the SQL Apply engine as follows: under normal circumstances, the SQL Apply engine passes committed transactions to the coordinator and ultimately on to the apply slaves. When an eager transaction is identified, the coordinator associates an apply slave to service the transaction, and the SQL statements are passed to the apply slave as they are prepared, removing the need for the complete transaction to be built. The identification of an eager transaction is required for two reasons:
1. If a transaction updated 1 million rows on the primary database, the size of the T-LCR Struct might be insufficient to support the 1 million SQL statements required to apply the transaction to the standby database.
2. By allowing the SQL statements to be applied to the standby database as they are prepared, the length of time for the transaction to be committed is significantly less than if the transaction were assigned to the apply slave only after the complete transaction had been built. This means that the SQL Apply engine is able to continue applying new transactions to the standby database much faster, assuming full consistency.

_EAGER_SIZE is specified in terms of rows modified. In Oracle9iR2, the default value for this parameter is the MAX_SGA logical standby parameter multiplied by 1000. Therefore, if the MAX_SGA parameter is specified as 500MB, then the default value for the _EAGER_SIZE parameter would be 500,000 rows. This value is typically too high.
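The default described above can be sketched as:

```shell
#!/bin/sh
# Sketch of the Oracle9iR2 default for _EAGER_SIZE described above:
#   default _EAGER_SIZE (rows) = MAX_SGA (MB) * 1000

eager_size_default() {
  # $1 = logical standby MAX_SGA setting in megabytes
  echo $(( $1 * 1000 ))
}

# MAX_SGA of 500 MB gives a default of 500,000 rows, typically too high
eager_size_default 500
```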
5.2.30
The APPLY_DELAY parameter specifies how long the SQL Apply engine should delay the application of the archived redo log relative to the time the log file was registered with the logical standby database. The delay time should be set to a value large enough for the administrators to identify and report a problem and stop the SQL Apply engine, preventing the error from being replicated to the logical standby database. This parameter is specified in minutes.
5.2.31
Oracle SQL Apply supports the following methods of data application:
For a reporting or decision support system, FULL or READ_ONLY transaction consistency is provided.
For a disaster recovery solution, or when the SQL Apply engine needs to catch up and transaction ordering is not mandatory, set TRANSACTION_CONSISTENCY to NONE.
If the logical standby database will be used for reporting or decision support operations:
If the standby database has multiple instances (Real Application Clusters), choose FULL.
If the standby database has only one instance (non-Real Application Clusters), choose READ_ONLY.
5.2.32
For database objects that do not need to be replicated to the standby database, it is advised to skip these objects using the DBMS_LOGSTDBY.SKIP procedure. By skipping such objects, the SQL Apply engine will not need to perform the unnecessary operations of maintaining a table that is not required. This is likely to occur in a decision support environment.
5.2.33
During a switchover operation, the primary and logical standby databases need to communicate detailed database information. It is recommended that Oracle database links on the primary and logical standby databases be created at the time the SQL Apply database is instantiated. The Oracle database link name should be the same on both the primary and standby databases, and should reflect the role that the target database will be performing. The following steps should be completed on both the primary and logical standby database.

On the logical standby database:
SQL> connect sys/<password>
SQL> create database link prim_db
  2  connect to system
  3  identified by manager
  4  using 'SALES_PRIM';

On the primary site database:
SQL> connect sys/<password>
SQL> create database link log_db
  2  connect to system
  3  identified by manager
  4  using 'SALES_LOG_SEC';

The following examples show the results of the same query executed against the two different nodes in the logical standby environment.
SQL> select local.host_name local_host
  2  ,      remote.host_name remote_host
  3  from   v$instance@prim_db remote
  4  ,      v$instance local;

LOCAL_HOST              REMOTE_HOST
----------------------- -----------------------
primary1                secondary2

SQL> select local.host_name local_host
  2  ,      remote.host_name remote_host
  3  from   v$instance@log_db remote
  4  ,      v$instance local;

LOCAL_HOST              REMOTE_HOST
----------------------- -----------------------
secondary2              primary1
Understand when backups are utilized
Use Recovery Manager (RMAN) for backing up database files
Use a Recovery Catalog
Choose a proper backup frequency
Ensure the backup area has sufficient space to hold on-disk backups
Do backups to disk on both primary and secondary sites
Catalog split mirror backups, if used in place of disk backups, using the RMAN command CATALOG
Create tape backups from disk backups on both the primary and secondary sites using the RMAN command BACKUP BACKUPSET
During backups, use the target database control file as the RMAN repository and resynchronize afterwards with the RMAN command RESYNC CATALOG
Use the RMAN autobackup feature for the control file and server parameter file
Periodically test recovery procedures
Create frequent control file backups and maintain an up-to-date control file creation script
Standby database instantiation:
o When restoring fault tolerance after a failover
o During initial MAA environment setup
o After a corruption or media failure on the standby database
Object-level recovery using block media recovery
Double failure resolution
Long-term tape storage for archival purposes
necessary to restore service in the most dire of circumstances. See the Restoring Fault Tolerance after Dual Failures section in the Restoring Full Database Fault Tolerance chapter for additional information.
Time to complete the backup
Time to instantiate a new standby database, which is based on two factors:
o Time to restore the data files from an existing on-disk backup
o Time to recover the database up to the configured lag
Note that, as recommended below, database backups are made to disk. Tape backups are created from the on-disk backups using the RMAN command BACKUP BACKUPSET. When using local on-disk backups, the time to create a new standby database on the new standby site and to return to the configured lag is a function of the following:
Time to restore the data files from the backup
Time to recover the database up to the configured lag, which depends on:
o Age of the backup to be used for the restore - Recovering from an older backup requires more recovery than from a more recent backup.
o Configured standby database lag - Recovering a standby database with a large configured lag will take less time than recovering one with a small lag.
o Production redo generation rate5 - As the backup is being restored and the standby database is being recovered, the production database continues to service users and generate redo. The additional redo generated must also be applied to the standby database as part of the recovery process while it is catching up to the configured lag.
o Maximum standby redo application rate6 - The maximum standby redo apply rate must be greater than the production redo generation rate so that the standby database can catch up while new redo is generated by the production database. If the maximum standby redo apply rate is lower than the rate at which the production database is generating redo, the standby database will continually fall behind and will not catch up to the configured lag until there is a reduction in redo generated by the production database.
Assuming that restore time, production redo generation rate, configured standby database lag, and maximum standby redo apply rate are constant, the time to create a new standby database and recover completely is heavily influenced by the age of the backup. Recovery time can be reduced by reducing the restore time (see Chapter 14: Tuning Recovery Manager in the Oracle9i Recovery Manager User's Guide) and by increasing the standby redo application rate, which is equivalent to tuning media recovery. For details on tuning media recovery, see Oracle9i Media Recovery Best Practices on OTN (http://otn.oracle.com).
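The catch-up reasoning above can be sketched numerically. The rates below are hypothetical, not taken from the paper's measurements:

```shell
#!/bin/sh
# Illustrative sketch: once the restore completes, the standby must apply
# its redo backlog while the production keeps generating new redo, so the
# catch-up time is roughly
#   backlog / (apply rate - generation rate)
# and is unbounded when the apply rate does not exceed the generation rate.

catchup_seconds() {
  # $1 = redo backlog in KB, $2 = apply rate KB/s, $3 = generation rate KB/s
  if [ "$2" -le "$3" ]; then
    echo "never"    # the standby falls further and further behind
    return
  fi
  echo $(( $1 / ($2 - $3) ))
}

catchup_seconds 3600000 3000 1000   # 3.6 GB backlog, 3 MB/s apply, 1 MB/s gen
```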
5.3.5 Ensure the backup area has sufficient space to hold on-disk backups
On-disk backups should reside in an area designated as the backup area, which should be mirrored and striped. Please refer to Storage Configuration Best Practices for a description of the backup area. There is no formula to calculate exactly how large this area should be; however, the backup area should be sized so that it can hold at least two whole database backups and the archived redo log files. Two whole database backups is the minimum because the second backup must
[5] The production redo generation rate can be obtained by running Statspack during several average and peak transaction processing intervals. Statspack provides a load profile section that contains the average redo size generated per second. Refer to the Oracle9i Database Performance Tuning Guide and Reference.
[6] The standby redo apply rate (the media recovery rate) can be obtained by applying several archive logs and measuring the average apply rate. Refer to Oracle9i Media Recovery Best Practices on http://otn.oracle.com.
finish successfully before the first backup can be deleted. Physical database changes, such as adding or removing a data file, must also be accounted for, since they change the backup area requirement. RMAN whole database backups include every block in the database that has ever been used, so the size of the backup depends on how many blocks in each file have ever been used by Oracle. Note that if blocks were used by an object that has since been dropped, RMAN still backs up those blocks. Incremental backups may not significantly reduce backup times, because all blocks are still read during the backup; their main advantage is the likely reduction in backup space required, since only changed blocks are written.
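As a rough planning aid, the sizing rule above (space for at least two whole database backups plus the archived redo log files) can be sketched as follows; the function name, the headroom factor, and the figures are illustrative assumptions, not recommendations from this paper:

```python
def backup_area_gb(db_used_gb, archived_redo_gb, whole_backups=2, headroom=1.2):
    """Minimum backup-area size: space for `whole_backups` full backups
    (two, so a new backup can finish before the old one is deleted),
    plus the archived redo logs, plus a growth headroom factor."""
    return (whole_backups * db_used_gb + archived_redo_gb) * headroom

# 500 GB of used blocks in the database, 100 GB of archived logs on disk:
print(round(backup_area_gb(500, 100), 1))  # 1320.0 GB
```

The headroom factor stands in for the physical database changes mentioned above (added data files, growth), which must be re-estimated whenever the database structure changes.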
5.3.6 Create on-disk backups at both the primary and secondary sites

- Time to restore fault tolerance after a failover will be significantly higher with a large database.
- New backup procedures must be introduced at the primary site if the secondary site is unavailable for an extended period.
- RMAN block media recovery at the primary site is not an available recovery option for object-level outages.
Consider the following scenario: backups are taken at the secondary site only, and then a site outage occurs at the secondary site with an estimated time to recover of three days. The primary site is now completely vulnerable, both to an outage that is typically resolved by a failover (e.g. user error) and to any outage that could be resolved by having a local backup (e.g. an object-level outage resolved by block media recovery). In this scenario, a production database outage can only be resolved by physically shipping the off-site tape backups that were taken at the standby site. If primary site backups were available, a local restore would be an option in place of a failover that cannot easily be undone. Data may be lost, but having primary site backups significantly shortens the MTTR in this scenario. A possible approach is to start taking primary site backups only when there is a secondary site outage. This approach should be avoided, because it introduces new processes and procedures at a time when the environment is already under duress and the impact of a mistake by staff is magnified; it is also not the time to discover that backups cannot be taken at the primary site. Finally, primary site disk backups are necessary to ensure a reasonable MTTR when using RMAN block media recovery. Without a local, on-disk backup, a backup taken at the standby site must be restored onto the primary, significantly lengthening the MTTR for this type of outage.
5.3.7 Catalog split mirror backups if used in place of disk backups using the RMAN command CATALOG
Split mirror backups, where the operating system or storage software maintains three mirrored copies of the data files, may be used in place of RMAN-created on-disk backups. RMAN does not automate the creation of split mirror backups, but can make use of split mirror backups for restore and recovery procedures if they are cataloged in the RMAN repository using the RMAN CATALOG command. Reference: Making Split Mirror Backups with RMAN section in Chapter 9: Making Backups and Copies with Recovery Manager of the Oracle9i Recovery Manager User's Guide.
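After a mirror is split, the resulting data file copies can be made known to RMAN with the CATALOG command. A sketch of what this might look like follows; the paths are illustrative, not from this paper:

```text
# Make split-mirror data file copies usable by RMAN restore/recovery
RMAN> CATALOG DATAFILECOPY '/split_mirror/system01.dbf';
RMAN> CATALOG DATAFILECOPY '/split_mirror/users01.dbf';
```

Once cataloged, these copies participate in RMAN restore and recovery operations just like RMAN-created image copies.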
5-38
5.3.8 Create tape backups from disk backups on both the primary and secondary sites using the RMAN command BACKUP BACKUPSET
Tape backups are still required in an MAA environment. Tape backups provide failure protection from certain dual outages, for example, a secondary site-wide outage followed by a primary site-wide outage. Even in these most drastic of circumstances, restoring from a tape backup that has been maintained offsite is an available method of recovery. Tape backups can be made directly from the local on-disk backups using the RMAN command BACKUP BACKUPSET. Utilizing the existing on-disk backups for creating tape backups reduces or eliminates the effect on production service levels, since I/O contention between the database and the backup is minimized. Reference: Backing Up Backup Sets section in Chapter 9: Making Backups and Copies with Recovery Manager of the Oracle9i Recovery Manager User's Guide.
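Creating the tape copy from the existing disk backups might look like the following sketch; the channel name and the decision to copy all backup sets are illustrative and site-specific:

```text
RMAN> RUN {
  ALLOCATE CHANNEL t1 TYPE 'SBT_TAPE';
  # Copy the existing on-disk backup sets to tape; the data files
  # themselves are not read, so production I/O contention is minimal.
  BACKUP BACKUPSET ALL;
}
```

Because the input is the backup set on disk rather than the live data files, this can usually be scheduled without regard to production peak periods.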
5.3.9 During backups, use the target database control file as the RMAN repository and resynchronize afterwards with the RMAN command RESYNC CATALOG
When creating backups (to disk or tape), use the target database control file as the RMAN repository so that the ability to back up, and the success of the backup, does not depend on the availability of the RMAN catalog in the manageability database. This is accomplished by running RMAN with the NOCATALOG option. After the backup is complete, the new backup information stored in the target database control file can be resynchronized with the recovery catalog using the RESYNC CATALOG command. Reference: Resynchronizing the Recovery Catalog section in Chapter 16: Managing the Recovery Manager Repository of the Oracle9i Recovery Manager User's Guide.
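The two-step flow might look like the following sketch; the connect strings and catalog schema name are illustrative assumptions:

```text
# 1. Back up using only the control file as the RMAN repository,
#    so the backup does not depend on the catalog database being up.
$ rman TARGET / NOCATALOG
RMAN> BACKUP DATABASE;

# 2. Later, when the manageability database is available, push the
#    new backup records from the control file into the catalog.
$ rman TARGET / CATALOG rman/rman@manageability_db
RMAN> RESYNC CATALOG;
```

If the catalog is unreachable at resynchronization time, the backup itself is unaffected; only the catalog's view of it is stale until the next successful RESYNC CATALOG.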
5.3.10 Use the RMAN autobackup feature for the control file and server parameter file
The RMAN autobackup feature provides a way to restore the backup repository contained in the control file when the control file is lost and the recovery catalog is either lost or was never used. You do not need a recovery catalog or target control file to restore the control file autobackup. The RMAN autobackup feature is enabled with the command CONFIGURE CONTROLFILE AUTOBACKUP ON. Reference: Control File and Server Parameter File Autobackups section in Chapter 5: RMAN Concepts I: Channels, Backups, and Copies of the Oracle9i Recovery Manager User's Guide.
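Enabling the feature, and the restore path it makes possible after a complete loss, might look like the following sketch; the DBID value is illustrative:

```text
RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;

# After losing the control file (and with no catalog available),
# the autobackup can still be located and restored:
RMAN> SET DBID 1234567890;
RMAN> RESTORE CONTROLFILE FROM AUTOBACKUP;
```

The DBID should be recorded somewhere outside the database as part of operational practice, since it is needed to locate the autobackup when nothing else survives.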
5.3.11 Rehearse and validate backup and recovery procedures
Complete, successful, and tested backups are fundamental to the success of any recovery. Create test plans for the different outage types; start with the most common outage types and progress to the least probable. Running backup procedures does not by itself ensure that the backups are usable: monitor the backup procedure for errors and validate backups by testing your recovery procedures periodically. Also validate the ability to back up and restore by using the RMAN commands BACKUP VALIDATE and RESTORE VALIDATE. Reference: Validating the Restore of Backups and Copies section in Chapter 10: Restoring and Recovering with Recovery Manager of the Oracle9i Recovery Manager User's Guide.
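The two validation commands might be used as in the following sketch; neither produces backup pieces nor changes any files:

```text
# Read the data files and archived logs and verify they can be
# backed up, without actually writing a backup:
RMAN> BACKUP VALIDATE DATABASE ARCHIVELOG ALL;

# Verify that the backups needed for a full restore exist and are
# readable, without actually restoring anything:
RMAN> RESTORE DATABASE VALIDATE;
```

These checks complement, but do not replace, periodic end-to-end restore rehearsals on test hardware.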
6 MAA within the Data Center
The paper thus far has focused on the MAA for one database or one application. However, a data center supports multiple production applications and databases. This section puts the MAA in the context of two distributed data centers supporting multiple applications. The data centers can be configured in an active/passive configuration or an active/active configuration.

An active/passive data center configuration has one primary data center used for production and a secondary data center that is initially passive. The remote data center is used only after an application fails over or switches over. In the MAA, the sites are symmetric, so after an application fails over or switches over, the application does not necessarily need to switch back. To reduce costs, many customers create asymmetric environments in which the secondary site contains only a subset of the hardware and network resources of the primary site. However, maintaining performance service levels and the operating procedures for failover and switchover will then be more difficult, especially if multiple failures occur.

A more appealing configuration is the active/active configuration. Both data centers in this configuration support a combination of active and passive applications. However, each database within a site has either a production role or a standby role, so for any one application there is still one production database located in only one of the data centers and a corresponding standby database in the opposite data center, providing the same level of availability for each database as described throughout this paper. Each production database may reside on its own cluster, and its corresponding standby database may be located on another cluster. To increase utilization, multiple production databases and multiple standby databases for different applications can share the same hardware resources if capacity is sufficient.
The following set of diagrams shows examples of how clusters can be shared by multiple production and standby database applications, provided resources are sufficient to handle the combined load. A data center may have several different database applications, some of which may reside on the same cluster and some on their own dedicated cluster. In Figure 6-1 and Figure 6-2, there are four separate RAC databases (CRM, ERP, Mail, and HR) residing on different clusters. Figure 6-1 shows one cluster within each data center, each with a production database on one cluster and a corresponding standby on the remote cluster. Figure 6-2 shows a pair of clusters at each site that share a common set of application servers. Here the cluster for one database is active on one site while the corresponding standby cluster is inactive on the other site; the advantage is that there is no second database application to coordinate and plan for in the case of a Data Guard failover or switchover.

In Figure 6-1, the cluster for the CRM production database also supports a standby RAC database for ERP at site A. Site B has an identically configured and sized cluster that runs the ERP production database and hosts the CRM standby database. In this scenario, the resources and capacity available on each cluster should be sufficient to run both databases in production mode on the single cluster, ideally with minimal or no degradation in performance. For example, if site A goes down, the production CRM database uses Data Guard to fail over to the standby CRM database on site B; both CRM and ERP are then running a production load on a single cluster. If resources have been properly planned, this should not cause any problems or degradation.
Figure 6-1: One Cluster with One Application Server in Each Data Center
In Figure 6-2, there are two separate clusters at each site that share the same set of mid-tier (application, web server, IMAP, LDAP, etc.) application servers. In the example below, the production RAC Mail database resides on its own cluster at site A and has a corresponding cluster at site B for the standby RAC Mail database. In addition, there is a separate cluster at site B that runs the production RAC HR database, with another corresponding cluster back at site A for the standby RAC HR database. On each site, the mid-tier application servers are shared by the Mail and HR applications. Therefore, in the case of a failover for one database application, the mid-tier application servers must have enough resource capacity to handle the load for both production applications on the same site, ideally with minimal or no degradation in performance.
Figure 6-2: Two Clusters in Separate Data Centers that Share an Application Server
Note: These diagrams do not show the full redundancy of all the hardware components. Please refer to Client Tier Site Failover for a detailed description of the network or hardware redundancy expected in the MAA configuration. Data centers will probably have varying combinations depending on the service level requirements for each of the database applications they support. The MAA configuration should be deployed for all applications supporting services that require high availability. Having multiple databases on a single cluster or sharing middle-tier servers is fully supported, and should be leveraged as much as possible, but additional resource and capacity planning is required. The main advantage and attraction of the active/active data center environment is that all resources are leveraged and utilized.
Part III provides detailed information regarding monitoring availability and performance of systems in a highly available environment, not necessarily just those that conform to all the precepts in this document. Also included are suggestions on using Enterprise Manager to manage these environments, reacting to events as they occur. This part contains the following:
7 Monitoring and Managing with Enterprise Manager

Continuous monitoring of the system, network, database operations, application, and other components ensures early detection of problems, resulting in problem avoidance or fast resolution. In addition, monitoring should capture system metrics to indicate trends in both system growth and recurring problems, facilitate prevention, enforce security policies, and manage job processing. More specific to the database server, a sound monitoring system needs to measure availability, detect events that can cause the database server to become unavailable, and provide immediate notification to responsible parties for critical failures.

The monitoring system itself needs to be highly available and adhere to the same operational and availability best practices as the resources it monitors. Failure of the monitoring system leaves all systems that it monitors exposed.

Oracle Enterprise Manager (OEM) provides event management and monitoring capabilities with e-mail and paging notification options. This section provides recommendations for using Enterprise Manager 9.2 to maintain an existing MAA environment. Recommendations are given for monitoring the environment as well as reacting to changes in it. In addition, there is a description of how to create a highly available Enterprise Manager configuration using a combination of RAC and Data Guard technology, along with additional configuration tips. These recommendations are found in the following sections:
7.1 Monitoring
- Set EM events to detect service interruptions
- Use EM events to monitor system availability (Data Guard events)
- Use EM events and the EM performance pack to view the overall database health and isolate performance problems
Many of the recommendations for using Enterprise Manager use a feature of the product known as an Event. An Event monitors for a condition to occur and then performs some action when it does. Events can be set to monitor many different layers of the application stack, including the operating system, the application server, the listener, and the database. The key points needed to define an event are:

- What object should be monitored (databases, nodes, listeners, or other services)
- What instrumentation should be sampled (e.g. availability, CPU percent busy)
- How frequently the event should be sampled
- What should be done when the event exceeds a predefined threshold
Two thresholds can be set for any event: a warning level and a critical level. The complete process used to define an event through the Enterprise Manager GUI is discussed in Chapter 6 of the Enterprise Manager Administrator's Guide. A complete listing of the events provided with the product is available in the Oracle Enterprise Manager Event Test Reference Manual. While many predefined instrumentation points are available in Enterprise Manager, new monitoring conditions for events can be added using either SQL or shell scripts.

Section 7.1.1 contains the minimum set of events that should be set for critical systems. Section 7.1.2 details events that should be set to monitor any Data Guard instance. Section 7.1.3 details events that should be set to monitor performance.

The events screen is accessible from the main left-hand navigation tree in Enterprise Manager. This screen shows any active alerts and their status (warning or critical). The Registered tab shows a listing of all active events currently being monitored. The example below shows that the Disk Monitoring event has reached a critical threshold and the event titled Crisis Alerts has reached a warning threshold. Enterprise Manager can also act on events as they exceed a threshold. For example, an event can be set to trigger a page and/or an e-mail if a particular threshold is reached. In addition, an event can be associated with a Fixit job; the Fixit job concept is explained below under Use EM fixit jobs to respond to event alerts.
7.1.1 Set EM events to detect service interruptions

Critical space management conditions that have the potential to cause a service outage should be monitored using the following events.

- Set the Disk Full event to monitor root file systems for any critical hardware server. This event allows the administrator to pick the threshold percentages that EM tests against and the number of samples that must occur in error before a message is generated to the administrator. Default recommendations are 70% for a warning and 90% for an error, but these will need to be adjusted depending on system usage. This recommendation also applies to the Swap Full, Archive Full, and Dump Full events listed below.
- Set the Swap Full event to monitor the swap space usage for any critical hardware server.
- Set the Archiver Hung event, which monitors the alert log for any ORA-00257 messages indicating a full archive log directory.
- As an additional check on archive log space, set the Archive Full (%) event with thresholds and a sampling time appropriate to the environment. This event can then alert the administrator to a potential system stoppage due to a full archive directory.
- Set the Dump Full (%) event to monitor the dump directory destinations. It is critical that dump space be available so that the maximum amount of diagnostic information is captured the first time an error occurs.
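The warning/critical threshold logic behind these space events amounts to a simple percentage comparison. A minimal sketch of that classification, using the 70%/90% defaults mentioned above (the function name is illustrative, not an EM API):

```python
def classify_usage(pct, warn=70, crit=90):
    """Classify a space-usage percentage against warning/critical
    thresholds, mirroring the EM Disk Full event's default 70%/90%."""
    if pct >= crit:
        return "CRITICAL"
    if pct >= warn:
        return "WARNING"
    return "OK"

print(classify_usage(65), classify_usage(75), classify_usage(95))
# OK WARNING CRITICAL
```

EM additionally requires the condition to hold for a configured number of consecutive samples before notifying, which suppresses alerts on momentary spikes.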
Potential problems such as space management issues need to be monitored. These events should be set based on the amount of growth anticipated in the system. While production systems tend to be stable and not experience much data file growth, setting these events provides the administrator with a warning if unexpected growth occurs.

- The Maximum Extents events should be set to generate a warning when a critical resource is nearly full. Once a segment has allocated the maximum number of extents, any row insertion will fail with an ORA-1631 error message. The default for the event is to monitor all tablespaces in any database target, but that can be modified on the Parameters tab for the event. The warning and critical thresholds should also be modified from their defaults of 1 and 2, respectively, depending on the environment.
- Set the Datafile Limit event to check utilization of the data file resource. If the percentage of data files currently used, relative to the limit set by the DB_FILES initialization parameter, exceeds the values specified in the threshold arguments, then a warning is generated at 80% used and a critical alert at 90%.
Monitor the alert log for errors. Enterprise Manager provides an Alert event that will signal an alert when any ORA-6XX, ORA-1578 (database corruption), or ORA-0060 (deadlock detected) message is recorded; if any other alert is recorded, a warning message is generated. Set the Data Block Corruption event, which monitors the alert log for ORA-01157 and ORA-27048 entries that would signal a corruption in an Oracle data file.
Monitor the system to ensure that processing capacity is not exceeded. The warning and critical parameters for these events should be modified based on the usage pattern of the system.

- Use the Process Limit event to warn if the number of current processes approaches the value set in the PROCESSES database parameter.
- Set the Session Limit event to warn if the instance is nearing the maximum number of concurrent connections that the database will allow.
7.1.2 Use EM events to monitor system availability (Data Guard events)

Set the Data Guard Logs Not Shipped event to alert the administrator if there is an extended delay in moving archive logs from the primary to the standby site. This event flags when the number of archive logs on the primary exceeds the number of archive logs shipped to the standby site by more than a user-defined threshold, which should be based on the amount of time it takes to transport an archive log across the network. Set the sample time for the event to be approximately the log transport time, and set the number of occurrences to 2 or greater to avoid false positives for this alert. Good starting values for the warning and critical thresholds are 1 and 2, respectively.
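Outside of EM, the same shipping gap can be estimated on the primary with a query along the following lines; the destination id (here assumed to be the standby at dest_id 2), the thread number, and the overall approach are illustrative assumptions:

```sql
-- Highest sequence archived locally vs. highest sequence recorded as
-- shipped to the (assumed) standby destination, for thread 1.
SELECT (SELECT MAX(sequence#) FROM v$archived_log
         WHERE dest_id = 1 AND thread# = 1)
     - (SELECT MAX(sequence#) FROM v$archived_log
         WHERE dest_id = 2 AND thread# = 1) AS shipping_gap
  FROM dual;
```

A persistently growing gap corresponds to the condition the Data Guard Logs Not Shipped event is designed to catch.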
7.1.3 Use EM events and the EM performance pack to view the overall database health and isolate performance problems
As stated previously, a system without adequate performance is not a highly available system, regardless of the status of any individual component. Many of the events listed in the Oracle Enterprise Manager Event Test Reference Manual pertain to the performance category. While performance problems seldom cause a major system outage or blackout, they can still cause an outage for a subset of customers. Outages of this type are commonly referred to as application service brownouts. The primary cause of brownouts is the intermittent or partial failure of one or more infrastructure components, whether client, network, server, database, or application. To understand when and why application brownouts occur, IT managers must be aware of how the various infrastructure components are performing (their response time, latency, and availability) and how they affect the quality of application service delivered to the end user.

The foundation for any performance tuning, and for determining what constitutes a performance event alert, is an end-to-end performance baseline. Ideally, the baseline data gathered should include the following:

- Application statistics (transaction volumes, response time, web service times)
- Database statistics (transaction rate, redo rate, hit ratios, top 5 wait events, top 5 SQL statements)
- OS statistics (CPU, memory, I/O, network)
The baseline represents the performance metrics for normal operations that meet the SLA. Thus, the basis for setting a performance warning or alert event should be an unacceptable deviation from the baseline. Additionally, historical performance data is crucial for identifying trends and for capacity planning, which means operating system, database, and application statistics should be collected from the first day an application is rolled out into production. Detailed performance monitoring practices and metrics are outside the scope of this blueprint but are a critical proactive component of service and availability monitoring.

However, all systems share some basic characteristics that can be monitored using Enterprise Manager events. Setting the threshold and sampling rates for each of these events is system dependent. The recommended approach is to monitor and trend the values recommended below and capture their values at a point of stable system performance under load; this can be done using the Enterprise Manager Capacity Planner. Those sampled values can then be used as a basis for events that alert system administrators if the values exceed the user-defined limits. Using Enterprise Manager, the following database events should be monitored to give an indication of system performance. There are many operating system events that can be used to supplement these tests, but they vary by platform.

- Set the Disk I/O per Second event. This is a database-level event that monitors I/O operations done by the database and alerts when that activity exceeds a user-defined threshold. The fact that this is reported by the database is key, as other I/O operations run on the monitored hardware are not counted. This event should be used in conjunction with the operating-system-level events that are also available with Enterprise Manager and are platform dependent. However, if the database is the primary application running on the hardware, setting this event does give a general indication of the amount of work done. Set the event based on the total I/O throughput available to the system, taking into account the number of I/O channels available, network bandwidth if running in a SAN environment, the effects of cache if using a storage array device, and the maximum I/O rate and number of physical spindles available to the database.
- Set the Physical Reads (Writes) per Second events for each monitored database. As above, these events monitor and alert on the general amount of activity measured by the database. Set these for more detailed warnings of potential I/O problems.
- Set the % CPU Busy event. This monitors CPU usage as measured by the database. Set this value to warn at 75% and to show a critical alert between 85% and 90% to alert the administrator of sustained peak usage. This usage may be normal at peak periods, but it may also be an indication of a runaway process or of a potential resource shortage if more growth is expected.
- Set the % Wait Time event to monitor wait time at a systemic level. Some wait time is expected, as resources such as CPU are shared and I/O requests must be satisfied. Excessive wait time is an indication of a sharing problem, as a bottleneck for one or more resources has likely developed. Measure the system wait time when the application is performing as expected under normal load and use that value to set this event.
- Set the Network Bytes per Second event to monitor the total network traffic as measured by SQL*Net. Similar to the I/O events discussed above, this event only reports traffic that Oracle generates, not overall hardware usage; however, it can indicate a potential network bottleneck. The initial warning values for this event should be based on actual usage during peak time, as described under the % Wait Time event.
- Set the Total (Hard) Parses per Second event to provide an indicator of SQL performance. This is another event that should be baselined during normal peak loads; deviation from the baseline will show whether an application change or a change in usage has created a shortage of resources and a potential system bottleneck.
In addition to these events, Enterprise Manager provides a canned event to monitor any statistic in V$SYSSTAT and any specific wait event. In the event of an alert, the Enterprise Manager Performance Pack provides a detailed view that allows a quick look at overall system performance, SQL performance, or session performance and tracing. For an alert generated from one of the performance events above, the Enterprise Manager Database Health Overview Chart provides a detailed view of overall system performance. Each individual subchart provides a drill-down into more detailed metrics for each problem type, including individual SQL usage and session resource usage. An advice wizard can also be consulted for tips based on the current system state. Details on this chart are located in the Enterprise Manager Concepts and Administrator's Guides.
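The V$SYSSTAT statistics behind several of the events above can also be sampled directly. For example, the redo generation rate and hard parse rate discussed in this paper correspond to cumulative counters that can be queried as in this sketch; sample the values twice and divide the delta by the elapsed seconds to obtain a per-second rate:

```sql
-- Cumulative counters since instance startup; 'redo size' feeds the
-- redo generation rate, 'parse count (hard)' the hard-parse rate.
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('redo size', 'parse count (hard)');
```

This is the same data Statspack summarizes in its load profile section, so ad hoc samples can be cross-checked against the baseline.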
For more detail on performance monitoring, consult the Oracle Enterprise Manager Manuals, Oracle9i Database Performance Methods, and Oracle9i Database Performance Guide and Reference.
7.2 Managing
As well as passive notification of problems and reactive performance problem analysis, Enterprise Manager should be used as a proactive part of administering any system. The following suggestions describe how to use the Enterprise Manager Job system to react to events as they are raised and to handle daily administrative tasks. The use of this feature is fully documented in Chapter 5 of the Enterprise Manager Administrator's Guide. In addition, Data Guard Manager can be used to administer a standby database environment.

- Use EM fixit jobs to respond to event alerts
- Use EM job scheduling to manage routine events
- Use EM Data Guard Manager to manage non-MAA configurations
Monitoring and Managing with Enterprise Manager - Enterprise Manager Architecture for High Availability
- Console: Console clients and integrated tools provide a graphical interface for administrators. This is the front-end GUI to EM. Administrators use the Enterprise Manager Console by connecting directly to an OMS process on specified hardware.
- OMS/Repository: Management Servers and a database repository provide a scalable middle tier for processing system management tasks. The Management Server processes broker work from the console to the agents and from the agents to the repository and console. The Management Server uses a repository to store all system data, application data, managed-node state information, and information about any system management packs. The repository is a set of database tables that must be located in a supported Oracle database accessible to the Oracle Management Server (OMS).
- Agents: Intelligent Agents installed on each node monitor node services and execute tasks from the Management Server. Agents are the driving force for initiating repeating work. Once programmed by the user through the console (via the OMS), agents wake up at scheduled intervals, check and record statistics, and initiate jobs. As the initiator of scheduled work, the agent makes a call to the OMS. An agent must run on every node monitored in the architecture, including the nodes that host the primary and standby EM repository and OMS processes.
The EM architecture needs to be as reliable as the application architecture. It is crucial for the monitoring framework to manage, detect, and help initiate repair as efficiently as possible. We recommend the following for configuring an MAA EM architecture and setup:

- Ensure availability of EM using an MAA architecture
- Unscheduled Outages for Enterprise Manager
to the repository instances running on either RAC node and failover using SQL*Net failover. This is described in section 7.4.2. The critical success factor for this architecture is the amount of network bandwidth to support the communication between the OMS processes and the agents. If this repository is used to manage a larger enterprise, then agent to OMS traffic could be significant, depending on the number of scheduled events and jobs. If the EM framework is leveraged to monitor multiple applications and more dedicated system resources are required, consider scaling the EM repository and management processes with additional nodes. The repository and OMS processes can be scaled independently, depending on need. If required, additional hardware outside of the cluster could be added to scale the number of OMS processes.
[Figure: Enterprise Manager MAA configuration. EM agents on the monitored nodes connect through Oracle Net to the OMS processes; SQL*Net manages the connection between the OMS processes and the active repository. The primary site hosts the production EM repository database, an OMS, and an agent; the secondary site hosts the standby EM repository and OMS/agent processes.]
Not every system architecture will have the availability needs implicit in a full MAA configuration. For those environments, Appendix A of this document, Risk Assessment of Major Architectural Changes, discusses some different architectures that are still viable and provides a basis for a cost/benefit trade-off for the EM repository. In addition, to reduce hardware overhead and leverage current resources, the EM repository and management server processes can be hosted on the same hardware as another MAA-configured application, with the active EM instances and management server processes on the secondary site. This assumes that the secondary site has the capacity and bandwidth to handle the production load plus an active Enterprise Manager repository and OMS process.
Unscheduled Outages for Enterprise Manager

Repository node or instance failure
  Resolution: Connections fail over to the surviving repository instance through SQL*Net failover (see section 7.4.2). Restart the failed node or instance when available.

Agent failure
  Cause: Agent crash, user deletion of state files.
  Resolution: No data is reported and logging stops for the agent. Any hanging processes need to be manually killed (UNIX) and the agent restarted. To recover: stop any still-running agent processes (DBSNMP and DBSNMPWD); delete all *.q files and the services.ora file in the $ORACLE_HOME/network/agent directory; restart the agent (agentctl start); then reload any scheduled events or jobs from the OEM console for that node.

Management server failure
  Cause: Process crash, accidental user termination.
  Resolution: The watchdog process restarts the management server; the flags for this are set in the $ORACLE_HOME/OMSconfig.properties file. If the management server start fails more than the preset number of times, it must be restarted manually (oemctl start management server user/password). If the management server fails to restart, a surviving management server takes over the work; if no management server processes are available to assume the work, all EM processing ceases. Failure of a management server causes any GUI session connected to it to fail, requiring a reconnect to a surviving management server.

Watchdog failure
  Cause: Process crash, accidental user termination.
  Resolution: No data is reported and logging stops for the watched processes. Any hanging processes need to be manually killed (UNIX) and the affected agent or management server restarted. Death of the watchdog is not reported back to the EM GUI.

Console disconnect
  Cause: The console loses its connection to the management server due to a network problem, management server failure, or node failure.
  Resolution: Because the console is stateless, it gets all data from the management server processes. The resolution is to reconnect to a surviving management server or start the console stand-alone. In some cases the GUI process cannot restart cleanly; the resolution is then to shut down the EM console completely (and, if running on Windows, possibly reboot the node) and restart it.
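The agent recovery steps described above can be sketched as a UNIX command sequence. This is an illustrative sketch only; verify the exact process names, state-file locations, and agentctl syntax against the EM release in use.

```shell
# Stop any still-running agent processes (DBSNMP and DBSNMPWD)
agentctl stop

# Remove queued-job and service-discovery state files
rm $ORACLE_HOME/network/agent/*.q
rm $ORACLE_HOME/network/agent/services.ora

# Restart the agent; scheduled events and jobs must then be
# reloaded for this node from the OEM console
agentctl start
```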
7.4 EM Configuration
The following section contains additional configuration information that is helpful when building Enterprise Manager:
- Configure a separate listener for EM traffic
- Configure Oracle Net connection load balancing and connection failover
- Install the Enterprise Manager repository into an existing database
- Check MetaLink for updates by component
7.4.2 Configure Oracle Net connection load balancing and connection failover
Configure the Oracle Net connection descriptor, or TNS alias, used by the OMS for load balancing and failover. The example below shows how connections are initially routed to the primary site's RAC database and balanced between its two instances. In the event of a site outage, traffic is routed to the alternate site and balanced between those nodes.
EM=
  (description=
    (failover=on)
    (address_list=
      (load_balance=on)
      (failover=on)
      (address=(protocol=tcp)(port=1522)(host=EMPRIM1.us.acme.com))
      (address=(protocol=tcp)(port=1522)(host=EMPRIM2.us.acme.com)))
    (address_list=
      (load_balance=on)
      (failover=on)
      (address=(protocol=tcp)(port=1522)(host=EMSEC1.us.acme.com))
      (address=(protocol=tcp)(port=1522)(host=EMSEC2.us.acme.com)))
    (connect_data=(service_name=EMrep.us.acme.com)))
This is similar to the Oracle Net configuration settings for the database, except that the host names, listener names, and net aliases will be different. Please refer to Appendix C.
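Assuming the EM alias above has been added to the client's tnsnames.ora, the descriptor can be sanity-checked with standard Oracle utilities before the OMS is pointed at it (the repository user name here is hypothetical):

```shell
tnsping EM
sqlplus em_repo_owner@EM
```

Shutting down one repository instance and repeating the connection test confirms that new connections fail over to the surviving addresses.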
Part IV: Outages, Recovering from Outages, and Restoring Fault Tolerance
Part IV provides detailed information regarding outages that may occur in an MAA environment, the solutions to resolve those outages, and the steps to take to restore fault tolerance for full protection. This part contains the following:
- Section 8, Outages
- Section 9, Recovering from Outages
- Section 10, Restoring Database Fault Tolerance
8 Outages
Continued availability is a fundamental requirement of many applications today. Businesses operate on a continual basis; many, in fact, exist on the premise of 24x7 availability of their applications. Downtime, intentional or otherwise, carries a large opportunity cost in revenue and quality of service. MAA provides a recovery process and architectural framework to manage each outage and minimize its associated downtime. This section lists and describes all the outages handled by MAA. The outages are divided into two classes: unscheduled outages and scheduled outages.
Site-wide
  Description: The entire site where the current production resides is unavailable. This includes all tiers of the application.
  Examples: Disaster at the production site, such as a fire, flood, or earthquake. Power outages (if there are multiple power grids and backup generators for critical systems, this should affect only part of the data center).

Application tier node failure
  Description: A node in the application tier is unavailable. This includes failure of the node itself or of any component that results in the node not being available. This node is usually part of a redundant set of server farms.
  Examples: An application tier node crashes or has to be brought down due to bad memory or a bad CPU. An application tier node is unreachable because both redundant network cards fail. Application tier software crashes due to bugs or incorrect configuration.

Application tier cluster-wide failure
  Description: The complete application tier is not available. Either all nodes are down or the application server on all nodes is down.
  Examples: The last surviving node of the application server farm for a service is no longer available. Both redundant hardware-based load balancers directing traffic to the application tier crash. Network connectivity issues to the application tier.

Database instance or cluster failure
  Description: A database instance is unavailable or crashes, or the whole cluster hosting the Oracle RAC database is unavailable or crashes. This includes failures of nodes in the cluster, as well as of any other components that result in the cluster not being available and the Oracle database and instances on this site not being available.
  Examples: A database tier node crashes or has to be brought down due to bad memory or a bad CPU. A database tier node is unreachable. Both redundant cluster interconnect switches fail, resulting in one node taking ownership. An instance of the RAC database on the data server crashes due to a software bug or an operating system or hardware problem. The last surviving node on the RAC cluster fails and cannot be restarted.

Data failure
  Examples: Database corruption severe enough to disallow continuity on the current data server. Disk storage failure. A data file accidentally removed or not available. Media corruption impacting blocks of the database. Oracle block corruption caused by operating system or other node-related issues or bugs.

User error
  Description: This failure results in unavailability of parts of the database. The cause is usually a user error, introduced either by the operator or by bugs in the application code. Note: This category focuses only on user errors that impact database availability.
  Examples: A user error resulting in a table being dropped or in deletion of rows from a table. Application errors resulting in logical corruptions in the database. An operator error resulting in a batch job running more times than specified.

Failure of a redundant component
  Description: MAA prescribes redundancy of all hardware and software components. This class covers failures of redundant components that result in automatic takeover by the redundant pair.
  Examples: Network cards, networks, cluster interconnects, disk controllers and adapters, Oracle directory services, or host name resolution services.
This section provides an outage decision tree for unscheduled outages. For each outage in the decision tree, the high-level recovery steps are listed with links to the detailed descriptions for each recovery step. The outage decision tree is divided into the following tables for both the production and standby sites:
The recovery operation descriptions referenced in the tables are found in the Recovery Operation Descriptions section. Some outages require multiple recovery steps. For example, when a site failover occurs, the outage decision matrix states that 1) Data Guard failover, and then 2) site failover must occur. Refer to the Detailed Recovery Operations for steps and best practices for each recovery operation. Some outages are handled automatically without any loss of availability. For example, instance failure is managed automatically by RAC. Multiple recovery options for each outage are listed wherever relevant. The Operational Best Practice appendix discusses the broad guidelines for prevention and detection of any outage. Prevention is the cheapest solution in the long run since it avoids the need to invoke the repair process when outages do occur.
Table 8-2: Unscheduled Outages on the Production Site
Site
  1. Database failover
  2. Site failover

Any application server
  Managed automatically by the redundant application server farm

All application servers
  1. Database switchover
  2. Site failover

Production Database: node failure
  Managed automatically by RAC

Production Database: instance failure
  Managed automatically by RAC

Production Database: data failure
  Local object recovery, OR: 1. Database failover 2. Site failover

User error
  Local object recovery, OR: 1. Database forced failover 2. Site failover

Any tier: component failure
  Managed automatically by the redundant component
Unscheduled Outages on the Standby Site

Standby Database: primary standby node or instance failure (the one running MRP)
  Standby Instance Failover

Standby Database: secondary node or instance failure (one which is not running MRP)
  No impact, since the primary standby node or instance receives redo and the redo is applied by MRP. Restart the node and instance when available.

Standby Database: data failure (for example, media failure or disk corruption)
  Restoring Fault Tolerance after Standby Database Data Failure
Site-wide
  Description: The entire site where the current production resides is unavailable. This is normally known well in advance and can be scheduled.
  Examples: Scheduled power outages; site maintenance; regular planned switchovers to test infrastructure.

Application tier hardware maintenance (node impact)
  Description: Scheduled downtime of an application server node for hardware maintenance, including repairs and upgrades. Since the node is usually part of a redundant application server farm, this maintenance can be staggered among nodes and the farm maintains application service continuity.
  Examples: Repair of a failed component such as a memory card or CPU board; addition of memory or CPU to an existing node in the application tier.

Application tier software maintenance (node impact)
  Description: Scheduled downtime of an application server node to upgrade or patch a software component or to change its configuration. The software component involved can be the operating system, Oracle application server software, or the application code residing on the application tier.
  Examples: Upgrade of a software component such as the operating system or application server software; changes to operating system configuration parameters; changes to the application server configuration.

Database tier hardware maintenance (node impact)
  Description: Scheduled downtime of a database server node for hardware maintenance. The scope of this downtime is restricted to a node of the database cluster.
  Examples: Repair of a failed component such as a memory card or CPU board; addition of memory or CPU to an existing node in the database tier.

Database tier hardware maintenance (cluster-wide impact)
  Description: Scheduled downtime of the database server cluster for hardware maintenance. The scope of this downtime is the whole database cluster.
  Examples: Upgrade to the storage tier necessitating downtime on the database tier; addition of a node to the cluster, in some cases; upgrade or repair of the cluster interconnect.

Database tier software maintenance (node impact)
  Description: Scheduled downtime of a database server node for software maintenance. The scope of this downtime is restricted to a node of the database cluster.
  Examples: Upgrade of a software component such as the operating system; changes to operating system configuration parameters.

Database tier software maintenance (cluster-wide impact)
  Description: Scheduled downtime of the database server cluster for software maintenance. The scope of this downtime is the whole database cluster.
  Examples: Upgrade or patching of the cluster software; upgrade of the volume management software.

Database tier Oracle software patches
  Description: Scheduled downtime for any Oracle software patch.
  Examples: A one-off patch for Oracle software.

Database tier Oracle software upgrades
  Description: Scheduled downtime for any Oracle software upgrade, including patchsets.
  Examples: Upgrade or patching of Oracle system software.

Database tier application changes: object reorganization
  Description: Changes to the logical structure or the physical organization of Oracle database objects, typically motivated by performance or manageability improvements. This is always a planned activity; the method and time chosen for the reorganization should be planned appropriately. By using Oracle's online reorganization features, objects can remain available during the reorganization.
  Examples: Moving an object to a different tablespace; converting a table to a partitioned table; renaming or dropping columns of a table.

Database tier application changes: application code upgrade
  Description: Application code upgrade or patching at the database tier. This may or may not involve changes to database objects and, depending on the application, may or may not involve a database tier outage. Depending on the application upgrade, some, all, or none of the application objects may be available during the upgrade task.
  Examples: Upgrade or patching of the application software; changes to an object's structure and/or physical organization as part of an application patch.

Component maintenance (any tier)
  Description: MAA prescribes redundancy of all hardware and software components. These may have to be maintained, as well as tested, from time to time.
  Examples: Repair of a failed component such as a network card, cluster interconnect, or mirrored disk to restore resiliency.
This section provides an outage decision tree for scheduled outages. For each outage in the decision tree, the high-level recovery steps are listed with links to the detailed descriptions for each recovery step. The outage decision tree is divided into tables for both the production and standby sites:
The recovery operation descriptions referenced in the tables are found in the Recovery Operation Descriptions section.
Scheduled Outages on the Production Site

Site
  1. Database switchover
  2. Site failover

Application tier: node software maintenance (includes operating system, application server software, application software)
  Managed automatically by redundant nodes on the application server farm

Application tier: hardware maintenance (node impact)
  Managed automatically by redundant nodes in the application server farm

Database tier: hardware maintenance (cluster-wide impact)
  1. Database switchover
  2. Site failover

Database tier: software maintenance (node impact)
  Managed automatically by RAC

Database tier: software maintenance (cluster-wide impact)
  1. Database switchover
  2. Site failover

Primary Database: Oracle software patch upgrade for one-off patches
  For some patches, RAC rolling upgrades will be possible. Refer to OTN announcements and notes.

Primary Database: Oracle software upgrade, including patchsets
  No rolling upgrade

Primary Database: database object reorganization
  Online object reorganization

Primary Database: application code changes and upgrades
  Out of scope

Any tier: component changes
  Managed automatically by redundant components
Scheduled Outages on the Standby Site

Site
  Pre-outage: Preparing for Scheduled Secondary Site Maintenance
  Post-outage: Restoring Fault Tolerance after Secondary Site or Cluster-Wide Scheduled Outage

Standby Database: hardware or software maintenance on the primary standby node (the one running MRP)
  Pre-outage: Preparing for Scheduled Secondary Site Maintenance

Standby Database: hardware or software maintenance on a secondary node (one which is not running MRP)
  No impact, since the primary standby node or instance receives redo and applies it with MRP. Post-outage: restart the node and instance when available.

Standby Database: hardware or software maintenance (cluster-wide impact)
  Pre-outage: Preparing for Scheduled Secondary Site Maintenance
  Post-outage: Restoring Fault Tolerance after Secondary Site or Cluster-Wide Scheduled Outage
No rolling upgrade
Table 8-6: Preparing for Scheduled Secondary Site Maintenance
The preparation steps depend on the production database protection mode. The outages covered are: site shutdown; hardware maintenance (cluster-wide impact); software maintenance (cluster-wide impact); and hardware or software maintenance on the primary standby node (the one running MRP or LSP).

Maximum Availability or Maximum Performance
  Preparation steps: None.

Maximum Protection
  Preparation steps: Switch the production database protection mode to either maximum availability or maximum performance, using the steps highlighted in the Oracle Data Guard section titled Setting the Database Protection Mode.
MAA uses a variety of recovery options. These recovery options use a combination of Oracle product features and the infrastructure to prevent and minimize downtime and data loss. The following table contains descriptions of the different recovery operations.
Site failover
  Site failover involves takeover by the secondary site, which becomes the new production site. All new client requests are directed to the new site, the application tier at that site is activated to provide the application service, and the data server at that site becomes the new production database. The transition should occur from the database tier, to the application tier, to the outside world: the WAN traffic manager redirects traffic to the load balancers on the new site, and the load balancers at the new site are preconfigured to direct client traffic to the newly activated application servers.

Database failover
  This involves a Data Guard failover of the database tier. Complete recovery is attempted, so there is minimal or no data loss. The previous production database must be reinstantiated from backups or from the current production database, and a subsequent site failover has to occur. This is due to an unscheduled outage.

Database forced failover
  This involves a Data Guard failover of the database tier with incomplete, or point-in-time, recovery. There will be data loss when the standby database is activated to a consistent point before a logical corruption. The database on the former production site must be reinstantiated from backups or from the current production database, and a subsequent site failover has to occur. This is due to an unscheduled outage.

Database switchover
  This involves a Data Guard switchover of the database tier. The previous production database becomes the new standby database, while the previous standby database becomes the new production database, without any reinstantiation required. A subsequent site failover has to occur. This is a scheduled or planned outage.

RAC recovery
  RAC automatically handles instance and node failures at a given site to provide continued access to the back-end data server. Except for a small brownout period, the whole process is transparent from an application standpoint and occurs only on the primary site.

Standby instance failover
  When a standby node requires maintenance, you can switch to another standby instance to avoid any impact on the production database and to ensure that the standby does not lag too far behind.

Standby cluster maintenance
  When the standby cluster requires maintenance, or the standby cluster fails, the production database needs to be downgraded to maximum availability or maximum performance mode. However, these protection modes allow data divergence between the sites.

Application failover
  The application tier automatically fails over to one or more surviving RAC instances when an instance or node failure occurs at the primary site. Oracle Net manages the failover to the new instances.

Local object recovery
  In some scenarios, it may be cost-effective to continue on the current production site and recover lost objects locally, incurring partial unavailability. Oracle object recovery features can be used to address these requirements.

Object reorganization
  Many scheduled outages related to the data server involve some kind of reorganization of database objects, which needs to be accomplished with continued availability of the database. Oracle online object reorganization is used to manage these scheduled outages.
This section provides detailed steps for implementing outage solutions. Recovery operations are organized by tier: client tier, application server tier, and database server tier. The following figure illustrates the tiers.
[Figure: MAA site architecture across Tier 1 (client), the application server tier, and the database server tier. Clients reach the primary site over the Internet; a secondary site acts as backup. Each site contains redundant routers, firewalls, switches, hardware-based load balancers, application/web servers, IMAP and LDAP servers, and RAC instances fronting a RAC database. Redundant components are linked by heartbeats (hb), with active components at the primary site and standby components at the secondary site.]
This section covers the following recovery operations:
- Client Tier - Site Failover (Includes All Tiers)
- Application Server Tier - Application Server Failover
- Database Server Tier - Object Reorganization
- Database Server Tier - Object Recovery
- Database Server Tier - RAC Failover and Transparent Application Failover
- Database Server Tier - Data Guard Switchover
- Database Server Tier - Data Guard Failover
- Database Server Tier - Standby Instance Failover
- Database Server Tier - Standby Cluster Maintenance
Recovering from Outages - Client Tier Site Failover (Includes All Tiers)
[Figure: network routes before site failover. Client traffic flows from the Internet through the primary site's redundant routers, firewalls, and switches to the active application/web servers, IMAP servers, and LDAP servers, and on through firewalls and switches to the RAC instances and RAC database; paired components at the two sites are connected over heartbeat (hb) links.]
The following figure illustrates the network routes after site failover (heavy dotted blue line with arrows).
[Figure: network routes after site failover. Client traffic from the Internet is redirected to the secondary site and flows through its firewalls, switches, application/web servers, IMAP servers, and LDAP servers to the RAC instances and RAC database at the secondary site.]
The following steps describe what happens to network traffic during a failover or switchover:
1. The production database is failed over or switched over to the secondary site.
2. The middle-tier application servers start up on the secondary site.
3. The client DNS request and resolution behavior yields the wide-area traffic manager selection of the secondary site. The wide-area traffic manager at the secondary site returns the virtual IP address of the load balancer at the secondary site.
4. The secondary site load balancer directs traffic to the secondary site middle-tier application servers.
5. The secondary site is ready to take client requests.
Failover also depends on the client's web browser. Most browser applications cache the Domain Name Service (DNS) entry for a period of time, so sessions in progress during an outage may not fail over until the cache timeout expires. The only way to resume service to such clients is to close the browser and restart it.
Database Server Tier - Object Reorganization
Object reorganization may be needed to:
- Improve performance of the system by relocating objects
- Accommodate changes in structure due to application changes or upgrades
- Partition large tables into smaller partitions
This should be done with minimal disruption to application availability. The Oracle9i database provides a variety of features that allow reorganization of tables and indexes online, with minimal discontinuity in the availability of the object for regular application functions. Object reorganization is a planned activity and should be scheduled for times of low system usage. There are often multiple ways to perform the same reorganization, and their impacts on object availability differ. An appropriate strategy should be chosen based on:
- Size of the object
- Amount and ratio of reads and writes on the object
- Nature and extent of the reorganization
- Resources available for the reorganization
Some cases of reorganization involve changes to the data content of the objects. These are normally application-specific and are best tackled at the application level. They are not considered in this discussion. The following table summarizes object reorganization solutions.
Table 9-2: Summary of Object Reorganization Solutions
Rename a column
  Methods: ALTER TABLE ... RENAME COLUMN

Move a table (for example, to a different tablespace with a different block size or different storage characteristics)
  Methods: ALTER TABLE ... MOVE; online table redefinition

Reorganize a table's physical layout for better performance
  Methods: Online table redefinition; CREATE TABLE AS SELECT (CTAS)

Change table organization (for example, change a normal heap-organized table to an index-organized table)
  Methods: Online table redefinition

Horizontal partitioning (for example, partitioning a large table into multiple partitions)
  Methods: Online table redefinition

Vertical partitioning (for example, splitting a table vertically by its columns into two or more tables)
  Methods: Online table redefinition; application-specific customized procedure; multiple CTAS

Drop a column
  Methods: ALTER TABLE ... SET UNUSED followed by ALTER TABLE ... DROP UNUSED COLUMNS; ALTER TABLE ... DROP COLUMN; online table redefinition; with subsequent online index re-creation where needed

Index creation
  Methods: Online table redefinition along with new index/constraint definitions; subsequent online index re-creation

Index rebuild
  Methods: ALTER INDEX ... REBUILD ONLINE

Index coalesce
  Methods: ALTER INDEX ... COALESCE (an online operation)

Custom application-specific reorganization (for example, reorganizing the contents of a BLOB)
  Methods: Some cases of object reorganization are very specific to the object and the content of the table and are driven by application changes. These should be accomplished using an application-specific customized procedure.
Examples appear in the appendix. Please refer to the following Oracle documentation for more details, examples and restrictions for these features:
- Oracle9i Database Administrator's Guide, Release 2 (9.2)
- Oracle9i SQL Reference, Release 2 (9.2)
- Oracle9i Supplied PL/SQL Packages and Types Reference, Release 2 (9.2)
Online table redefinition can be used to:
- Modify the storage parameters of the table
- Move the table to a different tablespace in the same schema
- Add support for parallel queries
- Add or drop partitioning support
- Re-create the table to reduce fragmentation
- Change the organization of a normal (heap-organized) table to an index-organized table, and vice versa
- Add or drop a column
The restrictions on the old table and the new table are listed in Chapter 15, Managing Tables, in Oracle9i Database Administrator's Guide, Release 2 (9.2).
Plan the required space for the online table redefinition carefully; refer to Useful Practices to determine the required space. The space required will vary with the size of the table, the rate and amount of writes to the table, and the extent of the reorganization.

Online object reorganization consumes additional system resources and may cause some performance impact. Additional undo segment space is required during the online reorganization to capture the interim transactions. Undo segment space requirements spike at the beginning, synchronization, and ending steps and are highest on the instance where these steps are executed. Additional undo segment space is also consumed on all instances where transactions updating the given table are processed.

It is recommended that the reorganization be tested in a test environment before the production run, and that the space (data, index, undo) and performance impact of the reorganization be assessed. The test system should be similar to production in object size, transaction mix, transaction rate, and the extent of the reorganization. Additional space required for data, index, and undo segments can be measured in the test environment and scaled for the production environment; the performance impact can be measured and scaled the same way. It is suggested that the impact measurement be done for the worst-case scenario, to ensure that enough system resources are available to complete the reorganization successfully.

Even though updates are allowed while online table redefinition is in progress, it should be scheduled for times of low system activity on the given table. This also reduces the amount of undo space required for the reorganization and helps expedite the online redefinition process.
Useful Practices
- Synchronization of the application code to use the new definition of the table must be coordinated externally if the table's structural definition has changed. The complexity of this task depends on the nature of the change and the application code.
- The start and end of the online table redefinition, start_redef_table() and finish_redef_table(), should be done when there are no long-running transactions involving the table being redefined. The redefinition calls wait for all pending transactions to be either committed or rolled back.
- Any index and constraint definitions on the interim table must be created after the start of the redefinition; this makes the start of the redefinition faster. Referential integrity constraints and triggers created on the interim table should be disabled.
- The synchronization call, sync_interim_table(), synchronizes the table and the interim table. Calling it on a regular basis during the redefinition reduces the absolute time taken by finish_redef_table() and may also reduce the space requirement.
- If any materialized views exist on the table being redefined, they must be dropped during the operation.
- Only one online table redefinition can take place at a time on a given table.
- The space requirement for the entire process should be planned in advance, and more than adequate space must be available. The additional space required for online table redefinition includes:
  - Space for all current rows of the table in the new definition
  - Space for all additional rows and updates made during the redefinition process
  - Space for indexes on the new definition
  - Space for the materialized view log that captures changes during the redefinition process
- DML locks must be enabled and sufficient for online table redefinition.
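The start/synchronize/finish sequence described in the practices above is driven by the DBMS_REDEFINITION package. The following SQL*Plus sketch shows the overall flow for a hypothetical redefinition of SCOTT.EMP through an interim table EMP_INT; the schema, table names, and interim table definition are illustrative only, and the restrictions in the Administrator's Guide still apply.

```sql
-- Check that the table is a candidate for online redefinition
EXECUTE DBMS_REDEFINITION.CAN_REDEF_TABLE('SCOTT', 'EMP');

-- Start the redefinition; rows are copied into the interim table
EXECUTE DBMS_REDEFINITION.START_REDEF_TABLE('SCOTT', 'EMP', 'EMP_INT');

-- Create indexes, constraints (disabled), and grants on EMP_INT here,
-- after the start of the redefinition

-- Synchronize periodically to shorten the final step
EXECUTE DBMS_REDEFINITION.SYNC_INTERIM_TABLE('SCOTT', 'EMP', 'EMP_INT');

-- Finish: the definitions of EMP and EMP_INT are swapped
EXECUTE DBMS_REDEFINITION.FINISH_REDEF_TABLE('SCOTT', 'EMP', 'EMP_INT');
```

If the process fails midway, DBMS_REDEFINITION.ABORT_REDEF_TABLE can be used to clean up before retrying.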
ALTER TABLE ... DROP COLUMN
This drops one or more columns of a table. If the table is large, this can be a long operation and should be avoided; in such cases, ALTER TABLE ... SET UNUSED is the preferred option. Any columns already marked unused are also dropped as part of this operation, as are any indexes that include a dropped column. Any application still referring to the dropped column will receive an Oracle error (ORA-00904: "col": invalid identifier).

ALTER TABLE ... SET UNUSED
This is like a deferred ALTER TABLE ... DROP COLUMN statement. It marks columns to be dropped as unused; for all practical purposes the column is no longer part of the table, though it still occupies the same physical space as before. This is an extremely fast operation with minimal downtime. The actual column drop occurs when ALTER TABLE ... DROP UNUSED COLUMNS is invoked.

ALTER TABLE ... RENAME COLUMN
This statement renames existing columns. When you rename a column, all function-based indexes and check constraints that depend on the renamed column remain valid, but other dependent views, triggers, domain indexes, functions, procedures, and packages are marked INVALID. If these still refer to the old name, the subsequent attempt at revalidation will also fail. You cannot rename a column that is used to define a join index; instead, you must drop the index, rename the column, and re-create the index.

ALTER TABLE ... MOVE
This relocates a non-partitioned table, for example to change its storage parameters. For index-organized non-partitioned tables, the move can be made online. For heap-organized non-partitioned tables, this command locks the table for the duration of the move; if the table is large, online redefinition is a better option.
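The statements above can be illustrated with a hypothetical ORDERS table (the table, column, and tablespace names are illustrative only):

```sql
-- Mark a column unused (fast), then physically drop it later
ALTER TABLE orders SET UNUSED (legacy_flag);
ALTER TABLE orders DROP UNUSED COLUMNS;

-- Or drop a column directly (potentially long-running on large tables)
ALTER TABLE orders DROP COLUMN legacy_flag;

-- Rename a column; dependent views, triggers, procedures, and
-- packages are marked INVALID
ALTER TABLE orders RENAME COLUMN cust_no TO customer_id;

-- Relocate a table; online only for index-organized tables
ALTER TABLE orders MOVE TABLESPACE orders_ts;
```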
Improve performance
Drop and re-create existing indexes
New or different constraints
Changing data access patterns for the application
All index builds can be done online without impacting application availability. Please see the appendix for examples of online index reorganization.
Useful Practices
Even though indexes can be built online, this operation should be scheduled for times of low system activity. Because an in-progress online index build waits while there are uncommitted transactions on the base table, such DML should be kept to a minimum and should not be long-running.

Index rebuilding can be used to move indexes across tablespaces as well as to change their storage characteristics.

Index coalescing merges leaf blocks within the same branch of the tree and frees up index leaf blocks for reuse. Coalescing frees up space in the allocated index segment for new entries and reduces fragmentation, which may provide some performance gains. Coalescing is preferred over rebuilding if enough disk space is not available for an index rebuild. Coalescing indexes can be done online.

Ensure that there is sufficient sort space available for index build and index rebuild scenarios.

In some cases, it may be faster to drop and re-create an index online than to rebuild it. Typically, these cases occur when the index rebuild is attempted during heavy system activity on the base table and there are many updates to the columns that are part of the index being rebuilt.
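A sketch of the online index operations discussed above; the index, column, and tablespace names are illustrative:

```sql
-- Build a new index without blocking DML on the base table
CREATE INDEX ord_cust_ix ON orders (customer_id) ONLINE;

-- Rebuild online, moving the index to another tablespace
ALTER INDEX ord_cust_ix REBUILD ONLINE TABLESPACE idx_ts;

-- Coalesce leaf blocks in place when space for a full rebuild is unavailable
ALTER INDEX ord_cust_ix COALESCE;
```

REBUILD needs space for both the old and new copies of the index until the operation completes; COALESCE works within the existing segment, which is why it is preferred when disk space is tight.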
Local Object Recovery - See Local Object Recovery Solutions for details.
o Recover locally now - Start the recovery process locally as soon as possible.
o Recover locally later - Do local recovery, but defer the recovery process until later.
Role Transition
o Switchover - Data Guard switchover
o Failover with complete recovery - Data Guard failover completely, or as much as possible
o Failover to a point in time - Data Guard failover to a point in time before the error occurred
Customers should use the process outlined in the following section to create a customized object corruption decision matrix that helps them quickly decide between a role transition and local object recovery when an object is corrupted. When the decision is unclear or too complex and the object is affecting application availability, customers should initiate a database failover and site failover to quickly restore application availability. In this section, a process is described for deciding between recovering locally and failing over to the standby. Finally, the object recovery solutions used in MAA are discussed, followed by general strategies for problem avoidance. This section includes the following topics:
Deciding Between Local Object Recovery and Role Transition
Local Object Recovery Solutions
3. Determine how widespread the problem is
4. Determine how critical the object is
5. Determine the nature of the problem
6. Determine the type of the object
7. Decide the recovery action to take
bid => block id

For missing objects, verify the non-existence of the object. Check your external change management tool for the validity of the object.
Action:
o If SEGMENT_TYPE is ROLLBACK/TYPE2 UNDO, then fail over.
o If SEGMENT_TYPE is TEMPORARY, do not fail over. Consider local re-creation of the temporary object or the temporary tablespace.
o For all other objects, follow the decision matrix for the object.
[Table: Object corruption decision matrix. The matrix maps combinations of the criteria Immediacy of Access (Now or Later), Time Since Event, Problem on Standby (Yes or No), and Cost of Local Recovery (High, Low, or Same) to one of the following actions: recover locally now, recover locally later, database failover, database forced failover, or database switchover (possible in some cases).]
The decision matrix above requires that four criteria be evaluated in determining the proper recovery action: Immediacy of Access, Time Since Event, Problem on Standby, and Cost of Local Recovery. Descriptions of the criteria are given below in Table 9-4.
Table 9-4: Object Recovery Decision Criteria Descriptions

Immediacy of access - Will the object be accessed within the time required to restore it locally?
o Now - Object will be accessed in the time required to restore.
o Later - Object will not be accessed in the time required to restore.

Time since event - How much time has passed since the occurrence of the problem? This time is measured from the occurrence of the problem, not just from its detection.
o Past lag - Time since the causing event is past the standby lag and the undo retention time on the production database.
o Within lag - Time since the causing event is within the standby lag or the undo retention time on the production database.

Problem on standby database - Has the problem propagated to the standby already? In some cases, the time at which the problem was caused cannot be easily determined, and it may be important to know whether the problem has already propagated to the standby.
o Yes - Problem can be seen on the standby as well.
o No - Problem not seen on the standby so far. Verify by opening the standby database in read-only mode or by running a DBVERIFY operation on the relevant data files.

Cost of local recovery - How does the cost of local recovery compare with the cost of failing over to the standby and reinstantiating a new standby? This is not a business cost, which is assumed to be implicit in the criticality of object availability, but cost in terms of feasibility of recovery, resources required, and their impact on performance and total time taken. The cost of local recovery should include:
o Cost of local recovery or re-creation
o Time to restore and recover the object from a valid source
o Time to recover other dependent objects, such as indexes; constraints (primary key, foreign key, and other constraints); related tables and their indexes and constraints; and dictionary objects based on this object (re-creation and revalidation)
o Availability of resources such as disk space, data/index tablespaces, and temporary tablespace
o Impact on performance and functionality of current normal application functions due to the absence of the object
o Impact on performance and functionality of current normal application functions due to the attempt to re-create the object
Possible values:
o High - Cost to re-create the object locally is very high compared to failover and reinstantiation.
o Low - Cost to re-create the object locally is low compared to failing over and reinstantiation.
o Same - Cost is similar even though the cost factors are different. When the cost of local recovery is Low or Same, the actions are always similar, since in this case it is always safer to remain on the current database and avoid the inherent risk involved in client failover.
Object Type: Index
o Drop index and re-create index online - Useful for small indexes that need to be re-created because of media corruption. Creating the index online allows the base object to remain in use during the operation.
o RMAN block media recovery - Useful when an RMAN backup is available and block media recovery is possible. This is probably the easiest and quickest solution when the backup is available online and the corrupted blocks are well identified and not too widespread.

Object Type: Table
o Flashback query - Useful when detection of the problem due to a user or application error is within the UNDO_RETENTION time specified on the production database. Note that to do this well, the impact on all the dependent objects should be understood.
o Recover from the standby database - If the table needs to be re-created or certain rows updated, a possible method is to get the data from the standby. The standby needs to be opened in read-only mode. The actual mechanism of transferring data may include export from the standby and import to production; dumping to a file and reloading on production; or another method such as inserting over a database link or using SQL*Loader.
o Drop and re-create - If the table is mainly a look-up table, or the scripts to re-create the table and populate it are readily available, then simply dropping the table and re-creating it may be the fastest option.
o RMAN block media recovery - Useful when an RMAN backup is available and block media recovery is possible. This is probably the easiest and quickest solution when the backup is available online and the corrupted blocks are well identified and not too widespread. Not suitable when the physical media is corrupted or when the problem is due to a user or application error.
o DBMS_REPAIR - Useful when working at an object level. Can be used to mark blocks as invalid and enable skipping of corrupt blocks at the object level. Can leverage block media recovery in the future to recover damaged blocks.
Flashback Query
Flashback Query uses Oracle's multi-version read-consistency capabilities to restore data by applying undo as needed. Administrators can configure undo retention by simply specifying how long undo should be kept in the database. Using Flashback Query, a user can query the database as it existed this morning, yesterday, or last week. This can be used to recover an object from a user or application error that resulted in unwanted or incorrect DML.
DDL that causes any structural changes to the table, or that relocates the table, should not have been issued on the object since the time of the user or application error. If any ALTER TABLE statement with the MODIFY COLUMN option, or any DROP TABLE or TRUNCATE TABLE statement, has been issued, then flashback query cannot be used as a recovery mechanism.

The amount of undo generated since the time of the user error must be less than the undo space available, and no undo must have been overwritten. Note that the UNDO_RETENTION time is the maximum time period for which undo information is retained in the database. Regardless of the UNDO_RETENTION time specified, if the amount of undo generated exceeds the undo space available, the undo will be overwritten, and the actual amount of time one can go back may be reduced.

To maintain the data integrity of the database, a useful prerequisite is knowledge of the operation that resulted in the error, including:
o The appropriate time or SCN of the error
o The cause of the user error
o All the changes caused by the user error on the given object
o All the side effects caused by the DML due to triggers based on the given object; other objects may have been affected, and this has to be known to maintain data integrity
o An understanding of all the dependencies if an object is flashed back
o An understanding of all transactions that have used the current data, and their impact, if the object is flashed back
It is important to have the exact data model and transaction model for the application in order to do an impact analysis of all the objects affected by problems with a given object. The size of the impacted object and the amount of change should be small enough to allow re-creation and recovery of the object or the impacted rows.
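As an illustration of flashback query in Oracle9i, the DBMS_FLASHBACK package can place a session at a past SCN so the pre-error data can be examined; the table name and SCN below are hypothetical:

```sql
-- Place the session at a known-good SCN (value is illustrative)
EXECUTE DBMS_FLASHBACK.ENABLE_AT_SYSTEM_CHANGE_NUMBER(239480100);

-- Queries in this session now see the pre-error version of the data
SELECT * FROM orders WHERE order_id = 4501;

-- Return the session to the present
EXECUTE DBMS_FLASHBACK.DISABLE;
```

DML is not permitted while flashback is enabled, so the repair itself is typically performed after DBMS_FLASHBACK.DISABLE, using a cursor opened while flashback was enabled. Using an SCN rather than a timestamp, as recommended below, keeps all tables in a transaction consistent with one another.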
Useful Practices
UNDO_RETENTION time should be set based on the allotted time to flash back to the past. If local recovery from user error is the preferred method for an installation, this time should be larger than the standby lag. If failing over to the standby is the preferred method, the UNDO_RETENTION time should be less than the standby lag. Please refer to Database Configuration for methods of sizing the UNDO_RETENTION time.

The sizing of undo tablespaces should be based on the UNDO_RETENTION setting and the amount of undo generated during peak intervals, not on average undo generation rates. Please refer to the Oracle Administration Guide for more information on sizing undo tablespaces and monitoring undo space usage.

For large objects, if the extent of the change is known, it is suggested that the flashback query be designed to access and fix only the rows impacted. For small objects, it may be better to create the whole table, its indexes, and dependent objects, and then replace the original table with the new table.

If the impacted object is re-created as a new object, all the constraints, triggers, views, synonyms, and grants should also be re-created on the new object. All DDL related to a given object should be available. Using an object repository tool makes this manageable.

If the update on the table is part of a transaction, it is suggested to do the flashback query based on SCN rather than a timestamp. This should be done for all tables involved in the transaction.

It is possible to exceed or work around the UNDO_RETENTION time limit by doing the following:
o Create a new undo tablespace.
o Change the undo tablespace for the instance by issuing:
  ALTER SYSTEM SET UNDO_TABLESPACE=<new_tablespace>;
The instance uses the new undo tablespace for all new transactions. The undo information needed by the flashback query remains preserved in the old undo tablespace, which is no longer used by new transactions. The old undo tablespace information is not overwritten until you drop the tablespace or switch the undo tablespace back.
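The undo tablespace workaround above might be scripted as follows; the tablespace name, file path, and size are illustrative:

```sql
-- Create a fresh undo tablespace (name and datafile are hypothetical)
CREATE UNDO TABLESPACE undotbs2
  DATAFILE '/u01/oradata/sales/undotbs2_01.dbf' SIZE 2000M;

-- Direct all new transactions to the new undo tablespace
ALTER SYSTEM SET UNDO_TABLESPACE = undotbs2;

-- Undo in the old tablespace is now preserved for flashback query
-- until the old tablespace is dropped or switched back.
```

This effectively freezes the old undo so a flashback query can still reach data older than the space-driven retention limit would otherwise allow.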
Useful Practices
If block media recovery is to be used, then the RMAN backup and the archived redo logs to be used should be available locally. Even if archived redo logs have missing redo, block media recovery can still work if the block has been renewed in the interim or there are no changes for the specific block in the missing redo.

If RMAN block validation has been run proactively, then the V$DATABASE_BLOCK_CORRUPTION view will have a list of blocks validated as corrupt by RMAN. These can be proactively recovered through block media recovery.
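A sketch of block media recovery from the RMAN command line; the datafile and block numbers are illustrative:

```sql
-- Repair specific corrupt blocks (file and block numbers are hypothetical)
BLOCKRECOVER DATAFILE 7 BLOCK 3 DATAFILE 2 BLOCK 235;

-- Or repair every block recorded as corrupt in
-- V$DATABASE_BLOCK_CORRUPTION by a prior RMAN validation run
BLOCKRECOVER CORRUPTION LIST;
```

While block media recovery is in progress, the rest of the datafile and the database remain available, which is what makes this the preferred local repair when corruption is limited and well identified.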
DBMS_REPAIR Package
DBMS_REPAIR is a package of procedures to detect and repair corrupt blocks in tables and indexes. Using this approach, you can address corruptions where possible and also continue to use objects while you attempt to rebuild or repair them. Please refer to Chapter 22, Detecting and Repairing Data Block Corruption, of the Oracle9i Database Administrator's Guide for more details as well as limitations.
Since the repair is at the object level, logical dependency of the given object on other objects should be taken into account. Also, repair can sometimes involve loss of data.
Useful Practices
The REPAIR_TABLE and ORPHAN_KEYS_TABLE tables can be created beforehand. This avoids the need to create them when they are actually required.

The skip-corrupt setting remains in effect until it is explicitly unset. With this option on (set by the DBMS_REPAIR.SKIP_CORRUPT_BLOCKS procedure), table and index scans continue to skip any detected corrupt blocks, including blocks that were corrupted after the initial outage. Reset this option as soon as possible, and re-create the affected objects at the earliest opportunity. To verify that the option has been set for a table, use this SQL command:

SELECT SKIP_CORRUPT FROM DBA_TABLES WHERE TABLE_NAME = 'table_name';
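The practices above can be sketched with the DBMS_REPAIR package; the schema and object names are illustrative:

```sql
-- Create REPAIR_TABLE and ORPHAN_KEYS_TABLE ahead of time
BEGIN
  DBMS_REPAIR.ADMIN_TABLES('REPAIR_TABLE', DBMS_REPAIR.REPAIR_TABLE,
                           DBMS_REPAIR.CREATE_ACTION);
  DBMS_REPAIR.ADMIN_TABLES('ORPHAN_KEYS_TABLE', DBMS_REPAIR.ORPHAN_TABLE,
                           DBMS_REPAIR.CREATE_ACTION);
END;
/

-- Check, mark, and skip corrupt blocks for a hypothetical SCOTT.EMP table
DECLARE
  corrupt_count BINARY_INTEGER;
  fix_count     BINARY_INTEGER;
BEGIN
  DBMS_REPAIR.CHECK_OBJECT('SCOTT', 'EMP', corrupt_count => corrupt_count);
  DBMS_REPAIR.FIX_CORRUPT_BLOCKS('SCOTT', 'EMP', fix_count => fix_count);
  DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('SCOTT', 'EMP');  -- sets SKIP_CORRUPT
END;
/
```

Because SKIP_CORRUPT_BLOCKS hides all corrupt blocks from scans, including future ones, it should be treated as a temporary bridge until the object is rebuilt.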
9.6 Database Server Tier RAC Failover and Transparent Application Failover
RAC and Transparent Application Failover (TAF) eliminate or reduce the impact of a production host or instance outage. With automatic instance recovery and a pre-configured client failover, clients will not notice any disruption of service. If there is a hardware or software failure on RAC Instance 1, then clients reconnect to RAC Instance 2 by using connect-time failover (new connections) or Oracle Net transparent application failover (existing connections). RAC can hide instance or node failures from the end user or application server when RAC is used with application failover features such as TAF. Automatic application failover functionality transparently re-routes connections to a surviving instance and resubmits queries that were in progress at the time of the failure. This level of resiliency is accomplished without special application coding if the application uses OCI libraries, release 8 or later. Implement TAF with connect-time failover and client load balancing for multiple addresses. In the following example, Oracle Net connects randomly to one of the protocol addresses on sales1-server or sales2-server. If the instance fails after the connection is established, then TAF fails the application over to the listener on the other node.
sales.us.acme.com= (DESCRIPTION= (LOAD_BALANCE=on) (FAILOVER=on) (ADDRESS=(PROTOCOL=tcp)(HOST=sales1-server)(PORT=1521)) (ADDRESS=(PROTOCOL=tcp)(HOST=sales2-server)(PORT=1521)) (CONNECT_DATA= (SERVICE_NAME=sales.us.acme.com) (FAILOVER_MODE= (TYPE=session) (METHOD=basic) (RETRIES=20) (DELAY=15))))
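To confirm that TAF is in effect for connected sessions, the failover columns of V$SESSION can be checked; this is a sketch, and the values returned depend on the FAILOVER_MODE settings in the net service name above:

```sql
-- FAILOVER_TYPE and FAILOVER_METHOD show the negotiated TAF settings;
-- FAILED_OVER is YES for sessions that have already failed over
SELECT USERNAME, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER
  FROM V$SESSION
 WHERE USERNAME IS NOT NULL;
```

Running this before and after a test instance failure is a simple way to verify that existing connections actually migrated rather than reconnected manually.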
This section includes the following topics:
Data Guard Switchover Overview
Physical Standby Database Switchover
Logical Standby Database Switchover
Data Guard switchover should not be used where object recovery solutions provide a faster and more efficient alternative.
Switchover Preparation Step 1: Check the Status of Log Transport Services
Switchover Preparation Step 2: Check for Archive Gaps Between Production and Standby Databases
Switchover Preparation Step 3: Record Current Online Redo Log Sequence Number and Standby Recovery Sequence Numbers
Switchover Preparation Step 4: Shut Down All Production and Standby Instances Except One
Switchover Preparation Step 5: Stop Active Sessions on Remaining Active Production Instance
Switchover Preparation Step 6: Check the Switchover Status on the Production Database
Ensure that the log transport services are working before switching over. The standby alert log should also have messages about opening the standby online redo logs and writing sequences. For example:
RFS: Successfully opened standby logfile
Switchover Preparation Step 2: Check for Archive Gaps between Production and Standby Databases
If you check the protection mode from the production's V$ARCHIVE_DEST_STATUS view, it will state either the protection mode or RESYNCHRONIZATION, which implies an archive gap. Check the protection mode of the standby destination (LOG_ARCHIVE_DEST_2 in this case) as follows:
SELECT PROTECTION_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;
On the production database, run the following query to determine which archived redo logs were not shipped to the standby database:
SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM (SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=1) LOCAL WHERE LOCAL.SEQUENCE# NOT IN (SELECT SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND THREAD# = LOCAL.THREAD#);
If an archive gap does exist, copy the archived redo logs directly and register them with the standby database. Refer to Oracle9i Data Guard Concepts and Administration, Appendix B.3, Resolving Archive Gaps Manually. Normally this is done automatically through automatic gap detection.
Switchover Preparation Step 3: Record Current Online Redo Log Sequence Number and Standby Recovery Sequence Numbers
Record the current online redo log sequence number and the standby recovery sequence numbers for potential debugging. On the production database:
SELECT THREAD#, SEQUENCE# FROM V$LOG WHERE STATUS='CURRENT';

On the standby database:

SELECT THREAD#, MAX(SEQUENCE#) FROM V$LOG_HISTORY GROUP BY THREAD#;
Switchover Preparation Step 4: Shut Down All Production and Standby Instances except One
After shutting down all of the instances but one, execute the following query on production and on the standby to validate that no other instances are active:
SELECT INSTANCE_NAME, HOST_NAME FROM GV$INSTANCE WHERE INST_ID <> (SELECT INSTANCE_NUMBER FROM V$INSTANCE);
No rows should be returned. If any rows are returned, then shut down those instances. Attempting SWITCHOVER TO STANDBY or SWITCHOVER TO PRIMARY with other instances active results in the following error:
ORA-01105: mount is incompatible with mounts by other instances
Switchover Preparation Step 5: Stop Active Sessions on Remaining Active Production Instance
To identify active sessions, execute the following query:
SELECT SID, PROCESS, PROGRAM FROM V$SESSION WHERE TYPE = 'USER' AND SID <> (SELECT DISTINCT SID FROM V$MYSTAT);

Common processes that prevent switchover are:

CJQ0, the Job Queue Scheduler Process. Stop CJQ0 with ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
QMN0, the Advanced Queue Time Manager. Stop QMN0 with ALTER SYSTEM SET AQ_TM_PROCESSES=0;
DBSNMP, the Oracle Enterprise Manager Intelligent Agent. At the OS level, issue the AGENTCTL STOP command.
Instead of stopping the processes manually, you can use the WITH SESSION SHUTDOWN option of the switchover statement in Switchover Step 2.
ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;
Switchover Preparation Step 6: Check the Switchover Status on the Production Database
Enter the following statement:
SELECT SWITCHOVER_STATUS FROM V$DATABASE;
Switchover Step 1: Ensure that Switchover Preparation Steps Have Been Executed
Switchover Step 2: Switch Over the Current Production Database to the Standby Database
Switchover Step 3: Start the New Standby Database
Switchover Step 4: Check the Switchover Status and Finish Recovery if Necessary
Switchover Step 5: Convert the Former Standby Database to a Production Database
Switchover Step 6: Restart All Instances
Switchover Step 1: Ensure that Switchover Preparation Steps Have Been Executed
See Preparing for Data Guard Switchover. Preparation is essential for a successful switchover.
Switchover Step 2: Switch Over the Current Production Database to the Standby Database
On the current production instance, execute the following statement:
ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY [WITH SESSION SHUTDOWN];

This statement accomplishes the following:

Closes the primary database, terminating any active sessions
Archives any unarchived redo logs and applies them to the standby database
Adds an end-of-redo marker to the header of the last log file being archived
Creates a backup of the current control file
Converts the current control file into a standby control file
Switchover Step 4: Check the Switchover Status and Finish Recovery if Necessary
On the current standby instance, execute the following statement (the steps in this section can be run in parallel with the steps to start the new production instance from the previous section):
SELECT SWITCHOVER_STATUS FROM V$DATABASE;

If SWITCHOVER_STATUS=TO PRIMARY, then continue with Switchover Step 5.
If SWITCHOVER_STATUS=SWITCHOVER PENDING, this may be satisfactory if the next sequence number required is in the standby online redo logs. It is safer to take the following actions:
RECOVER MANAGED STANDBY DATABASE CANCEL;
RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT;

Alert log sample:

Media Recovery Log <archive log file name>
Media Recovery Log <archive log file name>
Identified end-of-REDO for thread 1 sequence <sequence#>
Media Recovery Log <archive log file name>
Identified end-of-REDO for thread 2 sequence <sequence#>
Media Recovery End-Of-Redo indicator encountered
Media Recovery Applied until change <SCN>
MRP0: Media Recovery Complete: End-Of-REDO
Resetting standby activation ID <activation-id>
MRP0: Background Media Recovery process shutdown

Check the alert log to see if recovery is progressing. Then query the switchover status until it is TO PRIMARY.
This statement accomplishes the following:

Makes sure the last archived redo log file has been received and applied through the switchover (end-of-redo) marker. If not, Data Guard returns an error.
Closes the database if it has been opened for read-only transactions
Converts the standby control file to the current control file
If the statement completes without error, then continue with Switchover Step 6. Successful completion is indicated by the following message in the alert log:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
RESETLOGS after incomplete recovery UNTIL CHANGE SCN 239480106
Switchover: Complete - Database shutdown required
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
Post-Switchover Step 1: Clear Online Redo Log Groups on the Standby Database
Post-Switchover Step 2: Check Local and Remote Archive Destinations on the Production Database
Post-Switchover Step 3: Ensure that the Lag is Set Up Correctly
Post-Switchover Step 4: Ensure that Recovery is Applying New Archived Redo Logs
Post-Switchover Step 1: Clear Online Redo Log Groups on the Standby Database
On the standby database, enter the following sequence of statements to clear the online redo log groups. Clear each online redo log group as listed from the query:
SELECT GROUP# FROM V$LOG;

RECOVER MANAGED STANDBY DATABASE CANCEL
ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 2;
ALTER DATABASE CLEAR LOGFILE GROUP 3;
ALTER DATABASE CLEAR LOGFILE GROUP 4;
RECOVER MANAGED STANDBY DATABASE DISCONNECT
Post-Switchover Step 2: Check Local and Remote Archive Destinations on the Production Database
On the production database, enter the following statement:
SELECT NAME_SPACE, STATUS, TARGET, LOG_SEQUENCE, TYPE, PROCESS, REGISTER, ERROR FROM V$ARCHIVE_DEST WHERE STATUS != 'INACTIVE';

Local and remote archive destinations should be returned. If all expected destinations are not returned, then investigate the alert log and the V$ARCHIVE_DEST and V$ARCHIVE_DEST_STATUS views for errors:

SELECT * FROM V$ARCHIVE_DEST_STATUS WHERE STATUS != 'INACTIVE';
Post-Switchover Step 4: Ensure that Recovery is Applying New Archived Redo Logs
Execute the following query on the production and standby databases:
SELECT 'ARCHIVED LOG MAX ' || THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG GROUP BY THREAD#;

On the production database, this query shows the redo logs archived and sent if Post-Switchover Step 2 showed no errors. The output from the standby and production databases should match.
Switchover Preparation Step 1: Check the Status of Log Transport Services
Switchover Preparation Step 2: Check for Archive Gaps Between Production and Standby Databases
Switchover Preparation Step 3: Record Current Online Redo Log Sequence Number and Standby Recovery Sequence Numbers
Switchover Preparation Step 4: Stop Active Sessions on Remaining Active Production Instance
Ensure that the log transport services are working before switching over.
Switchover Preparation Step 2: Check for Archive Gaps between Production and Standby Databases
If you check the protection mode from the production's V$ARCHIVE_DEST_STATUS view, it will state either the protection mode or RESYNCHRONIZATION, which implies an archive gap. Check the protection mode of the standby destination (LOG_ARCHIVE_DEST_2 in this case) as follows:
SELECT PROTECTION_MODE FROM V$ARCHIVE_DEST_STATUS WHERE DEST_ID=2;
On the production database, run the following query to determine which archived redo logs were not shipped to the standby database:
SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM (SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=1) LOCAL WHERE LOCAL.SEQUENCE# NOT IN (SELECT SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND THREAD# = LOCAL.THREAD#);
If an archive gap does exist, copy the archived redo logs directly and register them with the standby database. Refer to Oracle9i Data Guard Concepts and Administration, Appendix B.3, Resolving Archive Gaps Manually. Normally this is done automatically through automatic gap detection.
Switchover Preparation Step 3: Record Current Online Redo Log Sequence Number and Standby Recovery Sequence Numbers
Record the current online redo log sequence number and the standby recovery sequence numbers for potential debugging. On the production database:
SELECT THREAD#, SEQUENCE# FROM V$LOG WHERE STATUS='CURRENT';

On the standby database:

SELECT THREAD#, SEQUENCE#
  FROM DBA_LOGSTDBY_LOG LOG, DBA_LOGSTDBY_PROGRESS PROG
 WHERE PROG.APPLIED_SCN BETWEEN LOG.FIRST_CHANGE# AND LOG.NEXT_CHANGE#;
Switchover Preparation Step 4: Stop Active Sessions on Remaining Active Production Instance
To identify active sessions, execute the following query:
SELECT SID, PROCESS, PROGRAM FROM V$SESSION WHERE TYPE = 'USER' AND SID <> (SELECT DISTINCT SID FROM V$MYSTAT);

Common processes that prevent switchover are:

CJQ0, the Job Queue Scheduler Process. Stop CJQ0 with ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
QMN0, the Advanced Queue Time Manager. Stop QMN0 with ALTER SYSTEM SET AQ_TM_PROCESSES=0;
DBSNMP, the Oracle Enterprise Manager Intelligent Agent. At the OS level, issue the AGENTCTL STOP command.
Switchover Step 1: Ensure that Switchover Preparation Steps Have Been Executed
Switchover Step 2: Switch Over the Current Production Database to the Standby Database
Switchover Step 3: Disable the Log Transport Service
Switchover Step 4: Check the Switchover Status and Finish Recovery if Necessary
Switchover Step 5: Convert the Former Standby Database to a Production Database
Switchover Step 6: Start Logical Standby Apply
Switchover Step 1: Ensure that Switchover Preparation Steps Have Been Executed
See Preparing for Logical Standby Database switchover. Preparation is essential for a successful switchover.
Switchover Step 2: Switch Over the Current Production Database to the Standby Database
On the current production instance, execute the following statement:
ALTER DATABASE COMMIT TO SWITCHOVER TO LOGICAL STANDBY;

This statement accomplishes the following:

Terminates any active sessions
Archives any unarchived redo logs and applies them to the standby database
Adds an end-of-redo marker to the header of the last log file being archived
Switchover Step 4: Check the Switchover Status and Finish Recovery if Necessary
On the current standby instance, execute the following statement:
SELECT FOUND FROM DBA_LOGSTDBY_EVENTS
 WHERE EVENT_TIME = (SELECT MAX(EVENT_TIME) FROM DBA_LOGSTDBY_EVENTS)
   AND STATUS_CODE = 16128;

If the query returns the message FOUND, then continue with Switchover Step 5.
If the query returns no output, then the Apply Engine is still applying the last of the redo. If a delay is configured for the Apply Engine, it should be removed:
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
EXECUTE DBMS_LOGSTDBY.APPLY_UNSET('APPLY_DELAY');
ALTER DATABASE START LOGICAL STANDBY APPLY;

Alert log sample:

LOGSTDBY event: ORA-16128: User initiated shut down successfully completed
This statement accomplishes the following:

Enables log transport services to the new standby
Makes sure the last archived redo log file has been received and applied through the switchover (end-of-redo) marker. If not, Data Guard returns an error.
If the statement completes without error, then continue with Switchover Step 6. Successful completion is indicated by the following message in the alert log:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
LSP1 started with pid=38
Completed: alter database commit to switchover to primary
Post-Switchover Step 1: Check Local and Remote Archive Destinations on the Production Database
Post-Switchover Step 2: Ensure that the Delay is Set Up Correctly
Post-Switchover Step 3: Ensure that Recovery is Applying New Archived Redo Logs
Post-Switchover Step 1: Check Local and Remote Archive Destinations on the Production Database
On the production database, enter the following statement:
SELECT NAME_SPACE, STATUS, TARGET, LOG_SEQUENCE, TYPE, PROCESS, REGISTER, ERROR FROM V$ARCHIVE_DEST WHERE STATUS != 'INACTIVE';

Local and remote archive destinations should be returned. If all expected destinations are not returned, then investigate the alert log and the V$ARCHIVE_DEST and V$ARCHIVE_DEST_STATUS views for errors:

SELECT * FROM V$ARCHIVE_DEST_STATUS WHERE STATUS != 'INACTIVE';
Post-Switchover Step 3: Ensure that Recovery is Applying New Archived Redo Logs
Execute the following query on the production database:

SELECT THREAD#, MAX(SEQUENCE#) FROM V$ARCHIVED_LOG GROUP BY THREAD#;

Execute the following query on the standby database:

SELECT THREAD#, SEQUENCE# SEQ#
  FROM DBA_LOGSTDBY_LOG LOG, DBA_LOGSTDBY_PROGRESS PROG
 WHERE PROG.APPLIED_SCN BETWEEN LOG.FIRST_CHANGE# AND LOG.NEXT_CHANGE#
 ORDER BY NEXT_CHANGE#;
On the production database, this query shows the redo logs archived and sent, provided Post-Switchover Step 1 showed no errors. The output from the standby and production databases should match. See also Oracle9i Data Guard Concepts and Administration, section 6.4, "Monitoring Log Apply Services."
In all MAA cases, the remaining standby databases must be reinstantiated. Use either Physical Standby Database Failover or Logical Standby Database Failover.
If the original production database is still accessible, you should always consider a Data Guard switchover first. A failover requires that the initial production database be reinstantiated as a new standby database, which can be a very expensive operation. In contrast, a switchover is a planned operation that accomplishes a role reversal between production and standby databases without any database reinstantiation.
recovery is intended. For failover with complete recovery scenarios, the production database is not accessible or cannot be restarted. For forced failover scenarios, the production database may be available, but point-in-time recovery on the standby database is intended, with the expectation of data loss. Data Guard failover should not be used where object recovery solutions provide a faster and more efficient alternative. With object recovery, no database reinstantiation is required, whereas any failover requires the previous production database to be reinstantiated as the new standby database.
Outage Types
- Site failure
- Complete cluster failure containing the production database
- Data failures (data or media corruption)
- User errors or malicious acts (drop table, duplicate batch runs)
Preparation Steps
For each outage type, assess whether the production database is inaccessible or cannot be restarted. If the production database is accessible and can be started, use Data Guard switchover instead. Then:

1. Cancel managed recovery on the standby database:
RECOVER MANAGED STANDBY DATABASE CANCEL;
This prevents the corruption from being applied.
2. Record a conservative time or SCN prior to the event or error.
3. Defer the log transport services on the current production database, if it is still open, to stop log transport to the standby database. Because the standby database needs to recover to a specific point in time prior to the user error, new production archived redo logs do not need to be sent:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;
4. Shut down the current production database or deny access to the user community to prevent further data loss.
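As a consolidated sketch of these preparation steps for a physical standby (the remote destination number 2 and the use of DBMS_FLASHBACK to capture a pre-error SCN are assumptions for illustration):

```sql
-- On the physical standby: stop managed recovery so the corruption
-- or user error is not applied
RECOVER MANAGED STANDBY DATABASE CANCEL;

-- On the production database, if it is still open: record a conservative
-- SCN from before the error (DBMS_FLASHBACK is available in Oracle9i)
SELECT DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER FROM DUAL;

-- Stop shipping new redo to the standby (destination 2 is an assumption)
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;

-- Shut down production, or deny access, to prevent further data loss
SHUTDOWN IMMEDIATE
```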
failover recovers all data or is a forced failover. This section provides detailed steps for implementing two failover scenarios and describes how to monitor the progress of each step. This section includes the following topics:
Physical Standby Database: Failover with Complete Recovery Steps (recover all possible data)
Physical Standby Database Forced Failover Steps
Step 1: Check for Archive Gaps
Step 2: Shut Down Other Standby Instances
Step 3: Ensure the Primary is Inaccessible and RFS Connection(s) are Terminated
Step 4: Finish Recovery
Step 5: Check Database State
Step 6: Commit to Switchover
Step 7: Restart Instance
2. Verify that all the archived redo logs within the gap are available to the standby database. On the standby, check the primary archive destination, LOG_ARCHIVE_DEST_1 (for example, /arch1; the same setting as STANDBY_ARCHIVE_DEST), and the alternate destination directories, such as LOG_ARCHIVE_DEST_3 (for example, /arch2).
3. If the production host is still available, you can run a more thorough archive gap query. If a gap is detected, you can manually copy production archived redo logs and register them with the standby database. Archive gap query:
Recovering from Outages - Database Server Tier Data Guard Failover

SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM
(SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=1) LOCAL
WHERE LOCAL.SEQUENCE# NOT IN
(SELECT SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=2 AND THREAD# = LOCAL.THREAD#);

After copying the archived redo logs to the standby, register them with the following statement:

ALTER DATABASE REGISTER LOGFILE '/arch1/SALES/archname';
2. Kill the identified processes. For non-UNIX systems, see your system documentation and MetaLink. For example, on UNIX:
kill -9 14914 14928
Monitor the standby alert log for errors and messages. The following message implies that full recovery has been completed:
Incomplete recovery applied all redo ever generated. Recovery completed through change 56985. Media Recovery Completed
The following message implies that some data loss probably occurred.
Incomplete recovery applied all redo ever generated. Recovery completed through change 77637. Terminal recovery recovered to last consistent point in redo stream But it did not apply all redo; some data maybe lost
If you receive the following two messages, then the finish recovery command completed successfully.
Terminal Recovery: successful completion Completed: ALTER DATABASE RECOVER managed standby database finish
If you receive an error during this command, or you see errors in the standby alert log indicating that this command has failed, you can attempt to apply the archived redo logs manually with a RECOVER STANDBY DATABASE command. You may have to abort the instance and retry manually if the FINISH command hangs. A common problem is an unresolvable archive gap, which requires a forced failover; in that case, issue the following command:
ALTER DATABASE ACTIVATE STANDBY DATABASE SKIP STANDBY LOGFILE;
If SWITCHOVER_STATUS= 'TO PRIMARY', then continue with Step 5. If SWITCHOVER_STATUS != 'TO PRIMARY', then there was an error on the previous command. Either attempt step 2 again or issue the following command to skip applying the standby redo logs:
RECOVER MANAGED STANDBY DATABASE FINISH SKIP
Re-query V$DATABASE, if SWITCHOVER_STATUS still does not equal 'TO PRIMARY' after the RECOVER MANAGED STANDBY DATABASE FINISH SKIP command, then a forced failover must be done by running the following command:
ALTER DATABASE ACTIVATE STANDBY DATABASE SKIP STANDBY LOGFILE;
Step 1: Check for Archive Redo Logs
Step 2: Point-in-Time Recovery
Step 3: Mount Standby Instance in Exclusive Mode
Step 4: Activate Standby Database
Step 5: Restart New Production Database
Step 6: Record the Resetlogs SCN
If all required archived redo logs are not available, you can query the same V$ARCHIVED_LOG view on the production database and copy the corresponding production archived redo logs to the secondary host STANDBY_ARCHIVE_DEST directory.
Log applied.
Media recovery complete

Check the standby alert log for errors throughout the whole process.
Ensure that local archiving is functional and that log transport is temporarily in a DEFER state. Query the V$ARCHIVE_DEST and V$ARCHIVE_DEST_STATUS views to verify that local archiving is functional. Back up all production archived redo logs from the time of the failover to the current archived redo logs. During a forced failover, the archive sequence numbers are reset. You should also record the resetlogs SCN after activating the standby database.
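A minimal verification sketch of the checks above (view and column names per Oracle9i; the exact columns to inspect are a judgment call):

```sql
-- Confirm local archiving is valid and the remote destination is deferred
SELECT DEST_ID, STATUS FROM V$ARCHIVE_DEST_STATUS WHERE STATUS != 'INACTIVE';

-- Record the resetlogs SCN of the newly activated database
SELECT RESETLOGS_CHANGE# FROM V$DATABASE;
```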
When the previous production host, now the standby host, is restored, back up and move all previous archived redo logs. It is essential that archived redo logs with the same name but different content are not confused when recreating the new standby database. Notify the user community and application community if data loss occurred. If a forced failover occurred, you need to repopulate the lost data. If you attempted to recover all available redo, check the standby alert log for the following message:
Terminal recovery recovered to last consistent point in redo stream But it did not apply all redo; some data maybe lost.
Outage Types
- Site failure
- Complete cluster failure containing the production database
- Data failures (data or media corruption)
- User errors or malicious acts (drop table, duplicate batch runs)
Preparation Steps
For each outage type, assess whether the production database is inaccessible or cannot be restarted. If the production database is accessible and can be started, use Data Guard switchover instead. Then:

1. Stop SQL apply on the standby database:
ALTER DATABASE STOP LOGICAL STANDBY APPLY;
This prevents the corruption from being applied.
2. Record a conservative time or SCN prior to the event or error.
3. Defer the log transport services on the current production database, if it is still open, to stop log transport to the standby database. Because the standby database needs to recover to a specific point in time prior to the user error, new production archived redo logs do not need to be sent:
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;
4. Shut down the current production database or deny access to the user community to prevent further data loss.
2. The query will report the last logical log file registered for each thread of redo, as well as any log files that are missing within a particular thread. The results above indicate that thread# 1, sequence# 357 and thread# 2, sequence# 126 are the last two logical log files that have been registered, and that there is a gap commencing at log file thread# 1, sequence# 350. This means that, at a minimum, thread# 1, sequence# 351 is missing, but additional sequence#s may also be missing.
If the production host is still available, you can run a more thorough archive gap query. If a gap is detected, you can manually copy the production archived redo logs and register them with the standby database.

Archive gap query, for a logical standby only environment, on the production host:
SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM
(SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=1) LOCAL
WHERE LOCAL.SEQUENCE# NOT IN
(SELECT SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=4 AND THREAD# = LOCAL.THREAD#);

For an environment running both a logical and a physical standby, on the standby host:

SELECT LOCAL.THREAD#, LOCAL.SEQUENCE# FROM
(SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=1) LOCAL
WHERE LOCAL.SEQUENCE# NOT IN
(SELECT SEQUENCE# FROM V$ARCHIVED_LOG WHERE DEST_ID=4 AND THREAD# = LOCAL.THREAD#);

After copying the archived redo log files to the standby, register them with the following statement:

ALTER DATABASE REGISTER LOGICAL LOGFILE '/arch1/SALES/archname';
Ensure that local archiving is functional and that log transport is temporarily in a DEFER state. Query the V$ARCHIVE_DEST and V$ARCHIVE_DEST_STATUS views to verify that local archiving is functional. Back up all production archived redo logs from the time of the failover to the current archived redo logs. When the previous production host, now the standby host, is restored, back up and move all previous archived redo logs. It is essential that archived redo logs with the same name but different content are not confused when recreating the new standby database. Notify the user community and application community of any data loss that occurred.
Standby recovery can continue even when the primary standby host is undergoing maintenance or incurs a failure. By not interrupting standby recovery and maintaining the appropriate recovery lag, you can ensure that a Data Guard failover or switchover can be completed within the tolerated MTTR. The production database is not interrupted, or production downtime is minimized, because the subsequent network connection switches to the secondary standby instance.
The standby instance failover steps differ depending on whether there was a scheduled or unscheduled outage on the primary standby instance.
Considerations for Multiple Standby Database Environments
Unscheduled Standby Instance Failover
Scheduled Standby Instance Failover
If there is an outage that affects the physical standby database, then two options are available:
- The logical standby database can effectively be stopped, pending the restoration of the physical standby database. In this case, no action is required.
- The logical standby database can continue by enabling the primary database's fourth log transport service (LOG_ARCHIVE_DEST_4) so that the logs are shipped directly from the primary database to the logical standby database.
Follow these steps to reconfigure the primary database to transfer the redo logs to the logical standby database:
Step 1: Reconfigure the physical standby log transport services on the primary database
Step 2: Reconfigure the logical standby log transport service on the primary database
Step 1: Reconfigure the physical standby log transport services on the primary database
From a production instance, change the status of the remote archive destination (LOG_ARCHIVE_DEST_STATE_2) to disable the transfer of the redo logs to the physical standby database. ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;
Step 2: Reconfigure the logical standby log transport service on the primary database
From a production instance, change the status of the remote archive destination (LOG_ARCHIVE_DEST_STATE_4) to enable the transfer of the redo logs to the logical standby database. ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_4=ENABLE;
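Both steps can be issued together, with a verification query appended; the DEST_ID values follow this document's convention of destination 2 for the physical standby and destination 4 for the logical standby:

```sql
-- Stop shipping redo to the failed physical standby
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;

-- Ship redo directly to the logical standby instead
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_4=ENABLE;

-- Verify that both destination states reflect the change
SELECT DEST_ID, STATUS FROM V$ARCHIVE_DEST WHERE DEST_ID IN (2, 4);
```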
Step 1: Ensure that the Secondary Standby Instance is mounted
Step 2: Verify Oracle Net Connection to the Secondary Standby Host
Step 3: Start Production Database if Necessary
Step 4: Start Recovery on the Secondary Standby Instance
Step 5: Copy Archived Redo Logs to the Secondary Standby Host
Step 6: Verify Remote Archiving to the Secondary Standby Host
Step 7: Verify Managed Recovery on the Secondary Standby Host
Step 8: Restart Primary Standby Instance and Listener
Recovering from Outages - Database Server Tier Standby Instance Failover

% tnsping SALES_ALT
(Figure: redo transport before and after a standby instance failover — the production instances redirect redo shipment from the primary standby instance to the secondary standby instance through the SALES_ALT service.)
The prerequisite is that you have at least one standby instance available to fail over to after the primary standby instance in your cluster becomes unavailable. Complete the following steps for standby instance failover:
Step 1: Ensure that the Secondary Standby Instance is available
Step 2: Verify Oracle Net Connection to the Secondary Standby Host
Step 3: Alter Remote Archive Destination to the Secondary Standby Host
Step 4: Start Recovery on the Secondary Standby Instance
Step 5: Copy Archived Redo Logs to the Secondary Standby Host
Step 6: Verify Remote Archiving to the Secondary Standby Host
Step 7: Verify Managed Recovery on the Secondary Standby Host
Step 8: Shut Down the Primary Standby Instance and Listener
Step 9: Perform Maintenance on the Standby Host
If it is not mounted, either pick a different target standby instance or manually mount this standby instance:

STARTUP NOMOUNT
ALTER DATABASE MOUNT STANDBY DATABASE;

Required output for a Logical Standby: READ WRITE LOGICAL STANDBY

If it is not opened read write, either pick a different target standby instance or manually open this standby instance:

STARTUP
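The mount-state check referred to above can be sketched as follows (OPEN_MODE and DATABASE_ROLE are V$DATABASE columns in Oracle9i):

```sql
-- Determine whether the target standby instance is usable
SELECT OPEN_MODE, DATABASE_ROLE FROM V$DATABASE;
-- Physical standby: expect MOUNTED and PHYSICAL STANDBY
-- Logical standby: expect READ WRITE and LOGICAL STANDBY
```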
Whenever a component within MAA fails, the full protection, or fault tolerance, of MAA is compromised, and possible single points of failure exist until the component is repaired. Restoring MAA to full fault tolerance requires repairing the failed component. While full fault tolerance may be sacrificed during a scheduled outage, the method of repair is well understood because it is planned, the risk is controlled, and it ideally occurs at times best suited for continued application availability. However, for unscheduled outages, the risk of exposure to a single point of failure must be clearly understood. This section focuses on describing the steps in restoring database fault tolerance. The database tier fault tolerance restoration processes are detailed in the following sections:
Restoring Failed Nodes or Instances within Real Application Clusters
Restoring the Standby Database after a Failover
Restoring Fault Tolerance after Secondary Site or Cluster-Wide Scheduled Outage
Restoring Fault Tolerance after Standby Database Data Failure
Instantiation of the Standby Database
Restoring Fault Tolerance after Dual Failures
- When to perform these tasks in order to incur minimal or no impact on the current running environment
- Resetting network components (load balancer or Oracle Net) that were modified for failover and now need to be reset
- Failing back or rebalancing existing connections
How an application runs within a RAC environment (similar to initial failover) also dictates how to restore the node or instance, as well as whether to perform other processes or steps. Once the problem that caused the initial node or instance failure has been corrected, a node or instance can be restarted and added back into the RAC environment at any time. When restarting a RAC instance, no changes or modifications should be needed, just an Oracle instance STARTUP command. However, there may be some performance impact on the current workload when rejoining the node or instance.
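Restarting and verifying a rejoined RAC instance can be sketched as follows (a sketch; GV$ views span all open instances in the cluster):

```sql
-- From a SQL*Plus session on the restored node
STARTUP

-- Confirm the instance has rejoined the cluster
SELECT INST_ID, INSTANCE_NAME, STATUS FROM GV$INSTANCE;
```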
Restoring Database Fault Tolerance - Restoring Failed Nodes or Instances within Real Applications Cluster

Table 10-1: Performance Impact of Restarting or Rejoining a Node or Instance

Action:
- Restarting a node or rejoining a node into a cluster
- Restarting or rejoining a RAC instance
Therefore, it is important to consider the following when restoring a node or RAC instance:
Test and evaluate the performance impact under a stress workload when rejoining a node into the cluster or restarting an Oracle RAC instance. If service levels are acceptable and two or more RAC instances are still available in the cluster, consider rejoining the failed instance during non-peak work periods.
For detailed steps on how to start and join a node back into a cluster, refer to your vendor specific cluster management documentation. For further information on restarting a RAC instance, refer to Oracle9i Real Application Clusters Administration and Deployment.
- Partitioned, with services running on a subset of RAC instances
- Nonpartitioned, where all services run equally across all nodes
- A combination, with some services evenly load-balanced across all instances and some services running on a specific subset of instances
This is very valuable for modularization of application and database form/function while still maintaining a consolidated data set. For the cases where an application is partitioned or has a combination of partitioning and non-partitioning, the response time and availability aspects for each service will need to be considered. If redistribution or failback of connections for a particular service is required, those connections must be appropriately identified (distinguished from connections for other services), and then moved.
For load-balancing application services across multiple RAC instances, we recommend using the Oracle Net connect-time failover and connection load balancing (described in previous sections). When utilizing this feature, no changes or modifications are required for failover or restoration. It is also possible to use hardware-based load balancers. However, there may be limitations in distinguishing separate application services (which is understood by Oracle Net Services) and restoring an instance or a node. For example, when a node or instance is restored and available to start receiving new connections, a manual step may be required to include the restored node or instance in the hardware-based load balancer logic. The following table provides an outline of the considerations for new and existing connections after an instance has been restored. The considerations differ depending on whether the application services are partitioned, non-partitioned, or have a combination of each type. As stated above, the actual redistribution of existing connections may or may not be required depending on the resource utilization and response times.
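As a sketch of the Oracle Net approach, a tnsnames.ora entry with connect-time failover and client load balancing might look like this (host names, port, and the SALES service name are placeholders):

```
SALES =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = ON)
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = SALES))
  )
```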
Table 10-2: Restoration and Connection Failback

Partitioned application services: Failback batch jobs: you need to identify existing connections that should fail back after the instance is restored. Otherwise, new users will go back to the original instance, and existing users will remain on the failed-over instance. Failback typically means that a user is disconnected and reconnected to the original instance.

Nonpartitioned application services: Nonpartitioned OLTP should not be an issue, unless the load needs to be rebalanced, since restoring the instance means that the load there is low. If the load needs to be rebalanced, then the same problems are encountered as if application services were partitioned.
Figure 10-1 shows a 2-node RAC database. In this case, the application is partitioned. Therefore, each instance services a different portion of the application (HR and Sales in this example). Client processes connect to the appropriate instance based on the service they require.
(Figure 10-1: a two-node RAC database with partitioned services — the HR service runs on Instance 1 on Node 1, the Sales service runs on Instance 2 on Node 2, and app/web servers connect to each service.)
Figure 10-2 shows what happens when one RAC instance fails. If one RAC instance fails, the service and existing client connections can be automatically failed over to another RAC instance. In this example, the HR and Sales services are both supported by the remaining RAC instance. In addition, new client connections for the HR service can be routed to the instance now supporting this service.
(Figure 10-2: after Instance 1 fails, both the HR and Sales services are supported by the surviving Instance 2 on Node 2, and client connections are directed there.)
After the failed instance has been repaired and restored as in Figure 10-1, failed-over clients and any new clients that had connected to the HR service on the failed-over instance may need to be identified and failed back. (New client connections, which are started after the instance has been restored, should automatically connect back to the original instance. Therefore, over time, as older connections disconnect, and new sessions connect to the HR service, the client load will migrate back to
the restored instance. Rebalancing the load immediately after restoration depends on the resource utilization and application response times.)

Figure 10-3 shows a nonpartitioned application: services are evenly distributed across all active instances. Each instance has a mix of client connections for both HR and Sales.
(Figure 10-3: a nonpartitioned two-node RAC database — both the HR and Sales services run on Instance 1 and Instance 2, with client connections spread across both.)
Figure 10-4 shows what happens when one RAC instance fails. If one RAC instance fails, existing client connections can be automatically failed over to another RAC instance. In addition, new client connections are routed only to the available RAC instance.
(Figure 10-4: after Instance 1 fails, existing HR and Sales client connections fail over to the surviving Instance 2, and new connections are routed only to the available instance.)
After the failed instance has been repaired and restored as in Figure 10-3, some clients may need to be moved back to the restored instance. For non-partitioned applications, identifying appropriate services is not required for rebalancing the client load among all available instances. In addition, this is necessary only if a single instance is not able to adequately service the requests. (Note: New client connections, which are started after the instance has been restored, should automatically connect back to the restored instance since it will be the least loaded. Therefore, over time, as older connections disconnect and new sessions connect to the restored instance, the client load will again evenly balance across all available RAC instances. Rebalancing the load immediately after restoration depends on the resource utilization and application response times.)
10.1.2 Consideration for Logical Standby Client Connections after Restoring a RAC Instance
The discussion regarding client connections after restoring a RAC instance is equally applicable to the client connections of a logical standby database. Clients that were connected to the failed RAC instance can be automatically failed over to another RAC instance. After the failed instance has been repaired and restored, some clients may need to be moved back to the restored instance. It may also be necessary to switch back the logical standby process if it was moved because of a Database Server Tier Standby Instance Failover.
Restoring Database Fault Tolerance - Restoring the Standby Database after a Failover
10.2.1
If the primary database is a Real Application Clusters (RAC) database and the Data Guard protection mode used is Maximum Performance or Maximum Availability, then following a Data Guard failover:
- Backups taken at the old primary database or the old standby database prior to the failover cannot be used to recreate the old primary as a new (physical or logical) standby.
- Such backups also cannot be used to do complete recovery past the failover SCN.
- Bystander standby databases will need to be recreated.
Therefore, to restore full fault tolerance for the above cases, a new standby database must be instantiated. Refer to Instantiation of the Standby Database for additional details. However, if a forced failover occurred where the standby database was activated, or the Data Guard protection mode was Maximum Protection prior to a Data Guard failover, you can still use a backup from before the failover SCN to re-create your new standby database. The following steps describe how to restore full fault tolerance after a failover for the forced failover and Maximum Protection cases:
Step 1: Retrieve The Failover SCN
Step 2: Restore A Backup To The Site Hosting New Standby Database
Step 3: Restore Archives After Failover SCN
Step 4: Create New Standby Control File
Step 5: Startup And Mount New Standby Database
Step 6: Defer Log Transport Services On Standby
Step 7: Start Managed Recovery
Step 8: Verify Log Transport Services On Production
Step 9: Verify Managed Recovery Is Progressing On Standby
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
NORESETLOGS after complete recovery through change 56985
Switchover: Complete - Database shutdown required
Completed: alter database commit to switchover to primary
Either of the following backups can be used:
- A backup taken of the new production database after the failover.
- Any backup taken of the data files at either site before the failover, including a backup of the data files from the old production database before the failover, or a backup of the data files from the old standby database before it became the new production database.

When using a backup of the new production database, restore only the data files. When using a backup from before the failover, restore the control files and data files. Data files from before the forced failover need to be recovered up to the failover SCN (obtained in Step 1) before continuing.

Step 2.1: Start up and mount the standby database with the backup control file
STARTUP NOMOUNT;
ALTER DATABASE MOUNT;

Step 2.2: Defer log transport services

ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER SCOPE=MEMORY;

Step 2.3: Recover to the forced failover SCN

RECOVER DATABASE UNTIL CHANGE <forced failover SCN> USING BACKUP CONTROLFILE;
From the new production database, issue the following query to identify the archives that must be copied over to the standby host:
SELECT THREAD#, SEQUENCE# FROM V$ARCHIVED_LOG
WHERE NEXT_CHANGE# >= (SCN from alert.log);

Archived redo logs from before Time B will be identical between the databases.
10.2.2
Following the failover to a Logical Standby Database, it is not possible to restore the original primary database, as it is when a failover to a Physical Standby database occurs. Therefore, to restore full fault tolerance after a failover to a Logical Standby database, a new standby database must be instantiated. Refer to Instantiation of the Standby Database for additional details.
10.3 Restoring Fault Tolerance after Secondary Site or Cluster-Wide Scheduled Outage
The following steps are required to restore full fault tolerance after a scheduled secondary site or cluster-wide outage:
Step 1: Startup And Mount Standby Database
Step 2: Defer Log Transport Services On Standby
Step 3: Start Managed Recovery
Step 4: Verify Log Transport Services On Production
Step 5: Verify Managed Recovery Is Progressing On Standby
Step 6: Restore Production Database Protection Mode
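The sequence can be sketched end to end; the destination number 2 and the SCOPE=MEMORY clause mirror usage elsewhere in this document:

```sql
-- On the standby site: mount and resume managed recovery
STARTUP NOMOUNT
ALTER DATABASE MOUNT STANDBY DATABASE;
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER SCOPE=MEMORY;
RECOVER MANAGED STANDBY DATABASE DISCONNECT;

-- On the production site: re-enable transport and check for errors
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;
SELECT STATUS, ERROR FROM V$ARCHIVE_DEST WHERE DEST_ID=2;
```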
Restoring Database Fault Tolerance - Restoring Fault Tolerance after Standby Database Data Failure
- If the archives required for recovery are available on the standby system in a configured log archive destination, MRP will automatically locate and apply them as needed. No restore is necessary.
- If the required archives have been deleted from the standby system but are still available on the production system, FAL will be invoked automatically to transfer the needed archives to the standby system. No restore is necessary.
- If the archives required to recover the restored data files up to the configured lag have been deleted from both the production and standby systems, those archives must be restored onto the standby system.
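To decide which of the three cases applies, a query such as the following shows which archives the standby has registered and applied (a sketch; the APPLIED column exists in V$ARCHIVED_LOG in Oracle9i):

```sql
-- On the standby: registered archives and whether recovery has applied them
SELECT THREAD#, SEQUENCE#, APPLIED
FROM V$ARCHIVED_LOG
ORDER BY THREAD#, SEQUENCE#;
```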
Required output for a Physical Standby: MOUNTED PHYSICAL STANDBY

If it is not mounted, either pick a different target standby instance or manually mount this standby instance:

STARTUP NOMOUNT
ALTER DATABASE MOUNT STANDBY DATABASE;

Required output for a Logical Standby: READ WRITE LOGICAL STANDBY

If it is not opened read write, either pick a different target standby instance or manually open this standby instance:

STARTUP
protection mode back to maximum protection. Follow the steps in the Oracle Data Guard section titled "Setting the database protection mode."
10.5.1
The following steps are required to create the initial physical standby database. Refer to Oracle9i Data Guard Concepts and Administration for additional details.
Step 1: Restore Backup of the Production Database to Standby Cluster
Step 2: Set up Standby Initialization Parameter File (SPFILE) and Oracle Net Files
Step 3: Ensure Standby Redo Logs have been Created on Production
Step 4: Create Standby Control File
Step 5: Copy or Restore Archives to Standby
Step 6: Startup and Mount Standby Database
Step 7: Defer Standby Database Log Transport Services
Step 8: Start Managed Recovery
Step 9: Verify Log Transport Services On Production
Step 10: Verify Managed Recovery Is Progressing On Standby
Step 1: Restore Backup of the Production Database to Standby Cluster
Step 2: Set up Standby Initialization Parameter File (SPFILE) and Oracle Net Files
Refer to the Oracle Data Guard configuration section and Appendix C for details.
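Steps 3 and 4 can be sketched as follows, issued on the production database; the log file path, size, and control file location are illustrative assumptions:

```sql
-- Step 3: ensure standby redo logs exist on production
ALTER DATABASE ADD STANDBY LOGFILE ('/u01/oradata/prod/srl01.log') SIZE 100M;

-- Step 4: create the standby control file to be copied to the standby site
ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/tmp/standby.ctl';
```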
Restoring Database Fault Tolerance - Restoring Fault Tolerance after Dual Failures
10.5.2
Following a failover to a logical standby database, a new logical standby database must be instantiated from the new primary database to restore full fault tolerance. Refer to Oracle9i Data Guard Concepts and Administration, Chapter 4, "Creating a Logical Standby Database," for the steps required to create the logical standby database.
Available Backups
Local backup on production and standby databases
Available Backups
Restore the local standby backup to the standby database. Do incomplete recovery and activate it as the new production database.
If restoring a physical standby database, recreate the standby database from the new production database by following the steps described in Restoring the Standby Database after a Failover. If restoring a logical standby from a local backup, re-create the standby database from the new production database by following the steps described in Instantiation of the Standby Database.
Restore tape backups locally. Recover the production database and activate it as the new production database.
If restoring a physical standby database, recreate the standby database from the new production database by following the steps described in Restoring the Standby Database after a Failover. If restoring a logical standby from a local backup, re-create the standby database from the new production database by following the steps described in Instantiation of the Standby Database.
11 Conclusion
The Maximum Availability Architecture is Oracle's premier high availability solution, providing:
- Architectural components to protect your data and achieve high availability
- Key Oracle features such as Real Application Clusters and Data Guard
- Operational best practices
- Configuration best practices
- An outage decision matrix leading into high availability solutions
- Steps and best practices to implement high availability solutions
MAA embraces HA so that any failure is handled transparently or with a thorough, automated recovery procedure that can achieve a low MTTR. After setting up MAA with the operational and configuration best practices, we recommend using the outage decision matrix and the detailed solutions to build a complete high availability solution for all potential outages. The complete deployment should be rehearsed to ensure that the required MTTR is met. After automation and some testing, you should be able to meet high availability requirements by leveraging the MAA practices such as:
- Fast site failover using a wide area traffic manager to reroute clients to the secondary site
- Transparent application tier failover with hardware or software-based load balancers and mid-tier application server farms
- Host and instance failover with Transparent Application Failover and Real Application Clusters
- Database role reversal between primary and secondary sites for scheduled maintenance using Data Guard switchover
- Database failover to the standby database to protect from user errors, data errors, and site disasters using Data Guard failover
MAA will be continually enhanced with knowledge from customer implementation experiences and with the results of further internal testing. Future projects will include Oracle features such as logical standby databases, and features not yet released. MAA enhancements will be validated using different application infrastructures such as Oracle's Mail Server application, Oracle's Customer Relationship Management (CRM) application, and Oracle's Enterprise Resource Planning (ERP) application. The Server Technologies High Availability Systems Group charter is to build, test, validate and design high availability solutions. One of the goals of this group is to simplify customer HA deployments and ensure completeness. This requires integrated Oracle HA solutions that are easy to deploy, easy to manage, and enable customers to meet their service levels. Many of the features and manual steps described in this paper will ultimately be automated and incorporated into future Oracle releases. Enhancements will continue to make Oracle HA solutions more transparent and manageable. To keep abreast of future updates, check http://otn.oracle.com/deploy/availability/htdocs/maa.htm
Part V: Appendixes
Appendix A, Risk Assessment of Major Architectural Changes
Appendix B, Operational Best Practices
Appendix C, Database SPFILE and Oracle Net Configuration File Samples
Appendix D, Test Configurations
Appendix E, Object Reorganization and Recovery Examples
Appendix F, Server Parameter File (SPFILE)
Maximum Availability Architecture provides the best Oracle HA architecture. However, when designing your system architecture, you may need to balance your availability needs with your application performance requirements and your IT spending limits. This section focuses on alternative architectures by providing a high-level overview of different architectural solutions and their disadvantages and associated risks. The alternative architectures focus on removing main architectural components of MAA, which include RAC, Data Guard, and the secondary site. The tradeoff is that removing a component also removes the benefits it provides. For example, removing RAC from the environment can be done in several ways, but in each case key capabilities are lost, such as fast failover and the ability to do scheduled maintenance without incurring downtime or subjecting your business to the increased risk of reduced resiliency. The following table compares the differences between MAA and the alternative architectures, and identifies the areas of risk that each alternative introduces. A more detailed explanation and diagram of each alternative are presented below.
Table 12-1: MAA vs. Alternative Architectures
Alternative Architecture
Primary site: RAC production; Secondary site: Data Guard single instance standby
Primary site: single instance production; Secondary site: Data Guard single instance standby
Primary site: RAC production and Data Guard with RAC standby; Secondary site: none
Primary site: RAC production and Data Guard single instance standby; Secondary site: none
Replace RAC at primary and secondary sites with single instance at both sites.
Remove secondary site but add a physical standby database residing on a different RAC cluster on the same LAN.
Remove secondary site; add a single instance physical standby database residing on a different host on the same LAN.
Site disaster protection
Scheduled maintenance
Fast failover capability
Processing power
Operational complexity
Primary site: single instance production and Data Guard single instance standby Secondary site: none
Remove secondary site and RAC from the primary site; add a single instance physical standby database residing on a different host on the same LAN.
Site disaster protection
Scheduled maintenance
Fast failover capability
Scalability
Operational complexity
User error or corruption recovery
Site disaster protection
Scheduled maintenance
Additional reporting or opened database resource
Appendix A - Risk Assessment of Major Architectural Changes - Restoring Fault Tolerance after Dual Failures
Primary Site: RAC Production Secondary Site: Data Guard using Single Instance Standby
This alternate architecture is very similar to MAA except there is only a single instance (i.e. no RAC or cluster) at the secondary site to reduce costs. By removing RAC at the secondary site, you are exposed to increased risk because of the following disadvantages:
- Inability to do scheduled maintenance at the secondary site without removing resiliency. Production database protection mode must be downgraded if running in maximum protection mode.
- No protection from instance and node failures at the secondary site.
- Potential performance degradation when switching over or failing over to the secondary site since there is only one node. It is likely that the number of users serviced by the secondary site is less than the primary site because there is less processing power, and the business will run at reduced capacity until the primary site is restored.
- Because the configuration is not symmetrical, the environment is more difficult to operate and manage since each site requires different processes and procedures.
This configuration, pictured below, still provides a similar or equal projected MTTR for the different outages that may occur, but only for those occurring at the primary site. In addition, performance and a level of resiliency are sacrificed when moving the production role to the secondary site.
Primary Site: Single Instance Production Secondary Site: Data Guard using Single Instance Standby
This alternate architecture is also very similar to MAA except there is only a single instance (i.e. no RAC or cluster) at both the primary and secondary sites to reduce costs. The disadvantages are:
- Inability to do scheduled maintenance at either site without removing resiliency. Production database protection mode must be downgraded if running in maximum protection mode.
- No fast failover protection from instance and node failures.
- Ability to scale is limited to the capabilities of a single machine.
When RAC is removed, the key benefits it provides are lost. Without RAC and its fast-failover capability, any scheduled or unscheduled host or instance outage will result in a longer MTTR and will likely be handled by a Data Guard switchover or failover. However, because there is only a single node at the site where the outage occurred, resiliency will not exist until the outage is resolved. In addition, it is not possible to scale the environment beyond the capabilities of a single machine.
Figure 12-2: Single instance Production and Data Guard single instance
Primary Site: RAC Production and Data Guard using RAC Standby Secondary Site: none
In many cases, the secondary site or disaster site is the last to be implemented because the costs prohibit it or current SLAs do not justify it. You can reduce cost considerably by removing the secondary site and its operational infrastructure completely. The risk is that a disaster may require hours or likely days to recover. Instead of the secondary site, Data Guard at the primary site is utilized to provide protection and resolution from human errors and data failures. The physical standby database resides on a cluster different from the production database, but on the same LAN as the production hosts. The disadvantages are:
- No protection from a site disaster.
- Inability to do site-wide scheduled maintenance without downtime.
Figure 12-3: RAC Production and Data Guard using RAC at Primary Site
Primary Site: RAC Production and Data Guard using Single Instance Standby Secondary Site: none
This architecture is a combination of two alternatives previously presented. It removes the secondary site and removes RAC from the Data Guard physical standby. The single instance physical standby database resides on a separate host from the production database, but on the same LAN. The risks introduced are also a combination of those from the previous architectures:
- There is no protection from a primary site disaster. Recovering from a complete primary site outage may take hours, days, or even weeks, provided the proper plans are in place.
- Inability to do site-wide scheduled maintenance without downtime.
- Potential performance degradation when switching over or failing over to the standby since there is only one node. It is likely that the number of users serviced by the single instance database is less than the RAC production database because there is less processing power, and the business will run at reduced capacity until the RAC database is restored.
- Because the configuration is not symmetrical (RAC production versus single instance standby), the environment is more difficult to operate and manage since each database requires different processes and procedures.
Figure 12-4: RAC Production and Data Guard with Single Instance at Primary Site
Primary Site: Single Instance Production and Data Guard using Single Instance Standby Secondary Site: none
Another combination of two architectures previously presented, this alternative removes the secondary site and removes RAC from both the production and standby databases. The single instance physical standby database resides on a separate host from the production database, but on the same LAN. The risks introduced are also a combination of those from the previous architectures:
- There is no protection from a primary site disaster. Recovering from a complete primary site outage may take hours, days, or even weeks, provided the proper plans are in place.
- Inability to do site-wide scheduled maintenance without downtime.
- Inability to do scheduled maintenance on the production database or standby database without removing resiliency. Production database protection mode must be downgraded if running in maximum protection mode.
- No fast failover protection from instance and node failures.
- Ability to scale is limited to the capabilities of a single machine.
When RAC is removed, the key benefits it provides are lost. Without RAC, any scheduled or unscheduled host or instance outage will result in an extended MTTR while the outage is resolved, or a lack of resiliency while the standby host or instance outage is resolved. In addition, the environment cannot scale beyond the capabilities of a single machine.
- The only method of recovering from user error or corruption is a partial or full database restore and recovery, which may require an extended outage.
- There is no protection from a primary site disaster. Recovering from a complete primary site outage may take hours, days, or even weeks, provided the proper plans are in place.
- Inability to do site-wide or database-wide scheduled maintenance without downtime.
- Loss of the additional opened database resource provided by a logical standby.
Conclusion
The alternative architectures should only be considered if your business requirements do not require the best achievable MTTR service levels or if the actual costs are too high even with the suggested cost reductions mentioned previously. However, MAA provides availability and performance benefits that may be difficult to sacrifice.
The operational best practices, in partnership with service management, are fundamental to avoiding, minimizing, and reacting to operational outages, as well as reducing the time to recover from an outage, i.e., the MTTR. The goal of this section is to clearly describe these practices. A checklist with recommendations is included under each practice to assist with assessing your infrastructure's adherence to these practices. The best practices are categorized into logistical and technical sections. The logistical best practices are those that form the foundation of managing the IT infrastructure; they are geared towards process, policy, and management. The technical best practices are aimed at the specific technical details that are required for a high availability environment. Together the logistical and technical best practices form the basis for building a solid highly available infrastructure.
- Service level management
- Change management
- Backup, restore, and recovery plans
- Disaster recovery planning
- Scheduled outages planning
- Staff training
- Documentation
- Security policies and procedures
The technical operational best practices are found in the Technical Best Practices section.
13.1.1
With the ever-shifting way corporations are doing business, more pressure is being put on corporate Information Technology (IT) departments to deliver more direct business value. IT departments are under constant scrutiny to deliver higher levels of service and availability while reducing and/or avoiding costs. A proven and accepted method to ensure that IT services are meeting the business requirements is Service Level Management. Service Level Management requires a dialogue between IT managers and the company's lines of business. Service Management starts with mapping business requirements to IT investments. Service Level Management encompasses complete end-to-end management of the service infrastructure. The foundation of Service Level Management is the Service Level Agreement (SLA). The SLA is a critical device for building accountability into the provider-client relationship and for evaluating the provider's performance. SLAs are becoming more accepted and necessary as a monitoring and control instrument for the relationship between a customer and the IT supplier (external or internal). Foremost, SLAs cover mission critical business processes and application systems, such as order processing. SLAs should be developed with the same business individuals who specify the functionality of those systems and represent
a detailed, complete description of the services that the supplier is obligated to deliver, and the responsibilities of the users of that service. Developing an SLA challenges the business managers (the client) to rank their requirements and focus resources towards the priorities. Additionally, an SLA should be a living document that evolves with the business requirements. There is not a standardized SLA that will meet the needs of all companies, but a typical SLA should contain the following:
- Definition of the basics - Definition of the service provided, the parties involved and the effective dates of the agreement.
- Availability specification - Specification of the hours and days during which the service or application will be available, excluding time for scheduled testing, maintenance or upgrades.
- Specification of the service scope - Specifications of the numbers and locations of users and/or hardware for which the service or application will be offered.
- Problem reporting and escalation procedures - Explanation of how problems will be reported, including the conditions for escalating calls for help to higher levels of support. This should set an expected response time for problems.
- Change procedures - Explanation of procedures for requesting changes, possibly including expected times for completing routine requests.
- Specification of service metrics - Specification of quality targets and explanations of how these metrics are calculated and how frequently they are reported.
- Definition of service costs and charges - Specification of charges associated with the service; may be flat rate or may be tied to different levels of service quality.
- Specification of user responsibilities - Specification of user responsibilities under the SLA (user training, maintaining proper desktop configuration, not introducing extraneous software or circumventing change management procedures).
- Description of procedures for resolving service-related disagreements
Historically, SLAs are constructed per service, for example "WAN", "Desktop", and "Data Center". The compilation of an SLA and metrics requires commitment and hard work on the part of all parties involved. The metrics should be more than traditional technical measurements like response time; rather, they should be geared towards the business requirements, such as cost per order processed. Any shared services or components must perform at the level of the most stringent SLA. Furthermore, as part of overall service management, SLAs should be developed between interdependent IT groups and with external suppliers. In fact, many technologists advocate an integrated, comprehensive SLA rather than individual SLAs for infrastructure components. Key benefits of developing an SLA are:
- A professional relationship between the supplier and customer with documented accountability
- A mutual goal of understanding and meeting the business requirements
- A system of measurement for service delivery so that IT can quantify their capabilities and results in terms of the business, and continuously improve upon them
- The ability for IT to operate more proactively, so that unwanted events that negatively impact availability can be thwarted or responded to much more quickly
- A documented set of communication and escalation procedures
13.1.2
Change Management
Change management is a set of procedures or rules that ensures that any changes to the hardware, software, application, and data on a system are authorized, scheduled, and tested. A stable system in which unexpected, untested and unauthorized changes are not permitted is one that will guarantee integrity to its users. The end users can rely on the hardware, software and data to perform as anticipated. Knowing exactly when and what changes have been applied to the system is vital to debugging a problem. Each customer will handle change management of systems, databases, and application code differently, but there are general guidelines that can help prevent unnecessary system outages, thus protecting the system's integrity.[7] With proper change management, there is greater stability in application and hardware systems and new problems are easier to debug.
- Develop a change control process
- Form a change control group
- Evaluate proposed changes
- Gather statistics for base comparisons
- Track database changes
- Use a version control system for application code
- Develop quality assurance and testing procedures
- Perform internal audits
The following diagram, Figure 13-1, describes the typical change control flow. For emergency cases such as disasters, the change control process may need to be shortened.
[7] Information derived from the Expert Review Server Handbook and from customer operational reviews.
[Figure 13-1: Change control flow - evaluate the request with the requestor or requesting group, approve or disapprove the change, report the decision, close outstanding actions, and report findings to the change control board]
Enterprise Manager's Change Management Pack also provides Oracle database change control and some change implementation features.
13.1.3
Proper backup, restore and recovery plans are essential, and those plans must be constructed to meet your specific service levels. In MAA, both disk and tape database backups are utilized. However, in most circumstances the standby database at the secondary site is leveraged to recover from logical and physical failures. If a disk or tape backup were used instead, the time to restore and recover the database would likely exceed any desired MTTR, especially with large databases. Nevertheless, disk and tape backups are essential for disaster recovery scenarios and for those cases where you need to restore from a very old backup. The specific recovery strategy you employ depends upon many factors including database size, transaction rate, and network bandwidth and latency between the primary and secondary sites. Refer to the Backup and Recovery Configuration Best Practice section for details. Fundamental to a robust backup, restore, and recovery scheme is an understanding of how it is strengthened or compromised by the physical location of files, the order of events during the backup process, and the handling of errors. A robust scheme is one which is resilient in the face of media failures, programmatic failures, and, if possible, operator failures. Complete, successful, and tested processes are fundamental to the successful recovery of any failed environment.
- Create recovery plans
- Validate backups periodically
- Automate backup, restore, and recovery procedures
- Choose correct backup frequency
- Maintain offsite tape backups
- Stock replacement parts
- Maintain updated documentation for backup and recovery plans
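For the "validate backups periodically" item, Recovery Manager can read backups without actually restoring them; a minimal sketch of the idea (RMAN commands, run from the RMAN client rather than SQL*Plus):

```sql
CROSSCHECK BACKUP;          -- confirm that cataloged backup pieces still exist
RESTORE DATABASE VALIDATE;  -- read the backups to verify they are restorable
```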
13.1.4
Disaster recovery planning encompasses a great deal more than the usual concepts of backup and recovery. It is a process designed and developed specifically to deal with catastrophic, large-scale interruptions in service so as to allow timely resumption of operations. These interruptions can be caused by disasters such as fire, flood, earthquakes, or malicious attacks. The basic assumption is that the building where the data center and computers reside may not be accessible, and that operations will need to resume elsewhere. It assumes the worst and tries to deal with the worst. As an organization becomes increasingly reliant on its electronic systems and data, access to those systems and data becomes a fundamental component of success, which only underscores the importance of disaster recovery planning. Proper disaster planning reduces MTTR during a catastrophe and provides continual availability of critical applications, helping to preserve customers and revenue.
- Choose the right disaster recovery plans
- Determine what is covered under the disaster recovery plans
- Document disaster recovery plans (DRP), including diagrams of affected areas and systems
- Put all application mechanisms in place
- Look at the big picture and assess all the key components
- Assign a DR coordinator
- Test and validate DRP
Look at the big picture and assess all the key components
Consider all the components that allow your business to run. Ensure that your DRP includes all system, hardware, application and people resources. Verify network, telephone service and security measures are in place.
Assign a DR coordinator
A coordinator and a backup should be pre-assigned to ensure that all operations and communications are passed on.
13.1.5
Scheduled outages can affect the application server tier, the database tier, or the entire site. These outages may include one or more of the following: node hardware maintenance, node software maintenance, Oracle software maintenance,
redundant component maintenance, and entire site maintenance. Proper scheduled outage planning will reduce MTTR when making planned changes, and reduce risk when changes do not go as planned.
- Create a list of scheduled outages
- For each possible scheduled outage, document impact and assess risk
- Justify the outage
- Create and automate change, testing, and fallback procedures
For each possible scheduled outage, document impact and assess risk
Scheduled outages that do not require software or application changes can usually be accomplished with minimal downtime if another system can take over the new transactions. With Real Application Clusters and Data Guard switchover, you can upgrade hardware and do some standard system maintenance with minimum downtime to your business. For most software upgrades, such as Oracle upgrades, the actual downtime can be less than an hour if prepared correctly. For more complex application changes that require schema changes or database object reorganizations, customers must assess whether Oracle's online reorganization and rebuild features will suffice. Refer to the Outages and Solution Roadmap for more details.
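As an illustration of the online features mentioned above (the object names are hypothetical):

```sql
-- Rebuild an index without blocking concurrent DML
ALTER INDEX app.orders_ix REBUILD ONLINE;

-- Check whether a table qualifies for online redefinition
EXECUTE DBMS_REDEFINITION.CAN_REDEF_TABLE('APP', 'ORDERS');
```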
13.1.6
Staff Training
Highly trained people can make better and more informed decisions and are less likely to make mistakes. A comprehensive plan for the continued technical education of your systems administration, database administration, development, and user groups can help ensure higher availability of your systems and databases. Additionally, just as redundancy of system components eliminates a single point of failure, knowledge management and cross training should mitigate the operational impact of losing an employee.
- Cross-train for business critical positions
- Develop guidelines to ensure continued technical education
- Implement a knowledge management process
- Maintain training materials in parallel with application or system revisions
13.1.7
Documentation
Documentation best practices should be part of every operational best practice. Without documenting the steps for implementing or executing a process, you run the risk of losing that knowledge, increasing the risk for human error during the execution of that process and possibly omitting a step within a process. All of these can impact availability. Clearly defined operational procedures are key to shorter learning curves as new employees join your organization. Properly documented operational procedures can greatly reduce the number of questions for your support personnel, especially when the people who put the procedures in place are no longer with the group. Proper documentation can also eliminate operational confusion by clearly defining roles and responsibilities within the organization. Clearly documented applications are fundamental to shorter learning curves as new employees join your organization. As maintenance and enhancements are called for on homegrown applications, documentation helps developers refresh their knowledge of the internals of the programs. If the original developers are no longer with the group, this documentation becomes especially valuable to new developers who would otherwise have to struggle through reading the code itself. Readily available application documentation also can greatly reduce the number of questions for your support organization.
- Ensure documentation is kept up to date
- Approve documentation changes through the change management process
- Document lessons learned and problem resolutions
- Protect the documentation
13.1.8
Security, taken from a logistical perspective, covers the physical security and operations of the hardware and/or data center. By physical security, we mean protection from unauthorized access, as well as from physical damage such as fire, heat, electrical surge, or an unintended kick. Data security is discussed under the Technical Best Practices section. Physical security is the most fundamental security precaution and is a major determinant in the capability of the system to meet the customer's availability requirements. Physical security protects against external and internal security breaches. The CSI/FBI Computer Crime & Security Survey documents a trend towards increasing external intrusions[9] but maintains that internal security violations still pose a large threat. A detailed discovery process into the security of data center operations and organization is outside the scope of this blueprint. However, a properly secured infrastructure reduces the risk of damage, downtime and financial loss.
- Provide a suitable physical environment for computer equipment
- Restrict access to the operations area to authorized personnel
- Use internal security monitoring
13.2.1
Many design principles and operational practices reduce the exposure when an object availability problem does surface. MAA also advocates early and proactive detection of problems. Object level corruptions are best detected proactively rather than discovered at run time. Proactive detection of object level corruption is particularly important because it allows the DBA to fix a problem using the most optimal method while the system is up and running. The following utilities can be used to detect block corruption on a proactive basis. They should be run on a regular basis, and any error they report should be investigated. The problems can be fixed by media recovery or by recreating the affected objects at the appropriate level.
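One such proactive check (the object name is a hypothetical example) reads every block of a table and its indexes and reports any corruption:

```sql
ANALYZE TABLE app.orders VALIDATE STRUCTURE CASCADE;
```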
- Eliminate unused indexes
- Coalesce or rebuild indexes on a regular basis
- Maintain object definition details
- Design for smaller units of data
- Set non-SYS users' default tablespace
- Use a manageable tablespace practice with application autonomy
- Use partitioned tables and indexes
- Store partitions in separate tablespaces
A repository of all such relevant information, along with scripts to recreate each object, helps minimize recovery time. For dictionary objects such as packages, procedures, functions, views, and synonyms, quick access to the object definition scripts results in a quick repair upon detection of the missing object. A given installation has a wide choice of methods to make object definition scripts available; Oracle Change Management Pack, which is part of Enterprise Manager (EM), is one such tool.
Split an entity into tables grouped by columns that are used together or with similar frequency. Object recovery on one table will then not impact the availability of the others.
- Reduce the possibility of data corruption in multiple partitions
- Improve manageability, availability, and performance
13.2.2
A database object may be completely or partially unavailable. Complete unavailability in this context includes cases where the object has been dropped or truncated. Partial unavailability includes cases where part, but not the whole, of an object is still accessible; it is normally caused by corruption that directly impacts the object, or a dependent object, leaving the object unusable from the application standpoint. This section includes the following topics:
- Prevention, detection, and repair of completely unavailable objects
- Prevention, detection, and repair of partially unavailable or corrupted objects
- Oracle detection tools
Prevention
- ALTER TABLE <tablename> DISABLE TABLE LOCK; (preferred method)
- Set the parameter DML_LOCKS=0 for all instances.
- Control access through privileges. Access privileges granted should be the minimum required for the application.
Detection
Table: The application accessing the object will get ORA-00942 (table or view does not exist); if the application logs all its errors centrally, those logs can be monitored. Enabling AUDIT NOT EXISTS and/or AUDIT DROP TABLE produces an audit trail, which can be monitored (manually or automated).
Index: Slowdown in performance. Enabling AUDIT NOT EXISTS and/or AUDIT DROP INDEX produces an audit trail, which can be monitored.
Synonym: The application accessing the object will get ORA-00942 (table or view does not exist). Enabling AUDIT NOT EXISTS records this in the audit trail, which can be monitored.
View: The application accessing the object will get ORA-00942 (table or view does not exist).
Repair
See object recovery decision process.
Index
Synonym
Control access through privileges. To drop a PUBLIC synonym, you must have the DROP PUBLIC SYNONYM system privilege. Control access to the object and its dependencies through privileges.
View
10. If DML locking is disabled for an object or at the instance level, explicit lock statements such as LOCK TABLE ... IN EXCLUSIVE MODE, as well as any DDL on the object, are not allowed.
For all the dictionary objects above, using a change management tool to record object status and definitions will prove extremely beneficial in case of object loss; the object can then be recreated in a minimal amount of time.
Object Type
Prevention
Detection
Check for INVALID views in DBA_OBJECTS. Enabling AUDIT NOT EXISTS and/or AUDIT DROP INDEX will audit this to the audit trail, which can be monitored.
Repair
Invalid views are automatically recompiled by Oracle on next use. Resolve errors in dependent objects and manually recompile invalid views using the ALTER VIEW COMPILE statement. Recreate the constraint locally.
Constraint
ALTER TABLE <tablename> DISABLE TABLE LOCK; Set DML_LOCKS=0 Control access through privileges
Increases the possibility of data integrity loss. Audit ALTER TABLE statements and monitor audit trails. Application functionality is reduced and may be incorrect. Audit ALTER TRIGGER/DROP TRIGGER statements and monitor audit trails.
The application will receive an ORA error (usually ORA-06550). Check for INVALID procedures and packages in DBA_OBJECTS. Audit the create and drop of procedures, packages, and functions, and monitor audit trails.
Recreate the procedure, function, or package locally. Invalid procedure, function, and package objects are automatically recompiled by Oracle on next use. Resolve errors in dependent objects and manually recompile invalid procedures with the ALTER PROCEDURE COMPILE statement, and invalid packages with the ALTER PACKAGE COMPILE statement.
Sequence
The application accessing the object gets ORA-02289 (sequence does not exist) and fails. Audit DROP SEQUENCE and/or AUDIT NOT EXISTS, and monitor the audit trails.
Recreate the sequence locally. Maintain a map of each sequence and the table/column it is used in; this is useful in determining the next sequence number. Use SELECT MAX(column) FROM table; to determine the new starting value for the sequence.
The application may start behaving abnormally in terms of performance. An application accessing the link gets an ORA error (ORA-02019). Audit the drop of a database link or public database link and/or AUDIT NOT EXISTS, and monitor audit trails.
Proper object security and limited privileges will prevent most user errors or malicious acts that can damage an object; preventing DDL operations on objects is another strategy. DDL operations can be disabled for an individual table or index with the ALTER TABLE|INDEX ... DISABLE TABLE LOCK statement, or database-wide by setting the database parameter DML_LOCKS=0. However, disabling DDL operations database-wide should be done only if structural changes or additions to the database are very unlikely. Restricting DDL operations at the table level is preferred since it provides a finer level of control. To verify whether DML locking has been disabled at the table level, the following SQL statement can be used:
SQL> SELECT TABLE_LOCK FROM DBA_TABLES WHERE TABLE_NAME = '<table_name>';

TABLE_LOCK
----------
DISABLED
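Disabling and later re-enabling table locks can be sketched as follows; the schema and table names here are illustrative, not part of the original configuration:

```sql
-- Disable DDL against one table (preferred over the database-wide DML_LOCKS=0)
ALTER TABLE sales.dist DISABLE TABLE LOCK;

-- DDL such as DROP TABLE sales.dist now fails until locks are re-enabled,
-- typically during a controlled maintenance window:
ALTER TABLE sales.dist ENABLE TABLE LOCK;
```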
Table 13-2: Prevention, Detection, and Repair of Partially Unavailable or Corrupted Objects
Type
Data or index segment media corruption
Data or index segment non-media corruption
Prevention
Set DB_BLOCK_CHECKING = TRUE
Set DB_BLOCK_CHECKSUM = TRUE
Use HARD where available to prevent certain corruption.

Set DB_BLOCK_CHECKING = TRUE
Set DB_BLOCK_CHECKSUM = TRUE
The data server should be maintained at the appropriate patch level for the OS, Oracle software, device drivers, and firmware.
Detection
OS-level I/O errors in OS log files. ORA-600 errors in the alert log. Transient errors in the alert log. ORA-1578, ORA-1110, or ORA-600 [3339] in the alert log. Check OS logs for memory corruption and OS corruption. Oracle tools such as DBVERIFY, DBMS_REPAIR, RMAN, or the ANALYZE statement with the VALIDATE STRUCTURE option (see below for details). An ORA- error received by the application. Proactively, by the ANALYZE statement with the VALIDATE STRUCTURE option.
Repair
See object recovery decision process. See object recovery decision process.
Use HARD where available to prevent certain corruption.
Set DB_BLOCK_CHECKING = TRUE
Set DB_BLOCK_CHECKSUM = TRUE
The data server should be maintained at the appropriate patch level for the OS, Oracle software, device drivers, and firmware.
DBVERIFY
DBVERIFY can be used to perform a physical data structure integrity check on an offline or online database. It opens a database file in read-only mode and validates blocks at the data file level or at the segment level. Since segment-level verification locks the object being verified, it should not be used in the MAA; file-level verification can be done with the database open. DBVERIFY validates all blocks of the file and can be used to detect problems in parts of the data file not currently in use by any segment of the database. Please refer to the Appendix for an example of using DBVERIFY. The errors reported by DBVERIFY can then be used for further diagnosis and repair using media recovery.
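A file-level check can be sketched as follows; the data file path and log file name are illustrative:

```sql
-- Run from the operating system shell, not SQL*Plus:
-- $ dbv FILE=/mnt/app/oracle/oradata/SALES/users01.dbf BLOCKSIZE=8192 LOGFILE=dbv_users01.log
```

DBVERIFY reports totals such as pages examined and pages marked corrupt; a non-zero corrupt count warrants further diagnosis and media recovery of the affected blocks.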
RMAN validation
RMAN block validation can be used to validate the structural integrity of database blocks. The validation is done for the blocks that are currently in use, and any errors detected are recorded in the V$DATABASE_BLOCK_CORRUPTION view. This information can then be used to perform media recovery or block media recovery. Please refer to the Appendix for an example.
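A minimal sketch of this check, run from the RMAN client and then queried from SQL*Plus (this is an illustration, not the Appendix example itself):

```sql
-- RMAN> BACKUP VALIDATE DATABASE;
-- BACKUP VALIDATE reads blocks as a backup would without writing backup pieces.

-- Then, in SQL*Plus, list any corrupt blocks found:
SELECT FILE#, BLOCK#, BLOCKS, CORRUPTION_TYPE
  FROM V$DATABASE_BLOCK_CORRUPTION;
```

Rows in V$DATABASE_BLOCK_CORRUPTION identify candidates for block media recovery.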
DBMS_REPAIR package
The DBMS_REPAIR package can be used to validate the structure of a particular data or index segment. It validates the internal structure of each block and populates a repair table with the problems found. It can also be used to determine the indexes that are impacted by corruption found in a data segment. Please refer to the Appendix for an example.
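A minimal sketch of using DBMS_REPAIR to check one segment; the schema and table names are illustrative, and the repair table name must begin with REPAIR_:

```sql
-- Create the repair table that CHECK_OBJECT will populate
BEGIN
  DBMS_REPAIR.ADMIN_TABLES(
    table_name => 'REPAIR_TABLE',
    table_type => DBMS_REPAIR.REPAIR_TABLE,
    action     => DBMS_REPAIR.CREATE_ACTION);
END;
/

SET SERVEROUTPUT ON
DECLARE
  corrupt_count BINARY_INTEGER := 0;
BEGIN
  -- Validate the internal structure of each block in the segment
  DBMS_REPAIR.CHECK_OBJECT(
    schema_name   => 'SALES',
    object_name   => 'DIST',
    corrupt_count => corrupt_count);
  DBMS_OUTPUT.PUT_LINE('Corrupt blocks found: ' || corrupt_count);
END;
/

-- Details of any problems found:
SELECT * FROM repair_table;
```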
13.2.3
Normally, the primary focus is on the solutions to handle various outages; however, prevention and detection of these outages is equally important. The focus here is on unplanned outages, since planned outages can be foreseen and executed with minimal downtime. Prevention mechanisms include deploying the correct technology and implementing operational practices. Operational best practices help avoid outages and minimize downtime by automating failover procedures; a common theme in preventing outages is the implementation and diligent practice of operational best practices. Detection mechanisms should be reliable and time-bound in terms of problem detection and notification. A monitoring infrastructure may include tools such as Oracle Enterprise Manager. Detection of failures across the multiple tiers of the architecture should be in place and well tested; these tools should detect failures at a site and notify both locally and at the secondary site. The monitoring infrastructure should detect outages reliably and quickly, notify appropriate agents (human or automated), and, where applicable, fix the problem (such as automatic restarts for mid-tier processes or automatic failover for Oracle Net services). For the outage types discussed in the Outages section, the broad strategies for prevention and detection follow; repair for these outages is covered in the Recovering from Outages section. These outages can impact either the primary or the secondary site components.
Table 13-3: Prevention and Detection of Unscheduled Outages
Unscheduled Outage
Site-wide failure
Prevention
The data center should be secure, with restricted access to critical resources. A redundant infrastructure masks many kinds of site-wide failures. Alternate sources of external resources, such as power supply and network paths between the primary and secondary sites, should be planned. The redundant set of application server nodes recommended in MAA allows a failure in one set of nodes to leave application availability unaffected. A monitoring infrastructure detects failures and, where applicable, restarts mid-tier server software. The application tier should be well tested for different kinds of load patterns before deployment into production. All changes to the application software should go through the change management process, and only well-tested software should be deployed. All nodes should be exactly replicated in terms of their configuration. The prevention mechanisms for application tier node failure apply here as well. The secondary site has a redundant set of application tier nodes to take over in case of primary site problems and recover quickly from outages; the application servers at the secondary site should be configured like those at the primary site. The network infrastructure, including the WAN traffic manager and local DNS services, is configured with failover capability to redirect clients to the application tier on the secondary site. This should be well tested.
Detection
A heartbeat on the primary site detects site failures. This heartbeat mechanism checks site availability and application availability.
At the primary site, monitor failures in the mid-tier, including detection of individual components as well as end-to-end monitoring of the application service. Components monitored include:
- Node failures
- Node component failures
- Application server software crash
- Application software crash
- Mid-tier misconfiguration
At the secondary site, monitor the end-to-end availability of the application at the primary site and initiate the takeover process when necessary.
Unscheduled Outage
Database tier node failure
Prevention
Deployment of RAC in the data server tier provides continued data server availability in case of a single node failure; RAC hides the outage, and users reconnect to the available instances. On the secondary site, multiple standby instances are available; if one fails, a surviving standby instance can restart the Managed Recovery Process (MRP).
Detection
Detection of an instance failure and data server reconfiguration are handled automatically by RAC. Monitoring to detect such failures, with automatic notification of the concerned personnel, is a recommended practice. Detection of a standby instance failure needs to be automated; one of the surviving standby instances must restart the MRP.
Deployment of Data Guard, along with failover of the network infrastructure, Oracle Net, and the application tier, allows failover to the secondary site with zero to minimal data loss. If there is a standby cluster-wide failure, there is no impact on the production database unless the database is configured in maximum protection mode; in that case, you need to downgrade the protection mode before restarting the production database. Secure access to the database's physical and logical objects and rigorously implement database server security at the user level. Implement a change management process to control any changes to the database structure. Follow the practices described in the Operational Best Practices section. Use sound application design and testing procedures before deployment into production, including application logic that captures and logs all Oracle and OS error messages when an exception is encountered. A sound security practice is probably the biggest prevention component here.
At the primary site, monitor the availability of the database service. If a database service becomes unavailable, initiate notification and failover at the application tier and in the network infrastructure (naming services, application tier load balancers).
Data failure
A monitoring infrastructure should watch the Oracle alert logs, Oracle audit logs, application server logs, application-specific logs, and OS logs on the database tier. Notification and automatic repair jobs should be implemented where appropriate. Application logic should catch logical errors and react, for example by stopping the standby database.
User error
MAA has redundant components for all resources across the technology stack that may fail; there is no single point of failure. Automatic takeover by a redundant component upon failure of the active component is assumed. This should be tested in advance and on a regular basis to prevent unwanted outages.
13.2.4
The biggest threat to corporate data comes from employees and contractors with internal access to networks and facilities. Corporate data is one of a company's most valuable assets and can be at grave risk if placed on a system or database that does not have proper security measures in place. A well-defined security policy can help protect your systems from unwanted access and protect sensitive corporate information from sabotage. Proper data protection reduces the chance of outages due to security breaches.
Exercise proper database controls:
- Lock and expire default user accounts
- Enable password management
- Change default user passwords
- Enable data dictionary protection
- Grant necessary privileges only (including revoking unnecessary privileges from PUBLIC)
- Restrict permissions on run-time facilities
- Authenticate remote clients properly
- Use user-specific SYSOPER-type connections for database startup and shutdown
- Use normal user accounts with strong authentication for database administration
- Create audit and DBA sub-roles
- Consider using Oracle Advanced Security
- Create a mechanism to remove logins of employees who have left the company
- Use different passwords for test, development, and production databases
Restrict network access:
- Utilize a firewall
- Check network IP addresses
- Prevent unauthorized administration of the Oracle listener
- Encrypt network traffic
- Harden the operating system
- Apply all security patches and workarounds
- Maintain updated virus protection
- Limit the number of operating system users
- Perform security audits
- Monitor security advisory notifications (for example, from CERT at http://www.cert.org/)
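As one concrete illustration of the password management item above, a password-limiting profile can be sketched as follows; the profile name, user name, and limit values are illustrative assumptions, not prescriptions:

```sql
-- Enforce account lockout and password aging through a profile
CREATE PROFILE secure_app_profile LIMIT
  FAILED_LOGIN_ATTEMPTS 5
  PASSWORD_LOCK_TIME    1      -- days locked after repeated failures
  PASSWORD_LIFE_TIME    60     -- days before the password must change
  PASSWORD_REUSE_MAX    10;    -- changes required before reuse

ALTER USER app_user PROFILE secure_app_profile;
```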
Note: The following section highlights some of the key aspects of security policies and procedures. However, it is important also to reference your other vendors' best practices and recommendations for securing hardware, network, operating system, and cluster configurations.
disrupt database operations. LOCK and EXPIRE all default database user accounts after performing any kind of initial installation that does not utilize DBCA. For example:
SQL> ALTER USER outln ACCOUNT LOCK;
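Locking and expiring the remaining default accounts can be sketched in one pass; the exact set of accounts varies by installed options, so the names below are common examples rather than a complete list:

```sql
-- Lock and expire default accounts that the application does not use
ALTER USER dbsnmp PASSWORD EXPIRE ACCOUNT LOCK;
ALTER USER outln  PASSWORD EXPIRE ACCOUNT LOCK;
ALTER USER mdsys  PASSWORD EXPIRE ACCOUNT LOCK;
```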
Grant necessary privileges only (including revoking unnecessary privileges from PUBLIC)
Do not give database users more privileges than necessary. In other words, apply the principle of least privilege: a user should be given only those privileges actually required to accomplish a task. To implement least privilege, restrict:
- The number of SYSTEM and OBJECT privileges granted to database users
- The number of SYS-privileged connections to the database, as much as possible
For example, there is generally no need to grant CREATE ANY TABLE to any non-DBA-privileged user.
Revoke all unnecessary privileges and roles from the database server user group PUBLIC. PUBLIC acts as a default role granted to every user in an Oracle database, so any database user can exercise privileges granted to PUBLIC. Such privileges include EXECUTE on various PL/SQL packages, which may permit a minimally privileged user to run packages he would not otherwise be permitted to access.
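A sketch of such a revocation, run as SYS, using packages commonly cited in hardening guides; verify the impact on your applications before revoking, and the grantee name below is illustrative:

```sql
-- Remove broad PUBLIC execute access to OS- and network-capable packages
REVOKE EXECUTE ON UTL_FILE FROM PUBLIC;
REVOKE EXECUTE ON UTL_TCP  FROM PUBLIC;
REVOKE EXECUTE ON UTL_SMTP FROM PUBLIC;

-- Grant back only to the specific schemas that need them
GRANT EXECUTE ON UTL_FILE TO batch_owner;
```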
Use normal user accounts with strong authentication for database administration
Rather than connecting AS SYSDBA, create normal database users and grant the DBA role (or appropriate sub-roles) to those who need to administer the database. It is also highly desirable that these users use a form of strong authentication supported by Oracle Advanced Security, such as RSA Security's SecurID cards.
Create a mechanism to remove logins of employees who have left the company
Upon employee termination, all company accounts for that user should be removed or, at a minimum, permanently expired.
The file samples in this appendix are included to illustrate the best practices as they relate to the MAA database tier Oracle configuration files. Additionally, these samples clarify how the database system parameter file (SPFILE) relates to the Oracle Net configuration for purposes of dynamic service registration. The following sample files are included:
- SPFILE
- Oracle Net configuration files:
  - listener.ora for each host
  - tnsnames.ora for each host
The files are shown for the following configuration with ORACLE_BASE=/mnt/app/oracle:

Primary Site
Host Name        ORACLE_SID
primary_host1    SALES1
primary_host2    SALES2
secondary_host2
Appendix C - Database SPFILE and Oracle Net Configuration File Samples - Database System Parameter File (SPFILE) Sample
The following Database System Parameter file sample is common to the Primary database instances as well as the Physical and/or Logical Standby databases. Parameters that are specific to a particular database instance follow this section.
################################################################
# see Database Configuration Best Practices
# compatible >= 9.0.0
################################################################
*.COMPATIBLE='9.2.0'
*.CLUSTER_DATABASE=true
# RAC naming
SALES1.THREAD=1
SALES2.THREAD=2
SALES1.INSTANCE_NUMBER=1
SALES2.INSTANCE_NUMBER=2
################################################################
#
# Data Guard Configuration parameters
# see Data Guard Configuration Best Practices
################################################################
*.STANDBY_ARCHIVE_DEST='/arch1/SALES'
*.STANDBY_FILE_MANAGEMENT='auto'
*.LOG_ARCHIVE_DEST_STATE_1='enable'
*.LOG_ARCHIVE_DEST_STATE_2=defer
*.LOG_ARCHIVE_DEST_STATE_3='alternate'
*.LOG_ARCHIVE_DEST_STATE_4=defer
*.LOG_ARCHIVE_FORMAT='arch_%t_%S.log'
*.LOG_ARCHIVE_START=true
# This can be used for debugging purposes
*.LOG_ARCHIVE_TRACE=0
*.REMOTE_ARCHIVE_ENABLE=true
*.ARCHIVE_LAG_TARGET=0
*.DB_CREATE_FILE_DEST=
################################################################
#
# Fast Start Checkpointing Parameters
# see Database Configuration Best Practices
# for determining the proper setting
################################################################
*.FAST_START_MTTR_TARGET=300
*.LOG_CHECKPOINT_INTERVAL=0
*.LOG_CHECKPOINT_TIMEOUT=0
################################################################
#
# Oracle Net Services Related Parameters
# see Database Configuration Best Practices
# subheading Ensuring Registration with Initialization Parameters
#
################################################################
*.LOCAL_LISTENER='SALES_lsnr'
################################################################
#
# Other Best Practices Related Parameters
# see Database Configuration Best Practices
################################################################
*.DB_BLOCK_CHECKING=true
*.DB_BLOCK_CHECKSUM=true
*.LOG_CHECKPOINTS_TO_ALERT=true
*.TIMED_STATISTICS=true
################################################################
#
# Automatic undo management, 1 tablespace per instance
# see Database Configuration Best Practices
# and Integrating automatic undo
################################################################
*.UNDO_MANAGEMENT='auto'
SALES1.UNDO_TABLESPACE='rbs01'
SALES2.UNDO_TABLESPACE='rbs02'
*.UNDO_RETENTION=900
14.1.2
The following parameters are applicable to both the Primary and the Physical Standby databases.
*.DB_NAME=SALES
*.SERVICE_NAMES='SALES'
*.CONTROL_FILES='/dev/vx/rdsk/ha-dg/SALES_cntr01','/dev/vx/rdsk/ha-dg/SALES_cntr02'
# OFA Compliant directory structure
*.BACKGROUND_DUMP_DEST='/mnt/app/oracle/admin/SALES/bdump'
*.CORE_DUMP_DEST='/mnt/app/oracle/admin/SALES/cdump'
*.USER_DUMP_DEST='/mnt/app/oracle/admin/SALES/udump'
14.1.3
The following parameters are applicable to the Logical Standby database. These parameter changes are required when both a Physical Standby database and a Logical Standby database reside on the same host.
*.DB_NAME=SALES_LOG
*.SERVICE_NAMES='SALES_LOG'
*.CONTROL_FILES='/dev/vx/rdsk/ha-dg/SALES_LOG_cntr01','/dev/vx/rdsk/ha-dg/SALES_LOG_cntr02'
# OFA Compliant directory structure
*.BACKGROUND_DUMP_DEST='/mnt/app/oracle/admin/SALES_LOG/bdump'
*.CORE_DUMP_DEST='/mnt/app/oracle/admin/SALES_LOG/cdump'
*.USER_DUMP_DEST='/mnt/app/oracle/admin/SALES_LOG/udump'
14.1.4
Depending upon the chosen standby configuration, one of three sets of parameters needs to be specified in the system parameter file of the database running on the primary site.
#
# LOG_ARCHIVE_DEST_4 is not set as there is no Logical Standby
# database in this configuration.
################################################################
*.FAL_CLIENT='SALES_PRIM'
*.FAL_SERVER='SALES_SEC'
*.LOG_ARCHIVE_DEST_1='location=/arch1/SALES arch noreopen max_failure=0 mandatory alternate=log_archive_dest_3'
*.LOG_ARCHIVE_DEST_2='service=SALES_SEC reopen=15 max_failure=10 lgwr sync=noparallel affirm delay=30'
*.LOG_ARCHIVE_DEST_3='location=/arch2/SALES arch'
*.LOG_ARCHIVE_DEST_4=
14.1.5
If a physical standby database has been deployed at the secondary site, one of the following three sets of parameters needs to be specified in the system parameter file of the physical standby database running on the secondary site.
*.LOG_ARCHIVE_DEST_4='service=SALES_LOG_SEC reopen=15 max_failure=10 delay=30 dependency=LOG_ARCHIVE_DEST_1'
14.1.6
If a logical standby database has been deployed at the secondary site, one of the following three sets of parameters needs to be specified in the system parameter file of the logical standby database running on the secondary site.
*.LOG_ARCHIVE_DEST_3='location=/arch2/SALES_LOG arch'
*.LOG_ARCHIVE_DEST_4=''
Appendix C - Database SPFILE and Oracle Net Configuration File Samples - Oracle Net Configuration File Samples
14.2.2 listener.ora for primary_host2
lsnr_SALES =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST=
        (ADDRESS=(PROTOCOL=tcp)(HOST=primary_host2)(PORT=1513)
        (QUEUESIZE=1024)))))
# Password Protect listener; See "Oracle Net Services Administration Guide"
PASSWORDS_lsnr_SALES = 876EAE4513718ED9
# Prevent listener administration
ADMIN_RESTRICTIONS_lsnr_SALES=ON
14.2.3 tnsnames.ora for primary_host 1 and 2 using dynamic instance registration
# Used for database parameter local_listener
SALES_lsnr=
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(PORT=1513)))
# Net service used for log_archive_dest_2
SALES_SEC =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host1))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host2)))
    (CONNECT_DATA=(SERVICE_NAME=SALES)))
# Alternate for log_archive_dest_2 when in maximum protection mode
SALES_SEC_ALT=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host2)))
    (CONNECT_DATA=(SERVICE_NAME=SALES)))
SALES_SEC_ALT2=
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host1)))
    (CONNECT_DATA=(SERVICE_NAME=SALES)))
# Net service used for log_archive_dest_4
SALES_LOG_SEC =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host1))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host2)))
    (CONNECT_DATA=(SERVICE_NAME=SALES_LOG)))
14.2.4 listener.ora for secondary_host1
lsnr_SALES =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST=
        (ADDRESS=(PROTOCOL=tcp)(HOST=secondary_host1)(PORT=1513)
        (QUEUESIZE=1024)))))
# Password Protect the listener; See "Oracle Net Services Administration Guide"
PASSWORDS_lsnr_SALES = 876EAE4513718ED9
# Prevent listener administration
ADMIN_RESTRICTIONS_lsnr_SALES=ON
14.2.5 listener.ora for secondary_host2
lsnr_SALES =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST=
        (ADDRESS=(PROTOCOL=tcp)(HOST=secondary_host2)(PORT=1513)
        (QUEUESIZE=1024)))))
# Password Protect listener; See "Oracle Net Services Administration Guide"
PASSWORDS_lsnr_SALES = 876EAE4513718ED9
# Prevent listener administration
ADMIN_RESTRICTIONS_lsnr_SALES=ON
(ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=primary_host1)))
(CONNECT_DATA=(SERVICE_NAME=SALES)))
# Net service used for log_archive_dest_4
SALES_LOG_SEC =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host1))
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host2)))
    (CONNECT_DATA=(SERVICE_NAME=SALES_LOG)))
14.2.7
SALES_PRIM_ALT=
  (DESCRIPTION=
    (SDU=32768)
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=primary_host2)))
    (CONNECT_DATA=(SID=SALES2)))
# Net service used for log_archive_dest_4
SALES_LOG_SEC =
  (DESCRIPTION=
    (SDU=32768)
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(PORT=1513)(HOST=secondary_host2)))
    (CONNECT_DATA=(SID=SALES_LOG1)))
RAC and Data Guard are the two key components of MAA. The testing environments used to validate their use with MAA are depicted below.
Two Sun Ultra 60 workstations were used to drive a load through the 100BaseT network switch into the Oracle database running on a two-node Sun E4000 cluster. The database nodes were directly attached to a Sun Photon disk array configured with one SAME set for the database files, striped and mirrored using Veritas CVM 3.0.4 across 18 9GB disks.
Appendix D - Test Configurations - Data Guard Primary Site / Secondary Site Test Configuration
The production and standby sites are identical two-node clusters, all connected to the public network through a 100BaseT network switch. The database nodes are directly attached to a Sun Photon disk array configured with one SAME set for the database files, striped using Veritas CVM 3.0.4 across four 18GB disks. The network between the
well even if connectivity limits the maximum throughput to less than what the disk drives can deliver. To review the details of the SAME validation testing results and configuration, see the Oracle white paper SAME and the HP XP512 at http://otn.oracle.com/deploy/availability/pdf/SAME_HP_WP_112002.pdf. For more information on the concepts and general guidelines for SAME, please visit http://otn.oracle.com/deploy/availability/pdf/oow2000_same.pdf.
This section contains examples of the features that may be used to handle object outages. The Outages and Recovering from Outages sections should be reviewed along with this material. This section is divided into two areas:
Object Reorganization Examples:
- Online Table Reorganization
- ALTER TABLE Statements
Object Recovery Examples:
- Flashback Query
- Block Media Recovery with RMAN
- DBMS_REPAIR package
- DBVERIFY
- Block Validation with RMAN
16.1.1
Here is an example of online redefinition. In this example, we use online redefinition to drop the street address columns from the DIST table. The table to be reorganized is shown below.
SQL> DESC SALES.DIST
 NAME                                      NULL?    TYPE
 ----------------------------------------- -------- ----------------------
 D_ID                                               NUMBER(2)
 D_W_ID                                             NUMBER(5)
 D_YTD                                              NUMBER
 D_TAX                                              NUMBER
 D_NEXT_O_ID                                        NUMBER
 D_NAME                                             VARCHAR2(10)
 D_STREET_1                                         VARCHAR2(20)
 D_STREET_2                                         VARCHAR2(20)
 D_CITY                                             VARCHAR2(20)
 D_STATE                                            CHAR(2)
 D_ZIP                                              CHAR(9)
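This excerpt picks up at step (D); the earlier steps of an online redefinition can be sketched as follows. The interim table definition and the column mapping below are illustrative assumptions, not the original document's listings:

```sql
-- (A) Verify the table can be redefined online (9i uses primary-key-based redefinition)
EXECUTE DBMS_REDEFINITION.CAN_REDEF_TABLE('SALES','DIST');

-- (B) Create the interim table without the street columns (definition assumed)
CREATE TABLE SALES.DIST_INTERIM (
  D_ID        NUMBER(2),
  D_W_ID      NUMBER(5),
  D_YTD       NUMBER,
  D_TAX       NUMBER,
  D_NEXT_O_ID NUMBER,
  D_NAME      VARCHAR2(10),
  D_CITY      VARCHAR2(20),
  D_STATE     CHAR(2),
  D_ZIP       CHAR(9));

-- (C) Start the redefinition, mapping only the columns being kept
BEGIN
  DBMS_REDEFINITION.START_REDEF_TABLE('SALES','DIST','DIST_INTERIM',
    'D_ID D_ID, D_W_ID D_W_ID, D_YTD D_YTD, D_TAX D_TAX, ' ||
    'D_NEXT_O_ID D_NEXT_O_ID, D_NAME D_NAME, D_CITY D_CITY, ' ||
    'D_STATE D_STATE, D_ZIP D_ZIP');
END;
/
```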
D) Synchronize the interim table with the base table. This can be done multiple times.
SQL> EXECUTE DBMS_REDEFINITION.SYNC_INTERIM_TABLE('SALES','DIST','DIST_INTERIM');

PL/SQL procedure successfully completed.

SQL> EXECUTE DBMS_REDEFINITION.SYNC_INTERIM_TABLE('SALES','DIST','DIST_INTERIM');

PL/SQL procedure successfully completed.
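After the final synchronization, the redefinition is completed with only a brief exclusive lock on the table. A sketch of the remaining steps (not shown in this excerpt):

```sql
-- Swap the table definitions; DIST now has the interim table's shape
EXECUTE DBMS_REDEFINITION.FINISH_REDEF_TABLE('SALES','DIST','DIST_INTERIM');

-- The interim table (now holding the old definition) can then be dropped
DROP TABLE SALES.DIST_INTERIM;
```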
16.1.2
ALTER TABLE Statements
D) The USER_UNUSED_COL_TABS view can be checked for the table name and the number of unused columns.
SQL> SELECT * FROM USER_UNUSED_COL_TABS;

TABLE_NAME                          COUNT
------------------------------ ----------
DIST                                    1
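The unused-column entry shown above would typically be produced, and later cleaned up, with ALTER TABLE statements along the following lines. The column name D_STREET_1 is taken from the DIST example for illustration; the actual column marked unused is not shown in this excerpt:

SQL> ALTER TABLE SALES.DIST SET UNUSED COLUMN D_STREET_1;
SQL> ALTER TABLE SALES.DIST DROP UNUSED COLUMNS;

SET UNUSED is fast because it only updates the data dictionary; DROP UNUSED COLUMNS physically removes the data and can be deferred to a maintenance window.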
16.2.1
Flashback Query
This example shows usage of flashback query in recovering from a logical error caused by a transaction resulting from a user error or an application error.
INIT.ORA setup
UNDO_MANAGEMENT=AUTO
UNDO_TABLESPACE=UNDOTBS1
UNDO_RETENTION=10800    # time in seconds
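The UNDO_RETENTION value should comfortably exceed the age of the oldest data you expect to flash back to. As a rough sanity check, V$UNDOSTAT can be queried for the longest query serviced from the undo tablespace; this is a sketch, not a sizing method:

SQL> SELECT MAX(MAXQUERYLEN) FROM V$UNDOSTAT;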
Usage
SCN Known
If the SCN of the transaction that caused the problem is known, it is best to use flashback query with the SCN. The steps are:
I. Make a list of the tables changed by the transaction directly.
II. Make a list of the tables changed due to triggers on the tables above.
III. For each table above, do the following:
SQL> CREATE TABLE T1_DIFF AS
  2  SELECT * FROM T1
  3  MINUS
  4  SELECT * FROM T1 AS OF SCN 556565;

This table, T1_DIFF, has all the changes made to table T1 since SCN 556565. Many of these may be changes the installation may want to retain. Based on the results, the relevant data can be recovered and rolled back into the main table.
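Once the differences are understood, unwanted changes can be undone with ordinary DML. For example, rows that the errant transaction deleted could be restored with a statement like the following sketch (the SCN and table name follow the example above):

SQL> INSERT INTO T1
  2  SELECT * FROM T1 AS OF SCN 556565
  3  MINUS
  4  SELECT * FROM T1;

Rows that were wrongly inserted or updated require corresponding DELETE or UPDATE statements built from T1_DIFF.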
Time Known
If the timestamp of the transaction that caused the problem is known but the SCN is not, the following can be used. Note that a timestamp-based flashback query returns results at a time rounded down by up to 5 minutes. The steps are similar to the above except for the change in the AS OF clause:
I. Make a list of the tables changed by the transaction directly.
II. Make a list of the tables changed due to triggers on the tables above.
III. For each table above, do the following:
SQL> CREATE TABLE T1_DIFF AS
  2  SELECT * FROM T1
  3  MINUS
  4  SELECT * FROM T1 AS OF TIMESTAMP
  5  (TO_TIMESTAMP('24-MAY-02 10:17:00','DD-MON-YY HH:MI:SS'));

This table, T1_DIFF, has all the changes made to table T1 since the time specified (rounded down to the nearest 5-minute interval). Many of these may be changes the installation may want to retain. If the results returned do not show the changes, the timestamp can be adjusted, or the query can be redone using an SCN from around this time. Based on the results, the relevant data can be recovered and rolled back into the main table.
16.2.2
Block Media Recovery with RMAN
This example shows recovering a corrupted block using the block media recovery feature in RMAN.
Error reported
The error can be detected by the application or by looking at the alert log file. The typical entry looks like:
ORA-01578: ORACLE data block corrupted (file # 4, block # 26)
ORA-01110: data file 4: '/dev/vx/rdsk/ha-dg/SALES_dtf04'
Usage
The following RMAN command can be used to fix the above block.
RMAN> BLOCKRECOVER DATAFILE 4 BLOCK 26 FROM BACKUPSET;

Starting blockrecover at 07-APR-02
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=12 devtype=DISK
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of data file 00004
channel ORA_DISK_1: restored block(s) from backup piece 1
piece handle=/rmanbackup/01dlart2_1_1 tag=TAG20020407T162610 params=NULL
channel ORA_DISK_1: block restore complete
starting media recovery
media recovery complete
Finished blockrecover at 07-APR-02

Note that if more than one block is being recovered, multiple data file and block combinations can be specified in the above command. Other examples of usage can be found in the Oracle9i Recovery Manager Reference, Release 2 (9.2).
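For example, two corrupt blocks in different data files could be repaired in a single command, and in Oracle9i Release 2 RMAN can also repair all blocks recorded in V$DATABASE_BLOCK_CORRUPTION. Both of the following are illustrative sketches; the file and block numbers are assumptions:

RMAN> BLOCKRECOVER DATAFILE 4 BLOCK 26 DATAFILE 5 BLOCK 10 FROM BACKUPSET;
RMAN> BLOCKRECOVER CORRUPTION LIST;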
16.2.3
DBMS_REPAIR Package
This example shows usage of DBMS_REPAIR to enable skipping of a corrupted block for an object. Other examples of usage can be found in Chapter 22, Detecting and Repairing Data Block Corruption, of the Oracle9i Database Administrator's Guide, Release 2 (9.2). The example shown here uses SQL*Plus to issue the commands.

A) Create the repair table and the orphan key table. It is suggested that these tables be created in advance.
SQL> BEGIN
  2    DBMS_REPAIR.ADMIN_TABLES
  3      ('REPAIR_TABLE', DBMS_REPAIR.REPAIR_TABLE, DBMS_REPAIR.CREATE_ACTION);
  4  END;
  5  /
PL/SQL procedure successfully completed.

SQL> BEGIN
  2    DBMS_REPAIR.ADMIN_TABLES (
  3      'ORPHAN_KEY_TABLE', DBMS_REPAIR.ORPHAN_TABLE, DBMS_REPAIR.CREATE_ACTION);
  4  END;
  5  /
PL/SQL procedure successfully completed.

B) To check for corruption in an object:

SQL> VARIABLE NUM_CORRUPT NUMBER;
SQL> BEGIN
  2    DBMS_REPAIR.CHECK_OBJECT(SCHEMA_NAME => 'SALES',
  3      OBJECT_NAME => 'ORDR', CORRUPT_COUNT => :NUM_CORRUPT);
  4  END;
  5  /
PL/SQL procedure successfully completed.

SQL> PRINT NUM_CORRUPT

NUM_CORRUPT
-----------
          1

C) To check the details of the corruption:

SQL> SELECT
  2    OBJECT_NAME, BLOCK_ID, MARKED_CORRUPT, CORRUPT_DESCRIPTION,
  3    REPAIR_DESCRIPTION, CHECK_TIMESTAMP
  4  FROM REPAIR_TABLE
  5  ORDER BY CHECK_TIMESTAMP
  6  /

D) To attempt to repair the objects shown above:

SQL> VARIABLE FIX_COUNT NUMBER;
SQL> BEGIN
  2    DBMS_REPAIR.FIX_CORRUPT_BLOCKS
  3      (SCHEMA_NAME => 'SALES', OBJECT_NAME => 'ORDR',
  4       FIX_COUNT => :FIX_COUNT);
  5  END;
  6  /
PL/SQL procedure successfully completed.

SQL> PRINT FIX_COUNT

 FIX_COUNT
----------
         1

E) To enable skipping of corrupted blocks for this table:

SQL> BEGIN
  2    DBMS_REPAIR.SKIP_CORRUPT_BLOCKS(SCHEMA_NAME => 'SALES',
  3      OBJECT_NAME => 'ORDR', OBJECT_TYPE => DBMS_REPAIR.TABLE_OBJECT,
  4      FLAGS => DBMS_REPAIR.SKIP_FLAG);
  5  END;
  6  /
PL/SQL procedure successfully completed.

To verify the above, run the following:

SQL> SELECT SKIP_CORRUPT FROM DBA_TABLES WHERE TABLE_NAME = 'ORDR';

SKIP_COR
--------
ENABLED
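When a corrupt block belongs to a table with indexes, DBMS_REPAIR.DUMP_ORPHAN_KEYS can identify index entries that point into the corrupt block so that the affected indexes can be rebuilt. This is a sketch assuming a hypothetical index named ORDR_I1 on the ORDR table:

SQL> VARIABLE KEY_COUNT NUMBER;
SQL> BEGIN
  2    DBMS_REPAIR.DUMP_ORPHAN_KEYS(SCHEMA_NAME => 'SALES',
  3      OBJECT_NAME => 'ORDR_I1', OBJECT_TYPE => DBMS_REPAIR.INDEX_OBJECT,
  4      REPAIR_TABLE_NAME => 'REPAIR_TABLE',
  5      ORPHAN_TABLE_NAME => 'ORPHAN_KEY_TABLE',
  6      KEY_COUNT => :KEY_COUNT);
  7  END;
  8  /

The orphan key table can then be queried to decide which indexes to rebuild.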
16.2.4
DBVERIFY Utility
This example shows how to use the DBVERIFY utility to detect corrupt blocks at the data file level. Segment-level verification is not recommended in MAA unless the segment is accessed mostly in read mode.
$ dbv FILE=/dev/vx/rdsk/ha-dg/SALES_dtf01 BLOCKSIZE=8192

DBVERIFY: Release 9.2.0.1.0 - Production on Fri May 24 14:14:53 2002
Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.

DBVERIFY - Verification starting : FILE = /dev/vx/rdsk/ha-dg/SALES_dtf01
DBVERIFY - Verification complete

Total Pages Examined          : 262016
Total Pages Processed (Data)  : 120682
Total Pages Failing   (Data)  : 0
Total Pages Processed (Index) : 3884
Total Pages Failing   (Index) : 0
Total Pages Processed (Other) : 54
Total Pages Processed (Seg)   : 0
Total Pages Failing   (Seg)   : 0
Total Pages Empty             : 137396
Total Pages Marked Corrupt    : 0
Total Pages Influx            : 0
The block and file number can be used to further investigate and fix the problem.
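Once a suspect block number is known, for example from an ORA-01578 message, DBVERIFY can be rerun over just that range using its START and END parameters. A sketch, reusing the file from the example above:

$ dbv FILE=/dev/vx/rdsk/ha-dg/SALES_dtf01 BLOCKSIZE=8192 START=26 END=26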
16.2.5
Block Validation with RMAN
RMAN can be used to validate the used blocks of a database. The following command runs a backup validation to populate V$DATABASE_BLOCK_CORRUPTION.
RMAN> BACKUP VALIDATE DATABASE;

If problems are found, the following query will return rows as shown:

SQL> SELECT * FROM V$DATABASE_BLOCK_CORRUPTION;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
         4         25          1                  0 FRACTURED
         4         27          1                  0 FRACTURED
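Validation can also be narrowed to individual files when a full-database pass is too expensive; for example:

RMAN> BACKUP VALIDATE DATAFILE 4;

Any corrupt blocks found are recorded in V$DATABASE_BLOCK_CORRUPTION, from which block media recovery can then be driven as shown in the block media recovery example earlier in this section.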
Within a RAC environment, the server parameter file (SPFILE) allows for a single, central parameter file that holds all the database initialization parameters associated with all the instances involved in a RAC database. This provides a simple, persistent, and robust environment for managing database parameters.
o Create shared volume (raw device)
o Create a text initialization file
o Create the SPFILE from the text initialization file
o Create symbolic links to the SPFILE
17.1.1
Create Shared Volume (Raw Device)
Example:
This volume must be accessible to all instances in the cluster and have the database owner access rights.
/usr/sbin/vxassist -g ha-dg -U gen make SALES_spfile 2m layout=mirrored stripeunit=256k nstripe=4 user=oracle group=dba mode=660
17.1.2
Create a Text Initialization File
SPFILEs do not allow the use of the initialization parameter file IFILE option, and it is not necessary with SPFILEs. If you were previously using an IFILE for global (same for all instances) parameters, you can merge the files together. When creating the text file, global parameters do not need any line prefix, but they can have a *. line prefix, e.g., *.log_archive_start=TRUE. Instance-specific parameters, such as thread, require an <ORACLE_SID>. line prefix, e.g., SALES1.thread=1 where ORACLE_SID=SALES1. This is a sample from a text initialization parameter file named /mnt/app/oracle/SALES/pfile/initSALES.ora:
*.fal_server='SALES_SEC'
*.fal_client='SALES_PRIM'
SALES1.thread=1
SALES2.thread=2
SALES1.undo_tablespace='rbs01'
SALES2.undo_tablespace='rbs02'
SALES1.instance_name='SALES1'
SALES2.instance_name='SALES2'
SALES1.instance_number=1
SALES2.instance_number=2
log_archive_start=TRUE
Note that the fal_ lines do not require the *. prefix and that the log_archive_start line could have a *. prefix.
17.1.3
Create the SPFILE from the Text Initialization File
Using the previously created shared volume and the text initialization file, the binary SPFILE can be created from the SQL*Plus environment while the database is up or down. Example:
SQL> create spfile='/dev/vx/rdsk/ha-dg/SALES_spfile' from
  2  pfile='/mnt/app/oracle/admin/SALES/pfile/initSALES.ora';
File created.
17.1.4
Create Symbolic Links to the SPFILE
Now that the SPFILE is created on a shared volume, database startup operations can be simplified by creating a symbolic link that uses the default server parameter file name, spfile$ORACLE_SID.ora. For UNIX, the location (directory) for the server parameter file is $ORACLE_HOME/dbs. For Microsoft Windows environments, the location is %ORACLE_HOME%\database. Accordingly, for each database instance, execute the following in the default location directory:
ln -s <shared volume file spec> spfile$ORACLE_SID.ora
For a database with two instances, SALES1 and SALES2, execute the following commands in the $ORACLE_HOME/dbs directory:

On the SALES1 instance:
ln -s /dev/vx/rdsk/ha-dg/SALES_spfile spfileSALES1.ora

On the SALES2 instance:
ln -s /dev/vx/rdsk/ha-dg/SALES_spfile spfileSALES2.ora

Alternatively, if your platform does not support symbolic links or shortcuts, then you can use a PFILE (default or nondefault) that contains the following line:

SPFILE=/dev/vx/rdsk/ha-dg/SALES_spfile

The same steps can be followed on the secondary site database to create the SPFILE.
SCOPE Option        Effect
------------------  -------------------------------------------------------------
SPFILE              The change is applied in the server parameter file only. This
                    is the only SCOPE specification allowed for static parameters.
MEMORY              The change is applied in memory only, but it is not persistent
                    because the server parameter file is not updated. This is not
                    allowed for static parameters.
SPFILE and memory   The change is applied in both the server parameter file and
                    memory.
Some initialization parameters can only be changed by stopping and starting the Oracle instance. These are called static parameters. Static parameter changes are recorded in the SPFILE using the SCOPE=SPFILE option. SPFILE-only changes do not take effect until the database is restarted with the SPFILE. For instance-specific parameter values, use the SID designator:
ALTER SYSTEM SET undo_retention=600 SCOPE=MEMORY SID='SALES1';

SPFILE settings can be checked in the V$SPPARAMETER view. It is possible for a value in the V$SPPARAMETER view to differ from the V$PARAMETER view if a parameter change has been made with SCOPE=SPFILE or SCOPE=MEMORY. Additionally, instance-specific SPFILE settings can be viewed with the following query:

SELECT SID, NAME, VALUE FROM V$SPPARAMETER WHERE SID <> '*' ORDER BY SID;

Rather than allowing the ALTER SYSTEM SET command options to default, being explicit leaves no doubt. Thus, always specifying the SCOPE and the SID on the ALTER SYSTEM SET command is recommended.
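For example, an explicit, instance-specific change and its verification might look like the following sketch (the parameter value is an assumption for illustration):

SQL> ALTER SYSTEM SET undo_retention=900 SCOPE=BOTH SID='SALES1';
SQL> SELECT SID, NAME, VALUE FROM V$SPPARAMETER
  2  WHERE NAME = 'undo_retention';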
Maximum Availability Architecture
June 2005
Author: High Availability Systems Group
Contributing Authors: Andrew Babb, Pradeep Bhat, Ray Dutcher, Susan Kornberg, Ashish Prabhu, Lawrence To, Doug Utzig, Jim Viscusi, Shari Yamaguchi

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
www.oracle.com

Oracle is a registered trademark of Oracle Corporation. Various product and service names referenced herein may be trademarks of Oracle Corporation. All other product and service names mentioned may be trademarks of their respective owners.

Copyright 2005 Oracle Corporation
All rights reserved.