
Microsoft Cluster Server and Oracle Fail Safe

Quick Start Guide

Step-by-step instructions for installing Microsoft Cluster Server, installing
Oracle Fail Safe, and configuring a database.
Microsoft Cluster Server and Oracle Fail Safe Quick Start Guide

Introduction
Part 1: Hardware Configuration and Set-Up
  Certified Hardware
  Disk Configuration
  Configure Network Cards
Part 2: Installing Microsoft Cluster Server
  Installing MSCS on the First Node
  Adding Additional Nodes
  Using Cluster Administrator
Part 3: Installing Oracle Fail Safe
  Match Home Names on All Nodes
  Oracle Services for MSCS Security Setup
  Completing the Fail Safe Configuration
  Making the Database Fail Safe
  Creating the Database
  Verifying the Standalone Database Configuration
  Creating a Group
  Adding the Database to a Group


Introduction
This paper is divided into three parts. Part 1 covers the hardware configuration.
Part 2 gives step-by-step instructions for installing Microsoft Cluster Server
(MSCS). Part 3 gives step-by-step instructions for installing Oracle Fail Safe and
configuring a database. Here is an overview of the required steps:
1. Hardware Configuration and Set-Up
   - Confirm Hardware is Certified for MSCS
   - Configure Shared Disks
   - Select Disk to be Quorum Disk
   - Configure Network Cards
   - Obtain IP Address and Network Name for Cluster Group and Register in
     DNS or HOSTS file
2. Install Cluster Server on First Node and on Second/Additional Nodes
3. Install Oracle Fail Safe

Part 1: Hardware Configuration and Set-Up

Certified Hardware
Oracle does not specifically certify hardware for Oracle Fail Safe. Instead, you must
ensure that the hardware is on the Microsoft Cluster Server Hardware Compatibility List
(HCL) that is available from Microsoft. You will find the HCL at:
http://www.microsoft.com/hcl/

Disk Configuration
Disks need only be configured from one node. Do not attempt to write to the disks from
multiple nodes until the clustering software has been installed. Avoid creating software
volumes—any striping or RAID configuration should be done at the hardware level, prior
to configuring the disks in the Disk Management console; this will give you better
performance. Choose a node from which to configure the disks, and open the Disk
Management Console.
Partitioning a single physical disk into multiple partitions can be done, but MSCS sees
the entire Physical Disk as a single resource, so the entire disk must always move
together, no matter how many partitions are on it. Therefore, it normally makes sense to
simply create one partition on each Physical Disk. Format all of the shared drives as
NTFS volumes and assign the drive letters as appropriate. Note that in the example
below, we have labeled volumes as either Shared or Private.


Figure 1: Screenshot of the Computer Management Console

Quorum Disk

MSCS requires that one of your shared disks be assigned as the quorum disk. The
quorum disk assists in handling certain clustering functions. The quorum disk is critical
to resolving ownership of resources should the interconnect go down. Additionally, it
provides an area of physical storage that all nodes can access. The quorum disk does not
require much space, so you should choose the smallest drive possible. Microsoft
recommends a minimum drive size of 500MB. Keep in mind that if the quorum disk fails,
the cluster fails; therefore, you may want the quorum disk to be a RAID volume of some
type. Although it is possible in some versions to place Oracle datafiles on the same
drive as the quorum disk, Oracle and Microsoft recommend that the quorum disk be kept
separate from any other resource disks.
Decide which shared disk you want to be the quorum disk.
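
The selection rule above (smallest shared disk that still meets the 500MB minimum and is not shared with other resources) can be sketched in a few lines of Python. This is only an illustration: the SharedDisk record, its fields, and the drive letters in the usage note are hypothetical and are not part of MSCS.

```python
from dataclasses import dataclass

MIN_QUORUM_MB = 500  # minimum size quoted in the text above


@dataclass(frozen=True)
class SharedDisk:
    # Hypothetical record describing one shared physical disk.
    letter: str        # drive letter, e.g. "Q"
    size_mb: int       # formatted capacity in megabytes
    holds_data: bool   # True if datafiles or other resources live on it


def pick_quorum_disk(disks):
    """Return the smallest dedicated shared disk of at least MIN_QUORUM_MB, or None."""
    candidates = [
        d for d in disks
        if d.size_mb >= MIN_QUORUM_MB and not d.holds_data
    ]
    return min(candidates, key=lambda d: d.size_mb, default=None)
```

For example, given a 1024MB disk Q, a 400MB disk R, and a 9000MB disk T, all without data, the function returns Q because R is below the minimum. A None result means no shared disk qualifies as a dedicated quorum disk.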

Configure Network Cards


It is likely that you will have at least two network cards in each node of the cluster. One
network card is generally used for public communication with network clients and
servers, while the second network card is generally reserved for cluster communications.
If there are only two nodes in the cluster, these cards can be connected directly to each
other via a crossover cable. Or, you can go through a hub if you have more than two
nodes. It is possible to have the cluster communications go through the public network,
but this is not recommended because the cluster communication involves polling of
resources on a regular basis. Not only can this polling result in a large amount of traffic,
but a network glitch could be incorrectly interpreted as a resource failure which could
result in a restart or failover of a healthy resource. Thus, it is better to have a dedicated
network for the resource polling.

Binding Order
With a network card dedicated to the interconnect, and a second card dedicated to the
public network, it is important to ensure that the bindings are set up correctly. Any public
cards which will be communicating with client machines should always be bound first,
leaving the network card for the interconnect bound last of all. This is critical in ensuring
the name resolution works correctly, particularly when nodes are communicating with
each other. If the binding order is incorrect, you may see that a ping of the public host
name resolves to the private IP address. Thus, if a listener is configured to listen on a
host name, it may incorrectly resolve that host name to the private IP address which
means incoming connections from clients will fail.

How to Check the Binding Order


1. Right-click My Network Places and choose Properties.
2. From the Advanced drop-down menu, choose Advanced Settings.
3. Look in the Adapters and Bindings tab and ensure that the card with your public
IP address is first in the list. If it is not listed as the first entry, move it up.
Follow the same steps on both nodes.


Figure 2: Screenshot of the Adapters and Bindings Window

Disabling WINS on Interconnect


You also want to ensure that the WINS address is left empty for the private card.
1. Right-click the private network connection in Network and Dial-up
Connections.
2. Choose Properties, and select Properties again for the Internet Protocol
(TCP/IP).
3. Choose the Advanced button and select the WINS tab.

If there is a WINS address defined, remove it; otherwise, the Cluster Service
will become confused when attempting to communicate with the Domain
Controller (all cluster nodes must be members of a domain).

Use DNS or HOSTS for Name Resolution


Finally, make sure that all public IP and host name combinations have been registered in
DNS. Be sure to include the IP addresses and host names for groups that you intend to
create for the cluster itself as well as any Fail Safe groups. Additionally, you may want
to assign a network name to the cards on the private interconnect. Since these cards
usually are not going to be connected to a DNS server, you should add entries into the
hosts file. You can find the hosts file in the \WINNT\System32\drivers\etc directory. A
popular naming convention is to append ".SAN" to the end of the actual node name, and
use that as the host name assigned to the private card. This convention indicates clearly
that this hostname is on its own subnet, using the private interconnect. If you have two
nodes called RMNTOFS1 and RMNTOFS2, your host file entries might look like so:

10.10.10.1   RMNTOFS1.SAN   #PRIVATE CONNECTION for Node1
10.10.10.2   RMNTOFS2.SAN   #PRIVATE CONNECTION for Node2
192.1.1.1    RMNTOFS1       #PUBLIC Connection for Node1
192.1.1.2    RMNTOFS2       #PUBLIC Connection for Node2
192.1.1.3    RMNTCLUSER     #MSCS Cluster Group IP
192.1.1.4    RMNT_FAIL-1    #Fail Safe Group IP
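
If you maintain these entries by hand on several nodes, a small script can confirm that every node has its private ".SAN" name in the file. The following Python sketch is an illustration only: the function names are our own, and it assumes the plain hosts format shown above (address, one or more names, optional # comment).

```python
def parse_hosts(text):
    """Map each host name (lower-cased) to its IP address, ignoring # comments."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comment
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name.lower()] = ip
    return table


def private_names_missing(table, nodes, suffix=".san"):
    """Return the nodes that have no '<node><suffix>' entry in the parsed table."""
    return [n for n in nodes if (n.lower() + suffix) not in table]


SAMPLE = """\
10.10.10.1 RMNTOFS1.SAN #PRIVATE CONNECTION for Node1
10.10.10.2 RMNTOFS2.SAN #PRIVATE CONNECTION for Node2
192.1.1.1 RMNTOFS1 #PUBLIC Connection for Node1
192.1.1.2 RMNTOFS2 #PUBLIC Connection for Node2
"""
```

Calling private_names_missing(parse_hosts(SAMPLE), ["RMNTOFS1", "RMNTOFS2"]) returns an empty list. In practice you would read the text from the hosts file on each node instead of using the SAMPLE string.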

Double-check the setup by pinging the public and private names of all nodes in the
cluster, including a ping of each node from itself. Verify that a ping of the public name
always returns the public IP address, and a ping of the private name returns the private
IP address:

C:\>ping rmntofs1
Pinging rmntofs1.US.ORACLE.COM [192.1.1.1] with 32 bytes of data:

Reply from 192.1.1.1: bytes=32 time<10ms TTL=128
..

C:\>ping rmntofs1.san
Pinging RMNTOFS1.SAN [10.10.10.1] with 32 bytes of data:

Reply from 10.10.10.1: bytes=32 time<10ms TTL=128
..
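
The same check can be scripted so that it is easy to repeat on every node. The sketch below is a minimal illustration in Python, not an Oracle or Microsoft tool: it resolves each name with the standard socket.gethostbyname call and reports any name that does not return the expected address. The function names and the expected-address table are assumptions that you would replace with your own values.

```python
import socket


def check_resolution(expected, resolve=socket.gethostbyname):
    """Return {name: (expected_ip, actual_ip)} for every name that resolves wrongly.

    actual_ip is None when the name does not resolve at all.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            got = resolve(name)
        except OSError:            # socket.gaierror is a subclass of OSError
            got = None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


def table_resolver(table):
    """Build a resolver backed by a dict, for dry runs without a network."""
    def resolve(name):
        try:
            return table[name]
        except KeyError:
            raise OSError(f"unknown host {name}") from None
    return resolve
```

On a node, you might call check_resolution({"rmntofs1": "192.1.1.1", "rmntofs1.san": "10.10.10.1"}) and expect an empty result. If the public name comes back with the private address, revisit the binding order described earlier.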

Part 2: Installing Microsoft Cluster Server


Once all of the hardware is properly set up and configured (your disks are
partitioned such that you have enough physical drives to support the appropriate number
of groups, all of the necessary host names and IP addresses are registered in DNS
or in the HOSTS file, and your network cards are configured appropriately),
you are ready to install Microsoft Cluster Server.

Installing MSCS on the First Node


1. Open up the Windows 2000 Control Panel on one of your cluster nodes, and
choose Add/Remove Programs.
2. Choose Add/Remove Windows Components from the dialog window.
3. Place a check box next to Cluster Service and choose Next.

You will be prompted for the Windows 2000 Advanced Server CD, and then
the Cluster Configuration Wizard will be started.

4. Choose Next on the welcome screen to display a link to the Microsoft
Hardware Compatibility List (HCL). Notice the disclaimer states that
hardware not on the HCL is not supported.
5. Click the I Understand button and choose Next.
6. Indicate that this is the first node in the cluster.
7. Input the network name that you have chosen for the Cluster Group.

Remember, this network name and cluster IP combination should have
already been registered in DNS or in the hosts file. If not, this step will fail.
(You will be prompted for the IP address later on in the install.)

Figure 3: Screenshot of the Cluster Name Dialog Window

User Account Set-up for Running Cluster Service


8. On the next screen, you will be prompted for a username under which the
Cluster Service will run. This is a Domain Administrator Account, and the
domain name that the cluster node is a member of should show up in the
bottom box. Type in the correct username and password and continue on to
the next screen.
9. On the Add or Remove Managed Disks window you should see the listing of
shared drives that you previously configured in the Disk Management
Console. Ensure that all of the drives that you intend to use are listed on the
right-hand side, under the Managed Disks column. Continue to the next
screen, where you will choose which drive will be the quorum disk.

Figure 4: Screenshot of the Cluster Name Dialog Window

Defining Networks
10. After selecting the quorum disk, you will be presented with a screen on which
you will define the networks. You can name them whatever you choose—
generally, we recommend that you keep it simple and call them "Public" and
"Private". For the "Private" network, you want to ensure that you select the
radio button to enable the network for Internal Cluster Communications
Only. For the public network, you should probably select All
Communications, to provide a certain amount of redundancy.
11. On the next screen, you will determine which network should be used first for
cluster communications, assuming that both networks are functioning.

Be sure that the Private network is first, so that as long as it is functional, the
public network will be configured only as a fallback. It is also fairly common
for some sites to have three or four network cards in each node, so that a
second private network can be defined for the interconnect, again providing
additional redundancy.

If you have more than two cards in each node, configure the networks
according to which order you want cluster communications to fall back in the
event of a failure.
12. For the final step, you will be prompted to enter the IP address that you have
reserved for the virtual Cluster Group. As previously mentioned, this IP
address is the same that was registered in DNS or the HOSTS file with the
cluster’s Network name that was specified at the outset of the MSCS install.
Type in the IP and ensure that the correct network is chosen. In our example,
the cluster name is RMNTCLUSER, and the IP Address is 138.1.144.117. On
the final screen, be sure to click Finish to complete the cluster installation.

Figure 5: Screenshot of the Cluster IP Address Window


Adding Additional Nodes


The process of adding an additional node to the cluster is much quicker. On the second
node, start the install in the same fashion as before, but this time, select the radio button
for The Second Or Next Node In The Cluster. Provide the same username, password,
and domain information as in the initial install, and then finish the cluster installation on
the second node. This node has now joined the cluster as an equal member.

Figure 6: Screenshot of the Create or Join a Cluster Window


Using Cluster Administrator


Once you have installed Microsoft Cluster Server, you will be able to run the Microsoft
Cluster Administrator to view the nodes, groups, and resources in your cluster.
1. Start Cluster Administrator by clicking Start | Programs | Administrative
Tools | Cluster Administrator.

The Figure below is an expanded view of Cluster Administrator. Initially,
after a fresh install, you will have a group called "Cluster Group", which
contains as resources the Cluster IP Address, the Cluster Name, and the
quorum disk. This is the first virtual server group that has been created as part
of your cluster. You cannot add an Oracle database or other resources to this
group; you must create a second group. However, the install of Fail Safe later
on will add an Oracle Fail Safe Server into the Cluster Group. We discuss this
in the coming section on Fail Safe installation.

Figure 7: Cluster Administrator

Disk Groups
In addition to the Cluster Group, you see in the Figure that you will have a Disk Group
for each additional shared disk besides the quorum disk. These Disk Groups are simply
placeholders for the disk resources—they are not true virtual groups, as they do not have
network names and IP addresses associated. However, ownership of the disk groups can
still be transferred back and forth between the nodes. When a database with files residing
on one of these disks is added to a new group, the disk resource associated will be
removed from the temporary disk group and placed into the database group. At this time,
you will be able to delete the disk group, if you so desire.


Resources and Resource Types


Refer again to Figure 7, the Cluster Administrator. You will also see a folder called
Resources. When this is highlighted, it will list all cluster resources, the group in which
each resource resides, and the current owning node. In addition, under Cluster
Configuration you should see a Resource Types folder. When highlighted, this will list
each of the resource types and the Resource DLL used to monitor that type of resource.
Once Oracle is installed and configured, you should see a resource type of Oracle
Database listed here.

Part 3: Installing Oracle Fail Safe


As mentioned earlier, Oracle software must be installed on the private drive on each node
of the cluster. This includes the database software, any Oracle application software (such
as Forms, Reports, or 9iAS) and Oracle Fail Safe itself. As such, this also requires proper
planning prior to embarking on the installation. First, you must ensure that you have
enough space available on the private drives of all nodes in the cluster. Second, you must
determine which nodes in the cluster are meant to run which software. This is primarily a
consideration in clusters with multiple nodes. If you are, in fact, planning an architecture
with three or four nodes, comprising different tiers, you may not want or need all of the
software on all of the nodes in the cluster. Determine which nodes should be able to run
the database and which nodes should be able to run the application software, and plan
accordingly. We recommend that you install the Fail Safe software last. During the
install of Fail Safe, the Cluster Service must be running.

Match Home Names on All Nodes


It is required that the Oracle home names for the database software and the Fail Safe
software, respectively, are identical on each node. In addition, Oracle Fail Safe should be
installed into its own Oracle home, separate from other Oracle products. Thus, on Node1
if the database software is installed in a home called OraHome90, and Fail Safe itself is
installed in a home called OFSHome, you must make sure that the home names match
identically for each of these products on all nodes in the cluster. We also recommend
that you match the directory names and orders of install on all nodes when possible.
Though this is not strictly required, it prevents confusion and simplifies administration.
Once you have decided on the Oracle product choices, home names, and directories, you
are ready to begin the actual install of the product.
Again, the install must be performed under a user account with Local Administrator
privileges on each node. After selecting the home name and directory, if you are
installing Fail Safe 3.2 (the first release to be certified with Oracle9i), you will be
prompted to select either Oracle Fail Safe or Real Application Cluster Guard; select
Oracle Fail Safe. Choosing a Typical install will give you the components necessary to
make the database highly available. Prior to the actual beginning of the installation, you
will be cautioned that a reboot is required after the installation completes.
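
Because a mismatched home name is only discovered later (and, as described under Verify Cluster, requires a reinstall), it can help to write down the home names you plan to use on each node and compare them before you start. The Python sketch below illustrates that comparison only; the function name and the "database"/"failsafe" product labels are our own, and it assumes an exact, case-sensitive string comparison, which is stricter than may be necessary.

```python
def home_name_mismatches(homes_by_node):
    """Compare Oracle home names across nodes.

    homes_by_node: {node: {product: home_name}}
    Returns {product: {node: home_name}} for each product whose home name
    differs between the nodes on which that product is installed.
    """
    products = set()
    for homes in homes_by_node.values():
        products.update(homes)

    problems = {}
    for product in sorted(products):
        names = {
            node: homes[product]
            for node, homes in homes_by_node.items()
            if product in homes
        }
        if len(set(names.values())) > 1:
            problems[product] = names
    return problems
```

With OraHome90 and OFSHome planned on both nodes, the function returns an empty dictionary. If one node had OFSHome1 instead, the result would list the "failsafe" product with the name used on each node.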


Oracle Services for MSCS Security Setup


In Fail Safe releases 3.1.x and lower, the service created by the Fail Safe install was
called the Oracle Fail Safe Service. Starting with release 3.2, this service is named
OracleMSCSServices. At the end of the installation, you will be prompted for another
domain name, username, and password combination. This is the account that will be
used to run the OracleMSCSServices service. This can be the same account information
that you provided
earlier for the MS Cluster Server installation, but it does not have to be. The account that
you specify must be a Domain User on the same domain as MSCS uses, and must also
have Local Administrator privileges on all nodes of the cluster. You should use the same
account for all nodes. The Security setup will configure the OracleMSCSServices
service to be started and run as the user that you specify.

DCOM Security
In addition to configuring the service logon, the security setup will configure DCOM
access by calling the configuration tool and adding the local SYSTEM account to the
default access permissions list for Distributed COM security. You can view this by
running dcomcnfg at a command prompt and choosing Default Security and editing
Default Access Permissions. In earlier releases of Oracle Fail Safe, the default access
permissions were left untouched. This is normally empty, and thus the SYSTEM and
INTERACTIVE accounts are assumed to have privileges. However, some third-party
applications may add user accounts to the default access list, nullifying any default
permissions. If default permissions are modified, you may experience a hang when
running the Verify Cluster tool unless SYSTEM is explicitly added to the default access
permissions, so starting with the 3.2 release, the Oracle Services for MSCS Security
Setup has been modified to always add the SYSTEM account. See MetaLink Document
ID 155317.1 for more details on this problem.

Running the Security Setup Post Install


Should the need arise to change passwords after an install, or to update the security, the
Oracle Services for MSCS Security Setup can be run after the install by choosing Start |
Programs | Oracle – <OFS Homename> | Oracle Services for MSCS Security Setup. Any
post-installation changes that you make with this tool will not take effect until after the
OracleMSCSServices service is restarted.

Reboot Each Node Independently after Install


After Fail Safe has been installed on the first node, that node must be rebooted. Wait until the
reboot completes and the node has rejoined the cluster prior to beginning the install on
the second node. Then, repeat the preceding steps on each node of the cluster, rebooting
each node after the Fail Safe install completes.
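
The ordering constraint here (install, reboot, wait for the node to rejoin, and only then move to the next node) can be expressed as a simple loop. The Python sketch below shows only that control flow. The install, reboot, and has_rejoined callables are placeholders that you would have to supply; no Oracle or MSCS command is implied, and the polling defaults are arbitrary.

```python
import time


def rolling_install(nodes, install, reboot, has_rejoined,
                    poll_seconds=30, max_polls=40):
    """Install on one node at a time, waiting for it to rejoin before moving on."""
    for node in nodes:
        install(node)
        reboot(node)
        for _ in range(max_polls):
            if has_rejoined(node):
                break                  # safe to continue with the next node
            time.sleep(poll_seconds)
        else:
            # The node never came back within the allowed number of polls.
            raise TimeoutError(f"{node} did not rejoin the cluster")
```

The loop never touches a second node while the first is still down, which is the point of rebooting each node independently.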

Registry Keys Updated


The Oracle Fail Safe install will add a Registry key as a subkey of the normal Oracle key,
at HKLM\Software\Oracle\Fail Safe. In addition, an Oracle key is created under the
cluster key at HKLM\Cluster\Oracle. Last, once the Oracle Database and Oracle TNS
Listener resource types are registered, you will be able to view them under
HKLM\Cluster\Resource Types. If you ever need to remove Fail Safe from a cluster, you
should uninstall it if possible, so that the resource types are unregistered and removed
from the Registry. Uninstalling Cluster Server will remove the HKLM\Cluster key,
forcing you to reregister the Fail Safe resource types after you reinstall Fail Safe. This
can be accomplished by rerunning Verify Cluster, discussed in the next section.

Completing the Fail Safe Configuration


As noted previously, the install of Oracle Fail Safe creates a service called
OracleMSCSServices. This service is a resource that gets added to the Cluster Group,
which was created when you initially installed Microsoft Cluster Server. This is the only
Oracle resource that should be added to the Cluster Group, and the install will do this for
you. Though the service exists on each node, it will be actively running only on the node
that owns the Cluster Group. This is the process that Fail Safe Manager attaches to when
it is run, so failure of this service will lead to a failure when logging on to Fail Safe
Manager.

Logging in to Fail Safe Manager


Fail Safe Manager is the interface provided by Oracle to interact with the cluster. Fail
Safe Manager duplicates some of the things that you see in Cluster Administrator. It can
be used to monitor the location and ownership of resources, change dependencies and
failover policies, and so on, and it can be used to create new virtual groups. All of these
operations can be done through Cluster Administrator as well. However, Fail Safe
Manager must be used to add an Oracle database or other supported Oracle resources into
a Fail Safe group. In addition, Fail Safe Manager provides invaluable troubleshooting
tools to verify the cluster setup and resource configuration prior to adding resources to a
group, and to verify the integrity of a group after it has been created.
When logging in to Fail Safe Manager, you must provide an operating system account
that is a member of the cluster’s domain, and that also has local administrative privileges.
The Cluster name and Domain name are, of course, the same as specified when installing
the cluster:


Figure 8: Connect to Cluster Login

Fail Safe Manager can be installed on a client machine to allow you remote management
access to the cluster. Previous releases of Oracle Fail Safe required that the Fail Safe
Manager client be the same version as the Fail Safe Server running on the cluster.
However, beginning with the 3.2 release of OFS, the Fail Safe Manager can be used to
manage clusters running Fail Safe version 3.1.1 or later. Thus, in an environment with
multiple clusters, you do not have to upgrade all at once, nor do you need to sacrifice the
manageability of using Fail Safe Manager to manage multiple clusters. Simply ensure
that you have the latest version of Fail Safe Manager on your desktop, and it will work
with the 3.1.x clusters and 3.2.x clusters.

Running Verify Cluster


Run Fail Safe Manager by choosing Start | Programs | Oracle – <OFS Homename> | Oracle
Fail Safe Manager. The first time that it is run on a new cluster, you will be given the
choice to run
the Verify Cluster tool or exit. Verify Cluster is the first of the "Verify xxx" operations
provided by Fail Safe Manager to assist in configuration and assurance of the integrity of
the database. This tool must be run to register the Oracle Resource DLL and Oracle
Resource Types for use by the cluster. However, in addition to doing this, Verify Cluster
checks the cluster configuration to make sure that all of the networking components are
properly configured, and also to confirm that the Oracle install was done properly (i.e.,
the home names and products installed match on each node).

Heed Warnings in Verify Cluster


Because Verify Cluster must complete in order to register the Resource DLL, you will
not get an absolute failure message—you will almost always read that the operation
completed successfully. However, you may get warnings. You should save the output
from the clusterwide operation to a text file and check this file closely for any errors.
Some errors/warnings are only informative in nature, indicating that certain software
components are not installed. However, if you see errors indicating an IP address
mismatch, this is an indication that the binding order of your cards is incorrect, a
condition that may lead to name resolution problems and resource failures down the road.
Refer to the earlier section on cluster configuration to resolve these problems, and then
rerun the Verify Cluster operation. You should also pay close attention to any errors
reporting a mismatch in the names of the ORACLE_HOMEs on the respective nodes. If
you name the Fail Safe home or the database home incorrectly on one of the
nodes, you will need to reinstall in order to get Fail Safe to work properly. Once the
Verify Cluster operation completes, you should be able to see the Oracle Database and
Oracle TNS Listener resource types listed in Cluster Administrator.
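
Once you have saved the Verify Cluster output to a text file, a short script can pull out the lines that deserve a closer look. The Python sketch below is an assumption-laden illustration: the exact wording of the output is not described in this guide, so the keywords "error" and "warning" are guesses that you should adjust to match what your saved file actually contains.

```python
def flag_lines(report_text, keywords=("error", "warning")):
    """Return (line_number, line) pairs whose text contains any keyword.

    Matching is case-insensitive; the default keywords are assumptions.
    """
    hits = []
    for number, line in enumerate(report_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.strip()))
    return hits
```

You might read the saved file with open(path).read() and pass the text to flag_lines, then review each reported line by hand. The script only narrows the search; it does not decide which messages are informative and which indicate a real problem.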

Making the Database Fail Safe


Once Fail Safe has been successfully installed and the cluster setup has been verified, you
are now ready to create the Fail Safe group and add a database. Essentially, these are the
steps that you will follow:

- Create the database
- Verify the standalone database
- Create the Virtual Group
- Add the database to the group
In this section, we detail each of these steps.

Creating the Database


If you have not yet created the database, you can do so via the Database Configuration
Assistant or you can create a database manually. In addition, Oracle Fail Safe provides a
template for a sample database, which you can create through Fail Safe Manager itself.
To do this, choose Create Sample Database from the Resource menu in Fail Safe
Manager. However, this is meant more for demonstration purposes than as a template for
your production instance. So while you can use this to quickly create a database to show
the concept works, we recommend that you use the DBCA or your own scripts to create
the true database.
You should create the database on one node only, but be sure when creating the database
that all files associated with the database are on a shared drive. This includes control
files, log files, datafiles, and any local archive destinations that you define in the init.ora
(or SPFILE). While it is not required to have the background_dump_dest and the
user_dump_dest on shared drives, we strongly recommend it. Having an alert log that is
written to the private drive can lead to gaps in the log file if the group moves to another
node in the cluster. Move all drives where files will ultimately reside, so that they are all
owned by the same node, and create the database from that node.
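
Before you add the database to a group, you can check a list of its file locations against the drive letters of your shared disks. The Python sketch below is an illustration only; the function name, paths, and drive letters are hypothetical. It uses the standard ntpath module so that Windows paths are split correctly on any platform.

```python
import ntpath


def files_off_shared_drives(paths, shared_drives):
    """Return the paths whose drive letter is not one of the shared drives."""
    shared = {d.rstrip(":\\").upper() for d in shared_drives}
    off_shared = []
    for path in paths:
        drive = ntpath.splitdrive(path)[0].rstrip(":").upper()
        if drive not in shared:
            off_shared.append(path)
    return off_shared
```

For example, with shared drives S: and T:, a datafile under S:\oradata passes, while a dump destination under C:\oracle is reported. You would build the path list from your control file, log file, datafile, and archive destination settings.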

Placement of Parameter File


In addition to placement of trace files, you must also determine if you are going to have
the init file or spfile reside on the private drive or on the shared drive. Having the
parameter file on the shared drive will ease administration, since you do not have to be
concerned with maintaining multiple copies of init.ora on all nodes. However, this
reduces the flexibility to have differences in certain parameters, depending on which
node the database resides on. As a general rule, if you have an Active/Active
configuration, you may need to consider having different parameter files, placed on the
private drive of each node. With an Active/Passive scenario, you should put the
parameter file on the shared drive. In a three- or four-node cluster, you will have to
determine which nodes the database will reside on, and what resources would be
available to the database on each node in event of a failure. Place the parameter file
accordingly, depending on your needs and the available resources.
Note: If using an SPFILE, you will have to have a normal init file with the line
SPFILE=xxxx. You cannot pass the SPFILE directly to Fail Safe when
adding the database to a group.
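
A quick way to confirm that an init file is the kind of pointer file described in this note is to look for its SPFILE= line. The Python sketch below is a minimal illustration; the function name is our own, and it assumes that # starts a comment and that the path may optionally be quoted.

```python
import re

# Matches a line of the form SPFILE=<path>, case-insensitively.
SPFILE_LINE = re.compile(r"^\s*spfile\s*=\s*(.+?)\s*$", re.IGNORECASE)


def spfile_target(init_text):
    """Return the path named on the SPFILE= line of an init file, or None."""
    for line in init_text.splitlines():
        line = line.split("#", 1)[0]          # ignore trailing comments
        match = SPFILE_LINE.match(line)
        if match:
            return match.group(1).strip("'\"")
    return None
```

If spfile_target returns None for the file you intend to give to Fail Safe, the file is an ordinary parameter file rather than a pointer to an SPFILE.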


Verifying the Standalone Database Configuration


Once the database has been created, you should be able to discover it as a standalone
resource on the node on which it was created. Fail Safe Manager will list the Nodes in the
left-hand pane. Expand the node on which the database exists, and you will see a folder
for Groups on that node, and another folder for Standalone Resources. Under Standalone
Resources, you will see a message that Fail Safe is "Discovering Standalone Resources"
on the node, and then you should see a listing of Oracle resources on that machine that
are supported in a Fail Safe environment. An existing database will be discovered as a
resource on the node where it resides, provided there is a service for the instance on that
node (OracleService<sid>), and there is a valid TNSNAMES.ORA entry on the node,
which connects to the same SID name or SERVICE_NAME, using the HOST name or IP
address of the node:

Figure 9: Screenshot of Oracle Fail Safe Manager
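
The TNSNAMES.ORA requirement described above can be checked by hand before you open Fail Safe Manager. The following Python sketch is one rough way to do it, assuming a simple single-address entry; it is not how Fail Safe itself performs discovery, and the function name, node name, and SID are hypothetical.

```python
import re


def entry_matches_node(entry, hosts, sid):
    """Check that a tnsnames.ora entry points at this node and this SID.

    hosts: acceptable HOST values (the node's host name and/or IP address).
    """
    # Remove all whitespace so "HOST = node1" and "HOST=node1" compare equal.
    flat = re.sub(r"\s+", "", entry).upper()
    host_ok = any(f"(HOST={h.upper()})" in flat for h in hosts)
    sid_ok = (f"(SID={sid.upper()})" in flat
              or f"(SERVICE_NAME={sid.upper()})" in flat)
    return host_ok and sid_ok


sample_entry = """
PROD90 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
    (CONNECT_DATA = (SID = PROD90))
  )
"""
print(entry_matches_node(sample_entry, ["node1", "192.168.1.10"], "PROD90"))
```

A SERVICE_NAME that carries a domain suffix would need a looser comparison than the exact match used here.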

Once you identify your database, right-click it and choose Verify Standalone Database.
You will be prompted for the instance name, parameter file location, and whether you
want to connect using OS Authentication or you want to provide a password. If you
choose OS authentication, Fail Safe will create a local OS group called
ORA_<sidname>_DBA and add the accounts that were specified for the Cluster Service
and the OracleMSCSServices. This allows members to connect only to this particular
instance—Fail Safe will not automatically create the more generic ORA_DBA group, but
it will work if you manually add the accounts to this group instead of a group specific to
your SID.

Why Run Verify Standalone?


The Verify Standalone Database will check the configuration of the database and prepare
it to be added into a Fail Safe Group. It will check that all drives being used by the
database are shared drives. If the database is configured for Automatic Startup or
Shutdown, those features will be disabled, because once in the group, the Cluster Service
will be responsible for bringing the database offline and online. The Verify Standalone
operation will also check to ensure that the services for the instance exist on only one
node. At this point, since the database is still a standalone database, the services for the
instance should not yet exist on the second node—if they do, you will be prompted for
the correct node, and the services will be deleted from the other node(s).
In addition, the Verify Standalone Database operation will check the tnsnames.ora and
listener.ora files and ensure that they are configured correctly, in order to allow them to
be parsed by Fail Safe when it comes time to add the database to a group. This is critical,
because when the database is ultimately added to the group, these files must be
reconfigured on each node to account for the virtual server connect information. Failure
to parse these sqlnet configuration files is one of the most common reasons that an
operation to add the database to a group will fail, so running Verify Standalone Database
is an important step in ensuring these files are set up correctly and ready for the
impending Add to Group operation.
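
The shared-drive check is easy to approximate yourself before running the wizard. The sketch below is only an illustration of the idea, not Fail Safe's own logic; the function name, drive letters, and file paths are assumptions.

```python
import ntpath


def files_off_shared_drives(paths, shared_drives):
    """Return the database files that do not live on a shared cluster drive."""
    # Normalise "S:", "S:\\" and "S" to the bare drive letter.
    shared = {d.upper().rstrip(":\\") for d in shared_drives}
    offenders = []
    for path in paths:
        drive, _ = ntpath.splitdrive(path)  # e.g. "S:" from "S:\\oradata\\..."
        if drive.upper().rstrip(":") not in shared:
            offenders.append(path)
    return offenders


# Hypothetical file list; in practice it would come from the data dictionary.
db_files = [r"S:\oradata\PROD90\system01.dbf",
            r"C:\oracle\admin\PROD90\bdump\alert_PROD90.log"]
print(files_off_shared_drives(db_files, ["S:", "T:"]))
```

Any path the function returns is a file you would move to a shared drive before adding the database to a group.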

Creating a Group


We reiterate here that you cannot add the database into the Cluster Group; you must
create a separate group for the database, and you must have a host name and IP address
combination ready. Even though you can use MS Cluster Administrator to create the
group, we recommend that you create it through Fail Safe Manager, as it provides an
interface to add a hostname and IP address into the group. In Fail Safe Manager, right-
click the Groups folder and choose Create. You will be prompted for a name for the
group—this can be any name that you decide on; it need not match the hostname. Type in
the name and an optional description and choose Next:

Figure 10: Step 1 Creating a Group

Defining a Failback Policy and a Preferred Node


On Page 2 of the Create Group Wizard, you will be prompted to define a Failback Policy
for the group. If the group fails over to the other node, and the original node then comes
back online, do you want this group to fail back to the original node automatically? If so, how
quickly? Should it happen immediately, or should it happen only during specific hours? If
you choose the Prevent Failback option, then the group will not fail back automatically—
you will need to manually move the group back to the preferred node if so desired.

Figure 11: Step 2 Creating a Group

A Failback policy does not have any meaning if there is not a preferred node, because the
Failback is triggered when the preferred node rejoins the cluster. Accordingly, if you
choose to Failback Immediately, this Failback event will be triggered as soon as the
preferred node comes back online. Choosing Prevent Failback on Page 2 implies that
there is no preferred node, so you will not see Page 3 of the Create Group Wizard, which
is where the preferred node for the group is selected.
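
The interaction between the Failback Policy and the preferred node can be summarised as a small decision function. This Python sketch is an illustrative model only, not the MSCS implementation; the class and function names are hypothetical, and the treatment of the hour window (half-open, allowed to wrap past midnight) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FailbackPolicy:
    prevent: bool                              # the "Prevent Failback" option
    window: Optional[Tuple[int, int]] = None   # (start_hour, end_hour); None = immediately


def should_fail_back(policy, preferred_node, preferred_online, hour):
    """Decide whether the group moves back to its preferred node right now."""
    # No failback without a preferred node that has rejoined the cluster.
    if policy.prevent or preferred_node is None or not preferred_online:
        return False
    if policy.window is None:
        return True  # fail back immediately
    start, end = policy.window
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end  # window wraps past midnight


night_only = FailbackPolicy(prevent=False, window=(22, 4))
print(should_fail_back(night_only, "NODE1", True, hour=23))
```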

Adding Virtual Addresses to a Group


Once the group is created, you will be immediately prompted to add a virtual address to
the group. A virtual address is simply an IP address and network name combination that
will be assigned to the group that you have just created. Think of this process as being like
adding an entirely new server to your network. In order to bring up a new server on your
network, you must have an IP address and network name that are valid for your network,
and you must configure the server with that information. Adding a virtual address to the
group accomplishes the same thing for your virtual server, which is associated with your
newly created group; the wizard configures the group with that address, and then MSCS
is responsible for registering that address with the gateway and directing all network
communications to the appropriate owning node. This virtual address then becomes the
means by which your clients connect to the virtual server and communicate with the rest
of the resources that will ultimately be added to this group. As such, this network name
and IP address combination must be unique on your network, even among other virtual
addresses that already exist, and it must resolve successfully and be accessible by any
clients that wish to access the database.
Choose Yes in answer to the Add Virtual Address question, and the Add Resource
Wizard will be initialized. You will be prompted to select which network you want to add
the virtual address from. In most cases, you will be choosing the public network, which
allows your clients to access the network. Theoretically, though, if the only client is an
application tier, which runs on one of the other cluster nodes, you could select the private
cluster network.

Figure 12: Step 3 Creating a Group

The network name and address that you supply must be valid on one of the subnets tied
to a physical card. As an aside, it is possible to have multiple IP address and network
name combinations existing in a single group, and it is also possible to have these IPs be
on different subnets, to provide further redundancy and load balancing. However, a
virtual IP address must always be on the same subnet as at least one physical card within
the cluster. Thus, having two IP addresses in a group that are on different subnets would
require two different physical network cards, each with an IP address on the respective
subnets used by the virtual IP.
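
The subnet rule can be verified ahead of time with Python's standard ipaddress module. The sketch below assumes you know each physical card's address and prefix length; the function name and the addresses shown are hypothetical.

```python
import ipaddress


def on_a_physical_subnet(virtual_ip, physical_interfaces):
    """True if virtual_ip falls in the subnet of at least one physical card.

    physical_interfaces: card addresses in "address/prefix" form.
    """
    vip = ipaddress.ip_address(virtual_ip)
    return any(vip in ipaddress.ip_interface(card).network
               for card in physical_interfaces)


# Hypothetical public and private cards on one node.
cards = ["192.168.1.10/24", "10.0.0.1/8"]
print(on_a_physical_subnet("192.168.1.50", cards))
print(on_a_physical_subnet("172.16.0.5", cards))
```

A virtual address for which this returns False has no physical card on its subnet and should not be used for the group.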

Choose the appropriate network for the initial virtual IP, and then put in the host name
that you have predefined in DNS or your hosts file. If this is set up correctly, the IP
address should be filled in automatically. If not, you will get an error indicating that the
host name does not resolve to an IP address. Another common error here is to put in the
existing host name of the Cluster Group. If you do so, this will fail with an FS-11221
error, indicating that this network name is already in use. Duplicate network names, of
course, are not allowed. The group will still be created, but it will not have a virtual
address assigned. You must then go back to Fail Safe Manager, right-click the empty
group, and choose Add Resource to Group.... The Add Resource Wizard will be initiated
again, and you can choose Virtual Address from the list of available Resource Types, this
time selecting a new network name and IP address combination not currently in use
anywhere on your network.
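
Both mistakes described above (a host name that does not resolve, and a network name that is already taken) can be caught before you start the wizard. The following Python sketch is a rough pre-check under stated assumptions: the function name and messages are hypothetical, and the list of names in use is something you supply yourself rather than something read from the cluster.

```python
import socket


def check_virtual_host(name, names_in_use, resolver=socket.gethostbyname):
    """Return (ip, problem); problem is None when the name looks usable."""
    if name.lower() in {n.lower() for n in names_in_use}:
        return None, "network name is already in use"
    try:
        # Uses DNS or the hosts file, whichever the operating system consults.
        return resolver(name), None
    except OSError:
        return None, "host name does not resolve to an IP address"


# Example: "cluster1" stands in for the existing Cluster Group network name.
print(check_virtual_host("cluster1", ["cluster1"]))
```

The resolver argument exists only so the check can be exercised without a live network; by default it performs a normal lookup.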

Adding the Database to a Group


Once you have completed the steps of successfully verifying the cluster setup, creating
and verifying a standalone database, and creating a group with a virtual IP address and
host name combo, you are ready to add your database into the group. You can do this in a
couple of ways—by right-clicking the database itself, under Standalone Resources on the
given node, or by right-clicking the newly created group, choosing Add Resource to
Group... and then selecting Oracle Database for the Resource Type. However you start the
process, the steps will be the same—be sure the appropriate Resource Type (Oracle
Database) and group name are highlighted on the first page of the Add Resource to
Group Wizard:

Figure 13: Step 1 Add a Resource Group

Once you have verified this information, continue on to the next screen. Here, you will
define the network service name, the instance name, the database name (as defined by
DB_NAME in the init file), and the location of the parameter file that you wish to use.

Figure 14: Step 2 Add a Resource Group

The next page is the Database Authentication page. If you previously ran the Verify
Standalone Database procedure and specified that you wanted to use OS authentication at
that time, then it is assumed that you are doing so again when the database is actually
added to the group. If you have not run Verify Standalone previously, or if you chose to
use the SYS account for authentication, then you will be asked again. (Internal is still
offered as an option for backward compatibility, because the 3.2 release of Fail Safe
Manager will support Oracle8i and Oracle 8.0 databases.) If you choose OS
authentication here, again, an OS group called ORA_<sidname>_DBA will be created,
and the logon accounts for both the Cluster Service and the OracleMSCSServices will be
added to this group. If you had done this during the Verify Standalone operation this
group will already exist.
Next, you will still be asked if you want to maintain a password file on all nodes of the
cluster. This is recommended if you want to allow access via the password file, but you
do not want to add certain OS users to the ORA_DBA group. (Refer to Chapter 4 for
more information on using a password file.) The key thing to realize here is that if you do
not use OS authentication, then you must ensure that any changes to the password file are
propagated to all nodes in the cluster. The polling that is done by the Cluster Service uses
this information to connect, and if the password is wrong on one of the nodes, the polling
may fail, or the database may not be able to come online at all.

Behind the Scenes When Adding a DB to a Group


Once you have answered the questions on database configuration and authentication, the
process to add the database to the group will begin. The service for your instance (e.g.,
OracleServicePROD90) will be set to manual start, if it is not already, and a second
listener will be added to listener.ora. The listener name will be FSLxxxx, where xxxx is
the virtual host name associated with the group. This will cause a second listener service
to be created on the current node, which will be set to manual start also. In addition, the
tnsnames.ora file will be updated to reflect the virtual host information for the group.
Once these changes are made, the entire group will be brought offline and moved to the
other node(s) defined as possible owners. Fail Safe will create a service for the instance
(OracleServicePROD90) and configure the tnsnames.ora and listener.ora files on the
subsequent node; it will then actually bring the database online on that node, to confirm
that all is configured correctly. Once this is done, the group will be returned to reside on
the preferred node, or it will go back to the original node if a preferred node is not
defined for the group. When this operation is complete, the database is now running in a
Fail Safe environment.

Figure 15: Adding Resources
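
The naming conventions mentioned above can be written down as a small helper, which is handy when scanning listener.ora or the Windows services list after the operation completes. This is only a sketch of the two patterns stated in the text; the function name and the virtual host name fsvirt1 are hypothetical.

```python
def fail_safe_names(sid, virtual_host):
    """Return the names described in the text for a given SID and virtual host."""
    return {
        # Windows service for the instance, set to manual start.
        "instance_service": f"OracleService{sid}",
        # Second listener added to listener.ora: FSL plus the virtual host name.
        "listener": f"FSL{virtual_host}",
    }


print(fail_safe_names("PROD90", "fsvirt1"))
```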

Behind the Scenes with a Fail Safe Database


Once a database has been made Fail Safe, we can begin to explore some of the resource
properties to determine just exactly what is going on. Expand the group in Fail Safe
Manager and select the recently added database. On the right, choose the Policies tab.
The Looks Alive interval is the shorter period of time; this is the interval at which the
service for the instance is checked, to ensure that it is still running. The “Is Alive”
interval is a more thorough check. By default, every 60 seconds a login to the database is
completed and a query is run. These checks are actually performed by the Microsoft
Cluster Service, using information provided to it by the Oracle Database Resource DLL.
The Cluster Service will actually log on to the database using a sqlnet connect string. If
the logon fails, it is directed to retry using a local bequeath connection. Once connected,
the following query is run:

Select NAME from TS$ where TS$.NAME='SYSTEM';


This is just a basic check to verify that the database is running. Should the connect
attempt fail, or the query fail, then an error is logged in the Application Log in the
Windows 2000 Event Viewer. An internal retry is executed three more times before the
resource is officially considered to have failed. These retries after an error are normally
executed within 15 seconds or less—this interval is internal and not configurable.
If four attempts to log on and run the query have failed, then the Restart Policy
defined for the database will kick in. By default, Fail Safe
will attempt to stop and then restart the database on the same node. If the restart fails
three times, then a failover to another node is initiated because the defined Failover
Policy has determined that if this resource fails, the entire group should be affected. If
the box for this Failover Policy setting is not checked, then once the resource has failed
to restart the specified number of times, it will be marked as Failed and will be left alone.
Note: If you are forced to run both production and test databases in the same group, due
to a lack of disk resources or other limitations, you may want to consider removing the
check from this box for your test database, so that a failure of a test instance will not
affect the entire group.
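
The sequence of checks, retries, restarts, and failover described in this section can be modelled in a few lines. The Python sketch below is an illustrative simulation only, not the Cluster Service's code; the function name, the callables, and the returned labels are all hypothetical, and the defaults simply mirror the numbers given in the text.

```python
def handle_is_alive_failures(check, restart, affects_group=True,
                             attempts=4, restart_limit=3):
    """Model one Is Alive poll and what follows if it keeps failing.

    check()   -> True when the logon and query succeed
    restart() -> True when a stop and restart on the same node succeeds
    """
    # One poll plus three internal retries: four attempts in total.
    if any(check() for _ in range(attempts)):
        return "online"
    # Restart Policy: try to restart the database on the same node.
    for _ in range(restart_limit):
        if restart():
            return "restarted"
    # Restarts exhausted: fail over only if the resource affects the group.
    return "failover" if affects_group else "failed"


# A test database whose failure should not move the whole group.
print(handle_is_alive_failures(lambda: False, lambda: False, affects_group=False))
```

Setting affects_group to False corresponds to clearing the box for a test database, as suggested in the Note above.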
