Vous êtes sur la page 1sur 52

Oracle 10g RAC load-balancing configuration

This is a discussion on Oracle 10g RAC load-balancing configuration within the Database
Discussions forums in Database and Unix Discussions category; Hi, I am new to RAC
administration. I have been trying to configure load-balancing across two nodes in a cluster.
192.168.2.5, 192.168.2.6 are public IP addresses of the nodes and 192.168.2.9, 192.168.2.10
are the VIP addresses. My configuration files look like this:
-------------------- SQLNET.ORA ---------------------------
TRACE_LEVEL_CLIENT = OFF
names.directory_path = (TNSNAMES)
names.default_domain = world
name.default_zone = world
break_poll_skip=2000000000
----------------------------------------------------------------------------
--------------------- TNSNAMES.ORA --------------------------

TPCC.world =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
)
)

TPCC1.world =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
)
)

TPCC2.world =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
)
)

EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
(CONNECT_DATA =
(SID=PLSExtProc)(PRESENTATION=RO)))

LISTENERS_TPCC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
----------------------------------------------------------------------------

--------------------- LISTENER.ORA --------------------------


LISTENER_TPCC1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.5)(PORT = 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
)
)

LISTENER_TPCC2 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.6)(PORT = 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
)
)

----------------------------------------------------------------------------
INIT.ORA ---------------------------
BACKGROUND_DUMP_DEST =
/home/oracle/app/oracle/OraHome_1/rdbms/log
USER_DUMP_DEST =
/home/oracle/app/oracle/OraHome_1/rdbms/log
control_files = ?/dbs/tpcc_disks/control_001
parallel_max_servers = 48
DB_BLOCK_SIZE = 2048
recovery_parallelism = 20
compatible = 10.1.0.2
db_name = tpcc
db_files = 200
db_file_multiblock_read_count = 16
pga_aggregate_target = 800m
dml_locks = 500
log_checkpoint_interval = 100000000
log_checkpoints_to_alert = TRUE
open_cursors = 400
processes = 200
sessions = 600
transactions = 400
transactions_per_rollback_segment = 1
cursor_sharing = force
cursor_space_for_time = TRUE
timed_statistics = TRUE
disk_asynch_io = TRUE
db_writer_processes = 5
hpux_sched_noage = 178

db_cache_size = 2700m
java_pool_size = 0
large_pool_size = 300m
shared_pool_size = 1000m

log_buffer = 1024000
db_recycle_cache_size = 16m
db_cache_advice = ON
plsql_optimize_level = 2
undo_management = AUTO

#####cluster parameter #######################


cluster_database = true
cluster_database_instances = 2
tpcc1.thread = 1
tpcc1.instance_name = tpcc1
tpcc1.instance_number = 1
tpcc1.undo_tablespace = UNDOTBS1
tpcc2.thread = 2
tpcc2.instance_name = tpcc2
tpcc2.instance_number = 2
tpcc2.undo_tablespace = UNDOTBS2
tpcc1.local_listener = 'LISTENER_TPCC1'
tpcc2.local_listener = 'LISTENER_TPCC2'
remote_listener = 'LISTENERS_TPCC'
#############################################

-------------------------------------------------------------------
On both the nodes I have started the listeners LISTENER_TPCC1 and
LISTENER_TPCC2 respectively and their status looks similar to this:

-------------------------------------------------------------------
> lsnrctl start LISTENER_TPCC1

LSNRCTL for HPUX: Version 10.1.0.2.0 - Production on 25-APR-2006


20:16:12

Copyright (c) 1991, 2004, Oracle. All rights reserved.

Starting /home/oracle/app/oracle/OraHome_1/bin/tnslsnr: please wait...

TNSLSNR for HPUX: Version 10.1.0.2.0 - Production


System parameter file is
/home/oracle/app/oracle/OraHome_1/bench/bench1000/benchrun/network/listener
.ora
Log messages written to
/home/oracle/app/oracle/OraHome_1/network/log/listener_tpcc1.log
Listening on:
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168. 2.9)(PORT=1521)))
Listening on:
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168. 2.5)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)) )

Connecting to
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.
2.9)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias LISTENER_TPCC1
Version TNSLSNR for HPUX: Version 10.1.0.2.0 -
Production
Start Date 25-APR-2006 20:16:12
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File
/home/oracle/app/oracle/OraHome_1/bench/bench1000/benchrun/network/listener
.ora
Listener Log File
/home/oracle/app/oracle/OraHome_1/network/log/listener_tpcc1.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168. 2.9)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168. 2.5)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)) )
The listener supports no services
The command completed successfully
-------------------------------------------------------------------
Everytime I try to start the database instances I get the following
error:

SQL> startup pfile=init.ora


ORA-00119: invalid specification for system parameter REMOTE_LISTENER
ORA-00132: syntax error or unresolved network name 'LISTENERS_TPCC'
ORA-01078: failure in processing system parameters

Can somebody please help me out in pointing out what am I doing wrong?

Thanks a lot,
Sudhanshu

SOLUTION

Thanks for your time guys. I tried quite a few things in the pfile -
single quotes, without quotes, double quotes - finally I just took the
entry from the tnsnames.ora and pasted it there in the pfile itself. So
the entries in the new pfile look like this:

tpcc1.local_listener = '(ADDRESS =
(PROTOCOL=TCP)(HOST=192.168.2.9)(PORT=1521))'
tpcc2.local_listener = '(ADDRESS =
(PROTOCOL=TCP)(HOST=192.168.2.10)(PORT=1521))'
remote_listener = '(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST =
192.168.2.9) (PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST =
192.168.2.10)(PORT = 1521)))'

After this the database came up without problems. Still connection load
balancing was not working. So I figured it had to do something with
tnsnames.ora. I rewrote the tnsnames.ora file as:

TPCC.WORLD =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
)
(CONNECT_DATA = (SERVICE_NAME = tpcc))
)

TPCC1.WORLD =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
(INSTANCENAME = tpcc1)
(SERVER = DEDICATED)
)
)

TPCC2.WORLD =
(DESCRIPTION =
(load_balance = on)
(failover = on)
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.10)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = tpcc)
(INSTANCENAME = tpcc2)
(SERVER = DEDICATED)
)

---------------------------------------------------------------------------

Restarting the listeners the new status was like this:

---------------------------------------------------------------------------------------------
--------------------------
> lsnrctl services LISTENER_TPCC1

LSNRCTL for HPUX: Version 10.1.0.2.0 - Production on 27-APR-2006


11:54:26

Copyright (c) 1991, 2004, Oracle. All rights reserved.

Connecting to (DESCRIPTION=(address=(protocol=ipc)(key=extproc)) )
Services Summary...
Service "plsextproc" has 1 instance(s).
Instance "plsextproc", status UNKNOWN, has 1 handler(s) for this
service...
Handler(s):
"DEDICATED" established:0 refused:0
LOCAL SERVER
Service "tpcc" has 2 instance(s).
Instance "tpcc1", status READY, has 2 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.9)(PORT=152 1))
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Instance "tpcc2", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.2.10)(PORT=15 21))
The command completed successfully

--------------------------------------------------------------------------------------------
After checking several connections using:
sqlplus "tpcc-at-tpcc/tpcc"

I could see the connections were being load balanced properly between
the two nodes.
Thanks a lot for your time guys! Seems the problem had to do with
entries in tnsnames.ora and pfile.

- Sudhanshu
RAC Networking,
Failover, and Load
Balancing
Setting up failover, load
balancing, TAF and FCF

Objectives
At the end of this module the
student will understand the
following tasks and concepts.
• Understand how Oracle Net
interacts with RAC.
• Understand how to configure
tnsnames for RAC with
Transparent Application Failover
(TAF)
• Understand how to configure
JDBC connections for RAC
• Understand how to configure
Fast Connection Failover as an
alternative to RAC

Overview
• The Listener
• Configuration Files
• Oracle Net and RAC
• Connect-Time Failover
• TAF Failover
• Load Balancing
• JDBC Configuration for Failover
and Load Balancing
• Fast Connection Failover

Listener
Client
Node 1

Node 2
Database

Listener
• Resides on the server
• Listens for incoming client
connection requests
• Manages the traffic to the
server; when a client requests a
network session with a server,
the listener actually receives the
request and brokers the client
request
• If the client's information
matches the listener's
information, then the listener
grants a connection to the
server.

listener.ora
• Configuration file for the listener
process. Identifies:
o The listener name

o Protocol addresses that it will

accept connection requests


on
o Services for which it is

listening
• Typically resides in the
%ORACLE_HOME%\network\adm
in directory on Windows NT.

tnsnames.ora
• Configuration file that
determines how you connect to
the database.
o Connect to specific instance

o Load Balancing

o Connect Time Failover

o TAF Client Failover


• Must be present on server and
“thick” clients (ODBC and JDBC
“thick” OCI driver)

Oracle Net and RAC


• Oracle Net
understands RAC
• Is used to set up the
cluster behavior
• Supports multiple
modes
o Connect-Time Failover
o Load Balancing
o TAF

Connect-Time Failover
• One service
o Tries one node

o then the other node

• Used for failover


• Balance is done manually
• In the event of a failure
o Transaction is lost

o Session is lost

o Reconnection occurs

automatically

Connect-Time Failover
• TESTA1 =
• (DESCRIPTION=
• (ADDRESS_LIST=
• (LOAD_BALANCE=OFF)
• (FAILOVER=ON)
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb1)(PORT=1521))
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb2)(PORT=1521)))
• (CONNECT_DATA=
• (SERVICE_NAME=TESTA)
• )
• )

• TESTA2 =
• (DESCRIPTION=
• (ADDRESS_LIST=
• (LOAD_BALANCE=OFF)
• (FAILOVER=ON)
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb2)(PORT=1521))
• (ADDRESS=(PROTOCOL=TCP)(HOST=oradb1)(PORT=1521)))
• (CONNECT_DATA=
• (SERVICE_NAME=TESTA)
• )
• )

Failover - Backup
• Pairs of services
o The first service defines the

second as its backup


o The second service defines

the first service as its backup


• Used for RAC failover
• Balance is done manually
• In the event of a failure
o Transaction is lost

o Session is maintained

o Queries are restarted

o Reconnection occurs

automatically
Failover – Backup
• TESTA1 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA1)
• (FAILOVER_MODE =
• (BACKUP = TESTA2)
• (TYPE = SELECT)
• (METHOD = BASIC)
• (RETRIES = 120)
• (DELAY = 5))))

• TESTA2 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA2)
• (FAILOVER_MODE =
• (BACKUP = TESTA1)
• (TYPE = SELECT)
• (METHOD = BASIC)
• (RETRIES = 120)
• (DELAY = 5))))

Failover - Preconnect
• Pairs of services
o The first service defines the second as
its backup
o The second service defines the first
service as its backup
o Both services preconnect to their
backup when the primary connections
are made
• Used for RAC failover
• Balance is done manually
• In the event of a failure
o Transaction is lost
o Session is maintained
o Queries are restarted
o Connection is quickly switched over to
the preconnected backup

Failover - Preconnect
• TESTA1 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1-vip)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA1)
• (FAILOVER_MODE =
• (BACKUP = TESTA2)
• (TYPE = SELECT)
• (METHOD = PRECONNECT)
• (RETRIES = 120)
• (DELAY = 5))))

• TESTA2 =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (LOAD_BALANCE = off)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2-vip)(PORT = 1523))
• (CONNECT_DATA =
• (SERVICE_NAME = TESTA)
• (INSTANCE_NAME = TESTA2)
• (FAILOVER_MODE =
• (BACKUP = TESTA1)
• (TYPE = SELECT)
• (METHOD = PRECONNECT)
• (RETRIES = 120)
• (DELAY = 5))))

Load Balancing
• One service
o Oracle Net determines which

node you use


• Tries to balance the load
• Only active nodes are used
• Balance is somewhat random
• In the event of a failure
o Transaction is lost
o Connection is lost

Load Balancing
• TESTA =
• (DESCRIPTION =
• (ADDRESS_LIST =
• (LOAD_BALANCE = on)
• (FAILOVER = off)
• (ADDRESS = (PROTOCOL =
TCP)(HOST = oradb1-vip)(PORT
= 1523))
• (ADDRESS = (PROTOCOL =
TCP)(HOST = oradb2-vip)(PORT
= 1523))
• )
• (CONNECT_DATA =
• (SERVER = DEDICATED)
• (SERVICE_NAME = TESTA)
• )
• )
Transparent Application
Failover
• One service
o Multiple sessions are created

• Only active nodes are used


• In the event of a failure
o available node is used

o No loss of connection

o Transaction is lost

o Query is restarted

Transparent Application
Failover
• TESTA =
• (DESCRIPTION =
• (ENABLE = BROKEN)
• (ADDRESS_LIST =
• (LOAD_BALANCE = on)
• (FAILOVER = on)
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1-
vip)(PORT = 1523))
• (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2-
vip)(PORT = 1523))
• )
• (CONNECT_DATA =
• (SERVER = DEDICATED)
• (SERVICE_NAME = TESTA)
• (FAILOVER_MODE =
• (TYPE = SELECT)
• (METHOD = BASIC)
• (RETRIES = 120)
• (DELAY = 5)
• )
• )
• )

Characteristics of TAF
• TAF protects:
o Client/server connection

o User session state

o OCI programs

o Java thick drivers (OCI

drivers)
o Sql*plus

o Active cursors (select

statements) that have begun


to return results
• Not failed over:
o Active update transactions
o PL/SQL procedure states
o PL/SQL server-side package
variables
o alter session statements
o Server side program
variables
o Uncommitted transactions

Network Timeouts and


TAF
• When a “planned” instance
shutdown occurs, failover occurs
quickly
o Shutdown instance with

srvctl or OEM
o Failover takes a few seconds

• When an “unplanned” instance


failure occurs, failover timing is
sensitive to network timeout
parameters
o Failover cannot occur until

the network connection is


released
o By default, it may take up to

eight minutes for the network


connection to be released
and for failover to occur
• Server-side and client-side
operating system level TCP/IP
timeout parameters may be
shortened to speed up failover
o Failover can occur as quickly

as one minute
o Can increase risk of disrupted

network connections and


unnecessary failover events
Example Windows
TCP/IP Timeout
Parameters for TAF
• The following steps may be
performed on Windows
database servers, mid-tier
servers, and clients to speed up
failover
o Open the registry hive under

HKEY_LOCAL_MACHINE :
SYSTEM CurrentControlSet :
Services : Tcpip : Parameters
o Add KeepAliveTime :

REG_DWORD to the registry


Note: The correct case is
required for KeepAlive
o Set the value to a reasonable

timeout. Eg: (120000 = 2


minutes, 60000 = 1 minute)
JDBC Connections and
TAF
• The Oracle JDBC thick driver
supports TAF
o tnsnames,ora file must be

defined on the server-side


o JDBC Data Source file must

be defined on the client-side


(or mid-tier)
o TAF failover and load

balancing are supported


• The Oracle JDBC thin driver does
not fully support TAF
o tnsnames,ora file must be

defined on the server-side


o JDBC Data Source file must

be defined on the client-side


(or mid-tier)
o Load balancing is supported,

but not TAF failover


JDBC Thick Driver
Example
• <data-source
• class="com.evermind.sql.DriverManagerDataSource"
• name="OracleDS"
• location="jdbc/OracleCoreDS"
• xa-location="jdbc/xa/OracleXADS"
• ejb-location="jdbc/OracleDS"
• connection-driver="oracle.jdbc.driver.OracleDriver"
• username="scott"
• password="tiger"
• url="jdbc:oracle:oci8:@(description=(load_balance=on)(f
ailover=on)
• (address=(protocol=tcp)(host=db-node1)(port=1521))
• (address=(protocol=tcp)(host=db-node2)(port=1521))
• (address=(protocol=tcp)(host=db-node3)(port=1521))
• (address=(protocol=tcp)(host=db-node4)(port=1521))
• (connect_data=
• (service_name=sales.us.oracle.com)
• (failover_mode=(type=select)(method=basic)(retries=20)
(delay=15))))"
• rac-enabled="true"
• inactivity-timeout="300"
• connection-retry-interval="2"
• max-connect-attempts="60"
• max-connections="60"
• min-connections="12"
• />

JDBC Thin Driver


Example
o <data-source
o class="com.evermind.sql.DriverManag
erDataSource"
o name="OracleDS"
o location="jdbc/OracleCoreDS"
o xa-location="jdbc/xa/OracleXADS"
o ejb-location="jdbc/OracleDS"
o connection-
driver="oracle.jdbc.driver.OracleDriver
"
o username="scott"
o password="tiger"
o url="jdbc:oracle:thin:@(DESCRIPTION=
o (LOAD_BALANCE=on)
o (ADDRESS=(PROTOCOL=TCP)
(HOST=host1) (PORT=1521))
o (ADDRESS=(PROTOCOL=TCP)
(HOST=host2) (PORT=1521))
o (CONNECT_DATA=(SERVICE_NAME=ser
vice_name)))"
o inactivity-timeout="300"
o connection-retry-interval="2"
o max-connect-attempts="60"
o max-connections="60"
o min-connections="12"
o />
Fast Application
Notification
• ONS and Fast
Application
Notification
o The Oracle Notification
Service provides a framework
for passing event messages
about failover between nodes
o ONS may notify both RAC
nodes and client or mid-tier
nodes
o Fast Application Notification
provides the basis for Fast
Connection Failover

Fast Connection
Failover
• Fast Connection Failover
provides a way to control
failover response
o Effective for Oracle 10g

Application Server JDBC


connections
o More mid-tier applications

may be supported in the


future
o Works with JDBC Thick and

JDBC Thin drivers


o Provides an alternative to TAF

o Avoids network timeout

sensitivity
o Fast and efficient

Pre-requisites for Fast


Connection Failover
• The JDBC Implicit Connection
Cache is enabled.
• The application uses service
names to connect to the
database, not service ids.
• The database should be at least
release 10.1.0.2 (10.1.0.3 is
preferred).
• ONS is configured and running
on the node where JDBC is
running.
• The JVM in which the JDBC
instance is running must have
oracle.ons.oraclehome set to
point to the ORACLE_HOME
where the ONS files were
installed.

Setting Up the Implicit


Connection Cache for
FCF
• The following JDBC Data Source file sets up
the Implicit Connection Cache for FCF on
the Application Tier:

• OracleDataSource ods = new


OracleDataSource()
• ...
• ods.setUser(“Scott”);
• ods.setPassword(“tiger”);
• ods.setConnectionCachingEnabled(Tru
e);
• ods.setFastConnectionFailoverEnabled
(True);
• ods.setConnectionCacheName(“MyCache”);
• ods.setConnectionCacheProperties(cp);
• ods.setURL("jdbc:oracle:thin:@(DESCRIPTIO
N=
• (LOAD_BALANCE=on)
• (ADDRESS=(PROTOCOL=TCP)(HOST=VIP1)
(PORT=1521))
• (ADDRESS=(PROTOCOL=TCP)(HOST=VIP2)
(PORT=1521))
• (CONNECT_DATA=(SERVICE_NAME=service
_name)))");
ONS on the
Applications Tier
• The Oracle
Notification Service
must be installed and
running on the
Applications Tier
nodes
o The ONS client is available
with the 10.1.0.3 client CD
o For the 10.1.0.2 ONS client,
contact Oracle Support
• Run “onsctl ping” on
each application node to
determine that ONS is
running
ONS on the
Applications Tier
• Edit the file
%ORACLE_HOME%\opmn\conf\o
ns.config on each mid-tier node:
o localport=6100 # This is the

port ONS is writing to on this


node
o remoteport=6200 # This is

the port ONS is listening on


this node
o loglevel=3

o # This is the list of hosts and

ports ONS is posting to.


o # Include RAC and client
nodes.
o nodes=RAC1.mycompany.co

m:6200,RAC2.mycompany.co
m:6200,
o RAC3.mycompany.com:6200,
APPS1.mycompany.com:6200
,
o APPS2.mycompany.com:6200

ONS on the
Applications Tier
• When starting the
application:
o Specify the system property

Doracle.ons.iraclehome=<locatio
n of ons on client>
o Include the ons.jar file in the
application CLASSPATH
Modifying Server-Side
ONS for FCF
• Use the
%ORA_CRS_HOME%\bin\racgons
command to modify the server-
side ONS configuration to be
aware of the mid-tier nodes:
o racgons add_config

APPS1.mycompany.com:6200
APPS2.mycompany.com:6200
o This adds the host and port

names to the OCR

Differences between
FCF and TAF
• Application-Level
Connection
Retries.
o Fast Connection Failover
supports application-level
connection retries. This
gives the application control
of responding to connection
failovers.
o TAF supports connection
retries only at the OCI/Net
layer.

Differences between
FCF and TAF
• Integration with
the Connection
Cache
o Fast Connection Failover is
well-integrated with the
Implicit Connection Cache,
which allows the connection
cache manager to manage
the cache for HA.
o Failed connections are
automatically invalidated in
the cache.
o TAF works at the network
level on a per-connection
basis, which means that the
connection cache cannot be
notified of failures.
Differences between
FCF and TAF
• Event-Based
o Fast Connection Failover is
based on the RAC event
mechanism. This means that
Fast Connection Failover is
efficient and detects failures
quickly for both active and
inactive connections.
o TAF is based on the network
call mechanism.

Differences between
FCF and TAF
• Load-Balancing
Support
o Fast Connection Failover
supports UP event load
balancing of connections and
runtime work request
distribution across active
RAC instances.

Review
• What are the two key
configuration files for Oracle
Net?
• What key phrase in the
tnsnames.ora file influences
load balance behaviour?
• What key phrase in the
tnsnames.ora file influences
failover behaviour?
• Which Oracle Net configurations
allow users to maintain
connections through a failover
event?
• Which JDBC drivers are
supported under TAF and FCF?
• What is the name of the utility
to add mid-tier nodes to the
server-side ONS configuration?

Summary
• Listener process
• Configuration files
o listener.ora

o tnsnames.ora

• Failover
o Basic

o Preconnect

• Load Balancing
• TAF
• JDBC Thick and Thin drivers and
TAF
• Setting up Fast Connection
Failover
The listener is a process that resides on the server whose responsibility is to
listen for incoming client connection requests and manage the traffic to the
server.

The listener brokers the client request, handing off the request to the server.
Every time a client (or server acting as a client) requests a network session with
a server, a listener receives the actual request. If the client's information matches
the listener's information, then the listener grants a connection to the server.

Oracle databases do not require identification of the database service in


listener.ora because of service registration. However, static service configuration
is required for an Oracle databases if you plan to use a listener.

The tnsnames.ora (or the equivalent) must be used if you want to use
Transparent Application Failover. There must be a copy located on each RAC
server. In addition, key components of the server-side tnsnames.ora file must be
present in a client-side tnsnames.ora file. However, for JDBC connections, the
tnsnames.ora content is often contained in a long connection string in the JDBC
configuration file on the client. Note that the JDBC driver for TAF must be the
Oracle supported “thick” OCI driver.

Connect-Time Failover

By setting up multiple network paths, OCI will try each of the paths in the connection list. In this
manner you can set up multiple nodes in the cluster, but at the same time, specifying which node
is the primary connection. This can be done by putting the following entry into the tnsnames.ora
file.

prod1.us.company.com =
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=OFF)
(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=prod2)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=prod1.us.company.com)))

prod2.us.company.com =
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=OFF)
(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=prod2)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=prod1)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=prod2.us.company.com)))

In this example, there are two different service names. The first uses prod1 as the primary and
prod2 as the failover, the second uses prod2 as the primary and prod1 as the failover. This
allows you to do your own manual load balancing, including failover to the designated node. The
connection is re-established during failover, but session state is lost for all transactions and
queries.

This type of failover pre-dates RAC, and can even be used with some Oracle 8i Net connections.
TAF failover is preferred with RAC.

Failover – Backup Method

This failover method uses only one address per service name. However, each service explicitly identifies another service as its
BACKUP service. If the primary service fails, the connection will be switched to the backup service. This can be done by putting
the following entry into the tnsnames.ora file.

prod1.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)

(INSTANCE_NAME = prod1.us.company.com)

(FAILOVER_MODE =
(BACKUP = prod2.us.company.com)

(TYPE = select)
(METHOD = basic))))

prod2.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)

(INSTANCE_NAME = prod2.us.company.com)

(FAILOVER_MODE =
(BACKUP = prod1.us.company.com)

(TYPE = select)
(METHOD = basic))))
In this example, there are two different service names. The first uses prod1 as the primary and prod2 as the failover, the second
uses prod2 as the primary and prod1 as the failover. Note the extended syntax under “FAILOVER_MODE=” as compared to
connect-time failover. This syntax is RAC specific, and allows the session state to be maintained through failover for queries
(transactions are rolled back). The RETRIES and DELAY parameters allow the connection to be retried if the first attempt is not
successful.

Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.

Failover - Preconnect

Note that setting the “METHOD = preconnect” allows the backup connection to be pre-
allocated. This can aid failover speed, at the cost of twice as many concurrent connections.

prod1.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)

(INSTANCE_NAME = prod1.us.company.com)

(FAILOVER_MODE =
(BACKUP = prod2.us.company.com)

(TYPE = select)
(METHOD = preconnect))))

prod2.us.company.com =
(DESCRIPTION =
(LOAD_BALANCE = off)
(FAILOVER = on)
(ADDRESS = (PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = prod.us.company.com)

(INSTANCE_NAME = prod2.us.company.com)

(FAILOVER_MODE =
(BACKUP = prod1.us.company.com)

(TYPE = select)
(METHOD = preconnect))))
Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.

Load Balancing

Load balancing is similar to Connect-time Failover in that one service is defined


with multiple addresses. In this case, automatic load balancing is turned on
while failover is turned off. This can be done by putting the following entry into
the tnsnames.ora file.

prod.us.company.com =
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=prod1)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=prod2)(PORT=1521)))
(CONNECT_DATA=
(SERVICE_NAME=prod.us.company.com)))

Here either prod1 or prod2 will be used. The system used is determined at
connection time. Load balancing is commonly used with RAC, both in the
server-side tnsnames.ora file and the client-side tnsnames.ora file.

Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.

Transparent Application Failover (TAF)

The TAF mechanism provides some fault tolerance in the event of a node failure. It does this by
establishing a connection to multiple nodes in the cluster. TAF will automatically resend some
transactions, and will not result in the loss of connection to the user in the event of a failure. TAF
is configured by specifying both FAILOVER and LOAD_BALANCE as shown here:

prod.us.company.com=
(DESCRIPTION=
(LOAD_BALANCE=on)
(FAILOVER=on)
(ADDRESS=
(PROTOCOL=tcp)
(HOST=prod1)
(PORT=1521))
(ADDRESS=
(PROTOCOL=tcp)
(HOST=prod2)
(PORT=1521))
(CONNECT_DATA=
(SERVICE_NAME=prod.us.company.com)
(FAILOVER_MODE=
(TYPE=select)
(METHOD=basic))))

The TYPE parameter accepts the following options:

session - In this mode a new session is created on the failover server in the event of a failure.

select - Cursor processing is maintained. This incurs significant overhead on the client.

none (default) - No failover is done.

The METHOD parameter accepts the following options:

basic - establishes the connection to the failover server when it is needed.

preconnect - establishes the connection to the failover server at initial connection time.

Note that the IP addresses or host names used in the tnsnames.ora file should
be Virtual IP addresses or host names for Oracle 10g RAC.

What TAF Restores

TAF automatically restores some or all of the following elements associated with active database connections. Other elements,
however, may need to be embedded in the application code to enable TAF to recover the connection.

Client-Server Database Connections


TAF automatically reestablishes the connection using the same connect string or an alternate connect string that you specify
when configuring failover.

Users' Database Sessions


TAF automatically logs a user in with the same user ID as was used prior to failure. If multiple users were using the connection,
then TAF automatically logs them in as they attempt to process database commands. Unfortunately, TAF cannot automatically
restore other session properties. These properties can, however, be restored by invoking a callback function.

Executed Commands
If a command was completely executed upon connection failure, and it changed the state of the database, TAF does not resend
the command. If TAF reconnects in response to a command that may have changed the database, TAF issues an error message
to the application.

Open Cursors Used for Fetching


TAF allows applications that began fetching rows from a cursor before failover to continue fetching rows after failover. This is
called "select" failover. It is accomplished by re-executing a SELECT statement using the same snapshot, discarding those rows
already fetched and retrieving those rows that were not fetched initially. TAF verifies that the discarded rows are those that were
returned initially, or it returns an error message

Active Transactions
Any active transactions are rolled back at the time of failure because TAF cannot preserve active transactions after failover. The
application instead receives an error message until a ROLLBACK is submitted.

Server-side Program Variables


Server-side program variables, such as PL/SQL package states, are lost during failures; TAF cannot recover them. They can be
initialized by making a call from the failover callback.

Note the “failover=on”, “failover_mode=”, and “rac_enabled=‘true’” sections.

Note the “LOAD_BALANCE=on” is set, but failover is not set (TAF failover is not
supported with the thin driver).

The command “%ORACLE_HOME%\opmn\bin\onsctl start” can be used to start


ONS, and the command “%ORACLE_HOME%\opmn\bin\onsctl stop” can be
used to stop ONS.

Do not modify the server-side ons.config file to add the mid-tier hosts and ports.
This does not modify the OCR. Use the racgons command instead.

Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.

You should not attempt to run both TAF and FCF at the same time.

Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.

You should not attempt to run both TAF and FCF at the same time.
Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.

You should not attempt to run both TAF and FCF at the same time.

Note: The application should not try to roll back the transaction; the transaction
was already rolled back in the database by the time the application received the
exception.

You should not attempt to run both TAF and FCF at the same time.
Oracle RAC Administration - Part 15: Connection Load
Balancing and FAN
Tarry Singh, tarry.singh@gmail.com

Brief intro

In our last article, we looked at the service architecture. We will continue our quest to understand
the workload characteristics and the load balancing connections in our Virtualized RAC.

Introduction to Workload Management

Connection Load Balancing

With Oracle Notification Services (ONS), we are able to have our RAC balance our client
connections across the nodes. The CLB (connection load balancing) can be handled on the
server-side or on the client-side. The client-side takes care of the connection sharing across the
Listeners, while server-side load balancing manages and handles the connection requests to the
most willingly available node whose information is in turn provided to it by the LBA (Load
Balancing Advisory).

This in turn is analyzing the goals that may have been set for the LBA. You can use a long goal or
a short goal for this connection load balancing. Let’s take a look at them both quickly.

Long Goals: Use long goals for applications that need long uninterrupted connections. It is the
default connection load balancing goal. To see how to implement them, lets take a simple syntax.
Using the DBMS_SERVICE and CLB_GOAL_LONG packages, you can define the connection load
balancing for long connections as follows and here FOKESERV is our service:

EXECUTE DBMS_SERVICE.MODIFY_SERVICE (service_name =>


'FOKESERV'
, clb_goal => DBMS_SERVICE.CLB_GOAL_LONG);

Short Goals: Think of quick and fast transactions like ordering or placing a secure purchase like
that on Amazon or any other place. You would prefer to fill the order up fast and if, for any reason
the process is delayed, the connection is restarted, ensuring that the transaction is performed in a
short-lived connection. Checking the syntax, here CARTSERV is the service in question:

EXECUTE DBMS_SERVICE.MODIFY_SERVICE (service_name =>


'CARTSERV'
, CLB_GOAL => DBMS_SERVICE.CLB_GOAL_SHORT);

Note: When you install your RAC with the DBCA (Database Configuration Assistant), server-side
load balancing is enabled by default. Also check out the sample script in tnsnames.ora file. The
services created through the DBCA will have the Connection Load Balancing default setting as
CLB_GOAL=CLB_GOAL_LONG.

The ONS maintains client connections until the client decides to close the connection or if a node
fails or any other form of outage prevents the continuity of that connection. However, if you
configure TAF (transparent application failover) then the connections are moved to another
healthy instance. For a typical SARG (Select Arguments), TAF can restart the query. Alternately,
from the client side, if you were accessing information on the web, the search would simply carry
on without you noticing the failover of the connection to another node. However, if you were in the
middle of a transaction (INSERT, UPDATE, or DELETE) then the application must rollback the
failed transaction and submit it again. Any other session's customizations must be re-executed.
However, in a normal processing scenario the sessions don’t move irrespective of the workload
changes, newer sessions are just redirected to other nodes.

With the services, the deployment of TAF only becomes easier. By defining a TAF policy for a
service, all connections using this service will have TAF automatically enabled for them. No client-
side intervention is required, meaning the server-side TAF setting will override the TAF setting at
the client level. In order to define the TAF policy you will need the DBMS_SERVICE procedure:

EXECUTE DBMS_SERVICE.MODIFY_SERVICE (service_name =>


'fokeserv.domain.com'
, aq_ha_notifications => TRUE
, failover_method => DBMS_SERVICE.FAILOVER_METHOD_BASIC
, failover_type => DBMS_SERVICE.FAILOVER_TYPE_SELECT
, failover_retries => 180
, failover_delay => 5
, clb_goal => DBMS_SERVICE.CLB_GOAL_LONG);

Client-side load balancing is defined in your client connection with the parameter
LOAD_BALANCE=ON (the default is ON). Upon setting this parameter to ON, Oracle will pick out an
address from the address list at random and connect to that node, thereby balancing the client
connections across the cluster. Run the command lsnrctl services occasionally to see what
services a listener supports.

FAN (Fast Application Notification)

As the Oracle RAC manual states:

FAN is a notification mechanism that RAC uses to notify other processes about configuration and
service level information such as includes service status changes, such as UP or DOWN events.
Applications can respond to FAN events and take immediate action. FAN UP and DOWN events
can apply to instances, services, and nodes.

RAC publishes the FAN events the minute any changes are made to the cluster. So, instead of
waiting for the application to check on individual nodes to detect an anomaly, the applications are
notified by FAN events and are able to react immediately.

FAN also publishes load balancing advisory (LBA) events. Applications are in a position to take
full advantage of the LBA FAN events to facilitate a smooth transition of connections to healthier
nodes in a cluster. One can take advantage of FAN is the following ways:
• When using integrated Oracle Client, the applications can use FAN with no
programming whatsoever. Oracle 120g JDBC, ODP.NET and OCI would be
considered as the components of the integrated clients.
• Programmatic changes in ONS API make it possible for applications to still
subscribe to the FAN events and can execute the event handling actions
appropriately.
• FAN can be implemented with server-side callouts on your database tier.

For instance, a typical DOWN event will prevent any further disruption of service by cleanly
terminating the sessions on that failed node and notifying the user. Moreover, a typical UP event
can address and allocate extra resources for new incoming requests/ connections. There are
however several additional benefits with the server-side callouts, mainly you can utilize FAN in
order to:

• Logging
• Paging/SMS the DBA and/or to open trouble tickets when the resources fail
to (re)start
• Change resource plans or to shut down services when the number of
available instances decreases, thus preventing further load on the cluster
and keeping the RAC running until another healthy node is added to the
cluster.
• Automate the fail service back to PREFERRED instances when required.

Conclusion

In this article, we looked at the CLB and FAN. In future articles, we will continue to discuss the
services architecture and discuss FAN in detail and take a look at the LBA.

Oracle RAC Administration - Part 16: Balancing act


between Server and Client
Brief intro

We will take a little detour from our regular path of following the manual, some would say, to more
practical path. One topic that most intrigues both an Oracle DBA and their clients (in their own
way) is connectivity. DBAs are sometimes puzzled with the load balancing and failover part of
connectivity. Where is it, on the Client-side OR the Server-side? Is tnsnames.ora still useful if I
deploy the Easy Connect Net Naming method? Let's brush up on some basics.

Easy Connect Naming Methods

Easy Connect Naming methods were introduced in Oracle 10g. The benefits were simple. The
need to do a service name lookup in the tnsnames.ora files for TCP/IP environment was
eliminated, a directory naming system is no longer needed and clients are connected
automatically. Connecting is also fairly simple:

Syntax:

CONNECT username/password@[//]host[:port][/service_name]
So our typical examples could look like:

CONNECT oe/oe@localhost/nickserv
CONNECT scott/tiger@172.22.202.100/examples

Connect-Time Failover

From the clients end, when your connection fails at one node or service, you can then do a look
up from your tnsnames.ora file and go on seeking a connection with the other available node.
Take this example of our 4-node VMware ESX 3.x Oracle Linux Servers:

FOKERAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =
nick01.wolga.com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = nick02.wolga.
com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = brian01.wolga.
com)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = brian02.wolga.
com)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = fokerac)
)
)

Here the first address in the list is tried at the client’s end. Should the connection to
nick01.wolga.nl fail, then the next address, nick02.wolga.nl, will be tried. This phenomenon is
called connection-time failover. You could very well have a 32-node RAC cluster monitoring the
galactic system at NASA and thus have all those nodes typed in your tnsnames.ora file.
Moreover, these entries do not necessarily have to be part of the RAC cluster. So it is possible
that you are using Streams, Log Shipping or Advanced Replication to maintain your HA (High
Availability) model. These technologies facilitate continued processing of the database by such a
HA (High Availability) model in a non-RAC environment. In a RAC environment we know (and
expect) the data to be the same across all nodes since there is only one database.

TAF (Transparent Application Failover)

Transparent Application Failover actually refers to a failover that occurs when a node or instance
is unavailable due to an outage or other reason that prohibits a connection to be established on
that node. This can be set to on with the following parameter FAILOVER. Setting it to ON will
activate the TAF. It is turned on by default unless you set it to OFF to disable it. Now, when you
turn it on you have two types of connections available by the means of the FAILOVER_MODE
parameter. The type can be session, which is default or select. When the type is SESSION, if the
instance fails, then the user is automatically connected to the next available node without the
user’s manual intervention. The SQL statements need to be carried out again on the next node.
However, when you set the TYPE to SELECT, then if you are connected and are in the middle of
your query, then your query will be restarted after you have been failed over to the next available
node. Take this example of our tnsnames.ora file, (go to the section beginning with
CONNECT_DATA):
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = fokerac.wolga.com)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)

This is also demonstrated in this image from my article on Quest’s soRAC tool.

This is more intelligent in the sense that should the instance, (where the user has fired a query),
go down then the session will remember the rows in its Global Buffer Cache, restart the query
and pick up from where the user left off when they were disconnected from the first node.

Compare it to a scenario where the manager is running the query through a client such as Crystal
Reports or Oracle Reports. Running the monthly query to get the data from his Server Monitoring
software, let's say NetIQ, to get the statistics of all the servers the DBA needs to draw a report on
monitoring the CPU, memory or disk usage. Now if he ran a query that had to fetch 5000 rows
and he gets cut off at 3000 rows, he still has to reissue (not manually though) start from the
beginning but will only see the results when he crosses the 3001 mark. Then upon prompt, the
results will be shown. However, this does not apply to the DML (Data Manipulation Language)
such as insert, update, delete. The user is met with an error if node failure happens in the middle
of the processing and all the (uncommitted) transactions are rolled back. The session eventually
fails over without the user having to connect, however, he does have to issue the query again.

Preconnect Method

Consider our tnsnames.ora file again. Here in our older example, where the METHOD was
BASIC, there is some additional time loss in reconnecting the lost connection to the immediate
available node. However, when you set the METHOD to PRECONNECT, a session is connected
to all available nodes in the list. The advantage is that failover is a lot quicker than the BASIC
method but the downside is that the other instances will continue to claim necessary resources
such a system memory to keep those connections alive.

(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = fokerac.wolga.com)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = PRECONNECT)
(RETRIES = 180)
(DELAY = 5)
)
)
Conclusion

As a DBA, these little things are very crucial in deciding what you intend to do with the RAC. Is it
a DSS or will it be used for a heavy OLTP environment. On a DSS, you can easily choose a
PRECONNECT method as you expect to have less connections with long durations (batch
processing, etc.) where as a typical OLTP will perform better if the cluster is optimally configured
with a BASIC method.