
An Oracle White Paper

Updated August 2010

Oracle GoldenGate for Linux, UNIX, and Windows

Executive Overview
Introduction
Server Environment
    ODBC
    Disk
GoldenGate Environment
    Comments
    GoldenGate Macros
    GoldenGate Manager
    GoldenGate Capture
    GoldenGate Data Pump
    TCPFLUSHBYTES
    GoldenGate Trails
    GoldenGate Apply
    Load Balancing GoldenGate Processes


Executive Overview
This document presents generic best practices for Oracle GoldenGate installed in Linux, UNIX, or
Windows environments. Database specific best practices are intentionally omitted and are discussed in
individual database best practices documents.

Introduction
Oracle GoldenGate is inherently complex to architect. The following sections provide generic
configuration settings for areas that are commonly missed or improperly configured by novice users.
The target audience of this document has an existing knowledge of GoldenGate and is adept in its
configuration.


Server Environment
There are several items in the server environment that should be considered when running GoldenGate. Best
practices for ODBC authentication and disk configuration are below.

ODBC
For installations utilizing ODBC connectivity to the database, perform user authentication at the ODBC level
instead of in GoldenGate Capture or Apply. This precludes storing database logon IDs and
passwords in the GoldenGate parameter files, which could be a security issue.
In the GoldenGate parameter file, all that is required to access the database is the option
SOURCEDB <system dsn>.
Linux and Unix

To configure user authentication for ODBC on Linux or UNIX servers, add the following to the odbc.ini file:
LogonUser = server_username
LogonAuth = password_for_LogonUser
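A minimal odbc.ini stanza might then look like the following sketch. The DSN name, driver path, and server key are illustrative assumptions, and key names other than LogonUser and LogonAuth vary by driver:
[MyDSNconnect]
Driver    = /path/to/odbc/driver.so
DBCName   = dbhost.example.com
LogonUser = server_username
LogonAuth = password_for_LogonUser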
Windows

In the Windows environment, when configuring the ODBC System DSN, select Windows Authentication
where applicable (SQL Server, etc.) or provide the username and password if Windows Authentication is not
supported (Teradata, etc.).

Disk
Internal vs. RAID Disk

If GoldenGate is not being installed on a cluster server, use internal disks. Testing has shown that
checkpointing could be delayed up to 2 seconds depending upon the RAID architecture.
If RAID disks are required, RAID1+0 is preferred over RAID5 due to the overhead required for RAID5
writes.
RAID1+0 Explained

RAID1 is data mirroring. Two copies of the data are held on two physical disks, and the data is always
identical. RAID1 has a performance advantage, as reads can come from either disk, and is simple to implement.
RAID0 is simply data striped over several disks. This gives a performance advantage, as it is possible to read
parts of a file in parallel. However, not only is there no data protection, it is actually less reliable than a single
disk, as all the data is lost if a single disk in the array stripe fails.
RAID1+0 is a combination of RAID1 mirroring and data striping. This means it has very good performance,
and high reliability, so it is ideal for mission critical database applications.
RAID5 Explained

RAID5 data is written in blocks onto data disks, and parity is generated and rotated around the data disks. This
provides good general performance, and is reasonably cheap to implement.
The problem with RAID5 is write overhead. If a block of data on a RAID5 disk is updated, then all the
unchanged data blocks from the RAID stripe must be read from disk, and a new parity calculated, before
the new data block and new parity block can be written out. This means that a RAID5 write operation requires
four I/Os. The performance impact is usually masked by a large subsystem cache.

Oracle GoldenGate for Linux, UNIX, and Windows

RAID5 has a potential for data loss on hardware errors and poor performance on random writes. RAID5 will
not perform well unless there is a large amount of cache. However, RAID5 is fine on large enterprise-class disk
subsystems, as they all have large, gigabyte-size caches and force all write IOs to be written to cache, thus
guaranteeing performance and data integrity.
For RAID5 configurations, a smaller stripe size is more efficient for a heavy random write workload, while a
larger block size works better for sequential writes. A smaller number of disks in an array will perform better,
but has a bigger parity bit overhead. Typical configurations are 3+1 (25% parity) and 7+1 (12.5% parity).
Disk Space

Sufficient disk space should be allocated to GoldenGate on the source server in order to hold extract trails for
a worst-case target server outage. This is customer dependent; however, space allocation for seven
days may be used as a general rule of thumb for disaster recovery purposes.
NFS Mounts

Unless IO buffering is set to zero (0), NFS mounts should not be used by any GoldenGate disk input or
output process. The danger occurs when one process registers the end of a trail file or transaction log and
moves on to the next in sequence, yet after this event data in the NFS IO buffer gets flushed to disk. The net
result is skipped data, and this cannot be compensated for with the EOFDELAY parameter.

GoldenGate Environment
Comments
Comments should always be included in GoldenGate parameter files. Comments aid in troubleshooting
issues and in documenting configurations. In GoldenGate, comments can be denoted by the word
COMMENT but are more commonly expressed as two dashes (--) preceding any text.
Comments should (1) identify modifications to the files (who, when, what), (2) provide an explanation
for various settings, and (3) provide additional information about the process and configuration.
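For example, a Capture parameter file might carry a header comment block such as the following; the group name, dates, and author initials are illustrative:
-- ================================================================
-- Capture group : MYEXT (hypothetical)
-- Purpose       : Capture change data for the sales schema
-- 2010-08-02 jdoe  Initial configuration
-- 2010-08-20 jdoe  Added table sales.orders
-- ================================================================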

GoldenGate Macros
GoldenGate Macros are a series of commands, parameters, or data conversion functions that may be
shared among multiple GoldenGate components. The best use of macros is to create a macro library;
which is a series of commonly used macros stored in a shareable location.
To set up a GoldenGate Macro library, create a subdirectory named dirmac under the main
GoldenGate installation directory. Macro library files stored in this location consist of edit files with
the suffix .mac.
Macro Library Contents

In its simplest form, the macro library contains database connection information used in Capture and
Apply. Removing this information from the parameter files adds an additional layer of security in the
server environment as the database access information cannot be viewed in the parameter or report files.
Table 1 presents a sample macro file containing database connect information.
Table 1. dbconnect.mac - Sample database connect macros.
MACRO #odbc_connect_dsn
BEGIN
SOURCEDB MyDSNconnect
END;
MACRO #odbc_connect_clearpwd
BEGIN
SOURCEDB MyDSNconnect, USERID lpenton, PASSWORD lpenton
END;
MACRO #odbc_connect_encrypt
BEGIN
SOURCEDB MyDSNconnect, USERID lpenton, PASSWORD
AACAAAAAAAAAAAKAVDCCTJNGFALEWEVECDIGAEMCQFFBZHVC,
encryptkey default
END;
MACRO #dbconnect_clearpwd
BEGIN
USERID lpenton, PASSWORD lpenton
END;
As shown above, the sample macro library file, dbconnect.mac, contains the following macros:
1. #odbc_connect_dsn
a. This shows the database connect string when user authentication is performed by ODBC.
2. #odbc_connect_clearpwd
a. This shows the database connect string when user authentication is not performed by ODBC.
3. #odbc_connect_encrypt
a. This shows the database connect string when user authentication is not performed by ODBC, and an encrypted password using the default GoldenGate encryption key is supplied.
4. #dbconnect_clearpwd
a. This shows the database connect string for databases where GoldenGate does not use ODBC as the access method (i.e., Oracle).

NOTE: Clear passwords should never be used in a customer environment. All passwords must be
encrypted via the GGSCI command ENCRYPT PASSWORD, with default-level encryption used at a
minimum.
MACRO denotes the macro name. BEGIN and END denote the starting and ending points for
the macro definition. All statements between BEGIN and END comprise the macro body.
Loading the Macro Library

Macro files are loaded and processed only when GoldenGate components (Manager, Capture, Data
Pump, or Apply) are started.
To load a macro file, add text similar to that below in the respective parameter file:
NOLIST
include ./dirmac/dbconnect.mac
LIST
NOLIST specifies that at start time, the process is not to log whatever follows this statement into its
report file. include ./dirmac/dbconnect.mac specifies that the process is to read the contents of the
file and include them as part of its runtime options. LIST specifies that the process is to log whatever
follows this statement into its report file.
It is a best practice to never list the macro file contents.
If the example above were included in a Capture parameter file, then when the GGSCI command START
EXTRACT <extract name> is executed, the macro library file dbconnect.mac is opened and read as
part of the process's runtime environment. If this file does not exist, the process will abend with a nonexistent-file error.
Macro Execution

To reference a macro that is part of our example library file, you would add a line such as the one below
into the parameter file:
#odbc_connect_dsn ()
If we had issued the START EXTRACT command above, then during process startup this line would be
logically replaced by the macro body contents (SOURCEDB MyDSNconnect in this case) and a database
logon attempted using the connect method specified.
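Putting the pieces together, a Capture parameter file using the library might begin as follows; the group, trail, and table names are illustrative:
EXTRACT myext
NOLIST
include ./dirmac/dbconnect.mac
LIST
#odbc_connect_dsn ()
EXTTRAIL ./dirdat/aa
TABLE sales.orders;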

GoldenGate Manager
Manager is the GoldenGate parent process and is responsible for the management of GoldenGate
processes, resources, user interface, and the reporting of thresholds and errors.
Even though the default settings for Manager will suffice in most instances, there are several settings that
should be reviewed and modified for a well-configured and well-running GoldenGate environment.

PORT

The default Manager listener port is 7809. Because this is a well-documented and publicized port
number, it may not be the best setting in a customer environment, especially if the server is on a non-secure network. The customer's network administrator should assign GoldenGate Manager a non-default port number.
DYNAMICPORTLIST

When Manager receives a connect request from Capture, Apply, or GGSCI, the default functionality is
to utilize any available free port for data exchange. In a production environment this may not be desired
functionality; therefore, the customer's network administrator should assign a series, or range, of ports
for exclusive use by GoldenGate processes.
DYNAMICPORTLIST may be configured as:
1. A series of ports
a. DYNAMICPORTLIST 15301, 15302, 15303, 15380, 15420
2. A range of ports
a. DYNAMICPORTLIST 12010-12250
3. A range of ports plus individual ports
a. DYNAMICPORTLIST 12010-12020, 15420, 15303
A maximum of 256 ports may be specified.
PURGEOLDEXTRACTS

Use PURGEOLDEXTRACTS in the Manager parameter file to purge trail files when
GoldenGate has finished processing them. As a Manager parameter, PURGEOLDEXTRACTS
provides trail management in a centralized fashion and takes into account multiple processes.
To control the purging, follow these rules:
1. Specify USECHECKPOINTS to purge when all processes are finished with a file as indicated by checkpoints. Basing the purge on checkpoints ensures that no file is deleted until all processes are finished with it. USECHECKPOINTS considers the checkpoints of both Extract and Replicat before purging.
2. Use the MINKEEP rules to set a minimum amount of time to keep an unmodified file:
a. Use MINKEEPHOURS or MINKEEPDAYS to keep a file for <n> hours or days.
b. Use MINKEEPFILES to keep at least <n> files, including the active file. The default is 1.
3. Use only one of the MINKEEP options. If more than one is used, GoldenGate selects one of them based on the following:
a. If both MINKEEPHOURS and MINKEEPDAYS are specified, only the last one is accepted, and the other is ignored.
b. If either MINKEEPHOURS or MINKEEPDAYS is used with MINKEEPFILES, then MINKEEPHOURS or MINKEEPDAYS is accepted, and MINKEEPFILES is ignored.

Set MINKEEPDAYS on the source server to keep <n> number of local extract trails on disk to
facilitate disaster recovery of the source database. The number of days is determined by the customer.
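For example, the following Manager entry (trail path illustrative) purges trails only after all readers have checkpointed past them, while retaining seven days of files for disaster recovery:
PURGEOLDEXTRACTS ./dirdat/aa*, USECHECKPOINTS, MINKEEPDAYS 7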
BOOTDELAYMINUTES

This is a Windows-only parameter and must be the first entry in the Manager parameter file.
BOOTDELAYMINUTES specifies the amount of time Manager is to delay after the server has booted.
This delay allows other server components (RAID disks, database, network, etc.) to start up and
become active before Manager begins processing.

AUTOSTART

AUTOSTART is used to start GoldenGate Capture or Apply processes as soon as Manager's startup
process completes.
LAGREPORT

LAGREPORT denotes the interval at which Manager checks Capture and Apply processes for lag. This
setting is customer dependent; however, a setting of LAGREPORTHOURS 1 should be sufficient in
most environments.
LAGCRITICAL

LAGCRITICAL denotes the threshold at which lag becomes unacceptable; when it is reached, Manager writes
a message to the GoldenGate error log. This setting is customer dependent; however, the recommended
settings below will suffice for most installations:
Environment          Setting
Disaster Recovery    LAGCRITICALMINUTES 5
Decision Support     LAGCRITICALMINUTES 5
Reporting            LAGCRITICALHOURS 1
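Combining these recommendations, a Manager parameter file might resemble the following sketch; the port numbers, trail path, and autostart wildcard are illustrative assumptions:
BOOTDELAYMINUTES 5        -- Windows only; must be the first entry
PORT 15300                -- non-default listener port (illustrative)
DYNAMICPORTLIST 15301-15320
AUTOSTART ER *
PURGEOLDEXTRACTS ./dirdat/aa*, USECHECKPOINTS, MINKEEPDAYS 7
LAGREPORTHOURS 1
LAGCRITICALMINUTES 5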

Windows Cluster Considerations

Ensure that GoldenGate Manager is installed as a Service on each node in the cluster. The GoldenGate
Manager resource in the SQL Server Cluster Group must be moved to each node and GoldenGate
Manager installed as a service. To do so without moving all the resources in the SQL Server Cluster
Group, do the following:
1. Create a new, temporary cluster group.
2. Ensure that the GGS Manager resource is NOT checked to Affect the Group on failover.
3. Ensure that the disk resource W, for GoldenGate, is NOT checked to Affect the Group on failover if GoldenGate is the only resource using this disk resource.
4. Stop all Extracts and Pumps from GGSCI with: STOP ER *
5. Take GGS Manager offline from Cluster Administrator.
6. Delete any dependencies for the GGS Manager resource.
7. Move the GGS Manager resource from the SQL Server Cluster Group to the newly created cluster group.
8. Move this drive resource to the newly created cluster group.
9. Move the newly created group to the new node.
10. Log in to the new node and browse to the GoldenGate install directory.
11. Start GGSCI.EXE and type:
a. SHELL INSTALL ADDSERVICE ADDEVENTS
12. Open the Windows Services applet and set the Log On account for the GGSMGR service that was just created if using other than the default Local System Account.
13. Ensure that the Log On account for the GGSMGR service is a member of the new node's local Administrators Group.
14. Ensure that system DSNs have been created on the new node exactly as they have been created on the primary node.
15. Bring the GGS Manager resource online through Cluster Administrator.
16. Verify in GGSCI that all Extracts and Pumps have started. They should start automatically when the GGS Manager resource comes online, but may take a few seconds and may require a manual start with: START ER *
17. Stop Extracts and Pumps in GGSCI with: STOP ER *
18. Move the new cluster group with the GGS Manager and disk resources back to the primary node.
19. Move both resources in the new cluster group back to the SQL Server Cluster Group and reset the dependencies for the GGS Manager resource (SQL Server and drive W resource dependencies).
20. Delete the temporary cluster group created in Step 1.
21. Bring the GGS Manager resource back online, and verify all Extracts and Pumps are running through GGSCI.

GoldenGate Capture
GoldenGate Change Data Capture retrieves transactional data from the source database. Below are
some generic best practice guidelines for Capture:
1. As stated in section 3.2, use GoldenGate Macros to configure database access.
2. Do not configure Capture to transmit data over TCP/IP; Capture must store change data locally to EXTTRAILS.
a. The data transmission causes Capture to slow down.
b. Capture should only do one thing: get change data.
c. If connectivity to the target server fails, Capture abends. Process restart and catch-up cause undue stress on the server and database.
3. Do not have a number as the last character of a Capture Group Name.
a. By default, GoldenGate stores 10 report files in the dirrpt directory for each component, and appends a number (0 through 9) to the Group Name (i.e., myext0.rpt). Because these files are aged by renaming them (0 to 1, 1 to 2, etc.), having a number as the last character in the Group Name causes confusion when attempting to locate noncurrent report files.

Activity Reporting

Activity reporting is imperative in maintaining a well-configured GoldenGate environment. This
function not only provides valuable information to the end user, but it also provides a means for
determining how to load balance Capture and Apply processes.
At a minimum, activity reporting should be performed on a daily basis. The following parameters will
cause Capture to report activity per table daily at 1 minute after midnight:
STATOPTIONS RESETREPORTSTATS
REPORT AT 00:01
REPORTROLLOVER AT 00:01
REPORTCOUNT EVERY 1 HOUR, RATE
As shown above:
1. STATOPTIONS RESETREPORTSTATS
a. Controls whether or not statistics generated by the REPORT parameter are reset
when a new process report is created. The default of NORESETREPORTSTATS
continues the statistics from one report to another as the report rolls over based on
the REPORTROLLOVER parameter.
2. REPORT AT 00:01
a. Causes Capture to generate per-table statistics daily at 1 minute after midnight and record those statistics in the current report file.
3. REPORTROLLOVER AT 00:01
a. Causes Capture to create a new report file daily at 1 minute after midnight. Old
reports are renamed in the format of <group name><n>.rpt, where <group name>
is the name of the Extract or Replicat group and <n> is a number that gets
incremented by one whenever a new file is created, for example: myext0.rpt,
myext1.rpt, myext2.rpt, and so forth.
4. REPORTCOUNT EVERY 1 HOUR, RATE
a. Every hour a count of transaction records that have been processed since the Capture
process started will be written to the report file.
b. Rate reports the number of operations per second and the change in rate, as a
measurement of performance. The rate statistic is the total number of records
divided by the total time elapsed since the process started. The delta statistic is the
number of records since the last report divided by the time since the last report.

TRANSMEMORY

TRANSMEMORY controls the amount of memory and temporary disk space available for caching
uncommitted transaction data. Because GoldenGate sends only committed transactions to the target
database, it requires sufficient system memory to store transaction data on the source system until either
a commit or rollback indicator is received.
Transactions are added to the memory pool specified by RAM, and each is flushed to disk when
TRANSRAM is reached. An initial amount of memory is allocated to each transaction based on
INITTRANSRAM and is increased by the amount specified by RAMINCREMENT as needed, up to
the maximum set with TRANSRAM. The value for TRANSRAM should be evenly divisible by the sum
of (INITTRANSRAM + RAMINCREMENT).
The setting for TRANSMEMORY will be customer dependent, based upon the unique workload. For
OLTP environments, the default settings should be sufficient, as transactions tend to be very short-lived;
however, in other environments long-running transactions may exceed the default 500 KB setting.
If Capture exceeds allocated TRANSMEMORY, the process will abend. The current settings will be
written to the Capture report file.
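As a sketch only (the sizes are assumptions to be tuned per workload), the following allocates a 200 MB pool with a per-transaction maximum evenly divisible by the sum of INITTRANSRAM and RAMINCREMENT (500 KB + 250 KB = 750 KB; 1500 KB / 750 KB = 2):
TRANSMEMORY RAM 200M, TRANSRAM 1500K, INITTRANSRAM 500K, RAMINCREMENT 250K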
LOBMEMORY

LOBMEMORY controls the amount of memory and temporary disk space available for caching
transactions that contain LOBs.
LOBMEMORY enables you to tune GoldenGate's cache size for LOB transactions and define a
temporary location on disk for storing data that exceeds the size of the cache. Options are available for
defining the total cache size, the per-transaction memory size, the initial and incremental memory
allocation, and disk storage space.
For OLTP transactions, the default 200 MB RAM setting should suffice; however, computations will need
to be run for large data warehouses to ensure LOBMEMORY is set properly.
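LOBMEMORY accepts options analogous to TRANSMEMORY; a hedged sketch (the cache sizes and spill directory are assumptions to verify against your release) might be:
LOBMEMORY DIRECTORY (./dirtmp, 2000000000, 200000000), RAM 200M, TRANSRAM 1500K, INITTRANSRAM 500K, RAMINCREMENT 250K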

GoldenGate Data Pump


GoldenGate Extract Data Pump retrieves data from a local extract trail and transmits the records over
TCP/IP to the target server. Below are some generic best practice guidelines for Data Pumps:
1. Do not have a number as the last character of a Data Pump Group Name.
a. By default, GoldenGate stores 10 report files in the dirrpt directory for each component, and appends a number (0 through 9) to the Group Name (i.e., mypmp0.rpt). Because these files are aged by renaming them (0 to 1, 1 to 2, etc.), having a number as the last character in the Group Name causes confusion when attempting to locate noncurrent report files.
2. Do not evaluate the data.
a. Data Pumps function most efficiently when in PASSTHRU mode (see the sketch after this list). Evaluation of the data for column mapping or data transformation requires more resources and causes processing delays.
3. Read once, write many.
a. One of the benefits of using Data Pumps is their ability to write to multiple remote locations. This is of benefit for load balancing Apply processes on the target server when there is no discernible lag evident for the Data Pump, or when the same data needs to be delivered to multiple locations.
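A minimal pass-through Data Pump sketch, with illustrative group, host, port, and trail names (RMTHOST options are discussed in the next section):
EXTRACT mypmp
PASSTHRU
RMTHOST target.example.com, MGRPORT 15300, COMPRESS
RMTTRAIL ./dirdat/ra
TABLE sales.*;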

RMTHOST Options
COMPRESS

COMPRESS should be set anytime data is being transmitted over TCP/IP. It compresses outgoing
blocks of records to reduce bandwidth requirements; GoldenGate decompresses the data before writing
it to the trail.
COMPRESS typically results in compression ratios of at least 4:1 and sometimes better.
Encryption

Data transmitted via public networks should be encrypted. Transactions containing sensitive data (credit
card numbers, social security numbers, healthcare information, etc) must be encrypted before
transmission over unsecure networks.
The customer security administrator should make the determination as to whether encryption is
required; however, any data that contains financial information (account numbers, etc.), personal
information (social security number, driver's license number, address, etc.), or health care information
must be encrypted.
If there are any doubts, encrypt!
TCPBUFSIZE

TCPBUFSIZE controls the size of the TCP socket buffer in bytes. By increasing the size of the buffer,
larger packets can be sent to the target system. The actual size of the buffer depends on the TCP stack
implementation and the network. The default is 30,000 bytes, but modern network configurations
usually support higher values.
Valid values are from 1000 to 200000000 (two hundred million) bytes.
To compute the proper TCPBUFSIZE setting:
1. Use ping to determine the average round-trip time to the target server.
a. PING has several options that may be specified:
i. -n <count>: the number of echo requests to send. Defaults to 4.
ii. -l <size>: the size of the ping packet. Defaults to 32 bytes.
b. If you know the average transaction size for data captured, use that value for the ping; otherwise, use the default.
2. After obtaining the average round-trip time, multiply that value by the network bandwidth.
a. For example, ping returned 64 ms as the average round-trip time, and the network bandwidth is 100 Mb. Rounding 0.064 seconds up to 0.07 and multiplying the two numbers produces: 0.07 * 100 = 7 Mb.
3. Network bandwidth is in bits per second, so convert the result from step 2 into bytes per second.
a. Divide the result of step 2 by 8: 7 / 8 = 0.875 megabytes per second.
4. TCPBUFSIZE should be set to 875000 in this instance.

On UNIX/Linux systems, using the ifconfig command to get the TCP Receive Space will also provide the
correct value for TCPBUFSIZE. On the target system, issue the following command:
ifconfig -a
Example output:
eth0  Link encap:Ethernet HWaddr 00:0C:29:89:3A:0A
      inet addr:192.168.105.166 Bcast:192.168.105.255 Mask:255.255.255.0
      UP BROADCAST RUNNING MULTICAST TCP Receive Space:125000 Metric:1
      RX packets:23981 errors:0 dropped:0 overruns:0 frame:0
      TX packets:5853 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:2636515 (2.5 MiB) TX bytes:5215936 (4.9 MiB)
      Interrupt:185 Base address:0x2024
Note that the TCP Receive Space is 125000. Therefore, for the shortest send wait time, set TCPBUFSIZE
to 125000 as well.
The wait time can be determined by the GGSCI command:
SEND <extract pump>, GETTCPSTATS
Example output:
Sending GETTCPSTATS request to EXTRACT PHER ...
RMTTRAIL .\dirdat\ph000000, RBA 717
OK
Session Index 0
Stats started 2008/06/20 16:06:07.149000  0:00:11.234000
Local address 192.168.105.151:23822  Remote address 192.168.105.151:41502
Inbound Msgs             9  Bytes     66,   6 bytes/second
Outbound Msgs           10  Bytes   1197, 108 bytes/second
Recvs                   18
Sends                   10
Avg bytes per recv       3, per msg      7
Avg bytes per send     119, per msg    119
Recv Wait Time      125000, per msg  13888, per recv 6944
Send Wait Time           0, per msg      0, per send    0
The lower the send wait time, the better the pump performs over the network.
The customer network administrator can also assist in determining an optimal value.

TCPFLUSHBYTES
TCPFLUSHBYTES controls the size of the buffer, in bytes, that collects data that is ready to be sent across
the network. When this value is reached, the data is flushed to the target.
Set TCPFLUSHBYTES to the same value as TCPBUFSIZE.
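Tying the RMTHOST options together with the TCPBUFSIZE computed above, a hedged sketch follows; the host, port, and key name are illustrative, and the encryption options available should be verified against your GoldenGate release:
RMTHOST target.example.com, MGRPORT 15300, COMPRESS, ENCRYPT BLOWFISH, KEYNAME superkey, TCPBUFSIZE 875000, TCPFLUSHBYTES 875000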

GoldenGate Trails
GoldenGate Trails are used to store records retrieved by Change Data Capture (EXTTRAIL) or
transmitted over TCP/IP to a target server by Extract Data Pump (RMTTRAIL). A single GoldenGate
instance supports up to 100,000 trails ranging in size from 10Mb (the default size) to 2000Mb, and
sequentially numbered from 000000 to 999999.
Below are some generic best practice guidelines for GoldenGate Trails:
1. Do not use numeric characters in the trail identifier.
a. Trails are identified by two characters assigned via the GGSCI command ADD EXTTRAIL or ADD RMTTRAIL. Using numeric characters, e.g., a1, could cause confusion when attempting to locate trails for troubleshooting purposes, as there could be a trail named dirdat/a1111111 on disk. To eliminate this confusion, use characters a through z (and uppercase A through Z on Linux/UNIX servers) only. If you need more than 52 unique trails, use a directory other than ./dirdat as the repository.
2. Do not use the default trail size.
a. 10 MB is really too small for modern production environments. File creation is a very expensive activity, so size the trails to hold 24 hours of data. If the customer is generating more than 2 GB of data per day, size both local and remote trails for the maximum (see the GGSCI sketch after this list).
3. Manage the trails via Manager.
a. Have Manager housekeeping tasks delete trails when they are no longer required.
b. On the source server, keep a few days (up to 7) of local trails on disk as an added security measure in the event of a catastrophic source database failure.
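Following these guidelines, trails could be added in GGSCI as in this sketch (group and trail names illustrative):
ADD EXTTRAIL ./dirdat/aa, EXTRACT myext, MEGABYTES 2000
ADD RMTTRAIL ./dirdat/ra, EXTRACT mypmp, MEGABYTES 2000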

GoldenGate Apply
GoldenGate Change Data Apply executes SQL statements on the target database based upon
transactional data captured from the source database. Below are some generic best practice guidelines
for Apply:
1. As stated in section 3.2, use GoldenGate Macros to configure database access.
2. Do not have a number as the last character of an Apply Group Name.
a. By default, GoldenGate stores 10 report files in the dirrpt directory for each component, and appends a number (0 through 9) to the Group Name (i.e., myrep0.rpt). Because these files are aged by renaming them (0 to 1, 1 to 2, etc.), having a number as the last character in the Group Name causes confusion when attempting to locate noncurrent report files.
3. Always discard.
a. Always configure DISCARDFILE in Apply parameter files. In the event of a data integrity issue, the Apply process will log the data record to this file (a sketch follows this list).
b. Use .dsc as the suffix for Apply discard files.
i. This just makes it easier to identify them.
c. Put discard files in a dedicated subdirectory.
i. Create a directory dirdsc for discard file storage. This aids in troubleshooting, as the files will be stored in a dedicated location.
4. Never use HANDLECOLLISIONS for real-time data Apply.
a. HANDLECOLLISIONS should only be active during the catch-up phase after database instantiation.
b. If HANDLECOLLISIONS is active, END RUNTIME or END <timestamp> must be active as well.
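A hedged DISCARDFILE sketch for an Apply parameter file (group name and size are illustrative):
DISCARDFILE ./dirdsc/myrep.dsc, APPEND, MEGABYTES 100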

Activity Reporting

Activity reporting is imperative in maintaining a well-configured GoldenGate environment. This
function not only provides valuable information to the end user, but it also provides a means for
determining how to load balance Apply processes.
At a minimum, activity reporting should be performed on a daily basis. The following parameters will
cause Apply to report activity per table daily at 1 minute after midnight:
STATOPTIONS RESETREPORTSTATS
REPORT AT 00:01
REPORTROLLOVER AT 00:01
REPORTCOUNT EVERY 1 HOUR, RATE
As shown above:
1. STATOPTIONS RESETREPORTSTATS
a. Controls whether or not statistics generated by the REPORT parameter are reset when a new process report is created. The default of NORESETREPORTSTATS continues the statistics from one report to another as the report rolls over based on the REPORTROLLOVER parameter.
2. REPORT AT 00:01
a. Causes Apply to generate statistics daily at 1 minute after midnight and record those statistics in the current report file.
3. REPORTROLLOVER AT 00:01
a. Causes Apply to create a new report file daily at 1 minute after midnight. Old reports are renamed in the format of <group name><n>.rpt, where <group name> is the name of the Extract or Replicat group and <n> is a number that gets incremented by one whenever a new file is created, for example: myrep0.rpt, myrep1.rpt, myrep2.rpt, and so forth.
4. REPORTCOUNT EVERY 1 HOUR, RATE
a. Every hour, a count of transaction records that have been processed since the Apply process started will be written to the report file.
b. RATE reports the number of operations per second and the change in rate, as a measurement of performance. The rate statistic is the total number of records divided by the total time elapsed since the process started. The delta statistic is the number of records since the last report divided by the time since the last report.

LOBMEMORY

LOBMEMORY controls the amount of memory and temporary disk space available for caching
transactions that contain LOBs.
LOBMEMORY enables you to tune GoldenGate's cache size for LOB transactions and define a
temporary location on disk for storing data that exceeds the size of the cache. Options are available for
defining the total cache size, the per-transaction memory size, the initial and incremental memory
allocation, and disk storage space.
For OLTP transactions, the default 200 MB RAM setting should suffice; however, computations will need
to be run for large data warehouses to ensure LOBMEMORY is set properly.
BATCHSQL

BATCHSQL may be used to increase the throughput of Apply by grouping similar SQL statements into
arrays and applying them at an accelerated rate.
When BATCHSQL is enabled, Apply buffers and batches a multitude of statements and applies them in
one database operation, instead of applying each statement immediately to the target database.
Operations containing the same table, operation type (I, U, D), and column list are grouped into a batch.
Each type of SQL statement is prepared once, cached, and executed many times with different variables.
The number of statements that are cached is controlled by the MAXSQLSTATEMENTS parameter.
BATCHSQL is best used when the change data is less than 5000 bytes per row. Given this,
BATCHSQL cannot process data that contains LOB or LONG data types, change data for rows greater
than 25k in length, or when the target table has a primary key and one or more unique keys.
If the transaction cannot be written to the database via BATCHSQL, Apply will abort the transaction,
temporarily disable BATCHSQL processing, and then retry the transaction in normal mode. If this
occurs frequently, you will need to evaluate the tradeoff of using BATCHSQL versus the impact of
this error-handling functionality.
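A hedged example of enabling BATCHSQL with explicit sizing; the values are assumptions to be tuned against the observed workload:
BATCHSQL BATCHESPERQUEUE 100, OPSPERBATCH 2000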

Load Balancing GoldenGate Processes


Should any GoldenGate component (Capture, Data Pump, or Apply) report lag that is unacceptable to
the customer, or exceeds their documented service level agreement (SLA), a performance audit will need
to be conducted. Here, we'll discuss the basic concepts for performing a performance audit and load
balancing exercise.
The Pareto principle states that, for many events, 80% of the effects come from 20% of the causes.
Applying this principle to optimization means that we will concentrate our efforts on the top 20% of
resource consumers.
Step 1. Identify the Bottleneck

What component is reporting an unacceptable lag? Typically, lag will be evident in Apply because of the
work that must be performed to maintain the target database. However, lag can also be evident for
Change Data Capture if data is added to the database logs faster than we can extract. Data Pumps may
report lag for the same reason, or if the network is too slow, or if the TCPBUFSIZE setting is too small.
For Data Pump lag, refer to section 3.5 and compute the recommended TCPBUFSIZE setting. The
customer network administrator can provide information about network bandwidth and TCP settings
for the server.

One methodology for determining where lag resides in the data transport mechanism is to enable a
heartbeat table. For more information on heartbeat table usage and configuration, refer to the Best
Practices Heartbeat Table document.
Step 2. Determine Database Activity

What are the busiest tables, the type of activity being performed, and the average record length of the
data?
If Capture statistics are being generated on a daily basis, we can use the GoldenGate report files to find
the busiest tables. Ideally, we want to use 7 days worth of data to compute a running average per day,
per hour, per minute, and per second.
To determine activity per transaction type for each table, use the following formulas (you may want to
put this in a spreadsheet):
1. Daily activity: the raw numbers from each report file.
2. Hourly per day: daily / 24
3. Per minute per day: hourly / 60
4. Per second per day: per minute / 60
5. Weekly average: sum(daily) / 7
6. Hourly average: sum(hourly) / 7
7. Per minute average: sum(per minute) / 7
8. Per second average: sum(per second) / 7
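For example, a table showing 864,000 operations in a daily report averages 864,000 / 24 = 36,000 per hour, 36,000 / 60 = 600 per minute, and 600 / 60 = 10 per second.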

Sort the data to determine the most active tables per day, per hour, per second, weekly average, hourly
average, and per second average. The busiest tables will be the ones where these six values intersect
(they are within the top 20% of all tables in the list).
Since we now know the busiest tables, we need to determine the average record length of the data
captured. For this we'll be using the EXTTRAILS on the source server and LOGDUMP. Ideally, we
need several days of trail history to complete this step; however, we can use what's available.
For each available EXTTRAIL, and for each table in our busiest tables list, do the following:
1. Start LOGDUMP.
2. Open each trail:
a. Open ./dirdat/<xx><nnnnnn>
3. Set the detail level:
a. Detail data
4. Filter on the table:
a. Filter include filename <schema.table>
b. Be sure to put the table name in upper case.
5. Make sure we're at the beginning of the trail:
a. Pos 0
6. Get the count:
a. Count
Logdump will return information similar to this:
Scanned 10000 records, RBA 62732472, 2007/06/28 11:08:15.000.000
LogTrail C:\GGS\ora92\dirdat\e9000000 has 6738 records
Total Data Bytes          19962835
  Avg Bytes/Record            2962
Delete                         352
Insert                        4000
PK FieldComp                  2386
Before Images                  352
After Images                  6386
Filtering matched    6738 records
          suppressed 13476 records
Average of 5056 Transactions
    Bytes/Trans .....         4012
    Records/Trans ...            1
    Files/Trans .....            1

LPENTON.T2_IDX                              Partition 4
Total Data Bytes          19962835
  Avg Bytes/Record            2962
Delete                         352
Insert                        4000
PK FieldComp                  2386
Before Images                  352
After Images                  6386

We now know the daily total data bytes captured by table and the daily average bytes per record
captured. We can use the daily average bytes to compute per hour and per minute values as we did
above when determining table activity.
Step 3. Configuration Changes
Change Data Capture

Before making configuration changes to Capture, we want to make sure that the bottleneck is due to
database activity. Check the connectivity to the database to make sure there is not a network issue.
If the bottleneck is related to workload, then we can use the information gathered in step 2 to split the
database workload across two capture processes. Create a new Change Data Capture and divide the
workload evenly across the two processes.
Data Pump

If the bottleneck is not related to TCP or network configuration, create a new data pump and split the
workload evenly across the two processes. Multiple Data Pump processes may read from the same
EXTTRAIL; however, when doing so be sure to monitor disk activity to ensure contention at the file
level does not occur as that can impact both Capture and Data Pump.
Change Data Apply

Apply is where you'll see the most improvement by load balancing. Here we need to make some
decisions:
1. Is BATCHSQL enabled?
a. If not, activate BATCHSQL and check Apply performance.
2. Is the workload update or delete intensive?
a. If so, what is the primary key or unique index GoldenGate is using?
b. Is this key sufficient for accessing the row with a minimal number of I/Os?
i. If not, consider setting KEYCOLS in Apply or adding a new unique index to the table for GoldenGate access along with KEYCOLS.
3. What type of connectivity do we have to the target database?
a. Do we have enough bandwidth?

If none of the above apply, we can use the FILTER option for workload distribution across multiple
Apply processes. In Apply, FILTER is activated as part of the MAP statement and the syntax is MAP
<source table>, TARGET <target table>, FILTER (@RANGE (<n>, <n>));.
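For instance, two Apply processes could split a busy table as in the sketch below (table names illustrative); @RANGE hashes each row's key so the processes receive disjoint, roughly equal shares:
-- Replicat group 1 parameter file
MAP sales.orders, TARGET sales.orders, FILTER (@RANGE (1, 2));
-- Replicat group 2 parameter file
MAP sales.orders, TARGET sales.orders, FILTER (@RANGE (2, 2));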
Using the information gained in step 2, set up multiple Apply processes to distribute the workload for
our busiest tables. The number of Apply processes will depend upon the workload and type of activity.
When setting up multiple Apply processes to read from a single trail, monitor disk activity to ensure
contention at the file level does not occur.
WARNING: @RANGE cannot be used on tables where primary key updates may be executed. This
will cause Apply failures and/or database out of sync conditions.

Oracle GoldenGate for Linux, UNIX, and Windows
Updated August 2010

Copyright © 2010 Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and
the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any
other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of
merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no
contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or
transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective
owners.

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
oracle.com
