
IBM Information Management

IBM Informix on POWER7


Best Practices

A Technical White Paper

Contents
Introduction
What is an LPAR
PowerVM
Database workload
  Mission critical
  High throughput
  Low throughput
Simultaneous multithreading (SMT)
  Current SMT mode
  Enabling SMT
SMT1 vs SMT2 vs SMT4
  One core SMT testing
  Multiple core SMT testing
  Recommendation
Dedicated LPAR vs shared LPAR
  Dedicated LPAR
  Shared LPAR
    Capped or uncapped shared LPAR
    Virtual processors
    Processor folding
  Recommendation
Virtual I/O Server (VIOS) LPAR
  Recommendation
Additional LPAR recommendations
  Recommendation
Memory considerations
  RESIDENT parameter
  4 KB memory page size
  16 MB memory page size
  64 KB memory page size
  Recommendation

IBM Informix on POWER7 Best Practices

Feedback Directed Program Restructuring (FDPR) Tool
  Recommendation
I/O subsystem
  Read/Write access times
  KAIO/DIRECT_IO
  Queue depth
  AIO servers
  Recommendation
Network subsystem
  TCP traffic
  Local loopback
  Recommendation
Number of CPU virtual processors
  CPU-intensive workload
  I/O-intensive workload
  Recommendation
Affinity
  Recommendation
Understanding onstat -g glo
  Recommendation
Starting LPARs
  Recommendation
Appendix A: Recommendations summary
Appendix B: Useful commands
  amepat
  bosboot
  chdev
  ifconfig
  ioo
  iostat
  lparstat
  lsattr
  schedo
  smtctl
  vmo
  vmstat
Appendix C: Additional reading
  DeveloperWorks: AIX Virtual Processor Folding is misunderstood
  IBM Systems: Understanding Micro-Partitioning
  IBM Systems: Getting a handle on Entitled Capacity & Virtual Processors
  YouTube: Power7 Performance Entitlement, VPs, Affinity, Memory
  Feedback Directed Program Restructuring (FDPR)
  DeveloperWorks: VIOS Advisor
  IBM Redbooks Publication: IBM PowerVM Virtualization Managing and Monitoring
  IBM Redbooks Publication: AIX 5L Performance Tools Handbook
Appendix D: References

Introduction

This document describes best practices for using IBM Informix on AIX POWER7 processor-based servers. Topics of discussion include logical partitions (LPARs), dedicated and shared
resources, capped LPARs compared with uncapped configurations, and I/O configuration.
It is assumed that you have a working knowledge of Informix and are familiar with physical and
logical database design for Informix. You also need skills in Informix server
administration, and should be familiar with configuration and tuning of the Informix server.
You should have basic skills in working with LPARs and system administration on AIX
POWER7 systems.
What is an LPAR
An LPAR, short for logical partition, is the division of a computer's processors, memory, and
storage into smaller units. Each unit can run its own instance of the operating system and
applications. This concept was introduced with the POWER5 processor.
PowerVM
IBM PowerVM provides a secure, stable, and sophisticated virtualization environment for IBM
Power Systems™. A single physical server can be divided into multiple virtual servers, each
using anywhere from a fraction of a processor to all the processors on the physical machine.
POWER7 systems support up to 1000 LPARs on a single server.
Businesses can deploy an appropriate mix of LPARs to meet their needs, sharing resources
where applicable, or by using fully dedicated resources as needed. With PowerVM, you have
the flexibility of a heterogeneous environment with the LPARs running a combination of AIX and
Linux operating systems.
Through the use of virtualization, PowerVM can respond to business needs faster
by allocating resources dynamically. The Power architecture also provides simultaneous
multithreading (SMT), which allows increased throughput on your Power system.

[Figure: PowerVM partitioning on a single server. Dedicated partitions (2 CPU, 10 CPU, and 1 CPU) run on dedicated processors. Micro-partitions (0.5 to 2.5 CPU each) are grouped into Shared Process Pool 0 and Shared Process Pool 1, which draw on the physical shared-processor pool. All partitions run AIX on top of the Power Hypervisor.]

Database workload
Before we address best practices for Informix on the POWER7 architecture, you must
understand the type of workload that you expect to have. A mission critical database server that
runs a consistent workload with a large number of users has different requirements than a
database server that has a light workload where that work is sparse throughout the day.

[Figure: A primary database server running a 24 x 7 x 365 OLTP workload, month-end processing, batch loads, and sales reports, alongside a secondary database server used for business analytics.]


Mission critical
A mission critical database is one that your business relies on, where uptime and
performance are most important. This type of database server typically requires dedicated
resources, with service level agreements (SLAs) in place regarding throughput
(transactions per second) and uptime.
High throughput
If your system has a high throughput (transactions per second), it is important to understand
how and when that workload uses the Informix database server. Is the work a consistent
workload that uses the database server 24 hours a day, or does the work come in bursts or at
specific times throughout the day? For instance, you might have a database server that has a
consistent, high workload from 8:30 a.m. to 5:30 p.m. during the business day, and a different
database server that experiences a heavy workload from midnight to 4 a.m. Using a dedicated
system for each of these database servers wastes resources when one database server or the
other is idle.
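
The waste described above can be made concrete with a small calculation. The following sketch is illustrative only: the busy windows come from the example in the text, but the CPU demand figures (8 CPUs when busy, 0.5 CPU when idle) are hypothetical values chosen for the illustration.

```python
def demand_profile(busy_start, busy_end, busy_cpus, idle_cpus=0.5):
    """Return the CPU demand for each half-hour slot of a 24-hour day."""
    profile = []
    for slot in range(48):
        hour = slot / 2  # 0.0, 0.5, 1.0, ... 23.5
        busy = busy_start <= hour < busy_end
        profile.append(busy_cpus if busy else idle_cpus)
    return profile

# Hypothetical demand figures; only the time windows come from the text.
day_server = demand_profile(8.5, 17.5, busy_cpus=8)    # 8:30 a.m. to 5:30 p.m.
night_server = demand_profile(0.0, 4.0, busy_cpus=8)   # midnight to 4 a.m.

# Dedicated systems must each be sized for their own peak.
dedicated_cpus = max(day_server) + max(night_server)

# A shared pool only needs to cover the highest combined demand.
shared_peak = max(d + n for d, n in zip(day_server, night_server))

print(f"dedicated sizing: {dedicated_cpus} CPUs, shared peak: {shared_peak} CPUs")
```

Because the two busy windows never overlap, the combined peak stays close to the peak of a single server, which is the case where sharing resources pays off.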
Low throughput
Some database servers have a very low throughput on a regular basis, requiring very little
processing power and resources. Perhaps data is loaded into the database server once a day
or at scheduled intervals, and otherwise the database server is idle. The data might be used to
run end-of-day processing reports or possibly month-end processing reports. Placing a
database server like that onto a large dedicated server with many CPUs would be a waste of
processing power.
To configure a POWER7 system to handle the workload, it is important to know which database
servers will require more resources than others, and when the
workload will run on each database server. This information helps you decide how to
configure LPARs. For example, a dedicated LPAR might be a good choice for a mission critical
database server with a consistent, high throughput. However, a shared-resource LPAR might
be a good option for a database server where the workload occurs at a specific time of day or
night, and the server is idle the rest of the time.

Simultaneous multithreading (SMT)


Simultaneous multithreading is the ability of a single processor to simultaneously dispatch
instructions from more than one hardware thread context. The Power architecture uses SMT to
provide multiple streams of hardware execution, and the POWER7 processor can be configured
to run in SMT4, SMT2, or SMT1 (single-threaded mode). By using multiple SMT threads, a
workload can take advantage of more of the hardware features that are provided on the Power
system. POWER6 and POWER5 support SMT2 or SMT1.
IBM Informix has performed a series of benchmarks comparing SMT4 with SMT2 and SMT1.
The results of these tests fall in line with industry benchmarks on POWER7 and with SMT
testing.

[Figure: Three LPARs, each with 2 CPUs. In the SMT4 LPAR, each core exposes four logical CPUs (lcpu 0 through lcpu 7 across two cores). In the SMT2 LPAR, each core exposes two logical CPUs (lcpu 0 through lcpu 3). In the SMT1 LPAR, each core exposes one logical CPU (lcpu 0 and lcpu 1).]

Current SMT mode


To determine the current SMT mode, you can use one of the following AIX commands: lparstat,
amepat, smtctl.
# lparstat
System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513536MB

%user  %sys  %wait  %idle
-----  ----- ------ ------
  1.0    0.5    0.0   98.5

The preceding sample output from the lparstat command shows that SMT4 is being used, and
that there are 128 logical CPUs, which means that there are 32 physical CPUs (128 / 4).
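
The arithmetic behind that statement can be scripted. The following sketch parses the "System configuration" line shown in the sample and divides the logical CPU count by the SMT thread count. The function names are ours, and the sketch assumes that the smt field is numeric, as it is in the sample.

```python
def parse_system_configuration(line):
    """Split the 'System configuration:' line of lparstat into a dict."""
    _, _, fields = line.partition(":")
    return dict(field.split("=", 1) for field in fields.split())

def physical_cores(config):
    """Logical CPUs divided by SMT threads per core gives physical cores."""
    return int(config["lcpu"]) // int(config["smt"])

header = "System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513536MB"
config = parse_system_configuration(header)
print(physical_cores(config))
```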

# amepat
Command Invoked                : amepat

Date/Time of invocation        : Mon Sep 23 13:07:51 CDT 2013
Total Monitored time           : NA
Total Samples Collected        : NA

System Configuration:
---------------------
Partition Name                 : shake
Processor Implementation Mode  : POWER7
Number of Logical CPUs         : 128
Processor Entitled Capacity    : 32.00
Processor Max. Capacity        : 32.00
True Memory                    : 501.50 GB
SMT Threads                    : 4
Shared Processor Mode          : Disabled
Active Memory Sharing          : Disabled
Active Memory Expansion        : Disabled

..
# smtctl
This system is SMT capable.
This system supports up to 4 SMT threads per processor.
SMT is currently enabled.
SMT boot mode is not set.
SMT threads are bound to the same physical processor.

proc0 has 4 SMT threads.
Bind processor 0 is bound with proc0
Bind processor 1 is bound with proc0
Bind processor 2 is bound with proc0
Bind processor 3 is bound with proc0

proc4 has 4 SMT threads.
Bind processor 4 is bound with proc4
Bind processor 5 is bound with proc4
Bind processor 6 is bound with proc4
Bind processor 7 is bound with proc4

proc124 has 4 SMT threads.
Bind processor 124 is bound with proc124
Bind processor 125 is bound with proc124
Bind processor 126 is bound with proc124
Bind processor 127 is bound with proc124


Enabling SMT
Simultaneous multithreading is set at the LPAR level. An SMT setting for a particular LPAR will
not affect the settings for another LPAR.
SMT can be enabled or disabled with the following smtctl command.
smtctl -m {off|on}

To set the SMT threads to 4, the following command can be used. This command affects the
current LPAR only, and the change is immediate.
smtctl -t 4

By default, the SMT change does not persist after the LPAR is rebooted. For an SMT change to
persist after a reboot, the boot image must be remade with the bosboot command. See the full
man pages for the bosboot and smtctl commands.


SMT1 vs SMT2 vs SMT4

IBM Informix has performed extensive benchmark tests comparing results when using multiple
threads on POWER7. This type of testing is possible even when SMT4 is configured because
the 1st thread is used for a processor, and when it nears full consumption, the 2nd thread is
used, and so on. When all four threads are in use on a core, we see an increase in overall
throughput by approximately 60%. While the overall throughput increases, it is important to
note that single-thread response time does not scale linearly as more threads are used per
core.
One core SMT testing
The following graph shows transaction throughput for a single core when 1, 2, 3, and 4 threads
were used. The transactions per minute (TPM) increased when each additional thread was
used.
H/W threads    Throughput (TPM)    Diff %
1              32893.67
2              46267.33            +40.7%
3              51181.67            +10.6%
4              53884.00            + 5.3%
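
The percentages in the single-core results can be reproduced from the measured TPM values. The following sketch recomputes each incremental gain, the overall gain from one thread to four, and the average throughput per thread when all four threads are busy. The variable and function names are ours.

```python
# Measured single-core throughput (TPM) by number of hardware threads in use.
tpm = {1: 32893.67, 2: 46267.33, 3: 51181.67, 4: 53884.00}

def incremental_gain(threads):
    """Percent gain over the run that used one fewer thread."""
    return (tpm[threads] / tpm[threads - 1] - 1) * 100

# Gain of four threads over one, and each thread's share of that throughput.
overall_gain = (tpm[4] / tpm[1] - 1) * 100
per_thread_ratio = (tpm[4] / 4) / tpm[1]

for threads in (2, 3, 4):
    print(f"{threads} threads: +{incremental_gain(threads):.1f}%")
print(f"1 -> 4 threads: +{overall_gain:.1f}%")
print(f"per-thread throughput at 4 threads: {per_thread_ratio:.0%} of single-thread")
```

The overall gain works out to roughly 64%, in line with the approximately 60% quoted for these tests, while each individual thread delivers well under half of the single-thread throughput. That is the trade-off between total throughput and single-thread response time.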

Multiple core SMT testing


IBM Informix performed additional tests to measure throughput for SMT on multiple cores. The
following graph shows the transaction throughput of 1, 2, 3, and 4 threads on 1 through 8 cores.

Recommendation
If you are most concerned about overall throughput for your Informix server, use SMT4,
because using SMT4 can more fully utilize the core. While tests showed a 60% increase in
throughput, keep in mind that single-thread response time does not scale linearly as more SMT
threads are used. If you want to optimize for response time, you can start with SMT4 for the
increased throughput, but if you see single-thread response time suffer, move to SMT2.


Dedicated LPAR vs shared LPAR

Virtualized environments offer many choices for deployment, such as dedicated or non-dedicated processor cores and memory, and micro-partitioning, which uses a fraction of a physical
processor core. There are pros and cons to each approach, and it is important to understand
how the LPAR is to be used and its expected workload.
Dedicated LPAR
A dedicated LPAR is one that gets a specific set of resources. It will not grab additional
resources or release any of its resources. The following graphic shows three dedicated LPARs:
LPAR #1 has 2 CPUs, LPAR #2 has 10 CPUs, and LPAR #3 has 1 CPU.
One drawback of dedicated LPARs is that, if resources are over-allocated,
CPUs can sit idle in one LPAR while another LPAR
has used all of its resources and could benefit from more. For example, in
the following graphic, LPAR #1 might be running at full capacity with CPU utilization near 100%,
while LPAR #2 is running at 10% utilization. In that situation, LPAR #1 would benefit from using
resources that LPAR #2 is not using.
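
To put a number on the example above, the following sketch computes how much CPU capacity each dedicated LPAR leaves unused. The CPU counts and utilization figures are the ones from the example (2 CPUs at 100% and 10 CPUs at 10%). The function name is ours.

```python
def idle_cpus(assigned_cpus, utilization_pct):
    """CPUs' worth of capacity that a dedicated LPAR leaves unused."""
    return assigned_cpus * (100 - utilization_pct) / 100

# (assigned CPUs, utilization in percent) for the two LPARs in the example.
lpars = {"LPAR #1": (2, 100), "LPAR #2": (10, 10)}

for name, (cpus, util) in lpars.items():
    print(f"{name}: {idle_cpus(cpus, util):.1f} of {cpus} CPUs idle")
```

In this example LPAR #2 leaves the equivalent of 9 CPUs idle, none of which LPAR #1 can use while both partitions are dedicated.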

[Figure: The PowerVM server layout with dedicated partitions of 2, 10, and 1 CPUs on dedicated processors, and micro-partitions in Shared Process Pool 0 and Shared Process Pool 1 above the Power Hypervisor. LPAR #1 and LPAR #2 are labeled.]


Shared LPAR
A shared LPAR, sometimes referred to as a non-dedicated LPAR, is an LPAR that is assigned a
minimum set of resources, and that may use more resources from a shared pool, as needed, if
the additional resources are available. This method also has pros and cons.
For example, assume that you defined a set of shared LPARs as shown in the following graphic.
If LPAR #1 consumes 100% of its 2.5 CPUs, it can use additional resources from the shared
pool to allow increased throughput on that LPAR. However, if LPAR #2 pulls all the available
resources from the shared pool, and LPAR #1 becomes 100% consumed, LPAR #1 will not be
able to use additional resources, and its performance will suffer.

[Figure: The PowerVM server layout with dedicated partitions and micro-partitions in Shared Process Pool 0 and Shared Process Pool 1, drawing on the physical shared-processor pool above the Power Hypervisor. LPAR #1 (2.5 CPU) and LPAR #2 are labeled among the micro-partitions.]

The question is: Should you use shared LPARs or dedicated LPARs? IBM Informix tests show
that a properly configured shared LPAR, in ideal conditions, can perform nearly as well as a
dedicated LPAR (see the following graph). However, one of the benefits that you get with a
dedicated LPAR is that the LPAR is much easier to configure and monitor, and it will give you
consistent results. A shared LPAR has factors that are out of your control, and that might cause
variations in throughput results.


Capped or uncapped shared LPAR


For a capped shared LPAR, the entitlement capacity is the maximum number of cycles that can
be used. An example would be creating a capped shared LPAR with an entitlement capacity of
8 CPUs. That LPAR will not use more than 8 CPUs. If that LPAR uses only 2, the other 6 CPUs
can be used by other uncapped shared LPARs.
If an uncapped shared LPAR that has 8 CPUs entitled consumes all 8 CPUs, it can acquire
more resources from the shared pool, and use more than its entitled capacity. It can use up to
the number of online virtual processors that are defined for the LPAR.
There are obvious throughput benefits to using an uncapped shared LPAR that can access the
shared pool of processors. The LPAR must have enough virtual processors defined to take
advantage of the idle processors in the shared pool.
The following graph shows test results of two shared LPARs. One is a capped shared LPAR
with an entitlement of 8 CPUs. The other is an uncapped shared LPAR with an entitlement of 8
CPUs, and 16 virtual processors defined to take advantage of the 8 CPUs that are in the shared
processor pool.
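
The capped and uncapped limits described above can be summarized in a few lines. The following sketch is a simplified model and not how the hypervisor schedules work: it ignores capacity weights and competing LPARs, and the function name and the idle_in_pool parameter are introduced here for illustration.

```python
def max_consumable_cpus(entitled, online_vps, capped, idle_in_pool):
    """Upper bound on physical CPU consumption for a shared LPAR (simplified)."""
    if capped:
        # A capped LPAR never exceeds its entitled capacity.
        return entitled
    # An uncapped LPAR can borrow idle pool capacity, up to its online VPs.
    return min(online_vps, entitled + idle_in_pool)

# The two LPARs from the test: 8 CPUs entitled, 8 idle CPUs in the shared pool.
print(max_consumable_cpus(8, 8, capped=True, idle_in_pool=8))
print(max_consumable_cpus(8, 16, capped=False, idle_in_pool=8))
```

The last argument combination also shows why the number of virtual processors matters: an uncapped LPAR with only 8 virtual processors could not use the idle CPUs in the pool.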


If you are using a dedicated LPAR, the simplest way to test a shared LPAR is to change the
LPAR mode from dedicated to shared uncapped, and make the number of virtual processors
and the entitled capacity equal to the number of CPUs that were allocated to the dedicated
LPAR. This test provides the immediate advantage of allowing unused CPU cycles to be used
by the shared processor pool.
Virtual processors
Virtual processors are similar to CPUs from an AIX operating system standpoint. That is, a
virtual processor is a logical entity that is backed by physical processor cycles. The number
of online virtual processors dictates the absolute maximum CPU consumption that an LPAR can
achieve. If an LPAR has an entitlement of 2 CPUs, and you set up 4 virtual processors, the
LPAR could consume up to 4 physical processors, in which case it would report 200% CPU
utilization.
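
The 200% figure follows directly from the ratio of virtual processors to entitlement. A minimal sketch, with a function name of our own choosing:

```python
def max_utilization_pct(entitled_cpus, online_vps):
    """Highest utilization, relative to entitlement, that an LPAR can report."""
    return online_vps / entitled_cpus * 100

# The example from the text: 2 CPUs entitled, 4 online virtual processors.
print(f"{max_utilization_pct(2, 4):.0f}%")
```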
You can use the lparstat command to check the entitled capacity and the number of online virtual
CPUs, as well as other parameters for an LPAR. The following examples show three lparstat
command outputs. The first output is for a dedicated LPAR with 8 CPUs assigned to it. The
second output is for a capped shared LPAR with 8 CPUs entitled, and the third output is for the
same shared LPAR changed to uncapped, with 8 CPUs entitled and 16 virtual
processors.


Use the following command:

# lparstat -i

Dedicated LPAR (lparstat -i)

Node Name                        : v1009h01
Partition Name                   : v1009h01-b3p019-informix
Partition Number                 : 4
Type                             : Dedicated-SMT-4
Mode                             : Capped
Entitled Capacity                : 8.00
Partition Group-ID               : 32772
Shared Pool ID                   : -
Online Virtual CPUs              : 8
Maximum Virtual CPUs             : 12
Minimum Virtual CPUs             : 1
Online Memory                    : 32768 MB
Maximum Memory                   : 49152 MB
Minimum Memory                   : 16384 MB
Variable Capacity Weight         : -
Minimum Capacity                 : 1.00
Maximum Capacity                 : 12.00
Capacity Increment               : 1.00
Maximum Physical CPUs in system  : 128
Active Physical CPUs in system   : 128
Active CPUs in Pool              : -
Shared Physical CPUs in system   : 0
Maximum Capacity of Pool         : 0
Entitled Capacity of Pool        : 0
Unallocated Capacity             : -
Physical CPU Percentage          : 100.00%
Unallocated Weight               : -
Memory Mode                      : Dedicated

Capped Shared LPAR (lparstat -i)

Node Name                        : v1009h02
Partition Name                   : v1009h02-b3p019-informix
Partition Number                 : 6
Type                             : Shared-SMT-4
Mode                             : Capped
Entitled Capacity                : 8.00
Partition Group-ID               : 32774
Shared Pool ID                   : 1
Online Virtual CPUs              : 8
Maximum Virtual CPUs             : 12
Minimum Virtual CPUs             : 1
Online Memory                    : 32768 MB
Maximum Memory                   : 49152 MB
Minimum Memory                   : 16384 MB
Variable Capacity Weight         : 0
Minimum Capacity                 : 0.10
Maximum Capacity                 : 12.00
Capacity Increment               : 0.01
Maximum Physical CPUs in system  : 128
Active Physical CPUs in system   : 128
Active CPUs in Pool              : 16
Shared Physical CPUs in system   : 32
Maximum Capacity of Pool         : 1600
Entitled Capacity of Pool        : 1600
Unallocated Capacity             : 0.00
Physical CPU Percentage          : 100.00%
Unallocated Weight               : 0
Memory Mode                      : Dedicated

Uncapped Shared LPAR (lparstat -i)

Node Name                        : v1009h02
Partition Name                   : v1009h02-b3p019-informix
Partition Number                 : 6
Type                             : Shared-SMT-4
Mode                             : Uncapped
Entitled Capacity                : 8.00
Partition Group-ID               : 32774
Shared Pool ID                   : 1
Online Virtual CPUs              : 16
Maximum Virtual CPUs             : 16
Minimum Virtual CPUs             : 1
Online Memory                    : 32768 MB
Maximum Memory                   : 49152 MB
Minimum Memory                   : 16384 MB
Variable Capacity Weight         : 128
Minimum Capacity                 : 0.10
Maximum Capacity                 : 16.00
Capacity Increment               : 0.01
Maximum Physical CPUs in system  : 128
Active Physical CPUs in system   : 128
Active CPUs in Pool              : 20
Shared Physical CPUs in system   : 41
Maximum Capacity of Pool         : 2000
Entitled Capacity of Pool        : 1600
Unallocated Capacity             : 0.00
Physical CPU Percentage          : 50.00%
Unallocated Weight               : 0
Memory Mode                      : Dedicated
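
When you compare several LPARs, it can help to read these settings programmatically. The following sketch assumes that the lparstat output has been captured as text in "Key : value" form, and uses a few lines from the uncapped shared LPAR as sample input. The function name and the headroom calculation are ours.

```python
def parse_lparstat_i(text):
    """Turn 'Key : value' lines into a dict, skipping lines without a colon."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

sample = """\
Type                     : Shared-SMT-4
Mode                     : Uncapped
Entitled Capacity        : 8.00
Online Virtual CPUs      : 16
Variable Capacity Weight : 128
"""

settings = parse_lparstat_i(sample)
entitled = float(settings["Entitled Capacity"])
online_vps = int(settings["Online Virtual CPUs"])

# CPUs beyond the entitlement that this LPAR could use if the pool has them.
headroom = online_vps - entitled
print(f"{settings['Mode']}: entitled {entitled}, headroom {headroom}")
```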

Processor folding
Processor folding is a method of turning off unused virtual processors so that they are not
scheduled to run and consume CPU cycles. If an LPAR has 8 CPUs entitled and 10 virtual
processors, but the LPAR only requires 2.5 CPUs for the current workload, it will run on just 3
CPUs. The other 7 virtual processors are folded away, and when the workload dictates, the
virtual processors are used again (unfolded).
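
The folding example above can be expressed as a short calculation. The following sketch is a simplified model of the outcome, not of the AIX scheduler itself: it rounds the current CPU demand up to whole virtual processors and treats the rest as folded. The function name is ours.

```python
import math

def folding_state(online_vps, cpu_demand):
    """Return (unfolded, folded) virtual processor counts for a given demand."""
    # At least one virtual processor stays unfolded, and never more than online.
    unfolded = min(online_vps, max(1, math.ceil(cpu_demand)))
    return unfolded, online_vps - unfolded

# The example from the text: 10 virtual processors, 2.5 CPUs of demand.
unfolded, folded = folding_state(10, 2.5)
print(f"{unfolded} unfolded, {folded} folded")
```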
To determine if processor folding is enabled, use the schedo command, and look for the current
setting of the vpm_fold_policy parameter.


# schedo -L

NAME                      CUR    DEF    BOOT   MIN    MAX    UNIT           TYPE
     DEPENDENCIES
--------------------------------------------------------------------------------
. . . .
--------------------------------------------------------------------------------
vpm_fold_policy           1      1      1      0      15                    D
--------------------------------------------------------------------------------
vpm_xvcpus                0      0      0      -1     2G-1   processors     D
--------------------------------------------------------------------------------

We see in the output that the vpm_fold_policy current value is set to 1. This is a bitmask
setting, so a value of 1 indicates that folding is enabled if the LPAR is using shared processors.
See the AIX documentation for all possible settings for this parameter.
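
Because the parameter is a bitmask, a small decoder makes a given value easier to read. In the following sketch, only the meaning of bit value 1 (folding on shared-processor LPARs) is stated in this document. The meanings assigned to bit values 2, 4, and 8 are our assumptions, recalled from the AIX schedo documentation, and should be verified there before you rely on them.

```python
# Bit value 1 is described in this document. Bit values 2, 4, and 8 are
# assumptions recalled from the AIX documentation; verify before relying on them.
FOLD_SHARED = 0x1                    # folding enabled on shared-processor LPARs
FOLD_DEDICATED = 0x2                 # assumed: folding enabled on dedicated LPARs
NO_AUTO_FOLD_IN_POWER_SAVING = 0x4   # assumed: no automatic folding in power saving mode
IGNORE_AFFINITY = 0x8                # assumed: ignore affinity in folding decisions

def describe_fold_policy(value):
    """Decode a vpm_fold_policy value into named boolean flags."""
    return {
        "folding_on_shared": bool(value & FOLD_SHARED),
        "folding_on_dedicated": bool(value & FOLD_DEDICATED),
        "no_auto_fold_in_power_saving": bool(value & NO_AUTO_FOLD_IN_POWER_SAVING),
        "ignore_affinity": bool(value & IGNORE_AFFINITY),
    }

for value in (1, 4):
    print(value, describe_fold_policy(value))
```

Under these assumptions, a value of 4 leaves both folding bits clear and also prevents folding from being switched on automatically, which is consistent with using that value to disable folding in all cases.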
To disable processor folding regardless of the type of LPAR or power saving mode, set the
vpm_fold_policy to 4 as shown in the following example.
Example: schedo -p -o vpm_fold_policy=4
The vpm_xvcpus parameter is used to determine the number of extra virtual processors to
unfold when the system determines it needs to unfold a processor. For example, when the
operating system needs to unfold a processor, if vpm_xvcpus is set to 3, the operating system
unfolds 4 virtual processors.
Example: schedo -p -o vpm_xvcpus=3
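
To see why unfolding several virtual processors at once can help, consider how many separate unfold operations a rising workload triggers. The following sketch is a simplified model based only on the description above, where each unfold operation brings 1 + vpm_xvcpus virtual processors online. It is not a description of the AIX scheduler, and the function name is ours.

```python
import math

def unfold_operations(currently_unfolded, needed, vpm_xvcpus):
    """Number of unfold operations required to reach the needed VP count."""
    step = 1 + vpm_xvcpus  # virtual processors brought online per operation
    missing = max(0, needed - currently_unfolded)
    return math.ceil(missing / step)

# Workload grows from 3 busy virtual processors to 9.
print(unfold_operations(3, 9, vpm_xvcpus=0))
print(unfold_operations(3, 9, vpm_xvcpus=3))
```

In this model, a setting of 3 reduces six unfold operations to two for the same increase in workload.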

Recommendation
If the workload requires consistent performance with stringent latency requirements, it
is best deployed on a dedicated partition rather than on a shared LPAR. In IBM
Informix tests, a dedicated LPAR provided the most consistent performance.
The exception to using a dedicated LPAR would be when the shared processor pool is neither over-committed nor over-utilized.
Use processor folding for more efficient use of the cores. However, disable folding if you see a
problem with processor folding, or if you see an excessive amount of folding. Processor folding
is dynamically configurable, so test it in peak-load and low-load scenarios. If you choose to use
processor folding, set the vpm_xvcpus parameter to 3. That setting helps avoid any penalties
from unfolding one virtual processor at a time.


Virtual I/O Server (VIOS) LPAR

A Virtual I/O Server (VIOS) is a special LPAR that has additional software installed for the
purpose of managing the I/O for other LPARs. Instead of the individual network and disk
resources being carved out on an LPAR by LPAR basis, the VIOS manages the disk and
network resources on behalf of the other LPARs. The size of the VIOS is important.

Recommendation
The VIOS must be a dedicated LPAR, and it is recommended that you disable processor folding
for the VIOS. The size of the VIOS is important. Refer to the AIX documentation for proper
configuration requirements. There is also a VIOS advisor that you can use, which can provide
recommendations for your VIOS configuration.

http://www.ibm.com/developerworks/wikis/display/WikiPtype/VIOS+Advisor


Additional LPAR recommendations

Shared resource LPARs are very dynamic in nature. There are various performance tools that
can be used to help improve allocation and placement of resources on the physical machine
and within the LPARs.
The Dynamic Platform Optimizer (DPO) is a PowerVM feature that you can use to improve
partition memory and processor affinity across the logical partitions in a Power Server. DPO is
a feature that can help you reap performance gains for the IBM Informix server.
Active System Optimizer (ASO) is a subsystem that is designed to automatically improve the
performance of AIX workloads running on POWER7. Dynamic System Optimizer (DSO) is built
on the ASO framework and provides additional optimizations.

Recommendation
We recommend that the System Administrator work with AIX and use DPO and ASO/DSO to
optimize workloads for Informix. See the following IBM Redbooks publication, IBM PowerVM
Virtualization Managing and Monitoring, for details.

http://www.redbooks.ibm.com/redpieces/abstracts/sg247590.html


Memory considerations

Using larger virtual memory page sizes for an application's memory can significantly improve the
application's performance and throughput. The improvement stems from a reduction in
Translation Lookaside Buffer (TLB) misses, because each TLB entry can map a larger virtual
memory range. Starting with the POWER4 processor, support for 16 MB large pages was
introduced in addition to the default 4 KB pages. To use large pages on hardware where
multiple page sizes are supported, run AIX 5L Version 5.3 updated with the 5300-04
Maintenance Package (or later).
Starting with version 11.50.xC4, Informix supports 16 MB pages. AIX does not automatically
configure large pages in the environment. The system administrator must configure AIX to
use these page sizes, and must specify the number of pages to be reserved. The number of
configured large pages will not be automatically changed by the operating system based on
demand.
This section also looks at 64 KB pages, which are dynamically allocated by the operating system
on an as-needed basis, making them simpler to use than the 16 MB large page size. (Starting with
POWER5+ hardware, huge 16 GB pages are also supported.)
IBM Informix performed tests to compare the results of 4 KB page sizes, 64 KB page sizes, and
16 MB page sizes. The results of these tests are discussed later in this section.
RESIDENT parameter
The RESIDENT parameter in the Informix configuration file ($ONCONFIG) needs to be
considered with respect to memory considerations on AIX. For reference, here are the
comments from the onconfig.std file.
###################################################################
# Shared Memory Configuration Parameters
###################################################################
# RESIDENT   - Controls whether shared memory is resident.
#              Acceptable values are:
#              0 off (default)
#              1 lock the resident segment only
#              n lock the resident segment and the next n-1
#                virtual segments, where n < 100
#              -1 lock all resident and virtual segments

On AIX systems with a large amount of pinned resident memory, Informix might experience KAIO
read or write failures with errno 22 (EINVAL) when it uses kernel asynchronous I/O (KAIO) or
direct I/O.


Example:
04:30:40  KAIO: error in kaio_WRITE, kaiocbp = 0x22b620d0, errno= 22
04:30:40  fildes = 258 (gfd 3), buf = 0x700000122b64000, nbytes= 4096, offset = 130785280

Setting the RESIDENT configuration parameter to -1 is not recommended on AIX. As Informix
allocates more memory, the database server attempts to pin that memory, and that results in a
higher likelihood of seeing an error. With Informix 11.70.FC6 and later, a warning message is
displayed in the database server message log (online.log) if resident memory is used with KAIO
or direct I/O.
4 KB memory page size
The default page size on AIX is 4 KB. The results of the IBM Informix tests with page sizes of
64 KB and 16 MB are compared against that default.
16 MB memory page size
Before Informix can start to use large pages, the pages must be allocated by the System
Administrator. The following command example allocates 3072 large pages.
vmo -p -o lgpg_regions=3072 -o lgpg_size=16777216
vmo -p -o v_pinshm=1

This command can take a while to process. Monitor the number of allocated 16 MB pages with
the vmstat command. In the following output, there are 0 (avm) 16 MB pages active.
vmstat -P ALL 5
System configuration: mem=513536MB

pgsz            memory                           page
----- -------------------------- ------------------------------------
           siz      avm      fre    re    pi    po    fr    sr    cy
   4K 54844752  9497936   799539     0     0     0     0     0     0
  64K   594475   526063    68412     0     0     0     0     0     0
  16M    16384        0    16384     0     0     0     0     0     0
   4K 54844752  9497948   789981     0     0     0     0     0     0
  64K   594475   526062    68413     0     0     0     0     0     0
  16M    16384        0    16384     0     0     0     0     0     0


After the 16 MB pages are allocated, you must bring the Informix server offline, set the
IFX_LARGE_PAGES environment variable, and then bring the instance back online.
export IFX_LARGE_PAGES=1


The following data is from a test performed by IBM Informix comparing 4 KB AIX page size with
16 MB AIX page size. The 4 KB test resulted in 255,491 TPM. The 16 MB test resulted in
270,254 TPM. The use of 16 MB page size produced a 5.77% gain in performance.

64 KB memory page size


Starting with POWER5+ hardware, there is also support for 64 KB page sizes. The 64 KB pages
are dynamically allocated by the operating system on an as-needed basis, making the use of 64
KB pages simpler because no pre-allocation has to occur.
Take the following steps to enable 64 KB page sizes for Informix.
1. Bring the Informix instance offline.
2. Set the LDR_CNTRL environment variable.
export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K@SHMPSIZE=64K

3. Bring the Informix instance online.
4. Unset the LDR_CNTRL environment variable.
unset LDR_CNTRL

The reason for unsetting the LDR_CNTRL environment variable is to avoid the unintended use
of 64 KB pages for applications that might start from the same terminal.


The following data is from a test performed by IBM Informix comparing 4 KB AIX page size with
64 KB AIX page size. The 4 KB test resulted in 255,491 TPM, and the 64 KB test resulted in
271,052 TPM. The use of 64 KB page size produced a 6.09% gain in performance.

The 64 KB test results show a better performance gain than the test results for 16 MB large
pages. The benefits of the 16 MB large pages can become more evident as the size of the
database and memory usage grows for that database.
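The percentage gains quoted for the two page-size tests can be rechecked from the TPM figures. The following Python sketch is only an illustration; the function name percent_gain is made up for this example and is not part of any Informix or AIX tooling. The computed values agree with the quoted 5.77% and 6.09% to within rounding.

```python
def percent_gain(baseline_tpm, new_tpm):
    """Return the relative gain of new_tpm over baseline_tpm, in percent."""
    if baseline_tpm <= 0:
        raise ValueError("baseline_tpm must be positive")
    return (new_tpm - baseline_tpm) / baseline_tpm * 100.0


# TPM figures taken from the page-size tests described in this section.
BASELINE_4K_TPM = 255_491
TPM_BY_PAGE_SIZE = {"16 MB": 270_254, "64 KB": 271_052}

for page_size, tpm in TPM_BY_PAGE_SIZE.items():
    gain = percent_gain(BASELINE_4K_TPM, tpm)
    print(f"{page_size} pages: {gain:.2f}% gain over 4 KB pages")
```

The same helper can be applied to your own benchmark runs to compare page-size settings on an equal footing.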

Recommendation
IBM Informix recommends using 64 KB page sizes. The simplicity of use, the dynamic nature,
and results that are similar to those of 16 MB large pages drive this recommendation. In a very
large database the larger 16 MB page sizes might produce better performance gains, but this
needs to be tested on an individual basis.
When using KAIO or direct I/O, do not set the RESIDENT configuration parameter to -1. Set it to
0. Setting it to 1 or 2 might also be in order, but the System Administrator will need to monitor
the pinned memory to make sure that it does not exceed 80% of the physical memory on the
computer or LPAR.
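The 80% guideline for pinned memory is a simple ratio check. The following Python sketch shows one way to express it; the function names and the sample figures are assumptions for illustration only, and the way the pinned total is collected is left to the System Administrator's own monitoring tools.

```python
PINNED_LIMIT = 0.80  # ceiling from the recommendation: 80% of physical memory


def pinned_fraction(pinned_mb, physical_mb):
    """Return pinned memory as a fraction of physical memory."""
    if physical_mb <= 0:
        raise ValueError("physical_mb must be positive")
    return pinned_mb / physical_mb


def within_pinned_limit(pinned_mb, physical_mb, limit=PINNED_LIMIT):
    """True when pinned memory stays at or below the given limit."""
    return pinned_fraction(pinned_mb, physical_mb) <= limit


# Hypothetical sample: 3072 pinned 16 MB pages on an LPAR with 513536 MB.
pinned_mb = 3072 * 16
physical_mb = 513_536
print(f"pinned: {pinned_fraction(pinned_mb, physical_mb):.1%}")
print("within limit" if within_pinned_limit(pinned_mb, physical_mb) else "over limit")
```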


Feedback Directed Program Restructuring (FDPR) Tool

The FDPR tool for AIX is included with AIX 5L operating system V5.2 and later. FDPR is used
as a post-link utility for improving the performance of binaries that were compiled on the Power
family platform. It optimizes the binary to achieve a better i-cache hit/miss ratio, reduces the
number of branches, and reduces TLB misses and page faults. This tool is useful for very large
programs or those that use dynamically linked libraries.
From the man page:
The fdpr command (Feedback Directed Program Restructuring) is a performance-tuning
utility that may help improve the execution time and the real memory utilization of
user-level application programs. The fdpr program optimizes the executable image of a
program by collecting information on the behavior of the program while the program is
used for some typical workload, and then creating a new version of the program that is
optimized for that workload. The new program generated by fdpr typically runs faster and
uses less real memory. Attention: The fdpr command applies advanced optimization
techniques to a program which may result in programs that do not behave as expected;
programs which are optimized using this tool should be used with due caution and
should be rigorously retested with, at a minimum, the same test suite used to test the
original program in order to verify expected functionality. The optimized program is not
supported.
The following steps outline how to determine if the FDPR tool can optimize the oninit
executable program that runs Informix. These steps do not describe the complete optimization
process; they only outline steps that you can use to test the optimization.

1. Create a script to set the correct environment, run the oninit command, and run the
workload. FDPR expects to find the executable not running and in fact replaces it with an
instrumented version before startup. Because oninit is a setuid executable, and the SUID
information is not in the replacement executable, you must set the SUID and correct owner
mask before the script starts oninit:
chown root:informix ${INFORMIXDIR}/bin/oninit
chmod 6755 ${INFORMIXDIR}/bin/oninit

2. Make sure that Informix is fully configured for the benchmark, and that the data is already
loaded. The script should execute only the workload portion of the benchmark.
3. Keep in mind that immediately after Informix starts, no data is cached, and so I/O activity is
higher than normal. Make sure that the run time is adjusted so that FDPR can see the product
performing for a considerable amount of time after the cache is warmed up.


4. Run FDPR in the same directory where the oninit executable file resides, or else FDPR
could experience problems renaming or re-linking the executable. If the workload script uses
external drivers to run the benchmark, make sure that the drivers are in the execution path, and
that all output goes to the specified location.
Next, run the following command, where "/tpcc/fdpr-workload" is the script that loads the
RDBMS:
( cd $INFORMIXDIR/bin ; timex fdpr -p oninit -x /tpcc/fdpr-workload )

5. Some versions of FDPR might lose information about the optimization level that is used
during product build. The workaround for this is simple: Pass the optimization level to FDPR on
the command line (see man page for details).

The following results compare the original Informix binary (baseline) with the oninit binary
after it had been optimized by FDPR. Points were selected from two sets (baseline, optimized)
where throughput was measured as a function of the number of user terminals. Both the
baseline set and the one for the FDPR-optimized binary were collected in a configuration that
used 64 KB pages. Maximum throughput in both sets was achieved with 32 active terminals.
The following data is from IBM Informix test results using 64 KB page size and testing the
results on non-optimized versus FDPR optimized binary. The non-optimized binary produced
270,526 TPM and the FDPR optimized binary produced throughput of over 300,000 TPM. This
amounted to a 13% increase in performance.

Figure: Using FDPR. Throughput (tpm) for the baseline and FDPR-optimized oninit binaries; the
vertical axis runs from 0 to 325,000 tpm.


Recommendation
IBM Informix recommends testing FDPR in a non-production environment using a production
load. Using FDPR can reap performance gains, but the tool should be tested thoroughly
before being used in a production environment.

For more information, see the following two documents: Feedback Directed Program
Restructuring (FDPR) and AIX 5L Performance Tools Handbook (Redbook).
https://www.research.ibm.com/haifa/projects/systems/cot/fdpr/
http://www.redbooks.ibm.com/abstracts/sg246039.html


I/O subsystem

The I/O subsystem is a key factor in a well-performing database server. A properly configured
I/O subsystem will allow maximum I/O throughput by the database server. A poorly configured
I/O subsystem can have major negative impacts on the database server. It might or might not
be obvious as to where a problem resides in a poorly performing database server.
Read/Write access times
A good read-and-write access time is determined in part by the technology of the storage that is
being used. A typical I/O should take from 0 milliseconds (ms) to 15 ms. I/Os that take longer
than 15 ms might indicate a problem with the I/O system or a busy device. When a solid-state
drive (SSD) is used, the I/O will typically be less than 2-3 ms. Again, I/Os taking longer might
indicate a problem or a busy device. SSD might even produce results less than 1 ms, moving
into the microsecond range.
You can monitor the access times from Informix or at the operating system level.
In Informix, you would review the onstat -g iof data and the onstat -g ioh data. The onstat
-g iof data will show the chunks and a summary of the response times since the instance has
been online. In the following output, we see the average read service time for KAIO is 9.8 ms.
onstat -g iof
IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 17 days 23:09:01 -- 590776 Kbytes

AIO global files:
gfd pathname     bytes read   page reads   bytes write   page writes   io/s
3   rootdbs.1    6187008      3021         4007526400    1956800       823.1
        op type       count     avg. time
        seeks         0         N/A
        reads         0         N/A
        writes        0         N/A
        kaio_reads    2360      0.0098
        kaio_writes   212009    0.0011
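The average times in this output are reported in seconds, so they need to be converted before they are compared with the millisecond guidelines given earlier. The following Python sketch shows that conversion; the helper names are hypothetical, and the 15 ms ceiling is the rule of thumb from the text for conventional disks, not a hard limit.

```python
DISK_LIMIT_MS = 15.0  # rule-of-thumb ceiling for a typical I/O, from the text


def to_ms(seconds):
    """Convert an onstat average time in seconds to milliseconds."""
    return seconds * 1000.0


def is_slow(avg_seconds, limit_ms=DISK_LIMIT_MS):
    """True when the average service time exceeds the chosen limit."""
    return to_ms(avg_seconds) > limit_ms


# Average times copied from the onstat -g iof output above.
samples = {"kaio_reads": 0.0098, "kaio_writes": 0.0011}

for op_type, avg_seconds in samples.items():
    status = "investigate" if is_slow(avg_seconds) else "ok"
    print(f"{op_type}: {to_ms(avg_seconds):.1f} ms ({status})")
```

For SSD-backed chunks, a lower limit such as 3 ms can be passed as limit_ms.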

Informix also provides this output with a historical viewpoint going back one hour. This output is
a better way to monitor I/O because it summarizes the data for the past hour on a per minute
basis.
onstat -g ioh
IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 00:03:56 -- 525240 Kbytes

AIO global files:
gfd pathname     bytes read   page reads   bytes write   page writes   io/s
3   rootdbs.1    1073152      524          386072576     188512        457.1

                        avg read                    avg write
    time       reads    io/s    op time    writes   io/s    op time
    13:54:14   6        0.1     0.02417    1772     29.5    0.00185
    13:53:14   47       0.8     0.08274    2010     33.5    0.00141
    13:52:14   216      3.6     0.00214    378      6.3     0.00097

From the operating system standpoint, you can use the iostat command to monitor I/O
throughput. The following iostat command takes two samples 5 seconds apart.
Example: iostat -D 5 2
hdisk79   xfer:   %tm_act   bps       tps       bread     bwrtn
                  13.3      1.6K      0.4       1.6K      0.0
          read:   rps       avgserv   minserv   maxserv   timeouts   fails
                  0.4       331.6     27.0      836.7     0          0
          write:  wps       avgserv   minserv   maxserv   timeouts   fails
                  0.0       0.0       0.0       0.0       0          0
          queue:  avgtime   mintime   maxtime   avgwqsz   avgsqsz    sqfull
                  0.0       0.0       0.0       0.0       0.0        0.0

KAIO/DIRECT_IO
Kernel Asynchronous I/O (KAIO) is enabled by default and will be used for raw disk space. It
provides performance gains over regular I/O. AIX also supports direct I/O and concurrent I/O for
file system access. Informix supports those types of file system access with the DIRECT_IO
configuration parameter.
AIX only supports concurrent I/O on JFS2 file systems. Direct I/O is similar to using KAIO for a
file system. Concurrent I/O adds functionality by avoiding unnecessary write serialization. For
reference, here are the comments for the DIRECT_IO configuration parameter from the
onconfig.std file.
# DIRECT_IO   - Specifies whether direct I/O is used for cooked
#               files used for dbspace chunks.
#               Acceptable values are:
#               0 Disable
#               1 Enable direct I/O
#               2 Enable concurrent I/O

To determine what type of I/O is being used, review the following output. To check on basic
KAIO, run the onstat -g ath command, and look for kaio threads. For example, the following
output shows 1 kaio thread.
onstat -g ath|grep kaio
18   60f0a360   IO Idle   1cpu*   kaio

To check for direct I/O or concurrent I/O, run the onstat -d command and look at the flags listed
in column 5 for each chunk. A value of D represents direct I/O and C represents concurrent
I/O.
onstat -d
IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 17 days 22:55:45 -- 590776 Kbytes

Chunks
address    chunk/dbs   offset   size      free      bpages   flags    pathname
5ff081d0   1     1              5000000   4928731            PO-B-D   /chunks/IDSPERF/rootdbs.1

To disable KAIO completely, you can use the KAIOOFF environment variable. Prior to bringing
the Informix instance online, set KAIOOFF to 1.
export KAIOOFF=1

Queue depth
From an application standpoint (database server), the length of time to do an I/O equals the
time to service the I/O plus the time that the I/O waits in the hard disk (hdisk) wait queue. Each
hdisk has an associated queue depth setting and, if this setting is poorly configured, it can have
negative impacts on I/O throughput. Use the lsattr command to check the current setting for a
device.
lsattr -El hdisk6 |grep queue
queue_depth   16   Queue DEPTH   True

The faster the drive, the more I/O operations per second (IOPS) the disk can handle. The
maximum throughput is limited by the queue depth divided by the average I/O service time. For
example, a queue depth of 3 and an average I/O service time of 10 ms yield a maximum
throughput of 300 IOPS.
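The relationship between queue depth, service time, and maximum throughput is a single division. The following Python sketch reproduces the example from the text; the function name max_iops is made up for this illustration.

```python
def max_iops(queue_depth, avg_service_ms):
    """Upper bound on IOPS for one hdisk: queue depth / average service time."""
    if avg_service_ms <= 0:
        raise ValueError("avg_service_ms must be positive")
    return queue_depth / (avg_service_ms / 1000.0)


# The example from the text: queue depth 3, 10 ms average service time.
print(f"{max_iops(3, 10.0):.0f} IOPS")

# The same service time with the minimum queue depth of 16 recommended later.
print(f"{max_iops(16, 10.0):.0f} IOPS")
```

The calculation shows why a slow device with a shallow queue caps throughput quickly: at a fixed service time, the bound grows only as fast as the queue depth does.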
You can use the iostat -D command to monitor the service times as well as the queue times. If
you start to see time spent waiting in the queue, you might want to increase the queue depth for
a specific device.
In the following iostat output, we see that we are spending an average of 3 ms in the queue,
and had a max of 9 ms wait time in the queue.


iostat -D 2 2
System configuration: lcpu=128 drives=9 paths=8 vdisks=0

hdisk6    xfer:   %tm_act   bps       tps       bread     bwrtn
                  63.5      5.0M      236.5     0.0       5.0M
          read:   rps       avgserv   minserv   maxserv   timeouts   fails
                  0.0       8.0       0.0       18.0      0          0
          write:  wps       avgserv   minserv   maxserv   timeouts   fails
                  236.5     5.3       1.1       71.0      0          0
          queue:  avgtime   mintime   maxtime   avgwqsz   avgsqsz    sqfull
                  3.0       0.0       9.0       0.0       1.0        12.0
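As stated at the start of this section, the I/O time seen by the database server is the service time plus the time spent in the hdisk wait queue. The following Python sketch applies that sum to the write statistics in the output above; the function names are hypothetical, and treating the averages as additive is a simplification for illustration.

```python
def observed_io_ms(avg_service_ms, avg_queue_wait_ms):
    """Approximate I/O time seen by the application: service plus queue wait."""
    return avg_service_ms + avg_queue_wait_ms


def queue_share(avg_service_ms, avg_queue_wait_ms):
    """Fraction of the observed I/O time that is spent waiting in the queue."""
    total = observed_io_ms(avg_service_ms, avg_queue_wait_ms)
    return avg_queue_wait_ms / total if total > 0 else 0.0


# Values from the hdisk6 sample: write avgserv 5.3 ms, queue avgtime 3.0 ms.
write_avgserv_ms = 5.3
queue_avgtime_ms = 3.0

total_ms = observed_io_ms(write_avgserv_ms, queue_avgtime_ms)
share = queue_share(write_avgserv_ms, queue_avgtime_ms)
print(f"observed write time: {total_ms:.1f} ms ({share:.0%} spent in the queue)")
```

A large queue share suggests that the device itself is responding well and that a deeper queue, rather than faster storage, is the first thing to test.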

To change the queue depth, use the chdev command.
Example:
chdev -l hdisk66 -a queue_depth=32

For more information regarding queue depth, monitoring, and configuration see the following
article: AIX Disk Queue Depth Tuning for Performance.
https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105745
AIO servers
AIO is an AIX software subsystem that allows processes to issue I/O operations without waiting
for I/O to complete. This feature is particularly important in a database environment.
A kernel process (kproc), called an AIO server (AIOS), is in charge of each request from the
time that the request is taken off the queue until it completes. The number of servers limits the
number of disk I/O operations that can be in progress in the system simultaneously. The default
value of minservers is 3 and maxservers is 30. When more than 3 servers are needed, they will
automatically be allocated, up to the maxservers value.
There is also an aio_server_inactivity tunable parameter that indicates the duration of inactivity
before the inactive AIO servers are stopped. The stopping of these AIO servers can occur
down to the minservers value. The default value for aio_server_inactivity is 300 seconds.
To view the current settings use the ioo command.
# ioo -a |grep aio_
aio_maxreqs = 8192
aio_maxservers = 30
aio_minservers = 3
aio_server_inactivity = 300

These defaults with AIX 6.1 can cause slow I/O, and the issue might not be easily identifiable.
With these low defaults, a situation can occur where more than 3 AIO servers are needed to
process a peak load of I/O activity. For example, during low I/O activity, the AIO servers can be
taken down to the minimum of 3. But when a checkpoint occurs, more AIO servers are
required. This situation can generate extra overhead each time the peak load of I/O activity
occurs, because starting the AIO servers incurs an extra cost.
To change these parameters use the ioo command.
Example:
ioo -p -o aio_maxreqs=65536 -o aio_minservers=100 -o aio_server_inactivity=86400

Recommendation
IBM Informix recommends the use of KAIO for raw devices. For file system chunks, set the
DIRECT_IO configuration parameter to 2. This setting permits direct I/O on file system chunks
and allows for concurrent I/O on JFS2 file systems.
Informix also recommends monitoring the queue depth, with a minimum setting of 16. If
monitoring shows wait times in the queue, increase the queue depth accordingly.
Set the aio_minservers and aio_maxservers parameters to 100. Set the aio_server_inactivity
parameter to 86400, which represents a 24-hour period of inactivity before any extra servers are
taken down. Also, set the aio_maxreqs parameter to 65536.


Network subsystem

TCP traffic
The network layer can have performance implications based on the amount of data that needs
to be sent across the network and whether the network is poorly configured. If you suspect that a
slow network is the cause of a performance problem, you can use the following methods to test
network throughput.
Run the slow-performing test or SQL locally, and compare the results against the same test or
SQL that is run remotely, where the result set has to be returned over the network.
Use scp or ftp commands to test throughput speed by sending a large file and monitoring its
throughput. If the network is not producing the amount of throughput that is expected, make
sure that the TCP window size is configured properly.
The following TCP parameters can affect network performance: tcp_recvspace, tcp_sendspace,
and rfc1323. The tcp_recvspace parameter specifies the number of bytes that the receiving
system can buffer in the kernel. The tcp_sendspace parameter specifies the number of bytes
that the sending system can buffer in the kernel. The rfc1323 parameter enables the TCP
window scaling option.
Local loopback
The fastpath loopback option is used to achieve better performance for loopback traffic. The
tcp_fastlo network parameter lets TCP loopback traffic take a shorter path through the
TCP/IP stack, which improves performance.
To display the current setting for tcp_fastlo:
no -a |grep tcp_fastlo

The tcp_fastlo parameter is disabled by default (value of 0). To set the parameter, use the no
command. The -p option applies the changes to both current and reboot values.
no -p -o tcp_fastlo=1

IBM Informix tests have shown an increase from ~350,000 TPM to ~520,000 TPM for local
loopback testing. This is an increase of ~50%.


Recommendation
IBM Informix recommends contacting AIX Software Support to discuss any throughput issues
with the network. The following changes might increase throughput but should not be tried
before discussing them with AIX Software Support.
If network throughput becomes an issue, consider increasing the TCP window size to 256 KB
with the following commands.
ifconfig en11 rfc1323 1 tcp_nodelay 1 tcp_sendspace 262144 tcp_recvspace 262144
chdev -l en11 -a rfc1323=1 -a tcp_nodelay=1 -a tcp_sendspace=262144 -a tcp_recvspace=262144 -P
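The effect of the TCP window size on throughput can be estimated with the standard bound of one window per round trip. The following Python sketch illustrates that estimate; the function names and the round-trip times are assumptions chosen for the example, and real throughput also depends on factors that this bound ignores, such as packet loss and adapter limits.

```python
def max_tcp_throughput_mbit(window_bytes, rtt_ms):
    """Upper bound for one connection: one full window per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1_000_000


def window_for(bandwidth_mbit, rtt_ms):
    """Window size in bytes needed to fill a link (bandwidth-delay product)."""
    return round(bandwidth_mbit * 1_000_000 / 8 * rtt_ms / 1000.0)


RTT_MS = 1.0  # assumed round-trip time; measure your own network

# 65535 bytes is the largest window possible without window scaling (rfc1323).
for window in (65_535, 262_144):
    bound = max_tcp_throughput_mbit(window, RTT_MS)
    print(f"window {window} bytes, RTT {RTT_MS} ms: at most {bound:.0f} Mbit/s")

# Window needed to fill a hypothetical 1000 Mbit/s link with a 2 ms RTT.
print(window_for(1000, 2.0), "bytes")
```

The bound makes clear why a larger window matters most on fast links or links with a longer round-trip time.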

If using a local loopback connection, enable tcp_fastlo for performance improvements in the
TCP loopback traffic.
no -p -o tcp_fastlo=1


Number of CPU virtual processors

The question of how many CPU virtual processors (VPs) should be configured has been an
interesting problem on POWER7, specifically when SMT2 or SMT4 is used. The important thing
to understand when sizing the number of CPU VPs on a system is that turning on SMT2 or
SMT4 does not give you 2x or 4x CPU power, respectively. Configuring this number properly
with Informix is important, and the number depends on the type of work that the Informix
database will do. Is the workload more CPU intensive, or is the workload more I/O intensive?
These are some of the questions that need to be understood to properly size the system.

CPU-intensive workload
If the Informix server is performing a more CPU-intensive workload, set the number of CPU VPs
to 1.5x the number of physical CPUs allocated to the LPAR. On a POWER7 LPAR with 32
cores and SMT4 enabled (128 logical CPUs), a good starting point is 48 CPU VPs. Use the
VPCLASS configuration parameter to specify the number of CPU VPs that Informix will use
when first bringing the database server online. For reference, here are the comments for
VPCLASS from the onconfig.std file.
# VPCLASS cpu   - Configures the CPU VPs. The format is:
#                 VPCLASS cpu, num=<number of CPU VPs>,
VPCLASS cpu,num=48,noage

Monitor the Informix engine to make sure that all 48 CPU VPs are needed, and decrease the
number if necessary. To change the number of CPU VPs, you update the value of the
VPCLASS configuration parameter, and the value takes effect after you stop and then start the
Informix server. Use the onstat -g glo command to determine if there are CPU VPs that are
not being used. In the following example, some of the latter CPU VPs have only clocked 5-6
minutes of CPU time over nearly 58 days of being online. In that case, consider decreasing the
number of CPU VPs.
IBM Informix Dynamic Server Version 11.70.FC2 -- On-Line -- Up 57 days 23:42:02 -- 28512384 Kbytes

Virtual processor summary:
class   vps   usercpu       syscpu       total
cpu     72    20609362.66   1415471.47   22024834.13

Individual virtual processors:
vp   pid        class   usercpu      syscpu      total        Thread       Eff
1    9044394    cpu     1318982.55   126036.12   1445018.67   3471009.49   41%
2    4980998    adm     2021.36      1038.05     3059.41      0.00         0%

19   18808928   cpu     1513914.43   106858.67   1620773.10   3952966.80   41%
20   6750704    cpu     1479419.71   106683.61   1586103.32   3858124.02   41%

87   16515902   cpu     273.03       93.09       366.12       4967.90      7%
88   11075584   cpu     251.17       91.13       342.30       4876.44      7%

If, however, you determine that the Informix server is constantly seeing threads in the ready queue
waiting to run, consider increasing the number of CPU VPs.
onstat -g rea
IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 3 days 01:55:42 -- 558008 Kbytes

Ready threads:
tid         tcb               rstcb             prty   status   vp-class   name
194655974   700000abab6b028   700000b7e0d9ae0   1      ready    32cpu      sqlexec
195234123   700000a71e78568   700000af4e67920   1      ready    36cpu      sqlexec
195317254   700000b2516e028   700000bd5045b70   1      ready    31cpu      srvinfx
195372610   700000ac7f422a0   700000af4e90bb0   1      ready    34cpu      sqlexec
195425354   700000aadeb4d20   700000a9015b8a0   1      ready    1cpu       srvinfx
195426222   700000b53cae0d0   700000b7e0d3a80   1      ready    32cpu      scan_3.0

Adding CPU VPs can be done dynamically with the onmode command. The following command adds 5
CPU VPs.
onmode -p +5 cpu

Keep in mind that threads 2-4 in SMT do not scale linearly, so although the total throughput will increase as
threads 2-4 are used, single-thread response time might suffer. For this reason, when increasing the
number of CPU VPs, test to find the best setting for your specific environment.

I/O-intensive workload
If the Informix server is performing an I/O-heavy workload, overcommit the CPU VPs a bit more.
In the previous example with 32 cores (SMT4), a good place to start is 3x the number of
physical cores, or 96 CPU VPs.
As described earlier, monitor the CPU clock time to determine whether the CPU VPs are
overconfigured, and monitor the ready queue to see whether more are needed. Also, monitor the
user threads, and check whether a lot of threads are consistently waiting on I/O.


onstat -g ath | grep "IO Wait"
194806496   700000aa4534568   700000af4e751f8   1   IO Wait   48cpu   sqlexec
194812270   700000af6fe3c68   700000b88f07108   1   IO Wait   49cpu   sqlexec
195228271   700000b21d4eb18   700000a9016d1b8   1   IO Wait   24cpu   sqlexec
195234152   700000ac13eab20   700000b88f17a10   1   IO Wait   26cpu   sqlexec
195288699   700000abfe7cb10   700000b7e0d3a80   1   IO Wait   39cpu   sqlexec
195293668   700000adf34d930   700000b848189d0   1   IO Wait   37cpu   sqlexec

If this is a consistent characteristic, increasing the number of CPU VPs might help performance
throughput.

Recommendation
Monitor the system workload:
For CPU-intensive workloads, start with a number of CPU VPs equal to 1.5x the number of
physical CPUs in the LPAR.
For I/O-intensive workloads, start with a number of CPU VPs equal to 3x the number of
physical CPUs in the LPAR.
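The two starting points can be worked out directly from the number of physical cores in the LPAR. The following Python sketch is an illustration only; the function names are made up, and rounding up for odd core counts is an assumption that the text does not specify.

```python
import math

CPU_INTENSIVE_FACTOR = 1.5  # starting multiplier for CPU-intensive workloads
IO_INTENSIVE_FACTOR = 3.0   # starting multiplier for I/O-intensive workloads


def starting_cpu_vps(physical_cores, io_intensive=False):
    """Starting number of CPU VPs for an LPAR with the given physical cores."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    factor = IO_INTENSIVE_FACTOR if io_intensive else CPU_INTENSIVE_FACTOR
    # Assumption: round up when the product is not a whole number.
    return math.ceil(physical_cores * factor)


def vpclass_line(num_vps):
    """Format a VPCLASS entry in the style of the earlier example."""
    return f"VPCLASS cpu,num={num_vps},noage"


# The 32-core POWER7 LPAR used as the example in this section.
print(vpclass_line(starting_cpu_vps(32)))
print(vpclass_line(starting_cpu_vps(32, io_intensive=True)))
```

These values are starting points only; the onstat monitoring described in this section decides whether the number should move up or down.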


Affinity

The database server supports automatic binding of CPU virtual processors to a processor in a
multiprocessor environment. The default behavior of affinity in Informix is to affinitize starting
with cpu0, then cpu1, cpu2, etc. On the POWER7 architecture, this is not necessarily the most
beneficial behavior as this equates to logical cpu0, logical cpu1, logical cpu2, etc. With
POWER7 architecture and SMT, it is more beneficial to use the first thread of each physical
CPU before using the 2nd threads of each physical CPU.

By disabling affinity in Informix and allowing the operating system to schedule the CPU virtual
processors, you will get the behavior of using the first thread for each core. This behavior is the
most advantageous due to the throughput that is obtained by the first thread of each core. As
stated earlier in this paper, using threads 2-4 for a core will gain an additional 40% - 60%
improvement in overall throughput.

Recommendation
On the POWER7 architecture, if SMT2 or SMT4 is used, disable affinity. Using affinity can
degrade performance due to the usage of threads 2-4 before using all the first threads for each
core in the LPAR.


Understanding onstat -g glo

The onstat -g glo command is used to display information about the virtual processors within
Informix. One of the values that the command displays is a virtual processor efficiency value.
This value is the ratio of the total CPU time to the total time that the threads ran on the virtual
processor. This value shows efficiency utilization for the CPU virtual processor.
Example:
Individual virtual processors:
vp   pid        class   usercpu      syscpu      total        Thread       Eff
19   18808928   cpu     1513914.43   106858.67   1620773.10   3952966.80   41%

Threads were scheduled to run on this CPU VP for 3,952,966 seconds, but the CPU VP only ran
on the CPU for 1,620,773 seconds. The efficiency rating of 41% is derived by dividing 1620773
by 3952966.
To understand the efficiency rating on POWER7, it is necessary to understand the load on the
server and the number of physical and logical CPUs allocated to an LPAR. For example, an
LPAR with 1 CPU allocated to it, with SMT4 with a load that would keep 4 CPU VPs maxed out,
would show an efficiency rating of 25% for each of the 4 CPU VPs. Or all 4 CPU VPs added up
would approach 100%.
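Both efficiency examples above follow from the same ratio. The following Python sketch recomputes them; the function name vp_efficiency is made up for this illustration, and returning 0 when no thread time has been recorded is an assumption that matches the 0% shown for the adm virtual processor in the earlier output.

```python
def vp_efficiency(total_cpu_seconds, thread_seconds):
    """Efficiency of a virtual processor in percent: CPU time / thread time."""
    if thread_seconds == 0:
        return 0.0  # assumption: no thread time is reported as 0%
    return total_cpu_seconds / thread_seconds * 100.0


# VP 19 from the example: total 1620773.10, Thread 3952966.80.
print(f"VP 19: {vp_efficiency(1620773.10, 3952966.80):.0f}%")

# One core with SMT4 and 4 saturated CPU VPs: over a one-hour window each VP
# has threads scheduled for the full hour but receives a quarter of the CPU.
window_seconds = 3600
per_vp = vp_efficiency(window_seconds / 4, window_seconds)
print(f"each of 4 VPs: {per_vp:.0f}%, combined: {4 * per_vp:.0f}%")
```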
This would not be a typical setup, and a 1-to-1 relation of CPU VPs to logical CPUs in an LPAR
is not recommended. The efficiency measurement is more meaningful on systems that do not use
SMT. The DBA would use the onstat -g glo command along with mpstat and lparstat data to
understand CPU utilization.
Monitoring lparstat can give you a general idea of how busy or idle the LPAR is.
System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513279MB

%user  %sys  %wait  %idle
-----  ----- ------ ------
 61.7  10.3    0.1   28.0
 58.2  10.9    0.0   30.9
 56.3   9.5    0.1   34.1

In this output, the system is roughly 30% idle. The output also contains other
information about the LPAR. The LPAR is dedicated, using SMT4, and it has 128 logical CPUs.


The mpstat data is less straightforward to read and understand.
cpu    min  maj  mpc    int     cs    ics  rq     mig  lpa    sysc  us  sy  wa  id     pc
  0   1510    0    0    607   1720    280   1    4518  100   12509  74  16   0   9   0.46
  1    273    0    0    391    656     60   0     556  100    4988  65  10   0  25   0.31
  2      5    0    0    309     62      4   0      68  100     432  12   6   0  82   0.12
  3      8    0    0    258     21      1   0      27  100     216   5   5   0  90   0.11
  4   1438    0    0    444   1459    262   1    3772  100   11519  72  18   0  10   0.38
  5    539    0    0    352    342     44   0     344  100    4875  83   5   0  11   0.42
  6      1    0    0    263     46      3   0      41  100     218  17   6   0  77   0.10
  7      0    0    0    265     49      2   0      32  100     340  14   8   0  78   0.10

ALL  30648    0       49044  86261  12756  18  155654  100  641630  56       0  34  31.98

The summary in the ALL line at the end of the mpstat data looks more like the lparstat data.
This output shows that thread #1 of each core is quite active. Logical CPU 0 is thread #1 for
the first core in this LPAR. Logical CPU 4 is thread #1 for the second core in the LPAR.
The output also shows that threads 2-4 are in use, but that they are not as heavily utilized as
thread #1 for each core. As the number of CPU VPs becomes greater than the number of cores
in use, threads 2-4 will begin to show more utilization.
When threads 2-4 are being used, monitor throughput and response times to verify that the
number of CPU VPs is properly configured with respect to the number of cores and logical
CPUs for the LPAR.
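The per-thread pattern described above can be made easier to see by grouping the logical CPUs by core. The following Python sketch uses the us, sy, and id values for logical CPUs 0-7 from the mpstat output above. It assumes that logical CPUs are numbered consecutively within a core (four per core under SMT4), which matches the statement that logical CPUs 0 and 4 are thread #1 of the first and second cores. The function name busy_by_core is hypothetical.

```python
SMT_THREADS = 4  # SMT4: four logical CPUs (hardware threads) per core

# logical CPU -> (us, sy, id), taken from the mpstat output above
MPSTAT_SAMPLE = {
    0: (74, 16, 9), 1: (65, 10, 25), 2: (12, 6, 82), 3: (5, 5, 90),
    4: (72, 18, 10), 5: (83, 5, 11), 6: (17, 6, 77), 7: (14, 8, 78),
}


def busy_by_core(sample, smt_threads=SMT_THREADS):
    """Return {core: [busy% of thread #1, thread #2, ...]}, with busy = us + sy."""
    cores = {}
    for lcpu in sorted(sample):
        us, sy, _idle = sample[lcpu]
        # Assumption: logical CPUs are numbered consecutively within a core.
        cores.setdefault(lcpu // smt_threads, []).append(us + sy)
    return cores


for core, busy in busy_by_core(MPSTAT_SAMPLE).items():
    threads = ", ".join(f"thread #{i + 1}={b}%" for i, b in enumerate(busy))
    print(f"core {core}: {threads}")
```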

Recommendation
If the number of CPU VPs is greater than the number of cores in an LPAR, closely monitor
total throughput against single-user response time. If response time degrades to
unacceptable levels, test with fewer CPU VPs and monitor whether response times improve.
As is noted in other areas of this document, threads 2-4 for a core do not scale
linearly.
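One way to apply this recommendation is to record throughput and single-user response time for each CPU VP setting that is tested, then keep the setting with the best throughput among those that still meet the response-time target. The following Python sketch shows that selection step only. The Trial type, the pick_cpu_vps function, and all of the numbers are hypothetical placeholders for illustration, not measurements from this document.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    cpu_vps: int          # number of CPU VPs configured for the test run
    throughput: float     # for example, transactions per second
    response_time: float  # for example, seconds for a single-user query


def pick_cpu_vps(trials, max_response_time):
    """Return the trial with the highest throughput whose response time is
    within the limit, or None if no trial meets the limit."""
    acceptable = [t for t in trials if t.response_time <= max_response_time]
    return max(acceptable, key=lambda t: t.throughput) if acceptable else None


# Placeholder numbers for illustration only; they are NOT measurements.
trials = [
    Trial(cpu_vps=8, throughput=1000.0, response_time=0.20),
    Trial(cpu_vps=12, throughput=1300.0, response_time=0.25),
    Trial(cpu_vps=16, throughput=1400.0, response_time=0.45),
]
best = pick_cpu_vps(trials, max_response_time=0.30)
print(best)
```

In this made-up data set, the 16 CPU VP run has the highest throughput but misses the response-time limit, so the 12 CPU VP run is selected.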

Starting LPARs

The order in which LPARs are started can affect the physical resource allocation within the
LPARs and can have performance implications within a system. The first LPAR that is started
gets the most optimal resources, the second LPAR started gets the next best resources, and so
on, until the last LPAR that is started gets what is left.
For example, assume that you have a 32-core system with four chips, each with eight cores. If
five partitions are configured, each with six cores, each of the first four LPARs would be
placed on its own chip. That leaves two free cores per chip, so the fifth LPAR would be spread
across three chips.
Start order matters most when the resources allocated to an LPAR do not exceed the cores on a
single chip, because such an LPAR then has a better opportunity to obtain good affinity
characteristics in its core and memory allocations.
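The 32-core example can be walked through with a small simulation. The following Python sketch uses a simplified greedy model: an LPAR is placed on a single chip when one has enough free cores, and is otherwise spread over the chips with the most free cores. This model and the function name place_lpars are assumptions for illustration. The sketch is not the actual hypervisor placement algorithm, and it ignores memory placement.

```python
def place_lpars(chips, cores_per_chip, lpar_sizes):
    """Assign cores to LPARs in start order.

    Returns one {chip_index: cores_taken} dict per LPAR.
    Simplified greedy model (an assumption, not the hypervisor's algorithm).
    """
    free = [cores_per_chip] * chips
    placements = []
    for size in lpar_sizes:
        if size > sum(free):
            raise ValueError("not enough free cores for this LPAR")
        placement = {}
        # Prefer a single chip that can hold the whole LPAR (best affinity).
        fits = [c for c in range(chips) if free[c] >= size]
        if fits:
            chip = max(fits, key=lambda c: free[c])
            placement[chip] = size
            free[chip] -= size
        else:
            # Otherwise take cores from the chips with the most free cores.
            remaining = size
            for chip in sorted(range(chips), key=lambda c: -free[c]):
                if remaining == 0:
                    break
                take = min(free[chip], remaining)
                if take:
                    placement[chip] = take
                    free[chip] -= take
                    remaining -= take
        placements.append(placement)
    return placements


# Four chips with eight cores each, five LPARs with six cores each.
for number, placement in enumerate(place_lpars(4, 8, [6] * 5), start=1):
    print(f"LPAR {number}: {len(placement)} chip(s) -> {placement}")
```

Under this model the first four LPARs each occupy one chip, and the fifth is split across three chips, as in the example in the text.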

Recommendation
The order in which LPARs are started should be considered in obtaining the best performance
for high-priority workloads. Start the most important partitions first to obtain the best resources
from a single chip.

Appendix A: Recommendations summary

Summary of the recommendations made throughout this white paper. For complete
recommendations, see the appropriate section in the white paper.

SMT settings
Use SMT4 for increased overall throughput. Use SMT2 when single-thread response time is more
important. See the SMT1 vs SMT2 vs SMT4 section for details.
LPAR type
Use a dedicated LPAR where possible. See the Dedicated LPAR vs shared LPAR section for
details.
VIOS
Use a dedicated LPAR. See the Virtual I/O Server (VIOS) LPAR section for details.
Additional LPAR improvements
When possible, use DPO and ASO/DSO to optimize workloads for Informix. See the Additional
LPAR recommendations section for details.
Memory Page Sizes
When using KAIO or direct I/O, do not set the RESIDENT configuration parameter to -1. Use
64 KB large pages for performance improvements. See the Memory considerations section for
details.
FDPR
Use FDPR in a non-production environment to test for performance improvements. See the
Feedback Directed Program Restructuring (FDPR) section for more details.
IO Subsystem
Use KAIO or DIRECT IO where possible. For disks, set the queue depth to a minimum of 16. For
AIO servers, set the min and max aio servers to 100 and aio_server_inactivity to 86400. See
the I/O subsystem section for details.
Network Subsystem
Test network throughput and tune as needed. Increase the tcp_recvspace and tcp_sendspace AIX
parameters up to 256 KB. See the Network Subsystem section for details.
Number of CPU VPs
For CPU-intensive workloads, set CPU VPs at 1.5x the number of physical CPUs in the LPAR. For
I/O-intensive workloads, set the CPU VPs to 3x the number of CPUs in the LPAR. See the Number
of CPU virtual processors section for details.
Affinity
Disable affinity when SMT2 or SMT4 is being used. See the Affinity section for details.
Interpreting onstat -g glo
If the number of CPU VPs is greater than the number of cores in an LPAR, closely monitor the
throughput, and adjust the number of CPU VPs as appropriate. See the Interpreting onstat -g glo
section for details.
Starting LPARs
Start the most critical LPARs first. See the Starting LPARs section for details.
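The CPU VP sizing rule in this summary is simple arithmetic. The following Python sketch applies it, assuming that the I/O-intensive multiplier is read as 3 and that fractional results are rounded up. The rounding choice, the workload labels "cpu" and "io", and the function name suggested_cpu_vps are assumptions made for this example. Treat the result as a starting point for testing, not a final setting.

```python
import math

# Multipliers from the summary: 1.5x for CPU-intensive workloads,
# 3x (assumed reading) for I/O-intensive workloads.
MULTIPLIERS = {"cpu": 1.5, "io": 3.0}


def suggested_cpu_vps(physical_cpus, workload):
    """Return a starting number of CPU VPs for the given workload type.

    Rounding up is an assumption; the white paper does not say how to
    handle fractional results.
    """
    if physical_cpus <= 0:
        raise ValueError("physical_cpus must be positive")
    if workload not in MULTIPLIERS:
        raise ValueError("workload must be 'cpu' or 'io'")
    return math.ceil(physical_cpus * MULTIPLIERS[workload])


for workload in ("cpu", "io"):
    print(f"8 physical CPUs, {workload}-intensive: "
          f"{suggested_cpu_vps(8, workload)} CPU VPs")
```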

Appendix B: Useful commands

amepat
Active Memory™ Expansion Planning and Advisory Tool. The amepat command reports Active
Memory Expansion information and statistics as well as provides an advisory report that assists
in planning the use of Active Memory Expansion for an existing workload. This document used
this tool to show statistics for an LPAR.
bosboot
Creates a boot image. This document used this tool to rebuild the boot image so that setting
changes take effect the next time the LPAR is rebooted.
chdev
Changes the characteristics of a device. This document used this tool to modify the queue
depth for a hard disk and to change TCP settings for a network interface.
ifconfig
Configures or displays network interface parameters for a network using TCP/IP. This
document used this tool to make modifications to TCP parameters.
ioo
Manages Input/Output tunable parameters. This document used this tool to view the AIO server
settings as well as to modify some of the parameters.
iostat
Reports Central Processing Unit (CPU) statistics, asynchronous input/output (AIO) statistics,
and input/output statistics for the entire system, adapters, TTY devices, disks, CD-ROMs,
tapes, and file systems. This document used this tool to monitor I/O statistics.
lparstat
Reports logical partition (LPAR) related information and statistics. This document used this tool
to gather and show statistics for an LPAR.

lsattr
Displays attribute characteristics and possible values of attributes for devices in the system.
This document used this tool to monitor the queue depth for a hard disk.
schedo
Manages processor scheduler tunable parameters. This document used this tool to check the
processor folding settings and modify the settings as needed.
smtctl
Controls the enabling and disabling of processor simultaneous multithreading mode. This
document used this tool to enable/disable SMT, and set the mode accordingly.
vmo
Manages Virtual Memory Manager tunable parameters. This document used this tool to set up
and pre-allocate memory for 16 MB large pages.

vmstat
Reports virtual memory statistics. This document used vmstat to monitor the creation of 16 MB
large pages.

Appendix C: Additional reading

The following list of links provides useful background information for PowerVM and POWER7
systems. Third-party material is also listed. Its inclusion is not an endorsement of the
material by IBM, but is meant to offer the reader a variety of information and viewpoints.

developerWorks: AIX Virtual Processor Folding is misunderstood
https://www.ibm.com/developerworks/community/blogs/aixpert/entry/aix_virtual_processor_folding_in_misunderstood110?lang=en
IBM Systems: Understanding Micro-Partitioning
http://www.ibmsystemsmag.com/aix/tipstechniques/systemsmanagement/Understanding-MicroPartitioning/?page=1
IBM Systems: Getting a handle on Entitled Capacity & Virtual Processors
http://www.ibmsystemsmag.com/aix/administrator/systemsmanagement/entitled_capacity/
YouTube: Power7 Performance Entitlement, VPs, Affinity, Memory
http://www.youtube.com/watch?v=1W1M114ppHQ
Feedback Directed Program Restructuring (FDPR)
https://www.research.ibm.com/haifa/projects/systems/cot/fdpr/
developerWorks: VIOS Advisor
http://www.ibm.com/developerworks/wikis/display/WikiPtype/VIOS+Advisor
IBM Redbooks Publication: IBM PowerVM Virtualization Managing and Monitoring
http://www.redbooks.ibm.com/redpieces/abstracts/sg247590.html
IBM Redbooks Publication: AIX 5L Performance Tools Handbook
http://www.redbooks.ibm.com/abstracts/sg246039.html

For more information

To learn more about Informix features, contact your IBM representative or IBM Business
Partner, or visit ibm.com/software/data/informix.

IBM shall not be responsible for any damages arising out
of the use of, or otherwise related to, this publication or
any other materials. Nothing contained in this publication
is intended to, nor shall have the effect of, creating any
warranties or representations from IBM or its suppliers or
licensors, or altering the terms and conditions of the
applicable license agreement governing the use of IBM
software.

IBM Informix on POWER7
Best Practices
A Technical White Paper
December 2013
Darin Tracy
Monish Gupta
Vladimir Kolobrodov
Copyright 2013 IBM Corporation
IBM Corporation
Software Group
Route 100
Somers, NY 10589
U.S.A.
The information contained in this publication is provided
for informational purposes only. While efforts were made
to verify the completeness and accuracy of the
information contained in this publication, it is provided AS
IS without warranty of any kind, express or implied. In
addition, this information is based on IBM's current
product plans and strategy, which are subject to change
by IBM without notice.

References in this publication to IBM products, programs,
or services do not imply that they will be available in all
countries in which IBM operates. Product release dates
and/or capabilities referenced in this presentation may
change at any time at IBM's sole discretion based on
market opportunities or other factors, and are not
intended to be a commitment to future product or feature
availability in any way. Nothing contained in these
materials is intended to, nor shall have the effect of,
stating or implying that any activities undertaken by you
will result in any specific sales, revenue growth, savings
or other results.
IBM, the IBM logo, ibm.com, and Informix are trademarks
of International Business Machines Corp., registered in
many jurisdictions worldwide. Other product and service
names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web
at "Copyright and trademark information" at
www.ibm.com/legal/copytrade.shtml. Linux is a
registered trademark of Linus Torvalds in the United
States, other countries, or both.
