
Architecting and Deploying IBM Power Enterprise Systems

Chandan Chopra
Power Systems Solution Architect, IBM Systems Lab Services
chandan.chopra@in.ibm.com

Agenda

Power Systems Portfolio


Power8 Enterprise Systems Architecture
Deployment Guidelines
Solution Guidelines
Q&A

Power Systems Portfolio & Power8 Enterprise Systems Architecture

Power Systems Portfolio

Power E870 and E880 Servers


Power E870

Increased performance and scale


System Control Unit (midplane)
Active Memory Mirroring
8 PCIe3 adapter slots per node
PCIe Gen3 I/O drawers
Power Enterprise Pool
PowerVM Enterprise included
Enterprise RAS

Power E880

Even for 1-node system

24x7 Warranty

Power8 Enterprise Family

                E850              E870                  E880
Cores           16 - 48           8 - 80, 1-2 nodes     8 - 192, 1-4 nodes
Frequency       3.72 GHz (12c)    4.19 GHz (10c)        4.02 GHz (12c)
                3.35 GHz (10c)    4.02 GHz (8c)         4.35 GHz (8c)
                3.02 GHz (8c)
Memory          128 GB - 2 TB*    256 GB - 8 TB         256 GB - 16 TB
PCI Adapters    7 - 51            8 - 96                8 - 192

* Statement of direction to 4 TB. Statements of direction represent plans only and are subject to change without notice.

Power8 Enterprise System Structure

Power8 System Control Unit

Improves availability of all E870 and E880 configurations

System Node PCIe slots

Slots use a new low-profile blind swap cassette (BSC).
Server comes fully populated with BSCs. No special feature code is associated with the BSC.

Eight Low profile (LP) adapter slots


Used for PCIe adapters (Gen1, Gen2 or Gen3 LP adapters)
Or used to connect to PCIe Gen3 I/O Expansion Drawer

PCIe Gen3

Though these cards physically look the same and fit in the same slots
Gen3 cards/slots have up to 2X more bandwidth than Gen2 cards/slots
Gen3 cards/slots have up to 4X more bandwidth than Gen1 cards/slots

More virtualization
More consolidation
More ports per adapter

saving PCI slots and I/O drawers

Peak

A Gen1 x8 PCIe adapter has a theoretical max (peak) bandwidth of 4 GB/sec.
A Gen2 x8 adapter has a peak bandwidth of 8 GB/sec.
A Gen3 x8 adapter has a peak bandwidth of 16 GB/sec.
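The peak figures above can be reproduced from the per-lane signalling rate and line encoding of each PCIe generation. The following is a minimal sketch, assuming that the quoted numbers count both directions of the link; the function name and table are illustrative only and do not come from the slides.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    "Gen1": (2.5, 8 / 10),     # 8b/10b encoding
    "Gen2": (5.0, 8 / 10),     # 8b/10b encoding
    "Gen3": (8.0, 128 / 130),  # 128b/130b encoding
}


def peak_bandwidth_gb_per_sec(generation, lanes=8, both_directions=True):
    """Theoretical peak bandwidth in GB/sec for a link of the given width."""
    gt_per_sec, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    per_lane_one_way = gt_per_sec * efficiency / 8
    total = per_lane_one_way * lanes
    # Assumption: the slide's 4/8/16 GB/sec figures add up both directions.
    return total * 2 if both_directions else total


if __name__ == "__main__":
    for gen in PCIE_GENERATIONS:
        print(gen, "x8:", round(peak_bandwidth_gb_per_sec(gen), 2), "GB/sec")
```

Under that assumption Gen1 and Gen2 come out at exactly 4 and 8 GB/sec, and Gen3 at slightly under 16 GB/sec because of the 128b/130b encoding overhead.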

[Chart: peak and sustained bandwidth (GB/sec) for Gen1, Gen2 and Gen3 x8 adapters]

PCIe Gen3 I/O Expansion Drawer

Feat #EMX0
Front view

Rear view

Fan-out Module
6 PCIe Gen3 Slots
Attaches to 1 system node PCIe slot

Fan-out Module
6 PCIe Gen3 Slots
Attaches to 1 system node PCIe slot

12 PCIe Gen3 slots

4U drawer

Full-height PCIe slots

Hot plug PCIe slots

Modules not hot plug


Single Root I/O Virtualization (SR-IOV)


[Diagram: VM 1, VM 2, VM 3, VM 4]

Direct Ethernet virtualization


Lower CPU overhead
Better throughput
QoS capable

Up to 64* Virtual Functions

Example: 4-port PCIe3 10Gb FCoE Adapter


Model         SR-IOV Mode Supported Slots
E850          All internal slots
E870          All internal slots
E880          All internal slots
I/O Drawer    Slots C1 and C4 of the 6-slot fan-out module

* Note: The number of Virtual Functions available per adapter or port is adapter dependent

SR-IOV Software Support

Software    Minimum level
AIX         AIX 6.1 TL9 SP5 and APAR IV68443 or later
            AIX 7.1 TL3 SP5 and APAR IV68444 or later
            AIX 7.1 TL2 SP7 or later (planned availability 3Q 2015)
            AIX 6.1 TL8 SP7 or later (planned 3Q 2015)
IBM i       IBM i 7.1 TR10 or later
            IBM i 7.2 TR2 or later
            Both require either VIOS or adapter in SR-IOV mode
Red Hat     Red Hat Enterprise Linux 6.6 or later
            Red Hat Enterprise Linux 7.1, big endian, or later
            Red Hat Enterprise Linux 7.1, little endian, or later
SUSE        SUSE Linux Enterprise Server 12 or later
Ubuntu      Ubuntu 15.04 or later
PowerVM     Firmware 830 available June, 2015 and HMC V8.830

EXP24S SFF Gen2-bay Drawer

(24) 2.5 inch hot-swap SAS or SSD disks
Ordered as 1, 2, or 4 sets of disks*
Redundant power

[Images: front and rear views]

* Applies to orders for AIX, Linux, and VIOS; IBM i is ordered as 1 set

Enterprise System Deployment Guidelines

Hardware areas to discuss

POWER Processors and levels of cache


Does processor speed (frequency) matter?

Multi-Core Multi-Node Systems


How many Nodes (Books/Enclosures) ?
Should I use more than minimum?
How many should I have installed vs active and why?

Memory
How much do I need ? Should I fill the Memory card slots ?
Memory access (Local, Near, and Far NUMA)

I/O

How many drawers on a loop ?


Do card slots matter ?
Adapter placement across drawers and nodes for potentially higher availability and performance

Processor Designs

                         POWER6            POWER7             POWER7+            POWER8
Technology               65nm              45nm               32nm               22nm
Size                     341 mm2           567 mm2            567 mm2            675 mm2
Transistors              790 M             1.2 B              ~2.4 B             ~5 B
Cores                                                                            12
Frequencies              4+ GHz            3 - 4+ GHz         3 - 4+ GHz         3 - 4.35 GHz
L2 Cache                 4MB / Core        256 KB / Core      256 KB / Core      512 KB / Core
L3 Cache                 32MB              32MB               80MB               96MB
L4 Cache                                                                         128MB
Memory (DRAM channel)    8 DDR2            16 DDR2            16 DDR2            32 DDR3/4
I/O                      Proprietary GX    Proprietary GX+    Proprietary GX+    Integrated PCIe
Architecture             In Order          Out of Order       Out of Order       Out of Order

Threads

8
16

Simultaneous Multithreading


Simultaneous Multithreading
SMT1: Largest unit of execution work
SMT2: Smaller unit of work, but provides greater amount of execution work per cycle
SMT4: Smaller unit of work, but provides greater amount of execution work per cycle
SMT8: Smallest unit of work, but provides the maximum amount of execution work per cycle

Can dynamically change modes as required: SMT1 / SMT2 / SMT4 / SMT8

[Chart: P7 SMT1 compared with P8 SMT1, SMT2, SMT4 and SMT8]

Power Sizing: Throughput and Response time


Higher SMT boosts capacity by:
Allowing the core to continue executing instructions during cache miss delays.
Using available execution resources not used by other task(s).
Overall throughput increases.

A task executes fastest when alone.
The task dispatcher of a dedicated-processor partition spreads tasks first over available cores.

As task count increases, task speed decreases.
Tasks execute individually slower, but are executing.

Response time consideration:
Consider setting the partition limit to four threads (P7 mode) on POWER8.
Big improvement in task execution speed.

Power Sizing: rPerf and CPW


POWER7 and POWER8 provide significant gains in CPW & rPerf ratings
Impressive core-to-core capacity increase
Outstanding socket-to-socket increase in capacity

CPW and rPerf are OLTP DB workloads used for representing capacity

[Chart: Core-to-Core Performance, 8-core POWER6 vs. POWER7 (POWER6 550 5.0GHz, POWER7 750 3.3GHz)]
[Chart: Socket-to-Socket Performance, 1-chip POWER6 vs. POWER7 (POWER6 570 5.0GHz, POWER7 780 3.86GHz)]

Power Sizing: rPerf and CPW


System    Cores       Frequency    rPerf      CPW
E870      32-core     4.02 GHz     674.5      359,000
E870      64-core     4.02 GHz     1,349.0    711,000
E870      40-core     4.19 GHz     856.0      460,000
E870      80-core     4.19 GHz     1,711.9    911,000
E880      32-core     4.35 GHz     716.0      381,000
E880      64-core     4.35 GHz     1,432.5    755,000
E880      128-core    4.35 GHz     2,865      1,523,000
E880      48-core     4.02 GHz     976.4      518,000
E880      96-core     4.02 GHz     1,952.9    1,034,000
E880      192-core    4.02 GHz     3,905.8    2,069,000

Power Sizing: rPerf and CPW


What if I had a workload that needed 70,000 CPW?

9117-MMD 12-core (4.2GHz) = 90,000 CPW, and 90,000 / 12 cores = 7,500 per core
9119-MME 40-core (4.19GHz) = 460,000 CPW, and 460,000 / 40 cores = 11,500 per core
In this example, CPW on POWER7 = 7,500 per core running SMT4,
CPW on POWER8 = 11,500 per core running SMT8,
and CPW on POWER8 = 9,200 per core running SMT4 (460,000 x .8 / 40 = 9,200 CPW)

Based on CPW math:

                 Workload      CPW per core    Cores needed
POWER7 (SMT4)    70,000 CPW    7,500           9.33
POWER8 (SMT8)    70,000 CPW    11,500          6.09
POWER8 (SMT4)    70,000 CPW    9,200           7.6

The POWER8 system might very well provide the CPW capacity. However, remember response time vs. throughput: you might get the transactions, but at increased response times and longer batch runtimes.
USE WLE to size
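The CPW arithmetic above can be written as a small helper. This is only a sketch of the slide's back-of-envelope math, not a replacement for WLE; the function name and the `smt_factor` parameter are illustrative, and the 0.8 factor for SMT4 on POWER8 is the one used in the example above.

```python
def cores_needed(workload_cpw, system_cpw, system_cores, smt_factor=1.0):
    """Estimate cores needed from a system's published CPW rating.

    smt_factor scales the rating for a lower SMT mode
    (the slide uses 0.8 for SMT4 on POWER8).
    """
    cpw_per_core = system_cpw * smt_factor / system_cores
    return workload_cpw / cpw_per_core


if __name__ == "__main__":
    workload = 70_000
    print("POWER7 (SMT4):", round(cores_needed(workload, 90_000, 12), 2))
    print("POWER8 (SMT8):", round(cores_needed(workload, 460_000, 40), 2))
    print("POWER8 (SMT4):", round(cores_needed(workload, 460_000, 40, 0.8), 2))
```

The result is a capacity estimate only; it says nothing about per-thread speed, which is why the slide still points to WLE for real sizing.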


Best Practice #1
If speed (response time and batch run time) is the priority for the workload, then consider using higher frequency POWER8 processors.
Consider appropriate rPerf and CPW for selecting a POWER8 system.
Remember these are ratings of capacity, not speed.
You can migrate a workload to a slower frequency system with at least the same or better CPW and/or rPerf rating, but not when per-thread performance (speed) is critical.
Start with about 3/4 of the cores of POWER7 if speed is the requirement.
Consider using SMT4 (POWER7 mode) when speed is a major concern on POWER8 systems.
Consider dedicated or dedicated donate for partitions that are business critical.
Understand the number of cores' worth of capacity and performance you need in POWER8 compared to POWER7 or POWER6.
Use performance sizing tools.

Power Sizing: Tools


IBM Systems Workload Estimator (WLE):
Strategic sizing tool that recommends the best
IBM system to satisfy overall workload and virtualization
requirements
Power Systems, System x, PureFlex System
- AIX, IBM i, Linux, Windows
- PowerVM (Partitions and VIOS), x virtualization
- Customizable storage (internal, SAN, SSD)
Considers existing customer data for sizing upgrades,
migrations, and consolidations
Sizes new workloads via 300 WLE plug-ins

Flexible interface for IBM/ISVs to build plug-ins


Free strategic sizing tool for IBM Sales, ISV/BP, customers
http://www-947.ibm.com/systems/support/tools/estimator/

IBM Systems Energy Estimator (SEE)


Estimates energy for Power Systems
Integration points: WLE, SPT, e-Config
SEE drives 550 energy estimates per week
http://www-947.ibm.com/systems/support/tools/estimator/energy/


Multi-core Multi-node Systems

Multi-core: smaller die size, more transistors, more processor cores per chip, more threads per core, more functions on chip
Use of SMP (Symmetric Multi-Processing) to scale across more cores

Multi Core and Multiple Node Power Systems 870, 880, 770, 780, 795

NUMA (Non-Uniform Memory Access), a concept that is used to further drive up the
performance capacity of a system.

What is Multi-Node: http://www-03.ibm.com/systems/resources/pwrsysperf_WhatIsMulticoreP7.pdf



Power 870, 880,770, 780, and 795 Scale by adding Nodes


These systems differ from the non-Enterprise Power Systems
Additional scaling by adding Enclosures/Books/Nodes

Each additional node adds cores, memory, and I/O (Bandwidth)


Adding nodes can improve RAS characteristics
770/780: adding a second enclosure adds a second clock and FSP (795 always has a second clock and FSP in the system frame)
870 and 880 always have dual clock and dual FSP in the System Control Unit
Additional I/O multipath in case of node failure/maintenance
Adding nodes can improve performance
Extra capacity is controlled with CUoD activation codes
Memory and processor On Demand
If more cores and memory are installed than active, the Hypervisor has more options for partition placement for best processor and memory affinity

64 way 770 to POWER8 Upgrade for best performance


A 64-way 770 needs four enclosures (nodes) and has memory in all four nodes
A 48-way E880 needs only one node, with the memory in one node

Should I use one system node? Would it be better to use two nodes?

Additional nodes provide better RAS and give the Hypervisor better placement options, which could provide better performance

NUMA - Non-Uniform Memory Access

NUMA is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory, that is, memory local to another processor or memory shared between processors.

Why would we design something like this?


1. The key to the answer is bandwidth
2. Bandwidth available for accessing memory scales up
linearly with the number of chips
3. A more rapid access to local memory and scalable bus
bandwidth is largely what a NUMA-based system
produces.

Memory: POWER8 Processor Planar Memory Layout

8 CDIMMs per SCM
Each CDIMM adds memory bandwidth
Each CDIMM adds L4 cache

Memory: POWER8 Memory CDIMMs Rule

8 CDIMM slots per SCM (2 feature codes per SCM)
Minimum one memory feature code of any size (four identical CDIMMs) per SCM
Optional second memory feature (four identical CDIMMs) per SCM
The 2nd memory feature code must have the same capacity as the 1st memory feature code

Memory: E870/E880 Memory Bandwidth

Up to 1 TB / Socket (with two 512GB features, eight 128GB CDIMMs)
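The CDIMM rules and the 1 TB per socket figure can be checked with a short calculation. This is a sketch under one assumption that the slides do not state directly: four SCMs (sockets) per system node, which is inferred from the 80-core, 2-node E870 with 10-core chips. All names are illustrative.

```python
CDIMMS_PER_FEATURE = 4   # from the CDIMM rules slide
FEATURES_PER_SCM = 2     # 8 CDIMM slots per SCM
SCMS_PER_NODE = 4        # assumption inferred from core counts, not stated in the slides


def memory_per_scm_gb(feature_gb, features=FEATURES_PER_SCM):
    """Memory behind one SCM; both features must be the same capacity."""
    if features not in (1, 2):
        raise ValueError("an SCM takes one or two memory features")
    return feature_gb * features


def max_system_memory_tb(nodes, feature_gb=512):
    """Fully populated memory for a system with the given number of nodes."""
    return nodes * SCMS_PER_NODE * memory_per_scm_gb(feature_gb) / 1024


if __name__ == "__main__":
    print("CDIMM size in a 512GB feature:", 512 // CDIMMS_PER_FEATURE, "GB")
    print("Per socket:", memory_per_scm_gb(512), "GB")
    print("E870, 2 nodes:", max_system_memory_tb(2), "TB")
    print("E880, 4 nodes:", max_system_memory_tb(4), "TB")
```

With that assumption the results line up with the 8 TB (E870) and 16 TB (E880) maximums quoted in the Power8 Enterprise Family overview.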



Best Practice #2 (Memory Configuration)


Understand your LPAR definitions (processors and memory).
Avoid having chips without DIMMs.
Attempt to fill every chip's DIMM slots, activating as needed.
The Hypervisor tends to avoid activating cores without local memory.

POWER8 Performance Best Practices


http://www14.software.ibm.com/webapp/set2/sas/f/best/home.html

Affinity
Affinity is a measurement of the proximity a thread has to a physical resource, and performance is optimal when data crossing affinity domains is minimized
Examples of resources include L2/L3 cache, memory, core, chip and system node
Cache affinity: threads in different domains need to communicate with each other, or cache needs to move with thread(s) migrating across domains
Memory affinity: threads need to access data held in a different memory bank not associated with the same chip or node
Think about your biggest partition's cores and memory: could it fit on a node, with the addition of the Hypervisor memory usage?

Power8 Cache

L1: 96 KB per Core


L2: 512 KB per Core
Large working sets
Single thread sensitive
Multi-threaded

L3: 96MB per SCM


Virtualization
Shared data

L4: 16MB off-chip on each memory card
Write burst traffic
55% lower latency reads
Mixed reads and writes

Where does your application access data?

Access data:    L1 cache    L2 cache    L3 cache    L4 cache    Local memory    Remote memory    Distant memory
Cycles:                     12          28          180         320             500              800

Best Practice #3 (Partition Placement for Affinity)


Help the hypervisor cleanly place partitions when they are first defined and activated.
Define dedicated partitions first.
Within the shared pool, define large partitions first.
After the initial LPAR definitions, IPL the system.
At full system (not partition) IPL, the Hypervisor will allocate resources for best affinity on the given configuration.
At deep IPL (system power cycle), the Hypervisor will use the previous partition allocation table to place partitions for best performance.
Consider use of DPO and PowerVP.

Dynamic Platform Optimizer (DPO)


Designed to reduce the complexity and time required for clients to manage and tune their systems
DPO optimizes processor and memory affinity in virtualized consolidated environments
Process first runs to assess level of affinity by partition
User then selects partitions for system optimization
System and workloads continue to run during optimization process
System adjusts workload placement in background to optimize performance without requiring additional interaction
Available at no additional charge for Power 770, 780, 795, 870 and 880 systems with firmware level 760 or later
DPO operations can be automated using HMC

Best Practice #4

Be aware of the number of cores per chip and chips per book/drawer.
Ideally, partitions shouldn't span a chip or book/drawer boundary.
Think about the nodal resources as you define partition resources.

[Diagram: chips, each with cores and attached DIMMs]

Best Practice #5
Don't under-commit entitlement.
Every virtual processor has a preferred Node ID: that set of cores close to where memory resides.
Too little entitlement results in too many VCPUs contending for a node's cores.
This results in a reduction in system capacity when needed most.
Set VCPUs to entitlement rounded up.
Don't over-commit the shared-processor pool with virtual processors.
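The "entitlement rounded up" rule can be expressed in one line. The helper below is a hypothetical illustration of that rule only; real partition profiles are set through the HMC, and the function names are not from any IBM tool.

```python
import math


def recommended_vcpus(entitlement):
    """Virtual processors for a shared partition: entitlement rounded up."""
    if entitlement <= 0:
        raise ValueError("entitlement must be positive")
    return math.ceil(entitlement)


def entitlement_per_vcpu(entitlement, vcpus):
    """Average entitled capacity behind each virtual processor."""
    return entitlement / vcpus


if __name__ == "__main__":
    for ent in (0.1, 2.3, 4.0):
        vps = recommended_vcpus(ent)
        print(ent, "cores entitled ->", vps, "VCPUs,",
              round(entitlement_per_vcpu(ent, vps), 2), "per VCPU")
```

A low entitlement-per-VCPU ratio is the under-commit case the slide warns about: many virtual processors competing for the same node's cores.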

Best Practice #6
Update firmware to the latest level
The hypervisor has had numerous performance enhancements:
Favor performance over energy savings
Home node re-dispatch
Dynamic Platform Optimizer added
New PowerVP Licensed Program Product

[Diagram: memory and processor placement for Partitions X, Y and Z, with free LMBs]

PCIe Adapter Placement Rules and Priorities


Rules for E870 and E880

All slots are x16 with buses direct from the Processor Modules and must be
used to install high-performance PCIe adapters

The adapter priority for these slots is the PCIe3 Optical Cable Adapter (FC EJ07), then SAS adapters (FC EJ0M, EJ11), followed by any other high-performance low-profile adapter

Refer to Slot priority table for all supported adapters for optimal placement

https://www-01.ibm.com/support/knowledgecenter/9119-MHE/p8eab/p8eab_87x_88x_slot_details.htm

All slots support Single Root IO Virtualization (SRIOV) capable adapters

Verify whether the adapter is supported for your system. IO placement can
be planned and validated using System Planning Tool (SPT)

PCIe I/O Drawer per E870/E880 Node

2x more drawers
PLUS
More flexibility

0, 1, 2, 3 or 4 PCIe Gen3 I/O Drawers in 2015 (max 8 fan-out modules per node)

Requires 8.3 firmware level available June 2015
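The drawer limits translate directly into adapter slot counts. The sketch below assumes, as the drawer slide states, that each 6-slot fan-out module attaches to one of the eight system node PCIe slots, and that the attaching node slot is then unavailable for other adapters. The function name is illustrative.

```python
SLOTS_PER_FAN_OUT = 6        # PCIe Gen3 slots per fan-out module
NODE_SLOTS = 8               # PCIe slots per system node
MAX_FAN_OUTS_PER_NODE = 8    # two fan-out modules per drawer, up to 4 drawers


def adapter_slots(nodes, fan_outs_per_node):
    """Total PCIe adapter slots for a system with identical nodes."""
    if not 0 <= fan_outs_per_node <= MAX_FAN_OUTS_PER_NODE:
        raise ValueError("0 to 8 fan-out modules per node")
    # Each fan-out module uses one node slot for its cable adapter.
    free_node_slots = nodes * (NODE_SLOTS - fan_outs_per_node)
    drawer_slots = nodes * fan_outs_per_node * SLOTS_PER_FAN_OUT
    return free_node_slots + drawer_slots


if __name__ == "__main__":
    print("1 node, no drawers:", adapter_slots(1, 0))
    print("E870, 2 nodes, 4 drawers each:", adapter_slots(2, 8))
    print("E880, 4 nodes, 4 drawers each:", adapter_slots(4, 8))
```

These totals agree with the "8 - 96" and "8 - 192" PCI adapter ranges listed for the E870 and E880.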



PCIe I/O Drawer per E870/E880 Node

For even more flexibility, you can choose to have 1/2 drawers.
Thus any of the drawers could have a single 6-slot fan-out module

0, 1/2, 1, 1 1/2, 2, 2 1/2, 3, 3 1/2 or 4 PCIe Gen3 I/O Drawers in 2015 (max 8 fan-out modules per node)

Requires 8.3 firmware level available June 2015



Supported PCIe I/O Drawer Cabling Examples


Note: each single blue/green/etc. line below depicts two physical AOC cables.

Notes:
With two system nodes it is good practice (but not required) to attach the two fan-out modules in one I/O drawer to different system nodes. Combined with placing redundant PCIe adapters in different fan-out modules, this enhances system availability.
The PCIe I/O drawer can be in the same rack as the system nodes or in a different one. If large numbers of I/O cables are attached to PCIe adapters, it helps to have the I/O drawer in a different rack for ease of cable management.
System control unit not shown for visual simplicity.

Supported PCIe I/O Drawer Cabling: More Examples

System Planning Tool

www.ibm.com/systems/support/tools/systemplanningtool/

Enterprise System Solution Guidelines

SMT Guidance
Active Memory Mirroring Guidance
SRIOV Guidance
Power Saving Guidance
Enterprise Pools

Review: Power6 vs Power7/Power8 SMT Utilization


Power6 vs Power7/Power8 Dispatch


Migrations: Dispatching, SMT Guidance

When migrating from POWER7 to POWER8, expect the following:
Dispatch behavior remains the same
Physical CPU consumption will look similar based on VPs
When migrating from POWER5/POWER6 to POWER8, expect the following:
Dispatch behavior will be different (scaled and raw)
Physical CPU consumption will look higher on POWER8
Too few VPs can limit the dynamic scalability of the workload
Too many VPs can result in:
Higher physical CPU usage for heavily loaded partitions (raw throughput mode, default)
VP folding for less loaded partitions

Power8 SMT Default: Why SMT4?


A partition that runs AIX 6.1 on POWER8 will only support POWER6, POWER6+ or POWER7 mode
Will limit partition to SMT4
A partition that runs AIX 7.1 on POWER8 will support POWER6, POWER6+, POWER7 or POWER8 mode
Will scale partitions to SMT8
AIX chose to keep SMT4 as default on POWER8
Most workloads will be fine with SMT4 or SMT8
Applications with scalability issues will not be able to leverage SMT8
Many workloads do not run at the 80% utilization levels needed to use SMT8 threads
SMT4 is the best of all worlds for now, but there are now more options to exploit SMT


Power8 SMT: Should I use SMT8?


Any PoC or benchmark where we are going to drive to 80% utilization
We want to use all the capacity
OLTP DB, large WAS servers, etc. will benefit
Environments where you have a fair idea of SMT behavior
If utilization is high and increasing SMT threads has improved performance
It is easy and free to test SMT4 and SMT8 modes; no reboot required

For new applications, the software stack needs review
If the application space is well known on AIX, SMT8 should not be a problem
If the application is new to AIX, it should be tested for scaling issues

Scaled Throughput Guidance


Active Memory Mirroring - Hypervisor Mirroring


Standard on E870 and E880 systems
Eliminates platform outages due to uncorrectable errors in memory
Maintains two identical copies of the system hypervisor in memory at all times
Both copies are simultaneously updated with any changes
In the event of a memory failure on the primary copy, the second copy will be automatically invoked and a notification sent to IBM via the Electronic Service Agent (ESA)

AMM Guidance
Hypervisor memory mirroring defaults to enabled. You need to be aware of this when sizing system memory. Plan on the hypervisor taking about 8% of each node's memory, or about 16% with hypervisor mirroring enabled.
Remember, hypervisor data that is mirrored:
Hardware Page Tables (HPTs), which the hypervisor manages on behalf of partitions to track the state of the memory pages assigned to the partition
Translation control entries (TCEs), which the hypervisor manages on behalf of partitions to communicate partition I/O buffers to I/O devices
Hypervisor code (instructions that make up the hypervisor kernel)
Memory used by the hypervisor to maintain partition configuration, I/O states, Virtual I/O information, partition state and so on
Hypervisor data that is not mirrored:
Memory used to hold the contents of a platform dump while waiting for offload to HMC/OS
Partition data is not mirrored:
Desired memory configured for individual partitions is not mirrored
Switch off the I/O Adapter Enhanced Capacity feature unless you are running Linux with dedicated physical adapters.
I/O Adapter Enhanced Capacity is reserved memory
With hypervisor memory mirroring enabled, this gets doubled. Reserved memory can go excessively high for Power8 Enterprise systems

SRIOV Guidance
Link Aggregation (LACP) will not function properly with multiple logical ports using the same physical port
Etherchannel is not recommended for an SR-IOV configuration. With Etherchannel, SR-IOV logical ports may go down while the physical link remains up. The switch does not recognize a logical port going down and will continue to send traffic on the physical port
Use Link Aggregation (LACP) with one logical port per physical port. This provides greater bandwidth than a single link, with failover
Best Practice
Assign 100% capacity to each SR-IOV logical port in the Link Aggregation Group to prevent accidental assignment of another SR-IOV logical port to the same physical port

SRIOV Guidance (LPM Options with SRIOV)


Multiple VIOS configuration
Use current Virtual Ethernet support with logical ports as Shared Ethernet Adapter (SEA) physical connections to the network
Reduced adapter and port requirements
Does not receive the performance benefits provided with SR-IOV Direct Access

SRIOV Guidance (LPM Options with SRIOV)


Active-backup configuration
Configure SR-IOV logical port as Active connection and Virtual Ethernet adapter as backup
Prior to migration, use dynamic LPAR operation to remove SR-IOV logical port
Virtual Ethernet becomes Active connection
Migrate the partition
On target system, configure SR-IOV logical port as Active connection

VIOS, AIX, Linux and HMC Guidance


The minimum level of AIX 6.1 or 7.1 supported on E870 and E880 depends on whether or not the partition has 100% virtualized (via VIOS) resources
The minimum level of VIOS:
VIOS 2.2.3.4 with ifix IV63331
VIOS 2.2.3.51 with APAR IV68443 and IV68444
Fix Level Recommendation Tool (FLRT)
https://www14.software.ibm.com/webapp/set2/flrt/home
For LPM fix recommendations, use the FLRT LPM Report

Power Saving and Favor Performance


Power Saver Mode
Predetermined reduction in processor frequency
Dynamic Power Saver Mode
Processor frequency varies based on usage of processors
Frequency can be increased (favor performance) or reduced (energy saving)
If performance is favored over energy saving, consider enabling Favor Performance mode in ASMI

Power Enterprise Pools


Flexibility & ease of operations & price performance
Enhanced availability and cloud characteristics
For POWER7+ 770, POWER7+ 780, Power 795, Power E870 and Power E880

Power Enterprise Pools


Power Enterprise Pools enable you to move processor and memory
activations within a defined pool of systems, at your convenience.

New mobile activations for both processor and memory

Mobile activations can be used for systems within the same pool
One pool type for Power E880 & POWER7+ 780 & Power 795 systems
One pool type for Power E870 & POWER7+ 770 systems

Activations can be moved at any time by the user without contacting IBM
Done using HMC

Movement of activations is instant, dynamic and non-disruptive

Many Power Systems software entitlements also mobile



Power Enterprise Pools Example


Monday 8 am

Sys A
64-core E880
4.35 GHz
Activations:
10 static
40 mobile
14 dark

Sys B
96-core 795
3.7 GHz
Activations:
30 static
40 mobile
26 dark

Sys C
96-core 780
3.7 GHz
Activations:
16 static
20 mobile
60 dark

Sys D
128-core 795
4.0 GHz
Activations:
40 static
60 mobile
28 dark

Pool Totals
Activations:
96 static
160 mobile
128 dark


Power Enterprise Pools Example

Monday 8:01 am

Sys A
64-core E880
4.35 GHz
Activations:
10 static
0 mobile
54 dark

Sys B
96-core 795
3.7 GHz
Activations:
30 static
55 mobile
11 dark

Sys C
96-core 780
3.7 GHz
Activations:
16 static
45 mobile
35 dark

Sys D
128-core 795
4.0 GHz
Activations:
40 static
60 mobile
28 dark

Pool Totals
Activations:
96 static
160 mobile
128 dark
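The two snapshots above can be replayed as a small model to confirm that pool totals stay constant while mobile activations move. This is an illustrative sketch only: the `System` class and `move_mobile` function are hypothetical and do not correspond to any HMC command, and the split of Sys A's 40 mobile activations (15 to Sys B, 25 to Sys C) is read off the difference between the 8 am and 8:01 am figures.

```python
from dataclasses import dataclass


@dataclass
class System:
    installed: int  # installed cores
    static: int     # static activations (tied to this system)
    mobile: int     # mobile activations currently assigned here

    @property
    def dark(self):
        """Installed cores with no activation."""
        return self.installed - self.static - self.mobile


def move_mobile(pool, source, target, count):
    """Move mobile activations between two systems in the same pool."""
    if count > pool[source].mobile:
        raise ValueError("source does not hold that many mobile activations")
    if count > pool[target].dark:
        raise ValueError("target does not have enough dark cores")
    pool[source].mobile -= count
    pool[target].mobile += count


# Monday 8 am state from the example.
pool = {
    "A": System(installed=64, static=10, mobile=40),
    "B": System(installed=96, static=30, mobile=40),
    "C": System(installed=96, static=16, mobile=20),
    "D": System(installed=128, static=40, mobile=60),
}

# Monday 8:01 am: Sys A gives up all of its mobile activations.
move_mobile(pool, "A", "B", 15)
move_mobile(pool, "A", "C", 25)

for name, system in pool.items():
    print(name, system.static, "static", system.mobile, "mobile", system.dark, "dark")
print("pool mobile total:", sum(s.mobile for s in pool.values()))
```

The static count never changes, and the mobile and dark totals across the pool remain 160 and 128.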


Power Enterprise Pools Guidance


PLAN: Review Power Enterprise Pools offering and plan implementation
DEFINE: Define participating systems by serial numbers within a pool
SIGN: Sign Power Enterprise Pools contract and addendum
REQUEST: Submit addendum to IBM and request Pool ID
ORDER: Order mobile enablement, processor and memory activations
INSTALL: Install new firmware for participating systems and HMC
DOWNLOAD: Download configuration file to HMC from IBM web site
USE: Assign activations to systems

Summary (1 of 2)
Identify the Power Enterprise systems best suited to your needs
Perform sizing based on throughput and response time considerations
For response-time-critical workloads, higher frequency POWER8 processors will give more benefit
Understand SMT behavior on POWER8 systems; evaluate and apply accordingly
For maximum memory bandwidth, populate all memory DIMM slots
For optimum cache and memory affinity, plan for partition placement in processor nodes
Additional drawers may help you get better performance. Plan for scalability and performance
Apply the latest firmware level and review minimum supported OS, VIOS and HMC levels for using various capabilities on POWER8 Enterprise systems
Consider planning for I/O adapter placement based on slot priorities

Summary (2 of 2)
AMM can be leveraged for higher reliability on Enterprise systems. Disable I/O Adapter Enhanced Capacity to avoid excessive usage of hypervisor memory
SRIOV can be considered based on solution requirements
Leverage tools like SPT, WLE, SEE for planning
DPO and PowerVP can help in management of partition affinity on Enterprise systems
Power Enterprise Pools will help provide additional availability across systems in the pool

PowerCare Service
Select one PowerCare service option with each Power E870 or E880
A PowerCare Services engagement offer is included, at no additional charge, with the purchase of each Power E870 or E880 system.
Power E870 engagement options include :

Enterprise Systems Optimization


Power Systems Availability
Cloud Enablement
Power Integrated Facility for Linux (IFL)

Power E880 PowerCare engagement options include:

Enterprise Systems Optimization


Power Systems Availability
Cloud Enablement
Security
Power Integrated Facility for Linux (IFL)
Tivoli Monitoring Enablement
Mobile Enablement with Worklight
Private Technical Training

For more information contact IBM Lab Services stgls@us.ibm.com



Thank You

References
Power systems best practices
http://www14.software.ibm.com/webapp/set2/sas/f/best/home.html

E870, E880 Redbook


https://www.redbooks.ibm.com/redbooks.nsf/RedbookAbstracts/redp5137.html?Open

IBM System Planning Tool


www.ibm.com/systems/support/tools/systemplanningtool/

Fix Level Recommendation Tool


https://www14.software.ibm.com/webapp/set2/flrt/home

PCIe Slot priority table for all supported adapters for optimal placement
https://www-01.ibm.com/support/knowledgecenter/9119-MHE/p8eab/p8eab_87x_88x_slot_details.htm

Dynamic Platform Optimizer


https://www-01.ibm.com/support/knowledgecenter/POWER7/p7hat/iphatdpoovw.htm?cp=POWER7%2F1-8-2-5-3-5-0


References
AIX Performance website
https://www.ibm.com/developerworks/wikis/display/WikiPtype/Performance+Monitoring+Documentation
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Power%20Systems/page/rperff

System Performance Reports http://www.ibm.com/systems/power/hardware/reports/system_perf.html


IBM Benchmark Index http://www-03.ibm.com/systems/power/hardware/reports/system_perf.html
Benchmarking blog https://www.ibm.com/developerworks/mydeveloperworks/blogs/benchmarking
Workload Estimator http://www.ibm.com/systems/support/tools/estimator/
Americas Lab Services http://www-03.ibm.com/systems/services/labservices/
