
Storage Networking

Design and Management

Section 3: Performance Stack & Design Considerations

2007 EMC Corporation. All rights reserved.


Overview
Module 3.1: Application Layer
Application characteristics
Databases
Logical & Physical structure of databases
Database system I/O
I/O profile of database files
I/O patterns and distributing workload
Module 3.2: Host Layer
File systems & volume managers
Host Virtual memory
HBA
Overview of system performance
Tools
Module 3.3: Storage Layer
Service processor
Cache
Back-end Storage design
Module 3.4: Case Study

[Figure: Performance Stack & Design Considerations, showing the Application Layer, Host Layer, and Storage Layer, plus Case Study: Email-Exchange]



Performance Stack & Design Considerations
Upon completion of this section, you will be able to:
Define all elements of the performance stack and how they impact service
level targets.
Define I/O characteristics of different applications
Describe the logical and physical structure of databases
Explain different types of database files and the I/O profile of database files
Detail the overall performance impact on the application due to host layer
components such as file systems, volume managers, virtual memory and
HBA
Describe various procedures and tools available to collect the data from
host, databases and Storage infrastructure
Use best practices to configure the back-end of a Storage array with
complete understanding of Storage processor, cache and application
demands
Demonstrate effective application of the principles in a standard email
exchange environment (via case study)
Module 3.1

Application Layer



Application Layer
Upon completion of this module, you will be able to:
Define the design considerations of the Application layer of the
Performance Stack:
Types of application, characterization of I/O, application buffers and their
impact in design
Define logical and physical structure of databases
List all types of database files and database I/O operations
Explain concurrency and parallelism in a database and how it
impacts Storage design
Explain the separation of database files and how it improves
performance
Detail a strategy to select an appropriate technology (SAN/NAS or
hybrid) to host a database application



Performance Stack
[Figure: the performance stack. Applications on hosts (Host / O.S.) reach data volumes on arrays through the storage/IP network. Layers shown: Applications Layer, Host & HBA Layer, Storage Networks Layer, Storage Layer]



Lesson 1: Application Layer Design Considerations

Upon completion of this lesson, you will be able to:


Gather Requirements for an Application and describe the
Application Workload
Identify Application Types:
OLTP
DSS
Back-up and Email
File sharing
ERP
e-Business
Describe Application I/O characteristics and buffers and
their impact in design
Gathering Requirements for an Application
Is the application / database / file system mission critical?
Does the application / database / file system need 24x7
availability?
Does it need back-up and recovery?
How large is the database/file system?
What is the expected growth rate for this database/file system?
What data types are needed? (binary, large objects?)
Is replication needed?
How many concurrent users will have database/file system access?
What support is expected for this database/file system?
Is there a platform dependency or requirement?
When is this database / file system needed?
How long does the database / file system need to be accessible?

[Figure: requirement categories: Availability, Performance, Recoverability]



Application Demands of SLA from Storage Management
Teams

Efficiency
Effectiveness
Quality
Timeliness
Productivity
Cost

[Figure labels: System Oriented, Service Oriented]



Understanding the Application Workload
Application classification
Externally networked, Enterprise, Workgroup

Application characteristics that impact the NFR


Application efficiency
I/O character
Data access patterns
Buffering
Tuning of Host SW
Host buffers
Database global memory
De-fragmentation of swap files

[Figure: performance stack components: Application, File System, Volume Manager, Virtual Memory, HBA/Driver, Storage Controller, Cache, Back-end]



Characterizing I/O
Random vs Sequential
Random: successive reads or writes go to non-contiguous addresses, with
accesses spread across the addressable capacity of the LUN
Sequential: successive reads or writes are physically contiguous, one
logical block address after the other
Random I/O requires the disk to seek for subsequent requests
The slow mechanical movement of the disk becomes the bottleneck
Read vs Write
Cache based Storage subsystems have eliminated the disk bottleneck for
writes
Sequential reads use cache effectively

I/O Request size


Bursts & Busy periods
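
The random versus sequential distinction above can be checked against an I/O trace. The following is a minimal sketch, assuming a hypothetical trace format of (starting logical block address, block count) pairs in issue order; it counts how many requests begin exactly where the previous one ended.

```python
def sequential_fraction(requests):
    """Fraction of requests that start where the previous request ended.

    `requests` is a list of (start_lba, block_count) tuples in issue order
    (a hypothetical trace format chosen for this illustration).
    """
    if len(requests) < 2:
        return 0.0
    sequential = 0
    for (prev_lba, prev_len), (lba, _) in zip(requests, requests[1:]):
        if lba == prev_lba + prev_len:
            sequential += 1
    return sequential / (len(requests) - 1)


scan = [(0, 8), (8, 8), (16, 8), (24, 8)]             # table-scan style
lookups = [(5000, 8), (120, 8), (90000, 8), (64, 8)]  # index-lookup style
print(sequential_fraction(scan))     # 1.0
print(sequential_fraction(lookups))  # 0.0
```

A value near 1.0 suggests a workload that benefits from read-ahead and large requests; a value near 0.0 suggests a seek-bound workload.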
Application Type: OLTP
Online Transaction Processing
Workload
Large number of concurrent users
High volume of transactions
Works on a single database row at a time
Response time sensitive
67% reads 33% writes
Design Considerations
Indexes to boost performance
Well tuned SQL
Small block sizes
Few table scans
Examples of OLTP Applications:
Electronic banking
E-tailing
Application Type: DSS
Decision Support Systems
Data extracted from multiple sources to support end-user
Data warehouse
Data marts
On-line Analytical Processing (OLAP)
Data mining
Reporting
Workload
Smaller number of users
Large number of queries using complex SQL statements
Process many database rows at a time
Design Considerations
Large disk space
Longer Response times
Fewer users
80-90% reads; frequent table scans (sequential); heavy writes to temp space
Examples of DSS Applications
Credit risk rating
Medical diagnosis
Application Types: Back-up and Email
Back-up
File based back-ups are sequential
Application based back-ups, such as MS Exchange online back-up,
are random
Back-up with Point-in-time copy provides advantages for enhanced
availability for applications
Back-up to disk is becoming a viable option

Email
Very similar to OLTP
MS Exchange is known for its 60-40 read-write ratio



Application Type: File Sharing



Application Systems: Enterprise Resource Planning (ERP)
Three tier client-server solution
Central database server
Multiple application servers
User workstations
Complex database structure with large number of tables
Complex queries
Database is generic and hence customized with installation-specific data and
complex joins of several tables.
Not optimized as in OLTP
Updates are typically queued and done in batches to improve response
time
Large number of users
OLTP & OLAP
Processes transactions using simple and complex SQL
Process many database rows at the same time
Response time sensitive
Example: SAP
Application Systems: ERP Service Level Requirements

Rule of thumb: requires 5 times the disk space


Designed to support 100 to 1000s of users
Response times should be less than four seconds
99.999% availability for most production systems;
development and test systems would need to be
available during normal business hours
Zero system down-time for production systems; 4 to 8
hours per night for development and test systems due to
the lack of a 24 x 7 availability requirement
Long implementation cycles due to extensive
customization
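
To make the availability target concrete, an availability percentage can be converted into the downtime it permits per year. This is a small illustrative calculation, assuming a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Minutes of downtime per year permitted by an availability target."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes/year")
```

Under these assumptions, "five nines" leaves a little over five minutes of downtime per year, which is why production systems with this target are usually designed for effectively zero planned downtime.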
Application Systems: e-Business - 1
Solution Architecture

[Figure: e-Business solution architecture, first half. Tiers shown: User Tier, Web Tier, Middle Tiers. Labels: Web, Voice, WAP, Front End; Web Server Farm, Portal; J2EE or CORBA Containers, Workflow, Messaging, Expert systems, eCommerce, Business Logic, Back Office Integration, Other Systems. Protocols: http, XML, RMI, CORBA]



Application Systems: e-Business - 2
[Figure: e-Business solution architecture, second half. Data Tier: Distributed Databases, Warehousing, On-line back-up, Data Storage Logic and Reporting, reached with SQL via JDBC or ODBC. B2B Services: Partner Gateways, Payment servers, Supplier Integration, reached over the Network or Internet with RMI, Messaging, CORBA, EDI, Web Services, XML over HTTP]



Application Systems: e-Business Service Level
Requirements
Typically requires three times the disk space (like OLTP)
No real concept of a user, but each database hit usually
results in the execution of a database transaction
Response times typically less than two seconds
Available 24 x 7 because it is connected to the web
Application is typically a third party package, possibly
modified for requirements



Application I/O Size
Depends on the framework on which the application is
built
Typically controlled by the underlying database engine
and the environment variables set
To determine I/O sizes in a complex environment, use
tools such as Navisphere Manager / ECC Performance
manager



How Many IOPS Can We Do? (An Example)
Ultra 3 SCSI controller: 0.3 ms overhead; 160 MB/s throughput
Consider Oracle I/O
Single block reads (4K, 8K, 16K)
Multiblock reads (32K, 64K, 128K)
Single block writes (4K, 8K, 16K)
Multiblock writes (32K, 64K, 128K)
How many I/Os can we do?
Transfer time
4K / 160 MB/s = 0.025 ms
8K / 160 MB/s = 0.05 ms
16K / 160 MB/s = 0.1 ms
32K / 160 MB/s = 0.2 ms
64K / 160 MB/s = 0.4 ms
Controller Overhead + Transfer Time
64K: 1/(0.3 + 0.4) ms = 1428 IOPS
32K: 1/(0.3 + 0.2) ms = 2000 IOPS
16K: 1/(0.3 + 0.1) ms = 2500 IOPS
8K: 1/(0.3 + 0.05) ms = 2857 IOPS
4K: 1/(0.3 + 0.025) ms = 3076 IOPS
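
The arithmetic on this slide can be reproduced with a short calculation. This is a sketch under the slide's own assumptions (0.3 ms controller overhead, 160 MB/s, decimal units so that KB divided by MB/s gives milliseconds); the function name is illustrative. The slide truncates results to whole IOPS.

```python
def controller_iops(io_size_kb, overhead_ms=0.3, bandwidth_mb_s=160):
    """IOPS limit of a controller: 1 / (overhead + transfer time)."""
    # With decimal units (1 MB = 1000 KB), KB / (MB/s) is already in ms.
    transfer_ms = io_size_kb / bandwidth_mb_s
    return 1000 / (overhead_ms + transfer_ms)


for size_kb in (64, 32, 16, 8, 4):
    print(f"{size_kb}K: {controller_iops(size_kb):.1f} IOPS")
```

Note how the fixed overhead dominates for small requests: halving the I/O size from 8K to 4K raises the limit by less than 10%.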



How Many I/Os Can a Disk Do?
Average time for a disk I/O:
15000 RPM
5 ms average seek time
40 MB/sec transfer rate
I/O time is: seek time + average rotational delay + transfer time
= 5 ms + (0.5 revolution at 15000 RPM) + (blocksize / (40 MB/sec))
Example (32K): 5 ms + 2 ms + 32K/40MB = 7.8 ms; 1 / 7.8 ms = 128 IOPS
4K: 5 + 2 + 0.1 = 7.1 ms (1/7.1 ms = 140 IOPS)
8K: 5 + 2 + 0.2 = 7.2 ms (1/7.2 ms = 139 IOPS)
16K: 5 + 2 + 0.4 = 7.4 ms (1/7.4 ms = 135 IOPS)
32K: 5 + 2 + 0.8 = 7.8 ms (1/7.8 ms = 128 IOPS)
64K: 5 + 2 + 1.6 = 8.6 ms (1/8.6 ms = 116 IOPS)
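
The per-disk figures follow from adding seek time, average rotational delay (half a revolution) and transfer time. The sketch below reproduces that sum using the slide's parameters as defaults; the function names are illustrative, and decimal units are assumed as on the slide.

```python
def disk_service_time_ms(io_size_kb, seek_ms=5.0, rpm=15000, transfer_mb_s=40):
    """Average time for one random disk I/O, in milliseconds."""
    rotational_ms = 0.5 * 60_000 / rpm        # half a revolution on average
    transfer_ms = io_size_kb / transfer_mb_s  # KB / (MB/s) gives ms
    return seek_ms + rotational_ms + transfer_ms


def disk_iops(io_size_kb, **disk_params):
    """Maximum random IOPS of a single disk at the given I/O size."""
    return 1000 / disk_service_time_ms(io_size_kb, **disk_params)


for size_kb in (4, 8, 16, 32, 64):
    print(f"{size_kb}K: {disk_service_time_ms(size_kb):.1f} ms, "
          f"{disk_iops(size_kb):.1f} IOPS")
```

Compared with the controller example, the mechanical components (7 ms of seek and rotation) dwarf the transfer time, so I/O size has little effect on a single disk's random IOPS.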



Optimizing Response Times
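8K blocksize (139 IOPS maximum, 7.2 ms service time)
Offered load (IOPS) => response time
135 => 240.0 ms
105 => 29.5 ms
75 => 15.7 ms
45 => 10.6 ms

64K blocksize (116 IOPS maximum, 8.6 ms service time)
Offered load (IOPS) => response time
135 => Not Possible
105 => 88.6 ms
75 => 24.6 ms
45 => 14.6 ms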
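
Most of the response times on this slide are consistent with the single-server queueing approximation R = S / (1 - U), where S is the service time and U = load x S is the utilization. That the slide was derived this way is an inference from the numbers, not something it states, and some of its values appear to use rounded utilization. A minimal sketch under that assumption:

```python
def response_time_ms(load_iops, service_ms):
    """Approximate response time for a disk under a given offered load.

    Uses R = S / (1 - U) with U = load * S (an assumed model).
    Returns None when the load exceeds what the disk can service.
    """
    utilization = load_iops * service_ms / 1000
    if utilization >= 1:
        return None
    return service_ms / (1 - utilization)


for service_ms in (7.2, 8.6):
    for load in (135, 105, 75, 45):
        print(service_ms, load, response_time_ms(load, service_ms))
```

The practical point is the shape of the curve: response time grows slowly up to moderate utilization and then climbs steeply, so disks should be sized to run well below their maximum IOPS.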



Application Buffering & Coalescing
The goal in buffering and coalescing
Minimize number of transfers to disk

Applications, as the creators of data, know best how it is used
Buffering and coalescing
Put / Retain data in RAM for reuse
Application aware of data to retain and reuse



Data Organization in the Disk: Page
What is the unit of transfer between disk and memory?
Entire relation?
May not fit in memory
Only part of relation may be needed by query

Page-based organization
Storage organized into fixed size Pages (or disk blocks)
Memory buffers organized into similarly sized Page frames
Disk pages brought into memory, loaded into a page frame
Typical database page sizes: 2KB, 4KB, 8KB



Buffer Management
Buffer: Portion of memory to store copies of disk pages
Buffer manager: manages pool of buffers
Page frames used to store disk pages
Goal is to minimize transfers between disk and main memory



Typical Database I/O: When a page is requested
What does the buffer manager do?
Allocate an empty page frame if one exists
If no empty page frame exists, then selects a victim page

If victim page is dirty, writes it back to disk


Read page into allocated frame, pin it, return its address
Concurrency control (CC) & recovery may induce other I/Os & constraints

Victim selection is determined by replacement policy


Replacement policy has big impact on # of I/Os

Many policies exist, best one depends on access pattern
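
The page-request steps above can be sketched as a toy buffer manager. This is an illustration only: it assumes LRU as the replacement policy (one of many), uses a plain dictionary as a stand-in for the disk, and omits pinning, concurrency control and recovery. All names are hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer manager with LRU victim selection (pinning omitted)."""

    def __init__(self, frames, disk):
        self.frames = frames       # number of page frames in memory
        self.disk = disk           # dict page_id -> bytes, stands in for disk
        self.pool = OrderedDict()  # page_id -> [data, dirty], LRU first
        self.reads = 0
        self.writes = 0

    def get_page(self, page_id):
        if page_id in self.pool:   # hit: no disk I/O needed
            self.pool.move_to_end(page_id)
            return self.pool[page_id]
        if len(self.pool) >= self.frames:  # no empty frame: pick a victim
            victim_id, (data, dirty) = self.pool.popitem(last=False)
            if dirty:              # a dirty victim is written back first
                self.disk[victim_id] = data
                self.writes += 1
        self.pool[page_id] = [self.disk[page_id], False]
        self.reads += 1
        return self.pool[page_id]

    def mark_dirty(self, page_id, data):
        frame = self.get_page(page_id)
        frame[0], frame[1] = data, True


disk = {i: f"page{i}".encode() for i in range(4)}
pool = BufferPool(frames=2, disk=disk)
pool.get_page(0)
pool.get_page(1)
pool.mark_dirty(0, b"new")  # page 0 becomes most recently used and dirty
pool.get_page(2)            # evicts clean page 1: one read, no write
pool.get_page(3)            # evicts dirty page 0: write-back, then read
print(pool.reads, pool.writes)  # 4 1
```

Counting `reads` and `writes` for different access sequences shows how strongly the replacement policy and pool size affect the number of physical I/Os.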



Temporal Patterns and Peak Activities
Batch processing workload
OLTP: Peak time
Bursts can be global or localized



Typical Characteristics of Applications



Check Your Knowledge
List three design considerations for:
ERP applications
OLTP

What are current trends in the back-up application to


enhance application availability?
What are the four application characteristics that impact
NFR?
What is the primary goal of application buffering and how
is it different from Storage caching?



Lesson 2: Databases
Upon completion of this lesson, you will be able to:
Know when to use SAN, NAS or IP-Storage Solutions
Describe database design considerations
Identify the logical and physical structure of databases
List basic rules of Storage Design for databases



When to Use SAN, NAS or IP-Storage Solutions
Application: Amount and type of data (file level or block level) to store
and share
Performance: I/O and throughput requirements
Scalability: Long-term data growth
Availability and Reliability: How mission-critical are your applications?
Data Protection: Back-up & Recovery requirements
Available IT staff and resources
Budget concerns

[Figure labels: Performance, Cost, Convenience]



Storage Networking Emerging Trends

New Requirements
Massive Consolidations
Distributed deployments
Unlimited scalability
Do More with Less
Global access

New Technologies
New Standards
Networking Advances
Storage Advances
Virtualization
ILM

Opportunity for Reinvention

[Figure labels: IP-Based Storage; Thousands of Servers; Thousands of Users; NFS Authentication; CIFS Authentication; ILM Tiered Storage; Application Virtualization; Global File Management; Grids, Simulations, Database; Email, Web, Collaboration; Need High Performance; Need Single Namespace]
SAN & NAS Comparison
Protocol
SAN: Fibre Channel; Fibre Channel-to-SCSI
NAS: TCP/IP; NFS/CIFS

Applications
SAN: Mission-critical transaction-based database application processing;
centralized data back-up; disaster recovery operations; Storage consolidation
NAS: File sharing in NFS and CIFS; small-block data transfer over long
distances; database applications with lower performance requirements

Advantages
SAN: High availability; data transfer reliability; configuration flexibility;
high performance; high scalability; centralized management; multiple vendor
offerings
NAS: Few distance limitations; simplified addition of file sharing capacity;
easy deployment and maintenance



Database Applications: Traditional Database I/O Access
[Figure: traditional database I/O access. Components: Client/App server, LAN, DB Server, Array with Volumes; path labels A and C]



NAS Database I/O Access
[Figure: NAS database I/O access. Components: Client/App server, LAN, DB Server, NAS Device, Array with Volumes; path labels A to E]
SAN Database I/O Access
[Figure: SAN database I/O access. Components: Client/App server, LAN, DB Server, SAN, Array with Volumes; path labels B and C]
Technology Decision for Database Application
Items to Consider:
Application Workload
Organizational issues
Multi-protocol
Point in time copies
Multi-vendor Storage
Performance
Ease of use
Growth

[Figure: technology choice by scale and criticality. Data Centers (Mission Critical, 3TB, $$$): FC-SAN. Departments (Mission Important, 1TB, $$): IP-SAN, NAS. Workgroups: NAS/DAS]



Database Design Considerations

[Figure: the I/O path, marking where response time is experienced, where response time is measured, and where reads and writes are measured. Components, top to bottom: User applications, Threads, Database, Volume manager, File system, Operating systems, Host devices, Device drivers, Multipathing, HBAs, Host buses, Ports, Zones, LUN Masking, Ports, Directors/CPUs, Cache]



Logical and Physical Structure of Databases



Database Storage on Physical Disk
Database stored as a collection of files
Each file is a sequence of records
Each record is a sequence of fields
Records may be of variable length
Fields within records may be of variable length

Each file is stored as a sequence of disk pages


Records packed inside a page
Record identifier or RID = (Page identifier, offset)
A disk page usually contains numerous records
Records (rows) packed in disk blocks
Typical block sizes are 2 KB, 4 KB, and 8 KB
Scan of a table retrieves all blocks
All blocks containing the records of a table are retrieved
Minimizing disk I/Os (block accesses) is a key database performance goal
Disks are slow
Reducing disk I/Os is the best way to speed up queries
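
To see why block accesses dominate, it helps to count the blocks a table scan must read and to locate a row by its RID. The sketch below assumes fixed-length records and ignores page headers and free space, which real databases do not; the function names and sizes are illustrative.

```python
import math


def blocks_for_table(row_count, row_bytes, block_bytes=8192):
    """Blocks read by a full table scan (fixed-length rows assumed)."""
    rows_per_block = block_bytes // row_bytes  # ignores headers, free space
    return math.ceil(row_count / rows_per_block)


def rid_for_row(row_index, row_bytes, block_bytes=8192):
    """RID = (page identifier, byte offset) of the row with this index."""
    rows_per_block = block_bytes // row_bytes
    page_id, slot = divmod(row_index, rows_per_block)
    return page_id, slot * row_bytes


print(blocks_for_table(1_000_000, 200))  # 25000
print(rid_for_row(41, 200))              # (1, 200)
```

Under these assumptions a million 200-byte rows occupy 25,000 blocks of 8 KB, so a full scan costs 25,000 block reads while an indexed lookup of a single row touches one data block.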
Database Storage Viewpoint: A Few Basics
ACID (Atomicity, Consistency, Isolation, Durability)
Built-in by the application developers

Instance
Process or collection of processes that work on a set of data files to
store and retrieve information

Types of database Storage elements


Database binaries
Database configuration files
Data files
Temporary database files
Transaction logs
Archive logs
Database Workload
Database I/O workload is very complex
Many file types
Data, log, archive, temp, undo, system, control, back-up

Many operation types


Scan, lookup, load, insert, create index, join, LOB, sort, hash,
back-up, recovery, batch write, etc.
Like an Operating System, the Database acts on behalf of
an application
OLTP different from warehouse
Batch different from OLTP
Cache efficiency varies by application
Tables, indexes, and queries are totally application dependent
Workload for different tables, indexes, etc. is radically different
Database: System & I/O
Database system performance is I/O bound
Performance largely determined by how often we go to disk

Number of disk I/Os


Disk is very often the bottleneck, but not always

Some queries bottleneck on the processor


Storage must be carefully organized to reduce I/Os
During search
Equality, range searches
During updates
Insert, Delete, Update
Space-Time trade-off
Query-Update trade-off



Database: Sequential I/O
Database recognizes operations that perform sequential
I/O and treats them specially
Scans, loads, back-ups, logging, sorts, etc.
Oracle issues large (up to 1MB) I/O requests

Sequential operations
Database performs read-ahead for sequential I/O operations
Non-sequential access uses smaller I/Os (4K)

For efficient disk access, the Storage subsystem should


not break up these I/Os



Database: Parallel Execution
Database can parallelize many operations
Scan, sorts, joins, hash, load, create index, etc.
Parallel execution I/O rate is up to 10s of GB/sec

Parallel Execution focuses intense I/O activity on an


arbitrary subset of the database
Chosen table, index, or partition gets all CPUs
If very high I/O bandwidth is not available for the subset, then
parallel execution will not scale
Implies that any subset of database must be spread across many
disks



I/O Profile of Database Files
System Tablespace
Created during DB creation; random block reads; minimum writes
Redo logs
Written sequentially for every change made on the database
Archive logs
Copy of redo logs once they are filled
Rollback segments
Used for read consistency, database recovery
Temporary Tablespace
Used for Sort, mostly sequential
Control files
Read during instance start-up
User tables and indexes
Characteristics defined by the application



Distributing I/O
Evaluate database disk-Storage requirements by
checking the size of the files and the disks
Identify expected I/O throughput for each file
Determine which files have the highest/lowest I/O rates
Lay out the files on all available disks to even out the I/O rate

Segregation of files based on


I/O rates
Recoverability concerns
Manageability issues
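
Laying out files to even out the I/O rate can be approximated with a simple greedy heuristic: place the busiest file first, always on the currently least-loaded disk. This is a sketch of one possible approach, not a method prescribed by the course; the file names and IOPS figures are hypothetical, and capacity, recoverability and manageability constraints are ignored.

```python
import heapq


def layout_files(file_iops, disk_count):
    """Assign files to disks so per-disk I/O rates end up roughly even.

    Returns {disk_index: (total_iops, [file names])}.
    """
    heap = [(0, disk, []) for disk in range(disk_count)]
    heapq.heapify(heap)
    busiest_first = sorted(file_iops.items(), key=lambda kv: -kv[1])
    for name, iops in busiest_first:
        load, disk, files = heapq.heappop(heap)  # least-loaded disk
        files.append(name)
        heapq.heappush(heap, (load + iops, disk, files))
    return {disk: (load, files) for load, disk, files in heap}


expected_iops = {"redo": 400, "data01": 300, "data02": 250,
                 "index01": 200, "temp": 150, "archive": 100}
for disk, (load, files) in sorted(layout_files(expected_iops, 3).items()):
    print(disk, load, files)
```

In practice the segregation criteria above override a purely rate-based layout; for example, redo logs are often kept on their own disks even if that leaves the load less even.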



Separation of Files - 1
Tables, Indexes, and TEMP Tablespaces
Tune the SQL or application code before separation
Investigate on-disk sorts and opportunities for tuning sorts
Tools available for these investigations:
Automatic Database Diagnostic Monitor
Automatic Workload Repository
V$SQL view
Custom Workload
SQL Trace
I/O layout should be considered after optimizing the application and
SQL code



Separation of Files - 2
Redo log files
If the high-I/O files are redo log files, then consider splitting the redo
log files from the other files.
Possible configurations are:
Placing all redo logs on one disk without any other files
Placing each redo log group on a separate disk that does not store any
other files
Striping the redo log files across several disks, using an operating
system striping tool (Manual striping is not possible in this situation)
Avoid using RAID 5 for redo logs



Separation of Files - 3
Archive log files
If archive logs are striped on the same set of disks as other files, then
any I/O requests on those disks could suffer when redo logs are being
archived
Archive log separation benefits
Archive can be performed at very high rate (using sequential I/O)
Nothing else is affected by the degraded response time on the archive
destination disks
Archived re-do logs
If the archiver is slow, then it might be prudent to prevent I/O contention
between the archiver process and LGWR by ensuring that archiver reads and
LGWR writes are separated. This is achieved by placing logs on alternating
drives



The SAME Configuration
Stripe Across all Disks With Large Stripe Depth
Ensures high disk utilization and performance

Mirror Data
Ensures high availability

Subset Data by Partition Instead of Disk


Prevents subsets from becoming bottlenecks

Stripe And Mirror Everything



Basic Rules of Storage Design for Databases
Lay out files using Operating System or Hardware
Striping
Decision on Stripe depth & width based on:
Requested I/O size
Concurrency of I/O requests
Alignment of physical stripe boundaries with block size boundaries
Manageability of the proposed system

Distribute I/O across several disks


Archive logs
Re-do logs
Tables, indexes, Temp Table spaces, Oracle files



Check Your Knowledge
Define the space/time and query/update trade-offs in a database
environment
What are the key considerations for segregating files in a
database environment?
What are the three considerations in choice of SAN, NAS
or hybrid technology for database?
List different types of database files and different
operations
What is a SAME configuration?



Module Summary
Key points covered in this module:
Design considerations for application layer of the performance stack
Various types of application
Application buffering and coalescing
Logical and physical structure of databases
Types of database files and I/O operations
I/O profile of different types of Oracle files
Optimal layout of data for database files and segregation of
workload based on I/O profiles and user requirements
Concurrency and parallelism in database operations and their
impact on Storage design
SAME configuration for Oracle, its benefits and enablers
Module 3.2

Host Layer



Host Layer
Upon completion of this module, you will be able to:
Define the design considerations in the Host & HBA layer
of the Performance Stack
Host Virtual memory, file systems, Volume managers and the effects
of HBA

Gather data from hosts, applications and Storage


subsystems and understand how to interpret them



Performance Stack
[Figure: the performance stack. Applications on hosts (Host / O.S.) reach data volumes on arrays through the storage/IP network. Layers shown: Applications & Databases Layer, Host & HBA Layer, Storage Networks Layer, Storage Layer]
Host File Systems
Present logical (abstract) view of files and directories
Hide complexity of hardware devices
Facilitate efficient use of Storage devices
optimize access, i.e., to disk
Support sharing
Provide protection
File system resides on a single logical disk or partition
A partition can be viewed as a linear array of blocks
block represents the granularity of space allocation for files
a disk block is 512 bytes * some power of 2
physical block number identifies a block on a given disk partition
physical block number can be translated into physical location on a
partition

[Figure: performance stack components: Application, File System, Volume Manager, Virtual Memory, HBA/Driver, Storage Controller, Cache, Back-end]



I/O Optimizations Available Inherently with File Systems

Reducing the number of I/Os to the underlying device(s)


(where possible)
Grouping smaller I/Os together into larger I/Os (where
possible)
Optimizing the seek pattern to reduce the amount of time
spent waiting for disk seeks
Caching as much data as realistic to reduce physical I/Os



Understanding Workload Profile



File System Buffering

application: read/write files

OS: translate file to disk blocks

maintains ...buffer cache ...

controls disk accesses: read/write blocks

hardware:

Any problems?



File System Request Size: Minimum & Maximum I/O Size

Request size: size of I/O used to transfer data from disk


to memory and vice versa
File system block size is minimum request size
Maximum I/O size: largest request a file system can write to
the physical media



File System Coalescing
Combine separate but logically contiguous I/O requests
to one large request
Helps to enhance bandwidth
Enabled with parameters set while creating the File
system
E.g., Solaris
MAXPHYS (Maximum Physical I/O Size)
MAXCONTIG
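
Coalescing can be illustrated with a short sketch that merges logically contiguous requests into larger ones, capped at a maximum request size. This is not the Solaris implementation; `max_bytes` merely plays a role analogous to a maximum physical I/O size limit, and the (offset, length) request format is an assumption for the example.

```python
def coalesce(requests, max_bytes):
    """Merge contiguous (offset, length) requests, capped at max_bytes."""
    merged = []
    for offset, length in sorted(requests):
        if merged:
            last_offset, last_length = merged[-1]
            contiguous = last_offset + last_length == offset
            if contiguous and last_length + length <= max_bytes:
                merged[-1] = (last_offset, last_length + length)
                continue
        merged.append((offset, length))
    return merged


KB = 1024
writes = [(0, 8 * KB), (8 * KB, 8 * KB), (16 * KB, 8 * KB),
          (24 * KB, 8 * KB), (64 * KB, 8 * KB)]
print(coalesce(writes, max_bytes=128 * KB))  # [(0, 32768), (65536, 8192)]
print(coalesce(writes, max_bytes=16 * KB))
```

With a generous limit, five 8 KB requests become two; with a 16 KB limit the same workload still needs three requests, which is why the maximum I/O size setting affects achievable bandwidth.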



File System Fragmentation & Alignment
Fragmentation
Blocks in a file system cease to be contiguous over time
Causes performance bottleneck if left fragmented

Alignment
Example: a 64 KB write that is not aligned to the stripe element boundary
is split into 48 KB on Disk 1 and 16 KB on Disk 2
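
The 48 KB and 16 KB split can be reproduced by walking a request across stripe element boundaries. The sketch assumes a 64 KB stripe element and a starting offset of 16 KB; the slide does not give the offset, so that value is chosen only because it yields the same split.

```python
def split_by_stripe_element(offset_kb, length_kb, element_kb=64):
    """Return the (element_index, length_kb) pieces one I/O is split into."""
    pieces = []
    while length_kb > 0:
        index, within = divmod(offset_kb, element_kb)
        chunk = min(length_kb, element_kb - within)  # stop at the boundary
        pieces.append((index, chunk))
        offset_kb += chunk
        length_kb -= chunk
    return pieces


print(split_by_stripe_element(16, 64))  # [(0, 48), (1, 16)] misaligned
print(split_by_stripe_element(0, 64))   # [(0, 64)] aligned
```

The misaligned write touches two disks and costs two back-end I/Os, while the aligned write of the same size costs one.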



Raw Partitions vs File systems
Raw Partitions
Can use larger I/O than most file systems
Can use smaller I/O than most file systems
Avoid fragmentation issues
Are easier to create point in time copies and remain in a coherent state,
no buffers to flush
Simplify aligning I/O to the RAID stripe

File systems
More tools are available for managing (Moving, backing up, restoring) and
analyzing data
Back-up images are easier to mount (with raw partitions, the partition must
have the same device context to mount a partition on the back-up machine)
Massive buffering can be advantageous in some cases



Volume Managers
Partitioning of LUNs into smaller logical devices
Aggregation of several LUNs into a larger logical device
Striping of data across several LUNs in order to distribute I/O across
many disks

[Figure: performance stack components: Application, File System, Volume Manager, Virtual Memory, HBA/Driver, Storage Controller, Cache, Back-end]
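
Striping by a volume manager comes down to an address mapping from a logical offset to a LUN and an offset within that LUN. The sketch below shows the usual round-robin mapping; the stripe unit size and LUN count are hypothetical parameters, and real volume managers add metadata and alignment details that are omitted here.

```python
def locate(logical_kb, stripe_unit_kb, lun_count):
    """Map a logical offset (KB) to (lun_index, offset_in_lun_kb)."""
    unit_index, within = divmod(logical_kb, stripe_unit_kb)
    row, lun = divmod(unit_index, lun_count)  # round-robin across LUNs
    return lun, row * stripe_unit_kb + within


for offset_kb in (0, 64, 128, 192, 256, 300):
    print(offset_kb, locate(offset_kb, stripe_unit_kb=64, lun_count=4))
```

Consecutive stripe units land on consecutive LUNs, so a large sequential transfer or many concurrent random requests are spread across all of the underlying disks.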



Host Virtual Memory
Virtual Memory System (VM)
Tool used by OS to provide contiguous logical memory spaces for all
applications
Managed from a pool of RAM and mirrored (in full) to a reserved space on
disk
Also known as page or swap file
Has profound effect on application performance

Bandwidth (MB/s)   Throughput (IOPS)   Storage system write cache   Host virtual memory settings
13.5               118                 Enabled                      Standard
15.2               122                 Disabled                     Tuned
33.5               195                 Enabled                      Tuned

[Figure: performance stack components: Application, File System, Volume Manager, Virtual Memory, HBA/Driver, Storage Controller, Cache, Back-end]
Host HBA Effects
Performance & Redundancy
A single 4 Gb/s FC HBA can sustain up to 380 MB/s throughput and 40,000
IOPS
A lot of CPU power and bus bandwidth is required to exceed the capability
of an HBA
Number of HBAs to match the Storage system bandwidth requirements
HBA settings for Queue depth impact performance
HBA Firmware version, HBA driver version used and the OS impact
performance

[Figure: host HBAs 0 and 1 connected to ports 0 and 1 on SP A and SP B, which serve LUN 0 and LUN 1; shown alongside the performance stack components: Application, File System, Volume Manager, Virtual Memory, HBA/Driver, Storage Controller, Cache, Back-end]
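
Matching the number of HBAs to the workload is a ceiling calculation on both bandwidth and IOPS. The sketch below uses the per-HBA figures quoted on this slide as defaults; the function name and the example requirements are hypothetical, and redundancy is treated separately.

```python
import math


def hbas_needed(required_mb_s, required_iops, hba_mb_s=380, hba_iops=40_000):
    """Minimum HBAs to carry the load, whichever limit binds first."""
    by_bandwidth = math.ceil(required_mb_s / hba_mb_s)
    by_iops = math.ceil(required_iops / hba_iops)
    return max(by_bandwidth, by_iops, 1)


print(hbas_needed(1000, 50_000))  # 3, bandwidth-bound
print(hbas_needed(100, 100_000))  # 3, IOPS-bound
```

For failover, at least two HBAs are normally installed even when one would carry the load, and sizing usually assumes the surviving HBAs must handle the full workload after a path failure.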



Failover and Multipathing
Multipathing (host-based software: PowerPath)
Advantages
Failover
Load balancing
Higher bandwidth

Limitations
Use of host CPU resources
Every active and passive path requires an initiator.
Active paths increase the time to failover in some cases.



I/O Measurement: Host
Components that contribute to the overall performance of a host for a
particular application/program:
User state CPU Time
System state CPU time
I/O time
Network Time
Time spent running other programs
Virtual memory performance
Why do we Measure?
Verify that other resources such as memory, CPU or network activity are not
the real cause of the I/O performance degradation
Tracking service levels such as transactions per second, disk busy and disk
response times (not provided by most UNIX platforms) helps in locating I/O
performance problems
Operating system parameters (read and write ahead activity, page stealing
policy, I/O pacing) and logical volume implementation choices (striping,
partition sizing, mirroring and data positioning) have major effects on I/O
performance
Understanding how the application is accessing data (sequential or random
access, reads vs writes) and the block size is essential to select the
correct RAID level



System Activity Tools in UNIX



IOSTAT Report



sar Reports
sar Utility:
Reads and interprets data about system activity
sadc command collects the system statistics periodically and stores
them in a persistent file.



Buffer Cache Usage Statistics



Block Device Statistics



Average Queue Length Statistics



CPU Utilization Statistics



Swapping and Paging Statistics



Windows System Monitor: An Overview



EMC Tools
ECC Performance Manager
Navisphere Analyzer
Syminq
EMC grab



Check Your Knowledge
What are key design considerations for File system
performance?
List key differences between raw partitions and file
systems
List the main advantages of the buffer cache
How is the sar utility set up, and what are the key criteria for
setting up data gathering with it?
What are the algorithms used for path load balancing in
PowerPath?
Discuss the impact of the maxcontig parameter



Module Summary
Key points covered in this module:
Design considerations for Host layer of the performance
stack
File systems
Volume managers
HBA
Various tools available to collect data from hosts on key
performance metrics



Module 3.3

Storage Layer



Storage Layer
Upon completion of this module, you will be able to:
Define the design considerations of the Storage layer of
the Performance Stack
Storage processor, cache and back-end
Explain the impact of RAID configuration and cache
settings in the overall design for performance



Performance Stack
[Figure: the performance stack. Applications and databases (Applications & Databases layer) run on hosts with their operating systems and volumes (Host & HBA layer), which connect through the storage/IP network (Storage Networks layer) to the arrays (Storage layer).]



Storage Layer Array Features: An Example



Parity RAID Behavior and RAID Groups
CLARiiON RAID groups use a 64 KB stripe element size
This default is the most efficient for performance
Disk groups can be 3 to 16 drives
RAID optimizations work for ALL group sizes
However, power-of-2 sizes work best with most host systems
I/O is typically 64 KB, 128 KB, 256 KB, 512 KB, etc.
Odd stripe sizes (e.g. 6+1, 384 KB) can be hard to fit to host I/O when cache is bypassed
Main difference to clients is cost/GB
8+1 uses only 11% of capacity for parity, but stripe size is large (512 KB)
4+1 uses 20% of capacity for parity, stripe size = 256 KB
Larger groups have a longer rebuild time

RAID type & size      Stripe size with 64 KB element
5-disk RAID 5         256 KB
6-disk RAID 5         320 KB
9-disk RAID 5         512 KB
8-disk RAID 1/0       256 KB
16-disk RAID 1/0      512 KB
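
The stripe sizes and parity percentages on this slide follow directly from the 64 KB element size. A minimal Python sketch, assuming that stripe size counts data drives only, that RAID 5 gives up one drive's worth of capacity to parity, and that the 16-disk mirrored group is RAID 1/0 (the function names are illustrative and not part of any EMC tool):

```python
ELEMENT_KB = 64  # stripe element size stated on the slide


def data_drives(raid_type: str, drives: int) -> int:
    """Number of drives that hold user data in one stripe."""
    if raid_type == "RAID 5":
        return drives - 1      # one drive's worth of parity per stripe
    if raid_type == "RAID 1/0":
        return drives // 2     # the other half are mirror copies
    raise ValueError(f"unsupported RAID type: {raid_type}")


def stripe_size_kb(raid_type: str, drives: int, element_kb: int = ELEMENT_KB) -> int:
    """Stripe size = data drives x element size."""
    return data_drives(raid_type, drives) * element_kb


def parity_overhead_pct(drives: int) -> float:
    """Share of raw RAID 5 capacity consumed by parity."""
    return 100.0 / drives


for raid_type, drives in [("RAID 5", 5), ("RAID 5", 6), ("RAID 5", 9),
                          ("RAID 1/0", 8), ("RAID 1/0", 16)]:
    print(f"{drives}-disk {raid_type}: {stripe_size_kb(raid_type, drives)} KB stripe")

print(f"4+1 parity overhead: {parity_overhead_pct(5):.0f}%")
print(f"8+1 parity overhead: {parity_overhead_pct(9):.0f}%")
```

The same arithmetic shows why 6+1 is awkward: six data drives give a 384 KB stripe, which is not a power of two.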



The Storage Back-end
Configuration of the LUNs
Distribute I/O evenly across all spindles
Stripe & stripe element size to be configured for optimal performance
LUN Distribution
Highest use LUNs should be distributed evenly throughout the RAID group
Keeping the drives busy
Minimize disk contention
Disk Type Considerations
RPM, FC, ATA

[Figure: a RAID group of physical disks carrying LUN 1, LUN 2 and LUN 3]

RAID Group 0 (R1)       RAID Group 1 (R10)      RAID Group 2 (R5)
LUN 0 (SMTP)            LUN 11 (Balances)       LUN 12 (Email)
LUN 10 (Application)
LUN 20 (Data)           LUN 21 (Addresses)      LUN 22 (Source Code)
LUN 50 (Archive 1)      LUN 51 (Logs)           LUN 52 (Archive 2)

LUN Service Levels: High, Medium, Low

RAID Impacts on Performance

Small (less than element size) write on RAID 3 & 5


P = D1 + D2 + D3 + D4
If parity is valid, then: Pnew = Pold - Dold + Dnew
2 disk reads and 2 disk writes
Parity vs. Mirroring
Reading, calculating and writing the parity segment introduces a penalty on every write operation
The parity RAID penalty manifests as slower cache flushes
Increased write load can cause contention and slower read response times
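
In parity RAID the "+" and "-" in the parity equations are both a bytewise XOR, which is why the old parity and the old data are enough to compute the new parity. A minimal Python sketch of the small-write case (the 4-byte elements and their values are made up for illustration):

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data elements and their parity: P = D1 + D2 + D3 + D4 (XOR).
d1 = b"\x01\x02\x03\x04"
d2 = b"\x10\x20\x30\x40"
d3 = b"\xaa\xbb\xcc\xdd"
d4 = b"\x0f\x0e\x0d\x0c"
parity = xor_blocks(d1, d2, d3, d4)

# Small write to D2: read old data and old parity (2 disk reads),
# compute Pnew = Pold - Dold + Dnew, then write new data and
# new parity (2 disk writes). D1, D3 and D4 are never touched.
d2_new = b"\xff\x00\xff\x00"
parity_new = xor_blocks(parity, d2, d2_new)

# The shortcut gives the same parity as recomputing over the whole stripe.
assert parity_new == xor_blocks(d1, d2_new, d3, d4)
print(parity_new.hex())
```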



Parity Stripe Optimization
If all data for calculating parity is already at the processor:
Stripe can be written out to disk without any reads
P = D1 + D2 + D3 + D4

If write cache is on, parity stripe optimization can happen for certain
combinations of element size, RAID group size and cache page size
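
Whether a write qualifies for this optimization depends on how it lines up with the stripe. A simplified Python sketch, assuming a RAID 5 group and covering only the two cases named on the slides (a full-stripe write and a small write inside one element); the function names and the model itself are illustrative, not the array's actual algorithm:

```python
def is_full_stripe_write(offset_kb: int, size_kb: int,
                         data_drives: int, element_kb: int = 64) -> bool:
    """True when the write starts on a stripe boundary and covers whole stripes."""
    stripe_kb = data_drives * element_kb
    return size_kb > 0 and offset_kb % stripe_kb == 0 and size_kb % stripe_kb == 0


def raid5_disk_ops(offset_kb: int, size_kb: int,
                   data_drives: int, element_kb: int = 64):
    """Return (disk reads, disk writes) for the two cases from the slides.

    Other partial-stripe cases are outside this simplified model and
    return None.
    """
    stripe_kb = data_drives * element_kb
    if is_full_stripe_write(offset_kb, size_kb, data_drives, element_kb):
        stripes = size_kb // stripe_kb
        # Parity is computed from data already in memory: no reads needed.
        return 0, stripes * (data_drives + 1)
    first_element = offset_kb // element_kb
    last_element = (offset_kb + size_kb - 1) // element_kb
    if size_kb > 0 and first_element == last_element:
        # Small write: read old data + old parity, write new data + new parity.
        return 2, 2
    return None


print(raid5_disk_ops(0, 256, data_drives=4))   # full stripe on a 4+1 group
print(raid5_disk_ops(8, 4, data_drives=4))     # 4 KB write inside one element
```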



RAID Levels and Performance - 1
RAID 1/0 has highest penalty for sequential writes
Each write must be done to two disks

RAID 3 has optimizations for large sequential writes
Large back-end writes reduce the total number of writes
But the dedicated parity disk can become a bottleneck for random writes

RAID 5 can deliver the highest read bandwidth from a small number
of disks
All disks of the RAID group participate in reads
For RAID 3 and RAID 5, larger RAID groups deliver higher write
bandwidth
Less parity overhead per unit of host data

ATA drives offer a cost-effective alternative to Fibre Channel devices
Host Based Striping (plaids) & Port Throughput
Plaids
Volume is striped across LUNs
LUNs are striped across disks

Three plaid techniques
A. Plaid with dedicated RAID group
B. Multisystem plaid
C. Cross-system plaid

[Figure: host volumes striped across LUNs owned by SP A and SP B on RAID Groups 10 and 11, one diagram per plaid technique]



Meta LUNS
LUN objects created from multiple smaller LUN objects
Can be created from striped or concatenated LUNs
Reduces the number of LUN objects managed by the
host and provides dynamic expansion of host capacity
Similar to plaids in that storage is distributed across a very large
number of disk drives; however, there are key functionality
and performance differences (see notes)



Cache Structure
Region of the main memory that is separate from all other
processes
Partitioned into two regions
Read & write

Organized into pages


The Vault
Disk area where cache is dumped if a failure threatens the system



Read Cache Benefits
Pre-fetch when Storage processor detects sequential
access to a LUN
SP launches pre-fetch I/Os to bring data into read cache
Fixed
Independent of host I/O size
Variable
Amount of pre-fetch = pre-fetch multiplier * I/O size
Size of pre-fetch I/O = segment multiplier * I/O size
Single threaded sequential I/O can result in multi-threaded I/O at
disks
Compensates for lack of host I/O concurrency

Cache hits caused by requests for data read recently
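
The two variable pre-fetch formulas can be worked through in a few lines. A minimal Python sketch; the multiplier values in the example are hypothetical, and the count of back-end I/Os simply divides the pre-fetch amount by the pre-fetch I/O size:

```python
def variable_prefetch(io_size_kb: int, prefetch_multiplier: int,
                      segment_multiplier: int):
    """Apply the two variable pre-fetch formulas from the slide.

    Returns (amount pre-fetched in KB, size of each pre-fetch I/O in KB,
    number of pre-fetch I/Os the SP would launch).
    """
    amount_kb = prefetch_multiplier * io_size_kb
    segment_kb = segment_multiplier * io_size_kb
    prefetch_ios = -(-amount_kb // segment_kb)   # ceiling division
    return amount_kb, segment_kb, prefetch_ios


# Hypothetical settings: 8 KB host reads, pre-fetch multiplier 8,
# segment multiplier 2.
amount, segment, ios = variable_prefetch(8, 8, 2)
print(f"pre-fetch {amount} KB as {ios} I/Os of {segment} KB each")
```

With these values one single-threaded 8 KB sequential read stream turns into four 16 KB back-end reads, which is the concurrency effect the slide describes.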



Effects of Write Cache
Lower response time
Delivers high bandwidth to low concurrency applications
Align sequential I/O
Coalesce small sequential I/O
But mirroring the write cache has a throughput cost
For some workloads, disabling write cache can deliver higher bandwidth
I/O size must be a multiple of stripe size
Stripe-aligned I/Os
Must have high concurrency
System can tolerate higher response time from storage
Cache Settings
Typical Configurable parameters
Cache page size (global)
Allocation of cache: read vs. write (global)
Pre-fetch: type, amount pre-fetched, when to disable (per LUN)
Write aside size (per LUN)

High and low watermarks & flushing
Flushing: the process of writing dirty pages from cache to disk
Two global settings (called watermarks) manage flushing
High & Low
Flushing can be categorized in three ways:
Idle flush
High water flush
Forced flush
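
The slide names the three flush types but not their triggers. The Python sketch below assumes the commonly described behaviour: watermark flushing starts when dirty pages reach the high watermark and continues until they fall to the low watermark, a forced flush happens when the write cache is completely dirty, and idle flushing trickles data out while the LUN is idle. The 80/60 watermark values and the function name are illustrative assumptions:

```python
def flush_mode(dirty_pct: float, high_wm: float, low_wm: float,
               watermark_flush_active: bool, lun_idle: bool):
    """Classify cache flushing for the current state (simplified model).

    Returns (mode, watermark_flush_active) so the caller can carry the
    watermark state into the next evaluation.
    """
    if dirty_pct >= 100:
        return "forced", True          # cache full: host writes must wait
    if dirty_pct >= high_wm:
        return "high water", True      # start (or continue) watermark flushing
    if watermark_flush_active and dirty_pct > low_wm:
        return "high water", True      # keep flushing down to the low watermark
    if lun_idle and dirty_pct > 0:
        return "idle", False           # background flush while the LUN is idle
    return "none", False


# Illustrative watermarks: high 80%, low 60%.
active = False
for dirty in (50, 85, 70, 55, 100):
    mode, active = flush_mode(dirty, 80, 60, active, lun_idle=False)
    print(f"{dirty:>3}% dirty -> {mode}")
```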
Check Your Knowledge
How many disk operations are required when a RAID stripe is
written with a recalculation of parity and why?
Explain High & Low watermark settings for the cache flush
What are the costs associated with mirroring write cache?
How does a meta LUN differ from a volume manager plaid?
What are two types of settings available for pre-fetching to read
cache?
What is the application for which RAID-3 is a best fit and why?
What are key benefits of read & write cache?



Module Summary
Key points covered in this module:
Design considerations for the Storage layer of the
Performance Stack
Impact of RAID configuration on performance
Meta LUN configuration and its benefits
Optimal performance settings for Cache



Module 3.4: Case Study

Storage Design for Exchange



Case Study: Storage Design for Exchange
Upon completion of this module, you will be able to:
Gather user requirements for deploying the email
infrastructure
Explain Storage architecture elements in the exchange
environment
Explain all design considerations for each element in the
exchange environment
Design an optimal Storage architecture for Exchange



Exchange Overview
Messaging server platform
Transfers email messages to intended recipients in a reliable way,
whether the recipients reside on the local server, another server in
the same Exchange Server organization, or another server in an
external messaging environment that is connected to the
organization
Stores the email messages in a server-based store
Supports various email clients used to access or download
messages
Gives users information about recipients in the organization through
an address book or global address list



Application Systems: Email - Exchange



Exchange Storage Architecture
Exchange server stores data in two files
.edb
.stm

Storage Groups (ESG)
Set of:
Log files
.edb & .stm files
Max 5 databases in a storage group
Max 4 groups per server in Exchange 2003



Storage Design for Exchange
Information to Gather
User Community Information
Back-up / Recovery Requirements
Organizational Constraints

Data collected should be supported by empirical measurements
from the current environment
Concurrent users
IOPS
Read/write ratio
Log files/day



User Community Information - 1
How many total Users?
Today and anticipated growth in the next few years

How many concurrent users during the peak period?
How many mailboxes are not associated with an individual
user (e.g. a help desk mailbox)?
What mail client is used?
Outlook / Outlook Web Access / mobile devices (BlackBerry)
When are the peak activity periods?
What is the typical working day?
Is there a geographic dispersal of users across time
zones?
User Community Information - 2
What is the Exchange activity level of the users?
Categorization of user types leads to estimated base IOPS
Measured I/O in the existing environment gives the best starting point
Are there special category users with different security,
performance, back-up/recovery requirements?
What are mailbox size limits?
Is there anything pertinent that helps to describe the user profile for
the organization?
Heavy use of personal folders
Do users often send large documents?
Considerable use of Outlook shared folders
What are the characteristics of public folder usage?
Size of the public store
Replication activity among public stores
Back-up / Recovery Requirements
What is the deleted item retention period?
What is the chosen back-up method?
Back-up to disk using standard Exchange online back-up
Back-up to tape using standard Exchange online back-up
Clone or snapshot based replication

What is the timing of the back-up activity?
What are the SLA requirements for recovery?
Is distance replication / DR planned?
DR site distance
Network connection



Other Considerations and Constraints
What are the type, number and location of Exchange servers?
Are Exchange servers clustered?
What is the planned Exchange front-end/back-end server layout?
What is the SAN/network structure?
Is there an existing Storage system?
What other SW will be operating in the Exchange environment?
Antivirus
Email archiving solution
Workflow applications integrated with Exchange
Exchange-integrated third-party tools (mailbox recovery, enhanced indexing)



Storage Planning for Exchange
ESG design considerations
I/O considerations
Base I/O per user
Total IOPS requirement for the environment
Read/write ratio

LUN sizing
RAID considerations
Other Considerations
Back-up/recovery
DR
Operation Schedules
ESG Considerations - 1



ESG Considerations - 2



ESG Considerations - 3



I/O Considerations - 1
Understanding the usage profile: Base I/O per user

Other factors that impact IOPS per user
Very active Exchange servers (>2000 users)
Large mailboxes (>200 MB)
Regularly sending very large documents (>5 MB)
BlackBerry clients
Typical read:write ratio in Exchange is from 2:1 to 3:1
I/O Considerations - 2
RAID 1/0 (Striping + Mirroring)
Offers best performance
Introduces write penalty of 2
Lower usable capacity per RAID group

RAID 5
Higher usable capacity per RAID group
Effective for environment with large mailbox sizes and low IOPS
4 physical I/O operations for each write requested

RAID-ADJUSTED BACK-END IOPS =
(BASE IOPS x READ %) + (BASE IOPS x WRITE % x RAID PENALTY)
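
The back-end formula translates directly into code. A minimal Python sketch using the write penalties given on this slide (2 for RAID 1/0, 4 for RAID 5); the 1000 IOPS workload with a 3:1 read:write ratio is a made-up input:

```python
RAID_WRITE_PENALTY = {"RAID 1/0": 2, "RAID 5": 4}   # values from the slide


def raid_adjusted_iops(base_iops: float, read_fraction: float,
                       raid_type: str) -> float:
    """(base x read %) + (base x write % x RAID penalty)."""
    write_fraction = 1.0 - read_fraction
    penalty = RAID_WRITE_PENALTY[raid_type]
    return base_iops * read_fraction + base_iops * write_fraction * penalty


# Hypothetical workload: 1000 host IOPS, 3:1 read:write (75% reads).
for raid_type in RAID_WRITE_PENALTY:
    print(raid_type, raid_adjusted_iops(1000, 0.75, raid_type))
```

For this input the same host load costs 1250 back-end IOPS on RAID 1/0 and 1750 on RAID 5.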



I/O Considerations - 3
Other factors that may impact I/O:
High load on the server
Caching increases
Going from 2500 to 4000 users may increase I/O by 10%
Server based anti-virus protection
Extra reads; CPU utilization increases by 20%
Integrated features and applications
Workflow applications, content indexing
Synchronous / asynchronous mirroring
Time of day



IOPS Calculation Example
3000 Exchange users
Separated evenly into 4 ESGs
1000 heavy users at a peak of 1 IOPS each
2000 typical users at a peak of 0.5 IOPS each
Read/write ratio: 2:1
Maximum concurrency of active users: 90%
Estimated overhead for other workload: 20%
Calculate IOPS requirement at peak activity for
RAID 1/0
RAID 5
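
One way to work this example is sketched below in Python. The slide does not say how concurrency and overhead combine, so the sketch assumes both are applied multiplicatively to the base IOPS before the RAID adjustment; write penalties of 2 (RAID 1/0) and 4 (RAID 5) are taken from the I/O considerations slide. Exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

heavy_users, heavy_iops = 1000, Fraction(1)
typical_users, typical_iops = 2000, Fraction(1, 2)
concurrency = Fraction(90, 100)
overhead = Fraction(20, 100)
read_fraction = Fraction(2, 3)          # 2:1 read/write ratio
esg_count = 4
write_penalty = {"RAID 1/0": 2, "RAID 5": 4}

base_iops = heavy_users * heavy_iops + typical_users * typical_iops
# Assumption: concurrency and overhead both scale the base figure.
peak_iops = base_iops * concurrency * (1 + overhead)
reads = peak_iops * read_fraction
writes = peak_iops - reads

backend = {raid: reads + writes * penalty
           for raid, penalty in write_penalty.items()}

print(f"base host IOPS: {int(base_iops)}")
print(f"peak host IOPS: {int(peak_iops)}")
for raid, total in backend.items():
    print(f"{raid}: {int(total)} back-end IOPS, {int(total / esg_count)} per ESG")
```

Under these assumptions the peak host load is 2160 IOPS, which becomes 2880 back-end IOPS on RAID 1/0 and 4320 on RAID 5.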



Calculating Capacity Requirements - 1
Space requirements for databases
I/O requirements determine the number of drives required
Growth, retention of deleted folders and defragmentation
Space Calculation

Space required for the ESG database LUNs =
Maximum mailbox size * Number of mailboxes
+ Extra space for deleted item retention
+ Public folder space (if part of ESG)
+ 10-20% free space for growth protection
+ Space for off-line defragmentation if required
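
The space calculation can be sketched as a small Python function. The slide does not say what the 10-20% free space is a percentage of; the sketch assumes it applies to the sum of the mailbox, deleted-item and public folder space. All input values in the example are hypothetical.

```python
def esg_database_space_mb(max_mailbox_mb: int, mailboxes: int,
                          deleted_item_mb: int = 0,
                          public_folder_mb: int = 0,
                          free_space_pct: int = 20,
                          offline_defrag_mb: int = 0) -> float:
    """Space for the ESG database LUNs, following the slide's formula."""
    mailbox_mb = max_mailbox_mb * mailboxes
    subtotal = mailbox_mb + deleted_item_mb + public_folder_mb
    # Assumption: growth headroom is a percentage of the subtotal above.
    free_mb = subtotal * free_space_pct / 100
    return subtotal + free_mb + offline_defrag_mb


# Hypothetical ESG: 1000 mailboxes of 100 MB, 10,000 MB for deleted item
# retention, no public folders, 20% headroom, no off-line defrag space.
total_mb = esg_database_space_mb(100, 1000, deleted_item_mb=10_000)
print(f"{total_mb:,.0f} MB")
```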
Calculating Capacity Requirements - 2
Disk type
IOPS measures to be considered when choosing disk type
Vendor benchmarks for Exchange-style 4 KB random writes are used
10K rpm drives: 130 IOPS
15K rpm drives: 180 IOPS
Number of drives for Exchange database LUNs = RAID-adjusted IOPS / IOPS per disk
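
A short Python sketch of the drive-count formula, using the per-drive IOPS figures from this slide. Rounding up to a whole drive is an assumption (a fraction of a drive cannot be bought), and the back-end IOPS inputs are hypothetical:

```python
import math

DRIVE_IOPS = {"10K": 130, "15K": 180}   # per-drive figures from the slide


def drives_needed(raid_adjusted_iops: int, drive_type: str) -> int:
    """RAID-adjusted IOPS / IOPS per disk, rounded up to a whole drive."""
    return math.ceil(raid_adjusted_iops / DRIVE_IOPS[drive_type])


# Hypothetical back-end loads for one environment.
for raid_type, iops in [("RAID 1/0", 2880), ("RAID 5", 4320)]:
    for drive_type in DRIVE_IOPS:
        print(f"{raid_type}, {drive_type} rpm: {drives_needed(iops, drive_type)} drives")
```

In practice the result would also be rounded to a valid RAID group size (for example an even number of drives for RAID 1/0); that step is left out here.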

Dedicate one or more RAID groups to each ESG

RAID 1/0 Advantages
Flushes the cache of Exchange random write loads faster than RAID 5
Rebuild time and rebuild impact are reduced with RAID 1/0 in the event
of disk failures

RAID 1/0 Trade-off (?)



Calculating Capacity Requirements - 3
User IOPS / Drive
USER IOPS PER DRIVE = IOPS PER DRIVE x HOST-BASED IOPS / BACK-END IOPS

Users / Drive
USERS PER DRIVE = USER IOPS PER DRIVE / IOPS PER USER

Capacity check of a drive
Smaller drives offer more performance per GB
Usable capacity per drive is less than the marketing capacity
36 GB: 33 GB usable; 73 GB: 66 GB usable; 146 GB: 134 GB usable
Usable space in a RAID group
RAID 1/0 = USABLE DRIVE SPACE x (# DRIVES / 2)
RAID 5 = USABLE DRIVE SPACE x (# DRIVES - 1)
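
These per-drive checks can be combined in a short Python sketch. The usable capacities are the ones listed on the slide; the host and back-end IOPS figures and the 0.5 IOPS per user are hypothetical inputs, and the function names are illustrative:

```python
USABLE_GB = {36: 33, 73: 66, 146: 134}   # marketed GB -> usable GB (from the slide)


def user_iops_per_drive(iops_per_drive: float, host_iops: float,
                        backend_iops: float) -> float:
    """Host-visible IOPS one drive can serve after the RAID penalty."""
    return iops_per_drive * host_iops / backend_iops


def users_per_drive(user_iops_drive: float, iops_per_user: float) -> float:
    return user_iops_drive / iops_per_user


def usable_group_gb(raid_type: str, drives: int, marketed_gb: int) -> int:
    """Usable space in a RAID group."""
    usable = USABLE_GB[marketed_gb]
    if raid_type == "RAID 1/0":
        return usable * (drives // 2)
    if raid_type == "RAID 5":
        return usable * (drives - 1)
    raise ValueError(f"unsupported RAID type: {raid_type}")


# Hypothetical: 15K rpm drives (180 IOPS), 2160 host IOPS -> 2880 back-end IOPS.
per_drive = user_iops_per_drive(180, 2160, 2880)
print(per_drive, users_per_drive(per_drive, 0.5))
print(usable_group_gb("RAID 1/0", 6, 73), usable_group_gb("RAID 5", 6, 73))
```

Dividing the usable group space by the users the group can serve gives the largest mailbox size the group supports without running out of capacity before it runs out of IOPS.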



Calculating Capacity Requirements - 4
6 drives configured as RAID 1/0 and RAID 5 for different
mailbox sizes (73 GB drives)



Meta LUNS and ESG configurations
Implementation in a CLARiiON storage system
Combines multiple LUNs into a larger meta LUN that spans multiple
RAID groups
Load balancing & expandability

[Figure: ESG 1 and ESG 2 meta LUNs, each spanning RAID 1/0 Group 1 (3+3) and RAID 1/0 Group 2 (3+3)]



Interleaving Meta LUNS

[Figure: ESG 1 and ESG 2 meta LUNs interleaved across RAID 1/0 Group 1 (3+3) and RAID 1/0 Group 2 (3+3)]

Meta LUN: adds a level of complexity to the design
Better performance and easier management if managed with a limited
number of proven RAID group types and sizes
Log LUN Configuration
Log performance is response-time sensitive
Mostly sequential; 100% writes, typically 4 KB to 12 KB
RAID 1/0 is the better option
One set of logs for each ESG
Should be on a separate LUN from its associated database
Host I/O to logs is approximately 10-15% of host I/O to the
databases
Size of each log file is 5 MB
Calculate capacity requirements based on the number of log files to be
maintained (recovery objectives)
The Exchange back-up process deletes log files whose transactions have been
committed to the database
Circular logging: log files are deleted immediately after commit, so the
recovery point objective is compromised
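
Log LUN capacity follows from the 5 MB log file size and the number of log files that must be kept. A minimal Python sketch, assuming the number to keep is the daily log volume times the number of days of logs retained; both input values are hypothetical:

```python
LOG_FILE_MB = 5   # size of each Exchange log file, from the slide


def log_lun_capacity_mb(log_files_per_day: int, days_retained: int) -> int:
    """Space needed to hold all log files kept between truncations."""
    return log_files_per_day * days_retained * LOG_FILE_MB


# Hypothetical ESG: 2000 log files per day, 3 days of logs retained.
capacity_mb = log_lun_capacity_mb(2000, 3)
print(f"{capacity_mb} MB (about {capacity_mb / 1024:.1f} GB)")
```

The "Log files/day" measurement gathered from the current environment supplies the first input.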



Exchange Case Study: Putting It All Together - 1
12,000 Exchange users
8,000 typical
4,000 heavy

Based on the current environment, typical and heavy users
generated 0.7 IOPS per user
Clustered Exchange servers in a single site in the Eastern
time zone
8,000 of the 12,000 users in the Eastern time zone
2,000 in the Pacific time zone (-3 hours)
2,000 in western Europe (+5/6 hours)

Each active node will support 4,000 users evenly spread
(also proportional typical and heavy) across 4 ESGs
Exchange Case Study: Putting It All Together - 2
Measured read/write ratio is close to 2:1
Back-ups will run twice a day, including once around mid-
day. They will also use local replication, which generates
additional IOPS of 25%
User concurrency during peak period is 90%
Typical users are allotted maximum mailbox size of
75MB; heavy users allotted 150MB
The majority of email clients are Outlook 2003 in cached mode,
with a significant number of users accessing via Outlook
Web Access (OWA), especially during off-hours



Exchange Case Study: Putting It All Together - 3
Provide the IOPS calculation for RAID 1/0 and RAID 5
configurations
Compute the number of drives required with 10K and 15K
RPM drives
Validate capacity will match the mailbox size
requirements
List other design considerations
Log LUN, Time zone, Database layout

Provide a high-level design
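
One possible starting point for the IOPS and drive-count parts of the exercise is sketched below in Python. It is not a model answer: it assumes that all 12,000 users peak together (ignoring the time-zone spread), that the 25% replication overhead applies at peak, and that the write penalties (2 and 4) and per-drive IOPS (130 and 180) from the earlier slides hold.

```python
import math
from fractions import Fraction

users = 12_000
iops_per_user = Fraction(7, 10)
concurrency = Fraction(90, 100)
replication_overhead = Fraction(25, 100)
read_fraction = Fraction(2, 3)                  # measured ratio close to 2:1
write_penalty = {"RAID 1/0": 2, "RAID 5": 4}
drive_iops = {"10K": 130, "15K": 180}

# Assumption: concurrency and replication overhead both scale the base load.
peak_iops = users * iops_per_user * concurrency * (1 + replication_overhead)
reads = peak_iops * read_fraction
writes = peak_iops - reads

backend = {raid: reads + writes * p for raid, p in write_penalty.items()}
drives = {(raid, rpm): math.ceil(total / per_drive)
          for raid, total in backend.items()
          for rpm, per_drive in drive_iops.items()}

# Raw mailbox space before retention, growth and defragmentation allowances.
mailbox_mb = 8_000 * 75 + 4_000 * 150

print(f"peak host IOPS: {int(peak_iops)}")
for (raid, rpm), count in drives.items():
    print(f"{raid} on {rpm} rpm drives: {int(backend[raid])} IOPS -> {count} drives")
print(f"mailbox space: {mailbox_mb:,} MB")
```

The time-zone spread and the mid-day back-up window are the obvious places to refine these assumptions, since not all users are active during the Eastern peak.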



Module Summary
Key points covered in this module:
Storage design requirements for the Exchange email
application
How to calculate the disk requirements based on the
workload
Application of design principles for a large Exchange
environment



Section Summary
Key points covered in this section:
Design considerations for the Performance Stack Application layer
Logical and physical structure of databases, the typical I/O
operations and workload profiles in a database environment
Optimal data layout for database files and segregation of workload
based on I/O profiles and user requirements
Design considerations for the Performance Stack Host layer
Various tools available to collect data from the hosts on key
performance metrics
Design considerations for the Performance Stack Storage layer
Storage design requirements for the Exchange email application

