
EMC VPLEX Architecture and Deployment: Enabling the Journey to the Private Cloud

Version 1.0

EMC VPLEX Family Architecture


System Integrity
VPLEX Local and VPLEX Metro: Local and Distributed Storage Federation Platforms
Nondisruptive Workload Relocation

Michael Cram
Bala Ganeshan
Bradford Glade
Varina Hammond
Mary Peraro
Suzanne Quest
Jim Wentworth


Copyright 2010 EMC Corporation. All rights reserved.


EMC believes the information in this publication is accurate as of its publication date. The information is
subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION MAKES
NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION
IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable
software license.
For the most up-to-date regulatory document for your product line, go to the Technical Documentation and
Advisories section on EMC Powerlink.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.

Part Number h7113.1


Contents

Preface

Chapter 1  VPLEX Family Overview
1.1 Introduction
1.2 VPLEX family architecture overview
1.3 VPLEX family overview
1.3.1 VPLEX Local
1.3.2 VPLEX Metro
1.3.3 VPLEX Geo and VPLEX Global
1.4 VPLEX family clustering architecture
1.4.1 Provisioning storage
1.4.2 Use cases
1.4.3 Single site configuration
1.4.4 Multi-site configuration
1.5 High-level features

Chapter 2  Hardware and Software
2.1 Hardware
2.2 Software
2.3 Networks
2.4 Scalability and limits
2.5 Hardware architecture
2.6 Engine components
2.7 I/O modules
2.8 Management server
2.9 Other hardware components

Chapter 3  Software Architecture
3.1 Introduction
3.2 Simplified storage management
3.3 Management server user accounts
3.4 Management server software
3.4.1 Management console
3.4.2 Command line interface
3.4.3 System reporting
3.5 Director software
3.6 Internal connections
3.7 External connections
3.8 Configuration overview
3.8.1 Small configurations
3.8.2 Medium configurations
3.8.3 Large configurations
3.9 I/O implementation
3.9.1 Cache layering roles
3.9.2 Share groups
3.9.3 Cache coherence
3.9.4 Meta-directory
3.9.5 How a read is handled
3.9.6 How a write is handled

Chapter 4  System Integrity
4.1 Overview
4.2 Cluster
4.3 Path redundancy through different ports
4.4 Path redundancy through different directors
4.5 Path redundancy through different engines
4.6 Path redundancy through site distribution
4.7 Safety check

Chapter 5  VPLEX Local and VPLEX Metro Federated Solution
5.1 Enabling a federated solution
5.2 Deployment overview
5.2.1 VPLEX Local deployment
5.2.2 When to use a VPLEX Local deployment
5.2.3 VPLEX Metro deployment within a data center
5.2.4 VPLEX Metro deployment between data centers
5.3 Workload resiliency
5.3.1 Storage array outages
5.3.2 SAN outages
5.4 Technology refresh use case
5.5 Introduction to distributed data access
5.5.1 Traditional approach for distributed data access
5.5.2 Removing barriers for distributed data access with VPLEX Metro
5.6 Technical overview of the VPLEX Metro solution
5.6.1 Distributed block cache
5.6.2 Enabling distributed data access

Chapter 6  VPLEX Use Case Example using VMware
6.1 Workload relocation
6.2 Use case examples
6.3 Nondisruptive migrations using Storage VMotion
6.4 Migration using encapsulation of existing devices
6.5 VMware deployments in a VPLEX Metro environment
6.5.1 VMware cluster configuration
6.5.2 Nondisruptive migration of virtual machines using VMotion
6.5.3 Changing configuration of non-replicated VPLEX Metro volumes
6.5.4 Virtualized vCenter server on VPLEX Metro
6.6 Conclusion

Glossary

Index


Figures

1  Primary use cases
2  VPLEX family architecture overview
3  VPLEX offerings
4  VPLEX vision
5  EMC VPLEX Storage Cluster
6  Storage provisioning overview
7  Use cases
8  Cluster configuration
9  Single site configuration
10  Multi-site configuration
11  VPLEX system cabinet - front and rear view
12  Standard EMC cabinet
13  VPLEX engine components
14  Port roles in the I/O modules
15  VPLEX Management Console
16  Management Console welcome screen
17  Provision Storage tab
18  Context-sensitive help example
19  Network architecture
20  Management wiring of small, medium, and large configurations
21  VPLEX configurations - small, medium and large
22  VPLEX small configuration
23  VPLEX medium configuration
24  VPLEX large configuration
25  Cache layer roles and interactions
26  Port redundancy
27  Director redundancy
28  Engine redundancy
29  Site redundancy
30  VPLEX Local deployment
31  VPLEX Metro deployment in a single data center
32  VPLEX Metro deployment between two data centers
33  RAID 1 mirroring to protect against array outages
34  Dual-fabric deployment
35  Fabric assignments for FE and BE ports
36  EMC Storage device displayed by EMC Virtual Storage Integrator (VSI)
37  VPLEX devices in VMware ESX server cluster
38  Storage VMotion to migrate virtual machines to VPLEX devices
39  Devices to be encapsulated
40  Encapsulating devices in a VPLEX system
41  Creating extents on encapsulated storage volumes
42  VPLEX RAID 1 protected device on encapsulated VMAX devices
43  Virtual volumes on VPLEX
44  Storage view
45  Rescan of the SCSI bus on the VMware ESX servers
46  Mounting datastores on encapsulated VPLEX devices
47  Resignaturing datastores on encapsulated VPLEX devices
48  Adding virtual machines
49  Configuration of VMware clusters
50  Storage view of the datastores presented to VMware clusters
51  Datastores and virtual machines view
52  Metro-Plex volume presented to the VMware environment
53  Detach rules on VPLEX distributed devices
54  vCenter Server allowing live migration of virtual machines
55  Progression of VMotion between two physical sites
56  VMware datastore at a single site in a Metro-Plex configuration
57  Non-replicated Metro-Plex virtual volume
58  Creating a device on VPLEX
59  Protection type change of a RAID 0 VPLEX device to distributed RAID 1
60  VPLEX virtual volume at the second site
61  Viewing VMware ESX servers


Tables

1  Features and descriptions
2  Scalability and limits
3  Management server user accounts
4  Using VPlexcli wildcards


Preface

As part of an effort to improve and enhance the performance and capabilities of its product lines, EMC periodically releases revisions of its hardware and software. Therefore, some functions described in this document may not be supported by all versions of the software or hardware currently in use. For the most up-to-date information on product features, refer to your product release notes.

This document describes the EMC VPLEX family and how it removes physical barriers within, across, and between data centers.
Audience

This document is part of the EMC VPLEX family documentation set, and is intended for use by storage and system administrators.

Readers of this document are expected to be familiar with the following topics:

Storage Area Networks
Storage Virtualization
VMware Technologies
EMC Symmetrix and CLARiiON Products

Related documentation

Related documents include:

EMC VPLEX Installation and Setup Guide
EMC VPLEX Site Preparation Guide
Implementation and Planning Best Practices for EMC VPLEX Technical Notes
Using VMware Virtualization Platforms with EMC VPLEX - Best Practices Planning


Workload Resiliency with EMC VPLEX - Best Practices Planning
Nondisruptive Storage Relocation: Planned Events with EMC VPLEX - Best Practices Planning

This document is divided into six chapters:


Chapter 1, VPLEX Family Overview, summarizes the VPLEX family. It also covers some of the key features of the VPLEX family system.

Chapter 2, Hardware and Software, summarizes the hardware, software, and network components of the VPLEX system.

Chapter 3, Software Architecture, summarizes the software interfaces that can be used by an administrator to manage all aspects of a VPLEX system.

Chapter 4, System Integrity, summarizes how VPLEX clusters are able to handle hardware failures in any subsystem within the storage cluster.

Chapter 5, VPLEX Local and VPLEX Metro Federated Solution, summarizes the VPLEX Local and VPLEX Metro federated solution, along with the planning considerations, capabilities, and scalability options to understand before deployment.

Chapter 6, VPLEX Use Case Example using VMware, provides use case examples that use Storage VMotion and VPLEX in VMware environments.


Typographical conventions

EMC uses the following type style conventions in this document:

Normal: Used in running (nonprocedural) text for names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus); names of resources, attributes, pools, Boolean expressions, buttons, DQL statements, keywords, clauses, environment variables, functions, and utilities; and URLs, pathnames, filenames, directory names, computer names, links, groups, service keys, file systems, and notifications.

Bold: Used in running (nonprocedural) text for names of commands, daemons, options, programs, processes, services, applications, utilities, kernels, notifications, system calls, and man pages. Used in procedures for names of interface elements (such as names of windows, dialog boxes, buttons, fields, and menus) and for what the user specifically selects, clicks, presses, or types.

Italic: Used in all text (including procedures) for full titles of publications referenced in text, for emphasis (for example, a new term), and for variables.

Courier: Used for system output (such as an error message or script) and for URLs, complete paths, filenames, prompts, and syntax when shown outside of running text.

Courier bold: Used for specific user input (such as commands).

Courier italic: Used in procedures for variables on the command line and for user input variables.

<>: Angle brackets enclose parameter or variable values supplied by the user.

[]: Square brackets enclose optional values.

|: A vertical bar indicates alternate selections; the bar means "or".

{}: Braces indicate content that you must specify (that is, x or y or z).

...: Ellipses indicate nonessential information omitted from the example.


This TechBook was authored by a team from the Symmetrix and Virtualization Product Group based at EMC Headquarters, Hopkinton, MA.
Michael Cram has 11 years of experience in the data storage industry
specializing in SAN virtualization and federation technologies.
Bala Ganeshan is a Corporate Systems Engineer in the Symmetrix
Partner Engineering team focusing on VMware and Cisco technologies.
Before starting in the current position, Bala worked as a Technical
Business Consultant in the Southeast area focusing on Business
Continuity. He has worked at EMC for 10 years in various capacities.
Bala has over 20 years of experience working in the IT industry. He
holds a Ph.D. from the University of California at San Diego.
Bradford Glade has 20 years of experience in developing fault-tolerant
distributed systems for the storage, telecommunications, financial, and
manufacturing industries. Over the last seven years his work has
focused on creating storage virtualization technologies at EMC. He
earned his Ph.D. and M.S. in Computer Science from Cornell University
and his B.S. in Computer Science from the University of Vermont.
Varina Hammond has been working at EMC for more than 13 years,
almost entirely in the Symmetrix and Storage Virtualization Product
Group. She specializes in storage operating systems, and currently
holds the title of Senior Engineering Manager.
Mary Peraro is a member of the EMC Engineering Communications
team and has 20 years of experience in hardware and software technical
writing.
Suzanne Quest is a member of the EMC Engineering Communications
team and has over 20 years of experience in software technical writing.
Jim Wentworth has been working with EMC for over 15 years in the
areas of testing and validation, professional services, and education.
He currently holds the position of Advanced Solutions Architect in the
Customer Research Engineering group within the Symmetrix and
Virtualization Product Group.


Additional contributors to this book:

Jennifer Aspesi has over 9 years of work experience with EMC in SAN, WAN, and storage security technologies.

Anshul Chadda is SVE Product Manager for EMC VPLEX.

Arne Joris has 10 years of experience in storage virtualization and software engineering.

Donald Kirouac has 14 years of experience in open systems storage network architecture, design, and implementation. He is currently Principal Corporate Systems Engineer for EMC SVPG.

David Lewis has more than 12 years of experience with expertise in Global Finance, Sales, and Product Marketing.

We'd like to hear from you!

Your feedback on our TechBooks is important to us! We want our books to be as helpful and relevant as possible, so please feel free to send us your comments, opinions and thoughts on this or any other TechBook:
TechBooks@emc.com


1
VPLEX Family Overview

This chapter provides a brief executive summary of the EMC VPLEX family described in this TechBook. It also covers some of the key features of the VPLEX family system. Topics include:

1.1 Introduction
1.2 VPLEX family architecture overview
1.3 VPLEX family overview
1.4 VPLEX family clustering architecture
1.5 High-level features


1.1 Introduction
The purpose of this TechBook is to introduce the EMC VPLEX family as it is logically and physically architected, and to provide an overview of the features and functionality associated with the VPLEX family as it pertains to its primary use cases, as shown in Figure 1.

Figure 1

Primary use cases

The TechBook concludes with a use case demonstrating how storage and system administrators can nondisruptively move data in their VMware environments across disparate clusters using VPLEX family distributed federation. Though the VPLEX family implementation is not limited to the VMware virtualized server environment, it portrays a primary use case that is compelling and sparks the reader's imagination as to the potential simplified scalability behind the VPLEX family. The local federation capabilities of the VPLEX family system allow a collection of heterogeneous data storage solutions at a physical site to be presented as a pool of resources for the VMware virtualization platform, thus enabling the major tenets of a cloud offering.


An extension of the VPLEX family systems' capabilities to span multiple data centers enables IT administrators to leverage either private or public cloud offerings from hosting service providers. The synergies provided by a VMware virtualization offering connected to a VPLEX family system help customers reduce total cost of ownership while providing a dynamic service that can rapidly respond to the changing needs of their business.


1.2 VPLEX family architecture overview


The VPLEX family represents the next-generation architecture for data mobility and information access. This new architecture is based on EMC's 20+ years of expertise in designing, implementing, and perfecting enterprise-class intelligent cache and distributed data protection solutions.
The VPLEX family is a solution for federating EMC and non-EMC storage. The VPLEX family resides between the servers and heterogeneous storage assets and introduces a new architecture with unique characteristics:


Scale-out clustering hardware that lets you start small and grow big with predictable service levels

Advanced data caching utilizing large-scale SDRAM cache to improve performance and reduce I/O latency and array contention

Distributed cache coherence for automatic sharing, balancing, and failover of I/O across the cluster

Consistent view of one or more LUNs across VPLEX family clusters, separated either by a few feet within a data center or across synchronous distances, enabling new models of high availability and workload relocation

Figure 2

VPLEX family architecture overview


1.3 VPLEX family overview


The VPLEX family today consists of:

EMC VPLEX Local, as shown in Figure 3, for managing data mobility and access within the data center.

EMC VPLEX Metro, as shown in Figure 3, for mobility and access across two locations at synchronous distances. VPLEX Metro also includes the unique capability where a remote VPLEX Metro site can present LUNs without the need for physical storage for those LUNs at the remote site.

The VPLEX family future will consist of:

EMC VPLEX Geo, as shown in Figure 4, planned for 2011, adds access between two sites over extended asynchronous distances.

EMC VPLEX Global, as shown in Figure 4, planned for later, will enable EMC AccessAnywhere across multiple locations.

Figure 3

VPLEX offerings

Figure 4

VPLEX vision


1.3.1 VPLEX Local


VPLEX Local provides simplified management and nondisruptive data mobility across heterogeneous arrays. VPLEX Metro provides data access and mobility between two VPLEX clusters within synchronous distances. With a unique scale-up and scale-out architecture, the VPLEX family's advanced data caching and distributed cache coherency provide workload resiliency, automatic sharing, balancing, and failover of storage domains, and enable both local and remote data access with predictable service levels.
VPLEX Local supports local federation today. EMC Symmetrix VMAX hardware will also add local federation capabilities natively to the array later in 2010 and will work independently of the VPLEX family.

1.3.2 VPLEX Metro


VPLEX Metro delivers distributed federation capabilities and extends
access between two locations at synchronous distances. VPLEX Metro
leverages AccessAnywhere to enable a single copy of data to be
shared, accessed, and relocated over distance.

1.3.3 VPLEX Geo and VPLEX Global


Future capabilities of the planned VPLEX Geo and VPLEX Global additions to the family will expand the platform's distributed federation capabilities, including support for asynchronous distances (VPLEX Geo) and more than two sites (VPLEX Global).


1.4 VPLEX family clustering architecture


The VPLEX family uses a unique clustering architecture to help
customers break the boundaries of the data center and allow servers at
multiple data centers to have read/write access to shared block storage
devices.
An EMC VPLEX Storage Cluster, as shown in Figure 5, can scale up
through the addition of more engines, and scale out by connecting
clusters into an EMC VPLEX Metro-Plex (two VPLEX Metro clusters
connected within metro distances). VPLEX Metro transparently moves
and shares workloads, including VMware ESX servers, consolidates
data centers, and optimizes resource utilization across data centers. In
addition, it provides nondisruptive data mobility, heterogeneous
storage management, and improved application availability. VPLEX
Metro supports up to two clusters, which can be in the same data
center, or at two different sites within synchronous distances
(approximately up to 60 miles or 100 kilometers apart).

Figure 5

EMC VPLEX Storage Cluster


All VPLEX clusters are built from a standard engine configuration. The engine is responsible for the federation of the I/O stream. It connects to hosts and storage using Fibre Channel as the data transport. A VPLEX cluster consists of one or more engines, each of which contains several major components. Section 2.5, Hardware architecture, on page 38 provides more hardware details.

1.4.1 Provisioning storage


To begin using a VPLEX cluster, you must provision and export storage
so that hosts and applications can use the storage. Provisioning and
exporting storage refers to the tasks required to take a storage volume
from a storage array and make it visible to a host. This process consists
of the following tasks:

Discovering available storage

Claiming and naming storage volumes

Creating extents from storage volumes

Creating devices from the extents

Creating virtual volumes on the devices

Registering initiators

Creating a storage view

Adding initiators, ports, and volumes to the storage view.

Starting from the bottom, Figure 6 on page 25 shows the storage volumes that are claimed. These volumes are divided into multiple extents; however, you can create a single full-size extent using the entire capacity of the storage volume. Devices are then created to combine extents or other devices into one large device. From this large device, a virtual volume is created.


Figure 6

Storage provisioning overview

The virtual device is presented to the host through a storage view. A storage view defines which hosts access which virtual volumes on which VPLEX family ports. It consists of the following components:

Registered initiators (hosts) to access the storage

VPLEX family ports (front-end) to export the storage

One or more virtual volumes to export

Typically, one storage view is created for all hosts that require access to the same storage.
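The provisioning chain described in this section (claimed storage volumes carved into extents, extents combined into devices, devices exposed as virtual volumes, and virtual volumes exported through a storage view) can be summarized with a small conceptual model. The following Python sketch is an illustration only; the class and object names are assumptions and do not correspond to the VPLEX management interfaces, which are the Management Console GUI and VPlexcli.

from dataclasses import dataclass, field
from typing import List

# Conceptual model of the provisioning chain described above.
# All names are illustrative; VPLEX itself is provisioned through
# the Management Console GUI or VPlexcli, not through this code.

@dataclass
class StorageVolume:           # a claimed LUN from a back-end array
    name: str
    capacity_gb: int

@dataclass
class Extent:                  # a slice (or all) of a storage volume
    source: StorageVolume
    size_gb: int

@dataclass
class Device:                  # RAID 0 / RAID 1 / RAID C built from extents
    name: str
    geometry: str
    extents: List[Extent]

@dataclass
class VirtualVolume:           # the object a host actually sees
    name: str
    device: Device

@dataclass
class StorageView:             # maps initiators and FE ports to virtual volumes
    name: str
    initiators: List[str] = field(default_factory=list)
    fe_ports: List[str] = field(default_factory=list)
    volumes: List[VirtualVolume] = field(default_factory=list)

# Example: claim two storage volumes, create one full-size extent on each,
# mirror them as a RAID 1 device, wrap the device in a virtual volume,
# and export it to a host through a storage view.
sv_a = StorageVolume("array1_lun_040", capacity_gb=100)
sv_b = StorageVolume("array2_lun_017", capacity_gb=100)
legs = [Extent(sv, sv.capacity_gb) for sv in (sv_a, sv_b)]
mirror = Device("dev_mirror_01", geometry="raid-1", extents=legs)
vvol = VirtualVolume("vvol_01", device=mirror)
view = StorageView("host_view_01",
                   initiators=["host01_hba0"],
                   fe_ports=["FC00"],
                   volumes=[vvol])
print(f"{view.initiators[0]} sees {view.volumes[0].name} "
      f"({mirror.geometry} across {len(mirror.extents)} extents)")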


1.4.2 Use cases


Both VPLEX Local and VPLEX Metro deliver significant value to
customers.
For VPLEX Local, common use cases include:


Data mobility between EMC and non-EMC storage platforms. VPLEX allows users to federate heterogeneous storage arrays and transparently move data across them to simplify and expedite data movement, including ongoing technology refreshes and/or lease rollovers.

Simplified management of multi-array storage environments. VPLEX provides simple tools to provision and allocate virtualized storage devices to standardize LUN presentation and management. The ability to pool and aggregate capacity across multiple arrays can also help improve storage utilization.

Increased storage resiliency. This allows storage to be mirrored across mixed platforms without requiring host resources. Leveraging this capability can increase protection and continuity for critical applications.

Figure 7

Use cases


VPLEX Metro can be deployed in the following ways:

In a single data center for moving, mirroring, and managing storage.

In multiple data centers for workload relocation, disaster avoidance, and data center maintenance.

When two VPLEX Metro clusters are connected together within synchronous distances, they form a Metro-Plex. Common Metro-Plex use cases include:

Mobility and relocations between locations over synchronous distances. In combination with VMware and VMotion over distance, VPLEX Metro allows users to transparently move and relocate virtual machines and their corresponding applications and data over distance. This provides a unique capability allowing users to relocate, share, and balance infrastructure resources between data centers.

Distributed and shared data access within, between, and across clusters within synchronous distances. A single copy of data can be accessed by multiple users across two locations. This allows instant access to information in real time, and eliminates the operational overhead and time required to copy and distribute data across locations.

Increased resiliency with mirrored volumes within and across locations. VPLEX Metro provides non-stop application availability in the event of a component failure.


Figure 8

Cluster configuration


1.4.3 Single site configuration


There are common situations that can disrupt the continuous access
that hosts need to their data. A typical example is when it comes time to
replace the physical array that is providing this storage. When this
situation arises, historically the data used by the host must be copied
over to a new volume on the new array and the host reconfigured to
access this new volume. This process often requires downtime for the
host.

Figure 9

Single site configuration

With VPLEX, however, because the data is in virtual volumes, it can be copied nondisruptively from one array to another without any downtime for the host. The host does not need to be reconfigured; the physical data relocation is performed by VPLEX transparently and the virtual volumes retain their same identities and access points to the host.


1.4.4 Multi-site configuration


Another common requirement for data centers is maintaining
redundant copies of data at a different location for workload relocation.
If one site goes down for any reason, the data is still available to the
host from the other location.
In Figure 10, the hosts located at Site 1 have complete access to the data
stored at Site 2 and vice versa. Two copies of critical data can be
mirrored between sites to ensure business continuity.

Figure 10

Multi-site configuration

Chapter 4, System Integrity, describes in more detail the VPLEX recovery process that occurs when any component in a redundant configuration fails.


1.5 High-level features


The VPLEX family provides immediate benefits for single-site deployments, including increased resiliency for unplanned outages, centralized storage management, and storage optimization. The benefits increase further and extend out over distance with VPLEX Metro, which is capable of delivering VPLEX's benefits within, between, and across clusters, over synchronous distances.
Table 1 provides a brief description of the high-level VPLEX family features.
Table 1  Features and descriptions

Storage volume encapsulation: Disks on a back-end array can be imported into an instance of VPLEX and used while keeping their data intact.

RAID 0: Devices can be aggregated to create a RAID 0 striped device.

RAID C: Devices can be concatenated to form a new larger device.

RAID 1: Devices can be mirrored within a site.

Distributed RAID 1: Devices can be mirrored between sites.

Disk slicing: Storage volumes can be partitioned and devices created from these partitions.

Migration: Volumes can be migrated nondisruptively to other storage systems.

Remote export: The presentation of a volume from one VPLEX cluster where the physical storage for the volume is provided by a remote VPLEX cluster.

Write-through cache: Host writes pass through the cache to the back-end arrays and are acknowledged by the arrays prior to acknowledgement to the host.
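To make the write-through cache behavior in Table 1 concrete, the following Python sketch models the acknowledgement ordering: the host's write completes only after every back-end array behind the volume (for example, both legs of a distributed RAID 1 mirror) has acknowledged it. This is a conceptual illustration only; the class and function names are assumptions, not VPLEX interfaces.

from typing import List

class BackEndArray:
    """Stands in for a back-end storage array behind VPLEX (illustrative only)."""
    def __init__(self, name: str):
        self.name = name
        self.blocks = {}

    def write(self, lba: int, data: bytes) -> bool:
        # The array persists the block and acknowledges the write.
        self.blocks[lba] = data
        return True

def write_through(mirror_legs: List[BackEndArray], lba: int, data: bytes) -> bool:
    """Write-through semantics: the host is acknowledged only after
    every mirror leg (back-end array) has acknowledged the write."""
    acks = [array.write(lba, data) for array in mirror_legs]
    return all(acks)   # host-level acknowledgement

# Example: a distributed RAID 1 volume mirrored across two sites.
site_a = BackEndArray("array-site-1")
site_b = BackEndArray("array-site-2")
if write_through([site_a, site_b], lba=42, data=b"payload"):
    print("host write acknowledged after both arrays acknowledged")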


2
Hardware and Software

This chapter provides a brief introduction to the hardware, software, and network components of the VPLEX system. It also includes scalability and interoperability information. Topics include:

2.1 Hardware
2.2 Software
2.3 Networks
2.4 Scalability and limits
2.5 Hardware architecture
2.6 Engine components
2.7 I/O modules
2.8 Management server
2.9 Other hardware components


2.1 Hardware
All VPLEX clusters are built from a standard VPLEX engine component. The engine is responsible for the federation of the I/O stream. It connects to hosts and storage using Fibre Channel as the data transport. A VPLEX cluster consists of one, two, or four engines, each of which contains the following major components:

Two directors, which run the EMC GeoSynchrony software and connect to storage, hosts, and other directors in the cluster with Fibre Channel and gigabit Ethernet connections.

One Standby Power Supply, which provides backup power to sustain the engine through transient power loss.

Two management modules, which contain interfaces for remote management of an EMC VPLEX engine.

Each director is configured with five (5) Fibre Channel I/O modules and one (1) Gigabit Ethernet I/O module.

Each cluster also consists of:

A management server, which manages the cluster and provides an interface from a remote management station.

An EMC standard 40U cabinet to hold all of the equipment of the cluster.

Additionally, clusters containing more than one engine also have:

A pair of Fibre Channel switches used for inter-director communication.

A pair of uninterruptible power supplies (UPS) that provide backup power for the Fibre Channel switches and allow the system to ride through transient power loss.

A VPLEX cabinet can accommodate up to four engines, making it easy to upgrade from a small configuration to a medium or large cluster configuration.
Section 2.5, Hardware architecture, on page 38 provides more hardware details.
Section 2.5, Hardware architecture, on page 38 provides more
hardware details.


2.2 Software
The VPLEX cluster firmware is GeoSynchrony 4.0, which manages
cluster functions, such as processing I/O from hosts, cache processing,
virtualization logic, and interfaces with arrays for claiming and I/O
processing.
The VPLEX management server contains the software for the command
line interface (VPlexcli) and the VPLEX management console, a
web-based graphical user interface (GUI). The VPLEX management
server communicates with the directors, retrieves logs by querying
system state, supports multiple CLI and HTTP sessions, listens to the
system events and determines which events are of interest for call
home, and interprets the call home list and initiates the call home.
The management server can also provide call-home services through
the public Ethernet port by connecting to an EMC Secure Remote
Support (ESRS) gateway deployed on that same network, which can
also be used to facilitate service by EMC personnel.


2.3 Networks
The VPLEX system is inter-connected using Fibre Channel.
A management server has four Ethernet ports, identified as eth0 through eth3 by the operating system. A public management port (eth3) is the only Ethernet port in the VPLEX rack connected to the customer's management LAN. Other components in the cluster's rack are connected to two redundant private management Ethernet networks, connected to the management server's eth0 and eth2 ports. The service port (eth1) can be connected to a service laptop, giving it access to the same services as a host on the management LAN.
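The port roles above can be summarized in a small lookup table. The sketch below is only an illustration of the wiring roles described in this section; the dictionary and helper function are assumptions and are not part of any VPLEX tool.

# Illustrative summary of the management server Ethernet port roles
# described above; not part of any VPLEX software.
MGMT_SERVER_PORTS = {
    "eth0": "private management network A (intra-rack)",
    "eth1": "service port (service laptop)",
    "eth2": "private management network B (intra-rack)",
    "eth3": "public management port (customer management LAN)",
}

def is_public(port: str) -> bool:
    """Only eth3 faces the customer's management LAN."""
    return port == "eth3"

for port, role in MGMT_SERVER_PORTS.items():
    print(f"{port}: {role} ({'public' if is_public(port) else 'private/service'})")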


2.4 Scalability and limits


Table 2 provides the scalability and limits for VPLEX clusters. Always
check the latest version of the release notes for the most up-to-date
information.
Table 2  Scalability and limits

Parameter                     Maximum number
Virtual volumes               8000
Storage volumes               8000
Initiators (HBA ports)        400
Initiators per port           256
Extents per storage volume    128
Extents                       24000
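As a quick planning aid, the limits in Table 2 can be encoded in a small script and checked against a proposed layout. The sketch below is an illustration only; the limit values are copied from Table 2 as published here, and the current release notes remain the authoritative source.

# Planning aid: check a proposed VPLEX configuration against the limits
# in Table 2. Values are taken from this TechBook; always confirm against
# the current release notes, which take precedence.
VPLEX_LIMITS = {
    "virtual_volumes": 8000,
    "storage_volumes": 8000,
    "initiators": 400,
    "initiators_per_port": 256,
    "extents_per_storage_volume": 128,
    "extents": 24000,
}

def check_plan(plan: dict) -> list:
    """Return a list of human-readable violations (empty list means OK)."""
    violations = []
    for parameter, proposed in plan.items():
        limit = VPLEX_LIMITS.get(parameter)
        if limit is not None and proposed > limit:
            violations.append(
                f"{parameter}: proposed {proposed} exceeds maximum {limit}")
    return violations

# Hypothetical plan for a single cluster.
plan = {"virtual_volumes": 1200, "storage_volumes": 1500,
        "initiators": 96, "extents": 30000}
for problem in check_plan(plan) or ["plan is within published limits"]:
    print(problem)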


2.5 Hardware architecture


The VPLEX storage cluster cabinet contains one, two, or four VPLEX engines, I/O modules, management modules, management servers, Fibre Channel switches, power supplies, and fans.

Figure 11

VPLEX system cabinet - front and rear view


All VPLEX cluster configurations are pre-installed in a standard EMC cabinet during manufacturing. The rack measurements and access details are shown in Figure 12.

Figure 12

Standard EMC cabinet

All configurations have a common power feed design. VPLEX clusters use single-phase power. Each cabinet contains four power distribution panels (PDPs), two of which are used, and four power distribution units (PDUs).

Note: Two power drops are required per cluster.

VPLEX clusters ship as a complete rack with the VPLEX engine(s) pre-installed, pre-cabled, and pre-tested. Unused space is reserved for future expansion.


2.6 Engine components


The VPLEX engine is responsible for the virtualization of the I/O
stream. An engine contains two directors, each containing 32 GB of
memory and a 30 GB solid state drive (SSD). Each director has an SSD
drive providing the storage for the GeoSynchrony image and space for
storing diagnostic data, such as core dumps and traces.
Figure 13 illustrates the engine components.

Figure 13

VPLEX engine components

Each director has four (4) ports of 8 Gb/s Fibre Channel for host I/O
and four (4) ports of 8 Gb/s Fibre Channel for storage array I/O.
Each director has one (1) SSD with a 30 GB capacity. The SSDs are used
to store the operating system, firmware, and logs.
Two management access modules provide service access to the directors through an embedded Ethernet switch and two (2) external RJ-45 interfaces. They also provide two (2) micro-DB-9 interfaces that are used to monitor the standby power supply (SPS), which provides battery backup power to the chassis in the event of power loss, and the uninterruptible power supply (UPS), which provides backup power to the internal intra-cluster switches in medium and large configurations.
Power is provided by two (2) power supplies and cooling is supported
within the engine by four (4) hot-swappable fan modules.
The engine consumes 4U of rack space.


2.7 I/O modules


Each director is configured with five (5) Fibre Channel I/O modules
and one (1) Gigabit Ethernet I/O module. A description of each module
follows:

Fibre Channel I/O Module
This module provides four (4) 8 Gb/s Fibre Channel ports. Each I/O director uses four (4) of these modules for host and storage connectivity and a fifth for inter-director and WAN communication.

Gigabit Ethernet I/O Module
This module is currently not used.

Figure 14 illustrates the port roles in the I/O modules.

Figure 14

Port roles in the I/O modules


2.8 Management server


Each VPLEX cluster has one (1) management server. The management server provides the connectivity to the customer's IP network and serves as the management access point for the VPLEX cluster. In a Metro deployment across distance, the management servers in different clusters provide a secure VPN tunnel between the sites, which allows the management servers to communicate with each of the directors in the Metro-Plex.
The management server runs the VPLEX System Management Software (SMS) as well as the ConnectEMC software that provides support for notifications to a local ESRS gateway and to the customer.
The management server contains:

Uni-processor, dual-core Xeon 3065 2.33 GHz

4 GB RAM

One 250 GB SATA near-line drive

Management connectivity is enabled by the redundant management modules. There are two management modules in each engine. Each management module provides internal connectivity to both directors. For availability purposes, the management network is wired to form two physically and logically independent IP subnets.


2.9 Other hardware components


In addition to the cabinet, engine, management modules, and
management server, the following hardware is also included in a
VPLEX cluster.
2.9.0.1 Engine SPS

Each VPLEX engine is supported by a pair of standby power supplies that provide DC power in the case of loss of AC power. The pair of SPS units are mounted side-by-side and consume 2U of rack space. The batteries in these units support a hold-up time of ten minutes. The maximum hold-up time for a single event is five minutes.

2.9.0.2 UPS

A 350V UPS unit is used for each of the private intra-cluster switches
used in the medium and large VPLEX configurations to supply power
to these switches for transient power loss ride through. The UPS units
are monitored through their serial port. These ports are connected via a
serial cable to the primary management station within the rack.
Each power supply module is hot-swappable.

2.9.0.3 Fans

Four independent fans (four FRUs) cool each VPLEX engine.


Each fan module is hot-swappable.

2.9.0.4 Fibre Channel COM switch


An intra-director Fibre Channel COM switch is used in large cluster
configurations to create a redundant network for intra-cluster
communication between directors. Each director has two independent
COM paths to every other director.
A Fibre Channel COM switch is required for two-engine and
four-engine configurations.


3
Software Architecture

This chapter explains the software interfaces that can be used by an administrator to manage all aspects of a VPLEX system. In addition, a brief overview of the internal system software is included. Topics include:

3.1 Introduction
3.2 Simplified storage management
3.3 Management server user accounts
3.4 Management server software
3.5 Director software
3.6 Internal connections
3.7 External connections
3.8 Configuration overview
3.9 I/O implementation


3.1 Introduction
The system management software for VPLEX family systems consists
of the following high-level components:

Command line utility

Management console (web interface)

Business layer

Firmware layer

Each cluster in a VPLEX deployment requires one management server, which is embedded in the VPLEX cabinet along with other essential components, such as the directors and internal Fibre Channel switches. The management server communicates through private, redundant IP networks with each director. The management server is the only VPLEX component that is configured with a public IP address on the customer network.
The management server is accessed through a Secure Shell (SSH). Additionally, the administrator may run a VNC client to connect to the management server. Within the SSH session, the administrator can run a CLI utility called VPlexcli to manage the system. Alternatively, the VPLEX management console web interface (GUI) can be started by pointing a browser at the management server's public IP address.
The following processes run on the management server:


System Management Server: Communicates with the directors, retrieves logs by querying system state, supports multiple concurrent CLI and HTTP sessions, listens to the system events and determines which events are of interest for call home, and interprets the call home list and initiates the call home.

EmaAdapter: Collects events from VPLEX components and sends them to ConnectEMC.

ConnectEMC: Receives the formatted events and sends them to EMC.com.


3.2 Simplified storage management


VPLEX supports a variety of arrays from various vendors covering both
active/active and active/passive type arrays. VPLEX simplifies storage
management by allowing simple LUNs, provisioned from the various
arrays, to be managed through a centralized management interface that
is simple to use and very intuitive. In addition, a Metro-Plex
environment that spans data centers allows the storage administrator to
manage both locations through the one interface from either location by
logging in at the local site.


3.3 Management server user accounts


The management server requires the setup of user accounts for access to
certain tasks. Table 3 describes the types of user accounts on the
management server.
Table 3  Management server user accounts

admin (customer):
Performs administrative actions, such as user management
Creates and deletes Linux CLI accounts
Resets passwords for all Linux CLI users
Modifies the public Ethernet settings

service (EMC service):
Starts and stops necessary OS and VPLEX services
Cannot modify user accounts
(Customers do have access to this account)

Linux CLI accounts:
Uses VPlexcli to manage federated storage

All account types:
Uses VPlexcli
Modifies their own password
Can SSH or VNC into the management server
Can SCP files off the management server from directories to which they have access

Some service and administrator tasks require OS commands that


require root privileges. The management server has been configured to
use the sudo program to provide these root privileges just for the
duration of the command. Sudo is a secure and well-established UNIX
program for allowing users to run commands with root privileges.
VPLEX documentation will indicate which commands must be prefixed
with "sudo" in order to acquire the necessary privileges. The sudo
command will ask for the user's password when it runs for the first
time, to ensure that the user knows the password for his account. This
prevents unauthorized users from executing these privileged
commands when they find an authenticated SSH login that was left
open.


3.4 Management server software


The management server software is installed during manufacturing
and is fully field upgradeable. The software includes:

VPLEX Management Console

VPlexcli

Server Base Image Updates (when necessary)

Call-home software

3.4.1 Management console


The VPLEX Management Console provides a graphical user interface
(GUI) to manage the VPLEX cluster. The GUI can be used to provision
storage, as well as manage and monitor system performance.
Figure 15 shows the VPLEX Management Console window with the
cluster tree expanded to show the objects that are manageable from the
front-end, back-end, and the federated storage.

Figure 15

VPLEX Management Console


The VPLEX Management Console provides online help for all of its
available functions. You can access online help in the following ways:

Click the Help icon in the upper right corner on the main screen to
open the online help system, or in a specific screen to open a topic
specific to the current task.

Click the Help button on the task bar to display a list of links to
additional VPLEX documentation and other sources of information.

Figure 16 is the welcome screen of the VPLEX Management Console GUI, which utilizes a secure HTTP connection via a browser. The interface uses Flash technology for rapid response and a unique look and feel.

Figure 16

Management Console welcome screen


Figure 17 displays the inside of the Provision Storage tab, an EMC standardized interface that is both responsive and intuitive.

Figure 17

Provision Storage tab

Figure 18 is an example of the context-sensitive help that is built into the GUI for all functions, for ease of management.

Figure 18

Context-sensitive help example


3.4.2 Command line interface


The VPlexcli is a command line interface (CLI) for configuring and running the VPLEX system, for setting up and monitoring the system's hardware and inter-site links (including com/tcp), and for configuring global inter-site I/O cost and link-failure recovery. The CLI runs as a service on the VPLEX management server and is accessible using Secure Shell (SSH).
Once a Secure Shell connection is established to the management server, the CLI session is started with telnet localhost 49500. You are prompted for a login and password.
Output from each command is displayed in the same terminal window in which the command is executed.
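The following sketch shows what starting a session might look like; the management server address and the intermediate prompt text are placeholders, so the exact output on your system may differ:

   $ ssh admin@<management-server-address>
   ...
   admin@management-server:~> telnet localhost 49500
   (login and password prompts appear here)
   VPlexcli:/>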
3.4.2.1 Navigation and context
Fundamental to using the VPlexcli is the concept of object context, which
is determined by current location within the directory tree of managed
objects.
A detailed explanation for each attribute of an object can be obtained
using the describe command, as shown in the following example:
VPlexcli:/engines/engine-1-1/fans/fan-0> ll
Name                      Value
------------------------  ------
operational-status        online
speed-threshold-exceeded  false

VPlexcli:/engines/engine-1-1/fans/fan-0> describe
Attribute                 Description
------------------------  ----------------------------------------
operational-status        The operational status of the fan.
speed-threshold-exceeded  The flag indicates if the speed of the
                          fan has exceeded the threshold or not.

The CLI prompt indicates your current context, for example:
VPlexcli:/clusters/Hopkinton>
The CLI navigation is similar to a UNIX/Linux shell. What appear as directories are actually contexts. You can see more contexts in the CLI than in the context tree of the GUI.


Each context may have attributes as well as sub- or child contexts. You may change your current context location using:

- cd <relative path>, for example:
  VPlexcli:/> cd clusters/Boston
- cd <full path>, for example:
  VPlexcli:/> cd /clusters/Boston
- cd .., for example:
  VPlexcli:/clusters/Hopkinton> cd ../Boston

Available commands change depending on the current context. Some commands are global and some are inherited from parent contexts. To see a list of available commands, enter help or ?.
To get help on a specific command, enter the command followed by -h.
Additionally, some commands have sub-commands that can be listed by using Tab completion.
It is important to spend some time getting familiar with the context tree. The layout is similar to the tree represented in the Management Console. The following is a representation of a portion of the tree:
VPlexcli:/>
  clusters/
    Boston/
    Hopkinton/  (expanded to illustrate child contexts)
      devices/
      exports/
        initiator-ports/
        ports/
        storage-views/
      storage-elements/
        extents/
        storage-volumes/
      virtual-volumes/
  distributed-storage/
    distributed-devices/
    rule-sets/
      cluster-1-detaches
      cluster-2-detaches


Tab completion prompts the CLI to attempt to fill in valid information based on what has already been typed on the current command line. If more than one option is available, pressing TAB a second time lists the available options.
Examples:
VPlexcli:/> cd eng[TAB]  =>  cd engines/
VPlexcli:/> cluster [TAB]
add      expel      forget      shutdown      summary      status

Some commands can be executed outside their context. Within many contexts there are create and destroy commands, but, for instance, you can create a local-device outside of the devices context by using local-device create instead of just create.
Use ls, ls -l, or ll to list the contents of a context. Using either ls -l or ll provides a full listing, including context attributes.
Use the set command to set attribute values. Note that some attributes are read-only.
Use set and describe to find out more about attributes and their allowable values.
Finally, the CLI supports regular-expression-style pattern matching, or command globbing, for example:
ls /clusters/*/storage-elements/storage-volumes
ls /clusters/B*
set SymmA001_1[0A][5-9]:: application-consistent true

Lastly, note that command syntax frequently uses positional parameters as well as command flags, but the two are usually not mixed within a single instance of a command.
Always remember to refer to the CLI help; it is very useful.


3.4.2.2 Command line options


The VPlexcli provides two styles for specifying options:
-<letter>
--<word>
Both styles produce the same results. For example,
VPlexcli set -h
and
VPlexcli set --help
both produce the help page for the set command.


The VPlexcli has both context-sensitive and context-insensitive syntax.
Table 4 shows the VPlexcli wildcards and explains how to use them.
Table 4   Using VPlexcli wildcards

*   Matches any unknown number of characters.
    VPlexcli:/> cd engines/*/directors
    VPlexcli:/engines/engine-1-1/directors>

?   Matches one unknown character.
    VPlexcli:/engines/engine-1-1/directors> cd 128.221.25?.35
    VPlexcli:/engines/engine-1-1/directors/128.221.252.35>

**  Recursively matches objects at any level.
    VPlexcli:/> cd **/directors
    VPlexcli:/engines/engine-1-1/directors>

Multiple symbols   Use multiple symbols together to match objects.
    VPlexcli:/> ls **/*.36/firmware
    /engines/engine-1-1/directors/128.221.252.36/firmware:
    Name                Value
    ------------------  ---------------------------
    application-status  running
    uid                 0x000000003b300c49
    uptime              6 days, 21 hours, 31 minutes, 58 seconds
    version             v8_1_41-0

As in standard UNIX shells, \ (backslash) serves as the escape character.
For information about the VPlexcli, refer to the EMC VPLEX CLI Guide.

3.4.3 System reporting


VPLEX system reporting software collects various configuration
information from each cluster and each engine. The resulting
configuration file (XML) is zipped and stored locally on the
management server or presented to the SYR system at EMC via call
home.
You can schedule a weekly job to automatically collect SYR data
(VPlexcli command scheduleSYR), or manually collect it whenever
needed (VPlexcli command syrcollect).
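As a rough illustration, the collection can be started from a VPlexcli session as sketched below; both commands accept additional options (scheduling times, output location, and so on) that are documented in the EMC VPLEX CLI Guide, so the bare invocations here are a sketch only:

   VPlexcli:/> syrcollect
   VPlexcli:/> scheduleSYR -h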


3.5 Director software


The director software provides:

- Basic Input/Output System (BIOS): Provides low-level hardware support to the operating system, and maintains boot configuration.
- Power-On Self Test (POST): Provides automated testing of system hardware during power on.
- Linux: Provides basic operating system services to the VPlexcli software stack running on the directors.
- VPLEX Power & Environmental Monitoring (ZPEM): Provides monitoring and reporting of system hardware status.
- EMC Common Object Model (ECOM): Provides management logic and interfaces to the internal components of the system.
- Log server: Collates log messages from director processes and sends them to the SMS.
- GeoSynchrony (I/O stack): Processes I/O from hosts; performs all cache processing, replication, and virtualization logic; and interfaces with arrays for claiming and I/O.


3.6 Internal connections


The internal (within-the-cabinet) network connections are made through Fibre Channel COM switches, as shown in Figure 19. Each director has two independent COM paths, creating a redundant network for COM. The COM network is private; customer connections are not permitted.

Figure 19   Network architecture

The Connectrix DS-300B switch is required on all VPLEX systems that have two or more engines, with port connections as follows:

- VPLEX-04 systems (two engines) use four ports per switch.
- VPLEX-08 systems (four engines) use eight ports per switch.

Sixteen of the switch ports are unlicensed and disabled.


3.7 External connections


This section describes how the VPLEX management network is used and configured for two sites.
Figure 20 on page 59 illustrates how the management network is wired in different VPLEX configurations. The management connectivity is enabled by redundant management modules (shaded gray). There are two management modules in each engine. Each management module provides internal connectivity to both directors. For availability purposes, the management network is wired to form two physically and logically independent daisy chains, as follows:

- One connecting the A-side management modules (depicted on the right side), the management server, and the top private Fibre Channel switch (in medium and large configurations).
- One connecting the B-side management modules (depicted on the left side), the management server, and the bottom private Fibre Channel switch (in medium and large configurations).

Figure 20   Management wiring of small, medium, and large configurations


This network configuration allows VPLEX to maintain management connectivity within the cluster in the presence of management module, engine, or Ethernet cable failures.
To give a management server the capability to access both local and remote directors, VPLEX ensures that the management IP addresses assigned to components in different sites never conflict with one another. To simplify service and maintenance, the directors automatically assign their IP addresses based on two resume PROM parameters:

- Cluster ID, which identifies a given cluster.
- Enclosure ID, which identifies the engine within the cluster.

The VPLEX Management Console graphical user interface (GUI) is accessible as a web service on the management server's public Ethernet port and service port using the HTTPS protocol. It is available on the standard port 443.
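For example, assuming the management server's public IP address has already been configured (the address below is purely a placeholder), the console could be reached from a browser at:

   https://<management-server-public-ip>/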


3.8 Configuration overview


The VPLEX configurations are based on how many engines are in the
cabinet. The basic configurations are small, medium, and large, as
shown in Figure 21.
The configuration sizes refer to the number of engines in the VPLEX
cabinet. The remainder of this section describes each configuration size.

Figure 21   VPLEX configurations: small, medium, and large


3.8.1 Small configurations


The VPLEX-02 (small) configuration includes the following:

- Two directors
- One engine
- Redundant engine SPSs
- 16 front-end Fibre Channel ports
- 16 back-end Fibre Channel ports
- One management server

The unused space between engine 1 and the management server in Figure 22 is intentional.

Figure 22   VPLEX small configuration


3.8.2 Medium configurations


The VPLEX-04 (medium) configuration includes the following:

- Four directors
- Two engines
- Redundant engine SPSs
- 32 front-end Fibre Channel ports
- 32 back-end Fibre Channel ports
- One management server
- Redundant Fibre Channel COM switches for local COM; a UPS for each Fibre Channel switch

Figure 23   VPLEX medium configuration



3.8.3 Large configurations


The VPLEX-08 (large) configuration includes the following:

- Eight directors
- Four engines
- Redundant engine SPSs
- 64 front-end Fibre Channel ports
- 64 back-end Fibre Channel ports
- One management server
- Redundant Fibre Channel COM switches for local COM; a UPS for each Fibre Channel switch

Figure 24   VPLEX large configuration


3.9 I/O implementation


The VPLEX cluster utilizes a write-through mode whereby all writes
are written through the cache to the back-end storage. Writes are
completed to the host only after they have been completed to the
back-end arrays, maintaining data integrity.
This section describes the VPLEX cluster caching layers, roles, and
interactions. It gives an overview of how reads and writes are handled
within the VPLEX cluster and how distributed cache coherency works.

3.9.1 Cache layering roles


All hardware resources (CPU cycles, I/O ports, and cache memory) are
pooled in a VPLEX cluster. Each cluster contributes local storage and
cache resources for distributed virtual volumes within a VPLEX cluster.
As shown in Figure 25, within the VPLEX cluster, the DM (Data Management) component includes a per-volume caching subsystem that provides the following capabilities:

- Local node cache: I/O management, replacement policies, pre-fetch, and flushing (CCH) capabilities.
- Distributed cache (DMG, Directory Manager): Cache coherence, volume share group membership (distributed registration), failure recovery mechanics (fault tolerance), RAID, and replication capabilities.

Figure 25   Cache layer roles and interactions



3.9.2 Share groups


Nodes that export the same volume form a share group. Share group membership is managed through a distributed registration mechanism. Nodes within a share group collaborate to maintain cache coherence.

3.9.3 Cache coherence


Cache coherence creates a consistent global view of a volume. Distributed cache coherence is maintained using a directory. There is one directory per user volume, and each directory is split into chunks (4096 directory entries within each). These chunks exist only if they are populated. There is one directory entry per global cache page, with responsibility for:

- Tracking page owner(s) and remembering the last writer
- Locking and queuing

3.9.4 Meta-directory
Directory chunks are managed by the meta-directory, which assigns
and remembers chunk ownership. These chunks can migrate using
Locality-Conscious Directory Migration (LCDM). This meta-directory
knowledge is cached across the share group for efficiency.

3.9.5 How a read is handled


When a host makes a read request, VPLEX first searches its local cache.
If the data is found there, it is returned to the host.
If the data is not found in local cache, VPLEX searches global cache.
Global cache includes all directors that are connected to one another
within the VPLEX cluster. When the read is serviced from global cache,
a copy is also stored in the local cache of the director from where the
request originated.
If a read cannot be serviced from either local cache or global cache, it is
read directly from the back-end storage. In this case both the global and
local cache are updated to maintain cache coherency.


3.9.5.1 I/O flow of a read miss


1. Read request issued to virtual volume from host.
2. Look up in local cache of ingress director.
3. On miss, look up in global cache.
4. On miss, data read from storage volume into local cache.
5. Data returned from local cache to host.
3.9.5.2 I/O flow of a local read hit
1. Read request issued to virtual volume from host.
2. Look up in local cache of ingress director.
3. On hit, data returned from local cache to host.
3.9.5.3 I/O flow of a global read hit
1. Read request issued to virtual volume from host.
2. Look up in local cache of ingress director.
3. On miss, look up in global cache.
4. On hit, data read from owner director into local cache.
5. Data returned from local cache to host.

3.9.6 How a write is handled


All writes are written through cache to the back-end storage. Writes are
completed to the host only after they have been completed to the
back-end arrays.
When performing writes, the VPLEX system Data Management (DM)
component includes a per-volume caching subsystem that utilizes a
subset of the caching capabilities:

- Local node cache: Cache data management and back-end I/O interaction.
- Distributed cache (DMG, Directory Manager): Cache coherence, dirty data protection, and failure recovery mechanics (fault tolerance).


3.9.6.1 I/O flow of a write miss


1. Write request issued to virtual volume from host.
2. Look for prior data in local cache.
3. Look for prior data in global cache.
4. Transfer data to local cache.
5. Data is written through to back-end storage.
6. Write is acknowledged to host.
3.9.6.2 I/O flow of a write hit
1. Write request issued to virtual volume from host.
2. Look for prior data in local cache.
3. Look for prior data in global cache.
4. Invalidate prior data.
5. Transfer data to local cache.
6. Data is written through to back-end storage.
7. Write is acknowledged to host.


4
System Integrity

This chapter explains how VPLEX clusters are able to handle hardware failures in any subsystem within the storage cluster. Topics include:

4.1 Overview .............................................................................. 70
4.2 Cluster .................................................................................. 71
4.3 Path redundancy through different ports ......................... 72
4.4 Path redundancy through different directors ................... 73
4.5 Path redundancy through different engines ..................... 74
4.6 Path redundancy through site distribution ....................... 75
4.7 Safety check .......................................................................... 76


4.1 Overview
VPLEX clusters are capable of surviving any single hardware failure in any subsystem within the overall storage cluster, including the host connectivity subsystem, the memory subsystem, and so on. A single failure in any subsystem will not affect the availability or integrity of the data. Multiple failures in a single subsystem, and certain combinations of single failures in multiple subsystems, may affect the availability or integrity of data.
This availability requires that host connections be redundant and that
hosts are supplied with multipath drivers. In the event of a front-end
port failure or a director failure, hosts without redundant physical
connectivity to a VPLEX cluster and without multipathing software
installed may be susceptible to data unavailability.


4.2 Cluster
A cluster is a collection of one, two, or four engines in a physical
cabinet. A cluster serves I/O for one storage domain and is managed as
one storage cluster.
All hardware resources (CPU cycles, I/O ports, and cache memory) are pooled:

- The front-end ports on all directors provide active/active access to the virtual volumes exported by the cluster.
- For maximum availability, virtual volumes must be presented through each director so that all directors but one can fail without causing data loss or unavailability. All directors must be connected to all storage.


4.3 Path redundancy through different ports


Because all paths are duplicated, when a director port goes down for any reason, data seamlessly flows through a port on the other director, as shown in Figure 26.

Figure 26   Port redundancy

Multi-pathing software plus redundant volume presentation yields continuous data availability in the presence of port failures.


4.4 Path redundancy through different directors


If a director were to go down, the other director can completely take over the I/O processing from the host, as shown in Figure 27.

Figure 27   Director redundancy

Multi-pathing software plus volume presentation on different directors yields continuous data availability in the presence of director failures.


4.5 Path redundancy through different engines


In a clustered environment, if one engine goes down, another engine completes the host I/O processing, as shown in Figure 28.

Figure 28   Engine redundancy

Multi-pathing software plus volume presentation on different engines yields continuous data availability in the presence of engine failures.


4.6 Path redundancy through site distribution


The ultimate level of redundancy, site distribution, ensures that if a site goes down, or even if the link to that site goes down, the other site can continue seamlessly processing the host I/O, as shown in Figure 29. If Site B fails, I/O continues unhindered at Site A.

Figure 29   Site redundancy


4.7 Safety check


In addition to these redundancy and fail-safe features, the VPLEX cluster provides event logs and call-home capability.


5
VPLEX Local and VPLEX
Metro Federated Solution

This chapter describes the VPLEX Local and VPLEX Metro federated solution, whose deployment requires careful planning and an understanding of the capabilities and scalability options. Topics include:

5.1 Enabling a federated solution .............................................. 78
5.2 Deployment overview .......................................................... 79
5.3 Workload resiliency .............................................................. 83
5.4 Technology refresh use case ................................................. 87
5.5 Introduction to distributed data access ............................... 88
5.6 Technical overview of the VPLEX Metro solution .............. 91


5.1 Enabling a federated solution


Deploying a VPLEX solution requires careful planning and a full understanding of the capabilities and scalability options. As seen in the architectural overview, all models are built from common building blocks. Starting with the smallest configuration, the system can be upgraded over time to scale to the largest Metro-Plex configuration. This approach is limited to building up from a single source cluster; it does not support combining two independent existing clusters to form a Metro. Each cluster, even those in a Metro-Plex, can be upgraded from a single engine to a two- or four-engine cluster.
The initial release of VPLEX supports joining two clusters in a Metro-Plex, with each cluster controlling its own set of storage devices. Devices provided from both clusters can be combined to form distributed devices, which become a single entity across the clusters and appear on each side as though they were a single device.
As with the introduction of any high-end product into the environment, it is imperative that the entire environment be scrutinized for proper support of all components. By design, VPLEX is fully self-contained and has no interdependencies on specific Fibre Channel switch code or other features within the SAN infrastructure. However, it is still recommended to follow EMC support matrices and best practices.


5.2 Deployment overview


The VPLEX V4.0 product supports several different deployment
models to suit different needs. The next few sections describe these
different models and when they should be used.

5.2.1 VPLEX Local deployment


Figure 30 illustrates a typical deployment of a VPLEX Local system.
VPLEX Local systems are supported in small, medium, or large
configurations consisting of one, two, or four engines respectively,
yielding systems that provide two, four, or eight directors of
connectivity and processing capacity.

Figure 30   VPLEX Local deployment


5.2.2 When to use a VPLEX Local deployment


VPLEX Local is appropriate when the virtual storage capabilities of
workload relocation, workload resiliency, and simplified storage
management are desired within a single data center and the scaling
capacity of VPLEX Local is sufficient to meet the needs of this data
center. If a larger scale is needed, consider deploying a VPLEX Metro, or
consider deploying multiple instances of VPLEX Local.

5.2.3 VPLEX Metro deployment within a data center


Figure 31 illustrates a typical deployment of a VPLEX Metro system in a
single data center. VPLEX Metro systems contain two clusters, each
cluster having one, two, or four engines with two directors per engine.
The clusters in a VPLEX Metro deployment need not have the same
number of engines. For example, a 2x4 VPLEX Metro system is supported with one cluster having two engines and the other cluster having four engines.

Figure 31   VPLEX Metro deployment in a single data center


5.2.3.1 When to use a VPLEX Metro deployment within a data center


Deploying VPLEX Metro within a data center is appropriate when the virtual storage capabilities of workload relocation, workload resiliency, and simplified storage management are desired within a single data center, and either more scaling is needed beyond that of a VPLEX Local or additional resiliency is desired. VPLEX Metro provides the following additional resiliency benefits over VPLEX Local:

- The two clusters of a VPLEX Metro can be separated by up to 100 km. This provides excellent flexibility for deployment within a data center and allows the two clusters to be deployed at separate ends of a machine room, or on different floors, to provide better fault isolation between the clusters. For example, this often allows the clusters to be placed in different fire suppression zones, which can mean the difference between riding through a localized fault, such as a contained fire, and a total system outage.
- Volumes can be mirrored between the two clusters, allowing active/active clustered servers to have access to the data through either cluster. This provides added resiliency in the case of an entire cluster failure. In such a deployment, on a per-distributed-volume basis, one cluster is designated as the primary cluster for data consistency. Should the primary cluster of a distributed volume fail, data access to this volume is suspended on the secondary cluster and must be manually resumed once the failure condition has been assessed. This prevents a partition between the two clusters from allowing the state of the volumes to diverge in an occurrence of the well-known "split-brain" problem.

5.2.4 VPLEX Metro deployment between data centers


Figure 32 on page 82 illustrates a deployment of a VPLEX Metro system between two data centers. This deployment is similar to that shown in Figure 31 on page 80, except that here the clusters are placed in separate data centers. This typically means that separate hosts connect to each cluster. Clustered applications can have one set of application servers deployed, for example, in data center A, and another set deployed in data center B for added resiliency and workload relocation benefits.
As noted in the previous section, it is important to understand that a site or total cluster failure of the primary cluster for a distributed volume will require manual resumption of I/O at the secondary site.


Figure 32   VPLEX Metro deployment between two data centers

5.2.4.1 When to use a VPLEX Metro deployment between data centers


A deployment of VPLEX Metro between two data centers is appropriate when the additional workload resiliency benefit of having an application's data present in both data centers is desired. This deployment is also desirable when applications in one data center need to access data in the other data center, when one wants to redistribute workloads between the two data centers, or when one data center has run out of space, power, or cooling.


5.3 Workload resiliency


The next few sections cover different faults that can occur in a data center and look at how VPLEX can be used to add resiliency to applications, allowing their workload to ride through these fault conditions. The following classes of faults and service events are considered:

- Storage array outages (planned and unplanned)
- SAN outages
- VPLEX component failures
- VPLEX cluster failures
- Host failures
- Data center outages

5.3.1 Storage array outages


To overcome both planned and unplanned storage array outages,
VPLEX supports the ability to mirror the data of a virtual volume
between two or more storage volumes1 using a RAID 1 device.
Figure 33 on page 84 shows a virtual volume that is mirrored between
two arrays. Should one array incur an outage, either planned or
unplanned, the VPLEX system will be able to continue processing I/O
on the surviving mirror leg. After restoration of the failed storage
volume, the data from the surviving volume is resynchronized with the
recovered leg.
5.3.1.1 Best practices
The following are recommended best practices:

- For critical data, it is recommended to mirror the data onto two or more storage volumes that are provided by different arrays.
- For the best performance, these storage volumes should be configured identically and be provided by the same type of array.

1. Up to two legs per mirror volume are supported.



Figure 33   RAID 1 mirroring to protect against array outages

5.3.2 SAN outages


When the best practice of using a pair of redundant Fibre Channel fabrics is followed with VPLEX, the VPLEX directors are connected to both fabrics, both for front-end (host-side) connectivity and for back-end (storage array side) connectivity. This deployment, along with the isolation of the fabrics, allows the VPLEX system to ride through failures that take out an entire fabric and to provide continuous access to data through this type of fault. Figure 34 on page 85 illustrates a best-practice dual-fabric deployment.


Figure 34   Dual-fabric deployment

5.3.2.1 Best practice


It is recommended that I/O modules be connected to redundant fabrics.
For example, in a deployment with fabrics A and B, it is recommended
that the ports of a director be connected as shown in Figure 35 on
page 86.


Figure 35   Fabric assignments for FE and BE ports

5.4 Technology refresh use case


Many data centers have a process referred to as evergreening where
hardware and software resources are periodically replaced. However,
this is not simply a practice of replacing the old with new. Capabilities
exist today that were not available four years ago, the typical time for
technology replacement. Today we have storage capabilities such as
ultra-high performance storage provided by Enterprise Flash Drives,
low-cost storage provided by high-capacity SATA drives, efficiency
techniques such as Symmetrix Virtual Provisioning, and the ability to
set policies that move data between drive types as business
requirements and workloads change. Leveraging these capabilities
allows EMC to offer the highest service levels for our demanding
customers, and to do so at the lowest possible operating costs.
Furthermore, data centers need to position themselves to rapidly
respond to the changes in business requirements and technology
development cycles, which are now measured in months rather than in
years.
Virtualization of servers and federating storage resources has proven
instrumental to achieving these goals. Inserting a virtualization layer
between the operating systems and server hardware, as with VMware
vSphere, provides proven value in managing physical resources and
changing service levels. Similarly, storage federation provides a layer
between the server and storage that enables movement of data between
the underlying storage systems, completely transparent to the host
environment. EMC VPLEX systems deliver local and distributed
federation. This federation layer enables seamless migration between
storage systems and allows storage administrators to leverage new
storage capabilities that provide greater levels of efficiency and lower
cost, without compromising service levels to end users.
The VPLEX product offerings provide significant savings that go beyond a simple array replacement when the VPLEX product is permanently introduced into the environment. Future array replacements can be accomplished in a matter of a few days as opposed to several months. Little or no coordination is required with the business units. Future migrations will be completely transparent and will require minimal staff to facilitate.


5.5 Introduction to distributed data access


Distributed (or shared) data access occurs when more than one host or
initiator accesses the data present on a storage volume. Hosts accessing
shared data must somehow coordinate their I/O activity to avoid
overwriting (or corrupting) the data.
There are several classes of applications using distributed data access:

Scale-out distributed applications: These applications can be designed to run on several hosts to load-balance over all the computing resources and make the application resilient to equipment failures. They scale out by assigning more servers to them. Most high-end databases can be distributed over multiple servers, and many web servers now have these capabilities too. Clustered virtual host environments such as VMware ESX and Microsoft Hyper-V fall into this class as well.

Mobile applications: To make use of more desirable computing resources, or in preparation for maintenance activity, applications can move from one physical computing device to another. This is most commonly implemented as virtual machines migrating between cluster nodes.

Stand-by failover hosts: These hosts are waiting to start running an application when a failure in the primary hosts is detected.

Processing chain: Using a distributed file system to coordinate file locking and data hand-off, the data in a file is accessed by one host after another. This can be one host producing the data and another reading it (multimedia streams, for example), or there can be chains of hosts modifying the data (multimedia processing, for example).

Metacomputing: Using large numbers of computers to process data in parallel, metacomputing is used for pharmaceutical research and multimedia production. These computers run loosely coupled applications that must often start by reading data from a common data set, process the data, and then write the results into another common data set.


5.5.1 Traditional approach for distributed data access


In addition to offering high throughput and reliability and supporting flexible cabling options, Fibre Channel SANs were introduced to enable distributed data access. Fibre Channel SANs first became popular in the mid-1990s in the media industry, where many very large files moved from host to host to be processed.
When every storage device and every host is connected to a Fibre Channel SAN, enabling distributed data access is only a matter of zoning the Fibre Channel fabric and configuring LUN masking on the storage arrays. Problems with this approach include:

- Many arrays: The capacity growth in storage arrays has not kept up with the explosion of data, resulting in large numbers of storage arrays being deployed. Each storage array represents a storage domain that must be managed individually.
- Many array models and vendors: The need to support multiple storage tiers has resulted in many SANs deploying several different array models to match the application's cost/benefit analysis of the storage. Having several different array types makes the management of LUN masking complicated.
- SAN islands: Many organizations deploying Fibre Channel SANs have several islands of equipment, typically as a result of department and project boundaries. Each SAN island creates a storage domain that can only be accessed by hosts in the same island. Merging and bridging SAN islands is complex.
- Distance: Although sufficient Fibre Channel switch buffer credits can make it possible to maintain throughput, in practice increased latency makes it hard for applications accessing the data to drive that throughput. Arrays located in one physical location create a storage domain that is best accessed by hosts in close proximity.


5.5.2 Removing barriers for distributed data access with VPLEX Metro
VPLEX Metro federates storage domains, hiding most of the costs and complexity associated with distributed data access. A VPLEX cluster federates the storage domains of all the storage arrays connected to it, creating a single point of control for provisioning and LUN masking and overcoming some of the complexity issues. VPLEX Metro connects two clusters over Fibre Channel, federating each cluster's storage domain into one domain that is accessible by initiators connected to either cluster.
The distributed VPLEX cache also provides local read caching, hiding the latency effects for common workloads while not burdening the applications with maintaining cache coherency. To hosts accessing storage through a VPLEX cluster, all storage appears on the SAN as if it were local. A distributed or clustered file system on top of this storage provides a hierarchical file representation of the data, more suitable for most applications than raw block access. Hosts accessing a file on the distributed file system will use their local SAN connection to retrieve the data, relying on VPLEX to retrieve the data from local cache or from the appropriate storage array.


5.6 Technical overview of the VPLEX Metro solution


The following sections provide a technical overview of the VPLEX Metro solution.

5.6.1 Distributed block cache


VPLEX provides a distributed block cache across all the directors, guaranteeing cache coherency across the entire solution. This lets applications access the data from any director in any cluster without having to worry about coherency. The GeoSynchrony operating system implementing this distributed cache keeps often-referenced data in the director that is accessing the data, optimizing access times.
To make optimal use of read caching, updates to the same data should be as localized as possible. When writes to a region of blocks are distributed, GeoSynchrony coordinates cache invalidation among all participating directors, delaying the acknowledgement of the write operation. When all updates to a region of blocks are localized, GeoSynchrony optimizes cache coherency traffic to finish the write operation earlier.

5.6.2 Enabling distributed data access


There are two storage models that enable distributed data access with VPLEX Metro: remote access and distributed devices.
5.6.2.1 Remote access
A cluster in VPLEX Metro can make virtual volumes built on local storage available at the remote cluster for presentation to initiators connected to that remote cluster. The storage supporting these virtual volumes is physically attached only to the local VPLEX cluster.
This storage model is best suited to active/passive access patterns, where the hosts producing (writing) the data are attached to the VPLEX cluster hosting the data on its local storage, giving hosts on the remote VPLEX cluster access to it over the inter-cluster link. All I/O from the remote cluster must go over the inter-cluster link since there is no local copy.
During inter-cluster link outages, the virtual volumes return I/O errors on the remote cluster but continue to serve I/O on the cluster hosting the storage.


5.6.2.2 Distributed devices


Each cluster in VPLEX Metro provides storage for a distributed mirror
whose virtual volume is available at both clusters for presentation to
one or more initiators.
This storage model best suits the active/active access patterns, where
both clusters in VPLEX Metro have hosts actively writing data. It also
has characteristics that make it ideal for data that is frequently read at
both clusters. All legs of distributed devices contain identical data,
providing local storage for reads at local SAN speeds.
During inter-cluster link outages, a configurable detach rule will let one
of the two VPLEX clusters continue I/O while the virtual volumes on
the other cluster remain suspended.
Detach rules for distributed devices
When a cluster in VPLEX Metro loses connectivity to its peer cluster,
I/O on distributed devices is suspended until the cluster detaches the
distributed device. After detaching a distributed device, local writes no
longer go to the remote mirror leg but instead are committed to the
local mirror leg only. The remote cluster, not having detached the
distributed device, will keep I/O to that device suspended until
communication with the other cluster is restored.
When communication between two clusters is restored, VPLEX Metro
will automatically re-sync both distributed mirror legs by updating the
remote mirror legs with all the writes that occurred after the detach.
While I/O at the detaching cluster was never interrupted, the other
cluster can now resume after having been suspended during the link
outage.
Detach rules must be configured for distributed devices while both clusters in VPLEX Metro are in contact. By specifying a detach rule, you determine which cluster can keep doing I/O during link problems. Because a link failure is indistinguishable from a remote VPLEX cluster failure (or remote site failure) from a cluster's perspective, the detach rule also determines which cluster can keep doing I/O when its peer stops functioning.
For active/passive access patterns, the cluster attached to the active initiators should always detach distributed devices during communication problems. The passive initiators will remain suspended.
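As a rough sketch using only navigation commands shown earlier in this TechBook, the detach rule-sets can be inspected from the VPlexcli; the rule-set names below are the ones from the context-tree example, and assigning a rule-set to a distributed device is described in the EMC VPLEX CLI Guide:

   VPlexcli:/> cd /distributed-storage/rule-sets
   VPlexcli:/distributed-storage/rule-sets> ls
   cluster-1-detaches  cluster-2-detaches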


Support for cluster quorum and fencing mechanisms


Applications performing distributed data access must coordinate access to the data to avoid corrupting it. Application clusters often use fencing and quorum mechanisms to accomplish some of this coordination.
Fencing is a mechanism for quarantining cluster members that are having communication problems. There are two main implementations of fencing: SCSI reservations and Fibre Channel switch fencing. VPLEX Metro supports distributed SCSI-2 and SCSI-3 reservations in support of the SCSI reservation fencing mechanism. Fibre Channel switch fencing relies on all cluster nodes being able to communicate with the Fibre Channel switches in order to remove a misbehaving node's access to the storage.
Quorum disks are used by some clusters to deal with split-brain situations. You can use distributed devices as quorum disks; the detach rules determine which half of the cluster will be able to do I/O to the quorum disk during connectivity problems and, consequently, which half of the cluster will remain operational.


6
VPLEX Use Case Example
using VMware

This chapter provides use case examples using Storage VMotion for ease of use in VMware environments with VPLEX. Topics include:

6.1 Workload relocation......................................................................... 96


6.2 Use case examples ............................................................................ 97
6.3 Nondisruptive migrations using Storage VMotion .................... 98
6.4 Migration using encapsulation of existing devices ................... 101
6.5 VMware deployments in a VPLEX Metro environment .......... 112
6.6 Conclusion....................................................................................... 129


6.1 Workload relocation


The previous chapters have provided the storage administrator with the insight and knowledge needed to enable the journey to the private cloud from A to Z. VPLEX provides ease-of-use integration with VMware as a use case; however, non-VMware platforms are also supported1. The VPLEX system is a natural fit for this next generation of environments virtualized with VMware vSphere technology. The capabilities of VPLEX to federate disparate storage systems and present globally unique devices not bound by physical data center boundaries work hand in hand with the inherent capabilities of VMware vSphere to provide cloud-based services.
For this use case example, it is assumed that the customer has a fully functioning VPLEX Metro installation. The stretched VMware cluster allows for transparent load sharing between multiple sites, while providing the flexibility of migrating workloads between sites in anticipation of planned events. Furthermore, in case of an unplanned event that causes disruption of services at one of the data centers, the failed services can be restarted at the surviving site with minimal effort, minimizing the recovery time objective (RTO).
VMware virtualization platforms virtualize the entire IT infrastructure
including servers, storage, and networks. The VMware software
aggregates these resources and presents a uniform set of elements in the
virtual environment. This capability allows IT resources to be managed
like a shared utility and resources can be dynamically provisioned to
different business units. This capability of VMware virtualization
platform allows customers to realize the benefits of cloud computing to
their data centers.
VMware vSphere 4 brings the power of cloud computing to the data center, reducing IT costs while increasing infrastructure efficiency. For hosting service providers, VMware vSphere 4 enables a more economical and efficient path to delivering cloud services that are compatible with internal cloud infrastructures. VMware vSphere 4 delivers significant performance and scalability improvements over the previous generation, VMware Infrastructure 3, to enable even the most resource-intensive applications, such as large databases, to be deployed on internal clouds. With these performance and scalability improvements, VMware vSphere 4 can enable a 100 percent virtualized internal cloud.

1. Refer to the EMC VPLEX V4.0 Simple Support Matrix.


6.2 Use case examples


In the following use case examples, VPLEX is shown to be resourceful in a variety of ways for performing seamless migrations locally and across a federated Metro-Plex.
Existing deployments of VMware virtualization platforms can be migrated to VPLEX environments. There are a number of different alternatives that can be leveraged.
The easiest method to migrate to a VPLEX environment is to use Storage VMotion. However, this technique is viable only if the storage array has sufficient free storage to accommodate the largest datastore in the VMware environment. Furthermore, Storage VMotion may be tedious if several hundred virtual machines or terabytes of data have to be converted, if the virtual machines have existing snapshots, or if the VMware virtualization platform consists of ESX servers V3.0 or earlier. For these scenarios, it might be appropriate to leverage the capability of VPLEX systems to encapsulate existing devices. However, this methodology is disruptive and requires planned outages to the VMware virtualization platform.


6.3 Nondisruptive migrations using Storage VMotion


Figure 36 shows the datastores available on a VMware ESX server
version 3.5 managed by a vSphere vCenter server. The view is available
using the EMC Virtual Storage Integrator (VSI) client-side plug-in that
extends the storage-related information displayed from vSphere Client.
It can be seen in the figure that the virtual machine W2K8 VM1 (VI3)
resides on DataStore_1 hosted on device 4EC on a Symmetrix VMAX
array. The inset shows the version of the ESX kernel (3.5 build 153875)
for the server 10.243.168.160.

Figure 36   EMC Storage device displayed by EMC Virtual Storage Integrator (VSI)

Figure 37 on page 99 shows the devices visible on the ESX server. It can be seen that there are two devices with the product identification Invista but without any details. This is the case since EMC Virtual Storage Integrator (VSI) at this point does not have the capability to resolve the devices presented from VPLEX systems. The figure also shows the NAA number for the devices. As discussed earlier, the Fibre Channel OUI (organizationally unique identifier), 00:01:44, corresponds to VPLEX devices. Therefore, it can be concluded from the picture that the VMware ESX server is presented with devices from both EMC Symmetrix VMAX arrays and VPLEX systems.
Figure 37   VPLEX devices in VMware ESX server cluster

The migration of the data from the VMAX arrays to the storage
presented from VPLEX can be performed using Storage VMotion after
appropriate datastores are created on the devices presented from
VPLEX.
Figure 38 on page 100 shows the steps required to initiate the migration
of a virtual machine from Datastore_1 to the target datastore, Target_1,
that resides on a VPLEX device. It is important to note that although an
ESX server V3.5 was utilized to showcase the migration procedure, the
same process is applicable for ESX servers running V4.0 or later.
Furthermore, it should also be noted that the migration wizard is
available only when vCenter Server V4.0 or later is leveraged. The
Storage VMotion functionality is available via a command line utility for vCenter Server V2.5. Detailed discussion of Storage VMotion is beyond the scope of this TechBook.


Figure 38   Storage VMotion to migrate virtual machines to VPLEX devices


6.4 Migration using encapsulation of existing devices


As noted earlier, although Storage VMotion provides the capability to perform nondisruptive migration from an existing VMware deployment to VPLEX systems, it might not always be a viable tool. For these situations, the encapsulation capabilities of VPLEX systems can be leveraged. The procedure is disruptive, but the duration of the disruption can be minimized by proper planning and execution.
The following steps need to be taken to encapsulate and migrate an
existing VMware deployment.
1. Zone the back-end ports of the VPLEX system to the front-end ports
of the storage array currently providing the storage resources.
2. Change the LUN masking on the storage array so the VPLEX
system has access to the devices that host the VMware datastores. In
the example that was used in the previous section, the devices 4EC
(for Datastore_1) and 4F0 (for Datastore_2) have to be masked to the
VPLEX system.
Figure 39 on page 102 shows the devices that are visible to the VPLEX system after the masking changes have been made and the storage array has been rescanned from the VPLEX system. The figure also shows the SYMCLI output of the Symmetrix VMAX devices and their corresponding WWNs. A quick comparison clearly shows that the VPLEX system has access to the devices that host the datastores that need to be encapsulated.


Figure 39   Devices to be encapsulated

3. Once the devices are visible to the VPLEX system, they have to be claimed. This step is shown in Figure 40 on page 103. The -appc flag during the claiming process ensures that the content of the device is preserved and that the device is encapsulated for further use within the VPLEX system.


Figure 40   Encapsulating devices in a VPLEX system

4. After claiming the devices, create a single extent that spans the
whole disk. Figure 41 on page 104 shows this step for the two
datastores that are being encapsulated in this example.


Figure 41   Creating extents on encapsulated storage volumes

5. Create a VPLEX device (local-device) with a single RAID 1 member,
using the extent that was created in the previous step. This is shown for
the two datastores, Datastore_1 and Datastore_2, hosted on devices 4EC and
4F0, respectively, in Figure 42 on page 105. The step should be repeated
for all of the storage array devices that need to be encapsulated and
exposed to the VMware environment.

Figure 42    VPLEX RAID 1 protected device on encapsulated VMAX devices

6. Create a virtual volume on each VPLEX device that was created in the
previous step. This is shown in Figure 43 for the VMware datastores
Datastore_1 and Datastore_2.
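The equivalent CLI action is sketched below; as before, the device names are
illustrative and the option spelling should be confirmed against the VPLEX
CLI Guide.

    # Expose each local device as a virtual volume that can be added to a
    # storage view and presented to the ESX servers.
    virtual-volume create --device dev_Datastore_1
    virtual-volume create --device dev_Datastore_2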

Figure 43    Virtual volumes on VPLEX

7. Create a storage view on the VPLEX system by manually registering the
WWNs of the HBAs on the VMware ESX servers that are part of the VMware
virtualization domain. The storage view gives the VMware virtualization
platform access to the virtual volumes that were created in step 6.
Preparing the storage view in advance minimizes the disruption to the
service during the switchover from the original storage array to the
VPLEX system. An example of this for the environment used is shown in
Figure 44 on page 106.
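An indicative CLI sequence for this step is shown below. The initiator WWNs,
front-end port names, and view name are hypothetical, and the exact command
forms should be confirmed against the VPLEX CLI Guide.

    # Register the ESX HBA initiators, create the storage view on a pair of
    # front-end ports, and add the encapsulated virtual volumes to it.
    export initiator-port register --initiator-port esx01_hba0 \
        --port 0x10000000c9123456
    export storage-view create --name esx_encap_view \
        --ports P000000003CA00147-A0-FC00,P000000003CB00147-B1-FC01
    export storage-view addinitiatorport --view esx_encap_view \
        --initiator-ports esx01_hba0
    export storage-view addvirtualvolume --view esx_encap_view \
        --virtual-volumes dev_Datastore_1_vol,dev_Datastore_2_vol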
Figure 44    Storage view

8. In parallel to the operations conducted on the VPLEX system, create
new zones that allow the VMware ESX servers involved in the migration
access to the front-end ports of the VPLEX system. These zones should
also be added to the appropriate zone set. The zones that provide the
VMware ESX servers access to the storage array whose devices are being
encapsulated should be removed from the zone set. However, the modified
zone set should not be activated until the maintenance window, when the
VMware virtual machines can be shut down.
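On a Brocade fabric, for example, the zoning changes could be staged as in
the sketch below; the aliases, WWNs, and configuration names are
placeholders, and equivalent commands exist for other switch vendors.

    # Create the new ESX-to-VPLEX front-end zone and stage the zone set
    # changes without activating them.
    zonecreate "esx01_vplex_fe", "10:00:00:00:c9:12:34:56; 50:00:14:42:60:12:34:00"
    cfgadd "prod_cfg", "esx01_vplex_fe"
    cfgremove "prod_cfg", "esx01_vmax_fa"
    cfgsave
    # Do not run cfgenable "prod_cfg" until the maintenance window opens.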
9. When the maintenance window opens, gracefully shut down all of the
virtual machines that would be impacted by the migration. This can be
done using the VMware Infrastructure Client, the vSphere Client, or
command-line utilities that leverage the VMware SDK.


10. The devices presented from the VPLEX system host the original
datastores. However, the VMware ESX hosts do not automatically mount the
datastores: VMware ESX treats them as snapshots because the WWNs of the
devices exposed through the VPLEX system differ from the WWNs of the
devices presented from the Symmetrix VMAX system.
VMware vSphere allows the resignaturing of datastores that are considered
snapshots to be performed on a device-by-device basis. This reduces the
risk of mistakenly resignaturing the encapsulated devices presented from
the VPLEX system. The use of persistent mounts also provides other
advantages, such as retaining the history of all of the virtual machines.
Therefore, for a homogeneous vSphere environment, EMC recommends the use
of persistent mounts for VMware datastores that are encapsulated by
VPLEX. For VMware environments that contain VMware ESX version 3.5 or
earlier, this step should be skipped.
Activate the zone set that was created in step 8. A manual rescan of
the SCSI bus on the VMware ESX servers should remove the
original devices and add the encapsulated devices presented from
the VPLEX system.
Figure 45 on page 108 shows an example of this for a VMware vSphere
environment. The figure shows that all of the original virtual machines
in the environment are now marked as inaccessible. This occurs because
the datastores, Datastore_1 and Datastore_2, created on the devices
presented from the Symmetrix VMAX system are no longer available.

Figure 45    Rescan of the SCSI bus on the VMware ESX servers

Figure 46 on page 109 shows the results after the persistent mounting
of the datastores presented from the VPLEX system. It can be seen that
all of the virtual machines that were inaccessible are now available.
The persistent mount of the datastores considered snapshots retains both
the UUID of the datastore and its label. Since the virtual machines are
cross-referenced using the UUID of the datastores, the persistent mount
enables vCenter Server to rediscover the virtual machines that were
previously considered inaccessible.
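On ESX 4.x hosts, the rescan and the persistent (force) mount of the
snapshot volumes can be driven from the service console as sketched below;
the adapter and datastore names are illustrative.

    # Rescan the HBAs so the VPLEX devices replace the original VMAX devices
    esxcfg-rescan vmhba1
    esxcfg-rescan vmhba2

    # List the VMFS volumes detected as snapshots, then mount each one
    # persistently, keeping its existing signature (UUID) and label
    esxcfg-volume -l
    esxcfg-volume -M Datastore_1
    esxcfg-volume -M Datastore_2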
Steps 11 through 14 listed below do not apply to homogeneous
vSphere environments and should be skipped.

Figure 46    Mounting datastores on encapsulated VPLEX devices

11. If the VMware environment contains ESX server version 3.5 or earlier
(even if managed by VMware vCenter Server version 4), it is advisable to
resignature the encapsulated devices presented from the VPLEX system.
This recommendation is based on the fact that in these releases of
VMware ESX server, the resignaturing of devices that are considered
snapshots is not selective and cannot be reversed.
The virtual machines hosted on the datastores should be removed from the
vCenter Server inventory. This can be performed using the Virtual
Infrastructure Client, the vSphere Client, or command-line utilities
that leverage the VMware SDK. Note that when the virtual machines are
unregistered, all historical information about them is deleted from the
Virtual Center database.
12. Set the Advanced Settings flag, LVM.EnableResignature, to 1 on one of
the VMware ESX hosts so that it resignatures the datastores. Activate
the zone set that was created in step 8. A manual rescan of the SCSI bus
on the VMware ESX servers should remove the original devices, and add
and resignature the encapsulated devices presented from the VPLEX
system.
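On an ESX 3.5 host, the flag can be toggled from the service console as
shown in this sketch; the vmhba name is illustrative.

    # Enable resignaturing, rescan so the snapshot volumes receive new
    # signatures, then disable the flag again (step 13)
    esxcfg-advcfg -s 1 /LVM/EnableResignature
    esxcfg-rescan vmhba1
    esxcfg-advcfg -s 0 /LVM/EnableResignature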


Figure 47 shows the datastores after the resignaturing process is
complete. As can be seen from the figure, the prefix snap-xxxxxxxx has
been added to the original label of each datastore.

Figure 47    Resignaturing datastores on encapsulated VPLEX devices

13. Once the VPLEX devices have been discovered and the VMware
datastores have been resignatured, the advanced parameter
LVM.EnableResignature should be set back to 0.
14. The virtual machines that were unregistered in step 11 can be added
back to the vCenter Server inventory using the Virtual Infrastructure
Client, the vSphere Client, or command-line utilities based on the
VMware SDK. An example of this is shown in Figure 48 on page 111.
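From the ESX service console, the re-registration can also be scripted; the
datastore and virtual machine names below are placeholders (resignatured
labels carry the snap- prefix noted above).

    # Register a virtual machine directly from the resignatured datastore
    vmware-cmd -s register \
        /vmfs/volumes/snap-00000001-Datastore_1/IOM01/IOM01.vmx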

Figure 48    Adding virtual machines

15. After the virtual machines are properly identified or registered, they
can be powered on.


6.5 VMware deployments in a VPLEX Metro environment


The VPLEX system breaks the physical barriers of data centers and allows
users to access data at different geographical locations concurrently [2].
In a VMware context, this enables capabilities that were not available
earlier. Specifically, the ability to concurrently access the same set of
devices independent of physical location enables geographically stretched
clusters based on the VMware virtualization platform [3]. This allows for
transparent load sharing between multiple sites while providing the
flexibility of migrating workloads between sites in anticipation of
planned events, such as hardware maintenance. Furthermore, in case of an
unplanned event that causes disruption of services at one of the data
centers, the failed services can be quickly and easily restarted at the
surviving site with minimal effort. Nevertheless, the design of the
VMware environment has to account for a number of potential failure
scenarios and mitigate the risk of service disruption. The following
paragraphs discuss the best practices for designing the VMware
environment to ensure an optimal solution.

6.5.1 VMware cluster configuration


A VMware HA cluster uses a heartbeat to determine if the peer nodes in
the cluster are reachable and responsive. In case of a communication
failure, the VMware HA software running on the VMware ESX server
normally utilizes the default gateway for the VMware kernel to determine
if it should isolate itself. This mechanism is necessary since it is
programmatically impossible to determine whether a break in
communication is due to a server failure or a network failure.
The same fundamental issue presented above, namely whether the lack of
connectivity between the nodes of the VPLEX clusters is due to a network
communication failure or a site failure, applies to VPLEX clusters that
are separated by geographical distances. A network failure is handled on
the VPLEX systems by automatically suspending all I/Os to a device at
one of the two sites (the device is detached), based on a set of
predefined rules.

[2] Although the architecture of VPLEX is designed to support concurrent
access at multiple locations, the first version of the product supports a
two-site configuration separated by synchronous distance.
[3] The solution requires extension of the VLAN to the different physical
data centers. Technologies such as Cisco's Overlay Transport
Virtualization (OTV) can be leveraged to provide this service.


The I/O operations to the same device at the other site continue
normally. Furthermore, since the rules are applied on a device-by-device
basis, it is possible to have active devices at both sites in case of a
network partition. Imposition of the rules to minimize the impact of
network interruptions does, however, have an impact in the case of a
site failure. In this case, based on the rules defining the site that
detaches, the VPLEX cluster at the surviving site automatically suspends
I/O to some of the devices at that site if there is a breakdown in
communications. To address this, the VPLEX software provides the
capability to manually resume I/Os to the detached devices. However, a
more detailed discussion of the procedure to perform these operations is
beyond the scope of this TechBook.
Figure 49 on page 114 shows the recommended cluster configuration for
VMware deployments that leverage devices presented through VPLEX Metro
systems. It can be seen from the figure that the VMware virtualization
platform is divided into two separate VMware clusters. Each cluster
contains the VMware ESX servers at a single physical data center (site A
or site B). However, both VMware clusters are managed under a single
datacenter entity that represents the logical combination of the
physical sites involved in the solution. Also shown in the figure, as an
inset, are the settings for each cluster. The inset shows that VMware
DRS and VMware HA are active in each cluster, thus restricting the
domain of operation of these components of the VMware offering to a
single physical location.

Figure 49    Configuration of VMware clusters

Although Figure 49 shows only two VMware clusters, it is acceptable to
divide the VMware ESX servers at each physical location into multiple
VMware clusters. The goal of the recommended configuration is to prevent
intermingling of the ESX servers at multiple locations into a single
VMware cluster object.
The VMware datastores presented to the logical representation of the
conjoined physical data centers (site A and site B) are shown in
Figure 50 on page 116.


The figure shows that a number of VMware datastores are presented across
both data centers [4]. Therefore, the logical separation of the VMware
DRS and VMware HA domains does not in any way impact, as discussed in
the following section, the capability of VMware vCenter Server to
transparently migrate the virtual machines operating in the cluster
designated for each site to its peer site.
Figure 50 also highlights the fact that a VPLEX Metro configuration does
not in and of itself require that all of the virtual volumes created on
the Metro-Plex system be replicated to all physical data center
locations [5]. Virtual machines hosted on datastores encapsulated on
virtual volumes with a single copy of the data, and presented to the
VMware cluster at that location, are however bound to that site and
cannot be nondisruptively migrated to the second site while retaining
protection against unplanned events. The need to host a set of virtual
machines on non-replicated virtual volumes could be driven by a number
of reasons, including the business criticality of the virtual machines
hosted on those datastores.

[4] It is possible to present a virtual volume that is not replicated to
the VMware clusters at both sites. In such a configuration, all I/O
activity generated at the site that does not have a copy of the data is
satisfied by the storage array hosting the virtual volume. Such a
configuration can impose severe performance penalties and does not
protect against unplanned events at the site hosting the storage array.
[5] The creation of a shared datastore that is visible to VMware ESX
servers at both sites is enabled by creating a distributed device in
VPLEX Metro. The procedure to create distributed devices is beyond the
scope of this TechBook.

Figure 50    Storage view of the datastores presented to VMware clusters

Figure 51 is an extension of the information shown in Figure 50. This
figure includes information on the virtual machines and the datastores
in the configuration used in this study. The figure shows that the
virtual machines hosted on a given datastore are executing at a single
physical location. Also shown in this figure is the WWN of the SCSI
device hosting the datastore Distributed_DSC_Site_A.
The configuration of the VPLEX Metro virtual volume with the WWN
displayed in Figure 51 is exhibited in Figure 52. The figure shows that
the virtual volume is exported to the hosts in the VMware cluster at
site A.

Figure 51    Datastores and virtual machines view

Figure 52    Metro-Plex volume presented to the VMware environment

Figure 53 on page 119 shows the rules enforced on the virtual volume
hosting the datastore Distributed_DSC_Site_A. It can be seen from the
figure that the rules are set to suspend I/Os at site B in case of a
network partition. Therefore, the rules ensure that if there is a
network partition, the virtual machines hosted on the datastore
Distributed_DSC_Site_A are not impacted by it. Similarly, for the
virtual machines hosted at site B, the rules are set to ensure that the
I/Os to those datastores are not impacted in case of a network
partition.

Figure 53    Detach rules on VPLEX distributed devices

6.5.2 Nondisruptive migration of virtual machines using VMotion


An example of the capability of migrating running virtual machines
between the clusters, and hence between physical data centers, is shown
in Figure 54 on page 120. The figure clearly shows that, from the VMware
vCenter Server perspective, the physical location of the data centers
does not play a role in providing the capability to move live workloads
between sites supported by a VPLEX Metro system.

Figure 54    vCenter Server allowing live migration of virtual machines

Figure 55 on page 121 shows a snapshot taken during the nondisruptive
migration of a virtual machine from one site to another. The figure also
shows the console of the virtual machine during the migration process,
highlighting the lack of any impact to the virtual machine during the
process.

Figure 55    Progression of VMotion between two physical sites

It is important to note that EMC does not recommend the migration of a
single virtual machine from one site to another, because it breaks the
paradigm discussed previously. A partial migration of the virtual
machines hosted on a datastore can cause unnecessary disruption to the
service in case of a network partition. For example, after the
successful migration of the virtual machine IOM02 shown in Figure 54 on
page 120 and Figure 55, if there is a network partition, the rules in
effect on the device hosting the datastore suspend I/Os at the site on
which the migrated virtual machine is executing. The suspension of I/Os
results in an abrupt disruption of the services provided by IOM02. To
prevent such an untoward event, EMC recommends migrating all of the
virtual machines hosted on a datastore, followed by a change in the
rules in effect for the device hosting the impacted datastore. The new
rules should ensure that the I/Os to the device continue at the site to
which the migration occurred.

6.5.3 Changing configuration of non-replicated VPLEX Metro volumes


As mentioned in the previous paragraphs, VPLEX Metro does not
restrict the configuration of the virtual volume exported by the cluster.
A VPLEX Metro configuration can export a combination of
non-replicated and replicated virtual volumes. Business requirements
normally dictate the type of virtual volume that has to be configured.
However, if the business requirements change, the configuration of the
virtual volume on which the virtual machines are hosted can be
changed nondisruptively to a replicated virtual volume and presented
to multiple VMware clusters at different physical locations for
concurrent access.
Figure 56 shows the datastore Conversion_Datastore, which is currently
available only to the cluster hosted at a single site (in this case,
site A). Therefore, the virtual machines contained in this datastore
cannot be nondisruptively migrated to the second site available in the
VPLEX Metro configuration [6].

[6] Technologies such as Storage VMotion can be used to migrate the
virtual machine to a VPLEX Metro virtual volume that is replicated and
available at both sites, and thus enable the capability to migrate the
virtual machine nondisruptively between sites. However, this approach
adds unnecessary complexity to the process. Nonetheless, this process
can be leveraged for transporting virtual machines that cannot tolerate
the overhead of synchronous replication.

Figure 56    VMware datastore at a single site in a Metro-Plex configuration

Figure 57 shows the configuration of the virtual volume on which the
datastore is located. It can be seen from the figure that the virtual
volume contains a single device available at the same site. If changing
business requirements require the datastore to be replicated and made
available at both locations, the configuration can be easily changed, as
long as sufficient physical storage is available at the second site that
currently does not contain a copy of the data.

Figure 57    Non-replicated Metro-Plex virtual volume



The process to convert a non-replicated device encapsulated in a virtual
volume, so that it is replicated to the second site and presented to the
VMware cluster at that site, is presented next. The process involves
four steps:
1. Create a device at the site on which the copy of the data needs to
reside. The process to create a device, shown in Figure 58, is
independent of the host operating system.

Figure 58    Creating a device on VPLEX

2. Add the newly created device as a mirror to the existing device that
needs the geographical protection. This is shown in Figure 59 on
page 125, and just like the previous step, is independent of the host
operating system utilizing the virtual volumes created from the
devices.
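A hedged CLI sketch of this step is shown below; the device names are
hypothetical and the option spelling should be verified against the VPLEX
CLI Guide for the release in use.

    # Attach the device created at site B as a mirror of the existing
    # site A device; VPLEX converts the device to a distributed RAID 1 and
    # synchronizes the new leg in the background.
    device attach-mirror --device dev_Conversion_SiteA \
        --mirror dev_Conversion_SiteB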

Figure 59    Protection type change of a RAID 0 VPLEX device to distributed RAID 1

3. Create or change the LUN masking on the Metro-Plex to enable the
VMware ESX servers attached to the node at the second site to access
the virtual volume containing the replicated devices. Figure 60 on
page 126 shows the results after the execution of the process.
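For illustration, adding the now-distributed virtual volume to the storage
view used by the ESX servers at the second site might look like the
following sketch; the view and volume names are placeholders.

    # Present the distributed virtual volume to the site B ESX servers
    export storage-view addvirtualvolume --view site_B_esx_view \
        --virtual-volumes Conversion_Datastore_vol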

Figure 60    VPLEX virtual volume at the second site

4. The newly exported VPLEX virtual volume that contains the replicated
devices needs to be discovered on the VMware cluster at the second site.
This process is the same as adding any SCSI device to a VMware cluster.
Figure 61 shows that the replicated datastore is now available to both
VMware clusters, at site A and site B, after the rescan of the SCSI bus.

Figure 61    Viewing VMware ESX servers


6.5.4 Virtualized vCenter Server on VPLEX Metro


VMware supports virtualized instances of vCenter Server version 4.0 or
later. Running the vCenter Server and associated components in a virtual
machine provides customers with great flexibility and convenience, since
the benefits of a virtual datacenter can be leveraged for all components
in a VMware deployment. However, in a VPLEX Metro environment, a careless
deployment of a vCenter Server running in a virtual machine can expose
interesting challenges if there is a site failure. This is especially
true if the vCenter Server is used to manage VMware environments that are
also deployed on the same VPLEX Metro clusters.
As discussed in previous sections, in case of a site failure or a network
partition between the sites, the VPLEX system automatically suspends all
of the I/Os at one site. The site at which the I/Os are suspended is
determined through the set of rules that is active when the event occurs.
This behavior can increase the recovery time (RTO) if there is a site
failure and the VMware vCenter Server is located on a VPLEX distributed
volume that is replicated to both sites. The issue can best be elucidated
through the use of an example.
Consider a VMware deployment in which the vCenter Server and a SQL Server
are running on separate virtual machines. However, the two virtual
machines are hosted on a VPLEX device, D, replicated between two sites, A
and B. In this example, let us assume that the vCenter Server and SQL
Server are executing at site A. The best practices recommendation would
therefore dictate that the I/Os to device D be suspended at site B. This
recommendation allows the virtual machines hosting the vSphere management
applications to continue running at site A in case of a network
partition [7]. However, if a disruptive event causes all service at site
A to be lost, the VMware environment becomes unmanageable, since the
instance of device D at site B would be in a suspended state.

[7] It is important to note that in case of a network partition, the
virtual machines executing at site B continue to run uninterrupted.
However, since the vCenter Server located at site A has no network
connectivity to the servers at site B, the VMware ESX server environment
at site B cannot be managed. This includes unavailability of advanced
functionality such as DRS and VMotion.

To recover from this, the corrective actions listed below would have to
be performed:
1. The I/Os to device D have to be resumed. This can be done through the
VPLEX management interface.
2. Once the I/Os to device D have been resumed, the vSphere Client should
be pointed to the ESX server at site B that has access to the datastore
hosted on device D.
3. The virtual machines hosting the vCenter Server and SQL Server
instances have to be registered using the vSphere Client.
4. After the virtual machines are registered, the SQL Server should be
started first.
5. Once the SQL Server is fully functional, the vCenter Server should be
started.
These steps restore a fully operational VMware management environment at
site B in case of a failure at site A.
The example above clearly shows that hosting a vCenter Server on a
replicated VPLEX Metro device can impose additional complexity on the
environment if there is a site failure. There are two possible techniques
that can be used to mitigate this:
◆ The vCenter Server and SQL Server can be hosted on non-replicated
VPLEX devices. VMware vCenter Server Heartbeat can be used to
transparently replicate the vCenter data between the sites and provide a
recovery mechanism in case of a site failure. This solution allows the
vCenter Server to automatically fail over to the surviving site with
minimal to no additional intervention. Consult the VMware vCenter Server
Heartbeat documentation for further information.
◆ The vCenter Server and SQL Server can be located at a third,
independent site that is not impacted by the failure of either site
hosting the VMware ESX servers. This solution allows the VMware
management services to remain available even during a network partition
that disrupts communication between the sites hosting the VPLEX Metro
systems.
Customers should decide on the most appropriate solution for their
environment after evaluating the advantages and disadvantages of each.


6.6 Conclusion
VPLEX, running the GeoSynchrony operating environment on common EMC
hardware, introduces the first platform that delivers both local and
distributed federation to the data center. It is capable of providing
easy integration into an existing data center as well as data
distribution on a global level. This TechBook not only describes the
architecture of VPLEX, but also provides insight into the primary use
cases that VPLEX supports at publication. The use cases revolve around
planned and unplanned events, nondisruptive data migration, simplified
storage management, AccessAnywhere across synchronous distances in a
VPLEX Metro, and workload resiliency.


Glossary

This glossary contains terms related to VPLEX federated storage systems.
Many of these terms are used in this manual.

A
AccessAnywhere
The breakthrough technology that enables VPLEX clusters to provide access to information between clusters that are separated by distance.

active/active
A cluster with no primary or standby servers, because all servers can run applications and interchangeably act as backup for one another.

active/passive
A powered component that is ready to operate upon the failure of a primary component.

array
A collection of disk drives where user data and parity data may be stored. Devices can consist of some or all of the drives within an array.

asynchronous
Describes objects or events that are not coordinated in time. A process operates independently of other processes, being initiated and left for another task before being acknowledged. For example, a host writes data to the blades and then begins other work while the data is transferred to a local disk and across the WAN asynchronously. See also synchronous.

B
bandwidth
The range of transmission frequencies a network can accommodate, expressed as the difference between the highest and lowest frequencies of a transmission cycle. High bandwidth allows fast or high-volume transmissions.

bit
A unit of information that has a binary digit value of either 0 or 1.

block
The smallest amount of data that can be transferred following SCSI standards, which is traditionally 512 bytes. Virtual volumes are presented to users as contiguous lists of blocks.

block size
The actual size of a block on a device.

byte
Memory space used to store eight bits of data.

C
cache
Temporary storage for recent writes and recently accessed data. Disk data is read through the cache so that subsequent read references are found in the cache.

cache coherency
Managing the cache so data is not lost, corrupted, or overwritten. With multiple processors, data blocks may have several copies, one in the main memory and one in each of the cache memories. Cache coherency propagates the blocks of multiple users throughout the system in a timely fashion, ensuring the data blocks do not have inconsistent versions in the different processors' caches.

cluster
Two or more VPLEX directors forming a single fault-tolerant cluster, deployed as one to four engines.

cluster ID
The identifier for each cluster in a multi-cluster deployment. The ID is assigned during installation.

cluster deployment ID
A numerical cluster identifier, unique within a VPLEX cluster. By default, VPLEX clusters have a cluster deployment ID of 1. For multi-cluster deployments, all but one cluster must be reconfigured to have different cluster deployment IDs.

clustering
Using two or more computers to function together as a single entity. Benefits include fault tolerance and load balancing, which increases reliability and up time.

COM
The intra-cluster communication (Fibre Channel). The communication used for cache coherency and replication traffic.

command line interface (CLI)
A way to interact with a computer operating system or software by typing commands to perform specific tasks.

continuity of operations (COOP)
The goal of establishing policies and procedures to be used during an emergency, including the ability to process, store, and transmit data before and after.

controller
A device that controls the transfer of data to and from a computer and a peripheral device.

D
data sharing
The ability to share access to the same data with multiple servers regardless of time and location.

device
A combination of one or more extents to which you add specific RAID properties. Devices use storage from one cluster only; distributed devices use storage from both clusters in a multi-cluster plex. See also distributed device.

director
A CPU module that runs GeoSynchrony, the core VPLEX software. There are two directors in each engine, and each has dedicated resources and is capable of functioning independently.

dirty data
The write-specific data stored in the cache memory that has yet to be written to disk.

disaster recovery (DR)
The ability to restart system operations after an error, preventing data loss.

disk cache
A section of RAM that provides cache between the disk and the CPU. RAM's access time is significantly faster than disk access time; therefore, a disk-caching program enables the computer to operate faster by placing recently accessed data in the disk cache.

distributed device
A RAID 1 device whose mirrors are in geographically separate locations.

distributed file system (DFS)
Supports the sharing of files and resources in the form of persistent storage over a network.

E
engine
Enclosure that contains two directors, management modules, and redundant power.

Ethernet
A Local Area Network (LAN) protocol. Ethernet uses a bus topology, meaning all devices are connected to a central cable, and supports data transfer rates of between 10 megabits per second and 10 gigabits per second. For example, 100 Base-T supports data transfer rates of 100 Mb/s.

event
A log message that results from a significant action initiated by a user or the system.

extent
A slice (range of blocks) of a storage volume.

F
failover
Automatically switching to a redundant or standby device, system, or data path upon the failure or abnormal termination of the currently active device, system, or data path.

fault tolerance
Ability of a system to keep working in the event of hardware or software failure, usually achieved by duplicating key system components.

Fibre Channel (FC)
A protocol for transmitting data between computer devices. Longer distance requires the use of optical fiber; however, FC also works using coaxial cable and ordinary telephone twisted pair media. Fibre Channel offers point-to-point, switched, and loop interfaces. Used within a SAN to carry SCSI traffic.

field replaceable unit (FRU)
A unit or component of a system that can be replaced on site as opposed to returning the system to the manufacturer for repair.

firmware
Software that is loaded on and runs from the flash ROM on the VPLEX directors.

G
geographically distributed system
A system physically distributed across two or more geographically separated sites. The degree of distribution can vary widely, from different locations on a campus or in a city to different continents.

gigabit (Gb or Gbit)
1,073,741,824 (2^30) bits. Often rounded to 10^9.

gigabit Ethernet
The version of Ethernet that supports data transfer rates of 1 Gigabit per second.

gigabyte (GB)
1,073,741,824 (2^30) bytes. Often rounded to 10^9.

global file system (GFS)
A shared-storage cluster or distributed file system.

H
host bus adapter (HBA)
An I/O adapter that manages the transfer of information between the host computer's bus and memory system. The adapter performs many low-level interface functions automatically or with minimal processor involvement to minimize the impact on the host processor's performance.

I
input/output (I/O)
Any operation, program, or device that transfers data to or from a computer.

internet Fibre Channel protocol (iFCP)
Connects Fibre Channel storage devices to SANs or the Internet in geographically distributed systems using TCP.

intranet
A network operating like the World Wide Web but with access restricted to a limited group of authorized users.

internet small computer system interface (iSCSI)
A protocol that allows commands to travel through IP networks, which carries data from storage units to servers anywhere in a computer network.

I/O (input/output)
The transfer of data to or from a computer.

K
kilobit (Kb)
1,024 (2^10) bits. Often rounded to 10^3.

kilobyte (K or KB)
1,024 (2^10) bytes. Often rounded to 10^3.

L
latency
Amount of time it requires to fulfill an I/O request.

load balancing
Distributing the processing and communications activity evenly across a system or network so no single device is overwhelmed. Load balancing is especially important when the number of I/O requests issued is unpredictable.

local area network (LAN)
A group of computers and associated devices that share a common communications line and typically share the resources of a single processor or server within a small geographic area.

logical unit number (LUN)
Used to identify SCSI devices, such as external hard drives, connected to a computer. Each device is assigned a LUN number which serves as the device's unique address.

M
megabit (Mb)
1,048,576 (2^20) bits. Often rounded to 10^6.

megabyte (MB)
1,048,576 (2^20) bytes. Often rounded to 10^6.

metadata
Data about data, such as data quality, content, and condition.

metavolume
A storage volume used by the system that contains the metadata for all the virtual volumes managed by the system. There is one metadata storage volume per cluster.

Metro-Plex
Two VPLEX Metro clusters connected within metro (synchronous) distances, approximately 60 miles or 100 kilometers.

mirroring
The writing of data to two or more disks simultaneously. If one of the disk drives fails, the system can instantly switch to one of the other disks without losing data or service. RAID 1 provides mirroring.

miss
An operation where the cache is searched but does not contain the data, so the data instead must be accessed from disk.

N
namespace
A set of names recognized by a file system in which all names are unique.

network
System of computers, terminals, and databases connected by communication lines.

network architecture
Design of a network, including hardware, software, method of connection, and the protocol used.

network-attached storage (NAS)
Storage elements connected directly to a network.

network partition
When one site loses contact or communication with another site.

P
parity
The even or odd number of 0s and 1s in binary code.

parity checking
Checking for errors in binary data. Depending on whether the byte has an even or odd number of bits, an extra 0 or 1 bit, called a parity bit, is added to each byte in a transmission. The sender and receiver agree on odd parity, even parity, or no parity. If they agree on even parity, a parity bit is added that makes each byte even. If they agree on odd parity, a parity bit is added that makes each byte odd. If the data is transmitted incorrectly, the change in parity will reveal the error.

partition
A subdivision of a physical or virtual disk, which is a logical entity only visible to the end user, not any of the devices.

plex
A single VPLEX cluster.

R
RAID
The use of two or more storage volumes to provide better performance, error recovery, and fault tolerance.

RAID 0
A performance-orientated striped or dispersed data mapping technique. Uniformly sized blocks of storage are assigned in regular sequence to all of the array's disks. Provides high I/O performance at low inherent cost. No additional disks are required. The advantages of RAID 0 are a very simple design and an ease of implementation.

RAID 1
Also called mirroring, this has been used longer than any other form of RAID. It remains popular because of simplicity and a high level of data availability. A mirrored array consists of two or more disks. Each disk in a mirrored array holds an identical image of the user data. RAID 1 has no striping. Read performance is improved since either disk can be read at the same time. Write performance is lower than single disk storage. Writes must be performed on all disks, or mirrors, in the RAID 1. RAID 1 provides very good data reliability for read-intensive applications.

RAID leg
A copy of data, called a mirror, that is located at a user's current location.

rebuild
The process of reconstructing data onto a spare or replacement drive after a drive failure. Data is reconstructed from the data on the surviving disks, assuming mirroring has been employed.

redundancy
The duplication of hardware and software components. In a redundant system, if a component fails then a redundant component takes over, allowing operations to continue without interruption.

reliability
The ability of a system to recover lost data.

remote direct memory access (RDMA)
Allows computers within a network to exchange data using their main memories and without using the processor, cache, or operating system of either computer.

S
scalability
Ability to easily change a system in size or configuration to suit changing conditions, to grow with your needs.

simple network management protocol (SNMP)
Monitors systems and devices in a network.

site ID
The identifier for each cluster in a multi-cluster plex. By default, in a non-geographically distributed system the ID is 0. In a geographically distributed system, one cluster's ID is 1, the next is 2, and so on, each number identifying a physically separate cluster. These identifiers are assigned during installation.

small computer system interface (SCSI)
A set of evolving ANSI standard electronic interfaces that allow personal computers to communicate faster and more flexibly than previous interfaces with peripheral hardware such as disk drives, tape drives, CD-ROM drives, printers, and scanners.

storage area network (SAN)
A high-speed special purpose network or subnetwork that interconnects different kinds of data storage devices with associated data servers on behalf of a larger network of users.

storage view
A combination of registered initiators (hosts), front-end ports, and virtual volumes, used to control a host's access to storage.

storage volume
A LUN exported from an array.

stripe depth
The number of blocks of data stored contiguously on each storage volume in a RAID 0 device.

striping
A technique for spreading data over multiple disk drives. Disk striping can speed up operations that retrieve data from disk storage. Data is divided into units and distributed across the available disks. RAID 0 provides disk striping.

synchronous
Describes objects or events that are coordinated in time. A process is initiated and must be completed before another task is allowed to begin. For example, in banking, two withdrawals from a checking account that are started at the same time must not overlap; therefore, they are processed synchronously. See also asynchronous.

T
throughput
1. The number of bits, characters, or blocks passing through a data communication system or portion of that system.
2. The maximum capacity of a communications channel or system.
3. A measure of the amount of work performed by a system over a period of time. For example, the number of I/Os per day.

tool command language (TCL)
A scripting language often used for rapid prototypes and scripted applications.

transmission control protocol/Internet protocol (TCP/IP)
The basic communication language or protocol used for traffic on a private network and the Internet.

U
uninterruptible power supply (UPS)
A power supply that includes a battery to maintain power in the event of a power failure.

universal unique identifier (UUID)
A 64-bit number used to uniquely identify each VPLEX director. This number is based on the hardware serial number assigned to each director.

V
virtualization
A layer of abstraction implemented in software that servers use to divide available physical storage into storage volumes or virtual volumes.

virtual volume
A virtual volume looks like a contiguous volume, but can be distributed over two or more storage volumes. Virtual volumes are presented to hosts.

W
wide area network (WAN)
A geographically dispersed telecommunications network. This term distinguishes a broader telecommunication structure from a local area network (LAN).

world wide name (WWN)
A specific Fibre Channel Name Identifier that is unique worldwide and represented by a 64-bit unsigned binary value.

write-through mode
A caching technique in which the completion of a write request is communicated only after data is written to disk. This is almost equivalent to non-cached systems, but with data protection.

Index

architecture
characteristics 20
clustering 23
overview 20
software 45
audience 11

deployment overview 79
VPLEX Local 79
when to use 80
director software 57
distributed block cache 91
distributed data access
introduction 88
removing barriers with VPLEX Metro 90
traditional approach 89
approach problems 89
distributed device
cluster quorum 93
detach rules 92
fencing mechanisms 93

C
cabinet
power distribution panels 39
power distribution units 39
cache coherence 66
cache layering roles 65
distributed cache 65
I/O implementation 65
local node cache 65
clustering architecture 23
command line interface 52
describe command 52
options 55
Secure Shell 52
command line options 55
configuration overview 61
configurations
large 64
medium 63
small 62
ConnectEMC 46
Connectrix DS-300B switch 58
context-sensitive help 51

E
EmaAdapter 46
EMC VPLEX Metro-Plex 23
EMC VPLEX Storage Cluster 23
encapsulate and migrate
steps 101
engine components 40
director 40
director, I/O modules 41
management access modules 40
power 40
rack space 40
external connections 59


F
federated solution
enabling 78
Fibre Channel COM 58
Fibre Channel SANs 89

H
hardware
engine 43
fans 43
Fibre Channel COM switch 43
other components 43
switches 43
hardware architecture 38
storage cluster cabinet 38
hardware components 24, 34
high-level features
disk slicing 31
distributed RAID 1 31
migration 31
RAID 0 31
RAID 1 31
RAID C 31
remote export 31
storage volume encapsulator 31
write-through cache 31

management server 35, 42


command line interface 52
connectivity 42
contents 42
software 49
management server software 49
components 49
meta-directory 66

N
networks 36
Ethernet ports 36
public management port 36
nondisruptive migration 119
nondisruptive migrations 98
nondisruptive workload relocation 96

O
online help
accessing 50
overview and architecture
introduction 18

R
related documentation 11

I/O implementation 65
cache coherence 66
cache layering roles 65
meta-directory 66
share groups 66
I/O modules
Fibre Channel 41
Gigabit Ethernet 41
port roles 41
internal connections 58
Fibre Channel COM 58

SAN outages 84
best practices 85
scalability and limits 37
Secure Shell (SSH) 52
share groups 66
simplified storage management 87
context-sensitive help 51
provision storage tab 51
software
GeoSynchrony 35
management server 35
storage provisioning 24
process 24
Storage VMotion 95, 98
system cabinet 38
system management server 46
ConnectEMC 46

M
management connectivity
Solar Flare management 59
management console 49, 53
online help 50

EmaAdapter 46
processes 46
Secure Shell 46
user accounts 48
system management software
components 46
system reporting 56
automatically collect data 56
configuration file 56
manually collect data 56

T
technology refresh
use case 87

U
user accounts 48
admin 48
all account type 48
Linux CLI 48
management server 48
service 48

V
VMware cluster configuration 112
VMware deployments 112
VMware vSphere 96
VPLEX
deployment overview 79
system reporting 56
VPLEX cluster
cabinet 39
configurations
pre-installed 39
management server 42
scalability and limits 37
single phase power 39
storage provisioning 24
VPLEX family
architecture 20
clustering architecture 23
components 21
VPLEX Geo 21
VPLEX Global 21
VPLEX Local 21
VPLEX Metro 21

configuration overview 61
configurations
large 61
medium 61
small 61
engine components 40
external connections 59
handling a read 66
hardware 24, 34
hardware and software 33
high-level features 31
how to handle a write 67
management console 49
networks 36
software 35
software architecture 45
system management software 46
VPLEX Geo 21, 22
VPLEX Global 21, 22
VPLEX Local 21, 22
VPLEX Local deployment 79
VPLEX management console 49
VPLEX management console web interface 46
VPLEX Metro 21, 22
distributed data access
removing barriers 90
distributed devices 92
remote access 91
technical overview 91
VMware deployments 112
VPLEX Metro deployment
between data centers 81
when to use 82
within a data center 80
when to use 81
VPLEX system data management 67
VPLEX System Management Software (SMS) 42
VPlexcli 52
command line options 55
wildcards 55
multiple symbols 55
question mark 55
two asterisks 55

W
workload relocation 96
use case example 96, 97


workload resiliency 83
best practices 83
storage array outages 83
