
SOLUTION BRIEF

Brocade IO Insight

HIGHLIGHTS

• Monitors individual host and storage devices to gain deeper insight into the performance of the network to maintain SLA compliance
• Obtains total IOs, first response time, IO latency (Exchange Completion Time, or ECT), and outstanding IOs performance metrics for a specific host or storage device, in order to diagnose IO operational issues
• Enables tuning of device configurations with integrated IO metrics, to optimize storage performance

IO Insight – Critical Visibility for Operational Stability

The Brocade® Gen 6 Fibre Channel network supports mission-critical applications with cutting-edge technologies for data center IT infrastructure that offer unparalleled bandwidth, scale, performance, and availability. Beyond these traditional requirements, the need for network visibility and actionable insight that IT administrators can use to ensure operational consistency and stability remains essential. Brocade Gen 6 technology continues to deliver innovation in this critical area, with the Brocade IO Insight capability. IO Insight complements and extends existing Brocade Fabric Vision™ technology and features with deep visibility on storage Input/Output (IO) performance to ensure operational stability. This solution brief presents the demand for IO Insight capability, explains the IO Insight metrics, and demonstrates the use cases for IO Insight. Finally, it compares and contrasts IO Insight with the Brocade Analytics Monitoring Platform.

The Challenge of IO Performance Monitoring

IT organizations continue increasing mission-critical workloads in their data centers. The demand for a large-scale storage network with optimal performance that is delivered consistently and with operational stability has increased dramatically. Data center customers need deep visibility into the performance of storage IO workloads to guarantee application performance between hosts and storage devices. At a very basic level, almost all customers need throughput-related performance metrics for each switch port to monitor utilization and prevent congestion. Beyond that, they need initiator-target flow level performance metrics to monitor performance associated with applications. While Brocade Fabric Vision technology with Gen 5 products has addressed some of these requirements through the Brocade Monitoring and Alerts Policy Suite (MAPS) and Brocade Flow Vision diagnostic tools, visibility into additional key metrics can further ease administrative overhead and extend storage fabric resiliency to new levels. Specifically, administrators need latency and performance metrics for storage
IO operations that run on top of the storage network. These storage IO operations are Small Computer System Interface (SCSI) commands on a Fibre Channel (FC) network. Without this visibility, it is difficult to proactively monitor conditions to ensure consistent storage performance for applications.

For example, almost all large enterprise IT organizations have adopted solid state drive technologies, including either all flash arrays or hybrid flash arrays, which can dramatically reduce IO latency. These organizations tend to deploy their latency-sensitive applications to these devices. Yet, due to the lack of latency metrics, it remains a challenge to proactively monitor the performance of key infrastructure equipment to maximize Return on Investment (ROI). Furthermore, when performance problems related to storage IO operations occur, it is difficult for administrators to isolate the problem scope, in order to quickly troubleshoot the issues. These difficulties can result in suboptimal or unstable application performance and increased operational costs.

Brocade IO Insight

IO Insight is a Brocade Gen 6 Application-Specific Integrated Circuit (ASIC) built-in capability that nondisruptively and noninvasively gathers IO statistics. The instrumentation allows administrators to monitor and baseline application-level and device-level latency and IOs per Second (IOPS) metrics to detect degraded storage performance. Administrators can proactively control performance and availability to ensure operational stability. IO Insight provides these metrics from device ports on Gen 6 platforms. To understand why these metrics are important, consider how Fibre Channel protocol structure supports the workloads in a storage network.

Fibre Channel protocol supports block storage. The majority of traffic running over Fibre Channel networks is SCSI IOs between server applications and storage arrays connected via a fabric. The SCSI protocol commands and data are mapped to FC protocol structures. A SCSI Read or Write operation between an initiator and target is divided into different phases. Each phase of the operation is mapped to a specific FC sequence. Depending on the IO data size, a sequence is mapped to one or more FC frames. The collection of frames and sequences that are associated with a single SCSI operation forms an FC exchange (see Figure 1).

Figure 1: FC exchange, sequence, and frame. [Diagram: one exchange contains one or more sequences; each sequence contains one or more frames.]

A SCSI Read operation has three different phases: Command, Data, and Status as shown in Figure 2. A SCSI Write operation has four different phases: Command, Transfer Ready, Data, and Status as shown in Figure 3. Each Command, Transfer Ready, and Status phase is represented by a unique FC sequence. Depending on the data size and on end-device implementation, the Data phase can be a single sequence consisting of multiple frames or multiple sequences, each consisting of multiple frames.

Figure 2: SCSI Read command (CMD) sequence and latency. [Diagram: the SCSI initiator sends FCP Read CMD to the SCSI target; the target returns FCP Data (the first response), optionally further FCP Data frames up to FCP Data N, and then Status. First Response Time spans CMD to the first FCP Data frame; Command Completion Time spans CMD to Status.]
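The mapping from SCSI phases to FC sequences and frames can be sketched as a small data model. This is only an illustration of the hierarchy described in the text, not how any Brocade product represents exchanges: the class names, the `build_exchange` helper, and the 2,048-byte payload per frame are assumptions made for the example, and the Data phase is modeled as a single sequence even though the text notes it may be split into several.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List

# Phase order taken from the text: Read has three phases, Write has four.
PHASES = {
    "read": ["Command", "Data", "Status"],
    "write": ["Command", "Transfer Ready", "Data", "Status"],
}


@dataclass
class Sequence:
    """One FC sequence, carrying one phase of a SCSI operation."""
    phase: str
    frame_count: int


@dataclass
class Exchange:
    """One FC exchange: all sequences belonging to a single SCSI operation."""
    operation: str
    sequences: List[Sequence] = field(default_factory=list)

    def total_frames(self) -> int:
        return sum(seq.frame_count for seq in self.sequences)


def build_exchange(operation: str, data_bytes: int,
                   payload_bytes: int = 2048) -> Exchange:
    """Build a simplified exchange for a read or write of data_bytes.

    Assumptions (not from the source): each control phase fits in one
    frame, the Data phase is one sequence, and every data frame carries
    payload_bytes of payload.
    """
    if operation not in PHASES:
        raise ValueError(f"unsupported operation: {operation!r}")
    exchange = Exchange(operation)
    for phase in PHASES[operation]:
        if phase == "Data":
            frames = max(1, ceil(data_bytes / payload_bytes))
        else:
            frames = 1
        exchange.sequences.append(Sequence(phase, frames))
    return exchange


if __name__ == "__main__":
    for op in ("read", "write"):
        ex = build_exchange(op, 8192)
        print(op, [(s.phase, s.frame_count) for s in ex.sequences],
              "total frames:", ex.total_frames())
```

Under these assumptions, a larger IO only adds frames to the Data sequence, while the number of sequences is fixed by the operation type.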

Figure 3: SCSI Write command sequence and latency. [Diagram: the SCSI initiator sends FCP Write CMD to the SCSI target; the target returns FCP Xfer Ready (the first response); the initiator sends FCP Data, optionally up to FCP Data N; the target returns Status. First Response Time spans CMD to Xfer Ready; Command Completion Time spans CMD to Status.]

Because of this multilayer protocol structure that carries the application storage traffic across a network, it is important to have visibility at the different layers. Certain metrics can be obtained only at a particular layer. The FC frame layer metrics provide throughput and frame rate information that represents the speed at which frames are processed at a port. Frequently, these metrics might not provide sufficient visibility on the health of storage IO operations. For example, while the frame level throughput for an initiator-target flow is normal, the application response may still be slow due to many IO operations that are handled by the storage target and Logical Unit Number (LUN). Metrics for IO latency and IOPS can only be obtained at an FC exchange layer, since each exchange represents a specific SCSI Read or Write operation. Although server-side or array-side tools exist to provide this information outside of a fabric, without the visibility from a fabric, Storage Area Network (SAN) administrators face the complexity of Service-Level Agreement (SLA) compliance for multiple applications running in a fabric. When issues related to application IO performance and reliability occur, it is difficult to pinpoint the source of the problem without the metrics from the perspective of a fabric. As a result, a fabric is usually assumed to be the cause of the problem.

Defining IO Insight Metrics

IO Insight metrics are provided on Gen 6 device ports, that is, ports connected to hosts or targets. The metrics are not available on Inter-Switch Link (ISL) ports, because it is commonly sufficient to obtain device-level IO metrics on the edge ports in a fabric. It is important to note that IO Insight metrics are always measured from the perspective of device ports. In other words, the metrics represent the timing when a frame enters or leaves a switch port. Normally, the metrics may differ slightly from what the host or target side IO performance tools report. IO Insight includes the following metrics.

First Response Time: The First Response Time measures the time between when a SCSI command frame is issued by an initiator and when the first response frame is issued by a target. For a Read command, the first response is the first data frame sent by the target (see Figure 2). IO Insight displays this metric as “RD CMD -> 1st Data Time.” For a Write command, the first response is the Transfer Ready (Xfer_ready) frame sent by the target (see Figure 3). IO Insight displays the metric as “WR CMD -> 1st XFER_RDY Time.” The First Response Time indicates the access time delay for a Read or Write operation. When a First Response Time is obtained at a host port, it includes fabric latencies and target device latencies. When a First Response Time is measured at a target port, it measures only target device latencies.

Command Completion Time: The Command Completion Time, also commonly known as the Exchange Completion Time, measures the time between when a SCSI command is issued by an initiator and when the status frame is issued by a target to indicate the completion of that operation. For a Read command, IO Insight displays the metric as “RD CMD -> Status Time” (see Figure 2). For a Write command, IO Insight displays the metric as “WR CMD -> Status Time” (see Figure 3). The Command Completion Time indicates the total time delay, including access time and data transfer time, for a Read or Write operation. When a Command Completion Time is obtained at a host port, it includes fabric latencies and target device latencies. When a Command Completion Time metric is obtained at a target port, it measures only target device latencies. The First Response Time and Command Completion Time are usually the best indications of application performance impact, because persistently high values typically result in slow application responsiveness and poor user experience. For any mission-critical application, this condition often requires immediate resolution by SAN administrators.
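The two latency definitions above reduce to simple differences between frame timestamps observed at a port. The following is a minimal sketch of that arithmetic, assuming per-exchange timestamps in microseconds are already available. The `ExchangeTimes` class, its field names, and the `summarize` helper are hypothetical and do not reflect how IO Insight computes or exposes its metrics.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable


@dataclass(frozen=True)
class ExchangeTimes:
    """Timestamps (microseconds) of key frames in one exchange, as seen at a port."""
    command_us: float         # SCSI command frame from the initiator
    first_response_us: float  # first data frame (Read) or Xfer_ready (Write)
    status_us: float          # status frame from the target

    def first_response_time(self) -> float:
        # Command frame to first response frame.
        return self.first_response_us - self.command_us

    def command_completion_time(self) -> float:
        # Command frame to status frame (Exchange Completion Time).
        return self.status_us - self.command_us


def summarize(exchanges: Iterable[ExchangeTimes]) -> Dict[str, float]:
    """Return average and maximum of both latency metrics over the exchanges."""
    items = list(exchanges)
    if not items:
        raise ValueError("at least one exchange is required")
    frt = [e.first_response_time() for e in items]
    cct = [e.command_completion_time() for e in items]
    return {
        "first_response_avg_us": mean(frt),
        "first_response_max_us": max(frt),
        "completion_avg_us": mean(cct),
        "completion_max_us": max(cct),
    }


if __name__ == "__main__":
    samples = [
        ExchangeTimes(0.0, 23.0, 39.0),
        ExchangeTimes(100.0, 154.0, 169.0),
    ]
    for name, value in summarize(samples).items():
        print(f"{name}: {value}")
```

Whether the resulting values include fabric latency depends on where the timestamps are taken, as the text explains: host-port timestamps include fabric and target latencies, while target-port timestamps reflect only the target device.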

Pending IO: The number of Pending IOs measures the average number of IO operations that are pending completion when a Read or Write command is issued by an initiator, that is, the number of IO operations that are outstanding between the Command phase and Status phase. The metrics are provided separately for pending Read commands and for pending Write commands. IO Insight displays the average pending Read commands as “RD Pending IO.” It displays the average pending Write commands as “WR Pending IO.” The Pending IO metric is closely associated with queue depth, which is the maximum number of Pending IOs for a server or a storage device. Server Host Bus Adapters (HBAs) typically support queue depth configuration to control how many IO operations run concurrently on servers. Storage arrays usually have a maximum queue depth limit that they can support from all servers.

IO Count: The total number of completed SCSI IOs between a pair of initiators and targets. IOPS means the average rate of completed SCSI IOs per second. IOPS is a commonly used metric to baseline the performance of storage systems. IO Insight provides separate metrics for completed Read commands and for completed Write commands. IO Insight displays the completed SCSI Read metric as “RD IO Count” and the completed SCSI Write metric as “WR IO Count.” Storage arrays typically include IOPS specifications. The IOPS metrics with IO Insight provide validation and monitoring of storage array performance in production environments.

The SCSI IO performance and latency metrics are inherently affected by the data size of the IO commands. Typically, IOs with large data size take longer to complete, resulting in higher command completion time and lower IOPS. On the other hand, IOs with a small data size require shorter times to complete, resulting in lower command completion time and higher IOPS. These characteristics require metrics of different IO sizes to be provided. Hence, IO Insight provides each of the above metrics for four different ranges of data size: less than 8 Kilobytes (KB), 8 KB or above but less than 64 KB, 64 KB or above but less than 512 KB, and 512 KB or above. The metrics based on data sizes provide the necessary information to understand the IO workload profile, in order to establish baseline behavior for accurate monitoring.

The Accuracy of IO Insight Metrics

In order for IO Insight to be useful for measuring performance at high speed and low latency, the metrics that are gathered must be accurate. In the following example, a host generates 8,192 IOs to a target, each with 2,000 bytes of data. The metrics for this flow are obtained by IO Insight and by the JDSU Xgig Fibre Channel Analyzer. Table 1 compares the metrics for these two methods.

Table 1: IO Insight metrics compared with the JDSU Fibre Channel Analyzer.

Metrics                                          IO Insight   Xgig Expert Report
READ IO Count                                    8.19 K       8,192
WRITE IO Count                                   8.19 K       8,192
Maximum
  RD CMD -> Status Time (microseconds)           69           68
  WR CMD -> Status Time (microseconds)           89           88
  RD CMD -> 1st Data Time (microseconds)         54           53
  WR CMD -> 1st XFER_RDY Time (microseconds)     66           65
  RD Pending IOs                                 4            4
  WR Pending IOs                                 4            4
Average
  RD CMD -> Status Time (microseconds)           39           38
  WR CMD -> Status Time (microseconds)           47           47
  RD CMD -> 1st Data Time (microseconds)         23           23
  WR CMD -> 1st XFER_RDY Time (microseconds)     19           19
  RD Pending IOs                                 2            N/A
  WR Pending IOs                                 2            N/A

IO Insight with Flow Vision and MAPS

IO Insight capabilities are integrated into the existing features of Brocade Fabric
Vision technology. The IO Insight metrics are accessible through Flow Vision. Users define initiator-target or initiator-target-LUN flows with the Flow Monitor function to obtain the IO Insight metrics. The average and maximum values of IO Insight metrics are sampled every six seconds to provide the current values. The average and maximum values across all sampling periods since the flow activation are maintained to provide history information. In addition, Brocade Network Advisor 14.0.1 supports the IO Insight metrics displayed in the Flow Vision real-time performance graph for the past six hours.

The screen capture in Figure 4 of the Brocade Network Advisor real-time performance graph displays the IO Insight metrics. Administrators can save this performance graph as a widget and add it to the Brocade Network Advisor dashboard for at-a-glance performance view of the important IO flows.

Figure 4: IO Insight metrics displayed in a Brocade Network Advisor real-time performance graph.

As described in the previous section, IO Insight metrics are available on device ports. Flow Vision requires a port parameter in the flow definition as the reference point of metrics. To obtain IO Insight metrics on an initiator port, the point of reference must be defined on the ingress port, whereas to obtain the metrics on a target port, the point of reference must be defined on the egress port. This initiator-target direction aligns with the SCSI Read and Write command directions. If a flow is defined with the direction reversed, IO Insight metrics are not available.

The IO Insight metrics available to Brocade Gen 6 fixed-port switches and Gen 6 directors are offered differently, in order to align with the different market segments that these products serve. The Gen 6 fixed-port switches can obtain IO Insight metrics only on the target device ports: only flows with egress port directions are supported. The Gen 6 directors can obtain IO Insight metrics on either host device ports or target device ports: flows with either ingress port or egress port directions are supported. Furthermore, Gen 6 fixed-port switches can obtain IO Insight metrics for initiator-target flows only. Gen 6 directors, on the other hand, support IO Insight metrics for initiator-target or initiator-target-LUN flows to be able to obtain LUN-level IO performance visibility.

After IO Insight metrics are available through a Flow Monitor flow, the flow can be imported into MAPS to configure threshold-based monitoring and alerting on the latency metrics. Users can configure maximum acceptable IO latencies for a particular Initiator-Target or Initiator-Target-LUN flow. Different thresholds for each of the data size ranges can be configured. The thresholds can also be configured over the time window of a minute, an hour, and a day, to support monitoring at different granularity. When latencies above the threshold are detected by MAPS, users receive MAPS notifications based on the preconfigured threshold actions and can then take appropriate measures to respond.

As mentioned, Brocade Network Advisor supports IO Insight metrics with Flow Monitor performance graphs. These performance graphs can be added to dashboards. Administrators can create a custom dashboard that includes a
widget with IO Insight metrics, in order to gain a quick at-a-glance view of the key performance statistics for important flows. In addition, other widgets from existing Brocade Fabric Vision technology features, such as MAPS dashboards and Fabric Performance Impact (FPI) monitoring violations, can be displayed in the same dashboard. Administrators can easily correlate the data from different tools.

IO Insight Use Cases

With IO Insight capabilities, the following use cases are supported:

Storage Performance SLA Compliance

When administrators are responsible for guaranteeing the performance and reliability of their storage networks, throughput and latency are sometimes the key parameters in the SLAs to their customers. The integrated capabilities of IO Insight are well suited for this use case. In an environment with mixed Hard Disk Drive (HDD) and Solid-State Drive (SSD) storage arrays that support mixed workloads, administrators may be required to guarantee the latency for IO operations to be under a certain threshold, for example, 25 milliseconds, to ensure the responsiveness of applications. In order to achieve that requirement, users can create Flow Monitor flows with target devices (destination ID) specified but source devices (source ID) unspecified. This flow definition provides the IO Insight metrics for all flows that are destined to the specified target devices. The flows can be imported into MAPS and configured with Read and Write completion time thresholds of 25 milliseconds over the desired monitoring time window. If this baseline is violated, administrators are notified to take early action before their customers request support.

For certain specific latency-sensitive applications that may be provisioned with high-performance all flash arrays, a higher SLA might be mandated. In such an example, if the customer’s latency tolerance is under 5 milliseconds, the administrator can set up a customer-specific threshold. Administrators can define a separate flow with fully specified initiator, target, and LUN IDs to monitor the storage target or LUN performance for the desired applications. All these monitoring steps can be accomplished through Brocade Fabric OS® (Brocade FOS) Flow Vision and MAPS commands, or through Brocade Network Advisor. The fact that this built-in capability can be enabled at any time and on any device port without any disruption makes this a very powerful, yet flexible, tool.

Storage Performance Troubleshooting

The second use case for IO Insight is troubleshooting storage performance problems, in particular, IO performance issues. When applications experience IO-related performance problems such as a slow response, a time-out, or even a crash, administrators are under pressure to resolve the issues quickly. Because many elements within a storage network can affect performance, it is often necessary to isolate the root cause of a problem to either within a fabric or to a storage device.

IO Insight is a valuable tool in this step. Flows can be defined on storage ports to obtain IO Insight metrics. Since these metrics are obtained on storage ports, they represent the performance and latency of storage devices. If these metrics are abnormal, it is very likely that the problems are due to the storage devices rather than to the fabric. On the other hand, if these metrics are within the normal range, the problem can be within the fabric or from the hosts.

With Gen 6 directors, administrators can further troubleshoot by defining a flow with the same initiator and target, but on the host ports. If the metrics fall within the normal range, then the problem is likely caused by a host side issue. On the other hand, if the metrics are abnormal, problems within the fabric or the slow draining host are likely causing the slow responses. Administrators can correlate the Fabric Performance Impact monitoring to confirm whether the host is a slow draining device. If the host is not a slow draining device, it is likely that the congestion within the fabric is affecting the performance of this flow. Once again, IO Insight provides the visibility for troubleshooting with a few commands or clicks, without any disruption to operations.

Storage Performance Optimization

The third use case for IO Insight is optimizing the performance of a storage network. For latency-sensitive applications, administrators can use IO Insight metrics to directly measure IO latency to any storage targets, so that they can make informed decisions to properly provision and deploy these applications. Furthermore, administrators can utilize the IO Insight metrics, in particular the Pending IO metric, to tune and optimize the performance of the overall storage network.

As mentioned in the previous section, storage arrays have physical upper limits on the number of IOs that can be supported concurrently. This limit is known as queue depth. At the other end, server HBAs have queue depth settings that control the number of concurrent IOs on a LUN. HBAs and different
driver versions have default queue depth settings, which most likely are not optimized for a production environment.

To prevent exceeding a storage array’s queue depth limit, it is important to know the Pending IO values for each initiator. These can be obtained by iterating through all initiator flows. Administrators can use this data in conjunction with the number of servers to a storage port, and with the number of LUNs to a storage port, to determine if the HBA queue depth settings should be lowered. In addition, the Pending IO metrics can be compared with the latency metrics to set a higher HBA queue depth for latency-sensitive mission-critical applications and a lower queue depth for less critical applications.

IO Insight and Brocade Analytics Monitoring Platform

Brocade Analytics Monitoring Platform (AMP) provides the advanced storage telemetry to enable unmatched monitoring, advanced troubleshooting, and increased ROI. IO Insight provides some of the capabilities that are offered in AMP. As available on AMP, the SCSI IO latency and performance metrics can also be obtained through IO Insight nondisruptively. But AMP has many advanced capabilities beyond what IO Insight offers, as follows.

• Full automation: AMP supports automatic learning of all Initiator-Target and Initiator-Target-LUN flows passing through a fabric switch that has enabled virtual tap (vTap). This means that administrators do not need to specify a source device, destination device, or LUN ID to gain visibility of the IO performance. This automatic learning capability can be critical in a large storage network that supports thousands of devices.
• Deeper visibility: AMP provides full visibility to all SCSI commands, beyond the SCSI Read and Write commands that are offered by IO Insight. In addition, because the metrics are obtained on the AMP device, AMP can directly measure and monitor the fabric latency if vTap is enabled on both host ports and target ports. This deeper level of visibility on AMP makes it a powerful tool to perform extensive diagnostics and troubleshooting.
• High scalability: As a dedicated storage analytics appliance, AMP is highly scalable: Its capabilities range from monitoring a few Gen 5 and Gen 6 platforms with hundreds of flows, to monitoring a large fabric with tens of thousands of flows. Both performance and latency thresholds are monitored, and violations can be alerted at time windows of a day, minutes, seconds, and down to each IO command.
• Customized management interface: Brocade Network Advisor supports deeper integration with AMP capability, offering a dedicated Analytics Monitoring dashboard with default widgets. Brocade Network Advisor supports deep AMP metrics retention for up to two years. In addition, the Brocade Network Advisor report generation capability dedicated for AMP provides crucial operational summary on demand.

With these key differentiations, AMP is recommended for end-to-end IO monitoring and advanced troubleshooting in environments with Gen 5 or Gen 6 switches, while IO Insight is recommended for storage performance monitoring with Gen 6 switches.

Summary

IO Insight is one of the key offerings of the Brocade Gen 6 Fibre Channel technology. Data center administrators can leverage this integrated capability to gain visibility on application IO performance to ensure SLA compliance, troubleshoot IO performance problems, and optimize the storage performance in their environment. This unique nondisruptive solution further advances the benefits of Brocade Fabric Vision technology by offering simplified management, increased operational stability, and reduced costs.

About Brocade

Brocade networking solutions help organizations achieve their critical business initiatives as they transition to a world where applications and information reside anywhere. Today, Brocade is extending its proven data center expertise across the entire network with open, virtual, and efficient solutions built for consolidation, virtualization, and cloud computing. Learn more at www.brocade.com.

Corporate Headquarters
San Jose, CA USA
T: +1-408-333-8000
info@brocade.com

European Headquarters
Geneva, Switzerland
T: +41-22-799-56-40
emea-info@brocade.com

Asia Pacific Headquarters
Singapore
T: +65-6538-4700
apac-info@brocade.com

© 2016 Brocade Communications Systems, Inc. All Rights Reserved. 06/16 GA-SB-5766-00
Brocade, Brocade Assurance, the B-wing symbol, ClearLink, DCX, Fabric OS, HyperEdge, ICX, MLX, MyBrocade, OpenScript,
VCS, VDX, Vplane, and Vyatta are registered trademarks, and Fabric Vision is a trademark of Brocade Communications Systems,
Inc., in the United States and/or in other countries. Other brands, products, or service names mentioned may be trademarks
of others.

Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any
equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this
document at any time, without notice, and assumes no responsibility for its use. This informational document describes features
that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of
technical data contained in this document may require an export license from the United States government.
