Development
Operations
Web Analytics
Virtualization
Third Parties
Java/.NET
Database
Network
Storage
Server
APM in 2010
End User Experience Monitoring Application Component Deep Dive
1. Captures the End User Experience of an application or service
2. Captures rich statistics regarding components and component domains
PMDB
Performance Management Database
APM in 2015
APM 2010
1. EUE, deep dive, application model, transaction flows, PMDB
2. Policy setting and workflow orchestration
3. Understand and analyse application patterns and spot deviations
4. Distributed knowledge capture, knowledge sharing and improvements
5. Support cloud model and end-to-end management, off premises and on
6. Monitor resource usage
Cloud Enablement
Introduction to APM
Application Performance Management
End-user
What is the end-user experience?
Enterprise / Business
Why Manage End-user Experience?
Operations
Typical Enterprise Requirements
The Application Performance Challenge: Problems Everywhere Along the Delivery Chain Traditional Operational monitoring
Development
Traditional development flow
The Answer: Adopt an Application Point of View That Starts with the User
Application Point of View that Starts with the End User
Data Center Cloud: Private and Public Users
ISPs Mobile carriers Browsers Devices AJAX JavaScript Mobile apps
Web Mobile App logic Database Network Mainframe Virtualization SOA CDNs Third party services
Customers
Application
Application
Employees Infrastructure
Cost REDUCED
Revenue IMPROVED
Isolate Remediation
Root Cause Resolve
Response Times
Operations: The Monitoring Challenge: Problems Everywhere Along the Delivery Chain
The Application Delivery Chain
Data Center Cloud: Private and Public
Web Mobile App logic Database Network Mainframe Virtualization SOA CDNs Third party services Inconsistent geo performance Bad performance under load Blocking content delivery Poorly performing Java or .NET methods Application Slow SQL or Web services transactions Server performance
Users Resource contention Mobile carriers Browsers ISPs Capacity issues Devices AJAX JavaScript Mobile apps Slow bursting
Customers
Poorly performing JavaScript Browser/ device incompatibility Pages too big Low cache hit rate
Infrastructure
Configuration issues Oversubscribed POP Poor routing optimization Low cache hit rate
Employees
Network resource shortage Faulty content transcoding SMS routing / latency issues
INTERNET
I'm on it!
Storage DB Servers
Web Servers
Local ISP
Mobile Carriers
Not my Problem!
APPLICATION TEAM DATA CENTER NETWORK TEAM Third-party/ INTERNET CUSTOMERS
Not my Problem!
Storage DB Servers
Local ISP
App Servers
Not my Problem!
MAINFRAME TEAM Content Delivery Networks
Mobile Carriers
Response Time
Response Time
Server Monitoring
Response Time
Network Monitoring
Response Time
Response Time
Application Monitoring
Load Balancer
Virtualized Application Servers Web Services, RSA Log File SAN Message Queue
Wily Introscope, Mercury Topaz, OV Transaction Analyzer, dynaTrace, OpTier, IBM ITCAM
Response Time
Response Time
Database Monitoring
Quest Software, IBM Tivoli, Quest Foglight, Precise, Oracle App SAN 1000 GB RSA SAN 250 GB
Database Instance
War Room
APPLICATION TEAM DATA CENTER
NETWORK TEAM
INTERNET
Storage DB Servers
Web Servers
Service Manager
Network
SERVER TEAM
Local ISP
App Servers
Load Balancers
MAINFRAME TEAM
Major ISP
Mobile Carriers
Development
Test/QA
Production
Development
Test/QA
Production
Business
Business impact $
Development
Test/QA
Production
dynaTrace PurePath
On-Premises
dynaTrace Enterprise Analysis
DATA CENTER INTERNAL USERS INTERNET
SaaS
Gomez SaaS multi-tenant data store
CUSTOMERS
Storage
DB Servers
App Servers
Local ISP
Major ISP
Mobile Carriers
RUM Browser Mobile
dynaTrace
Java .NET
Streaming
Mobile
Backbone
Last Mile
Enterprise
Internet
First Mile
Application monitoring
Enterprise
Backbone
Last Mile
Real Users
Browsers Customers
Data Center
Virtual/Physical Environment DB App Multi-tier transactions Servers
Java/.NET analysis
Mainframe
Servers
Web Servers
Balancers
Browsers
Storage
Web Services
Mobile Components
Data centers & cloud providers
Content Delivery Networks
5,000+ supported mobile devices
Devices
Employees
Mobile apps
Gomez Mobile Carrier Data Monitoring (a.k.a., Mobile Carrier Vantage Service Check) VantageView VantageView (no change)
Test/monitor your app the way users access it:
- What they do: key transactions
- Where they do it: geographic locations
- How they do it: fat clients, browsers and native devices
All tiers, all transactions, all users
Prioritize & Resolve Issues:
- Measure the business impact on users
- Isolate root causes
- Deep application and transaction analysis
Browsers
Deep analysis
Application
PurePath
Mobile apps
DCRUM Capabilities
- Agentless real user monitoring
- Unifies network and application reporting
- Monitors all data center tiers in one dashboard
- Optimizes EUE for web and non-web applications
- Diagnoses root-cause application problems through dynaTrace integration
DCRUM Differentiators
EUE: all users, all transactions
- Web and non-web applications
- ERP: SAP, Oracle EBS
- Business core: IBM MQ, XML middleware, mainframe front-end
End-to-end: whole ADC
- Whole Application Delivery Chain
- Multi-vendor integration and multi-tier view
- Network influenced monitoring captures all transactions
Actionable data
- Business impact
- Application-specific decodes (28+)
- All users, all transactions, granular
Simplicity of deployment
- No software agents to deploy or maintain
- Out-of-the-box and bespoke reporting
- Industry-leading scale for monitoring
Internet
Firewall
Virtualized App Server Load Balancer Virtualized App Servers Web Services Message Queue
Database Instance
This combination delivers systems excellence and solution differentiation, providing our customers with choice and flexibility to respond to the ever-changing demands of the business.
Customers can:
- improve application performance
- increase scalability
- simplify operations
Application Infrastructure
Virtual and physical environments KEY EXAMPLES
Process Automation
Existing solutions e.g., Service Desk and Event Management KEY EXAMPLES
Cloud Services
CDN, Cloud provider, and third parties KEY EXAMPLES
Data Center Analysis View provides instant visual indication of problem areas with 1-click access to detailed troubleshooting information.
Browser / Rich-Client
Web Server
Java
.NET
Other
Database
Synthetics
End-to-End Transaction Execution Path
- Across tiers: browser, servers, database
- Remoting
- Web Services
- External services
- Code-level depth
- Heterogeneous: .NET & Java
Thread Dumps
Monitoring data
Synchronization
Exceptions Logs
dynaTrace Session
Development Developers, CI
Production Edition
How often does the same issue resurface in production release to release? How often does the same bug reappear?
Automatically detect & Analyze Regressions
Detect performance and reliability regressions early. Compare performance and behavior of a current build to previous versions and baselines. Automate analysis to enable you to focus on features instead of debugging.
Application not scaling in production after passing QA?
Assure a Scalable, Performing Architecture
PurePath Technology provides true end-to-end tracing: browser to web server to app server to database. Visualize application behavior under load, even for large, complex applications, to prevent scalability issues from reaching production.
What does "fast" or "slow" really mean? What does "it performs well and it scales" really mean?
Meet Performance Goals With KPIs
Measure, track and alert against KPIs: service level, throughput and response time. Compare performance relative to your competition with SpeedoftheWeb.
Debugging applications in the test environment? Firefighting in production?
Automate Collaboration & Resolution
Capture the root cause of issues when they occur so engineers can simply replay, at code level, precisely what happened.
Alerts publish captured PurePath Sessions to issue tracking systems for engineers to access immediately.
Gomez SaaS Network: The World's Most Comprehensive Performance and Testing Network
Last Mile Web Performance Management and Load Testing 150,000+ locations
Cross-Browser Testing: 500+ browser/OS combos
Real-user Monitoring: worldwide, wherever your users are
5,000+ supported devices
Outside-in perspective of cloud service provider performance
- Real-time data
- Historic comparisons
- Performance & availability bottleneck identification
- Independent validation of providers' SLA claims
Future APM
Compuware Delivers
TODAY
Provide visibility into the performance of heterogeneous applications from the enterprise to the cloud NEAR TERM
Active Management
Compuware Concepts
- Information Gathering
- Protocol Analyzers
- Software Services
- Operations, Applications and Transactions
- Reporting Hierarchy
- Tiers
- Locations
- Metrics
Information Gathering
Application monitoring can only be as good as its definition. Therefore, gather as much information as possible about the to-be-monitored applications. Minimally:
- Logical application topology information
- IP address (range) supporting the services for this application
- Port number (range) supporting the services for this application
Information Gathering
Synthetic Auto-check
Synthetic Transaction
Protocol Analyzers
A protocol analyzer (a.k.a. decode) monitors, parses, and analyzes a network protocol in the monitored traffic.
- Some analyzers perform transaction monitoring: they can recognize exchanges of information where there is a recognizable question-and-answer dialog
- Licensed features
- Examples: TCP, HTTP, HTTPS, XML, MSSQL and Oracle
Software Services
Services that support an application at different levels, for example on a Web, Application or Database level.
They are minimally defined by a server IP (range) and a server port (range) together with a protocol, for example:
- HTTP service on server IPs 10.10.10.1-10.10.10.3 on port 80
- SOAP service on server IP 10.10.10.4 on port 8080
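A minimal sketch of how such a definition could be represented and matched against observed traffic, using the two example services from the slide. The class, field, and service names are hypothetical illustrations, not DC RUM configuration syntax.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class SoftwareService:
    """Hypothetical record: a protocol plus a server IP range and port range."""
    name: str
    protocol: str
    first_ip: IPv4Address
    last_ip: IPv4Address
    first_port: int
    last_port: int

    def matches(self, server_ip: str, server_port: int) -> bool:
        """True when the endpoint falls inside both the IP and port ranges."""
        ip = IPv4Address(server_ip)
        return (self.first_ip <= ip <= self.last_ip
                and self.first_port <= server_port <= self.last_port)


# The two examples from the slide; the service names are invented.
SERVICES = [
    SoftwareService("Web front end", "HTTP",
                    IPv4Address("10.10.10.1"), IPv4Address("10.10.10.3"), 80, 80),
    SoftwareService("SOAP back end", "SOAP",
                    IPv4Address("10.10.10.4"), IPv4Address("10.10.10.4"), 8080, 8080),
]


def classify(server_ip: str, server_port: int):
    """Return the first service that contains the endpoint, or None."""
    for service in SERVICES:
        if service.matches(server_ip, server_port):
            return service
    return None
```

Traffic to 10.10.10.2:80 would be attributed to the HTTP service, while traffic to an address outside both ranges matches no service.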
Transaction A
Application 1
Transaction B
Transaction C
Physician Login
URL (http://10.21.79.243/physician/login.do)
Admin Login
URL (http://10.21.79.243/admin/login.do)
Patient Login
URL (http://10.21.79.243/patient/login.do)
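The three login transactions above are each identified by a URL. As a rough illustration, assuming a transaction is recognized purely by its URL path (the lookup table and function name are hypothetical), the mapping could be sketched as:

```python
from urllib.parse import urlsplit

# URL paths taken from the slide; the dictionary structure is an assumption.
TRANSACTIONS = {
    "/physician/login.do": "Physician Login",
    "/admin/login.do": "Admin Login",
    "/patient/login.do": "Patient Login",
}


def transaction_for(url: str):
    """Return the transaction name for a URL, or None if it is not defined."""
    # Only the path is compared, so query strings do not affect the match.
    path = urlsplit(url).path
    return TRANSACTIONS.get(path)
```

Matching on the path alone means that a request carrying extra query parameters is still counted under the same transaction.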
Application Performance
Transaction Performance
Reporting Hierarchy
Hierarchy levels depend on the analyzer type. The CAS can report on up to four levels for the following traffic types:
- HTTP
- SAP GUI
- Cerner
- SOAP
- Any database
Each level can be reported independently or combined with the remaining ones. If you use DMI, you can create reports with entries from arbitrarily chosen hierarchy levels.
Reporting Hierarchy
In the current DC RUM release (12), the following division into hierarchy levels is supported:
- Operation: the first level in the hierarchy, for example URL, Query, SOAP Operation type
- Task: the second level in the hierarchy, for example Page name, Operation name, SOAP Method
Reporting Hierarchy
AMD Users
- Start by monitoring the initial entry point of the end user's transaction
- Add additional tiers for greater Fault Domain Isolation and Visibility
- Wide variety of transaction support: HTTP/S, Oracle/SQL/DB2 queries, SAPGUI, Oracle Forms, XML, MQ
CIO CTO IT Mgt Data Center Ops Monitoring Team Application Owners
Tiers
A tier is a specific layer where DC RUM collects performance data. Tiers are either pre-defined, or defined by the user in the Central Analysis Server (CAS).
Immediately after the CAS is deployed, data is reported based on the default tier configuration. If the default tier configuration does not fit your network architecture, you should configure tiers to match your topology.
Tiers are configured globally. You should not create separate tiers for individual applications.
Front-end Tiers
Best practice: mark as front-end the tier that is closest to the user or to a device that acts on behalf of the user. In short, this is the first layer the user connects to.
Network Tiers
Client Network: Wide Area Network (WAN) from remote sites. Manually and automatically defined sites (AS and CIDR blocks), except the "All other" site.
Network: Datacenter Local Area Network (LAN). The "All other" site.
Locations
DC RUM refers to locations as sites and defines them as IP address ranges. Locations can be defined in a three-level hierarchy in DC RUM:
- Site: lowest level of granularity
- Area: consists of one or more sites
- Region: consists of one or more areas
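To make the site, area and region roll-up concrete, here is a small sketch that resolves a client IP address to all three levels. All site names, CIDR blocks, and the fallback behaviour are invented for illustration; they are not taken from a real DC RUM configuration.

```python
from ipaddress import ip_address, ip_network

# Hypothetical sites defined as IP ranges (CIDR blocks).
SITES = {
    "Amsterdam Office": ip_network("10.20.0.0/24"),
    "Rotterdam Office": ip_network("10.20.1.0/24"),
    "Detroit Office": ip_network("10.30.0.0/24"),
}
# An area consists of one or more sites; a region of one or more areas.
SITE_TO_AREA = {
    "Amsterdam Office": "Netherlands",
    "Rotterdam Office": "Netherlands",
    "Detroit Office": "Michigan",
}
AREA_TO_REGION = {"Netherlands": "EMEA", "Michigan": "Americas"}


def locate(client_ip: str):
    """Return (site, area, region) for a client IP address."""
    ip = ip_address(client_ip)
    for site, network in SITES.items():
        if ip in network:
            area = SITE_TO_AREA[site]
            return site, area, AREA_TO_REGION[area]
    # Assumption: addresses outside every defined range fall into a catch-all site.
    return "All other", None, None
```

Because areas and regions are derived from the site, only the site level needs IP ranges; the two upper levels are pure groupings.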
Server operation size - The size of a server operation. In HTTP and HTTPS (decrypted and non-decrypted), server operation size equals the page size.
DCRUM Components
Enterprise Portal
Dashboards
Operational reports
3rd-party Integration
Service Model
Synthetic Monitoring
dynaTrace DTM
DCRUM Components
Central Analysis Server (CAS)
- The main reporting component for dynaTrace Data Center Real-User Monitoring
- Combines measurements from the Agentless Monitoring Device (AMD) using different contexts
- Pulls its data from the AMDs in the form of zdata sample files
- Stores its results in an MS SQL Server database
- Results can be viewed in real time or historically
Agentless Monitoring Device (AMD)
- Network probes that analyze network traffic
Console Client
- Used for configuring devices and application monitoring
Console Server
- Stores the configuration in a flat file database
DCRUM Components
Compuware Security Server (CSS)
- New functionality in the 12.0 release
- Provides a central authentication and user management capability for the Central Analysis Server, Console, Advanced Diagnostics Server, Enterprise Portal and BSM
- Allows users to be defined locally in a CSS database, or lets the customer use their own corporate user management system, such as the LDAP-based systems Active Directory or Apache DS
Advanced Diagnostics Server (ADS)
- A separate report server that is integrated with the CAS at the reporting and configuration level
- Provides a more detailed, troubleshooting-oriented analysis (i.e. element level for HTTP instead of page level on the CAS)
- Supports applications based on HTTP(S), XML over HTTP(S)/MQ, SAPGUI, DB2, MSSQL, Sybase, Informix, Oracle and Oracle Forms
DCRUM Components
- ADS pulls its data from the AMDs in the form of vdata sample files
- Stores its results in an MS SQL Server database
- Results can be viewed in real time or historically
Enterprise Portal (EP)
- Helps speed the isolation of the fault domain and reduces the cost of troubleshooting issues, while restoring service as quickly as possible
- Contains robust data mining and report building tools for creating new and customized reports quickly and easily
- Contains dashboards which display graphs, geographic views, and tabular data regarding service and application quality, fault domain isolation, business impact, and infrastructure health
- Consolidates reporting, security, and configuration functionality into a single component
Analysis Modules
Transaction decodes (analysis modules) include:
- HTTP/HTTPS
- SAP
- SOAP/XML
- Databases: MS SQL, Oracle, DB2, Sybase, Informix
- Oracle Forms
- IBM MQ
- MS Exchange
Analysers
Multi-purpose and Expandable Product Family
CAS (Web)
Oracle EBS HTTP(S) Siebel Fault Isolation Detailed HTTP MS Exchange Oracle Forms
Tuxedo/JOLT
Bus Trans
SAP GUI
SQL\ DB
TCP/IP
SOAP
Information Database
AMD
Flow Collector
Netflow data analysis (since v 10.2)
SMTP
Citrix
XML
DNS
MQ
CAS (Ent)
ADS
Analysis Modules
1. Real-time and historical trending views of application, user, network and overall data center performance
2. Supports web and non-web applications such as SAP.
3. Quickly identify poorly performing data center tiers.
4. Isolate network performance impact on applications and users.
5. Monitor baseline performance and availability with synthetic monitoring.
View overall status of applications and end-user performance through a single dashboard that includes quick drill down views into performance, availability, operation time and usage for individual applications and users.
Caption: Drill down from Application Health Status for a focused analysis of performance by data center tier. Isolating application performance problems in the multi-tier environments of today's application and data center architectures is a daunting task for IT, yet the business demands rapid problem isolation to reduce business impact. The new Data Center Analysis View provides instant visual indication of problem areas with 1-click access to detailed troubleshooting information. Isolate tier, server, time period, slow web pages, middleware messages, and database queries in a single interactive view that accelerates fault domain isolation.
1. Data Center Analysis provides real-time views of application performance, operations, availability and usage along with requests broken down by the supporting tier of infrastructure.
2. Historic detail of performance of tiers is displayed with mouse-over detail of how user and application performance is affected by the corresponding infrastructure tier.
3. Individual application operations are displayed in context of overall application performance, network health and end-user experience.
4. End-user performance is displayed for any infrastructure tier and can be sorted by user group, individual users or client types.
1. DCRUM provides a broad view across infrastructure to triage performance of services, servers, operations and websites.
2. Reports on affected users, transaction times and availability quickly surface hot spots in application performance.
3. From DCRUM dashboards, a direct drill down into dynaTrace reporting provides method call and code-level analysis of application performance issues.
1. Drill down from the affected users heat map to view individual user performance.
2. Identify the application(s) responsible for poor end-user performance.
3. For specific users, identify the offending application operation with a breakdown of slow, fast and aborted requests.
CSS
Alarms can be sent to a specified e-mail address, or can be sent via an SNMP trap.
There are also alarms that are generated even if they have no subscribers assigned. Such alarm notifications are recorded in the alarm logs, which store records of all alarms generated.
Metric alarms
These alarms provide a simple and fast mechanism for performing complex queries on a set of pre-defined metrics. The advantage of using these alarms is ease of use and modification as well as performance. To define metric alarms, you do not need to know the structure of the database or how to program in SQL. However, not all conditions can be expressed as metric alarms.
Network alarms
These alarms are similar in design and function to the metric alarms above, but they view the monitored traffic as it is presented on the Network View report.
Link alarms
These are fast-executing alarms designed to monitor link utilization as presented on the Link View report.
Other alarms
A few other alarms are available that were designed for very specific purposes. They can be modified only in limited ways and do not allow user access to the detector code.
RUM Console
Components
RUM Console consists of two components:
- RUM Console Server: a back-end server application that maintains configuration images and device information, runs tasks related to configuration management, and provides a Web services API for RUM Console to manage configurations. The server is a Windows-based service that can be installed on a machine running Windows 2003 Server or Windows 2008 Server R2 with a network connection to all of the managed devices within the Compuware APM infrastructure.
- RUM Console: a GUI application for configuring report servers and data collectors. With the console, you can create and edit configurations for Compuware APM devices and propagate such configurations to other Compuware APM devices.
RUM Console
Wizard configuration
Tracing ability Entire configuration: experienced user All same options Health reports Sequence transactions
Guided Configuration
Device information
AMD
The Agentless Monitoring Device (AMD) is a completely passive device, placing no additional load on the network. The AMD can be connected to the network in two ways:
- Spanning the switch: in today's switched environments, most switches can mirror multiple ports and/or multiple VLANs to a single monitoring port. This lets the AMD passively monitor traffic from a number of different perspectives, so it can see traffic in front of and behind load balancers, as well as all the tiers in between. Where the switch cannot accommodate more spans, regeneration taps can be used instead. Cisco switches may also use VLAN Access Lists (VACLs) to bridge routed traffic to an outgoing port in much the same way as port mirroring.
- Passive taps: in certain cases, the use of span ports may not be viable. Passive taps may then be used to capture the application traffic to be monitored. This method requires multiple tap points to see all tiers within the application.
AMD
- The AMD's job is to sniff traffic for the purpose of performance monitoring
- The AMD performs initial processing of the data
- Data is organized into files to be retrieved by report servers at configured time intervals
- Runs on Red Hat Enterprise Linux 5+ and 6+
- Hardware slots are filled with additional network interfaces for monitoring
  - Monitoring NICs are passive
  - Can be copper or fiber or mixed
- SSL decryption is performed on the AMD
  - RSA private key needed
  - SSL decryption card (Nitrox Cryptoswift): decryption processing is offloaded from the main CPU; RSA keys are guarded and are not stored on disk or in main memory
  - Software only: OpenSSL
- The AMD does not store or keep packet traces. It inspects packets to see the URL, the user ID, etc.
  - The exception is HTTP header request/response and POST data when using the ADS report server (optional)
  - Sensitive data can be masked
Breaks down the page load time by individual web page element (images, css, javascript, etc.)
Can be used to drill into the transaction to see the input submitted by the user (POSTed data).
Component Scaling
RECOMMENDED ARCHITECTURES
multi-threading
32 GB RAM
multi-threading
64 GB RAM
Scalability: each version brings more efficient traffic decoding, and the version 12 numbers are again somewhat better than those of version 11.7.
For the AMD, capacity differs per traffic profile. A few examples are shown below:
Practical data reduction levels will vary
Theoretical benefit: 3x to 7x reduction in number of sessions
RECOMMENDED ARCHITECTURES
AMD scaling
- Passive in-line tap or splitter
- AMD in load-balancing mode
- Intelligent switch (e.g. Gigamon, Anue)
- Each AMD analyzes one or part of one application
SPAN
Tap
AMD
AMD
AMD
RECOMMENDED ARCHITECTURES
CAS scaling
- Add more CAS servers and distribute data per monitored server IP
- Designate one CAS as master
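The idea of distributing data per monitored server IP can be pictured as a partitioning step in which every server IP is owned by exactly one CAS. The sketch below is only an illustration of that idea: the node names and the modulo rule are assumptions, and in a real deployment the assignment is an administrator's configuration choice rather than a computed one.

```python
from ipaddress import ip_address

# Hypothetical CAS node names; the first is designated as master.
CAS_NODES = ["cas-1", "cas-2", "cas-3"]
MASTER = CAS_NODES[0]


def cas_for(server_ip: str) -> str:
    """Pick the CAS that owns a monitored server IP (illustrative modulo rule)."""
    return CAS_NODES[int(ip_address(server_ip)) % len(CAS_NODES)]


def partition(server_ips):
    """Group monitored server IPs by the CAS that should process them."""
    assignment = {node: [] for node in CAS_NODES}
    for ip in sorted(server_ips, key=ip_address):
        assignment[cas_for(ip)].append(ip)
    return assignment
```

The property that matters is that the rule is deterministic: all measurements for a given server always reach the same CAS, so no server's data is split across report servers.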
AMD
CAS
CAS
CAS
Enterprise Portal
- ADS always acts as a slave server
- There are no performance reasons to set up a separate master CAS; just designate one of the CASes in the cluster
DMI back-end
RECOMMENDED ARCHITECTURES
AMD
CAS
CAS
ADS
SQL
Complex application and network interaction can demand more than real-time monitoring. DCRUM includes a Transaction Trace feature that provides the deep root-cause analysis needed to quickly remedy complex network problems.
Monitoring Citrix
VTCAM software is installed on presentation server (Citrix or MS Terminal Server)
- Runs a Windows service
- Collects CPU and memory utilization statistics of the Citrix host
- Maps back-end application traffic to the responsible end user (session mapping data)
CAS reports
CAS
Gomez User
Monitoring Citrix
Citrix Remote Users Citrix Server Farm
Appropriate Analysis Modules Web TCP level analysis AMD Applications
TCAM
Other Applications
A lightweight component is placed on the server to correlate user logins and back end Citrix conversations.
The agent uses Citrix API and Microsoft Windows API to obtain information on which user is opening which TCP sessions from the Citrix/WTS server. Agent communicates in real-time with AMD and provides mappings from TCP session IDs to Citrix user login names. This information is used by AMD to tag measurements taken on the Citrix<->application server path with actual user login names.
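The tagging step described above can be sketched as a simple lookup. This is a simplified illustration, not the actual agent-to-AMD protocol: it assumes a TCP session is identified by its 4-tuple, and the field names, addresses, and user names are all hypothetical.

```python
from typing import NamedTuple


class TcpSession(NamedTuple):
    """Assumed session key: the TCP 4-tuple seen on the Citrix-to-server path."""
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int


def tag_with_user(measurement: dict, session_users: dict) -> dict:
    """Return a copy of the measurement with a 'user' field added."""
    key = TcpSession(
        measurement["client_ip"],
        measurement["client_port"],
        measurement["server_ip"],
        measurement["server_port"],
    )
    tagged = dict(measurement)
    # None marks traffic for which no mapping was reported.
    tagged["user"] = session_users.get(key)
    return tagged


# Hypothetical mappings as an agent might report them.
session_users = {
    TcpSession("10.10.20.5", 50311, "10.10.10.1", 80): "jsmith",
    TcpSession("10.10.20.5", 50312, "10.10.10.4", 8080): "mjones",
}

sample = {
    "client_ip": "10.10.20.5", "client_port": 50312,
    "server_ip": "10.10.10.4", "server_port": 8080,
    "operation_time_ms": 420,
}
print(tag_with_user(sample, session_users)["user"])  # prints mjones
```

Without such a mapping, every session from the Citrix server would appear to come from the same client IP, which is why the per-session user name is needed to attribute measurements to individuals.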
Target Environments
- Citrix and WTS enabled applications
Deployment Considerations
- Additional resource utilization statistics for the Citrix server are also available (CPU, HDD, RAM, TCP, number of Terminal Services sessions and number of active Terminal Services sessions)
- One AMD can monitor multiple Citrix/WTS machines (different servers, different protocols)
- One CAS can gather data from multiple AMDs and provide a single view of service delivery
Monitoring Citrix
WAN Optimization Controllers (WOCs) are installed at branch office and data center locations
The AMD adds a SPAN or TAP on the optimized side of the data center WOC