
Application Monitoring

Jeremy Kalsow

The Northwestern Mutual Life Insurance Company Milwaukee, WI

Why Application Monitoring


Majority of all corporations
Northwestern Mutual
Total 1,000+ servers
Team is 6 people

Team uses 16 servers


Average 50 applications per server

Need a way to know status fast

What is it?
The ability to monitor performance and availability

Gather metrics
Show trends

Pretty pictures for management

Why?
Trends predict future problems
Solve application issues faster
Uptime relates directly to profit for many companies

View all applications, servers, databases and other items being monitored with a single dashboard.

Types of Monitoring
Fault
Performance
Configuration
Security

Accounting

Fault
Detects major errors
Easy to implement
Examples
Network loss
Database connectivity

Very Important

Fault

Type of Monitoring        What to Monitor        When to Monitor
Hardware
  CPU utilization         CPU load               Load > 99% for x minutes
  Memory utilization      Memory load            Load > 99% for x minutes
  Storage system          Available space        System out of space
Applications
  Application available   Application working    Working or error
  Application logs        Error log monitoring   If error occurred
Databases
  Database online         Database is online     Database is up/down
Network
  Latency                 Latency                Latency > acceptable range
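
The "Load > 99% for x minutes" rule in the table can be sketched in Python as follows. This is a minimal illustration, not the tooling the team used: the Sample type, the function name, and the 5-minute default for "x" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    minute: int       # minutes since monitoring started (hypothetical unit)
    load_pct: float   # CPU or memory load, in percent


def sustained_breach(samples, threshold_pct=99.0, window_minutes=5):
    """Return True if load stayed above threshold_pct for at least
    window_minutes across consecutive samples."""
    breach_start = None
    for s in sorted(samples, key=lambda s: s.minute):
        if s.load_pct > threshold_pct:
            if breach_start is None:
                breach_start = s.minute   # first sample of this breach
            if s.minute - breach_start >= window_minutes:
                return True
        else:
            breach_start = None           # load recovered, reset the window
    return False
```

Requiring the load to stay high for the whole window, rather than alerting on a single sample, avoids raising a fault for a brief spike.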

Performance
Slow performance
Service Level Agreements
Metrics
Old and new metrics

Visual Display

Performance

http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html

Configuration
Configuration variables
Connectivity
Speed
Performance

Proactive Servers and Applications

Configuration
Why would the configuration change?
Hardware
Storage
Service packs

Hot fixes
Windows Updates
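
One way to notice that a configuration has changed is to fingerprint a snapshot of it and compare against the last known snapshot. The sketch below assumes the configuration can be captured as a plain dictionary; the function names and the keys in the usage note are hypothetical.

```python
import hashlib
import json

_MISSING = object()  # marks a key that is absent from one snapshot


def config_fingerprint(config):
    """Hash a configuration snapshot so two snapshots can be compared cheaply."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_keys(old, new):
    """List the keys whose values differ between two snapshots."""
    return sorted(
        k for k in old.keys() | new.keys()
        if old.get(k, _MISSING) != new.get(k, _MISSING)
    )
```

For example, comparing {"service_pack": "SP1", "storage_gb": 500} with {"service_pack": "SP2", "storage_gb": 500} gives different fingerprints, and changed_keys reports ["service_pack"].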

Security
Attempts to access the system
Open ports
Inventories
Firewall

Packets
System events

Blocked Exploits

Accounting
Monitors usage
Generally used for fees
Profit/Loss

Example
Electric Company
Northwestern Mutual

Types of Monitoring Recap


Fault
Performance
Configuration
Security

Accounting

Types of Monitoring Recap


Historical data
Baseline test
Current test
Performance disagreements
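
Comparing a current test against a baseline test can be sketched as below. This assumes metrics where a higher value is worse (such as response time in milliseconds); the metric names and the 10% tolerance are illustrative assumptions, not values from the presentation.

```python
def compare_to_baseline(baseline, current, tolerance=0.10):
    """Return the metrics whose current value is more than `tolerance`
    (as a fraction) worse than the baseline value.
    Assumes higher values are worse."""
    regressions = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # nothing to compare against
        change = (current[name] - base_value) / base_value
        if change > tolerance:
            regressions[name] = round(change, 3)
    return regressions
```

With a baseline of {"login_ms": 200, "search_ms": 400} and a current test of {"login_ms": 210, "search_ms": 500}, only search_ms is flagged, with a relative change of 0.25.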

Types of Monitoring Recap


Allows for trends to be seen
Modifications can be made
Trends over multiple releases

Types of Monitoring Recap


Monitoring is important
Not enough time is given
Implemented after discovery of an issue
Monitoring only in areas of known problems

Adding monitoring requires time and money

Challenges of application monitoring


Various types of systems
Shared
Clustered
Virtualized

Production logging

Shared Systems
1 server / Multiple applications
System resources are shared
Tracking individual usage is difficult
Many applications may be impacted

Server without access (production)

Clustered Systems
Applications on more than one server
Avoid single point of failure
May be hard to target the issue

Production Logging
Generally limited
Most errors repeated in test
Application downtime
Use of company resources

Implement Application Monitoring


Plan Early
Monitor Proactively
Create a Recovery Plan
Create and use SLAs

Plan Early
Planning stage
Add monitoring during development
Late additions cover known issues

Monitor Proactively
Harder to implement
Issues are dealt with before end user knows

Monitor Proactively
Tools-based approach
Easy and relatively fast setup
No code
Multiple applications

Monitor Proactively
Logging is directly in the code
Less efficient
More specific
Developers have less time
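
Logging placed directly in the code might look like the following sketch, which uses Python's standard logging module. The decorator, the logger name, and lookup_policy are hypothetical names chosen for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger("app.monitoring")  # hypothetical logger name


def monitored(func):
    """Log the duration of each call and any exception it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception("%s failed", func.__name__)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s completed in %.1f ms", func.__name__, elapsed_ms)
        return result
    return wrapper


@monitored
def lookup_policy(policy_id):  # hypothetical application function
    return {"policy_id": policy_id, "status": "active"}
```

This is more specific than a tools-based approach because each function reports its own timing and errors, but every monitored function has to be touched by a developer.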

Create a Recovery Plan


Fast resolution
Knowledge management

Recovery Plan Template

Service Level Agreements


What percentage of time the services will be up (uptime)
How many people can use the application at once without performance issues
Performance metrics and benchmarks to be used with performance monitoring alerts
The rules for notification announcements
What statistics will be monitored, and when and where they will be available
Acceptable response time
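
An uptime percentage in an SLA translates directly into a downtime budget. The sketch below shows the arithmetic; the 30-day period and the function names are assumptions for the example.

```python
def allowed_downtime_minutes(uptime_pct, period_days=30):
    """Downtime budget, in minutes, implied by an uptime target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_pct) / 100.0


def measured_uptime_pct(downtime_minutes, period_days=30):
    """Uptime percentage achieved given the observed downtime."""
    total_minutes = period_days * 24 * 60
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes
```

A 30-day period has 43,200 minutes, so a 99.9% uptime target leaves roughly 43.2 minutes of allowed downtime.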

Service Level Agreements

Using the Statistics


Visual display
Alerts
Tickets

Visual (Dashboard)
Easily view statistics
Comparison results
Trend comparison
Cross-platform

Auto-generated management reports

Dashboard

Alerts and Tickets


Auto-generated alerts
Tickets for queue system
Vital information in each

Alerts and Tickets


Most common: Email
Text, popup, printout, recording and more
Tickets: auto-generated
Knowledge databases

Common fixes and resolutions
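
An auto-generated ticket that carries the vital information and looks up a common fix could be sketched as below. The Ticket fields, the severity rule, and the shape of the knowledge database (a dictionary keyed by component and check) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Ticket:
    component: str   # what failed, e.g. a server or database name
    check: str       # which monitoring check failed
    detail: str      # message captured from the failed check
    severity: str
    opened_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def ticket_from_failure(component, check, detail, known_fixes=None):
    """Build a ticket for a failed check and look up a known fix, if any."""
    known_fixes = known_fixes or {}
    # Hypothetical rule: availability and database failures are urgent.
    severity = "high" if check in ("availability", "database") else "normal"
    ticket = Ticket(component, check, detail, severity)
    suggestion = known_fixes.get((component, check))
    return ticket, suggestion
```

Returning the suggested fix alongside the ticket lets whoever picks it up from the queue start from a resolution that has worked before.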

Application Monitoring
Maximize application uptime
Higher end user satisfaction
Higher profit

References
Polozoff, A. (2003, April 9). Proactive Application Monitoring. IBM - United States. Retrieved October 20, 2011, from http://www.ibm.com/developerworks/websphere/library/techarticles/0304_polozoff/polozoff.html
Choice. (2009, December 20). Application Monitoring. Adminschoice - Unix Made Easy. Retrieved October 31, 2011, from http://adminschoice.com/application-monitoring
Application Monitoring Software - uptime software. (n.d.). Server Monitoring Software - IT Systems Management, Capacity Planning, Application and Server Monitoring Tool by uptime software. Retrieved October 31, 2011, from http://www.uptimesoftware.com/application-monitoring.php
Marko, K. (2005, December 30). Proactive Application Monitoring. Processor.com: Data Center IT Equipment at Processor, Routers, Storage, Rackmount Servers, Computer Room Cabling and Flooring. Retrieved October 29, 2011, from http://www.processor.com/editorial/article.asp?article=articles%2Fp2752%2F43p52%2F43p52.asp
IT Service Level Agreement Templates | ContinuityPlanTemplates. (n.d.). ContinuityPlanTemplates | Free Business Continuity Plan (BCP) Templates. Retrieved October 30, 2011, from http://www.continuityplantemplates.com/it-service-level-agreement-templates

XML

Upcoming events with Dashboard


Ability to display visualized graphs and other pertinent information

Ability to click a failed component and have the system auto generate a ticket
Ability to alert others of the issue found
Performance monitoring as well as fault