
The Definitive Guide To™
Converged Network Management

Ken Camp


Introduction to Realtimepublishers

by Don Jones, Series Editor

For several years now, Realtime has produced dozens and dozens of high-quality books that just happen to be delivered in electronic format, at no cost to you, the reader. We've made this unique publishing model work through the generous support and cooperation of our sponsors, who agree to bear each book's production expenses for the benefit of our readers.

Although we've always offered our publications to you for free, don't think for a moment that quality is anything less than our top priority. My job is to make sure that our books are as good as, and in most cases better than, any printed book that would cost you $40 or more. Our electronic publishing model offers several advantages over printed books: you receive chapters literally as fast as our authors produce them (hence the "realtime" aspect of our model), and we can update chapters to reflect the latest changes in technology.

I want to point out that our books are by no means paid advertisements or white papers. We're an independent publishing company, and an important aspect of my job is to make sure that our authors are free to voice their expertise and opinions without reservation or restriction. We maintain complete editorial control of our publications, and I'm proud that we've produced so many quality books over the past years.

I want to extend an invitation to visit us at http://nexus.realtimepublishers.com, especially if you've received this publication from a friend or colleague. We have a wide variety of additional books on a range of topics, and you're sure to find something that's of interest to you, and it won't cost you a thing. We hope you'll continue to come to Realtime for your educational needs far into the future.

Until then, enjoy.

Don Jones

Table of Contents

Introduction to Realtimepublishers .... i

Chapter 1: Introduction to Unifying Network Management and Converged IP Communications .... 1
  Convergence Covers Many Areas .... 2
    Infrastructure Convergence in the Network: the Wiring and Circuit Convergence .... 2
    Service Convergence over a Common IP Networking Environment .... 3
    Device Convergence: the Convergence of Desktops, Laptops, Tablets, and PDAs .... 4
    Application Convergence: Integration of Enterprise Business Applications with Network Services .... 5
    Fixed Mobile Convergence .... 6
    VoIP as a Converged Service .... 7
      Voice vs. Data .... 8
      The Cost of Doing Business Drives VoIP .... 8
      Technical Definition vs. Market Definition .... 10
  VoIP or IPT? .... 12
    A Telephone Call Simplified .... 13
    Converting the Analog Signal to Digital .... 13
    When Does VoIP Become IPT? .... 14
    Intended for PC-to-PC Voice Messaging .... 15
      Never Intended to Replace Global Telephony .... 15
      "Free" Voice Was Compelling and Simple .... 15
    Pushes the Edge Over Convergence .... 15
      Multiple Play .... 16
      New Devices .... 16
    VoIP Can Include Advanced Applications .... 17
  Converged Communications Leads to Converged Applications .... 18
    FMC .... 18
  Summary .... 19

Chapter 2: Key Considerations in Effective Voice and Data Integration for a Changing IT/IP Landscape .... 20
  Quantifiable Business Processes .... 20
    Web-Centric Businesses .... 20
    Call Centers .... 22
    IVR Systems .... 24
    CTI .... 24
      CTI History .... 24
    Application Integration with CRM and Enterprise Resource Planning Systems .... 27
      Customer Relations and CRM .... 28
  Delivering Call Quality with VoIP .... 30
    Traditional Voice Characteristics: the PSTN .... 30
    IP Traffic Characteristics .... 31
  Design Considerations and Class of Service .... 33
  QoS .... 36
    QoS Approaches in IP Networks .... 37
      QoS: the Signaled Approach .... 40
      QoS: the Provisioned Approach .... 43
      QoS: the Bypass/Shim Approach .... 45
  Summary .... 49

Chapter 3: Business Drivers and Justification .... 50
  Vertical Market Business Drivers for Change .... 50
    Business Sales .... 56
      Web-Enabled Business .... 59
      Product Sales .... 61
      Service Sales .... 61
    Financial Services .... 63
    Health Care .... 64
    Manufacturing .... 64
  Financial Cost Reduction Drivers for Change .... 65
    Local Telephone Expenses .... 65
    Long-Distance Expenses .... 65
      Inside the Enterprise .... 66
      Outside the Enterprise .... 66
    Support Costs .... 66
      Help Desk Support .... 67
      Adds, Moves, and Changes .... 68
      Remote and Mobile Workers .... 68
  Strategic Drivers for Change .... 69
    New Applications .... 69
    Obsolescence of Legacy Systems .... 70
      Stop Investing in Legacy Technologies .... 70
      Manufacturer Discontinuation of Product and/or Support .... 70
      Rising Support and Maintenance Costs .... 71
      Inability of Legacy Solutions to Support Business Needs .... 71
  Summary .... 71

Chapter 4: Productivity Advantages of Unified Communications .... 73
  How Service Convergence Drives Productivity and Enables New Business Operational Processes .... 74
    Management and Utilization of Property .... 76
    Cost Reduction and Management .... 76
    Enterprise Agility .... 76
    Staff Productivity .... 77
      The Non-PC Workstation Environment .... 77
      Evaluating the Call Center Strategy .... 78
      Business Process Optimization .... 78
      Productivity in Resource Management .... 78
    Focusing on Employee Productivity .... 79
      Unified Messaging .... 79
      Personal Communications Assistants .... 79
      IP Video Solutions .... 80
    Web-Centric Businesses .... 80
      Integrating Voice with Sales, Service, and Support .... 81
    Call Centers: Localized, Distributed, Offshore .... 82
    IVR Systems .... 82
    Computer Telephony Integration .... 83
    Application Integration with CRM and ERP Systems .... 83
      Customer Relations and CRM .... 84
  Vertical Market Business Drivers for Change .... 85
    Business Sales and the Web, or Net-Enabled Business .... 87
      Product and Services Sales .... 87
    Financial Services .... 88
    Health Care .... 89
    Manufacturing .... 89
  Summary .... 89

Chapter 5: Key Steps in VoIP Deployment and Management for the Enterprise .... 92
  Network Readiness Assessment .... 92
    Ensuring Network Readiness for Converged Services .... 92
    Plan to Succeed: Don't Fail to Plan .... 93
      Analysis .... 94
      Planning .... 94
      Testing .... 94
      Acquire the Right Resources .... 94
      Look to the Future: Think Long Term .... 95
  Understanding Existing Voice Needs .... 95
    One Voice Factor: Compression Methods .... 96
    Voice Call Quality: The PSTN vs. VoIP .... 98
  Network Design Considerations .... 101
  The Performance Envelope .... 102
  Choosing Performance Envelope Characteristics to Measure .... 108
    Throughput .... 109
      Bandwidth .... 109
      Response Time .... 109
      CPU Utilization .... 110
      Network Segment Utilization .... 110
    Integrity/Reliability .... 110
      Packet Loss .... 110
      Jitter .... 110
      Delay/Latency .... 110
    Cost .... 111
    Availability .... 111
      Uptime: 99.999% .... 111
    Security .... 111
    Manageability .... 112
    Scalability .... 112
  Summary .... 112

Chapter 6: Impact Analysis, Root Cause, and Event Correlation .... 114
  SNMP .... 114
    What Is a MIB? .... 115
    The Architecture of SNMP .... 116
    Using SNMP .... 118
      snmpwalk .... 118
    Factors to Consider with SNMP .... 119
      Autodiscovery .... 120
      Negative Implications .... 120
  ICMP .... 121
    ICMP Message Format .... 123
    Reachability Testing .... 124
      ping .... 124
      Traceroute .... 125
  Syslog, Data Logging, and the Console .... 127
  Integrating Tools for Event Correlation .... 128
    Network Monitoring .... 129
      VoIP Service Elements to Monitor .... 129
      Monitoring Bandwidth and QoS .... 131
      Measurements and Metrics for Voice Quality .... 132
    The Do-It-Yourself Approach .... 133
      The SLA .... 133
      Pros and Cons of Rolling Out Your Own Management Platform .... 135
      Freeware and Open Source vs. Commercial Products .... 136
    Integrated Commercial Solutions .... 136
      Pros and Cons of a Packaged Management Platform .... 138
  Summary .... 138

Chapter 7: Effective Service Availability Management and Capacity Planning .... 139
  Introducing FCAPS: A Sustainable Model for Balanced Network Management .... 140
    Fault Management .... 142
    Configuration Management .... 143
    Accounting/Administration/Asset Management .... 144
    Performance Management .... 145
    Security Management .... 145
    FCAPS Simplified .... 146
  Optimization for Service Availability and Capacity .... 148
    Trending and Capacity Planning .... 149
      Bandwidth .... 149
      Ports/Lines .... 152
      Codec Planning .... 152
    Optimizing the Infrastructure .... 152
      Optimizing Hardware .... 153
      Optimizing Software .... 153
    SLA Optimization .... 154
      Delay .... 154
      Delay Variation .... 155
      Packet Loss .... 156
  Summary .... 159

Chapter 8: Effective Network Configuration, Network Fault, and Network Performance Management .... 161
  Fault Management .... 161
    Network and Fault Management: An Integration Strategy .... 164
    Caveats of Implementation Strategies for NMSs .... 164
    Recognize, Isolate, and Detect Faults .... 165
    The Role of the NMS .... 166
      SNMP and Fault Management .... 167
      Syslog and Fault Management .... 169
    Fault Management and ROI .... 169
  Configuration Management .... 170
    Collecting and Storing Configuration Data .... 171
    Configuration and Change Management .... 171
  Performance Management .... 173
    QoS and Bandwidth Monitoring .... 178
    Collecting and Analyzing the Data .... 180
      Monitoring the Health of the Network .... 180
      Performance and Utilization Trends .... 180
    Administration Management for Performance and Planning .... 181
      Gathering Usage Statistics .... 181
      Managing Backups and Synchronization for Performance .... 181
  Summary .... 182

Chapter 9: Effective Security Management .... 184
  FCAPS and Security Management .... 185
    Identifying Risks .... 186
      Inconsequential Risk Classes .... 187
      Significant Risk Classes .... 187
    Introducing Incident Response Planning: CSIRT Basics .... 192
  Building Processes for Defense in Depth .... 193
    User Authentication and the Gold Standard .... 195
      Authentication .... 195
    Authorization .... 197
    Auditing .... 197
    Prepare to Respond .... 198
    Policies, Procedures, and Awareness .... 198
    Physical Access and Environment Controls .... 199
    Perimeter: First Line of Defense Between the Internet and Internal Networks .... 199
    Network: Internal Network Layer .... 200
    Systems: Server and Client OS Hardening Practices .... 202
    Application: Application Hardening Practices .... 203
    Data: Protection of Customer and Private Information .... 203
  Summarizing Defense in Depth .... 203
  Prevention .... 204
    Protect the Protection .... 205
  Detection .... 205
  Reaction .... 205
    Proactive and Reactive Strategies .... 206
  Reactive Strategies .... 208
  Proactive Strategies .... 208
  Testing the Strategy .... 210
  Summary .... 210

Chapter 10: Asset Reporting, Audit Compliance, and IT Documentation .... 211
  FCAPS and Asset/Administration/Accounting Management .... 211
    Accounting/Administration/Asset Management .... 211
      Managing Billing .... 212
      User Accounting .... 213
      The Asset Management View .... 213
  Why Documentation? Why Process? .... 214
    Creating Repeatable Processes .... 214
    Revisiting the Importance of Knowing Your Environment .... 214
    Be Prepared: Protecting Against Future Business Issues .... 215
  Legal and Regulatory Issues to Consider .... 216
    SOX .... 216
      Risk Assessment .... 217
      Control Environment .... 217
      Control Activities .... 217
      Monitoring .... 218
      Information and Communication .... 218
    GLBA .... 218
    HIPAA .... 219
  Methodologies in Best Practices for Management and Oversight .... 222
    ITIL .... 222
    ISO 17799 .... 226
      Business Continuity Planning .... 226
      System Access Control .... 229
      Physical and Environmental Security .... 230
      Compliance .... 230
      Asset Classifications and Control .... 230
      Security Policy .... 230
  Managing and Protecting the Network .... 231
    Inconsequential Risk Classes .... 231
    Significant Risk Classes .... 231
    Reducing Risk for Single Occurrence Losses .... 232
    Addressing the Risks .... 233
    Risk Management Life Cycle .... 233
  Summary .... 234


Copyright Statement

© 2007 Realtimepublishers.com, Inc. All rights reserved. This site contains materials that have been created, developed, or commissioned by, and published with the permission of, Realtimepublishers.com, Inc. (the "Materials") and this site and any such Materials are protected by international copyright and trademark laws. THE MATERIALS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. The Materials are subject to change without notice and do not represent a commitment on the part of Realtimepublishers.com, Inc or its web site sponsors. In no event shall Realtimepublishers.com, Inc. or its web site sponsors be held liable for technical or editorial errors or omissions contained in the Materials, including without limitation, for any direct, indirect, incidental, special, exemplary or consequential damages whatsoever resulting from the use of any information contained in the Materials. The Materials (including but not limited to the text, images, audio, and/or video) may not be copied, reproduced, republished, uploaded, posted, transmitted, or distributed in any way, in whole or in part, except that one copy may be downloaded for your personal, noncommercial use on a single computer. In connection with such use, you may not modify or obscure any copyright or other proprietary notice. The Materials may contain trademarks, services marks and logos that are the property of third parties. You are not permitted to use these trademarks, services marks or logos without prior written consent of such third parties. Realtimepublishers.com and the Realtimepublishers logo are registered in the US Patent & Trademark Office. All other product or service names are the property of their respective owners.
If you have any questions about these terms, or if you would like information about licensing materials from Realtimepublishers.com, please contact us via e-mail at info@realtimepublishers.com.

[Editor's Note: This eBook was downloaded from Realtime Nexus, The Digital Library. All leading technology guides from Realtimepublishers can be found at http://nexus.realtimepublishers.com.]

Chapter 1: Introduction to Unifying Network Management and Converged IP Communications


The convergence of voice and data networks has been evolving and gaining momentum for several years. Although networks today have not fully converged, they are moving toward convergence in several ways. Many organizations are implementing Voice over Internet Protocol (VoIP) in an effort to cut communications costs or leverage the competitive advantage of integrated services, and VoIP implementers often focus on voice quality and interoperability, important factors in the delivery of Quality of Service (QoS). However, convergence really means much more than that. Today, converging networks and net-centric applications are changing everything about network management. Network administrators need to manage and monitor a wide variety of network elements. They have to understand more complex network events and respond more quickly than ever. Effective management for these integrated network technologies (data, voice, video, wireless, and so forth) is crucial to network operations.

This guide will highlight the overall service management challenges facing enterprise business and identify common industry best practices for effectively managing an integrated, unified communications environment using VoIP. It will offer a series of systematic and holistic techniques for managing the total integrated network to ensure consistent service delivery and support for ongoing business operations.

The term convergence is generally used in reference to the integration of telephony with data services and applications as well as video onto a single network. This single network is frequently assumed to be the Internet, but the convergence of services is bringing voice and data networks closer together in many ways. These technologies all used dedicated, separate resources in the past but can now share resources and interact with each other, creating new efficiencies for business. The IP data network is evolving well beyond just VoIP. Video technologies are blending and overlapping with VoIP.
Video VoIP (VVoIP) is becoming an accepted business service. With new operating system (OS) evolutions ahead and increased difficulty in air travel, many businesses are seriously exploring video collaboration as an alternative to traditional travel. Beyond video, mobility is a prime consideration in today's business environment. The evolution of wireless technologies to broadband services has increased productivity for mobile workers. Another looming aspect of convergence is the convergence of the wired enterprise data network with the wireless cellular networks. This fixed mobile convergence (FMC) will surely gain momentum as the technologies and handsets mature.

A trend is underway in which voice and data communications are merging. The irresistible logic is that digitized voice is just another kind of data, so why not carry it on the same data links that handle all your other ones and zeros? The economies of convergence can be considerable: there is no need to build and support separate voice and data infrastructures when you can have just one. That combination of infrastructures presents its own challenge, however: data people, who haven't worried about voice in the past, have to worry about it now. Similarly, people who used to specialize exclusively in switched voice circuits must adapt to the new environment. From enterprise business to small business to consumer, the end user doesn't care what network delivers services. The ability to work from any single device, anywhere, any time is more in demand today than ever in history. This chapter will review the broad aspects and implications of convergence in several forms.

Convergence Covers Many Areas


Five years ago, when people spoke of convergence, they really meant the evolution to VoIP or IP Telephony (IPT). Over those years, convergence has emerged in many different shapes and serves multiple purposes. It means different things to different people. Let's briefly explore the different nuances of convergence to fully understand what is happening in the network technology evolution.

Infrastructure Convergence in the Network: the Wiring and Circuit Convergence

Infrastructure convergence has taken place over nearly 20 years in different forms. The earliest examples of infrastructure convergence were brought on within the corporate building, driven by advances in LAN technology and improvements in digital PBXs. This initial convergence was the shift to a single Category-5 unshielded twisted-pair (UTP) wiring run to every work cubicle. Wiring plans simplified to bring voice and data services onto a single wired infrastructure in the office. This premise-based convergence was driven by cost and convenience, but it set the stage for further network infrastructure integration.

Large enterprise businesses struggled for years with the separation of telephony and data services. Organizations bought circuit connections from carriers in staggering numbers:

- Primary Rate T-1s (PRIs)
- Basic Rate ISDN lines for telecommuters and small offices
- POTS lines for home offices
- Point-to-point circuits to connect PBXs
- Frame Relay T-1s, both full and fractional
- Point-to-point circuits for SNA and other data needs
- POTS lines for dial-up to Remote Access Service (RAS)

Billing and administrative tracking of all this circuit connectivity placed a huge burden on enterprise business. Validating circuit billing and Centrex service billing, as well as, in some cases, billed minutes of use, is a huge, labor-intensive effort.

Reducing the number of circuits became an obvious solution. Virtual circuit technologies such as Asynchronous Transfer Mode (ATM) and Frame Relay demonstrated the practical potential for the large enterprise business to receive all voice and data services converged onto a single "fat pipe" circuit. Fiber optic cable began replacing copper wire in many areas as Synchronous Optical Networking (SONET), using lasers to transmit information, provided advances in the carrying capacity of the physical media.
References for ATM, SONET, and other technologies: Wikipedia provides an excellent basic framework for high-level explanation and history of many of the technologies described in this guide. There are hundreds of excellent technical books on the protocols and technologies mentioned.

Service Convergence over a Common IP Networking Environment

For telephony carriers, infrastructure service convergence was also driven by Internet dial-up access. The Public Switched Telephone Network (PSTN) has been finely tuned and optimized to support voice traffic. An average telephone call lasts 3 to 4 minutes. Dial-up access to the Internet presented a new traffic engineering challenge to the telcos: the PSTN needed to support Internet sessions that might last for hours or days.

The PSTN is designed using blocking switches, which means that if every customer connected to a local central office (CO) picks up the phone at the same time, some will be blocked. It isn't cost effective to over-engineer a central office to support 100,000 concurrent users when traffic studies have indicated that only 30 percent might be using the telephone at any given point in time.

Dial-up access to the Internet forced a change in thinking about how the Internet and PSTN interact. As Figure 1.1 illustrates, they began as separate, disconnected networks, but dial-up access introduced touch points between the two. As technologies evolved, a converging phase was reached, with numerous high-capacity connections between the two. That is the current state of network technology. The fully converged infrastructure network of tomorrow will blend the PSTN and Internet so closely that many users won't even detect a distinction. Voice calls might route over the PSTN, over a wired LAN connection using a VoIP phone on the desktop, or through a wireless handheld smartphone that uses WiFi in the office and then hands a call off to the cellular network when the user moves out of WiFi range.
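The blocking behavior described above is traditionally quantified with the Erlang B formula, which gives the probability that a new call finds all trunks busy. The sketch below uses hypothetical subscriber counts and trunk sizes to show why a central office can be engineered for far less than 100 percent concurrency:

```python
# Sketch: Erlang B blocking probability, illustrating why central offices
# are engineered as blocking switches. The traffic load and trunk counts
# below are hypothetical examples, not figures from this guide.

def erlang_b(offered_load_erlangs, trunks):
    """Probability that a new call is blocked (iterative Erlang B formula)."""
    b = 1.0  # blocking probability with zero trunks
    for k in range(1, trunks + 1):
        b = (offered_load_erlangs * b) / (k + offered_load_erlangs * b)
    return b

# 100 subscribers, each busy 30 percent of the time, offer 30 Erlangs of load.
load = 100 * 0.3
print(f"Blocking with 40 trunks: {erlang_b(load, 40):.2%}")
print(f"Blocking with 25 trunks: {erlang_b(load, 25):.2%}")
```

Adding trunks drives the blocking probability down sharply, which is exactly the trade-off a traffic engineer balances against the cost of over-provisioning.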


Figure 1.1: Network convergence evolution.

Device Convergence: the Convergence of Desktops, Laptops, Tablets, and PDAs

In business, the way people work has changed dramatically in the past 20 years. The text-based "green screen" used to work on a mainframe application isn't entirely gone; now that function is often performed using emulation software on a PC workstation. The PC has evolved as well. Today, businesses and workers customize and optimize their workstations to provide the most effective, productive working environment.

Business has become finely attuned to the cost of real estate, which has driven many businesses to implement telecommuting programs. Reduction in computer size has brought small notebook computers into the workplace, and today, tablet PCs are becoming quite common. This shrinkage in PC size has extended beyond the computer into the worker's purse or pocket. Personal Digital Assistants (PDAs) became very popular and today are blending into the smartphone. Mobile phones today have more computational power and memory than earlier PCs.


Figure 1.2: Device convergence driven by shrinking desktop real estate and increased mobility.

It's also important to recognize convergence on the desktop. Real estate, even desktop real estate, is a costly commodity. Integrating voice services into the desktop workstation, laptop, or even the PDA/smartphone is another facet of convergence.

Application Convergence: Integration of Enterprise Business Applications with Network Services

Perhaps one of the most interesting and potentially productive areas of convergence is the integration of applications and services. The Web no longer relies solely on HTML. Today, XML-based Web services provide communications between front-end applications and back-end servers. Software development tools such as AJAX and Ruby on Rails, as well as plug-in tools such as Flash, provide an increasingly rich end-user experience in using Web services. Combining these advancements in development tools with enterprise applications opens a new frontier of service convergence that might lead to process re-engineering efforts in workflow.

Major business applications include Customer Relationship Management (CRM) systems, which can integrate all aspects of the relationship life cycle between businesses and customers. Enterprise Resource Planning (ERP) systems aid in supply-chain management for manufacturing organizations. They help monitor inventory control, quality assurance processes, and product life cycle management.


Fixed Mobile Convergence

A more recent development in convergence is the idea of embedding WiFi technology into mobile handsets. Business users all carry a mobile telephone, and there is a clear trend emerging in the form of a fixed and mobile telephony convergence. The goal is to provide both services on one handset; it is a critical strategic issue across the telecommunications industry for the fixed wireline carriers, the mobile wireless carriers, and IP network providers. When this convergence is accomplished, these operators, regardless of service, will be able to provide voice access to the fixed line infrastructure of the PSTN.

Beyond voice conversations, there is a real independence in the end user's device. Location, access technology, and terminal all become irrelevant because they all simply work. As Figure 1.3 shows, using FMC features, a phone call might originate via a softphone on the PC in a worker's home. When the worker leaves for the office, the call could be transferred to the mobile handset, perhaps using WiFi over a home wireless network. While driving to the office, the call could be continued over the cellular connection to the PSTN. Upon arriving at the office, the call might automatically be passed from the cellular network onto a corporate WiFi connection to minimize cellular airtime. Finally, the worker might switch the call over to a rich media desktop client in the office to collaborate over an application, shift to video conferencing, or share some other network resources.

Figure 1.3: Example of FMC.

Key drivers for traffic migration between fixed and mobile networks include the fact that customers gain the ability to choose which network carries voice calls. This choice can be made based on both cost and convenience. Users making calls from home or office locations (where both are available) can select the transport network that suits their needs.

Will increasing dependence cause consumers to use mobile devices at home? Absolutely; business users already do a tremendous volume of mobile calling. The continual downward spiral in per-minute costs can drive consumers to rely more and more on mobile services. The rising availability of broadband data services delivered to these handsets assures the spread of new uses. According to one survey conducted in Britain, penetration levels for mobile telephony are as high as 70 percent in several developed countries. Mobile telephony may present a viable alternative to traditional fixed telephony for many users.

From the mobile operator's perspective, increased indoor usage of the mobile phone will drive acceptance, but coverage, capacity, and spectrum constraints might limit their QoS. Using convergence of technologies to embrace the unlicensed, short-range wireless protocols of today and the longer-range WiMax technologies on the horizon provides an inexpensive means for operators to increase capacity and provide service moving ahead. It's in their best interest to actively embrace the convergence of fixed and mobile services.

FMC depends on the handsets to make a wireless connection to an access point and to both make and receive calls via the fixed line infrastructure. Short-range wireless technologies using the unlicensed frequency bands give fixed line service providers the opportunity to create new services without investment in expensive spectrum licensing or complicated frequency planning. The two most likely technologies for a wireless interface in the 2.4GHz unlicensed band are Bluetooth and WiFi.
For FMC, there is a key limiting factor, or success factor: the handset. Compatible handsets are appearing in growing numbers. Bluetooth is available in more than 25 percent of new models. Bluetooth is suitable as a wireless medium for FMC and was designed to handle voice service, but it's not the sweet spot. WiFi integration into the handset is quickly becoming a key differentiator for manufacturers. This is a trend that will rapidly expand to the majority of new handsets over the next year or two.

VoIP as a Converged Service

IPT is one of the most visible, talked-about technologies in the marketplace today. Improvements in performance and reductions in cost are huge drivers for enterprise business adoption. There are several factors that make IPT a suitable technology evolution for business telecommunications. Because there are so many different business needs, ranging from small companies with home offices to very large enterprises spread around the globe, there is no single one-size-fits-all solution for telephony requirements. This should not come as a surprise, because there isn't a single solution for using existing telephony services and equipment either.


Voice vs. Data

Voice telephony services require a long holding time to support call durations averaging about 4 minutes per phone call. The signal is sensitive to delay and jitter because it is a real-time interactive communication. In the PSTN, this service is delivered using circuit-switched technology. In some cases, large businesses have implemented private telephone links, such as T-1 tie lines, between branch offices to provide internal telecommunications from site to site.

Data communications don't have long holding times or durations. They're described as bursty and unpredictable in nature. The durations of data transfers may be quite short, especially with increases in available bandwidth. Data communications are frequently not a real-time, two-way, interactive session like a phone call; email and Web browsing, for example, are not as time-sensitive as voice. Data services have generally used all the available bandwidth for a very short duration. That's beginning to change as QoS becomes an integral part of network operations.

In the past, it wasn't uncommon for enterprise businesses to have two or three completely separate networks for conducting business. Voice traffic was often handled through a PBX connected to the PSTN. Real-time interactive data was transmitted via some type of packet network, either IP or Frame Relay in many cases. Large file transfers between mainframe systems or server farms might have been carried out across a dedicated point-to-point connection.

The Cost of Doing Business Drives VoIP

One of the major early drivers behind convergence and VoIP was cost, and the job of calculating the true total cost of separate voice and data networks is a complex process. The network infrastructure cost is one factor. Telephone services have historically been billed based on minutes of use. A phone call requires that a circuit be established for the duration of the call, so this billing system made sense in the past.
Data networks are different, and most commonly bill for either the bandwidth provided or some guaranteed carrying capacity (referred to as the Committed Information Rate, or CIR). Equipment cost can usually be treated as a one-time capital expenditure (CAPEX). This cost is tied to buying required equipment: routers, computers, telephone systems, wiring and cabling, and so forth. CAPEX costs are closely scrutinized when businesses invest in new network solutions, but they represent only one facet of the total cost of ownership (TCO).

Operational expense (OPEX) is another factor that is too easily underestimated. Given that the topic of this guide is converged network management, it will be digging into a variety of holistic management issues and techniques. The labor effort required to provide operational support for additions, moves, and changes to the network is difficult for many organizations to quantify. Reorganization is, for many enterprises, business as usual, with constant upheaval. The costs to support this continual churn over the lifetime of the network often far outweigh the one-time equipment costs. For the enterprise managing multiple networks, voice and data, this OPEX may be duplicated for each network. It is very common for an organization to have a support person or staff for the telephone system and another for the data network. In the past, these support personnel required different skill sets, but convergence, even in the state achievable today, can change that.
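As a rough illustration of the CAPEX/OPEX distinction, the sketch below compares a five-year TCO for separate versus converged networks. Every dollar figure is an invented placeholder, not a benchmark; the point is only that recurring OPEX can dominate the one-time equipment cost:

```python
# Sketch: multi-year TCO comparison of separate voice/data networks versus
# a converged network. All dollar figures are hypothetical placeholders.

def tco(capex, annual_opex, years):
    """Total cost of ownership: one-time CAPEX plus recurring OPEX."""
    return capex + annual_opex * years

# Separate networks duplicate support staff and circuit costs (higher OPEX);
# a converged build costs more up front but consolidates operations.
separate = tco(capex=250_000, annual_opex=180_000, years=5)
converged = tco(capex=400_000, annual_opex=110_000, years=5)
print(f"Separate networks, 5-year TCO: ${separate:,}")
print(f"Converged network, 5-year TCO: ${converged:,}")
```

With these hypothetical numbers, the converged network's higher CAPEX is recovered well within the five-year window, which is why TCO, not purchase price, is the figure to compare.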

Administrative costs comprise still another facet of networking that may be underestimated in some enterprises. There are the basic costs of processing monthly invoices for payment. Beyond simple accounts payable, there is administrative effort involved in the ongoing monitoring of network performance compared with monthly billing. Billed minutes of use in the voice network and the committed information rate delivered have to be compared with Service Level Agreements (SLAs) and validated against traffic reports. The same staff that manages day-to-day operations might perform this analysis. Although often overlooked, and perhaps not high in terms of actual cost, this function can potentially yield a high return, particularly in networks that are volatile and changing frequently. Holding providers accountable for meeting the terms of their contracts is a labor-intensive but crucial component of both voice and data network services.

The telephone and data network(s) in an enterprise need holistic treatment, like a living organism. The health and well-being of these vital business resources must be constantly monitored. Enterprise business constantly evaluates and assesses employee performance; network performance must be proactively managed in a similar fashion. That means vigilant monitoring throughout the life cycle of the network to ensure maximum performance. Holistic management of the business network includes working closely with voice and data service providers. It's far too easy for changes in a dynamic enterprise to happen without any correlation to network services. The result is that the network reality diverges from the requirements; they move in different directions, driven by changes in the business.
I've helped many clients conduct network analysis and assessment focused on identifying potential OPEX reduction. In one such case recently, we quickly discovered that the company had downsized operations by half in the preceding year. What was troubling was that the network billing didn't reflect any reduction; the circuit cost component of their OPEX hadn't changed. When a remote branch office closed, this company had completely failed to have the network provider disconnect circuits and stop billing. Several thousand dollars were paid to a provider over the course of a year for idle circuits that carried no traffic. Although this might seem a laughable example of really poor management, that isn't the case. This type of problem is far too common.
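A simple inventory-versus-billing reconciliation is the kind of routine check that catches idle circuits like the ones in the anecdote above. The sketch below uses invented circuit IDs and monthly rates purely for illustration:

```python
# Sketch: reconciling a carrier invoice against the circuit inventory.
# Circuit IDs and monthly costs are made up for illustration.

active_inventory = {"CKT-1001", "CKT-1002", "CKT-1003"}

monthly_bill = {
    "CKT-1001": 450.00,
    "CKT-1002": 450.00,
    "CKT-1003": 1200.00,
    "CKT-0097": 450.00,   # branch office closed last year
    "CKT-0098": 450.00,   # disconnect order never placed
}

# Any billed circuit not in the active inventory is a candidate for disconnect.
orphans = {ckt: cost for ckt, cost in monthly_bill.items()
           if ckt not in active_inventory}
wasted = sum(orphans.values())
print(f"Billed but inactive circuits: {sorted(orphans)}")
print(f"Monthly spend on idle circuits: ${wasted:,.2f}")
```

Run monthly, a check like this turns a year of silent waste into a one-line disconnect order.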

Given the technologies generally available, is it practical to build one single network infrastructure that will support all traffic and service types required to do business? A few years ago, the answer to that question was probably no. Four or five years ago, only the earliest adopters of leading-edge technologies were able to take advantage of integrating multiple services into a single environment. Today, networks are actively converging. Right now, for many companies, unifying communications onto a single integrated infrastructure, converging services into single workstations, or integrating applications and services makes sound business sense. As technologies collide, keep in mind that what didn't work yesterday may be viable today, and what doesn't work today could well be the de facto standard of tomorrow.


Technical Definition vs. Market Definition

It's perhaps an important distinction to identify the differences between VoIP and IPT and what they might mean in the marketplace. From a purist perspective, VoIP is simply the process of digitizing and packetizing voice. Figure 1.4 shows a common arrangement for making a VoIP call. Although IP telephone calls can take on many forms, this example represents an IP phone call in a form that doesn't directly involve the subscriber in any way. The telephone set in this example is a traditional telephone connected to a traditional PBX.

Figure 1.4: A variation on IPT: the gateway between business and the PSTN.

The phone call begins when the subscriber lifts the receiver and a dial tone is supplied from the local PBX. In this case, when the caller keys in the telephone number, that information is forwarded to an IPT gateway. The gateway provides a conversion point for traditional voice traffic to be converted to IP and vice versa.


The originating gateway has to convert standard PSTN signals and correlate the dialed telephone number to the IP address of the terminating gateway that serves the called party. This signaling information is packetized and sent to the IPT gateway at the far end. The receiving gateway must decode the IP packets and convert the signals back into traditional voice format for the PBX on the remote end. When the called party answers the telephone, a complete two-way conversation can occur.

Some benefits to this approach are obvious; others are more subtle. This method of VoIP implementation has been employed by both corporate enterprises and commercial VoIP service providers (also called Internet Telephony Service Providers, or ITSPs):

- Using a packet network for transmission allows the sharing of resources rather than maintaining dedicated resources for voice and separate resources for data. Although this visual shows the Internet, many enterprises use their internal wide area packet network to deliver this service. If the Internet is the packet network used, the service provider can now provide two distinctly different but necessary services from one consolidated backbone infrastructure. This approach may drive the need for QoS in the packet network.
- Because the ISP or ITSP is not a regulated telephone company, the requirement for payment of access charges doesn't exist. Thus, the ITSP can deliver calls more cheaply. However, it also means that the local exchange company doesn't receive remuneration while its circuits are being used in many cases. This is good for one part of the industry but bad for another: telephony traffic is shifting off the PSTN and onto the Internet.
- In this particular implementation of IPT, the end user doesn't have to convert to a VoIP phone. The end user doesn't have to do anything; they might not even know the telephone call is being carried over a VoIP service.
- The Internet is a large and growing network, but the user experience for a voice call may best be served using a telephone set. There is no added complexity of softphones or specialized instruments for end users in this model. This could prove a viable approach for common calling services such as pay phones, hotel phones, and public phones where other, more advanced services aren't required.
- The initial startup cost, or barrier to entry, for a VoIP service provider is very low in comparison with the startup costs for a new telephone company using traditional PSTN technologies. IP packet switching technologies provide greater efficiencies at a lower cost.
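The number-to-gateway correlation described above is commonly implemented as a longest-prefix lookup against a dial plan. The sketch below is illustrative only; the prefixes, site names, and private IP addresses are all made up:

```python
# Sketch: mapping a dialed number to the IP address of the terminating
# gateway via longest-prefix matching. Prefixes and addresses are invented.

DIAL_PLAN = {
    "1206":  "10.1.1.10",   # hypothetical Seattle gateway
    "1212":  "10.2.1.10",   # hypothetical New York gateway
    "12125": "10.2.1.20",   # hypothetical midtown overlay gateway
    "44":    "10.9.1.10",   # hypothetical UK gateway
}

def resolve_gateway(dialed_number):
    """Return the gateway IP for the longest matching dial-plan prefix."""
    for prefix in sorted(DIAL_PLAN, key=len, reverse=True):
        if dialed_number.startswith(prefix):
            return DIAL_PLAN[prefix]
    return None  # no VoIP route; the call would fall back to the PSTN

print(resolve_gateway("12125551234"))  # longest prefix "12125" wins
```

Trying prefixes from longest to shortest ensures a more specific route (the overlay gateway) takes precedence over the broader area-code route.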


Variations on implementing the converged services network range from straightforward to very complex. When evaluating how services converge within an enterprise, it's vital to do so with an eye to not just the business needs but also the ease of support and network management. Some other variations on implementations might be:

- IP Centrex services from a managed service provider: This service could be delivered to the corporate network over one single network connection, with voice and data services broken out inside the company network.
- A managed IP-PBX inside the corporate network: The key difference between this approach and that shown in Figure 1.4 is the potential for VoIP telephones or integrated softphones within the corporate network.
- Internet users might utilize some form of commercial gateway: Notable early examples of this were http://www.dialpad.com and http://www.net2phone.com. You can expect to see an increase in Web services providing this sort of service for Web-only, or thin-client, connections. In particular, as developers find new ways to embed VoIP services within the browser, you can expect to see more of this functionality.
- Some companies are using PC-based software to provide internal telephone calling and collaboration services between employees on the corporate network: This LAN-based VoIP is efficient within the network and essentially free of charges. This may be some of the early use you see for the new Microsoft Live Communications Server. Calls to the PSTN might require employees to have a traditional telephone using the traditional network.
- A large multi-location enterprise deploying unified communications might implement multiple gateways at geographically dispersed office sites: Calls might be directed to the gateway nearest the called party, minimizing toll and long-distance charges. This least cost routing approach has long been used in large enterprise voice networks.
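The least cost routing decision in the last variation can be sketched as a simple rate-table lookup: for a given destination prefix, pick the gateway with the cheapest per-minute rate. The gateways and rates below are hypothetical:

```python
# Sketch: least cost routing across geographically dispersed gateways.
# Gateway names and per-minute rates are invented for illustration.

GATEWAY_RATES = {
    # (gateway, destination prefix): cost per minute in dollars
    ("chicago-gw", "1312"): 0.00,   # on-net local call
    ("london-gw",  "1312"): 0.12,   # trans-Atlantic hairpin
    ("chicago-gw", "4420"): 0.15,   # international toll
    ("london-gw",  "4420"): 0.01,   # local breakout in London
}

def least_cost_gateway(dialed_prefix):
    """Pick the gateway with the lowest per-minute rate for this prefix."""
    candidates = {gw: rate for (gw, prefix), rate in GATEWAY_RATES.items()
                  if prefix == dialed_prefix}
    return min(candidates.items(), key=lambda item: item[1])

print(least_cost_gateway("4420"))  # a London call breaks out in London
```

A production dial plan would combine this with the prefix matching and failover logic an IP-PBX provides, but the core decision is just this comparison.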

VoIP or IPT?
Voice digitization is not a new technology. Digitization began inside the telcos in the 1960s, utilizing what is today called T-1 service. T-Carrier, as it was called then, was deployed to provide trunking capacity between telco central offices. Analog transmission technologies were in widespread use at the time, and digital technologies enabled network performance improvements in the PSTN that benefited both customers and the phone companies. The following section explores the basics of a telephone call to set the stage for a discussion of network requirements and call quality issues in later chapters.


A Telephone Call Simplified

When a person makes a telephone call, there are several basic steps that must occur whether the call is a videoconference, a fax transmission, or a voice conversation. These tasks are performed by the service network in each case. When you speak, your vocal cords vibrate, generating sound waves. These sound waves are an analog signal. They travel through the air, and your ears convert them back into signals your brain can understand. The world around you is very much an analog place. The telephone converts the sound waves from your spoken conversation into electrical signals that can be transmitted over copper wires. The telephone set at the receiving end converts that electrical signal back into analog sound waves you can hear. In the telephone network, that electrical signal is converted from analog into a digital signal, a stream of zeros and ones defined by changes in the electrical state of the circuit. Thus, to carry on a telephone conversation, your analog voice must be changed from sound waves made up of vibrations of air molecules into an electrical representation of the analog wave. This analog wave must then be digitized and transmitted across the network. At the other end, the signal is converted back to analog, then back into sound waves at the earpiece of the telephone handset.

Converting the Analog Signal to Digital

The telephone network today is made of digital central office switches connected by digital trunk circuits, so it makes sense to transmit digital signals. The local loops (also called the last mile) that connect subscribers and most phones, however, are analog. Analog-to-digital conversion must be performed somewhere in the network. This conversion is typically accomplished using a technique known as pulse code modulation (PCM). In order to perform PCM, a coder and decoder (more commonly called a codec) are needed. PCM takes samples of the voice conversation 8,000 times per second.
Each of these samples is then converted into an 8-bit word, resulting in a 64Kbps stream (8 bits × 8,000 samples per second = 64Kbps). The 8 bits of binary data can represent values from 0 to 255, but the all-zeros pattern cannot be used in this coding scheme, so each sample is coded into one of 255 possible combinations. This guide will touch on the 64Kbps line rate more than once; this line speed is what a standard voice channel in the PSTN provides. In the United States, a common time-division multiplexing transmission scheme is used to transmit 24 voice channels as a digital stream of data over a single circuit. This is the basic voice service provided over a T-1 circuit today, with 24 voice channels delivered over 1.544Mbps (the capacity beyond 24 × 64Kbps is framing overhead). This design of digital facilities and voice circuits is part of the digital transmission hierarchy used throughout the world. Other parts of the world don't use T-1 circuits, but their approach is similar. By today's standards, 64Kbps really doesn't seem like a lot of bandwidth, but the technical reality is that, depending on which codec is used, it's far more than necessary to carry a voice conversation.
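The arithmetic behind these rates can be sketched in a few lines. The 8Kbps figure added for a T-1 is the standard framing overhead (one framing bit per 193-bit frame, 8,000 frames per second):

```python
SAMPLE_RATE = 8_000          # PCM samples per second
BITS_PER_SAMPLE = 8          # each sample coded as an 8-bit word

# One voice channel: 8 bits x 8,000 samples/sec = 64 Kbps
channel_bps = BITS_PER_SAMPLE * SAMPLE_RATE
print(channel_bps)           # 64000

# A T-1 carries 24 such channels plus 8 Kbps of framing overhead
t1_bps = 24 * channel_bps + 8_000
print(t1_bps)                # 1544000, i.e. 1.544 Mbps
```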
A future chapter will touch on coding schemes used to sample and compress audio traffic streams.


Network economy of scale and simple economics drive the industry to work toward reducing the bandwidth required for a phone call. The less bandwidth required per conversation, the more conversations the network can carry at one time. Adding carrying capacity through codecs and compression minimizes capital investment in network equipment and can potentially drive down the cost of carrying a phone call.

When Does VoIP Become IPT?

The PSTN is a complex and mature implementation of technology that has evolved over more than a hundred years of use. Though technologies are changing and advancing quickly, the Internet is quite immature by comparison. The PSTN has what is referred to as the Advanced Intelligent Network (AIN). 800-number services provide access to databases of caller information. Caller ID, call waiting, call hold, and conference calling are common features available to almost all users. E-911 services are constantly evolving with technology in order to identify the caller's location to within a very small geographic radius. This mature network has been optimized over time to provide the best quality voice and a variety of services that are now taken for granted. The PSTN does far more than transport voice traffic; it provides a comprehensive and robust suite of telephony services. The IP networks, or Internet, certainly can't provide that rich set of services by themselves. VoIP has been achievable for a number of years; simply packetizing a voice signal is easy. But there is far more to IPT than just carrying voice traffic in IP packets. The full, rich telephony feature set had to be incorporated before users could give serious consideration to IPT as a viable service in the production network for enterprise business. IPT solutions can now bridge the gaps between the PSTN and the IP network. Using the TCP/IP protocol suite, a network such as the Internet can now provide many aspects of the traditional telephone network.
The IPT market has seen substantial growth over the past several years. Vendor products have improved, but so has basic network technology. The issue of VoIP vs. IPT may be one of semantics, but it represents a problem for business people trying to make a decision: one vendor will talk about VoIP, another about IPT, and a third will now refer to unified communications. All the scenarios described in this guide are IPT in some form. It's clear that if end users connect to the PSTN, they'll require some kind of telephone; if they connect to the IP network, they'll need an IP device of some kind. As business managers, it's important not to get hung up in the semantics vendors or service providers use. It's prudent to focus on the services being provided to ensure they support business needs.


Intended for PC-to-PC Voice Messaging

Pure IPT between computer users over the Internet was a popular, early hobbyists' communication technique. This form of VoIP was accomplished by users meeting in a chat room. Today, it's performed by a staggering array of software solutions, including voice-oriented solutions such as Skype, Gizmo Project, and GoogleTalk. There is also a whole family of instant messaging (IM) programs, including ICQ, AOL Instant Messenger (AIM), MSN Live Messenger, Yahoo, and others. IM users can easily activate a voice session over the Internet in most popular client solutions today. Many of these VoIP solutions that began as PC-to-PC calling tools now have links in and out from the PSTN as well. Early prophecies for success in IPT viewed the technology as a tool for consumers to combat the high price of long-distance telephone service. Those arguments seemed to hold water at that point in time, but they overlooked the technical advances and growth of mobile telephony. Today, many wireless providers offer nationwide calling at rates cheaper than imaginable 10 years ago. VoIP still aids in reducing the cost of international long distance, but the current primary driver is more likely to be service integration than cost reduction.

Never Intended to Replace Global Telephony

VoIP wasn't developed with the ultimate goal of replacing the PSTN, not initially. But VoIP disruption has occurred and is still gaining momentum. Delivery of voice service is no longer tied to the vertically integrated Class-5 telco switch. This disruption has been accepted as the trend toward the future of telecommunications. IP evolution is replacing monolithic, proprietary solutions and broadening into Web and application services. Innovation is increasing at a faster rate in a dynamic and highly competitive environment.

Free Voice Was Compelling and Simple

The idea of free voice calls spurred hobbyists and early adopters to experiment with an array of new ideas.
The drawback is that this PC-centric group of innovators and early adopters provides only a limited market. They represent an application and integration proving ground, but not a mass market.

Pushes the Edge Over Convergence

What will be the accelerant for VoIP and unified communications? The ongoing debate between cost reduction and enhanced utility will continue into the foreseeable future. Today, the market sees far too many "more of the same" or "me too" solutions. This limits the advance of both user experience and penetration. The VoIP developer community has often merely provided features equivalent to the telco environment, and VoIP as a simple replacement for PSTN dial tone is neither useful nor well received in today's market. If applications will fuel the future growth and demand, one option might be to pursue integration inside what have been termed the walled gardens. An integration layer between proprietary applications and content providers will expand the reach of converged services to a broader market.


Vertical communities of interest are early adopters, but early adopters really only provide fuel for developers. Markets emerge over time through an established pipeline of applications and services. You never know at the beginning of the development cycle what the killer app will be. Communities are the early adopters and innovators that help create the future. A rich multimedia, real-time telephony experience exists within the technology, but it's just taking root and beginning to expand. One thing is clear: Voice will no longer be driven by the telephone. Voice services of the future will be driven by the context of the business application.

Multiple Play

Multi-play is a popular marketing term describing the delivery of multiple communications services by service providers that traditionally only offered one or two of those services. Triple play and quadruple play are commonly used to describe the combined delivery of high-speed Internet, television, telephone, and mobile phone services:

- Dual play: The dual-play service is a marketing term for the provisioning of two services, high-speed Internet and telephone, over a single broadband connection. This has most frequently been used by cable providers.
- Triple play: Three-way convergence is inextricably linked to the underlying communication infrastructure. A prime example would be packaging communication services in a form so that customers can purchase television, Internet, and telephony in a single, bundled service subscription.
- Quadruple play: So far, the quadruple-play service remains elusive. It's the triple-play service of broadband Internet access, television, and telephone with the addition of wireless service. Advancements in WiMax and other leading-edge technologies are rapidly improving.
Transmission over a wireless link at combinations of speeds, distances, and non-line-of-sight conditions may soon make it possible to never connect to voice or data services by a wire at all, even while at home.

New Devices

New devices are driven by technology advances, user expectations, and how people use their VoIP and mobile phones. There is some relevance evolving in the market surrounding how often people change phones. Many consumers move to the next new smartphone every 6 months in the mobile environment, but in business, many people have been using the same telephone set with the same features for many years. You get new cell phones regularly because they enhance your productivity in some way. This leads providers to think about how people interact with their phone set. The comparison of how you change cell phones sets the stage for other unified communications options. Although vendors continue to see rising shipments of dedicated IP phones quarterly, there are other business drivers to consider. Three words describe this new driver: relevance, context, and presence. The type of telephone set you need varies depending on your context. In some cases, you'll be in a meeting or unable to talk, and an instant message might be the most relevant communication method. At other times, you can't be interrupted at all. Whether it's a PC-based softphone, a traditional desktop telephone, or a mobile handset, each offers strengths depending on the context in which it's used.

Today, people work in distributed, virtual, mobile teams. The work day doesn't start when you enter the office, and it doesn't end when you leave. You need to manage your communications flow throughout the day. There are hundreds of features and services in existing systems, but users generally don't know how to access them. Most people use a very limited subset of the available features and functionality. Features will become more intuitive. The context of your work day and how you're using other business applications will be more central to the development of converged communications devices as things evolve. The key questions for device manufacturers are:

- What do users expect from their phone? Users interact with many different devices all day long, and those interactions drive expectations.
- How can we add efficiency via the telephone device?
- How can phones increase productivity?

Phones are moving from being feature focused to user focused. It's a paradigm shift on the desk and in your purse or pocket. Can a phone really improve or impact efficiency? Yes, but if it fails to provide access to features and functions, it can be a frustration and lead to lost productivity.

VoIP Can Include Advanced Applications

The convergence of real-time media carried over IP along with data is where we've arrived; that is the state of technology today. Now we're looking at how to converge these services onto application engines. Convergence with enterprise business applications presents another whole convergence layer. CRM, human resources (HR), and ERP systems all introduce interesting ideas when blended with telephony services. Converged communications is the capability to integrate at layers other than just telecommunications. These other layers are still fairly separated into silos, or vertical groups. Workers today have multiple communications devices and softphones. New capabilities such as real-time directories that provide presence and availability information, conference call setup, and rich media integration empower the end user. Convergence integrates the power of multiple platforms that offer service capability in a unified environment rather than discrete applications or services. The Service Oriented Architecture (SOA) is moving toward opening up the services of one application for sharing with another application altogether. This isn't just VoIP. It isn't just integrating communications. It's service integration through software across multiple business applications. Telephony software in the converged environment can now act like a business application and interact with other business applications. Many of the functions that the hardware enterprise PBX provided can now be performed in software. Integration at the software level, at the application level, is the key to opening up integrated services within IP.
You can do new things when you integrate business applications rather than focus on technology.


Converged Communications Leads to Converged Applications


Specialized applications are becoming standard applications. Using SOA, information resources available to all participants (users, services, and applications) in the network may now be accessed as independent services in a standardized, converged way. Users don't need to change context from one application to another. The end user shouldn't have to shift the context of the core application in which he or she is working to make a telephone call. The developer communities are working to open up integration services to allow all that context switching to happen behind the scenes. The following list highlights real application examples:

- A Web browser-based interface is the simplest example. It can display presence state information to users. The browser can easily show presence state in an address book or telephone directory, with click-to-dial capability.
- A mobile device that can run thin-client or browser-based applications presents another example. This device makes it easy to authenticate as a mobile WiFi user and then display state information based on that IP connectivity. The mobile device simply becomes a thin client using Web services on the network. Again, directories and presence state are prime examples of integrating productivity tools.
- For large enterprises, think about the whole cumbersome process of voice and data adds, moves, and changes for new employees and people shuffling from cube to cube. That's historically a huge labor effort. Why not enable an HR system that interacts with the voice and data services network to automatically configure telephony services for new employees?
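The presence-aware click-to-dial idea can be sketched in a few lines. Everything here is hypothetical: the directory structure, the presence state names, and the `place_call` callback all stand in for whatever API a particular PBX or presence server exposes:

```python
def click_to_dial(directory, caller, callee, place_call):
    """Dial only when the callee's published presence state permits it.

    directory maps user -> presence state ("available", "busy", "offline");
    place_call is whatever function actually signals the PBX to set up the call.
    """
    state = directory.get(callee, "offline")
    if state == "available":
        place_call(caller, callee)
        return True
    return False

# Usage with a stubbed call function standing in for the PBX interface
calls = []
users = {"alice": "available", "bob": "busy"}
click_to_dial(users, "carol", "alice", lambda a, b: calls.append((a, b)))
click_to_dial(users, "carol", "bob", lambda a, b: calls.append((a, b)))
print(calls)   # only the call to available alice went through
```

The point of the design is the separation: the presence check is application logic, while the actual signaling stays behind the `place_call` callback, so the same logic works against any vendor's telephony API.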

Converging business applications is absolutely where the entire business community is headed.

FMC

Mobility is a huge issue, and the advances ahead with FMC make this a subject of interest across many business sectors. The environment is ripe for mobile VoIP. The delivery of enterprise applications, information, and services to a mobile worker is vital today. One factor that has made mobile VoIP viable today is near-universal standardization on Session Initiation Protocol (SIP). At a recent developer conference, informal polling of attendees pointed out that email usage on mobile phones was at about a 50 percent usage rate. We are becoming mobile data users more and more. Enterprise users are a huge piece of the broad mobile services market and have been since 1973. Although mobile usage is on the rise and ripe for convergence of VoIP and mobile services, user interactions and experiences are often distracting and detract from the overall user experience. Radio characteristics present another challenge. Current WiFi technology is very chatty in standby mode, draining mobile device batteries and shortening usable lifetime. Because of this, dual-mode devices tend to suffer from measurably shortened battery life. Industry work is needed to make the WiFi device more effective as a phone.
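SIP's role in mobile VoIP is easy to see in its text-based message format. The sketch below assembles a minimal REGISTER request of the kind a dual-mode handset sends to announce its current IP address to a registrar. The addresses and tags are illustrative only; a real SIP stack adds authentication, expiry, and properly randomized branch and tag values per RFC 3261:

```python
def sip_register(user, domain, contact_ip, cseq=1):
    """Build a minimal SIP REGISTER request (see RFC 3261 for the full rules)."""
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/UDP {contact_ip};branch=z9hG4bK-0001",  # illustrative branch
        f"From: <sip:{user}@{domain}>;tag=0001",
        f"To: <sip:{user}@{domain}>",
        f"Call-ID: 0001@{contact_ip}",
        f"CSeq: {cseq} REGISTER",
        f"Contact: <sip:{user}@{contact_ip}>",  # where to reach this device right now
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"

msg = sip_register("kcamp", "example.com", "192.0.2.10")
print(msg.splitlines()[0])   # REGISTER sip:example.com SIP/2.0
```

Because the Contact header is just re-sent whenever the device moves to a new network, the same mechanism serves a WiFi handset at the office and a dual-mode phone roaming onto cellular data.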


A recent survey of mobile professionals produced some interesting results. The most active mobile users in business are either customer-facing workers or management. More than 75 percent were happy with coverage. About half felt that dual mode could improve coverage for better service. Churn in the mobile industry is driven by coverage issues more than any other factor, and most mobile carriers are now down to about a 2 percent churn rate. Business today is very mobile: 70 million Americans use their cell phones for work, and 70 percent of all cell calls start in WiFi-enabled areas (home, office, hotel, and so on). By 2009, the industry expects that 25 percent of new mobiles will be smartphones. One really interesting note from the survey about business and mobility: 67 percent of mobile professionals receive or make more than 25 percent of their business calls on mobile phones. Mobiles are very important devices to enterprise business. (These survey numbers come from a collection of surveys conducted by FirstHand, Nortel, RHK, and Gartner.) Mobile VoIP is much more than just terminating a SIP session on a small handheld device. It's about enabling a productive personal or business experience.

Summary
You can wait for evolution, or you can begin using the technologies that are available today and ride the wave toward the fully converged networks of tomorrow. Even as developers are defining the architecture, boundaries, requirements, and interfaces for a shared unified communications network of tomorrow, the current solutions provide a wide-open space to grow and evolve existing business services. The chapters ahead will review key considerations for integration, business drivers that make sense, and productivity advantages. This guide will dig further into the key success factors for deployment and management of converged networks. It will investigate event correlation across multi-service networks and dig into availability management and capacity planning in the world of unified communications. This guide will explore fault, configuration, performance, and security management in this exciting new world. Finally, it will take a look at asset and compliance management. The next chapter will home in on the business processes most often impacted by convergence. In addition, it will examine the issue of call quality and how to deliver total quality network services, including VoIP, in the converged network.


Chapter 2: Key Considerations in Effective Voice and Data Integration for a Changing IT/IP Landscape
The integration or convergence of voice, video, and data can provide a business with a competitive edge when effectively implemented. There are several business models and operating environments that present opportunities for strategic consideration when planning for this change. The key factors for success involve leveraging the integrated features to provide the greatest support for existing processes. This chapter will look first at high-level business models and processes that are often impacted by service convergence. Later, the chapter will delve into the issues of call quality that affect every organization implementing an integrated service solution. Quality is often the single biggest factor in a successful implementation, so this chapter will explore a variety of approaches for delivering total quality network services, with a focus on integrated VoIP.

Quantifiable Business Processes


Business crosses a wide array of sectors, each having unique business requirements to support aspects of the core business. Call centers may play a central role for many businesses, particularly those in financial services, insurance, or travel. They also play a key role in many other sectors as smaller customer support teams. Interactive voice response (IVR) systems are frequently automated to reduce the requirement for staffing and provide information to customers. Computer telephony integration (CTI) isn't a new concept with the deployment of VoIP, but in many cases, it becomes easier. CTI may provide levels of service and application integration previously outside the financial grasp of some organizations.

Web-Centric Businesses

In addition to the traditional sectors of business, the integration of the Internet has heightened awareness of four distinct business models in the Web-centric world of e-business. Some of these models fit with large enterprise; others are more amenable to small business and have been used by many an e-business startup company. In the open market model, anyone can be a buyer, and anyone can be a seller. There's no centralized control and minimal trust involved. There isn't particularly high value in integration of enterprise systems because relationships may be ephemeral. Market leadership for these e-businesses requires being in the right place at the right time, with the right solution at the right price. OASIS and eBay are good examples of the open market business model.


The alliance model is more common in larger businesses. It embraces a distributed corporate environment with multiple leaders of the pack. The goal of these alliances is frequently optimization of specific solutions to solve identified customer problems. Alliances are often formed among the best and brightest in their respective fields. High levels of integration in services and applications between partners bring tremendous value to an alliance. Sun, IBM, Oracle, and Netscape demonstrate this model via the Java Alliance. The aggregation model is typically adopted by the leader in a business sector. The aggregator positions itself between producers and consumers, providing access to products. Integration with consumers may be low, but integration with the producers and internally across the aggregator enterprise can add very high value. Wal-Mart represents a perfect example of this model. The value chain model is adopted by most businesses. Every business is, in some facet, the leader of the pack in its particular sector. Process optimization within the enterprise is crucial to business success. The leader focuses on optimizing the value chain through service and application integration rather than aggregating buyers and sellers. Cisco Systems, Dell, and Amazon represent value chain leaders in the world of e-business. There is also an emerging competitive model, driven by competition between clusters, not individual companies. This model is often a matter of survival for businesses that need partnerships to deliver a complete solution set to market. In this model, companies maximize the use of distributors, resellers, and retail sales chains. Integration of services and applications may focus on facilitative technologies such as Web-based Electronic Data Interchange (EDI) to drive e-commerce online. Business in the new economy of the Internet requires instantaneous reaction to customer behavior.
Being nimble and responsive can provide a competitive edge in having the right solution for customers at the right time.
Examples of near-instantaneous change include Wal-Mart and K-Mart. Wal-Mart monitors computer inventories closely in stores. If a customer buys a package of tennis balls, the inventory and shipping systems are automatically updated to ship a replacement set to a specific store. If a store experiences a run on tennis balls, or an unusual trend exceeding normal thresholds for a product occurs, stores all across a geographic area may find they receive increased deliveries of tennis balls to support the trend. K-Mart takes a different approach, using outside data sources to drive product stocking. One often-used example is their use of the National Weather Service (NWS). If K-Mart notes inclement weather headed for a particular part of the country, their stores may receive umbrellas and raincoats for stocking, in support of anticipated demand.


Call Centers

Businesses implement call centers to effectively administer incoming product support or provide information to customers. Call centers most often interact directly with consumers and are often the lifeline of the customer relationship. Outbound call centers are used for telemarketing, clientele management, and debt collection. The call center might also be a broader contact center, handling postal mail, fax communications, and email for an enterprise or business unit. Call centers are generally built using large, open work spaces with workstations including computers, telephones with headsets, and supervisory stations. Some businesses build centralized call centers; others distribute call centers at diverse locations, connecting them with voice and data technology. Global companies may have call centers around the world, with each being the primary center as the time of day changes. Location, coupled with time, can provide the advantage of using resources during the local business day. Call centers are linked to the corporate computer network, including mainframes, microcomputers, and LANs. Increasingly, the voice and data services are linked through CTI. Most large enterprise businesses use call centers to interact with their customers. What has changed with the widespread deployment of VoIP is the barrier to entry. Now a midsized company can easily deploy a centralized or distributed call center, leveraging integrated technologies and bringing new services to customers that were previously out of reach. Call center technology essentially provides a queuing network coupled with workforce planning and management to achieve desired service levels. A common example of call center service levels might require that at least 80 percent of callers are answered within 20 seconds, or that no more than 3 percent of customers hang up, due to their impatience, before being served.
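Staffing to hit a service-level target like the 80-percent-in-20-seconds example above is classically estimated with the Erlang C queuing formula. The sketch below is a standard textbook implementation, not taken from this guide, and the traffic figures are invented for illustration:

```python
from math import exp, factorial

def erlang_c(agents, load_erlangs):
    """Probability an arriving call has to wait for an agent (Erlang C)."""
    a, n = load_erlangs, agents
    wait_term = (a ** n / factorial(n)) * (n / (n - a))
    return wait_term / (sum(a ** k / factorial(k) for k in range(n)) + wait_term)

def service_level(agents, load_erlangs, answer_target_sec, avg_handle_sec):
    """Fraction of calls answered within the target time, assuming
    exponential service times (the usual Erlang C assumption)."""
    wait_prob = erlang_c(agents, load_erlangs)
    return 1 - wait_prob * exp(-(agents - load_erlangs)
                               * answer_target_sec / avg_handle_sec)

# 200 calls/hour at 240 s average handling time = 13.3 erlangs of offered load
load = 200 * 240 / 3600
for n in (15, 16, 17):
    print(n, "agents ->", round(service_level(n, load,
                                answer_target_sec=20, avg_handle_sec=240), 3))
```

Running the loop shows the familiar staffing trade-off: each agent added beyond the offered load buys a progressively smaller service-level improvement.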
Call centers provide information about traffic patterns and business patterns as well. Statistics gathered can help determine whether a single large call center is more effective at providing customer service (answering calls) than several distributed, smaller ones might be. Centralizing call management into a call center aims at improving business operations and reducing costs while providing a standardized, streamlined, uniform service for consumers. The efficiency of a repeatable process makes this approach ideal for large companies with extensive customer support needs. Call centers use an array of voice and data technologies and provide a prime candidate for integrated services and applications. These technologies can ensure that customer service agents are kept as productive as possible and that calls are queued and processed as quickly as possible, producing the desired service levels for customers. Some of these technologies include:

- Automatic call distribution (ACD) groups
- Analysis tools to review agent activity
- Optimization analysis for outbound call centers, referred to as Best Time to Call (BTTC)
- Interactive voice response (IVR) systems to improve efficiency by reducing agent time on the phone
- CTI
- Predictive dialing for outbound calling
- Integration with Customer Relationship Management (CRM) systems
- Web collaboration and online chat tools for customer support


Beyond the features and technologies, there is a standard suite of typical performance metrics used in the call center management methodology. When a company invests in a call center, it's crucial to monitor performance levels and ensure return on the investment. The most common metrics include:

- The average delay callers wait in a queue for an agent to come on the line
- Call duration times, generally referred to as Average Talk Time (ATT)
- The time an agent spends on the total call, including preparation, conversation, and after-call work or wrap-up; this is called Average Handling Time (AHT)
- The percentage of calls that get answered within an established call pickup time; this is typically called the service level (for example, 90 percent of calls are to be answered within 30 seconds)
- The raw number of telephone calls each agent handles per hour; this is used as a measure of agent productivity
- How much time an agent spends after getting off the phone in closing out the customer transaction; this is generally called either Wrap-Up or After Call Work (ACW)
- Calls that completely resolve the customer's question or problem on the first call may be tagged with an indicator of First Call Resolution (FCR); the FCR rate is often used as a broad, overall measure of call center performance
- Either the number or percentage of calls that are abandoned by the customer; the higher the number or percentage, the more indicative this is of long holding times driving customers to hang up
- Idle time, the percentage of time agents spend neither on the phone with customers nor in ACW; high idle time may indicate that the call center is overstaffed during a particular time period, and it's used for trend analysis to balance staffing of work shifts in large call centers
- Quality assurance (QA) monitoring, which may be performed by either a QA team or a supervisor; customers hear a standard "Your call may be monitored for quality purposes" message on a routine basis
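Several of these metrics fall straight out of per-call records. The sketch below computes ATT, AHT, service level, and abandonment rate from a hypothetical record layout; the field names and sample figures are invented for illustration:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class CallRecord:
    queue_sec: float        # wait in queue before answer
    talk_sec: float         # conversation time
    wrap_sec: float         # after-call work (ACW)
    abandoned: bool = False

def call_center_metrics(calls, pickup_target_sec=30.0):
    answered = [c for c in calls if not c.abandoned]
    return {
        "ATT": mean(c.talk_sec for c in answered),
        "AHT": mean(c.talk_sec + c.wrap_sec for c in answered),
        # service level: share of ALL offered calls answered within target
        "service_level": sum(c.queue_sec <= pickup_target_sec
                             for c in answered) / len(calls),
        "abandon_rate": sum(c.abandoned for c in calls) / len(calls),
    }

sample = [
    CallRecord(5, 180, 30),
    CallRecord(25, 240, 60),
    CallRecord(95, 0, 0, abandoned=True),
    CallRecord(10, 300, 30),
]
metrics = call_center_metrics(sample)
print(metrics)
```

Note one definitional choice: abandoned calls stay in the service-level denominator, since a caller who gave up after a long wait is still a caller the center failed to answer in time.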

The acceptance and widespread deployment of VoIP has enabled the staffing of call centers with remote agents. These agents work from home, often on flexible part-time schedules. In the past, they would use a basic-rate ISDN line, but widespread broadband deployments have made high-speed Internet access easy to couple with VPN and VoIP solutions to provide a fully integrated solution for teleworkers. Clearly, the call center has been a key resource in large customer service organizations for many years. Today, the reduced barrier to entry in call centers has made the technology more available to a wide set of business environments.


IVR Systems

IVR is a computerized system that allows a caller to choose options from a voice menu and otherwise interface with a computer system. Typically, the IVR system plays prerecorded voice prompts, and the caller responds by pressing numbers on the telephone keypad to select among the options. Many IVR solutions also allow the caller to speak simple answers such as "yes," "no," or numbers in response to the prompts. These systems have grown more sophisticated in the past few years and do a much better job of recognizing human voice than they did when they first gained popularity. Newer systems use natural language speech recognition to interpret the questions that the person wants answered. There is also a growing trend called guided speech IVR that integrates live human agents into the design and workflow of the application to help the speech recognition with human context. Other innovations include the ability for the system to read out complex and dynamic information such as email messages, news reports, and weather information using Text-To-Speech (TTS) conversion tools. TTS is computer-generated synthesized speech that has advanced well beyond the robotic voice people may associate with computerized systems of the past. Human voices are used to create the speech in very small fragments that are assembled into very real-sounding responses before being played to the caller. IVR systems are used to create service solutions such as airline ticket booking and banking by phone. Unlike voicemail systems, which are one-way communications, IVR systems provide some level of two-way information exchange between the caller and the company systems. An ACD group in a call center may be the customer's first point of contact. IVRs are often used to provide the primary front end to a call center, automating the most common calls for account balances or other easily delivered information.
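Underneath the prompts, an IVR call flow is essentially a state machine keyed on caller input. The toy menu below is purely illustrative; the prompt text, options, and dictionary-based design are invented for this sketch, and a production system would be authored in a purpose-built language such as VoiceXML:

```python
# Minimal sketch of an IVR menu as a state machine driven by DTMF digits.
# All prompts and menu options here are hypothetical.

MENU = {
    "main": {
        "prompt": "Press 1 for account balance, 2 for an agent.",
        "1": "balance",
        "2": "agent",
    },
    "balance": {"prompt": "Your balance is $42.00. Goodbye."},
    "agent": {"prompt": "Transferring you to the next available agent."},
}

def run_ivr(keypresses):
    """Walk the menu with a sequence of keypad digits; return prompts played."""
    state, played = "main", []
    played.append(MENU[state]["prompt"])
    for digit in keypresses:
        state = MENU[state].get(digit, state)  # invalid input replays the menu
        played.append(MENU[state]["prompt"])
        if state != "main":
            break  # terminal states end this simple call flow
    return played

print(run_ivr(["1"]))  # main menu prompt followed by the balance prompt
```

The same structure scales to nested menus by adding states; a speech-enabled system swaps the digit lookup for a recognizer result but keeps the flow logic.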
IVR systems today are built with scripting languages such as VoiceXML or Speech Application Language Tags (SALT), not unlike the way Web pages are constructed. In the case of an IVR, the Web server also acts as the application server, allowing the developer to focus on call flows rather than graphic presentation. Typically, Web developers understand and are familiar with the tools in this environment and often don't require additional programming skills.

CTI

CTI provides for interaction between the telephone and computer systems. As technology has matured, CTI has expanded to include the integration of all customer contact channels, including voice, email, Web, and fax.

CTI History

CTI evolved from relatively simple screen population (or screen pop) technology. This technique allows data collected from the telephone system, most often via the touch pad, to be used as input data to query databases with customer information. That data can then be populated instantly to the customer service representative's screen. When the agent already has the required information on his or her terminal screen before speaking with the customer, overall transaction time is reduced.
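A screen pop boils down to a keyed lookup performed before the agent answers. In the sketch below, a plain dictionary stands in for the customer database, and all names and account numbers are invented for illustration:

```python
# Screen-pop sketch: use the caller-entered account number (or caller ID)
# to pre-fetch the customer record before the agent picks up the call.
# The "database" here is a plain dict standing in for a real CRM backend.

CUSTOMERS = {
    "10042": {"name": "Pat Jones", "plan": "Business", "open_tickets": 1},
    "10077": {"name": "Lee Smith", "plan": "Residential", "open_tickets": 0},
}

def screen_pop(account_number):
    """Return the record to display on the agent's screen, or a miss marker."""
    record = CUSTOMERS.get(account_number)
    if record is None:
        return {"name": "Unknown caller", "plan": None, "open_tickets": None}
    return record

# The IVR captured "10042" from the keypad; the agent sees this before pickup.
print(screen_pop("10042")["name"])
```

In a real deployment, the lookup key arrives from the telephony side (CTI event or IVR capture) and the query goes to the CRM system; the principle is unchanged.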


This technology began in the closed, proprietary roots of every PBX/ACD vendor in the market space. Most vendors eventually adopted the Computer Supported Telecommunications Applications (CSTA) standard, which originated in ECMA and was adopted as an ISO standard in 2000. Some other accepted service and application integration standards in CTI include:

- Java Telephony API (JTAPI), promoted by Sun
- TSAPI and TAPI: TSAPI was originally promoted by AT&T (later Lucent, then Avaya) and is by far the most widely adopted in large-scale contact centers; Microsoft pushed a separate initiative, thus TAPI was born, with support predominantly from Windows-based applications

Some commonly implemented functions in the CTI environment include:

- Information about the caller: this may include the caller's number, the telephone number called, and in many cases, which prompt in an IVR system the caller selected.
- Screen population (screen pop): this provides the agent with information about the caller before the agent picks up the call. One example is prompting a customer to enter an account number on the telephone dialpad, which allows the CTI system to retrieve the account and present that data to the agent, who can then be fully prepared to talk to the customer immediately.
- On-screen dialing tools, used by outbound call centers to increase productivity: speed dial options and predictive dialing features increase the rate of outbound call placement.
- Software screen controls, such as a VoIP softphone, that provide access to telephone features through the computer: functions such as answering a call, hanging up, placing a caller on hold, and initiating a conference might be performed with the mouse or keyboard shortcuts.
- Call transfer along with transfer of the data screen: the ability to transfer a call to a more senior agent or supervisor when the agent involved needs assistance in resolving a problem.
- Administrative functions and duties, such as logging in or out of an ACD group or completing wrap-up work.


CTI takes on two basic forms. The following excerpts from the Wikipedia online encyclopedia article about CTI provide helpful descriptions (Source: http://en.wikipedia.org/wiki/Computer_telephony_integration#Forms_of_CTI):
First-Party Call Control

First party call control operates as if there is a direct connection between the user's computer and the phone set. An example of this would be a modem card in a desktop computer, or a phone plugged directly into the computer. Typically, only the computer associated with the phone can control it, by sending commands directly to the phone. The computer can control all the functions of the phone, normally at the computer user's direction. First party call control is the easiest to implement but is not suited to large scale applications such as call centers.
Third-Party Call Control

Third-party call control is more difficult to implement and often requires a dedicated telephony server to interface between the telephone network and the computer network. Third party call control works by sending commands from a user's computer to a telephony server, which in turn controls the phone centrally. Specifically, the user's computer has no direct connection to the phone set, which is actually controlled by an external device. Information about a phone call can be displayed on the corresponding computer workstation's screen while instructions to control the phone can be sent from the computer to the telephone network. Any computer in the network has the potential to control any phone in the telephone system. The phone does not need to be attached directly to the user's computer, although it may physically be integrated into the computer (such as a VoIP soft phone), requiring only a microphone and headset in the circuit, without even a keypad, to connect to the telephone network.

Like many telecommunications technologies, CTI has made great advances over the past 10 years of evolution. The barrier to entry has lowered to a point that CTI is now available to small and midsized businesses and is widely adopted as a competitive tool to improve business processes.


Application Integration with CRM and Enterprise Resource Planning Systems

As business advances, there has been enormous acceptance and adoption of Enterprise Resource Planning (ERP)-driven, Web-centered collaboration in business-to-business (B2B) interaction. For many businesses, this is a competitive move. Companies around the world have focused on improving business processes for many years, spurred early on by W. Edwards Deming's Out of the Crisis and his 14 Points for Management. Those works, and others, drove business to improve processes, but with advances during the same time period in IT, we've seen growth in the use of ERP systems across businesses of every size and shape, creating more integrated enterprises. Today, efficiency in process drives many organizations to focus on teamwork. ERP systems help dissolve the barriers that exist between different business units or departments. They help break down the silo effect so common in the past and change how people work together. Web-centric business platforms and the success of e-commerce bring new levels of integration, allowing easier support for inter-organization business process integration between trading partners. The goal of ERP systems is to integrate all data and processes within an organization into a single unified system using computer hardware and software components. One key ingredient of most ERP systems is the use of a single, unified, master database to store data for all the disparate modules involved. ERP originally implied a system for planning the use of resources across the enterprise, and although it originated in the manufacturing sector, today it is much broader in scope. Today's ERP systems strive to encompass all the basic functions of an organization regardless of the business. ERP systems have spread beyond manufacturing to business, non-profits, governmental organizations, and other large enterprises.
The term ERP system generally refers to an application that replaces two or more independent applications within an organization, eliminating the need for interfaces between systems. This approach provides benefits ranging from standardization and lower maintenance (a single system rather than multiple systems) to improved, dynamic reporting capabilities, and it enables managers to better monitor the business. Some examples of different modules that might be available in an ERP system include:

- Manufacturing
- Supply chain
- Financials
- CRM
- Human Resources
- Warehouse management


Each of these discrete application modules supports processes that require both voice and data communications. Integrating voice services into a CRM module provides a unified customer database, easing customer contact to build relationships. Supply chain modules can build tighter communications processes with vendor partners. Manufacturing and warehouse communications integration can speed time to market. ERP represents the nervous system of an organization and provides an integration point that can tightly couple business process with communications tools to increase productivity and enhance efficiency.

Customer Relations and CRM

CRM encompasses a broad set of capabilities, methodologies, and technologies that support an enterprise in managing customer relationships. The general purpose of CRM is to enable organizations to better manage their customers through the introduction of reliable systems, processes, and procedures. CRM is a corporate-level strategy that focuses on creating and maintaining lasting relationships with customers. CRM tools look at the relationship between a business and its customers and help manage relationship building and process improvement. Although there are several commercial CRM software packages on the market that support CRM strategy, building value is not about the technologies. CRM often represents a holistic change in an organizational culture and philosophy, placing emphasis on the customer. To be effective, the CRM process needs to be integrated end to end across marketing, sales, and customer service. An effective CRM program needs to:

- Identify customer success factors
- Create a customer-based culture
- Adopt customer-based measures
- Develop an end-to-end process to serve customers
- Recommend what questions to ask to help a customer solve a problem
- Recommend what to tell a customer with a complaint about a purchase
- Track all aspects of selling to customers and prospects as well as customer support

As Figure 2.1 shows, CRM becomes central to both business process and IT systems. IT systems become a valuable repository of information about the enterprise business, providing insights into further improvements to increase productivity and profitability.


[Figure elements: Business Activity and IT Systems feed a Data Warehouse; Data Mining, CRM, and Knowledge Management turn the stored data into Information, which drives Informed Technology Choices, Intelligent Resource Procurement/Allocation, and the Right Service Bundles = VALUE.]
Figure 2.1: Building value with CRM.

Customer relationships are managed by a variety of communications tools. There is an old adage in sales that says, "People buy from people." CRM places the focus on the customer, without whom no business can survive. Integration of CRM systems with VoIP technologies provides tighter coupling of several aspects of relationship management. ERP and CRM solution vendors recognize the criticality of voice communications and are actively working on integration solutions. Because these systems are based on IP technologies, there is a natural synergy with VoIP technologies, which have matured to the point that creative convergence between the network infrastructure, services (voice and data communications), and applications can now provide a seamless working environment.


Delivering Call Quality with VoIP

Call quality is the single biggest factor leading to user acceptance of VoIP solutions. Later, this guide will talk about managing call quality; this section compares quality in the public switched telephone network (PSTN) with VoIP and looks at how voice conversations are handled.

Traditional Voice Characteristics: the PSTN

When you implement VoIP, you're trying to use technologies that have evolved in new ways over the past 30 years, specifically, IP. As you start adding voice traffic, a multimedia service, to an IP-based network, you encounter a challenge to networking in general. In the past, networks such as the PSTN were designed to perform specific functions and support a single type of traffic. With VoIP, you're using the Internet, or an IP network, to carry voice traffic. The problem you face isn't trivial. It actually has ramifications that ripple through every facet of network engineering in both the Internet and the PSTN. It's a new problem that wasn't anticipated. In the past, when you had a task that required a form of networking, you designed a new network to handle the task because different applications have completely different requirements. IP changed all that because anything that can be digitized can be carried in the payload of an IP packet. Everything we understand about voice is based on how it's been handled in the PSTN, so let's examine some basic facts about voice transmission:

- Voice calls require long duration for a conversation. A typical phone conversation lasts 3 to 4 minutes.
- Voice has historically been connection-oriented.
- Voice calls don't tolerate network delay well. 50 to 100 milliseconds of delay is the norm.
- Voice traffic is generally what is referred to as real-time traffic.
- Voice calls have traditionally been carried in a 4KHz voice channel, with the actual sound frequency being carried ranging between 300 and 3300Hz.
To convert analog voice traffic to a digital bit stream, you sample the voice at twice the maximum frequency of the channel (8000 times per second). Each sample is coded into an 8-bit word. These 8000 samples multiplied by 8 bits give 64Kbps of bandwidth, which matches the telephone network architecture of the PSTN.
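The arithmetic behind that standard 64Kbps channel is easy to verify; the short check below simply restates the Nyquist sampling math from the paragraph above:

```python
# Sanity check of the PCM arithmetic for a standard PSTN voice channel.
channel_bandwidth_hz = 4000              # the nominal 4KHz voice channel
sample_rate = 2 * channel_bandwidth_hz   # Nyquist: sample at twice the maximum frequency
bits_per_sample = 8                      # each sample coded into an 8-bit word

bit_rate_bps = sample_rate * bits_per_sample
print(sample_rate)    # 8000 samples per second
print(bit_rate_bps)   # 64000 bps, the 64Kbps DS0 channel of the PSTN
```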

These basic facts bring us to the root of a number of the technical problems encountered today. The public phone network has been consciously designed and optimized over a hundred years to deliver voice traffic in the best way possible. This traffic engineering (in telco-speak) has been a critical component of the management and growth of the PSTN. The telecommunications industry has spent years analyzing the network, monitoring its growth, and designing into the network countless optimizations to carry voice traffic effectively. It is a voice network that guarantees delivery of the traffic within users' expectations.


IP Traffic Characteristics

There is a packet switched public data network (PSPDN) that was specifically designed to carry packet data, and private data networks have been built around the world to deliver enterprise data applications. If you compare data networks, specifically IP networks like the Internet or the PSPDN, with the PSTN, you will find that they have very different characteristics:

- Long duration isn't necessary because transactions are short in duration, or bursty, in nature.
- Packets are small in size and can be routed over different paths.
- IP packets carry delivery information (addressing) in each individual packet. It is a connectionless environment. At higher layers, TCP may be used to layer on connection capability, but this requires the overhead of a three-way handshake and is commonly avoided unless necessary.
- IP packets aren't generally delay sensitive. Email messages sent over a network can be delayed 30 seconds with no problems whatsoever. Web traffic delivered to a browser can be delayed and loaded in the background. IP traffic is often non-real-time traffic. Delays are expected in an IP network. IP itself doesn't even guarantee delivery of the packet.
- IP uses the available bandwidth when it has data to deliver. There are 300bps modems that send data, and data is transmitted over an Ethernet LAN at 100Mbps. IP doesn't require dedicated bandwidth. Again, IP comes with no guarantees of performance or delivery.

Table 2.1 makes the differences even clearer.


Voice Traffic on the PSTN:
- Connection-oriented: a dedicated path is established for each telephone call; calls are long duration (4 minutes on average)
- Delivery is guaranteed once the call path is established
- Designed to use specific bandwidth; the PSTN uses a 64Kbps voice channel
- Real-time voice traffic is very sensitive to delay

IP Network Traffic:
- Connectionless: conversations are packetized and transmitted over the best route based on routing protocols; packets are small, so conversations are cut up into many packets
- Best efforts are made to deliver traffic, but there are no guarantees
- Uses the bandwidth that is available
- IP data traffic is delay insensitive

Table 2.1: Voice and data traffic comparison.


The table is brief, but it makes one very obvious point: IP was not designed to carry voice traffic. IP was designed specifically to carry bursty data over diverse paths and make a best efforts attempt at delivery. Data can be freely discarded along the way if there are any problems. In its very design, the technology is nothing like the PSTN.

IP networks today are designed, redesigned, and modified to support carrying any kind of traffic. Most often that is what is today called multimedia traffic, a combination of data, voice, and video. Using digitization, any media can be carried inside an IP packet. The challenge in IPT and video networks is delivering business-quality voice conversations with all the characteristics users have come to expect, such as fidelity, clarity, and near-instantaneous delivery. If users detect what seems to be unacceptable quality, time has shown they will most likely hang up and try again. If this happens frequently, users become dissatisfied and quit using the service. It's clear that to effectively deliver voice services, tools and mechanisms are required for measuring quality of service (QoS) and guaranteeing that network resources can support the traffic load without degradation of call quality.

One method to address call quality has been to overprovision the network. In many cases, this simply means adding more bandwidth. Although this approach might work initially, in the long run it's a road to ruin. First, it requires capital investment. Upgrading the capacity of connections, switches, and routers can be a very expensive undertaking. This approach might work for a time in a small local area network (LAN), but in a large enterprise network, it's often too expensive to be a practical solution. In addition, experience makes it clear that if the bandwidth is available, users will fill it.
Data applications can quickly consume the additional bandwidth, leaving you with the same congestion and problems you started with in delivering voice service. To achieve acceptable voice quality over an IP network, factors such as noise, delay, echo, and jitter must be managed to tolerable thresholds. Delay is present in every IP network due to the statistical multiplexing used in routers and the generally bursty nature of packet data. There will be delay. Jitter is nothing more than variation in delay. As different packets can easily take different paths through a large network such as the Internet, variable delay is fairly common. When transmitting voice, jitter can render the audio signal unintelligible to the human ear. Jitter needs to be tightly controlled. Delay can also be caused by the time required to perform the sampling in the codec used to digitize voice. Each router in the network can also induce packetizing and routing delays as decisions are made about how to route each packet. This is also referred to as nodal delay. When designing and managing an IPT service network, it's important to remember that delay is cumulative. Every place delay occurs, it adds to other delay in an end-to-end service between two people.
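One widely used way to quantify jitter is the running interarrival jitter estimator defined for RTP in RFC 3550: for each packet, the change in transit time is folded into a smoothed estimate with a 1/16 gain factor. The sketch below uses made-up timestamps purely for illustration:

```python
# RFC 3550-style interarrival jitter estimator.
# Transit time = arrival timestamp minus send timestamp; the estimator
# smooths the absolute change in transit time with a 1/16 gain factor.

def rtp_jitter(send_times, arrival_times):
    jitter = 0.0
    prev_transit = None
    for s, a in zip(send_times, arrival_times):
        transit = a - s
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter

# Packets sent every 20 ms; network delay varies between 40 and 55 ms.
send = [0, 20, 40, 60, 80]
arrive = [40, 65, 85, 115, 120]  # transit times: 40, 45, 45, 55, 40 ms
print(round(rtp_jitter(send, arrive), 3))  # 1.781
```

The smoothing is why a single late packet barely moves the reported jitter, while sustained delay variation drives it up; monitoring tools report exactly this kind of figure.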


Design Considerations and Class of Service

Networks today are large and complex systems, with a variety of applications running concurrently. They evolve and grow so quickly that they are almost organic in nature. There is a concern for any enterprise about the spiraling complexity of network design. When you design a specific set of criteria to support a given application, you create a class of service for that application. The danger in doing so is that every new class of service you create increases network complexity. An enterprise with numerous applications could quickly create an unmanageable set of service classes. Many network designers lean toward simplicity and suggest only a few critical service classes:

- Quick delivery for real-time traffic such as voice and video: real-time traffic requires quick delivery or the services are rendered unusable. Video collaboration tools and VoIP telephone calls don't work if the data experiences needless delays. This class of service is used to tag those services requiring quick delivery for usable service. It's important to note that streaming video from a server, such as watching a stored training video, is not real-time traffic; this traffic can be buffered and delayed with lower QoS requirements. Real-time traffic is most often person-to-person traffic. Person-to-system traffic generally doesn't possess the same requirements.
- Guaranteed delivery for mission-critical traffic: guaranteed delivery is most often used to support mission-critical data. The CEO's email isn't what is meant by mission critical; the data is what determines its criticality. Mainframe systems running IBM's SNA architecture require delivery but may be tolerant of delay. If packets are delayed, within reason, SNA continues to function just fine. If they're lost entirely, session timers fail and applications no longer work properly.
- Best efforts delivery for everything else: this is the same delivery mechanism used for all IP traffic today. Email, Web browsing, file transfers, and most applications function perfectly well using best efforts delivery. In most networks, the majority of traffic falls into this class.

It's easy to identify more classes of service in almost any network. For the sake of simplicity, these three classes give you adequate prioritization capability in this guide and in many networks. Although the debate continues over how many classes of service are really necessary, the bottom line is that any implementation of a QoS mechanism that distinguishes service classes is a vast improvement over the best efforts approach used by IP.
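In practice, classes like these are commonly signaled by marking the Differentiated Services Code Point (DSCP) in each packet's IP header; voice is typically marked Expedited Forwarding (EF, DSCP 46). The sketch below uses the standard socket option to set the mark; the AF31 choice for the guaranteed-delivery class is one common convention rather than a mandate, and the reported TOS value assumes Linux behavior:

```python
import socket

# DSCP values for the three service classes discussed above.
# The legacy TOS byte carries the DSCP in its upper six bits, so shift left by 2.
DSCP_EF = 46     # Expedited Forwarding: quick delivery for real-time voice/video
DSCP_AF31 = 26   # Assured Forwarding 31: one common mark for guaranteed delivery
DSCP_BE = 0      # Default: best efforts for everything else

def marked_udp_socket(dscp):
    """Create a UDP socket whose outgoing packets carry the given DSCP mark."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
    return sock

voice_sock = marked_udp_socket(DSCP_EF)
print(voice_sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))  # 184, i.e. 0xB8
```

Setting the bits is the easy part; whether routers honor the mark depends entirely on the QoS policy configured in the network.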


Classes of service are only one aspect of QoS; other factors are involved as well. Let's think about how enterprise networks evolved over time:

- Businesses implemented standalone mainframes, often to process accounting information. Computing power grew, and user terminals were added to connect users to the mainframe.
- With the advent of the PC in the 1980s, businesses started deploying PCs and constructing small LANs. These LANs were disconnected; they were essentially islands of information. Information was shared by carrying a disk from system to system (what used to be referred to as Sneakernet).
- The Internet grew in popularity in the 1990s. Companies added routers to connect LANs to each other and to the Internet. Many companies also developed an internal network, or intranet, to share company resources and information.
- Public Web servers were added as the World Wide Web became a widespread resource. E-commerce systems were implemented in many business sectors.
- Business automation tools such as enterprise resource planning (ERP), customer relationship management (CRM), and sales force automation were implemented.
- Today, real-time streaming voice and video traffic are being added to an already busy network.

There was a long-held perception among vendors and service providers that customers' networks were poorly designed. In truth, they often weren't designed at all. Like some unplanned cyber big bang, they exploded and grew over time based on the needs of the business they supported. It is important to understand that this happened for good reason. Companies in growth mode were successful. This often led to mergers and acquisitions, producing exponential growth. The economy boomed, and networks connected and merged quickly to meet the needs of rapidly growing business. Sometimes networks were redesigned and new technologies were integrated. In other cases, pressing business needs led to cobbling together the best that the beleaguered IT staff could manage. Some of these networks were kept operating only through long hours of frustration and ongoing reconfiguration.

Earlier, this guide noted that networks have typically been designed and optimized for specific types of services. Voice traffic was the first networked application. The PSTN was tuned and optimized to support narrowband voice channels with very low delay.


Traffic engineering for voice is a highly developed field. Its foundation is built on traffic patterns, the busy hour of the day, and statistical mathematics using something called the Law of Large Numbers. This law basically says that large groups are easier to predict than small groups: the larger the group, the more group members are likely to be near the average. For example, if you measure the heights of 10 people, a few unusually short or tall people can skew the average. If you measure one thousand people's heights, the sample is no longer skewed, because the larger sample size provides greater statistical accuracy. Both telephone company central offices and enterprise PBX systems are engineered for trunking requirements using a mathematical model known as the Erlang-B distribution. Using Erlang-B, call loads are measured in centi-call seconds (CCS), a unit equivalent to a 100-second call. 36 CCS represents a telephony circuit at maximum occupancy with zero idle time.
This guide won't explore these formulas in any depth. The focus is the practical business of managing IPT. For those interested in further study of traffic engineering, the basic formula is:

Offered Load = (Calls per Hour × Average Minutes per Call) / 60, measured in erlangs
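For those who want to experiment, the offered-load formula and the Erlang-B blocking probability it feeds into can be computed in a few lines. The recurrence below is the standard numerically stable form of Erlang-B; the traffic figures are made-up examples:

```python
# Offered load (erlangs) and Erlang-B blocking probability.
# Erlang-B via the standard recurrence:
#   B(0, E) = 1;  B(m, E) = E*B(m-1, E) / (m + E*B(m-1, E))

def offered_load_erlangs(calls_per_hour, avg_minutes_per_call):
    return calls_per_hour * avg_minutes_per_call / 60.0

def erlang_b(trunks, erlangs):
    b = 1.0
    for m in range(1, trunks + 1):
        b = (erlangs * b) / (m + erlangs * b)
    return b

# Example: 200 calls/hour at 3 minutes each = 10 erlangs of offered load.
load = offered_load_erlangs(200, 3)
print(load)                          # 10.0
print(round(erlang_b(15, load), 4))  # 0.0365: about 3.65% blocking on 15 trunks
```

This is the same calculation trunk-engineering tables encode; adding trunks drives the blocking probability down rapidly once capacity exceeds the offered load.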

The offered load in the formula isn't a problem in a network such as the PSTN because the network has been optimized to support the load. QoS is a manageable issue when the network is designed to serve a single purpose. In today's multimedia environment, a phenomenon often referred to as convergence has grabbed center stage. Separate parallel networks don't make either technical or economic sense. As all of today's data applications converge onto the IP infrastructure, the logical next step in that integration is, for many, to integrate voice, video, and data on a single, multi-service network.

As you look to IP networks as the next-generation architecture for delivery of multimedia services, the fundamental design of IP leads to some complex issues for engineers. IP, and all packet switched networks in use today, utilize statistical multiplexing inside the switches and routers, rather than the time division multiplexing (TDM) of the PSTN. The IP suite protocols and hardware were developed to handle all traffic types, regardless of payload, in the same manner. Routers and other network nodes have traditionally handled traffic not only on a best efforts basis but often using a first-in first-out (FIFO) approach to processing the packet flow.


QoS
Classes of service describe what's needed in the network. QoS characteristics look at the technical aspects that make it possible to support those classes of service. There are several characteristics within a network that may need to meet certain performance levels for any given service or application:

- Availability: usually represented as the uptime percentage, availability is based on the simple premise that the network is there and available for use whenever it's needed. In commercial telecommunications networks, the industry standard is referred to as "5 nines" reliability, or 99.999% uptime. This availability percentage equates to roughly 5 minutes of downtime per year. For many business networks, that is a challenging availability threshold to meet.
- Reliability: tied closely to availability, reliability is also a design factor. It includes network architecture features such as redundant paths and duplicated or fault-tolerant equipment to ensure that the network remains accessible in the event of a failure.

How necessary is reliability? One consideration is how much effort to put into building reliability into the network. Does the network need redundant paths and fault-tolerant or high-availability equipment to guarantee availability in the event of a failure? For businesses that only operate during the 9-to-5 workday, reliability may be an area of less importance.

- Throughput: bandwidth is the most common measure used for throughput. This is simply a measure of how much data inserted into the network at any given source is successfully transmitted through to the destination on the far end over a specific period of time.
- Error rate: many applications are reasonably tolerant of data loss because the TCP/IP suite relies on higher-layer protocols such as Transmission Control Protocol (TCP) to request retransmission and ensure delivery in the event of errors. Not all applications are tolerant of the delay that retransmission introduces, however.
- Delay: as described earlier, delay is simply a reality in IP networks because routers use statistical multiplexing to process traffic and much of the transmitted data on any network has a bursty characteristic. There will be some delay in any transmission network due to the laws of physics. Naturally, a lightly loaded or over-engineered network will have lower delay.


- Jitter: Jitter is variation in the delay. Because packets can take different routes across the network, delay variation in IP networks is common. Jitter in VoIP conversations results in unintelligible, jerky-sounding conversations. In video transmission, the video stream can break up and show visual signs of jerkiness as well. One common test technicians have used for years is the "jumping jack" test: the constant motion of a person doing a few jumping jacks gives a good indication of the perceived quality of a video stream.
- Scalability: As companies grow and businesses change, the ability of the network to grow with increased needs is an important consideration.
- Manageability: Many studies have demonstrated that the most expensive component of network services is the ongoing operational support of network management and administration. Additions and changes to the existing network can be very labor intensive, and many system designers now consider this factor very early in the development process. A network with a 5-year life cycle will often cost far more to manage over the course of its lifetime than the initial capital cost to build.
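Jitter can be quantified from successive one-way delay measurements. As an illustration (a simple Python sketch, not a standards-defined formula), averaging the differences between consecutive packet delays gives a usable jitter figure:

```python
def average_jitter(delays_ms):
    """Average variation between consecutive one-way packet delays, in ms."""
    if len(delays_ms) < 2:
        return 0.0
    diffs = [abs(later - earlier)
             for earlier, later in zip(delays_ms, delays_ms[1:])]
    return sum(diffs) / len(diffs)

# Steady 30 ms delays mean zero jitter; delays wandering between
# 20 ms and 35 ms average 9 ms of jitter in this sample.
print(average_jitter([30, 30, 30, 30]))      # 0.0
print(average_jitter([20, 25, 35, 22, 30]))  # 9.0
```

RTP receivers report a smoothed interarrival jitter estimate via RTCP (RFC 3550); the simple average above is only meant to make the concept concrete.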

QoS Approaches in IP Networks

As noted several times, IP is a best-efforts protocol that provides no guarantees. That is not to say that the various parameters discussed so far cannot be obtained using IP; however, network design becomes a vital factor in any implementation of converged services. Although the techniques used vary widely, all approaches to QoS share a common characteristic: with the exception of over-provisioning, QoS is simply a means of implementing a prioritization scheme. In some cases, this prioritization might involve a routing mechanism that aggregates similar traffic types and routes each traffic type over a path suited to its needs. There are many approaches with intricate differences, but every approach, save one, adds some form of overhead to the traffic to provide either a prioritization function or a "traffic cop" function to direct traffic appropriately. The one exception, commonly referred to as gigabandwidth (over-provisioning), has been seen by many as the ultimate solution, but this fallacy only defers the need for proper engineering to a future point in time.


[Figure 2.2 shows the IPv4 packet layout: Version, IHL, Type of Service, Total Length, Identification, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address, Options, Padding, and the Data/User Payload.]

Figure 2.2: The TCP/IP packet structure.

Figure 2.2 serves as a reminder of the format of an IP packet, but it's also shown to review one field in particular. There is a Type of Service (TOS) field in the structure of the IP packet that can be, and often is, used as a QoS mechanism. This prioritization tool was provided in the original specifications for IP. The TOS field, expanded in Figure 2.3, is one octet, 8 bits in length. It consists of several components. Precedence is used purely as a raw prioritization mechanism: the first three bits in binary can represent a precedence value from 0 through 7, providing eight possible levels of prioritization for an IP packet. The higher the precedence value, the higher the priority assigned to the traffic.

IP still doesn't guarantee a level of acceptability; this is merely a prioritization scheme.

- Delay (flagged as D in the figure) indicates whether the packet requires low delay or can tolerate higher delay. A one indicates low delay is required; a zero indicates more delay is tolerable.
- Throughput (flagged as T in the figure) is a relative indicator, with a one indicating the need for higher throughput or more bandwidth.
- Reliability (flagged as R in the figure) is signified with a one to indicate that a more reliable path is needed.
- Cost (flagged as C in the figure) remains generally undefined and misunderstood in use or intent.
- Unused (flagged as U in the figure) is the last bit, which remains available for future use.
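The sub-fields above can be pulled out of the TOS octet with a few shifts and masks. A small Python sketch, assuming the field order just described (3 precedence bits, then the D, T, R, and C flags, with the final bit unused):

```python
def parse_tos(tos):
    """Split the 8-bit IPv4 Type of Service octet into its sub-fields:
    a 3-bit precedence value followed by the D, T, R, and C flag bits."""
    return {
        "precedence":       (tos >> 5) & 0b111,
        "low_delay":        bool((tos >> 4) & 1),
        "high_throughput":  bool((tos >> 3) & 1),
        "high_reliability": bool((tos >> 2) & 1),
        "low_cost":         bool((tos >> 1) & 1),
    }

# Precedence 5 with the low-delay flag set: binary 101 1 000 0 = 0xB0
print(parse_tos(0xB0))
```

Running this on 0xB0 reports precedence 5 with only the low-delay flag set, which is how a router honoring the field would recognize delay-sensitive traffic.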


[Figure 2.3 shows the TOS octet: 3 bits of precedence providing 8 levels of prioritization, followed by four 1-bit fields that further identify packet requirements.]
Figure 2.3: The IP TOS field.

Although the designers of IP built this prioritization capability into the protocol, vendors rarely implemented or took advantage of this feature in the past. For many years, router vendors created OSs that didn't even read the field when processing packets. This was due mainly to a complete lack of standardization: the Internet Engineering Task Force (IETF), the IP standards body, never assigned values to any of the sub-fields in the TOS field. Thus, a precedence of zero might be the highest precedence to one vendor and the lowest to another. Real-world implementations vary greatly in the absence of a standardized approach, but the field has been used as part of a prioritization scheme more in the past couple of years. Once you accept that IP provides no guarantees of any kind for either prioritization or delivery, the next step becomes clear: if a user needs to specify some particular network requirement, or QoS, something must be added to IP to provide it, in some other layer of the protocol stack. The following methods are all in use today; they implement QoS in IP by layering some other overhead into the data stream. There are other techniques used to provide QoS, but the following sections explore three common approaches.


QoS: the Signaled Approach

Integrated Services (IntServ) introduces a signaling protocol to IP. Using IntServ, the user's application sends a call setup signal to the network. This signal is basically a request for a set of service delivery parameters required to complete the call. Using this approach, the network can check resource availability and deliver traffic accordingly. The process is very similar to the circuit setup and teardown associated with a telephone call on the PSTN.

Under IntServ operations, users, through automated software processes, request a particular type of service from the network using the Resource Reservation Protocol (RSVP). RSVP has two key aspects: policy control and admission control. Policy control determines whether the user has authorization to request the specified resources from the network. Assuming the user has the necessary permission, admission control governs the process for allocating and reserving the requested resources, or setting up the path for the connection. RSVP passes the user requirements from node to node through the network, requesting resources. If any node along the way is unable to comply, the request is denied and no connection is established.

Real-time Transport Protocol (RTP) is a widely used supporting protocol in IntServ deployments. It's used to assure the integrity of a session through timestamping and sequencing of UDP segments. Real-time Transport Control Protocol (RTCP) is used in conjunction with RTP and provides some level of monitoring capability.

Although IP theoretically has a rather comprehensive prioritization scheme available, IntServ provides three levels of prioritization. They're familiar because the classes of service identified earlier were derived from the development of IntServ:

- Best effort: This is exactly what IP provides in existing networks.
- Controlled load: Described in the standard as providing a data flow with "a quality of service closely approximating the QoS that same flow would receive from an unloaded network element." This is equivalent to the guaranteed delivery class of service described earlier.
- Guaranteed service: Provides a higher level of assurance, which might be required for real-time traffic such as voice or video. Although called guaranteed delivery in IntServ, this mirrors the quick delivery class described earlier in this chapter.


RSVP

RSVP provides a signaling capability that allows a user's application to send a request to the network reserving a particular set of requirements for a voice or video session. These requirements are referred to as the template.

"User applications" are referred to rather than "users" because it's important to understand that the person using the computer will not be required to understand QoS or application requirements. The applications will have requirements written into the code to automatically request the service levels needed.

RSVP allows user applications to identify three of the parameters reviewed earlier that might be necessary for a given application to work properly:

- Throughput or bandwidth
- Delay
- Jitter

In Figure 2.4, the user application makes a request of the network. RSVP policy-control software then evaluates the request against a permissions table and either confirms that the user has the necessary permission to reserve the requested resources or denies access. Assuming the user has permission, RSVP then engages admission-control software to determine whether the network has the necessary resources available to complete the request. The template is passed along the network from node to node in a PATH message until it reaches the receiver, and each node must run an RSVP daemon to validate the request for QoS. In simple form, this is a request passed from router to router identifying a need: if an intermediate router can meet the need, the request gets passed to the next node; if not, the request is denied. Intermediate nodes play a vital role in RSVP because every intermediate node must be able to provide the QoS requested in the PATH template. Lack of resources at any point along the path can cause the call to be dropped.
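The hop-by-hop admission check can be sketched in a few lines. This toy Python model (the node structure and field names are hypothetical, and it models only the bandwidth parameter) shows the core idea: if any hop cannot satisfy the template, the request is denied:

```python
def request_path(template, nodes):
    """Pass a QoS template hop by hop; deny if any node lacks resources.

    Note: this sketch does not roll back reservations made at earlier
    hops when a later hop denies the request, as a real implementation
    must when it tears down a failed setup.
    """
    for node in nodes:
        if node["free_kbps"] < template["kbps"]:
            return False                        # a single refusal drops the call
        node["free_kbps"] -= template["kbps"]   # reserve capacity at this hop
    return True

path = [{"free_kbps": 512}, {"free_kbps": 256}, {"free_kbps": 256}]
print(request_path({"kbps": 128}, path))  # True: every hop can comply
print(request_path({"kbps": 256}, path))  # False: a hop is now short of capacity
```

The second request fails even though the first hop still has room, which is exactly the property that makes every intermediate node a potential point of denial.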


[Figure 2.4 shows a sender, three intermediate routers, and a receiver: the template travels from sender to receiver in PATH messages, and the flowspec returns from receiver to sender in RESV messages.]
Figure 2.4: Using RSVP to reserve network resources.

Once a path has been identified, each node must pass a flow specification in a reservation (RESV) message back along the network from node to node. Having gone to the trouble of reserving resources across the network, you now must ensure that the allocated resources are used. This flowspec reservation is each router's way of knowing where the next router in the reserved path is, so it can pass packets associated with the call or session.

There are clearly several factors to consider when implementing RSVP for QoS. These factors become crucial when using the Internet, as those routers are not all under the control of the implementer:

- RSVP is overhead intensive. Call setup isn't something IP was designed for, and adding a signaling protocol to the mix increases the processing CPU cycles required at each router in the path.
- Routing protocols such as RIP and OSPF support only a single routing metric; they don't understand the concept of reservations. RSVP does not provide any solution for this issue. The implication is that to be successful, the network may have to be over-engineered beyond expected carrying capacity.
- RSVP scales very well in the multicast network environment (one-to-many transmissions) but does not scale well for unicast (one-to-one) traffic. As telephony is primarily a one-to-one connection for an end-to-end service, scalability to support VoIP and video conferencing services can quickly cause problems in a large enterprise network. In a multi-provider network such as the Internet, where nobody owns all the routers, scalability may be unattainable.


- As all the intermediate nodes must be able to comply with requests, every router in the Internet would have to be either upgraded to support RSVP or replaced. This would require universal support and acceptance that just doesn't exist.
- Routers have to maintain state tables containing information about every session. The processing load and memory requirements can drive router cost up significantly.
- RSVP doesn't provide any QoS whatsoever; it's simply a mechanism for sending requests to the network for some specific requirements. It's a signaling protocol, and it still requires help from other protocols or other methods to truly implement QoS.

Many network designers agree that IntServ and RSVP provide a potential solution, but only in a network that is completely under the implementer's control. Each and every intermediate router still presents a potential single point of failure. In the Internet, data crosses many provider networks en route to its destination. IntServ can work well in a fully controlled environment, like an enterprise-owned private network. It may also prove useful at the edges of the Internet, either in customer networks or the local metropolitan portion of the Internet provider's network. Internet service providers have almost universally chosen not to adopt IntServ as a solution to the QoS problem. The value provided just doesn't offset the cost of conversion and implementation; IntServ fails the ROI case for service providers. IntServ has found some application in private networks and is still supported by all the major router vendors. Resource Reservation Protocol for Traffic Engineering (RSVP-TE), a variation, has been deployed with success in conjunction with Multiprotocol Label Switching (MPLS).

QoS: the Provisioned Approach

Another approach, Differentiated Services (DiffServ, also called DiffServ Code Point or DSCP), requires specific routes through the network that are predefined and available for each type or class of traffic. These paths might be pre-existing as part of the network design, or they might be set up on demand in some manner; this second option might involve a signaling protocol such as RSVP to establish the paths. The DiffServ method is often used as a traffic aggregation approach in order to direct similar traffic types onto similar network routes. This guide won't consider the evolution to IPv6 but will make an observation: in the development of IPv6, the TOS field was the subject of direct discussion, and the protocol has been expanded to provide improved granularity in the delivery of QoS.
In the IPv6 packet, there is a field referred to as the DiffServ field as a replacement for the current TOS field. It uses six bits of this field as the DSCP to identify how the nodes in the network should handle each packet. Routers handle packets based on a set of forwarding treatments or Per Hop Behaviors (PHB). These rules must be predefined in each network element or node. This is the provisioned approach to QoS in the network.


DiffServ is also a development of the IETF. The DiffServ working group charter and information is available at http://www.ietf.org/html.charters/DiffServ-charter.html. The objective of this working group is to employ a small, well-defined set of building blocks from which a variety of aggregate behaviors may be built. This suite is also defined by a series of RFCs:

- RFC 2836: Per Hop Behavior Identification Codes
- RFC 2474: Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers
- RFC 2475: An Architecture for Differentiated Services
- RFC 2597: Assured Forwarding PHB Group
- RFC 2598: An Expedited Forwarding PHB
- RFC 2983: Differentiated Services and Tunnels
- RFC 3086: Definition of Differentiated Services Per Domain Behaviors and Rules for their Specification
- RFC 3140: Per Hop Behavior Identification Codes
- RFC 3246: An Expedited Forwarding PHB
- RFC 3247: Supplemental Information for the New Definition of the EF PHB
- RFC 3248: A Delay Bound Alternative Revision of RFC 2598
- RFC 3260: New Terminology and Clarification for DiffServ

The DiffServ approach is to categorize traffic into classes of service. Similar services are aggregated and treated the same way in the network; thus, network paths have to be preconfigured to support each class of service. Packets are tagged at the edge of the network, and the appropriate forwarding treatment for that class of service tag is applied. This approach results in a much coarser granularity at each router and reduces the need for large state tables, which helps control the need for CPU processing power.

DiffServ has two primary components, described in RFC 2475:

- Packet marking redefines the TOS field of the packet and uses six bits as a coding scheme to classify packets into a class of service. The use of six bits provides a prioritization scheme that can identify 64 different traffic types or aggregates.
- PHBs govern how an individual class or aggregate is handled, defined via behavior aggregates. In essence, the PHB describes the scheduling, queuing, and traffic-shaping policies used at a specific node for routing the traffic.
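Because the DSCP occupies the six high-order bits of the old TOS octet, marking and reading it is simple bit arithmetic. A Python sketch (EF = 46 is the standard Expedited Forwarding code point commonly used for voice):

```python
def tos_from_dscp(dscp):
    """Place a 6-bit DSCP into the high-order bits of the TOS octet."""
    return (dscp & 0x3F) << 2

def dscp_from_tos(tos_byte):
    """Recover the DSCP from a received TOS octet (64 possible values)."""
    return (tos_byte >> 2) & 0x3F

EF = 46  # Expedited Forwarding, typically used to mark VoIP media
marked = tos_from_dscp(EF)
print(hex(marked))            # 0xb8
print(dscp_from_tos(marked))  # 46
```

The round trip through 0xB8 is why packet captures of VoIP traffic so often show a TOS byte of 0xB8: it is simply DSCP 46 shifted into position.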

DiffServ can scale to very large enterprise and provider networks. It is widely supported by manufacturers and has been deployed by several large service providers. There is far more to DiffServ than can be addressed in this guide, but information and technical specifications are easily located on the Web.


QoS: the Bypass/Shim Approach

MPLS is a method of providing QoS that removes the usual hop-by-hop IP routing from the equation. MPLS adds a tag to every packet; this tag shortcuts delivery by directing the packet to the best available path for the type of traffic associated with the MPLS tag. MPLS is compatible with frame relay and ATM networks, and it has been widely adopted in enterprise and service provider networks for adding QoS capabilities to support VoIP and video traffic. MPLS has often been referred to as a bypass or shim approach to QoS. Because of the addition of a tag into the data stream, it's also often referred to as a Layer 2 protocol.
Using MPLS for QoS

Implementing VoIP makes companies re-evaluate their service networks, and it often brings fundamental changes to the way these IP networks operate. As has already been reviewed, IP is a best-efforts protocol with no assurances of delivery or quality. When organizations implement VoIP, many discover a compelling need to implement some QoS methodology in order to provide acceptable voice call quality. MPLS is a method that allows packet-based networks to essentially emulate some of the behavioral properties of a circuit-switched network, such as the PSTN. The following section provides a look at MPLS as one method that's frequently used to deliver QoS assurance in a large-scale enterprise environment. MPLS evolved from several different but similar techniques. One of the leading approaches was developed by a group of engineers at Cisco Systems; initially, this proprietary protocol was called tag switching. As the technology evolved, industry leaders came together in a consolidated effort under the auspices of an IETF working group pursuing an open standard that became MPLS.
How MPLS Works

MPLS works by pre-pending packets with an MPLS shim header, or tag, at the beginning of the packet. Although this guide is focused on IP networks, MPLS works equally well with frame relay frames and ATM cells in those networks. This shim header contains one or more labels and is often called a label stack.


[Figure 2.5 shows a 4-byte shim header (1 to n entries) inserted between the Layer 2 header (Ethernet, PPP, frame relay DLCI, or ATM VPI/VCI) and the Layer 3 IP packet. Each entry carries a 20-bit MPLS label, a 3-bit experimental (COS) field, a 1-bit bottom-of-stack flag, and an 8-bit TTL.]
Figure 2.5: The MPLS packet and MPLS encapsulation.

As Figure 2.5 shows, each label stack entry contains four fields:

- A 20-bit label value
- A 3-bit experimental field, often used to denote class of service
- A 1-bit flag to signify whether this label is the last label in the stack
- An 8-bit time to live (TTL) field
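Those four fields fit exactly into a 32-bit word, which makes the layout easy to demonstrate in code. A Python sketch packing and unpacking one label stack entry (the field values are arbitrary examples):

```python
import struct

def pack_label_entry(label, exp, bottom, ttl):
    """Pack one 32-bit MPLS label stack entry:
    20-bit label | 3-bit EXP (class of service) | 1-bit bottom-of-stack | 8-bit TTL."""
    word = ((label & 0xFFFFF) << 12) | ((exp & 0x7) << 9) \
           | ((bottom & 1) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)  # network byte order

def unpack_label_entry(data):
    (word,) = struct.unpack("!I", data)
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bottom_of_stack": bool((word >> 8) & 1),
        "ttl": word & 0xFF,
    }

entry = pack_label_entry(label=18, exp=5, bottom=1, ttl=64)
print(unpack_label_entry(entry))
```

Unpacking the packed entry returns label 18, EXP 5, bottom-of-stack set, and TTL 64, confirming the bit boundaries.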

These MPLS-labeled packets are switched by performing a label lookup/switch instead of a lookup into an IP routing table. Because label switching can be performed within the switching fabric of the hardware, it processes much faster than traditional router-based IP address lookups. In an MPLS network, routers become label switching nodes that add and remove labels as needed; you'll hear the term "label popping" from time to time. Using this method, a large routed network looks like one single routing hop at Layer 3 of the TCP/IP stack. Path selection is based not on a routing metric but on the MPLS label. Routers may be thought of as Label Switching Routers (LSRs) through the core of the network, with Label Edge Routers (LERs) at the border points of the MPLS domain. Routers that provide ingress and egress to the MPLS domain are often called Provider Edge (PE) routers.

The Class of Service (COS) field is used to assign classes of service much as described earlier in the chapter in the section that looked at three classes; MPLS often adds a fourth class for management traffic. Typical MPLS classes of service are:

- Real-time traffic, such as voice and interactive video, is often given the highest priority to ensure that adequate bandwidth can be provided. This class of service is also designed to provide the delay, packet loss, and jitter characteristics suitable for delivery of real-time traffic such as VoIP and video.
- Mission-critical data traffic, such as that from mainframe computers, is often classified in a guaranteed delivery class of service. Timing of delivery is often a more stringent requirement than bandwidth for this type of data.
- Management traffic is commonly aggregated into a management class of its own. Management traffic requires some assurance that it can still be passed regardless of congestion in the network, to ensure QoS is maintained.
- All remaining traffic is generally lumped into a best-efforts class of service that mirrors how IP networks deliver traffic normally.

The Experimental Bits and QoS

Using this approach, similar traffic types can be aggregated into the same class of service, not unlike DiffServ. When addressing is applied to the packets, they might be labeled by an application, a router, a switch, or some other mechanism at ingress to the network. This COS header is used to aggregate packets into what is called a forwarding equivalency class (FEC) for switching throughout the network. The experimental bits are used by most MPLS vendors to carry 3 of the 6 DSCP priority bits from the IPv4 header, as a simple emulation of traffic prioritization.

When a labeled packet arrives at an MPLS router, the topmost label is popped and examined. Based on the contents of that label, a swap, push, or pop operation is performed on the packet's label stack. Routers can have predefined lookup tables so that they can process the packet very quickly:

- In a swap operation, the label is swapped with a new label, and the packet is forwarded along whatever network path is associated with that new label.
- In a push operation, a new label is pushed on top of the existing label, effectively encapsulating the packet in another layer of MPLS. This allows the hierarchical routing of MPLS packets. Notably, this is used by MPLS VPNs.
- In a pop operation, the label is removed from the packet, which may reveal an inner label below. If the popped label was the last on the label stack, the packet leaves the MPLS tunnel or domain. This is usually done by the network egress router.
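The three operations amount to simple manipulations of a stack. A toy Python sketch (the label values and the sequence of operations are invented for illustration):

```python
def swap(stack, new_label):
    """Replace the topmost label; the packet follows the new label's path."""
    stack[-1] = new_label

def push(stack, new_label):
    """Encapsulate the packet in another layer of MPLS (e.g. an MPLS VPN)."""
    stack.append(new_label)

def pop(stack):
    """Remove the topmost label, possibly revealing an inner label below."""
    return stack.pop()

stack = [34]       # packet enters the MPLS domain with one label
swap(stack, 43)    # a core LSR swaps 34 for 43
push(stack, 99)    # entering a nested tunnel pushes a second label
pop(stack)         # leaving the tunnel pops it, revealing label 43
last = pop(stack)  # popping the bottom-of-stack label: packet exits MPLS
print(stack, last)  # [] 43
```

Once the last label is popped, only the payload remains and the egress router falls back to ordinary routing, just as the text describes.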


[Figure 2.6 traces a request toward IP network 114.1: the ingress LSR maps the destination to label 34.22, a core LSR swaps 34.22 for 43.18, and the egress LSR pops the label and forwards the packet into IP network 114.1.]
Figure 2.6: Label switching with MPLS.

During these operations, the contents of the packet below the MPLS label stack (the IP packet and payload) are not opened or processed; intermediate transit routers only need to read the topmost label on the stack. Packet forwarding decisions are based on the contents of the labels, providing protocol-independent packet forwarding. There is no need to look at a protocol-dependent routing table, which eliminates the cumbersome longest-prefix-match lookup that IP routing requires at each hop. When the packet leaves the network at an edge router and the last label has been popped, only the payload remains. This can be an IP packet or any of a number of other kinds of payload in other networks. The edge router must therefore have routing information for the packet's payload, because it must forward the packet using traditional routing methods. An MPLS transit router has no such requirement.
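The difference in lookup cost can be made concrete. In this Python sketch (the tables and port names are hypothetical), the MPLS node does a single exact-match lookup, while the IP router must find the longest matching prefix among all candidates:

```python
import ipaddress

# Exact-match MPLS forwarding table: incoming label -> (out port, out label)
label_table = {34: ("port1", 43), 17: ("port2", 99)}

def label_lookup(label):
    return label_table[label]  # one hash lookup, no prefix scan

# IP routing table: destination prefix -> out port
routes = {
    ipaddress.ip_network("10.0.0.0/8"): "port1",
    ipaddress.ip_network("10.1.0.0/16"): "port2",
}

def longest_prefix_match(addr):
    addr = ipaddress.ip_address(addr)
    candidates = [net for net in routes if addr in net]
    return routes[max(candidates, key=lambda net: net.prefixlen)]

print(label_lookup(34))                  # ('port1', 43)
print(longest_prefix_match("10.1.2.3"))  # port2: the /16 beats the /8
```

Real routers use hardware tries rather than a linear scan for the prefix match, but the structural point stands: the label case is a single exact lookup.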
For more information about MPLS, check out the following sites:

- MPLS Resource Center at http://www.mplsrc.com/index.shtml
- MPLS MFA Forum at http://www.mplsforum.org/
- IETF MPLS Working Group at http://www.ietf.org/html.charters/mpls-charter.html
- IETF RFC 3031 at http://www.ietf.org/rfc/rfc3031.txt


Comparing MPLS to IP

MPLS can't be compared directly with IP as a separate entity; they are complementary protocols. MPLS works with IP and IP's interior gateway protocol (IGP) routing protocols, bringing a level of basic traffic engineering capability to IP networks. MPLS relies on traditional IGP routing protocols to construct the label forwarding table, and the scope of any IGP is usually restricted to a single service provider for stability and policy reasons. There isn't a current standard for interoperable carrier-to-carrier MPLS, so it's not yet practical to span one MPLS service across more than one carrier.
MPLS Deployment

MPLS is currently in use in large IP networks and is standardized by the IETF in RFC 3031. In practice, MPLS is mainly used to forward IP packets and Ethernet traffic. The major applications of MPLS are telecommunications traffic engineering and MPLS VPNs.

Traffic engineering considerations with MPLS include:

- IGPs mainly use shortest-path algorithms
- Path overlap causes congestion in the network
- Traffic can overwhelm a short path while other paths are underutilized
- IGPs in large networks present challenges: equal-cost multi-paths (ECMPs) share loads when they really should not, load sharing doesn't happen on multiple paths with different costs, and IGP metrics modified to shift traffic tend to have side effects

Summary
This chapter began by considering the business models and processes at a high level, because they're most often the ones directly affected by VoIP. They're also central focal areas as the concept of unified communications brings network services, such as voice and video, and applications together. The chapter then moved into factors crucial to providing call quality; the network can't effectively converge and thrive if suitable quality isn't present. The next chapter will delve into drivers for change in three areas: business drivers, cost reduction drivers, and strategic drivers.


Chapter 3: Business Drivers and Justification


In business, there are many factors to consider when addressing the shift in networking paradigms to bring about what many today call Web 2.0 integration. Even the phrase "Web 2.0" has spun off variations like Voice 2.0 and Office 2.0. The next generation of network-centric solutions is a key business driver today. In the earlier days of these technologies, convergence was viewed as a cost-reduction technique for business. Although cost reduction is important and remains a driving factor, it has become a minor factor in voice and data integration for many businesses. For large enterprises, convergence brings about a unified, single-network bill from the carrier. Integrating voice and data can lead to consolidation of staff as telephony and data services converge onto a single infrastructure; this early driver has proven a factor only in the largest of enterprises. Today, what integration brings about is a competitive edge. And sometimes it brings revenue to the bottom line, because it enables new revenue streams that couldn't be fully captured in the past. VoIP services coupled with Customer Relationship Management (CRM) tools bring responsiveness, speed, and knowledge about customers that can provide a measurable differentiator in customer service delivery. Years ago, Bill Gates articulated a strategy of business knowledge being central to a company's "digital nervous system." Today, service integration couples knowledge with responsiveness to deliver business solutions more quickly than ever, and convergence of voice and data tightly couples business intelligence with business relationships. The real success of Web 2.0 lies not in technology alone but in integrated, comprehensive business services. Chapter 2 looked at some of the technology drivers for convergence. Building on that foundation, this chapter will explore business motivations for service integration and how they may mirror the technological drivers.

Vertical Market Business Drivers for Change


In his book Telecosm, George Gilder popularized the phrase "disruptive technologies." In the telecommunications industry as a whole, there have been many disruption points, including the migration from circuit-switched technologies to packet switching, advances in optical networking in the core, and broadband access technologies such as DSL, cable, and EVDO wireless broadband services. Let's take a brief look at disruptive technologies and variations that aren't directly related to unified communications. Technology change enables business process change, but these changes often occur at different rates. The adoption cycle for new technologies is fueled by the success stories of early adopters. In the Internet today, how you learn about new technologies, advances in existing methods, and the success stories that drive business process has changed.


A Side Note on Blogs and Wikis

A Weblog, or blog, is a Web site presented in something of a journal form. Entries are most frequently displayed from newest to oldest (in reverse chronological order). Blogs are often focused on a specific subject or interest area, such as technology, a hobby, or regional events. Some blogs are simply online diaries, but others represent citizen journalism or focus on tightly defined interests, such as VoIP. Many blogs use a combination of text, photos, videos, and recorded podcasts as information-sharing methods. Blogs have become a key component of the Web 2.0 evolution in unified communications; they are a way to communicate. Blogs began as a hobbyist's tool for sharing journal-like information. Today, many companies and corporate executives use blogs as a tool for sharing information with readers and customers.

A wiki is another kind of Web site, one that allows visitors to add, edit, and delete information. Wikis can be easy to use, simplifying interaction and delivering an effective collaboration tool for multiple contributors. One of the most useful examples is the online encyclopedia Wikipedia (http://www.wikipedia.org/). Wikis are rapidly becoming another tool used in business to interact with customers and business partners. They can provide a framework for collaboration among developers or early adopters, as well as a brainstorming mechanism when working with partners to develop new products and solutions.

In the leading edge of net-centric business, the buzz-phrase Web 2.0 captures the attention of the most competitive business leaders. Blogs and wikis permeate the business environment with one root motivationthe need to share information and support relationships. Many companies, especially in the technology sector, use these tools to partner with customers through online discussion, dialog, and collaboration. This revitalized interest in interactivity is recognized as a highly competitive edge. Open dialog often brings a degree of personalization to an otherwise faceless corporation. Done well, these new tools couple with more traditional business systems to improve results. These tools also enable vendors to share their new technologies. Blogs and syndication tools such as Really Simple Syndication (RSS) let businesses follow trends and advances more closely than ever. These Web 2.0 trends disrupt every type of business in some way. For some companies, theyre a tool for sharing information; for others, they present a tool for learning about how to be more nimble and use emerging technology trends to the best competitive advantage. Much of the business disruption in the market has been driven by record numbers of small businesses adopting the Internet as an asset. This adoption allows small companies to level the playing field with larger corporations. There is a famous cartoon from the Saturday Evening Post that portrays a key truth that on the Internet, nobody knows youre a dog. In more practical terms, on the Internet, the fact that youre operating a small business, or solo enterprise from a home office, may well be completely invisible. What was learned from the small business trend is that by 2003, roughly 70 percent of small businesses were leveraging network technologies, according to a report from IDC. And according to the American Business Journal, small businesses that use the Internet have grown 46 percent faster than those who dont. 
Although these numbers don't carry across to larger businesses in linear fashion, Internet technologies have clearly become a vital business tool. In this chapter about business drivers and justifications, it's important to remember tools such as blogs and wikis as business tools that can be used to integrate voice and data solutions more tightly.


Chapter 3

Implementing new unified communications technology is similar to implementing anything new. Success requires a methodical approach. Integration occurs not just in technology but in business process as well. What these new tools enable is process change, but this change ripples through the entire company, forcing many companies to reinvent themselves in new ways. To succeed, you need to lay out a roadmap you can follow to guide the evolution from the past to the future. You need an integrated roadmap to bring the unified communications technologies of today and the next generation into your business process. Figure 3.1 provides a high-level view of one such integrated roadmap.

[Figure 3.1 combines three parallel tracks (a Business Strategy Roadmap, an IT Strategy Roadmap, and a Network Strategy Roadmap), each running from current strategy, vision, and assessment through design, implementation, and review, all rolling up into a Master Project Plan.]
Figure 3.1: Building an integrated roadmap.

Figure 3.1 shows a very high-level view of the cradle-to-grave process for implementing new technologies. Although this sort of process isn't the focal point of this guide, it provides a good foundation for thinking about the upcoming discussion in this and later chapters. Let's briefly touch on the following four key components to provide some framework and food for thought in developing strategies for implementing unified communications technologies in business efforts:

Business strategy roadmap

IT strategy roadmap

Network strategy roadmap

Master project plan

Each plays a key role in the success of leveraging unified communications for success.


Figure 3.2 doesn't present anything new but shows the many factors that come into play when mapping out a business strategy. This is really just fundamentally sound business planning, but it also sets the stage for how a scan of technology (watching the vendors, reading blogs, and exploring new ideas) must complement the larger corporate vision. Technologies alone don't provide sustainable business processes. They complement your strategy, mission, and vision as a corporation.

[Figure 3.2 expands the Business Strategy Roadmap: the current business strategy, business model, and business vision, informed by an industry and market assessment, a people and resources assessment, and a technology scan, flow through a business design framework into strategic and tactical deliverables and, finally, implementation, with ongoing review of performance, key processes and policies, timelines, governance, and cost controls.]

Figure 3.2: The business strategy roadmap.

The business strategy roadmap is the fundamental baseline. It ensures that you focus on the core business. It's crucial that you use the business strategy to drive your other strategies rather than allowing the fanciful attraction of new technologies to drive your business direction. Unified communications must enhance and support the business strategy, for companies both inside and outside of the tech sector. The key for many companies is to remember that new technologies may present more of a temptation than a true business driver. Instead, they should be viewed as tools to enhance what you do.

The IT strategy roadmap in Figure 3.3 is a natural progression from the business strategy. Again, a scan of technologies is a key piece, but now you must really look at how these technologies add value to your business strategy. This area is where you determine your approach to e-business, Customer Relationship Management (CRM), Enterprise Resource Planning (ERP), and other systems tools.


[Figure 3.3 expands the IT Strategy Roadmap: the current IT strategy, IT model, and IT and e-business vision (B2B, B2C, intranet, extranet, Internet), informed by a people and resources assessment and a technology scan, flow through an IT and e-business design framework (covering e-business, business intelligence, portals, CRM, knowledge management, architectures, data and software applications, and infrastructure components) into transition plans and implementation, with ongoing review, funding assessment, cost controls, timelines, and governance.]
Figure 3.3: The IT strategy roadmap.

When delving into unified communications, as in several other areas, the work overlaps beyond the IT strategy into the network strategy (see Figure 3.4).
[Figure 3.4 expands the Network Strategy Roadmap: the current network strategy, network model, and network vision (value proposition, targeted capabilities, services and SLAs, and key projects), informed by a people and resources assessment and a technology scan, flow through a network design framework (network architecture, locations, connectivity and topology, and performance) into an implementation plan, with ongoing review of operational components, key processes and policies, timelines, governance, funding assessment, and cost controls.]
Figure 3.4: The network strategy roadmap.


The network strategy really drives how you will manage internal network services. In a small or midsized business, the IT and network strategies may blend into one unified plan, but for a large enterprise, the IT infrastructure of application services might best be separated from network planning. The two complement each other but often have different characteristics. They are separated here because in most organizations, the IT strategy has already been deployed to support the business strategy. Unified communications, especially in a large enterprise, is most often deployed to leverage existing IT resources. This roadmap discussion simply helps focus on why exploring convergence and integration supports the entire chain of business processes.

The last piece of the high-level view is the master project plan. The key is that the project plan comes last in the strategic planning cycle. Whether the roadmap planning described here is followed in excruciating detail, through formal or informal processes, is really a decision for each organization to make based on existing management methods.

The Master Project Plan

Figure 3.5: The master project plan.

The purpose of this review has really been to caution those who are enamored of unified communications technologies because they appear, on the surface, attractive. In order to leverage the convergence of voice, video, and data communications, it's absolutely critical that your eye always be focused on answering the question: What business are you in? You're pursuing integration to improve your processes, create new efficiencies, and better compete in the core business.

Let's move forward and take a look at some of the different business areas and why the converged network really brings value. Later chapters will explore the operation and management of the converged network; when exploring these areas, keep in mind that you manage the network to support the needs of the business while avoiding the danger of letting technology consume more mindshare than is necessary.

Business Sales

Business sales have evolved into what is today B2B electronic commerce. Business partner transactions often take place in much higher volumes than consumer transactions. Business process automation has driven process change for increased efficiency across all sectors. In the business world, everybody is a customer and everybody is a provider.

This chapter is about the changing landscape of voice and data integration, but those changes have been explosive in nature since VoIP first came on the scene 10 years ago. The Web has caused a big-bang explosion in managerial and business process. The enterprise has exploded from a compressed mass into many little bits. There are continuing advances in distributed computing. Business units in enterprises are embracing peer-to-peer functionality rather than the historical silos of total separation. And the top-down managerial and business hierarchies of the past are vanishing, giving way to meshed, distributed departments and processes.

E-business has become more standards-based, with UN/EDIFACT being one widely adopted standard; other standards, such as Electronic Business XML (ebXML), continue to emerge. The impact of the converged network is now lapping into many areas of business as the idea of a service-oriented architecture (SOA) begins to couple with other business models. SOA really focuses on loosely coupling software services, such as CRM and ERP, with one another, allowing convergence between services on the network. Another view describes this approach as Software as a Service (SaaS). SaaS treats software applications of all types as network services. It's more a delivery model than anything else.

One unifying trend in converged communications today is that voice services are rapidly moving from the legacy technologies of the PBX to the IP network using VoIP. In the converged network, voice is just another service, an application like ERP or CRM. Video is quickly following voice.
Treating communications technologies as a service on the network enables coupling them with other business applications. What began as e-commerce has evolved into a broader e-business. Why? You can look to a number of case studies for the details, but operating cost savings and net gains to the bottom line have been huge drivers. Internet procurement of both goods and services continues to rise. Eliminating the middleman in the process reduces overhead and bridges the gaps between suppliers and customers. Technology has also enabled the creation of virtual teams and workgroups on the fly and on demand. In short, value is being redefined in the world of e-business.

Communications technologies can blur the lines between both vertical markets and lateral players within a market. E-business drives a new, flatter, dispersed business hierarchy. This drives efficiency, and improved efficiency always helps the bottom line. Integrating business sales systems with communications systems can bring a new competitive edge by simplifying and tightening processes. For example, an ERP system integrated with a VoIP system enables features such as click-to-call. There are many places within the entire business vendor management chain of events where converged communications can streamline and enhance processes.
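To make the click-to-call idea concrete, here is a minimal Python sketch of the request a CRM or ERP screen might send to an IP PBX to bridge an agent's extension to a customer's number. The PBX "originate" API, its field names, and the number-normalization rule are all hypothetical; a real PBX platform exposes its own call-control interface.

```python
# Hypothetical click-to-call integration sketch: a CRM screen builds the
# JSON body for an (assumed) IP PBX HTTP "originate" endpoint.
import json
import re

def normalize_number(raw: str) -> str:
    """Strip punctuation so '360-555-1234' becomes a dialable string."""
    return re.sub(r"[^\d+]", "", raw)

def build_click_to_call(agent_extension: str, crm_record: dict) -> str:
    """Build the request that bridges the agent's VoIP extension to the
    customer's phone number taken from the CRM record."""
    request = {
        "action": "originate",                # hypothetical PBX verb
        "caller": agent_extension,            # agent's VoIP extension
        "callee": normalize_number(crm_record["phone"]),
        "display_name": crm_record["name"],   # shown on the agent's phone
    }
    return json.dumps(request)

body = build_click_to_call("2001", {"name": "Ken", "phone": "360-555-1234"})
print(body)
```

The point of the sketch is the coupling itself: the CRM already holds the customer's number, so one click on the record can place the call without the agent dialing anything.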


There are some obstacles to network convergence. They include:

Regulatory and legislative issues

Suitability of network infrastructure to support business needs

Network reliability and security

Technology always advances faster than legislation can keep up, yet there have been regulatory advances. Security will remain a constant concern that requires ongoing attention.
Later chapters will dig into security more deeply.

As customers, we want control. We want technologies that adapt, evolve, and offer convergence. We want independence to use the technologies, both networks and devices, in ways that support our core business. And with the evolution of the Web 2.0 mindset, we require the ability to create global communities of interest on demand. From a technology perspective, we want powerful servers and workstations with unlimited bandwidth and universal access to support survivable, secure networks that are easy to install and maintain, and cheap to operate. When we step back to consider those demands, where we are today in the convergence life cycle shows extraordinary growth from where our technology tools were even 3 or 4 years ago.

Convergence technologies often focus on business sales: selling either products or services. The enablers include:

E-commerce payment mechanisms

Network architectures for Internet connectivity

Voice and Video over IP

Client/server architecture advances

Security methods

Business process modeling

These components have all undergone continued advances that are being leveraged to create a converged environment of unifying communications tools with data systems and business processes. As Figure 3.6 shows, convergence brings you into a new life cycle that drives business activity into your information systems, which help you better understand your customers. As your technology converges with your CRM resources, the company converges to an ever-evolving service or product for market, which, in turn, drives new business activities into the cycle.


[Figure 3.6 depicts a cycle: business activities generate database content that feeds data warehousing, data mining, and knowledge management; the resulting information builds an understanding of the customer business that feeds CRM; and technology, services, and company convergence turn that understanding back into new business activities.]
Figure 3.6: The business-technology convergence life cycle.

Another way of looking at this is to look at the historical evolution of business. In the industrial/manufacturing era, business was driven by supply and demand. What has changed? Supply was a need for physical resources in the manufacturing process. Mass production could only be fueled by quicker delivery of widgets. The industrial-age corporation survived by striving for efficiency in manual processes. As information systems improved, automation led to just-in-time delivery as an effort to cut costs and improve time to market. These improvements brought the idea of virtual organizations, even virtual companies, into play, tightly coupling companies with both supplier/vendors and customers.

Figure 3.7 shows the current step in the evolution of technology convergence: a thriving e-business community that is global in nature, fueled by an abundance of knowledge. Today, we know more about our customers, our partners, and our competitors than ever thanks to advances in computing and networking power. To win in the market, it is necessary to couple the convergence power of integrating voice and data to enhance the services delivered to customers. The converged business becomes a leader in the global e-business community.


[Figure 3.7 charts the evolution from the industrial-age corporation (vertical, fully integrated, supplier-driven mass production built on scarce physical resources) through the extended, tightly coupled virtual corporation to the e-business community (customer-driven, service-enhanced customization), with value creation rising as scarce resources give way to an abundance of knowledge.]

Figure 3.7: The e-business evolution.

Web-Enabled Business

The Web enabled a new kind of business. Web services are often used to implement SOAs. They enhance machine-to-machine integration, using tools such as SOAP-XML, and can couple trusted business partner systems to speed business transactions, which speeds time to market or delivery. The Web has allowed outreach to a new customer base for many companies. Existing customers may have stuck with existing tools, but the Web created many little wins for businesses that didn't wait for large breakthroughs. Recombining existing technologies with new technologies can provide enhancements that increase the competitive edge. These incremental improvements have assured that small successes offset small failures, so major fatal failure didn't occur as technologies advanced.

Many Web-enabled businesses also use some form of call center technology. Call centers provide a tool for answering or originating high volumes of telephone calls. Earlier, this chapter discussed how technology can't be the driver for the business, but VoIP call centers give an excellent example of why technology can and should drive change in the processes you use to support the business. Call centers provide a great example because, in varying degrees, they reach across every vertical segment of business.
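The SOAP-based machine-to-machine coupling mentioned above can be sketched with Python's standard library. The `SubmitOrder` operation and the partner namespace below are invented for illustration; a real trading-partner integration would be generated from the partner's published WSDL and schema.

```python
# Sketch of building a SOAP 1.1 envelope for a hypothetical purchase-order
# operation exchanged between business partner systems.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ORDER_NS = "http://example.com/orders"  # hypothetical partner namespace

def build_order_envelope(sku: str, quantity: int) -> bytes:
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    order = ET.SubElement(body, f"{{{ORDER_NS}}}SubmitOrder")
    ET.SubElement(order, f"{{{ORDER_NS}}}Sku").text = sku
    ET.SubElement(order, f"{{{ORDER_NS}}}Quantity").text = str(quantity)
    return ET.tostring(envelope)

print(build_order_envelope("JEANS-BLK-10", 2).decode())
```

The value of this style of integration is that both partners agree on the message structure in advance, so either side's system can parse and validate an order without human involvement.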


Why an Internet Call Center?

With the advent of near-ubiquitous broadband Internet, there has been a rise in the number of teleworkers. Technology has allowed people to focus on family and friends in new ways and to change how they participate in their work environment. Internet call centers provide an approach that offers a work-at-home job but with far more connectivity than we've previously seen.

Call centers have historically moved jobs from metropolitan areas to other locations that offer businesses a lower tax base. Many US companies now use offshore call centers in other countries where the labor rate is lower, but the distributed call center approach alters the cost structure and provides an effective method to hire domestic staff around the country. The leading approach for US-based home agents has historically been the Integrated Services Digital Network (ISDN), which really doesn't provide effective integration of services. ISDN costs have proven too high for most companies to invest in this architecture. VoIP is another story. With VoIP, companies leverage significantly reduced cost and far better integration of services than ISDN has ever offered. Many companies now rely on VoIP coupled with consumer broadband to build an Internet call center.

The focus for any company doing business on the Internet has become customer service. There is nothing customers appreciate more than talking to a live person in real time. The call center, in any form, provides that seamless integration capability. Customers might be anywhere in the world. They might be based in a large corporate campus or at home. They may contact the provider via a Web site or simply by telephone. According to the Gartner Group, more than 70 percent of transactions take place over the telephone. Web sites that offer live voice as a support option for customers report as much as 50 percent increases in sales.
With VoIP softphone technology, the provider of products or services can receive a query for customer support and, through distributed call center technology, seamlessly redirect the call and all data information to a service representative working remotely, often from home. Today's convergence of voice and data allows not just handing off the telephone call. We can pass the customer account information, populating the home agent's screen with all the necessary information to provide quick and accurate customer support.

Staffing of call centers has always been a challenge. Hiring remote staff, perhaps even part-time remote staff, lets the provider staff the distributed call center with qualified employees. Special services, such as service representatives with particular linguistic skills, become more obtainable. This solution also makes available a pool of workforce candidates that may have been previously overlooked. Stay-at-home mothers, retirees, and people without transportation now become potential job candidates, participating in the workforce in new ways that were inaccessible previously.

Distributed call centers do not require VoIP to provide the convergence, but VoIP brings the tightest coupling of services at the lowest cost. In the past, distributed call centers were implemented using PBX solutions and off-premise stations or ISDN lines. Current converged technology solutions make the distributed call center far more cost effective to implement today. IP technologies that integrate VoIP with data systems simply use the network as a PBX extender, creating a virtual call center environment that can physically be anywhere or everywhere. Most importantly, this distributed call center is transparent to customers, who see a single, unified point of presence for the company.

Distributed call centers' use of technology for job performance requires that managers take a more hands-off approach to supervising workers than traditional workflow methods.
The call center supervisor does require a comprehensive set of tools to monitor both service delivery and employee productivity. Supervisors can quickly become comfortable relying on integrated systems, both telephone and computer networks, to measure and monitor productivity and worker activity. The idea of a worker being in the corporate office, where work can be directly observed, is transformed into a measurement of productivity and results rather than oversight of activity.

Although there has been a trend to move call centers offshore, today many companies have become very security conscious and are more reluctant to engage in offshore arrangements. The Internet call center based on VoIP technology can provide substantial savings over traditional costs without sending jobs outside the US. For those companies using offshore resources, IP telephony works as seamlessly around the world as it does around a city.


Product Sales

VoIP convergence brings creative new solutions to the retail sales market that are limited only by imagination. Converged VoIP phones can easily interact with data network services, providing SKU information about products via either a scanner or an RFID tag reader. It's also important to note that VoIP phones today provide kiosk-like screen services and may not always provide voice as a service. A VGA screen on a VoIP phone provides a powerful data sharing mechanism.

One retailer in Japan has leveraged this to maximize customer service. They've learned that if customers have to leave the changing room to find another size for an item, they are most likely to abandon the sale and leave. This retailer equipped each changing room with a VoIP telephone that can scan the SKU of a clothing item. If the customer finds the item doesn't fit, they can simply scan the tag. The system can then provide feedback to the customer on a touch-screen: for example, "you have blue jeans in a size 12." At that point, the customer can be presented with a variety of menu options, allowing for another size, and even another color. Simple touch-screen interaction allows them to make their selections. A salesperson on the floor is alerted via a WiFi VoIP telephone that the customer in a particular changing room requires black jeans in a size 10, for example. The salesperson can now take the customer's selection directly to the customer and hand it over the privacy screen. VoIP technology provides information sharing that keeps the customer in the store and helps close the sale.

Businesses selling products know well that making the act of buying as easy as possible for customers is a huge success factor. VoIP alone is just a way of making voice calls. Convergence of services and applications facilitates the delivery of completely new services and applications.

Service Sales

On the Web, customer service and support have always posed an interesting challenge.
Companies struggle with providing help and online support through screens that say "contact us" and through frequently asked questions (FAQs). Some companies now provide interactive text chat with service representatives. To date, most of what we've seen has been rudimentary, but there is huge potential for innovation. The Web is a powerful business tool that can indeed be voice enabled to better serve the customer. Figure 3.8 shows a Web page concept that brings voice directly into the Web page to allow customer interaction directly with the support center service agent. Today, VoIP often requires a softphone client to work; an application similar to the one shown in Figure 3.8 requires that the customer install some type of browser plug-in. But that is changing quickly. Development tools such as AJAX, Ruby on Rails, and Flash are beginning to deliver new VoIP connections that embed the VoIP softphone within a Web page or application.



Enabling the Web with VoIP

Figure 3.8: The VoIP-enabled Web.

Convergence isn't just about supporting the core services you provide; it's about providing your own business with services to support delivery of your customer-facing services. As Figure 3.9 shows, there is an inverse pyramid that makes up the services supply chain. At the top is the widest audience: the end clients. These may be retail consumers or business customers, but they make up the largest piece of the service market. Those clients interact with providers of services. But service doesn't end there. Manufacturers provide service to the providers, and component suppliers provide services to the manufacturers. And although this looks like a supply chain management flow, what you have to remember is that in the information age, although the product has changed, the process flows for business remain much the same. Information is made up of component parts just as manufactured products are. Component suppliers in the converged network are data systems and those of business partners. The Web page provides data input about customer trends that feeds business intelligence. You need to capture, process, and analyze information at every level to succeed.


[Figure 3.9 shows an inverted pyramid: clients form the widest tier at the top, followed by services providers, then manufacturers, with component suppliers at the base.]
Figure 3.9: The services food chain.

In supporting the services food chain, you can see the potential for a supply chain of information. As information passes through this supply chain stack, it evolves to become more valuable and increase your competitive edge. Raw data is the basic input: Ken, 360-555-1234, 6 feet 3 inches, Washington. As data is processed through the stack, it becomes information: Ken is 6 feet 3 inches tall, lives in Washington, and has a telephone. Further processing in the service stack matures information into knowledge: it's easier to reach Ken by phone than by email (we don't know his email address). At the highest end of services, we'll know our customer well enough that we'll turn that system knowledge into human wisdom. Our systems gather and synthesize information so that we can make human judgments to guide our service interactions with clients.

Financial Services

Financial services present an opportunity to use every approach and technique described thus far, with one simple variation. The financial services sector provides an incredibly high volume of data inputs. Transaction data can present an enormous wealth of information for analysis. In the converged network, there is a tendency to be very data-centric when working in financial services, yet the call center provides a key human interaction with customers. Achieving balance between the focus on the data and the focus on service remains a prime success factor.
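The data-to-information-to-knowledge progression described above can be sketched in a few lines of Python, using the Ken record from the text. The derivation rules here are illustrative only; a real system would apply far richer analytics at each stage.

```python
# Sketch of the information supply chain: raw data is refined into
# information, then into a knowledge rule that can guide customer contact.
raw = {"name": "Ken", "phone": "360-555-1234",
       "height": "6 feet 3 inches", "state": "Washington", "email": None}

# Data -> information: attach meaning to the raw fields.
information = f"{raw['name']} is {raw['height']} tall and lives in {raw['state']}."

# Information -> knowledge: derive a rule for contacting this customer.
if raw["phone"] and not raw["email"]:
    knowledge = f"It is easier to reach {raw['name']} by phone than by email."
else:
    knowledge = f"{raw['name']} can be reached by email."

print(information)
print(knowledge)
```

The human wisdom layer sits above this: a service representative decides what to do with the derived rule during an actual client interaction.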



Health Care

In health care, patients become a focus of activity. Data systems provide information about them in painstaking detail. This information ranges from address and Social Security number to prescription drugs taken, medical history, and financial information. In the health care sector, there is the added responsibility to protect this information and comply with the Health Insurance Portability and Accountability Act (HIPAA), which addresses the security and privacy of health data.

Health care offers many uncharted opportunities to use convergence technologies in new ways. Patient record information can be retrieved automatically with the scan of a health plan membership card. Hospitals rely more and more on wireless devices carried by staff, connected to VoIP phone systems that are tightly coupled with data networking resources. The health care professional needs a wealth of information readily accessible to ensure the best decisions regarding the health and welfare of patients. The converged network presents the best integration of information, technology, and efficiency in delivering health care. This sector has proven to be far more than a user of converged solutions. Health care providers lead the way in developing new solutions for integrating voice, video, and data services and delivering them to the health care worker immediately.

Manufacturing

The traditional supply chain flow of manufacturing is also impacted by the convergence of services and applications. Much like the evolution to an e-business community, the linear supply chain is slowly vanishing. It's becoming a meshed supply chain model in which all players in the supply chain interact as peers. This provides immediacy in interaction and redefines many traditional enterprise functions, creating new opportunities for innovation. Two models emerge from this new meshed supply chain model. One model focuses on high volume and low price, creating economy of scale through volume sales.
The other is a differentiated model that charges a higher price but delivers correspondingly higher value. To maintain and keep customer relationships in the converged network supply chain, a technology-based information and knowledge derivation infrastructure is critical. The supply chain player has to gather and synthesize as much information as possible about both the upstream inputs and the downstream outputs. The supplier who has tightly coupled data with converged services will have ERP and CRM systems integrated to provide customer service. An upstream supplier's inability to deliver a particular widget can effectively trigger alerts in the supplier's CRM system to every key customer who might be impacted downstream. Imagine the customer service impact of being alerted that your deliverables will be a day late because of a component, and being able to notify your key customers immediately.

Convergence gives supply chain players the tools to move quickly, be flexible, and adapt to change. Being nimble and adaptable is a competitive edge. It lets you use knowledge about your environment as a differentiator. In short, the supply chain adds value, delivering not just widgets or technologies, but solutions.


Chapter 3

Financial Cost Reduction Drivers for Change


The beginning of this chapter mentioned cost reduction as an early driver for convergence and migration to VoIP technologies. Reducing costs in the corporate enterprise can come in several different forms, and telecommunications costs have many different factors to consider. Infrastructure for voice and data may be entirely separate networks that can now fold into one. Although cost reduction is seldom the primary driver for convergence with VoIP, it's a business factor that cannot be overlooked. Cost reduction, coupled with delivering new converged services, can drive revenue directly to the bottom line.

Local Telephone Expenses

One traditional area of cost reduction focus has been local telephone charges. Large companies with PBXs distributed geographically across the enterprise have widely deployed an approach called least cost routing. In this architecture, local access lines in remote offices, generally T1 circuits, represented a huge cost. Companies offset the expense by routing calls across the internal network of T1 circuits interconnecting all the PBXs, then dropping the calls off on the Public Switched Telephone Network (PSTN) as close to the destination as possible. Using this approach, companies circumvented telephone company local access charges, effectively using their own telephone network to carry what might otherwise be toll traffic. Convergence of voice and data further reduces costs by leveraging a unified IP infrastructure to carry all traffic, eliminating costly voice T1 circuits.

Long-Distance Expenses

Local and toll traffic wasn't the only expense that could be reduced through local access circuits. Long-distance calling could leverage the same method. Through creative deployment of trunk circuits between PBXs in remote locations, companies achieved control over call routing. This let many large enterprises control how calls were routed, using the cheapest rate available.
Many large enterprises built telephone networks rivaling the complexity of small telephone companies in order to control costs. Businesses meshed their PBX systems together using a combination of point-to-point circuits and, often proprietary, PBX networking protocols. Automatic route selection intelligence, based on dialed digits, was then built on top of the connections. Although this approach could effectively reduce telephony expense in a large enterprise, it created a very complex environment requiring an increase in staff to manage the voice network. In short, enterprises became their own long-distance companies. This complexity was never a viable solution for many companies; the overhead often eroded the savings over time. Convergence again provides an architecture that puts all traffic on the switched and routed IP network, eliminating the need for dedicated voice circuits and the specialist telecommunications staff.
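The automatic route selection logic described above reduces to a prefix table consulted for every dialed number. The following is a hypothetical sketch only; the prefixes, trunk names, and per-minute rates are invented for illustration and do not represent any vendor's implementation.

```python
# Hypothetical least-cost routing sketch: pick the cheapest way to
# deliver a call based on the dialed number's longest matching prefix.
ROUTES = {
    # dialed-number prefix: [(trunk name, cost per minute), ...]
    "1212": [("private_tie_to_nyc", 0.00), ("pstn_local", 0.07), ("ixc_long_distance", 0.12)],
    "1312": [("private_tie_to_chicago", 0.00), ("ixc_long_distance", 0.10)],
    "1":    [("ixc_long_distance", 0.10)],   # default domestic route
}

def select_route(dialed_digits: str):
    """Return the cheapest trunk for the dialed digits, matching the
    longest configured prefix first (as PBX route tables typically do)."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if dialed_digits.startswith(prefix):
            # Among the candidate trunks, take the lowest per-minute cost.
            return min(ROUTES[prefix], key=lambda route: route[1])
    raise ValueError("no route for " + dialed_digits)

trunk, cost = select_route("12125551234")
print(trunk, cost)  # the free internal tie line to the New York PBX wins
```

The point of the sketch is the decision itself: the company's own tie lines carry a marginal cost of zero, so internal trunks win whenever a destination is reachable through them, which is exactly how enterprises "became their own long-distance companies."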


Inside the Enterprise

Given the approach taken to local and long-distance costs, internal calls within the enterprise were no longer a problem. Companies used the same internal trunks between locations to carry inside traffic. Typically, this was achieved via access codes as part of the dialing plan. To reach a user in one city, you might dial a 5 followed by the four-digit extension, whereas to reach another office in another city, you might dial a 6. As technologies advanced, many of these interoffice trunks also evolved. PBX manufacturers began shifting away from proprietary protocols, using IP interface cards to move these trunk connections onto the IP network. In many cases, enterprises simply used the bandwidth available in the data network to also carry voice traffic between PBX systems. This early step toward convergence often became a competitive issue between vendors who touted free VoIP on the data network. This hyped approach caught some customer attention as companies derived cost savings from using a single carrier to provide both data and voice over a unified circuit infrastructure. This beginning of convergence is really only infrastructure convergence: a small piece in the total unification of applications and services.

Outside the Enterprise

The telephone is the lifeline for most businesses. Even with the recognized importance of the data network and the value brought by tools such as email, the telephone network has a history of reliability and availability that the data network, in many organizations, still doesn't quite match. Telephone service is mission-critical. When companies evaluate the costs of handling inbound and outbound calling, prudent fiscal management requires business managers to dig deep into the pool of resources to develop new cost-saving approaches. VoIP, tightly coupled with the IP network, offers a compelling cost-savings potential that integrates with other advances in the applications used today.
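The access-code scheme described above (dial 5 plus an extension for one site, 6 for another) amounts to a one-digit prefix table that selects a tie trunk. A minimal, hypothetical sketch; the site names are invented:

```python
# Hypothetical interoffice dial plan: the leading access code selects
# the tie trunk to a remote PBX; the remaining digits are the extension.
SITE_CODES = {
    "5": "denver_pbx",
    "6": "seattle_pbx",
}

def resolve_internal_call(dialed: str):
    """Split an interoffice dialed string into (site, extension)."""
    code, extension = dialed[0], dialed[1:]
    if code not in SITE_CODES or len(extension) != 4:
        raise ValueError(f"not a valid interoffice number: {dialed}")
    return SITE_CODES[code], extension

print(resolve_internal_call("51234"))  # ('denver_pbx', '1234')
```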
Support Costs

Looking at the techniques described earlier, one fact is obvious: enterprise businesses accepted the cost of a dedicated telephone support staff to sustain these creative cost-control methods. Although the savings were often justified, maintaining a full data network staff along with a telephone network staff duplicates many costs. It's inefficient. Many companies saw convergence as an opportunity to reduce technical staff by eliminating one of the support groups, but reality dictates otherwise. VoIP allows convergence of the architecture. It can allow for blending of staff, but don't assume the VoIP service can be designed and implemented solely by the IP network group in a large organization, or that the need for telephony expertise will completely vanish.


It's deceptively easy to overlook the complexity of telephony engineering. The enterprise voice support staff understands traffic engineering for the voice service in ways the data networking team doesn't. Peak telephone calling periods, such as the busy hour of the day or the busy day of the week, are one example of an understanding the data networking group won't have. Beyond that, site-to-site interoffice connections for local calling, long distance, and interoffice communications may carry very different traffic patterns for voice than for data flows. There aren't many data networking engineers who fully understand the inner workings of voice traffic in an enterprise call center. Although VoIP technology lets you move the traffic over the IP network, the underlying service is still voice, and it remains mission-critical. Rather than eliminate a workgroup entirely, converge staff as you converge services and applications. There may indeed be reduction, but cutting staff prematurely can lead to disastrous consequences. Don't be lured into cutting vital institutional knowledge about your services, network, and business needs in a misguided effort to eliminate cost too aggressively. Cost savings come over time. They don't come immediately, and they aren't necessarily a large factor in convergence for every enterprise.

Help Desk Support

Help desk support in the telephone network seems simple. Most people don't need training on how to use a telephone. When businesses deployed a new PBX, training was often conducted the day before implementation and consisted mostly of dialing shortcuts, how the new voicemail system worked, and an introduction to fairly basic features. When integrating VoIP into the existing IP data network, features can work in completely different ways. VoIP phones often don't have the same buttons as traditional telephones beyond the basic dialpad.
Most VoIP solutions introduce some workstation, or computer, interaction with the system. Features such as speed dial lists, call pickup groups, and voicemail controls are often reached through a browser interface, coupling the PC with the telephone in a way people aren't accustomed to. As services converge onto this new unified communications infrastructure, the company help desk will field calls about unfamiliar features, functions, and capabilities. When planning a VoIP deployment, it's important that both users and the entire support staff are involved throughout the project. It's the only way they'll be able to fully support your employees and minimize disruption to normal business operations.
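The busy-hour traffic engineering mentioned under Support Costs has concrete math behind it. The classic Erlang B formula, for example, estimates the probability that a call is blocked for a given number of trunks carrying a given busy-hour load. A brief sketch, with purely illustrative traffic numbers:

```python
# Sketch of classic voice traffic engineering: the Erlang B formula
# gives the blocking probability for `trunks` circuits offered
# `traffic_erlangs` of busy-hour load (iterative recurrence form).
def erlang_b(traffic_erlangs: float, trunks: int) -> float:
    b = 1.0
    for n in range(1, trunks + 1):
        b = (traffic_erlangs * b) / (n + traffic_erlangs * b)
    return b

# Illustrative load: 200 calls in the busy hour averaging 3 minutes
# each is 10 erlangs of traffic.
load = 200 * 3 / 60
for trunks in (10, 14, 18):
    print(trunks, round(erlang_b(load, trunks), 3))
```

Adding trunks drives the blocking probability down sharply, which is exactly the kind of sizing judgment the voice staff makes routinely and the data networking group typically has never needed to.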


Adds, Moves, and Changes

In the traditional world of telephony, the most expensive part of ongoing operations has always been adds, moves, and changes in the environment. In many companies, moving a computer was as simple as unplugging it in one cubicle and plugging it in at another; many companies use Dynamic Host Configuration Protocol (DHCP) to configure IP network connectivity automatically. Telephone moves have been more problematic. Because a telephone number has historically been assigned to the physical jack the phone plugged into, everything required reconfiguration when an employee moved to a new office. Changing work groups might mean complete reconfiguration of telephony services such as hunt groups, pickup groups, and call coverage. VoIP convergence may offer tremendous savings here because the IP network is now linked to telephony services. The telephone number and features can now easily follow the workstation. Moving an employee to another department might be as simple as dragging and dropping the employee's name in a GUI to update telephone requirements. Convergence can mean that when a new employee joins the company, the human resources system provides all the input for the unified communications network so that telephone service is tied directly to network login. When fully implemented, a new employee might be assigned a work cubicle and log in on their first morning to find all the voice and data services associated with their position preconfigured and available for use.

Remote and Mobile Workers

The advances in broadband access to the Internet have fueled an exponential rise in remote and mobile workers over the past 10 years. VoIP convergence adds another dimension to the remote workforce. As described earlier, call centers can be staffed by people who are anywhere in the world.
Previously inaccessible staffing resources, from stay-at-home mothers to contracted multilingual translators to those without transportation, have become viable. Combining the widespread availability of broadband Internet with the proliferation of Virtual Private Network (VPN) technologies and VoIP services, employees and contractors can now be anywhere in the world and appear to customers and business partners as if they're at the corporate office. Location has been removed as a barrier to getting work done in the information age. Teleworkers aren't the only employees who benefit. Road warriors, such as sales people, on-site support engineers, and traveling executives, also benefit from convergence. The company business extension can easily be delivered anywhere with VoIP, meaning these people might work transparently from a hotel room, a client site, or any remote location. All that is required is Internet connectivity. When a customer calls their sales representative, they don't need to know that he or she is halfway around the world attending a conference. The office telephone number becomes as transportable as an IP address, working anywhere. New advances in unified communications, particularly in fixed mobile convergence, will take this transparency even further as the multi-mode handset becomes a device that can connect transparently anywhere using ubiquitous wireless services.
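The "number follows the user" model described in this section can be sketched as a registration lookup: the extension is bound to an identity rather than a jack, and inbound calls are delivered to wherever that identity last registered (loosely mirroring how SIP registration works). A minimal, hypothetical illustration; the names, extensions, and addresses are invented:

```python
# Hypothetical sketch: a VoIP extension is bound to a user identity,
# not a physical jack, so service follows the login.
DIRECTORY = {"efields": "4412", "jwu": "4413"}   # user -> extension
REGISTRATIONS = {}                               # extension -> current device address

def login(user: str, device_ip: str) -> str:
    """Register the user's extension at whatever device they log in from."""
    extension = DIRECTORY[user]
    REGISTRATIONS[extension] = device_ip
    return extension

def route_inbound(extension: str) -> str:
    """Deliver a call to wherever the extension last registered."""
    return REGISTRATIONS.get(extension, "voicemail")

login("efields", "10.8.0.5")        # home office today
print(route_inbound("4412"))        # 10.8.0.5
login("efields", "192.0.2.77")      # hotel tomorrow; same number
print(route_inbound("4412"))        # 192.0.2.77
```

The same lookup is why an unregistered extension can fall back to voicemail, and why an office move or a hotel stay needs no wiring change at all.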


Strategic Drivers for Change


The strategic reasons for business change are many and varied. Some enterprises have unique applications that apply only to a particular vertical market segment. Inventory and supply chain management tools vary from market to market in manufacturing, but may not play at all in a financial services or health care environment. Shifts in the market drive changes in business strategies. Earlier, this chapter delved into strategic roadmaps for business, IT, and network strategies. Continual process re-engineering goes on in all businesses. New strategic applications appear. CRM systems evolve and grow to offer new functions and features. Many companies are exploring how to adopt CRM and ERP systems in a blended fashion, while smaller companies may look to application service providers to integrate new strategic applications on a pay-as-you-go basis. Whether you're deploying a major SAP platform, developing an in-house Web application, or integrating something like salesforce.com into business flows, any strategic change in data applications will drive other impacts into your voice services. For many companies, the act of integrating voice and data into a converged service network will be a key enabler for business process change.

New Applications

New applications emerge daily, and many present features previously unavailable. With today's programming tools, such as the .NET Framework, Java, AJAX, and Ruby on Rails, the speed to market for new applications is increasing. Things we couldn't dream of doing in applications last year are the realities of this year. Application developers in today's convergence market are often seduced by the lure of Instant Messaging (IM), syndication (RSS), the idea of online presence (an indicator of a user's online status and availability), and how these all complement VoIP services. Certainly there is a danger that this Web 2.0 mentality of slapping together components will create problems.
As developers embrace these technologies, they quickly migrate to the more mature view of Software as a Service (SaaS). The SaaS model isn't new, but the speed with which converged applications move through the adoption cycle, from early adopters to broad consumer acceptance, is increasing. We're seeing shorter time to market and shorter time to adoption in many unified communications technologies. We can expect this trend to continue, and for many businesses, these new applications and service combinations will provide a strategic impetus for change.


Obsolescence of Legacy Systems

Older telephone systems were based on Time Division Multiplexing (TDM), just like the public telephone network. Many of these systems had a planned lifetime of about 10 years and were amortized as such within corporate financial systems. Although advances in technology have left many of these systems appearing obsolete, they were still within a planned life cycle. Data networks have been managed differently. Typically, the network elements for business were smaller than a PBX, and data technologies in switching and routing have advanced rapidly. Data networks are often redesigned every 3 to 5 years, and data networking equipment is commonly amortized over 3 years in corporate financial systems. What this means is that many traditional telephone systems are now reaching the end of their viable life at a point in time when convergence and the benefits of moving to VoIP make good sense from both a technical and financial viewpoint.

Stop Investing in Legacy Technologies

Although obsolescence of a system, or the fact that it has been amortized off the books in terms of financial value, may drive some companies, others, especially leaders within their market, simply tend to look to the future. For these companies, the need to migrate to newer technologies may be driven by other factors. Many companies stop investing in older technologies well before end of life just to ensure they're future-proofing the technological assets of the enterprise. Seeing the end of life for a technology on the horizon may be enough incentive for many organizations to pursue a change.

Manufacturer Discontinuation of Product and/or Support

Equipment vendors and manufacturers announce discontinued support for old products in the voice and data marketplace daily. Often, products have become costly to manufacture, difficult to support, or parts have simply become scarce, complicating hardware repair.
In many cases, a vendor will make a conscious decision to declare end of life for a particular product in order to motivate customers to purchase a new product the vendor is promoting. In the traditional telephony PBX environment, this resulted in what was often referred to as a forklift upgrade. There is a danger for vendors in taking this approach: motivating a customer to replace a system with something new might also motivate them to look elsewhere. Many an incumbent vendor has lost a customer this way. Whether telephony customers upgrade with their incumbent voice vendor or migrate to a new one, some migration to VoIP will be driven by manufacturers discontinuing existing products.


Rising Support and Maintenance Costs

Aging systems cost more to support and maintain than new ones. Scarcity of parts may drive what were once reasonable prices upward. The technical skills needed to support older systems become less available as fewer and fewer of these systems remain in operation. Sticking with a technology, such as a phone system, that is 10 years old when the market has shifted to a newer technology can become an expensive proposition. Several studies have shown that telephony systems reaching 10 years of age may cost as much as 30 percent more to support and maintain today than they did when they were new. New systems, based on new component technologies, bring another operational support factor into play. Every new technology also needs to be updated through its life cycle. When you deploy a voice or data networking solution, a series of patches and upgrades follows for the life of the equipment. New components allow for more efficient patching and upgrades, driving support costs down on new technologies, while costs continue to rise on older systems.

Inability of Legacy Solutions to Support Business Needs

Another strategic driver for change is simply the practical limitation of hardware and software. The hardware providing voice services today may not be able to support new software because of limitations in CPU processing power. Port capacity may be constrained by an internal architecture that can't be expanded further. In short, the system may do what it was designed for several years ago perfectly well, yet still fall short of current needs. When telephony systems hit the market, they do so with a combination of features, functions, and price point to serve a defined set of needs. For businesses that evolve quickly, needs often outrun the ability of a system to evolve. Older systems may simply be unable to support features and functions your business requires. That brings up an important point about system evaluation overall.
When evaluating a new solution, remember that you're making a strategic investment to support the corporate roadmap. Supporting only the operational needs of today serves only the present. For some companies, system extensibility may be a key factor over 3 or 4 years, while others may strategically plan further ahead. Look at the enterprise's tactical plans for today and strategic plans for tomorrow, and evaluate solutions against a full set of requirements for both the present and the future.

Summary
Network convergence, unified communications, integrated services: whatever you call the impact of VoIP, video, mobility, and the combination of voice and data communications, it's clear we're faced with changes in how we manage the environment. Information flow is both vertical and horizontal between systems, and it flows not just inside the enterprise but outside to customers and strategic partners. Linear thinking and development taught us to build and sell, but today, you need to leverage information so that you can probe the market and respond. Knowledge, or intellectual capital, is for many the largest asset; thus, we build databases and pull information from them. However, collaborative technologies that integrate voice and data help tie information assets to people.


Convergence will make the network indistinguishable from the enterprise. It will create, over time, a virtual community of the enterprise. The technology, converged and blended, becomes completely transparent, yet absolutely necessary. Convergence isn't simply about voice and data blending into a single network architecture. It's the full integration of voice, video, and data into the core of the network, the access options, the myriad network-connected devices, and the applications running over everything. It's a four-layered convergence of the network, services, applications, and end points. The next chapter explores how to use this combination of sustaining, disruptive, and emerging technologies to improve productivity and differentiate your business in delivering products and services to customers.


Chapter 4: Productivity Advantages of Unified Communications


In years past, cost reduction was a primary driver for integrating services. Convergence was seen first and foremost as a cost reduction enabler. As companies worked to consolidate to a single cable plant, voice and data converged onto one Cat5 or Cat6 wiring infrastructure. Further consolidation to a single wide area network (WAN) circuit infrastructure based on IP has slowly followed. In practice, cost reduction has proven to be a secondary benefit. The real benefit of converged communications lies in productivity gains and in enabling new business operations. Although process re-engineering itself can be a major effort, convergence further strengthens the competitive edge of increased efficiency and productivity. This chapter extends the reach of unified communications beyond cost into specific business areas and interests. It provides real-world examples of how business operations and industry segments can realize tangible productivity gains, and it not only touches on the current convergence of data, voice, and video but also shows how they set the stage for other advances coming to the converged network. Worker productivity will rise and fall with the integration of data, voice, and video communications. This natural ebb and flow reflects gradual process changes coupled with workers overcoming the learning curve as they adapt work habits to best utilize new tools and resources. It's important to remember that basic telephone usage is something very natural to working adults; it's learned at a fairly young age. In the workplace, changing how people work, and how they use the telephone, will have unexpected impacts. For example, a sales team that has been working from individual PC-based contact calling programs such as Act! or Goldmine will encounter a learning and adaptation curve when the enterprise shifts to a companywide Customer Relationship Management (CRM) system with Sales Force Automation (SFA) features.
The work paradigm changes dramatically. This paradigm shift provides the catalyst for a change in corporate culture within the organization.


Convergence is a buzzword that has garnered a lot of play in the past 8 years or so, but convergence means many different things, all of which apply to the enterprise:

Network convergence is really the first phase of a long evolutionary process. Converging voice and data onto a single infrastructure provides opportunities to reduce operating expenses (OPEX). It reduces billing complexity with service providers. It provides for early workforce consolidation. It sets the stage.

Service and application convergence is the hot topic of the market today. The idea of a Service Oriented Architecture (SOA), or deploying Software as a Service (SaaS) on the network, sets the stage for radical change in how work gets done. This convergence of data, voice, and video as services, coupled with the convergence of applications as services on the enterprise network, completely changes the basic steps and procedures of performing even some of the most basic work tasks throughout the day.

Many organizations employ convergence between the telephone set and the desktop PC. Desktop real estate is at a premium for workers in the information economy, and the ability to use a single device on the desktop for all communications activity provides new integration enabling new workflow efficiencies.

Further evolution in fixed mobile convergence will more tightly couple the telephone, the mobile phone or PDA, and the desktop workstation, providing a convergence that offers device independence with the freedom of mobility and choice of the best available device for communications at a given point in time.

How Service Convergence Drives Productivity and Enables New Business Operational Processes
Service convergence as a productivity enabler has become a root motivator for many companies pursuing unified networks. It's important that organizations not pursue new technologies in unified communications solely for the sake of their novelty. The key for any enterprise is how convergence of the network supports the established business strategies. As unified communications technologies develop, companies around the world are discovering innovative business benefits provided by unifying data, voice, and video onto a single service infrastructure. High-level business processes can be heavily affected by convergence, underscoring the reality that convergence is far more than cost consolidation of multiple, separate networks.


For many enterprises, the question of one network versus several networks remains unaddressed today. Voice and data networks were developed in isolation, each tuned for the performance requirements of the specific traffic type supported. As separate services, integration between the two wasn't anticipated or planned. VoIP introduced the idea of change, but the complexity of integration and rapidly evolving network capabilities made convergence a costly proposition for many companies. During the earlier years of convergence, organizations focused on the cost without fully understanding the value of business process efficiency and change that could evolve from integrated services. Today, it's both feasible and practical to integrate business service solutions onto a single network infrastructure. Convergence also enables implementation of business solutions that were simply too costly to deploy on non-converged, dedicated voice and data networks. One example of a service solution that is far more costly to implement on separate dedicated networks is Computer Telephony Integration (CTI), as described in Chapter 2. Call centers were once large, densely populated work centers deploying expensive integration solutions, and it was difficult to justify the extreme costs in smaller business operations. Shaving a few seconds off a telephone call provided justification in the high-end call center processing thousands of calls per day, but that integration hasn't always transferred down to smaller business processes. Why not? Workers in today's information economy agree that there are productivity gains to be achieved by integrating the telephone and PC workstation. The main hurdle in deploying call center technologies more broadly across the enterprise has been the negative cost ratio: the cost associated with separate networks has, for a number of years, outweighed the business benefits of implementing call center applications.
In the fully converged network, many of the complexities of integration are eliminated. This reduces the implementation cost of call center solutions and creates a positive cost ratio. Figure 4.1 illustrates the importance of evaluating the ratio of business benefit to network cost.

The figure expresses the evaluation as a simple ratio:

    Cost Ratio = Business Benefit / (Data Network + Voice Network + Video Network)

When the ratio is greater than 1, the converged network cost is less than the business benefit; when it is less than 1, the summed cost of the separate networks exceeds the business benefit. Lowering the cost of the network improves the business case and shows a higher return on investment for the same business benefit.

Figure 4.1: Business benefits and cost ratio evaluation.
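The benefit-to-cost comparison in Figure 4.1 is straightforward arithmetic. A minimal sketch, with all dollar figures invented purely for illustration (none come from the text):

```python
# Illustrative benefit-to-cost ratio in the spirit of Figure 4.1.
# Every dollar figure below is invented for the example.
def benefit_cost_ratio(business_benefit: float, network_costs: list[float]) -> float:
    return business_benefit / sum(network_costs)

benefit = 500_000  # hypothetical annual benefit of a call center application

# Three separate networks (data + voice + video) vs. one converged network.
separate = benefit_cost_ratio(benefit, [250_000, 220_000, 130_000])
converged = benefit_cost_ratio(benefit, [400_000])

print(round(separate, 2))   # 0.83 -- ratio < 1, hard to justify
print(round(converged, 2))  # 1.25 -- ratio > 1, positive business case
```

The same benefit flips from unjustifiable to justifiable purely because the denominator, the network cost, shrinks, which is the point the figure makes.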


What we find are some vital areas of business benefit. Convergence can deliver both financial and operational benefits derived from the integration of applications and services, while enabling improved productivity. For many organizations, these benefits appear in business facets that weren't fully considered in the past. Although this chapter is primarily focused on productivity gains, let's briefly touch on these other facets as a reminder that convergence delivers a broad range of gains to the enterprise. They all need to be considered in the deployment and management of the converged service network.

Management and Utilization of Property

Service network convergence can enable property modifications that provide better working facilities for employees while reducing the enterprise's need for real estate ownership. Given the distribution of work within an enterprise, the converged network can provide collaboration tools and a work environment that eliminates the requirement for all employees to work in a large, centrally located facility. Different work functions can be distributed to locations that take advantage of tax benefits and other financial drivers. For many companies, placing employees geographically closer to the customer provides added value.

Cost Reduction and Management

Network consolidation leads to more efficient use of technical staff, leading to further cost reduction. Consolidating the housing of technology resources, such as network server farms, and reducing the number of vendors or leasing requirements bring further reduction in costs. Additionally, a single, consolidated network provides greater clarity and visibility of future costs as the enterprise plans technology enhancements to support a unified business strategy.

Enterprise Agility

Enterprise agility is difficult to measure and quantify.
One competitive factor widely recognized by many companies is that Internet technologies have allowed small, entrepreneurial, nimble companies to compete effectively against large enterprises. Making the large enterprise a nimble, agile business through convergence enables quicker response to new opportunities in the market. Leveraging the property and asset management benefits, large enterprises can quickly move resources where they are needed or where favorable tax and labor rates prevail. Adaptation to change is simpler and less costly with a fully integrated network.


Staff Productivity

The converged network, integrating data, voice, and video services, allows employees a wider range of information media and network accessibility options. Tools such as unified messaging, voice and video conferencing, and Web-based productivity tools create an environment in which employees can more effectively accomplish business goals. New applications can be deployed more quickly in a unified network. Interoperability concerns are reduced or eliminated. Data, voice, and video applications no longer have to be developed and tested on separate networks, then cobbled together to create new service applications. To enhance productivity, the enterprise needs to improve employee flexibility and mobility while ensuring secure access to corporate information resources. Some examples include:

- Campus roaming
- Itinerant workers who routinely work out of different corporate offices
- Mobile telephone users: cellular users today, and VoIP users through fixed mobile convergence tomorrow
- Relocations and moves of departments and within departments

The convergence of data and voice allows employees to be as fully productive outside the office as they are at their desks. VoIP, unified messaging solutions, and call center technologies can all be used today to support the increasingly mobile enterprise workforce. Teleworkers, sales people, and consultants who spend much of their productive work time away from the office still require access to corporate services and information resources. One added value of VoIP services in the converged network is accessibility for workers in transitory locations such as hotels and airports. As we look at the productivity advantages of unified communications, there are a number of other transitional events that help the enterprise determine whether to adopt convergence as part of the overall business strategy. For many organizations, these events act not as catalysts on their own but as accelerators to convergence adoption. In many cases, they can bring quicker benefits, both financial and in productivity, to the enterprise.

The Non-PC Workstation Environment

For many enterprises, particularly those whose core business is not based solely on information technology, there is a need to provide real-time data in a variety of different environments. Manufacturing environments may be too dirty to deploy PCs. Refineries, chemical facilities, and the shipping industry often present hazardous environments. Remote locations can present special security requirements. In addition, PCs may simply be too expensive, creating a need for a cheaper end-user device. A converged network may be able to utilize a single handset device to provide both voice and data services in a cost-effective manner, reducing the equipment exposed to risk.


Chapter 4

Evaluating the Call Center Strategy

In organizations with existing call centers, agent turnover is a universal concern. Alternate facilities might be considered to reduce operating expenses (OPEX). Optimizing technologies onto a single platform can offer both cost savings and productivity increases. The converged call center can bring crisp and dynamic management to call flow balancing between multiple call centers to ensure peak agent utilization. Implementing efficient call transfer technologies in the unified service network facilitates movement of both voice and CRM data across any customer touch point within the enterprise. This multi-channel call center approach lets the enterprise humanize business relationships with customers, eases access to critical information resources, and makes agents more productive by reducing handling times. These factors all lead to increased customer satisfaction. For sales channels, increased customer satisfaction can lead to cross-sell and up-sell opportunities, increasing bottom-line revenue potential.

Business Process Optimization

Enterprises constantly tune and optimize business processes for increased productivity. Remote collaboration tools can reduce time lost to travel, and video conferencing can eliminate travel for internal company meetings. With converged services giving every employee universal access to necessary resources, the enterprise can deliver a consistent user experience.

Productivity in Resource Management

The overhead associated with adds, moves, and changes is a significant burden. It costs time and money, and for many organizations, this ongoing administrative OPEX is far more costly than the initial capital expenditure (CAPEX). Convergence reduces OPEX by creating efficiency in the administration process. Facilities managers who oversee a combination of office space and services such as voice and data in large enterprises often use what is referred to as swing space.
Swing space is extra capacity that allows for the ongoing movement of individuals or groups of employees through constant organizational change. It's essentially a buffer. For many large enterprises, swing space may comprise as much as 5 to 10 percent of the building. For many companies, the need for swing space was driven by the inability to make real-time adds, moves, and changes in the traditional business PBX platform. Moving a telephone user in the legacy PBX world required time and effort to reconfigure wiring and implement programmatic changes. In the converged network, administrative intervention can be eliminated, enabling real-time reconfiguration of data, voice, and video services. Network cabling requirements can be cut in half using the unified IP cabling plant, and WiFi technologies coupled with VoIP softphones may completely eliminate the need for cabling to the individual desk. Other fixed space assets, such as conference rooms, can become viable temporary work spaces during a reorganization, or can support special projects with significantly reduced administrative overhead for adds, moves, and changes.


Focusing on Employee Productivity

Email, voicemail, fax, and other messaging tools have provided increased ability to communicate both within the enterprise and with business partners and customers. These tools have also created inefficiencies. Some studies indicate that employees spend an average of 2 hours each day either reading and responding to email or listening to voicemail messages. To increase productivity, communications need to be managed more efficiently. The converged network provides a number of benefits to the organization. Perhaps the most significant is the ability to quickly integrate and deploy a wide range of services and applications. These services and applications can help streamline administrative tasks, letting employees focus on business goals, customer care, and revenue generation. The following three sections offer examples of application services that increase productivity in the converged data, voice, and video network.

Unified Messaging

Unified messaging platforms can give users immediate, integrated access to voice, email, and fax messages from any workstation in the enterprise, whether it's a VoIP phone or a PC. The time difference between using a phone for voicemail and a PC for email can be significant. Unified messaging is a natural lead-in to device convergence, enabling any device on the corporate network to act as the workstation of choice at any point in time. Traveling employees can access all messages from a single device, speeding response times.

Personal Communications Assistants

Enterprise employees have many different contact points:

- Desk phones: in some cases, in multiple offices
- Mobile phones: in many cases, multiple mobile phones
- Home office phone
- Email: to PC workstations and to handheld devices or mobile phones
- Video collaboration tools
- Instant messaging tools: both inside the enterprise and with outside contacts

It's becoming increasingly difficult to know which number to call or how to contact any one individual at any point in the work day. Multiple phone numbers lead to counterproductive phone tag or, worse, voicemail tag. Missed calls and multiple voicemail messages drain productivity. Communications assistant tools in a converged network can provide presence and availability information that helps workers control who can contact them and via which communications devices. Critical calls, or contacts, can be automatically routed to multiple devices to ensure efficient communications. Another advantage of these personal communications assistants is the ability to set up conference calls on demand, increasing collaboration efficiency both inside and outside the enterprise.
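As a rough illustration, the presence-based routing such an assistant performs can be sketched in a few lines. This is a hypothetical sketch: the status values, device names, and routing rules are illustrative assumptions, not any vendor's actual API.

```python
from dataclasses import dataclass

@dataclass
class Presence:
    status: str   # assumed states: "available", "busy", "in_meeting", "offline"
    device: str   # device currently in use: "desk_phone", "mobile", or "pc"

def route_contact(caller, priority_callers, presence):
    """Decide where an inbound contact should be delivered."""
    if presence.status == "offline":
        return "voicemail"
    if presence.status in ("busy", "in_meeting"):
        # Only priority contacts break through, and only on the active device.
        return presence.device if caller in priority_callers else "voicemail"
    return presence.device  # available: ring whatever device is in use

# A priority caller reaches a busy user on their mobile; others go to voicemail.
decision = route_contact("key_account", {"key_account"}, Presence("busy", "mobile"))
```

Under these assumed rules, a critical contact breaks through to a busy user's active device while every other caller lands in voicemail, which is the prioritization behavior described above.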


IP Video Solutions

The fully integrated network doesn't just converge data and voice; it brings videoconferencing power in a cost-effective, ubiquitous way. Many organizations see the power of videoconferencing as a means to reduce travel expenses. Beyond the travel cost, video can save time and provide a rich user experience in communications. In a converged network, the enterprise can provide both video on demand and videoconferencing capability to every desktop. Many organizations today are very laptop- or notebook-computer oriented, and today's systems often come with a video camera (or Webcam) built in. For companies needing to invest in Webcam technologies, high-quality cameras have become low-cost commodities and can easily be bought in volume for less than $50 apiece. Video provides an array of business communications tools that are quickly becoming a normal part of the day for many organizations. Distance learning via video allows employees around the world to continue learning without the headache of travel to a centralized training facility or classroom. Crucial business information, board meeting updates, product announcements, even staff meetings can all be viewed in either broadcast or interactive modes depending on the tools used. For teleworkers and remote staff, IP video provides face-to-face communications in real time, maintaining strong working relationships with coworkers in other remote locations and in corporate offices.

Web-Centric Businesses

For Web-centric businesses, or those tightly coupled to information technologies, the evolution of voice to next-generation voice and video services holds great potential. Although some of the content in this section may not apply directly to the enterprise implementation of unified communications today, it certainly paints a clear picture of where integrated voice and video technologies are quickly headed.
Presence and availability information are crucial, but they're also not well understood by the enterprise business world today. That is changing as society evolves. The established leadership in enterprise business today comes largely from the Baby Boomer generation, but in the high-tech market, GenXers have stepped into many leadership roles. The next-generation workforce is coming from what is now being called the Millennial generation. For many users, particularly younger users who have known Internet technologies their entire lives, tools such as instant messaging are vital communications tools. Adults often learn and adapt to new technologies from their children.
For an interesting perspective on the generation of children that has grown up with a computer mouse in hand, check out Homo Zappiens - Growing Up in a Digital Age by Wim Veen and Ben Vrakking.


In the world of instant messaging (IM), presence and availability provide key pieces of information for the enterprise. These concepts are a follow-on to the simple IM buddy list. Although presence and availability are the key buzzwords used in the unified communications space today, relevance and context are the basic concepts behind them both. The buddy list provides a window into online contacts' availability, showing simply that they are online, available, busy, at lunch, on the phone, or some other simplistic status indicator. The real unified communications vision is far, far broader. Today, communications assistants (that is, software tools) offer the ability to give users control over who contacts them: when, where, and how. In short, users can define the context they are working in and control how they are contacted. This idea is a piece of the future, but only one piece. Service integration is crucial: voice, data, and video converged on a single set of tools. Collaboration tools today are many and varied, but most currently lack extensive video services coupled with widespread application sharing. That's a collaboration component developers are just beginning to fully understand. Fixed mobile convergence (FMC) presents another piece of the puzzle, and it's sorely lacking today. In its earliest stages, FMC is to many the ability to pass a phone call from the mobile network to a locally managed WiFi network. That's really nothing more than arbitraging the cost of mobile airtime minutes. It's an achievable technology that simply needs the right protocols to make it work. If the market demand existed today, this view of FMC would be fairly easily accomplished; every technology piece needed exists in some form. As unified communications evolves, FMC will enable the initiation of a call from a PC in a home office in the morning. The converged network can make this a multimedia call with video and application sharing.
When the time comes to leave the home office and drive in to the corporate facilities, the voice stream can be handed off to a mobile handset on the home WiFi network. As you get in the car and drive away, the call will seamlessly hand off to the mobile carrier. Upon arrival at the office, it will be handed over to the corporate WiFi network. And when you arrive in your office cubicle, you'll be able to hand the call back off to the phone on your desk, to a softphone on your PC, or even re-engage a full video collaboration client on the desktop to rejoin the collaboration call. That's the unified communications path for FMC. It's more than just the network; the network, or networks, is still nothing more than a transport mechanism. Presence, availability, and relevance technologies today are barely the tip of the iceberg, barely a toddler compared with what the mature model will become. Today, you can share presence with the contacts on your buddy list, but that doesn't begin to describe the enterprise value of convergence.

Integrating Voice with Sales, Service, and Support

Combine data, voice, and video with advances in relevance, collaboration, and FMC, and think about how these services bundle together. Imagine an enterprise customer service team that is beyond relevant. A customer can call their sales rep, but the relevant enterprise will have every employee's presence and availability information. Integrate the CRM system to enable advanced customer choice. The customer doesn't have to simply leave voicemail because they can't reach their designated account rep. Why not let the customer choose between leaving voicemail and automatically ringing through to the next available member of the account team? With converged systems, you not only pass the call but can easily send all the pertinent customer information as well.
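The FMC handoff path described above, from home WiFi to cellular to corporate WiFi to the desk phone, can be modeled as a minimal state machine. This sketch only tracks which access network carries the media stream at each stage; the actual signaling (SIP/IMS and carrier protocols) is omitted, and all network names are illustrative assumptions.

```python
# Assumed set of access networks/devices a call can be handed off to.
KNOWN_NETWORKS = {"home_wifi", "cellular", "corporate_wifi", "desk_phone"}

class FmcCall:
    """A call whose media stream can move between access networks."""

    def __init__(self, network):
        self.network = network       # network currently carrying the stream
        self.history = [network]     # record of every handoff

    def hand_off(self, target):
        """Move the active media stream to a new access network/device."""
        if target not in KNOWN_NETWORKS:
            raise ValueError("unknown handoff target: " + target)
        self.network = target
        self.history.append(target)
        return self.network

call = FmcCall("home_wifi")      # morning call starts on the home WiFi
call.hand_off("cellular")        # driving: handed off to the mobile carrier
call.hand_off("corporate_wifi")  # arriving at the office
call.hand_off("desk_phone")      # finally back on the desk phone
```

The point of the sketch is that the call object persists across handoffs: from the user's perspective it is one continuous conversation, while the transport underneath changes several times.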



Call Centers: Localized, Distributed, Offshore

Now consider the ramifications of all this enterprise relevance, presence, and availability capability tied into the call center philosophy. Why not make the entire enterprise a business-relevant call center? You can know where every employee is and their availability. You also know what tools they have available to communicate in the moment. Voicemail jail disappears from the landscape; nobody ever needs to leave a voicemail message for their account rep except by choice. This evolution of convergence will redefine the easy-to-do-business-with enterprise: the relevant enterprise. But at the core, it's not relevance. It's not presence. It's not availability. It's responsiveness. It enables nimble adaptation to the tactical needs of day-to-day business: business communications at the speed of thought. When you think of the global telephone network, one of the features that has made it so valuable for worldwide business is its ubiquitous presence. The telephone is everywhere. Why not leverage convergence and unified communications technologies for the ubiquitous enterprise: the enterprise that is always on, always accessible, always responsive? That is where the next generation of communications convergence is headed. Integrating data, voice, and video today is the enabling foundation.

IVR Systems

As Chapter 2 noted, interactive voice response (IVR) systems are computerized systems that let callers choose options from a voice menu. With advances in voice recognition technologies, you'll not only see new IVR applications, you will begin to see costs drop significantly. Simple IVR solutions let the caller speak simple answers such as "yes," "no," or numbers in response to the prompts, but they continue to grow in sophistication. IVR systems today can also read out complex and dynamic information such as email messages, news reports, weather information, and faxes using complex Text-To-Speech (TTS) conversion tools.
These TTS systems use recordings of human voices, broken into very small speech fragments that are assembled to create a very lifelike voice. Although IVR systems have been used to create service solutions such as airline ticket booking, banking by phone, balance inquiry, and so forth, in the converged network they present a new set of services for internal use. They allow employees out of the office to call in and have email or fax messages read back as part of a unified solution. IVR technologies enhance the ability of any employee, from any location, to receive and respond to important business calls and email messages.
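The core of the IVR menu logic described here is simple dispatch: a recognized caller response selects an action. The sketch below is purely illustrative; the menu options and action names are assumptions, not any IVR platform's API.

```python
# Hypothetical internal-use IVR menu for the converged network.
MENU = {
    "1": "read_email",   # TTS reads back new email messages
    "2": "read_fax",     # TTS reads back received faxes
    "3": "voicemail",    # play voicemail messages
    "0": "operator",     # transfer to a live operator
}

def ivr_select(caller_input):
    """Map recognized caller input (spoken or keyed) to an IVR action.

    Unknown input falls through to repeating the menu, which is the usual
    IVR behavior for unrecognized responses.
    """
    return MENU.get(caller_input.strip(), "repeat_menu")
```

In a real system the input would come from DTMF detection or speech recognition rather than a string, but the dispatch step is the same.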



Computer Telephony Integration

Chapter 2 also reviewed how Computer Telephony Integration (CTI) can facilitate interaction between the telephone system and enterprise computer systems. The converged network integrates data and voice onto a single IP infrastructure, drastically reducing the cost and complexity of CTI. In the converged service network, caller information, screen pops, call control tools, and outbound calling features are integrated into the VoIP system more tightly than was possible when integrating legacy voice and data network services. Call transfer, call hold, and conference calling features often become one-click operations. In Internet services today, the idea of click-to-call gets a lot of attention. On the Internet, it's a nice idea that's still forming; in the enterprise converged network with CTI, it's a reality today. The converged network of unified communications tools makes CTI features readily accessible to business operations groups that previously couldn't implement these efficiencies.

Application Integration with CRM and ERP Systems

Enterprise Resource Planning (ERP) driven, Web-centered collaboration in business-to-business (B2B) interaction has grown significantly in recent years as business partners found ways to leverage Web services in Internet technologies. ERP systems are often implemented during process re-engineering within enterprise businesses to help break down the legacy silo mentality that compartmentalized large companies into smaller fiefdoms, often struggling internally within the organization. ERP systems dissolve many barriers by unifying all data resources and business processes under a single umbrella solution. This unified approach facilitates, and even encourages, collaboration between different business work groups. Although ERP systems frequently began as supply chain monitoring tools in the manufacturing sector, today they're widespread across every business environment.
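Returning to the screen pop described under Computer Telephony Integration above: when a call arrives, the caller's number is matched against CRM data so the agent sees the customer record as they answer. The record structure and phone numbers below are illustrative assumptions, not a real CRM schema.

```python
# Hypothetical CRM records keyed by caller number.
CRM_RECORDS = {
    "+15551230001": {"name": "Acme Corp", "open_tickets": 2},
    "+15551230002": {"name": "Globex", "open_tickets": 0},
}

def screen_pop(caller_id):
    """Return the customer record to display as the call is delivered.

    Unknown callers get a placeholder record so the agent's screen still
    pops with something useful.
    """
    return CRM_RECORDS.get(caller_id, {"name": "Unknown caller", "open_tickets": 0})
```

On a legacy network this lookup required a separate CTI link between the PBX and the data network; on a converged IP network the call signaling and the CRM query travel the same infrastructure.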
Today's ERP systems support manufacturing, supply chain management, customer relationship management (CRM), sales force automation, human resources, and more. All of these components of ERP require unfettered access to voice services and data resources. As with CTI, the convergence to a single network infrastructure for data, voice, and video reduces the cost and complexity of implementing ERP systems. For many organizations, the ERP system represents what Bill Gates has referred to as the "Digital Nervous System" of the enterprise. ERP systems can be very tightly coupled as a service on the converged enterprise network. Communications tools, integrated with process management and monitoring, provide a level of integration that speeds countless business operations, reducing costs and increasing efficiency within the company.



Customer Relations and CRM

CRM describes a wide array of business capabilities, methodologies, and technologies that support how an enterprise manages day-to-day relationships with customers. CRM systems bring reliability and consistency to customer interactions, enriching the customer experience and increasing customer satisfaction overall. As a corporate strategy, CRM is often implemented to create and maintain lasting customer relationships. As Chapter 2 noted, CRM is often a cultural shift for organizations, moving to a holistic view of managing the entire lifetime of the customer relationship. CRM systems are typically implemented within the marketing, sales, and customer service groups that have the most frequent and direct customer contact. One key to CRM success in the converged network is simply the ability to capture every single touch point and every customer interaction that occurs, regardless of which group within the enterprise is involved. CRM's focal point is to create a customer-based culture of end-to-end service. Convergence integrates data and voice services to effectively capture this information for continuous analysis and improvement. Figure 4.2 revisits a process flow introduced earlier in this guide. CRM may be viewed as analogous to ERP for many enterprises. What the CRM system enables is capturing every piece of information the organization has about every interaction with every customer, throughout the life of the customer relationship. This information repository becomes enterprise metadata; that is, data about your enterprise business that can be analyzed and leveraged to further streamline and improve customer service processes.

[Figure 4.2 shows a process flow: Business Activity -> IT Systems -> Data Warehouse -> Data Mining -> CRM -> Knowledge Management -> Information, yielding Informed Technology Choices + Intelligent Resource Procurement/Allocation + The Right Service Bundles = VALUE]

Figure 4.2: Building value with CRM.


To fully use the strength of the converged network, you need to capture every business activity into an IT system: every one. This data warehouse of information becomes the repository of what happened at every step of the business flow. This information can be mined and processed against customer interaction data from the CRM system. This entire process of knowledge management gives business leaders fully developed information about the enterprise so that they can make informed technology choices for the future, intelligently procure and allocate resources within the enterprise, and develop the right new product and service offerings to increase revenue by filling customer needs, both present and future. That's the value of CRM. Where the rubber meets the road, at customer interaction, you use a variety of communications tools: data, voice, and increasingly video. The converged network lets employees focus on customers while leveraging the strength of the technologies to automate the capture of the business intelligence that will help the enterprise evolve. CRM systems today provide comprehensive order history tracking, click-to-email, click-to-call, and analysis reporting tools that arm employees to manage larger sets of customers more efficiently than could ever be achieved in the legacy voice and data networks.
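The capture-and-mine flow of Figure 4.2 can be sketched in miniature: every customer touch point is written to a warehouse-style event log, which a later analysis pass can mine. The field names and channel values are illustrative assumptions, not a real warehouse schema.

```python
from datetime import datetime, timezone

warehouse = []  # stands in for the data warehouse in Figure 4.2

def capture_touch_point(customer_id, channel, outcome):
    """Record one customer interaction for later mining and analysis."""
    event = {
        "customer": customer_id,
        "channel": channel,      # assumed values: "voice", "email", "video", "web"
        "outcome": outcome,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    warehouse.append(event)
    return event

# Every interaction, on any channel, lands in the same repository.
capture_touch_point("cust-42", "voice", "order_placed")
capture_touch_point("cust-42", "email", "support_request")

# A trivial "mining" pass: interactions per channel for one customer.
by_channel = {}
for e in warehouse:
    if e["customer"] == "cust-42":
        by_channel[e["channel"]] = by_channel.get(e["channel"], 0) + 1
```

The design point is that capture is automatic at the point of interaction; the analysis that builds enterprise knowledge happens afterward, against the accumulated log.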

Vertical Market Business Drivers for Change


Although there are numerous small drivers within each vertical segment of the market, one key driver across all sectors is the hyper-connected nature of workers. Workers today are information professionals regardless of the business sector they work in. Everyone in the work force is an information professional. Everyone is connected, and many workers are hyper-connected. Workers use email, office phones, mobile phones, pagers, BlackBerries, smart phones, home computers, and laptops; in short, they are almost always online. In this hyper-connected state, workers have also earned a virtual master's degree in multitasking. Using today's communications tools, workers manage more projects, cultivate more customers, and complete more work than ever before. This hyper-connected, all-inclusive toolset also means that work and personal life blur. Customer relationships today often involve personal relationships, as the work day becomes an artificial constraint for business in many areas. People work from home, or from wherever they are at the point in time they're needed. Convergence technologies and the unified communications evolution not only leverage this ability but also bring business tools that help employees ensure that business and personal lives don't blur entirely. Convergence isn't just VoIP. The convergence of data, voice, and video provides the foundational elements for unifying all communications technologies. The power of convergence doesn't lie in VoIP; it lies in integrating voice and data services of all kinds with business applications in the enterprise, tightly coupling services and applications to support business strategies.



An Example of Convergence Beyond VoIP

Voice isn't always VoIP. The real value of convergence is the broader unified communications that embraces all data, voice, and video technologies. There is a new startup company today that can provide a single virtual number that works on a mobile user's existing mobile phone. Users can both dial out from and receive calls on their virtual number, so they can easily separate their business calls from their personal calls, for example. Not only can a mobile user have two mobile numbers on a single phone, but the virtual number can be selected from almost anywhere the user desires. In this way, professionals who conduct business in multiple regions can use their virtual number to give the impression of a local presence, because callers can see that they are reaching or being reached by a local number. This virtual number is simply another telephone number. It can be assigned to your cell phone. The mobile phone can support two numbers, with Caller ID working on both, and the virtual number can be anywhere. Consider foreign students attending college in the U.S.; recent reports say there are 674,000 students with active visas at present. They could have a phone number on their cell phone in England, Japan, India, or Australia: from home, their real home, where their families live. This development brings two huge benefits: students can leverage international long-distance arbitrage to obtain the cheapest per-minute cost to call home, and mom can call back and reach her student with a local phone call. Foreign students are just one example. How many workers travel to other parts of the world from their home to work? Some parts of the U.S. are heavily populated with migrant workers in agriculture. Many professionals in the medical field come to the U.S. from the Philippines. How many people have corporate offices in their home country and in another part of the world?
What about a consultant in Idaho who wants a Washington, DC number to work on contracts with the federal government? Local presence, via a local phone number, is easily accomplished, and Caller ID follows that number. Place a call from the UK number, and that is what the person receiving the call will see on their display. Beyond that, there is another aspect: aliasing existing telephone numbers. Let me give a personal example of aliasing. I have phone numbers for home, home office, office, personal cell phone (Treo), business cell phone (BlackBerry), and a couple of others that I use daily. With telephone aliasing technology, I can make them all appear on my cell phone, just like the virtual number described a moment ago. I can make and receive calls with the full presence of my telephone number. If I call on work business, I can place the call from my business number, and that is what the called party will see on their display. In short, the identity of my telephone number is extended to my mobile phone, complete with Caller ID information. I know I keep mentioning Caller ID, but it's important. Let's look at a broader vision. Consider professional services workers: consultants, doctors, and lawyers. Consultants and lawyers make their living on billable hours. There is a law firm in Silicon Valley that has privately estimated it loses $1.5 million a year in billable time from attorneys talking on their cell phones while driving. If those attorneys had virtual numbers with an account code, that time could be tracked, and billed, driving billable time back into the corporate revenue stream. Consultants may provide a block of hours or billable service to special clients. Many provide a dedicated phone number for the client to reach them any time. What a perfect fit: a virtual telephone number, dedicated to a client, with account code tracking for billing built right in. Doctors have different constraints.
When a doctor calls a patient with test results, Health Insurance Portability and Accountability Act (HIPAA) regulations forbid leaving patient test results in a voicemail message; the medical professional has to speak with the patient. Doctors don't call patients from their cell phones because they don't want patients calling them back there. If you get a call from your doctor's cell phone, you're likely to let it go to voicemail as an unfamiliar number anyway. But if you see the medical center number on Caller ID, you know they're calling with test results. Everyone is assured the privacy and confidentiality of patient information, and through Caller ID, you leverage the telephone network for better efficiency. Beyond North America, these technologies mean that global boundaries don't matter. If I do business in six countries, I expect I'll be able to get virtual telephone numbers in every country at some point. All on my mobile phone. All at once. Whether the underlying network is VoIP, PSTN, or cellular is irrelevant to the user. Convergence hides the technology and simply makes it a voice service.
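The virtual-number aliasing running through this example can be modeled very simply: several numbers alias to one physical handset for inbound calls, and for outbound calls the user selects which alias to present as Caller ID. The numbers and data model below are illustrative assumptions, not the startup's actual service.

```python
# Hypothetical aliases: all of these numbers ring the same handset.
ALIASES = {
    "+14255550100": "handset-1",   # business number
    "+12085550123": "handset-1",   # home-office number
    "+442075550142": "handset-1",  # UK virtual number for local presence
}

def route_inbound(dialed_number):
    """Every aliased number rings the same physical handset."""
    return ALIASES.get(dialed_number, "unassigned")

def outbound_caller_id(chosen_number):
    """Outbound calls present whichever of the user's aliases was selected."""
    if chosen_number not in ALIASES:
        raise ValueError("not one of the user's numbers")
    return chosen_number
```

So a caller dialing the UK number reaches the user's one handset, and the user returning that call presents the UK number on the recipient's display, which is the local-presence effect described above.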


Business Sales and the Web, or Net-Enabled Business

Earlier, it was pointed out that the real challenge of convergence is in supporting enterprise business strategies. The fundamental goal isn't to implement convergence. It isn't to shift to VoIP. The real objective is solving business problems. In a recent article entitled "VoIP Outside of the Box: A New Way of Thinking," Telephony World editor Don Panek had this to say: "Business owners and executives today are not sitting around board room tables discussing their existing phone systems, services, or expenses. They're not thinking about VoIP or Telephony and what it can do for their bottom lines. They're discussing real business problems and looking for real solutions to those problems." His closing drove the point home yet again: "I think the time has come to stop talking about VoIP and Telephony and start talking about applications and solutions. Solving problems. And at the end of the day after the order has been signed, we can then mention that all the phone calls will be free or virtually free! And how is that done? Well the solution has VoIP built in!" Communications technologies manufacturers, vendors, and developers often get sidetracked into the idea of how their products can interoperate: how they can build a platform. Customers don't care about a platform, and don't care what vendors are doing beneath the hood. Customers care about their business problems and solutions to those problems. For business sales, Web-enabled businesses, and the hyper-connected workers of today, access to corporate business resources and instant, easy communications are vital to success. Convergence of network voice and data services provides a first step. Integration with ERP and CRM systems takes convergence further along the road to productivity enhancement and business success.
Product and Services Sales

Convergence to the fully integrated network provides access across the enterprise to a full set of business tools. Business sales, whether of products or services, all have a time interval: a sales cycle. Outside this sales cycle, business information travels at the speed of light, or of electrons on the wire. Converged services in the hyper-connected, always-on business provide differentiators that didn't previously exist, in part because the converged business is a nimble business able to respond quickly to market demands. Some examples of how this always-on enterprise can leverage the converged data, voice, and video services of the future include:

- All markets are up for grabs: The marketing paradigm changes dramatically. The always-on enterprise leverages communications tools to reframe the discussion and win customers while the less-empowered competitor is left behind.
- Difference, not differentiation: The converged enterprise minimizes the behavioral changes required of customers by embracing change within the enterprise. This can give your customers a tangible set of reasons to love your products and services.
- Don't disappoint: By leveraging reliable new technologies, you ensure that everything works and the organization can react instantly to a situation.


- Make your marketing sociable: The enterprise can't control the customer conversation, but it can leverage best-in-breed convergence technologies and refined business processes to build genuine relationships with established and potential customers rather than white-noise relationships. CRM becomes not a buzzword, not a catch phrase, but a corporate culture of nurturing the business.
- Interaction requires iteration: It's not enough to listen and respond to customers. Business success requires a long-term, sustainable dialogue that convergence technologies support. Meaningful long-term connections with customers come from community, co-operation, and co-creation: all collaborative efforts that the integration of services and applications enhances.
- Don't forget to sell: Engagement is great, but it doesn't pay the bills. Remember that every touch point throughout the customer relationship cycle is an opportunity to up-sell, cross-sell, or lose business. Selling is responding to customers. It's about making it easy to do business with you. Convergence technologies integrate all your corporate intellectual capital into a suite of services and applications that make you easy to do business with.

Financial Services

Beyond the basics of efficiency and productivity, there is another set of drivers in the financial services sector tied to regulatory compliance. Whether it's the Sarbanes-Oxley Act (SOX), the Gramm-Leach-Bliley Act (GLBA), ISO 17799, or the IT Infrastructure Library (ITIL), there is a set of practices that is ever widening across business sectors. The financial services sector has always been subject to close auditing and scrutiny. Beyond audit or compliance considerations, the financial services sector is perhaps the closest sector to a pure information economy. In financial services, the ledgers and paper trails of old have become a stream of information on the network. The paper is gone, and finance is all about moving information quickly, accurately, and securely. Convergence of data, voice, and video onto a single infrastructure will, for many financial institutions, provide a consolidated approach to monitoring and management, simplifying the entire data capture, warehousing, and analysis process for regulatory compliance. Figure 4.2 showed the importance of CRM in the business flow, but that same chart provides an overview of touch points in the flow that ease compliance reporting and documentation for companies involved in financial services. For financial services businesses, the reduction in the cost of call center technologies may play a key role in redefining the financial services business. Deploying call center methodologies in the smaller volumes now affordable with convergence can drive measurable gains in productivity down into smaller workgroups.



Health Care

Like financial services, the health care environment brings a unique set of compliance requirements related to HIPAA. This legislation established standards for transactions in the health care sector, but also established requirements for the security and privacy of patient health data. Converged services again provide a single infrastructure to secure and bring into compliance. The complexity of HIPAA has spawned numerous supporting consulting service markets. Network consolidation can allow health care providers to take a single, unified approach to HIPAA compliance, thereby focusing on their core business: health care. Beyond compliance, the converged network brings the model of CRM to patient care in the health care environment. This integration of voice and data services means that doctors, physician assistants, nurses, and other health care professionals can leverage advances such as screen-pop CTI and click-to-call technologies to better serve patient needs.

Manufacturing

For manufacturing, the advantages of convergence may seem hidden. The manufacturing sector doesn't appear at first to be tied to network services the way purer information-driven companies might be. But in manufacturing, there are several key areas where convergence brings real value to the integration of data, voice, and video. Perhaps the most tangible value lies in the broad spectrum of supply chain management and the vendor/supplier relationships that make up an integral part of the manufacturing process. Inventory control data systems can easily prompt, via an enterprise ERP system, a contact to a supplier to ensure inventory restocking is timely. The integration of data and voice in the converged network has, for many manufacturing businesses, proven the key to success in following the precepts of just-in-time component delivery.

Summary
In a recent paper entitled "Voice and Video over IP: Leveraging Network Convergence for Collaboration," Melanie Turek, Senior Vice President & Founding Partner at Nemertes Research, had this to say:

For several years, the big question with Voice over IP (VoIP) was whether it actually worked, and if so, whether it worked well enough for corporate ears. Well, the answer is in: Yes! As long as the network is architected properly, VoIP is definitely ready for enterprise use, and convergence projects are running strong in the vast majority of organizations. Better still, while voice typically is the first application implemented on a converged IP backbone, Nemertes is starting to see IT executives explore new applications, such as video, unified communications, and other collaborative tools, that can also leverage the IP network. The benefits can be great, including cost savings and increased productivity in the virtual workplace.


As they deploy these and other technologies, companies are starting to recognize the need for network optimization, enhanced management tools, and tight security. And although most IT executives don't spend a lot of time worrying about specific standards, they like what standards get them: easy integration and interoperability among vendors and networks, both of which are important when it comes to communications technologies. As SIP grows more robust and more common, companies will have more vendor options open to them, and they'll start to take advantage of the benefits convergence brings: integration, interoperability, and the ability to stay agile in an increasingly global world.

Most companies that have already deployed VoIP, when queried, identify plans to extend converged voice service to teleworkers. Although today the focus is predominantly voice service, other collaborative applications, such as voice conference services, desktop videoconferencing, and Web conferencing/collaboration, are expected to follow quickly. Many organizations are already running some kind of video over IP. Business managers show increasing interest in leveraging the technology for more than just voice communications. Many enterprises are running trials and developing applications using video to address business needs in their emerging converged network environment. General industry predictions are for significant growth in both desktop and room-to-room video deployments in 2007 and 2008. IP networking and advances in broadband technologies have made video an affordable and practical business tool. Although desktop videoconferencing is viewed as a relatively new technology by almost everyone, the barrier to entry is very low. Business managers and strategists see video as one tool in a larger suite of collaborative applications, and show their interest in using it where no video exists today.
According to a report by Nemertes, video is one of the leading drivers for converged networks (28 percent of participants in Nemertes' latest benchmark name video as a key driver). That's very important because of the large number of enterprise business employees who either work remotely from their direct managers or are geographically dispersed in remote offices. Some broad industry projections show that teleworkers have increased as much as 800 percent in number over the past 5 years. As companies become more global in nature, and more widely dispersed, the value of real-time communications over converged networks becomes apparent. Citing Nemertes again, in a convergence benchmarking study, participants were asked to rate on a 1-to-5 scale (where 1=unimportant; 2=somewhat important; 3=important; 4=very important; 5=vital) how important the following drivers were in approaching convergence:

- Growing revenue
- Boosting employee productivity
- Gaining competitive advantage
- Reducing costs
- Meeting regulatory/legal requirements


Results showed, not unexpectedly, that growing revenue is the foremost business driver across all surveyed organizations, with a mean score of 4.38 out of 5. Second in importance was meeting regulatory and legal requirements, with a 4.07. Cost reduction placed third in the survey with a 4.02 mean score. Gaining competitive advantage scored 3.84. Of these five categories, boosting employee productivity came in with the lowest rating at 3.8. When you compare these strategic business drivers with IT services and network convergence projects, it becomes apparent that demonstrating benefits at the top line (revenue) increases the adoption rate of convergence. Cost reduction is also important, but gains in productivity are still seen as a soft cost and are very difficult to quantify. Given the difficulty in proving these benefits, it's not surprising that improving employee productivity is the lowest-rated of the business drivers in Nemertes' survey on convergence.

Organizations that have gone through the evolution of adopting convergence technologies have identified several lessons learned:

- When methodically implemented, the converged service network can lower OPEX and increase employee productivity.
- The converged service network typically creates a scalable infrastructure capable of supporting new business applications in a dynamic environment.
- To achieve cost savings and productivity benefits, a holistic view of business services and applications is required. It's critical to look at the full picture and not just focus on either cost savings or employee productivity.
- The full converged network can provide greater visibility into granular cost controls.

Industry studies have shown that 80 to 85 percent of the enterprises that have already implemented a converged network find that the quality, resiliency, and scalability these technologies provide either meet or exceed their expectations.

Convergence of data, voice, and video is a viable technology. It's available in the market and can be implemented today. Through integration of unified data, voice, and video onto a single IP-based infrastructure, organizations can lower their total cost of ownership (TCO). They can lower expenses for equipment and maintenance, reduce administrative costs, and lower carrier charges. The converged services network can also increase productivity and enterprise communications capabilities by facilitating employee mobility and providing a solid foundation for the deployment of advanced, feature-rich services and solutions.



Chapter 5: Key Steps in VoIP Deployment and Management for the Enterprise
Different applications have different requirements. The introduction of voice and streaming video onto the existing IP network presents a completely new set of requirements to the operational performance envelope of the network. This chapter will examine the importance of assessing the readiness of the network and fully evaluating design considerations related to delivering integrated service in the enterprise. To address new service integration, the chapter will explore a methodology, called the performance envelope, for mapping the characteristics of the corporate network.

Network Readiness Assessment


As businesses delve into VoIP and video solutions in the enterprise, their focus is on business drivers. For many companies, cost reduction is a key business driver for convergence. Cost cannot be the only driver, though, and for practical purposes, it shouldn't be the primary driver. It's important to set reasonable expectations and to fully understand all the business drivers behind service integration. A new application service or an integrated VoIP and CRM solution will require consideration of very different factors than cost reduction alone. It's vital to fully understand the business motivation for converging video, voice, and data so that the project planning and implementation teams can address the true success factors when mapping out project milestones. Communicating the objectives clearly to everyone involved helps maintain a clear view of the expected results.

Ensuring Network Readiness for Converged Services

Ensuring network readiness is a significant task and not to be taken lightly. Over the past 4 or 5 years, both VoIP vendors and systems integrators have learned the hard lesson of failure because of inadequate preparation and planning. The existing data network has been optimized over time to support the existing business requirements. As new applications and services are added, network tuning takes place and a variety of parameters are often tweaked. Historically, these optimization efforts have been driven by packet data applications such as CRM, ERP, and Web services solutions: normal IP applications. Voice and video present new challenges as they add streaming, real-time traffic to an existing service network. VoIP call quality requires low latency in the network. Jitter needs to be low and, to avoid complicated jitter buffering requirements, it needs to be consistent. Bandwidth and packet loss need to support the new, integrated services.


Testing and documenting the parameters and characteristics of the existing network before extensive planning will help ensure the network is capable of supporting VoIP, video, or both. It's important to understand aspects of the traffic needed to support all the different service types that will coexist on the new network. Traffic types, frame size, prioritization schemes for quality of service (QoS), jitter, latency, and packet loss are all crucial factors. Because consistent network performance is so important, it's also prudent to evaluate utilization in the network at both peak and normal times. If the network will include PSTN gateways for global connectivity, there may be requirements specific to the gateway demarcation between networks. Both the IP network and the PSTN need to be considered. The PSTN side of the gateway will have specific trunking requirements. These have traditionally been calculated using Erlang-B formulas. The IP network must support the bandwidth and other network characteristics needed to ensure a clean handoff of calls from one network to the other.
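One way to quantify jitter during a readiness assessment is the smoothed interarrival-jitter estimator that RTP receivers use (RFC 3550). The sketch below applies that formula to recorded send and receive timestamps in milliseconds; the function name and list-based input are illustrative, not part of any standard tool.

```python
def rtp_interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate per the RTP (RFC 3550) smoothing formula.

    For each consecutive packet pair:
        D = (recv_j - recv_i) - (send_j - send_i)
        J += (|D| - J) / 16
    Perfectly even packet spacing yields a jitter of zero.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter
```

Feeding it timestamps captured from a test stream gives a single, comparable figure: packets sent every 20ms and received with constant delay score 0.0, while any variation in the interarrival gaps pushes the estimate up.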
Erlang-B Calculation

Telephony carrier central offices and enterprise PBX systems are engineered for trunking requirements using the Erlang-B traffic tables and hundred call seconds per hour, or CCS. 36 CCS would represent a circuit at maximum occupancy with zero idle time. I won't explore the formulas in any depth as we investigate managing the converged services network, but it's worthwhile to understand the complexity and sophistication of traffic engineering in the traditional telephony environment. The formula for determining voice traffic load that's most widely used is:

Offered load (erlangs) = (calls/hour x average minutes/call) / 60

Erlangs are really a simple concept identifying the number of calling minutes per hour a system can handle. The key differentiator is between the traditional circuit world, where minutes equate to busy circuits, and the VoIP world, where minutes of use may not be applicable. In VoIP services, it's really the carrying capacity of the network in packets that's the key indicator.
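The offered-load formula above, and the Erlang-B recursion the traffic tables are built from, can be sketched in a few lines. This is a minimal illustration for sizing PSTN gateway trunks, not a replacement for a proper traffic-engineering tool:

```python
def offered_load_erlangs(calls_per_hour, avg_call_minutes):
    """Offered load = (calls/hour x average minutes/call) / 60, in erlangs."""
    return calls_per_hour * avg_call_minutes / 60.0

def erlang_b_blocking(offered_erlangs, trunks):
    """Blocking probability via the standard Erlang-B recursion:
    B(E, 0) = 1;  B(E, m) = E*B(E, m-1) / (m + E*B(E, m-1))."""
    b = 1.0
    for m in range(1, trunks + 1):
        b = offered_erlangs * b / (m + offered_erlangs * b)
    return b

def trunks_needed(offered_erlangs, target_blocking=0.01):
    """Smallest trunk count whose blocking probability meets the target."""
    trunks = 0
    while erlang_b_blocking(offered_erlangs, trunks) > target_blocking:
        trunks += 1
    return trunks
```

For example, 200 calls per hour averaging 3 minutes each is an offered load of 10 erlangs, and holding blocking to a 1 percent grade of service at that load requires 18 trunks, matching the published Erlang-B tables.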

The enterprise IP network must complement the trunking requirements when connecting to the PSTN. Overlooking details like this may result in performance bottlenecks that impact call quality.

Plan to Succeed: Don't Fail to Plan

Whenever new technologies are implemented in the corporate network, there are some basic planning steps that should not be overlooked. These basics will improve the likelihood of success in the deployment of converged voice and video services. Although this brief section seems to be focused on the planning and implementation of integrated services, that is the crucial starting point of holistic management of the total integrated services network. When an organization decides to implement VoIP, that decision commonly leads to acceptance of a new, holistic approach to managing the entire service network.



Analysis

Before undertaking a convergence project, fully identify the business drivers for change. It's important to identify the business objectives and set everyone's expectations accordingly. List the benefits you expect to derive from integrating services so that each benefit can be addressed individually. Broad, sweeping objectives such as "save money" aren't measurable. The success of VoIP implementations really depends on advance data gathering and analysis. The more you can document prior to any implementation work, the greater your likelihood of success. It's important to balance objectives and expectations with analysis of your network's capabilities. The determination that your network can support the new, converged services, or the discovery that it can't and will require upgrades in routing and switching, will influence both the budget and the project timelines.

Planning

There's an old adage that says people don't plan to fail, they fail to plan. It's vital that your project team be methodical and document every facet of the entire planning process. One widespread complaint heard from both systems integrators and VoIP vendors has been that assessment and planning have been the weak link in a high percentage of failed projects. Resources invested in planning, whether spent on a project manager or the hours spent by a technical team evaluating requirements, will bring benefits at cutover time. There is no substitute for preparation and planning. For an enterprise mission-critical service such as voice, failure can be especially painful, both in stress and in business impact. The more meticulous the planning process, the higher the chance of success.

Testing

Once the network requirements have been documented and are fully understood, test to make sure the existing network can support your VoIP requirements. Comprehensive analysis up front helps ensure that you know what to test and what parameters or characteristics your network requires.
It's important to approach testing with the specific aim of supporting the fully converged network. Testing VoIP services alone is not enough. Existing applications should be tested while simulating VoIP. There is a hidden danger of integrating VoIP services and doing all the right things to support call quality, only to starve network resources for other mission-critical applications. It is absolutely vital to take a holistic view of all network services and applications when testing.

Acquire the Right Resources

The right resources will vary from organization to organization. For some, acquiring resources might mean dedicating a project manager. Most organizations will need some training to bring existing staff up to speed on VoIP. For many companies, this will entail bringing in a VAR partner or systems integrator to bolster the expertise available. There is no shame in admitting you need expertise from outside and working with a consulting partner. Leverage expertise everywhere you can.


Look to the Future: Think Long Term

For several years, VoIP has been viewed somewhat as the final objective. It's important now to recognize that this isn't the case. VoIP is a foundation, or building-block, technology. It's important to look at scalability in terms of users and raw numbers, but that's only scaling vertically. An earlier chapter looked briefly at the Service-Oriented Architecture (SOA) and Software as a Service (SaaS) concepts. This guide has also talked briefly about emerging characteristics of unified communications such as presence and availability. These are horizontal scalability factors. To future-proof your network and ensure it can continue to grow to support the enterprise business, a holistic approach that scales both vertically and horizontally will help enable future growth at an acceptable pace. It will enhance readiness for that next new application requirement that lies just beyond the horizon now: the one you don't yet know about. The VoIP integration project presents an opportunity to review the overall future capacity of your network.

Understanding Existing Voice Needs


Once you've developed all your business needs and determined that VoIP service integration is the right business move, you need to gather data to fully understand voice service requirements. In telecommunications, there are patterns of calling that can be identified. Call usage reports from either the existing PBX or the incumbent telephony services provider offer a wealth of information. Some variation of these reports should be readily available from the current telephony technology, whether it's carrier-based Centrex-like service or a locally administered PBX. There are several key elements of voice service that need to be well understood. Business telephone systems often focus on the busy hour, or the hour of the day in which the most telephone traffic occurs. This is a starting point, but there are many nuances to consider. The busy hour for inbound calls may be different than the busy hour for outbound calls. The aggregated inbound/outbound busy hour may be different than either taken individually. These busy hours may vary on different days of the week. Most companies also have an identifiable busy day in the average week. And business managers understand the cyclical nature of their respective businesses. The busy month of the year is generally well known in any business sector. For large enterprises with multiple divisions, or business units, calling patterns may vary, and some units may have different requirements and calling patterns than other divisions or the corporate headquarters. Flexibility is key to understanding the needs of all business units. To identify specific business needs, you must understand the calling flows and volumes. Some enterprise calling patterns are heavily inbound in nature. This might be easily identified in a company using a call center, but other companies establish large outbound call centers as well. The patterns for inbound and outbound calls are often very different. It's vital to know what is required to support your particular business.
It's also important to identify upper call volumes, or absolute peak traffic.
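The busy-hour analysis described above can be sketched from call detail records (CDRs). The tuple layout and function name here are hypothetical, for illustration only; real PBX and carrier reports vary in format:

```python
from collections import Counter

def busy_hour(cdrs, direction=None):
    """Return (hour, call_count) for the hour of day with the most calls.

    `cdrs` is a list of (start_hour, direction) tuples, where start_hour
    is 0-23 and direction is "in" or "out". Pass direction="in" or "out"
    to tally inbound and outbound busy hours separately, since the text
    notes these often differ.
    """
    hours = [h for h, d in cdrs if direction is None or d == direction]
    hour, calls = Counter(hours).most_common(1)[0]
    return hour, calls
```

Running this per direction, per weekday, and per month surfaces the nuances the text describes: the inbound busy hour, the outbound busy hour, and the aggregate busy hour can all land at different times.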


Most enterprises are somewhat cyclical in nature. When analyzing calling patterns, it's worthwhile to account for several months or even the past year. Peak periods vary widely by industry. For example, a retail services company may experience much higher call volumes during the Christmas season than any other time of year, whereas a financial services company might expect the highest call volumes as tax preparation time approaches. It's important to take these peak activity periods into account and not simply design the VoIP services to support day-to-day business needs. As you develop a strategy for integrating VoIP into the network, it's also a good time to recognize that the PSTN has become a mission-critical component of business operations. The PSTN has been engineered to deliver 99.999 percent uptime. That is what the traditional carriers tout as their benchmark. This service level equates to roughly 5 minutes of downtime per year. Corporate data networks rarely provide this level of reliability. For many organizations, increasing network uptime will provide incentive for network enhancement and redesign requirements during preparation and implementation of a converged services network.

One Voice Factor: Compression Methods

One factor to consider in VoIP deployment is the encoding method used to convert analog human speech into a digital signal for packetized transmission. There are several encoding techniques used in VoIP solutions. Each uses a different compression level. These compression levels affect the bandwidth needed to transmit the voice packets, the quality of the voice, and the processing time. Pulse Code Modulation (PCM), or G.711 encoding, is widely used. It's the standard encoding used in the PSTN, and it is generally assumed to provide the best voice quality, although new wideband codecs are now starting to surpass PCM in some implementations. PCM generates a 64Kbps voice stream.
For some companies, bandwidth requirements, weighed against quality considerations, might make a different encoding scheme more appropriate. Increasing the compression of the voice signal reduces the overall bandwidth requirement, but voice quality may suffer as a result. Even though the PSTN is more than 100 years old, voice quality is still often measured by having a group of people listen to sounds in headphones and subjectively rate the quality. These evaluators rate the sound quality on a scale of 1 to 5. This rating is referred to as the mean opinion score (MOS). A rating of 1 roughly equates to the scratchy sound quality of the intercom speaker in a fast-food drive-through or a warehouse environment. A 5 is the highest rating and is considered perfect, or the theoretically highest-grade voice quality achievable. It has always been the target for corporate business calls but, as you'll see in the table, is not actually achieved. A score of 4.4, garnered by the G.711 codec used in the public telephone network, was described in the past as "toll quality" voice. This has been the traditional carriers' target for business-class voice services. Although the statistical validity of this human sampling may be questionable to some, this approach has been used for many years and is generally accepted globally. Today, MOS is often determined using ITU Recommendation P.800, which details methods for subjective determination of transmission quality. In the real world, the human ear can clearly distinguish between a 4.0 MOS and a 4.5 MOS.


Table 5.1 provides a comparison of several widely adopted encoding schemes. This table describes the codec types and algorithms used, the sample size, bit rate, and encoding delay for the algorithm used. The last column identifies the MOS for these codecs. When selecting a codec for VoIP implementations, it's important to remember that delay is cumulative from end to end. Encoding delays can become real call-quality concerns in a large enterprise network. If the total delay exceeds 250 milliseconds (ms), call quality will suffer. The ITU-T standards (http://www.itu.int/) provide extensive details about the operation of these codecs. Vendors and consulting partners will also be a valuable resource in identifying the best approach for each specific implementation.
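Because delay is cumulative from end to end, a simple budget check against the 250ms ceiling is a useful planning aid. The component breakdown and figures below are illustrative assumptions, not measured values:

```python
def mouth_to_ear_delay_ms(encode_ms, packetization_ms, network_ms,
                          jitter_buffer_ms, decode_ms=1.0):
    """Sum the major one-way delay contributors for a VoIP call:
    codec encoding, packetization (fill time), network transit,
    jitter buffering, and decoding at the far end."""
    return (encode_ms + packetization_ms + network_ms
            + jitter_buffer_ms + decode_ms)

def within_budget(total_ms, budget_ms=250.0):
    """Check the cumulative delay against the call-quality ceiling."""
    return total_ms <= budget_ms
```

As an example, a G.729 call with 15ms of encoding delay, 20ms packetization, 80ms of network transit, a 40ms jitter buffer, and 1ms of decode time totals 156ms one way, comfortably inside the 250ms limit; double the network transit and add a deeper jitter buffer and the budget quickly erodes.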
Encoding Scheme | ITU Codec | Sample Size | Bit Rate | Encoding Delay | MOS
Pulse Code Modulation (PCM) | G.711 | 8 bits | 64kbps | <1ms | 4.4
Adaptive Differential Pulse Code Modulation (ADPCM) | G.726 | 4 bits | 32kbps | 1ms | 4.2
Sub-Band Adaptive Differential Pulse Code Modulation (SB-ADPCM) | G.722 | 8 bits | 64kbps | 4ms | N/A
Low-Delay Code Excited Linear Predictive (LD-CELP) | G.728 | 40 bits | 16kbps | 2ms | 4.2
Conjugate-Structure Algebraic Code-Excited Linear Predictive (CS-ACELP) | G.729/G.729a | 80 bits | 8kbps | 15ms/10ms | 4.2
Algebraic Code-Excited Linear Predictive (ACELP) | G.723.1 | 160 bits | 5.3kbps | 37.5ms | 3.5

Table 5.1: Encoding schemes.

PCM, or G.711, is the method that has been used in the PSTN. It's perhaps the most widely used codec in VoIP systems because it provides a well-known call-quality rating. Every VoIP equipment manufacturer supports G.711. G.726, known as Adaptive Differential Pulse Code Modulation (ADPCM), may reduce the bandwidth requirements by 50 percent while posing only minimal degradation (0.2 of a point) in the MOS rating. G.722 is mostly used in FM radio, but is included here because FM radio quality is a good audio-fidelity comparison. It's rarely used in VoIP solutions. G.722.1 offers lower bit-rate compression. A more recent variant, G.722.2, also known as AMR-WB (Adaptive Multirate Wideband), offers even greater compression and the ability to adapt quickly to varying compression levels if needed due to network changes. Bandwidth is conserved during periods of high network congestion, then returned to a normal level when congestion is alleviated. Because G.722 and its variants sample audio at a rate of 16kHz, rather than the PSTN PCM rate of 8kHz, these codecs deliver far better sound quality, clarity, and fidelity. G.728, or Low-Delay Code Excited Linear Predictive (LD-CELP), coding is widely used in voicemail systems for digitizing stored voice messages. G.729 can deliver an 8Kbps voice stream with less than 16ms of processing time. The G.729 codec standard is often used in Voice over Frame Relay (VoFR) and is supported by many frame relay equipment vendors.
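The bandwidth trade-off among these codecs can be made concrete. Every voice packet carries roughly 40 bytes of IP/UDP/RTP headers in addition to the codec payload, so the rate on the wire exceeds the codec's nominal rate. This sketch ignores Layer-2 framing overhead, which varies by transport:

```python
def per_call_bandwidth_kbps(codec_kbps, packetization_ms, header_bytes=40):
    """Bandwidth of one one-way voice stream, in kbps.

    Adds the 40 bytes of IP/UDP/RTP headers carried by every packet
    to the codec payload accumulated over one packetization interval.
    """
    payload_bytes = codec_kbps * 1000 / 8 * (packetization_ms / 1000.0)
    packets_per_sec = 1000.0 / packetization_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_sec / 1000.0
```

With 20ms packetization, a 64Kbps G.711 stream consumes 80Kbps on the wire, while an 8Kbps G.729 stream consumes 24Kbps; the header overhead explains why the savings from a low-bit-rate codec are smaller than the raw codec rates suggest.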


The G.729a codec is a variation on CS-ACELP that is rapidly growing in popularity. It compresses the voice stream into 10ms frames. It's less CPU intensive than many of the others, making it a good choice for bandwidth-constrained implementations. It offers a blend of bandwidth and delay performance characteristics that is making it quite popular for VoIP services. It has the added advantage of interoperating with G.729 codecs. G.723.1, or Algebraic Code-Excited Linear Predictive (ACELP), codecs work differently. They create models of the human voice and predict what the next sound will be. These codecs encode the differences between the predicted sound and the actual sound. Only this difference is transmitted to the receiving end, reducing bandwidth requirements. Although bandwidth needs may be reduced, this codec has often resulted in complaints that women's and children's voices were not represented accurately. The codecs described earlier are narrowband codecs, initially developed to support narrow voice channels. The PSTN voice channel is a 64Kbps circuit carrying audio signals in the range from 300Hz to 3400Hz. Some pure IP voice solutions are now beginning to adopt broadband codecs, mostly proprietary, for VoIP call processing. It's expected that the use of these and other new broadband codecs will increase as more and more voice calls are carried over IP in the future.

Voice Call Quality: The PSTN vs. VoIP

As mentioned earlier, the PSTN has been in operation for well over 100 years. During that lifetime, it has been tuned and optimized to deliver one type of traffic, voice, as efficiently as possible with acceptable call quality. There are quality assurances in the traditional PSTN that are sometimes inferred, based on past use, and for enterprise business, sometimes contractually guaranteed. In short, voice service on the PSTN is a known quantity. You know exactly what to expect for call quality when you pick up the telephone.
The PSTN is a connection-oriented network. When you make a telephone call, the network establishes a voice path, or circuit, across the network that is dedicated to that specific telephone call and meets all the basic quality requirements you've come to expect. Over the years, the telecommunications carriers have optimized the PSTN to support the delay, jitter, and loss characteristics that guarantee the expected voice call quality. In this connection-oriented, or circuit-switched, network, each circuit is dedicated to a single conversation, so security requirements are easily met. The circuit resources aren't shared as they are in an IP network. Once the call is completed, the circuit is disconnected, or torn down. IP networks such as the Internet share resources all the time. They were designed to carry multiple traffic types. That's the fundamental shift being embraced with convergence. In the legacy past, we designed and optimized a specific network for each type of traffic. A large enterprise might well have a voice network, a data network, and a video network, each with different performance characteristics. Convergence is about leveraging all these technologies, data, voice, and video, onto one single network infrastructure. To succeed, it's important to understand how these different networks behave.


IP is commonly referred to as a "best efforts" protocol. The protocol itself provides routing and delivery, but if delivery attempts fail, for any reason, IP will discard the packets. IP was designed to use available network resources in real time to deliver data, but without any assurances of delivery and with no quality guarantees. IP was initially designed to support data traffic that is inconsistent and unpredictable in volume, or bursty. IP data is described as bursty in nature because different data types have different characteristics. The data flow for email messages or Web browsing is quite different than the data flow for streaming media such as voice or video. IP is a connectionless, or packet-switched, network. Dedicated circuits aren't established. Messages are divided up and placed into packets. Each packet is somewhat like an envelope in the postal system. It has header fields that contain the source address and the destination address. IP uses routing protocols to identify paths through the network. These packets take the best path available at the time, based on whatever metric a particular routing protocol uses to determine the best path. Thus, as congestion and other factors impact the performance of the network, IP packets can easily take different paths across the network, arriving at different times. A large message is often made up of many IP packets, so they need to be buffered and reassembled at the receiving end. QoS includes characteristics such as delay, jitter, and packet loss. IP doesn't provide any assurances for quality or delivery, so performance requirements can only be achieved by adding some other feature, or protocol, to the network. In practical terms, we increase bandwidth, optimize switching and routing paths, and incorporate other, higher-layer protocols to meet VoIP call-quality requirements.
We also add a layer of security by incorporating industry best practices in network design, incorporating firewalls and user access controls, and including monitoring and response mechanisms throughout the network. Table 5.2 highlights important differences in characteristics of the PSTN when compared with typical IP networks.
PSTN: Designed and optimized to support voice as a single traffic type
IP:   Designed to carry any type of traffic

PSTN: Uses a dedicated circuit for guaranteed delivery
IP:   No delivery guarantees using packet-based routing

PSTN: Connection-oriented; a dedicated circuit path, or connection, is created for each voice conversation
IP:   Connectionless in nature; conversations are digitized and inserted in many packets

PSTN: Phone calls are typically long duration (4 minutes on average)
IP:   IP packets are limited to 65,535 bytes; packets are small and only live on the network for a very short time; a conversation is made up of many packets

PSTN: The circuit is dedicated to one 64Kbps voice channel
IP:   The IP network uses all the available resources of the network

PSTN: Delay, jitter, and loss have been designed to meet specific levels for voice calls
IP:   Delay, jitter, and loss must be overcome by adding resources to the network

PSTN: Dedicated circuit provides generally acceptable security for voice calls
IP:   Security requires special resources in the network

Table 5.2: Performance characteristics in the PSTN vs. IP networks.


Because IP packets are small in size and data traffic is bursty in nature, IP networks use routing protocols to pick the best path for transmission. In a typical IP network, this bursty nature of data means that the traffic load on the network is constantly changing throughout the day. This constant state of flux means that the best path identified by routing protocols may change often. Routing protocols are dynamic and constantly update to identify the best path through the network based on the current traffic load. The best path one minute may not be available the next. For VoIP calls, this means that the packets comprising a conversation can potentially take many different paths through the network.

Traditional data use of IP is often not particularly sensitive to network delay. If an email message is broken into many packets that can't be delivered immediately, there is likely no problem. The message can be placed in many different packets, then routed across any number of paths available in the network. The entire message can be reassembled and delivered to the recipient with no repercussions. Unlike email, voice calls are real-time interactions between people. Even a half-second of network delay can be so detrimental to voice quality that VoIP service is rendered unusable.

The PSTN dedicates bandwidth, a 64Kbps voice channel, to each telephone call for the duration of the call. It does this consistently and predictably for every phone call, every time. In an IP network, all the resources of the network are available to every user or node on the network. This is necessary for routing protocols to work and so that IP can make best efforts at delivery. Thus, every voice call can be impacted by changing traffic load and network conditions. Over time, we keep adding new traffic types to IP networks. Today, these networks support email, file transfer, Web surfing, and data sharing.
As we add new traffic types to the network, we must redesign and modify the networks to support not just the new traffic type but all traffic types in the aggregate as well. The challenge with implementing and managing a converged VoIP service network is meeting user expectations such as call clarity and fidelity without delay or jitter impairments. Voice conversations must be clear and intelligible or users won't trust the integrity of the network for voice calls. Beyond the quality of the voice itself, the basics of delivering dial tone, successfully completing calls, and always being available are key quality concerns.

Managing call quality perceptions is a bit like walking the razor's edge. Because voice service and telephone calling are so widely used and accepted, user expectations are high, based on their own experiences. When using a new VoIP service, if users feel that call quality is unacceptable, they will simply hang up and use alternative methods. In consumer VoIP solutions, this perception problem is a constant battle. In the enterprise environment, you must take proactive steps during the implementation process to level-set user expectations with what the integrated service can functionally support. If users develop mistrust of the network, even based on misperceptions of capabilities, it is very difficult to regain their confidence. You need to be able to measure, guarantee, and deliver suitable call quality. You also need to ensure that the network can continue to support the pre-existing data traffic without degradation to existing services.



Network Design Considerations


An often-stated (by consultants, systems integrators, and equipment vendors) misperception is that many corporate networks were poorly designed. It's simply not true, and it's important to recognize the truth so that we aren't doomed to repeat the mistakes of the past. Existing networks in business were not poorly designed; in many cases, they weren't designed at all. For many organizations, PCs were implemented long ago. Windows 95 brought in the concept of the workgroup, and local area networks (LANs) emerged as small, isolated pockets of information, typically for a workgroup. With routing and internetworking, organizations began interconnecting LANs, and the explosive growth of the Internet added just one more connection. Today, we have business partner connections, VPN links, Web services, and more. We have the Internet, the intranet, and extranets. We have Web servers, offering Web services, supporting e-business. And in many cases, each new feature or function was brought about by tweaking the service network with some incremental upgrade. We didn't design a large number of our networks; they just happened.

Today's networks have evolved and grown over time. Many have been carefully and methodically redesigned every few years. Just as the PSTN has been tuned and optimized for voice traffic, there are IP networks that have been designed and tuned to support known and understood data applications that exist within the enterprise. The enterprise applications in use today may have very different network performance requirements than converged video and/or VoIP services.

VoIP projects begin with an implied assumption that some network upgrades may be necessary. Equipment in the network, such as routers, may already be running at high CPU utilization. In many cases, the router OSs may not be able to support VoIP services. Existing wide area network (WAN) links may already be taxed to their limits supporting existing data applications.
In many cases, these links may be using frame relay connections that might not be voice-friendly; frame relay is very much a data service. Overall capacity, in terms of bandwidth and processing power, is a vital factor to assess before implementing any new network service.

There is a pitfall to watch out for when implementing VoIP services: the added expense associated with multiple upgrades. Given that for most organizations network upgrades represent capital expenditures (CAPEX), no technical manager should be put in the awkward position of justifying multiple upgrades. A smart approach is to assess and understand the voice calling and quality requirements in conjunction with all existing data services. In short, invest the effort to do a complete, holistic assessment of the entire network services suite across the enterprise. This is also a good time to consider any planned new data services as well as the VoIP requirements. Rather than upgrading prematurely, focus early in the process on data gathering and information analysis. Taking a methodical approach will not only ensure successful pilot testing, it will help yield a more future-proof design with the ability to grow smoothly and support future, anticipated business needs.


Trunking capacity is always a consideration in voice networks. It's tied directly to carrying capacity. In the VoIP world, how and where gateways are connected to the outside voice network, or PSTN, are important performance and management factors. T1 circuits, whether primary rate (PRI) or standard channelized connections, can provide a standardized approach that traditional telecom providers support. This approach simply builds T1 trunks in the same way you connect an enterprise PBX to the PSTN today. In the traditional T1 environment, a T1 circuit can carry 24 voice calls.

IP trunking presents a different approach, using the full bandwidth of a network connection to the carrier. SIP trunking over IP is becoming very popular and quite common. SIP trunking is easy to implement in most current vendor offerings and provides an easily quantifiable return on investment. In short, it's proving to be quick and easy to implement and a powerful cost-savings approach. Given the codecs mentioned earlier, a 100Mb SIP trunk to the carrier, using a codec such as G.729a, can easily carry a high volume of voice traffic.
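As a rough sanity check on that claim, the per-call bandwidth of G.729a can be estimated from its 8Kbps payload plus per-packet overhead. The figures below (20ms packetization; 40 bytes of RTP/UDP/IP headers plus 18 bytes of Ethernet framing) are common planning assumptions rather than guaranteed values, and real deployments should also budget headroom for signaling and data traffic:

```python
def g729a_calls_per_link(link_bps, packetization_ms=20, overhead_bytes=58):
    """Estimate how many one-way G.729a streams fit on a link.

    Assumes an 8Kbps codec payload and 58 bytes of per-packet overhead
    (RTP 12 + UDP 8 + IP 20 + Ethernet 18); no silence suppression.
    """
    payload_bytes = 1000 * packetization_ms // 1000  # 8Kbps = 1000 bytes/sec -> 20 bytes per 20ms
    packets_per_second = 1000 // packetization_ms    # 50 packets each second
    bps_per_call = (payload_bytes + overhead_bytes) * 8 * packets_per_second
    return link_bps // bps_per_call

# About 31.2Kbps per call, so a 100Mb trunk carries roughly 3,200 concurrent calls
print(g729a_calls_per_link(100_000_000))
```

Different packetization intervals or codecs change the per-call figure considerably, which is why the assessment work described in this chapter matters before committing to a trunk size.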
Chapter 7 will explore trunking and other services, such as E-911, in more detail.

The Performance Envelope


There's an old networking paradigm that has been used by many IT managers: when network performance seems sluggish, add bandwidth. This approach works in many cases, but it's usually a temporary fix. It's also an approach that, if followed over time, guarantees that network circuit costs will spiral ever upward. Bandwidth is an important characteristic in network performance, but it's not the single most important factor. Networks have a wide-ranging set of performance characteristics that determine how smoothly they operate.

There's a principle in network management, often related to security, which identifies three important network requirements: confidentiality, integrity, and availability (often called CIA). Many different components and characteristics play into achieving complete confidentiality, integrity, and availability in the enterprise network. These characteristics make up what I'll call the network performance envelope. The shape of this envelope gives a view into the personality of the network.

We've touched on the importance of knowledge about your network operating environment up to this point. This section will put together the pieces so that you can utilize a knowledge-based approach to ensuring success in deploying integrated VoIP services. The more you know about the network, the more accurate your planning will be, increasing your ability to meet call quality expectations for end users. Better data improves your ability to provide better total network performance. The basic concept of confidentiality, integrity, and availability doesn't offer enough granularity to fully assess the requirements for successfully delivering a converged VoIP service. To accomplish that, you'll need to look at a broader set of data inputs: specific, measurable data that you can assess to gain a complete understanding of both the existing network and the new requirements.


There is a nearly unlimited number of data elements an organization might use or consider when defining the network performance envelope. This guide will use the factors shown in Figure 5.1 as an example and will explore some characteristics in more detail than others. These chosen factors provide a good representative sample of the types of characteristics enterprises might consider when working through this process.

Figure 5.1: Graphing performance characteristics. (The graph arranges the chosen characteristics as axes: availability, throughput, manageability, scalability, integrity, response time, reliability, network segment utilization, cost, CPU utilization, and security.)

This figure shows the basics of CIA in a granular form and brings some other network performance characteristics into the mix to ensure you can fully support converged VoIP services. Reliability overlaps both the availability and the integrity of the network. Because it's so important, we'll treat reliability as a discrete evaluation factor here. Throughput also plays a dual role, providing both availability and integrity. We'll include three facets of throughput in the example:

- Because bandwidth is a major contributor to throughput, we'll measure it as a standalone factor.
- Response time provides a measurement of delay or latency in the network. As VoIP is an end-to-end service and delay is cumulative, we'll look at response time as a broad indicator.
- CPU utilization may be a performance factor across every element of the network.


We've added cost, which is unrelated to CIA but is a business consideration that must be taken into account with every network change. Some enterprises might substitute a more comprehensive ROI model as part of this assessment. No matter how simple or complex your assessment mechanism, managers will have to answer for profitability and cost recovery of changes to network services.

You must also assess the manageability of the network. Deploying new services that cannot be effectively managed is courting disaster and ensures failure. For many service delivery organizations, this translates into a service level agreement (SLA), or contractual commitment.

Scalability of the network has been included here as a broad measure for future-proofing your design. Every business enterprise plans for business growth. Business growth inherently drives network growth as almost every business sector becomes more and more dependent on data networking resources. You're striving to future-proof the network for some period of time, typically 3 to 5 years. No network designer or engineer wants to implement a major new service such as VoIP, only to discover a year later that the network is underpowered and requires a complete redesign.

Integrity encompasses network reliability requirements, and security is included here as an aspect of integrity. Three facets warrant consideration:

- Packet loss, if minimal, might not adversely impact VoIP services; however, it's a performance component that should be measured and included in your analysis.
- Jitter, or variability in delay, degrades the quality of voice services if not controlled.
- Security can be broken down into many subcomponents, which might include firewalls, intrusion detection solutions, antivirus software, and other tools. For the purposes of this model, security will be treated as a single component.

Availability may have many different meanings to many different people. To some enterprises, and most of the telecommunications carriers, it generally indicates a guaranteed uptime at the five nines, or 99.999 percent, level. Rather than deeply explore network design issues such as high availability, redundancy, and business continuity, availability will be used here as another data point on a graph for analyzing the network performance envelope.

Figure 5.2 gives you the next step in the process. It presents the framework you can use for mapping each operational characteristic of the larger network performance envelope. Network services and applications all have unique and different requirements. Email can be delayed several minutes without any measurable negative impact. Delays in Web browsing might be barely noticeable as the browser screen gradually loads information. Web browsing's use of Transmission Control Protocol (TCP) at a higher layer guarantees information delivery, although the overhead might add delay. Web traffic is non-real-time traffic between a person and a system, so the performance impact may be negligible.

Integrating VoIP into the often already overtaxed data network introduces a whole new set of requirements into the network performance envelope. As the characteristics needed for new network services are identified, they can be mapped onto the graph. As you do so, notice that the personality of your network, the performance envelope, begins to take shape.



Figure 5.2: Identifying performance characteristic requirements. (Each axis is marked with a target value; in this example, response time under 100ms, integrity (errors, collisions) under 0.1 percent, utilization targets of 70 and 75 percent, moderate cost, easy scalability, and high availability, throughput, manageability, reliability, and security.)

Every network service and application in operation needs to be evaluated and factored into the performance envelope data set you're evaluating. Remember, you're identifying the requirements for successful service delivery: your success factors. You are planning to succeed. For existing services such as email and Web traffic, you might make some simple assumptions about acceptable performance. For mainframe applications, or those that were custom-developed in-house, it might be prudent to overlay the specific requirements to support each. The key is that each requirement be identified using some measurable value. The data points or elements used by any organization can be established as appropriate.

This model is demonstrated in a very simple format. Each network has different and unique requirements depending on the services provided and applications in use. If your network employs QoS mechanisms for different traffic types, don't forget to account for each of them when graphing out your network requirements. The more granular the data used, the more accurate the assessment of performance requirements becomes. As always, the more you know about the network, the better your decisions will be.

Every performance characteristic you assess becomes a data point on the graph lines. In the next step, which Figure 5.3 shows, connect the dots to gain a visual representation of the shape, or personality, of your network. You then know what the network needs to look like to successfully deliver the existing services and applications in addition to VoIP and other new services needed.



Figure 5.3: Giving shape to the network performance envelope requirements. (Connecting the target values on each axis traces the envelope limits, giving the network performance envelope its shape.)

For many organizations, the easy part is now complete. The next step in the process is to physically measure each data point on the graph. Some elements, such as cost and security, may be relative assessments rather than technical measurements. The key to success is to be methodical and thorough. This step will quantify today's network performance capabilities.

As Figure 5.4 shows, when you map the real-world measurements from your network against the performance envelope you've established as a requirement, you have a gap analysis mechanism. This gap analysis can now help you focus directly on those areas of network performance that fail to meet your established requirements. This helps ensure a methodical approach to upgrading the network.

As part of the overall operational requirements of the new, converged network, security and performance management often mesh to become one overarching facet of design consideration. Achieving the delicate balance between service delivery and security requirements means that compromises and tradeoffs will be necessary. The right set of management tools will allow you to continually monitor network performance and security, assess risks, and measure performance around the clock.


Neither performance nor security can be effectively monitored as a single element within the network. The network is a large, almost organic environment. A systematic and holistic view of the health and welfare of the entire system requires vigilance and constant review. Beyond this methodical approach to assessing network performance requirements, implementing repeatable, sustainable processes will help ensure consistent network performance that delivers both the quality and security necessary for widespread enterprise success.

There are a variety of Network Management Systems (NMSs) in use in business networks today, ranging from expensive commercial products to freeware and open source tools. Beyond network and security monitoring tools, there are a number of VoIP-specific solutions to aid in constant monitoring of the VoIP service. The best resource for identifying the tools that will work in your environment is generally the VoIP solution provider.

Figure 5.4: Gap analysis, mapping existing performance against the requirements. (The graph overlays two traces, the ideal envelope and the actual measured values, across the same axes used in Figures 5.2 and 5.3.)

You can use this performance envelope graph to overlay the performance characteristics the network provides today with the requirements you've already documented. You know your requirements for successful service delivery, and you know your network capabilities. Figure 5.4 demonstrates a common occurrence in many networks: the reality of network measurements and the requirements don't align in all areas. In some characteristics, the network provides better service than needed. This may mean that you're paying a premium price to a carrier unnecessarily. In other areas, there are gaps to address in order to meet service delivery needs.

Implementing a methodical approach such as the performance envelope when analyzing service requirements also helps in defining VoIP service delivery expectations. Clear expectations are a key factor in a successful implementation. This approach ensures that you know what is required.
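In code terms, the gap analysis reduces to a per-characteristic comparison of measured values against required values. The characteristic names and numbers below are hypothetical, chosen only to illustrate the mechanics:

```python
def gap_analysis(required, measured):
    """Split characteristics into shortfalls (below requirement) and surpluses."""
    shortfalls = {k: required[k] - measured[k]
                  for k in required if measured[k] < required[k]}
    surpluses = {k: measured[k] - required[k]
                 for k in required if measured[k] > required[k]}
    return shortfalls, surpluses

# Hypothetical envelope values; higher is better for every metric shown here
required = {"availability_pct": 99.99, "throughput_mbps": 100.0, "reliability_pct": 99.9}
measured = {"availability_pct": 99.90, "throughput_mbps": 120.0, "reliability_pct": 99.5}

shortfalls, surpluses = gap_analysis(required, measured)
# Shortfalls flag where to upgrade; surpluses flag possible over-provisioning
```

A real assessment would mix numeric metrics with relative ones such as cost and security, but the principle is the same: every requirement needs a measurable value to compare against.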


Completing a thorough assessment of the existing network has two benefits. First, it provides an accurate and viable gap analysis to use in preparing the network for converged services. In today's frenzied network operations environment, it's sometimes too easy to assume that adding a switch here, a link there, and increasing the bandwidth somewhere else will meet service requirements. It's important to take the time to complete a thorough assessment and provide comprehensive gap analysis information to achieve success. As a side benefit, especially in large, distributed networks, some enterprises will identify areas in which networks have been over-engineered in the past. This can lead to redesign and cost savings.

This performance envelope readiness assessment is an opportunity to re-evaluate the existing network and determine whether current needs are being met. Many corporate enterprise networks were not designed to be what they have become today. Networks often began as small islands of information: isolated workgroups or departmental LANs. As the network gradually grew, connections to other groups and organizations were added. In most organizations, new business applications were also added over time. Corporate networks have grown from a simple beginning into a complex and sophisticated mesh, weaving corporate operations together. Often they've done so without being revisited from a holistic, service delivery perspective. This performance envelope approach to network assessment can help lead to a network that delivers better performance at a lower cost.

Choosing Performance Envelope Characteristics to Measure


Data, voice, and video services each place different demands on the network. They're different types of service; each one is designed to provide a specific service. To support different types of service in the network, you need to be able to offer some consistent and predictable QoS that you can manage. This is often accomplished by creating a specific class of service for each type of traffic supported in the network. For some network designers, determining the number of service classes required can be a challenge. Although implementing QoS mechanisms isn't a required convergence step, many organizations find that VoIP drives the need in order to deliver the required performance characteristics.

QoS can become very complex. Implementing an overly complicated QoS scheme can lead to a network that can't be readily supported. Rather than add undue complexity, let's look at just some of the performance characteristics included in the performance envelope example. As network convergence and emerging technologies gain momentum, be mindful of the danger that each new application brings. It's important to look toward the future and not create a new class of service for every new application, or network complexity can spin out of control. For manageability, most network engineers favor using only a few critical service classes.

VoIP and video collaboration represent real-time traffic. This is most often communications information flowing from person to person, rather than interaction with a server or system. These real-time services require quick delivery with quality assurances for delay, jitter, and loss. IP networks use a best efforts approach to deliver all general traffic. This class of service is quite suitable for email, Web browsing, and most other normal network traffic. Best effort simply uses whatever network resources are available.
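A small, fixed mapping of traffic types onto standard DiffServ code points is one way to keep the class count manageable. The DSCP values below are the standard Expedited Forwarding and Assured Forwarding code points; the three-class grouping itself is only an illustrative design choice, not a prescription:

```python
# Standard DSCP code points: EF = 46 (RFC 3246), AF31 = 26 (RFC 2597), best effort = 0.
# Grouping voice and video together in EF is one illustrative choice; many
# designs give video its own Assured Forwarding class instead.
TRAFFIC_CLASSES = {
    "voice": 46,          # EF: strict delay, jitter, and loss requirements
    "video": 46,          # EF in this sketch
    "critical_data": 26,  # AF31: mission-critical applications
    "email": 0,           # best effort
    "web": 0,             # best effort
}

def dscp_for(traffic_type):
    """Return the DSCP marking for a traffic type; unknown types get best effort."""
    return TRAFFIC_CLASSES.get(traffic_type, 0)
```

The point of the sketch is the small number of classes: every new application should map into an existing class before anyone considers creating a new one.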


In some networks, management traffic may warrant a dedicated class of service all by itself. This approach is quite common in service provider networks. It provides a mechanism to ensure the network can always be managed, regardless of congestion problems that might arise. All QoS mechanisms provide some form of traffic prioritization scheme. In the converged network, there are many different traffic types. Each may have different requirements and different prioritization needs. Similar traffic types need to be identified so that they can be handled the same way within the network. Most organizations choose to aggregate similar traffic types. This approach allows a company to take advantage of network routes that are optimized to provide the appropriate class of service.

In designing a VoIP service network, the focus is typically on call quality for the voice user, but it's important not to degrade pre-existing services when implementing VoIP. You need to recognize all the different traffic types in use. If mission-critical data applications aren't given the necessary resources through QoS prioritization, the applications might starve for lack of resources. Email and Web traffic will still need to be delivered, even if at a lower priority. It's vital to maintain balance across all traffic types when managing an integrated data, voice, and video network.

Throughput

Throughput is frequently measured in terms of bandwidth. When evaluating throughput requirements, due consideration must be given to traffic aggregation points. Don't overlook the congestion issues that can develop as a result of combining 10 Megabit, 100 Megabit, and Gigabit Ethernet connections on the network. This traffic, all flowing to a centralized aggregation point, may overwhelm a lower throughput link. This can aggravate network congestion and introduce service delivery problems.
Bandwidth

Over-engineering, or over-provisioning, the IP network has been a common approach for many network engineers. Increasing bandwidth, by ordering higher capacity links, has been the most common technique. Adding bandwidth may alleviate short-term problems, but it's important to remember that IP uses all the available network resources, and bandwidth is only one resource. Over-engineering frequently proves to be a delaying tactic that simply stalls necessary redesign work. This approach can be more costly in the long run. Adding bandwidth still requires investment in upgrading equipment and increasing circuit capacity. These can become very expensive approaches, and they don't solve the underlying design problem. IP data applications can quickly consume all available bandwidth, leaving the same congestion problem to be addressed.

Response Time

Response time is one measure of network performance, most often measured using ping as a test tool. One important nuance in the converged voice and data network is the fact that voice is an end-to-end service between people. Ping can be an effective diagnostic tool, but the end-to-end nature of voice service, coupled with the fact that delay is cumulative, necessitates comprehensive management and testing to ensure service levels and call quality are maintained.
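Because delay is cumulative, a simple delay budget, summing each component against the end-to-end maximum, is a useful planning sketch. Every component value below is hypothetical; real figures come from measuring your own network:

```python
# Hypothetical one-way delay components, in milliseconds
COMPONENTS_MS = {
    "codec_and_packetization": 25,
    "access_link_serialization": 10,
    "wan_propagation": 40,
    "jitter_buffer": 40,
}
QUEUING_PER_HOP_MS = 5  # assumed average queuing delay per router hop

def end_to_end_delay_ms(hops):
    """Sum the fixed delay components plus per-hop queuing delay."""
    return sum(COMPONENTS_MS.values()) + QUEUING_PER_HOP_MS * hops

# With 6 hops: 115 + 30 = 145ms, inside a 250ms end-to-end budget
print(end_to_end_delay_ms(6))
```

The exercise makes the cumulative nature of delay concrete: adding hops, a firewall, or a VPN endpoint each consumes part of a fixed budget.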


CPU Utilization

CPU utilization in network nodes offers a good indicator of the overall health of the network. In planning VoIP services, utilization might provide an indicator that network elements are overtaxed and can't effectively support VoIP. After VoIP services are live, ongoing monitoring of CPU utilization can provide a benchmark and timeline trend analysis to monitor the health of the network over the complete life cycle.

Network Segment Utilization

Ethernet is, at its roots, a shared media technology. LAN switching provided network technicians a means to segment traffic into smaller broadcast domains. This increased granularity is now often enhanced through the use of virtual LANs (VLANs). Like bandwidth, CPU utilization, and other factors, network segment utilization can be monitored as part of day-to-day management operations to ensure adequate network performance to support the required services.

Integrity/Reliability

The integrity and reliability of the network encompass a number of different technical facets. Each can be monitored, measured, and managed as a part of network operations. In the legacy IP network, these may have provided acceptable service and been left unchecked. In the converged service network, they warrant ongoing monitoring and proactive management.

Packet Loss

A common measure is error rate and data loss. When IP networks are used to transmit normal data (email, file transfer, Web browsing, and so on), some data loss is acceptable. The higher-layer protocols, such as TCP, provide a measure of quality assurance and request retransmission when needed. Other data types, such as mainframe data, as noted earlier, may be very intolerant of packet loss.

Jitter

Jitter describes the variation in delay. As IP networks route traffic over the network using the best path identified by routing protocols, it's possible that every packet in a stream might take a different route. A VoIP call could potentially traverse many network paths.
Each route through the network may have different delay characteristics. Jitter typically isn't a concern for normal IP data traffic. VoIP is far more sensitive to jitter than email or Web browsing because it is a real-time service between people. High jitter can result in unintelligible conversations that sound jerky. Users won't trust or use VoIP services if the call quality is unacceptable.

Delay/Latency

Delay exists in all IP networks, and it exists in several forms. Routers use statistical multiplexing algorithms to process traffic. Assembling data into packets takes time. Checking the routing protocol to identify the best route through the network takes time. These minuscule delays add up and all impact the total end-to-end delay. Delay is cumulative.
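The jitter number most VoIP monitoring tools report follows the running interarrival jitter estimator defined in RFC 3550, the RTP specification: each new packet-to-packet delay variation is smoothed into the estimate with a gain of 1/16. A minimal sketch:

```python
def interarrival_jitter(transit_times_ms):
    """RFC 3550 running jitter estimate from per-packet transit times (ms)."""
    jitter = 0.0
    for previous, current in zip(transit_times_ms, transit_times_ms[1:]):
        variation = abs(current - previous)    # packet-to-packet delay change
        jitter += (variation - jitter) / 16.0  # smoothed with gain 1/16 per RFC 3550
    return jitter

# A perfectly steady stream has zero jitter
print(interarrival_jitter([40, 40, 40, 40]))  # 0.0
```

The smoothing means a single late packet barely moves the estimate, while sustained delay variation steadily raises it, which is exactly the behavior that degrades call quality.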


Cost

No business can operate without maintaining vigilance in controlling cost or expense. In networking, you face not just CAPEX in hardware investment but also the operating expense (OPEX) of keeping the service up and running on a daily basis. Some businesses, notably service providers, focus on profitability and use ROI as a performance envelope measurement. Enterprise businesses may treat their IT and network operations internally as either a profit center or a cost center. For many, no profit is expected, but the costs associated with implementing and managing the network still must be recovered. For most organizations, over a 3-to-5 year life cycle, network OPEX tends to be much greater than the initial CAPEX investment. It's important to always factor the appropriate cost analysis into both VoIP preplanning and ongoing service management.

Availability

Availability, for many, is defined as reliability. It means that the network is fault tolerant and services don't degrade when problems do occur. To provide reliability in the PSTN, there are millions of circuit paths available through hundreds of central offices. In an enterprise IP network, redundant paths, alternative routes, load-balanced connections, and high-availability equipment may all need to be incorporated into the design to ensure resilience in the network.

Uptime: 99.999%

Availability in traditional telephony has often been measured as the uptime percentage. In commercial networks, you've heard the term five nines reliability (or 99.999 percent uptime) used as the target availability measure. For enterprise operations, it's crucial to recognize that this number equates to roughly 5 minutes of downtime per year. Although the commercial telephone providers have widely met the five nines measure, there aren't many corporate data networks that can claim less than 5 minutes of downtime in the past year.
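The five nines figure is easy to verify with a little arithmetic; converting an availability percentage into allowable downtime per year shows why the target is so demanding:

```python
def downtime_minutes_per_year(availability_pct):
    """Convert an availability percentage into allowable downtime per year."""
    minutes_per_year = 365.25 * 24 * 60  # about 525,960 minutes
    return (1 - availability_pct / 100) * minutes_per_year

# 99.999% allows about 5.26 minutes per year; 99.9% allows nearly 526
print(round(downtime_minutes_per_year(99.999), 2))
```

Each additional nine cuts the allowance by a factor of ten, which is why moving an enterprise data network toward carrier-grade availability is an expensive design exercise rather than a tuning task.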
Introducing VoIP and other real-time services into the network raises the bar for IP network availability and drives designers to invest in robust, fault-tolerant design solutions.

Security
When implementing VoIP solutions, network security is as great a concern as reliability and call quality. Corporate networks may include firewalls and multiple connection points. These security devices can add nodal processing delay that may impact VoIP services. The more complex a firewall's rule set, the more latency it introduces into the data flow. Remember that delay is cumulative and counts toward the 250ms maximum tolerable end-to-end delay.

VPN services are deployed in two typical fashions. Point-to-point VPN solutions may be used to connect remote offices over the Internet. Many companies also use VPN services that allow employees to connect to network resources while telecommuting or away from the office. Encryption algorithms consume processor power. A VPN device running DES or Triple DES encryption as a VPN endpoint will add further latency as packets are encrypted and decrypted. If the VPN endpoint is a firewall, this CPU load problem may be further compounded.

Security concerns reinforce the importance of assembling a comprehensive technical team for network assessment, readiness testing, and managing the operational environment. It's crucial that the telecommunications, IT, and network security teams collaborate to be successful.


Manageability
The human resources required to support network services may be the most costly component; this is the biggest OPEX cost component. The more complex and difficult the network is to manage, the more resources are required. As network designers and service providers, we must consider the ease of management we're incorporating into VoIP network design.

Scalability
Scalability provides the best measure for future-proofing the network. Corporate networks change, grow, and evolve continually. They become very organic in nature as business needs change. The evolution of service-oriented architecture (SOA) and Software as a Service (SaaS) on the network is beginning to accelerate. The demands placed on the network will grow as applications, services, and networks become more tightly coupled with business processes. You must always consider the ease with which the network can scale to support new business applications and services.

Summary
The key to successfully managing the converged network begins during the planning process, long before implementation. Conducting a needs analysis will help define the business needs in measurable terms. Completing a methodical network readiness assessment will further develop a holistic vision of what the network must support in order to successfully deliver integrated services. Don't sidestep these important steps early in the VoIP deployment process. They will not only ensure that the converged network fulfills your business needs but also help formulate the day-to-day management methods required to support the enterprise integrated services.

Perhaps a key differentiator in using this performance envelope approach to network performance analysis is that it offers a tool to support all services in the network. VoIP is important, but unified communications is only effective when it supports the larger unified application environment behind the complete business of the enterprise. It offers a holistic view of not just data, voice, and video but also enterprise-specific applications and services. Sales force automation, inventory management, customer relationship management, and Web-based ecommerce are examples of vital application services that the enterprise needs in order to thrive. Although the service-oriented architecture mindset will more tightly couple these applications with services such as voice and data, using the performance envelope as a holistic approach is an integral part of managing a converged network, encompassing the full spectrum of data, voice, video, applications, and services.


This chapter only briefly touched on the idea of using outside resources. It's worthwhile, in summary, to review how outside integrators and consultants can work for you. They can help look out for your best interests. There are many VoIP consulting firms that specialize in converged technologies. Their reputation is built on how well they take care of client needs. They build business through word-of-mouth referral and will be your advocate for success. They can ensure that you are being well served by your vendors.

Your equipment vendors and service providers will gladly offer consulting services, and their sales engineers will graciously come in and help you design a solution. It may be pertinent to recognize that vendor staff may be specialists in their company-specific products. They may do a very good job based on their own experience, but they might not have comprehensive training on the workings of competitive solutions. Use your trusted and incumbent vendor partners, but don't give them free rein to design your network. You need to remain in control of the project overall, because you'll have to manage it in day-to-day operations.

The next chapter will delve into the impacts of events in the network, root cause analysis, and how event correlation comes into play as a critical network management tool.



Chapter 6: Impact Analysis, Root Cause, and Event Correlation


To conduct full impact and root cause analysis, event correlation engines are often used to provide data about what happened. This chapter begins by building an understanding of the protocols involved, examining their strengths and weaknesses. Key protocols include Simple Network Management Protocol (SNMP), the Internet Control Message Protocol (ICMP) tools, and even Network Time Protocol (NTP) for effective event correlation. Beyond protocols, the chapter will explore syslog. Syslog servers provide a part of the picture, but they really provide data collection mechanisms, not analysis engines. Correlating events across an enterprise network of disparate systems presents a difficult challenge.

In Business @ the Speed of Thought (ISBN: 0446525685), Bill Gates described what he called the "digital nervous system." He said, "The most meaningful way to differentiate your company from your competition...is to do an outstanding job with information. How you gather, manage, and use information will determine whether you win or lose." When deploying converged networks over IP, you're integrating voice technology with the critical data infrastructure. Building monitoring and management processes into daily network operations provides the information, or knowledge base, about the corporate nervous system that lets you manage a complex, almost organic, business operating environment. Your management and monitoring tools become a key part of your enterprise business intelligence.

SNMP
SNMP is a widely used protocol for monitoring the health and well-being of a network. It's a simple request/response protocol that uses a database called a management information base (MIB) to describe network device management data. Almost all network elements are SNMP-enabled. Most equipment comes from the manufacturer with the community strings of public and private enabled by default. Typically the public string provides read-only access. The private community string often provides write access as well, and is often used for managing devices remotely and pushing updated configurations to routers and switches across the network.

SNMP was designed to ease monitoring and remote management of network elements. These include servers, routers, switches, and even workstations. It can provide monitoring for performance, utilization, and state information about the device. SNMP gathers this information both by polling devices and by receiving asynchronous notifications called traps, which are often passed on to a centralized management station in a network control center. These stations typically provide network maps, with icons representing each node being monitored. In many systems, a simple green-yellow-red icon allows easy monitoring of network element status from healthy and operational (green) to potential problems (yellow) to out of service (red) conditions.



What Is a MIB?
The MIB is a type of database, comprising a set of objects used to manage individual network elements. MIBs are structured based on the OSI/ISO network management model. In the public switched telephone network (PSTN), Abstract Syntax Notation One (ASN.1) has been used for years as a mechanism for describing the object data structure of that network's elements. The PSTN elements include things like Class-5 central office switches, carrier trunking technologies, and the SS7 signaling network elements. ASN.1 was jointly developed by the ISO and the ITU-T in 1984. Today's network MIBs are developed as a subset of this larger standard. This subset is defined in IETF RFC 2578.
IETF RFCs for MIBs
RFC 1156 - Management Information Base for Network Management of TCP/IP-based Internets
RFC 1157 - A Simple Network Management Protocol
RFC 1441 - Introduction to SNMP v2
RFC 2578 - Structure of Management Information for SNMP v2
RFC 2579 - Textual Conventions for SNMP v2
RFC 2580 - Conformance Statements for SNMP v2
RFC 3410 - Introduction and Applicability Statements for Internet Standard Management Framework
RFC 3411 - Architecture for Describing SNMP Frameworks
RFC 3412 - Message Processing and Dispatching for the SNMP
RFC 3413 - SNMP Applications
RFC 3414 - User-based Security Model (USM) for SNMP v3
RFC 3415 - View-based Access Control Model for the SNMP
RFC 3416 - Protocol Operations for SNMP v2
RFC 3417 - Transport Mappings for SNMP v2
RFC 3418 - Management Information Base for SNMP v2
RFC 3584 - Coexistence between SNMP v1, v2, and v3

A MIB object is one of any number of specific characteristics of a managed device. Examples of MIB objects include:
Output queue length, which has the name ifOutQLen
Address translation table (like the ARP tables), called atTable
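A MIB browser resolves numeric object identifiers (OIDs) to names like these by walking the registration tree. The table below is a hand-built, illustrative subset of the standard MIB-2 tree; real tools compile the full ASN.1 MIB modules rather than hard-coding entries:

```python
# Toy subset of the MIB-2 tree (1.3.6.1.2.1), for illustration only.
MIB2 = {
    "1.3.6.1.2.1.1.1": "sysDescr",
    "1.3.6.1.2.1.1.3": "sysUpTime",
    "1.3.6.1.2.1.2.2.1.21": "ifOutQLen",
    "1.3.6.1.2.1.3.1": "atTable",
}

def oid_to_name(oid):
    """Resolve an instance OID (e.g. ...sysDescr.0) to its object name by
    longest-prefix match, the way a MIB browser labels returned values."""
    parts = oid.split(".")
    for cut in range(len(parts), 0, -1):
        name = MIB2.get(".".join(parts[:cut]))
        if name:
            return name
    return oid  # unknown: fall back to the numeric form

print(oid_to_name("1.3.6.1.2.1.1.1.0"))       # sysDescr
print(oid_to_name("1.3.6.1.2.1.2.2.1.21.2"))  # ifOutQLen
```

The trailing digits dropped by the prefix match are the instance index, such as the interface number for ifOutQLen.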



The Architecture of SNMP
There are three components needed for managing a network with SNMP:

Network management systems: The network management system (NMS) runs the applications that control and monitor the network devices. The NMS is an active system that sends SNMP queries and receives the results, including traps raised by the devices it watches.

Agents: An agent is simply the SNMP component of the software running on the device being monitored. This might be an integral part of the OS software or it might be another process or daemon that is executed when the device boots. The agent has information about the local operating characteristics of the network device. This information makes up the MIB for that specific device. The agent translates that information and provides the communication with the NMS.

Managed devices: These are the network elements that are monitored by the NMS. These devices are typically routers, switches, servers, printers, and other service delivery elements of the network. In a unified communications design, these also often include the gateways, session border controllers, signaling and voice servers, and voicemail systems. Although workstation OSs, such as Windows, include SNMP capability, they are generally not monitored except for some very defined requirement. These devices often collect and store some form of management information locally in either event logs or syslog files.

SNMP provides a standards-based protocol and mechanism for remote monitoring and management of the unified communications network on a large scale. SNMP currently exists in versions 1, 2, and 3 in the real world. SNMPv2 was not widely adopted due to disagreements over the security framework, but many networks are evolving to use SNMPv3. Version 3 includes some important new features. The most notable is encryption of the data in transit. Earlier versions send data in plaintext, which is easily read, making SNMP a prime tool for a malicious intruder to learn about the network. Encryption ensures that only the NMS and authorized personnel can read and evaluate this information. Different SNMP versions can interoperate to a limited degree. Interoperability between versions is explained in IETF RFC 3584.



Remote Monitoring MIBs
Remote Monitoring (RMON) is another technical specification that provides for a different variety of network monitors and console systems. RMON is designed to support network probes and monitors (often called sniffers). It allows the integration of diagnostic tools from multiple vendors, which may be used for very specific diagnostics or analysis. RMON was initially developed when LAN switching became popular. It allows for managing switched LAN segments from a central monitoring facility or Network Operations Center (NOC). RMON is simply another extension of the standards already described as part of the SNMP MIB.

Unlike SNMP, RMON uses only two components. The probe contains the agent and is inserted into the network. One example would be a sniffer inserted into a specific network segment or VLAN for troubleshooting purposes. The other component is a management station. This workstation is frequently a network engineer's workstation, used interactively in troubleshooting and problem diagnosis. Like SNMP, RMON uses the MIB found locally on the device, but the RMON agent is most commonly embedded in the OS. RMON agents don't monitor the entire system, only the traffic flowing through the RMON device. An RMON sniffer placed in listening mode on a LAN segment can only report on traffic on that LAN segment.

There are several variations in RMON MIBs. The Token Ring RMON MIB, for example, provides specific objects for managing a Token Ring network. The SMON MIB extends RMON and provides support for RMON analysis of a switched network.

SNMP is a very simple application protocol. Because it doesn't require a full three-way handshake or guaranteed communications, it's encapsulated in User Datagram Protocol (UDP). All three versions of SNMP contain the same message components:

Version: The SNMP version number. It's key that the agent software running on the network device and the NMS use the same version of SNMP. Messages that arrive tagged as being a different version are typically discarded by the NMS.

Community: The community name, or string, is used to authenticate the management system and grant either read or write access to the agent. The most common default strings, described earlier, are public and private. Many vendors' products come with other SNMP community strings enabled by default.

PDU (Protocol Data Unit): The PDU types and formats differ among SNMPv1, v2, and v3. A PDU is a descriptor of how information is packaged for a given technology. For example, the PDU for LAN technologies such as Ethernet and Token Ring is called a frame. Ethernet and Token Ring frames differ in format, but the PDU they transport is a frame. The IP packet is the PDU that IP transmits, and it differs slightly in format from other packets.



Using SNMP
SNMP runs over IPv4 and also supports IPv6 for the future. The following list highlights the capabilities provided by SNMP:

Data gathering: Collect data from a device that is SNMP capable. Single requests can be submitted using the snmpget and snmpgetnext requests. Multiple requests can be stacked using the snmpwalk, snmptable, or snmpdelta commands.

Configuration modification: The snmpset command provides for altering configuration information.

Status checking: Commands such as snmpdf, snmpnetstat, and snmpstatus allow for retrieval of status and other information.

Translation: Converting MIB information, content, and structure from text and numeric forms to other formats for use in a wide variety of analysis systems is accomplished using snmptranslate.

Many current SNMP tools provide a graphical MIB browser. Most organizations use graphical tools that provide some underlying, automated mechanism of implementing the snmptrapd command to automate receiving of SNMP notifications. A GUI can provide a human-friendly view that makes changes in the environment quickly observable. These notifications can also be logged to a syslog server or an event log or exported to a plain-text file. They can also easily be forwarded to other SNMP management systems and passed to external applications for event correlation and further analysis.

snmpwalk
On UNIX systems, snmpwalk is a widely available application. An administrator can run a very simple snmpwalk command
snmpwalk -c [good community string] [target host]

and learn a great deal of information about a device. Windows users can download a variety of SNMP exploration tools from the Internet. These tools generally eliminate arcane command-line interfaces, making basic exploration of networks and devices a simple point-and-click operation. Figure 6.1 shows a simple snmpwalk of a print server on the author's network using a GUI tool from SolarWinds. In an enterprise network, routers, switches, servers, and VoIP service delivery systems can yield routing information, user account information, performance information, and details about TCP and UDP services running, all from this output information. When enabled, SNMP can provide an administrator with extensive information about an enterprise network very quickly.
SNMP is a reconnaissance tool. If SNMP must be enabled, it is absolutely critical that default community strings be replaced. Just as a network administrator can use SNMP to perform quick network reconnaissance and learn information about the network that must be kept private, so too can an attacker. As a tool, SNMP is a double-edged sword, providing value while potentially exposing vital information.
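As a sketch of how an administrator might audit for those default strings, the snippet below checks a device inventory against the well-known defaults. The device names and community strings here are purely illustrative:

```python
# Well-known factory-default SNMP community strings.
DEFAULT_STRINGS = {"public", "private"}

def audit_communities(devices):
    """Return the names of devices still using a default community string."""
    return [name for name, community in devices
            if community.lower() in DEFAULT_STRINGS]

# Hypothetical inventory of (device name, configured community string).
inventory = [
    ("core-rtr-01", "public"),
    ("dist-sw-02", "N3twrk-R0"),
    ("print-srv", "private"),
]
print(audit_communities(inventory))  # ['core-rtr-01', 'print-srv']
```

In practice, the inventory would come from a configuration management database or device configuration backups rather than a hard-coded list.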



Figure 6.1: An example of SNMP information.

Factors to Consider with SNMP
SNMP versions 1 and 2 do not encrypt the transmitted data. This means that management information is passed in the clear and is quite readable by humans. There's a security risk in allowing critical management information to pass in the clear, even inside the enterprise network. Because the different versions of SNMP are not fully compatible, use of SNMP for network management is often relegated to the lowest version supported in the network. For most enterprises today, that is SNMPv2, which does not support encrypted messages.

Upgrading an enterprise network to SNMPv3 has often proven impractical. Existing routers and other network elements often cannot support the newest version. Although upgrading the OS might seem like a simple solution, often hardware replacement is the only viable means to upgrade to SNMPv3. The benefits of the latest protocol standard may not be a powerful enough business driver to warrant the necessary hardware upgrades.


Autodiscovery
SNMP tools are widely used by malicious intruders for reconnaissance purposes. Many SNMP tools allow the simple use of subnet masking to run a scan across not just an individual network element but also a subnet or full network to discover what devices are listening for SNMP commands. SNMP is a very simple network discovery tool. One of the features of SNMP tools is an automatic discovery feature, through which new devices discovered in the network are polled automatically. Most implementations will allow for a quick scan that yields tremendous information. Even if the public and private community strings have been set to a secure string that is not the default, the simple act of allowing SNMP enables discovery that quickly identifies working IP addresses on the network and the domain or network names associated with each.

Negative Implications
SNMP may be the intruder's easiest and most friendly tool. Software utilities are abundant for free downloading. Many are point-and-click operations, requiring no technical skill. It's quite common within an enterprise for employees to be curious and use these simple tools for network reconnaissance and exploration. Employees are typically within a trusted environment, so it's natural that they may have access to view a great deal of information. There is a danger of the network topology being mapped from within because of this implied trust relationship.

Vendors' approaches to SNMP implementation vary widely. For some vendors, it isn't an element of the core product design but a feature that has been added or incorporated later in the product development cycle. Because the tree structure and data indexing techniques may vary, the internal data structures any particular vendor has implemented may vary. As a result, querying the network equipment with SNMP can produce unwanted problems, such as increased CPU utilization.
Large routing tables, like those often built by Border Gateway Protocol (BGP) or an Interior Gateway Protocol (IGP), are one example of a situation where this problem is likely to occur. The lack of encryption capability in versions 1 and 2 introduces the threat of simple packet sniffing/capture, easily revealing the plain-text SNMP community string. No version of SNMP uses a challenge/response approach to authentication. That leaves all versions vulnerable to both brute-force and dictionary attacks. An assortment of both free and commercial software tools to instigate these attacks is readily available. Because SNMP is UDP-based, it's connectionless in nature. This leaves SNMP vulnerable to IP spoofing attacks. Effectively restricting access to SNMP requires extensive access control list implementation across multiple network elements in many corporate networks.
SNMP has frequently surfaced in the SANS Institute's Top 10 Most Critical Security Threats as a result of the default community strings being set to public and private.

For more information about SNMP security implications, the US-CERT maintains an excellent SNMP Vulnerabilities FAQ at http://www.cert.org/tech_tips/snmp_faq.html.



ICMP
ICMP is a foundation of the TCP/IP suite. It is mainly used by the OSs of networked computers to send error messages, indicating, for instance, that a requested service is not available or that a host or router could not be reached. In the connectionless, packet environment of IP, each host and router acts autonomously. Packet delivery is on a best-effort basis. Everything functions just fine as long as the network is working correctly, but what happens when something goes wrong within the subnet? As a connectionless service, IP has no direct mechanism to tell higher-layer protocols that something has gone awry. Furthermore, IP does not even have a method for peer IP entities to exchange information; if an IP host receives a packet, it attempts to hand it off to a higher-layer protocol. ICMP has been defined for exactly this purpose: IP-to-IP communication, usually about some abnormal event within the network. ICMP messages are carried in IPv4 packets with a protocol value of 1. ICMP is defined in RFC 792 and is part of STD 5, which defines IP; this underscores that ICMP is an integral part of IP.

There are several types of ICMP messages. The following list highlights the most commonly used ICMP messages:

Destination unreachable: Indicates that packets cannot be delivered because the destination cannot be reached. The reason is also provided. Examples include:
Host or network unreachable or unknown
Protocol or port is unknown or unusable
Fragmentation is required but not allowed (DF flag is set)
Network or host is unreachable for this type of service

Time exceeded: The packet has been discarded because the Time to Live (TTL) field decremented to 0 or because all fragments of a packet were not received before the fragmentation timer expired.

Parameter problem: There was a problem with something in the packet header preventing a router or host from processing the packet.

Source quench: Indicates that a router along the path is experiencing congestion and is discarding packets. This is usually caused by limitations in buffer space.

Redirect: If a router receives a packet that should have been sent to another router, the router will forward the packet appropriately and let the sending host know the address of the appropriate router for the next packet.
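A monitoring script often needs to turn these numeric type and code values back into readable names. The following minimal decoder covers the common messages above, with values taken from RFC 792:

```python
# ICMPv4 type names per RFC 792 (common subset).
ICMP_TYPES = {
    0: "Echo Reply", 3: "Destination Unreachable", 4: "Source Quench",
    5: "Redirect", 8: "Echo", 11: "Time Exceeded", 12: "Parameter Problem",
}
# Code values for Destination Unreachable.
UNREACHABLE_CODES = {
    0: "network unreachable", 1: "host unreachable",
    2: "protocol unreachable", 3: "port unreachable",
    4: "fragmentation needed but DF set",
}

def describe(icmp_type, code=0):
    """Render an ICMP (type, code) pair as a readable description."""
    name = ICMP_TYPES.get(icmp_type, f"type {icmp_type}")
    if icmp_type == 3:
        name += f" ({UNREACHABLE_CODES.get(code, f'code {code}')})"
    return name

print(describe(3, 3))  # Destination Unreachable (port unreachable)
print(describe(11))    # Time Exceeded
```

The port-unreachable case (type 3, code 3) is the reply traceroute watches for to know it has reached the final destination, as described later in this chapter.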


The remaining ICMP messages are used to query the network for information:

Echo and Echo Reply: Used to confirm whether systems are active. One host sends an Echo message to the other. The destination system must respond with an Echo Reply containing the same data that it received. These messages are the basis for the TCP/IP ping command.

Timestamp and Timestamp Reply: These messages provide more information than the simple Echo messages. A timestamp, with granularity to the millisecond, is inserted in the messages. This provides a mechanism for measuring how long remote systems spend buffering and processing packets. It can also be used as a clock synchronization tool between hosts.

Address Mask Request and Address Mask Reply: Can be used by network nodes to determine their address mask when assigned an IP address.

Information Request and Information Reply: These messages are now obsolete.

[Figure: an ICMP message, such as Time Exceeded, is carried as the payload of an IP packet, directly behind the IP header. ICMP is used to report general information: Destination Unreachable, Parameter Problem, Source Quench, Redirect, Echo, Timestamp, and Address Mask.]
Figure 6.2: ICMP.



ICMP Message Format
Figure 6.3 shows the general format of an ICMP message. The following list highlights the first four bytes of all ICMP (error and query) messages:

Type: Indicates the type of ICMP message, including Echo Reply (0), Destination Unreachable (3), Source Quench (4), Redirect (5), Echo (8), Time Exceeded (11), Parameter Problem (12), Timestamp (13), Timestamp Reply (14), Address Mask Request (17), and Address Mask Reply (18).

Code: Additional information specific to the message type. In the Time Exceeded message, for example, the Code field indicates whether the TTL counter was exceeded (0) or the fragment reassembly timer expired (1).

Checksum: A 16-bit checksum similar to that used in IP.

The next four bytes are labeled miscellaneous. They're used differently by different messages. In most ICMP error messages (for example, Destination Unreachable, Source Quench, Redirect, Time Exceeded, and Parameter Problem), these 32 bits are unused and set to 0. In the Parameter Problem message, however, the first byte is used as a pointer to the byte where the parameter problem was detected; in the Redirect message, these four bytes contain the address of the router to which future traffic should be directed. The final field shown in the diagram contains the IP packet header plus the first 64 bits of the Data field (or payload) of the offending packet. The receiving host uses this information to match the message to the appropriate process. The 64 bits of user data are returned so that at least part of the header of any upper-layer protocol, including any port numbers, gets back to the original sender.

[Figure: the ICMP message format: Type, Code, and Checksum fields, a miscellaneous field, and then the Internet header plus 64 bits of the original packet's data, which is the general format for ICMP error messages.]

Figure 6.3: The ICMP message format.
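The Checksum field uses the standard Internet checksum (RFC 1071): the one's complement of the one's complement sum of the message's 16-bit words. The sketch below builds an ICMP Echo message by hand, computes the checksum, and patches it in; the identifier, sequence number, and payload are arbitrary illustrative values:

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement 16-bit checksum used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"            # pad odd-length data with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:             # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# ICMP Echo (type 8, code 0) with the checksum field zeroed, then patched in,
# which mirrors the steps a ping implementation performs.
ident, seq, payload = 0x1234, 1, b"ping"
header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
csum = internet_checksum(header + payload)
message = struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload
print(hex(csum))  # 0x6fa
```

A receiver verifies the message by running the same checksum over the entire message, checksum included; a valid message sums to zero.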


ICMP differs in purpose from TCP and UDP in that it is usually not used directly by user network applications. One exception is the ping tool, which sends ICMP Echo messages (and receives Echo Reply messages) to determine whether a host is reachable and how long packets take to get to and from that host.
ICMP Technical Details
ICMP is part of the TCP/IP suite as defined in RFC 792. ICMP messages are normally generated in response to errors in IP packets, per RFC 1122 specifications, or for diagnostic or routing purposes. The version of ICMP for IP version 4 is also known as ICMPv4, as it is part of IPv4. IPv6 has an equivalent protocol, ICMPv6.

ICMP messages are constructed at the IP, or network, layer. They are usually built in response to a normal IP packet that has triggered some type of ICMP event. The appropriate ICMP message is encapsulated in IP with an IP header in order to return the ICMP message to the originating host. For example, every router in the network that forwards an IP packet must decrement the TTL field of the packet header by 1. If the TTL reaches 0, an ICMP Time Exceeded message will be sent to the source from that router.

Every ICMP message is directly encapsulated in a single IP packet. Like UDP, ICMP does not provide any delivery guarantees. Although ICMP messages are contained inside standard IP packets, ICMP messages are usually processed as a special case. They're not normally treated as an IP sub-process because it's often necessary to inspect the contents of the ICMP message, then deliver the appropriate error message to the originating host and application.

Reachability Testing
One of the most crucial tests for network monitoring is the simple determination as to whether a system or network element is reachable via the network. The two most common tools for determining reachability are ping and traceroute.

ping
Ping is perhaps the most widely utilized tool on all TCP/IP systems. It allows users to determine the status of other systems. It also provides a tool for measuring the expected round-trip delay between the local system and a remote network element. Ping is useful for many reasons. Prior to attempting to establish a TCP virtual circuit, a local system might first ping the intended destination to verify that it is up and reachable. Ping uses ICMP Echo and Echo Reply messages. It has the following general format (where items in square brackets [] are optional):
Usage: ping [-t] [-a] [-n count] [-l size] [-f] [-i TTL] [-v TOS] [-r count] [-s count] [[-j host-list] | [-k host-list]] [-w timeout] target_name


In the first test that Figure 6.4 shows, the test pings the host www.yahoo.com to determine whether it is up and running. This demonstrates the simplest use of the ping command and uses none of the optional parameters. The second test in the figure uses the optional -t parameter to tell the workstation to send ICMP Echo messages continuously. The optional size and quantity parameters are not specified, so ping uses the default values (64-byte messages). These are sent continuously until the program is interrupted using Ctrl+C to break the continuous cycle. The second test results in a list of the round-trip delays experienced by each Echo message sent.

Figure 6.4: Ping.
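When ping output must be processed by a script rather than read by a person, the round-trip times can be pulled out with a simple pattern match. The sample output below is illustrative Windows-style text, not captured from a real host:

```python
import re

# Illustrative sample of Windows ping output; real replies vary by platform.
SAMPLE = """\
Reply from 209.131.36.158: bytes=32 time=23ms TTL=54
Reply from 209.131.36.158: bytes=32 time=25ms TTL=54
Reply from 209.131.36.158: bytes=32 time=22ms TTL=54
Request timed out.
"""

def round_trip_times(output):
    """Extract the time=NNms (or time<NNms) values from ping replies."""
    return [int(ms) for ms in re.findall(r"time[=<](\d+)ms", output)]

rtts = round_trip_times(SAMPLE)
print(rtts)  # [23, 25, 22]
print("avg:", round(sum(rtts) / len(rtts), 1))
```

Note that timeouts simply produce no match, so the list length also tells a monitoring script how many probes were answered.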

Traceroute Traceroute is another common TCP/IP tool that lets users learn about round-trip delays and the network routing between systems. Traceroute works by sending a sequence of UDP packets with an invalid port identifier to the destination system. The first three packets have the TTL field value set to 1; this causes the first router in the path to send back an ICMP message reporting that the TTL has expired. Then three more UDP messages are sent, each with the TTL value set to 2, which causes the second router to send ICMP replies. This process repeats, incrementing the number of router hops until the message actually reaches the destination. Traceroute identifies a completed cycle when it detects an invalid port error reply.


Figure 6.5 shows the route from a workstation on the author's network to www.yahoo.com. The route that is displayed tells the following:

The first hop is through a system called GE-1-1-ur01.olympia.wa.seattle.comcast.net. This is the first point on the path in the Comcast provider network.
The second hop is further along the way on the Comcast network.
Hop 3 shows a hop across the AT&T-provided backbone network in Seattle.
Hops 7 through 10 traverse the Level3 backbone network from Seattle to San Jose.
Hop 11 hits a network node that only provides an IP address, 4.71.112.14.
Finally, on the Yahoo network, we hop across to the final destination at 209.131.36.158, which has the DNS name f1.www.vip.sp1.yahoo.com. This is the system responding to the request for www.yahoo.com.

Figure 6.5: Traceroute.

For more information about traceroute, see RFC 1393.
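The TTL-incrementing probe sequence described above can be sketched in a few lines of Python. This is a simplified illustration, not a full traceroute: the probe schedule follows the classic convention of three probes per TTL with an incrementing "unlikely" destination port starting at 33434, and reading the ICMP Time Exceeded replies (omitted here) would require a privileged raw socket.

```python
import socket

BASE_PORT = 33434  # classic traceroute's first "unlikely" UDP port

def probe_plan(max_hops=30, probes_per_hop=3):
    """List the (ttl, dst_port) pairs a classic traceroute sends:
    several probes per TTL, destination port incremented per probe."""
    plan, port = [], BASE_PORT
    for ttl in range(1, max_hops + 1):
        for _ in range(probes_per_hop):
            plan.append((ttl, port))
            port += 1
    return plan

def send_probe(host, ttl, port):
    """Send one UDP probe with a limited TTL; each router that decrements
    the TTL to zero answers with an ICMP Time Exceeded message."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    s.sendto(b"", (host, port))
    s.close()
```

For example, `probe_plan(2)` yields three TTL-1 probes on ports 33434 to 33436, then three TTL-2 probes on ports 33437 to 33439.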



Syslog, Data Logging, and the Console


The term syslog is used to describe both the application that sends syslog messages and the syslog protocol itself. The syslog protocol, defined in RFC 3164, is very simple: a syslog sender transmits a small text message to the syslog receiver or server. Syslog is available almost universally to aid in systems management and security auditing. Although syslog has several shortfalls, it is supported by almost every element of the network. Because it's nearly ubiquitous, syslog can be used to integrate log data from many different types of systems into a central data store for event correlation and analysis.

Syslog may seem too simple for auditing use, but its broad availability is a great advantage. It allows a centralized, corporate syslog server to become the central data repository for audit and event correlation information. Syslog data is in plain-text format, so it's easy to manipulate with standard tools and scripts. Most organizations start with scripts and spreadsheets for analyzing syslog data. Larger organizations, monitoring many devices, may find that this approach is too labor intensive to be effective. Large log files, and large numbers of log files, may require adopting scalable commercial tools and developing automated processes to ease the work involved.

Technology alone can't solve the anomaly detection problem. A great deal of syslog and event monitoring is tied not just to performance but also to network security. What gets monitored, how log data is used, and how the organization responds to events at the time of detection are all critical parts of the cycle of network management, monitoring, and defense. Administrators employ detection mechanisms because they offer notification as quickly as possible when a network anomaly, intrusion, or other malicious event occurs. Network threats mutate quickly, and worms spread almost instantaneously.
The threat of zero-day attacks leaves no room for weak incident prevention and detection processes. Effective incident management tools and processes ensure quick reaction and recovery when an event does occur.

The syslog protocol provides a transport mechanism that allows devices to send event notification messages over the network to syslog servers. These servers are often simply message collectors that don't return any acknowledgement. The sender transmits a text message that is less than 1024 bytes, and the syslog server (often referred to as syslogd, or the syslog daemon) appends the message to a log file. These messages can be transmitted using either UDP or TCP. Normally, syslog data is transmitted as plain text, but there are tools that use an SSL wrapper to add encryption for increased security. Although TCP can be used, syslog doesn't require a three-way handshake; given the small size of the messages, UDP port 514 is the most commonly used transport. As UDP is connectionless, no acknowledgments are provided. At the application layer, syslog servers normally don't send any acknowledgments back to the sender either. Thus, devices transmitting syslog messages never know whether the syslog server has received them. Most sending devices will send syslog messages even if there is no syslog server in place.


Syslog packets are limited to 1024 bytes and carry the following information:

- Facility
- Severity
- Hostname
- Timestamp
- Message

Syslog messages are categorized based on the generating source. These sources can be the OS running a device, a syslog process (or service), or an application. To learn more about syslog, see the IETF document at http://www.ietf.org/internet-drafts/draft-ietf-syslog-protocol-19.txt for further technical details.
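RFC 3164 encodes the facility and severity into a single priority (PRI) value at the start of each message: facility times 8 plus severity. The sketch below shows that encoding and a minimal fire-and-forget UDP sender; the helper names are illustrative, not part of any particular syslog implementation.

```python
import socket
import time

def pri(facility: int, severity: int) -> int:
    """RFC 3164 priority value: facility * 8 + severity."""
    return facility * 8 + severity

def format_syslog(facility, severity, hostname, tag, msg):
    """Build a <PRI>TIMESTAMP HOSTNAME TAG: MSG packet, capped at 1024 bytes."""
    stamp = time.strftime("%b %d %H:%M:%S")
    packet = f"<{pri(facility, severity)}>{stamp} {hostname} {tag}: {msg}"
    return packet.encode()[:1024]

def send_syslog(server, packet, port=514):
    """Fire-and-forget over UDP; no acknowledgement ever comes back."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(packet, (server, port))
    sock.close()
```

For example, `pri(4, 2)` returns 34, the auth-facility, critical-severity example used in RFC 3164 itself.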

Integrating Tools for Event Correlation


An NMS is a combination of hardware and software used to monitor and administer the addressable and manageable elements of the network. In converged service networks, VoIP and video services introduce a new set of manageable network elements that perform telecommunications service functions. These elements typically include gateways, call management servers, emergency responders, voicemail servers, media gateways or servers, and so on.

General network management involves functions such as network planning, traffic routing, user authorization, configuration management, fault management, security management, performance management, and accounting management. Many protocols exist to support network and network device monitoring and management. As we discussed, SNMP is a common network management protocol, but others that may come into play include the Common Management Information Protocol (CMIP), Web-Based Enterprise Management (WBEM), the Common Information Model (CIM), Transaction Language 1 (TL1), and Java Management Extensions (JMX). We won't probe these protocols in depth here.

When implementing the converged network, NMSs take on a new, crucial role in enterprise service delivery. Enterprises need to bolster their management capabilities to test and manage QoS, performance, and availability metrics, especially for VoIP services. To get started, companies should analyze their business requirements and determine key performance and QoS metrics. A comprehensive, enterprise-wide data collection mechanism is required to provide effective service assurance. Collecting as much data about the network as possible will aid in the ability to ensure call quality and consistency of service.



Network Monitoring

An NMS constantly monitors the network and notifies the network administrator via email, pager, or other alarms in the event of outages or anomalies that exceed defined thresholds. Monitoring is vital to service assurance and VoIP management. An NMS continually monitors the network for problems that result from overloaded and/or crashed servers, network connections, or other devices. For example, to determine the status of a Web server, monitoring software may simply ping the server periodically to check for a response. A more comprehensive NMS technique is to send an HTTP request to fetch a specific Web page; testing email servers might involve sending a periodic test message to ensure the email services (SMTP, POP3, and IMAP, for example) are up and running properly.

Status request failures, such as a failed ping, an unretrievable Web page, or another unexpected condition, can be configured in most NMS platforms to activate some predefined response. These responses can vary from event to event. In some cases, an alarm might be sent to the systems administrator's email, pager, or mobile phone so that human intervention can follow. Highly evolved systems might trigger some automatic failover mechanism for continuity of operations. Or a non-critical server experiencing problems might simply be removed from service until a suitable time is available for repair.

Some of the most important characteristics of network elements monitored in the IP network include CPU utilization, physical memory, disk space usage, virtual memory, and fans and power supplies. Many systems monitor temperature to ensure a proper operating environment is maintained. Monitoring of system backups is incorporated to ensure positive confirmation that backup jobs run as scheduled. Many organizations monitor Web server software (typically Apache or Internet Information Services, IIS), directory services systems, and Domain Name System (DNS) servers.
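The probe-then-respond pattern described above (poll a service, count failures, escalate past a threshold) can be sketched generically. The probe function and the three-failure threshold below are illustrative assumptions, not from any particular NMS product.

```python
def monitor(probe, threshold=3):
    """Return a polling closure that reports 'alert' once `threshold`
    consecutive probes fail; any successful probe resets the count."""
    failures = {"count": 0}

    def poll():
        try:
            ok = bool(probe())
        except Exception:
            ok = False                      # a probe error counts as a failure
        failures["count"] = 0 if ok else failures["count"] + 1
        if failures["count"] >= threshold:
            return "alert"                  # page the administrator, fail over, etc.
        return "ok" if ok else "degraded"

    return poll

# An HTTP page-fetch check could be wrapped the same way, e.g.:
# poll = monitor(lambda: urllib.request.urlopen(url, timeout=5).status == 200)
```

Keeping the probe pluggable lets the same escalation logic cover ping checks, page fetches, and test email messages alike.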
Security monitoring is often incorporated into the monitoring performed in this network operations center environment. Managing VoIP services raises the need to monitor both the VoIP service elements and the QoS facets of network performance to ensure acceptable call quality. In traditional IP networks, the infrastructure elements are monitored. VoIP introduces new infrastructure elements, including voice processing systems, signaling servers, gateways to other networks, border controllers supporting SIP trunking, and voicemail systems.

VoIP Service Elements to Monitor

Voice traffic carries a set of performance expectations users have developed through years of telephone use. VoIP services introduce a new range of network elements to monitor. Whenever a device (for example, a phone, gateway, or gatekeeper) registers with the network, there will be an auditing entry to review. Problems with device registration, for any reason, can impact service availability. You'll want to be alerted when the number of registration attempts or failures exceeds predefined thresholds. If the number of registered telephones changes dramatically, it could be a signal that there is a problem with the VoIP network. Gateway registration monitoring will help identify new or missing gateway servers.


Call monitoring isn't eavesdropping on individual calls; it's really call-traffic monitoring. It involves monitoring incoming and outgoing call volumes to identify failures. If your VoIP system supports fax calling, attempted fax calls also need to be monitored. Call monitoring typically focuses on four specific areas:

- Calls in progress: When a VoIP phone goes off hook, a call is deemed in progress until it goes back on hook. If every call in progress connects successfully, the number of calls in progress will equal the number of active calls. When designing the VoIP network, you'll need to establish an upper-limit threshold for the number of calls that can be in progress at any given time.
- Active calls: Active calls have successfully connected a voice path. Again, when designing the VoIP network, you'll need to establish an upper-limit threshold for the number of active calls that can be handled at any point in time.
- Attempted calls: Designers strive to ensure that all calls attempted will be completed successfully, but such isn't the case in the real world. Monitoring calls attempted over time yields data that aids in identifying peak periods and the busy hour call attempt (BHCA) value.
- Completed calls: A completed call is any successful active call that completes without an abnormal termination code. Monitoring completed calls over time is also useful in identifying peak periods and the BHCA value.
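The BHCA value is simply the largest number of call attempts falling inside any one-hour window. Given a log of attempt timestamps (in seconds), a sliding-window sketch of that computation might look like this; the function name is illustrative.

```python
import bisect

def bhca(attempt_times):
    """Busy hour call attempts: the maximum number of attempts whose
    timestamps (in seconds) fall within any sliding 3600-second window."""
    times = sorted(attempt_times)
    best = 0
    for i, start in enumerate(times):
        # index of the first attempt at or beyond start + one hour
        end = bisect.bisect_left(times, start + 3600, lo=i)
        best = max(best, end - i)
    return best
```

The same routine applied to completed-call timestamps yields the busy hour call completion figure for comparison.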

VoIP services need to interconnect to the PSTN through gateways. In addition to gateway monitoring, it is vital to monitor the PSTN side of the VoIP service network. PSTN connections are frequently established using ISDN Primary Rate Interface (PRI) channels over T-1 circuits. Monitoring active PRI channels, especially over time, can help identify call patterns and busy hour peak call volumes. Baseline data can also be used to identify underutilization of circuits. Data trending helps in capacity planning and the growth and maturation of the VoIP service.

One benefit of deploying VoIP services is conference bridging capability. If your deployment supports conferencing, you must configure the maximum number of audio streams that will be supported. Monitoring will ensure that the number of available audio streams meets acceptable service levels for your organization.

IP phone functionality requires continual monitoring for service assurance. You should monitor IP phones for their registration status, the validity of their dial tones, jitter, latency, and lost packet count. These QoS parameters directly affect service delivery.



Monitoring Bandwidth and QoS

Voice traffic requires specific bandwidth based on the codec used in the VoIP design. G.711 requires about 64Kbps for each direction of a bidirectional call. G.723 and G.729 require significantly less bandwidth due to compression, but congestion can severely impact call quality. When you add applications to your network, there is always a risk of oversubscribing links. Oversubscription leads to congestion, and congestion may negatively impact call quality. Packet loss and increased latency are common side effects of congestion and can, when left unchecked, render VoIP services unusable.

For VoIP users to receive an acceptable level of voice quality, VoIP traffic may need to be given some kind of prioritization over other kinds of network traffic, such as data. The main objective of QoS mechanisms is to ensure that each type of traffic, whether data, voice, or video, receives the treatment it requires, thereby reducing or eliminating the delay of real-time streaming voice or video packets crossing the network. The following list highlights examples of metrics that are frequently monitored because of their effect on VoIP call quality:

- Delay, or latency, is an estimate of the network delivery time expressed in milliseconds. It's measured as an average value of the difference between the timestamps noted by the senders and the receivers of messages, computed when the messages are received. The end-to-end delay, or latency, measured between endpoints is a key factor in determining VoIP call quality.
- Jitter, also called delay variation, indicates the variance of the arrival rate of packets. Jitter points directly to the consistency or predictability of the network and is known to adversely affect call quality. Networks can compensate for jitter by implementing jitter buffers to normalize the timing of the traffic flow.
- Jitter buffer loss occurs when jitter exceeds what the jitter buffer can hold. Jitter and jitter buffer loss affect call clarity, which affects the overall call quality.
- Packet loss indicates packets lost during transmission. In VoIP, this could mean the loss of an entire syllable or word during the course of a conversation. Obviously, such loss can severely impair call quality. Monitoring systems measure the number of packets that were expected against the number actually received.
- Mean Opinion Score (MOS) is a subjective measure used in voice telephony, especially when codecs are used to compress the bandwidth requirement of a digitized voice connection below the standard 64Kbps PCM encoding. MOS is generated by averaging the results of a set of standard, subjective tests. In the past, a number of listeners rated the audio quality of test sentences read aloud by both male and female speakers, scoring each as follows: 1-bad, 2-poor, 3-fair, 4-good, 5-excellent. The MOS is the arithmetic mean of all the individual scores. In current systems, MOS is often determined through software algorithms.
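Two of these metrics are straightforward to compute from packet timestamps and sequence numbers. The jitter estimator below follows the RTP interarrival-jitter recurrence from RFC 3550 (the 1/16 smoothing gain is the one that specification uses); the packet-loss helper is a simple sequence-number accounting sketch.

```python
def interarrival_jitter(send_ts, recv_ts):
    """RFC 3550 smoothed jitter estimate: J += (|D| - J) / 16, where D
    compares successive transit-time differences (timestamps in ms)."""
    j = 0.0
    for i in range(1, len(send_ts)):
        d = (recv_ts[i] - recv_ts[i - 1]) - (send_ts[i] - send_ts[i - 1])
        j += (abs(d) - j) / 16.0
    return j

def packets_lost(seqs):
    """Packets expected (from the sequence-number span) minus packets
    actually received."""
    expected = max(seqs) - min(seqs) + 1
    return expected - len(set(seqs))
```

With perfectly regular arrivals the jitter estimate stays at zero; any variation in the arrival spacing nudges it upward.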


Measurements and Metrics for Voice Quality

Voice quality measurement as part of operational monitoring can be either non-intrusive or intrusive. Non-intrusive tests are typically based on actual voice conversations taking place during daily operations, whereas intrusive testing requires placing test calls across the network. One approach to evaluating call and voice quality is to assemble a group of participants who will act as judges. A common technique is to have them listen to test calls and assign scores from 1 to 5, much like the MOS evaluation testing. There are a number of algorithms and methods that might be used, including MOS, Perceptual Analysis Measurement System (PAMS), Perceptual Speech Quality Measurement (PSQM/PSQM+), and Perceptual Evaluation of Speech Quality (PESQ/PESQ-LQ):

- MOS has been adopted from the PSTN and traditional telephony. This historical measure of voice call quality was judged in the US by eight men and eight women who rated voice quality on a scale from 1 at the low end to 5 at the high end. Although this human evaluation of MOS was useful for identifying the quality of experience with calls in the PSTN, it hasn't proven useful for network monitoring or SLA compliance measurement.
- PAMS was developed by British Telecom. It doesn't replace MOS but offers a reasonable first attempt at automating MOS scoring. PAMS compares the original analog voice wave with reproduced speech using a complex weighting method that purports to take into account specific characteristics that are important to the human ear. The PAMS scale runs from 0, which represents a perfect match between samples, to 6.5. PAMS values estimate MOS scores with a 10 to 20% level of accuracy, which is inadequate for use in VoIP networks, but they are excellent for benchmarking purposes.
- PSQM/PSQM+ is defined in ITU-T Recommendation P.861. PSQM estimates MOS with a greater level of accuracy than PAMS, at 10%.
PSQM and PSQM+ are also good for benchmarking and comparisons; they represent improvements on the earlier PAMS algorithm and use the same 0 to 6.5 scoring that PAMS uses.
- PESQ/PESQ-LQ is defined in ITU-T Recommendation P.862. Like the other techniques described here, PESQ doesn't replace MOS. It has proven excellent for benchmarking and comparisons and is an enhancement and improvement to the PSQM algorithm. It also uses the same 0 to 6.5 scoring.
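Subjective listening panels are increasingly replaced by computed scores. One widely cited example, not covered in detail in this chapter, is the ITU-T G.107 E-model, whose transmission rating (the R-value) maps to an estimated MOS with a standard cubic formula. The sketch below implements that published mapping; the function name is illustrative.

```python
def r_to_mos(r):
    """ITU-T G.107 E-model mapping from R-value to estimated MOS.
    R is clamped to [0, 100]; MOS ranges from 1.0 to 4.5."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
```

For example, an R-value of 93.2, often associated with toll-quality G.711 calls, maps to an estimated MOS of roughly 4.4.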



The Do-It-Yourself Approach

Many enterprises create their own management platform suite over time. For many small to midsized businesses, a managed service has always been the primary option. These organizations are often resource-constrained and simply don't have the staffing to do everything themselves. Larger enterprises often take exactly the opposite approach. These organizations have provided their own telephony services for many years and seem inclined to continue this approach in the emerging multimedia converged networks of today. Although many enterprises view VoIP and video as new service applications on the IP network, the trend seems to be a continued do-it-yourself approach to data, VoIP, and video. Although this is common, it isn't necessary. For many large enterprises, migration to a converged service network may present a perfect opportunity to rethink service delivery and develop partner relationships with service providers for both delivery and management of these services.

The enterprise network is evolving and becoming somewhat organic in nature. This converged network provides a shifting set of real-time, near-real-time, and non-real-time voice, data, and video services. Even an internal SLA for workgroups inside an enterprise needs to be developed to support metrics that are relevant to each of the services provided.

The SLA

An SLA is essentially a contract mechanism that documents the level of service that a customer should expect to receive. An SLA that has been thoroughly thought out will also describe actions that will be taken by the service delivery organization. This is where different service classes will be defined and delivery characteristics identified for each. Data, voice, and video services will be differentiated and require different traffic/service flows. Thus, the network provides differing Classes of Service (CoS).
To assure service delivery, management, monitoring, and analysis tools must be capable of monitoring the service parameters for each service class. The alternative is to implement discrete tools for each service type. This option is often unwieldy and expensive.
SLA Metrics to Consider

The SLA process leads to an informed user within the enterprise. Users will begin to understand multimedia services much the way an automobile purchaser understands the performance characteristics of different cars. The MOS, PSQM, R-Value, and other performance metrics described earlier might become as familiar to integrated services users as MPG, horsepower, and the like are to motor vehicle owners. Embedding tools within the service delivery organization that support comprehensive monitoring, measurement, and analysis of multimedia traffic against established baselines is a crucial part of service delivery support.
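One SLA metric every user grasps quickly is availability, because it translates directly into an allowable downtime budget. The arithmetic is simple enough to sketch; the 30-day month used as the default period is an assumption for illustration.

```python
def allowed_downtime_minutes(availability, period_minutes=30 * 24 * 60):
    """Downtime budget implied by an availability target, e.g. 0.999
    ("three nines") measured over a 30-day month."""
    return (1 - availability) * period_minutes
```

A 99.9% target leaves about 43.2 minutes of downtime per month, while 99.999% leaves only about 26 seconds, which illustrates why each additional "nine" in an SLA is so much more expensive to deliver.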


It's important to look beyond the metrics that might be specific to VoIP and video. You must also monitor more traditional aspects of the network, including loss, delay, jitter, and general availability. These measurements give a common basis of comparison across data, voice, and video services. In a collaborative network environment between end-user workgroups and the service delivery group, the SLA can be used as a tool for continuous improvement as metrics evolve over time to reflect actual network requirements and performance capabilities.

The right tools for monitoring and managing your evolving corporate multimedia networks move far beyond the mindset of the Internet as plumbing, or as a simple common architecture for delivering any traffic type. Corporate management must understand, and appropriate, the necessary funding to support specialized tools that maintain the health of the converged network. Without proper funding of personnel and systems, the IP network may fail, with dire consequences for day-to-day operations.
Beyond Implementation: Operational Support

One of the implementation dangers in integrating network services is the inclination to reduce staff without the necessary skills transfer. Organizations that are overly focused on cost reduction run the risk of embracing staff cuts followed by the creation of services that remaining staff can't effectively support. Many organizations have moved forward on the voice integration path with the IP network staff leading the initiative. This can lead to gaps in the vision of telephony services and elimination of the traditional telecom support team. This is a risky proposition, and the business impacts should be considered at every turn.

It's surprisingly easy to overlook the specialized skills that enterprise telephony engineers bring to the table. IT network engineers often don't fully appreciate the workings of Erlang-B calculations and the importance of traffic engineering. These technical resources are also the ones who design interoffice trunking facilities, understand call center requirements, and build the organization's automatic call distribution groups, hunt groups, and call pickup groups. It's important to remember that although providing these services over IP may be a new approach, these are still core voice services necessary for daily business.

It's important to partner voice and data specialists in support of new unified communications services. This approach will let you make the best use of every technical resource within the organization. Don't make the mistake of allowing key institutional knowledge about your services, your network, and your requirements to escape through oversight while focusing on cost reduction. In the unified communications network, reductions in cost are won over time; they may not come immediately.


Even Help desk support for telephones is generally viewed as something simple. We assume that everybody knows how to use a phone. In the integrated data, voice, and video network, simple telephony features may well be delivered as they were in the past. It's more likely, though, that the integrated system will bring new facets of data workstation feature management into play. For example, one common feature in VoIP systems is to provide a way for users to manage their own telephone features. Button configuration and speed dial lists may now be managed via a Web interface from the desktop. Some method of oversight, whether through management and monitoring or through specialized vendor-provided tools, will help simplify management of even the simplest changes.

Remember that services converging onto a single infrastructure also mean the corporate Help desk may become the primary support point for all new services. Help desk staff may begin receiving questions they've never dealt with previously, and they too may be unfamiliar with the new integrated services. It's crucial that all support staff, including the Help desk, get the training and the management and monitoring tools they need to support daily business operations effectively.

Pros and Cons of Rolling Out Your Own Management Platform

There are distinct advantages in custom-building an enterprise-specific management platform. When an enterprise opts to utilize a suite of managed services, the service provider has some commitment to meet established expectations. In the do-it-yourself approach, you become your own service provider. You might be an internal service provider delivering services to internal departments, divisions, regions, and employees, but end-user expectations remain the same. An organization embracing the do-it-yourself approach must implement not just the services but also the tools for measurement, monitoring, and analysis that will ensure service availability.
Vendors and manufacturers often provide the most accurate, most granular, and what might appear to be the most desirable management systems embedded within their solutions. These integrated service hardware and software solutions commonly provide a combination of syslog and SNMP-like monitoring. These tools can provide both real-time monitoring and event correlation capabilities. Many vendors also provide product-specific, proprietary tools to aid in management and monitoring. It's important to identify these tools early in the process. Some may be provided freely as part of the product suite. Some vendors will offer yet another management suite to ease the process. Again, it's important to recognize the costs of adding more management tools to a service that seems to be growing in complexity daily.

Third-party tools for modeling, monitoring, measuring, and managing are widely available and can provide incredible value to service managers. Some provide broad visibility across a variety of platforms, bringing insight into network performance aspects that far exceeds human capacity to observe or analyze. Some of these third-party tools are commercial products; some come from open source libraries. On the plus side, there is a wealth of available resources. On the negative side, the onus remains within the enterprise to select the right tools to manage, monitor, and analyze every facet of the services provided to ensure availability and sustainable service quality.


Freeware and Open Source vs. Commercial Products

When taking the do-it-yourself approach, don't overlook the sophisticated tools already embedded in the routers, switches, gateways, end systems, and other components of the integrated network. There is a feature-rich toolset built into the products your organization will use. Although manufacturers may provide the most accurate tools embedded within their hardware, those tools raise a different area of concern. Tightly coupled vendor solutions provide great management granularity and detailed analytical capacity. However, unless the entire network and all services are provided by a single manufacturer, this granular approach may be unable to provide a holistic view of broad services. Manufacturer-provided, proprietary tools certainly have their place. They often prove invaluable when optimizing a network or when troubleshooting specific system problems. But they may lack the broader, more standardized, and less platform-specific capabilities of systems provided by third parties.

Open source and freeware solutions present a unique set of challenges. Although the price may be right (they're often free), there may be no support available. This can lead to enterprise engineers providing their own support, developing their own patches, and creating their own performance tweaks. In time, open source software and freeware can morph into a variation on in-house software development.

Another consideration is the security of these solutions. Although an enterprise might build a level of trust in a vendor providing commercial products, freeware and open source tools essentially come from often unknown sources. In many cases, this code has been vetted by hundreds or thousands of well-qualified independent developers. The open availability of the source code means that many people can contribute, each incrementally improving and crafting the code.
However, whether appropriate secure coding practices were used in their development may remain completely unknown. And although the danger of these tools being purposefully developed with malicious code elements nested inside has not proven common, these are management platform tools that support daily enterprise operations. Can you trust the crown jewels of the enterprise service network to unsupported software written by unknown third parties? It's important to weigh the value against any and all potential risks.

Integrated Commercial Solutions

An organization that selects a managed service approach may be able to ignore many of the technical details of the inner workings of service delivery systems. These enterprises may simply need to check the pulse and take the temperature of the integrated services network. Often service providers will provide either access to monitoring tools or the tools themselves as part of the managed service offering. These tools may simply provide a snapshot of the network status at a given point in time. Some tools may provide a more detailed, near-real-time view of the health of the overall service. It's important to review SLAs, as these tools may only provide a customer view that might not correlate directly with the service delivery requirements identified in the SLA.


Multimedia networks supporting voice applications are complex systems that require sophisticated monitoring intelligence. Although individual manufacturers' monitoring tools offer a partial solution, they cannot account for the granular nuances of monitoring and managing a complex, multi-vendor service environment. Generic monitoring and management tools are available, but generally insufficient. To truly be effective, third-party tools have to be assembled with some knowledge of each of the individual manufacturers' systems. This may be the only way to obtain a comprehensive, system-wide, end-to-end view of the converged services network. The following list highlights key areas for monitoring and management:

- Pre-deployment simulation
- Manufacturer-specific monitoring and management
- Real-time business views
- Call detail records
- Calls in progress
- Delay-to-dial-tone rates
- BHCAs
- Busy Hour Call Connects (BHCCs)
- Gateway channel utilization and loading
- Real-time call monitoring
- Phone and multimedia device availability and monitoring
- Poorly performing components
- Service level breaches and SLA compliance
- Real-time interface to a manager of managers
- Summary and exception reporting
- Utilization trends over time
- Managed devices by company, department, and location
- Asset tracking
- Capacity planning
- Incoming and outgoing calls
- Loading by route pattern, route group, route list, and gateway
- Bandwidth utilization
- Delay and delay variation
- Packet loss
- Route patterns, utilization, and availability

Using the right mix of sophisticated third-party tools will provide a toolset for documenting results consistently while eliminating any vendor bias.

Pros and Cons of a Packaged Management Platform

As noted earlier, even organizations choosing the do-it-yourself approach will still have access to the sophisticated tools embedded in network equipment. There are still other tools that can provide insight into the big picture of service management and aid in troubleshooting granular problem areas. Managed service providers will generally allow end-user organizations, their customers, to view a thin slice of the network for performance monitoring. These views pertain directly to a specific customer organization. Service providers often call this service feature Customer Network Management (CNM); it allows a customer to see their own part of the larger, shared virtual network. This approach has been in wide use for at least two decades. Early CNM systems were periodic summary snapshots showing sets of statistics that bore little resemblance to the real world or the SLAs. Today, providers offer services that are much closer to real-time views, with granularity down to the individual connection. In some cases, they allow customers to adjust certain characteristics, such as CoS or available bandwidth, in real time.

Summary
Without an NMS focused on the integrated data, VoIP, and video services, companies will find themselves in the dark, lacking the information needed to support service assurance. VoIP services that lack management will be prone to service delivery and quality problems that cannot be traced to any specific network element or service delivery metric. The appropriate QoS and network performance metrics require constant management, monitoring, and analysis to ensure acceptable service delivery.

Integrated services introduce new complexities as well as opportunities to simplify the process of moves, adds, and changes to the network. In the past, employees could often move their workstations easily, using Dynamic Host Configuration Protocol (DHCP) to retrieve data network setup parameters. Telephones have always represented a more complex move process, requiring PBX reprogramming and intercession by a telephone services administrator; convergence to an integrated IP service network may simplify this process. VoIP solutions may also provide new productivity tools to remote workers, but again, management, monitoring, and analysis of network services are crucial to the continuity of daily business operations.

Strategic new business applications appear on the horizon every day. We see network services such as voice and video beginning to couple tightly with enterprise resource planning and customer relationship management systems. With a better understanding of how management and monitoring play into the life cycle of integrated services, we have the foundation to move forward to the next chapter, in which we'll explore managing service availability and capacity planning for converged services.


Chapter 7: Effective Service Availability Management and Capacity Planning


Life cycle management is crucial to the sustainability of any network. Deploying data, voice, and video services is all well and good, but long-term success requires a repeatable methodology for consistent measurement and planning. This chapter takes a look at an approach that has been used for many years in the legacy telecommunications industry called FCAPS. The name is an acronym for fault, configuration, accounting, performance, and security management. We look at FCAPS because it has been tightly coupled with managing large voice and data networks for many years. Later, the chapter will touch briefly on the IT Infrastructure Library (ITIL) framework. These two well-known methodologies dovetail nicely as a foundation for managing the life cycle of the network. This chapter will delve into availability management, network optimization, and capacity planning issues as the primary focal points. We'll explore these areas using the FCAPS model as a base for methodical techniques in delivering integrated services.

Network optimization is a vital part of life cycle management. Optimization supports both availability management and capacity planning. It's an ongoing process and aids in keeping network costs in check. Network optimization can also help in managing the demands of business users, who might be rapidly stretching the capabilities or capacity of the network as they deploy new applications. Optimization includes trending, capacity planning, and ongoing minimization of infrastructure costs. By decommissioning legacy services as they are no longer needed and continually monitoring and maximizing the success rate of data, voice, and video sessions, we are, in a sense, future-proofing the network. This holistic approach to network management also provides investment protection by continually evaluating network performance and service delivery throughout the entire life of the network, maximizing the usable life of equipment and leveraging technologies for full value.
Managing the life cycle includes continuous evaluation of both the return on investment (ROI) and the return on effort (ROE). An important part of this process is being able to perform all the following:

- Maximize business potential through performance and availability management
- Learn to decipher results from performance and availability testing
- Use lessons learned from testing to better utilize system resources; this includes assessing capacity and application upgrade needs
- Recognize the risk factors of not testing network applications and capacity, including slowdowns in network response time for your customers or users
- Learn how to monitor, analyze, and prioritize your business and network management needs to ensure a responsive suite of converged data, voice, and video services
- Prepare for the convergence of performance and availability management


These skills often aren't put to use as part of the general IT staff's daily routine. These tasks may actually dilute the focus of the IT group from the daily core business and operational requirements. Balance is crucial so that no single aspect of service delivery, management, or assessment consumes excessive resources. These resources are often viewed as network resources, but human brainpower and time must also be taken into consideration. Comprehensive network management requires the effective use of people and resources; business processes; technology tools and products; and vendors and service providers. It's a never-ending effort to provide reliable and consistent service levels for an array of converged services to meet end-user and customer expectations.

Introducing FCAPS: A Sustainable Model for Balanced Network Management


As data, voice, and video services converge on the IP network, it's prudent to examine and incorporate the lessons learned by the legacy telecommunications carriers. In many cases, these techniques have also been embraced by the major Internet service providers (ISPs). The telco carriers have been successfully delivering voice services over the Public Switched Telephone Network (PSTN) for more than a hundred years. Although new unified communications technologies such as VoIP, video, and mobile voice are converging to shake up the landscape of telephony, many long-standing service delivery practices are easily adapted to fit today's integrated networks.

The telecommunications industry adopted and followed the Telecommunications Management Network (TMN) framework, originally defined by the International Telecommunication Union (ITU at http://www.itu.org). Within this large management framework, the general management functionality offered by telecommunications systems is split into five key areas referred to as FCAPS. FCAPS is the International Organization for Standardization (ISO) model for network management. It provides a standards-based foundation for network management.


[Figure 7.1 depicts the layers of the TMN framework: a service layer, an end-to-end layer, a network layer composed of management domains, and an Ethernet layer.]

Figure 7.1: The TMN framework.

Digging inside the TMN, the ISO has provided FCAPS as a framework for the task of network management. All forms of management involve elements of monitoring and control, and the management of service networks is no exception. For each aspect of network management, there is a life cycle of some sort, corresponding to analysis, design, construction, monitoring, and back to analysis again, endlessly repeating. There are several areas of mission-critical business management that FCAPS overlooks. The FCAPS model is centered on the technical management of the service network. Daily operational management will include people and staffing issues, finance and purchasing issues, process documentation, and a number of other supporting roles. The network manager's day is filled with a variety of tasks and workflows. FCAPS simply provides one network management approach for that aspect of the larger role of a network manager.


- Fault Management
- Configuration/Change Management
- Accounting/Asset Management
- Performance Management
- Security Management
By isolating the management challenge into distinct areas, the FCAPS model allows us to conceptualize solutions that make the most sense for the challenges unique to each functional area.

Figure 7.2: The OSI FCAPS model.

Fault Management

Every network will encounter faults during operation, originating in equipment, cabling, human error, or software. The concept behind fault management is formalizing a repeatable process to identify these continually occurring faults. Faults may be recurring events that are easily recognized through repetition, or they can be discrete events that occur rarely. Smooth network operations require timely isolation and resolution of faults as they occur. A network management system, usually deployed inside the Network Operations Center (NOC), provides both trapping and logging of network faults.
Chapter 6 reviewed some of these techniquesusing SNMP for trapping and logging system events.

Fault management is not about resolution or restoration of service. Fault management is more a process of intelligence gathering related to network performance. Faults impact performance, and a high fault rate results in impaired or degraded network service delivery. Although mitigation of problems and restoration of service must occur in a timely fashion, the fault management flow is oriented to identification and isolation first. Fault resolution and service restoration might employ workaround procedures when the root cause of a particular fault precludes comprehensive correction in a quick and timely manner.
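The identification-and-isolation flow described above can be sketched in a few lines. This is a hypothetical illustration, not any vendor's NMS logic; the event fields and the three-occurrence threshold for "recurring" are assumptions chosen for the example.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class FaultEvent:
    """A normalized fault record as an NMS log might carry it (fields assumed)."""
    source: str   # network element reporting the fault
    code: str     # fault type, e.g. "LINK_DOWN"

def classify_faults(events, recurring_threshold=3):
    """Separate recurring faults (seen repeatedly) from discrete one-offs.

    Fault management is identification and isolation first: grouping raw
    events lets operators spot chronic problems before chasing single alarms.
    """
    counts = Counter((e.source, e.code) for e in events)
    recurring = {k: n for k, n in counts.items() if n >= recurring_threshold}
    discrete = {k: n for k, n in counts.items() if n < recurring_threshold}
    return recurring, discrete

# Example: a flapping trunk interface shows up as a recurring fault.
log = [FaultEvent("gw1", "LINK_DOWN")] * 4 + [FaultEvent("sip2", "REG_FAIL")]
recurring, discrete = classify_faults(log)
```

A real NMS would feed this kind of grouping from SNMP traps and syslog rather than an in-memory list, but the separation of chronic from one-off events is the same.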
Chapter 8 will explore fault management in greater depth.


Configuration Management

Configuration management is a workflow centered on data gathering and storage. Configuration files and data are collected from network elements. These network elements in the traditional IP network include routers, switches, servers, and so on. In the unified communications network, configuration management is also concerned with the configurations of gateways to other networks, gatekeepers that authenticate and grant registration access to users, SIP signaling servers, media servers, and even telephone sets. Knowing where things are deployed in the network and how they are configured is central to managing the total enterprise network. Many day-to-day operational problems are related to improper configurations of one sort or another. Mistyped IP addresses or network masks are a common problem. Patches or updates to OSs, antivirus programs, and business software create gaps in the integrity of the network. You need to monitor and manage all aspects of configuration, including:

- IP addresses
- Subnet masks
- DNS settings
- Frame size
- Directories
- Disk drives
- Drivers for network cards
- Video cards
- User groups and identities
- Domain structures
- Applications

The list goes on and on. Chapter 8 will explore configuration management in more detail.
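One small piece of configuration management, detecting drift from a stored baseline, can be illustrated with a hash comparison. The device names and configuration fragments below are invented for the example; a real system would pull running configurations from the network elements themselves.

```python
import hashlib

def fingerprint(config_text: str) -> str:
    """Stable hash of a device configuration, ignoring trivial whitespace."""
    normalized = "\n".join(
        line.strip() for line in config_text.splitlines() if line.strip()
    )
    return hashlib.sha256(normalized.encode()).hexdigest()

def detect_drift(baseline: dict, current: dict) -> list:
    """Return devices whose running config no longer matches the baseline."""
    return [dev for dev, text in current.items()
            if fingerprint(text) != baseline.get(dev)]

# Hypothetical gateway with a mistyped IP address -- exactly the kind of
# day-to-day configuration problem the text describes.
baseline = {"gw1": fingerprint("ip address 10.0.0.1\nmask 255.255.255.0")}
current = {"gw1": "ip address 10.0.0.2\nmask 255.255.255.0"}
drifted = detect_drift(baseline, current)
```

Hash comparison only tells you *that* something changed; a production tool would then diff the stored and running configurations to show *what* changed.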


Accounting/Administration/Asset Management

Comprehensive accounting and effective control can save money and improve the allocation of scarce network resources. How do you know whether an upgrade is needed without measuring usage? Factors that might be measured include printer pages, use of disk space, processor time, network bandwidth, use of resources (for example, the Internet), and so on. Choosing an appropriate metric is often tricky. For instance, do you measure the number of pages or the amount of ink used for printer accounting? The latter is where the cost is. Accounting itself costs money and should be used only where appropriate. Costs can include computer resources (such as disk space, processor time, and network bandwidth) as well as the time and effort spent enforcing restrictions and collecting funds. A large portion of the cost in your legacy phone bill is taken up by the cost of collecting it.

The FCAPS model evolved from the legacy commercial telecommunications industry, in which customer billing for minutes of use is a core component. In the FCAPS model, early documentation focused on these traditional customer billing, or call accounting, systems. Efforts to mirror this accounting mindset in data networks have been implemented through tools that collect usage statistics tied to disk space, CPU utilization, minutes of use, bandwidth, or some other metric. Bandwidth has become a simple metric that's widely used. Tools implementing protocols such as RADIUS and TACACS are commonly deployed in enterprise networks to monitor usage. For most enterprises, auditing, rather than accounting and billing, is the primary use for these monitoring systems. In enterprise networks whose core business is something other than telecommunications or voice service delivery, administration, asset management, or auditing may provide a more accurate frame of reference for this piece of the puzzle.
Administration objectives include managing a set of authorized users by establishing user IDs, passwords, and permissions. Administration can also include support of systems and equipment, such as performing software backups and synchronizations between systems or network elements. Asset management might simply encompass the financial aspects of capital investment and system replacement as asset costs are amortized off the corporate books. Beyond physical hardware assets, many enterprises adapt their asset management efforts to include software licenses and oversight of version control. Auditing, in the enterprise world, can range from simple usage auditing to enterprise policy management. With the widespread focus on regulatory and compliance requirements, such as the Sarbanes-Oxley Act (SOX), the Gramm-Leach-Bliley Act (GLBA), and the Health Insurance Portability and Accountability Act (HIPAA), audit processes often expand. Individual business sectors may have additional auditable compliance requirements of their own.
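A minimal sketch of the auditing-oriented accounting described above might simply aggregate usage records by department and metric. The record layout and department names here are assumptions for illustration; production data would come from RADIUS/TACACS logs or call detail records.

```python
from collections import defaultdict

def summarize_usage(records):
    """Roll raw usage records up to per-department, per-metric totals.

    Each record is (department, metric, amount) -- for example, bandwidth
    in megabytes or minutes of use. In an enterprise, auditing rather than
    billing is the usual goal, so a simple aggregate often suffices.
    """
    totals = defaultdict(float)
    for dept, metric, amount in records:
        totals[(dept, metric)] += amount
    return dict(totals)

# Hypothetical usage records for two departments.
records = [
    ("sales", "bandwidth_mb", 120.0),
    ("sales", "bandwidth_mb", 80.0),
    ("eng", "minutes_of_use", 42.0),
]
totals = summarize_usage(records)
```

From here, the same totals could feed an internal chargeback report or simply an upgrade-planning spreadsheet.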


Performance Management

Performance management is the aspect of network management that looks solely at performance data. Continual monitoring of network health requires collection of performance data. Although SNMP might be used for monitoring faults and errors as part of fault management, performance management monitors interfaces on network elements, throughput, responsiveness, and holistic performance health. In a converged network that delivers unified communications services, performance management drives the monitoring of trends in network performance and utilization. These trends can raise the visibility of network capacity and reliability issues before they impact delivery of the suite of unified communications services. Performance thresholds can be set to trigger an alarm, which might then be handled through the fault management process.

Closely allied to fault management, performance management involves measuring everything from the number of bits sent per second to the speed of transactions to the refresh rate of a monitor. Understanding the things that can affect performance is a technical issue, but knowing what to monitor and why is a management function. You need to establish baselines and identify bottlenecks, the slowest points in a system. A lot of money can be wasted attempting to improve a system if you do not know which part is making it slow.
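Baselines and thresholds can be illustrated with a short sketch: establish a baseline from historical samples, then flag measurements far outside it as candidates for an alarm. The three-standard-deviation rule and the sample delay values are assumptions for the example, not a recommendation.

```python
import statistics

def build_baseline(samples):
    """Baseline = mean and standard deviation of a historical metric."""
    return statistics.mean(samples), statistics.stdev(samples)

def breaches(samples, mean, stdev, k=3):
    """Flag samples more than k standard deviations above baseline --
    candidates for an alarm handed off to the fault management process."""
    return [s for s in samples if s > mean + k * stdev]

# Hypothetical one-way delay history in milliseconds.
history = [20, 22, 21, 19, 20, 21]
mean, stdev = build_baseline(history)
alarms = breaches([21, 20, 95], mean, stdev)   # only the 95 ms spike trips
```

Real performance managers use richer models (time-of-day profiles, percentiles), but the baseline-then-threshold pattern is the same.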
See Chapter 5 for more detail about performance management.

Security Management

The main objective of security management is to identify and mitigate risks. In the service network, security helps:

- Maintain legitimate use of corporate resources
- Maintain confidentiality of enterprise intellectual capital
- Ensure data integrity
- Provide consistent auditability

Security is a big issue and always comes with attached costs. These costs can involve extra processing (for example, encryption and authentication), extra administration (for example, setting up user IDs and monitoring for security breaches), extra hardware (for example, firewalls and fiber-optic cable instead of easily tapped copper wires), and a host of less tangible elements, such as inconvenience, that might impact the ability of users to work effectively.
Chapter 9 is devoted to network security risk and incident management.


FCAPS Simplified

Today, this FCAPS model exists in many forms and flavors in business network management. As defined by the ITU, it is more complex than necessary for many organizations. It's important to remember that FCAPS was designed to support the telco carrier networks. It may be the best choice for a VoIP service provider but not for a health care or financial institution. Rather than undertake the complexity of a full-blown FCAPS methodology, many organizations implement a four-layer approach for converged network service assurance focusing on:

- Performance management/availability management: guarantee availability and performance of the network infrastructure, VoIP, and other application or business services
- Security management: ensure protection from risks and response to security-related incidents
- Configuration and vulnerability management: manage network configurations to ensure compliance with corporate guidelines; this can be an enforcement point to ensure regulatory compliance where applicable
- Change control management: ensure that only authorized and appropriate changes are made to the network; audit controls provide knowledge-based tracking of every change

You've often heard the acronym TCO, for total cost of ownership. Throughout, this guide touches on four facets of holistic management for the converged services network:

- Operational integrity
- Service management
- Policy compliance
- Risk management


Performance Management/Availability Management
- Operational integrity: monitor system and network performance
- Service management: map performance indicators to business issues
- Policy compliance: measure SLAs and compliance
- Risk management: forecast and avoid service interruptions

Security Management
- Operational integrity: continually monitor the threat environment
- Service management: correlate security incidents with performance issues
- Policy compliance: protect audit trails; monitor and report security violations
- Risk management: identify, mitigate, and respond to security incidents

Configuration and Vulnerability Management
- Operational integrity: identify vulnerable or exploited systems
- Service management: benchmark and baseline system configurations
- Policy compliance: ensure baselines follow corporate standards and policies
- Risk management: identify, assess, document, and remediate or accept risks

Change Control Management
- Operational integrity: ensure no unauthorized changes are made to the operating environment
- Service management: delegate administrative change permission only to approved staff
- Policy compliance: audit logs for all changes to the production environment
- Risk management: test changes before implementation to minimize risk

Table 7.1: A four-layer variation on FCAPS.

The various capabilities identified in Table 7.1 can all be implemented at once. Many companies today follow the ITIL framework as a root model. The ITIL framework is a series of industry best practices: a set of documents that focus on the core business areas of IT service management (ITSM). ITIL is referred to as a library because it has been published as a book series. ITIL and IT Infrastructure Library are registered trademarks of the UK Office of Government Commerce (OGC), where this concept took root. The library has evolved to encompass a wide set of best practices.

When you think about adopting best practices, one simple view makes their benefit clear: best practices provide a practical substitute for conducting comprehensive risk assessment and analysis. In most cases, businesses can leverage the shared knowledge gained by others, lowering the cost of progress by avoiding the high cost and effort associated with comprehensive risk analysis.

The ITIL framework supports the wide field of IT infrastructure, development, and service delivery. One facet of ITIL is referred to as the maturity model. When organizations embrace ITIL, they typically conduct a series of assessments of their own process maturity. Most companies choose to work methodically toward a mature model that supports all the enterprise business needs. They adapt and refine along an evolutionary path and adopt new methods and techniques along the way.


Optimization for Service Availability and Capacity


Approaching service management, availability, and capacity planning from the traditional carrier telco provider's perspective wasn't necessarily easy early in the evolution toward the converged networks of today. For most businesses that were early adopters of new technologies such as VoIP, it seemed much simpler to add to the existing network and grow incrementally. This controlled growth can be a good thing from an investment and impact perspective. The danger is in cobbling together disparate pieces and continuing to add and grow without stepping back to take a larger view of what is happening in the network. The performance envelope discussion in Chapter 5 provides one technique for protecting against growth pains through methodical evaluation.

The larger the enterprise organization, the closer the proximity to the carrier's mindset. Many large enterprises, especially in the Fortune 1000, find that over time they become their own telephone company. They have become their own data network provider. These organizations provide internal voice and data services to more employees than many small telcos and ISPs do. And they add the complexity of business processes to support their core business, which might be financial, health care, transportation, or some other facet of business.

[Figure 7.3 shows a network management system raising alarms into a tracking database, a workflow handling customer trouble tickets, and operational support systems housing customer records, an inventory database, service level agreements, and billing systems.]

Figure 7.3: The typical tools.


Figure 7.3 shows the typical tools of the voice and data network service delivery business. Network management systems monitor the service network, producing logs and alarms that feed into databases. Some form of trouble resolution mechanism or Help desk tracking is needed. Operational Support Systems (OSSs) provide repositories for other information: customer records, inventory, contracts or SLAs, and billing. The workflow cloud in the figure represents all the rest of the core business and includes the processes and procedures around service delivery in the corporate environment.

The key is in scaling the methods described here to the size and scope of your business. Customer records alone can vary from a major Customer Relationship Management (CRM) system running on widely distributed servers, to a central contact management system in a midsized organization, to an individual salesperson's customer database running on a desktop. For small businesses, the contact list in the salesperson's cell phone might be the customer database. At some level, every component the carriers deploy to manage a global telecommunications network correlates to similar functionality in every enterprise organization.

Trending and Capacity Planning

This guide is really aimed at service delivery organizations, whether they are a VoIP service provider, the IT group in a small to mid-sized business, or the Chief Information Officer (CIO) in a Fortune 100 corporation. Every business depends on business intelligence for survival and success. A technology organization inside a large enterprise that is in the business of delivering services to that enterprise has just as great a need for business intelligence as a provider of services in the open market. Monitoring trends and usage provides the supporting data for ongoing network expansion and upgrade.
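As a toy illustration of trending, a least-squares slope over monthly utilization samples gives a first-cut forecast of when a link will hit its planning ceiling. The 80 percent ceiling and the sample data are assumptions for the example; real capacity planning would also weigh seasonality, feature licensing, and business plans.

```python
def months_until_full(utilization, capacity_pct=80.0):
    """Fit a straight line to monthly utilization samples (percent) and
    estimate how many months remain before the planning ceiling is hit.

    A least-squares slope over the observation window is a crude but
    common first cut at capacity trending.
    """
    n = len(utilization)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(utilization) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, utilization)) \
            / sum((x - x_mean) ** 2 for x in xs)
    if slope <= 0:
        return None   # flat or shrinking usage: no exhaustion forecast
    return (capacity_pct - utilization[-1]) / slope

# Utilization growing about 5 points per month from 40%:
print(months_until_full([40, 45, 50, 55, 60]))   # → 4.0
```

The point of even a crude forecast is timing: it turns "we will need more bandwidth" into "we need the upgrade ordered within four months."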
Although a NOC might focus on monitoring for availability, uptime, and performance, the capacity planning staff needs comparable data. It's not enough to manage utilization of trunks and network connections. Capacity planners need to know what features are being used, too. Feature sets such as call hunt groups, call pickup groups, and automatic call distribution groups may be tied to levels of licensing. Given that many VoIP systems are now servers running on general-purpose hardware platforms, software licensing and upgrades must be considered in the capacity planning process.

Bandwidth

Bandwidth has become, for many, the holy grail of networks. The consumer end of the spectrum has evolved from very slow dial-up connections to broadband in the home. The enterprise network has gone from fractional T-1 circuits over frame relay to 100Mbps switched Ethernet to the desktop and Gigabit Ethernet in the core. Large enterprises may even use SONET technologies to provide even greater bandwidth across a wide area network (WAN). Dark fiber deployment around the country continues to increase. Wireless technologies are spreading throughout municipalities and are quickly becoming prevalent in campus, warehouse, and office deployments.


In the PSTN and the Internet, high-bandwidth circuits have carried the backbone traffic for years. Unified communications technologies have, over time, pulled the two clouds closer and closer, to the point that today the lines of distinction between the two blur. This is a challenge for the telecommunications and Internet providers, as they've often used bandwidth and transport technology as a differentiator. Customers and end users today care less about technology and more about solutions. As described earlier, what matters is that sales force automation, CRM, and other systems simply integrate and work. Falling costs of fiber and higher-speed Ethernet switches and routers, along with advances in optical technology, have driven the cost of higher-capacity connections down to some of the lowest dollars-per-bit prices ever imagined. In terms of VoIP deployments in particular, this low cost can motivate change and is something that needs to be evaluated, designed for, and monitored throughout the life cycle of the network service.

In the past, enterprises deployed Private Branch eXchange (PBX) systems for their internal corporate voice services. These systems were typically connected to the PSTN via T-1 circuits carrying 24 voice channels. An ISDN Primary Rate T-1 was the most common PBX connection.

[Figure 7.4 illustrates four stages of PSTN and Internet evolution: separate networks; touching networks with early dial-up access; converging networks with integrated communications; and the fully converged future network with complete service transparency.]

Figure 7.4: Network convergence evolution.

This is a good point to revisit the network convergence evolution from Chapter 1. Over time, completely separate networks, the PSTN and the Internet, have evolved to become tightly coupled. Today, they complement one another to deliver an array of data, voice, and video services.


This tight coupling of the PSTN and the Internet has led to choices in service delivery design. When providing voice services, one decision will be selecting the appropriate trunking technology between networks. For enterprises that continue to maintain a traditional PBX, T-1 trunks to the PSTN might represent the best solution. Some organizations will implement gateways that convert from the IP network in the enterprise to the PSTN via T-1 circuits. As enterprise networks move forward to integrate communications services, SIP is quickly gaining momentum as a peering and trunking technology between voice service networks. SIP trunking may introduce some new costs, such as Session Border Controllers (SBCs), which provide gateway and firewall services between service networks. Even so, SIP trunking is rapidly proving to be very cost effective. Bandwidth comes into play because SIP trunks are generally IP network connections. If the cost of a 10Mbps or even a 100Mbps Ethernet IP link is significantly lower than a T-1, many more voice calls can be carried at a fraction of the cost between connected networks. A legacy voice T-1 circuit carried 24 phone calls; IP links can carry many, many more.
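The call-count difference can be estimated with simple arithmetic. The sketch below assumes G.711 with 20 ms packetization over Ethernet (roughly 87.2Kbps per call, per direction) and reserves a quarter of the link for signaling and other traffic; both figures are planning assumptions, not measurements.

```python
def calls_per_link(link_kbps, per_call_kbps=87.2, voice_share=0.75):
    """Rough count of simultaneous calls an IP link can carry.

    per_call_kbps assumes G.711 at 20 ms packetization over Ethernet;
    voice_share reserves headroom for signaling and data traffic.
    """
    return int(link_kbps * voice_share / per_call_kbps)

t1_calls = 24                    # a legacy voice T-1 is fixed at 24 channels
ten_meg = calls_per_link(10_000)      # 10 Mbps Ethernet → 86 calls
hundred_meg = calls_per_link(100_000) # 100 Mbps Ethernet → 860 calls
```

Even with conservative headroom, a 10Mbps link carries several times the call volume of a T-1, which is the economic argument behind SIP trunking.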
[Figure 7.5 shows a softswitch with signaling and trunking gateways connecting to the PSTN over T-1 trunks, SIP trunks into a managed IP/ATM network, an integrated access device on a broadband access link, and a VoIP server reached through the Internet.]

Figure 7.5: The variety of convergence options.

A glance at Figure 7.5 reveals the variety of connection types coming into play as technologies converge while they're evolving. In some cases, separate signaling and trunking gateways are apparent. In other cases, they'll be integrated into a single softswitch.


SIP trunking may be used between what the data environment thought of as extranet business partners. For example, a health care provider might implement SIP trunks to connect to insurance carriers and pharmaceutical providers as integral partners. SIP trunking is also on the rise as a connection to both traditional telecommunications carriers and other voice and video service providers. Like every other key component of the converged service network, bandwidth must be monitored and managed to ensure service levels can be maintained.

Ports/Lines

Ports and lines are a primary consideration during the initial design phase of any VoIP network. The number of trunk ports and telephone sets a system can support is rarely overlooked. Capacity planning needs might seem simple but can be deceptive. The service delivery team must account for business plans as well as technical plans. Mergers and acquisitions can radically alter the number of trunk ports or telephone sets a company needs. It's vital that the technology service delivery organization participate in business planning, or at least have a clear vision of where the enterprise is moving, in order to support the wide range of business needs.

Codec Planning

Chapter 5 explored codecs, primarily for their role in network performance. Codecs directly impact the fidelity of the voice that the end user hears. Codecs have other impacts as well. Remember that you hear an analog voice sound. The process of compressing, digitizing, and packetizing voice requires hardware, and the choice of codec can drive CPU utilization up or down. Codec selection also impacts the bandwidth requirements inside the service network. Pulse Code Modulation (PCM) was designed for the 64Kbps voice channel of the T-1 architecture of the PSTN. However, the G.729a (CS-ACELP) codec, a rising star in VoIP solutions, can be optimized to produce a 6 to 8Kbps bit rate.
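The bandwidth effect of codec choice is straightforward to compute once per-packet header overhead is included. The sketch below assumes RTP, UDP, IP, and Ethernet headers totaling 58 bytes per packet and a 20 ms packetization interval; both are common values, but your link layer and packetization settings may differ.

```python
def per_call_kbps(codec_kbps, ptime_ms=20, header_bytes=58):
    """Per-direction IP bandwidth for one call.

    header_bytes = RTP(12) + UDP(8) + IP(20) + Ethernet(18) per packet.
    Packets per second follow from the packetization interval (ptime).
    """
    pps = 1000 / ptime_ms                          # packets per second
    payload_bytes = codec_kbps * 1000 / 8 / pps    # codec payload per packet
    return (payload_bytes + header_bytes) * 8 * pps / 1000

g711 = round(per_call_kbps(64), 1)   # G.711 (PCM) → 87.2 kbps on the wire
g729 = round(per_call_kbps(8), 1)    # G.729a → 31.2 kbps on the wire
```

Note how the nominal 8-to-1 payload advantage of G.729a shrinks to roughly 2.8-to-1 on the wire once fixed per-packet headers are counted.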
Although the real world doesn't generally produce a true 8-to-1 reduction in required bandwidth, this codec can dramatically impact bandwidth requirements. It's a good idea to regularly re-evaluate the codecs in use. One practice that is becoming more widespread is an annual codec review with an eye to quality of service for end users. Although converting to a different codec seems labor intensive, it may be one approach to improving quality if the network is becoming saturated.

Optimizing the Infrastructure

Chapter 5 looked in depth at the performance envelope approach in the context of network readiness assessment. Just as the life cycle is continuous, so is the evaluation or assessment process.


Optimizing Hardware

In legacy PBX technology, the life cycle of an investment was commonly projected to be 10 years. In IP networking, the refresh cycle has become much shorter. Repair is most commonly viewed as trouble resolution. For PBX technology, it often means swapping out parts. The same might remain true for a VoIP call processing system today, especially for mechanical component failures: disk drives and fans will fail, power supplies will burn out, and hardware will require repair. Another perspective worth considering is that the integration of unified communications services moves off traditional, dedicated, and often proprietary hardware onto what might be viewed as general-purpose systems. Most enterprises standardize on a server hardware platform that supports VoIP services as readily as Web services.

You need to consider more than the legacy telephone system at the time you initially deploy VoIP services. Because voice and data are becoming integrated, the data network life cycle also comes into play. For some organizations, obsolescence is only one driver. There may also be compelling business reasons to migrate to newer technology. As discussed earlier in this guide, the ongoing integration of software applications and network services can make it prudent to move to new technology not when the older systems are at the end of life but when the end is visible on the horizon. Another important consideration is that manufacturers discontinue support for older products every day. Some do so because parts become scarce, making it more difficult to repair systems. In many cases, vendors declare end of life for a product hoping that customers will migrate to newer systems. As systems age, maintenance and support tend to become more expensive. Newer hardware brings new efficiencies in patching and upgrading, which may actually drive support costs down.
Older systems may also simply reach full capacity, with no expansion capability to meet current and future business needs. When evaluating the refresh cycle for systems approaching obsolescence, it's wise to remember that technology won't do you any good if it can no longer support your business needs.

Optimizing Software

The ongoing management process must also encompass software. In the old world of PBXs, software was often proprietary. An upgrade or patch was tested, and businesses deployed it into the production phone system at night or over a weekend when the phones weren't heavily used. VoIP service elements often run on general-purpose hardware, maybe even the same platform as the company's data servers. Software optimization may include modifying configurations, adjusting frame or packet size, or even some esoteric nuance like altering the TCP window size in a registry setting.


Maintenance, Enhancements, Upgrades, and Patches

Many companies understand the need to keep the service processing software up to date but overlook the underlying operating systems (OSs). Whether a voice service element is running an OS from the UNIX/Linux family, the Windows family, or a vendor's custom internal OS, it's important to incorporate a patch management process that ensures timely updates of both the OS and the service software. For a large service network, this will often necessitate a lab environment for testing all upgrades before they are deployed in the production network.

SLA Optimization

Companies that contract with external providers watch SLAs closely to ensure contractual obligations are being met. In the enterprise network, SLAs might only be described in service offerings rather than actual contracts with business units. Internal SLAs may be quite informal. In the enterprise world, VoIP service is often designed to meet specific criteria for successful internal delivery. Characteristics of the quality of service needs, such as delay, jitter, loss, and other QoS provisions, need to be handled just like contractual SLA commitments. As conditions change, the network is regularly tuned to meet these SLA requirements, whether they're contractual or self-imposed to assure service quality. It's also good to remember that the SLA approach isn't just documentation of the minimum service levels being provided. It's also used to describe remedial actions to be taken. For internal service delivery, the SLA can be a process document with specific details for life cycle management. It may also be worthwhile to document what service classes, or bandwidth and QoS commitments, are being delivered to end users, then monitor for over-usage. In the enterprise network, it's common for internal customers (in business units or divisions, for example) to evolve and suddenly consume far more network resources than originally planned.
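Treating self-imposed targets like contractual SLAs can be as simple as checking each measurement sample against the documented thresholds. A minimal sketch, using hypothetical target values (real numbers come from the contract or the internal service definition):

```python
# Hypothetical SLA targets for voice traffic.
SLA = {"delay_ms": 150, "jitter_ms": 30, "loss_pct": 1.0}

def sla_violations(sample, sla=SLA):
    """Return the metrics in a measurement sample that exceed SLA targets."""
    return {k: v for k, v in sample.items() if k in sla and v > sla[k]}

sample = {"delay_ms": 180, "jitter_ms": 12, "loss_pct": 0.2}
print(sla_violations(sample))  # {'delay_ms': 180}
```

Any non-empty result is a candidate for the remedial actions the SLA document describes.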
Although enterprises outside the service delivery sector may have no need for the SLA-driven model, there may be other motivators for the larger enterprise. When SLA-based network monitoring and management systems are fully integrated and automated, they can provide both long- and short-term visibility into overall and service-specific operations. Adopting this service provider mentality turns the SLA into a key tool for successful operation. The cycle of continuous improvement driven by managing SLA metrics can, and should, be modified over time as network objectives and realities change and evolve.

Delay

Delay is simply a reality of IP networking. Delay is cumulative, and all sorts of factors can affect the time it takes an IP packet to traverse the network. In addition to delay due to transmission distance, remember that there are other processing delays. In most enterprise networks, delay is usually a constant factor for any given communication, because traffic tends to follow the same path for everything in a given media stream type.


Call Setup

Delay in VoIP networks can lead to call setup failure. For most implementations, this factor doesn't have an adverse impact. Rather, it is a consideration for ongoing management and monitoring of the VoIP service network.
During Call

Regular monitoring of delay in the network and knowing the operating environment may lead to quicker trouble resolution later. If you recognize that data traffic is bursty by nature and that traffic such as email and Web browsing is not real-time traffic, it's easy to see that these applications might not provide a good early indicator of rising delay during day-to-day operations. VoIP users may provide the canary-in-the-coal-mine sort of early warning system that helps in managing the service as a whole. A change in delay patterns in the network, coupled with routine monitoring and a few calls from users identifying call problems or call setup issues, can quickly help identify problems and lead to mitigation.

Delay Variation

Delay variation is called jitter. Because packets can take different routes across the network, delay variation in large IP networks is common. Jitter is measured in milliseconds. For most enterprise networks, no extra engineering is needed to maintain a 1 to 2 millisecond level of jitter. Large enterprise and service provider networks spanning a large geographic area are more likely to have diverse routes of varying distance, which increases the tendency for jitter. Jitter is generally not a problem during call setup. It's more likely to arise during the media flow associated with a conversation. In other words, jitter is more likely to impact the voice than the signaling. Jitter in voice conversations results in unintelligible, jerky-sounding speech. Jitter buffers can be deployed to reduce this impact, but buffering can only do so much to overcome the impairment. Jitter is managed in a number of ways. Many vendors' equipment provides buffer management algorithms for setting the size of ingress and egress buffers. Beyond buffering at the node level, some vendors include dedicated jitter buffers. These buffers are small, typically holding only a few packets' worth of audio, and they compensate for the timing variations.
These jitter buffers may be static, dynamic, or adaptive. Static buffers are common in older routers, particularly in small branch office or home office devices. Dynamic buffers calculate an optimum buffer size based on the first series of packets received. The most advanced jitter buffers adapt continuously to changing network conditions. Unified communications monitoring tools, in both commercial and open source variations, let enterprises include jitter monitoring in the NOC as part of the overall network health watch.
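Adaptive jitter buffers typically track a smoothed jitter estimate and size the buffer from it. The standard interarrival jitter estimator from RFC 3550 (the RTP specification) can be sketched as follows; the transit-time values are illustrative:

```python
def update_jitter(jitter, transit_prev, transit_curr):
    """RFC 3550 running interarrival jitter estimate (smoothed with 1/16 gain)."""
    d = abs(transit_curr - transit_prev)
    return jitter + (d - jitter) / 16.0

# Transit times (arrival time minus RTP timestamp) in ms for successive packets;
# the 55 ms value represents a momentary network hiccup.
transits = [40.0, 42.0, 41.0, 55.0, 43.0]
j = 0.0
for prev, curr in zip(transits, transits[1:]):
    j = update_jitter(j, prev, curr)
print(round(j, 2))  # 1.73
```

The 1/16 gain means a single delayed packet nudges the estimate up only slightly, while sustained variation drives it, and therefore the adaptive buffer depth, higher.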


Packet Loss

Remember that IP is a best-effort protocol. It relies on higher-layer protocols, such as Transmission Control Protocol (TCP), for delivery guarantees. Packet loss will occur; it's also referred to as error rate.
Blocking/Non-Blocking Access

Traditionally, telecommunications engineers designed central office and PBX systems to be either blocking or non-blocking. A non-blocking system assumes that every end user can go off-hook at the telephone set and place a call at the same time. In the traditional environment, non-blocking systems were very rare; building one meant investing in hardware and circuit connections that, statistically, would sit idle most of the time. With the evolution to VoIP and the shrinking cost of networking, some organizations take a different view today. Non-blocking systems may be achievable at a lower cost and may be the appropriate solution for some businesses. If a non-blocking system is put in place, the monitoring systems may need to evolve to track items such as successful versus failed calls and abandoned calls to assure that a non-blocking service delivery network is being maintained.
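Trunk sizing for blocking systems traditionally uses the Erlang B formula, which gives the probability that an arriving call finds every trunk busy. A sketch with illustrative numbers:

```python
def erlang_b(traffic_erlangs, trunks):
    """Erlang B blocking probability via the numerically stable recurrence:
    B(0) = 1;  B(n) = A*B(n-1) / (n + A*B(n-1))."""
    b = 1.0
    for n in range(1, trunks + 1):
        b = (traffic_erlangs * b) / (n + traffic_erlangs * b)
    return b

# 2 erlangs of offered voice traffic on a 3-trunk group:
print(round(erlang_b(2, 3), 3))  # 0.211 -> about 21% of call attempts blocked
```

A non-blocking design is the limiting case in which capacity equals or exceeds the number of simultaneous users, driving this probability to effectively zero; the formula shows how much idle capacity that statistically implies.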
Success Rates of Call Setup

During network readiness assessments, engineers evaluate current requirements of the data network and compile existing voice services requirements. One aspect of ongoing oversight in the voice services network is close monitoring of the call completion rate. Calls may fail to complete for a number of reasons, but a low or declining call completion rate might indicate signaling problems, bandwidth consumption (or network saturation), or a number of other issues. Another aspect to monitor is the number of abandoned calls. In a call center environment, abandoned calls are generally tied to wait time for customer service agents. A message telling a caller that the wait is 10 minutes may induce callers to hang up and retry later. Interactive Voice Response (IVR) systems or dial-prompt systems with deeply nested menus may drive callers to hang up in frustration. Call completion rate trends provide information about specific VoIP services that improves capacity planning and helps the overall service more effectively support business processes.
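Completion and abandonment rates are simple ratios over call outcome records. A sketch (the record format here is hypothetical; real data would come from call detail records):

```python
from collections import Counter

def call_stats(call_records):
    """Summarize completion and abandonment rates from call outcome records."""
    counts = Counter(r["outcome"] for r in call_records)
    total = sum(counts.values())
    return {
        "completion_rate": counts["completed"] / total,
        "abandon_rate": counts["abandoned"] / total,
    }

calls = ([{"outcome": "completed"}] * 90 + [{"outcome": "abandoned"}] * 6
         + [{"outcome": "failed"}] * 4)
print(call_stats(calls))  # {'completion_rate': 0.9, 'abandon_rate': 0.06}
```

Tracking these two numbers over time, rather than as single snapshots, is what turns them into the trend data useful for capacity planning.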
Gateway Issues

Because you're looking at a converged network of voice and data, you have to recognize that not all organizations are going to move directly to new VoIP systems. Convergence is a drive to integrate existing services so that they work together in new ways, and then to manage the new network environment of voice and data. Figure 7.6 shows a company with both old and new. The location on the left has a PBX connected to the VoIP gateway. In this case, the PBX programming determines which calls are directed to the PSTN and which are directed to the VoIP gateway. This might be done via a special access code, or it might be programmed into the dialing patterns of the PBX. For example, any time a 10-digit number with area code is dialed, the call might be directed to the IP network to minimize long-distance charges on the PSTN. The company might route only local and 800-number calls directly to the PSTN.

The gateway processes and packetizes the phone call, then passes it on to the router, which transfers it along to the corporate IP network. At the other end of the network, on the right side of the figure, notice there is no PBX. This company has chosen to eliminate the cost associated with buying, managing, and maintaining a PBX at the location on the right. There is a mix of VoIP phones and softphones at this office. The SBC on the right uses SIP trunking via the IP network. VoIP calls come into the SBC and are then routed through the network to the appropriate endpoint. When everything is configured properly, every voice endpoint can talk with every other endpoint. Traditional voice and VoIP have converged.
[Figure components: two routers joined by an IP network; on the left, a legacy PBX with T-1 trunks to the PSTN and a VoIP gateway; on the right, a session border controller with a SIP trunk, a VoIP server, and workstations.]
Figure 7.6: A converged company example setup.

This highly simplified view of the network might make convergence sound like a trivial matter. It is not. This company is managing a complex multi-faceted network, and performance monitoring will have to be undertaken diligently. The gateway must have a database of telephone numbers associated with users, yet you must not overlook that many LANs use Dynamic Host Configuration Protocol (DHCP) for address assignment at startup. Thus, the gateway must maintain a dynamic database that allows a static telephone number to correlate to a dynamic IP address every time the user connects. The SBC simply represents another type of gateway where SIP trunking information and access permission controls reside.
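The dynamic number-to-address mapping described above can be sketched as a registration table with expiring entries. All names here are illustrative, not any vendor's actual API:

```python
import time

class RegistrationTable:
    """Hypothetical sketch of a gateway's dynamic mapping from a static
    extension to whatever IP address DHCP handed the endpoint at startup."""
    def __init__(self, ttl_seconds=3600):
        self.ttl = ttl_seconds
        self._entries = {}  # extension -> (ip, registered_at)

    def register(self, extension, ip):
        self._entries[extension] = (ip, time.time())

    def resolve(self, extension):
        entry = self._entries.get(extension)
        if entry is None:
            return None
        ip, registered_at = entry
        if time.time() - registered_at > self.ttl:  # stale registration expired
            del self._entries[extension]
            return None
        return ip

table = RegistrationTable()
table.register("4155", "10.20.30.41")
print(table.resolve("4155"))  # 10.20.30.41
print(table.resolve("9999"))  # None -> an unknown extension, an alarm candidate
```

The expiry forces endpoints to re-register periodically, which is how the table stays correct when DHCP hands out a new address.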

Monitoring systems need to provide alarms that alert the NOC staff when a number of new situations arise. DHCP addressing is straightforward, but how do you monitor and alarm on error conditions when an unknown user tries to register with the VoIP system? Trunk utilization between traditional and VoIP systems, whether handled in house or with a telco provider, must be monitored closely. Trunking or media gateways often prove to be the choke point in voice call services. SBCs require the same sort of monitoring as a major enterprise router or firewall; they represent a new perimeter point in the network.

Although convergence presents many wonderful opportunities to bring services together onto a single infrastructure, it also brings several added degrees of complexity to the network. The existing IT staff may not have a deep enough understanding of telephony requirements. The telecom staff may not be familiar with data networking technologies. The skill set required to design and manage this network is a blended skill set that was not necessary in the past, when the two networks remained distinctly separate. As networks and services converge, the pool of talent with the right skills to design, manage, and maintain those networks shrinks. It's important to ensure the support staff receives the proper tools and training to keep these mission-critical services operating at acceptable performance levels. Network management is one of the most critical and challenging aspects of service delivery. In the past, many network designers followed the simple rule of better safe than sorry, which often led to over-engineered and over-priced networks.

[Figure content: a maturity curve rising from Anarchy through Reactive and Proactive stages to Service Oriented and, ultimately, Business Enabler.]

Figure 7.7: The network service maturity path.

Figure 7.7 shows a network services evolutionary path that is actually quite common in business organizations. The underlying goal of focusing on managing the integrated data, voice, and video services is to turn them into business enablers for the enterprise. This evolution of maturity follows a track parallel to the ITIL maturity mentioned earlier. Where business process is completely lacking, anarchy and chaos reign. No business manager plans this, but sometimes explosive growth, especially in a highly successful start-up company, can cause total chaos. Early business process development is often reactive in nature. As business processes mature, they move into a more proactive, forward-looking model. For the most successful information-oriented enterprises, network management becomes part of the service. It becomes a differentiator. And in the most advanced cases, network management information is such powerful business intelligence that it is leveraged to enable completely new lines of business.

Summary
Knowledge about your network is a crucial element of effective service delivery. Service delivery managers need to know and understand every aspect of network service delivery. Without a knowledge-based framework, service assurance efforts are hit or miss at best. Managers overseeing services networks need to know as much as possible about performance trends and service requirements to guarantee availability of mission-critical services in the enterprise. As you integrate voice, through VoIP, with the IP data network, you migrate what is for many organizations the single most mission-critical service. Capacity planners also need utilization information to provide consistent, systematic growth across the suite of unified communications services as business requirements evolve. Everyone involved in service delivery needs information about the user experience to identify bottlenecks in service availability and delivery. Correlating information about failure points is vital to effective troubleshooting in daily operations. Industry best practices today, whether via FCAPS, ITIL, or some other model, all encourage taking a holistic approach to network life cycle management. That holistic view is supported by a wide-ranging data gathering and information analysis strategy. To ensure service delivery, you must align service delivery efforts with the needs of customers, internal or external, while finding a cost-effective approach to ensuring capacity, availability, and security of the data, VoIP, and video services. Two of the most treasured assets any information-based company has are its people and its intellectual capital. Data, information, and knowledge are frequently far more precious than inventory and cash reserves; the latter are generally easier to replace or regenerate. Raw data is easily collected. In years past, log data was often either purged from systems after some time period or sent to archival storage.
For many organizations, historical information now perceived as key business intelligence was stored on tape or microfiche and buried in a data tomb. In more recent years, attention has focused on data warehousing concepts. This quickly spawned another industry subset promoting knowledge management solutions. Today, many of these knowledge management theories and techniques are used to arm managers with the information needed to ensure service levels meet user needs.

The enterprise service delivery workforce requires appropriate data gathering and analysis tools to facilitate making informed decisions about network services in daily operations. The more accessible and readily available these tools are, the more efficient and productive the organization can be. One way to bring the tools closer to hand is to create a converged environment where access to any piece of information and any communication need is available instantly. This knowledge-based approach ensures the highest level of confidence in all network services. A well-managed converged network, tightly integrating data, VoIP, and video services, may soon prove to be a huge differentiator between competitors in business. The next chapter will dig into configuration, fault, and performance management.


Chapter 8: Effective Network Configuration, Network Fault, and Network Performance Management
Chapter 7 introduced the FCAPS model and examined service availability and capacity planning management. This chapter will continue that theme of using a methodology for consistent management of network faults or problems, configuration of network devices, and performance. Network management means different things to different people. For some organizations, it simply means a network consultant is monitoring network activity with some tool. In larger enterprises, network management involves continuous polling of network devices to monitor status, distributed databases containing logs and error reports, and graphical representations of the network topology to present a high-level view of the overall health condition of the network. All network management can be viewed as a service that uses tools, devices, and software applications to assist network managers in monitoring and maintaining the quality of service (QoS) being provided.

Fault Management
As a part of holistic network management, fault management is the term used to describe the set of tools and functions used to detect, isolate, and then remediate problems in the service network. These malfunctions may be technical, such as equipment failures, or caused by human error. The central theme is that something failed in the network. Fault management sometimes includes environmental control systems or monitoring. Faults are detected through monitoring system events. In many organizations, event monitoring occurs in three ways:

The Network Management System (NMS) in an enterprise command center monitors the status and health of network elements. Often an icon representing an element in the network will simply turn from green to red, indicating a problem. The NMS typically also functions as a fault management system.

Event correlation and analysis systems are designed to process syslog and event log files from a number of systems. Many of these systems include an engine for detecting anomalies as part of event correlation.

Human analysis, while often effective, is also inefficient. The sheer volume of log data generated in large networks makes human analysis impractical for many purposes. Yet when network performance issues arise, most groups have people begin to review logs. Humans apply different logic than NMS and event correlation engines do, and often spot trends, patterns, or unique items that fall below automated thresholds.

When a failure or fault occurs, elements of the network send information about the problem to the NMS. SNMP is widely used for this purpose. Elements transmit alarm information or indicators, which remain in an alert state until the error condition or problem is fixed. Many fault management systems are configurable to allow prioritization of network faults into different levels of severity. Some alarms may simply represent debugging information, while others might indicate emergency problems. The following list highlights the most common severity levels used in fault monitoring and management systems today:

Warning
Minor
Major
Critical
Indeterminate
Cleared

Note the cleared severity level. It is considered good practice to have network elements send positive alerting information when an error has been cleared to ensure all monitoring systems note problem resolution.

The fault management console, often referred to as the dashboard, allows the operations center staff to monitor events across the network from a large number of systems. A large enterprise might overlay the network map on top of a geographic map to give a sense of where in the network events are occurring. A fully robust fault management system may be automated to the point that corrective action is taken without human intervention. Although many network elements include protocols for automatic failover and resilience, other elements may require creative approaches. For example, detection of a failed server at the NMS might trigger running a script that moves active services to a backup system. Scripting this type of response helps ensure quick activation and eliminates the potential for keyboarding input errors.
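A sketch of that kind of scripted response, with hypothetical element names and script paths; the dry_run flag reflects the caution, discussed elsewhere in this chapter, about keeping qualified staff in control of automated actions:

```python
import subprocess

# Hypothetical mapping of elements to pre-approved failover scripts.
FAILOVER_SCRIPTS = {"voip-server-1": ["/usr/local/bin/failover-voip.sh"]}

def handle_alarm(element, severity, dry_run=True):
    """On a critical alarm, run the pre-approved failover script for that element."""
    if severity != "critical":
        return "logged"
    script = FAILOVER_SCRIPTS.get(element)
    if script is None:
        return "notify-staff"        # no scripted response; escalate to humans
    if dry_run:
        return f"would run {script[0]}"
    subprocess.run(script, check=True)  # scripted, so no keyboarding errors
    return "failover-started"

print(handle_alarm("voip-server-1", "critical"))  # would run /usr/local/bin/failover-voip.sh
```

Keeping the element-to-script mapping in one reviewed table is what makes the automation auditable.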

In many cases, the automated system will simply alert network support staff. Today, email messages, pager alerts, and SMS messages to mobile phones are widely supported notification approaches in the NMS. It may also be prudent to implement practices that include escalation mechanisms. Notifying a single person is probably inadequate for a very large enterprise network; sometimes a group of supporting personnel is notified as part of the incident response team. At a high level, fault management can be performed in active mode or passive mode. SNMP monitoring and data gathering are a passive approach: data is gathered as part of the normal flow of traffic, and if a network element recognizes some error condition exceeding thresholds, an error message is sent to the NMS. This also presumes the network node is able to send the appropriate alarm. If, for example, a power supply were to fail, often no alarm can be sent; the element simply powers down. Active fault management engages other tools to monitor the health of systems. These tools include ping, Web page retrieval, testing of email responses, port scans, and a number of other methods. If a Web server fails to serve up a Web page, an active monitoring system can send an alert based on that failure. The server hardware might well be fully operational with no detectable malfunction. In this case, perhaps Web services failed, but the server is still operating: no hardware components have failed, and even the OS and Web application may be running properly. Active monitoring can help catch faults that would otherwise pass undetected. One key to successful fault management is to deploy tools that support processes rather than the reverse. Established processes shouldn't be bent to fit new tools without careful consideration. On the flip side, tools may lead to the development of new network management processes where none previously existed.
When deploying network management tools, it's helpful to focus on the fault tolerance and resilience of the service delivery network, with an eye on both redundancy and security.
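An active probe along the lines described above, retrieving a page rather than merely pinging the host, can be sketched with the standard library:

```python
import urllib.error
import urllib.request

def check_web_service(url, timeout=5):
    """Active probe: a healthy server must actually serve the page, not just
    answer ping. Returns (ok, detail) for the monitoring system to act on."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except (urllib.error.URLError, OSError) as exc:
        return False, str(exc)

# A refused connection is reported as a failure with the underlying reason:
ok, detail = check_web_service("http://127.0.0.1:1/", timeout=2)
print(ok, detail)
```

Because the probe exercises the whole service path (network, OS, Web daemon, application), it catches the failed-service-on-healthy-hardware case that passive monitoring misses.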


Network and Fault Management: An Integration Strategy

There are some basic rules of thumb to follow when integrating NMSs into the converged services network:

Identify the best sources for data. Not every network element warrants monitoring, but it's vital to monitor key service delivery elements.

Use the best tools available and the best features of each. A mediocre tool might seem cost effective initially, but if it doesn't provide the necessary capabilities, it won't help maintain an optimal network for delivery of integrated data, voice, and video.

Wherever possible, use the dashboard approach to deploy a Manager of Managers. This centralized dashboard provides a complete holistic view for at-a-glance snapshots of the overall network status and health.

Follow the tried and true KISS (Keep It Simple, Stupid) approach wherever possible. Don't introduce undue complexity.

Use modular management components that integrate well together. Stay focused on out-of-the-box functionality. An NMS that requires custom script development and programming to implement is likely to prove a problem to sustain.

The GUI approach is easier to learn and adopt than a system using command lines. Where practical, specific workgroup GUIs can provide individual teams with the tools they need from a single source (the dashboard or Manager of Managers). The network support group, Help desk, voice services staff, and security administrators may all need slightly different GUI views of the same underlying network infrastructure to operate effectively.

Caveats of Implementation Strategies for NMSs

When deploying the NMS, there are some basic caveats to keep in mind. A complex and convoluted system can be less effective and more labor intensive than a manual system of work procedures. Again, keep it simple. Avoid over-automating. System automation is a wonderful boon to productivity, but the delivery of integrated data, voice, and video services requires human oversight. Automating in the wrong place can lead to a false-positive system event that triggers automated network reconfiguration or failover to backup systems. Look carefully at these mechanisms to ensure that qualified support staff remains in control. Remember, these are tools for staff, not replacements.

When evaluating NMS solutions, here are some basics to consider:

Does it have a simple interface? It should be easy to access everything you need. Users shouldn't have to toggle back and forth between screens to perform basic tasks. A Web browser based interface may allow easy customization for different workgroup views.

Does it provide the ability to set a baseline? Simply put, much of the value in an NMS is in the ease of setting baseline thresholds at normal network performance and operation levels. How easy is it to establish the baseline and set thresholds that trigger notifications?

Does the reporting capability meet your needs? If the NMS can report an event, whether it's a failed element or a spike in traffic, does it also provide enough historical tracking and analysis capability to produce management reports that will be useful to the service delivery manager? An NMS tool that requires knowledge of another programming or report-writing language will make it more difficult to extract useful information in a format that helps make informed business decisions.

Later, this chapter will talk about configuration management, which is tightly coupled with change control processes. Throughout this chapter, the goal is to maintain a broad holistic view, keep perspective, and link all the various components of network management together.

Recognize, Isolate, and Detect Faults

Past studies have given way to anecdotal evidence in many areas of networking. One example is commonly referred to as the 80-20 Rule; sometimes it seems there is an 80-20 Rule to fit every situation. Fault management is no different. The commonly held belief, supported by past studies, is that when faults occur, 80% of the time is spent looking for the problem, while only 20% of the time is spent fixing it. Recognizing that faults exist, detecting them, and isolating them is also referred to as anomaly detection and correction. What we're striving for is near real-time event correlation that yields a root cause analysis of the fault for timely remediation of the problem. NMS vendors provide this root cause analysis functionality in a number of ways. Some use mathematical/statistical modeling techniques. Others base solutions on complex academic systems models. Many take a much simpler approach using rudimentary event filtering schemes. This last category is often the approach implemented by fault monitoring systems that organizations develop in-house.


The Role of the NMS

Regardless of the methodology used internally in the NMS, event filters sort through the flood of SNMP traps that are constantly being processed. In its most basic form, which Figure 8.1 shows, the NMS provides two key roles: correlation and an event watcher. Most organizations configure the NMS to ignore trivial events and closely monitor areas of concern. As events are processed, they're typically handled in one of the following ways:

Traps reporting failures are analyzed, leading to a verification, event correlation, and notification process. These are typically warnings and minor alarms.

Traps reporting cleared or restored conditions may also trigger notifications to cancel other work tasks.

Some traps, especially those deemed major or critical, may trigger immediate notifications.

Informational and debugging traps might simply be logged for future reference.
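The event-watcher handling rules above amount to a routing table keyed on severity. A minimal sketch (the action names are illustrative):

```python
# Severity-driven event routing, mirroring the handling rules above.
ROUTES = {
    "critical": "page-oncall",
    "major": "page-oncall",
    "minor": "open-ticket",
    "warning": "open-ticket",
    "cleared": "close-ticket",
    "info": "log-only",
    "debug": "log-only",
}

def route_trap(trap):
    """Decide what the event watcher does with an incoming trap."""
    return ROUTES.get(trap.get("severity", "info"), "log-only")

print(route_trap({"severity": "critical", "source": "gw-1"}))  # page-oncall
print(route_trap({"severity": "cleared", "source": "gw-1"}))   # close-ticket
```

Keeping the policy in a table rather than scattered conditionals makes it easy for operations staff to review and adjust thresholds.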

[Figure components: servers, switches, routers, and a printer feed traps to the NMS, whose event watcher performs verification of failure and correlation, then drives notification via paging, trouble tickets, email, on-screen popups, and log files.]
Figure 8.1: A basic NMS.


SNMP and Fault Management

Chapter 6 reviewed SNMP, which is a crucial protocol in holistic fault management and network health monitoring. SNMP is the most widely used protocol for managing devices in the network. It's popular because it's flexible and easy for vendors to implement. SNMP management includes three components: managed devices, agents, and the NMS. A managed device is any hardware element in the network that implements SNMP and is capable of reporting information. This includes routers, switches, servers, workstations, printers, and other devices. In the converged services network, the media and signaling gateways, voicemail systems, and other call processing elements will all probably provide SNMP monitoring and management capabilities. They may also include vendor proprietary mechanisms. An SNMP agent is simply the software that provides the SNMP information. The agent might be a daemon process running within the OS kernel or some additional software installed at the time the network device is set up. The agent software collects information and passes it to the NMS. The NMS is the centralized monitoring overseer. The NMS sends requests to monitored devices in the network, and the agent software running on the device sends a reply. Five basic message types are used:

GetRequest is used to retrieve a specific value from a network device.

SetRequest is used to set a defined value in a network device.

GetNextRequest is used by the NMS when building a table of responses. It's used to collect multiple inputs.

GetResponse is used by the agent to return error codes and other responses to requests from the NMS.

Trap is an unsolicited message sent from the agent to the manager. This is the error reporting mechanism used to provide immediate information to the NMS about the status of a network device.
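The request types can be illustrated with a toy in-memory stand-in for an agent's MIB. A real deployment would use an SNMP library such as pysnmp, and note that the plain string comparison used for GetNext here only approximates true OID ordering:

```python
class TinyAgent:
    """Toy stand-in for an SNMP agent's MIB, for illustration only."""
    def __init__(self, mib):
        self.mib = dict(sorted(mib.items()))

    def get(self, oid):                 # GetRequest / GetResponse
        return self.mib.get(oid)

    def get_next(self, oid):            # GetNextRequest: walk to the next OID
        for candidate in self.mib:
            if candidate > oid:
                return candidate, self.mib[candidate]
        return None

    def set(self, oid, value):          # SetRequest
        self.mib[oid] = value

agent = TinyAgent({
    "1.3.6.1.2.1.1.5.0": "gw-east",     # sysName
    "1.3.6.1.2.1.2.2.1.10.1": 1042337,  # ifInOctets for interface 1
})
print(agent.get("1.3.6.1.2.1.1.5.0"))       # gw-east
print(agent.get_next("1.3.6.1.2.1.1.5.0"))  # the next OID/value pair
```

Repeated get_next calls are exactly how an NMS "walks" a table of interface counters; traps travel the other direction and are pushed by the agent rather than polled.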


For technical details on SNMP, see the following resources.

The SNMP Version 1 RFCs are:
* RFC 1155. Structure and Identification of Management Information for TCP/IP-based internets
* RFC 1157. Simple Network Management Protocol
* RFC 1212. Concise MIB Definitions
* RFC 1213. Management Information Base for Network Management of TCP/IP-based internets

The SNMP Version 2 RFCs are:
* RFC 1901. Introduction to Community-based SNMPv2
* RFC 2578. Structure of Management Information Version 2 (SMIv2)
* RFC 2579. Textual Conventions for SMIv2
* RFC 2580. Conformance Statements for SMIv2

The SNMP Version 3 RFCs are:
* RFC 2576 (Proposed Standard). Coexistence between SNMP Version 1, Version 2, and Version 3 (March 2000)
* RFC 3410 (Informational). Introduction and Applicability Statements for Internet Standard Management Framework (December 2002)
* RFC 3411. An Architecture for Describing SNMP Management Frameworks (December 2002)
* RFC 3412. Message Processing and Dispatching (December 2002)
* RFC 3413. SNMP Applications (December 2002)
* RFC 3414. User-based Security Model (December 2002)
* RFC 3415. View-based Access Control Model (December 2002)
* RFC 3416. Version 2 of SNMP Protocol Operations (December 2002)
* RFC 3417. Transport Mappings (December 2002)
* RFC 3418. Management Information Base (MIB) for the Simple Network Management Protocol (SNMP) (December 2002)
* RFC 3584. Coexistence between Version 1, Version 2, and Version 3 of the Internet-standard Network Management Framework
* RFC 3826. The Advanced Encryption Standard (AES) Cipher Algorithm in the SNMP User-based Security Model


Syslog and Fault Management
As described in Chapter 6, syslog is another important tool in fault management. SNMP is widely used for status and health monitoring; it provides real-time information about elements in the network. Syslog provides more comprehensive log data, including error messages with extensive detail about what is occurring internally within the network device. Because syslog data comes from an array of devices throughout the network, it's used in event correlation to understand the impact a failure in one part of the network may have in other areas. Because syslog data is plaintext, it's easily readable, even when not easily understood. This makes it easy to manipulate for analysis. Syslog data is crucial for verification and validation of details associated with many network faults.
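As a rough illustration of syslog-based event correlation, the sketch below groups events from the same host that fall within a shared time window. The `epoch host message` line format and 60-second window are assumptions for the example, not a syslog standard:

```python
# Hypothetical sketch: clustering syslog-style events per host within a
# time window, the first step of simple event correlation.
from collections import defaultdict

def correlate(lines, window=60):
    """lines: strings of the assumed form "epoch host message".
    Returns (host, [(timestamp, message), ...]) clusters whose events
    are no more than `window` seconds apart."""
    events = defaultdict(list)
    for raw in lines:
        ts, host, msg = raw.split(" ", 2)
        events[host].append((int(ts), msg))
    clusters = []
    for host, msgs in events.items():
        msgs.sort()
        cluster = [msgs[0]]
        for ts, msg in msgs[1:]:
            if ts - cluster[-1][0] <= window:
                cluster.append((ts, msg))      # same incident window
            else:
                clusters.append((host, cluster))
                cluster = [(ts, msg)]          # start a new incident
        clusters.append((host, cluster))
    return clusters

# A link flap: the down and adjacency-loss events correlate; the
# recovery 6 minutes later is a separate cluster.
clusters = correlate(["100 rtr1 linkDown",
                      "130 rtr1 ospfAdjLost",
                      "500 rtr1 linkUp"])
```

Real correlation engines go much further (topology-aware root cause, cross-host grouping), but the windowing step is the common starting point.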
Chapter 9 will explore the value of syslog information in assessing security issues.

Fault Management and ROI
There are tangible and measurable return on investment (ROI) values in fault management methods. The level of service delivered to users is higher when fault management and monitoring are applied. NMS tools provide immediate impact assessment capabilities for responding to faults and problems arising in daily operations. This leads to better decision making, both in troubleshooting and in network planning. The improved operational stability results in fewer outages and errors and improves QoS delivery overall. Consistency in service delivery engenders user confidence in the integrated services, and satisfied users will leverage the converged services into other enterprise business process flows. Knowing and understanding the details of problems in the service network provides not just historical information but also crucial business intelligence that aids in future capacity planning and in evaluating new services for consideration in the future.


Configuration Management
The primary objective of configuration management is to monitor network and system configuration information to provide an audit trail, or tracking mechanism, of changes made to the network. Every device in the network has some associated configuration information. Given the variety of network elements in the converged services network, it's critical that configuration changes be coupled tightly with a formalized change control process.
As with every other aspect of managing the FCAPS model, the NMS is the brains of the overall process. NMS solutions might be single-vendor, single-system solutions, or they might be distributed systems that work together and roll information up to the previously mentioned Manager of Managers. The NMS is a system that may have several components. Some key features to consider when investigating configuration management include:
* Auto-discovery: Is the system capable of automatically probing your network and learning what the components of the infrastructure are? Can it determine the topology and layout of the existing network, and detect new elements as they're added? Can it be configured to automatically retrieve configuration information and store it in the configuration management database?
* Ability to import configurations: Will the system support all the various configurations of different vendors' solutions in a single database schema? Will different configuration databases be required for servers and workstations than for switches and routers? Are all the elements of the converged services network supported (call managers, media/trunking gateways, signaling gateways, voicemail systems, and traditional telephony elements such as PBXs)?
* Configuration analysis: This is a new development in NMS capabilities that has arisen in the past few years. Is the NMS able to compare existing configurations with those of other similar devices in the network? Does it include a standardized comparison against known best-practice configuration issues? Routers may have common configuration errors that don't impact prior operations but might impact converged services. Servers and workstations may have unnecessary services running that should be disabled. OS versions and patch levels need to be monitored and, in most networks, maintained consistently across similar network devices. Will the NMS provide this comparison capability?
* Policy-based configuration: Does the NMS support a single policy-based solution and provide for easy deployment across all devices in the network? Policy-based configuration management allows precious staff time and resources to be used more effectively. The ability to write a policy template that can then be universally pushed ensures consistency of operations and unifies compliance with enterprise standards.
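A minimal sketch of the configuration-analysis idea: compare a device configuration against a golden best-practice template using Python's standard `difflib`. The configuration lines here are invented purely for illustration:

```python
# Hedged sketch of configuration analysis: report drift between a
# golden (best-practice) template and a device's running config.
import difflib

def config_drift(golden, actual):
    """Return the config lines removed from (-) or added to (+) the
    golden template, ignoring diff header lines."""
    diff = difflib.unified_diff(golden.splitlines(), actual.splitlines(),
                                lineterm="", n=0)
    return [line for line in diff
            if line[:1] in "+-" and line[:3] not in ("+++", "---")]

# Invented example: the device has the HTTP server enabled, which the
# golden template forbids.
golden = "service password-encryption\nno ip http server\n"
actual = "service password-encryption\nip http server\n"
drift = config_drift(golden, actual)
```

Production tools normalize vendor syntax and ignore cosmetic differences before diffing, but the core operation is the same comparison.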


Collecting and Storing Configuration Data
In unified communications, network configuration management provides control of changes in the network. The most common changes noted are to hardware, firmware, and software, but documentation changes also apply. These changes continue throughout the lifespan of the network. Change is constant, driven by upgrades, patches, and equipment replacement. A comprehensive configuration management approach will collect all the changes made to a network element throughout its entire life cycle.
Once a device is deployed in the network, change control processes guide the evaluation and approval of configuration changes. This function is vital for two reasons:
* The interrelationships between network elements become more complex as the network incorporates new services. In the past, a simple change to a router might have impacted only traditional data flows. In the converged network, a simple router change can easily alter the QoS for a traffic flow. This simple change might disrupt or degrade voice services unexpectedly.
* Fallback to previous configurations may be required quickly. One of the great benefits of comprehensive configuration management is the ability to undo changes and revert to a prior, known operational state.
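The rollback capability described above might look like the minimal, assumed design below for a versioned configuration store. Real NMS products implement this far more robustly (approvals, audit metadata, per-version timestamps):

```python
# Minimal sketch (assumed design) of a versioned configuration store
# supporting fallback to a prior, known operational state.
class ConfigStore:
    def __init__(self):
        self.history = {}           # device name -> list of config versions

    def commit(self, device, config):
        """Record a new configuration version for a device."""
        self.history.setdefault(device, []).append(config)

    def current(self, device):
        return self.history[device][-1]

    def rollback(self, device):
        """Undo the latest change and return the restored config."""
        versions = self.history[device]
        if len(versions) > 1:
            versions.pop()          # discard the bad change
        return versions[-1]

store = ConfigStore()
store.commit("rtr1", "hostname rtr1")
store.commit("rtr1", "hostname rtr1\nmls qos")   # a QoS change goes wrong
restored = store.rollback("rtr1")                # back to known-good state
```

Keeping every version, rather than only the latest, is what makes the "revert quickly" requirement satisfiable at 3 AM during an outage.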

Configuration and Change Management
Configuration and change management thought processes in the industry today are loosely based on guidelines for software configuration management (SCM) principles described by Roger Pressman in his book Software Engineering: A Practitioner's Approach (McGraw Hill, 2004, ISBN: 007301933X). SCM is a widely accepted methodology for controlling and managing change in the software development environment, but the core principles apply equally to networking. The central focus is to identify what changed, who changed it, and how the change can be either reproduced or undone.
Configuration Management Resources
* The Institute of Configuration Management at http://www.icmhq.com/index.html
* Configuration Management Training Foundation at http://www.cmtf.com/
* Configuration Management Body of Knowledge at http://www.cmbok.com/
* The Configuration Management Wiki-Web at http://www.cmcrossroads.com/cgi-bin/cmwiki/view/CM/
* Configuration Management Principles and Practice by Anne Mette Jonassen Hass (ISBN: 0-32111766-2)


Configuration management provides for the identification of changes, a controlled storage environment for archival data, change control, and status reporting of every change activity throughout the life cycle. Figure 8.2 shows a simplified view of configuration management activity areas and offers a peek into the process flow. The configuration data itself is shown as metadata (data about the configuration information). In configuration management, the metadata for a configuration item may include:
* The name of the change
* The name of the person initiating the change
* The name of the approver of the change
* A text description of the change
* The date the change was placed into production
* References to other configuration items

In most NMS solutions, the metadata is stored on the same system as the configurations themselves.

[Figure: configuration items flow from the production network through audit and approval into controlled storage; metadata feeds a change control metadatabase that drives status reporting.]

Figure 8.2: Configuration Management Activities.


Configuration items that are different versions of the same original item are obviously strongly related, but each one is an individual item that can be identified and may be extracted and used independently. This is one of the main points of configuration management: being able to revert to an earlier version of an item. At every step of the FCAPS methodology, there is a documented, methodical process being reinforced as a commonly accepted best practice. In Chapter 7, Figure 7.7 showed the evolutionary path of an enterprise network moving from anarchy through reactive and proactive modes as part of its maturation into a service-oriented business enabler. The maturation cycle of a network often mirrors the evolution of the business itself. Configuration management is a vital part of collecting and managing business intelligence about the services supporting the core business.

Performance Management
One key to the emerging integrated services network is the constant appearance of new technologies. Today, 10 years into deployment, VoIP isn't really new or emerging. Within the broad telecommunications sector, many service providers are now stepping back and thinking more purely in terms of voice services, with VoIP being one voice delivery mechanism. For enterprise networks, the impetus is much greater to integrate technologies with business processes. You'll recall that Chapter 1 looks at convergence from several views:
* Infrastructure convergence of wiring and circuits
* Service convergence using IP as the delivery mechanism
* Device convergence of the physical tools used in everyday business
* Application convergence of enterprise business applications such as sales force automation (SFA), supply chain management, and customer relationship management (CRM)

Although commercial providers of voice services might be starting to take a narrower view of voice as simply a service, that is a luxury the enterprise business cannot afford. The value in convergence is found in the total integration. Complete integration can reduce cost and increase productivity. Once integration has become inculcated into the corporate culture, it can lead to creative new solutions to old business problems and even spawn ideas for new business tactics. Chapter 5 looked at the Network Performance Envelope methodology for assessing requirements. One tangible benefit to this model is that it supports many phases of the life cycle of the enterprise converged services network. Figure 8.3 touches on some of the most basic principles of this model.


[Figure: the performance envelope parameters: availability, throughput, manageability, scalability, integrity, response time, reliability, network segment utilization, cost, CPU utilization, and security.]

Figure 8.3: Revisiting the Network Performance Envelope.

There are many factors that impact performance of the network. Prior to implementation, you assess network readiness: you determine whether the network can support converged data, voice, and video. As you implement those services, you reassess to ensure that your assumptions have all been proven correct, that any requisite upgrade activity was successful, and that the network does indeed deliver the integrated services as planned. In the operational phase, you continually monitor performance to ensure those standards are being upheld.
Operational success is driven by information. The more you know about the network, the better your service delivery consistency. Although network managers all strive to deliver good service, consistency of service sometimes gets overlooked. Vigilant performance monitoring helps ensure a consistent, reliable network.
Performance management is a series of checks and balances. We often visualize a counterbalance scale in weighing performance against other factors. That simplistic view doesn't serve the needs of managing a converged service network. Some managers view this multi-service network in terms of the trade-off triangle (see Figure 8.4).


[Figure: a triangle with data, voice/VoIP, and video at its corners.]

Figure 8.4: The misleading trade-off triangle.

The danger in taking this view of balancing performance and service is that it pits network services against one another for resources. This approach may be simple, but the danger of giving preference to one of the new emerging services, VoIP or video, is that existing data services (Web services, financial applications, customer service tools) may suffer from an exhaustion of resources. Business-critical applications may starve for resources while the new technologies take center stage. The holistic truth is more akin to balancing a high-performance automobile tire.


A binary decision of either this or that will lead holistic network management decisions to ineffective conclusions.

[Figure: the performance parameters (availability, manageability, throughput, scalability, integrity, response time, cost, CPU utilization, reliability, security, and network segment utilization) arranged as balance points around a tire.]

Figure 8.5: Managing holistic balance.

The illustration in Figure 8.5 simply takes the network performance parameters used in the performance envelope methodology and shows them as balance points around a tire. Performance monitoring and management is not a binary decision in any way. Trying to balance security against manageability, error rate against response time, or throughput against reliability are just a few examples of how binary decisions fail.
Because much of the focus at this stage of convergence is on VoIP services, look first at the VoIP service elements that must be monitored. Remember that years of telephone usage have socialized us to expect a base performance and quality profile akin to the traditional telephone network. When VoIP devices register with the network, whether they are phones, media/trunking gateways, signaling servers, or other devices, registration problems can adversely impact the delivery of service. If a VoIP phone on a desktop can't pull down a profile and phone number for any reason, problems follow. You'll want to establish a monitoring and alerting threshold in the event that something as simple as registration attempts or failures exceeds predefined levels. If there is growth in the network, or the number of registered telephones changes dramatically, it could signal a problem with the VoIP services themselves. Gateway registration monitoring will help identify new or missing trunk capacity to other networks.
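A hedged sketch of that registration-threshold idea: periodic poll samples of registered and failed phone counts are checked against illustrative limits. The counter source and threshold values are assumptions for the example, not vendor defaults:

```python
# Illustrative threshold check for VoIP phone registration monitoring.
# Thresholds (5 failures, 10% swing) are assumptions, not standards.
def registration_alerts(samples, max_failures=5, max_delta=0.10):
    """samples: chronological list of (registered_count, failed_count)
    polls. Alert on excess failures or a sharp swing in the number of
    registered phones between consecutive polls."""
    alerts = []
    for i, (registered, failed) in enumerate(samples):
        if failed > max_failures:
            alerts.append((i, "registration failures above threshold"))
        if i > 0:
            prev = samples[i - 1][0]
            if prev and abs(registered - prev) / prev > max_delta:
                alerts.append((i, "registered phone count changed sharply"))
    return alerts

# Third poll: 9 failures and a drop from 198 to 150 phones -> two alerts
alerts = registration_alerts([(200, 1), (198, 2), (150, 9)])
```

In practice these counters would come from the call manager via SNMP polls, feeding the same NMS alerting path as any other fault.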


Call monitoring is often an inflammatory topic because of privacy concerns. At this level of providing service in the network, call monitoring doesn't mean eavesdropping on individual calls. It's truly call traffic monitoring. This function involves monitoring incoming and outgoing call volumes to identify failures. If the VoIP system supports fax calling, attempted fax calls should be monitored as well. In call monitoring, look at four distinct areas:
* Calls in progress: The instant a VoIP phone or softphone goes off hook, the call is deemed in progress until the caller or called party hangs up. Calls in progress include those in the process of dialing, off hook getting a busy signal, and so on. If every call attempted completes successfully, the number of calls in progress will be the same as the next measure, active calls. As part of capacity planning, decisions have to be made at implementation time as to the number of active calls the system can support. Performance monitoring assists with capacity planning. If the percentage of available capacity normally runs at 70% utilization, then rises to 95%, performance monitors can alert capacity planners to a change in the network; a change that must be addressed before calls are blocked due to demand exceeding the capacity.
* Active calls: Once a call has successfully connected, it's deemed an active call. It has completed the setup signaling and now has a voice media path connected through the network. The same capacity considerations apply as for calls in progress.
* Attempted calls: In general, designers try, within available resources, to provide non-blocking systems. That is, they try to guarantee that all calls attempted will be completed successfully, but such isn't the case in the real world. Calls encounter busy conditions and go unanswered or are abandoned. Monitoring calls attempted over time yields data used to identify peak calling traffic periods and what is referred to as the busy hour call attempt (BHCA) value.
* Completed calls: Any phone call that completes and terminates without an abnormal termination code is counted as a completed call. Monitoring completed calls over time is also helpful in determining peak traffic periods and the BHCA value.
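The BHCA value described above can be approximated from call attempt timestamps. This sketch buckets attempts into fixed clock hours, a simplification of a true sliding-window busy hour; the record format is assumed for illustration:

```python
# Sketch: deriving a busy hour call attempt (BHCA) figure from call
# attempt timestamps, using fixed clock-hour buckets (a simplification).
from collections import Counter

def bhca(attempt_times):
    """attempt_times: epoch seconds of each call attempt.
    Returns (busy_hour_start_epoch, attempts_in_that_hour)."""
    per_hour = Counter(t // 3600 for t in attempt_times)
    hour, count = per_hour.most_common(1)[0]
    return hour * 3600, count

# One attempt in the first hour, three in the second
start, count = bhca([100, 3700, 3800, 3900])
```

Tracking this figure over weeks is what turns raw attempt counts into the peak-period trend data capacity planners need.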

For most businesses, using VoIP strictly as an internal calling system isn't practical. Corporate VoIP services have to interconnect with the Public Switched Telephone Network (PSTN) through either SIP trunks or standard voice trunks. This is done through some form of gateway. A gateway connecting to the PSTN via traditional T1 trunks needs to be monitored on the PSTN side of the VoIP service network, not just on the IP network. Monitoring active channels on these trunks over time can help identify calling patterns and busy hour peak call volumes. Baseline data can also be used to identify underutilization of circuits, leading to downsized capacity and reduced operating costs. Data trending helps in capacity planning and the growth and maturation of the converged services.
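The baselining and trending idea can be sketched as a simple check on busy-hour trunk utilization samples. The 30% underutilization and 25% growth thresholds below are illustrative assumptions, not industry standards:

```python
# Hypothetical trend check on busy-hour trunk utilization: flag circuits
# that run persistently low (downsizing candidates) or whose latest
# figure has grown sharply past the baseline. Thresholds are illustrative.
def trunk_trend(busy_hour_utilization, low=0.30, growth=0.25):
    """busy_hour_utilization: fractions (0.0-1.0), oldest first."""
    history, latest = busy_hour_utilization[:-1], busy_hour_utilization[-1]
    baseline = sum(history) / len(history)
    if max(busy_hour_utilization) < low:
        return "underutilized"      # downsizing may cut circuit costs
    if (latest - baseline) / baseline > growth:
        return "growing"            # evaluate added trunk capacity
    return "steady"

# Utilization jumps from a ~60% baseline to 85% in the latest period
status = trunk_trend([0.62, 0.58, 0.60, 0.85])
```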


Another benefit to implementing VoIP services is the conference-bridging capability. Conferencing adds yet another monitoring aspect. If your VoIP implementation supports conferencing, the maximum number of audio streams that can be supported for conferencing has to be configured. Monitoring will help ensure that the number of available audio streams meets the service levels needed for day-to-day business requirements.
IP phones, whether they're physical hardware phones or softphones running on workstations, require continual monitoring for service assurance. IP phones should be monitored for registration status, dial tone validity, jitter, latency, and packet loss. These key QoS metrics directly impact both call quality and the user experience.
QoS and Bandwidth Monitoring
Earlier chapters talked about codecs and QoS. Bandwidth requirements for VoIP traffic are driven in large part by codec selection. The PCM codec (G.711) requires about 64Kbps to support a bi-directional phone call. The G.723 and G.729 codecs require significantly less bandwidth due to compression, but network congestion may have a greater impact on call quality.
Whenever new applications are introduced into the business network, there is a risk of oversubscribing individual links. Oversubscription leads to congestion that can, in turn, degrade call quality. Packet loss and increased latency are common side effects of congestion. In the worst case, they can disrupt call setup and voice media transmission to the point that VoIP services become unusable. To guarantee that VoIP users receive an acceptable level of voice call quality, VoIP traffic generally needs to be given priority over other types of traffic on the network, although video traffic may warrant even higher priority than voice. Data traffic is bursty in nature and often does not involve real-time communications between people.
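A back-of-the-envelope sketch of sizing bandwidth from codec selection: the 64Kbps G.711 payload rate matches the figure above, while the 1.3 overhead factor for IP/UDP/RTP and layer-2 headers is an assumption that varies with packetization interval and link type:

```python
# Rough VoIP bandwidth sizing from codec payload rate and concurrent
# call count. The overhead factor is an illustrative assumption.
CODEC_KBPS = {"G.711": 64, "G.729": 8}   # payload rates only

def voip_bandwidth_kbps(codec, concurrent_calls, overhead_factor=1.3):
    """Approximate aggregate bandwidth need, per direction, including an
    assumed header overhead multiplier."""
    return CODEC_KBPS[codec] * concurrent_calls * overhead_factor

# Sizing a link for 30 simultaneous G.711 calls
needed = voip_bandwidth_kbps("G.711", 30)
```

Even this crude arithmetic makes the codec trade-off concrete: the same 30 calls on G.729 would need roughly an eighth of the payload bandwidth, at the cost of greater sensitivity to loss.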
The primary objective of QoS techniques is to provide a prioritization mechanism that can ensure that every type of packet on the network is handled appropriately for the content inside. Because of its real-time nature, VoIP traffic typically receives preferential treatment to reduce or eliminate delay. Commonly monitored voice performance metrics include:
* Delay: Latency, or delay, is an estimate of the packet delivery time across the network. It's expressed in milliseconds and measured as the average difference between the sender's and receiver's timestamps on messages, taken when the messages are received. Remember that delay is cumulative. End-to-end delay is a key factor in determining overall VoIP call quality.
* Jitter: Variation in delay is called jitter. We've mentioned consistency in performance, and jitter is one area that is crucial in the delivery of real-time services such as voice and video. It indicates a variance in the arrival rate of packets at the destination. Jitter is a predictability factor that is often used in discussion of the overall reliability of a service network. Jitter problems are well known to adversely impact call quality. Networks can compensate for jitter by implementing jitter buffers to normalize the timing of the traffic flow. Jitter buffer loss occurs when jitter exceeds what the jitter buffer can hold. Jitter and jitter buffer loss affect call clarity, which affects the overall call quality and user experience.


* Packet loss: Loss simply indicates packets lost or discarded in transmission. In VoIP services, this could mean the loss of an entire syllable or word during the course of a conversation; more importantly, it might mean the loss of a dialed digit, preventing a call from ever completing successfully. Obviously, data loss can severely impair call quality. Monitoring systems measure the number of packets expected against the number actually received.
* Mean Opinion Score (MOS): MOS is a subjective quality measure that was discussed in earlier chapters. MOS testing is non-intrusive and provides a way to monitor and measure call quality in ongoing network operations. Historically, MOS was derived from panels of judges listening to and scoring call quality. Although it's useful for human users to interpret quality, it's a useless metric for delivering quality assurances under any form of Service Level Agreement (SLA). It doesn't fit well in operational monitoring systems, but there are alternatives that are gaining popularity, especially with large enterprises and service providers.
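Jitter, as a variation in delay, can be quantified with the smoothed interarrival jitter estimator that RFC 3550 defines for RTP: J = J + (|D| - J)/16, where D is the change in relative transit time between successive packets. A sketch:

```python
# Sketch of the RFC 3550 smoothed interarrival jitter estimator.
def jitter_estimate(transit_times):
    """transit_times: per-packet transit time in ms (receive - send),
    in arrival order. Returns the smoothed jitter estimate in ms."""
    j = 0.0
    for prev, cur in zip(transit_times, transit_times[1:]):
        d = abs(cur - prev)       # change in transit time
        j += (d - j) / 16.0       # smoothing gain from RFC 3550
    return j

# Perfectly steady delivery -> zero jitter; bursty delivery -> nonzero
steady = jitter_estimate([20, 20, 20, 20])
bursty = jitter_estimate([20, 35, 18, 40])
```

The 1/16 gain deliberately damps the estimate so single outliers don't dominate, which is also why a jitter buffer sized from this figure tolerates occasional spikes.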

The E-Model is a tool that can predict the average voice quality of a call using a mathematical model. The E-Model accounts for the estimated impact of delay, jitter, loss, and codec performance. The output of an E-Model calculation is called the R Factor or R Value. These values estimate voice quality on a scale from 0 (the lowest quality) to 100 (the highest quality). Like an MOS score of 5, an R Value of 100 is, in theory, unattainable, but it's the standard goal network engineers shoot for when designing voice service networks. Because E-Model scores are based on measurable parameters, monitoring tools that can be incorporated into the enterprise NMS strategy for performance monitoring are becoming more common. E-Model monitoring tools evaluate Real-time Transport Protocol (RTP) streams based on information found in the traffic flows (source address, destination address, TCP/UDP port numbers, and packet sequence numbers) to create what is called a jitter profile. The E-Model then creates a score that, in testing, correlates to traditional MOS with 80 to 90 percent accuracy.
There is another non-intrusive performance measurement technique emerging, called the ITU-T Calculated Planning Impairment Factor (ICPIF) score. It's based on the ITU-T G.113 standard. Driven by the increasing sophistication required to accurately assess voice quality, some vendors are beginning to adopt this technique. With increased attention to VoIP quality in standards bodies such as the IEEE and ITU-T, this method, which has been around for nearly 10 years, is gaining traction across the industry.
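The final step of an E-Model calculation, mapping an R Value to an estimated MOS, uses the standard conversion from ITU-T G.107. The impairment terms that produce R in the first place are omitted here for brevity; only the mapping is shown:

```python
# The ITU-T G.107 R-factor-to-MOS mapping used with the E-Model.
def r_to_mos(r):
    """Convert an E-Model R Value (0-100) to an estimated MOS (1-4.5)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)

excellent = r_to_mos(93)   # near the typical G.711 upper bound
poor = r_to_mos(50)        # most users dissatisfied at this level
```

Note how the mapping tops out at 4.5 rather than 5: like an R Value of 100, a perfect MOS is unattainable in this model.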


ICPIF takes some factors into account that the E-Model does not:
* Signal attenuation distortion
* Circuit noise
* Codec encoding distortion
* Delay distortion
* One-way transmission time
* Echo

Although ICPIF may be slow moving into enterprise business network monitoring, it will certainly play a role among service providers in delivering commercial voice services using VoIP.
Collecting and Analyzing the Data
Throughout, this guide has repeated a common theme: information gathering is absolutely necessary to successfully manage a converged service network. The more extensive and comprehensive the data collection and analysis tools, the better armed network managers are to make decisions about present daily operations and future capacity and growth plans.
Monitoring the Health of the Network
Ongoing health monitoring provides the service delivery organization with information for making informed decisions. Perhaps more importantly, it provides a reportable mechanism to demonstrate service delivery assurances to end users and customers.
Performance and Utilization Trends
Close monitoring of network services brings the tools to hand to proactively measure performance and utilization. The enterprise network, particularly the integrated services network delivering a mix of data, voice, and video services, is a finely tuned machine. Just as a high-performance vehicle will get better mileage, achieve faster acceleration, and handle better on the race course, a finely tuned network will handle peaks in traffic and unplanned events more smoothly than a network just left to run. The dashboard of the NMS provides a window into the performance levels and utilization of the network at any point in time. Changes in operating conditions can be detected quickly and proactively, ensuring that business-critical network services are always available.


Administration Management for Performance and Planning
In addition to daily operations, monitoring all these factors provides trending information on utilization, traffic patterns, and capacity. This enables network planners to accurately anticipate future network needs. This added knowledge factors into business decisions in other ways. Knowing that capacity planning needs are being closely managed helps ensure that a company doesn't overspend on future capacity that isn't required to support the business. Information acts as a quality assurance mechanism in future network planning.
Gathering Usage Statistics
Beyond the call quality and user experience measures and metrics, the infrastructure itself needs to be monitored closely to ensure the critical service delivery elements are operating at appropriate levels and can handle the traffic load being passed. Minutes of use in the traditional telephony environment were used as an indicator that enough T1 trunks and circuits were available. In the converged network, minutes of use correlates to bandwidth and network resource consumption. A trend of growing minutes of use means more network resources are being consumed. It can provide an early indicator that SIP or TDM gateways, signaling servers, voicemail systems, and other elements of VoIP service delivery need to be evaluated to ensure they aren't being stretched beyond tolerable capacity levels. Disk space and CPU utilization are very simple indicators to watch and good barometers of changes in the environment. If any vital network element spikes in CPU utilization, or suddenly runs out of disk space, monitoring ensures an early warning system can engage an immediate response before service delivery is impacted.
Managing Backups and Synchronization for Performance
In enterprise business, backup systems are widely deployed to ensure all mission-critical data is protected. When delivering converged services, some new considerations come into play.
It's imperative that you remember to back up not just the data systems but also the management systems. The NMS in an integrated network is a mission-critical piece of service delivery inside the enterprise. Backup and restoration plans need to be developed and rigidly followed. After investing all the time and effort required to baseline the network, collect configuration information, and build a historical library of past performance metrics and trends, it would be disastrous to lose that information simply because the NMS wasn't part of the backup strategy. For large enterprises and geographically dispersed organizations, continuity of operations may involve a different approach. Many companies have a business continuity or disaster recovery center at a remote location. As part of the NMS strategy, it's wise to consider a mirrored NMS at the remote facility to ensure uninterrupted service in the event of a disaster.


Summary
Networks are increasing in both breadth and depth. The increase in breadth relates to an increase in size. More users connect to enterprise networks. The number of connected devices is also growing. There is a parallel increase in depth driven by the set of services delivered to any endpoint on the network. In earlier networks, the service delivery footprint was small, relegated to FTP, email, and Web browsing. Converged networks are rapidly changing that. The integrated network is now bringing data, voice, and video together. From an infrastructure perspective, that drives the need for more vigilant monitoring of faults, capacity, assets, performance, and security.
Chapter 9 will address security in the converged services network in more detail.

The converged network is also rapidly growing in other ways. Voice and video are viewed as network services. Application services are also quickly integrating, both into the network and with voice and video. There are multiple concurrent convergences underway, making it vital that you monitor closely to protect your business services. Providing users with a consistent QoS and reliable network experience in increasingly complex networks requires the use of increasingly sophisticated tools. Legacy NMSs don't do a good job of managing emerging, converged networks. The old tools focused on a GUI interface with a network status indicator. In today's multi-service network, that may not be enough. New tools are based on algorithms for network discovery, real-time monitoring, visualization tools for modeling, simulation, anomaly detection, and event correlation. This guide has touched on several views of life cycle management. Figure 8.6 presents a simple network performance management life cycle.

[Figure: a cycle of topology discovery, collecting statistics (monitoring), identifying problems, remediating problems, and configuring new systems.]

Figure 8.6: Network performance management life cycle.


Many legacy NMS approaches only deal with discovering network topology and collecting statistics. When planning for the ongoing life cycle management of an integrated data, voice, and video network, it's important to look at NMSs that can incorporate problem identification and remediation along with the configuration of new systems.
As a note on topology, remember that with current networking equipment, topology is more than just ports and cables. The logical topology of VLAN configuration, MPLS domains, and IP subnetting is an important facet of topology discovery in depth.

As networks evolve, QoS guarantees for bandwidth and delay characteristics for audio are a starting point, but they aren't enough. The technologies are advancing rapidly, and this is a good time to consider broader performance attributes such as variations in connectivity, robustness of redundancy and failover, and the overall performance of the traditional best-effort approach used by IP. For future networks, these tools will need to provide for:
* Performance monitoring and measurement
* Forensic analysis (beyond packet sniffing)
* Capacity planning (to project impacts of new services)
* Load generation (for scenario testing)

The only way to achieve an ideal NMS is to proactively incorporate components of performance monitoring into the system. Metrics for availability and responsiveness need to be included in any notification system. Baselining support is intrinsic to success, and routinely scheduled baselines should be run across the corporate network. Without an NMS that is oriented to the complete array of services provided, including data, voice, and video, companies will fly blind as to service assurance. VoIP services that lack comprehensive management and monitoring procedures are likely to deliver poor-quality service. Continual monitoring of documented service-level metrics (appropriate QoS metrics) is the only way to truly ensure and demonstrate enterprise-class service delivery. These metrics should include, at minimum, some combination of jitter, latency, call completion, and voice quality (MOS, and so on). Chapter 9 will wrap up the dive into the FCAPS model and will highlight security management issues facing enterprise deployments today. In addition, the chapter will identify common industry best practices for creating an effective VoIP security plan that balances securing the network against the VoIP requirements for availability, reliability, and performance.

183


Chapter 9: Effective Security Management


The convergence of voice, video, and data networks has been evolving and gaining momentum for several years. Many organizations have undertaken VoIP implementations to converge networks for cost reduction. Others work to achieve the competitive advantage of integrated services. Whatever the reason for network service integration, you cannot overlook the security risks that arise as technologies converge.

VoIP implementers often focus on issues of voice quality and interoperability. These are truly important factors in the delivery of voice services. But inside the converged service network, voice security needs to be treated as data security, and data security needs to be treated as voice security. Each technology brings issues and management techniques that benefit the other.

This chapter will highlight security management issues facing enterprise deployments today and identify common industry best practices for creating an effective and comprehensive security plan that balances securing the network against the VoIP requirements for availability, reliability, and performance. Security methods can adversely impact network performance. Firewalls induce delay by inspecting each packet in the data stream, and congestion at the firewall can lead to variable processing time, which increases jitter.

A systematic and holistic approach to managing integrated network performance and security includes working with vendors, service providers, and trusted business partners to ensure a comprehensive approach to security is followed. As previous chapters illustrate, successful operations are driven by knowledge and information. The more you know about the network, the better you're able to analyze problems. Solid knowledge and understanding of the network leads to an approach that balances all aspects of network management.
Building this base knowledge helps you effectively manage the entire life cycle of network services and applications to ensure you're delivering the services needed today and able to meet the needs of tomorrow.


FCAPS and Security Management


Recent reading online brought my attention to an article from Professor Eugene Spafford of the Center for Education and Research in Information Assurance and Security (CERIAS) at Purdue University. CERIAS is viewed as one of the world's leading centers for research and education in areas of information security that are crucial to the protection of critical computing and communication infrastructure. In regard to best practices, Professor Spafford said:

"In the practice of security we have accumulated a number of rules of thumb that many people accept without careful consideration. Some of these get included in policies, and thus may get propagated to environments they were not meant to address. It is also the case that as technology changes, the underlying (and unstated) assumptions underlying these bits of conventional wisdom also change. The result is a stale policy that may no longer be effective, or possibly even dangerous. Policies requiring regular password changes (e.g., monthly) are an example of exactly this form of infosec folk wisdom. From a high-level perspective, let me observe that one problem with any widespread change policy is that it fails to take into account the various threats and other defenses that may be in place. Policies should always be based on a sound understanding of risks, vulnerabilities, and defenses. Best practice is intended as a default policy for those who don't have the necessary data or training to do a reasonable risk assessment."

In small and midsized businesses, and even large enterprises, the resources often aren't available to perform a comprehensive or even reasonable risk assessment. Perhaps more importantly, we create deliberate change control processes that don't provide the luxury of time to reassess risk each time we make a change in the dynamic networked environment we support. To perform a reasonable risk assessment each time your environment changes would stall all forward progress in technological growth.
Industry best practices are a tool employed because they afford us the collective institutional knowledge of the entire technology sector in adopting processes that have been proven to work. Best practices leverage, to your benefit, the common body of knowledge that IT managers share based on experience. The key to the FCAPS model is in establishing processes for managing each of the five service areas: fault management, configuration management, accounting management, performance management, and security management. This chapter will delve into security management processes to protect and sustain the converged video, voice, and data service network.


Identifying Risks

Best practices provide a tool to support operations, yet some basic risk assessment is always prudent. A risk is simply something that falls outside normal operations. When voice, video, and data networks converge, every risk associated with each service is incorporated into the newly converged network. This additive risk compounds the necessity of building a layered defense to protect not just the network infrastructure but also each of the services running on the network. Figure 9.1 shows a simple risk model called Jacobson's Window that is often used in considering the range of risks to which networks are exposed. Jacobson's Window presents a simple, two-factor risk assessment model, distinguished by the frequency or probability of an event occurring coupled with the impact or consequences of the event. This two-factor matrix simplifies all risk into one of four categories.

[Figure 9.1 is a two-by-two matrix: rate of occurrence (low/high) against consequences (low/high). The low-low and high-high quadrants are marked DISREGARD.]

Figure 9.1: The Jacobson's Window risk model.

The four risk classes identified using this approach are low-low, high-low, low-high, and high-high. This approach is further simplified by breaking risks into two broad classes: inconsequential or significant.
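The quadrant logic above is simple enough to express in a few lines of code. This Python sketch is only a toy rendering of Jacobson's Window; the frequency and loss thresholds are arbitrary assumptions, since the model itself doesn't prescribe where "low" ends and "high" begins.

```python
# Toy classifier for the four Jacobson's Window classes described above.
# The threshold values are illustrative assumptions only.

def classify(events_per_year, loss_per_event,
             freq_threshold=1.0, loss_threshold=10_000):
    freq = "high" if events_per_year >= freq_threshold else "low"
    loss = "high" if loss_per_event >= loss_threshold else "low"
    # low-low and high-high are the inconsequential classes
    label = "inconsequential" if freq == loss else "significant"
    return f"{freq}-{loss} ({label})"

print(classify(10_000, 1))         # spam: high occurrence, low loss
print(classify(0.01, 5_000_000))   # central-office fire: low occurrence, high loss
```

Both example calls land in the significant classes, which matches the discussion that follows: only the high-low and low-high quadrants demand attention.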


Inconsequential Risk Classes

One risk class that is considered inconsequential is the low-low class. It can generally be ignored because it doesn't matter statistically. The likelihood of occurrence is low and the consequences of these risks are deemed low, so they represent minimal impact on the organization. A risk that occurs once a year and has the impact of costing $1 is just too trivial to worry about.

The opposite end of the spectrum, high-high risks, can also be ignored for the most part. In his writing, Jacobson suggests these risks just don't occur in the real world. He uses the example of a 50-ton meteorite crashing into your computer room on a daily basis. Although this example is extreme, it demonstrates why the high-high risk class is inconsequential: if this event occurred, you wouldn't build computer rooms in the first place. In the real world, high-probability, high-loss risks are generally mitigated immediately or we simply couldn't conduct business. These risks demonstrate lessons that we have already learned and for which we've developed remediation strategies. A new high-high risk is typically spawned by some disruptive, radical shift in either technology or business practice. They're typically addressed as they're created.

Significant Risk Classes

If you're looking for a simplified approach to risk management, you've just eliminated half the classes in the entire risk spectrum. That leaves only two categories or classes of risk you really need to be concerned with: high-low and low-high. Spam email represents a perfect example of a high-low risk. There is a very high likelihood of occurrence. Current studies indicate that more than 50% of the email received is spam. In most cases, the resulting loss is a loss of productivity as people delete the spam messages. Jacobson uses a fire destroying a telephone company central office as an example of a low-high risk.
It demonstrates an event that has low probability of occurrence but a high consequential loss.
On May 9, 1988, an electrical fault started a fire in the Hinsdale, IL central office operated by Illinois Bell. Early during the fire, telephone services failed. The fire department didn't arrive on site for 45 minutes. Because of the dense, black smoke, firefighters had difficulty entering the building and locating the source of the fire. Emergency power was automatically provided by generators and batteries and could not be shut off easily. Neither standard dry chemical nor halon extinguishers were effective against the fire. Water had to be used, which exposed the firefighters to electrical shock danger. It took firefighters more than 2 hours to shut down power, enabling them to control and extinguish the fire. It was more than 6 hours after the first fire crews had arrived on the scene that the fire was declared under control. The fire was confined to an area roughly 30 feet by 40 feet on the ground floor. Cables were burned to various degrees, and smoke residue covered most of the ground floor and parts of the first floor. The most severe damage away from the fire was caused not by flames but by corrosive gases in the smoke. These corrosives damaged the equipment that survived the fire. Although the existing equipment was cleaned and used to provide interim service, it was deemed unreliable. All of this equipment had to be replaced over time after the fire.


There is a broad spectrum of real-world risk that ranges from low-high to high-low. Human nature drives another facet of low-occurrence risks. Jacobson postulates that "People tend to dismiss risks that they have not experienced themselves within the last 30 years." Although 30 years may seem arbitrary, psychologists believe it's accurate and related to the range of human life experience. We generally remember events that happened during our lifetime. There are plenty of examples of Jacobson's 30 Year Law at work, and they're easily identified. How often have you heard senior managers or public officials say, "How could we expect anyone to anticipate this event? We've now taken appropriate measures to ensure that this will never happen again." Human nature shows us that some lessons may only be learned through direct experience.

When providing security for network services, managers need to visualize every possible risk that might occur in the enterprise. You're often called on to consider events you haven't personally experienced. It's often useful to pull together incident response teams and think well outside the box of routine business. Natural disasters, cyber security incidents, and unrelated networking problems, such as a cut fiber, all impact service delivery. Sometimes a good planning session involves discussion of all-hazards risk. This approach, undertaken with a broader set of staff members and managers, can help bring other physical and misconfiguration incidents into clearer view as potential problems. The next step is to identify optimal mitigation strategies for each risk.

Some organizations lean in favor of ROI-based risk management. This approach is often met with limited success. The ROI-based approach can work well with events that are high-likelihood, low-consequence because managers believe the risk exists, and therefore, it should be addressed. Low-probability, high-consequence risks present a different problem.
It's difficult to quantify the likelihood of occurrence, making it difficult to convince management that the risk is tangible. Every approach to risk assessment brings benefits, and each has drawbacks. ROI-based decisions aren't appropriate for every situation. When implementing security measures, it's often wise to take a step back and evaluate why the security measure is necessary in the first place. Generally, security measures are implemented for one of four reasons:
- The value of a security measure is significant but the cost is trivial. For example, the lock on the front door of an office costs little yet provides the very first measure of security. If the door is left unlocked, the consequences may be quite high, but the cost of implementing a policy to lock the door is quite small.
- In many cases, the potential reduction in future losses will more than offset the cost of security. In this case, the security measure has a demonstrable positive ROI. This is the justification widely used for many security measures related to high-low risks in business. Card key access controls to buildings, password change policies, and basic security training for employees are good examples.


- For many enterprises, legislation or regulatory requirements necessitate specific security measures. The Health Insurance Portability and Accountability Act (HIPAA), the Gramm-Leach-Bliley Act (GLBA), the Sarbanes-Oxley Act (SOX), and compliance with the ISO 17799 standard are good examples of this justification for security measures.
- In many cases, security measures are implemented to address a low-high risk that has an unacceptable consequence for any one single occurrence. This single occurrence loss (SOL) might be something that exceeds the owner equity or net worth of the company. These measures represent protection against risks that bring devastating consequences.

Low-high risks often require senior management involvement in decision making. ROI analysis may not justify implementing protective measures. Sometimes good judgment is required to decide what's safe enough for the needs of the business.
SOLs and Reducing the Risk

There are several tactics for reducing overall risk using this two-factor approach to risk assessment. Because they offer tangible risks with a heightened sense of urgency, we have a natural tendency to focus on risks that generate a large SOL. You can typically mitigate these risks by:
- Reducing your vulnerability: Enterprises have often thought in terms of disaster recovery, meaning how you might restore operations after a major event. There has been a shift in recent years to think more in terms of business continuity, or continuity of operations. Robust enterprise business continuity and resumption plans reduce adverse consequences by shifting some aspect of the consequences to a resilient business continuity environment.
- Spreading the risk across the enterprise: Distributing work across multiple geographic locations or business units can help spread the impact of any event across Help desk, training, and other organizations. The risk to each business area might be smaller and more easily addressed than a large risk to the entire enterprise.
- Transferring the risk: One solution may be obtaining insurance. The insurance premiums will factor in the exposure against a deductible amount, so the impact might be lowered to a tolerable level. A $50 million risk reduced to a $5 million potential single loss through insurance may provide an acceptable alternative.

After the potential risks have been identified, each enterprise will need to make unique decisions relative to its own business needs. High-low risks may be analyzed using an ROI-based calculation. Low-high risks often need to be reviewed in a management strategy session. It's wise to use the team approach to estimate the SOL potential and gain a collective sense of the likelihood of occurrence. Management will make a decision and draw a line somewhere along the spectrum. Usually this line is drawn based on the largest acceptable SOL.

Risks in the enterprise tend to spread out in a graph like the one shown in Figure 9.2. Viewed in this context, a conscious decision can be made about each risk and each type of risk. Low-low risks might be deemed inconsequential and addressed in another venue. As the risk increases, some risks may be transferred. Higher risks that occur infrequently might be areas for insurance considerations. Toward the middle of the low-to-high consequence range may be risks that can be addressed through policy and training efforts.
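One common way to put numbers behind such decisions is annualized loss expectancy (ALE): the annual rate of occurrence multiplied by the single occurrence loss. The sketch below applies it to the $50 million insurance example from the list above; the 2% annual rate and $500,000 premium are assumptions invented purely for illustration.

```python
# Annualized loss expectancy: ALE = annual rate of occurrence x single
# occurrence loss (SOL). The $50M and $5M figures come from the text;
# the 2% annual rate and $500,000 premium are illustrative assumptions.

def ale(annual_rate, single_occurrence_loss):
    return annual_rate * single_occurrence_loss

rate = 0.02                    # assumed: one event every 50 years
uninsured_sol = 50_000_000     # full exposure without insurance
insured_sol = 5_000_000        # retained loss (deductible) with insurance
premium = 500_000              # assumed annual insurance premium

uninsured_cost = ale(rate, uninsured_sol)        # expected annual loss
insured_cost = ale(rate, insured_sol) + premium  # retained loss plus premium
print(uninsured_cost, insured_cost)
```

Under these assumed numbers, transferring the risk cuts the expected annual cost substantially, while also capping the worst-case single loss at the deductible, which is often the deciding factor for low-high risks.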


[Figure 9.2 plots frequency of occurrence (low to high) against consequences of occurrence (low to high). Regions are labeled Don't Care, Accept Risks, Mitigate Risks, Transfer Risks, and Happens Too Rarely to Address, with a line marking the maximum tolerable single occurrence loss.]

Figure 9.2: Modeling risks.

Although this four-quadrant view of risk provides a simple tool, for most businesses the risk mitigation focus falls toward the middle area of the graph. The lower left quadrant represents things that happen so rarely and cost so little that you might not need to factor them into daily operations. The upper left quadrant represents things that occur more frequently but cost little. These are often viewed as acceptable risks and typically treated as a cost of doing business. High-occurrence risks that rise in consequence show up in the lower right. These may be transferred to other parts of the enterprise, to insurance carriers, and, when appropriate, outsourced to a business partner. The upper right portion represents the high-high risks: consequential events that, at least in theory, were addressed before you started doing business. Jacobson's Window isn't always the best approach to risk assessment for an enterprise, but it is a simple, easy-to-use tool for quantifying and analyzing risk to help make sound business decisions.


Risk Management Life Cycle

Like every process touched on throughout this guide, risk management is a never-ending process. Figure 9.3 shows the ongoing cycle of identifying, analyzing and assessing, and mitigating or transferring risks. Throughout the risk management life cycle, there are four key questions to consider:
- What could happen?
- If it does happen, what is the impact?
- How often could it happen?
- How accurate are the answers to the first three questions (the measure of uncertainty)?

Risk management includes risk assessment and mitigation as two elements of a much broader set of activities. Other elements include establishing a central focal point for management, implementing controls and policies, promoting awareness, and continuously monitoring effectiveness.

[Figure 9.3 shows a continuous cycle around a central focal point: assess risk and determine needs, implement controls and policies, build awareness, and monitor and evaluate.]

Figure 9.3: The risk management life cycle.


Risk assessment provides the foundation for all the other components of the risk management life cycle. It provides a basis for enterprise policies and helps maintain a focus on cost-effective approaches to implementing those policies. Risk assessment is an ongoing process that never ends. New services, including video, voice, and data, are constantly being incorporated into the service network. New vulnerabilities and exploits in operating systems (OSs) and applications are discovered every day. Reconnaissance attacks through firewall probing, port scanning, worms, and viruses are an ever-increasing nuisance. Threats and risks evolve over time and require vigilant monitoring.

Enterprise information security needs a management focus. This central focal point comes about through policy recommendations, business controls, awareness, and other areas that shape the overall corporate culture of stewardship toward business intelligence. Security awareness supports both daily operations and the risk management life cycle. Educating users on acceptable use and security risks is a key factor in mitigating security problems.

Introducing Incident Response Planning: CSIRT Basics

One vital component of how any enterprise responds to a security incident is the maturity and structure of the incident response team. In network security, there are three key management factors: detection, prevention, and response. Although detection and prevention are the ideals, you cannot detect and prevent every incident. Problems will arise. Security incidents will occur. Detection is crucial; it provides the situational awareness as to what is happening in the service network. Your reaction determines how you mitigate problems and restore business to normal operations. Most large enterprises have some form of Cyber Incident Response Team (CIRT) or Computer Security Incident Response Team (CSIRT).
For many companies, these working groups emerge from a de facto team that engages when an incident occurs. Organizations that don't think ahead and fail to build a team find themselves caught in the throes of ad hoc response. This causes delays in mitigation, often leaving critical services impaired for an extended period of time. A well-organized and trained incident response team can provide a range of services that help maintain service delivery capabilities. Typical roles include:
- Notification of the organization's management when an incident occurs
- Activation of a documented response plan to resolve the problem
- Record keeping of the steps taken to reach resolution
- Problem containment
- Evidence collection and gathering, including making backups
- Problem mitigation and service restoration to maintain continuity of operations
- Coordination of incident post-mortem reviews

These steps collectively gather lessons learned from every incident that occurs in the enterprise. These lessons feed process improvement plans, leading to a continually improving ability to provide services with a minimum of disruption.


Many enterprises treat incident response as a value-added service within the organization, coupling prevention, detection, and response services to internal measures or service level agreements (SLAs). As Figure 9.4 shows, the CSIRT team provides incident response services.
Some organizations are now shifting to a model of building cross-divisional response teams that can provide a wider variety of services to multiple customer organizations inside the enterprise. Later, this chapter will address some of these services in more depth.

Figure 9.4: Reactive, proactive, and quality management services in CSIRT.

Building Processes for Defense in Depth


A primary driver for a defense-in-depth, or layered, defense strategy is to create a holistic framework for end-to-end security across the enterprise. This comprehensive approach encompasses layers of security and multiple points of enforcement throughout the architecture. If the worst happens and there is a serious security breach, the organization must detect, contain, and correct the problem. Defense in depth not only reduces the likelihood of a successful attack; it significantly increases the chances of problem detection. Confidentiality, integrity, and availability (CIA) of corporate data and network resources are paramount for any organization. This CIA acronym has become a watchword for the security industry. One goal of a layered defense is to build a service network infrastructure that can withstand attacks or disasters and protect the enterprise from unacceptable losses.


One approach that is often successful is to implement a series of routine technologies and procedures rather than embrace a complex and elaborate scheme that might contain single points of failure. The simpler the infrastructure, the easier it is to protect. It's also easier to maintain. Service networks are large and complex enough without making them more so. And although simplicity isn't always a realistic goal in a large enterprise, it's important to build a service network infrastructure that is easily understood and maintained. Any single control point can too easily become a single point of failure.

Every layer of a strategy for defense in depth supports the other layers. When breaches occur at the perimeter, this layered approach limits the organization's exposure. Security through layers is often compared to an onion, as Figure 9.5 illustrates. There is no single layer that can be breached to get to the heart of the onion. Several layers must be either penetrated or stripped away. In network security, each layer provides another barrier, protecting valuable business resources.
[Figure 9.5 depicts concentric layers of security: policy, procedures, and awareness; physical security; perimeter security; network security; system security; application security; and, at the core, data security.]

Figure 9.5: Layered security: a defense-in-depth strategy.

The best strategies provide effective protection of critical assets while allowing business to flourish. The layers or concentric rings of a layered defense allow for controlled and monitored access at each layer.


User Authentication and the Gold Standard

The primary focus of enterprise security efforts is to protect not only the services network infrastructure but also corporate data. Corporate data is a form of business intelligence, and as such, it's a vital asset to any business. Au is the symbol for gold in the periodic table of elements. This reference to gold provides an easy mnemonic for three of the most critical elements of network and data access security: authentication, authorization, and auditing. Figure 9.6 shows these three golden rules of data access protection.

- Authentication: verify that users are who they say they are
- Authorization: ensure users only access things they have permission to access
- Auditing: know who did what, and when they did it

Figure 9.6: The golden rules of data access protection.

Authentication

Authentication is simply the process of users proving that they are who they say they are. It's important to authenticate all users connecting to the network through some form of credential. User authentication may occur many times throughout the workday. The most common form of user authentication is simply an ID and password, but much stronger mechanisms exist and are in widespread use. There are some considerations to be mindful of with regard to the authentication process itself and how you deploy the technologies used:
- Do you employ a single authentication data store or are multiple resources involved?
- Are IDs and passwords encrypted during transmission or sent in the clear?
- Is simple user ID and password authentication sufficient, or should you use some form of strong or two-factor authentication?
- How secure is the authentication and user validation process?


It's important to authenticate users both when they connect to the network and when they access business information that is restricted in any way. Many networks use simple password authentication technologies based on either Password Authentication Protocol (PAP) or Challenge Handshake Authentication Protocol (CHAP). These methods require only that the user know a user name and password. This is single-factor authentication. Two-factor, or strong, authentication is typically based on a combination of something you have and something you know. One widely recognized example is the key fob technology that uses a time-based token to generate a number (something you have) coupled with a PIN (something you know). Biometrics may be used to introduce a third authentication factor: something you are. The addition of a fingerprint or palm scan, facial recognition, voice recognition, or retinal scan adds another layer of depth to the authentication process. Three-factor authentication is common today in the highest-security environments and many research facilities. Although this technology works, the cost is generally still prohibitive for the typical business network.
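The time-based token in the key fob example is typically an HOTP/TOTP-style algorithm (RFC 4226 and RFC 6238): an HMAC over a moving counter, truncated to a short decimal code. A minimal Python sketch of the core computation, with the counter derived from the clock; real deployments add secret provisioning, clock-drift windows, and rate limiting.

```python
# Minimal HOTP/TOTP sketch in the style of RFC 4226 / RFC 6238.
import hashlib
import hmac
import struct
import time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HMAC-based one-time password."""
    msg = struct.pack(">Q", counter)                   # 8-byte big-endian counter
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                          # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, period: int = 30, digits: int = 6) -> str:
    """Time-based variant: the counter is the current 30-second interval."""
    return hotp(secret, int(time.time()) // period, digits)

# RFC 4226 Appendix D test secret
print(hotp(b"12345678901234567890", 0))   # 755224
```

Because the server and the token compute the same code from a shared secret, possession of the fob (something you have) plus the PIN (something you know) yields the two independent factors described above.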
Two-Factor Authentication Risk Analysis

Strong, or two-factor, authentication cannot be considered an inconsequential risk under a Jacobson's Window analysis. Access control cannot be treated as a low-low inconsequential risk due to its high likelihood of occurrence. In a recent US-CERT analysis report, unauthorized access attempts rate in the highest category of network probes, encompassing 85.6% of attacks on networks in the latest quarterly evaluation. Access control cannot be treated as a high-high inconsequential risk either. These exposures occur daily, constantly, in the stream of questionable activity inbound to state networks. Unauthorized access attempts inbound to state government resources number in the millions each month. The rate of occurrence is high, but the success rate is very low. The initial consequence of a breach of a user account may well be considered low.

Two-factor authentication concerns clearly fall into the category of significant risk. Significant risks are those that have either a high rate of occurrence or present a high-consequence risk. Given millions of attempts per month against state government network resources, the rate of occurrence for login attempts through password dictionary attacks and other methods is clearly high. Whether the consequences of a breach are high depends on which system(s) are breached. The danger is that one breach easily leads to another, allowing deeper penetration into the network. These cascading attacks are among the most dangerous, as the attacker gains increasing ability to masquerade as an authorized user with each layer of penetration. Single user passwords have long been proven the weakest form of security in widespread use. In a recent survey, 70% of users were willing to trade their password for a chocolate bar (see http://www.securityfocus.com/columnists/245).

There is research underway on many fronts to shift the technology from two-factor and three-factor authentication to a newer mindset of continuous authentication. This might be accomplished through a variety of heuristic information, such as keyboard typing rhythms and other behavioral patterns. As networking technologies have improved, so have security options, and many networks implement a variety of these technologies.


Authorization

After a user provides his or her credentials and has authenticated, the next step is authorization. This is simply a matter of matching the authenticated user against a stored profile and applying the appropriate access permissions for that individual. For some users, network access may be the only action permitted, perhaps to the entire network or, in some cases, only to a small subset. Different employees will typically have permission to work with different sets of data. This is commonly referred to as role-based authorization (for example, Human Resources, sales, and engineering staff all need access to different resources but not to one another's). Other authorization examples in a unified communications environment include:
- The ability to establish a remote VPN tunnel. This is generally predefined by user group and establishes another set of user controls based on rules in a VPN concentrator or firewall.
- The ability to program speed dial lists in the VoIP system, which may be restricted to users with specific privileges.
- Conference call reservation and management, which is often a restricted service.
- Access to log and reporting servers, where controls ensure that appropriate management or support staff are the only employees with access privileges.
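A role-based check like the examples above reduces to a lookup from role to permitted resources. This sketch uses invented role and resource names purely for illustration; it does not reflect any specific product's schema.

```python
# Minimal role-based authorization lookup. Role names and resources are
# illustrative assumptions only.

ROLE_PERMISSIONS = {
    "hr":          {"personnel_records"},
    "sales":       {"crm", "speed_dial_admin"},
    "engineering": {"source_code", "vpn_remote"},
    "support":     {"log_servers", "conference_admin"},
}

def is_authorized(role: str, resource: str) -> bool:
    """True only when the user's role profile grants the resource."""
    return resource in ROLE_PERMISSIONS.get(role, set())

print(is_authorized("engineering", "vpn_remote"))    # True
print(is_authorized("sales", "personnel_records"))   # False
```

The deny-by-default behavior (an unknown role gets an empty permission set) mirrors the principle that authorization should grant only what the stored profile explicitly allows.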

Auditing

Auditing has become an important factor for enterprise business. In traditional telephony, the focus was on accounting to support customer billing. As enterprise services grow and mature, auditing becomes a more necessary tool. Auditing is often used in testing security configurations. It's also important for troubleshooting. Audit records will let you see whether the network permits multiple, simultaneous logins from the same user. Some organizations permit this by design, while others deny this capability. Multiple logins from different locations might provide an indicator that an employee's account has been breached or that a password has been stolen. A comprehensive audit trail provides a record of what every user did, and when. It can provide tremendous value in event correlation analysis when an incident occurs. Knowing when a VoIP call was placed, when a router configuration was changed, or when a server patch was applied can all provide crucial business intelligence about the service network.
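Detecting the multiple simultaneous logins just described is a straightforward scan over the audit trail. The record format here, a (user, source address, action) tuple, is an assumption for the sketch; real audit logs carry timestamps and much richer event types.

```python
# Scan an audit trail for users logged in from two source addresses at
# once. The record format is a simplifying assumption.
from collections import defaultdict

def concurrent_login_users(events):
    """events: (user, source_ip, action) tuples in time order."""
    active = defaultdict(set)      # user -> addresses currently logged in
    flagged = set()
    for user, ip, action in events:
        if action == "login":
            active[user].add(ip)
            if len(active[user]) > 1:
                flagged.add(user)
        elif action == "logout":
            active[user].discard(ip)
    return flagged

log = [
    ("alice", "10.0.0.5",  "login"),
    ("bob",   "10.0.0.9",  "login"),
    ("alice", "192.0.2.7", "login"),   # second simultaneous session
    ("bob",   "10.0.0.9",  "logout"),
]
print(concurrent_login_users(log))     # {'alice'}
```

Whether a flagged user represents a policy violation or an account breach depends on the organization's design decision noted above; the audit trail simply makes the condition visible.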


Chapter 9

Prepare to Respond

You and your network are a target, whether you think you are or not. Security incidents will occur regardless of whether you plan for them. The odds are high that when an incident does occur, it will be broader in scope and more damaging to your network services than you've planned for. The time to prepare is now, and tomorrow, and the next day. When your network is under attack and services are failing, you will not have the luxury of thinking about a response plan. It's wise to take preparatory steps in anticipation of the inevitable.
Treating the network like a doctor treats a patient, managing overall health care, provides a robust and sustainable network service delivery environment. Gathering documentation and consolidating knowledge about the unified communications service delivery environment is vital to understanding the healthy profile of the network. Strong, industry-adopted security practices, like good managed health care, require monitoring the health of the network on a continuous basis. Quick response to abnormal conditions will ensure early diagnosis and treatment of problems. Prevention, detection, and timely response are cornerstones to maintaining the good operating health of the service delivery network.

Policies, Procedures, and Awareness

The most secure system in the world is only as good as the people administering and using it. People form the first layer of defense in any good security model, and historically they are the weakest link. Policies and procedures provide the foundation for any layered defense strategy. A corporate culture of stewardship for protecting the company's proprietary information is fundamental to good network security. It's not enough to put security policies in place. Employees need to understand them. Security policies should be reviewed with all employees annually, and every employee should be trained to understand his or her individual role in protecting network services and corporate information. These processes and procedures are more important than security technologies. It's absolutely vital that you account for the human factor; it's widely accepted across the information security community that the greatest threat to security lies within. People can all too easily make mistakes or become careless, and a disgruntled, unhappy, or careless employee can be a threat to the security of the entire enterprise. Employee awareness and training are essential to building a sense of ownership into the corporate culture.


Physical Access and Environment Controls

The physical layer of a services network seems easy to protect: grant physical access only to those who need the systems providing services. Physical access often grants administrative access via console ports and the like. Environmental controls, such as power and HVAC, should also implement controlled access mechanisms. Drastic changes in the operating environment, such as an overheated server farm room, can result in damage and unexpected system behavior. Physical access controls also mean controlling access to server farms, wiring closets, and other services equipment locations. You build layers of defense in the network; it's important to build layers of defense in your physical security as well. Just as you use monitoring systems to track network performance and security, for many businesses, physical access monitoring may also be appropriate.
There is a simple security review you can easily conduct that often yields some surprising results. Simply walk around your enterprise imagining that you're a burglar casing the joint. Pay close attention to the server farm rooms and data center areas. Make sure you visit every room containing critical systems: mainframes, servers, voicemail systems, telephony systems, routers, and switches. Check out the door hinges. Are they on the inside or the outside? Often doors are built to swing outward for specific fire or building code logistics, but a door that swings outward often leaves the hinges exposed on the outside. How easy would it be for an intruder to gain entry by removing the pins from the hinges on a locked door? Wiring closets housing equipment are often built with doors that swing outward because they're built in shallow recesses, so they're particularly vulnerable to this problem. It's worth the time and effort to conduct a complete physical inventory and identify wiring closets that should have reinforced hinges and locks installed.

Perimeter: First Line of Defense Between the Internet and Internal Networks

The network perimeter used to be easy to identify. It was a single point of network ingress and egress, typically protected by a firewall. It's certainly true that the enterprise connection to the Internet is one perimeter point, but every extranet connection to a business partner is a perimeter point, and every VPN tunnel linking to another network is another. Deperimeterization of the network has become a large issue for security managers to address. Not only do you have external perimeter points but there are numerous internal perimeters within any large organization. Sales and marketing need to be cordoned off from research and development groups. Human Resources operations require special access controls to protect confidential personnel records. The network management resources need to be isolated from everyone except authorized users. It's important to recognize that the goal of security perimeters isn't to keep everyone out. If that were the goal, you would simply disconnect the network and isolate it, eliminating all danger. Security mechanisms are implemented to support business needs and allow appropriate access permissions to users who are trusted to do certain things in the network. Enterprise networks today provide a layered defense system. Firewalls provide a hard outer shell, but without access permissions to demilitarized zones (DMZs) in the network and other resources as required, business would grind to a halt. The network services infrastructure and network operations center represent vital organs that sustain the health and continuous operation of the corporate organism.


The external hard perimeter is most often a firewall or series of firewalls. You implement firewalls to protect corporate assets from the malicious traffic on the Internet. They are vital to network safety, but they also must allow business to take place. Traffic must get in and out of the network while you reduce the threats to an acceptable level. Some tactics used include:

- Using internal and external firewalls in combination
- Logging, analyzing, and reporting all access to the network
- Eliminating use of plaintext or unencrypted data that might expose internal systems to external threats. FTP and Telnet are cleartext protocols that can expose vital business information to anyone who might be capturing traffic; SIP, a widespread standard protocol in VoIP services today, is also a cleartext protocol that could potentially expose information about the enterprise network
- Encrypting incoming network traffic that is directed toward secure internal systems. VPN technologies, Secure Sockets Layer (SSL) for Web services, and Secure Shell (SSH) instead of Telnet can provide an added measure of protection
- Using proxy servers to provide traffic aggregation and funnel services through a single point to limit exposure; reverse proxy technology can hide the inner workings of the corporate network
- Checking the identity of everyone accessing the network with strong user authentication, as described earlier, to help ensure that only known users gain access to corporate resources
- Using DMZs and filtered networks to constrain external traffic to specified areas or segments of the network
- Conducting routine penetration testing and vulnerability assessments to help identify weaknesses in the enterprise security posture
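The firewall tactics above ultimately compile down to ordered rule lists with an implicit default-deny. The fragment below is a simplified sketch of first-match rule evaluation; the rule tuples and port choices are illustrative assumptions, not any vendor's syntax.

```python
# First-match firewall rule evaluation, heavily simplified for
# illustration. Real firewalls also match on addresses, interfaces,
# and connection state.

def first_match(rules, packet):
    """Return the action of the first rule matching (proto, dst_port);
    '*' is a wildcard. Unmatched traffic falls through to default deny."""
    proto, port = packet
    for r_proto, r_port, action in rules:
        if r_proto in (proto, "*") and r_port in (port, "*"):
            return action
    return "deny"   # implicit default-deny, the usual perimeter posture

RULES = [
    ("tcp", 443, "allow"),   # HTTPS into the DMZ
    ("tcp", 22,  "allow"),   # SSH instead of cleartext Telnet
    ("tcp", 23,  "deny"),    # explicitly drop Telnet
]
```

The key design point is the last line of the function: anything the business has not explicitly permitted never gets in.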

Network: Internal Network Layer

Layered defense can map conveniently to the OSI Reference Model in many ways. At the network layer are tools you can bring to bear to deliver security. You can use network monitoring tools, as well as segmentation in the network design, to help provide another layer of security. Intrusion detection systems (IDSs) can provide alerts and warnings to assist the security team in quickly responding to incidents as they're detected. The faster a problem or security incident can be identified, the more quickly it can be mitigated. Quick problem resolution helps keep security incidents from cascading into a larger chain of events. In short, early detection improves reaction time.


Intrusion prevention systems (IPSs) go a step further. An IPS may be configured so that it can reconfigure other network elements in an automated response to some trigger event. In early 2004, the Bagle and Netsky worms spread widely and caused major network problems for many companies. Today, years later, both Bagle and Netsky variants are still being distributed daily. For most companies, an antivirus engine, running either on the mail gateway server or on individual workstations, frequently mitigates this problem. Because Bagle and Netsky have been around for so long, both are easily recognized by their digital signatures, and intrusion detection devices routinely identify this malicious traffic. An IPS might not simply discard the packets at network ingress; it might also reconfigure border routers to shun all traffic from the sender. This might be a rule that is implemented automatically for a pre-set period of time, or it might be configured as a permanent change to reduce risk and eliminate the CPU processing and bandwidth consumption that follows from allowing these malicious packets onto the enterprise network.

Traffic-shaping tools are used in large networks to provide trending information about network traffic. For example, assume that your services network traffic is normally made up of 30% Web traffic, 40% VoIP/video traffic, 20% email traffic, and 10% other mixed traffic. Traffic-shaping tools can provide indicators of pattern changes in the traffic. Although patterns evolve and change over time, making this useful planning information, a sudden or unexpected change in traffic patterns is often indicative of a network security problem. A new worm propagating via the corporate email system might cause email traffic to spike to 40% of all network traffic or more. Monitoring traffic patterns can provide an early indicator of a problem of this sort.
Access control lists (ACLs) can help block traffic and allow only specific IP addresses to communicate across network segments and subnets. Although they are basic and simple, ACLs can help protect critical VoIP systems by allowing access only to authorized system administrators from predefined network management workstations. Internet Engineering Task Force (IETF) RFC 1918 (http://rfc.net/rfc1918.html) defines nonroutable addresses. A well-developed network strategy for deploying RFC 1918 addresses can increase the services network security and further segregate traffic. In the VoIP environment, a design consideration might include using only RFC 1918 addresses for VoIP phones and constraining voice service traffic to VLANs and MPLS domains dedicated to voice. Simply not allowing voice traffic on data VLANs and subnets provides a degree of granularity that offers some level of service protection by preventing intermingling of voice and data traffic.

Implementing a network-based approach to antivirus technologies may help eradicate viruses and worms before they spread to affect other layers. One layered approach to antivirus solutions might be to deploy one vendor's solution in the network, at the email gateway, and another vendor's solution on the desktop workstations. This two-fold approach also guards against any single vendor being slower than another to release updated virus signature files.

Conducting routine vulnerability assessment and testing of network services helps provide insight from the attacker's point of view. During normal operations, network personnel always strive to follow best practices, but the demands of business and heavy workloads can lead to potential errors. In some cases, a solution is implemented quickly just to get the job done. Regularly scheduled assessments can help ensure that any suboptimal techniques used to meet business needs today don't become the services network security vulnerabilities of tomorrow.
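A design rule such as "VoIP phones use only RFC 1918 addresses" is easy to verify programmatically. The sketch below uses Python's standard `ipaddress` module to test an address against the three private ranges RFC 1918 defines:

```python
# Check whether an address falls in one of the RFC 1918 private
# ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
import ipaddress

RFC1918_NETS = (
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
)

def is_rfc1918(addr: str) -> bool:
    """True if addr is a nonroutable RFC 1918 address."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in RFC1918_NETS)
```

A provisioning script could run this check against every VoIP phone's configured address and flag any device that would be reachable from outside the private address plan.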


Systems: Server and Client OS Hardening Practices

Chapter 8 looked at configuration management, but server and client OS hardening goes beyond the network routers and switches to include all systems. You need to build processes for standardized installation and configuration of OSs. Services that aren't specifically required should be disabled on servers and workstations. Programs that aren't needed should be removed. All configuration changes to enterprise systems should be traced through audit logs. Change management procedures are necessary to ensure that only authorized changes are made to production systems. Establishing a standardized set of configurations, across all platforms, is a widely accepted practice. Software programs that have been approved and accepted for use within the enterprise should be installed in standardized configurations. Some divisions or workgroups within any company may have specific software requirements, but only tested, approved, and authorized software should be installed. In many organizations, users are not allowed administrative rights to their local desktop machines. This approach can prevent the installation of unknown software and help enforce the use of standard enterprise configurations. Strong user authentication and password technologies need to be implemented. When a system is prepared to move into network services production, special precautions should be taken to eliminate all guest accounts and vendor default accounts and passwords. Keep in mind that default password information is easily obtained over the Internet directly from vendor documentation.
When setting up new systems in the pre-production environment, it is vital that all OS patches be applied and the system scanned prior to connecting it to the production network. Systems built from a 6-month-old CD will be built from an image that isn't at current OS patch levels. It's a sure bet that these older patch versions have known vulnerabilities, and it's highly likely that there are exploits in the wild to take advantage of those weaknesses. These systems, once connected to the network, can be infected in minutes, well before OS updates from the vendor can be applied. Several industry studies have shown that an unpatched machine is typically compromised within 4 to 6 minutes of being connected to the Internet. A rigorous process of building, patching, scanning, and then revalidating patch levels will help ensure that new systems are not infected during the process of being brought online in production.


Application: Application Hardening Practices

The applications running on hosts often control how corporate data is handled. It's important to see that corporate applications don't introduce security vulnerabilities. A poorly written or malfunctioning application might allow unauthorized access or undesirable manipulation of data. Secure coding techniques and stringent quality control processes are vital for any in-house application development. You implement change control mechanisms in the IP network; you must take the same steps with your business applications. If a vendor partner releases a software patch for a VoIP gateway in the services network, diligent testing and documentation should be completed as the patch is applied in a methodical fashion. Just as network hardware has a life cycle, so do business applications. Part of the application life cycle should include code auditing and peer review of the programming. Doing so will help ensure the integrity of enterprise applications. Authentication, authorization, and audit capabilities at the application level should be a consideration in all enterprise software, whether purchased commercially or developed in-house.

Data: Protection of Customer and Private Information

The single most valuable asset within the services network is the data. This data provides business intelligence information about the enterprise. It provides transaction records and history with customers. It provides details about products, services, employees, and finances. Every layer of a strong defense-in-depth strategy is implemented with one primary goal: to protect the data. One common approach in businesses is to follow traditional military strategies and classify data in terms of the intended audience for distribution. When customer data is involved, privacy laws and regulations to protect personally identifying information will often dictate the protective measures your company must take.
Chapter 10 will delve more deeply into regulatory compliance issues.

Summarizing Defense in Depth


The basics of a defense-in-depth strategy needn't be difficult to remember. Policies, procedures, and awareness provide the foundation of a secure network. Physical security ensures that only authorized personnel can touch systems. Perimeter security segments traffic to protect the boundaries between untrusted, semi-trusted, and trusted areas of the network. Network security adds a layer of intrusion detection and prevention solutions and provides traffic trend analysis to identify changes in patterns. Systems security ensures standardized configurations that adhere to company policies. Application security ensures best practices in coding and configuration of applications are followed. Data protection safeguards vital business intelligence information.


Prevention
Networks are vulnerable to a number of known problems. Perhaps the most feared, classic problem is a denial of service (DoS) attack. Under a DoS attack, the network is reduced to a state in which it can no longer carry legitimate users' traffic. Historically, malicious attackers accomplished this by attacking the routers and flooding the network with extraneous traffic. As malware has increased, today a worm attempting to replicate across the corporate network may present the greatest DoS problem as it consumes all available network bandwidth. Another classic problem is IP address spoofing. To protect the network against increasingly sophisticated attacks, you need to build stronger layers of defense. Just as layers of clothing are the best protection from the cold, layers of defense eliminate single points of failure and provide broad network security.
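One standard defense against the IP address spoofing problem mentioned above is ingress filtering: a packet arriving from the Internet should never carry a source address from your internal address space. A minimal sketch of that check, assuming the internal space is the RFC 1918 ranges:

```python
# Anti-spoofing ingress check: flag Internet-facing packets whose
# source address claims to be internal. The internal prefixes here
# are the RFC 1918 ranges, an assumption for illustration; use your
# own address plan in practice.
import ipaddress

INTERNAL_NETS = [ipaddress.ip_network(n) for n in
                 ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def spoofed_on_ingress(src_addr: str) -> bool:
    """True if a packet from the external interface carries an
    internal source address and is therefore almost certainly spoofed."""
    ip = ipaddress.ip_address(src_addr)
    return any(ip in net for net in INTERNAL_NETS)
```

In production this logic lives in border router ACLs rather than application code, but the decision is the same membership test.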
Digital Common Sense

In an article published in Computerworld in 2002, Thornton May said, "There can be little argument that the digital world would be much improved if all senior executives were required to enroll in some kind of information protection program. And we must hold each employee responsible for protecting intellectual property. But this will be difficult because many executives lack a digital common sense." Although business executives understand that protecting intellectual property is a key protective measure, perhaps their greatest responsibility is to lead by example and establish the climate of the corporate culture within any company. What you take as common sense in the real world often translates poorly to the digital networked world. It's vital that the enterprise leadership team foster a culture of awareness, stewardship for corporate information, and heightened digital common sense. You must develop and hone your digital common sense so that you can implement smart, cost-effective approaches that protect the safety of your corporate assets in the networked digital world.

In the integrated services network, each service you provide may have its own security requirements. These requirements often vary based on the intended use of the service. Services that are only intended for use inside the corporate network are likely to require very different protection methods than services provided for external use. For example, if VoIP is used only for internal calls, completely blocking all external access to the server may be adequate. Servers that provide internal service to the organization shouldn't normally be co-located on the same server hardware that is used to provide external services. One common best practice is to isolate external traffic to a set of outside-facing DMZ segments while using another inside-facing set of segments for access from inside. These segments are often protected by implementing firewalls to allow only authorized communications between the inside and outside. Internal communications services still require design review considerations: Will every employee use VoIP services? Do you allow vendors, contractors, or visitors to connect to the internal trusted network? Will these vendors, contractors, or visitors ever be authorized VoIP users?

If not, special care must be taken to ensure that only authorized VoIP users can gain access to servers, gateways, and other VoIP service delivery elements.


Protect the Protection

Perhaps the most critical component in establishing a network security posture is protection of the network services security and management platforms themselves. It's a fairly common network design practice to implement a dedicated management segment within the network. Because this management segment has oversight of the whole network, systems on this segment may require unfettered access to the entire network. The reverse is not true. This segment represents, in some ways, the keys to the kingdom of the enterprise services network. Management servers should be accessible only to trusted employees in authorized work groups. Only authorized staff should be permitted to view log data for analysis. Intrusion detection and prevention systems shouldn't be seen or touched by anyone outside the network security staff. It is vital that network management and security protection services themselves be protected. Firewall rules and ACLs can define authorized network locations by the IP addresses that are granted access. Strong user authentication techniques like those described earlier can ensure that only authorized users have permission. Discovery protocols such as Simple Network Management Protocol (SNMP) and Internet Control Message Protocol (ICMP) should be disabled if they aren't necessary.
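The two controls above, a source-address allowlist and strong user authentication, combine with a logical AND: either one failing should deny access to the management segment. A minimal sketch, with hypothetical management station addresses:

```python
# Access gate for the management segment: permit a connection only
# from a predefined management station AND only for a user who has
# already passed strong authentication. The addresses are hypothetical.

ALLOWED_MGMT_STATIONS = {"10.99.0.10", "10.99.0.11"}

def mgmt_access_permitted(src_ip: str, authenticated: bool) -> bool:
    """Both conditions must hold; failing either denies access."""
    return authenticated and src_ip in ALLOWED_MGMT_STATIONS
```

In a real deployment the address check belongs in firewall rules or router ACLs and the authentication check in the AAA service; the point is that neither control substitutes for the other.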

Detection
Although prevention tools are valuable aids to securing the integrated services network, you can't prevent everything bad that might occur. Tools help with prevention, but ultimately your survival depends on detection. When prevention fails, and it will fail, you must have strong detection and notification mechanisms. You need to know as much as possible about every event in the network. Comprehensive detection tools can shorten response time and lead to quicker remediation. Reliable incident management techniques help shift the security posture from a reactive mode to a proactive one. These measures make pre-emptive security changes part of the corporate culture. Like every process, good incident management involves continuous review and assessment. It's important to take measures that will allow detection of not only situations in which information has been damaged, altered, or stolen, but also how it was damaged, altered, or stolen, and who caused the damage. This can be accomplished through procedures and the technologies discussed earlier that can detect intrusions, damage or alterations, and viruses.

Reaction
Just as every facet of network service delivery has an accompanying life cycle, everything you do in network security relies on layers of protection. This chapter has looked at some of the steps you can take to prevent security breaches, but you know breaches will happen. You establish procedures and implement tools to detect security breaches as they occur, and you do so to enable a quick reaction. By building comprehensive processes into your day-to-day operations, you can respond proactively rather than merely reactively. How an organization responds to an incident is driven by how well prepared everyone is.


Proactive and Reactive Strategies

A comprehensive security response plan should include both proactive and reactive strategies. The proactive strategy is really a pre-attack strategy. It documents a series of steps taken to reduce any existing security policy vulnerabilities, and it also involves developing contingency plans. Determining the damage that an attack can cause, along with the vulnerabilities exploited during an attack, will aid in building a complete proactive strategy. The reactive strategy is a post-attack strategy. It provides tools for assessing any damage caused by the attack and identifies steps to repair the damage. A reactive strategy also engages any contingency plans developed as part of the proactive strategy. During reaction, you restore normal business operations, then document and learn from the experience. Incident management is really a matter of addressing a few simple questions:

- What happened?
- Where did it originate?
- Who did the incident impact?
- What steps are needed to mitigate the problem?
- How can it be prevented in the future?
Reacting to a Security Incident

We spend a lot of time trying to prevent security incidents from occurring. What sometimes gets lost in all of this preparation is a plan for dealing with an incident should it actually occur. The following is a very brief synopsis of several incident-handling guides that provides a high-level framework for dealing with either the realization that a system has been compromised or the recognition that a system is under active attack. The SANS Reading Room has a large set of papers about incident response at http://www.sans.org/reading_room/.

Remain Calm

To successfully handle any perceived emergency situation, you must remain calm so that you can assess what is going on around you and react in a methodical manner. A compromised system/network or an attacker on the loose demands well-thought-out action; frankly, the bad guys have probably been in your computer for days or maybe even longer, and another few minutes won't make much difference. You're probably going to have to rebuild your compromised servers anyway.

Notify Your Organization's Management and Activate Your Response Plan to Get Help

Your security policies should identify the pecking order of who gets called, and when, if there's a security event. Individuals with particular responsibility for the affected server(s) and/or network(s) should be notified, as well as any information security personnel. The severity of the incident (and your own policies) will dictate who else is brought in: your Internet Service Provider (ISP), department head, corporate officers, the press, law enforcement, consultants, response centers, and so on. Notify whoever is necessary to assess the situation and get it under control, but it is generally best to maintain a "need to know" stance and communicate, at least initially, with only the necessary parties. Whenever possible, use telephones and faxes during a computer security incident.
If the attackers have full access to your computer, they can possibly (probably?) read your mail. If you use your computer, this allows them to know when you report the incident and what response you got. There is a real possibility that other systems at your site have also been compromised and one or more packet sniffers are running on your network. Thus, if you absolutely must use a computer to communicate and you are fairly certain it can't be intercepted, use a different system and/or dial-up ISP access if possible.

Take Good Notes

This cannot be stressed enough: document, document, document! Maintain a log of everything you see and do, everyone you speak with, and the team working with you. This will not only help you in criminal cases (and in remembering the events at a trial that might take place a year or two down the road) but also in the investigation/forensics process, post-event analysis, and as an educational/intelligence-gathering vehicle for others in the InfoSec community. Notes should be detailed, organized, and complete, and should reflect the basic who, what, where, when, and how ("why" might be left for later on). Keep copies of any altered files before restoring your system.

Contain the Problem

Take any necessary steps to keep the problem contained and prevent it from spreading to other systems and/or networks. This may well involve disconnecting the compromised system from your network and/or disconnecting your network from the Internet. Containment may require a physical disconnect or might be accomplished while you clean up and recover; circumstances, including whether you are dealing with an active attack or the aftermath of an attack, will dictate what is a prudent action. Note that the latter approach (containing the problem while still online) might well leave you vulnerable to additional attacks.

Gather Evidence and Make Backups

For purposes of learning what happened and to have evidence for future analysis (and possible prosecution), make backups of OS and file system information as well as any state and network information (for example, output from netstat or route). Keep a detailed history of this activity if you have even the least suspicion that this information will be used in a criminal or civil trial; digital signatures and file timestamps are part of the procedures you should follow to maintain the chain of custody.
If possible, coordinate your evidence gathering with that of a second source, such as an ISP or another network (if you detect another network that is involved in this incident). Finally, as you make the backups, consider where they are going and who will be using them; if possible, make multiple copies and secure one for historical purposes while analyzing/sharing the other(s).

Get Rid of the Problem and Maintain Your Business

This step might be easier said than done. If your server has been compromised, you should totally rebuild it from scratch unless you are 100% certain what the entire problem is. Before you can eliminate the problem, you need to be sure that you understand the cause of the incident. What vulnerability did the intruder use to gain access, and what have you done to prevent another attack? You should rebuild the server and applications from original media. The next issue, of course, is that of re-installing content. Note that some files that were exploited might have been on your system for some time already and, therefore, might have been backed up as part of your regular operations. So, although you've rebuilt the OS and applications software, you might very well reinstall files that can be exploited over and over. Again, it is imperative that you understand how an incident happened to avoid this eventuality. Finally, business continuity is a major issue. Get rid of the problem and get your server back online as soon as possible.

Perform a Post Mortem

Once the situation is resolved and you're back in operation, get all relevant parties together to review the incident and the response. Review your security policies and operational procedures to see what changes, if any, are required. To the extent possible, contact appropriate incident response agencies, such as US-CERT, and share your knowledge with them.
Hold this meeting a day or two after the incident is deemed "over," when everyone is rested and has had time to reflect on what happened and why, and on what went well and what didn't go so well. Don't do it immediately, while people are still tired, and don't wait weeks, when people will forget and will have moved on to other things.
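The evidence-gathering step above mentions digital signatures and file timestamps as part of maintaining the chain of custody. One building block is recording a cryptographic digest of each collected file at collection time, so later tampering can be detected. A minimal sketch using Python's standard library (digests alone are not signatures; a full custody procedure would also sign these records):

```python
# Record a SHA-256 digest plus timestamps for a piece of file evidence,
# so any later alteration of the collected copy can be detected.
import hashlib
import os
import time

def record_evidence(path: str) -> dict:
    """Return a custody record for the file at path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks so large evidence files don't exhaust memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    st = os.stat(path)
    return {"path": path,
            "sha256": h.hexdigest(),
            "size": st.st_size,
            "mtime": st.st_mtime,          # file's last-modified time
            "recorded_at": time.time()}    # when the record was taken
```

Re-running the hash later and comparing it against the stored record verifies that the evidence copy is unchanged.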


Chapter 9

Reactive Strategies
A reactive strategy is implemented when the proactive strategy for an attack has failed. The reactive strategy defines the steps that must be taken during or after an attack. It helps to identify the damage that was caused and the vulnerabilities that were exploited, determine why the attack took place, repair the damage, and implement a contingency plan if one exists. Reactive and proactive strategies work together to develop security policies and controls that minimize attacks and the damage they cause. The enterprise incident response team should be included in the steps taken before, during, and after any attack to help assess the incident, document what happened, and learn from the event.

Reactive strategies include responding to calls for help and reports of incidents, threats, or attacks. These steps might be manually driven processes or triggered through technology in IDS and firewall monitoring systems. From an operational perspective, incident handling involves taking reports, dealing with information requests, assisting with triage, and analyzing incidents. The incident response team can coordinate triage efforts between different divisions or work groups within the company and help share appropriate mitigation strategies. The team can also help monitor for malicious activity in other parts of the network, and may be directly involved in rebuilding systems, applying patches, and developing workarounds.

The incident response team provides communications support, notifying work groups across the enterprise about vulnerabilities and sharing information on remediation tactics. Vulnerability response might also involve research to find the necessary patches, hotfixes, or workarounds, as well as notifying other workgroups of the mitigation strategy through advisories or alerts.
In some enterprises, this service will include patch management procedures to install appropriate patches, fixes, or workarounds.

Proactive Strategies
Proactive strategies are designed to avoid incidents in the first place. They also help control the scope of impact when incidents inevitably do occur. The technologies used by the incident response team to handle incidents are often used to watch new technical developments, intruder activities, and trends that might indicate future threats. Members of the incident response team need to be afforded the time to read security mailing lists, security Web sites, and current news articles to mine relevant information from the broader security community. This intelligence-gathering function can be combined with past lessons learned to provide a powerful defensive resource to the enterprise.


Ongoing, regularly scheduled security assessments can provide detailed review and analysis of an organization's security infrastructure. These assessments might be comprehensive system audits or simply desktop reviews of security practices. Several types can be performed:
- Reviews of hardware and software configurations, routers, firewalls, switches, servers, and workstations. These reviews can confirm that systems meet enterprise standards or follow industry best practices.
- Interviews with employees can provide insight into how well actual security practices match the enterprise security policy.
- Vulnerability or virus scans can be performed against network segments to identify vulnerable systems and networks.
- Penetration tests can be conducted in a controlled, methodical environment, including social engineering, physical, and network attacks.
This information can be leveraged to provide a stronger security posture before a malicious outsider takes advantage of any weaknesses that are discovered.
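To illustrate the kind of tooling behind the vulnerability-scan item above, the sketch below probes a list of TCP ports on a host and reports which accept connections. This is a minimal sketch, not a substitute for a real scanner; the host and port list are illustrative, and you should only scan systems you are authorized to test.

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

if __name__ == "__main__":
    # Scan only hosts you are authorized to test.
    print(scan_ports("127.0.0.1", [22, 80, 443, 5060]))
```

A real assessment tool adds service fingerprinting, version checks, and reporting, but the core probe loop looks much like this.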

Security tools vary in scope and functionality. Because of their focused experience and understanding of security issues, the incident response team may provide valuable guidance on how to harden systems and develop a set of approved security tools.
Incident Management in a Nutshell

Assess the Damage
Determine the damage that was caused during the attack. This should be done as swiftly as possible so that restore operations can begin. If it is not possible to assess the damage in a timely manner, a contingency plan should be implemented so that normal business operations and productivity can continue.

Determine the Cause of the Damage
To determine the cause of the damage, it is necessary to understand what resources the attack was aimed at and what vulnerabilities were exploited to gain access or disrupt services. Review system logs, audit logs, and audit trails. These reviews often help in discovering where the attack originated and what other resources were affected.

Repair the Damage
It is very important that the damage be repaired as quickly as possible in order to restore normal business operations and recover any data lost during the attack. The organization's disaster recovery plans and procedures should cover the restore strategy. The incident response team should be available to handle the restore and recovery process and to provide guidance along the way.

Document and Learn
It is important that the attack be documented once it has taken place. Documentation should cover all known aspects of the attack, including the damage caused (hardware, software, data loss, loss in productivity), the vulnerabilities and weaknesses that were exploited, the amount of production time lost, and the procedures taken to repair the damage. Documentation helps refine proactive strategies for preventing future attacks or minimizing damage.
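The log review involved in determining the cause of damage can be partly automated. The sketch below uses a hypothetical syslog-style format (real formats vary by system) to count failed login attempts per source address, a quick way to highlight where an attack may have originated:

```python
import re
from collections import Counter

# Hypothetical log lines; real formats vary by OS and device.
SAMPLE_LOG = """\
Mar 12 08:01:11 gw sshd[311]: Failed password for admin from 203.0.113.9 port 40112
Mar 12 08:01:13 gw sshd[311]: Failed password for admin from 203.0.113.9 port 40113
Mar 12 08:02:40 gw sshd[340]: Accepted password for kcamp from 192.0.2.17 port 51022
Mar 12 08:03:02 gw sshd[351]: Failed password for root from 203.0.113.9 port 40120
"""

FAILED = re.compile(r"Failed password for (\S+) from (\S+)")

def failed_logins_by_source(log_text):
    """Count failed login attempts per source address."""
    counts = Counter()
    for line in log_text.splitlines():
        m = FAILED.search(line)
        if m:
            counts[m.group(2)] += 1
    return counts

print(failed_logins_by_source(SAMPLE_LOG))  # -> Counter({'203.0.113.9': 3})
```

The same pattern extends to firewall, IDS, and application logs; the point is to turn raw audit trails into evidence about where and how an incident began.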



Testing the Strategy


The last element of a security strategy, testing and reviewing test outcomes, is carried out after the reactive and proactive strategies have been put into place. Performing simulated attacks on a test or lab system makes it possible to assess where vulnerabilities exist and to adjust security policies and controls accordingly. These tests should not be performed on a live production system because the outcome could be disastrous. Yet the absence of labs and test computers due to budget restrictions might preclude simulating attacks. To secure the necessary funds for testing, it is important to make management aware of the risks and consequences of an attack as well as the security measures that can be taken to protect the system, including testing procedures. If possible, all attack scenarios should be physically tested and documented to determine the best possible security policies and controls to implement.

To be effective, both proactive and reactive strategies need to be tested or exercised. Regular simulation exercises can help maintain a vigilant security posture. It's altogether too easy to work through an incident that has only minor repercussions and gloss over the lessons learned. Remember that lessons are only learned when they feed into a process improvement loop and a change is actually put into practice. Don't let your organization get trapped into collecting a set of lessons observed but never acted on. Learn the lessons well and incorporate them into the everyday workflow.

Summary
In the converged services network, security goals will be largely determined by the following key tradeoffs:
- Services Offered vs. Security Provided: Each service offered to users carries its own security risks. For some services, the risk outweighs the benefit, and the administrator may choose to eliminate the service rather than try to secure it.
- Ease of Use vs. Security: The easiest system to use would allow access to any user and require no passwords; that is, there would be no security. Requiring passwords makes the system a little less convenient but more secure. Requiring a device-generated one-time password makes the system even more difficult to use, but much more secure.
- Cost of Security vs. Risk of Loss: There are many costs to security: monetary (the cost of purchasing security hardware and software such as firewalls and one-time password generators), performance (encryption and decryption take time), and ease of use. There are also many levels of risk: loss of privacy (the reading of information by unauthorized individuals), loss of data (the corruption or erasure of information), and loss of service (the filling of data storage space, usage of computational resources, and denial of network access). Each type of cost must be weighed against each type of loss.

The threats to the security of your service network environment, from both within and without, have reached staggering proportions. This chapter has identified a number of widely accepted industry best practices to help secure and closely monitor your environment. The next and final chapter will delve further into compliance, asset management and reporting, and documenting the whole environment.


Chapter 10: Asset Reporting, Audit Compliance, and IT Documentation


This final chapter will explore asset and audit compliance methods and procedures, the last component of the FCAPS model. As this guide comes to a close, it will also examine why documenting IT processes is critical to business operations. Earlier chapters made brief mention of approaches such as ITIL and ISO 17799 as best-practice methodologies; this chapter will probe into those a bit deeper. For many enterprises, regulatory requirements to comply with SOX, GLBA, and HIPAA raise concerns about the impact on managing an integrated service network with VoIP. The chapter will close with a review of managing the network life cycle with an eye toward holistic risk management.

FCAPS and Asset/Administration/Accounting Management


Over time, and depending on what business the enterprise is engaged in, the A in FCAPS has stood for accounting, administration, auditing, and asset management. That variation is a good reminder that FCAPS is a model, not a process. FCAPS simply suggests a holistic look at multiple domains within the service network (fault, configuration, accounting, performance, and security) to provide comprehensive service management.

Accounting/Administration/Asset Management
In the commercial services market, whether delivering traditional telephony services or enhanced VoIP services, accounting becomes vital to successful service delivery. A Call Detail Record (CDR) in telecommunications contains information about system usage. This includes the identities of call originators, the identities of call recipients, call duration, any billing information about the call, information about time used during the billing period, and other usage-related information.
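The fields just listed can be modeled as a simple record type. The sketch below is illustrative only; field names and formats vary widely by vendor and switch, so treat this as a hedged example rather than any particular system's CDR layout:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CallDetailRecord:
    originator: str       # identity of the call originator
    recipient: str        # identity of the call recipient
    start: datetime       # when the call began
    duration_sec: int     # call duration, in seconds
    billing_code: str     # billing information (e.g., a cost center)

# A fabricated record for illustration.
cdr = CallDetailRecord(
    originator="sip:2001@hq.example.com",
    recipient="sip:+15555550123@gw.example.com",
    start=datetime(2007, 3, 12, 9, 15, 0),
    duration_sec=342,
    billing_code="SALES-WEST",
)
print(f"{cdr.originator} -> {cdr.recipient}: {cdr.duration_sec / 60:.1f} min")
```

Once CDRs have a consistent structure like this, downstream billing and usage analysis become straightforward aggregation problems.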



Managing Billing
CDRs are created constantly throughout the business day, with every telephone call. The telephone calling patterns within an enterprise can provide a great deal of information about customers, business partners, and internal relationships. In many VoIP systems, CDRs are stored in databases on the VoIP servers. A SQL server in the VoIP segment may be just as vulnerable as one sitting elsewhere in the network.
Internal Billing

For many organizations, IT and telecommunications services represent a business within the business. For enterprises that treat these services as a cost center, billing within the organization may drive either real dollar transfers or cost center cross charges for tracking the cost of doing business. CDRs and call accounting systems provide the means to generate and control this recordkeeping function within the service delivery network.
Billing as a Service Provider

Advances in unified communications technologies have dramatically lowered the barrier to entry into the VoIP services business. It's become relatively easy and cost effective to become a voice service provider in the converged network environment. As a result, hundreds of service providers offer a wide range of integrated services, many focused on niche and vertical markets. For many global enterprises, the deployment of self-managed infrastructure makes sense. In these enterprises, affiliates and business units are often paying customers of the services delivery organization. Whether CDRs are used to bill paying customers or to track system utilization doesn't matter; the collection and analysis of usage statistics provides the same baseline knowledge about what's going on in the network.

In traditional telephony, carriers billed for minutes of usage. In the integrated services environment, several new factors need to be accounted for. Bandwidth utilization is the primary accounting factor for many enterprises. Although it might not provide the granularity of detail that some facets of the network offer, bandwidth can be used to account for WAN links, based on the speed of the link. In large enterprise networks, it helps delineate the cost differences in delivering 10Mbps, 100Mbps, or Gigabit Ethernet connections within the LAN. These port speeds become more critical factors as the enterprise evolves from a legacy LAN environment into an integrated voice, data, and video service network. Disk space and CPU utilization also become increasingly important in the converged service environment: adding a voice service introduces codecs, signaling, and control processing that consume resources, not just in the network but sometimes in nodes across the network. Minutes of use may seem like an obsolete billing mechanism in the new integrated services network, but that isn't always the case.
Many enterprises rely on telecommunications services to derive information about peak business hours during the day or week. For businesses that are very voice-centric in nature, minutes of use, and when they are used, continue to provide key business intelligence information about the ebb and flow of work throughout the business day. Collected over time, this information is used by dynamic organizations to assist in managing staffing requirements.
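A minimal sketch of the rollup described above: CDR durations aggregated into minutes of use per hour of day to spot peak calling periods. The records here are fabricated for illustration; in practice they would come from the CDR store.

```python
from collections import defaultdict

# Hypothetical CDR extracts: (hour_of_day, duration_seconds)
cdrs = [(9, 340), (9, 120), (10, 610), (14, 95), (9, 200), (14, 400)]

def minutes_by_hour(records):
    """Total minutes of use per hour of day, for spotting peak calling periods."""
    totals = defaultdict(float)
    for hour, seconds in records:
        totals[hour] += seconds / 60
    return dict(totals)

usage = minutes_by_hour(cdrs)
peak_hour = max(usage, key=usage.get)
print(usage, "peak hour:", peak_hour)
```

Collected over weeks or months, this kind of aggregate is what feeds the staffing and capacity decisions mentioned above.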


User Accounting
As we touch on a variety of regulatory and compliance issues, user accounting becomes a vital recordkeeping and auditing component of the service network. With heightened requirements for audit trails and accountability, the integration of VoIP and video into the enterprise operational network raises the need for effective user accounting mechanisms. User administration tools have widely permeated the IT and telecom services environment. They're common enough that today, a reference to an AAA server or service may not raise any questions, but for many organizations, the adoption of new services for authentication, authorization, and accounting/auditing may be new.
Chapter 9 discusses the importance of authentication, authorization, and accounting/auditing.
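To make the AAA idea concrete, here is a hypothetical sketch of an authorization check that also writes an accounting trail. The entitlement table, user IDs, and service names are invented for illustration; a production system would use a directory and an AAA protocol such as RADIUS or TACACS+ rather than an in-memory dictionary.

```python
# Hypothetical service entitlements per user ID.
ENTITLEMENTS = {
    "kcamp": {"voip", "email", "video"},
    "guest": {"email"},
}

# Accounting/auditing trail: every access attempt is recorded.
audit_log = []

def use_service(user, service):
    """Authorize the request and record the attempt for later auditing."""
    allowed = service in ENTITLEMENTS.get(user, set())
    audit_log.append((user, service, "allowed" if allowed else "denied"))
    return allowed

use_service("kcamp", "voip")   # allowed
use_service("guest", "voip")   # denied, but still logged
print(audit_log)
```

Note that denied attempts are logged too; for compliance purposes, who tried to use a service is often as important as who succeeded.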

Tracking UserIDs, passwords, and user permissions becomes more important as service networks mature. As a more diverse suite of services becomes available, it's important to track which users have permission to use which services. From a regulatory and compliance viewpoint, you might focus more on who used the services and what activity was performed. When crafting the enterprise converged services network, one factor to consider is a backup methodology; recovery of auditing or accounting information impacts compliance for many organizations.

The Asset Management View
Another facet of the FCAPS approach is the challenge of asset management. Current unified communications technologies introduce a plethora of new devices and equipment into the network. Chapter 8 discussed configuration management. In the integrated services network, management and monitoring tools are vital service delivery support mechanisms. As part of configuration management, the NMS commonly incorporates an asset management module. These tools in the NMS help in managing the IT asset life cycle. As this guide has noted throughout, every facet of network and services management has its own life cycle component; the integrated services network is a series of life cycles within life cycles. In this area, we focus on:
- Asset evaluation and selection
- Asset procurement
- Asset maintenance and support
- Asset disposal at end of life
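The life-cycle stages above can be tracked against each asset record. The sketch below is illustrative only; the stage names, fields, and tag format are assumptions, not drawn from any particular NMS asset module.

```python
from dataclasses import dataclass

# Life-cycle stages mirroring the list above (names are illustrative).
STAGES = ["evaluation", "procurement", "maintenance", "disposal"]

@dataclass
class Asset:
    tag: str             # asset tag / inventory identifier
    description: str
    stage: str = "evaluation"

    def advance(self):
        """Move the asset to the next life-cycle stage, stopping at disposal."""
        i = STAGES.index(self.stage)
        if i < len(STAGES) - 1:
            self.stage = STAGES[i + 1]

pbx = Asset("IT-00417", "VoIP call server")
pbx.advance()          # evaluation -> procurement
print(pbx.tag, pbx.stage)
```

A real asset module would attach dates, costs, and owners to each stage transition, giving the audit trail that compliance reviews ask for.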

In adopting a comprehensive FCAPS model, many organizations are inclined to take a singular view of asset management, auditing, or accounting for billing purposes. It's wise to consider a more holistic view and incorporate all three into the enterprise management model.



Why Documentation? Why Process?


Throughout this guide, there has been a recurring thread reinforcing the need for process and documentation. As we wrap up in this closing chapter, let's touch on why these are so important in the integrated service network of voice, data, and video.

Creating Repeatable Processes
When we deliver services of any kind, we strive for predictable and consistent service quality; VoIP and video services in the converged network present some technical challenges. Traditional networks dealt with bursty, non-real-time data, whereas voice and video services place new demands for real-time service on the network. Real-time services can be designed to overcome a number of network impediments, but doing so requires predictability. We put great effort into repeatable processes in every facet of the service delivery network because repetition brings consistency. With consistency, we can achieve the quality we commit to, whether those commitments are made via service level agreements (SLAs) or through tacit user expectations in the enterprise. Additionally, repeatable processes can free time and resources in many organizations; those resources can be better spent on more creative work that requires focused attention. Not every project or task in the enterprise requires repeatable processes, but those things that routinely consume resources are more efficiently managed with thorough documentation and process controls.

Revisiting the Importance of Knowing Your Environment
In the information economy, our business intelligence is a corporate asset. For many enterprises, the information we have about our own business processes, workflows, and information resource tools may be the most valuable asset the company owns. The more we know about the environment, the better armed we are to manage the business. The more we know about our integrated services, how they're used, when they're used, and who uses them, the better prepared we are to support the enterprise mission.
Documenting processes provides sustainability. Too often, especially in small and midsized businesses, there is a corporate culture of reliance on key individuals who know what to do. These people who carry corporate information in their heads are perceived as high-value resources, when in fact they present great risk. The problem is depth of coverage. When a single individual is the only resource with expertise in a particular subject area, any event that changes the status of that one person can create problems. If a single IT person handles cyber security incidents, for example, what happens if that person is out ill or leaves the company? It's vital that business processes not rely on staffing that is only one person deep. Documentation eases cross-training of other staff and helps ensure the long-term viability of business processes. Thorough documentation of business intelligence information, whether it's about the integrated services network or a customer account, protects the organization from the danger that critical institutional knowledge exists solely in the hands, or brains, of a single individual.



Be Prepared: Protecting Against Future Business Issues
There is another reason to document IT service delivery processes and workflows: budgeting. Information services networks tend to change more rapidly than legacy telephone networks. In the past, an enterprise telephone system might have had a 10-year lifetime. The LAN and WAN environment has proven to be more dynamic. In many enterprises, the IT services network undergoes extensive redesign and re-architecture every 3 to 4 years, with continual upgrades taking place along the way. As our integrated voice, data, and video services become more widespread, they become more tightly coupled with daily business operations. Enhancements and upgrades to the service network become, for many, key business drivers. The budgeting process of calculating costs, building a business case, proving return on investment (ROI), and preparing acquisitions all requires documentation. To be prepared for business contingencies in the IT services network, one commonly adopted best practice is to maintain and regularly update the requirements, including replacement costs, for high-level elements of the service:
- Hardware: A comprehensive inventory of the network infrastructure can ensure that no components are overlooked in the life cycle evolution.
- Software: A library of software used, including version information, not only documents the current environment but also provides a quick recovery checklist when problems arise.
- Services: An inventory of every service delivered and the user or customer to which it is delivered can prove vital during potential business continuity events. Knowing who the users of every service are can ensure good communications and information sharing during an incident.
- Staffing: Training, cross-training, and skills retention requirements need to be documented and continually updated.
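A small sketch of how such an inventory, annotated with replacement costs, might be rolled up by category for budget planning. The entries and dollar figures are fabricated for illustration:

```python
# Illustrative inventory entries: (category, item, replacement_cost_usd)
inventory = [
    ("hardware", "core LAN switch", 12000),
    ("hardware", "VoIP gateway", 4500),
    ("software", "call-manager licenses v5.1", 8000),
    ("services", "SIP trunk - main office", 1200),
]

def replacement_cost_by_category(items):
    """Roll up replacement costs per category for budget planning."""
    totals = {}
    for category, _item, cost in items:
        totals[category] = totals.get(category, 0) + cost
    return totals

print(replacement_cost_by_category(inventory))
# {'hardware': 16500, 'software': 8000, 'services': 1200}
```

Keeping the inventory current is what makes a rollup like this trustworthy when a business case or contingency plan needs it on short notice.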



Legal and Regulatory Issues to Consider


Although this guide can't begin to address the spectrum of legal and regulatory issues to which private and public sector organizations may need to adhere, let's take a brief look at a few of the most visible and compelling concerns.

SOX
SOX, formally titled the Public Company Accounting Reform and Investor Protection Act of 2002, was passed in 2002 and has been controversial, raising considerable debate. The law arose from the furor over corporate accounting scandals at WorldCom, Enron, Tyco, and others. This flurry of visible accounting scandals and fraud reduced public trust in business management and in accounting and reporting practices. Named after sponsors Senator Paul Sarbanes and Representative Michael G. Oxley, SOX passed the House by a vote of 423-3 and the Senate 99-0. This wide-ranging legislation introduced new standards, and bolstered existing ones, for the boards, management, and accounting firms that represent public companies. Although SOX doesn't correlate directly to telecommunications or IT services, its impact has been far-reaching, extending well beyond publicly owned companies. Major SOX provisions include:
- A Public Company Accounting Oversight Board (PCAOB) was created
- Public companies are required to both evaluate and disclose the effectiveness of their internal controls
- Financial reports must be certified by both Chief Executive Officers (CEOs) and Chief Financial Officers (CFOs)
- Companies listed on stock exchanges must have independent committees overseeing relationships with auditors
- Most personal loans to executive officers and directors have been banned
- Both criminal and civil penalties for securities law violations have been increased
- Willful misstatement of financial reports by corporate executives is punishable by significantly longer maximum jail sentences and larger fines

Although SOX might not apply to all organizations because of its focus on financial accountability, the principles contained in the act have been widely embraced across the private and public sectors. What is for some a legal requirement has quickly become a generally accepted best practice for many. The fundamental SOX focal areas easily extend down into IT management and delivery of services within the enterprise. The impact of SOX on business practices surrounding financial and auditing controls has been widely felt, but that impact isn't solely in the financial area of the business; IT has felt it as well, mostly in five areas: risk assessment, control environment, control activities, monitoring, and information and communication.


Risk Assessment
Every organization faces a variety of internal and external risks that must be taken into consideration. Establishing objectives is a precondition to risk assessment because we are actually evaluating the risk posed to achieving business objectives. Before we can determine how to manage risks, we have to understand what they are in relation to specific business goals. For example, IT managers have to both assess and understand the risks around financial reporting, particularly in the areas of completeness and validity, before they can implement the necessary control mechanisms. Thus, service delivery managers must understand how services, systems, and resources are being used, and document accordingly.

Control Environment
The control environment sets the tone for the organization and influences human behavior. It addresses business factors such as management style, organizational ethics, delegation of authority, and staff development processes. IT and service delivery groups thrive in an environment in which employees take accountability for, and receive recognition for, the success of their projects. This encourages issues and concerns to be escalated in a timely manner and dealt with effectively. For many organizations, this means that service delivery employees need to be cross-trained in design, implementation, quality assurance, and deployment so that they can engage in the complete life cycle of the converged services technologies.

Control Activities
These are the corporate policies and procedures that govern how to ensure management directives are accomplished. They help guarantee that appropriate actions are taken to mitigate the risks that might interfere with meeting objectives. Control activities occur throughout every organization and include routine tasks at all levels and in all functions.
They include a broad range of activities: approvals, authorizations, verifications, financial reconciliations, operating performance reviews, and asset security. In the past, many ERP and CRM systems were used to collect data that fed into spreadsheets for analysis, and manual spreadsheets are prone to human error. Organizations will need to document usage rules for the integrated services and create an audit trail for each. These systems support business services and often contribute financial or other key business intelligence information. It's important that corporate policies define the business requirements and other documentation necessary for each and every IT and telecommunications service project.



Monitoring
Internal control systems within the service environment need to be routinely monitored to assess the quality of system performance over time. This is best accomplished through a combination of ongoing monitoring activities and periodic evaluations. Problems and anomalies identified during monitoring should be reported and documented, and corrective action should be taken, ensuring continuous improvement in the delivery of fully integrated services. Within the service delivery environment, specialized auditing and review processes may be appropriate for high-risk areas. Services and systems that are mission-critical to the enterprise often require focused attention, and the service delivery team should perform frequent internal assessments of these vital systems and services. One important consideration may be to have independent third-party assessments performed on a regularly scheduled basis to augment the work performed by employees. The management team needs to clearly understand the outcomes of these audits and assessments; as these services roll up into larger corporate business and financial reporting, the management team will be held accountable for the outcomes.

Information and Communication
Information systems play a pivotal role in all our internal business controls. They provide the reports that make it possible to run and manage the business, containing operational, financial, and compliance-related information. Effective communication aids the flow of information throughout the organization. Service managers can't identify and address risks without timely, accurate information, and timeliness in reacting to issues as they occur is vital to maintaining the integrated services environment and ongoing business operations.

GLBA
GLBA was designed to open competition among banks, securities companies, and insurance companies, which together make up what is known as the financial services industry.
GLBA primarily governs how these companies can merge and own one another. Because people invest money differently in different economic times, GLBA introduced controls so that investment banking firms, for example, might also participate in the insurance business. Banks haven't traditionally participated in insurance underwriting, but consolidation of the financial services sector led to the need for new regulation. Although a great deal of consolidation in this industry has taken place since the passage of GLBA, it hasn't been as widespread as anticipated. For most businesses outside the financial services industry, GLBA is of no direct concern; it represents another set of requirements, similar in many ways to SOX, for financial services. Again, we see regulatory and compliance requirements for one industry or set of industries being adopted as common best practices in other sectors.



HIPAA
Congress enacted HIPAA in 1996. The primary goal of HIPAA Title I is to protect health insurance coverage for workers and their families in the event of a change or loss of their jobs. Title II of HIPAA has had wide-ranging impacts on the health care industry. It set in place requirements for national standards for health care transactions and identification mechanisms for health care providers, insurance plans and providers, and employers. As part of Title II, a section called Administrative Simplification also addressed the security and privacy of individuals' health-related information. The intent of this section was to bring about standardization in systems and encourage electronic data interchange (EDI) within the U.S. health care sector.

Within HIPAA, there is a Privacy Rule that took effect in 2003. This rule provides regulations for what is termed Protected Health Information (PHI). PHI comprises information that can be linked to an individual about the status of that person's health, the health care being provided, or payment for that health care. The rule has been broadly interpreted to include any portion of a patient's complete medical record or payment history. The Privacy Rule dictates what may or may not be disclosed, both by law and to help ensure treatment. Health care organizations have to take great care to disclose only what is necessary and to take measures to protect all information from undue disclosure.

Beyond the Privacy Rule, there is also a Final Rule on Security Standards, issued in 2003 with a compliance requirement date in 2005. This Security Rule is intended to complement the Privacy Rule. It defines three types of safeguards that health care entities must put into place to comply:
- Administrative
- Physical
- Technical

Each of these safeguard types has specific security standards that must be adhered to for compliance. Some of the standards are required; others are addressable, flexible enough to be uniquely implemented based on circumstances. Although the intent of this guide isn't to provide a detailed evaluation of HIPAA requirements and impacts, the act is so broad in scope (even impacting many enterprise Human Resources organizations) that this information is included to give an appreciation of these regulatory compliance needs as they may overlap with service delivery requirements. The following descriptions of administrative, physical, and technical safeguards are liberally extracted from a number of resources widely available online; HIPAA has been very widely documented, and many resources say the same things.


Chapter 10

Administrative Safeguards:
• Organizations covered under HIPAA must adopt written privacy procedures and designate a privacy officer. The privacy officer is responsible for developing and implementing all required policies and procedures.
• Management oversight and organizational buy-in to compliance and security controls must be documented within the organizational procedures and policies.
• Classes of employees that are permitted access to PHI must be documented in organizational procedures. Employees should have access only to the subset of PHI needed to perform their jobs.
• Organizations must demonstrate they have implemented an ongoing training program regarding the handling of PHI, and that it's provided to all employees who perform any administrative health plan functions.
• If an entity outsources some part of its business process to a third party, that entity must document that its vendors also have the procedures and policies in place to comply with HIPAA requirements. In most cases, organizations accomplish this through contractual clauses as part of third-party service agreements. One area of concern is the possibility of a vendor further outsourcing data handling functions to yet another vendor farther downstream. At each step of the way, appropriate contracts and controls need to be established.
• Health care entities are responsible for data backups and disaster recovery procedures. A documented emergency contingency plan should be established that documents data priority and failure analysis, testing activities, and change control procedures.
• HIPAA dictates an internal audit process that continually reviews compliance and the potential for security violations. Enterprise policies and procedures need to document the frequency and scope of audits and the specific details of the audit procedures. Under HIPAA, audits should be both regularly scheduled and event driven. Procedures for mitigating security breaches should be well documented.
Physical Safeguards:
• The network must include controls governing the introduction and removal of both hardware and software. Equipment that has reached end of life must be retired in such a way that PHI cannot be compromised.
• Access to systems that contain health information should be diligently monitored and controlled. Only authorized individuals may access systems hardware and software. Access controls must include security plans for facilities, maintenance records, and visitor sign-in (including a process for escorts).
• Workstations should not be placed in public or high-traffic areas. Monitor screens should not be viewable by the public. Policies addressing proper workstation use are required. If contracted employees are used, they too must be fully trained on their responsibilities.


Technical Safeguards:
• Systems that house PHI must be protected from intrusion. If data is transmitted over an open network, some form of encryption is required. If private or closed networks are used, existing access controls may be sufficient to eliminate the need for encryption.
• Every entity involved in the transmission and handling of PHI data is responsible for ensuring that the data within its systems has not been modified or deleted. Data integrity should be maintained and corroborated using checksum, double-keying, message authentication, and digital signature techniques.
• Message authentication between parties covered under HIPAA is required. Each party should corroborate that an entity is who it claims to be. Common examples of this authentication include passwords, two- and three-way handshakes, telephone callback, and token systems (including digital certificates).
• Organizations covered under HIPAA must make their documentation available to the government to determine and verify compliance. IT documentation should include a written record of all configuration settings on the elements of the service network, in addition to other policies and procedures, because these components are complex, configurable, and constantly changing. For many service delivery organizations, this presents a huge challenge.
• Risk analysis and risk management programs must be implemented and documented.
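The integrity and message-authentication techniques mentioned above can be illustrated with a keyed hash (HMAC). The following is a minimal sketch, not a HIPAA-certified implementation; the shared key and record contents are placeholders.

```python
import hmac
import hashlib

# Hypothetical shared key; in practice it would come from a secure key store.
SHARED_KEY = b"example-shared-secret"

def sign(message: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()

def verify(message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time to detect tampering."""
    return hmac.compare_digest(sign(message), tag)

record = b"record-id=1234;status=updated"
tag = sign(record)
assert verify(record, tag)             # unmodified record passes
assert not verify(record + b"x", tag)  # altered record is rejected
```

Because the tag depends on both the key and every byte of the message, any modification or deletion in transit is detectable by the receiving party.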

Whether the regulatory issue is SOX, GLBA, or HIPAA, it's obvious by now that any enterprise that is required to comply takes on an added set of documentation requirements that necessitates diligence and attention to detail. Some regulations apply only to specific areas in the private sector. Public sector groups may have other requirements at the federal, state, and local level. Although this guide can't begin to touch on the privacy issues surrounding the protection of personally identifiable information (PII), as of this writing 46 states have active PII legislation in place. For many enterprises, another entity's regulatory challenge can be used as a diligent best practice with some adaptation. Many best-in-breed organizations adopt a blend of compliance with the regulations we've described coupled with the practices that follow in the remainder of the chapter.



Methodologies in Best Practices for Management and Oversight


This section will delve into two widely accepted best-practices methodologies: the Information Technology Infrastructure Library (ITIL) and ISO 17799.

ITIL
ITIL is a collection of industry best practices that grew out of earlier efforts by the Central Computer and Telecommunications Agency (CCTA) in the UK. It provides a collection of industry best practices, taken from both private and public sector organizations, for information technology services across all IT infrastructure and operations. ITIL is published as a series of books, hence the library approach. At one point, there were 31 volumes incorporated in ITIL. ITIL v3, termed a refresh of the library, was due to be released in May of 2007 but has been slightly delayed as of this writing. ITIL v3 will contain five volumes:
• Service Strategy
• Service Design
• Service Transition
• Service Operation
• Continual Service Improvement

The framework in ITIL has been widely adopted by many enterprise organizations. ITIL contains an area that addresses IT Service Management (ITSM), which garners a great deal of attention from service delivery organizations. Although ITSM and ITIL aren't directly related, they're used in tandem to provide a customer-focused set of service delivery best practices. Their shared heritage dates back to the IBM Yellow Books and W. Edwards Deming's Total Quality Management (TQM) principles. Today, these principles are found in a number of widely embraced approaches including Six Sigma, Capability Maturity Model Integration (CMMI), and Business Process Management (BPM). It's perhaps noteworthy to remember that early work in quality management and process engineering was oriented toward manufacturing. What we see here is how the lessons learned in the manufacturing sector evolved over time and are now being applied to the delivery of information services in today's converged data, voice, and video networks. Some of the benefits of a systematic, ITIL-based approach to managing IT services include:
• Cost reduction
• IT service improvements based on best practices
• Increased customer and end-user satisfaction
• Enhanced auditability through process and documentation
• Productivity improvements through incorporation of repeatable processes


Service Support
ITSM can bring many benefits through the adoption of best practices. It's driven by both technology and the many businesses engaged in ITSM, so it's continually evolving and changing. Many service provider organizations have turned to ITIL for guidance to help deliver and support their services. ITIL provides useful guidelines, yet it does not actually provide the service management processes; service providers are still expected to define the processes themselves, often with help from outside consultants. The service management processes alone are not enough. Staff who are expected to follow the processes also require more detailed work instructions behind the processes. ITIL provides some guidelines for service management applications, yet does not provide the tool settings. Thus, after a service provider organization has defined its service management processes, the organization still needs to find or develop the appropriate application to support these processes. Today, nearly all service management applications claim to support ITIL and work out of the box. Reality requires more work. It's not unusual for it to take somewhere between 2 weeks and 4 months to configure a service management application so that it can support the processes that a service provider organization has defined. And that's the easy part, coming after all the processes have been defined and documented. There are several views of how ITIL might be used. One leading company is Alignability. Figure 10.1 shows a widely adopted view called the Alignability Process Model.

Figure 10.1: The Alignability view of ITIL service support and delivery processes.


Service Delivery
Given that ITIL is a library, covered in a series of books developed over many years, this short mention cannot begin to do justice to the complexity of this very thorough approach. In the visuals that follow, we see some flows and concepts specific to incident management that are widely adopted by service delivery organizations embracing the ITIL model. Each box in Figure 10.1 has a detailed and specific set of documentation and process flows to support that facet of operations. Incident management is a daily part of operations for every service delivery organization. In Figure 10.2, we see one ITIL view of incident management and response. This is provided as an example and may not be applicable as shown for every service delivery organization. It provides, as does the entire ITIL collection, a framework that an organization can use for modeling its own specific process documentation.

Figure 10.2: The ITIL incident management and response flow.

In Figure 10.3, we see one tool for a small part of the incident management and response flow. Every incident has an impact, and associated with that impact is some sense of urgency to achieve resolution. For most organizations, this simple three-by-three matrix provides enough granularity to prioritize every possible incident and contingency, from a flu pandemic to a natural disaster to a minor power failure or service disruption.


Chapter 10

IMPACT
• Extreme (Critical), Major Incident: multiple agencies cannot conduct core business
• High: cannot conduct core business
• Medium: restricts ability to conduct business
• Low: does not significantly impede business

URGENCY
• High: requires immediate attention
• Medium: requires attention in the near future
• Low: does not require significant urgency
Figure 10.3: Incident prioritization matrix.
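The impact/urgency matrix above can be encoded as a simple lookup table. This is an illustrative sketch only; the numeric priorities and level names are hypothetical, not taken from the ITIL books, and each organization would substitute its own values.

```python
# Hypothetical numeric priorities; lower means a more urgent response.
PRIORITY = {
    ("high", "high"): 1,   ("high", "medium"): 2,   ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3,    ("low", "medium"): 4,    ("low", "low"): 5,
}

def prioritize(impact: str, urgency: str) -> int:
    """Map an (impact, urgency) pair from the matrix to a response priority."""
    return PRIORITY[(impact.lower(), urgency.lower())]

print(prioritize("High", "High"))  # prints 1: core business down, immediate attention
```

Encoding the matrix this way makes the prioritization rule repeatable and auditable: every incident ticket gets the same priority for the same inputs, regardless of who is on duty.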

Figure 10.4 simply provides an example of drilling down another level. Once priorities have been applied to a given incident, established responses need to be engaged.

Each priority is related to a certain recovery time:

• Priority 0: Major Incidents (larger global incidents). Multiple agencies cannot conduct core business. Other global incidents may include fire; natural disasters such as floods, earthquakes, or volcanic eruptions; human and animal disease outbreaks; hazardous materials incidents; terrorist incidents, including the use of weapons of mass destruction; and civil unrest, labor strikes, and picketing. Response: immediate attention; 24/7 effort until resolved/contained; n-hour escalation/communication.
• Priority 1: Significant damage, must be resolved/recovered immediately. Response: immediate attention; 24/7 effort until resolved/contained; n-hour escalation/communication.
• Priority 2: Limited damage, should be resolved/recovered immediately. Response: n hours; 24/7 effort at managerial discretion.
• Priority 3: Significant damage, does not need to be resolved/recovered immediately. Response: n hours/business day.
• Priority 4: Limited damage, does not need to be resolved/recovered immediately. Response: n business day(s).
• Priority 5: Non-urgent; no impact on the customer's ability to work. Could become a project-oriented issue. Response: N/A.

Figure 10.4: Prioritization response matrix.


The Business Perspective
What we see in this quick series of examples is that the ITIL approach to incident management provides a consistent methodology for prioritizing and responding to incidents. Coupled with established process flows, the integrated services delivery organization possesses a complete, methodical tool set for managing and delivering consistent and predictable integrated network services. This consistency and set of repeatable processes ensures that enterprise operations continue on an even keel no matter what happens; they provide sustainability. For service providers, these tools ensure business survival, as these integrated services represent the customer revenue stream. For other enterprises, appropriate modification allows them to support the enterprise mission and sustain the work that goes into meeting corporate objectives.

ISO 17799
ISO 17799 is actually entitled "Information technology - Security techniques - Code of practice for information security management." It was revised in June 2005 by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). It offers best-practice recommendations on information security management. Many organizations today are embracing ISO 17799 as a broader part of service management and business operations, folding it into the larger set of corporate compliance efforts. Although focused primarily on the protection of confidentiality, integrity, and availability of network services and resources, ISO 17799 reaches far beyond security.

Business Continuity Planning
Business Continuity Planning (BCP) has grown and evolved from what in the past were disaster recovery initiatives in the enterprise. Like every other facet of managing a unified communications services network, BCP strategies continually mature as processes are developed. BCP focuses on organization recovery and service restoration when a disaster or extended disruption occurs.
For many organizations, BCP digs deeper than network services into what is frequently called an "all hazards" plan. BCP is simply how an enterprise plans and prepares for future events that could jeopardize the business mission and overall operational health. It's often a path to discovering weak processes, and it's tightly coupled with efforts toward improving information security and risk management practices. Typically, BCP yields organizational documentation for use before, during, and after a business disruption. BCP methodologies today scale to fit both large and small businesses but initially grew primarily from the efforts of regulated industries. Common sense indicates that every organization should have a plan to guarantee sustainable business.


One view that is widely represented in a number of scholarly articles on the business continuity management process consists of the following six procedures:
• Disaster Notification Handling: At this stage, an on-duty manager becomes aware of either an existing or impending disaster. This procedure is used to assess the situation using established checklists and determine what protective and recovery actions are warranted.
• Service Recovery: In the recovery phase, management and recovery teams follow predefined plans for service recovery.
• Return to Normal Operations: Once service recovery has been completed, the continuity manager or incident commander implements a predefined return to normal operations. This collaborative effort shifts the focus of operational control from the recovery team back into the hands of the normal operations team.
• Service Recovery Testing: The continuity manager and service recovery team remain engaged to ensure that recovery is complete and that the process, as followed, provides for reliable service recovery and restoration to normal operations.
• Continuity Manual Maintenance: The business continuity team must complete an after-action analysis and update business continuity process manuals to ensure that action is taken on any lessons learned from the event.
• Continuity Plan Maintenance: Business continuity planners need to continually revise the continuity plans for the service infrastructure of the integrated network that provides mission-critical enterprise services.

There is a great deal of evidence indicating that many organizations don't invest enough time or money in BCP. Statistically, building fires permanently close 44 percent of the businesses they affect. In the World Trade Center bombing in 1993, more than 42 percent of businesses (150 of 350) didn't survive. Conversely, in the attacks of September 11, 2001, many businesses had extensive business continuity plans and were back in operation within days of the devastation. For many organizations, the plan can simply be a binder stored somewhere away from the main work location that contains contact information for management and general company staff, a list of clients and customers, and vendor partners. It's wise to include any other work process documentation, including the location of the offsite data backups, copies of insurance documents, and other materials vital to business survival. For large enterprises with more complex needs, the plan may well include:
• A failover, disaster recovery, or business continuity work site
• A process to routinely test technical requirements and readiness for disaster
• Backup systems for regulatory reporting requirements
• A mechanism to reestablish physical records, perhaps from tape, drive, or microfiche
• A means to establish a new supply chain when vendor partners are also impacted
• For manufacturing operations, the means to establish new production centers


Every organization, large or small, needs to ensure that its business continuity plans are realistic. It's vital that they be easy to use during a crisis. BCP is a crucial peer to crisis management and disaster recovery planning. It is part of every organization's overall risk management strategy. Development of a business continuity plan can be reduced to five simple steps:
• Analysis
• Solution design
• Implementation
• Testing and organization acceptance
• Maintenance



Figure 10.5: Five-step BCP life cycle.

A great deal of BCP material is available on the Internet, sponsored by consulting firms that provide fee-based services. It's worthwhile to note that there are many freely available basic tutorials that can help those organizations that are motivated to prepare but financially constrained from hiring consultants. The simple act of establishing the role of a business continuity planner can greatly improve organizational readiness for events that cannot be predicted.



System Access Control
System access control includes several techniques addressed earlier in this guide. Chapter 9 looked at the Golden Rules of authentication, authorization, and audit. Rather than dig into the details of each, we'll just revisit the basics. Authentication is the act of ensuring that users are who they claim to be. Best practices also drive server or system authentication in the opposite direction, giving the user authenticated proof that the system or server is indeed the one it purports to be and not a spoofed system somehow inserted in the traffic flow. Authentication can range from simple passwords to more complex token-based and/or biometric verification tools.
Factors for Authentication
Simple passwords are often viewed as a very weak form of authentication. Two-factor authentication is generally considered more acceptable and appropriate for many business applications today. There are four generally accepted authentication factors that might be used in combination to provide strong user authentication:
• Something you know: usually a password or PIN; it's assumed that only the owner of the account would know this information.
• Something you have: often a smart card or token; again, this assumes only the owner of an account will have the associated token.
• Something you are: most often a biometric solution, such as a fingerprint, palm print, voice print, or retinal scan.
• Somewhere you are: may be as simple as being inside or outside the corporate firewall for matters of trust; it might also be a more extensive ring-back system that allows a user to dial in and then calls back at a predefined telephone number to grant access. RFID devices may play a role in future use of location-based authentication.
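Combining two factors can be sketched in a few lines. This is a hypothetical illustration: the user record, the salted password hash, and the static token code stand in for a real directory service and hardware token, and a production system would use a rotating one-time code rather than a fixed value.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted password hash (the 'something you know' factor)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

# Hypothetical user record; the token code stands in for a hardware token.
salt = secrets.token_bytes(16)
user = {
    "salt": salt,
    "password_hash": hash_password("correct horse", salt),
    "token_code": "492817",
}

def authenticate(password: str, token_code: str) -> bool:
    """Require both factors: something you know AND something you have."""
    knows = hmac.compare_digest(
        hash_password(password, user["salt"]), user["password_hash"])
    has = hmac.compare_digest(token_code, user["token_code"])
    return knows and has

assert authenticate("correct horse", "492817")    # both factors present
assert not authenticate("correct horse", "000000")  # token factor fails
assert not authenticate("wrong password", "492817") # knowledge factor fails
```

The point of the sketch is that either factor alone is insufficient; a stolen password without the token, or a stolen token without the password, is rejected.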

Authorization is the process of matching the authenticated user or system to a profile that identifies what that user has permission to do. It's a combination of the services the user is allowed to use and the service network resources to which the user has been granted access. These might include permission for read-only access, the ability to write files, and the ability to execute programs. Read, write, and execute are the most common authorization focal points. Audits provide the proof details that may be required later in some form of incident or event investigation. This includes system logs and history files that provide information about what the user actually did during a session. Audits provide the documentation for who performed what function and when. Audits provide accountability, associating users with actions.
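The pairing of authorization and audit can be sketched together: every permission check both enforces the profile and leaves a log record. The user names, resource paths, and profiles below are hypothetical placeholders, not a prescribed schema.

```python
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("audit")

# Hypothetical profiles: user -> resource -> set of permitted operations.
PROFILES = {
    "alice": {"/finance/reports": {"read", "write"}},
    "bob":   {"/finance/reports": {"read"}},
}

def authorize(user: str, resource: str, operation: str) -> bool:
    """Check the profile, then write an audit record of who did what and when."""
    allowed = operation in PROFILES.get(user, {}).get(resource, set())
    audit_log.info("time=%s user=%s op=%s resource=%s allowed=%s",
                   datetime.now(timezone.utc).isoformat(),
                   user, operation, resource, allowed)
    return allowed

assert authorize("alice", "/finance/reports", "write")      # permitted, and logged
assert not authorize("bob", "/finance/reports", "write")    # denied, and still logged
```

Note that the denied attempt is logged as well; audit trails that record only successes cannot associate users with attempted actions during an investigation.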



Physical and Environmental Security
Beyond the systematic controls of authentication, authorization, and auditing at the system level, we have physical controls to protect systems. Door locks, controlled work areas, card key scanners, and automated systems all aid in securing the physical location where systems delivering key integrated services reside. Physical security may be as simple as a locked door. For many small organizations, this simple precaution is adequate. Other enterprises may require guards, video monitoring, metal detectors, and complex personnel tracking systems to eliminate any chance of unauthorized access. Hiding resources by simply not advertising their availability provides a very minimal and weak approach often termed "security by obscurity." Proponents of discretion will point out that, conversely, putting a big sign on the data center door advertising what is inside is probably unnecessary and inappropriate.

Compliance
ISO 17799 isn't really a regulation or code that organizations comply with directly. It's a code of practice that helps shape corporate culture and behaviors. Businesses use ISO 17799 to help avoid breaking any laws or regulations. For many companies, contracts, SLAs, and security policies become part of ISO 17799 adoption. The intent is to ensure that all internal systems comply with documented corporate policies and standards.

Asset Classifications and Control
IT asset management (ITAM) is another set of business best practices. ITAM brings financial, contractual, and inventory functions together to manage the IT environment life cycle. This information is used to make decisions about architectural and service changes in the enterprise based on knowledge of the operational environment. It includes both hardware and software management. Software management commonly includes license management to ensure businesses aren't running unlicensed copies of commercial software.
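A license-compliance check of the kind software asset management tools perform can be reduced to comparing installed counts against purchased seats. The product names and counts below are invented for illustration.

```python
# Hypothetical inventory counts versus purchased license seats.
installed = {"OfficeSuite": 120, "CADTool": 14, "DBServer": 3}
licensed  = {"OfficeSuite": 100, "CADTool": 20, "DBServer": 3}

def license_gaps(installed: dict, licensed: dict) -> dict:
    """Return products with more installed copies than licensed seats."""
    return {product: count - licensed.get(product, 0)
            for product, count in installed.items()
            if count > licensed.get(product, 0)}

print(license_gaps(installed, licensed))  # prints {'OfficeSuite': 20}
```

The output is the shortfall a compliance report would flag: 20 more copies of the office suite are installed than the enterprise has purchased seats for.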
Beyond this, software asset management ties closely to configuration management of corporate systems. Lastly, software asset management is frequently used as a tool to document compliance with SOX, GLBA, HIPAA, copyright laws, and so on. There are a number of commercial tools available to assist in managing the enterprise software inventory. Hardware asset management focuses on the physical components. In the integrated services network, this includes all the network elements, nodes, and even cables that deliver mission-critical business services. Under the umbrella of ISO 17799, many organizations document practices surrounding procurement, administration, re-use, and retirement of hardware assets.

Security Policy
At a high level, the enterprise security policy simply provides a definition of what it means to be secure. The security policy describes acceptable behavior and approved uses of resources and often delineates specific security elements to impede unauthorized access. The security policy speaks to the flow of information by elaborating on what information may enter or leave the corporate network or zones inside that network.


Managing and Protecting the Network


Managing and protecting the enterprise network is a daily, ongoing exercise in managing risk. As we've begun to learn in this brief overview of regulatory and compliance concerns, to some extent we must embrace risk. We recognize that risks exist, we assess the impact they may have on the business, and we develop strategies to counter, manage, or mitigate them. In some cases, we can avoid risk entirely by taking some alternative safe approach. In other cases, we'll defer the risk to a third party, such as an insurance carrier. Sometimes we can reduce the negative impact, but other times we simply accept the full risk consequences. Each approach needs to be the result of a conscious business decision. Chapter 9 introduced Jacobson's Window for simple risk modeling. This is a good point to refer back and bring our thoughts together as we work toward the closing of this guide. We'll do a quick review here and revisit some key points.

Inconsequential Risk Classes
Inconsequential risks fall into two distinct categories. One inconsequential risk is so unlikely to occur that it doesn't warrant undue concern. This risk has a low occurrence rate but a high impact should it ever occur. We used the example of a meteor striking our data center in Chapter 9, but we could just as easily consider a continental power failure or any other very rare event. When such events happen, they are completely outside our control and so devastating to business operations that spending time brainstorming contingency plans is simply wasted effort. At the opposite end of the spectrum is the risk event that has a high rate of occurrence but a low impact. Spam email provides a perfect example. We get spam email in our inboxes every day. For most of us, the impact is minimal: we click delete, and it's gone. Although we may have to do this several times in the course of a day, the impact to the company is low.
In both of these cases, we're simply accepting the risk rather than investing effort to mitigate and respond to the consequences.

Significant Risk Classes
The more significant risk classes are those that have either low-low or high-high occurrence and impact rates. This is where the majority of all risk analysis and management efforts fall. We build remediation and mitigation strategies to reduce the risk, spread it across the organization to reduce the impact, or transfer it through outsourcing, insurance, or some other means. In each case, we're striving to minimize disruption to services in the network.



Reducing Risk for Single Occurrence Losses
Chapter 9 used the following illustration to show how risks spread in the real world. This view is helpful in placing the risks in different quadrants as we determine what our plan or response might be to a particular type of event.

[Figure 10.6 plots frequency of occurrence (low to high) against consequences of occurrence (low to high). The regions shown are Don't Care, Accept Risks, Mitigate Risks, Transfer Risks, and Happens Too Rarely to Address, along with a Maximum Tolerable Single Occurrence Loss threshold.]

Figure 10.6: Modeling risks.

This spectrum of risk is very identifiable in real-world planning for business operations. Over time, losses across this spectrum of risk tend to have similar organizational impacts. From an actuarial standpoint, the notion of annualized loss expectancy (ALE) is used to quantify risk. It is often expressed in terms of occurrences per year and the loss resulting from a single occurrence. Although ALE is useful for comparing risks, it's difficult to achieve credible estimates of occurrence rate for risk events that rarely occur. It's fairly easy to estimate the consequences, but not as easy to estimate the occurrence rate. Computer network risks frequently stem from human actions such as simple configuration errors, but fraud, sabotage, viruses, and malicious attacks may come into play unexpectedly. These occurrence rates are very difficult to quantify. Despite this difficulty, it's often wise to define a set of risks that might have disastrous effects should they occur as a one-time event. Some one-time events might be planned for, while others might result in circumstances for which there can be no pre-planned business strategy. The exercise of documenting a brainstorming session for these types of events often leads to creative, workable strategies for other similar events.
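The ALE calculation described above is a simple product of the two estimates. The dollar figures and occurrence rate below are illustrative assumptions, not data from the guide.

```python
def annualized_loss_expectancy(occurrences_per_year: float,
                               single_loss_expectancy: float) -> float:
    """ALE = annual rate of occurrence (ARO) times single loss expectancy (SLE)."""
    return occurrences_per_year * single_loss_expectancy

# Illustrative numbers only: an outage costing $25,000 per event,
# expected roughly four times per year.
print(annualized_loss_expectancy(4, 25_000))  # prints 100000
```

Because the result scales linearly with both inputs, a risk that is ten times rarer but ten times costlier yields the same ALE, which is exactly why ALE is useful for comparing risks but fragile when the occurrence rate is a guess.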


Addressing the Risks
Typical risk management focuses on risks that come from known and identifiable causes: natural disasters, fires, accidents, death, and so on. Risk is anything that threatens the successful, ongoing operations of the business. The main objective of risk management is to reduce risks to tolerable levels. What is tolerable will vary from organization to organization and is based on a number of factors including technology, people, organizations, and sometimes politics. Ideally, risk management follows a prioritization scheme like that described in Chapter 9. Jacobson's Window offers just one way of looking at risk prioritization. One simple view of the risk management process is offered by the Nonprofit Risk Management Center at http://www.nonprofitrisk.org/.

Risk Management Life Cycle
Like every facet of network services we've touched on throughout this guide, risk management has a life cycle of its own. Figure 10.7 shows the risk management life cycle, from identification and assessment through response development to control and closure.
1 - Identification
2 - Assessment
3 - Response development
4 - Control
5 - Closure

Figure 10.7: The risk management life cycle.


In simple terms, we assess the risks to the business and determine what the core needs are to protect the corporate mission. This assessment typically identifies gaps between the current operational state and what poses acceptable risk. This gap analysis feeds the development of new policies, procedures, and controls. As policies and procedures adapt and change to mitigate risk and meet business needs, it's vital that they be widely disseminated across the organization. This may present the greatest challenge, but a policy isn't effective if employees aren't aware it exists or what purpose it serves. Continuous monitoring of policies and controls will lead to ongoing improvements: a life of risk management within the life cycle of the enterprise itself.

Summary
For much of this guide, we focused on the FCAPS model as a tool for managing faults, configurations, accounting, performance, and security. FCAPS is a model from industry-driven standards groups and has been widely adopted in the telecommunications industry as a method for delivering voice and data services. It's important to openly acknowledge that FCAPS isn't the only model and may not be the best approach for every organization. Today, many service providers take excerpts from the FCAPS approach and fold them into the ITIL model, creating an enterprise-specific framework from the array of best-practices tools and techniques available in the marketplace. For most organizations, there is no one model that fits all needs. The process of compiling the best facets of each approach is a learning process, a maturation. As part of ITIL adoption, most organizations perform a maturity self-assessment exercise to identify where they feel they fit along the curve of maturity as an IT services organization. Very few companies starting the process place themselves toward the mature end of the scale. The act of beginning, of assessing where your organization fits and documenting each step of the way, is a recognized best practice for delivering sustainable, reliable data, voice, and video services in a converged network. At the end of the day, the highly successful service delivery organization doesn't have just the ITIL library; it has an enterprise library of policies, procedures, and practices that, honed over time, documents the best practices for managing the needs of that enterprise.

Download Additional eBooks from Realtime Nexus!


Realtime Nexus, the Digital Library, provides world-class expert resources that IT professionals depend on to learn about the newest technologies. If you found this eBook to be informative, we encourage you to download more of our industry-leading technology eBooks and video guides at Realtime Nexus. Please visit http://nexus.realtimepublishers.com.

