
IBM Global Services

Solving complex performance problems in TCP/IP and SNA environments.


Key Topics

- Discusses how performance analysis of networks relates to key issues in today's business environment
- Explains how to analyze TCP/IP networks to isolate and resolve problems
- Details common performance problems in traditional systems network architecture (SNA) networks
- Discusses the benefits of performance analysis in managing your network for successful business communications

Introduction

The network is crucial to communications for any large corporation. Availability and performance matter even more in environments where e-business is transforming business communications. The growth of the Internet, which provides new services to customers and improves business-to-business communications, increases the demand for continuous operations. Performance, specifically response time, is now a key information technology (IT) measurement.

Today's e-business network is primarily TCP/IP, although enterprises with SNA subarea implementations are also recognizing performance impacts, especially those migrating to Advanced Peer-to-Peer Networking (APPN). This white paper discusses the performance challenges for each of these environments and offers a successful method for analysis and resolution.

Our experience focuses on end-to-end and session-level performance analysis. To quickly identify where incorrect settings are causing users to experience poor performance, begin with an end-to-end analysis. Analyzing the behavior of the network requires tools that provide a statistical representation of operational characteristics in order to identify problems, evaluate network performance and guide tuning actions.

For the purpose of this paper, we will use examples from IBM network traffic analysis (NTA) services tools, which provide this level of detail for TCP/IP and SNA networks. NTA service tools use trace data captured from system trace tools, as well as from standard network data capture tools. Analysis is automated for subarea data flows as well as TCP/IP packet traffic. A view of end-to-end routes experiencing problems provides a starting point for the performance analysis.

Analyzing TCP/IP network performance

Effective analysis of the TCP/IP environment requires an understanding of the fundamental physics of network behavior, the control mechanisms for TCP/IP performance and the identification of performance problems by network behavior analysis. This includes an examination of TCP flows, packet sizes, window behavior, flow and congestion controls and data link operations.

A high-level view of all the routes in the flow is the best place to start the analysis and helps to identify which routes are experiencing problems. From this perspective, one can identify and focus on the cause of performance problems. The cause may simply be a need to make logical parameter changes or a routing adjustment to distribute the load and avoid overloaded resources. In some cases, the problem may be errors in a node or on a link. Once the problem is identified, NTA service tools can speed resolution by automating the detailed analysis and providing specific recommended actions.

When experiencing a performance problem, begin the analysis by tracing the network segment where the problem presents itself, at the time it presents itself. The next step is to use the TCP/IP header data from this trace to understand the condition of the network at the time of the problem. Examine both the connections that experienced problems and a packet-by-packet view of activity. Using NTA service tools, one can identify those connections experiencing excessive response times and determine the cause of throughput problems.

Analysis is then divided into two categories: global analysis and detailed analysis. Global analysis is a high-level analysis of the trace data as a single entity. Results show a breakdown of the fourth layer protocols (such as TCP and UDP) and the amount of data transported by each.

The illustration below shows the division of the fourth layer protocol as detailed by NTA service tools.

Viewing the count of data packets by protocol helps to verify that the trace data captured the problem being experienced. If the data does not match expectations, take another data capture, perhaps in another area of the network, so that the analysis is meaningful. Verify that the trace data includes network traffic at the time of the problem, and continue the analysis by reviewing response times and other global information such as:

- Packet distribution by packet size, with the packet count
- Trace statistic summaries
- Buffer size distribution, including the receiver's window size, with the packet count
- Round trip time distribution, with packet round trip times and the packet count
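The global breakdown described above can be approximated with a few lines of code once a trace has been decoded. The following is a minimal sketch, not the NTA implementation: the record layout (protocol name, packet length), the sample data and the 512-byte bucket width are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical decoded trace records: (fourth layer protocol, packet length in bytes).
packets = [
    ("TCP", 1500), ("TCP", 40), ("UDP", 120), ("TCP", 1500),
    ("UDP", 80), ("TCP", 576), ("TCP", 40),
]

def protocol_breakdown(records):
    """Return {protocol: (packet_count, total_bytes)} for the whole trace."""
    counts = Counter()
    volume = Counter()
    for proto, length in records:
        counts[proto] += 1
        volume[proto] += length
    return {proto: (counts[proto], volume[proto]) for proto in counts}

def size_distribution(records, bucket=512):
    """Count packets per size bucket, keyed by the lower bound of each bucket."""
    dist = Counter((length // bucket) * bucket for _, length in records)
    return dict(sorted(dist.items()))

breakdown = protocol_breakdown(packets)
sizes = size_distribution(packets)

for proto, (count, total) in breakdown.items():
    print(f"{proto}: {count} packets, {total} bytes")
for lower, count in sizes.items():
    print(f"{lower}-{lower + 511} bytes: {count} packets")
```

If the protocol mix or the packet counts look nothing like the traffic that users complained about, that is a hint the capture missed the problem and should be repeated.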

The following illustration shows an example of response time distribution.

After one obtains a high-level understanding of the network traffic conditions, detailed analysis begins. The detailed analysis provides an in-depth look at the data and helps identify problem connections. The following NTA service tools results show the problem routes requiring more focused analysis. The severity of the route condition is highlighted by the color-coded circles, indicating the troubled connections.

Further analysis of one of these probable problem routes provides specific statistical information for that connection. This example shows the detailed statistics for a probable problem route.

Further detailed information about specific connections that aids analysis includes the:

- Number of times the receiver window size dropped to zero
- Longest response time
- Minimum segment size
- Number of fragmented datagrams
- Number of retransmits
- Maximum number of unacknowledged bytes outstanding

A packet-by-packet flow may also be required to analyze data flow details by connection.
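Several of the per-connection statistics listed above can be derived from a time-ordered view of one connection. The sketch below is illustrative only: the simplified event tuples are a hypothetical format, not the output of any IBM tool, and the logic ignores sequence number wraparound, selective acknowledgment and partial retransmissions.

```python
def connection_stats(events):
    """Summarize one TCP connection from time-ordered, simplified events.

    Each event is one of (hypothetical format):
      ("data", seq, length)    sender transmitted `length` bytes starting at `seq`
      ("ack", ack_no, window)  receiver acknowledged up to `ack_no` and
                               advertised a receive window of `window` bytes
    """
    seen_segments = set()
    retransmits = 0
    zero_window_drops = 0
    prev_window = None
    highest_sent = None    # end of the highest byte range sent so far
    highest_acked = None   # highest acknowledgment number seen so far
    max_unacked = 0

    for event in events:
        if event[0] == "data":
            _, seq, length = event
            # The same (seq, length) seen twice is counted as a retransmit.
            if (seq, length) in seen_segments:
                retransmits += 1
            seen_segments.add((seq, length))
            if highest_acked is None:
                highest_acked = seq
            end = seq + length
            highest_sent = end if highest_sent is None else max(highest_sent, end)
            max_unacked = max(max_unacked, highest_sent - highest_acked)
        else:
            _, ack_no, window = event
            highest_acked = ack_no if highest_acked is None else max(highest_acked, ack_no)
            # Count transitions to zero, not every zero-window acknowledgment.
            if window == 0 and prev_window != 0:
                zero_window_drops += 1
            prev_window = window

    return {
        "retransmits": retransmits,
        "zero_window_drops": zero_window_drops,
        "max_unacked_bytes": max_unacked,
    }

sample_events = [
    ("data", 1000, 500),
    ("data", 1500, 500),
    ("ack", 1500, 1000),
    ("data", 2000, 500),
    ("data", 1500, 500),   # same segment sent again
    ("ack", 2500, 0),      # receiver window closes
    ("ack", 2500, 2000),   # receiver window reopens
]
stats = connection_stats(sample_events)
print(stats)
```

A connection that shows many retransmits together with a large number of unacknowledged bytes outstanding is a natural candidate for the packet-by-packet review.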

Example: Poor performance and slow downloads

Performance challenges can result from improper system and network option settings. In a TCP/IP environment, users were complaining of overall poor performance and slow downloads after installation of a new version of OS/390. A trace of the problem showed that the TCP receiver's advertised window size was too large and that retransmissions were occurring. The trace also showed that the OS/390 window scale option was in use. This gave the sender, OS/390, permission to send enormous amounts of data into the network. The first router in the path did not have the buffer resources required to handle the data, so its buffers overran and packets were discarded. The sender would eventually resend the discarded packets, but the extra time involved severely impacted network performance. The solution was to turn off the default window scale option in OS/390.
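The arithmetic behind this example can be sketched briefly. With the TCP window scale option (RFC 1323, later RFC 7323), the 16-bit window field in the TCP header is shifted left by a negotiated count of up to 14 bits, so a modest advertised value can authorize a very large burst. The window value, shift count and router buffer size below are hypothetical; the paper does not give the actual figures, and comparing the window directly with a buffer size is a deliberate simplification that ignores how fast the router drains its queue.

```python
def effective_window(advertised, shift):
    """Receive window in bytes for a 16-bit window field and a scale shift count."""
    if not 0 <= advertised <= 0xFFFF:
        raise ValueError("window field is 16 bits wide")
    if not 0 <= shift <= 14:
        raise ValueError("window scale shift count must be between 0 and 14")
    return advertised << shift

# Hypothetical values, chosen only to illustrate the effect.
router_buffer_bytes = 128 * 1024
unscaled = effective_window(32768, 0)   # window scale option off
scaled = effective_window(32768, 4)     # window scale option on, shift of 4

for label, window in (("unscaled", unscaled), ("scaled", scaled)):
    fits = window <= router_buffer_bytes
    print(f"{label}: {window} bytes in flight allowed, fits router buffer: {fits}")
```

Under these assumed numbers, the unscaled window stays well within the buffer while the scaled window is several times larger, which is consistent with the discards and retransmissions seen in the trace.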

Solving performance problems in traditional SNA networks

Many organizations rely on well-performing SNA implementations to meet their communications needs. When beginning the analysis of SNA subarea or APPN environments, examine logical unit (LU) flows, packet sizes, window behavior, congestion, bottlenecks and data link operations.

Transaction times in a network consist of processing times (service time) and delays. Some delays, such as propagation, cannot be controlled, while other delays are variable and controllable. One should focus on variable and controllable delays, especially network queues and how to control them. Delays have a direct bearing on transaction times, which can affect expenses and user satisfaction.

One common performance problem in SNA environments is a blocked virtual route (VR). Usually, the symptom is data backing up in the network, which increases the time needed to get through the network. This blockage is referred to as subarea congestion and can be caused by incorrect VR pool or VR pacing window sizes.

Analyzing VTAM Internal Trace (VIT) data helps identify the held routes. NTA service tools automatically decipher the trace data to highlight the routes that are experiencing poor performance. The following illustration is an example of the potential problem route screen showing two held VRs.

As the next step, NTA identifies potential causes of the problem along with recommendations. The following is an illustration of the recommendations provided by the expert system portion of NTA.

Along with the recommendations, other VR statistics, such as average path information unit (PIU) size by time, window count and transmission group (TG) utilization, assist in completing the analysis and resolving performance problems. The following is an illustration of TG utilization.

A single view of all the VRs and their status can also be helpful in analyzing and understanding the activity in the network.
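Two of the VR statistics mentioned above, average PIU size by time and TG utilization, reduce to simple interval arithmetic. The following sketch assumes a hypothetical record format of (timestamp in seconds, PIU size in bytes) for a single TG and a known link speed; it counts PIU bytes only and ignores link-level overhead, so it understates true utilization.

```python
from collections import defaultdict

def interval_stats(pius, link_bps, interval_s=60):
    """Return {interval_start: (average_piu_bytes, utilization_fraction)}.

    pius:     iterable of (timestamp_seconds, piu_bytes) for one TG (hypothetical format)
    link_bps: link speed in bits per second
    """
    buckets = defaultdict(list)
    for ts, size in pius:
        start = int(ts // interval_s) * interval_s
        buckets[start].append(size)

    stats = {}
    for start in sorted(buckets):
        sizes = buckets[start]
        total_bytes = sum(sizes)
        average = total_bytes / len(sizes)
        utilization = (total_bytes * 8) / (link_bps * interval_s)
        stats[start] = (average, utilization)
    return stats

# Hypothetical sample: three PIUs on a 56 kbps link, one-minute intervals.
sample = [(0, 1000), (10, 3000), (65, 7000)]
stats = interval_stats(sample, link_bps=56000, interval_s=60)
for start, (average, utilization) in stats.items():
    print(f"t={start}s: average PIU {average:.0f} bytes, utilization {utilization:.2%}")
```

Plotting these values over the trace period makes it easier to see whether a held VR coincides with a heavily used TG or with a change in PIU size.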

Following are some examples of network performance challenges in both SNA subarea and SNA/APPN environments.

Example: Slow data transfer, slow communications

This example shows how parameter settings can affect an SNA subarea network, leading to poor business communications. In this example, the key to communications was the transfer of data between a number of clients and the supplier. The supplier received a report from a client that data transfers were taking much longer than was reasonable. The transfer time was acceptable if the supplier became the primary application and received the file. However, if the client sent the file to the supplier, the transfer time was unacceptable.

Network data was collected from both the unacceptable transfer and the acceptable transfer scenarios. Using NTA service tools, the analyst determined that the data rates were 240 bytes per second for the unacceptable flow and 6,200 bytes per second for the acceptable flow. Using this information, the supplier determined that, during the unacceptable transfer, the supplier was the secondary application and the window size was set at two. Conversely, during the acceptable data transfer, the supplier was the primary application and the window size was set at 15. The analyst recommended that the client change the primary send pacing (PSNDPAC) to 15. This resolved the slow data transfer.

Example: Printing slowdown impacts productivity

This example involves an SNA/APPN performance problem with a series of high-speed printers. Throughout the day, printers would stop printing briefly in the middle of a print job and then start again. This condition is known as clutching. The printers were attached to an RS/6000 print server that was using CS/AIX for the SNA support. The server was connected to a VTAM host through a Token Ring. The users discovered that all the printers attached to the RS/6000 server were clutching during peak loads. As a result, large print jobs took an excessive time to complete, which led to a violation of a service level agreement the users had with their customer.


An analyst traced the Token Ring that connected the RS/6000 print server to the 3745 front end processor. After reviewing the trace using NTA service tools, the analyst determined that there was a coding mismatch between the Token Ring parameters in VTAM and in CS/AIX on the print server. This mismatch caused the host to send a number of data packets to the RS/6000 and then wait for acknowledgment before sending the next set of frames. Because CS/AIX was misconfigured, it would expect more data after the host had finished sending. When this occurred, the print server had no data to send to the printers. Eventually, a timer would expire, the acknowledgment would flow and the sequence would start over. This created long delays between data transmissions, resulting in clutching. Using the findings of this analysis, CS/AIX was reconfigured and the problem was resolved.
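The effect of such a mismatch can be estimated with a simple cycle model: the host sends one window of frames, then sits idle until the acknowledgment arrives. The sketch below is an illustration under assumed values. The link speed, frame size, window size and timer values are hypothetical and do not come from the traced network.

```python
def throughput_bytes_per_s(frames_per_window, frame_bytes, link_bps, ack_delay_s):
    """Sustained throughput when the sender waits for an acknowledgment
    after every window of frames.

    One cycle = time to transmit the window + time spent waiting for the ack.
    """
    window_bytes = frames_per_window * frame_bytes
    send_time_s = window_bytes * 8 / link_bps
    return window_bytes / (send_time_s + ack_delay_s)

# Hypothetical parameters for a 16 Mbps Token Ring.
LINK_BPS = 16_000_000
FRAME_BYTES = 2000
WINDOW_FRAMES = 7

# Matched settings: the receiver acknowledges as soon as the window is complete.
matched = throughput_bytes_per_s(WINDOW_FRAMES, FRAME_BYTES, LINK_BPS, ack_delay_s=0.001)

# Mismatched settings: the receiver waits for frames that never come,
# so the acknowledgment flows only when its timer expires.
mismatched = throughput_bytes_per_s(WINDOW_FRAMES, FRAME_BYTES, LINK_BPS, ack_delay_s=0.5)

print(f"matched:    {matched:,.0f} bytes per second")
print(f"mismatched: {mismatched:,.0f} bytes per second")
```

In this model the idle wait dominates the cycle whenever the timer is long compared with the transmission time of one window, which matches the stop-and-start pattern seen at the printers.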

Benefits of performance analysis

Effectively identifying and resolving complex network performance problems in TCP/IP, SNA subarea and SNA/APPN environments requires the ability to quickly analyze network traffic. This ability, in turn, requires skills, specific tools and an efficient process. As the network continues to play an increasingly important role in successful business communications and operations, the ability to satisfy users and clients will be determined by the ability to quickly identify and resolve performance problems.

For more information

To learn more about IBM Performance Management and Capacity Planning Services and IBM Global Services, visit www.ibm.com/services or contact your IBM sales representative. You can also contact us at 1 800 426-4682 in the U.S. and 919 301-4141 outside the U.S., or e-mail us at capacity@us.ibm.com.

Bob Springsteen is a certified network consultant and Chris Lennon is a network specialist responsible for performance management of networks, with specific focus on providing remote assistance to clients using NTA service tools for IBM Performance Management and Capacity Planning Services. The capabilities of NTA service tools illustrated through examples in this white paper are available through electronic and Web access from IBM.


Copyright IBM Corporation 2000

IBM Global Services
Route 100
Somers, NY 10589

Produced in the United States of America
04-00
All Rights Reserved

IBM, the e-business logo, OS/390 and RS/6000 are registered trademarks or trademarks of International Business Machines Corporation in the United States, other countries, or both.

Other company, product and service marks may be trademarks or service marks of others.

References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates. IBM reserves the right to change specifications or other product information without notice. This publication may include typographical errors and technical inaccuracies and may be changed or withdrawn at any time. The content is provided as is, without warranties of any kind, either express or implied, including the implied warranties of merchantability and fitness for a particular purpose. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore this disclaimer may not apply to you.

G510-1182-00
