Education Services Courseware

NetScreen 5000 Series Security Systems and ISG Series Troubleshooting


Student Guide

NOTE: This Student Guide was developed from an audio narration, so it is written in conversational English. The purpose of this transcript is to help you follow the online presentation, and you may need to refer to that presentation while reading.

Slide 1

NetScreen 5000 Series Security Systems and ISG Series Troubleshooting

2010 Juniper Networks, Inc. All rights reserved. | www.juniper.net | Proprietary and Confidential

Welcome to the Juniper Networks NetScreen 5000 Series Security Systems and ISG Series Troubleshooting eLearning module.

Course SERT-NS5000

Juniper Networks, Inc.

Slide 2

Navigation

Throughout this module, you will find slides with valuable detailed information. You can stop any slide with the Pause button to study the details. You can also read the notes by using the Notes tab. You can click the Feedback link at any time to submit suggestions or corrections directly to the Juniper Networks eLearning team.

Slide 3

Course Objectives
After successfully completing this course, you will be able to:
Distinguish between ISG Series and NS5000 Series hardware configuration and packet flow
Explain the importance of the ASIC functions
Describe First Path and Fast Path in packet flow
Differentiate between functions processed in the CPU versus PPU
Use and interpret debug commands unique to high end systems
Explain the workarounds for 3 typical troubleshooting examples

Slide 4

Agenda: NetScreen 5000 Series Security Systems and ISG Series

The High End Systems
Architecture
Packet Flow
ASIC Functions
Debug
Troubleshooting Examples

This course consists of six sections: The High End Systems, Architecture, Packet Flow, ASIC Functions, Debug, and Troubleshooting Examples.

Slide 5

The High End Systems

In this section we take a look at the high end systems: the ISG Series and the NetScreen 5000 Series.

Slide 6

Section Objectives
After successfully completing this section, you will be able to:
Identify the two high end system series
List the built-in modules and the interface cards in the platform
Identify the types of SPMs available with each of the three Management modules

Slide 7

What are the High End Systems? (1 of 4)


ISG Series
ISG1000, ISG1000-IDP

ISG2000, ISG2000-IDP

First we have the ISG Series, which is the lower range of the high end systems, with the ISG1000, and the ISG2000. They can also have IDP for the security module, which we are going to see is provided as a built-in card.

Slide 8

What are the High End Systems? (2 of 4)


ISG Series Modules
Management Module (built-in)
Security Module (built-in)
  Provides IDP functionality
ASIC module (built-in)
Interface Cards:
  4-port FE
  8-port FE
  2-port GE
  4-port GE (starting from ScreenOS 5.4)
  1-port XGE (starting from ScreenOS 6.1)

We have built-in modules and also interface cards. The built-in modules are the Management module, the Security module for IDP, and the ASIC module. The interface cards come in four-port and eight-port Fast Ethernet (FE) versions and in two-port and four-port Gigabit Ethernet (GE) versions. The four-port GE card is available starting from ScreenOS 5.4, and the one-port 10 Gigabit Ethernet (XGE) card is available starting with ScreenOS 6.1.

Slide 9

What are the High End Systems? (3 of 4)


NS5000 Series
NS5200

NS5400

We also have the NS5000 Series. These are in the higher range of the high end systems, and there are two chassis: the NS5200 and the NS5400. The NS5400 has two more slots for the line cards.

Slide 10

What are the High End Systems? (4 of 4)


NS5000 Series Modules
Management Modules: MGT, MGT2, MGT3
Secure Port Modules (SPM):
  2G24FE
  8G
  8G2
  2XGE-2
  8G2-G4 (ScreenOS 6.1)
  2XGE-G4 (ScreenOS 6.1)

Management module and SPM compatibility:
        2G24FE  8G   8G2  2XGE  8G2-G4  2XGE-G4
MGT     YES     YES  NO   NO    NO      NO
MGT2    YES     YES  YES  YES   NO      NO
MGT3    NO      NO   NO   NO    YES     YES

What sorts of modules do we have for this platform? We have the Management modules and the Secure Port Modules (SPMs). There are three types of Management modules, referred to as Management 1, 2 and 3. For SPM, there is the two gigabit, 24-port fast Ethernet (2G24FE). Then there is the eight port gigabit and a two port ten gigabit.

With ScreenOS 6.1 we have the latest version of the eight gigabit and ten gigabit cards. We will see that in a subsequent slide.

In the table here, you see how they can be used. For Management 1 we can use the 24 port FE and the eight Gig 1 card. With Management 2, we can also use the eight Gig 2 card and the 10 gigabit card, and with Management 3, we can use only the newer generation of the eight Gig and the two port 10 Gig card.
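
The compatibility rules described above can be captured in a small lookup, which may be handy when checking a planned chassis build. This is only an illustrative sketch: the dictionary contents come from the slide table, while the names COMPATIBLE_SPMS, is_supported and unsupported_spms are made up for this example and are not ScreenOS commands.

```python
# Hypothetical helper: Management module -> SPM types it can be combined with,
# transcribed from the compatibility table on this slide.
COMPATIBLE_SPMS = {
    "MGT": {"2G24FE", "8G"},
    "MGT2": {"2G24FE", "8G", "8G2", "2XGE"},
    "MGT3": {"8G2-G4", "2XGE-G4"},
}


def is_supported(mgt, spm):
    """Return True if the table lists this SPM as usable with this Management module."""
    return spm in COMPATIBLE_SPMS.get(mgt, set())


def unsupported_spms(mgt, installed):
    """Return the installed SPMs that the table does not list for this Management module."""
    return [spm for spm in installed if not is_supported(mgt, spm)]


if __name__ == "__main__":
    print(unsupported_spms("MGT3", ["8G2-G4", "8G"]))
```

For example, asking about a Management 3 card together with an older 8G card reports the 8G card as the mismatch.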

Slide 11

Section Summary
In this section, we:
Identified the two high end system series
Listed the built-in modules and the interface cards in the platform
Identified the types of SPMs available with each of the three Management modules

Slide 12

Learning Activity 1: Question 1


The built-in modules include which of the following?
A) Interface card
B) 8 port FE
C) High end system
D) ASIC module

Slide 13

Learning Activity 1: Question 2


With the Management-3 module we can only use which one of the following?
A) ScreenOS 6.1
B) Newer generation cards
C) 24 port FE
D) SPM Built-in

Slide 14

Architecture

Slide 15

Section Objectives
After successfully completing this section, you will be able to:
Differentiate between the ISG and NetScreen 5000 chassis
Use the commands get system path and get chassis

Slide 16

Architecture (1 of 11)
Why is the architecture important?
To understand the packet flow
Troubleshooting depends on it
  These components are directly involved in the process
Debugging in the CPU level is not always enough
System behavior depends on the architecture
  E.g., in ScreenOS 5.4, TCP SYN check is done in CPU on NS5000, but it's done in PPU on ISG
Features depend on the architecture
  AES encryption done in ASIC for GigaScreen3 and 4

Why talk about the architecture? It's very important to understand packet flow in the system and to be able to troubleshoot it, because these components are directly involved in the process. Debugging in the CPU may not always be enough to find the reason a packet was dropped or why traffic is not processed as expected. Also, system behavior depends on the architecture: depending on the card or version that's being used, the behavior might be different. The example here is the TCP SYN check in ScreenOS 5.4, which is done in the CPU for the NetScreen 5000 Series, but for the ISG it's done in the PPU. We are going to see what the PPU is later in the course; for now, note that the PPU is inside the ASIC chip, so it's very important for us to understand.

Another example that shows that features depend on the architecture is the fact that AES encryption is done in the ASIC for GigaScreen3 and 4, which we will see when we look at the schematic.

Slide 17

Architecture (2 of 11)
Highlights
Use of ASIC chips to increase performance and throughput
ISG Series have GigaScreen3 ASIC
NS5000 Series have 3 different ASICs:
  GigaScreen2: 2G24FE/8G SPM
  GigaScreen3: 8G2/2XGE SPM
  GigaScreen4: 8G2-G4/2XGE-G4 SPM
Management and Security Modules with dual CPU

Let's cover some highlights concerning the architecture. The use of ASIC chips increases the performance and throughput, which is one great advantage of this platform. The ISG Series uses the GigaScreen3 ASIC.

The NetScreen 5000 Series has three different ASIC types, depending on the Secure Port Module used. They are listed on the slide. The GigaScreen4 is the latest one; it's used in combination with the Management3 card that we saw in the table in a previous slide.

Another important thing is that the Management and the Security modules have dual CPUs. One CPU is used to process the flow of traffic and the other CPU is used to perform tasks such as OSPF routing or other management tasks in the system.

Slide 18

Architecture (3 of 11)
ISG Chassis
1 x GigaScreen3 ASIC in the ASIC module
ASIC module has direct connection with Management and Security Modules via PCI bus
Management and Security Modules have dual CPU
Security Module has additional FPGA (Field-Programmable Gate Array)
[Diagram: ISG Series chassis, labeled: Network Traffic, I/O cards (x4), ASIC Module, Security modules (Dual 1GHz PowerPC CPU, 2 GB RAM, FPGA), Management Module]

Let's look at the ISG Series. The basic structure consists of one ASIC module. At the bottom are the interface cards that connect to the ASIC module, and the ASIC module connects to the security modules, which provide the IDP functionality. The ISG2000 can have three security modules and the ISG1000 can have two. Then there's the Management module. The security module also has an FPGA to help provide high throughput to the system.

Slide 19

Architecture (4 of 11)
ISG ASIC Module
Built-in 1 x GigaScreen3 ASIC
All I/O cards connect to backplane with dedicated paths to ASIC chip
Front End Processor FPGA chips interface between I/O and ASIC (2 in ISG2000 and 1 in ISG-1000)
[Diagram: ASIC Module, labeled: Control Bus, Slots 1 to 4, two Data Bus FPGAs, GigaScreen3, SDRAM]

Let's look specifically now at the ISG ASIC module. That's the focus of our attention because that's where we need to look when we are troubleshooting the platform. We have the GigaScreen3 ASIC and the I/O cards, and there is a data bus from each I/O card to an FPGA, which is a front-end processor. You can think of the FPGA as a switch that transfers the packets from the I/O cards to the ASIC chip for processing.

Slide 20

Architecture (5 of 11)
ISG2000 Architecture
ISG-1000/2000 share a similar HW architecture
Single ASIC chip, FPGA chip, IO modules are separated with chip
[Diagram: top view of the ISG2000 chassis, labeled: Slot 3 MGT Module, Slot 2-0 Security Modules, I/O Modules, ASIC Module, FAN Module]

Here we see a view of the chassis from the top. On the left-hand side is the rear of the chassis and on the right-hand side is the front. In the front are the I/O modules. Then we see the ASIC module, then three empty slots for the security modules, and in the back, in slot three, the Management module.

Slide 21

Architecture (6 of 11)
ISG-1000 Architecture
ISG-1000/2000 share a similar HW architecture
Single ASIC, Switch Fabric FPGA, IO modules are separated with chip
[Diagram: ISG-1000 chassis, labeled: FAN Module, Power Supply Module, 2 Slot for Security Module, ASIC Module, Slot 3 Mgt Module]

The ISG1000 is very similar. Here the front is on the left side of the picture. We see the ASIC module; it's always the one that's closest to the I/O cards. Then there are two slots in the middle for the security modules. Here again we see slot 3 for the Management module. Finally, there's the power supply in the back of the chassis.

Slide 22

Architecture (7 of 11)
NS5000 Chassis
GigaScreen ASIC in the SPM
15Gbps switch fabric interconnecting SPMs
Dedicated bus for control
Dedicated bus for traffic to MGT module
MGT1 has one CPU
MGT2/MGT3 have 2 CPUs
[Diagram: NetScreen 5400, labeled: MGT, three SPMs, 15 Gbps Switch Fabric]

Next, let's look at the NetScreen 5000 in general. We have more capacity here. There are three SPMs that share the 15 gigabit switch fabric. The chassis has a dedicated bus for control and another bus for traffic to the Management module. Later we will show that when an SPM needs to send traffic to the Management module, that dedicated bus is used to avoid any congestion.

The Management2 and Management3 cards have two CPUs, one for flow and one for tasks. On Management1, flow and task processing share the same physical CPU and are separated in the software architecture.

Slide 23

Architecture (8 of 11)
NS5000 SPM (1)
ASIC chips reside in the SPMs
Number and type of ASIC depend on the SPMs:
  2G24FE: 1 x GigaScreen2
  8G: 2 x GigaScreen2
  8G2/2XGE: 2 x GigaScreen3
  8G2-G4/2XGE-G4: 2 x GigaScreen4
Front End Processor FPGA chips interface between ASICs and backplane to MGT board/ASICs in other SPMs

Here you see the Secure Port Module of the NS5000. This would be the equivalent of the ASIC module that we saw for the ISG Series.
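
Because the ASIC generation decides where some features run, it can help to keep the SPM-to-ASIC mapping from this slide in a lookup. The sketch below is illustrative only: the data is transcribed from the slide (plus the earlier statement that AES encryption is done in the ASIC for GigaScreen3 and 4), and the names SPM_ASICS, describe_spm and aes_in_asic are invented for this example.

```python
# Hypothetical lookup: SPM type -> (number of ASICs, ASIC generation),
# transcribed from the "NS5000 SPM (1)" slide.
SPM_ASICS = {
    "2G24FE": (1, "GigaScreen2"),
    "8G": (2, "GigaScreen2"),
    "8G2": (2, "GigaScreen3"),
    "2XGE": (2, "GigaScreen3"),
    "8G2-G4": (2, "GigaScreen4"),
    "2XGE-G4": (2, "GigaScreen4"),
}


def describe_spm(spm):
    """Return a short description such as '2 x GigaScreen3'."""
    count, asic = SPM_ASICS[spm]
    return f"{count} x {asic}"


def aes_in_asic(spm):
    """Per the architecture slide, AES is done in the ASIC for GigaScreen3 and 4."""
    _, asic = SPM_ASICS[spm]
    return asic in ("GigaScreen3", "GigaScreen4")


if __name__ == "__main__":
    for name in SPM_ASICS:
        print(name, describe_spm(name), aes_in_asic(name))
```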

Slide 24

Architecture (9 of 11)
NS5000 SPM (2)
[Diagram: 8G2-G4 SPM, labeled: two GigaScreen4 ASICs, Backplane, three FPGAs, two I/O blocks]

Here you see there are two GigaScreen ASICs in each module. There are front-end processors that do the interconnection within the cards, between the different ASICs, and also to the backplane if the traffic needs to go to another SPM.

At the bottom you see the I/O interface. This can be one ten gig port or four one Gig ports.

Slide 25

Architecture (10 of 11)


How to check the hardware configuration?

ns5400-> get system | in product
Product Name: NetScreen-5400-II
isg2000-> get system | in product
Product Name: NetScreen-2000
ns5200-> get system | in product
Product Name: NetScreen-5200-II
nsisg1000-> get system | in product
Product Name: NetScreen-ISG1000

How do we check the hardware configuration? This simple command shows what product we are talking about: get system | in product.
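
If you collect this output from many devices, a short script can pull out the product name and sort devices into the two families covered in this course. The following is a rough sketch, assuming the output looks like the four samples on the slide; the function names and the family mapping are assumptions made for this example, and other ScreenOS platforms will report names not handled here.

```python
import re

# Matches the "Product Name:" line shown in the slide samples.
PRODUCT_RE = re.compile(r"^Product Name:\s*(\S+)", re.MULTILINE)


def product_name(output):
    """Return the product name from 'get system | in product' output, or None."""
    match = PRODUCT_RE.search(output)
    return match.group(1) if match else None


def platform_family(name):
    """Classify using only the four names that appear on the slide (an assumption)."""
    if name in ("NetScreen-2000", "NetScreen-ISG1000"):
        return "ISG"
    if name in ("NetScreen-5200-II", "NetScreen-5400-II"):
        return "NS5000"
    return "unknown"


if __name__ == "__main__":
    sample = "isg2000-> get system | in product\nProduct Name: NetScreen-2000\n"
    name = product_name(sample)
    print(name, platform_family(name))
```

Note that in the slide sample the ISG2000 reports itself as NetScreen-2000, so matching on the string "ISG" alone would miss it.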

Slide 26

Architecture (11 of 11)


How to check the hardware configuration?

ns5400-> get chassis
Chassis Environment:
  Power Supply: Good
  Fan Status: Good
  Battery Status: Good
  CPU Temperature: 141'F (61'C)
Slot Information:
  Slot  Type                S/N               Assembly-No  Temperature   DRAM Size
  1     Management-III      0225032008000036  0072-001     111'F (44'C)  2048MB
  2     Processing-2XGE-G4  0227032008000003  0085-001     116'F (47'C)  1024MB
  3     Processing-8G2-G4   0226032008000055  0084-001     109'F (43'C)  1024MB

isg2000(M)-> get chassis
Chassis Environment:
  Power Supply: Good
  Fan Status: Good
  CPU Temperature: 113'F ( 45'C)
Slot Information:
  Slot  Type          S/N               Assembly-No  Version  Temperature
  0     System Board  0079022005000207  0051-005     E01      78'F (26'C), 86'F (30'C)
  4     Management    0081022005000392  0049-004     D06      113'F (45'C)
  3     Security      0137062005000114  0049-001     A02      cpu1:Ready, cpu2:Ready
  5     ASIC Board    000140527B050065  0050-003     C00
  Marin FPGA version 9, Jupiter ASIC version 1, Fresno FPGA version 110
I/O Board
  Slot  Type           S/N               Version  FPGA version
  1     1 port XFP     0229062008000062  A00      3
  2     4 port 10/100  0084042004000002  D01      6
  3     1 port XFP     0229062008000070  A00      3

If you want to see details, use the command get chassis. Here you see an example, first for a NetScreen 5400. Management3 is the card being used, and there's one ten Gig module and one eight Gig module; they are in slots two and three in this notation. You can see the serial number for each card, the assembly number, the temperature and the DRAM size.

At the bottom, the other output is for the ISG2000. Here there is also a management board, but additionally there is the security module, and then the ASIC module as was shown in the schematic, and also the I/O cards. In the middle you can see the FPGA version information. Jupiter is the internal name of the ASIC and Fresno is the internal name of the FPGA. Those were the names used when the command was run.
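
The temperature fields in this output use the form 111'F (44'C). As a small illustration of reading them programmatically, the sketch below extracts every Fahrenheit/Celsius pair from a piece of text and checks that the two values agree. It assumes the field format shown in the slide samples; the function names are invented for this example.

```python
import re

# Matches readings such as "141'F (61'C)" and "113'F ( 45'C)".
TEMP_RE = re.compile(r"(\d+)'F\s*\(\s*(\d+)'C\)")


def parse_temps(text):
    """Return a list of (fahrenheit, celsius) integer pairs found in the text."""
    return [(int(f), int(c)) for f, c in TEMP_RE.findall(text)]


def f_to_c(fahrenheit):
    """Convert Fahrenheit to Celsius, rounded to the nearest degree."""
    return round((fahrenheit - 32) * 5 / 9)


def readings_consistent(text):
    """True if every Celsius value matches the conversion of its Fahrenheit value."""
    return all(f_to_c(f) == c for f, c in parse_temps(text))


if __name__ == "__main__":
    line = "0 System Board 0079022005000207 0051-005 E01 78'F (26'C), 86'F (30'C)"
    print(parse_temps(line), readings_consistent(line))
```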

Slide 27

Section Summary
In this section, we:
Differentiated between the ISG and NetScreen 5000 chassis
Showed how to use the commands get system path and get chassis

Slide 28

Learning Activity 2: Question 1


Most troubleshooting of the ISG platform focuses on which of the following?
A) ASIC module
B) Management module
C) IDP functionality
D) I/O cards

Slide 29

Learning Activity 2: Question 2


SPM in NS5000 is equivalent to what in the ISG Series?
A) I/O interface
B) ASIC module
C) FPGA
D) DRAM

Slide 30

Packet Flow

Slide 31

Section Objectives
After successfully completing this section, you will be able to:
Explain the difference between packet flow in First Path and Fast Path
Describe packet flow in the NS5000 and ISG Series platforms
Identify packet types that need to be processed at the CPU level

Slide 32

Packet Flow (1 of 6)
NS5000 First Path: CPU is involved in processing
1) Packet arrives at interface chip
2) Packet is forwarded to FPGA
3) FPGA forwards it to ASIC
4) ASIC checks the packet and forwards it to CPU
5) CPU processes the packet and sends it back to ASIC
6) ASIC forwards the packet to FPGA
7) FPGA forwards packet to interface chip
8) Interface chip sends the packet out
[Diagram: MGT3 module with two CPUs and an 8G2-G4 SPM, labeled: two GigaScreen4 ASICs, Backplane, three FPGAs, two I/O blocks, Packet; the path is marked with step numbers]

Let's now look at Packet Flow. We want to show how packets go through the different components so you know what to look for when you are troubleshooting. We will start with the NetScreen 5000. The example here is for the First Path. The First Path is when the CPU is involved in processing the packet. We call it First Path because this process is most commonly used when there is a packet for a new session. A new session is always created in the CPU, so the ASIC needs to forward traffic to the CPU for processing.

You see the packet at the bottom, at step number 1. The packet arrives at the interface chip, then goes to the FPGA, and the FPGA forwards it to the ASIC that's directly connected to it. The ASIC looks at the packet and determines that it needs to be sent to the CPU. It sends the packet to the CPU via the backplane, and the CPU does the processing. Let's say the CPU creates the session: the session is installed in the ASIC chip and the packet is sent back to the same ASIC chip. The ASIC chip now matches the packet to the existing session and sends it out. At that point the FPGA gets the packet and forwards it to the correct outgoing interface. The packet goes to the interface and then leaves the system.

Slide 33

Packet Flow (2 of 6)
NS5000 First Path: CPU is involved in processing
First packet for session creation
Packets that need ALG/DI/Web Filtering
Packets for the following protocols need to be processed by CPU:
  0: IPv6 Hop-by-Hop Option
  1: ICMP
  2: IGMP
  4: IP-in-IP
  58: ICMPv6
  89: OSPF
  103: PIM
  112: VRRP
  132: SCTP

Here are some more details about the First Path. To repeat, first, it's for session creation. When a packet doesn't match any existing flow, it has to be sent to the CPU for session creation. Also, when we have Application Layer Gateway (ALG) inspection, Deep Inspection (DI) or Web Filtering, the content of the packet needs to be inspected. For example, with the FTP ALG, the control connection needs to be inspected so that the dynamic ports can be opened properly by the firewall. There are other packets that also need to be processed at the CPU level, mainly ICMP, IGMP, OSPF, PIM, VRRP, SCTP and so on.
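
The three First Path conditions just described can be written as a simple predicate, which is a useful mental checklist when deciding whether a CPU-level debug should show a given packet. This is a simplified sketch, not the ASIC's actual logic: the protocol numbers are taken from the slide, and the function name and parameters are assumptions made for illustration.

```python
# IP protocol numbers listed on the slide as always processed by the CPU.
CPU_PROTOCOLS = {
    0: "IPv6 Hop-by-Hop Option",
    1: "ICMP",
    2: "IGMP",
    4: "IP-in-IP",
    58: "ICMPv6",
    89: "OSPF",
    103: "PIM",
    112: "VRRP",
    132: "SCTP",
}


def needs_cpu(protocol, session_exists, needs_inspection=False):
    """Return True if the packet would take the First Path in this simplified model.

    needs_inspection stands for ALG, Deep Inspection or Web Filtering.
    """
    if not session_exists:
        return True  # first packet: the session is created in the CPU
    if needs_inspection:
        return True  # packet content must be inspected by the CPU
    return protocol in CPU_PROTOCOLS


if __name__ == "__main__":
    print(needs_cpu(6, session_exists=True))   # established TCP flow
    print(needs_cpu(89, session_exists=True))  # OSPF
```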

Slide 34

Packet Flow (3 of 6)
NS5000 Fast Path: CPU is not involved: packet matches session
1) Packet arrives at interface chip
2) Packet is forwarded to FPGA
3) FPGA forwards it to ASIC
4) ASIC checks the packet, matches session and forwards it back to FPGA
5) FPGA forwards packet to interface chip
6) Interface chip sends the packet out
[Diagram: MGT3 module with two CPUs and an 8G2-G4 SPM, labeled: two GigaScreen4 ASICs, Backplane, three FPGAs, two I/O blocks, Packet; the path is marked with step numbers]

Now that we have considered the First Path, which requires CPU help to process the traffic, let's check the Fast Path. It is called Fast Path because the CPU doesn't get involved. The GigaScreen ASIC is capable of processing the flow and avoids burdening the CPU. The packets are processed at the ASIC level, and that's how we get very high throughput with this system.

Let's look at how the packet flows. It first arrives at the interface chip and goes to the FPGA, which forwards it to the GigaScreen ASIC. The ASIC checks the packet against the session table: the session lookup engine matches the session and identifies the outgoing interface, and the ASIC sends the packet back to the FPGA. The FPGA then forwards it to the interface port, and it is sent out.
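
To make the difference between the two paths concrete, here is a toy model of the NS5000 flow described above. It is only a sketch for study purposes: the class ToyAsic, its session set and the hop names are invented for this example and do not correspond to any ScreenOS data structure. The first packet of a flow misses the session table and detours through the CPU; later packets of the same flow stay in the ASIC.

```python
class ToyAsic:
    """Toy model: a session table plus the hop sequence a packet takes."""

    def __init__(self):
        self.sessions = set()  # flows that already have a session installed

    def forward(self, flow):
        """Return the list of components the packet visits, in order."""
        hops = ["interface chip", "FPGA", "ASIC"]
        if flow not in self.sessions:
            # First Path: no session match, so the CPU creates the session,
            # installs it in the ASIC, and returns the packet to the ASIC.
            hops += ["CPU", "ASIC"]
            self.sessions.add(flow)
        # From here on the packet matches a session (Fast Path handling).
        hops += ["FPGA", "interface chip"]
        return hops


if __name__ == "__main__":
    asic = ToyAsic()
    flow = ("10.1.1.1", 1024, "10.2.2.2", 80, "tcp")  # made-up example flow
    print(asic.forward(flow))  # first packet: includes the CPU
    print(asic.forward(flow))  # later packet: CPU not involved
```

The model ignores the ALG and protocol exceptions covered on the previous slide; it only shows the session-miss case.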

Slide 35

Packet Flow (4 of 6)
ISG2000-IDP First Path: Traffic is sent from CPU to Security Module
1) Packet arrives at interface card
2) Packet is forwarded to FPGA
3) FPGA forwards it to ASIC
4) ASIC checks the packet and forwards it to CPU (pass 96 bytes to MM via PCI control bus)
5) CPU processes the packet and sends it to ASIC
6) ASIC receives the packet and forwards it to IDP (A complete packet is transferred to SM through Data Bus)
7) IDP processes the packet and sends it to ASIC
8) ASIC sends packet to FPGA
9) FPGA forwards packet to interface chip
10) Interface card sends the packet out
[Diagram: MM with two CPUs, SM with two CPUs, and the ASIC Module, labeled: Slots 1 to 4, two Data Bus FPGAs, GigaScreen3, SDRAM, Packet; the path is marked with step numbers]

Next, we check the First Path for the ISG2000 with the IDP security module. Let's see how the packet flows in this case. We start at the same point: the packet arrives at the interface card and then goes via the data bus to the FPGA. The FPGA sends it to the ASIC chip; the ASIC chip checks the session table and does not find a match, so it sends the packet to the Management module for session creation in this example. If it's ALG traffic, the session actually is matched, but it has a flag that says this packet needs to go to the CPU for inspection and further processing. The packet is then processed and sent back to the GigaScreen ASIC. If the security module also needs to inspect the traffic, the ASIC gets the packet and sends it to the security module. At this point the whole packet, with all of its content, is sent to the security module, because the security module needs to receive all the data to be able to inspect it. The packet is inspected, goes back to the GigaScreen ASIC, and finally goes out to the interface.

Slide 36

Packet Flow (5 of 6)
ISG2000-IDP Fast Path: Traffic is directly to Security Module
1) Packet arrives at interface card
2) Packet forwarded to FPGA
3) FPGA forwards it to ASIC
4) ASIC checks the packet, matches session and forwards it to IDP
5) IDP processes the packet and sends it to ASIC
6) ASIC sends packet to FPGA
7) FPGA forwards packet to interface chip
8) Interface card sends the packet out
[Diagram: MM with two CPUs, SM with two CPUs, and the ASIC Module, labeled: Slots 1 to 4, two Data Bus FPGAs, GigaScreen3, SDRAM, Packet; the path is marked with step numbers]

How does it work in the case of Fast Path? The CPU is not involved, but the security module still has to inspect the traffic. Again, the packet goes to the FPGA and then to the GigaScreen ASIC. This time it goes straight to the security module, with no CPU involvement. Then the packet is processed and sent back. The GigaScreen ASIC identifies the outgoing interface and sends the packet out through the FPGA, then to the interface card, and then out of the system.
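
The two ISG2000-IDP flows differ only in whether the Management module CPU is visited before the security module. The sketch below builds the hop list for either case so the two can be compared side by side. It is a study aid under stated assumptions: the function name, parameters and hop labels are invented for this example, and needs_cpu stands for any packet the ASIC hands to the CPU (a session miss, or an ALG-flagged session).

```python
def isg_idp_hops(needs_cpu, idp_inspects=True):
    """Return the components an ISG2000-IDP packet visits, in order.

    needs_cpu: True for First Path (the ASIC sends the packet to the MM CPU).
    idp_inspects: True when the security module must inspect the traffic.
    """
    hops = ["interface card", "FPGA", "ASIC"]
    if needs_cpu:
        # First Path: the Management Module CPU processes the packet first.
        hops += ["Management Module CPU", "ASIC"]
    if idp_inspects:
        # The complete packet goes to the Security Module over the data bus.
        hops += ["Security Module (IDP)", "ASIC"]
    hops += ["FPGA", "interface card"]
    return hops


if __name__ == "__main__":
    print(" -> ".join(isg_idp_hops(needs_cpu=True)))   # First Path
    print(" -> ".join(isg_idp_hops(needs_cpu=False)))  # Fast Path
```

In both cases the ASIC is visited once more for each module that handles the packet, which is why counters on the ASIC are a natural place to start when a packet goes missing.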


Slide 37

Packet Flow (6 of 6)
What are the possible paths?
NS5000
  Single-ASIC
  Cross-ASIC
ISG2000
  Always single-ASIC
  Single FPGA
  Dual FPGA
ISG-1000
  Always single-ASIC/single-FPGA

[Diagram: NS5000 8G2-G4 SPM with two GigaScreen4 ASICs, their FPGAs and the backplane; ISG ASIC Module with GigaScreen3, SDRAM, FPGAs, data bus and control bus, and I/O Slots 1 to 4]


Let's now summarize the packet flow and think about the possible paths. First, consider the NetScreen 5000, which can use what we refer to as Single-ASIC or Cross-ASIC. Single-ASIC is when the incoming traffic goes this way and the return traffic then goes out this way, out of the same ASIC chip.

Then we have Cross-ASIC, which goes this way. For example, incoming traffic goes here, and the return traffic goes this way. When the traffic comes from the other side, it arrives here, on the other interface set. It goes to this ASIC for processing, and this ASIC processes the packet and sends it this way. That is Cross-ASIC.

For the ISG, it's always Single-ASIC, because the ASIC module has just one chip, so here we think in terms of the FPGA instead. Traffic can come in and go out through the same FPGA, or it can come into the top FPGA and go out of the bottom FPGA. This is important when we look at the output, so that we know which FPGA to check and what to expect when we look at the counters.


Slide 38

Section Summary
In this section, we:
Explained the difference between packet flow in First Path and Fast Path
Described packet flow in NS5000 and ISG Series platforms
Identified packet types that need to be processed at the CPU level


In this section, we explained the difference between packet flow in First Path and Fast Path, described packet flow in the NS5000 and ISG Series platforms, and identified packet types that need to be processed at the CPU level.


Slide 39

Learning Activity 3: Question 1


Where is a new session always created?
A) ASIC
B) CPU
C) PPU
D) FPGA


Slide 40

Learning Activity 3: Question 2


Cross-ASIC processing is available in which Juniper platform?
A) ISG1000
B) ISG2000
C) NS5000
D) GigaScreen 4


Slide 41

Netscreen 5000 Series Security Systems and ISG Series Troubleshooting

ASIC Functions


Slide 42

Section Objectives
After successfully completing this section, you will be able to:
Differentiate between functions performed in the CPU versus those done in the ASIC chip and PPU
Use the get asic ppu command to see which functions are processed by each PPU


After successfully completing this section, you will be able to:

Differentiate between functions performed in the CPU versus those done in the ASIC chip and the PPU, and
Use the get asic ppu command to see which functions are processed by each PPU


Slide 43

ASIC Functions (1 of 3)
ASIC benefits: Increase Performance and Throughput
FAST PATH: Traffic forwarding without using CPU
VPN Encryption and Decryption (AES, 3DES, DES, SHA-1, MD5)
TCP 4-Way close
IP fragmentation re-assembly
Screening
IPSec fragmentation and re-assembly with IKE acceleration
Byte counters / data collection from local session memory
IPv6 acceleration


Let's now look at the ASIC functions to see what the ASIC is doing. Its most important objective is to increase the performance and throughput of the system. One of the benefits is Fast Path, which enables the system to forward traffic without using the CPU, as we saw in the packet flow.

VPN encryption and decryption are also done in the ASIC chip, so they do not increase CPU utilization. The ASIC can also process the TCP 4-way close, perform fragmentation re-assembly, and handle some screen functions, such as UDP flood, SYN flood and ICMP flood.

It can also perform IPsec fragmentation and re-assembly with IKE acceleration. Additionally, it provides byte counters for the policy and acceleration for IPv6 traffic, so IPv6 traffic is also processed at the ASIC level without going to the CPU.


Slide 44

ASIC Functions (2 of 3)
Packet Processing Units (PPU)
Packet Processing Units (PPU) provide additional processing capacity in ASIC level
Provide additional processing power for ASIC chip
PPU features:
  Defragmentation (cleartext and encrypted)
  TCP SYN check
  SYN proxy
  SYN cookie
  TCP 4-way close
  IPv6 acceleration
  HA packet forwarding (ISG)
  Interface with IDP Security Module (ISG)
  DSCP copy
  Policy counters


One important part of the ASIC chip architecture is the PPU, the packet processing unit. It gives additional processing capacity at the ASIC level. It is an entity that can be programmed to do different things. The features supported in the PPU are listed on this slide.

It can perform defragmentation for both clear-text and encrypted traffic. It can perform TCP SYN check, SYN proxy and SYN cookie, TCP 4-way close, and IPv6 acceleration, as shown previously. On the ISG, it also does the HA packet forwarding and interfaces with the IDP security module. It can also perform the DSCP copy for QoS, and it maintains policy counters to count the number of bytes.


Slide 45

ASIC Functions (3 of 3)
How to check PPU functions
Total of 6 PPUs in GigaScreen3 and 4
Example for ScreenOS 6.3
ns5400(M)-> get asic ppu functions
PPU and XTCPU functions:
Defragmentation of encrypted packets: PPU-A
Defragmentation of clear-text packets: PPU-C
Syn-proxy function: PPU-B
Tcp-3way-check function: PPU-B sdram
HA and IDP packet forwarding: PPU-D
IDP processing: PPU-E
Syn-cookie function: PPU-F
IPV6 flow processing: PPU-A
IPV6 tunnel processing: PPU-C and PPU-D
IPV6 parser: PPU-E

Use get asic # eng ppu functions for ScreenOS 5.4 and earlier


How do you check these functions in the system? It's simple: use the command get asic ppu functions. If you run this command, you can see the PPUs. There are six PPUs in GigaScreen3 and 4, the latest models, named PPU-A to PPU-F. In this example for ScreenOS 6.3, the SYN cookie function is processed by PPU-F. Another example highlighted here: defragmentation of clear-text packets is done by PPU-C.

These assignments might change depending on the version, because of the different features that were included. You can check them using this command.
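Because the assignments can differ between versions, it can help to turn a saved copy of the output into a lookup table. The following is a minimal Python sketch, assuming the output has one "function: PPU" pair per line as in the slide example (the line breaks are an assumption, since the slide text was flattened). The helper names `parse_ppu_functions` and `functions_by_ppu` are hypothetical.

```python
import re

# Text copied from the ScreenOS 6.3 slide example; one entry per line is assumed.
SAMPLE_OUTPUT = """\
PPU and XTCPU functions:
Defragmentation of encrypted packets: PPU-A
Defragmentation of clear-text packets: PPU-C
Syn-proxy function: PPU-B
Tcp-3way-check function: PPU-B sdram
HA and IDP packet forwarding: PPU-D
IDP processing: PPU-E
Syn-cookie function: PPU-F
IPV6 flow processing: PPU-A
IPV6 tunnel processing: PPU-C and PPU-D
IPV6 parser: PPU-E
"""


def parse_ppu_functions(text: str) -> dict[str, list[str]]:
    """Map each function name to the list of PPUs that process it."""
    mapping = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        ppus = re.findall(r"PPU-[A-F]", rest)
        if sep and ppus:  # skips the header line, which names no PPU
            mapping[name.strip()] = ppus
    return mapping


def functions_by_ppu(mapping: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert the mapping: which functions does each PPU handle?"""
    by_ppu = {}
    for function, ppus in mapping.items():
        for ppu in ppus:
            by_ppu.setdefault(ppu, []).append(function)
    return by_ppu


mapping = parse_ppu_functions(SAMPLE_OUTPUT)
by_ppu = functions_by_ppu(mapping)
for ppu in sorted(by_ppu):
    print(ppu, "->", ", ".join(by_ppu[ppu]))
```

Running the same parser on output from two ScreenOS versions and comparing the dictionaries would show which functions moved between PPUs.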


Slide 46

Section Summary
In this section, we:
Differentiated between functions performed in the CPU versus those done in the ASIC chip and PPU
Used the get asic ppu functions command to see which functions are processed by each PPU


In this section, we:

Differentiated between functions performed in the CPU versus those done in the ASIC chip and PPU, and
Used the get asic ppu functions command to see which functions are processed by each PPU


Slide 47

Learning Activity 4: Question 1


How does the ASIC chip increase the performance and throughput of the system?
A) Enables traffic forwarding without using the CPU
B) Uses First Path
C) Gets packets through the firewall
D) Eliminates the need for FPGA


Slide 48

Learning Activity 4: Question 2


The PPU gives additional processing capacity to the ASIC by performing which of the following?
A) Re-assembly
B) Isolation
C) Management
D) Defragmentation


Slide 49

Netscreen 5000 Series Security Systems and ISG Series Troubleshooting

Debug


Slide 50

Section Objectives
After successfully completing this section, you will be able to:
Review general commands used in ScreenOS
List the most important commands specific to high end systems
Explain how to collect the data and interpret the output
Run debug tag info when looking for problems related to CPU


After successfully completing this section, you will be able to:

Review general commands used in ScreenOS
List the most important commands specific to high end systems
Explain how to collect the data and interpret the output, and
Run debug tag info when looking for problems related to the CPU


Slide 51

Debug (1 of 49)
What are the troubleshooting commands?
Same get/debug commands from ScreenOS
Additional commands to troubleshoot different components in the system
  Different commands depending on platform/card type
  Different outputs depending on card type/ScreenOS version
  In ScreenOS 6.2 and 6.3 the commands are visible and documented


Let's now discuss debugging and the commands that are used to troubleshoot the platform.

The first thing to note is that we have the same get and debug commands as in ScreenOS generally. That's going to help us here. But we are also going to see additional commands specifically for this platform. In ScreenOS 6.2 and 6.3, the latest versions, these commands are visible in the command line interface. In earlier versions they are hidden, but you can execute them as normal.


Slide 52

Debug (2 of 49)
Common commands in ScreenOS
General information:
  get tech
  get log system
  get log system saved
  get event

Performance:
  get performance cpu all detail
  get performance session detail

Session Information:
  get session info
  get session frag
  get session


The first set consists of general commands that we use in ScreenOS. To check general information, we use get tech, get log system, get log system saved and get event. For performance, we use get performance cpu all detail and get performance session detail. For session information, we use get session info, and for information about fragmentation counters and processing we use get session frag. The get session command shows the complete session table, which you can use to investigate the data. You can also run the session analyzer on the get session output.


Slide 53

Debug (3 of 49)
Common commands in ScreenOS
Interface and Screening statistics:
  get counter stat
  get pps * (if ScreenOS 6.1 and later)
  get zone <zone> screen counter

Memory and internal resources:
  get net-pak s
  get gate
  get pport
  get tcp
  get flow

* Packet per second counts have to be enabled with set pps command


There are also some other things to check: interface and screening counters. First check with get counter stat. You can use packets per second (PPS) counters as well, with get pps, if you enable them with the set pps command. You can check screen counters with get zone <zone> screen counter; if you are looking for possible attacks, such as floods, this is the command to check.

For memory and internal resources, use the command get net-pak s. For statistics, use get gate, get pport, get tcp and get flow. These provide general information about how the system is allocating resources.


Slide 54

Debug (4 of 49)
Additional Commands for High End Systems
get session hardware
  Displays the hardware sessions installed in the ASIC chip

get sat <asicnumber> counters
  Displays information about the read-write pointers and the full counters of each queue in an ASIC.

get sat <asicnumber> demux-counter
  Shows the packets sent by ASIC to the CPU and packets dropped by Screening

get sat <asicnumber> frq1
  Displays the status of free buffer queue. Use the command to check for presence of leak in the buffer queue.

get sat <asicnumber> x-context
  Displays records of various memory tables, table addresses, and reset counters in an ASIC.


Now we come to what's really special about this platform. These are the most important commands we are going to cover here, and they are the ones most commonly used in troubleshooting.

The command get session hardware shows the session table on the ASIC chip itself. Sometimes there may be a problem where, for example, the session table in the CPU is not the same as the one in the ASIC chip; we can get this output to compare them. With the command get sat counters you see the read-write pointers that are used for the queues. There are different queues in the ASIC, and it is very important to see their state: whether they are full or free. If packets are being dropped, look for the queue-full counters.

Then there's get sat demux. This is important because it enables you to see the packets going to the CPU and the packets dropped by the screening function. Then there's get sat frq1, which shows the free buffer queue. This basically shows how the packet buffers are being used.

With get sat x-context you see the output of some memory tables, and also some reset counters that are important.

We'll show you an example of everything later on.


Slide 55

Debug (5 of 49)
Additional Commands for High End Systems
get arp asic <asicnumber>
  Displays the ARP entries in an ASIC

If 6.0r2 or later: get asic demux-counters
  Equivalent to get sat <asicnumber> demux-counters but for the whole system instead of one ASIC chip

get asic ppu defrag
  Displays defragmentation statistics for cleartext and encrypted traffic for all ASIC chips

get asic ppu syn-cookie
  Displays statistics for syn-cookie Screening feature (SYN flood)


This second set of commands is also specific to high end systems. With get sat session we see how sessions are allocated in the hardware, in the chip. With get arp asic, we see the ARP entries in the ASIC chip. You can also use get asic demux. It's the same as get sat demux, but it gives information for the whole system.

If you have a NetScreen 5000 with three cards, you have six ASIC chips. When you use get asic demux, you see the counters for all of them in aggregate.

Then we have the get asic ppu commands to check how the PPU is performing. Use get asic ppu defrag for defragmentation and get asic ppu syn-cookie for the SYN cookie feature.


Slide 56

Debug (6 of 49)
Additional Commands for High End Systems
get asic ppu syn-proxy
  Displays statistics for syn-proxy Screening feature (SYN flood)

get asic ppu tcp-3way-check
  Displays statistics for TCP SYN check feature

get asic ppu ipv6
  Displays statistics for IPv6 traffic acceleration in PPU

get asic ppu ha-idp-fwd (ISG only)
  Displays statistics for HA and IDP packet forwarding

get asic ppu idp (ISG only)
  Displays statistics for packets forwarded/received by IDP

debug tag info
  Displays additional information about packets going to CPU

* For ScreenOS 5.4 use get asic eng ppu <option>


The get asic ppu syn-proxy command displays statistics for the SYN-proxy screening feature (SYN flood); get asic ppu tcp-3way-check displays statistics for the TCP SYN check feature.

Use get asic ppu ipv6 for IPv6 traffic acceleration in the PPU. The command get asic ppu ha-idp-fwd displays HA and IDP packet forwarding statistics on the ISG, where the PPU can do the HA forwarding and also send packets to the security module.

If you run get asic ppu idp, you also get counters for the packets sent to or received from the IDP security module.

Then there's a debug command, debug tag info. This is very useful when you need to see what's going to the CPU. You can run this command to see the tags of the packets that go to the CPU for processing.


Slide 57

Debug (7 of 49)
Specific Commands per Platform
NS5000-2G24FE
get michigan
Displays specific information for front end processor in 2G24FE card

NS5000-8G2/2XGE/8G2-G4/2XGE-G4
get arch
Displays counters for front end processor in the SPMs using GigaScreen3 and 4

ISG
get fresno
Displays counters for front end processor in the ASIC module


Let's go ahead and look at the specific commands for each platform as well. If you have the 24 FE card (2G24FE), you use get michigan. If you have an 8-Gig or 10-Gig card, you use get arch, and if you have an ISG, you use get fresno. The commands differ because each platform has a different FPGA chip, so you use a different command for each of the different FPGAs.


Slide 58

Debug (8 of 49)
Commands to Collect
NS5000 with 2G24FE SPM
get sat <asicnumber> d
get sat <asicnumber> x-c
get sat <asicnumber> fr
get sat <asicnumber> c
get sat <asicnumber> s
get arp asic <asicnumber>
get michigan <slotnumber> count
get michigan <slotnumber> igmac


This is a simple example of the commands to collect. Here are the commands that you'd use for the NetScreen 5000 with the 24 FE card.


Slide 59

Debug (9 of 49)
Commands to Collect
NS5000 with 8G2/2XGE/8G2-G4/2XGE-G4 SPM
get asic demux (if 6.0r2 or later)
get sat <asicnumber> d
get sat <asicnumber> x-c
get sat <asicnumber> fr
get sat <asicnumber> c
get sat <asicnumber> s
get arp asic <asicnumber>
get arch <slotnumber>


Here you see example commands for the 8-Gig or 10-Gig card. The get sat and get asic commands are always common, but now we use get arch instead of get michigan.


Slide 60

Debug (10 of 49)


Commands to Collect
ISG2000
get asic demux (if 6.0r2 or later)
get sat <asicnumber> d
get sat <asicnumber> x-c
get sat <asicnumber> fr
get sat <asicnumber> c
get sat <asicnumber> s
get arp asic <asicnumber>
get fresno 0
get fresno 1*

* Only for ISG2000 (two FPGAs)

On the ISG we use get fresno. On the ISG1000 there is only get fresno 0, since there is only one FPGA.


Slide 61

Debug (11 of 49)


How to Collect
Most counters are absolute -> multiple outputs needed
Recommendation:
  Run block of commands 5 times with 30 second interval

How:
  Copy/paste commands in console session
  Script in ScreenOS (if 6.0 or later)
  Script in external tool


Now the question is: how do we collect this output? You know the commands, but you also need to know how to actually collect them. The tip here is that most counters are absolute: each output shows a running total rather than the change since the previous run. The idea is to run the block of commands five times at 30-second intervals, so that later you can check the delta between each output and see whether each counter is incrementing or not.

You may see a counter with a very high number that is no longer incrementing. That's why we run the commands a few times, usually five. How do you do that? You can copy and paste the commands into the session (console, Telnet or SSH), you can create a script in ScreenOS itself, or you can use an external tool to connect to the firewall and execute the commands.
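Once the five outputs have been collected, the delta check described above can be sketched in Python as follows. This is a minimal illustration that assumes the counters from each run have already been extracted into a dictionary; the counter names and values below are hypothetical, and no connection to a firewall is made.

```python
def counter_deltas(snapshots: list[dict[str, int]]) -> list[dict[str, int]]:
    """Return the change of every counter between consecutive snapshots."""
    deltas = []
    for before, after in zip(snapshots, snapshots[1:]):
        deltas.append({name: after[name] - before[name]
                       for name in after if name in before})
    return deltas


def incrementing_counters(snapshots: list[dict[str, int]]) -> list[str]:
    """Names of counters that grew in at least one interval."""
    names = set()
    for delta in counter_deltas(snapshots):
        names.update(name for name, change in delta.items() if change > 0)
    return sorted(names)


# Five hypothetical snapshots taken 30 seconds apart.
snapshots = [
    {"defrag_fail": 1301, "null_session_error": 643, "queue_full": 98000},
    {"defrag_fail": 1301, "null_session_error": 650, "queue_full": 98000},
    {"defrag_fail": 1301, "null_session_error": 661, "queue_full": 98000},
    {"defrag_fail": 1301, "null_session_error": 661, "queue_full": 98000},
    {"defrag_fail": 1301, "null_session_error": 670, "queue_full": 98000},
]

for interval, delta in enumerate(counter_deltas(snapshots), start=1):
    print(f"interval {interval}: {delta}")
print("still incrementing:", incrementing_counters(snapshots))
```

In this made-up data the large `queue_full` value is historical and no longer moving, while the much smaller `null_session_error` counter is the one that is still active.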


Slide 62

Debug (12 of 49)


How to collect? (NS5000 only)
How to obtain <slot number>?
  get chassis shows the physical slot numbers <Slot>
  <slotnumber> = <Slot> - 2
  E.g. get arch 2 is for SPM installed in physical Slot 4

How to obtain ASIC number?
  Always 0 for ISG
  For NS5000 use get asic mapping
  E.g. NS5400 with 8G2 in Slot 2 and 2XGE in Slot 4

ns5400-> get asic mapping
0   (ethernet2/1 to ethernet2/4)
1   (ethernet2/5 to ethernet2/8)
2   n/a
3   n/a
4   (ethernet4/1)
5   (ethernet4/2)


There is one thing to note about the NetScreen 5000: how do you know the exact numbers to put in the command? In this case, get chassis shows that the physical slot number is 4, so the command is get arch 2, because we need to subtract two from the physical slot number to get the slot number used in the command. For the ASIC number, we always use 0 on the ISG because there is only one ASIC, but on the 5000 Series we have to use get asic mapping, which makes it easy to see which ASIC you need to check. Let's say you have a problem with ethernet4/1; then you check ASIC 4.
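The two lookups above (slot number and ASIC number) can be sketched in Python. The mapping table is transcribed from the slide's get asic mapping example and applies only to that example chassis; on a real system it would have to come from the device's own output. All function names here are hypothetical.

```python
import re

# Transcribed from the slide example (NS5400 with 8G2 in Slot 2 and
# 2XGE in Slot 4). Each entry is (first interface, last interface);
# None stands for "n/a".
ASIC_MAPPING = {
    0: ("ethernet2/1", "ethernet2/4"),
    1: ("ethernet2/5", "ethernet2/8"),
    2: None,
    3: None,
    4: ("ethernet4/1", "ethernet4/1"),
    5: ("ethernet4/2", "ethernet4/2"),
}


def parse_interface(name: str) -> tuple[int, int]:
    """Split 'ethernetS/P' into (physical slot, port)."""
    match = re.fullmatch(r"ethernet(\d+)/(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized interface name: {name}")
    return int(match.group(1)), int(match.group(2))


def slot_number(physical_slot: int) -> int:
    """<slotnumber> = <Slot> - 2, as stated on the slide."""
    return physical_slot - 2


def asic_for_interface(name: str, mapping=ASIC_MAPPING) -> int:
    """Find the ASIC whose interface range contains the given interface."""
    slot, port = parse_interface(name)
    for asic, interfaces in mapping.items():
        if interfaces is None:
            continue
        first_slot, first_port = parse_interface(interfaces[0])
        _, last_port = parse_interface(interfaces[1])
        if slot == first_slot and first_port <= port <= last_port:
            return asic
    raise LookupError(f"no ASIC found for {name}")


def path_type(ingress: str, egress: str) -> str:
    """Single-ASIC if both interfaces map to the same ASIC chip."""
    same = asic_for_interface(ingress) == asic_for_interface(egress)
    return "single-ASIC" if same else "cross-ASIC"


slot, _ = parse_interface("ethernet4/1")
print("get arch", slot_number(slot))
print("get sat", asic_for_interface("ethernet4/1"), "d")
print(path_type("ethernet2/1", "ethernet4/1"))
```

For ethernet4/1 this yields slot number 2 and ASIC 4, matching the narration.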


Slide 63

Debug (13 of 49)


Example: NS5400-8G2-G4/2XGE-G4 with ScreenOS 6.2 (1)
ASIC numbers: 0, 1, 4 and 5
Slot numbers: 0 and 2
List of commands:
get asic demux
get asic ppu defrag
get asic ppu tcp-3way-check
get asic ppu syn-cookie
get asic ppu syn-proxy
get sat 0 d
get sat 0 x-c
get sat 0 fr
get sat 0 c
get sat 0 s
get arp asic 0
get sat 1 d
get sat 1 x-c
get sat 1 fr
get sat 1 c


To summarize, here is an example. This is a NetScreen 5400 with an 8-Gig card and a 10-Gig card, and the ASIC numbers are 0, 1, 4 and 5. This means there is one card in slot number 0 and one card in slot number 2. Here are the commands to run to get the data for the whole system. The get asic ppu and get asic demux commands are common, so you run them only once. The get sat and get arp asic commands have to be run for each ASIC.


Slide 64

Debug (14 of 49)


Example: NS5400-8G2-G4/2XGE-G4 with ScreenOS 6.2 (2)
ASIC numbers: 0, 1, 4 and 5
Slot numbers: 0 and 2
List of commands, cont'd:
get sat 1 s
get arp asic 1
get arch 0
get sat 4 d
get sat 4 x-c
get sat 4 fr
get sat 4 c
get sat 4 s
get arp asic 4
get sat 5 d
get sat 5 x-c
get sat 5 fr
get sat 5 c
get sat 5 s
get arp asic 5
get arch 2

KB13216 - How to troubleshoot ASIC issues on Juniper Firewalls: NS5000 and ISG Series

The get arch command is run for each card, so get arch 0 and get arch 2. Refer to Knowledge Base article KB13216 for a more detailed explanation, as well as other examples.
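The command block for this example can also be generated rather than typed by hand. The Python sketch below assumes the per-ASIC pattern used in the example (d, x-c, fr, c, s, then get arp asic) and the get arch variant for 8G2/2XGE cards; the function name `build_command_block` is hypothetical, and the commands are grouped per system, per ASIC and per slot, which orders them slightly differently from the slide.

```python
# Commands run once for the whole system (ScreenOS 6.0r2 or later).
PER_SYSTEM = [
    "get asic demux",
    "get asic ppu defrag",
    "get asic ppu tcp-3way-check",
    "get asic ppu syn-cookie",
    "get asic ppu syn-proxy",
]

# Commands repeated for every ASIC number.
PER_ASIC = [
    "get sat {n} d",
    "get sat {n} x-c",
    "get sat {n} fr",
    "get sat {n} c",
    "get sat {n} s",
    "get arp asic {n}",
]


def build_command_block(asics: list[int], slots: list[int]) -> list[str]:
    """Build one block of commands for an NS5000 with 8G2/2XGE cards."""
    commands = list(PER_SYSTEM)
    for n in asics:
        commands += [template.format(n=n) for template in PER_ASIC]
    commands += [f"get arch {slot}" for slot in slots]
    return commands


# The example chassis: ASICs 0, 1, 4 and 5; slot numbers 0 and 2.
block = build_command_block(asics=[0, 1, 4, 5], slots=[0, 2])
print("\n".join(block))
```

The resulting block is what would be pasted into the session five times at 30-second intervals.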


Slide 65

Debug (15 of 49)


How to interpret the outputs?
get asic demux (or get sat <asic> demux)
nsisg2000(M)-> get asic demux-counters
                    Current(3d;13:27:15)  Last(3d;13:27:15)  PPS( 21s)
to_host_packet:                   928686             928632          2
SYN/ACK:                           10577              10574          0
FIN:                               53221              53221          0
RST:                               26713              26708          0
OTHERS:                           838175             838129          2
first_packet:                 1366460414         1366346964       5268
brcst:                             53933              53930          0
no_ip_ether_net:                  310335             310312          1
ttl_zero:                            978                978          0
invalid_src_adr:                    1300               1300          0
udp_hdr_len_err:                     159                159          0
tcp_data_off_err:                   1562               1561          0
tiny_tcp_err:                         29                 29          0
lan_attk:                            211                211          0
ping_of_death:                        15                 15          0
tcp_chksum_err:                   203246             203228          0
udp_chksum_err:                    56053              56039          0
defragged_proc:                    12578              12574          0
total packet:                 1368029499         1367915932       5274
clsf counters:
fragment pak                       76212              76206          0
unknown protocol                     225                225          0
icmp                            43214361           43210876        161


Now that you have seen how to collect the data, you need to see, even more importantly, how to interpret the output. It's very important that you know what you are looking at. The get asic demux and get sat <asic> demux commands provide similar output. Here you see the packets going to the CPU. The right-most column shows the PPS count, the packets per second. This is the most important thing to check in this output. The slide is highlighted to show that there are 5000+ packets going to the CPU per second.

This is something we consider very important when we are looking at performance problems. For example, if we are seeing high CPU utilization in the system, we want to know why. We can run this command to see how many packets per second are going to the CPU. Then you can judge whether that is expected or whether it is overloading the system, and decide what to do next.

We also see here a breakdown of the packets that go to the CPU. They can be packets to the host or first packets of new sessions. In this case, most of the packets going to the CPU are first packets (the first_packet counter): packets that don't match any session in the ASIC chip and were sent to the CPU for further processing.

Here we also see the counters of the packets that were dropped for some reason, such as ttl_zero, invalid source address, TCP checksum error or UDP checksum error. These were all packets that were dropped.
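As a rough illustration of this reading of the output, the Python sketch below ranks counters by their PPS column and lists the counters that changed between the Last and Current columns. The rows are a subset transcribed from the slide example; the helper names `busiest` and `changed_since_last` are hypothetical, and turning the device output into these tuples is assumed to have been done already.

```python
# (label, current, last, pps) rows transcribed from the slide example.
DEMUX_ROWS = [
    ("to_host_packet", 928686, 928632, 2),
    ("first_packet", 1366460414, 1366346964, 5268),
    ("brcst", 53933, 53930, 0),
    ("no_ip_ether_net", 310335, 310312, 1),
    ("ttl_zero", 978, 978, 0),
    ("tcp_chksum_err", 203246, 203228, 0),
    ("udp_chksum_err", 56053, 56039, 0),
    ("defragged_proc", 12578, 12574, 0),
]


def busiest(rows, top=3):
    """Rows with the highest packets-per-second value, highest first."""
    return sorted(rows, key=lambda row: row[3], reverse=True)[:top]


def changed_since_last(rows):
    """Counters whose Current value differs from the Last value."""
    return {label: current - last
            for label, current, last, _pps in rows
            if current != last}


for label, _current, _last, pps in busiest(DEMUX_ROWS):
    print(f"{label}: {pps} pps")
print(changed_since_last(DEMUX_ROWS))
```

With this data, first_packet dominates the PPS column, which matches the narration: most of the load on the CPU comes from packets that do not match a hardware session.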


Slide 66

Debug (16 of 49)


get asic demux (or get sat <asic> demux)
Shows packets going to CPU and dropped by Screening
Check PPS counters on rightmost column
What is important?
  Find out how many packets per second are going to CPU

Why is it important?
  Troubleshooting of high CPU issues

What to do next?
  Determine if the pps observed is expected or solve problem in the network to reduce the load
  Investigate the type of packet that is going to CPU with high pps


We look at the PPS counters to understand what's going to the CPU. This is important for seeing whether there's an attack, or otherwise why the traffic is going to the CPU.


Slide 67

Debug (17 of 49)


How to interpret the outputs?
get asic ppu defrag
nsisg2000-> get asic ppu defrag
Show ASIC 1 PPU information:
Defragmentation of Encrypted Packets
Total input packets: 0, Total Fragments: 0
First frag: 0, None-first Frag: 0
Defrag pass: 0, ESP frag: 0
Unexpedted packet: 0, To RSMQ: 0
AH frag: 0
Defragmentation of Clear-Text Packets
Total input packets: 934463, First frag: 455095
Defrag pass: 905668, Defrag fail: 1301
Null Session Error: 643, Out-of node buffer: 0
PPU merge: 0


Then there is get asic ppu defrag. You use that to check statistics about defragmentation. What is important here is to check the Null Session Error and Defrag fail counters. Usually, when there is a problem with defragmentation, that's where the counters increment.
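To pull out just those two counters from a saved output, a small parser is enough. The Python sketch below assumes the "name: value, name: value" line layout of the slide example (the line breaks are an assumption, since the slide text was flattened); `parse_defrag` and `watch_counters` are hypothetical helper names.

```python
import re

# Text copied from the slide example; the line layout is assumed.
SAMPLE = """\
Show ASIC 1 PPU information:
Defragmentation of Encrypted Packets
Total input packets: 0, Total Fragments: 0
First frag: 0, None-first Frag: 0
Defrag pass: 0, ESP frag: 0
Unexpedted packet: 0, To RSMQ: 0
AH frag: 0
Defragmentation of Clear-Text Packets
Total input packets: 934463, First frag: 455095
Defrag pass: 905668, Defrag fail: 1301
Null Session Error: 643, Out-of node buffer: 0
PPU merge: 0
"""

# One "name: number" pair; several pairs may share a line, separated by commas.
PAIR = re.compile(r"([^:,]+):\s*(\d+)")


def parse_defrag(text: str) -> dict[str, dict[str, int]]:
    """Group the counters under their 'Defragmentation of ...' section."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Defragmentation of"):
            current = sections.setdefault(line, {})
            continue
        if current is None:
            continue  # lines before the first section, such as the ASIC header
        for name, value in PAIR.findall(line):
            current[name.strip()] = int(value)
    return sections


def watch_counters(sections, names=("Defrag fail", "Null Session Error")):
    """Pick out the counters the narration says to check."""
    return {(section, name): counters[name]
            for section, counters in sections.items()
            for name in names if name in counters}


sections = parse_defrag(SAMPLE)
for key, value in watch_counters(sections).items():
    print(key, value)
```

Because the counters are absolute, a single parse only gives totals; comparing the values across several runs shows whether Defrag fail or Null Session Error is still increasing.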


Slide 68

Debug (18 of 49)


get asic ppu defrag
Shows defragmentation in the PPU
Check Defrag Fail and Null Session Error
What is important?
  Find out if there are dropped or failed fragments increasing

Why is it important?
  Fragmented traffic may be getting dropped
  Detect fragmentation in the network


What else can you do in this case? You can check whether you really expect this fragmentation. Do you want this fragmented traffic in the network?


Slide 69

Debug (19 of 49)


get asic ppu defrag
What to do next?
Determine if fragmentation is expected
Use also get session frag to check fragment counts
Check the other ASIC commands
Capture packets to see which device in the network is dropping the fragments
Enable no-hw-session in the policy and check if the problem stops

Next, you can check the get session frag output to look at the fragmentation counts and see how many packets arrived as first fragments or as non-first fragments; fragments that couldn't be reassembled can also be checked with this command.

You can also correlate the data with the other ASIC commands to help you pinpoint the issue, and you can do some packet captures. You want to see whether you really received all the fragments that were sent to the firewall. Maybe the firewall is not receiving all the fragments.

Then you can also tweak the policy configuration. Set no-hw-session to see if that solves the problem. When you do that, you bypass the PPU defragmentation processing, and you can possibly isolate the issue.

Slide 70

Debug (20 of 49)


How to interpret the outputs?
get asic ppu tcp-3way-check
Ns54000-> get asic ppu tcp
Show ASIC 1 PPU information:
  total input: 355742, total fwd: 355740
  total drop: 3, redirect to client: 0
  packet from server: 118611, msg send to server: 118555
  msg rcv stage 4: 0, msg rcv stage 5: 0
  Invalid session count: 0
Show ASIC 2 PPU information:
  total input: 118611, total fwd: 0
  total drop: 0, redirect to client: 118611
  packet from server: 0, msg send to server: 0
  msg rcv stage 4: 118548, msg rcv stage 5: 3
  Invalid session count: 0


Similarly, you can use get asic ppu tcp-3way-check. The most important counters here are total drop and Invalid session count. They help you understand how the ASIC is processing the 3-way handshake. In this example, ASIC 1 shows a total drop of 3, and ASIC 2 shows msg rcv stage 5 also at 3.
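
When a session spans more than one ASIC, it is easier to compare the chips side by side. The sketch below splits the pasted output into one dictionary per "Show ASIC n PPU information:" section and reports the chips where a watched counter is non-zero. It is an illustration only: it assumes one section header per line, and the function names are invented here.

```python
import re

SECTION_RE = re.compile(r"Show ASIC (\d+) PPU information:")
PAIR_RE = re.compile(r"([^,:]+):\s*(\d+)")

def parse_per_asic(text):
    """Return {asic_number: {counter_name: value}} from pasted output."""
    per_asic, current = {}, None
    for line in text.splitlines():
        header = SECTION_RE.search(line)
        if header:
            current = per_asic.setdefault(int(header.group(1)), {})
            continue
        if current is not None:
            for name, value in PAIR_RE.findall(line):
                current[name.strip()] = int(value)
    return per_asic

def nonzero(per_asic, watch=("total drop", "Invalid session count")):
    """Return {asic: {counter: value}} for watched counters that are above zero."""
    found = {}
    for asic, counters in per_asic.items():
        hits = {name: counters[name] for name in watch if counters.get(name, 0) > 0}
        if hits:
            found[asic] = hits
    return found

# Values copied from the slide.
SAMPLE = """Show ASIC 1 PPU information:
total input: 355742, total fwd: 355740
total drop: 3, redirect to client: 0
packet from server: 118611, msg send to server: 118555
msg rcv stage 4: 0, msg rcv stage 5: 0
Invalid session count: 0
Show ASIC 2 PPU information:
total input: 118611, total fwd: 0
total drop: 0, redirect to client: 118611
packet from server: 0, msg send to server: 0
msg rcv stage 4: 118548, msg rcv stage 5: 3
Invalid session count: 0"""

stats = parse_per_asic(SAMPLE)
print(nonzero(stats))                # drops are reported on ASIC 1 only
print(stats[2]["msg rcv stage 5"])   # the stage 5 count on ASIC 2
```

With the slide's values, the drop count on ASIC 1 and the stage 5 count on ASIC 2 are both 3, which is the pairing the narration points out.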


Slide 71

Debug (21 of 49)


get asic ppu tcp-3way-check
Shows TCP SYN check counters (set flow tcp-syn-check)
Check total drop and invalid session

What is important?
Find out if there are dropped packets

Why is it important?
TCP sessions are not being established due to TCP SYN check
TCP SYN check feature is faulty

This is an example of a problem in which the TCP 3-way check was not working properly when the session involved two ASIC chips. The packet was being dropped by one chip while the other was waiting for stage 5.

Slide 72

Debug (22 of 49)


get asic ppu tcp-3way-check
What to do next?
Determine the conditions in which the problem occurs:
  Is it any TCP traffic or specific src/dst/service?
  Is there asymmetric traffic in the network?
Check the other ASIC commands
Disable TCP SYN check feature to see if the problem stops
Get the session information of a connection test:
  get session id <index>

What else can you check with this output? You can try to understand the conditions: is it all TCP traffic, or is it a specific source, destination, or service? In the problem we looked at, there was traffic going through both ASIC chips, so it was a special case.

Also check if there is asymmetric traffic, that is, whether only one direction of the flow is going through the firewall. This could be something that's having an influence.

Also check the other ASIC commands. Look at the data not just from one output but as a whole. One action you can take is to disable TCP SYN check to see if that helps.

You can use get session id <index> to see the status of the session, whether it is progressing normally or not completing properly.

Slide 73

Debug (23 of 49)


How to interpret the outputs?
get asic ppu syn-cookie
nsISG2000-> get asic ppu syn-cookie
Show ASIC 1 PPU information:
Syn-Cookie process statistics:
  Total input packets: 261628, Non-TCP first packets: 0
  VLAN check fail: 0, TCP ACK: 0
  TCP SYN: 26471, ACK decryption: 0
  SYN encryption: 0, BGP bypass: 26471
  From VPN engine: 0, Invalid ACK: 0

The other command is get asic ppu syn-cookie. It's the same idea; the most important counters to check are VLAN check fail and Invalid ACK.

Slide 74

Debug (24 of 49)


get asic ppu syn-cookie
Shows counters for SYN cookie feature
Check Invalid ACK
  It doesn't mean packet drop. The ACK packet is not a cookie ACK but a first packet of the TCP connection

What is important?
Find out if there are packets dropped by SYN cookie feature

Why is it important?
Unable to pass TCP traffic
Network under attack

Here we can look at some attacks.


Slide 75

Debug (25 of 49)


get asic ppu syn-cookie
What to do next?
Determine if there is an attack
Determine if SYN flood thresholds are set correctly
Check other ASIC commands
Disable SYN cookie to see if the problem is solved

Do we have a SYN flood attack, or do we have the proper settings for SYN flood protection? We can also disable the feature for troubleshooting to see if that avoids the problem. Typically you see a packet drop, and then you disable SYN cookie and check again.

Slide 76

Debug (26 of 49)


How to interpret the outputs?
get asic ppu syn-proxy
nsisg2000-> get asic ppu syn-proxy
Show ASIC 1 PPU information:
Syn-proxy process statistics:
  Total input packts: 615701, Xport-ESP input: 0
  Xmit to client: 0
  Xmit to server: 0
  Xmit SYN/ACK: 0, Xmit RST: 0
  Rcv SYN: 0, Rcv RST: 0
  Rcv FIN: 0, From VPN engine: 0
  VPN process drop: 0, Unexpected pack drop: 0

For SYN proxy, the counter to check is usually Unexpected pack drop, which tells you if there is a problem.

Slide 77

Debug (27 of 49)


get asic ppu syn-proxy
Shows SYN Proxy counters
Check VPN process drop and Unexpected pack drop

What is important?
Find out if there are dropped packets due to SYN Proxy

Why is it important?
Packets are being dropped due to SYN Proxy
SYN Proxy feature is being triggered

What to do next?
Determine if SYN flood thresholds are expected
Check syn cookie counters if enabled
Determine whether there is any SYN flood attack
Disable SYN Proxy to see if the problem is solved
Check other ASIC commands

We can look further at the SYN flood attacks. Look at the configuration, see if the threshold is as expected; have a look at the traffic to see if the load is expected or if it may be some kind of attack.


Slide 78

Debug (28 of 49)


How to interpret the outputs?
get sat <asic> counters
nsisg2000(M)-> get sat 0 c
Q   name      wrptr  rdptr  full  emp  size  q_full_cnt
0   frq1      001d   0039   0     0    0064  0
1   psra1     000b   000b   0     1    0000  0
2   psra2     006b   006b   0     1    0000  0
3   psra3     0000   0000   0     1    0000  0
4   psra4     0000   0000   0     1    0000  0
5   psrb      0000   0000   0     1    0000  0
6   cpu fifo  0002   0002   0     1    0000  0
6   cpu1      06f9   06f9   0     1    0000  7
7   slu       0007   0000   1     0    0007  959
8   spi       0001   0001   0     1    0000  0
9   rsm fifo  0019   0019   0     1    0000  0
9   rsm2      0000   0000   0     1    0000  0
10  xmt1      000f   000f   0     1    0000  0
11  xmt2      0004   0004   0     1    0000  33
12  xmt3      0000   0000   0     1    0000  0
13  xmt4      0000   0000   0     1    0000  0
14  cpu3      0000   0000   0     1    0000  0
15  cpu4      0000   0000   0     1    0000  0


Now let's go to get sat <asic> counters. This is also a very important command, because here you look at the status of the queues. Each line is one queue in the ASIC chip, and the queues send packets to each other. In the example, the session lookup (slu) queue is the one highlighted, with a high queue full count. You need to look at the queue full count (q_full_cnt) to see if it is incrementing. Queue full means the queue has reached capacity and cannot accept any more packets, so packets may have been dropped because the queue was full.

Also, it's important to check the full column because, if this is 1, the queue is full and it may block all the traffic. If the queue is full all the time, it will block the traffic all the time. We'll see that in an example further on.
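
The two checks described here (full equal to 1 now, and a non-zero q_full_cnt) can be scripted once the output is saved as text. The sketch below is only an illustration. It assumes one row per queue with the columns in the order Q, name, wrptr, rdptr, full, emp, size, q_full_cnt, and the function names are made up for this example. The sample holds a subset of the rows from the slide.

```python
def parse_sat_counters(text):
    """Return one dict per queue row from pasted 'get sat <asic> counters' text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data row starts with the queue number and ends with six value columns.
        if len(fields) < 8 or not fields[0].isdigit():
            continue  # header, prompt, or blank line
        wrptr, rdptr, full, emp, size, q_full_cnt = fields[-6:]
        rows.append({
            "name": " ".join(fields[1:-6]),  # "slu", or two words such as "cpu fifo"
            "full": int(full),
            "q_full_cnt": int(q_full_cnt),
        })
    return rows

def full_now(rows):
    """Queues whose full flag is currently 1."""
    return [row["name"] for row in rows if row["full"] == 1]

def ever_full(rows):
    """Queues with a non-zero q_full_cnt, mapped to that count."""
    return {row["name"]: row["q_full_cnt"] for row in rows if row["q_full_cnt"] > 0}

# Subset of the rows shown on the slide.
SAMPLE = """Q name wrptr rdptr full emp size q_full_cnt
0 frq1 001d 0039 0 0 0064 0
6 cpu fifo 0002 0002 0 1 0000 0
6 cpu1 06f9 06f9 0 1 0000 7
7 slu 0007 0000 1 0 0007 959
11 xmt2 0004 0004 0 1 0000 33"""

rows = parse_sat_counters(SAMPLE)
print(full_now(rows))
print(ever_full(rows))
```

Running it on two captures taken some seconds apart and comparing the ever_full results shows whether q_full_cnt is still incrementing.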


Slide 79

Debug (29 of 49)


get sat <asic> counters
Shows status of each queue in ASIC chip
Each queue has a different function:
  psr: parser
  xmt: transmit
  cpu: queue from CPU
  host: queue to CPU
  slu: session lookup engine
  ppb: PPU-B queue
  frq2: free buffer queue

Check full and q_full_cnt columns

What is important?
If full = 1, the queue is full and can't forward packets
If q_full_cnt increments, the queue was full and reset


As was mentioned, each line is for a different queue. They exist inside the chip, so we have parser queue, transmit queue, CPU queue, host queue, session lookup engine queue, PPU queue, and free buffer queue.


Slide 80

Debug (30 of 49)


get sat <asic> counters
Why is it important?
System is not forwarding traffic
NSRP cluster went into split-brain scenario
Traffic load is reaching system maximum capacity
Traffic to IDP is being dropped

What to do next?
Determine which traffic/services are being affected
Disable the feature corresponding to the queue to see if the problem stops
Check other ASIC commands
Check PPS to determine if traffic load is too high
Check if full goes back to 0; if not, a system reset is required
Check get log sys for ASIC reinit messages
If queue full is always 1 and it doesn't go back to zero, it may require a reset to recover the system from the failure.

Slide 81

Debug (31 of 49)


How to interpret the outputs?
get sat <asic> x-context
nsisg2000-> get sat 0 x-c
saturn context: 0x03b3fdd8(80000000)
sess pool, hdr:0x8c5c8600, tail:0x925ee700
session: in use:214882, alloc:830920960, free:830706078, total:1048575
sess shadow base: 0x63543980, size: 56
soft session base: 0x07fd6650, size: 288
ageout_fifo: 0x25974560
ager: rd:0x2940ef, wr:0x2940ef
ager wrap count: rd:198, wr:198, catchup:0
ageout counters: rd:833175791, wr:145036538, not valid:1
  skip:0, never:0, twin active:0, dma miss:0
  unlink err:0
  dma miss retry fail:0, dma miss retry succ:0
  cleanup:0, proc:830711627, by twin:0, batch:131072
rsm rcv: 0, 2vpn: 0
rsm onhold: 0, freed: 0
rsm hash: 0x6326d3e0, pool: 0x0378b994/0x0378b9a4
ras hold: 0, total packet after ras is 0
hostq base: 0x04c00000, 0x6e000000
hq2 rcv: 0x04d80000, xmt: 0x04d82000
saturn free buffer reinit count: 1
saturn engine reset count: 1
st_dbg_asic_reinit: 0x04a2f8a8, val 0
packet up/down between CPU and ASIC: 1
tcp-syn-bit-check drop count: 1128272, tcp-syn-bit-check fragment drop count: 0


This is a very important command as well: get sat <asic> x-context. Here you look for the free buffer reinit and engine reset counts. These two counters help you understand if there was any reset in the ASIC chip for any reason. If the ASIC had to reset, you will see it here in these counters. If you are seeing packet drops in the network, you can look at these counters to see if there was a reinit, which means packets were dropped.

Also, check packet up/down between CPU and ASIC to see if, for any reason, there was a loop between the CPU and the ASIC. One example is a session that exists in the CPU but doesn't exist in the ASIC. The ASIC receives a packet from the CPU, doesn't know where to send it, and sends it back to the CPU. The packet then stays in a loop, and these are the counters you can check. This is good to check in the case of high CPU, since you might have a packet looping inside the system.
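
Only three lines of this long output are named above, so a small filter can pull them out and compare two captures. This is a hedged sketch, not part of ScreenOS: the counter labels are copied from the sample output, while the function names and the "later" values are hypothetical.

```python
import re

# Labels exactly as they appear in the sample output on the slide.
WATCHED = (
    "saturn free buffer reinit count",
    "saturn engine reset count",
    "packet up/down between CPU and ASIC",
)

def read_watched(text):
    """Return {label: value} for the watched x-context counters found in text."""
    values = {}
    for label in WATCHED:
        match = re.search(re.escape(label) + r":\s*(\d+)", text)
        if match:
            values[label] = int(match.group(1))
    return values

def changed(before, after):
    """Return {label: (old, new)} for counters that differ between two captures."""
    return {
        label: (before[label], after[label])
        for label in after
        if label in before and after[label] != before[label]
    }

# Lines copied from the slide.
EARLIER = """saturn free buffer reinit count: 1
saturn engine reset count: 1
packet up/down between CPU and ASIC: 1"""
# Hypothetical later capture, invented for the illustration.
LATER = EARLIER.replace("reinit count: 1", "reinit count: 2")

print(changed(read_watched(EARLIER), read_watched(LATER)))
```

A change in either of the first two counters between captures points to an ASIC reinit or reset in that interval.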


Slide 82

Debug (32 of 49)


get sat <asic> x-context
Shows memory tables, addresses and ASIC status
Look for reinit or reset

What is important?
Find out if there are ASIC reinits

Why is it important?
ASIC reinits drop traffic
System may be overloaded
To understand if there is ASIC failure

Basically, that's what we check in this output.

Slide 83

Debug (33 of 49)


get sat <asic> x-context
What to do next?
Check get sat <asic> c for full queues or queue full increments
Disable the feature using the PPU affected
Check get log sys for ASIC reinit messages
Check other ASIC outputs

Output changes in 6.1 and later
  Defrag info
  Buffers
  Port information (Jupiter chip has 32 ports)
  Interface mac table

In the 6.1 release and later, this output also shows defragmentation information and some additional buffers. You usually don't need to check these, only when you get a special request from our engineering team.

Slide 84

Debug (34 of 49)


How to interpret the outputs?
get sat <asic> frq
ns5400-> get sat 4 frq
JPT 4 FRQ buffers...
FRQ1 (4/97)
buffer 29 duplicated 1 times!
buffer 62 missing(0x000a3000)!
buffer 104 duplicated 1 times!
buffer 117 missing(0x000be800)!
Buf allocated:
cpu : 00000000  cpu1: 00000000  cpu2: 00000000  rsm : 800a3902
ppa : 80088902  ppb : 800bb902  ppc : 80088102  ppd : 00000000
ppe : 00000000  ppf : 00000000  pdma: 8008f102
fb0 : 80092902  fb1 : 80095102
CH00: 0009e902  CH01: 000ad102  CH02: 00000000  CH03: 00000000
CH10: 00095102  CH11: 00096902  CH12: 00000000  CH13: 00000000
FRQ2 buf allocated:
cpu : 802d4100  cpu1: 80202100  cpu2: 80202900  rsm : 807fb100
ppa : 80200900  ppb : 80205100  ppc : 80203100  ppd : 80203900
ppe : 80204100  ppf : 80204900
wr=0x0000f29a, rd=0x0000e6a5, 0xbf5 bufs in frq2.
FRQ2 buf HEALTHY, 11 bufs held expected:
No.1 buf 0x00200902  No.2 buf 0x00201102
No.4 buf 0x00202102  No.5 buf 0x00202902
No.6 buf 0x00203102  No.7 buf 0x00203902
Let's now check another very important command, get sat <asic> frq. This shows the state of the free buffers that are used to store the packets. When you look here you see buffer missing messages, but please note that they do not always indicate an issue. They appear here, but the ASIC itself can deal with them and avoid any problem. Also, you can see here that the state is HEALTHY, so you don't really need to worry about it.

Slide 85

Debug (35 of 49)


get sat <asic> frq
Shows status of free buffer queue
Look for missing buffers; leak and Err:
  These do not necessarily indicate a problem, as the ASIC can recover from them
Run get sat 0 frq | in bufs a few times and check if the read/write pointers are always the same -> LEAK

ns5400(M)-> get sat 0 frq | in bufs
wr=0x0000c276, rd=0x0000b681 , 0xbf5 bufs in frq2.
FRQ2 buf HEALTHY, 11 bufs held expected:


The condition you do need to worry about is when you run get sat 0 frq | include bufs several times and see that the read and write pointers are always the same.
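
Repeating the command by hand and comparing hexadecimal pointers is error-prone, so a short script can do the comparison on saved lines. The sketch below makes one interpretive assumption that should be checked against your own case: it reads "always the same" as "the wr and rd values do not change from one sample to the next". The function names and the later sample are invented for the illustration. In both sample outputs shown in this guide, the bufs figure equals wr minus rd.

```python
import re

# Matches the pointer line, tolerating the stray space before the second comma.
FRQ_RE = re.compile(
    r"wr=0x([0-9a-fA-F]+),\s*rd=0x([0-9a-fA-F]+)\s*,\s*0x([0-9a-fA-F]+) bufs in frq2"
)

def parse_frq(line):
    """Return (wr, rd, bufs) as integers, or None if the line does not match."""
    match = FRQ_RE.search(line)
    if match is None:
        return None
    wr, rd, bufs = (int(group, 16) for group in match.groups())
    return wr, rd, bufs

def pointers_stuck(samples):
    """True if several samples were taken and all show the same (wr, rd) pair."""
    parsed = [parse_frq(sample) for sample in samples]
    if len(parsed) < 2 or None in parsed:
        return False
    return len({(wr, rd) for wr, rd, _ in parsed}) == 1

# Line copied from the slide.
SAMPLE = "wr=0x0000c276, rd=0x0000b681 , 0xbf5 bufs in frq2."
# Hypothetical later sample in which both pointers have advanced.
MOVED = "wr=0x0000c300, rd=0x0000b70b , 0xbf5 bufs in frq2."

wr, rd, bufs = parse_frq(SAMPLE)
print(wr - rd == bufs)                            # the bufs figure equals wr minus rd
print(pointers_stuck([SAMPLE, MOVED]))            # pointers moved between samples
print(pointers_stuck([SAMPLE, SAMPLE, SAMPLE]))   # identical pointers in every sample
```

A stuck result is only a hint to look further; the status word (HEALTHY or LEAK) and the queue counters should be checked as well.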


Slide 86

Debug (36 of 49)


get sat <asic> frq
What is important?
Find out if there is a buffer leak
  Missing buffers keep incrementing
  Status shows LEAK

Why is it important?
Buffer leak can eventually cause ASIC reinit
Performance is affected
System may be overloaded
To understand if there is ASIC failure


When the read and write pointers are always the same it means you might have a leak. It means all the buffers are used and no more buffers are available, so no more packets can be processed. The consequence for the network is that the system just stops forwarding the traffic.


Slide 87

Debug (37 of 49)


get sat <asic> frq
What to do next?
If showing LEAK, check multiple times to see if the buffer list is always increasing; it's only a real leak if the buffer list is extremely long and no buffers are being freed
Check get sat <asic> c for full queues or queue full increments
Check get log sys for ASIC reinit messages
Check other ASIC outputs

You can always correlate that with the get sat <asic> counters command, because it tells you if any queue is full. If the free buffer queue is full, you will see frq marked as full in the get sat counters output.

Slide 88

Debug (38 of 49)


How to interpret the outputs?
get sat <asic> session
Shows session allocation information
Look for leaked counts

ns5400-> get sat 4 session
Saturn chip 4 free session link list sanity check:
session: total 524287, alloc 3013104, released 3001124, free 512307, checked_free 512307, leaked 0

What is important?
Find out if there are sessions leaking in the ASIC session table

Why is it important?
Session leak can cause packet loop between CPU and ASIC -> high CPU problem


Then you have get sat <asic> session. This one usually is not a problem, but sometimes you may have a leak, so that sessions in the ASIC do not match the CPU session table.
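
The numbers on the session line can be cross-checked against each other. In the sample output, alloc minus released and total minus free both come to 11980, and free equals checked_free. The sketch below automates that arithmetic. Treating these relations as a general consistency check is an assumption drawn from this one sample, not a documented rule, and the function names are invented here.

```python
import re

def parse_session_line(text):
    """Return the counters on the 'session:' line as a dict of integers."""
    for line in text.splitlines():
        if line.strip().startswith("session:"):
            return {name: int(value) for name, value in re.findall(r"(\w+) (\d+)", line)}
    return {}

def check(counters):
    """Cross-check the counters; the relations are inferred from the sample only."""
    in_use_by_alloc = counters["alloc"] - counters["released"]
    in_use_by_free = counters["total"] - counters["free"]
    return {
        "in_use": in_use_by_alloc,
        "alloc_matches_free": in_use_by_alloc == in_use_by_free,
        "free_list_verified": counters["free"] == counters["checked_free"],
        "leaked": counters["leaked"],
    }

# Output copied from the slide.
SAMPLE = """Saturn chip 4 free session link list sanity check:
session: total 524287, alloc 3013104, released 3001124, free 512307, checked_free 512307, leaked 0"""

print(check(parse_session_line(SAMPLE)))
```

Tracking the leaked value across several captures shows whether it is stable or growing.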


Slide 89

Debug (39 of 49)


How to interpret the outputs?
What to do next?
Check get sat <asic> c for full queues or queue full increments
Disable the feature using the PPU affected
Check get log sys for ASIC reinit messages
Check other ASIC outputs
Run debug tag info and debug flow basic

This is nothing to worry about, because the ASIC can deal with that, and the CPU can correct it as well. It's only a problem if the number of leaked sessions in this output starts increasing to a very high value.

Slide 90

Debug (40 of 49)


How to interpret the outputs?
get michigan
ns5400-> get michigan 1 count
P3 rx count 53843, tx count 20896
P4 rx count 0, tx count 0
P5 rx count 59496, tx count 47859
P6 rx count 0, tx count 47859
P5 drop count 0, P6 drop count 0
iTxrdy 3c, iRxrdy 0x0, Txrdy 0xf, Rxrdy 0x0

Now let's check some of the specific commands introduced earlier. With get michigan, which covers the FPGA on the 24FE card, you look at the drop counters.

Slide 91

Debug (41 of 49)


get michigan
Shows counters for 2G24FE SPM front end processor
Look for drops

What is important?
Find out if there are drops

Why is it important?
System capacity is being reached
Hardware fault

This usually is not a problem. When you do have drops at this level, in the FPGA chip, most of the time they indicate hardware issues.

Slide 92

Debug (42 of 49)


get michigan
What to do next?
Determine the traffic load that is arriving at the system
Check get count stat and correlate the information
Check get sat <asic> c for full queues or queue full increments
Check get log sys for ASIC reinit messages
Check other ASIC outputs
Possible RMA


In such cases, you can do a replacement or, if system capacity is being reached, then there is nothing else to do but to increase the number of cards or change the design.


Slide 93

Debug (43 of 49)


How to interpret the outputs?
get arch
ns5400-> get arch 2
-- I/O Card 2, BigSur (0xf6c00000)
                         0         1
Alpine0 RxPktCnt  5a344e01  1fd0f200
Alpine0 RxErrCnt  00000000  00000000
Alpine0 TxPktCnt  37869601  62647801
Alpine0 TxErrCnt  00000000  00000000
Alpine1 RxPktCnt  39869601  62647801
Alpine1 RxErrCnt  00000000  00000000
Alpine1 TxPktCnt  74344e01  20d0f200
Alpine1 TxErrCnt  00000000  00000000
-- I/O Card 2, Alpine 0 (0xf6c0c000)
                       0         1         2         3
MacRxPktCnt         bb9a      bd6e      bfd3      c308
MacRxErrPktCnt      0000      0000      0000      0000
MacTxPktCnt         5871      329e      2e28      31c8
MacTxErrPktCnt      0000      0000      0000      0000
JRxPktCnt       00a6e568  0088bd6e  0088bfd4  0088c309
JRxErrPktCnt    00000000  00000000  00000000  00000000
JTxPktCnt       00d4cf47  00978ca2  007967e0  00796ae3
JTxErrPktCnt    00000000  00000000  00000000  00000000
SRxPktCnt       0196c7b4  0178a0fd
SRxPktErrCnt    00000000  00000000
STxPktCnt       014e9dfb  00f30c6a
STxPktErrCnt    00000000  00000000

The other specific command is get arch, for the 8 Gig or 10 Gig card. In this output you see the names BigSur and Alpine, which are the FPGA chips. You see the rx and tx counters for packets and errors. What you look for here are errors; you need to pay attention to those.

Another thing that might help here is to check if all the expected counters are incrementing. For example, you have four channels here. If you have the 8 Gig card, you expect each channel to be related to one port, so you can run this command and see how the counters increment when you send traffic through the system.

Slide 94

Debug (44 of 49)


get arch
Shows status of 8G2/2XGE/8G2-G4/2XGE-G4 SPM front end processor
Look for err

What is important?
Find out if there are errors or drops in front end processor

Why is it important?
Throughput is not as high as expected
Hardware failure
System capacity is being reached

What to do next?
Determine if traffic load is not reaching system capacity
Check get sat <asic> c for full queues or queue full increments
Check get log sys for ASIC reinit messages
Check other ASIC outputs
Possible RMA

Most of the time, when you find errors, they are going to be hardware errors, in which case you do an RMA. While it is certainly possible that there is a problem in how the packets are sent, that's not very common.

Slide 95

Debug (45 of 49)


How to interpret the outputs?
get fresno
nsisg2000-> get fresno 0
fresno version is 0x66, Rocket IO mode
iorx_pkt_cnt0/1/2/3 is 0x9a74, 0x0000, 0x7284, 0x31ff
iotx_pkt_cnt0/1/2/3 is 0x2a83, 0x0000, 0x252a, 0x0000
iorx_ipb_timeout_cnt0/1/2/3/4/5/6/7/8/9 is 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
jrx_pkt_cnt0/1/2/3 is 0x0000, 0x0000, 0x9a87, 0xa483
jtx_pkt_cnt0/1/2/3 is 0x0000, 0x0000, 0x2a86, 0x252c
jrx_pkt_sop_cnt0/1/2/3 is 0x0000, 0x0000, 0x0000, 0x0000
jtx_pkt_sop_cnt0/1/2/3 is 0x0000, 0x0000, 0x0000, 0x0000
rio_ipb_status0/1/2/3 is 0x00, 0x00, 0x00, 0x00
rio_opb_status0/1/2/3 is 0x00, 0x00, 0x00, 0x00
cross_fresno_rx0/1 is 0x0000, 0x0000
cross_fresno_tx0/1 is 0x0000, 0x0000
SYNC , NO LINK , NO LINK , SYNC
tx_total_frame_cnt = 00000000, 00000000, 00000000, 00000000
tx_err_frame_cnt = 00000000, 00000000, 00000000, 00000000
Rx_crc_frame_cnt = 00000000, 00000000, 00000000, 00000000
Rx_err_frame_cnt = 00000000, 00000000, 00000000, 00000000
Tx_real_error_pktcnt = 00000000, 00000000, 00000000, 00000000
Tx_real_total_pktcnt = 00000000, 00000000, 00000000, 00000000
Rx_real_error_pktcnt = 00000000, 00000000, 00000000, 00000000
Rx_real_total_pktcnt = 00000000, 00000000, 00000000, 00000000
Rx_real_illgl_pktcnt = 00000000, 00000000, 00000000, 00000000
slot0 slot1 slot2 slot3
XMTQ7 XMTQ6 XMTQ3 XMTQ4 XMTQ2

The get fresno output is similar; it shows the FPGA counters on the ISG platform. You also look for errors to see if they are incrementing. There is one extra detail here: you see the transmit queues. If you remember, the get sat counters output shows the queues. Here you see how the queues are used, so slot 2 is using transmit queue three (XMTQ3), for example.

Slide 96

Debug (46 of 49)


get fresno
Shows status of ISG front end processor
Look for err

What is important?
Find out if there are errors or drops in front end processor

Why is it important?
Throughput is not as high as expected
Hardware failure
System capacity is being reached

What to do next?
Determine if traffic load is not reaching system capacity
Check get sat <asic> c for full queues or queue full increments
Check get log sys for ASIC reinit messages
Check other ASIC outputs
Possible RMA

That's what you look for with get fresno. For the errors, it is basically the same idea as with get arch. Most of the time, it's either a hardware failure or you are really reaching the system capacity.

Slide 97

Debug (47 of 49)


How to run debug tag info
Example - telnet
****************** 03167.0: tag (03a06f80) ******************
pak length: 48 vlan qidx:6 slot:0 port:0 buffer:0x8028d91c
protcol:6 demux:4 l2idx:5190 ipid:e4ca flags:0x40008007
session pointer:0x00029247
src:10.227.5.200 dst:4.4.4.4 sport:c52e dport:17
********************** end tag info *************************
st_tag_2_ifp: 10.227.5.200 -> 4.4.4.4, incoming ifp=ethernet2/1.400
start demux process 4

demux 4: first packet for the session
Src-ip: 10.227.5.200 -> dst-ip: 4.4.4.4
Src port: c52e -> dst port: 17
Incoming interface eth2/1.400
IPID = 0xe4ca


Now we come to the debug command, debug tag info. This is very important. You run it when you are looking for problems related to the CPU. This command shows only packets going to the CPU. If packets are being processed only by the ASIC chip, we don't see them in the debug.

The debug flow basic is the same; it only shows packets going to the CPU.

Why do we do debug tag info? Here you see the information from the packet going to the CPU, with a lot of detail. You see the packet length and also the queue index (qidx), which shows which queue sent the packet to the CPU. If you go to the get sat counters output, you can see which queue has queue index 6. You see the address of the buffer, so if you want to see the whole packet's content, you can look at this buffer.

The protocol is six, and then comes the demux tag, which is very important since it indicates why the packet went to the CPU. Demux 4 means it's the first packet for the session. If there is no session in the table in the ASIC chip, the ASIC has to send the packet to the CPU for session creation.

You also see source address, destination address, source port and destination port here, in abbreviated notation.

Another important thing is the IPID of the packet. If you are looking for a packet loop, you can run this debug and see it clearly: the same packet ID appears five, ten, or 100 times. It is the same packet, so there is a loop.
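
Counting repeated IPIDs by eye is tedious in a large debug buffer, so the check can be sketched in Python. The example below parses tag blocks from saved debug text, decodes the port fields, and counts how often the same source, destination and IPID appear. It rests on two assumptions: the ports are printed in hexadecimal (the slide's telnet example shows dport:17, and 0x17 is 23), and the threshold of five repeats is arbitrary. The function names are invented for the illustration.

```python
import re
from collections import Counter

# One match per tag block: demux, ipid, then the src/dst/sport/dport line.
TAG_RE = re.compile(
    r"demux:(?P<demux>\d+).*?ipid:(?P<ipid>[0-9a-fA-F]+).*?"
    r"src:(?P<src>\S+)\s+dst:(?P<dst>\S+)\s+"
    r"sport:(?P<sport>[0-9a-fA-F]+)\s+dport:(?P<dport>[0-9a-fA-F]+)",
    re.DOTALL,
)

def parse_tags(text):
    """Return one dict per tag block found in saved 'debug tag info' output."""
    return [
        {
            "demux": m.group("demux"),           # kept as printed
            "ipid": m.group("ipid"),
            "src": m.group("src"),
            "dst": m.group("dst"),
            "sport": int(m.group("sport"), 16),  # assumed hexadecimal
            "dport": int(m.group("dport"), 16),  # assumed hexadecimal
        }
        for m in TAG_RE.finditer(text)
    ]

def repeated_ipids(tags, threshold=5):
    """Return {(src, dst, ipid): count} for packets seen at least threshold times."""
    counts = Counter((t["src"], t["dst"], t["ipid"]) for t in tags)
    return {key: n for key, n in counts.items() if n >= threshold}

# Tag block copied from the slide.
SAMPLE = """****************** 03167.0: tag (03a06f80) ******************
pak length: 48 vlan qidx:6 slot:0 port:0 buffer:0x8028d91c
protcol:6 demux:4 l2idx:5190 ipid:e4ca flags:0x40008007
session pointer:0x00029247
src:10.227.5.200 dst:4.4.4.4 sport:c52e dport:17
********************** end tag info *************************
"""

tags = parse_tags(SAMPLE)
print(tags[0]["sport"], tags[0]["dport"])    # decoded port numbers
print(repeated_ipids(parse_tags(SAMPLE * 6)))  # the same block six times, as a stand-in for a loop
```

A non-empty result from repeated_ipids is the pattern described above: the same packet ID arriving at the CPU again and again.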


Please remember that the debug command can be service affecting depending on the load in the system because it takes a lot of CPU time to do this debug. If the load is very high, you might create some interference.


Slide 98

Debug (48 of 49)


How to run debug tag info
Run it for ~10 seconds when the problem is happening
  Set debug buffer to maximum size: set dbuf size 4096
  Clear debug buffer: clear db
  Run debug command: debug tag info
  Wait 10 seconds
  Type Esc to abort
  Collect output: get db stream

CPU intensive, affects system performance

Look for demux number
  1: packet has to be sent to CPU for processing (e.g. ALG)
  4: first packet
  25: ICMP

What we usually do is run the debug for 10 seconds, type Esc to abort immediately, and then inspect the output.

Other examples of demux tags: 1 is a packet that had to be sent to the CPU for processing. Even if there is a session, the packet needs to go to the CPU, for example, in the case of ALG. Also, 25 is for ICMP, and ICMP always goes to the CPU.
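
Once the buffer has been collected with get db stream, a quick tally of demux values shows why packets are reaching the CPU. The sketch below counts each demux number in saved text and attaches the three meanings listed on the slide. Any other value is reported as not described here rather than guessed. The function name and the sample text are invented for the illustration.

```python
import re
from collections import Counter

# Only the three demux values explained in this guide.
DEMUX_MEANING = {
    "1": "sent to CPU for processing (e.g. ALG)",
    "4": "first packet for the session",
    "25": "ICMP",
}

def demux_summary(text):
    """Return [(demux, meaning, count)] sorted by count, highest first."""
    counts = Counter(re.findall(r"\bdemux:(\d+)", text))
    return [
        (code, DEMUX_MEANING.get(code, "not described in this guide"), n)
        for code, n in counts.most_common()
    ]

# Hypothetical fragment of a saved debug buffer.
CAPTURE = """protcol:6 demux:4 l2idx:5190 ipid:e4ca
protcol:6 demux:4 l2idx:5190 ipid:e4cb
protcol:1 demux:25 l2idx:5190 ipid:0101
protcol:6 demux:4 l2idx:5190 ipid:e4cc
start demux process 4"""

for code, meaning, count in demux_summary(CAPTURE):
    print(code, meaning, count)
```

A capture dominated by demux 4 suggests many new sessions, while a large share of demux 1 points to traffic that needs CPU handling even though a session exists.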


Slide 99

Debug (49 of 49)


How to run debug tag info
What's important?
Find out if there are too many packets going to CPU

Why is it important?
Investigation of high CPU
Packets that should be processed only by ASIC are going to CPU incorrectly
Packet loop between ASIC and CPU

What to do next?
Determine if the packets going to CPU are expected
If not, investigate the traffic pattern and policy configuration
Check ASIC commands for queue full increments or reinits


That's it for this debug command. We always correlate the results, so we also check the get sat commands, especially get sat demux, because it tells us how many packets per second are going to the CPU.


Slide 100

Section Summary
In this section, we:
Reviewed general commands used in ScreenOS
Listed the most important commands specific to high end systems
Explained how to collect the data and interpret the output
Showed how to run debug tag info when looking for problems related to CPU


In this section, we:

Reviewed general commands used in ScreenOS
Listed the most important commands specific to high end systems
Explained how to collect the data and interpret the output, and
Showed how to run debug tag info when looking for problems related to CPU


Slide 101

Learning Activity 5: Question 1


We run the command get sat counters to do what?
A) Look at the status of the queue
B) See if there was any reset in the ASIC chip
C) Check for high CPU
D) Find SYN flood attacks


Slide 102

Learning Activity 5: Question 2


When the command get michigan shows drops on the FPGA chip, most of the time it indicates what?
A) Mismatched sessions in ASIC and CPU
B) Leaked sessions
C) Hardware issues
D) Output defragmentation errors


Slide 103

Troubleshooting Examples

Slide 104

Section Objectives
After successfully completing this section, you will be able to:
Describe workarounds provided in the three most critical troubleshooting examples occurring in the field
Apply the commands described in each troubleshooting example


After successfully completing this section, you will be able to:

Describe workarounds provided in the three most critical troubleshooting examples occurring in the field, and
Apply the commands described in each troubleshooting example


Slide 105

Troubleshooting Examples (1 of 12)


Example 1: System stops forwarding traffic
Scenario
NS5400-MGT2-2XGE
NSRP Active/Passive cluster
ScreenOS 6.2r1

Problem
Master unit stops forwarding traffic
Failover to backup unit doesn't occur
Manual failover needed to recover the services
Reset needed to recover the system

ns5400-> get chass | in mb
Slot  Type             S/N               Assembly-No  Temperature   DRAM Size
1     Management       0102032007000009  0058-005     109'F (43'C)  2048MB
2     Processing-2XGE  0143072006000013  0063-003     114'F (46'C)  1024MB


The first real-world troubleshooting example, and the one that is most service affecting, is when the system stops forwarding traffic. This example involved a NetScreen 5400 with the Management 2 module and the two-port 10 gigabit card, in an active/passive cluster running the 6.2r1 release. What was the problem? The master unit just stopped forwarding traffic; no traffic was being processed. It was service affecting because the two units were still exchanging heartbeats, so no failover to the backup unit was triggered.

How was the situation resolved? A manual failover to the backup unit was done. The backup unit was running well, so that recovered the services. Then the old master had to be reset to recover from that situation. Here we show the get chassis output so you can see the information about the cards.


Slide 106

Troubleshooting Examples (2 of 12)


Example 1: System stops forwarding traffic
Commands collected
get sat 0 d
get sat 0 x-c
get sat 0 fr
get sat 0 c
get sat 0 s
get arp asic 0
get sat 1 d
get sat 1 x-c
get sat 1 fr
get sat 1 c
get sat 1 s
get arp asic 1
get arch 0

ns5400-> get asic mapping
0  (ethernet4/1)
1  (ethernet4/2)
2  n/a
3  n/a
4  n/a
5  n/a


How did we investigate this problem? We collected the get sat commands. These are the most important ones: get sat demux, get sat x-compact, get sat frq, get sat counter, and get sat session, plus get arp asic to look at the ARP table. Also, use get arch 0 to see the counters in the front-end processor.

Use the command get asic mapping to know which ASICs you need to check. Here you have to check ASIC 0 and ASIC 1, and that's why you see both get sat 0 and get sat 1.
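When several ASICs are in use it is easy to forget one of the commands, so the collection list can be generated from the mapping. The sketch below assumes you have copied the get asic mapping result into a dictionary by hand; the function name is hypothetical, and the command strings are the abbreviated forms shown on the slide.

```python
# Abbreviated get sat keywords exactly as they appear on the slide.
SAT_SUFFIXES = ["d", "x-c", "fr", "c", "s"]

def commands_for(asic_mapping):
    """Build the collection list for every ASIC that has an interface mapped.

    asic_mapping: {asic_id: interface name, or None for n/a}, transcribed
    manually from the get asic mapping output.
    """
    commands = []
    for asic_id in sorted(asic_mapping):
        if asic_mapping[asic_id] is None:
            continue  # n/a: nothing mapped to this ASIC, so skip it
        commands += [f"get sat {asic_id} {suffix}" for suffix in SAT_SUFFIXES]
        commands.append(f"get arp asic {asic_id}")
    commands.append("get arch 0")  # front-end processor counters
    return commands

# Mapping from this example: only ASIC 0 and ASIC 1 have ports.
mapping = {0: "ethernet4/1", 1: "ethernet4/2", 2: None, 3: None, 4: None, 5: None}

for command in commands_for(mapping):
    print(command)
```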


Slide 107

Troubleshooting Examples (3 of 12)


Example 1: System stops forwarding traffic
Analysis
slu ASIC queue full and not getting freed

ns5400(M)-> get sat 0 c
   Q name  wrptr  rdptr  full  emp  size  q_full_cnt
(...)
7  slu     0007   0003   1     0    0007  349
(...)

Packet loop between ASIC/CPU

LISNS5400:FW1(M)-> get sat 0 x-c | in between
packet up/down between CPU and ASIC: 72

ASIC reinits

LISNS5400:FW1(M)-> get log sys | in reinit
## 2008-12-08 13:40:42 : reinit chip 0, invalid buf (380a7100).
## 2008-12-08 13:41:42 : reinit chip 0, invalid buf (380bf900).

Problem didn't happen after disabling TCP SYN check


What did we see in this output? We were looking at the counters, and the first thing we noted was that the slu queue in the get sat counter output was showing a lot of queue full events. The q_full_cnt value was incrementing constantly: every time we ran the command, the number was higher. We also noted that the full flag was always 1, which meant the queue was full and stuck. It was dropping all the traffic. That's why no packets were being processed; no traffic was running.

Then we kept checking the data and also saw a lot of packets going up and down between the CPU and the ASIC. We also saw re-initializations of the ASIC chip: with the get log sys command, we saw reinit chip 0 entries reporting an invalid buffer.

So we had three pieces of evidence that there were problems on the ASIC chip. Then we tried disabling the TCP SYN check, and we noted the problem was not happening anymore.
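The "full is always 1 and q_full_cnt keeps rising" test can be written down as a simple rule over repeated readings. This is only a sketch: the readings are assumed to be typed in by hand from successive get sat counter outputs, the class and function names are hypothetical, and apart from 349 the sample numbers are invented.

```python
from dataclasses import dataclass

@dataclass
class QueueSample:
    """One reading of a queue row, copied manually from get sat <n> c."""
    full: int        # the "full" column (1 = queue currently full)
    q_full_cnt: int  # the cumulative queue-full counter

def queue_looks_stuck(samples):
    """True if the queue was full in every reading and the counter kept rising."""
    if len(samples) < 2:
        return False  # one reading cannot show a trend
    always_full = all(s.full == 1 for s in samples)
    rising = all(b.q_full_cnt > a.q_full_cnt for a, b in zip(samples, samples[1:]))
    return always_full and rising

# 349 is the value from the slide; the later readings are made up.
stuck = [QueueSample(1, 349), QueueSample(1, 361), QueueSample(1, 378)]
healthy = [QueueSample(0, 349), QueueSample(1, 350), QueueSample(0, 350)]

print(queue_looks_stuck(stuck))
print(queue_looks_stuck(healthy))
```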


Slide 108

Troubleshooting Examples (4 of 12)


Example 1: System stops forwarding traffic
Workaround
Disable TCP SYN check
unset flow tcp-syn-check
unset flow tcp-syn-bit-check

Root Cause
Software defect: TCP SYN check was corrupting packets for cross-ASIC sessions, causing packet loop between ASIC/CPU and slu queue stuck.

Solution
Code was modified to implement the necessary corrections


So the workaround is to disable the TCP SYN check. What's important here is that the investigation we did with engineering determined that the TCP SYN check was corrupting packets in the case of cross-ASIC sessions. This caused a packet loop between the ASIC and the CPU, and because of that loop the session lookup (slu) queue got stuck and couldn't recover, so it couldn't process any more packets. That's why the system stopped forwarding traffic.

The solution in this case was to modify the code to avoid corrupting the packets, and then the problem was solved. Now we don't have this issue anymore.


Slide 109

Troubleshooting Examples (5 of 12)


Example 2: TFTP transfers not working
Scenario
NS5400-MGT3-2XGE-G4/8G2-G4
NSRP Active/Active cluster
ScreenOS 6.1r4
Sessions are cross-ASIC (2XGE-G4 to 8G2-G4)

Problem
Specific users cannot do TFTP transfers through the cluster
Transfer starts but after a few seconds it hangs
If no-hw-session is enabled in policy transfer is successful

SDU:Jabbar-NS5400(M)-> get chas | in mb
Slot  Type                S/N               Assembly-No  Temperature   DRAM Size
1     Management-III      0225082008000060  0072-001     109'F (43'C)  2048MB
2     Processing-2XGE-G4  0227062008000032  0085-001     123'F (51'C)  1024MB
3     Processing-8G2-G4   0226092008000027  0084-001     116'F (47'C)  1024MB


The second example is also with the NetScreen 5400, but now with the Management-3 card and the new interface cards, 10 Gig (2XGE-G4) and eight Gig (8G2-G4). In this case we have an active/active cluster running ScreenOS 6.1r4, and all the sessions were cross-ASIC, going from a 10 Gig port to an eight Gig port. The problem was that some specific users couldn't do TFTP transfers through the cluster. From the client side, we could see the transfers starting, but after a few seconds they would just hang. We suspected the problem was at the ASIC level, so we enabled no-hw-session in the policy, specifically for that client, and then the transfer was successful. That told us something in the ASIC was causing the problem, because no-hw-session bypasses the processing in the PPU.


Slide 110

Troubleshooting Examples (6 of 12)


Example 2: TFTP transfers not working
Analysis (1)
Packet captures showed that only transfers with fragmented packets were unsuccessful
No fragment drops were detected in the system

ns5400(M)-> get asic ppu defrag
Show ASIC 1 PPU information:
Defragmentation of Encrypted Packets
Total input packets: 0, Total Fragments: 0
First frag: 0, None-first Frag: 0
Defrag pass: 0, ESP frag: 0
Unexpedted packet: 0, To RSMQ: 0
AH frag: 0
Defragmentation of Clear-Text Packets
Total input packets: 353294, First frag: 82415
Defrag pass: 352369, Defrag fail: 0
Null Session Error: 0, Out-of node buffer: 0
PPU merge: 0

ASIC commands didn't show any anomaly
No ASIC queue full, no reinits


What analysis did we do here? We did some packet captures to see why only specific clients were having a problem. We saw that those clients were doing transfers with fragmented packets. The TFTP block size was 8000 bytes or so, so it was causing fragmentation. So what do we do? Let's check get asic ppu defrag, because that's where the defragmentation is done. But here we see zero: no defragmentation failure, no null session error. So the PPU processing seemed to be fine. We continued to look at the other ASIC commands. They also didn't show anything that could really pinpoint the problem. What do we do next? We ran debug tag info.
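To see why a block size of about 8000 bytes forces fragmentation, the arithmetic can be worked through. This sketch makes several assumptions that are not stated in the course: an Ethernet MTU of 1500 bytes, a 20-byte IPv4 header with no options, and a block size of exactly 8000. The function name is hypothetical.

```python
import math

def fragment_count(tftp_block_size, mtu=1500, ip_header=20):
    """Estimate how many IP fragments one TFTP DATA packet needs.

    Assumptions: IPv4 header without options, 8-byte UDP header, and the
    4-byte TFTP DATA header (2-byte opcode plus 2-byte block number).
    """
    udp_payload = tftp_block_size + 4
    ip_payload = udp_payload + 8
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil(ip_payload / per_fragment)

print(fragment_count(8000))  # the block size mentioned in the narration
print(fragment_count(512))   # the classic TFTP default block size
```

Under these assumptions each 8000-byte block is carried in six fragments, while a 512-byte block fits in a single unfragmented packet, which is consistent with only some clients being affected.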


Slide 111

Troubleshooting Examples (7 of 12)


Example 2: TFTP transfers not working
Analysis (2)
debug tag info showed the packets going to CPU incorrectly
Session already created, fragmented traffic is processed only by ASIC

Demux = 4 -> packets were considered first packets incorrectly

****************** 03167.0: tag (03a06f80) ******************
pak length: 48 vlan qidx:6 slot:0 port:0 buffer:0x8028d91c
protcol:6 demux:4 l2idx:5190 ipid:e4ca flags:0x40008007
session pointer:0x00029247
src:10.227.5.200 dst:4.4.4.4 sport:c52e dport:45
********************** end tag info *************************
st_tag_2_ifp: 192.168.25.30 -> 192.168.33.43, incoming ifp=ethernet2/1.43
start demux process 4


We decided to see whether something was going to the CPU that shouldn't. We ran debug tag info and then we saw what the problem was. These fragments were going up to the CPU. They belonged to a flow that already existed, but they were being sent to the CPU with demux tag 4; they were considered first packets for a new session. This confused the CPU, because the CPU already had a session for that traffic. The packet was not sent out. It was being dropped when the ASIC received it. That was the issue.


Slide 112

Troubleshooting Examples (8 of 12)


Example 2: TFTP transfers not working
Workaround
Enable no-hw-session in the policy
All packets are processed by CPU
Debug flow basic confirmed correct processing

Root Cause
Software defect: PPUC fragment handling was incorrect, causing ASIC session matching to fail and send packet to CPU

Solution
Code was modified to implement the necessary corrections


As a workaround, we used no-hw-session in the policy. In that case, the packets are processed in the CPU. The root cause was that the fragment handling in the PPUC, which is the one that handles defragmentation, was incorrect. We saw zero errors in the counters, but the PPUC was using a bad hashing mechanism to match the session table in the ASIC. This was causing session matching to fail in the ASIC. Then, because no session was found in the ASIC, the packet was sent to the CPU. The CPU was confused and the packet was not sent out. The solution here was also to modify the code, and now we don't have this problem anymore in the latest version.


Slide 113

Troubleshooting Examples (9 of 12)


Example 3: System showing abnormal high CPU
Scenario
NS5400-MGT2-2XGE
ScreenOS 6.2r1

Problem
System showing high CPU
Determine the reason for this behavior

ns5400-> get perf cpu all detail
Average System Utilization: 5% (flow 6 task 3)
Last 60 seconds:
59: 20(30 2)    58: 20(30 1)    57: 29(39 3)    56: 79(89 7)**
55: 78(88 8)**  54: 78(88 7)**  53: 77(87 6)**  52: 77(87 6)**
51: 77(87 6)**  50: 77(87 6)**  49: 77(87 6)**  48: 77(87 6)**
47: 77(87 6)**  46: 77(87 6)**  45: 76(86 5)**  44: 77(87 7)**
43: 76(86 6)**  42: 77(87 6)**  41: 76(86 6)**  40: 76(86 5)**


Here's another example, regarding abnormally high CPU. This is also an important problem for the system.

What causes high CPU? In this example we have a NetScreen 5400 with the 10 gigabit card running ScreenOS 6.2r1, and the system is showing high CPU. The first command to use when high CPU exists is get perf cpu all detail. The word all is critical, because it breaks down the CPU utilization.

The output shows both flow and task CPU utilization. In this case it reveals that flow CPU was high, but not task. What does this tell you? It tells you that flow processing is causing the high CPU utilization, and that means traffic: we are processing a lot of traffic. Let's focus on the traffic that's being processed.
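The flow-versus-task split can also be pulled out of the per-second samples with a short script. This is a sketch under an assumption about the format: in an entry such as `78(88 8)**`, the first number is taken as the overall value and the two numbers in parentheses as flow and task, following the order in the header "(flow 6 task 3)". The function name and the 80% cut-off are hypothetical.

```python
import re

# Matches entries such as "55: 78(88 8)**" (second: overall(flow task)).
SAMPLE_RE = re.compile(r"(\d+):\s*(\d+)\(\s*(\d+)\s+(\d+)\)")

def parse_cpu_samples(text):
    """Return one dict per per-second sample found in the text."""
    return [
        {"second": int(sec), "overall": int(overall), "flow": int(flow), "task": int(task)}
        for sec, overall, flow, task in SAMPLE_RE.findall(text)
    ]

# A few entries copied from the slide output.
text = "59: 20(30 2) 58: 20(30 1) 55: 78(88 8)** 54: 78(88 7)**"

samples = parse_cpu_samples(text)
flow_heavy = [s["second"] for s in samples if s["flow"] >= 80 and s["task"] < 20]
print(flow_heavy)
```

Seconds that show up in `flow_heavy` point to traffic processing rather than management tasks, which is the conclusion reached in the narration.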


Slide 114

Troubleshooting Examples (10 of 12)


Example 3: System showing abnormal high CPU
Analysis (1)
flow is the CPU running high
Related to traffic processing/forwarding

~8000 packets per second were sent to CPU because of ALG processing

ns5400-> get asic demux
                          Current(02:57:15)  Last(02:57:15)  PPS( 17s)
to_host_packet:           612430             612430          0
first_packet:             13400782           13258685        8147
brcst:                    243                243             0
no_ip_ether_net:          708                708             0
total packet:             14014163           13872066        8147
clsf counters:
icmp                      40                 40
To CPU traffic analysis:
ALG:                      4152761            4010664         8147
DMA required:             59                 59              0


The next thing we did was look at get asic demux. We checked the PPS column and saw about 8,000 packets per second going to the CPU for ALG processing. The next question we asked was: which ALG is being triggered? Which traffic is this? We didn't expect this amount of traffic for the ALG.
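A quick way to confirm that the ALG accounts for the CPU-bound traffic is to compare how much each counter grew between the Last and Current columns. The figures below are transcribed from the slide output; the dictionary layout and function name are only an illustration.

```python
# (Current, Last) values transcribed from the get asic demux output.
counters = {
    "first_packet": (13400782, 13258685),
    "total packet": (14014163, 13872066),
    "ALG": (4152761, 4010664),
}

def deltas(table):
    """Return how much each counter grew between the two readings."""
    return {name: current - last for name, (current, last) in table.items()}

growth = deltas(counters)
for name, value in growth.items():
    print(f"{name}: +{value}")

# Share of the newly counted packets that were attributed to ALG processing.
print(growth["ALG"] / growth["total packet"])
```

All three counters grew by the same amount in the interval, so every packet added to the total during that time was counted under ALG.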


Slide 115

Troubleshooting Examples (11 of 12)


Example 3: System showing abnormal high CPU
Analysis (2)
debug tag info showed packets going to CPU
Demux = 4 -> first packets

Destination ports were identified
There were services using well-known ports and matching ALGs
Packets go to CPU if needed to be processed by ALG

****************** 11236.0: tag (03a15f00) ******************
pak length: 46 vlan qidx:6 slot:0 port:0 buffer:0x806e191c
protcol:17 demux:4 l2idx:5190 ipid:0 flags:0x00000007
session pointer:0x000e1d91
src:192.134.71.124 dst:212.60.215.99 sport:13c4 dport:13c4
********************** end tag info *************************
st_tag_2_ifp: 192.134.71.124 -> 212.60.215.99, incoming ifp=ethernet2/1.400
start demux process 4


Next we ran debug tag info, which shows the packet tags going to the CPU. In the tag, we can see the destination port. We can match to a service and then understand which ALG is being triggered.

In this case, 13c4 is 5060 in decimal, which is the port for the SIP service used for Voice over IP. We then knew why the CPU was high: there was a lot of traffic going through the firewall on the SIP port.
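The ports in the tag are shown in hexadecimal, so they need converting before you can match them to a service. A minimal sketch follows; the lookup table is a tiny hypothetical one containing only the two services that appear in these examples.

```python
# Small illustrative table; a real check would use a fuller service list.
WELL_KNOWN = {69: "TFTP", 5060: "SIP"}

def decode_port(hex_port):
    """Convert a hexadecimal sport/dport value from a tag to (decimal, service)."""
    port = int(hex_port, 16)
    return port, WELL_KNOWN.get(port, "unknown")

print(decode_port("13c4"))  # dport from the high CPU example
print(decode_port("45"))    # dport from the TFTP example
```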

We asked ourselves: do we expect this high amount of traffic for the SIP service? We can try a packet capture in the network or check, for example, the source, to see why it's sending all this traffic, and hopefully understand what's going wrong.

In this case there was no problem in the system. The CPU load was high because the packets being sent to the CPU represented a relatively high load. What happened was that the port was being used by a different service, and that service didn't need any ALG processing. But because it was using the SIP port, its traffic was going to the CPU for ALG processing.


Slide 116

Troubleshooting Examples (12 of 12)


Example 3: System showing abnormal high CPU
Workaround
Disable the ALGs being triggered
unset alg <algname> enable

Root Cause
System working as expected, traffic load for CPU processed packets was too high.

Solution
Change services to use non well-known ports
Or disable the ALGs if not needed

KB9453 - Troubleshooting High CPU on a firewall device


The idea here was to either change the port that the application uses on that specific network, or disable the ALG if you don't need it, that is, if you don't have any SIP service in the network.

With these three examples, we covered the most important problems that we have had in the field. First, the system stopped forwarding traffic. Second, certain applications or services were being dropped, and we needed to identify exactly which service and check the details. Third was the high CPU. Again, these three are the most important types of problems we have had.

We also have the Knowledge Base reference document KB9453, which provides a good starting point and also covers the analysis we went through here.


Slide 117

More Information
Juniper Knowledge Base: http://kb.juniper.net

Ask a question and get answers


Technical Documentation:
http://www.juniper.net/techpubs/software/screenos/index.html

ScreenOS Concepts and Examples Guide ScreenOS CLI Guide


J-Net Forum: http://forums.juniper.net/jnet

Sign up and participate


You have these additional sources of information.

The Knowledge Base has several articles that can help you.

Via the Technical Documentation link you can get to the ScreenOS Concepts and Examples Guide, which can help you understand the expected behavior, and also the ScreenOS CLI Guide can help you review the syntax of the commands.

You can also use J-NET to discuss problems you may encounter.


Slide 118

Section Summary
In this section, we:
Described workarounds provided in the three most critical troubleshooting examples occurring in the field
Showed how to apply the commands described in each troubleshooting example


In this section, we:

Described workarounds provided in the three most critical troubleshooting examples occurring in the field, and
Showed how to apply the commands described in each troubleshooting example


Slide 119

Learning Activity 6: Question 1


Which of the following is an indication that the system has stopped forwarding traffic?
A) Fragmented packets
B) Queue full & full always 1
C) Session matching fail
D) High CPU


Slide 120

Learning Activity 6: Question 2


The first command to use when high CPU exists is:
A) get ASIC demux
B) get sat counter
C) get sat session
D) get perf CPU all detail


Slide 121

Course Summary
In this Course, we:
Distinguished between ISG Series and NS5000 Series hardware configuration and packet flow
Explained the importance of the ASIC functions
Described First Path and Fast Path in packet flow
Differentiated between functions processed in the CPU versus PPU
Used and interpreted debug commands unique to high end systems
Explained the workarounds for three typical troubleshooting examples


In this Course, we:

Distinguished between ISG Series and NS5000 Series hardware configuration and packet flow
Explained the importance of the ASIC functions
Described First Path and Fast Path in packet flow
Differentiated between functions processed in the CPU versus PPU
Used and interpreted debug commands unique to high end systems, and
Explained the workarounds for three typical troubleshooting examples


Slide 122

Additional Resources
Education Services training classes
http://www.juniper.net/training/technical_education/

Juniper Networks Certification Program Web site


www.juniper.net/certification

Juniper Networks documentation and white papers


www.juniper.net/techpubs

To submit errata or for general questions


elearning@juniper.net


For additional resources or to contact the Juniper Networks eLearning team, click the links on the screen.


Slide 123

Evaluation and Survey


You have reached the end of this Juniper Networks eLearning module
You should now return to your Juniper Learning Center to take the Practice Test and the Student Survey
The test will allow you to gauge your knowledge of the material covered in this course
The survey will allow you to give feedback on the quality and usefulness of the course


You have reached the end of this Juniper eLearning module. You should now return to your Juniper Learning Center to take the Practice Test and the Student Survey. The test will allow you to gauge your knowledge of the material covered in this course. The survey will allow you to give feedback on the quality and usefulness of the course.


Slide 124

2010 Juniper Networks, Inc.

Juniper Networks, Junos, Steel-Belted Radius, NetScreen, and ScreenOS are registered trademarks of Juniper Networks, Inc. in the United States and other countries. The Juniper Networks Logo, the Junos logo, and JunosE are trademarks of Juniper Networks, Inc. All other trademarks, service marks, registered trademarks, or registered service marks are the property of their respective owners. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.


Corporate and Sales Headquarters
Juniper Networks, Inc.
1194 North Mathilda Avenue
Sunnyvale, CA 94089 USA
Phone: 888.JUNIPER (888.586.4737) or 408.745.2000
Fax: 408.745.2100
www.juniper.net

APAC Headquarters
Juniper Networks (Hong Kong)
26/F, Cityplaza One
1111 King's Road
Taikoo Shing, Hong Kong
Phone: 852.2332.3636
Fax: 852.2574.7803

EMEA Headquarters
Juniper Networks Ireland
Airside Business Park
Swords, County Dublin, Ireland
Phone: 35.31.8903.600
EMEA Sales: 00800.4586.4737
Fax: 35.31.8903.601

Copyright 2010 Juniper Networks, Inc. All rights reserved. Juniper Networks, the Juniper Networks logo, Junos, NetScreen, and ScreenOS are registered trademarks of Juniper Networks, Inc. in the United States and other countries. All other trademarks, service marks, registered marks, or registered service marks are the property of their respective owners. Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.
