
Introduction

Monitored Types of Information


Network Monitoring Configurations
Network Monitoring Methods
Performance Monitoring

Performance Indicators
Performance Monitoring Functions

Fault Monitoring

Problems of Fault Monitoring


Fault Monitoring Functions

Accounting Monitoring

Network monitoring is concerned with observing and analyzing the status and behavior of the end systems, intermediate systems, and subnetworks that make up the network to be managed.

Issues in network monitoring

What to monitor?
define what is to be monitored

How to monitor?
how to obtain information from managed resources

What to do with the monitored information?
how the monitored information is used in the various management functional areas

Static information
hardly changes
current configuration information
e.g., the number and identification of ports on a router

Dynamic information
changes frequently
information related to events in the network
e.g., change of state, transmission/reception of packets

Statistical information
derived from dynamic information
e.g., average number of packets transmitted per unit time
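
As a small illustration of how statistical information can be derived from dynamic information, the sketch below turns successive readings of a packet counter into an average packet rate. The function name, the data layout, and the sample values are hypothetical and chosen only for illustration.

```python
def average_packet_rate(samples):
    """Return the average packets per second over a series of counter readings.

    `samples` is a time-ordered list of (timestamp_seconds, cumulative_packets)
    pairs. This layout is an assumption, not something the slides prescribe.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, c_first), (t_last, c_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("timestamps must increase")
    # Statistical value = change in the dynamic counter / elapsed time.
    return (c_last - c_first) / (t_last - t_first)


# Dynamic information: made-up counter readings taken every 10 seconds.
readings = [(0, 1000), (10, 1600), (20, 2500), (30, 3100)]
rate = average_packet_rate(readings)
print(rate)
```

Only the first and last readings matter for the overall average; the intermediate readings would be needed for per-interval rates.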

[Figure: Management Information Base (MIB), showing the static (configuration) data base, the sensor data base, the dynamic data base (state and event variables, filled by sensor activation and data collection), and the statistical data base (an abstraction of state and event variables, e.g., Call_Blocked, Packet_Loss, Time_Delay, Throughput)]

monitoring application
includes the functions of monitoring that are
visible to the user
e.g., performance, fault, accounting

manager function
performs the basic monitoring function of
retrieving information

agent function
gathers and records management information for one or more network elements and delivers the information to the monitor

managed objects
management information that represents resources and their activities

monitoring agent
generates summaries and statistical analysis of management information

[Figure: (a) Manager-agent model, with monitoring application, manager function, agent function, and managed objects. (b) A model for summarization, with a monitoring agent placed between the manager function and several agent functions and their managed objects]

[Figure: network monitoring configurations. (a) Managed resources in manager system. (b) Resources in agent system, reached across a subnetwork or internet. (c) External monitor observing traffic on a LAN. (d) Proxy monitor agent]

Polling
a request-response interaction between a manager and an agent
a manager sends a request to an agent, which processes the request and responds with information from its MIB
a manager may use polling to
learn about the configuration it is managing
periodically obtain an update of conditions
investigate an area in detail after being alerted to a problem
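
The request-response pattern described above can be sketched in a few lines. This is a minimal in-memory simulation, not a real management protocol: the `Agent` and `Manager` classes and the variable names `port_count` and `packets_in` are all hypothetical.

```python
class Agent:
    """Hypothetical agent that answers requests from a tiny in-memory MIB."""

    def __init__(self, mib):
        self._mib = dict(mib)

    def handle_request(self, names):
        # Respond only with the requested variables that exist in the MIB.
        return {name: self._mib[name] for name in names if name in self._mib}


class Manager:
    """Hypothetical manager that polls every agent it knows about."""

    def __init__(self, agents):
        self.agents = agents

    def poll(self, names):
        # One request-response exchange per agent.
        return {agent_id: agent.handle_request(names)
                for agent_id, agent in self.agents.items()}


agents = {
    "router-1": Agent({"port_count": 4, "packets_in": 1200}),
    "router-2": Agent({"port_count": 8, "packets_in": 300}),
}
manager = Manager(agents)
snapshot = manager.poll(["port_count", "packets_in"])
print(snapshot)
```

Calling `poll` on a timer would give the periodic update of conditions; calling it once with a narrower list of names corresponds to investigating one area in detail.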

Event Reporting
information flow is initiated by the agent and directed to the manager
an agent may generate a report periodically to give the manager its current status, or whenever a significant event (e.g., a change of state) or an unusual event (e.g., a fault) occurs
good for detecting problems as soon as they occur
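
For contrast with polling, the sketch below shows agent-initiated reporting: the agent calls back to the manager only when its state changes. The `ReportingAgent` class, the event dictionary layout, and the use of a plain list as the "manager" are assumptions made for illustration.

```python
class ReportingAgent:
    """Hypothetical agent that pushes a report when its state changes."""

    def __init__(self, name, report):
        self.name = name
        self._report = report  # callable supplied by the manager
        self._state = "up"

    def set_state(self, new_state):
        if new_state == self._state:
            return  # nothing significant happened, so nothing is reported
        old_state, self._state = self._state, new_state
        self._report({
            "agent": self.name,
            "event": "state_change",
            "old": old_state,
            "new": new_state,
        })


received = []  # stands in for the manager's event queue
agent = ReportingAgent("switch-7", received.append)
agent.set_state("up")    # no change: no report is sent
agent.set_state("down")  # change of state: one report is sent
print(received)
```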

Measuring the performance of the network (performance monitoring) is essential in network management (NM)
to detect and fix problems that cause performance degradation
to better plan network upgrades

Problems in selecting and using appropriate indicators (or metrics)
too many indicators are in use
the meaning of most indicators is not yet clearly understood
some indicators are supported by only some manufacturers
frequently, the indicators are accurately measured but incorrectly interpreted by a human or a management application
the calculation of indicators takes too much time

Service-oriented

Availability: the percentage of time that a network system, a component, or an application is available to a user

Response Time: how long it takes for a response to appear at a user's terminal after a user action calls for it

Accuracy: the percentage of time that no errors occur in the transmission and delivery of information

Efficiency-oriented

Throughput: the rate at which application-oriented events (e.g., file transfers) occur

Utilization: the percentage of the theoretical capacity of a resource (e.g., transmission line, switch, CPU) that is being used
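
Two of these indicators reduce to simple ratios, sketched below. The function names and the example figures are hypothetical; the formulas follow directly from the definitions above (time available over total time, and capacity used over theoretical capacity).

```python
def availability_percent(uptime_s, downtime_s):
    """Percentage of the observed time during which the resource was available."""
    total = uptime_s + downtime_s
    if total <= 0:
        raise ValueError("observation period must be positive")
    return 100.0 * uptime_s / total


def utilization_percent(used, capacity):
    """Percentage of the theoretical capacity that is being used."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used / capacity


# Made-up figures: 99 hours up and 1 hour down; 4 Mbit/s carried on a 10 Mbit/s line.
print(availability_percent(99 * 3600, 1 * 3600))
print(utilization_percent(4_000_000, 10_000_000))
```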

[Figure: elements of response time between a workstation, a network interface (e.g., router), the network, and a server, labeled TI, WI, SI, CPU, WO, SO, TO]

RT = TI + WI + SI + CPU + WO + SO + TO

RT = response time
TI = inbound terminal delay
WI = inbound queuing time
SI = inbound service time
CPU = CPU processing delay
WO = outbound queuing time
SO = outbound service time
TO = outbound terminal delay
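
As a worked example of the response-time sum, the sketch below adds the seven components for one interaction. The class name and the millisecond values are made up for illustration; only the formula RT = TI + WI + SI + CPU + WO + SO + TO comes from the text.

```python
from dataclasses import dataclass


@dataclass
class ResponseTime:
    """One interaction's delay components, all in the same unit (here: ms)."""
    ti: float   # inbound terminal delay
    wi: float   # inbound queuing time
    si: float   # inbound service time
    cpu: float  # CPU processing delay
    wo: float   # outbound queuing time
    so: float   # outbound service time
    to: float   # outbound terminal delay

    def total(self):
        # RT = TI + WI + SI + CPU + WO + SO + TO
        return (self.ti + self.wi + self.si + self.cpu
                + self.wo + self.so + self.to)


# Hypothetical measurements in milliseconds.
sample = ResponseTime(ti=20, wi=5, si=10, cpu=40, wo=5, so=10, to=20)
print(sample.total())
```

Keeping the components separate, rather than recording only the total, makes it possible to see which stage dominates a slow response.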


Performance Measurement
the actual gathering of statistics about network
traffic & timing
typically performed by agents within network
devices
e.g., amount of data in and out of a node, number
of connections, traffic per connection

Performance Analysis
analyzing the gathered data and presenting it
e.g., total, average, min, max, histogram
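
A minimal sketch of the analysis step is shown below: it reduces a list of measurements to a total, average, minimum, maximum, and a fixed-width histogram. The function name, the bin width, and the sample values are assumptions.

```python
from collections import Counter
import statistics


def summarize(values, bin_width):
    """Return summary statistics and a histogram keyed by each bin's lower edge."""
    if not values:
        raise ValueError("no measurements to summarize")
    # Each value falls into the bin [k * bin_width, (k + 1) * bin_width).
    histogram = Counter((v // bin_width) * bin_width for v in values)
    return {
        "total": sum(values),
        "average": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "histogram": dict(sorted(histogram.items())),
    }


# Made-up per-connection traffic counts (e.g., kilobytes).
traffic = [12, 47, 51, 8, 95, 49]
summary = summarize(traffic, bin_width=50)
print(summary)
```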

Synthetic Traffic Generation
generating an artificial traffic load
permits the network to be observed under a controlled load
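
One common way to produce a controlled load is to schedule packets with exponentially distributed gaps (a Poisson arrival process). The slides do not name a traffic model, so this choice, the function name, and the parameters below are all assumptions.

```python
import random


def synthetic_arrivals(rate_per_s, duration_s, seed=0):
    """Return sorted packet send times (seconds) for a Poisson arrival process.

    A fixed seed makes the generated load repeatable from run to run.
    """
    if rate_per_s <= 0 or duration_s <= 0:
        raise ValueError("rate and duration must be positive")
    rng = random.Random(seed)
    now = 0.0
    times = []
    while True:
        # Gaps between arrivals are exponentially distributed with mean 1/rate.
        now += rng.expovariate(rate_per_s)
        if now >= duration_s:
            return times
        times.append(now)


schedule = synthetic_arrivals(rate_per_s=100.0, duration_s=2.0, seed=42)
print(len(schedule), "packets scheduled")
```

Raising `rate_per_s` step by step while measuring response time is one way to observe when load starts to degrade performance.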

Performance measurements can be used to answer a number of questions
Why is the response so slow? (a very loaded question!)
Why is the retransmission rate so high?
Is traffic evenly distributed among network users, or are there source-destination pairs with unusually heavy traffic?
What is the percentage of each type of packet?
What are the channel utilization and throughput?
What is the effect of traffic load on utilization, throughput, and time delays?
When does traffic load start to degrade system performance?
What is the maximum capacity of the channel under normal operating conditions? How many active users are necessary to reach this maximum?
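
The question about unevenly distributed traffic can be approached by totaling bytes per source-destination pair and flagging pairs whose share of all traffic exceeds a threshold. The record layout, the function name, and the 50% threshold are hypothetical.

```python
from collections import defaultdict


def heavy_pairs(records, share_threshold):
    """Return {(src, dst): share} for pairs carrying more than the given share.

    `records` is an iterable of (source, destination, byte_count) tuples.
    """
    totals = defaultdict(int)
    for src, dst, nbytes in records:
        totals[(src, dst)] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {pair: nbytes / grand_total
            for pair, nbytes in totals.items()
            if nbytes / grand_total > share_threshold}


# Made-up measurements: host A sends most of its traffic to host B.
observed = [("A", "B", 700), ("A", "C", 100), ("B", "C", 100), ("A", "B", 100)]
flagged = heavy_pairs(observed, share_threshold=0.5)
print(flagged)
```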

Fault monitoring aims to detect faults as quickly as possible after they occur and to identify the cause of each fault so that corrective action may be taken.

Problems of Fault Monitoring

Fault Detection Problems
Unobservable faults: e.g., deadlock, device not monitorable
Partially observable faults: observations are insufficient to pinpoint the problem
Uncertainty in observation: it is not clear what the problem is

Fault Isolation Problems
Multiple potential causes
Too many related observations
Interference between diagnosis and local recovery procedures
Absence of automated testing tools

[Figure: Heterogeneous Network Environment, with 802.3 and 802.5 LANs, a client, a server, routers, MUXes, PBXs, and a T1 link]

[Figure: client, routers, MUXes, and server on an 802.3 network, annotated with a transmission break, a data link failure, a transport failure, and an application failure]

Logging
record important events and errors
logs should be accessible by managers (e.g.,
via polling)
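
A minimal sketch of an agent-side log that a manager can read by polling is given below. The `EventLog` class, its bounded capacity, and the sequence-number scheme are assumptions; the slides only require that events be recorded and accessible to managers.

```python
from collections import deque


class EventLog:
    """Hypothetical bounded log; the oldest entries are dropped when full."""

    def __init__(self, capacity):
        self._entries = deque(maxlen=capacity)
        self._next_seq = 0

    def record(self, severity, message):
        # Sequence numbers let a manager ask only for entries it has not seen.
        self._entries.append((self._next_seq, severity, message))
        self._next_seq += 1

    def read_since(self, seq):
        """Return entries with a sequence number >= seq (what a poll would fetch)."""
        return [entry for entry in self._entries if entry[0] >= seq]


log = EventLog(capacity=2)
log.record("info", "link up on port 1")
log.record("error", "CRC errors on port 3")
log.record("error", "link down on port 3")
print(log.read_since(0))
```

With a capacity of 2, the first entry has already been discarded by the time the manager polls, which shows why a manager has to poll often enough relative to the log size.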

Event Reporting
sending events and errors to managers
sending alarms to the manager to warn of possible problems

Diagnostic Functions
connectivity test (e.g., traceroute)
response-time test
liveness test (e.g., ping)
protocol integrity test
loopback test
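
A liveness test can be approximated without ICMP by attempting a TCP connection. Note that this is a substitute for ping rather than ping itself, since ping normally needs raw-socket privileges; the function name and the loopback demonstration are assumptions made for the sketch.

```python
import socket


def tcp_liveness(host, port, timeout_s=1.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Demonstration against a throwaway listener on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
alive_while_listening = tcp_liveness("127.0.0.1", demo_port)
listener.close()
print(alive_while_listening)
```

A result of False only says that this particular service port did not answer; the host itself may still be up.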

Keeping track of users' usage of network resources
communication facilities
computer hardware
software and systems
services

Usage may need to be broken down by account, by project, or by individual user for appropriate accounting purposes
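
Breaking usage down by account, project, or user is a grouping operation over usage records, as sketched below. The record fields (`user`, `project`, `account`, `bytes`) and the sample data are hypothetical.

```python
from collections import defaultdict


def usage_by(records, key):
    """Sum the 'bytes' field of each usage record, grouped by the given key."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["bytes"]
    return dict(totals)


# Made-up usage records as an agent might collect them.
usage = [
    {"user": "alice", "project": "web", "account": "eng", "bytes": 500},
    {"user": "bob", "project": "web", "account": "eng", "bytes": 300},
    {"user": "alice", "project": "batch", "account": "ops", "bytes": 200},
]
print(usage_by(usage, "user"))
print(usage_by(usage, "project"))
print(usage_by(usage, "account"))
```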

Network monitoring is the most basic aspect of NM
The purpose of network monitoring is to gather information about the status and behavior of network elements
Information to be gathered includes static, dynamic, and statistical information

Monitoring methods: polling and event reporting

Monitoring functions
performance monitoring
fault monitoring
accounting monitoring
