
NOKIA
NET/CO/PS/OSS/EMS/TRS
Company Internal Draft

Nokia Q1 Poller Software Design Specification
Version 1.01
14.01.2000

File: (L: refers to share \\TRSKPD03ES\vol1 in domain trs_es)



Prepared by: Antti Miettinen

Inspected: <>/<>

Approved: <>/<>

DOCUMENT TEMPLATE INFORMATION

MS-Word template information:
File name: TECH_SPE.DOT
Version: v 0.01
Date: 11.7.1995

Interleaf template used for MS-Word template: SWPD-117 v1.12 Proposal 03.07.1995

DOCUMENT VERSION HISTORY


Doc. Vers.  Date        Document Author/Modifier  Doc. Status  Notes
0.00        10.08.1998  Antti Miettinen           Draft        Copy from Q1PS PB TS
0.01        03.09.1998  Antti Miettinen           Draft        Windows feature eatpoint
0.02        16.09.1998  Antti Miettinen           Draft        For others to comment
0.03        21.09.1998  Antti Miettinen           Draft        More implementation stuff
0.04        24.09.1998  Antti Miettinen           Draft        Snapshot from clearcase
0.05        25.09.1998  Antti Miettinen           Draft        And more
0.06        25.09.1998  Antti Miettinen           Draft        And still more
0.07        07.03.1999  Antti Miettinen           Draft        Prepare for SW1
0.08        08.03.1999  Antti Miettinen           Draft        Minor tweaks
0.09        15.03.1999  Antti Miettinen           Draft        Update according to comments
1.0         16.03.1999  Antti Miettinen           Approved     Update after checking
1.01        14.01.2000  Antti Miettinen           Draft        Changes to match reality

Copyright Nokia Telecommunications Oy


TABLE OF CONTENTS
1 INTRODUCTION
  1.1 DOCUMENT DEFINITION
    1.1.1 Purpose
    1.1.2 Scope
    1.1.3 Readership
    1.1.4 Overview of Document
  1.2 ACKNOWLEDGEMENTS
  1.3 DEFINITIONS, TERMINOLOGY AND ABBREVIATIONS
  1.4 RELATED DOCUMENTS
  1.5 OVERVIEW OF THE COMPONENT
2 DESIGN CONSTRAINTS
  2.1 FUNCTIONALITY
    2.1.1 General
    2.1.2 Fault polling
    2.1.3 DCN monitoring
    2.1.4 Time management
    2.1.5 Other polling activities
  2.2 NOKIA Q1 COMMANDS
    2.2.1 General
    2.2.2 Polling command (-)
    2.2.3 Status command (0x01)
    2.2.4 Equipment structure command (0x05)
    2.2.5 Get Changed Elements command (0x0a)
    2.2.6 Fault Condition command (0x1?, 0x2?)
    2.2.7 Flush Event History menu command (k:13,1)
    2.2.8 Get Active Alarms menu command (k:13,2)
    2.2.9 Get New Events menu command (k:13,3)
    2.2.10 Set Time Counter command (0x06)
    2.2.11 Fast Poll command (0x0F)
  2.3 RUNTIME ENVIRONMENTS
    2.3.1 General
    2.3.2 NMS/10 MF
    2.3.3 NMS/10 SR
    2.3.4 Modem Management Adapter
    2.3.5 Nokia Base Station Controller
    2.3.6 Nokia Base Stations
    2.3.7 MetroHub
    2.3.8 AXC
    2.3.9 RNC
    2.3.10 IMC
    2.3.11 Successor for TMS Adaptor
    2.3.12 Local management tools
3 ARCHITECTURE
  3.1 DESIGN CRITERIA
  3.2 SOFTWARE ARCHITECTURE
    3.2.1 General
    3.2.2 Functional model
    3.2.3 Static model
    3.2.4 Dynamic model
4 IMPLEMENTATION
  4.1 GENERAL
  4.2 Q1MASTERCOM IMPLEMENTATION
  4.3 CHOICES FOR DISTRIBUTING DATA AND FUNCTIONALITY
  4.4 FAULT DATABASE
  4.5 FAULT DATABASE QUERIES AND NOTIFICATIONS
  4.6 FILTERING AND SEVERITY ASSIGNMENT
  4.7 CONNECTION STATE
  4.8 DCN STATISTICS
  4.9 ELEMENT QUERIES
5 MODULE TEST PLAN
  5.1 GENERAL
  5.2 UNSIGNED 64 BIT INTEGER ARITHMETIC
  5.3 TIME CONVERSION
  5.4 LINKED LISTS
  5.5 BIT VECTOR
  5.6 RED-BLACK TREE
  5.7 PRIORITY QUEUE
  5.8 TARGET FRACTION QUEUE
  5.9 FIXED SIZE BLOCK ALLOCATOR
  5.10 Q1 DATALINK PROTOCOL
  5.11 Q1 COMMANDS
  5.12 Q1 COMMAND EXECUTION
  5.13 Q1 COMMAND QUEUE
  5.14 Q1 COMMAND SCHEDULING
6 SOFTWARE COMPONENT INTERFACES
  6.1 GENERAL
  6.2 CONFIGURATION INTERFACE
  6.3 ALIVE CHECK INTERFACE
  6.4 SHUTDOWN INTERFACE
  6.5 Q1 COMMAND PASS THROUGH INTERFACE
  6.6 FAULT STATUS QUERY INTERFACE
  6.7 FAULT STATUS CHANGE NOTIFICATION INTERFACE
  6.8 ELEMENT PRESENCE REPORTING
7 ERROR HANDLING
8 OTHER TECHNICAL SOLUTIONS
A ABOUT NOTATIONS
B MULTIPLE INHERITANCE IN C


1 INTRODUCTION
1.1 Document Definition
1.1.1 Purpose

This document describes the Nokia Q1 Poller software component. Nokia Q1 Poller is a reusable software component which reuses the Q1MasterCom core for Nokia Q1 master end protocol functionality and extends it for fault polling and DCN monitoring. This document specifies the interfaces to the services offered by Nokia Q1 Poller as well as the internal design and module structure of the component.

1.1.2 Scope

This document describes several possible designs for Nokia Q1 Poller. The focus is on the functionality which requires tight coupling with the Q1 protocol stack operation and the Q1 bus communication interface. The actually implemented variation of the designs is covered in detail, but for future reference other possibilities for decomposing the polling functionality are also described. This document should cover every solution taken between the external constraints set on the component (for example by the Nokia Q1 protocol specifications and product requirements) and the actual C program code, and document the criteria used for the module/object decomposition. This document also contains the information necessary for testing the modules of this software component. The interfaces to the services offered by this component and the use of services offered by other components are specified in detail. This document serves as a Program Block Technical Specification for the NMS/10 MF C2.0 project.

1.1.3 Readership

Software engineers or project managers who either want to reuse the design of this software component or evaluate the amount of work needed to reuse it. This document should also be usable when modifying, debugging and testing the Nokia Q1 Poller implementation.

1.1.4 Overview of Document

This document covers the following areas:
- desired functionality of the software component
- constraints for the design set by protocol specifications
- implementation constraints in several runtime environments
- alternatives for distributing the functionality


For the design chosen for implementation the following aspects are covered:
- functional model which describes the computations and data flows in the software
- the design object model which specifies the static structure of the software
- scenarios which document the dynamic behaviour of the software
- state charts of the relevant objects
- module structure and test plan
- component interfaces

The description of the design is informal. When possible, diagrams use UML-notation. OMT-style data flow diagrams are also used.

1.2 Acknowledgements
Thanks to the following people for providing valuable information during the construction of this document:
- Juha Matturi (NET/RAS)
- Jyrki Ylihonkaluoma (NET/RAS)
- Juha Pajuvirta (NET/NWS/SWP)
- Janne Rand (NET/RAS)
- Jarek Krol (NET/RAS)
- Juha Liukko (NET/CO/PS/OSS/EMS/TRS)

1.3 Definitions, Terminology and Abbreviations


See [6].

1.4 Related Documents


[1] Nokia Q1 Protocol Description
    Version:  3.01/1
    Author:   Riku Linnanen, Antti Miettinen
    Location: Q1 Forum (LN/WWW)
    Format:   MS Word
    Number:   DN9759533

[2] Nokia Q1 Fault Management (Functional Specification)
    Version:  3.02/3
    Author:   Ilkka Haukkavaara, Jyrki Ylihonkaluoma
    Location: Q1 Forum (LN/WWW)
    Format:   MS Word
    Number:   DN9759615

[3] Programming Language C
    Version:  ISO/IEC 9899:1990
    Author:   ISO/IEC
    Location: NRC IHS WWS
    Format:   Bitmaps

[4] Q1MasterCom Reusable SW Component Databook
    Version:  1.3d
    Author:   Antti Miettinen
    Location: http://laurel.trs.ntc.nokia.com/Q1MCOM/q1mc1397.doc
    Format:   Word 97

[5] Q1 Equipment Studies
    Version:  0.04
    Author:   Marko Kohtala, Antti Miettinen
    Location: \\trskpd03es\vol1\depments\tmn\TechInfo\Q1\Q1study0.04.doc
    Format:   Word 97

[6] NMS/10 MF Terms
    Version:  1.5
    Author:   Various
    Location: \\trskpd03es\vol1\program\tmn\kpd\nms10xx\C50\projects\KPD R&D\MF C20\Miscellaneous\Terminology-1.4d.doc
    Format:   Word 97

1.5 Overview of the component


The Nokia Q1 Protocol is a master/slave half-duplex protocol used for managing elements that are organised into serial communication buses (see Figure 1). Each Q1 bus is controlled by one master and the elements on the bus operate in slave mode. The slaves on the bus cannot initiate any kind of data transfer. The master must periodically poll the elements for any data to flow from the elements to a network management system. Therefore it is almost always necessary to have some kind of polling functionality associated with the entity which is acting as the bus master.
Figure 1 Q1 management bus

Nokia Q1 Poller is an ANSI/ISO C software component which offers services for:
- monitoring the connection status of the elements it is configured to monitor
- maintaining local fault status information of the monitored elements
- distributing notifications of changes in element connection status and fault status
- synchronising the real time clock of Nokia Q1 (E generation) elements
- monitoring the Q1 bus for element existence (autodiscovery)


The poller implementation is intended to be reusable in different hardware and operating system environments. For Nokia Q1 master end protocol functionality, the Nokia Q1 Poller relies on the services offered by the Q1MasterCom software component (see [4]).
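The services offered by the component are specified in chapter 6. Purely as an informal illustration, the following sketch shows what a client visible poller interface could look like in C; the names and types are hypothetical and do not correspond to the actual interfaces of the component.

/* Illustrative sketch only: these names and types are hypothetical and do
 * not correspond to the interfaces specified in chapter 6. */

typedef unsigned short q1_address;       /* 1..4095, 4095 = broadcast */

typedef enum {
    Q1P_FAULT_ACTIVATED,
    Q1P_FAULT_DEACTIVATED,
    Q1P_CONNECTION_LOST,
    Q1P_CONNECTION_RESTORED,
    Q1P_ELEMENT_DISCOVERED
} q1p_notification_type;

typedef struct {
    q1p_notification_type type;
    q1_address            address;  /* element address on the Q1 bus */
    unsigned char         fe;       /* functional entity number      */
    unsigned char         sb;       /* supervision block number      */
    unsigned char         fc;       /* fault code                    */
} q1p_notification;

/* Client supplied callback for fault and connection status changes. */
typedef void (*q1p_notify_fn)(const q1p_notification *note, void *client);

/* Hypothetical service entry points of the poller component. */
int q1p_add_monitored_address(q1_address address);
int q1p_register_notification_callback(q1p_notify_fn fn, void *client);
int q1p_get_fault_status(q1_address address);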


2 DESIGN CONSTRAINTS
2.1 Functionality
2.1.1 General

This section describes the functionality that is usually required from a Nokia Q1 poller implementation.

2.1.2 Fault polling

The basic functionality required from a Nokia Q1 poller is fault polling. The fault polling can be implemented in various ways. However, in any complete fault management system the following functionality is usually required:

1. Mirroring the fault status of the monitored elements

The managed elements maintain current fault status information, but it is usually desirable to concentrate and cache this information in order to provide faster access to the up-to-date fault status of the managed network. This kind of caching often happens on several levels in a complete network management system. For example NMS/10 MF mirrors the fault status of the network elements connected to it and NMS/100 in turn mirrors the fault status stored in several MFs. The fault status mirroring should support filtering. This is especially important for D/ND generation elements, as they have no internal filtering capabilities. It can also be desirable to control the assignment of fault severity for faults obtained from D/ND generation elements, as D/ND generation fault management commands do not provide a severity attribute.

2. Generating change notifications about fault status changes

In addition to having a real-time view of the current fault status of the monitored network, it is often desirable to have a history log of fault status changes. Therefore the fault polling should generate notifications of any changes detected in the fault status of the monitored elements. These notifications can also be utilised in maintaining an upper level fault status cache.

Note: it turns out that for robust change notification generation, fault status mirroring is necessary, i.e. to provide item 2, item 1 is needed.
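As an informal illustration of requirements 1 and 2, the following sketch shows a minimal fault status mirror in C. All names, the fixed size array and the linear search are assumptions of the sketch only; the data structures actually used are discussed in chapter 4.

/* Minimal sketch, assuming a fault is identified by (address, FE, SB, FC)
 * as in the Nokia Q1 fault model.  Names and the fixed size array are
 * illustrative only. */
#include <stddef.h>

typedef struct {
    unsigned short address;   /* Q1 element address            */
    unsigned char  fe;        /* functional entity number      */
    unsigned char  sb;        /* supervision block number      */
    unsigned char  fc;        /* fault code                    */
} fault_id;

#define MAX_FAULTS 1024

static fault_id active_faults[MAX_FAULTS];   /* mirrored fault status */
static size_t   active_count = 0;

int mirror_contains(const fault_id *f)
{
    size_t i;

    for (i = 0; i < active_count; i++) {
        if (active_faults[i].address == f->address &&
            active_faults[i].fe == f->fe &&
            active_faults[i].sb == f->sb &&
            active_faults[i].fc == f->fc)
            return 1;
    }
    return 0;
}

/* Requirement 2 builds on requirement 1: an activation notification is
 * generated only when the mirrored status actually changes. */
void mirror_fault_activated(const fault_id *f)
{
    if (!mirror_contains(f) && active_count < MAX_FAULTS) {
        active_faults[active_count++] = *f;
        /* emit a fault activation notification to the clients here */
    }
}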

2.1.3 DCN monitoring

In addition to providing information about the faults reported by the monitored elements, the Q1 poller should indicate DCN failures. Minimally, if an element does not answer, a loss of management connection should be indicated.


The quality of the Q1 bus should also be monitored. The Q1 master should maintain counters of the following occurrences:
- character parity errors
- character framing errors
- serial interface overrun errors
- number of transmitted characters
- number of received characters
- Q1 datalink packet parity errors
- Q1 protocol violations
- malformed Q1 datalink packets
- packet size overflows
- extra data after zero continue-bit and before transmission of next command packet
- command specific protocol violations
- number of empty data transfer answer packets
- number of transmitted Q1 datalink packets
- number of received Q1 datalink packets
- number of serial interface transmit errors
- number of serial interface reception errors

Note that all of these may not be meaningful in all environments.

The counters can be maintained either globally, or several counters can be set up. Choosing the entity to which a counter is to be associated depends on the nature of the counter. For example the number of character parity errors probably depends most on the bus (even though element specific variation is possible), so it is sensible to associate this kind of counter with each bus. In principle, the more fine-grained the statistics are, the better the system can be diagnosed, but storage considerations may prohibit the use of maximally specific counters. The statistics shown to the user of the system should be configurable, but this information can naturally be a subset of the information actually collected.

It is also desirable to monitor the performance of the DCN traffic. For this purpose the element response times can be monitored. Averages, minimums and maximums for the following times can be maintained either globally, per management bus or for individual addresses:
- element response time, i.e. the time from command packet transmission end to answer packet reception start
- inter-character delay, i.e. the time between individual characters in slave response packets
- total command execution times (these should probably be monitored for some specific commands, or the measure should be a time related to some attributes of the command, as the execution time depends very much on the command)
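The following sketch illustrates how such statistics could be grouped per management bus. The structure fields are hypothetical names for a subset of the counters and times listed above; the actual set and placement of counters is an implementation choice.

/* Illustrative sketch of per-bus DCN statistics; field names are
 * hypothetical and cover only a subset of the counters listed above. */
typedef struct {
    unsigned long tx_chars;             /* transmitted characters           */
    unsigned long rx_chars;             /* received characters              */
    unsigned long char_parity_errors;   /* character parity errors          */
    unsigned long char_framing_errors;  /* character framing errors         */
    unsigned long overrun_errors;       /* serial interface overrun errors  */
    unsigned long pkt_parity_errors;    /* Q1 datalink packet parity errors */
    unsigned long protocol_violations;  /* Q1 protocol violations           */

    unsigned long resp_min_ms;          /* element response time, minimum   */
    unsigned long resp_max_ms;          /* element response time, maximum   */
    unsigned long resp_sum_ms;          /* sum of samples, for the average  */
    unsigned long resp_samples;         /* number of measured responses     */
} q1_bus_stats;

/* Record one measured element response time (milliseconds). */
void q1_stats_add_response_time(q1_bus_stats *s, unsigned long ms)
{
    if (s->resp_samples == 0 || ms < s->resp_min_ms)
        s->resp_min_ms = ms;
    if (ms > s->resp_max_ms)
        s->resp_max_ms = ms;
    s->resp_sum_ms += ms;
    s->resp_samples++;
}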

2.1.4 Time management

Nokia Q1 (E generation) elements provide timestamped information. Therefore it is necessary to keep the real-time clock of the elements in sync with the management system. The clock of monitored elements should be set in the following situations:

1. Poller start-up

Upon system start-up the state of the clocks of the elements is unknown. To ensure valid timestamp information, the clocks of the elements should be synchronised before using any commands which obtain timestamped information.

2. Node reset/reconnection detection

If an element has failed to answer, it might have lost its real time clock.

3. Periodically

In order to prevent drifting, the clocks of the elements need to be refreshed periodically.

2.1.5 Other polling activities

In some environments the following polling activities might also be needed:

1. Autodiscovery

When a new element is connected to a Q1 management bus it is useful to get a notification about this to the central management system. When a new element is discovered, it is usually desirable to get some identification information about the found element. (A simple address sweep is sketched at the end of this section.)

2. Configuration change detection

If element configuration is changed (for example via a local management port) it is useful to get a notification about this to the central management system. E generation elements provide warning events about configuration changes, but as events can get lost, it might be desirable to poll the configuration checksums of some elements. This functionality is probably most conveniently implemented with a component which can be loosely coupled with the protocol stack and fault polling. Also the mechanism for detecting configuration changes in D/ND generation elements is highly element specific.

3. Uplink data polling

Some elements might want to report uplink data via an element specific mechanism. For example the CATS base station (VTGA unit) reports MMI uplink messages in response to a Fast Poll command. Therefore it is necessary to poll VTGA periodically with a Fast Poll command. Whether the implementation needs to be tightly coupled with the protocol stack depends on the timing constraints for the functionality.

4. Network test

Even though the fault polling provides information about DCN status, it is sometimes desirable to test the Q1 buses extensively. Usually this kind of functionality can be implemented with a component which can be loosely coupled with the protocol stack, provided that the Q1 command interface supports specifying control parameters for the command execution and provides enough information about the command execution status.

5. PM data collection

This document does not discuss PM data collection. PM data collection is usually most conveniently implemented with a component which can be loosely coupled with the protocol stack and fault polling.
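The following sketch illustrates the address sweep mentioned under item 1 (Autodiscovery). The helper functions, the bit vector of already configured addresses and the assumed address range are illustrative only; a real implementation would also fetch identification information for newly found elements.

/* Minimal autodiscovery sketch.  q1_poll_address() and
 * report_element_found() are hypothetical helpers: the first sends a
 * Polling command and returns non-zero when the address answers, the
 * second reports a newly found element to the management system. */

#define Q1_FIRST_ADDRESS 1
#define Q1_LAST_ADDRESS  4094   /* 4095 is the broadcast address */

extern int  q1_poll_address(unsigned short address);        /* hypothetical */
extern void report_element_found(unsigned short address);   /* hypothetical */

/* "known" is a bit vector with one bit per address: addresses that are
 * already configured for monitoring are skipped, every other address is
 * probed once per sweep. */
void q1_autodiscovery_sweep(const unsigned char *known)
{
    unsigned short addr;

    for (addr = Q1_FIRST_ADDRESS; addr <= Q1_LAST_ADDRESS; addr++) {
        int configured = (known[addr >> 3] >> (addr & 7)) & 1;

        if (!configured && q1_poll_address(addr))
            report_element_found(addr);
    }
}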

2.2 Nokia Q1 Commands


2.2.1 General

This section describes the Nokia Q1 commands that are relevant to a Nokia Q1 Poller implementation. The commands are described from the point of view of the data that they provide, the effect that they have on the target element, and the way the provided data can be used in a poller implementation.

2.2.2 Polling command (-)

The Polling command is targeted to an address. In addition to address specific connection status, it provides the element level status bits of the target element. Most notably, it provides the X-bit information of the whole element. The X-bit is one whenever any of the functional entities of the target element have unread fault status changes. A fault status change is considered unread until it has been reported in response to a Clearing Fault Condition command (0x1?) or a Get New Events menu command (k:13,3). The X-bit of a functional entity is also cleared by a Flush Event History menu command (k:13,1).

2.2.3 Status command (0x01)

The Status command is targeted to an address. In addition to address specific connection status, it provides the status bits of the functional entities of the element. Most notably, it provides the X-bit information of the individual FEs of the element. Indirectly it also provides the present functional entities of the target element.

2.2.4 Equipment structure command (0x05)

The Equipment structure command is targeted to an address. In addition to address specific connection status, it provides the number of functional entities present at the address. This command is next to useless for E generation elements as it actually provides only a number which equals one plus the largest FE number in the address.


2.2.5 Get Changed Elements command (0x0a)

The Get Changed Elements command is targeted to an address. In addition to address specific connection status, it provides a list of functional entities that have unread fault status changes (which is equivalent to having the X-bit on). This command is supported by E generation elements.

2.2.6 Fault Condition command (0x1?, 0x2?)

The Fault Condition command is targeted to a functional entity. Depending on the node architecture, it provides either address specific or FE specific connection status information. The command has two variations: a clearing and a non-clearing version. Both variations provide a list of SB, FC pairs, i.e. a list of faults. The faults are tagged with an ON/OFF bit indicating whether the fault is currently active or inactive. The non-clearing Fault Condition command reports only the active faults (i.e. the ON/OFF bit is always one in the answers).

Despite the fact that the clearing variation is probably intended to provide information about fault deactivations as well as fault activations, the ON/OFF information is of little use in a robust Nokia Q1 Poller implementation. As the DCN is not error free, the Fault Condition command can fail. It is in general impossible for the slave to know whether the master has successfully received the answer. Therefore it is possible that an element clears its X-bit even when the master has not successfully received the answer. If the Fault Condition command fails, the element may have considered some fault deactivations reported, and therefore the master cannot rely on the Fault Condition command answers to provide fault deactivation information.

Both variations of the command can be used for updating the locally cached fault status information and for generating notifications about fault status changes. Upon successful answer reception the answer data and the local fault status information are compared, and the following operations are performed:
- if a fault is present in the answer with the ON/OFF bit set to one and missing from the local fault database, the fault is added to the local fault database and a fault activation notification is generated
- if a fault is present in the local fault database and missing from the answer or reported as off, the fault is cleared from the local fault database and a fault deactivation notification is generated
- if a fault is reported as off in the answer and it is not present in the local fault database, a fault disturbance notification is generated

The master has to assume that it may have cleared the X-bit of the target functional entity whenever it has attempted the Clearing Fault Condition command. In addition to the fault list, the commands provide the FE status byte of the target functional entity and a 16-bit time counter value, but these are of little interest.
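The following sketch illustrates the three update rules listed above. It assumes that the answer has already been parsed into a list of SB, FC, ON/OFF entries; the fault database and notification helpers are hypothetical and only indicate where the real component would act.

/* Sketch of the three update rules for a Fault Condition answer.  The
 * fault_db_*() and notify_*() functions are hypothetical helpers over the
 * local fault database and the notification mechanism. */
typedef struct {
    unsigned char sb;   /* supervision block number     */
    unsigned char fc;   /* fault code                   */
    int           on;   /* ON/OFF bit from the answer   */
} fc_entry;

extern int  fault_db_contains(unsigned short addr, unsigned char fe,
                              unsigned char sb, unsigned char fc);
extern void fault_db_add(unsigned short addr, unsigned char fe,
                         unsigned char sb, unsigned char fc);
extern void fault_db_remove(unsigned short addr, unsigned char fe,
                            unsigned char sb, unsigned char fc);
extern void fault_db_remove_missing(unsigned short addr, unsigned char fe,
                                    const fc_entry *ans, int n);
extern void notify_activation(unsigned short addr, unsigned char fe,
                              unsigned char sb, unsigned char fc);
extern void notify_deactivation(unsigned short addr, unsigned char fe,
                                unsigned char sb, unsigned char fc);
extern void notify_disturbance(unsigned short addr, unsigned char fe,
                               unsigned char sb, unsigned char fc);

void process_fault_condition_answer(unsigned short addr, unsigned char fe,
                                    const fc_entry *ans, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        int known = fault_db_contains(addr, fe, ans[i].sb, ans[i].fc);

        if (ans[i].on && !known) {
            /* rule 1: new active fault */
            fault_db_add(addr, fe, ans[i].sb, ans[i].fc);
            notify_activation(addr, fe, ans[i].sb, ans[i].fc);
        } else if (!ans[i].on && known) {
            /* rule 2, reported as off: clear and notify deactivation */
            fault_db_remove(addr, fe, ans[i].sb, ans[i].fc);
            notify_deactivation(addr, fe, ans[i].sb, ans[i].fc);
        } else if (!ans[i].on && !known) {
            /* rule 3: short-lived fault, report a disturbance */
            notify_disturbance(addr, fe, ans[i].sb, ans[i].fc);
        }
    }

    /* rule 2, missing from the answer: faults of this FE that are in the
     * database but absent from the answer are cleared; the helper is
     * expected to generate the deactivation notifications as well. */
    fault_db_remove_missing(addr, fe, ans, n);
}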


2.2.7 Flush Event History menu command (k:13,1)

The Flush Event History menu command is executed as a Data Transfer command and is therefore targeted to a functional entity. Depending on the node architecture, it provides either address specific or FE specific connection status information. This command flushes the event history buffer supported by E generation elements and clears the X-bit of the target FE. As it is in general impossible for the slave to know whether the master has successfully received the command answer, the element may have flushed its event history and cleared its X-bit even if no answer to this command is received. The master has to assume that it may have cleared the X-bit of the target FE whenever this command has been attempted. This command also resets the Get New Events sequence number to zero.

2.2.8 Get Active Alarms menu command (k:13,2)

The Get Active Alarms menu command is executed as a Data Transfer command and is therefore targeted to a functional entity. Depending on the node architecture, it provides either address specific or FE specific connection status information. This command is used to query the current fault status of an E generation element. The answer contains a list of active faults and their activation times. The answer data is used in a way similar to the handling of the answer data of a Fault Condition command:
- if a fault is present in the answer and missing from the local fault database, the fault is added to the local fault database and a fault activation notification is generated
- if a fault is present in the local fault database and missing from the answer, the fault is cleared from the local fault database and a fault deactivation notification is generated

This command has no effect on the X-bit or the event history buffer.

2.2.9 Get New Events menu command (k:13,3)

The Get New Events menu command is executed as a Data Transfer command and is therefore targeted to a functional entity. Depending on the node architecture, it provides either address specific or FE specific connection status information. This command is used to read the event history buffer maintained by E generation elements. The event history buffer acts as a FIFO structure which stores fault status changes and other notifications. The answer data can be used directly in updating the local fault status information (for exception handling, see chapter 8):
- if a fault activation is reported, the fault is added to the local fault database (the fault database should not contain the associated fault)
- if a fault deactivation is reported, the fault is removed from the local fault database (the fault database should contain the associated fault)

This command clears the X-bit of the target FE.


In addition to fault status changes, the event history buffer can contain other events. A disturbance indicates that some condition has been active for a short period. A warning indicates the occurrence of some event. These event types cause no changes to the local fault status information, but they should be forwarded just as alarms and cancels are.

As it is in general impossible for the slave to know whether the master has successfully received the command answer, it is possible that this command removes data from the FIFO even when the master fails to receive the answer. This has the following implications:
- retries cannot be used with this command
- the master has to assume that it may have cleared the X-bit whenever this command has been attempted

For detecting lost events, the answer has a 16-bit sequence number. This sequence number is incremented by the element whenever an answer to this command is formed. If the sequence number does not increase by one (modulo 65536) between answers, some fault status changes may have been lost and the local fault status information must be synchronised by querying the element for the current fault status information.

2.2.10 Set Time Counter command (0x06)

The Set Time Counter command is targeted to an element. As the command is not replied to, it does not directly provide any information. Normally this command is sent to the broadcast address (4095) in order to synchronise the clocks of all E generation elements at the same time.

2.2.11 Fast Poll command (0x0F)

The Fast Poll command is targeted to an element. It provides element specific connection status information. This command can be used as a PDU in an upper level protocol which uses Q1 essentially as a datagram service. The intended use for this command is uplink data polling, i.e. it can be used to poll the target slave for spontaneous uplink messages. Both the command packet and the answer packet can contain user data.
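The following sketch illustrates the lost event detection of the Get New Events command (section 2.2.9). Only the sequence number arithmetic is shown; the function names are illustrative.

/* Lost-event check for the Get New Events answer: the 16-bit sequence
 * number must increase by exactly one (modulo 65536) between consecutive
 * answers; anything else means events may have been lost and the local
 * fault status must be resynchronised, for example with the Get Active
 * Alarms menu command (section 2.2.8). */
#include <stdio.h>

/* Returns 1 when events may have been lost between two answers. */
int events_possibly_lost(unsigned int previous_seq, unsigned int current_seq)
{
    unsigned int expected = (previous_seq + 1u) & 0xFFFFu;
    return current_seq != expected;
}

int main(void)
{
    /* Wrap-around is handled by the modulo arithmetic: 65535 -> 0 is ok. */
    printf("%d\n", events_possibly_lost(65535u, 0u));   /* prints 0 */
    printf("%d\n", events_possibly_lost(100u, 102u));   /* prints 1 */
    return 0;
}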

2.3 Runtime environments


2.3.1 General

There are several hardware and software environments where Q1 polling functionality is required. This chapter discusses environments which are under active development, and possible future environments where Q1 polling functionality could be required.

2.3.2 NMS/10 MF

NMS/10 MF C1.0 is an existing system whose main purpose is to provide a replacement for TMC. NMS/10 MF C1.0 hardware is currently sold to Nokia customers, which means that future software should be able to run on the hardware. NMS/10 MF C1.0 hardware consists of an industrial PC with a passive ISA backplane, a Pentium CPU card, an Ethernet interface card, an intelligent serial multiport card (SIO386) and two Q1 bus interface cards. The operating system used in the PC is Windows NT (version 4.0, service pack 3).

SIO386 is equipped with a 20MHz Intel i386EX CPU, 8 megabytes of RAM and 16 serial communication channels. The ISA-bus interface is implemented with FIFO chips, which are interfaced via I/O ports on the PC side. The operating system for SIO386 is WosNuc, a proprietary real-time operating system from ASPO Systems.

The software interface from the NT PC to SIO386 is implemented by a dynamic link library, a service process and a kernel mode device driver. The DLL offers a subset of the WosNuc operating system calls for processes running on the NT PC. Particularly, it offers a set of asynchronous messaging primitives for inter-process and inter-processor communication.

The Q1 bus interface cards act mainly as V.11 drivers. Each Q1 bus interface card provides eight protected V.11 interfaces. The TTL-level serial channel signal from SIO386 is connected to the interface cards and the choice between the two interfaces per serial channel is switchable by software (see Figure 2).
Figure 2 Serial channel with doubled drivers

The Ethernet interface card provides the interface between NMS/10 MF and management systems. For NMS/10 MF C1.0, the intended client systems are NMS/10 SR and NMS/100. For NMS/10 SR the provided interface is a subset of the TMC Node Manager Server interface, which is available via a TCP connection. For NMS/100 the provided interface is a subset of the TMC MML port interface, which is available via a TCP connection, and a TMC Alarm Printer emulation interface, which is available via a TCP connection. NMS/10 MF C2.0 also provides an SNMP interface for fault management.

In this environment it is sensible to implement quite a lot of the polling functionality on the serial communication board. The memory capacity is large enough for implementing the Q1 protocol and fault polling functionality completely on SIO386. The board can also store a quite large fault database. The aspects pushing functionality to the NT PC side include the fact that the client systems interface the NT PC side of the system. Also software development and debugging are more straightforward for software running on the NT PC side.

Even though the NMS/10 MF C1.0 hardware needs to be supported by future software releases, there is pressure for redesigning the hardware. One problem, which is prominent especially in small networks, is the fact that the hardware is not very compact. The industrial PC occupies a whole sub-rack, and most of this space is unused in the C1.0 release. The hardware was developed to be able to host a maximum of four SIO386 cards and eight Q1 bus interface cards for a total maximum of 64 protected Q1 buses. In many situations this kind of massive central polling system is not required. Also the ISA bus is currently being phased out (at least for desktop systems), which may cause problems in hardware component and support software availability in the future. Therefore it is desirable to target commercially available serial multiport cards in addition to supporting the NMS/10 MF C1.0 hardware.

2.3.3 NMS/10 SR

The SIO386 hardware is also used in NMS/10 SR when there is no TMC or NMS/10 MF under it. The subset of MF software used in the NMS/10 SR environment is called PDH Polling SW. The associated PDH Polling hardware consists of the SIO386 card and one Q1 bus interface card. In this environment there is additional pressure for implementing the polling functionality as much as possible on the serial board. As the software is running on a desktop system which might be in interactive use, the polling should require as little system resources (CPU, memory) as possible.

However, for NMS/10 SR the ISA-bus interface of the SIO386 is an even more severe problem than for NMS/10 MF. Desktop PCs are migrating to the PCI bus and a functional SIO386 configuration requires two free ISA slots. Therefore it is highly desirable to target commercially available PCI serial boards. For small systems, the two serial ports commonly available on PC motherboards should also be usable.

There is a multitude of commercially available serial boards. For a Q1 poller operating in an interactively used system, the feature requiring special attention is the CPU consumption of Q1 polling. There are boards whose design is similar to SIO386, i.e. there is a CPU and some memory present on the board and it is possible to write software which is to run on the serial board. The advantage of this kind of board is that part of the Q1 polling functionality can be moved off the main CPU. Depending on available resources, a varying proportion of the functionality can be implemented on the serial board. The Q1 poller implementation should be flexible enough to be reusable in these kinds of varying conditions. Even though such intelligent serial boards have the advantage of low main CPU consumption, there are, however, disadvantages:
- this kind of board is expensive
- there are no standards for writing generic software for different serial boards

If the board manufacturer provides a generic serial port device driver, it is possible to support this kind of intelligent board with a generic implementation running on the host CPU, but then the advantage of the special purpose CPU is largely lost. For CPU-less boards, the board manufacturer should provide a generic serial port device driver, and it should be possible to support this kind of board with a generic implementation running on the host CPU. The CPU consumption of CPU-less serial boards depends on the way interrupts are managed and on the implementation of the device driver. An unfortunate fact of Q1 fault polling is that the most common operation, i.e. executing the Polling command, requires going through the full I/O path for each character while polling short addresses which have no faults. In this kind of operation, for example, the FIFOs commonly present on UART chips have little effect and there is in general very little that can be done to help the situation.

One important consideration for a poller implementation running in the NMS/10 SR environment is the need to co-operate with Q1CS/GCS. Q1CS/GCS offers a COM interface for NMS/10 Node Manager applications for sending Q1 commands to elements accessed via various DCN arrangements. As the elements the Q1 poller is monitoring should also be available for management via Q1CS/GCS, co-operation is necessary for controlled access to the Q1 bus interfaces. One possible solution is to implement the Q1 poller as a separate application using the COM interface offered by Q1CS/GCS. This would, however, induce unnecessarily large overhead for all executed commands. A more efficient arrangement is to integrate the Q1 poller into the protocol stack component used by Q1CS. It is also possible to give the local serial ports to poller control and to offer access to Q1CS via a piping interface.

2.3.4 Modem Management Adapter

The Modem Management Adapter (MoMA) is a plug-in unit acting as an SNMP mediator for a Q1 managed modem pool. The unit has a Motorola MC68EN302 CPU, two megabytes of RAM and one megabyte of FLASH memory. The modems to be monitored can be connected to four serial channels and the maximum number of monitored elements under one unit is 100. (Even though the unit has four serial channels, which can be polled in parallel, the elements are mapped to one logical Q1 management bus. The elements under the current MoMA release are always ACL2 V.35 modems and DNT2M network terminals.) The software interface to the operating system services on the unit is the RAS/CT standard COSI. The fault information maintained by MoMA is available via SNMP and command line interfaces. For the fault attributes the following information is obtained via Q1 command execution:
- element address
- functional entity number
- supervision block number
- fault code
- software identification (obtained with the "m:4,6" menu command)
- hardware identification (obtained with the "m:4,5" menu command)

- equipment type (obtained with the Equipment Identification command)
- supervision block identification (obtained with the Supervision Block Identification command)

2.3.5 Nokia Base Station Controller

Q1 managed elements are common in fixed transmission networks, but Q1 managed network elements are also used in cellular networks. In cellular networks the Q1 managed elements can be monitored either by a base station controller or a base station. In DX 200 BSC the monitoring of Q1 managed equipment is implemented by software distributed on different CPUs (the information here applies to the BSC S8 release). The AS7-U plug-in unit is an Intel 80286 equipped pre-processor card running without any actual operating system, and in the context of Q1 management it handles the Q1 datalink layer. The unit has 256K of memory reserved for program code. Newer AS7-V/AS7-A units are equipped with an i486 CPU and 4M of RAM. These units also run without an operating system. A maximum of 18 Q1 service channels (buses) can be configured to a BSC and the maximum number of monitored elements is 512. The actual polling is implemented by software running on the host CPU of the Operation and Maintenance Unit (OMU). The hardware interface between the host CPU and the pre-processor unit is implemented with DPRAM buffers. The Q1 monitoring is a secondary activity for the BSC, so it is very important that the polling does not disturb other activities. For the fault attributes the following information is obtained via Q1 command execution:
- Q1 channel (bus) number
- address
- functional entity number
- supervision block number

2.3.6 Nokia Base Stations

Q1 managed elements located at a Base Station can also be supervised by the base station itself. The Base Control Function Unit of 2nd generation, Talk-family, PrimeSite, MetroSite and UltraSite base stations can monitor Q1 elements connected to a single management bus. The number of BTS monitored elements is typically low (i.e. it can be for example one). The maximum number of monitored elements in MetroSite (CATS) base stations is 255. As in the case of the BSC, the Q1 monitoring is a secondary activity for the BTS. It is very important that the polling does not disturb other activities. The CPU in all of the above BCF units is an Intel i960. The operating system used is a Nokia proprietary operating system where processes run in a shared address space. The CATS base station uses Q1MasterCom as the Q1 protocol component.

The Q1 fault information is sent from the BTS to the BSC. For the fault attributes the following information is obtained from the managed elements (this information applies to the CATS base station):
- address
- functional entity number
- supervision block number
- fault code
- NE identification string
- FE identification string
- SB identification string

For D/ND generation elements packet level commands are used to obtain the identification strings. For E generation elements standard menu commands are used.

InSite is a small base station intended for indoor use. It employs integrated transmission, but the software interface between the InSite BTS part and the transmission part uses the same design as the traditional BTS/transmission combination, i.e. the Nokia Q1 protocol is used between the two parts.

2.3.7 MetroHub

MetroHub is a RAS/CT product targeted mainly for cellular networks. It is a flexible cross connect (used e.g. like DN2). The units in MetroHub have a Motorola MPC860 CPU and the interface to operating system services is the RAS/CT standard COSI. As the MetroHub takes part in the PETS autoconfiguration, which employs IP DCN, it would be sensible to include Q1 polling functionality in MetroHub instead of building yet another DCN for Q1 and using the BSC for fault polling. In this kind of arrangement the MetroHub would be in a position analogous to a base station. InHub is a small transmission node intended for indoor use. It could potentially use Nokia Q1 Poller for monitoring nearby FlexiHopper/MetroHopper nodes.

2.3.8 AXC

AXC is an ATM cross connect unit located in every base station of 3rd generation cellular networks. As Q1 managed elements are likely to be located near it, the unit is in a position very much analogous to current base stations and MetroHub. AXC uses an MPC860 CPU and the Chorus operating system.

2.3.9 RNC

RNC (Radio Network Controller) is an element in 3rd generation cellular networks in a position analogous to the BSC. The current RNC architecture employs a unit called NEMU for functionality similar to that of the OMU in the current BSC. The NEMU is based on PC hardware and runs the Windows NT operating system. Reuse of NMS/10 MF software, including Nokia Q1 Poller, is highly desirable in this environment.


2.3.10 IMC

IMC is an element used in the GIO (GSM Intranet Office) environment in a position analogous to the BSC. IMC is based on PC hardware. Reuse of NMS/10 MF software, including Nokia Q1 Poller, is highly desirable in this environment.

2.3.11 Successor for TMS Adaptor

TMS Adaptor is a plug-in unit whose purpose is mainly to provide two master interfaces for a Q1 management bus. The hardware of TMS Adaptor is aging, so in the future it is possible that a new version of the product will be required. TMS Adaptor is commonly used to provide a local management interface for a group of Q1 managed elements which are also supervised by a remote master (see Figure 3).
Figure 3 TMS Adaptor operating environment

It is also possible to use TMS Adaptor to provide management interfaces for two supervision systems. For this purpose TMS Adaptor maintains a fault database which is used for answering fault management commands targeted to the elements under the adapter. Even though TMS Adaptor caches the fault information of the elements under it, this caching does not enhance the fault management in any other way than by providing two management interfaces to the same elements. In fact the use of TMS Adaptor introduces some delay to the fault detection and to normal Q1 command execution.

There are numerous ways the operation of a TMS Adaptor replacement can be enhanced, but many of those are outside the scope of this document. For further information consult the EPA C1.0 project documentation.

2.3.12 Local management tools

Traditionally the Q1 managed elements have been configured locally with the Service Terminal. This is a clumsy handheld device equipped with a keypad and a small LCD display. Further development for this target is not anticipated and a replacement for it would be highly desirable.

More advanced local management is possible with portable PCs and other small computing devices. If the portable PC is running Windows NT, the NMS/10 products are usable as they are, and therefore the considerations for NMS/10 SR also apply here. However, the polling functionality is not as critical for local management tools as it is for central management. There are usually few elements to monitor and the polling frequency need not be as fast as possible. The main issue is Q1CS/GCS co-operation.

The Nokia 9000 Communicator is equipped with an RS232 serial port and is usable as a local management tool for Nokia Q1 managed elements. The PDA side of the Nokia 9000 (the side of interest for Q1 management) is equipped with an Intel i386 CPU and has eight megabytes of memory. The memory is divided into three parts: 4M as read-only for the OS and applications, 2M for application execution and heap space, and 2M for user data (essentially a RAM disk). The operating system for the PDA is GEOS. GEOS is very tightly coupled with the segmented Intel x86 architecture. For optimal operation in the GEOS environment, the code modules and data objects should be small and special considerations apply for using function pointers.

The next generation Nokia Communicator is likely to employ an ARM 7100 CPU and the EPOC operating system. The EPOC operating system API is offered through C++ classes and programming in the EPOC environment is most conveniently achieved by using C++. Therefore C code targeted for EPOC should prepare for interfacing with C++ code.


3 ARCHITECTURE
3.1 Design criteria
In addition to producing a correctly operating component, the major target of this design is to
- combine and divide tasks and data between modules so as to minimise dependencies between components and to maximise them inside components
- hide design decisions made in the implementation of one module from other modules
- lessen propagation of change
- minimise communication between modules for efficiency
- clarify task division between modules
- minimise the amount of operating system and compiler dependent code
- compose functionality into small enough modules to enable flexible distribution of functionality in heterogeneous environments
- make as much as possible of the functionality and features optional to enable choosing only the required features for a target environment
- keep the common case fast: when different aspects steer the design in conflicting directions, the most common activity should be considered first

These goals also serve to achieve efficiency, reusability, testability, maintainability and many other qualities often regarded as attributes of good software.

3.2 Software architecture


3.2.1 General

The architecture of the software component is described with three views: the functional model, the static model and the dynamic model. The functional model describes the computations performed by the software. The static model describes the static structure, i.e. the data stored and manipulated by the software. The dynamic model describes the dynamic behaviour, i.e. the states and collaborations in the software. The order of presenting these models is more or less arbitrary as they all depend on each other. Each model emphasises different aspects of the software; not all aspects are covered by each model.

3.2.2 Functional model

3.2.2.1 Fault polling

Even though modern object oriented methodologies discriminate against the use of data flow diagrams, these diagrams can in practice be very useful. In the particular case of fault polling, it is very useful to model the functionality with data stores and processes participating in the fault status mirroring, because the diagrams make the data dependencies very explicit. It is important to note that the processes in data flow diagrams describe functionality or computation. They should usually not be mapped directly to, for example, operating system processes.

In the following discussion refer to Figure 4. At poller start-up the fault database is empty and the fault status of all monitored elements is unknown. The configuration specifies the addresses which are to be monitored for faults. For updating the fault database the first information needed is the set of functional entities present in each address. This can be modelled as a process whose source information is the set of addresses whose FEs are unknown and whose output is a set of FEs which need to be queried for faults. This set of functional entities is consumed by a process which queries the FEs for current fault status information. The process produces the current fault status information (a set of faults) per functional entity. When all FEs of an address have been queried, the set of addresses whose fault status is known is updated. The obtained FE specific fault status is compared with the data stored in the fault database, notifications about changes are generated and the fault database is updated with the new information.

For functional fault status mirroring the above scheme is sufficient. However, it is not very efficient. The fault polling would proceed to do a full query of each functional entity of each address on every fault poll cycle. The fault polling can be optimised by utilising the X-bit information (this requires that the queries reset the X-bit information). For this purpose, a process whose source information is the set of addresses whose fault status is known can be introduced. The process obtains the address specific X-bit information and produces a set of addresses whose fault status has changed. The changed addresses are in turn consumed by a process which refines the addresses to a set of functional entities which need to be queried for fault status changes. This information is in turn consumed by a process which queries the functional entities for fault status changes and resets the X-bit of the target. In addition to producing information for updating the fault database and generating change notifications, the process also produces FEs whose fault status is again known, and when all FEs of an address have been queried, the set of addresses whose fault status is known is updated.

This scheme is much more efficient than the brute force approach but it is not as robust. It relies on the assumption that the X-bit information obtained from the element is always reliable. A reasonable compromise between robustness and efficiency can be achieved by introducing fault status consistency checking to the optimised scheme. The fault status consistency checking can be modelled as a process which reads the addresses whose fault status is known. It can update either the set of functional entities or addresses whose fault status is unknown. Alternatively the process can have its own internal state and it can update the fault database directly.

(Footnote 7: In some situations it might be desirable to monitor only some of the FEs of an element. However, the supervision of all FEs in this case should not cause any significant problems and the unwanted fault information can be eliminated with filtering.)

For clarity some flows are omitted from Figure 4:

- upon command failure, the target address may need to be added to the set of lost addresses
- if the failing command can reset the X-bit information of the target, the address/FE is moved to the set storing addresses/FEs whose fault status is unknown

Also filtering and severity assignment have been omitted. Filtering basically ignores specified fault information. Severity assignment assigns a severity attribute to fault information obtained with D/ND generation FM commands (with E generation FM commands the severity attribute is provided by the element).

[Figure 4 is a data flow diagram showing the data stores (unknown addresses, unknown functional entities, known addresses, changed addresses, changed functional entities, lost addresses, fault database) and the processes (get FEs, get faults, get X[ad], get X[FE], get changes, full update, delta update, address and functional entity consistency checks, autonomous consistency check, connection check, set slave time, loss) involved in fault polling, together with the notifications produced and the deletion of faults of missing FEs. Any process which sends commands can act as the "Loss" process.]

Figure 4 General polling data flow

In the following discussion about D/ND generation fault polling, refer to Figure 5. For D/ND generation fault polling the present FEs are obtained with a Status command. After a successfully executed Status command, all FEs of the address are added to the set of FEs that need to be queried for current faults. The faults are queried with a Clearing Fault Condition command, which resets the X-bit information. When all FEs of an address have been successfully queried, the address is added to the set of addresses whose fault status is known. The known addresses are polled with the Polling command and upon detecting a set X-bit the address is added to the set of changed addresses. To obtain the FEs that have changed, a Status command is again used. The FEs with the X-bit set are added to the set of changed FEs. The Clearing Fault Condition command is used for querying the fault status changes and the processing of the answer is identical to the query of current fault status. As the current fault status query and fault status change query are identical, the processes and data stores can be combined for D/ND generation fault polling. For autonomous fault status consistency checking the non-clearing Fault Condition command can be used.

(Footnote 8: Even though the FEs are numbered consecutively starting from zero there can be "gaps" in the present FEs. Therefore the Equipment Structure command cannot be used to simply obtain the FE count. Additionally D-generation elements do not support the command.)
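To make the control flow concrete, the following fragment sketches one delta poll step for a single D/ND generation address. It is an illustration only: the q1_* helpers and the mark_*/compare_* routines are assumed wrappers around the packet level command execution and the data stores of Figure 4, not part of any defined interface.

typedef unsigned short q1_address;                    /* 0..4094 */

int q1_polling_cmd(q1_address ad);                    /* >0: X-bit set, 0: clear, <0: failure */
int q1_status_cmd(q1_address ad, unsigned char *changed_fes, int max, int *count);
int q1_clearing_fcc(q1_address ad, unsigned char fe); /* queries faults and resets the FE X-bit */

void mark_address_lost(q1_address ad);
void mark_fault_status_unknown(q1_address ad);
void compare_and_update_fault_db(q1_address ad, unsigned char fe);

/* One delta poll step for an address whose fault status is known. */
static void delta_poll_address(q1_address ad)
{
    unsigned char changed[255];
    int n = 0, i, x;

    x = q1_polling_cmd(ad);
    if (x < 0) {                       /* command failure: handled by connection monitoring */
        mark_address_lost(ad);
        return;
    }
    if (x == 0)                        /* element level X-bit clear: nothing has changed */
        return;

    if (q1_status_cmd(ad, changed, (int)sizeof changed, &n) < 0) {
        mark_address_lost(ad);
        return;
    }
    for (i = 0; i < n; i++) {
        if (q1_clearing_fcc(ad, changed[i]) < 0) {
            /* the failing command may already have reset the X-bit, so the
               fault status of the address is no longer known */
            mark_fault_status_unknown(ad);
            return;
        }
        compare_and_update_fault_db(ad, changed[i]);
    }
    /* all changed FEs processed: the address stays in the set of addresses
       whose fault status is known */
}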

[Figure 5 is a data flow diagram of D/ND generation fault polling: the Status command produces the present FEs, the Clearing Fault Condition command produces the new faults per FE which are compared against the old faults in the fault database to produce add/delete updates and alarm/cancel/disturbance notifications, the Polling command moves addresses with X=1 to the set of changed addresses, and the non-clearing Fault Condition command is used for consistency checking. Faults of missing FEs are cancelled.]

Figure 5 D/ND generation fault polling

In the following discussion about E generation fault polling, refer to Figure 6. For E generation fault polling the present FEs are also obtained with a Status command. All FEs present in the target address are added to the set of FEs that need to be queried for faults. To reset the X-bit information a Flush Event History menu command needs to be sent to the functional entities. This has to be done before executing a Get Active Alarms menu command, as it is possible that events are placed in the event history during the execution of the Get Active Alarms menu command and these events should not be flushed after its execution. The obtained faults are compared with the data stored in the fault database. Cancels are generated for faults present in the fault database but missing from the answer, and alarms are generated for faults present in the answer but missing from the fault database. The fault database is also updated with the new information. The E generation fault management menu commands provide timestamped information. While updating the fault database the timestamps (may) need to be compared to determine whether the data obtained from the element or the data already stored is more recent. When all FEs of the target address have been successfully queried, the address is added to the set of addresses whose fault status is known. The known addresses are polled with the Polling command in the same way as for D/ND generation fault polling. Upon detecting a set X-bit, the changed FEs are obtained with a Get Changed Elements command. The changed FEs are queried with a Get New Events menu command. The events are forwarded and alarms and cancels are applied to the fault database. When all FEs have been queried the address is put back to the set of addresses whose fault status is known. For autonomous fault status consistency checking the Get Active Alarms menu command can be used (in this case the Flush Event History menu command is not executed).
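As an illustration of the ordering constraint described above, the following fragment sketches a full fault status query for one E generation FE. The q1_menu_* helpers and the update routines are assumed wrappers, not an existing interface.

typedef unsigned short q1_address;

int q1_menu_flush_event_history(q1_address ad, unsigned char fe);
int q1_menu_get_active_alarms(q1_address ad, unsigned char fe,
                              void *alarms, int max, int *count);
void compare_and_update_fault_db_e(q1_address ad, unsigned char fe,
                                   const void *alarms, int count);
void mark_fe_fault_status_unknown(q1_address ad, unsigned char fe);

static void full_query_e_generation_fe(q1_address ad, unsigned char fe)
{
    unsigned char alarms[512];
    int count = 0;

    /* Flush the event history BEFORE Get Active Alarms: events arriving
       while the active alarms are being read would otherwise be flushed
       afterwards and lost. */
    if (q1_menu_flush_event_history(ad, fe) < 0 ||
        q1_menu_get_active_alarms(ad, fe, alarms, (int)sizeof alarms, &count) < 0) {
        mark_fe_fault_status_unknown(ad, fe);
        return;
    }
    /* Cancels for faults in the database but missing from the answer,
       alarms for faults in the answer but missing from the database. */
    compare_and_update_fault_db_e(ad, fe, alarms, count);
}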
[Figure 6 is a data flow diagram of E generation fault polling: the Status command produces the present FEs, the Flush Event History and Get Active Alarms menu commands feed the compare process, the Polling and Get Changed Elements commands produce the changed addresses and changed functional entities, and the Get New Events menu command feeds the forward and update process which updates the fault database and generates notifications. A sequence number mismatch or command failure moves an FE back to the set of unknown functional entities.]

Figure 6 E generation fault polling

3.2.2.2 Connection monitoring

In the following discussion refer to Figure 7. When a bus interface for a monitored address is continuously malfunctioning, the interface is to be considered failing and an appropriate connection loss indication is generated. In addition the failing interface is added to a set which specifies DCN failures. These failures should be monitored for reconnection. This can be modelled by introducing a process whose source information is the set of addresses which have failed bus interfaces. The process checks the failed interfaces and upon reconnection updates the connection status information and generates an appropriate reconnection indication.

The connection status information can be used in general command execution for choosing the order in which the bus interfaces are used when more than one interface is available. When the purpose of executing a command is just to obtain the answer data, it is sensible to try first the interface which is known to be working. However, for connection monitoring the exact opposite is true. The command answer data is not of interest; instead the command should be tried in all interfaces that are known not to work.
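A minimal sketch of this interface ordering idea is given below, assuming at most two bus interfaces and a per-interface working/failed flag; the names are illustrative only.

#define Q1_MAX_IF 2

struct if_status { int known_working[Q1_MAX_IF]; };

/* Fill 'order' with interface indices, best candidate first. For normal
   commands working interfaces are tried first; for connection checking the
   failed interfaces are tried first. Returns the number of interfaces. */
static int choose_interface_order(const struct if_status *st,
                                  int connection_check,
                                  int order[Q1_MAX_IF])
{
    int n = 0, pass, i;

    for (pass = 0; pass < 2; pass++)
        for (i = 0; i < Q1_MAX_IF; i++) {
            int preferred = connection_check ? !st->known_working[i]
                                             :  st->known_working[i];
            if ((pass == 0) == (preferred != 0))
                order[n++] = i;
        }
    return n;
}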

(Footnote 9: In TMC and NMS/10 MF the bus interfaces are called directions: primary direction and secondary direction.)

[Figure 7 is a data flow diagram of connection monitoring: a normal command client and the connection checking process both submit commands with candidate interfaces to the interface choosing process, command execution produces answers and per-interface status, and the connection status update process maintains the connection status of the monitored addresses, updates the fault database and generates notifications.]

Figure 7 Connection monitoring

3.2.3 Static model

3.2.3.1 Configuration

There are various ways to parameterise the operation of Q1 polling. Usually the whole Q1 address space is not to be monitored, so at least the addresses to monitor should be specified in the configuration. Also it might be desirable to specify in the configuration the expected element generation, i.e. the FM commands to be used for each address, instead of trying to auto-detect the generation. Usually the same set of addresses that is to be monitored for faults is to be monitored for DCN connection status. The addresses for autodiscovery, on the other hand, are usually distinct from the set of monitored addresses.

From the data flow diagrams additional aspects for parameterisation can be discovered. For parameterisation purposes, it is useful to introduce the concept of a fault poll cycle. As can be seen from Figure 4, the data flows form several loops. For example addresses flow from the known addresses data store to the X-bit query process and back, or in case of a non-zero X-bit an address can go through the change query loop. The queries can be implemented so that addresses are extracted from the known addresses data store in ascending order (modulo 4094). One fault poll cycle can be intuitively defined as the activity during which addresses don't repeat, i.e. a cycle is complete when all known addresses have been processed once.

(Footnote 10: The generation information can, however, be obtained from the elements and a mechanism for automatically configuring the poller can be devised, but this should not prevent manual configuration.)

For example the fault status consistency checking can be parameterised relative to this fault poll cycle by specifying the number of addresses or functional entities to check per each fault poll cycle. Also checking of failed connections can be specified as the number of failed addresses to check for each fault poll cycle. Alternatively these checks could be specified by frequency: the number of checks per minute or other suitable time unit.

For D/ND generation fault management and TMC compatible alarm reporting interfaces one problem has been the unknown accuracy of fault status change notification timestamps. As the notifications are timestamped by the poller, there is an error margin from the last poll time to the actual timestamp. Even if this time were measured, there is no way to provide the accuracy information via TMC compatible interfaces. In this kind of environment it is useful to introduce the concept of a fault poll cycle target time. The poller configuration specifies a time interval, which is the maximum duration of one fault poll cycle. If the fault poll cycle target time is exceeded, a special internal fault is set. When the actual fault poll cycle time stays within the specified limit, the special fault remains cleared. This arrangement provides a known upper limit for the error of the timestamp values (while the special fault is not set).

The cycle target time also introduces a new design choice to be made: what to do with the remaining time when the poller operates within the limit. The simplest solution is to just start the next cycle immediately after one is complete. The "leftover time" could also be allocated among the different polling activities by configurable fractions. For example one third of the time could be allocated for fault status consistency checking, one third for connection status checking and one third for autodiscovery.

In some environments it might not be desirable to constantly poll the monitored elements at the fastest possible pace. For example, when there are only a few elements, the constant polling might disturb the slaves by consuming too much of their CPU time. The cycle target time can be utilised in this case also. The polling activity can be forced to suspend periodically by specifying that a fraction of the leftover cycle time is to be used for nothing. To configure for example a normal polling cycle once an hour and no other activity, the cycle target time would be set to one hour and all leftover time would be allocated for nothing.

Full fault status query, delta polling and consistency checking can also be seen from another point of view: fault status consistency checking could be considered to be the same thing as full fault status query. If elements are polled infrequently, it might not be desirable to perform delta polling at all. In this arrangement, however, the concept of a fault poll cycle can of course not be specified relative to delta polling.

In DCN monitoring an address should usually not be considered lost immediately upon failure to answer. In many environments the management buses are not error free and occasional communication failures should be considered normal. Therefore the loss indication should employ a threshold. An address should be considered lost only when it has failed to answer consistently, for example when several commands have failed consecutively, or when an element has failed to answer during a configurable number of consecutive fault poll cycles.

(Footnote 11: Assuming that the X-bit is reliable.)


As the polling activity executes Q1 commands, the configuration should specify control parameters to be used for command execution. Depending on the environment a set from the following parameters might require user control:
- packet retry count
- maximum number of empty data transfer answer packets to tolerate during one data transfer command
- maximum number of packet transactions to allow for a command execution
- first character timeout in answer packet reception
- inter character timeout in answer packet reception
- total timeout for answer packet reception
- first character timeout in command packet transmission
- inter character timeout in command packet transmission
- total timeout in command packet transmission
- total timeout for a command transaction
- minimum delay from command packet transmission or answer packet reception to the next command packet transmission
- minimum delay from command packet transmission or answer packet reception to command packet transmission to a different bus interface
- minimum delay after receiving an empty data transfer answer packet and transmission of the next data transfer command packet
- minimum delay between consecutive data transfer commands (to one address)

Note that in many environments many of the parameters can be fixed. For example in environments where the transmission to the Q1 bus is not flow controlled, there is no need for transmission timeouts. Also many of the delays can be zero in many environments. For details about the parameters, see [4]. The configuration of the Q1 poller can be modelled by the classes depicted in Figure 8.
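Purely as an illustration, the configuration classes of Figure 8 could be expressed in C roughly as follows; the type and field names follow the figure but are otherwise made up for this sketch.

typedef unsigned short q1_address;                    /* 0..4094 */
enum q1_generation { Q1_GEN_D, Q1_GEN_ND, Q1_GEN_E, Q1_GEN_AUTO };

struct q1_timeouts {                                   /* Timeouts (used for rx and tx) */
    unsigned long first_char_ms;
    unsigned long inter_char_ms;
    unsigned long total_ms;
};

struct q1_command_control {                            /* CommandControl */
    unsigned      retry_count;
    unsigned long command_timeout_ms;
    unsigned long inter_packet_delay_ms;
    unsigned long switch_delay_ms;                     /* delay when changing bus interface */
    unsigned long empty_packet_delay_ms;
    struct q1_timeouts rx, tx;
};

struct q1_monitored_ne {                               /* NE */
    q1_address         address;
    enum q1_generation generation;
};

struct q1_poller_config {
    struct q1_monitored_ne *fault_monitoring;          /* 0..4094 entries */
    unsigned                fault_monitoring_count;
    q1_address             *connection_monitoring;
    unsigned                connection_monitoring_count;
    q1_address             *autodiscovery;
    unsigned                autodiscovery_count;

    unsigned consistency_check_count_per_cycle;        /* ConsistencyCheck */
    int      consistency_check_unit_is_fe;             /* unit: FE or NE */
    unsigned connection_check_addresses_per_cycle;     /* ConnectionCheck */
    unsigned connection_fail_threshold;
    unsigned existence_check_addresses_per_cycle;      /* ExistenceCheck */

    unsigned long cycle_target_ms;                     /* Timing */
    unsigned consistency_check_fraction;               /* fractions of leftover time */
    unsigned connection_check_fraction;
    unsigned existence_check_fraction;
    unsigned nothing_fraction;

    struct q1_command_control command_control;
};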

[Figure 8 is a class diagram of the poller configuration: up to 4094 NE objects (address, generation) are associated with Fault Monitoring and Connection Monitoring, an ExistenceCheck (addressesPerCycle) with Autodiscovery, and the configuration further contains a ConsistencyCheck (countPerCycle, unit: FE or NE), a ConnectionCheck (addressesPerCycle, failThreshold), a Timing class (cycleTarget and the consistencyCheck, connectionCheck, existenceCheck and nothing fractions) and a CommandControl class (retryCount, commandTimeout, interPacketDelay, switchDelay, emptyPacketDelay) with rx and tx Timeouts (firstCharTimeout, interCharTimeout, totalTimeout).]

Figure 8 Poller configuration

3.2.3.2 Fault status

For fault status mirroring it is necessary to maintain a database of the current fault status of the monitored elements. A fault is identified by the following attributes:
- management bus
- address
- functional entity
- supervision block
- fault code

The D/ND generation fault management commands provide the following additional attributes associated to an execution of the Fault Condition command:
- FE status byte
- time counter

These attributes are not associated with an individual fault; they are attributes of a functional entity. The time counter information is practically useless (the time unit is not specified by the protocol and implementations conflict with the specification) and might not even be supported in any sensible way in some Q1 management bus configurations (e.g. when TMS Adaptors are used). The FE status byte, however, is present in external interfaces of TMC, subsets of which are also supported by NMS/10 MF. The FE status byte is present for example in notifications specified in the TMC Node Manager Server interface and the FE status byte is used as part of the key identifying a fault by TMC Alarm Manager. Therefore it is necessary to at least provide a matching FE status byte in alarms and the corresponding cancels if the TMC Node Manager Server interface is to be supported by the system where the fault poller is running. If the poller is operating in a TMS Adaptor like environment where logically transparent access to the monitored elements is required, the real FE status bytes of the monitored elements should be maintained.

The E generation fault management commands provide the following additional attributes associated to an event:
- severity and event type
- time stamp

Some events can be seen as objects describing changes in the state of a fault. From the Nokia Q1 Protocol specifications it can be inferred that an event can be classified to one of the following types:
- alarm: the fault has activated
- cancel: the fault has deactivated
- disturbance: the fault has been on for a short period but is currently not active (disturbances are reported only about inactive faults)
- warning: a general purpose notification; something has happened, for example a configuration change

It is sensible to coerce both D/ND and E generation faults to a common representation. This representation would have the following attributes:
- management bus
- address
- functional entity
- supervision block
- fault code
- FE status byte
- severity
- activation timestamp
- element generation

For D/ND generation faults it is necessary to generate the timestamp and severity attributes. For E generation faults the FE status byte could be obtained with packet level commands, but as this would slow down the operation and the FE status byte is needed only for TMC Node Manager Server interface compatibility, it is more sensible to generate it. The timestamp can be augmented with accuracy information. For faults obtained with E generation fault management commands it would indicate "exact" accuracy. To provide the accuracy information for faults obtained with D/ND generation fault management commands, the time from the latest poll, or at least an upper limit to it, would need to be measured.

The faults are naturally indexed by the concatenation of bus, address, fe, sb and fc. This suggests implementing the fault database as an associative container with <bus, address, fe, sb, fc> as the index. As the fault database needs to support queries and the queries might be made using a protocol whose PDU is smaller than the size of the fault database, the query interface should support iterative traversal. Usually it is sufficient to provide a query interface which allows iterating over the whole database, i.e. it is not necessary to provide an interface supporting complex queries. Choosing the appropriate faults for further processing is most conveniently done by the query client. However, for the TMC Node Manager Server interface, it would be convenient if the fault database could be iterated in the order of TMC stations, which are groupings of arbitrary FEs. Also for an SNMP interface the most natural iteration order would be indexing by a unique ID which might have no relation to the concatenation of bus, address, fe, sb and fc. To support these varying iteration order requirements, it is highly desirable that the fault database query interface is as fast as possible, preferably a direct function call interface accessing the fault store directly.

In many environments the fault information presented via the external FM interfaces includes identification strings of the managed elements. The two extremes for obtaining this information are:
1. Executing the ID query commands every time the identification information is needed.
2. Uploading all identifications at system startup/configuration.

Neither of these seems very appropriate. Always querying the information assures up-to-date identification information but slows down the operation. Full upload at system startup slows down startup and requires a potentially large amount of memory. The uploaded information can also become inconsistent with the monitored elements. A reasonable compromise could be to implement ID caching.
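A sketch of the common fault representation and its natural key is shown below. The struct layout and the use of time_t are assumptions made for this example; only the key order <bus, address, fe, sb, fc> comes from the text above.

#include <time.h>

struct q1_fault_key {
    unsigned char  bus;
    unsigned short address;          /* 0..4094 */
    unsigned char  fe;
    unsigned char  sb;
    unsigned char  fc;
};

struct q1_fault {
    struct q1_fault_key key;
    unsigned char fe_status;         /* generated for E generation faults */
    unsigned char severity;          /* generated for D/ND generation faults */
    time_t        activated;         /* generated for D/ND generation faults */
    unsigned char generation;        /* D, ND or E */
};

/* Ordering function usable by an associative container, for example a
   sorted array searched with bsearch() or a balanced tree. */
static int q1_fault_key_cmp(const struct q1_fault_key *a,
                            const struct q1_fault_key *b)
{
    if (a->bus     != b->bus)     return a->bus     < b->bus     ? -1 : 1;
    if (a->address != b->address) return a->address < b->address ? -1 : 1;
    if (a->fe      != b->fe)      return a->fe      < b->fe      ? -1 : 1;
    if (a->sb      != b->sb)      return a->sb      < b->sb      ? -1 : 1;
    if (a->fc      != b->fc)      return a->fc      < b->fc      ? -1 : 1;
    return 0;
}

Iteration in TMC station order or by an SNMP index would be built on top of this container by the respective interface components.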
[Figure 9 is a class diagram: a FaultState container holds Fault objects keyed by bus, ad, fe, sb and fc; an Event (with feStatus, severity and timeStamp attributes) is associated with a Fault and is specialised into Warning, Disturbance, Alarm and Cancel.]

Figure 9 Faults and notifications

Classes modelling faults and notifications are presented in Figure 9. Even though warnings do not indicate a fault, these notifications carry attributes which specify for example SB and FC. Therefore warning is shown as a subclass of Event, which is associated with the Fault class. In the context of a warning the associated Fault is to be interpreted as a general entity specifying the identity of the source of the notification.

3.2.3.3 Notifications

In addition to queries, the fault database interface should support notifications. Clients should be able to request these notifications and to cancel their requests. Whenever the state of the fault database changes, an appropriate change notification should be generated and sent to the registered clients. There are various levels of accuracy at which the notifications can work:
1. Notifications have no additional information; they just indicate that the fault database might have changed. Upon detection of the notification, the client needs to query the whole fault database to synchronise its state. (Footnote 12: The Q1 element level X-bit can be seen as this kind of notification mechanism, even though the X-bit itself is polled, not reported spontaneously.)
2. Notifications indicate more accurately the entity whose fault status has changed. The choice here varies from a management bus to an individual fault code. The client needs to query only a subset of the fault database to synchronise. (Footnote 13: The Q1 FE level X-bit can be seen as this kind of notification mechanism, even though the X-bit itself is polled, not reported spontaneously.)
3. Notifications provide full information for the client to synchronise without queries to the fault database. (Footnote 14: The Q1 E generation menu command Get New Events can be seen as this kind of mechanism, even though the events are polled, not reported spontaneously.)

Full information should be suitable in all cases (the additional information does not harm the clients) and the overhead of the information should be negligible in most environments.

Another design choice to be made is the notification protocol. At least the following choices can be considered:
1. Datagram style reporting: no flow control, no acknowledgement.
2. Acknowledged reporting without a window, i.e. only one report can be unacknowledged at a time.
3. Flow controlled reporting, e.g. a sliding window protocol.

In case of multiple clients the most suitable choice is datagram style reporting with numbered notifications. This allows fault polling to proceed without blocking or local buffering of notifications in case some clients fail to process the notifications at the rate they are generated. Sequence numbering provides a mechanism for detecting lost notifications. If the fault polling is operating with only one client for notifications, acknowledged or flow controlled operation might be appropriate. If the protocol in this case operates without a window, it is sensible to implement a notification buffer at the fault poller so that the fault polling does not stall immediately when the notification client is busy. Other buffering and packaging strategies for the notifications may also be applicable depending on the transport for the notifications.

[Figure 10 is a class diagram: a Clients collection holds 0..* NotificationTarget objects, each with a sequenceNumber and a notifyTargetAddress attribute.]

Figure 10 Notification clients

Keeping track of notification clients can be implemented with a simple collection storing the client address/identity and a sequence number, if datagram style reporting is used. This is presented in Figure 10. For E generation elements most of the notification attributes are directly available in the Get New Events menu command reply data. For D/ND generation elements, the notifications are generated by comparing the locally stored fault status to Fault Condition command replies.
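The following fragment sketches datagram style notification fan-out with per-client sequence numbers, in the spirit of Figure 10. The transport_send() primitive, the fixed-size client table and the opaque payload are assumptions made for the sketch.

#define MAX_NOTIFY_CLIENTS 8

struct notification_target {
    int            in_use;
    unsigned       target_address;   /* transport specific client identity */
    unsigned short sequence_number;  /* next number to send to this client */
};

static struct notification_target clients[MAX_NOTIFY_CLIENTS];

/* Assumed transport primitive; no delivery guarantee (datagram style). */
int transport_send(unsigned target_address, unsigned short seq,
                   const void *payload, unsigned len);

int register_notification_client(unsigned target_address)
{
    int i;
    for (i = 0; i < MAX_NOTIFY_CLIENTS; i++)
        if (!clients[i].in_use) {
            clients[i].in_use = 1;
            clients[i].target_address = target_address;
            clients[i].sequence_number = 0;
            return 0;
        }
    return -1;                       /* no free slot */
}

/* Send one notification to every registered client without blocking the
   polling; a client detects losses from gaps in the sequence numbers. */
void send_notification(const void *payload, unsigned len)
{
    int i;
    for (i = 0; i < MAX_NOTIFY_CLIENTS; i++)
        if (clients[i].in_use) {
            (void)transport_send(clients[i].target_address,
                                 clients[i].sequence_number, payload, len);
            clients[i].sequence_number++;
        }
}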
[Figure 11 is a timing diagram showing, for two consecutive poll cycles, six cases of how a fault code can appear in a Clearing Fault Condition command reply (with state ON or OFF) and the resulting notification: 1) ALARM, 2) CANCEL, 3) DISTURBANCE, 4) no notification, 5) CANCEL, 6) ALARM.]

Figure 11 Fault Condition command replies

Figure 11 shows how Fault Condition command replies are assumed to be constructed by network elements. The vertical lines indicate the polling cycle times when a given NE is polled with the Polling command and possibly queried for faults with the Fault Condition command. At cycle 2 the fault database is updated and notifications are generated as follows:

Case 1: The old fault status information indicates that the fault is not active. According to the new information the fault is known to be active and the fault is added to the database. The notification generated has type alarm.

Case 2: The old fault status information indicates that the fault is active. According to the new information the fault is known not to be active and the fault is removed from the database. The notification generated has type cancel.

Case 3: The old fault status information indicates that the fault is not active. According to the new information the fault is known not to be active and the fault database is not changed. However, the fault has been active and therefore a notification is generated and the notification has type disturbance.

Case 4: The old fault status information indicates that the fault is active. According to the new information the fault is known to be active and the fault database is not changed. If the FE in question has several faults active, it is impossible to know which of them triggered the NE to report that the fault status has changed. Usually there is no way to report this kind of "negative disturbance" to external interfaces. No notification is generated.

Case 5: The old fault status information indicates that the fault is active. According to the new information the fault is known not to be active and the fault is removed from the database. The notification generated has type cancel. There is no way to distinguish this case from case 2.

Case 6: The old fault status information indicates that the fault is not active. According to the new information the fault is known to be active and the fault is added to the database. The notification generated has type alarm. It is impossible to distinguish this case from case 1.

If any Q1 managed element doesn't operate according to the above assumptions, the element behaviour is to be documented in [5]. Note that the above disturbance generation rule should not be used at poller startup.
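The rule illustrated by Figure 11 can be condensed into a small comparison function. The sketch below covers only fault codes present in the Fault Condition command reply; the function and enumeration names are illustrative.

enum q1_event_type { Q1_EV_NONE, Q1_EV_ALARM, Q1_EV_CANCEL, Q1_EV_DISTURBANCE };

/* was_active: the fault is present in the local fault database.
   now_active: the fault code is reported with state ON in the reply. */
static enum q1_event_type classify_fcc_entry(int was_active, int now_active)
{
    if (now_active)
        return was_active ? Q1_EV_NONE          /* case 4 */
                          : Q1_EV_ALARM;        /* cases 1 and 6 */
    return was_active ? Q1_EV_CANCEL            /* cases 2 and 5 */
                      : Q1_EV_DISTURBANCE;      /* case 3 */
}

The database update (add for alarm, delete for cancel) follows from the returned type; at poller startup the disturbance branch would be suppressed, as noted above.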
3.2.3.4 Filtering and severity assignment

TMC and NMS/10 MF support filtering of fault information. In TMC the filters are specified by a list of match expressions. One match expression specifies the attributes that the fault information has to have in order to pass through the filter. The information is matched against each expression and if any of them matches, the filter passes the information through. In NMS/10 MF the filters are specified by an arbitrary AND/OR/NOT combination of expressions in which the different event attributes can be tested. For example the following filter expression could be used to exclude AIS and FEA reports and all disturbances:
EXCLUDE = { FC=64-72,74-79,173-182 OR TYPE=DISTURBANCE }

The above filter excludes all events whose fault code matches the ranges specified and the events whose type is disturbance. The filters can be used at individual FM interfaces and a global filter can be used between the monitored elements and the local fault database. In TMC the severity attribute shown at external interfaces is generated by built-in rules. In NMS/10 MF the severity is assigned by a configurable classificator at individual FM interfaces. The classificator is defined by specifying a logical expression associated to each severity level. The expressions are checked in order from most severe to least severe and the severity is set when a match is found. The following classificator is used as the default classificator in NMS/10 MF C1.0:

(Footnote 15: The filters in MF can be either inclusive or exclusive.)

CRITICAL = { TYPE=ALARM AND FC=0-12,16-19,21-22,24,32-39,42-56,58-63,80-89,96-99,109-111,125,139,141-144,148,150-161,163-165,167,176,195,202-203,241 }
MAJOR = { TYPE=ALARM AND FC=13-15,20,23,25-31,40-41,57,64-79,90-95,100-102,114-124,126-138,140,145-146,162,166,168-172,175,177-184,186-189,193-194,196-201,204-240,243-253 }
MINOR = { TYPE=DISTURBANCE OR ( TYPE=ALARM AND FC=103-108,112-113,147,149,173-174,185,190-192,242,254 ) }

This classificator assigns critical severity to alarms with the specified fault codes, major severity to alarms with a fault code from another set and minor severity to all disturbances and alarms with a fault code from a third set. As for example cancels match none of the expressions specified, their severity is unclassified. In NMS/10 MF C2.0 the severity can be either critical (fatal can be used as a synonym), major, minor, warning or unclassified. Additionally all cancels are reported with the special cleared severity in the SNMP interface (to be in line with the NWI2 specification).

[Figure 12 is a class diagram of the filter and classifier definitions: Q1AF_filter_def and Q1AF_classifier_def are built from Q1AF_expr_def expressions, which are specialised into the composite expressions Q1AF_and_def, Q1AF_or_def and Q1AF_not_def and the attribute tests Q1AF_type_eq, Q1AF_sev_eq, Q1AF_gen_eq, Q1AF_num_eq, Q1AF_bus_eq, Q1AF_address_eq, Q1AF_FE_eq, Q1AF_SB_eq, Q1AF_FC_eq, Q1AF_FE_st_eq_def and Q1AF_sev_st_eq_def.]

Figure 12 Filters and classifiers

A class hierarchy to support NMS/10 MF style filtering and classification is shown in Figure 12.
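To illustrate how such an expression tree can be evaluated, a reduced C sketch is shown below. Only a few node kinds are included and the encoding is invented for the example; it does not claim to match the actual Q1AF_* definitions.

enum expr_kind { EXPR_AND, EXPR_OR, EXPR_NOT, EXPR_TYPE_EQ, EXPR_FC_RANGE };

struct q1_event_attrs {
    int type;                        /* alarm, cancel, disturbance, warning */
    unsigned char fc;                /* fault code */
};

struct expr {
    enum expr_kind kind;
    const struct expr *left, *right; /* operands for AND/OR; left for NOT */
    int type_value;                  /* for EXPR_TYPE_EQ */
    unsigned char fc_lo, fc_hi;      /* for EXPR_FC_RANGE */
};

static int expr_eval(const struct expr *e, const struct q1_event_attrs *ev)
{
    switch (e->kind) {
    case EXPR_AND:      return expr_eval(e->left, ev) && expr_eval(e->right, ev);
    case EXPR_OR:       return expr_eval(e->left, ev) || expr_eval(e->right, ev);
    case EXPR_NOT:      return !expr_eval(e->left, ev);
    case EXPR_TYPE_EQ:  return ev->type == e->type_value;
    case EXPR_FC_RANGE: return ev->fc >= e->fc_lo && ev->fc <= e->fc_hi;
    }
    return 0;
}

An exclusive filter would pass an event when the expression does not match, and a classificator would evaluate the expressions from the most severe to the least severe level and assign the severity of the first match.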


(Footnote 16: Not to be confused with the Nokia Q1 warning, which is an event type.)

3.2.3.5 Connection status

In addition to mirroring the fault status stored by monitored elements, the fault poller should indicate failures in communication with the monitored elements. In TMC and NMS/10 MF fault codes 200, 201, 202 and 203 are used for this purpose. Fault codes 200, 201 and 202 are address specific faults indicating problems in communication between the master and the managed element. Only one of the faults can be active in an address (FE=0, SB=0) at a given time. Fault code 200 indicates that the primary supervision connection to the element is lost but the secondary connection is working. Fault code 201 indicates that the secondary supervision connection to the element is lost but the primary connection is working. Fault code 202 indicates that there is no connection to the element. Fault code 202 is the only option when the bus is not protected. On a protected bus it indicates that the connection is lost in both primary and secondary directions. Figure 13 shows the address specific connection state and the notifications generated for each transition.
[Figure 13 is a state diagram with the states OK, pri lost (fault code 200), sec lost (fault code 201) and lost (fault code 202). Each transition cancels the fault code of the state being left (if any) and raises the fault code of the state being entered (if any), e.g. OK -> pri lost: alarm(200); pri lost -> lost: cancel(200), alarm(202); lost -> OK: cancel(202).]

Figure 13 Address specific connection state
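The state machine of Figure 13 can be sketched as follows; the emit() callback is a placeholder for alarm/cancel generation and the function names are illustrative.

enum conn_state { CONN_OK, CONN_PRI_LOST, CONN_SEC_LOST, CONN_LOST };

static int state_fault_code(enum conn_state s)
{
    switch (s) {
    case CONN_PRI_LOST: return 200;
    case CONN_SEC_LOST: return 201;
    case CONN_LOST:     return 202;
    default:            return 0;     /* CONN_OK: no loss fault active */
    }
}

/* Derive the new state from the interface status obtained after checking
   all interfaces of the address, then emit cancel/alarm for the change. */
static enum conn_state update_conn_state(enum conn_state old_state,
                                         int pri_ok, int sec_ok,
                                         void (*emit)(int alarm, int fc))
{
    enum conn_state new_state =
        (pri_ok && sec_ok)  ? CONN_OK :
        (!pri_ok && sec_ok) ? CONN_PRI_LOST :
        (pri_ok && !sec_ok) ? CONN_SEC_LOST : CONN_LOST;

    if (new_state != old_state) {
        if (state_fault_code(old_state)) emit(0, state_fault_code(old_state)); /* cancel */
        if (state_fault_code(new_state)) emit(1, state_fault_code(new_state)); /* alarm  */
    }
    return new_state;
}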


Fault code 203 is supposed to be a bus specific fault and it is associated with the non-standard address 65535 (FE=0, SB=0), which is the address of the supervision bus object in TMC. This fault is intended to be set if the supervision connection to all monitored elements is lost. In practice TMC does not use fault code 203 at all. Instead fault codes 200, 201 and 202 are used for the bus object in the same manner as for individual elements. If all monitored addresses fail to answer in the primary direction, but the secondary direction has responding addresses, fault code 200 is set to the bus object address. If all monitored addresses fail to answer in the secondary direction, but the primary direction has responding addresses, fault code 201 is set to the bus object address. If all monitored addresses fail to answer (either in both directions for a protected bus or in the only direction for an unprotected bus), fault code 202 is set to the bus object address. Bus loss and address specific loss faults are managed independently. Therefore it is possible to have element specific loss faults and the bus loss fault active at the same time.

Note that the above scheme does not extend well to a situation where there are more than two interfaces to the Q1 bus. It would make more sense to define supervision blocks which represent the address specific or bus specific connections, and to use only one fault code.

The connection failure threshold requires maintaining address and bus interface specific loss counters. These counters are updated either directly upon command failures or at fault poll cycle completion, depending on which criterion is to be used for failure consideration. If the failure threshold is to be interpreted as a number of polling cycles, an address specific boolean specifying whether the address has failed during the current cycle is also required.

For some node architectures it would be meaningful to maintain the connection status information for individual functional entities. This is the case when each FE has its own CPU and Q1 slave instance (e.g. DTRU). However, this is not a valid assumption for all cases. For example in the Supervisor Substation the FEs represent different digital, analog and pulse inputs and commands to the different FEs are managed by the same slave and CPU. The simplest solution here is to maintain only address specific connection status information.
[Figure 14 is a class diagram: a ConnectionState container holds LossCount objects keyed by bus, ad and if, each with a count and a failedDuringLastCycle attribute.]

Figure 14 Connection state

Connection state can be maintained with a simple container storing address and interface specific loss counters. This is presented in Figure 14.
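A minimal sketch of the LossCount record of Figure 14 is given below, assuming the threshold is interpreted as a number of consecutive failed fault poll cycles; the names are illustrative.

struct loss_count {
    unsigned char  bus;
    unsigned short ad;               /* address */
    unsigned char  iface;            /* bus interface ("direction") */
    unsigned       count;            /* consecutive failed cycles */
    int            failed_during_last_cycle;
};

/* Called for each address/interface when a fault poll cycle completes.
   Returns nonzero exactly when the configured threshold is crossed. */
static int loss_count_cycle_end(struct loss_count *lc, unsigned threshold)
{
    int crossed = 0;
    if (lc->failed_during_last_cycle) {
        lc->count++;
        crossed = (lc->count == threshold);
    } else {
        lc->count = 0;               /* the address answered during the cycle */
    }
    lc->failed_during_last_cycle = 0;
    return crossed;
}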

3.2.4 Dynamic model

3.2.4.1 Fault polling

There are several ways to model the behaviour associated with fault polling. For the scheduling of polling there are (at least) two approaches:
- using an active scheduler which employs a set of passive element queries
- using a set of active pollers and a passive scheduler

With passive queries, the scheduler acts as a central mediator. It obtains information from the queries and executes the polling commands. With active pollers, the scheduler merely notifies pollers for example about cycle completion and start. In this scheme the pollers actively submit queries to be scheduled and the scheduler chooses them for execution. An overview of a polling cycle with an active scheduler is depicted in Figure 15.
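The active scheduler alternative could be sketched as follows. The poller and bus interfaces shown here are invented for the illustration and the scheduling policy is reduced to its bare minimum.

struct query;                                   /* opaque query description */

struct poller {
    void (*reset)(struct poller *p);            /* called at cycle start */
    struct query *(*get_query)(struct poller *p);
    int mandatory;                              /* optional work only runs in leftover time */
};

int bus_execute_query(struct query *q);         /* assumed bus module entry point */
int cycle_time_left(void);                      /* nonzero while within the cycle target */

/* One fault poll cycle driven by the scheduler. */
static void run_poll_cycle(struct poller **pollers, int n)
{
    int i, progress;

    for (i = 0; i < n; i++)
        pollers[i]->reset(pollers[i]);

    /* First all mandatory work (X-bit, delta and full polling of the
       monitored addresses)... */
    do {
        progress = 0;
        for (i = 0; i < n; i++)
            if (pollers[i]->mandatory) {
                struct query *q = pollers[i]->get_query(pollers[i]);
                if (q) { (void)bus_execute_query(q); progress = 1; }
            }
    } while (progress);

    /* ...then optional work (consistency checking, lost checking,
       autodiscovery) in the leftover time. */
    do {
        progress = 0;
        for (i = 0; i < n; i++)
            if (!pollers[i]->mandatory && cycle_time_left()) {
                struct query *q = pollers[i]->get_query(pollers[i]);
                if (q) { (void)bus_execute_query(q); progress = 1; }
            }
    } while (progress && cycle_time_left());
}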

[Figure 15 is a sequence diagram of one poll cycle: a cycle Timer starts the Scheduler, which resets the X, delta and full Pollers and then, for the monitored addresses, repeatedly obtains queries from the pollers (getQuery) and executes them on the Bus (executeQuery: Poll, Status, Fault Condition). When all addresses have been processed, fault status consistency checking, lost checking and autodiscovery are performed according to the configuration; the cycle ends when the timer expires.]

Figure 15 Overall fault poll cycle

With active pollers and a passive scheduler, a poller submits a request to the scheduler when it has something to poll. The scheduler chooses a request and grants permission to the associated poller to submit commands.

When a polling cycle target time with configurable fractions for "leftover" time is used, there are two classes of queries: "mandatory" activity and "optional" activity. Optional activity is not allowed to run if any mandatory activity is not finished. If a fraction of the polling cycle target time is to be used for nothing, a special optional query, which does nothing but becomes done at cycle time end, is required, or this has to be handled explicitly by the scheduler. For scheduling the optional queries the elapsed time for a query has to be measured and the fractions associated with the activities need to be maintained.

For E generation fault polling there are some special considerations. The Get New Events menu command answer contains a sequence number for detecting lost answers. Therefore the fault poller needs to maintain a 16 bit sequence number for each E generation FE. Upon detecting a sequence number mismatch the current fault status of the FE should be queried with the Get Active Alarms menu command. Another situation which requires special handling is event buffer overflow. This is indicated by E generation elements by storing a special warning notification to the event buffer. When a buffer overflow occurs the element ceases updating the event buffer until a Flush Event History menu command is received. Therefore, when the event buffer overflow warning is detected, the event history should be flushed and the current fault status queried.

3.2.4.2 Connection monitoring

Upon inspecting Figure 13, it becomes evident that it is desirable to check all interfaces of an address before updating the connection status fault codes. If an element is disconnected from the Q1 management bus, all interfaces to it will fail. In this situation it is desirable to get fault code 202 set for the address. However, if the fault codes are updated between checking the different interfaces, the scenario depicted in Figure 16 including the dashed arrows will happen. If the fault codes are updated after checking all interfaces, only the fault code updates indicated with solid arrows will occur.

[Figure 16 is a sequence diagram of an element disconnect and reconnect seen through two serial interfaces (Serial1, Serial2). While both interfaces fail and recover, updating fault codes 200/201/202 between the individual interface checks (dashed arrows) causes intermediate set/clear transitions, whereas updating only after all interfaces have been checked (solid arrows) produces a single clean transition to and from fault code 202.]

Figure 16 Element disconnect/reconnect

If there are several disconnected addresses, the communication failures may have been caused by a common DCN fault. Therefore upon a reconnection it is sensible to continue connection status checking of the other DCN failures more rapidly than in the case of no reconnection. This can be achieved for example by considering the number of failed addresses to check per cycle as the maximum number of failing checks per cycle.

Connection status information is produced each time a command is executed. This information should always be utilised to keep the connection status information as up to date as possible. It is, however, not desirable to keep connection status information about addresses that are not to be monitored, because the information would not be kept up to date (the connection status of unmonitored addresses is not checked for consistency). To minimise connection state fault code changes, the fault codes should also here be updated only after checking all interfaces. A scenario clarifying this situation is depicted in Figure 17. If commands originated from other sources than the poller itself are used for updating the connection status, the semantics of the failure threshold can be chosen in at least two ways:
- the threshold specifies the number of consecutively failed commands
- the threshold specifies the number of consecutive fault poll cycles during which all commands have failed
[Figure 17 is a sequence diagram: a command client issues a data transfer command, interface choosing consults the connection status and executes the command on Serial0, the answer is returned to the client and reported to the connection status, after which a connection check polls the other interface (Serial1) and the connection status is updated only once all interfaces have been checked.]

Figure 17 Command execution and connection checking

When a complete Q1 management bus (or an interface to it) is cut, it is desirable to get a bus specific loss fault code set for the bus in question. Therefore it is sensible to implement bus loss deduction without any threshold. When all monitored addresses fail to answer even once, it is highly likely that the whole bus is cut. When no threshold for bus specific loss is combined with a threshold for address specific losses, the bus specific fault code usually becomes set before the address specific fault codes when the whole bus is cut. Also it is sensible not to set the address specific loss faults while a bus specific loss fault is active.

3.2.4.3 Autodiscovery

The purpose of autodiscovery is to monitor the Q1 address space, especially the part which is presumed to be unoccupied, for element existence and to obtain element identification information from responding elements. In some environments autodiscovery could have collaborations with network configuration consistency checking. This kind of functionality is very much independent of fault polling and DCN monitoring. It is usually most conveniently implemented by a component which can be loosely coupled with fault polling and the Q1 protocol stack. If, however, the autodiscovery frequency is parameterised relative to the fault polling cycle (e.g. one address for each fault poll cycle), it is most convenient to integrate minimal existence checking into the poller component and to implement identification queries within a separate component, which can register as a client for existence notifications generated by the poller.

3.2.4.4 Start-up

The startup of Nokia Q1 Poller is briefly depicted in Figure 18. The operation is relatively straightforward. A thing to notice is the fact that at poller startup registered notification clients receive an alarm for each active fault. For D/ND generation elements the timestamp of these alarms is generated by the poller upon receiving the FM command answers. Therefore it is possible that several alarms with different timestamps are generated about a single fault activation if the poller is restarted.

[Figure 18 is a sequence diagram of poller startup: the client opens the bus interfaces, sets the monitoring profile and registers for notifications; for all addresses the poller issues a Status command and, for all FEs, either a Fault Condition command (D/ND NE) or the Get New Events, Flush Event History and Get Active Alarms menu commands (E NE), sending an alarm to the client for each active fault, after which delta polling begins.]

Figure 18 Poller startup


3.2.4.5 Reconfiguration

There are numerous ways the poller configuration can change. The semantics of some configuration changes are not trivial.

1. Open a Q1 bus interface

Adding an interface to a Q1 bus may have an effect on the connection status information. If there are addresses which are considered lost, and there is only one interface defined for the bus, the addresses have fault code 202 set. It is also possible that the whole management bus is considered cut, i.e. the bus object has fault code 202 set. When an interface is added to the bus, there are two alternatives to handle the situation if the connection status is to be a boolean truth-value:
- the added interface is assumed to be working
- the added interface is assumed to be failing

If the first alternative is chosen, each address specific fault code 202 should be converted to fault code 200. Additionally it must be decided whether cancels and alarms are generated and reported to notification clients. Also if a bus specific loss is active, the added interface changes the situation so that fault code 202 for the bus should be changed to fault code 200. If the second alternative is chosen, fault code 201 should be set for each connected address. Also if the complete bus was not considered cut, fault code 201 for the bus object should be set. For lost addresses the fault code 202 can remain set as it was before.

It is also possible to defer the fault code changes until the status of the new interface is discovered. However, this choice leaves the added interface in an unknown state for the duration of the status discovery and the implementation of connection status has to be able to manage the indeterminate situation.

If the bus had more than one interface already before this configuration change, the semantics of fault codes 200, 201 and 202 become blurred, as they have been defined for environments where the maximum number of bus interfaces is two. For example the following semantics could be used in this case:
- fault code 200 is set when an interface chosen as the primary fails, but there are other working interfaces
- fault code 201 is set when there are working interfaces but an interface chosen as the secondary fails
- fault code 202 is set when all interfaces fail

With this scheme it is possible to have both fault codes 200 and 201 set for an address. If the added interface is assumed to be initially working, fault codes 202 should be converted to a combination of fault codes 200 and 201. If the added interface is assumed to be failing, the state can only be indicated for addresses all of whose interfaces have failed, and in this case fault code 202 is already set.
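As an illustration of the conversions discussed above, the following sketch maps the currently active loss fault code of an address (or of the bus object) to the code that should be active after an interface has been added to a previously unprotected bus; notification generation is deliberately left out and the function name is invented.

/* old_fc: 0 (no loss fault), 200, 201 or 202. Returns the fault code that
   should be active after the interface has been added. */
static int loss_fc_after_interface_added(int old_fc, int assume_new_if_working)
{
    if (assume_new_if_working) {
        /* A working interface appears: total loss becomes "primary lost". */
        return (old_fc == 202) ? 200 : old_fc;
    }
    /* The new interface is assumed to be failing: connected addresses get
       "secondary lost"; existing loss faults stay as they are. */
    return (old_fc == 0) ? 201 : old_fc;
}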

Another possibility for maintaining fault codes 200-202 is to observe only two of the interfaces and ignore the state of additional interfaces.

2. Close a Q1 bus interface

This change can also affect the connection status information. When an interface of a bus with two interfaces is closed, i.e. a protected bus becomes unprotected, the state of addresses with connection failure changes. There are two scenarios:
- the primary direction interface is closed
- the secondary direction interface is closed

When the primary direction interface is closed, the connection status information for the secondary direction should become the information for the only remaining interface regarded now as the primary direction. As the bus is now unprotected, fault codes 201 should be converted to codes 202. All fault codes 200 should be canceled. Fault codes 202 can remain as they were. This applies to both address specific and bus specific faults. When the secondary direction is closed, fault codes 200 should be converted to fault codes 202. All fault codes 201 should be canceled. Fault codes 202 can remain as they were. This applies to both address specific and bus specific faults. If the bus had three interfaces before this change and all interfaces are observed for the loss fault codes as described above, it is necessary to convert combinations of fault codes 200 and 201 to fault code 202. 3. Add addresses to monitor Adding an address to the set of monitored addresses can affect bus connection state. If the address is initially assumed to be working and there is a loss code active for the bus, the loss should be canceled. If the address is initially assumed to be failing, address specific loss fault should be set. Also possible faults already stored for the address need considerations, see below. 4. Remove addresses from monitoring Removing an address from the set of monitored addresses can affect bus connection state. If the removed address was the last connected address (for one interface or all interfaces), an appropriate bus specific loss fault should be set. Additionally it must be decided what is done to the faults set for the removed address. If the faults remain in the fault database, at least the connection status fault codes should be updated when the address is returned to the set of monitored addresses. If the faults are cleared, it must be decided whether cancels are generated and sent to notification clients. 5. Set connection loss threshold When the connection loss threshold is increased, the loss count for some addresses can fall below the threshold and those addresses should therefore be considered reconnected. Also if the threshold is decreased, some addresses may become lost. Fortunately, it is

Fortunately, it is easy to defer these changes so that they become active the next time the loss counts are updated.

6. Set polling cycle target time

This may cause the current polling cycle to overflow its target. It is also possible that an indicated overflow becomes invalid. However, it should be adequate to take the new target time into use for the next started cycle.

7. Set filter between monitored elements and fault database

When the filter between monitored elements and fault database changes, the issue to consider is whether we should resynchronise the fault database to be in agreement with the new filter. The straightforward way to do the resynchronisation is to clear the fault database and reload all fault information from monitored elements through the new filter. However, generating cancels for all faults before the upload can cause a massive burst of cancels, and many of the cancels are probably annulled during the upload. If cancels are to be generated, it is highly desirable to defer them. Alternatively the faults in the database can be tested as alarms against the new filter, and cancelled if they do not pass. Additionally the fault status of all addresses is set to unknown, which causes a full comparison of element fault status and local fault status information with the new filter in effect.

It should be noted that it is not possible to reconstruct history upon a filter change. For example it is possible to define a filter which excludes all cancels. If this kind of filter is taken into use, it is impossible to undo the inclusion of cancels that has already been done, i.e. it is impossible to know which faults would have remained in the fault database if the filter had been in effect earlier. By far the simplest solution is to leave the fault database as it is and to let the user of the system manually order a restart or resynchronisation if desired.

8. Set filter for a notification client

When the filter for a notification interface changes, the issue is whether we should try to resynchronise the client. The inability to reconstruct history applies here also. Disregarding that, it would be possible to go through the fault database, test each fault as an alarm against the filters and perform the following operations:
- if the alarm passes the new filter, but does not pass the old filter, generate an alarm
- if the alarm passes the old filter, but does not pass the new filter, generate a cancel (it is not very easy to decide whether this cancel should also be checked against the new filter)

By far the simplest solution here is to let the client do the synchronisation, if it pleases.

9. Set global classificator

The issue here is whether already stored fault information should be reclassified to be in agreement with the new classificator. If the faults in the fault database are reclassified, it must be decided whether cancels and alarms are generated.

10. Set classificator for an interface

Typically this can only affect events that have not been reported yet. The new classificator is used for those events. If the interface maintains fault information, the severity should be reassigned to match the new classificator.

3.2.4.6 Shutdown

Usually there is nothing special to do upon poller shutdown. All acquired resources are released and operation is ended. In some environments a special notification might be generated. There should be no need to, for example, cancel all stored faults upon shutdown.


4 IMPLEMENTATION
4.1 General
The implementation language for the component is ANSI/ISO C. This makes the use of some object-oriented constructs a bit cumbersome but makes it easier to support more platforms. The following general rules are followed while mapping object oriented constructs to C:
- Objects of a class are implemented as variables having a structure type.
- Data members of an object are mapped to structure fields.
- Non-virtual methods are implemented as global functions taking as their first argument a pointer to the target object.
- Virtual methods are implemented as function pointers embedded in the object structure, referring to functions taking as their first argument a pointer to the target object.
- Inheritance is implemented with aggregation, i.e. a derived class structure has a field of the type of the base class. If the base class field is the first field of the derived class structure, it is possible to simply use an explicit cast in overridden methods to convert the base class object type to the derived class. When multiple inheritance is required, it is necessary to use pointer arithmetic in overridden methods of a class whose field is not the first field of the derived class structure in order to get the address of the derived class structure. Alternatively, explicit pointers to the complete object, embedded into the subobjects, can be used (this does indeed waste storage but simplifies things a lot).
- Non-hierarchical structures in the class hierarchy are to be avoided as this raises the issue of virtual vs. non-virtual inheritance (see appendix B).
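As an illustration of these rules, a minimal sketch follows. The Shape and Circle names are invented for this example and do not appear in the actual implementation; the sketch only shows the general shape of the mapping.

/* Base class with one virtual method. */
typedef struct Shape {
    void (*draw)(struct Shape *self);  /* virtual method as a function pointer */
    int x, y;                          /* data members as structure fields     */
} Shape;

/* Non-virtual method: a global function taking the target object first. */
static void Shape_move(Shape *self, int dx, int dy)
{
    self->x += dx;
    self->y += dy;
}

/* Derived class: inheritance by aggregation, base as the first field. */
typedef struct Circle {
    Shape base;
    int radius;
} Circle;

/* Overridden virtual method: the explicit cast works because base is first. */
static void Circle_draw(Shape *self)
{
    Circle *c = (Circle *) self;
    /* ... render a circle of c->radius at (c->base.x, c->base.y) ... */
    (void) c;
}

/* Constructors must be called explicitly and must set the function pointers. */
static void Circle_init(Circle *c, int x, int y, int radius)
{
    c->base.draw = Circle_draw;
    c->base.x = x;
    c->base.y = y;
    c->radius = radius;
}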


There are several disadvantages to using C as compared to languages with more direct OO support. In C one has to explicitly call default constructors and destructors. The virtual function pointers have to be explicitly initialised. The this or self pointer is explicitly present in functions acting as methods, and its type, and in the case of multiple inheritance even its address, has to be explicitly adjusted. All this results in more code to be written and is prone to errors. Therefore it is sensible to insert assertions in the code to check as frequently as possible that, for example, the pointer arithmetic has not resulted in calling a function with an invalid pointer. For this purpose the function pointers embedded in the object structures can be used, as they are constant data during the lifetime of an object and indirectly provide class identity information about the object. In C it is also next to impossible to enforce the member access control usually supported by OO languages.

Some advantages can also be seen in using C as compared to languages with more direct OO support. As the OO constructs have to be explicitly programmed in C, they do not go unnoticed. It is not uncommon to have performance problems caused by frequent constructor/destructor calls in C++ programs. Also the programmer has more control over the nature of the OO constructs, so they can be tailored to better support the design to be implemented.


4.2 Q1MasterCom implementation


Nokia Q1 Poller can be seen as an extension to Q1MasterCom. Q1MasterCom uses in its implementation a collection of modules which are grouped into two libraries, AL and UL, and these modules are also utilised by Nokia Q1 Poller.
Figure 19 AL classes (diagram: al_msg_queue, al_msg, al_io_system, al_io_handler, al_ticker, al_timer, al_iostream, al_ios_msg, al_ios_rc_msg, al_ios_wc_msg and the example classes ConcreteHandler and ConcreteStream)

AL is a very small library, whose classes are depicted in Figure 19. All classes in AL are abstract. There are two seemingly independent parts in the library: I/O and messaging. The classes ConcreteStream and ConcreteHandler are just examples to show how the I/O and messaging are linked to each other in an implementation. The purpose of the library is to define a simple framework for asynchronous operation inside a single thread of control. A thread using the framework has an instance of al_msg_queue and an instance of al_io_system. The al_io_system instance offers the capability to wait for I/O and to dispatch the appropriate I/O handler when I/O occurs. The al_msg_queue offers just a single point to which messages can be sent, and the messages in the queue can be dispatched upon request. The message base class al_msg simply defines one virtual method for dispatching the message. The classes al_timer, al_ticker and al_iostream are abstractions of a delay timer, a periodic timer and a full duplex I/O stream. These are the I/O abstractions needed by the Q1 protocol and the poller.

UL is a more or less arbitrary collection of utility modules. These modules are utilised by AL and concrete descendants of AL base classes as well as other modules of Q1MasterCom and Nokia Q1 Poller. The following UL functionality is used in Nokia Q1 Poller:
- singly linked list
- doubly linked list

- bit vector
- red-black tree
- priority queue
- target fraction queue
- fixed size block allocator
- time conversion
- 64 bit arithmetic for 32 bit systems
- tracing and profiling

A class diagram depicting the central Q1MasterCom classes is shown in Figure 20. One minor advantage of using C as opposed to, for example, C++ can be seen from the figure. With C it is possible to combine aggregation and specialization. For example, the class al_ios_rc_msg is an abstract class, but it is possible to implement its virtual functions inside the containing q1MComBus class, because the function pointers must be explicitly initialised and they can therefore be initialised while constructing an instance of the q1MComBus class. Actually this figure also shows that the semantics of aggregation and inheritance are not always as clearly distinct as it often seems: it would also be valid to show the association between al_ios_rc_msg and q1MComBus as inheritance.
Figure 20 Central Q1MasterCom classes (diagram: q1MComPipe, q1MComBus, q1MComBusChannel, q1MComBusScheduler, q1MComCommandQueue, q1MComQueuedCmd, q1MComConfigurer, q1MComBusCommand, q1MComBusCmdTarget, q1MComBusCmdControl, q1MComBusCmdStatus, q1MComCommand, q1MComCommandPoll, q1MComCommandStatus, q1MComCommandFaults, q1MComCommandRunner, q1MComPacketCollector and q1MComCosiSerialStream, together with the AL classes al_msg, al_ios_msg, al_ios_rc_msg, al_ios_wc_msg, al_iostream and al_timer)

4.3 Choices for distributing data and functionality


The Q1 polling functionality can be distributed among objects in various ways and the objects can be mapped in various ways to implementation entities (e.g. processes and processors). The most appropriate choice depends on the target environment. A good starting point for considering the distribution is the generic polling data flow diagram. When the state of the monitored network is stable, the activity spins along the data flow loop from known addresses data store to the X bit query process and back. In many environments this is likely to be the most common activity executed by the poller. There are some conflicting goals for the design:

- The X-bit query should execute as rapidly as possible, i.e. the polling activity should loop through the monitored addresses as fast as possible in order to detect fault status changes as fast as possible. The time from a fault status change to the information being present in the management system should be minimised.
- The polling should not hog system resources, especially CPU time, if other activities are present in the execution environment.
- The code for executing a polling command should share the generic Q1 command execution implementation, i.e. it should not be necessary to write special code for the polling command execution.

Executing the X-bit query requires writing the polling command to the bus interface and reading the answer. Optimising the query for speed argues for tight coupling between the command execution code and the bus interface. When there are several processors present in the runtime environment, this argues for implementing the X-bit query functionality on the CPU with direct access to the bus interface hardware. In environments which have a special purpose CPU for serial channel handling, this also helps to achieve the second goal. However, the CPU load can also be reduced by slowing down the polling. This naturally conflicts with the goal to speed up the X-bit query.

From the generic polling data flow diagram it is possible to see that the X-bit query process accesses the data stores specifying the known addresses and changed addresses, which are shared by other processes. The X-bit query is likely to access the known addresses very frequently, so that data store should be directly accessible to the CPU executing the X-bit query. This data store can be efficiently implemented with a simple bit vector, storing one bit per address, and a counter looping through the addresses. For the complete Q1 address space this requires only 512 bytes for the bit vector, which means that at least the data storage requirements should not prevent implementing the X-bit query functionality on many intelligent serial communication boards, even with limited memory capacity. The data store specifying the changed addresses is likely to be infrequently updated, which suggests that the X-bit query can be more loosely coupled with it.
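As a sketch of this data store, a 512 byte bit vector with a looping counter could look like the following (the type and function names are invented for this example and are not the actual module interface):

#include <string.h>

#define Q1_ADDR_MAX 4095                 /* addresses 0..4094, one spare bit */

/* One bit per Q1 address: 4096 bits = 512 bytes for the whole address space. */
typedef struct KnownAddrs {
    unsigned char bits[512];
    unsigned int  next;                  /* counter looping through addresses */
} KnownAddrs;

static void knownAddrs_init(KnownAddrs *ka)
{
    memset(ka->bits, 0, sizeof ka->bits);
    ka->next = 0;
}

static void knownAddrs_set(KnownAddrs *ka, unsigned int addr, int known)
{
    if (known)
        ka->bits[addr >> 3] |= (unsigned char) (1u << (addr & 7));
    else
        ka->bits[addr >> 3] &= (unsigned char) ~(1u << (addr & 7));
}

/* Return the next address whose fault status is considered known, advancing
   the looping counter; returns -1 if no address is marked as known. */
static int knownAddrs_next(KnownAddrs *ka)
{
    unsigned int i;

    for (i = 0; i < Q1_ADDR_MAX; i++) {
        unsigned int a = (ka->next + i) % Q1_ADDR_MAX;
        if (ka->bits[a >> 3] & (1u << (a & 7))) {
            ka->next = (a + 1) % Q1_ADDR_MAX;
            return (int) a;
        }
    }
    return -1;
}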
Figure 21 X-bit query and Q1 protocol implemented by bus interface CPU (diagram: Q1ChangeQuery, Q1FaultQuery, Q1FaultsKnown, Q1FaultsChanged, Q1Bus and Q1Command, split between the Q1 bus interface host and the management interface host)

The X-bit query needs to co-operate with other command execution activities. The bus interfaces need to be accessed in a controlled manner for sending commands originating from the X-bit query and, for example, the fault status change query.

If the X-bit query implementation uses a generic Q1 command execution component, there should be no problem: the Q1 command execution is the single access point to the Q1 bus interfaces for all command clients (see Figure 21). This may, however, force quite a lot of functionality to be implemented on the CPU which hosts the X-bit query, and the resources in that environment can be very limited. To minimise the functionality to be implemented on the bus interface CPU, it is possible to take advantage of the fact that a Polling command is always a one packet transaction. The X-bit query can be implemented so that it uses the Q1 datalink interface shared by the generic Q1 command execution component (see Figure 22). (This might disturb the continued iteration of some commands for some elements.)

Figure 22 X-bit query on bus interface CPU with limited resources (diagram: Q1ChangeQuery, Q1FaultQuery, Q1FaultsKnown, Q1FaultsChanged, Q1DataLink, Q1Packet, Q1Bus and Q1Command, split between the Q1 bus interface host and the management interface host)

It is also possible to limit the functionality present on the Q1 bus interface CPU to the Q1 datalink layer, as is the case in DX 200 BSC (see Figure 23). In this case, however, the X-bit query burdens the management interface host CPU.
Figure 23 Datalink implemented by bus interface CPU (diagram: Q1DataLink and Q1Packet on the Q1 bus interface host; Q1ChangeQuery, Q1FaultQuery, Q1FaultsKnown, Q1FaultsChanged, Q1Bus and Q1Command on the management interface host)

If the Q1 bus interface host can be dedicated to the polling activity and there are adequate resources available, it is sensible to implement the functionality associated with all of the processes and data stores (including the fault database) shown in Figure 4 on the bus interface host. In this kind of environment the X-bit query proceeds rapidly and, additionally, fault status change notification generation is very fast as the new and old fault information are directly accessible to the same CPU. This kind of arrangement is shown in Figure 24.
If the fault queries from management systems are frequent and/or the query speed is an issue, it is probably sensible to implement fault database mirroring on the management interface host.

Figure 24 Mirrored fault database (diagram: Q1FaultPoller on the Q1 bus interface host; Q1FaultMirror, Q1FaultQuery and Q1FaultChange on the management interface host; Q1Bus, Q1Command and Q1Pipe between them)

The above compositions (without fault database mirroring) are also valid for environments which have no special Q1 bus interface CPU. In this kind of environment there is more freedom for the design. However, similar considerations for allocating functionality to operating system entities, i.e. processes and threads, apply. Tightly coupled functionality should be mapped to a single thread/process to minimise context switches.

4.4 Fault database


The choice of a suitable implementation for the fault database depends on the constraints set by the runtime environment. The anticipated amount of active faults, the available memory storage and the requirements for the CPU efficiency of fault queries and updates affect the choice.

A very straightforward way to implement the fault database is to store objects, which hold the identifying attributes (bus, ad, fe, sb, fc) and additional attributes (fe status, severity, timestamp), in a generic map container. This requires for each active fault the storage needed for all these attributes plus some container overhead. This approach has several advantages:
- allocation can be done in fixed size objects, which allows a simple and efficient implementation of allocation and freeing routines
- updates and queries are efficient (and with, for example, a balanced binary tree, performance degrades gracefully when the amount of active faults increases)
- the implementation is simple
- a generic map implementation can be directly utilised

Figure 25 Fault database as a tree (diagram: a FaultTree contains Bus objects, which contain NE (elementGeneration), FE (feStatus), SB and FC (timeStamp, severityMask) levels)
To reduce memory requirements, the faults can be arranged as a tree (see Figure 25). In this case, for example, when new fault codes become active in the same supervision block, new storage is required only for the FC specific attributes. This approach, however, has several disadvantages:
- variable size objects need to be allocated (or storage is wasted by always allocating a fixed size block which is big enough to hold the largest object), which makes allocation and freeing of storage more complex and less efficient
- updates and queries are less efficient
- the implementation is complex, i.e. a generic map implementation cannot be directly used

The update and query performance here depends on how the fault tree is implemented. For example, the levels of the tree can be implemented with linear containers or associative containers. The choice here is again dictated by speed/memory tradeoffs. It is also possible to choose an implementation between the two above schemes: for example, arrange the fault database as a tree up to the address level and store the rest of the fault attributes in objects stored in a map which is indexed with (fe, sb, fc).

The fault database interface can be as simple as the following set of operations:
- add fault: inputs: fault attributes; outputs: success/failure
- get first fault greater than or equal to given iterator: inputs: iterator (for example bus, address, fe, sb, fc); outputs: fault or null/sentinel indicating none found
- get next fault: inputs: iterator; outputs: fault or null/sentinel indicating none found
- remove fault: inputs: iterator; outputs: found/not found indication

A fault map can be efficiently implemented with a red-black tree. The fault allocation policy is likely to be environment dependent. The storage for the faults could be dynamically allocated from a common memory pool or a fixed memory area could be dedicated for faults.
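A minimal sketch of the direct map approach is shown below. The type and function names are invented for this example, and the timestamp is shown as a plain unsigned long although the design elsewhere uses the 64 bit millisecond counter from ul_time.c:

/* The identifying attributes form the map key; ordering the keys
   lexicographically gives the iteration order assumed by the
   "get first fault greater than or equal to" operation. */
typedef struct FaultKey {
    unsigned char  bus;   /* Q1 bus number        */
    unsigned short ad;    /* Q1 address, 0..4094  */
    unsigned char  fe;    /* functional entity    */
    unsigned char  sb;    /* supervision block    */
    unsigned char  fc;    /* fault code           */
} FaultKey;

typedef struct Fault {
    FaultKey      key;
    unsigned char feStatus;
    unsigned char severityMask;
    unsigned long timeStamp;
} Fault;

/* Lexicographic comparison usable as the ordering function of a red-black
   tree based map: negative/zero/positive like strcmp. */
static int faultKey_cmp(const FaultKey *a, const FaultKey *b)
{
    if (a->bus != b->bus) return (int) a->bus - (int) b->bus;
    if (a->ad  != b->ad)  return (int) a->ad  - (int) b->ad;
    if (a->fe  != b->fe)  return (int) a->fe  - (int) b->fe;
    if (a->sb  != b->sb)  return (int) a->sb  - (int) b->sb;
    return (int) a->fc - (int) b->fc;
}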

4.5 Fault database queries and notifications


From an implementation point of view, the association from Cancel to Alarm can be reduced to including only the timestamp attribute of the associated Alarm in the Cancel class. Also the inheritance relationships are more conveniently implemented with an additional attribute specifying the event type. A view of faults (implemented with the direct map approach) and notifications is shown in Figure 26. It is sensible to include accuracy information for the timestamp attribute in case the system interfaces allow reporting this information.
Figure 26 Implementation view to faults and notifications (diagram: a FaultMap containing Fault objects with attributes bus, ad, fe, sb, fc, elementGeneration, severityMask, feStatus, timeStamp and accuracy, and an Event class with attributes eventType, onTime and accuracy)

The query interface for external clients can be similar to the internally used fault database interface. If the fault database is stored in shared memory directly accessible to clients, only mutual exclusion and robustness of iterators need to be considered. If the queries are made through a message passing interface, the following query operation is likely to be more appropriate than the a-fault-at-a-time interface:
- inputs: lower bound iterator, upper bound iterator, maximum number of faults
- outputs: a set of faults, iterator to next unread fault
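As a sketch of what such a get-bulk style exchange could look like as plain structures, the following reuses the FaultKey and Fault types from the sketch in 4.4; the structure and field names are illustrative and are not the actual NMS/10 MF message layout:

#define FAULT_QUERY_MAX 32

typedef struct FaultQueryReq {
    FaultKey      lower;         /* lower bound iterator (inclusive)  */
    FaultKey      upper;         /* upper bound iterator (inclusive)  */
    unsigned char maxFaults;     /* at most this many faults in reply */
} FaultQueryReq;

typedef struct FaultQueryAns {
    unsigned char count;                   /* number of faults returned     */
    Fault         faults[FAULT_QUERY_MAX]; /* faults in ascending key order */
    FaultKey      next;                    /* iterator to next unread fault */
    unsigned char more;                    /* nonzero if next is valid      */
} FaultQueryAns;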

4.6 Filtering and severity assignment


The parser for the filters and classificators is likely to run in an environment distinct from the environment where the parsed expressions are to be evaluated. Therefore it is sensible to implement separate parser and evaluator. This allows more freedom for the parser implementation as the parser is usually used in an environment where for example memory constraints are not as severe as might be the case for the poller target environment. It is assumed that parser and lexical analyser generation tools and C++ are available in the parsing environment.

The parser for the filters and classificators is generated from a Bison parser specification and the lexical analyser from a Flex lexer specification. The use of Bison and Flex allows making the parser/lexer re-entrant. If Bison and Flex are not available and the application needs only one parser/lexer instance, conversion to standard lex and yacc should be straightforward. To simplify memory management, the syntax tree is implemented with reference counting smart pointers. The evaluator for the filter and classificator expressions is implemented in ANSI/ISO C. The syntax tree is packed into an array and pointer references are converted to indexes. The representation is not architecture neutral (for example the byte order of parser and evaluator must match). If the packed representation is to be transferred between hosts having different architectures, marshalling and unmarshalling can be implemented in a relatively straightforward way, as the representation is a table.
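As an illustration of the packed representation, a sketch follows. The node layout, the operator set and the fault_attr helper are invented for this example (and the Fault type is the one from the sketch in 4.4); they do not describe the actual evaluator:

/* Pointer references are replaced by indexes into one node array, so the
   whole expression can be shipped to the poller as a flat table. */
typedef enum {
    FNODE_CONST,      /* constant value                        */
    FNODE_ATTR,       /* fault attribute (bus, ad, fe, sb, fc) */
    FNODE_EQ,         /* left == right                         */
    FNODE_AND,        /* left && right                         */
    FNODE_OR          /* left || right                         */
} FNodeType;

typedef struct FNode {
    unsigned char  type;         /* FNodeType                         */
    unsigned char  attr;         /* attribute selector for FNODE_ATTR */
    unsigned short left, right;  /* child indexes for binary nodes    */
    long           value;        /* constant for FNODE_CONST          */
} FNode;

extern long fault_attr(const Fault *f, unsigned int attr);  /* hypothetical accessor */

/* Recursive evaluation of node i against the attributes of one fault. */
static long fexpr_eval(const FNode *nodes, unsigned short i, const Fault *f)
{
    const FNode *n = &nodes[i];

    switch (n->type) {
    case FNODE_CONST: return n->value;
    case FNODE_ATTR:  return fault_attr(f, n->attr);
    case FNODE_EQ:    return fexpr_eval(nodes, n->left, f) ==
                             fexpr_eval(nodes, n->right, f);
    case FNODE_AND:   return fexpr_eval(nodes, n->left, f) &&
                             fexpr_eval(nodes, n->right, f);
    case FNODE_OR:    return fexpr_eval(nodes, n->left, f) ||
                             fexpr_eval(nodes, n->right, f);
    default:          return 0;
    }
}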

4.7 Connection state


The loss counters shown in Figure 14 can be implemented with a straightforward table. In most cases the wasted space introduced by a table (usually not all addresses are monitored) is negligible if, for example, one byte per counter is used. The connection state should always be updated when commands are executed to monitored addresses, regardless of the origin of the command.
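A sketch of such a table for one bus interface follows; the names and the saturation limit are invented for this example:

#define Q1_ADDR_COUNT  4095
#define LOSS_COUNT_MAX 255

typedef struct LossCounters {
    unsigned char count[Q1_ADDR_COUNT];  /* consecutive failed cycles        */
    unsigned char threshold;             /* loss detection threshold, cycles */
} LossCounters;

/* Record the outcome of a command to a monitored address and report whether
   the address crossed the loss threshold as a result of this update. */
static int lossCounters_update(LossCounters *lc, unsigned int addr, int answered)
{
    unsigned char old = lc->count[addr];

    if (answered)
        lc->count[addr] = 0;
    else if (lc->count[addr] < LOSS_COUNT_MAX)
        lc->count[addr]++;

    return old < lc->threshold && lc->count[addr] >= lc->threshold;
}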

4.8 DCN statistics


Depending on memory capacity considerations, the DCN statistic counters can be collected with varying granularity. The most general approach would be to specify a monitoring profile in the poller configuration and to dynamically allocate storage according to the configuration. However, collecting counters associated with a bus interface should usually be accurate enough. The DCN statistic counters can be presented as special PM counters at external interfaces.

4.9 Element queries


There are (at least) two approaches to implementing the functionality described by the polling data flow diagrams:
- construct classes for each process and data store
- construct classes implementing the functionality associated with paths in the data flow diagram

In both alternatives the generic polling data flow diagram can be interpreted as specifying the needed base classes and the E and D/ND specific diagrams show specialization. In the first alternative we end up with a lot of classes and interfaces between them. For the second alternative we end up duplicating functionality as there are several paths passing through the same processes and data stores.

One important aspect that the implementation must address is memory usage. In many environments conservative memory usage is critical. Also dynamic memory allocation is to be avoided as it introduces runtime penalties and exception handling problems, and makes the behaviour of the component hard to predict. Storing the following information requires special attention:
- addresses to monitor
- addresses whose fault status is considered known
- addresses whose fault status is considered unknown
- addresses whose fault status is considered changed
- addresses considered disconnected
- functional entities present in monitored addresses
- functional entities whose fault status is considered known
- functional entities whose fault status is considered unknown
- functional entities whose fault status is considered changed
- Get New Events menu command answer sequence number for each E generation functional entity

The Q1 address space for a bus is 0-4094. Functional entities are numbered 0-254. Therefore the theoretical maximum number of FEs on a bus is 255*4095=1044225. This suggests that it is acceptable to store a small amount of address specific information for the whole address range, but FE specific information cannot be stored for the complete theoretical maximum. To address this, the fault polling data flow diagrams can be consulted to find out where FE specific information is needed. The "get faults" process in Figure 4 needs the information specifying functional entities whose fault status is considered unknown. For simplicity, let's ignore the "functional entity consistency check" process for now. The "unknown functional entities" data store is updated by the "get FEs" process. To limit the storage required by this data store, the processes "get FEs" and "get faults" can operate in a producer-consumer fashion with flow control. The same strategy can be used for limiting the memory required by other data stores too. A set of classes for implementing the element queries in this fashion is shown in Figure 27.


Figure 27 Element queries (diagram: NokiaQ1ChangeQuery, NokiaQ1FEChangeQuery, NokiaQ1FullQuery, DeltaQuery, FullQuery, StatusQuery and TMS4FaultQuery classes with their request queues, using q1MComCommandGetChanges, q1MComCommandFaults, q1MComCommandDataTransfer and q1MComCommandStatus)
The external limit for the fault polling speed is set by the Q1 command response times of the monitored elements and the Q1 bus communication characteristics (bit rate, delays). To approach the limit, as much as possible of the computation should be overlapped with the waiting for answer packets to Q1 commands. Another way to state this is to require that the Q1 command execution component should be kept as busy as possible: it should be waiting for an answer packet most of the time. To achieve this, it must be possible to begin the execution of the next Q1 command immediately upon completion of the previous command, i.e. there should always be at least one command queued for execution while another command is active. As the most commonly executed command is the Polling command, the above requires that at least the X-bit query activity should employ double buffering (or N-buffering), i.e. there should be at least two queries active in parallel.

Another conclusion which can be drawn from the above considerations is that the Q1 commands should be queued as close as possible to the Q1 command execution. This way a new command can be popped from the queue immediately upon command iteration completion. All nontrivial processing, i.e. Q1 bus interface choosing, connection status updating and fault status updating, should be located outside the path between the command queue and the command execution.

For scheduling the queries a choice between an active and a passive scheduler (as described in 3.2.4.1) has to be made. Using an active scheduler seems more appropriate for a number of reasons. When a scheduler acting as a central mediator is used, it is easier to construct the queries in such a manner that they are independent of the scheduling policies enforced by the scheduler and even of the overall architecture of the whole Nokia Q1 Poller application.

This makes the queries potentially reusable in a wider context than just the Nokia Q1 Poller and more easily testable than active pollers. Also the behaviour associated with configuration changes is likely to be easier to implement when there is one central object scheduling the queries and commands to the Q1 bus, as configuration changes may call for suspending/aborting active processing. The overall collaborations required for an active scheduler employing passive queries are shown in Figure 28. The diagram is not a standard UML collaboration diagram; it only shows some central objects taking part in fault polling and command execution, with the arrows showing interaction and the arrow direction showing the uses relation.
Figure 28 Overall Q1 command execution and fault polling collaborations (diagram: Scheduler, XPoller, FullPoller, DeltaPoller, ConnectionMonitor, q1MComPipe, q1MComBus, q1MComBusScheduler and q1MComCmdQueue objects)


5 MODULE TEST PLAN


5.1 General
As many of the modules as possible shall be tested in isolation from other modules i.e. it should be possible to construct a module test for each module separately. The code coverage target for all environment independent modules is 100%. All module tests shall be runnable in batch mode (noninteractively). The module tests shall be updated upon defect discovery to reproduce the incorrect behaviour, i.e. the module tests should be usable as regression tests.

5.2 Unsigned 64 bit integer arithmetic


Module ul_math.c implements the following functions:
- initialize unsigned 64 bit integer by specifying most and least significant 32 bits
- check if given unsigned 64 bit integer is zero
- compare two unsigned 64 bit integers
- shift the bits of given unsigned 64 bit integer to left
- shift the bits of given unsigned 64 bit integer to right
- subtract two unsigned 64 bit integers
- add two unsigned 64 bit integers
- multiply two unsigned 32 bit integers and store result to an unsigned 64 bit integer
- divide an unsigned 64 bit integer by an unsigned 32 bit integer and return quotient and remainder as unsigned 32 bit integers
- add an unsigned 32 bit integer to an unsigned 64 bit integer
- subtract an unsigned 32 bit integer from an unsigned 64 bit integer

The module tester can be implemented by exercising the functions and comparing the results with 64 bit arithmetic implemented by, for example, the underlying hardware, the compiler used or the runtime library.
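As an illustration of the comparison idea, a sketch follows. The u64 type and the u64_add function are assumptions standing in for the real ul_math.c interface, and the reference arithmetic assumes a native unsigned long long is available in the test environment:

#include <assert.h>

typedef struct u64 {
    unsigned long hi;   /* most significant 32 bits  */
    unsigned long lo;   /* least significant 32 bits */
} u64;

static u64 u64_add(u64 a, u64 b)
{
    u64 r;

    r.lo = (a.lo + b.lo) & 0xFFFFFFFFul;
    r.hi = (a.hi + b.hi + (r.lo < a.lo ? 1ul : 0ul)) & 0xFFFFFFFFul;
    return r;
}

/* Reference check against native 64 bit arithmetic. */
static void test_u64_add(unsigned long ah, unsigned long al,
                         unsigned long bh, unsigned long bl)
{
    u64 a, b, r;
    unsigned long long ref;

    a.hi = ah; a.lo = al;
    b.hi = bh; b.lo = bl;
    r = u64_add(a, b);
    ref = (((unsigned long long) ah << 32) | al)
        + (((unsigned long long) bh << 32) | bl);
    assert(r.hi == (unsigned long) ((ref >> 32) & 0xFFFFFFFFull));
    assert(r.lo == (unsigned long) (ref & 0xFFFFFFFFull));
}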

5.3 Time conversion


Module ul_time.c implements the following functions:
- construct 64 bit millisecond counter from year, month, day, hour, minute, second and millisecond
- break 64 bit millisecond counter to year, month, day, hour, minute, second and millisecond

These routines can be tested by exercising the routines with different values and comparing the results with runtime library behaviour. Note that the behaviour of time related routines is often loosely specified.


The time manipulated by these routines is intended to refer to UTC time, but leap seconds are simply ignored. The millisecond counter measures time from 1.1.1970 00:00 UTC (again ignoring leap seconds).

5.4 Linked lists


The linked list macros are defined in ul_sll.h and ul_dll.h. A module tester for the singly linked list shall exercise the following operations:
- add to tail
- add to head
- pop head
- pop tail
- insert before an item
- insert after an item
- remove an item

For the doubly linked list, the following additional operations shall be tested:

5.5 Bit vector


The bit vector implemented in ul_bv.c has the following interface:
- initialize a fixed size bit vector
- set a bit by index
- clear a bit by index
- get bit state by index
- find first non-zero bit starting from given index
- find first zero bit starting from given index

The module can be tested easily by duplicating the expected contents into a simple array and by checking the consistency of the bit vector and the array during the test.
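A sketch of such a tester follows. The ul_bv type and the function names used here are assumptions standing in for the real ul_bv.c interface:

#include <assert.h>
#include <stdlib.h>

/* Assumed interface of ul_bv.c for this sketch; the real names may differ. */
typedef struct ul_bv ul_bv;
extern void ul_bv_init(ul_bv *bv, int bits);
extern void ul_bv_set(ul_bv *bv, int index);
extern void ul_bv_clear(ul_bv *bv, int index);
extern int  ul_bv_get(const ul_bv *bv, int index);

#define BV_BITS 4096

static void check_consistency(const ul_bv *bv, const unsigned char *shadow)
{
    int i;

    for (i = 0; i < BV_BITS; i++)
        assert(ul_bv_get(bv, i) == shadow[i]);
}

static void test_bit_vector(ul_bv *bv)
{
    static unsigned char shadow[BV_BITS];   /* zero initialised shadow copy */
    int i, idx;

    ul_bv_init(bv, BV_BITS);
    for (i = 0; i < 10000; i++) {
        idx = rand() % BV_BITS;
        if (rand() & 1) {
            ul_bv_set(bv, idx);
            shadow[idx] = 1;
        } else {
            ul_bv_clear(bv, idx);
            shadow[idx] = 0;
        }
        check_consistency(bv, shadow);
    }
}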

5.6 Red-black tree


The red-black tree implemented in ul_rbt.c has the following interface:
- add given item to the tree
- put given item to the tree and remove possibly matching item
- replace item
- remove given item from the tree
- find an item
- remove all items from the tree
- get first item
- get next item
- get greater or equal item
- get greater item
- get last item

Copyright Nokia Telecommunications Oy

NOKIA
NET/CO/PS/OSS/EMS/TRS

Nokia Q1 Poller Software Design Specification 14.01.2000 / v 1.01

Company Internal Draft Page 65 of 78

The implementation can be tested either as a black box ordered associative container, or the properties defined for a red-black tree can also be checked. A red-black tree has the following properties:
- every node in a tree is either red or black
- every leaf is a nil node, which is coloured black
- if a node is red, then both of its children are black
- every simple path from a node to a descendant leaf contains the same number of black nodes

These properties guarantee that any path from the root to a leaf is no more than twice as long as any other, i.e. they set an upper limit for the unbalance of the tree. Checking the first three properties is relatively straightforward, but the last one requires nontrivial coding.
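A sketch of a check for the last property follows; the node structure and colour encoding are assumptions for this example and the real ul_rbt.c layout may differ:

#include <stddef.h>

#define RBT_BLACK 0
#define RBT_RED   1

typedef struct rbt_node {
    struct rbt_node *left, *right;
    int              colour;
} rbt_node;

/* Returns the black height of the subtree rooted at n, or -1 if two paths
   below n contain a different number of black nodes. */
static int black_height(const rbt_node *n)
{
    int lh, rh;

    if (n == NULL)                 /* nil leaves count as black */
        return 1;

    lh = black_height(n->left);
    rh = black_height(n->right);
    if (lh < 0 || rh < 0 || lh != rh)
        return -1;                 /* property violated below this node */

    return lh + (n->colour == RBT_BLACK ? 1 : 0);
}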

5.7 Priority queue


The priority queue implemented in ul_lpq.c is suitable for use when the number of priorities is large. It uses the singly linked list macros to keep items of the same priority in a linked list, and the red-black tree to keep the list heads of different priorities in an ordered associative container. The priority queue interface has the following operations:
- push an item to the queue
- pop an item from the queue
- check whether queue is empty

At least the following properties shall be checked by the module test:
- items pushed into queue with constant priority are popped in FIFO order
- items with distinct priority are popped in priority order regardless of push order

5.8 Target fraction queue


The target fraction queue implemented in ul_tfq.c can be regarded as a kind of dynamic priority queue. An item pushed to the queue has a class and each class has a target fraction setting in the queue. The items are popped from the queue so that the number of items of a class popped is proportional to the target fraction setting for the class. The module tester shall check at least the following properties:
- items of a single class are popped in FIFO order
- items of classes with same target setting are popped in FIFO order
- when there are always items of each class available in the queue, the number of items popped for each class approaches the set targets

5.9 Fixed size block allocator


An allocator for fixed size objects is implemented in ul_ea.c and it has the following simple interface:
- initialize allocator
- allocate an element
- free an element


The tester shall exercise the above operations, modify the allocated elements and check that the data stored in the elements is preserved during operation.
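A sketch of such a tester follows. The ul_ea type, the function names and the element size used here are assumptions standing in for the real ul_ea.c interface:

#include <assert.h>
#include <stddef.h>

typedef struct ul_ea ul_ea;
extern void  ul_ea_init(ul_ea *ea, void *storage, size_t elemSize, size_t count);
extern void *ul_ea_alloc(ul_ea *ea);
extern void  ul_ea_free(ul_ea *ea, void *elem);

#define ELEM_SIZE  32
#define ELEM_COUNT 64

static void test_allocator(ul_ea *ea)
{
    static unsigned char storage[ELEM_SIZE * ELEM_COUNT];
    unsigned char *elems[ELEM_COUNT];
    size_t i, j;

    ul_ea_init(ea, storage, ELEM_SIZE, ELEM_COUNT);

    /* Allocate everything and stamp each element with its index. */
    for (i = 0; i < ELEM_COUNT; i++) {
        elems[i] = ul_ea_alloc(ea);
        assert(elems[i] != NULL);
        for (j = 0; j < ELEM_SIZE; j++)
            elems[i][j] = (unsigned char) i;
    }

    /* Verify that the contents survived the other allocations, then free. */
    for (i = 0; i < ELEM_COUNT; i++) {
        for (j = 0; j < ELEM_SIZE; j++)
            assert(elems[i][j] == (unsigned char) i);
        ul_ea_free(ea, elems[i]);
    }
}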

5.10 Q1 datalink protocol


The Q1 protocol datalink layer functionality is implemented by modules q1mcenc.c, q1mcdec.c and q1mcdlnk.c. The datalink encode function takes the following parameters:
- buffer containing data to encode
- buffer to store the encoded data to
- Q1 address

As the input data originates from the using application, the function asserts the input parameters for correctness. Correctness of the encoding shall be checked with short and long addresses and several data lengths (especially the boundary cases 0 and maximum).

The datalink decode function takes the following parameters:
- buffer containing data to decode
- buffer to store the decoded data to

The function reports success/failure and the decoded Q1 address. As the data for this function originates from an external source, both invalid and valid data should be checked.

The datalink packet collector interface consists of the following functions:
- data read from the bus is reported with q1MComPacketCollector_answer
- the collected data is decoded with q1MComPacketCollector_decode

The state field reports the current packet collector state. The state should observe the transitions shown in Figure 29.
Figure 29 Q1 datalink packet collector states (state diagram with states reading, finished, overflow, parity, format, timeout and garbage, and transitions triggered by answer() and decode() with conditions on the C bit, frame count, packet length, parity and packet format)


5.11 Q1 commands
The q1MComCommand descendants are implemented by the q1mcpcmd.c and q1mcxfer.c modules. The q1MComCommand interface is straightforward and easily testable:
- the command method is used to construct a command packet
- the answer method is used to report an answer packet to the command
- the failure method is used to report a communication failure to the command
- the abort method is used to abort a command
- the reset method is used to reset a command to its initial state

The tester shall exercise all of these methods for each command with valid inputs and with valid and invalid slave answers. Module q1mccr.c implements a simple command runner just keeping track of retries. Testing this module is most conveniently done while testing the various Q1 commands.

5.12 Q1 command execution


The q1MComBus class is implemented by the q1mcbus.c module. The class interface is very simple:
- add channel (q1MComBus_addChannel)
- remove channel (q1MComBus_delChannel)
- submit a command (q1MComBus_send)
- reset counters (q1MComBus_clearErrors, q1MComBus_clearStats)
- read counters (q1MComBus_getChannel)

However, there is a lot of functionality to be tested associated with command execution:
- when delays are zero, writing command packets should proceed without delay
- with non-zero inter packet delay, the delay should be observed between separate packets of one command and between packets of separate commands
- with non-zero switch delay, the delay should be observed between packets from/to different bus interfaces
- with non-zero empty packet delay, the delay should be observed upon an empty answer from the slave
- with non-zero data transfer delay, the delay should be observed between data transfer commands (actually the delay needs only be observed between commands to the same address, but this would complicate the implementation a lot)
- with non-zero transaction timeout, the command should abort upon transaction timeout expiration
- with zero retry count the command should abort upon the first failure
- with non-zero retry count, a packet should be retried upon answer packet timeout and upon an invalid answer packet
- a command should be aborted when the retry count is exhausted
- a command should be aborted when the empty reply limit is exhausted
- a command should be aborted when the packet limit is exhausted

- when a command fails to complete via one bus interface, the execution should be tried via other interfaces, if so specified
- specified packet sizes should be observed: transmitted packets should use the specified maximum and the specified maximum in answer packets should be accepted
- correct statistics should be reported in command answers

As the q1mcbus.c module uses the Q1 command runner (q1mccr.c) and Q1 datalink packet collector (q1mcdlnk.c) modules, these modules get reasonable coverage while testing q1mcbus.c.

5.13 Q1 command queue


As the Q1 command queue implemented in q1mccmdq.c uses the UL priority queue and target fraction queue implementations, most of the functionality is tested by the module testers for the used queue implementations. However, the following aspects should at least be asserted by the module tester for the Q1 command queue:
- commands pushed into the queue with constant priority are popped in FIFO order
- commands with positive priority are popped in priority order regardless of push order

5.14 Q1 command scheduling


Module q1mcbs.c implements the scheduling of Q1 commands to the Q1 bus. The current implementation of q1mcbus.c runs only one command at a time and q1mcbs.c enforces this fact by queueing any commands submitted while a command is in execution. (It seems that running several commands in parallel might not be desirable, as some elements might not tolerate interleaved commands.) Module q1mccmdq.c is used for queueing the commands. The module implements the q1MComBusScheduler class which has the following methods:
- submit a command
- suspend command execution
- enable command execution
- get command execution state (enabled/disabled)
- purge command queue

The module test shall exercise these methods and assert the following conditions:
- when the scheduler is disabled, no commands are passed to the bus
- when the scheduler is enabled, only one command at a time is passed to the bus
- after purge the command queue is empty


6 SOFTWARE COMPONENT INTERFACES


6.1 General
The Nokia Q1 Poller interfaces described here do not specify any IPC mechanism. The communication mechanism in NMS/10 MF is WosNuc messaging; for example, in COSI environments COSI messaging would be used. The Nokia Q1 Poller implementation allows adapting the interfaces to different IPC mechanisms.

6.2 Configuration interface


This interface is provided for configuring the component. It is an extension of the Q1MasterCom configuration interface. Depending on the target environment, the set of configurable parameters supported may vary. In NMS/10 MF C2.0 the following parameters can be set:
- bus number
- primary serial port and speed
- secondary serial port and speed (optional)
- global filter
- global classification
- retry count for polling commands
- retry count for commands passed through
- loss detection threshold as number of cycles
- the following parameters, used as defaults for commands passed through and for polling commands:
  - first character timeout in answer packet reception
  - total timeout for command execution through one serial interface
  - data transfer command delay
  - empty reply delay
  - inter packet delay
  - switch delay
  - empty reply limit for command execution through one serial channel
  - packet limit for command execution through one serial channel
- command passthrough activity time (when a command is passed through, this parameter specifies the time during which the polling should avoid commands that are likely to time out (lost checking and autodiscovery) after the last pass through command has been replied; this parameter can be used to optimize command pass through performance)
- number and unit (NE or FE) of fault status consistency checks per polling cycle
- number of DCN failure checks per polling cycle
- number of element existence (autodiscovery) checks per polling cycle
- polling cycle target time
- target fraction for fault status consistency checking
- target fraction for DCN failure checking
- target fraction for element existence checking

- target fraction for no activity
- addresses for D/ND generation fault polling
- addresses for E generation fault polling
- addresses for element existence checking
- time interval for E-generation element clock refresh

In NMS/10 MF the configuring happens in sessions. The interface employs the following messages:
- start configuration session, specifying the parameters which are to be changed
- set configurable parameter, which specifies the new value for a parameter
- end configuration session, indicating that the new parameters are to be taken into use

The purpose of configuration sessions is to enable controlled ownership change of global resources, e.g. serial ports. If the target environment has only one poller which does not share any resources with other entities, the interface can be implemented without sessions.

6.3 Alive check interface


This interface is provided for the entity responsible for the supervision of the operation of the component. The interface is similar to the interface offered by Q1MasterCom. Upon receiving an alive check request Nokia Q1 Poller makes reasonable effort to ensure that it is in consistent state and responds appropriately.

6.4 Shutdown interface


In many environments there is no need to shut down the poller. However, there is not much functionality associated with shutting down poller operation. Upon a shutdown request, the poller releases all allocated resources, responds to the request and stops operation in a manner appropriate for the target environment (for example in WosNuc, the process goes to an infinite message reception loop).

6.5 Q1 command pass through interface


This service is provided for passing through Q1 commands to the elements. It is similar to the interface offered by Q1MasterCom which in turn is similar to the NMS/10 MF internal Q1 command interface. The interface employs separate message types for each Q1 command. In NMS/10 MF environment there is the additional complication that large commands need to be iterated by using several messages as large Q1 commands/answers do not fit into one WosNuc message.

6.6 Fault status query interface


Depending on the fault database implementation, the fault status queries can be implemented either with a fault-at-a-time query interface or in a manner similar to the SNMP get-bulk operation. In NMS/10 MF the queries are made with WosNuc messages.

The query message includes lower bound and upper bound iterators specifying the range of faults to report and the maximum number of faults to include in the answer. The answer message is filled with faults starting from the lower bound iterator, and an iterator to the next unread fault is included in the message.

6.7 Fault status change notification interface


Usually a notification interface includes at least the following messages:
- register for notifications (client to server)
- send notification (server to client)
- cancel registration (client to server)

In NMS/10 MF this kind of notification interface is offered by a program block separate from the poller program block. The interface between poller and this data distribution program block uses acknowledged messages without message window. The interface between data distribution and the actual notification clients employs unacknowledged numbered notifications.

6.8 Element presence reporting


In NMS/10 MF the autodiscovery functionality is implemented by two separate program blocks: poller and info logger. Poller performs element existence checking according to its configuration and reports found elements with special notifications sent through the same interface as fault status change notifications. Info logger performs the actual identity queries upon reception of these notifications by using the Q1 pass through interface offered by poller.


7 ERROR HANDLING
One goal in this design has been to avoid operations that can fail in order to avoid error handling problems. Whenever possible, interfaces are designed so that errors can be indicated in the response when a client invokes an operation that can fail. For example, invalid configuration parameters are indicated in configuration interface reply messages. When errors cannot be indicated in this way, they should be logged and/or counted. In NMS/10 MF selected error conditions are indicated by special faults associated with Q1 bus number 254. The Nokia Q1 Poller manages faults associated with the following conditions:
- polling cycle target time overflow
- storage reserved for faults depleted
- storage reserved for FE specific state depleted
- invalid answers to fault query commands
- reserved fault codes reported by element

In addition, the following conditions are reported with the operating system exception reporting mechanism:
- serial interface error
- IPC failure
- invalid Q1 command answer

Consistency of internally computed data is checked with assertions. It is in general futile to implement exception reporting for this kind of internal failure.


8 OTHER TECHNICAL SOLUTIONS


There are a number of exceptional situations which are briefly described here. Some of them can be considered anomalous, as in normally operating systems they should not occur. However, deterministic operation even in those cases is naturally necessary.

1. Get New Events menu command sequence number mismatch is detected

This indicates that events have been lost. The current fault status of the FE in question should be queried.

2. Event history overflow warning is detected

This indicates that events have been lost. The current fault status of the FE in question should be queried.

3. Real time lost event is detected

The clock of the element in question should be set.

4. Element reports a reserved fault code

Fault codes 200-202 for FE=0, SB=0 reported by an element should be ignored if the TMC style connection status fault codes are maintained. The occurrence should, however, be logged.

5. FE count of an address changes

If more FEs than earlier are discovered in an address, the FEs are queried for faults. If the FE count decreases, the faults of the missing FEs should be canceled. If the FE count of an element has changed, it is also possible that the element has been replaced or, for example, functional entities have been renumbered. Therefore it is sensible to check the consistency of the fault status of the whole element in this situation.

6. An element reports FE status indicating a nonexistent FE with FE < last

In normal situations the FEs are numbered consecutively without gaps. However, when a unit associated with an FE is removed, there is a gap in the FEs until the FEs are renumbered. One possible way to handle this is to cancel the faults of the FE in question. Another quite acceptable alternative is to just log the occurrence and leave the fault database untouched. The missing FE should not be queried for faults as it is likely not to respond.

7. E generation element reports an event with an invalid severity mask

Arbitrary combinations of the bits of the severity field reported in E generation FM command answers are not allowed. For example, the warning and disturbance bits are mutually exclusive. Malformed events can either be discarded or propagated. If they are discarded, the occurrences should still be logged and/or counted.


If they are propagated, rules for deciding, for example, the type of a malformed event need to be established. The rules could be the following:
- the warning bit is checked first; if the bit is set, the activity and disturbance bits are ignored (all severity bits are also ignored in this case)
- the disturbance bit is checked after checking the warning bit; if the bit is set, the activity bit is ignored
- the severity bits are checked from most severe to least severe; when a set bit is found, the rest of the bits are ignored
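As an illustration of these rules, a sketch follows. The bit positions, mask values and type names are invented for this example and do not describe the actual E generation severity field layout:

#define SEV_WARNING     0x80u   /* assumed position of the warning bit     */
#define SEV_DISTURBANCE 0x40u   /* assumed position of the disturbance bit */
#define SEV_ACTIVITY    0x20u   /* assumed position of the activity bit    */

typedef enum {
    EVENT_WARNING,
    EVENT_DISTURBANCE,
    EVENT_ALARM
} EventType;

/* Decide the event type of a possibly malformed severity mask; the severity
   of an alarm is returned as the most severe set severity bit. */
static EventType classify_severity_mask(unsigned int mask, unsigned int *severity)
{
    unsigned int bit;

    *severity = 0;
    if (mask & SEV_WARNING)        /* warning first: everything else ignored */
        return EVENT_WARNING;
    if (mask & SEV_DISTURBANCE)    /* then disturbance: activity bit ignored */
        return EVENT_DISTURBANCE;
    for (bit = 0x10u; bit != 0; bit >>= 1) {  /* assumed five severity bits,
                                                 most severe first */
        if (mask & bit) {
            *severity = bit;
            return EVENT_ALARM;
        }
    }
    return EVENT_ALARM;            /* no severity bit set */
}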

8. E generation element reports a disturbance about a fault which is active

The choices here are whether the fault should be canceled, as the reported disturbance indicates that the fault is inactive (disturbances are reported only for inactive faults), and whether the notification should be propagated. This might also be an indication that events have been lost and therefore the current fault status should be queried.

9. E generation element reports a warning about a fault which is active

The choices here are whether the fault should be canceled, as warnings should be reported only for an SB, FC which does not indicate a fault, and whether the notification should be propagated.

10. E generation element reports an alarm about a fault which is active

As the fault is active, an alarm has already been reported and therefore the alarm should not be propagated. However, if it is desirable not to destroy any information and clients are not disturbed by an extra report, propagating the alarm should not do any harm. Also, if the timestamp of this new alarm is more recent than the locally stored timestamp, it is likely that the fault code has been cancelled and reactivated, but events have been lost.

11. E generation element reports a cancel about a fault which is not active

If it is desirable not to destroy any information and clients are not disturbed by an extra report, propagating the cancel should not do any harm. It is also possible that the cancel is due to a fault having been activated and cancelled, but the alarm event has been lost.

12. Get Active Alarms answer contains warnings/disturbances

As warnings and disturbances do not indicate an active state, they should not be present in a Get Active Alarms answer. Forwarding them as notifications should, however, not do any harm.

13. Time stamp is very old

If an element reports an event whose timestamp is older than the information in the local fault database, there are two possibilities: the clock of the element is wrong or the event has been superseded by more recent information.

A ABOUT NOTATIONS
As this document is not strictly formal, there is some ambiguity in interpreting and constructing, for example, the data flow diagrams. The arrows drawn between processes and data stores depict the flow of data, but the form of the data has usually not been specified. Also, the directions of the arrows could be chosen differently from what is presented here. These problems originate from the fact that the abstractions the diagrams describe are not well defined.

For example, if a process p updates a data store d, an arrow from p to d is appropriate if the abstractions have been defined so that d can be updated by a write-only operation. If the data store is interpreted as the memory area where the data is stored, it is usually necessary to read that area in addition to writing to it: if the data store is, for example, a balanced binary tree, it is necessary to find the appropriate location, adjust pointers and rebalance the tree. If, however, the data store is defined as an abstract data type providing an insert operation, storing the data can be seen as a write-only operation. This level of formalism is, however, most probably not very useful for the design of a component as simple as the Nokia Q1 Poller.
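As an illustration of the last point, a data store defined as an abstract data type makes the update look write-only to the caller even though the implementation reads and restructures internal state. The names below (FaultDb, FaultRecord) are invented for this sketch and are not part of the component interface.

/* Sketch only: FaultDb and FaultRecord are hypothetical names.
   From the caller's (and the data flow diagram's) point of view the
   insert is a pure write, even if the implementation is a balanced
   tree that has to search, relink and rebalance nodes internally. */
typedef struct FaultDb FaultDb;            /* opaque data store */
typedef struct FaultRecord FaultRecord;    /* data to be stored */

void FaultDb_insert(FaultDb *db, const FaultRecord *record);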

B MULTIPLE INHERITANCE IN C
Consider for example Figure 30 and the following two cases:

1. Command inherits ListNode for the purpose of storing commands to a command list. Answer inherits ListNode for the purpose of storing answers to an answer list. Transaction inherits both Command and Answer, so it should be usable both as a command and as an answer. Therefore it should be possible to store a transaction object to the command list and to the answer list at the same time, and Transaction objects should contain two ListNode subobjects. This is, in C++ terms, the nonvirtual multiple inheritance case.

2. Command inherits ListNode for the purpose of storing commands to a message list. Answer inherits ListNode for the purpose of storing answers to the same message list. In this case Transaction objects should contain only one ListNode subobject. This is, in C++ terms, the virtual multiple inheritance case.

To implement virtual inheritance in C it is necessary to defer the allocation of the base class storage to the inheriting classes. Methods of the classes Command and Answer can access ListNode through a pointer which is initialized to point to the actual location of the ListNode storage at object construction time. A code sample illustrating the virtual multiple inheritance case follows:
#include <stddef.h>   /* for offsetof() */

typedef struct ListNode {
    struct ListNode *next, *prev;
    ...
} ListNode;

void ListNode_init(ListNode *, ...)
{
    ...
}

typedef struct Command {
    ListNode *listNode;   /* points to the actual ListNode storage */
    ...
} Command;

void Command_init(Command *c, ListNode *ln, ...)
{
    c->listNode = ln;
    ...
}

void Command_method(Command *c, ...)
{
    ListNode_method(c->listNode, ...);
    ...
}

typedef struct CommandImpl {
    ListNode listNode;    /* base class storage allocated here */
    Command command;
} CommandImpl;

void CommandImpl_init(CommandImpl *c, ...)
{
    Command_init(&c->command, &c->listNode, ...);
    ...
}

/* Answer in similar way */

typedef struct Transaction {
    Command command;
    Answer answer;
    ListNode listNode;    /* the single shared ListNode subobject */
    ...
} Transaction;

void Transaction_init(Transaction *t)
{
    Command_init(&t->command, &t->listNode, ...);
    Answer_init(&t->answer, &t->listNode, ...);
    ...
}

void Transaction_overriddenAnswerMethod(Answer *a, ...)
{
    Transaction *t = (Transaction *)
        ((char *)a - offsetof(Transaction, answer));
    ...
}
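For contrast, a minimal sketch of the nonvirtual case (case 1 above) is shown below. These struct definitions are illustrative alternatives to the ones in the sample above, not additions to it; here each inheriting class embeds its own ListNode directly.

/* Nonvirtual case (sketch): Command and Answer each embed their own
   ListNode, so a Transaction ends up with two independent ListNode
   subobjects and can be linked to the command list and the answer
   list at the same time. */
typedef struct Command {
    ListNode listNode;    /* node for the command list */
    ...
} Command;

typedef struct Answer {
    ListNode listNode;    /* node for the answer list */
    ...
} Answer;

typedef struct Transaction {
    Command command;      /* contains one ListNode */
    Answer answer;        /* contains another ListNode */
    ...
} Transaction;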

[Figure omitted: inheritance diagram with classes ListNode, Command, Answer and Transaction; Command and Answer inherit ListNode, and Transaction inherits both Command and Answer]

Figure 30 Multiple inheritance dilemma
