The M900/M1800 Base Station Subsystem Troubleshooting Manual is a comprehensive guide to troubleshooting a BSS system and network. It covers diagnosis, fault location, fault removal, and restoring the system.
Chapter 1 General Procedures and Methods
  1.1 Requirements for Maintenance Personnel
    1.1.1 On Professional Knowledge and Skills
    1.1.2 On BSS System and Networking
    1.1.3 On BSS Equipment
    1.1.4 On Instruments and Apparatus
  1.2 General Procedures of Troubleshooting
    1.2.1 Information Collection-Collecting Original Information as Detailedly as Possible
    1.2.2 Fault judgement-Judge the Scope and Type of the Fault
    1.2.3 Fault Location-Specifying the Concrete Cause of the Fault
    1.2.4 Fault Removing-Removing the Fault and Restore the System by Using Suitable Methods or Steps
  1.3 Basic Methods of Fault Judgement and Location
    1.3.1 Analysis of Original Information
    1.3.2 Alarm Information Analysis
    1.3.3 Indicator Status Analysis
    1.3.4 Calling Test Auxiliary Analysis
    1.3.5 Apparatus & Meter Auxiliary Analysis
    1.3.6 Traffic Statistics Auxiliary Analysis
    1.3.7 Interface Trace
    1.3.8 Test/Loop Back
    1.3.9 Comparison/Interchange
    1.3.10 Switching/Resetting
    1.3.11 Contacting the Technical Support Engineers of Huawei
Chapter 2 Troubleshooting for OMC
  2.1 GSM BSS-OMC Overview
  2.2 Troubleshooting for OMC and Examples
    2.2.1 Communication Between BAM and Host Interrupted
    2.2.2 Communication between BAM and Server is Interrupted
    2.2.3 Data Table Error is Prompted during BAM Startup
    2.2.4 Loading Host Programs from BAM Failed
    2.2.5 Data cannot be Modified/Deleted
    2.2.6 At Least One Table in BSC Data Management System Report Has Error
    2.2.7 Communication Timeout of Interface Tracing through BSC Maintenance System
    2.2.8 Alarm System Cannot Receive Alarms
    2.2.9 BAM Prints "Fail to shake hands with alarm server"
    2.2.10 Records in Alarm Box and Those in Fault Alarm table Inconsistent
    2.2.11 Shaking Hand with BAM Fails When the Alarms are Deleted through Alarm System
    2.2.12 Failed to Register due to Server Failure
    2.2.13 Service Console Running Failed
    2.2.14 Loading Shell Map Failed
    2.2.15 Some Computers Cannot Refresh Templates in BSC Traffic Statistics System
Chapter 3 Troubleshooting for Software Loading
  3.1 Troubleshooting for BTS Software Loading
    3.1.1 Introduction to Software Loading
    3.1.2 Description of Common Loading Troubles
    3.1.3 Trouble Handling
    3.1.4 An Example of Loading BTS Software Failure on the Remote WS
  3.2 Troubleshooting for BSC Software Loading
    3.2.1 BSC Software Loading Channels
    3.2.2 AM/CM Loading Fails
    3.2.3 AM/CM Loading Timeout
    3.2.4 BM Loading Failed and LOAD Indicator is Constantly ON
    3.2.5 BM Loading Timeout
    3.2.6 BM Loading Starts Again after Successful Loading
    3.2.7 BM Does Not Work Normally or Reload Automatically
    3.2.8 Operating Parameters Not Updated after Data Loading
Chapter 4 Troubleshooting for Links
  4.1 Overview
  4.2 Fundamental Knowledge
    4.2.1 SS7 Concepts
    4.2.2 SS7 Direction
    4.2.3 GMC2
    4.2.4 Pb Interface
    4.2.5 Signaling Direction on LAPD Links
  4.3 Trouble Handling
    4.3.1 No SS7 Signaling Trace Message
    4.3.2 RSL Disabled
    4.3.3 OML Disabled
    4.3.4 SS7 Link Faulty
    4.3.5 AM/CM-BM Link Faulty
    4.3.6 Optical Path Faulty
    4.3.7 PbSL Disabled
    4.3.8 E1 Faulty
    4.3.9 Microcell's Optical Transmission Board Faulty
  4.4 Examples
    4.4.1 Examples of SS7 Link Trouble
    4.4.2 Examples of RSL & OML Troubles
    4.4.3 Examples of E1 Transmission Trouble
    4.4.4 Examples of Optical Transmission Board Trouble in Microcell
Chapter 5 Troubleshooting for Clock
  5.1 Overview
    5.1.1 Fault Description and Fault Causes in BSC Clock System
    5.1.2 Fault Description and Fault Causes in BTS Clock System
  5.2 Fundamental Knowledge
    5.2.1 Introduction to BSC
    5.2.2 Fundamental Knowledge of BTS Clock System
    5.2.3 Specifications about BTS Clock System
  5.3 Trouble Handling
    5.3.1 Trouble Handling for BSC
    5.3.2 Troubleshooting for BTS
  5.4 Examples
    5.4.1 Troubleshooting Examples for BSC Clock Fault
    5.4.2 Troubleshooting Examples for BTS Clock Fault
Chapter 6 Troubleshooting for Handover
  6.1 Overview
    6.1.1 Failure Classification
    6.1.2 Locating Tool
  6.2 Trouble Handling
    6.2.1 Locating Procedure
    6.2.2 Locating Procedure of No Handover Starting Up
    6.2.3 Locating of Hardware Failure
    6.2.4 Locating of Data Configuration Problem
  6.3 Examples
    6.3.1 MSC Handover Problem
    6.3.2 BSC Problems
    6.3.3 BTS-related Problem
    6.3.4 Others
Chapter 7 Troubleshooting for Congestion
  7.1 Overview
  7.2 A-Interface Congestion
    7.2.1 Fundamental Knowledge
    7.2.2 Troubleshooting Process
  7.3 Radio Channel Congestion
    7.3.1 Fundamental Knowledge
    7.3.2 Analysis
  7.4 Examples
    7.4.1 Example of SDCCH Congestion Resulted from Co-frequency Interference
    7.4.2 A Cell's Congestion rate is Overhigh
    7.4.3 Severe SDCCH Congestion Resulted from Unstable Transmission
    7.4.4 SDCCH Congestion Resulted from Lots of Burst LA Updating
    7.4.5 High TCH Congestion Rate Resulted from Incorrect CIC Setting
    7.4.6 High SDCCH Congestion Rate LAC Resulted from Improper LAC Setting
Chapter 8 Troubleshooting for Access
  8.1 Overview
    8.1.1 MS Searching Network
    8.1.2 Location Updating Procedure
    8.1.3 Call Procedures
  8.2 Trouble Handling
    8.2.1 MS Cannot Find a Network
    8.2.2 MS Cannot Access a Network
    8.2.3 Location Updating Is too Frequent
    8.2.4 MS Drops from the Network Frequently
    8.2.5 MS Finds a Network but Cannot Call
  8.3 Examples
    8.3.1 MS Has Difficulty in Accessing a Network-Signal Is too Weak
    8.3.2 MS Cannot Perform Cell Reselection - Signals of Adjacent Cells Are Weak
    8.3.3 GSM MS Drops from the Network - Location Updating Period Is too Short
    8.3.4 MS Drops from the Network - CGI is Erroneous
    8.3.5 MS Has Difficulty in Accessing a Network - MSC Cell Data Is not Ready
    8.3.6 MS Drops from the Network - MSC Cell Data Is not Ready
    8.3.7 Some MSs Cannot Access a Network - System Information Is Erroneous
    8.3.8 MS Has Difficulty in Accessing a Network - CBQ and CBA Are Erroneous
Chapter 9 Troubleshooting for Voice
  9.1 Overview
  9.2 Fundamental Knowledge
    9.2.1 Transmission Format of Voice Signal
    9.2.2 Transmission Path of Voice Signal
    9.2.3 Concepts
    9.2.4 Common Operations
    9.2.5 Supplement
  9.3 Processing of Voice Troubles
    9.3.1 Analysis
    9.3.2 Location Procedures
  9.4 Trouble Location
    9.4.1 Single Pass and No pass
    9.4.2 Echo
    9.4.3 Voice Discontinuity
    9.4.4 Noise
    9.4.5 Cross-talking
  9.5 Fault Examples
    9.5.1 Cross-talking Resulting from Improper Data Configuration
    9.5.2 Voice Discontinuity Resulting from BCCH Carrier Mutual-assistance
    9.5.3 Single Pass Resulting from MS Fault
    9.5.4 Noise Resulting from Poor Contact of E1
    9.5.5 Voice Loopback Resulting from Outgoing Cabling
Chapter 10 Troubleshooting for Call Drop
  10.1 Overview
    10.1.1 Description
    10.1.2 Formula for Call drop
  10.2 Causes
    10.2.1 Coverage
    10.2.2 Handover
    10.2.3 Interference
    10.2.4 Uplink/downlink Unbalance Caused by Antenna & Feeder System
    10.2.5 Transmission Failure
    10.2.6 Unreasonable Parameter Settings
    10.2.7 Others
  10.3 Examples
    10.3.1 Example 1: Reducing Call Drop by Optimizing Handover Related Parameter
    10.3.2 Example 2: Call Drop Caused by Interference
    10.3.3 Example 3: Call Drop Caused by Interference
    10.3.4 Example 4: Uplink/downlink Unbalance
    10.3.5 Example 5: Call Drop Caused by Interference from Repeater
    10.3.6 Example 6: Call Drop Caused by Isolated Island Effect
    10.3.7 Example 7: Settings of Version Related Parameters
Chapter 11 Troubleshooting for Antenna & Feeder System
  11.1 Overview
    11.1.1 Common Failures
    11.1.2 Common Causes of Failures
  11.2 Fundamental Knowledge
    11.2.1 RF Transmission Path in Antenna Feeder System
    11.2.2 Measuring Standing Wave Ratio of Antenna Feeder
    11.2.3 Checking CDU Antenna Port TTA Power Feeding
  11.3 Locating Failures of Different Types
    11.3.1 On Downlink Signal
    11.3.2 On Uplink Signal
    11.3.3 On Controlling and Alarm
  11.4 Examples
    11.4.1 Insufficient Power Tolerance of Lightning Arrester Caused Standing Wave Ratio of Antenna Feeder Abnormal
    11.4.2 EDU Internal Bias Tee Quality Problem Causing TTA Feeding Failure
    11.4.3 No Cable Connection between TX-COM and TX-DUP of CDU Causing Call Establishment Failure
HUAWEI
M900/M1800 Base Station Subsystem Troubleshooting Manual V300R002
Manual Version: T2-030303-20030331-C-4.00
Product Version: V300R002
BOM: 31033203
Huawei Technologies Co., Ltd. provides customers with comprehensive technical support and service. Please feel free to contact our local office, customer care center or company headquarters.
Huawei Technologies Co., Ltd.
Address: Administration Building, Huawei Technologies Co., Ltd., Bantian, Longgang District, Shenzhen, P. R. China
Postal Code: 518129
Website: http://www.huawei.com
Email: support@huawei.com
Copyright © 2003 Huawei Technologies Co., Ltd.
All Rights Reserved
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks
HUAWEI, C&C08, EAST8000, HONET, ViewPoint, INtess, ETS, DMC, TELLIN, InfoLink, Netkey, Quidway, SYNLOCK, Radium, M900/M1800, TELESIGHT, Quidview, Musa, Airbridge, Tellwin, Inmedia, VRP, DOPRA, iTELLIN, HUAWEI OptiX, C&C08 iNET, NETENGINE, OptiX, SoftX, iSite, U-SYS, iMUSE, OpenEye, Lansway, SmartAX are trademarks of Huawei Technologies Co., Ltd.
Notice
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure the accuracy of the contents, but the statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
About This Manual
Version
The product version that corresponds to this manual is M900/M1800 Base Station Subsystem V300R002.
Contents
The manual consists of 11 chapters that describe the general procedures and methods and the troubleshooting for OMC, software loading, links, clock, handover, congestion, access, voice, call drop, and the antenna & feeder system.
Chapter 1 General Procedures and Methods
Chapter 2 Troubleshooting for OMC
Chapter 3 Troubleshooting for Software Loading
Chapter 4 Troubleshooting for Links
Chapter 5 Troubleshooting for Clock
Chapter 6 Troubleshooting for Handover
Chapter 7 Troubleshooting for Congestion
Chapter 8 Troubleshooting for Access
Chapter 9 Troubleshooting for BSS Voice
Chapter 10 Troubleshooting for Call Drop
Chapter 11 Troubleshooting for Antenna & Feeder System
Target Readers
The manual is intended for the following readers:
Marketing staff
Installation engineers & technicians
Operation & maintenance personnel
Conventions
This document uses the following conventions:
I. General conventions
Convention          Description
Arial               Normal paragraphs are in Arial.
Arial Narrow        Warnings, cautions, notes and tips are in Arial Narrow.
Bold                Headings, Command, Command Description are in boldface.
Terminal Display    Terminal Display is in Courier New; message input by the user via the terminal is in boldface.
II. Command conventions
Convention           Description
italic font          Command arguments for which you supply values are in italics.
[ ]                  Elements in square brackets [ ] are optional.
{ x | y | ... }      Alternative keywords are grouped in braces and separated by vertical bars. One is selected.
[ x | y | ... ]      Optional alternative keywords are grouped in square brackets and separated by vertical bars. One (or none) is selected.
{ x | y | ... } *    Alternative keywords are grouped in braces and separated by vertical bars. A minimum of one and a maximum of all can be selected.
[ x | y | ... ] *    Optional alternative keywords are grouped in square brackets and separated by vertical bars. Many (or none) are selected.
!                    A line starting with an exclamation mark is a comment.
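The four keyword-grouping forms above differ only in how many alternatives may be selected. As a rough illustration, the Python sketch below encodes each form as a minimum and maximum selection count and checks a selection against it. The helper name selection_is_valid, the GROUP_RULES table, and the keywords x, y, z are hypothetical and are not part of the manual or of any Huawei tool.

```python
# Hypothetical helper for illustration; none of these names come from the manual.
# Each grouping form maps to (minimum, maximum) selections.
# A maximum of None means "up to all of the listed alternatives".
GROUP_RULES = {
    "{}": (1, 1),       # { x | y | ... }    exactly one is selected
    "[]": (0, 1),       # [ x | y | ... ]    one or none is selected
    "{}*": (1, None),   # { x | y | ... } *  at least one, at most all
    "[]*": (0, None),   # [ x | y | ... ] *  none, some, or all
}


def selection_is_valid(group, alternatives, chosen):
    """Return True if `chosen` is a legal pick from `alternatives` for this form."""
    low, high = GROUP_RULES[group]
    if high is None:
        high = len(alternatives)
    # A keyword may be selected only once.
    if len(set(chosen)) != len(chosen):
        return False
    # Every selected keyword must be one of the listed alternatives.
    if not set(chosen) <= set(alternatives):
        return False
    return low <= len(chosen) <= high


if __name__ == "__main__":
    keywords = ["x", "y", "z"]
    for group, chosen in [("{}", ["x"]), ("[]", []), ("{}*", ["x", "z"]), ("{}", [])]:
        print(group, chosen, selection_is_valid(group, keywords, chosen))
```

Treating "Many (or none)" as any count from zero up to all alternatives is an interpretation of the table, not something the manual states explicitly.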
III. GUI conventions
Convention    Description
< >           Message entered via the terminal is within angle brackets.
[ ]           MMIs, menu items, data table and field names are inside square brackets [ ].
/             Multi-level menus are separated by forward slashes (/). For example, [File/Create/Folder].
IV. Keyboard operation
Format                 Description
<Key>                  Press the key whose name appears in angle brackets, e.g. <Enter>, <Tab>, <Backspace>, or <A>.
<Key1+Key2>            Press the keys concurrently, e.g. <Ctrl+Alt+A> means the three keys should be pressed concurrently.
<Key1, Key2>           Press the keys in turn, e.g. <Alt, A> means the two keys should be pressed in turn.
[Menu Option]          An item in square brackets indicates a menu option, e.g. the [System] option on the main menu. An item in angle brackets indicates a button, e.g. the <OK> button on some interface.
[Menu1/Menu2/Menu3]    Multi-level menu options, e.g. [System/Option/Color setup] on the main menu indicates [Color Setup] under the [Option] menu, which is under the [System] menu.
V. Mouse operation
Action          Description
Click           Press the left button or right button quickly (left button by default).
Double Click    Press the left button twice continuously and quickly.
Drag            Press and hold the left button and drag it to a certain position.
VI. Symbols
Eye-catching symbols are also used in this document to highlight points worthy of special attention during operation. They are defined as follows:
Caution, Warning, Danger: Means the reader must be extremely careful during the operation.
Note, Comment, Tip, Knowhow, Thought: Means a complementary description.
i Table of Contents Chapter 1 General Procedures and Methods ............................................................................. 1-1 1.1 Requirements for Maintenance Personnel ........................................................................ 1-1 1.1.1 On Professional Knowledge and Skills ................................................................... 1-1 1.1.2 On BSS System and Networking ............................................................................ 1-1 1.1.3 On BSS Equipment ................................................................................................. 1-1 1.1.4 On Instruments and Apparatus ............................................................................... 1-2 1.2 General Procedures of Troubleshooting............................................................................ 1-2 1.2.1 Information CollectionCollecting Original Information as Detailedly as Possible.. 1-2 1.2.2 Fault judgement-Judge the Scope and Type of the Fault ....................................... 1-3 1.2.3 Fault Location-Specifying the Concrete Cause of the Fault ................................... 1-4 1.2.4 Fault Removing-Removing the Fault and Restore the System by Using Suitable Methods or Steps ............................................................................................................. 1-4 1.3 Basic Methods of Fault Judgement and Location.............................................................. 1-4 1.3.1 Analysis of Original Information .............................................................................. 1-4 1.3.2 Alarm Information Analysis ..................................................................................... 1-5 1.3.3 Indicator Status Analysis......................................................................................... 1-5 1.3.4 Calling Test Auxiliary Analysis ................................................................................ 
1-6 1.3.5 Apparatus & Meter Auxiliary Analysis ..................................................................... 1-7 1.3.6 Traffic Statistics Auxiliary Analysis.......................................................................... 1-8 1.3.7 Interface Trace........................................................................................................ 1-9 1.3.8 Test/Loop Back ..................................................................................................... 1-12 1.3.9 Comparison/Interchange....................................................................................... 1-13 1.3.10 Switching/Resetting............................................................................................. 1-14 1.3.11 Contacting the Technical Support Engineers of Huawei .................................... 1-15 Chapter 2 Troubleshooting for OMC........................................................................................... 2-1 2.1 GSM BSS-OMC Overview................................................................................................. 2-1 2.2 Troubleshooting for OMC and Examples .......................................................................... 2-2 2.2.1 Communication Between BAM and Host Interrupted ............................................. 2-2 2.2.2 Communication between BAM and Server is Interrupted....................................... 2-3 2.2.3 Data Table Error is Prompted during BAM Startup................................................. 2-3 2.2.4 Loading Host Programs from BAM Failed .............................................................. 2-4 2.2.5 Data cannot be Modified/Deleted............................................................................ 2-4 2.2.6 At Least One Table in BSC Data Management System Report Has Error............. 
2-5 2.2.7 Communication Timeout of Interface Tracing through BSC Maintenance System 2-5 2.2.8 Alarm System Cannot Receive Alarms................................................................... 2-6 2.2.9 BAM Prints Fail to shake hands with alarm server ............................................... 2-7 2.2.10 Records in Alarm Box and Those in Fault Alarm table Inconsistent..................... 2-8 Troubleshooting Manual M900/M1800 Base Station Subsystem Table of Contents
ii 2.2.11 Shaking Hand with BAM Fails When the Alarms are Deleted through Alarm System .......................................................................................................................................... 2-9 2.2.12 Failed to Register due to Server Failure ............................................................... 2-9 2.2.13 Service Console Running Failed......................................................................... 2-10 2.2.14 Loading Shell Map Failed.................................................................................... 2-10 2.2.15 Some Computers Cannot Refresh Templates in BSC Traffic Statistics System 2-11 Chapter 3 Troubleshooting for Software Loading ..................................................................... 3-1 3.1 Troubleshooting for BTS Software Loading....................................................................... 3-1 3.1.1 Introduction to Software Loading ............................................................................ 3-1 3.1.2 Description of Common Loading Troubles.............................................................. 3-3 3.1.3 Trouble Handling..................................................................................................... 3-3 3.1.4 An Example of Loading BTS Software Failure on the Remote WS........................ 3-6 3.2 Troubleshooting for BSC Software Loading ...................................................................... 3-7 3.2.1 BSC Software Loading Channels............................................................................ 3-7 3.2.2 AM/CM Loading Fails.............................................................................................. 3-7 3.2.3 AM/CM Loading Timeout ...................................................................................... 3-10 3.2.4 BM Loading Failed and LOAD Indicator is Constantly ON................................... 
3.2.5 BM Loading Timeout
3.2.6 BM Loading Starts Again after Successful Loading
3.2.7 BM Does Not Work Normally or Reload Automatically
3.2.8 Operating Parameters Not Updated after Data Loading
Chapter 4 Troubleshooting for Links
4.1 Overview
4.2 Fundamental Knowledge
4.2.1 SS7 Concepts
4.2.2 SS7 Direction
4.2.3 GMC2
4.2.4 Pb Interface
4.2.5 Signaling Direction on LAPD Links
4.3 Trouble Handling
4.3.1 No SS7 Signaling Trace Message
4.3.2 RSL Disabled
4.3.3 OML Disabled
4.3.4 SS7 Link Faulty
4.3.5 AM/CM-BM Link Faulty
4.3.6 Optical Path Faulty
4.3.7 PbSL Disabled
4.3.8 E1 Faulty
4.3.9 Microcell's Optical Transmission Board Faulty
4.4 Examples
4.4.1 Examples of SS7 Link Trouble
4.4.2 Examples of RSL & OML Troubles
4.4.3 Examples of E1 Transmission Trouble
4.4.4 Examples of Optical Transmission Board Trouble in Microcell
Chapter 5 Troubleshooting for Clock
5.1 Overview
5.1.1 Fault Description and Fault Causes in BSC Clock System
5.1.2 Fault Description and Fault Causes in BTS Clock System
5.2 Fundamental Knowledge
5.2.1 Introduction to BSC
5.2.2 Fundamental Knowledge of BTS Clock System
5.2.3 Specifications about BTS Clock System
5.3 Trouble Handling
5.3.1 Trouble Handling for BSC
5.3.2 Troubleshooting for BTS
5.4 Examples
5.4.1 Troubleshooting Examples for BSC Clock Fault
5.4.2 Troubleshooting Examples for BTS Clock Fault
Chapter 6 Troubleshooting for Handover
6.1 Overview
6.1.1 Failure Classification
6.1.2 Locating Tool
6.2 Trouble Handling
6.2.1 Locating Procedure
6.2.2 Locating Procedure of No Handover Starting Up
6.2.3 Locating of Hardware Failure
6.2.4 Locating of Data Configuration Problem
6.3 Examples
6.3.1 MSC Handover Problem
6.3.2 BSC Problems
6.3.3 BTS-related Problem
6.3.4 Others
Chapter 7 Troubleshooting for Congestion
7.1 Overview
7.2 A-Interface Congestion
7.2.1 Fundamental Knowledge
7.2.2 Troubleshooting Process
7.3 Radio Channel Congestion
7.3.1 Fundamental Knowledge
7.3.2 Analysis
7.4 Examples
7.4.1 Example of SDCCH Congestion Resulted from Co-frequency Interference
7.4.2 A Cell's Congestion Rate is Overhigh
7.4.3 Severe SDCCH Congestion Resulted from Unstable Transmission
7.4.4 SDCCH Congestion Resulted from Lots of Burst LA Updating
7.4.5 High TCH Congestion Rate Resulted from Incorrect CIC Setting
7.4.6 High SDCCH Congestion Rate LAC Resulted from Improper LAC Setting
Chapter 8 Troubleshooting for Access
8.1 Overview
8.1.1 MS Searching Network
8.1.2 Location Updating Procedure
8.1.3 Call Procedures
8.2 Trouble Handling
8.2.1 MS Cannot Find a Network
8.2.2 MS Cannot Access a Network
8.2.3 Location Updating Is too Frequent
8.2.4 MS Drops from the Network Frequently
8.2.5 MS Finds a Network but Cannot Call
8.3 Examples
8.3.1 MS Has Difficulty in Accessing a Network - Signal Is too Weak
8.3.2 MS Cannot Perform Cell Reselection - Signals of Adjacent Cells Are Weak
8.3.3 GSM MS Drops from the Network - Location Updating Period Is too Short
8.3.4 MS Drops from the Network - CGI is Erroneous
8.3.5 MS Has Difficulty in Accessing a Network - MSC Cell Data Is not Ready
8.3.6 MS Drops from the Network - MSC Cell Data Is not Ready
8.3.7 Some MSs Cannot Access a Network - System Information Is Erroneous
8.3.8 MS Has Difficulty in Accessing a Network - CBQ and CBA Are Erroneous
Chapter 9 Troubleshooting for Voice
9.1 Overview
9.2 Fundamental Knowledge
9.2.1 Transmission Format of Voice Signal
9.2.2 Transmission Path of Voice Signal
9.2.3 Concepts
9.2.4 Common Operations
9.2.5 Supplement
9.3 Processing of Voice Troubles
9.3.1 Analysis
9.3.2 Location Procedures
9.4 Trouble Location
9.4.1 Single Pass and No pass
9.4.2 Echo
9.4.3 Voice Discontinuity
9.4.4 Noise
9.4.5 Cross-talking
9.5 Fault Examples
9.5.1 Cross-talking Resulting from Improper Data Configuration
9.5.2 Voice Discontinuity Resulting from BCCH Carrier Mutual-assistance
9.5.3 Single Pass Resulting from MS Fault
9.5.4 Noise Resulting from Poor Contact of E1
9.5.5 Voice Loopback Resulting from Outgoing Cabling
Chapter 10 Troubleshooting for Call Drop
10.1 Overview
10.1.1 Description
10.1.2 Formula for Call drop
10.2 Causes
10.2.1 Coverage
10.2.2 Handover
10.2.3 Interference
10.2.4 Uplink/downlink Unbalance Caused by Antenna & Feeder System
10.2.5 Transmission Failure
10.2.6 Unreasonable Parameter Settings
10.2.7 Others
10.3 Examples
10.3.1 Example 1: Reducing Call Drop by Optimizing Handover Related Parameter
10.3.2 Example 2: Call Drop Caused by Interference
10.3.3 Example 3: Call Drop Caused by Interference
10.3.4 Example 4: Uplink/downlink Unbalance
10.3.5 Example 5: Call Drop Caused by Interference from Repeater
10.3.6 Example 6: Call Drop Caused by Isolated Island Effect
10.3.7 Example 7: Settings of Version Related Parameters
Chapter 11 Troubleshooting for Antenna & Feeder System
11.1 Overview
11.1.1 Common Failures
11.1.2 Common Causes of Failures
11.2 Fundamental Knowledge
11.2.1 RF Transmission Path in Antenna Feeder System
11.2.2 Measuring Standing Wave Ratio of Antenna Feeder
11.2.3 Checking CDU Antenna Port TTA Power Feeding
11.3 Locating Failures of Different Types
11.3.1 On Downlink Signal
11.3.2 On Uplink Signal
11.3.3 On Controlling and Alarm
11.4 Examples
11.4.1 Insufficient Power Tolerance of Lightning Arrester Caused Standing Wave Ratio of Antenna Feeder Abnormal
11.4.2 EDU Internal Bias Tee Quality Problem Causing TTA Feeding Failure
11.4.3 No Cable Connection between TX-COM and TX-DUP of CDU Causing Call Establishment Failure
Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 1 General Procedures and Methods
Chapter 1 General Procedures and Methods

1.1 Requirements for Maintenance Personnel

The M900/M1800 base station subsystem (BSS) is now widely deployed in GSM networks and plays an increasingly important role in the telecommunication network. Its stable, normal operation must be guaranteed, and any trouble must be removed as soon as possible. This manual describes how to locate and remove BSS troubles quickly. BSS maintenance personnel who use this manual should be acquainted with the following:

1.1.1 On Professional Knowledge and Skills

BSS maintenance personnel should have a good grasp of the following:

- Communication knowledge, including GSM principles, exchange principles, PCM principles, SDH principles, etc.
- Product knowledge, including BSS functional configuration, radio interface theory, calling flow, traffic flow, etc.
- Related signaling and protocols, including SS7, LAPD, LAPDm, etc.
- Related international technical specifications.
- PC network fundamentals, including Ethernet, TCP/IP, Client/Server, database, etc.
- Skills in BSS routine operation, PC operation and instrument operation.

1.1.2 On BSS System and Networking

Maintenance personnel should be acquainted with the BSS and the related networking, including:

- BSS hardware configuration and performance parameters.
- Networking topology between the BSC and each BTS, and the multiplexing ratio and trunk mode on the Abis interface.
- BSS cell distribution and attributes.
- BSS handover and power control parameters.
- Network configuration and channel allocation of the related transmission devices.

1.1.3 On BSS Equipment

To keep troubleshooting efficient and prevent misoperation, BSS maintenance personnel should obtain the related certificate and know the BSS operation process.
Significant troubles should be handled by personnel who have received Grade B (or above) training from Huawei Technologies Co., Ltd. In addition, BSS maintenance personnel should know:

- Which operations may interrupt part or all of the traffic.
- Which operations may cause equipment damage.
- Which operations may cause MS complaints.
- Which emergency or standby measures are provided.

1.1.4 On Instruments and Apparatus

Instruments and apparatus are very useful for BSS troubleshooting. The data they display can point directly to the trouble location, so they improve troubleshooting efficiency. BSS maintenance personnel should be able to use the following instruments and apparatus skillfully:

- Test MS
- Power meter
- Antenna & feeder analyzer
- Signaling analyzer
- Multimeter
- Oscilloscope
- Spectrum analyzer
- Frequency meter

1.2 General Procedures of Troubleshooting

Generally, troubleshooting goes through the following four stages: Information collection -> Fault judgement -> Fault location -> Fault removing.

1.2.1 Information Collection - Collecting Original Information in as Much Detail as Possible

I. Essential

Every fault processing procedure begins with the maintenance personnel collecting fault information. There are four sources of fault information:

- Fault complaints from the customer or customer center;
- Analysis of traffic statistics items;
- Alarm output of the BSS alarm system;
- Abnormalities found during routine maintenance or inspection.

During BSS routine maintenance, most fault information comes from the first three sources. However, the fault information obtained initially usually cannot describe the trouble completely and thoroughly, especially when it is received via a phone call. The information cannot represent the essence of the fault unless it is given in detail.

Nowadays, network size is growing and networking is becoming more complicated.
Changes in, and interference from, various internal and external factors may adversely affect the normal running of the BSS.
Consequently, BSS faults may result from more complex causes, which makes it increasingly difficult to locate a BSS fault. It is no use analyzing a problem and trying to solve it on the basis of inadequate information. Doing so may enlarge the scope of faults to be located, increase the difficulty of solving them, and even lead to wrong processing methods, thus losing the best chance to remove the faults. It is therefore essential to collect all kinds of original information.

II. Practical

Although many factors can cause a BSS fault, they seldom act simultaneously. That is, only one or some of them are at work at a given point in time. This makes it possible to locate the fault by the method of exclusion. In the initial stage of fault processing, collecting original information helps the maintenance personnel locate the fault, improves the efficiency of fault processing and reduces the possibility of misoperation, thus making the customer more satisfied.

III. Suggestions on maintenance

- Maintenance personnel should collect the original information before proceeding to the next step, especially in case of a serious fault.
- Maintenance personnel are strongly recommended to study system theory, GSM theory, the GSM specifications and the related signaling knowledge so as to solve problems as quickly as possible.
- When answering a fault complaint call, maintenance personnel should gather as much information as possible.
- Maintenance personnel are also advised to build a working environment with their colleagues in which they can communicate and ask for help easily.

1.2.2 Fault Judgement - Judging the Scope and Type of the Fault

After the fault information is collected, the scope and type of the fault are to be judged.

I. Judge the scope of the fault

Determining the fault scope sets the direction of fault processing, that is, where and how to look for the concrete cause of the fault.
In the BSS, the fault scope refers to the area where the fault happens. It often has something to do with the location of the BSS. In this manual, the fault scopes are determined based on the application of the BSS. The manual consists of ten parts:

- OMC fault;
- Load fault;
- Link fault;
- Clock fault;
- Handover fault;
- Congestion fault;
- Access fault;
- Voice fault;
- Call drop fault;
- Antenna and feeder fault.

II. Judge the type of the fault

Judging the type of the fault decides which methods are used to analyze and solve the problem. Faults are categorized by different methods, which are discussed in the subsequent chapters. Refer to section 1.3 Basic Methods of Fault Judgement and Location for the most commonly used fault judgement methods.

1.2.3 Fault Location - Specifying the Concrete Cause of the Fault

As mentioned above, although many factors can cause a BSS fault, they seldom act simultaneously. That is, only one or some of them are at work at a given time. Fault location is the process of excluding the impossible causes and finding the right ones among the many possible causes. Accurate and fast location not only improves troubleshooting efficiency but also avoids man-made accidents caused by misoperation. It provides the key guidance and reference for the fault processing methods. The basic methods of fault judgement and location are introduced briefly in the next section.

1.2.4 Fault Removing - Removing the Fault and Restoring the System by Using Suitable Methods or Steps

After the fault is located, fault removing is carried out as the last step of troubleshooting. Fault removing means removing the fault and restoring the system by measures such as checking the transmission line, replacing a board, modifying configuration data, switching over the system, resetting a board, etc.

1.3 Basic Methods of Fault Judgement and Location

1.3.1 Analysis of Original Information

Original information consists of the fault information collected through subscriber fault complaints, fault notifications from other offices, abnormalities met during maintenance and other information collected by the maintenance personnel. It is important material for fault judgement and analysis.
Original information analysis is used to judge the fault scope, specify the fault type, and provide a basis for narrowing the fault scope and locating the fault initially. Experienced maintenance personnel may even be able to locate the fault directly.

BSS maintenance personnel can usually get more than they expect if they collect the original information well and analyze it effectively and thoroughly. Besides MS-related troubles, original information analysis can also be used to handle other troubles, especially trunk troubles. A trunk involves transmission system interconnection and signaling matching; therefore, original information collection is vital to trunk troubleshooting. Such original information includes the operating status (normal or not) of the transmission system, the state (modified or not) of the data at the peer office, the definitions of some signaling parameters, etc.

1.3.2 Alarm Information Analysis

Alarm information is the information output by the BSS alarm system, usually indicated through sound, light, Light Emitting Diode (LED), screen output, etc. The alarm information output by the alarm maintenance system includes a detailed description of the abnormality, the possible causes and restoration suggestions. It covers the hardware, links, trunks, Central Processing Unit (CPU) load ratio, etc., with abundant and complete information. It is an important basis for fault analysis and location.

Alarm information analysis is mainly used to find the specific section or cause of the fault. Because its contents are abundant, alarm information may be used to locate the fault cause on its own or together with other methods. It is one of the main methods of fault analysis.

1.3.3 Indicator Status Analysis

Each board has status indicators. Besides the work status of the board itself, these indicators can show the work status of the circuit, link, optical path, node and active/standby mode, so they are very helpful for fault analysis and location.
Indicator status analysis is mainly used to find the probable fault section or cause quickly and to provide material for the next processing step. Because the information shown by the indicators is limited, they are generally used along with the alarm information.

[Example]

Take the indicators of the BIE as an example. Table 1-1 describes the indicators of the BIE. Two of them are cited below:

- When the indicator LIU1 is ON, it can be concluded that the first E1 cable of the board is not connected with the corresponding BTS. The reason may be that the E1 cable is wrongly connected or that the transmission equipment is faulty.
- When the indicator ACT is ON, it can be concluded that the board is acting as the active board.

For other indicators, refer to the following table.
Table 1-1 Indicator description of the BIE

Indicator name | Color | Meaning | Description | Normal status
RUN | Green | Run indicator | ON: FPGA loading. ON 0.25s & OFF 0.25s: the board is not in operation. ON 1s & OFF 1s: the board is in normal operation. OFF: self-check fault and LIU1~LIU6 are ON. | ON 1s & OFF 1s
ACT | Green | Active/standby indicator | ON: the board is active. OFF: the board is standby. | ON/OFF
LIU1 | Green | Group 1 E1 indicator | ON: local-end loss of synchronization. ON 1s & OFF 1s: peer alarm. OFF: the first E1 is connected. | OFF
LIU2 | Green | Group 2 E1 indicator | ON: local-end loss of synchronization. ON 1s & OFF 1s: peer alarm. OFF: the second E1 is connected. | OFF
LIU3 | Green | Group 3 E1 indicator | ON: local-end loss of synchronization. ON 1s & OFF 1s: peer alarm. OFF: the third E1 is connected. | OFF
LIU4 | Green | Group 4 E1 indicator | ON: local-end loss of synchronization. ON 1s & OFF 1s: peer alarm. OFF: the fourth E1 is connected. | OFF
LIU5 | Green | Group 5 E1 indicator | ON: local-end loss of synchronization. ON 1s & OFF 1s: peer alarm. OFF: the fifth E1 is connected. | OFF
LIU6 | Green | Group 6 E1 indicator | ON: local-end loss of synchronization. ON 1s & OFF 1s: peer alarm. OFF: the sixth E1 is connected. | OFF
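The indicator table above can also be kept as a small lookup, which may be handy in a maintenance checklist script. The following is only a minimal sketch: the state labels (`ON`, `OFF`, `FAST_BLINK`, `SLOW_BLINK`) and the function names `describe` and `is_normal` are hypothetical names chosen for illustration, and the meanings are copied from Table 1-1 rather than read from any BSS interface.

```python
# Hypothetical labels for what a technician observes on the BIE front panel.
ON = "ON"
OFF = "OFF"
FAST_BLINK = "ON 0.25s & OFF 0.25s"
SLOW_BLINK = "ON 1s & OFF 1s"

# Meanings transcribed from Table 1-1.
BIE_INDICATORS = {
    "RUN": {
        ON: "FPGA loading",
        FAST_BLINK: "the board is not in operation",
        SLOW_BLINK: "the board is in normal operation",
        OFF: "self-check fault (LIU1~LIU6 are ON)",
    },
    "ACT": {
        ON: "the board is active",
        OFF: "the board is standby",
    },
}

# LIU1..LIU6 share the same pattern; only the E1 ordinal differs.
ORDINALS = ["first", "second", "third", "fourth", "fifth", "sixth"]
for number, ordinal in enumerate(ORDINALS, start=1):
    BIE_INDICATORS[f"LIU{number}"] = {
        ON: "local-end loss of synchronization",
        SLOW_BLINK: "peer alarm",
        OFF: f"the {ordinal} E1 is connected",
    }

# "Normal status" column of Table 1-1.
NORMAL_STATES = {"RUN": {SLOW_BLINK}, "ACT": {ON, OFF}}
NORMAL_STATES.update({f"LIU{n}": {OFF} for n in range(1, 7)})


def describe(indicator, state):
    """Return the Table 1-1 meaning of an observed indicator state."""
    try:
        return BIE_INDICATORS[indicator][state]
    except KeyError:
        raise ValueError(
            f"no Table 1-1 entry for {indicator!r} in state {state!r}"
        ) from None


def is_normal(indicator, state):
    """Return True if the observed state is the normal status in Table 1-1."""
    describe(indicator, state)  # rejects combinations not listed in the table
    return state in NORMAL_STATES[indicator]
```

For instance, `describe("LIU1", ON)` returns the text for a local-end loss of synchronization, and `is_normal("LIU1", ON)` is false, which matches the LIU1 case cited in the example above.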
[Notes]

Maintenance personnel should be familiar with the meanings of the indicator states so as to respond quickly in case of faults.

1.3.4 Calling Test Auxiliary Analysis

Of all the services provided by the BSS, the voice service accounts for the greatest portion, so most BSS-related faults directly or indirectly affect the normal calls of subscribers. The calling test is therefore a simple and quick method to judge whether the call processing function and the related modules of the BSC are normal. It is often used to judge whether the MS (Mobile Station), BTS (Base Transceiver Station), BSC (Base Station Controller), trunk system, etc. are normal.

[Example]

There was a BTS30 with a configuration of S(1/1/1). Its three cells used the frequencies 119, 123 and 105 respectively. Calls initiated via this BTS had no voice, whether the call was to an MS or to a fixed phone. The troubleshooting process is given below:

1) At the remote maintenance console, the engineer viewed the TRX and TMU states and found that they were normal and that no alarm was generated.
2) Via signaling trace, the engineer found that the call procedure was complete.
3) The BTS was located in the second module of the BSC, where the other BTSs were normal. The engineer checked the BSC data configuration and found no problem.
4) The engineer opened the back door of the BTS to check the cables and found that everything was in order.
5) The engineer performed a soft reset and a hard reset of the BTS, pulled the TMU out, reinserted it and reloaded the software. The trouble was not removed. He then replaced the TMU and the TRX, but the trouble still existed.
6) In the BSC equipment room, the engineer moved the BTS to another trunk port and found that calls could be set up via that BTS with the same configuration. The possibility of BTS failure was excluded.
7) The engineer pulled out the BIE and reinserted it, then replaced the trunk cable and the HW; however, the trouble still existed.
8) The engineer performed several dialing tests using an MS and found that the MS could not be disconnected. He suspected that the problem lay in time slot interchange. Time slot interchange problems are usually caused by E3M or GNET failure.
9) The engineer switched over the GNET and found that call voice could be heard. When he switched the active GNET back, the call voice disappeared again. He then replaced the active GNET with the standby one and found that the call voice still could not be heard. This indicated that the problem lay in both the active GNET and the corresponding slot.

[Notes]
The calling test is one of the most frequently used methods in routine maintenance. It is often employed together with interface message trace and is widely used in testing the various functions of the BSS.

1.3.5 Apparatus & Meter Auxiliary Analysis
Apparatus and meter auxiliary analysis is a widely used technical method of fault analysis and location in BSS fault processing. It reflects the nature of the fault with visual and quantified data, and is often used in power tests, signaling analysis, waveform analysis, bit error detection, etc.
[Example]
The call drop rate was high. The troubleshooting process is given below:
1) The engineer captured the signaling of some dropped calls using the MA10.
2) He analyzed the signaling and found that the TA approached 63, the maximum value. This indicated that the calls were dropped because the TA was too great.
3) The engineer modified the data configuration to reduce the cell coverage.
4) The call drop rate then decreased.

[Notes]
The maintenance personnel should be able to operate the meters.

1.3.6 Traffic Statistics Auxiliary Analysis
The call drop rate is one of the key indices of the BSS because it directly affects the operators' income and is a key element of their competitiveness. Operators therefore pay much attention to increasing the handover success ratio and reducing call drops. However, many possible causes affect the call drop ratio of the BSS, and they are hard to detect. The key to decreasing the call drop ratio is to find the main causes of call drops in time and eliminate them. Traffic statistics is one of the analysis tools for this purpose.

[Example]
One BTS30 was expanded from S(4/4) to S(8/4). (TRX0, TRX1, TRX2, TRX3, TRX4, TRX5, TRX10 and TRX11 formed cell 1, and TRX6 to TRX9 formed cell 2.) After the hardware expansion and the corresponding data setting, subscribers complained that calls made near the BTS were subject to a beep effect. No alarm related to that BTS was generated. The troubleshooting process is given below:
1) According to the traffic statistics on TCH cell measurement, the number of TCH call drops of cell 1 under that BTS was 63 and the TCH call drop rate reached 3.7%. The number of A interface failures during TCH occupation was 63. The average number of idle TCHs was 0.94 in interference band 3, 0.33 in interference band 4 and 1.21 in interference band 5. According to the traffic statistics on intracell handover measurement, the number of unsuccessful outgoing BSC handovers of that cell was 35 and the number of unsuccessful incoming BSC handovers of that cell was 12. From the traffic statistics on outgoing cell handover measurement, the engineer found that the cause of the outgoing cell handovers was the poor uplink quality of that cell.
2) Since the indices of cell 1 under that BTS became worse only after the BTS expansion, the problem most probably lay in the hardware connection or the data configuration.
3) The engineer checked the CDUs, TRXs and all RF cables and found that they were all properly and securely connected.
4) As the average number of idle TCHs in interference bands 3, 4 and 5 had increased from 0 after the BTS expansion, it could be considered that the increased A interface failures, the lower outgoing/incoming cell handover success rate and the higher TCH call drop rate were all related to interference.
5) The engineer checked the frequency planning data to see whether there were inter-frequency and co-BSIC adjacent cells and whether the frequency of any TCH was adjacent to that of the BCCH. No unreasonable setting was found.
6) The engineer checked whether the hopping-related data configuration was right. For example, he checked whether the BCC was identical to the training sequence No. (TSN) and whether the mobile allocation index offset (MAIO) and hopping
sequence No. (HSN) were properly configured. He concentrated the check on the [Frequency Hopping Table], [Radio Channel Configuration Table], [TRX Configuration Table], [Cell Configuration Data Table] and [Cell Allocation Table].
7) When checking the [Radio Channel Configuration Table], he found that the MAIOs corresponding to the eight TRXs of cell 1 (TRX0, TRX1, TRX2, TRX3, TRX4, TRX5, TRX10 and TRX11) were 0, 1, 2, 3, 4, 5, 7 and 8 respectively. This setting was erroneous. The engineer modified the MAIOs to 0, 1, 2, 3, 4, 5, 6 and 7 respectively, set the whole table to the specified module and then reset the BTS at the fourth hierarchy.
8) One hour later, the engineer viewed the traffic statistics on TCH cell measurement. He found that the average number of idle TCHs in interference bands 3, 4 and 5 was 0 in each band, that the number of A interface failures during TCH occupation was 0 and that the number of TCH call drops was 1. He then viewed the traffic statistics on intracell handover measurement and found that the number of unsuccessful outgoing BSC handovers of that cell was 0 and the number of unsuccessful incoming BSC handovers of that cell was also 0.
9) When the engineer performed dialing tests near the BTS, the beep effect had disappeared. The trouble was removed.

[Notes]
Traffic statistics analysis is often used together with signaling trace and analysis. It plays an important role in handling problems such as high call drop ratio, low handover success ratio and call abnormality. The maintenance personnel are strongly advised to master it.

1.3.7 Interface Trace
Interface trace is applied in locating the causes of failures in subscriber call connection, inter-office signaling cooperation, etc. The trace result can help to find the cause of a call failure directly and locate the problem, or provide clues for the subsequent analysis.

[Example]
There were two Huawei BSCs (BSC1 and BSC4).
BSC1 was connected to the Huawei MSC1, and BSC4 to MSC2 from manufacturer S. During operation, it was found that the BSC4-to-BSC1 handover success rate was over 90% while the BSC1-to-BSC4 handover success rate was only about 20%. The troubleshooting process is given below:
1) During a light traffic period, the engineer performed a handover test on site. He traced the signaling on the internal interface and on the A interface via the maintenance console, and then analyzed the collected data.
2) The signaling traced on the user interface of MSC1 is shown in Figure 1-1. Upon receipt of the message "Prepare Handover", MSC2 should have returned the message "Prepare Handover ACK"; however, it actually returned the message "Abort".
[Signaling flow diagram: MSC1 -> MSC2: Prepare Handover; MSC2 -> MSC1: Abort]
Figure 1-1 Flow of signaling traced on the user interface of MSC1

3) The signaling traced on the A interface between MSC2 and BSC4 is shown in Figure 1-2.

[Signaling flow diagram between MSC2 and BSC4. Message labels: Handover Request, Handover Request ACK, Clear Command, Clear Complete, Paging None Connect Response]
Figure 1-2 Flow of signaling traced on the A interface

4) The engineer checked the message "Handover Request" from MSC2 to BSC4 and found that MSC2 had passed on the content of the "Handover Required" message properly and transparently, and that the speech version information was specified in the channel type:
0B  IE
04  Length
01  Speech
08  Full rate TCH channel Bm
91  GSM speech full rate version 2
01  GSM speech full rate version 1

The message "Handover Request ACK" that BSC4 returned to MSC2 upon receipt of that message contained the following:
Layer 3 information: 17 0D 06 2B 38 51 0C 00 0C B3 05 DB 63 01 90
Chosen encryption algorithm: 2C 01
The Layer 3 information of the message indicated the Channel Mode:
63  IE
01  Speech Full rate TCH or Half Rate version 1

MSC2 then sent the message "Clear Command" to BSC4 with "Cause: Protocol Error between MSC-BSC".

Since the phase version of the A interface between MSC2 and BSC4 was Phase 2, the analysis of the signaling on the A interface should be based on the Phase 2 protocol. The difference between the successful signaling and the failed signaling lies in the speech version in the channel type. MSC2 therefore expected the message "Handover Request ACK" returned from BSC4 to indicate the chosen speech version as well. However, as specified in the Phase 2 GSM 08.08 protocol, the message "Handover Request ACK" contains no speech version information, only the Channel Mode information carried in the Layer 3 information. According to the Phase 2+ protocol, on the other hand, the message "Handover Request ACK" should contain the "Speech Version (Chosen)" when the BSS selects a speech version. Details are cited below. (See "GSM 08.08 version 7.6.0".)

3.2.1.10 HANDOVER REQUEST ACKNOWLEDGE
This message is sent from the BSS to the MSC and indicates that the request to support a handover at the target BSS can be supported by the BSS, and also to which radio channel(s) the MS should be directed. The message is sent via the BSSAP SCCP connection associated with the dedicated resource.

INFORMATION ELEMENT | REFERENCE | DIRECTION | TYPE | LEN
Message Type | 3.2.2.1 | BSS-MSC | M | 1
Layer 3 Information | 3.2.2.24 | BSS-MSC | M (1) | 11-n
Chosen Channel | 3.2.2.33 | BSS-MSC | O (4) | 2
Chosen Encryption Algorithm | 3.2.2.44 | BSS-MSC | O (5) | 2
Circuit Pool | 3.2.2.45 | BSS-MSC | O (2) | 2
Speech Version (Chosen) | 3.2.2.51 | BSS-MSC | O (6) | 2
Circuit Identity Code | 3.2.2.2 | BSS-MSC | O (3) | 3
LSA Identifier | 3.2.2.15 | BSS-MSC | O (7) | 5

1 This information field carries a radio interface HANDOVER COMMAND message.
2 Shall be included when several circuit pools are present on the BSS MSC interface and a circuit was allocated by the HANDOVER REQUEST message.
3 The Circuit identity code information element is included mandatorily by the BSS if the BSS allocates the A interface circuits and a circuit is needed.
4 Included at least when the channel rate/type choice was done by the BSS.
5 Included at least when the encryption algorithm has been selected by the BSS.
6 Included at least when the speech version choice was done by the BSS.
7 Shall be included if a new potential current LSA in the target cell has been identified (see GSM 03.73). Not included means that there is no potential current LSA in the target cell.

Based on the above analysis, it could be concluded that the A interface phase version configured at MSC2 was different from that configured at BSC4, or that the MSC2 processing of the Phase 2 protocol was erroneous.
5) The engineer modified the A interface phase version configurations at the two sides and made them consistent with each other. From the traffic statistics, the engineer found that the BSC1-to-BSC4 handover success rate rose to over 90%. The trouble was removed.

[Notes]
Interface trace can locate the fault cause accurately and provide valuable reference information. It is one of the most widely used methods in BSS routine maintenance and fault processing.

1.3.8 Test/Loop Back
Testing means measuring, with apparatus and meters, software test tools, etc., the relevant technical parameters (such as output power) of a transmission channel or trunk device that may be faulty, and judging from the results whether the device is faulty or about to become faulty.
Loop back means making a transmission device or channel send to and receive from itself (self loop) by a hardware or software method, and then judging the state of the transmission device, transmission channel, service status, signaling cooperation, etc. after the self loop, so as to find out whether the related hardware condition and software parameter settings are normal. It is one of the most widely used methods to locate transmission problems and to judge whether the trunk parameters are set correctly.

[Example]
One office was to be expanded. During the expansion, the office engineer decided to add, besides the expanded parts, an SS7 link with an SLC of 2 (consistent with that at the MSC). The engineer then modified the data manually.
After resetting the whole BSC, the engineer found that the links configured automatically were all normal while the one added manually could not be established. Three data tables had been modified manually on site: [E3M E1 Configuration Table], [MTP Link Table] and [Trunk Circuit Table]. The troubleshooting process is given below:
1) The engineer checked the related data and found that all the data was right. The link was configured in FTC 13. The engineer checked the [Trunk Circuit Table] and found that all the No. 16 time slots in the corresponding TCSM were properly set as Unavailable and were described as acting as A or Pb interface signaling links. The SS7 link ran through port 1 of the last BIE and was configured to use time slot 16; therefore, the corresponding trunk circuit No. 2096 was correct.
2) The other two links in that module ran through port 0 of the transparent transmission BIE and were normal. The engineer suspected that the problem lay
in port 1. He used port 1 in place of port 0, modified the corresponding data configuration and set the whole table. The latter two links were still normal. The possibility of port 1 failure was excluded.
3) The engineer checked the FTC and found that its indicator was in the normal state and that the circuits of FTC 13 were all normal.
4) The engineer looped back the BIE and found that the link was normal.
5) He looped back the E3M and found that the link was also normal.
6) He then looped back the FTC and found that the link was now abnormal. This indicated that the problem lay in the parts between the E3M and the TCSM.
7) The engineer suspected that the data configuration in the AM/CM was wrong. He viewed the Host and found that the SS7 link that should have been added to the [E3M E1 Configuration Table] had not been added. This indicated that the data had not been written into the Host. It could be concluded that the problem lay in the DIP switch.
8) The engineer checked the GMCC DIP switches in the AM/CM and found that the DIP switches S1-2 and S1-4 were both set to ON (indicating that frozen data should be used). The engineer set the DIP switches S1-2 and S1-3 to ON and S1-4 to OFF. After he reloaded the data, the SS7 link became normal.

[Notes]
The test and loop back methods are usually employed together in locating transmission faults. Loop back can be classified into hardware loop back and software loop back. Software loop back is simple and flexible to operate, but it is not as reliable as hardware loop back. In addition, the BSS trunk self loop is often used during office deployment and trunk expansion to judge whether the local office parameters and the outgoing route data are set correctly.

1.3.9 Comparison/Interchange
Comparison means comparing the faulty components or phenomena with normal ones and finding the differences so as to find the problem. It is usually used when the fault range is simple.
Interchange means exchanging normal components (such as boards, optical fibers, etc.) with the potentially faulty components when the fault range or part cannot be located even after the standby components have been replaced, and comparing the change in working status after the interchange so as to determine the fault range or part. It is usually used when the fault range is complicated.

[Example]
During power-on commissioning of one BSC, the engineer found that the GALM in the AM/CM was displayed in red (indicating the faulty state). When the alarm box was powered on, the GALM was still in red and the GALM communication alarm was generated in addition. The troubleshooting process is given below:
1) The engineer checked the data and found no problem.
2) It is normal that the GALM in the AM/CM is displayed in red when the alarm box is not powered on. The engineer powered on the alarm box; however, the GALM was still displayed in red and the trouble still existed.
3) The engineer viewed the DIP switch of the alarm box and found no problem.
4) The engineer then checked whether the GALM was faulty by replacing the GALM in the AM/CM with that in the BM. To do so, he modified the GALM-related DIP switch and jumper settings. Note: the AM/CM and the BM use the same type of GALM; only the GALM-related DIP switch and jumper settings differ. After the replacement, the GALM in the AM/CM was still displayed as faulty.
5) The GALM in the AM/CM may be displayed in red when the communication between it and the alarm box is interrupted. The engineer suspected that the problem lay in the signal cables of the alarm box. He replaced the alarm signal cables in the BM with those in the AM/CM. The GALM in the AM/CM then became normal.
6) The engineer then checked the alarm signal cables carefully. He found that the alarm signal cables in the AM/CM had been mixed up with those in the BM. That was the reason why the communication between the GALM in the AM/CM and the alarm box was interrupted and the GALM was displayed in red. The GALM in the BM was not displayed in red, although its communication with the alarm box was also interrupted, because the board itself operated normally.

[Notes]
In actual fault location, various methods are used together. In the above example, the methods of loop back, interchange, test and comparison are all used. Mastering the various methods is therefore very helpful for fault processing.
Caution: Interchange carries some risk. For example, if a short-circuited board is put into a normal frame, the normal frame may become faulty. Exercise caution when using the interchange method so as to avoid introducing new faults.
1.3.10 Switching/Resetting
Switching means manually switching over a device that works in active/standby mode, that is, transferring the services from the active device to the standby one. The running status of the device after the switching is compared with that before it to confirm whether the active device or the active/standby relationship is normal.
Resetting means manually restarting the whole switching device or some parts of it. It is used to rule out disordered software running as the cause of the fault.
Switching and resetting cannot locate the fault cause accurately. Moreover, because of the randomness of software running, the fault may not recur after the switching or resetting, which makes it difficult to identify the real fault and solve the problem. This method is therefore only an emergency measure, applicable only in emergencies.
Caution: Back up the main control boards before switching them so as to avoid losing system data. Resetting often interrupts the system service and, if operated incorrectly, may even bring the system down, with severe consequences for the routine running of the BSS. The use of switching and resetting is therefore strictly restricted and is not recommended.
1.3.11 Contacting the Technical Support Engineers of Huawei
For any problem that is hard to solve in routine maintenance or fault processing, you can contact the technical support engineers in your city by telephone, fax or email, or contact the Huawei customer service center directly. We will respond quickly to your request.
When you report faults to the Huawei engineers, please provide the following information:
- Detailed name of the office or site
- Contact person and telephone number
- Time when the fault occurred
- Detailed description of the fault
- Software version of the office or site
- Actions performed after the fault occurred and their results
- Problem level and the time by which you wish the problem to be solved
We shall arrange for our engineers to process your problems. In addition, you can get the latest technical documents from our technical support website: http://support.huawei.com.
Chapter 2 Troubleshooting for OMC

2.1 GSM BSS-OMC Overview
The GSM BSS-OMC system can be divided into the following parts:

Browser
The browser is used to browse and view the OMC network configuration and the information of each node, and to monitor the entire OMC network. Its operating environment is Windows 9x or Windows 2000.

BAM
BAM is the server through which the various service consoles communicate with the foreground BSC.

Service console
A service console is a classified set of service functions. The service consoles include the BSC maintenance system, BSC traffic statistics system, BSC data management system, BSC data auto configuration system, alarm system, traffic statistics report, and (local and remote) BTS maintenance system. Each service console consists of a WS (man-machine interface) and its corresponding BAM module. The WS provides the interface for input and operation. The BAM module applies operations to the host in accordance with the WS commands and sends the operation results to the WS.

OMC Server
OMC Server is used to:
1) Store OMC configuration information and provide the necessary configuration data for the browser.
2) Store HLR subscriber data. The database platform is Sybase.
3) Store traffic statistics data.
4) Store alarm data.
The operating system of OMC Server is Solaris 2.6.

The BSS-OMC structure is shown in Figure 2-1.
Figure 2-1 OMC structure

Connecting through OMC Server is the remote maintenance mode, and connecting through OMC Local WS is the local maintenance mode. App 1 ~ App n refer to the service consoles, including the BSC maintenance system, BSC traffic statistics system, BSC data management system, data auto configuration system, alarm system, traffic statistics report, and (local and remote) BTS maintenance system.

2.2 Troubleshooting for OMC and Examples
2.2.1 Communication Between BAM and Host Interrupted
I. Description
After the BAM program is started, the indicator for the communication between the BAM and the host is red.
II. Analysis
1) Check whether the host program runs normally. You can examine the indicators on the GMPU (for a single-module BSC) or on the GSNT and GMCCM (for a multi-module BSC).
2) Check whether the cable between the MCP card and the host is correctly connected.
3) Check whether the field INSTALLED of [MAILBOX] in bam.ini is set to 1.
III. Handling process
1) Configure the data and load the data to make the host operate normally.
2) Replace the MCP card cable or change the connection mode of the MCP card.
3) Set the field INSTALLED of [MAILBOX] in bam.ini to 1 and restart the BAM.

2.2.2 Communication between BAM and Server is Interrupted
I. Description
After the BAM program starts, Connecting to Switch is displayed periodically, or it is not displayed but the corresponding icon in OMC Shell is marked by a cross.
II. Analysis
1) Use the command ping to check whether TCP/IP is installed correctly.
2) Check whether the ServerAddress in [Network] in bam.ini is set to the expected server IP address.
3) Check whether the Server processes operate normally. For this purpose, you may log in to the Server directly.
4) Check whether the installed in [Network] in bam.ini is set to 0 and clinetinstalled to 1.
III. Handling process
1) Ensure that TCP/IP on the BAM is normal.
2) Ensure that both OMC Shell and the BAM are connected to the same Server.
3) Ensure that the OMC Server processes operate normally.
4) Ensure that the installed in [Network] in bam.ini is set to 0 and clinetinstalled to 1. After the setting, restart the BAM.

2.2.3 Data Table Error is Prompted during BAM Startup
I. Description
After the BAM program is started, the system pops up a dialog box showing a data table error or a database engine error.
II. Analysis
1) Restart the computer and the BAM program to see whether the problem disappears.
2) Check the data table indicated in the dialog box to see whether its version is correct (an incorrect upgrade can cause a version error).
3) Check whether the BAM installation directory is damaged. For example, check whether the directories IDAPI and IDAPI32 exist and whether the files in these directories are complete.
III. Handling process
1) Restart the computer and the BAM program.
2) Delete (or rename) the data tables indicated in the dialog box. Then restart the BAM computer; the deleted tables will be created automatically. Configure them again.
3) Reinstall the same OMC version.

2.2.4 Loading Host Programs from BAM Failed
I. Description
Loading host programs from the BAM fails and the BSC cannot run.
II. Analysis
1) Check whether any error is prompted in the main window of the BAM program.
2) Ensure that the field address related to the loading in bam.ini is correctly set.
3) Ensure that the host software matches the physical board.
III. Handling process
1) Ensure that the files to be loaded exist in the ...\bam\dload directory. If not, copy the necessary files to this directory.
2) Set the field address correctly and restart the BAM program.
3) Load the host software of the correct version (matching the board).

2.2.5 Data cannot be Modified/Deleted
I. Description
During data configuration, the data of some modules cannot be modified or deleted.
II. Analysis
1) Data operations on non-common tables are designed to be module-related. So first check the AM module description table to find the module IDs that have no corresponding module entry.
2) As there is no corresponding module ID in the AM module description table, the data is regarded as illegal when the service console performs the data check, so the related operation is not accepted.
III. Handling process
1) Add the modules whose data cannot be modified or deleted to the AM module description table and re-convert all data.
2) Restart the BAM program.
3) Restart the BSC data management system.

2.2.6 At Least One Table in BSC Data Management System Report Has Error
I. Description
At least one table in the BSC data management system has an error after the data of all tables is converted.
II. Analysis
The tables with errors and the corresponding causes are listed in the BAM main window.
III. Handling process
Modify the data of the corresponding table according to the information provided in the BAM main window.

2.2.7 Communication Timeout of Interface Tracing through BSC Maintenance System
I. Description
A communication timeout dialog box pops up after the interface to be traced is selected.
II. Analysis
The module to be traced is not configured or is in the inactive state, or the communication between the foreground and background parts of the node that the module belongs to is interrupted.
III. Handling process
1) Check the module state: click Maintenance to pop up a menu and then select Module State.
2) Check the working state of the node in OMC Shell. If the icon of the node is marked by a red slash, the communication between the BAM and the host is interrupted. If the icon of the node is marked by a red cross, the communication between the BAM and OMC Server is interrupted or the BAM program is not running.
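The two icon states above lead back to the bam.ini settings described in 2.2.1 (red slash, BAM and host) and 2.2.2 (red cross, BAM and OMC Server). As a rough sketch only, those settings could be checked with a short Python script. It assumes that bam.ini can be parsed as a standard INI file, which this manual does not confirm; the function name check_bam_ini, the sample content and the address 10.0.0.5 are hypothetical. The key names are taken as spelled in 2.2.1 and 2.2.2.

```python
import configparser


def check_bam_ini(text, expected_server):
    """Return findings for the bam.ini fields named in 2.2.1 and 2.2.2.

    Assumption: bam.ini follows ordinary INI syntax. configparser matches
    option names case-insensitively, so INSTALLED and installed both work.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    findings = []

    # 2.2.1: [MAILBOX] INSTALLED should be 1 (BAM-to-host communication).
    if cfg.get("MAILBOX", "INSTALLED", fallback=None) != "1":
        findings.append("[MAILBOX] INSTALLED is not 1")

    # 2.2.2: [Network] should name the expected server,
    # with installed = 0 and clinetinstalled = 1.
    if cfg.get("Network", "ServerAddress", fallback=None) != expected_server:
        findings.append("[Network] ServerAddress differs from the expected server")
    if cfg.get("Network", "installed", fallback=None) != "0":
        findings.append("[Network] installed is not 0")
    if cfg.get("Network", "clinetinstalled", fallback=None) != "1":
        findings.append("[Network] clinetinstalled is not 1")

    return findings


# Hypothetical sample content, for illustration only.
SAMPLE = """
[MAILBOX]
INSTALLED=1

[Network]
ServerAddress=10.0.0.5
installed=1
clinetinstalled=1
"""

for finding in check_bam_ini(SAMPLE, "10.0.0.5"):
    print(finding)
```

With the sample above, only the installed field is reported. As stated in 2.2.1 and 2.2.2, the BAM has to be restarted after any of these fields is corrected.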
2.2.8 Alarm System Cannot Receive Alarms
I. Description
The alarm system cannot receive any alarm that has been sent to the BAM from the foreground, and the history alarms cannot be viewed in the browser.
II. Analysis
1) Use the following command to check whether the alarm server process is normal:
    ps -ef | grep alarmbam
There should be two alarmbam processes.
2) Database error, including: the alarm data space is full, or the master or alarm log space is full.
III. Handling process
1) If the alarm process is abnormal, close the alarm process, then use kill XXX (process number) to kill newfhlrsvr, newcommdriver and alarmbam in that order. Then restart these processes in the reverse order.
End the process (newfhlrsvr):
    ps -ef | grep newfhlrsvr
The screen displays:
    newomc 345 1 0 12-08 0:0 newfhlrsvr
    newomc 346 345 0 12-08 10:35 newfhlrsvr
newomc in the first column is the user name, 345 and 346 in the second column are the process numbers, and 345 in the third column is the parent process number.
Then use the kill command to end the process, as shown in the following:
    kill 345 346
Restart the processes:
    $ alarmbam
    $ newcommdriver
    $ newfhlrsvr
2) Check whether SYBASE on OMC Server has been started.
3) If the alarm data space is full, back up the history data and clean up the database.
Back up the history database:
    $ bcp warn..history out history.dat -Usa -P password -c
Clean up the history database:
    $ isql -Usa
    1> use warn
    2> go
    1> truncate table history
    2> go
4) If the master or warn log space is full, use dump tran + database name + with no_log to clean up the space. If the space is still not available, check whether any tables or stored procedures that should have been created in another database have been wrongly created in the master database. If so, delete them.
5) Restart the process. If the problem remains, handle it according to the printed Fail information.

2.2.9 BAM Prints Fail to shake hands with alarm server
I. Description
In BAM message tracing, the BAM prints Fail to shake hands with alarm server.
II. Analysis
Sending the handshaking message frame to OMC Server fails. Usually the cause is that the connection between the BAM and OMC Server is interrupted or that the alarm server process has started abnormally.
III. Handling process
1) Check the network communication
Use the command ping to check the state of the communication between the faulty computer and OMC Server.
2) Check the number of processes
Check whether there are two alarmserver processes and two newcommdriver processes. If not,
(a). Use isql -Usa -Pserver1234 to log in to the Sybase system.
(b). Check whether the alarm log and data space is full:
    sp_helpdb warn
    device_fragments   size       usage       free bytes
    ----------------   --------   ---------   ----------
    data_dev1          20.0 MB    data only   20464
    data_dev1          50.0 MB    data only   46944
    data_dev1          30.0 MB    log only    30192
    log_dev1           100.0 MB   log only    102400
If the sum of the sizes of the data only fragments is 0 or their free bytes are less than 1000, or the sum of the sizes of the log only fragments is 0 or their free bytes are less than 1000, the data or log space is not enough. If the warn database is normal, check whether the master database is normal in the same way.
(c). Firstly, if the warn database and the master database are normal, enter the log directory of the OMC user and use the command vi to view the Fail information in alarm.log, or use the following command to filter the error information:
    grep Fail switch.log
Then solve the problem according to the given error information.
Secondly, check whether there are two newcommdriver processes. If not, enter the log directory of the OMC user and use the command vi to view the Fail information in commdrv.log, or use the following command to filter the error information:
    grep Fail commdrv.log
3) Check whether the hard disk is full.
Use the command df -k to check whether the related file systems on the hard disk are full.

2.2.10 Records in Alarm Box and Those in Fault Alarm table Inconsistent
I. Description
The number of alarms in the alarm box is different from that queried from the alarm system.
II. Analysis

Generally this is because the database has been operated on manually and some alarms in the alarm fault table have been deleted, or because the alarms are cleaned up when the BAM restarts while the alarms in the alarm box still exist.

III. Handling process

The records in the alarm box and those in the alarm table become consistent only after the host is restarted or the host faults disappear, and the BAM is restarted.

2.2.11 Shaking Hands with BAM Fails When the Alarms are Deleted through Alarm System

I. Description

When more than 10,000 records in the history table are to be queried at one time, Maintain->Delete history alarm->OK is selected to delete all alarm history tables. After 3 to 4 minutes, the BAM prints "Fail to shake hands with alarm server".

II. Analysis

The server program uses the single-process dbproc to access the database. If 100,000 records or another large amount of data are deleted at one time, the operation takes a long time and occupies all CPUs until it completes. As a result, the handshaking messages are ignored.

III. Handling process

Do not delete data in one large batch. Delete it in small batches instead. A new version is available that solves this problem: the deletion is completed through a sub-process and has no impact on the processing of handshaking and query messages.

2.2.12 Failed to Register due to Server Failure

I. Description

After OMC Shell is started and the main window appears, "Failed to register due to server failure" is prompted.
II. Analysis

When OMC Shell connects to the server, it reports the serial number that was input during the installation. The server then checks License.dat in the data directory to verify the serial number. When this problem occurs, the cause might be that License.dat is missing on the server or has been damaged. Being unable to verify the serial number, the server returns the failure information.

III. Handling process

Log in to the server and check whether License.dat exists and whether it has been damaged. If necessary, copy this file again from the installation disk.

2.2.13 Service Console Running Failed

I. Description

When the service console is opened, "Operation failure: Create process failed, error code:*****" is prompted.

II. Analysis

If the problem occurs repeatedly, generally this is because the service console program file is faulty or has been damaged.

III. Handling process

1) Double-click the node icon of the service console to check the version number of the service console program.
2) Reinstall the same version of the service console.

2.2.14 Loading Shell Map Failed

I. Description

When OMC Shell is opened, the system prompts "Loading map failed" and the OMC Shell program is terminated.

II. Analysis

OMC Shell cannot load the map file. The cause might be that the file does not exist, its format is incorrect, or its path is incorrect.
III. Handling process

1) Open omc.ini in C:\WINDOWS. The map file path and filename are listed after MAP= in the [OPTION] field.
2) If the map file name is not there, add it.
3) If the path or the filename is incorrect, modify omc.ini.
4) The map file extension must be .BMP. If not, convert the file into BMP format.

2.2.15 Some Computers Cannot Refresh Templates in BSC Traffic Statistics System

I. Description

When the template refresh is performed, two of the four client computers connected to the same Server cannot complete the operation.

II. Analysis

When these two computers are used to perform only an FTP operation, it takes one hour to send a 20 KB file. This indicates a network fault.

III. Handling process

Check the network cable to ensure that the communication is clear.
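The space check in step (b) of 2.2.9 can be sketched in Python as follows. This is only an illustrative sketch and not part of the OMC software: the function name space_shortage, the regular expression and the idea of feeding it the captured text output of sp_helpdb are all assumptions. The rule is read here as "the total size of a usage class is 0, or its total free value is below 1000".

```python
import re

# Matches device_fragments rows such as:
#   data_dev1   20.0 MB   data only   20464
ROW = re.compile(
    r"^\s*(\S+)\s+([\d.]+)\s+MB\s+(data only|log only|data and log)\s+(\d+)\s*$"
)

def space_shortage(sp_helpdb_text, min_free=1000):
    """Return the usage classes that look short of space (hypothetical helper)."""
    # usage class -> [total size in MB, total free value]
    totals = {"data only": [0.0, 0], "log only": [0.0, 0]}
    for line in sp_helpdb_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separator or unrelated line
        _device, size_mb, usage, free = match.groups()
        if usage in totals:
            totals[usage][0] += float(size_mb)
            totals[usage][1] += int(free)
    return [
        usage
        for usage, (size, free) in totals.items()
        if size == 0 or free < min_free
    ]

# The sample rows from the sp_helpdb output shown in 2.2.9
SAMPLE = """\
data_dev1   20.0 MB   data only   20464
data_dev1   50.0 MB   data only   46944
data_dev1   30.0 MB   log only    30192
log_dev1   100.0 MB   log only   102400
"""
```

Calling space_shortage(SAMPLE) returns an empty list for the sample output, which corresponds to the "normal" case in the text.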
Chapter 3 Troubleshooting for Software Loading

3.1 Troubleshooting for BTS Software Loading

3.1.1 Introduction to Software Loading

BTS software loading is completed in two steps:

1) On the BTS maintenance console, load the BTS software to the Flash Memory of the BTS O&M unit (TMU in BTS3X, MMU in BTS3001C).
2) On the BTS maintenance console, activate and run the software.

The specific operations differ slightly for the software of different BTS units. The software of the BTS O&M unit can be run directly after being activated. For the software of BTS units other than the O&M unit, after activation the software is first loaded from the Flash Memory of the BTS O&M unit to the Flash Memory of the corresponding boards (TRX in BTS3X, MFU and MCI in BTS3001C), and then run.

The data flows for BTS software downloading and activation for different units on the local and remote WS are illustrated in the following three figures.

[Figure: command flow and software data flow between the BTS maintenance console WS, the BTS terminal maintenance console WS, Hub, BSC BAM, BSC, OML, BTS3X (TMU, Flash memory, TRX) and BTS3001C (MMU, Flash memory, MFU, MCI) for the download software operation]

Figure 3-1 Data flow for downloading BTS software on the local and remote WS

[Figure: same network elements, for the activate O&M unit (TMU, MMU) software operation]

Figure 3-2 Data flow for activating BTS O&M unit software on the local and remote WS

[Figure: same network elements, for the activate non-O&M unit (TRX, MFU, MCI) software operation]
Figure 3-3 Data flow for activating software of units other than BTS O&M unit on the local and remote WS

The detailed operations for software loading on the local WS and on the remote WS are different and are described separately below.

1) Software loading operations on the local WS

• Save the correct BTS software to the computer connected to the BTS O&M unit.
• Perform software loading on the local WS. In the interface for software loading, select the correct software name, input the correct software file name (including the path) and version No., and then load the software to the BTS O&M unit through the serial port. Finally, write the software to the Flash Memory of the BTS O&M unit.
• Activate and run the loaded software. For the software of the BTS O&M unit, perform the activation operation on the local WS to activate and run the software. For the software of units other than the BTS O&M unit, activate the software on the local WS. The software is then automatically loaded from the Flash Memory of the O&M unit to the Flash Memory of the other boards and run.

2) Software loading operations on the remote WS

• Copy the correct BTS software to the corresponding directory on the BAM of the BSC to which the BTS belongs.
• Add and set a new software version record in the BTS software configuration table on the BSC data management console or data automatic configuration console. The software name, software file name (including the path), major version No., minor version No., and the year, month and date in the new record must be fully consistent with those of the BTS software version to be loaded.
• On the remote WS, load the BTS software from the BSC BAM to the BTS O&M unit through OML, and write it to the Flash Memory of the BTS O&M unit.
• For the software of the BTS O&M unit, activate and run it on the remote WS. For the software of BTS units other than the O&M unit, activate it on the remote WS. The software is then automatically loaded from the Flash Memory of the O&M unit to the Flash Memory of the other boards and run.

3.1.2 Description of Common Loading Troubles

The common troubles during software loading can be classified into the following categories according to the loading phase in which they occur.

1) Loading fails because the loading party does not have administration authority.
2) Loading fails when software loading just starts.
3) Loading fails during the process, with a Flash alarm or hardware alarms, or sometimes with repeated loading.
4) Activation fails after the software version has been successfully loaded on the BTS maintenance console.
Because the loading process of the BTS software on the local WS differs greatly from that on the remote WS (see Figure 3-1, Figure 3-2 and Figure 3-3 for their loading commands and software data flows), the trouble handling procedures for software loading on the local WS and on the remote WS are described separately for each category of troubles.

3.1.3 Trouble Handling

I. Loading Fails Because the Loading Party Does Not Have Administration Authority

1) Software loading on the local WS
When the BTS software is loaded on the local WS and the loading party has not obtained the administration authority of the BTS, a prompt indicating no administration authority is received as soon as the software loading starts. In this case, first obtain the administration authority of the BTS, and then re-load the software.

2) Software loading on the remote WS

When the BTS software is loaded on the remote WS, the loading party receives a prompt indicating loading failure due to no administration authority as soon as the software loading starts. The cause is that the BTS administration authority has been obtained by the local WS. In this case, release the administration authority of the local WS first, and then re-load the software.

II. Loading Fails When Software Loading Just Starts

1) Software loading on the local WS

During software loading on the local WS, remove troubles according to their causes as given below.

• The version does not exist. Check the file name (including the path) input for the version to ensure that the file exists in the input path.
• The downloaded file is invalid. Check the validity of the file to be downloaded to ensure that the correct BTS software version file is being downloaded (ensure the file is legal and has not been infected by a virus).
• The version No. input is inconsistent with the actual version No. of the BTS software. Check the consistency between them.
• Software downloading fails because the software of another board is downloaded (e.g., the software of MMU is downloaded to the MFU board, or the software of TMU to the TRX board). Check to ensure that the BTS software version to be downloaded matches the board hardware selected.
• Loading fails because the link between the WS and the BTS serial port is unstable or faulty. Check the communication links and remove any faults. Then re-load the software.
2) Software loading on the remote WS

For software loading on the remote WS, the software to be loaded is obtained from the BAM and then downloaded via the BSC to the BTS O&M unit through OML. Exceptions during the process are usually caused by the following factors. Check and remove the exceptions as described below.

• The version file to be loaded is not available for software loading on the remote WS. Usually the cause is that there is no record for the version in the BTS software configuration table in the BSC, or that the data has not been set to the GMPU of the BSC after the record was added to the table. Add the corresponding record, or set the record to the GMPU of the BSC accordingly.
• The version does not exist. Check to ensure that the file name input (including the path of the file in the BAM of the BSC) is correct and that the version exists in the file path.
• There is a prompt indicating that the BAM is performing state transition. Wait for a while, and then re-load the software.
• The downloaded file is invalid. Check the validity of the file to be downloaded to ensure that the correct BTS software version file is being downloaded (ensure the file is legal and has not been infected by a virus).
• Software downloading fails because the software of another board is downloaded (e.g., the software of MMU is downloaded to the MFU board, or the software of TMU to the TRX board). Check to ensure that the BTS software version to be downloaded matches the board hardware selected.
• The BAM is working abnormally. Check to ensure that the BAM works properly.
• The transmission is abnormal. Check to ensure that the transmission is proper by testing the transmission BER.

III. Loading Fails during the Loading Process

The causes and the corresponding measures to remove the trouble are described below.

• The data received are incomplete because of burst bit errors in the transmission link. Check to ensure that the transmission links work properly, and then retry the software loading.
• A fault in the version to be loaded causes the verification to fail. Check to ensure that correct versions are being loaded.
• The version No. input is inconsistent with the actual version No. (In some versions of the BTS O&M software, the version No. is checked after the software loading is completed.) Check to ensure the consistency between the version No. input and the actual version No.
• If a Flash alarm or hardware alarm is reported together with the loading failure, the cause is a failure in writing the software into the Flash Memory of the board. Try reloading the software. If reloading fails again, replace the corresponding board.

IV. Activation Fails after Software Version has been Successfully Loaded on the BTS Maintenance Console

There are three possible causes for this problem.

• The version to be activated does not exist in the Flash Memory of the BTS O&M unit. Re-load the version to be activated, and then activate it again.
• The description of the activated version is not consistent with that of the downloaded version. Check the description of the version to be activated, modify it and reactivate the version.
• The board activated has not been configured, or the communication is not normal. Check to ensure that the board activated has been configured and that there is no communication alarm for the board.

3.1.4 An Example of Loading BTS Software Failure on the Remote WS

I. Description

A BTS required a software version upgrade. An engineer copied the new version to the directory dload in the BAM of its BSC to load the new version. When he executed the software loading on a remote WS, a loading failure was prompted. After checking, it was found that the software version was wrong.

II. Troubleshooting process

1) Judging from the loading failure prompt, there were two possible causes. The first was that the parameters in the new record added to the BTS software configuration table in the BSC did not match those in the version file. The second was that the new record added to the BTS software configuration table in the BSC had not been set to the GMPU and sent to its corresponding modules.
2) The engineer opened the BSC software configuration table and found the record of the software to be added. He checked the parameters of the record and found no errors.
3) He queried the records of the BSC software configuration table in the GMPU and found no record for the software version to be loaded. After asking the on-site engineers, he learned that after the record had been added to the table, only the record had been saved and the whole table had not been set.
4) The cause of the problem was thus identified: the record had not been set to the GMPU. After table setting was performed for the BTS software configuration table and the software was re-loaded, the software loading succeeded.

III. Analysis

During software loading on the remote WS, data configuration is required after the version to be loaded is copied to the directory on the BAM of the BSC. Add a record of the software version to be loaded in the BSC software configuration table and set the whole table. Then the software version to be loaded can be queried on the remote WS for normal loading.
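The two causes examined in this example (a record that does not match the version file, and a record that was saved but never set to the GMPU) can be expressed as a small pre-check. The following Python sketch is purely illustrative: SoftwareVersion, remote_load_problems and the idea of holding the GMPU records in a list are hypothetical names and data structures, not part of the BSC or BAM software.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SoftwareVersion:
    """Fields that must match between the table record and the version file."""
    name: str
    file_name: str  # including the path on the BAM
    major: int
    minor: int
    year: int
    month: int
    day: int

def remote_load_problems(version, bam_record, gmpu_records):
    """Return the reasons why remote loading of `version` would fail.

    version      -- the BTS software version to be loaded
    bam_record   -- the record saved in the configuration table, or None
    gmpu_records -- the records that have actually been set to the GMPU
    """
    problems = []
    if bam_record is None:
        problems.append("no record in the BTS software configuration table")
    elif bam_record != version:
        problems.append("record does not match the version file")
    elif bam_record not in gmpu_records:
        problems.append("record saved but table not set to the GMPU")
    return problems
```

In the case described above, the record matched the version file but was absent from the GMPU, so only the third message would be returned.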
3.2 Troubleshooting for BSC Software Loading

3.2.1 BSC Software Loading Channels

• Loading channels for board software in AM/CM:

Active GMCC: BAM -- MCP card -- GSNT -- GMCC0 (GMCC1)
Standby GMCC: BAM -- MCP -- GSNT -- GMCCM -- GMCCS
GSNT: BAM -- MCP card -- (GSNT just transfers the software) -- GMCCM -- GSNT
GCTN: BAM -- MCP card -- GSNT -- GMCCS -- GCTN

• The loading channel for GMPU software in BM:

BAM -- MCP card -- GSNT -- GMCCM -- GMCCS -- GFBI -- Optical fibers -- GOPT -- GMC2 -- GNET -- GMPU

3.2.2 AM/CM Loading Fails

I. Description

The first indicator on the task bar of the BAM is green, i.e., the communication between AM/CM and BAM is normal, or at least one active GMCC works normally. However, there is no loading progress bar on the BAM, and the LOAD indicator on the GMCC that is being loaded with the software is ON.

II. Introduction to AM/CM Loading

AM/CM loading means loading the AM program and data files under the directory C:\omc\bsc\bam\dload on the BAM to the GMCCM, GMCCS, GSNT and GCTN of the AM/CM. The programs to be loaded are cc08.mcm, cc08.mcs, cc08.snt and cc08.ctn. Data files are generated into the file db_0.dat after format conversion. In addition, the files fsk.dat and modutype.str must also be loaded.

AM/CM loading can be triggered by an AM/CM reset (power off the main control frame, pull out and then push in the board, or set the reset switches), or by using the reset or switchover function on the maintenance console.

There is a 4-digit DIP switch on each of GMCC, GSNT and GCTN (note: the DIP switch of GCTN is located on the backplane). To load AM/CM, check and, if necessary, modify the settings of the DIP switches on the boards so that they are set as follows:

1) Setting of the DIP switches for MCC (from top to bottom)

• SW4: Left, indicating that programs and data are not available
• SW3: Right, indicating that programs and data are writable
• SW2: Right. This setting is fixed
• SW1: Left. This setting is fixed

2) Setting of the DIP switches for SNT (from top to bottom)

• SW4: Left, indicating that programs and data are not available
• SW3: Right, indicating that programs and data are writable
• SW2: Right. This setting is fixed
• SW1: Left. This setting is fixed

3) Setting of the DIP switches for CTN (from top to bottom)

• SW4: Left, indicating that programs and data are not available
• SW3: Right, indicating that programs and data are writable
• SW2: Right. This setting is fixed
• SW1: Left. This setting is fixed

The status of the indicators on the GMCC while software is being loaded is described below.

• The LOAD indicator is constantly ON when the GMCC is waiting for software loading.
• The LOAD indicator flashes during the process of software loading on the GMCC.
• The LOAD indicator is OFF after the software loading is finished.

The GMCC enters the normal service state after the RUN indicator turns ON and the LOAD indicator flashes quickly for a while. In the normal service state (also called the running state), the LOAD indicator is OFF and the RUN indicator flashes at 1-second intervals.

III. Analysis

As shown by the loading channels for board software in AM/CM, the causes of problems during AM/CM loading most probably lie in the GMCC, the GSNT (clock signals), the BAM, the loading cables and the DIP settings of the various boards. Therefore, check the following items when troubleshooting AM/CM loading.

1) Check whether the program and data exist under the corresponding directory.
2) Check the setting of the DIP switches on the corresponding boards.
3) Check the system clocks.
4) When the software cannot be loaded from the BAM to AM/CM, check the loading cables and the MCP card. Note that loading cables are not hot pluggable.
5) When the software cannot be loaded to the standby GMCCs, pull all GMCCs out of their slots and re-insert them in sequence to check whether the bus on the backplane has been hung because some GMCCs are damaged.
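The DIP switch settings and LOAD indicator states listed in II above lend themselves to a simple lookup, which can help when several boards have to be checked. The following Python sketch only restates those tables. The names LOADING_DIP, dip_mismatches and gmcc_load_state are hypothetical, and the switch positions are assumed to be read off the board by a person and typed in.

```python
# Required positions for loading, from top to bottom (SW4 to SW1).
# The manual gives the same setting for MCC, SNT and CTN.
LOADING_DIP = {"SW4": "Left", "SW3": "Right", "SW2": "Right", "SW1": "Left"}

def dip_mismatches(actual):
    """Return {switch: (actual, required)} for every switch that differs."""
    return {
        switch: (actual.get(switch), required)
        for switch, required in LOADING_DIP.items()
        if actual.get(switch) != required
    }

def gmcc_load_state(load_indicator):
    """Translate the observed LOAD indicator state ('on', 'flashing', 'off')."""
    states = {
        "on": "waiting for software loading",
        "flashing": "software loading in progress",
        "off": "software loading finished",
    }
    return states[load_indicator]
```

For example, a board whose SW4 is set to Right yields {"SW4": ("Right", "Left")}, pointing directly at the switch to correct.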
IV. Troubleshooting Process

Refer to Figure 3-4 for the troubleshooting process. Note that the settings in the file BAM.INI must be handled with care, e.g., the setting of LocalModu in the figure. In addition, the settings under [DownLoad], such as MCM_SEG, MCS_SEG, CTN_SEG and GSNT_SEG, must not be modified arbitrarily.

[Flowchart: Troubleshooting for AM/CM loading failure when the communication between BAM and GMPU is normal. Checks and actions shown in the figure:
• Are the DIP switches of the failed board correct? If not, correctly set the DIP switches of the boards and of the active/standby GMCC.
• Are there files in the directory for loading? If not, copy the correct software versions, then convert and load the data.
• In [SYSTEM] of BAM.INI, is LocalModu=0? If not, set LocalModu=0.
• In [MAILBOX] of BAM.INI, is INSTALLED=1? If not, set INSTALLED=1.
• Check each node on the loading channels, such as the MCP card and the loading cables.
• Are the GSNT and the clock normal? If not, check the clock from GCKS to GSNT, e.g., the clock cable.
• Make sure the boards and the backplane are well connected.
• Replace the board and feed back the problem.]

Figure 3-4 Troubleshooting process 1
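The two BAM.INI checks in Figure 3-4 ([SYSTEM] LocalModu=0 and [MAILBOX] INSTALLED=1) can be verified without editing the file by hand. The following is a minimal read-only sketch in Python, assuming that BAM.INI can be parsed as a conventional INI file by the standard configparser module. The function name bam_ini_deviations is hypothetical, and the sketch never writes to the file.

```python
import configparser

# Values required by Figure 3-4
EXPECTED = {
    ("SYSTEM", "LocalModu"): "0",
    ("MAILBOX", "INSTALLED"): "1",
}

def bam_ini_deviations(ini_text):
    """Return {'[SECTION] key': actual_value} for settings that deviate.

    A missing section or key is reported with the value None.
    """
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    deviations = {}
    for (section, key), wanted in EXPECTED.items():
        actual = parser.get(section, key, fallback=None)
        if actual != wanted:
            deviations[f"[{section}] {key}"] = actual
    return deviations
```

A typical use would be bam_ini_deviations(open(path).read()), where path points to the BAM.INI in use. An empty result means both settings match the figure.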
3.2.3 AM/CM Loading Timeout

I. Description

The first indicator on the task bar of the BAM turns green. A loading progress bar is displayed during software loading of a board, but the progress is interrupted with the prompt "time out".

II. Analysis

As shown by the loading channels for board software in AM/CM, the causes of problems during AM/CM loading most probably lie in the GMCC, the GSNT (clock signals), the BAM, the loading cables and the corresponding boards. Therefore, check the stability of the clocks. Check the clocks from GCKS to GSNT or to GCTN according to the location of the board (either in the main control frame or in the interface frame).

III. Troubleshooting Process

Refer to Figure 3-5 for the troubleshooting process.
[Flowchart: checks and actions shown in the figure:
• Are the board and the backplane well connected? If not, insert the board tightly and fix it in place.
• Is the loading cable too long (more than 10 m)? If so, replace the loading cable.
• Is the clock normal? If not, check GCKS and the clock cables from GCKS to GCTN and GSNT.
• Replace the board and feed back the problem.]

Figure 3-5 Troubleshooting process 2
3.2.4 BM Loading Failed and LOAD Indicator is Constantly ON

I. Description

BM loading failed and the LOAD indicator on the GMPU is constantly ON.

II. Introduction to BM Loading

BM loading means loading the program and data files under the directory C:\omc\bsc\bam\dload in the BAM to the GMPU of the BM in the BSC. The program to be loaded for the GMPU is cc08.bsm, and the data file to be loaded is generated into the file db_?.dat (? represents the module No.) after format conversion. In addition, fsk.dat and modutype.str must also be loaded.

BM loading can be triggered by a BM reset (power off the main control frame, or set the reset switches), or by using the leveled reset function on the maintenance console.

1) Load the BM of Multi-module BSC

To load the BM of a multi-module BSC, check and, if necessary, modify the settings of the DIP switches on the GMPU boards so that the DIP switches of all MPUs are set as follows (from top to bottom):

SW8: Right
SW7: Left
SW6: Left, indicating that programs are not available
SW5: Left, indicating that data are not available
SW4: Right, indicating that data are writable
SW3: Right, indicating that programs are writable
SW2: Left. This setting is fixed
SW1: Left. This setting is fixed

SW4: Right. This setting is fixed
SW3: Right. This setting is fixed
SW2: Left. This setting is fixed
SW1: Left. This setting is fixed

2) Load the BM of Single-module BSC

To load the BM of a single-module BSC, check and, if necessary, modify the settings of the DIP switches on the GMPU boards so that the DIP switches of all MPUs are set as follows (from top to bottom):

SW8: Left. This setting is fixed
SW7: Left. This setting is fixed
SW6: Left, indicating that programs are not available
SW5: Left, indicating that data are not available
SW4: Right, indicating that data are writable
SW3: Right, indicating that programs are writable
SW2: Left. This setting is fixed
SW1: Left. This setting is fixed

SW4: Right. This setting is fixed
SW3: Left. This setting is fixed
SW2: Right. This setting is fixed
SW1: Left. This setting is fixed

After all the DIP switches are correctly set, power on the main frame (for MCC) and the interface frame (for CTN) of the AM, and power on the main frames of all the BMs.

Note:

• When loading the program and data and writing them into the Flash Memory, set switches 3 and 4 to ON and switches 5 and 6 to OFF.
• When using the program and data in the Flash Memory, set switches 5 and 6 to ON and switches 3 and 4 to OFF.
• When loading and writing only data into the Flash Memory, set switches 4 and 6 to ON and switches 3 and 5 to OFF.
• When loading and writing only the program into the Flash Memory, set switches 3 and 5 to ON and switches 4 and 6 to OFF.
• When both switch 7 and switch 8 are ON, the communication bandwidth is 256 kbit/s. When both are OFF, it is 128 kbit/s. When switch 7 is OFF and switch 8 is ON, it is 512 kbit/s. When switch 7 is ON and switch 8 is OFF, it is 768 kbit/s. The current setting is OFF for switch 7 and ON for switch 8.

For the first loading, set the loading switches 3 and 4 on the GMPU to ON, then load and write the program and data into the corresponding Flash Memory. The loading progress can be monitored on the BAM. When the GMPU resumes normal working, set switches 5 and 6 to ON. If it is not the first loading, load or write the program and data together or separately into the Flash Memory according to the actual situation. Refer to the above description for the setting of the DIP switches.

The RUN indicator flashes at intervals of 0.5 seconds when the BM is working normally. The LOAD indicator is ON when BM loading is about to start. The LOAD indicator flashes during software loading and returns to its normal state within several seconds after the loading is completed.
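The rules in the note above for GMPU switches 3 to 8 can be summarized as two lookup tables. The Python sketch below only restates those rules. The mode names, MODE_SWITCHES_ON, BANDWIDTH_KBITS and both functions are hypothetical labels introduced for illustration.

```python
# Loading mode -> switches (among 3 to 6) that are set to ON; the others are OFF.
MODE_SWITCHES_ON = {
    "load_program_and_data": {3, 4},
    "use_flash_program_and_data": {5, 6},
    "load_data_only": {4, 6},
    "load_program_only": {3, 5},
}

# (switch 7 ON?, switch 8 ON?) -> communication bandwidth in kbit/s
BANDWIDTH_KBITS = {
    (True, True): 256,
    (False, False): 128,
    (False, True): 512,
    (True, False): 768,
}

def switches_for(mode):
    """Return {switch number: True if ON} for switches 3 to 6 in the given mode."""
    on = MODE_SWITCHES_ON[mode]
    return {switch: switch in on for switch in (3, 4, 5, 6)}

def bandwidth(sw7_on, sw8_on):
    """Return the communication bandwidth selected by switches 7 and 8."""
    return BANDWIDTH_KBITS[(sw7_on, sw8_on)]
```

With the setting stated in the note (switch 7 OFF, switch 8 ON), bandwidth(False, True) gives 512 kbit/s.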
III. Analysis

Check the following items for troubleshooting.

1) Check the communication channel between GMPU and BAM.
2) Check the completeness of the files.
3) Check the settings of the DIP switches.
4) Check the data configuration.
5) Check the clocks.

IV. Troubleshooting Process

1) The problem might be caused by a communication fault between BAM and GMPU. Therefore, check the communication channels, including the communication channel between GMCCM and BAM, the inter-module signaling channel from GMCCM to GFBI via GSNT, the optical path and GMC2. Check the cable connections between boards.
2) Check to ensure that the four files required for software loading, cc08.bsm, db_1.dat, fsk.dat and modutype.str, are under their corresponding directory.
3) Check to ensure that the DIP switches are correctly set.
4) When the communication between BAM and BM is normal, print information can be viewed on the BAM even without software loading. If the prompts "Can't open the follow input file" and "open program file error" are displayed, check the configuration of the AM description table or check the file modutype.str. Check whether Localmodu in the file BAM.ini is set to 1 (the correct configuration is 0), whether the address of the BAM_SEG segment is configured correctly, and whether the BAM module No. and IS_MODULE_TYPE in [Software Parameter Table] are correctly set.
5) Check to ensure that the clock selection in [Clock Description Table] is hardware selection, because the BM obtains clocks through optical paths. Optical path instability can also cause this problem. Therefore, check the optical path.

3.2.5 BM Loading Timeout

I. Description

BM loading times out.

II. Analysis

The problem might be caused by the following factors.

1) BAM is operating abnormally.
2) BM is operating abnormally.
3) Version mismatch
4) Loading channel unstable
5) Clocks unstable

III. Troubleshooting Process

1) If the BAM works unstably, a lot of data between BM and BAM gets lost. In this case, reset the BAM.
2) If the GMPU works abnormally, reset the GMPU.
3) If the programs of BAM and OMC work abnormally or their versions do not match (this usually occurs at first installation or upgrading), install the correct versions according to the version matching description provided by the software provider.
4) Check the stability of the loading channels, especially the communication between AM and BAM and the optical paths between AM and BM.
5) Check the stability of the clocks, especially the clock from GCKS to GCTN (the BM clocks are transmitted through optical paths).

3.2.6 BM Loading Starts Again after Successful Loading

I. Description

BM loading starts again after successful BM loading.

II. Analysis

The problem might be caused by the following factors.

1) Data configuration and version matching errors
2) Clocks abnormal or clock settings incorrect

III. Troubleshooting Process

1) During installation or upgrading, note the matching relationship between different versions, especially the consistency between the version of the GMPU program and that of the OMC software. Replace the existing program or the existing version of OMC and BAM according to the version matching description and the upgrading guide. Note the configuration of some data tables, including [Frame Description Table], [BSC Slot Description Table], [BSC BIE Active/standby Boards Description Table], the site [TRX Configuration Table] and [Software Parameter Table]. An initialization failure of these tables might cause the BM to reload. Note also that the number of data records cannot exceed the maximum threshold.
2) Check the clock channels, including GCKS, the clock cables and the optical paths from GCKS to GCTN. Check the setting of the clock mode. Usually the clock mode is set to hardware selection. However, if the default value GOPT0 is adopted, the GMPU might not work if links cannot be located due to a GOPT0 fault.
3) Check whether the file fsk.dat exists under the directory DLOAD.
4) Check the segment address of the BM in BAM.INI. Usually the configuration is 0x800000.
5) Check whether there is a data file for the corresponding module under the directory DLOAD. For module x, the data file is Db_x.dat.

3.2.7 BM Does Not Work Normally or Reloads Automatically

I. Description

The BM does not work normally or reloads automatically.

II. Analysis

1) Check the setting of the segment address.
2) Check the version.

III. Troubleshooting Process

1) Most probably the problem is caused by an incorrect setting of the BAM offset address during the first installation or upgrading. Check the value of BSM_SEG in [DownLoad] in the file BAM.INI. Usually the provider's recommended value, 0x800000, is adopted.
2) Check whether the versions for installation or upgrading match each other, and whether the data tables in the data management console have been correctly updated according to the version matching description or upgrading guide provided by the software provider.
3) Check whether the programs of BAM and OMC are abnormal or whether the versions do not match. If so, re-install the OMC.
4) Check the consistency between [BSC BIE Active/standby Boards Description Table] and [Slot Description Table].
5) Check whether the correspondence table between MSM and FTC is consistent with the slot description table.
6) Close all data tables open in the data management console, and press <Ctrl+Shift+F12> to check whether the file boarddes.dbf under the configuration menu is empty. If it is empty, the upgrading procedures were incorrect. Re-upgrade according to the correct upgrading procedures.
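Several of the checks in 3.2.4 and 3.2.6 come down to confirming that the files needed for BM loading (cc08.bsm, db_x.dat for module x, fsk.dat and modutype.str) are present in the DLOAD directory. A minimal Python sketch of such a check is given below. The function name missing_bm_files is hypothetical, and file names are compared case-insensitively on the assumption that the BAM runs on a Windows file system.

```python
from pathlib import Path

def missing_bm_files(dload_dir, module_no):
    """Return the BM loading files that are absent from the DLOAD directory.

    dload_dir -- path of the DLOAD directory on the BAM
    module_no -- module No. of the BM, used in the data file name db_<No>.dat
    """
    required = ["cc08.bsm", f"db_{module_no}.dat", "fsk.dat", "modutype.str"]
    present = {
        entry.name.lower()
        for entry in Path(dload_dir).iterdir()
        if entry.is_file()
    }
    return [name for name in required if name.lower() not in present]
```

For example, missing_bm_files(r"C:\omc\bsc\bam\dload", 1) would list whichever of the four files for module 1 are not found. An empty list means the file set is complete, although it says nothing about whether the files themselves are valid.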
3.2.8 Operating Parameters Not Updated after Data Loading

I. Description
The operating parameters of the GMPU are found not to be updated after the data has been modified and reloaded to the GMPU.

II. Analysis
1) Check whether the data has been re-converted.
2) Check whether the data-writable and data-available DIP switches on the boards are correctly set.
3) Check whether the communication between BM and BAM is normal.

III. Troubleshooting Process
1) Make sure the data has been re-converted; if not, re-convert and load the data.
2) Make sure the data-writable DIP switch on the BM or AM to be loaded with the software is set to ON, and the data-available switch to OFF.
3) Make sure the communication between BM and BAM is normal. Refer to the troubleshooting for loading and OMC.
Chapter 4 Troubleshooting for Links

4.1 Overview
Besides the voice and data transferred between units in GSM, a large amount of signaling is transmitted between the BTS, BSC, MSC and PCU and between modules in the BSC. The signaling is transmitted to each unit via different links. Between BSC and MSC, SS7 is used, and the link that carries SS7 is called the SS7 link. The signaling link between TRX and BSC is the RSL. The OML is the signaling link that bears maintenance messages between BSC and BTS. The signaling messages between BSC and PCU are carried by the PbSL. The RSL, OML and PbSL are all LAPD links.

4.2 Fundamental Knowledge

4.2.1 SS7 Concepts

I. DPC and OPC
Each signaling point in GSM has a signaling point code. For BSC and MSC, the signaling point code of the BSC acts as the originating signaling point code (OPC) and that of the MSC as the destination signaling point code (DPC).

II. SLC
The signaling link code (SLC) identifies a signaling link between signaling points. It contains 4 bits, so there are at most 16 (2^4 = 16) signaling links in each office. For a signaling link, the SLC at the BSC side must be the same as that at the MSC side. Otherwise, signaling interconnection cannot be performed. As can be seen, the function of the SLC is like that of the CIC.

III. Initial establishment
Initial establishment is the operation that must be performed for a link before it is put into service. It is used during link initialization or link troubleshooting. The initial establishment function is performed by the SF in the link state signal unit (LSSU). Table 4-1 lists the six states of the LSSU.
Table 4-1 Six states of LSSU

Name    Meaning
SIO     Out of establishment
SIN     Normal establishment
SIE     Emergency establishment
SIOS    Out of service
SIPO    Processor fault
SIB     Link blocked
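The states in Table 4-1 can be decoded mechanically when reading a raw LSSU in a trace. The following is a minimal sketch, assuming the status-field encoding of ITU-T Q.703 (three low-order bits of the first SF octet, values 0 to 5), which this manual does not state. The names LSSU_STATUS and decode_lssu_status are illustrative only and are not part of the BSC software.

```python
# Status indications carried in the status field (SF) of an LSSU.
# The numeric codes are an assumption taken from ITU-T Q.703;
# the meanings are copied from Table 4-1 of this manual.
LSSU_STATUS = {
    0: ("SIO", "Out of establishment"),
    1: ("SIN", "Normal establishment"),
    2: ("SIE", "Emergency establishment"),
    3: ("SIOS", "Out of service"),
    4: ("SIPO", "Processor fault"),
    5: ("SIB", "Link blocked"),
}


def decode_lssu_status(sf_octet):
    """Return (name, meaning) for the first octet of an LSSU status field."""
    code = sf_octet & 0x07  # only the three low-order bits carry the status
    return LSSU_STATUS.get(code, ("UNKNOWN", "Spare value"))


if __name__ == "__main__":
    for octet in (0x00, 0x03, 0x04):
        print(hex(octet), decode_lssu_status(octet))
```

A trace that shows SIOS or SIPO arriving from the peer points to the remote end, which matches the cause-value handling described later in this chapter.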
The initial establishment process of a link includes five steps: link not aligned → link aligned → accepted periodically → acceptance completed → put into service.

4.2.2 SS7 Direction
From BSC to MSC, the SS7 runs via the following:
LPN7 → GNET → HW → transparent BIE → E1 cable → E3M port 4 → E3M → E3M port n (n = 0, 1, 2 and 3) → E1 cable → MSM → FTC m (m = 0, 1, 2 and 3) of the TCSM unit → E1 cable (and other transmission devices) → MSC.
Note: all numbers here begin with 0.

4.2.3 GMC2
The GMC2 is the communication board between AM/CM and BM. Whether AM/CM delivers commands to BM or BM reports its state to AM/CM, the messages in both directions have to pass through the GMC2. Its performance directly affects the communication between AM/CM and BM. The BM can contain two GMC2s, with the left one numbered 0 and the right one numbered 1.
On the GMC2, the indicator RUN shows the board status and F0 shows the link status. RUN keeps ON when the board runs normally. F0 keeps ON when the link is normal, flashes when the link is not established and keeps OFF when the board is not started. These indicators are useful for trouble analysis.
The communication signaling link between AM/CM and BM includes:
GMCCS → GFBI → GOPT → GMC2 → GMPU
An error in any part may cause GMC2 link establishment trouble.

4.2.4 Pb Interface
PCU and BSC are connected via E1 cables. The Pb interface between PCU and BSC is an internal interface. PCU and BTS are not directly connected but communicate via the BSC. Therefore, the E1 cables between PCU and BSC have two functions: some act as signaling links for PCU-BSC communication and the others as packet service links for PCU-BTS communication.
Each PCU can be connected with many BSCs. The BSCs are numbered via BSC No. in the PCU. There can be many E1 cables between each BSC and the PCU. The RPPUs connected with the same BSC can back up each other. In the PCU, each RPPU can be connected with only one BSC.
A 2 Mbit/s E1 cable contains 32 time slots of 64 kbit/s, and each time slot can be divided into four 16 kbit/s sub-time slots. That is, an E1 cable contains 128 sub-time slots of 16 kbit/s. E1 time slot allocation on the Pb interface is shown in Table 4-2, in which the numbers in the left column are the numbers of the 64 kbit/s E1 time slots. Ti (i ranges
Note: In the PCU, each E1 sub-time slot is defined via RPPU No., RPPU E1 No. and E1 time slot No. together. The sub-time slots are numbered left to right and then top to bottom, starting with the first sub-time slot of time slot 0. In the BSC, however, the PCIC is listed top to bottom and then left to right. Therefore, attention should be paid to the correspondence between sub-time slot No. and PCIC. For example, if the E1 sub-time slot No. at the PCU side and the PCIC at the BSC side both begin with 0, E1 sub-time slot 1 corresponds to PCIC 32, E1 sub-time slot 2 to PCIC 64, E1 sub-time slot 3 to PCIC 96, E1 sub-time slot 4 to PCIC 1 and so on.
The following formula can also be applied to get the correspondence between the E1 sub-time slot in the BSC and that in the PCU:
BSC sub-time slot No. = 32 × (PCU sub-time slot No. MOD 4) + (PCU sub-time slot No. DIV 4)
Here DIV denotes integer division (the remainder is discarded), as the example values above require.
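The mapping above is easy to get wrong by hand, so a short script can be used to cross-check a data configuration. This is a minimal sketch of the formula in this note, assuming both numberings start at 0 and one E1 carries 128 sub-time slots. The function names are illustrative only, and the inverse mapping is derived here rather than taken from the manual.

```python
SUB_SLOTS_PER_E1 = 128  # 32 time slots x 4 sub-time slots of 16 kbit/s


def pcu_to_bsc_sub_slot(pcu_no):
    """Map a PCU-side E1 sub-time slot No. to the BSC-side No. (PCIC)."""
    if not 0 <= pcu_no < SUB_SLOTS_PER_E1:
        raise ValueError("sub-time slot No. must be in 0..127")
    # Formula from the manual: 32 x (PCU No. MOD 4) + (PCU No. DIV 4)
    return 32 * (pcu_no % 4) + pcu_no // 4


def bsc_to_pcu_sub_slot(bsc_no):
    """Inverse mapping (derived for this sketch, not given in the manual)."""
    if not 0 <= bsc_no < SUB_SLOTS_PER_E1:
        raise ValueError("sub-time slot No. must be in 0..127")
    return 4 * (bsc_no % 32) + bsc_no // 32


if __name__ == "__main__":
    for pcu_no in range(6):
        print(pcu_no, "->", pcu_to_bsc_sub_slot(pcu_no))
```

Running the script prints the pairs for sub-time slots 0 to 5, which can be compared with the example values in the note (1 to 32, 2 to 64, 3 to 96, 4 to 1).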
4.2.5 Signaling Direction on LAPD Links
LAPD links include the RSL, OML and PbSL. In the BSC, the signaling direction on the RSL and on the OML is the same:
GLAP → GNET → HW → BIE → E1 cable (and other transmission devices) → BTS.
The signaling direction on the PbSL is:
GLAP → GNET → HW → BIE → E1 cable → E3M port 4 → E3M → E3M port n (n = 0, 1, 2 and 3) → E1 cable → PCU.
Note: all numbers here begin with 0.

4.3 Trouble Handling

4.3.1 No SS7 Signaling Trace Message

I. Description
No message is displayed in the signaling trace window when SS7 trace is enabled at the BSC maintenance console.

II. SS7 trace
Communication between BSC and MSC is performed via standard SS7. The BSC traces and monitors the SS7 using trace code at the BAM maintenance console. This process is called SS7 trace.

III. Analysis
Check whether the selected message type is right.
Check whether the selected module No. is right.
Check whether the selected link No. is right.
Check whether the version of the BAM maintenance console is the same as that of the FAM BSC.

IV. Troubleshooting process
1) Select the right message type.
2) Select the right module No.
3) Select the right link No. according to the data configuration.
4) Install the matching version provided by the supplier.
5) Select the menu [Maintenance/SS7/State query] and then input a specified link to view the state of the SS7 link at the BSC side. If the field [link Fault] is shown as NO and [transmission service] as YES, the SS7 link can be considered established.
6) Click the menu [SS7/SCCP maintenance/State query] and select "signaling point query" or "subsystem query" to see whether the related signaling point at the BSC side and that at the MSC side are properly configured. If both states are shown as ALLOWED, it can be considered that the signaling related
configuration is right at both sides. Otherwise, please check the signaling data at the two sides.

4.3.2 RSL Disabled

I. Description
During maintenance link query at the maintenance console, the signaling link corresponding to the RSL is found not to be in the multi-frame connection setup state.

II. Analysis
1) Check whether the corresponding OML is normal.
2) Check whether the corresponding TRX is enabled.
3) See what state the corresponding signaling link is in.
4) Check whether the data configuration of the corresponding link is right.
5) Check the corresponding BIE to see whether the E1 cable and HW are well connected.
6) Check whether the corresponding FPU (in BTS2X) or TRX (in BTS3X) is in the normal state and whether the board software version is right.
7) View the eight green indicators under the removable panel of the GLAP.

III. Troubleshooting process
1) If the link is in the TEI unallocated state, check whether the data configuration of the corresponding link is right.
2) If the link is in the interrupted state, check whether the data configuration of the corresponding link is consistent with the corresponding hardware configuration. If not, modify the data configuration or change the hardware configuration according to the project files to make the two consistent.
3) Check the corresponding BIE to see whether the E1 cable and HW are well connected. If not, correct them.
4) Check whether the corresponding GLAP is in the right position and in the normal state, and whether other links are established.
5) Check whether the BIE (in BTS2X) is in the normal state.
6) Check whether the corresponding FPU (in BTS2X) or TRX is in the normal state and whether the board software version is right.
7) Replace the TMU (in BTS3X) or FPU (in BTS2X), the TRX and their backplanes to see whether the problem lies in the BTS hardware.
8) When many RSLs of a BSC are found faulty, it is recommended to replace the BSC network board or clock frame.
9) Insert another GLAP into the slot and see whether the problem lies in the board hardware.
10) When satellite transmission is adopted between BTS and BSC, check whether the OML is established. If not, check whether the software version is right,
whether the data configuration is right and whether the satellite transmission cable is normal. If yes, check the TRX related software and hardware to see whether the TRX knows that the satellite transmission mode is adopted and whether it performs the proper processing.

4.3.3 OML Disabled

I. Description
During maintenance link query at the maintenance console, the signaling link corresponding to the OML is found not to be in the multi-frame connection setup state.

II. Analysis
1) See what state the corresponding signaling link is in.
2) Check whether the corresponding data configuration is right.
3) Check whether the corresponding GLAP is in the right position.
4) Check whether the corresponding BIE is in the right position and whether the E1 cable and HW are well connected.
5) Check whether the BIE in the BTS is in the normal state.
6) Check whether the OMU (in BTS20) or TMU (in BTS3X) is in the normal state.
7) View the eight green indicators under the removable panel of the corresponding GLAP.

III. Troubleshooting process
1) If the OML is in the TEI unallocated state, check whether the data configuration of the corresponding link is right.
2) Check whether the corresponding data configuration is right and consistent with the corresponding hardware configuration.
3) Check whether the GLAP is in the right position.
4) Check whether the corresponding BIE is in the normal state and whether the E1 cable and HW are well connected.
5) Check whether the BIE in the BTS is in the normal state.
6) Check whether the OMU or TMU is in the normal state.
7) When many RSLs or OMLs of a BSC are found faulty, it is recommended to replace the BSC network board or clock frame.
8) Reset the GLAP and see whether the problem lies in the board software.
9) Insert another GLAP into the slot and see whether the problem lies in the board hardware.
10) When satellite transmission is adopted between BTS and BSC, please check whether the satellite transmission cable is normal and whether the software version and corresponding settings are right.
4.3.4 SS7 Link Faulty

I. Description
The SS7 link may meet with the following four troubles:
- Link interrupted, i.e. the link is not established.
- Link unstable, i.e. so many initial establishments happen to the link that the link cannot carry any service.
- Link out of service, i.e. the remote processor meets with a fault.
- Link blocked, i.e. the link suffers from overload.

II. Possible causes

Table 4-3 Possible causes of SS7 link trouble

No. 1
Possible cause: Transmission trouble
Remarks: Interrupted transmission may cause link interruption. Intermittent transmission, bit errors and bad trunk cable connection may cause a higher link error rate and more message retransmissions between BSC and MSC. This may result in frequent initial establishments, link congestion and link unstableness.

No. 2
Possible cause: Trunk board trouble, including troubles in the transparent BIE, E3M, MSM and FTC
Remarks: This may cause link interruption, link unstableness and link congestion.

No. 3
Possible cause: LPN7 protocol processing board trouble

No. 4
Possible cause: GNET trouble

No. 5
Possible cause: Remote processor fault
Remarks: This may cause link interruption and link congestion.

No. 6
Possible cause: Signaling link blocked manually at the remote end

No. 7
Possible cause: Data configuration unsuitable
Remarks: When the MTP data (SLC and DPC) is not consistent with that at the peer end, the link cannot pass the Layer-3 signaling link test after initial establishment. This may result in so many link establishments that the link becomes unstable. When the linkset or link selecting code is unsuitable, the load cannot be evenly shared among the signaling links. This may cause the link with heavy load to be congested.
III. Analysis: link interrupted
1) Check whether the transmission is interrupted.
Transmission interruption is a common cause of link interruption. The key to judging whether the transmission is interrupted is to check whether an out-of-frame alarm is generated on the corresponding E1 cable. When finding the external transmission system is interrupted, please contact the Transmission Sector.
2) Check whether the corresponding board fails and whether cables between boards are securely connected.
Failure of a protocol processing board, trunk board or switching network board is also a common cause of link interruption. Specifically, such boards include the LPN7, GNET, transparent BIE, E3M, MSM, FTC and their backplanes. The cables between these boards are: HW between the GNET and transparent BIE, E1 cable between the transparent BIE and E3M, and E1 cable between the E3M and MSM. Generally the cause of board trouble can be found by querying alarm information at the alarm console.
3) Self-loop test method
Self-loop test is one of the most commonly used ways to locate link trouble. It helps to narrow down the trouble area. During a self-loop test over SS7, the following can be used to judge whether each part of the link is normal.
- When the originating signaling point is normal, the link can be normally aligned at Layer 2. However, when a parameter such as SLC or DPC is unsuitable, the link may fail the Layer-3 test, break and return to the initial establishment process. Externally, the indicator (corresponding to the SS7 link) on the LPN7 goes out periodically at one-minute intervals.
- When the originating signaling point is exceptional, the indicator (corresponding to the SS7 link) on the LPN7 keeps OFF after self-loop, which indicates that the link is interrupted.
Now read again the SS7 direction:
LPN7 → GNET → HW → transparent BIE (self-loop allowed) → E3M (self-loop allowed) → MSM → FTC (self-loop allowed) → MSC.
When the originating signaling point is found exceptional after self-loop, it can be considered that the corresponding board is faulty, that the cables are inconsistent with the corresponding data configuration, or that the related data configuration has errors. For details, see 4.2.2 SS7 Direction.
When the originating signaling point is found normal upon completion of self-loop over the FTC, check whether the SLC, OPC and DPC at the BSC side are consistent with those at the MSC, whether the E1 cable connection over the A interface between BSC and MSC is consistent with the corresponding data configuration, and whether E1 cable transmission is normal.
SS7 link interruption trouble is of two kinds: trouble that happens due to SSU link establishment trouble during connection setup over the A interface, and trouble that happens after the connection is set up. For the former, check in turn whether the SS7 link related data configurations at both sides are consistent, whether cables are properly and securely connected, whether transmission is normal and whether the related board is faulty. For the latter, check whether the related board is faulty, whether cables are securely connected and whether transmission is normal.

IV. Analysis: link unstable
1) Check whether the transmission BER is too high.
When the transmission BER is higher than the order of magnitude of 10^-6, the link error rate becomes higher and more message retransmissions occur between switching exchanges. This may cause the link to go out of establishment, be realigned and become unstable. The following two methods can be used to check whether the transmission BER is too high.
- Check whether the bit error rate indicator corresponding to the transmission device and switching exchange is normal.
- Use the BER tester and PCM analyzer to test the E1 channel and obtain the accurate BER value.
2) Check whether the related board fails and whether cables between boards are securely connected.
Check the signaling link related protocol processing board, trunk board and switching network board, including the LPN7, GNET, transparent BIE, E3M, MSM, FTC and their backplanes. The cables between these boards are: HW between the GNET and transparent BIE, E1 cable between the transparent BIE and E3M, and E1 cable between the E3M and MSM.
Generally some signaling links or trunk circuits may also become exceptional when these boards fail. Therefore, comprehensive analysis can be made to locate the link unstableness trouble. If necessary, replacing the board while the system is running can also be used to locate the trouble.
3) Check whether the data of the MTP link is right.
The parameters SLC and DPC are very important for signaling link establishment. When the DPC is wrongly configured or when the SLC does not match its counterpart at the peer end, the link will fail the Layer-3 signaling link test after initial establishment, as an SLTM can indicate. In other words, the link cannot bear any signaling service. Then another initial establishment will be performed over the link. Due to the data configuration error, the initial establishment will occur periodically. Consequently the link stays out of service and unstable. The DPC setting can be obtained from network planning. As for the SLC, both sides should negotiate so that the corresponding settings match.
4) Find out the cause of intermittence via alarm information.
Many link troubles result from transmission problems, such as too high a transmission BER, an unstable clock and out-of-frame problems.
When the transmission quality is bad and the transmission BER is too high, a PCM alarm may be generated. Generally this alarm can be used to find out the cause of link interruption. However, the transmission BER should be less than 1 × 10^-6, while the PCM alarm threshold is higher than this value. Therefore, the link interruption threshold may be reached before the PCM alarm threshold. In other words, the signaling link interruption alarm may be generated before the PCM alarm.
Bit errors are accumulated in a link. When the number of accumulated bit errors reaches a certain value, the link will be interrupted. If the PCM alarm threshold has not been reached at this time, the PCM alarm may not be generated although the link is interrupted, or it may be generated after the link is interrupted. The absence of a PCM alarm therefore does not mean that there are no bit errors in a link. In this case, analyze the cause of the link traffic interruption alarm to see whether the link interruption results from a transmission problem. Figure 4-1 shows a signaling link traffic interruption alarm.
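The accumulation of errors described above can be illustrated with a leaky-bucket counter. The sketch below is a simplified model, assuming the signal unit error rate monitor (SUERM) parameters of ITU-T Q.703 for 64 kbit/s links (counter +1 per errored signal unit, -1 for every 256 signal units received, link failure at 64). These values come from the standard and are not stated in this manual, so the equipment may behave differently. The function name is illustrative only.

```python
def suerm_link_fails(errored_flags, threshold=64, decrement_interval=256):
    """Simplified SUERM model: return True if the link would be declared failed.

    errored_flags: iterable of booleans, one per received signal unit,
                   True when the signal unit was received in error.
    threshold / decrement_interval: assumed Q.703 values (64 and 256).
    """
    counter = 0
    received = 0
    for errored in errored_flags:
        received += 1
        if errored:
            counter += 1
            if counter >= threshold:
                return True  # accumulated errors reached the failure threshold
        # The bucket leaks by one for every block of signal units received.
        if received % decrement_interval == 0 and counter > 0:
            counter -= 1
    return False


if __name__ == "__main__":
    burst = [True] * 64                                # short error burst
    sparse = [i % 300 == 0 for i in range(30000)]      # one error per 300 units
    print(suerm_link_fails(burst), suerm_link_fails(sparse))
```

In this model a short burst of errors interrupts the link, while isolated errors leak away without any effect, which is consistent with a link being interrupted before a slower PCM alarm threshold is reached.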
Figure 4-1 Signaling link traffic interruption alarm

- The detailed alarm explanation provides the cause values of signaling link interruption (like 0x09 in the figure). Details are given below:

Cause value: 02H
A STOP command is received from the MTP3. Find out why the MTP3 delivers that command by analyzing the messages received and transmitted at the MTP3. The common reason is that the Layer-3 signaling link test is not passed (as an SLTM can indicate) or that the MTP3 twice fails to receive the SLTA message from the peer end. Analyze the traced messages to find the real reason why the MTP3 delivers the STOP command.

Cause value: 08H
The initial establishment control (IAC) reports that establishment is impossible. Following this cause, a specific Layer-2 cause will be given. For example, the cause 6A 03 may follow the cause 2C 08, indicating that establishment is impossible because congestion lasts too long.

Cause value: 09H
The basic reception control (RC) reports that the link trouble results from an exceptional BSN. When such trouble occurs to the GLAP and the link is interrupted frequently, the BSN or FIB is most probably exceptional. Such trouble may not be cleared automatically, and a board reset may be required. Generally the reason for such trouble is that bit errors cause transmission unstableness, so make sure that the transmission has recovered before resetting the board.

Cause value: 0AH
The RC reports that the link trouble results from an exceptional FIB. See cause value 09H above for handling suggestions.

Cause value: 0CH
The RC reports that SIO is received. Ask the opposite office to find out why it transmitted SIO during the link transmission process.

Cause value: 0DH
The RC reports that SIN is received. Follow the above handling suggestions.

Cause value: 0EH
The RC reports that SIE is received. Follow the above handling suggestions.

Cause value: 0FH
The RC reports that SIOS is received. It can be considered that the opposite office suffers from link interruption earlier than the local office does. Ask the opposite office to find out the reason.

Cause value: 12H
The signal unit error rate monitor (SUERM) reports that the link is faulty. It can be considered that the link interruption is caused by too high a BER. Please recover the link transmission.

Cause value: 15H
T7 timeout causes the link interruption. First check whether the BER is too high and whether the link is congested. When this kind of link interruption persists, perform a self-loop test over the link. Most probably this trouble is caused by transmission bit errors. If the self-loop test does not help to locate the trouble, view the current traffic and see whether the link is overloaded.
T7 holds the maximum wait time (0.8 to 1.5 s) for peer verification of an MSU sent by Layer 2. Every time an MSU is sent, a T7 timer starts. When no verification has been received within this time, a verification timeout message is returned and the link is interrupted. Layer 2 then reports that the link is interrupted due to T7 timeout.

- The signaling link establishment trouble alarm also provides the cause values of establishment trouble. Details are given below.

Cause value: 0FH
During the establishment process, the RC reports that SIOS is received. Probably this is caused by a problem in the opposite office. Ask the opposite office to perform a check. In the meantime, the local office can also perform a self-loop test to see whether the problem lies in the local office.

Cause value: 16H
T2 timeout causes this trouble. The problem is that no message is received from the peer office. Procedures for locating this problem are given below.
a) Check whether the problem lies in the peer office or in the local office by connecting the Tx and Rx E1 cables of the link.
If the link then intermits at 12 s intervals, it can be considered that the local office is all right and that the problem lies in the peer office or in the transmission between both offices. Ask the peer office to check whether the link is activated and whether the related configuration is right. Meanwhile you may check the transmission.
b) When the link cannot be established in the local office during the self-loop process, check the local office first. When the cabinet is of type B, check whether the HW between the network board and trunk board is properly connected. You may perform self-loop tests step by step to locate the trouble.

Cause value: 19H
Verification over the establishment BER monitor (AERM) is terminated. This can be caused by too high a transmission BER. Please recover the transmission.
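The cause values listed above lend themselves to a quick lookup when reading alarm details. The following is a minimal sketch that only restates the values and suggestions given in this section. The dictionary names and the function describe_cause are illustrative and do not correspond to any tool in the maintenance console.

```python
# Cause values of the signaling link traffic interruption alarm (from this section).
INTERRUPTION_CAUSES = {
    0x02: "STOP command received from MTP3; analyze traced MTP3 messages (SLTM/SLTA).",
    0x08: "IAC reports establishment impossible; read the Layer-2 cause that follows.",
    0x09: "RC reports exceptional BSN; recover transmission, then reset the board.",
    0x0A: "RC reports exceptional FIB; handle as for 09H.",
    0x0C: "RC reports SIO received; ask the opposite office why it sent SIO.",
    0x0D: "RC reports SIN received; ask the opposite office.",
    0x0E: "RC reports SIE received; ask the opposite office.",
    0x0F: "RC reports SIOS received; the opposite office lost the link first.",
    0x12: "SUERM reports link faulty; BER too high, recover the transmission.",
    0x15: "T7 timeout; check BER and congestion, then self-loop and traffic load.",
}

# Cause values of the signaling link establishment trouble alarm.
ESTABLISHMENT_CAUSES = {
    0x0F: "RC reports SIOS received; ask the opposite office, self-loop locally.",
    0x16: "T2 timeout; nothing received from the peer, loop Tx to Rx to localize.",
    0x19: "AERM verification terminated; BER too high, recover the transmission.",
}


def describe_cause(alarm_kind, cause_value):
    """Return the handling hint for a cause value, or a fallback string."""
    table = {
        "interruption": INTERRUPTION_CAUSES,
        "establishment": ESTABLISHMENT_CAUSES,
    }[alarm_kind]
    return table.get(cause_value, "Cause value not described in this section.")


if __name__ == "__main__":
    print(describe_cause("interruption", 0x09))
    print(describe_cause("establishment", 0x16))
```

Keeping the two alarm types in separate tables matters because the same value (0FH) appears in both with different handling.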
V. Analysis: link out of service
A link goes out of service because an LSSU (SIPO) message is received from the remote end and Layer-3 messages cannot be processed. The remote end transmits the SIPO message for one of the following reasons:
- The remote processor met with a fault and the system automatically transmitted the SIPO message.
- The MML command to set the link to the out-of-service state was manually executed at the remote end. The system then transmitted the SIPO message in answer to the command.
Please contact the peer office to learn the real reason and recover the link as soon as possible.

VI. Analysis: link blocked
When a link is blocked, the system reports the trouble details at the alarm console. Generally the trouble cause can be roughly found there, and handling can be performed as per the suggestions provided by the system. The trouble can also be located according to the following procedures.
1) Check whether the transmission is normal.
Intermittent transmission and bit errors may cause a higher link error rate, more message retransmissions between switching exchanges and longer transmission delay. This may result in frequent link switchover, link unstableness and link congestion. Methods of querying the transmission status are given below:
- View the panel indicators corresponding to the transparent BIE, E3M, MSM and FTC.
- Perform a BER test over the related channels using the BER tester and PCM analyzer.
- Contact the transmission sector to learn the transmission status.
2) Check whether the related board fails and whether cables between boards are securely connected.
Check the signaling link related protocol processing board, trunk board and switching network board, including the LPN7, GNET, transparent BIE, E3M, MSM, FTC and their backplanes. The cables between these boards are: HW between the GNET and transparent BIE, E1 cable between the transparent BIE and E3M, and E1 cable between the E3M and MSM.
Generally some signaling links or trunk circuits may also become exceptional when these boards fail. Therefore, comprehensive analysis can be made to locate the link trouble. If necessary, replacing the board while the system is running can also be used to locate the trouble.
3) Check whether the data configuration is right.
When the link selection code configured in the MTP Linkset Table does not match the actual number of SS7 links, some links will never be selected as the SS7 link, and the signaling load over the links will be uneven. When traffic increases, the link with the relatively heavy load will be blocked.
VII. Maintenance suggestions
When BSC and MSC are in two different places, the transmission system is an important part to be considered when analyzing SS7 link trouble. Therefore, periodic maintenance should be performed on transmission devices. Please contact the transmission sector as soon as possible in case of link trouble. In addition, the analysis must cover both BSC and MSC since the SS7 link involves both of them.

4.3.5 AM/CM-BM Link Faulty

I. Description
The indicator F0 on one or both GMC2s is exceptional or the indicator RUN does not keep ON. When none of the links of the two GMC2s can be established, the corresponding BM information (like control panel or software version) cannot be queried at the maintenance console and the AM/CM will lose touch with the BM.

II. Analysis
1) Check whether the GMC2 is normal.
2) Check whether the link is established on the GMC2. If not, check whether each part of the communication channel is normal, including the GMCCS, FBI, OPT, optical fiber, AM/CM backplane and BM backplane.
3) When only one link can be aligned on the GMC2, check the DIP switch on the GMPU and the jumper switch on the backplane.

4.3.6 Optical Path Faulty

I. Description
When the optical path is faulty, the LFA, RMT and BER indicators on the GFBI and GOPT will keep ON. For example:
When the optical path in the GFBI is out of synchronization, the LFA indicator corresponding to the optical path keeps ON and so does the RMT indicator on the GOPT.
When the optical path in the GOPT is out of synchronization, the LFA indicator on the GOPT keeps ON and so does the RMT indicator corresponding to the optical path.
When the GOPT receives no optical signal, the RNL indicator on the GOPT keeps ON.
The alarm box gives an audible and visual alarm. In the meantime, an alarm record related to GOPT or GFBI trouble is displayed at the alarm console.
In addition, when the optical path is unstable, indicators on the GFBI may flash occasionally, noise may occur during conversation and the link may become unstable or interrupted.
II. Analysis
The optical path clock system includes the following parts:
AM/CM: GCKS → Clock wire → GCTN → GFBI → GOPT
The optical path system includes the following parts:
GFBI → Optical fiber → GOPT
Note: Please be sure to connect the optical fiber properly.
In case of optical path trouble, check each part of the optical path system and the optical path clock system.

4.3.7 PbSL Disabled

I. Description
During PbSL query at the maintenance console, the PbSL is found in the DISABLED state.

II. Analysis
This trouble has two typical causes. When the PbSL is in the BLOCKED management state, it can be considered that the operator has blocked the LAPD. When the PbSL is in the UNBLOCKED management state, it can be considered that the PbSL is physically faulty. In this case, the problem may lie in the E1 connection, the data configuration or L2PU trouble.

III. Troubleshooting process
1) Query the PbSL status at the PCU O&M console. When its management state is shown as BLOCKED, the PbSL is expected to be disabled. Unblock the PbSL and it should recover to the ENABLED state. Otherwise, it can be considered that the PbSL is faulty.
2) Check the data configuration. Check the E1 port configuration via "pcu check e1config" at the PCU O&M console and check whether the time slots configured for the PbSL are consistent with those configured at the BSC side.
3) Check the E1 cable connection. Check whether the E1 cable connection is consistent with the corresponding data configuration, whether E1 cables are properly connected, whether Tx/Rx connections are right and whether the E3M version at the BSC side is suitable.
4) Perform a self-loop test over the PbSL. The self-loop test can be performed via the command "mt lapd loop set" at the O&M console. There are many self-loop modes; self-loop over the E1 cable is recommended. When the self-loop test is passed, the PbSL operating state will be shown as ENABLED. Otherwise, it can be considered that the transmission is faulty.
Please contact the transmission device engineer for further handling.
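The four checks above can be read as a decision sequence. The following Python sketch is only an illustration under stated assumptions: the function name `next_pbsl_action`, its boolean inputs and the returned wording are hypothetical and are not part of the PCU O&M console. Only the console command quoted in the text and the order of the checks come from this manual.

```python
def next_pbsl_action(mgmt_state, config_matches_bsc, e1_cabling_ok, self_loop_passed):
    """Suggest the next action for a PbSL found in the DISABLED state.

    All inputs are observations the engineer has collected by hand;
    nothing here talks to real equipment (hypothetical helper).
    """
    # Step 1: a BLOCKED management state means the operator blocked the link.
    if mgmt_state == "BLOCKED":
        return "Unblock the PbSL and query its state again"
    # Step 2: time slots at the PCU must match those configured at the BSC.
    if not config_matches_bsc:
        return 'Correct the data configuration (check with "pcu check e1config")'
    # Step 3: physical E1 cabling must agree with the data configuration.
    if not e1_cabling_ok:
        return "Fix the E1 cable connection"
    # Step 4: a failed self-loop over the E1 cable points at transmission.
    if not self_loop_passed:
        return "Transmission fault: contact the transmission device engineer"
    return "Self-loop passed: continue checking at the BSC side"


print(next_pbsl_action("UNBLOCKED", True, True, False))
```

The checks are ordered from the cheapest (a status query) to the most intrusive (a loop-back test), which mirrors the order of the steps in the text.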
When the PbSL passes the self-loop test, the problem may lie at the BSC side. Check whether the related data configuration is right. See 4.2.5 Signaling Direction on LAPD Links.
When the A interface trunk circuit that shares the same E3M with the PbSL is faulty, check whether the E3M is in the normal state.
When the SS7 link that shares the same E3M with the PbSL is faulty, check whether the transparent BIE is in the normal state and whether the E1 cable between the transparent BIE and the E3M is securely connected.
When many other PbSLs, RSLs and OMLs of the BSC are found faulty, it is recommended to replace the GNET or the clock frame.

4.3.8 E1 Faulty
I. Description and possible causes
The E1 cable may meet with the following troubles:
1) Local/remote E1 out-of-synchronization alarms and LAPD alarms, together with their corresponding clear alarms, are reported frequently.
2) The E1 cable is disabled and the OML is broken.
3) A local E1 cable alarm is generated.
Possible causes:
1) The grounding cable is wrongly set.
2) A transmission device, board or E1 cable is faulty.
3) The lower cascaded site is not connected.
II. Analysis
1) When the E1 out-of-synchronization alarm, LAPD alarm and OML alarm are reported frequently,
the usual reason is that the E1 cable grounding is not properly set, so that interference is caused, or that a transmission device is faulty.
Troubleshooting procedures:
a) Check the E1 interface board (and TMU) at the BTS to see the E1 cable grounding settings.
b) Measure the resistance between the E1 cable connector and the rack to check the insulation.
c) Check whether the E1 cable connector in the DDF (when configured) is connected with the DDF grounding cable.
d) Check whether the E1 cable enclosure of the transmission device is grounded.
e) Review the above grounding situations and check whether the system is in the single-point-grounded state.
If not, change the system to the single-point-grounded state and then check whether the trouble is removed.
f) When the trouble is still not removed, the problem may lie in the transmission device, the E1 cable or the E1 interface board. Check the connection and replace the parts one by one to locate the trouble.
g) Check the transmission NM and see whether a transmission related alarm is given. If yes, handle it as per the related alarm details.
2) When the E1 cable is disabled,
the probable reason is that the E1 cable, the transmission device (sometimes a transmission device trouble causes no alarm) or a board is faulty.
a) Perform a self-loop test over the BTS using a self-loop cable and see whether the LIU indicator of the E1 interface board is OFF. If not, the problem lies in the BTS interface board. Replace the interface board.
b) Perform a self-loop test over the BSC using a self-loop cable and see whether the indicator of the BIE interface board is OFF. If not, it can be considered that the problem lies in the transmission device.
c) Check the transmission NM and see whether a transmission related alarm is given. Based on the alarm (if any), judge whether the problem lies in the transmission device.
d) When neither of them is faulty, the problem may lie in the cooperation between the transmission device and the BTS. See 4.4.3 Examples of E1 Cable Transmission.
3) When the BTS is normal and a local E1 cable alarm is given,
the usual reason is that the lower cascaded site is not connected or that it has a transmission trouble.
a) Check the data configuration and see whether the BTS is cascaded with a lower site that has not been started yet.
b) If yes, self-loop the cascading interface of the local BTS.

4.3.9 Microcell's Optical Transmission Board Faulty
I. Description and possible causes
The optical transmission board of a microcell may meet with the following troubles:
1) BSC and BTS fail to shake hands with each other.
2) The lower cascaded site is normal while the local BTS fails to shake hands with others.
3) The optical fibers of the transmission device and of the BTS are normal. However, the E1 cable is disabled when they are interconnected.
4) The network port IP address of the ASU can be successfully pinged; however, the NM software and the command line software cannot reach it normally.
Possible causes:
1) The optical transmission board is wrongly configured or the BTS DIP switch setting has errors.
2) The optical transmission board or the BTS is damaged, or the DIP switch setting has errors.
3) The attenuation is too large because the pigtail fiber built into the BTS is impaired.
II. Analysis
1) When BSC and BTS fail to shake hands with each other,
the usual reason is that the optical transmission board has a wrong traffic configuration or that the DIP switch setting has errors.
a) Check whether the indicator MOD in the BTS O&M cavity is flashing. If yes, it can be considered that the system is running normally.
b) Check whether the transmission mode related DIP switches in the O&M cavity are properly set. For the DIP switch LSB, ON indicates that the E1 transmission mode is adopted and OFF indicates that the optical fiber transmission mode is used. The default setting of the other DIP switches is ON.
c) When the DIP switch settings are right but the indicator does not flash, the problem may lie in the BTS. You may replace the BTS.
d) When both the indicator and the DIP switches prove to be normal, connect to the local network port and use the NM software to check whether the ASU data configuration is right.
e) Use an optical fiber to perform a self-loop test over the west optical interface of the BTS and see whether the BTS indicator LIU is OFF.
Note: A west optical interface is usually an upper cascading optical interface. It depends on the actual data configuration.
f) Via the NM software, check whether the ASU generates a transmission related alarm. When an optical path or tributary alarm indicator is ON, it may be considered that the ASU is faulty.
2) When the lower cascaded site is normal but the local BTS fails to shake hands with others,
the usual reason is that the optical transmission board has a wrong data configuration or that the ASU tributary is wrongly connected.
a) Use an optical fiber to perform a self-loop test over the west optical interface of the BTS and see whether the BTS indicator LIU is OFF.
When the indicator LIU is normally OFF, the problem may lie in the bottom layer of the OML at the local BTS or in the BSC data configuration. Check the BSC data configuration first. When no error is found, it can be considered that the BTS is damaged.
b) When the indicator is not OFF, connect to the local maintenance NM and see whether the ASU data configuration is right.
c) Check whether the ASU generates a transmission related alarm. When its data configuration proves to be right, it may be considered that the ASU is faulty. You may replace the BTS.
d) When all the above items prove to be normal, the problem may lie in the E1 cable connection between the ASU and the MMU or in the E1 port of the MMU. You may replace the BTS.
3) When both BSC and BTS pass the self-loop tests but fail to shake hands with each other after being interconnected, or the connection between them is intermittent,
a) Test the optical fiber transmission distance. The ASU supports a maximum optical fiber transmission distance of 30 km. When this value is exceeded, trouble will occur.
b) The receiving sensitivity of the optical transmission device or of the ASU optical interface at the BTS has decreased, or the optical fiber is impaired so that the attenuation is larger. You may add optical attenuators and test the actual transmission distance that the optical transmission board and the other devices can support. After locating the trouble, replace the related transmission device or the BTS.

4.4 Examples
4.4.1 Examples of SS7 Link Trouble
I. Example of transmission instability (with analysis based on the alarm)
The SS7 link of an office was often interrupted. After the link interruptions had been traced and analyzed over a long period, it was found that every time the link was interrupted, the corresponding PCM system generated a PCM alarm. The signaling link interruptions had the following causes:
• The MTP-2 timer T7 (1.5 s) timed out. After sending a message to the peer office, the local office did not receive any acknowledgement from the peer within 1.5 s. As specified in the protocol, the local office reestablished the link every time T7 timed out. In other words, the link was interrupted. As described above, it was because of transmission interruptions or bit errors that the local office could not receive the acknowledgement from the peer office within the time range of T7. The only way to solve this problem was to improve the transmission.
• During the link setup process, the timers T2 and T3 timed out. These two timers are used for the link setup process. When the local office has sent a SIO message to the peer office but has not received the SIO message from the peer, the timer T2 times out. At the establishment stage, each side keeps sending SIO messages to the peer until it receives the SIO from the peer. Then both sides enter the verification stage.
As the Rx E1 cable of the local office was interrupted, the local office could not receive the SIO message and T2 timed out. T3 is the timer that is actually used for the verification stage. When the local office has sent a verification message to the peer office but has not received the verification message from the peer, the timer T3 times out. Since the transmission was of bad quality, the Rx E1 cable of the local office could not send the message properly to the peer and T3 timed out.
• A SIOS message was received from the peer office. This message indicated that the link at the peer was in the interrupted state. Therefore, the link at the local office was also interrupted and a reestablishment process was initiated. After contacting the peer engineer, the local engineer learned that the peer link interruption was caused by Cause 62 (T7 timeout and message verification stopped). So this interruption was also due to transmission instability.
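The three causes listed above can be summarized as a small lookup from the observed event to the explanation given in this example. The Python sketch below is illustrative only: the event labels (`T7_TIMEOUT` and so on) and the helper names are hypothetical, and only the 1.5 s value of T7 and the explanations themselves are taken from the text.

```python
T7_SECONDS = 1.5  # T7 value quoted in this example

# Hypothetical event labels mapped to the explanations given in the text.
CAUSES = {
    "T7_TIMEOUT": "No acknowledgement from the peer within T7; link is reestablished",
    "T2_TIMEOUT": "No SIO received from the peer during link setup",
    "T3_TIMEOUT": "No verification message received from the peer during link setup",
    "SIOS_RECEIVED": "Peer reports its link as interrupted; local link is reestablished",
}


def explain(event):
    """Return the explanation for an observed event label."""
    return CAUSES.get(event, "Not covered by this example")


def ack_timed_out(sent_at, acked_at, limit=T7_SECONDS):
    """True when no acknowledgement arrived within the T7 limit (times in seconds)."""
    return acked_at is None or (acked_at - sent_at) > limit


print(explain("T7_TIMEOUT"))
print(ack_timed_out(sent_at=0.0, acked_at=2.0))  # True
```

In this example every entry in the table traced back to the same root cause, namely the quality of the transmission.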
From the above analyses, it could be seen that the link interruptions were caused by transmission instability, which also caused many PCM alarms. After the transmission problem was solved, the trouble was removed.
II. Example of peer office trouble (with analysis based on the alarm)
The SS7 link of an office was interrupted. One evening, the peer MSC was upgraded. Then the signaling link interruption alarm was generated in the local BSC for the first time. Its cause value was 0x0F, which indicated that the peer office had initiated the link interruption. This link interruption lasted nearly two hours. After querying the history alarms, the engineer found that the digital trunk PCM failure alarm, which had never occurred before, was generated many times after the peer upgrade. One day later, the signaling interruption failure alarm was frequently generated with the cause value 0x0F. From the signaling traced in the early morning, the engineer found that the local office had made many attempts to initiate the SS7 link reestablishment process after the link was interrupted. However, it had not received the acknowledgement message from the peer office within the protocol-specified time range, so the signaling link establishment failure alarm was generated (with the cause T2 timeout). From the cause value of the signaling link traffic interruption alarm, it could be seen that the SS7 link interruption was due to the peer office trouble.

4.4.2 Examples of RSL & OML Troubles
I. RSL establishment trouble: wrong software version
1) Description
There was a BTS30 with a configuration of S4/4/4. During deployment commissioning, the engineer found that none of the RSLs of the 12 TRXs could be established, while the OMLs were normally in the multi-frame connection setup state. All local MML maintenance operations were performed normally. However, each TRX was in the faulty state, i.e. displayed in red. All FAIL indicators on the TRXs were ON. The other boards were normal.
2) Analysis
At first, the engineer suspected that the problem lay in the data configuration. However, after a thorough check, no problem was found in the data configuration. Since the trouble only existed in the RSLs of all TRXs, the problem could only lie in the TRX boards. It occurred to the engineer that Huawei had delivered only 3 TRXs to that BTS and that the other 9 TRXs were borrowed from another BTS. The 3 TRXs were delivered two months later than the other 9. TRX version inconsistency might have caused the trouble. The engineer then queried the versions of all TRXs at that BTS via local MML commands and found that all TRXs showed the version 0.00R000.00000000. This meant that none of the TRXs was loaded with any software.
3) Troubleshooting process
The engineer loaded the software to the 12 TRXs and activated them. Soon all RSLs of the 12 TRXs were established and a call could be made.
II. RSL setup trouble: bad HW connection
1) Description
There was a BTS2X (Site A) with a configuration of S6/6/6. During deployment commissioning, the engineer found that the RSLs of the 8 TRXs on the second Tx & Rx E1 cables could not be established, while the OMLs were normally in the multi-frame connection setup state. All local MML maintenance operations were performed normally. Each TRX was in the normal state, i.e. displayed in green. All FAIL indicators on the TRXs were OFF. The ALARM indicators on the 8 FPUs that corresponded to the second Tx & Rx E1 cables flashed at 1 s intervals. The other boards were normal. At the remote maintenance console, the engineer saw that the channel states of the 8 TRXs were B and that the cell initialization process was normal.
2) Analysis
As seen from the trouble description, the RSLs that failed to be set up belonged to TRXs 4 and 5 in slave cabinet 1 and to the six TRXs in slave cabinet 2. If the problem lay in the cables between the columns, all RSLs would fail to be established. Therefore, it could be concluded that the cables between the columns had no error. At the remote maintenance console, the engineer found that all TRXs and FPUs had the right version. The TRX where the primary BCCH is located can affect the other TRXs in the corresponding cell: when its RSL cannot be established, the RSLs of the other TRXs in the cell cannot be established either. When the engineer replaced the TRX of the primary BCCH and the corresponding FPU, the trouble still existed. It was very unlikely that all 8 TRXs and FPUs were faulty, so the problem probably did not lie in the BTS hardware. Generally, the causes of RSL setup trouble include data configuration errors, transmission trouble and BSC/BTS hardware problems.
3) Troubleshooting process
a) Since the data setting center verified that the RSLs could be established with the data configuration, the problem probably did not lie in the data configuration.
b) The engineer suspected that the problem lay in transmission.
However, the indicator corresponding to the BIE was normally OFF. When the engineer disconnected one of the second Tx & Rx E1 cables from the BTS, the indicator flashed. When the engineer disconnected both of the second Tx & Rx E1 cables, the indicator was ON. Therefore, it was concluded that the Tx & Rx E1 cables had not been self-looped or mismatched.
c) Since the problem did not lie in the transmission between the BSC and the BTS, it was suspected that the E1 cable from the cabinet top to the backplane had errors. The engineer found no problem when he removed the back cabinet door and inspected the E1 cable. Then he interchanged the E1 cable from port 1 at the cabinet top with that from port 2 at the cabinet top. However, the trouble was not removed, so this possibility could be excluded.
d) After midnight, the engineer experimented with another S6/6/6 BTS (Site B). He verified the transmission system and hardware of Site A using the data of Site B, as shown in Figure 4-2. As a result, all RSLs were established in Site A and all TRX channels were normal. This indicated that the hardware of Site A and the transmission between Site A and the DDF were all right. So the problem must lie in the BSC.
Figure 4-2 Connecting Site A to the BSC ports of Site B
e) Then the engineer verified the BSC ports of Site A using the transmission system and hardware of Site B, as shown in Figure 4-3. As a result, none of the RSLs on the second Tx & Rx E1 cables was established. Hence, the engineer affirmed that the problem lay in the BSC. Since nothing was wrong with the Site B data, it must be the BSC hardware that had errors.
Figure 4-3 Connecting Site B to the BSC ports of Site A
f) After replacing the related BIE and GLAP, the engineer found that the trouble still existed. He checked the HW related data configuration and confirmed that it was consistent with the actual position. Then he pushed the HW connected to port 0 and found it was securely connected. Now he suspected that the HW to port 2 of the BIE was disconnected. He interchanged the HW to port 0 with that to port 2 and found that all RSLs were interrupted. At the remote maintenance console, the engineer noted that the channels were in the normal state. It could be seen that the HW itself was all right and that the problem lay only in the HW connection. The engineer reconnected the HW and found that all RSLs were established and that all TRX channels were normal. Finally, the engineer performed a dialing test and everything proved to be normal.

4.4.3 Examples of E1 Transmission Trouble
I. Example 1
A site KCLK started before Jan. 15th and operated normally. After Jan. 24th, its OML was often interrupted and the indicator (corresponding to the E1 cable) at the BSC flashed. The engineer performed an on-site inspection.
1) The equipment room was located at the top of a 300-m-high hill. The microwave transmission equipment room was 20 m away.
2) On the site, the engineer found the following:
a) The TMU E1 was grounded, as seen from the DIP switch.
b) The cabinet-top E1 cable connector case was insulated from the cabinet enclosure. The working grounding cable of the rack was connected with that of the equipment room.
c) The DDF, an all-metal frame, was connected to the grounding cable of the equipment room. The case of the E1 cable connector touched the metal of the DDF.
d) No lightning arrester was configured for the E1 cable.
e) The indicator of the E1 cable flashed at a frequency of 20 Hz to 30 Hz.
3) Troubleshooting procedures:
a) The engineer self-looped the BTS at the rack top and found the indicator of the E1 cable was OFF. Then he restored the BTS.
b) The engineer self-looped the BTS on the DDF and found the indicator of the E1 cable was OFF. Then he restored the BTS.
c) The engineer self-looped the BSC on the DDF and found the indicator corresponding to the E1 cable at the BSC was OFF. Then he restored the BSC.
d) The engineer powered the TMU off and then on again and found the trouble still existed. Then he restored the TMU.
e) The engineer removed the E1 cable connector from the DDF and found the trouble still existed. Then he restored the E1 cable connector.
f) The engineer disconnected the E1 cable that ran from the transmission system from the DDF. Then he tested the voltage difference between the grounding cable of the E1 cable and that of the DDF and found that the Tx & Rx differences both ranged between 0.001 V and 0.004 V. Then he restored the E1 cable.
g) The engineer disconnected the E1 cable that ran from the BTS from the DDF. Then he tested the voltage difference between the grounding cable of the E1 cable and that of the DDF and found that the Tx & Rx differences were both 0.003 V. Then he restored the E1 cable.
h) The engineer disconnected the E1 cable from the rack top, powered off the rack and removed the TMU. He tested the resistance between the cabinet-top E1 cable connector case and the grounding cable of the rack and found they were insulated from each other. Then he restored the E1 cable.
i) The engineer switched the TMU DIP switch that corresponded to the grounding cable of the E1 cable to OFF and found the trouble still existed. Then he restored the DIP switch.
j) The engineer self-looped the BSC at the DDF and tested the transmission bit errors of the BSC. Within twenty minutes, he found no bit error. Then he restored the BSC.
k) The engineer removed the E1 cable connector from the DDF and switched the TMU DIP switch that corresponded to the grounding cable of the E1 cable to OFF.
Then he found the trouble disappeared. The BTS reset and operated normally.
l) The engineer replaced the TMU (with the E1 cable ungrounded). He let the E1 cable connector case touch the DDF and found that the TMU indicator corresponding to the E1 cable flashed at a frequency of 20 Hz to 30 Hz.
m) The engineer restored the original TMU and removed the E1 cable connector from the DDF. Then he cleaned up the equipment room and left.
4) A lightning arrester should be added.
II. Example 2
1) Description
According to the customer-service personnel and the office operator, a BTS X shut down at 21:00 on Oct. 16th. After the BIE and OMU of the BTS20 were replaced, the trouble still existed. The engineer suspected that the problem lay in transmission because the E1 cable between the DDF of the multi-module office and the DDF box of the BTS was 130 m long. Although this was theoretically acceptable, such a long E1 cable was rarely put into practical use. After switching the optical path over and replacing the 130-m-long E1 cable, the engineer found that the situation improved. By day the BTS was interrupted intermittently and recovered automatically. However, during 19:00-22:00 the BTS was completely interrupted and could not recover automatically until that period was past. At the alarm console, the engineer found that many PCM out-of-synchronization alarms were generated in the BIE at the BSC, that the LAPD link trouble alarm, the local BIE link out-of-synchronization alarm_1 and the LAPD_OML trouble alarm were generated at the BTS, and that all alarms were cleared within 1 minute. In the BTS20, during the trouble period the indicator ALM on the BIE was ON, that on the FPU flashed, and the indicator LIU on the BIE also flashed.
2) Analysis
The BTS X had a configuration of S6/6/6. That is, it had 18 TRXs. Figure 4-4 shows the specific connection between the BTS and the BSC.
[Figure labels: BSC room, BSC, BSC DDF, DDF, host exchange, optical transceiver, optical fibre, end office, BTS DDF, BTS, BTS room; reference points 1 to 8; links: E1 and optical fibre]
Figure 4-4 Connection between the BTS and BSC
In the figure, the optical fiber section was carried by the SDH that China Telecom leased. The E1 cable between 6 and 7 in the figure was rather long (130 m). Besides, it was laid in the open at the building top. In order to locate the trouble, the engineer performed tests over the following transmission sections.
• 3 → 7 Test result: The BER was about 1E-4. There were some alarm indication signals (AIS) and "FRM LOSS" alarms. Sometimes severe interference was received.
• 3 → 6 Test result: The signal level was -1 dB. The frequency offset was 1 Hz. There was no bit error.
• 6 → 7 Test result: The signal level was -4 dB. The frequency offset was 4 Hz. There was no bit error.
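The pattern in these results (each sub-section is clean, but the combined section shows bit errors) can be checked mechanically when many sections have to be compared. The sketch below is a hypothetical illustration: the class `SectionTest` and the function `clean_parts_but_bad_whole` are not part of any test instrument, and only the measured values come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SectionTest:
    start: int                      # reference point where the section begins
    end: int                        # reference point where the section ends
    bit_errors: bool                # True if bit errors were observed
    level_db: Optional[float] = None
    freq_offset_hz: Optional[float] = None


# Results taken from the three tests listed above.
results = [
    SectionTest(3, 7, bit_errors=True),
    SectionTest(3, 6, bit_errors=False, level_db=-1.0, freq_offset_hz=1.0),
    SectionTest(6, 7, bit_errors=False, level_db=-4.0, freq_offset_hz=4.0),
]


def clean_parts_but_bad_whole(whole, parts):
    """True when the parts tile the whole path, every part is clean,
    and the whole path still shows bit errors."""
    ordered = sorted(parts, key=lambda s: s.start)
    contiguous = (
        ordered[0].start == whole.start
        and ordered[-1].end == whole.end
        and all(a.end == b.start for a, b in zip(ordered, ordered[1:]))
    )
    return contiguous and whole.bit_errors and not any(p.bit_errors for p in ordered)


print(clean_parts_but_bad_whole(results[0], results[1:]))  # True
```

A result of True suggests that no single section is faulty on its own, so the cause is more likely to lie in how the sections interact once they are joined.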
From the above test results, it could be seen that the transmission sections 3 → 6 and 6 → 7 did not cooperate very well. The only significant difference between the two corresponding test results lay in the signal level. The engineer suspected that the signal became too weak to be detected after it had run through the two transmission sections. After consulting a Huawei transmission engineer, the engineer learned that the signal level result was acceptable. Generally, the signal level is acceptable as long as it is no lower than -20 dB. That difference was only a sensitivity problem. Therefore, the engineer concentrated on the interference problem.
Since the E1 cable between 6 and 7 in the figure was rather long and was laid in the open at the building top, the engineer suspected that the E1 cable was subject to severe interference. Together with the office engineer, the engineer went to the building top and checked the cabling conditions. Besides the E1 cable, some telephone lines were also found. An external interference source cannot affect the E1 cable unless its interference can penetrate the E1 shielding layer, so only a cable with relatively strong signals (such as a power cable) can interfere with the E1 cable. This indicated that the telephone lines could not cause direct interference.
The grounding cable might also cause interference; therefore, the engineer tested the grounding situation and found that the shielding layer of the E1 cable was grounded at 6. Then the engineer disconnected the E1 cable from the DDF box at the BTS and tested the BER using a BER tester. Now he found that the FAS, AIS and bit errors all disappeared and that only some background noise remained. This indicated that the grounding cable was at least one of the trouble causes. Note that the E1 shielding layer was grounded at the DDF box via the case of the double-male connector.
However, when he connected the E1 cable to the BTS, bypassing the DDF box, the engineer found that the same trouble still existed, so this measure had not addressed the real problem. It occurred to the engineer that the E1 shielding layer was also grounded at the BTS rack and at the BTS20 BIE. From the schematic diagrams of the BIE and of the baseband frame backplane of the BTS20, the engineer found that the shielding layer of the E1 Rx end at the E1 port was grounded via the backplane and the BTS rack. The grounding of the shielding layer of the E1 Tx end was controlled via the DIP switch S4 on the BTS20 BIE.
Note: S4.1~S4.4 respectively correspond to E1 ports 1~4.
The engineer disconnected the shielding layer of the E1 Rx end from the grounding cable. Then the trouble disappeared.
3) Conclusion
The main cause of the interference was that the grounding of the E1 port at the BTS violated the single-point-grounding requirement specified for E1 transmission. The shielding layer of the E1 Rx end was grounded via the BTS rack. At the same time, the Tx end of the optical transceiver was also grounded. So a grounding loop was formed. With the voltage induced in the loop, the noise source caused too much interference, which resulted in a higher transmission BER, FAS alarms and AIS remote signaling alarms.
III. Example 3
In one BTS X, where an HDSL device was used for transmission, the link was sometimes interrupted and the BTS link could not be established, while the transmission device generated no alarm. Under these circumstances, the engineer performed the following location operations.
1) He observed the alarm indicators of the BTS and found that some alarms were given at the remote end while no alarm was generated in the HDSL.
2) He disconnected the E1 cable. Then he connected the E1 port of the HDSL to a BER tester.
Note: the tester was set to test the time slot 31.
3) He performed a 1-minute BER test. As a result, the UAS was 100% and the BER was 1.25 0E 1 (far too high). Under such BER conditions, a link cannot be established. Besides, the bit errors persisted.
4) He reset the HDSL device and then tested the BER again. Now the BER became 0.
Throughout the above procedures, the engineer was accompanied by the office engineer. The office engineer admitted that the HDSL device was at fault.
In order to confirm that the HDSL resetting had errors, the engineer performed tests over the BTS X again that evening.
1) At 23:25, the engineer prepared the connections. He then activated the test software. The trouble reappeared. There should be SABME, ID_CRQ and ID_ASS frames on the downlink and an ID_REQ frame on the uplink. However, via the MMI, the downlink frames could not be seen. This indicated that the BTS had not received the downlink frames though the frames could be detected by the K1205.
5) The engineer disconnected the E1 cable and connected the BER tester to the E1 port. He then asked the office engineer to perform a self-loop at the BSC.
6) As the tester displayed, the BER reached 1.25 0E 1. However, the bit errors disappeared within several minutes.
7) Now the engineer began to test the HDSL resetting with the following procedure:
a) He powered the HDSL off and then on.
b) After the HDSL initialization finished, he performed the BER test.
c) He then repeated the above steps a) and b).
In total, the engineer performed 30 tests over the HDSL resetting. Failure was found in five of those tests. Upon every failure, the bit errors persisted in the HDSL until the engineer reconfigured the BER tester parameters, e.g. the measured time slot No. Such reconfiguration may reinitialize the E1 port chip of the BER tester so that the HDSL can recover. The BER of the time slot 31 was measured to be 1.25 0E 1. From the above test results, it could be seen that the HDSL resetting had errors.
When the HDSL was powered off and then on, bit errors might occur and persist. The trouble probability was about 1/6. In case of failure, the bit errors did not disappear until the engineer powered the HDSL off and on again, or reconfigured the signal source, or disconnected and reconnected the E1 cable.
8) Possible causes
a) The HDSL itself had problems in Rx processing. It could not accurately parse the signal in each E1 frame.
b) The HDSL had problems in transmitting.
c) The HDSL processing was not strictly in accordance with the standard protocol. For example, according to the tester, the Si4~Si8 in the NFAS of the E1 signal transmitted from the HDSL was substandard. This might cause difficulty in connecting the HDSL with other devices. It was to be confirmed.
They stopped the remote loop-back over the BSC to reproduce the trouble.
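The trouble probability of about 1/6 quoted above follows directly from the reset tests (5 failures in 30 tests). The short Python sketch below reproduces that arithmetic. The second function is an extrapolation that is not in the original text: it assumes, purely for illustration, that each reset fails independently with the same probability.

```python
from fractions import Fraction

tests, failures = 30, 5           # counts reported for the HDSL reset tests
rate = Fraction(failures, tests)  # reduces to 1/6

print(rate, f"{float(rate):.1%}")  # 1/6 16.7%


def p_at_least_one_failure(resets, p=float(rate)):
    """Chance of at least one failed reset in `resets` attempts,
    ASSUMING independent resets with the same failure probability."""
    return 1.0 - (1.0 - p) ** resets


print(round(p_at_least_one_failure(10), 3))
```

Under that assumption, a device that is reset repeatedly in the field would be quite likely to show the trouble at some point, which is consistent with the intermittent behavior described in this example.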
Via the K1205, it was detected that there were SABME, ID_CRQ and ID_ASS frames on the downlink and an ID_REQ frame on the uplink. However, via the MMI, the downlink frames could not be seen. This indicated that the BTS had not received the downlink frames though the frames could be detected by the K1205. The engineer activated the software several times. However, the trouble was not removed.
Then the engineer performed a board reset. The trouble still existed, but it was a little different from the previous one. The downlink SABME could not be detected by the K1205 but was seen at the MMI. In other words, this time it was the uplink that was unavailable. From the two trouble descriptions, it could be seen that something was really wrong with the HDSL signal. The reason why the K1205 and the BTS did not both detect the same uplink or downlink frames might be that the K1205 and the BTS had different interface chips and responded differently to the threshold-level signal transmitted from the HDSL. Sometimes the K1205 could detect the signal while the BTS could not, and vice versa. Besides, the uplink and downlink were symmetrical. Either might become unavailable. The key to solving this problem was to remove the transmission trouble completely.
Then the engineer performed another board reset. Now the trouble disappeared. This indicated that board resetting and software activation were useful after all. Through hardware analysis, it was found that the BTS switched the clock source at the E1 port to another clock during the board reset. Meanwhile, the transmission device had to synchronize its clock with the new clock source and was interrupted for a certain time. This was the reason why the previous trouble disappeared.

4.4.4 Examples of Optical Transmission Board Trouble in Microcell
Example 1
In one mini BTS X, the optical transmission mode was used. When it was powered on, its indicator MOD kept ON.
The engineer checked the transmission mode settings and the DIP switches on site and found that all DIP switches were set to ON. When he switched the LSB to OFF, the BTS indicator displayed normally.
Example 2
There was a cascaded BTS X. Its lower cascaded BTS had started normally; however, it always failed to shake hands with the BSC. The engineer connected to the BTS with the NM software and checked it. He found that the traffic configuration and the logical configuration were both normal. However, "IU3 TU_AIS" alarms were found on the ASU of the local BTS. The engineer replaced the BTS but kept the original optical transmission. The BTS then operated normally.
Example 3
In one BTS X, the self-loop over BSC was normal and that over BTS was also normal. However, the BSC and BTS handshake failed after they were interconnected. The engineer attempted to self-loop the optical transmission device and the BTS on site, but he failed. He suspected that the problem lay in the pigtail fiber built into the BTS. On checking the BTS, the engineer found that the attenuation of the Rx pigtail fiber at the west optical interface of the BTS was about 20 dB higher than the normal value.
Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 5 Troubleshooting for Clock
Chapter 5 Troubleshooting for Clock
5.1 Overview
5.1.1 Fault Description and Fault Causes in BSC Clock System
Table 5-1 Fault description and fault causes in BSC clock system
Fault cause: Reference source fault
Fault description: The alarms "DDS control data abnormal" and "Clock crystal oscillator override" are reported on the BSC alarm console. The GCKS MOD indicator flashes slowly and the clocks of most BTSs are out of lock. The alarms "Clock source changed" and "Clock reference source abnormal/normal" (alternating) are displayed at the BSC alarm console. The MOD indicator or the F0/REFA indicator on the GCKS is constantly ON. After the two cables that introduce the two clock sources into the clock frame are pulled out, the 2MHz clock output from the clock frame turns normal and most of the BTSs become locked.
Fault cause: Board fault in clock frame
Fault description: On the maintenance console, GCKS is displayed red, or the GCKS indicator is displayed abnormal, but it turns normal after GCKS replacement. Clock alarms are displayed at the alarm console, but are removed after GCKS replacement. After the two cables that introduce the two clock sources into the clock frame are pulled out, the CKB output clocks (including 2MHz, 4MHz, 8kHz and 32MHz) test abnormal and cannot recover even after GCKS replacement.
Fault cause: Connection cable fault in clock frame
Fault description: On the maintenance console, GCKS is displayed red and the CLK indicator on GCKS flashes slowly. The fault cannot be removed even after GCKS replacement. On the maintenance console, a GCTN/GSNT clock fault is reported, but the CKB output clocks (including 2MHz, 4MHz, 8kHz and 32MHz) test normal with the frequency meter. The "Clock reference source fault" alarm is reported on the BSC alarm console, and the F0/REFA indicator on the GCKS is ON.
Fault cause: Data configuration fault
Fault description: On the maintenance console, GCKS is displayed red and the CLK indicator on the GCKS flashes slowly. The fault cannot be removed after GCKS replacement.
5.1.2 Fault Description and Fault Causes in BTS Clock System
I. Fault description and fault causes in BTS3X clock system
Table 5-2 Fault description and fault causes in BTS3X clock system
Fault cause: Reference source problem
Fault description: The 13MHz clock out-of-lock alarm is reported. The BTS3X BSIC cannot be decoded. The cell handover success rate is very low. The deviation of the 2MHz output of the MMU tested at the local end is large.
Fault cause: Connection cable fault
Fault description: Clock-related alarms and test phase-locked loop alarms occur on all TRXs in the slots, accompanied by repeated TRX loading, TRX communication alarms, alarm removal reports and so on. However, no alarms occur on the TMU, the TDU indicators work normally and the DIP switches are set correctly.
Fault cause: Onsite operation error or BTS3X clock aging
Fault description: The 13MHz clock out-of-lock alarm occurs. Call drops occur frequently. The handover success rate drops substantially. The 2MHz clock output of the TMU tested at the local end is normal.
Fault cause: Board fault (TMU, TDU and TRX)
Fault description: The TMU mailbox fault alarm occurs. A TMU clock fault occurs. Clock-related alarms and repeated resets occur on all TRXs. The connection cables and the TMU work normally and the DIP switches are correctly set, but clock-related alarms and repeated resets occur on all TRXs. Clock-related alarms occur on some TRXs.
II. Fault description and fault causes in BTS3001C clock system
Table 5-3 Fault description and fault causes in BTS3001C clock system
Fault cause: Reference source fault
Fault description: The 13MHz clock out-of-lock alarm occurs. The BTS3001C BSIC cannot be unlocked. The cell handover success rate is very low. The deviation of the 2MHz output of the MMU tested at the local end is large. No other alarms occur on the BTS3001C.
Fault cause: Board fault
Fault description: The Frame No. alarm, TS No. alarm, Rx/Tx phase-locked loop alarm and other clock-related alarms occur on the BTS3001C. The MFU and MCI are repeatedly reset. No 13MHz output clock can be measured, or its deviation is large.
Fault cause: Version fallback
Fault description: The 13MHz clock out-of-lock alarm occurs. The BTS3001C D/A value deviation is large, and the D/A value displayed is 0. The handover success rate drops substantially. The MMU operating version is 03.0301A, but the operating version of the MFU and MCI is 05.0301A or higher.
Fault cause: Onsite operation error or clock aging
Fault description: The 13MHz clock out-of-lock alarm occurs. The BTS3001C cell handover success rate drops sharply. No other alarms occur on the BTS3001C. When the BTS3001C clock is set to free-run mode, the offset of the 13MHz output clock is large.
5.2 Fundamental Knowledge
5.2.1 Introduction to BSC
I. Introduction to BSC clock signal flow
[Figure 5-1 block labels: MSC, TCSM, E3M, GCKS, GCTN, GSNT, GFBI, GOPT, GNET, GNOD/GLAP/GMC2, GMCC, CKV, BIE; signal labels: 8kHz, 2MHz, 32MHz]
Figure 5-1 Multi-module BSC clock signal flow
As shown in Figure 5-1, the path for multi-module BSC clock signals is as follows.
The clock is extracted from the MSC and transmitted via TCSM and E3M to GCKS as the 8kHz reference source. GCKS traces the external reference signals and filters out their jitter and drift to provide the high-stability clock signals (32MHz, 8kHz and 4MHz) that the BSC requires.
The clocks output from GCKS are first provided to the GCTN (32MHz, 4MHz and 8kHz) and the GSNT (32MHz and 8kHz) in AM. They are then driven by the GCTN, which outputs the clock signals (32MHz, 8kHz and 4MHz) to GFBI.
Clock signals are transmitted to BM through the optical paths. GOPT extracts the clock signals (2MHz and 8kHz) and transmits them to the GNET in BM. The clock signals of the boards (GNOD, GLAP/LPN7 and GMC2) in the main control frame in BM are provided by GNET.
The clock signals extracted by GCTN are first driven by CKV, and then transmitted to the boards in the BTS interface frame through external HW clock cables.
II. Introduction to the input reference source signals in BSC clock frame
There are three channels of input reference source in the BSC clock frame.
1) 8kHz differential signal source from E3M (including 8kHz0 and 8kHz1)
8kHz0 and 8kHz1 are two sets of 8kHz reference source, the ones most commonly used by the BSC. They can be input to GCKS through JC1 or JC2 over differential cables (twisted pairs). The cable connection is illustrated in Figure 5-2.
The 8kHz differential signal sources from the trunk system are input through the 4-core sockets JC1 and JC2. Note that JC1 and JC2 are connected in parallel, i.e., if the two channels of 8kHz differential signal sources from the trunk frame arrive on one 4-core connector, the differential cables can be connected to either JC1 or JC2.
[Figure 5-2 labels: 8k0-, 8k0+, 8k1-, 8k1+, JC1, JC2]
Figure 5-2 The cable connection when the 8kHz differential reference sources come from the same trunk frame
If the two channels of 8kHz differential signal sources come from different trunk frames, the cable connection must be handled with care. Only two differential cables can be connected to either 4-core socket, as shown in Figure 5-3.
[Figure 5-3 labels: 8k0-, 8k0+, 8k1-, 8k1+, JC1, JC2]
Figure 5-3 Cable connection when the 8kHz differential reference sources come from different trunk frames
2) The 2.048Mbit/s input reference source (including ER1 and ER2) from BITS equipment. The E1 signals and HDB3 codes are converted on GCKS to 2.048MHz clocks.
3) The 2.048MHz TTL clock signal input reference source (including 2MR1 and 2MR2) from BITS equipment
III. Introduction to the output signals in BSC clock frame
1) To GSNT: 8kHz, 32MHz
2) To GCTN: 8kHz, 32MHz and 4MHz
Note: The 8kHz and 32MHz clock signals are output simultaneously to the backplane of GSNT and that of GCTN, with an output matching resistance of 75 Ω (150 Ω for each board). The 4MHz signals are output only to the GCTN, with an output matching resistance of 75 Ω. Therefore, set the frequency meter to low resistance matching when testing the 8kHz, 32MHz and 4MHz clock signals. Otherwise, the test result might be inaccurate.
IV. Introduction to BSC clock hardware configuration
1) Introduction to GCKS
GCKS is the board that generates the reference clock source in the clock system of the switch. The board contains a stratum-2 oven-controlled crystal oscillator (OCXO). GCKS traces the external reference signals and filters out their jitter and drift to ensure high frequency accuracy and stability of the output timing signals and to provide the switch with a high-quality clock source. The GCKS can provide both stratum-2 and stratum-3 clocks.
GCKS works in four modes. When GCKS is first powered on, its OCXO is heated for 3 minutes. During these 3 minutes, the MOD indicator is OFF. Then the board starts to work, its OCXO enters free-run state and the MOD indicator remains OFF.
If there is no reference source, the OCXO stays in free-run state. If there is a reference source, GCKS first checks its frequency deviation. If the deviation is too large, the REFA indicator turns ON. If the deviation is normal, the OCXO enters fast pull-in state and the MOD indicator flashes quickly. Fast pull-in usually lasts 10~15 minutes. After the 10~15 minutes, if the reference source is abnormal, the OCXO enters free-run state. If the reference source is normal, the OCXO enters locked state, the MOD indicator flashes slowly and GCKS starts to work.
If the reference source becomes abnormal during the locked state, the OCXO enters holdover state and the MOD indicator is ON. When the reference source returns to normal, the OCXO re-enters locked state.
2) Important DIP switches
(a) Configuration of GMPU and GCKS communication rate
There are DIP switches on both GCKS and CKB for configuring the communication between GCKS and GMPU. The communication between GMPU and GCKS is normal only when the DIP switches on both GCKS and CKB are correctly set. Otherwise, a GCKS fault occurs.
Table 5-4 The DIP switches for configuring the communication rate between GCKS and CKB
BSC type        GCKS S5.1   CKB S4.1   CKB S4.2   CKB S4.3   CKB S4.4   Communication mode
Multi-module    OFF         ON         ON         ON         ON         Point to multi-point, high baud rate
Single-module   ON          OFF        OFF        OFF        OFF        Point to point, low baud rate
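The settings in Table 5-4 can be checked mechanically when switch positions are recorded during a site visit. The following is only a minimal sketch: the dictionary keys, the function name `dip_mismatches` and the idea of recording the observed positions as a dictionary are illustrative assumptions, not part of any maintenance tool described in this manual.

```python
# Expected DIP switch positions, transcribed from Table 5-4.
# The key names are hypothetical labels chosen for this sketch.
EXPECTED_DIP = {
    "multi-module": {
        "GCKS S5.1": "OFF",
        "CKB S4.1": "ON", "CKB S4.2": "ON", "CKB S4.3": "ON", "CKB S4.4": "ON",
    },
    "single-module": {
        "GCKS S5.1": "ON",
        "CKB S4.1": "OFF", "CKB S4.2": "OFF", "CKB S4.3": "OFF", "CKB S4.4": "OFF",
    },
}


def dip_mismatches(bsc_type, observed):
    """Return {switch: (expected, observed)} for every switch that differs."""
    expected = EXPECTED_DIP[bsc_type]
    return {
        name: (want, observed.get(name))
        for name, want in expected.items()
        if observed.get(name) != want
    }


# Example: a multi-module BSC where CKB S4.3 was left OFF by mistake.
site_record = dict(EXPECTED_DIP["multi-module"], **{"CKB S4.3": "OFF"})
print(dip_mismatches("multi-module", site_record))
```

An empty result means the recorded positions match the table; any entry in the result names a switch to correct before suspecting the GCKS board itself.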
(b) Other DIP switches on GCKS
There are two other DIP switches on GCKS, which configure the clock stratum and the phase lock mode of GCKS. Usually they are set as follows.
DIP switch   Common setting in BSC   Description
S5.2         ON                      ON: PDH phase lock mode; OFF: SDH phase lock mode
S5.3         ON                      ON: stratum-3 clock; OFF: stratum-2 clock
3) Indicators on GCKS
Indicator: RUN
Color and normal state: Red, flashing once every second when the communication is normal.
Meaning and description: The RUN indicator flashes once every second when the communication is normal and 4 times every second when the communication is abnormal. Normal communication depends on the correct settings of the DIP switches. GCKS communicates with SLT or ALM at the low baud rate when there is an SLT in the clock frame or there is a clock frame in the independent C&C08B standalone exchange. If the RUN indicator flashes quickly after GCKS is powered on, first check the DIP switches on the GCKS.
Indicator: ACT
Color and normal state: Green
Meaning and description: ON when the GCKS is active and OFF when the GCKS is standby.
Indicator: F0
Color and normal state: Green, OFF when the reference source is normal
Meaning and description: Indicator for the status of the selected external reference source. ON when the reference source is abnormal, OFF when the reference source is normal, flashing when the software is re-capturing the reference source, and OFF again when the reference source is locked. If F0 is ON after GCKS is powered on, check that the reference source is correctly connected to the GCKS. Six types of reference source can be connected. Seen from the back of the backplane, J1 and J2 are the input terminals for the 2048kbit/s reference source, and J3 and J4 are the input terminals for the 2048kHz reference source. In locked state (the MOD indicator flashes slowly; refer to the description below), a flashing F0 indicator means that GCKS is re-capturing the reference source; F0 turns OFF again after the reference source is locked. If F0 flashes frequently, check whether SDH mode is required. For F0 flashing in other modes, check the quality of the connected reference source.
Indicator: F1
Color and normal state: Green, OFF when the board works normally
Meaning and description: OFF when the system works normally (including VCXO, DDS and 88915). If F1 on the GCKS is ON, a GCKS fault has occurred (including VCXO, DDS and 88915 faults).
Normally GCKS is switched over automatically (if it is standby, no switchover is performed). Maintenance engineers shall replace the faulty board promptly according to the status of the other indicators.
Indicator: LOF
Color and normal state: Green, OFF when the board works normally
Meaning and description: ON when the 88915 is out of lock. In this case, the board is automatically switched over (if it is standby, no switchover is performed). Because the 88915 is out of lock, the clock signals output from the board are abnormal in frequency and amplitude, so the maintenance engineers shall replace the board promptly.
Indicator: DDS
Color and normal state: Green, OFF when DDS output is normal
Meaning and description: OFF when DDS output is normal, ON when the DDS components on GCKS work abnormally. When DDS is ON, the GCKS is switched over automatically (if the board is standby, no switchover occurs). Abnormal DDS components make the clock signals output by the board abnormal in frequency and amplitude. Abnormal DDS output is most probably caused by an inaccurate reference source. Therefore, first check the reference source. If the reference source is normal, replace the board promptly.
Indicator: LOCK
Color and normal state: Green, OFF when 88915 is normal
Meaning and description: ON when the 88915 is out of lock. The meaning of the LOCK indicator is exactly the same as that of the LOF indicator.
Indicator: CLK
Color and normal state: Green, OFF when 58274 is normal
Meaning and description: OFF when the time chip 58274 has calibrated the time and works normally. Otherwise, it flashes. The time chip 58274 in the C841CKS provides the time to the switch, but the initial time must be configured on the BAM. If no initial time has been configured at the BAM, or the 58274 chip works abnormally, the CLK indicator flashes once every second.
Indicator: R8K0
Color and normal state: Green, ON when R8K0 is selected
Meaning and description: The 8kHz reference source from the trunk. There are altogether 6 reference source indicators on C841CKS. When an external reference source is selected, its corresponding indicator is ON. Otherwise, the corresponding indicator is OFF. R8K0 is the default reference source with the highest priority (unless another reference source has been configured with a higher priority than R8K0 on the BAM). As long as R8K0 is available, it is used as the default reference source. If it is unavailable, R8K1 is selected as the default reference source. By analogy, each remaining reference source is selected as the default only when all the reference sources before it in numbering are unavailable. If the active reference source is not available after GCKS is powered on, the F0 indicator is ON and the indicator for the active reference source is also ON (R8K0 by default). If stratum-3 clock is configured for GCKS (by setting SW3.5 to ON or configuring it on the GMPU), GCKS can capture a suitable reference source as the active one. Its stratum-2 clock is configured by the GMPU. The state of the active reference source (normal or abnormal) is indicated by F0.
Indicator: REFA
Color and normal state: Green, OFF when the clock deviation is normal
Meaning and description: ON when the clock deviation is too large and OFF when the clock deviation is normal. REFA indicates the clock deviation when the clock is in holdover or free-run state, and it indicates the output of clocks in fast pull-in or locked state.
REFA is ON when the output frequency deviation exceeds 0.4ppm (GCKS configured as stratum-2 clock) or 4.6ppm (GCKS configured as stratum-3 clock). The frequency deviation check on the reference source is performed only when the GCKS is in locked state. If the deviation is found to be too large, GCKS enters holdover state, and it returns to locked state when the reference source returns to normal.
Indicator: MOD
Color and normal state: Green
Meaning and description:
OFF: free run (just powered on, or the reference source is abnormal after fast pull-in)
Flash quickly: fast pull-in (quickly locking the reference source)
Flash slowly: locked (normal)
ON: holdover (the reference source is abnormal after being locked)
Indicator: +12V
Color and normal state: Green, ON when the 12V power supply is normal
Meaning and description: 12V power supply indicator. ON when the supply is normal. +12V is the power supplied to the OCXO on GCKS. If the 12V power supply is abnormal, there is no output from the OCXO and the GCKS cannot work normally.
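The four GCKS work modes and the MOD indicator patterns described above can be summarized as a small state table. This is a simplified sketch for reading the indicator, assuming one evaluation of the reference source per step and ignoring the 3-minute warm-up and the 10~15 minute pull-in time; the names `MOD_PATTERN` and `next_state` are illustrative and do not correspond to any GCKS software interface.

```python
# MOD indicator pattern for each GCKS work mode, as listed in the indicator table.
MOD_PATTERN = {
    "free-run": "OFF",
    "fast pull-in": "flash quickly",
    "locked": "flash slowly",
    "holdover": "ON",
}


def next_state(state, reference_ok):
    """Return the next work mode given whether the reference source is normal."""
    if state == "free-run":
        # A usable reference starts fast pull-in; otherwise stay in free run.
        return "fast pull-in" if reference_ok else "free-run"
    if state == "fast pull-in":
        # After pull-in, a normal reference is locked; an abnormal one falls back.
        return "locked" if reference_ok else "free-run"
    if state in ("locked", "holdover"):
        # Losing the reference after lock means holdover; recovery re-locks.
        return "locked" if reference_ok else "holdover"
    raise ValueError(f"unknown state: {state}")


# Walk through power-on, lock, reference loss and recovery.
state = "free-run"
for ok in (True, True, False, True):
    state = next_state(state, ok)
    print(state, "->", MOD_PATTERN[state])
```

Reading the table in reverse is the practical use: a constantly ON MOD indicator points to a reference source that was lost after lock, whereas an OFF indicator long after power-on points to a reference that was never accepted.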
V. Introduction to BSC clock data configuration
Table 5-5 Tables required to be configured for BSC clocks
Table name                          Function
[Clock description table]           Describes the clock source of the GNET
[AM_CKS clock configuration table]  Describes the work mode selection of the clock module
For a multi-module BSC, both tables listed in Table 5-5 are required for BSC clock data configuration.
[Clock description table] configures the clock source of GNET. For a multi-module BSC, select hardware selection (module B/C2K network). For a C&C08B standalone exchange BSC, select clock frame clock (B independent office).
[AM_CKS clock configuration table] configures the stratum and the reference sources of the clock.
[Select clock reference source] There are 3 categories and 6 types of clock reference source, i.e., 8kHz differential signals 8K0 and 8K1, 2Mbit/s E1 signals 2MB0 and 2MB1, and 2MHz TTL signals 2M0 and 2M1. Usually the 8kHz differential signals are used. Generally all are selected.
[Configure the priority of the reference source] Usually the priority for reference source selection, from higher to lower, is 8K0, 8K1, 2MB0, 2MB1, 2M0 and 2M1. Generally all are selected.
[Select the active clock after power on] Select the GCKS to be used first after the clock frame is powered on. Usually GCKS0 is selected as the active board.
[Set clock stratum] This parameter exists because the GCKS supports both stratum-2 and stratum-3 clocks. Usually stratum-2 is selected for MSC and HLR, and stratum-3 for BSC.
[Set clock phase lock mode] Select the PDH mode.
[Select clock module work mode] Usually automatic adjustment mode is selected. Controlled mode is selected only for commissioning and for the clock network access acceptance test.
VI. Troubleshooting for BSC clocks
1) Check indicators on GCKS
Pay special attention to the state of the MOD and REFA indicators during the normal operation of GCKS. If the MOD indicator flashes slowly, GCKS is tracing the MSC clocks normally. If the MOD indicator is OFF, the GCKS is working in free-run mode (usually free-run mode is entered only when the frequency deviation of the clock reference source obtained from the MSC is too large). If the MOD indicator is flashing quickly or is constantly ON, the GCKS is in fast pull-in or holdover state and is not tracing the reference source normally.
In addition, if the REFA indicator is ON, the frequency deviation of the GCKS output clock is too large.
Check the status of the F0 indicator. If the F0 indicator is ON, the reference source is abnormal. If the F0 indicator is OFF, the reference source is locked. If the F0 indicator is flashing, the software is tracing the reference source.
When the CLK indicator is ON, there is a communication fault between GCKS and GMPU.
2) Query on the maintenance console
(a) Query whether GCKS is faulty or abnormal on the maintenance console.
(b) Query the state of the clock source by selecting the menu [Control options/ /GCKS Board Control/Query Clock Source State].
Note that the frequency deviation queried on the maintenance console is not very accurate. Therefore, the average of several frequency deviation readings queried on the maintenance console is usually taken for reference.
(c) Query the work state of GCKS by using the right-key menu on the maintenance console.
3) Query clock-related alarms on the alarm console
Query whether there are clock-related alarms on the alarm console, such as Clock reference source abnormal, Clock configuration data error, BTS 8kHz clock alarm, 13MHz clock out of lock, DDS control data abnormal, Clock reference source switchover and GCTN clock fault alarm.
4) Check related components with meters
The most commonly used test instrument for clocks is the frequency meter. The commonly used frequency meter is the HP53132A, which measures frequency accuracy. An oscilloscope is used to check whether the waveforms of the clock signals are deformed and whether the phases are normal. Clock tests can be performed at the MSC side, the BSC side and the BTS side. Refer to the BSC clock test points below for the clock test accuracy requirements. Usually the clock test follows the process below.
5) First test whether the clocks output from the BSC clock frame are accurate.
6) If the BSC clocks are accurate, test the clocks at the BTS side.
7) If the BSC clocks are not accurate, test the clock at the MSC side.
8) If the clock at the MSC side is also inaccurate, test the clock reference source of the MSC.
9) If the MSC clocks are accurate, test whether there are faults on the clock transmission path between MSC and BSC.
VII. BSC clock test point
1) Output 2MHz clock in GCKS
- Test instrument: frequency meter
- Test point: the 2MH-OUT test point on the backplane of the clock frame, J16 in Figure 5-4.
This is the most commonly used clock test point and it exists in the clock frame of both MSC and BSC.
- Grounding point: the coaxial shield layer of the clock frame can be used as the grounding point.
- Test requirement: Stratum-2 clock is adopted at the MSC side and its clock accuracy (Δf/f) should be within 1E-8. Stratum-3 clock is adopted at the BSC/BTS side and its clock accuracy (Δf/f) should be within 5E-8.
Therefore, the frequency deviation allowed at the MSC side is 2048000.000 × 1E-8 ≈ 0.02 Hz, and the frequency deviation allowed at the BSC/BTS side is 2048000.000 × 5E-8 ≈ 0.1 Hz.
If the frequency deviation is too large, check whether the reference source, GCKS and CKB are normal. See Figure 5-4 for the 2MHz test point.
[Figure 5-4 labels: 8K0-, 8K0+, 8K1-, 8K1+, 2MH-OUT1, 8K-IN, JC1, JC2, J16]
Figure 5-4 2MHz clock output from the backplane of the clock frame
2) 8kHz reference source accuracy on the CKB of the clock frame
Take the 8kHz clock reference source as the example.
- Test instrument: frequency meter
- Test point: It depends on the reference source selected, as shown in Figure 5-5.
- Grounding point: The GND point in Figure 5-5 is recommended; the coaxial shield layer of the clock frame can also be used as the grounding point.
- Test requirement: The allowed frequency deviation of the 8kHz reference source at the MSC side is 8000.000 × 1E-8 = 8 × 1E-5 Hz. The allowed frequency deviation of the 8kHz reference source at the BSC/BTS side is 8000.000 × 5E-8 = 4 × 1E-4 Hz.
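The allowed deviations quoted for the 2MHz and 8kHz test points all follow from one rule: allowed deviation = nominal frequency × required accuracy. The sketch below reproduces that arithmetic and applies it to a frequency meter reading; the function names and the sample reading are illustrative assumptions, not values from this manual.

```python
def allowed_deviation_hz(nominal_hz, accuracy):
    """Allowed absolute deviation for a clock with the given relative accuracy."""
    return nominal_hz * accuracy


def within_tolerance(measured_hz, nominal_hz, accuracy):
    """True if a frequency meter reading is inside the allowed deviation."""
    return abs(measured_hz - nominal_hz) <= allowed_deviation_hz(nominal_hz, accuracy)


MSC_ACCURACY = 1e-8       # accuracy required at the MSC side
BSC_BTS_ACCURACY = 5e-8   # accuracy required at the BSC/BTS side

for nominal in (2_048_000.0, 8_000.0):
    print(nominal,
          allowed_deviation_hz(nominal, MSC_ACCURACY),
          allowed_deviation_hz(nominal, BSC_BTS_ACCURACY))

# Hypothetical reading at the 2MHz test point: 0.05 Hz above nominal.
reading = 2_048_000.05
print(within_tolerance(reading, 2_048_000.0, BSC_BTS_ACCURACY))  # passes at the BSC/BTS side
print(within_tolerance(reading, 2_048_000.0, MSC_ACCURACY))      # fails at the MSC side
```

The unrounded limits at 2.048 MHz are 0.02048 Hz and 0.1024 Hz, which the text rounds to 0.02 Hz and 0.1 Hz. Limits this small are also why the frequency meter needs a reference more accurate than the clock under test.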
Figure 5-5 Test point for 8kHz clock reference source on CKB
3) Clock test at the back of MSM
- Test instrument: oscilloscope
- Test point: As shown in Figure 5-6, connect the oscilloscope to the upper pins at the back of MSM on TCB to test the 8kHz, 2MHz and 4MHz clocks.
- Grounding point: The GND points in Figure 5-6 are recommended.
- Test requirement: The waveform is normal.
In addition, to test the accuracy of the 8kHz clock reference source, lead the MSM Tx E1 cables to the test TMU and test the 2MHz clocks extracted by the TMU. The accuracy requirement is the same as that for testing the clocks at the back of CKB.
[Figure 5-6 labels: pins in upper socket of MSM backplane; 2M-, 2M+, 4M-, 4M+, 8K-, 8K+; rows 25, 26, 27; columns A, B, C; GND]
Figure 5-6 MSM clock signal test point
4) Clock accuracy at the back of GCTN
- Test instrument: oscilloscope
- Test point: As shown in Figure 5-7, connect the oscilloscope to the upper pins at the back of the backplane of the corresponding GCTN to test the 8kHz, 32MHz and 4MHz clocks (two groups for each).
- Grounding point: The grounding points in Figure 5-7 are recommended.
- Test requirement: The waveform is normal.
[Figure 5-7 labels: pins in upper socket of CTN backplane; 32M1, 32M0, 8k1, 8k0, 4M1, 4M0; rows 45, 46, 47; columns A, B, C; GND]
Figure 5-7 Test points for the clock signals on the backplane of GCTN
5) GSNT clock accuracy
- Test instrument: oscilloscope
- Test point: As shown in Figure 5-8, connect the oscilloscope to the back of the backplane of the corresponding GSNT to test the 8kHz and 32MHz clocks (two groups for each).
- Grounding point: The GND points in Figure 5-8 are recommended.
- Test requirement: The waveform is normal.
[Figure 5-8 labels: SNT backplane; 8k0, 8k1, 32M0, 32M1; rows 4, 34, 35; columns A, B, C; GND]
Figure 5-8 Test points for the clock signals on the backplane of GSNT
5.2.2 Fundamental Knowledge of BTS Clock System
The BTS clock system performs radio link time division and radio frequency calibration in the GSM system. It plays an important role in the normal service of the GSM system.
I. BTS reference clock
When the BTS clock reference source comes from the BSC, the BTS is connected to the BIE of the BSC through the E1 cables of the BIU (BTS Interface Unit). The BIU selects and processes one channel of clocks among the clocks extracted from the E1 interface line, and uses it as the reference clock for the MCK module, the high-precision clock of the BTS. In some cases, the BIU can also provide an input for external reference clocks. It selects one of the two channels of clocks it extracts from the E1 as the reference source clock for the MCK module. The deviation of the clock is required to be no more than 0.05ppm.
II. BTS system clock
The OCXO, frequency divider, frequency verification unit, CPU and D/A in the MCK module form a closed-loop frequency calibration system that automatically traces the reference source input and provides the BTS with a high-performance 13MHz reference clock, as shown in Figure 5-9.
[Figure 5-9 block labels: Reference source, Phase discrimination, CPU, D/A, OCXO, Frequency division, System clock]
Figure 5-9 MCK module phase-locked loop
All clock signals used by the BTS are derived from the 13MHz signal. The MCK module can judge the precision of the external reference signals, automatically pull in and lock the reference clock signals, and filter out their jitter, so that the output clock signals have high frequency precision and stability and the module can work in free-run or holdover state when the external reference clock source is lost.
There are two BTS clock software versions currently in service, V0407 and V0529. Their algorithms are not the same. For V0407, when the phase discrimination deviation is less than 4Hz, the clocks trace and lock the reference clock. Otherwise, they do not trace and lock the reference clock and stay in free-run mode. For V0529, the clocks always trace and lock the reference source.
The MCK module has three phase lock states: free-run, pull-in and locked.
Free-run state occurs under the following scenarios:
- When the reference source jitters severely and cannot stay stable long enough for the clocks to pull in.
- When the reference source is lost, MCK enters free-run state immediately, no matter what state it is in.
The 13MHz clock out-of-lock alarm is reported under both of the above scenarios.
- When the reference source is set as the internal clock on the local MMI, the MCK module is forced into free-run state.
Pull-in state occurs under the following scenarios:
- When the reference source jitter is less than 4Hz (BTS3X) or 5Hz (BTS3001C) in free-run state, the MCK module enters pull-in state.
- When the clock deviation frequently exceeds the security range in locked state, the MCK module enters pull-in state.
The locked state occurs under the following scenario:
- When the 13MHz clock deviation has been less than 0.2Hz for a long time, the MCK module enters locked state.
III. The generation of BTS system clock signals
The GSM system is a TDMA (Time Division Multiple Access) system, i.e., multiple access is accomplished through time division. Both the TDMA principle and the GSM 05.10 specification require four signals for timing at the radio interface: FCLK (Frame Clock), FN (Frame No.), TSCLK (Timeslot Clock) and OBCLK (Octet Bit Clock).
As defined in the GSM specification, the system clock signals can be generated by frequency division of the 13MHz master clock. Among the clock signals, OBCLK is the 13MHz clock divided by six, i.e., 13MHz/6 ≈ 2.167MHz. The FCLK, OBCLK and SREF signals are aligned with the rising edge of the 13MHz clocks (10-20 ns delay allowable).
SREF: Frequency 13MHz/4 = 3.25 MHz, period 307.7 ns, duty ratio 50%
OBCLK: Frequency 13MHz/6 = 2.167 MHz, period 461.5 ns, duty ratio 50%
FCLK: Frequency 13MHz/6/10000 = 216.7 Hz, period 4.615 ms, duty ratio 50%
13MHz: Frequency 13MHz, duty ratio 50%
The period relationship between the clocks is as follows:
1 FCLK = 8 TSCLK
1 TSCLK = 156.25 BCLK = 1250 OBCLK
IV. The calibration of 13MHz clocks
Because the frequency dispersion of the 13MHz OCXO is rather large in actual manufacturing, the 13MHz output frequencies of different boards differ greatly. Therefore, to ensure an accurate 13MHz reference clock for each board, every 13MHz clock must be calibrated individually before the board starts to work. In addition, the 13MHz clocks must be calibrated once every year to avoid 13MHz clock deviation due to clock aging.
1) Required equipment and environment
One frequency meter, one serial port communication cable and the MMI BTS maintenance system.
2) 13MHz clock calibration process
- Check that the board is in normal service (warm-up finished). Connect the frequency meter to the 13MHz clock port of the board, and the serial port communication cable to the control console of the MMI BTS maintenance system.
- Start MMI control, obtain administration authority and complete the board configuration.
- Select the board management and clock configuration item.
- Select Internal clock in the clock configuration interface, and adjust the default DA control value so that the difference between the 13MHz value on the frequency meter and the actual 13MHz is within 0.1Hz.
- Select store to Flash Memory and press <OK> in the [CLK configuration] interface. Note: Press Ctrl + Alt + F to perform clock hardware parameter configuration.
- Exit the configuration item to complete the 13MHz clock calibration.
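The division ratios listed in section III can be verified with a few lines of arithmetic. The sketch below recomputes each frequency and period from the 13MHz master clock and checks that the period relationships are mutually consistent. The variable and function names are illustrative, and the bit clock frequency is derived here from the stated ratios rather than quoted from the manual.

```python
MASTER_HZ = 13_000_000  # 13MHz master clock

# Division ratios taken from the signal list in section III.
DIVIDERS = {
    "SREF": 4,
    "OBCLK": 6,
    "FCLK": 6 * 10_000,
}

# Period relationships from the same section.
TSCLK_PER_FCLK = 8
OBCLK_PER_TSCLK = 1250
BCLK_PER_TSCLK = 156.25


def derived_clocks():
    """Return {name: (frequency_hz, period_s)} for each divided clock."""
    return {name: (MASTER_HZ / div, div / MASTER_HZ) for name, div in DIVIDERS.items()}


clocks = derived_clocks()
for name, (freq, period) in clocks.items():
    print(f"{name}: {freq:.1f} Hz, period {period * 1e9:.1f} ns")

# One frame holds 8 timeslots of 1250 OBCLK each, i.e. 10000 OBCLK periods,
# which matches the extra division by 10000 between OBCLK and FCLK.
obclk_per_frame = TSCLK_PER_FCLK * OBCLK_PER_TSCLK

# 1250 OBCLK per 156.25 bits gives 8 OBCLK periods per bit, so the bit clock
# is OBCLK / 8 (derived value, about 270.833 kHz).
bclk_hz = clocks["OBCLK"][0] / (OBCLK_PER_TSCLK / BCLK_PER_TSCLK)
print(obclk_per_frame, round(bclk_hz, 1))
```

A frequency meter reading of one of these clocks can therefore be converted back to an equivalent 13MHz reading by multiplying by the division ratio, which is useful when only a divided clock is accessible at a test point.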
The clock system principles of all Huawei BTS series are basically the same. A comprehensive understanding of the Huawei BTS clock system signal flow helps considerably in clock system troubleshooting.

V. Huawei BTS clock system signal flow
1) BTS30 clock system signal flow
[Figure 5-10 labels: TMU, CMB, TDU board (JP1 to JP4), TRB (JP3), six TRXs, matching connectors, wiring points 1 to 4]
Figure 5-10 BTS30 clock system signal flow

The BTS30 clock system signal flow is illustrated in Figure 5-10. The path in the major cabinet is TMU -> CMB (JP3) -> TDU (JP2) -> TDU (JP4 and JP3 -> extension cabinet) -> JP1 -> TRB (JP3) -> the 6 TRXs in the local cabinet. Along this path, the signals might pass four wiring points: wiring 1 and wiring 2 (to the extension cabinet), wiring 3 (TDU to TRB) and wiring 4 (from the CMB of the major cabinet to the TDU).

The path in the extension cabinet is TDU (JP3 or JP4) -> TDU (JP1) -> TRB (JP3) -> the 6 TRXs in the local cabinet. Along this path, the signals might pass three wiring points: wiring 1 and wiring 2 (to the major cabinet and to the next level of extension cabinet) and wiring 3 (TDU to TRB).

The signal flow of BTS312 is similar to that of BTS30. Clock signals are forwarded via the TDU.

2) BTS3001C clock system signal flow
Because BTS3001C cabinets are sealed and must not be opened in the field, only clock source and operation problems are described here, not the internal signal flow.

VI. Board indicators of BTS series
1) Indicators of clock-related boards in BTS3X series

Table 5-6 TMU indicator
Indicator: PLL
Color: Green
Meaning: Phase discrimination indicator
Description: ON: free run; Flash quickly (4 Hz): pull-in; Flash slowly (1 Hz): locked; OFF: abnormal
Normal state: Flash slowly (1 Hz)
2) BTS3001C clock-related indicator
There is no clock-related indicator in BTS3001C.

VII. Huawei BTS series clock test points
1) BTS3X series
The T2MHz and 13MHz test outputs on the TMU are illustrated in Figure 5-11.

[Figure 5-11 panel labels: PWR, RUN, LI1, LI2, LI3, LI4, M/S, PLL, DBG, MMI, T2M, FCLK, T13M]
Figure 5-11 TMU clock test point

2) BTS3001C
The BTS3001C test points are illustrated in Figure 5-12. The 13MHz output point is 13MCLK. The FCLK output point can also output the 2MHz clock (the output is switched by executing custom commands).

[Figure 5-12 panel labels: 13MCLK, FCLK, SLAVE, MASTER, I, O, POWER, MMI, ASU, O&M, PWR, RUN, ACT, MOD, LIU, S3 (ON/OFF, MSB/LSB)]
Figure 5-12 BTS3001C O&M cavity (REV.C)
5.2.3 Specifications about BTS Clock System
The GSM specifications place the following requirement on GSM BTS frequency accuracy. The frequencies of the radio part and the baseband part of the BTS must derive from the same signal source. For a BTS of any type, whether or not it supports slow frequency hopping (FH), the absolute frequency error of its transceivers and the relative frequency error between transceivers must be less than 0.05ppm under all test conditions.

5.3 Trouble Handling
5.3.1 Trouble Handling for BSC
I. Trouble handling for reference source
1) Description
The F0 indicator is ON or flashing after GCKS enters normal service. The Clock reference source abnormal and Reference source replacement alarms are displayed on the alarm console. Sometimes the clocks of most BTSs are out of lock.

2) Introduction to indicators
The F0 indicator indicates the state of the reference source.
OFF: Reference source unavailable, GCKS in free-run state.
Flash quickly: Reference source available, GCKS in fast pull-in state (for about half an hour).
Flash slowly: GCKS in locked state (normal working state).
ON: Reference source lost while GCKS was in locked state; GCKS enters holdover state.

3) Analysis
- Remove the cables of the 8K0 and 8K1 reference sources from E3M, then check whether reference source alarms are still reported and whether the BTSs have locked to the BSC clock. If the reference source alarms disappear and the BTSs lock to the BSC clock normally after the cables are removed, a reference source fault can be confirmed.
- If a frequency meter is available, measure the deviation of the 2MHz clock output by the clock frame before and after the cable removal. If the output clock deviation is large before the cable removal but normal afterwards, a reference source fault can be confirmed.
- Check the board indicator. When the F0 indicator is ON, no reference source is available. When F0 is flashing, the reference source is not stable.
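The F0 indicator states listed under "Introduction to indicators" can be kept as a small lookup when writing inspection records. This is a minimal sketch; the observation labels (`off`, `flash_quickly`, `flash_slowly`, `on`) and the function names are hypothetical and only mirror the four states given in the text.

```python
# Hypothetical labels for what is observed on the GCKS F0 indicator.
F0_STATES = {
    "off": "free-run (reference source unavailable)",
    "flash_quickly": "fast pull-in (reference source available)",
    "flash_slowly": "locked (normal working state)",
    "on": "holdover (reference source lost while locked)",
}


def describe_f0(observation):
    """Map an observed F0 indicator behaviour to the GCKS clock state."""
    key = observation.strip().lower().replace(" ", "_")
    if key not in F0_STATES:
        raise ValueError(f"unknown F0 observation: {observation!r}")
    return F0_STATES[key]


def f0_is_normal(observation):
    """Only the slowly flashing (locked) state is the normal working state."""
    return describe_f0(observation).startswith("locked")


if __name__ == "__main__":
    for seen in ("off", "flash quickly", "flash slowly", "ON"):
        print(seen, "->", describe_f0(seen), "| normal:", f0_is_normal(seen))
```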
4) Troubleshooting process
If the reference source fault is confirmed, take the following steps for troubleshooting.
- Check whether the cables are correctly and reliably connected. Note that 8K0 and 8K1 come from different offices and are connected to different sockets (JC1 and JC2). If possible, replace the clock input cables to rule out problems caused by a clock input cable fault.
- Check the clock generation channels in the E3M and the TCSM frame. Connect the clock access cables to other E3Ms and TCSM frames to check whether there is an E3M or TCSM frame fault. If there is, check whether the E3M interfaces are damaged or whether the TCSM frame is out of normal service. Replace the E3M or the DRC on the back of the backplane.
- If the reference source fault still cannot be removed, test in sequence the MSC CKS output clock, the MSM clock and the 8kHz reference source clock input from the BSC CKB.
- If the MSC clocks are not accurate, check the reference source and the operating status of the clock frame of the MSC. If the MSM clock is not accurate, check the E1 cables between BSC and MSC. If there is transmission equipment between BSC and MSC, it has a great impact on the clocks and has to be tested.

II. Trouble handling for the clock frame
1) Components of the clock frame
The clock frame is composed of GCKS, CKB and PWC. There are two GCKSs, working in active/standby mode in slot 4 and slot 6.

2) Description
On the maintenance console, GCKS is displayed red and the GCKS indicator works abnormally. Clock-related alarms are displayed on the alarm console.

3) Trouble analysis
The most probable cause of a clock frame fault is a GCKS or CKB fault (including DIP switch setting errors). When the above scenario occurs and the fault is removed after GCKS replacement, check whether the DIP switches are correctly set. If the DIP switches are correctly set, it can be confirmed that the GCKS is faulty. Replace the corresponding GCKS.
After GCKS replacement, if the fault still exists, remove the cables of the two reference sources in the clock frame and test the CKB output clocks (including 2 MHz, 4 MHz, 8 kHz and 32 MHz). If the CKB output clock is abnormal (the faulty waveform is difficult to capture, and sampling with an oscilloscope takes a long time), the CKB must be faulty. Replace the CKB.

III. Trouble handling for the cables
1) Clock-related cables
(a) Maintenance cables
Maintenance cables connect GMCCM and GCKS. GMCCM performs O&M on the two GCKSs through 2 serial ports. One end of the maintenance cable is connected to JB33 on C821MCB (inserted from the first pin at the top). The other end of the cable has two connectors, one 4-core connector and one 2-core connector. The 4-core connector is connected to JB21 on CKB (inserted in the pins five pins above the bottom) and the 2-core connector is connected to JB3 on CKB (inserted from the first pin at the top).

(b) Clock cables
There are two types of clock cables. One is the clock cable for synchronization (differential), which is connected to the fifth port at the back of E3M and feeds the synchronization clock source to the clock frame. The other is the clock cable that carries the two sets of clocks (32MHz, 8kHz and 4MHz; 32MHz and 8kHz) output by the clock frame to the GCTN and GSNT via the corresponding backplane.

2) Trouble analysis
On the maintenance console, GCKS is displayed red, the GCKS CLK indicator flashes slowly and the Clock reference source alarm is displayed on the alarm console. The trouble cannot be removed even after GCKS replacement. If the F0/REFA indicator on the GCKS is ON, it might indicate a reference source cable fault. GCTN clock fault and GSNT clock fault alarms might be displayed on the alarm console without any other correlated alarms.

3) Trouble handling
When GCKS is displayed red, the GCKS CLK indicator flashes slowly and the trouble cannot be removed after GCKS replacement, replace the maintenance cables.
When the Clock reference source fault alarm is reported and the F0/REFA indicator on GCKS is ON, the reference source cables might be faulty. Replace the reference source cables.
When GCTN clock fault and GSNT clock fault alarms are displayed on the alarm console without any correlated alarm, the CKB output clock cable might be faulty. Replace the CKB output clock cable.

5.3.2 Troubleshooting for BTS
I. Troubleshooting for the reference source
1) Description
The 13MHz clock out-of-lock alarm is reported, the BTS BSIC cannot be unlocked and the cell handover success rate drops, but no other alarms are reported.

2) Probable causes
Refer to Table 1-2.
3) Trouble handling
Step 1 On the local MMI maintenance console, check whether the BTS MCK module clock is in free-run state or fast pull-in state. Observe and record the D/A value in the displayed board information.
Step 2 Test and record the jitter deviation of the 2MHz output clocks of different BTSs. Within a short time frame, the jitter deviation should stay inside the MCK module tracing range, i.e., within +/-5Hz or +/-4Hz. Note that before testing the 2MHz output clock of a BTS3001C, a custom message for clock signal switchover has to be sent through the local MMI maintenance console:
D4-00-FF-FF-FF-81-00-FF-02-72- [00-FCLK, 01-2MHz]
Step 3 Test the BTS 13MHz output clock and write down the value.
Step 4 If the BTS 13MHz output clock is not accurate, set the BTS clock mode to internal clock mode. Adjust the BTS D/A value according to the deviation of the 13MHz clock so that the BTS can enter the locked state.
Step 5 If the BTS 13MHz output clock is accurate, the 13MHz clock out-of-lock might be caused by a reference source fault, which is probably due to a transmission or BSC clock fault.
Step 6 Check whether 13MHz clock out-of-lock occurs on other BTSs of the same BSC. Check whether the 2MHz clock of the BSC is accurate and normal.
Step 7 If the BSC clock is accurate, check whether other TS abstraction equipment exists on the transmission channel and whether the clock system of the transmission equipment is normal.

II. Troubleshooting for field operation errors
1) Description
The 13MHz clock out-of-lock alarm is displayed, the BTS BSIC cannot be unlocked and the cell handover success rate drops, but no other alarms are displayed.

2) Probable causes
Refer to Table 1-2.

3) Trouble handling
Step 1 On the local MMI maintenance console, right click the MMU and observe whether the clock state in the displayed board information is free-run. Write down the D/A value displayed.
Step 2 Set the BTS clock mode to internal clock mode, test the 13MHz clock output by the BTS with the frequency meter and observe whether the 13MHz output meets the specification requirement.
Step 3 If the 13MHz clock output by the BTS is in error, manually adjust the D/A value and observe whether the 13MHz output clock varies accordingly.
Step 4 If the D/A value can be successfully adjusted, it can be confirmed that the D/A value error is caused by BTS clock aging or operation mistakes. Re-adjust the D/A value for the 13MHz clock of the BTS. If adjusting the D/A value does not change the 13MHz output frequency accordingly, the board must be faulty. Replace the board.
Detailed troubleshooting measures: in internal clock mode, adjust the D/A value so that the BTS outputs an accurate 13MHz clock and save the D/A value in the Flash Memory of the BTS. Then set the BTS clock work mode back to external clock mode.

III. Troubleshooting for cables
1) Description
Clock-related alarms and phase-locked loop alarms occur on the TRXs in all slots. Repeated loading, TRX communication alarms and removal reports occur on the TRXs of all slots. No alarms occur on the TMU. The TDU indicator and DIP switches are normal.
2) Probable causes
Refer to Table 1-2.

3) Analysis
Instrument required: multimeter
Step 1 Following the BTS30 clock system signal flow, check the rack cabling from the clock source (the TMU in the major BTS) to the TRB. Replace any faulty cables and observe whether the trouble has been removed. If the trouble cannot be removed, it might be caused by a board fault. Remove the board fault accordingly.

IV. Troubleshooting for board fault
1) Description
Boards are repeatedly reset and loaded due to the TMU major clock alarm, the TRX frame No. (TS No.) alarm, clock alarms, etc. TMU alarms and TRX alarms are most probably caused by board faults. As described in the BTS30 clock system signal flow, the TMU is the source of the clock system, the CMB, TRB and TDU are the parts the clock signal passes through, and the TRX is the termination of the clock signals. Therefore, when the TMU, TDU or CMB clock is faulty, none of the TRXs can work normally. A TMU fault is usually accompanied by a TMU clock alarm, a TDU fault leads to the fault of all TRXs, and a CMB fault also leads to a TDU fault. When the TRB or a TRX is faulty, clock alarms usually occur on the TRXs in some slots only. Board faults can usually be analyzed and located according to the system clock signal flow.

2) Probable causes
Refer to Table 1-1.

3) Trouble handling
There are two ways to handle board faults. One is board replacement. The other is a clock waveform test, which requires an oscilloscope on site: test the clock waveform segment by segment along the clock signal flow to locate the faulty segment. The following describes the process for board replacement.
Step 1 Check whether all TRXs are faulty and whether serious clock alarms occur on the TMU. If yes, go to Step 2. If no, go to Step 7.
Step 2 Replace the board suspected to be faulty with a normal board and observe whether the BTS works normally. If the BTS resumes normal operation, it can be confirmed that the TMU is faulty. Replace the faulty TMU.
If, after TMU replacement, the TMU major clock alarm is removed but the TRX alarms are not, go to Step 3.
Step 3 Observe whether the rack configuration indicated by the TDU indicator is correct. If yes, replace the faulty TDU with a normal TDU working in another rack and observe whether the fault is removed. If yes, go to Step 4.
Step 4 It can be confirmed that a cabling fault is excluded.
Step 5 Replace the CMB.
Step 6 Replace the TRB.
Step 7 If the faults of some TRXs still cannot be removed, pull out the faulty TRXs and insert them into normally working TRX slots. Observe whether the faults persist. If yes, the TRX must be faulty. If no, the corresponding TRB slot may be faulty or there may be a matching problem. Replace the TRB. For a matching problem, pull out and then push in the connectors to adjust the clock bus load so that the fault can be removed.

5.4 Examples
5.4.1 Troubleshooting Examples for BSC Clock Fault
I. Call drops during handover due to too big BTS clock frequency deviation
Fault description: Calls sometimes dropped during handover in a GSM1800 system, although the signal level was high and there was no interference.
Troubleshooting process: According to the onsite test results, the BTS frequency deviation was very big. The 13MHz output frequency deviation of some BTSs reached 2.5Hz, far larger than the 0.65Hz allowed by the international standard. 8kHz clock alarms of many BTSs were reported on the alarm console. The MSC frequently reported the Clock software phase lock normal/abnormal alarm, which indicated that the clock at the MSC side was inaccurate.
Cause analysis: From the above description, it can be inferred that the instability of the BTS clocks was the major cause of the call drops during handover. The maintenance engineers therefore re-adjusted the clock reference source of the MSC to make it stable. The 8kHz input clock error alarm disappeared and the call drops during handover stopped.

II. BTS clock frequency deviation too big due to configuration error in the clock description table
Fault description: In a single-module BSC, the engineers queried the MCK clock of the BTS and found that its frequency deviation was very big. They then tested the clock from the GCTN and found that its frequency deviation was also very big.
Troubleshooting process: The engineers queried the data configuration and found that the clock selection in the clock description table was configured as Multi-module BSC clock load-sharing mode, so the GCTN extracted clocks from the GOPT.
Cause analysis: In a single-module BSC, the clock selection should be configured as C&C08B standalone exchange so that the GCTN extracts clocks directly from the
clock frame. After modifying the configuration, the engineers tested the clocks from the GCTN and found their frequency deviation very small. They queried the MCK clocks and found the frequency deviation normal.

III. Clock fault due to cable problem from MCB backplane to FIO backplane
Fault description: During the equipment check after an upgrade in an office, the engineers found that clock system alarms were reported and the ENA indicator on GCTN0 in SCP4 was flashing quickly.
Troubleshooting process: The engineers at first thought the problem was caused by a GCTN fault, but the fault could not be removed by GCTN replacement. They performed a GCKS switchover and the system returned to normal. They then replaced GCKS0 and switched GCKS0 over to active, but the fault remained. After repeated verification, it was found that the alarm was reported whenever GCKS0 was used, while the system worked normally when GCKS1 was used. The engineers performed a GSNT switchover and a clock cable plug/unplug test, and inserted only one GCKS in the frame, but the fault could not be removed.
Cause analysis: In the internal cable set of the BSC, the cable from the MCB to the FIO backplane was broken, so the GCTN could not correctly select clocks after a clock switchover.

IV. BSC clock problem due to MSM fault
Fault description: In an office, BTS2.0 was used. After BTS cutover, most of the BTSs were in fast pull-in state and the crystal oscillator override alarm was reported many times.
Troubleshooting process: The engineers queried the BSC clock board and found it normal. The MOD indicator of GCKS was flashing slowly and GCKS was in locked state. They checked all the boards on the BSC maintenance console and found their state normal.
Cause analysis: After analyzing the faulty MSM, the engineers found that the matching resistor of the E1 Rx/Tx chip of the faulty MSM was damaged. They replaced the damaged resistor and the fault was removed.

V. BTS out of lock because E1 TS integrator cannot transmit the clock in real time
Fault description: In a BSC module, the 13MHz clock out-of-lock alarm was displayed at the BTS side. The TMUs of all the BTSs were in free-run state and could not lock to the BSC clocks. The fault could be temporarily removed by resetting GCKS, but this did not remove its root cause.
Troubleshooting process: The engineers checked the E1 channels on E3M 2 and found TS crossover equipment connected in one E1 channel. They changed the clock extraction E1 port of the BSC and extracted the clocks from an E1 that did not pass through the TS crossover equipment. The out-of-lock problem of the BTSs was removed and the BTSs started to trace their upper-level clock.
VI. The reference source not available due to too small amplitude of 8k1 reference source
Fault description: After the cutover of XX BTSs, most of the BTSs were found in fast pull-in state and the crystal oscillator override alarm was reported many times. The engineers checked the BSC clocks and found them normal. They then checked the MSC clocks and found the 8kHz 1 reference source abnormal.
Troubleshooting process: The engineers tested the 2MHz differential signals output by the MSM on the pins at the back of the MSM backplane with an oscilloscope and a frequency meter. The square wave pulse voltage was only 68 mV and the frequency was around 2.048MHz.
Cause analysis: The fault was probably caused by the signal level being too small. When measured with the frequency meter, the reading was 1.75MHz. After MSM replacement, the clocks returned to normal. Therefore, it can be inferred that the fault was caused by an MSM problem.

VII. Clock fault due to CKB DIP switches setting error
Fault description: During batch GCKS replacement in office X, the GCKS of the older version worked normally in slot 6 of the clock frame, but was displayed as faulty after being replaced with a GCKS of the new version.
Troubleshooting process: The engineers checked the DIP switches on the CKB and found that S4 was incorrectly set to ON ON ON OFF. It was therefore the S4 setting error that caused the abnormal working of the new-version GCKS in slot 6.
Cause analysis: In a multi-module BSC, the communication between GCKS and GMCCM adopts point-to-multipoint mode. Since the serial port on GCKS0 in slot 4 of the clock frame is directly connected to the serial port cable communicating with GMCCM, the setting of the DIP switches affects the serial port communication of GCKS1 in slot 6 rather than that of GCKS0 in slot 4.

VIII. Clock fault due to connection problem of clock cables and the backplane
Fault description: In XX local network, there was no voice after calls were connected, and the Group0 clock fault alarm was reported for the GCTN in AM/CM. (Group0 clock refers to the clock signals transmitted from GCKS0 to GCTN or GSNT. Group1 clock refers to the clock signals from GCKS1 to GCTN or GSNT.)
Troubleshooting process: The engineers pulled out the active GCKS0 and performed an active/standby switchover of the boards. The network then worked normally.
Cause analysis: After testing, it was confirmed that the fault was caused by a connection problem between the CKB and the group0 4MHz clock cable of GCTN 0. The fault was removed by replacing the clock cable and the CKB.

5.4.2 Troubleshooting Examples for BTS Clock Fault
I. Clock fault because TMU 13MHz clock cannot lock the external clock
Fault description: During office deployment, the engineers found that the BSIC of BTS A could not be unlocked and the TMU of BTS A was in free-run state. In the BSC
configuration data, the working state of BTS A was set as Tracing BSC clocks, but the actual state was free-run. The D/A value of the TMU OCXO was 1470.
Troubleshooting process: The engineers tested the T2MHz and T13MHz outputs of the BTS A TMU with a frequency meter and found that the deviation of the two clocks was very big, as given below.
T2MHz 2047996Hz to 2048006Hz
T13MHz 12999993Hz to 12999994Hz
From the above test result, the clock deviation of the T2MHz was very big, and it was necessary to check whether poor transmission quality or a problem with the TMU 21Q554 chip caused the clock deviation. The engineers selected a BTS B that was working normally in external clock mode and tested its T2MHz and T13MHz outputs with the frequency meter.
T2MHz 2047999Hz to 2048000Hz
T13MHz 13000000Hz to 13000000Hz
They moved the TMU of BTS B to BTS A and tested the clock output of the TMU with a frequency meter.
T2MHz 2047993Hz to 2048009Hz
T13MHz 13000000Hz to 13000000Hz
From the above test, it can be confirmed that the transmission of BTS A was not stable and the clock deviation was big. But the TMU of BTS B could still lock to the external clock in BTS A and output a standard 13MHz clock. Therefore, although the transmission quality was poor, it was not the cause of the free-run state of the TMU in BTS A.
After moving the TMU of BTS A back, the engineers adjusted the D/A value of the OCXO with the help of the frequency meter. They found that the TMU could output a standard 13MHz clock and successfully lock to the external clock when the D/A value was 1910.
Cause analysis: The following three conclusions can be drawn from the troubleshooting process.
1) The transmission quality of BTS A is poor, but it is not the reason why the TMU cannot lock to the external clock.
2) The chips of the TMU in BTS A are not damaged, because the TMU works normally when the D/A value is 1910.
3) The root cause of the fault is that the configured D/A value of the OCXO in the TMU of BTS A deviated too much from the value actually required (1910-1470=440).
The engineers traced the status of the TMU in BTS A for three weeks and found that the value of the phase discriminator stayed around 1910 and the TMU worked normally. This means the fault has been removed.

II. Faults are located segment by segment according to the signal flow
Fault description: In site X, the TRX could not be started normally.
Troubleshooting process: The engineers powered off the BTS and then proceeded as follows.
1) Replaced the connection cable between CMBJ25 and TRBJC3. The TRX still could not be started.
2) Replaced the TRB, after which the TRX could be started. However, master clock alarms were reported during operation.
3) Checked the DIP switches, the cables and the matching connectors of the TDU, CMB and TRB and found no error.
4) Pulled out and pushed in all TRXs one by one and powered them on. The fault could not be removed.
5) Replaced the TMU and reset BTS X. The fault disappeared.
6) Re-installed the original TMU and the master clock alarm occurred again. Therefore, it can be confirmed that the TMU was faulty.
7) Replaced the faulty TMU with the normal one and tested BTS X. The operating status of the boards was normal.
Cause analysis: The BTS worked normally after the TMU was replaced, and the master clock alarm occurred with the original TMU. From this, it can be inferred that the TMU fault prevented the TRX from starting normally.

III. Clock fault due to operation errors
Fault description: The 13MHz clock out-of-lock alarm was found on many BTSs in X area.
Troubleshooting process: The engineers configured the 13MHz clock in free-run mode at the local end and tested the T13M output of the BTS. They found that the output deviation was around 5Hz. They adjusted the D/A value and found that the 13MHz clock worked normally in free-run mode. They then configured the BTS clock in Tracing the upper level clock mode and found the BTS normal without any out-of-lock alarm.
Cause analysis: The fault was caused by operation errors. The default clock D/A value on the clock O&M interface was 2048, while the D/A value adjusted for the boards was between 1100 and 1600. When the user queried the interface and pressed <OK> to exit it, the system would start clock capturing with the default D/A value.
In this case, the 13MHz clock output was around 13MHz+5Hz, i.e., 5Hz away from the ex-factory 13MHz. Many clocks in the BTSs in the field ran version 0407, which usually gave up capturing a reference clock that deviated by 4Hz from the value of the 13MHz phase discriminator and adopted free-run clock mode. Thus, the 13MHz clock out-of-lock alarm was reported.

IV. Troubleshooting example 1 for faulty TMU
Fault description: In a BTS in X area, the TRX repeatedly loaded the software, and the TRX communication alarm, TRX micro-processor fault and frame TS alarm were also reported.
Troubleshooting process: The field engineers checked the output pulses of the 13MHz clock of the TMU and found them intermittently interrupted. They replaced the TMU and the BTS returned to normal.
Cause analysis: The TMU was damaged and its 13MHz clock was unstable.
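The reasoning in Example III (operation errors) above comes down to comparing the free-run offset with the capture limit of the clock software. The sketch below assumes a simple threshold model in which capture is abandoned once the offset reaches the quoted 4Hz; the function name and the treatment of exactly 4Hz are assumptions, not documented behaviour of version 0407.

```python
def gives_up_capture(offset_hz, capture_limit_hz=4.0):
    """Return True if clock capturing is expected to be abandoned.

    offset_hz: offset of the 13 MHz output from the reference, in Hz.
    capture_limit_hz: 4 Hz is the figure quoted for version 0407; treating
    the limit itself as "gives up" is an assumption of this sketch.
    """
    return abs(offset_hz) >= capture_limit_hz


if __name__ == "__main__":
    # About +5 Hz with the default D/A value of 2048, as in the example.
    print("default D/A value:", gives_up_capture(5.0))
    # A board re-adjusted to within 0.1 Hz, as in the calibration procedure.
    print("calibrated board:", gives_up_capture(0.1))
```

With the default D/A value the offset exceeds the limit, which matches the free-run state and out-of-lock alarm described in the example.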
V. Troubleshooting example 2 for faulty TMU
Fault description: In X local network, the TRX stayed in downloading state after a BAM reset. There were history alarms such as 2166 TRX master clock alarm and 2136 test phase-locked loop alarm, the clock of the TMU was in slave state, and attempts to reset the TMU and TRX from the remote end failed.
Troubleshooting process: The field engineer found that the transmission at the local end was normal and the ALM indicator on the TRX was ON. He performed debugging on the MMI and found that there was no reset hole beside the RST silk print on the handle bar. He first mistakenly thought that the RST hole was covered by the yellow ESD label, but then found that the clock of the TMU was a slave clock and that a soft reset could not be performed. At the local end, there were a master clock alarm and an E1 local end alarm. After a hard reset of the TMU at the local end, the TRX was still in alarm state and repeatedly downloaded software. He replaced the TMU and the BTS returned to normal.

VI. Troubleshooting examples for BTS3001C version fallback
Fault description: The cell handover success rate of a BTS3001C microcell in X area dropped sharply.
Troubleshooting process: The engineers tested the 13MHz clock output deviation of the BTS with a frequency meter and found that it reached around 27Hz. They checked the version of the BTS and found that it had fallen back from 05.0301A to BOOT 03.0301A. They re-activated the software and the fault disappeared.
Cause analysis: After the version fallback, the software versions of the BTS no longer matched, which caused the deviation of the BTS clock.
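The deviations measured in the examples of this chapter (2.5Hz, 5Hz and 27Hz at 13MHz) can be compared with the 0.05ppm requirement of section 5.2.3, which corresponds to 0.65Hz at 13MHz. The following is a minimal sketch of that conversion; the function names are illustrative and the readings are taken from the examples above.

```python
NOMINAL_HZ = 13_000_000   # 13 MHz master clock
LIMIT_PPM = 0.05          # frequency accuracy requirement from section 5.2.3


def deviation_ppm(measured_hz, nominal_hz=NOMINAL_HZ):
    """Frequency error of a frequency meter reading, in parts per million."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


def limit_hz(nominal_hz=NOMINAL_HZ, limit_ppm=LIMIT_PPM):
    """Absolute deviation in Hz that corresponds to the ppm limit."""
    return nominal_hz * limit_ppm / 1e6


def within_spec(measured_hz, nominal_hz=NOMINAL_HZ, limit_ppm=LIMIT_PPM):
    """True if the reading is inside the allowed frequency error."""
    return abs(deviation_ppm(measured_hz, nominal_hz)) < limit_ppm


if __name__ == "__main__":
    print(f"allowed deviation at 13 MHz: {limit_hz():.2f} Hz")
    for reading in (13_000_000.0, 12_999_993.0, 13_000_002.5, 13_000_027.0):
        print(reading, f"{deviation_ppm(reading):+.3f} ppm",
              "OK" if within_spec(reading) else "out of spec")
```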
Chapter 6 Troubleshooting for Handover

6.1 Overview
An MS moves continuously, so its position relative to the BTS changes during a conversation. To guarantee channel quality during the conversation, the MS continuously measures the quality of the radio channels of the surrounding cells and transmits the measurement report to the BSC through the BTS of the serving cell. The BSC implements radio link control based on information contained in the report, such as the level and quality class of the serving cell and the adjacent cells. When the MS moves from one cell to another, the new cell serves it instead of the original one and guarantees the continuity of service. Through handover, all cells form a seamless network.
There are many possible causes of handover failure. This part describes the general approach to handover failure analysis and gives examples.

6.1.1 Failure Classification
I. Classified by phenomenon
Handover failures can be classified into 3 kinds of problem based on the phenomenon.
- Handover is not initiated
- Incoming cell handover problems
- Outgoing cell handover problems

II. Classified by reasons
Handover failures can be classified into 3 kinds of problem based on their causes.
- Hardware failure, including board failure, hardware connection failure, etc., which may be caused by the BTS, BSC or MSC alone, or by improper matching of BTS, BSC and MSC.
- Data configuration failure, including adjacent cells with the same BCCH and BSIC, inconsistent CGI, unreasonable handover parameters and defects in frequency planning.
- Congestion

6.1.2 Locating Tool
Traffic statistics are a good tool for analyzing handover failures. For example, when the handover success rate is low in an office, check through the traffic statistics whether the radio handover success rate is also low. If it is, check if that is caused by the hardware
6-2 failure or too low level in handover or analyze if the handover threshold is set too low, resulting in low success rate due to too low level in handover. 6.2 Trouble Handling 6.2.1 Locating Procedure 1) Confirm if the problem occurs in an individual cell or all cells, and the characteristic of the failed cell, e.g. all are the adjacent cells of a cell or they share BSC and MSC. If the handover failure occurs between 2 cells, check with emphasis if the data configuration between 2 cells is correct and if the hardware fails. If the failure occurs in all adjacent cells of a cell, check with emphasis if the data configuration of this cell is correct and if the hardware of it fails. If the failure occurs in all the cells under the same BSC, check with emphasis the data configuration between BSC and MSC. If the failure occurs in all the cells under the same MSC, there may be a problem with the matching between the opposite office and this office, such as incompatible Signaling and unreasonable timer setting. 2) Confirm if the data are modified before the handover failure. If the failure occurs in an individual cell, check if the data configuration related to this cell is modified. If the failure occurs in all the cells under the same BSC, check if the data configuration of this BSC and opposite MSC is modified. Similarly, if the failure occurs in all the cells under the same MSC, check if the data configuration of opposite MSC is modified. 3) Check if the handover problem is caused by the hardware failure. See 1.2.3 for the locating method. 4) Register the useful traffic statistic, such as the handover and TCH performance measurements. Pay attention to the following, but not only these are involved. ! Observe if TCH occupation of failed cell is normal, e.g. if call drop rate rises up. ! Observe if the success rates of incoming and outgoing handover are normal. ! Observe the distribution of reasons for handover failure. ! 
Observe if the radio handover success rate is normal. 5) Do the driving test on the failed cell and analyze the Signaling of driving test. Pay attention to the following. ! Observe if the uplink/downlink level of failed cell is balanced. The uplink/downlink unbalance may cause the handover problem. And frequent uplink/downlink unbalance is caused by the hardware failure. ! Observe if the measurement report of failed cell contains the correct list of the adjacent cells. Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 6 Troubleshooting for Handover
- Observe whether handover is possible from the failed cell to the adjacent cells and vice versa.
- Analyze whether the handover signaling flow is normal.
6.2.2 Locating Procedure for No Handover Initiation
An MS in a cell cannot initiate a handover to another cell even though the signal is very weak or of very bad quality. This kind of problem is generally analyzed from 2 aspects:
- whether the condition for outgoing handover is met;
- whether there is a candidate cell that satisfies the condition for outgoing cell handover.
The specific reasons may be as follows.
I. The handover threshold is set too low.
For edge handover, the triggering condition is that the Rx level is lower than the handover threshold. If the threshold is set too low, handover is not initiated even when the level of an adjacent cell is much higher than that of the service cell. Conversation quality suffers and, in serious cases, calls drop. The handover threshold should be set according to the coverage of the cell; the service area of the cell can be changed indirectly through the handover threshold.
II. The adjacent cell relationship is not set.
Although the level of an adjacent cell is very high in the service cell, the MS does not report this adjacent cell and handover to it is unavailable, because the adjacent cell relationship is not set. Observe the adjacent cell list of the service cell reported by the MS through a reselection or conversation test. If the MS has already moved into the main lobe of a cell but that cell is not in the adjacent cell list, check whether the correct adjacent cell relationship is set. Alternatively, scan the BCCH frequencies with another MS during the test and observe whether the BCCH frequency with the stronger signal appears in the service cell or adjacent cell list.
III. The hysteresis is set unreasonably.
A handover candidate cell can be taken as the destination cell only when its signal level exceeds that of the service cell by more than the hysteresis. When the hysteresis is set too large, handover is hard to trigger.
IV. The N-P parameters for the statistic time of the optimal cell are set unreasonably.
In normal handover, the handover candidate cells are ranked according to the N-P principle: if a candidate cell is the optimal cell for P seconds out of N seconds, it can be taken as the destination cell of handover. When 2 good candidate cells alternate as the optimal cell, the handover decision algorithm can hardly find an optimal cell that satisfies the N-P principle, so handover is hard to trigger. Adjust N and P and reduce the statistic time to make the handover decision more sensitive to level changes. This case occurred during the optimization of a network: the original statistic time of a cell was N=5 and P=4, and after adjustment to N=4 and P=3 handover was normal.
When the terrain of the service cell is very complex, the received signal level of a moving MS fluctuates a lot. In this case it is hard for a candidate cell to satisfy the N-P principle, so handover is hard to trigger.
6.2.3 Locating of Hardware Failure
If the data configuration of the failed cell and its adjacent cells has not been modified recently and the handover problem occurs suddenly, first consider whether it is caused by BTS hardware failure.
1) If a similar problem occurs in the other cells of the same BTS, consider whether it is caused by failure of hardware shared by the cells, e.g. a failed TMU.
2) If only one cell under this BTS has the handover problem, consider whether it is caused by hardware failure of that cell, e.g. some carriers are damaged so that handover to those carriers fails. Carriers can be blocked to verify this. If the handover success rate returns to normal after a carrier is blocked, check whether the cause is a failure of this carrier, the related CDU or the antenna. If the uplink/downlink signal of a carrier is seriously unbalanced, handover problems result, e.g. frequent handover or a decreasing handover success rate.
3) Observe through Abis interface tracing whether the signaling of this cell is normal, including whether the uplink/downlink Rx quality in the measurement report is good. If the Rx quality in the report is bad, there is a hardware failure or serious interference in this cell; as a result signaling cannot be exchanged normally and handover problems occur.
6.2.4 Locating of Data Configuration Problem
As many handover problems are caused by data configuration errors, the analysis and solutions are introduced below.
1) MSC independent networking mode: If incoming or outgoing MSC handover is abnormal, check first whether the signaling matching of the 2 MSCs is correct and second whether the data of the opposite MSC and the local MSC have been modified recently.
2) Shared MSC networking mode: If handover between BSCs from different suppliers is abnormal, check first whether the signaling matching between the BSCs is correct and second whether the BSC data have been modified.
3) If the handover abnormality occurs only in one cell, analyze it according to the specific symptoms.
4) If incoming cell handover is abnormal, observe whether handover from all other cells to this cell is abnormal (the usual symptom is that the handover success rate is low and handover to this cell is unavailable).
If handover from all other cells to this cell is abnormal, the cause is generally the data configuration of this cell. This includes not only the data configured for this cell itself but also the data about this cell configured in other cells. For instance, the CGI may be correct in the data configuration table of this cell but incorrect where it is configured in the adjacent cells.
If incoming cell handover is abnormal from only one cell while handover from the other cells to this cell is normal, check whether the data that the adjacent cell holds about this cell are correct and whether the hardware of this cell is normal, in addition to checking whether the adjacent cell is correctly configured in the data of this cell.
For abnormal outgoing cell handover, the analysis is similar to that for incoming cell handover, so it is not described here.
6.3 Examples
6.3.1 MSC Handover Problem
I. A charging data modification by supplier A reduces the incoming MSC handover success rate of a dual band network to 0%.
[Description]
After a dual band network was activated, the indices were normal all the time. Then one day the incoming MSC handover success rate suddenly became 0%.
[Handling Process]
1) As the handover suddenly became abnormal, it was checked whether the data configuration of the GSM1800 network had been modified.
2) When the signaling at the GSM1800 MSC side was traced, it was discovered that after the GSM900 MSC sent the handover request to the GSM1800 MSC, the latter returned the handover response message (with the handover command) to the former. After 2 seconds, the opposite GSM900 MSC sent an Abort message to the GSM1800 MSC.
3) When the signaling of the GSM900 MSC was traced, it was discovered that the BSC of GSM900 reported the Ho Detect message during handover from the GSM900 BTS to the GSM1800 BTS. However, that message sent from GSM900 was not received at the GSM1800 MSC side, so the handover failed.
4) It was checked whether the data of the opposite MSC had been modified, and it was discovered that the MSC data of GSM900 had been modified.
5) Handover was normal after the MSC data of GSM900 were corrected.
II. MSC data configuration error leads to handover failure.
[Description]
The GSM1800 network of an office was Huawei equipment and the GSM900 network was from supplier S. The former was configured with 1 MSC (MSC4) and the latter with 2 MSCs (MSC1 and MSC2). Handover of an MS from MSC4 to MSC1 was normal, but handover from MSC4 to MSC2 was unsuccessful.
[Handling Process]
1) As handover from MSC4 to MSC1 was normal, the reason for the failure might be a problem with routing from MSC4 to MSC2.
2) The signaling link of interface E between MSC2 and MSC4 was checked; all was normal and there was no problem with conversations between MSC2 and MSC4.
3) The signaling of interface E was analyzed. The MS under MSC4 sent the handover request. MSC2 returned the handover request response with the handover roaming number, but the subsequent flow was interrupted.
4) In the subsequent flow, MSC4 should establish the service channel at interface E by addressing with the handover roaming number. It was possible that MSC4 could not find the route for the handover roaming number.
5) When the data configuration of MSC4 was checked, it was discovered that in the called number analysis table the called number attribute of the roaming number returned by the MSC was configured as MSISDN.
6) After the called number attribute was modified, the failure was eliminated.
[Recommendation]
As incoming/outgoing handover involves route selection, analyzing the failure reason is complicated. It is recommended to make good use of the analyzer function and find the failure reason from the flow.
III. Improper BSC parameter setting of company S causes low handover success rate to Huawei MSC.
[Description]
In an office, MSC, BSC and BTS equipment from both Huawei and company S existed at the same time.
The handover success rate from the Huawei MSC to the MSC of company S generally stayed at about 80%, but the handover success rate from the MSC of company S to the Huawei MSC was lower, about 30% at the lowest.
[Analysis]
Many factors affect inter-MSC handover, including network optimization, data configuration and protocol. Handover includes 3 processes:
Handover Required Indication;
Handover Resource Allocation;
Handover Execution.
Note: The Handover Required Indication process allows the BSS to request that a handover be carried out for an MS.
[Handling Process]
1) The radio parameters, such as frequency, power and handover relationship, of the sites of both parties near the coverage area of the company S equipment were adjusted, but the effect was not obvious.
2) The signaling of both interfaces was traced with a signaling analyzer, and it was discovered that the main reason for unsuccessful handover was that the BSC of company S did not deliver the Handover Command message to the MS, so the Huawei MSC timer waiting for MS access timed out. Figure 6-1 shows the signaling flow of a typical handover failure.
[Figure 6-1: message sequence chart between S MSC and Huawei MSC. Labels recovered from the figure: Handover Failure (MAP), Prepare Handover (MAP), Prepare Handover ACK (MAP), IAI (TUP), ACM (TUP), 7s Timer Overtime, Abort (MAP), CLF, RLG]
Figure 6-1 Signaling flow of typical handover failure
The signaling trace showed that for most handovers, the time from Prepare Handover (MAP) to receipt of Prepare Handover ACK (MAP) was about 2s. With the delay on interface A added, the whole process exceeded 2s in most cases. The small number of handover attempts completed within 1s succeeded in most cases. The speed of the handover process depended not only on the switch but also on the radio environment. Differences between the algorithms of different equipment could also cause handover delay.
3) The RACHBT parameter of the BSC from company S (the minimum interval of service channel handover, corresponding to the Huawei BSC parameter "minimum interval of channel handover") was changed from 3s to 5s, and the problem was resolved. (The Huawei BSC parameter is 4s.)
Figure 6-2 shows the signaling flow of a successful handover.
[Figure 6-2: message sequence chart between S MSC and Huawei MSC. Labels recovered from the figure: Prepare Handover (MAP), Prepare Handover ACK (MAP), IAI (TUP), ACM (TUP), Handover Complete, CLF, RLG, Handover Detect, ANN]
Figure 6-2 Signaling flow of successful handover
4) After the problem was resolved, the handover success rate in both directions reached about 80%.
[Recommendation]
Inter-MSC handover involves the BTS, BSC and MSC of both parties, so the analysis is difficult and there are many possible problems. They can all be resolved as long as the possible causes are analyzed and excluded one by one.
IV. A signaling matching problem leads to handover failure.
[Description]
A BSC office (1AM+2BM) was newly established and connected to a Huawei MSC60. The BSC controlled 50 BTSs. Handover among the cells within the BSC was normal, but outgoing BSC handover to the BSC connected to the DX2000 (MSC) of supplier N was unsuccessful. Handover from the BSC of supplier N to the Huawei BSC was successful, and the subsequent handover back to the BSC of supplier N was also successful.
[Analysis]
1) The unsuccessful handover might be due to a data configuration problem at the BSC side, e.g. errors in the external cell description, the cell adjacent relationship, or the BA1 and BA2 table descriptions.
2) It might be due to a data configuration problem in the cell description in the MSC60 above the Huawei BSC.
3) As inter-office handover between the Huawei MSC and the DX2000 (MSC) of supplier N was involved, it might be due to an error in the external cell description, on the Huawei BSC, of the BSC connected to the DX2000.
[Handling Process]
1) The engineer checked the possible causes at the Huawei BSC side: the external cell description data table, the cell adjacent relationship table, the BA1 (BCCH) table and the BA2 (SACCH) table. No error was found, and the foreground and background data were consistent.
2) The other party was asked to check whether the external cell description data for the Huawei BSC cells were correctly set in its BSC. They were correct.
3) Both parties were asked to confirm that the MSC data were correctly set. Both confirmed they were correct.
4) The signaling at interface A was traced. It was discovered that when a Huawei BSC cell initiated a forced handover to a cell of the other party, there was only the handover request but no Handover Command delivered by the upper-level Huawei MSC, although the CGI of the destination cell was correctly reported in the handover request. The signaling was completely correct when a forced handover was initiated from the cell of the other party to the Huawei cell, and that handover was successful. It could basically be concluded that there was no problem with the data configuration and that the problem might lie in the signaling flow between the MSCs.
5) According to the inter-office handover flow, when the Huawei BSC initiated a forced outgoing BSC handover to the BSC of supplier N, the Huawei BSC first sent a handover request to the upper-level MSC60, which sent Perform Handover to the DX2000. The CGI of the destination cell was included in this message. The VLR of the other party allocated a Handover Number and returned it to the Huawei MSC60 through Radio Channel ACK. If the signaling flow was normal, MSC60 sent an IAM to the DX2000, which returned an ACM to MSC60.
After the session was established, MSC60 delivered the Handover Command to the Huawei BSC and the MS could then be handed over to the BSC of the other party.
6) The MAP messages between the MSCs were traced with an analyzer. It was discovered that MSC60 did not send an IAM after it received the Handover Number message from the other party, so the signaling flow was terminated. When this message was checked, it was discovered that the DX2000 had found the CGI of the destination cell.
7) When the Handover Number from the other party was analyzed, it was discovered that the number was sent in the format 130% % % % % % % % % without 86 added in front. As the Huawei equipment did not accept this format, the handover failed.
8) After coordination with supplier N to add 86 before the handover number, the problem was resolved.
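The number format mismatch in this example can be illustrated with a small sketch. It is not the logic of either vendor's equipment: the function name is hypothetical, the sample number is made up, and the rule "prepend 86 unless it is already there" is a simplifying assumption that only works when national numbers never begin with the country code digits.

```python
def normalize_handover_number(number: str, country_code: str = "86") -> str:
    """Return the handover number with the country code in front.

    Assumption: a national-format number never starts with the country
    code digits, so a simple prefix test is enough for this illustration.
    """
    digits = number.strip()
    if digits.startswith("+"):
        digits = digits[1:]
    if not digits.isdigit():
        raise ValueError(f"not a digit string: {number!r}")
    if digits.startswith(country_code):
        return digits  # already in the international format
    return country_code + digits


if __name__ == "__main__":
    # "13000000000" is a made-up sample in the national format.
    print(normalize_handover_number("13000000000"))
    print(normalize_handover_number("8613000000000"))
```

In the case described above the fix was made on the sending side by supplier N; the sketch only shows what the receiving side expected to see.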
V. An equipment matching problem between different suppliers leads to low outgoing BSC handover success rate.
[Description]
In an office, an independently networked Huawei GSM1800 network worked in dual frequency mode with the GSM900 networks of supplier A and supplier B. After the reselection and handover data were completed for both parties, the traffic statistics showed that the dual frequency handover success rate was low: the handover success rate from GSM1800 to GSM900 was only about 60% to 80%, while that from GSM900 to GSM1800 was high.
[Analysis]
When there is a problem with interconnection with equipment from other suppliers, the parameters and details of the other party should be obtained in time, e.g. whether Phase 2+ and EFR are supported.
[Handling Process]
1) The messages on interface A and interface E were analyzed with a signaling analyzer. It was discovered that after the GSM1800 BSC sent the Handover Required message, the GSM1800 MSC responded with the handover REJECT message and declined the handover.
2) On the corresponding interface E (GSM1800 MSC to GSM900 MSC), the GSM1800 MSC sent Prepare Handover to the GSM900 MSC, which responded with an Abort message.
3) As the handover success rate from GSM900 to GSM1800 was high, the messages in the two directions were compared. In the Prepare Handover message sent from the GSM900 MSC to the GSM1800 MSC, the voice version provided was full rate version 1. In the Prepare Handover message sent from the GSM1800 MSC to the GSM900 MSC, the voice versions provided were full rate versions 1 and 2 and half rate version 1, which belong to Phase 2+. The MSC of supplier A did not accept these versions, so the handover failed.
4) Only the full rate version 1 was selected by modifying the circuit pool table of interface A in the MSC data. After the data were loaded, it was discovered that the voice versions provided in the Prepare Handover message from GSM1800 to GSM900 were all the full rate version 1 and 2.
The dual frequency handover success rate increased greatly.
6.3.2 BSC Problems
I. Incorrect CGI leads to low handover success rate.
[Description]
The MSC of an office was equipment of supplier M, while the BSC and BTS were Huawei equipment. When the traffic statistic indices of a day were observed, it was discovered that in "Inter-cell Handover Performance Measurement" the "success rate of inter-cell handover" was very low, 73.12%, for one cell (cell 24 of No.3 module of the BSC) in the period 10:00 to 11:00. This was mainly because the outgoing cell handover success rate for the cell 46000****OCFB was very low, with 10 handover failures.
[Analysis]
The main reasons for handover failure between cells are as follows:
1) unreasonable handover data configuration
2) equipment problem (individual TRX damaged)
3) congestion
4) interference
5) clock problem
6) coverage
7) uplink/downlink imbalance
Handover flow:
1) All BCCH frequencies of the adjacent cells in the "BA2 Table" are delivered to the MS through system message type 5.
2) The MS reports to the BSS, in the measurement report, the BCCH frequencies, BSIC and level values of the service cell and of the 6 adjacent cells with the strongest levels.
3) After the measurement report is preprocessed, the BSC determines the module, cell number and CGI of each cell by looking up the BCCH frequency and BSIC in the "Cell Adjacent Relationship Table" and the "Cell Description Data Table" (external cell description data table). If an adjacent cell is not configured in the "Cell Adjacent Relationship Table", its information cannot be indexed and handover to it cannot be initiated. If 2 adjacent cells have the same frequency and BSIC, both are indexed to the first adjacent cell, and handover failure results. If the active BCCH frequency of cell A is the same as that of a TCH in cell B, which has the same BSIC as cell C, an asynchronous handover access on a time slot of this TCH (a time slot aligned with the active BCCH time slot in cell A) may be mistakenly decoded by cell A as a random access of its own. For instance, if the MS retransmits the handover access to cell B in this time slot several times for some reason (e.g. handover failure), cell A may experience SDCCH congestion and assignment failures.
4) The BSC executes the handover decision flow (completed in GLAP), such as the basic ranking of cells.
Once it finds a suitable destination cell, it sends the handover request information with the destination cell CGI to the GMPU of the BSC. The GMPU looks up the number of the module where the cell is located in the "Cell Module Information Table".
5) The GMPU sends the handover request to this module and counts an "Outgoing Cell Handover Request".
6) If the CGI is not in the "Cell Module Information Table", the BSC regards the destination cell as an external cell and sends the CGIs of the destination cell and the service cell to the MSC in the handover request.
7) The MSC first looks for the cell matching the destination cell CGI in the "Location Area Cell Table", determines the "Destination Signaling" point of this cell, i.e. the BSC, and sends the handover request message to this BSC.
8) If the CGI of the destination cell is not in the "Location Area Cell Table", the MSC looks it up among the adjacent cells. If it is found, the handover request message is sent to the MSC concerned and then to the BSC.
[Handling Process]
1) There were no all-busy counts in the traffic statistics, so congestion could be excluded.
2) "TCH Performance Measurement" showed interference only in interference band 2, so handover failure due to interference could be excluded.
3) The board status checked in the "BTS Maintenance System" was normal, so equipment problems could be excluded. Right click TMU-check the board-close and double click BAM-WH long message. The final digits were "12-01-", showing that the clock was normal.
4) There were many outgoing BSC handover failures. The engineer checked the BA2 table, the cell adjacent relationship table, the external cell description data table and the location area cell table, and found neither inconsistent data nor cells with identical frequency and BSIC. On further checking, the engineer discovered that the CGI of this cell in the external cell description table was incorrect.
5) The CGI was corrected. After setting, the data were sent to all modules. The command character was configured for all the cells taking this external cell as an adjacent cell, i.e. the handover data were configured. The indices were observed an hour later and all was normal.
II. BSIC modification leads to low handover success rate.
[Description]
When the traffic statistics were checked, it was discovered that the handover success rate in some cells was low. Detailed analysis showed that the outgoing cell handover of these cells was normal but the number of successful incoming cell handovers was 0.
[Handling Process]
1) It was checked whether there were cells with identical frequency and BSIC among the adjacent cells of the failed cell. None was found, so this reason was excluded.
2) It was checked whether the BCCH of the failed cell had been modified, and it was discovered that not all adjacent cells of the failed cell had been set. However, according to the records, the frequency of the failed cell had never been modified, so this reason was excluded.
3) It was checked whether the failed cell suffered severe interference. The traffic statistics showed that the interference bands were all normal, the failed cell had few call drops, and there were few handovers caused by bad conversation quality. Moreover, interference alone could not explain a complete absence of successful incoming cell handovers. So this reason was also excluded.
4) It was checked whether the TRX worked normally and the channels were observed, but no problem was found.
5) The data of the failed cell were checked and it was discovered that the BSIC had been modified. Possibly not all adjacent cells had been dynamically set. The BSC locates the destination cell through BCCH and BSIC when it sends a handover request, so when either of them is modified, all adjacent cells of the modified cell must be updated.
6) After all adjacent cells of the failed cell were dynamically set, the traffic statistics were observed the next day and incoming cell handover was normal.
III. Identical BCCH and BSIC lead to failures of TCH occupation and incoming cell handover.
[Description]
The traffic statistic indices of an office were observed and it was discovered that in "TCH Performance Measurement" there were many "assignment occupation failure times (all)" for 2 cells (cells 7 and 8 of No.2 module of the BSC) in a period of time: 89 and 61 respectively. But "TCH call occupation failure times" were 0.
[Handling Process]
1) "TCH assignment failure times" consist of "TCH call occupation failure times" and "TCH assignment failure times in handover". As "TCH call occupation failure times" were 0, all TCH assignment failures occurred in handover.
2) The "incoming cell handover performance measurement" of the 2 cells was registered with a 15-minute period to find out from which adjacent cells the failed handovers came. Over several periods, all handover failures were from one specific cell (CGI=********1768) to the 2 cells, and they were not caused by congestion. So it was inferred that the failures were caused by handover to the wrong cell due to cells with identical frequency and BSIC in the data.
3) During a conversation, the MS measures the level of the frequencies of the adjacent cells specified in the BA2 table, which the system delivers over SACCH, and reports the measurement results to the BSC. Based on the reported BCCH and BSIC, the BSC finds the module number and cell number of the related adjacent cell in [Cell Description Data Table]. It then confirms that the cell found is geographically and logically adjacent to the service cell by checking the module numbers and cell numbers of the adjacent cells of the service cell in [Cell Adjacent Relationship Table]. Finally, it finds in [Cell Module Information Table] the module to which the handover request is to be sent, using the confirmed CGI (the CGI of the adjacent cell that the system recorded when it looked up [Cell Description Data Table]). This completes the lookup of adjacent cell information and routing.
4) Based on the process above, it could be judged that there were cells with identical BCCH and BSIC in [Cell Description Data Table], and that these cells were adjacent cells of the same cell.
5) Following the table lookup flow above, the module numbers and cell numbers of all adjacent cells of the cell with CGI=********1768 were taken from [Cell Adjacent Relationship Table], and the corresponding BCCH and BSIC were then looked up in [Cell Description Data Table]. It was discovered that cell 7 of No.2 module and cell 27 of No.3 module had identical BCCH and BSIC, as did cell 8 of No.2 module and cell 36 of No.5 module. After the reason was found, the BSIC of the cells was modified, which minimized the amount of data to be changed. After the data were set, the related indices returned to normal.
[Recommendation and Conclusion]
1) When the network capacity is very large, cells with identical BCCH and BSIC are unavoidable. To reduce the probability of error, frequency and BSIC must be planned carefully during network planning; this is important for guaranteeing network quality.
2) If a cell has adjacent cells with identical BCCH and BSIC, the success rate of handover from that cell to those adjacent cells will surely be 0. In the traffic statistics, the "Incoming Cell Handover Performance Measurement" of the affected cell shows that successful incoming handovers from the specified cell are always 0 (and the failure reason is not congestion). There are also "TCH occupation failure times (all)", while "TCH call occupation failure times" are 0.
3) Being familiar with the flow of handover and that of system table checking is the precondition for quick decision and location of handover failure. IV. Cell BCCH modification leads to lots of incoming handover failure and SDCCH congestion. [Description] 1) There was the adjacent frequency interference for a cell due to the frequency planning, resulting in high call drop and bad conversation quality. So the data management console modified and dynamically set the active BCCH frequency. The next day the traffic statistic showed there were lots of BSC inner-incoming handover and BSC inter-incoming handover in cell 1, but most were unsuccessful. Meanwhile the cell 1 SDCCH was seriously congested (max. congestion rate was 75% for being busy). 2) When the channel state was observed in the remote maintenance of BTS, it was discovered there was no SDCCH occupation before lots of TCH Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 6 Troubleshooting for Handover
occupation in cell 1 (such TCH occupation should be BSC inner or inter incoming handover to TCH). The TCH occupation time was mostly 3s to 5s, and only rarely 10s. Sometimes 3 to 4 TCHs became "A" at the same time and quickly returned to "I". Although SDCCHs were seldom occupied, it was very common for 8 SDCCHs to change from "I" to "A".
[Handling Process]
1) After the abnormality of cell 1 was found, the operations of that day were reviewed against the traffic statistics. It was discovered that the frequency planning had been modified and dynamically set just before the incoming handover failures and the SDCCH congestion rate increased. So the problem was probably caused by the data setting.
2) To confirm that the congestion was not caused by a hardware problem, the equipment in this BTS was checked and verified, and the channel state of cell 1 was observed remotely. The problem was finally located in the data modification and setting. After the frequency of the cell was modified, this cell and another cell had the same BCCH and BSIC. When BSC made the handover decision based on the BCCH and BSIC of adjacent cells reported by MS, it could direct a handover request intended for the other cell to this cell. Even if no power was transmitted from this cell, incoming handover requests would still be generated. After such an incoming handover, the channel quality was found to be very bad and improved slightly after a while, which is why the TCH occupation time of incoming handovers was very short (3s to 4s). When the traffic of the other cell was heavy, the number of incoming handover requests to this cell increased, resulting in lots of incoming handover failures. This cell also received lots of immediate assignment requests and assigned SDCCHs, resulting in serious SDCCH congestion.
3) After the active BCCH frequency of the cell was modified and the data were reset, the problem was resolved.
The number of incoming handovers of cell 1 returned to its normal level.
V. CGI error of cell description data table leads to no incoming handover to this cell.
[Description]
The handover of a GSM network was abnormal. When MS moved from cell A to cell B, the signal of cell B was much stronger than that of cell A, but no handover was triggered. Only after MS had crossed the coverage area of cell B into the coverage area of cell C did a handover from cell A to cell C take place.
[Analysis]
If a cell can act as the serving cell and hand over normally to other cells, but cannot be handed over into, check whether the CGI, BSIC, BCCH frequency number, etc. of this cell in [Cell Description Data Table] are correct. This symptom is normally caused by incorrect setting of those data.
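The two preceding cases were both traced to adjacent cells that share the same BCCH and BSIC. A check for this condition can be sketched as follows. This is only an illustration: the dictionaries stand in for [Cell Adjacent Relationship Table] and [Cell Description Data Table], the function name is hypothetical, and the BCCH/BSIC values are invented.

```python
from collections import defaultdict


def find_bcch_bsic_conflicts(adjacency, cell_data):
    """For each serving cell, list groups of neighbours sharing one (BCCH, BSIC)."""
    conflicts = {}
    for serving, neighbours in adjacency.items():
        groups = defaultdict(list)
        for cell in neighbours:
            # cell_data maps a cell to its (BCCH, BSIC) pair
            groups[cell_data[cell]].append(cell)
        clashing = [sorted(cells) for cells in groups.values() if len(cells) > 1]
        if clashing:
            conflicts[serving] = sorted(clashing)
    return conflicts


# Hypothetical data: cells are (module, cell) pairs, values are (BCCH, BSIC).
cell_data = {
    (2, 7): (45, 31), (3, 27): (45, 31),
    (2, 8): (52, 12), (5, 36): (52, 12),
    (1, 3): (60, 5),
}
adjacency = {"serving": [(2, 7), (3, 27), (2, 8), (5, 36), (1, 3)]}

conflicts = find_bcch_bsic_conflicts(adjacency, cell_data)
print(conflicts)
```

Any group reported for a serving cell is a pair (or more) of neighbours that BSC cannot tell apart from the measurement report alone, so one of them needs a different BCCH or BSIC.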
[Handling Process]
1) The test MS was locked to the BCCH frequency of cell B. Dialing was normal, and forced handover to any adjacent cell was possible.
2) The test MS was then locked to the BCCH frequency of an adjacent cell of cell B, a call was set up, and a forced handover to cell B was attempted, but the handover did not take place. The drive test software showed that no handover command was sent by the network.
3) According to the handover flow, MS detects the signal of adjacent cells and reports it to BSC in the measurement report. BSC makes the handover decision based on the measurement report. If the handover condition is met, the traffic channel of the destination cell is activated and the handover command is sent to MS.
4) The signal of cell B was obviously stronger than that of cell A, so the handover condition (PBGT handover threshold 70) was certainly met, yet no handover command was sent. This showed that an error occurred while activating the traffic channel of the destination cell.
5) The adjacent cell information reported by MS contains only 2 items: frequency and BSIC. Based on these, BSC looks up the cell that has an adjacent relationship with the serving cell in [Cell Description Data Table] and [External Cell Description Data Table], and finds the CGI of the destination cell. If the destination cell is an external cell, the handover request, carrying the CGI of the destination cell as its parameter, is sent to MSC. The channel is activated by the BSC in which the destination cell is located, and the handover command is then sent to MS by the serving cell. If the destination cell is inside the BSC, the module number and cell number of the destination cell are determined from the CGI, the channel is activated, and the handover command is sent.
6) If cell B could not activate the channel when it was the destination cell, the cause might be an error in [Cell Description Data Table]. In that case the BSC in which the destination cell is located cannot find the destination cell or activate the channel, and the serving cell does not send the handover command.
7) When [Cell Description Data Table] was checked, an error was found in the CGI of cell B. After the CGI was modified and dynamically set, the handover was normal.
6.3.3 BTS-related Problem
I. Too high busy threshold of RACH of BTS2.0 leads to low handover success rate.
[Description]
The handover success rate of the BSC in area A had long been very low, at about 83%. Normally the overall handover success rate of a BSC is above 90%.
[Handling Process]
1) Analysis of the cell with very low handover success rate showed that, in the TCH performance statistics, the number of TCH assignment requests was much larger than the number of successful assignments. It could be concluded that BSC had sent the handover request to BTS, but the success rate of TCH occupation was very low.
2) The BTS in which the cell was located was a BTS2.0, so it was suspected that the RACH busy threshold of the BTS2.0 was set too high, making TCH occupation difficult and also affecting access to the new channel during handover.
3) After the RACH busy threshold was decreased from 8 to 5, the overall handover success rate of the BSC increased to about 90%.
II. Uplink/downlink unbalance due to CDU failure leads to low handover success rate.
[Description]
The BSC inner incoming cell radio handover success rate of cell 2 of a BTS was very low, at 10% to 30%.
[Analysis]
A low BSC inner incoming cell radio handover success rate is normally caused by data problems (e.g. CGI error in [Cell Description Data Table], missing measurement frequencies in BA1 and BA2, adjacent frequency interference, etc.). It can also be caused by a blind coverage area with high traffic or by difficult access for MSs with a weak uplink.
[Handling Process]
1) The hardware check showed that the state of the BTS maintenance board was normal. The channel state showed that TCHs were occupied for long periods but only a few times, so it could be judged that there was no problem with the conversation data.
2) The handover data were checked and found to be normal.
3) "Incoming Cell Handover Performance Measurement" of this cell was registered in the traffic statistics, and it was discovered that incoming handover from all adjacent cells was poor.
4) In the drive test, at a place 2km away from BTS, handover was attempted frequently but failed, and MS returned to the original cell. When a handover occasionally succeeded, the call dropped immediately. The downlink level was about -85 dBm at the time of handover. The dialing test with frequency lock was repeated dozens of times. When the test MS was the caller, all tests failed; when it was the called party, all tests passed, but calling out was unavailable.
5) It was therefore inferred that the loss of the CDU uplink channel was excessive or the BTS cabinet-top wiring was incorrect, resulting in a weak uplink signal.
6) After CDU was replaced, the incoming handover success rate increased to over 95%.
III. BTS clock free oscillation leads to low handover success rate.
[Description]
A multi-module BSC was configured with 3 BMs. BM No.1 controlled 3 BTS30s. The handover success rate of the whole network had been about 95%, and decreased to 90% after running for some time.
[Analysis]
If the data have not been modified and the network handover success rate changes from normal to abnormal, it is normally because one or several BTSs are abnormal, which affects the handover success rate. The abnormal BTS can be found through the traffic statistics and the problem handled at that BTS.
[Handling Process]
1) Analysis of the traffic statistics showed that the handover success rates between the cells of the 3 BTS30s and their adjacent cells were generally low. When the working state of the 3 BTS30s was checked, their clocks were found to be in the free oscillation state, which caused the low handover success rate.
2) The 3 BTSs were connected to the same group of BIE, so this BIE was suspected to have failed and caused the BTS clocks to lose lock. This group of BIE was exchanged with another BIE, but the problem remained.
3) After the corresponding HW wires of this group of BIE were replaced and the network board was switched over, the BTS clocks still lost lock. BSC could therefore be excluded as the cause of the loss of lock of the 3 BTS clocks.
4) The Tx 2MHz tributary and TMU were replaced, but the BTSs still could not lock the clock.
5) As there was no frequency meter or other test instrument in the field, the frequency offset of the 2MHz and 13MHz clocks of BTS could not be tested, and the DA value could not be adjusted precisely. The TMUs of all normally working BTSs were checked and their DA values recorded. A DA value with which other TMUs could lock normally was then set in the TMUs of the 3 BTS30s.
After several attempts, the clocks of the 3 BTSs were locked.
6) After a day of observation, the 3 BTS30s were found to run normally, and the handover success rate increased to about 90%.
[Recommendation and Conclusion]
If there are no test instruments in the field, set the DA value of BTS manually and observe whether BTS locks. Locking normally takes over 30 minutes, so if the DA value needs to be adjusted several times, the efficiency is very low. To speed up the adjustment, set the new DA value and check TMU 2 to 3 minutes later. If the discriminator difference displayed at that moment is the same as the set DA value, this DA value cannot finally lock TMU; if it is not the same, TMU can finally be locked.
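The quick check recommended above can be written down as a small decision rule. The sketch below is illustrative only: the function names are hypothetical, the DA values and readings are invented, and `apply_and_read` stands for the manual step of setting a DA value on TMU and reading the discriminator difference 2 to 3 minutes later.

```python
def da_value_can_lock(set_da, discriminator_difference):
    """Rule from the text: a reading equal to the set DA value means no lock."""
    return discriminator_difference != set_da


def first_promising_da(candidates, apply_and_read):
    """Try candidate DA values in order; return the first that passes the rule."""
    for da in candidates:
        reading = apply_and_read(da)  # manual set-and-wait step, simulated here
        if da_value_can_lock(da, reading):
            return da
    return None


# Invented readings: the first two candidates echo the set value (cannot lock).
simulated_readings = {2050: 2050, 2080: 2080, 2110: 2097}
chosen = first_promising_da([2050, 2080, 2110], simulated_readings.get)
print(chosen)
```

A candidate that passes this check still has to be confirmed by waiting for TMU to report the locked state.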
IV. TRX performance deterioration leads to low success rate of incoming cell handover.
[Description]
The handover success rate of the high-traffic cell A was low (below 70%), but other indices were normal.
[Handling Process]
1) The traffic statistics of cell A showed that the traffic had stayed high since shortly after the last expansion, and that the handover success rate was low. The ratio of failed incoming cell handovers to failed outgoing cell handovers was 4:1. The low incoming handover success rate pulled down the handover success rate of the whole cell.
2) There were few "incoming cell failed handovers (no usable channels)", and most failed handovers were "incoming cell failed handovers (other reasons)", so congestion could be excluded as the cause.
3) The traffic statistic tasks "outgoing/incoming cell handover performance measurement" of cell A were registered, and the outgoing and incoming cell handover performance were observed separately.
4) The "outgoing cell handover performance measurement" task showed that most failed outgoing cell handovers were concentrated on a few specific destination cells. The cause might be congestion of the destination cells. "Incoming cell handover failure (no usable channels)" of these destination cells was registered, and it was discovered that the failed incoming cell handovers from cell A were basically all "incoming cell failed handovers (no usable channel)". The failed outgoing cell handovers of cell A were thus attributed to congestion.
5) The "incoming cell handover performance measurement" task showed that almost none of the incoming cell handover success rates was high (all were about 60%). Normally, when none of the handover success rates is high, the cause is bad frequency planning or external interference, resulting in bad network quality.
But because there was no quality problem with outgoing cell handover, frequency planning defects and external interference could be excluded. The call drop rate and interference band indices of cell A were all very low, which confirmed this conclusion.
6) The data check showed no cells with the same BCCH and BSIC. The traffic statistics also showed no adjacent cell with a very low handover success rate with cell A, so data problems were excluded.
7) The HF data were checked. TSC was consistent with BCC and there were no other problems with the data. Meanwhile the channel occupation of TCH carrier
was observed and found to be normal. So problems with the HF data could be excluded.
8) The handover-related parameters were checked and found to be basically consistent with those of other high-traffic downtown cells, so unreasonable parameters could be excluded as the cause.
9) The channel occupation was observed, and several SDCCHs were occupied on the BCCH carrier. This was normal for a high-traffic cell.
10) Closer observation of the channel occupation revealed a special case: 4 to 6 SDCCHs were occupied at the same time and then released at the same time. Such cases accounted for 60% of the observation time. Under normal conditions it is almost impossible for 4 to 6 SDCCHs to be occupied and released simultaneously. It was inferred that poor TRX performance caused the abnormal SDCCH occupation and release and the poor handover performance.
11) The data were checked. After it was confirmed that the "carrier mutual assistance" function was in use, the active BCCH carrier was blocked and replaced.
12) The carrier was unblocked, and the channel occupation and traffic statistic indices were observed. The channel occupation returned to normal, and SDCCHs were no longer occupied and released simultaneously. The handover success rate in the traffic statistics increased to 95% and the problem was resolved.
V. Improper antenna planning leads to low handover success rate.
[Description]
The handover success rate shown in the traffic statistics for the 3 cells of a BTS was very low; in particular, the success rate of handover from cell 1 to cell 2 and cell 3 was lower than 30%.
[Analysis]
A low handover success rate is generally due to assignment failure caused by a hardware board defect, handover data error or improper antenna planning.
[Handling Process]
1) The BTS hardware was checked; it was running normally and there was no related alarm.
The handover-related parameter settings were also normal. So problems with hardware and parameter setting could be excluded.
2) The BTS was located on the east side of a north-south road, 700m away from the road. The azimuth angles of the 3 cells were 0, 80 and 160 degrees, pointing respectively to the 2 directions of the road and to the open residential area in the east, and the down tilt angles of 2 cells were 7 degrees. The directions of the 3 antennas were too concentrated in the design. Only the coverage targets were considered, while the serious cell overlapping to the east of BTS was not taken into account. The west was covered only by the side lobes
and back lobes of the 3 cells. So when a user travelled along this section of road, the user was first under the coverage of cell 1. On the section of road west of BTS, the signals of the 3 cells were very weak and fluctuated a lot. The handover statistic time and duration were set very short, making handover very sensitive and resulting in frequent handover failure.
3) After the azimuth angles of the 3 cells were adjusted to 60, 180 and 350 degrees, the handover success rate of the 3 cells immediately increased to over 90%.
[Recommendation and Conclusion]
Both the coverage target and handover should be considered when planning antenna azimuth angles. The directions should be evenly distributed to avoid serious overlapping between cells or blind coverage areas that affect normal handover. Traffic absorption cannot be achieved this way and should instead be guaranteed through carriers.
6.3.4 Others
I. Locate the problem with low incoming BSC handover success rate through the traffic statistic.
[Description]
In a dual-band network, the GSM900 network was supplied by supplier S and the GSM1800 network by Huawei. During a major data adjustment (including modification of the frequency and CGI of some cells), both networks were reloaded with data, after which the incoming BSC handover success rate of GSM1800 decreased.
[Analysis]
The handover threshold parameters did not change with the data loading, and the radio environment was basically the same before and after, yet the incoming BSC handover success rate decreased greatly. So the problem was related to an error in the configuration data.
[Handling Process]
1) The traffic statistics before and after the data modification were compared, and it was discovered that the following items had changed. See Table 6-1 for the data comparison.
Table 6-1 Traffic statistic indices before/after data modification

Item                                                 Before data modification   After data modification
Incoming BSC handover request times (to GSM1800)     1885 times                 1613 times
Incoming BSC handover success times (to GSM1800)     1684 times                 1460 times
Incoming BSC handover failure times                  185 times                  216 times
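The figures in Table 6-1 can be cross-checked with a few lines of arithmetic. The sketch below computes, for each period, the difference between request times and success times and compares it with the failure times. The helper name is hypothetical; the numbers are those of the table.

```python
stats = {
    "before": {"request": 1885, "success": 1684, "failure": 185},
    "after": {"request": 1613, "success": 1460, "failure": 216},
}


def excess_failures(request, success, failure):
    """Failures beyond (request - success); a positive value means some failures
    were never counted as requests received by the destination cell."""
    return failure - (request - success)


results = {}
for label, s in stats.items():
    gap = s["request"] - s["success"]
    results[label] = (gap, excess_failures(**s))
    print(label, gap, results[label][1])
# before 201 -16
# after 153 63
```

Before the modification the failure count (185) stays below the request/success gap (201); afterwards it exceeds the gap (153) by 63, which is the anomaly examined in this case.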
From the data above, it was discovered that before the data modification, the incoming BSC handover request times (to GSM1800) minus the incoming BSC handover success times (to GSM1800) was larger than the incoming BSC handover failure times. After the data modification, however, the incoming BSC handover request times (to GSM1800) minus the incoming BSC handover success times (to GSM1800) was smaller than the incoming BSC handover failure times. According to the meaning of these items in the traffic statistics, the difference between the request times and the success times represents the handovers that failed between the moment the GSM1800 destination cell received the handover request and the moment the handover success message was reported. The figures after the data modification therefore showed that many incoming BSC handover requests were discarded before the GSM1800 destination cell received them.
2) After the data modification, there was no alarm for the BSC equipment. There were 2 possible reasons for losing the incoming BSC handover requests.
- There was an error in the GSM900 network. The CGI of the destination cell that it sent could not be found in the GSM1800 system, so the message was discarded.
- There was an error in the GSM1800 network data. Due to a CGI error in the module message table, message forwarding between modules in BSC could not find the destination cell, so the message was discarded.
3) For the second reason, a CGI error in the cell module message table would affect all incoming cell handovers of the cell concerned. The BSC inner incoming cell handover attempts and BSC inter incoming handover requests in the cell performance measurement were checked. Both indices were 0 in all 3 cells of BTS A, so it was judged that the CGI of the 3 cells of this BTS was configured incorrectly in the module message table.
4) The CGI of the cells with the configuration error was modified and the whole table was set.
5) The traffic statistic data were checked. All indices were normal and the problem was resolved.
II. Locate the handover problem through the radio handover success rate and handover success rate difference.
[Description]
The radio handover success rate and the handover success rate were close to each other and both low.
[Analysis]
The difference between the number of handovers counted for the radio handover success rate and that counted for the handover success rate was small, which showed that the procedure in Figure 6-3 was successful.
[Figure: signaling sequence among MS, Source BSC, MSC and Target BSC, showing Radio Tx Signal Measurement, Handover Required, Handover Request, Handover Request ACK and Handover Command]
Figure 6-3 Flow of part handovers
The handover success times and the number of handovers at the 2 statistic points differed greatly. This showed that MS failed to access the destination cell after it received Handover Command. That is, the destination cell did not receive Handover Complete, and the source cell did not receive the Clear Command delivered by MSC. The possible reasons for the failure at the Um interface include the following.
- The uplink signal of the destination cell was weak, so MS could not access it.
- The destination cell was not the real one, but a false destination with the same BCCH and BSIC.
- The CGI of the destination cell in the external cell table of BSC did not correspond to the BCCH and BSIC, resulting in a wrong destination cell CGI. The channel was then activated in another cell, which MS could not access because it was very far away.
[Handling Process]
The data were checked and the case of identical BSIC was corrected, so that the external cell data contained no error. The problem was resolved.
III. Sometimes the incoming handover may be to the cell which is not allowed by NCC.
[Description]
Equipment of Huawei and of supplier D was used in a place. The Huawei BSS was attached to an MSC of supplier E. The Flower networking was used. There were many handover relationships between the equipment of the 2 suppliers. According to the traffic statistics of the Huawei BSS, the general handover request times from cell E to cell Huawei were over 40% when busy, with a handover success rate of about 80%, while the general handover request times from cell Huawei to cell E were 800 when busy.
[Handling Process]
1) Based on the test result, it was presumed that "NCC Permitted" of cell E might be set incorrectly.
2) When the data of cell E were checked, it was discovered that this parameter was set to 7 in all cells E.
3) So "NCC Permitted" of 6 was added for a test, and it was discovered that the BSIC of cell Huawei could be decoded when the serving cell was cell E.
4) After this value was changed, the number of handovers from cell E to cell Huawei increased from over 30 to over 600.
IV. Excessively high min. downlink power of candidate cell leads to handover unavailability.
[Description]
The BTS of Huawei and that of supplier A in a place provided continuous coverage, but handover to the cell of supplier A could not succeed.
[Handling Process]
1) The adjacent cell data configuration on both sides was checked and no error was found.
2) The test showed that the Rx level in the handover area was low, at about -92 dBm, but forced handover succeeded.
3) As the Huawei BTS was configured as S(1/1/1), the uplink/downlink unbalance was serious. To improve the handover success rate, the min. downlink power of candidate cell had been set to -95 dBm. But within the handover zone between this cell and supplier A, the actual downlink level was low. Because the min. downlink power of candidate cell was set too high, this cell could not pass the M regulation, and handover to this cell was unavailable.
4) The min. downlink power of candidate cell was adjusted to -100 dBm, and the handover then became normal.
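The effect of the min. downlink power of candidate cell in the last case can be illustrated with a deliberately simplified check. The sketch assumes that a candidate cell is admitted only when its measured downlink level is not below the configured minimum; the real M regulation may include further terms. The measured level of -97 dBm is an invented value, not taken from the case, and the function name is hypothetical.

```python
def passes_min_downlink_level(rx_level_dbm, min_dl_level_dbm):
    """Simplified candidate filter: measured level must reach the configured minimum."""
    return rx_level_dbm >= min_dl_level_dbm


measured_dbm = -97  # hypothetical level in the handover zone
outcome = {
    threshold: passes_min_downlink_level(measured_dbm, threshold)
    for threshold in (-95, -100)  # the two settings used in the case
}
print(outcome)
```

With the invented level, the candidate is rejected at the -95 dBm setting and admitted at -100 dBm, which matches the direction of the fix described in the case.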
Chapter 7 Troubleshooting for Congestion
7.1 Overview
Congestion is the conflict that results from a shortage of resources, whatever the reason. The resources involved in congestion at the BSS side fall into two classes: wire resources and radio resources. For example, A-Interface circuit congestion and Abis-Interface circuit congestion are wire resource congestion. Wire resource congestion mainly refers to A-Interface congestion, which may be accompanied by radio signaling channel congestion. Radio resource congestion mainly comprises congestion of the various channel types, such as SDCCH, TCH and AGCH. This document introduces the fundamental knowledge, phenomena and analysis of congestion, and its troubleshooting.
7.2 A-Interface Congestion
7.2.1 Fundamental Knowledge
A-Interface is the interface between BSC and MSC. A-Interface congestion is therefore the most critical of all congestions, and it affects all BTSs of the BSCs served by this A-Interface. A-Interface congestion shows the following symptoms:
1) Subscribers cannot make calls in most cases.
2) The radio signaling channels of all BTSs are congested simultaneously. The channel state observed from the BTS maintenance system shows that all SDCCHs are busy and do not appear to be released within a short time, while the traffic channels are rather idle.
3) The BSC CPU occupancy ratio stays high and can reach 80% or more in a very short time.
4) All or most A-Interface SS7 signaling links are in the interrupted state.
Caution: If an A-Interface SS7 signaling link flashes on/off or stays interrupted for a long period, or if SDCCH dynamic adjustment, channel overload or a high CPU occupancy ratio occurs, stay calm and solve the problem according to the guide provided in this document. If the methods are correct, the system can be recovered smoothly.
7.2.2 Troubleshooting Process
I. Check whether the traffic statistic tasks related to A-Interface signaling performance are being measured
There are three related traffic statistic tasks: MTP Measurement Function, SCCP Measurement Function and BSC Measurement Function. The BSC Measurement Function task measures the items related to paging, such as "page requests from MSC", "page request times" and "PCH overload times". It supplements the observation of the A-Interface signaling flow. After the problem occurs, a BSC Measurement Function task with a statistic cycle of 15 minutes should be registered. This task need contain only the statistic items related to paging.
Note: BSC should always have an MTP Measurement Function task and an SCCP Measurement Function task registered. All statistic items of these two functions should be registered, with a statistic cycle of 15 minutes at most. The object of the MTP Measurement Function task is the SS7 links of all modules. It is recommended that a BSC Measurement Function task with a statistic cycle of 15 minutes be registered even when the system is running normally. This task need contain only the statistic items related to paging, so as to facilitate analysis when a problem occurs.
II. Judge whether A-Interface signaling overloads through traffic statistics
In many cases, link flashing on/off or long link interruption is caused by downlink overload of A-Interface signaling. Observe the MTP Measurement Function task of BSC. If an A-Interface signaling link has been broken for a long time, observe the statistic results measured before it broke. If the Signaling link receiving rate (%) of some or all SS7 signaling links exceeds 40%, MSC is sending too many messages (generally a large number of paging messages) to BSC, which overloads the A-Interface SS7 signaling downlink.
The statistic result of the MTP Measurement Function task reflects the average MTP link seizure over a statistic cycle. Repeated bursts of link congestion might not push the average beyond the standard range. But if the Signaling link receiving rate (%) is 2 to 3 times the value measured in the normal case, and frequent PCH overloads appear in the statistic result of the BSC Measurement Function task, the A-Interface SS7 signaling downlink load is probably abnormal.
If the Signaling link sending rate of some or all SS7 signaling links exceeds 40%, BSC is sending too many messages to MSC, which overloads the A-Interface SS7 signaling uplink. However, this case rarely happens.
The number of signaling link interruptions due to congestion can also be seen from other statistic items of MTP Measurement Function. A comprehensive analysis of the problem shows whether congestion occurs and where it occurs.
Note: 40% is a warning limit for the link load. If link overload is detected in routine maintenance, ways must be found to relieve it. When the system runs normally, it is impossible for the uplink to overload while the downlink does not. If both the uplink and the downlink overload, the number of links should be increased. If only the downlink overloads, inform the MSC maintenance personnel to handle the problem.
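The judgement rules above can be condensed into a small helper for reviewing exported MTP statistics. This is a sketch under assumptions: the function and parameter names are hypothetical, the rates are percentages averaged over one statistic cycle, and "frequent PCH overloads" is reduced to a yes/no flag that the operator decides on.

```python
LOAD_LIMIT = 40.0  # percent, the warning limit given in the text


def assess_link(rx_rate, tx_rate, normal_rx_rate=None, frequent_pch_overloads=False):
    """Classify one SS7 link from its receiving (downlink) and sending (uplink) rates."""
    downlink = rx_rate > LOAD_LIMIT
    uplink = tx_rate > LOAD_LIMIT
    if downlink and uplink:
        return "both overloaded: increase the number of links"
    if downlink:
        return "downlink overloaded: inform MSC maintenance personnel"
    if uplink:
        return "uplink only: not expected in normal running, recheck the data"
    # Bursts can hide in the average: compare with the normal-case receiving rate.
    if normal_rx_rate and rx_rate >= 2 * normal_rx_rate and frequent_pch_overloads:
        return "possible burst downlink overload"
    return "no overload indicated"


print(assess_link(55, 45))
print(assess_link(55, 10))
print(assess_link(30, 5, normal_rx_rate=12, frequent_pch_overloads=True))
print(assess_link(15, 5))
```

The burst rule uses the lower bound (2 times the normal rate) of the "2 to 3 times" range given in the text; a site may prefer a stricter factor.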
III. Troubleshooting for A-Interface signaling uplink overload
If the A-Interface signaling downlink is not overloaded, the uplink cannot be overloaded; even if the downlink is overloaded, uplink overload is unlikely. If uplink overload does occur, temporarily close some BTSs of the module where the overloaded link is located. Restart the closed BTSs only after the Signaling link sending rate falls below 40%.
IV. Troubleshooting for A-Interface signaling downlink overload
This problem has occurred in many cases. The direct cause of A-Interface signaling downlink overload is generally excessive paging volume. MSC may send an excessive number of pagings to BSC because of an MSC/VLR fault, or because SMC suddenly sends an excessive number of point-to-point messages. In this case, the BSC maintenance personnel should immediately contact the NSS maintenance personnel to locate the source of the problem at the NSS side and then take effective measures. After the NSS problem is solved and the A-Interface signaling link recovers, BSS recovers automatically and gradually within a certain period of time.
Suppose that the transmission of all BTSs of BSS has been interrupted, or that BSS service has been interrupted for a long time for another reason. Within 10 minutes after the interruption is cleared, a large number of messages are sent over A-Interface, which results in A-Interface signaling downlink congestion. In this case no action is needed and the system recovers automatically.
After the A-Interface signaling link recovers, you may trace the Abis-Interface signaling of a certain cell through the BSC maintenance system, or use a signaling analyzer to trace the Abis-Interface signaling of some cells. Normally some complete LA updating procedures or call procedures should be visible, which indicates that the BTS signaling link congestion is being released.
If no complete LA updating or call procedure is seen for a long time, the BTS is working abnormally and must be reset.
To speed up the congestion release for important BTSs, different methods apply under different conditions. If an important BTS has adjacent cells whose congestion has already been released, you may reset that BTS. During the reset, MSs waiting for LA updating and MSs in conversation can complete the necessary procedures through these adjacent cells. If this condition cannot be met, do not reset the BTS, because the reset will only delay the congestion release. In this case, you may unplug the E1s of some BTSs to speed up the recovery.
V. Measures for signaling link not overloaded
If this problem occurs, observe the statistics of the SCCP Measurement Function task. If CR messages (O) > CR messages (I) + CREF messages (I) by an obvious margin, and there are a large number of "SCCP no response from remote" alarms, the MSC signaling processing is probably faulty. In this case, immediately contact the MSC maintenance personnel to solve the problem.
VI. Actively check the running status of other BSSs
If there are other BSSs under the same MSC and in the same location area (LA), the on-site maintenance personnel should also find out their running status. Knowing the scope of the problem helps to locate it. If multiple BSSs have the same problem regardless of manufacturer, and no common operation has been performed on all the faulty BSSs, it can be concluded that the fault is at the NSS side. In this case, immediately report the fault to the NSS maintenance personnel.
7.3 Radio Channel Congestion
7.3.1 Fundamental Knowledge
Radio channel congestion refers to congestion of the radio channel resources alone; it affects individual BTSs rather than a large area. By channel type, radio channel congestion is divided into common channel congestion (such as PCH and AGCH) and dedicated channel congestion (such as SDCCH and TCH). Generally, if the LA is properly planned and the proportion between PCH and AGCH is proper, common channel congestion rarely occurs. This document therefore mainly describes troubleshooting for dedicated channel congestion, and hereafter "channel congestion" refers to dedicated channel congestion.
Channel congestion is judged mainly by checking the channel blocking rate in the traffic statistics system.
If the SDCCH or TCH blocking rate of a cell is obviously high, it can be concluded that congestion has occurred. Under normal circumstances, the busy-hour congestion rate does not exceed 5%. Congestion is mainly caused by the following:
1) Unreasonable LA planning
2) Terrestrial resource unavailable
3) Large traffic volume that requires capacity expansion
4) Burst traffic increases, for example at a remote railway station, at a festival gathering, or when short messages are sent intensively
5) TRX fault
6) Interference resulting in channel assignment failure
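The blocking-rate check described above can be sketched as follows. This is only an illustration under stated assumptions: the counter layout (a mapping from cell to channel type to a pair of blocked and attempted seizures), the function names and the sample numbers are all hypothetical; only the 5% busy-hour figure comes from the text.

```python
from typing import Dict, List, Tuple

# Normal busy-hour congestion ceiling quoted in the text.
BUSY_HOUR_LIMIT = 0.05

# Hypothetical layout: cell -> channel type -> (blocked attempts, all attempts)
Counters = Dict[str, Dict[str, Tuple[int, int]]]


def blocking_rate(blocked: int, attempts: int) -> float:
    """Blocked seizure attempts divided by all attempts (0.0 if no attempts)."""
    return blocked / attempts if attempts else 0.0


def congested_channels(counters: Counters,
                       limit: float = BUSY_HOUR_LIMIT) -> List[Tuple[str, str, float]]:
    """Return (cell, channel, rate) for every channel type above the limit,
    worst first."""
    hits = []
    for cell, channels in counters.items():
        for channel, (blocked, attempts) in channels.items():
            rate = blocking_rate(blocked, attempts)
            if rate > limit:
                hits.append((cell, channel, rate))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Made-up busy-hour counters for two cells.
sample = {
    "cell-1": {"SDCCH": (180, 3000), "TCH": (3, 300)},
    "cell-2": {"SDCCH": (30, 300), "TCH": (0, 120)},
}
for cell, channel, rate in congested_channels(sample):
    print(f"{cell} {channel}: {rate:.1%}")
```

A list sorted worst-first is a convenience choice so that the most congested cell is examined first.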
7.3.2 Analysis
I. Congestion due to unreasonable LA planning
See the example in 7.4.6 for details.
II. SDCCH and TCH congestion resulting from large traffic volume
Possible causes:
- The SDCCH and TCH traffic volume is larger than normal.
- The congestion results from a burst traffic increase, for example at festival gatherings or entertainment shows.
Diagnosis method:
Check the traffic volume of the congested cell through the BSC traffic statistics system.
Troubleshooting:
For a normal increase in traffic volume, capacity expansion is the only solution. For congestion caused by burst traffic, whether to expand capacity is the operator's decision. In addition, some measures can relieve the congestion. For example, if the congested cell has a suitable adjacent cell, you may enable the directed retry function to assign calls to channels of the adjacent cell.
III. SDCCH congestion resulting from burst traffic increase
Possible causes:
- This situation mostly occurs along railways, especially at railway stations near tunnels. Because these areas are remote, the configured capacity is limited. When a train passes or stops at the station, a large number of MSs that have lost the network perform LA updating, which results in SDCCH congestion. SDCCH congestion is also likely when short messages are sent intensively. In both cases only SDCCH signaling is used, so TCH congestion is not caused. Moreover, because the SDCCH is congested, the TCH traffic volume is rather small. Unreasonable LA planning may also cause SDCCH congestion through excessive LA updating.
- Recovery from a transmission interruption may cause the same problem. In this case, solve the transmission problem first.
Diagnosis method:
Through the BSC traffic statistics system, check whether the SDCCH blocking rate and traffic volume are too high while the TCH traffic volume is normal or slightly low and TCH channel requests are still present. Check whether the LAC configuration is reasonable.
Troubleshooting:
This problem is hard to avoid, but some measures can relieve the congestion, such as configuring more SDCCHs and enabling dynamic adjustment between SDCCH and TCH. If the LAC configuration is unreasonable, modify it.
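The distinction between congestion from large traffic volume (case II) and SDCCH congestion from burst signaling (case III) can be sketched as a simple classifier. This is a rough sketch, not a procedure from the manual: the class `CellCounters`, the function `congestion_pattern` and its return strings are hypothetical, and the 5% busy-hour figure from 7.3.1 is assumed to serve as the threshold for "too high".

```python
from dataclasses import dataclass

# Assumption: reuse the 5% busy-hour ceiling from 7.3.1 as the threshold.
BUSY_HOUR_LIMIT = 0.05


@dataclass
class CellCounters:
    """Hypothetical per-cell busy-hour figures from traffic statistics."""
    sdcch_blocking: float  # SDCCH blocking rate, as a fraction
    tch_blocking: float    # TCH blocking rate, as a fraction
    tch_requests: int      # TCH channel requests in the period


def congestion_pattern(c: CellCounters, limit: float = BUSY_HOUR_LIMIT) -> str:
    """Classify a cell according to analysis cases II and III."""
    sdcch_high = c.sdcch_blocking > limit
    tch_high = c.tch_blocking > limit
    if sdcch_high and tch_high:
        # Case II: both channel types congested.
        return "large traffic volume"
    if sdcch_high and c.tch_requests > 0:
        # Case III: SDCCH congested, TCH not congested but still requested.
        return "burst signaling traffic"
    if sdcch_high or tch_high:
        return "other cause"
    return "no congestion"


print(congestion_pattern(CellCounters(0.12, 0.01, 40)))
```

A cell classified as "other cause" would still need the TRX, interference and terrestrial resource checks described in the remaining analysis cases.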
IV. Congestion resulting from TRX fault
Possible causes:
If a TRX of a multi-TRX cell goes out of service, channel congestion may result.
Diagnosis method:
1) After possible traffic causes have been excluded, check whether there is a TRX alarm (the board state is shown in red). If so, the TRX is certainly faulty.
2) Check whether the state of the channels on the TRX is B or O. Check whether unblocking solves the problem.
3) If the TRX is normal but its channels stay in the IDLE state, carry no traffic for a long time, and assignments to them fail, the assignment may be faulty. In this case do not reset the TRX directly.
4) After an assignment fault has been excluded, the TRX may be damaged or the antenna & feeder connection may have failed. Generally this is an uplink Rx channel fault.
5) If a TRX raises no abnormal alarm but its cell is congested, block the suspected TRX. If the cell congestion clears, the blocked TRX is faulty.
Troubleshooting:
1) If the congestion appears to result from assignment causes, solve the assignment problems.
2) Replace any TRX that the alarm shows to be faulty. For a TRX that is only suspected, check whether the antenna & feeder connection is correct and whether the antenna & feeder VSWR is normal. If everything is normal, replace the TRX to verify whether it is faulty.
V. Congestion resulting from interference
Possible causes:
Interference on the radio interface can result in congestion. For example, if TCH signal coverage exists near the cell and the TCH frequency is the same as the cell's BCCH frequency, a handover access on this TCH may be decoded as a random access, which leads to SDCCH congestion. When the Rx sensitivity is very high, interference signals may also be decoded as access signals, which likewise leads to SDCCH congestion.
Diagnosis method:
1) Check the data configuration to see whether any TCH uses the BCCH frequency of a nearby cell. In principle, TCHs should not use frequencies from the BCCH frequency set.
2) Check the interference band statistics to see whether interference exists.
Troubleshooting:
Modify the data configuration to eliminate the interference.
VI. Congestion resulting from terrestrial resource unavailable
Possible causes:
An A-Interface or Abis-Interface fault during channel assignment can cause the assignment to fail with the cause "terrestrial resource not available". This also affects the congestion rate.
Diagnosis method:
Through the BSC traffic statistics system, check the proportion of congestion caused by "terrestrial resource not available". Then check the A-Interface and Abis-Interface data configuration to solve the terrestrial resource problem.
Troubleshooting:
Modify the data configuration. If the data configuration is correct, the congestion may be caused by a cable connection fault.
7.4 Examples
7.4.1 Example of SDCCH Congestion Resulting from Co-frequency Interference
[Description]
Burst SDCCH congestion often occurred at a BTS. When the congestion occurred, the number of SDCCH requests increased markedly.
[Analysis]
1) On-site signaling tracing found that, when the congestion occurred, over 60 identical SDCCH requests were reported within 300 ms. The assignments for the first several requests failed and the other requests were rejected, which resulted in the congestion.
2) A check of the data configuration found that the TCH frequency of a cell (cell B) 10 km from this cell (cell A) was the same as the BCCH frequency of cell A, and that the BSIC of cell B was also the same as that of cell A.
3) Probably an MS of cell B located somewhere between cell A and cell B was performing handover access, and access to cell B was difficult, so the handover access signal was decoded by cell A as a random access and cell A allocated a channel for every request. Therefore, SDCCH congestion occurred.
4) The congestion disappeared after the BSIC of cell B was modified on site to differ from that of cell A.
Troubleshooting:
In principle, TCHs should not use frequencies from the BCCH frequency set.
Otherwise, besides causing SDCCH congestion, the BCCH signal also interferes with the TCH.
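The configuration check from this example (a TCH frequency in one cell equal to the BCCH frequency of another cell, with the same BSIC) can be sketched as a scan over cell data. This is a minimal sketch under stated assumptions: the `Cell` record, its field names and the frequency and BSIC values are hypothetical, and distance between cells is ignored, so every hit still needs a check against the real coverage.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cell:
    """Hypothetical extract of one cell's radio configuration."""
    name: str
    bsic: int
    bcch_freq: int        # BCCH carrier frequency number
    tch_freqs: List[int]  # frequency numbers of the non-BCCH carriers


def bcch_tch_conflicts(cells: List[Cell]) -> List[Tuple[str, str, int, bool]]:
    """Find pairs where a cell's BCCH frequency is reused as a TCH frequency
    in another cell. Returns (bcch_cell, tch_cell, frequency, same_bsic)."""
    conflicts = []
    for victim in cells:
        for other in cells:
            if other is victim:
                continue
            if victim.bcch_freq in other.tch_freqs:
                conflicts.append((victim.name, other.name,
                                  victim.bcch_freq, victim.bsic == other.bsic))
    return conflicts


# Made-up configuration resembling the example: cell B reuses cell A's
# BCCH frequency on a TCH carrier and has the same BSIC.
cells = [
    Cell("A", bsic=12, bcch_freq=50, tch_freqs=[60]),
    Cell("B", bsic=12, bcch_freq=70, tch_freqs=[50]),
]
for bcch_cell, tch_cell, freq, same_bsic in bcch_tch_conflicts(cells):
    print(bcch_cell, tch_cell, freq, "same BSIC" if same_bsic else "different BSIC")
```

Pairs flagged with the same BSIC are the ones that match the failure pattern in this example; pairs with different BSICs still violate the planning principle stated above.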
7.4.2 A Cell's Congestion Rate Is Too High
[Description]
A BTS was configured as S(6/4/2). From a certain date, the traffic statistics of this BTS showed very serious TCH blocking in one cell (the 6-TRX cell). A statistics task with a 24-hour cycle showed that the TCH congestion rate of this cell was 15%~60% and that congestion occurred almost every hour. When the congestion rate was high, the traffic volume of the cell was very low (usually about 0.8 Erl at busy hours); at the same time, the count of attempted TCH seizures meeting a TCH blocked state was 0. The channel state of all basebands in this cell was Idle. The baseband and RC attributes of the cell were normal, and nothing abnormal could be found through the maintenance console.
[Troubleshooting process]
1) Checked the BT channel states through the far-end maintenance console and made the preliminary judgement that the TCH seizure failures occurred on BT4 and BT5 of this cell.
2) Blocked BT4 and BT5, as well as RC4 and RC5.
3) Registered a traffic statistics task for this cell containing the following items: TCH seizure failures, attempted TCH seizures, TCH congestion rate, and attempted TCH seizures meeting a TCH blocked state. The statistics cycle was 30 minutes.
4) On the night of the following day, checked the statistics measured the previous night; no TCH congestion was found in any period. This indicated that RC4 and RC5 were faulty.
5) Unblocked BT4, BT5, RC4 and RC5.
6) Reset RC4 (TRX4) and RC5 (TRX5). On the next day, the statistics of the task registered in step 3 showed that the congestion remained.
7) Went to the BTS site and unplugged and re-plugged TRX4 and TRX5. A dialing test on the locked frequencies (on TRX4 and TRX5) showed that the TCH seizure failures remained. Exchanged TRX4 with TRX5; the same dialing test showed that the TCH seizure failures still remained.
8) Replaced TRX4 and TRX5.
Then the dialing test on the locked frequencies (on TRX4 and TRX5) showed no TCH seizure failure.
9) On the next day, the statistics of the task registered in step 3 showed no TCH congestion. The problem was solved.
7.4.3 Severe SDCCH Congestion Resulting from Unstable Transmission
[Description]
The SDCCHs of a newly built BTS were mostly busy while the TCHs were idle or busy; conversations after successful dialing were normal. The number of SDCCH allocation failures was about 1,000 per hour at busy hours. During the BIE self-loop, the port indicator flashed occasionally. A LAPD failure alarm and its recovery alarm (less than 1 second apart) were generated every 10 minutes.
[Analysis]
The common causes of SDCCH congestion are as follows:
1) Data configuration error
2) Insufficient number of SDCCHs
3) RF fault
4) No TCH or severe TCH congestion
5) Bad transmission quality
[Troubleshooting process]
1) No problem was found when the data was checked. After the BIE port of this BTS was exchanged with that of another BTS30, the problem persisted on this BTS and the other BTS30 worked normally. A data error or a hardware fault at the BSC side could therefore be excluded.
2) This BTS was far from the BSC, so a statistics task related to transmission was registered, but its result showed nothing abnormal. The SDCCH traffic statistics were still abnormal.
3) Replaced the TMU and TRX of this BTS, but the problem remained.
4) A transmission test was performed (another newly built BTS had the same problem) and found transmission errors. Section-by-section testing showed that one 2MHz transmission board in the access network through which this BTS was connected was faulty. The problem was solved after this board was replaced.
7.4.4 SDCCH Congestion Resulting from Bursts of LA Updating
[Description]
In one BSC the radio connection success ratio was found to be low. Analysis of the traffic statistics showed that the problem was mainly caused by SDCCH congestion at some specific BTSs.
[Troubleshooting process]
1) The traffic statistics showed 300~400 SDCCH seizures per hour at busy hours in the cells where the congestion occurred. The BTSs were configured as S(1/1/1), and each cell was configured with 8 SDCCH/8 channels. Normally this configuration can support 300~400 SDCCH seizures, yet several dozen instances of SDCCH congestion occurred in each cell at busy hours.
2) A related traffic statistics task was registered. The result showed that most SDCCH seizures were caused by LA updating.
As for the site locations, most BTSs with this problem were found to be located at the border of two LAs along the railway. So the SDCCH congestion might be caused by bursts of LA updating.
3) To verify this possible cause, a traffic statistics task with a 5-minute cycle was registered. The result showed that the LA updating was mostly concentrated in a particular 5-minute period. A check of the train timetable showed that 4~5 trains passed by during those 5 minutes. A large amount of LA updating was therefore performed in a short time as the trains passed, which resulted in the congestion.
4) More SDCCHs were configured.
[Conclusion]
1) For a BTS located at the border of two LAs along a railway, configure redundant SDCCHs.
2) For an S(1/1/1) BTS, enable the SDCCH dynamic allocation function.
7.4.5 High TCH Congestion Rate Resulting from Incorrect CIC Setting
[Description]
The TCH congestion of an office remained high. The TCH congestion rate (excluding handover) reached 4%.
[Troubleshooting process]
1) A version upgrade and a network expansion had been carried out not long before. The TCH congestion rate before the version upgrade was low.
2) The problem might therefore be related to the modification of the data configuration. Because there is so much data, the check must be targeted. Analysis of the busy-hour traffic statistics of the current day showed that the cells with a high congestion rate were concentrated in one module of BSC1, which controlled most of the BTSs in the city. The deterioration of this module's congestion rate degraded the figure for the whole network. So the fault was preliminarily located in this module, and the analysis focused on it.
3) TCH congestion rate (excluding handover) = TCH seizure failures (excluding handover) / Attempted TCH seizures (excluding handover), and the traffic statistics showed many TCH seizure failures in each cell. Further analysis showed that most of them were TCH seizure failures (requested terrestrial resource unavailable). This indicated that unavailable terrestrial resources were the main cause of the high TCH congestion rate of this module.
4) The main cause of "requested terrestrial resource unavailable" is a fault on the Abis-Interface or A-Interface, so both needed to be checked.
5) Because many cells of this module had the same problem, an Abis-Interface fault was unlikely. So the focus was placed on the A-Interface hardware and data.
6) No fault was found in the A-Interface hardware of this module.
7) The trunk data configuration of this module was then checked: the trunk circuit table was opened, sorted, and inspected.
8) The CIC of the first 32 timeslots of group 0 of this module was found to be 65535, although the circuits of group 0 of this module were circuits from the BSC to the MSC. So the CIC numbers were obviously incorrect. The circuit numbers were dynamically set to 0~31.
9) The traffic statistics of the following day showed that the cell congestion rate of this module had dropped, the TCH seizure failures (excluding handover) of each cell had decreased greatly, and the congestion rate of the whole network (excluding handover) had dropped from 4% to 2%.
7.4.6 High SDCCH Congestion Rate Resulting from Improper LAC Setting
[Description]
The SDCCH congestion rate of two cells of a BTS reached 4.91%. The BTS was configured as S(1/1/1), and the TCH traffic volume of each cell did not exceed 3 Erl at busy hours.
[Troubleshooting process]
1) Checked the statistics items of the TCH and SDCCH measurement functions. The result showed that the TCH traffic volume was low, but the number of SDCCH seizure requests was high, reaching 3032 at busy hours; the SDCCH traffic volume reached 1.86 Erl and the congestion rate reached 4.91%.
2) SDCCH congestion rate = Attempted SDCCH seizures meeting an SDCCH blocked state / Attempted SDCCH seizures (all). The possible causes of SDCCH seizure are:
(a) Signaling before a conversation is established
(b) Signaling in handover
(c) Signaling of LA updating in MS idle mode
3) The TCH traffic volume was normal (2.79 Erl, with 6 available TCHs), the number of attempted TCH seizures (excluding handover) was normal (318), and the number of attempted handovers was normal (146). So the large number of SDCCH seizures was probably caused by a large amount of LA updating.
4) The LAC of this BTS was 0500 and the LAC of the adjacent cells was 0520. The LAC of this BTS was changed to 0520. The number of attempted SDCCH seizures then fell to 298, the congestion rate to 0, and the traffic volume to 0.27 Erl. The congestion was solved.
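The figures in this example can be cross-checked with a little arithmetic. The sketch below implements the congestion rate formula from step 2 and derives the mean SDCCH holding time from the reported traffic volume, using the standard relation traffic (Erl) = seizures per hour x mean holding time (s) / 3600. It assumes that the seizure counts are per busy hour and it ignores the small share of blocked attempts, so the results are approximations; the function names are hypothetical.

```python
def congestion_rate(blocked_attempts: int, all_attempts: int) -> float:
    """Attempted seizures meeting a blocked state divided by all attempts."""
    if all_attempts <= 0:
        raise ValueError("all_attempts must be positive")
    return blocked_attempts / all_attempts


def mean_holding_time_s(traffic_erl: float, seizures_per_hour: int) -> float:
    """Mean channel holding time in seconds implied by an hourly traffic
    volume in Erlang and the number of seizures in that hour."""
    return traffic_erl * 3600.0 / seizures_per_hour


# Figures reported in the example (assumed to be per busy hour).
before = mean_holding_time_s(1.86, 3032)  # before the LAC change
after = mean_holding_time_s(0.27, 298)    # after the LAC change
reduction = 1 - 298 / 3032                # drop in SDCCH seizure attempts

print(f"{before:.2f} s, {after:.2f} s, {reduction:.0%}")
# prints "2.21 s, 3.26 s, 90%"
```

Holding times of a few seconds are what short signaling transactions such as LA updating would be expected to produce, and the roughly 90% drop in seizure attempts after the LAC change is in line with the conclusion that LA updating dominated the SDCCH load.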
[Conclusion]
1) In LAC planning, use the geographical distribution and behavior of subscribers to divide LACs so as to reduce LA updating at LAC borders. For a large city with high traffic that has two or more LACs, base the division on geographical features such as mountains and rivers to reduce the overlap depth between cells of different LACs. If no such features are available, avoid dividing along streets or placing the border where traffic is high (such as a shopping center). Generally, the LAC border should run obliquely to the streets instead of parallel or perpendicular to them. In the border area between city and suburb, to avoid frequent location updating, place the LAC border at the outermost sites rather than at high-traffic sites within the border area. The LA range should be neither too large nor too small. It is recommended that the number of TRXs in a LAC not exceed 300.
2) When modifying a LAC number, make sure that no two cells end up with the same CGI. After the modification at the BSS side, make the corresponding modification at the MSC side as well.
Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 8 Troubleshooting for Access
Chapter 8 Troubleshooting for Access
8.1 Overview
8.1.1 MS Searching Network
1) Generally the MS operates on its home PLMN (HPLMN). Under some special circumstances, the MS may select another PLMN. This selection can be performed in two modes: automatic network searching and manual network searching.
2) Every time it is switched on with a SIM inserted, the MS checks for the PLMN it last registered with and tries to register with that PLMN.
3) When registration succeeds, the MS operates on that PLMN.
4) When registration fails because there is no suitable cell, the MS searches at least 30 GSM900 channels or 40 GSM1800 channels for registration. During this process, a PLMN is selected.
5) When registration fails because location updating fails, the MS does not search the channels. Instead, it displays the PLMNs available to the subscriber. The subscriber can then select a network either automatically or manually.
6) In automatic mode, the MS selects a network from the PLMN list by priority. In manual mode, the MS displays the PLMNs available to the subscriber and registers with the PLMN the subscriber selects.
7) MS roaming can affect network searching. There are two roaming modes: international roaming and national roaming. In international roaming, the MS registers with a PLMN in a country other than that of its HPLMN. In national roaming, the MS registers with a PLMN in the same country as its HPLMN. During national roaming the MS searches for the HPLMN periodically.
8) During national roaming, to prevent the MS from registering in a forbidden LA, the MS stores the forbidden LAs in a list called "Forbidden LAs for national roaming". This list is cleared when the MS is switched off or the SIM is removed.
9) In addition, the MS stores barred PLMNs in its SIM. A PLMN is removed from the list of forbidden PLMNs only when it is manually selected and the corresponding location updating succeeds.
10) MS network searching fails when the MS fails to select a PLMN or a cell.
8.1.2 Location Updating Procedure
The location updating procedure is shown in Figure 8-1.
[Figure: message sequence among MS, BTS, BSC and MSC. Messages shown include: Channel Request, Channel Active, Channel Active ACK, Immediate Assignment Command, first SABM, UA, Establish Indication (Location Updating Request), CR (Complete_l3_information), CC, Location Updating Accepted, TMSI Reallocation Complete, Clear Command, Clear Complete]
Figure 8-1 Location updating procedure
Notes:
1) The complete location updating procedure also includes authentication and ciphering procedures. They are performed after the SCCP connection is established between the BSC and the MSC. In the Huawei MSC, the authentication and ciphering procedures are optional.
2) Sometimes the identification procedure is also part of the location updating procedure. When the MS ID contained in the MS-reported message Establish Indication is a TMSI, and the VLR does not recognize the TMSI or the TMSI authentication fails, the identification procedure is started. That is, the MSC sends a transparent Identity Request message to the MS, and the MS replies with an Identity Response message that provides the IMSI.
3) When the parameter "TMSI allocation in case of location updating" at the MSC is set to NO, the MS does not report the message TMSI Reallocation Complete during the location updating procedure.
4) Abnormal case: location updating rejected. That is, the message Location Updating Rejected is received from the MSC. Possible causes include:
- The CGI of the cells concerned under the BSC has not been added at the MSC.
- Communication between the MSC and the VLR fails.
- The MS is not registered in the HLR, etc.
8.1.3 Call Procedures
[Figure: message sequence among MS, BTS, BSC and MSC. Messages shown include: Channel Request, Channel Active, Channel Active ACK, Immediate Assignment Command, first SABM, UA, Establish Indication (CM Service Request), CR (Complete_l3_information), CC, CM Service Accepted, Setup, Call Proceeding, Assignment Request, ASSIGNMENT, Establish Indication, Assignment Complete, Alerting, Connect, Connect Ack, talk, Disconnect, Release, Release Complete, Clear Command, Clear Complete]
Figure 8-2 Call procedures when MS is calling
Notes:
1) The complete call procedure also includes authentication and ciphering procedures. They can be performed after the SCCP connection shown in Figure 8-2 is established between the BSC and the MSC.
2) For a dual-band MS, the classmark updating procedure is also part of the call procedure. It is performed after the SCCP connection is established and before the message CM Service Accepted is received. Figure 8-3 shows the procedure.
[Figure: message sequence among MS, BSS and MSC. Messages shown include: Classmark Request, Classmark Enquiry, Classmark Change, Classmark Update]
Figure 8-3 Classmark updating procedure
3) As shown in Figure 8-2, the message Assignment Request is sent before the Alerting message. This is therefore called the early-assignment procedure. When the message Assignment Request is sent after the Alerting message, the procedure is called the late-assignment procedure. There is also a very-early-assignment procedure, in which the TCH is assigned during the immediate assignment procedure. In that case, no new channel is requested when the message Assignment Request is sent; the channel is assigned via the channel mode modify procedure. Figure 8-4 shows the channel modify procedure.
[Figure: message sequence among MS, BTS and BSC. Messages shown include: Mode Modify, Mode Modify ACK, Channel Mode Modify, Channel Mode Modify ACK]
Figure 8-4 Channel modify procedure
4) Abnormal case 1. The message Assignment Request is sent to the BSS. Upon receipt of the message, the BSS does not send the message Assignment Command to the MS but returns the message Assignment Fail. Possible causes of this abnormal case include:
- There is no TCH available at the BSC.
- The assigned CIC is not idle at the BSC.
5) Abnormal case 2. The BSC sends the message Assignment Command to the MS. Upon receipt of the message, the MS reports the message Assignment Fail. Possible causes:
- The downlink or uplink BER is too high.
8.2 Trouble Handling
8.2.1 MS Cannot Find a Network
I. Description
1) The MS displays "No Networks" or nothing.
2) The network list displayed by the MS does not contain the HPLMN.
II. Analysis
1) The cell is not yet in service. At the maintenance console, select [Obtain Cell Attribute] for the corresponding cell. If the message "Cell not initialized" is returned, the cell is not yet in service. If the corresponding attribute information is returned, the cell is already in service.
2) Trace the messages on the Abis interface and see whether a channel request is made. If not, analyze as follows:
a) For non-MS trouble:
- The BTS hardware may be faulty. At the maintenance console, check whether each board is in the normal state and whether the indicators of each board are normal. Then check whether the RC and BT attributes are consistent with the configuration made at the data management console, and whether the BTS clock is locked and synchronized with the BSC clock. If everything is normal, test whether the antenna & feeder system outputs the correct power.
- The system information may be erroneous. Check whether the CI, LAI, BSIC and CCCH configurations at the BTS are consistent with the corresponding items in the Radio Channel Configuration Table.
- The CBQ and CBA settings may be wrong.
b) For MS trouble:
- The MS may be in a position where the signal quality is very poor or the signal level is very low. Move the MS to an open area and try again.
- The MS battery may be low, which reduces its receiving capability.
3) If the "Channel Request" message is traced on the Abis interface and the message "Location Updating Rejected" is also displayed, check the related data at the NSS.
III. Troubleshooting process
1) Initializing the cell and site
If the cell is found not to be in service, perform a hierarchical reset of the site of the cell. This may affect conversations in all other cells under that site. During initialization, the progress of each stage is indicated. BTS initialization contains two stages: site initialization and cell initialization.
a) The site initialization procedure is as follows:
- Set site logical objects
- Set site hardware objects
- Set site extension attributes
- Establish connections between sites
- Start the site
b) The cell initialization procedure is as follows:
- Set TEI
- Set up the signaling channel
- Set up the traffic channel
- Set cell attributes
- Set cell extension attributes
- Set RC attributes
- Set RC extension attributes
- Set channel attributes
- Set cell alarm thresholds
- Start the cell
- Wait for the cell status change report
The result of each stage is displayed at the maintenance console in real time. A solid star indicates success and a hollow star indicates failure. In case of failure, the cause is also given; make checks based on the cause.
2) Removing the hardware trouble
- If a board is found to be faulty, reset or replace it.
- If the problem lies in the antenna & feeder system, check whether the connection at each port is secure and whether the power supply and power amplifier are normal.
- If the problem lies in the clock board, replace it and make sure that the item "clock mode" in the Site Description Table is set to "external clock".
- If the system information is erroneous, correct the erroneous items, reconfigure the table, and validate the setting via dynamic data configuration.
3) Removing the trouble from NSS
- Check whether the message "Location Updating Rejected" is transmitted on the A interface. If not, check the A-interface-related data configuration and connections at the MSC and BSC.
- If the message is traced on the A interface, check whether the CGI of the cell has been added at the MSC and whether the MS is properly defined in the HLR.
- If the trouble is still not removed, check for other possible problems at the NSS.
8.2.2 MS Cannot Access a Network
I. Description
1) The MS displays "No Services", "Only Emergency Call", or nothing.
2) In manual mode, the MS displays that it has found one or several networks.
3) On the Abis interface, no message is found or the message "Location Updating Rejected" is traced.
II. Possible causes
1) The SIM is not inserted. Insert the SIM.
2) The MS battery is low. Charge the battery.
3) The cell is not yet in service. See the following troubleshooting process.
4) The RSL is interrupted.
5) The system information is erroneous.
6) The Um interface phase version is too low.
III. Troubleshooting process
If the cell is found not to be in service, reset the cell and then the fourth hierarchy of the site, and see whether the initialization procedure is normal.
1) Judge whether the cell is in service. If yes, go to 3). If no, go to 2). At the maintenance console, select [Obtain Cell Attributes] for the corresponding cell. If the message "Cell not initialized" is returned, the cell is not yet in service. If the corresponding attribute information is returned, the cell is already in service.
2) If the cell is not yet in service, perform a hierarchical reset of the site of the cell. This may affect conversations in all other cells under that site. During initialization, the progress of each stage is indicated. BTS initialization contains two stages: site initialization and cell initialization.
Procedures of site initializing are given below:
- Set site logical objects
- Set site hardware objects
- Set site extension attributes
- Establish connections between sites
- Start the site
Procedures of cell initializing are given below:
- Set TEI
- Set up the signalling channel
- Set up the traffic channel
- Set cell attributes
- Set cell extension attributes
- Set RC attributes
- Set RC extension attributes
- Set channel attributes
- Set cell alarm thresholds
- Start the cell
- Wait for the cell status change report
The result of each stage is displayed at the maintenance console in real time. A solid star indicates that the operation succeeded and a hollow star that it failed. In case of failure, the cause is also given. When the cause is a data error, check whether the corresponding data configuration is right. When the cause is that the operation cannot be executed, it can be considered that the BTS TMU has wrongly configured the corresponding board. Check whether the board is normal. Then return to 1).
3) When finding the cell is already in service, check whether the clock parts and boards in the cell are normal. If not, handle the problem as suggested below:
At the local maintenance console, open the equipment status query window and view the board status. A board displayed in red can be considered faulty. In this case, check whether the corresponding board hardware is normal, whether communication between boards is right and whether each board is properly powered.
a) When the cell is already in service and the MS can lock a band while no message is transmitted on the Abis interface, the reason may be that the item "Cell access allowed" in the BSC System Information Data Table is set to NO. Modify it to YES and reconfigure the table.
b) When the cell is already in service and the MS can find a network while no message is transmitted on the Abis interface, the reason may be that the Um interface phase version is too low.
Handling suggestions: At the data configuration console, modify the Um interface phase version in the Local Information Table to be GSM_Um_Phase 2 or above.
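
The troubleshooting process of 8.2.2 can be summarized as a small decision function. The following Python sketch is illustrative only: the class `CellCheck`, its fields and the function `next_step` are hypothetical names introduced here, not part of any maintenance console interface, and the final fallback simply points back to the remaining possible causes listed in II.

```python
from dataclasses import dataclass


@dataclass
class CellCheck:
    """Observations collected by the engineer (hypothetical field names)."""
    cell_in_service: bool      # [Obtain Cell Attributes] returned attribute information
    boards_normal: bool        # no board displayed in red in the status query window
    abis_messages_seen: bool   # any message traced on the Abis interface
    cell_access_allowed: bool  # item "Cell access allowed" in the System Information Data Table
    um_phase: int              # Um interface phase version (1, 2, ...)


def next_step(check: CellCheck) -> str:
    """Return the next action suggested by the process in 8.2.2."""
    if not check.cell_in_service:
        return "Perform a hierarchical reset of the site and watch each initializing stage"
    if not check.boards_normal:
        return "Check board hardware, inter-board communication and board power"
    if not check.abis_messages_seen:
        if not check.cell_access_allowed:
            return 'Set "Cell access allowed" to YES and reconfigure the table'
        if check.um_phase < 2:
            return "Set the Um interface phase version to GSM_Um_Phase 2 or above"
    return "Check the remaining possible causes (RSL, system information, NSS)"
```

For example, a cell that is in service with normal boards, no Abis messages and "Cell access allowed" set to NO yields the suggestion to set that item to YES.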
8.2.3 Location Updating Is too Frequent

I. Description
When switched on, MS can find a network. It can also call or be called. However, MS performs location updating frequently and the voice quality is poor.

II. Analysis
Generally, MS location updating is used for the following three purposes:
1) Location updating when MS enters a new location area
2) Periodic location updating
3) Location updating when MS is switched on
Improper data configuration may cause too frequent location updating.

III. Troubleshooting process
1) View the signalling on the interface. When the message is "Normal Location Updating", the reason may be that MS is on the overlapped border of more than one location area. (This case seldom occurs.) Change the MS position.
2) When the message is "IMSI Attach" and MS is not switched on/off frequently, the reason may be that MS often drops from the network because its signal level is too low to receive the BTS messages. Check whether the MS level is too low.
3) When it is periodic updating that is performed too frequently, the reason may be that the location updating period is too short. Select [Cell/System information data table] at the BSC data management console and see whether the location updating period is set too small. If the setting is normal, check whether the MS has properly received that system information. (With a test MS, the T3212 timeout value can be read directly.)

8.2.4 MS Drops from the Network Frequently

I. Description
1) In idle mode, MS sometimes displays the accessed network and sometimes does not. This happens too frequently.
2) In call mode, the MS call often drops.

II. Analysis
During system information receiving, cell selection or cell reselection, when MS finds that its cell does not satisfy the protocol requirements and it cannot find another suitable cell, the MS may drop from the network.
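
One of the protocol requirements referred to in this analysis is the path loss criterion C1 defined in the GSM specifications (GSM 05.08): a cell is suitable for camping only while C1 > 0. The Python sketch below shows the basic form of the criterion. The function and parameter names are chosen here for illustration, and the additional power offset term that applies to some DCS 1800 mobiles is deliberately left out.

```python
def c1(rxlev_dbm: float,
       rxlev_access_min_dbm: float,
       ms_txpwr_max_cch_dbm: float,
       ms_max_power_dbm: float) -> float:
    """Basic GSM path loss criterion: C1 = A - max(B, 0).

    A = received level - minimum received level required for access
        (the "MS min Rx power level" parameter)
    B = maximum Tx power the MS may use on the control channel
        (the "MS max Tx power level" parameter) - maximum RF output power of the MS
    """
    a = rxlev_dbm - rxlev_access_min_dbm
    b = ms_txpwr_max_cch_dbm - ms_max_power_dbm
    return a - max(b, 0)


def cell_is_suitable(*args: float) -> bool:
    """A cell satisfies the criterion only while C1 is greater than zero."""
    return c1(*args) > 0
```

With a received level of -95 dBm and an access minimum of -102 dBm, C1 is 7 for a 33 dBm MS allowed 33 dBm. Raising the access minimum or the allowed Tx power above the MS capability lowers C1, which is why these two parameters are worth checking when an MS drops from the network.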
III. Troubleshooting process
1) Check whether the system information is properly sent and whether the cell reselection parameters and random access parameters are changing frequently.
2) Check whether the C1 of the MS is too small. If yes, check whether the parameters that affect C1, such as the MS min Rx power level and MS max Tx power level, are properly set.
3) Check whether the BTS output power fluctuates excessively.
4) Check whether the antenna is securely fixed.
5) Check whether there is co-channel interference between the BCCH of this cell and that of another cell.

8.2.5 MS Finds a Network but Cannot Call

I. Description
After being switched on, MS finds a network. However, when dialing, one of the following situations occurs.
1) MS receives no ring back tone even when the dialed number is not busy.
2) MS receives the ring back tone and then the call is released automatically.
3) MS receives the ring back tone and the call is released when it is answered.

II. Analysis
When MS cannot call, the problem may lie in the BTS, BSC, MSC or fixed telephone network.

III. Troubleshooting process
1) Check whether the signalling on the Abis interface is normal.
2) Check whether the immediate assignment is performed. If no, check whether there is an SDCCH available. If yes, check the paging parameter settings. If no, reconfigure the cell or increase the cell capacity.
3) Check whether assignment is performed.
4) When finding assignment is not performed, check whether there is a TCH available. Reconfigure the cell or increase the cell capacity. Then check the data mapping relationship on the Abis interface.
5) When finding the radio link fails, check the measurement report. If it indicates that the receiving quality is too poor, the reason may be that the antenna & feeder system is not properly connected or that the BSC clock is faulty.
6) Check whether the network route is normal.
When no route is detected, the possible causes are that the link to the BSC switching network has errors, that FTC is abnormal, that the A interface is blocked, that MSC cannot obtain the MS roaming number or that MSC blocks some routes.
7) When the route is normal, the problem may be that the FTC application fails.
8) Other causes.

8.3 Examples

8.3.1 MS Has Difficulty in Accessing a Network - Signal Is too Weak

[Description]
An MS could not make a call or access a network although it was near BTS B, which the subscriber could see.

[Troubleshooting process]
1) The engineer performed drive tests around BTS B in positions from which the BTS could be seen. He found that the MS receive level was high, between -80 dBm and -90 dBm, in positions lower than the BTS, and low in positions about 500 m vertically higher than the BTS. It was also found that the MS could not access a network when it was on the roadside nearer to the mountain.
2) B was an omni-directional BTS and its main lobe lay in the plane parallel to the ground; therefore, its upward-radiated power was very low. Besides, since the MS antenna was close to the ground, its receive signal might suffer much attenuation from the ground. That was why an MS in a high position (especially on the roadside nearer to the mountain) could not access a network or call although the subscriber could see the BTS.
3) The trouble was caused by the characteristics of radio wave propagation. It had nothing to do with the BTS performance.

8.3.2 MS Cannot Perform Cell Reselection - Signals of Adjacent Cells Are Weak

[Description]
In one M1800 network, an MS in idle mode could not perform cell reselection normally. Instead, it often dropped from the network and then registered in the network again.

[Analysis]
Possible causes of MS dropping from the network are given below:
1) Signals of the adjacent cells were all too weak, so MS could not find a suitable cell for cell reselection.
2) MS could not find any adjacent cell because the serving cell had no adjacent relationship with the neighboring cells.
[Troubleshooting process]
1) The engineer checked the relation data of the adjacent cells and found no problem. Therefore, the trouble could not be due to the second cause.
2) Through tests, the engineer found that the signal of the current serving cell was very weak and that of the adjacent cell was also too weak; therefore, MS did not perform cell reselection.
3) When the MS moved on to a position where the serving signal was really too weak and it had still not found an adjacent cell, it dropped from the network.
4) The signal of the adjacent cell was weak because the height, position and azimuth angle of the antenna at the adjacent site were wrongly set.
5) Raising the antenna at the adjacent site could enhance the signal of the adjacent cell. However, this could not be done for the time being due to on-site limitations.
6) The engineer increased the adjacent cell redundancy so that the serving cell had an adjacent relationship with a non-neighboring site, and MS could then perform cell reselection normally.

8.3.3 GSM MS Drops from the Network - Location Updating Period Is too Short

[Description]
There were four BTS2.0s in one GSM network. The network had been operating normally. One day, many MSs dropped from the network.

[Analysis and troubleshooting process]
1) The engineer first performed a hardware check. At the remote maintenance console, the engineer found that the HPA and TRX of all BTSs were normal and that all TCHs and SDCCHs were occupied.
2) Then the engineer checked the BSC hardware and found that the indicators of GMPU, GLAP, BIE and CK3 were all normal. From these checks, it could be considered that the problem did not lie in the BTS or BSC hardware.
3) In that period, the network data had not been modified. The only change was that the MS quantity had increased greatly. In VLR, more than 4,000 local subscribers and more than 5,000 roaming subscribers were registered. Hence, the engineer inferred that the problem lay in the sharp increase of MS quantity.
4) The increase of MS quantity can have the following two effects:
- TCH congestion rate increases. When the TCHs in the cell are all occupied, other subscribers cannot call or be called.
- SDCCH congestion rate increases. When the SDCCHs in the cell are all occupied, MS cannot perform signalling exchange normally. Subscribers cannot call or be called. Besides, they cannot perform location updating successfully. The consequence of location updating failure is that MS drops from the network.
5) The engineer queried the BSC System Information Data Table and found that the location updating period was set to 2 (unit: 6 minutes). That is, the location updating period was 12 minutes. It was also found that the corresponding parameter at MSC was set to 30 minutes. Therefore, each activated MS initiated a periodic location updating via the SDCCH every 12 minutes. When there were so many MSs that all SDCCHs were occupied, an MS initiating a periodic location updating would fail in the updating and drop from the network.
6) The engineer modified the location updating period at BSC to 10, i.e. 10 x 6 = 60 minutes. At the same time, the corresponding parameter at MSC was set to 180 minutes. He also reset the two tables. Since then, no subscriber complaint has been received.

[Conclusion]
The location updating period set at MSC (VLR) should be 2 to 3 times that set at BSC. The shorter the location updating period set at BSC, the better the overall network performance. However, the signalling traffic in the network becomes heavier and the radio resource utilization lower. Besides, the MS power consumption increases, which may shorten the average stand-by time of each MS in the system. Therefore, consider the processing capabilities of MSC and BSC and the traffic on the A interface, Abis interface, Um interface, HLR and VLR when setting that parameter. Generally it is set to a high value in urban areas and to a low one in suburban areas or the countryside.

8.3.4 MS Drops from the Network - CGI Is Erroneous

[Description]
There was a BTS under which MS could not access a network normally. Even near the BTS, the MS sometimes received signals and sometimes did not. It became normal when it was switched off and then on.
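
Returning to the location updating period discussed in 8.3.3, the arithmetic behind that case can be sketched as follows. This Python sketch assumes the standard GSM coding of T3212 (units of 6 minutes, range 0 to 255, with 0 meaning that periodic updating is not used). The function names are chosen here for illustration, and the load figure is an idealized upper bound that assumes every registered MS is switched on and idle for the whole period.

```python
def t3212_minutes(t3212: int) -> int:
    """Convert the broadcast T3212 value (units of 6 minutes) to minutes."""
    if not 0 <= t3212 <= 255:
        raise ValueError("T3212 must be in the range 0..255")
    return t3212 * 6


def periodic_updates_per_hour(subscribers: int, t3212: int) -> float:
    """Idealized number of periodic location updates per hour."""
    period = t3212_minutes(t3212)
    if period == 0:          # 0 means periodic updating is disabled
        return 0.0
    return subscribers * 60 / period


def msc_period_ok(msc_minutes: float, t3212: int,
                  low: float = 2.0, high: float = 3.0) -> bool:
    """Check the rule of thumb from the conclusion of 8.3.3:
    the MSC (VLR) period should be 2 to 3 times the BSC period."""
    ratio = msc_minutes / t3212_minutes(t3212)
    return low <= ratio <= high
```

With the roughly 9,000 registered subscribers of that case, a setting of 2 (12 minutes) gives up to 45,000 periodic updates per hour, while a setting of 10 (60 minutes) gives up to 9,000, which illustrates why the SDCCH load dropped after the change.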
[Troubleshooting process]
1) The engineer found that the temperature in the equipment room had reached 40°C and that the air conditioner could not work normally. He suspected that the high temperature had caused the BTS trouble. However, after this was corrected, the trouble remained.
2) The engineer replaced the BTS TRX and then performed a test. However, the trouble still existed. The possibility of TRX failure was excluded.
3) Then the engineer collected and analyzed on-site information such as traffic statistics, messages on the Abis interface and uplink & downlink test powers. No problem was found.
4) The engineer viewed the MSC Location Area & Cell Table and found that the CGI settings of the three cells at MSC were identical to one another but inconsistent with the corresponding CGI settings at BSC.
5) The engineer then modified the CGI at MSC. The trouble disappeared.

[Conclusion]
In case of such trouble, check whether the CGI settings at MSC are consistent with the corresponding CGI settings at BSC.

8.3.5 MS Has Difficulty in Accessing a Network - MSC Cell Data Is not Ready

[Description]
There was a BTS A. An MS under it could receive signals well and make calls upon switch-on; however, later the MS displayed "Network failure" or "No networks". The MS became normal when it was switched off and then on, but the trouble recurred later.

[Analysis]
1) For a trouble that occurs intermittently, the most probable cause is transmission failure. That BTS used microwave for transmission; therefore, it should be checked whether the BER was too high.
2) The BTS clock accuracy should also be considered. That BTS was in the second hierarchy; therefore, it should be checked whether the TMU was satisfactory.
3) The problem may also lie in periodic location updating failure caused by a data configuration error.

[Troubleshooting process]
1) Through tests, the possibility of transmission failure was excluded first.
2) Then the engineer checked the TMU clock status and found it was normal.
3) It was noticed that two other sites, B and C, were near BTS A and in the same location area as A.
4) The engineer used a test MS to scan all frequencies and found that the MS could receive signals from B and C, access a network and make calls. Later it detected stronger signals (from A). Then it performed cell reselection and camped on the cell under A. However, its periodic location updating was rejected by MSC. The engineer inferred that the data configurations at BSC and MSC might be erroneous.
5) The engineer checked the Cell Table, Cell Module Information Table and Cell Description Data Table of BSC and found that one cell ID corresponded to more than one cell and that the adjacent relationship was not completely defined. He added a cell ID and completed the adjacent relationship. However, the trouble remained.
6) Then the engineer checked the MSC data and found that the CGI of the cell under A had not been defined. In this case, periodic location updating initiated in the cell under A would be rejected by MSC because the CGI could not be found. That was why MS dropped from the network.
7) The engineer added the CGI of the cell under A to the MSC data configuration. The trouble disappeared.

8.3.6 MS Drops from the Network - MSC Cell Data Is not Ready

[Description]
In one network (BTS type: BTS20), an MS often dropped from the network and could not make calls when it was indoors. When it was near the BTS, the MS could access a network and make calls.

[Analysis]
Possible causes of the MS dropping are given below:
1) Antenna & feeder problem
2) CDU problem
3) RF power amplifier problem
4) Interference over the BTS (uplink interference)
5) New interference source in the subscriber's living area (downlink interference)
6) Data configuration error

[Troubleshooting process]
1) The engineer swapped the main antenna of the problem cell with its diversity antenna and inspected the antenna & feeder. At this step, the possibility of an antenna & feeder problem was excluded.
2) The engineer replaced the CDU of the problem cell with that of another cell. At this step, the possibility of a CDU problem was excluded.
3) Then the engineer replaced the HPA, TRX, PSU and FPU with those of another cell. At this step, the possibility of an RF device fault was also excluded.
4) The engineer viewed the traffic statistics console and found that the highest level of the interference band in the problem cell was 2. Therefore, the problem could not lie in uplink interference.
5) The engineer closed the TRX of the problem cell and the signal in the subscriber's living area disappeared immediately. Therefore, the problem could not lie in downlink interference, either.
6) The engineer inserted the SIM into another MS and found that this MS could not access a network, either. This indicated that the original MS was not at fault.
7) The engineer used a test MS to locate the trouble.
The outdoor signal strength in that area was measured at about -80 dBm; however, the MS still could not access a network. In addition, upon switch-on, the test MS that was locked to the cell showed that it had dropped from the network; however, it could still make calls. As for an ordinary MS, the received signal was at its best upon switch-on. However, within one minute the MS showed that it had dropped from the network. When it made a call, its received signal returned to its best and the call could be set up.
8) The engineer suspected that MS could not perform location updating in that cell or camp on that cell. He moved the test MS to another cell by executing SET BCCH on the test MS. Now the MS could access a network though the signal was very weak. He then returned the MS to the original cell via the command SET BCCH. In this case, MS could both access a network and make calls.
9) The engineer performed signalling trace on the MS and saved the messages. By analyzing the messages, he found that MS had initiated many location updating requests in the cell while MSC always returned "Location Updating Rejected" with the cause value "Network failure".
10) Then the engineer viewed the MSC data configuration and found that the cell ID of that area was missing.
11) The engineer added the data of that cell to the MSC data configuration. The trouble disappeared.

8.3.7 Some MSs Cannot Access a Network - System Information Is Erroneous

[Description]
Some subscribers complained that they could not access a network or make calls under an omni-directional BTS. However, everything was all right when they were under other sites.

[Troubleshooting process]
1) Since the trouble occurred to only some MSs, the engineer first checked the MS data and found that it was right.
2) The engineer then checked the BTS boards and the running status and found that the BTS ran normally.
3) The engineer reset the BTS. However, the trouble remained.
4) The engineer used a problem SIM to make a test. He found that the contents displayed with the problem SIM were different from those displayed with a normal SIM. Upon switch-on, the MS with the problem SIM first searched time slot 2 instead of time slot 0 of the TRX carrying the BCCH.
5) As the channel the MS first searched was wrong, the engineer suspected that the problem lay in a data configuration error.
6) He checked the data configuration and found that the item CCCH in the System Information Table was wrongly set to 2 non-combined CCCHs.
7) The engineer modified that item to 1 non-combined CCCH. Then the MS could access a network normally.
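
One plausible explanation of why only some subscribers were affected in 8.3.7, based on the GSM specifications rather than on anything the case itself states, is that each MS derives its CCCH group from the last three digits of its IMSI. When the system information announces two CCCHs, the MSs that fall into the second group listen on time slot 2 of the BCCH carrier. The Python sketch below follows the CCCH_GROUP formula of GSM 05.02; the function names, parameter defaults and IMSI values are hypothetical and serve only as an illustration.

```python
# Time slots of the BCCH carrier that can carry a CCCH, indexed by CCCH group
CCCH_TIMESLOTS = (0, 2, 4, 6)


def ccch_group(imsi: str, bs_cc_chans: int,
               bs_ag_blks_res: int = 1, bs_pa_mfrms: int = 2,
               combined: bool = False) -> int:
    """CCCH_GROUP = ((IMSI mod 1000) mod (BS_CC_CHANS * N)) div N.

    N is the number of paging blocks available on one CCCH:
    (blocks per 51-multiframe - BS_AG_BLKS_RES) * BS_PA_MFRMS,
    with 9 blocks per multiframe for a non-combined CCCH and 3 for a combined one.
    """
    blocks = (3 if combined else 9) - bs_ag_blks_res
    n = blocks * bs_pa_mfrms
    return ((int(imsi) % 1000) % (bs_cc_chans * n)) // n


def ccch_timeslot(imsi: str, bs_cc_chans: int, **kwargs) -> int:
    """Time slot on which the MS expects its CCCH."""
    return CCCH_TIMESLOTS[ccch_group(imsi, bs_cc_chans, **kwargs)]
```

With these illustrative defaults and two announced CCCHs, a made-up IMSI ending in 005 maps to time slot 0 while one ending in 020 maps to time slot 2. With one announced CCCH, both map to time slot 0, which matches the result of the correction in step 7).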
8.3.8 MS Has Difficulty in Accessing a Network - CBQ and CBA Are Erroneous

[Description]
In one hotel, when a subscriber stepped out of a lift, MS could not access a network or the access was very slow. This situation lasted more than five minutes. The MS could successfully access a network when it was switched off and then on.

[Analysis]
1) When MS steps into a lift, its previous C1 or downlink signal failure counter may fail to be displayed. The MS will initiate the cell reselection procedure. At first it attempts to search the stored list of cells of the current PLMN. If this fails, it will search all cells of the current PLMN. If this fails again, it will search other PLMNs.
2) If MS is searching other PLMNs when it steps out of the lift, it may find that the signal of one PLMN is the best for it. It will initiate the location updating procedure but fail. That PLMN will be added to its Forbidden PLMN List. After failing in searching other PLMNs, it will search its HPLMN, i.e. the last registered PLMN. When finding the HPLMN, MS will perform the same operations as described in 3) below.
3) If MS is searching the HPLMN for a cell when it steps out of the lift, it will read the information from the BCCHs in turn, beginning with the BCCH with the best signal. It may find that the priority of the GSM900 cell is low and go on searching. When finding a GSM1800 cell and considering the priority satisfactory, MS performs location updating and, if the updating succeeds, camps on that cell. A GSM900 single band MS will search all GSM900 cells. When finding that the priorities of all GSM900 cells are low, it will select the GSM900 cell with the best signal for location updating. When the location updating succeeds, it will camp on that cell. During this process, MS has to search all frequencies with signal. Therefore, it takes MS a relatively long time to access a network.
4) If MS is being switched off and on when it steps out of the lift, it will search the stored list of cells. When finding a GSM1800 cell in the list, it will perform location updating and access a network successfully. When finding there is no GSM1800 cell in the list, it will search all GSM900 cells in the list. When finding that the priorities of all GSM900 cells are low, it will select the cell with the best signal for location updating and then successfully access a network. During this process, MS only has to search the frequencies with signal that are in the stored list. Therefore, it takes MS a relatively short time to access a network.
5) Sometimes, when MS steps into a lift, it enters the slow-searching state during the no-signal period in order to save power. When the MS steps out of the lift, it performs the cell selection procedures normally. In this case, it will take MS a relatively long time to access a network.

[Troubleshooting process]
1) The trouble was that the access to a network was very slow. When the subscriber found that the MS had not accessed a network several minutes after he stepped out of the lift, he switched the MS off and then on. In this case, the MS could successfully access a network.
2) The engineer checked the consistency between the Location Area & Cell Table and Cell Module Information Table of MSC and the Cell Table and Cell Description Data Table of BSC and found nothing wrong.
3) The engineer then checked the CBQ and CBA settings in the BSC configuration and found that the settings were likely to be unfavorable to fast access.
4) On the spot, the GSM1800 cell of the hotel and one other GSM1800 cell had CBQ set to NO, while the CBQs of all other sites were set to YES. Therefore, the priorities of the many cells that covered the hotel were low. (Note: The hotel was the highest building in the local plain. Therefore, it was covered by many cells.) When MS did not select from the stored list of cells, it could not access a network until it had searched all cells. That was why the access to a network was very slow. The engineer modified the CBQ settings of all cells to NO and then the trouble disappeared.
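
The effect of CBQ and CBA on cell priority, which underlies the analysis in 8.3.8, follows a small table in the GSM specifications (GSM 05.08). The Python sketch below encodes that table, taking CBQ set to YES as 1, and orders candidate cells in a simplified way. The function names and cell data are hypothetical, and a real MS applies further conditions such as C1 > 0.

```python
# (CBQ, CBA) -> (priority for cell selection, status for cell reselection)
CELL_PRIORITY = {
    (0, 0): ("normal", "normal"),
    (0, 1): ("barred", "barred"),
    (1, 0): ("low", "normal"),
    (1, 1): ("low", "normal"),
}


def selection_priority(cbq: bool, cba: bool) -> str:
    """Priority of a cell during cell selection."""
    return CELL_PRIORITY[(int(cbq), int(cba))][0]


def selection_order(cells):
    """Order candidate cells for selection.

    cells: list of (name, rxlev_dbm, cbq, cba) tuples.
    Normal-priority cells come first, then low-priority cells,
    each group sorted by descending signal level; barred cells are dropped.
    """
    rank = {"normal": 0, "low": 1}
    usable = [c for c in cells if selection_priority(c[2], c[3]) != "barred"]
    usable.sort(key=lambda c: (rank[selection_priority(c[2], c[3])], -c[1]))
    return [c[0] for c in usable]
```

In this simplified model, a weaker GSM1800 cell with CBQ set to NO is tried before stronger GSM900 cells with CBQ set to YES, because a low-priority cell is chosen only when no suitable normal-priority cell is found. This is consistent with the long search observed before the CBQ settings were changed.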
Chapter 9 Troubleshooting for Voice

9.1 Overview
This chapter introduces the fundamental knowledge, common operations and usual handling of GSM voice troubles, including single pass, no pass (neither party can hear the other), echo, noise and cross-talking. The operations and methods in this chapter are the basic means of locating voice troubles. To handle a specific trouble, apply this chapter together with a large number of dial-and-tests so that the trouble can be located accurately.

9.2 Fundamental Knowledge

9.2.1 Transmission Format of Voice Signal
FTC is responsible for the format conversion of voice signals in the GSM system. In the direction from FTC to MSC, all voice signals are at the rate of 64 kbit/s and in PCM format; their contents are PCM samples obtained by 8 kHz sampling, A-law companding and 8-bit encoding. In the direction from FTC to BTS, the voice signals are at the rate of 16 kbit/s and in compressed format (TRAU frame); their contents are the characteristic parameters extracted from the voice signals. For convenience, the part from FTC upwards is generally called the 64 kbit/s link and the part from FTC downwards the 16 kbit/s link.
The uplink and downlink voice signals over the 64 kbit/s link are symmetrical, but the formats of the uplink and downlink TRAU frames over the 16 kbit/s link are different. As a result, the voice signals over the 16 kbit/s link cannot be looped back, but those over the 64 kbit/s link can.
By combining fault scope and voice format, the transmission format of voice signals can be summarized as shown in Figure 9-1.
[Figure 9-1 is not reproduced. Recoverable labels: fault scopes "same transfer", "same BIE", "same BM", "same SM"; elements MS, BTS, BIE, GCTN, TCSM, GNET; 16 kbit/s link and 64 kbit/s link; BSC and MSC.]
Figure 9-1 Transmission format of voice signal
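
The two link rates in 9.2.1 follow directly from the frame figures given in this chapter (8 kHz sampling with 8-bit samples, and one 320-bit TRAU frame carrying 260 bits of full rate parameters every 20 ms, as stated in 9.2.3). The short Python sketch below, with constant and function names chosen here for illustration, reproduces that arithmetic.

```python
PCM_SAMPLE_RATE_HZ = 8000   # 8 kHz sampling on the 64 kbit/s link
PCM_BITS_PER_SAMPLE = 8     # 8-bit A-law samples
FRAME_MS = 20               # one speech frame every 20 ms
TRAU_FRAME_BITS = 320       # full rate TRAU frame, including sync header and control bits
FR_PARAMETER_BITS = 260     # encoded full rate speech parameters per frame


def pcm_rate_bps() -> int:
    """Bit rate of the PCM side (FTC to MSC)."""
    return PCM_SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE


def samples_per_frame() -> int:
    """PCM samples contained in one 20 ms frame."""
    return PCM_SAMPLE_RATE_HZ * FRAME_MS // 1000


def bits_per_second(bits_per_frame: int) -> int:
    """Bit rate produced by sending bits_per_frame every FRAME_MS."""
    return bits_per_frame * 1000 // FRAME_MS
```

The PCM side works out to 64,000 bit/s with 160 samples per frame, the TRAU side to 16,000 bit/s, and the speech parameters alone to 13,000 bit/s, the net rate of the full rate codec.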
9.2.2 Transmission Path of Voice Signal
The transmission path of the GSM voice signal is given below:
MS -> Radio link (including antenna system) -> BTS (E1) -> BTS_DDF -> Trunk transmission -> BSC_DDF (E1) -> BIE (HW) -> GNET -> GOPT (optical fiber) -> GFBI -> GCTN -> E3M (E1 or transmission equipment) -> MSM -> FTC -> MSC
Note: The transmission path above applies to the multi-module environment. In the single-module environment, no boards of the AM/CM module, such as E3M, are provided.
MSC (when a signal does not go out of an SM, it is destined for the corresponding GNET; here, the SM and GNET are parts of MSC): DT (HW) -> GNET -> GOPT (optical fiber) -> GFBI -> GCTN
See Figure 9-2 for details.
[Figure 9-2 is not reproduced. Recoverable labels: BSC side (BM1, BM2) with BTS, E1, BIE, HW, GNET, GOPT, GFBI, GCTN, E3M, TCSM, radio link and transmission equipment; MSC side (SM1, SM2) with DT, HW, GNET, GOPT, GFBI, GCTN, EC; E1 link or through DDF between the two sides; other MSC / HLR / GMSC and PSTN.]
Figure 9-2 Transmission path of GSM voice signal
1) Timeslot interchange is implemented in BTS, which is not illustrated in the figure above. The BTS2X BIE or BTS3X TMU switches between TRX time slots and E1 time slots, which are in a fixed one-to-one relationship.
2) BIE implements the mapping between E1 time slots and HW time slots, which are in a fixed correspondence.
3) GNET at BSC BM implements the mapping between HW time slots and A interface time slots, which is dynamic. The mapping differs from call to call.
4) GCTN at BSC implements interchanges between the 2.048%16Mbit/s HW time slots in GFBI and those in E3M. The correspondence is not fixed and differs from call to call.
5) E3M implements interchanges between HW time slots and E1 time slots, which are in a fixed one-to-one relationship.
6) GNET at MSC implements interchanges between two A interface circuits (i.e., calls between two MSs in the same SM), between HW time slots in DT and those in GOPT (i.e., calls between two SMs in the same MSC) or between a local circuit and an outgoing circuit (e.g., calls between MSs in the local SM and local fixed-line phones, or between out-of-town fixed-line phones and MSs).
7) GCTN at MSC implements interchanges between HW time slots in two FBIs only when a call between two SMs is originated.
Note: The above analysis does not cover prepaid MSs. For such an MS, an outgoing route is needed even when the call is between two local MSs.
(2) GMPU is not illustrated in the above figure. However, both the circuit allocation performed by GMPU and the GCTN connection requests forwarded by GMCCS are relevant to the interchange of voice signals. An E3M or a TCSM is controlled and maintained by one BM, but this relationship does not apply to the A interface circuits occupied by calls under that BM. That is, a call under BM1 does not always occupy a TCSM circuit configured by BM1. In fact, the TCSMs configured by a BM usually belong to different SMs. A interface circuits are allocated by MSC; consequently, the A interface circuit occupied by a call under BM2 may correspond to a TCSM belonging to BM1. In this case, GCTN is responsible for connecting the HW in the GFBI of BM2 with that in the E3M of BM1.
9.2.3 Concepts
Full rate service: Full Rate (FR) encoding/decoding, which adopts the Regular Pulse Excitation-Long Term Prediction (RPE-LTP) algorithm.
Encoding process: TC divides the voice signals received from MSC into frames of 20 ms each. One frame of data contains 160 PCM samples, each 8 bits long. The parameters output after encoding contain 260 bits in total; together with the synchronization header and control parameters they form a 320-bit TRAU frame.
Decoding is the reverse of encoding: After receiving a TRAU frame sent from BSC, TC restores it into voice data according to the decoding algorithm and sends it to MSC.
DTX: Voice Activity Detection (VAD) and Silence Descriptor (SID) are used for Discontinuous Transmission (DTX) in GSM. When TRAU detects, via the VAD functional module, that the data received from MSC is non-voice, the voice flag bit in the TRAU frame formed after encoding is cleared. After discriminating the flag bit, BTS stops the downlink transmission until the flag bit is set again. Similarly, when receiving an uplink frame, TRAU discriminates the SID flag; a set flag means the MS is in an interim state. To make the receiving end feel that the GSM network is still serving it, TRAU injects comfort noise into the uplink through the substitution technique so that the subscriber does not think the conversation has been interrupted. Applying DTX in TRAU may lower the transmission power of BTS and MS, reduce intra-frequency interference on the radio interface and make the GSM voice signal less sensitive to exceptions on the radio interface.

9.2.4 Common Operations

I. Dial-and-test on A interface circuit
Here only the interconnection between Huawei MSC and Huawei BSC is described. For the route selection rules of other manufacturers' MSCs, refer to their corresponding documents.
If the signaling connection of the MS is in module N, MSC preferentially assigns a circuit of module N. Therefore, to test a particular trunk, first find the No. of the module corresponding to the trunk, keep the A interface link of this module in service and block the A interface links of the other modules, and then block, one by one, all circuits except the circuit to be tested.

For an MSC from another manufacturer, BSC does not provide the user interface tracing function, so a signaling analyzer (e.g., K1205 or MA10) should be attached to the A interface to trace SS7. (If many links are in use, block most of them at night and keep only one or two links in service to make tracing easier.)

To pick out the messages of the faulty call, press any key on the MS while the call is still up when the fault occurs, and then search for the START DTMF message before performing Call TRACE. In this way the assignment request of this call, and hence its CIC, can be found.

II. CIC in assignment request message of interface tracking

In the What It Means window, click Assignment Request, then click the hexadecimal digits at the bottom so that they are framed in red. Click Circuit Identity Code; the hexadecimal digits now framed in red are the CIC. Convert it to decimal and use the decimal CIC to look up the corresponding module No. and trunk circuit No. in the trunk circuit table. (The trunk circuit table is also needed for MSC.) The value 006E shown in Figure 9-3 is a hexadecimal CIC.
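The hexadecimal-to-decimal conversion described above can be sketched in a few lines of Python. The function names are hypothetical and are not part of any Huawei tool. The second function assumes the standard 3GPP TS 48.008 CIC layout for 2048 kbit/s PCM links (upper 11 bits for the PCM multiplex, lower 5 bits for the time slot); the manual does not state whether the trunk circuit table follows this layout, so the table lookup remains the authoritative step.

```python
from typing import Tuple


def cic_hex_to_decimal(cic_hex: str) -> int:
    """Convert the hexadecimal CIC shown in the trace window to decimal."""
    return int(cic_hex, 16)


def split_standard_cic(cic: int) -> Tuple[int, int]:
    """Split a 16-bit CIC into (PCM multiplex number, time slot).

    Assumption: 3GPP TS 48.008 layout for 2048 kbit/s PCM links,
    i.e. upper 11 bits = PCM multiplex, lower 5 bits = time slot.
    """
    if not 0 <= cic <= 0xFFFF:
        raise ValueError("CIC must fit in 16 bits")
    return cic >> 5, cic & 0x1F


cic = cic_hex_to_decimal("006E")
print(cic)                      # 110
print(split_standard_cic(cic))  # (3, 14)
```

With the example value 006E, the decimal CIC is 110, which is the number to look up in the trunk circuit table.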
Figure 9-3 CIC format description table

III. Monitoring system in real time

Links and circuits are blocked and unblocked frequently during dial-and-test, so the running system must be monitored in real time to catch mis-operations, for example a large number of subscribers failing to be assigned circuits because of a mis-operation. The real-time counter query function of the traffic measurement console can be used for this purpose. The procedure is as follows:

Before operating links and circuits, enable the real-time counter query function and select the item "number of assignment failures". Suppose the result is 10000 (the accumulated value since BSC started running). After operating the links and circuits, query the real-time counter again (you may click the "oblige to query" button in the query window) to get the current number of assignment failures. If it is still 10000, no new assignment failures have occurred and the operation was safe. If it has risen to 11000, the operation caused a large number of assignment failures and should be reverted immediately.

The real-time counter query function can also be used to monitor the system immediately after a system upgrade or data modification.

IV. Allocation principle of A interface circuit

When the call signaling from BSC arrives at a module in MSC, circuits of that module are allocated preferentially. Only when the circuit occupancy of the module reaches the configured threshold is the signaling transferred to other modules for circuit allocation. Therefore, in a single-module MSC, A interface circuits are allocated sequentially. In a multi-module MSC, A interface circuits are allocated first from the module at which the call signaling arrives, and only then sequentially from the other modules. If the A interface circuits are not heavily congested, the signaling is rarely transferred between modules, and the A interface circuits within the module are generally allocated sequentially.

9.2.5 Supplement

1) TC can be started only after a valid uplink TRAU frame is received. Therefore, if a fault or severe bit errors on the uplink 16 kbit/s link prevent TC from receiving a valid TRAU frame, TC cannot be started. In that case, whether the conversation over the faulty link is between two MSs or between an MS and a fixed-line phone, neither party can hear the other.
If a downlink 16 kbit/s link is faulty, only the downlink voice signal is affected. When an MS on the faulty link talks to a fixed-line phone or to another MS on a normal link, the MS on the faulty link cannot hear the other party. This situation is called single pass. If the downlink 16 kbit/s links of the channels occupied by both MSs are faulty, neither MS can hear the other.

2) Because the signal carried on a transmission line is multiplexed from many time slots, all time slots on the line fail if the line is faulty or wrongly connected. Only in devices that process individual time slots can some of the time slots fail, such as the boards that implement switching: BTS2X BIE, BTS3X TMU, BTS3X BIE, BTS3X GNET, BTS3X GCTN, BTS3X E3M, MSC GNET and MSC GCTN etc. Such faults may be caused by board failure, incorrect data configuration, host program error, poor contact of the line,
bad quality of stub or electromagnetic interference. If the EMC protection capability of a board is not good enough, some of its time slots may become faulty.

3) The same BER affects a 64 kbit/s link less than a 16 kbit/s link. Bit errors on the 64 kbit/s link produce relatively even noise without obvious fluctuation; even when the BER is so high that the noise drowns out the voice, the noise remains even. On the 16 kbit/s link, however, the compressed signal must be decoded into the voice signal, so even evenly distributed bit errors become uneven in the decoded voice, and bubbling sounds, voice discontinuity and metallic sounds may occur.

9.3 Processing of Voice Troubles

9.3.1 Analysis

For a voice fault, first judge whether it occurs inside MSC or only outside MSC. Only a call between two MSs (excluding prepaid MSs) in the same MSC is guaranteed to follow a path that stays inside the MSC. For a fault outside MSC, check the relevant outgoing equipment and data; if the data is correct, the outgoing equipment is faulty. A fault inside MSC can be located with the following procedure:

Step 1 Judge whether the fault occurs at one site or at multiple sites.

Step 2 If the fault occurs at only one site, perform dial-and-test on all carriers of the site to check whether the problem affects a time slot, a frequency or the whole site. A frequency fault may be caused by interference. For a site fault, check the transmission path from this site to GNET (including boards, lines, trunk transmission equipment etc.).

Step 3 If the fault occurs at multiple sites, check how these sites are distributed according to the data configuration and see whether they share the same transmission path, the same BIE, the same BM, the same SM or the same MSC.
• If a particular transmission path is faulty, check the corresponding transmission equipment, cables and optical fibers.
• If the faulty sites share the same BIE, check the BIE and the HW between BIE and GNET.
• If the fault affects a whole BM, check the boards and cables between GNET and GCTN.
• If the fault affects multiple BMs that correspond to one SM, and the faulty calls stay within the SM, check all boards and cables between GCTN of BSC and GNET of the SM. If the fault occurs only when the SM calls other SMs, check the boards and cables between GNET of the SM and GCTN of MSC.
• If multiple SMs cannot converse and the conversations stay within the MSC, check the boards and cables between the GNETs of these SMs and GCTN of the MSC.
Note: Besides the transmission equipment, boards and cables, also check the slots accommodating these boards, the backplanes, stubs and connectors. For convenience of description, all of these are referred to as boards and cables hereafter.
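The narrowing-down logic of Step 3 can be sketched as a small Python helper: given the configuration of each faulty site, it reports the narrowest component that all of them share, which is where the check should start. The field names, site data and function name are hypothetical and serve only as an illustration; the real values come from the BSC and MSC data configuration.

```python
from typing import Dict, List, Optional

# Narrowest scope first, mirroring the order of the checks in Step 3.
SCOPE_ORDER = ["transmission_path", "BIE", "BM", "SM", "MSC"]


def shared_scope(sites: List[Dict[str, str]]) -> Optional[str]:
    """Return the narrowest component shared by all faulty sites, or None."""
    if not sites:
        return None
    for scope in SCOPE_ORDER:
        values = {site.get(scope) for site in sites}
        # Exactly one distinct, known value means every site shares it.
        if len(values) == 1 and None not in values:
            return scope
    return None


# Hypothetical example: two faulty sites on different paths but the same BIE.
faulty_sites = [
    {"transmission_path": "P1", "BIE": "BIE-3", "BM": "BM1", "SM": "SM2", "MSC": "MSC-A"},
    {"transmission_path": "P2", "BIE": "BIE-3", "BM": "BM1", "SM": "SM2", "MSC": "MSC-A"},
]
print(shared_scope(faulty_sites))  # BIE
```

In this example the result points to the shared BIE, so the BIE and the HW between BIE and GNET would be checked first.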
9.3.2 Location Procedures

Use two test MSs and activate the Call Hold and Call Wait functions for them. Lock their frequencies to a BTS and record the version No. of the BTS. Perform the dial-and-test at night and enable user interface tracking on MSC to trace the A interface messages. When single pass occurs, do not hang up; execute the following operations:

1) Record the CICs of the calling and called parties from the traced A interface messages, and record the corresponding frequency and channel time slot from the information displayed on the two test MSs. Check whether the recorded CICs and the corresponding time slots are on one FTC and one TRX respectively.

2) Switch over the GCTN of MSC during the conversation. If single pass remains, proceed to the next step. If it disappears, the GCTN may be faulty; switch it back and check whether single pass reappears.

3) Execute this step only if Call Hold and Call Wait are activated for the two test MSs; otherwise go to the next step. When MS A cannot communicate with MS B, use MS C to call the suspected MS (assume it is MS A). MS A accepts the new call, and the original call is held. The radio-side resources used by MS A (including the radio resource and AIE) do not change between the two calls, so if the trouble persists during the second conversation, MS A might be faulty (you may use MS C to call MS B for further confirmation). Conversely, if the trouble disappears during the second conversation, MS B might be faulty.

4) Using the CIC recorded from the A interface message, find the corresponding MSC module No. in the trunk circuit table of the MSC data management console. Switch over the GNET of that MSC module (if the two MSs correspond to different modules, switch over both GNETs). If single pass remains, proceed to the next step. If the trouble disappears, the GNET of MSC or the HW connected to the active GNET is faulty. Switch the board back and check whether single pass reappears.

5) Switch over the GCTN of BSC during the conversation. If single pass remains, proceed to the next step. If the trouble disappears, the GCTN may be faulty. Switch the board back and check whether the trouble reappears.

6) Using the CIC recorded from the A interface message, find the corresponding BSC module No. in the trunk circuit table of the BSC data management console. Switch over the GNET of that BSC module. If single pass remains, proceed to the next step. If the trouble disappears, the GNET of BSC or the HW connected to the active GNET is faulty. Switch the board back and check whether the trouble reappears.

7) In the radio channel configuration table, find the trunk circuit No. of any channel of the BTS under test. If the number is N, N/256 is the BIE group No.; switch over that group of BIEs. If single pass remains, proceed to the next step. If the trouble disappears, it may be caused by BIE. Switch the BIEs back for further testing.

8) Use the test MS to perform a forced handover. Hand over the MS to a cell controlled by an adjacent site and check whether single pass disappears. If possible, an inter-BSC handover test can also be performed.

9.4 Trouble Location

9.4.1 Single Pass and No pass

[Description]
Single pass means that during a conversation only one party can hear the other, while the other party hears nothing. No pass means that neither party can hear the other.

[Analysis]
Following the voice circuit path through the system, these two types of trouble may be caused by:

1) Radio problem
Radio environment, e.g., imbalance between uplink and downlink levels, resulting in poor voice quality and interference for one party.

2) BTS fault
Hardware: Board (e.g., CDU, TRX, TMU etc.)
fault, or an error in the switching network table of TMU.
Software: Data configuration error. For instance, the time slot Nos. in [Radio Channel Configuration Table] are configured incorrectly, or the trunk mode Nos. in [Site BIE Trunk Mode Description table] are inconsistent with those in [Site BIE Configuration Table], so that a cascaded BTS cannot carry conversations normally.

3) Abis interface fault
Poor-quality devices between BTS and BIE (including the intermediate trunk transmission equipment), connectors and cables, as well as bit errors on the transmission line, may cause poor voice quality for one party.

4) BSC fault
Hardware: All boards and cables between BIE and GCTN (including the backplane).
Software: Time slot and HW configuration of BIE.

5) A interface fault
Hardware:
• Board fault: Including boards such as E3M, MSM, FTC and the DT at MSC.
• Cable fault: For example, crossed cables or crossed wire pairs.
• DIP switch setting error: The DIP switches on 12FTC and 13FTC set whether these FTCs are multiplexed, and MSM also carries a DIP switch that sets the time slot occupied by the FTC maintenance control messages. If these DIP switches are set incorrectly, single pass or no pass may occur.
Software: CIC configuration. The CIC configuration sets whether an A interface trunk circuit is available. If 12FTC is in use, EFR service must not be configured; otherwise single pass may occur when an MS calls a fixed-line phone, or no pass may occur when an MS calls another MS. For a group of multiplexed TCSMs, the four time slots that carry signaling and correspond to the four FTCs should be set to unavailable. The last time slot of the last FTC, which is used as the maintenance time slot, should be set to unavailable as well; otherwise the two parties in a conversation may not be able to hear each other.

6) MSC
Hardware:
• Boards such as DT, GNET and GCTN are faulty or in poor contact with the backplane, or the backplanes or slots are faulty.
• Cables are damaged or in poor contact.
For instance, the HW cable between DT and GNET, the optical fiber between SM and AM, or the outgoing trunk cables are damaged or in poor contact.
Software: [Semi-permanent Connection Table] or the outgoing trunk data is configured incorrectly.

Sometimes a faulty MS may also cause such troubles.

[Handling suggestion]
Judge whether only some of the BTSs under a BM are faulty, or whether the whole BM or even more equipment has failed. If the fault occurs only on outgoing calls, check the associated outgoing trunk and data. For an intra-office fault, check all parts within the fault scope.

9.4.2 Echo

[Description]
Echo generally means that when a digital MS calls another digital MS or a fixed-line phone, one party hears its own voice in addition to the voice of the other party.
Voice loopback means that when a digital MS calls another digital MS or a fixed-line phone, one party hears only its own voice while the other party hears nothing.

[Analysis]
1) Echo occurring in case of conversation between two MSs
Such echo is acoustic echo. The acoustic isolation of some MSs does not meet the requirement of the GSM protocol, so the sound from the receiver easily reaches the microphone. This sound is then encoded and sent to BTS, and finally to the opposite MS. The echo is thus produced by the local MS, carried through the system and heard at the opposite MS. Acoustic echo has nothing to do with the time slots of FTC, the carrier time slots of BTS or data reloading, but is related to the MS type.

2) Echo occurring in case of conversation between MS and fixed-line phone
Such echo is electrical echo. Impedance mismatch of the hybrid converter in the PSTN couples the transmitted signal into the receiving line, so echoes occur at the 4-wire end.

3) Voice loopback
If an intra-office call is looped back, the usual cause is a hardware loopback due to wrong connection of the A interface trunk cable. If an outgoing call is looped back, it may be caused by a hardware loopback on the outgoing trunk. If GNET or GCTN of BSC or MSC is faulty, time slot switching errors or even loopback may result.
[Handling suggestion]
Based on the causes analyzed above, the following measures can be taken:

1) Echo occurring in case of conversation between two MSs
Acoustic echo depends on the acoustic environment in which the MS is located (e.g., background noise, surrounding obstacles, room size and climate). Therefore, an MS that can generate echo does not always do so. The following method can be used to confirm that the opposite MS generates acoustic echo: adjust the volume of the opposite MS, and the local party perceives an obvious change in the echo.

2) Echo occurring in case of conversation between MS and fixed-line phone
If the MS hears heavy echo during a conversation between a fixed-line phone and a digital MS, the echo is electrical echo, which may result from the lack of an Echo Canceller (EC). Look up the route data of the call and make sure it is correct. If it is, check whether the EC of the corresponding mobile network equipment is configured correctly, following the principle that the EC is placed near the PSTN, so that the EC works normally.
Occasionally, loud echo may occur when a digital MS calls a fixed-line phone. This happens when the hybrid coil in the fixed network is substandard and the echo it generates exceeds the processing capability of the EC, so loud echo appears in the digital mobile network.
As soon as a call is set up, the EC adaptively searches for the parameters on which echo cancellation depends. If this parameter search is too slow or unsuccessful, temporary or continuous echo may occur just after call setup. Infrequent occurrence of this phenomenon is normal.

3) Voice loopback
Using the numbers of the calling and called parties and the time at which the loopback occurred, find the corresponding CDRs in MSC and confirm whether the looped-back calls went through the same route. Then check whether the trunk cable corresponding to the route is connected incorrectly.
When an intra-office call is looped back, block the A interface circuits so that only the 32 circuits of one trunk are idle. Then perform dial-and-test on each trunk of the A interface in turn to check whether loopback exists. If it does, check whether the corresponding trunk is connected incorrectly according to the CIC occupied by the call.
If only outgoing calls are looped back, perform dial-and-test on the outgoing trunk to check whether loopback exists. If it does, check whether the corresponding trunk is connected incorrectly according to the CIC occupied by the call.
If all trunks prove to be connected correctly (although the connections in the other offices on the outgoing route cannot be guaranteed) and the trouble remains, try switching over GNET and GCTN at MSC.
If the trouble still cannot be eliminated after the above operations and it is confirmed that an outgoing call was looped back, check whether the equipment and cables of the other offices on the outgoing route are in a normal state.

9.4.3 Voice Discontinuity

[Description]
The voice in the conversation is intermittent. Parts of the conversation are lost or, in worse cases, the whole conversation becomes difficult for both parties.

[Analysis]
The causes of voice discontinuity are as follows:

1) Frequent handover
The GSM system uses hard handover. The handover from the source channel to the destination channel can cause the loss of downlink voice frames on the Abis interface, so some voice discontinuity during handover is inevitable. Occasional handover is hardly noticeable, while frequent handover at cell edges or due to cell overlap may make the conversation discontinuous. Such trouble can be avoided by optimizing the network, adjusting the tilt and height of the antenna, and configuring the parameters "interference handover uplink/downlink quality threshold" and "emergency handover uplink/downlink quality limit threshold".
Location method: Use a test MS to check whether the channel occupied by the MS changes continuously during a conversation.

2) Radio link interference
Interference may increase the BER on the radio link and cause voice discontinuity. In addition, conversation quality may be poor owing to signal fluctuation at cell edges.
Location method: Perform a drive test with a test MS and use the Ant Pilot network optimization software to analyze whether radio link interference exists.

3) BTS transmission fault
Check whether all connectors (including connectors on the DDF) are in good condition. For optical fiber transmission, check whether the fiber connectors are clean and whether the BER is high. Microwave transmission may be affected by climate. Note that the 75 Ω coaxial cable laid inside the BTS cabinet from the transmission interface boards (such as 42BIE, TMU) to the cabinet top may develop poor contact after long use. Dusty connectors in the cabinet may also affect the conversation. If both microwave transmission and fiber transmission are used, make sure the two types of equipment are matched in interface transmission impedance.
Location method: Check whether the alarm system reports transmission alarms such as the BTS BIE remote alarm or the LAPD link alarm. Furthermore, testing the transmission path for bit errors is the most effective and convenient method.
Caution: Sending a pseudo-random code and using the clock of the BER tester ensures that the BER test result fully reflects the quality of the link.
4) Carrier board fault
Location method: Use a test MS to check whether the channel or frequency occupied by the MS changes continuously during a conversation, and whether the frequency and time slot are always the same when no voice is transmitted.

[Suggestion]
To locate a voice discontinuity trouble, perform dial-and-test to find where the trouble occurs and then judge which kind of fault it is using the above methods.

9.4.4 Noise

[Description]
Bubbling sounds, clicks and metallic sounds heard during a conversation are called noise. In worse cases, only noise and no voice can be heard.

[Analysis]
Noise is generally caused by bit errors. Besides a fault of any board, connector or cable on the path of the voice signal, a grounding error, interference, a clock fault or a wrong DIP switch setting may result in bit errors. Interference on the radio link can cause bit errors, while loss of clock synchronization leads to frame slips or frame loss. A wrong DIP switch setting may also produce errored bits, although this mistake is rare.
Errored bits have different effects depending on where they occur. Errored bits on the line from the A interface to MSC affect PCM samples, so the resulting noise is relatively even, because the noise is simply superimposed on the voice. Errored bits on the line from the A interface to BSC affect the compressed voice signal, which must be decoded before it is heard, even though these errored bits are also evenly distributed. Consequently, the resulting noise (bubbling sounds, a sense of discontinuity and metallic sounds) fluctuates greatly and makes some sentences unintelligible. Frame slips or frame loss caused by loss of clock synchronization occur at regular intervals, so the noise appears regularly during a conversation.

[Handling suggestion]
First determine the cause of the noise from its characteristics, then determine the scope of the location test from the place where the noise occurs.

1) BTS fault
Possible causes:
• Trunk transmission bit error
• TRX fault, including version incompatibility between TRX software and hardware
• FPU fault
• CDU fault
• MCK fault, which may make the BTS clock frequency unstable and thus affect the quality of conversation
• Interference on radio channel
• Antenna fault
Location:
• Perform dial-and-test with a test MS
• Check whether a related alarm is reported
• Trace messages, or check the signal quality and whether interference exists as seen from the MS
• Use an antenna tester to test the antenna system
• Check whether the grounding system is faulty
• Perform a transmission bit error test
• Test the clock signal of the faulty BTS

2) BIE related noise
Possible causes:
• BIE fault
• HW fault between BIE and GNET
• Transmission error from BIE to BTS
Location:
• Check whether a related alarm is reported (PCM alarm)
• Perform a transmission bit error test
• Replace the BIE
• Connect the BTS to another BIE
• Check the HW and E1 cable of the BIE
• Switch over the GNET

3) BM related noise
Possible causes:
• Switching network board (GNET and GCTN) fault
• Optical fiber interface circuit fault
Location:
• Check the connection of optical fibers and related connectors
• Check the HW of GNET
• Switch over GNET or GCTN

4) Intra-office noise
Possible causes:
• A interface dependent board and cable fault (E3M → E1 → MSM → internal HW → FTC → E1 or trunk → DT → HW → GNET at MSC)
• Inter-module circuit fault: optical fiber interface circuit, GCTN at MSC
Location:
• Enable GSM user interface trace at MSC and perform dial-and-test on the A interface circuits. From the interface messages, determine the trunk occupied by the faulty call and work out the corresponding board, then check the related boards and cables (including all boards, cables and backplanes from E3M to GNET)
• If there is trunk equipment between BSC and MSC, test it (check whether it introduces errored bits and whether the grounding system is correct)
• Check whether a related alarm is reported
• If the faulty call is forwarded between modules at MSC, check the associated connections and connectors. Switch over GCTN at MSC if necessary

5) Outgoing noise
Based on the outgoing route, check the related outgoing trunk equipment and cables.

9.4.5 Cross-talking

[Description]
During a conversation, the voice of a third party is heard in addition to that of the other party, or only the voice of the third party is heard. This phenomenon is called cross-talking.

[Analysis]
Cross-talking generally occurs when the call passes through the outgoing route. Wrong data configuration (e.g., CIC) or incorrect physical connection (e.g., the E1 cable of the A interface) may cause cross-talking accompanied by single pass or no pass.

[Handling suggestion]
Use the same location method as for single pass/no pass.

9.5 Fault Examples

9.5.1 Cross-talking Resulting from Improper Data Configuration

[Description]
Single pass/no pass accompanied by cross-talking occurred during conversations near a BTS.

[Troubleshooting process]
1) After many on-site dial-and-tests, the on-site engineer found that single pass and no pass occurred when a TCH time slot at frequency 37 was assigned. Further dial-and-tests showed that single pass occurred whenever one MS was assigned to the master BCCH carrier (at frequency 37), and no pass occurred whenever both MSs were assigned to the master BCCH carrier. After a while, cross-talking occurred, and only the voice of one party could be heard.
2) The associated BIE was switched over as soon as the trouble appeared, but the trouble remained.
3) The engineer checked the radio channel configuration table of the faulty TRX12 and found that its trunk circuit Nos. were 1110 and 1117, the same as the trunk circuit Nos. of TRX9.
4) Duplicate trunk circuit Nos. can cause single pass, no pass and even cross-talking.
5) After the related data had been corrected through dynamic data setting, no single pass or cross-talking occurred during on-site dial-and-tests.

[Analysis]
The trunk circuit Nos. in the radio channel configuration table are delivered to the TMU of the BTS (the BIE for BTS20) for switching the time slots of a carrier. The carrier signaling uses the data in the radio channel connection table, so the signaling is normal.
However, because the TCHs that carry voice are configured with duplicate trunk circuit Nos., the TCH assigned on this carrier is switched to the corresponding TCH of TRX9, which results in single pass. If the corresponding TCH of TRX9 is in use for a conversation, the uplink voice of TRX9 may be switched to the corresponding time slot of this TRX, and cross-talking occurs.
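A configuration error of this kind can be caught by scanning an export of the radio channel configuration table for trunk circuit Nos. that appear more than once. The following Python sketch assumes a hypothetical export in which each row is (TRX name, channel time slot, trunk circuit No.); the row layout, the time slot numbers and the function name are illustrative and not taken from the BSC data management console.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_duplicate_trunk_circuits(
    rows: Iterable[Tuple[str, int, int]],
) -> Dict[int, List[Tuple[str, int]]]:
    """Map each trunk circuit No. used more than once to its (TRX, time slot) users."""
    users = defaultdict(list)
    for trx, timeslot, trunk_no in rows:
        users[trunk_no].append((trx, timeslot))
    return {no: who for no, who in users.items() if len(who) > 1}


# Hypothetical rows reproducing the fault example: TRX12 reuses the
# trunk circuit Nos. already configured for TRX9.
rows = [
    ("TRX9", 1, 1110),
    ("TRX9", 2, 1117),
    ("TRX12", 1, 1110),
    ("TRX12", 2, 1117),
    ("TRX10", 1, 1120),
]
for trunk_no, who in sorted(find_duplicate_trunk_circuits(rows).items()):
    print(trunk_no, who)
```

Any trunk circuit No. reported by such a check is a candidate cause of single pass, no pass or cross-talking and should be corrected in the configuration data.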
9.5.2 Voice Discontinuity Resulting from BCCH Carrier Mutual-assistance

[Description]
Users near a BTS complained that conversations were discontinuous.

[Alarm information]
TRX hardware alarm 21:59:41
Mutual-assistance occurred to BCCH of the cell 22:00:15
TRX hardware alarm recovery 22:09:42
BCCH mutual-assistance switchback 22:10:12
TRX hardware alarm 22:10:19

[Troubleshooting process]
1) To prevent this cyclic switchover, the faulty TRX had to be blocked and taken out of service before it was replaced.
2) Because the TRX was in the Disable state while the under-power alarm was present, the remote operation did not take effect.
3) Because carrier mutual-assistance applied to the faulty TRX, the alarm disappeared temporarily after ten minutes. As the faulty carrier was also the TRX carrying the master BCCH, its power amplifier had to be activated. Once it was, the alarm occurred again and carrier mutual-assistance was triggered again.
4) The normal working status lasted so briefly that there was no time to block the TRX remotely. Therefore, the faulty carrier should be replaced as soon as possible.
5) If the carrier cannot be replaced immediately, it is recommended to switch off the power of the faulty TRX.

[Analysis]
When carrier mutual-assistance is triggered by the under-power alarm, the channel configuration of the first carrier is changed and its power amplifier is disabled. The second carrier is configured with BCCH and SDCCH, and the conversations are handed over to the second carrier. According to the judgement principle of the under-power alarm, the power amplifier is enabled every ten minutes after the alarm occurs and detection is performed 10 ms later. If no alarm signal is detected, the alarm is considered to have disappeared.
The first carrier has had its power amplifier disabled and carries no conversation, so when the power amplifier is re-enabled after ten minutes, the detection 10 ms later considers the fault of the first carrier to be recovered.
As per the cell BCCH mutual-assistance procedure, the original BCCH carrier performs a BCCH mutual-assistance switchback to reconfigure channels after recovery. However, once the TRX is active again it transmits power, generates the under-power alarm and triggers cell BCCH mutual-assistance once more, so handover between the two carriers never ceases. A call that has just been handed over to the carrier whose power amplifier was enabled has no time to be handed back to the second carrier during the second-to-first handover, so conversation discontinuity occurs.

9.5.3 Single Pass Resulting from MS Fault

[Description]
Single pass (one-way audio) occurred every time an MS of a certain type originated a call to a fixed-line phone or to another MS.

[Troubleshooting process]
1) When several MSs of other types originated calls to the problem MS, single pass always occurred with the problem MS as the called party. But when dial-and-tests were made among the MSs of other types, single pass did not occur at all. Therefore, the MS of this type might be faulty.
2) Afterwards, dial-and-tests were performed in the equipment room with the user interface tracking function enabled, and the traced signaling was found to be normal. Therefore, it could be concluded that the MS of this type was faulty.
3) After being tapped several times, the MS was used again for dial-and-test. Whether it served as the calling party or the called party, single pass no longer occurred. Then MSs of other types were used for several dial-and-tests, and single pass still did not appear.

[Analysis]
After long use, and especially after being dropped, the parts in an MS may become loose. Therefore, once single pass occurs, check the quality of the MS first.

9.5.4 Noise Resulting from Poor Contact of E1

[Description]
Loud noise always occurred when users served by one BM of a BSC made calls.
Dial-and-test confirmed that the noise had the following features:
1) Noise description
Voice signals became faint (without discontinuity or distortion; apart from the noise, the voice could be considered normal), while the noise became loud
(the noise was similar to white noise), and the voice signal and the noise overlapped. RQ in the test MS was 0. When the MS served as the calling party, loud noise appeared as soon as the ringback tone was returned.
2) Prevalence
Dial-and-test on the faulty BTS showed that the noise was independent of the place where the BTS resided, the BIE the BTS belonged to and the transmission path between the BTS and the BIE. The noise existed in all cells of all BTSs.
3) Downlink
4) Random occurrence. The occurrence probability increased with traffic.

[Troubleshooting process]
1) Based on the above features, it can be concluded that:
- According to features (1) and (4), the noise was caused by a fault of the circuit from A interface to MSC. (The noise and the voice signal overlapped. The noise could be caused by bit errors in PCM samples or on a 64 kbit/s link, and its occurrence probability is related to the polling of A interface circuits.)
- As per feature (2), the faulty point should be in the range from the BIE of the BM to MSC.
2) GNET and GOPT were switched over, and optical fibers and fiber connectors were unplugged, cleaned and plugged in again. GCTN was then switched over for the line from A interface to BSC, but the noise did not disappear at all.

A interface circuit allocation principle: when call signaling from BSC arrives at a module in MSC, circuits are allocated preferentially in that module. Only when the circuit occupancy in the module reaches the stated threshold is the signaling transferred to other modules for circuit allocation. If the A interface circuits are not heavily congested, the signaling is hardly ever transferred among modules and the A interface circuits are allocated almost sequentially within the module. For a single-module MSC, A interface circuits are allocated sequentially.
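The allocation principle above can be sketched as follows. This is an illustration under stated assumptions, not the MSC implementation: the class `Module`, the function `allocate`, the threshold value and the cyclic reading of "sequential" allocation are all hypothetical.

```python
class Module:
    """One MSC module with its own pool of A interface circuits (hypothetical model)."""

    def __init__(self, name, circuits, threshold):
        self.name = name
        self.circuits = circuits      # number of circuits in this module
        self.threshold = threshold    # occupancy fraction that triggers transfer
        self.busy = set()
        self._next = 0                # assumption: sequential means cyclic polling

    def occupancy(self):
        return len(self.busy) / self.circuits

    def seize(self):
        """Seize the next free circuit in cyclic order, or return None if full."""
        for offset in range(self.circuits):
            cic = (self._next + offset) % self.circuits
            if cic not in self.busy:
                self.busy.add(cic)
                self._next = (cic + 1) % self.circuits
                return cic
        return None

    def release(self, cic):
        self.busy.discard(cic)


def allocate(modules, arrival):
    """Allocate a circuit for signaling that arrived at module `arrival`."""
    home = modules[arrival]
    others = [m for m in modules.values() if m is not home]
    # Prefer the home module until its occupancy reaches the threshold.
    if home.occupancy() < home.threshold:
        order = [home] + others
    else:
        order = others + [home]
    for module in order:
        cic = module.seize()
        if cic is not None:
            return module.name, cic
    return None


if __name__ == "__main__":
    mods = {n: Module(n, circuits=4, threshold=0.75) for n in ("SM1", "SM2")}
    for _ in range(4):
        print(allocate(mods, "SM1"))
```

Under this model, with light traffic every call arriving at SM1 polls through the SM1 circuits in turn, so a single faulty circuit is hit at random from the user's point of view and more often as traffic grows, which is consistent with feature (4).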
BSC and MSC each have four modules whose A interfaces are in a one-to-one relationship; as a result, signaling from multiple BMs cannot reach the same SM. Furthermore, as the A interface circuits have sufficient capacity, signaling transfer usually does not occur at MSC. In addition, the dial-and-test was performed between two MSs under the coverage of the same BTS. Therefore, it could almost be concluded that the A interface circuit from BM to MSC, namely the FTC, DT and GNET path of SM1, was faulty.
By performing dial-and-test on the A interface circuits between BM1 and MSC, the trunk circuit (DT No. 11) corresponding to FTC No. 87 was confirmed to be faulty. After FTC and DT were switched over in turn, the trouble still remained. A trunk alarm was raised when the E1 cable was touched firmly near the
connectors at the back of the two boards, which indicated that the E1 cable was in poor contact with the plug of the DTM backplane. After the E1 cable was unplugged and plugged in again, it was tested once more and the trouble disappeared.

[Analysis]
If the noise is superimposed on the voice, it usually indicates that the line from A interface to MSC is faulty. If the noise is random and irregular, the line from TC to BSC may be faulty.

9.5.5 Voice Loopback Resulting from Outgoing Cabling

[Description]
During conversation, the local party could hear only its own voice instead of the voice of the opposite party, while the opposite party could hear nothing.

[Analysis]
As the physical uplink and downlink of the radio interface are separated by frequency, and those of the Abis interface are asymmetrical, loopback cannot occur between the Abis interface and BTS. Frequency hopping cannot cause loopback either. The data configuration was checked carefully and found to be correct. Voice loopback is probably caused by loopback on the physical layer, and may occur in the following cases:
1) As most voice loopback occurred on outgoing calls, it may occur on the trunk between local MSC and tandem exchange.
2) Loopback may occur on the trunk between local MSC and TMSC.
3) Loopback may occur in transmission equipment.

[Troubleshooting process]
1) By performing traversal dial-and-test on the trunk circuits corresponding to all FTCs, it was confirmed that loopback did not occur on the trunk of the local A interface.
2) Dial-and-test was performed on the trunk between local MSC and tandem exchange.
3) Physical line loopback was found on the trunk between local MSC and tandem exchange, which caused voice loopback on outgoing calls.
4) Re-laying the trunk cabling between local MSC and tandem exchange solved this trouble.

Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 10 Troubleshooting for Call Drop
Chapter 10 Troubleshooting for Call Drop

10.1 Overview
For a GSM network, the call-dropping failure rate is an important index of radio network quality. This chapter analyzes the causes of call drop and describes troubleshooting methods for reducing the call-dropping failure rate and thus improving network quality. It also introduces measures for dealing with worst cells caused by a high call-dropping failure rate, so as to reduce the worst cell ratio and thereby the call-dropping failure rate.

The worst cell ratio is defined according to network size:
Worst cell ratio in super network: number of worst cells / number of cells where the busy time average traffic per channel exceeds 0.15 Erl.
Worst cell ratio in large network: number of worst cells / number of cells where the busy time average traffic per channel exceeds 0.12 Erl.
Worst cell ratio in medium network: number of worst cells / number of cells where the busy time average traffic per channel exceeds 0.1 Erl.

Definition of worst cell: a cell where the busy time TCH congestion rate (not including handover) is greater than 5%, or the TCH call-dropping failure rate is greater than 3%.

Definition of call-dropping failure rate: Call-dropping failure rate = [busy time total TCH traffic * 60] / total number of busy time TCH call drops, in which the number of call drops is the number of Clear Request messages.

10.1.1 Description
There are two types of call drop:
- Call drop over SDCCH: the call drop occurs after BSC has assigned an SDCCH to an MS but before a TCH has been successfully assigned.
- Call drop over TCH: the call drop occurs after BSC has successfully assigned a TCH to the MS.

There are three causes of call drop:
- Radio link failure, which occurs in the course of communication and means that messages cannot be received.
- T3103 timeout, which indicates that the MS can neither occupy a channel of the destination cell nor return to the original channel.
- System failure, such as equipment failure.
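The worst cell definitions in 10.1 can be sketched as a small calculation. This is an illustrative sketch only: the function and field names are hypothetical, and counting worst cells only among the cells that exceed the traffic limit is an assumption, since the definition does not state it explicitly.

```python
# Busy time average traffic per channel (Erl) that a cell must exceed to be
# counted, by network size, as given in 10.1.
TRAFFIC_LIMIT_ERL = {"super": 0.15, "large": 0.12, "medium": 0.10}


def is_worst_cell(tch_congestion_pct, tch_drop_pct):
    """Worst cell: TCH congestion rate > 5% or TCH call-dropping failure rate > 3%."""
    return tch_congestion_pct > 5.0 or tch_drop_pct > 3.0


def worst_cell_ratio(cells, network_size):
    """Return worst cells / counted cells for the given network size.

    Assumption: worst cells are counted only among the cells whose traffic
    per channel exceeds the limit for that network size.
    """
    limit = TRAFFIC_LIMIT_ERL[network_size]
    counted = [c for c in cells if c["erl_per_channel"] > limit]
    if not counted:
        return 0.0
    worst = [
        c for c in counted
        if is_worst_cell(c["tch_congestion_pct"], c["tch_drop_pct"])
    ]
    return len(worst) / len(counted)


# Hypothetical busy time statistics for four cells.
CELLS = [
    {"name": "A", "erl_per_channel": 0.20, "tch_congestion_pct": 6.0, "tch_drop_pct": 1.0},
    {"name": "B", "erl_per_channel": 0.13, "tch_congestion_pct": 1.0, "tch_drop_pct": 1.0},
    {"name": "C", "erl_per_channel": 0.05, "tch_congestion_pct": 10.0, "tch_drop_pct": 10.0},
    {"name": "D", "erl_per_channel": 0.30, "tch_congestion_pct": 0.0, "tch_drop_pct": 3.5},
]

if __name__ == "__main__":
    for size in ("super", "large", "medium"):
        print(size, round(worst_cell_ratio(CELLS, size), 3))
```

Cell C in the sample data has poor statistics but carries too little traffic to be counted, which shows why the traffic limit matters when comparing ratios between networks of different sizes.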
1) Among the three causes, radio link failure is the main factor
During a conversation, when the voice quality at an MS is unacceptable and cannot be improved through radio frequency power control or handover, the MS considers the radio link faulty and forcibly releases it, which causes call drop.
As stated in the GSM specification, the MS maintains a counter S. As soon as a conversation starts, the counter is assigned an initial value, which is the parameter Radio Link Timeout. If the MS fails to decode a SACCH message with period of 120 ms, S is decreased by 1. Conversely, every time the MS receives a SACCH message successfully, S is increased by 2, but S cannot exceed the initial value. When S reaches 0, the MS reports radio link failure.
The signaling procedure is shown in Figure 10-1. Steps (1) and (2) show that SDCCH/TCH has been established; in step (3) the SACCH message block (uplink/downlink) cannot be decoded, which causes radio link timeout.
The SACCH multiframe number in the cell attributes table (counted in SACCH multiframe periods of 480 ms) defines the uplink connection failure time. When BTS detects that an activated connection on the radio link is broken, it reports the message Connection Failure. The system judges whether a connection has failed either by the uplink BER or by checking whether SACCH can be decoded correctly. As stated by the GSM protocol, if the BER of SACCH is used as the criterion, BTS reports the message Connection Failure to BSC when the uplink BER exceeds the set threshold in N consecutive SACCH multiframe periods. N is set during data configuration; it is the SACCH multiframe number in the cell attributes table, counted in units of 480 ms.
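The two criteria described above can be sketched as follows. This is a minimal simulation, assuming each list element represents one SACCH period; the function names are hypothetical and the code is not taken from any MS or BTS software.

```python
def ms_radio_link_failure(radio_link_timeout, sacch_decoded):
    """Simulate the MS counter S.

    S starts at the Radio Link Timeout value, is decreased by 1 for each
    SACCH message that cannot be decoded and increased by 2 (capped at the
    initial value) for each one decoded successfully. Returns the 1-based
    index of the SACCH message at which S reaches 0, or None.
    """
    s = radio_link_timeout
    for index, ok in enumerate(sacch_decoded, start=1):
        s = min(radio_link_timeout, s + 2) if ok else s - 1
        if s <= 0:
            return index
    return None


def bts_connection_failure(n, uplink_bad):
    """Simulate the BTS criterion.

    Returns the 1-based index of the SACCH multiframe period at which the
    uplink has been bad for n consecutive periods, or None.
    """
    run = 0
    for index, bad in enumerate(uplink_bad, start=1):
        run = run + 1 if bad else 0
        if run >= n:
            return index
    return None


if __name__ == "__main__":
    # Four consecutive decoding failures with Radio Link Timeout = 4.
    print(ms_radio_link_failure(4, [False] * 4))
    # One successful decode in between refills the counter up to the cap.
    print(ms_radio_link_failure(4, [False, False, True] + [False] * 4))
    # Alternating failure and success never reaches 0.
    print(ms_radio_link_failure(4, [False, True] * 10))
```

Because a success adds 2 while a failure subtracts only 1, the MS side tolerates scattered decoding failures, whereas the BTS side as described here requires N consecutive bad periods before Connection Failure is reported.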
In addition, if layer 2 frames cannot be exchanged with the MS normally, layer 2 of the BTS radio interface reports the message Error Indication to BSC, as shown in step (3) of Figure 10-1. The cause is T200 timeout; BSC then releases the radio link and reports the message Clear REQ.

[Figure 10-1 diagram. Entities: MSC, BSC, BTS, MS. Messages: Measurement Result, Connection Failure, Clear REQ (Radio Interface Failure). Steps: (1), (2), (3).]
Figure 10-1 Signaling flow of radio link failure

2) T3103
(a) Definition: During an intra-BSS or inter-BSS handover, BSC uses T3103 to keep TCHs reserved in both the cell initiating the handover and the destination cell. The timer is started as soon as BSC sends the message Handover Command, and is cleared when Handover Complete (for intra-BSS handover) or Clear Command (for inter-BSS handover) is received.
(b) The timer keeps the channel reserved long enough for the MS to return to it. If the MS is lost, however, the timer is used to release the channel. The timer starts when BSS sends a handover command to the MS. It is reset after BSC receives a Handover Complete message from the destination cell or a Handover Failure message from the source cell. If BSC receives no message before T3103 expires after it has sent a Handover Command message to BTS, it considers that radio link failure has occurred in the source cell and releases the channel of the source cell. The signaling flow is shown in Figure 10-2.

[Figure 10-2 diagram. Entities: MSC, BSC, BTS1, BTS2, MS. Messages: Channel Activate, Handover Indication, Channel ACK, Handover Command, Handover Access, Handover Detection, Physical Info (TA), SABM, Establish Indication, UA, Handover Complete. Timer events: Set T3103, Reset T3103.]
Figure 10-2 Call drop resulting from T3103 timeout

3) See example 8 for a detailed description of call drop resulting from causes such as equipment failure.

10.1.2 Formula for Call Drop
1) TCH call-dropping failure rate = number of TCH call drops / number of times TCH is occupied successfully × 100%
2) TCH call drop measurement point: the channel currently occupied is of TCH type when BSC sends a Clear Request message to MSC.
3) The cause values for sending Clear Request are as follows:
- Radio Interface Message Failure
- O&M Intervention
- Equipment Failure
- Protocol Error Between BSS and MSC
- Preemption
The signaling flow is shown in Figure 10-3.
[Figure 10-3 diagram. Entities: BSC, BTS, MS. Labels: Call Assignment Procedure Signaling, Handover Procedure Signaling, Connection or Connection Acknowledge, Handover Complete, Connection Failure, Error Indication.]
Figure 10-3 Signaling flow for TCH call drop

Formula of SDCCH call-dropping failure rate:
SDCCH call-dropping failure rate = number of SDCCH call drops / total number of times SDCCH is occupied successfully × 100%
SDCCH call-dropping failure rate (%) = [number of radio link failures when SDCCH is occupied (connection failure) + number of radio link failures when SDCCH is occupied (error indication) + number of terrestrial link failures when SDCCH is occupied (Abis)] / total number of times SDCCH is occupied successfully × 100%
SDCCH call drop measurement point: the channel currently occupied is of SDCCH type when the messages Clear REQ and Error Indication are sent to MSC.

10.2 Causes

10.2.1 Coverage

I. Analysis
1) Discontinuous coverage (blind area)
Call drop can be caused by an isolated BTS. As the signal is weak and of poor quality at the edge of an isolated BTS, handover to other cells cannot be performed, and call drop occurs. If the BTS lies where the terrain is intricate and the radio propagation environment is complicated (e.g., a mountainous area), discontinuous coverage may cause call drop.
2) Poor indoor coverage
In densely built-up areas, call drop easily occurs due to high transmission attenuation, low indoor level and high penetration loss.
3) Beyond coverage (isolated island)
For some reason, the coverage of a serving cell extends beyond the defined coverage. For example, the power of cell A is so high that an MS is still served by cell A after it has moved out of the coverage of cell B, which cell A has defined as an adjacent cell, and has reached cell C. However, cell A has not defined cell C as an adjacent cell. When the MS tries to perform a handover according to the adjacent cell B provided by cell A, it cannot find a proper cell, and call drop occurs, as shown in Figure 10-4.

[Figure 10-4 diagram. Cell A, Cell B, Cell C; expected coverage and actual coverage; annotation: the next cell cannot be found, which causes call drop.]
Figure 10-4 Call drop resulting from overlarge coverage

4) Shortage of coverage
This may be caused by an equipment failure in a cell. For example, the antenna is obstructed or the carrier bearing BCCH (power amplifier) is faulty.

II. Location
Get familiar with the area where coverage is insufficient and perform a wide-range test. Observe the signal level, whether handover is normal and whether call drop occurs. In addition, use OMC traffic measurement to check the BSC call-dropping failure rates, find the cells with high call-dropping failure rates and examine other relevant statistics to facilitate location. The related traffic measurement tasks and items are listed below:
1) In power control performance measurement, see whether the average uplink/downlink signal strength is too low.
2) In receive signal level performance measurement, see whether the proportion of low receive signal levels is too large.
3) In cell/inter-cell handover performance measurement, see whether the level class and the average receive signal level are too low when a handover is initiated.
4) In call drop performance measurement, see whether the level is too low when call drop occurs, and whether TA is abnormal prior to call drop.
5) In defined adjacent cell performance measurement, based on the statistics of the adjacent cells defined in the cell adjacent relationship table and reported by MS, locate the adjacent cells in which the average level is too low.
6) In undefined adjacent cell performance measurement, see whether there are undefined adjacent cells in which the average level is too high.
7) In power control performance measurement, see whether the maximum distance between MS and BTS, namely TA, remains abnormal over multiple consecutive time segments.

III. Solution
1) Find the areas that are short of coverage
Perform tests to find the areas that are short of coverage. For an isolated BTS or a BTS in a mountainous area, continuous coverage can be formed by adding BTSs or by expanding the original coverage in other ways, such as raising the maximum transmit power of the BTS or adjusting the azimuth, downtilt and height of the antenna. Analyze whether the trouble is caused by the surroundings, e.g., a tunnel, mall, subway entrance, underground parking lot or depression. Call drop generally occurs easily at such places, and micro cells can be used to handle the trouble.
2) To guarantee indoor communication quality, the outdoor signals must be strong enough. If indoor communication quality cannot be improved greatly by raising the maximum transmit power of the BTS or adjusting the azimuth, downtilt and height of the antenna, adding BTSs can help. To build up indoor coverage of major public venues such as office buildings and hotels, an indoor distribution system can be used.
3) For a cell where beyond coverage may occur, define all its potential adjacent cells to reduce call drops resulting from the lack of a proper cell for handover. The problem of beyond coverage can be solved by adjusting the antenna downtilt of the BTS to shrink its coverage.
4) Removing hardware failure. Perform tests to judge whether a hardware failure has occurred and caused the shortage of coverage.
If the call-dropping failure rate of a BTS rises abruptly while all other indices remain normal, check whether the adjacent cells work normally. Such trouble may be caused by a downlink failure, such as failure of TRX, diversity unit or antenna, because uplink faults would cause a high handover failure rate in the original cell.

10.2.2 Handover

I. Analysis
1) Unreasonable parameters
For example, if the minimum level of the handover candidate cell is set too low and the handover threshold is set too low, some MSs will be handed over to the adjacent cell when the level of the adjacent cell is a little stronger than that of the
serving cell for a time. But if the signal of the adjacent cell fades after a while and no proper cell happens to be available for handover, call drop can occur. See example 6 for call drop resulting from improper settings of handover parameters.
2) Adjacent cell undefined
If an adjacent cell has not been defined, the MS keeps communicating in the serving cell until it goes out of its coverage. Call drop then occurs because the MS cannot be handed over to a cell with stronger signals.
3) Existence of adjacent cells with the same BSIC and BCCH frequency.
4) Traffic congestion
Unbalanced traffic may cause handover failure because no handover channel is available in the destination BTS. When reestablishment on the original channel fails too, call drop occurs.
5) BTS clock out of synchronization and frequency offset beyond limits, which can cause handover failure and call drop.
6) T3103 timeout

II. Location
According to traffic measurement indices, analyze whether there are cells with low handover success rate, high call-dropping failure rate, or many handover and reestablishment failures. By means of traffic measurement, analyze the causes that trigger handover, such as uplink/downlink receive signal level, uplink/downlink receive quality, power budget (PBGT), call directed retry and traffic. Observe whether there are BTS related clock alarms and whether the BTS clock runs normally. Check the BTS clock and remove clock faults if necessary. Perform a road test to find the cell in which handover is abnormal. Perform multiple road tests near the problem cell to find handover related call drops, and optimize handover parameters to reduce the call-dropping failure rate.
The following problems might be detected during traffic measurement:
1) Too many handover failures and reestablishment failures in inter-cell handover performance measurement.
2) Too many handovers and successful reestablishments in inter-cell handover performance measurement.
3) The number of measurement reports including undefined adjacent cell levels in undefined adjacent cell performance measurement is beyond limits.
4) Low outgoing handover success rate (for a cell) in outgoing handover performance measurement. Find the adjacent cell to which the handover success rate is low, and search further for the cause in the destination cell.
5) Low incoming handover success rate and unreasonable settings of handover parameters of the opposite cell.
6) Number of handovers out of proportion to the number of successful TCH occupancies, and too many handovers in TCH performance measurement (number of handovers / number of calls > 3).
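Items 4) and 6) above can be screened automatically once the counters are exported from traffic measurement. The following is a sketch under assumptions: the counter names and the 90% success rate limit are hypothetical, and only the limit of 3 handovers per call comes from the list above.

```python
def screen_handover_stats(stats, max_ho_per_call=3.0, min_out_success=0.90):
    """Return {cell: [findings]} for cells with suspicious handover statistics.

    max_ho_per_call follows item 6) (number of handovers / number of calls > 3).
    min_out_success is a hypothetical limit for item 4); tune it per network.
    """
    findings = {}
    for cell, c in stats.items():
        problems = []
        if c["tch_seizures"] > 0:
            if c["handovers"] / c["tch_seizures"] > max_ho_per_call:
                problems.append("too many handovers per call")
        if c["out_attempts"] > 0:
            if c["out_successes"] / c["out_attempts"] < min_out_success:
                problems.append("low outgoing handover success rate")
        if problems:
            findings[cell] = problems
    return findings


# Hypothetical busy time counters for three cells.
STATS = {
    "A": {"handovers": 400, "tch_seizures": 100, "out_attempts": 200, "out_successes": 190},
    "B": {"handovers": 150, "tch_seizures": 100, "out_attempts": 100, "out_successes": 70},
    "C": {"handovers": 120, "tch_seizures": 100, "out_attempts": 80, "out_successes": 78},
}

if __name__ == "__main__":
    for cell, problems in screen_handover_stats(STATS).items():
        print(cell, problems)
```

A cell flagged for too many handovers per call is a candidate for the parameter checks in III. Solution, while a cell with a low outgoing success rate should be examined per adjacent cell to find the destination cell responsible.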
III. Solution
1) Check the parameters that affect handover, e.g., settings of stratum levels, handover thresholds, handover hystereses, handover measurement time, handover duration and minimum access level of the handover candidate cell. For instance, to decrease call drop resulting from handover, the minimum access level of the handover candidate cell can be raised from -100 dBm to -95 dBm, that is, from grade 10 to grade 15. Also, when handover is slow or the handover success rate is low owing to a clock problem or poor transmission, the parameter NY1, which is the maximum number of retransmissions of physical information, can be set to a greater value. In a word, handover optimization should be based on the actual conditions. Example 6 introduces how to reduce call drop by adjusting handover parameters. See it for details.
2) Traffic adjustment handles call drop that results from no handover channel being available in the destination BTS due to unbalanced traffic. For example, control the coverage of the cell by adjusting engineering parameters such as antenna downtilt and azimuth, or lead MSs to camp on an idle cell via network parameters such as CRO, or lead MSs in conversation to hand over to an idle cell by setting the stratum level priority, or balance traffic by adopting load handover, or directly expand the carriers.
3) Calibrate the faulty BTS clock until the clock is synchronized.

10.2.3 Interference

I. Analysis
Interference includes co-channel interference, adjacent-channel interference and inter-modulation interference. When an MS receives signals in a serving cell with strong co-channel or adjacent-channel interference, the BER deteriorates, so that the MS cannot accurately demodulate the BSIC of the adjacent cell or the BTS cannot correctly receive the measurement reports of the MS. The interference thresholds are a co-channel carrier-to-interference ratio C/I ≥ 9 dB and an adjacent-channel carrier-to-interference ratio C/A ≥ -9 dB.
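A carrier-to-interference check against these thresholds can be sketched as follows. The threshold signs in the source text are damaged; the defaults below assume the usual GSM reference values of 9 dB for co-channel and -9 dB for adjacent-channel interference, and the function names are hypothetical.

```python
def ratio_db(carrier_dbm, interferer_dbm):
    """Carrier-to-interference ratio in dB from two levels given in dBm."""
    return carrier_dbm - interferer_dbm


def meets_thresholds(carrier_dbm, co_channel_dbm=None, adjacent_dbm=None,
                     ci_min_db=9.0, ca_min_db=-9.0):
    """Return True if every interferer that is given satisfies its threshold.

    ci_min_db and ca_min_db are assumed values (see the note above).
    """
    ok = True
    if co_channel_dbm is not None:
        ok = ok and ratio_db(carrier_dbm, co_channel_dbm) >= ci_min_db
    if adjacent_dbm is not None:
        ok = ok and ratio_db(carrier_dbm, adjacent_dbm) >= ca_min_db
    return ok


if __name__ == "__main__":
    # Serving carrier at -75 dBm, co-channel interferer at -86 dBm: C/I = 11 dB.
    print(ratio_db(-75, -86), meets_thresholds(-75, co_channel_dbm=-86))
    # Adjacent-channel interferer at -60 dBm: C/A = -15 dB, below -9 dB.
    print(ratio_db(-75, -60), meets_thresholds(-75, adjacent_dbm=-60))
```

Levels measured in a road test or frequency scan can be fed to such a check to decide which frequencies need replanning.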
When interference is so severe that these thresholds are not met, conversations in the network are interfered with, so poor conversation quality and call drop might occur.

II. Location
Interference may come from inside or outside the network and may exist in uplink or downlink signals. The following methods can be used to locate interference.
1) Find the positions that may be interfered with by analyzing traffic measurement.
2) According to user complaints, perform road tests at the positions that may be interfered with and search for downlink interference. With road test tools, check whether there are positions where the receive signal level is strong but the conversation quality is poor. Alternatively, use a test MS to perform dialing tests at a locked frequency and observe whether interference occurs at that frequency.
3) Check whether there is co-channel interference caused by improper frequency planning.
4) Adjust the frequencies that might be interfered with to reduce or even avoid interference.
5) Remove the interference caused by equipment failure.
6) If interference still remains, perform a frequency scan with a spectrum analyzer to find the frequency that is interfered with and then the interference sources. See the examples for detailed analysis of interference.

The traffic measurement indices used for interference analysis are listed below.
1) Interference band for observing uplink interference
If idle channels appear in interference bands 3, 4 and 5, it generally indicates interference. Intra-network interference may increase with traffic, while out-network interference is independent of traffic. Note that the interference band is reported to BSC by the BTS carrier channel in the idle state via the radio frequency resource indication; it indicates the uplink characteristics of the radio channel occupied by MS, namely the severity of uplink interference. A busy channel can hardly report the resource indication, so the measurement of the interference band should be considered comprehensively.
2) Receive signal level performance measurement (the matrix relationship of level and quality)
This measurement item applies to a carrier. If a carrier board often shows high level but poor quality, it indicates co-channel, adjacent-channel or out-network interference on the frequency of the board.
3) Ratio of handovers due to poor quality
In cell performance measurement, inter-cell handover performance measurement or outgoing handover performance measurement, handover attempts are counted by cause. If there are too many handovers caused by poor quality, there could be interference.
Furthermore, if many handovers result from poor uplink quality, the interference is in the uplink, and vice versa.
4) Receive quality performance measurement
Measure the average receive quality level per carrier, which serves as a reference.
5) Call drop performance measurement
Record the average level and quality at the time of call drop, which serves as a reference.
6) Too many handover failures and reestablishment failures
These might be caused by interference in the destination cell and serve as a reference.

III. Solution
Out-network interference can be solved with the help of operators, while intra-network interference can be handled by adjusting network planning.
1) Perform actual road tests, check the places where interference occurs and the distribution of signal quality, and analyze the coverage overlap of which cells
causes the interference. Then, according to the actual conditions, adjust the related BTS transmit power, antenna downtilt/azimuth or frequency planning to prevent interference.
2) Application of discontinuous transmission (DTX), frequency hopping, power control and diversity
These technologies help to reduce system noise and improve anti-interference capability. DTX is classified into uplink DTX and downlink DTX; it reduces the effective transmission time and thus decreases the interference level of the system. Nevertheless, DTX should be applied properly, considering the actual radio environment and the relationships with adjacent cells. When an MS receives signals of poor quality, the application of DTX might cause call drop. After an MS sets up a conversation with the downlink DTX function activated, the BTS transmit power is higher while speech is being transmitted and lower during the pauses in the conversation. This reduces the interference to other BTSs, but if interference exists around the BTS, downlink DTX may degrade the conversation quality. As a result, when the BTS transmit power decreases, degraded conversation quality and even call drop may easily occur at positions where the receive signal level is low but the interference signal is strong.
3) Remove the interference caused by the equipment itself (e.g., carrier board self-excitation, antenna inter-modulation interference).

10.2.4 Uplink/downlink Unbalance Caused by Antenna & Feeder System

I. Analysis
1) Improper installation of antenna and/or feeder. For example, if the Tx antennas of two cells are installed the wrong way round, the uplink signal level is much poorer than the downlink one, which causes call drop, single pass or difficult connection far from the BTS.
2) If single polarization antenna is adopted, a cell has two sets of such antennas.
If their azimuths are different, call drop might occur. A directional cell has a main antenna and a diversity antenna, so BCCH and SDCCH of the cell may be transmitted by two different antennas. Different azimuths cause different coverage; consequently, although a user can receive the BCCH signal, the MS cannot occupy the SDCCH sent by the other antenna when originating a call, and call drop occurs.
3) Different azimuths of the two antennas may also cause call drop in another way: the user can receive SDCCH, but call drop occurs once the MS is assigned to a TCH transmitted by the other antenna.
4) Antenna and feeder problems can also cause call drop. Damage, water ingress, bending and poorly contacting connectors all reduce Tx power and Rx sensitivity and cause serious call drop, which can be confirmed by the standing wave ratio.
II. Troubleshooting process
1) Check whether there are combiner, CDU, tower top amplifier or standing wave ratio alarms.
2) Check via remote maintenance whether all BTS boards work normally. Analyze from traffic measurement whether uplink/downlink unbalance appears.
3) Trace the relevant Abis interfaces with the Abis interface tracing function or with a signaling analyzer. Observe from the measurement reports in the signaling messages whether the uplink and downlink signals are balanced.
4) Perform road tests and dialing tests. Before the road tests, make sure that the BCCH frequency of the serving cell is the expected one and that the Tx antenna is installed correctly.
5) After full remote analysis, perform on-site inspections and tests. Check whether the azimuth and downtilt of the antenna comply with the design and whether the feeder and jumper are connected correctly. Make sure the antenna & feeder connectors are in good contact and the feeder is in good condition. Test whether the standing wave ratio is normal.
6) Judge whether a BTS hardware failure causes the uplink/downlink unbalance. For hardware failure, replace the part that might be faulty, or disable the other carriers in the cell and perform dialing tests on the suspect carrier to locate the fault point. Once a part is found to be faulty, it should be replaced in time. If no spare part is available, block the faulty board first so that call drop does not affect the running quality of the network.

Some traffic measurement items used for analysis of uplink/downlink balance are listed below:
1) From Up-Down Link Balance Measurement, analyze whether uplink/downlink unbalance exists.
2) From Call Drop Measurement, analyze the average uplink/downlink levels and qualities at the time of call drop.
3) From Power-Control Measurement, analyze the average uplink/downlink receive signal levels.
10.2.5 Transmission Failure
Because a call is carried over the Abis interface and A interface links, poor transmission quality and unstable transmission links may also cause call drop.
I. Analysis and solution:
1) Observe transmission and board alarms (e.g., FTC failure alarm, A interface PCM out of sync alarm, LAPD link break alarm, power amplifier alarm, HPA alarm, TRX alarm, CUI/FPU alarm). Based on the alarm data, analyze whether transmission is intermittent or whether there are faulty boards (e.g., the carrier board is faulty or in poor contact).
2) Check the transmission paths, test the BER, and check whether the E1 connectors and the equipment grounding are sound, so as to decrease call drops by ensuring stable transmission quality.
3) Check from traffic measurement whether too many call drops are caused by transmission problems.
(a) In TCH performance measurement, check whether there are too many A interface failures while TCH is occupied.
(b) In TCH performance measurement, check whether the TCH availability rate is abnormal.
(c) In TCH performance measurement, check whether there are too many call drops caused by interruption of the terrestrial link.
10.2.6 Unreasonable Parameter Settings
Check the relevant parameter configurations and make sure they are set reasonably. The parameters are as follows:
1) System message data table: Radio link failure counter
If the value is too small, call drop occurs easily when the receive signal level of the MS declines sharply and abruptly, for example because of irregular terrain. If it is too large, the network releases the related resources only when the radio link times out, even though the voice quality has long been intolerable, which reduces resource utilization. Generally, this value should be set larger for low-traffic areas than for high-traffic areas.
2) Cell attribute table: SACCH multiframe number
Recommended value: BTS3X 14 (31 for version 05.0529 or newer); BTS2X 31
3) System message data table: MS minimum received signal grade, RACH minimum receive signal level, RACH busy threshold.
Because both uplink and downlink signals are involved, the actual coverage is determined by the weaker of the two. If in a cell the coverage of the uplink signal is larger than that of the downlink signal, the downlink signal is weaker at the edge of the cell and is easily submerged by stronger signals from other cells.
Conversely, if the downlink coverage is larger than the uplink coverage, the MS is forced to camp on the strong downlink signal. However, the MS cannot originate a call because the uplink signal is weak; or, if it can set up a call, the voice quality is very poor, and one-way audio (single pass) or even call drop may occur. Therefore, keep the uplink and downlink as balanced as possible.
MS minimum received signal grade: It indicates the minimum receive signal level required for an MS to access the system, and applies to the downlink signal. If the value is too small, MSs in the cell can access the network easily and the coverage is large, but MSs at the edge of the cell tend to stay in the cell, which increases the load on the cell and the possibility of call drop. If it is too large, MSs with a low receive signal level cannot access the network, which helps to reduce the call-dropping failure rate but shrinks the coverage. Therefore, both coverage and call-dropping failure rate should be taken into account when setting this parameter. The call-dropping failure rate should not be reduced at the cost of coverage.
RACH minimum receive signal level: It indicates the minimum receive signal level required for an MS's uplink access to the system. (The RACH busy threshold used in BTS20 is similar to the MS minimum receive signal level. Both coverage and call-dropping failure rate should be fully considered when setting this parameter.)
See M900/M1800 Base Station Controller Data Configuration Reference Network Planning Parameters for details.
10.2.7 Others
There are many other causes of call drop. For example, when the version of the TRXs in a BTS is inconsistent with that of the FPU, the number of call drops across the whole network may increase. Improper use of BTS version related parameters also causes call drop, as shown in Example 7.
10.3 Examples
10.3.1 Example 1: Reducing Call Drop by Optimizing Handover Related Parameter
[Description] During road tests from place A to place B, too many call drops were found at the mouth of a cave near the BTS, because handover could not be executed in time.
[Analysis] The mouth of the cave lay just near the BTS. In the cave, the signal level of the destination cell was about -80 dBm, but the signal level of the serving cell declined rapidly to below -100 dBm. Handover was not triggered because the downlink levels of both cells were good enough; then the level of the serving cell decreased so rapidly in the cave that the call dropped before the handover measurement ended.
[Troubleshooting process] The related parameters were modified as shown in Table 10-1.
Table 10-1 Modification table of parameters
Parameter name                                 Value before modification   Value after modification
PBGT handover measurement time                 5                           3
PBGT handover duration                         4                           2
PBGT handover threshold                        72                          68
Emergency handover uplink quality threshold    70                          60
Candidate cell minimum downlink power          10                          15
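The effect of shortening the PBGT measurement time and duration in Table 10-1 can be illustrated with a small sketch. It is not the BSC's handover algorithm: it assumes that "measurement time" N and "duration" P form a P-out-of-N criterion over successive measurement reports, that the threshold parameter is encoded as the margin in dB plus 64 (so 72 means 8 dB and 68 means 4 dB), and that the power budget reduces to the level difference between neighbour and serving cell. The function name and the level series are hypothetical.

```python
from collections import deque


def first_trigger_index(serving_dbm, neighbour_dbm, threshold_param,
                        n_window, p_required):
    """Index of the first report at which the P-out-of-N criterion is met.

    Returns None when the criterion is never met.
    """
    margin_db = threshold_param - 64          # assumed parameter encoding
    window = deque(maxlen=n_window)           # last N per-report decisions
    for i, (s, n) in enumerate(zip(serving_dbm, neighbour_dbm)):
        window.append((n - s) > margin_db)    # simplified power budget test
        if sum(window) >= p_required:
            return i
    return None


# Serving cell fading quickly inside the cave, neighbour steady at -80 dBm.
serving = [-70, -75, -82, -90, -98, -104, -110]
neighbour = [-80] * len(serving)

before = first_trigger_index(serving, neighbour, 72, n_window=5, p_required=4)
after = first_trigger_index(serving, neighbour, 68, n_window=3, p_required=2)
print(before, after)  # 6 4
```

With these illustrative levels the modified settings trigger two reports earlier, which is the kind of head start needed when the serving level collapses within a few reports.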
Optimizing handover related parameters helps to reduce the call-dropping failure rate.
1) Make PBGT handover easier to trigger, so as to resist interference and reduce the call-dropping failure rate, provided that no toggle (ping-pong) handover arises that would cause too much voice discontinuity.
2) Set the emergency handover trigger threshold reasonably so that emergency handover is triggered in time before call drop occurs, which reduces the call-dropping failure rate.
10.3.2 Example 2: Call Drop Caused by Interference
[Description] The BTS distribution of an area is shown in Figure 10-5. (The red digits indicate BCCH frequencies. DTX is adopted without frequency hopping.) Too many call drops occurred in cell 2 of BTS C. (Hardware failure had already been excluded as the cause.)
Figure 10-5 BTS distribution
[Analysis]
1) Analysis of the BTS topology map showed that the frequency planning was reasonable.
2) The interference bands of the cells of BTS C, taken from traffic measurement, are shown in Table 10-2.
Table 10-2 Traffic measurement interference band (09:00~10:00)
Interference band 1
3) Road tests showed that the conversation quality was very poor even where the receive signal level was high.
4) Traffic measurement showed that handover was mainly caused by poor conversation quality, and that the channel assignment failure rate rose as call drops increased.
5) From the traffic measurement and the road test results, it could be concluded that there was interference.
6) A repeater was found during the on-site inspection. The repeater was broadband equipment that amplified and retransmitted the signals of a remote analog BTS, which were carried to the near end via optical fiber. The digital signals were also amplified by the repeater, which interfered with cell 2 of BTS C.
[Troubleshooting process] The engineer reduced the transmit power of the repeater so that the interference level dropped from bands 2 and 3 to band 1. This solved the high call-dropping failure rate at BTS C.
10.3.3 Example 3: Call Drop Caused by Interference
[Description] A BTS adopted 1×3 RF hopping. After it was expanded, the TCH assignment failure rate stayed high owing to radio link failure, accompanied by a high TCH call-dropping failure rate and a high handover failure rate. The SDCCH call-dropping failure rate nevertheless remained normal.
[Analysis] Since a high call-dropping rate and a high handover failure rate accompanied the high assignment failure rate, there were two possible causes:
1) TCH was assigned incorrectly.
2) The frequency or time slot occupied by the conversation was interfered with or unstable.
As the SDCCH call-dropping rate remained normal, it was very unlikely that the carrier carrying the BCCH frequency, or the BCCH frequency itself, was interfered with. The carriers carrying non-BCCH frequencies and the hopping frequencies, however, might be.
[Troubleshooting process] No faults were found when the equipment, the antenna & feeder and the transmission stability were checked.
Road tests showed a serious problem of high receive level with poor quality. An on-site dialing test found the voice quality very poor, and the parameter check found that the MAIO of the newly added carrier was the same as that of another carrier.
Fault point: The hopping frequencies collided.
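A duplicate MAIO of this kind can be caught with a simple configuration check. The sketch below is an illustration under assumptions: the dictionary layout, the TRX names and the ARFCN values are hypothetical, and the per-frame frequency is computed only for cyclic hopping (HSN = 0), where the index into the mobile allocation is (FN + MAIO) mod N. For other HSN values the sequence is pseudo-random, but two carriers that share the same frequency list, HSN and MAIO still follow the same sequence and therefore collide in every frame.

```python
from collections import defaultdict


def find_maio_collisions(trxs):
    """Group TRXs that share the same MA list, HSN and MAIO."""
    groups = defaultdict(list)
    for trx in trxs:
        key = (tuple(trx["ma_list"]), trx["hsn"], trx["maio"])
        groups[key].append(trx["name"])
    return [names for names in groups.values() if len(names) > 1]


def cyclic_arfcn(frame_number, maio, ma_list):
    """Frequency used in a frame for cyclic hopping (HSN = 0 only)."""
    return ma_list[(frame_number + maio) % len(ma_list)]


ma = [10, 20, 30, 40]  # hypothetical hopping frequency list
trxs = [
    {"name": "TRX1", "ma_list": ma, "hsn": 0, "maio": 0},
    {"name": "TRX2", "ma_list": ma, "hsn": 0, "maio": 2},
    {"name": "TRX3 (new)", "ma_list": ma, "hsn": 0, "maio": 2},
]

print(find_maio_collisions(trxs))  # [['TRX2', 'TRX3 (new)']]

# Distinct MAIOs never meet on the same frequency in the same frame.
print(all(cyclic_arfcn(fn, 0, ma) != cyclic_arfcn(fn, 2, ma)
          for fn in range(100)))   # True
```

Running such a check after every expansion is cheap compared with finding the collision through road tests.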
10.3.4 Example 4: Uplink/downlink Unbalance
[Description] The MS camped on a cell but could not originate a call. One-way audio (single pass) occurred. Call drop always occurred at places far from the cell. Call drop could also occur after frequency handover.
[Analysis] Unbalance between the uplink and downlink signal levels might cause such trouble.
[Troubleshooting process] Perform on-site tests. Move the MS to the edge of the cell during the test and trace data with a signaling analyzer at the BTS so as to observe the receive signal levels of the BTS and the MS.
Figure 10-6 Explanation of measurement report MA10
As shown in Figure 10-6, the uplink signal level is -98 dBm (highlighted with a red circle), much lower than the downlink signal level of -66 dBm. A level below -98 dBm means the signal is too weak, which easily causes call drop.
10.3.5 Example 5: Call Drop Caused by Interference from Repeater
[Description] The call-dropping failure rate in cell 3 of a BTS reached 10%, while the call-dropping failure rates and congestion rates in cells 1 and 2 remained normal.
[Troubleshooting process]
1) A high congestion rate persisted no matter which carrier channels of the cell were blocked.
2) By viewing and analyzing traffic measurement data, the engineer found that the interference band followed a regular pattern: it was generally high by day and low at night. That is, interference was high when daytime traffic was high, and low otherwise.
3) The engineer shifted the frequency of cell 3 by more than 1 MHz above or below the original one, but the trouble persisted. Therefore, co-channel and adjacent-channel interference could be excluded.
4) The engineer checked the equipment and excluded the possibility of equipment fault.
5) The engineer concluded that the trouble was caused by external interference.
6) The engineer performed a frequency scan with a spectrum analyzer and found a suspicious signal with a central frequency of 904.14 MHz and a bandwidth of 300 kHz. The signal was present continuously and stably.
7) The strength of the interference signal at the divider port in cell 1 was -27 dBm, and in cells 2 and 3 it was -40 dBm and -60 dBm respectively. Since traffic is higher by day than at night, inter-modulation occurs more easily by day than at night. Therefore, the trouble could be attributed to the external interference source at 904 MHz.
8) The engineer could not locate the interference source by road tests with a spectrum analyzer. He then tested on the roof and found that the interference came from the small antenna of a repeater. When he interrupted the signal of the repeater as a test, the interference disappeared.
10.3.6 Example 6: Call Drop Caused by Isolated Island Effect
[Description] A user complained that call drop always occurred on the fifth floor and above of a building.
[Analysis] There are two ways to eliminate the isolated island effect.
1) Adjust the antenna of the isolated cell.
2) Define new adjacent cells for the isolated cell.
[Troubleshooting process]
1) During the on-site test, the engineer observed call drop and noise. From the test MS, he found that before each call drop the MS had been camped on a serving cell that did not belong to the local BTS A.
2) That cell belonged to BTS B, which is 3~4 km away from the building. He therefore concluded that the signal received here was a signal reflected by an obstacle, which formed coverage equivalent to an isolated island.
3) By viewing the data configuration, the engineer found that in the adjacent relationship between A and B in the BSC data configuration, only cell 2 of BTS A had been configured. When an MS in the area uses the signal of cell 2 of BTS B, the signal of cell 3 of BTS A is stronger, but no adjacent relationship has been defined between cell 2 of BTS B and cell 3 of BTS A. As a result, handover cannot be performed.
4) Because the signal of cell 2 of BTS B has been reflected many times, an emergency handover may occur when the signal that the MS receives from BTS B weakens abruptly for some reason. However, for cell 2 of BTS B, cells 2 and 3 of BTS A are not the most ideal candidate cells, so a handover to another BTS (e.g., BTS C) may occur. Since the MS cannot receive the signal from BTS C at that moment, the call drops.
5) By modifying the data in [BA1(BCCH) Table], [BA2(SACCH) Table] and [Adjacent cell relationship table] in the BSC data configuration, the engineer set cell 3 of BTS A as an adjacent cell of cell 2 of BTS B and further optimized the network engineering parameters to eliminate the isolated island effect.
6) Tests confirmed that the trouble had been solved.
10.3.7 Example 7: Settings of Version Related Parameters
[Description] After an expansion, the call-dropping failure rates of five BTSs in an area reached 5%, and the number of call drops in each cell reached 100. In addition, the call-dropping failure rate of a cell that had not been expanded rose too. All these call drops were RF call drops. The engineer could not identify the cause, since no interference and no hardware faults were found.
[Troubleshooting process]
1) The engineer checked the data, the frequency planning and the BSIC planning.
2) He observed the traffic measurement and found that the interference band remained normal and no interference occurred.
3) The handover success rate remained above 93%.
4) He checked the versions of all TRXs and FPUs of the BTS and found that the TRX version was inconsistent with the FPU version after the expansion. He upgraded them to make them consistent, but the trouble remained.
5) He checked the data again and found that the BTS had been set to 15:1 multiplexing after the expansion, and that the measurement report pre-processing function had been enabled for all BTSs of version 2.0. Some parts of older versions did not support this function, which caused the call-dropping failure rate to increase.
[Summary] After a major adjustment of the system, e.g., cut-over of a new BTS, expansion of a BTS, frequency re-planning, upgrading or patching, the related parameters should be checked and adjusted accordingly, especially the adjacent cell relationships, frequency interference, frequency hopping and cell parameters. The BTS version should also be fully taken into account.
Troubleshooting Manual M900/M1800 Base Station Subsystem Chapter 11 Troubleshooting for Antenna & Feeder System
Chapter 11 Troubleshooting for Antenna & Feeder System
11.1 Overview
In the GSM network, the interface between BTS and MS is called the Um interface (also radio interface or air interface). The signal on this interface, in the two directions BTS to MS and MS to BTS, is an RF band signal that uses free space as the transmission medium. Accordingly, the voice and control signals from BTS to MS must first be modulated onto the RF band, and the RF signals from MS to BTS must be demodulated to baseband for further processing. Modulation and demodulation are the basic functions of the BTS transceiver. The antenna feeder system of the BTS is the equipment that transmits RF signals to the MS and receives signals from the MS. The architecture of the BTS30/312 antenna & feeder system is illustrated in Figure 11-1.
[Figure 11-1 block labels: TRX (TX, RX, RXD), CDU, TTA, jumpers (1), lightning arresters (2), feeders (3), Antenna Feeder System, Dual Polarization Antenna]
TRX: transceiver; CDU: combiner and divider unit; TX: transmitter; RX: receiver; RXD: diversity receiver; TTA: tower top amplifier; (1) Jumper; (2) Lightning arrester of antenna feeder; (3) Feeder
Figure 11-1 Architecture of antenna feeder system
In the case of antenna feeder failure, the BTS is unable to transmit or receive signals, which interrupts the BTS service. Apart from antenna feeder failures, there are also failures of the control or alarm related parts. These failures have no impact on the BTS service in the short term, but they affect the operation and maintenance of the BTS equipment.
Usually the antenna feeder system can be defined in two ways:
1) All equipment from the top of the BTS cabinets to the antenna, including jumpers, lightning arresters, feeders, TTA, antenna and various accessories. The dotted box in Figure 11-1 is the antenna feeder system under this definition.
2) In addition to (1), the combiner and divider module is included.
Normally, the "antenna feeder" in "antenna feeder re-usage" follows Definition (1), and so does the "antenna feeder" referred to in engineering and maintenance. In the following description, "antenna feeder system" always refers to Definition (2) unless Definition (1) is explicitly stated.
Note: There are many types of combiner and divider module, to satisfy the needs of different macro cell BTS configurations. In addition to CDU, there are ECDU, EDU, SCU and ESCU. Each of them comes in two types, for GSM900 and GSM1800 respectively. In the following text, all combiner and divider modules are referred to as "CDU" unless it is necessary to specify the type.
11.1.1 Common Failures
The common failures of the antenna feeder system are listed in Table 11-1.
Table 11-1 Common failures of antenna feeder system
Type                              Failure                     Symptom
On downlink signal                No downlink signal          MS fails to access the network, calls cannot be established, call drop, TRX idle for a long time
On downlink signal                Downlink signal weakened    Poor conversation quality, BTS coverage shrink
On uplink signal                  No uplink signal            Calls cannot be established
On uplink signal                  BTS sensitivity weakened    Poor conversation quality, BTS coverage shrink
On controlling and alarm signal   Standing wave alarm         Standing wave alarm occurs at CDU
On controlling and alarm signal   LNA alarm                   LNA alarm occurs at CDU
On controlling and alarm signal   TTA alarm                   TTA alarm occurs at CDU
On controlling and alarm signal   TTA feeding fails           No DC feeding voltage at CDU antenna port after TTA configuration
11.1.2 Common Causes of Failures
The common causes of antenna feeder system failures are listed in Table 11-2.
Table 11-2 Common causes of antenna feeder failures
Type                                            Cause
Not caused by the components' specifications    Connector is not tightened
Not caused by the components' specifications    Installation of antenna feeder does not conform to engineering specification
Not caused by the components' specifications    Antenna & feeder are abnormal due to artificial causes
Not caused by the components' specifications    Physical impact or damage to antenna feeder
Caused by the components' specifications        Water penetration at antenna
Caused by the components' specifications        Water penetration at TTA
Caused by the components' specifications        Insufficient power tolerance of lightning arrester
Caused by the components' specifications        CDU standing wave alarm performance does not satisfy the requirements in specifications
Caused by the components' specifications        CDU failure
11.2 Fundamental Knowledge
To handle antenna feeder failures, it is necessary to know:
1) The transmission path of the RF signal in the antenna feeder system.
2) The method of checking the performance of the transmission path.
The basic handling method is to check the segments one by one along the transmission path until the failure point is found. Most faults can be located and handled by combining the two points above.
11.2.1 RF Transmission Path in Antenna Feeder System
I. Basic architecture
The architectures of CDU, EDU, SCU and triplex TTA are shown in the figures below. ECDU and CDU have the same module architecture, as do ESCU and SCU, and the GSM900 and GSM1800 types.
[Figure 11-2 block labels: ANT, TTA feeder, Low-noise amplifier, Transmission filter, Reception filter (x2), Bypass, DC, Bias Tee, BTS]
Figure 11-2 Triplex TTA architecture
[Figure 11-3 block labels: Transmitting Signal Input, Combiner, Duplexer, Test Coupler, Bias Tee (x2), Receiving Filter, LNA (x2), Divider (x2), Receiving Signal Output (x2), Alarm and Control Unit]
Figure 11-3 CDU architecture
[Figure 11-4 block labels: Transmitting Signal Input (x2), Duplexer (x2), Test Coupler (x2), Bias Tee (x2), LNA (x2), Divider (x2), Receiving Signal Output (x2), Alarm & Control Unit]
Figure 11-4 EDU architecture
[Figure 11-5 block labels: Combining, 1, 2, 3, 4, Transmitting Signal Input, Transmitting Signal Output]
Figure 11-5 SCU architecture
II. RF signal transmission path
See Figure 11-1. In the antenna feeder system, uplink and downlink signals share the same physical channel between the CDU duplexer and the tower top antenna, whereas between the CDU duplexer and the TRX the downlink and uplink signals have separate physical channels. Combining the uplink and downlink signals into one physical channel is one of the CDU functions. The diversity receiving channel of the CDU is a pure uplink path. The EDU contains two duplexers and has no separate diversity receiving channel.
The signal path from the CDU antenna port to the tower top, including components and connectors, is listed in Table 11-3. For each component with two ends, the connectors are given in path order.
Table 11-3 Paths from CDU antenna port to tower top antenna
No.  Component/port                                  Connector type                  Remark
1    CDU antenna port                                N female                        For CDU and ECDU, the antenna ports are the transceiving antenna port TX/RX ANT and the diversity receiving antenna port RXD ANT; for EDU, they are the transceiving antenna ports TX/RX ANTA and TX/RX ANTB
2    Internal 1/4 jumper of the BTS30/312 cabinet    N male, 7/16 DIN female
3    1/2 jumper                                      7/16 DIN male, 7/16 DIN male
4    Lightning arrester                              7/16 DIN female, 7/16 DIN male
5    Feeder                                          7/16 DIN female, 7/16 DIN female  This feeder runs from the BTS equipment room to the tower top
6    1/2 jumper                                      7/16 DIN male, 7/16 DIN male
7    TTA                                             7/16 DIN female, 7/16 DIN female  TTA is optional. If it is not configured, items 7 and 8 can be omitted
8    1/2 jumper                                      7/16 DIN male, 7/16 DIN male
9    Antenna                                         7/16 DIN female
11.2.2 Measuring Standing Wave Ratio of Antenna Feeder
Two kinds of meters are usually used to measure the standing wave ratio of antenna feeders: the Site Master and the feed-through power meter. The former is used more frequently on engineering and maintenance sites.
It is important to calibrate the Site Master before using it so as to eliminate its system error. The system error of the meter depends on its state, which can differ from one time to another. In addition, different meter settings (e.g. test item, frequency range) put the meter in different states. Therefore, calibrate the meter each time after it is switched on or reset. Otherwise, the test result will contain a large error and cannot be used as the basis for judging the performance of equipment.
Even after calibration, the absolute correctness of the test result cannot be guaranteed, because the meter itself may have performance problems, and test accessories such as RF cables may have problems too. Although the probability is low, such problems do occur and can mislead the locating and handling of failures. Therefore, the meter itself should also be checked. Its performance can be checked on site through some simple operations and observations.
To test the performance of a meter, do not start measuring immediately after calibration. Instead, first measure a standard part (e.g. open circuit, short circuit or matching load). For example, use the meter to measure the standing wave ratio or return loss of a matching load. If the result shows a standing wave ratio close to 1 or a return loss above 40 dB, the meter is in normal condition.
To test the performance of the cable used in the test, shake the cable lightly after the parameter curve displayed on the Site Master becomes stable. If the curve changes only slightly and remains stable on the whole, the performance of the cable is normal. For this cable test, it is recommended to select the insertion loss curve or gain curve and to avoid the isolation curve or return loss curve, because the Site Master receives only weak signals when testing isolation or return loss. In that case, even a large change of the curve caused by the shaking does not mean that the cable is poor.
In a word, remember that meters are not always trustworthy. Test results should be compared and cross-checked before drawing a conclusion.
When using the feed-through power meter, make sure to connect the input and output ports of the probe correctly. Measuring the antenna feeder with the feed-through power meter differs greatly from measuring it with the Site Master.
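Since the meter can display mismatch either as standing wave ratio or as return loss, it helps to be able to convert between the two when applying the matching-load check described above. With reflection coefficient magnitude |Γ|, the standard relations are RL = -20·log10|Γ| and VSWR = (1 + |Γ|) / (1 - |Γ|). The sketch below applies them; the function names are chosen for illustration and the 1.5 example value is arbitrary, not a limit taken from this manual.

```python
import math


def return_loss_to_vswr(return_loss_db):
    """Convert return loss in dB (positive number) to VSWR."""
    gamma = 10 ** (-return_loss_db / 20)   # reflection coefficient magnitude
    if gamma >= 1:
        return math.inf                    # total reflection (open or short)
    return (1 + gamma) / (1 - gamma)


def vswr_to_return_loss(vswr):
    """Convert VSWR (>= 1) to return loss in dB."""
    gamma = (vswr - 1) / (vswr + 1)
    if gamma == 0:
        return math.inf                    # perfect match, nothing reflected
    return -20 * math.log10(gamma)


print(round(return_loss_to_vswr(40), 3))   # 1.02
print(round(vswr_to_return_loss(1.5), 2))  # 13.98
```

A matching load that reads 40 dB return loss therefore corresponds to a standing wave ratio of about 1.02, which is why the two acceptance criteria in the text are equivalent in practice.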
The Site Master has an internal signal source. When measuring the standing wave ratio of the antenna feeder, it sends a single-tone signal and then receives the reflected signal. The feed-through power meter has no signal source of its own, so an external signal source is needed. The standing wave ratio data that the Site Master provides is given for each separate frequency point, whereas the data from the feed-through power meter depends on the signal source: if the source is a single tone, the result is the standing wave ratio at that single frequency; if the source is a band signal, the result is the mean standing wave ratio within the frequency band.
11.2.3 Checking CDU Antenna Port TTA Power Feeding
A special method must be used to check the power supply on the antenna ports of the CDU; otherwise the check will probably fail. The method is therefore explained separately here.
The CDU (including ECDU and EDU) can feed power to the TTA via the antenna port. The CDU has an internal Bias Tee at the antenna port, whose architecture is shown in the figure below. Through the Bias Tee, a DC voltage can be applied between the inner and outer conductors of the CDU antenna port.
[Figure 11-6 block labels: Antenna, 12V, Inductance, C, Duplexer]
Figure 11-6 Bias Tee architecture
The DC voltage is fed to the TTA via the lightning arrester and the feeder. To supply the DC voltage to the CDU antenna port, two switches must be switched on: a hardware switch and a software switch. The hardware switch is on the back of the CDU and can only be operated manually. The software switch can be set both at the local maintenance console and at the far end (BSC) through the OMC. The hardware switch and the software switch are connected in series, so both must be switched on to feed DC voltage to the CDU antenna port.
Normally, no DC voltage can be detected at the CDU antenna port with a multimeter after the hardware and software switches have been switched on, even if the CDU is working normally. This is because the TMU turns off the TTA feeding software switch on the CDU once the CDU has reported a TTA alarm to the TMU. When the DC voltage is measured at the antenna port, the inner and outer conductors of the port form an open circuit. Therefore, even though the DC voltage has been fed to the CDU antenna port, a TTA alarm is generated immediately, and the TMU switches off the CDU TTA software switch and cuts the DC voltage. This is why the multimeter is unable to detect the DC voltage.
The solution is as follows. After switching on the TTA software switch and before removing the feeder at the antenna port, remove the DB25 communication cable of the CDU. The DC voltage is then fed to the antenna port because the software switch is on. Although the CDU still generates a TTA alarm, the alarm cannot be reported to the TMU, and the TMU cannot send the instruction to switch off the TTA software switch, because the CDU communication cable has been removed. Under these conditions the multimeter is able to measure the DC voltage.
Another way to check the TTA working status is to measure the TTA working current.
Since this measurement is more complicated than the previous one, it should be carried out after the voltage measurement. To measure the TTA working current, switch the multimeter to DC current measurement and connect it in series in the TTA power supply loop. The feeder at the CDU antenna port must be removed first; otherwise the meter cannot be inserted into the circuit. Measured in this way, the value is actually the TTA static bias current rather than the TTA working current. However, the static bias current can be taken as the working current, since the active part inside the TTA is an LNA with low power consumption.
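The interaction between the two switches, the TTA alarm and the TMU described in 11.2.3 can be made concrete with a small state model. This is a teaching sketch only: the class and attribute names are hypothetical, the alarm is modelled simply as "feed enabled while the port is open circuit", and the 12 V supply value is an assumption taken from the label in Figure 11-6.

```python
from dataclasses import dataclass


@dataclass
class CduTtaFeed:
    hw_switch_on: bool = False          # manual switch on the back of the CDU
    sw_switch_on: bool = False          # set from local console or OMC
    comm_cable_connected: bool = True   # DB25 cable between CDU and TMU
    feeder_connected: bool = True       # feeder attached to the antenna port
    supply_volts: float = 12.0          # assumed nominal feed voltage

    def tta_alarm(self) -> bool:
        # Assumed alarm condition: feed is on but the port is open circuit.
        feeding = self.hw_switch_on and self.sw_switch_on
        return feeding and not self.feeder_connected

    def tmu_poll(self) -> None:
        # The TMU can react only if the alarm reaches it over the DB25 cable.
        if self.tta_alarm() and self.comm_cable_connected:
            self.sw_switch_on = False

    def port_voltage(self) -> float:
        # Switches are in series: both must be on for DC to reach the port.
        both_on = self.hw_switch_on and self.sw_switch_on
        return self.supply_volts if both_on else 0.0


# Case 1: feeder removed with the DB25 cable still connected.
naive = CduTtaFeed(hw_switch_on=True, sw_switch_on=True)
naive.feeder_connected = False
naive.tmu_poll()
print(naive.port_voltage())    # 0.0

# Case 2: DB25 cable removed first, then the feeder.
careful = CduTtaFeed(hw_switch_on=True, sw_switch_on=True,
                     comm_cable_connected=False)
careful.feeder_connected = False
careful.tmu_poll()
print(careful.port_voltage())  # 12.0
```

The two cases reproduce the behaviour in the text: the multimeter reads nothing unless the communication cable is unplugged before the feeder is removed.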
11.3 Locating Failures of Different Types
11.3.1 On Downlink Signal
I. Description
- No downlink signal
- Downlink signal weakened
II. Possible cause
See Table 11-2 Common causes of antenna feeder failures.
III. Analysis
1) No downlink signal
Although this failure has severe consequences, it is easier to deal with than signal weakening, because the symptom is unambiguous. Keep the equipment in its current state until there is a clear solution plan. Follow the steps below:
(a) View the history alarms and real-time alarms at the OMC or the local maintenance console.
(b) If there is an emergent standing wave alarm at the CDU, this is the most probable cause: the TMU turns off the transmitter power amplifier, resulting in no downlink signal. In this case, check the standing wave ratio at the jumper side of the CDU antenna port according to 11.2.2 Measuring Standing Wave Ratio of Antenna Feeder. If the standing wave ratio is beyond limits, locate the faulty point segment by segment along the path listed in Table 11-3.
(c) If the alarm mentioned in (b) does not exist, first make sure the corresponding TRXs work normally.
(d) Since there is no downlink signal, there must be a break in the RF signal path. If the break were located between the CDU antenna port and the tower top, the CDU would detect it and raise the emergent standing wave alarm. Since no such alarm exists, the break must be located between the TRX output and the CDU antenna port.
(e) Check whether the cable connection between CDU TX-COM and TX-DUP is correct.
(f) If the operations above fail to locate the failure, replace the CDU.
2) Downlink signal weakened
The symptom of this failure is that the coverage of the BTS or carrier shrinks. Follow the steps below to handle this problem:
(a) Check whether the output power of the TRX is normal.
(b) Check whether the standing wave ratio at the jumper side of the CDU antenna port is normal.
(c) Check the insertion loss of the CDU transmitting path.
(d) Check whether the connectors involved in the RF signal path are tightened.

11.3.2 On Uplink Signal

I. Description

- No uplink signal
- BTS sensitivity weakened

II. Possible cause

See Table 11-2 Common causes of antenna feeder failures.

III. Analysis

1) No uplink signal

Handling process:

(a) Substitute an antenna feeder (CDU excluded) that is known to be normal for the one without uplink signal.
(b) If the uplink signal recovers on the substitute feeder but still fails on the original one, the original antenna feeder is faulty, and it is necessary to check the path according to Table 11-3.
(c) If the symptom remains, the CDU is faulty. Check whether the cable connection between RXD OUT and HL_IN or between HL_OUT and HL_IN is correct.
(d) If the failure still cannot be located, replace the CDU and record the operation.
(e) Restore the antenna feeder connection to its original status.

When swapping the antenna feeder, make sure that:

(a) The two antenna feeders are in the same cell/sector.
(b) The antenna connection is restored to its original status after the failure is located. Otherwise, the coverage of the cell may be affected.

This is the basic principle to obey when using this method to locate the problem.

2) BTS sensitivity weakened

This failure can have many causes. The following analysis covers only the causes related to the antenna feeder.

If a TTA is configured, first check whether there is any TTA alarm. If so, the TTA is working abnormally. Otherwise, check the CDU antenna port feeding according to 11.2.3 Checking CDU Antenna Port TTA Power Feeding. If no feeding is detected, the CDU is faulty and needs to be replaced. If the DC voltage is normal, the TTA is considered normal.

After confirming that the TTA is normal, check the standing wave ratio of the antenna feeder. If it is too large, the antenna feeder RF path has a poor connection or another fault.
In this case, check the path segment by segment according to Table 11-3. If the standing wave ratio is normal, check the performance of the CDU receiving channel, such as gain and noise factor. This method is seldom applied because the necessary meters may not be available on the engineering and maintenance site, so CDU substitution is more frequently used: substitute a CDU known to be normal for the CDU under test and see whether the problem is resolved. If so, the original CDU is faulty. Otherwise, the original CDU is OK.

The common failures can be located with the methods above. However, because these methods do not amount to a comprehensive test, some problems inevitably cannot be located. For example, if a decrease in TTA gain or an increase in TTA noise factor is not reflected in the working current, the problem cannot be detected. On such occasions, keep clear records of the operations performed so far for further analysis.

11.3.3 On Controlling and Alarm

I. Description

- CDU standing wave alarm
- CDU LNA alarm
- CDU TTA feeding abnormal
- CDU TTA alarm

II. Possible cause

See Table 11-2 Common causes of antenna feeder failures.

III. Analysis

1) Standing wave alarm

Check the standing wave ratio of the antenna feeder (CDU excluded). If it is lower than 1.5 while the CDU standing wave alarm has been generated, the alarm should be regarded as a false alarm, and the CDU needs to be replaced. If the standing wave ratio is higher than 1.5, adjust the antenna feeder connection until it is lower than 1.5. Note that the installation specification requires the standing wave ratio to be lower than 1.3.

2) LNA alarm

LNA performance is difficult to check on site. When there is an LNA alarm, record the equipment configuration and status, and then substitute another CDU for the faulty one. Keep the faulty CDU for further analysis.

3) TTA alarm

When both the hardware and software switches are on, the CDU measures the TTA feeding current flowing through the antenna port. If the current is outside the normal TTA working current range (45~170 mA), the CDU generates a TTA alarm. When testing, use a multimeter to measure the TTA feeding current.
If the feeding current is normal while there is a TTA alarm, the alarm can be considered a false alarm. Substitute another CDU for the faulty CDU and keep the faulty CDU for further analysis. If the feeding current is beyond limits, the TTA is faulty and needs to be replaced.

For a migration site with antenna salvage, it is also necessary to confirm the type of lightning arrester when using a TTA. Usually there are two types of lightning arresters: those with a discharging tube and those with a 1/4 wavelength transmission cable. When the former is in normal working status (not struck by lightning), the internal and external conductors form a DC open circuit. In the latter, the internal and external conductors are always short-circuited for DC. Therefore, if a lightning arrester with a 1/4 wavelength transmission cable is installed, the CDU cannot feed the TTA. Either replace the arrester with one that has a discharging tube, or use an independent line to feed the TTA. The lightning arresters adopted in Huawei products all have discharging tubes, so newly installed sites will not encounter this problem.

4) TTA feeding abnormal

If the TTA feeding voltage cannot be detected at the CDU antenna port after the hardware and software switches of the TTA have been switched on according to the method introduced in 11.2.3 Checking CDU Antenna Port TTA Power Feeding, it can be concluded that the TTA feeding is abnormal. Substitute another CDU for the faulty CDU and keep the faulty one for further analysis.

11.4 Examples

11.4.1 Insufficient Power Tolerance of Lightning Arrester Caused Standing Wave Ratio of Antenna Feeder Abnormal

I. Description

The CDU generated standing wave ratio alarms now and then.

II. Failure handling

The engineer used Site Master to measure the standing wave ratio of the antenna feeder and carried out long-term monitoring guided by the standing wave alarm history observed on the OMC. The result showed that the standing wave ratio of the antenna feeder was too high. The engineer checked the RF signal path segment by segment and found that the failure was located at the lightning arrester. Further checking showed that the power tolerance of the lightning arrester was lower than 100 W. It was clear that the lightning arrester had been damaged under the PBU wide coverage configuration.

III. Analysis

The power tolerance of the equipment should be taken into consideration when using the wide coverage configuration, especially at antenna salvage sites.

11.4.2 EDU Internal Bias Tee Quality Problem Causing TTA Feeding Failure

I. Description

A site had been configured with a TTA, but the uplink signal was still unsatisfactory. A TTA alarm could be found at the OMC on the BSC side, but there was no TTA alarm on the EDU panel.
II. Failure handling

The TTA feeding voltage at the EDU antenna port was found to be 0, which meant that the TTA was not working and its working current was also 0. The EDU therefore reported a TTA alarm. The TMU then switched off the TTA software switch, which turned off the TTA alarm indicator on the EDU panel, but the software alarm remained. Further analysis of the EDU showed that the internal and external conductors at the power supply port of the internal Bias Tee were short-circuited. This was the cause of the DC feeding failure.

III. Analysis

TTA feeding is vulnerable to various problems. They need to be checked following the method above.

11.4.3 No Cable Connection between TX-COM and TX-DUP of CDU Causing Call Establishment Failure

I. Description

An office had been working properly while configured with only one TRX. After the capacity was expanded to two TRXs, there was no traffic at all throughout the first day. The BCCH TRX was therefore suspected to be abnormal.

II. Failure handling

Before expansion, the CDU had only one input port, TX-DUP. After expansion, there were two input ports, TX1 and TX2. However, the cable connection between TX-DUP and TX-COM of the CDU had not been restored, so the transmitting signal of the TRX could not reach the antenna. Call establishment was still possible with a test MS in the equipment room, but only because of signal leakage from the TX-COM port: the test MS was close enough to the equipment to communicate with the network without an antenna. Ordinary subscribers outside the equipment room had no chance to establish a call.

III. Analysis

The TX-COM and TX-DUP ports make configuration more convenient, but they also introduce another potential failure point. Pay attention to this connection.
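The standing wave alarm rules stated in 11.3.3, which example 11.4.1 relies on, can be summarized in a short sketch. This is a hypothetical illustration rather than vendor software: the function name `judge_standing_wave` and the result strings are invented here, and treating a ratio of exactly 1.5 as too high is an assumption, since the manual only describes the cases "lower than 1.5" and "higher than 1.5".

```python
# Thresholds quoted in 11.3.3; everything else in this sketch is hypothetical.
VSWR_ALARM_LIMIT = 1.5    # limit used to judge a CDU standing wave alarm
VSWR_INSTALL_LIMIT = 1.3  # limit required by the installation specification


def judge_standing_wave(vswr: float, cdu_alarm: bool) -> str:
    """Suggest a next step from the measured feeder VSWR (CDU excluded)."""
    if vswr < 1.0:
        # A standing wave ratio below 1 is physically impossible.
        raise ValueError("VSWR must be at least 1.0; check the measurement")
    if vswr >= VSWR_ALARM_LIMIT:
        # Assumption: exactly 1.5 is handled like "higher than 1.5".
        return "feeder VSWR too high: check the RF path segment by segment"
    if cdu_alarm:
        return "false standing wave alarm: replace the CDU"
    if vswr >= VSWR_INSTALL_LIMIT:
        return "no alarm condition, but above the installation specification"
    return "standing wave ratio normal"


if __name__ == "__main__":
    for ratio, alarm in [(1.2, False), (1.2, True), (1.4, False), (1.8, True)]:
        print(f"VSWR={ratio:.1f}, alarm={alarm}: {judge_standing_wave(ratio, alarm)}")
```

For an intermittent alarm such as the one in 11.4.1, a single measurement may fall below the limit, so the check would need to be repeated over the monitoring period.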