\[
\mathrm{rmse}^{*} = \sqrt{\frac{1}{N-d}\sum_{i=1}^{N} \mathrm{Perror}(i)^{2}}
\]
The Perror is defined as:
\[
\mathrm{Perror}(i) = \max\bigl(0,\ \lvert \mathrm{MOSLQS}(i) - \mathrm{MOSLQO}(i) \rvert - ci_{95}(i)\bigr)
\]
where the index i denotes the condition of the speech sample, N denotes
the number of conditions or speech samples, and d denotes the degrees of
freedom (d = 4 in the case of a 3rd-order regression).
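The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names perror and rmse_star, the argument names, and the sample values are assumptions made for the example and do not come from the POLQA standard or from any test database.

```python
import math


def perror(mos_lqs, mos_lqo, ci95):
    """Prediction error for one condition.

    The absolute difference between the subjective score (MOS-LQS) and the
    objective score (MOS-LQO) is reduced by the 95% confidence interval of
    the subjective score and floored at zero.
    """
    return max(0.0, abs(mos_lqs - mos_lqo) - ci95)


def rmse_star(mos_lqs, mos_lqo, ci95, d=4):
    """rmse* over N conditions with d degrees of freedom.

    d defaults to 4, the value the text gives for a 3rd-order regression.
    """
    n = len(mos_lqs)
    if len(mos_lqo) != n or len(ci95) != n:
        raise ValueError("all inputs must have the same length")
    if n <= d:
        raise ValueError("need more conditions than degrees of freedom")
    squared = sum(
        perror(s, o, c) ** 2 for s, o, c in zip(mos_lqs, mos_lqo, ci95)
    )
    return math.sqrt(squared / (n - d))


if __name__ == "__main__":
    # Hypothetical per-condition values, for illustration only.
    subjective = [4.0, 3.0, 2.0, 3.5, 1.5, 4.5]
    objective = [4.1, 3.5, 2.0, 3.0, 2.5, 4.4]
    intervals = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
    print(round(rmse_star(subjective, objective, intervals), 3))
```

Conditions whose prediction falls inside the confidence interval of the subjective score contribute nothing to the sum, so only predictions that miss by more than the subjective uncertainty are penalized.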
The results reported in [12] provide general information on POLQA
performance across a broad range of databases covering a large variety of
technologies, codecs, and bandwidths. Because these results represent
overall performance, they can be misleading to a certain extent: given the
variety of databases and the statistical aggregation procedure applied to
the results [3], [12], weaker or better performance for a specific
application and/or bandwidth could be smoothed out or hidden. Further
analysis is therefore needed when more detail is required or when a
particular application is of interest. This analysis is planned by ITU-T
during the POLQA characterization phase, and the results are expected to be
published in the forthcoming POLQA Application Guide (estimated for June
2011).
4 Beyond the MOS Score
Due to the complexity of the NGN environment, as well as the challenges in
supporting voice service on LTE-SAE/SON networks, several solutions for
providing voice service are currently envisioned. Therefore, test and
evaluation of speech quality in the NGN environment must be
comprehensive. To understand and cost-efficiently control the speech
degradation introduced by the different implementation solutions,
evaluation techniques need to go beyond the MOS score.
Ascom (2010) Document:
NT11-1037 12(13)
To a large extent, as in the PESQ case, the interim calculations of POLQA,
as well as the six degradation parameters used as input to the POLQA
algorithm's cognitive model, would allow some network diagnosis based on
speech quality evaluation. Details are discussed in [1], but in general the
main diagnosis could cover aspects such as latency, jitter (variable delay),
gain variations, speech signal and background noise (BGN) level
measurements, level clipping, dropouts (e.g., generated by packet loss),
operability of voice activity detection (VAD), and short-term spectra
(linear degradations caused by the frequency response of the devices and/or
by the VoIP landline connection).
5 Ascom Network Testing Presence in the
Standardization Work on Objective Evaluation
Metrics for Listening Speech Quality
For more than 10 years, Ascom Network Testing has been an active
member within ITU-T Study Group 12, which develops objective speech
quality evaluation metrics. Our contributions to the standardization work
cover different areas and stages of objective metric development.
Ascom Network Testing contributed live-recorded speech databases
needed to accurately train and tune the algorithms for the real-life
scenarios typical of the network troubleshooting, optimization, and
operation tasks performed by operators. Within ITU-T, we were the initiator
and developer of the statistical evaluation procedure for objective metrics
that was first applied to PESQ and that was later applied in a modified form
to POLQA [8]. Recently, based on our initial work as well as work
performed for POLQA performance evaluation, Ascom Network Testing
introduced a new study item within ITU-T on a more general statistical
evaluation procedure to be applied to various types of objective metrics [9].
This type of evaluation is becoming essential for all kinds of
objective metrics (e.g., speech, video, audio, multimedia) that are designed
for testing in real-life networks and therefore for implementation in
network testing tools. We also developed a technique for calibrating
objective quality metrics to the MOS scale. As a result, we co-authored two
standards related to PESQ: P.862.1 (Mapping PESQ to MOS domain)
and P.862.3 (Guidance for PESQ usage) [2].
Additionally, Ascom Network Testing recently wrote a white paper
contribution [10] on aspects related to POLQA implementation in field
testing tools, as well as a white paper contribution related to topics that are
required to be studied during the POLQA characterization phase [11].
6 Conclusions
The convergence and coexistence of voice, data, and multimedia
application services involve a multitude of factors that invariably
produce new types of distortions, which dynamically, variably, and sometimes
randomly affect voice service quality. Today, speech quality is determined
by more than the speech codecs used or the frames lost. Networks and devices
now integrate many new components, ranging from voice enhancement
devices to new techniques such as time scaling.
Extensive work has been performed during the past decade by both the
ITU-T and the telecommunication industry to develop speech quality
evaluation algorithms designed to accurately evaluate the impact of any
network degradation on subscriber perception, as well as to cope with the
complex testing conditions of 3G networks and beyond. The new POLQA
technology was developed to cope with the complexity of these evolving
networks. As with all new technologies, extensive live testing is
expected to complete the picture of the POLQA algorithm's performance.
Ascom Network Testing, a proven veteran in ITU-T work on the evaluation of
objective quality metrics, continues to play an active role in the
standardization work on this topic.
7 References
[1] I. Cotanis, Voice Services in the Next Generation Networks/LTE-
SON as Perceived by Users, Ascom Network Testing white paper,
November 2010.
[2] ITU-T P.862.x series: P.862 (PESQ algorithm), P.862.1 (Mapping to
MOS domain), P.862.2 (WB-PESQ), P.862.3 (PESQ Application
guide).
[3] ITU-T P.863, Perceptual Objective Listening Quality Assessment
(POLQA), Geneva, January 2011.
[4] ITU-T P.563, Single-ended method for objective speech quality
assessment in narrow-band telephony applications.
[5] ITU-T P.564, Conformance testing for voice over IP transmission
quality assessment models.
[6] ITU-T TD SG 12 Gen 345, Final report of Working Party 2,
Geneva, May 2010.
[7] ITU-T P.800, Subjective testing of overall listening speech quality.
[8] I. Cotanis, ITU-T SG12/Q9 C137, A procedure for statistical
evaluation of the objective quality metrics performance, May 2008.
[9] I. Cotanis, ITU-T C151, Proposal on statistical evaluation
framework for objective quality algorithms, submitted for ITU-T
January 2011 meeting.
[10] I. Cotanis, ITU-T SG 12 C112, Some aspects related to P.OLQA
standard, May 2010.
[11] I. Cotanis, ITU-T C142, Proposed study items for POLQA
characterization phase, September 2010.
[12] Opticom, TNO, SwissQual, ITU-T C148, Performance of the joint
POLQA model, September 2010.
[13] POLQA coalition, www.polqa.info, July 2010.