
A NOVEL QUALITY METRIC FOR COMPRESSED VIDEO CONSIDERING BOTH FRAME RATE AND QUANTIZATION ARTIFACTS

Yen-Fu Ou, Zhan Ma, Yao Wang
Dept. of Electrical and Computer Engineering, Polytechnic Institute of New York University, Brooklyn, NY 11201
Emails: {you01, zma03}@students.poly.edu, yao@poly.edu
ABSTRACT

In this paper, we explore the impact of frame rate and quantization on the perceptual quality of a video. We propose to use the product of a metric that assesses the quality of a quantized video at the highest frame rate and a temporal correction factor, which reduces the quality assigned by the first metric according to the actual frame rate. We found that the temporal correction factor closely follows an inverted falling exponential function, whereas the quantization effect can be captured accurately by a sigmoidal function of the PSNR. The proposed overall metric was validated using both our subjective test scores and those reported by others.

Index Terms: Video quality model, subjective quality, quantization parameter, frame rate, scalable video.

This work is supported in part by the National Science Foundation under Grant No. 0430145, and by the Joint Research Fund for Overseas Chinese Young Scholars of the National Natural Science Foundation of China under Grant No. 60528004.

1. INTRODUCTION

The development of objective quality metrics that can automatically and accurately measure perceptual video quality is becoming more and more important as video applications become pervasive. Prior work in video quality assessment is mainly concerned with applications where the frame rate of the video is fixed. In that setting, the objective quality metric compares each pair of corresponding frames to derive a similarity score or distortion between two videos with the same frame rate. In many emerging applications targeting heterogeneous users with different display devices and/or different communication links, the same video content may be accessed at varying frame rates, frame sizes, or quantization levels (assuming the video is coded into a scalable stream with spatial/temporal/SNR scalability). In applications permitting only very low bit rate video, one often has to determine whether to code an original high frame-rate video at the same frame rate but with significant quantization, or to code it at a lower frame rate with less quantization. In all the preceding scenarios, as well as many others, it is important to be able to objectively quantify the perceptual quality of a video that has been subjected to both quantization and frame rate reduction.

Prior works in [1, 2, 3] proposed quality metrics that consider the effect of frame rate. The work in [1] used a logarithmic function of the frame rate, together with motion content, to model the negative impact of frame dropping on perceptual video quality. The model was shown to correlate well with subjective ratings for both CIF and QCIF videos. The metric proposed in [2] explores the impact of regular and irregular frame drops. The authors first derive the temporal fluctuation for all the dropping occasions within one scene based on motion activity and frame dropping severity. The quality of each video scene is then determined by weighting and normalizing a logarithmic function of the temporal fluctuation and the frame dropping severity. Finally, the overall quality of the entire video is the average of the quality indices over all video scene segments. The work in [3] also considers the impact of both regular and irregular frame drops and examines the jerkiness and jitter effects caused by different levels of strength, duration, and distribution of the temporal impairment.

Besides these studies of the impact of frame rate on perceptual quality, Feghali et al. proposed a video quality metric [4, 5] that considers both frame rate and quantization effects. Their metric uses a weighted sum of two terms: one is the PSNR of the sequence interpolated from the original low frame-rate video, and the other is the frame-rate reduction. The weight depends on the motion of the sequence. The work in [6] extended that of [5] by employing a different motion feature in the weight.

Our proposed model, also a function of PSNR and frame rate, uses the product of a PSNR-based metric and a temporal correction factor (TCF). The first term assesses the quality of the video based on the average PSNR of the frames included in the video (not including interpolated frames), and the TCF reduces the quality assigned by the first metric according to the actual frame rate. Our model has only two parameters and correlates very well with the subjective ratings obtained in our subjective tests, with significantly higher correlation than the metric proposed in [6].

This paper is organized as follows. Section 2 describes our subjective test configuration and presents the test results. Section 3 presents the proposed objective metric and validates its accuracy with our subjective test data. Section 4 compares our metric with those proposed in [1, 3, 6]. Finally, Section 5 concludes the paper and discusses future work.

2. SUBJECTIVE QUALITY ASSESSMENT

2.1. Test Sequence Pool

Four video sequences, Akiyo, City, Crew, and Football, all in CIF (352 x 288) resolution at an original frame rate of 30 fps, are chosen from the JVT (Joint Video Team) test sequence pool. All sequences are coded using the Joint Scalable Video Model (JSVM912) [7], which is the reference software for the scalable extension of H.264/AVC (SVC) developed by JVT. For each sequence, one bitstream is generated with four temporal layers corresponding to frame rates of 30, 15, 7.5, and 3.75 Hz, and each temporal layer in turn has four quality layers created with QP equal to 28, 36, 40, and 44, respectively, using coarse grain scalability (CGS). As a result, there are a total of 64 processed (encoded and decoded) video sequences used for conducting the subjective quality test.

2.2. Subjective Quality Assessment and Data Postprocessing

The subjective quality assessment, illustrated in Figure 1, is carried out using a protocol similar to ACR-HR (Absolute Category Rating with Hidden Reference) described in [8]. In the test, a subject is shown one video at a time and provides an overall rating after each clip has been played completely. The rating scale ranges from 0 (worst) to 100 (best). In order to shorten the duration of the test, the experiment is divided into two subgroups. Each of them contains 38 processed video sequences (including training sequences) and lasts about 14 minutes. Each subgroup test consists of two sessions, a training session and a test session. The training session allows the subject to become accustomed to the rating procedure and to ask questions, if any. The training clips (Soccer, Waterfall) are chosen to expose viewers to the types and quality range of the testing clips. The sequences in the test session are ordered randomly so that each subject sees the video clips in a different order.

Fig. 1. Subjective quality test setup. (Rating scale from 0 to 100 with the labels bad, poor, fair, good, and excellent; each 8-second clip, a PVS or the reference, is followed by a vote; a 2-minute training session is followed by a 12-minute testing session.)

Thirty-one non-expert viewers (26 males and 5 females) with normal or corrected-to-normal visual acuity participated in one or two subgroup tests. There are on average 20 ratings for each processed video sequence.

Given the rating range from 0 to 100, different viewers' scores tend to fall in quite different subranges, so the raw score data should be normalized before analysis. We first find the minimum and maximum scores given by each viewer for a specific source sequence, then normalize all viewers' scores for this sequence by the average of the minimum scores and the average of the maximum scores among all subjects. We then average the normalized viewer ratings for the same processed video sequence to determine the mean opinion score (MOS).

In order to remove noisy ratings or outliers, we adopted, with some modification, the screening method recommended by BT.500-11 [9], designed for Single Stimulus Continuous Quality Evaluation (SSCQE), for processing the overall scores. After screening, there are on average 15 user ratings for each processed video sequence.

Fig. 2. Measured MOS against frame rate at different QP. (Four panels: Akiyo, City, Crew, Football; horizontal axis: frame rate, 0 to 30; vertical axis: MOS, 0 to 100; one curve per QP.) PSNR (dB) of each sequence at each QP, as given in the figure legends:

Sequence    QP = 28    QP = 36    QP = 40    QP = 44
Akiyo       40.55      35.16      32.70      30.75
City        35.87      30.51      28.08      26.03
Crew        37.34      32.50      30.34      28.51
Football    36.52      31.24      29.01      27.22

Figure 2 presents the subjective test results. We see that, regardless of the QP level, the MOS decreases consistently as the frame rate decreases. In order to examine whether the reduction trend of the MOS against the frame rate is independent of the quantization parameter, we plot in Figure 3 the normalized MOS, which is the ratio of the MOS to the MOS at the highest frame rate (30 Hz in our case), for all QPs. We see that these normalized curves almost overlap with each other, indicating that the reduction of the MOS with frame rate is quite independent of the QP.
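The score normalization and the normalized MOS described in Section 2.2 can be sketched in Python as follows. The paper does not spell out the normalization formula, so this is only one plausible reading: each viewer's score range for a source sequence is mapped linearly onto the range spanned by the average minimum and average maximum over all viewers. The function names and the data layout (nested dictionaries) are hypothetical and chosen for illustration only.

```python
from statistics import mean


def rescale_viewer_scores(scores_by_viewer):
    """Rescale raw scores for ONE source sequence.

    scores_by_viewer: {viewer: {clip: raw_score}} with raw scores in 0..100.
    ASSUMPTION: each viewer's [min, max] is mapped linearly onto
    [average of all viewers' minima, average of all viewers' maxima].
    """
    mins = {v: min(s.values()) for v, s in scores_by_viewer.items()}
    maxs = {v: max(s.values()) for v, s in scores_by_viewer.items()}
    avg_min = mean(mins.values())
    avg_max = mean(maxs.values())

    rescaled = {}
    for viewer, scores in scores_by_viewer.items():
        span = maxs[viewer] - mins[viewer]
        rescaled[viewer] = {}
        for clip, raw in scores.items():
            if span > 0:
                unit = (raw - mins[viewer]) / span
            else:
                # ASSUMPTION: a viewer who gave a constant score is placed mid-range.
                unit = 0.5
            rescaled[viewer][clip] = avg_min + unit * (avg_max - avg_min)
    return rescaled


def mean_opinion_scores(rescaled):
    """Average the rescaled ratings of all viewers for each clip."""
    per_clip = {}
    for scores in rescaled.values():
        for clip, value in scores.items():
            per_clip.setdefault(clip, []).append(value)
    return {clip: mean(values) for clip, values in per_clip.items()}


def normalized_mos(mos_by_frame_rate, f_max=30.0):
    """Ratio of the MOS at each frame rate to the MOS at the highest frame rate."""
    reference = mos_by_frame_rate[f_max]
    return {f: m / reference for f, m in mos_by_frame_rate.items()}
```

Applying `normalized_mos` separately to the MOS values of each QP level gives the curves whose overlap is used in the text to argue that the frame-rate effect is largely independent of the QP. The outlier screening step based on BT.500-11 is not included in this sketch.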


Fig. 3. Normalized MOS against frame rate at different QP. (Four panels: Akiyo, City, Crew, Football; horizontal axis: frame rate, 0 to 30; vertical axis: normalized MOS, 0 to 1.2; one curve per QP = 28, 36, 40, 44.)

3. PROPOSED QUALITY METRIC

As described earlier, the results in Figures 2 and 3 suggest that the impacts of frame rate and quantization are separable. Based on this observation, we propose the following metric, consisting of the product of two functions:

$$\mathrm{VQMTQ}(\mathrm{PSNR}, f) = Q(\mathrm{PSNR})\,\mathrm{TCF}(f), \tag{1}$$

where $f$ represents the frame rate and $Q(\mathrm{PSNR})$ is the quality of the video at the maximum frame rate $f_{\max}$ with the same PSNR as the average PSNR of the decoded frames. The function $\mathrm{TCF}(f)$ is called the temporal correction factor and models how the MOS decreases as the frame rate decreases. The specific forms of the functions $Q(\mathrm{PSNR})$ and $\mathrm{TCF}(f)$ are described in Sec. 3.1 and 3.2, respectively.

3.1. PSNR-based Quality Metric

PSNR is a commonly adopted metric for measuring the quality of video with encoding distortion. Our subjective test shows that, in an intermediate range of PSNR, the perceived quality correlates quite linearly with PSNR. However, human viewers tend to regard videos with very low PSNR as equally bad and those with very high PSNR as equally good. Taking this saturation effect of human vision into account, we propose to use a sigmoidal function, following the model in [10]:

$$Q(\mathrm{PSNR}) = Q_{\max}\left(1 - \frac{1}{1 + e^{p(\mathrm{PSNR} - s)}}\right), \tag{2}$$

where $Q_{\max}$ is the highest quality rating the model can assign, and $p$ and $s$ are model parameters. We have found that the optimal values for $Q_{\max}$ and $p$ are almost the same for different sequences, and hence we set $Q_{\max} = 92.8$ and $p = 0.34$ and vary only $s$ when fitting the model to the measured MOS data. Figure 4 compares the MOS obtained for sequences at 30 Hz and different PSNRs with the values obtained using the model in (2). We can see that the model, with a single parameter $s$, is very accurate. It is interesting to note that, at the same PSNR, the MOS is lower for sequences with low motion and high contrast (Akiyo and Crew), leading to a high value for the parameter $s$. This suggests that the human eye is more sensitive to quantization artifacts for such sequences.

Fig. 4. The MOS (points) and prediction curve against PSNR for each video sequence at 30 Hz. (Horizontal axis: PSNR in dB, 25 to 45; vertical axis: MOS and predicted curve, 0 to 100. Fitted values: Akiyo s = 30.65, City s = 26.32, Crew s = 29.63, Football s = 25.90.)

3.2. Temporal Correction Factor

In a prior work [11], we investigated the impact of the frame rate on the perceptual quality of uncompressed video and found that the normalized quality can be modeled very accurately by an inverted falling exponential function. We adopt the same function for the temporal correction factor, i.e.,

$$\mathrm{TCF}(f) = \frac{1 - e^{-b f / f_{\max}}}{1 - e^{-b}}. \tag{3}$$

This function describes how the frame rate impacts perceptual quality. Figure 5 shows the temporal correction factor curve together with the normalized MOS against frame rate. We can see that the fit is quite accurate for all sequences. Note that the parameter $b$ characterizes how fast the quality drops as the frame rate decreases, with a smaller $b$ indicating a faster drop. The $b$ values for the different sequences are provided in Figure 5. As expected, sequences with higher motion have faster drop rates (smaller $b$).
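As a quick numerical illustration of Eqs. (1) to (3), the following Python sketch evaluates the sigmoidal quality term, the temporal correction factor, and their product. The constants Q_max = 92.8 and p = 0.34 come from the text, and the Akiyo values of s and b are read from the legends of Figures 4 and 5. The function names are hypothetical, and f_max = 30 Hz is assumed to match the authors' test setup.

```python
import math

Q_MAX = 92.8   # maximum quality rating, from Sec. 3.1
P = 0.34       # sigmoid steepness, from Sec. 3.1
F_MAX = 30.0   # highest frame rate in the authors' tests, in Hz


def q_psnr(psnr, s, q_max=Q_MAX, p=P):
    """Eq. (2): sigmoidal quality at the maximum frame rate."""
    return q_max * (1.0 - 1.0 / (1.0 + math.exp(p * (psnr - s))))


def tcf(f, b, f_max=F_MAX):
    """Eq. (3): inverted falling exponential temporal correction factor."""
    return (1.0 - math.exp(-b * f / f_max)) / (1.0 - math.exp(-b))


def vqmtq(psnr, f, s, b):
    """Eq. (1): product of the PSNR-based quality and the TCF."""
    return q_psnr(psnr, s) * tcf(f, b)


if __name__ == "__main__":
    # Akiyo at QP = 28 (PSNR = 40.55 dB), with s and b taken from the figure legends.
    s_akiyo, b_akiyo = 30.65, 8.30
    for frame_rate in (30.0, 15.0, 7.5, 3.75):
        print(frame_rate, round(vqmtq(40.55, frame_rate, s_akiyo, b_akiyo), 1))
```

Two properties are easy to check from the code: `tcf(F_MAX, b)` equals 1 for any b, so the product reduces to Q(PSNR) at the full frame rate, and `q_psnr(s, s)` equals Q_max / 2, so s is the PSNR at which the predicted quality reaches half of its maximum.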

Fig. 5. The normalized MOS and temporal correction factor (TCF) against frame rate. (Four panels: Akiyo, City, Crew, Football; horizontal axis: frame rate in Hz, 0 to 30; vertical axis: normalized MOS, 0 to 1; measured points for each QP together with the model curve. Fitted values: Akiyo b = 8.30, City b = 7.54, Crew b = 7.38, Football b = 5.37.)

3.3. Video Quality Metric Considering Temporal Resolution and Quantization (VQMTQ)

Combining Eqs. (1), (2), and (3), we obtain the proposed video quality metric considering both temporal and quantization effects:

$$\mathrm{VQMTQ}(\mathrm{PSNR}, f) = 92.8\left(1 - \frac{1}{1 + e^{0.34(\mathrm{PSNR} - s)}}\right)\frac{1 - e^{-b f / f_{\max}}}{1 - e^{-b}}. \tag{4}$$

The model has only two video-dependent parameters, $s$ and $b$. We plot the quality predicted by this model together with the measured MOS in Figure 6. We can see that the predicted curves fit the measured MOS very well in most cases. Note that the first term is meant to predict the quality of the video at the highest frame rate, which assumes that the PSNR of the highest frame-rate video is available. When applying this model to a reduced frame-rate video, however, we calculate the PSNR using only the frames available in the reduced frame-rate video. We assume that, if the same QP is applied to all frames, the average PSNR of the highest frame-rate video will be similar to the average PSNR of the reduced frame-rate video.

Fig. 6. The predicted MOS (curves) against frame rate at different QP levels for each video sequence. (Four panels: Akiyo, City, Crew, Football; horizontal axis: frame rate in Hz, 0 to 30; vertical axis: MOS and predicted curve; one predicted curve per QP = 28, 36, 40, 44.)

4. PERFORMANCE COMPARISON

In this section, we compare our proposed metric with three metrics proposed in [1], [3], and [6]. The models in [1] and [3] consider only the effect of frame rate. The model in [1], called the negative impact of frame dropping on visual quality (NIFVQ), is given by

$$\mathrm{NIFVQ}(f) = a_1\,[\log(30) - \log(f)]^{a_2}, \tag{5}$$

with model parameters $a_1$ and $a_2$. In particular, the authors defined $\mathrm{NIFVQ} = 5 - \mathrm{MOS}$ as the degraded quality. Note that they assume the quality of all reference or highest frame-rate videos is 5, and the quality at lower frame rates is decreased according to (5). The metric in [3] models the jerkiness of the video and is given by

$$\mathrm{jerkiness}(f) = k_1 + \frac{k_2}{1 + e^{k_3 f + k_4}}. \tag{6}$$
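For readers who want to experiment with the two frame-rate-only models of Eqs. (5) and (6), a minimal Python sketch follows. It makes several assumptions that the text does not state: `log` is taken to be the natural logarithm, the reference frame rate of 30 Hz is exposed as a parameter named `f_ref`, and the jerkiness expression is used in the form k1 + k2 / (1 + exp(k3 f + k4)), with the signs of the fitted parameters left to the caller. The function names are hypothetical.

```python
import math


def nifvq(f, a1, a2, f_ref=30.0):
    """Eq. (5): quality degradation caused by reducing the frame rate to f.

    ASSUMPTION: natural logarithm. Valid for 0 < f <= f_ref only, because a
    negative base raised to a non-integer power is not a real number.
    """
    if not 0.0 < f <= f_ref:
        raise ValueError("frame rate must satisfy 0 < f <= f_ref")
    return a1 * (math.log(f_ref) - math.log(f)) ** a2


def mos_from_nifvq(f, a1, a2, f_ref=30.0):
    """MOS on the 5-point scale of [1], using NIFVQ = 5 - MOS."""
    return 5.0 - nifvq(f, a1, a2, f_ref)


def jerkiness(f, k1, k2, k3, k4):
    """Eq. (6): four-parameter sigmoidal function of the frame rate."""
    return k1 + k2 / (1.0 + math.exp(k3 * f + k4))
```

By construction, `nifvq(f_ref, a1, a2)` is zero, so the reference video keeps the top score of 5 under this model regardless of the fitted parameters.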

In order to compare these two models with our proposed model, we apply all three models to three data sets, used in [1], [3], and this paper, respectively. Table 1 summarizes these data sets. Note that the data sets in [1] and [3] include uncompressed videos only, whereas our data set contains quantized videos with different quantization parameters. For each data set, we normalize the MOS given for a test sequence at a particular frame rate (and quantization level) by the MOS for the same sequence at the highest frame rate (and at the same quantization level for DataSet#3). We apply all three models to the normalized MOS and determine the model parameters by least-squares fitting.

Table 1. Data Set Description

Data Set     Source Definition
DataSet#1    4 sequences used in [1], each with 7 frame rates (30, 15, 10, 7.5, 6, 5, 3 Hz)
DataSet#2    7 sequences used in [3], each with 6 frame rates (25, 12.5, 8.33, 6.25, 5, 2.5 Hz)
DataSet#3    4 sequences used in this paper, each with 4 frame rates (30, 15, 7.5, 3.75 Hz) and four quantization levels

Fig. 7. Predicted vs. normalized MOS for DataSet#1 [1] by the three metrics. (Panels: CoastGuard, Goldfish, Container, Daughter; horizontal axis: frame rate in fps, 0 to 30; vertical axis: normalized MOS, 0 to 1; each panel shows the subjective scores with the fitted TCF, jerkiness, and NIFVQ curves. Fitted TCF values in panel order: b = 3.71, 3.93, 5.65, 5.05.)

Fig. 8. Predicted vs. normalized MOS for DataSet#2 [3] by the three metrics. (Panels: amfoot, boxing, drink, mountain, music, ski, talk, Ave; horizontal axis: frame rate in fps, 0 to 30; vertical axis: normalized MOS, 0 to 1; each panel shows the subjective scores with the fitted TCF, jerkiness, and NIFVQ curves. Fitted TCF values in panel order: b = 5.42, 6.48, 4.69, 7.11, 6.38, 8.07, 5.09, 6.02.)

Figures 7 to 9 compare the quality indices predicted by these three models with the actual normalized MOS for the three data sets. Table 2 summarizes the Pearson correlation coefficients. The results shown in Figures 7 to 9 and Table 2 demonstrate that all three models predict the normalized MOS very well, with high correlation. Although the other two models have slightly higher correlation values, our proposed model uses only one parameter to model the normalized MOS, instead of the 2 and 4 parameters used by the models in [1] and [3], respectively.

Note that the subjective ratings of DataSet#2 in [3] show different trends for different sequences at the very low end of the frame rates. The model proposed in [3] was able to follow the subjective ratings accurately because it has four parameters. However, it is not clear whether these inconsistent trends are due to viewer inconsistencies at very low frame rates. We should note that the work in [3] actually applied the model in (6) to the average subjective ratings over all test sequences. The average quality decreases with the frame rate following the same trend as that indicated by the proposed model in (3) and the model in (5).

In addition to comparing metrics that model only the impact of frame rate, we compare our metric with the model in [6], which considers both frame rate and quantization effects and is given by

$$\mathrm{VQM}(\mathrm{PSNR}, f) = \alpha_1\,\mathrm{PSNR} + \alpha_2\,M_a\,(30 - f), \tag{7}$$

where $\alpha_1$ and $\alpha_2$ are model parameters. Here $M_a$ represents the motion activity intensity and is defined as the standard deviation of the motion vector magnitude. Note that in [6], a low frame-rate video is interpolated to the full frame rate by frame repetition. The PSNR in (7) is the average PSNR of all frames, including interpolated frames. This PSNR depends on the frame rate and is significantly lower than the average PSNR computed from non-interpolated frames.
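The least-squares fitting and the Pearson correlation used in this comparison can be sketched as follows for the one-parameter TCF model. The paper does not say which optimizer the authors used, so the golden-section search here is a stand-in chosen for being dependency-free, and the search interval [0.1, 20] for b is an assumption. The demonstration data are synthetic, generated from the model itself with b = 6, and are not measurements from any of the data sets.

```python
import math


def tcf(f, b, f_max=30.0):
    """Temporal correction factor of Eq. (3)."""
    return (1.0 - math.exp(-b * f / f_max)) / (1.0 - math.exp(-b))


def fit_b(frame_rates, norm_mos, f_max=30.0, lo=0.1, hi=20.0, tol=1e-6):
    """Least-squares estimate of b by golden-section search.

    ASSUMPTION: the squared error has a single minimum for b in [lo, hi].
    """
    def sse(b):
        return sum((tcf(f, b, f_max) - q) ** 2
                   for f, q in zip(frame_rates, norm_mos))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    left, right = lo, hi
    x1 = right - inv_phi * (right - left)
    x2 = left + inv_phi * (right - left)
    while right - left > tol:
        if sse(x1) < sse(x2):
            # The minimum lies in [left, x2].
            right, x2 = x2, x1
            x1 = right - inv_phi * (right - left)
        else:
            # The minimum lies in [x1, right].
            left, x1 = x1, x2
            x2 = left + inv_phi * (right - left)
    return (left + right) / 2.0


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((c - mean_y) ** 2 for c in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    rates = [30.0, 15.0, 7.5, 3.75]
    synthetic = [tcf(f, 6.0) for f in rates]   # synthetic, not measured, scores
    b_hat = fit_b(rates, synthetic)
    predicted = [tcf(f, b_hat) for f in rates]
    print(b_hat, pearson(predicted, synthetic))
```

The same two functions could be reused for the other models by swapping in a different prediction function, although the multi-parameter models of Eqs. (5) and (6) would need a multidimensional optimizer rather than a one-dimensional search.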
The correlation between the quality predicted by this model and the measured MOS for our data set (DataSet#3) is illustrated in Figure 10(a). Figure 10(b) shows the correlation of our proposed model VQMTQ with the measured MOS. As can be seen, our proposed model correlates with the measured MOS much better than the model in [6].

Fig. 9. Predicted vs. normalized MOS for DataSet#3 by the three metrics. (Panels: Akiyo, City, Crew, Football; horizontal axis: frame rate in fps, 0 to 30; vertical axis: normalized MOS, 0 to 1; each panel shows the subjective scores with the fitted TCF, jerkiness, and NIFVQ curves. Fitted TCF values in panel order: b = 8.55, 7.41, 7.23, 5.26.)

Fig. 10. The correlation of predicted and measured MOS for DataSet#3 using (a) VQM proposed in [6], (b) VQMTQ. (Both panels: horizontal axis: MOS; vertical axis: predicted MOS; points marked by QP = 28, 36, 40, 44.)

Table 2. Pearson Correlation Coefficients of All the Models

Quality Metric    DataSet#1    DataSet#2    DataSet#3
TCF               0.99         0.97         0.97
NIFVQ [1]         0.99         0.97         0.98
Jerkiness [3]     0.99         0.99         0.98
VQMTQ             -            -            0.98
VQM [6]           -            -            0.85

5. CONCLUSION

This work is concerned with the impact of quantization and frame rate on the perceptual quality of a video. We demonstrate that the degradation of the perceptual quality due to quantization and frame-rate reduction can be accurately captured by two separate functions (a sigmoidal function and an inverted falling exponential function). Each function has a single parameter that is video-content dependent. We are currently studying the dependency of these parameters on content features that can be easily derived from the underlying video. The proposed model is shown to be highly accurate when compared with the subjective ratings for a large set of test sequences, including subjective ratings reported by other groups. We note that it is possible to replace the sigmoidal function with other metrics that can more accurately assess the quality of a video at the full frame rate.

6. REFERENCES

[1] Z. Lu, W. Lin, B. C. Seng, S. Kato, S. Yao, E. Ong, and X. K. Yang, "Measuring the Negative Impact of Frame Dropping on Perceptual Visual Quality," in Proc. SPIE Human Vision and Electronic Imaging, vol. 5666, Jan. 2005, pp. 554-562.
[2] K.-C. Yang, C. C. Guest, K. El-Maleh, and P. K. Das, "Perceptual Temporal Quality Metric for Compressed Video," IEEE Trans. on Multimedia, vol. 9, pp. 1528-1535, Nov. 2007.
[3] H.-T. Quan and M. Ghanbari, "Temporal Aspect of Perceived Quality of Mobile Video Broadcasting," IEEE Trans. on Broadcasting, vol. 54, no. 3, pp. 641-651, Sept. 2008.
[4] R. Feghali, D. Wang, F. Speranza, and A. Vincent, "Quality metric for video sequences with temporal scalability," in Proc. of ICIP, vol. 3, Sep. 2005, pp. III-137-40.
[5] R. Feghali, D. Wang, F. Speranza, and A. Vincent, "Video quality metric for bit rate control via joint adjustment of quantization and frame rate," IEEE Trans. on Broadcasting, vol. 53, no. 1, pp. 441-446, Mar. 2007.
[6] S. H. Jin, C. S. Kim, D. J. Seo, and Y. M. Ro, "Quality Measurement Modeling on Scalable Video Applications," in Proc. of IEEE Workshop on Multimedia Signal Processing, Oct. 2007, pp. 131-134.
[7] "Joint Scalable Video Model," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, Doc. JVT-X202, Jul. 2007.
[8] ITU-T Rec. P.910, "Subjective Video Quality Assessment Methods for Multimedia Applications," 1999.
[9] ITU-R Rec. BT.500-11, "Methodology for the subjective assessment of the quality of television pictures," 2002.
[10] S. Wolf and M. Pinson, "Video Quality Measurement Techniques," NTIA, Tech. Rep. 02-392, Jun. 2002.
[11] Y.-F. Ou, T. Liu, Z. Zhao, Z. Ma, and Y. Wang, "Modeling The Impact of Frame Rate on Perceptual Quality of Video," in Proc. of ICIP, 2008.
