
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 425-440 (2008)

An Evaluation Framework for More Realistic Simulations of MPEG Video Transmission


CHIH-HENG KE1, CE-KUEN SHIEH2, WEN-SHYANG HWANG3 AND ARTUR ZIVIANI4
1 Department of Computer Science and Information Engineering, National Kinmen Institute of Technology, Kinmen, 892 Taiwan
2 Department of Electrical Engineering, National Cheng Kung University, Tainan, 701 Taiwan
3 Department of Electrical Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, 807 Taiwan
4 National Laboratory for Scientific Computing (LNCC), Petrópolis, Rio de Janeiro, 25651-075 Brazil

Received January 9, 2006; revised June 19, 2006; accepted August 2, 2006. Communicated by Chung-Sheng Li.

We present a novel and complete tool-set for evaluating the delivery quality of MPEG video transmissions in simulations of a network environment. This tool-set is based on the EvalVid framework. We extend the connecting interfaces of EvalVid to replace its simple error simulation model by a general network simulator, NS2. With this combination, researchers and practitioners can analyze through simulation the performance of real video streams, i.e., taking into account the video semantics, under a large range of network scenarios. To demonstrate the usefulness of the new tool-set, we show that it enables the investigation of the relationship between two popular objective metrics for Quality of Service (QoS) assessment of video delivery: the PSNR (Peak Signal to Noise Ratio) and the fraction of decodable frames. The results show that the fraction of decodable frames reflects well the behavior of the PSNR metric, while being less time-consuming. Therefore, the fraction of decodable frames can be an alternative metric to objectively assess, through simulations, the delivery quality of video transmissions in a network.

Keywords: network simulation, MPEG video, EvalVid, NS2, PSNR, fraction of decodable frames

1. INTRODUCTION
The ever-increasing demand for multimedia distribution over the Internet motivates research on how to deliver better video quality through IP-based networks [1]. Previous studies [2-7] often use publicly available real video traces to evaluate their proposed network mechanisms in a simulation environment [8-12]. Results are usually presented using different performance metrics, such as the packet/frame loss rate, packet/frame jitter [13], effective frame loss rate [8], picture quality rating (PQR) [13], and the fraction of decodable frames [9]. Nevertheless, packet loss and jitter rates are network-level performance metrics and may be insufficient to adequately rate the quality perceived by a (human) end user. Although the effective frame loss rate, PQR, and the fraction of decodable frames are application-level Quality of Service (QoS) metrics, they are not as well known and accepted as MOS (Mean Opinion Score) and PSNR (Peak Signal to Noise Ratio) [14]. Furthermore, it is hard to extensively study the effects of proposed network mechanisms on different characteristics of the same video, because the encoding settings of the publicly available video traffic traces are limited. As a consequence, how to best simulate and evaluate the performance of video quality delivery in a simulated network environment is a recurring open issue in network simulation forums, such as [15].

EvalVid [16], a complete framework and tool-set for evaluating the quality of video transmitted over a real or simulated communication network, provides packet/frame loss rate, packet/frame jitter, PSNR, and MOS metrics for video quality assessment. The primary aim of EvalVid is to assist researchers or practitioners in evaluating their network designs or setups in terms of the video quality perceived by the end user. Nevertheless, the simulated environment provided by EvalVid is simply an error model that represents corrupted or missing packets in the real network. The lack of generality of this simple error model causes problems for researchers or practitioners who seek to assess the video quality delivered to end users in more complex and realistic network scenarios. For example, when transmitting video packets via unicast over an IEEE 802.11 wireless network, the MAC layer at a sender retransmits an unacknowledged packet at most N times before it gives up. The rate of correctly perceived packets at the application level is thus

$$P_{\mathrm{correct}} = (1-p)\sum_{i=1}^{N} p^{\,i-1} = 1 - p^{N},$$

where $N$ is the maximum number of retransmissions at the MAC layer and $p$ is the packet error rate at the physical level. As a consequence, the application-level error rate is $p_{\mathrm{effective}} = p^{N}$.
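As a minimal numeric check of the formulas above (the parameter values here are hypothetical, chosen only for illustration):

```python
# Application-level error rate under MAC-layer retransmissions: a minimal
# sketch. Function names are ours; p and n are hypothetical example values.

def app_level_error_rate(p: float, n: int) -> float:
    """Probability a packet is still lost after n MAC retransmission
    attempts, given physical-level packet error rate p: p_effective = p**n."""
    return p ** n

def app_level_correct_rate(p: float, n: int) -> float:
    """P_correct = (1 - p) * sum_{i=1..n} p**(i-1), which equals 1 - p**n."""
    return sum((1 - p) * p ** (i - 1) for i in range(1, n + 1))

p, n = 0.1, 4  # hypothetical physical error rate and retry limit
assert abs(app_level_correct_rate(p, n) - (1 - app_level_error_rate(p, n))) < 1e-12
print(app_level_error_rate(p, n))  # 0.0001: far below the raw 10% loss rate
```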

In this kind of scenario, the results obtained from the original EvalVid framework are misleading, since its simple error model does not take the retransmission mechanism into consideration.

This paper integrates EvalVid with NS2 [17], a widely adopted network simulator. On the one hand, the resulting tool-set allows network researchers and practitioners to analyze proposed network designs in the presence of real video traffic in a straightforward way. On the other hand, mechanisms for enhancing the delivery quality of video streams can be evaluated in more complex simulated network scenarios, with characteristics such as relatively large topologies, broadband access, limited bandwidth, wireless links, node mobility, and whatever other functionality the network simulator offers.

Furthermore, we use the new evaluation framework provided by this tool-set to investigate the relationship between two objective QoS assessment metrics: PSNR [18] and the fraction of decodable frames [9]. PSNR takes the video content into account and is hence more time-consuming to compute than the fraction of decodable frames, which is straightforward. The analysis enabled by the new tool-set shows that the fraction of decodable frames reflects the behavior of the PSNR metric adequately, while being less time-consuming.

To the best of our knowledge, no other tool-set is publicly available that performs a comprehensive video quality evaluation of real video streams in a network simulation environment. We argue that the proposed tool-set enables more realistic simulations of video transmission in a dual sense. It enables video-coding or video-QoS technicians to simulate the effects of a realistic network on the video sequences resulting from their coding or QoS schemes, and it likewise enables networking practitioners to evaluate the effects of real video streams on proposed network protocols, for instance. Indeed, we believe that our tool-set provides convergence toward more realistic simulations of video transmission in the broad sense, enabling a large range of video transmission scenarios to be evaluated. References [19-21] are examples that use this tool-set to evaluate their respective proposed mechanisms. The tool-set is publicly available at [22].

The remainder of this paper is organized as follows. Section 2 provides a brief overview of EvalVid. Section 3 describes the connecting agents developed between EvalVid and NS2, as well as an improved fix YUV program that replaces the conventional one. Section 4 analyzes the proposed QoS assessment framework for video streams, using two examples to illustrate the video quality evaluation. Section 5 investigates the relationship between the QoS assessment metrics PSNR and the fraction of decodable frames. Finally, section 6 presents the concluding remarks.

2. OVERVIEW OF EVALVID
The structure of the EvalVid framework is shown in Fig. 1, redrawn from [16].

Fig. 1. Schematic illustration of the evaluation framework provided by EvalVid.

The main components of the evaluation framework are described as follows:


Source The video source can be either in the YUV QCIF (176 × 144) or in the YUV CIF (352 × 288) format.

Video Encoder and Video Decoder Currently, EvalVid supports only single-layer video coding. It supports three kinds of MPEG4 codecs, namely the NCTU codec [23], ffmpeg [24], and Xvid [25]. The focus of this investigation is the NCTU codec for video coding purposes.
VS (Video Sender) The VS component reads the compressed video file from the output of the video encoder, fragments each large video frame into smaller segments, and then transmits these segments via UDP packets over a real or simulated network. For each transmitted UDP packet, the framework records the timestamp, the packet ID, and the packet payload size in the sender trace file with the aid of third-party tools, such as tcp-dump [26] or win-dump [27], if the network is a real link. Nevertheless, if the network is simulated, the sender trace file is provided by the sending entity of the simulation. The VS component also generates a video trace file that contains information about every frame in the real video file. The video trace file and the sender trace file are later used for subsequent video quality evaluation. Examples of a video trace file and a sender trace file are shown in Tables 1 and 2, respectively. It can be seen that the packets with IDs 1 to 4 originate from the same video frame since their transmission times are equal.
Table 1. Example of video trace file.
Frame Number | Frame Type | Frame Size (bytes) | Number of UDP packets | Sender Time
      0      |     H      |         29         |       1 segment       |    33 ms
      1      |     I      |       3036         |      4 segments       |    67 ms
      2      |     P      |        659         |       1 segment       |    99 ms
      3      |     B      |        357         |       1 segment       |   132 ms
      4      |     B      |        374         |       1 segment       |   165 ms
     ...     |    ...     |        ...         |          ...          |     ...

Table 2. Example of sender trace file.


Time Stamp (sec) | Packet ID | Packet Type | Payload Size (bytes)
    0.033333     |     0     |     udp     |          29
    0.066666     |     1     |     udp     |        1000
    0.066666     |     2     |     udp     |        1000
    0.066666     |     3     |     udp     |        1000
    0.066666     |     4     |     udp     |          36
    0.099999     |     5     |     udp     |         659
    0.133332     |     6     |     udp     |         357
    0.166665     |     7     |     udp     |         374
       ...       |    ...    |     ...     |         ...
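The payload sizes in Table 2 follow directly from fragmenting the frame sizes in Table 1 at a fixed maximum payload. The following is a minimal sketch of that fragmentation step (the 1000-byte payload limit is inferred from Table 2; function and variable names are ours, not the actual VS code):

```python
# VS-style frame fragmentation: split each video frame into UDP payloads of
# at most MAX_PAYLOAD bytes (the limit is inferred from Table 2).

MAX_PAYLOAD = 1000  # bytes per UDP packet

def fragment_frame(frame_size: int) -> list[int]:
    """Return the list of UDP payload sizes for one video frame."""
    payloads = []
    while frame_size > 0:
        chunk = min(frame_size, MAX_PAYLOAD)
        payloads.append(chunk)
        frame_size -= chunk
    return payloads

# The frame sizes of Table 1 reproduce the payload column of Table 2;
# e.g., the 3036-byte I frame becomes packets of 1000, 1000, 1000, and 36.
for size in [29, 3036, 659, 357, 374]:
    print(size, "->", fragment_frame(size))
```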

ET (Evaluate Trace) Once the video transmission is over, the evaluation task begins. The evaluation takes place at the sender side; therefore, the information about the timestamp, the packet ID, and the packet payload size available at the receiver has to be transported back to the sender. Based on the original encoded video file, the video trace file, the sender trace file, and the receiver trace file, the ET component creates a frame/packet loss and frame/packet jitter report and generates a reconstructed video file, which corresponds to the possibly corrupted video found at the receiver side as it would be reproduced to an end user. In principle, the generation of the potentially corrupted video can be regarded as a process of copying the original encoded video file frame by frame, omitting the frames indicated as lost or corrupted at the receiver side. Nevertheless, the generation of the possibly corrupted video is more complex than this, and the process is explained in more detail in section 3.2. Furthermore, the current version of the ET component implements the cumulative inter-frame jitter algorithm [8] for the play-out buffer: if a frame arrives later than its defined playback time, it is counted as a lost frame. This is an optional function. The size of the play-out buffer must also be set; otherwise, it is assumed to be infinite.
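The loss-identification rule just described can be sketched as follows (a rough sketch only; the data layout and names are our assumptions, not EvalVid's actual trace formats):

```python
# ET-style loss identification: a frame counts as lost if any of its packets
# is missing from the receiver trace, or arrives after the frame's play-out
# deadline (the optional play-out buffer rule).

def lost_frames(frame_packets, recv_time, playback_deadline):
    """frame_packets: frame number -> list of packet IDs (from the video and
    sender trace files); recv_time: packet ID -> arrival time (receiver
    trace); playback_deadline: frame number -> latest acceptable arrival."""
    lost = set()
    for frame, packets in frame_packets.items():
        for pid in packets:
            if pid not in recv_time or recv_time[pid] > playback_deadline[frame]:
                lost.add(frame)
                break
    return lost
```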
FV (Fix Video) Digital video quality assessment is performed frame by frame. Therefore, the total number of video frames at the receiver side, including the erroneous frames, must be the same as that of the original video at the sender side. If the codec cannot handle missing frames, the FV component tackles this problem by inserting the last successfully decoded frame in the place of each lost frame, as an error concealment technique [28].

PSNR (Peak Signal to Noise Ratio) PSNR is one of the most widespread objective metrics to assess the application-level QoS of video transmissions. The following equation shows the definition of the PSNR between the luminance component Y of source image S and destination image D:

$$\mathrm{PSNR}(n)_{\mathrm{dB}} = 20\,\log_{10}\!\left(\frac{V_{\mathrm{peak}}}{\sqrt{\dfrac{1}{N_{\mathrm{col}}\,N_{\mathrm{row}}}\displaystyle\sum_{i=0}^{N_{\mathrm{col}}}\sum_{j=0}^{N_{\mathrm{row}}}\left[Y_S(n,i,j)-Y_D(n,i,j)\right]^{2}}}\right),$$

where $V_{\mathrm{peak}} = 2^{k} - 1$ and $k$ is the number of bits per pixel (luminance component). PSNR measures the error between a reconstructed image and the original one. Prior to transmission, it is possible to compute a reference PSNR value sequence on the reconstruction of the encoded video as compared to the original raw video. After transmission, the PSNR is computed at the receiver for the reconstructed video of the possibly corrupted video sequence received. The individual PSNR values at the source or receiver do not mean much, but the difference between the quality of the encoded video at the source and the received one can be used as an objective QoS metric to assess the transmission impact on video quality at the application level.
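The per-frame computation is straightforward to express in code. The following is a minimal sketch over the luminance plane of two raw YUV420 files (QCIF resolution and 8-bit samples are assumed here; the file layout is the standard planar YUV420 format, not a claim about EvalVid's implementation):

```python
# Frame-by-frame PSNR over the Y plane of two raw YUV420 files.
import numpy as np

W, H = 176, 144                  # QCIF, assumed for this sketch
FRAME_BYTES = W * H * 3 // 2     # YUV420: Y plane plus quarter-size U and V
V_PEAK = 2 ** 8 - 1              # k = 8 bits per luminance sample

def psnr_per_frame(src_path: str, dst_path: str) -> list[float]:
    psnrs = []
    with open(src_path, "rb") as fs, open(dst_path, "rb") as fd:
        while True:
            s = fs.read(FRAME_BYTES)
            d = fd.read(FRAME_BYTES)
            if len(s) < FRAME_BYTES or len(d) < FRAME_BYTES:
                break  # end of the shorter file
            # Keep only the Y plane (the first W*H bytes of each frame).
            ys = np.frombuffer(s[:W * H], np.uint8).astype(np.float64)
            yd = np.frombuffer(d[:W * H], np.uint8).astype(np.float64)
            mse = np.mean((ys - yd) ** 2)
            psnrs.append(float("inf") if mse == 0
                         else 20 * np.log10(V_PEAK / np.sqrt(mse)))
    return psnrs
```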
Table 3. Possible PSNR to MOS conversion [29].
PSNR [dB] | MOS
   > 37   | 5 (Excellent)
  31-37   | 4 (Good)
  25-31   | 3 (Fair)
  20-25   | 2 (Poor)
   < 20   | 1 (Bad)


MOS (Mean Opinion Score) MOS is a subjective metric to measure digital video quality at the application level. This metric of the human quality impression is usually given on a scale that ranges from 1 (worst) to 5 (best). In this framework, the PSNR of every single frame can be approximated to the MOS scale using the mapping shown in Table 3.
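That mapping amounts to a simple lookup; a sketch is given below (the handling of values falling exactly on the 20, 25, 31, and 37 dB boundaries is our choice, since Table 3 leaves it open):

```python
# Table 3's PSNR-to-MOS conversion as a lookup function.

def psnr_to_mos(psnr_db: float) -> int:
    if psnr_db > 37: return 5   # Excellent
    if psnr_db > 31: return 4   # Good
    if psnr_db > 25: return 3   # Fair
    if psnr_db > 20: return 2   # Poor
    return 1                    # Bad
```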

3. ENHANCEMENT OF EVALVID
This section introduces the proposed enhancement of EvalVid by constructing three connecting interfaces (agents) between EvalVid and NS2. Additionally, this section discusses the problem associated with the conventional fix YUV component (FV) and develops an improved fix YUV component to overcome this problem.
3.1 New Network Simulation Agents

Fig. 2 illustrates the QoS assessment framework for video traffic enabled by the new tool-set that combines EvalVid and NS2. As shown in Fig. 2, three connecting simulation agents, namely MyTrafficTrace, MyUDP, and MyUDPSink, are implemented between NS2 and EvalVid. These interfaces are designed either to read the video trace file or to generate the data required to evaluate the quality of delivered video.

Fig. 2. Interfaces between EvalVid and NS2.

Consequently, the whole evaluation process starts with encoding the raw YUV video; the VS program then reads the compressed file and generates the traffic trace file. The MyTrafficTrace agent extracts the frame type and the frame size from the video trace file generated by the VS program, fragments the video frames into smaller segments, and sends these segments to the lower UDP layer at the appropriate time, according to the user settings specified in the simulation script file.

MyUDP is an extension of the UDP agent. This new agent allows users to specify the output file name of the sender trace file, and it records the timestamp, packet ID, and payload size of each transmitted packet. The task of the MyUDP agent corresponds to the task that tools such as tcp-dump or win-dump perform in a real network environment.

MyUDPSink is the receiving agent for the fragmented video frame packets sent by MyUDP. This agent records the timestamp, packet ID, and payload size of each received packet in the user-specified receiver trace file.

After the simulation, based on these three trace files and the original encoded video, the ET program produces the corrupted video file. Afterward, the corrupted video is decoded and error-concealed. Finally, the reconstructed fixed YUV video can be compared with the original raw YUV video to evaluate the end-to-end delivered video quality.
3.2 Problem of the Original FV Program

As described in section 2, when the video transmission is over, the receiver trace file has to be sent back to the sender side for the video quality evaluation. Based on the video trace file, the sender trace file, and the receiver trace file, the lost frames can be identified. If a frame is lost due to packet loss, the ET component sets the vop_coded bit of the corresponding video object plane (VOP) header in the original compressed video file to 0, indicating that no subsequent data exists for this VOP. This type of frame is referred to as a vop-not-coded frame. When a frame is received completely and its vop_coded bit is set to 1, the frame is referred to as a decodable frame. After setting the vop_coded bit to 0 for all the lost frames, the processed file is used to represent the compressed video file delivered to the receiver side.

Currently, no standard defines an appropriate treatment of vop-not-coded frames. Some decoders with an error concealment mechanism simply replace vop-not-coded frames by the last successfully decoded frame [28]; in these cases, the FV component is not required. Other decoders without error concealment, such as ffmpeg, decode all frames other than the vop-not-coded ones; in these cases, the FV component can handle the vop-not-coded frames without difficulty by simply replacing them with the last successfully decoded frame. Decoders such as Xvid or the NCTU codec, however, additionally fail to decode subsequent frames in some cases: a frame that is itself decodable may fail to decode if a frame it depends on is vop-not-coded, because there is not enough information to decode it. This type of frame is referred to as a non-decodable frame. In this case, the original FV component fails, since it does not take this possibility into consideration.

Given these limitations, a new algorithm is required that solves the problem of non-decodable frames. In this study, we develop an algorithm that uses the decoder output to fix the decoding results, i.e., the reconstructed erroneous video sequence. If a frame is decodable, the improved FV component copies the decoded YUV frame data from the reconstructed erroneous raw video file into a temporary file and keeps it in a buffer as the last successfully decoded frame data. If a frame is vop-not-coded, the improved FV component reads the frame data from the reconstructed erroneous raw video file but does not copy it into the temporary file; the data read is useless, and the read only moves the file pointer to the next frame. The improved FV component copies the data from the buffer into the temporary file instead. If a frame is missing or considered non-decodable, the improved FV component simply copies the last successfully decoded YUV frame data in the buffer into the temporary file. After processing all the frames in the reconstructed and possibly corrupted video sequence, the resulting temporary file is the reconstructed fixed video sequence. Afterwards, the frame-by-frame PSNR can be evaluated in the usual manner.
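The improved FV loop can be condensed as follows (a sketch under our assumptions: the per-frame labels are taken as given from the ET/decoder stage, the first frame is assumed to be a successfully decoded I-frame, and the names are ours rather than the tool's actual code):

```python
# Improved FV: rebuild a fixed YUV sequence from the decoder output.

def fix_video(decoded_frames, statuses, out):
    """decoded_frames: iterator over raw YUV frames of the reconstructed
    erroneous video, in decoding order; statuses: per-frame labels among
    'decodable', 'vop_not_coded', 'missing'; out: writable binary file."""
    last_good = None
    for status in statuses:
        if status == 'decodable':
            last_good = next(decoded_frames)  # copy and remember this frame
            out.write(last_good)
        elif status == 'vop_not_coded':
            next(decoded_frames)              # advance past the useless data
            out.write(last_good)              # conceal with last good frame
        else:  # missing or non-decodable: the decoder produced no data
            out.write(last_good)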

4. SIMULATION RESULTS
This section demonstrates the usefulness of the new tool-set by considering two experimental cases, simulated in a best-effort network and in a DiffServ (Differentiated Services) network [19, 30, 31], when transmitting real video streams instead of synthetically generated video flow sequences. Fig. 3 presents the simple simulation topology, in which Host A delivers a video traffic stream to Host B through routers R1 and R2. The delivered video is a foreman QCIF-format sequence composed of 400 frames, with a mean bit rate of 200 Kbps and a peak bit rate of 400 Kbps. The bottleneck link, between routers R1 and R2, has a capacity of 180 Kbps. The queue limit at each router is set to 10 packets. The simulation scripts are publicly available at [22].


Fig. 3. Simulation topology.

4.1 Conventional Best-Effort Network

In the first experiment, the video is delivered over a best-effort network, and routers R1 and R2 implement conventional First In First Out (FIFO) queue management. When the queue size reaches the queue limit, the FIFO queue management discards all incoming packets until the queue size decreases. Fig. 4 shows the results. The curve of psnr_myfix_be, the video fixed by the improved FV component, clearly outperforms that of psnr_fix_be, the video fixed by the original component, in the intervals from frame 200 to frame 250 and above frame 370. This is because the original FV component cannot distinguish between vop-not-coded frames and missing frames; as a consequence, it may copy the wrong frame data from the reconstructed erroneous raw video file into the temporary file. In terms of average PSNR, the psnr_myfix_be curve measures 26.86 dB and the psnr_fix_be curve 23.43 dB. The simulation results demonstrate that the improved FV component is more effective than the conventional one in reconstructing the corrupted video sequence.

REALISTIC NETWORK SIMULATIONS OF MPEG VIDEO TRANSMISSION

433

Fig. 4. Original FV vs. improved FV for best-effort delivered video.

Fig. 5. QoS delivery vs. best-effort delivery.

4.2 DiffServ Network

The second experiment is simulated in a DiffServ network in which I-frame packets are pre-marked at the application layer at the source with the lowest drop probability, P-frame packets with a medium drop probability, and B-frame packets with the highest drop probability. Routers R1 and R2 implement Weighted Random Early Detection (WRED) queue management: when the queue builds up and exceeds a given threshold, WRED starts to drop packets according to the specified drop probability parameters. Fig. 5 shows the results. The PSNR differences between psnr_noloss, i.e., no packet loss during transmission, and psnr_myfix_qos, the video transmitted with QoS delivery, are smaller than those between psnr_noloss and psnr_myfix_be, the video transmitted with best-effort delivery, especially in the interval from frame 260 to frame 360. In terms of average PSNR, the video delivered in the DiffServ network measures 28.64 dB. As expected, this outperforms the result obtained in the best-effort network, i.e., an average PSNR of 26.86 dB. Consequently, a DiffServ network provides a more suitable environment for video transmission. In addition, to illustrate how the difference in performance is perceived by an end user, the corresponding visual effects are shown in Fig. 6 by means of a YUV display tool, yuvviewer [32]. This kind of visual result for a real video stream transmitted over a simulated network is enabled by our new tool-set. The possibility of transmitting real video streams over a simulated network also enables the use of the PSNR quality metric, which takes the video content into account.

5. RELATIONSHIP BETWEEN PSNR AND THE FRACTION OF DECODABLE FRAMES


In this section, we investigate the relationship between two popular objective metrics: PSNR and the fraction of decodable frames. PSNR is a commonly accepted objective performance metric that takes the video content into account to assess video quality. However, obtaining the PSNR value requires pixel-by-pixel and frame-by-frame comparisons, which makes it a slow and laborious job.


Fig. 6. Visual comparison of the reconstructed 180th-184th frames: (a) QoS delivery; (b) best-effort delivery.

Table 4. QoS Mappings.


QoS Index | Green | Yellow | Red
    0     |   I   |   P    |  B
    1     |   I   |  P+B   |  -
    2     |   I   |   -    | P+B
    3     |  I+P  |   B    |  -
    4     |  I+P  |   -    |  B
    5     | I+P+B |   -    |  -
    6     |   -   |   I    | P+B
    7     |   -   |  I+P   |  B
    8     |   -   | I+P+B  |  -
    9     |   -   |   -    | I+P+B

If the metric of the fraction of decodable frames can adequately reflect the behavior of the PSNR metric while being less time-consuming, it can serve as an alternative way to objectively evaluate the delivery quality of transmitted video streams.

The fraction of decodable frames is the number of decodable frames over the total number of transmitted frames. A frame is considered decodable if at least a fraction of its data, called the decodable threshold, is received; moreover, a frame is decodable only if all of the frames upon which it depends are also decodable. Therefore, for instance, when the decodable threshold is 0.75, up to 25% of the data of a frame can be lost without causing the frame to be considered undecodable. A computational sketch of this metric is given after Fig. 7 below.

The simulation settings follow [10], whose goal was to study the delivered video quality for different QoS source mappings. The adopted QoS mapping table is shown in Table 4. For example, QoS 0 means that I-frame packets are pre-marked as green, P-frame packets as yellow, and B-frame packets as red, where the color markings red, yellow, and green represent increasing packet loss protection within the DiffServ network.

This paper investigates the relationship between the objective metrics PSNR and the fraction of decodable frames. The adopted network topology for this purpose is shown in Fig. 7. Three video sources were connected to a DiffServ network.
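The following is a minimal sketch of the decodable-frames computation defined above (the per-frame received fractions and the frame dependency structure are assumed to be given, e.g., derived from the trace files; names are ours):

```python
# Fraction of decodable frames: a frame is decodable if it received at least
# `threshold` of its data and all frames it depends on are decodable.

def decodable_fraction(received_fraction, deps, threshold=0.75):
    """received_fraction: frame number -> fraction of data received;
    deps: frame number -> list of reference frames it depends on.
    Frames are processed in decoding order (ascending frame number)."""
    decodable = {}
    for f in sorted(received_fraction):
        decodable[f] = (received_fraction[f] >= threshold and
                        all(decodable.get(d, False) for d in deps.get(f, [])))
    return sum(decodable.values()) / len(decodable)

# Example: a B frame loses 20% of its data; with threshold 0.75 it still
# counts as decodable, provided its reference I and P frames are intact.
print(decodable_fraction({0: 1.0, 1: 1.0, 2: 0.8}, {1: [0], 2: [0, 1]}))  # 1.0
```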


Fig. 7. Network topology for different QoS source mappings.

Fig. 8. PSNR for foreman video sequence.

Fig. 9. The fraction of decodable frames for the foreman video sequence with decodable thresholds of 1.0 and 0.75.

The three video sources transmitted the same video sequence to their respective destinations, with a random start time within an interval of 3 seconds. The tested video sequences covered three different kinds of video content: foreman, akiyo, and highway [33]. These real video traces have different properties in terms of motion, frame size, and quality. Each frame is fragmented into packets of 1,000 bytes before transmission. The three routers in the simulation scenario implement the WRED mechanism for active queue management. The WRED parameters comprise a minimum threshold, a maximum threshold, and a maximum drop probability, i.e., minth, maxth, and Pmax. The WRED parameters and the bottleneck bandwidth are set differently in the three simulation scenarios specified below.

In the first set of simulations, the tested video sequence is foreman. The WRED parameters {minth, maxth, Pmax} are specified as {10, 20, 0.1} for red packets, {20, 30, 0.05} for yellow packets, and {30, 40, 0.025} for green packets. The bottleneck bandwidth is set to 512 Kbps. The simulation results are shown in Figs. 8 and 9.
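To make the color-dependent dropping concrete, the sketch below applies the standard linear RED drop-probability ramp to the three color profiles of the first scenario (the linear ramp between minth and maxth is the textbook RED formulation; NS2's WRED implementation may differ in details such as queue averaging):

```python
# WRED drop probability per color profile, first-scenario parameters.

PROFILES = {                 # color: (min_th, max_th, p_max), from section 5
    "red":    (10, 20, 0.100),
    "yellow": (20, 30, 0.050),
    "green":  (30, 40, 0.025),
}

def drop_probability(avg_queue: float, color: str) -> float:
    min_th, max_th, p_max = PROFILES[color]
    if avg_queue < min_th:
        return 0.0                                  # no early drops
    if avg_queue >= max_th:
        return 1.0                                  # forced drop
    return p_max * (avg_queue - min_th) / (max_th - min_th)

# At an average queue of 25 packets, red packets already face early drops
# while green packets are still fully protected:
for color in PROFILES:
    print(color, drop_probability(25.0, color))
```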


Fig. 10. PSNR for akiyo video sequence.

Fig. 11. The fraction of decodable frames for the akiyo video sequence with decodable thresholds of 1.0 and 0.75.

The error bars show the 95% confidence interval. The behavior of the PSNR metric for the different QoS indexes matches that of the fraction of decodable frames, whether the decodable threshold is 1.0 or 0.75: when a QoS index has a higher PSNR value, the value of the fraction of decodable frames is also higher, and vice versa.

In the second set of simulations, the tested video sequence is the CIF-format akiyo sequence, which has 300 frames coded at 30 frames/sec, with a mean bit rate of 237 Kbps and a peak rate of 595 Kbps. The WRED parameters are specified as {20, 40, 0.1} for red packets, {40, 60, 0.05} for yellow packets, and {60, 80, 0.025} for green packets. The bottleneck bandwidth is set to 640 Kbps. The simulation results are shown in Figs. 10 and 11. The error bars show the 95% confidence interval. As with the foreman sequence, the behavior of the PSNR metric for the different QoS indexes matches that of the fraction of decodable frames when the decodable threshold is 0.75. However, the curve is somewhat inconsistent with the PSNR values for QoS indexes 5 and 8 when the threshold is 1.0: during the PSNR simulations, the improved FV component conceals some packet losses, whereas the metric is completely intolerant to losses when the threshold is 1.0. Therefore, a smaller decodable threshold matches PSNR better than a larger one.

In the third set of simulations, the tested video sequence is the CIF-format highway sequence, which has 2000 frames coded at 30 frames/sec, with a mean bit rate of 412 Kbps and a peak rate of 1116 Kbps. The WRED parameters are specified as {20, 40, 0.1} for red packets, {40, 60, 0.05} for yellow packets, and {60, 80, 0.025} for green packets. The bottleneck bandwidth is set to 1.024 Mbps. The simulation results are shown in Figs. 12 and 13. The error bars show the 95% confidence interval. Likewise, the behavior of the PSNR metric for the different QoS indexes matches that of the fraction of decodable frames when the decodable threshold is 0.75.

It is also interesting to take a closer look at the cost of this last simulation.


Fig. 12. PSNR for highway video sequence.

Fig. 13. The fraction of decodable frames for the highway video sequence with decodable thresholds of 1.0 and 0.75.

When computing the PSNR metric, it takes around 3 to 4 minutes to complete the tasks of simulating, evaluating traces, decoding, fixing, and performing the frame-by-frame PSNR comparison on a Pentium III 1 GHz computer equipped with 512 MB of RAM. In contrast, obtaining the value of the fraction of decodable frames takes less than 10 seconds. Similar results hold for the other two video sequences. Note that highway has only 2000 frames, i.e., around 1.11 minutes of video at 30 frames/sec; a longer test sequence would require proportionally more time for all of these tasks.

6. CONCLUSION AND FUTURE WORK


The contribution of this paper is twofold. First, we have presented the integration of EvalVid and NS2 to provide a novel, generalized, and comprehensive tool-set for evaluating the video quality performance of network designs in a simulated environment. The developed integration provides three new connecting simulation agents, namely MyTrafficTrace, MyUDP, and MyUDPSink. These agents enable EvalVid to link seamlessly with NS2 in such a way that researchers or practitioners have greater freedom to analyze their proposed network designs for video transmission without having to build an appropriate tool-set for video quality evaluation. Simulations of real video streams are enabled over a large set of network scenarios, including relatively large topologies, node mobility, different kinds of concurrent traffic, or any other functionality offered by the network simulator. Second, in an analysis enabled by the new tool-set, we have shown that the fraction of decodable frames reflects the behavior of the PSNR video QoS assessment metric with reasonable accuracy, while being less time-consuming by at least one order of magnitude. Therefore, when researchers or practitioners want to encode their own test video sequences, or to adopt well-known ones, in order to evaluate the delivered video quality in a simulated network environment, our proposed QoS assessment framework is a good choice.

Although this new evaluation framework is beneficial for networking and video-coding technicians in most cases, it still has some limitations. First, the current version supports only non-scalable video coding. Second, due to the video encoding modes and the agents we developed, the current framework is not suitable for video transmission over bi-directional channels; the video encoding parameters cannot be changed during simulation time. Researchers interested in rate-adaptive designs can refer to [34] for more information. In the future, we will incorporate more codecs into the framework and support scalable video coding and multiple description coding (MDC). A prototype of a multiple description coding evaluation framework is publicly available at [35]; researchers interested in multiple-path transport and load-balancing designs can try this prototype for preliminary evaluation.

REFERENCES
1. S. F. Chang and A. Vetro, "Video adaptation: concepts, technologies, and open issues," Proceedings of the IEEE, Vol. 93, 2005, pp. 148-158.
2. F. H. P. Fitzek and M. Reisslein, "MPEG-4 and H.263 video traces for network performance evaluation," IEEE Network, Vol. 15, 2001, pp. 40-54.
3. P. Seeling, M. Reisslein, and B. Kulapala, "Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial," IEEE Communications Surveys and Tutorials, Vol. 6, 2004, pp. 58-78.
4. Traffic trace from Mark Garrett's MPEG encoding of the Star Wars movie, http://www.research.att.com/~breslau/vint/trace.html.
5. Video traffic generator based on the TES (Transform Expand Sample) model of MPEG4 trace files, contributed by Ashraf Matrawy and Ioannis Lambadaris; it generates traffic with the same first- and second-order statistics as an original MPEG4 trace, http://www.sce.carleton.ca/~amatrawy/mpeg4.
6. O. Rose, "Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems," Report No. 101, Institute of Computer Science, University of Würzburg, Germany, 1995.
7. D. Saparilla, K. Ross, and M. Reisslein, "Periodic broadcasting with VBR-encoded video," in Proceedings of IEEE INFOCOM, 1999, pp. 464-471.
8. L. Tionardi and F. Hartanto, "The use of cumulative inter-frame jitter for adapting video transmission rate," in Proceedings of the Conference on Convergent Technologies for Asia-Pacific Region, Vol. 1, 2003, pp. 364-368.
9. A. Ziviani, B. E. Wolfinger, J. F. Rezende, O. C. M. B. Duarte, and S. Fdida, "Joint adoption of QoS schemes for MPEG streams," Multimedia Tools and Applications, Vol. 26, 2005, pp. 59-80.
10. J. M. H. Magalhaes and P. R. Guardieiro, "A new QoS mapping for streamed MPEG video over a DiffServ domain," in Proceedings of the IEEE International Conference on Communications, Circuits and Systems and West Sino Expositions, 2002, pp. 675-679.
11. M. F. Alam, M. Atiquzzaman, and M. A. Karim, "Traffic shaping for MPEG video transmission over the next generation internet," Computer Communications, Vol. 23, 2000, pp. 1336-1348.
12. N. E. Nasser and M. Al-Abdulmunem, "MPEG traffic over diffserv assured service," in Proceedings of the Asia-Pacific Conference on Communication, 2003, pp. 494-498.
13. J. Takahashi, H. Tode, and K. Murakami, "QoS enhancement methods for MPEG video transmission on the Internet," IEICE Transactions on Communications, Vol. E85-B, 2002, pp. 1020-1030.
14. F. A. Shaikh, S. McClellan, M. Singh, and S. K. Chakravarthy, "End-to-end testing of IP QoS mechanisms," IEEE Computer Magazine, Vol. 35, 2002, pp. 80-87.
15. NS related mailing lists, http://www.isi.edu/nsnam/htdig/search.html.
16. J. Klaue, B. Rathke, and A. Wolisz, "EvalVid - A framework for video transmission and quality evaluation," in Proceedings of the International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, 2003, pp. 255-272.
17. NS, http://www.isi.edu/nsnam/ns/.
18. S. Olsson, M. Stroppiana, and J. Baina, "Objective methods for assessment of video quality: state of the art," IEEE Transactions on Broadcasting, Vol. 43, 1997, pp. 487-495.
19. C. H. Ke, C. K. Shieh, W. S. Hwang, and A. Ziviani, "A two-markers system for improved MPEG video delivery in a DiffServ network," IEEE Communications Letters, Vol. 9, 2005, pp. 381-383.
20. J. Naoum-Sawaya, B. Ghaddar, S. Khawam, H. Safa, H. Artail, and Z. Dawy, "Adaptive approach for QoS support in IEEE 802.11e wireless LAN," in Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, 2005, pp. 167-173.
21. H. Huang, J. Ou, and D. Zhang, "Efficient multimedia transmission in mobile network by using PR-SCTP," in Proceedings of the IASTED International Conference on Communications and Computer Networks, 2005, pp. 213-217.
22. http://hpds.ee.ncku.edu.tw/~smallko/ns2/Evalvid_in_NS2.htm.
23. NCTU codec, http://megaera.ee.nctu.edu.tw/mpeg.
24. ffmpeg, http://ffmpeg.sourceforge.net/index.php.
25. Xvid, http://www.xvid.org/.
26. tcp-dump, http://www.tcpdump.org.
27. win-dump, http://windump.polito.it.
28. Y. Wang and Q. F. Zhu, "Error control and concealment for video communication: a review," Proceedings of the IEEE, Vol. 86, 1998, pp. 974-997.
29. J. R. Ohm, Bildsignalverarbeitung fuer multimedia-systeme, Skript, 1999.
30. B. Carpenter and K. Nichols, "Differentiated services in the internet," Proceedings of the IEEE, Vol. 90, 2002, pp. 1479-1494.
31. J. Shin, J. Kim, and C. C. J. Kuo, "Quality of service mapping mechanism for packet video in differentiated services network," IEEE Transactions on Multimedia, Vol. 3, 2001, pp. 219-231.
32. yuvviewer, http://eeweb.poly.edu/~yao/VideobookSampleData/video/application/YUVviewer.exe.
33. YUV video sequences (CIF), http://www.tkn.tu-berlin.de/research/evalvid/cif.html.
34. Evalvid-RA, http://www.item.ntnu.no/~arnelie/Evalvid-RA.htm.
35. Multiple description coding evaluation framework, http://hpds.ee.ncku.edu.tw/~smallko/ns2/MDC.htm.


Chih-Heng Ke received his B.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University in 1999 and 2007, respectively. He is an assistant professor in the Department of Computer Science and Information Engineering, National Kinmen Institute of Technology, Kinmen, Taiwan. His current research interests include multimedia communications, wireless networks, and QoS networking.

Ce-Kuen Shieh is currently a professor in the Department of Electrical Engineering, National Cheng Kung University. He received his Ph.D., M.S., and B.S. degrees from the Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan. His current research areas include distributed and parallel processing systems, computer networking, and operating systems.

Wen-Shyang Hwang received his B.S., M.S., and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Taiwan, in 1984, 1990, and 1996, respectively. He is a professor of Electrical Engineering at the National Kaohsiung University of Applied Sciences, Taiwan. His current research focus includes multi-channel WDM networks, performance evaluation, QoS, RSVP, and WWW database applications.

Artur Ziviani received a B.Sc. in Electronics Engineering in 1998 and an M.Sc. in Electrical Engineering in 1999, both from the Federal University of Rio de Janeiro (UFRJ), Brazil. In 2003, he received a Ph.D. in Computer Science from the University of Paris 6, France, where he was also a lecturer from 2003 to 2004. Since 2004, he has been with the National Laboratory for Scientific Computing (LNCC), Brazil. His research interests include QoS, wireless computing, Internet measurements, and the application of networking technologies in telemedicine.
