Grid adaptive interpolation filters

Chee Sun Won


A general framework for image interpolation on a uniformly spaced image grid is presented. The proposed formulation is suitable for representing fractional (sub-pixel) as well as integer pixel interpolations. Also, by fitting an interpolation kernel to the grid formulation, finite impulse response filter coefficients can be readily determined for a given sampling interval and filter length. As an example, a four-tap lowpass filter is derived by fitting the Lagrange interpolation kernel to a row (or column) expansion by 2 on integer pixel locations.

The 1D FIR interpolation formula is as follows:

$$y(kT + nD) = \sum_{m=-M+1}^{M} y(kT + mT)\,P(mT - nD) \tag{2}$$

Introduction: Image interpolation is a popular technique for various image processing problems, including colour restoration for missing pixels [1] and motion compensation for video compression [2, 3]. Image interpolation is performed on integer pixel locations, as in [1], or on fractional sub-pixels to increase the accuracy of motion estimation, as in [2, 3]. Since finite impulse response (FIR) filter coefficients can be determined by sampling (or approximating) a desirable sampling kernel such as the continuous-time sinc function, it would be useful to have a formula that adjusts the sampling kernel to the layout and scale of the image grid. Motivated by this requirement, this Letter: (i) provides a general form of uniform interpolation for both integer and sub-pixel locations in terms of the sampling interval and filter length, and (ii) derives a grid adaptive four-tap filter from the proposed formulation.

General formula for uniform interpolations: As shown in Fig. 1a, the original image has a uniform grid with $K_v \times K_h$ pixels. Here, the vertical and horizontal distances between the two nearest known (existing) pixels are $T_v$ and $T_h$, respectively. Our goal is to insert and interpolate $N_v$ and $N_h$ pixels within the intervals $T_v$ and $T_h$, respectively. After the interpolation we will have a total of $[K_v(1 + N_v) - N_v] \times [K_h(1 + N_h) - N_h]$ pixels, and the sampling intervals for the horizontal and vertical directions will change to $D_h = T_h/(N_h + 1)$ and $D_v = T_v/(N_v + 1)$, respectively. For example, Fig. 1b shows the case of $N_v = N_h = 1$ with $D_v = T_v/2$ and $D_h = T_h/2$.
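The bookkeeping for the expanded grid can be sketched in a few lines. This is a minimal illustration only: the function name `interpolated_grid` and the example sizes are hypothetical, and the known-pixel spacings are assumed to be integers so that exact fractions can be used.

```python
from fractions import Fraction

def interpolated_grid(K_v, K_h, N_v, N_h, T_v=1, T_h=1):
    """Return (rows, cols, D_v, D_h) after inserting N_v / N_h pixels
    between neighbouring known pixels of a K_v x K_h grid.

    T_v and T_h are assumed to be integers (illustrative choice).
    """
    # K known rows plus N inserted rows in each of the K - 1 gaps.
    rows = K_v * (1 + N_v) - N_v
    cols = K_h * (1 + N_h) - N_h
    # New sampling intervals D = T / (N + 1).
    D_v = Fraction(T_v, N_v + 1)
    D_h = Fraction(T_h, N_h + 1)
    return rows, cols, D_v, D_h

rows, cols, D_v, D_h = interpolated_grid(K_v=4, K_h=5, N_v=1, N_h=1)
print(rows, cols, D_v, D_h)  # 7 9 1/2 1/2
```

With $N_v = N_h = 1$ each gap receives one new pixel, so a 4 x 5 grid grows to 7 x 9 and both intervals are halved.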

In (2), $n$ denotes the pixel location to be interpolated between two reference (known) samples, $y(kT)$ and $y(kT + T)$. So we need to evaluate (2) for all $n = 1, \ldots, N$. The merit of expression (2) is that it covers both fractional and integer pixel interpolations. That is, pixels to be interpolated can be located either in fractional pixel positions (i.e. $D < 1$) or in integer pixel locations. In H.264/AVC a six-tap filter (i.e. $M = 3$) is used for the interpolation of half pixels in motion compensation [2]. This corresponds to the form of (2) with $N = 1$, and its symmetric filter coefficients are $P(-D) = P(T - D) = 20/32$, $P(-T - D) = P(2T - D) = -5/32$ and $P(-2T - D) = P(3T - D) = 1/32$. Similarly, the four-tap filter of AVS (Audio Video coding Standard) [3] has the filter coefficients $P(-D) = P(T - D) = 5/8$ and $P(-T - D) = P(2T - D) = -1/8$.

Grid adaptive four-tap filter from Lagrange interpolation kernel: From a mathematical point of view, Lagrange interpolation generalises linear interpolation by approximating the sinc function [4]. The Lagrange interpolation kernel is an $L$th-order polynomial function determined by $L + 1$ sample values as follows:

$$P(t + mT - nD) = \frac{Q_{mT-nD}(t)}{Q_{mT-nD}(mT - nD)} \tag{3}$$

where

$$Q_{mT-nD}(t) = \frac{\prod_{k} \bigl(t - (kT - nD)\bigr)}{t - (mT - nD)},$$

that is, the product over all sample offsets except the $m$th.
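The fit of (3) to a given grid can be sketched numerically. The helper below is an illustration under stated assumptions, not the author's implementation: the name `lagrange_taps` is hypothetical, the product in (3) is assumed to run over the $2M$ sample offsets $mT - nD$ for $m = -M+1, \ldots, M$, and $T$ is assumed to be an integer so that exact fractions can be used.

```python
from fractions import Fraction

def lagrange_taps(T, N, M, n):
    """Weights P(mT - nD) for m = -M+1..M, evaluated at t = 0,
    with D = T / (N + 1). T is assumed to be an integer."""
    D = Fraction(T, N + 1)
    # Offsets of the 2M known samples relative to the pixel being interpolated.
    nodes = [m * T - n * D for m in range(-M + 1, M + 1)]
    taps = []
    for i, x_m in enumerate(nodes):
        num = Fraction(1)
        den = Fraction(1)
        for j, x_k in enumerate(nodes):
            if j != i:
                num *= (0 - x_k)    # Q evaluated at t = 0
                den *= (x_m - x_k)  # Q evaluated at its own node
        taps.append(num / den)
    return taps

# Row/column doubling: T = 2, N = 1 (so D = 1), M = 2, n = 1.
print([str(c) for c in lagrange_taps(T=2, N=1, M=2, n=1)])
# ['-1/16', '9/16', '9/16', '-1/16']
```

Because the weights are Lagrange basis values, they sum to 1 for any choice of `T`, `N`, `M` and `n`, so constant regions are preserved.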

For any sampling grid layout and scale, we can calculate the filter coefficients by fitting (3) to the grid. For example, by setting the parameters in (2) to $T = 2$, $N = 1$, $M = 2$ and $D = 1$, we have

$$\begin{aligned}
y(2k + 1) &= \sum_{m=-1}^{2} y(2k + 2m)\,P(2m - 1) \\
&= y(2k - 2)P(-3) + y(2k)P(-1) + y(2k + 2)P(1) + y(2k + 4)P(3)
\end{aligned} \tag{4}$$
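Expression (4) can be applied to one image row as sketched below, using the tap values $(-1, 9, 9, -1)/16$ reported for the proposed filter. The function name `double_row` is hypothetical, and the border handling (clamping indices to the valid range) is an assumption, since the source does not specify one.

```python
def double_row(row):
    """Insert one interpolated sample between neighbouring known samples
    using the taps (-1, 9, 9, -1)/16.

    Index clamping at the borders is an assumed convention.
    """
    K = len(row)

    def known(i):
        # Clamp to the first/last known sample outside the row.
        return row[min(max(i, 0), K - 1)]

    out = []
    for k in range(K - 1):
        out.append(row[k])  # known sample y(2k)
        # y(2k+1) from y(2k-2), y(2k), y(2k+2), y(2k+4) in output coordinates.
        mid = (-known(k - 1) + 9 * known(k) + 9 * known(k + 1) - known(k + 2)) / 16
        out.append(mid)
    out.append(row[-1])
    return out

cubic = [x ** 3 for x in range(6)]   # samples of x^3 at x = 0..5
print(double_row(cubic)[5])          # value at x = 2.5 -> 15.625
```

Away from the borders a four-point Lagrange fit reproduces cubic polynomials exactly, which is why the midpoint between $2^3$ and $3^3$ comes out as $2.5^3$.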

Here, the filter coefficients $P(-3)$, $P(-1)$, $P(1)$ and $P(3)$ are samples of (3) when the filter function $P$ is centred on the pixel being interpolated (i.e. $t = 0$). This leads us to a four-tap filter by fitting the Lagrange interpolation kernel to our filter formulation with $T = 2$ and $D = 1$, and we have the coefficients as follows:

$$P(3) = \frac{Q_3(0)}{Q_3(3)} = -\frac{1}{16} = P(-3), \qquad P(1) = \frac{Q_1(0)}{Q_1(1)} = \frac{9}{16} = P(-1) \tag{5}$$
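As a check on (5), the intermediate products can be written out. This is a sketch that assumes the product in (3) runs over the four sample offsets $-3, -1, 1, 3$ implied by $T = 2$, $D = 1$ and $M = 2$, with the factor belonging to the node itself left out:

$$\begin{aligned}
Q_3(t) &= (t+3)(t+1)(t-1), & Q_3(0) &= -3, & Q_3(3) &= 48, & P(3) &= \tfrac{-3}{48} = -\tfrac{1}{16},\\
Q_1(t) &= (t+3)(t+1)(t-3), & Q_1(0) &= -9, & Q_1(1) &= -16, & P(1) &= \tfrac{-9}{-16} = \tfrac{9}{16}.
\end{aligned}$$

The four taps sum to $2(-1/16) + 2(9/16) = 1$, so a constant signal is reproduced exactly.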

Fig. 1 Uniformly spaced interpolation


a Before interpolation with known (bold square) pixels
b Interpolation for missing (dotted square) pixels

The full set of interpolation parameters is $\{K_v, K_h, N_v, N_h, D_v, D_h, T_v, T_h\}$. Having defined all the interpolation parameters, a general formula for interpolating the missing value $y(k_v T_v + n_v D_v, k_h T_h + n_h D_h)$ with a 2D FIR filter can be written as follows:

$$y(k_v T_v + n_v D_v,\, k_h T_h + n_h D_h) = \sum_{m_v=-M_v+1}^{M_v} \sum_{m_h=-M_h+1}^{M_h} y\bigl((k_v + m_v)T_v,\, (k_h + m_h)T_h\bigr)\, P(m_v T_v - n_v D_v,\, m_h T_h - n_h D_h) \tag{1}$$

where $0 \le k_v \le K_v - 1$, $0 \le k_h \le K_h - 1$, $1 \le n_v \le N_v$, $1 \le n_h \le N_h$, and the support area of the 2D filter covers $2M_v \times 2M_h$ pixels. Owing to the computational complexity, a separable filter is frequently adopted such that $P(m_v T_v - n_v D_v, m_h T_h - n_h D_h) = P_v(m_v T_v - n_v D_v)\,P_h(m_h T_h - n_h D_h)$. Then (1) is equivalent to applying a 1D FIR filter $P_v$ with $2M_v$ taps in the vertical direction and then a 1D FIR filter $P_h$ with $2M_h$ taps in the horizontal direction. In this case, the expression in (1) can be simplified to a 1D FIR filter for each image row and column. So, for notational convenience, if we drop the subscripts $v$ and $h$, we then have the 1D FIR interpolation formula given in (2).
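The separable form can be sketched as two 1D passes, vertical first and then horizontal, as described above. Everything here is illustrative: the names `expand_1d` and `expand_2d` are hypothetical, the default taps are the $(-1, 9, 9, -1)/16$ values reported for the proposed filter with $N = 1$, and clamping at the image border is an assumption.

```python
def expand_1d(samples, taps=(-1, 9, 9, -1), scale=16):
    """Insert one interpolated value between neighbouring samples (N = 1)."""
    K = len(samples)
    M = len(taps) // 2
    out = []
    for k in range(K - 1):
        out.append(samples[k])
        acc = 0
        for i, c in enumerate(taps):
            m = i - M + 1                     # m runs over -M+1 .. M
            idx = min(max(k + m, 0), K - 1)   # clamp at the border (assumption)
            acc += c * samples[idx]
        out.append(acc / scale)
    out.append(samples[-1])
    return out

def expand_2d(image):
    """Separable expansion: filter every column, then every row."""
    # Vertical pass: zip(*image) yields the columns of the image.
    columns = [expand_1d(list(col)) for col in zip(*image)]
    vertical = [list(r) for r in zip(*columns)]
    # Horizontal pass on the vertically expanded rows.
    return [expand_1d(r) for r in vertical]

image = [[8, 8, 8, 8] for _ in range(3)]   # 3 x 4 constant test image
result = expand_2d(image)
print(len(result), len(result[0]))          # 5 7
```

Each pass costs $2M$ multiplications per new pixel, instead of the $2M_v \times 2M_h$ products needed by the non-separable sum in (1).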

Experiments: We compared the performance of the linear averaging filter (denoted as Lin-2 or linear-2), the H.264/AVC six-tap filter [2] (denoted as H.264-6) with coefficients (1, -5, 20, 20, -5, 1)/32, the AVS four-tap filter [3] (denoted as AVS-4) with coefficients (-1, 5, 5, -1)/8 and the proposed four-tap filter (denoted as proposed-4 or prop-4) with coefficients (-1, 9, 9, -1)/16. First, the magnitude responses of the filters are shown in Fig. 2. In the pass zone, linear-2 has the largest attenuation, while AVS-4 has the largest amplification, both of which may cause excessive smoothing and distortions. On the other hand, the H.264-6 and proposed-4 filters have relatively flat responses in the pass zone. As for the transition band characteristics, H.264-6 and AVS-4 have the most rapid transitions from the pass zone to the stop zone, and their pass zones extend beyond the normalised frequency of 0.5 (see Fig. 2). This extended pass zone can cause an aliasing artefact.

Peak signal-to-noise ratios (PSNRs) are compared for well-known test images in Table 1. All test images are expanded with the same sampling layout and scale as the proposed four-tap filter (i.e. doubling the number of rows and columns with $T = 2$, $N = 1$, $M = 2$ and $D = 1$). As shown in Table 1, although its filter length is shorter than that of H.264-6, the proposed four-tap filter yields the highest average PSNR. This is because our filter coefficients (prop-4) are adaptively determined such that the sampling layout and scale in the experiments of Table 1 match exactly those of the Lagrange interpolation kernel.
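A magnitude response comparison of this kind can be sketched as follows. Two points are assumptions rather than facts from the source: the response is evaluated directly on the listed half-sample taps, and frequency is normalised so that 1 corresponds to the Nyquist frequency. The curves may therefore differ in scale from those in Fig. 2.

```python
import cmath
import math

# Tap values as listed in the text.
FILTERS = {
    "linear-2":   [1 / 2, 1 / 2],
    "H.264-6":    [c / 32 for c in (1, -5, 20, 20, -5, 1)],
    "AVS-4":      [c / 8 for c in (-1, 5, 5, -1)],
    "proposed-4": [c / 16 for c in (-1, 9, 9, -1)],
}

def magnitude(taps, f):
    """|H(e^{j*pi*f})| for normalised frequency f in [0, 1] (1 = Nyquist)."""
    w = math.pi * f
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps)))

for name, taps in FILTERS.items():
    print(name, [round(magnitude(taps, f), 3) for f in (0.0, 0.25, 0.5)])
```

All four filters have unit gain at zero frequency because their taps sum to 1; the differences appear as the frequency increases.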

ELECTRONICS LETTERS 31st January 2013 Vol. 49 No. 3

[Fig. 2 plot: magnitude response (vertical axis, 0 to 1.4) against normalised frequency (horizontal axis, 0 to 1.0) for linear-2, H.264/AVC-6, AVS-4 and proposed-4]

the filter coefficients adaptively for a given image sampling layout and scale. To demonstrate the power of our grid adaptive filter, four-tap FIR filter coefficients are derived by fitting the Lagrange interpolation kernel to the sampling scale of doubling the number of rows and columns at integer pixel locations. In our experiments the grid adaptive four-tap filter yields the highest average PSNR values (even higher than the six-tap filter).

Acknowledgments: This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (20110025770).

© The Institution of Engineering and Technology 2013
17 July 2012
doi: 10.1049/el.2012.2481
One or more of the Figures in this Letter are available in colour online.

Chee Sun Won (Department of Electronic and Electrical Engineering, Dongguk University-Seoul, Seoul, 100-715, Republic of Korea)
E-mail: cswon@dongguk.edu

References
1 Su, C.-Y., Chang, M.-K., and Hong, C.-M.: 'Optimal integer FIR filtering for colour interpolation', Electron. Lett., 2010, 46, (20), pp. 1376-1377
2 Wiegand, T., Sullivan, G.J., Bjøntegaard, G., and Luthra, A.: 'Overview of the H.264/AVC video coding standard', IEEE Trans. Circuits Syst. Video Technol., 2003, 13, pp. 560-576
3 Wang, R., Huang, C., Li, J., and Shen, Y.: 'Sub-pixel motion compensation interpolation filter in AVS'. Proc. of IEEE ICME, Taipei, Taiwan, 2004, pp. 93-96
4 Lehmann, T.M., Gönner, C., and Spitzer, K.: 'Survey: interpolation methods in medical image processing', IEEE Trans. Med. Imag., 1999, 18, (11), pp. 1049-1075

Fig. 2 Comparison of magnitude responses

Table 1: PSNR comparisons in dB (first 12 images are from http://sipi.usc.edu/DATABASE/ and remaining 15 images are from http://www.imagecompression.info/)

File name           Lin-2    H.264-6  AVS-4    Prop-4
airplane            32.62    33.37    33.26    33.46
akiyal #1           34.15    34.24    34.32    34.62
baboon              23.16    22.39    22.45    22.93
barbara             26.19    25.01    25.17    25.85
foreman             33.22    33.25    33.25    33.6
fruit_mixed         31.4     31.41    31.42    31.76
goldhill            31.65    31.11    31.15    31.62
house               30.44    30.53    30.54    30.83
lena                33.31    33.72    33.67    33.93
man                 31.38    31.45    31.44    31.72
peppers             32.6     32.45    32.41    32.79
tree                27.56    27.67    27.67    27.97
artificial          36.32    36.92    36.88    37.02
big_building        35.08    36.82    36.62    36.37
big_tree            38.33    39.16    39.01    39.37
bridge              35.96    36.65    36.53    36.84
cathedral           37.92    38.61    38.5     38.85
deer                34.14    33.11    33.27    33.84
fireworks           36.56    39.79    39.45    38.79
flower_foveon       46.1     47.48    47.33    47.58
hdr                 44.36    45.79    45.63    45.82
leaves_iso_200      32.51    34.71    34.53    34.19
leaves_iso_1600     31.89    33.42    33.33    33.24
nightshot_iso_100   41.58    44.23    43.89    43.86
nightshot_iso_1600  37.12    37.05    37.08    37.42
spider_web          47.26    51.08    50.66    50.57
zone_plate          8.36     7.34     7.44     7.91
AVERAGE             33.75    34.399   34.33    34.55
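For reference, a PSNR value of the kind listed in Table 1 can be computed as sketched below. The source does not state the peak value, which pixels enter the error, or how the reference images were prepared, so the 8-bit peak of 255 and the use of every pixel are assumptions, and the function name `psnr` is hypothetical.

```python
import math

def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized 2D lists of pixel values.

    peak=255 assumes 8-bit images; all pixels contribute to the error.
    """
    sq_err = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            sq_err += (r - t) ** 2
            count += 1
    mse = sq_err / count
    if mse == 0:
        return math.inf   # identical images
    return 10 * math.log10(peak ** 2 / mse)

ref = [[10, 20], [30, 40]]
off_by_one = [[11, 21], [31, 41]]
print(round(psnr(ref, off_by_one), 2))  # 48.13
```

An error of one grey level at every pixel gives a mean squared error of 1 and hence $20\log_{10} 255 \approx 48.13$ dB, which gives a sense of scale for the table entries.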

Conclusions: We have established a general formula for uniform interpolation in this Letter. The formulation is useful to determine
