
Fractal image compression using quadtree partitioning
K. M. S. Soyjaudah and I. Jahmeerbacus Faculty of Engineering, University of Mauritius, Reduit, Mauritius E-mail: s.soyjaudah@uom.ac.mu
Abstract
This paper describes the design and implementation of a software-based fractal image compression technique using quadtree partitioning, in which the student produces user-friendly Windows 9x/NT software that allows the compression and decompression of grey-scale images. The work involves a human-computer interface to carry out the relatively large number of searches for affine patterns that result in an algorithmic representation of the image.
Keywords: fractals; human-computer interface; image compression; quadtree partitioning
International Journal of Electrical Engineering Education 39/1

When a digital image is transmitted through a communication medium such as a cable or a simple telephone line, the cost of the transmission depends on the size of the file and is a very important factor. To reduce the transmission cost the file has to be compressed. Fractal image compression is a class of methods that allows images to be stored in computers in much less memory than standard image storage schemes require [1, 2]. In some compression ranges, fractal compression schemes are at least as good as, if not better than, the current Joint Photographic Experts Group (JPEG) standard, a compression scheme that has been the subject of extensive research efforts. Furthermore, fractal compression schemes are computationally simple to decode. Software decoding of video as well as still images may become their major future application. However, it should be pointed out that the main thrust of image and video compression is discrete cosine transform (DCT) and wavelet based.

The fractal compression scheme deals with complicated-looking sets called fractals that arise out of simple algorithms. A very simple illustration of fractal geometry is a special type of photocopying machine that simultaneously reduces the image by half and reproduces it three times, as in Fig. 1 [3]. Figure 2 shows several iterations of this process. When the output of this machine is fed back as input, it is observed that all the copies seem to be converging to the same final image, Fig. 2(c). The process does not change this final image. The latter is formed out of the three reduced copies of itself, hence it has detail at every scale: it is a fractal.

Fig. 1 Copy machine that makes three reduced copies of the input image.

Fig. 2 The first three copies generated on the copying machine.

A characteristic of fractals that distinguishes them from Euclidean shapes is that fractal shapes are self-similar and independent of scale. They possess no characteristic size. They are constructed using a recursive algorithm. Hence a mathematical model of an image is needed. The Lena image, on the other hand, is a typical image of a human face [4]. Hence, it does not contain the type of self-similarity that can be found in fractals. The image does not appear to contain affine transformations described by the equation:

$$ w_i \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_i & b_i \\ c_i & d_i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix} \qquad (1) $$
Such a transformation can skew, stretch, rotate, scale and translate an input image. Figure 3 shows a very illustrative example of an affine transformation. The figure shows how a particular transformation flips or rotates the squares. The attractor is commonly referred to as the Barnsley fern; it arises from four transformations, one of which is squashed flat to yield the stem of the fern. The most important concept, however, is that in the position of the image of the original square there is a reduced copy of the whole image.
Fig. 3 Transformation, its attractor, and a zoom on the attractor.
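The copy-machine idea and eqn (1) can be illustrated with a short sketch. It is written in Python purely for illustration (the project software itself was written in C++). The helper names `affine`, `COPY_MACHINE` and `iterate` are hypothetical, and the three translation offsets are an assumed triangular arrangement of the reduced copies rather than values taken from the paper.

```python
def affine(a, b, c, d, e, f):
    """Return the map w(x, y) = (a*x + b*y + e, c*x + d*y + f) of eqn (1)."""
    def w(x, y):
        return (a * x + b * y + e, c * x + d * y + f)
    return w

# Three maps that each halve the image; the offsets are illustrative assumptions.
COPY_MACHINE = [
    affine(0.5, 0.0, 0.0, 0.5, 0.0, 0.0),    # copy at bottom left
    affine(0.5, 0.0, 0.0, 0.5, 0.5, 0.0),    # copy at bottom right
    affine(0.5, 0.0, 0.0, 0.5, 0.25, 0.5),   # copy at top centre
]

def iterate(points, maps, n):
    """Feed the output of the copy machine back as its input n times."""
    for _ in range(n):
        points = {w(x, y) for (x, y) in points for w in maps}
    return points

if __name__ == "__main__":
    pts = iterate({(0.0, 0.0)}, COPY_MACHINE, 3)
    print(len(pts), "points after three passes")
```

Starting from any initial set in the unit square, repeated passes stay inside it and approach the same limit set, which is the behaviour described for Fig. 2.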
Thus, the image is made up of reduced copies of itself, which implies that it has detail at every scale. That is, such images are fractals. The image of a human face is formed from copies of properly transformed parts of itself. These transformed parts do not fit together to form an exact copy of the original image. That is, the image we encode is an approximation to the original image. Hence it is necessary to develop an algorithmic search of the image to locate the self-similar regions. In addition, two adjustments, namely a contrast and a brightness adjustment, as well as a mask that selects for each copy the part of the original to be copied, are required.

A top-down systems engineering approach is well suited to the design and implementation of fractal image compression/decompression because this technique works well for such complex problems. The student is given the specifications of the project and has to understand the problem. He or she then uses the necessary strategies to develop the solution, and plans and produces the design needed to implement and test the solution. The student subsequently embarks on the software design and implementation and produces effective and efficient solutions [5, 6]. Realisation of the complete system requires knowledge in various fields of electronic engineering such as communication theory, information theory, computer programming and interfacing, as well as familiarity with various software packages. Various modules during the first three years of the four-year B.Eng (Hons) Electronic and Communication Engineering course cover this material thoroughly. Hence by the time students reach the final year they have the necessary prerequisites to embark on such a project.

Quadtree encoding
In quadtree encoding a square in the image is split up into four equal-sized subsquares recursively until the squares are small enough to be covered within some specified tolerance.
Generally small squares can be covered more easily than large ones because contiguous pixels in an image tend to be highly correlated. A quadtree partition is a representation of an image as a tree in which each node, corresponding to a square portion of the image, contains four subnodes, corresponding to the four quadrants of the square. The root of the tree is then pictured as the initial image as illustrated in Fig. 4.
Fig. 4 A quadtree partition is a representation of an image.
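A minimal sketch of such a recursive partition is given here, in Python for illustration only. The function name `quadtree` and the callback `is_covered` are hypothetical; `is_covered` stands in for whatever test decides that a square can be covered within the tolerance, which in the scheme described in this paper is a domain search.

```python
def quadtree(x, y, size, depth, min_depth, max_depth, is_covered):
    """Return the leaf squares (x, y, size) of a quadtree partition.

    A square becomes a leaf once the minimum depth has been reached and
    either the maximum depth is hit or is_covered(x, y, size) is true.
    """
    at_limit = depth >= max_depth or size <= 1
    if depth >= min_depth and (at_limit or is_covered(x, y, size)):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # the four quadrants of the square
        for dx in (0, half):
            leaves += quadtree(x + dx, y + dy, half, depth + 1,
                               min_depth, max_depth, is_covered)
    return leaves

if __name__ == "__main__":
    # 256 x 256 image, minimum depth 4, maximum depth 6.
    easy = quadtree(0, 0, 256, 0, 4, 6, lambda x, y, s: True)
    hard = quadtree(0, 0, 256, 0, 4, 6, lambda x, y, s: False)
    print(len(easy), easy[0][2])   # leaves stop at the minimum depth
    print(len(hard), hard[0][2])   # leaves forced down to the maximum depth
```

For a 256 x 256 image with minimum depth 4 and maximum depth 6, the leaves produced are squares of side between 16 and 4 pixels.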
First, an initial number of quadtree partitions is made based on a minimum tree depth. The squares at the nodes are compared with domains from the domain library (or domain pool), which are twice the range size. The pixels in the domain are averaged in groups of four so that the domain is reduced to the size of the range. The affine transformation of the pixel values that minimizes the r.m.s. difference between the transformed domain pixel values and the range pixel values is determined using eqn (2):

$$ R = \frac{1}{n}\left[ \sum_{i=1}^{n} b_i^2 + s\left( s\sum_{i=1}^{n} a_i^2 - 2\sum_{i=1}^{n} a_i b_i + 2o\sum_{i=1}^{n} a_i \right) + o\left( n o - 2\sum_{i=1}^{n} b_i \right) \right] \qquad (2) $$

Here $a_i$ are the $n$ averaged domain pixel values and $b_i$ are the corresponding range pixel values.
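Because the error in eqn (2) is quadratic in s and o, the minimizing values can be written in closed form as an ordinary least-squares fit of the range pixels against the domain pixels. The sketch below, in Python for illustration, assumes this closed-form solution; the function name `fit_contrast_brightness` is hypothetical, and no limit on the scaling factor is applied here.

```python
import math

def fit_contrast_brightness(a, b):
    """Least-squares contrast s and brightness o mapping a onto b.

    a: averaged domain pixel values, b: range pixel values (same length).
    Returns (s, o, R) with R evaluated term by term as in eqn (2).
    """
    n = len(a)
    sum_a, sum_b = sum(a), sum(b)
    sum_aa = sum(x * x for x in a)
    sum_bb = sum(y * y for y in b)
    sum_ab = sum(x * y for x, y in zip(a, b))
    denom = n * sum_aa - sum_a * sum_a
    if denom == 0:                      # flat domain: only brightness helps
        s, o = 0.0, sum_b / n
    else:
        s = (n * sum_ab - sum_a * sum_b) / denom
        o = (sum_b - s * sum_a) / n
    r = (sum_bb + s * (s * sum_aa - 2 * sum_ab + 2 * o * sum_a)
         + o * (n * o - 2 * sum_b)) / n
    return s, o, max(r, 0.0)            # guard against tiny negative round-off

if __name__ == "__main__":
    s, o, r = fit_contrast_brightness([1, 2, 3, 4], [3, 5, 7, 9])
    print(s, o, math.sqrt(r))           # exact fit b = 2a + 1
```

The square root of the returned value is the quantity that would be compared with the tolerance threshold.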
In eqn (2), s controls the contrast and o controls the brightness of the transformation, while R is the error being minimized (as written, the mean squared difference, whose square root is the r.m.s. error). There are three major types of domain library: D1, D2 and D3. The domains are selected as subsquares of the image whose upper-left corners are positioned on a lattice determined by a parameter l. The first type (D1) has a roughly equal number of domains of each size. The second (D2) has more small domains than large domains. Finally, the third type (D3) has more large domains and fewer small domains, the idea being that it is more important to find a good domain-range fit for larger ranges, since the encoding will then require fewer transformations. The lattice spacing is selected as follows:

D1 Has a lattice with a fixed spacing equal to l.
D2 Has a lattice whose spacing is the domain size divided by l. Thus, there will be more small domains than large.
D3 Has a lattice as above but with the opposite spacing-size relationship. The largest domains have a lattice corresponding to the smallest domain size divided by l, and vice versa.
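The three spacing rules can be sketched as follows, in Python for illustration. The function names are hypothetical. For D3 the text fixes only the two extreme sizes, so the mirrored size `smallest * largest // size` used for intermediate sizes is an assumption that holds when the domain sizes are powers of two.

```python
def lattice_step(pool, size, l, smallest, largest):
    """Lattice spacing for domains of a given size under D1, D2 or D3."""
    if pool == "D1":
        step = l                                   # fixed spacing
    elif pool == "D2":
        step = size // l                           # small domains are denser
    elif pool == "D3":
        step = (smallest * largest // size) // l   # large domains are denser
    else:
        raise ValueError("unknown domain pool type: %r" % (pool,))
    return max(1, step)

def domain_corners(width, height, size, step):
    """Upper-left corners of all size x size domains lying on the lattice."""
    return [(x, y)
            for y in range(0, height - size + 1, step)
            for x in range(0, width - size + 1, step)]

if __name__ == "__main__":
    for pool in ("D1", "D2", "D3"):
        for size in (8, 32):
            step = lattice_step(pool, size, 4, 8, 32)
            count = len(domain_corners(64, 64, size, step))
            print(pool, size, step, count)
```

Running the loop for a 64 x 64 image with l = 4 shows how the number of candidate domains of each size shifts between the three library types.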
It is important to note that each domain can be mapped onto a range in eight different ways (four rotations, and a flip followed by four rotations). Hence the domain pool can be thought of as containing the domains in eight possible orientations. In practice, the classification explained below defines a fixed rotation, so that the domains are only considered in one of two orientations. There are alternative methods of selecting the domains but they exceed the scope of this project and paper.

Let Di denote a portion of the original image (with a brightness and contrast transformation) and Ri a part of the produced copy. The Di are usually known as domains and the Ri as ranges: a domain is where the transformation maps from, and a range is where it maps to. We therefore need to find pieces Di and maps wi such that, when wi is applied to the part of the image over Di, the result is very close to some other part of the image, over Ri. Finding the pieces Ri and the corresponding Di by minimizing the distances between them is the goal of the problem.

If the resulting optimal r.m.s. value is above a preselected threshold and the depth of the quadtree is less than a preselected maximum depth, then the range is subdivided into four quadrants and the process is repeated. If the r.m.s. value is below the threshold, the optimal domain and the affine transformation on the pixel values are stored.

One of the most computationally intensive steps is the domain-range comparison. In order to minimize the number of domains compared, a classification scheme is used. Before the encoding, all the domains in the domain library are classified. During the encoding, a potential range is classified, and only domains with the same or a near classification are compared with the range. This significantly reduces the number of domain-range comparisons.

Image decoding
There are currently several techniques used to decode a compressed image. In this project both iterative and pyramidal decoding have been implemented.
Decoding an image iteratively begins by using the quadtree partition to determine all the ranges in the image. For each range Ri, the domain Di that maps to it is shrunk by a factor of two in each dimension by averaging non-overlapping groups of 2 × 2 pixels. The shrunken domain pixel values are then multiplied by si, added to oi, and placed in the location in the range determined by the orientation information. This represents the first iteration. Normally about 10 iterations are sufficient to give an appropriate approximation to the fixed point. Further iteration does not generally improve the image.

Pyramidal decoding is based on decoding a smaller replica of the image first. A few iterations are subsequently applied to the small image itself, after which the new small image obtained is scaled to the required size to give the output image. The advantage is that fewer iterations are required on the scaled image to reach proper convergence. This is expected since the image being iterated is smaller. Hence, the decoding time is significantly shorter. In addition, a better peak signal-to-noise ratio (PSNR), defined for an 8-bit grey-scale image as

$$ \mathrm{PSNR} = 10 \log_{10} \left( \frac{255^2}{\dfrac{1}{\#\,\text{pixels}} \sum_{i,j} (a_{ij} - b_{ij})^2} \right) \qquad (3) $$

is expected. Generally, all images are decoded to a 128 × 128 image at first and then at twice the size each time, until the final image size is reached. For example, a 512 × 512 image is decoded as a low-resolution 128 × 128 image, then the resolution is increased to 256 × 256, and finally to 512 × 512. It should be pointed out that fractal encoding has the unique characteristic of using transformations that are resolution independent.

Problem definition and functional requirements
Figure 5 gives the design methodology for the development of the fractal image compression technique employing quadtree partitioning. The designer has two problems at hand. These are:
• quadtree encoding of the image with a targeted fidelity;
• quadtree decoding of the image using an iterative technique and an appropriate post-processing technique to improve the quality of the decoded image.
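The iterative decoding named in the second item can be sketched as follows, in Python for illustration. This is a simplified version under stated assumptions: each stored transformation is a hypothetical tuple (rx, ry, size, dx, dy, s, o), the domain is a square of side 2 × size with its upper-left corner at (dx, dy), the orientation is taken to be the identity, and the starting image is an arbitrary mid-grey.

```python
def decode_iteration(image, transforms):
    """Apply every range transformation once to produce the next image."""
    out = [row[:] for row in image]
    for (rx, ry, size, dx, dy, s, o) in transforms:
        for j in range(size):
            for i in range(size):
                x, y = dx + 2 * i, dy + 2 * j
                # Shrink the domain by averaging a 2 x 2 group of pixels.
                avg = (image[y][x] + image[y][x + 1]
                       + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
                # Contrast and brightness adjustment, clamped to 8 bits.
                out[ry + j][rx + i] = min(255.0, max(0.0, s * avg + o))
    return out

def decode(transforms, width, height, iterations=10):
    """Iterate from a mid-grey image towards the fixed point."""
    image = [[128.0] * width for _ in range(height)]
    for _ in range(iterations):
        image = decode_iteration(image, transforms)
    return image

if __name__ == "__main__":
    # Toy code: four 2 x 2 ranges, each mapped from the whole 4 x 4 image.
    toy = [(0, 0, 2, 0, 0, 0.5, 10.0), (2, 0, 2, 0, 0, 0.5, 20.0),
           (0, 2, 2, 0, 0, 0.5, 30.0), (2, 2, 2, 0, 0, 0.5, 40.0)]
    for row in decode(toy, 4, 4):
        print(row)
```

Because each transformation in the toy example scales pixel differences by 0.5, the result does not depend on the starting image once enough iterations have been applied.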
To achieve a targeted fidelity, an algorithm is required that partitions the image into ranges of maximal size subject to a defined tolerance and to a minimum and a maximum range size. Different types of domain library are required to obtain a good domain-range fit as well as fewer encoding transformations. The first type of domain library must have a roughly equal number of domains of each size. The second needs more small domains than large domains, and the third needs more large domains and fewer small domains. One of the most computationally intensive steps is the domain-range comparison. In order to minimize the number of domains compared, a classification scheme is required: a potential range must be classified, and only domains with the same or a near classification are compared with the range. Finally, the parameters to be passed to the fractal encoder, such as:

• the r.m.s. tolerance threshold
• the maximum and minimum depth of the quadtree partition
• the domain pool type and the lattice spacing l
• the number of classes compared with a range
• the maximum allowable scaling factor

must be determined.

Fig. 5 Functional requirements of the compression scheme.

The parameters of the encoded image, such as:

• the final quadtree partition of the image
• the scaling and offset values s and o for each range
• for each range, a domain that is mapped to it
• the orientation used to map the domain pixels onto the range pixels

need to be determined.
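The orientation stored for each range selects one of the eight ways of placing a square block onto another: four rotations, each with or without a flip. A minimal sketch in Python for illustration is given below; the function names and the order in which the orientations are indexed are assumptions, not the numbering used by the project software.

```python
def rotate90(block):
    """Rotate a square block of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]

def flip(block):
    """Mirror a block left to right."""
    return [row[::-1] for row in block]

def orientations(block):
    """Return the eight orientations of a block: rotations and their flips."""
    result = []
    current = [row[:] for row in block]
    for _ in range(4):
        result.append(current)
        result.append(flip(current))
        current = rotate90(current)
    return result

if __name__ == "__main__":
    for index, oriented in enumerate(orientations([[1, 2], [3, 4]])):
        print(index, oriented)
```

Storing only the index of the chosen orientation is enough for the decoder to reproduce the mapping.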
Decoding the compressed image is much less computationally intensive than encoding it. Hence decoding is carried out faster than encoding. However, it is never as fast as optimised algorithms. Two techniques commonly used to decode a compressed image, namely iterative and pyramidal decoding, need to be implemented in this project. In the iterative decoding technique each iteration consists of finding, for each range, the domain that maps onto it. Since the domains are larger than the ranges, each domain has to be shrunk to achieve the mapping. It is also necessary to adjust the brightness and the contrast and to perform the affine transformation for the mapping to be complete. Pyramidal decoding is based on decoding a smaller replica of the image first. Then a few iterations are applied to the small image itself. Finally the image obtained is scaled to the required size to give the output image.

Since the ranges are encoded independently there is no guarantee that the pixel values will be smooth at the range boundaries. Human eyes are sensitive to such discontinuities, even when they are relatively small. Post-processing must therefore be applied to improve the smoothness of the image, at least to the human eye. Finally, user-friendly software must be implemented so that the compression technique can be used and its performance analysed.

Simulation results and discussion
Software has been developed in C++ using Microsoft Visual C++ 6.0 to encode an image by a quadtree partition. This involves the following steps:

1. A tolerance level ec is set. This acts as the threshold for the r.m.s. value resulting from the comparison of potential domains with a range. A maximum and a minimum depth for the quadtree partitioning are also set in order to avoid unnecessarily extensive computation.
2. A domain library is selected and then the classification scheme is carried out.
3. If the r.m.s. value is below the threshold value, the range is marked as covered.
4. Otherwise the range is subdivided into four quadrants and the process is repeated.

These steps are illustrated in the flowchart of Fig. 6. The flowchart in Fig. 7 gives a decoding iteration.

Figure 8 gives a user-friendly window with options to implement compression/decompression of an image. Clicking on the compression option encodes a file of sequential bytes that represents an image. Different flags alter the time and fidelity of the resulting encoding. Output files have an extension of .trn by default, to denote a transform file containing the coefficients of the partitioned iterated transform that encodes the image. Figure 9 gives all the encoding options. Ranges are selected in the image using a recursive quadtree partitioning of square sub-images of the input image.
For each range, a domain sub-image is sought which has twice the range side length. If no domain can be mapped onto the range with an r.m.s. error less than the tolerance value set with the slider, the range is partitioned into four quadrants and the process is repeated. The maximum and minimum recursion depths can also be altered using the appropriate options. The default values used for a 256 × 256 image are 4 (minimum) and 6 (maximum). This gives ranges of maximum size 16 and minimum size 4. It is useful to note that a 300 × 400 image with the same options will have ranges of the same size, since the largest square sub-image with a power-of-two side that fits in a 300 × 400 image is 256 × 256. The domain pool type, domain step as multiplier, and domain pool step size options affect the selection of domains that are compared with ranges. The domain pool type can take values of 0, 1 and 2, which select among different schemes for defining a pool of domains for comparison with each range. The domain pool step size, which takes a value from 0 to 15, also affects the number of domains, since it determines the domain density.

Fig. 6 Flowchart of quadtree encoding steps.

Fig. 7 Flowchart illustrating a decoding iteration.

Fig. 8 User-friendly window to implement fractal compression/decompression.

Fig. 9 Encoding options window.

The 24 domain classes and 3 domain classes flags determine the number of domains searched for each range. As their names indicate, they search 24 and 3 classes respectively. These cause the program to run longer but result in better encoding. The positive scale only flag causes only positive scaling to be used. This means that only one class (as opposed to two) is searched. It causes the program to run faster but gives a poorer encoding.

The software developed also enables decompression of a compressed file, again using the window illustrated in Fig. 8. It outputs the decompressed file of sequential byte values that represents an image. Input files will typically have an extension of .trn, to denote a transform file containing the coefficients of the partitioned iterated function system that encodes the image. Figure 10 gives the decoding options. Since the representation of images encoded using the compression scheme contains no resolution information, the image can be decoded at any resolution. Images that are decoded at resolutions greater than the encoded resolution will have artificial details automatically created. The output image can be post-processed to eliminate some blocky artefacts that are side-effects of the encoding method. This option is disabled by default and is enabled using the corresponding flag.

Fig. 10 Decoding options window.

Figures 11 and 12 show the comparative results of PSNR variation against compression and of compression time against PSNR. Both images were encoded with 24 domain classes. The post-processing coefficients used were A = 7, B = 1, C = 3 and D = 1. For the 512 × 512 Lena image the following options were used [6]:

• Minimum recursion depth = 5
• Maximum recursion depth = 7

For the 256 × 256 Lena image the following options were used:

• Minimum recursion depth = 4
• Maximum recursion depth = 6

Fig. 11 PSNR vs compression for both the 512 × 512 and 256 × 256 Lena images.

Fig. 12 Compression time vs PSNR for both the 512 × 512 and 256 × 256 Lena images.

It is observed that better compression is achieved with the smaller Lena image and that the latter requires less encoding time.

Figure 13 gives a comparative result of fractal image compression and the well-known JPEG compression. Both images were encoded with the following options:

• Minimum recursion depth = 4
• Maximum recursion depth = 6
• Domain pool step size = 8
• Domain pool type = 2
• Domain step as multiplier: enabled
• Fidelity option: 24 classes enabled

The images were then decoded using pyramidal decoding with post-processing coefficients of A = 3, B = 1, C = 2 and D = 1. The tolerance was varied to produce the different points for the fractal compression graph. For the JPEG values, the raw 512 × 512 image was compressed using the imaging software Jasc Paint Shop Pro version 6.00 with the standard encoding option.
Fig. 13 Fractal vs JPEG compression for both the 512 × 512 and 256 × 256 Lena images.
Fig. 14 JPEG vs fractal image compression comparison for the 512 × 512 Lena image. JPEG compression: compression = 56.1:1, PSNR = 24.8 dB. Quadtree fractal encoding: compression = 56.7:1, PSNR = 29.1 dB.
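PSNR figures such as those quoted for Fig. 14 follow from eqn (3). A minimal sketch of the calculation in Python, for illustration, is given below; the function name `psnr` is hypothetical, and images are assumed to be equal-sized lists of rows of 8-bit grey levels.

```python
import math

def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-sized images."""
    squared_error = 0.0
    pixels = 0
    for row_a, row_b in zip(original, decoded):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            pixels += 1
    if squared_error == 0:
        return math.inf                 # identical images
    return 10.0 * math.log10(peak * peak / (squared_error / pixels))

if __name__ == "__main__":
    print(psnr([[100, 100]], [[101, 99]]))
```

A difference of one grey level at every pixel gives a PSNR of about 48.13 dB, so values in the range of 25 to 30 dB correspond to clearly visible reconstruction error.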
It is evident that fractal compression is far better than standard JPEG when the compression ratio is beyond 30. Figure 14 confirms these results in terms of quality of compression. The factor that limits the efficiency of fractal compression is the time taken to compress the image. While JPEG compression is a highly optimised technique for very fast compression, the quadtree encoding algorithm is relatively slow. In fact, it is the classification step that produces the high delay.

Evaluation
At the end of the semester, students working on similar design and implementation projects using a top-down systems approach were asked to evaluate them. The students felt that they were able to learn more effectively in this type of practice-oriented project. The students had a feeling of satisfaction and accomplishment since they realised that they had actually been able to design and implement what were relatively complex projects. The negative aspect of the evaluation was the amount of time required for the design and implementation of the project, particularly at times when demand from the other courses is also high.

Conclusion
In this paper a fractal image compression/decompression scheme employing quadtree partitioning has been designed and implemented using a top-down approach and a user-computer interface. Such problem-solving skills are not often emphasised by current educational methods. These skills form the foundation for further learning, are necessary for engineers and managers, and must be treated as essential competencies for all students. General principles and methods of problem solving, thinking, reasoning and learning skills must have a place in the academic curriculum [7-9]. Since forming a solution to a problem cannot be done effectively without the requisite problem-solving techniques, such a project seems both useful and appropriate to introduce and enhance such skills. The top-down system approach to solving an engineering problem presented in this paper can serve such a purpose.

References
1. Y. Fisher, Fractal image compression, in P. Prusinkiewicz (ed.), Fractals: From Folk Art to Hyperreality, ACM SIGGRAPH 92, Fractal Course Notes 12, 1992.
2. Y. Fisher, Fractal Image Compression: Theory and Application (Springer, New York, 1994).
3. M. F. Barnsley, Fractals Everywhere (Academic Press, San Diego, 1988).
4. University of Bath Image Processing Group, http://dmsun4.bath.ac.uk/
5. F. P. Deek, H. Kimmel and J. A. McHugh, Pedagogical changes in the delivery of the first course in computer science: Problem solving then programming, J. Eng Educ., 87(3) (1998), 313-320.
6. H. A. Simon, Problem solving and education, in D. T. Tuma and F. Reif (Eds), Problem Solving and Education: Issues in Teaching and Research (Lawrence Erlbaum, Hillsdale, NJ, 1980), pp. 81-96.
7. J. R. Hayes, Teaching problem solving mechanism, in D. T. Tuma and F. Reif (Eds), Problem Solving and Education: Issues in Teaching and Research (Lawrence Erlbaum, Hillsdale, NJ, 1980), pp. 141-147.
8. H. A. Simon, The New Science of Management (Harper and Row, New York, 1960).
9. W. J. Stepien, S. A. Gallagher and D. Workman, Problem-based learning for traditional and interdisciplinary classrooms, J. Educ. Gifted, 16(4) (1993), 338-357.
