
Lossless Compression of JPEG and GIF Files through Lexical Permutation Sorting with Greedy Sequential Grammar Transform based Compression
Sajib Kumar Saha, Mrinal Kanti Baowaly, Md. Rafiqul Islam, Md. Masudur Rahaman
Khulna University/CSE, Khulna, Bangladesh
e-mail: to_sajib_cse@yahoo.com / tosajib@gmail.com, mrinalbaowaly@yahoo.com, dmri1978@yahoo.com, masud_cse02@yahoo.com

Abstract— This paper provides a way for lossless compression of color images through lexical permutation sorting (LPS) and greedy sequential grammar transform based compression. The proposed model adopts the advantages of lexical permutation sorting for color images to produce permuted data. Greedy sequential grammar transform based compression, which is basically a text compression technique, can then be applied easily to that permuted data. For comparison, we have taken Inversion Coding of Burrows Wheeler Compression (BWIC), Burrows Wheeler Compression (BWC), and the model proposed in ICCIT 2006 (in a paper named 'A New Approach for Lossless Compression of JPEG and GIF Files Using Bit Reduction and Greedy Sequential Grammar Transform').

Keywords— LPS, BWC, BWIC.

I. INTRODUCTION

High quality color images with higher resolution, which demand huge memory space, are favored by modern users. Color image compression is an important technique to reduce the image space while retaining high image quality. In this paper a compression technique for color images is proposed based on LPS and the greedy sequential grammar transform. When the underlying data to be transmitted is a permutation Π, the LPS algorithm generates a cyclic group of order n with Ф(n) generators [3], where Ф(n) is Euler's Ф function of n. Among the Ф(n) possibilities, one or more choices may be cheaper to communicate than the original permutation (whereas the Burrows Wheeler Transform (BWT) produces only one permutation). When the greedy sequential grammar transform algorithm works on that permuted data, the produced grammar becomes short.

II. LITERATURE SURVEY

A. Lexical Permutation Sorting

Before discussing the theoretical basis of LPS, we begin this section by giving an example. Let p = [3,1,5,4,2] be a given permutation. Construct the matrix

        3 1 5 4 2
        1 5 4 2 3
    N = 5 4 2 3 1
        4 2 3 1 5
        2 3 1 5 4

by forming successive rows of N as consecutive cyclic left-shifts of the sequence p. Let F be the first, S the second, and L the last column of N. By sorting the rows of N lexically, we transform it to

         1 5 4 2 3
         2 3 1 5 4
    N′ = 3 1 5 4 2
         4 2 3 1 5
         5 4 2 3 1

This amounts to sorting N with respect to the first column, i.e. applying a row permutation to N so that its first column becomes (1,2,3,4,5)T. The original sequence p appears in the 3rd row of N′. If the transmitter transmits the pair (i, S′) or (i, L′), where i is that row index and S′ and L′ are the second and last columns of N′, then the receiver can reconstruct the original sequence p uniquely. For example, if (i, S′) is transmitted, the receiver constructs the original sequence p by using the following procedure as described in [3]:

Procedure
    p[1] = i;
    for (j = 2; j ≤ n; j++)
        p[j] = S′[p[j-1]];

Or, if (i, L′) is transmitted, the receiver constructs the original sequence p by using the following procedure [3]:

Procedure
    p[n] = L′[i];
    for (j = 1; j ≤ n-1; j++)
        p[n-j] = L′[p[n-j+1]];
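The worked example in this section (p = [3,1,5,4,2]) can be sketched in Python. This is illustrative helper code, not from the paper; the variable names are hypothetical:

```python
# Sketch of the LPS example: build the cyclic-shift matrix N,
# sort its rows lexically to get N', then reconstruct p from (i, S')
# using the procedure p[1] = i; p[j] = S'[p[j-1]].

p = [3, 1, 5, 4, 2]
n = len(p)

# N: successive left cyclic shifts of p
N = [p[k:] + p[:k] for k in range(n)]

# N': rows of N in lexical order
N_sorted = sorted(N)

# i = row of N' holding the original p (1-based); S' = second column of N'
i = N_sorted.index(p) + 1
S_prime = [row[1] for row in N_sorted]

# Reconstruction from (i, S'), with the paper's 1-based indexing
q = [i]
for _ in range(n - 1):
    q.append(S_prime[q[-1] - 1])

assert q == p  # the receiver recovers p exactly
```

Running this on the example gives i = 3, matching the observation that p appears in the 3rd row of N′.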
Of course, once we realize that L′ is the inverse of S′ as a permutation, the second procedure is seen to be equivalent to the one using S′.

More generally, suppose that A is an alphabet of n symbols with a linear ordering. If Y is a data string with elements in A, we denote by N(Y) the n × n matrix whose ith row is Y(i), and by N′(Y) the matrix obtained by lexically ordering the rows of N(Y).

Now according to lemma 3.1 in [3], let p be a permutation of degree n given in Cartesian form. Construct an n × n matrix N whose first row is p and whose each following row is a left cyclic shift of the previous row. If Π_j is the jth column of N, so that N = [Π_1, Π_2, …, Π_n], then the result of lexically ordering the rows of N is the matrix N′ = [Π_1^{-1}Π_1, Π_1^{-1}Π_2, …, Π_1^{-1}Π_n], where Π_1^{-1}Π_j is the jth column of N′ in Cartesian form. When we need to emphasize the dependency of N and N′ on the input data permutation p, we write N(p) and N′(p) for N and N′ respectively. We continue to assume that p is a given permutation as in lemma 3.1 in [3].

Now according to theorem 3.1, if l = Π_1^{-1}Π_n and Π = l^{-1} = Π_n^{-1}Π_1, then p(i+1) = Π(p(i)). Knowledge of l = Π_1^{-1}Π_n and p(1) allows us to recover p completely.

Let the matrix N′ = N′(p) = (t_{i,j}). Note that if we interpret the second column of N′ as a permutation, we get the two-row Cartesian form

    Θ = ( 1        2        …    n       )
        ( t_{1,2}  t_{2,2}  …    t_{n,2} )

But the image under Θ of any index j can be found by taking any row and looking at the next element in that row. Since rows are cyclic shifts of each other, it does not matter at which row we look. In particular, we could just use the first row, i.e. we have that Θ = (1, t_{1,2}, t_{1,3}, …, t_{1,n}). Hence Θ is a cycle of length n. Moreover, it is clear that the third column, interpreted as a permutation, is simply Θ^2, and in general the kth column is Θ^{k-1}. According to proposition 3.1 in [3], the columns of N′(p) form a cyclic group G of order n generated by Θ.

Let ∆ be the set of columns of N′ which are generators of the cyclic group <Θ>; then Θ as well as l = Θ^{-1} can be completely specified as an integer power of any one element of ∆. There are |∆| = Ф(n) generators of the cyclic group <Θ>, where Ф(n) is Euler's Ф function of n. If n is prime, there are Ф(n) = n-1 generators of the cyclic group <Θ>. It is straightforward to obtain all elements of ∆, since Θ^k ∈ ∆ if and only if the integer k, 0 < k < n, is relatively prime to n. Hence

    ∆ = { Θ^k | gcd(k, n) = 1, 0 < k < n }.

In particular, the following procedure as defined in [3] allows us to determine a generator ∂ of least entropy for G, as well as the integer t such that ∂^t = Θ. By a cost function in that procedure we mean a function that can measure the entropy of the resulting data after decomposition.

Procedure
    C0 = Cost(Θ); k0 = 1;
    for (k = 2; k < n; k++)
    {
        ¥ = Θ^k;
        if (Cost(¥) < C0)
        {
            C0 = Cost(¥);
            k0 = k;
        }
    }
    ∂ = Θ^k0;
    t = Inverse(k0, n);    /* t = k0^{-1} mod n, so that ∂^t = Θ */

Again according to theorem 3.2 of [3], if Y is a data string of length n, from a linearly ordered alphabet A of g distinct symbols, with lexical index permutation λ = λ_Y, and if N′ = N′(λ), then M′_{i,j} = Y′[N′_{i,j}], where M = N(Y) and M′ = N′(Y).

Theorem 3.2 described in [3] establishes the connection between LPS and BWT. When the data to be transmitted is a permutation, then in general the LPS algorithm will give better results than BWT, because we are able to select the least expensive generator ∂ ∈ ∆, with an additional overhead of transmitting a single integer x, 1 ≤ x ≤ n, such that ∂^x = Θ. This amounts to an overhead bounded by (log n)/n.

B. Grammar Based Compression

Let x be a sequence from A which is to be compressed. A grammar transform converts it into an admissible grammar [1, 2] that represents x; the encoder then works to encode that grammar as shown in Fig. 1. In this paper, we are particularly interested in a grammar transform that starts from the grammar G consisting of only one production rule s0 → x, and repeatedly applies reduction rules 1–5 proposed in [1] in some order to reduce it into an irreducible grammar G′. Such a grammar transform is called an irreducible grammar transform. To compress x, the corresponding grammar-based code then uses a zero-order arithmetic code to compress the irreducible grammar G′. After receiving the codeword of G′, one can fully recover G′, from which x can be obtained via parallel replacement. Different orders via which the reduction rules are applied give rise to different irreducible grammar transforms, resulting in different grammar-based codes.

[Fig. 1: Structure of a grammar based compression — input data x → grammar transform → context-free grammar G → arithmetic coder → binary codeword.]

III. NEW MODEL FOR LOSSLESS IMAGE COMPRESSION

The basic idea in this proposed method is to sort the input data by first applying LPS and then apply greedy sequential grammar transform based compression to compress the image file.
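The generator-selection procedure of Section II.A can be sketched as follows. The cost function is an assumption on our part (the paper only requires some entropy-like measure of the resulting data); here a stand-in based on the zero-order entropy of consecutive differences is used, since the plain symbol entropy is constant over permutations of distinct values. All helper names are hypothetical:

```python
import math
from collections import Counter
from math import gcd

# Stand-in cost function (an assumption): zero-order entropy of the
# consecutive (circular) differences of the permutation.
def entropy_of_differences(perm):
    diffs = [(b - a) % len(perm) for a, b in zip(perm, perm[1:])]
    counts = Counter(diffs)
    total = len(diffs)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def compose_power(theta, k):
    """k-th power of a permutation theta given in 1-based Cartesian form."""
    result = list(range(1, len(theta) + 1))  # identity
    for _ in range(k):
        result = [theta[x - 1] for x in result]
    return result

def cheapest_generator(theta):
    """Pick k0 minimizing the cost among the powers Θ^k that generate <Θ>."""
    n = len(theta)
    best_k, best_cost = 1, entropy_of_differences(theta)
    for k in range(2, n):
        if gcd(k, n) != 1:          # Θ^k is a generator iff gcd(k, n) = 1
            continue
        cand = compose_power(theta, k)
        cost = entropy_of_differences(cand)
        if cost < best_cost:
            best_cost, best_k = cost, k
    return best_k, compose_power(theta, best_k)
```

Transmitting the chosen generator together with the single integer k0 is what gives the (log n)/n overhead bound quoted in Section II.A.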
[Fig. 2: Proposed model. (a) Compression: source image → lexical permutation sorting → greedy sequential grammar transform → arithmetic coder → compressed image. (b) Decompression: compressed image → arithmetic decoder → reverse greedy sequential grammar transform → reverse lexical permutation sorting → source image.]

A. Proposed Compression Technique

The proposed compression algorithm consists of two phases:
1. Lexical Permutation Sorting.
2. Greedy sequential grammar transform based compression.

Greedy Grammar Transform [1]

Let x = x_1 x_2 … x_n be a sequence from A which is to be compressed. The transform parses the sequence x sequentially into non-overlapping substrings {x_1, x_2…x_{n_2}, …, x_{n_{t-1}+1}…x_{n_t}} and builds sequentially an irreducible grammar for each x_1…x_{n_i}, where 1 ≤ i ≤ t, n_1 = 1, and n_t = n. The first substring is x_1 and the corresponding irreducible grammar G_1 consists of only one production rule s0 → x_1. Suppose that x_1, x_2…x_{n_2}, …, x_{n_{i-1}+1}…x_{n_i} have been parsed off and the corresponding irreducible grammar G_i for x_1…x_{n_i} has been built. Suppose that the variable set of G_i is equal to S(j_i) = {s0, s1, …, s_{j_i-1}}, where j_1 = 1. The next substring x_{n_i+1}…x_{n_{i+1}} is the longest prefix of x_{n_i+1}…x_n that can be represented by s_j for some 0 < j < j_i, if such a prefix exists. Otherwise, x_{n_i+1}…x_{n_{i+1}} = x_{n_i+1} with n_{i+1} = n_i + 1. If n_{i+1} - n_i > 1 and x_{n_i+1}…x_{n_{i+1}} is represented by s_j, then append s_j to the right end of G_i(s0); otherwise, append the symbol x_{n_i+1} to the right end of G_i(s0). The resulting grammar is admissible, but not necessarily irreducible. Apply reduction rules 1–5 proposed in [1] to reduce the grammar to an irreducible grammar G_{i+1}. Then G_{i+1} represents x_1…x_{n_{i+1}}. Repeat this procedure until the whole sequence is processed. Then the final irreducible grammar G_t represents x. Since only one symbol from S(j_i) ∪ A is appended to the end of G_i(s0), not all reduction rules can be applied to get G_{i+1}. Furthermore, the order via which reduction rules are applied is unique.

Encoding Algorithm

In the sequential algorithm [1], we encode the data sequence x sequentially by using a zero-order arithmetic code with a dynamic alphabet to encode the sequence of parsed phrases x_1, x_2, …. Specifically, we associate each symbol β ∈ S ∪ A with a counter c(β). Initially c(β) is set to 1 if β ∈ A and 0 otherwise. At the beginning, the alphabet used by the arithmetic code is A. The first parsed phrase x_1 is encoded by using the probability c(x_1) / Σ_{β ∈ A} c(β). Then the counter c(x_1) increases by 1. Suppose that x_1, x_2, …, x_{n_i} have been parsed off and encoded and that all corresponding counters have been updated. Let G_i be the corresponding irreducible grammar for x_1…x_{n_i}. Assume that the variable set of G_i is equal to S(j_i) = {s0, s1, …, s_{j_i-1}}, where j_1 = 1. Let x_{n_i+1}…x_{n_{i+1}} be parsed off as in our irreducible grammar transform and represented by β ∈ {s1, …, s_{j_i-1}} ∪ A. Encode β and update the relevant counters according to the following steps:

Step 1: The alphabet used at this point by the arithmetic code is {s1, …, s_{j_i-1}} ∪ A. Encode x_{n_i+1}…x_{n_{i+1}} by using the probability

    c(β) / Σ_{α ∈ S(j_i) ∪ A} c(α).    (1)

Step 2: Increase c(β) by 1.
Step 3: Get G_{i+1} from the appended G_i as in our irreducible grammar transform.
Step 4: If j_{i+1} > j_i, i.e., G_{i+1} includes the new variable s_{j_i}, increase the counter c(s_{j_i}) by 1.

Repeat this procedure until the whole sequence is processed and encoded. Note that c(s0) is always 0; thus the summation over S(j_i) ∪ A in (1) is equivalent to the summation over {s1, …, s_{j_i-1}} ∪ A. From Step 4, it follows that each time a new variable s_{j_i} is introduced, its counter increases from 0 to 1. Therefore, in the entire encoding process, there is no zero-frequency problem. Also, in the sequential algorithm [1], the parsing of phrases, encoding of phrases, and updating of irreducible grammars are all done in one pass. Clearly, after receiving enough code bits to recover the symbol β, the decoder can perform the update operation in exactly the same way as does the encoder.

B. Proposed Decompression Technique

The proposed decompression algorithm consists of two phases:
1. Greedy sequential grammar based decompression.
2. Reverse Lexical Permutation Sorting.

IV. EXPERIMENTAL RESULTS

We have taken some JPEG and GIF images as sample input to our proposed model. The images used in the experiment are shown in Fig. 3 and Fig. 5. It has been found that the quality of the final decompressed image is exactly the same as that of the original image, as shown in Fig. 4 and Fig. 6.
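The greedy parsing and counter-update steps of the encoding algorithm above can be sketched in much-simplified form. This stand-in replaces the grammar variables with a growing phrase dictionary and omits reduction rules 1–5 of [1] entirely; the function and variable names are hypothetical:

```python
from fractions import Fraction

def encode_probabilities(x, alphabet):
    """Return the coding probability used for each greedily parsed phrase.

    Each phrase is the longest already-representable string (a stand-in for
    'representable by some variable s_j'); a newly seen phrase gets a
    counter of 1, so no symbol ever has probability zero (no zero-frequency
    problem), mirroring Steps 1-4 above in spirit.
    """
    c = {a: 1 for a in alphabet}      # c(β) = 1 for β in A, as in the paper
    seen = set(alphabet)              # phrases representable so far
    probs = []
    i = 0
    while i < len(x):
        # longest prefix of the remainder that is already representable
        j = i + 1
        while j < len(x) and x[i:j + 1] in seen:
            j += 1
        beta = x[i:j]
        total = sum(c.values())
        probs.append(Fraction(c[beta], total))   # probability as in eq. (1)
        c[beta] += 1                             # Step 2: increase c(β)
        new_phrase = x[i:j + 1]                  # Steps 3-4: grow dictionary,
        if j < len(x) and new_phrase not in seen:
            seen.add(new_phrase)                 # new "variable" gets
            c[new_phrase] = 1                    # counter 0 → 1
        i = j
    return probs
```

For example, `encode_probabilities("abab", {"a", "b"})` parses the phrases a, b, ab and returns the probabilities 1/2, 1/4, 1/6.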
Fig. 3: Some sample JPEG input files: (a) House, (b) Man 1, (c) Man 2, (d) Tree.


Table I: Comparison with BWIC, BWC, the model proposed in [5], and the proposed model for JPEG files

File Name   Original Size   Using BWIC   Using BWC   Using the Model      Using the
            (bytes)         (bytes)      (bytes)     Proposed in [5]      Proposed Model
                                                     (bytes)              (bytes)
House        51,202          51,214       51,195      51,038               49,031
Man 1        39,821          39,339       39,012      38,723               36,172
Man 2         2,149           2,201        2,157       2,159                2,141
Tree          7,494           7,588        7,498       7,452                7,141
Total       100,666         100,342       99,862      99,372               94,485

Fig. 4: Decompressed JPEG images: (a) House, (b) Man 1, (c) Man 2, (d) Tree.

Fig. 5: Some sample GIF input files: (a) Texture, (b) Advertisement, (c) Coin, (d) Woman.

Table II: Comparison with BWIC, BWC, the model proposed in [5], and the proposed model for GIF files

File Name      Original Size   Using BWIC   Using BWC   Using the Model      Using the
               (bytes)         (bytes)      (bytes)     Proposed in [5]      Proposed Model
                                                        (bytes)              (bytes)
Texture          8,714           8,364        8,566       8,579                8,157
Advertisement   65,178          64,911       64,860      64,545               58,154
Coin           137,566         132,699      135,129     132,186              119,116
Woman          164,597         160,482      162,979     159,063              133,061
Total          376,055         366,456      371,534     364,373              318,488
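As a quick sanity check, the per-file figures in Tables I and II reproduce the printed totals, and give the overall size reduction achieved by the proposed model:

```python
# Per-file sizes (bytes) taken from Tables I and II.
jpeg_original = [51202, 39821, 2149, 7494]
jpeg_proposed = [49031, 36172, 2141, 7141]
gif_original = [8714, 65178, 137566, 164597]
gif_proposed = [8157, 58154, 119116, 133061]

# The sums match the "Total" rows of the tables.
assert sum(jpeg_original) == 100666 and sum(jpeg_proposed) == 94485
assert sum(gif_original) == 376055 and sum(gif_proposed) == 318488

# Overall size reduction of the proposed model over the originals.
jpeg_saving = 1 - sum(jpeg_proposed) / sum(jpeg_original)   # ≈ 6.1%
gif_saving = 1 - sum(gif_proposed) / sum(gif_original)      # ≈ 15.3%
```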
Fig. 6: Decompressed GIF images: (a) Texture, (b) Advertisement, (c) Coin, (d) Woman.

Table III: Comparison of time taken by the proposed model and the model proposed in [5]
(Processor: Intel Celeron, 1.6 GHz; RAM: 256 MB; Operating system: Windows XP)

                               Compress time (ms)             Decompress time (ms)
File Name       Original Size  Model           Proposed       Model           Proposed
                (bytes)        Proposed in [5] Model          Proposed in [5] Model
House            51,202         874             585            354             299
Man 1            39,821         547             421            241             223
Texture           8,714         102             101             61              61
Advertisement    65,178         769             566            463             399
Coin            137,566        1912            1511            701             688
Woman           164,597        2550            1952           1550            1463

V. CONCLUSION

This paper has aimed at providing a novel way of applying the LPS algorithm to JPEG and GIF files. Block sorting can also be used in place of LPS, as was done in the model proposed in [6], but LPS is suitable if the underlying data to be transmitted is a permutation [3], and here for JPEG and GIF files LPS works better than block sorting. The results have shown that the proposed method achieves a better compression ratio and takes reduced compression time.

REFERENCES

[1] En-hui Yang and Da-Ke He, "Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform—Part two: With context models," IEEE Trans. Inform. Theory, vol. 49, no. 11, November 2003.
[2] E.-h. Yang and J. C. Kieffer, "Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform—Part one: Without context models," IEEE Trans. Inform. Theory, vol. 46, pp. 755–788, May 2000.
[3] Ziya Arnavut and Spyros S. Magliveras, "Lexical Permutation Sorting Algorithm," The Computer Journal, vol. 40, no. 5, October 1997.
[4] M. Burrows and D. J. Wheeler, "A Block-sorting Lossless Data Compression Algorithm," SRC Research Report 124, Digital Systems Research Center, Palo Alto, CA, 1994. Available: gatekeeper.dec.com, /pub/DEC/SRC/research-reports/SRC-124.ps.Z.
[5] Md. Rafiqul Islam, Sajib Kumar Saha, and Mrinal Kanti Baowaly, "A New Approach for Lossless Compression of JPEG and GIF Files Using Bit Reduction and Greedy Sequential Grammar Transform," ICCIT 2006.
[6] Md. Rafiqul Islam, Sajib Kumar Saha, and Mrinal Kanti Baowaly, "A Modification of Greedy Sequential Grammar Transform based Universal Lossless Data Compression," ICCIT 2006.
