
Digital Image Steganography Using Matrix Addition

G. Margarov, V. Markarov
Information Security and Software Development Department,
State Engineering University of Armenia (Polytechnic), Yerevan, Armenia


Abstract - This paper addresses the problem of data hiding
in digital images. To hide the data, it is proposed to form the
LSB data plane of the filled container by matrix addition of
the LSB data plane of the empty container and the binary
matrix of hidden data. Approaches to the matrix
representation of steganographic containers and hidden
data are discussed. It is shown that the proposed algorithm
allows a steganographic system with perfect secrecy in
Shannon's sense to be constructed. The results of an
experimental evaluation of the proposed and known
algorithms are outlined.
Keywords: Steganography, digital image, data hiding,
binary matrix, matrix addition, perfect secrecy
1 Introduction
The idea and practice of hiding the exchange of information
have a long history. Steganography was widely used in
historical times, especially before cryptographic systems
were developed. According to one account, the ancient
Sumerians were among the first to use steganography, in the
4th-3rd millennium BC: many clay cuneiform tablets have been
found on the territory of the Armenian Highlands in which one
record was covered by a layer of clay and another text was
written on this second layer [1].
The end of the 20th century and the early years of the 21st
century saw rapid growth and widespread use of electronic
data processing and of electronic business conducted through
the Internet. This growth, along with numerous occurrences of
international terrorism, fuelled the need for better methods
of protecting computers and the information they store,
process and transmit.
Generally speaking, there are two main approaches to
protecting information against deliberate interference:
- Cryptography - literally means "secret writing"
- Steganography - literally means "covered writing".
In cryptography, an observer can tell that a message has
been encrypted, but cannot decode it without the proper
key. In steganography, by contrast, the message itself may
not be difficult to decode, but most observers will not
perceive its presence.
Steganography hides a message inside another
message, often called the empty container, forming a
filled container that looks like a normal graphic, sound
or other file. In cryptography, by contrast, an encrypted
message looks like a meaningless jumble of characters. The
goal of steganography was, and still is, to convey messages
under cover of a container, concealing the very existence of
the secret message exchange. While steganography is separate
and distinct from cryptography, there are many analogies
between the two and, in fact, some authors categorize
steganography as a form of cryptography, since hidden
communication is undoubtedly a form of secret writing [2].
2 Steganographic System
A high-level view of a typical steganographic system
is shown in Figure 1.

Figure 1. High-level view of a typical steganographic
system.
The sender wants to communicate a secret message to
a receiver. The message can first be preprocessed, that is,
encrypted, compressed or otherwise transformed for
convenience of embedding in the container. The
preprocessed message can then be secretly hidden in an empty
container, which, as mentioned earlier, can be an image or
any digital medium that has a sufficient number of redundant
bits that can be replaced to hide a secret message.
Redundant bits are those bits of the container file which, if
changed, do not change the perception of the file to a
great extent. The embedding process hides the secret
message using the secret steganographic key.
After embedding is finished, the filled container can be
transmitted to the receiver. At the receiving end, the
receiver, having the proper key, can extract the secret
message from the container and postprocess it. Postprocessing
is naturally the exact inverse of preprocessing. If higher
security is required, these processes can also be carried out
using special secret keys.
The success of steganography depends on the secrecy
of the steganographic key. In steganography, a key is a
piece of information (a parameter) that determines the
functional output of a steganographic algorithm. In general,
such information can include a special value, the empty
container and the filled container. In a concrete case some
parts of the key, or even the whole key, can be absent.
The most important requirement for a steganographic
system is undetectability: the filled container should be
statistically indistinguishable from the empty container. In
other words, there should be no artifacts in the filled
container that could be detected by an attacker with
probability better than random guessing, given full
knowledge of the embedding algorithm except for the
steganographic key (Kerckhoffs' principle).
3 Steganographic Methods
In most existing methods of digital steganography,
something is done to hide the data in a container file;
naturally, these actions or techniques can be separated and
analyzed to learn what is happening during the whole
process. These methods can be broken down into three
categories on the basis of both how and where the data are
hidden [3]:
- Insertion
- Substitution
- Generation
From a practical point of view, substitution-based methods
are of most interest. As the name implies, these
methods substitute (overwrite) information that is
already in the file. Substitution-based methods may seem
straightforward, but care is needed not to overwrite data
in a way that would make the file unusable or visually
flawed. The trick is to find insignificant or least
significant data in a file, that is, data that can
be overwritten without any perceptible impact. Once
these data are identified, the data to be hidden can be
broken up and written in their place, where they should be
rather hard to detect. The most common containers are digital
images, and depending on the file format the steganographic
technique and the payload (embedding capacity) can be very
different.
4 Digital Images
A digital image is a representation of a two-dimensional
image using ones and zeros (binary). It can be of vector or
raster type, but by itself the term "digital image" usually
refers to raster (bitmap) images. Raster images have a finite
set of digital values, called picture elements or pixels,
which are arranged in a fixed number of columns N and rows M.
Typically, the pixels are stored in computer memory as a
bitmap, a two-dimensional array of small integers. Each pixel
of an image is associated with a specific position in some 2D
region and has a value consisting of one or more quantities
(samples) related to that position. Digital images can be
classified according to the number and nature of those
samples [4]:
- binary
- grayscale
- color
All three classes of images can be formally
represented as corresponding matrixes [5].
4.1 Binary Image
A binary image (also called bi-level or two-level) is
a digital image that has only two possible values for
each pixel. Typically the two colors used for a binary image
are black and white though any two colors can be used. The
color used for the object(s) in the image is the foreground
color while the rest of the image is the background color.
This means that a binary image is usually stored in memory
as a bitmap, a packed array of bits, as shown in figure 2,
and each pixel is stored as a single bit (0 or 1).

Figure 2. The binary image.
A binary image formally might be represented as a
binary matrix:

$$
B = \begin{pmatrix}
B_{0,0} & B_{0,1} & \cdots & B_{0,N-1} \\
B_{1,0} & B_{1,1} & \cdots & B_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
B_{M-1,0} & B_{M-1,1} & \cdots & B_{M-1,N-1}
\end{pmatrix}, \tag{1}
$$

where $B_{i,j} \in \{0, 1\}$ for $0 \le i < M$ and $0 \le j < N$.
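In particular, for the image of Figure 2 the binary matrix
will be:

$$
B = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1
\end{pmatrix}, \tag{2}
$$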
Naturally, it is rather difficult to hide data in a binary
image, because a change of any bit can be easily detected.
4.2 Grayscale Image
Grayscale images are distinct from binary images in that
they have many shades of gray between black and white.
A grayscale image is commonly stored with 8 bits per sampled
pixel, which allows 256 different intensities (i.e., shades of
gray). This format is very convenient for programming because
a single pixel occupies a single byte (8 bits). Thus, as
shown in Figure 3, the image consists of 8 arrays of bits
(numbered 0 to 7), one for each bit of a byte.

Figure 3. The grayscale image.
A grayscale image can be formally represented as 8
binary matrixes similar to (1) or as a single matrix:

$$
G = \begin{pmatrix}
G_{0,0} & G_{0,1} & \cdots & G_{0,N-1} \\
G_{1,0} & G_{1,1} & \cdots & G_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
G_{M-1,0} & G_{M-1,1} & \cdots & G_{M-1,N-1}
\end{pmatrix}, \tag{3}
$$

where $G_{i,j} \in \{0, 1, \ldots, 255\}$ is the gray-level intensity of
pixel $(i, j)$ for $0 \le i < M$ and $0 \le j < N$.
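The equivalence between matrix (3) and the 8 binary matrixes can be illustrated with a minimal Python sketch. The function names `bit_planes` and `from_bit_planes` and the 2x3 sample image are illustrative assumptions and do not come from the paper.

```python
def bit_planes(gray):
    """Split a grayscale matrix (values 0..255) into 8 binary matrixes.

    planes[b][i][j] is bit b of pixel (i, j); b = 0 is the LSB, b = 7 the MSB.
    """
    return [[[(pixel >> b) & 1 for pixel in row] for row in gray]
            for b in range(8)]


def from_bit_planes(planes):
    """Rebuild the grayscale matrix from its 8 bit planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[b][i][j] << b for b in range(8)) for j in range(cols)]
            for i in range(rows)]


# Hypothetical 2x3 grayscale image, used only for illustration.
G = [[102, 230, 103],
     [0, 255, 128]]

planes = bit_planes(G)
print(planes[0])  # LSB plane: [[0, 0, 1], [0, 1, 0]]
print(planes[7])  # MSB plane: [[0, 1, 0], [0, 1, 1]]
print(from_bit_planes(planes) == G)  # True
```

Plane 0 of this decomposition is the matrix that LSB-based hiding operates on; the other seven planes are left untouched.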
A very common way of hiding data in a grayscale
image is to substitute the least significant bits (LSB) of the
image. The term "least significant bit" comes from the
numeric significance of the bits in a byte. The high-order,
or most significant, bit (MSB) is the one with the highest
arithmetic value (i.e., $2^7 = 128$), while the low-order, or
least significant, bit is the one with the lowest arithmetic
value (i.e., $2^0 = 1$). Figure 4 shows how changes of the MSB
and the LSB affect the shade of gray. As can be seen, changing
the MSB (01100110 to 11100110) significantly affects
the shade of gray, while changing the LSB (01100110 to
01100111) is almost imperceptible.

Figure 4. Effect of LSB and MSB changes on shades of
gray.
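The arithmetic behind this comparison can be checked with a few lines of Python. The helper name `flip_bit` is an assumption introduced for illustration; the bit pattern 01100110 is the one used in the text.

```python
def flip_bit(value, bit):
    """Return value with the given bit toggled (0 = LSB, 7 = MSB)."""
    return value ^ (1 << bit)


gray = 0b01100110                 # intensity used in the text
msb_flipped = flip_bit(gray, 7)   # 0b11100110
lsb_flipped = flip_bit(gray, 0)   # 0b01100111

print(gray, msb_flipped, lsb_flipped)              # 102 230 103
print(msb_flipped - gray, lsb_flipped - gray)      # 128 1
```

On a scale of 256 gray levels, flipping the MSB moves the pixel by half of the full range, whereas flipping the LSB moves it by a single level.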
The LSB data plane of a grayscale image is formally
represented as a binary matrix:

$$
G(L) = \begin{pmatrix}
G(L)_{0,0} & G(L)_{0,1} & \cdots & G(L)_{0,N-1} \\
G(L)_{1,0} & G(L)_{1,1} & \cdots & G(L)_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
G(L)_{M-1,0} & G(L)_{M-1,1} & \cdots & G(L)_{M-1,N-1}
\end{pmatrix},
$$

where $G(L)_{i,j} \in \{0, 1\}$ for $0 \le i < M$ and $0 \le j < N$.
As a simple example of LSB substitution, imagine
hiding the character "G" across the following eight bytes of
a container file (the LSB is the last bit of each byte):
10010101 00001101 11001001 10010110
00001111 11001011 10011111 00010000
The character "G" is represented in the American
Standard Code for Information Interchange (ASCII) as the
binary string 01000111. These eight bits can be written
into the LSB of each of the eight container bytes as
follows:
10010100 00001101 11001000 10010110
00001110 11001011 10011111 00010001
In the example above, note that only half of the LSBs
were actually changed (those of the first, third, fifth and
eighth bytes). This makes sense, since one set of zeroes and
ones is substituted with another set of zeroes and ones.
Thus the LSB data plane can be used for visually
inconspicuous steganography by replacing LSBs with bits of
hidden data, so that only 1 bit can be hidden in each pixel.
Thus, a 600×600 pixel grayscale image can contain a total
of 360,000 bits (45,000 bytes) of secret data.
The payload can be increased if more than one bit of each
pixel is used for data hiding. However, as the number of bits
used increases, the alteration of the container image becomes
more noticeable and so the security decreases.
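The "G" example and the capacity figure can be reproduced with the following Python sketch of direct LSB substitution. The function names `embed_lsb` and `extract_lsb` are assumptions introduced for illustration; the container bytes are those listed above.

```python
def embed_lsb(container, bits):
    """Overwrite the LSB of each container byte with the next message bit."""
    if len(bits) > len(container):
        raise ValueError("message is longer than the container capacity")
    stego = list(container)
    for k, bit in enumerate(bits):
        stego[k] = (stego[k] & ~1) | bit   # clear the LSB, then set it
    return stego


def extract_lsb(stego, count):
    """Read back the first `count` hidden bits."""
    return [byte & 1 for byte in stego[:count]]


container = [0b10010101, 0b00001101, 0b11001001, 0b10010110,
             0b00001111, 0b11001011, 0b10011111, 0b00010000]
message_bits = [int(b) for b in format(ord("G"), "08b")]  # 0 1 0 0 0 1 1 1

stego = embed_lsb(container, message_bits)
changed = sum(a != b for a, b in zip(container, stego))

print(" ".join(format(b, "08b") for b in stego))
print(changed)                                  # 4
print(extract_lsb(stego, 8) == message_bits)    # True

capacity_bits = 600 * 600                       # one hidden bit per pixel
print(capacity_bits, capacity_bits // 8)        # 360000 45000
```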
As another example, the image shown in Figure 3 can
be used to hide the data given by matrix (2) or,
equivalently, by Figure 2. As shown in Figure 5, such data
hiding changes 39 bits of the LSB data plane.

Figure 5. LSB data plane (a) before and (b) after data
hiding, and (c) the changed bits.
In the basic implementation of this substitution
method, the LSB data plane of the container file is
overwritten with the data to be hidden and, on average, 50%
of the LSBs are flipped [6].
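The 50% figure follows from a short calculation under a simplifying assumption that is not stated in the paper: each container LSB equals 1 with probability $p$, each hidden bit equals 1 with probability $q$, and the two are independent. A bit is flipped exactly when the two values differ:

$$
\Pr[\text{flip}] = p(1 - q) + (1 - p)q, \qquad
q = \tfrac{1}{2} \;\Longrightarrow\; \Pr[\text{flip}] = \tfrac{1}{2}p + \tfrac{1}{2}(1 - p) = \tfrac{1}{2}.
$$

So for hidden data with balanced zeroes and ones the expected share of flipped LSBs is one half whatever the value of $p$. The example of Figure 5 is close to this: 39 changed bits out of $7 \times 11 = 77$ is about 50.6%.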
4.3 Color Image
A color image is a digital image that includes color
information for each pixel. For visually acceptable results,
it is necessary (and almost sufficient) to provide
three samples (color channels) for each pixel. The
RGB (Red, Green and Blue) color space is commonly used,
and in a 24-bit color image each pixel is
represented by 1 byte (8 bits) for each color component; for
example, the definition of a yellow pixel is shown in
Figure 6.
In fact, a color image can be regarded as a
superposition of three grayscale images in the corresponding
color ranges (red, green and blue). Consequently, a color
image can be represented as 3 matrixes similar to (3):

$$
A = \begin{pmatrix}
A_{0,0} & A_{0,1} & \cdots & A_{0,N-1} \\
A_{1,0} & A_{1,1} & \cdots & A_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
A_{M-1,0} & A_{M-1,1} & \cdots & A_{M-1,N-1}
\end{pmatrix},
$$

where $A \in \{R, G, B\}$ is the generalized symbol of color
and $A_{i,j} \in \{0, 1, \ldots, 255\}$ is the color $A$ intensity level of
pixel $(i, j)$ for $0 \le i < M$ and $0 \le j < N$.



Figure 6. Definition of a yellow pixel.
LSB substitution can be used to overwrite legitimate
RGB encodings in a color image. For data hiding, one bit of
each color component can be used, so a total of 3 bits can
be hidden in each pixel. Thus, a 600×600
pixel image can contain a total of 1,080,000 bits
(135,000 bytes) of secret data. Clearly, the technique of
data hiding in a color image differs from that for a
grayscale image only in the use of three separate
matrixes, one for each color component. It can be assumed that
the degree of invisibility of hidden data in these two cases
should be commensurate.
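A small Python sketch makes the per-pixel embedding and the capacity arithmetic concrete. The names `capacity_bits` and `embed_rgb_pixel` are illustrative, and the value (255, 255, 0) for a yellow pixel is an assumption, since the exact values of Figure 6 are not reproduced in the text.

```python
def capacity_bits(rows, cols, channels, bits_per_sample=1):
    """Number of message bits that fit when bits_per_sample LSBs are used."""
    return rows * cols * channels * bits_per_sample


def embed_rgb_pixel(pixel, bits):
    """Replace the LSB of each of the R, G and B samples with one message bit."""
    return tuple((sample & ~1) | bit for sample, bit in zip(pixel, bits))


yellow = (255, 255, 0)   # assumed RGB encoding of a yellow pixel
print(embed_rgb_pixel(yellow, (0, 1, 1)))        # (254, 255, 1)

print(capacity_bits(600, 600, 1))                # 360000 (grayscale)
print(capacity_bits(600, 600, 3))                # 1080000 (24-bit color)
print(capacity_bits(600, 600, 3) // 8)           # 135000 bytes
```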
5 Matrix Representation of Hidden Data
Any digital data regardless of its semantic content
(text, database, executable, etc.) can be represented as a
binary sequence:

$$
S = (S_0, S_1, \ldots, S_{K-1}),
$$

where $S_k \in \{0, 1\}$ is the value of the $k$-th bit and $K$ is the
length of the binary sequence in bits.
The binary sequence might otherwise be represented
as a binary matrix:

$$
S = \begin{pmatrix}
S_{0,0} & S_{0,1} & \cdots & S_{0,N-1} \\
S_{1,0} & S_{1,1} & \cdots & S_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
S_{M-1,0} & S_{M-1,1} & \cdots & S_{M-1,N-1}
\end{pmatrix},
$$

In this case $S_{i,j} = S_k$ at $k = N \cdot i + j$ and
$K = M \cdot N$.
For example the text "MG" can be represented in the
ASCII code as the binary sequence
01001101 01000111
The same text while maintaining the dimensions of the
matrix (2) can be represented as:

$$
B = \begin{pmatrix}
0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}, \tag{4}
$$
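The row-by-row mapping $k = N \cdot i + j$ can be sketched in Python as follows. The function names are illustrative, and two details are assumptions not fixed by the text: unused cells are padded with zeros, and the receiver is taken to know the message length in bytes.

```python
def bytes_to_matrix(data, rows, cols):
    """Lay the bits of `data` out row by row: S[i][j] = S_k with k = cols*i + j.

    Cells beyond the end of the message are padded with zeros (assumption).
    """
    bits = [int(b) for byte in data for b in format(byte, "08b")]
    if len(bits) > rows * cols:
        raise ValueError("message does not fit into the matrix")
    bits += [0] * (rows * cols - len(bits))
    return [bits[i * cols:(i + 1) * cols] for i in range(rows)]


def matrix_to_bytes(matrix, length):
    """Inverse mapping; `length` is the message length in bytes (assumed known)."""
    bits = [b for row in matrix for b in row][:8 * length]
    return bytes(int("".join(map(str, bits[k:k + 8])), 2)
                 for k in range(0, len(bits), 8))


S = bytes_to_matrix(b"MG", rows=7, cols=11)
for row in S:
    print(*row)

print(matrix_to_bytes(S, 2))   # b'MG'
```

The first two printed rows are `0 1 0 0 1 1 0 1 0 1 0` and `0 0 1 1 1 0 0 0 0 0 0`, and the remaining five rows are all zeros, as in matrix (4).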
To ensure high imperceptibility, it is of interest to
consider as hidden data files whose data area has a natural
binary matrix structure. A good example is a binary
bitmap image. In other words, if the hidden data are first
stored in a binary image and then embedded in the
container, they are much harder to detect, and such a
technique can be recommended as a means of rather strong
steganography.
Almost any data can be saved, with some simplification,
as a binary image file using appropriate software. The same
text "MG" can be saved as the binary image shown in
Figure 7.

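Figure 7. The binary image representing the text "MG".
In accordance with Figure 7, the binary matrix
representing the text "MG" will be:

$$
B = \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 0
\end{pmatrix}, \tag{5}
$$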
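That matrix (5) is a picture of the letters rather than their ASCII code can be seen by rendering it as text. In the sketch below the matrix is written row by row from top to bottom and left to right; the helper name `render` and the choice of `#` and `.` as pixel symbols are assumptions made for illustration.

```python
# Matrix (5), top row first, one string of 11 bits per row.
ROWS = [
    "10001001110",
    "11011010001",
    "10101010000",
    "10001010111",
    "10001010001",
    "10001010001",
    "10001001110",
]
B = [[int(c) for c in row] for row in ROWS]


def render(matrix, on="#", off="."):
    """Return one text line per matrix row, drawing 1 as `on` and 0 as `off`."""
    return ["".join(on if bit else off for bit in row) for row in matrix]


for line in render(B):
    print(line)

ones = sum(map(sum, B))
print(ones, len(B) * len(B[0]))   # 34 77
```

The first printed line is `#...#..###.`: the left five columns draw the letter M and the right five columns the letter G.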
In the same way, not only any text but also arbitrary
codes, simple drawings and even handwriting can be
represented as a binary matrix. Graphics software, scanners,
digital cameras, etc. may be used to obtain such a
representation of the data.
Comparing matrixes (4) and (5), it can be seen that the
latter does not contain meaningful text in direct form
and, therefore, in this case it is much more difficult to
detect hidden data in the steganographic containers. In other
words, representing hidden data in the form of binary
matrixes may contribute to a significant increase in the
security of the steganographic system.
6 Proposed Steganographic Algorithm
The majority of steganographic algorithms focus on
introducing as little distortion into the container as
possible, relying on the seemingly intuitive heuristic that
the smaller the embedding distortion is, the more secure the
steganographic scheme becomes. However, recent advances
in steganalysis have clearly shown that this is not the best
approach. The most popular LSB-substitution-based
embedding, with sequential or random message spread, has
been successfully attacked even for very short messages [7,
8, 9]. In essence, LSB embedding is so easily detectable
because it introduces distortion that never occurs naturally
in images and creates an imbalance between appropriately
defined statistical quantities.
A better approach, especially in the case of image
steganography, can be to replace the direct
substitution of LSBs with another, more "natural" operation,
for example addition of the hidden bit to the pixel
intensity value. In other words, data hiding can be done by
adding the corresponding matrixes of the hidden data and the
container. Thus the natural balance between statistical
quantities will be maintained and it will be difficult
to detect the filled container by existing means of
steganalysis.
To hide the data, it is proposed to form the LSB data plane
of the filled container $F(L)$ by matrix addition of the
LSB data plane of the empty container $E(L)$ and the binary
matrix of hidden data $H$:

$$
F(L) = E(L) + H \tag{6}
$$

In this case $F(L)_{i,j} = E(L)_{i,j} + H_{i,j}$ for $0 \le i < M$
and $0 \le j < N$. Given that all these matrixes are
binary, the addition operation ($+$) can be replaced by the
operation of logical addition modulo 2 (exclusive OR,
$\oplus$). So expression (6) can be replaced by:

$$
F(L) = E(L) \oplus H
$$

Consequently $F(L)_{i,j} = E(L)_{i,j} \oplus H_{i,j}$ for $0 \le i < M$
and $0 \le j < N$.
The use of logical addition of binary matrixes
makes the extraction of hidden data identical
to the embedding. The extraction of hidden data can thus
be written as:

$$
H = F(L) \oplus E(L)
$$

Consequently $H_{i,j} = F(L)_{i,j} \oplus E(L)_{i,j}$ for $0 \le i < M$
and $0 \le j < N$.
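The embedding and extraction rules above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: logical addition is taken to be addition modulo 2 (exclusive OR), the names `lsb_plane`, `xor_matrices`, `embed` and `extract` are assumptions, and the 2x4 container and hidden-data matrix are made-up values.

```python
def lsb_plane(image):
    """Binary matrix of the least significant bits of a grayscale matrix."""
    return [[pixel & 1 for pixel in row] for row in image]


def xor_matrices(a, b):
    """Element-wise addition modulo 2 of two equally sized binary matrixes."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrixes must have the same dimensions")
    return [[x ^ y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def embed(empty, hidden):
    """Filled container whose LSB plane is E(L) xor H; other bits are kept."""
    filled_plane = xor_matrices(lsb_plane(empty), hidden)
    return [[(pixel & ~1) | bit for pixel, bit in zip(prow, brow)]
            for prow, brow in zip(empty, filled_plane)]


def extract(filled, empty):
    """H = F(L) xor E(L); the empty container acts as the key."""
    return xor_matrices(lsb_plane(filled), lsb_plane(empty))


# Hypothetical 2x4 grayscale container and hidden-data matrix.
E = [[149, 13, 201, 150],
     [15, 203, 159, 16]]
H = [[0, 1, 0, 0],
     [0, 1, 1, 1]]

F = embed(E, H)
print(F)                     # [[149, 12, 201, 150], [15, 202, 158, 17]]
print(extract(F, E) == H)    # True
```

In this sketch a pixel changes exactly where the hidden bit is 1, and the receiver needs the empty container to recover $H$.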
Another important feature of the proposed algorithm
is the use of the LSB data plane of the empty container as
the key, which achieves perfect secrecy in Shannon's sense
[10]. This means that a steganographic analogue of the
one-time pad, well known from cryptography, can be
constructed on the basis of the proposed algorithm.
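The one-time pad analogy can be made explicit under assumptions that are stated here for illustration only: $E(L)$ is uniformly distributed over all binary $M \times N$ matrixes, is independent of $H$, is unknown to the attacker and is used for a single message. Then for every filled plane $f$ and every hidden matrix $h$:

$$
\Pr[F(L) = f \mid H = h] = \Pr[E(L) = f \oplus h] = 2^{-MN}
\quad\Longrightarrow\quad
\Pr[H = h \mid F(L) = f] = \Pr[H = h].
$$

The left-hand probability does not depend on $h$, so observing $F(L)$ gives no information about $H$, which is Shannon's condition for perfect secrecy. How closely the LSB planes of real images satisfy the uniformity assumption is a separate question that this derivation does not settle.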
7 Experimental Evaluation of Steganographic Algorithms
To assess the effectiveness of the proposed
matrix-addition-based algorithm, it is necessary to compare
it experimentally with known similar algorithms. The direct
(simple) substitution algorithm [11] was selected for
comparison. This choice is justified by the fact that almost
all known LSB substitution algorithms are slight
modifications of direct substitution [7]. Six
different 600×600 pixel test images, stored in both grayscale
and full-color (24-bit) BMP formats, were used as containers.
The test images are shown in Figure 8. During the
experimental evaluation, 100 different 45,000-byte messages
were embedded in the grayscale images and 100 different
135,000-byte messages in the color images. The embedded
messages were chosen randomly and had a variety of
statistical characteristics.
Direct visual comparison of the images before and after
embedding showed that both compared
algorithms provide an acceptable level of imperceptibility.
This is the expected result, since both algorithms change
only the LSBs.


Figure 8. Test images: Armine, Ararat, Tsitsernakaberd,
Khachkar, Yerevan, Ornament.
A qualitative difference between the algorithms can be
identified by algorithmic methods with the
assistance of a computer. The basic idea is that the smaller
the number of bits modified by data embedding, the smaller
the possibility of detecting statistical differences between
the empty and filled containers. Accordingly, the
number of bits of the LSB data plane of the container
that change value as a result of embedding was
calculated. The average values of these results, as
percentages, are shown in Table 1 for the compared
algorithms and all test images.







Table 1. Experimental evaluation results (changed bits of the LSB data plane, %).

Format       Test image                   Direct substitution   Matrix addition
Grayscale    Armine                       52.65                 22.96
Grayscale    Ararat                       50.4                  20.22
Grayscale    Tsitsernakaberd              48.23                 19.65
Grayscale    Khachkar                     51.25                 22.23
Grayscale    Yerevan                      50.26                 23.11
Grayscale    Ornament                     42.32                 18.85
Grayscale    Average                      49.19                 21.17
Full color   Armine                       51.26                 21.8
Full color   Ararat                       49.74                 18.23
Full color   Tsitsernakaberd              46.39                 21.33
Full color   Khachkar                     50.05                 21.95
Full color   Yerevan                      48.86                 22.06
Full color   Ornament                     42.23                 20.86
Full color   Average                      48.09                 21.04
All          Average for all test images  48.64                 21.10
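The kind of measurement reported in Table 1 can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' test harness: the function names are hypothetical, matrix addition is taken modulo 2, and the 2x4 matrixes are made-up values rather than data from the experiment.

```python
def changed_percent(before, after):
    """Percentage of positions at which two binary matrixes differ."""
    total = sum(len(row) for row in before)
    changed = sum(x != y
                  for rb, ra in zip(before, after)
                  for x, y in zip(rb, ra))
    return 100.0 * changed / total


def direct_substitution(e_plane, hidden):
    """Filled LSB plane under direct substitution: the hidden data itself."""
    return [row[:] for row in hidden]


def matrix_addition(e_plane, hidden):
    """Filled LSB plane under matrix addition modulo 2: E(L) xor H."""
    return [[e ^ h for e, h in zip(er, hr)] for er, hr in zip(e_plane, hidden)]


# Hypothetical LSB plane of an empty container and hidden-data matrix.
E_plane = [[1, 1, 1, 0],
           [1, 1, 1, 0]]
H = [[0, 1, 0, 0],
     [0, 0, 0, 1]]

print(changed_percent(E_plane, direct_substitution(E_plane, H)))  # 75.0
print(changed_percent(E_plane, matrix_addition(E_plane, H)))      # 25.0
```

Under this reading, the count for direct substitution is the number of positions where $E(L)$ and $H$ differ, while the count for matrix addition equals the number of ones in $H$.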

8 References
[1] M. L. Thomsen, The Sumerian Language: An Introduction to Its
History and Grammatical Structure. Copenhagen, Akademisk Forlag, 1984.
[2] F. L. Bauer, Decrypted Secrets: Methods and Maxims of Cryptology.
Springer-Verlag, 2002.
[3] E. Cole, Hiding in Plain Sight: Steganography and the Art of
Covert Communication. New York: John Wiley & Sons, 2003.
[4] R. C. Gonzalez, R. E. Woods, Digital Image Processing, 3rd
Edition. Pearson Prentice Hall, 2008.
[5] G. Margarov, V. Markarov, Matrix Model of Digital Steganography,
Proceedings of Workshop on Applications of Information Theory, Coding
and Security, Institute for Experimental Mathematics, University of
Duisburg-Essen, Germany, 2010, pp. 75-78.
[6] H. Wang, S. Wang, Cyber warfare: steganography vs. steganalysis,
Communications of the ACM, Volume 47, Issue 10, October 2004,
pp. 76-82.
[7] J. Fridrich, M. Goljan, R. Du, Detecting LSB Steganography in
Color and Gray-Scale Images, Magazine of IEEE Multimedia, Special
Issue on Security, October-November issue, 2001, pp. 22-28.
[8] S. Dumitrescu, W. Xiaolin, Z. Wang, Detection of LSB
Steganography via Sample Pair Analysis, 5th Information Hiding
Workshop, Noordwijkerhout, Netherlands, 2002, pp. 355-372.
[9] A. Westfeld, Detecting Low Embedding Rates, 5th Information
Hiding Workshop, Noordwijkerhout, Netherlands, 2002, pp. 324-339.
[10] C. Shannon, Communication Theory of Secrecy Systems, Bell System
Technical Journal 28 (4), 1949, pp. 656-715.
[11] C. K. Chan, L. M. Cheng, Hiding data in images by simple LSB
substitution, Pattern Recognition, Volume 37, Issue 3, March 2004,
pp. 469-474.
