
STEGANOGRAPHY & ENCRYPTION IN AUDIO FILES

A PROJECT REPORT Submitted by POOJA . M (41501104058) in partial fulfillment for the award of the degree of BACHELOR OF ENGINEERING in COMPUTER SCIENCE AND ENGINEERING

S.R.M. ENGINEERING COLLEGE, KATTANKULATHUR-603 203, KANCHEEPURAM DISTRICT.

ANNA UNIVERSITY : CHENNAI - 600 025

MAY 2005

BONAFIDE CERTIFICATE

Certified that this project report "STEGANOGRAPHY & ENCRYPTION IN AUDIO FILES" is the bonafide work of "POOJA.M (41501104058)" who carried out the project work under my supervision.

Prof. S.S.SRIDHAR HEAD OF THE DEPARTMENT COMPUTER SCIENCE AND ENGG S.R.M.Engineering College Kattankulathur - 603 203 Kancheepuram District

Mr. T. SABHANAYAGAM SUPERVISOR LECTURER COMPUTER SCIENCE AND ENGG. S.R.M.Engineering College Kattankulathur - 603 203 Kancheepuram District

ABSTRACT

This project deals with two important network security concepts, namely STEGANOGRAPHY and ENCRYPTION. A text file containing the secret information is created. An audio file in WAVE format is the chosen medium to conceal and transmit the secret information. Since the audio file is stored as binary data, the contents of the text file are also converted to a bit stream. The bit stream is then encrypted to cipher text. The encrypted data is embedded in the audio file by mixing the contents together. The encrypted and steganographed file contents are copied into another file for backup purposes. At the other end, the encrypted file is desteganographed and then decrypted. The original text file contents are then viewed. The desired output of the project is the successful transmission of the file, and the audio file should not suffer from any corruption or distortion. The technique implemented on the audio file is LSB insertion, which will be discussed in the forthcoming chapters.

TABLE OF CONTENTS

Abstract
List of Tables
List of Figures
1 Introduction
    1.1 Existing System
    1.2 Proposed System
    1.3 Hardware Requirements
    1.4 Software Requirements
        1.4.1 C language
        1.4.2 Linux System
2 Steganography
    2.1 Definition
    2.2 Hiding Information in an Electronic Image
    2.3 Attacks and Steganalysis
    2.4 Principles of Steganography
        2.4.1 Audio Files as Medium
        2.4.2 Image Files as Medium
    2.5 Types of Image Files
        2.5.1 BMP Files
        2.5.2 GIF/PNG Files
    2.6 Terrorists and Steganography
3 Cryptography
    3.1 Definition
    3.2 Basic Terms in Cryptography
        3.2.1 ClearText, PlainText
        3.2.2 Encryption
        3.2.3 Ciphertext
        3.2.4 Key
        3.2.5 Decryption
        3.2.6 Key
        3.2.7 Cryptanalysis
        3.2.8 Symmetric Cryptography
        3.2.9 Asymmetric Cryptography
        3.2.10 Digital Signatures
    3.3 Cryptography through the Ages
        3.3.1 Ancient times
        3.3.2 Cryptography through the Middle Ages
    3.4 Algorithms in Cryptography
        3.4.1 Public Key Cryptography
        3.4.2 DES
    3.5 Applications in Cryptography
    3.6 Cryptographic Algorithms used in e-commerce
4 Audio Files as WAV Files
    4.1 WAV Format
    4.2 RIFF - Resource Interchange File Format
5 Implementation
    5.1 Introduction
    5.2 MODULE 1 - Header Information
    5.3 MODULE 2 - File Handling Operations
    5.4 MODULE 3 - Steganography & Encryption
    5.5 MODULE 4 - Desteganography & Decryption
    5.6 MODULE 5 - Error Handling
APPENDIX
    6.1 Creation of WAV file
    6.2 Steganography and Encryption
    6.3 Desteganography and Decryption
REFERENCES

LIST OF TABLES

Table 1: Encryption Table
Table 2: Cryptographic Algorithms
Table 3: RIFF
Table 4: fmt subchunk

LIST OF FIGURES

Figure 1: Radical Color Distortion
Figure 2: Steganography & Encryption in Audio Files
Figure 3: Pointer use for a variable
Figure 4: Pointer use for an array
Figure 5: Attack and Recovery
Figure 6: Steganographic Process
Figure 7: Radical Color Distortion
Figure 8: Modern Hieroglyphic writing
Figure 9: Roger Bacon
Figure 10: Working of DES-A
Figure 11: Working of DES-B

INTRODUCTION

Steganography and Encryption in Audio Files

1.1 Existing System

The existing system mainly uses image files or audio files without implementing encryption. The main disadvantage of using an image file is its inability to store a large amount of information. Since the image files are BMP, GIF or PNG files, the resulting stego-medium looks distorted owing to the large amount of information stored inefficiently. An example of distortion in image files is shown below.

FIGURE 1 - Radical Color Distortion

With audio files, desteganography is a very tedious process, so the chances of an attacker recovering information from an audio file are very slim. A major advantage of using audio files as a medium for hiding information is their ability to store a large amount of data efficiently. Normally, audio files in WAV format are used to store information.

1.2 Proposed System

The proposed system not only hides the information but also makes the secret or confidential information unrecognizable to an outsider. The project has been written in the C language on the LINUX platform. The proposed system is described as follows.

First, a text file containing the secret information is created. After the necessary file has been created, an audio file in WAV format is created to serve as the cover medium. A WAV file is a RIFF file, a format that helps multimedia applications store and handle all kinds of data effectively. The WAV file consists of a WAV chunk, which consists of 2 sub chunks, namely fmt and data. The first sub chunk gives details such as the size of the audio file and the type of information being stored. The second sub chunk contains the actual data.

The audio file is the cover medium that is going to be used to store the data. The contents of the audio file are binary data, usually viewed in hexadecimal. Hence, the contents of the secret information are converted to a bit stream by calling a suitable function. The bit stream contents are then converted to ciphertext. We do not use any standard algorithm because of their time-consuming nature. The ciphertext is then mixed with the contents of the audio file by applying the LSB insertion technique. The audio file is now ready to be transmitted to the receiver side.

Before the transmission takes place, the original audio file is played through a speaker. The audio file is played again after the techniques of Encryption and Steganography have been applied to it. The audio file must sound exactly as it sounded earlier, thereby giving no hint of any information being concealed. As mentioned before, desteganography is highly tedious. Hence, the chances of recovering information from the audio file are extremely slim. However, even if a determined hacker is able to recover the information, he will be holding a file of junk values because of the encryption applied to the secret information.

At the receiver side, the cipher text is first recovered from the audio file. This is desteganography. Then the ciphertext is converted back into the original contents. This is decryption. Throughout the process, care is taken to ensure that the audio file does not suffer from any noise or corruption. The main advantage of the proposed system is that dual network security is ensured and a large amount of information can be stored effectively.

FIGURE 2 - Steganography and Encryption in Audio File

1.3 Hardware requirements

1) Monitor
2) Keyboard
3) Mouse
4) Speaker
5) CPU

1.4 Software requirements

1.4.1 C Language

Main() and Header Files

A C program contains functions and variables. The functions specify the tasks to be performed by the program. The "main" function establishes the overall logic of the code. It is normally kept short and calls different functions to perform the necessary sub-tasks. Every C program must have a "main" function. A typical first program, hello.c, calls printf, an output function from the I/O (input/output) library (declared in the file stdio.h). The original C language did not have any built-in I/O statements whatsoever, nor did it have much arithmetic functionality; it was not really intended for "scientific" or "technical" computation. These functions are now performed by standard libraries, which are part of ANSI C. The K & R textbook lists the contents of these and other standard libraries in an appendix.

The printf line prints the message "Hello World" on "stdout" (the output stream corresponding to the terminal window in which you run the code); "\n" prints a "new line" character, which brings the cursor onto the next line. By construction, printf never inserts this character on its own, so the following program produces the same result as a single call printf("\nHello World\n"):

    #include <stdio.h>

    int main(void)
    {
        printf("\n");
        printf("Hello World");
        printf("\n");
        return 0;
    }

Try leaving out the "\n" lines and see what happens.

The first statement "#include <stdio.h>" includes a specification of the C I/O library. All variables and functions in C must be declared before use: the ".h" files are by convention "header files" which contain the declarations of variables and functions necessary for the functioning of a program, whether in a user-written section of code or as part of the standard C libraries. The directive "#include" tells the C compiler to insert the contents of the specified file at that point in the code. The "<...>" notation instructs the compiler to look for the file in certain "standard" system directories.

The int preceding "main" indicates that main returns an integer, which the operating system receives as the exit status of the program; "return 0" signals success. The ";" denotes the end of a statement. Blocks of statements are put in braces {...}, as in the definition of functions. All C statements are written in free format, i.e., with no specified layout or column assignment.

Standard Variable Types

C uses the following standard variable types:

    int    -> integer variable
    short  -> short integer
    long   -> long integer
    float  -> single precision real (floating point) variable
    double -> double precision real (floating point) variable
    char   -> character variable (single byte)

The compiler checks for consistency in the types of all variables used in any code. This feature is intended to prevent mistakes, in particular mistyped variable names. Calculations in the math library routines are usually done in double precision arithmetic (64 bits on most workstations). The actual number of bytes used in the internal storage of these data types depends on the machine being used.

The printf function can be instructed to print integers, floats and strings properly. The general syntax is

    printf( "format", variables );

where "format" specifies the conversion specification and variables is a list of quantities to print. Some useful formats are

    %nd    -> integer (optional n = number of columns; %0nd pads with zeroes)
    %m.nf  -> float or double (optional m = number of columns, n = number of decimal places)
    %ns    -> string (optional n = number of columns)
    %c     -> character
    \n \t  -> introduce a new line or a tab
    \a     -> ring the bell ("beep") on the terminal

Loops

Most real programs contain some construct that loops within the program, performing repetitive actions on a stream of data or a region of memory. There are several ways to loop in C. Two of the most common are the while loop:

    while (expression)
    {
        ...block of statements to execute...
    }

and the for loop:

    for (expression_1; expression_2; expression_3)
    {
        ...block of statements to execute...
    }

The while loop continues to loop until the conditional expression becomes false. The condition is tested upon entering the loop. Any logical construction (see below for a list) can be used in this context.

The for loop is a special case, and is equivalent to the following while loop:

    expression_1;
    while (expression_2)
    {
        ...block of statements...
        expression_3;
    }

For instance, the following structure is often encountered:

    i = initial_i;
    while (i <= i_max)
    {
        ...block of statements...
        i = i + i_increment;
    }

This structure may be rewritten in the easier syntax of the for loop as:

    for (i = initial_i; i <= i_max; i = i + i_increment)
    {
        ...block of statements...
    }

Infinite loops are possible (e.g. for(;;)), but not too good for your computer budget! C permits you to write an infinite loop, and provides the break statement to "break out" of the loop. For example, consider the following (admittedly not-so-clean) loop:

    angle_degree = 0;
    for ( ; ; )
    {
        ...block of statements...
        angle_degree = angle_degree + 10;
        if (angle_degree == 360) break;
    }

The conditional if simply asks whether angle_degree is equal to 360 or not; if yes, the loop is stopped.

Symbolic Constants

You can define constants of any type by using the #define compiler directive. Its syntax is simple. For instance,

    #define ANGLE_MIN 0
    #define ANGLE_MAX 360

would define ANGLE_MIN and ANGLE_MAX to the values 0 and 360, respectively. C distinguishes between lowercase and uppercase letters in variable names. It is customary to use capital letters in defining global constants.

Conditionals

Conditionals are used within the if and while constructs:

    if (conditional_1)
    {
        ...block of statements executed if conditional_1 is true...
    }
    else if (conditional_2)
    {
        ...block of statements executed if conditional_2 is true...
    }
    else
    {
        ...block of statements executed otherwise...
    }

and any variant that derives from it, either by omitting branches or by including nested conditionals. Conditionals are logical operations involving comparison of quantities (of the same type) using the conditional operators:

    <   smaller than
    <=  smaller than or equal to
    ==  equal to
    !=  not equal to
    >=  greater than or equal to
    >   greater than

and the boolean operators

    &&  and
    ||  or
    !   not

Another conditional use is in the switch construct:

    switch (expression)
    {
        case const_expression_1:
        {
            ...block of statements...
            break;
        }
        case const_expression_2:
        {
            ...block of statements...
            break;
        }
        default:
        {
            ...block of statements...
        }
    }

Pointers

The C language allows the programmer to "peek and poke" directly into memory locations. This gives great flexibility and power to the language, but it is also one of the great hurdles that the beginner must overcome in using the language. All variables in a program reside in memory; the statements

    float x;
    x = 6.5;

request that the compiler reserve 4 bytes of memory (on a 32-bit computer) for the floating-point variable x, then put the value 6.5 in it. Sometimes we want to know where a variable resides in memory. The address (location in memory) of any variable is obtained by placing the operator "&" before its name. Therefore &x is the address of x. C allows us to go one stage further and define a variable, called a pointer, that contains the address of (i.e. "points to") other variables. For example:

    float x;
    float* px;
    x = 6.5;
    px = &x;

defines px to be a pointer to objects of type float, and sets it equal to the address of x:

FIGURE 3 - Pointer use for a variable

The content of the memory location referenced by a pointer is obtained using the "*" operator (this is called dereferencing the pointer). Thus, *px refers to the value of x.

Arrays

Arrays of any type can be formed in C. The syntax is simple:

    type name[dim];

In C, arrays start at position 0. The elements of the array occupy adjacent locations in memory. C treats the name of the array as if it were a pointer to the first element; this is important in understanding how to do arithmetic with arrays. Thus, if v is an array, *v is the same thing as v[0], *(v+1) is the same thing as v[1], and so on:

FIGURE 4 - Pointer use for an array

Consider the following code, which illustrates the use of pointers:

    #include <stdio.h>

    #define SIZE 3

    int main(void)
    {
        float x[SIZE];
        float *fp;
        int i;

        /* initialize the array x; use a "cast" to force i */
        /* into the equivalent float */
        for (i = 0; i < SIZE; i++)
            x[i] = 0.5*(float)i;

        /* print x */
        for (i = 0; i < SIZE; i++)
            printf(" %d %f \n", i, x[i]);

        /* make fp point to array x */
        fp = x;

        /* print via pointer arithmetic: members of x are adjacent */
        /* to each other in memory, and *(fp+i) refers to the content */
        /* of memory location (fp+i), that is, x[i] */
        for (i = 0; i < SIZE; i++)
            printf(" %d %f \n", i, *(fp+i));

        return 0;
    }

(The expression "i++" is C shorthand for "i = i + 1".) Since x[i] means the i-th element of the array x, and fp = x points to the start of the x array, *(fp+i) is the content of the memory address i locations beyond fp, that is, x[i].

Functions

Functions are easy to use; they allow complicated programs to be parcelled up into small blocks, each of which is easier to write, read, and maintain. We have already encountered the function main and made use of I/O routines from the standard libraries. Now let's look at some other library functions, and how to write and use our own.

Calling a Function

The call to a function in C simply entails referencing its name with the appropriate arguments. The C compiler checks for compatibility between the arguments in the calling sequence and the definition of the function. Library functions are generally not available to us in source form. Argument type checking is accomplished through the use of header files (like stdio.h) which contain all the necessary information. For example, in order to use the standard mathematical library you must include math.h via the statement

    #include <math.h>

at the top of the file containing your code. The most commonly used header files are

    <stdio.h>  -> defining I/O routines
    <ctype.h>  -> defining character manipulation routines
    <string.h> -> defining string manipulation routines
    <math.h>   -> defining mathematical routines
    <stdlib.h> -> defining number conversion, storage allocation and similar tasks
    <stdarg.h> -> defining facilities to handle routines with variable numbers of arguments
    <time.h>   -> defining time-manipulation routines

In addition, the following header files exist:

    <assert.h> -> defining diagnostic routines
    <setjmp.h> -> defining non-local jumps
    <signal.h> -> defining signal handlers

1.4.2 Linux System

Linux is an operating system that was initially created as a hobby by a young student, Linus Torvalds, at the University of Helsinki in Finland. Linus had an interest in Minix, a small UNIX system, and decided to develop a system that exceeded the Minix standards. He began his work in 1991 when he released version 0.02 and worked steadily until 1994 when version 1.0 of the Linux Kernel was released. The kernel, at the heart of all Linux systems, is developed and released under the GNU General Public License and its source code is freely available to everyone.
It is this kernel that forms the base around which a Linux operating system is developed. There are now hundreds of companies and organizations, and an equal number of individuals, that have released their own versions of operating systems based on the Linux kernel. More information on the kernel can be found at LinuxHQ and at the official Linux Kernel Archives. The current full-featured version is 2.6 (released December 2003) and development continues. Apart from the fact that it is freely distributed, Linux's functionality, adaptability and robustness have made it the main alternative to proprietary Unix and Microsoft operating systems. IBM, Hewlett-Packard and other giants of the computing world have embraced Linux and support its ongoing development. More than a decade after its initial release, Linux is being adopted worldwide, primarily as a server platform. Its use as a home and office desktop operating system is also on the rise. The operating system can also be incorporated directly into microchips in a process called "embedding" and is increasingly being used this way in appliances and devices.

STEGANOGRAPHY

INTRODUCTION

2.1 Definition

Steganography is often compared to cryptography in its ability to restrict unauthorized access to information. Steganography must not be confused with cryptography, where we transform the message so as to make its meaning obscure to a person who intercepts it, so that only the intended person can decrypt it. Such protection is often not enough. While transmitting an encrypted message, it is obvious that some form of communication has occurred, even if the message cannot be read. Creative methods have been devised in the hiding process to reduce the visible detection of the embedded messages. Hiding information in electronic media used as carriers requires alterations of the media properties, which may introduce some form of degradation.

2.2 HIDING INFORMATION IN AN ELECTRONIC IMAGE

At the most fundamental level, computers use binary, a combination of zeroes and ones, to represent text and graphics. The American Standard Code for Information Interchange (ASCII) is the de facto standard for representing text and certain control characters. ASCII uses one parity bit and seven data bits to represent each character in the English language. In a digital image, each pixel contains information as to the intensity of the three primary colors red, green and blue. This information can be stored in a single byte (8 bits) or in three bytes (24 bits). For example, in an 8-bit grayscale image, white is represented by the binary value 11111111 and black by 00000000. Current information hiding techniques rely on the use of a cover object (image, document, sound file etc.) known as the carrier. The secret information is broken down into its individual bits, which are then embedded in the cover object. The result of the process is known as the stego-object.

2.3 ATTACKS AND STEGANALYSIS

Two aspects of attacks on steganography are detection and destruction of the embedded message. Any image can be manipulated with the intent of destroying some hidden information, whether an embedded message exists or not. Hidden information can be destroyed by noise addition, changing formats and lossy compression, or by manipulation, i.e. by blurring, sharpening, rotating, resizing and stretching operations. Detecting the existence of a hidden message saves time in the message elimination phase, since only those images that contain hidden information need to be processed. The figure below depicts the message attack and recovery process.

FIGURE 5 - Attack and Recovery

Incorporating robustness into the stego-algorithm, though at the cost of lower data capacity, saves the secret information from being destroyed in case of an attack. It upholds the basic motive for securing the traffic: the message should reach the receiver undetected and unaltered.

With careful selection of an appropriate cover image and stego tool, it is possible to create a stego-image that does not appear to be different within the limits of human perception. Discovering a hidden message is the first step in steganalysis and is an attack on the hidden information. Attacks may come in several different forms depending on what information is available to the steganalyst. Images created with Transform Domain tools require a more aggressive approach in order to disable the hidden information. Although they can survive any single image manipulation, multiple manipulations applied to the same image have defeated all the known tools. Image manipulation includes techniques such as: cropping (removing portions of the image); rotating the image; blurring (decreasing the contrast between pixels); sharpening (increasing the contrast between pixels, the opposite of blurring); adding or removing noise; resampling; converting between bit densities (gray, 8 bit, 24 bit); converting from digital to analog to digital; adding bit-wise messages; adding transform messages.

2.4 PRINCIPLES OF STEGANOGRAPHY

The following formula provides a very generic description of the flow of the steganographic process:

    Cover_medium + hidden_data + stego_key = stego_medium

A key typically parameterizes the embedding; without knowledge of this key (or a related one) it is difficult for a third party to detect or remove the embedded material. Once the cover object has material embedded in it, it is called a stego-object. The steganographic process is depicted in the figure below. Here, the cover medium is denoted by c, the secret message by m, the stego-object by s and the key by k.


FIGURE 6 - Steganographic Process

ALGORITHMS USED IN STEGANOGRAPHY

2.4.1 Audio Files as a medium

There are 4 methods for embedding hidden data into an audio file; three of them are described here.

Low Bit Encoding

In this method, binary data is hidden in the least significant bits of the audio file. Wave files are typically characterized by their sample rate and bits per sample: the higher these values, the higher the quality. A stereo file has interleaved data. The first sample point of the left channel is stored first. Next, the first sample point of the right channel is stored, followed by the second sample point of the left channel. Then the second sample point of the right channel is stored, and so on. This is what is meant by interleaved data. Wave files typically store uncompressed, and therefore lossless, audio, which allows the software to reconstruct the original sound exactly. While MPEG can be used for stego applications, it is more common to embed data in WAV files. In this method, the binary representation of the data is taken and the LSB of each byte within the cover file is overwritten, very similar to the method followed for image files.

The flow of the standard Low Bit Encoding is shown below:

1. Obtain the cover file and the information.
2. Determine the offset of the wav data in the cover file.
3. Copy the data from the cover file to the steg wav file up to the offset byte.
4. Insert each bit of the information into the least significant bit of each data byte of the cover file.
5. Repeat step 4 until the end of the information.
6. Copy the remaining cover wav data to the steg wav file.

The information is recovered by extracting the LSBs of the steg data and recombining them.

Phase Coding

The phase coding method works by substituting the phase of the initial audio segment with a reference phase that represents the data. The procedure for phase coding is as follows. The original sound sequence is broken down into a series of N short segments. A Discrete Fourier Transform (DFT) is applied to each segment, and a matrix of the phases and magnitudes is created. The phase difference between each pair of adjacent segments is calculated. For segment S0, the first segment, an artificial absolute phase p0 is created. For all other segments, new phase frames are created. The new phase and original magnitude are combined to get a new segment Sn. Finally, the new segments are concatenated to create the encoded output. The receiver must know the length of the segment and the data interval. The first segment would be detected as 0 or 1, and that would indicate where the binary coded message starts.

Echo Data Hiding

Echo data hiding embeds data into a host signal by introducing an echo. The data are hidden by varying 3 parameters of the echo: initial amplitude, decay rate, and offset (or delay). As the offset between the original and the echo decreases, the 2 signals blend. At a certain point, the human ear cannot distinguish between the 2 signals and the echo is merely heard as added resonance.

2.4.2 Image Files as medium

LSB Based Steganography

The simplest approach to hiding data within an image file is called least significant bit (LSB) insertion. In this method, we take the binary representation of the hidden data and overwrite the LSB of each byte within the cover image. If we are using 24-bit color, the amount of change will be minimal and indiscernible to the human eye. The idea behind LSB based steganography is that instead of eliminating the redundant information, you replace it with other data. For example, suppose the first eight bytes of an image were:

    10001001 11101001 11101001 10011011 10011011 10001001 00011111 00011101

A simple steganographic program could hide the letter S (01010011) by changing the least significant bit in each of the first eight bytes to reflect the binary letter. The result:

    10001000 11101001 11101000 10011011 10011010 10001000 00011111 00011101

This technique thus substitutes the less significant parts of the cover media with the message bits to be embedded. Images typically use either 8-bit or 24-bit color. When using 8-bit color, there is a definition of up to 256 colors forming the palette for the image, each color denoted by an 8-bit value, i.e. each pixel is represented as one byte (8 bits). What matters is not whether the image is gray-scale or not, but the degree to which the colors change between adjacent bit values.

Now if the message were embedded as it is into the LSBs of the cover image, then the resultant structure in the LSB plane of the stego-image would be a clear giveaway. Hence, to maintain a random-looking appearance of the LSBs, the message should be encrypted before it is embedded. In fact, a more sophisticated approach would be to randomly insert the encrypted message only into a subset of the pixels in the cover image. In this case, the natural question that arises is how many bits can be embedded before the warden is able to reliably distinguish between cover images and stego-images.

A typical image with 640 x 480 pixels and 256 colors (8 bit) holds about 300 kilobytes of pixel data, so with one hidden bit per byte it can carry roughly 37.5 kilobytes of information. A high-resolution image of 1024 x 768 pixels in 24-bit color holds about 2.3 megabytes of pixel data and could carry roughly 288 kilobytes. Due to the potentially large size of such files, compression algorithms are used to reduce the image to a suitable size for sending across the Internet.

Standard LSB Insertion Technique Algorithm

1. Obtain the cover image (BMP image) and the information (generally text).
2. Determine the offset of the image data in the cover image.
3. Copy the data from the cover image to the steg image up to the offset byte.
4. Insert each bit of the information into the least significant bit of each data byte of the cover image.
5. Repeat step 4 until the end of the information.
6. Recovery of the information is done by extracting and combining the LSBs of the image data.

Transform Domain Techniques

These techniques manipulate the cover and embed the secret message in the transform domain. Popular transforms used for information hiding are the DFT (Discrete Fourier Transform), the DCT (Discrete Cosine Transform) and the DWT (Discrete Wavelet Transform). Transform domain techniques offer higher robustness to noise and attack. Changes made in the cover using these transforms are difficult for the human sensory system to perceive. However, they are computationally more expensive and can create problems in the message extraction process.

To encode hidden data:

1. Take the DCT or wavelet transform of the cover image.
2. Find the coefficients below a certain threshold.
3. Replace bits of these coefficients with the bits to be hidden (LSB insertion can be used).
4. Take the inverse transform.
5. Store the result as a regular image.

To decode hidden data:

1. Take the transform of the modified image.
2. Find the coefficients below a certain threshold.
3. Extract bits of data from these coefficients.
4. Combine the bits into an actual message.

Redundant Pattern Encoding

The idea behind Redundant Pattern Encoding is to paint a small message over an image many times. An advantage of this method is that it can withstand cropping. A disadvantage is that embedding large messages is not possible.

The disadvantage of using image files in steganography is depicted below.

FIGURE 7 - Radical Color Distortion

2.5 Types of Image Files

2.5.1 BMP FILES

Each pixel in a BMP image can be represented by 8 bits, 16 bits or 24 bits. The color variation for a pixel is derived from the 3 primary colors red, green and blue. A BMP image in general has 54 bytes of header information with details such as image size, frame width and height, and resolution. In a 24-bit BMP image, each pixel is represented by 3 bytes, one for each primary color. Similarly, each pixel in a 16-bit BMP image is represented by 2 bytes and a reserve byte. Neither format needs any palette table information. On the other hand, an 8-bit BMP image has a separate palette table, since it has only 256 color combinations. Thus, each pixel in the image points to the corresponding color combination in the palette table.

2.5.2 GIF/PNG Files

BMP images are of huge size, which makes them difficult to transfer and hence a less popular medium over the internet. With third-party software, the steg-images are converted to other file formats. GIF and PNG are image formats used on the web that provide lossless compression. They are thus preferred when there is a requirement that the original information remains intact. Lossy compression, on the other hand, reduces a file by permanently eliminating certain information, especially redundant information. When the file is uncompressed, only a part of the original information is still there. Using JPEG compression, the creator can decide how much loss to introduce and make a tradeoff between file size and image quality. The graphic formats chosen for conversion are GIF and PNG, because JPG files may cause loss of embedded data during compression. And because PNG files provide much more reliable conversion than GIF files, the favourite medium is the PNG file.


Some of the PNG file features are as follows:
1) Indexed-color images of up to 256 colors.
2) Streamability: files can be read and written serially, thus allowing the file format to be used as a communication protocol for on-the-fly generation and display of images.
3) Ancillary information: textual comments and other data can be stored within the image file.
4) Complete hardware and platform independence.
5) Effective, 100% lossless compression.
6) Full alpha channel (general transparency masks).
7) Reliable, straightforward detection of file corruption.
8) Faster initial presentation in progressive display mode.
9) Simple and portable: developers should be able to implement PNG easily.
10) Legally unencumbered: to the best knowledge of the PNG authors, no algorithms under legal challenge are used.
11) Well compressed: both indexed-color and true-color images are compressed as effectively as in any other widely used lossless format, and in most cases more effectively.
12) Interchangeable: any standard-conforming PNG decoder must read all conforming PNG files.

2.6 Terrorists and Steganography
There is a general belief that some of the plans for the September 11 attacks were hidden in images and posted on sports and pornographic bulletin boards.

Known communications
The Al-Qaida terrorist network has been known to use encryption. They receive money from Muslim sympathizers, buy computers, and then go online and download encryption programs from the web. The following brief accounts describe instances where terrorists have used some sort of encryption:
1) Wadih El Hage, one of the suspects in the 1998 bombing of the two U.S. embassies in East Africa, sent encrypted emails under various names, including Norman and Abdus Sabbur.
2) Khalil Deek, an alleged terrorist arrested in Pakistan in 1999, used encrypted computer files to plot bombings in Jordan at the turn of the millennium, U.S. officials say.

CRYPTOGRAPHY
Introduction

3.1 Definition
Cryptography is the science that covers a diverse field of problems related to encryption and decryption techniques, privacy of communication, authentication, digital signatures and much more. Its main task, however, is the constant quest to make the exchange of information totally secure. As such, its task has not changed for centuries. From secret hieroglyphic writing, through Julius Caesar's "Caesar cipher" and the German Enigma, to the latest public-key systems, scientists and practitioners around the world, known as cryptographers, have pursued this quest of hiding information from unauthorized eyes.

3.2 Basic terms used in Cryptography

3.2.1 Clear text, plaintext
The text or message that we want to secure (the "input") is called clear text or plaintext.

3.2.2 Encryption
The transformation of the contents of a message in such a way that its original content is hidden from outsiders is known as encryption. It is carried out by an algorithm.

3.2.3 Cipher text
The message that has been transformed, or encrypted, is called cipher text.

3.2.4 Key
Encryption and decryption usually make use of a key, and the coding method is such that decryption can be performed only by someone who knows the proper key.

3.2.5 Decryption The transformation of cipher text back to plaintext, where the original content has been retrieved is called decryption.

3.2.6 Key The short text, number or message that unequivocally describes the transformation (encryption or decryption) in the current encrypting algorithm is called the key.

3.2.7 Cryptanalysis The science of retrieving the plaintext without the knowledge of the key (breaking the ciphers) is called Cryptanalysis.

3.2.8 Symmetric cryptography
The branch of cryptography that deals with symmetric encrypting algorithms. Symmetric encrypting algorithms use the same key for both encryption and decryption. Sometimes they are called secret-key algorithms. A well-known example of a symmetric algorithm is DES.

3.2.9 Asymmetric cryptography
The branch of cryptography that deals with asymmetric encrypting algorithms. Asymmetric encrypting algorithms use a pair of interrelated keys. One element of the pair is used to encrypt the message, while the other is used to decrypt it. Sometimes they are called public-key algorithms. The elements of the pair are usually called the public and private keys. A well-known example of an asymmetric algorithm is RSA.

3.2.10 Digital signature
Many asymmetric encrypting algorithms can be used to digitally sign a message. The digital signature is a piece of data created from the private (secret) key and the message. For a given private key there is only one (unique) public key that can be used to verify that the data (the digital signature of the message) was really generated using the original private key and the original message.

3.3 Cryptography through the ages

3.3.1 Ancient times
It has been discovered that, as early as the 3rd millennium BC, some Egyptian hieroglyphic texts were very strange. They did not exhibit typical word groups and, at the same time, they had many signs not found in the Egyptian hieroglyphic canon. Many researchers attribute that to very early cryptography, a secret writing. However, it is not clear whether the purpose of such "encryption" was really to "encrypt". Some researchers think it could have been a kind of eye catcher, to attract the reader to find the pleasure of deciphering it! Nevertheless, it is still a very early form of encryption.

FIGURE 8 -modern hieroglyphic writing

A piece of the Rosetta Stone, a large stone with hieroglyphic text that was used in the effort to decipher the hieroglyphic language and writing.

Spartans
While we are not sure whether Egyptian hieroglyphs were indeed used for secrecy of communication, we know that the first recorded use of an encrypting technology comes from the Spartans. Around 500 BC they invented a device called the Spartan scytale. It was a kind of conical baton with exact dimensions.

To encrypt a message, the sender wound a belt of leather or parchment around the baton and wrote the message on it. The belt was then unwrapped and transported to the receiver. As no one else knew the size of the baton, the message was hard to read. The receiver knew the dimensions, so when he used the right baton, the message could be decrypted.

Julius Caesar cipher
Julius Caesar, the last dictator of Rome (100 BC - 44 BC), was not only a brilliant warrior and Roman politician. He is known for reforming Rome's finances and for many other achievements. For students of cryptography he is certainly known for the simplest encryption algorithm, the Caesar cipher. Historians report that he used this method to communicate with his army officers and keep the messages secure.

What is the Caesar cipher and how could it be used in ancient times? It is the simplest form of a so-called substitution cipher. In a substitution cipher every letter of the plaintext is substituted by some other letter or symbol. The substitution in the Caesar cipher is based on a SHIFT operation applied to the letters of the plaintext. For example, if the shift is 3, then the order:
"CROSS THE RUBICON"
is sent to an army general as:
"FURVV WKH UXELFRQ"
As you can easily see, the letter C was shifted through D and E to F, R to U, O to R and so on.
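The shift operation described above can be sketched in a few lines of C. This is only an illustration: the function name `caesar_shift` and its signature are assumptions, not part of the project code.

```c
#include <ctype.h>
#include <stddef.h>

/* Shift every letter of text forward by shift positions, wrapping from
 * Z back to A. Characters that are not letters are left unchanged.
 * (Hypothetical helper written for illustration.) */
void caesar_shift(char *text, int shift)
{
    size_t i;

    /* Normalize the shift into the range 0..25. */
    shift = ((shift % 26) + 26) % 26;

    for (i = 0; text[i] != '\0'; i++) {
        unsigned char c = (unsigned char)text[i];
        if (isupper(c))
            text[i] = (char)('A' + (c - 'A' + shift) % 26);
        else if (islower(c))
            text[i] = (char)('a' + (c - 'a' + shift) % 26);
    }
}
```

Calling `caesar_shift` with a shift of 3 on a buffer holding "CROSS THE RUBICON" turns it into "FURVV WKH UXELFRQ"; calling it again with a shift of 23 (that is, 26 - 3) restores the original order.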

3.3.2 Cryptography in the Middle Ages
Contrary to popular views, during the Middle Ages mathematics and some other sciences flourished at many European schools. Cryptography was always an area of interest to medieval scholars. For example, in the 13th century, Roger Bacon, an English monk and an excellent mathematician of his time, wrote about the necessity of ciphers in his work "Concerning the Marvelous Power of Art and of Nature and Concerning the Nullity of Magic": "a man is crazy who writes a secret in any other way than one which will conceal it from the vulgar". And he listed seven cipher methods in his work.

FIGURE 11 - Roger Bacon

Later, almost all European governments used cryptographic techniques to stay informed by their distant representatives, such as ambassadors and army generals. These practices were very well organized in Italy, where a special government organization was established in the 15th century to help with cryptographic techniques. But the first "manual" on cryptography was Gabriele de Lavinde's collection of ciphers, published in 1379. For more than 450 years this was regarded as the nomenclatural system for ciphers.

One of the most important medieval works on cryptography comes from the Frenchman Blaise de Vigenère, who in 1586 published the 600-page "Traicté des chiffres". In this book he published the famous "Vigenère tableau" algorithm, which was a great enhancement of Julius Caesar's substitution cipher.


ABCDEFGHIJKLMNOPQRSTUVWXYZ
BCDEFGHIJKLMNOPQRSTUVWXYZA
CDEFGHIJKLMNOPQRSTUVWXYZAB
DEFGHIJKLMNOPQRSTUVWXYZABC
EFGHIJKLMNOPQRSTUVWXYZABCD
FGHIJKLMNOPQRSTUVWXYZABCDE
GHIJKLMNOPQRSTUVWXYZABCDEF
HIJKLMNOPQRSTUVWXYZABCDEFG
IJKLMNOPQRSTUVWXYZABCDEFGH
JKLMNOPQRSTUVWXYZABCDEFGHI
KLMNOPQRSTUVWXYZABCDEFGHIJ
LMNOPQRSTUVWXYZABCDEFGHIJK
MNOPQRSTUVWXYZABCDEFGHIJKL
NOPQRSTUVWXYZABCDEFGHIJKLM
OPQRSTUVWXYZABCDEFGHIJKLMN
PQRSTUVWXYZABCDEFGHIJKLMNO
QRSTUVWXYZABCDEFGHIJKLMNOP
RSTUVWXYZABCDEFGHIJKLMNOPQ
STUVWXYZABCDEFGHIJKLMNOPQR
TUVWXYZABCDEFGHIJKLMNOPQRS
UVWXYZABCDEFGHIJKLMNOPQRST
VWXYZABCDEFGHIJKLMNOPQRSTU
WXYZABCDEFGHIJKLMNOPQRSTUV
XYZABCDEFGHIJKLMNOPQRSTUVW
YZABCDEFGHIJKLMNOPQRSTUVWX
ZABCDEFGHIJKLMNOPQRSTUVWXY

TABLE 1 - Encryption Table

The table was used to encrypt messages in the following way:
1) A secret key, composed of some number of characters, was chosen.
2) The key was written ABOVE the plaintext, repeatedly. If the key was set to FRANCE and the clear text was "To see a World in a Grain of Sand", the first step was:

FR ANC E FRANC EF R ANCEF RA NCEF
TO SEE A WORLD IN A GRAIN OF SAND

3) Then the key letter was used to find the row (using the leftmost column) and the plaintext letter was used to establish the column (using the topmost row). The ciphertext letter is the letter found at the intersection of the row and the column:

FR ANC E FRANC EF R ANCEF RA NCEF
TO SEE A WORLD IN A GRAIN OF SAND
YF SRG E BFRYF MS R GECMS FF FCRI

Decryption was simple: the key was written above the ciphertext and the rows were established by the consecutive key letters. The consecutive ciphertext letters were then found in those rows, and each plaintext letter was read off the topmost row of the table.
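The tableau lookup is equivalent to adding the key letter's position (A = 0, B = 1, ...) to the plaintext letter's position modulo 26. The following C sketch relies on that equivalence; the function name `vigenere` and its `decrypt` flag are illustrative assumptions. With the standard tableau, T under key letter F gives Y and I under key letter E gives M, so the FRANCE example encrypts to "YF SRG E BFRYF MS R GECMS FF FCRI".

```c
#include <ctype.h>
#include <stddef.h>

/* Encrypt (decrypt == 0) or decrypt (decrypt != 0) text in place with a
 * Vigenere key made of letters. Letters are folded to upper case; spaces
 * and other characters pass through and do not consume a key letter.
 * (Hypothetical helper written for illustration.) */
void vigenere(char *text, const char *key, int decrypt)
{
    size_t i, k = 0, klen = 0;

    while (key[klen] != '\0')
        klen++;
    if (klen == 0)
        return;

    for (i = 0; text[i] != '\0'; i++) {
        unsigned char c = (unsigned char)text[i];
        int shift, p;

        if (!isalpha(c))
            continue;

        /* Row selected by the current key letter. */
        shift = toupper((unsigned char)key[k % klen]) - 'A';
        /* Column selected by the text letter. */
        p = toupper(c) - 'A';

        if (decrypt)
            shift = 26 - shift;   /* walking back is a shift of 26 - s */

        text[i] = (char)('A' + (p + shift) % 26);
        k++;
    }
}
```

Running the function a second time with `decrypt` set to 1 and the same key returns the upper-case plaintext.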

3.4 Algorithms used in Cryptography

3.4.1 Public Key Algorithm
Public-key cryptography allows two parties who must communicate over an insecure medium to do so without any prearranged key exchange. Two total strangers can communicate securely with a public-key algorithm. In this type of algorithm, each party uses a pair of keys: a private key and a public key. Data is encrypted using the public key, but the private key is needed on the other side to decrypt the data. The data cannot be recovered using the public key.

Some algorithms, such as the Diffie-Hellman algorithm, are actually only key exchange algorithms. These algorithms only show how to securely exchange keys over an insecure channel, but are not in themselves full public-key algorithms.

In a simple public-key algorithm, Alice (the party on the transmitting side) generates two keys: a public key and a private key. It is not feasible to deduce the private key from the public key. Bob also generates a public key and a private key of his own. Alice and Bob exchange public keys. They can even do this exchange in the open, since even if Eve (the attacker) obtains both public keys, she still cannot read any of the messages that pass between them. Alice encrypts her messages to Bob using Bob's public key, while Bob uses his private key to decrypt these messages.

The basic working of the Public Key Algorithm is as follows:
1) Alice sends Bob her public key.
2) Bob generates a random session key. He encrypts this using Alice's public key and sends the encrypted session key back to Alice.
3) Alice recovers the session key using her private key.
4) Alice and Bob can now send messages to each other using the session key.

So the public-key algorithm is actually only used to encrypt a small amount of data (the session key), while the faster (and presumably more secure) symmetric algorithm encrypts most of the message traffic.
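Steps 2 and 3 of the exchange can be illustrated with a toy RSA key pair. The numbers below (p = 61, q = 53, n = 3233, e = 17, d = 2753) are a textbook-sized assumption chosen so that the arithmetic fits in ordinary integers; they offer no security, and the function names are hypothetical.

```c
#include <stdint.h>

/* Modular exponentiation by repeated squaring: (base^exp) mod m. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;

    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Toy key pair for Alice (assumption): n = 61 * 53, e * d = 1 mod 3120. */
enum { ALICE_N = 3233, ALICE_E = 17, ALICE_D = 2753 };

/* Step 2: Bob encrypts the session key with Alice's public key (n, e). */
uint64_t wrap_session_key(uint64_t session_key)
{
    return mod_pow(session_key, ALICE_E, ALICE_N);
}

/* Step 3: Alice recovers the session key with her private exponent d. */
uint64_t unwrap_session_key(uint64_t wrapped)
{
    return mod_pow(wrapped, ALICE_D, ALICE_N);
}
```

With these values a session key of 65 is sent as 2790, and `unwrap_session_key(2790)` returns 65. In practice the session key would then drive a symmetric cipher, and the modulus would be hundreds of digits long.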

Common Public Key Algorithms

RSA
It was developed in 1977 by Ron Rivest, Adi Shamir and Leonard Adleman to be used for both encryption and authentication. The algorithm is based on the factoring problem. RSA is usually combined with a secret-key algorithm such as DES, since DES is much faster than RSA. RSA provides key sizes of up to 2048 bits.

ELGAMAL
It is an encryption algorithm based on the discrete logarithm problem. Analysis shows that ELGAMAL and RSA have similar security for equivalent key lengths. However, the ELGAMAL algorithm is very slow.

3.4.2 DES
DES is a product block encryption algorithm in which 16 iterations, or rounds, of the substitution and transposition (permutation) process are cascaded. The block size is 64 bits, so that a 64-bit block of data (plaintext) is encrypted into a 64-bit ciphertext. The key, which controls the transformation, also consists of 64 bits. Only 56 of these, however, are at the user's disposal. The remaining eight bits are employed for checking parity. The actual key length is therefore 56 bits.

Subsets of the key bits are designated K1, K2, etc., with the subscript indicating the number of the round. The cipher function (substitution and transposition) that is used with the key bits in each round is labeled f. At each intermediate stage of the transformation process, the cipher output from the preceding stage is partitioned into the 32 leftmost bits Li and the 32 rightmost bits Ri. Ri is transposed to become the left-hand part of the next higher intermediate cipher.

The essential feature for the security of DES is that f involves a very special non-linear substitution, i.e. f(A)+f(B) does not equal f(A+B), specified by the Bureau of Standards in tabulated functions known as S-boxes. This operation results in a 32-bit number, which is logically added (XORed) to Li to produce the right-hand half of the new intermediate cipher. This process is repeated 16 times in all.

To decrypt a cipher, the process is carried out in reverse order, with the 16th round

FIGURE 12 - Working of DES (A)


FIGURE 13 -Working of DES-(B)
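The round structure of DES can be sketched as a generic 16-round Feistel network. The sketch below is not DES: the round function `toy_f` is a placeholder assumption standing in for the real f with its expansion, S-boxes and permutation, and the initial and final permutations and the key schedule are omitted. It only shows how the halves are swapped and combined in each round, and why applying the subkeys in reverse order undoes the encryption.

```c
#include <stdint.h>

#define ROUNDS 16

/* Placeholder round function (assumption): NOT the DES f function.
 * Any function of (half, subkey) works for the Feistel structure. */
static uint32_t toy_f(uint32_t half, uint32_t subkey)
{
    uint32_t x = half ^ subkey;

    x = (x << 5) | (x >> 27);          /* rotate left by 5 bits */
    return x * 2654435761u + subkey;   /* simple non-linear mixing */
}

/* Run 16 Feistel rounds over a 64-bit block.
 * Per round: L_i = R_(i-1) and R_i = L_(i-1) XOR f(R_(i-1), K_i).
 * With reverse != 0 the subkeys are applied in the order K16..K1,
 * which inverts the transformation. */
uint64_t feistel(uint64_t block, const uint32_t subkeys[ROUNDS], int reverse)
{
    uint32_t left = (uint32_t)(block >> 32);
    uint32_t right = (uint32_t)block;
    int i;

    for (i = 0; i < ROUNDS; i++) {
        uint32_t k = subkeys[reverse ? ROUNDS - 1 - i : i];
        uint32_t next_right = left ^ toy_f(right, k);

        left = right;
        right = next_right;
    }

    /* Swap the halves at the end so the same routine can decrypt. */
    return ((uint64_t)right << 32) | left;
}
```

For any array of 16 subkeys, `feistel(feistel(block, keys, 0), keys, 1)` returns the original block, even though `toy_f` itself is never inverted.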

3.5 Applications of Cryptography
Cryptography is extremely useful; there is a multitude of applications, many of which are currently in use. A typical application of cryptography is a system built out of the basic techniques. Such systems can be of various levels of complexity. Some of the simpler applications are secure communication, identification, authentication, and secret sharing. More complicated applications include systems for electronic commerce, certification, secure electronic mail, key recovery, and secure computer access.


In general, the less complex the application, the more quickly it becomes a reality. Identification and authentication schemes exist widely, while electronic commerce systems are just beginning to be established. However, there are exceptions to this rule; namely, the adoption rate may depend on the level of demand. For example, SSL-encapsulated HTTP gained a lot more usage much more quickly than simpler link-layer encryption has ever achieved.

Secure Communication
Secure communication is the most straightforward use of cryptography. Two people may communicate securely by encrypting the messages sent between them. This can be done in such a way that a third party eavesdropping may never be able to decipher the messages. While secure communication has existed for centuries, the key management problem has prevented it from becoming commonplace. Thanks to the development of public-key cryptography, the tools exist to create a large-scale network of people who can communicate securely with one another even if they had never communicated before.

Identification and Authentication
Identification and authentication are two widely used applications of cryptography. Identification is the process of verifying someone's or something's identity. For example, when withdrawing money from a bank, a teller asks to see identification (for example, a driver's license) to verify the identity of the owner of the account. This same process can be done electronically using cryptography. Every automatic teller machine (ATM) card is associated with a "secret" personal identification number (PIN), which binds the owner to the card and thus to the account. When the card is inserted into the ATM, the machine prompts the cardholder for the PIN. If the correct PIN is entered, the machine identifies that person as the rightful owner and grants access.

Another important application of cryptography is authentication. Authentication is similar to identification, in that both allow an entity access to resources (such as an Internet account), but authentication is broader because it does not necessarily involve identifying a person or entity. Authentication merely determines whether that person or entity is authorized for whatever is in question.

Secret Sharing
Another application of cryptography, called secret sharing, allows the trust of a secret to be distributed among a group of people. For example, in a (k, n)-threshold scheme, information about a secret is distributed in such a way that any k out of the n people (k ≤ n) have enough information to determine the secret, but any set of k-1 people do not. In any secret sharing scheme, there are designated sets of people whose cumulative information suffices to determine the secret. In some implementations of secret sharing schemes, each participant receives the secret after it has been generated. In other implementations, the actual secret is never made visible to the participants, although the purpose for which they sought the secret (for example, access to a building or permission to execute a process) is allowed.

Electronic Commerce
Over the past few years there has been a growing amount of business conducted over the Internet; this form of business is called electronic commerce or e-commerce. E-commerce comprises online banking, online brokerage accounts, and Internet shopping, to name a few of the many applications. One can book plane tickets, make hotel reservations, rent a car, transfer money from one account to another, and buy compact disks (CDs), clothes, books and so on, all while sitting in front of a computer. However, simply entering a credit card number on the Internet leaves one open to fraud. One cryptographic solution to this problem is to encrypt the credit card number (or other private information) when it is entered online; another is to secure the entire session. When a computer encrypts this information and sends it out on the Internet, it is incomprehensible to a third-party viewer. The web server ("Internet shopping center") receives the encrypted information, decrypts it, and proceeds with the sale without fear that the credit card number (or other personal information) has slipped into the wrong hands. As more and more business is conducted over the Internet, the need for protection against fraud, theft, and corruption of vital information increases.

Certification
Another application of cryptography is certification; certification is a scheme by which trusted agents such as certifying authorities vouch for unknown agents, such as users. The trusted agents issue vouchers called certificates, each of which has some inherent meaning. Certification technology was developed to make identification and authentication possible on a large scale.

Key Recovery
Key recovery is a technology that allows a key to be revealed under certain circumstances without the owner of the key revealing it. This is useful for two main reasons: first of all, if a user loses or accidentally deletes his or her key, key recovery could prevent a disaster. Secondly, if a law enforcement agency wishes to eavesdrop on a suspected criminal without the suspect's knowledge (akin to a wiretap), the agency must be able to recover the key. Key recovery techniques are in use in some instances; however, the use of key recovery as a law enforcement technique is somewhat controversial.

Remote Access
Secure remote access is another important application of cryptography. The basic system of passwords certainly gives a level of security for secure access, but it may not be enough in some cases. For instance, passwords can be eavesdropped, forgotten, stolen, or guessed. Many products supply cryptographic methods for remote access with a higher degree of security.

Other Applications
Cryptography is not confined to the world of computers. Cryptography is also used in cellular (mobile) phones as a means of authentication; that is, it can be used to verify that a particular phone has the right to bill to a particular phone number. This prevents people from stealing ("cloning") cellular phone numbers and access codes. Another application is to protect phone calls from eavesdropping using voice encryption.

3.6 Cryptographic Algorithms used in Electronic Commerce

TABLE 2 - CRYPTOGRAPHIC ALGORITHMS

Other Crypto Algorithms and Systems of Note

Capstone: A now-defunct U.S. National Institute of Standards and Technology (NIST) and National Security Agency (NSA) project under the Bush Sr. and Clinton administrations for publicly available strong cryptography with keys escrowed by the government (NIST and the Treasury Dept.). Capstone included one or more tamper-proof computer chips for implementation (Clipper), a secret key encryption algorithm (Skipjack), a digital signature algorithm (DSA), a key exchange algorithm (KEA), and a hash algorithm (SHA).

Clipper: The computer chip that would implement the Skipjack encryption scheme. See also EPIC's The Clipper Chip Web page.

Escrowed Encryption Standard (EES): Largely unused, a controversial crypto scheme employing the SKIPJACK secret key crypto algorithm and a Law Enforcement Access Field (LEAF) creation method. LEAF was one part of the key escrow system and allowed for decryption of ciphertext messages that had been legally intercepted by law enforcement agencies.

AUDIO FILES AS WAV FILES

4.1 WAVE FORMAT
The WAVE file format is a subset of Microsoft's RIFF specification for the storage of multimedia files. A RIFF file starts out with a file header followed by a sequence of data chunks. A WAVE file is often just a RIFF file with a single "WAVE" chunk which consists of two sub-chunks: a "fmt " chunk specifying the data format and a "data" chunk containing the actual sample data. This form is called the "Canonical form".

TABLE 3 - RIFF
OFFSET  SIZE  NAME           DESCRIPTION
0       4     ChunkID        Contains the letters "RIFF" in ASCII form
4       4     ChunkSize      36 + SubChunk2Size
8       4     Format         Contains the letters "WAVE" in big-endian form

TABLE 4 - fmt subchunk
OFFSET  SIZE  NAME           DESCRIPTION
12      4     Subchunk1ID    Contains the letters "fmt "
16      4     Subchunk1Size  Size of the rest of the subchunk

data subchunk
OFFSET  SIZE  NAME           DESCRIPTION
36      4     SubChunk2ID    Contains the letters "data" in big-endian form
40      4     SubChunk2Size  NumSamples * NumChannels * BitsPerSample/8

4.2 RIFF - Resource Interchange File Format
Also Known As: RIFF, Resource Interchange File Format, RIFX, .WAV, .AVI, .BND, .RMI, .RDI
Type: Multimedia
Colors: 24-bit
Compression: RLE, uncompressed, audio, video
Maximum Image Size: Varies
Multiple Images Per File: No
Numerical Format: Little- and big-endian
Originator: Microsoft Corporation
Platform: Microsoft Windows 3.x, Windows NT
Supporting Applications: Microsoft Windows and OS/2 multimedia applications

Usage: RIFF is a device control interface and common file format native to the Microsoft Windows system. It is used to store audio, video, and graphics information used in multimedia applications.
Comments: A complex format designed to accommodate various types of data for multimedia applications. Because it is quite new and vendor-controlled, the specification is likely to change in the future.

Microsoft RIFF (Resource Interchange File Format) is a multimedia file format created by Microsoft for use with the Windows GUI. RIFF itself does not define any new methods of storing data, as many of the bitmap formats described in this book do. Instead, RIFF defines a structured framework, which may contain existing data formats. Using this concept, you can create new, composite formats consisting of two or more existing file formats.

Multimedia applications require the storage and management of a wide variety of data, including bitmaps, audio data, video data, and peripheral device control information. RIFF provides an excellent way to store all these varied types of data. The type of data a RIFF file contains is indicated by the file extension. Examples of data that may be stored in RIFF files are:
Audio/visual interleaved data (.AVI)
Waveform data (.WAV)
Bitmapped data (.RDI)
MIDI information (.RMI)
A bundle of other RIFF files (.BND)

NOTE: At this point, AVI files are the only type of RIFF files that have been fully implemented using the current RIFF specification. Although WAV files have been implemented, these files are very simple, and their developers typically use an older specification in constructing them.
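The canonical WAVE layout from Section 4.1 can be read from a 44-byte buffer as sketched below. The struct and function names are illustrative assumptions. The offsets of the fields inside the fmt subchunk (AudioFormat at 20, NumChannels at 22, SampleRate at 24, BitsPerSample at 34) are not listed in the tables above and are taken from the commonly documented canonical layout. Decoding byte by byte keeps the sketch independent of the host's byte order and struct padding.

```c
#include <stdint.h>
#include <string.h>

/* Selected fields of the canonical 44-byte WAVE header
 * (struct and field names are illustrative). */
struct wav_info {
    uint16_t audio_format;     /* 1 means uncompressed PCM */
    uint16_t num_channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_size;        /* SubChunk2Size: bytes of sample data */
};

/* RIFF stores multi-byte integers in little-endian order. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse a canonical header from hdr, which must hold at least 44 bytes.
 * Returns 0 on success, -1 if the chunk identifiers do not match. */
int parse_wav_header(const unsigned char *hdr, struct wav_info *out)
{
    if (memcmp(hdr, "RIFF", 4) != 0 ||
        memcmp(hdr + 8, "WAVE", 4) != 0 ||
        memcmp(hdr + 12, "fmt ", 4) != 0 ||
        memcmp(hdr + 36, "data", 4) != 0)
        return -1;

    out->audio_format = le16(hdr + 20);
    out->num_channels = le16(hdr + 22);
    out->sample_rate = le32(hdr + 24);
    out->bits_per_sample = le16(hdr + 34);
    out->data_size = le32(hdr + 40);
    return 0;
}
```

Files that carry extra chunks between "fmt " and "data" are not in canonical form and would be rejected by this sketch; a more general reader would walk the chunk list instead of assuming fixed offsets.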

IMPLEMENTATION

5.1 Introduction
In this chapter we describe the implementation of the project in detail by discussing the various modules involved. The project has been implemented in the C language on a Linux platform. To view the code, please refer to Appendix 1.

In the proposed system, we implement two important network security concepts, namely STEGANOGRAPHY and ENCRYPTION. Steganography is the science of hiding data. A steganographic process normally involves a cover medium, the secret information and a stego-key. Combined, they form the stego-medium. Encryption is often confused with steganography. Encryption is the process of converting a plaintext message into an unrecognizable form known as the ciphertext. Various algorithms are used in encryption, but in this project no standard algorithm has been used because of time constraints.

In this project, a text file containing the secret information is created. An audio file of WAV format is the chosen cover medium. Since the contents of the audio file and the text file are different, a function is written to convert the text file into a bit stream. Then the text data is converted to an unrecognizable form. This process is known as encryption. The encrypted file is then taken and embedded into the audio file. Care is taken throughout the project that the audio file does not suffer from any noise or corruption. On the receiver side, the audio file is taken and the encrypted file is recovered by desteganography. Then the encrypted file is decrypted to reveal the secret information. The contents of the audio file are listened to with the aid of a speaker before and after the techniques have been applied.

5.2 MODULE 1 - Header Information
1) The cover medium for this project is an audio file of WAV format. Hence, a file called cam.wav is created. This serves as the cover medium.
2) A WAV file is basically a RIFF file. RIFF stands for Resource Interchange File Format. It is capable of handling and storing multimedia data effectively.
3) In order to create an audio file of WAV format, specific structures and members are declared.
4) Any WAV file consists of a WAVE chunk which consists of 2 subchunks. The 2 subchunks are the fmt and data chunks.
5) The fmt chunk contains the type of data being stored in the WAV file and the size of the data stored.
6) The data chunk contains the actual audio data. The members of the 2 subchunks are declared in a program which will be used during the course of the implementation.
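The pipeline outlined in Section 5.1 (convert each character to bits, encrypt it, and hide the bits in the audio data) can be sketched as follows. This is a generic LSB-insertion sketch under stated assumptions rather than a transcription of the Appendix-1 code: the function names are hypothetical, the XOR constant 0x0f and the lowest-bit-first ordering are borrowed from the appendix, and `samples` is assumed to point at the sample data after the WAV header.

```c
#include <stddef.h>

#define XOR_MASK 0x0f   /* same constant as pixel ^ 0x0f in Appendix 1 */

/* Hide len message bytes in the least significant bits of samples.
 * Each message byte is XORed with XOR_MASK and then spread over
 * 8 consecutive sample bytes, lowest bit first.
 * Returns 0 on success, -1 if the cover has fewer than 8 * len bytes. */
int lsb_embed(unsigned char *samples, size_t nsamples,
              const unsigned char *msg, size_t len)
{
    size_t i;
    int bit;

    if (nsamples / 8 < len)
        return -1;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)(msg[i] ^ XOR_MASK);

        for (bit = 0; bit < 8; bit++) {
            unsigned char *s = &samples[i * 8 + bit];

            /* Clear the low bit, then copy in one message bit. */
            *s = (unsigned char)((*s & 0xFE) | ((c >> bit) & 0x01));
        }
    }
    return 0;
}

/* Reverse of lsb_embed: collect the low bits and undo the XOR. */
void lsb_extract(const unsigned char *samples, unsigned char *msg, size_t len)
{
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        unsigned char c = 0;

        for (bit = 0; bit < 8; bit++)
            c |= (unsigned char)((samples[i * 8 + bit] & 0x01) << bit);
        msg[i] = (unsigned char)(c ^ XOR_MASK);
    }
}
```

Because only the lowest bit of each sample byte changes, every sample differs from the original by at most one unit, which is why the cover audio is expected to remain free of audible distortion. The cover must provide eight sample bytes for every message byte.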
5.3 MODULE 2 - File Handling Operations
1) In this module, 4 main files are created and manipulated by the use of file descriptors.
2) The 4 files are as follows: CAM.WAV, MSG.TXT, TEMP.WAV and KEY.WAV.
3) Before we proceed, the existence of the above files must be checked. There is no point in performing operations on a non-existent file.
4) Four file descriptors are created, namely fp, keyfile, fp1 and fp2. By the use of the following code, the validity of the file is checked.
   if(fp==NULL)
   {
       exit(0);
   }
5) If the file descriptor is NULL, the program exits. This indicates that the corresponding file does not exist.
   CAM.WAV - Audio file.
   KEY.WAV - Key file.
   TEMP.WAV - Backup file.
   MSG.TXT - Text file containing the secret information.
   CONT.TXT - Encrypted file.
6) After the validity of the files is checked, the text file and audio file are compared in size.
7) Code is written to calculate and compare the sizes of the audio file and the text file.
8) If the audio file is smaller than the text file, the program exits. Otherwise, it proceeds.
9) After this comparison, the text file MSG.TXT is considered.
10) The audio file contents are in hexadecimal format while the text file is in text form.
11) Hence, a function is called which converts the text file into a bit stream. The function is written in the program which contains the definitions for the audio file.
12) The function is called in the main program. Once the function has been called and executed, the text file is returned in the form of bits.
13) In the steps discussed above, the files are operated on in 2 main modes, namely the read and write modes.
14) The read mode enables the user to read the contents of a file up to the end of the file.
15) The write mode enables the user to write contents to a file from another file which is in read mode.
16) The file is now ready to be encrypted.

5.4 MODULE 3 - Encryption and Steganography
1) In this module, the text file contents are in bit stream form. The text file is opened in read mode while the CAM.WAV and KEY.WAV files are in write mode.
2) Before the above steps are carried out, the audio file CAM.WAV is listened to with the help of a speaker.
3) The text file is converted to an unrecognizable form known as CIPHERTEXT by writing suitable code. For the code, please refer to Appendix-1.
4) The encrypted data is now ready to be embedded into the audio file.
5) The steganographic technique employed in embedding the encrypted data into the audio file is the LSB insertion algorithm.
6) To understand more about LSB insertion, refer to Chapter 2, which deals with steganography and its algorithms in detail.
7) The result of the above process is a stego-medium which is now ready to be transmitted to the receiver side.
8) Before the transmission of the stego-file takes place, the steganographed and encrypted audio file is listened to with the help of a speaker. This is to check whether any noise or corruption is present in the audio file.
9) The encrypted data is displayed, at the user's choice, by means of a printf statement and a while loop. This shows the ciphertext of the secret information in the MSG.TXT file.
10) To any outsider, the result of step 9 will seem like a set of junk values holding no meaning. This is the main advantage of the proposed system.
11) The stego-file is now ready for transmission.

5.5 MODULE 4 - Desteganography and Decryption
1) In this module, the files are first checked for validity with the help of the corresponding file descriptors.
2) A new file known as OUTPUT.TXT is considered.
3) First, the stego-medium is obtained. The audio file and stego-file are opened in their respective modes.
4) In this step, the encrypted file is recovered from the cover medium by means of suitable code. This is desteganography.
5) After the above step has taken place, the encrypted file is decrypted to its original form. This is decryption.
6) But the decrypted file is in its hexadecimal format. Hence, a function is written which converts the bit stream back to text form.
7) The result is stored in OUTPUT.TXT and viewed.
8) The audio file is again taken and its contents are listened to with the aid of a speaker.

5.6 MODULE 5 - Error Handling
1) During the course of the project, many file-related errors are encountered.
2) These errors are detected and corrected by debugging the C programs.
3) The audio file is also checked for noise and corruption.

APPENDIX-1

6. SKELETAL SOURCE CODE

6.1 CREATION OF WAVE FILE
#define MAX 0x46464952   /* the letters "RIFF" read as a little-endian int */
struct chunkid
{
    int chunkid;
    int chunksize;
    int format;
}chunk;
struct subchunk1

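{
    /* Field order follows the canonical WAVE "fmt " sub-chunk. */
    int subchunk1id;
    int subchunk1size;
    short int audioformat;
    short int numchannels;
    int samplerate;
    int byterate;
    short int blockalign;
    short int bitspersample;
} chunk1;

struct subchunk2
{
    int subchunk2id;
    int subchunk2size;
} chunk2;

int power(int a, int b);   /* defined below; declared here because restor() uses it */

/* Expand the byte d into eight marker bytes, lowest bit first:
   a 1 bit becomes 0x01 and a 0 bit becomes 0x02.
   data must have room for 9 bytes (8 markers plus a terminator). */
void func(char d, char *data)
{
    int i;
    unsigned char temp;
    for (i = 0; i < 8; i++)
    {
        temp = d & 0x01;
        if (temp == 0)
        {
            data[i] = 0x02;
        }
        else
        {
            data[i] = temp;
        }
        d = d >> 1;
    }
    data[8] = '\0';
}

/* Rebuild a byte from eight marker bytes: an odd marker (0x01) sets bit i. */
int restor(char *value)
{
    int i;
    int temp = 0;
    for (i = 0; i < 8; i++)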

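    {
        if ((value[i] & 0x01) == 0x01)
            temp = temp + power(2, i);
    }
    return temp;
}

/* Integer power: returns a raised to the b-th power. */
int power(int a, int b)
{
    int c = 1, i;
    for (i = 0; i < b; i++)
    {
        c = c * a;
    }
    return c;
}

6.2 STEGANOGRAPHY AND ENCRYPTION

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include "struct.c"

int main(void)
{
    FILE *fp, *fp1, *efile, *keyfile, *cont;
    /* datastring has 9 bytes because func() writes a terminator at index 8 */
    char data, pixel, tempdata, datastring[9], stegvalue, auddata;
    int filesize1 = 0;
    int filesize = 0, stecfilesize = 0, textfilesize = 0;
    int efile1 = 0;
    int keyfile1 = 0;
    int i, m;

    /* Binary mode keeps the bytes intact on every platform. */
    fp = fopen("cam.wav", "rb");        /* cover audio */
    fp1 = fopen("msg.txt", "rb");       /* secret message */
    efile = fopen("temp.wav", "wb");    /* stego output, copied back to cam.wav later */
    keyfile = fopen("key.wav", "wb");   /* original samples, needed for extraction */
    cont = fopen("cont.txt", "wb");     /* ciphertext */

    if (fp == NULL)
    {
        printf("\nWAV FILE CANNOT BE OPENED\n");
        exit(0);
    }
    if (fp1 == NULL)
    {
        printf("\nTEXT FILE CANNOT BE OPENED\n");
        exit(0);
    }
    if (efile == NULL)
    {
        printf("\nTEMPORARY FILE CANNOT BE OPENED\n");
        exit(0);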

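    }

    /* Count the bytes in the cover file. */
    while (fread(&data, 1, 1, fp))
        filesize++;
    printf("\nAUDIO WAV=%d \n", filesize);
    rewind(fp);

    /* Copy the three header chunks unchanged. */
    fread(&chunk, sizeof(chunk), 1, fp);
    fwrite(&chunk, sizeof(chunk), 1, efile);
    fread(&chunk1, sizeof(chunk1), 1, fp);
    fwrite(&chunk1, sizeof(chunk1), 1, efile);
    fread(&chunk2, sizeof(chunk2), 1, fp);
    fwrite(&chunk2, sizeof(chunk2), 1, efile);
    if (chunk.chunkid != MAX)
    {
        printf("\nINVALID VALUE\n");
        exit(0);
    }
    else
        printf(" \n VALID VALUE\n");

    /* Count the bytes in the message file. */
    while (fread(&pixel, 1, 1, fp1))
        filesize1++;
    printf("\nMSG=%d\n", filesize1);
    rewind(fp1);

    /* Each message byte needs 8 audio bytes after the headers and the
       1000 skipped bytes, plus one untouched byte to mark the end. */
    if (filesize1 * 8 + 1000 + (int)(sizeof(chunk) + sizeof(chunk1) + sizeof(chunk2)) >= filesize)
    {
        printf("\nTEXT MESSAGE SIZE EXCEEDS THAT OF CAM FILE\n");
        exit(0);
    }

    /* Leave the first 1000 audio bytes untouched. */
    for (m = 0; m < 1000; m++)
    {
        fread(&auddata, sizeof(char), 1, fp);
        fwrite(&auddata, sizeof(char), 1, keyfile);
        fwrite(&auddata, sizeof(char), 1, efile);
    }

    while (fread(&pixel, sizeof(pixel), 1, fp1))
    {
        printf("\nENCRYPTION\n");
        tempdata = pixel ^ 0x0f;   /* XOR with a fixed mask */
        fwrite(&tempdata, sizeof(tempdata), 1, cont);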

        func(tempdata, datastring);
        for (i = 0; i < 8; i++)
        {
            fread(&data, sizeof(data), 1, fp);
            fwrite(&data, sizeof(data), 1, keyfile);
            printf("\nSTEGANOGRAPHY\n");
            stegvalue = datastring[i] + data;   /* add the marker (1 or 2) to the sample */
            fwrite(&stegvalue, sizeof(stegvalue), 1, efile);
        }
    }

    /* Copy the rest of the audio unchanged. */
    while (fread(&data, sizeof(data), 1, fp))
    {
        fwrite(&data, sizeof(data), 1, keyfile);
        fwrite(&data, sizeof(data), 1, efile);
    }
    fclose(fp);
    fclose(fp1);
    fclose(efile);
    fclose(keyfile);
    fclose(cont);

    /* Overwrite cam.wav with the stego audio from temp.wav. */
    fp = fopen("cam.wav", "wb");
    efile = fopen("temp.wav", "rb");
    fread(&chunk, sizeof(chunk), 1, efile);
    fwrite(&chunk, sizeof(chunk), 1, fp);
    fread(&chunk1, sizeof(chunk1), 1, efile);
    fwrite(&chunk1, sizeof(chunk1), 1, fp);
    fread(&chunk2, sizeof(chunk2), 1, efile);
    fwrite(&chunk2, sizeof(chunk2), 1, fp);
    while (fread(&data, sizeof(char), 1, efile))
    {
        fwrite(&data, sizeof(char), 1, fp);
    }
    fclose(fp);
    fclose(efile);

    /* Reopen the files to report their final sizes. */
    fp = fopen("cam.wav", "rb");
    fp1 = fopen("msg.txt", "rb");
    keyfile = fopen("key.wav", "rb");
    efile = fopen("temp.wav", "rb");
    while (fread(&data, 1, 1, fp))
        stecfilesize++;
    printf("\nTHE AUDIO SIZE IS=%d\n", stecfilesize);

    while (fread(&data, 1, 1, fp1))
        textfilesize++;
    printf("\nTHE MSG SIZE IS=%d\n", textfilesize);
    while (fread(&data, 1, 1, efile))
        efile1++;
    printf("\nTHE TEMP. SIZE IS =%d\n", efile1);
    while (fread(&data, 1, 1, keyfile))
        keyfile1++;
    printf("\nTHE KEY SIZE IS=%d\n", keyfile1);
    fclose(fp);
    fclose(fp1);
    fclose(efile);
    fclose(keyfile);
    return 0;
}

6.3 DESTEGANOGRAPHY AND DECRYPTION

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include "struct.c"

int main(void)
{
    FILE *fp, *fp1, *keyfile, *fp2;
    char newdata[8], audiodata, stegdata;
    char resultval, auddata;
    int i, m;

    fp = fopen("cam.wav", "rb");        /* stego audio */
    keyfile = fopen("key.wav", "rb");   /* original samples */
    fp1 = fopen("output.txt", "wb");    /* recovered message */
    fp2 = fopen("desteg.wav", "wb");    /* restored audio */

    if (fp == NULL)
    {
        printf("\n FILE CANNOT BE OPENED\n");
        exit(0);
    }
    if (fp1 == NULL)
    {
        printf("\nMSG FILE CANNOT BE OPENED\n");
        exit(0);
    }
    if (keyfile == NULL)
    {
        printf("\n KEY FILE CANNOT BE OPENED\n");
        exit(0);

    }

    /* Copy the header chunks from the stego file to desteg.wav. */
    fread(&chunk, sizeof(chunk), 1, fp);
    fread(&chunk1, sizeof(chunk1), 1, fp);
    fread(&chunk2, sizeof(chunk2), 1, fp);
    fwrite(&chunk, sizeof(chunk), 1, fp2);
    fwrite(&chunk1, sizeof(chunk1), 1, fp2);
    fwrite(&chunk2, sizeof(chunk2), 1, fp2);
    if (chunk.chunkid != MAX)
    {
        printf("\nINVALID VALUE\n");
        exit(0);
    }
    else
        printf("\n VALID VALUE\n");

    /* Skip the 1000 untouched bytes in both files. */
    for (m = 0; m < 1000; m++)
    {
        fread(&auddata, sizeof(char), 1, keyfile);
        fread(&auddata, sizeof(char), 1, fp);
        fwrite(&auddata, sizeof(char), 1, fp2);
    }

    i = 0;
    /* stegdata holds the original sample from key.wav;
       audiodata holds the matching sample from the stego file. */
    while (fread(&stegdata, sizeof(stegdata), 1, keyfile))
    {
        if (i == 8)
        {
            /* Eight markers collected: rebuild the byte and undo the XOR. */
            printf("\n DECRYPTION\n");
            resultval = restor(newdata);
            resultval = resultval ^ 0x0f;
            fwrite(&resultval, sizeof(resultval), 1, fp1);
            i = 0;
        }
        fread(&audiodata, sizeof(audiodata), 1, fp);
        printf("\nDESTEGANOGRAPHY\n");
        newdata[i] = audiodata - stegdata;   /* marker: 1 or 2, or 0 past the end of the message */
        if (newdata[i] == 0)
        {
            fwrite(&stegdata, sizeof(stegdata), 1, fp2);
            break;
        }

        fwrite(&stegdata, sizeof(stegdata), 1, fp2);
        i++;
    }

    /* Copy the remaining original samples. */
    while (fread(&stegdata, sizeof(stegdata), 1, keyfile))
        fwrite(&stegdata, sizeof(stegdata), 1, fp2);
    fclose(fp);
    fclose(fp1);
    fclose(fp2);
    fclose(keyfile);
    return 0;
}
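
The report names LSB insertion as the embedding technique, while the skeletal listing above adds a marker of 1 or 2 to each audio byte and recovers it by subtracting the original audio kept in key.wav. For comparison, the following is a minimal sketch of conventional LSB replacement, which needs no key file. It is not the project's code: the names lsb_embed, lsb_extract and lsb_roundtrip_ok are hypothetical, the XOR mask 0x0f is borrowed from the listing, and the receiver is assumed to know the message length in advance.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define XOR_KEY 0x0f   /* same fixed mask as the listing; not a secure cipher */

/* Hide len bytes of msg in the least significant bits of cover.
 * Each message byte is XOR-masked and spread over 8 cover bytes,
 * lowest bit first. Returns 0 on success, -1 if cover is too small. */
int lsb_embed(unsigned char *cover, size_t cover_len,
              const unsigned char *msg, size_t len)
{
    assert(cover != NULL && msg != NULL);
    if (len > cover_len / 8)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = msg[i] ^ XOR_KEY;
        for (int b = 0; b < 8; b++) {
            size_t k = i * 8 + (size_t)b;
            /* clear the lowest bit, then set it to bit b of c */
            cover[k] = (unsigned char)((cover[k] & 0xFE) | ((c >> b) & 1));
        }
    }
    return 0;
}

/* Recover len bytes from the least significant bits of stego.
 * Returns 0 on success, -1 if stego is too short for len bytes. */
int lsb_extract(const unsigned char *stego, size_t stego_len,
                unsigned char *out, size_t len)
{
    assert(stego != NULL && out != NULL);
    if (len > stego_len / 8)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = 0;
        for (int b = 0; b < 8; b++)
            c |= (unsigned char)((stego[i * 8 + (size_t)b] & 1) << b);
        out[i] = c ^ XOR_KEY;
    }
    return 0;
}

/* Demonstration helper: embeds text in a synthetic buffer, checks that no
 * byte changed in anything but its lowest bit, and checks that extraction
 * returns the original text. Returns 1 on success, 0 otherwise. */
int lsb_roundtrip_ok(const char *text)
{
    unsigned char cover[256], orig[256], out[32];
    size_t len = strlen(text);

    if (len > sizeof out)
        return 0;
    for (size_t i = 0; i < sizeof cover; i++)
        cover[i] = (unsigned char)(i * 7 + 3);   /* stand-in for audio bytes */
    memcpy(orig, cover, sizeof cover);

    if (lsb_embed(cover, sizeof cover, (const unsigned char *)text, len) != 0)
        return 0;
    for (size_t i = 0; i < sizeof cover; i++)
        if ((cover[i] ^ orig[i]) > 1)
            return 0;
    if (lsb_extract(cover, sizeof cover, out, len) != 0)
        return 0;
    return memcmp(out, text, len) == 0;
}
```

Because only the lowest bit of each byte is overwritten, every byte changes by at most 1, whereas the additive scheme in the listing changes every carrier byte by 1 or 2. The trade-off is that this sketch has no end-of-message marker, so the length must be agreed on or stored separately.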

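
The listings validate the cover file by comparing the first header word with MAX (0x46464952) and read the header through structs whose size depends on the compiler's int width. The sketch below shows one portable way to perform the same check and to compute how many message bytes a cover file can carry. It assumes a canonical 44-byte PCM header with no extra chunks, which many real WAV files do not have, and the names read_le32, wav_header_ok and wav_capacity are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44u          /* assumption: canonical PCM header only */
#define RIFF_MAGIC      0x46464952u  /* "RIFF" as a little-endian word, same value as MAX */

/* Read a little-endian 32-bit value byte by byte, independent of host byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if buf starts with a canonical RIFF/WAVE header, else 0. */
int wav_header_ok(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < WAV_HEADER_SIZE)
        return 0;
    return read_le32(buf) == RIFF_MAGIC &&
           memcmp(buf + 8, "WAVE", 4) == 0 &&
           memcmp(buf + 36, "data", 4) == 0;
}

/* Number of message bytes that fit in a file of file_len bytes when the
 * first skip audio bytes stay untouched, each message byte uses 8 audio
 * bytes, and one audio byte is left unmodified as the end marker. */
size_t wav_capacity(size_t file_len, size_t skip)
{
    size_t reserved = WAV_HEADER_SIZE + skip + 1;
    if (file_len < reserved)
        return 0;
    return (file_len - reserved) / 8;
}
```

With the 1000-byte skip used in the listings, a cover file needs 1045 bytes before it can carry anything, and 8 further bytes for every message byte. The one reserved byte matters because the extraction listing detects the end of the message by finding an audio byte that equals its original value.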
