International Journal of Emerging Trends & Technology in Computer Science (IJETTCS)
Web Site: www.ijettcs.org Email: editor@ijettcs.org, editorijettcs@gmail.com Volume 2, Issue 2, March – April 2013 ISSN 2278-6856

Text Encryption Using DNA Stenography
M. Yamuna, Nikhil Bagmar, Vishal
VIT University, Vellore, Tamilnadu, India

Abstract: DNA is used as an information carrier, and modern biological technology is used as the implementation tool. The main target of this paper is to implement data hiding based on DNA sequences in order to increase complexity. In this method, data hiding is carried out in five steps. The receiver identifies and extracts the original message hidden in the DNA reference sequence. The main goal is to explore the characteristics of DNA molecules, to search for simple methods of realizing DNA cryptography, and to lay a basis for future development.

A DNA sequence is a sequence consisting of four letters: A, C, G and T. Each letter corresponds to a nucleotide. A DNA sequence is usually quite long. For instance, the DNA sequence of “Litmus”, whose real length is 2856 nucleotides, is: ATCGAATTCGCGCTGAGTCACAATTCGCGCTGAG TCACAATTCGCGCTGAGTCACAATTGTGACTCAG CCGCGAATTCCTGCAGCCCCGAATTCCGCATTGC AGAGATAATTGTATTTAAGTGCCTAGATACAATA AACGCCATTTGACCATTCACCACATTGGTGTGCA CCTCCAAGCTCGCGCACCGTACCGTCTCGAGGAA TTCCTGCAGGATATCTGGATCCACGAAGCTTCCC ATGGTGACGTCAC. From this sequence, two useful properties can be noted: a) There is almost no difference between a real DNA sequence and a fake one. b) A large number of DNA sequences are publicly available on various web sites. A rough estimate puts the number of publicly available DNA sequences at around 55 million [ 1 ]. Using these facts, we designed a DNA-based encryption method. Most methods that use DNA sequences are used for encrypting a binary string. We design a method in which any text can be encrypted into a fake DNA sequence. This transformed sequence is sent by the sender to the receiver.

Keywords: Encryption, Decryption, DNA.

1. INTRODUCTION
Deoxyribonucleic acid (DNA) is the molecule that contains the genetic information used in the functioning of all living organisms and viruses. Genetic information is encoded as a sequence of nucleotides (guanine, adenine, thymine, and cytosine), recorded using the letters G, A, T, and C. The base pairs guanine-cytosine and adenine-thymine, each base attached to a sugar and a phosphate molecule, allow the DNA helix to maintain a regular helical structure that is independent of its sequence. Bases are sequenced differently for different information that needs to be transmitted. This is similar to the way different sequences of letters form words and sequences of words form sentences.

2. PROPOSED ENCRYPTION SCHEME
Let S be the message to be encrypted. We apply four levels of encryption. The first is shifting each letter in the message to another letter; the shift number is the first key. The second is encrypting this shifted message into a binary sequence using the ASCII binary representation. The third is converting this binary sequence into a DNA sequence. The fourth is creating a matrix and converting the message again into a DNA sequence. Each text therefore undergoes four levels of encryption.
Keys for the four levels
Level 1 The shift, which is the length of the string.
Level 2 Binary ASCII value.
Level 3 Binary conversion table (Table 1).
Level 4 Matrix designed by the user.

Structure of DNA
In recent years, much research work has been done on DNA-based encryption schemes.

Table 2 ASCII Code: Character to Binary Conversion Table [ 3 ]
Step 3 Divide B into segments of 2 bits each and, using Table 1, convert B into a fake DNA string.
Step 4 Construct the DNA matrix A.
Matrix Construction: Each letter in the text is converted into a fake DNA strand in Step 3. The DNA strand of each letter is taken as a column to construct a 4 x k matrix, where k is the length of the message S.
Step 5 Obtain the string S2 by concatenating the rows of A.
Step 6 Send S2 to the receiver.
2.2 Decryption Algorithm
Step 1 Obtain the matrix A from S2.
Step 2 Reading A column-wise, obtain the binary string B from the DNA binary conversion table.
Step 3 Obtain S1 from the ASCII table.
Step 4 Obtain S using the shift key.
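The matrix step and its inverse can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_matrix` and `s2_to_strands` are hypothetical and not part of the paper, and the strands used are the ones given for M, J, Q, Q, T in the HELLO example of Section 3.

```python
def build_matrix(strands):
    """Place each 4-letter DNA strand in its own column; return the 4 rows of A."""
    if any(len(s) != 4 for s in strands):
        raise ValueError("every strand must have exactly 4 letters")
    return ["".join(s[r] for s in strands) for r in range(4)]


def s2_to_strands(s2):
    """Inverse step: split S2 into 4 rows of length k, then read the columns."""
    if len(s2) % 4 != 0:
        raise ValueError("length of S2 must be a multiple of 4")
    k = len(s2) // 4
    rows = [s2[r * k:(r + 1) * k] for r in range(4)]
    return ["".join(row[c] for row in rows) for c in range(k)]


strands = ["GCTG", "GCCC", "GTAG", "GTAG", "GTGA"]  # M, J, Q, Q, T
rows = build_matrix(strands)
s2 = "".join(rows)  # Step 5: concatenate the rows of A
print(rows)  # ['GGGGG', 'CCTTT', 'TCAAG', 'GCGGA']
print(s2)    # GGGGGCCTTTTCAAGGCGGA
```

Because the strands are written as columns and read out as rows, the receiver only needs the length of S2 to recover k and rebuild the same matrix.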

3. EXAMPLE
Let the message S to be encrypted be HELLO, so that | S | = 5. Replacing each letter by the fifth letter after it, we obtain MJQQT. Using Table 2, this is converted into the following (the bit patterns are the ASCII codes of the lowercase letters m, j, q and t)
M: 01101101
J: 01101010
Q: 01110001
T: 01110100
Using Table 1 we obtain the following
M: GCTG
J: GCCC
Q: GTAG
T: GTGA
From this we generate the 4 x 5 matrix
G G G G G
C C T T T
T C A A G
G C G G A

Table 1 DNA Binary Conversion Table

Alphabet    Binary Representation
A           00
C           10
G           01
T           11
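Table 1 can be expressed as a simple lookup. The following is a minimal Python sketch, with the helper names `bits_to_dna` and `dna_to_bits` chosen here for illustration only.

```python
# Table 1 as a lookup: each 2-bit segment maps to one nucleotide letter.
BITS_TO_BASE = {"00": "A", "10": "C", "01": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bits_to_dna(bits):
    """Convert a binary string of even length into a DNA string."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bits(dna):
    """Convert a DNA string back into its binary string."""
    return "".join(BASE_TO_BITS[base] for base in dna)


print(bits_to_dna("01101101"))  # GCTG
print(dna_to_bits("GATT"))      # 01001111
```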

Table 2 represents the ASCII conversion into binary numbers [ 3 ].
2.1 Encryption Algorithm
Let S be the message to be encoded, and let the length of S be k.
Step 1 Shift each letter in the message to a new letter, where the shift value is k, to generate S1.
Step 2 Convert S1 into a binary string B using ASCII value conversion.
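Steps 1 and 2 can be illustrated with a short Python sketch. The function names `shift_text` and `to_ascii_bits` are hypothetical. The sketch assumes that the shift wraps around the alphabet (the HARMONY example in Section 3 needs this, since F shifted back by 7 gives Y) and that non-letters are left unchanged, which the paper does not specify. Note also that the bit patterns listed for the HELLO example correspond to the 8-bit ASCII codes of lowercase letters.

```python
def shift_text(message, shift):
    """Step 1: shift each letter by `shift` places, wrapping around the alphabet."""
    out = []
    for ch in message:
        if ch.isascii() and ch.isalpha():
            start = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - start + shift) % 26 + start))
        else:
            out.append(ch)  # assumption: non-letters pass through unchanged
    return "".join(out)


def to_ascii_bits(text):
    """Step 2: write each character as its 8-bit ASCII code."""
    return "".join(format(ord(ch), "08b") for ch in text)


s = "HELLO"
s1 = shift_text(s, len(s))  # the shift key k is the message length
print(s1)                   # MJQQT
print(to_ascii_bits("m"))   # 01101101
```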

The message to be sent to the receiver is therefore GGGGGCCTTTTCAAGGCGGA. Suppose the received message is GGGGGGGAAGGGGATCCGGGGTAGACGC. The length of the received string is 28, so the size of the matrix is 4 x 7. The corresponding matrix is
G G G G G G G
A A G G G G A
T C C G G G G
T A G A C G C
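The decryption of this received string can be checked end to end with the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `decrypt` is hypothetical, the shift is assumed to wrap around the alphabet, and the letter case produced by the ASCII decoding is kept as it is.

```python
BASE_TO_BITS = {"A": "00", "C": "10", "G": "01", "T": "11"}  # Table 1


def decrypt(s2):
    """Recover S from S2 by following decryption Steps 1 to 4."""
    if len(s2) % 4 != 0:
        raise ValueError("length of S2 must be a multiple of 4")
    k = len(s2) // 4
    # Step 1: rebuild the 4 x k matrix A, one row per block of k letters.
    rows = [s2[r * k:(r + 1) * k] for r in range(4)]
    # Step 2: read A column-wise and convert each letter to 2 bits.
    dna = "".join(rows[r][c] for c in range(k) for r in range(4))
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    # Step 3: decode every 8 bits as one ASCII character to get S1.
    s1 = "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))
    # Step 4: shift each letter back by k (assumed to wrap around).
    out = []
    for ch in s1:
        if ch.isascii() and ch.isalpha():
            start = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - start - k) % 26 + start))
        else:
            out.append(ch)
    return "".join(out)


print(decrypt("GGGGGGGAAGGGGATCCGGGGTAGACGC"))  # HARMONY
```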

Arranging the elements column-wise, the DNA string becomes

GATTGACAGGCGGGGAGGGCGGGGGAGC. Using the DNA binary conversion table, this string is converted into 01001111010010000101100101010100010101100101010101000110. Dividing this into segments of length 8 we get 01001111 01001000 01011001 01010100 01010110 01010101 01000110. From the ASCII table this string is decoded as OHYTVUF. The length of the string is 7, so shifting each letter back by 7 (wrapping around the alphabet, so that F becomes Y), the original message is decoded as HARMONY.

As in the example above, the DNA sequence sent to the receiver is short if the message to be encoded is short. In general, however, real DNA sequences are long. If the DNA sequence to be sent is very short, a hacker may identify it as a fake sequence. In such cases, we can use the usual insertion method [ 5 ] to increase the length of the sequence. In the HELLO example above, the DNA sequence obtained is GGGGGCCTTTTCAAGGCGGA. We can now choose any fake sequence, for example TTCATAGCACGGATTATCGGAGTTTCGTATGTCCGCTACATAGTGGGCTTACCCTCAATC. Dividing this into segments of length 3 we get TTC ATA GCA CGG ATT ATC GGA GTT TCG TAT GTC CGC TAC ATA GTG GGC TTA CCC TCA ATC. Inserting one letter of the message before each segment, the fake DNA to be sent is GTTCGATAGGCAGCGGGATTCATCCGGATGTTTTCGTTATTGTCCCGCATACAATAGGTGGGGCCTTAGCCCGTCAAATC. Thus, using insertion, the length of the DNA strand can be increased when the message to be encoded is short.

REFERENCES
[1]. M. I. Youssef, A. Emam and M. Abd Elghany, Multi-Layer Data Encryption Using Residue Number System in DNA Sequence, International Journal of Security and Its Applications, Vol. 6, No. 4, October 2012.
[2]. http://www.academia.edu/920735/DNA_Cryptography_using_Binary_Strands
[3]. http://www.cs.ndsu.nodak.edu/~adenton/ExpandingHorizons/EH2005/ascii-binary-chart.gif
[4]. http://en.wikipedia.org/wiki/DNA
[5]. http://en.wikipedia.org/wiki/DNA_sequencing

AUTHOR
Dr. M. Yamuna received her doctorate in Mathematics from Alagappa University, Karaikudi, India. She is currently working as an Assistant Professor (Sr) at Vellore Institute of Technology, Vellore, India. Nikhil Bagmar and Vishal are currently first-year B. Tech Computer Science students at VIT University.
