Data security has become a challenging issue as the volume of information and its
transmission rate keep increasing. The most widely used techniques in the data security field
are cryptography and steganography, and combining the two provides more security to the data.
DNA (Deoxyribonucleic Acid) is now being explored as a new carrier for data security, since it
offers strong protection with high capacity and a low modification rate. A new data security
method can be developed by combining the advantages of DNA based AES (Advanced Encryption
Standard) cryptography and DNA steganography. This technique provides multilayer security to
the secret message. The secret message is first encoded to DNA bases, and then the DNA based
AES algorithm is applied to it. Finally, the encrypted DNA is concealed in another DNA
sequence. This hybrid technique provides triple layer security to the secret message.
CONTENTS
SL.NO Title
1. INTRODUCTION
2. RELATED WORKS
3. PROPOSED METHOD
3.1 DNA Encoding
3.2 DNA Based AES Algorithm
3.3 DNA Encryption steps
3.4 DNA Steganography steps
3.5 DNA Decryption steps
4. SECURITY ANALYSIS
4.1 Theoretical Security analysis of the proposed method
4.2 Comparative study
5. CONCLUSION AND FUTURE WORK
INTRODUCTION
RELATED WORKS
A brief review of some related works is given in this section. "Design of DNA-based
Advanced Encryption Standard (AES)" [2] explains how DNA can serve as the basic element for
the many processes involved in implementing a DNA-based version of the Advanced Encryption
Standard (DAES). It also tries to ease the link and interaction between the fields of DNA
computing and digital computing. The DNA-based AES has the same strength properties and high
security that have been proved for AES. The work also shows how complicated binary
operations can be implemented on a DNA basis using very elementary requirements.
The main contribution was to show the feasibility of implementing such a complex
system on DNA computers using a DNA basis. The work converts all of the algorithm's data,
operations, functions and specifications from a binary basis to a DNA basis. Because research
in DNA cryptography has recently focused more on theory than on implementation, the authors
converted all the material to DNA form and implemented the algorithm on digital computers,
with a detailed specification of the minimum requirements needed to implement it on DNA
computers.
In "Hybrid Technique for Steganography-based on DNA with N-Bits Binary Coding Rule" [3],
a blind data hiding hybrid technique is introduced that uses the concepts of cryptography and
steganography to achieve a double layer secured system. A new binary coding rule is
proposed that assigns 2n bits to each combination of n nucleotides, instead of assigning two
bits to a single nucleotide. This increases the number of rules from 4! to (2^n * 2^n)!, and
the larger number of binary coding rules strengthens the algorithm's security. The proposed
method consists of two phases. Phase one converts the message to DNA format using the
proposed n-bits binary coding rule, which makes the coding rule far harder to guess than in
other algorithms. The Playfair cipher based on DNA and amino acids is then applied to encrypt
the secret message, which generates ambiguity. Phase two hides the cipher secret message
together with the ambiguity that results from the first phase.
The data is hidden using the least significant base (LSBase) of each codon of a
selected DNA reference sequence with a 3:1 hiding strategy. The proposed technique hides the
data in DNA while preserving its biological functions as far as possible, and without
requiring any extra data to be sent to the receiver. The method also does not expand the real
DNA reference sequence. It preserves the original DNA sequence length because it depends on
substitution only, so the algorithm's payload is zero, which avoids attracting attention to
the faked sequence. As future work, the method can be modified to increase the hiding
capacity of the DNA sequence while also increasing the algorithm's security.
"Cryptography and Bioinformatics techniques for Secure Information transmission over
Insecure Channels" [4] is another method in the DNA cryptographic field. Its authors present
it as stronger than AES and DES. The proposed algorithm uses large key values and the
uniqueness of DNA characteristics; according to the authors, each DNA sequence offers about
3 billion keys even for a small amount of binary data. It is intended to provide stronger
protection against various attacks such as cipher text only and chosen cipher text attacks.
The key for encryption is obtained from the server. The DNA complementary rule is applied to
the DNA-based form of the data as well as the key. An XOR operation is then performed between
the binary form of the key and the data, and the result is converted back to DNA form. The
ASCII value of each base is obtained, and finally the result is converted to its
corresponding seven-bit binary form.
The method proposed in "A DNA and Amino-Acids Based Implementation Of Four-Square
Cipher" [7] is a significant modification of the older approach that combined DNA and amino
acids with the Playfair cipher. It applies the same approach to a different encryption
algorithm, the Four-square cipher, as the core of the ciphering process. In this study, a
binary form of data, such as plaintext messages or images, is transformed into sequences of
DNA nucleotides. These nucleotides then pass through a Four-square encryption process
based on the amino acid structure. The fundamental idea behind this type of encryption
process is to reinforce conventional cryptographic algorithms that have been shown to be
breakable, and also to open the door to applying the DNA and amino acid concepts to more
conventional cryptographic algorithms to enhance their security features.
CHAPTER 3
PROPOSED METHOD
The naive rule, which assigns two bits to each single nucleotide, allows 24 permutations.
The first choice is for nucleotide A, which has 4 possibilities: 00, 01, 10 or 11. The next
choice is for C, which has the 3 binary codes that remain after removing the code assigned
to A. Similarly, G has two options, and finally T is assigned the last remaining binary code.
So, the overall number of unrepeated permutations = 4*3*2*1 = 24.
In contrast, the proposed binary coding rule (BCR) in Table 3.2 maps each four bits of the
binary message to two DNA nucleotides, which increases the algorithm's security compared with
the BCR shown in Table 3.1. Here AA can be assigned 0000 or 0001 or 0010 or ... or 1111, so
it has 16 possibilities. The next choice is for AC, which has the 15 binary codes that remain
after removing the code assigned to AA. After that, AG has 14 options once AA and AC are
assigned, and so on. So, the overall number of unrepeated permutations =
16*15*14*...*3*2*1 = 16!. This makes the rule very hard to guess, that is, the cracking
probability is very low, as will be discussed in the security analysis section.
The binary coding rule can be generalized by assigning 2n bits to each group of n nucleotide
bases to achieve the degree of security required by the system. For example, the bases
A, AA, AAA, AAAA, ..., (A...A), where A is repeated n times in the last term, can be assigned
respectively to 00, 0000, 000000, 00000000, ..., (00...00), where 00 is repeated n times in
the last term.
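The counting argument above can be checked with a short sketch. It assumes only what the text states: a rule is a one-to-one assignment of 2n-bit codes to the 4^n possible groups of n nucleotides. The function name count_coding_rules is hypothetical and chosen for illustration.

```python
from math import factorial


def count_coding_rules(n: int) -> int:
    """Count the binary coding rules that map 2n bits to n nucleotides."""
    groups = 4 ** n           # distinct n-nucleotide groups (equals 2**n * 2**n)
    return factorial(groups)  # every one-to-one assignment of codes is one rule


print(count_coding_rules(1))  # naive rule: 4! = 24
print(count_coding_rules(2))  # 4-bit rule: 16! = 20922789888000
```

The count grows very quickly with n, which is the reason the text gives for generalizing the rule.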
Table 3.2: 4-bit binary coding rule used to convert the binary format of a message to DNA
Binary coding rule: The proposed 4-bit binary coding rule represents each two nucleotides
by four bits. Since there are 4 bases (A, C, G and T), the possible combinations of them are
4^2 = 16 couples (AA, AC, AG, AT, CC, CA, ...). The sender is free to assign any four-bit
code to each pair of nucleotides. This means that AA can be represented by '0000', '0001',
'0010', '0011', ..., '1111', so it has 16 options. If '0000' is selected to represent AA,
then AC can be represented by '0001', '0010', '0011', '0100', ..., '1111', so it has 15
options, and so on until all 16 pairs of nucleotides have been assigned a binary
representation. So, the number of all binary coding rules (# of BCRs) is:
# of BCRs = 16 * 15 * 14 * 13 * 12 * ... * 4 * 3 * 2 * 1 = 16!
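A minimal sketch of DNA encoding with one such rule follows. Since the contents of Table 3.2 are not reproduced here, the assignment below (AA to 0000, AC to 0001, ..., TT to 1111) is an assumption and is only one of the 16! possible rules. The function names text_to_dna and dna_to_text are hypothetical.

```python
from itertools import product

BASES = "ACGT"
# Assumed rule, NOT necessarily the one in Table 3.2: pairs in lexicographic
# order are assigned the codes 0000, 0001, ..., 1111.
PAIRS = ["".join(p) for p in product(BASES, repeat=2)]
BITS_TO_PAIR = {format(i, "04b"): pair for i, pair in enumerate(PAIRS)}
PAIR_TO_BITS = {pair: bits for bits, pair in BITS_TO_PAIR.items()}


def text_to_dna(message: str) -> str:
    """Encode text as DNA: 8 bits per byte, then 4 bits per nucleotide pair."""
    bits = "".join(format(byte, "08b") for byte in message.encode("utf-8"))
    return "".join(BITS_TO_PAIR[bits[i:i + 4]] for i in range(0, len(bits), 4))


def dna_to_text(dna: str) -> str:
    """Reverse of text_to_dna: nucleotide pairs back to bits, then to bytes."""
    bits = "".join(PAIR_TO_BITS[dna[i:i + 2]] for i in range(0, len(dna), 2))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


print(text_to_dna("Hi"))        # CAGACGGC
print(dna_to_text("CAGACGGC"))  # Hi
```

A sender and receiver who agree on a different permutation of the 16 codes only need to change the BITS_TO_PAIR table.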
The output DNA of the secret message is converted to amino acids according to the new
distribution of the alphabet with its corresponding new codons. The new distribution is
derived from the standard universal table of amino acids and their DNA codon representation.
Each amino acid is associated with multiple codons, and the message is converted from
DNA to amino acids. An index is therefore needed that identifies which DNA codon was used
for each amino acid, so that the receiver can retrieve the correct codon in the decryption
phase, when the amino acids are converted back to DNA.
The conversion of DNA to amino acids is done by first obtaining its RNA form, replacing
every 'T' with 'U' (the uracil nitrogen base). During the amino acid conversion, the
corresponding ambiguity bits are recorded. Swapping is then applied to each pair of amino
acids. The result is converted back to bases with ambiguity '00' using the same amino acid
table. DNA decoding is the reverse of the encoding steps.
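The role of the ambiguity index can be illustrated with the sketch below. It is an assumption-laden example: it uses the standard genetic code rather than the report's redistributed alphabet table (which is not reproduced here), and it omits the pair swapping step. In the standard table some amino acids have six codons, so an index may exceed 3, whereas the redistributed table limits each entry to four codons. The names dna_to_amino and amino_to_dna are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ('*' = stop).
# Assumption: the report's redistributed table is replaced by this one.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product("UCAG", repeat=3)]
CODON_TO_AA = dict(zip(CODONS, AMINO))

# Group the synonymous codons of each amino acid, keeping table order.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def dna_to_amino(dna: str):
    """Return (amino acid string, ambiguity indices); trailing bases are dropped."""
    rna = dna.replace("T", "U")
    aminos, ambiguity = [], []
    for i in range(0, len(rna) - len(rna) % 3, 3):
        codon = rna[i:i + 3]
        aa = CODON_TO_AA[codon]
        aminos.append(aa)
        ambiguity.append(AA_TO_CODONS[aa].index(codon))  # which synonym was used
    return "".join(aminos), ambiguity


def amino_to_dna(aminos: str, ambiguity) -> str:
    """Rebuild the DNA from amino acids plus the recorded ambiguity indices."""
    rna = "".join(AA_TO_CODONS[aa][k] for aa, k in zip(aminos, ambiguity))
    return rna.replace("U", "T")


print(dna_to_amino("ATGGCCTAA"))         # ('MA*', [0, 1, 0])
print(amino_to_dna("MA*", [0, 1, 0]))    # ATGGCCTAA
```

Without the ambiguity list the receiver could not tell GCC from the other codons of the same amino acid, which is why the indices must travel with the message.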
3.2 DNA Based AES Algorithm
After DNA encoding, the result is a sequence of nitrogen bases. The DNA based
AES algorithm takes data in blocks of 64 bases (128 bits). A key of 128 bits (64 DNA bases)
is used for encryption as well as decryption. AES is a variant of Rijndael which has a fixed
block size of 128 bits and a key size of 128, 192, or 256 bits. By contrast, the Rijndael
specification per se allows block and key sizes that may be any multiple of 32 bits, both
with a minimum of 128 and a maximum of 256 bits. AES operates on a 4x4 column-major order
matrix of bytes, termed the state, although some versions of Rijndael have a larger block
size and additional columns in the state. Most AES calculations are done in a special finite
field. The key size used for an AES cipher specifies the number of repetitions of
transformation rounds that convert the input, called the plaintext, into the final output,
called the cipher text. The number of rounds is as follows:
• 10 rounds for 128-bit keys.
• 12 rounds for 192-bit keys.
• 14 rounds for 256-bit keys.
In each of the Nr rounds, AES applies four basic steps to the plaintext block. In addition,
AES performs a key expansion step that expands the cipher key into the round key schedule,
which provides a round key for each round with a size equal to the block size. The cipher key
is 128, 192, or 256 bits long, for Nr = 10, 12, and 14 respectively. The binary form of the
plaintext is divided into blocks of 128 bits (the STATE), and each block goes through Nr
rounds of encryption. The input key is expanded into Nr + 1 round keys: one for the initial
key addition and one for each of the Nr rounds. The steps of key expansion are:
1- RotWord() takes a word [a0, a1, a2, a3] as input, performs a cyclic permutation, and
returns the word [a1, a2, a3, a0].
2- SubWord() is a function that takes a four-byte input word and applies the S-box to each of
the four bytes to produce an output word (substitution).
3- The round constant word array, Rcon[i], contains the values given by
[x^(i-1), {00}, {00}, {00}], with x^(i-1) being powers of x (x is denoted as {02}) in the
field GF(2^8) (note that i starts at 1, not 0).
4- The last operation includes an XOR with Rcon and then an XOR with temp (the output of the
previous round).
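The RotWord and Rcon steps above can be sketched on ordinary bytes as follows. This is a binary illustration, not the DNA-based version described in the report; SubWord is omitted because it needs the full S-box table. The helper names rot_word, xtime and rcon are hypothetical, and the reduction polynomial x^8 + x^4 + x^3 + x + 1 is the one used by standard AES.

```python
def rot_word(word):
    """Cyclic left rotation: [a0, a1, a2, a3] -> [a1, a2, a3, a0]."""
    return word[1:] + word[:1]


def xtime(b: int) -> int:
    """Multiply by x ({02}) in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    if b & 0x100:       # degree 8 term present, so reduce
        b ^= 0x11B
    return b


def rcon(i: int):
    """Round constant word [x^(i-1), 00, 00, 00]; i starts at 1."""
    value = 1
    for _ in range(i - 1):
        value = xtime(value)
    return [value, 0, 0, 0]


print(rot_word([0x09, 0xCF, 0x4F, 0x3C]))        # [207, 79, 60, 9]
print([hex(rcon(i)[0]) for i in range(1, 11)])
# ['0x1', '0x2', '0x4', '0x8', '0x10', '0x20', '0x40', '0x80', '0x1b', '0x36']
```

In a DNA-based implementation each byte would be written as four bases, but the arithmetic on the underlying bits stays the same.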
To obtain the cipher text, the input undergoes 10 rounds of operation, where
each round involves the following functions:
1) DNAAddRoundKey: A simple operation that XORs the elements of the STATE with
the corresponding round key. The round keys are generated through the key expansion process.
2) DNASubBytes: In this step, each group of 4 DNA bases (one byte) in the STATE is used as
input to the S-box substitution function.
Table 3.3: AES substitution S-box
Within each group, the two pairs of bases (four bits each) act as the row and column entries
into the S-box, and the value found there replaces the group.
4) DNAMixColumn: In the Mix Column function, a predefined column matrix is XORed with each
column of the input as per the round number. This function is present only in the first eight
rounds.
The decryption procedure applies the inverse of all the encryption round functions. During
decryption, the inverse DNA Mix Column is present in eight rounds, that is, in all rounds
except the first and the last.
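The XOR used by DNAAddRoundKey can be sketched directly on bases. The two-bit mapping below (A = 00, C = 01, G = 10, T = 11) is an assumption made for illustration; the report's own coding rule may assign the codes differently. The function name dna_xor is hypothetical.

```python
# Assumed 2-bit code per base; any agreed one-to-one mapping would work.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def dna_xor(state: str, round_key: str) -> str:
    """XOR two equal-length DNA strings base by base."""
    if len(state) != len(round_key):
        raise ValueError("state and round key must have the same length")
    return "".join(
        BITS_TO_BASE[BASE_TO_BITS[s] ^ BASE_TO_BITS[k]]
        for s, k in zip(state, round_key)
    )


mixed = dna_xor("ACGT", "TTTT")
print(mixed)                    # TGCA
print(dna_xor(mixed, "TTTT"))   # ACGT
```

Because XOR is its own inverse, applying the same round key a second time restores the original bases, which is what the decryption rounds rely on.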
Fig. 3.2: AES Mix Columns Operation
Fig. 3.4: DNA Steganography
2. Processing: The DNA produced by the encryption phase (cipherMDNA) is converted back to
binary with the 4-bit binary coding rule to form the cipher binary message (cipherMbin).
The ambiguity (AMBIG) is likewise converted to binary (AMBIGbin). Since the maximum number of
codons corresponding to an amino acid is 4, indexed from 0 to 3, each index can be
represented in at most 2 bits, as shown in Fig. 3.4. A DNA reference sequence is selected
from one of the public databases such as EBI or NCBI and converted to RNA by substituting
each T with U. cipherMbin and AMBIGbin are then hidden using the LSBase method.
The hiding methodology depends on the message bits and on the least significant base
(LSBase) of each codon of the reference sequence S. The LSBase of each codon of S is checked.
If it is a purine base (A or G), it is replaced by G to encode a 1 of the secret message or
by A to encode a 0. If it is a pyrimidine base (C or U), it is replaced by C to encode a 1 or
by U to encode a 0. The LSBase algorithm skips the codons UGA, UGG, AUA and AUG during the
hiding process. According to the standard assignment of codons to amino acids, the amino
acids Met and Trp each have a single codon, AUG and UGG respectively. The stop codon UGA is
skipped as well, because replacing its last base with G would turn it into UGG (Trp).
Finally, Ile is coded by three codons, AUU, AUC and AUA; AUA is skipped, while AUU and AUC
are used in data hiding. The complete data hiding scheme is
shown in Fig. 3.
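The substitution rule above can be sketched as follows, assuming that the LSBase is the third base of each codon. The function names lsbase_hide and lsbase_extract and the n_bits parameter are hypothetical; in the report the receiver separates the streams through the 3:1 pattern rather than through an explicit length.

```python
SKIPPED = {"UGA", "UGG", "AUA", "AUG"}  # codons left untouched, as listed above
PURINES = "AG"


def _codons(rna: str):
    """Split into whole codons; a trailing partial codon is ignored."""
    return [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]


def lsbase_hide(reference_dna: str, bits: str) -> str:
    """Hide a bit string in the last base of each usable codon (RNA output)."""
    rna = reference_dna.replace("T", "U")
    out, pos = [], 0
    for codon in _codons(rna):
        if codon not in SKIPPED and pos < len(bits):
            one = bits[pos] == "1"
            if codon[2] in PURINES:
                last = "G" if one else "A"   # purine stays a purine
            else:
                last = "C" if one else "U"   # pyrimidine stays a pyrimidine
            codon = codon[:2] + last
            pos += 1
        out.append(codon)
    if pos < len(bits):
        raise ValueError("reference sequence too short for the message")
    tail = rna[len(rna) - len(rna) % 3:]     # keep any leftover bases unchanged
    return "".join(out) + tail


def lsbase_extract(stego_rna: str, n_bits: int) -> str:
    """Read back the first n_bits hidden bits from the usable codons."""
    bits = []
    for codon in _codons(stego_rna):
        if len(bits) == n_bits:
            break
        if codon in SKIPPED:
            continue
        bits.append("1" if codon[2] in "GC" else "0")
    return "".join(bits)


stego = lsbase_hide("ATGGCTTGAACGTTTCCA", "1011")
print(stego)                      # AUGGCCUGAACAUUCCCG
print(lsbase_extract(stego, 4))   # 1011
```

The stego sequence has the same length as the reference, and extraction needs no copy of the original reference, which matches the properties claimed for the method.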
3. Output: The output of phase I comprises not only the cipher binary message but also the
ambiguity that results from converting the DNA form of the message to amino acids. The
objective of the proposed method is to hide both the secret message and the ambiguity that
the receiver needs, so that the secret message can be retrieved from the DNA without
additional information.
This is possible because the ratio of the length of the binary cipher message to the length
of the binary ambiguity is 3:1, as will be discussed in subsection C: each codon of three
bases corresponds to six message bits and to one ambiguity index of two bits. Hiding the
message together with the ambiguity in the DNA sequence in a 3:1 ratio therefore avoids
adding extra data to mark the starting position of the message in the DNA reference sequence
and the starting position of the ambiguity.
To hide the cipher text bits and the ambiguity bits, a pattern is formed by taking each
three cipher bits followed by one ambiguity bit. For the pattern to be formed, the number of
cipher bits and the number of ambiguity bits must match this 3:1 ratio. If they do not, the
required '0' bits are appended to the end of the cipher bits or the ambiguity bits. The extra
bits are discarded after extraction. In the extraction phase, the lengths of the key bits and
ambiguity bits can be recovered easily, since they were embedded using the adjacent base
method. After this, the pattern is extracted, and the cipher bits and ambiguity bits are
separated from it.
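The 3:1 pattern formation can be sketched as below. The padding policy (pad whichever stream is short with '0' bits until both fill the same number of groups) follows the description above; the function names interleave and deinterleave are hypothetical, and removal of the padding is left to the caller.

```python
def interleave(cipher_bits: str, ambig_bits: str) -> str:
    """Build the pattern: three cipher bits followed by one ambiguity bit."""
    groups = max(-(-len(cipher_bits) // 3), len(ambig_bits))  # ceiling division
    cipher = cipher_bits.ljust(3 * groups, "0")   # pad cipher stream with '0'
    ambig = ambig_bits.ljust(groups, "0")         # pad ambiguity stream with '0'
    return "".join(cipher[3 * i:3 * i + 3] + ambig[i] for i in range(groups))


def deinterleave(pattern: str):
    """Split the pattern back into (cipher bits, ambiguity bits), padding included."""
    cipher = "".join(pattern[i:i + 3] for i in range(0, len(pattern), 4))
    ambig = "".join(pattern[i + 3:i + 4] for i in range(0, len(pattern), 4))
    return cipher, ambig


pattern = interleave("110101", "01")
print(pattern)                 # 11001011
print(deinterleave(pattern))   # ('110101', '01')
print(interleave("1101", "1")) # 11011000 (both streams padded with '0')
```

Because every fourth bit of the pattern is an ambiguity bit, the receiver needs no marker for where either stream starts.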
Cryptography and steganography together provide a way to transfer data more
confidentially. DNA based encryption and steganography are among the recent techniques
introduced into the cryptographic field. Here a new hybrid method is presented that combines
cryptography and steganography. It achieves multilayer security of the system through DNA
encoding, DNA based AES encryption and DNA steganography. The steganography method adopted
here does not expand the reference DNA sequence, and the embedded data can be extracted
without the original DNA reference sequence.
As future work, the proposed method can be extended with other encryption techniques such
as Blowfish, Twofish and Triple DES alongside DNA based AES encryption, so that an
encryption technique can be selected at random and applied to the selected data. In this way
the security of the data can be enhanced. Because the result after steganography is larger
than the original data, compression techniques can also be applied together with DNA
encryption and steganography.
REFERENCES
[1] Atanu Majumder, Abhishek Majumdar, Tanusree Podder, Nirmalya Kar and Meenakshi
Sharmas, “Secure Data Communication and Cryptography based on DNA based Message
Encoding”, IEEE, pp. 360-363, 2014.
[2] Mona Sabry, Mohammed Hashem, Taymoor Nazmy and Mohammed Essam Khalifa, “Design
of DNA-based Advanced Encryption Standard (AES)”, IEEE, pp. 390-397, 2015.
[3] Ghada Hamed, Mohammed Marey, Safaa Amin El-Sayed and Mohammed Fahmy Tolba,
“Hybrid Technique for Steganography-based on DNA with N-Bits Binary Coding Rule”, pp.
95-102, 2015.
[4] Siddaramappa V. and Ramesh K. B., “Cryptography and Bioinformatics techniques for Secure
Information transmission over Insecure Channels”, IEEE, pp. 137-139, 2015.
[7] Sonal Namdev and Vimal Gupta, “A DNA and Amino-Acids Based Implementation Of Four-Square
Cipher”, Journal of Engineering Research and Applications, ISSN: 2248-9622, vol. 6, Issue
No. 1, (Part-2), pp. 90-96, January 2016.