CHAPTER 1 INTRODUCTION
1.1. Background of the Algorithm
1.2. About AES Algorithm
1.3. Notation and Conventions
1.3.1. Inputs and Outputs
1.3.2. Bytes
1.3.3. Arrays of Bytes
1.3.4. The State
1.3.5. The State as an Array of Columns
1.4. Mathematical Background
1.4.1. Addition
1.4.2. Multiplication
1.4.3. Multiplication by x
1.4.4. Polynomials with Coefficients in GF(2^8)
1.5. Encryption & Decryption
1.6. Cryptography & Types
CHAPTER 2 ENCRYPTION
2.1. Encryption Process
2.2. Bytes Substitution Transformation
2.3. Shift Rows Transformation
2.4. Mixing of Columns Transformation
2.5. Addition of Round Key Transformation
2.6. Key Schedule Generation
CHAPTER 3 DECRYPTION
3.1. Decryption Process
3.2. Inverse Bytes Substitution Transformation
3.3. Inverse Shift Rows Transformation
3.4. Inverse Mixing of Columns Transformation
range of equipment to operate securely. AES is a block cipher with a block length of 128 bits and allows three key lengths: 128, 192, or 256 bits. Unless stated otherwise, this report assumes a key length of 128 bits. Encryption consists of 10 rounds of processing for 128-bit keys, 12 rounds for 192-bit keys, and 14 rounds for 256-bit keys. Except for the last round in each case, all rounds are identical. Each round of processing includes a byte-wise substitution step, a row-wise permutation step, a column-wise mixing step, and the addition of the round key. The order in which these four steps are executed differs between encryption and decryption. To appreciate the processing steps used in a single round, it is best to think of a 128-bit block as a 4 × 4 matrix of bytes, arranged as follows:
Byte(0)   Byte(4)   Byte(8)    Byte(12)
Byte(1)   Byte(5)   Byte(9)    Byte(13)
Byte(2)   Byte(6)   Byte(10)   Byte(14)
Byte(3)   Byte(7)   Byte(11)   Byte(15)
Therefore, the first four bytes of a 128-bit input block occupy the first column in the 4 × 4 matrix of bytes. The next four bytes occupy the second column, and so on. The 4 × 4 matrix of bytes is referred to as the state array. AES also has the notion of a word. A word consists of four bytes, that is, 32 bits. Therefore, each column of the state array is a word, as is each row. Each round of processing works on the input state array and produces an output state array. The output state array produced by the last round is rearranged into a 128-bit output block. Unlike in DES, the decryption algorithm differs substantially from the encryption algorithm. Although the same steps are used overall in encryption and decryption, the order in which they are carried out is different, as mentioned previously.

AES, announced by NIST as a standard in 2001, is a slight variation of the Rijndael cipher invented by two Belgian cryptographers, Joan Daemen and Vincent Rijmen. Whereas AES requires the block size to be 128 bits, the original Rijndael cipher works with any block size (and any key size) that is a multiple of 32 bits, as long as it is at least 128 bits and at most 256 bits. The state array for the different block sizes still has only four rows in the Rijndael cipher. However, the number of columns depends on the size of the block. For example, when the block size is 192 bits, the Rijndael cipher requires a state array of 4 rows and 6 columns.

DES was based on the Feistel network. AES, on the other hand, uses a substitution-permutation network in a more general sense. Each round of processing in AES involves byte-level substitutions followed by word-level permutations. Speaking generally, DES also involves substitutions and permutations, except that the permutations are based on the Feistel notion of dividing the input block into two halves, processing each half separately, and then swapping the two halves. The nature of the substitutions and permutations in AES allows for a fast software implementation of the algorithm.
1.2.2. Bytes
The basic unit of processing in the AES algorithm is a byte, a sequence of eight bits treated as a single entity. The input, output and Cipher Key bit sequences described in Section 1.1 are processed as arrays of bytes, formed by dividing these sequences into groups of eight contiguous bits. For an input, output or Cipher Key denoted by a, the bytes in the resulting array are referenced using one of the two forms $a_n$ or a[n], where n lies in a range that depends on the length of the sequence. For a block length or key length of 128 bits, n lies in the range $0 \le n < 16$. For a key length of 192 bits, n lies in the range $0 \le n < 24$. For a key length of 256 bits, n lies in the range $0 \le n < 32$.

All byte values in the AES algorithm are presented as the concatenation of the individual bit values (0 or 1) between braces in the order {b7, b6, b5, b4, b3, b2, b1, b0}. These bytes are interpreted as finite field elements using a polynomial representation:

$$b_7x^7 + b_6x^6 + b_5x^5 + b_4x^4 + b_3x^3 + b_2x^2 + b_1x + b_0 = \sum_{i=0}^{7} b_i x^i \qquad (1)$$

For example, {01100011} identifies the specific finite field element $x^6 + x^5 + x + 1$.

It is also convenient to denote byte values using hexadecimal notation, with each of two groups of four bits denoted by a single hexadecimal character. The hexadecimal notation scheme is depicted in Figure 1.

Figure 1. Hexadecimal Representation of Bit Patterns
Hence the element {01100011} can be represented as {63}, where the character denoting the four-bit group containing the higher numbered bits is again to the left. Some finite field operations involve one additional bit {b8} to the left of an 8-bit byte. When the b8 bit is present, it appears as {01} immediately preceding the 8-bit byte. For example, a 9-bit sequence is presented as {01} {1b}.
$a_n = \{input_{8n}, input_{8n+1}, \ldots, input_{8n+7}\}$. An example of byte designation and numbering within bytes for a given input sequence is presented in Figure 2.
In the State array, which is denoted by the symbol S, each individual byte has two indices. The first index is the row number r, in the range $0 \le r < 4$, and the second index is the column number c, in the range $0 \le c \le Nb - 1$. Such indexing allows an individual byte of the State to be referred to as $S_{r,c}$ or S[r,c]. For AES, Nb = 4.

At the start, the input, which is the array of bytes symbolized by in0, in1, ..., in15, is copied into the State array. This activity is illustrated in Figure 3. The Encryption or Decryption operations are then conducted on the State array. After manipulation of the State array has completed, its final value is copied to the output, which is an array of bytes symbolized by out0, out1, ..., out15.

Figure 3. State Array Input and Output [1]

At the start of the Encryption or Decryption, the input array is copied to the State array with

$$S[r, c] = in[r + 4c] \quad \text{for } 0 \le r < 4 \text{ and } 0 \le c \le Nb - 1.$$

At the end of the Encryption or Decryption, the State is copied to the output array with

$$out[r + 4c] = S[r, c] \quad \text{for } 0 \le r < 4 \text{ and } 0 \le c \le Nb - 1.$$
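The two copy rules can be illustrated with a minimal Python sketch. The function names `bytes_to_state` and `state_to_bytes` are hypothetical, chosen only for this illustration, and the State is modelled as a list of four rows.

```python
NB = 4  # number of columns in the State (Nb = 4 for AES)

def bytes_to_state(block):
    """Copy a 16-byte input into the State: S[r][c] = in[r + 4c]."""
    if len(block) != 4 * NB:
        raise ValueError("an AES block is 16 bytes long")
    return [[block[r + 4 * c] for c in range(NB)] for r in range(4)]

def state_to_bytes(state):
    """Copy the State to the output: out[r + 4c] = S[r][c]."""
    # Walking column by column, top to bottom, visits indices 0, 1, ..., 15.
    return bytes(state[r][c] for c in range(NB) for r in range(4))

block = bytes(range(16))          # in0 ... in15 = 0 ... 15
state = bytes_to_state(block)
print(state[0])                   # first row: [0, 4, 8, 12]
print(state_to_bytes(state) == block)
```

The first row of the State holds every fourth input byte, which is why row shifts later scramble the byte order of the block.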
1.3.1. Addition
The addition of two elements in a finite field is achieved by adding the coefficients for the corresponding powers in the polynomials for the two elements. The addition is performed with the XOR operation, denoted by $\oplus$, i.e., modulo 2, so that $1 \oplus 1 = 0$, $1 \oplus 0 = 1$, $0 \oplus 1 = 1$ and $0 \oplus 0 = 0$. Consequently, subtraction of polynomials is identical to addition of polynomials.

Alternatively, addition of finite field elements can be described as the modulo-2 addition of corresponding bits in the byte. For two bytes {a7a6a5a4a3a2a1a0} and {b7b6b5b4b3b2b1b0}, the sum is {c7c6c5c4c3c2c1c0}, where each $c_i = a_i \oplus b_i$ and i indexes the corresponding bits. For example, the following expressions are equivalent to one another:

$(x^6 + x^4 + x^2 + x + 1) + (x^7 + x + 1) = x^7 + x^6 + x^4 + x^2$ (Polynomial notation)
$\{01010111\} \oplus \{10000011\} = \{11010100\}$ (Binary notation)
$\{57\} \oplus \{83\} = \{d4\}$ (Hexadecimal notation)
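As a small illustration, the example above can be reproduced in Python, where the `^` operator is bitwise XOR. The variable names are chosen only for this sketch.

```python
a = 0x57  # {01010111} = x^6 + x^4 + x^2 + x + 1
b = 0x83  # {10000011} = x^7 + x + 1

total = a ^ b                 # addition in GF(2^8) is bitwise XOR
print(f"{total:02x}")         # d4
print(f"{total:08b}")         # 11010100

# Subtraction is the same operation: adding b a second time undoes it.
print((total ^ b) == a)
```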
1.3.2. Multiplication
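In the polynomial representation, multiplication in the Galois Field GF(2^8) (denoted by $\bullet$) corresponds to the multiplication of polynomials modulo an irreducible polynomial of degree 8. A polynomial is irreducible if its only divisors are one and itself. For the AES algorithm, this irreducible polynomial is given by equation (2).

$$m(x) = x^8 + x^4 + x^3 + x + 1 \qquad (2)$$

For example, $\{57\} \bullet \{83\} = \{c1\}$, because

$$(x^6 + x^4 + x^2 + x + 1)(x^7 + x + 1) = x^{13} + x^{11} + x^9 + x^8 + x^7 + x^7 + x^5 + x^3 + x^2 + x + x^6 + x^4 + x^2 + x + 1$$
$$= x^{13} + x^{11} + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1$$

and

$$x^{13} + x^{11} + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1 \ \text{modulo} \ (x^8 + x^4 + x^3 + x + 1) = x^7 + x^6 + 1.$$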
The modular reduction by m(x) ensures that the result will be a binary polynomial of degree less than 8, which can be represented by a byte. Unlike addition, there is no simple operation at the byte level that corresponds to this multiplication. The multiplication defined above is associative, and the element {01} is the multiplicative identity. For any non-zero binary polynomial b(x) of degree less than 8, the multiplicative inverse of b(x), denoted $b^{-1}(x)$, can be found. The inverse is found with the extended Euclidean algorithm, which computes polynomials a(x) and c(x) such that

$$b(x)a(x) + m(x)c(x) = 1 \qquad (3)$$

Hence $a(x) \bullet b(x) \bmod m(x) = 1$, which means

$$b^{-1}(x) = a(x) \bmod m(x) \qquad (4)$$

Moreover, for any a(x), b(x) and c(x) in the field, it holds that

$$a(x) \bullet (b(x) + c(x)) = a(x) \bullet b(x) + a(x) \bullet c(x) \qquad (5)$$

It follows that the set of 256 possible byte values, with XOR used as addition and multiplication defined as above, has the structure of the finite field GF(2^8).
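The multiplication and the inverse can be sketched in Python as follows. The names `gf_mul` and `gf_inv` are hypothetical helpers for this illustration. The inverse here is found by exhaustive search over the 255 non-zero bytes rather than by the extended Euclidean algorithm, which is simpler to read but slower.

```python
def gf_mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo m(x)."""
    result = 0
    while b:
        if b & 1:            # current bit of b set: add (XOR) the shifted a
            result ^= a
        a <<= 1              # multiply a by x
        if a & 0x100:        # degree 8 reached: subtract m(x) = 0x11B
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse by brute-force search (not extended Euclid)."""
    if a == 0:
        raise ZeroDivisionError("{00} has no multiplicative inverse")
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

print(f"{gf_mul(0x57, 0x83):02x}")       # c1, as in the worked example
print(gf_mul(0x57, gf_inv(0x57)) == 1)   # an element times its inverse is {01}
```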
1.3.3. Multiplication by x
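Multiplying the binary polynomial defined in equation (1) by the polynomial x results in

$$b_7x^8 + b_6x^7 + b_5x^6 + b_4x^5 + b_3x^4 + b_2x^3 + b_1x^2 + b_0x \qquad (6)$$

The result $x \bullet b(x)$ is obtained by reducing the above result modulo m(x). If b7 equals zero, the result is already in reduced form. If b7 equals one, the reduction is accomplished by subtracting (i.e., XORing) the polynomial m(x). It follows that multiplication by x, which is represented by {00000010} or {02}, can be implemented at the byte level as a left shift and a subsequent conditional bitwise XOR with {1b}. This operation on bytes is denoted by xtime( ). Multiplication by higher powers of x can be implemented by repeated application of xtime( ). By adding intermediate results, multiplication by any constant can be implemented.

For example, $\{57\} \bullet \{13\} = \{fe\}$, because

$\{57\} \bullet \{02\} = xtime(\{57\}) = \{ae\}$
$\{57\} \bullet \{04\} = xtime(\{ae\}) = \{47\}$
$\{57\} \bullet \{08\} = xtime(\{47\}) = \{8e\}$
$\{57\} \bullet \{10\} = xtime(\{8e\}) = \{07\}$

and therefore

$\{57\} \bullet \{13\} = \{57\} \bullet (\{01\} \oplus \{02\} \oplus \{10\}) = \{57\} \oplus \{ae\} \oplus \{07\} = \{fe\}.$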
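A minimal Python sketch of xtime( ) and of multiplication by a constant built from it is given below. The helper name `mul_by_constant` is an assumption made for this illustration only.

```python
def xtime(b):
    """Multiply a byte by x ({02}): left shift, then conditional XOR with {1b}."""
    shifted = (b << 1) & 0xFF
    return shifted ^ 0x1B if b & 0x80 else shifted

def mul_by_constant(b, k):
    """Multiply byte b by constant k by adding up repeated xtime() results."""
    result = 0
    while k:
        if k & 1:        # this power of x is present in k
            result ^= b
        b = xtime(b)     # move on to the next power of x
        k >>= 1
    return result

print([f"{v:02x}" for v in (xtime(0x57), xtime(0xAE), xtime(0x47), xtime(0x8E))])
# ['ae', '47', '8e', '07']
print(f"{mul_by_constant(0x57, 0x13):02x}")   # fe
```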
1.3.4. Polynomials with Coefficients in GF(2^8)

Four-term polynomials can be defined with coefficients that are finite field elements:

$$a(x) = a_3x^3 + a_2x^2 + a_1x + a_0 \qquad (7)$$

which will be denoted as a word in the form [a0, a1, a2, a3]. Note that the polynomials in this section behave somewhat differently from the polynomials used in the definition of finite field elements, even though both types of polynomials use the same indeterminate, x. The coefficients in this section are themselves finite field elements, i.e., bytes, instead of bits; also, the multiplication of four-term polynomials uses a different reduction polynomial, defined below. To illustrate the addition and multiplication operations, let

$$b(x) = b_3x^3 + b_2x^2 + b_1x + b_0 \qquad (8)$$

define a second four-term polynomial. Addition is performed by adding the finite field coefficients of like powers of x. This addition corresponds to an XOR operation between the corresponding bytes in each of the words, in other words, the XOR of the complete word values. Thus, using equations (7) and (8),

$$a(x) + b(x) = (a_3 \oplus b_3)x^3 + (a_2 \oplus b_2)x^2 + (a_1 \oplus b_1)x + (a_0 \oplus b_0) \qquad (9)$$
Multiplication is achieved in two steps. In the first step, the polynomial product $c(x) = a(x) \bullet b(x)$ is algebraically expanded, and like powers are collected to give

$$c(x) = c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0 \qquad (10)$$

where

$c_0 = a_0 \bullet b_0$
$c_1 = a_1 \bullet b_0 \oplus a_0 \bullet b_1$
$c_2 = a_2 \bullet b_0 \oplus a_1 \bullet b_1 \oplus a_0 \bullet b_2$
$c_3 = a_3 \bullet b_0 \oplus a_2 \bullet b_1 \oplus a_1 \bullet b_2 \oplus a_0 \bullet b_3$
$c_4 = a_3 \bullet b_1 \oplus a_2 \bullet b_2 \oplus a_1 \bullet b_3$
$c_5 = a_3 \bullet b_2 \oplus a_2 \bullet b_3$
$c_6 = a_3 \bullet b_3$

The result, c(x), does not represent a four-byte word. Therefore, the second step of the multiplication is to reduce c(x) modulo a polynomial of degree 4, so that the result is a polynomial of degree less than 4. For the AES algorithm, this is accomplished with the polynomial $x^4 + 1$, so that

$$x^i \bmod (x^4 + 1) = x^{i \bmod 4} \qquad (11)$$
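The modular product of a(x) and b(x), denoted by $a(x) \otimes b(x)$, is given by the four-term polynomial d(x), defined as follows:

$$d(x) = d_3x^3 + d_2x^2 + d_1x + d_0 \qquad (12)$$

with

$d_0 = (a_0 \bullet b_0) \oplus (a_3 \bullet b_1) \oplus (a_2 \bullet b_2) \oplus (a_1 \bullet b_3)$
$d_1 = (a_1 \bullet b_0) \oplus (a_0 \bullet b_1) \oplus (a_3 \bullet b_2) \oplus (a_2 \bullet b_3)$
$d_2 = (a_2 \bullet b_0) \oplus (a_1 \bullet b_1) \oplus (a_0 \bullet b_2) \oplus (a_3 \bullet b_3)$
$d_3 = (a_3 \bullet b_0) \oplus (a_2 \bullet b_1) \oplus (a_1 \bullet b_2) \oplus (a_0 \bullet b_3)$

When a(x) is a fixed polynomial, the operation defined in equation (12) can be written in matrix form as the following equation (13).

$$\begin{pmatrix} d_0 \\ d_1 \\ d_2 \\ d_3 \end{pmatrix} = \begin{pmatrix} a_0 & a_3 & a_2 & a_1 \\ a_1 & a_0 & a_3 & a_2 \\ a_2 & a_1 & a_0 & a_3 \\ a_3 & a_2 & a_1 & a_0 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix} \qquad (13)$$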
Because $x^4 + 1$ is not an irreducible polynomial over GF(2^8), multiplication by a fixed four-term polynomial is not necessarily invertible. However, the AES algorithm specifies a fixed four-term polynomial that does have an inverse:

$$a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\} \qquad (14)$$
$$a^{-1}(x) = \{0b\}x^3 + \{0d\}x^2 + \{09\}x + \{0e\} \qquad (15)$$

Another polynomial used in the AES algorithm has $a_0 = a_1 = a_2 = \{00\}$ and $a_3 = \{01\}$, which is the polynomial $x^3$. Inspection of equation (13) above will show that its effect is to form the output word by rotating bytes in the input word. This means that [b0, b1, b2, b3] is transformed into [b1, b2, b3, b0].
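The modular product of equation (12) can be sketched in Python as below. Words are stored as lists [c0, c1, c2, c3]; the function names `gf_mul` and `poly4_mul` and the sample column are assumptions made for this illustration, not part of the original text.

```python
def gf_mul(a, b):
    """Byte multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def poly4_mul(a, b):
    """d(x) = a(x) (x) b(x): multiply four-term polynomials modulo x^4 + 1."""
    d = [0, 0, 0, 0]
    for i in range(4):
        for j in range(4):
            # x^i * x^j reduces to x^((i + j) mod 4), as in equation (11)
            d[(i + j) % 4] ^= gf_mul(a[i], b[j])
    return d

A = [0x02, 0x01, 0x01, 0x03]       # a(x) of equation (14), as [a0, a1, a2, a3]
A_INV = [0x0E, 0x09, 0x0D, 0x0B]   # a^-1(x) of equation (15)
X_CUBED = [0x00, 0x00, 0x00, 0x01]

column = [0xDB, 0x13, 0x53, 0x45]  # sample input word
mixed = poly4_mul(A, column)
print([f"{v:02x}" for v in mixed])            # ['8e', '4d', 'a1', 'bc']
print(poly4_mul(A_INV, mixed) == column)      # a^-1(x) undoes a(x)
print(poly4_mul(X_CUBED, [0, 1, 2, 3]))       # rotation: [1, 2, 3, 0]
```

Multiplying by a(x) and then by its inverse returns the original word, which is the property the fixed polynomial is chosen for.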
Encryption: Encryption is the process of transforming information (referred to as plaintext) to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key. The result of the process is encrypted information (in cryptography, referred to as cipher text). In many contexts, the word
encryption also implicitly refers to the reverse process.
Decryption: Decryption is the reverse process, moving from unintelligible cipher text back to the plain text. This is essentially the encryption algorithm run in reverse. It takes the cipher text and the secret key and produces the original plain text.
Cryptography: Cryptography is the science of using mathematics to encrypt and decrypt data. Cryptography enables you to store sensitive information or transmit it across insecure networks (like the Internet) so that it cannot be read by anyone except the intended recipient. While cryptography is the science of securing data, cryptanalysis is the science of analyzing and breaking secure communication. Classical cryptanalysis involves an interesting combination of analytical reasoning, application of mathematical tools, pattern finding, patience, determination, and luck. Cryptanalysts are also called attackers. Cryptology embraces both cryptography and cryptanalysis. Cryptography can be strong or weak. Cryptographic strength is measured in the time and resources it would require to recover the plaintext. The result of strong cryptography is cipher text that is very difficult to decipher without possession of the appropriate decoding tool.

Cipher text: Cipher text is also known as encrypted or encoded information because it contains a form of the original plaintext that is unreadable by a human or computer without the proper cipher to decrypt it.

Working of cryptography: A cryptographic algorithm, or cipher, is a mathematical function used in the encryption and decryption process. A cryptographic algorithm works in combination with a key (a word, number, or phrase) to encrypt the plaintext. The same plaintext encrypts to different cipher text with different keys. The security of encrypted data is entirely dependent on two things: the strength of the cryptographic algorithm and the secrecy of the key. A cryptographic algorithm, plus all possible keys and all the protocols that make it work, comprise a cryptosystem.

TYPES OF CRYPTOGRAPHIC ALGORITHMS: There are two types of cryptosystems: symmetric key and asymmetric key. Symmetric key encryption uses one key to encrypt and decrypt. Asymmetric key encryption uses two keys; when one key is used to encrypt, the other is used to decrypt.

Symmetric key encryption: Symmetric key cryptography, also called private key cryptography, is the type in which the same key is used to encrypt and decrypt the data. Symmetric encryption was the only type of encryption in use prior to the development of public key encryption. Substitution and transposition are the two basic building blocks of all encryption techniques. A substitution technique is one in which the letters of plain text are replaced by other letters or by numbers or symbols.
Substitution ciphers: Caesar ciphers, monoalphabetic ciphers, Playfair ciphers, Hill ciphers, polyalphabetic ciphers.

Transposition ciphers: Rail fence technique.

Asymmetric key encryption: In asymmetric key cryptography, different keys are used for encrypting and decrypting a message. The asymmetric key algorithms that are most useful are those in which neither key can be deduced from the other. In that case, one key can be made public while the other is kept secure. There are some distinct advantages to this public-key/private-key arrangement, often referred to as public key cryptography: the necessity of distributing secret keys to large numbers of users is eliminated, and the algorithm can be used for authentication as well as for encryption.

Overview of different encryption algorithms:

1. Advanced Encryption Standard (AES): In cryptography, the Advanced Encryption Standard (AES), also known as Rijndael, is a block cipher adopted as an encryption standard by the U.S. government. It has been analyzed extensively and is now used worldwide. AES was announced by the National Institute of Standards and Technology (NIST) after a standardization process. AES is one of the most popular algorithms used in symmetric key cryptography and is available in many different encryption packages. The Rijndael proposal for AES defined a cipher in which the block length and the key length can be independently specified to be 128, 192, or 256 bits. Strictly speaking, AES is not precisely Rijndael (although in practice the names are used interchangeably), as Rijndael supports a larger range of block and key sizes: AES has a fixed block size of 128 bits and a key size of 128, 192 or 256 bits, whereas Rijndael can be specified with key and block sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits.
2. Data Encryption Standard (DES): DES is a cipher (a method for encrypting information) selected as an official Federal Information Processing Standard (FIPS) for the United States in 1976, and which has subsequently enjoyed widespread use internationally. DES is the archetypal block cipher: an algorithm that takes a fixed-length string of plaintext bits and transforms it through a series of complicated operations into a cipher text bit string of the same length. In the case of DES, the block size is 64 bits. DES also uses a key to customize the transformation, so that decryption can only be performed by those who know the particular key used to encrypt. The key ostensibly consists of 64 bits; however, only 56 of these are actually used by the algorithm. Eight bits are used solely for checking parity, and are thereafter discarded. Hence the effective key length is 56 bits, and it is usually quoted as such.

3. Triple DES: In cryptography, Triple DES (TDES) is a block cipher formed from the Data Encryption Standard (DES) cipher by using it three times. When it was found that the 56-bit key of DES is not enough to guard against brute force attacks, TDES was chosen as a simple way to enlarge the key space without a need to switch to a new algorithm. The use of three steps is essential to prevent meet-in-the-middle attacks, which are effective against double DES encryption. Note that DES is not a group; if it were one, the TDES construction would be equivalent to a single DES operation and no more secure. By design, DES, and therefore TDES, suffers from slow performance in software; on modern processors, AES tends to be around six times faster. TDES is better suited to hardware implementations, and indeed where it is still used it tends to be with a hardware implementation, but even there AES outperforms it. Finally, AES offers markedly higher security margins: a larger block size, potentially longer keys and no known public cryptanalytic attacks.

4. RSA Algorithm:
In cryptography, RSA is an algorithm for public-key cryptography. It was the first algorithm known to be suitable for signing as well as encryption, and one of the first great advances in public key cryptography. RSA is widely used in electronic commerce protocols, and is believed to be secure given sufficiently long keys and the use of up-to-date implementations. RSA involves a public key and a private key. The public key can be known to everyone and is used for encrypting messages. Messages encrypted with the public key can only be decrypted using the private key.

Key generation: Finding the large primes p and q is usually done by testing random numbers of the right size with probabilistic primality tests, which quickly eliminate virtually all non-primes. p and q should not be 'too close', lest the Fermat factorization of n = pq be successful; if p - q, for instance, is less than $2n^{1/4}$ (which even for small 1024-bit values of n is about $3 \times 10^{77}$), solving for p and q is trivial. Furthermore, if either p - 1 or q - 1 has only small prime factors, n can be factored quickly by Pollard's p - 1 algorithm, and such values of p or q should therefore be discarded as well. RSA is much slower than DES and other symmetric cryptosystems.

Need of encryption: Encryption, by itself, can protect the confidentiality of messages, but other techniques are still needed to verify the integrity and authenticity of a message, for example a message authentication code (MAC) or digital signatures. Standards and cryptographic software and hardware to perform encryption are widely available, but successfully using encryption to ensure security is a challenging problem. A single slip-up in system design or execution can allow successful attacks. Sometimes an adversary can obtain unencrypted information without directly undoing the encryption.
Applications of encryption: Encryption has long been used by militaries and governments to facilitate secret communication. Encryption is now used in protecting information within many kinds of civilian systems, such as computers, networks (e.g. Internet e-commerce), mobile telephones, wireless microphones, wireless intercom systems, Bluetooth devices and bank automatic teller machines. Encryption is also used in digital rights management to restrict the use of copyrighted material and in software copy protection to protect against reverse engineering and software piracy.
2.1. Encryption Process
The Encryption process of the Advanced Encryption Standard algorithm is presented below, in Figure 4.
This block diagram is generic for AES specifications. It consists of a number of different transformations applied consecutively over the data block bits, in a fixed number of iterations, called rounds. The number of rounds depends on the length of the key used for the encryption process. Figure 3 shows the different steps that are carried out in each round except the last one.
[Figure: steps of one Encryption Round and one Decryption Round; surviving labels: Substitute Bytes, Mix Columns]
STEP 1: This step is called SubBytes, for byte-by-byte substitution during the forward process. It uses a 16 × 16 lookup table to find a replacement byte for a given byte in the input state array. The entries in the lookup table are created by using the notions of multiplicative inverses in GF(2^8) and bit scrambling to destroy the bit-level correlations inside each byte.

STEP 2: This step is called ShiftRows, for shifting the rows of the state array during the forward process. The goal of this transformation is to scramble the byte order inside each 128-bit block.

STEP 3: This step is called MixColumns, for mixing up the bytes in each column separately during the forward process. The goal here is to further scramble the 128-bit input block. The ShiftRows step together with the MixColumns step causes each bit of the cipher text to depend on every bit of the plain text after 10 rounds of processing.

STEP 4: This step is called AddRoundKey, for adding the round key to the output of the previous step during the forward process.
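We next replace the value in each cell by its multiplicative inverse in GF(2^8) based on the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$. The value 00 is replaced by itself since this element has no multiplicative inverse. After that, the following transformation is applied to each bit $b_i$ of the byte stored in a cell of the lookup table:

$$b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i$$

where $c_i$ is the ith bit of a specially designated byte c whose hex value is 0x63 (c7c6c5c4c3c2c1c0 = 01100011). The above bit-mangling step can be better visualized as the following vector-matrix operation. Note that all of the additions in the product of the matrix and the vector are actually XOR operations. (Because of the $A\vec{x} + \vec{b}$ form of this transformation, it is commonly referred to as the affine transformation.)

$$\begin{pmatrix} b'_0 \\ b'_1 \\ b'_2 \\ b'_3 \\ b'_4 \\ b'_5 \\ b'_6 \\ b'_7 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{pmatrix} \oplus \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$$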
The 16 × 16 table created in this manner is called the S-Box. The S-Box is the same for all the bytes in the state array. The extra bit-mangling introduced by the transformation shown above is meant to break the correlation between the bits before the substitution and the bits after the substitution, while maintaining a relationship between the input bytes and the output bytes that can be described by algebraic equations.

This byte-by-byte substitution step is reversed during decryption, meaning that you first apply the reverse of the bit-mangling operation to each byte, as explained next, and then take its multiplicative inverse in GF(2^8). The 16 × 16 lookup table for decryption is constructed by starting out in the same manner as for the encryption lookup table. That is, you place in each cell the byte constructed by joining the row index with the column index. Then, for bit mangling, you carry out the following bit-level transformation in each cell of the table:

$$b'_i = b_{(i+2) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus d_i$$

where $d_i$ is the ith bit of a specially designated byte d whose hex value is 0x05 (d7d6d5d4d3d2d1d0 = 00000101). Finally, you replace the byte in the cell by its multiplicative inverse in GF(2^8).

The bytes c and d are chosen so that the S-box has no fixed points. That is, we do not want S_box(a) = a. Neither do we want S_box(a) = ā, where ā is the bitwise complement of a.
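The construction of both lookup tables can be sketched in Python as follows. All function names are hypothetical and chosen for this illustration. The bit-level formulas are implemented with byte rotations, which is an equivalent way of writing them: rotating a byte left by n places moves bit (i - n) mod 8 into position i. The multiplicative inverse is found by exhaustive search, for brevity.

```python
def gf_mul(a, b):
    """Byte multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Multiplicative inverse by search; 00 is mapped to itself."""
    return next((x for x in range(1, 256) if gf_mul(a, x) == 1), 0)

def rotl8(x, n):
    """Rotate a byte left by n bit positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox_entry(a):
    """Inverse in GF(2^8), then the affine bit-mangling step with c = 0x63."""
    b = gf_inv(a)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

def inv_sbox_entry(a):
    """Reverse bit-mangling with d = 0x05, then the inverse in GF(2^8)."""
    b = rotl8(a, 1) ^ rotl8(a, 3) ^ rotl8(a, 6) ^ 0x05
    return gf_inv(b)

SBOX = [sbox_entry(a) for a in range(256)]
INV_SBOX = [inv_sbox_entry(a) for a in range(256)]

print(f"{SBOX[0x00]:02x}")    # 63: the inverse step leaves 00, so only c remains
print(all(INV_SBOX[SBOX[a]] == a for a in range(256)))
print(any(SBOX[a] == a or SBOX[a] == a ^ 0xFF for a in range(256)))  # False
```

The last line checks the two design goals stated above: no byte maps to itself and no byte maps to its bitwise complement.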
Recall again that the input block is written column-wise. That is, the first four bytes of the input block fill the first column of the state array, the next four bytes the second column, etc. As a result, shifting the rows in the manner indicated scrambles up the byte order of the input block. The first row is left unchanged:

$$s'_{0,j} = s_{0,j}$$

For the bytes in the second row of the state array, this operation can be stated as

$$s'_{1,j} = s_{1,(j+1) \bmod 4}$$
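A minimal Python sketch of the row shifts and of their inverse follows, assuming that row r is rotated left by r positions (so the second row moves by one byte, as in the formula above). The function names are hypothetical, and the State is a list of four rows.

```python
def shift_rows(state):
    """Rotate row r of the State left by r byte positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def inv_shift_rows(state):
    """Rotate row r of the State right by r byte positions."""
    n = len(state[0])
    return [row[n - r:] + row[:n - r] for r, row in enumerate(state)]

# State filled column-wise with the input bytes 0 ... 15.
state = [[0, 4, 8, 12],
         [1, 5, 9, 13],
         [2, 6, 10, 14],
         [3, 7, 11, 15]]

shifted = shift_rows(state)
for row in shifted:
    print(row)
# [0, 4, 8, 12]
# [5, 9, 13, 1]
# [10, 14, 2, 6]
# [15, 3, 7, 11]
print(inv_shift_rows(shifted) == state)
```

After the shift, each column contains one byte from each of the four original columns.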
In the AddRoundKey( ) transformation, each column of the State is XORed with one word of the key schedule:

$$[s'_{0,c}, s'_{1,c}, s'_{2,c}, s'_{3,c}] = [s_{0,c}, s_{1,c}, s_{2,c}, s_{3,c}] \oplus [w_{round \cdot Nb + c}] \quad \text{for } 0 \le c < Nb,$$

Figure 10. Exclusive-OR Operation of State and Cipher Key Words

where $[w_i]$ are the key schedule words described in Section 2.6, and round is a value in the range $0 \le round \le Nr$. In the Encryption, the initial Round Key addition occurs when round = 0, prior to the first application of the round function. The application of the AddRoundKey( ) transformation to the Nr rounds of the encryption occurs when $1 \le round \le Nr$. The action of this transformation is illustrated in Figure 10, where l = round * Nb. The byte address within words of the key schedule was described in Section 1.2.1.
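The column-wise XOR can be sketched in Python as below. The function name `add_round_key`, the layout of the key schedule as a list of 4-byte words, and the sample values are assumptions made for this illustration only.

```python
def add_round_key(state, words, round_number, nb=4):
    """XOR column c of the State with key schedule word w[round * Nb + c]."""
    l = round_number * nb
    return [[state[r][c] ^ words[l + c][r] for c in range(nb)]
            for r in range(4)]

# Hypothetical data: an all-zero State and a made-up 8-word schedule.
state = [[0] * 4 for _ in range(4)]
words = [[16 * i + j for j in range(4)] for i in range(8)]

after = add_round_key(state, words, round_number=1)
print([after[r][0] for r in range(4)])   # first column equals word w4

# XOR is its own inverse, so adding the same round key twice restores the State.
print(add_round_key(after, words, round_number=1) == state)
```

Because the transformation is its own inverse, the same routine serves both encryption and decryption.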
k0   k4   k8    k12
k1   k5   k9    k13
k2   k6   k10   k14
k3   k7   k11   k15

[ w0  w1  w2  w3 ]
The first four bytes of the encryption key constitute the word w0, the next four bytes the word w1, and so on. The algorithm subsequently expands the words [w0, w1, w2, w3] into a 44-word key schedule that can be labeled w0, w1, w2, w3, ..., w43.

Of these, the words [w0, w1, w2, w3] are bitwise XORed with the input block before the round-based processing begins. The remaining 40 words of the key schedule are used four words at a time in each of the 10 rounds. The above two statements are also true for decryption, except that the order of the words in the key schedule is now reversed, as shown in Figure 2: the last four words of the key schedule are bitwise XORed with the 128-bit ciphertext block before any round-based processing begins. Subsequently, the remaining 40 words of the key schedule are used four at a time in each of the ten rounds of processing. Now comes the difficult part: how does the Key Expansion Algorithm expand four words w0, w1, w2, w3 into the 44 words w0, w1, w2, w3, w4, w5, ..., w43?
[Figure: key expansion diagram; surviving labels: w0, w4, w5, w6, w7]
The key expansion algorithm will be explained in the next sub-section with the help of Figure 4. But first note that the key expansion takes place on a four-word to four-word basis, in the sense that each grouping of four words decides what the next grouping of four words will be: the four words of the round key for a given round are generated from the four words of the round key for the previous round. Let's say that we have the four words of a round key:

$w_i \quad w_{i+1} \quad w_{i+2} \quad w_{i+3}$

For these to serve as a round key, i must be a multiple of 4; they then serve as the round key for round i/4. For example, w4, w5, w6, w7 is the round key for round 1, the sequence of words w8, w9, w10, w11 is the round key for round 2, and so on. Now we need to determine the words

$w_{i+4} \quad w_{i+5} \quad w_{i+6} \quad w_{i+7}$

from the words $w_i$, $w_{i+1}$, $w_{i+2}$, $w_{i+3}$. Three of the new words are obtained by a simple XOR:

$w_{i+5} = w_{i+4} \oplus w_{i+1}$
$w_{i+6} = w_{i+5} \oplus w_{i+2}$
$w_{i+7} = w_{i+6} \oplus w_{i+3}$

The first word of the new grouping is obtained as $w_{i+4} = w_i \oplus g(w_{i+3})$, where the function g( ) consists of the following three steps:

(1) Perform a one-byte left circular rotation on the argument 4-byte word.
(2) Perform a byte substitution for each byte of the word returned by the previous step, using the same 16 × 16 lookup table as used in the SubBytes step of the encryption rounds.
(3) XOR the bytes obtained from the previous step with what is known as a round constant. The round constant is a word whose three rightmost bytes are always zero. Therefore, XORing with the round constant amounts to XORing with just its leftmost byte.

The round constant for the ith round is denoted Rcon[i]. Since, by specification, the three rightmost bytes of the round constant are zero, we can write it as shown below. The left hand side of the equation below stands for the round constant to be used in the ith round. The right hand side of the equation says that the rightmost three bytes of the round constant are zero.
$$Rcon[i] = (RC[i], 0, 0, 0)$$

The only non-zero byte in the round constants, RC[i], obeys the following recursion, where the multiplication is carried out in GF(2^8):

$$RC[1] = 1$$
$$RC[j] = 2 \times RC[j-1]$$
The addition of the round constants is for the purpose of destroying any symmetries that may have been introduced by the other steps in the key expansion algorithm.

Our presentation of the key expansion algorithm is based on the assumption of a 128-bit key. As was mentioned earlier, AES calls for a larger number of rounds in Figure 2 when you use a key length other than 128 bits. A key length of 192 bits entails 12 rounds and a key length of 256 bits entails 14 rounds. (The length of the input block remains unchanged at 128 bits.) The key expansion algorithm must obviously generate a longer schedule for the 12 rounds required by a 192-bit key and the 14 rounds required by a 256-bit key. Keeping in mind how we used the key schedule for the case of a 128-bit key, we are going to need 52 words in the key schedule for 192-bit keys and 60 words for 256-bit keys, with round-based processing remaining the same as described earlier. The convenient thing about 128-bit keys is that you can think of the key expansion as being in one-to-one correspondence with the rounds. However, that is no longer the case with, say, 192-bit keys. Now you have to think of key expansion as something that is divorced even conceptually from round-based processing of the input block.

Note that if you change one bit of the encryption key, it will affect the round key for several rounds. The key expansion algorithm ensures that AES has no weak keys. A weak key is a key that reduces the security of a cipher in a predictable manner. For example, DES is known to have weak keys. Weak keys of DES are those that produce identical round keys for each of the 16 rounds. An example of a DES weak key is one that consists of alternating ones and zeros. This sort of weak key in DES causes all the round keys to become identical, which, in turn, causes the encryption to become self-inverting. That is, plain text encrypted and then encrypted again will lead back to the same plain text.
(Since the small number of weak keys of DES are easily recognized, it is not considered to be a problem with that cipher.)
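The 128-bit key expansion described in this section can be sketched in Python as follows. This is an illustrative sketch, not a hardened implementation: the names `g` and `expand_key_128` are hypothetical, the S-Box is recomputed from its definition (with a brute-force inverse) so that the block is self-contained, and the sample key is the 128-bit example key from FIPS-197.

```python
def gf_mul(a, b):
    """Byte multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def rotl8(x, n):
    """Rotate a byte left by n bit positions."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox_entry(a):
    """Multiplicative inverse (00 maps to 00), then the affine step with 0x63."""
    inv = next((x for x in range(1, 256) if gf_mul(a, x) == 1), 0)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63

SBOX = [sbox_entry(a) for a in range(256)]

def g(word, rc):
    """Rotate left by one byte, substitute each byte, XOR RC into the first byte."""
    rotated = word[1:] + word[:1]
    substituted = [SBOX[b] for b in rotated]
    substituted[0] ^= rc
    return substituted

def expand_key_128(key):
    """Expand a 16-byte key into the 44-word schedule w0 ... w43."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rc = 0x01                                  # RC[1] = 1
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:                         # first word of a new grouping
            temp = g(temp, rc)
            rc = gf_mul(rc, 0x02)              # RC[j] = 2 x RC[j-1] in GF(2^8)
        w.append([x ^ y for x, y in zip(w[i - 4], temp)])
    return w

schedule = expand_key_128(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(bytes(schedule[4]).hex())    # a0fafe17
print(len(schedule))               # 44
```

Words 4 to 7 of the schedule form the round key for round 1, words 8 to 11 the round key for round 2, and so on up to words 40 to 43 for round 10.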
33
DECRYPTION 3.1. Decryption Process The Decryption process of Advanced Encryption Standard algorithm is presented below, in figure12.
[Figure 12: decryption process, with the Key Schedule block]
This process is the direct inverse of the encryption process (Chapter 2). Every transformation applied during encryption is applied here in its inverse form. Hence the data and round key of the last encryption round are the inputs to the first decryption round, and the remaining round keys are used in decreasing order.
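The ordering of the inverse steps can be listed explicitly. The following is a minimal sketch that assumes the straightforward inverse cipher ordering of FIPS-197; the function name `inverse_cipher_steps` is hypothetical, and the arrangement in the report's Figure 12 may differ in presentation.

```python
def inverse_cipher_steps(nr=10):
    """List the (step, round_key_index) sequence for AES decryption.

    nr is the number of rounds (10, 12, or 14). Steps that do not use a
    round key carry None as their second element.
    """
    # Decryption starts with the LAST round key of the schedule.
    steps = [("AddRoundKey", nr)]
    # Full rounds, with round keys used in decreasing order.
    for rnd in range(nr - 1, 0, -1):
        steps += [
            ("InvShiftRows", None),
            ("InvSubBytes", None),
            ("AddRoundKey", rnd),
            ("InvMixColumns", None),
        ]
    # The final round omits InvMixColumns and uses round key 0.
    steps += [("InvShiftRows", None), ("InvSubBytes", None), ("AddRoundKey", 0)]
    return steps


if __name__ == "__main__":
    for name, key_index in inverse_cipher_steps(10):
        print(name if key_index is None else f"{name}({key_index})")
```

Reading only the `AddRoundKey` entries gives the round keys in the order Nr, Nr-1, ..., 0, which is the "decreasing order" described above.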
3.2. Inverse Bytes Substitution Transformation
The Inverse Byte Substitution Transformation, InvSubBytes( ), is the inverse of the byte substitution transformation, in which the inverse S-box (Figure 14) is applied to each byte of the State. The inverse S-box is obtained by applying the inverse of the affine transformation in equation (16), followed by taking the multiplicative inverse in $\mathrm{GF}(2^8)$.
Figure 13. Application of the Inverse S-box to Each Byte of the State
Figure 14. Inverse S-box Values for All 256 Combinations in Hexadecimal Format
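The construction of the inverse S-box described above (inverse affine transformation, then multiplicative inverse) can be sketched in a few lines. This is an illustration under stated assumptions, not the report's implementation: all function names are hypothetical, the field polynomial is the AES polynomial x^8 + x^4 + x^3 + x + 1, and the affine maps are written in their common rotate-and-XOR form.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:        # reduce when the degree reaches 8
            a ^= 0x11B
        b >>= 1
    return product


def gf_inv(a):
    """Multiplicative inverse in GF(2^8); 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):     # a^254 equals a^(-1) because a^255 equals 1
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sub_byte(x):
    """Forward S-box: multiplicative inverse, then affine transformation."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


def inv_sub_byte(y):
    """Inverse S-box: inverse affine transformation, then multiplicative inverse."""
    b = rotl8(y, 1) ^ rotl8(y, 3) ^ rotl8(y, 6) ^ 0x05
    return gf_inv(b)


# Full 256-entry table, comparable to the values tabulated in Figure 14.
INV_SBOX = [inv_sub_byte(y) for y in range(256)]
```

A hardware design would normally store the 256 values as a lookup table (ROM) rather than compute them at run time; the computation here only shows where the table entries come from.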
3.3. Inverse Shift Rows Transformation
The Inverse Shift Rows Transformation, InvShiftRows( ), is the inverse of the ShiftRows( ) transformation presented in Chapter 2. The bytes in the last three rows of the State are cyclically shifted over different numbers of bytes. The first row, r = 0, is not shifted. The bottom three rows are cyclically shifted by Nb - shift(r, Nb) bytes, where the shift value shift(r, Nb) depends on the row number and is explained in Section 2.3. Specifically, the InvShiftRows( ) transformation proceeds as follows:

$$
S'_{r,\,(c + \mathrm{shift}(r, Nb)) \bmod Nb} = S_{r,c} \quad \text{for } 0 < r < 4 \text{ and } 0 \le c < Nb
$$
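The row shifts can be illustrated with a short sketch. It assumes the state is held as a list of four rows of Nb = 4 bytes each and that shift(r, Nb) = r, as in AES; the function names are hypothetical.

```python
def shift_rows(state):
    """Forward ShiftRows: rotate row r left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """InvShiftRows: rotate row r right by r positions (left by Nb - r)."""
    nb = len(state[0])
    return [row[nb - r:] + row[:nb - r] for r, row in enumerate(state)]


if __name__ == "__main__":
    # Row-major view of a state whose bytes are numbered column by column.
    state = [
        [0, 4, 8, 12],
        [1, 5, 9, 13],
        [2, 6, 10, 14],
        [3, 7, 11, 15],
    ]
    for row in inv_shift_rows(state):
        print(row)
```

Applying `inv_shift_rows` after `shift_rows` returns the original state, which is the defining property of the inverse transformation.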
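Figure 15. Inverse Cyclic Shift of the Last Three Rows of the State [1]

3.4. Inverse Mixing of Columns Transformation
The Inverse Mixing of Columns Transformation, InvMixColumns( ), is the inverse of the MixColumns( ) transformation presented in Chapter 2. InvMixColumns( ) operates on the State column by column, treating each column as a four-term polynomial as described in Section 1.4.4. The columns are considered as polynomials over $\mathrm{GF}(2^8)$ and multiplied modulo $x^4 + 1$ with a fixed polynomial $a^{-1}(x)$, given by

$$
a^{-1}(x) = \{0b\}x^3 + \{0d\}x^2 + \{09\}x + \{0e\}
$$

This can be written as a matrix multiplication:

$$
\begin{bmatrix} S'_{0,c} \\ S'_{1,c} \\ S'_{2,c} \\ S'_{3,c} \end{bmatrix}
=
\begin{bmatrix}
0e & 0b & 0d & 09 \\
09 & 0e & 0b & 0d \\
0d & 09 & 0e & 0b \\
0b & 0d & 09 & 0e
\end{bmatrix}
\begin{bmatrix} S_{0,c} \\ S_{1,c} \\ S_{2,c} \\ S_{3,c} \end{bmatrix}
\quad \text{for } 0 \le c < Nb
$$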
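As a result of this multiplication, the four bytes in a column are replaced as follows:

$$
\begin{aligned}
S'_{0,c} &= (\{0e\} \bullet S_{0,c}) \oplus (\{0b\} \bullet S_{1,c}) \oplus (\{0d\} \bullet S_{2,c}) \oplus (\{09\} \bullet S_{3,c}) \\
S'_{1,c} &= (\{09\} \bullet S_{0,c}) \oplus (\{0e\} \bullet S_{1,c}) \oplus (\{0b\} \bullet S_{2,c}) \oplus (\{0d\} \bullet S_{3,c}) \\
S'_{2,c} &= (\{0d\} \bullet S_{0,c}) \oplus (\{09\} \bullet S_{1,c}) \oplus (\{0e\} \bullet S_{2,c}) \oplus (\{0b\} \bullet S_{3,c}) \\
S'_{3,c} &= (\{0b\} \bullet S_{0,c}) \oplus (\{0d\} \bullet S_{1,c}) \oplus (\{09\} \bullet S_{2,c}) \oplus (\{0e\} \bullet S_{3,c})
\end{aligned}
$$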
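The four equations translate directly into code. The sketch below is illustrative only: the names `gf_mul`, `INV_MIX`, and `inv_mix_column` are hypothetical, and multiplication is carried out in GF(2^8) with the AES reduction polynomial 0x11B.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:        # reduce when the degree reaches 8
            a ^= 0x11B
        b >>= 1
    return product


# Coefficient matrix of InvMixColumns; each row is a rotation of 0e 0b 0d 09.
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def inv_mix_column(column):
    """Apply InvMixColumns to one 4-byte column [S0c, S1c, S2c, S3c]."""
    out = []
    for row in INV_MIX:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)   # XOR is addition in GF(2^8)
        out.append(acc)
    return out


def inv_mix_columns(state_columns):
    """Apply the transformation to every column of the state."""
    return [inv_mix_column(col) for col in state_columns]
```

In hardware the multiplications by {09}, {0b}, {0d}, and {0e} are usually built from repeated multiplication by x (a shift and conditional XOR), since each constant is a sum of powers of two.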
INTRODUCTION OF VLSI
Very-large-scale integration (VLSI) is the process of creating integrated circuits by combining thousands of transistor-based circuits into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device. The term is no longer as common as it once was, as chips have increased in complexity into the hundreds of millions of transistors.

Overview
The first semiconductor chips held one transistor each. Subsequent advances added more and more transistors and, as a consequence, more individual functions or systems were integrated over time. The first integrated circuits held only a few devices, perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or more logic gates on a single device. This level is now known retrospectively as "small-scale integration" (SSI). Improvements in technique led to devices with hundreds of logic gates, known as medium-scale integration (MSI), and then to large-scale integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has moved far past this mark, and today's microprocessors have many millions of gates and hundreds of millions of individual transistors.

At one time, there was an effort to name and calibrate various levels of large-scale integration above VLSI. Terms like Ultra-large-scale Integration (ULSI) were used. But the huge number of gates and transistors available on common devices has rendered such fine distinctions moot. Terms suggesting greater than VLSI levels of integration are no longer in widespread use. Even VLSI is now somewhat quaint, given the common assumption that all microprocessors are VLSI or better.

As of early 2008, billion-transistor processors are commercially available, an example of which is Intel's Montecito Itanium chip. This is expected to become more commonplace as semiconductor fabrication moves from the current generation of 65 nm processes to the next 45 nm generations (while experiencing new challenges such as increased variation across process corners). Another notable example is NVIDIA's 280 series GPU. This chip is notable in that its 1.4 billion transistors, capable of a teraflop of performance, are almost entirely dedicated to logic (Itanium's transistor count is largely due to its 24 MB L3 cache). Current designs, as opposed to the earliest devices, use extensive design automation and automated logic synthesis to lay out the transistors, enabling higher levels of complexity in the resulting logic functionality. Certain high-performance logic blocks like the SRAM cell, however, are still designed by hand to ensure the highest efficiency (sometimes by bending or breaking established design rules to obtain the last bit of performance by trading stability).

What is VLSI?
VLSI stands for "Very Large Scale Integration". This is the field which involves packing more and more logic devices into smaller and smaller areas.
Put simply, an integrated circuit is many transistors on one chip.
VLSI covers the design and manufacturing of extremely small, complex circuitry using modified semiconductor material.
An integrated circuit (IC) may contain millions of transistors, each a few µm in size.
Applications are wide ranging: most electronic logic devices.

VLSI - Very Large-Scale Integration (10^5 to 10^7)
ULSI - Ultra Large-Scale Integration (>= 10^7)

While we will concentrate on integrated circuits, the properties of integrated circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system.
Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:

Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, compared to the millimeter or centimeter scales of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.

Speed. Signals can be switched between logic 0 and logic 1 much more quickly within a chip than they can between chips. Communication within a chip can occur hundreds of times faster than communication between chips on a printed circuit board. The high speed of on-chip circuits is due to their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.

Power consumption. Logic operations within a chip also take much less power. Once again, lower power consumption is largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.

VLSI and systems
These advantages of integrated circuits translate into advantages at the system level:

Smaller physical size. Smallness is often an advantage in itself; consider portable televisions or handheld cellular telephones.

Lower power consumption. Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding may be feasible, too.

Reduced cost. Reducing the number of components, the power supply requirements, cabinet costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from custom ICs can be less, even though the individual ICs cost more than the standard parts they replace.

Understanding why integrated circuit technology has such a profound influence on the design of digital systems requires understanding both the technology of IC manufacturing and the economics of ICs and digital systems.
Applications
Electronic systems in cars
Digital electronics that control VCRs
Transaction processing systems, ATMs
Personal computers and workstations
Medical electronic systems, etc.
Applications of VLSI
Electronic systems now perform a wide variety of tasks in daily life. In some cases they have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are usually smaller, more flexible, and easier to service. In other cases electronic systems have created totally new applications. Electronic systems perform a variety of tasks, some of them visible, some more hidden:

Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.
Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high-definition data rates, on the fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.
Personal computers and workstations provide word processing, financial analysis, and games. Computers include both central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.
Medical electronic systems measure bodily functions and perform complex processing algorithms to warn about unusual conditions.

The availability of these complex systems, far from overwhelming consumers, only creates demand for even more complex systems. The growing sophistication of applications continually pushes the design and manufacturing of integrated circuits and electronic systems to new levels of complexity. Perhaps the most striking characteristic of this collection of systems is its variety: as systems become more complex, we build not a few general-purpose computers but an ever wider range of special-purpose systems. Our ability to do so is a testament to our growing mastery of both integrated circuit manufacturing and design, but the increasing demands of customers continue to test the limits of design and manufacturing.
2. VERILOG HDL
Verilog HDL is a hardware description language that can be used to model a digital system at many levels of abstraction, ranging from the algorithmic level to the gate level to the switch level. The complexity of the digital system being modeled could vary from that of a simple gate to a complete electronic digital system, or anything in between. The digital system can be described hierarchically, and timing can be explicitly modeled within the same description.

The Verilog HDL language includes capabilities to describe the behavioral nature of a design, the dataflow nature of a design, a design's structural composition, delays, and a waveform generation mechanism including aspects of response monitoring and verification, all modeled using one single language. In addition, the language provides a programming language interface through which the internals of a design can be accessed during simulation, including the control of a simulation run.

The language not only defines the syntax but also defines very clear simulation semantics for each language construct. Therefore, models written in this language can be verified using a Verilog simulator. The language inherits many of its operator symbols and constructs from the C programming language. Verilog HDL provides an extensive range of modeling capabilities, some of which are quite difficult to comprehend initially. However, a core subset of the language is quite easy to learn and use. This is sufficient to model most applications.
2.1 History:
The Verilog HDL language was first developed by Gateway Design Automation in 1983 as a hardware modeling language for their simulator product. At that time, it was a proprietary language. Because of the popularity of the simulator product, Verilog HDL gained acceptance as a usable and practical language by a number of designers. In an effort to increase the popularity of the language, the language was placed in the public domain in 1990. Open Verilog International (OVI) was formed to promote Verilog. In 1992 OVI decided to pursue standardization of Verilog HDL as an IEEE standard. This effort was successful and the language became an IEEE standard in 1995. The complete standard is described in the Verilog Hardware Description Language Reference Manual. The standard is called IEEE Std 1364-1995.
Verilog HDL also has built-in logic functions such as & (bitwise-and) and | (bitwise-or).
High-level programming language constructs such as conditionals, case statements, and loops are available in the language.
The notions of concurrency and time can be explicitly modeled.
Powerful file read and write capabilities are provided.
The language is non-deterministic under certain situations; that is, a model may produce different results on different simulators. For example, the ordering of events on an event queue is not defined by the standard.
2.3 SYNTHESIS:
Synthesis is the process of constructing a gate-level netlist from a register-transfer level (RTL) model of a circuit described in Verilog HDL. Figure 2-2 shows such a process. A synthesis system may, as an intermediate step, generate a netlist that is comprised of register-transfer level blocks such as flip-flops, arithmetic-logic units, and multiplexers, interconnected by wires. In such a case, a second program called the RTL module builder is necessary. The purpose of this builder is to build, or acquire from a library of predefined components, each of the required RTL blocks in the user-specified target technology.

Having produced a gate-level netlist, a logic optimizer reads in the netlist and optimizes the circuit for the user-specified area and timing constraints. These area and timing constraints may also be used by the module builder for appropriate selection or generation of RTL blocks. In this book, we assume that the target netlist is at the gate level. The logic gates used in the synthesized netlists are described in Appendix B. The module building and logic optimization phases are not described in this book.

The above figure shows the basic elements of Verilog HDL and the elements used in hardware. A mapping mechanism or a construction mechanism has to be provided that translates the Verilog HDL elements into their corresponding hardware elements, as shown in Figure 2-3.
RESULTS
ENCRYPTION WF
DECRYPTION WF