Theory:
A cryptographic hash function is a hash function that takes an input (or 'message') of any length and
returns a fixed-size bit string, usually written as a hexadecimal string. This output is called the
'hash value', 'message digest', 'digital fingerprint' or 'digest', and is sometimes loosely called a 'checksum'.
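The ideal hash function has three main properties: it is easy to compute the hash value of any given
message, it is computationally infeasible to find a message that produces a given hash value, and it is
computationally infeasible to find two different messages that produce the same hash value.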
Uses:
Functions with these properties are used as hash functions for a variety of purposes, not
only in cryptography. Practical applications include message integrity checks, digital
signatures, authentication, and various information security applications.
A hash function takes a string of any length as input and produces a fixed-length string which acts
as a kind of "signature" for the data provided. Someone who knows only the "hash value" cannot
feasibly recover the original message, while anyone who holds the original message can show that the
"hash value" was created from it by hashing the message again and comparing the results.
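The fixed-length property can be checked with Python's standard hashlib module. This is only an illustrative sketch; the helper name md5_hex is an assumption introduced here and is not part of the experiment's own implementation.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string (hypothetical helper)."""
    return hashlib.md5(data).hexdigest()

short_digest = md5_hex(b"a")
long_digest = md5_hex(b"a" * 10_000)

# Inputs of very different lengths give digests of the same length:
# 32 hexadecimal characters, i.e. 128 bits.
print(len(short_digest), short_digest)
print(len(long_digest), long_digest)
```

Verifying a message against a known hash value is then just a matter of hashing the message again and comparing the two strings.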
A cryptographic hash function should behave as much as possible like a random function while
still being deterministic and efficiently computable. A cryptographic hash function is considered
"insecure" from a cryptographic point of view if either of the following is computationally
feasible: finding a message that matches a given hash value (a preimage), or finding two different
messages that have the same hash value (a collision).
An attacker who can carry out either computation can use it to substitute an unauthorized message
for an authorized one.
Ideally, it should be computationally infeasible to find two different messages whose digests ("hash
values") are identical. Also, one would not want an attacker to be able to learn anything useful about a
message from its digest. Of course, the attacker learns at least one piece of information, the
digest itself, which lets the attacker recognise when the same message occurs again.
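Both points (a repeated message is recognisable, while a slightly different message gives an unrelated digest) can be observed with a small sketch. The helper names md5_int and differing_bits are assumptions made for this illustration, and hashlib is used in place of the hand-written implementation.

```python
import hashlib

def md5_int(data: bytes) -> int:
    """MD5 digest of data as a 128-bit integer (hypothetical helper)."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")

def differing_bits(m1: bytes, m2: bytes) -> int:
    """Count the bit positions in which the digests of m1 and m2 differ."""
    return bin(md5_int(m1) ^ md5_int(m2)).count("1")

base = b"The quick brown fox jumps over the lazy dog"

# Same message: identical digest, so no bits differ.
print(differing_bits(base, base))
# One extra character: a large share of the 128 digest bits changes.
print(differing_bits(base, base + b"."))
```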
In various standards and applications, the two most commonly used hash functions
have historically been MD5 and SHA-1.
MD5:
The MD5 message-digest algorithm is a widely used hash function producing a 128-bit
hash value. Although MD5 was initially designed to be used as a cryptographic hash function,
it has been found to suffer from extensive vulnerabilities. It can still be used as a checksum to
verify data integrity, but only against unintentional corruption. It remains suitable for other
non-cryptographic purposes, for example for determining the partition for a particular key in a
partitioned database.
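As a rough sketch of the partitioning use case, a key can be mapped to one of a fixed number of partitions by reducing its MD5 digest modulo the partition count. The function partition_for, the key format and the partition count of 8 are all assumptions chosen for illustration; real databases use their own schemes.

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in range(num_partitions) (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce it.
    return int.from_bytes(digest[:8], "big") % num_partitions

for key in ("user:1", "user:2", "user:3"):
    print(key, "->", partition_for(key, 8))
```

Because the hash is deterministic, the same key always lands in the same partition; no security property of MD5 is relied on here.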
The weaknesses of MD5 have been exploited in the field, most infamously by the Flame
malware in 2012. The CMU Software Engineering Institute considers MD5 essentially
"cryptographically broken and unsuitable for further use".[4]
MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function MD4,
and was specified in 1992 as RFC 1321.
MD5 processes a variable-length message into a fixed-length output of 128 bits. The input
message is broken up into 512-bit blocks (sixteen 32-bit words each); the message
is first padded so that its length is divisible by 512. The padding works as follows: first a single bit, 1,
is appended to the end of the message. This is followed by as many zeros as are required to bring
the length of the message up to 64 bits fewer than a multiple of 512. The remaining 64 bits are filled
with the length of the original message in bits, modulo 2^64.
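The padding rule can be sketched for byte-oriented messages as follows. The function name md5_pad is an assumption; the little-endian encoding of the 64-bit length follows RFC 1321.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad message to a multiple of 64 bytes (512 bits) as MD5 requires."""
    bit_length = (8 * len(message)) % 2**64   # original length in bits, modulo 2^64
    padded = bytearray(message)
    padded.append(0x80)                       # a single 1 bit followed by seven 0 bits
    while len(padded) % 64 != 56:             # 56 bytes = 448 bits (mod 512)
        padded.append(0x00)
    padded += bit_length.to_bytes(8, "little")  # 64-bit length field
    return bytes(padded)

for n in (0, 3, 55, 56, 64):
    print(n, "bytes ->", len(md5_pad(b"a" * n)), "bytes after padding")
```

Note that padding is always added, so a 56-byte message already needs a second block to make room for the 1 bit and the length field.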
The main MD5 algorithm operates on a 128-bit state, divided into four 32-bit words,
denoted A, B, C, and D. These are initialized to certain fixed constants. The main algorithm then
uses each 512-bit message block in turn to modify the state. The processing of a message block
consists of four similar stages, termed rounds; each round is composed of 16 similar operations
based on a non-linear function F, modular addition, and left rotation.
Working
The MD5 algorithm first pads the input and divides it into blocks of 512 bits each. The last 64 bits
of the final block record the length of the original input in bits. Padding bits (a single 1 followed
by zeros) are always inserted before this length field, so that the padded input fills a whole number
of 512-bit blocks.
Next, each block is divided into 16 words of 32 bits each. These are denoted as M0 ... M15.
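A minimal sketch of this step, assuming the block is held as 64 bytes: RFC 1321 interprets each group of four bytes as a 32-bit word with the low-order byte first (little-endian). The function name block_to_words is introduced here for illustration.

```python
import struct

def block_to_words(block: bytes) -> list:
    """Split one 64-byte block into sixteen little-endian 32-bit words M[0..15]."""
    if len(block) != 64:
        raise ValueError("an MD5 block is exactly 64 bytes (512 bits)")
    return list(struct.unpack("<16I", block))

M = block_to_words(bytes(range(64)))
print(hex(M[0]), hex(M[15]))
```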
The buffer
MD5 uses a buffer that is made up of four words that are each 32 bits long. These words are
called A, B, C and D. They are initialized to the following values, written in hexadecimal with the
low-order byte first:
word A: 01 23 45 67
word B: 89 ab cd ef
word C: fe dc ba 98
word D: 76 54 32 10
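Since the bytes above are listed low-order first, reading them as little-endian integers gives the 32-bit constants that most implementations hard-code. The dictionary names below are assumptions made for this sketch.

```python
# Byte strings exactly as listed above (low-order byte first).
INIT_BYTES = {
    "A": "01 23 45 67",
    "B": "89 ab cd ef",
    "C": "fe dc ba 98",
    "D": "76 54 32 10",
}

# Interpret each 4-byte string as a little-endian 32-bit integer.
init_words = {
    name: int.from_bytes(bytes.fromhex(hex_bytes), "little")
    for name, hex_bytes in INIT_BYTES.items()
}

for name, value in init_words.items():
    print(name, "=", f"0x{value:08x}")
```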
The table
MD5 further uses a table K (called T in RFC 1321) that has 64 elements. Element number i is denoted Ki.
The table is computed beforehand to speed up the computations. The elements are computed using
the mathematical sine function, with the argument in radians:
Ki = floor(2^32 × abs(sin(i + 1))), for i = 0 ... 63
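The table can be generated in a few lines. This sketch assumes zero-based indexing (i = 0 ... 63, matching the pseudocode in this report), so the sine argument is i + 1; RFC 1321 itself numbers the entries from 1 to 64.

```python
import math

# K[i] is the integer part of 2^32 * |sin(i + 1)|, with the argument in radians.
K = [math.floor(2**32 * abs(math.sin(i + 1))) for i in range(64)]

print(len(K))
print(f"K[0]  = 0x{K[0]:08x}")
print(f"K[63] = 0x{K[63]:08x}")
```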
In addition MD5 uses four auxiliary functions that each take as input three 32-bit words and
produce as output one 32-bit word. They apply the logical operators and, or, not and xor to the
input bits.
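The four auxiliary functions, as defined in RFC 1321, can be written directly with Python's bitwise operators. The MASK constant is an addition needed only because Python integers are unbounded, so the result of "not" must be truncated to 32 bits.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits

def F(x, y, z):
    """Round 1: bitwise 'if x then y else z'."""
    return ((x & y) | (~x & z)) & MASK

def G(x, y, z):
    """Round 2: bitwise 'if z then x else y'."""
    return ((x & z) | (y & ~z)) & MASK

def H(x, y, z):
    """Round 3: bitwise parity (xor) of the three inputs."""
    return (x ^ y ^ z) & MASK

def I(x, y, z):
    """Round 4: y xor (x or not z)."""
    return (y ^ (x | ~z)) & MASK

print(hex(F(MASK, 0x12345678, 0x9abcdef0)))  # x is all ones, so F selects y
```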
The contents of the four buffer words (A, B, C and D) are now mixed with the words of the input,
using the four auxiliary functions (F, G, H and I). There are four rounds, each consisting of 16 basic
operations. One operation is illustrated in the figure below.
Fig: Each iteration in a round
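The figure shows how the auxiliary function F is applied to the buffer words B, C and D; the result is
added to A together with message word Mi and constant Ki. The item "<<<s" denotes a left rotation
(circular shift) by s bits, not a plain left shift: the bits pushed out at the top re-enter at the bottom.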
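The difference between a rotation and a plain shift on 32-bit words can be seen in a short sketch. The function name left_rotate is chosen here for illustration.

```python
def left_rotate(x: int, s: int) -> int:
    """Rotate the 32-bit value x left by s bits (0 < s < 32)."""
    x &= 0xFFFFFFFF
    return ((x << s) | (x >> (32 - s))) & 0xFFFFFFFF

value = 0x80000001
print(hex(left_rotate(value, 1)))        # the top bit wraps around to bit 0
print(hex((value << 1) & 0xFFFFFFFF))    # a plain shift simply discards the top bit
```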
Output:
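After all rounds have been performed for every block, and the result of each block has been added to
the values the buffer held before that block, the buffers A, B, C and D contain the MD5 digest of
the original input. The digest is read in the order A, B, C, D, with the low-order byte of each word first.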
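Turning the final buffer words into the familiar 32-character hexadecimal digest can be sketched as follows. The function name words_to_hex is an assumption; the example words are the final state for the empty message, whose well-known MD5 digest is d41d8cd98f00b204e9800998ecf8427e.

```python
import struct

def words_to_hex(a: int, b: int, c: int, d: int) -> str:
    """Serialize the four 32-bit words in order A, B, C, D, low-order byte first."""
    return struct.pack("<4I", a, b, c, d).hex()

# Final buffer words for the empty message.
print(words_to_hex(0xd98c1dd4, 0x04b2008f, 0x980980e9, 0x7e42f8ec))
```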
Algorithm:
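//Note: all variables are unsigned 32-bit and wrap modulo 2^32 when calculating
var int[64] s, K
var int i
//s specifies the per-operation rotation amounts
s[ 0..15] := { 7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22 }
s[16..31] := { 5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20 }
s[32..47] := { 4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23 }
s[48..63] := { 6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21 }
//Use the integer part of the sines of integers (in radians) as constants:
for i from 0 to 63
    K[i] := floor(2^32 × abs(sin(i + 1)))
end for
//Initialize variables:
var int a0 := 0x67452301   //A
var int b0 := 0xefcdab89   //B
var int c0 := 0x98badcfe   //C
var int d0 := 0x10325476   //D
//Pre-processing: append a single 1 bit,
// where the first bit is the most significant bit of the byte.[49]
append "1" bit to message
//Pre-processing: pad with zeros, then append the original length
append "0" bit until message length in bits ≡ 448 (mod 512)
append original length in bits mod 2^64 to message
//Process the message in successive 512-bit chunks:
for each 512-bit chunk of padded message
    break chunk into sixteen 32-bit words M[j], 0 ≤ j ≤ 15
    //Initialize hash value for this chunk:
    var int A := a0
    var int B := b0
    var int C := c0
    var int D := d0
    //Main loop:
    for i from 0 to 63
        var int F, g
        if 0 ≤ i ≤ 15 then
            F := (B and C) or ((not B) and D)
            g := i
        else if 16 ≤ i ≤ 31 then
            F := (D and B) or ((not D) and C)
            g := (5×i + 1) mod 16
        else if 32 ≤ i ≤ 47 then
            F := B xor C xor D
            g := (3×i + 5) mod 16
        else if 48 ≤ i ≤ 63 then
            F := C xor (B or (not D))
            g := (7×i) mod 16
        F := F + A + K[i] + M[g]
        A := D
        D := C
        C := B
        B := B + leftrotate(F, s[i])
    end for
    //Add this chunk's hash to the result so far:
    a0 := a0 + A
    b0 := b0 + B
    c0 := c0 + C
    d0 := d0 + D
end for
var char digest[16] := a0 append b0 append c0 append d0 //(output is in little-endian)
//leftrotate function definition
leftrotate (x, c)
    return (x << c) binary or (x >> (32 - c))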
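import math

# Per-operation left-rotation amounts, 16 per round.
rotate_amounts = [7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
                  5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20,
                  4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
                  6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21]

# Table K: integer part of 2^32 * |sin(i + 1)|, i in radians.
constants = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

# Initial buffer values A, B, C, D.
init_values = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476]

# Auxiliary functions F, G, H, I, one per round.
functions = 16*[lambda b, c, d: (b & c) | (~b & d)] + \
            16*[lambda b, c, d: (d & b) | (~d & c)] + \
            16*[lambda b, c, d: b ^ c ^ d] + \
            16*[lambda b, c, d: c ^ (b | ~d)]

# Index of the message word used by operation i.
index_functions = 16*[lambda i: i] + \
                  16*[lambda i: (5*i + 1)%16] + \
                  16*[lambda i: (3*i + 5)%16] + \
                  16*[lambda i: (7*i)%16]

def left_rotate(x, amount):
    x &= 0xFFFFFFFF
    return ((x << amount) | (x >> (32 - amount))) & 0xFFFFFFFF

def md5(message):
    message = bytearray(message)  # copy the input into a mutable buffer
    orig_len_in_bits = (8 * len(message)) & 0xffffffffffffffff
    # Padding: a 1 bit, zeros up to 56 bytes (mod 64), then the 64-bit length.
    message.append(0x80)
    while len(message) % 64 != 56:
        message.append(0)
    message += orig_len_in_bits.to_bytes(8, byteorder='little')

    hash_pieces = init_values[:]
    for chunk_ofst in range(0, len(message), 64):
        a, b, c, d = hash_pieces
        chunk = message[chunk_ofst:chunk_ofst+64]
        for i in range(64):
            f = functions[i](b, c, d)
            g = index_functions[i](i)
            to_rotate = a + f + constants[i] + \
                int.from_bytes(chunk[4*g:4*g+4], byteorder='little')
            new_b = (b + left_rotate(to_rotate, rotate_amounts[i])) & 0xFFFFFFFF
            a, b, c, d = d, new_b, b, c
            if i % 16 == 15:
                # Show the buffer contents at the end of each round.
                print((i + 1) // 16, "A:", a, "B:", b, "C:", c, "D:", d, sep=' ')
        # Add this chunk's result to the running hash.
        for i, val in enumerate([a, b, c, d]):
            hash_pieces[i] += val
            hash_pieces[i] &= 0xFFFFFFFF
    # Combine the four words into one 128-bit integer (A in the low-order bits).
    return sum(x << (32 * i) for i, x in enumerate(hash_pieces))

def md5_to_hex(digest):
    raw = digest.to_bytes(16, byteorder='little')
    return '{:032x}'.format(int.from_bytes(raw, byteorder='big'))

if __name__=='__main__':
    demo = [
        b"The quick brown fox jumps over the lazy dog.",
    ]
    for message in demo:
        print(md5_to_hex(md5(message)), ' <= "', message.decode('ascii'), '"', sep='')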
r"""
C:\Users\Asus\Documents\Sem VI\CSS\EXP4>python md5rosetta.py
1 A: 4156205459 B: 1288067225 C: 1515065400 D: 3684338458
2 A: 2960212908 B: 2098643899 C: 4106848806 D: 823743074
3 A: 4014926640 B: 4178434990 C: 833246599 D: 1855571174
4 A: 1522841315 B: 757998855 C: 356813730 D: 3231239785
e4d909c290d0fb1ca068ffaddf22cbd0 <= "The quick brown fox jumps over the lazy dog."
"""
Conclusion/Analysis Report: Thus the MD5 hashing algorithm has been implemented and
the 128-bit hash value for the given plaintext has been calculated.