
Hashing

Hashing is a technique for performing insertion, deletion and find operations on a data structure. Synonymously, it is a method for accessing data records.

Note
Hashing is implemented with a hash table. For example, consider a list of names: John Smith, Sarah Jones and Roger Adams. A hash table serves as an index for these records, in which a formula (the hash function) computes a numeric value for each name. You might get something like:

1345873 John Smith
3097905 Sarah Jones
4060964 Roger Adams

To search for the record containing Sarah Jones, you just reapply the formula, which directly yields the index key of the record. This is much more efficient than searching through all the records until the matching one is found.

General Idea
A hash table is similar in structure to an array of some fixed size that contains keys. Its cells are numbered from 0 to table size - 1, and a key is typically a string with some associated value. Each key is mapped to some number in the range 0 to table size - 1 and placed in the appropriate cell. This mapping is referred to as the hash function. Ideally, the function should ensure that any two distinct keys get different cells. For example, consider a hash table structure with cells 0 to 9:

[Figure: a hash table with cells numbered 0 to 9, in which two cells hold the keys Tom and Raj with the values 25 and 28.]
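The notes do not give the formula that maps a name to a cell, so the following is only a minimal sketch of the idea. The function name `hash_key`, the polynomial scheme and the multiplier 31 are all assumptions chosen for illustration, not the method used in the original figure.

```python
def hash_key(key: str, table_size: int) -> int:
    """Map a string key to a cell index in the range 0 .. table_size - 1."""
    value = 0
    for ch in key:
        # Illustrative polynomial step; the multiplier 31 is an assumption.
        value = (value * 31 + ord(ch)) % table_size
    return value


TABLE_SIZE = 10
table = [None] * TABLE_SIZE

# Place each (key, value) pair in the cell chosen by the hash function.
# Collisions are not handled here; they are the subject of the next sections.
for name, number in [("Tom", 25), ("Raj", 28)]:
    index = hash_key(name, TABLE_SIZE)
    if table[index] is None:
        table[index] = (name, number)

# Finding a record means reapplying the same function to the key.
print(table[hash_key("Tom", TABLE_SIZE)])
```

Because the same key always produces the same index, a lookup needs no scan of the other cells.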

While inserting a key or data item into a hash table, a collision may occur: two keys hash to the same cell. This has to be resolved. Two techniques are used to overcome this problem:

1. Separate Chaining
2. Open Addressing

Separate Chaining
Separate chaining keeps a list of all elements that hash to the same value. For example, assume a hash table of size 10 in which the elements 0, 1, 4, 25, 16, 9, 36, 49, 64, 81 have to be stored. These elements can be stored using the hash function

Hash(x) = x mod 10

Mr.A.Thomas Paul Roy, SL-CSE

pauli.dgl@gmail.com

By using separate chaining, the hash table can be pictured as:

Cell   Chain
0      0
1      1 -> 81
2
3
4      4 -> 64
5      25
6      16 -> 36
7
8
9      9 -> 49
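The chained table can be reproduced with a short sketch. It uses Hash(x) = x mod 10 from the notes and takes the keys as the ten squares from 0 to 81, with 64 as shown in the figure. The function names `hash_value`, `find` and `insert` are hypothetical, and appending new elements at the end of a chain is an assumption, since the notes do not say where in the list a new element goes.

```python
TABLE_SIZE = 10


def hash_value(x: int) -> int:
    """Hash(x) = x mod 10, as in the notes."""
    return x % TABLE_SIZE


def find(table, x):
    """Return the position of x within its chain, or None if it is absent."""
    chain = table[hash_value(x)]
    for position, element in enumerate(chain):
        if element == x:
            return position
    return None


def insert(table, x):
    """Insert x into its chain unless it is already there."""
    if find(table, x) is None:
        # Assumption: new elements go at the end of the chain.
        table[hash_value(x)].append(x)


# One empty list (chain) per cell.
table = [[] for _ in range(TABLE_SIZE)]
for key in [0, 1, 4, 25, 16, 9, 36, 49, 64, 81]:
    insert(table, key)

for cell, chain in enumerate(table):
    print(cell, chain)
```

Only the one chain selected by the hash function is traversed, so both operations cost time proportional to the length of that chain rather than to the number of stored keys.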

Basic operations
There are two operations: Find and Insertion.

To perform a Find, use the hash function to determine which list to traverse, do the comparisons in that list, and return the position of the element if it is found.

To perform an Insertion, traverse down the appropriate list to check whether the element is already in place; if it is not, insert it as a new element in the list.

Open Addressing
Demerit of the separate chaining hash table: managing the lists takes more time as the table grows, because more pointers are needed. Allocating new cells for the lists also takes time, which tends to slow the algorithm down.

This problem can be solved by using open addressing, an alternative to resolving collisions with linked lists. In an open addressing scheme, if a collision occurs, alternative cells are tried until an empty cell is found. More formally, the cells H0(x), H1(x), H2(x), ... are tried in succession, where Hi(x) can be written as:

Hi(x) = (Hash(x) + F(i)) mod tablesize, with F(0) = 0

where F is the collision resolution strategy.

Strategies to be used
There are three strategies:

1. Linear Probing
2. Quadratic Probing
3. Double Hashing

Linear Probing
In this strategy, cells are tried sequentially, wrapping around at the end of the table, in search of an empty cell. The function F is linear and is typically written as F(i) = i.

For example, consider the following set of keys to be inserted into a hash table of size 10 using the hash function x mod 10:

X = {89, 18, 49, 58, 69}

Create a table with 5 columns, one for each insertion, because the total number of keys given is 5.

Cell   After 89   After 18   After 49   After 58   After 69
0                            49         49         49
1                                       58         58
2                                                  69
3
4
5
6
7
8                 18         18         18         18
9      89         89         89         89         89

Step 1: Insert 89
Apply x mod 10: 89 mod 10 = 9, so try cell 9 of the table. The cell is empty, so no collision occurs. Store 89 there; it appears in cell 9 from the 1st column to the last.

Step 2: Insert 18
Apply x mod 10: 18 mod 10 = 8, so try cell 8 of the table. The cell is empty, so no collision occurs. Store 18 there; it appears in cell 8 from the 2nd column to the last.

Step 3: Insert 49
Apply x mod 10: 49 mod 10 = 9, so try cell 9 of the table. The cell is not empty, so a collision occurs. Store the key in the next available cell, which (wrapping around) is cell 0; it appears there from the 3rd column to the last.

Step 4: Insert 58
Apply x mod 10: 58 mod 10 = 8, so try cell 8 of the table. The cell is not empty, so a collision occurs. Cells 9 and 0 are also occupied, so the next available cell is cell 1; 58 appears there from the 4th column to the last.

Step 5: Insert 69
Apply x mod 10: 69 mod 10 = 9, so try cell 9 of the table. The cell is not empty, so a collision occurs. Cells 0 and 1 are also occupied, so the next available cell is cell 2; 69 appears there in the 5th column.
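The five insertion steps can be replayed with a small sketch of linear probing, using Hi(x) = (Hash(x) + F(i)) mod tablesize with F(i) = i and Hash(x) = x mod 10. The function name `linear_probe_insert` and the use of `None` to mark an empty cell are assumptions made for illustration.

```python
TABLE_SIZE = 10


def linear_probe_insert(table, key):
    """Insert key using F(i) = i and return the cell it lands in."""
    home = key % TABLE_SIZE  # Hash(x) = x mod 10
    for i in range(TABLE_SIZE):
        cell = (home + i) % TABLE_SIZE  # Hi(x) = (Hash(x) + i) mod tablesize
        if table[cell] is None:  # None marks an empty cell (assumption)
            table[cell] = key
            return cell
    raise RuntimeError("hash table is full")


table = [None] * TABLE_SIZE
for step, key in enumerate([89, 18, 49, 58, 69], start=1):
    cell = linear_probe_insert(table, key)
    print(f"Step {step}: insert {key} -> cell {cell}")

print(table)
```

The final list corresponds to the last column of the worked table: 49, 58 and 69 sit in cells 0, 1 and 2, and 18 and 89 stay in cells 8 and 9.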

Key points to remember
For every successive insertion of a key, a free cell has to be searched for, and all the occupied cells on the way have to be examined each time. Blocks of occupied cells therefore form, and any key that hashes into such a block makes it longer. This effect is referred to as primary clustering.

Even when the table is large enough that a free cell can always be found, the time needed to find it can become large.

If clustering were not a problem, the scheme could be treated as a random collision resolution strategy, in which each probe is independent of the others.
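To make the clustering effect concrete, the sketch below counts how many cells are examined for each key of the example set X = {89, 18, 49, 58, 69} under linear probing with x mod 10. The function name `probes_needed` is hypothetical, and counting the first cell tried as probe number 1 is a convention chosen here.

```python
def probes_needed(keys, table_size=10):
    """Return, for each key, the number of cells examined before it is stored."""
    table = [None] * table_size
    counts = []
    for key in keys:
        for i in range(table_size):
            cell = (key % table_size + i) % table_size  # linear probing, F(i) = i
            if table[cell] is None:
                table[cell] = key
                counts.append(i + 1)  # the first cell tried counts as probe 1
                break
        else:
            raise RuntimeError("hash table is full")
    return counts


print(probes_needed([89, 18, 49, 58, 69]))  # [1, 1, 2, 4, 4]
```

The last two keys each need four probes even though the table is only half full, because they land in the block of occupied cells that has built up around cells 8, 9, 0 and 1.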

Quadratic Probing
It is a collision resolution strategy that eliminates the primary clustering problem of linear probing. In this case, the function F is quadratic and is typically written as F(i) = i^2.

For example, consider the same set of keys to be inserted into the hash table using the hash function x mod 10:

X = {89, 18, 49, 58, 69}

Cell   After 89   After 18   After 49   After 58   After 69
0                            49         49         49
1
2                                       58         58
3                                                  69
4
5
6
7
8                 18         18         18         18
9      89         89         89         89         89

First find x mod 10 for each key. If no collision occurs, store the key in that empty cell. Otherwise, count the probes from the collision point, starting at i = 0, and try the cell that lies F(i) = i^2 positions past the collision point (wrapping around the table) for i = 1, 2, ... until an empty cell is found.

For example, to store 58: 58 mod 10 returns 8. Cell 8 is not empty, so a collision occurs. Counting cell 8 as i = 0, the probe for i = 1 is cell (8 + 1) mod 10 = 9, which is also occupied. The probe for i = 2 lies 2^2 = 4 cells past cell 8, that is, cell (8 + 4) mod 10 = 2, which is empty. So 58 is stored in the 4th cell after 18, which is cell 2.
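The quadratic probing example can be checked with a minimal sketch that applies Hi(x) = (Hash(x) + i^2) mod tablesize with Hash(x) = x mod 10. The function name `quadratic_probe_insert` and the `None` marker for an empty cell are assumptions. Limiting the search to `TABLE_SIZE` attempts is also an assumption made for simplicity: quadratic probing is not guaranteed to reach every cell, so the error can be raised even when some cell is still free.

```python
TABLE_SIZE = 10


def quadratic_probe_insert(table, key):
    """Insert key using F(i) = i * i and return the cell it lands in."""
    home = key % TABLE_SIZE  # Hash(x) = x mod 10
    for i in range(TABLE_SIZE):
        cell = (home + i * i) % TABLE_SIZE  # Hi(x) = (Hash(x) + i^2) mod tablesize
        if table[cell] is None:  # None marks an empty cell (assumption)
            table[cell] = key
            return cell
    # Assumption: give up after TABLE_SIZE attempts.
    raise RuntimeError("no empty cell found within the probe limit")


table = [None] * TABLE_SIZE
for key in [89, 18, 49, 58, 69]:
    cell = quadratic_probe_insert(table, key)
    print(f"insert {key} -> cell {cell}")

print(table)
```

Compared with linear probing on the same keys, 58 and 69 jump over the occupied run and land in cells 2 and 3, leaving cell 1 free.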
