The document discusses hash table organization and properties of hash functions. It describes three possibilities when searching for an entry - it may be occupied by the searched symbol, occupied by another symbol causing a collision, or empty. It presents an algorithm to handle collisions by using multiple hash functions. The key properties of good hash functions are also outlined.
Copyright: Attribution Non-Commercial (BY-NC)
i.e., e is a function of s.

• Three possibilities exist concerning the predicted entry:
  – the entry may be occupied by s,
  – the entry may be occupied by some other symbol, or
  – the entry may be empty.
• The second case (s ≠ se, where se is the symbol occupying entry e) is called a collision.
• Following a collision, the search continues with a new prediction.
• In the third case, s is entered in the predicted entry.

Algorithm
1. e := h(s)
2. Exit with success if s = se, and with failure if entry e is unoccupied.
3. Repeat steps 1 and 2 with different functions h', h'', etc.

The function h is called the hashing function.

Notations
n  : number of entries in the table
f  : number of occupied entries in the table
k  : number of distinct symbols in the source language
kp : number of symbols used in some source program
Sp : set of symbols used in some source program
N  : address space of the table
K  : key space of the system
Kp : key space of a program

Properties of a Hash Function
• A hashing function has the property that 1 ≤ h(symb) ≤ n.
• If k ≤ n, we can select a one-to-one function as the hashing function h.
• This eliminates collisions in the symbol table, since entry number e given by e = h(s) can only be occupied by s.
• We refer to this organization as direct entry organization.
• However, k is a very large number in practice, so a one-to-one function would require a very large symbol table.
• For good performance it is adequate if the hashing function implements a mapping Kp → N.
• The effectiveness of a hashing organization depends on the average value of ps.
• For a given size of the hash table, the value of ps can be expected to increase with the value of kp.

Hashing Functions
• While hashing, the representation of s is treated as a binary number.
• The hashing function performs a numerical transformation on this number to obtain e.
• Let the representation have b bits and let the host computer use m-bit arithmetic.
• Let rs denote the m-bit number derived from the representation of s.
• If b ≤ m, the representation of s can be padded with 0's. If b > m, the representation of s is split into pieces of m bits each, and bitwise OR operations are performed on the pieces to obtain rs.

A hashing function h should possess the following properties to ensure good search performance:
1. h should not be sensitive to the symbols in Sp. Thus, the value of ps should depend only on kp.
2. h should execute reasonably fast.

Two classes of such hashing functions:
• Multiplication function: analogous to the functions used in random number generators, h(s) = (a × rs + b) mod 2^m. The table size should be a power of 2, say 2^g, so that the lower-order g bits of h(s) can be used as e.
• Division function: h(s) = (remainder of rs/n) + 1, where n is the size of the table. If n is a prime number, this is called prime number hashing.

Comparison of the two classes:
• A multiplication function has the advantage of being slightly faster, but suffers from the drawback that the table size has to be a power of 2.
• Prime division hashing is slower, but has the advantage that prime numbers are more closely spaced than powers of 2. This provides a wider choice of table sizes.

Collision Handling
Two approaches:
• Rehashing: accommodate the colliding entry elsewhere in the hash table.
• Overflow chaining: accommodate the colliding entry in a separate table.

Rehashing
• Uses a sequence of hashing functions h1, h2,… to resolve collisions.
• If a collision occurs at hi(s), a new prediction is made at hi+1(s).
• Sequential rehashing: hi+1(s) = hi(s) mod n + 1
• Drawback: a colliding entry accommodated elsewhere may itself cause more collisions. This may lead to clustering of entries in the table.

Overflow Chaining
• Avoids the problem of clustering by accommodating the colliding entries in a separate table called the overflow table.
• Thus, a search which encounters a collision in the primary table has to be continued in the overflow table.
• A pointer field is added to the entries of the primary and overflow tables. Entry layout: Symbol | Other info | Pointer
• Drawback: extra memory requirement.
• An organization called a scatter table is often used to reduce the memory requirements.
• In a scatter table, the hash table contains only pointers, and all the symbol table entries are stored in the overflow table.
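
The search algorithm, the division function and sequential rehashing described above can be tried out with a short Python sketch. It is only an illustration under stated assumptions, not the implementation behind these slides: the names fold, division_hash and SymbolTable are hypothetical, symbols are assumed to be ASCII strings, m = 16 and n = 7 are arbitrary choices, the m-bit pieces are taken from the low-order end, and deletion is not handled.

```python
def fold(symbol: str, m: int = 16) -> int:
    """Reduce the bit representation of symbol to an m-bit number rs."""
    value = int.from_bytes(symbol.encode("ascii"), "big")  # b-bit representation
    mask = (1 << m) - 1
    rs = 0
    while value:
        rs |= value & mask  # bitwise OR of the m-bit pieces, as the text states
        value >>= m
    return rs               # if b <= m, the value is implicitly zero-padded


def division_hash(rs: int, n: int) -> int:
    """h(s) = (remainder of rs/n) + 1, an entry number in 1..n."""
    return rs % n + 1


class SymbolTable:
    """Hash table with sequential rehashing (hypothetical helper class)."""

    def __init__(self, n: int = 7):  # n = 7 is prime: prime number hashing
        self.n = n
        self.entries = [None] * (n + 1)  # index 0 unused; entries are 1..n

    def _predictions(self, symbol: str):
        """Yield h1(s), h2(s), ... for at most n distinct entries."""
        e = division_hash(fold(symbol), self.n)
        for _ in range(self.n):
            yield e
            e = e % self.n + 1  # sequential rehashing: h_{i+1}(s) = h_i(s) mod n + 1

    def search(self, symbol: str):
        for e in self._predictions(symbol):
            if self.entries[e] == symbol:
                return e     # success: s = se
            if self.entries[e] is None:
                return None  # failure: entry e is unoccupied
        return None          # every entry probed, symbol absent

    def enter(self, symbol: str) -> int:
        for e in self._predictions(symbol):
            if self.entries[e] is None or self.entries[e] == symbol:
                self.entries[e] = symbol
                return e
        raise OverflowError("symbol table is full")


table = SymbolTable(7)
print(table.enter("a"))   # 7  (97 mod 7 = 6, plus 1)
print(table.enter("h"))   # 1  (collides with "a" at entry 7, wraps to entry 1)
print(table.enter("b"))   # 2  (predicted entry 1 is now taken by "h")
print(table.search("h"))  # 1
print(table.search("z"))  # None
```

In this run "h" collides with "a" and is placed in entry 1, which then forces "b" out of its own predicted entry. This is the clustering drawback noted under Rehashing. Folding with OR follows the wording of the text; exclusive OR is a common alternative for folding.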