Techniques of hashing and collision resolution
Given an array whose elements lie in a limited range from -MAX to +MAX (so it may contain both positive and non-positive numbers), the task is to check in O(1) time whether a given number is present in the array.
We use each value as an index into a large array, adding MAX to the value so that negative numbers also map to valid indices. Searching for and inserting an element then each take O(1) time.
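The idea can be sketched as follows. This is a minimal illustration, assuming integer keys in the range [-MAX, MAX]; the class name `DirectAddressTable` and its method names are hypothetical and chosen only for this example. Each value v is stored at index v + MAX.

```python
class DirectAddressTable:
    """Presence table for integers in the range [-max_value, max_value]."""

    def __init__(self, max_value):
        self.max_value = max_value
        # One flag per possible value; the index is value + max_value,
        # so -max_value maps to 0 and +max_value maps to 2 * max_value.
        self.present = [False] * (2 * max_value + 1)

    def _index(self, value):
        if not -self.max_value <= value <= self.max_value:
            raise ValueError("value is outside the supported range")
        return value + self.max_value

    def insert(self, value):
        # O(1): a single array write.
        self.present[self._index(value)] = True

    def search(self, value):
        # O(1): a single array read.
        return self.present[self._index(value)]


table = DirectAddressTable(1000)
for value in [-7, 0, 42, 1000]:
    table.insert(value)
print(table.search(-7), table.search(5))
```

The trade-off is memory: the array needs 2 * MAX + 1 cells regardless of how many elements are actually stored, which is why this only works for a limited range.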
Separate Chaining: The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value.
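Let us consider a simple hash function, “key mod 7”, and the sequence of keys 50, 700, 76, 85, 92, 73, 101. Keys 50, 85 and 92 all hash to 1, and keys 73 and 101 both hash to 3, so each of these groups shares one chain.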
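The example above can be traced with a short sketch. It assumes integer keys and uses Python lists in place of linked lists for brevity; the class name `ChainedHashTable` is illustrative, not part of the original text.

```python
class ChainedHashTable:
    """Hash table with separate chaining, using "key mod size" as the hash."""

    def __init__(self, size=7):
        self.size = size
        # One chain per cell; a Python list stands in for a linked list.
        self.buckets = [[] for _ in range(size)]

    def _hash(self, key):
        return key % self.size

    def insert(self, key):
        chain = self.buckets[self._hash(key)]
        if key not in chain:  # avoid storing duplicates
            chain.append(key)

    def search(self, key):
        # Only the chain for this key's hash value is scanned.
        return key in self.buckets[self._hash(key)]


chained = ChainedHashTable(7)
for key in [50, 700, 76, 85, 92, 73, 101]:
    chained.insert(key)
print(chained.buckets)
# [[700], [50, 85, 92], [], [73, 101], [], [], [76]]
```

Cells 2, 4 and 5 stay empty while cell 1 holds three keys, which shows both the unused space and the chain growth discussed for this technique.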
Advantages:
1) Simple to implement.
2) The hash table never fills up; we can always add more elements to a chain.
3) Less sensitive to the hash function and the load factor.
4) Mostly used when it is unknown how many keys will be inserted or deleted, and how frequently.
Disadvantages:
1) Cache performance of chaining is poor because keys are stored in linked lists. Open addressing provides better cache performance because everything is stored in the same table.
2) Wasted space: some cells of the hash table are never used.
3) If a chain becomes long, search time can become O(n) in the worst case.
4) Uses extra space for links.
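Open Addressing: All elements are stored in the hash table itself, so a collision is resolved by probing for another free slot. Probing is done in the following ways: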
a) Linear Probing: We probe linearly for the next free slot. The gap between two successive probes is typically 1, so for a table of size S the i-th probe checks slot (hash(x) + i) mod S.
b) Quadratic Probing: In the i-th iteration we look at the slot i^2 positions past the home slot, that is, slot (hash(x) + i^2) mod S for a table of size S.
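Both probing strategies can be compared with one small sketch. It assumes integer keys, the “key mod 7” hash and the key sequence from the chaining example, and no deletions; the class name `OpenAddressingTable` and the `quadratic` flag are hypothetical names introduced only for this illustration.

```python
class OpenAddressingTable:
    """Open-addressing hash table with linear or quadratic probing."""

    def __init__(self, size=7, quadratic=False):
        self.size = size
        self.quadratic = quadratic
        self.slots = [None] * size  # None marks an empty slot

    def _probe(self, key):
        # Yield candidate slots: home + i (linear) or home + i^2 (quadratic).
        home = key % self.size
        for i in range(self.size):
            offset = i * i if self.quadratic else i
            yield (home + offset) % self.size

    def insert(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is None or self.slots[idx] == key:
                self.slots[idx] = key
                return idx
        # Quadratic probing may not visit every slot, so this can be
        # raised even when the table still has empty slots.
        raise OverflowError("no free slot found in the probe sequence")

    def search(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is None:
                return False  # an empty slot ends the probe sequence
            if self.slots[idx] == key:
                return True
        return False


keys = [50, 700, 76, 85, 92, 73, 101]
linear = OpenAddressingTable(7)
quadratic = OpenAddressingTable(7, quadratic=True)
for key in keys:
    linear.insert(key)
    quadratic.insert(key)
print(linear.slots)     # [700, 50, 85, 92, 73, 101, 76]
print(quadratic.slots)  # [700, 50, 85, 73, 101, 92, 76]
```

With linear probing, the colliding keys 85 and 92 land in the slots directly after 50, forming a contiguous cluster; with quadratic probing, 92 jumps to slot 5, which spreads the colliding keys further apart.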