Sunday, April 3, 2016

Summarized Hashing Fundamentals

Need of Hashing: Searching is typically the most frequent data operation. Linear Search on an unsorted array has O(n) while Binary Search on a sorted array has O(log n) time complexities. Hash Search further optimizes it to have a O(1) asymptotic time complexity.

Hash Function: A function that ensures uniform distribution of hash values & least collisions.

Collision: Collision happens when multiple keys hash to the same bucket.

Collision Avoidance Techniques:
  • Direct Chaining/Separate Chaining: Collided elements are added to Linked list.
  • Open Addressing: Involves Linear or Quadratic Probing.
    • Linear Probing: [( H(x) + f(i) ) mod ArrLen] preferred in simple scenarios.
    • Quadratic Probing: [( H(x) + f(i^2) ) mod ArrLen] preferred in simple scenarios.
  • Closed Addressing: Involves Double Hashing.
    • Double Hashing: [( H1(x) + (i * H2(x) ) ) mod ArrLen] where H2(x) = [PrimeNum – x mod PrimeNum] is for complex scenarios.
Collision Avoidance Tips:
  • Use separate bucket for null key (special case).
  • Avoid clustering of values one besides other.
  • Choose twice array size than the values to be inserted.
  • Choose Prime Number (smaller than table size) as the initial size of data-structure.
  • For double-hashing, ensure that second Hash function (H2(x)) shouldn't evaluate to zero & should probe all locations.
Collision Avoidance Example: Following example is the actual HashMap implementation code.

 public V put(K key, V value) {  
   if (key == null)  
     return putForNullKey(value); // Used separate bucket for null-key.  
   int hash = hash(key.hashCode()); // Called Hash-function to get hash-value.  
   int i = indexFor(hash, table.length);  
   for (Entry e = table[i]; e != null; e = e.next) {  
     Object k;  
     if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {  
       V oldValue = e.value;  
       e.value = value;  
       e.recordAccess(this);  
       return oldValue;  
     }  
   }  
   modCount++;  
   addEntry(hash, key, value, i); // Stored element using above hash-value.  
   return null;  
 }