Monday, November 10, 2014

What is the purpose of universal hashing?


Suppose you are participating in programming competition. The task for you is to write a hash function that will process an input data as fast as possible. You can see and test other programmers' solutions as well as they can see your source code and generate the worst-case input. The winner is the programmer who will write the fastest hash function.

Problem is that any fixed hash function is vulnerable to such terrible worst-case behavior. Even if you write sophisticated hash function it will be possible to choose n keys that all hash to the same slot, so complexity of searching in average case will be O(n) instead of O(1).

Instead of writing sophisticated hash functions you might pay attention to randomness. Randomness not in the implementation of hash function, but in choosing hash function from set of multiple hash functions.

The approach when hash function is choosing randomly in a way that is independent of the keys that are actually going to be stored is called universal hashing.

So, purpose of universal hashing is to eliminate the vulnerability that can be used to slow down your program execution.