Algorithms: Design and Analysis, Part 1 - Universal Hashing
Real-world systems can be paralyzed by attackers who exploit badly designed hash functions. What are two solutions to this security problem?
1) Use a cryptographic hash function (e.g. SHA-2) because it makes it infeasible to reverse engineer a pathological data set. 2) Use randomization. Design a family H of hash functions, such that ∀ data sets S, "almost all" functions h ∈ H spread S out "pretty evenly".
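A minimal sketch of the randomization idea in solution 2, assuming integer keys smaller than a fixed prime. It uses the well-known Carter-Wegman construction h(x) = ((a·x + b) mod p) mod n, which is not necessarily the family presented in the course. The names `P` and `random_hash` are hypothetical and chosen only for illustration.

```python
import random

# Assumption: keys are non-negative integers smaller than P.
# P = 2^31 - 1 is a Mersenne prime.
P = 2_147_483_647


def random_hash(num_buckets, rng=random):
    """Pick one function h at random from the family H.

    Each call draws fresh coefficients a and b, so an adversary who
    does not know them cannot prepare a pathological data set in advance.
    """
    a = rng.randrange(1, P)  # a must be nonzero
    b = rng.randrange(0, P)

    def h(key):
        return ((a * key + b) % P) % num_buckets

    return h
```

For any two distinct keys, only about a 1/n fraction of the functions in this family map both to the same bucket, which is the sense in which "almost all" h spread a data set out "pretty evenly".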
Which hash table implementation strategy is feasible for load factors larger than 1: chaining and/or open addressing?
Chaining only, because chaining allows each bucket to contain more than one object via a linked list while open addressing only supports a single object per bucket.
How can a growing load factor be brought back under control once operations start to run slower than constant time?
Increase the # of buckets of the hash table, and rehash the existing objects into the new buckets.
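The resizing idea above can be sketched as follows. This is an illustrative chaining table, not an implementation from the course: the class name `ChainedHashSet`, the doubling policy, and the threshold `max_load` are assumptions, and Python lists stand in for the linked lists.

```python
class ChainedHashSet:
    """Hash set with chaining that doubles its bucket count when
    the load factor exceeds max_load (an assumed policy)."""

    def __init__(self, num_buckets=8, max_load=1.0):
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0
        self.max_load = max_load

    def load_factor(self):
        # alpha = number of objects / number of buckets
        return self.size / len(self.buckets)

    @staticmethod
    def _chain(key, buckets):
        return buckets[hash(key) % len(buckets)]

    def contains(self, key):
        return key in self._chain(key, self.buckets)

    def insert(self, key):
        if self.contains(key):
            return
        self._chain(key, self.buckets).append(key)
        self.size += 1
        if self.load_factor() > self.max_load:
            self._resize(2 * len(self.buckets))

    def _resize(self, new_count):
        # Every object must be rehashed, because its bucket index
        # depends on the number of buckets.
        new_buckets = [[] for _ in range(new_count)]
        for chain in self.buckets:
            for key in chain:
                self._chain(key, new_buckets).append(key)
        self.buckets = new_buckets
```

A single resize costs time linear in the number of stored objects, but doubling makes resizes rare enough that the cost averages out to a constant per insertion.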
What is the "load factor" of a hash table?
The load factor of a hash table is α := (# of objects in the hash table) / (# of buckets in the hash table).
Regarding the load factor, what conditions are necessary for operations to run in constant time?
α = O(1) is a necessary condition for operations to run in constant time. If α grows with the number of objects instead of staying bounded by a constant, operations no longer run in constant time (with chaining, because the linked lists become long). With open addressing, α must be significantly less than 1. As α approaches 1, the running time approaches linear.
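To see why open addressing needs α well below 1, a rough sketch under an idealized assumption: if every probe hits a uniformly random slot, an insertion or unsuccessful lookup needs about 1/(1 − α) probes in expectation. The function name `expected_probes` is hypothetical, and real probe sequences such as linear probing can behave worse than this estimate.

```python
def expected_probes(alpha):
    """Estimated probes for an insertion under idealized uniform hashing.

    Assumption: each probe lands on an empty slot independently
    with probability 1 - alpha.
    """
    if not 0 <= alpha < 1:
        raise ValueError("open addressing requires 0 <= alpha < 1")
    return 1.0 / (1.0 - alpha)


# Estimates for a few load factors: roughly 2, 10, and 100 probes.
estimates = {alpha: expected_probes(alpha) for alpha in (0.5, 0.9, 0.99)}
```

With n buckets and only one empty slot left, α = 1 − 1/n and the estimate is about n probes, which matches the claim that the running time approaches linear.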