Chapter 11 - Hash Tables
quadratic probing formula
(H + c1 * i + c2 * i^2) mod (tablesize)
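The formula above can be sketched in Python as follows. This is only an illustration: the function name `quadratic_probe_sequence`, the `max_probes` parameter, and the defaults c1 = c2 = 1 are assumptions, not part of the original notes.

```python
def quadratic_probe_sequence(h, table_size, c1=1, c2=1, max_probes=4):
    """Return the bucket indices probed for i = 0, 1, ..., max_probes - 1.

    h is the key's mapped bucket (H in the formula); c1 and c2 are the
    probing constants. All names and defaults here are illustrative.
    """
    return [(h + c1 * i + c2 * i * i) % table_size for i in range(max_probes)]


# With H = 5 and 10 buckets: i=0 -> 5, i=1 -> 7, i=2 -> 11 % 10 = 1, i=3 -> 17 % 10 = 7
sequence = quadratic_probe_sequence(h=5, table_size=10)
print(sequence)  # [5, 7, 1, 7]
```

Note that bucket 7 shows up twice, so a quadratic probing sequence does not necessarily visit every bucket.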
double hashing formula
(h1(key) + i * h2(key)) mod (tablesize)
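A minimal sketch of the double hashing formula, assuming 11 buckets and two example hash functions, h1(key) = key % 11 and h2(key) = 5 - key % 5. Both functions and all names below are chosen only for illustration.

```python
TABLE_SIZE = 11  # assumed table size for this example


def h1(key):
    return key % 11  # assumed first hash function


def h2(key):
    return 5 - key % 5  # assumed second hash function; never returns 0


def double_hash_sequence(key, max_probes=4):
    """Return the bucket indices probed for i = 0, 1, ..., max_probes - 1."""
    return [(h1(key) + i * h2(key)) % TABLE_SIZE for i in range(max_probes)]


# key 16: h1 = 5, h2 = 4, so the probes are 5, 9, 13 % 11 = 2, 17 % 11 = 6
sequence = double_hash_sequence(16)
print(sequence)  # [5, 9, 2, 6]
```

The second hash function must never evaluate to 0, otherwise every probe would land on the same bucket.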
good hash function
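a hash function that minimizes collisions by distributing items uniformly across buckets - enables O(1) average-case inserts, searches, and removes (worst case remains O(N)) - keeps bucket lists short with chaining - minimizes the average probing length with linear probing
empty-after-removal bucket
a bucket that became empty because its item was removed; in open addressing, a search keeps probing past such a bucket, while an insert may reuse it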
empty-since-start bucket
a bucket that has been empty since the hash table was created
open addressing
a collision resolution technique where collisions are resolved by looking for an empty bucket elsewhere in the table (e.g., if 55 already occupies bucket 5 and 75 also maps to bucket 5, then 75 might be stored in bucket 6)
chaining
a collision resolution technique where each bucket has a list of items (e.g., if 55 and 75 both map to bucket 5, bucket 5's list becomes 55, 75)
modulo operator %
an operator, commonly used in hash functions, that computes the integer remainder when dividing two numbers (e.g., 75 % 10 = 5)
password hashing function
a cryptographic hashing function that produces a hash value for a password
hash table
a data structure that stores unordered items by mapping (or hashing) each item to a location in an array (or vector). Main advantage: searching for (or inserting / removing) an item may require only O(1), in contrast to O(N) for searching a list or O(log N) for binary search
cryptography
a field of study focused on transmitting data securely
hash function
a function that computes a bucket index from the item's key
cryptographic hash function
a hash function designed specifically for cryptography
perfect hash function
a hash function that maps items to buckets with no collisions. Can be created if the number of items and all possible item keys are known beforehand. Worst-case runtime for insert, search, remove: O(1)
multiplicative string hash
a hash function that repeatedly multiplies the hash value and adds the ASCII (or Unicode) value of each character in the string
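A minimal sketch of a multiplicative string hash. The initial value 5381 and multiplier 33 are one commonly cited choice (attributed to Daniel J. Bernstein) and are assumptions here, as are the function and parameter names.

```python
def multiplicative_string_hash(text, table_size, initial_value=5381, multiplier=33):
    """Map a string to a bucket index in the range 0..table_size - 1."""
    hash_value = initial_value
    for ch in text:
        # Multiply the running hash, then add the character's Unicode code point.
        hash_value = hash_value * multiplier + ord(ch)
    return hash_value % table_size


# Small hand-checkable case: start at 0 and multiply by 2.
# "ab": (0 * 2 + 97) = 97, then (97 * 2 + 98) = 292, and 292 % 1000 = 292
small = multiplicative_string_hash("ab", 1000, initial_value=0, multiplier=2)
print(small)  # 292
```

Because each character is mixed in after a multiplication, strings with the same characters in a different order (such as "ab" and "ba") generally produce different hash values.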
mid-square hash
a hash function that squares the key, extracts R digits from the result's middle, and returns the remainder of the middle digits divided by the hash table size N - for N buckets in a decimal mid-square hash function, R must be ≥ ⌈log10 N⌉ (the ceiling of log base 10 of N) - typically implemented using binary (base 2) rather than decimal because the binary implementation is faster
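A sketch of the decimal version of the mid-square hash, written for readability rather than speed. The function name, the string-slicing approach, and the way R is defaulted are assumptions made for this example.

```python
def mid_square_hash(key, table_size, r=None):
    """Decimal mid-square hash: square the key, take R middle digits, mod N."""
    if r is None:
        # Number of digits in table_size - 1, which equals ceil(log10(N)) for N >= 2.
        r = len(str(table_size - 1))
    squared = str(key * key)
    # Start position of the R middle digits (clamped for short squares).
    start = max(0, (len(squared) - r) // 2)
    middle_digits = int(squared[start:start + r])
    return middle_digits % table_size


# 453 * 453 = 205209; the middle two digits are 52; 52 % 100 = 52
bucket = mid_square_hash(453, 100)
print(bucket)  # 52
```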
direct hash function
a hash function that uses the item's key as the bucket index; a hash table that uses a direct hash function is called a direct access table. Adv: no collisions, O(1) operations. Disadv: keys must be non-negative integers, which rules out some applications; table size equals the largest key value plus 1, which may be very large
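A minimal sketch of a direct access table, assuming integer keys from 0 up to a known largest key. The class and method names are hypothetical.

```python
class DirectAccessTable:
    """Stores each item in the bucket whose index equals the item's key."""

    def __init__(self, largest_key):
        # One bucket per possible key, so the table has largest_key + 1 buckets.
        self.table = [None] * (largest_key + 1)

    def insert(self, key, item):
        self.table[key] = item  # the key itself is the bucket index

    def search(self, key):
        return self.table[key]  # None means no item is stored for this key

    def remove(self, key):
        self.table[key] = None


table = DirectAccessTable(largest_key=99)
table.insert(75, "item with key 75")
print(len(table.table))   # 100
print(table.search(75))   # item with key 75
```

The sketch makes the disadvantage visible: the list is allocated for every possible key, even if only a handful of items are ever stored.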
modulo hash
a hash function that uses the remainder from division of the key by hash table size N - making N prime typically reduces collisions (fewer items map to the same bucket)
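A small illustration of the modulo hash and of why a prime N can help. The keys below (multiples of 10) are chosen deliberately to show the effect and are not from the original notes.

```python
def modulo_hash(key, table_size):
    """Return the bucket index for key in a table with table_size buckets."""
    return key % table_size


keys = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# With N = 10, every key is a multiple of N, so all keys collide in bucket 0.
buckets_n10 = {modulo_hash(k, 10) for k in keys}
# With the prime N = 11, the same keys spread over 10 different buckets.
buckets_n11 = {modulo_hash(k, 11) for k in keys}

print(len(buckets_n10))  # 1
print(len(buckets_n11))  # 10
```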
linear probing
an open-addressing collision resolution technique that starts at the key's mapped bucket and then linearly searches subsequent buckets until an empty bucket is found. If probing reaches the end of the table before finding an empty bucket, probing continues at bucket 0
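A sketch of a linear probing table that stores integer keys, assuming a modulo hash and two sentinel objects for the empty-since-start and empty-after-removal bucket states. The class, method, and sentinel names are hypothetical, and `insert` does not check for duplicate keys.

```python
EMPTY_SINCE_START = object()    # bucket has never held an item
EMPTY_AFTER_REMOVAL = object()  # bucket held an item that was later removed


class LinearProbingTable:
    def __init__(self, size=10):
        self.buckets = [EMPTY_SINCE_START] * size

    def _probe(self, key):
        """Yield every bucket index once, starting at the key's mapped bucket."""
        start = key % len(self.buckets)
        for i in range(len(self.buckets)):
            yield (start + i) % len(self.buckets)  # wraps around to bucket 0

    def insert(self, key):
        for index in self._probe(key):
            bucket = self.buckets[index]
            if bucket is EMPTY_SINCE_START or bucket is EMPTY_AFTER_REMOVAL:
                self.buckets[index] = key
                return True
        return False  # every bucket is occupied

    def search(self, key):
        for index in self._probe(key):
            if self.buckets[index] is EMPTY_SINCE_START:
                return None  # the key cannot be further along the sequence
            if self.buckets[index] == key:
                return index
            # Occupied by another key or empty-after-removal: keep probing.
        return None

    def remove(self, key):
        index = self.search(key)
        if index is None:
            return False
        self.buckets[index] = EMPTY_AFTER_REMOVAL
        return True


table = LinearProbingTable(10)
for key in (55, 75, 69, 79):
    table.insert(key)   # 55 -> bucket 5, 75 -> 6, 69 -> 9, 79 wraps to 0
table.remove(55)        # bucket 5 becomes empty-after-removal
print(table.search(75))  # 6
print(table.search(79))  # 0
```

The search for 75 still succeeds after 55 is removed because the search only stops at an empty-since-start bucket, not at an empty-after-removal one.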
quadratic probing
an open-addressing collision resolution technique that starts at the key's mapped bucket and then quadratically searches subsequent buckets, using (H + c1 * i + c2 * i^2) mod (tablesize) for i = 0, 1, 2, ..., until an empty bucket is found
bucket (hash table)
a name for each hash table array element
collision
a situation in which an item being inserted into a hash table maps to the same bucket as an existing item in the hash table
encryption
alteration of data to hide the original meaning
double hashing
an open-addressing collision resolution technique that uses 2 different hash functions to compute bucket indices
resize operation (hash table)
an operation that increases the number of buckets while preserving all existing items. Commonly resized from N buckets to the next prime number ≥ N * 2. Time complexity: O(N)
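A sketch of the resize operation for a chaining table stored as a list of lists, assuming integer keys and a modulo hash. The helper names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality check, adequate for table sizes."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def next_prime_at_least(n):
    while not is_prime(n):
        n += 1
    return n


def resize(buckets):
    """Return a new bucket list sized to the next prime >= 2 * N."""
    new_size = next_prime_at_least(len(buckets) * 2)
    new_buckets = [[] for _ in range(new_size)]
    # Every existing key is rehashed with the new table size, hence O(N).
    for bucket in buckets:
        for key in bucket:
            new_buckets[key % new_size].append(key)
    return new_buckets


old = [[] for _ in range(5)]
for key in (55, 75, 12):
    old[key % 5].append(key)   # 55 and 75 collide in bucket 0

resized = resize(old)
print(len(resized))            # 11
print(resized[0], resized[9])  # [55] [75]
```

Items cannot simply be copied to the same index, because each key's bucket depends on the table size.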
probing sequence
the sequence of bucket indices obtained by iterating through sequential i values. In quadratic probing, search/insert/remove apply the formula with H = the key's mapped bucket and an initial i = 0; each time the desired bucket is not found, i is incremented by 1
decryption
reconstruction of original data from encrypted data
load factor (hash table)
the number of items in the hash table divided by the number of buckets
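A short sketch of computing the load factor and using it to decide when to resize. The 0.75 threshold and the function names are assumptions for illustration; the original notes do not specify a threshold.

```python
def load_factor(item_count, bucket_count):
    """Number of items divided by number of buckets."""
    return item_count / bucket_count


def needs_resize(item_count, bucket_count, max_load_factor=0.75):
    # max_load_factor is an assumed threshold, not a value from the notes.
    return load_factor(item_count, bucket_count) > max_load_factor


print(load_factor(6, 8))   # 0.75
print(needs_resize(6, 8))  # False
print(needs_resize(7, 8))  # True
```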
key (hash table)
the value used to map an item to a bucket index
insert/remove (chaining)
insert uses the item's key to determine the bucket and then adds the item to that bucket's list; remove likewise uses the key to determine the bucket and then removes the item from that bucket's list
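A minimal sketch of insert, search, and remove with chaining, assuming integer keys, a modulo hash, and Python lists as the per-bucket lists. The class and method names are hypothetical.

```python
class ChainingTable:
    def __init__(self, size=10):
        self.buckets = [[] for _ in range(size)]  # one list per bucket

    def _bucket_for(self, key):
        """Use the key to determine which bucket's list to work with."""
        return self.buckets[key % len(self.buckets)]

    def insert(self, key):
        bucket = self._bucket_for(key)
        if key not in bucket:  # skip duplicates
            bucket.append(key)

    def search(self, key):
        return key in self._bucket_for(key)

    def remove(self, key):
        bucket = self._bucket_for(key)
        if key in bucket:
            bucket.remove(key)
            return True
        return False


table = ChainingTable(10)
table.insert(55)
table.insert(75)         # collides with 55: both map to bucket 5
print(table.buckets[5])  # [55, 75]
table.remove(55)
print(table.buckets[5])  # [75]
```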