CSCI 251 Ch. 5 Hash Tables
How many direct access table buckets are needed for items with keys ranging from 100 to 200 (inclusive)?
201 200 + 1 = 201. Bucket indices will be 0 to 200. The hash table size with direct hashing equals the largest key plus 1. Note that only buckets 100 to 200 will be used; buckets 0 to 99 will be unused.
Assume a hash function returns key % 16 and quadratic probing is used with c1 = 1 and c2 = 1. Refer to the table below. Which value was inserted without collision? 99 64 23
23
Assume a hash function returns key % 16 and quadratic probing is used with c1 = 1 and c2 = 1. Refer to the table below. How many bucket index computations were necessary to insert 64 into the table? 1 2 3
3
Consider the following hash table and a hash function of key % 10. HashSearch(valsTable, 110) probes _____ buckets.
3
Consider the following hash table and a hash function of key % 10. HashRemove(idsTable, 65) probes _____ buckets.
3
reconstruction of original data from encrypted data
decryption
alteration of data to hide the original meaning
encryption
Consider the following hash table, and a hash function of key % 10. What does HashSearch(valsTable, 186) return?
null
Double hashing formula
(h1(key) + i * h2(key)) mod (tablesize)
a cryptographic hashing function that produces a hash value for a password
password hashing function
Quadratic probing formula (consider an item's mapped bucket is H)
(H + c1 * i + c2 * i^2) mod (tablesize)
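The formula can be checked with a short Python sketch. The function name `quadratic_probe_sequence` and the `max_probes` parameter are illustrative choices, not part of the course material. The example uses the key % 16 table with c1 = 1 and c2 = 1 that appears in several cards.

```python
def quadratic_probe_sequence(h, table_size, c1=1, c2=1, max_probes=None):
    """Return the bucket indices probed for an item whose mapped bucket is h."""
    if max_probes is None:
        max_probes = table_size
    # i = 0 yields the mapped bucket itself; each later i applies the formula.
    return [(h + c1 * i + c2 * i * i) % table_size for i in range(max_probes)]


# Key 48 in a 16-bucket table: the mapped bucket is 48 % 16 = 0.
print(quadratic_probe_sequence(48 % 16, 16, max_probes=4))  # [0, 2, 6, 12]
```

An insert would stop at the first empty bucket in this sequence rather than always visiting all four.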
Type the hash table after the given operations. Type the hash table as: E, 1, 2, E, E (where E means empty). HashInsert(numsTable, item 0) numsTable:
0, 1, 2, E, E
Assume a hash function returns key % 16 and quadratic probing is used with c1 = 1 and c2 = 1. Refer to the table below. What is the probing sequence when inserting 48 into the table? 8 0, 8 0, 2, 6, 12
0, 2, 6, 12
Consider the following hash table and a hash function of key % 10. HashSearch(valsTable, 75) probes _____ buckets.
1
Consider the following hash table, a hash function of key % 10, and quadratic probing with c1 = 1 and c2 = 1. HashSearch(valsTable, 75) probes _____ buckets.
1
a field of study focused on transmitting data securely
cryptography
Type the hash table after the given operations. Type the hash table as: E, 1, 2, E, E (where E means empty). HashRemove(numsTable, 0) HashInsert(numsTable, item 4) numsTable:
E, E, 2, 3, 4
a collision resolution technique where collisions are resolved by looking for an empty bucket elsewhere in the table (so 75 might be stored in bucket 6).
open addressing
Suppose a hash table has 101 buckets. If the hash table was using chaining, the load factor could be ≤ 0.1, but an individual bucket could still contain 10 items. True False
true In the worst case scenario, the 10 items are the only items in the table and are all added to the same bucket. The load factor, 10 / 101 = 0.099, is ≤ 0.1.
When resizing to a larger size, the load factor is guaranteed to decrease. True False
true The load factor is computed as X / Y, where X is the number of items and Y the number of buckets. Resizing will not change X. When Y increases, the fraction X / Y decreases. Therefore, the load factor always decreases when resizing to a larger size.
In a hash table using open addressing, the load factor cannot exceed 1.0. True False
true With open addressing, one bucket holds at most one item. So a table with N buckets can have at most N items, making the maximum possible load factor N / N = 1.0.
Will the hash function and expected key likely work well for the following scenarios? Hash function: key % 250 Key: 5-digit customer ID Hash table size: 250 Yes No
yes Assuming the IDs are well distributed, the hash function distributes keys into all hash table buckets.
The load factor may be used to decide when to resize the hash table. True False
true
The removal algorithm searches for the bucket containing the key to remove. If found, the bucket is marked as empty-after-removal. True False
true
Double hashing would never resolve collisions if the second hash function always returned 0. True False
true
Consider the following hash table, and a hash function of key % 10. How many list elements are compared for HashSearch(valsTable, 62)?
1
Linear Probing Given hash function of key % 5, determine the insert location for each item. HashInsert(numsTable, item 74) bucket = ________
1
Given a hash table with 50 buckets and modulo hash function, in which bucket will HashSearch(table, 201) search for the item?
1 201 % 50 = 1. Search, insert, and remove operations all use the hash function to determine the bucket index.
Given: hash1(key) = key % 11 hash2(key) = 5 - key % 5 and a hash table with a size of 11. Determine the index for each item after the following operations have been executed. HashInsert(valsTable, item 16) HashInsert(valsTable, item 77) HashInsert(valsTable, item 55) HashInsert(valsTable, item 41) HashInsert(valsTable, item 63) Item 63
1 63 % 11 = 8, but the bucket at index 8 already holds item 41. i is incremented to 1, and the bucket index is computed as (63 % 11 + 1 * (5 - 63 % 5)) % 11 = 10. Bucket 10 is occupied by item 55, so i is incremented to 2 and the bucket index is computed as (63 % 11 + 2 * (5 - 63 % 5)) % 11 = 1. Bucket 1 is empty, so Item 63 is inserted in this bucket.
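The double hashing walk-through above can be reproduced with a minimal sketch. It assumes an 11-bucket table stored as a Python list with `None` marking an empty bucket; the function name `double_hash_insert` is hypothetical.

```python
TABLE_SIZE = 11


def hash1(key):
    return key % 11


def hash2(key):
    return 5 - key % 5


def double_hash_insert(table, key):
    """Insert key with double hashing; return the bucket index used, or None."""
    for i in range(len(table)):
        index = (hash1(key) + i * hash2(key)) % len(table)
        if table[index] is None:  # empty bucket found, so store the key here
            table[index] = key
            return index
    return None  # every probe hit an occupied bucket


table = [None] * TABLE_SIZE
placements = {key: double_hash_insert(table, key) for key in (16, 77, 55, 41, 63)}
print(placements)  # {16: 5, 77: 0, 55: 10, 41: 8, 63: 1}
```

Item 63 lands in bucket 1 only after probing buckets 8 and 10, which matches the explanation.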
How many buckets will be checked for HashSearch(numsTable, 45)? 1 6 10
1 Bucket 5 is checked first (45 % 10 = 5), and the matching item is found. Hash search may search only O(1) buckets.
For a decimal mid-square hash function, what is the bucket index for key = 110, N = 200, and R = 3?
10 110 * 110 = 12100. Middle 3 digits = 210. Then 210 % 200 = 10. The mid-square hash function returns the remainder of the middle digits divided by hash table size.
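A decimal mid-square hash can be sketched as follows. The helper name `mid_square_decimal` is illustrative, and how to pick the "middle" when the digits cannot be centered exactly is an assumption of this sketch (the slice then sits one digit left of center).

```python
def mid_square_decimal(key, table_size, r):
    """Square the key, take the middle r decimal digits, then mod by table size."""
    squared = str(key * key)
    # Assumption: if (length - r) is odd, the slice starts one digit left of center.
    start = max(0, (len(squared) - r) // 2)
    middle = int(squared[start:start + r])
    return middle % table_size


print(mid_square_decimal(110, 200, 3))   # 10   (12100 -> 210 -> 210 % 200)
print(mid_square_decimal(112, 1000, 3))  # 254  (12544 -> 254)
print(mid_square_decimal(40, 100, 2))    # 60   (1600 -> 60)
```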
Given: hash1(key) = key % 11 hash2(key) = 5 - key % 5 and a hash table with a size of 11. Determine the index for each item after the following operations have been executed. HashInsert(valsTable, item 16) HashInsert(valsTable, item 77) HashInsert(valsTable, item 55) HashInsert(valsTable, item 41) HashInsert(valsTable, item 63) Item 55
10 55 % 11 = 0, but the bucket at index 0 already holds item 77. i is incremented to 1, and the bucket index is computed as (55 % 11 + 1 * (5 - 55 % 5)) % 11 = 5. Bucket 5 is occupied by item 16, so i is incremented to 2, and the bucket index is computed as (55 % 11 + 2 * (5 - 55 % 5)) % 11 = 10. Bucket 10 is empty, so Item 55 is inserted in this bucket.
A modulo hash function is used to map to indices 0 to 9. The hash function should be: key % _____
10 key % 10 will yield a remainder of 0, 1, ..., or 9. Ex: 19 % 10 = 9, 20 % 10 = 0, and 21 % 10 = 1. A common mistake is to use key % 9, whose remainders can only be 0 to 8.
For R = 3, what are the middle bits for a key of 9? 9 * 9 = 81; 81 in binary is 1010001.
100 The middle 3 bits are 100, which is 4 in decimal.
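The binary version works the same way on the bit string of the squared key. This is a minimal sketch; `middle_bits` is a hypothetical helper name, and it assumes the squared key's bit count minus R is even or that a slice one bit left of center is acceptable.

```python
def middle_bits(key, r):
    """Return the middle r bits of key squared, as a string of 0s and 1s."""
    bits = bin(key * key)[2:]  # strip the "0b" prefix
    start = max(0, (len(bits) - r) // 2)
    return bits[start:start + r]


print(middle_bits(9, 3))          # 100
print(int(middle_bits(9, 3), 2))  # 4
```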
A class has 100 students. Student ID numbers range from 10000 to 99999. Using the ID number as key, how many buckets will a direct access table require?
100000 99999 + 1 = 100000. The table size depends only on the largest key. As such, a direct access table is very inefficient for this application, since only 100 buckets would be used out of 100,000 total.
Suppose the hash table below is resized. The hash function used both before and after resizing is: hash(key) = key % N, where N is the table size. At what index does 99 reside in the resized table? 1 9 14
14 99 is rehashed using the new table size, and is inserted at index 99 % 17 = 14.
Given hash function of key % 10, type the specified bucket's list after the indicated operation(s). Assume items are inserted at the end of a bucket's list. Type the bucket list as: 5, 7, 9 (or type: Empty). HashRemove(valsTable, 46) Bucket 6's list: _____
16
Suppose the hash table below is resized. The hash function used both before and after resizing is: hash(key) = key % N, where N is the table size. What is the most likely allocated size for the resized hash table? 7 14 17
17 Resizing will use the next prime number >= 14, which is 17
A hash table's items will be positive integers, and -1 will represent empty. A 5-bucket hash table is: -1, -1, 72, 93, -1. How many items are in the table? 0 2 5
2
Consider the following hash table and a hash function of key % 10. HashSearch(valsTable, 207) probes _____ buckets.
2
Consider the following hash table, a first hash function of key % 10, and a second hash function of 7 - key % 7. HashInsert(valsTable, item 24) probes _____ buckets.
2
Consider the following hash table, a first hash function of key % 10, and a second hash function of 7 - key % 7. HashSearch(valsTable, 110) probes _____ buckets.
2
Consider the following hash table, a hash function of key % 10, and quadratic probing with c1 = 1 and c2 = 1. After removing 66 via HashRemove(valsTable, 66), HashSearch(valsTable, 66) probes _____ buckets.
2
Consider the following hash table, a hash function of key % 10, and quadratic probing with c1 = 1 and c2 = 1. HashSearch(valsTable, 110) probes _____ buckets.
2
Consider the following hash table, and a hash function of key % 10. How many list elements are compared for HashSearch(valsTable, 837)?
2
Linear Probing Given hash function of key % 5, determine the insert location for each item. HashInsert(numsTable, item 41) bucket = ________
2
Linear Probing Given hash function of key % 5, determine the insert location for each item. HashInsert(numsTable, item 90) bucket = ________
2
For a decimal mid-square hash function, what is the bucket index for key = 112, N = 1000, and R = 3?
254 112 * 112 = 12544. Middle 3 digits = 254. Then 254 % 1000 = 254. For a 1000 entry hash table, R must be greater than or equal to ⌈log_10 1000⌉ = 3 to index all buckets. 3 digits can index all 1000 buckets at indices 0 to 999.
If the Caesar cipher were implemented such that strings were restricted to only lower-case alphabet characters, how many distinct ways could a message be encrypted? 26 52 Length of the message
26
Consider the following hash table, and a hash function of key % 10. How many list elements are compared for HashSearch(valsTable, 40)?
3
Linear Probing Given hash function of key % 5, determine the insert location for each item. HashInsert(numsTable, item 13) bucket = _______
3
Given a hash table with 100 buckets and modulo hash function, in which bucket will HashInsert(table, item 334) insert item 334?
34 334 % 100 = 34. The insert operation uses the hash function to determine the bucket index.
Consider the following hash table and a hash function of key % 10. HashRemove(idsTable, 10) probes _____ buckets.
4
Consider the following hash table and a hash function of key % 10. HashRemove(idsTable, 68) probes _____ buckets.
4
Suppose the hash table below is resized. The hash function used both before and after resizing is: hash(key) = key % N, where N is the table size. How many elements are in the hash table after resizing? 0 4 7
4 The 4 existing elements are preserved during the resize.
Given hash function of key % 10, type the specified bucket's list after the indicated operation(s). Assume items are inserted at the end of a bucket's list. Type the bucket list as: 5, 7, 9 (or type: Empty). HashInsert(valsTable, item 20) Bucket 0's list: _____
40, 20
Consider the following hash table, a first hash function of key % 10, and a second hash function of 7 - key % 7. After removing 66 via HashRemove(valsTable, 66), HashSearch(valsTable, 66) probes _____ buckets.
5
Given: hash1(key) = key % 11 hash2(key) = 5 - key % 5 and a hash table with a size of 11. Determine the index for each item after the following operations have been executed. HashInsert(valsTable, item 16) HashInsert(valsTable, item 77) HashInsert(valsTable, item 55) HashInsert(valsTable, item 41) HashInsert(valsTable, item 63) Item 16
5 (16 % 11 + 0 * (5 - 16 % 5)) % 11 = 5. The bucket at index 5 is empty prior to the insertion of 16, so item 16 is stored at index 5.
If item keys range from 0 to 49, how many keys may map to the same bucket? 1 5 50
5 5 keys will map into each bucket. Ex: 1, 11, 21, 31, and 41 will each map to bucket 1. A modulo hash function will map (num_keys / num_buckets) keys to each bucket.
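The count can be confirmed by tallying keys per bucket. The sketch assumes a 10-bucket table with a hash function of key % 10, which the example keys 1, 11, 21, 31, and 41 imply but the card does not state outright.

```python
from collections import Counter

NUM_BUCKETS = 10  # assumption implied by the example keys

# Tally how many of the keys 0..49 land in each bucket.
counts = Counter(key % NUM_BUCKETS for key in range(50))

print(counts[1])                                             # 5
print([key for key in range(50) if key % NUM_BUCKETS == 1])  # [1, 11, 21, 31, 41]
```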
A modulo hash function for a 50 entry hash table is: key % _____
50 key % 50 yields values from 0 to 49, which is 50 values (counting the 0).
Given hash function of key % 10, type the specified bucket's list after the indicated operation(s). Assume items are inserted at the end of a bucket's list. Type the bucket list as: 5, 7, 9 (or type: Empty). HashInsert(valsTable, item 23) HashInsert(valsTable, item 99) Bucket 3's list: _____
53, 363, 23
If a linear search were applied to the array, how many array elements would be checked to find item 45? 1 6 10
6
For a decimal mid-square hash function, what are the middle digits for key = 40, N = 100, and R = 2?
60 40 * 40 = 1600. Middle 2 digits = 60. The bucket index is 60 % 100 = 60.
For a binary mid-square hash function, how many bits are needed for an 80 entry hash table?
7 R = ⌈log_2 80⌉ = ⌈6.32⌉ = 7. The value of the middle 7 bits may range from 0 to 127, so the hash function returns the remainder of the middle bits divided by hash table size.
For a 1000-entry direct access table, type the bucket number for the inserted item, or type: None HashInsert(hashIndex, item 734)
734
Assume a hash function returns key % 16 and quadratic probing is used with c1 = 1 and c2 = 1. Refer to the table below. If 21 is inserted into the hash table, what would be the insertion index? 5 9 11
9
key % 1000 maps to indices 0 to ____.
999 key % 1000 maps to 1000 indices numbered 0 to 999. (Common mistakes are to assume indices 0 to 1000, or 1 to 1000).
Limitations of direct hashing
A direct access table has the advantage of no collisions: Each key is unique (by definition of a key), and each gets a unique bucket, so no collisions can occur. However, a direct access table has two main limitations. 1. All keys must be non-negative integers, but for some applications keys may be negative. 2. The hash table's size equals the largest key value plus 1, which may be very large.
Given hash function of key % 10, type the specified bucket's list after the indicated operation(s). Assume items are inserted at the end of a bucket's list. Type the bucket list as: 5, 7, 9 (or type: Empty). HashRemove(valsTable, 218) Bucket 8's list: _____
Empty
Which is not an advantage of storing password hash values, instead of actual passwords, in a database? Database administrators cannot see users' passwords. Database storage space is saved. Attackers who gain access to database contents still may not be able to determine users' passwords.
Database storage space is saved. Password hash values do not compress passwords, so storage space is not saved when storing hash values.
A company will store all employees in a hash table. Each employee item consists of a name, department, and employee ID number. Which is the most appropriate key? Name Department Employee ID number
Employee ID number * Keys should be unique; presumably the company would assign a unique ID number to each employee.
A hash table has buckets 0 to 9 and uses a hash function of key % 10. If the table is initially empty and the following inserts are applied in the order shown, the insert of which item results in a collision? HashInsert(hashTable, item 55) HashInsert(hashTable, item 90) HashInsert(hashTable, item 95) Item 55 Item 90 Item 95
Item 95
For a 1000-entry direct access table, type the bucket number for the inserted item, or type: None HashInsert(hashIndex, item 1034)
None 999 is the largest possible key value. An N-entry direct access table supports keys ranging from 0 to N-1.
For a 1000-entry direct access table, type the bucket number for the inserted item, or type: None HashInsert(hashIndex, item -45)
None A negative number is not a valid array index. An N-entry direct access table supports keys ranging from 0 to N-1.
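The range check behind the 1000-entry direct access table cards can be sketched as below. The function name `direct_access_insert` and the use of `None` for an empty bucket are assumptions of this sketch. The explicit check matters in Python because a negative index would otherwise wrap around to the end of the list.

```python
def direct_access_insert(table, key):
    """Store key at index key; return the bucket index, or None if out of range."""
    if 0 <= key < len(table):
        table[key] = key
        return key
    return None  # key is negative or larger than N - 1


hash_index = [None] * 1000
print(direct_access_insert(hash_index, 734))   # 734
print(direct_access_insert(hash_index, 1034))  # None
print(direct_access_insert(hash_index, -45))   # None
```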
The runtime for insert, search, and remove is ______ with a perfect hash function.
O(1)
For a well-designed hash table, searching requires _____ on average. O(1) O(N) O(log N)
O(1) *Hash tables support fast search, insert, and remove.
A hash table with N buckets is commonly resized to the next prime number ≥ N * 2. A new array is allocated, and all items from the old array are re-inserted into the new array, making the resize operation's time complexity ______
O(N)
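A resize along these lines might look like the sketch below. It assumes a chaining table stored as a list of lists and a hash function of key % N; the helper names are hypothetical, and the keys 15, 23, and 40 are made-up filler alongside the 99 used in the resizing cards.

```python
def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Return the smallest prime >= n."""
    while not is_prime(n):
        n += 1
    return n


def resize(buckets):
    """Allocate a table of the next prime >= 2 * N and re-insert every item."""
    new_size = next_prime(len(buckets) * 2)
    new_buckets = [[] for _ in range(new_size)]
    for bucket in buckets:
        for key in bucket:
            # Each key is rehashed with the new table size.
            new_buckets[key % new_size].append(key)
    return new_buckets


old = [[] for _ in range(7)]
for key in (99, 15, 23, 40):  # 15, 23, 40 are illustrative keys
    old[key % 7].append(key)

new = resize(old)
print(len(new))                                          # 17
print(next(i for i, b in enumerate(new) if 99 in b))     # 14
print(sum(len(b) for b in new))                          # 4
```

Every existing item is visited once, which is where the linear cost comes from.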
The Caesar cipher is an encryption algorithm that works well to secure data for modern digital communications. True False
false The Caesar cipher is very easy to decrypt.
When a hash table is initialized, all entries must be empty-after-removal. True False
false All entries must be initialized to empty-since-start.
What is the result of applying the Caesar cipher with a left shift of 1 to the string "computer"? eqorwvgt dpnqvufs bnlotsdq
bnlotsdq
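The shift can be verified with a small sketch restricted to lower-case letters. The function name `caesar_shift` and the convention that a negative shift means "left" are choices made for this example.

```python
def caesar_shift(text, shift):
    """Shift each lower-case letter by shift positions (negative = left shift)."""
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            # Wrap around the 26-letter alphabet.
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)  # leave any other character unchanged
    return "".join(result)


encrypted = caesar_shift("computer", -1)
print(encrypted)                   # bnlotsdq
print(caesar_shift(encrypted, 1))  # computer
```

The second call shows that a right shift of the same size undoes a left shift.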
Each hash table array element is called a ________
bucket
A 100 element hash table has 100 _____. items buckets
buckets * Each hash table array element is called a bucket.
Handles hash table collisions by using a list for each bucket, where each list may store multiple items that map to the same bucket.
chaining
a collision resolution technique where each bucket has a list of items (so bucket 5's list would become 55, 75).
chaining
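A chaining table can be sketched with one Python list per bucket. The class name `ChainingHashTable` and its method names are illustrative, and the sketch assumes integer keys with a hash function of key % 10.

```python
class ChainingHashTable:
    def __init__(self, size=10):
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        return self.buckets[key % len(self.buckets)]

    def insert(self, key):
        bucket = self._bucket(key)
        if key not in bucket:
            bucket.append(key)  # new items go at the end of the bucket's list

    def search(self, key):
        return key if key in self._bucket(key) else None

    def remove(self, key):
        bucket = self._bucket(key)
        if key in bucket:
            bucket.remove(key)


table = ChainingHashTable(10)
table.insert(55)
table.insert(75)
print(table.buckets[5])   # [55, 75]
table.remove(55)
print(table.buckets[5])   # [75]
print(table.search(186))  # None
```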
Various techniques are used to handle collisions during insertions, such as _______ or _________.
chaining, open addressing
Occurs when an item being inserted into a hash table maps to the same bucket as an existing item in the hash table.
collision
a hash function designed specifically for cryptography
cryptographic hash function
A hash table with a direct hash function is called a ___________
direct access table
A __________ uses the item's key as the bucket index. Ex: If the key is 937, the index is 937.
direct hash function
an open-addressing collision resolution technique that uses 2 different hash functions to compute bucket indices.
double hashing
Given hash function of key % 10, determine the bucket status after the following operations have been executed. HashInsert(valsTable, item 64) HashInsert(valsTable, item 20) HashInsert(valsTable, item 51) HashRemove(valsTable, 51) Bucket 1: empty-since-start empty-after-removal
empty-after-removal
Given hash function of key % 10, determine the bucket status after the following operations have been executed. HashInsert(valsTable, item 64) HashInsert(valsTable, item 20) HashInsert(valsTable, item 51) HashRemove(valsTable, 51) Bucket 2: empty-since-start empty-after-removal
empty-since-start
Linear probing distinguishes two types of empty buckets. An ________ bucket has been empty since the hash table was created. An ________ bucket had an item removed that caused the bucket to now be empty.
empty-since-start, empty-after-removal
Double Hashing: When the removal algorithm finds the bucket containing the key to be removed, the bucket is marked as empty-since-start. True False
false
Encryption and decryption are synonymous. True False
false
A hash value can be used to reconstruct the original data. True False
false Although true for some hashing functions, MD5 and many others produce a hash value that cannot be used to reconstruct the original data.
MD5 produces larger hash values for larger input data sizes. True False
false MD5 always produces a 128-bit hash value.
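The fixed output size is easy to observe with Python's standard `hashlib` module. The helper name `md5_bit_length` is made up for this example.

```python
import hashlib


def md5_bit_length(data):
    """Return the size in bits of the MD5 hash value for the given bytes."""
    return len(hashlib.md5(data).digest()) * 8


print(md5_bit_length(b"a"))           # 128
print(md5_bit_length(b"a" * 10_000))  # 128
```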
A hash table implementation must use only one criterion for resizing. True False
false Multiple criteria can be used together. Ex: Resizing could occur when either the load factor exceeds 0.5 or the number of items in a bucket exceeds 11.
In a hash table using chaining, the load factor cannot exceed 1.0. True False
false The load factor is the number of items divided by the number of buckets. Multiple items can be placed in one bucket when using chaining. So the hash table could have more items than buckets, making the load factor > 1.0.
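The arithmetic is short enough to check directly. In this sketch, the 15-items-in-10-buckets case is a made-up example of a chaining table holding more items than buckets; the 10-in-101 case comes from the 101-bucket card.

```python
def load_factor(num_items, num_buckets):
    """Number of items divided by number of buckets."""
    return num_items / num_buckets


print(round(load_factor(10, 101), 3))  # 0.099
print(load_factor(15, 10))             # 1.5
```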
If computer B in the above example computed a hash value identical to the downloaded hash value, then the downloaded message would be guaranteed to be uncorrupted. True False
false Two identical MD5 hash values imply a high likelihood that the data is uncorrupted, but not a guarantee. Different hash values, on the other hand, guarantee that the data is corrupted.
Suppose a hash table has 101 buckets. If the hash table was using open addressing, a load factor > 0.9 guarantees a collision during insertion. True False
false Unless the load factor is 1.0, empty buckets exist in the table. So if 0.9 < load factor < 1.0, a key's hash may lead to an empty bucket, allowing insertion without collision.
The search algorithm stops only when encountering a bucket containing the key being searched for. True False
false Probing N buckets or encountering an empty-since-start bucket will also stop the search.
The insertion algorithm can only insert into empty-since-start buckets. True False
false The insertion algorithm can insert into empty-since-start and empty-after-removal buckets. Whichever bucket is encountered first in the probing sequence will be used for the insertion.
A ________ computes a bucket index from the item's key.
hash function
a data structure that stores unordered items by mapping (or hashing) each item to a location in an array (or vector)
hash table
In a hash table, an item's _______ is the value used to map to an index.
key
A hash function computes a bucket index from an item's _____. integer value key
key * The key is usually a part of the item, such as a student object that has a name, age, and student ID number, with the ID number serving as the key.
A hash table with _________ handles a collision by starting at the key's mapped bucket, and then linearly searches subsequent buckets until an empty bucket is found.
linear probing
A hash table's _________ is the number of items in the hash table divided by the number of buckets.
load factor
A _________ squares the key, extracts R digits from the result's middle, and returns the remainder of the middle digits divided by hash table size N.
mid-square hash
A _______ uses the remainder from division of the key by hash table size N.
modulo hash
A common hash function uses the _______, which computes the integer remainder when dividing two numbers.
modulo operator %
Will the hash function and expected key likely work well for the following scenarios? Hash function: key % 1000 Key: Customer's 3-digit U.S. phone number area code, of which about 300 exist. Hash table size: 1000 Yes No
no Every key is ideally unique, but numerous customers may have the same area code. Also, fewer than a third of the 1000 buckets will be used; the U.S. has only about 300 area codes.
For all items that might possibly be stored in the hash table, every key is ideally unique, so that the hash table's algorithms can search for a specific item by that key. True False
true
Will the hash function and expected key likely work well for the following scenarios? Hash function: key % 1000 Key: Selling price of a house. Hash table size: 1000 Yes No
no Most house prices are rounded to the nearest 1000, as in $249,000. Thus, nearly all prices will map to bucket 0.
Will the hash function and expected key likely work well for the following scenarios? Hash function: key % 1000 Key: 6-digit employee ID Hash table size: 20000 Yes No
no Numerous collisions are likely. The hash function only maps keys into the first 1000 buckets. A good hash function should uniformly distribute hashed keys across all buckets.
Will the hash function and expected key likely work well for the following scenarios? Hash function: key % 40 Key: 4-digit even numbers Hash table size: 40 Yes No
no The remainder from dividing two even numbers is an even number. So, keys only map to even buckets. If keys have common factors or the expected keys are unknown, making N a prime number typically reduces collisions.
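The effect can be seen by collecting the set of buckets that 4-digit even keys actually reach. The helper `buckets_used` is hypothetical, and 41 is picked here only as a nearby prime for comparison.

```python
def buckets_used(keys, table_size):
    """Return the set of bucket indices that the keys map to under key % table_size."""
    return {key % table_size for key in keys}


even_keys = range(1000, 10000, 2)  # every 4-digit even number

used_40 = buckets_used(even_keys, 40)
print(len(used_40), all(b % 2 == 0 for b in used_40))  # 20 True
print(len(buckets_used(even_keys, 41)))                # 41
```

With 40 buckets, half the table is never used; with a prime size of 41, every bucket is reachable.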
Consider the following hash table and a hash function of key % 10. What does HashSearch(valsTable, 112) return?
null
Given hash function of key % 10, determine the bucket status after the following operations have been executed. HashInsert(valsTable, item 64) HashInsert(valsTable, item 20) HashInsert(valsTable, item 51) HashRemove(valsTable, 51) Bucket 4: occupied empty-after-removal
occupied
A _________ maps items to buckets with no collisions.
perfect hash function
A hash table with _________ handles a collision by starting at the key's mapped bucket, and then quadratically searches subsequent buckets until an empty bucket is found.
quadratic probing
Using linear probing, a hash table _________ algorithm uses the sought item's key to determine the initial bucket. The algorithm probes each bucket until either a matching item is found, an empty-since-start bucket is found, or all buckets have been probed. If the item is found, the item is removed, and the bucket is marked empty-after-removal.
remove
A hash table ______ operation increases the number of buckets, while preserving all existing items.
resize
If a message is encrypted with a left shift of X, what shift is needed to decrypt? left shift of X right shift of X
right shift of X
In linear probing, a hash table _______ algorithm uses the sought item's key to determine the initial bucket. The algorithm probes each bucket until either the matching item is found (returning the item), an empty-since-start bucket is found (returning null), or all buckets are probed without a match (returning null). If an empty-after-removal bucket is found, the _____ algorithm continues to probe the next bucket.
search
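The insert, search, and remove behavior described for linear probing can be sketched in one small class. The class and method names are illustrative, keys are assumed to be integers, duplicate keys are not checked, and the two empty states are represented by marker strings chosen for this example. Key 74 is a made-up key added to show a search passing over a removed bucket.

```python
EMPTY_SINCE_START = "empty-since-start"
EMPTY_AFTER_REMOVAL = "empty-after-removal"


class LinearProbingTable:
    def __init__(self, size):
        self.buckets = [EMPTY_SINCE_START] * size

    def _probe(self, key):
        """Yield bucket indices starting at the key's mapped bucket."""
        size = len(self.buckets)
        for i in range(size):
            yield (key % size + i) % size

    def insert(self, key):
        for index in self._probe(key):
            # Either kind of empty bucket can receive the item.
            if self.buckets[index] in (EMPTY_SINCE_START, EMPTY_AFTER_REMOVAL):
                self.buckets[index] = key
                return index
        return None  # all buckets probed, table is full

    def _find(self, key):
        for index in self._probe(key):
            bucket = self.buckets[index]
            if bucket == EMPTY_SINCE_START:
                return None  # the key cannot be further along
            if bucket == key:
                return index
            # Occupied by another key or empty-after-removal: keep probing.
        return None

    def search(self, key):
        return key if self._find(key) is not None else None

    def remove(self, key):
        index = self._find(key)
        if index is not None:
            self.buckets[index] = EMPTY_AFTER_REMOVAL
        return index


table = LinearProbingTable(10)
for key in (64, 20, 51):
    table.insert(key)
table.remove(51)
print(table.buckets[1])  # empty-after-removal
print(table.buckets[2])  # empty-since-start
print(table.buckets[4])  # 64

table.insert(74)         # collides with 64 at bucket 4, lands in bucket 5
table.remove(64)
print(table.search(74))  # 74 (search continues past the removed bucket 4)
```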
A perfect hash function can be created if the number of items and all possible item keys are known beforehand. True False
true
Cryptography is used heavily in internet communications. True False
true
Assume a hash function returns key % 16 and quadratic probing is used with c1 = 1 and c2 = 1. Refer to the table below. 32 was inserted before 16 True False
true 16 and 32 both have an initial bucket of 0. 32 is at index 0 and 16 is at index 2, implying that 32 was inserted before 16.
Suppose a hash table has 101 buckets. If the hash table was using open addressing, a load factor < 0.25 guarantees that no more than 25 collisions will occur during insertion. True False
true A load factor < 0.25 means fewer than 25% of the 101 buckets are in use. Therefore, at most 25 buckets are in use. An insertion only collides with non-empty buckets, so no more than 25 collisions will occur during insertion.
Generating and storing random data alongside each password hash in a database, and using (password + random_data) to generate the hash value, can help increase security. True False
true If a different random value is generated for each password, an attacker will have a more difficult time trying to find an approach that will recover all passwords in the database.
A user could log in with an incorrect password if a password hashing function produced the same hash value for two different passwords. True False
true If password verification involves comparing hash values, each password must have a distinct hash code. Otherwise, a user could potentially log in with an incorrect password.
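The idea of hashing (password + random_data) can be sketched with Python's standard `hashlib` and `os` modules. SHA-256 is used here purely as an illustrative cryptographic hash function; the card does not name one, and the function names and 16-byte salt length are assumptions of this sketch.

```python
import hashlib
import os


def hash_password(password, salt=None):
    """Return (salt, hex digest) for password + salt; generates a salt if none given."""
    if salt is None:
        salt = os.urandom(16)  # random data stored alongside the hash
    digest = hashlib.sha256(password.encode("utf-8") + salt).hexdigest()
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Recompute the hash with the stored salt and compare."""
    return hash_password(password, salt)[1] == stored_digest


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

Because each stored entry has its own random data, two users with the same password end up with different hash values.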