Final Exam pt4
Hashing is a useful technique for storing and accessing data based on __________.
ID (key)
What is the difference in array representation between hashed, sequential, and sorted approaches?
Illustration needed.
What is the impact of collisions on accessing elements?
They complicate the task of accessing elements.
How does the multiplicative hash function work?
It multiplies the key by a constant less than one and returns the first few digits of the fractional part of the result.
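A minimal sketch of the multiplicative method described above. The function name, the default constant (roughly 0.618), and the `digits` parameter are illustrative assumptions, not taken from the textbook.

```python
def multiplicative_hash(key: int, constant: float = 0.6180339887, digits: int = 3) -> int:
    """Multiply the key by a constant below one and keep the leading fractional digits."""
    # Assumption: `constant` is any value strictly between 0 and 1.
    fraction = (key * constant) % 1.0   # fractional part of key * constant
    return int(fraction * 10 ** digits)  # first `digits` digits of that fraction
```

With the defaults, the result always falls in the range 0 to 999, so it could index a table of 1000 slots.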
How does the mid-square hash function work?
It multiplies the key by itself and returns some middle digits of the result.
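A minimal sketch of the mid-square method described above. How many middle digits to keep, and how to center them, are assumptions made here for illustration.

```python
def mid_square_hash(key: int, digits: int = 2) -> int:
    """Square the key and return `digits` digits taken from the middle of the square."""
    squared = str(key * key)
    # Assumption: center the window; short squares are returned whole.
    start = max((len(squared) - digits) // 2, 0)
    return int(squared[start:start + digits])
```

For example, a key of 1234 squares to 1522756, and keeping three middle digits yields 227.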
What is the disadvantage of Mergesort, despite having a worst-case running time of O(n log n)?
It requires allocating a second array for merging.
What happens if the keys do not match during linear probing?
Linear probing is used again, starting at the next slot in the hash table.
What is a hash table?
List that uses hashing for storing/accessing elements.
If the hash function was not used to determine where the data was stored, what is it useless for?
Locating where some particular data is stored.
What is the disadvantage of a binary search tree?
May become unbalanced, O(n) search
Name two types of hash functions mentioned in the textbook for attaining deterministic randomness.
Mid-square hash function and Multiplicative hash function.
What are two examples of other kinds of hash functions?
Mid-square hash function and multiplicative hash function.
When a complete binary tree with N nodes is stored as an array, the algorithm becomes more efficient by traversing from the element at index ______ back to the first element.
N/2
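Why starting near N/2 is enough can be checked with a little index arithmetic. The sketch below assumes 0-based Python indexing, where the children of index i sit at 2i + 1 and 2i + 2, so the last non-leaf node is at index N // 2 - 1 (the 0-based counterpart of N/2). The helper names are hypothetical.

```python
def is_leaf(index: int, n: int) -> bool:
    """A node is a leaf when its left child would fall outside the array."""
    return 2 * index + 1 >= n

def last_non_leaf(n: int) -> int:
    """Index of the last node that has at least one child (0-based)."""
    return n // 2 - 1
```

Every element after `last_non_leaf(n)` is a leaf, and a 1-node tree is already a heap, so those elements can be skipped.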
What are the advantages of hashing?
Efficient search, with no need to keep the data sorted.
Is it possible to completely avoid collisions in practice?
No, it is extremely difficult.
What is another disadvantage of hashing?
Number of collisions grows rapidly when table is nearly full
What is the advantage of hashing?
O(1) search speed, no sorting required
What is the advantage of binary search?
O(log n) search speed
Why are collisions bad?
Only one element can be stored at the collision location.
Once all elements of one of the subarrays have been copied in Mergesort, no further comparison is necessary, and any 'unaccounted for' elements remaining in the other subarray are __________ copied into the temporary array.
Orderly
What is a head pointer?
Pointer that points to the first element in a linked list.
What does increasing the range of the hash function do?
Reduces collisions.
When the node currently visited is a leaf, no __________ work is needed because any 1-node tree is always a heap.
Reheapification
What should an algorithm do when removing an item?
Replace it with DeletedItem.
What are the disadvantages of hashing?
Requires a good hash function and potential for clustering.
What is the disadvantage of binary search?
Requires sorted array
What are the disadvantages of binary search?
Requires sorted data, and keeping an array sorted makes insertions and deletions costly.
What is the disadvantage of hashing?
Search speed for non-existent items is no better than linear search
What are the two approaches to data storage/access studied so far?
Sequential and sorted.
How do we delete an element from the hash table using linear probing?
Set location to EmptyItem or DeletedItem.
What is the second step to delete an element using linear probing?
Set the item at h(key) to EmptyItem or DeletedItem.
What is the first step to delete an element using linear probing?
Set the location to h(key).
After each iteration of Step 2, 'Sorted' region grows by one element and 'Heap' region _______________.
Shrinks by one element
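The growing 'Sorted' region and shrinking 'Heap' region can be sketched as follows. This is one possible max-heap version using 0-based indexes, not necessarily the textbook's code; `sift_down` is a hypothetical name for reheapification downward.

```python
def sift_down(items, root, size):
    """Reheapify downward: turn the semiheap rooted at `root` into a max-heap."""
    while True:
        left, right = 2 * root + 1, 2 * root + 2
        largest = root
        if left < size and items[left] > items[largest]:
            largest = left
        if right < size and items[right] > items[largest]:
            largest = right
        if largest == root:
            return
        items[root], items[largest] = items[largest], items[root]
        root = largest

def heapsort(items):
    n = len(items)
    # Step 1: build a heap, visiting non-leaf nodes from the deepest level up.
    for index in range(n // 2 - 1, -1, -1):
        sift_down(items, index, n)
    # Step 2: swap the root with the last 'Heap' element, then reheapify.
    for heap_size in range(n - 1, 0, -1):
        items[0], items[heap_size] = items[heap_size], items[0]
        sift_down(items, 0, heap_size)  # 'Heap' region is now items[:heap_size]
```

After each pass of the second loop, `items[heap_size:]` is the 'Sorted' region and has gained one element.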
What is the trade-off when selecting the table size for a hash table?
Space and time.
What are the considerations for double hashing?
Stay within hash table range and visit every array position.
What are the two uses of a hash function?
Storing and accessing data.
What is hashing?
Technique for ordering and accessing elements in a list.
Descriptively, the sorted array is obtained by copying the smaller of the leftmost 'unaccounted for' elements of the two subarrays into the __________ array.
Temporary
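The merge step can be sketched as below. The function names are illustrative, and a new Python list stands in for the temporary array.

```python
def merge(left, right):
    """Merge two sorted lists into a new (temporary) sorted list."""
    temp = []
    i = j = 0
    # Copy the smaller of the leftmost 'unaccounted for' elements.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            temp.append(left[i])
            i += 1
        else:
            temp.append(right[j])
            j += 1
    # One side is exhausted: copy the rest in order, with no more comparisons.
    temp.extend(left[i:])
    temp.extend(right[j:])
    return temp

def merge_sort(items):
    """Halve until 1-element sublists remain, then merge back up."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))
```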
What happens when a collision occurs in linear probing?
The colliding element is stored in the next available location.
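A minimal sketch of insertion with linear probing. It assumes integer keys, a hash function of `key % table size`, and a hypothetical `EMPTY` marker standing in for EmptyItem.

```python
EMPTY = None  # hypothetical stand-in for EmptyItem

def insert(table, key):
    """Store key at its hash location, or at the next available slot after it."""
    size = len(table)
    index = key % size                # assumed hash function
    for _ in range(size):
        if table[index] is EMPTY:
            table[index] = key
            return index
        index = (index + 1) % size    # wrap around like a circular array
    raise OverflowError("hash table is full")
```

In a 7-slot table, the keys 10, 17, and 24 all hash to slot 3, so they end up in slots 3, 4, and 5.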
What must be consistent when using a hash scheme?
The hash scheme used to store an element must be used to access it.
What is the location of the rightmost node item at the deepest level of the tree when the 'Heap' region is represented as an array?
The last element of the array
What happens when collisions occur?
The other element needs to be stored in a different location.
How are the two uses of a hash function related?
They are intimately related.
What is the purpose of making the hash table size relatively prime with respect to h2?
To ensure every array position is visited during double hashing.
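The effect of the step size on coverage can be checked with a short sketch. The particular second hash function, `1 + key % (table_size - 2)`, is an assumption chosen for illustration; with a prime table size, every step it produces is relatively prime to the table size.

```python
def probe_sequence(start, step, table_size):
    """Slots visited when stepping `step` positions at a time, wrapping around."""
    return [(start + i * step) % table_size for i in range(table_size)]

def double_hash_probes(key, table_size):
    h1 = key % table_size               # assumed first hash: where to start
    h2 = 1 + key % (table_size - 2)     # assumed second hash: how far to move
    return probe_sequence(h1, h2, table_size)
```

With a table size of 11, any key visits all 11 slots. By contrast, a step of 2 in a table of size 10 only ever reaches 5 distinct slots.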
What is the purpose of a hash function in hashing?
To identify the location of an element in the list.
What is the main goal at each visit of a node during the traversal of the tree?
To make the semiheap rooted at that node into a heap.
What is the purpose of making a distinction between EmptyItem and DeletedItem?
To optimize the retrieval algorithm and avoid unnecessary searching.
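The distinction can be sketched as follows, assuming integer keys, a hash function of `key % table size`, and hypothetical `EMPTY` and `DELETED` markers standing in for EmptyItem and DeletedItem.

```python
EMPTY = object()    # hypothetical marker: slot has never been used
DELETED = object()  # hypothetical marker: slot was used, then vacated

def find(table, key):
    """Return the index holding key, or -1 if it is not in the table."""
    size = len(table)
    index = key % size                 # assumed hash function
    for _ in range(size):
        slot = table[index]
        if slot is EMPTY:
            return -1                  # never-used slot: the search can stop
        if slot is not DELETED and slot == key:
            return index
        index = (index + 1) % size     # DELETED or mismatch: keep probing
    return -1

def delete(table, key):
    """Mark the key's slot as DELETED so later searches probe past it."""
    index = find(table, key)
    if index == -1:
        return False
    table[index] = DELETED
    return True
```

If `delete` wrote `EMPTY` instead, a search for a key stored further along the same probe chain would stop too early and report it missing.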
What is the purpose of linear probing?
To resolve collisions by sequentially searching the hash table.
What is the role of a good hash function?
To spread elements uniformly and minimize collisions.
What is the purpose of a hash table?
To store and access elements using hashing.
Array Traversal:
Traversing the elements from the last element to the first element.
True or False: The basis for Mergesort involves continuously halving the array until 1-element subarrays are obtained.
True
True or False: Mergesort has the advantage of O(n log n) average-case running time but requires a second array for merging.
True
True or False: Reheapification downward is applied to the root after swapping the first and last elements of the 'Heap' region.
True
How is the temporary array managed in terms of memory in Mergesort?
Typically, dynamically allocated, used, and then deallocated.
What is clustering in the context of linear probing?
Items bunch together in runs of adjacent occupied slots, giving an uneven distribution across the hash table.
What are some characteristics of a good hash function?
Uniformly spreads elements and minimizes collisions.
How does double hashing reduce clustering?
Uses a second hash function to set the probe step size, so keys that collide at one slot follow different probe sequences.
How is data accessed in the sorted approach?
Using a binary search algorithm.
How is data accessed in the sequential approach?
Using a linear search algorithm.
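The two access strategies above can be compared side by side. This is a minimal sketch; both functions return the index of the target, or -1 if it is absent.

```python
def linear_search(items, target):
    """Sequential approach: check each element in turn, O(n)."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1

def binary_search(sorted_items, target):
    """Sorted approach: halve the search range each step, O(log n)."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
```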
What is the most common implementation of chained hashing?
Using linked lists.
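A minimal sketch of chaining with linked lists, assuming integer keys and a hash function of `key % table size`. The class and method names are hypothetical.

```python
class Node:
    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node

class ChainedTable:
    def __init__(self, size=7):
        self.heads = [None] * size  # one head pointer per table slot

    def put(self, key, value):
        index = key % len(self.heads)   # assumed hash function
        node = self.heads[index]
        while node is not None:         # update the value if the key exists
            if node.key == key:
                node.value = value
                return
            node = node.next
        # Otherwise insert a new node at the front of this slot's chain.
        self.heads[index] = Node(key, value, self.heads[index])

    def get(self, key):
        node = self.heads[key % len(self.heads)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return None
```

Keys that collide simply share a chain, so no probing for another slot is needed.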
Why do collisions complicate the deletion algorithm?
We may not want to simply replace the item with EmptyItem.
When does the retrieval algorithm end?
When an unused location with EmptyItem is encountered.
What is a collision?
When two or more keys map to the same index in the hash table.
In Heapsort, the running time is O(n log n) for both the __________ and the __________.
Worst case, average case
What is a circular array strategy?
Wrapping around the hash table to ensure hash result stays within range.
Which sorting algorithm is particularly good if the array is nearly sorted to begin with? a) Insertionsort b) Selectionsort c) Mergesort d) Quicksort
a
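Why Insertionsort suits nearly sorted input can be seen by counting element shifts. The sketch below is an illustrative version that sorts in place and returns the number of shifts it performed.

```python
def insertion_sort(items):
    """Sort in place and return how many element shifts were needed."""
    shifts = 0
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for `current`.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
            shifts += 1
        items[j + 1] = current
    return shifts
```

A list with one adjacent pair out of order needs a single shift, while a reversed list of six elements needs fifteen.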
What is a notable disadvantage of Mergesort? a) High time complexity b) Requirement for a temporary array c) Inability to handle large datasets d) Lack of stability
b
What is the main goal at each visit of a node during the traversal of the tree? a) Move the root to the 'Sorted' region b) Make the semiheap rooted at that node into a heap c) Enlarge the 'Heap' region d) Reorder the 'Sorted' region
b
What is the primary goal of Quicksort? a) Continuous halving b) Filtering segments c) Sorting with Mergesort d) Sorting a given array
b
What term is introduced to describe a complete binary tree whose sub-trees are both heaps, but the root may be out of place? a) Superheap b) Semiheap c) Megheap d) Microheap
b
Which data structure is applicable for storing data items sequentially? a) Hash table b) Contiguous array c) Binary search tree d) AVL tree
b
Which hash function type involves squaring the key and taking the middle digits of the result? a) Pseudo-random hash function b) Multiplicative hash function c) Mid-square hash function
c
In the algorithm, to move the largest item from 'Heap' region, the root is removed, and the rightmost node at the deepest level is copied into the root, turning the 'Heap' region into the '_______________' region.
reduced
True or False: a good hash function minimizes collisions by spreading elements uniformly.
True
What is a uniform distribution?
Evenly spread distribution of items.
When does the retrieval algorithm continue searching?
When an unused location with DeletedItem is encountered.
Mergesort and Quicksort each have an average-case running time that is __________.
O(n log n)
What is linear probing?
A collision-handling algorithm that sequentially searches the hash table.
What is a hash scheme?
A combination of a hash function and a collision-handling algorithm.
Why must a hash function be fast, and what is the consequence of collisions in hashing?
A hash function must be fast for efficient computation. Collisions in hashing lead to increased work during storage and access.
What is a collision-handling algorithm?
An algorithm that resolves collisions in a hash table.
How do we access an element from the hash table using linear probing?
Apply hash function, compare keys, and use linear probing if necessary.
What is the first step to access an element using linear probing?
Apply the hash function on the key.
What data structures can be used in the sorted approach?
Array or binary search tree.
What data structures can be used in the sequential approach?
Array or linked list.
What is the biggest challenge in designing a good hash function?
Avoiding collisions.
Why can't we use a hash function to access data if it was stored in sequential or relative order?
Because the hash function determines where to store and look for data.
How can we make the deletion algorithm as efficient as possible?
By distinguishing between EmptyItem and DeletedItem.
How can collisions be minimized in a hash table?
By using a hash table with more space than needed.
What does dealing with collisions lead to?
Collision-handling algorithms.
What is the second step to access an element using linear probing?
Compare the desired key to the actual key at that location.
What is the division method used for in hash functions?
Computing hash values, typically as the remainder of the key divided by the table size.
What is a collision?
Condition when two or more keys produce the same hash location.
How is data stored in the sorted approach?
Data is stored in sorted order.
How is data stored in the sequential approach?
Data is stored sequentially as it arises.
What is a linked list?
Data structure that stores a sequence of elements with pointers.
What do we have to do when collisions occur?
Deal with them.
What should be done to the hash function to minimize collisions?
Design it properly.
What is a second hash function used for in double hashing?
Determines how far forward to move through the array.
What is the hash function used for in accessing data?
Determining where to look for the data.
What is the hash function used for in storing data?
Determining where to store the data.
What is chained hashing or chaining?
Each hash table component holds multiple entries.
What is the advantage of a binary search tree?
Efficiency for insertions and deletions
What are the advantages of binary search?
Efficient search and sorted data.
In perfect hashing, the table has a slot for __________.
Every possible data item
True or False: In the strategy for sorting using Mergesort, the halving action is represented by dashed arcs.
False
True or False: Quicksort involves halving the array until 1-element subarrays are obtained.
False
True or False: During the tree traversal described, reheapification work is needed at every node visited.
False
True or False: Hashing is a storage and access technique that always guarantees O(1) time complexity.
False
True or False: Hashing is a suitable technique for min-max searches and range searches.
False
True or False: In Mergesort, the temporary array is dynamically allocated but never deallocated.
False
True or False: Increasing the hash table size generally increases the likelihood of collisions.
False
True or False: Quicksort always has a worst-case running time of O(n^2).
False
True or False: Reheapification work is needed at every node during the traversal of the tree.
False
True or False: Selectionsort and Insertionsort have a running time of O(n log n) in both the worst case and the average case.
False
True or False: The 'Sorted' region initially contains elements from the entire array.
False
True or False: The algorithm described always starts traversing from the last element in the array.
False
True or False: Traversing the tree from the deepest level up ensures that leaf nodes are visited last.
False
Traversal:
From the deepest level up and from right to left at each level.
What is a hash function?
Function that manipulates the key of an element to identify its location in the list.
What factors affect collisions in hashing?
Hash table size, nature of input, and collision resolution strategy.