COP3530 Exam #2
How many children can a node in a B+ tree have
A B+ tree is an M-ary tree and each node can have up to M children
Does selection sort recognize a sorted array?
no, it runs through all of its passes until everything is sorted
Red-Black Tree Rules
node = red (true) / black (false or null); root = black; a red node always has black children; the # of black nodes in any path from the root to a leaf is the same
how many children can a non-leaf node have in a B+ tree
non-leaf nodes can have between (M/2) and M children
How many keys can non-leaf nodes store in B+ trees
non-leaf nodes can store up to M-1 keys
properties of non-ordered set objects (sets/hash tables)
not indexed; does not reveal the order of inserted items; efficient search, O(1); allows removal of elements without moving other elements around
Block number is equal to
number of leaves ( if there are 512 blocks we need 512 leaves)
how many swaps on average per heap insertion
on average one or two swaps per heap insertion
deleting from a heap
only the root is deleted; replace it with the last item in the heap, then move that item down until it's at its correct place
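A minimal C++ sketch of this delete step on a 0-indexed min-heap stored in a std::vector (names and conventions are illustrative, not the course's exact code):

#include <cstddef>
#include <utility>
#include <vector>

// Remove the root of a 0-indexed min-heap: replace it with the last
// item, then percolate that item down to its correct place.
void deleteMin(std::vector<int>& heap) {
    if (heap.empty()) return;
    heap[0] = heap.back();
    heap.pop_back();
    std::size_t i = 0, n = heap.size();
    while (true) {
        std::size_t left = 2 * i + 1, right = 2 * i + 2, smallest = i;
        if (left < n && heap[left] < heap[smallest]) smallest = left;
        if (right < n && heap[right] < heap[smallest]) smallest = right;
        if (smallest == i) break;              // heap property restored
        std::swap(heap[i], heap[smallest]);    // move the item down one level
        i = smallest;
    }
}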
average search cost in terms of the load factor L for open addressing and chaining
open addressing (linear probing): (1/2)(1 + 1/(1 - L)); chaining: 1 + L/2
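As a quick worked check (assuming these are the successful-search probe estimates, with λ the load factor), in LaTeX:

P_{\text{open addressing}} \approx \frac{1}{2}\left(1 + \frac{1}{1-\lambda}\right), \qquad P_{\text{chaining}} \approx 1 + \frac{\lambda}{2}

For λ = 0.5 these give about 1.5 probes for open addressing and 1.25 for chaining.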
I move from low to high indices and look at neighboring items and put them in order with respect to each other.
Bubble Sort
I am a quadratic sort. In the best case (already sorted data), I am O(n); in the worst case (reverse sorted), I actually do O(n^2) comparisons and O(n^2) swaps.
Bubble Sort (Insertion can also be correct)
Quadratic Sorts
Bubble sort, selection sort, insertion sort
key modulo table size: what table size is desired?
the table size should be prime to avoid extra collisions; pick the prime number closest to the desired size
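A minimal C++ sketch of the modulo scheme; the prime 211 is just an illustrative choice near a desired size of 200:

#include <cstddef>

// Table size chosen as a prime close to the desired capacity (~200).
const std::size_t TABLE_SIZE = 211;

std::size_t hashIndex(unsigned int key) {
    return key % TABLE_SIZE;   // key modulo table size
}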
Describe the children of a root in a B+ tree
the root is a leaf or has between 2 and M children
(real life application of a) queue
ticketing websites
what is the basis of hashing
to transform the item's key value into an integer value which is then transformed into a table index
non linear ordered data structures
trees, graphs
In a B tree can M be the same as L
true
heaps are complete trees
true
Why does a B tree have the following properties: non-leaf node has between M/2 and M children all leaves are at the same depth leaves have between L/2 and L values
when a node is full and a split occurs, we end up with two nodes that have the minimum number of entries but plenty of space to add more
Why does the B+ tree exist
when data is pulled from external memory, cache blocks are mapped to B-tree nodes; when searching indexed databases, a node maps to a block of data
Why is deleting entries hard in hash table functions open addressing?
when deleting you cannot set the entry to null, because a search for an item that may have collided with the deleted item could incorrectly conclude it is not in the table (instead, store a dummy value or mark the location as deleted)
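A minimal C++ sketch of the "mark the location" idea; the slot states are illustrative:

// A slot remembers whether it ever held an item, so a probe sequence
// can continue past deleted entries instead of stopping too early.
enum class SlotState { EMPTY, OCCUPIED, DELETED };

struct Slot {
    int key = 0;
    SlotState state = SlotState::EMPTY;
};

void removeAt(Slot& slot) {
    slot.state = SlotState::DELETED;   // dummy marker instead of emptying the slot
}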
does insertion sort recognize a sorted array?
yes, it does not enter the while loop; best case O(n)
does bubble sort recognize a sorted array?
yes, it exits early; best case: O(n)
When can you replace a item in a hash table open addressing?
you cannot replace the deleted item with a new item until you verify that the new item is not in the table
I am a divide and conquer sort. I split the array into smaller and smaller pieces and sort as I put the pieces back together
Merge Sort
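A minimal C++ sketch of the split-then-merge idea on a std::vector (illustrative, not the course's exact code):

#include <algorithm>
#include <vector>

// Sort a[lo, hi): split in half, sort each half, merge the sorted halves.
void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return;                // 0 or 1 items: already sorted
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid, hi);
    std::vector<int> merged;
    merged.reserve(hi - lo);
    int i = lo, j = mid;
    while (i < mid && j < hi)                // take the smaller front item each time
        merged.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i < mid) merged.push_back(a[i++]);
    while (j < hi)  merged.push_back(a[j++]);
    std::copy(merged.begin(), merged.end(), a.begin() + lo);
}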
Bucket Sort syntax
N = number of items, M = the largest value the items can be
A hash table has a size of 2048. The search keys are English words. The hash function is h(key) = (sum of positions in alphabet of key's letters) mod 2048. Is this a good hash function?
No (the letter-position sums are small, so words cluster at the low indices and most of the 2048-entry table goes unused)
The hash table is 10,000 entries long. The search keys are integers in the range 0 through 9,999. The hash function is h(key) = (key * random) truncated to an integer, where random represents a sophisticated random-number generator that returns a real value between 0 and 1. Is this a good hash function?
No (it depends on a random number, so the same key would hash to different indices on different calls)
The hash table is 10,000 entries long. The search keys are integers in the range 0 through 9,999. The hash function is given by the following method: int hashIndex(int x) { for (int i = 1; i <= 1000000; i++) x = (x * x) % 10000; return x; } Is this a good hash function?
No (looping a million times while repeatedly squaring makes it far too expensive to compute)
Is quicksort guaranteed to be O(n log n)?
No (worst case O(n^2))
CC all operations for a heap
O(log n)
CC of heap sort
O(n log n) (not recursive): arrange the items in an array, build the heap by percolating down (only half of the elements need it), then repeatedly delete the root
CC of left rotating a BST tree at a given node? (AVL Trees)
O(1)
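A minimal C++ sketch of why a single left rotation is O(1): only a constant number of pointers are relinked (the Node layout is illustrative):

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Rotate left around x and return the new subtree root.
Node* rotateLeft(Node* x) {
    Node* y = x->right;   // right child moves up
    x->right = y->left;   // y's left subtree becomes x's right subtree
    y->left = x;          // x becomes y's left child
    return y;             // caller attaches y where x used to hang
}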
What is the average computational complexity of accessing an item in a hash table
O(1)
What is the complexity of accessing the minimum element from a min-heap?
O(1)
average CC of operations (access, insert, delete) of a Hash Table
O(1)
CC of Set Operations
O(1) testing for membership, adding elements, removing elements.
Bucket Sort CC
O(N+M)
CC of BST Search
O(log n)
CC of adding an item to a BST
O(log n)
CC of heap deletion
O(log n)
Splay tree average CC operations
O(log n)
What is the average computational complexity of accessing an item in a binary search tree
O(log n)
What is the complexity of adding an element in the min-heap?
O(log n)
unbalanced BST search
O(n)
worst CC of operations (access, insert, delete) of a Hash Table
O(n)
Build Heap CC
O(n) (put the items in a binary tree using an array and percolate down in reverse order)
Splay tree worst CC operations
O(n) (But subsequent operations are much faster)
Difference
Only in A and Only in B
Hash Tables vs. BST
Operations: BST O(log n) guaranteed; hash table O(1) but can degrade with a large load factor or poor hash function. Traversal: a hash table can't be traversed in order; a BST can. Memory: BST memory is allocated as needed; a hash table allocates more memory than needed.
I am a divide and conquer sort. I choose a pivot value and partition the array into items bigger and smaller than the pivot. I then perform the partition algorithm on each half.
Quick Sort
I'm building a database. I'm going to be doing a lot of insertions and deletions. The insertions aren't in any particular order. Memory access is fast for this database. What balanced tree should I choose?
Red-Black Tree
I am a quadratic sort. No matter what (best, worst, average case), I am O(n^2).
Selection Sort
I am a sort that finds the smallest item in the array and puts it in its correct place, then finds the second smallest item in the array and puts it in its correct place, and so on...
Selection Sort
Which collision avoidance scheme allows the load factor to go above 1?
Separate chaining
I am a refined version of Insertion Sort where items separated by a specified gap are sorted with respect to each other, then this is performed again with a smaller gap value.
Shell Sort
BST tree that brings recently accessed item to the root (makes recently searched item accessible in O(1) time if accessed again )
Splay Tree
data structure that guarantees that m tree operations will take O(m log n) time.
Splay Tree
My entire database is held in fast memory. I often search for the same item multiple times in a row. What balanced tree should I choose?
Splay Tree (same item search)
What if you wanted to make a table where the key is a character and the value is its frequency in a file.
This scheme may produce collisions, where two different keys hash to the same index. The characters with values of 41 and 241 will hash to the same index of 41.
where is data stored in a B+ tree
data items are stored in leaves
How to restore a heap
delete the root from the heap and place it just behind the last heap item
Rehashing
doubling the size of the table and recalculating the hash values for all entries
maps are one to one explain
each key maps to one value (note: value does not have to be unique, in BST key=node)
Goal of hash functions
evenly distribute the data (a random distribution of values means fewer collisions; keys mostly consist of strings of letters and digits), and be easy to compute (though a very simple function may generate a lot of collisions)
complete tree
every level above the last is completely filled, and in the last level all nodes are filled in from the left
Name two properties of a good hash function.
fast to compute; evenly distributes values throughout the table
(real life application of a) tree
files on a hard disk
A binary tree (is also a)
graph
The probability of a collision with s.hashCode() % table.length is proportional to
how full the table is
(real life application of a) array
images in pixel form
AVL tree insertion steps
insert, check balance factors, rotate to make it balanced
inserting into a heap
insert the new item in the next position at the bottom of the heap, then percolate up / bubble up (while the new item is not at the root and is smaller than its parent, swap the new item with its parent, moving the new item up the heap)
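A minimal C++ sketch of this percolate-up step on a 0-indexed min-heap stored in a std::vector (illustrative):

#include <cstddef>
#include <utility>
#include <vector>

// Append the new item at the bottom, then swap it upward while it is
// smaller than its parent.
void heapInsert(std::vector<int>& heap, int value) {
    heap.push_back(value);
    std::size_t i = heap.size() - 1;
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[i] >= heap[parent]) break;   // correct spot reached
        std::swap(heap[i], heap[parent]);     // bubble up one level
        i = parent;
    }
}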
shells sort
instead of sorting the entire array, shell sort sorts many smaller subarrays using insertion sort before sorting the entire array
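A minimal C++ sketch using the simple gap sequence n/2, n/4, ..., 1 (the gap sequence is an illustrative choice; better ones exist):

#include <vector>

// Insertion-sort items that are 'gap' apart, shrinking the gap each pass.
void shellSort(std::vector<int>& a) {
    int n = static_cast<int>(a.size());
    for (int gap = n / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < n; ++i) {
            int item = a[i];
            int j = i;
            while (j >= gap && a[j - gap] > item) {   // gapped insertion sort
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = item;
        }
    }
}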
common hash function
int index = unicode % 200;
Radix Sort CC
integers: O(N log M); strings: O(N*L), where L = length of the string
what is the perfect hash?
key=index in the table
how to splay tree insertions
last item inserted at root. (newly inserted key becomes the new root)
left right unbalance
left right rotation needed (parent balance=-2 , left child balance= +1 )
right right unbalance
left rotation needed (parent balance= +2 , right child balance= +1)
properties of BST
left subtree node keys < parent; right subtree node keys > parent (each subtree is a BST); no duplicate nodes
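A minimal C++ sketch of a search that follows these rules (the Node layout is illustrative):

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Smaller keys go left, larger keys go right; O(log n) when balanced.
bool bstContains(const Node* root, int key) {
    while (root != nullptr) {
        if (key == root->key) return true;
        root = (key < root->key) ? root->left : root->right;
    }
    return false;
}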
height of a node
length of longest path from node to a leaf
linear ordered data structures
lists, stacks, queues
lists vs sets
lists: ordered (items accessed through index), duplicate items allowed. Sets: no order and no duplicate items
(real life application of a) stack
local variables in C++
CC of percolate up/down
O(log n)
How many times can you split an array of n items?
log n times
formula height of a node
max(heightL,heightR)+1
sort that splits an array in half again and again until it gets to one item ; log n recursive calls
merge sort
benefit of mid-square
more randomly distributed (modulo is faster, but if the keys' low-order digits aren't randomly distributed, use mid-square)
multiplicative string hash
multiplies the running hash value and adds the ASCII or Unicode value of each character in the string (can start with an initial value)
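A minimal C++ sketch of a multiplicative string hash; the multiplier 31 and the initial value 7 are illustrative choices, not values from the course:

#include <cstddef>
#include <string>

// Repeatedly multiply the running hash and add each character's code.
std::size_t stringHash(const std::string& key, std::size_t tableSize) {
    std::size_t hash = 7;            // illustrative initial value
    for (unsigned char c : key)
        hash = hash * 31 + c;        // multiply, then add the character code
    return hash % tableSize;         // map into the table
}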
how many deletions occur in heap sort
n deletions
how many moves does partitioning require
n moves
What will be the root->data after we search for 1 in the splay tree?
1
load factor
# of filled cells/table size
Splay tree cases
(3) Zig (node to be splayed is a child of the root: rotate with the root), Zig-Zig (node and its parent are both left children or both right children: rotate twice in the same direction), Zig-Zag (node to be splayed is the left child of a right child or vice versa: do a right-left or left-right rotation)
CC of BST
(if not self-balancing) Average = O(log n), Worst = O(n)
balance factor in AVL tree
(start at the bottom) HR - HL, i.e., height of the right subtree minus height of the left subtree
How do you know when to stop searching if the table is full and you have not found the correct value? (linear probing)
-Stop when the index value for the next probe is the same as the hash code value for the object -Ensure that the table is never full by increasing its size after an insertion when its load factor exceeds a specified threshold
Explain chaining (the approach is sometimes called bucket hashing)
-an alternative to open addressing -Each table element references a linked list that contains all of the items that hash to the same table index -The linked list often is called a bucket
A B+ Tree is used for a database with block sizes of 1KB and data sizes of 4 bytes. How many values should each leaf node have? (Hint: What is L?)
256 (L = 1024 bytes / 4 bytes = 256 values per leaf)
Level-order traversal of the tree with root 4, children 2 and 3, and 5 and 8 on the next level?
4 2 3 5 8
My entire database is held in fast memory. The database is stable, so there aren't many insertions or deletions. I do a lot of searches and but rarely search for the same item multiple times in a row. What balanced tree should I choose?
AVL Tree (clue: memory access isn't a concern and there are lots of searches; a B+ tree cares a lot about memory access)
AVL VS. Red-Black Searching
AVL is strictly balanced (height <= 1.441 log n)*; RB is not (h < 2 log n)
AVL vs. RB insert/delete
AVL trees perform calculations and multiple rotations; RB performs no calculations and has fewer rotations*
AVL VS. Red-Black Storage
AVL = balance factor, RB = color (bool)*
CC of self balancing BST operations (AVL, Red-Black, Splay (!=access), B+)
Average = O(log n), Worst = O(log n) (this is for access, search, insertion, and deletion)
I bring blocks of data from external storage. Often it is the case that once I access one item, I also access an item near it in storage. What balanced tree should I choose?
B+ tree (this is the definition of a cache: locality of reference, blocks of data)
used in databases and anything where an item is searched for with a key
BST
I am a sort that keeps "sorted" and "unsorted" regions. Starting with index 1 and moving upward I place items in their spot in the sorted region.
Insertion Sort
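A minimal C++ sketch of this sorted/unsorted-region idea (illustrative):

#include <cstddef>
#include <vector>

// Indices [0, i) are the sorted region; insert a[i] into its spot there.
void insertionSort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int item = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > item) {   // shift larger items right
            a[j] = a[j - 1];
            --j;
        }
        a[j] = item;                         // drop the item into place
    }
}

On already-sorted data the while loop never runs, which matches the O(n) best case above.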
Three common hash functions
Key modulo table size (most popular; used with Java's hashCode method), mid-square, multiplicative string hash (useful if the key is a string)
Formula for L in B tree
L = block size (memory/database block) / size of an item (word or database item) (example: block size / word size)
B+ Tree syntax (L and M)
L = leaf-node rule: a leaf can hold up to L values; M = non-leaf-node rule: up to M children and M-1 values allowed
Is mergesort guaranteed to be O(n log n)?
Yes
Chaining ( Bucket Hashing) CC
access O(1) , insert/deletion O(1), worst case O(n)
basic idea of a splay tree
after a node is accessed it is pushed to the root via a series of operations (start at the bottom and move to the top)
What's special about the leaves in a B+ tree
all the leaves are at the same depth and have between (L/2) and L values
sorting
arranging data in order
what do you use to find the entry in a hash table
based on the key value (you do not need to determine the location yourself)
HeapSort CC
best : O( n log n) average: O( n log n) worst: O( n log n)
Merge Sort CC
best : O( n log n) average: O( n log n) worst: O( n log n)
Quick Sort CC
best : O( n log n) average: O( n log n) worst: O( n^2)
Bubble sort CC
best : O(n) average: O(n^2) worst: O(n^2)
Insertion sort CC
best : O(n) average: O(n^2) worst: O(n^2) In the best case (when the array is sorted already), only one comparison is required for each insertion
Selection sort CC
best : O(n^2) average: O(n^2) worst: O(n^2)
Bubble Sort Vs. Selection sort
bubble sort performs worse
node at position c parent
c/2 (with 1-based array indexing)
does open addressing or chaining have a better performance?
chaining (rehash happens at L>= 1)
set definition
collection that contains no duplicate elements
Bubble sort
compares adjacent/neighbor array elements and exchanges if they are out of order
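A minimal C++ sketch with the early-exit check that gives bubble sort its O(n) best case (illustrative):

#include <cstddef>
#include <utility>
#include <vector>

// Compare neighbors and swap if out of order; stop as soon as a full
// pass makes no swaps (the array is already sorted).
void bubbleSort(std::vector<int>& a) {
    bool swapped = true;
    for (std::size_t pass = 0; swapped && pass + 1 < a.size(); ++pass) {
        swapped = false;
        for (std::size_t i = 0; i + 1 < a.size() - pass; ++i) {
            if (a[i] > a[i + 1]) {
                std::swap(a[i], a[i + 1]);
                swapped = true;
            }
        }
    }
}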
how to calculate unique values in B tree
count values in leaf nodes
quick sort how to
pivot = item at index 0, up = index 1, down = last index; increment up until its item is greater than the pivot, decrement down until its item is smaller than the pivot, swap them, and repeat until up and down cross
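A minimal C++ sketch of this up/down partition scheme with the pivot at index 0 (illustrative; exact index handling varies between textbooks):

#include <utility>
#include <vector>

// Partition a[lo..hi] around the pivot a[lo]; return the pivot's final index.
int partition(std::vector<int>& a, int lo, int hi) {
    int pivot = a[lo];
    int up = lo + 1, down = hi;
    while (true) {
        while (up <= hi && a[up] <= pivot) ++up;   // advance up past small items
        while (a[down] > pivot) --down;            // advance down past large items
        if (up >= down) break;                     // indices crossed
        std::swap(a[up], a[down]);
    }
    std::swap(a[lo], a[down]);                     // put the pivot in place
    return down;
}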
rearranges the array into two parts (called partitioning)
quick sort
right left unbalance
right left rotation needed (parent balance= +2 , right child balance= -1)
left left unbalance
right rotation needed (parent balance=-2 , left child balance=-1)
selection sort
selecting next smallest item and placing it where it belongs
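A minimal C++ sketch; the nested scan always runs in full, which is why selection sort is O(n^2) in every case (illustrative):

#include <cstddef>
#include <utility>
#include <vector>

// Find the smallest remaining item and swap it into position i.
void selectionSort(std::vector<int>& a) {
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t smallest = i;
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[j] < a[smallest]) smallest = j;   // scan the unsorted region
        std::swap(a[i], a[smallest]);
    }
}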
steps to separate B+ tree
split if there are too many values (> M for a non-leaf node or > L for a leaf); when splitting, the left-most item of the right-hand node moves up to the parent
non ordered data structures
sets, hash tables
a type of insertion sort (better)
shell sort
CC changes based on parameters given
shell sort (Not quadratic)
Mid Square hash function
square the key, take digits from the middle, then take that value modulo the table size
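A minimal C++ sketch of mid-square; which digits count as "the middle" is an illustrative choice:

#include <cstddef>
#include <cstdint>

// Square the key, keep some middle digits of the square, then reduce
// modulo the table size.
std::size_t midSquareHash(std::uint32_t key, std::size_t tableSize) {
    std::uint64_t squared = static_cast<std::uint64_t>(key) * key;
    std::uint64_t middle  = (squared / 1000) % 1000000;   // drop 3 low digits, keep 6 middle ones
    return static_cast<std::size_t>(middle) % tableSize;
}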