Multiway search trees: 2-4 trees
Binary Search Tree
- Insert operations add each new node as a leaf
- The tree can become unbalanced
  • Worst-case height: O(n)
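The worst case is easy to reproduce. The sketch below is illustrative only (the names `BSTNode`, `bst_insert`, and `bst_height` are not from the notes): it inserts keys in sorted order into a plain, unbalanced binary search tree and measures the height in edges.

```python
class BSTNode:
    """Node of a plain (unbalanced) binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key as a new leaf and return the root of the tree."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = BSTNode(key)
                return root
            node = node.right


def bst_height(root):
    """Height in edges, computed level by level (-1 for an empty tree)."""
    height = -1
    level = [root] if root is not None else []
    while level:
        height += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return height


# Sorted insertions make every new node the right child of the previous one.
root = None
for k in range(1, 101):
    root = bst_insert(root, k)
print(bst_height(root))  # 99: a chain of 100 nodes
```

With n = 100 keys the height is n - 1, which is the O(n) worst case noted above.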
For a single delete operation, what is the maximum number of transfer operations you will need to perform to resolve an underflow?
1
A search tree is a 2-4 tree if:
1. All leaves are at the same depth and contain 1, 2, or 3 keys.
2. An interior node either
   - contains one key and has 2 children (2-node),
   - contains two keys and has 3 children (3-node), or
   - contains three keys and has 4 children (4-node).
3. It has the following search properties:
   • ∀ 2-node v:
     - ∀ u ∈ left subtree of v, ∀ w ∈ right subtree of v:
       key(u) < key(v) < key(w)
   • ∀ 3-node v with keys K1 < K2:
     - ∀ u ∈ left subtree of v, ∀ x ∈ middle subtree of v, ∀ w ∈ right subtree of v:
       key(u) < K1 < key(x) < K2 < key(w)
   • ∀ 4-node v with keys K1 < K2 < K3:
     - ∀ u ∈ left subtree of v, ∀ x ∈ middle1 subtree of v, ∀ w ∈ middle2 subtree of v, ∀ z ∈ right subtree of v:
       key(u) < K1 < key(x) < K2 < key(w) < K3 < key(z)
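The three conditions can be checked mechanically. The following is a minimal sketch, assuming a node is represented by a sorted list of keys plus a list of children (empty for a leaf); the `Node` class and `is_24_tree` are hypothetical names introduced for illustration.

```python
class Node:
    """Hypothetical 2-4 tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def is_24_tree(root):
    """Return True if the tree rooted at root satisfies the three properties."""
    leaf_depths = set()

    def check(node, lo, hi, depth):
        k = node.keys
        # Every node holds 1, 2, or 3 keys in strictly increasing order.
        if not 1 <= len(k) <= 3:
            return False
        if any(k[i] >= k[i + 1] for i in range(len(k) - 1)):
            return False
        # Search property: all keys lie strictly between the inherited bounds.
        if lo is not None and k[0] <= lo:
            return False
        if hi is not None and k[-1] >= hi:
            return False
        if not node.children:
            leaf_depths.add(depth)
            return True
        # An interior node with j keys must have j + 1 children.
        if len(node.children) != len(k) + 1:
            return False
        bounds = [lo] + k + [hi]
        return all(check(c, bounds[i], bounds[i + 1], depth + 1)
                   for i, c in enumerate(node.children))

    # Property 1 also requires every leaf to be at the same depth.
    return check(root, None, None, 0) and len(leaf_depths) == 1


good = Node([10], [Node([3, 7]), Node([12, 15, 20])])
bad_order = Node([10], [Node([3, 11]), Node([12])])
print(is_24_tree(good), is_24_tree(bad_order))  # True False
```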
insert Operation
Assume key k is not in the tree
• Search for k until you reach a leaf w
• Add k to leaf w; this may cause "overflow"
  - Overflow: a node has more than 3 keys!
  - You MUST resolve overflows!
Height of a 2-4 Tree (cont.)
Consider 2-4 trees of height h with the fewest and the most keys
• 2-4 tree of height h with the fewest keys:
  - All nodes have one key, so the tree is a perfect binary tree
  - n = # keys = # nodes
• 2-4 tree of height h with the most keys:
  - All nodes have three keys, so all internal nodes have four children
  - n = # keys = 3 × # nodes
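The two extremes can be turned into closed-form counts. The sketch below assumes that height is measured in edges, so a single-node tree has height 0; `min_keys` and `max_keys` are illustrative names. A perfect binary tree of height h has 2^(h+1) - 1 nodes, and a perfect 4-ary tree has (4^(h+1) - 1)/3 nodes with three keys each.

```python
def min_keys(h):
    """Fewest keys: perfect binary tree, one key per node."""
    return 2 ** (h + 1) - 1


def max_keys(h):
    """Most keys: perfect 4-ary tree, three keys per node.

    Nodes = (4**(h+1) - 1) / 3, so keys = 3 * nodes = 4**(h+1) - 1.
    """
    return 4 ** (h + 1) - 1


for h in range(4):
    print(h, min_keys(h), max_keys(h))
# 0 1 3
# 1 3 15
# 2 7 63
# 3 15 255
```

Inverting the two counts gives a height between roughly log_4(n) and log_2(n), which is where the O(log n) bound comes from.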
OVERFLOW
If a 4-node (three keys) ever becomes a 5-node (four keys), you have an overflow. To resolve an overflow, perform the split operation.
Fusion
Sibling of the empty node is a 2-node
• Merge v with its sibling u to form a new node v'
• Move a key from parent w to the merged node v'
After a fusion, the underflow may propagate to parent w
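A fusion step might look like the sketch below. It assumes a hypothetical `Node` with `keys` and `children` lists, and it prefers the left sibling when one exists; that preference is an arbitrary choice made for the example, not something the notes specify.

```python
class Node:
    """Hypothetical 2-4 tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def fuse(w, i):
    """Fuse the empty child v = w.children[i] with an adjacent 2-node sibling.

    Returns True if parent w is left with no keys, meaning the underflow
    has propagated to w and must be resolved one level up.
    """
    # j indexes the left node of the pair and the parent key separating them.
    j = i - 1 if i > 0 else i
    left, right = w.children[j], w.children[j + 1]
    merged = Node(left.keys + [w.keys[j]] + right.keys,
                  left.children + right.children)
    del w.keys[j]                    # the separating key moved down into v'
    w.children[j:j + 2] = [merged]   # v' replaces both v and u
    return len(w.keys) == 0


w = Node([10, 20], [Node([5]), Node([]), Node([25])])
print(fuse(w, 1), w.keys, w.children[0].keys)  # False [20] [5, 10]
```

When w is itself a 2-node, the call returns True, which is the propagation case mentioned above.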
Transfer
Sibling u of the empty node v is a 3-node or 4-node
• Move a key from parent w of v to v
• Move a key from u to parent w
• Move a child of u to v to preserve the search property
After a transfer, no additional underflows occur
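The three moves can be sketched as follows, again assuming a hypothetical `Node` with `keys` and `children` lists. The function tries the left sibling first and then the right one; that order is an assumption of the example.

```python
class Node:
    """Hypothetical 2-4 tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def transfer(w, i):
    """Refill the empty child v = w.children[i] from an adjacent 3- or 4-node.

    Returns True if a transfer was performed, False if no adjacent sibling
    has a key to spare (in which case a fusion is needed instead).
    """
    v = w.children[i]
    if i > 0 and len(w.children[i - 1].keys) >= 2:
        u = w.children[i - 1]                 # left sibling
        v.keys.insert(0, w.keys[i - 1])       # parent key moves down to v
        w.keys[i - 1] = u.keys.pop()          # u's largest key moves up to w
        if u.children:                        # u's last child moves over to v
            v.children.insert(0, u.children.pop())
        return True
    if i + 1 < len(w.children) and len(w.children[i + 1].keys) >= 2:
        u = w.children[i + 1]                 # right sibling
        v.keys.append(w.keys[i])              # parent key moves down to v
        w.keys[i] = u.keys.pop(0)             # u's smallest key moves up to w
        if u.children:                        # u's first child moves over to v
            v.children.append(u.children.pop(0))
        return True
    return False


w = Node([10], [Node([]), Node([12, 15])])
print(transfer(w, 0), w.keys, [c.keys for c in w.children])
# True [12] [[10], [15]]
```

The parent keeps the same number of keys, which is why a transfer never causes a further underflow.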
lookUp Operation
Similar to search in a binary search tree
• Searching for key k traces a downward path starting at the root
• Comparing k to the keys at the current node determines the next node to visit
• If we reach a leaf and do not find k, the key is not in the dictionary
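A minimal sketch of this search is shown below, assuming a hypothetical `Node` with sorted `keys` and a `children` list. It uses `bisect_left` from the standard library to pick the child to descend into.

```python
from bisect import bisect_left


class Node:
    """Hypothetical 2-4 tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def look_up(root, k):
    """Return True if key k is stored somewhere in the tree."""
    node = root
    while node is not None:
        # i = number of keys in this node that are smaller than k.
        i = bisect_left(node.keys, k)
        if i < len(node.keys) and node.keys[i] == k:
            return True
        if not node.children:       # reached a leaf without finding k
            return False
        node = node.children[i]     # descend into the i-th subtree
    return False


root = Node([10, 20], [Node([3, 7]), Node([12, 15]), Node([25, 30, 40])])
print(look_up(root, 15), look_up(root, 13))  # True False
```

The loop visits one node per level, so the work is proportional to the height of the tree.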
For a single delete operation, what is the maximum number of fusion operations you will need to perform to resolve an underflow?
h
Running time of lookUp, insert, and delete in a 2-4 tree of height h
lookUp: O(h), insert: O(h), delete: O(h)
Tree height vs. least number of keys, and tree height vs. largest number of keys
• Least number of keys: h ≈ log_2(n)
• Largest number of keys: h ≈ log_4(n)
• In both cases: h = O(log n)
Therefore:
• LookUp, Insert, and Delete are all O(log n)
UNDERFLOW
• Consider key deletion at node v with parent w
• Underflow occurs if v becomes a 1-node (no key, at most one child)
• Two cases
  - A sibling u adjacent to v is a 3-node or 4-node: resolve the underflow with a transfer
  - The adjacent siblings of v are 2-nodes: resolve the underflow with a fusion
2-4 Trees
• Generalization of binary search trees: multiway search trees
• Size property: every node has at most four children
• Perfectly balanced: all leaves have the same depth
• Grow from the top of the tree
• Can contain any desired number of nodes
delete Operation
• If key k is at a leaf node, delete it
• Otherwise
  - Replace k with its inorder successor k'
    • k' is at a leaf w
  - Delete k' from w
• Deletion may cause underflow
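The "otherwise" branch can be sketched as below. It assumes a hypothetical `Node` with `keys` and `children` lists, and it covers only the replacement step: resolving any resulting underflow is left to the caller.

```python
class Node:
    """Hypothetical 2-4 tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def successor_leaf(node, i):
    """Return the leaf holding the inorder successor of node.keys[i].

    node must be an interior node: step into the subtree just right of
    the key, then keep following the leftmost child down to a leaf.
    """
    cur = node.children[i + 1]
    while cur.children:
        cur = cur.children[0]
    return cur


def replace_with_successor(node, i):
    """Overwrite node.keys[i] with its successor k' and delete k' from leaf w.

    Returns w so the caller can check it for underflow (w.keys may be empty).
    """
    w = successor_leaf(node, i)
    node.keys[i] = w.keys.pop(0)
    return w


root = Node([10], [Node([5]), Node([12, 15])])
w = replace_with_successor(root, 0)
print(root.keys, w.keys)  # [12] [15]
```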
B-Tree Application
• Large-scale applications: the tree must be stored on disk
• Accessing an external storage device
  - Orders of magnitude more expensive than accessing RAM
  - Block transfer mode: an entire block is accessed in one operation
  - Once a block is read into RAM, operations on it are inexpensive
• Objectives
  - Minimize tree height to minimize the number of block transfers
  - Avoid accessing the storage device for info on a single tree node
B-Trees
• Map entries are stored exclusively in leaves
• Keys at interior nodes:
  - Appear without their associated data
  - May not even be part of the map
  - Only provide an index guiding the search to the appropriate leaf
• Applications: dictionaries stored in external memory (disk, CD-ROM)
• Special case of an (a, b) tree with b = 2a - 1
• Best known method for maintaining a dictionary in external memory
• Idea: select b such that b pointer fields (to the node's children) plus b - 1 keys all fit into a single disk block
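The sizing idea in the last bullet reduces to a small calculation. In the sketch below, the 4096-byte block and the 8-byte keys and pointers are assumed values chosen for illustration, and `max_branching` is a hypothetical helper.

```python
def max_branching(block_bytes, key_bytes, pointer_bytes):
    """Largest b such that b pointers plus b - 1 keys fit in one block.

    b * pointer_bytes + (b - 1) * key_bytes <= block_bytes
    rearranges to b <= (block_bytes + key_bytes) / (pointer_bytes + key_bytes).
    """
    return (block_bytes + key_bytes) // (pointer_bytes + key_bytes)


# Assumed sizes: 4096-byte disk block, 8-byte keys, 8-byte child pointers.
b = max_branching(4096, 8, 8)
print(b)                       # 256
print(b * 8 + (b - 1) * 8)     # 4088 bytes used, within the 4096-byte block
```

A branching factor in the hundreds keeps the tree very shallow, which is what minimizes the number of block transfers per operation.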
(a, b) Trees
• Multi-way search trees: a generalization of 2-4 trees
• a and b are such that 2 ≤ a ≤ (b + 1)/2
• Definition: a multi-way search tree with two properties:
  - Size property: each internal node has c children, with a ≤ c ≤ b
  - Depth property: all leaves have the same depth
• Lookup, insert, and delete operations are straightforward generalizations of the corresponding 2-4 tree operations
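As a quick check of the parameter constraint, the sketch below tests a few (a, b) pairs; `valid_ab` is an illustrative name. A 2-4 tree corresponds to a = 2, b = 4.

```python
def valid_ab(a, b):
    """True if 2 <= a <= (b + 1) / 2, written with integers as 2a <= b + 1."""
    return a >= 2 and 2 * a <= b + 1


print(valid_ab(2, 4))   # True: the 2-4 tree
print(valid_ab(2, 3))   # True: a 2-3 tree, where b = 2a - 1
print(valid_ab(3, 4))   # False: 3 > (4 + 1) / 2
```

The upper bound on a is what makes a split possible: an overflowing node with b + 1 children can be divided into two nodes that each have at least a children.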
Height of a 2-4 Tree
• Prove by induction that a 2-4 tree of height h has at least 2^h nodes.
• Conclude that 2^h ≤ n (number of nodes), and therefore that h ≤ log_2 n
Split
• Replace w by two nodes
  - New 3-node w' has children c1, c2, c3 and stores keys K1 and K2
  - New 2-node w'' has children c4, c5 and stores key K4
• Pass key K3 to w's parent
  - Recursively split the parent if necessary
  - If w is the root, create a new 2-node root with key K3
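Putting the insert and split steps together gives something like the sketch below. It assumes a hypothetical `Node` with `keys` and `children` lists, and it remembers the search path instead of storing parent pointers; both are choices made for the example rather than anything prescribed by the notes.

```python
from bisect import bisect_left


class Node:
    """Hypothetical 2-4 tree node: sorted keys plus children (empty for a leaf)."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])


def insert(root, k):
    """Insert k (assumed not present) and return the possibly new root."""
    if root is None:
        return Node([k])
    # Search down to a leaf, remembering (parent, child index) pairs.
    path, node = [], root
    while node.children:
        i = bisect_left(node.keys, k)
        path.append((node, i))
        node = node.children[i]
    node.keys.insert(bisect_left(node.keys, k), k)

    # Resolve overflows bottom-up with the split operation.
    while len(node.keys) > 3:
        k1, k2, k3, k4 = node.keys
        w1 = Node([k1, k2], node.children[:3])   # w': keys K1, K2; c1..c3
        w2 = Node([k4], node.children[3:])       # w'': key K4; c4, c5
        if not path:
            return Node([k3], [w1, w2])          # w was the root: new 2-node root
        parent, i = path.pop()
        parent.keys.insert(i, k3)                # pass K3 up to w's parent
        parent.children[i:i + 1] = [w1, w2]
        node = parent                            # the parent may overflow in turn
    return root


def inorder(node):
    """Keys of the subtree in sorted order (used here only to check the result)."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


root = None
for key in [5, 1, 9, 3, 7, 2, 8, 4, 6, 10]:
    root = insert(root, key)
print(root.keys)       # [3, 5, 8]
print(inorder(root))   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Because a split only ever adds a level at the root, all leaves stay at the same depth.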