DBC Exam 2 CH4.4 Multi-level indexes
CONCEPT
1) in a dense index, inserts always generate new bottom-level index entries. In a sparse index, inserts generate new bottom-level index entries when table blocks split. 2) If the new index entry goes in a full index block, the block splits. Half of the rows move to the new block, creating space for the entry. 3) The new block in the bottom level generates a new index entry the next level up. If block in the next level up is full, the block splits and the process repeats. 4) If blocks are full at all index levels, the split propagates to the top level. In this case, the top-level block splits and a new level is created.
CONCEPT
A query searches for rows containing a specific value of an indexed column. Assuming query hit ratio is low: 1) A table scan reads at most 250,000 table blocks. 2) A single-level index scan reads at most 25,000 index blocks plus a few table blocks. 3) A binary search of a sorted, single-level index reads at most 24 index blocks plus a few table blocks. 4) A multi-level index search reads 3 index blocks plus a few table blocks.
B+tree
All indexed values appear in the bottom level. Pointers to table blocks appear only in the bottom level. Since some indexed values also appear in higher levels, values are occasionally repeated in the index.
CONCEPT
B-trees are more compact than B+trees since index values are not repeated. However, B+trees are simpler, since all pointers to table blocks appear in the same (bottom) level. The B+tree structure has two benefits: 1) The bottom level of a B+tree is a single-level index and can be scanned or searched. 2) In a B-tree, inserts, updates, and deletes may cause a table pointer to change levels, which is hard to implement. B+trees do not have this problem, since table pointers are always in the bottom level.
Imbalanced
BRANCHES are different lengths. Undesirable, since processing time is unpredictable.
CONCEPT
Deletes may cause block mergers. Block mergers are the reverse of block splits and potentially eliminate the top level of the index. Consequently, deletes also maintain a balanced index.
Branch
Each path from the top-level block to a bottom-level block
T/F: Each column value appears at most once in a multi-level index.
False. A column value appears at most once in each level. However, a value may appear in several levels. Ex: In the animation above, 'ATL' appears in both levels.
T/F: The bottom of a multi-level index is always sparse
False. The bottom level of a multi-level index is like a single-level index. The bottom level is usually sparse when the table is sorted on the indexed column. The bottom level is dense when the table is not sorted on the indexed column.
B-tree
If an indexed value appears in a higher level, the value is not repeated at lower levels. Instead, a pointer to the corresponding table block appears in the higher level along with the value.
Balanced
Multi-level indexes are BALANCED when all BRANCHES are the same length
CONCEPT
Multi-level indexes are faster than single-level indexes on most queries.
CONCEPT
New levels are always added at the top of the hierarchy rather than the bottom of one branch. As a result, all branches are always the same length, and the index is always balanced.
CONCEPT
The balanced multi-level index described above is called a B+tree. B+tree structure is derived from an earlier approach called a B-tree.
CONCEPT
The multi-level index search reads one index block per level. Usually the top two levels are small and retained in memory. Since the index has three levels, the query reads just one index block from storage media. Because multi-level indexes are faster than single-level indexes on most queries, databases commonly use multi-level rather than single-level indexes.
Fan-out
The number of index entries per block is called the fan-out of a multi-level index. The number of levels in a multi-level index can be computed from fan-out, number of rows, and rows per block: For a dense index, number of levels = logfan-out (number of rows) For a sparse index, number of levels = logfan-out (number of rows / rows per block) In both cases, log is a fractional number and must be rounded up to the nearest integer. Both formulas assume minimal free space in the index. Dense indexes usually have four levels or less. Sparse indexes usually have three levels or less.
T/F: Levels above the bottom of a multi-level index are always sparse
True. Since each level above the bottom is an index to the next level down, and all index levels are sorted, levels above the bottom are always sparse.
T/F: In a sparse multi-level index, each table block pointer appears exactly once.
True. Table block pointers appear in the bottom level only. In a sparse multi-level index, each table block pointer appears exactly once in the bottom level.
CONCEPT
Updates to an indexed column behave like a delete of the initial value followed by an insert of a new value. Since updates are implemented as deletes and inserts, updates also leave the index balanced.
CONCEPT
multi-level indexes are usually implemented as B+trees.
Multi-level index
stores column values and row pointers in a hierarchy. The bottom level of the hierarchy is a sorted single-level index. The bottom level is sparse for primary and clustering indexes, or dense for secondary indexes. Each level above the bottom is a sparse sorted index to the level below. Since all levels above the bottom are sparse, levels rapidly become smaller. The top level always fits in one block. Top-level block is read first. Bottom-level block contains a pointer to the correct table block.