Assessment 11

Ace your homework & exams now with Quizwiz!

1. Why can a table (file) maintain at most one primary index? 2. Why do we build and use an index file when searching a table (file) on its ordering field? Hint: Blocking factor of the index and data files. 3. Why must a secondary index be a dense index?

1. A primary index is applied to the table's ordering field. A table (file) can have only a single physical ordering so a table can have only a single primary index. 2. The efficiency of most database operations depends on minimizing the number of blocks that must be retrieved from the disk. The blocking factor of an index file is (almost always) greater than the BF of the data file. So we can locate the desired field with fewer block retrievals when we search the file's index rather than directly retrieve and search the records in the data file. 3. A secondary index (by definition) is created against a non-ordering field. This means that the record ordering defined by the secondary index is not reflected in order of the data file blocks. That is, two adjacent records on the secondary index are seldom found on the same blocks. So a dense index is needed to identify the position of each record in the correct order. That is, an index that contains one entry for every data file record.

1. What are the differences among primary, clustering, and secondary indexes? 2. Which of the indexes are dense, and which are not?

1. A primary index must be defined on an ordered key field. A clustered index must be defined on an order field (not keyed) allowing for ranges of records with identical index field values. A secondary index is defined on any non-ordered (keyed or non-key) field. 2. Primary and clustered indexes can be sparse (or dense). The secondary index must be dense.

What are the two strategies given for searching for multiple fields in the same table (file) when one or both fields are indexed? Hint: Three strategies are given in the book, but the first two are identical.

1. If only one of the two fields is indexed, search for the records that match the indexed field value, and search the identified records for the non-indexed field value. 2. If both fields are indexed, then search the intersection the two sets of blocks identified by the two fields for the desired record(s).

1. How does a B-Tree differ from a B^+-Tree? 2. Why is a B^+-Tree usually preferred as an access structure for a data file? 3. What is the relationship between number of pointers and keys in the internal nodes?

1. The B-Tree maintains both node pointers, and <key-values & record (block) pointers> in the tree nodes. There is no difference between internal and leaf nodes. The B^+-Tree maintains node pointers and key values in the internal nodes. Leaf nodes are reserved for <key, record-pointers> 2. The B^+-Tree leaf nodes maintain the record pointers in a linked list of blocks that can transversed (visited) when accessing the table's records in the order defined by indexed key field. 3. Each tree node maintains p pointer entries and p-1 key entries.

1. Provide a short description of the "Block Anchor Record". 2. What is the relationship between the anchor record and the other records on same block? 3. What is the relationship between the anchor record on block i and block i+1? 4. On what block might the record containing a searched-for field value be found?

1. This is the first record of a block. The record contains the starting indexing field value of the sequence of records stored on the block. 2. The anchor record's index field value should be ≤ all other field values of other records on the same block. 3. The Anchor Record (AR) index field value of block i should be < the Anchor Record value of block i+1. 4. When searching for a specific index field value (VAL), the record will be located on block i where ARi ≤ VAL < ARi+1

Describe the similarity between the data structures used by B-Tree indexes and the dynamic hashing directory (Section 17.8.3).

An index tree is similar to dynamic hashing. Both data structures are trees and are searched in Logn operations to find a pointer to the block containing the desired record. A difference is that an index search uses the search value to transverse the tree, whereas the hash directory search uses the binary hash value that is generated by applying the hash function to the search value.

Repeat the calculation given in the slide "Efficiency of Indexes as an Access Path" for the number of block reads for each of the following but using a block size of 4096 (10^12). 1. The number of block reads to search an ordered file without an index. 2. The number of block reads to search an indexed file. 3. The number of block reads to search an unordered, non-indexed file.

Blocking Factor Bfr = (B div R) = (4096 div 150) = 27 records/block. Number of data file blocks b = (r/Bfr) = (30000/27) = 1111 blocks. An unchanged index entry size RI = (VSSN+ PR) = (9+7) = 16 bytes. Index Blocking Factor BfrI= (B div RI) = (4096 div 16) = 256 index entries / block. Number of Index Blocks bl = (r/ BfrI) = (30000/256) = 117 blocks 1. Non-Indexed binary search of an ordered data file needs Log2 1111 = 11 block reads. 2. Index Binary Search needs Log2bI = Log2 117 + 1 = 8 block reads. Note the +1 read is to retrieve the data file block containing the record. 3. An unordered, non-Indexed file needs on average 1111/2 block reads (assuming the key is always found).

Define Fan Out and how fan out size effects the efficiency of searching in a multi-level index.

Fan out is the maximum number of child nodes reachable by a non-leaf node in a search tree. It defines the number of (block) pointers maintained in a tree node. It defines the number of nodes (blocks) that must be visited (retrieved) before the searched-for value is found or not found. The number is LogfoN when the tree is balanced.

Assessment 11

Related study sets

POLS- the Bureaucracy

Engine Performance

Elsevier Adaptive Quizzing Endocrine Quiz

Credit and Debt

Psych final

Chp 8: El Nino & La Nina

Chapter 36 Calculation of Medication Dosage and Medication Administration

HESI practice ?s II (AH III)

Negotiations Key Points

Chapter 16: Education

MECH502

Lecture 19 (Final)

Big Data and data lakes (Ch.10)

Interpretation Book 1

Wrist

Sino Korean Numbers 1-10

W3Schools - HTML (Attributes, Paragraphs & Text Formatting)

coding terms and definitions

Ecology Practice MC questions

COP3530 Midterm