DB 11

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

1. Describe the primary difference between a dense and sparse index. 2. What is the advantage of the sparse over the dense index? 3. Can we apply a sparse index to an unordered data file? Explain your answer.

1. A dense index maintains an index entry for every record in the data file. A sparse entry maintains an index entry for every anchor record (every block) in the data file. 2. The sparse index requires fewer index entries to cover every record in the file and so disk block are needed to maintain an index. During a search, fewer disk read operations are needed to identify the data file block containing the target record resulting in faster and more efficient searches. 3. A sparse index cannot be utilized for an unordered data file. A sparse index relies on the fact that the index attribute value is ascending across data file blocks. That is, that the indexed attribute values on index block i are less than the values on index block i+1 (assuming ascending order). An unordered data file places records randomly across its blocks, and so each unique index value (data record) requires a pointer to the record's block i.e. a dense index.

1. What are the qualities of the attributes that can be used as a Primary Index on Unique Ordered Attribute as described in the slides? 2. Can a Primary Index on Unique Ordered Attribute, as described in the slides, be applied to both ordered and unordered data files? 3. What are the qualities of the attributes we would find in a clustered index? 4. Can we have two Primary Index on Unique Ordered Attribute, each on separate attributes, on the same data file? Explain your answer.

1. A primary index is defined on a unique ordered key field e.g. email. 2. No. Because a primary index indexes the anchor records of data file blocks and not individual data records (i.e. it is sparse), a primary index can only be applied to ordered data files. 3. A clustered index is defined on a non-unique order field allowing for indexes on records with identical index field values e.g. department number or zip-code. 4. No. Both of these indexes will required an ordered data file i.e. a data file that is physically ordered by its ordering-attribute(s). So we cannot have two indexes that rely on an ordering of different attributes on the same file i.e. we cannot have two physical orderings in a single file. This answer applies to two primary, two clustered, or a primary and clustered index on the same data file.

1. Define "Blocking Factor" of both Index and Data Files. 2. In terms of Blocking Factors, what is the advantage of using an index file to search for a specific record in a data file? Hint: The Blocking Factor of the index file vs. the Blocking Factor of the data file.

1. Blocking Factor (BF) describes how many 'units' fit on a single drive block. The BF of an index file describes how many index entries will fit in a block. The BF of a data file describes how many records (rows) will fit on a block. 2. Increasing the performance of most database operations depends on minimizing the number of blocks that must be read or written from/to the disk. The blocking factor of an index file is almost always much less (better) than the BF of the data file. So when searching for a target record, we can locate the block containing the record with fewer block retrievals when we search the index rather than retrieving and searching the records in the data file. That is, the I/O efficiency of the index file is much higher than the data file.

What are the two strategies discussed in the text for searching for multiple attributes in the same table (file) when one or both attributes are indexed? For example, searching for customers with a given last-name and zip-code. Hint: Three strategies are given in the book, but the first two are identical

1. If only one of the two attributes is indexed (), search for the records that match the indexed attribute value, and linearly search the identified records for the non-indexed attribute value. For example, if only zip-code is indexed, 1) search for the given zip-code and 2) linearly search the zip-code result set for records with the given last-name. 2. If both attributes are individually indexed (both zip-code and last-name have separate indexes), then perform two searches and return the intersection of both result sets.

1. How does a B-Tree differ from a B+-Tree? 2. What type of select operation benefits most from a B+ tree? Briefly explain your answer. 3. Which type is used by most DBMS vendors?

1. The B-Tree maintains both node pointers, and <key-values & record (block) pointers> in the tree nodes. There is no difference between internal and leaf nodes. The B+-Tree maintains node pointers and key values in the internal nodes. Leaf nodes are reserved for <key, record-pointers> The B+-Tree maintains the leaf node blocks as a linked list so that indexed records can be transversed (visited) in index order in linear time. 2. A select operation that is identifies a range of values. For example a BETWEEN operation identifying a range of records between two indexed attribute values e.g. a date. This is because the leaf nodes of the B+ tree maintain pointers to the next block and so moving from one block / record to the next in an ascending or descending order is an O(1) operation. 3. Most DBMS vendors utilize the B+-Tree to implement their indexes.

1. Provide a short description of a block's "Anchor Record" in a single-level index. 2. What is the relationship between the anchor record and the other records on same block? 3. What is the relationship between the anchor record on block i and block i+1? 4. In terms of anchor record values, what block might the record containing a searched-for field value VAL be found?

1. This is the first record of a data block. The record contains the starting indexing field value of the sequence of records stored on the data file block. 2. The anchor record's index field value should be ≤ all other field values of other records on the same data block (assuming ascending order). 3. The Anchor Record (AR) index field value of block i should be < the Anchor Record value of block i+1. 4. When searching for a specific index field value (VAL), the record will be located on block i where ARi ≤ VAL < ARi+1

Describe the structure of a multi-level index

A multi-level index is a single-level index with additional layers that serve to index the index. Like the single-level index, the first level index entries are physically ordered according to the indexing attribute values of the data records being indexed. The multi-level index is structured as tree of disk blocks where each layer (level) in the tree indexes the complete range of data record index values. The first level (leaves of the index tree) reference the record address (block address) in the data file. The second level entries index the anchor records (first) entries of the first level index entry blocks. The third level entries index the anchor record entries of the second level index entry blocks. Etc.

Describe how a multi-level index improves the search performance when applied to a single-level secondary index.

Single-level index entries are physically ordered according to the indexed attribute value. Searching a single-level list is a binary search requires Log2N block reads where N is the number of blocks in the secondary index file. Each level in a multi-level index reduces the number of blocks to be searched by FO where FO is the number of index entries in the index block. So a search of a secondary index (non-ordered data file) of N blocks requires examining LogFON blocks.

Why must a secondary index be a dense index?

So a secondary index requires an index entry for every record in the data file. This is the definition of a dense index. Also know that a secondary index is less efficient that a primary index. A secondary index requires an index entry per record, and so a secondary index requires more index entries than the primary index. The secondary index requires more N-1 times more blocks then the primary index where N is the index file's blocking factor (number of index entries that fit into a block).


Set pelajaran terkait

Diagnostic Imaging Final Exam Review

View Set

Chapter 23: Occupational Safety and Health Administration (OSHA)

View Set

Chapter 45: Management of Patients with Oral and Esophageal Disorders

View Set

Ch. 4: The Balance Sheet and the Statement of Shareholders' Equity

View Set

Chapter 18 Review Questions Ovaries and Fallopian Tubes

View Set

Chapter 1: The First Civilizations

View Set