Data Resource Management Exam 1 Chapter 5
Range Control
A range control limits the set of permissible values a field may assume.
Segment
A table, index, or partition
When should you use denormalization?
Benefits: can improve performance (speed) by reducing the number of table lookups. Costs: wasted storage space; threats to data integrity and consistency. Common opportunities: a one-to-one relationship; a many-to-many relationship with non-key attributes; reference data (a 1:N relationship where the 1-side has data not used in any other relationship).
Extent
Contiguous section of disk space
What are integrity controls?
Default value, range control, null value control, and referential integrity.
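The four controls above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the department and employee tables, their columns, and the 0 to 80 hours range are hypothetical and not taken from the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,                                -- null value control
        status  TEXT NOT NULL DEFAULT 'active',               -- default value
        hours   INTEGER CHECK (hours BETWEEN 0 AND 80),       -- range control
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)  -- referential integrity
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee (emp_id, name, hours, dept_id) VALUES (1, 'Ann', 40, 1)")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


default_status = conn.execute("SELECT status FROM employee WHERE emp_id = 1").fetchone()[0]
out_of_range = rejected("INSERT INTO employee (emp_id, name, hours, dept_id) VALUES (2, 'Bob', 200, 1)")
null_name = rejected("INSERT INTO employee (emp_id, name, hours, dept_id) VALUES (3, NULL, 10, 1)")
bad_parent = rejected("INSERT INTO employee (emp_id, name, hours, dept_id) VALUES (4, 'Cy', 10, 99)")
```

In this sketch the first insert picks up the default status, and the three later inserts are each refused by a different control.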
Advantages of partitioning
Efficiency: records used together are grouped together. Local optimization: each partition can be optimized for performance. Security: data not relevant to users are segregated. Recovery and uptime: smaller files take less time to back up. Load balancing: partitions stored on different disks reduce contention.
Null value control
A null value is an empty value. Each primary key must have an integrity control that prohibits a null value.
Which type of file is easiest to update?
Hashed
Disadvantages of partitioning
Inconsistent access speed: retrievals across partitions are slow. Complexity: partitioning is not transparent. Extra space or update time: data may be duplicated, and access may span multiple partitions.
An index on columns from two or more tables that come from the same domain of values is called a...
Join index
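As a rough illustration of the idea, a join index can be thought of as a precomputed list of row-ID pairs whose indexed columns hold the same value. The customer and store tables, their row IDs, and the shared city column below are hypothetical.

```python
# Two tables keyed by row ID; each value is the city column,
# which draws on the same domain of values in both tables.
customers = {"C1": "Dayton", "C2": "Austin", "C3": "Dayton"}
stores = {"S1": "Dayton", "S2": "Boise"}

# The join index records which customer rows match which store rows,
# so the join does not have to be recomputed at query time.
join_index = [
    (cust_id, store_id)
    for cust_id, cust_city in customers.items()
    for store_id, store_city in stores.items()
    if cust_city == store_city
]
```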
Which type of file is most efficient with storage space?
Sequential
Handling missing data
Substitute an estimate of the missing value (using a formula). Construct a report listing missing values. In programs, ignore missing data unless the value is significant (sensitivity testing).
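The first two options can be sketched in a few lines. The readings data is made up, and using the mean of the known values as the estimating formula is an assumption; the chapter does not prescribe a particular formula.

```python
# Hypothetical records; None marks a missing value.
readings = [
    {"id": 1, "temp": 20.0},
    {"id": 2, "temp": None},
    {"id": 3, "temp": 24.0},
]

# Option 1: substitute an estimate (here, the mean of the known values).
known = [r["temp"] for r in readings if r["temp"] is not None]
estimate = sum(known) / len(known)
filled = [dict(r, temp=estimate if r["temp"] is None else r["temp"]) for r in readings]

# Option 2: construct a report listing the records with missing values.
missing_report = [r["id"] for r in readings if r["temp"] is None]
```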
What is a field?
The smallest unit of application data recognized by system software.
Rules of thumb for choosing indexes
Use indexes on larger tables. Index the primary key of each table. Index search fields (fields that appear frequently in WHERE clauses). Index fields used in SQL ORDER BY and GROUP BY commands. Index an attribute when it has more than 100 distinct values, but not when it has fewer than 30. Avoid indexes on fields with long values; perhaps compress the values first. If the index key is used to determine the location of a record, use a surrogate key to allow an even spread in the storage area. The DBMS may limit the number of indexes per table and the number of bytes per indexed field. Be careful when indexing attributes with null values; many DBMSs will not recognize null values in an index search.
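The "index search fields" rule can be tried out with Python's built-in sqlite3 module. This is a small sketch; the customer table, its data, and the index name idx_customer_city are hypothetical, and a three-row table is far too small to benefit from an index in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key is indexed by the DBMS automatically.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Ann", "Dayton"), ("Bob", "Austin"), ("Cy", "Dayton")],
)

# city appears in the WHERE clause below, so it is a candidate search field.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

query = "SELECT name FROM customer WHERE city = ?"
# EXPLAIN QUERY PLAN reports how SQLite intends to run the query;
# the last column of each row is a human-readable description.
plan_details = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("Dayton",))]
names = sorted(row[0] for row in conn.execute(query, ("Dayton",)))
```

Printing plan_details shows whether the query planner chose the index for this search.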
Hash algorithm
Usually uses the division-remainder method to determine record position. Records that hash to the same position are grouped in lists.
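A minimal sketch of division-remainder hashing with colliding records chained in lists. The bucket count of 7 and the sample keys are arbitrary choices for illustration.

```python
NUM_BUCKETS = 7  # hypothetical number of storage positions

# One list per position; records that collide share a list.
buckets = [[] for _ in range(NUM_BUCKETS)]


def bucket_for(key: int) -> int:
    """Division-remainder hash: the remainder of key / NUM_BUCKETS is the position."""
    return key % NUM_BUCKETS


def store(key: int, record: str) -> None:
    buckets[bucket_for(key)].append((key, record))


def fetch(key: int):
    """Scan only the list at the key's position; return None if absent."""
    for stored_key, record in buckets[bucket_for(key)]:
        if stored_key == key:
            return record
    return None


store(10, "Ann")  # 10 % 7 == 3
store(17, "Bob")  # 17 % 7 == 3, same position as key 10
store(5, "Cy")    # 5 % 7 == 5
```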
All of the following are common denormalization opportunities EXCEPT
A one-to-many relationship. (The common opportunities are a many-to-many relationship with nonkey attributes, reference data, and two entities with a one-to-one relationship.)
A method to allow adjacent secondary memory space to contain rows from several tables is called
clustering
Goal of a physical database design
create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability
The storage format for each attribute from the logical data model is chosen to maximize ______________ and minimize storage space.
data integrity
Another form of denormalization where the same data are stored in multiple places in the database is called
data replication
The value a field will assume unless the user enters an explicit value for an instance of that field is called a
default value
An advantage of partitioning is
efficiency
A contiguous section of disk storage space is called
extent
A disadvantage of partitioning is
extra space and update time
A file organization that uses hashing to map a key into a location in an index where there is a pointer to the actual data record matching the hash key is called a
hash index table
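The difference from plain hashing is that the hash leads to an index entry holding a pointer, and the data records themselves can be stored in any order. A small sketch, assuming a list stands in for the data file, a list position stands in for the pointer, and 5 index slots are used:

```python
INDEX_SLOTS = 5  # hypothetical size of the hash index

data_file = []  # data records, stored in arrival order
hash_index = [[] for _ in range(INDEX_SLOTS)]  # each slot holds (key, pointer) pairs


def insert(key: int, record: str) -> None:
    data_file.append(record)
    pointer = len(data_file) - 1  # position of the record in the data file
    hash_index[key % INDEX_SLOTS].append((key, pointer))


def lookup(key: int):
    """Hash the key to an index slot, then follow the pointer to the record."""
    for stored_key, pointer in hash_index[key % INDEX_SLOTS]:
        if stored_key == key:
            return data_file[pointer]
    return None


insert(42, "Ann")
insert(7, "Bob")
insert(12, "Cy")  # 12 % 5 == 2, the same slot as key 42 and key 7
```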
Distributing the rows of data into separate files is called
horizontal partitioning
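A small sketch of the idea, with in-memory lists standing in for the separate files. The customer rows and the choice of region as the partitioning column are hypothetical.

```python
from collections import defaultdict

rows = [
    {"id": 1, "name": "Ann", "region": "East"},
    {"id": 2, "name": "Bob", "region": "West"},
    {"id": 3, "name": "Cy", "region": "East"},
]


def partition_rows(rows, column):
    """Distribute whole rows into separate partitions by the value of one column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)


partitions = partition_rows(rows, "region")
```

Every partition keeps all columns of its rows; only the set of rows differs from one partition to the next.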
Which of the following is an objective of selecting a data type?
improve data integrity
A requirement to begin designing physical files and databases is
normalized relations, technology descriptions, definitions of each attribute
A ____ is a field of data used to locate a related field or record
pointer
An integrity control supported by a DBMS is
range control
Data block
smallest unit of storage
Clustering
Related rows are stored together in the same disk area. Useful for improving the performance of join operations. Primary key records of the main table are stored adjacent to the associated foreign key records of the dependent table.
Default value
The value a field will assume unless a user enters an explicit value for an instance of that field. Assigning a default value to a field can reduce data entry time because entry of a value can be skipped. It can also help to reduce data entry errors for the most common value.
Database access frequencies are estimated from...
transaction volumes
A rule of thumb for choosing indexes is to...
Use an index when there is variety in attribute values; index the primary key of each table; be careful when indexing attributes that may be null.