1. Physical Database Design
Physical File
A named portion of secondary memory allocated for the purpose of storing physical records.
File Organizations
A technique for physically arranging the records of a file on secondary storage devices.
Insertion Anomalies
Allow duplicate data or create duplicate fields (should never have two records the exact same).
Decisions into the Physical Design Process
Attribute data types; Physical record designs (doesn't always match logical design); File organizations; Indexes and database architectures; Hardware configuration options.
Example of an Update Anomalie
Changing a foreign key to something that does not exist.
Update Anomalies
Changing a record incorrectly. Changing one field and other similar field did not update.
Creating Well Structured Tables
Contain minimal redundancy. Allows users to insert, modify, and delete rows in the table without errors or inconsistencies.
Tablespace
Disk storage in which database tables are stored
Vertical Partitioning
Distributing the columns of a table into two or more separate files.
Horizontal Partitioning
Distributing the rows of a table into two or more separate files. Range, List, Hash partitioning.
Example of Vertical Partitioning
Employee table can be split according to data sensitivity.
Example of Horizontal Partitioning
Employee table can be split according to employee type.
Physical Design Archetecture
End user (Authorization & Authentication Control Matrix); Front End Menu Hierarchy; Middleware; Back End Database (data storage).
Second Normal Form
Every non-key attribute is fully, functionally dependent on the primary key. This means that every non-key attribute must be determined by the entire primary key, not only part of the primary key.
Goals of File Organizations
Fast data retrieval; High throughput for input and maintenance; Efficient use of storage space; Protection for failures or data loss; Minimal need for reorganization; Accommodation for growth; Security from unauthorized use.
Types of Partitioning
Horizontal; Vertical
Cons of Sequential File Organizations
Horribly inefficient if randomly reading files (transaction based); Inserting files requires rewriting files; Deleting requires reorganization.
File Organization
How are they accessed?
Size Estimates
How large is each table?
Data Volume
How many records in entity?
Volume Estimates
How much data do I need in my database? How much traffic is going to be going through my database?
Disk Storage Requirment
How much storage needed?
Memory/Processor Requirment
How often accessed?
Access Method
How the data can be retrieved based on the file organizations.
Organization of Files
How the files are arranged on the disk
Conceptual Modeling
How well you can model the reality of the situation
Well Structured tables avoid...
Insertion Anomalies; Update Anomalies; Deletion Anomalies
Cons of Denormalization
Introduces anomalies and redundancies which will necessitate the need for more data maintenance.
Cons of Indexed File Organizations
May waste some space in index; Deleting/Inserting records requires maintenance to the index.
0 Normal Form
Multi-valued attributes are present in the data. Also there is no clear identifier within the dataset.
Pros of Indexed File Organizations
No wasted space for records; Moderately fast for sequential and random access.
Inputs into the Physical Design Process
Normalization relations; Volume Estimates; Attribute definitions; Response time expectations; Data security needs; Backup/recovery needs; Integrity expectations; DBMS tech used.
Database Design Decisions
Normalization; Denormalization; Partitioning; File Organization/Indexing
Partial Functional Dependency
Not all fields in the table are dependent on one primary key.
Security
Only certain employees are given access to certain parts of the database, need to set up measures that keep them from sensitive data.
Indexed File Organizations
Records may or may not be stored in any specific order, but a legend is created to map where specific records or groups of records can be found.
Sequential File Organization
Records of the file are stored in order by the primary key field values.
Pros of Partitioning
Records used together are group together; Each section can be optimized for performance; Security and recovery; Can be stored on different disks (less contention); Parallel processing capability.
Pros of Denormalization
Reduced disk access and greater performance; Makes writing SQL statements much simpler.
Steps to Normalization
Remove multivalued attributes; remove partial dependencies; remove transitive dependencies
First Normal Form
Removed multi-valued attributes and there is one value per cell, but need to separate fields that have nothing to do with the other fields.
Deletion Anomalies
Removing a record that removes another record you did not want to remove.
Cons of Partitioning
Slower retrievals when across different sections; Complexity for application programmers; Anomalies and extra storage space requirements due to the duplication of data across the different sections.
Example of Functional Dependcies
Social Security Number, UT Student ID, VIN # for your car, etc.
Denormalization
Transform normalized relations into non-normalized physical record specifications
Composite Usage Maps
Use of tables to capture table sizes and access rates/expectations. Requirements may also be recorded and displayed using an annotated ERD.
Pros of Sequential File Organizations
Very fast and efficient if reading almost every file with few or no changes.
Transitive Dependancey
When non-key attribute is functionally dependent on another non-key attribute.
Access Paths/Security
Which tables are accessed?
Goal of Physical Database Design
create a design for storing data that will provide performance, data integrity, data security, data recoverability, and simplify application development.
For relational models, the order of the fields...
do not matter.
Normalization
the process of grouping attributes together into tables. This validates and improves logical database design to satisfy certain constraints and avoid duplication of data. Decomposition of tables to remove anomalies and create 2 or more well-structured tables.
Functional Dependcies
the value of one attribute determines the other attribute.
Purpose of Physical Database Design
translate the logical description into technical specifications for storing and retrieving data.
To move from First Normal Form to Second Normal Form...
we need to know which attributes are functionally dependent to each other.