Chapter 6 - Normalization of Database Tables

Ace your homework & exams now with Quizwiz!

Normalization's objective is to have tables with the following characteristics:

*Each relation (table) represents a single subject.* ^^For example, a COURSE table will contain only data that directly pertain to courses. Similarly, a STUDENT table will contain only student data. *Each row/column intersection contains only one value and not a group of values.* *No data item will be unnecessarily stored in more than one table (tables have minimum controlled redundancy).* ^^The reason for this requirement is to ensure that the data is updated in only one place. *All nonprime attributes in a relation (table) are dependent on the primary key—the entire primary key and nothing but the primary key.* ^^The reason for this requirement is to ensure that the data is uniquely identifiable by a primary key value. *Each relation (table) has no insertion, update, or deletion anomalies, which ensures the integrity and consistency of the data.*

Steps to convert to 1NF

1. Eliminate the repeating groups 2. Identify the primary key 3. Identify all dependencies

Issues to Address to ensure high-quality Normalization (Post-normalization)

1. Evaluate PK Assignments 2. Evaluate Naming Conventions 3. Refine Attribute Atomicity 4. Identify New Attributes 5. Identify New Relationships 6. Refine Primary Keys as Required for Data Granularity 7. Maintain Historical Accuracy 8. Evaluate Using Derived Attributes

Steps to convert to 2NF

1. Make New Tables to Eliminate Partial Dependencies 2. Reassign Corresponding Dependent Attributes

Steps to convert to 3NF

1. Make New Tables to Eliminate Transitive Dependencies 2. Reassign Corresponding Dependent Attributes

Transitive Dependency

A condition in which an attribute is dependent on another attribute that is not part of the primary key. EX: If X --> Y, Y --> Z, and X is the primary key. In that case, the dependency X --> Z is a transitive dependency because X determines the value of Z via Y Relatively difficult to spot/identify An effective way of finding a transitive dependency is by seeing if any attributes are dependent on non-prime attributes (attributes that are not assigned as any type of key). For example, Y in the previous example is a non-prime attribute, so finding Y-->Z indicates that there is a transitive dependency, which in this case, is X-->Z

Partial Dependency

A condition in which an attribute is dependent on only a portion (subset) of the primary key. EX: If (A, B) --> (C, D), B --> C, and (A, B) is the primary key, then B --> C is only a partial dependency because only part of the primary key (B) is needed to determine the value of C. Relatively easy to spot/identify

*Denormalization*

A process by which a table is changed from a higher-level normal form to a lower-level normal form, usually to increase processing speed. Denormalization potentially yields data anomalies.

*Normalization*

A process that assigns attributes to entities so that data redundancies are reduced or eliminated. Works through a series of stages: *1NF, 2NF, and 3NF.* Other stages includterm-17e BCNF & 4NF. As you increase through the levels of normalization: - Data redundancy is reduced - More relational join operations needed to be made, resulting in more resources being required. If too many resources are being used, then you will need to denormalize (go down a level or more) to meet performance requirements. In essence, normalization identifies which attributes determine other attributes (like functional dependencies).

Dependency Diagram

A representation of all data dependencies (primary key, partial, or transitive) within a table.

Simple and Composite Keys (Re-Review from Chapter 2)

A simple key is when only one attribute is a primary key Composite keys are when there are multiple primary keys

Boyce-Codd normal form (BCNF)

A special type of third normal form (3NF) in which every determinant is a candidate key. A table in BCNF must be in 3NF. *Characteristic:* *Every determinant needs to become a candidate key (special case of 3NF)*

Third Normal Form (3NF)

A table is in 3NF when it is in 2NF and no nonkey attribute is functionally dependent on another nonkey attribute; that is, it cannot include transitive dependencies. Usually the highest a normal business database design will need to go. *To convert a table from 2NF to 3NF, you need to remove the transitive dependencies.* *In the end, you will basically create new tables with one-to-many relationships.* Compare this picture with the picture in the 2NF card to see the changes from 2NF to 3NF

Fourth Normal Form (4NF)

A table is in 4NF if it is in 3NF and contains no multiple independent sets of multivalued dependencies. To convert a table from 3NF to 4NF, you need to remove independent multivalued dependencies.

Learning Objectives

After completing this chapter, you will be able to: Explain normalization and its role in the database design process Identify and describe each of the normal forms: 1NF, 2NF, 3NF, BCNF, and 4NF Explain how normal forms can be transformed from lower normal forms to higher normal forms Apply normalization rules to evaluate and correct table structures Identify situations that require denormalization to generate information efficiently Use a data-modeling checklist to check that the ERD meets a set of minimum requirements

Steps to convert to 3NF Step 2: Reassign Corresponding Dependent Attributes

After creating the new table(s), move the dependent attributes to the new table. EX: If hourly_pay was dependent on job_title, and job_title was moved to its own table, then remove hourly_pay from the original table completely and move it to the new table.

Steps to convert to 1NF Step 2: Identify the Primary Key

After eliminating repeating groups, look for attributes that could be a primary key. Multiple primary keys can be assigned during this time if it makes searching.

Steps to convert to 2NF Step 2: Reassign Corresponding Dependent Attributes

After making however many new tables, take all partially dependent attributes with the new sole primary key. If attributes A and B both have attributes that are partial dependencies, as well as attributes that are fully functional dependent, then three tables would be made: one table for all partial dependencies of A, one table for all partial dependencies of B, and one table for all fully functional dependencies of (A,B)

Atomic Attribute

An attribute that cannot be further subdivided to produce meaningful components. For example, a person's last name attribute cannot be meaningfully subdivided. Refer to "Refine Attribute Atomicity" for more context

*Nonprime Attribute*

An attribute that is not part of a key or defined as a key. aka Nonkey Attribute

*Prime Attribute*

An attribute that is part of or defined a key (or is just the whole key).

Issues to Address to ensure high-quality Normalization. #4: Identify New Attributes

Analyze a table to see if additional attributes can be added, or need to be added based on the rules/requirements of the business.

Issues to Address to ensure high-quality Normalization. #5: Identify New Relationships

Analyze all tables to see if more relationships can be made.

Issues to Address to ensure high-quality Normalization. #6: Refine Primary Keys as Required for Data Granularity

Analyze attributes to specify the granularity. EX: For assign_hours, does it mean hours per day? Hours per week? Hours per month? These need to be specified. The level of desired granularity would normally be determined during the conceptual design phase

Determinant

Any attribute in a specific row whose value directly determines other values in that row.

Issues to Address to ensure high-quality Normalization. #2: Evaluate Naming Conventions

Assign names to attributes that can be easily understood simply by their name. EX: job_class is not a good attribute to name for storing job titles. Instead, job_description or job_title is better.

Issues to Address to ensure high-quality Normalization. #7: Maintain Historical Accuracy

Attributes can have values change over time, so that needs to be accounted for

Issues to Address to ensure high-quality Normalization. #1: Evaluate PK Assignments

Create a primary key that can lead to a possible violation of data integrity. EX: Instead of having an attribute for job title in a table, create a new table with an integer primary key, and assign a pre-set job title to an attribute dependent on that primary key. This way, multiple users won't run the risk of entering a job title's name incorrectly, and will simply have the choice to select what job title they are assigning to the employee.

Issues to Address to ensure high-quality Normalization. #3: Refine Attribute Atomicity

Ensure all attributes meet the atomicity requirement. (Basically, make sure attributes cannot be broken down further). EX: emp_name does not meet the atomicity requirement because you can break this down to emp_first_name and emp_last_name. On the other hand, emp_job_title cannot be broken down further. Breaking down attributes as much as possible can make generating lists or reports faster.

Repeating Group

Happens when a group of multiple entries of the same or multiple types can exist for any single key attribute occurrence Ex: All values for the attribute "customer_id" are combined into one record with a single "emp_id". Instead of every single customer_id showing up as its own individual record, they are all clumped together in one record with a emp_id primary key.

*Important topics to understand before continuing with Normalization*

Identifying business rules (page 38) Identifying and defining business and data constraints (page 37) Defining functional dependencies (page 72) Identifying entities and relationships (all of chapter 4) Eliminating multivalued attributes (page 117)

Steps to convert to 2NF Step 1: Make New Tables to Eliminate Partial Dependencies

If a partial dependency exists, take the primary key that is determinant and make a new table with that primary key.

Steps to convert to 1NF Step 1: Eliminate the Repeating Groups

If a repeating group is found, instantly split it into many records.

Issues to Address to ensure high-quality Normalization. #8: Evaluate Using Derived Attributes

If you can easily create derived attributes instead of making new tables, then do that.

Steps to convert to 3NF Step 1: Make New Tables to Eliminate Transitive Dependencies

Look for transitive dependencies, take out the nonprime attribute, and create a new table with that attribute as the primary key. Then, bring that attribute back to the original table as a foreign key (with a relationship).

Atomicity

Not being able to be divided into smaller units. Refer to "Refine Attribute Atomicity" for more context

Multivalued dependency

Occurs when one key determines multiple values of two other attributes and those attributes are independent of each other.

Normal Forms

Purpose: To reduce data redundancy as you go higher through the levels (or increase as you go down). *First Normal Form (1NF)* *Second Normal Form (2NF)* *Third Normal Form (3NF)* Boyce-Cood Normal Form (BCNF) Fourth Normal Form (4NF) Normal Forms past 4NF, such as 5NF or domain-key normal form (DKNF) are purely theoretical and don't add much value in eliminating data redundancy. In fact, these levels only slow performance. Only very specific and specialized apps might need to go beyond 4NF, but those would go outside the scope of business operations.

Unnormalized Data

Raw data in its original state; it might contain redundant data, multivalued data, and/or other data anomalies not found on normalized data relations.

Steps to convert to 1NF Step 3: Identify all Dependencies

Starting with the primary key(s), analyze all attributes to see what the dependency paths look like. Use a dependency diagram to speed this process up and to help make sure nothing is missed.

Functional Dependence (Re-review from Chapter 3)

The attribute B is fully functionally dependent on the attribute A if each value of A determines one and only one value of B. In other words: *A determines B (A-->B)* which means *B is dependent on A* *^^Understand this before continuing* Think of 'A' as the primary key attribute of a table, and B as a simple attribute in the same table

Key Attributes

The attributes that form a primary key.

First Normal Form (1NF)

The first stage in the normalization process. The term first normal form (1NF) describes the tabular format that conforms to the definition of a relational table in which: - All of the key attributes are defined. - There are no repeating groups in the table. In other words, each row/column intersection contains one and only one value, not a set of values. - All attributes are dependent on the primary key. *To start normalization, you need to convert your current table structure to 1NF.* You can do this by *removing repeating groups, identifying primary keys, and identifying dependencies*. Compare the picture with the picture in the 2NF card to see the changes from 1NF to 2NF

Granularity

The level of detail represented by the values stored in a table's row. Data stored at its lowest level of granularity is said to be atomic data. Refer to "Refine Primary Keys as Required for Data Granularity" for more context

Second Normal Form (2NF)

The second stage in the normalization process, In which a relation is in 1NF and there are no partial dependencies (dependencies in only part of the primary key). *To convert a table from 1NF to 2NF, you need to remove the partial dependencies* *In the end, you will basically create new tables with many-to-many relationships, meaning associative entities are created during this form.* Compare this picture with the picture in the 1NF card to see the changes from 1NF to 2NF


Related study sets

Med Surg - Chapter 17 - Principles of Inflammation and Immunity

View Set

DAY 2 QUESTIONS (Posterior Shoulder, Axilla, Arm, Pectoral Region, Breast)

View Set

U16 - Technological Progress, Employment and Living Standards in the long-run

View Set

Prep for Exit Exam #1 (Pre-Assessment)

View Set

Taylor- Fundamentals of Nursing Chp 15

View Set

IB/AP Biology Unit 6 Gene Expression and Regulation

View Set