CIS 3365: Chapter 6 - Normalization (Use this one)

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

Trick to Remembering the Database Normal Forms

"The key, the whole key, and nothing but the key, so help me Codd" The key (1NF) - tables must have PK The whole key (2NF) - 1NF with no partial dependencies Nothing but the key (3NF) - 2NF with no transitive dependencies

The objective of normalization is to ensure that each table conforms to the concept of well-formed relations—in other words, tables that have the following characteristics:

1. Each table represents a single subject. For example, a COURSE table will contain only data that directly pertain to courses. Similarly, a STUDENT table will contain only student data. 2. No data item will be unnecessarily stored in more than one table (in short, tables have minimum controlled redundancy). The reason for this requirement is to ensure that the data is updated in only one place. 3. All nonprime attributes in a table are dependent on the primary key—the entire primary key and nothing but the primary key. The reason for this requirement is to ensure that the data is uniquely identifiable by a primary key value. 4. Each table is void of insertion, update, or deletion anomalies, which ensures the integrity and consistency of the data.

Steps to Converting to First Normal Form

1. Eliminate the Repeating Groups 2. Identify the Primary Key 3. Identify All Dependencies

Steps in Converting to Second Normal Form

1. Make New Tables to Eliminate Partial Dependencies 2. Reassign Corresponding Dependent Attributes

Steps in Converting to Third Normal Form

1. Make New Tables to Eliminate Transitive Dependencies 2. Reassign Corresponding Dependent Attributes

Transitive dependency

A condition in which an attribute is dependent on another attribute that is not part of the primary key. Ex. X --> Y, Y --> Z, and X is the primary key X --> Z is a transitive dependency because X determines the value of Z via Y. There is an effective way to identify transitive dependencies: they occur only when a functional dependence exists among nonprime attributes. In the previous example, the actual transitive dependency is X --> Z. However, the dependency Y --> Z signals that a transitive dependency exists. Hence, throughout the discussion of the normalization process, the existence of a functional dependence among nonprime attributes will be considered a sign of a transitive dependency.

Partial dependencies

A condition in which an attribute is dependent on only a portion (subset) of the primary key. For example, (A, B) --> (C, D), B --> C, and (A, B) is the primary key The functional dependence B --> C is a partial dependency because only part of the primary key (B) is needed to determine the value of C.

Prime attribute

A key attribute; that is, an attribute that is part of a key or is the whole key.

Denormalization

A process by which a table is changed from a higher-level normal form to a lower-level normal form, usually to increase processing speed. Denormalization potentially yields data anomalies.

Normalization

A process that assigns attributes to entities so that data redundancies are reduced or eliminated.

Dependency Diagram

A representation of all data dependencies (primary key, partial, or transitive) within a table.

(IMG) A Table that is in 3NF but not in BCNF

A table is in 3NF when it is in 2NF and there are no transitive dependencies, but what about a case in which one key attribute is the determinant of another key attribute? That condition does not violate 3NF, yet it fails to meet the BCNF requirements (see Figure 6.8) because BCNF requires that every determinant in the table be a candidate key. Note these functional dependencies in Figure 6.8: A + B → C, D A + C → B, D C → B Notice that this structure has two candidate keys: (A + B) and (A + C). The table structure shown in Figure 6.8 has no partial dependencies, nor does it contain transitive dependencies. (The condition C → B indicates that one key attribute determines part of the primary key—and that dependency is not transitive or partial because the dependent is a prime attribute!) Thus, the table structure meets the 3NF requirements, although the condition C → B causes the table to fail to meet the BCNF requirements.

(IMG) Improving Database Design: Identify New Relationships

According to the original report, the users need to track which employee is acting as the manager of each project. This can be implemented as a relationship between EMPLOYEE and PROJECT. From the original report, it is clear that each project has only one manager. Therefore, the system's ability to supply detailed information about each project's manager is ensured by using the EMP_NUM as a foreign key in PROJECT. That action ensures that you can access the details of each PROJECT's manager data without producing unnecessary and undesirable data duplication.

Atomic attribute

An attribute that cannot be further subdivided to produce meaningful components. For example, a person's last name attribute cannot be meaningfully subdivided.

Nonprime attribute

An attribute that is not part of a key.

Determinant

Any attribute in a specific row whose value directly determines other values in that row.

(IMG) First Normal Form (1NF) Dependency Diagram

As you examine Figure 6.3, note the following features of a dependency diagram: 1. The primary key attributes are bold, underlined, and in a different color. 2. The arrows above the attributes indicate all desirable dependencies—that is, dependencies based on the primary key. In this case, note that the entity's attributes are dependent on the combination of PROJ_NUM and EMP_NUM. 3. The arrows below the dependency diagram indicate less desirable dependencies. Two types of such dependencies exist: a. Partial dependencies. You need to know only the PROJ_NUM to determine the PROJ_NAME; that is, the PROJ_NAME is dependent on only part of the primary key. Also, you need to know only the EMP_NUM to find the EMP_NAME, the JOB_CLASS, and the CHG_HOUR. A dependency based on only a part of a composite primary key is a partial dependency. b. Transitive dependencies. Note that CHG_HOUR is dependent on JOB_CLASS. Because neither CHG_HOUR nor JOB_CLASS is a prime attribute—that is, neither attribute is at least part of a key—the condition is a transitive dependency. In other words, a transitive dependency is a dependency of one nonprime attribute on another nonprime attribute. The problem with transitive dependencies is that they still yield data anomalies.

Functional dependence (generalized definition)

Attribute A determines attribute B (that is, B is functionally dependent on A) if all (generalized definition) of the rows in the table that agree in value for attribute A also agree in value for attribute B.

(IMG) Improving Database Design: Refine Attribute Atomicity

Clearly, the use of the EMP_NAME in the EMPLOYEE table is not atomic because EMP_NAME can be decomposed into a last name, a first name, and an initial. By improving the degree of atomicity, you also gain querying flexibility. For example, if you use EMP_LNAME, EMP_FNAME, and EMP_INITIAL, you can easily generate phone lists by sorting last names, first names, and initials

Conversion to 2NF occurs only when the 1NF has a ___

Composite primary key. If the 1NF has a single-attribute primary key, then the table is automatically in 2NF. (No partial dependencies exist if the primary key only has one attribute).

At the end of First normal form (1NF), you should have:

Data in a table format No repeating groups PK identified. All non-key attributes in the relation are dependent on the primary key. You may have partial dependencies (to be removed in 2NF) You may have transitive dependencies (to be removed in 3NF)

(IMG) Tabular Representation of the Report Format for a Construction Company

Each project has its own project number, name, assigned employees, and so on. Each employee has an employee number, name, and job classification, such as engineer or computer technician. The company charges its clients by billing the hours spent on each contract. The hourly billing rate is dependent on the employee's position. For example, one hour of computer technician time is billed at a different rate than one hour of engineer time. Note that the data in Figure 6.1 reflects the assignment of employees to projects. Apparently, an employee can be assigned to more than one project. For example, Darlene Smithson (EMP_NUM = 112) has been assigned to two projects: Amber Wave and Starflight. Given the structure of the dataset, each project includes only a single occurrence of any one employee. Therefore, knowing the PROJ_NUM and EMP_NUM values will let you find the job classification and its hourly charge. In addition, you will know the total number of hours each employee worked on each project. (The total charge—a derived attribute whose value can be computed by multiplying the hours billed and the charge per hour—has not been included in Figure 6.1). This table has many deficiencies: 1. The project number (PROJ_NUM) is apparently intended to be a primary key (PK) or at least a part of a PK, but it contains nulls. Given the preceding discussion, you know that PROJ_NUM + EMP_NUM will define each row. 2. The table entries invite data inconsistencies. For example, the JOB_CLASS value "Elect. Engineer" might be entered as "Elect.Eng." in some cases, "El. Eng." in others, and "EE" in still others. 3. The table displays data redundancies that yield the following anomalies: a. Update anomalies. Modifying the JOB_CLASS for employee number 105 requires many potential alterations, one for each EMP_NUM = 105. b. Insertion anomalies. Just to complete a row definition, a new employee must be assigned to a project. If the employee is not yet assigned, a phantom project must be created to complete the employee data entry. c. Deletion anomalies. Suppose that only one employee is associated with a given project. If that employee leaves the company and the employee data is deleted, the project information will also be deleted. To prevent the loss of the project information, a fictitious employee must be created. Given the existence of update anomalies, suppose Darlene M. Smithson is assigned to work on the Evergreen project. The data-entry clerk must update the PROJECT file with the following entry: 15 Evergreen 112 Darlene M. Smithson DSS Analyst $45.95 0.0 to match the attributes PROJ_NUM, PROJ_NAME, EMP_NUM, EMP_NAME, JOB_ CLASS, CHG_HOUR, and HOURS. Each time another employee is assigned to a project, some data entries (such as PROJ_NAME, EMP_NAME, and CHG_HOUR) are unnecessarily repeated. Imagine the data-entry chore when 200 or 300 table entries must be made! The entry of the employee number should be sufficient to identify Darlene M. Smithson, her job description, and her hourly charge.

(IMG) Improving Database Design: Evaluate PK Assignments

Each time a new employee is entered into the EMPLOYEE table, a JOB_CLASS value must be entered. Unfortunately, it is too easy to make data-entry errors that lead to referential integrity violations. For example, entering DB Designer instead of Database Designer for the JOB_CLASS attribute in the EMPLOYEE table will trigger such a violation. Therefore, it would be better to add a JOB_CODE attribute to create a unique identifier. The addition of a JOB_CODE attribute produces the following dependency: JOB_CODE → JOB_CLASS, CHG_HOUR If you assume that the JOB_CODE is a proper primary key, this new attribute does produce the following dependency: JOB_CLASS → CHG_HOUR However, this dependency is not a transitive dependency because the determinant is a candidate key. Note that the new JOB table now has two candidate keys—JOB_CODE and JOB_CLASS. In this case, JOB_CODE is the chosen primary key as well as a surrogate key (an artificial PK introduced by the designer with the purpose of simplifying the assignment of primary keys to tables. Surrogate keys are usually numeric, they are often generated automatically by the DBMS, they are free of semantic content (they have no special meaning), and they are usually hidden from the end users).

A table is in Boyce-Codd normal form (BCNF) when

Every determinant in the table is a candidate key (recall that a candidate key has the same characteristics as a primary key, but for some reason, it was not chosen to be the primary key). A table in BCNF must be in 3NF.

(IMG) Improving Database Design: Evaluate Using Derived Attributes

Finally, you can use a derived attribute in the ASSIGNMENT table to store the actual charge made to a project. That derived attribute, named ASSIGN_CHARGE, is the result of multiplying ASSIGN_HOURS by ASSIGN_ CHG_HOUR. This creates a transitive dependency such that: (ASSIGN_CHARGE + ASSIGN_HOURS) → ASSIGN_CHG_HOUR From a system functionality point of view, such derived attribute values can be calculated when they are needed to write reports or invoices.

Stages of Normalization

First normal form (1NF) Second normal form (2NF) Third normal form (3NF)

Steps in Converting to Second Normal Form: Step 1 - Make New Tables to Eliminate Partial Dependencies

For each component of the primary key that acts as a determinant in a partial dependency, create a new table with a copy of that component as the primary key. While these components are placed in the new tables, it is important that they also remain in the original table as well. The determinants must remain in the original table because they will be the foreign keys for the relationships needed to relate these new tables to the original table. To construct the revised dependency diagram, write each key component on a separate line and then write the original (composite) key on the last line. For example: PROJ_NUM EMP_NUM PROJ_NUM EMP_NUM Each component will become the key in a new table. In other words, the original table is now divided into three tables (PROJECT, EMPLOYEE, and ASSIGNMENT).

Steps in Converting to Third Normal Form: Step 1 - Make New Tables to Eliminate Transitive Dependencies

For every transitive dependency, write a copy of its determinant as a primary key for a new table. If you have three different transitive dependencies, you will have three different determinants. As with the conversion to 2NF, it is important that the determinant remain in the original table to serve as a foreign key. In the Construction Company example, there is only one transitive dependency: JOB_CLASS --> CHG_HOUR Therefore, write the determinant for this transitive dependency as: JOB_CLASS

(IMG) Decomposition to BCNF

Functional dependencies of the first data structure: A + B → C, D A + C → B, D C → B The first table structure is not in BCNF because of the C--> B dependency. BCNF requires that every determinant in the table is a candidate key. To convert the table structure in Figure 6.8 into table structures that are in 3NF and in BCNF, first change the primary key to A + C. This change is appropriate because the dependency C → B means that C is effectively a superset of B. At this point, the table is in 1NF because it contains a partial dependency, C → B. Next, follow the standard decomposition procedures to produce the results shown in Figure 6.9.

(IMG) Initial Contracting Company ERD

Given descriptions of the company's operations, two entities and their attributes are initially defined: PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_ DESCRIPTION, JOB_CHG_HOUR) After creating the initial ERD shown in Figure 6.13, the normal forms are defined: PROJECT is in 3NF and needs no modification at this point. EMPLOYEE requires additional scrutiny. The JOB_DESCRIPTION attribute defines job classifications such as Systems Analyst, Database Designer, and Programmer. In turn, those classifications determine the billing rate, JOB_CHG_HOUR. Therefore, EMPLOYEE contains a transitive dependency. The removal of EMPLOYEE's transitive dependency yields three entities: PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, JOB_CODE) JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR)

Steps in Converting to Third Normal Form: Step 2 - Reassign Corresponding Dependent Attributes

Identify the attributes that are dependent on each determinant identified in Step 1. The transitive dependency is JOB_CLASS --> CHG_HOUR. The determinant identified is JOB_CLASS. The dependent is CHG_HOUR. Place the dependent attributes in the new tables with their determinants and remove them from their original tables. In the Construction Company example, eliminate CHG_HOUR from the EMPLOYEE table to leave the EMPLOYEE table dependency definition as: EMP_NUM --> EMP_NAME, JOB_CLASS Instead of EMP_NUM --> EMP_NAME, JOB_CLASS, CHG_HOUR Draw a new dependency diagram to show all of the tables you have defined in Steps 1 and 2. Name the table to reflect its contents and function. In this case, JOB seems appropriate. Check all of the tables to make sure that each table has a determinant and that no table contains inappropriate dependencies. When you have completed these steps, you will see the results in Figure 6.5. In other words, after the 3NF conversion has been completed, your database will contain four tables: PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS) JOB (JOB_CLASS, CHG_HOUR) ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)

Normalization starts by:

Identifying the dependencies of a given relation and progressively breaking up the relation (table) into a set of new relations (tables) based on the identified dependencies.

Fully functional dependence (composite key)

If attribute B is functionally dependent on a composite key A but not on any subset of that composite key, the attribute B is fully functionally dependent on A.

(IMG) Improving Database Design: Identify New Attributes

If the EMPLOYEE table were used in a real-world environment, several other attributes would have to be added. For example, year-to-date gross salary payments, Social Security payments, and Medicare payments would be desirable. An employee hire date attribute (EMP_HIREDATE) could be used to track an employee's job longevity, and it could serve as a basis for awarding bonuses to long-term employees and for other morale-enhancing measures

(IMG) Improving Database Design: Evaluate Naming Conventions

It is best to adhere to the naming conventions outlined in Chapter 2, Data Models. Therefore, CHG_HOUR will be changed to JOB_CHG_ HOUR to indicate its association with the JOB table. In addition, the attribute name JOB_CLASS does not quite describe entries such as Systems Analyst, Database Designer, and so on; the label JOB_DESCRIPTION fits the entries better. Also, you might have noticed that HOURS was changed to ASSIGN_HOURS in the conversion from 1NF to 2NF. That change lets you associate the hours worked with the ASSIGNMENT table

A table is in second normal form (2NF) when:

It is in 1NF There are no partial dependencies (i.e. there is no attribute that is dependent on only a portion of the primary key).

A table is in third normal form (3NF) when:

It is in 2NF. It contains no transitive dependencies.

A table is in fourth normal form (4NF) when:

It is in 3NF and has no multivalued dependencies.

Steps to Converting to First Normal Form: Step 2 - Identify the Primary Key

Note that PROJ_NUM is not an adequate primary key because the project number does not uniquely identify all of the remaining entity (row) attributes. To maintain a proper primary key that will uniquely identify any attribute value, the new key must be composed of a combination of PROJ_NUM and EMP_NUM. If you know that PROJ_NUM = 15 and EMP_NUM = 103, the entries for the attributes PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, and HOURS must be Evergreen, June E. Arbough, Elect. Engineer, $84.50, and 23.8, respectively.

(IMG) Repeating Groups for Construction Company

Note that each single project number (PROJ_NUM) occurrence can reference a group of related data entries. For example, the Evergreen project (PROJ_NUM = 15) shows five entries at this point—and those entries are related because they each share the PROJ_NUM = 15 characteristic. Each time a new record is entered for the Evergreen project, the number of entries in the group grows by one.

(IMG) Another BCNF Decomposition

Panel A of Figure 6.10 shows a structure that is clearly in 3NF, but the table represented by this structure has a major problem because it is trying to describe two things: staff assignments to classes and student enrollment information. Such a dual-purpose table structure will cause anomalies. For example, if a different staff member is assigned to teach class 32456, two rows will require updates, thus producing an update anomaly. Also, if student 135 drops class 28458, information about who taught that class is lost, thus producing a deletion anomaly. The solution to the problem is to decompose the table structure, following the procedure outlined earlier. The decomposition of Panel B shown in Figure 6.10 yields two table structures that conform to both 3NF and BCNF requirements. Remember that a table is in BCNF when every determinant in that table is a candidate key. Therefore, when a table contains only one candidate key, 3NF and BCNF are equivalent.

Two types of functional dependencies

Partial dependencies Transitive dependencies

Steps to Converting to First Normal Form: Step 1 - Eliminate the Repeating Groups

Start by presenting the data in a tabular format, where each cell has a single value and there are no repeating groups. To eliminate the repeating groups, eliminate the nulls by making sure that each repeating group attribute contains an appropriate data value.

(IMG) Final Contracting Company ERD

The ASSIGNMENT entity in Figure 6.16 uses the primary keys from the entities PROJECT and EMPLOYEE to serve as its foreign keys. However, note that in this implementation, the ASSIGNMENT entity's surrogate primary key is ASSIGN_NUM, to avoid the use of a composite primary key. Therefore, the "enters" relationship between EMPLOYEE and ASSIGNMENT and the "requires" relationship between PROJECT and ASSIGNMENT are shown as weak or nonidentifying. Because you will likely need detailed information about each project's manager, the creation of a "manages" relationship is useful. The "manages" relationship is implemented through the foreign key in PROJECT. Finally, some additional attributes may be created to improve the system's ability to generate additional information. For example, you may want to include the date the employee was hired (EMP_HIREDATE) to keep track of worker longevity. Based on this last modification, the model should include four entities and their attributes: PROJECT (PROJ_NUM, PROJ_NAME, EMP_NUM) EMPLOYEE (EMP_NUM, EMP_LNAME, EMP_FNAME, EMP_INITIAL, EMP_HIREDATE, JOB_CODE) JOB (JOB_CODE, JOB_DESCRIPTION, JOB_CHG_HOUR) ASSIGNMENT (ASSIGN_NUM, ASSIGN_DATE, PROJ_NUM, EMP_NUM, ASSIGN_HOURS, ASSIGN_CHG_HOUR, ASSIGN_CHARGE)

(IMG) Improving Database Design: Refine Primary Keys as Required for Data Granularity

The ASSIGNMENT table in 3NF uses the ASSIGN_HOURS attribute to represent the hours worked by a given employee on a given project. However, are those values recorded at their lowest level of granularity? In other words, does ASSIGN_HOURS represent the hourly total, daily total, weekly total, monthly total, or yearly total? Clearly, ASSIGN_ HOURS requires more careful definition. In this case, the relevant question would be as follows: for what time frame—hour, day, week, month, and so on—do you want to record the ASSIGN_HOURS data?

(IMG) Tables with Multiple Candidate Keys

The CLASS table has two candidate keys: - CLASS_CODE - CRS_CODE + CLASS_SECTION The table is in 1NF because the key attributes are defined and all nonkey attributes are determined by the key. This is true for both candidate keys. Both candidate keys have been identified, and all of the other attributes can be determined by either candidate key. The table is in 2NF because it is in 1NF and there are no partial dependencies on either candidate key. Finally, the table is in 3NF because there are no transitive dependencies. Therefore, the table is also in BCNF, because it is in 3NF and every determinant in the table is a candidate key.

Functional dependence

The attribute B is functionally dependent on the attribute A if each value of A determines one and only one value of B. Example: PROJ_NUM --> PROJ_NAME (read as PROJ_NUM functionally determines PROJ_NAME) In this case, the attribute PROJ_NUM is known as the determinant attribute, and the attribute PROJ_NAME is known as the dependent attribute.

Key attributes

The attributes that form a primary key.

Steps to Converting to First Normal Form: Step 3 - Identify All Dependencies

The identification of the PK in Step 2 means that you have already identified the following dependency: PROJ_NUM, EMP_NUM --> PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS That is, the PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, and HOURS values are all dependent on—they are determined by—the combination of PROJ_NUM and EMP_NUM. There are additional dependencies. For example, the project number identifies (determines) the project name. In other words, the project name is dependent on the project number. You can write that dependency as: PROJ_NUM --> PROJ_NAME Also, if you know an employee number, you also know that employee's name, job classification, and charge per hour. Therefore, you can identify the dependency shown next: EMP_NUM --> EMP_NAME, JOB_CLASS, CHG_HOUR You can also see that knowing the job classification means knowing the charge per hour for that job classification. Therefore, you can identify one last dependency: JOB_CLASS --> CHG_HOUR This dependency exists between two nonprime attributes; therefore, it is a signal that a transitive dependency exists.

Granularity

The level of detail represented by the values stored in a table's row. Data stored at its lowest level of granularity is said to be atomic data.

Steps in Converting to Second Normal Form: Step 2 - Reassign Corresponding Dependent Attributes

The primary key of the Construction company is (PROJ_NUM, EMP_NUM). The partial dependencies in the Construction example is: PROJ_NUM --> PROJ_NAME and EMP_NUM --> EMP_NAME, JOB_CLASS, CHG_HOUR The attributes that are dependent in a partial dependency are removed from the original table and placed in the new table with the dependency's determinant. Any attributes that are not dependent in a partial dependency will remain in the original table. In other words, the three tables that result from the conversion to 2NF are given appropriate names (PROJECT, EMPLOYEE, and ASSIGNMENT) and are described by the following relational schemas: PROJECT (PROJ_NUM, PROJ_NAME) EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR) ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS) Because the number of hours spent on each project by each employee is dependent on both PROJ_NUM and EMP_NUM in the ASSIGNMENT table, you leave those hours in the ASSIGNMENT table as ASSIGN_HOURS. Notice that the ASSIGNMENT table contains a composite primary key composed of the attributes PROJ_NUM and EMP_NUM. Notice that by leaving the determinants in the original table as well as making them the primary keys of the new tables, primary key/foreign key relationships have been created. For example, in the EMPLOYEE table, EMP_NUM is the primary key. In the ASSIGNMENT table, EMP_NUM is part of the composite primary key (PROJ_NUM, EMP_NUM) and is a foreign key relating the EMPLOYEE table to the ASSIGNMENT table. Because a partial dependency can exist only when a table's primary key is composed of several attributes, a table whose primary key consists of only a single attribute is automatically in 2NF once it is in 1NF. Note that there is still a transitive dependency.

How does Normalization take place if there are multiple candidate keys in a table?

The table can have partial dependencies based on this composite candidate key, even when the primary key chosen is a single attribute. In those cases, following the process described above, those dependencies would be perceived as transitive dependencies and would not be resolved until 3NF (which is a mistake). The existence of multiple candidate keys can also influence the identification of transitive dependencies. Previously, a transitive dependency was defined to exist when one nonprime attribute determined another nonprime attribute. In the presence of multiple candidate keys, the definition of a nonprime attribute as an attribute that is not a part of any candidate key is critical. If the determinant of a functional dependence is not the primary key but is a part of another candidate key, then it is not a nonprime attribute and does not signal the presence of a transitive dependency.

The objective of normalization is to ensure that all tables are at least in ___

Third normal form (3NF)

(IMG) Incorrect M:N Relationship Representation

To represent the M:N relationship between EMPLOYEE and PROJECT, you might think that two 1:M relationships could be used—an employee can be assigned to many projects, and each project can have many employees assigned to it. (See Figure 6.15.) Unfortunately, that representation yields a design that cannot be correctly implemented. The ERD in Figure 6.15 must be modified to include the ASSIGNMENT entity to track the assignment of employees to projects,

Similarities with converting to both 2NF and 3NF

When confronted with both partial and transitive dependencies, the solution is the same: create a new table for each problem dependency. The determinant of the problem dependency remains in the original table (as a foreign key) and is placed as the primary key of the new table. The dependents of the problem dependency are removed from the original table and placed as nonprime attributes in the new table.

(IMG) Improving Database Design: Maintain Historical Accuracy

Writing the job charge per hour into the ASSIGNMENT table is crucial to maintaining the historical accuracy of the table's data. It would be appropriate to name this attribute ASSIGN_CHG_HOUR. Although this attribute would appear to have the same value as JOB_CHG_HOUR, this is true only if the JOB_ CHG_HOUR value remains the same forever. It is reasonable to assume that the job charge per hour will change over time. However, suppose that the charges to each project were calculated and billed by multiplying the hours worked from the ASSIGNMENT table by the charge per hour from the JOB table. Those charges would always show the current charge per hour stored in the JOB table rather than the charge per hour that was in effect at the time of the assignment.

Repeating groups

a group of multiple entries of the same type exists for any single key attribute occurrence. For example, a car can have multiple colors for its top, interior, bottom, trim, and so on. A relational table must not contain repeating groups.

Surrogate key

a system-defined attribute generally created and managed via the DBMS. Usually, a system-defined surrogate key is numeric, and its value is automatically incremented for each new row.


Ensembles d'études connexes

Food Science - Sensory Evaluation, Exam 2

View Set

基础印尼语 Bahasa Indonesia elementary 4

View Set