Chapter 14
What are normals forms defined informally?
1st normal form: All attributes depend on the key 2nd normal form: All attributes depend on the whole key 3rd normal form: All attributes depend on nothing but the key
What is join dependency?
A join dependency (JD), denoted by JD(R1, R2, ..., Rn), specified on relation schema R, specifies a constraint on the states r of R. The constraint states that every legal state r of R should have a non-additive join decomposition into R1, R2, ..., Rn; that is, for every such r we have
When is a join dependency trivial?
A join dependency JD(R1, R2, ..., Rn), specified on relation schema R, is a trivial JD if one of the relation schemas Ri in JD(R1, R2, ..., Rn) is equal to R.
What is a key K ?
A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more.
What is a multivalued dependencies?
A multivalued dependency (MVD) X —>> Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation state r of R: If two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where we use Z to denote (R 2 (X υ Y)):
Can you achieveBCNF by decomposition?
A relation NOT in BCNF should be decomposed so as to meet this property, while possibly forgoing the preservation of all functional dependencies in the decomposed relations.
What is the BCNF or Boyce-codd normal form?
A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X → A holds in R, then X is a superkey of R Each normal form is strictly stronger than the previous one Every 2NF relation is in 1NF Every 3NF relation is in 2NF Every BCNF relation is in 3NF There exist relations that are in 3NF but not in BCNF Hence BCNF is considered a stronger form of 3NF The goal is to have each relation in BCNF (or 3NF)
What is the fifth normal form or when is a relational schema in one?
A relation schema R is in fifth normal form (5NF) (or Project-Join Normal Form (PJNF)) with respect to a set F of functional, multivalued, and join dependencies if, for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied by F), every Ri is a superkey of R. 5NF is rarely used in practice
What is a superkey?
A superkey of a relation schema R = {A1, A2, ...., An} is a set of attributes S subset-of R with the property that no two tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]
What is functional dependencies?
Are used to specify formal measures of the "goodness" of relational designs And keys are used to define normal forms for relations Are constraints that are derived from the meaning and interrelationships of the data attributes A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y An FD is a property of the attributes in the schema R
What is the reasons for null in good base relations?
Attribute not applicable or invalid Attribute value unknown (may exist) Value known to exist, but unavailable
What are the two decompositions for binary relational decompositions?
Binary Decomposition: Decomposition of a relation R into two relations. PROPERTY NJB (non-additive join test for binary decompositions): A decomposition D = {R1, R2} of R has the lossless join property with respect to a set of functional dependencies F on R if and only if either
What is normal form?
Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form
What is the first normal forms?
First normal forms disallows: composite attributes multivalued attributes nested relations It is also considered to be part of the definition of a relation most RDBMSs allow only those relations to be defined that are in First Normal Form
What is the fourth guidelines for good base relations?
Generation of Spurious Tuples- avoid at all cost: The relations should be designed to satisfy the lossless join condition. No spurious tuples should be generated by doing a natural-join of any relations.
What is the third guidelines for good base relations?
Null Values in tuples Relations should be designed such that their tuples will have as few NULL values as possible Attributes that are NULL frequently could be placed in separate relations (with the primary key)
What is the second guidelines for good base relations?
Redundant information in Tuples and Update anomalies if the information is stored redundantly it wastes storage and causes issues with update anomalies like insertion, deletion, and modification anomalies Design a schema that does not suffer from the insertion, deletion, and update anomalies. If there are any anomalies present, then note them so that applications can be made to take them into account.
What is the first guidelines for good base relations?
Semantics of the relational attributes must be clear Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret.
General definition of thrid normal form:
Superkey of relation schema R - a set of attributes S of R that contains a key of R A relation schema R is in third normal form (3NF) if whenever a FD X → A holds in R, then either: (a) X is a superkey of R, or (b) A is a prime attribute of R or A relation schema R is in third normal form (3NF) if every non-prime attribute in R meets both of these conditions: It is fully functionally dependent on every key of R It is non-transitively dependent on every key of R Note that stated this way, a relation in 3NF also meets the requirements for 2NF.
General Normal form guidelines for multiple keys:
The above definitions consider the primary key only The following more general definitions take into account relations with multiple candidate keys Any attribute involved in a candidate key is a prime attribute All other attributes are called non-prime attributes.
What is relational database design?
The grouping of attributes to form "good" relation schemas
What are the two levels of relational schemas?
The logical "user view" level The storage "base relation" level
What is normalization?
The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties
What is denormalization?
The process of storing the join of higher normal form relations as a base relation—which is in a lower normal form
What is the third normal form?
Transitive functional dependency: a FD X -> Z that can be derived from two FDs X -> Y and Y -> Z A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key R can be decomposed into 3NF relations via the process of 3NF normalization
What is an example of update anomaly?
Updating the name of an attribute can affects all tuples in the table
What is an example of delete anomaly?
When a project is deleted all the employees are deleted as well
What is the fourth normal form in multivalued dependencies?
a relation schema R is in 4NF with respect to a set of dependencies F (that includes functional dependencies and multivalued dependencies) if, for every nontrivial multivalued dependency X —>> Y in F+, X is a superkey for R.
What are the two most important properties of decompositions?
a)Non-additive or losslessness of the corresponding join b)Preservation of the functional dependencies. Note that Property (a) is extremely important and cannot be sacrificed. Property (b) is less stringent and may be sacrificed.
What are nested relations?
attributes whose values for an individual tuple are non-atomic
What is 2NF, 3NF, BCNF?
based on keys and FDs of a relation schema
What is 5NF?
based on keys, join dependencies : JDs
What is the 4NF?
based on keys, multi-valued dependencies : MVDs;
What is an example of insert anomaly?
cannot insert a project unless an employee is assigned to it and conversely is cannot insert an employee unless he or she is assigned to a project
What are the first condition for the 3nf?
condition a catches two types of violations -one where a prime attribute functionally determines a non-prime attribute. This catches 2NF violations due to non-full functional dependencies. -second, where a non-prime attribute functionally determines a non-prime attribute. This catches 3NF violations due to a transitive dependency.
What is a concern for base relations?
design
When is multivalued dependencies trivial?
if (a) Y is a subset of X, or (b) X υ Y = R.
What is a candidate key?
if a relation schema has more than one key, each key is called a candidate key. One of the candidate keys is designated to be the primary keys and the others are called secondary keys
What is the second normal form for multiple candidate keys?
if every non-prime attribute A in R is fully functionally dependent on every key of R
When is a relation schema R is in second normal form?
if every non-prime attribute A in R is fully functionally dependent on the primary key R can be decomposed into 2NF relations via the process of 2NF normalization or "second normalization"
When is a relational schema in the third normal form?
if whenever a FD X → A holds in R, then either: (a) X is a superkey of R, or (b) A is a prime attribute of R
What is a prime attribute?
must be a member of some candidate key
What is Nonprime attribute?
not a member of any candidate key
What is the second condition for the 3NF?
the condition (b) from the last slide takes care of the dependencies that "slip through" (are allowable to) 3NF but are "caught by" BCNF which we discuss next.
What is the second normal forms or 2NF?
uses the concepts of FDs, primary key Prime attribute: An attribute that is member of the primary key K Full functional dependency: a FD Y -> Z where removal of any attribute from Y means the FD does not hold any more
