Chapter 5 - The Relational Data Model and Relational Database Constraints
Update Operations on Relations
- INSERT a tuple - DELETE a tuple - MODIFY a tuple Should not violate integrity constraints. Several update operations may have to be grouped together; updates may propagate to cause other updates automatically to maintain ICs. Actions Taken Upon IC Violation - Cancel operation (RESTRICT/REJECT options) - Perform operation but inform user of violation - Trigger additional updates (CASCADE, SET NULL) so violation is corrected - Execute a user-specified error-correction runtime
Referential Integrity
A constraint involving two relations, as opposed to the single relation constraints before. - Used to specify a relationship among tuples in two relations. -- Referencing relation and the referenced relation Tuples in the referencing relation R1 have attributes FK (foreign key attributes) that reference the primary key attributes PK of the referenced relation R2. - t1[FK] = t2[PK]
Relational Database State
A relational database state DB of S is a set of relation states DB = {r1, r2, ..., rm} such that each ri is a state of Ri and such that the ri relation states satisfy the integrity constraints of IC. - Sometimes called a relational database snapshot or instance - Instance will not be used since it also applies to single tuples. A union of all the individual relation states
Relational Model
Describes the database as a set of relations.
Key of a Relation
Each row of a relation can have a value which uniquely identifies that row. In the STUDENT table, the SSN is the key
Basic Operations
INSERT a new tuple DELETE an existing tuple MODIFY an attribute of an existing tuple
Artificial/Surrogate Key
Keys created from row-ids or sequential numbers - basically, an added column used for the sole purpose of being a key.
Tuple
Name for a row in the formal model
Possible DELETE Violations
Only referential integrity. - If primary key value of the tuple being deleted is referenced from other tuples in the database. - Can be remedied by several actions: RESTRICT, CASCADE, SET NULL -- Reject deletion -- Propagate new primary key value into the foreign keys of the referencing tuples -- Set the foreign keys of the referencing tuples to NULL One of the above must be specified during database design for each foreign key constraint
Referential Integrity/Foreign Key Constraint
Statement of the Constraint The value in the foreign key column (or columns) FK of the referencing relation R1 can be either: - A value of an existing primary key value of a corresponding primary PK in the reference relation R2, or: - A null, in which case the FK in R1 should not be a part of its own primary key.
Schema
The schema (description) of a relation is denoted as follows: R(A1, A2, ... , An) - R is the name of the relation (entity type) - A1, A2... are attributes Example: CUSTOMER(id, name, address, phone) Basically, the generic outline for a tuple.
Relational Integrity Constraints
Three main constraints expressed in the relational model: - Key constraints - Entity integrity constraints - Referential integrity constraints Additional schema-based constraint: Domain constraint. - Every value in the tuple must be from the domain of its attribute (or null if allowed for that attribute). - Value of attribute A must be from dom(A) All Constraints ig - Domain Constraint - NOT NULL Constraint - Key Constraint - Entity Integrity (primary key not null) - Referential Integrity
Displaying a Relational Database Schema and its Constraints
- Primary keys are underlined - Foreign key constraints are displayed with an arrow from FK to reference table
Relational Database Schema
A set of S relation schemas that belong to the same database. - S is the name of the whole database schema - S = {R1, R2, ..., Rn} and a set IC of integrity constraints - R1, R2, ..., Rn are the names of the individual relation schemas within the database S
Relation State
A subset of the Cartesian product of the domains of its attributes - in other words, a subset of all possible tuples/possible attribute values. Denoted as r(R) r(R) is the specific state of relation R - a set of tuples
Characteristics of Values in a Tuple
All values in a tuple are considered to be atomic (indivisible). Each value in a tuple must be within the domain of the attribute for that column. If tuple t = <v1, v2, ..., vn> is a tuple (row) in the relation state r of R(A1, A2, ..., An) Then each vi must be a value from dom(Ai). A special null value is used to represent values unknown or not available/applicable in certain tuples.
Notation of Relations
Component Values of a tuple t are referred using the following notation: t[Ai] or t.Ai - Refers to value vi of attribute Ai for tuple t. t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw respectively in t Example: t = <"John", 123456789, 4.0> t[Name] = <"John"> t[SSN, GPA] = <123456789, 4.0> Note how we grab subtuples and never just individual values.
Possible INSERT Violations
Domain Constraint - If an attribute value provided for the new tuple is not of a specified attribute domain Key Constraint - If the value for a key attribute in the new tuple is a dupe of an existing key Referential Integrity - If a foreign key value in the new tuple references a primary key value that does not exist in the referenced relation Entity Integrity - If the primary key value is null in the new tuple
Possible UPDATE Violations
Domain constraint and NOT NULL constraint on a modified attribute. Can also violate other constraints depending on attribute being updated: - Updating a primary key: similar to DELETE followed by an INSERT, specify similar options to DELETE - Updating a foreign key (FK): May violate referential integrity - Updating an ordinary attribute (not FK or PK): Can only violate domain constraints
Relation
Informally, a table of values. Each row in the table has a collection or related data values which correspond to a real-world entity or relationship, and each column has a header (attribute name) which indicates meaning of the data in that column - Relation name can be compared to entity type - Row/Tuple can be compared to an individual entity - Column headers are comparable to attribute names
Constraints
Inherent or Implicit Constraints - Based on the data model itself; inherent to the data model used. Ex. Relational Model does not allow lists as values for attributes Schema-Based or Explicit Constraints - Expressed in the schema using the facilities provided by the model. Ex. Maximum cardinality ratio in the ER Model Application-Based or Semantic Constraints - Beyond the expressive power of the model and must be specified and enforced by application programs
Tuples (Relations)
Ordered set of values enclosed in angled brackets < >. Each value is derived from its appropriate domain (valid set of values). Ex. A row in the CUSTOMER relation is a 4-tuple which could look like: <1234, "John", "1234 Main Street", "(123)-456-7890"> Basically, a specific application of the schema. A relation is a set of tuples (rows).
Attribute Domain
Refers to the set of valid values for an attribute. Domains can also have data types and formats defined for them. Attribute names can designate the role of a domain in the relation. - The domain Date may be used to define two attributes named "Invoice-date" and "Payment-date" with different meanings
Other Constraints
Semantic Integrity Constraints - Based on application semantics and cannot be expressed by the data model. - Constraint specification language may be used to express these. - SQL-99 allows CREATE TRIGGER and CREATE ASSERTION to express some of these constraints - Keys, Permissibility of Null values, Candidate Keys (Unique in SQL), Foreign Keys, Referential Integrity, etc. are expressed by the CREATE TABLE statement in SQL.
Key Constraints
Superkey of R A set of attributes SK of R with the following condition: - No two tuples in any valid relation state r(R) will have the same value for attribute SK. That is, for two distinct tuples t1 and t2 in r(R), t1[SK] =/= t2[SK] - Default superkey is the set of all attributes Key of R - A "minimal" superkey. That is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey (does not possess the superkey uniqueness property). - Two distinct tuples cannot have matching values for all attributes in the key. Superkey shares this constraint. - We cannot remove any attributes and still have this uniqueness constraint hold (minimality property). Optional for superkeys. - Compare this to composite keys. The combination of the attribute values is what makes the uniqueness. A Key is a Superkey but not necessarily vice versa. In General: - Any key is a superkey, not vice versa - Any set of attributes which includes a key is a superkey. - A minimal superkey is also a key. - If a relation has several candidate keys, one is chosen arbitrarily to be the primary key, whose attributes will be underlined - Primary key is used to uniquely identify each tuple in a relation, and reference the tuple from another tuple. - Generally, try to choose primary key based on smallest value
Relation Degree
The degree of a relation is the number of attributes of its relation schema.
Entity Integrity
The primary key values PK of each relation schema R in S cannot have null values in any tuple of r(R). - If they could, then multiple "keys" could be null, removing their uniqueness. - If PK is composite/has several attributes, none can be null - would invalidate minimality
Ordering of Tuples in Relations
Tuples are not considered to be ordered within the relation, even if they appear to be in tabular form. - The ordering of values within a tuple should follow the ordering defined by the schema. If the schema defines R(A1, A2, A3) then the values in t = <v1, v2, v3> should be ordered. v1=A1, v2=A2, etc. Self-describing representation: t = {<name, "John">, <SSN, 123456789>} - Matches the attribute name with its value.