Chapter 9: Designing Databases
Logical database design is still somewhat
"conceptual" (like E-R modeling)
The *result* of normalization
Every nonprimary key attribute depends upon the whole primary key and nothing but the primary key
An attribute may be functionally dependent on
two (or more) attributes rather than on a single attribute (Emp_ID, Course → Date_Completed)
*Primary deliverable* of logical database design
Normalized relations
Two benefits of normalized tables
• *Not likely to change* over time • Have *minimal redundancy*
*Functional Dependency*
a particular relationship between two attributes
Each row of a relation corresponds to
a record that contains data values for an entity
Descriptions of *where and when*
data are entered, retrieved, deleted, and updated
For a given relation, attribute B is *functionally dependent* on attribute A if
for every valid value of A, that value of A uniquely determines the value of B.
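The definition above can be tried out on sample rows with a short Python check. This is a minimal sketch: the function name `fd_violations` and the sample rows are invented for illustration. An empty result only means the sample does not contradict A→B; it does not prove that the dependency holds.

```python
from collections import defaultdict

def fd_violations(rows, a, b):
    """Return {A value: set of B values} for each A value seen with more than one B value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[a]].add(row[b])
    return {value: bs for value, bs in seen.items() if len(bs) > 1}

# Hypothetical sample rows, not taken from the chapter.
employees = [
    {"Emp_ID": 100, "Name": "Simpson", "Dept": "Marketing"},
    {"Emp_ID": 140, "Name": "Beeton", "Dept": "Accounting"},
    {"Emp_ID": 110, "Name": "Lucero", "Dept": "Marketing"},
]

emp_to_name = fd_violations(employees, "Emp_ID", "Name")   # no contradictions in this sample
dept_to_name = fd_violations(employees, "Dept", "Name")    # "Marketing" maps to two names
```

Here `dept_to_name` is non-empty, so the sample demonstrates that Name is not functionally dependent on Dept.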
All relations have a
primary key
When *arranging related records in secondary memory* (hard disks and magnetic tapes) so that records can be stored, retrieved and updated rapidly (called file organization), you should also consider
protecting data and recovering data after errors are found
Not all tables are
relations
*Expectations* for
response time and data integrity
For most information systems, these files will be
tables in a relational database *(database tables)*
Functional dependence of B on A means that
there can be only one value of B for each value of A
Choose data storage
*technologies* (such as Read/Write DVD or optical disc) that will *efficiently, accurately, and securely* process database activities
*Compare* the *consolidated logical database design* with the (produce what?)
*translated E-R model* and produce one final logical database model for the application
Develop a logical data model for each known
*user interface* for the application using *normalization principles*
*Key physical database design decisions* include:
1. *Choosing a storage format (called data type) for each attribute* from the logical database model. 2. *Grouping attributes* from the logical database model into physical records. 3. *Arranging related records in secondary memory* (hard disks and magnetic tapes) so that records can be stored, retrieved and updated rapidly (called file organization). 4. *Selecting media and structures* for storing data to make access more efficient.
*Translate* the conceptual
*E-R data model* for the application *into normalized data requirements*
Develop a *logical*
database design
A *Well-Structured Relation* allows users to
insert, modify, and delete the rows without errors or inconsistencies (keep the relational database table properties)
Functional dependency is not a
mathematical dependency
Structure the data in
stable structures, called *normalized tables*
Developing a *logical database* design *reflects*
the *actual data requirements* that exist in the forms and reports of an information system
The main emphasis with logical design is to ensure a
well-structured database
*Third Normal Form (3NF)*
• 2NF • Nonprimary key attributes do not depend on each other (i.e. no transitive dependencies)
A relation in first normal form (1NF) is in second normal form (2NF) if either of the following two conditions applies:
• The primary key consists of only one attribute. • Every nonprimary key attribute is functionally dependent on the full set of primary key attributes.
File and Database Design occurs in two steps: (Develop and Prescribe what?)
*1.)* *Develop a logical database model*, which describes data using notation that corresponds to a data organization used by a database management system. *2.)* *Prescribe the technical specifications* for computer files and databases in which to store the data
What provides the technical specifications for computer files and databases in which to store the data
*Physical* database design
One of the most important principles of the relational model
*Referential integrity*
Descriptions of the *file* and
*database technologies* to be used
With physical database design, you actually begin to consider
*how* this is actually going to be implemented on the computer
When *selecting media and structures*, the primary structure used today to make access to data more rapid is
*key indexes* on unique and non-unique keys
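The effect of a key index can be observed with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table `orders`, the index name `idx_orders_customer`, and the generated rows are hypothetical, and SQLite stands in for whatever DBMS is actually chosen. The primary key already provides a unique index; the index created here is on a non-unique key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_number INTEGER PRIMARY KEY, "
    "customer_id INTEGER, order_date TEXT)"
)
# Hypothetical rows: 500 orders spread over 50 customers.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(n, 1000 + n % 50, "2017-11-01") for n in range(1, 501)],
)

def query_plan(sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

lookup = "SELECT * FROM orders WHERE customer_id = ?"
plan_before = query_plan(lookup, (1007,))   # no index on customer_id yet
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(lookup, (1007,))    # planner can now search the index

print(plan_before)
print(plan_after)
```

The exact plan text varies between SQLite versions, but after the index exists the plan names `idx_orders_customer` instead of scanning the whole table.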
*Combine normalized data requirements* from all user interfaces into
*one consolidated logical database model* (view integration)
Logical and physical database design in
*parallel* with other system design steps
During *physical database design*, you use the __________ of these four key logical database design steps.
*results*
*Primary Key*
An attribute (or combination of attributes) whose value is unique across all occurrences of a relation
The functional dependence of B on A is represented by
A→B. (e.g., Emp_ID→Name)
Each column in a relation corresponds to
an attribute of that relation
*Foreign Key*
an attribute that appears as a nonprimary key attribute (or part of a primary key) in one relation and as a primary key attribute in another relation
*Referential Integrity*
an integrity constraint specifying that the value (or existence) of an attribute in one relation depends on the value (or existence) of the same attribute in another relation
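A DBMS can enforce referential integrity through a foreign key constraint. The following is a minimal sketch using Python's built-in `sqlite3` module; the table definitions, the helper `try_insert_order`, and the order rows are assumptions made for illustration. The table is named `orders` because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE orders (
        order_number INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date   TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO customer VALUES (1256, 'Commonwealth Builder')")

def try_insert_order(order_number, customer_id, order_date):
    """Insert an order; return False if the DBMS rejects it on integrity grounds."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)",
            (order_number, customer_id, order_date),
        )
        return True
    except sqlite3.IntegrityError:
        return False

valid_accepted = try_insert_order(1, 1256, "2017-11-05")    # customer 1256 exists
orphan_accepted = try_insert_order(2, 9999, "2017-11-06")   # no customer 9999
```

The second insert is rejected because its Customer_ID value does not exist as a primary key value in CUSTOMER.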
The rows may be interchanged or stored in
any sequence
In a table, B is considered fully functionally dependent on A if
attribute B is functionally dependent on A, but is not functionally dependent on a proper subset of A
The sequence of columns can be interchanged without
changing the meaning or use of the relation
These specifications are sufficient for programmers and database analysts to code the
definitions of the database
Develop a logical database design from which we can
do *physical database design*
However, you can use sample data to demonstrate that a functional dependency
does not exist between two or more attributes
During *logical database design*, you must account for
every data element on a system input or output - form or report - and on the E-R diagram
E-R diagram must be transformed into relational notation, normalized, and then
merged with the existing normalized relations
When *choosing a storage format (called data type) for each attribute* from the logical database model, the format is chosen to
minimize storage space and to maximize data quality
A relation is in third normal form (3NF) if it is in second normal form (2NF) and there are
no functional (transitive) dependencies between two (or more) nonprimary key attributes
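Removing a transitive dependency can be sketched in Python as follows. The relation SALES(Customer_ID [pk], Name, Salesperson, Region) with Salesperson → Region is a hypothetical example, and `decompose_to_3nf` is an illustrative helper that assumes the functional dependencies are already known from requirements analysis.

```python
def decompose_to_3nf(primary_key, nonkey_attrs, fds):
    """Split off one relation per dependency whose determinant is made of nonkey attributes.

    Returns {key attributes (tuple): [nonkey attributes]}.
    """
    pk = set(primary_key)
    relations = {}
    moved = set()
    for determinant, dependents in fds:
        if not (set(determinant) & pk):  # determinant contains no key attribute
            relations.setdefault(tuple(determinant), []).extend(dependents)
            moved.update(dependents)
    # The determinant stays behind in the original relation as a foreign key.
    relations[tuple(primary_key)] = [a for a in nonkey_attrs if a not in moved]
    return relations

sales_3nf = decompose_to_3nf(
    primary_key=("Customer_ID",),
    nonkey_attrs=["Name", "Salesperson", "Region"],
    fds=[
        (("Customer_ID",), ["Name", "Salesperson"]),
        (("Salesperson",), ["Region"]),  # transitive: Customer_ID -> Salesperson -> Region
    ],
)
print(sales_3nf)
```

Region moves to a new relation keyed on Salesperson, while Salesperson remains in the original relation so the two can still be joined.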
*Secondary Key*
A candidate key which is not the Primary key
What does not prove the existence of a functional dependency
Instances (or sample data) in a relation
*Relational database model*
data represented as a set of related tables or relations
*Well-Structured Relation* (or *table*)
A relation that contains a minimum amount of redundancy
*Normalization*
The process of converting complex data structures into simple, stable data structures
Each row is
unique
*Definitions* of each
attribute
The *five purposes* of *Database Design*:
1. Structure the data in stable structures, called *normalized tables* 2. Develop a *logical database design* 3. Develop a logical database design from which we can do *physical database design* 4. *Translate a relational database model* into a *technical file and database design* that balances several performance factors 5. *Choose data storage technologies* (such as Read/Write DVD or optical disc) that will efficiently, accurately, and securely process database activities
*Relations* for the following inquiry screen showing the customer with the largest volume of total sales for a specified product during an indicated time period: *HIGHEST-VOLUME CUSTOMER* *ENTER PRODUCT ID:* M128 *START DATE:* 11/01/2017 *END DATE:* 12/31/2017 -------------------------------- *CUSTOMER ID.:* 1256 *NAME:* Commonwealth Builder *VOLUME:* 30
CUSTOMER(*Customer_ID* [pk], Name) ORDER(*Order_Number* [pk], Customer_ID [fk], Order_Date) PRODUCT(*Product_ID* [pk]) LINE ITEM(*Order_Number* [pk1], *Product_ID* [pk2], Order_Quantity)
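To see that these relations can produce the inquiry screen, here is a sketch of the query using Python's built-in `sqlite3` module. The rows are made up (only customer 1256, product M128, and the date range come from the screen), the second customer is hypothetical, and the table is named `orders` because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product   (product_id TEXT PRIMARY KEY);
    CREATE TABLE orders    (order_number INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customer(customer_id),
                            order_date TEXT);
    CREATE TABLE line_item (order_number INTEGER REFERENCES orders(order_number),
                            product_id TEXT REFERENCES product(product_id),
                            order_quantity INTEGER,
                            PRIMARY KEY (order_number, product_id));

    -- Made-up rows for illustration only.
    INSERT INTO customer VALUES (1256, 'Commonwealth Builder'), (1300, 'Hypothetical Supply');
    INSERT INTO product VALUES ('M128');
    INSERT INTO orders VALUES (1, 1256, '2017-11-05'), (2, 1256, '2017-12-10'),
                              (3, 1300, '2017-11-15'), (4, 1300, '2018-01-03');
    INSERT INTO line_item VALUES (1, 'M128', 20), (2, 'M128', 10),
                                 (3, 'M128', 25), (4, 'M128', 50);
""")

HIGHEST_VOLUME_SQL = """
    SELECT c.customer_id, c.name, SUM(l.order_quantity) AS volume
    FROM customer c
    JOIN orders o    ON o.customer_id = c.customer_id
    JOIN line_item l ON l.order_number = o.order_number
    WHERE l.product_id = ? AND o.order_date BETWEEN ? AND ?
    GROUP BY c.customer_id, c.name
    ORDER BY volume DESC
    LIMIT 1
"""

def highest_volume_customer(product_id, start_date, end_date):
    """Return (customer_id, name, volume) for the top customer, or None if no orders match."""
    return conn.execute(HIGHEST_VOLUME_SQL, (product_id, start_date, end_date)).fetchone()

top = highest_volume_customer("M128", "2017-11-01", "2017-12-31")
```

Dates are stored as ISO 8601 text so that string comparison matches date order. Order 4 falls outside the date range, so it does not count toward the second customer's volume.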
The result of integrating these two separate sets of normalized relations: CUSTOMER(*Customer_ID* [pk], Name) ORDER(*Order_Number* [pk], Customer_ID [fk], Order_Date) PRODUCT(*Product_ID* [pk]) LINE ITEM(*Order_Number* [pk1], *Product_ID* [pk2], Order_Quantity) PRODUCT(*Product_ID* [pk]) LINE ITEM(*Product_ID* [pk1], *Order_Number* [pk2], Order_Quantity) ORDER(*Order_Number* [pk], Order_Date) SHIPMENT(*Product_ID* [pk1], *Invoice_Number* [pk2], Ship_Quantity) INVOICE(*Invoice_Number* [pk], Invoice_Date, Order_Number [fk])
CUSTOMER(*Customer_ID* [pk], Name) PRODUCT(*Product_ID* [pk]) INVOICE(*Invoice_Number* [pk], Invoice_Date, Order_Number [fk]) ORDER(*Order_Number* [pk], Customer_ID [fk], Order_Date) LINE ITEM(*Order_Number* [pk1], *Product_ID* [pk2], Order_Quantity) SHIPMENT(*Product_ID* [pk1], *Invoice_Number* [pk2], Ship_Quantity)
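The merge of the two sets of relations can be sketched in Python as a union of attributes per relation name. The function `integrate_views` is a hypothetical helper, and matching relations purely by name is a simplifying assumption: real view integration also has to deal with relations or attributes that share a name but mean different things, or that mean the same thing under different names.

```python
def integrate_views(*views):
    """Merge relations that share a name, keeping attributes in first-seen order."""
    merged = {}
    for view in views:
        for relation, attributes in view.items():
            target = merged.setdefault(relation, [])
            for attribute in attributes:
                if attribute not in target:
                    target.append(attribute)
    return merged

# Relations from the inquiry screen and the backlog report, as attribute lists.
inquiry_screen = {
    "CUSTOMER": ["Customer_ID", "Name"],
    "ORDER": ["Order_Number", "Customer_ID", "Order_Date"],
    "PRODUCT": ["Product_ID"],
    "LINE ITEM": ["Order_Number", "Product_ID", "Order_Quantity"],
}
backlog_report = {
    "PRODUCT": ["Product_ID"],
    "LINE ITEM": ["Product_ID", "Order_Number", "Order_Quantity"],
    "ORDER": ["Order_Number", "Order_Date"],
    "SHIPMENT": ["Product_ID", "Invoice_Number", "Ship_Quantity"],
    "INVOICE": ["Invoice_Number", "Invoice_Date", "Order_Number"],
}

consolidated = integrate_views(inquiry_screen, backlog_report)
for relation, attributes in consolidated.items():
    print(f"{relation}({', '.join(attributes)})")
```

The result has six relations. ORDER keeps Customer_ID from the inquiry screen view, and SHIPMENT and INVOICE come only from the backlog report view.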
Most reliable method for identifying functional dependency
Knowledge of problem domain, obtained from a thorough requirements analysis
*Relations* for the following report that shows the unit volume of each product that has been ordered, less the amount shipped, through the specified date: *BACKLOG SUMMARY REPORT* *11/30/2017* *PRODUCT ID* --- *BACKLOG QTY* B381 --- 0 B975 --- 0 B985 --- 6 E125 --- 30 M128 --- 2
PRODUCT(*Product_ID* [pk]) LINE ITEM(*Product_ID* [pk1], *Order_Number* [pk2], Order_Quantity) ORDER(*Order_Number* [pk], Order_Date) SHIPMENT(*Product_ID* [pk1], *Invoice_Number* [pk2], Ship_Quantity) INVOICE(*Invoice_Number* [pk], Invoice_Date, Order_Number [fk])
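A backlog figure can be derived from these relations as quantity ordered minus quantity shipped, each restricted to the report date. The sketch below uses Python's built-in `sqlite3` module. The rows are made up and are not the data behind the report, and treating Order_Date and Invoice_Date as the cutoff fields is an assumption about how the report is computed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product   (product_id TEXT PRIMARY KEY);
    CREATE TABLE orders    (order_number INTEGER PRIMARY KEY, order_date TEXT);
    CREATE TABLE invoice   (invoice_number INTEGER PRIMARY KEY, invoice_date TEXT,
                            order_number INTEGER REFERENCES orders(order_number));
    CREATE TABLE line_item (product_id TEXT REFERENCES product(product_id),
                            order_number INTEGER REFERENCES orders(order_number),
                            order_quantity INTEGER,
                            PRIMARY KEY (product_id, order_number));
    CREATE TABLE shipment  (product_id TEXT REFERENCES product(product_id),
                            invoice_number INTEGER REFERENCES invoice(invoice_number),
                            ship_quantity INTEGER,
                            PRIMARY KEY (product_id, invoice_number));

    -- Made-up rows for illustration only.
    INSERT INTO product VALUES ('B381'), ('M128');
    INSERT INTO orders VALUES (1, '2017-11-02'), (2, '2017-11-20');
    INSERT INTO line_item VALUES ('B381', 1, 10), ('M128', 1, 5), ('M128', 2, 4);
    INSERT INTO invoice VALUES (501, '2017-11-10', 1), (502, '2017-12-05', 2);
    INSERT INTO shipment VALUES ('B381', 501, 10), ('M128', 501, 5), ('M128', 502, 2);
""")

BACKLOG_SQL = """
    SELECT p.product_id,
           COALESCE((SELECT SUM(l.order_quantity)
                     FROM line_item l
                     JOIN orders o ON o.order_number = l.order_number
                     WHERE l.product_id = p.product_id
                       AND o.order_date <= :cutoff), 0)
         - COALESCE((SELECT SUM(s.ship_quantity)
                     FROM shipment s
                     JOIN invoice i ON i.invoice_number = s.invoice_number
                     WHERE s.product_id = p.product_id
                       AND i.invoice_date <= :cutoff), 0) AS backlog_qty
    FROM product p
    ORDER BY p.product_id
"""

def backlog_summary(cutoff):
    """Return [(product_id, backlog_qty)] as of the ISO-formatted cutoff date."""
    return conn.execute(BACKLOG_SQL, {"cutoff": cutoff}).fetchall()

november_backlog = backlog_summary("2017-11-30")
```

The two totals are computed in separate subqueries so that joining line items to shipments does not multiply rows and inflate the sums.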
The coding, done during systems implementation, is written in
Structured Query Language (SQL)
*Relation*
a named, two-dimensional table of data; each relation consists of a set of named columns and an arbitrary number of unnamed rows
It is useful to transform the conceptual data model into
a set of normalized relations.
A primary key may involve
a single attribute or be composed of multiple attributes
The referenced value must exist if a value of one attribute (column) of a relation (table) references
a value of another attribute (either in the same or a different relation)
For example, given the functional dependency Emp_ID→Name, each Emp_ID value can have only
one Name value associated with it
A table is automatically in second normal form (2NF) when it is in first normal form (1NF) and contains
only a single attribute as the primary key
The E-R diagram you developed in conceptual data modeling is another source of insight into
possible data requirements for a new application system
The determinants are the
primary keys of the new relations
A foreign key must satisfy
referential integrity
Developing a physical database design from the logical database design requires the use of
relational database management systems
The most common style for a logical database model is the
relational database model
Entries in columns are from the
same set of values
Entries in cells are
simple
In *physical database design*, you translate the relations from logical database design into
specifications for computer files
Translate a relational database model into a
technical file and database design that balances several performance factors
Whereas logical design is meant for ensuring data integrity, *physical design* is meant for ensuring
the *best system performance* in terms of minimizing storage space and maximizing computational speed
To convert a relation into 2NF, decompose the relation into new relations using
the attributes, called *determinants*, that determine other attributes.
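The decomposition can be sketched in Python using the chapter's (Emp_ID, Course) → Date_Completed example. The helper `decompose_to_2nf` and the extra attributes Name and Dept are assumptions added for illustration, and the functional dependencies are taken as given rather than discovered from data.

```python
def decompose_to_2nf(primary_key, nonkey_attrs, fds):
    """Split off one relation per partial dependency; its determinant becomes the new key.

    Returns {key attributes (tuple): [nonkey attributes]}.
    """
    pk = frozenset(primary_key)
    relations = {}
    moved = set()
    for determinant, dependents in fds:
        if frozenset(determinant) < pk:  # proper subset of the key: partial dependency
            relations.setdefault(tuple(determinant), []).extend(dependents)
            moved.update(dependents)
    # Attributes that depend on the whole key stay with the original primary key.
    relations[tuple(primary_key)] = [a for a in nonkey_attrs if a not in moved]
    return relations

emp_course_2nf = decompose_to_2nf(
    primary_key=("Emp_ID", "Course"),
    nonkey_attrs=["Name", "Dept", "Date_Completed"],
    fds=[
        (("Emp_ID",), ["Name", "Dept"]),               # depends on part of the key
        (("Emp_ID", "Course"), ["Date_Completed"]),    # depends on the whole key
    ],
)
print(emp_course_2nf)
```

Name and Dept move to a relation keyed on Emp_ID, and Date_Completed stays with the full key (Emp_ID, Course).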
A primary key is how rows are ensured to be
unique.
Attribute B is not functionally dependent on attribute A when A does not
uniquely determine B (two rows with the same value of A have different values of B).
When *selecting media and structures* for storing data to make access more efficient, the choice of media affects the
utility of different file organizations
*Second Normal Form (2NF)*
• 1NF • Each nonprimary attribute is identified by the whole primary key (called full functional dependency) • No nonprimary attribute is functionally dependent on part, but not all, of the primary key
*First Normal Form (1NF)*
• Unique rows, no multivalued attributes • All relations are in 1NF
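Removing a multivalued attribute can be sketched in Python as follows. The function `to_first_normal_form` and the sample records are hypothetical; the idea is to emit one row per value so that every cell holds a single, simple value.

```python
def to_first_normal_form(records, multivalued, single_name):
    """Return one row per value of the multivalued attribute, stored under single_name."""
    rows = []
    for record in records:
        for value in record[multivalued]:
            row = {k: v for k, v in record.items() if k != multivalued}
            row[single_name] = value
            rows.append(row)
    return rows

# Hypothetical records in which Courses holds several values per employee.
unnormalized = [
    {"Emp_ID": 100, "Name": "Simpson", "Courses": ["SPSS", "Surveys"]},
    {"Emp_ID": 140, "Name": "Beeton", "Courses": ["Tax Acc"]},
]

rows_1nf = to_first_normal_form(unnormalized, "Courses", "Course")
for row in rows_1nf:
    print(row)
```

Each resulting row is identified by the combination (Emp_ID, Course). Name now repeats for every course an employee has taken, which is the kind of redundancy that 2NF addresses.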
