Database Design

¡Supera tus tareas y exámenes ahora con Quizwiz!

Map Dependent (Weak) Entities

Becomes a separate table with a foreign key taken from the primary key of the table for the strong entity Primary key is composed of the partial identifier of the dependent entity plus the primary key from the table for the strong entity (thus, creating a composite PK)

Data Integrity

Business rules may be defined that maintain the integrity of the data that is manipulated

Business rules

Can be used to define or constrain some aspect of a business's structure or processes Database systems should support business rules that are relevant to the specific functions supported by the system

Two Cases When you Must Use Associative Entities

Case 1: If it is necessary to assign an identifier to uniquely identify each occurrence of the M:M relationship between the original entities, then an associative entity must be created. (Entities have identifiers; relationships do not.) Case 2: In a ternary relationship, if the optionalities at all three entities are not identical, then an associative entity must be created.Otherwise, the ERD cannot depict the interactions between the entities without ambiguity.

Object-Relational Database Model

Combine: Ability of object technology to handle advanced relationship types Data integrity, reliability, and recovery features of the traditional relational model Represented by the most popular and powerful of modern database management systems E.g., Oracle, IBM DB2, Microsoft SQL Server

Map Regular Entities to Tables

Composite Attributes: Use only their simple, component attributes Multivalued Attributes: Become a separate table with a foreign key taken from the table for the original entity Derived Attributes: Are not included in a relational schema (since, by definition, they represent data that are not stored, only calculated as needed)

Hardware and software characteristics

Computer, network, and DBMS characteristics

Map Supertype/Subtype Structures

Create a separate table for the supertype and each of the subtypes Assign common attributes, including subtype discriminator, to the supertype table Assign to the subtype tables those attributes unique to each subtype Assign to the subtype tables the primary key of the supertype table (which also functions as a FK referencing the supertype)

Relational Database Model Components

Data Structure, Data Manipulation, and Data Integrity

Data Structure

Data are organized in two-dimensional tables (also called relations) with columns and rows

Information

Data processed to increase the knowledge of the people who use it

Application characteristics

Data requirements and application priorities

Operational requirements

Data security and backup & recovery concerns

Data Manipulation

Data stored in the tables may be manipulated through the use of a command language (Structured Query Language - SQL - was developed expressly for this purpose)

Physical database design inputs include

Data tables described in logical database design, Business environment requirements, Data characteristics, Application characteristics, Operational requirements, and Hardware and software characteristics

Metadata

Data that describes the properties or charac-teristics of other data (a means of providing context for data)

Data characteristics

Data volume and volatility assessments

Foreign keys

Define relationships between tables

Transaction Processing Systems (TPS)

Designed to support the processing of everyday operational business transactions (i.e., retrieving, adding, modifying, and deleting data) Represent the vast majority of database systems Are the primary focus of this course

Relational Database Model Constraints

Domain Integrity, Policy Integrity, Entity Integrity, and Referential Integrity

Primary key

Enforces uniqueness for each record

Entity-Relationship Diagram (ERD)

Entities: named as singular nouns Attributes: named as singular nouns Relationships: named as verbs or verb phrases

E-R Model

Entity, Attribute, and Instance

Data

Facts that can be recorded and stored using computer media

Removing Transitive Dependencies

For each nonkey attribute that is a determinant in a table, create a new table; that attribute becomes the primary key of the new table Move all of the attributes that are functionally dependent on that determinant attribute from the old table to the new table Leave the attribute that serves as the primary key in the new table in the old table to serve as a foreign key to allow the tables to be related

Referential Integrity

For every value of a foreign key there must be a primary key with that value For every value of CustomerID in the Order table there must be a matching value of CustomerID in the Customer table The primary key must exist before the foreign key can be defined

Database Management System (DBMS)

General-purpose software system that facilitates the processes of defining, creating, using, and maintaining databases

Map Associative Entities

Identifier Not Assigned Default primary key for the table formed for the associative entity is typically a composite PK composed of (at least) the primary keys of the two entities Identifier Assigned May use if one exists that is natural and familiar to end-users Must use if the composite PK can not be made unique by adding intersection data

candidate key

If a table has more than one column that provides a way of uniquely identifying the rows of the table When there is more than one candidate key, one of them must be chosen to be the primary key of the table

Third Normal Form

Important points about the 3NF structure are: It is free of unnecessary data redundancy All foreign keys appear where needed to logically tie together related tables It is the same structure that would have been derived from a properly drawn ERD of the same business environment Normalization can be used to check the structures of any relational tables, even those created through ERD conversions, for 3NF (e.g., to check for the possible existence of previously unidentified transitive dependencies)

Functional Dependency

A constraint between two attributes in which the value of one attribute is determined by the value of another attribute. The value of one attribute (the determinant) determines the value of another attribute A -----> B "A determines B" or "B is functionally dependent on A"

Attribute

A discrete data element Describes an entity (i.e., is a characteristic) Meaningful (for the system being modeled) Value may be required or optional

Data Normalization

A formal process for grouping attributes into tables A tool to validate and improve logical designs so that they satisfy certain constraints to avoid unnecessary duplication of data The process of decomposing a table with anomalies into two or more, smaller, well-structured tables

Inheritance Rule

An instance of a subtype is also an instance of the supertype Subtype entities inherit values of all attributes of the supertype

Database

An organized collection of logically related data. Includes data and metadata.

Physical Database Design

Physical design involves enhancing and/or modifying a logical database design to improve the performance of the run-time environment This may involve adding additional structures (e.g., indexes) to the logical database design, and/or modifying the 3NF tables produced by the logical database design Thus, the well-structured relational tables produced either by the conversion from ERDs or by the data normalization process form the starting point for physical database design

Supertype/Subtype Relationships Rule

Relationships shown at the supertype level indicate that all subtypes will participate in the relationship The instances of a subtype may participate in a relationship unique to that subtype; in this situation, the relationship is shown at the subtype level

Business environment requirements

Response time and throughput requirements

Decision Support Systems (DSS)

Specifically designed to aid managers in decision-making tasks Contain data that has been accumulated over time to aid in trend analysis and forecasting Data about subjects must be organized in such a way that it provides a unified, overall picture of all the important details about the subjects over time Data warehouses are the most common example Data marts are subsets of data warehouses designed to support a part of an organization

Relational Model

Table (Relation), Column (Attribute), and Row (Tuple)

First Normal Form

Table has no multivalued attributes a table that has multivalued attributes is unnormalized in this context, a multivalued attribute is sometimes referred to as a repeating group

Second Normal Form

Table is in 1NF and has no partial functional dependencies (that is, every nonkey attribute is fully functionally dependent on the entire primary key) This means that every nonkey attribute must be determined by the entire primary key, not by only part of the primary key

Third Normal Form

Table is in 2NF and no transitive dependencies (functional dependencies between nonkey attributes) This means that no nonkey attribute should be able to determine another nonkey attribute

Well-Structured Tables

Tables that contains minimal redundancy and that allow users to insert, delete, and modify table rows without errors or inconsistencies are considered to be well-structured When designing relational database tables, we seek to eliminate anomalies through the use of normalization

Logical Database Design

The process of arranging the entities and attributes of the conceptual data model of the business environment into the tables and columns of a relational database structure to serve that business in an information system The goal is to create well-structured tables (i.e., free of anomalies) that properly reflect the organization's business environment (see the following slides for an explanation of anomalies)

Candidate Keys as Determinants

There is one exception to the rule that, in third normal form, nonkey attributes are not allowed to define other nonkey attributes The rule does not hold if the defining nonkey attribute is a candidate key of the table Thus, candidate keys in a table may define other nonkey attributes without violating third normal form

Rule Restricting the Use of Supertype/Subtypes

You may use this type of structure only when either (or both) of the following are present: 1. When there are attributes that apply to some (but not all) of the instances of an entity type 2. When the instances of a subtype participate in a relationship unique to that subtype

Overlap

a composite attribute whose subparts pertain to different subtypes; each subpart contains a boolean value (yes or no) to indicate whether or not the instance belongs to the associated subtype

Composite

a related group of attributes example: Address (Street, City, State, Zip)

Disjoint

a simple attribute with alternative values to indicate the possible subtypes

Subtype

a subgrouping of a supertype entity that is meaningful to an organization (not just possible) shares all attributes of its supertype, but also has unique attributes of its own and/or : has relationships with other entities distinct from those of other subtypes

Enhanced Entity-Relationship Model

addressed shortcomings of the basic E-R model not universally agreed upon in some respects introduced the supertype/subtype structure

Identifier

an attribute that uniquely identifies each entity instance (e.g., Social Security Number) can be one or more attributes (e.g., First Name, Middle Name, and Last Name) Create an identifier if there is no obvious identifying attribute (e.g., Part Number) Identifier of a dependent (weak) entity is its partial identifier combined with identifier of its owner

candidate key

any attribute that would qualify as an identifier (e.g., SSN and Employee ID)

Deletion Anomalies

are experienced when a value for a column we wish to keep is unexpectedly removed when a value for another column is deleted e.g., cannot delete the sole order for a customer without deleting the only copy of the customer's information also

Update Anomalies

are experienced when changes to multiple rows of a table are needed to effect an update to a single value of a column e.g., cannot completely update a customer's address without changing it for every order placed by that customer

Insertion Anomalies

are experienced when we attempt to store a value for a column but cannot because the value of another column is unknown e.g., cannot add a new customer's information until an order number is ready to be entered (because OrderID column serves as the primary key for the table and cannot have null values)

Simple

at the atomic, most basic level

Derived

calculated, but not stored (e.g., Total)

Referential Integrity

constrains a foreign key value to match a primary key value in a related table more on this concept to come

Domain Integrity

constrains allowable values for columns (e.g., data type, column size, maximum value, etc.)

Entity Integrity

constrains data operations to business rules (e.g., only managers may place vendor orders)

Dependent (Weak) Entity

dependent on a strong entity; cannot exist on its own does not have a unique identifier, only a partial identifier (shown w/double underline) represented with rectangle symbol with lines in corners

Strong Entity

exists independently of other entities has its own unique identifier (shown w/single underline) represented with regular rectangle symbol

Conceptual Data Modeling

goal is to arrive at an understanding of the principal data sources and data elements of interest to the business or organization, and the relationships between the data sources, in order to satisfy requirements for information tool is the Entity-Relationship Diagram (ERD)

Logical Data Modeling

goal is to convert the conceptual model into a form that can be utilized to create an IS (e.g., a relational database) tool is the Relational Schema

Physical Data Modeling

goal is to specify all identifying & operational characteristics of the data that will be recorded in the information system tools include the Data Dictionary and others

Each table (relation)

in a given database has a unique name

Every row (tuple)

in a table is unique

Referential Integrity

involves the circumstance of referring to a specific data row in one table in the database based on the value stored in a column in another table This constraint states that a foreign key value can not be stored in a table unless a matching value can be found in the primary key of the related table

A relationship:

is an association between entities: The degree of a relationship indicates the number of entities involved in the relationship specified as unary, binary, ternary, or higher (n-ary) The cardinality of a relationship describes the number of instances of one entity that may be associated with one instance of another entity specified as either one or many (many = one or more) The optionality (modality) indicates whether partici-pation in the relationship is required or not specified as either mandatory or optional

referential integrity arrows

linkages between the tables used to connect the FK of one table to the PK of another table

Identifying Relationship

links strong entities to dependent (weak) entities represented with diamond symbol with lines in corners

Data dictionary

location where definitions of data (metadata) are maintained

Multivalued

multiple values per entity instance are possible (e.g., Degree, Club, Skill)

unary relationship

only one entity is involved.

Single Valued

only one value per entity instance (e.g., Last Name, Date Of Birth)

Anomalies

problems that are experienced when attempting to manipulate stored data

E. F. Codd

published an article titled "A Relational Model of Data for Large Shared Data Banks" in the Communications of the ACM journal.

Data access

read, insert, update, and delete

derived attributes

represent values that are calculated when needed and not stored (e.g., values generated by a query for use on a report) These are only optionally shown on an ERD When shown, diagrammed within square brackets Example: Acquisition Cost * (1 + Percent Markup) = Selling Price

Foreign Keys

serves as a primary key in another table means by which tables are linked together to repre-sent unary, binary, ternary, etc. relationships

Entity

something of interest in the environment (e.g., person, place, object, event, concept) that is described by attributes and that will have numerous instances we wish to track

Columns

store attribute data

data

the actual values of interest

ternary relationship

three entities are simultaneously involved.

binary relationship

two entities are involved.

Rows

unique instances or records of entity

Each column (attribute)

within a given table has a unique name Every column (attribute) is single-valued Thus, multivalued attributes require special treatment when designing relational tables

Map Ternary (and n-ary) Relationships

One table for each original entity and one for the common relationship (associative entity) (i.e., a ternary relationship maps to a total of four tables) Table representing the associative entity has foreign keys to each entity in the relationship PK of the table formed for the associative entity is typically a composite PK composed of (at least) the primary keys of the three entities

Map Binary Relationships

One-to-Many - Primary key on the one side becomes a foreign key on the many side Many-to-Many - Create a new table; the primary key of the new table is typically a CPK comprised of (at least) the primary keys of the two entities involved in the relationship One-to-One - Primary key on the mandatory side becomes a foreign key on the optional side (if optionalities are asymmetric)

Map Unary (Recursive) Relationships

One-to-Many: Recursive foreign key in the same table (also true for unary One-to-One) Many-to-Many (e.g., bill of materials): Two tables result: One for the entity type One for an associative relation in which the primary key has two fields, both taken from the identifier of the original entity

Supertype

(example: Employee) a generic entity that has a relationship with one or more subtypes

Supertype/Subtype Constraints

1. Completeness Constraint: Addresses the question of whether an instance of a supertype must also be a member of at least one subtype 2. Disjointness Constraint: Addresses the question of whether an instance of a supertype may simultaneously be a member of two (or more) subtypes

Disjointness Constraint: Two Possible Rules

1. Disjoint Rule (Letter "d" notation) Specifies that if an entity instance is a member of one subtype, it cannot simultaneously be a member of any other subtype (Example: all PERSONS are either MALE or FEMALE) 2. Overlap Rule (Letter "o" notation) Specifies that an entity instance can simultaneously be a member of two (or more) subtypes (Example: an ATHLETE can be both a RUNNER and a JUMPER)

Two Processes to Develop Supertype/Subtypes

1. Generalization: The process of defining a more general entity type from a set of more specialized entity types (A "bottom-up" approach) 2. Specialization: The process of defining one or more subtypes of a general entity based on distinguishing attributes or relationships (A "top-down" approach)

Completeness Constraint: Two Possible Rules

1. Total Specialization Rule (Double-line notation) Specifies that each entity instance of the super-type must be a member of some subtype in the relationship (Example: all STUDENTS are either UNDERGRADUATE or GRADUATE students) 2. Partial Specialization Rule (Single-line notation) Specifies that an entity instance of the super-type is allowed to not belong to any subtype (Example: FACULTY and STAFF are not the only possible types of EMPLOYEE)

alternate key

A candidate key that is not chosen to be the primary key

Primary Key (PK)

A column (or columns) whose value uniquely identifies or differentiates each row in a table (e.g., EmployeeID) Required for every table in a relational database Composite Key - a primary key made up of more than one column (e.g., FirstName + MiddleName + LastName)

Foreign Key (FK)

A column in one table that serves as the primary key of another table in the same database (thus serving as a link between the two tables)


Conjuntos de estudio relacionados

Intro to Physical Anthropology Final Study Set

View Set

Mastering Biology: The Chemical Basis of Life

View Set

NCLEX RN Examination Maternity: Antepartum

View Set