Database Test 1
First Normal Form (1NF)
Any relation that does not have multi-valued attributes is in First Normal Form • "Atomic" • In fact, any valid relation is in 1NF, by definition
Network database model
Created to improve on hierarchical model Allows records to have more than one parent Can access data from multiple points
Business processes need to be converted to?
entities and interactions
Types of Databases can be classified according to?
number of users, database locations, and type of use
Relational Database Model - Entity
object/concept about which data is collected and stored Entity instance (or occurrence) = row in a relational table
Problems with a "flat" database
Redundancy (i.e. data duplication), Update anomalies (Update a single publisher's phone number requires you to update multiple entries), Insertion anomalies (You cannot insert information about an author until we have procured a book by that author), and Deletion anomalies (If there is a single book from a particular publisher, and for some reason we decide to remove the record about the book, we lose information about the publisher too)
Relationships and Subtypes
Relationships at the supertype level indicate that all subtypes will participate in the relationship The instances of a subtype may also participate in a relationship unique to that subtype In this situation, the relationship is shown at the subtype level
Data Models
Relatively simple (logical or conceptual) representations of complex real-world data structures Often graphical abstraction of a real-world object or event Useful in understanding complexities of the real-world environment
Partial Specialization
Relaxes the above constraint: every entity in the supertype need not belong to any of the subtypes in the specialization
When we convert EER diagrams to relations?
Some of the richness of the context is lost (although foreign keys do indicate relationships) - All we know is now in the form of some special tables or relations, that satisfy some criteria
Hierarchical database model
Tree-based approach developed in 1960's Based on parent-child relationships (1:M) Each record is linked to all its child records Ex: customer → order(s) → item(s)
Total Specialization
When this constraint is specified, every instance of the supertype does have to belong to one of the subtypes
Steps to Start the normalization process
1. Eliminate repeating groups (multi-valued attributes) 2. Identify the Primary Key 3. Identify all dependencies
Converting to 3NF
1. Make new tables to eliminate transitive dependencies - The determinant of each transitive dependency becomes the primary key of its own new table 2. Reassign corresponding dependent attributes − Remove dependent attributes from the original table and move them to the corresponding new table
Attribute Rules when mapping regular entities to relations
1. Simple attributes: ER attributes map directly onto the relation 2. Composite attributes: Use only their simple, component attributes 3. Multi-valued attribute: Becomes a separate relation with a foreign key taken from the superior entity
Secondary Key
attribute (or attributes) used for data retrieval
Single-valued Attribute
attribute with a single value
Entities have some _____ which are of interest to the 'organization'
attributes
Primary Key
cell value that uniquely identifies a tuple
Foreign Key
cell value which is exactly equal to the primary key value in some other relation
Relational Database Model - Attributes
characteristics of an entity (columns)
DBMS (Database Management System)
collection of programs that manages the database structure and controls access to and retrieval of the data (security)
Relational Database Model - Relationships
connections among data Used to connect tables
In an EER Diagram we attempt to identify a supertype that contains?
the maximum number of common attributes -All subtypes inherit all of the attributes from the supertype -Each subtype has other unique attribute(s)
Database
"a collection of interrelated data organized in such a way that it corresponds to the needs and structure of an organization and can be used by more than one person for more than one application." - i.e.: interrelated files with shared data.
File-based database systems often imply?
"flat" data files
Relational Database Model - Table
"relation" = matrix of intersecting rows and columns
Data Management focuses on?
the proper generation, storage, and retrieval of data and is a core activity of any organization
Mapping Weak Entities
A Weak entity becomes a relation with a foreign key that is adopted from its associated Strong entity The primary key for the Weak entity relation is composed of: -Partial identifier of Weak entity -Primary key of identifying relation (Strong entity) = Composite primary key (multiple attributes)
Primary Key - Relational
A candidate key selected to be used as the primary row identifier Cannot contain NULL entries
Entity Type
A collection of entities that share common properties or characteristics
ER Model
A detailed, logical representation of the data for an organization or business area expressed in terms of Entity Types, Relationships and Attributes A CUSTOMER places an ORDER for a PRODUCT
ER Diagram
A graphical representation of an ER model
Candidate Key
A minimal (irreducible) superkey No subset of attributes in this key can be a superkey
Attribute
A named property or characteristic of an entity type that is of interest to the organization
Entity
A person, place, object, event, or concept in the user environment, about which the organization wishes to maintain data
Normalization
A process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies. - i.e., the process of decomposing relations with anomalies to produce smaller, well-structured relations
Third Normal Form (3NF)
A relation is in third normal form, or 3NF, if it is in second normal form and it contains no transitive dependencies
Well-Structured Relations
A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
Entity Instance
A single occurrence of an entity type. example: Susan and John are employees of a company, both of them are entity instances, they belong to entity type EMPLOYEE
Subtype
A subgrouping of the entities in an entity type that is meaningful to the organization and that shares common attributes or relationships distinct from other subgroupings
Enhanced ER Diagram
An ER diagram that also shows subtypes and supertypes of entities and identifies the relationships between them
Relationships
An association between instances of one or more entity types that is of interest to the organization (usually a VERB)
Foreign Key - Relational
An attribute (or combination of attributes) in one table whose values must either match the primary key in another table, or be NULL
Superkey
An attribute (or combination of attributes) that uniquely identifies each row in a table
Identifier/Primary Key
An attribute (or set or attributes) that uniquely identifies individual instances of an entity type
Subtype Discriminator
An attribute of the supertype whose values determine the target subtype or subtypes
Composite Attribute
An attribute that can be broken down into its component parts
Simple Attribute
An attribute that cannot be broken down into smaller components
Derived Attribute
An attribute whose value can be derived from related attribute values
Associative Entity
An entity that associates the instances of one or more entity types and contains attributes that are peculiar to the relationship between those entity instances
Supertype
An entity type that includes distinct subtypes that needs to be represented in the data model
What are the two ways of viewing a database?
As a set of relations, and the associated definitions of tuples, keys, etc. • Nothing to do with a real-life situation and.... • An implementation of a real-life model, namely the E-R model • A one-to-one mapping between the elements in the model, and the elements of the database
Relational Database Model
Basis for most approaches used today for storing data
Mapping Binary Relationships - Many-to-Many
Create a new relation with the primary keys of the two entities as its primary key
Mapping Supertype/Subtype Relationships
Define one relation for the supertype and one for each subtype Supertype attributes (including identifier and subtype discriminator) go into supertype relation Subtype-specific attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation 1:1 relationship established between supertype and each subtype, with supertype as primary table
Disjoint Constraints
Define whether it is possible for an instance of a superclass to be a member or one or more subclasses simultaneously. May be Disjoint Rule or Overlap Rule
Participation Constraints
Dictate whether every member of a superclass must participate as a member of a subclass. May be Total Participation or Partial Participation
What are the four things that we need to be able to build into a set of relations?
Entities (with simple, composite, or multi-valued attributes), Relationships (one to many (1:M), one to one (1:1), and many to many (M:N)), Associative Entities, and Supertypes/Subtypes
What are the two main types of database integrity constraints?
Entity Integrity and Referential (or Reference) Integrity
Referential Integrity implies that?
Every non-NULL foreign key references an existing primary key value Enforcing this rule makes it impossible to have an invalid foreign key value, or to delete a row whose primary key has matching foreign key values in another table
Entity Integrity implies that?
Every relation in a relational data model has a unique primary key The data values for all attributes in the primary key are not NULL Enforcing this rule ensures that every row has a unique identity and thus allows foreign key values to properly reference primary key values EX: Customers must be uniquely identified by their ID number
Properties of relations
Every relation must have a distinct name Each cell of the relation is atomic A particular tuple can have only one value for an attribute Note that two different tuples can have the same value for an attribute Each attribute in a table must have a distinct name The values of any attribute must be from the same domain There should not be any duplicate tuples Order of tuples has no significance Order of attributes has no significance
What are the 3 types of database models?
Hierarchical database model, Network database model, and Relational database model
Functional Dependency
If A and B are two sets of attributes of a relation R, then B is functionally dependent on A if each value of A in the relation is associated with exactly one value of B - Represented by A −> B - A is called the DETERMINANT - If we know the values of A, the values of B are uniquely determined
Non-Disjoint (Overlap) Constraint Rule
If subtypes of a specialization are not disjoint then an entity may be a member of more than one subtype of a specialization
Computerized Manual Filing Systems
Individual files for each purpose Each business unit had its own file system Organized to facilitate the expected use of the data Data processing specialist required to retrieve data and run reports
What are some examples of DBMS software?
Microsoft Access, Microsoft SQL Server, Oracle, MySQL
What is the solution to database problems like redundancy and anomalies that make it harder to create, maintain, or modify the database?
Normalization
Characteristics of Relations
Not all database tables qualify as relations • Requirements (to be a relation): - Every relation has a unique name - Every attribute value is atomic - Every row is unique - Attributes (columns) in a given table have unique names - The order of the columns is irrelevant - The order of the rows is irrelevant
Mapping Binary Relationships - One-to-One
Primary key on mandatory side becomes a foreign key on optional side
Mapping Binary Relationships - One-to-Many
Primary key on the "one" side becomes a foreign key on the "many" side
Converting to 2NF
Starting with a relation in 1NF, do the following: 1. Make new tables to eliminate partial dependencies - A new table is needed for each component of the primary key that is a determinant in a partial dependency - The original (composite) primary key should remain in the original table 2. Reassign the corresponding dependent attributes - Remove dependent attributes from the original table and move them to the corresponding new table • Note: any relation in 1NF that has a single-attribute primary key, is already automatically in 2NF
Attribute Inheritance
The concept that if an entity is a subtype it retains all the attributes of the supertype. All attributes of a supertype are 'inherited' by a subtype and hence need not be shown explicitly
Degree of a Relationship
The number of entity types that participate in a relationship Unary: (degree 1) - Also called "Bill of Materials" or "Recursive" Binary: (degree 2) - Most common Ternary: (degree 3)
Cardinality
The number of instances of one entity that can (or must) be associated with each instance of another entity
Specialization
The process of maximizing the differences between members of an entity by identifying their distinguishing characteristics...a top-down approach
Generalization
The process of minimizing the differences between entities by identifying their common features...a bottom-up approach
Information
The result of processing raw data to reveal its meaning in a given context
Boyce-Codd Normal Form (BCNF)
a table is in Boyce-Codd Normal Form or BCNF, if every determinant in the table is a candidate key
What are some issues with file-based systems?
data redundancy (different files contain the same information like id, address, etc), isolation of data in separate systems, data inconsistency (lack of data integrity), data anomalies (update/insert/delete anomalies), program-data dependence (definition of data is coded into application programs)
The number of entities involved in a relationship gives us the?
degree of the relationship
How do we reduce the number of entities in an ER Diagram?
generalize attributes entities have in common into "sets" of entities using an Enhance ER (EER) Diagram
What are the components of a database system?
hardware, software, people, procedures, and data
Multi-valued Attribute
may take on more than one value for a given entity instance
As a general rule of thumb: a table should not pertain to?
more than one entity type
Normalization is a "check" against?
potential mistakes/ inefficiencies that might be in the structure of the database tables
ER Diagrams enable us to understand not only the data, but also the?
processes
Data
raw facts that can be recorded and preserved
ER Diagrams can be complex (ex. multi-valued attributes, associative entities, etc) - however?
relations cannot -We have to suitably break down the complexity -Relations present one consistent method of representing the data - everything is in the form of tables
Interactions between entities represent?
relationships
Tuple
row of a relation, with each cell consisting of specific values for a particular variable or attribute
Relation
set of tuples (table)
Disjoint Constraint Rule
specifies that if the subtypes of a specialization are disjoint then an entity can be a member of only one of the subtypes of the specialization
What are some examples of people in a Database System?
system admins, database admins (DBAs), database designers, system analysts and programmers and end users
Multi-User Database
the database used the most often, especially by businesses. includes workgroup database and enterprise database
Second Normal Form (2NF)
• A relation is in second normal form or 2NF if it is in first normal form, and every nonkey attribute is fully functionally dependent on the primary key • "Fully" implies no partial functional dependencies, i.e. no non-key attributes are functionally dependent on only part of the primary key