Database Design!
What are the three types of relationships in information modeling?
- Association = instance connection (1:1, 1:m, n:m) - Aggregation = part-subpart correspondence between things - Gen/Spec. = similarity among things
What is a conceptual data model?
- Shows data requirements of an organization at a high level of abstraction - Does NOT have implementation details - Includes objects, relationships, constraints
What is a logical data model?
- derived from conceptual data model with implementation included (aka Implementation Data Model) - includes more details - uses commercial DBMS software (Oracle, MySQL)
What is a physical data model?
- uses selected DBMS language to translate logical data model into physical representation within the DBMS - internal storage structure and file organizations
candidate key
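A key that could serve as the primary key. When choosing the primary key from the candidate keys, prefer the one that is: 1. Least likely to have its value changed 2. Least likely to lose its uniqueness 3. Contains the fewest characters 4. Easiest for the user to use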
Testing a Database
1. Should be performed in order to locate and fix errors 2. Should follow a carefully designed strategy
Database Requirement Document
1. Well defined Mission statement. 2. Operational scenarios
Conceptual Design Phase
1. Identifying entities 2. Identifying attributes and attribute domains for entities 3. Identifying relationships 4. Identifying candidate and primary keys for entities 5. Creating an entity-relationship (ER) diagram.
3NF
It's in 2NF, and no non-key attribute is transitively dependent on the primary key.
Update Anomaly
A circumstance in which redundant data in a relation may not be properly updated.
data model
A formal expression of data, data relationships, and constraints on the data. Expresses these data relationships to a DBMS
Data Control Language (DCL)
A language used to control access to data in a database. Defines privileges for users of the database.
Deadlock
A situation in which two or more transactions are simultaneously waiting for another transaction to release a lock on a data item.
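One common way to reason about deadlock is a wait-for graph: each transaction points at the transactions whose locks it is waiting on, and a cycle means deadlock. The sketch below is only an illustration; the function name `find_deadlock` and the transaction labels are hypothetical, and a real DBMS lock manager is considerably more involved.

```python
def find_deadlock(wait_for):
    """Return the transactions forming a cycle in the wait-for graph, or None."""
    visited = set()
    on_path = []  # transactions on the current depth-first search path

    def visit(txn):
        if txn in on_path:
            # We came back to a transaction already on the path: a cycle.
            return on_path[on_path.index(txn):]
        if txn in visited:
            return None
        visited.add(txn)
        on_path.append(txn)
        for blocker in wait_for.get(txn, ()):
            cycle = visit(blocker)
            if cycle:
                return cycle
        on_path.pop()
        return None

    for txn in wait_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


# T1 waits for a lock held by T2, and T2 waits for a lock held by T1.
deadlocked = find_deadlock({"T1": ["T2"], "T2": ["T1"]})
# T1 waits for T2, but T2 is not waiting on anyone.
no_deadlock = find_deadlock({"T1": ["T2"], "T2": []})
```

Once a cycle is found, a DBMS typically picks one transaction in it as the victim and rolls it back so the others can proceed.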
Entities
A uniquely identifiable element about which data is stored in a database. Can be any element about which data needs to be stored. Need not be tangible objects
Determinant
An attribute or group of attributes on which another attribute is functionally dependent.
What are the relational DB technologies?
Anomalies, Normalization, 1NF, 2NF, 3NF
Three Tier
Client: User Interface. Application Server: Business Logic, Data Processing. Database Server: Data Validation, Database Access.
Two-tier
Client: User Interface + Business & Data Processing Logic. Server: Server-side Validation, Database Access.
Durability
Committed transactions are permanent and should not be lost owing to any ensuing malfunction.
What is the construct of a relationship? What is it dependent on?
Connection between things. Dependent on business rules.
ER Model
Contains rectangles (entities) connected through intermediate diamonds (relationships). Conceptual high-level diagram of the structure of a database. Provides a basic framework and a user's view of data.
Integrity Controls
Controls that prevent invalid data from being entered into the database. Invalid entries can lead to misleading or incorrect data.
database strategy
Creating a corporate data model is a common activity during the planning phase.
Backup and Repair
Creating copies of their database to ensure data access and security. Copying log files to help restore a database.
Transactions criterion
Criteria refer to: backup and recovery capabilities, logging systems, concurrency control, rollback and commit support, and support for stored procedures.
Utilities criterion
Criteria refer to: performance monitoring, use monitoring, support for database administration, and database tuning capabilities
Encryption
Database protection technique that encodes data and requires a specific key to decode it.
Operational Scenario
Describes one or more beginning-to-end transactions involving the database system and its environment.
In order to have a complete and correct Information Model:
Does your model: Capture all the relevant objects and their relationships? Satisfy all the requirements? Satisfy all the requirements efficiently?
Commit
Final step in a successful completion of a database transaction in which the database attains a new consistent state.
Granting Privileges
GRANT (Options: SELECT, UPDATE, INSERT,etc) ON Employee TO emp_00004;
All Privileges
GRANT ALL PRIVILEGES ON Employee TO Emp_Users; or GRANT ALL ON Employee TO Emp_Users;
With Grant Options
GRANT SELECT ON Employee TO emp_00004 WITH GRANT OPTION;
Transitive dependency
In a functional dependency, if B is functionally dependent on A, and C is functionally dependent on B, C is then said to be transitively dependent on A.
partial functional dependency
In a functional dependency, when B is functionally dependent on A, and an attribute can be removed from A and the dependency still exists, B is said to be partially dependent on A.
2NF
It's in 1NF. All non-key attributes are dependent on all parts of the primary key.
Union-compatibility
Means that the two relations must have the same number of attributes with matching domains
Views
The view mechanism is a powerful security tool that can be used to hide many parts of the database from unauthorized users.
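As a minimal sketch of how a view hides data, the example below uses Python's built-in sqlite3 module. The table layout and the view name `EmployeePublic` are assumptions made for illustration. SQLite has no GRANT statement, so in a multi-user DBMS you would additionally grant users SELECT on the view rather than on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (emp_id INTEGER PRIMARY KEY, first_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO Employee VALUES (5, 'Jill', 28000)")

# The view exposes only the non-sensitive columns; salary stays hidden.
conn.execute("CREATE VIEW EmployeePublic AS SELECT emp_id, first_name FROM Employee")

cur = conn.execute("SELECT * FROM EmployeePublic")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```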
Authorization Controls
Mechanisms built into database requiring user IDs and passwords.
1NF
No repeating groups. All data values are atomic. Still subject to insertion, deletion and update anomalies.
System R
IBM research project that led to commercial relational database development (IBM's SQL/DS and DB2; it also influenced Oracle).
Data Dictionary
Provides: 1. Each attribute's domain 2. Any default values (if present and permitted) for each attribute 3. Whether each attribute's value can be null.
Optimistic Concurrency Control
Read, Validate, Write
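The read, validate, write phases can be sketched with a version counter: a writer remembers the version it read, and the write is accepted only if the version is unchanged at validation time. This is one common way to do validation, not the only one; `VersionedRecord` and `optimistic_update` are hypothetical names.

```python
class VersionedRecord:
    def __init__(self, value):
        self.value = value
        self.version = 0  # incremented on every successful write


def optimistic_update(record, read_version, new_value):
    # Validate: reject if someone else wrote since we read.
    if record.version != read_version:
        return False
    # Write: apply the change and advance the version.
    record.value = new_value
    record.version += 1
    return True


rec = VersionedRecord(100)

# Read phase: two transactions read the same version of the record.
t1_version, t1_value = rec.version, rec.value
t2_version, t2_value = rec.version, rec.value

first = optimistic_update(rec, t1_version, t1_value + 50)   # validates and writes
second = optimistic_update(rec, t2_version, t2_value - 30)  # fails validation
```

The rejected transaction would normally re-read the record and retry, so no update is silently lost.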
Data Definition Language (DDL)
SQL subset, also referred to as the schema definition language. Used to create relations, domains, views and access privileges in the database.
Data Manipulation Language (DML)
SQL subset used to operate on the data, including retrieval, update, delete and insertion operations
Diamond
Shape used in a Chen ER model to indicate a relationship between entities
RESTRICT
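Referential action that rejects the delete or update of a row still referenced by a foreign key. Commonly treated as synonymous with NO ACTION, although the SQL standard distinguishes the two by when the check is made.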
Degree
The number of attributes (columns) in a relation
Decomposition
The process of creating new relations from existing relations based on functional dependencies within the original relation.
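A small sketch of decomposition, assuming a relation where emp_id determines dept_id and dept_id determines dept_name (a transitive dependency). The attribute names and the helper `project` are made up for illustration. Splitting along the dependency removes the repeated department name, and joining the pieces on dept_id reproduces the original rows.

```python
rows = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 3, "dept_id": 20, "dept_name": "IT"},
]


def project(relation, attrs):
    """Keep only the given attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


# Decompose along the dependency dept_id -> dept_name.
employee = project(rows, ["emp_id", "dept_id"])
department = project(rows, ["dept_id", "dept_name"])

# Natural join on dept_id to check that no information was lost.
rejoined = [
    {**e, **d}
    for e in employee
    for d in department
    if e["dept_id"] == d["dept_id"]
]
```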
Denormalization
The rejoining of relations that were decomposed during normalization.
Atomicity
The transaction is a complete unit, and is executed in its entirety or not at all.
Isolation
The transaction is independent from other transactions. An incomplete transaction should not be visible to other transactions.
Consistency
The transaction must change the database from one consistent state to another consistent state.
Privileges
Type of action a user can take regarding a specific database object.
Rollback
Undoing a partially completed database transaction, in which the database is returned to the consistent state it was in before the transaction began.
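The following sketch shows a rollback with Python's sqlite3 module. The `Account` table, its CHECK constraint, and the `transfer` helper are assumptions for illustration. The second UPDATE violates the constraint, so the whole transfer is undone, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing unit."""
    try:
        conn.execute("UPDATE Account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE Account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative: undo the credit as well.
        conn.rollback()
        return False


ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT id, balance FROM Account ORDER BY id").fetchall()
```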
SQL INSERT
Used to add records to a database. Ex: INSERT INTO Employee VALUES (00005, 'Jill', 'Jackson', '01-10-80', 's_rep', '1003', '28000');
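A hedged variant of the INSERT above, run through Python's sqlite3 module. The column names are assumptions, since the original statement does not list them. Naming the columns explicitly keeps the statement valid if the table's column order changes, and storing the ID as text preserves its leading zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column names; the source only shows the values.
conn.execute(
    """CREATE TABLE Employee (
           emp_id TEXT PRIMARY KEY,
           first_name TEXT,
           last_name TEXT,
           birth_date TEXT,
           job_code TEXT,
           dept_id TEXT,
           salary INTEGER
       )"""
)

# Placeholders (?) let the driver handle quoting of the values.
conn.execute(
    "INSERT INTO Employee "
    "(emp_id, first_name, last_name, birth_date, job_code, dept_id, salary) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("00005", "Jill", "Jackson", "01-10-80", "s_rep", "1003", 28000),
)
conn.commit()

inserted = conn.execute("SELECT emp_id, last_name, salary FROM Employee").fetchall()
```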
What is a Relational Data Model?
Uses concept of normalization; collection of related relations
Lost updates
When simultaneous updates occur to a relation, one update may overwrite another.
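A lost update can be simulated without a database by interleaving two read-then-write sequences by hand. The values below are invented for illustration.

```python
balance = {"value": 100}

# Both transactions read the same starting balance.
t1_read = balance["value"]
t2_read = balance["value"]

# T1 writes its result, then T2 writes a result based on its stale read.
balance["value"] = t1_read + 50
balance["value"] = t2_read - 30

lost_update_result = balance["value"]  # T1's +50 has been overwritten
serial_result = 100 + 50 - 30          # what running T1 then T2 would give
```

Locking or a validation step before the write is what prevents the second write from silently discarding the first.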
Database Prototype
Working model of both database and the application that accesses the database
Data Model
a formal expression of data, data relationships, and constraints on the data.
One of the main goals in information modeling is to....?
capture the objects and their relationships from the real world that are important to the business under study
What is an Information Model?
describes the structure of objects in a system--identity, attributes, relationships to other objects and their operations
data definition functionality criterion
functionality refers to: enforcement of primary keys, foreign key support, support for data types, domain support, data integrity mechanisms, support for views, data dictionary, data model type, and schema support
timestamp
is a distinct identifier that the DBMS assigns to a transaction. Timestamp methods prevent conflicts by rolling back any transactions involved in a conflict and restarting them.
Granting TO PUBLIC
keyword allows the DBA to grant privileges to all users in one statement. Ex: GRANT SELECT ON Employee TO PUBLIC;
Concurrent transactions only read data
no conflict exists and the order of execution is not significant.
Concurrent transactions read or write entirely different data structures
no conflict exists and the order of execution is not significant.
Insertion anomaly
occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
Unrepeatable query results:
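problems can occur if a database query accesses only partially updated data, so that repeating the query returns different results. Often grouped with the dirty read, which strictly means reading another transaction's uncommitted data.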
Uncommitted updates
problems can occur when one transaction is allowed to see the intermediate results of another transaction.
Normalization
process of organizing and refining relations within a relational database. It usually reduces the duplication of data items within the database, at times reducing the amount of storage space needed for the base tables. It addresses insertion, deletion and update anomalies, typically by creating additional tables.
Integrity
refers to the fact that data that is not kept secure cannot be guaranteed to be accurate. Corrupt or invalid data is of little use to an enterprise.
Availability
refers to the fact that the enterprise must have access to its data at all times. Loss of data can have far-reaching, detrimental effects on the enterprise.
Confidentiality
refers to the need of an enterprise to maintain control over sensitive data. Security breakdowns of this type can lead to loss of competitiveness, loss of revenues, or any other of many detrimental consequences.
Serializability
refers to whether a set of transactions executed concurrently with one another produces the same result as executing them individually, one after another, in some order.
Concurrent transactions read or write the same data structures
the order of execution is significant.
Why are the terms conceptual, logical, and physical used in data modeling?
to differentiate levels of abstraction versus detail in the model
Relationships
usually expressed using verbal expressions within the requirements document
The ____ relational type is the "relational model ideal."
1:M
One-to-one (1:1):
A strict matching, very rare. If you think one exists in your diagram, you are probably wrong.
_____ can serve as a communication tool between the users and designers.
Business rules
Examples of Conceptual Data Model?
ERD, Class Diagram (OOD)
____ yields only the rows that appear in both tables.
INTERSECT
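A minimal sqlite3 sketch of INTERSECT, using two made-up single-column tables. Both sides have the same number of columns with matching types, as union-compatibility requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (name TEXT)")
conn.execute("CREATE TABLE B (name TEXT)")
conn.executemany("INSERT INTO A VALUES (?)", [("Ann",), ("Bob",), ("Cy",)])
conn.executemany("INSERT INTO B VALUES (?)", [("Bob",), ("Cy",), ("Dee",)])

# Only rows present in both tables survive.
common = conn.execute(
    "SELECT name FROM A INTERSECT SELECT name FROM B ORDER BY name"
).fetchall()
```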
____ data exist in the format in which they were collected.
Unstructured
What are Attributes/Record/Keys?
Attribute: characteristic of an entity. Record: collection of data items that have something in common with the entity. Key: data item(s) in a record used to identify the record.
What does IT do?
coordinate firm activities
The ____ data model is said to be a semantic data model.
object-oriented
Network substrate hardware concerns
-Bandwidth (too little) -Response -Stability -Security
Servers support
-Common systems -Common data -Anything large scale -Give info to clients when requested
3 IT Conflicts
1. Priority 2. Strategy Conflict 3. Technical Goals
What are the steps to Database Design? What are the outputs of each?
1. Requirements Analysis = DB Requirement Specification 2. Conceptual Database Design = Conceptual Data Model 3. Logical Database Design = Logical Data Model 4. Physical Database Design = Physical Data Model
What are the steps in the normalization process?
1. remove repeating groups (1NF) 2. remove partial dependencies so that all non-key attributes are fully functionally dependent on the whole primary key (2NF) 3. remove transitive (non-key) dependencies (3NF)
The hierarchical data model was developed in the ____.
1960s
The relational data model was developed in the ____.
1970s
The object-oriented data model was developed in the ____.
1980s
The ____ relationship should be rare in any relational database design.
1:1
Data standard
A common method of defining, structuring, and describing data. Avoids confusion from different names for the same thing and different things with the same name. Compatible data standards provide semantic interoperability.
First normal form (1 NF):
A database is in 1NF if each cell contains a single atomic value.
Third Normal Form (3NF):
A database is in 3NF if it is in 2NF and each field in a table is either: -Part of the primary key. -Fully functionally dependent on the primary key, and not dependent on it only through another non-key field.
Second Normal Form (2NF):
A database is in second normal form if it is in 1NF and we can identify a field (or set of fields) that all remaining fields are fully functionally dependent upon, with no field depending on only part of that set.
Primary Key
A field, or set of fields, in a table on which all other fields are functionally dependent
Communication Protocol
A set of rules that governs the communications between computers on a network. Problems come when we need to span networks. Compatible communication protocols provide syntactic interoperability.
A(n) ____ might be written by a programmer or it might be created through a DBMS utility program.
Application program
Hardware and software
Configuration and maintenance is heavily impacted by corporate governance and value chain configuration.
What is the Physical Data Model Dependent on?
DB Platform
Attribute:
Important information describing a given object or, on occasion, a relationship (oval)
Relationship:
Interactions among entities, typically describe an activity that is taking place. (diamond)
The body of information and facts about a specific subject.
Knowledge
Codd's Rule of ____ states: Application programs and ad hoc facilities are logically unaffected when changes are made to the table structures that preserve the original table values (changing order of columns or inserting columns).
Logical Data Independence
What are the 4 types of files?
Master = records for a group of entities; Table = data used to calculate more data; Transaction = used to enter changes; Report = print reports of the data
Codd's Rule of ____ states: If the system supports low-level access to the data, users must not be allowed to bypass the integrity rules of the database
Nonsubversion
Entity:
Objects of interest, could be people, places, or things (like a transaction) (rectangle)
____ is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing, and modeling data from the data warehouse.
Online analytical processing
____ yields a vertical subset of a table.
PROJECT
The response of the DBMS to a query is the ____.
Query result set
End-user data
Raw facts of interest to the end-user
Examples of Logical Data Model?
Relational Data Model, Network Data Model
To be considered minimally relational, the DBMS must support the key relational operators ____, PROJECT, and JOIN.
SELECT
____, also known as RESTRICT, yields values for all rows found in a table that satisfy a given condition.
SELECT
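The relational operators SELECT (RESTRICT) and PROJECT map onto ordinary SQL as sketched below with sqlite3. The `Product` table is hypothetical. Note that relational SELECT corresponds to the WHERE clause, while PROJECT corresponds to the column list, with DISTINCT added because a relation holds no duplicate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (code INTEGER PRIMARY KEY, descr TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?, ?)",
    [(1, "bolt", 2), (2, "nut", 1), (3, "bolt", 9)],
)

# SELECT / RESTRICT: a horizontal subset (rows that satisfy a condition).
selected = conn.execute(
    "SELECT * FROM Product WHERE price < 5 ORDER BY code"
).fetchall()

# PROJECT: a vertical subset (chosen columns, duplicates removed).
projected = conn.execute(
    "SELECT DISTINCT descr FROM Product ORDER BY descr"
).fetchall()
```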
What is an ER Diagram?
Shows relationship between entities
Enterprise system
Support and integrate functional units
What is the Logical Data Model Dependent on?
The DB Tech. that will be used
One-to-Many (1:M):
Very common, describes a setting in which an instance of one entity can be associated with multiple instances of another entity.
Many-to-Many (M:M):
Very common, describes a setting in which multiple instances of a given entity can be associated with multiple instances of another entity.
What is the main problem in Database Design?
What are the domain needs?
Oracle 11g is an example of a(n) ____.
XML/Hybrid data model
What are Entities/Relationships?
Entities: any object or event on which someone chooses to collect data. Relationships: associations between entities.
A(n) ____ is the equivalent of a field in a file system.
attribute
A ____ key can be described as a superkey without unnecessary attributes, that is, a minimal superkey.
candidate
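The idea of a minimal superkey can be sketched by brute force over sample rows: an attribute set is a superkey if no two rows share its values, and it is a candidate key if no smaller key is contained in it. The helper names and data are hypothetical, and sample data can only suggest keys; real keys come from business rules.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Superkeys that do not contain a smaller key (minimal superkeys)."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if not contains_smaller_key and is_superkey(rows, combo):
                keys.append(combo)
    return keys


rows = [
    {"emp_id": 1, "email": "a@x", "dept": "Sales"},
    {"emp_id": 2, "email": "b@x", "dept": "Sales"},
    {"emp_id": 3, "email": "c@x", "dept": "IT"},
]
found = candidate_keys(rows, ["emp_id", "email", "dept"])
```

Here (emp_id, dept) is a superkey but not a candidate key, because dept is an unnecessary attribute.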
What is a database?
central source of data meant to be shared by many users for a variety of applications
A(n) ____ model represents a global view of the database as viewed by the entire organization.
conceptual
A(n) ____ is a restriction placed on the data.
constraint
____ are important because they help to ensure data integrity.
constraints
____ are normally expressed in the form of rules.
constraints
A(n) ____ enables a database administrator to define schema components.
data definition language (DDL)
A ____ contains at least all of the attribute names and characteristics for each table in the system.
data dictionary
The phrase ____ refers to an organization of components that define and regulate the collection, storage, management and use of data within a database environment.
database system
In the context of a database table, the statement "A ____ B" indicates that if you know the value of attribute A, you can look up the value of attribute B.
determines
A CUSTOMER table's primary key is CUS_CODE. The CUSTOMER primary key column has no null entries, and all entries are unique. This is an example of ____ integrity.
entity
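Entity integrity can be seen in action with sqlite3: the DBMS rejects a duplicate or null primary key value. The `CUS_NAME` column and the `try_insert` helper are assumptions for illustration. NOT NULL is written out explicitly because SQLite, unlike most DBMSs, otherwise permits nulls in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUS_CODE TEXT PRIMARY KEY NOT NULL, CUS_NAME TEXT)")


def try_insert(code, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO CUSTOMER VALUES (?, ?)", (code, name))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("C1", "Ann")
duplicate = try_insert("C1", "Bob")   # same key value again
null_key = try_insert(None, "Cy")     # missing key value
```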
A noun in a business rule translates to a(n) ____ in the data model.
entity
A(n) ____ is anything about which data are to be collected and stored.
entity
A(n) ____ represents a particular type of object in the real world.
entity
The ____ model uses the term connectivity to label the relationship types.
entity relationship
The ____ model is the end users' view of the data environment.
external
A ____ is a character or group of characters that has a specific meaning.
field
A ____ is a collection of related records.
file
VMS/VSAM is an example of a(n) ____.
file system data model
The attribute B is ____ the attribute A if each value in column A determines one and only one value in column B.
functionally dependent on
What is a DBMS?
heart of the database that allows the creation, modification, and updating of the data
In the ____ model, each parent can have many children, but each child has only one parent.
hierarchical
In the ____ model, the basic logical structure is represented as an upside-down tree.
hierarchical
One of the limitations of the ____ model is that there is a lack of standards.
hierarchical
In a database context, the word ____ indicates the use of the same attribute name to label different attributes.
homonym
Functional dependency
If field A assumes a value of X, field B must assume a value of Y.
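A functional dependency can be tested against sample rows by checking that each value of A is always paired with the same value of B. The function `holds` and the enrollment data are hypothetical, and a dependency that holds in a sample is only evidence; the real rule comes from the business requirements.

```python
def holds(rows, lhs, rhs):
    """True if, in these rows, the lhs attributes determine the rhs attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, val) != val:
            return False
    return True


enrollment = [
    {"student_id": 1, "course": "DB", "student_name": "Ann", "grade": "A"},
    {"student_id": 1, "course": "OS", "student_name": "Ann", "grade": "B"},
    {"student_id": 2, "course": "DB", "student_name": "Bob", "grade": "C"},
]

name_by_id = holds(enrollment, ["student_id"], ["student_name"])
grade_by_course = holds(enrollment, ["course"], ["grade"])
grade_by_key = holds(enrollment, ["student_id", "course"], ["grade"])
```

Because student_name depends on student_id alone, it is only partially dependent on the composite key (student_id, course), which is the situation 2NF removes.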
A(n) ____ is an ordered arrangement of keys and pointers.
index
What is metadata?
information that describes data
A(n) ____ join only returns matched records from the tables that are being joined.
inner
The organization of the data within the folders in a manual file system was determined by ____.
its expected use
In the relational model, ____ are important because they are used to ensure that each row in a table is uniquely identifiable.
keys
The relational database model enables you to view data ____ rather than ____.
logically, physically
Students and classes have a ____ relationship.
many-to-many
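In a relational schema a many-to-many relationship is usually implemented with a bridge table that turns it into two 1:M relationships. The sketch below uses sqlite3; the table and column names (`Student`, `Class`, `Enroll`) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Student (stu_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Class (class_id INTEGER PRIMARY KEY, title TEXT);
    -- Bridge table: one row per (student, class) pairing.
    CREATE TABLE Enroll (
        stu_id INTEGER REFERENCES Student(stu_id),
        class_id INTEGER REFERENCES Class(class_id),
        PRIMARY KEY (stu_id, class_id)
    );
    INSERT INTO Student VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Class VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO Enroll VALUES (1, 10), (1, 20), (2, 10);
    """
)

# One student, many classes.
classes_for_ann = [
    r[0]
    for r in conn.execute(
        "SELECT c.title FROM Enroll e JOIN Class c ON c.class_id = e.class_id "
        "WHERE e.stu_id = 1 ORDER BY c.title"
    )
]

# One class, many students.
students_in_databases = [
    r[0]
    for r in conn.execute(
        "SELECT s.name FROM Enroll e JOIN Student s ON s.stu_id = e.stu_id "
        "WHERE e.class_id = 10 ORDER BY s.name"
    )
]
```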
A(n) ____ join links tables by selecting only the rows with common values in their common attribute(s).
natural
In the ____ model, the user perceives the database as a collection of records in 1:M relationships, where each record can have more than one parent.
network
In an outer join, the matched pairs would be retained and any unmatched values in the other table would be left ____.
null
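The difference between an inner join and an outer join can be sketched with sqlite3. The `Customer` and `Invoice` tables are hypothetical. The customer with no invoice disappears from the inner join but is kept by the left outer join, with a null in the invoice column (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customer (cus_code INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Invoice (inv_no INTEGER PRIMARY KEY, cus_code INTEGER);
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Invoice VALUES (100, 1);
    """
)

# Inner join: matched rows only.
inner_rows = conn.execute(
    "SELECT c.name, i.inv_no FROM Customer c "
    "JOIN Invoice i ON i.cus_code = c.cus_code ORDER BY c.name"
).fetchall()

# Left outer join: every customer, with null where no invoice matches.
outer_rows = conn.execute(
    "SELECT c.name, i.inv_no FROM Customer c "
    "LEFT JOIN Invoice i ON i.cus_code = c.cus_code ORDER BY c.name"
).fetchall()
```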
Most decision-support data are based on historical data obtained from ____.
operational databases
____ relates to the activities that make the database perform more efficiently in terms of storage and access speed.
performance tuning
Controlled ____ makes a relational database work.
redundancy
MySQL is an example of a(n) ____.
relational data model
A verb associating two nouns in a business rule translates to a(n) ____ in the data model.
relationship
A(n) ____ is bidirectional.
relationship
A ____ key is defined as a key that is used strictly for data retrieval purposes.
secondary
Most data you encounter is best classified as ____.
semistructured
XML data is ___.
semistructured
In a database context, a(n) ____ indicates the use of different names to describe the same attribute.
synonym
The ____ is actually a system-created database whose tables store the user/designer-created database characteristics and contents.
system catalog
The relational model's creator, E. F. Codd, used the term relation as a synonym for ____.
table
When you define a table's primary key, the DBMS automatically creates a(n) ____ index on the primary key column(s) you declared.
unique