Database Management
Weak entity set requirements
1. Key consists of zero or more of its own attributes, and 2. Key attributes from entity sets that are reached by supporting relationships(many-one) 3. If R is a supporting relationship from a weak entity set E to some entity set F, R must: be a binary, many-one relationship from E to F. R must have referential integrity from E to F the attributes that F supplies for the key of E must be key attributes of F If F is weak, then key attributes of F supplied to E may be key attributes of some entity set to which F is connected by a supporting relationship For multiple supporting relationship from E to F, each relationship is used to supply a copy of the key attributes of F
E/R diagram to relationsl database
1. Turn each regular entity set into a relation with the same set of attributes 2. Represent relationships by relations except for supporting relationships 3. Combining relations when appropriate 4. Handle Weak entity sets 5. Convert subclass structures to relations
self contained
A DBMS stores the description of the database (meta-data) allows DBMS software to work with different databases
Entity set
A collection of similar entities. Usually all entities in an entity set have the same set of attributes Each has a key each attribute has a domain
Database
A large collection of related data
Boyce-Codd Normal Form
A relation R is in BCNF iff whenever there is a nontrivial dependency that holds for R, it is a superkey for R
Deletion Anomalies
A set of values becomes empty, may lose other information as a side effect
Attributes on Relationships
Can attach an attribute to a relationship if it is a property of the tuples in the relationship set
Insertion Anomalies
Cannot insert a tuple because some of the data not yet available Problem of using null value to fill the missing/ unavailable data: When the data becomes available, will we remember to update the one with nulls
Update Anomalies
Change information in one tuple but leave the same information unchanged in another A consequence of redundancy Cause potential inconsistency
Relationship set
Collection of similar relationships same entity set could participate in different relationship sets, or in different "roles" in the same set.
Tree-based model
Corresponds to a number of natural hierarchically organized domains. Little room for query optimization.
Use Null values
Create one relation with all the attributes of all the entity sets in the hierarchy. Each entity is represented by one tuple, and that tuple has a null value for whatever attribute the entity does not have
DBMS advantages
DBMS support: Users to create databases and specify their schema, Users to query and modify the data at a fine granularity, Storage of very large amount of data, secure the data from accidental or unauthorized use, and allow efficient access, Control concurrent access from multiple users, presenting correct/consistent data to each user, and prevent accidental corruption of the data from simultaneous accesses
many-one relationships
Each entity on the many side is connected to at most one entity of the second set But an entity on the one side can be connected to zero, one, or many entities of the first set
Isolation
Each transaction, when executed concurrently with other transactions, should have the same effect as if it had been executed by itself
Support of multiple data views
Each user may see a different view of the database which describes only the data of interest to that user.
General constraints
Eg. degree of the relationship
Referential Integrity
Exactly one value exists in a given role Enforced at database implementation
execution engine
Execute the resulting query plan(sequence of actions for the DBMS to perform) Issues a sequence of requests to storage manager for small pieces of data Interacts with the scheduler to avoid accessing locked data Makes a record of all database changes (log manager)
E/R approach
For each entity set E in the hierarchy, create a relation that includes the key attributes from the root and any attributes belonging to E.
Combining Relations
In general, if 2 or more relations have exactly the same keys, which mean exactly the same thing, combine them.
Data
Known facts that can be recorded and have an implicit meaning.
Techniques to enforce ACID
Locking Logging Transaction Commitment
characteristics of Keys
Will not change in value( time-invariance) Will not be null/ unknown No intelligent identifiers( containing locations or people that might change) Substitute new, simple keys for long, composite keys
Functional Dependencies
X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X, then they must also agree on the attribute A
Data Abstraction
a data model is used to hide storage details and present the users with a conceptual view of the database
relational database
a set of relations
Weak Entity Sets
an Entity set whose key is composed of attributes some or all of which belong to another entity set has one or more many-one relationships to other (supporting) entity sets the key for a weak entity set is its own underlined attributes plus the keys for the supporting entity sets
candidate key
an attribute( combination of attributes) that could be a key
Many-Many Relationships
an entity of either set can be connected to many entities of the other set, including zero If R is neither many-one from E to F or from F to E, then R is many-many.
Multiway Relationships
connects more than two entity sets
one-one relationship
each entity of either entity set is related to at most one entity of the other set. If R is both many-one from E to F and many-one from F to E, then R is one-one.
Locking
granularity of locks is important
Binary relationship
involving two entity sets
Domain constraints
restrict the value of an attribute to be in a limited set declaring type of attribute specify range for the attribute value
DBMS Characteristics
self-contained Independence between Programs and data Data Abstraction Support of multiple data views
Database schema
set of all relation schemas in the database
Internal schema
to describe data storage structures and access paths at the internal level. typically uses a physical data model
Conceptual Schema
to describe the structure and constraints for the whole database at the conceptual level
External schema
to describe the various user view at the external level
Logging
write a log to nonvolatile storage to assure durability
Durability
Once a transaction has completed( commit successfully), its changes to the database should be permanent. Even serious failures should not affect the permanence of a transaction
Redundancy
duplicate representation of data, lead to: -Inconsistency, violate data integrity -Waste space
Transaction Commitment
for durability and atomicity, transactions are computed "tentatively", recorded, but no changes are made to the DB until the transaction gets committed. Changes copied to the log, then copied to DB
Augmentation
if A1 A2 ... An -> B1 B2 ... Bm, then A1 A2, ... An C1 ... Ck -> B1 B2 ...Bm C1 ... Ck for any set of attributes C1 ... Ck
Reflexivity
if B1 B2 ... Bm is a subset of A1 A2 ... An , then A1 A2 ... An -> B1 B2 ... Bm //trivial dependencies
Buffer manager
keeps portions of the disk contents in main memory, obtains disk blocks from the file manager, and tries to optimize the access to data
File manager
keeps track of the location of files on disk and obtains the blocks containing the requested data
Role
label on the edges between the relationship and the entity set when entity set appears more than once in a relationship
Supporting relationship
must be many to one links supporting entities to weak entities respresented with double line diamond
degree/arity
number of fields
cardinality
number of rows
Constraints
part of the schema, not an instance of the database
Normalization
process to eliminate redundancy in relations due to functional or multi-valued dependencies
Database Management System
A software package/ system to facilitate the creation and maintenance of a computerized database
Atomicity
A transaction is performed either completely or not at all
Independence between Programs and data
Allow changing data storage structures and operations without having to change the DBMS access programs
Single-Value Constraints
Assertion: at most one value exists in a given context
Relationship
Association among two or more entities
ACID properties of "proper" execution:
Atomicity Consistency Isolation Durability
Key
Attributes which uniquely identify an entity in an entity set serves as a uniqueness constraint
Relational Model
Based on mathematically defined relations of entities Consists of Attributes, Domain legitimate values of attributes, views of data presented in table format, records that are n-tuples, where n = # of atttributes
Trivial Dependency Rule
FD A1 A2 ... An -> B is trivial if B is one of the A's for A1 A2 ... An -> B1 B2 ... Bn -trivial if the B's are a subset of the A's -Nontrivial if at least on e of the B's is not among the A's -Completely Nontrivial if none of the B's is also one of the A's
File system drawbacks
Fail to support: Efficient search, Efficient modifications to small pieces of data, complex queries, Controlled buffering of useful data in main memory, independent execution of transaction
Transitivity
If A1 A2 ... An -> B1 B2 ... Bm and B1 B2 ... Bm -> C1 .. Ck then A1 A2 ... An -> C1 ... Ck
Relation
Instance: a table, with rows and columns Schema: specifies name of relation, plus name and type of each column/attribute/field
Keys of relations
K is a key for relation R if: 1. Set K functionally determines all attributes of R(superkey) 2. For no proper subset of K is 1 true (minimality)
Classification of constraints
Keys(primary key) Single-value constraints(Unique) Referential integrity constraints (references) Domain constraints ( data type, value range, etc.) General constraints
Database schema
Logical structure of the data. Use Data Definition Language Includes descriptions of the database structure and the constraints that should hold on the database
Closure Test
Look for an FD's left side X that is a subset of the current Y+. If the FD is X -> A, add A to Y+
Object Oriented Model
One set compromises models of persistent O-O programming Languages.
Query Manager
Parse and optimize the query using a query compiler
Entity
Real-world object distinguishable from other objects. Described using a set of attributes
Transaction Processing
Responsible for logging database changes to support recovery after a system crash Supports concurrent execution of transactions in a way that assures atomicity and isolation
Storage Manager
Responsible for storing data, metadata, indexes, and logs
Database System
The DBMS software together with the data itself. Sometimes, the applications are also included
Consistency
Transactions are expected to preserve the consistency of the database
O-O Approach
Treat entities as objects belonging to a single class - for each possible subtree including the root, create one relation, whose schema includes all the attributes of all the entity sets in the subtree
Data definition language
Used by the DBA and database designers to specify the conceptual schema of a database.