DBMS - IPU
Pattern matching % and _
% matches any string _ matches any character
union, intersect and except operators
(select course id from section where semester = 'Fall' and year= 2009) union (select course id from section where semester = 'Spring' and year= 2010);
attribute types
- simple and composite - single valued and multivalued - derived
Schedule
A chronological execution sequence of a transaction is called a schedule. A schedule can have many transactions in it, each comprising of a number of instructions/tasks.
Data model
A collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. A data model provides a way to describe the design of a database at the physical, logical, and view levels.
DML
A data-manipulation language (DML) is a language that enables users to access or manipulate data as organized by the appropriate data model. The types of access are: • Retrieval of information stored in the database • Insertion of new information into the database • Deletion of information from the database • Modification of information stored in the database
Foreign Key
A foreign key is a set of attributes in a referencing relation, such that for each tuple in the referencing relation, the values of the foreign key attributes are guaranteed to occur as the primary key value of a tuple in the referenced relation.
Query
A query is a statement requesting the retrieval of information. The portion of a DML that involves information retrieval is called a query language
Super key, Candidate key, Primary key
A superkey of a relation is a set of one or more attributes whose values are guaranteed to identify tuples in the relation uniquely. A candidate key is a minimal superkey, that is, a set of attributes that forms a superkey, but none of whose subsets is a superkey. One of the candidate keys of a relation is chosen as its primary key.
Transaction
A transaction is a collection of operations that performs a single logical function in a database application. Each transaction is a unit of both atomicity and consistency. Thus, we require that transactions do not violate any database consistency constraints.
triggers
A trigger is a statement that the system executes automatically as a side effect of a modification to the database. To design a trigger mechanism, we must meet two requirements: 1. Specify when a trigger is to be executed. This is broken up into an event that causes the trigger to be checked and a condition that must be satisfied for trigger execution to proceed. 2. Specify the actions to be taken when the trigger executes.
States of Transactions
Active − In this state, the transaction is being executed. This is the initial state of every transaction. Partially Committed − When a transaction executes its final operation, it is said to be in a partially committed state. Failed − A transaction is said to be in a failed state if any of the checks made by the database recovery system fails. A failed transaction can no longer proceed further. Aborted − If any of the checks fails and the transaction has reached a failed state, then the recovery manager rolls back all its write operations on the database to bring the database back to its original state where it was prior to the execution of the transaction. Transactions in this state are called aborted. The database recovery module can select one of the two operations after a transaction aborts − Re-start the transaction Kill the transaction Committed − If a transaction executes all its operations successfully, it is said to be committed. All its effects are now permanently established on the database system.
Aggregation
Aggregation is an abstraction in which relationship sets (along with their associated entity sets) are treated as higher-level entity sets, and can participate in relationships.
Attributes
An attribute of an entity is a particular property that describes the entity. The set of all possible values of an attribute is the attribute domain. For example: An entity named "Student" can have attributes such as "Name", "Age", "Roll No.", etc. It is denoted by an oval (or ellipse).
Entity
An entity is a piece of data-an object or concept about which data is stored. In other words, an entity may be defined as a thing capable of an independent existence that can be uniquely identified. It is denoted by a rectangle.
Entity, Relationship sets
An entity is an object that exists in the real world and is distinguishable from other objects. We express the distinction by associating with each entity a set of attributes that describes the object. • A relationship is an association among several entities. A relationship set is a collection of relationships of the same type, and an entity set is a collection of entities of the same type.
ER Model
An entity relationship model, also called an entity-relationship (ER) diagram, is a graphical representation of entities and their relationships to each other, typically used in computing in regard to the organization of data within databases or information systems.
Strong entity
An entity that whose attributes can be selected to form a primary key is a string entity
Application programmers
Application programmers are computer professionals who write application programs. Application programmers can choose from many tools to develop user interfaces.
Functional dependency
Consider a relation schema r(R), and let a⊆ R and b⊆ R. • Given an instance of r(R), we say that the instance satisfies the functional dependency a →b if for all pairs of tuples t1 and t2 in the instance such that t1[a] = t2[a], it is also the case that t1[b] = t2[b]. • We say that the functional dependency a→b holds on schema r(R) if, in every legal instance of r(R) it satisfies the functional dependency.
X
Cross product
Disadvantages of Filesystems
Data redundancy and inconsistency. Difficulty in accessing data. Data isolation. Integrity problems. Atomicity problems. Concurrent-access anomalies. Security problems.
Lock-based Protocols
Database systems equipped with lock-based protocols use a mechanism by which any transaction cannot read or write data until it acquires an appropriate lock on it. Locks are of two kinds − Binary Locks − A lock on a data item can be in two states; it is either locked or unlocked. Shared/exclusive − This type of locking mechanism differentiates the locks based on their uses. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock. Allowing more than one transaction to write on the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
Instance of database
Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database.
Database Independence
Each higher level of the data architecture is immune to changes of the next lower level of the architecture. It means we can change the conceptual schema at one level without affecting the data at another level. It also means we can change the structure of a database without affecting the data required by users and programs. This feature was not available in the file oriented approach.
Explicit Cursors
Explicit cursors are programmer-defined cursors for gaining more control over the context area. An explicit cursor should be defined in the declaration section of the PL/SQL Block.
Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for efficiency has led designers to use complex data structures to represent data in the database.Since many database-system users are not computer trained, developers hide the complexity from users through several levels of abstraction, to simplify users' interactions with the system.
⟗
Full outer join
aggregation operator
G, which permits the use of aggregate functions such as min or average, on sets of values.
Result Equivalence
If two schedules produce the same result after execution, they are said to be result equivalent. They may yield the same result for some value and different results for another set of values. That's why this equivalence is not generally considered significant.
Implicit Cursors
Implicit cursors are automatically created by Oracle whenever an SQL statement is executed, when there is no explicit cursor for the statement. Programmers cannot control the implicit cursors and the information in it.
Isolation
In a database system where more than one transaction are being executed simultaneously and in parallel, the property of isolation states that all the transactions will be carried out and executed as if it is the only transaction in the system. No transaction will affect the existence of any other transaction.
Deadlock
In a multi-process system, deadlock is a situation, which arises in shared resource environment where a process indefinitely waits for a resource, which is held by some other process, which in turn waiting for a resource held by some other process
Weak entity
In a relational database, a weak entity is an entity that cannot be uniquely identified by its attributes alone; therefore, it must use a foreign key in conjunction with its attributes to create a primary key. The foreign key is typically a primary key of an entity it is related to.
∩
Intersection
Relationship
It defines the relationship between two or more entities, that is, how the data is shared between entities. There are three types of relationships between entities: One-to-One, One-to-Many, Many-to-Many. It is denoted by a diamond.
generalized projection
It extends the projection operation by allowing operations such as arithmetic and string functions to be used in the projection list.
Serial Schedule
It is a schedule in which transactions are aligned in such a way that one transaction is executed first. When the first transaction completes its cycle, then the next transaction is executed. Transactions are ordered one after the other. This type of schedule is called a serial schedule, as transactions are executed in a serial manner.
⟕
Left outer join
-
Minus
⋈
Natural join, (θ-join when condition is included)
Naive users
Na¨ıve users are unsophisticated users who interact with the system by invoking one of the application programs that have been written previously.
DBA
One of the main reasons for using DBMSs is to have central control of both the data and the programs that access those data. A person who has such central control over the system is called a database administrator (DBA).
Mapping cardinalities
One-to-one One-to-many Many-to-one Many-to-many
SQL Cursor
Oracle creates a memory area, known as the context area, for processing an SQL statement, which contains all the information needed for processing the statement; for example, the number of rows processed, etc. A cursor is a pointer to this context area. PL/SQL controls the context area through a cursor. A cursor holds the rows (one or more) returned by a SQL statement. The set of rows the cursor holds is referred to as the active set.
Features of PL/SQL
PL/SQL is tightly integrated with SQL. It offers extensive error checking. It offers numerous data types. It offers a variety of programming structures. It supports structured programming through functions and procedures. It supports object-oriented programming. It supports the development of web applications and server pages.
PL/SQL Subprograms
PL/SQL provides two kinds of subprograms − Functions − These subprograms return a single value; mainly used to compute and return a value. Procedures − These subprograms do not return a value directly; mainly used to perform an action.
Pre-claiming Lock Protocol
Pre-claiming protocols evaluate their operations and create a list of data items on which they need locks. Before initiating an execution, the transaction requests the system for all the locks it needs beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If all the locks are not granted, the transaction rolls back and waits until all the locks are granted.
π(R)
Projection
ρ
Rename
⟖
Right outer join
σ(R)
Selection
Simplistic Lock Protocol
Simplistic lock-based protocols allow transactions to obtain a lock on every object before a 'write' operation is performed. Transactions may unlock the data item after completing the 'write' operation.
Sophisticated users
Sophisticated users interact with the system without writing programs. Instead, they form their requests either using a database query language or by using tools such as data analysis software
Specialisation and generalisation
Specialization and generalization define a containment relationship between a higher-level entity set and one or more lower-level entity sets. Specialization is the result of taking a subset of a higher-level entity set to form a lower-level entity set. Generalization is the result of taking the union of two or more disjoint (lower-level) entity sets to produce a higher-level entity set. The attributes of higher-level entity sets are inherited by lower-level entity sets.
Specialized users
Specialized users are sophisticated users who write specialized database applications that do not fit into the traditional data-processing framework.
Catalog
The catalog is the place where--among other things--all of the various schemas (external, conceptual, internal) and all of the corresponding mappings (external/conceptual, conceptual /internal) are kept. In other words, the catalog contains detailed information (sometimes called descriptor information or metadata) regarding the various objects that are of interest to the system itself.
Consistency
The database must remain in a consistent state after any transaction. No transaction should have any adverse effect on the data residing in the database. If the database was in a consistent state before the execution of a transaction, it must remain consistent after the execution of the transaction as well.
Durability
The database should be durable enough to hold all its latest updates even if the system fails or restarts. If a transaction updates a chunk of data in a database and commits, then the database will hold the modified data. If a transaction commits but the system fails before the data could be written on to the disk, then that data will be updated once the system springs back into action.
Strict Two-Phase Locking
The first phase of Strict-2PL is same as 2PL. After acquiring all the locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock after using it. Strict-2PL holds all the locks until the commit point and releases all the locks at a time. Strict-2PL does not have cascading abort as 2PL does.
View level
The highest level of abstraction describes only part of the entire database. The view level of abstraction exists to simplify their interaction with the system. The system may provide many views for the same database.
Physical level
The lowest level of abstraction describes how the data are actually stored. The physical level describes complex low-level data structures in detail.
Logical level
The next-higher level of abstraction describes what data are stored in the database, and what relationships exist among those data. The logical level thus describes the entire database in terms of a small number of relatively simple structures. Although implementation of the simple structures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity.
Database schema
The overall design of the database is called the database schema. Schemas are changed infrequently, if at all. "desc" Database systems have several schemas, partitioned according to the levels of abstraction. The physical schema describes the database design at the physical level, while the logical schema describes the database design at the logical level. A database may also have several schemas at the view level, sometimes called subschemas, that describe different views of the database.
Query processor subsystem
The query processor subsystem compiles and executes DDL and DML statements.
Relational algebra
The relational algebra provides a set of operations that take one or more relations as input and return a relation as an output. Practical query languages such as SQL are based on the relational algebra, but add a number of useful syntactic features.
Relational Model
The relational model uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns, and each column has a unique name. Tables are also known as relations.
Relational query languages
The relational query languages define a set of operations that operate on tables, and output tables as their results. These operations can be combined to get expressions that express desired queries.
Storage manager subsystem
The storage manager subsystem provides the interface between the low level data stored in the database and the application programs and queries submitted to the system.
Database Users and User Interfaces
There are four different types of database-system users, differentiated by the way they expect to interact with the system. Different types of user interfaces have been designed for the different types of users.
Two-Phase Locking 2PL
This locking protocol divides the execution phase of a transaction into three parts. In the first part, when the transaction starts executing, it seeks permission for the locks it requires. The second part is where the transaction acquires all the locks. As soon as the transaction releases its first lock, the third phase starts. In this phase, the transaction cannot demand any new locks; it only releases the acquired locks. Two-phase locking has two phases, one is growing, where all the locks are being acquired by the transaction; and the second phase is shrinking, where the locks held by the transaction are being released. To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Atomicity
This property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none. There must be no state in a database where a transaction is left partially completed. States should be defined either before the execution of the transaction or after the execution/abortion/failure of the transaction.
Conflict Equivalence
Two schedules would be conflicting if they have the following properties − Both belong to separate transactions. Both accesses the same data item. At least one of them is "write" operation. Two schedules having multiple transactions with conflicting operations are said to be conflict equivalent if and only if − Both the schedules contain the same set of Transactions. The order of conflicting pairs of operation is maintained in both the schedules.
View Equivalence
Two schedules would be view equivalence if the transactions in both the schedules perform similar actions in a similar manner. For example − If T reads the initial data in S1, then it also reads the initial data in S2. If T reads the value written by J in S1, then it also reads the value written by J in S2. If T performs the final write on the data value in S1, then it also performs the final write on the data value in S2.
U
Union
DDL
We specify a database schema by a set of definitions expressed by a special language called a data-definition language (DDL). The DDL is also used to specify additional properties of the data.
add column
alter table r add A D;
remove column
alter table r drop A;
predicate connectors
and, or, not
create table example
create table teaches (ID varchar (5), course id varchar (8), sec id varchar (8), semester varchar (6), year numeric (4,0), primary key (ID, course id, sec id, semester, year), foreign key (course id, sec id, semester, year) references section, foreign key (ID) references instructor); create table section (course id varchar (8), sec id varchar (8), semester varchar (6), year numeric (4,0), building varchar (15), room number varchar (7), time slot id varchar (4), primary key (course id, sec id, semester, year), check (semester in ('Fall', 'Winter', 'Spring', 'Summer')));
delete data from table r
delete from r;
delete tuples and schema for table r
drop table r;
grant statement
grant <privilege list> on <relation name or view name> to <user/role list>; grant update (budget) on department to Amit, Satoshi;
revoke statement
revoke <privilege list> on <relation name or view name> from <user/role list>; revoke select on department from Amit, Satoshi; revoke update (budget) on department from Amit, Satoshi;
typical SQL query
select A1, A2,..., An from r1, r2,...,rm where P;
aggregate by grouping, having clause
select dept name, count (distinct ID) as instr_count from instructor natural join teaches where semester = 'Spring' and year = 2010 group by dept name having avg (salary) > 42000; select course id, semester, year, sec id, avg (tot cred) from takes natural join student where year = 2009 group by course id, semester, year, sec id having count (ID) >= 2;
use of aliases
select distinct T.name as Name from instructor as T, instructor as S where T.salary > S.salary and S.dept name = 'Biology';
nested subqueries
select distinct course id from section where semester = 'Fall' and year= 2009 and course id not in (select course id from section where semester = 'Spring' and year= 2010); delete from instructor where dept name in (select dept name from department where building = 'Watson');
Deadlock Prevention
wait-die scheme wound-wait scheme
aggregate functions
• Average: avg • Minimum: min • Maximum: max • Total: sum • Count: count(*), count(distinct ),
Function of a DBA
• Schema definition • Storage structure and access-method definition. • Schema and physical-organization modification. • Granting of authorization for data access. • Routine maintenance. Examples: ◦ Periodically backing up the database ◦ Ensuring that enough free disk space is available for normal operations. ◦ Monitoring jobs running on the database.
constraints on a single relation
• not null • unique • check(<predicate>)