CSE 305: Database Systems Midterm I Material
Semantic Constraints
Express the logic of the application at hand: e.g. number of registered students <= maximum enrollment
Example of semantic constraint
Limit attribute Values:: CREATE TABLE Transcript( StudId INTEGER, CrsCode CHAR(6), Semester CHAR(6), Grade CHAR(1), CHECK(Grade IN ('A','B','C','D', 'F')), CHECK(StudID > 0 AND StudID < 1000000000) ) -- Each row in table must satisfy condition
Relation
Mathematical entity corresponding to a table: i) row ---> Tuple ii) column ---> Attribute Relation R can be thought of as a predicate R(x,y,z) is true IFF tuple (x,y,z) is in R. Why relations? They are a very simple model.
System Admin
maintains transaction processing system: monitors interconnection of HW and SW modules, deals with failures and congestion.
Cardinality of Relation
number of tuples
Domain
possible values of an attribute
Superkey
set of attributes containing key {ID, name} is a superkey of Student
Serial Execution
synonym for sequentially: A set of transactions is executed sequentially if one transaction in the set is executed to completion before another is started. Serializable execution has better preformance than serial, but performance might still be inadequate.
Data Model
Mathematical representation of data ex: relational model = tables semistructured model = trees / graphs * Determines the operations on the data * Determines the constraints on the language
Key
Minimum set of attributes that uniquely identifies an entity (candidate key)
Minimality property
NO SUBSET OF A1,..., An is a key constraint
Airity of a Relation
Number of Attributes
Data Warehouse
Offline: Repository of historical data generated from OLTP or other sources
Data Manipulation Language
Operations on Data
Physical Data Independence
Physical schema can be changed without changing application: * DBMS would change mapping from conceptual to physical transparently * This property is referred to as physical data independece
User-Defined Domains
Possible attribute values can be specified: Creating a new domain CHECK clauss
Semantic Constraints in SQL
Primary key and foreign key are examples of ________________ constraints
Inclusion Dependency
Referential integrity contraint that is not a foreign key constraint: Foreign key constraint is just a special kind of inclusion dependency where referrenced attribute set is a key.
Relation Schema
Relation name + attribute list (optionally: types of attributes) Ex: Student(ID, name, address, status) ---> CAPS (actually would be underline: is a KEY) * Tuples cannot have the same value in all key attributes, excellent example of a constraint.
Examples of SELECT FROM WHERE
SELECT Id, Name FROM Student SELECT Id, name, FROM Student WHERE Status = 'senior'
SQL Data Definition Sublanguage
SQL is a language for describing database schema and operations on tables Data Definition language DDL sublanguage of SQL for describing schema.
Structured Query Language
SQL: Language for manipulating tables Declarative: Statement specifies what needs to be obtained, not how it is to be achieved (e.g., how to access data, the order of operation=)
CREATE SCHEMA <name>
SQL: Create a new schema
Database Schema
Set of all Relation Schemas in the Database
Declarative
Statement specifies what needs to be obtained, not how it is to be achieved (e.g., how to access data, the order of operation=) SQL determines evaluation strategy but this does not mean that the DBMS is not incapable of making mistakes or being wrong.
Rollback
TP MONITOR: Responsibility of ensuring that whatever partial changes the transaction has made to the database are undone. An example of this is known as __________________.
TPS
TP Monitor: Transactions get passed to the DBMS and the DBMS updates the database This whole system is reffered to as the ________________________________.
Unary
Takes a single argument, usually a relation itself: (e.g. delete certain rows)
Binary
Takes two arguments: Operation on two relations. (e.g. union, Cartesian product)
External Data Level
The _________________________________ customizes the conceptual schema to the needs of various classes of users (it also plays a role in database security). In the relational model, the external schema also presents data as a set of relations. The external schema looks and feels like a conceptual schema, and both are definied in essentially the same way in modern DBMSs There might be several external schemas (i.e. views on the conceptual schema), usually one per user category.
Database System
The collection of database applications that interact with the database along with the DBMS and the database itself.
Data Mining
Use of warehouse data to discover relationships (discovers hidden patterns in data) that might influence enterprise strategy.
Simplified Database System Environment
Users / Programmers v DBS: Application Programs / Queries v DBMS Software: Software to Process Queries / Programs v Software to Access Stored Data v v Stored Database Definition (Metadata) + stored database
Reactive Constraints
When a constraint is violated, the corresponding transaction is typically aborted. However in some cases other remedianl actions are more appropriate. Foreign key constraints are one example of this situation. It would be nice to have a mechanism that allows a user to specify how to react to a violation of a constraint. A ______________________ constraint is a static constraint coupled with a specification of what to do if a certain event happens. Triggers attached to a foregin key constraints.
Transaction
When an event in the real world changes the state of an enterprise, a "transaction" is executed to cause the corresponding change in the database state. ---> Transaction is an application: ACID properties
Consistency
[ACID Properties] Each transaction maintains database ___________________ . __________________ means that a transaction must access and update the database in such a way that it preserves all database integrity constraints. The transaction designer must ensure that: IF the database is in a state that satisfies all integrity constraints when execution of a transaction is started THEN the same should be true when the transaction completes. (constraints can be violated in intermediate states).
Isolation
[ACID Properties] The concurrent execution of a set of transactions has the same effect as some serial execution of that set. A set of transactions is executed sequentially, or serially, if one transaction in the set is executed to completion before another is started. If all transaction are coinsistent and the database is initially in a consistent state, serial execution maintains consistency. But serial execution maintains consistency. But serial execution is inadequate from a performance perspective.
Durability
[ACID Properties] The effects of committed transactions are permanently recorded in the database _____________ means that the system must ensure that once the transaction commits, its effects remain in the database even if the computer or the medium on which the database is stored subsequently crashes.
Atomicity
[ACID properties] Each transaction is executed completely or not at all ____________ means that the system must ensure that the transaction either runs to completion (i.e commits) or if it does not complete has no effect at all as if it has never been started (i.e aborts) TP MONITOR: Responsibility of ensuring that whatever partial changes the transaction has made to the database are undone (i.e. rolled back)
conceptual, physical
[Conceptual Data Level] DBMS maps _____________________ schema to __________________ schema This is known as physical data independence
Long Lifetime
[Database Systems Requirements] Complex systems are not easily replaced, must be designed so they can be easily extended as the needs of the enterprise change
High Reliability
[Database Systems Requirements] Correctly tracks state, does not lose data, controlled concurrency
High Availability
[Database Systems Requirements] Online: Must be operational while enterprise is functioning
Low Response Time
[Database Systems Requirements] Online: users are waiting
Database Security
[Database Systems Requirements] Sensitive information must be carefully protected since system is accessible to many users, Authentication, authorization, encryption.
High Throughput
[Database Systems Requirements] many users: many transactions per second
Static
[Integrity Constraints] Restricts legal states of data base
Semantic
[Integrity Constraints] Sub-category of static: Involves the meaning of attributes cannot register for more than 18 credits
Syntactic
[Integrity Constraints] Sub-category of static: Structural constraints: e.g. all values in a column must be unique
WHERE a2 = b2 (filter)
[Join] B 17 rst 3.2 17 B 17 rst 4.8 17
FROM T1,T2
[Join] Yields: Cartesian Product a1 a1 a3 b1 b2 A 1 xxy 3.2 17 A 1 xxy 4.8 17 B 17 rst 3.2 17 B 17 rst 4.8 17
Application Programmer
[Roles in Design, Implementation, and Maintenance of a TPS] Implements application programs (transactions) that access data and support enterprise rules.
Database Admin
[Roles in Design, Implementation, and Maintenance of a TPS] Maintains the database once system is operational, space allocation, performance optimization, database security
Database Designer
[Roles in Design, Implementation, and Maintenance of a TPS] Specifies structure of data that will be stored in database.
System Analyst
[Roles in Design, Implementation, and Maintenance of a TPS] Specifies system using input from customer; provides complete description of functionality from customer's and user's point of view
Trigger Keyword
___________ is a schema element (like table assertion, etc)
Triggers
____________ are a more general mechanism for handling events CREATE __________________ CrsChange AFTER UPDATE OF CrsCode, Semester ON Transcript WHEN (Grade IS NOT NULL) ROLLBACK
Dynamic
______________ constraints are a limitation on sequences of database states for e.g. cannot raise the salary by more than 5% in one transaction.
Integrity Constraints
_____________________ also known as consistency constraints are rules of the enterprise generally limiting the occurrence of certain real world events. EX: A Student cannot register for a course if current number of registrants = maximum allowed.
INSERT INTO
________________________ Student(Id, Name, Address, Status) VALUES (12345, 'John Smith', 123 Main St, NYC', NULL)
Concurrent Execution
________________________ is when multiple transactions are executed simultaneously Different transactions are effectively interleaved in time.
Integrity Constraint (IC)
_______________________________ is a statement about all legal instances of a database: Restriction on a state or of sequence of states of a database it is a part of the schema
Referential Integrity
An item named in one relation must refer to tuples that describe that item in another.
Entity
An object that is involved in the enterprise. (think of as an instance) Ex: John, CSE 305
Database
* Collection of data central to some enterprise (Ex of an enterprise: SBU Solar, Amazon, eBay, etc.)
Table
* Set of rows, no duplicates * Each row describes a different entity * Each column states a particular fact about each entity * Each column has an associated domain
Transaction Processing System (TPS)
* Transaction execution is controlled by a Transaction Processing (TP) monitor * TP monitor and DBMS together guarantee the special properties of transactions. TPS = TP Monitor + (possibly) Multiple Databases & transactions
Handling Foreign Key Violations
1) Reject if no row exists in B containing foreign key of inserted row 2)Delete B 3) Set NULL in A CASCADE: Delete referencing rows in A as well
Entity Type
A Set of similar objects: Ex: Students, courses
Transaction
A ___________________ is a program that accesses the database in response to real world events Additional Information: Many enterprises use databases to store information about their state: E.g. Balances of all depositors. * The occurrence of a real world event that changes the enterprise state requires the execution of a program that changes the database state in a corresponding way (e.g. balance must be updated when you withdraw)
Database Application
A computer program that interacts with the database by issuing an appropriate request (typically using SQL statements to the DBMS).
Relational Database
A database schema = the set of a relation schemas and constraints among relations (inter-relation constraints) * A relational database (or database instance) = a set of corresponding relation instances.
Key Constraint
A key constraint or candidate key is a sequence of attributes A1,...An (n = 1 is possible) of a relation schema, S with the following property: A relation instance s of S satisfies the key constraint IFF at most one row in s can contain a particular set of calues, a1, ... an for the attributes A1, ... An
Relation Schema
A relation schema is the heading of that table and the appicable constraints (integrity constraints) COMPOSED OF: Relation name Attribute names & domains Integrity constraints such as the values of a particular attribute in all tuples are greater than 0 (although the domain can be bigger) Default values for some attributes. A relation consists of a relation schema and a relation instance.
Relationships among entities
A relationship captures how entities are related to one another: can be thought of as verb
Relation Instance
A set of tuples
Online Analytical Processing (OLAP)
Analysis of information in a database for the purpose of making management decisions. * Analyzes historical data (terabytes) using complex queries * Summarizes the data and makes forecasts "What are the avg sales of cars by region and by year?"
Assertion
Applies to entire database (not just the individual rows of a single table) Unlike CHECK, conditions that appear inside a table definition, those in the CREATE ASSERTION statement must be satisfied by the
SELECT a1, b1 (projection)
B 3.2 B 4.8
Primary / Candidate Keys
CREATE TABLE Course( CrsCode CHAR(6), CrsName CHAR(20), DeptID CHAR(4), Descr CHAR(100), PRIMARY KEY(CrsCode), UNIQUE(DeptID, CrsName) -- candidate key) )
Table Declaration
CREATE TABLE Student(Id INTEGER, Name CHAR(20), Address CHAR(50), Status CHAR(10) )
Entities
Can be thought of as nouns: Examples: a computer, an employee, a song
Database
Collection of relations
Data Definition Language
Conceptual and external schema, * Integrity constraints and comains
Database Management System (DBMS)
DBMS is a program that manages a database. ---> Supports high-level access language (SQl) ---> DBMS interprets statements of language to perform rquested database access.
Data Model
Data and it's Structure
Online Transaction Processing (OLTP)
Day to day handeling of transactionsthat result from enterprise operation, maintains correspondence between database state and enterprise state
Attribute
Describes one aspect of an entity type * Every attribute of an entity specifies a particular property of that entity. Example: an employee entity might have a social security number
Schema
Description of data at some abstraction level
SELECT Name, CrsCode, Grade FROM Student, Transcript WHERE Transcript.StudId = Student.Id AND Status = 'senior'
GOAL: extract a table in which each row names a senior student and gives a course taken and grade It combines information from two tables: Student: Id, Name, Address, Status Transcript: StudID, CrsCode, Semester, Grade
Storage Definition Language
The directives that influence the physical schema (affects performance, not semantics)
Every relation has a super key
The set of all attributes in a schema S is always a superkey because if a legal instance of S has a pair of tuples that agree on all attributes in S then these must be identical tuples since relations are sets they cannot have identical elements
Operations
They take relations as an argument, produce new relation as a result. * Using mathematical properties, equivalence can be decided, important for query optimization: opt(T1, op2(T2)) =? op3(op4(T1), T2)
Modifying Tables
UPDATE Student SET Status = 'Soph' WHERE Id = 111111111 INSERT INTO Student(Id, name, Address, Status) VALUES (99999999, 'Bill', '432 Pine', 'senior') DELETE FROM Student WHERE Id = 111111111
WHERE
indicated which rows to retain (acts like a filter)
FROM
indicates source tables
SELECT
indicates which columns / attribute to extract from the retained rows.
Inter-relational
involve several relations e.g. the value of the attribute ID of each professor shown as Teaching a course must appear as the value of the ID attribute of some row of the table Professor (foreign Key)
Intra-relational
involves only one relation e.g. all Student IDs are unique (key)
DROP TABLE <name>;
to delete a relation