CS4400 Quiz 2

Pataasin ang iyong marka sa homework at exams ngayon gamit ang Quizwiz!

INTERSECTION Operation

- operation done on two union compatible relations, R and S - resulting relation *includes all tuples that are in both R and S* - denoted by *R∩S* - commutative and associative

Selected Combinations

- part of a tuple relational calculus expression - *a condition to select the particular combinations of tuples to be returned* (tuples that are true are returned) - if not specified then it returns all the tuples in R - {t | R(t) AND *t.salary>30000*}

Domain Contraint

- schema-based constraint - constraint that specifies that each value within a tuple must be an atomic value from the domain

CARTESIAN PRODUCT (or CROSS JOIN) Operation

- binary set operation done on two relations, *does not have to be union compatible* - produces a new element by combining every tuple from one relation to every tuple from another relation - denoted by *X* - if relation R has n attributes and relation S has m attributes, then the *result Q has n + m attributes* - if relation R has a tuples and S has b tuples, then *result Q has a x b tuples*

Total Number of Tuples in a Relation:

- cartesian product of all the domains - *|dom(A1)| x |dom(A2)| x .... x |dom(An)|*

Relation Schema or Scheme

- denoted by *R(A1, A2,..., An)* - also called *relation intension* - R is the relation name - (A1...An) is the list of attributes - ex: STUDENT(Name: string, SSN: string, Age: int)

Universal Quantifier

- denoted by *∀* - *formula (∀t)(F) is true IFF formula F evaluates true for every tuple in the universe* - t is bound when in an ∀ clause

Aggregate Functions

- operations that cannot be be described by basic relational algebra - specifies *mathematical functions on collections of values from the DB to get statistics and summarize tables* - *SUM, AVERAGE, MAXIMUM, MINIMUM, and COUNT operations* - *<grouping attributes>ƒ<function list>(R)* - grouping attributes tells the query what to group the tuples by when doing the functions - duplicates not eliminated

Referential Integrity Constraint Violation Examples:

1. If you try to insert or update a foreign key that does not exist in the referenced primary keys 2. If you try to delete a tuple that is being referenced by a foreign key

Relational Database State

- *DB = {r1, r2,....,rm}* - a *set DB of relational states* in the relational db schema S - diagrams of this show *a snapshot of a DB state of each table (includes the tuples)*

Relational Database Schema

- *S = {R1, R2, R3, ...., Rm}* - a *set S of relation schemas* and a *set IC of integrity constraints* - diagrams of this *show the table names and the column names* (does not include tuples) - use a *DDL to define this* (most use SQL)

Join Selectivity

- *the expected size of the join result / max size of join* (Ns * Nr) - Ns: number of tuples in S - Nr: number of tuples in R

NATURAL JOIN Operation

- Variation of the EQUIJOIN operation that *gets rid of the attributes that have identical values* (only uses = operator) - denoted with a * - *don't need to specify a join condition* bc it finds the two attributes with the same name in each table and joins based off that - if the attributes don't have the same name, rename one of them first to be the same - ex: PROJ_DEP <-- PROJECT * DEPARTMENT

Attribute

- a *column* in the relation table - the *name of a role* played by some domain D - all values have the same data type - if a value isn't known or not applicable for a certain entity --> *NULL value*

Tuple

- a *row* in the relation table - *represents a fact* that typically corresponds to a real-world entity or relationship - ex: in a student table, each row corresponds to one students info

Relational Calculas

- a formal query language for the relational model - Two variations of this language: *tuple and domain* - *specifies what is to be retrieved rather than how to retrieve it; written in one declarative expression* (a nonprocedural language) - oder doesn't matter

Referential Integrity Constraint

- a relational model constraint - specified *b/w two relations* - used to maintain the consistency among tuples in the two relations - *a tuple in one relation that refers to another relation must refer to an existing tuple*

Entity Integrity Constraint

- a relational model constraint - states that *no primary key can be NULL* - specified on individual relations

Relational Algebra Expression

- a sequence of relational algebra operations

Complete Set of Relational Algebra Operations

- all of the basic relational algebra operations - *{σ, π, ρ, ∪, ∩, X}* - *doesn't include any of the join operations bc they can be done with these basic operations* - other operations can be done by combining all of these operations - ex: INTERSECTION can be done by using the union and minus, JOIN can be done by using the cartesian product and select

Schema-based Constraints

- also called explicit constraints - constraints that can be directly expressed in the schemas of the data model - *typically specified in the DDL* - ex: domain constraints, key constrains, constraints on NULL values, entity integrity constraints, and referential integrity constraints

Inherent Constraints

- also called inherent model-based constraints - constraints that are inherent in the data model - ex: a relation cannot have duplicate tuples

LEFT OUTER JOIN Operation

- an outer join operation - denoted by *⟕* - *keeps every tuple in the first (or left of symbol) relation* --> searches right relation for a match --> if no match then the attributes of right relation are filled with NULL; otherwise filled in with the matching tuple values - *RESULT <-- (EMPLOYEE⟕<ssn=mgrssn>DEPT)* (shows what department a manager manages, if employee not found then not a manager so put NULL)

RIGHT OUTER JOIN Operation

- an outer join operation - denoted by *⟖* - *keeps every tuple in the right relation* --> seaches left relation for a match --> if no match then attributes of left relation are filled with NULL; otherwise with matching tuple values - *RESULT <-- (DEPT⟖<mgrssn=ssn>EMPLOYEE)*

FULL OUTER JOIN Operation

- an outer join operation - denoted by *⟗* - *keeps all the tuples in both the left and right relation*; puts NULL anywhere there is no match for any attribute

DIVISION Operation

- binary operation - result relation *T holds the values that every attribute in S shares in R* - used in queries that involve *universal quantification* - denoted by *÷* - *T(Y) <-- R(Z) ÷ S(X)* - X, Y, Z are attribute lists - *X is a subset of Z; result attribute list Y = Z - X; Z = X∪Y*

Transition Constraints

- constraint that deals with *state changes in the database* - handled within application program, triggers, or assertions - ex: the salary of an employee can only increase

Cardinality of D

- denoted by *|D|* - the total number of values in the domain

Existential Quantifier

- denoted by *∃* - *formula (∃t)(F) is true if the formula F evaluates true for at least one tuple* - t is bound when in an ∃ clause

Relation State

- denoted by r or r(R) - also called just *relation or relation extension* - set of *ordered n-tuples* - *r(R) = {t1, t2,.., tn}*

Primary Key

- designate one of the candidate keys this; all other candidate keys become unique keys - *values of this key are used to identify tuples in a relation* - *underlined* in schema diagrams

In-line Expression

- doing multiple operations in a single relational algebra expression - ex: π<attributes>(σ<cond>(table)) (does a project and select in one step)

Relational Algebra

- formal relational model language - *defines a set of operations for the relational model* - *must write a sequence of operations in a particular order for a retrieval request *(procedural language) - result of a retrieval query is a new relation called the *result relation*

Assignment Operation

- give *a result table a name* to reference it in later queries - denoted by *<--* (left arrow) - *can also name the resulting attributes* w (col1, col2, col3, ...) - ex: RESULT(col1) <-- σ<cond>(table) (new table name will be RESULT with one column named col1)

Recursive Relationship

- happens *when a foreign key references the primary key within the same table* - ex: EMPLOYEE primary key is ssn, and each employee tuple has a supervisor_ssn which is a foreign key pointing to ssn; *relates one employee tuple to another employee tuple*

Set Null/Set Default (constraint violation handler)

- happens when a delete operation violates a constraint - *sets foreign keys to NULL or another primary key*

Cascade (constraint violation handler)

- happens when a delete operation violates a constraint - DB attempts to *cascade the deletion by also deleting the tuples that reference the deleted tuple*

Duplicate Elimination

- if attribute list for the PROJECT operation only has nonkey columns, then *only the distinct tuples are shown* (duplicates removed) - involves sorting so takes more processing power

Attribute Roles

- if many attributes have the same domain, the attribute names indicate different interpretations for the domain - ex: Home_phone and Office_phone have same domain but obviously represent two different things

Relational Data Model

- introduced by *Tedd Codd* - uses the mathematical relation concept - uses the SQL language - *represents database as a collection of relations* - relations are represented by a *table or flat file*

Atom

- makes up the formula (boolean condition) of a tuple relational calculus expression - three different forms: 1. R(t) 2. t.A op t.b (where op is in the set {=, <, <=, >, >=, !=} 3. t.A op c (op is same as above, c is a constant) - evaluates to either true or false

Key of R

- not redundant like superkeys - has the same *uniqueness constraint* as superkeys, but is *a minimal superkey* (required) - cannot remove any attributes and still have the uniqueness property hold - usually formed by one attribute

OUTER UNION Operatoin

- operation developed to *take the union of two relations that are not union compatible* - need to be *partially compatible* tho --> some of their attributes are union compatible - attribute list X is the list of union compatible attributes (also need same name), and then Y and Z are the non compatible attributes - *T(X, Y, Z) <-- R(X, Y)∪S(X,Z)* - tuples with the same values for the X attributes will only appear once - tuples that appear only in R will have NULL values for the Z attributes - tuples that appear only in S will have NULL values for the Y attributes

UNION Operation

- operation done on two union compatible relations, R and S - resulting relation *includes all tuples in R and all tuples in S* (duplicates removed) - denoted by *R∪S* - commutative and associative

MINUS (or SET DIFFERENCE) Operation

- operation done on two union compatible relations, R and S - resulting relation *includes all tuples that are in R but not in S* (whichever one comes first) - denoted by *R-S* - not commutative

Recursive Closure

- operation that cannot be be described by basic relational algebra - *applied to a recursive relationship b/w tuples of the same type*

THETA JOIN Operation

- operation used to *combine related tuples from two relations into single longer tuples* - important for relationships b/w relations - denoted by *⋈* - *R⋈<cond>S* - only tuples that are true for the condition appear - condition is usually comparing values of an attribute in R with values of an attribute in S (ex: ssn=mgr_ssn) - a cartesian product followed by a select statement can be replaced with this operation

Range Relation

- part of a tuple relational calculus expression - *specifies the relation (aka range) to look at* - if not specified then the range is the universe :-o - {t | *R(t)*}

Requested Attributes

- part of a tuple relational calculus expression - the *set of attributes to be retrieved* - if not set then all the attributes of R are returned - {*t.First, t.Last* | R(t) AND t.age>21}

Time-invariant Property

- property that table keys must hold - *key of a table must continue to be distinct when we insert new tuples into the relation* - ex: if we choose name to be the key then it could eventually break the uniqueness property bc two people can have the same name - solve this by picking something that will always be unique

Current Relation State

- reflects only the *valid tuples that represent a particular state of the real world* - *changes often* as tuples get updated - relation schema (table setup) doesn't change often because it is not common to add new attributes (columns)

Alternate Definition of a Relation

- standard def --> ordering of values in a tuple and the ordering of attributes is important - alt def uses a *mapping for each tuple so ordering doesn't matter as much* - has column name and then tuple value - ex: t = <(Name, Kelsey Murdock), (Age, 21), ...> (tuple can be ordered in anyway since column name is included)

The Closed World Assumption

- states that *the only true facts in the universe are those present within the extension state of the relations* - each value in a tuple is a fact about a certain entity and any other combination of values is false for that entity - useful for *queries based on relational calculus*

Selection Condition

- the boolean condition in the SELECT operation - *format: <attribute name><comparison op><constant val>* (last param can also be another attribute name) - use AND, OR, or NOT to specify multiple boolean conditions in one operation - ex: Dno=4; Salary>30000

Degree or Arity of a Relation

- the total number of attributes in a relation == n - *number of columns in a table*

Join Attribute

- the two identical attributes in a NATURAL JOIN operation - resulting relation only keeps one of these attributes to avoid the redundancy like in EQUIJOIN

Ordering of Tuples in a Relation

- this is *not a part of the relation definition* - *ordering of tuples (rows) doesn't matter* but can be specified - ex: Ordering rows by name in alphabetical order

Outer Joins

- type of *join operation developed so that users can keep any tuples they need regardless of whether or not they are true for the join condition* - can keep all tuples from R, all tuples from S, or all tuples in both

Inner Joins

- type of *join operation that only shows matching tuples in the result* - tuples that don't match the join condition or tuples with NULL values are eliminated - includes the *theta join, equijoin, and natural join operations*

Flat Relational Model

- type of relational model - based off the *first normal form assumption --> assumption that every value is atomic* - Multivalued attributes must be represented by separate tuples - Composite attributes represented by multiple attributes

Superkey of R

- type of schema-based constraint; more specifically a *uniqueness constraint* - no two distinct *tuples in any state can have the same value for a subset of attributes* - *default is the set of all attributes*, but a smaller subset can be defined - can have redundant attributes tho

SELECT Operation

- unary operation (operate on one table) - used to *choose a subset of the tuples* from a relation that *satisfy a boolean condition* - denoted by *σ<boolean condition>(R)* - every tuple that is true for the condition is returned in the resulting relation - multiple of these operations can be applied in any order to get the same result; *Commutative*

Foreign Key

- used to define a referential integrity constraint - *when one table has a column(s) that refers to the primary key of another table* (can also refer to the primary key of its own table) - *the primary key must exist in the referenced relation* (can be NULL) - if these two ^ properties hold, then the referential integrity constraint holds - *drawn with a line pointing to the referenced primary key*

Valid vs. Not Valid DB State

- valid: state satisfies all the constraints in the defined set of integrity constraints - not valid: does not obey the IC

Atomic Values

- values not divisible into smaller components - *Composite and multivalued attributes are not allowed*

Tuple Relational Calculus

- variation of relational calculus - based on specifying a number of *tuple variables that range over a particular DB relation* - *{t | R(t)}* --> simple query returns all the tuples that are in relation R - SQL based on this

Candidate key

- when a relation schema has multiple keys - each key is called this

Union Compatible (or type compatible)

- when two relations have the same number of attributes and each corresponding pair of attributes has the same domain

Entity Integrity Constraint Violation Example:

1. If you try to insert or update a primary key with a NULL value

Domain Constraint Violation Example:

1. If you try to insert or update a value that is not in the domain of Ai

Key Constraint Violation Example:

1. if you try to insert or update a primary key that already exists

EQUIJOIN Operation

- A JOIN operation where the *only comparison operator used is the "="* - result relation *always has two or more attributes that have identical values in every tuple*

Update Operation

- DB modification operation - *changes the values of one or more attributes in a tuple* (or multiple tuples) - has the potential to violate domain, key, entity integrity, and referential integrity constraints

Delete Operation

- DB modification operation - *deletes a record (tuple)* from the table - can *violate referential integrity only* - *handles violations with restrictions (reject), cascades, set null, or a combo*

Insert Operation

- DB modification operation - provides *a list of attribute values for a new tuple t, that is to be inserted into relation R* - can violate domain, key, entity integrity, and referential integrity constraints - *constraint violations handled by rejecting operation*

Generalized Projection Operation

- operation extends the projection operation by allowing functions of attributes to be included in the attribute list - can't be described by the basic relational algebra operations - *π<F1, F2, ..., Fn>(R)* - F1, F2,..., Fn are functions over the attributes; *may include arithmetic operators or constant vals* - Ex: π<0.25 * Salary>(EMPLOYEE) (takes taxes out of the salary from the Salary attribute)

Domain

- Denoted with *D(Ai)* - a set of atomic values - *specifying the data types of values that can appear in each column, and the column name itself* - ex: Employee_age: age of an employee; must be an int b/w 15 and 80

RENAME Operation

- Operation that does the same thing as the assignment operation but is formally defined - denoted by *ρS(B1, B2,..., Bn)(R) or ρS(R) or ρ(B1, B2,...,Bn)(R)* - S is the new table name, (B1,B2,...,Bn) are the new attribute names

Selectivity of the Selection Condition

- The fraction of tuples selected by a selection condition

PROJECT Operation

- Unary operation - *selects certain columns* from a table and discards the rest - denoted by *π<attribute list>(R)* - attribute list is the desired columns to show in the same order as they appear in the list - *Not commutative*; order of calling does matter

Domain Relational Calculus

- Variant of relational calculus - *QBE* based off of this (query by example) - *variables range over single values from domains of attributes* - {x1,x2,..,xn | R(x1, x2,...xj)} - first list are the attributes to be retrieved - list <x1,x2,...xj> states that there must be a tuple with these values in R


Kaugnay na mga set ng pag-aaral

Accounting Test Chapter 5, 6, 7, & 8

View Set

ATI Pharm Made Easy 4.0 Musculoskeletal

View Set

Ethics Chapter 7 - Intentional & Quasi-Intentional Torts

View Set

Chapter 7 Virtualization and cloud computing

View Set