CMSC 508 Midterm

Ace your homework & exams now with Quizwiz!

Overlapping

subclasses may have overlapping sets of entities

Disjoint

subclasses must have disjoint sets of entities

Tuple

records that correspond to the attributes; horizontal

Consistency constraints

refers to the requirement that any given database transaction must change affected data only in allowed ways. refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle,

Relation schema

relation name followed by a parenthesized list of attribute names with associated domains followed by constraints

Superclass and subclass

The derived class (the class that is derived from another class) is called a subclass. The class from which it's derived is called the superclass

Physical schema

defines how the conceptual schema is actually stored in data structures on disks

Data abstraction

entities

Total participation

every entity in R participates in at least one relationship in S

Atomic domain

indivisible

UML representation

is a way of visualizing a software program using a collection of diagrams.

Weak entity

is one whose existence is dependent on another identifying entity

Primary key

key in a relational database that is unique for each record. It is a unique identifier, such as a driver license number, telephone number (including area code), or vehicle identification number (VIN).

Schema

logical structure of the Database

Null value

null signifies unknown or does not exist o All comparisons involving null are false by definition o The result of any arithmetic expression involving null is null

Partial participation

only some entities in R participate in relationships in S

Atomicity

either all actions in a transaction occur, or none

Database languages

Procedural • Specifies what data is required and how to get those data o Declarative • Specifies what data is required without specifying how

Normalization

Process of organizing the columns and tables of a relational database to minimize data redundancy • Normalization decomposes a table into smaller tables without losing information • References are defined using primary keys and foreign keys

Projection

Projection (∏) • Outputs specified attributes from all rows of the input relation, and removes duplicate tuples from the output • Let R (A,B,C) Properties • Idempotent ∏A1,...,AN(∏B1,...,BM(R )) = ∏A1,...,AN(R ) where {A1,...,AN} ⊆ {B1,...,BM} • Distributive over set union ∏A1,...,AN(R P ) = ∏A1,...,AN(R ) ∏A1,...,AN(P ) Projection does NOT distribute over intersection and set difference

One-to-many relationship

1:M

Strong entity

An entity set that has a primary key

Physical data independence

Being able to change the physical schema without changing the logical schema

Query language

SQL

Instance

The actual content of a database at a certain point in time

One-to-one relationship

1:1

Logical schema

describes the data model

Generalization

A bottom-up design process to combine a number of entity sets that share the same features into a higher-level entity set • Specialization and generalization are simple inversions of each other; they are represented in an E-R diagram in the same way

Data models

A collection of tools for describing o Data o Data relationships o Data semantics o Data constraints

Database Management System

Collection of interrelated data Set of programs to access the data An environment that is convenient, efficient, and reliable

Relational data model

Collection of tables to represent both data and the relationships • Tables have multiple columns with unique names • Tables contain records (tuples) that correspond to the attributes • Tables are called relations

Derived attribute

Computed from other attributes o Example: age given the date of birth, GPA given grades

ACID properties

Consistency - making sure that if a DBMS starts in a consistent state, that it ends up in a consistent state. Isolation - One transaction does not affect other transactions Durability - If a transactions commits then that effect is persistent

Failure recovery

DBMS provides fault-tolerant design using transactions

Entity-relationship model

Data modelling as a collection of entities and relationships • Represented diagrammatically by an entity-relationship diagram • Unified Modeling Language (UML) • Entity sets are represented by a rectangular box • Relationship sets are represented by a diamond connecting related entity sets

Unary, Binary, Ternary relationship

Degree of a relationship

Database administrator (DBA)

Designs logical /physical schemas o Handles security and authorization o Data availability, crash recovery o Database tuning as needs evolve

Data inconsistency

Different rows should different content for an attribute that should have the same value

Aggregation

E-R model cannot express relationships among relationships • Suppose we want to record evaluations of a student by a guide on a project Eliminate this redundancy via aggregation without redundancy o Treat relationship as an abstract entity o Allows relationships between relationships o Abstraction of relationship into new entity • The following diagram represents: o A student is guided by a instructor on a project o A student, instructor, project set may have an evaluation To represent aggregation, create a schema containing o Primary key of the aggregated relationship o The primary key of the associated entity set o Any descriptive attributes eval_for (s_ID, project_id, i_ID, evaluation_id) The schema proj_guide is redundant!

Entity and entity set

Entity sets • Entities are represented by their name and a set of attributes • Entities are uniquely identified by the primary key • Foreign keys guarantee referential integrity • Entities are represented graphically as rectangles, listing the attributes. The primary key is underlined 4 Example: instructor = (ID, name, salary ) student = (ID, name, tot_cred)

Mapping cardinality

Express the number of entities to which another entity can be associated via a relationship set

Functional dependencies

Given a relation R, a set of attributes X in R is said to functionally determine another set of attributes Y, also in R, (written X → Y) if, and only if, each X value is associated with precisely one Y value • A functional dependency X → Y holds over relation R if, for every allowable instance r of R: o t1ϵ r, t2ϵ r, (t1) = (t2) implies (t1) = (t2) o if the X values agree, then the Y values must also agree • An FD is a statement about all allowable relations o Must be identified based on semantics of application o Given some allowable instances of R, we can check if they violate some FDs, but we cannot tell if holds over R!

E-R model and normalization

Identify main entities and relationships (list) 2. Populate entities with attributes, identify PK 3. Draw entities, relationships, and name the roles 4. Define the cardinality and participation of the relationships 5. Refine and simplify the model 6. Translate the model to relational schemas Common errors in ERDs • Missing cardinalities, participation, roles, diamonds (NOTATION!) • Excessive decomposition (attributes -> entities), specialization (car category), ternary relationships, weak entities • Age is NOT an attribute (birthdate is) process of organizing the attributes and relations to minimize data redundancy • Normalization involves decomposing a table into less redundant and smaller tables without losing information using foreign keys • The objective is to isolate data so that modifications of an attribute can be made in just one table and then propagated through the rest of the database using the defined foreign keys

Data-manipulation language

Language for accessing and manipulating the data SQL is the most widely used query language (declarative) Example: select name from instructor where instructor.ID = '22222' Application programs generally access databases through: • Language extensions to allow embedded SQL • Application program interface (e.g., ODBC/JDBC) which allow SQL queries to be sent to a database (drivers)

Many-to-one relationship

M:1

Many-to-many relationship

M:N

Joins

Natural join (⨝) • Let r and s be relations on schemas R and S respectively • Natural join of relations R and S is a relation on schema R S as: o Consider each pair of tuples tr from r and ts from s. o If tr and ts have the same value on each of the attributes in R S, add a tuple t to the result, where t has the same value as tr on r t has the same value as ts on s • Properties: associative and commutative Theta-join (θ) • R ⨝θS = σθ(R×S) • Selection σ meeting condition θ after cross product Equijoin • θ is the equality operator Outer join • An extension of the join operation that avoids loss of information • Computes the join and then adds tuples form one relation that does not match tuples in the other relation • Uses null values o null signifies unknown or does not exist o All comparisons involving null are false by definition o The result of any arithmetic expression involving null is null Inner join • Instructor ⨝ teaches Left outer join Right outer Join Full outer join

Object-based data model

Objects, classes and inheritance are directly supported in database schemas and in the query language • Extend the relational data model by including object orientation and constructs to deal with added data types • Allow attributes of tuples to have complex types, including nonatomic values such as nested relations • Preserve relational foundations, in particular the declarative access to data, while extending modeling power • Provide upward compatibility with existing relational languages

Query processor

Parsing and translation Optimization Evaluation • Cost difference between equivalent queries can be enormous

Relationship and relationship set

Relationships define associations among entities • Relationships may generate new attributes due to the association • Relationships are represented graphically as diamonds

Selection

Selection (σ) • Returns rows of the relation that satisfy the predicate • Let R (A,B,C,D) Properties • Idempotent σA(R ) = σA(σA(R )) • Commutative σA(σB(R ) ) = σB(σA(R )) • Breaking up σA ∧ B(R) = σA (σB(R)) = σB(σA(R)) σA ∨ B(R) = σA (R) σB(R)

Simple and composite attributes

Simple (atomic) and composite attributes (divisible) Example: VCU eID vs name (First name + last name)

Data-definition language

Specification notation for defining the database schema Example: create table instructor ( ID char(5), name varchar(20), dept_name varchar(20), salary numeric(8,2)) • DDL compiler generates a set of tables stored in a data dictionary • Data dictionary contains metadata o Database schema o Integrity constraints • Primary key • Referential integrity o Authorization

Entity-relationship data model

The ER data mode was developed to facilitate database design by allowing specification of an enterprise schema that represents the overall logical structure of a database • The ER model is very useful in mapping the meanings and interactions of real-world enterprises onto a conceptual schema • Because of this usefulness, many database-design tools draw on concepts from the ER model

Identifying relationship

The relationship associating the weak entity set with the identifying entity set

Specialization and generalization

Top-down design process: sub-groupings within an entity set that are distinctive from other entities in the set • These sub-groupings become lower-level entity sets that have attributes or participate in relationships that do not apply to the higher-level entity set • Depicted by a triangle component labeled ISA e.g., instructor "is a" person, student "is a" person, but attributes and relationships for instructor and student are different

Total and partial generalization

Total: every entity in the superclass must be a member of a subclass • Partial: an entity may not belong to any of the subclasses

Operations on relations

Union () • Let r and s be relations on schemas R and S respectively • 𝑅 𝑆 = 𝑥 𝑥 ∈ 𝑅 𝑜𝑟 𝑥 ∈ 𝑆} Intersection () • Let r and s be relations on schemas R and S respectively • 𝑅 𝑆 = 𝑥 𝑥 ∈ 𝑅 𝑎𝑛𝑑 𝑥 ∈ 𝑆} Difference(−) • Let r and s be relations on schemas R and S respectively • 𝑅 − 𝑆 = { 𝑥 ∈ 𝑅 | 𝑥 ∉ 𝑆} Cartesian product(×) • Output all pairs of rows from the two input relations • 𝑅 × 𝑆 = 𝑟, 𝑠 𝑟 ∈ 𝑅 𝑎𝑛𝑑 𝑠 ∈ 𝑆}

Attribute

Vertical on table

Semistructured data model

XML - Extensible Markup Language • Originally intended as a document markup language • The ability to specify new tags, and to create nested tag structures made XML a great way to exchange data • XML has become the basis for new generation data interchange • A wide variety of tools is available for parsing, browsing and querying XML documents/data

Attribute inheritance

a lower-level entity set inherits all the attributes and relationship participation of the higher-level entity set to which it is linked

Candidate key

a minimal set of attributes which is a key

Storage manager

a program that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system. • The storage manager is responsible to the following tasks: o Interaction with the file manager o Efficient storing, retrieving and updating of data • Issues: o Storage access o File organization o Indexing and hashing

Table

a set of data elements (values) using a model of vertical columns (identifiable by name) and horizontal rows, the cell being the unit where a row and column intersect.

Relation instance

a table with a column for each attribute which is atomic; each row represents an entity

Referential integrity constraint

all foreign key constraints are enforced.

Foreign key

an attribute of a relation which serves as a primary key of another relation

Superkey

any key including a set of all attributes of a relation

Multi-valued attributes

can have more than one value at a time for an attribute

Transactions

collection of operations that performs a single logical function (atomic) in a database application

Database schema

collection of relation schemas

Nonprocedural language

computer language that does not require writing traditional programming logic. Also known as a "declarative language," users concentrate on defining the input and output rather than the program steps required in a procedural programming language such as C++ or Java.

Application program

computer program whose primary purpose is entering and retrieving information from a computerized database. Early examples of database applications were accounting systems and airline reservations systems

Single-valued attributes

considered single-valued if there is at most one value associated with it at any one point in time

Metadata

container object that keeps together many different features of a database (or multiple databases) being described

Data dictionary

contains metadata o Database schema o Integrity constraints • Primary key • Referential integrity o Authorization

Concurrency control

controls the interaction among the concurrent transactions, to ensure the consistency of the database

Domain

set of allowed values for each attribute

Database instance

set of memory structures that manage database files

Relation

set of tuples (d1, d2, ..., dn), where each element dj is a member of Dj, a data domain. a table

Keys

set of zero, one, or more attributes. The value(s) of these attributes are required to be unique for each tuple (row) in a relation

Schema diagram

shows the relations, attributes, and keys

ER diagram

shows the relationships of entity sets stored in a database. An entity in this context is a component of data. In other words, ER diagrams illustrate the logical structure of databases

Procedural language

type of computer programming language that specifies a series of well-structured steps and procedures within its programming context to compose a program. It contains a systematic order of statements, functions and commands to complete a computational task or program.

Recursive relationship set

unary with roles

Degree of relationship set

unary, binary, ternary

Set operations

union, intersection, difference, product

Discriminator attributes

used to identify other attributes of a weak entity

Null value

value used to represent an unknown piece of data.

Lossless decomposition

• A decomposition of R into R1 and R2 is lossless join if at least one of the following dependencies is in F + : o R1 R2 R1 o R1 R2 R2 • The above functional dependencies are a sufficient condition for lossless join decomposition; the dependencies are a necessary condition only if all constraints are functional dependencies 4 R1 (R) R2 (R) = R

Two- and three-tier architectures

• Two-tier architectures - application runs on client machine, communicates directly with the DB running on the server. • Three-tier architectures - application code running on the client's machine communicates with with an application server at the server, but never communicates with the database itself.


Related study sets

Assets, Liability, Owners Equity, Revenue, and Expense Accounts

View Set

human growth and development ch 10 app state

View Set

key issue 3: where is agriculture distributed?

View Set

4c Chinese Internal Medicine 2 Test 2 Pattern and Formula to Herbs

View Set

Cosmetology @MBIT ; Mrs. Menkes situational 100 Ch. 2 life skills

View Set

Intro to Psych: Chapter 7 - Thinking and Intelligence - Test Bank

View Set

Questions Missed on Practice Quizzes for Property & Casualty

View Set

Unit 3 multiple choice ap gov test

View Set

Antidepressants, Psychomotor Stimulants, and Lithium

View Set