DB 1

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

arity of a relation

the number of attributes it has

LInear hashing

the one that expands linearly (if index 0 is full, start at index M [2M + i where i is 0]. Can be maintained by a file-load factor instead of just overflows

Total participation with subclass

the superclass HAS to be one of the subclasses

Relation state

the tuples existing in a relation. A subset of the cross-products of all attribute domains

Seek time

time for disk head to find track

block transfer time

time needed to read info from block into main memory

Mappings

transforming requests/results between three DB levels

A database represents some aspect of the real world, sometimes called the

universe of discourse miniworld

A state that satisfies the structure and constraints specified in the schema is called a ________ state.

valid

Traditional file processing

writing custom programs to access data stored in files. User defines and implements files needed for a specific software app as part of programming the app

DBMS

- allows user to create database - specify types of data - relationships among data elements - constraints on what is stored

Data normalization is great, but sometimes we use _________ to improve the performance of queries, like storing pieces of data in multiple places, called _________________

1. controlled redundancy 2. denormalization

View

A composite of a subset of columns from one or more related tables. Contains virtual data derived from the database files but is not explicitly stored

Online transaction processing applications

A concurrently available application that needs to specify holds on data per user efficiently and correctly (online airline booking example)

Data model

A form of data abstraction / conceptual representation that supports basic operations (retrievals and updates on the database) and dynamic behavior (allows database designer to specify a set of valid operations allows on DB objects, like COMPUTE_GPA and stuff). Hides implementation details form users

Operation

A function on data built into the DB definitions. Specified as an interface (inputs and the operation name) and the implementation.

Stored Procedure

A set of SQL statements that can be parameterized and executed ("macro").

Table

A set of data records of the same format, divided into columns that all contain the same kind of data, and rows of related records.

Domain

A set of valid atomic values

Index

A type of access path, allows direct access to data using an index term or a keyword

Four Desirable Properties of Transactions

ACID properties Atomicity Consistency preservation Isolation Durability

Partial participation constraint

An entity can exist without a relationship to the other entity

Total participation constraint

An entity cannot exist without a relationship to the other entity

Dense vs sparse index

An entry for every value versus entries for just some

SQL CHECK Clause

Applies to each tuple individually Example: Dnumber INT NOT NULL CHECK (Dnumber > 0 AND Dnumber < 21); CHECK (Dept_create_date <= Mgr_start_date);

Relationship

Association among entities

Uniqueness constraint

Attributes must be distinct between entities.

Difference between B+ tree and B tree

B tree contains pointers at intermediate levels, B+ tree only has data pointers in leaves. Leaves point to their right neighbors too in B+ trees

SQL Create statement

CREATE TABLE [name] (columns, constraitns, etc);

Which of the CRUD operation(s) result in a change of state in a database?

CUD (not retrieve)

A row in a relational model

Called a tuple. represents a collection of related data values. Typically corresponds to real-world entity or relationship.

Definitions of attributes

Can be an ordered list (this is customary) or an unordered list of (attribute, value) pairs.

Data Independence

Capacity to change the schema at one level of a database system Without having to change the schema at the next higher level

Types of end users

Casual end users (occasionally access the DB, usually through specific interfaces) Naive/parametric end users (constantly query/update the DB using canned transactions, like bank tellers and stuff) Sophisticated end users (engineers/scientists/etc that use the DBMS often to execute complex requirements) Standalone users (maintain personal DBs and use GUIs and packages, like someone using a financial software package)

Query processing/optimization

Chooses an efficient query plan for each query based on storage structures

Category

Class that is a subset of the union of n defining superclasses

Subclass

Class whose entities must always be a subset of the entities in another class

Modules and who develops them?

Components of a DBMS for implementing the catalog, querying, interface processing, accessing data. Developed by DBMS System designers/implementers

Semantic integrity constraints

Constraints on the type of values for an attribute

What are nine advantages of using a DBMS and the capabilities that a good DBMS should possess?

Controlling redundancy. Restricting unauthorized access. Providing persistent storage for program objects. Providing storage structures and search techniques for efficient query processing. Providing backup and recovery. Providing multiple user interfaces. Representing complex relationships among data. Enforcing integrity constraints. Permitting inferencing and actions using rules and triggers

Buffering/caching

DBMS often uses this to keep parts of the database in memory to avoid queries loading/unloading data from memory

Who are the three categories of people whose jobs are associated with the design, development, and operation of the DBMS software and system environment.

DBMS system designers and implementers. Tool developers. Operators and maintenance personnel.

In many DBMSs where no strict separation of levels is maintained, this one language is used by the DBA and database designers to define both the conceptual and internal schemas. Otherwise, it is used to specify the conceptual schema only.

Data Definition Language (DDL)

Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and modification of the data. This language provides these manipulations.

Data Manipulation Language (DML)

Self-describing data

Data contains item names and data values together in one structure (usually used in NOSQL)

Mirroring/Shadowing

Data is written redundantly to two discs that are treated as one logical disc.

Who are the four categories of people whose jobs involve the day-to-day use of a large database

Database administrators, database designers, end users, and system analysts and application programmers (software engineers).

Storage definition language

Define internal schema only

Data definition language (DDL)

Define the internal and conceptual schema

The DBMS is a general-purpose software system that facilitates what four processes that involve databases?

Defining, constructing, manipulating, and sharing databases

Physical data model

Describe how files are stored on computer.

Atomic

Describes a value that is indivisible

External/view level

Describes part of the database that a particular user group is interested in

Internal level

Describes physical storage structure of the database

Conceptual level

Describes structure of the whole database for a community of users

Primary key

Designated among candidate keys. Other candidate keys are designated as unique keys

Predicate-defined subclass

Displayed by writing the predicate condition next to the line that connects the subclass to the specialization circle. An attribute equals a particular value.

What constraints can Insert violate?

Domain constraints Key constraints Entity integrity Referential integrity

Weak Entities

Don't have key attributes of their own. They are identified by being related to another entity. Always has total participation constraint. If parent entity is deleted, all related weak entities are deleted too.

Backup and recovery system

Ensures DB can recover, such as, in the middle of a transaction or by disk backup etc

Isolation property of transactions

Ensures that each transaction appears to execute in isolation from other transactions

Atomicity property of transactions

Ensures that either all the database operations in a transaction are executed or none are

Specialization hierarchy

Every subclass participates as a subclass in ONLY ONE class/subclass relationship

Transaction

Executing process that includes multiple DB accesses

A Primary Key must be composed of exactly one Attribute in a Relation schema (TRUE/FALSE)

FALSE

A database whose state violates only one Integrity Constraint may still be possible to be in a valid state. (TRUE/FALSE)

FALSE

Tuples in a relation are ordered, TRUE OR FALSE

FALSE. A relation state is a SET of tuples

Sequential file

File containing ordered records

Heap/Pile file

File containing unordered records

Difference between STUDENT and STUDENT(Name, Ssn, ...)

Former is the current state and latter is only the schema

Multiple hashing

Hashing collision method that applies a second hash function if a collision happens. If that collides, use open addressing or a third hash function

Chaining

Hashing collision resolution method that dumps records in an overflow location and sets a pointer of the occupied hash address location to the address of the overflow location

Open addressing

Hashing collision resolution method that proceeds in order from the specified address and checks each position until an open one is found

The overhead costs of using a DBMS are due to which three general areas?

High initial investment in hardware, software, and training. The generality that a DBMS provides for defining and processing data. Overhead for providing security, concurrency control, recovery, and integrity functions.

Entity-relationship ER model/diagram

High-level conceptual model and diagrammatic notation associated with the ER model

Two types of distributed DDBMS software

Homogenous (same DBMS at all sites) and heterogenous (different DBMS at each site)

Database designers

Identifying data to be stored and choosing structures to represent/store the data. Develop views. Collect requirements for the entities to be stored

Deductive database systems

Inferring new information from the stored database facts

Types of Model Constraints

Inherent model-based (constraints inherent in the data model) Schema-based (expressed in schemas of data model by specifying them in the DDL) Application based (business rules not expressed in the schema, but expressed by application program)

What four main types of actions involve databases?

Insert (Add) Delete (Remove) Modify (Update) Retrieve (Query)

Program-operation independence

Interface includes operation name and data types of its arguments, so implementation can be changed without affecting interface

What are the three levels of the Three-Schema Architecture?

Internal level Conceptual level External level

Levels of three-schema architecture

Internal level Conceptual level External/view level

Basic SQL constraints

Key and referential integrity constraints Restrictions on attribute domains and NULLs Constrains on individual tuples within a relation

List four common Database System Utilities discussed in Chapter 2.

Loading (load data files into DB) Backup DB storage reorganization (reorganize DB or make new access paths) Performance monitoring

What are the two types of data independence?

Logical data independence Physical data independence

Of the two types of Data Independence, which is harder to achieve? Why?

Logical data independence is harder to achieve because it allows structural and constraint changes without affecting application programs.

List the nine DBMS Interfaces discussed in the Chapter 2.

Menu-based for web clients or browsing Apps for mobile devices Forms-based GUIs Natural Language Keyword-based DB search Speech I/O Interfaces for Parametric users Interfaces for DBA

Stored Procedures

More involved procedures to enforce rules

Three things DBMS offer

Multiuser usage Encryption and authentication API to interact with it

Entity integrity constraint

No NULL primary key

Key constraints

No two tuples can have the same combination of values for all their attributes

Why include class/subclass specializations in the model?

Not all attributes/relationships may apply to all members of the entity type

What constraints can Delete violate

Only referential integrity. CASCADE, RESTRICT, SET NULL/DEFAULT

Schema evolution

Ops that change the schema. Modern DBMS makes it possible to do so while the DB is operational

Tools and who develops them?

Optional software packages that facilitate DB modeling and design, DB system design, and improved performance. Tool developers develop these.

End users

People whose jobs require access to DB for querying, updating, generating reports.

What are five additional implications of using the database approach, i.e. those that can benefit most organizations

Potential for enforcing standards. Reduced application development time. Flexibility. Availability of up-to-date information. Economies of scale.

How to specify a relation schema R

R(A1, A2, A3... An) where Ai is attribute i (name of a role played by domain D in the relation schema R)

Know the following

Referential integrity constraint Uniqueness/Key constraint Business rules

Attribute

Represents some property of interest about an entity

Security and authorization subsystem

Responsible for creating username/passwords for accessing the database according to certain access restrictions

Triggers

Rule activated by updates to the table

What are the four main characteristics of the database approach versus the file-processing approach?

Self-describing nature of a database system. Insulation between programs and data, and data abstraction. Support of multiple views of the data. Sharing of data and multiuser transaction processing.

Relational database schema

Set of relation schemas S = {R1, R2, ..., Rm) and a set of integrity constraints IC.

Relational database state

Set of relation states DB = { r1, r2,.... rm}

Class

Set or collection of entities.

In spite of the advantages of using a DBMS, there are a few situations in which a DBMS may involve unnecessary overhead costs that would not be incurred in traditional file processing. These four situations are:

Simple, well-defined database applications that are not expected to change at all. Stringent, real-time requirements for some application programs that may not be met because of DBMS overhead. Embedded systems with limited storage capacity, where a general-purpose DBMS would not fit. No multiple-user access to data.

Union constraint on specialization

Single superclass/subclass relationship with more than one superclass (disjoint)

Privileged software

Software for applying big changes to the DB like creating new accounts

Spanned vs unspanned

Spanned -- records can span more than one block, with the tail end of a block pointing to the rest of the record in the next block Unspanned -- if there's not enough space for one more, don't store another one

Canned transaction

Standard DB operations like queries and updates

In databases where a clear separation is maintained between the conceptual and internal levels, this language is used to specify the internal schema.

Storage Definition Language (SDL)

Program-data independence

Structure of data files is stored in DBMS catalog separately from access programs

What features are in EER diagrams that aren't in ER diagrams

Sub-/super-classes Specialization/generalization Category/union type Attribute/relationship inheritance

User-defined subclass

Subclass is not defined by a predicate

Specialization lattice

Subclasses can be a subclass of more than one class/subclass relationship (multiple inheritance).

Overlap constraint on specialization

Subclasses of a specialization don't have to be disjoint

Disjointness constraint on specialization

Subclasses of a specialization must be disjoint (only one specialization per entity)

Key

Superkey of Relation r.

Query optimizer

Takes compiled query/form and makes an efficient search plan

Dynamic hashing

The one that looks like a binary tree that splits on bit values until the leaf node points to the address of the bucket

Extendible hashing

The one with a directory indexed by higher order bits. Index by the first d bits, but each bucket can be indexed by the first <d bits until they overflow, at which point, they're split.

How do you represent inheritance in an EER diagram

The pitchfork U where the forked end points toward the parent

Relationship participation constraints

Total and partial

A Primary Key has a unique value for each Tuple (i.e. Row) in a given Relation (i.e. Table). (TRUE/FALSE)

True

What two properties does a key satisfy

Uniqueness (no two tuples can have the same values for all attributes in key) Minimal superkey (cannot remove any attributes and still have uniqueness constraint hold)

Do key attributes have to be unique?

Uniqueness property must hold for every entity set of the entity type by key attributes

How do you match queries on aggregate values?

Use the HAVING keyword (after WHERE)

What constraints can update violate

Usually causes no problems unless the primary/foreign key is being updated

Referential integrity constraint

Value of Foreign key in a tuple is the primary key of another tuple OR is NULL

In true Three-schema architectures, this language specifies user views (perspectives) and their mapping to the conceptual schema.

View Definition Language (VDL)

Correlated nested query

When a condition in the where clause of a nested query references an attribute of a relation declared in the outer query, the two are correlated

Does the order of table creation matter?

Yes, when foreign keys are involved

Attribute-defined subclass

You can tell the specialization of a class by a particular attribute that the subclass may have.

Relational model represents data as

a collection of relations

Consistency of a transaction

a correct execution of this transaction moves the DB from one consistent state to another

low-level/procedural DML

a data sublanguage embedded in general purpose host language. uses record-at-a-time processing to work on one record at a time

High-level/nonprocedural DML

a query language used on their own to specify complex DB ops at batch levels. Set-at-a-time. DECLARATIVE LANGUAGE because it says what to retrieve instead of how

Valid state

a state that satisfies ALL integrity constraints

Overflow/transaction file

a supplementary file to sequential files -- insertions are just thrown in there and merged in later

Atomicity of a transaction

a transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all.

Isolation of a transaction

a transaction should not make its updates visible to other transactions until it is committed; this property, when enforced strictly, solves the temporary update problem and makes cascading rollbacks of transactions unnecessary

Centralized DBMS

all DBMS ops executed on ONE machine

Data manipulation language

allows CRUD

Schema construct

an object in the schema (which I think is a table)

rotational delay

beginning of block rotates until head finds it

RAID 6

block striping with TWO parity blocks

RAID 5

block striping with parity block distributed

Does DBMS allow you to manipulate the schema or the actual data?

both!

Logical data independence

capacity to change the conceptual schema without having to change external schemas or application programs

Physical data independence

capacity to change the internal schema without having to change the conceptual schema

High-Level/Conceptual data models

closer to the way users perceive data

Entity type

collection of entities that have the same attributes

What is a DB

collection of related data over a miniworld that has inherent meaning

Stored data manager

conducts low-level IO between main-memory and disk

Virtual data

data derived from the DB files but not explicitly stored

A database design that stores each logical data item in only one place in the database is called...

data normalization

Specialization

defining a set of subclasses for an entity type defined on the basis of some distinguishing characteristic of the entities in the superclass. They can define specific attributes or specific relationship types

Low-level/physical data models

describe the details of how data is stored on computer storage media

Metadata

describes data and is stored in the DB

System analysts

determine requirements of end users and create canned transactions for them. Also application programmers, software developers, software engineers

Data striping

distributing data over multiple discs to make them look like a single, large, fast disc (bit-level striping, block-level striping, etc)

Representational data models

easily understood by end users also similar to how data is organized in computer storage hides many detail of data storage on disk, but can be implemented on a computer system directly

Query compiler

evaluates queries typed in interactively.

RAID 1

exact copy of data on both discs. No striping

Runtime DB Processor

executes all queries and updates catalog with statistic changes

Blocking Factor

for fixed-length records of size R and disc of block size B, floor(B/R) which is the number of records per block.

Traditional database systems often suffered from the so-called [a] problem, since the data structures provided by the DBMS were incompatible with the programming language's data structures.

impedance mismatch

Three types of constraints on entities

implicit (inherent in the data model) explicit (defined by the schema) semantic (contained in the business rules)

Clustering index

index on non-distinct key (that the table is ordered on) and has one entry per distinct value -- points to the block where that distinct value first shows up. BAD PERFORMANCE DUE TO REDIRECTION

The DBMS must provide specialized data structures and search techniques to speed up disk search for the desired records. Auxiliary files called _____ are often used for this purpose.

indexes

Primary index

indexes on primary key and has one entry per DISK BLOCK. entry is for the first record in that block

Column names in a relational model

interpret meanings of the values in tuple attributes

The DBMS stores the descriptions of the schema constructs and constraints—also called the ________ —in the DBMS catalog so that DBMS software can refer to the schema whenever it needs to.

metadata

Type of higher-level implementation data model

object-data model

Secondary index

on a potentially non-ordered potentially non-distinct value. dense index that points to the BLOCK where the value is contained -- indexed values are stored in order in index

Durability of a transaction

once a transaction changes the DB and the changes are committed, these changes must never be lost because of subsequent failure

Client module vs Server module

one runs on user-facing end, other is responsible for data storage, access, etc

RAID 0

only data striping. No redundancy

Stored procedures

part of the DB definition and are invoked when certain conditions are met

active DB system

provides active rules that auto initiate actions when certain events/conditions occur

Open Database Connectivity (ODBC)

provides an API for direct DB access that can run on the client side (LIKE JDBC)

Entity

real-world object or concept

RAID

redundancy (reliable), data striping (performance), parity (error detection).

Candidate key

relation schema can have more than one key -- any attribute is a candidate key

Operators and maintenance personnel

responsible for actual running and maintenance of the hardware and software environment for the DB system

Database administrator

responsible for administering the database, the DBMS, and related software. Responsible for authorizing access to database, coordinating and monitoring its use, and acquiring software and hardware resources as needed

In a data model, it is important to distinguish between the description of the database and the database itself. The description of a database is the database _____________

schema

access path

search structure that makes the search for a particular DB record efficient, such as indexing or hashing

A DBMS should provide a ________, which the DBA uses to create accounts and to specify account restrictions.

security and authorization subsystem

Relation

set of tuples of arity n. Each tuple t is an ordered list of n values, where each is an element in the domain of Ai or a NULL value.

Concurrency control

several users trying to update the same data do so in a controlled manner and the result of the updates is correct

View definition language

specifies user views/mappings to conceptual schema

DBMS catalog (and what does it store)

stores info such as the structure of each file, the type and storage format of each data item, and various constrains on the data -- called METADATA

Access path

structure that makes the search for particular database records efficient

Schema

structure/framework of the DB

precompiler

takes DB queries written in progrzmming languages and extracts DML commands while sending the rest of hte program to the compiler

Deductive DB systems

the ability to deduce/infer new information from stored DB facts. Done by specifying declared rules.

database state

the data in the database at a point in time

What is used to classify a DBMS

the data model (relational versus NOSQL) number of users (single vs multiuser systems) number of sites the DB is distributed over (centralized vs distributed) cost (opensource versus other) Access path options (reverse structures vs others) General vs specific purpose

Intension

the database schema

Extension

the database state

Superkey

the entire tuple

relational data model, network/hierarchal models, record-based data model

the first is new, the next two are old, and the last is another representational data model


Set pelajaran terkait

MGT 370 Chapter 7-15 Reading Notes

View Set

CH1 histoire : La 1ere guerre mondiale

View Set

2214 RealizeIT Module 5 Questions

View Set

Intro to Kinesiology ch 10 butte college

View Set

MODULE 4: ADVENT OF A NATIONAL HERO

View Set

Chapter 03: Gross Income: Inclusions and Exclusions

View Set