WGU C192 Data Management for Programmers

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

OODM (object oriented data model)

A (logical) data model that captures the semantics of objects supported in object-oriented programming

Equijoin

A Theta join where the predicate F contains only equality (=)

package

A collection of procedures, functions, variables, and SQL statements that are grouped together and stored as a single program unit

internal schema

A complete description of the internal model, down to the physical storage of data. Each DB has exactly one.

timestamping

A concurrency control protocol that orders transactions in such a way that older transactions (transactions with smaller timestamps) get priority in the event of conflict

dimensional model

A database design composed of one table with a composite primary key, called the fact table, and a set of smaller tables, called dimension tables. All natural keys are replaced with surrogate keys.

data mart

A database that contains a subset of corporate data to support the analytical requirements of a particular business unit or to support users who share the same requirement to analyze a particular business process

temporal database

A database that contains time-varying historical data with the possible inclusion of current and future data and has the ability to manipulate this data

timeout

A deadlock technique where a transaction that requests a lock will wait for only a system-defined period of time before being automatically restarted

deadlock detection and recovery

A deadlock technique where the DBMS allows deadlock to occur but recognizes occurrences of deadlock and breaks them, usually through a Wait-For graph

deadlock prevention

A deadlock technique where the DBMS looks ahead to determine if a transaction would cause deadlock, and never allows deadlock to occur

star schema (star join)

A dimensional data model that has a fact table in the center, surrounded by *denormalized* dimension tables

starflake schema

A dimensional data model that has a fact table in the center, surrounded by normalized *and* denormalized dimension tables

snowflake schema

A dimensional data model that has a fact table in the center, surrounded by normalized dimension tables

fact constellation

A dimensional model, which contains more than one fact table sharing one or more conformed dimension tables

dynamic optimization

A form of query optimization where decomposition and optimization are carried out every time query is run. Selects best optimum strategy, but has a performance overhead.

static optimization

A form of query optimization where the query is parsed, validated, and optimized at once, similar to a compiler. Runtime overhead removed, but may no longer use optimum strategy when run.

two phase locking (2PL)

A locking protocol where all locking operations precede the first unlock operation in the transaction. A transaction must acquire a lock (read or write, as needed) on an item before operating on it. Once a lock is released, no new locks can be acquired.

dimensionality modeling

A logical design technique that aims to present the data in a standard, intuitive form that allows for high-performance access

base relation

A named relation corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database

locking

A procedure used to control concurrent access to data. When one transaction is accessing the database, this may deny access to other transactions to prevent incorrect results

recursive relationship

A relationship type in which the *same* entity type participates more than once in *different roles*

operational data store (ODS)

A repository of current and integrated operational data used for analysis. May be a staging area for a data warehouse. Provides the ease of use of a RDBMS while remaining distant from the decision support functions of a data warehouse

nonserial schedule

A schedule where the operations from a set of concurrent transactions are interleaved

serial schedule

A schedule where the operations of each transaction are executed consecutively without any interleaved operations from other transactions

sagas

A sequence of (flat) transactions that can be interleaved with other transactions

row type

A sequence of field name/data type pairs that provides a data type to represent the types of rows in tables, so that complete rows can be stored in variables

shared subclass

A subclass with more than one superclass

data warehousing

A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision making process

candidate key

A superkey such that no proper subset is a superkey within the relation (atomic superkey)

timestamp

A unique identifier created by the DBMS that indicates the relative starting time of a transaction

consistency

ACID component. A transaction must transform the database from one consistent state to another consistent state

atomicity

ACID component. The "all or nothing" property-a transaction is an indivisible unit that is either performed in its entirety or is not performed at all

durability

ACID component. The effects of a successfully committed transaction are permanently recorded in the database and must not be lost because of a subsequent failure

isolation

ACID component. Transactions execute independently of one another.

Columns, constraints, defaults

ALTER TABLE allows you to add and remove these elements from a table

discretionary access control (DAC)

Access control based on assigning individual rights to specific users via simple GRANT and REVOKE SQL statements.

mandatory access control (MAC)

Access control based on system-wide policies that cannot be changed by individual users. Each database object is assigned a security class, users assigned clearances. Rules determine whether a user can read/write an object based on their clearance.

external schema

Also called a subschema, this is a specific view of the data utilized by an end user

superkey

An attribute, or set of attributes, that uniquely identifies a tuple within a relation

savepoint

An identifiable point in a flat transaction representing some partially consistent state, which can be used as an internal restart point for the transaction if a subsequent problem is detected

data model

An integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization (structural, manipulative, constraints)

conformed dimension

Any dimension that is to be represented in more than one dimensional model—and hence data mart

CREATE OR REPLACE PACKAGE BODY PackageName AS

Code to create a package body

CREATE OR REPLACE PACKAGE PackageName AS

Code to create a package specification

CREATE OR REPLACE PROCEDURE name (...) AS

Code to create a stored procedure

CREATE TRIGGER TriggerName When ON

Code to create a trigger

Fan trap

Connection trap where a model represents a relationship between entity types, but the pathway between certain entity occurrences is ambiguous

chasm trap

Connection trap where a model suggests the existence of a relationship between entity types, but the pathway does not exist between certain entity occurrences

link analysis

Data mining technique that aims to establish associations between the individual records, or sets of records, in a database

deviation detection

Data mining technique that identifies outliers, which express deviation from some previously known expectation and norm

predictive modeling

Data mining technique that uses observations to form a model of the important characteristics of some phenomenon

database segmentation

Data mining technique used to partition a database into an unknown number of segments/clusters of records that share a number of properties and so are considered to be homogeneous

temporal data

Data that changes over time

Corporate Information Factory (CIF)

Data warehouse approach by Inmon which begins with a data model of all data, creates the warehouse, then uses the warehouse to feed data marts. Uses traditional techniques.

Business Dimensional Lifecycle (BDL)

Data warehouse approach by Kimball, starts by identifying info requirements and business processes to create Data Warehouse Bus Matrix, which is then used to create data marts, which are finally integrated into a warehouse. Uses dimensionality modeling.

centralized approach

Database design approach where requirements for each user view are merged into a single set of requirements for the new database system. A data model representing all user views is created during the database design stage.

view integration approach

Database design approach where requirements for each user view are merged into a single set of requirements for the new database system. A data model representing all user views is created during the database design stage.

native XML database (NXD)

Defines a logical data model for an XML document that is used to store and retrieve XML. The XML document must be the logical unit of storage, although the underlying physical storage model may be different

vertical partitioning

Distributes the attributes of a relation across a number of (smaller) relations (the primary key is duplicated to allow the original relation to be reconstructed)

horizontal partitioning

Distributes the tuples of a relation across a number of (smaller) relations

disjoint constraint

EER constraint that describes the relationship between members of the subclasses and indicates whether it is possible for a member of a superclass to be a member of one, or more than one, subclass

participation constraint

EER constraint that determines whether every member in the superclass must participate as a member of a subclass

aggregation

EER technique that represents a "has-a" or "is-part-of" relationship between entity types, where one represents the "whole" and the other the "part."

composition

EER technique; a specific form of aggregation that represents an association between entities, where there is a strong ownership and coincidental lifetime between the "whole" and the "part."

specialization

EER technique; the process of maximizing the differences between members of an entity by identifying their distinguishing characteristics

generalization

EER technique; the process of minimizing the differences between entities by identifying their common characteristics

identify facts

Final step in dimensionality modeling, where you decide on the columns in the fact table (i.e., the facts)

referential integrity

If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null

True

In a CREATE TABLE statement, after specifying columns, remaining clauses can be proceeded by "CONSTRAINT name" to allow those to be dropped using ALTER TABLE

location transparency

In a DDBMS, allows a user to access data without caring where it's located, although they may need to be aware of fragmentation

distribution transparency

In a DDBMS, allows a user to perceive a DDBMS as a single logical entity

transaction transparency

In a DDBMS, ensures all distributed transactions maintain the distributed database's integrity and consistency

failure transparency

In a DDBMS, ensures atomicity for distributed transactions in events of failure

performance transparency

In a DDBMS, ensures it performs as well as a centralized DBMS

fragmentation transparency

In a DDBMS, when a user doesn't need to know data is fragmented

entity integrity

In a base relation, no attribute of a primary key can be null

exclusive lock (write)

Lock where a transaction can both read and update an item

shared lock (read)

Lock where a transaction can read an item but not update it

nested transaction model

Model where a transaction is viewed as a collection of related subtasks, or subtransactions, each of which may also contain any number of subtransactions

Locking and timestamping

Name two pessimistic concurrency control techniques

. (dot)

Operator used to reference items declared in a package

connection trap

Problem that arises in ER modeling when the meaning of a relationship is not fully understood nor clearly defined

shadow paging

Recovery protocol that maintains a current page table and a shadow page table. When a transaction completes, the current page table becomes the new shadow page table.

immediate update

Recovery protocol where updates are applied to the database as they occur without waiting to reach the commit point

deferred update

Recovery protocol where updates are not written to the database until after a transaction has reached its commit point

intersection

Relational algebra operation that defines a relation consisting of the set of all tuples that are in both R and S. R and S must be union-compatible.

difference

Relational algebra operation that defines a relation consisting of the tuples that are in relation R, but not in S. R and S must be union-compatible.

Division

Relational algebra operation that defines a relation over the attributes C that consists of the set of tuples from R that match the combination of *every* tuple in S

union

Relational algebra operation that defines a relation that contains all the tuples of R, or S, or both R and S, duplicate tuples being eliminated. R and S must be union-compatible.

Semijoin

Relational algebra operation that defines a relation that contains the tuples of R that participate in the join of R with S satisfying the predicate F.

Theta join (θ)

Relational algebra operation that defines a relation that contains tuples satisfying the predicate F from the Cartesian product of R and S. The predicate F is of the form R.ai θ S.bi, where θ may be one of the comparison operators (<, ≤, >, ≥, =, ≠).

Cartesian product

Relational algebra operation that defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S

Outer join

Relational algebra operation that is a join in which tuples from R that do not have matching values in the common attributes of S are also included in the result relation. Missing values in the second relation are set to null.

Natural join

Relational algebra operation that is an Equijoin of the two relations R and S over all common attributes x. One occurrence of each common attribute is eliminated from the result.

rename

Relational algebra operation that provides a new name S for the expression E, and optionally names the attributes as a1, a2, . . . , an. (Greek letter rho)

projection

Relational algebra operation that works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates (i.e., selecting specific columns from a table)

selection (restriction)

Relational algebra operation that works on a single relation R and defines a relation that contains only those tuples of R that satisfy the specified condition (predicate)

CREATE ASSERTION name

SQL DDL command to create a constraint that is not directly linked with a table definition, using a CHECK statement

CREATE DOMAIN name AS type

SQL DDL command to create a new datatype based on an underlying existing type using a CHECK statement

CREATE SCHEMA name AUTHORIZATION user

SQL DDL command to create a schema

CHECK

SQL DDL keyword to add a domain constraint to a column

EXISTS (NOT EXISTS)

SQL keyword used to check only for the existence or nonexistence of rows in the subquery result table

ESCAPE

SQL keyword used to define an escape character

declare grain

Second step in dimensionality modeling, where you decide how granular you want to be in the fact table (e.g., an individual sale vs some aggregate)

serializability

Term for finding nonserial schedules that allow transactions to execute concurrently without interfering with one another, and thereby produce a database state that could be produced by a serial execution

degree (relation)

Term for the number of attributes a relation/table contains

cardinality (relation)

Term for the number of tuples in a relation/table. Changes over time.

view maintenance

Term for the process of updating a materialized view in response to changes to the underlying data

migrating rows

Term for the rows that enter or leave a view

intension

Term for the structure of a relation/table, together with a specification of the domains and any other restrictions on possible values

extension (state)

Term for the tuples of a relation, which change over time

view materialization

Term for when a DBMS stores the view as a temporary table in the database when the view is first queried, for performance

DOMAIN and TABLE

The SQL DDL ALTER command can only be used on these two types of objects

alternate keys

The candidate keys that are not selected to be the primary key

systems specification

The document that describes in detail the data to be held in the database and how the data is to be used

users requirements specification

The document that describes in detail the data to be held in the database and how the data is to be used

view

The dynamic result of one or more relational operations operating on the base relations to produce another relation; a virtual relation that does not necessarily exist in the database but can be produced upon request by a particular user, at the time of request

online analytical processing (OLAP)

The dynamic synthesis, analysis, and consolidation of large volumes of multidimensional data

select business process

The first step in dimensionality modeling, where you choose which part of the business to model.

fact finding

The formal process of using techniques such as interviews and questionnaires to collect facts about systems, requirements, and preferences

select business process, declare grain, choose dimensions, identify facts

The four steps in phase 1 of dimensional modeling

physical data independence

The immunity of the conceptual schema to changes in the internal schema

logical data independence

The immunity of the external schemas to changes in the conceptual schema

fact table

The main table in a dimensional model, composed of a composite primary key (made up of foreign keys) containing one or more numerical records

dimension table

The other tables in a dimensional model, composed of a noncomposite primary key referenced by the main table and containing descriptive textual info, with attributes used as the constraints in queries

growing phase

The phase where a transaction acquires all the locks needed but cannot release any locks

shrinking phase

The phase where a transaction releases its locks but cannot acquire any new locks

checkpoint

The point of synchronization between the database and the transaction log file. All buffers are force-written to secondary storage

conceptual modeling

The process of constructing a detailed architecture for a database that is independent of implementation details

logical modeling

The process of constructing a model of the data based on a specific data model, but that is independent of implementation details

logical database design

The process of constructing a model of the data used in an enterprise based on a specific data model, but independent of a particular DBMS and other physical considerations

Conceptual database design

The process of constructing a model of the data used in an enterprise, independent of all physical considerations

data mining

The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions

physical database design

The process of producing a description of the implementation of the database on secondary storage.

closure

The property allowing expressions to be nested in the relational algebra, just as we can nest arithmetic operations

cascading rollback

The situation in which a single transaction leads to a series of rollbacks. Avoidable using rigorous 2PL or strict 2PL.

read, validation, write

The three phases of an optimistic concurrency control protocol

optimistic

These concurrency control techniques are based on the premise that conflict is rare, and so allow transactions to proceed unsynchronized, checking for conflicts only at commit

Pessimistic (conservative)

These concurrency control techniques cause transactions to be delayed in case they conflict with others at some future time

fact finding techniques

These include examining documentation, interviewing, observing the enterprise in operation, research, and questionnaires.

choose dimensions

Third step in dimensionality modeling, where you decide on dimension tables to set the context for asking about facts in the fact table

cursor

This acts as a pointer to a particular row of a query result. It allows the rows of a query result to be accessed one at a time.

conceptual schema

This describes all the entities, attributes, and relationships together with integrity constraints. Each DB has exactly one.

Normalization

This is used to test the correctness of a logical data model

True

True or false: The WHERE clause filters individual rows going into the final result table, whereas HAVING filters groups going into the final result table

True

True or false: When GROUP BY is used, each item in the SELECT list must be single-valued per group

conflict serializability

Type of serializability that orders any conflicting operations in the same way as some serial execution

livelock

When two transactions are left in a wait state indefinitely, unable to acquire any new locks, although the DBMS is not in deadlock

deadlock

When two transactions wait for locks on items held by the other

database and log files

Which parts of a database need to be backed up?


Set pelajaran terkait

Chapter 14 - Depressive Disorders (Psych) EAQ's

View Set

Health Insurance Practice Exam Questions

View Set

Midterm Exam 2 (ch 8, ch 9, ch 10, ch 11, ch 12)

View Set

Gerontological ch. 15, 16, 20,28

View Set

Michael J. Fox-Before the Senate Speech

View Set