C192 chapter summary study guide

ER

A conceptual data model is supported by documentation, such as ______________________ diagrams and a data dictionary, which is produced throughout the development of the model.;

capabilities

Alternative approaches for developing an OODBMS extend an existing object-oriented programming language with database ______________________ ;

embed

Alternative approaches for developing an OODBMS: ______________________ OODB language constructs in a conventional host language;

domain

An attribute ______________________ is the set of allowable values for one or more attributes.;

algorithms

Techniques are specific implementations of the operations (______________________ ) that are used to carry out the data mining operations. Each operation has its own strengths and weaknesses.;

load reduction, disconnected

The benefits of database replication are improved availability, reliability, performance, with _____1_____ ____________ , and support for _______2_______________ computing, many users, and advanced applications.;

copy, workflow, update-anywhere

Data ownership models for replication can be primary- and secondary-___________1___________ , ____________2__________ , and _____3______-___________ (peer-to-peer).

logical, PCDATA

A native XML database defines a (_________1_____________ ) data model for an XML document (as opposed to the data in that document) and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, ______________________ , and document order. The XML document must be the unit of (____________1__________ ) storage although it is not restricted by any underlying physical storage model (so traditional DBMSs are not ruled out but neither are proprietary storage formats such as indexed, compressed files);

planning

Database ______________________ involves the management activities that allow the stages of the database system development lifecycle to be realized as efficiently and effectively as possible.;

join

A relation is in Fifth Normal Form (5NF) if and only if for every ______________________ dependency (R1, R2, ..., Rn) in a relation R, each projection includes a candidate key of the original relation;

nontrivial

A relation is in Fourth Normal Form (4NF) if and only if for every ______________________ multi-valued dependency A ↠ B, A is a candidate key of the relation.;

Boyce-Codd

A relation is in __________-____________ Normal Form (BCNF) if and only if every determinant is a candidate key.;
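
The BCNF condition can be checked mechanically once the functional dependencies are written down. The following is a minimal sketch, not a definitive implementation: the relation, attribute names, and dependencies are hypothetical, and it tests each determinant against the usual superkey formulation (its attribute closure must cover the whole relation).

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(relation, fds):
    """True if the determinant of every nontrivial dependency is a superkey."""
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, ignored
        if not closure(lhs, fds) >= set(relation):
            return False
    return True


# Hypothetical relation: branchNo determines branchAddress but is not a key.
staff_branch = {"staffNo", "branchNo", "branchAddress"}
fds = [({"staffNo"}, {"branchNo"}), ({"branchNo"}, {"branchAddress"})]
bcnf_before = is_bcnf(staff_branch, fds)

# After decomposition, the Branch relation has branchNo as its key.
branch = {"branchNo", "branchAddress"}
bcnf_after = is_bcnf(branch, [({"branchNo"}, {"branchAddress"})])
```

Here `bcnf_before` is false because `branchNo` is a determinant without being a key, while the decomposed `branch` relation passes the check.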

fragments

A relation may be divided into a number of subrelations called ______________________, which are allocated to one or more sites. Fragments may be replicated to provide improved availability and performance.;

left-deep

A relational algebra tree where the right hand relation is always a base relation is known as a ___________-___________ tree. Left-deep trees have the advantages of reducing the search space for the optimum strategy and allowing the query optimizer to be based on dynamic processing techniques. Their main disadvantage is that in reducing the search space many alternative execution strategies are not considered, some of which may be of lower cost than the one found using a linear tree.;

unit of work, undone

A transaction is an action, or series of actions, carried out by a single user or application program, that accesses or changes the contents of the database. A transaction is a logical ___1___ ____ ____________ that takes the database from one consistent state to another. Transactions can terminate successfully (commit) or unsuccessfully (abort). Aborted transactions must be __________2____________ or rolled back.

atomicity, consistency, isolation, durability

A transaction should possess the four basic or so-called ACID properties: ________1______________, _________2_____________, _________3_____________, and _________4_____________. Atomicity and durability are the responsibility of the recovery subsystem; isolation and, to some extent, consistency are the responsibility of the concurrency control subsystem.;
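
Commit and rollback can be seen with Python's built-in `sqlite3` module. This is a small sketch under stated assumptions: the `account` table, the `transfer` helper, and the amounts are all hypothetical, and SQLite stands in for a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one logical unit of work."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.commit()  # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # the credit already applied is undone
        return False


ok_small = transfer(conn, 1, 2, 30)   # succeeds
ok_large = transfer(conn, 1, 2, 500)  # violates the CHECK constraint, so it aborts
balances = dict(conn.execute("SELECT id, balance FROM account").fetchall())
```

The failed transfer leaves no trace: the credit to account 2 is rolled back along with the rejected debit, so the database moves only between consistent states.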

granularity

A tree may be used to represent the _________1_____________ of locks in a system that allows locking of data items of different sizes. When an item is locked, all its descendants are also locked.

FOR

A user can allow a receiving user to pass privileges on using the WITH GRANT OPTION clause and can revoke this privilege using the GRANT OPTION ______________________ clause;

CREATE VIEW

A view is a virtual table representing a subset of columns and/or rows and/or column expressions from one or more base tables or views. A view is created using the ______________________ statement by specifying a defining query. It may not necessarily be a physically stored table, but may be recreated each time it is referenced.;
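
A short sketch of `CREATE VIEW` using Python's `sqlite3` module follows. The `staff` table, its rows, and the view name `staff_b1` are hypothetical; the point is only that the view is defined by a query and reflects the base table each time it is referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (staffNo TEXT PRIMARY KEY, name TEXT, branchNo TEXT, salary INTEGER);
INSERT INTO staff VALUES
    ('S1', 'Ann', 'B1', 30000),
    ('S2', 'Bob', 'B2', 24000),
    ('S3', 'Cy',  'B1', 18000);
-- The view exposes a subset of rows and columns; salary stays hidden.
CREATE VIEW staff_b1 AS
    SELECT staffNo, name FROM staff WHERE branchNo = 'B1';
""")

rows_before = conn.execute(
    "SELECT staffNo, name FROM staff_b1 ORDER BY staffNo"
).fetchall()

# A later change to the base table shows up in the view automatically.
conn.execute("INSERT INTO staff VALUES ('S4', 'Di', 'B1', 21000)")
rows_after = conn.execute(
    "SELECT staffNo FROM staff_b1 ORDER BY staffNo"
).fetchall()
```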

users

A view is the dynamic result of one or more relational operations operating on the base relations to produce another relation. A view is a virtual relation that does not actually exist in the database but is produced upon request by a particular user at the time of request. The view mechanism provides a powerful and flexible security mechanism by hiding parts of the database from certain ______________________.

twelve

E.F. Codd formulated ________ rules as the basis for selecting OLAP tools.;

application, disk

Conventional DBMSs have a two-level storage model: the ___________1___________ storage model in main or virtual memory, and the database storage model on ___________2___________ .

secure

Integrity constraints also contribute to maintaining a ______________________ database system by preventing data from becoming invalid, and hence giving misleading or incorrect results.;

discretionary

Most commercial DBMSs provide an approach called ________1______________ Access Control (DAC), which manages privileges using SQL. The SQL standard supports DAC through the GRANT and REVOKE commands.

n-ary

Multiplicity for a complex relationship is the number (or range) of possible occurrences of an entity type in an ______________________ relationship when the other (n-1) values are fixed.;

system, data

Oracle DBMS provides two types of security measures: ________1______________ security and __________2____________ security. System security enables the setting of a password for opening a database, and data security provides user-level security, which can be used to limit the parts of a database that a user can read and update.;

Builder

Oracle Warehouse ______________________ (OWB) is a key component of the Oracle Warehouse solution, enabling the design and deployment of data warehouses, data marts, and e-Business intelligence applications. OWB is both a design tool and an extraction, transformation, and loading (ETL) tool;

legislation

Recent failures of well-known organizations have led to increased scrutiny of organizations. This scrutiny has been formalized in a number of acts of ______________________ in the United States and elsewhere.;

external, conceptual, internal

The ANSI-SPARC database architecture uses three levels of abstraction: ______________________ , ______________________ , and ______________________ . The external level consists of the users' views of the database. The conceptual level is the community view of the database: it specifies the information content of the entire database, independent of storage considerations. The conceptual level represents all entities, their attributes, and their relationships, as well as the constraints on the data, and security and integrity information. The internal level is the computer's view of the database: it specifies how data is represented, how records are sequenced, what indexes and pointers exist, and so on.;

hardware, software, data, procedures, and people

The DBMS environment consists of ______________________ (the computer), ______________________ (the DBMS, operating system, and applications programs), ______________________ , ______________________ , and ______________________ . The people include data and database administrators, database designers, application developers, and end-users.;

monitoring, tuning

The final step (Step 8) of physical database design is the ongoing process of _______1______________ and _________2_____________ the operational system to achieve maximum performance.;

process, grain, dimensions, facts

The first phase of the Dimensional Modeling stage uses a four-step process to facilitate the creation of a DM. The steps include: select business ________1______________ , declare ________2______________ , choose __________3____________ , and identify ___________4___________ .;

Selection, Projection

The five fundamental operations in relational algebra—__________1____________ , ___________2___________ , Cartesian product, Union, and Set difference—perform most of the data retrieval operations that we are interested in. In addition, there are also the Join, Intersection, and Division operations, which can be expressed in terms of the five basic operations.;
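
To make the point about derived operations concrete, here is a rough sketch that models relations as lists of dictionaries and builds a join from Cartesian product followed by Selection. The function names, attribute names, and sample rows are illustrative assumptions, and the two relations are assumed to have disjoint attribute names.

```python
def selection(relation, predicate):
    """Keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def projection(relation, attrs):
    """Keep only the named attributes and eliminate duplicate tuples."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s."""
    return [{**a, **b} for a in r for b in s]


staff = [
    {"staffNo": "S1", "sBranch": "B1"},
    {"staffNo": "S2", "sBranch": "B2"},
]
branch = [
    {"branchNo": "B1", "city": "London"},
    {"branchNo": "B2", "city": "Glasgow"},
]

# Join expressed with the basic operations: product, then selection.
joined = selection(
    cartesian_product(staff, branch),
    lambda t: t["sBranch"] == t["branchNo"],
)
staff_cities = projection(joined, ["staffNo", "city"])
```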

interviewing, questionnaires

The five most common fact-finding techniques are examining documentation, __________1____________ , observing the enterprise in operation, conducting research, and using ____________2__________ .;

USAGE

The privileges that can be passed on are ______________________ , SELECT, DELETE, INSERT, UPDATE, and REFERENCES; INSERT, UPDATE, and REFERENCES can be restricted to specific columns.

Eager, Lazy

The processing of updates has to maintain transactional consistency. 1-copy-serializability is the correctness criterion for concurrent data processing in a replicated database. _________1_____________ update-anywhere replication scales poorly. ___________2___________ update-anywhere replication has to cope with frequent reconciliations.

conceptual

The purpose of Step 2.1 of the methodology for logical database design is to derive a relational schema from the ______________________ data model created in Step 1.;

extended relational, universal server or universal, object-relational

Various terms have been used for systems that have extended the relational data model. The original term used to describe such systems was the ____1______ ____________ DBMS (ERDBMS), and the term _____2____ _____________ DBMS (UDBMS) has also been used. However, in recent years, the more descriptive term ______3____ ____________ DBMS (ORDBMS) has been used to indicate that the system incorporates some notion of "object.";

resolution

View ______________________ merges the query on a view with the definition of the view producing a query on the underlying base table(s). This process is performed each time the DBMS has to process a query on a view.

write

Views can be used to simplify the structure of the database and make queries easier to ______________________ . They can also be used to protect certain columns and/or rows from unauthorized access. Not all views are updatable.;

Snowflake, normalized

________1______________ schema is a dimensional data model that has a fact table in the center, surrounded by _________2_____________ dimension tables.;

Patent

______________________ provides an exclusive (legal) right for a set period of time to make, use, sell, or import an invention.;

Copyright

______________________ provides an exclusive (legal) right for a set period of time to reproduce and distribute a literary, musical, audiovisual, or other "work" of authorship.;

Trademark

______________________ provides an exclusive (legal) right to use a word, symbol, image, sound, or some other distinctive element that identifies the source of origin in connection with certain goods or services;

query processing

In a mobile environment, ______________________ ______________________ must deal with location-aware queries and location-dependent queries, as well as moving object database queries, spatio-temporal queries, and continuous queries. To enable location-dependent queries, a database must support two- and three-dimensional geometric shapes and geometric operations, for example, to calculate the intersection of shapes.;

pointer

In both cases of OIDs, an OID is different in size from a standard in-memory ______________________ , which need be only large enough to address all virtual memory.;

scalability

Logical database design concludes with Step 2.7, which includes consideration of whether the model is capable of being extended to support possible future developments (______________________ ). At the end of Step 2, the logical data model, which may or may not have been developed using Step 2.6, is the source of information for physical database design described as Steps 3 to 8 in Chapters 18 and 19;

associative

OQL can be used for both ______________________ and navigational access;

past, change

The star schema exploits the characteristics of factual data such that facts are generated by events that occurred in the __________1____________ , and are unlikely to _________2_____________ , regardless of how they are analyzed. As the bulk of data in the data warehouse is represented as facts, the fact tables can be extremely large relative to the dimension tables.;

predictive, segmentation, analysis, deviation

There are four main operations associated with data mining techniques: ______________________ modeling, database ______________________ , link ______________________ , and ______________________ detection.;

horizontal, vertical

There are two main types of fragmentation: ______________________ and ______________________.
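
The two kinds of fragmentation can be sketched with plain Python data. The relation and attribute names below are hypothetical. Horizontal fragments are subsets of tuples and are recombined by union; vertical fragments are subsets of attributes that each keep the key and are recombined by joining on it.

```python
staff = [
    {"staffNo": "S1", "name": "Ann", "branchNo": "B1"},
    {"staffNo": "S2", "name": "Bob", "branchNo": "B2"},
]

# Horizontal fragmentation: split the tuples by a predicate on branchNo.
h_b1 = [row for row in staff if row["branchNo"] == "B1"]
h_b2 = [row for row in staff if row["branchNo"] == "B2"]
h_rebuilt = sorted(h_b1 + h_b2, key=lambda row: row["staffNo"])  # union

# Vertical fragmentation: split the attributes, repeating the key staffNo.
v_names = [{"staffNo": r["staffNo"], "name": r["name"]} for r in staff]
v_branch = [{"staffNo": r["staffNo"], "branchNo": r["branchNo"]} for r in staff]
by_key = {r["staffNo"]: r for r in v_branch}
v_rebuilt = [{**r, **by_key[r["staffNo"]]} for r in v_names]  # join on the key
```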

online transaction processing

A DBMS built for ______________________ ______________________ ______________________ (OLTP) is generally regarded as unsuitable for data warehousing because each system is designed with a differing set of requirements in mind. For example, OLTP systems are designed to maximize the transaction processing capacity, while data warehouses are designed to support ad hoc query processing.;

processing, network

A DDBMS is distinct from distributed _________1_____________, where a centralized DBMS is accessed over a _________2_____________. It is also distinct from a parallel DBMS, which is a DBMS running across multiple processors and disks and which has been designed to evaluate operations in parallel, whenever possible, in order to improve performance.;

homogeneous, heterogeneous

A DDBMS may be classified as homogeneous or heterogeneous. In a _________1_____________ system, all sites use the same DBMS product. In a ___________2___________ system, sites may run different DBMS products, which need not be based on the same underlying data model, and so the system may be composed of relational, network, hierarchical, and object-oriented DBMSs.;

cursors

A SELECT statement can be used if the query returns one and only one row. To handle a query that can return an arbitrary number of rows (that is, zero, one, or more rows), SQL uses __________1____________ to allow the rows of a query result to be accessed one at a time. In effect, the ____________1__________ acts as a pointer to a particular row of the query result. The __________1____________ can be advanced by one to access the next row. A __________1____________ must be declared and opened before it can be used, and it must be closed to deactivate it after it is no longer required. Once the ___________1___________ has been opened, the rows of the query result can be retrieved one at a time using a FETCH statement, as opposed to a SELECT statement.;
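
Python's DB-API cursor is not the same thing as an embedded SQL cursor, but it follows the same pattern closely enough to illustrate it. In this sketch the `property` table and its rows are hypothetical, and the comments map each call to the embedded SQL step it loosely corresponds to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE property (propertyNo TEXT, rent INTEGER)")
conn.executemany(
    "INSERT INTO property VALUES (?, ?)",
    [("P1", 400), ("P2", 650), ("P3", 375)],
)

cur = conn.cursor()  # loosely corresponds to declaring the cursor
cur.execute(         # loosely corresponds to OPEN: the query is evaluated
    "SELECT propertyNo, rent FROM property ORDER BY propertyNo"
)

fetched = []
while True:
    row = cur.fetchone()  # FETCH: return the next row and advance the cursor
    if row is None:       # no more rows in the query result
        break
    fetched.append(row)

cur.close()  # CLOSE: deactivate the cursor once it is no longer required
```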

non-blocking

A ____-__________________ protocol is three-phase commit (3PC), which involves the coordinator sending an additional message between the voting and decision phases to all participants asking them to pre-commit the transaction.;

lossless-join

A _________-_____________ dependency is a property of decomposition, which means that no spurious tuples are generated when relations are combined through a natural join operation.;
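
Spurious tuples are easiest to understand by producing some. The sketch below decomposes a small hypothetical relation in two ways and rejoins the projections. The helper functions and sample data are assumptions made for illustration.

```python
def project(relation, attrs):
    """Projection with duplicate elimination; rows are dicts."""
    unique = {tuple((a, row[a]) for a in attrs) for row in relation}
    return [dict(items) for items in unique]


def natural_join(r, s):
    """Join rows that agree on every attribute the two relations share."""
    common = set(r[0]) & set(s[0])
    return [{**a, **b} for a in r for b in s if all(a[c] == b[c] for c in common)]


def same_rows(r, s):
    """Compare two relations as sets of rows, ignoring order."""
    canon = lambda rel: {tuple(sorted(row.items())) for row in rel}
    return canon(r) == canon(s)


staff = [
    {"staffNo": "S1", "branchNo": "B1", "city": "London"},
    {"staffNo": "S2", "branchNo": "B2", "city": "London"},
]

# Splitting on city loses information: both branches share the same city.
lossy = natural_join(
    project(staff, ["staffNo", "city"]), project(staff, ["branchNo", "city"])
)

# Splitting on branchNo, which determines city here, rejoins exactly.
lossless = natural_join(
    project(staff, ["staffNo", "branchNo"]), project(staff, ["branchNo", "city"])
)
```

The first decomposition returns four rows, two of them spurious; the second returns exactly the original relation.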

subselect

A _________1_____________ is a complete SELECT statement embedded in another query. A __________1____________ may appear within the WHERE or HAVING clauses of an outer SELECT statement, where it is called a subquery or nested query. Conceptually, a subquery produces a temporary table whose contents can be accessed by the outer query. A subquery can be embedded in another subquery.;
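
A subquery in a WHERE clause can be tried out with `sqlite3`. The `staff` table and salaries here are hypothetical. The inner SELECT produces a single value that the outer query compares against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (staffNo TEXT, salary INTEGER);
INSERT INTO staff VALUES ('S1', 30000), ('S2', 24000), ('S3', 18000);
""")

# The inner SELECT computes the average salary; the outer SELECT uses it.
above_average = conn.execute(
    "SELECT staffNo FROM staff "
    "WHERE salary > (SELECT AVG(salary) FROM staff) "
    "ORDER BY staffNo"
).fetchall()
```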

web service

A __________ ____________ is a software system designed to support interoperable machine-to-machine interaction over a network. Web services are based on standards such as XML, SOAP, WSDL, and UDDI.;

mathematical relation

A __________ ____________ is a subset of the Cartesian product of two or more sets. In database terms, a relation is any subset of the Cartesian product of the domains of the attributes. A relation is normally written as a set of n-tuples, in which each element is chosen from the appropriate domain.;

multi-valued

A __________-____________ attribute holds multiple values for each occurrence of an entity type.;

superkey, primary, foreign

A __________1____________ is an attribute, or set of attributes, that identifies tuples of a relation uniquely, and a candidate key is a minimal superkey. A __________2____________ is the candidate key chosen for use in identification of tuples. A relation must always have a primary key. A _________3_____________ is an attribute, or set of attributes, within one relation that is the candidate key of another relation.;

fan trap

A ___________ ___________ exists where a model represents a relationship between entity types, but the pathway between certain entity occurrences is ambiguous.;

single-valued

A ___________-___________ attribute holds a single value for each occurrence of an entity type.;

FLWOR

A ______________________ (pronounced "flower") expression is constructed from FOR, LET, WHERE, ORDER BY, and RETURN clauses. A FLWOR expression starts with one or more FOR or LET clauses in any order, followed by an optional WHERE clause, an optional ORDER BY clause, and a required RETURN clause. As in an SQL query, these clauses must appear in order. A FLWOR expression binds values to one or more variables and then uses these variables to construct a result.;

Transaction Processing (TP)

A ______________________ Monitor is a program that controls data transfer between clients and servers in order to provide a consistent environment, particularly for online transaction processing (OLTP). The advantages include transaction routing, distributed transactions, load balancing, funneling, and increased reliability;

simple

A ______________________ attribute is composed of a single component with an independent existence.;

composite

A ______________________ attribute is composed of multiple components each with an independent existence.;

disjoint

A ______________________ constraint describes the relationship between members of the subclasses and indicates whether it is possible for a member of a superclass to be a member of one, or more than one, subclass.;

participation

A ______________________ constraint determines whether every member in the superclass must participate as a member of a subclass.;

mobile

A ______________________ database is a database that is portable and physically separate from the corporate database server but is capable of communicating with that server from remote sites allowing the sharing of corporate data. With mobile databases, users have access to corporate data on their laptop, PDA, or other Internet access device that is required for applications at remote sites.;

distributed

A ______________________ database is a logically interrelated collection of shared data (and a description of this data), physically distributed over a computer network. The DDBMS is the software that transparently manages the distributed database.;

trigger

A ______________________ defines an action that the database should take when some event occurs in the application. A trigger may be used to enforce some referential integrity constraints, to enforce complex integrity constraints, or to audit changes to data.
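
As one illustration of the auditing use, the sketch below defines a trigger in SQLite through Python's `sqlite3` module. The table names, the trigger name `audit_salary`, and the data are hypothetical, and trigger syntax varies between DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (staffNo TEXT PRIMARY KEY, salary INTEGER);
CREATE TABLE salary_audit (staffNo TEXT, old_salary INTEGER, new_salary INTEGER);

-- Event: an update of staff.salary. Action: record the old and new values.
CREATE TRIGGER audit_salary AFTER UPDATE OF salary ON staff
BEGIN
    INSERT INTO salary_audit VALUES (OLD.staffNo, OLD.salary, NEW.salary);
END;

INSERT INTO staff VALUES ('S1', 30000);
UPDATE staff SET salary = 33000 WHERE staffNo = 'S1';
""")

audit_rows = conn.execute("SELECT * FROM salary_audit").fetchall()
```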

strong

A ______________________ entity type is not existence-dependent on some other entity type. A weak entity type is existence-dependent on some other entity type.;

view

A ______________________ in the relational model is a virtual or derived relation that is dynamically created from the underlying base relation(s) when required. Views provide security and allow the designer to customize a user's model. Not all views are updatable;

computer-based

A ______________________ information system includes the following components: database, database software, application software, computer hardware (including storage media), and personnel using and developing the system.;

superclass

A ______________________ is an entity type that includes one or more distinct subgroupings of its occurrences that need to be represented in a data model. A subclass is a distinct subgrouping of occurrences of an entity type that needs to be represented in a data model.;

subtransaction

A ______________________ is of type compensatable, repeatable, pivot, or location-dependent and a correct execution order of subtransactions ensures termination despite disconnection.;

composite

A ______________________ key is a candidate key that consists of two or more attributes.;

primary

A ______________________ key is the candidate key that is selected to uniquely identify each occurrence of an entity type.;

candidate

A ______________________ key is the minimal set of attributes that uniquely identifies each occurrence of an entity type.;

design

A ______________________ methodology is a structured approach that uses procedures, techniques, tools, and documentation aids to support and facilitate the process of design.;

persistent

A ______________________ programming language is a language that provides its users with the ability to (transparently) preserve data across successive executions of a program. Data in a persistent programming language is independent of any program, able to exist beyond the execution and lifetime of the code that created it. However, such languages were originally intended to provide neither full database functionality nor access to data from multiple languages.;

recursive

A ______________________ relationship is a relationship type where the same entity type participates more than once in different roles.;

null

A ______________________ represents a value for an attribute that is unknown at the present time or is not applicable for this tuple.;

replication

A ______________________ server is a software system that manages data replication.;

multidatabase

A ______________________ system (MDBS) is a distributed DBMS in which each site maintains complete autonomy. An MDBS resides transparently on top of existing database and file systems and presents a single database to its users. It maintains a global schema against which users issue queries and updates; an MDBS maintains only the global schema and the local DBMSs themselves maintain all user data.;

relationship

A ______________________ type is a set of meaningful associations among entity types. A relationship occurrence is a uniquely identifiable association, which includes one occurrence from each participating entity type.;

threat

A ______________________ is any situation or event, whether intentional or accidental, that will adversely affect a system and consequently an organization.;

sublanguage

A data ______________________ consists of two parts: a Data Definition Language (DDL) and a Data Manipulation Language (DML). The DDL is used to specify the database schema and the DML is used to both read and update the database. The part of a DML that involves data retrieval is called a query language.;

model

A data ______________________ is a collection of concepts that can be used to describe a set of data, the operations to manipulate the data, and a set of integrity constraints for the data. They fall into three broad categories: object-based data models, record-based data models, and physical data models. The first two are used to describe data at the conceptual and external levels; the latter is used to describe data at the internal level.;

views

A data warehouse is well equipped for providing data for mining as a warehouse not only holds data of high quality and consistency, and from multiple sources, but is also capable of providing subsets (______________________ ) of the data for analysis and lower level details of the source data, when required

schema

A database ______________________ is a description of the database structure. There are three different types of schema in the database; these are defined according to the three levels of the ANSI-SPARC architecture. Data independence makes each level immune to changes to lower levels. Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema. Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema.;

corporate

A database represents an essential ______________________ resource, so security of this resource is extremely important. The objective of Step 6 is to design the realization of the security mechanisms identified during the requirements collection and analysis stage.;

same entity

A derived attribute represents a value that is derivable from the value of a related attribute or set of attributes, not necessarily in the ______________________ ______________________ .;

pathway

A chasm trap exists where a model suggests the existence of a relationship between entity types, but the ______________________ does not exist between certain entity occurrences;

relational

A logical data model includes ER diagram(s), ______________________ schema, and supporting documentation such as the data dictionary, which is produced throughout the development of the model.;

multi-valued dependency

A __________-____________ ______________________ (MVD) represents a dependency between attributes (A, B, and C) in a relation such that for each value of A there is a set of values of B and a set of values for C. However, the set of values for B and C are independent of each other.;

computer-aided design

Advanced database applications include ______________________ (CAD),

computer-aided manufacturing

Advanced database applications include ______________________ (CAM),

computer-aided software engineering

Advanced database applications include ______________________ (CASE),

geographic information systems

Advanced database applications include ______________________ (GIS), and interactive and dynamic Web sites, as well as applications with complex and interrelated objects and procedural data.;

office information system

Advanced database applications include network management systems, ______________________ (OIS) and multimedia systems, digital publishing,

sagas, dynamically

Advanced transaction models include nested transactions, ___________1___________ , multilevel transactions, _________2_____________ restructuring transactions, and workflow models.;

impedance mismatch, encapsulation

Advantages of OODBMSs include enriched modeling capabilities, extensibility, removal of ______1____ ____________ , more expressive query language, support for schema evolution and long-duration transactions, applicability to advanced database applications, and performance. Disadvantages include lack of universal data model, lack of experience, lack of standards, query optimization compromises __________2____________ , locking at the object level affects performance, complexity, and lack of support for views and security.;

modifications, scheduled

Advantages of triggers include: eliminates redundant code, simplifies _________1_____________ , increases security, improves integrity, improves processing power, and fits well with the client-server architecture. Disadvantages of triggers include: performance overhead, cascading effects, inability to be __________2____________ , and less portable;

DDL, DML

All access to the database is through the DBMS. The DBMS provides a Data Definition Language (______________________ ), which allows users to define the database, and a Data Manipulation Language (______________________ ), which allows users to insert, update, delete, and retrieve data from the database.;

novel

Alternative approaches for developing an OODBMS: develop a ______________________ database data model/data language.;

object-oriented

Alternative approaches for developing an OODBMS: extend an existing database language with __________-____________ capabilities;

libraries

Alternative approaches for developing an OODBMS: provide extensible OODBMS ______________________ ;

label, value

An OEM object consists of an object identifier, a descriptive textual ______________________ , a type, and a ______________________ .;

persistent, shareable

An OODBMS is a manager of an OODB. An OODB is a __________1____________ and __________2____________ repository of objects defined in an OODM. An OODM is a data model that captures the semantics of objects supported in object-oriented programming. There is no universally agreed-upon OODM.;

operator

An OQL query is a function that delivers an object whose type may be inferred from the ______________________ contributing to the query expression.

schema

An XML ______________________ is the definition (both in terms of its organization and its data types) of a specific XML structure. An XML schema uses the W3C XML Schema language to specify how each type of element in the schema is defined and what data type that element has associated with it. The schema is itself an XML document, so it can be read by the same tools that read the XML it describes.;

Type

An XML document consists of elements, attributes, entity references, comments, CDATA sections, and processing instructions. An XML document can optionally have a Document ______________________ Definition (DTD), which defines the valid syntax of an XML document.;
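
Several of these components can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. The document content below is made up for illustration. Note that this parser discards comments by default and returns CDATA content as ordinary text.

```python
import xml.etree.ElementTree as ET

doc = """<!-- hypothetical staff list -->
<staffList>
  <staff branchNo="B1">
    <name>Ann &amp; Co</name>
    <note><![CDATA[<raw> text, not parsed as markup]]></note>
  </staff>
</staffList>"""

root = ET.fromstring(doc)
staff = root.find("staff")

root_tag = root.tag                   # element name
branch_no = staff.get("branchNo")     # attribute value
name_text = staff.find("name").text   # entity reference &amp; is expanded
note_text = staff.find("note").text   # CDATA section arrives as plain text
```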

attribute

An ______________________ is a property of an entity or a relationship type.;

entity

An ______________________ type is a group of objects with the same properties, which are identified by the enterprise as having an independent existence. An entity occurrence is a uniquely identifiable object of an entity type.;

Netscape

An alternative approach to CGI is to extend the Web server, typified by the ______________________ API (NSAPI) and Microsoft Internet Information Server API (ISAPI). Using an API, the additional functionality is linked into the server itself. Although this provides improved functionality and performance, the approach does rely to some extent on correct programming practice.;

materialization

An alternative approach to view resolution, called view ______________________ , stores the view as a temporary table in the database when the view is first queried. Thereafter, queries based on the materialized view can be much faster than recomputing the view each time.
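
SQLite has no built-in materialized views, so the sketch below imitates one by storing a query result in a temporary table. The table names and data are hypothetical. It also shows the cost of the approach: the stored copy goes stale when the base table changes and has to be maintained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (staffNo TEXT, branchNo TEXT);
INSERT INTO staff VALUES ('S1', 'B1'), ('S2', 'B1'), ('S3', 'B2');

-- Imitation of a materialized view: the query result is stored once.
CREATE TEMP TABLE staff_count_mv AS
    SELECT branchNo, COUNT(*) AS cnt FROM staff GROUP BY branchNo;

-- A later update to the base table is not reflected in the stored copy.
INSERT INTO staff VALUES ('S4', 'B2');
""")

stored = conn.execute(
    "SELECT branchNo, cnt FROM staff_count_mv ORDER BY branchNo"
).fetchall()
recomputed = conn.execute(
    "SELECT branchNo, COUNT(*) FROM staff GROUP BY branchNo ORDER BY branchNo"
).fetchall()
```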

Lightweight

An example of a semistructured DBMS is Lore (______________________ Object REpository), a multi-user DBMS, supporting crash recovery, materialized views, bulk loading of files in some standard format (XML is supported), and a declarative update language called Lorel. Lore also has an external data manager that enables data from external sources to be fetched dynamically and combined with local data during query processing. Lorel, an extension to OQL, supports declarative path expressions for traversing graph structures and automatic coercion for handling heterogeneous and typeless data.;

information system

An ___________ ___________ is the resources that enable the collection, management, control, and dissemination of information throughout an organization.;

classification, value prediction

Applications of predictive modeling include customer retention management, credit approval, cross-selling, and direct marketing. There are two associated techniques: __________1____________ and ________2______________ ______________________ .;

near-optimum

As there are many equivalent transformations of the same high-level query, the DBMS has to choose the one that minimizes resource usage. This is the aim of query optimization. Because the problem is computationally intractable with a large number of relations, the strategy adopted is generally reduced to finding a _________-_____________ solution.;

extended

As well as having the standard functionality expected of a centralized DBMS, a DDBMS will need ________1______________ communication services, __________1____________ system catalog, distributed query processing, and ___________1___________ security, concurrency, and recovery services.;

log file

Backup is the process of periodically taking a copy of the database and __________ ____________ (and possibly programs) on to offline storage media. Journaling is the process of keeping and maintaining a log file (or journal) of all changes made to the database to enable recovery to be undertaken effectively in the event of a failure.;

undo, redo

Causes of failure in a distributed environment are loss of messages, communication link failures, site crashes, and network partitioning. To facilitate recovery, each site maintains its own log file. The log can be used to ______________________ and ______________________ transactions in the event of failure.;
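
How a single site might use its log can be sketched as follows. This is a simplified illustration, not a full recovery protocol: the record layout `("write", txn, item, before, after)` and the function name `recover` are assumptions, and it ignores checkpoints and the distributed commit protocol.

```python
def recover(log, db):
    """Replay a write log after a crash: redo committed work, undo the rest."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Redo pass (forward): reapply after-images of committed transactions.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, _before, after = rec
            db[item] = after
    # Undo pass (backward): restore before-images of uncommitted transactions.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, before, _after = rec
            db[item] = before
    return db

# Hypothetical log: T1 committed, T2 was still active at the crash.
log = [
    ("write", "T1", "x", 10, 20),
    ("write", "T2", "y", 5, 7),
    ("commit", "T1"),
]
state_on_disk = {"x": 10, "y": 7}  # T1's write was lost, T2's write reached disk
recovered = recover(log, dict(state_on_disk))
```

After recovery, `x` carries T1's committed value and `y` is back to its value before T2.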

roll, drill, pivot

Common analytical operations on data cubes include ___1_____-up, ___2_____-down, slice and dice, and ____3____.;
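
These operations can be illustrated on a tiny in-memory cube. The sketch below uses made-up sales figures and a hypothetical helper `aggregate`; it is meant only to show what each operation does to the grain or shape of the data, not how an OLAP server implements it.

```python
from collections import defaultdict

# Hypothetical facts: (year, quarter, city, product, sales)
facts = [
    (2023, "Q1", "Leeds", "tv", 10),
    (2023, "Q1", "York", "tv", 4),
    (2023, "Q2", "Leeds", "radio", 6),
    (2024, "Q1", "York", "radio", 5),
]
DIMS = ("year", "quarter", "city", "product")

def aggregate(rows, keep):
    """Sum sales grouped by the dimensions named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)

drill_down = aggregate(facts, ("year", "quarter"))  # finer grain
roll_up = aggregate(facts, ("year",))               # coarser grain
slice_2023 = [r for r in facts if r[0] == 2023]     # fix one dimension
dice = [r for r in facts if r[0] == 2023 and r[2] == "Leeds"]  # fix several
pivot = aggregate(facts, ("product", "city"))       # same totals, axes reoriented
```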

integrated

Computer-Aided Software Engineering (CASE) applies to any tool that supports software development and permits the database system development activities to be carried out as efficiently and effectively as possible. CASE tools may be divided into three categories: upper-CASE, lower-CASE, and ______________________ -CASE;

implementation

Conceptual database design begins with the creation of a conceptual data model of the enterprise, which is entirely independent of ______________________ details such as the target DBMS, application programs, programming languages, hardware platform, performance issues, or any other physical considerations.;

selection

DBMS ______________________ involves selecting a suitable DBMS for the database system.;

mart

Data ______________________ is a subset of a data warehouse that supports the requirements of a particular department or business function. The issues associated with data marts include functionality, size, load performance, users' access to data in multiple data marts, Internet/intranet access, administration, and installation.;

administration

Data ______________________ is the management of the data resource, including database planning, development and maintenance of standards, policies and procedures, and conceptual and logical database design.;

mining

Data ______________________ is the process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions.;

warehousing

Data ______________________ is the subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision making process. The goal is to integrate enterprise-wide corporate data into a single repository from which users can easily run queries, produce reports, and perform analysis.;

localization

Data ______________________ takes into account how the data has been distributed and replaces the global relations at the leaves of the relational algebra tree with their reconstruction algorithms.

security

Database ______________________ is the mechanisms that protect the database against intentional or accidental threats. Database security is concerned with avoiding the following situations: theft and fraud, loss of confidentiality (secrecy), loss of privacy, loss of integrity, and loss of availability.;

design

Database ______________________ is the process of creating a design that will support the enterprise's mission statement and mission objectives for the required database system. There are three phases of database design: conceptual, logical, and physical database design.;

security, integrity

Database administration is the management of the physical realization of a database system, including physical database design and implementation, setting _________1_____________ and _________2____________ controls, monitoring system performance, and reorganizing the database as necessary;

conceptual, logical, physical

Database design includes three main phases: _________1_____________ , ____________2__________ , and _________3_____________ database design.;

attributes

Dimension tables most often contain descriptive textual information. Dimension ______________________ are used as the constraints in data warehouse queries.;

query decomposition, data localization, global, and local

Distributed query processing can be divided into four phases: ____________1__________ ______________________ , ___________2___________ ______________________ , ___________3___________ optimization, and __________4____________ optimization.

fact, dimension, noncomposite

Every dimensional model (DM) is composed of one table with a composite primary key, called the ________1______________ table, and a set of smaller tables called __________2____________ tables. Each dimension table has a simple (________3______________ ) primary key that corresponds exactly to one of the components of the composite key in the fact table. In other words, the primary key of the fact table is made up of two or more foreign keys. This characteristic "star-like" structure is called a star schema or star join.;
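
A minimal star schema can be sketched with Python's built-in `sqlite3` module. The table and column names (`FactSales`, `DimDate`, `DimProduct`) and the data are hypothetical; the point is that the fact table's composite primary key is made of foreign keys to the dimension tables, and that a dimension attribute acts as the query constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimDate    (dateKey INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE DimProduct (productKey INTEGER PRIMARY KEY, category TEXT);
    -- Fact table: composite primary key built from the dimension keys.
    CREATE TABLE FactSales (
        dateKey    INTEGER REFERENCES DimDate(dateKey),
        productKey INTEGER REFERENCES DimProduct(productKey),
        amount     REAL,
        PRIMARY KEY (dateKey, productKey)
    );
    INSERT INTO DimDate VALUES (1, 2023), (2, 2024);
    INSERT INTO DimProduct VALUES (10, 'audio'), (11, 'video');
    INSERT INTO FactSales VALUES (1, 10, 100), (1, 11, 250), (2, 10, 75);
""")

# Star join: constrain on a dimension attribute, aggregate the fact measure.
rows = conn.execute("""
    SELECT d.year, SUM(f.amount)
    FROM FactSales f
    JOIN DimDate d    ON d.dateKey = f.dateKey
    JOIN DimProduct p ON p.productKey = f.productKey
    WHERE p.category = 'audio'
    GROUP BY d.year
    ORDER BY d.year
""").fetchall()
```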

early

Fact-finding is particularly crucial to the ______________________ stages of the database system development lifecycle, including the database planning, system definition, and requirements collection and analysis stages.;

one and only one

First Normal Form (1NF) is a relation in which the intersection of each row and column contains __________ ____________ __________ ____________ value.;

denormalization, nulls

Formally, the term ________1______________ refers to a refinement to the relational schema such that the degree of normalization for a modified relation is less than the degree of at least one of the original relations. The term is also used more loosely to refer to situations in which two relations are combined into one new relation and the new relation is still normalized but contains more __________2____________ than the original relations.;

unary, Cartesian, inner

Fundamental to the efficiency of query optimization is the search space of possible execution strategies and the enumeration algorithm that is used to search this space for an optimal strategy. For a given query this space can be very large. As a result, query optimizers restrict this space in a number of ways. For example, __________1____________ operations may be processed on the fly; _________2_____________ products are never formed unless the query itself specifies it; the __________3____________ operand of each join is a base relation.;

candidate

General definition for Second Normal Form (2NF) is a relation that is in first normal form and every non-candidate-key attribute is fully functionally dependent on any ______________________ key. In this definition, a candidate-key attribute is part of any candidate key.;

candidate

General definition for Third Normal Form (3NF) is a relation that is in first and second normal form in which no non-candidate-key attribute is transitively dependent on any ______________________ key. In this definition, a candidate-key attribute is part of any candidate key;

Selecting, Projecting, join

Heuristic rules include performing _________1_____________ and __________2____________ operations as early as possible; combining Cartesian product with a subsequent Selection whose predicate represents a join condition into a Join operation; using associativity of __________3____________ operations to rearrange leaf nodes so that leaf nodes with the most restrictive Selections are executed first.;
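
The effect of the first two rules can be seen by comparing intermediate result sizes. This is a toy sketch over Python lists with made-up `staff` and `branch` data, not a real optimizer: both plans return the same answer, but the second builds far fewer intermediate tuples.

```python
# Hypothetical relations: staff(staffNo, branchNo), branch(branchNo, city)
staff = [(f"S{i}", f"B{i % 50}") for i in range(1000)]
branch = [(f"B{j}", "London" if j < 5 else "Other") for j in range(50)]

# Plan A: Cartesian product first, then one Selection holding both predicates.
product = [(s, b) for s in staff for b in branch]
plan_a = [(s[0], b[1]) for s, b in product if s[1] == b[0] and b[1] == "London"]

# Plan B: apply the Selection on city early, then fold the join predicate
# into the product so that it becomes a Join.
london = [b for b in branch if b[1] == "London"]
joined = [(s, b) for s in staff for b in london if s[1] == b[0]]
plan_b = [(s[0], b[1]) for s, b in joined]
```

Plan A materializes every staff and branch pairing before filtering, while Plan B only ever holds the matching pairs.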

tuples

Horizontal fragments are subsets of ______________________
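
As a small illustration, assuming a made-up `staff` relation fragmented on `branchNo`: each horizontal fragment is a Selection over the tuples, and the original relation is reconstructed by taking the union of the fragments.

```python
# Hypothetical Staff relation; each dict is one tuple.
staff = [
    {"staffNo": "S1", "branchNo": "B1"},
    {"staffNo": "S2", "branchNo": "B2"},
    {"staffNo": "S3", "branchNo": "B1"},
]

# Each horizontal fragment is a Selection on the relation.
frag_b1 = [t for t in staff if t["branchNo"] == "B1"]
frag_rest = [t for t in staff if t["branchNo"] != "B1"]

# Reconstruction is the union of the fragments.
reconstructed = frag_b1 + frag_rest
key = lambda t: t["staffNo"]
complete = sorted(reconstructed, key=key) == sorted(staff, key=key)
disjoint = not any(t in frag_rest for t in frag_b1)
```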

governance

IT ______________________ is used for specifying the decision rights and accountability framework to encourage desirable behavior in the use of IT.;

kernel-based, middleware-based

If a replication protocol is implemented as part of the database kernel, it is ___________-___________; if an additional middleware layer that resides on top of the replicated database system implements the protocol, it is __________-____________.;

UNION, INTERSECT, EXCEPT

If the columns of the result table come from more than one table, a join must be used, by specifying more than one table in the FROM clause and typically including a WHERE clause to specify the join column(s). The ISO standard allows Outer joins to be defined. It also allows the set operations of Union, Intersection, and Difference to be used with the __________1____________ , ________2______________ , and _____________3_________ commands.;
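
The three set operations can be tried with Python's built-in `sqlite3` module. The tables `Branch` and `Property` and their contents are hypothetical; both SELECTs return a single compatible column, as the set operations require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Branch (city TEXT);
    CREATE TABLE Property (city TEXT);
    INSERT INTO Branch VALUES ('London'), ('Glasgow'), ('Bristol');
    INSERT INTO Property VALUES ('London'), ('Aberdeen'), ('Glasgow');
""")

def cities(op):
    # `op` is one of the set operators; the ORDER BY applies to the combined result.
    sql = f"SELECT city FROM Branch {op} SELECT city FROM Property ORDER BY city"
    return [row[0] for row in conn.execute(sql)]

union = cities("UNION")             # cities with a branch or a property
intersection = cities("INTERSECT")  # cities with both
difference = cities("EXCEPT")       # cities with a branch but no property
```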

serializable

If the schedule of transaction execution at each site is ______________________, then the global schedule (the union of all local schedules) is also serializable, provided that local serialization orders are identical. This requires that all subtransactions appear in the same order in the equivalent serial schedule at all sites.;

Architecture, Request

In 1990, the OMG first published its Object Management _________1_____________ (OMA) Guide document. This guide specified a single terminology for object-oriented languages, systems, databases, and application frameworks; an abstract framework for object-oriented systems; a set of technical and architectural goals; and a reference model for distributed applications using object-oriented techniques. Four areas of standardization were identified for the reference model: the Object Model (OM), the Object __________2____________ Broker (ORB), the Object Services, and the Common Facilities.;

normalization, transactions

In Step 2.2 the relational schema is validated using the rules of _________1_____________ to ensure that each relation is structurally correct. Normalization is used to improve the model so that it satisfies various constraints that avoid unnecessary duplication of data. In Step 2.3 the relational schema is also validated to ensure that it supports the ________2______________ given in the users' requirements specification.;

integrity constraints

In Step 2.4 the ______________________ ______________________ of the logical data model are checked. Integrity constraints are the constraints that are to be imposed on the database to protect the database from becoming incomplete, inaccurate, or inconsistent. The main types of integrity constraints include: required data, attribute domain constraints, multiplicity, entity integrity, referential integrity, and general constraints.;

users

In Step 2.5 the logical data model is validated by the ______________________ .;

single-level

In contrast, an OODBMS tries to give the illusion of a ________-______________ storage model, with a similar representation in both memory and in the database stored on disk.;

unordered, primary

In physical database design, one approach to selecting an appropriate file organization for a relation is to keep the tuples ________1______________ and create as many secondary indexes as necessary. Another approach is to order the tuples in the relation by specifying a _________2_____________ or clustering index. One approach to determining which secondary indexes are needed is to produce a "wish-list" of attributes that we consider are candidates for indexing, and then to examine the impact of maintaining each of these indexes.;

Oriented, Relational

In response to the increasing complexity of database applications, two new data models have emerged: the Object-______________________ Data Model (OODM) and the Object-______________________ Data Model (ORDM). However, unlike previous models, the actual composition of these models is not clear. This evolution represents the third generation of DBMSs.;

entities, functional

In the FDM, the main modeling primitives are __________1____________ (either entity types or printable entity types) and ________2_____________ relationships.;

data cubes

In the OLAP environment multidimensional data is represented as n-dimensional ______________________ ______________________ . An alternative representation for a data cube is as a lattice of cuboids.;

domains

In the domain relational calculus, domain variables take their values from ______________________ of attributes rather than tuples of relations.;

read-only

In the first two models of data ownership (primary and secondary copy), replicas are __________-____________.

TRUE

In the tuple relational calculus, we are interested in finding tuples for which a predicate is ______________________ . A tuple variable is a variable that "ranges over" a named relation: that is, a variable whose only permitted values are tuples of the relation.;

Armstrong

Inference rules called ______________________ 's axioms can be used to identify a minimal set of functional dependencies from the set of all functional dependencies for a relation.;
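
In practice the axioms are usually applied through the attribute-closure computation. The sketch below is one common way to do this, with hypothetical function names `closure` and `is_redundant`: a dependency that can be derived from the remaining ones is redundant and can be dropped when looking for a minimal set.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have all of lhs, we can add rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_redundant(fd, fds):
    """An FD is redundant if it follows from the remaining FDs."""
    lhs, rhs = fd
    others = [f for f in fds if f != fd]
    return rhs <= closure(lhs, others)

# Hypothetical dependencies over attributes A, B, C.
F = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
    (frozenset("A"), frozenset("C")),  # implied by the first two (transitivity)
]
a_plus = closure({"A"}, F)
```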

DBMS, physical

Logical database design is the process of constructing a model of the data used in an enterprise based on a specific data model but independent of a particular __________1____________ and other _________2_____________ considerations.;

splitting, trusting

Microsoft Office Access DBMS provides four methods to secure the database including _________1_____________ the database, setting a password, _________2_____________ (enabling) the disabled content of a database, packaging, signing, and ______________________ the database.;

ActiveX

Microsoft produced ______________________ Data Objects (ADO) as a programming extension of ASP for database connectivity that provided an easy-to-use API to OLE DB.;

functional

OLAP applications are found in widely divergent ______________________ areas including budgeting, financial performance analysis, sales analysis and forecasting, market research analysis, and market/customer segmentation.;

Multidimensional, Relational, Hybrid, Desktop

OLAP tools are categorized according to the architecture of the database providing the data for the purposes of analytical processing. There are four main categories of OLAP tools: __________1____________ OLAP (MOLAP), ____________2__________ OLAP (ROLAP), __________3____________ OLAP (HOLAP), and __________4____________ OLAP (DOLAP).;

currency

One disadvantage with materialized views is maintaining the ______________________ of the temporary table.;

functional dependency

One of the main concepts associated with normalization is ______________________ ______________________ , which describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B), if each value of A is associated with exactly one value of B. (A and B may each consist of one or more attributes.);
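
The definition can be checked against sample data. The helper `holds` and the `rows` below are hypothetical, and the check has a limit: a sample can show that a dependency is violated, but a dependency that holds in one sample is not thereby proven for the relation in general.

```python
def holds(rows, lhs, rhs):
    """Check whether the FD lhs -> rhs holds in this instance (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[b] for b in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical Staff tuples.
rows = [
    {"staffNo": "S1", "position": "Manager", "salary": 30000},
    {"staffNo": "S2", "position": "Assistant", "salary": 12000},
    {"staffNo": "S3", "position": "Assistant", "salary": 9000},
]
staff_to_position = holds(rows, ["staffNo"], ["position"])
position_to_salary = holds(rows, ["position"], ["salary"])
```

Here `staffNo → position` holds, while `position → salary` fails because two Assistants have different salaries.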

throughput

One of the main objectives of physical database design is to store and access data in an efficient way. There are a number of factors that can be used to measure efficiency, including ______________________, response time, and disk storage.;

Exchange, objects

One of the proposed models for semistructured data is the Object ________1______________ Model (OEM), a nested object model. Data in OEM can be thought of as a labeled directed graph where the nodes are __________2____________.

Analytical

Online ______________________ Processing (OLAP) is the dynamic synthesis, analysis, and consolidation of large volumes of multidimensional data.;

performance

Perhaps two of the most important concerns from the programmer's perspective are ______________________ and ease of use. Both are achieved by having a more seamless integration between the programming language and the DBMS than that provided with traditional database systems.

secondary storage

Physical database design is the process of producing a description of the implementation of the database on _________ _____________ . It describes the base relations and the storage structures and access methods used to access the data effectively, along with any associated integrity constraints and security measures. The design of the base relations can be undertaken only once the designer is fully aware of the facilities offered by the target DBMS.;

decomposition

Query ______________________ takes a query expressed on the global relations and performs a partial optimization using the techniques discussed in Chapter 23.

decomposition

Query ______________________ transforms a high-level query into a relational algebra query, and checks that the query is syntactically and semantically correct. The typical stages of query decomposition are analysis, normalization, semantic analysis, simplification, and query restructuring. A relational algebra tree can be used to provide an internal representation of a transformed query.;

Transformation

Query optimization can apply transformation rules to convert one relational algebra expression into an equivalent expression that is known to be more efficient. ______________________ rules include cascade of selection, commutativity of unary operations, commutativity of Theta join (and Cartesian product), commutativity of unary operations and Theta join (and Cartesian product), and associativity of Theta join (and Cartesian product).;

decomposition, optimization, generation, execution

Query processing can be divided into four main phases: _______1_______________ (consisting of parsing and validation), __________2____________ , code ___________3___________, and _________4_____________. The first three can be done either at compile time or at runtime.;

procedural, nonprocedural, graphical

Relational data manipulation languages are sometimes classified as _______1______________ or ___________2___________ , transform-oriented, __________3____________ , fourth-generation, or fifth-generation;

anomalies

Relations with data redundancy suffer from update ______________________ , which can be classified as insertion, deletion, and modification anomalies.;

ownership, privileges

SQL access control is built around the concepts of authorization identifiers, ______________________ , and ______________________ .

nonprocedural

SQL is a ______________________ language consisting of standard English words such as SELECT, INSERT, and DELETE that can be used by professionals and non-professionals alike. It is both the formal and de facto standard language for defining and manipulating relational databases.;

GROUP BY

SQL supports five aggregate functions (COUNT, SUM, AVG, MIN, and MAX) that take an entire column as an argument and compute a single value as the result. It is illegal to mix aggregate functions with column names in a SELECT clause, unless the ______________________ clause is used.;

exceptions

SQL/PSM supports the declaration of variables and has assignment statements, flow of control statements (IF-THEN-ELSE-END IF; LOOP-EXIT WHEN-END LOOP; FOR-END LOOP; WHILE-END LOOP), and ______________________ .;

fully functionally

Second Normal Form (2NF) is a relation that is in first normal form and every non-primary-key attribute is __________ ____________ dependent on the primary key. Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.;

Data

Several important vendors formed the Object ______________________ Management Group (ODMG) to define standards for OODBMSs. The ODMG produced an Object Model that specifies a standard model for the semantics of database objects. The model is important because it determines the built-in semantics that the OODBMS understands and can enforce. The design of class libraries and applications that use these semantics should be portable across the various OODBMSs that support the Object Model.;

user defined types, user defined routines

Since 1999, the SQL standard includes object management extensions for: row types, ___________1___________ (UDTs) and _________2_____________ (UDRs), polymorphism, inheritance, reference types and object identity, ______________________ types (ARRAYs), new language constructs that make SQL computationally complete, triggers, and support for large objects—Binary Large Objects (BLOBs) and Character Large Objects (CLOBs)—and recursion;

advantages, disadvantages

Some ______________________ of the database approach include control of data redundancy, data consistency, sharing of data, and improved security and integrity. Some ______________________ include complexity, cost, reduced performance, and higher impact of a failure.;

Recovery

Some causes of failure are system crashes, media failures, application software errors, carelessness, natural physical disasters, and sabotage. These failures can result in the loss of main memory and/or the disk copy of the database. ______________________ techniques minimize these effects.;

mandatory, security, clearance

Some commercial DBMSs also provide an approach to access control called __________1____________ Access Control (MAC), which is based on system-wide policies that cannot be changed by individual users. In this approach each database object is assigned a __________2____________ class and each user is assigned a __________3____________ for a security class, and rules are imposed on reading and writing of database objects by users. The SQL standard does not include support for MAC.;

views, merged

Step 2.6 of logical database design is an optional step and is required only if the database has multiple user ___________1___________ that are being managed using the view integration approach (see Section 10.5), which results in the creation of two or more local logical data models. A local logical data model represents the data requirements of one or more, but not all, user views of a database. In Step 2.6 these data models are _________2_____________ into a global logical data model, which represents the requirements of all user views. This logical data model is again validated using normalization, against the required transactions, and by users.;

denormalize

Step 7 of physical database design includes a consideration of whether to ______________________ the relational schema to improve performance. There may be circumstances in which it may be necessary to accept the loss of some of the benefits of a fully normalized design in favor of performance. This option should be considered only when it is estimated that the system will not be able to meet its performance requirements. As a rule of thumb, if performance is unsatisfactory and a relation has a low update rate and a very high query rate, denormalization may be a viable option.;

definition

System ______________________ involves identifying the scope and boundaries of the database system and user views. A user view defines what is required of a database system from the perspective of a particular job role (such as Manager or Supervisor) or enterprise application (such as marketing, personnel, or stock control).;

security, integrity, concurrency and recovery control

The DBMS provides controlled access to the database. It provides ______________________ , ______________________ , ______________________ and ______________________ ______________________ , and a user-accessible catalog. It also provides a view mechanism to simplify the data that users have to deal with.;

centralized

The DDBMS should appear like a ______________________ DBMS by providing a series of transparencies. With distribution transparency, users should not know that the data has been fragmented/replicated. With transaction transparency, the consistency of the global database should be maintained when multiple users are accessing the database concurrently and when failures occur. With performance transparency, the system should be able to efficiently handle queries that reference data at more than one site. With DBMS transparency, it should be possible to have different DBMSs in the system;

detail

The Dimensional Modeling stage of Kimball's Business Dimensional Lifecycle begins by defining a high-level DM, which progressively gains more ______________________ ; this is achieved using a two-phased approach. The first phase is the creation of the high-level DM and the second phase involves adding detail to the model through the identification of dimensional attributes for the model.;

Kimball

The Dimensional Modeling stage of ______________________ 's Business Dimensional Lifecycle can result in the creation of a dimensional model (DM) for a data mart or be used to "dimensionalize" the relational schema of an OLTP database.;

result

The GROUP BY clause allows summary information to be included in the ______________________ table. Rows that have the same value for one or more columns can be grouped together and treated as a unit for using the aggregate functions. In this case, the aggregate functions take each group as an argument and compute a single value for each group as the result. The HAVING clause acts as a WHERE clause for groups, restricting the groups that appear in the final result table. However, unlike the WHERE clause, the HAVING clause can include aggregate functions.;
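
The difference between WHERE and HAVING can be seen in a short example using Python's built-in `sqlite3` module. The `Staff` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Staff (staffNo TEXT, branchNo TEXT, salary REAL);
    INSERT INTO Staff VALUES
        ('S1', 'B1', 30000), ('S2', 'B1', 12000),
        ('S3', 'B2', 24000), ('S4', 'B3', 9000), ('S5', 'B3', 18000);
""")

rows = conn.execute("""
    SELECT branchNo, COUNT(staffNo) AS n, SUM(salary) AS total
    FROM Staff
    WHERE salary > 10000          -- filters rows before grouping
    GROUP BY branchNo
    HAVING COUNT(staffNo) > 1     -- filters groups after aggregation
    ORDER BY branchNo
""").fetchall()
```

Only branch B1 survives: S4 is removed by WHERE, which leaves B2 and B3 with one member each, and HAVING then drops both groups.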

CREATE DOMAIN

The ISO SQL standard provides clauses in the CREATE and ALTER TABLE statements to define integrity constraints that handle required data, domain constraints, entity integrity, referential integrity, and general constraints. Domain constraints can be specified using the CHECK clause or by defining domains using the ______________________ statement.

ON DELETE, ON UPDATE

The ISO SQL standard provides clauses in the CREATE and ALTER TABLE statements to define integrity constraints that handle required data, domain constraints, entity integrity, referential integrity, and general constraints. Foreign keys should be defined using the FOREIGN KEY clause and update and delete rules using the subclauses _________1_____________ and _________2_____________ .
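
A minimal sketch of the two subclauses, using Python's built-in `sqlite3` module with hypothetical `Branch` and `Staff` tables. Note one SQLite-specific assumption: foreign key enforcement is off by default there and has to be enabled with `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
    CREATE TABLE Branch (branchNo TEXT PRIMARY KEY);
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY,
        branchNo TEXT,
        FOREIGN KEY (branchNo) REFERENCES Branch(branchNo)
            ON DELETE SET NULL ON UPDATE CASCADE
    );
    INSERT INTO Branch VALUES ('B1'), ('B2');
    INSERT INTO Staff VALUES ('S1', 'B1'), ('S2', 'B2');
    UPDATE Branch SET branchNo = 'B9' WHERE branchNo = 'B1';  -- cascades to S1
    DELETE FROM Branch WHERE branchNo = 'B2';                 -- nulls S2's branch
""")
staff = conn.execute("SELECT staffNo, branchNo FROM Staff ORDER BY staffNo").fetchall()
```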

CHECK, UNIQUE, CREATE ASSERTION

The ISO SQL standard provides clauses in the CREATE and ALTER TABLE statements to define integrity constraints that handle required data, domain constraints, entity integrity, referential integrity, and general constraints. General constraints can be defined using the _________1_____________ and _________2_____________ clauses. General constraints can also be created using the ________3______________ statement.;
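
Column-level constraints can be exercised with Python's built-in `sqlite3` module. The `Staff` table, the salary range, and the helper `rejected` are hypothetical. SQLite does not support CREATE ASSERTION, so only CHECK, UNIQUE, and NOT NULL are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Staff (
        staffNo TEXT PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,
        salary  REAL NOT NULL CHECK (salary BETWEEN 6000 AND 40000)
    )
""")
conn.execute("INSERT INTO Staff VALUES ('S1', 'a@example.com', 12000)")

def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

bad_salary = rejected("INSERT INTO Staff VALUES ('S2', 'b@example.com', 100)")   # CHECK
dup_email = rejected("INSERT INTO Staff VALUES ('S3', 'a@example.com', 12000)")  # UNIQUE
missing = rejected("INSERT INTO Staff VALUES ('S4', NULL, 12000)")               # NOT NULL
```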

UNIQUE

The ISO SQL standard provides clauses in the CREATE and ALTER TABLE statements to define integrity constraints that handle required data, domain constraints, entity integrity, referential integrity, and general constraints. Primary keys should be defined using the PRIMARY KEY clause and alternate keys using the combination of NOT NULL and ______________________ .

NOT NULL

The ISO SQL standard provides clauses in the CREATE and ALTER TABLE statements to define integrity constraints that handle required data, domain constraints, entity integrity, referential integrity, and general constraints. Required data can be specified using ______________________.

bit, numeric

The ISO standard provides eight base data types: boolean, character, __________1____________ , exact ___________2___________ , approximate __________2____________ , datetime, interval, and character/binary large objects.;

open, split

The Kangaroo, Reporting and Co-Transactions, and MoFlex transaction models are based on the concepts of __________1____________ nested transactions and ___________2___________ transactions and support mobility and disconnections. A mobile host starts a transaction and a subtransaction is ______________________ at the connected mobile support station.;

Facility, Metamodel

The OMG has also developed a number of other specifications including UML (the Unified Modeling Language), which provides a common language for describing software models; MOF (Meta-Object __________1____________ ), which defines a common, abstract language for the specification of metamodels (CORBA, UML, and CWM are all MOF-compliant metamodels); XMI (XML Metadata Interchange), which maps MOF to XML; and CWM (Common Warehouse _________2_____________ ), which defines a metamodel for metadata that is commonly found in data warehousing and business intelligence domains.;

Driven

The OMG has also introduced the Model-______________________ Architecture (MDA) as an approach to system specification and interoperability building upon the above four modeling specifications. It is based on the premise that systems should be specified independently of all hardware and software details. Thus, while the software and hardware may change over time, the specification will still be applicable. Importantly, MDA addresses the complete system lifecycle, from analysis and design to implementation, testing, component assembly, and deployment.;

Definition, signature

The Object ________1______________ Language (ODL) is a language for defining the specifications of object types for ODMG-compliant systems, equivalent to the Data Definition Language (DDL) of traditional DBMSs. The ODL defines the attributes and relationships of types and specifies the _________2_____________ of the operations, but it does not address the implementation of signatures.;

external/conceptual

The ______________________ mapping transforms requests and results between the external and conceptual levels. The conceptual/internal mapping transforms requests and results between the conceptual and internal levels.;

Management

The Object ______________________ Group (OMG) is an international nonprofit industry consortium founded in 1989 to address the issues of object standards. The primary aims of the OMG are promotion of the object-oriented approach to software engineering and the development of standards in which the location, environment, language, and other characteristics of objects are completely transparent to other objects.;

Query

The Object ______________________ Language (OQL) provides declarative access to the object database using an SQL-like syntax. It does not provide explicit update operators, but leaves this to the operations defined on object types.

Fusion

The Oracle ______________________ Middleware is aimed particularly at providing extensibility for distributed environments. It is an n-tier architecture based on industry standards such as HTTP and HTML/XML for Web enablement, Java, J2EE, Enterprise JavaBeans (EJB), JDBC, and SQLJ for database connectivity, Java servlets, and JavaServer Pages (JSP), OMG's CORBA technology, Internet Inter-ORB Protocol (IIOP) for object interoperability and Remote Method Invocation (RMI). It also supports Java Message Service (JMS), Java Naming and Directory Interface (JNDI), and it allows stored procedures to be written in Java;

tables, views

The SELECT clause identifies the columns and/or calculated data to appear in the result table. All column names that appear in the SELECT clause must have their corresponding _________1_____________ or __________2____________ listed in the FROM clause.;

DROP SCHEMA

The SQL DDL statements allow database objects to be defined. The CREATE and __________1____________ statements allow schemas to be created and destroyed; the CREATE, ALTER, and DROP TABLE statements allow tables to be created, modified, and destroyed; the CREATE and DROP INDEX statements allow indexes to be created and destroyed.;

CUBE, ROLLUP

The SQL:2011 standard supports OLAP functionality in the provision of extensions to grouping capabilities such as the __________1____________ and __________2____________ functions and elementary operators such as moving windows and ranking functions;

2011

The SQL:________ standard has defined extensions to SQL to enable the publication of XML, commonly referred to as SQL/XML. In particular, SQL/XML contains: a new native XML data type, XML, which allows XML documents to be treated as relational values in columns of tables, attributes in user-defined types, variables, and parameters to functions, and a set of operators for the type; an implicit set of mappings from relational data to XML.;

XQuery, XPath

The XML ______________________ 1.0 and ______________________ 2.0 Data Model defines the information contained in the input to an XSLT or XQuery Processor as well as all permissible values of expressions in the XSLT, XQuery, and XPath languages. The Data Model is based on the XML Information Set, with the new features to support XML Schema types and representation of collections of documents and of simple and complex values. An instance of the Data Model represents one or more complete XML documents or document parts, each represented by its own tree of nodes. In the Data Model, every value is an ordered sequence of zero or more items, where an item can be an atomic value or a node.;

well-formed, valid

The XML specification provides for two levels of document processing: ___1____-_______________ and ___________2___________ . Basically, an XML document that conforms to the structural and notational rules of XML is considered well-formed. An XML document that is well-formed and also conforms to a DTD is considered valid.;
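
A minimal sketch of the first level (well-formedness), using Python's standard xml.etree.ElementTree parser. The helper name is_well_formed is made up for this guide. The standard-library parser does not validate against a DTD, so the second level (validity) is not checked here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False on a structural error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = is_well_formed("<staff><name>White</name></staff>")
bad = is_well_formed("<staff><name>White</staff>")  # <name> is never closed
```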

operational data store

The ____ __________ ________ (ODS) is a repository of current and integrated operational data used for analysis. It is often structured and supplied with data in the same way as the data warehouse, but may in fact simply act as a staging area for data to be moved into the warehouse.;

system catalog

The _________ _____________ is one of the fundamental components of a DBMS. It contains "data about the data," or metadata. The catalog should be accessible to users. The Information Resource Dictionary System is an ISO standard that defines a set of access methods for a data dictionary. This standard allows dictionaries to be shared and transferred from one system to another;

ETL manager

The _________ _____________ performs all the operations associated with the extraction and loading of data into the warehouse. These operations include simple transformations of the data to prepare the data for entry into the warehouse.;

warehouse manager

The _________ _____________ performs all the operations associated with the management of the data in the warehouse. The operations performed by this component include analysis of data to ensure consistency, transformation, and merging of source data, creation of indexes and views, generation of denormalizations and aggregations, and archiving and backing up data.;

query manager

The _________ _____________ performs all the operations associated with the management of user queries. The operations performed by this component include directing queries to the appropriate tables and scheduling the execution of queries.;

three-tier

The _________-_____________ architecture can be extended to n tiers, with additional tiers added to provide more flexibility and scalability.;

CLR, compiled

The _________1_____________ is an execution engine that loads, executes, and manages code that has been compiled into an intermediate bytecode format known as the Microsoft Intermediate Language (MSIL) or simply IL, analogous to Java bytecodes. However, rather than being interpreted, the code is __________2____________ to native binary format before execution by a just-in-time compiler built into the CLR.

MoFLex

The ______________________ models allow defining predicates about time, costs, and location that affect the execution of a transaction.;

Resource

The ______________________ Description Framework (RDF) is an infrastructure that enables the encoding, exchange, and reuse of structured metadata. This infrastructure enables metadata interoperability through the design of mechanisms that support common conventions of semantics, syntax, and structure. RDF does not stipulate the semantics for each domain of interest, but instead provides the ability for these domains to define metadata elements as required. RDF uses XML as a common syntax for the exchange and processing of metadata.;

ODMG

The ______________________ OM is a superset of the OMG OM, which enables both designs and implementations to be ported between compliant systems. The basic modeling primitives in the model are the object and the literal. Only an object has a unique identifier. Objects and literals can be categorized into types. All objects and literals of a given type exhibit common behavior and state. Behavior is defined by a set of operations that can be performed on or by the object. State is defined by the values an object carries for a set of properties. A property may be either an attribute of the object or a relationship between the object and one or more other objects.;

Microsoft

The ______________________ Open Database Connectivity (ODBC) technology provides a common interface for accessing heterogeneous SQL databases. Microsoft eventually packaged Access and Visual C++ with Data Access Objects (DAO). The object model of DAO consisted of objects such as Databases, TableDefs, QueryDefs, Recordsets, fields, and properties. Microsoft then introduced the Remote Data Objects (RDO) followed by OLE DB, which provides low-level access to any data source. Subsequently

Common Gateway

The ______________________ ______________________ Interface (CGI) is a specification for transferring information between a Web server and a CGI script. It is a popular technique for integrating databases into the Web. Its advantages include simplicity, language independence, Web server independence, and its wide acceptance. Disadvantages stem from the fact that a new process is created for each invocation of the CGI script, which can overload the Web server during peak times.;

Cross Industry Standard Process

The ______________________ ______________________ ______________________ ______________________ for Data Mining (CRISP-DM) specification describes a data mining process model that is not specific to any particular industry or tool.;

dynamic programming

The ______________________ ______________________ algorithm is based on the assumption that the cost model satisfies the principle of optimality. To obtain the optimal strategy for a query consisting of n joins, we need to consider only the optimal strategies that consist of (n - 1) joins and extend those strategies with an additional join. Equivalence classes are created based on interesting orders and the strategy with the lowest cost in each equivalence class is retained for consideration in the next step until the entire query has been constructed, whereby the strategy corresponding to the overall lowest cost is selected.;
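
The core idea (build the best plan for n joins from the best plans for n - 1 joins) can be sketched as follows. This is a simplified illustration, not the full algorithm: it considers only left-deep plans, uses a toy cost model (total rows in intermediate results), and omits the equivalence classes for interesting orders. The function name, statistics, and selectivities are assumptions.

```python
from itertools import combinations


def best_join_order(cardinalities, selectivity):
    """Left-deep join ordering by dynamic programming over sets of relations.

    cardinalities: dict relation name -> row count (assumed statistics).
    selectivity: dict frozenset({a, b}) -> join selectivity; missing pairs
    are treated as cross products (selectivity 1.0).
    Returns (cost, result_rows, join_order) for the cheapest plan found.
    """
    names = sorted(cardinalities)
    # best[S] = (cost, rows, order) for the cheapest plan joining the set S
    best = {frozenset([r]): (0.0, float(cardinalities[r]), (r,)) for r in names}

    for size in range(2, len(names) + 1):
        for subset in combinations(names, size):
            s = frozenset(subset)
            for r in subset:  # r is the relation joined last
                rest = s - {r}
                cost, rows, order = best[rest]  # optimal plan with one fewer join
                sel = 1.0
                for other in rest:
                    sel *= selectivity.get(frozenset({r, other}), 1.0)
                out_rows = rows * cardinalities[r] * sel
                candidate = (cost + out_rows, out_rows, order + (r,))
                if s not in best or candidate[0] < best[s][0]:
                    best[s] = candidate
    return best[frozenset(names)]


stats = {"A": 1000, "B": 100, "C": 10}
sels = {frozenset({"A", "B"}): 0.01, frozenset({"B", "C"}): 0.1}
cost, rows, order = best_join_order(stats, sels)
```

With these made-up statistics, the cheapest plan joins B and C first and A last, because the B-C join produces the smallest intermediate result.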

.NET Framework

The ______________________ ______________________ class library is a collection of reusable classes, interfaces, and types that integrate with the CLR.

CLR

The ______________________ allows one language to call another, and even inherit and modify objects from another language.

centralized

The ______________________ approach involves merging the requirements for each user view into a single set of requirements for the new database system. A data model representing all user views is created during the database design stage. In the view integration approach, requirements for each user view remain as separate lists. Data models representing each user view are created then merged later during the database design stage.;

WHERE

The ______________________ clause selects rows to be included in the result table by applying a search condition to the rows of the named table(s). The ORDER BY clause allows the result table to be sorted on the values in one or more columns. Each column can be sorted in ascending or descending order. If specified, the ORDER BY clause must be the last clause in the SELECT statement.;
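
A short illustration of row selection followed by ORDER BY, run through Python's sqlite3 module. The Staff table and its rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, lName TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [("S1", "White", 30000), ("S2", "Beech", 12000), ("S3", "Ford", 18000)],
)

# The search condition filters rows; ORDER BY comes last and sorts the result.
rows = conn.execute(
    "SELECT staffNo, salary FROM Staff "
    "WHERE salary > 15000 "
    "ORDER BY salary DESC"
).fetchall()
```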

functional

The ______________________ data model (FDM) shares certain ideas with the object approach, including object identity, inheritance, overloading, and navigational access. In the FDM, any data retrieval task can be viewed as the process of evaluating and returning the result of a function with zero, one, or more arguments.

operational

The ______________________ data source for the data warehouse is supplied from mainframe operational data held in first-generation hierarchical and network databases, departmental data held in proprietary file systems, private data held on workstations and private servers and external systems such as the Internet, commercially available databases, or databases associated with an organization's suppliers or customers.;

Relational Database Management System (RDBMS)

The ______________________ has become the dominant data-processing software in use today, with estimated new licence sales of between US billion and US billion per year (US billion with tools sales included). This software represents the second generation of DBMSs and is based on the relational data model proposed by E. F. Codd.;

database

The ______________________ is a fundamental component of an information system, and its development and usage should be viewed from the perspective of the wider requirements of the organization. Therefore, the lifecycle of an organizational information system is inherently linked to the lifecycle of the database that supports it.;

DBMS

The ______________________ is now the underlying framework of the information system and has fundamentally changed the way in which many organizations operate. The database system remains a very active research area and many significant problems remain.;

determinant

The ______________________ of a functional dependency refers to the attribute, or group of attributes, on the left-hand side of the arrow.;

degree

The ______________________ of a relation is the number of attributes, and the cardinality is the number of tuples. A unary relation has one attribute, a binary relation has two, a ternary relation has three, and an n-ary relation has n attributes.;

degree

The ______________________ of a relationship type is the number of participating entity types in a relationship.;

systems

The ______________________ specification describes any features to be included in the database system, such as the performance and security requirements;

SELECT

The ______________________ statement is the most important statement in the language and is used to express a query. It combines the three fundamental relational algebra operations of Selection, Projection, and Join. Every ______________________ statement produces a query result table consisting of one or more columns and zero or more rows.;

COMMIT

The ______________________ statement signals successful completion of a transaction and all changes to the database are made permanent.

ROLLBACK

The ______________________ statement signals that the transaction should be aborted and all changes to the database are undone.;
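
The effect of the two statements can be seen in a small sketch with Python's sqlite3 module, where conn.commit() and conn.rollback() end the current transaction. The Account table and the amounts are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO Account VALUES (1, 100)")
conn.commit()  # the initial balance is now permanent

# First attempt: the update is undone by the rollback.
conn.execute("UPDATE Account SET balance = balance - 40 WHERE id = 1")
conn.rollback()
after_rollback = conn.execute(
    "SELECT balance FROM Account WHERE id = 1"
).fetchone()[0]

# Second attempt: the same update is made permanent by the commit.
conn.execute("UPDATE Account SET balance = balance - 40 WHERE id = 1")
conn.commit()
after_commit = conn.execute(
    "SELECT balance FROM Account WHERE id = 1"
).fetchone()[0]
```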

shareable, modular, standards

The advantages of a DDBMS are that it reflects the organizational structure; it makes remote data more _________1_____________; it improves reliability, availability, and performance; it may be more economical; it provides for __________2____________ growth, facilitates integration, and helps organizations remain competitive. The major disadvantages are cost, complexity, lack of ___________3___________, and experience.;

high-level

The aims of query processing are to transform a query written in a ______________________-______________________ language, typically SQL, into a correct and efficient execution strategy expressed in a low-level language like the relational algebra, and to execute the strategy to retrieve the required data.;

transactions

The conceptual data model is validated to ensure it supports the required ______________________ . Two possible approaches to ensure that the conceptual data model supports the required transactions are: (1) checking that all the information (entities, relationships, and their attributes) required by each transaction is provided by the model by documenting a description of each transaction's requirements; (2) ______________________ representing the pathway taken by each transaction directly on the ER diagram.;

total cost, response time

The cost model for distributed query optimization can be based on ______________________ ______________________ (as in the centralized case) or ______________________ ______________________, that is, the elapsed time from the start to the completion of the query. The latter model takes account of the inherent parallelism in a distributed system. Cost needs to take account of local processing costs (I/O and CPU) as well as networking costs. In a WAN, the networking costs will be the dominant factor to reduce.;

FILE-BASED APPROACH

The database approach emerged to resolve the problems with the ______________________ . A database is a shared collection of logically related data and a description of this data, designed to meet the information needs of an organization. A DBMS is a software system that enables users to define, create, maintain, and control access to the database. An application program is a computer program that interacts with the database by issuing an appropriate request (typically a SQL statement) to the DBMS. The more inclusive term database system is used to define a collection of application programs that interact with the database along with the DBMS and database itself.;

conceptual, logical, physical

The database design methodology includes three main phases: ________1______________ , _________2_____________ , and __________3____________ database design.;

unsupervised learning

The database segmentation approach of predictive modeling uses ______________________ ______________________ to discover homogeneous subpopulations in a database to improve the accuracy of the profiles.;

allocation

The definition and ______________________ of fragments are carried out strategically to achieve locality of reference, improved reliability and availability, acceptable performance, balanced storage capacities and costs, and minimal communication costs.

function, procedure

The difference between a procedure and a function is that a __________1____________ will always return a single value to the caller, whereas a __________2____________ will not. Usually, procedures are used unless only one return value is needed.;

month

The guiding principles associated with Kimball's Business Dimensional lifecycle are the focus on meeting the information requirements of the enterprise by building a single, integrated, easy-to-use, high-performance information infrastructure, which is delivered in meaningful increments of six- to twelve-______________________ timeframes.;

algorithms

The important characteristics of data mining tools include: data preparation facilities; selection of data mining operations (______________________ ); scalability and performance; and facilities for understanding results.;

target

The initial step (Step 3) of physical database design is the translation of the logical data model into a form that can be implemented in the ______________________ relational DBMS.;

computationally complete

The initial versions of the SQL language had no programming constructs; that is, it was not ___________ ___________ . However, with the more recent versions of the standard, SQL is now a full programming language with extensions known as SQL/PSM (Persistent Stored Modules).;

views, calculations, time

The key characteristics of OLAP applications include multidimensional ______________________ of data, support for complex ______________________ , and ______________________ intelligence.;

conformed

The key to understanding the relationship between dimensional models and ER models is that a single ER model normally decomposes into multiple DMs. The multiple DMs are then associated through ______________________ (shared) dimension tables.;

one

The main characteristics of functional dependencies that we use for normalization have a one-to-______________________ relationship between attribute(s) on the left-hand and right-hand sides of the dependency, hold for all time, and are fully functionally dependent.;

entity, alternate

The main objective of Step 1 of the methodology is to build a conceptual data model of the data requirements of the enterprise. A conceptual data model comprises: ___________1___________ types, relationship types, attributes, attribute domains, primary keys, and __________2____________ keys.;

system definition

The main stages of the database system development lifecycle include: database planning, __________ ____________ , requirements collection and analysis, database design, DBMS selection (optional), application design, prototyping (optional), implementation, data conversion and loading, testing, and operational maintenance.;

block, indexed, sort-merge, hash

The main strategies for implementing the Join operation are: _________1_____________ nested loop join, _________2_____________ nested loop join, ______3______-__________ join, and ___________4___________ join.;
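
Two of these strategies can be contrasted in a simplified in-memory sketch. Real implementations work on disk blocks and partitions; the function names and sample tuples here are assumptions made for illustration.

```python
from collections import defaultdict


def nested_loop_join(r, s, key_r, key_s):
    """Compare every tuple of r with every tuple of s."""
    return [(a, b) for a in r for b in s if a[key_r] == b[key_s]]


def hash_join(r, s, key_r, key_s):
    """Build a hash table on r, then probe it once per tuple of s."""
    buckets = defaultdict(list)
    for a in r:
        buckets[a[key_r]].append(a)
    return [(a, b) for b in s for a in buckets.get(b[key_s], [])]


staff = [("S1", "B3"), ("S2", "B5"), ("S3", "B3")]
branch = [("B3", "Glasgow"), ("B5", "London"), ("B7", "Aberdeen")]

# Join Staff.branchNo (position 1) with Branch.branchNo (position 0).
by_loop = nested_loop_join(staff, branch, 1, 0)
by_hash = hash_join(staff, branch, 1, 0)
```

Both functions return the same pairs. The nested loop performs len(r) * len(s) comparisons, while the hash join makes one pass over each input.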

linear, binary, hash, primary, primary, clustering, nonclustering, secondary

The main strategies for implementing the Selection operation are: __________1____________ search (unordered file, no index), ________2_____________ search (ordered file, no index), equality on ________3______________ key, equality condition on ________4______________ key, inequality condition on ________5______________ key, equality condition on ___________6___________ (secondary) index, equality condition on a __________7___________ (secondary) index, and inequality condition on a __________8____________ B+-tree index.;

ETL

The major components of a data warehouse include the operational data sources, operational data store, ______________________ manager, warehouse manager, query manager, detailed, lightly and highly summarized data, archive/backup data, metadata, and end-user access tools.;

numerical, additive

The most useful facts in a fact table are ___________1___________ and ________2______________ , because data warehouse applications almost never access a single record; rather, they access hundreds, thousands, or even millions of records at a time and the most useful thing to do with so many records is to aggregate them.;

base

The next step (Step 4) of physical database design designs the file organizations and access methods that will be used to store the ______________________ relations. This involves analyzing the transactions that will run on the database, choosing suitable file organizations based on this analysis, choosing indexes and, finally, estimating the disk space that will be required by the implementation.;

.NET

The next, and current, evolution in Microsoft's Web solution strategy was the development of Microsoft ______________________ . There are various tools, services, and technologies in the new platform such as Windows Server, BizTalk Server, Commerce Server, Application Center, Mobile Information Server, SQL Server (an object-relational DBMS), and Microsoft Visual Studio .NET.

SQL

The objective of Step 5 of physical database design is to design an implementation of the user views identified during the requirements collection and analysis stage, such as using the mechanisms provided by ______________________.;

atomicity

The objectives of distributed transaction processing are the same as those of centralized systems, although more complex because the DDBMS must ensure the ______________________ of the global transaction and each subtransaction.;

implemented

The physical database design phase allows the designer to make decisions on how the database is to be ______________________ . Therefore, physical design is tailored to a specific DBMS. There is feedback between physical and conceptual/logical design, because decisions taken during physical design to improve performance may affect the structure of the conceptual/logical data model.;

investments, competitive

The potential benefits of data warehousing are high returns on ______________________ , substantial ______________________ advantage, and increased productivity of corporate decision makers.;

FILE-BASED

The predecessor to the DBMS was the ______________________ , which is a collection of application programs that perform services for the end-users, usually the production of reports. Each program defines and manages its own data. Although the file-based system was a great improvement over the manual filing system, it still has significant problems, mainly the amount of data redundancy present and program-data dependence.;

supervised learning

The predictive model is developed using a __________ ____________ approach, which has two phases: training and testing.

algebra

The relational ______________________ is a (high-level) procedural language: it can be used to tell the DBMS how to build a new relation from one or more relations in the database. The relational calculus is a nonprocedural language: it can be used to formulate the definition of a relation in terms of one or more database relations. However, formally the relational algebra and relational calculus are equivalent to one another: for every expression in the algebra, there is an equivalent expression in the calculus (and vice versa).;

calculus

The relational ______________________ is used to measure the selective power of relational languages. A language that can be used to produce any relation that can be derived using the relational calculus is said to be relationally complete. Most relational query languages are relationally complete but have more expressive power than the relational algebra or relational calculus because of additional operations such as calculated, summary, and ordering functions.;

subset

The relational algebra is logically equivalent to a ______________________ of the relational calculus (and vice versa).;

tuple, domain

The relational calculus is a formal nonprocedural language that uses predicates. There are two forms of the relational calculus: _________1_____________ relational calculus and _________2_____________ relational calculus.;

semantic, impedance

The relational model, and relational systems in particular, have weaknesses such as poor representation of "real-world" entities, __________1____________ overloading, poor support for integrity and enterprise constraints, limited operations, and _________2_____________ mismatch. The limited modeling capabilities of relational DBMSs have made them unsuitable for advanced database applications.;

performance, processing

The requirements for a data warehouse DBMS include load ______________________ , load ______________________ , data quality management, query performance, terabyte scalability, mass user scalability, networked data warehouse, warehouse administration, integrated dimensional analysis, and advanced query functionality.;

hierarchical and CODASYL

The roots of the DBMS lie in file-based systems. The ______________________ and ______________________ systems represent the first generation of DBMSs. The hierarchical model is typified by IMS (Information Management System) and the network or CODASYL model by IDS (Integrated Data Store), both developed in the mid-1960s. The relational model, proposed by E. F. Codd in 1970, represents the second generation of DBMSs. It has had a fundamental effect on the DBMS community and there are now over one hundred relational DBMSs. The third generation of DBMSs are represented by the Object-Relational DBMS and the Object-Oriented DBMS.;

proxy, Secure Sockets, Transactions, Transfer

The security measures associated with DBMSs on the Web include: ________1_____________ servers, firewalls, message digest algorithms and digital signatures, digital certificates, Kerberos, __________2____________ ___________2___________ Layer (SSL) and Secure HTTP (S-HTTP), Secure Electronic ___________3___________ (SET) and Secure ________4______________ Technology (STT), Java security, and ActiveX security.;

intension

The structure of the relation, with domain specifications and other constraints, is part of the ______________________ of the database; the relation with all its tuples written out represents an instance or extension of the database.;

completeness, reconstruction, disjointness

The three correctness rules of fragmentation are _________1_____________, _____________2_________, and __________3____________.;
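
For horizontal fragments, the three rules can be checked mechanically. The following is a sketch under the assumption that a relation and its fragments are modelled as Python sets of tuples; the helper names and sample data are made up.

```python
def is_complete(relation, fragments):
    """Every tuple of the relation appears in at least one fragment."""
    return all(any(t in f for f in fragments) for t in relation)


def is_reconstructable(relation, fragments):
    """The union of the horizontal fragments gives back the relation."""
    return set().union(*fragments) == relation


def is_disjoint(fragments):
    """No tuple appears in more than one fragment."""
    return all(
        a.isdisjoint(b)
        for i, a in enumerate(fragments)
        for b in fragments[i + 1:]
    )


relation = {("S1", "London"), ("S2", "Glasgow"), ("S3", "London")}
fragments = [
    {t for t in relation if t[1] == "London"},
    {t for t in relation if t[1] != "London"},
]
ok = (
    is_complete(relation, fragments)
    and is_reconstructable(relation, fragments)
    and is_disjoint(fragments)
)
```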

concurrency, recovery

The transaction is also the unit of ___________1___________ and the unit of __________2____________.;

voting, decision

The two-phase commit (2PC) protocol comprises a ______________________ and ______________________ phase, in which the coordinator asks all participants whether they are ready to commit. If one participant votes to abort, the global transaction and each subtransaction must be aborted. Only if all participants vote to commit can the global transaction be committed. The 2PC protocol can leave sites blocked in the presence of site failures.;
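
The commit rule can be sketched as a tiny simulation. This ignores timeouts, logging, and site failures, which a real 2PC implementation has to handle; the function name and site names are hypothetical.

```python
def two_phase_commit(votes):
    """Simulate the coordinator's side of 2PC.

    votes: dict participant -> True (ready to commit) or False (abort).
    Returns the global decision sent to every participant.
    """
    # Phase 1: collect the vote of every participant.
    all_ready = all(votes.values())
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    decision = "COMMIT" if all_ready else "ABORT"
    return {site: decision for site in votes}


unanimous = two_phase_commit({"London": True, "Glasgow": True})
one_abort = two_phase_commit({"London": True, "Glasgow": False})
```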

requirements

The users' ______________________ specification describes in detail the data to be held in the database and how the data is to be used.;

resources

There are a number of issues that need to be addressed with mobile DBMSs, including managing limited ______________________ , security, transaction handling, and query processing.;

mixed, derived

There are also two other types of fragmentation: ______________________ and ______________________, a type of horizontal fragmentation where the fragmentation of one relation is based on the fragmentation of another relation.;

centralized, fragmented, complete, selective

There are four allocation strategies regarding the placement of data: __________1____________ (a single centralized database), __________2____________ (fragments assigned to one site), _________3_____________ replication (complete copy of the database maintained at each site), and _________4_____________ replication (combination of the first three).;

tuple, shredded, schema-independent, parsed

There are four general approaches to storing an XML document in a relational database: store the XML as the value of some attribute within a ___________1___________ ; store the XML in a __________2____________ form across a number of attributes and relations; store the XML in a _____3_____-____________ form; store the XML in a _________4_____________ form; that is, convert the XML to an internal format, such as an Infoset or PSVI representation, and store this representation.;

users

There are several critical factors for the success of the database design stage, including, for example, working interactively with ______________________ and being willing to repeat steps.;

associations, sequential, time

There are three specializations of link analysis: __________1____________ discovery, __________2____________ pattern discovery, and similar __________3____________ sequence discovery. Associations discovery finds items that imply the presence of other items in the same event.

scalar, row, and table

There are three types of subquery: __________1____________ , __________2____________ , and _________3_____________ . A scalar subquery returns a single column and a single row, that is, a single value. In principle, a scalar subquery can be used whenever a single value is needed. A row subquery returns multiple columns, but only a single row. A row subquery can be used whenever a row value constructor is needed, typically in predicates. A table subquery returns one or more columns and multiple rows. A table subquery can be used whenever a table is needed; for example, as an operand for the IN predicate.;
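
Two of the three kinds can be illustrated with Python's sqlite3 module: a scalar subquery used as a single value, and a table subquery used as the operand of IN. The Staff and Branch tables and their rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Staff (staffNo TEXT, branchNo TEXT, salary INTEGER)")
conn.execute("CREATE TABLE Branch (branchNo TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?)",
    [("S1", "B3", 30000), ("S2", "B5", 12000), ("S3", "B3", 18000)],
)
conn.executemany(
    "INSERT INTO Branch VALUES (?, ?)",
    [("B3", "Glasgow"), ("B5", "London")],
)

# Scalar subquery: returns one value (the average salary, 20000 here).
above_average = conn.execute(
    "SELECT staffNo FROM Staff "
    "WHERE salary > (SELECT AVG(salary) FROM Staff) "
    "ORDER BY staffNo"
).fetchall()

# Table subquery: returns a set of rows used as the operand of IN.
in_glasgow = conn.execute(
    "SELECT staffNo FROM Staff "
    "WHERE branchNo IN (SELECT branchNo FROM Branch WHERE city = 'Glasgow') "
    "ORDER BY staffNo"
).fetchall()
```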

participation, disjoint

There are two constraints that may apply to a specialization/generalization called _________1_____________ constraints and _________2_____________ constraints.;

requirements, systems

There are two main documents created during the requirements collection and analysis stage: the users' _________1_____________ specification and the __________2____________ specification.;

enterprise, Kimball, Inmon

There are two main methodologies that incorporate the development of an ___________1___________ data warehouse (EDW) that were proposed by the two key players in the data warehouse arena: __________2____________ 's Business Dimensional Lifecycle (Kimball, 2008) and __________3____________ 's Corporate Information Factory (CIF) methodology (Inmon, 2001).;

heuristics, relative costs

There are two main techniques for query optimization, although the two strategies are usually combined in practice. The first technique uses __________1____________ rules that order the operations in a query. The other technique compares different strategies based on their _________2_____________ ________2______________ and selects the one that minimizes resource usage.;

logical, physical

There are two types of OID (object identifiers): ________1______________ OIDs, which are independent of the physical location of the object on disk, and _________2_____________ OIDs, which encode the location.

object

There is no single extended relational data model; rather, there are a variety of these models, whose characteristics depend upon the way and the degree to which extensions were made. However, all the models do share the same basic relational tables and query language, all incorporate some concept of "______________________ ," and some have the ability to store methods or procedures/triggers as well as data in the database.;

transitively

Third Normal Form (3NF) is a relation that is in first and second normal form in which no non-primary-key attribute is ______________________ dependent on the primary key. Transitive dependency is a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).;

swizzling, faulting

To achieve the required performance, an OODBMS must be able to convert OIDs to and from in-memory pointers. This conversion technique has become known as pointer _________1____________ or object ________2______________ , and the approaches used to implement it have become varied, ranging from software-based residency checks to page-faulting schemes used by the underlying hardware.;

log file

To facilitate recovery, one method is for the system to maintain a ______________________ ______________________ containing transaction records that identify the start/end of transactions and the before- and after-images of the write operations. Using deferred updates, writes are done initially to the log only and the log records are used to perform actual updates to the database. If the system fails, it examines the log to determine which transactions it needs to redo, but there is no need to undo any writes. Using immediate updates, an update may be made to the database itself any time after a log record is written. The log can be used to undo and redo transactions in the event of failure.;
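
Recovery under deferred updates can be sketched as a single redo pass over the log. The record layout used here (tuples such as ("write", txn, item, after_image)) is an assumption made for illustration, not a format from the text.

```python
def recover_deferred(log):
    """Rebuild database state from a log under the deferred update protocol.

    Writes of committed transactions are redone in log order. Writes of
    transactions without a commit record never reached the database, so
    there is nothing to undo.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    db = {}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, txn, item, after_image = rec
            db[item] = after_image  # redo using the after-image
    return db


log = [
    ("start", "T1"),
    ("write", "T1", "x", 5),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "y", 9),  # T2 had not committed when the system failed
]
state = recover_deferred(log)
```

Only T1's write survives, because T2 has no commit record in the log.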

performance

To improve ______________________, it is necessary to be aware of how the following four basic hardware components interact and affect system performance: main memory, CPU, disk I/O, and network;

B-tree

Traditional RDBMSs use ___-___________________ indexes to speed access to scalar data. With the ability to define complex data types in an ORDBMS, specialized index structures are required for efficient access to data. Some ORDBMSs are beginning to support additional index types, such as generic B-trees, R-trees (region trees) for fast access to two- and three-dimensional data, and the ability to index on the output of a function. A mechanism to plug in any user-defined index structure provides the highest level of flexibility.;

Event-Condition-Action (ECA)

Triggers are based on the ______________________ model: the event (or events) that trigger the rule, the condition that determines whether the action should be executed, and the action to be taken.;
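As a hedged illustration of the three parts, the sketch below uses Python's standard `sqlite3` module and SQLite's trigger syntax. The tables `staff` and `salary_audit`, the trigger name, and the 10% threshold are invented for the example; other DBMSs use a somewhat different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (staff_no TEXT PRIMARY KEY, salary REAL);
CREATE TABLE salary_audit (staff_no TEXT, old_salary REAL, new_salary REAL);

CREATE TRIGGER log_big_raise
AFTER UPDATE OF salary ON staff              -- event
WHEN NEW.salary > OLD.salary * 1.10          -- condition
BEGIN
    INSERT INTO salary_audit                 -- action
    VALUES (OLD.staff_no, OLD.salary, NEW.salary);
END;
""")

conn.execute("INSERT INTO staff VALUES ('S1', 1000)")
# 5% raise: the event fires but the condition is false, so no action is taken.
conn.execute("UPDATE staff SET salary = 1050 WHERE staff_no = 'S1'")
# Raise above 10%: the condition is true, so the action inserts an audit row.
conn.execute("UPDATE staff SET salary = 1300 WHERE staff_no = 'S1'")

audit_rows = conn.execute("SELECT * FROM salary_audit").fetchall()
```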

locking, timestamping

Two methods that can be used to guarantee distributed serializability are ______________________ and ______________________. In two-phase locking (2PL) a transaction acquires all its locks before releasing any. Two-phase locking protocols can use centralized, primary copy, or distributed lock managers. Majority voting can also be used. With timestamping, transactions are ordered in such a way that older transactions get priority in the event of conflict.;

serializability

Two methods that guarantee ______________________ are two-phase locking (2PL) and timestamping. Locks may be shared (read) or exclusive (write). In two-phase locking, a transaction acquires all its locks before releasing any. In timestamping, transactions are ordered in such a way that older transactions get priority in the event of conflict.;
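The two-phase rule (a growing phase followed by a shrinking phase) can be checked mechanically. The following is a small sketch under assumed names: a transaction is modelled as a list of `(operation, item)` pairs, and `obeys_two_phase_locking` is a hypothetical helper, not part of any DBMS.

```python
def obeys_two_phase_locking(operations):
    """Return True if the transaction never acquires a lock after its first unlock."""
    shrinking = False
    for op, _item in operations:
        if op == "unlock":
            shrinking = True  # the shrinking phase has begun
        elif op in ("read_lock", "write_lock") and shrinking:
            return False      # a lock requested after an unlock violates 2PL
    return True


two_phase = [("read_lock", "x"), ("write_lock", "y"), ("unlock", "x"), ("unlock", "y")]
not_two_phase = [("read_lock", "x"), ("unlock", "x"), ("write_lock", "y"), ("unlock", "y")]
```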

expression, sequence, values, nodes

The W3C XML Query Working Group has proposed a query language for XML called XQuery. XQuery is a functional language in which a query is represented as an __________1____________ . The value of an expression is always a __________2____________ , which is an ordered collection of one or more atomic __________3____________ or __________4____________ . XQuery supports several kinds of expression, which can be nested (supporting the notion of a subquery).;

ancestors, intention

When a new transaction requests a lock, it is easy to check all the __________1____________ of the object to determine whether they are already locked. To show whether any of the node's descendants are locked, an __________2____________ lock is placed on all the ancestors of any node being locked.;
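A rough sketch of how intention locks propagate up a granularity hierarchy is shown below. The hierarchy in `PARENT`, the node names, and the lock-table layout are all assumptions for illustration, and the sketch deliberately omits lock compatibility checks.

```python
# Hypothetical granularity hierarchy, stored as child -> parent.
PARENT = {"record_1": "page_A", "page_A": "file_F", "file_F": "database", "database": None}


def ancestors(node):
    """Return the ancestors of a node, ordered from the root downwards."""
    chain = []
    node = PARENT[node]
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return list(reversed(chain))


def lock_with_intention(node, mode, lock_table):
    """Place an intention lock (IS or IX) on every ancestor, then lock the node itself."""
    intention = "IS" if mode == "S" else "IX"
    for ancestor in ancestors(node):
        lock_table.setdefault(ancestor, set()).add(intention)
    lock_table.setdefault(node, set()).add(mode)
    return lock_table


locks = lock_with_intention("record_1", "X", {})
```

A later request to lock `file_F` only needs to inspect the entry for `file_F` and its ancestors: the `IX` mark already signals that some descendant is exclusively locked.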

Semijoin

When the main cost component is communication time, the ______________________ operation is particularly useful for improving the processing of distributed joins by reducing the amount of data transferred between sites
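The data reduction can be seen in a toy example. The relations, attribute layout, and site names below are made up for illustration; a real distributed DBMS would ship these sets over the network rather than hold them in one process.

```python
# Hypothetical relations held at two different sites.
staff_site1 = [("S1", "B003"), ("S2", "B005"), ("S3", "B003")]  # (staff_no, branch_no)
branch_site2 = [
    ("B003", "Glasgow"),
    ("B005", "London"),
    ("B007", "Aberdeen"),
    ("B004", "Bristol"),
]  # (branch_no, city)

# Step 1: site 1 ships only the distinct join-attribute values to site 2.
join_values = {branch_no for _staff_no, branch_no in staff_site1}

# Step 2: site 2 computes the semijoin, keeping only branch rows that will match.
reduced_branch = [row for row in branch_site2 if row[0] in join_values]

# Step 3: site 2 ships the reduced relation back and site 1 completes the join.
joined = [
    (staff_no, branch_no, city)
    for staff_no, branch_no in staff_site1
    for other_branch_no, city in reduced_branch
    if branch_no == other_branch_no
]
```

Here two branch rows cross the network instead of four; the saving grows with the share of rows that do not participate in the join.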

materialization

With ______________________ the output of one operation is stored in a temporary relation for processing by the next operation. An alternative approach is to pipeline the results of one operation to another operation without creating a temporary relation to hold the intermediate result, thereby saving the cost of creating temporary relations and reading the results back in again.;
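One way to picture the difference is with Python lists versus generators. This is only an analogy under assumed names (`select_materialized`, `select_pipelined`, and so on are hypothetical), not how a query processor is actually built.

```python
rows = [{"staff_no": i, "salary": 1000 * i} for i in range(1, 6)]


def select_materialized(relation, predicate):
    # The whole intermediate result is built (the "temporary relation").
    return [r for r in relation if predicate(r)]


def project_materialized(relation, attrs):
    return [{a: r[a] for a in attrs} for r in relation]


def select_pipelined(relation, predicate):
    # Each qualifying tuple is handed to the next operation as soon as it is found.
    for r in relation:
        if predicate(r):
            yield r


def project_pipelined(relation, attrs):
    for r in relation:
        yield {a: r[a] for a in attrs}


def high_paid(r):
    return r["salary"] > 2000


materialized = project_materialized(select_materialized(rows, high_paid), ["staff_no"])
pipelined = list(project_pipelined(select_pipelined(rows, high_paid), ["staff_no"]))
```

Both plans return the same tuples; the pipelined one never holds the intermediate selection result in full.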

update-anywhere

With the __________-____________ model of data ownership, each copy can be updated and so a mechanism for conflict detection and resolution must be provided to maintain data integrity.;

OSI-TP

X/Open DTP is a distributed transaction processing architecture for a distributed 2PC protocol, based on ___-__. The architecture defines application programming interfaces and interactions among transactional applications, transaction managers, resource managers, and communication managers.;

tree, event

XML APIs generally fall into two categories: ______________________ -based and ______________________ -based.

end-user access

____-________ __________ tools can be categorized into four main groups: traditional data reporting and query tools, application development tools, online analytical processing (OLAP) tools, and data mining tools.;

computer-based

________-______________ security controls for the multiuser environment include: authorization, access controls, views, backup and recovery, integrity, encryption, and RAID technology.;

authorization identifiers

_________ _____________ are assigned to database users by the DBA and identify a user. Each object that is created in SQL has an owner. The owner can pass privileges on to other users using the GRANT statement and can revoke the privileges passed on using the REVOKE statement.

functions and services

_________ and _____________ of a multi-user DBMS include data storage, retrieval, and update; a user-accessible catalog; transaction support; concurrency control and recovery services; authorization services; support for data communication; integrity services; services to promote data independence; and utility services.;

data conversion, loading

_________1_____________ __________1____________ and ________2______________ involves transferring any existing data into the new database and converting any existing applications to run on the new database.;

fact-finding

__________-____________ is the formal process of using techniques such as interviews and questionnaires to collect facts about systems, requirements, and preferences.;

Entity integrity

___________ ___________ is a constraint that states that in a base relation no attribute of a primary key can be null. Referential integrity states that foreign key values must match a candidate key value of some tuple in the home relation or be wholly null. Apart from relational integrity, integrity constraints include required data, domain, and multiplicity constraints; other integrity constraints are called general constraints.;

Starflake, normalized, denormalized

___________1___________ schema is a dimensional data model that has a fact table in the center, surrounded by _________2_____________ and _________3_____________ dimension tables.;

application

______________________ design involves user interface design and transaction design, which describes the application programs that use and process the database. A database transaction is an action or series of actions carried out by a single user or application program, which accesses or changes the content of the database.;

DOM

______________________ (Document Object Model) is a tree-based API for XML that provides an object-oriented view of the data. The API was created by W3C and describes a set of platform- and language-neutral interfaces that can represent any well-formed XML or HTML document.
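For a feel of the tree-based style, here is a short sketch using `xml.dom.minidom` from the Python standard library. The sample document and element names are invented for the example.

```python
from xml.dom.minidom import parseString

# The whole document is parsed into an in-memory tree before any access.
doc = parseString(
    "<staff>"
    "<member id='S1'><name>Ann</name></member>"
    "<member id='S2'><name>Bob</name></member>"
    "</staff>"
)

# Navigate the tree through object-oriented node interfaces.
members = doc.getElementsByTagName("member")
names = {
    m.getAttribute("id"): m.getElementsByTagName("name")[0].firstChild.data
    for m in members
}
```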

SAX

______________________ (Simple API for XML) is an event-based, serial-access API for XML that uses callbacks to report parsing events to the application. The application handles these events through customized event handlers.;
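The callback style can be illustrated with Python's standard `xml.sax` module. The handler class `NameCollector` and the sample document are assumptions made for this sketch.

```python
import xml.sax


class NameCollector(xml.sax.ContentHandler):
    """Customized event handler that collects the text of every <name> element."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered until the end tag.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "name":
            self.names.append("".join(self._buffer))
            self._in_name = False


handler = NameCollector()
xml.sax.parseString(
    b"<staff><member><name>Ann</name></member><member><name>Bob</name></member></staff>",
    handler,
)
```

The parser reads the document serially and never builds a tree; the application sees only the events it chose to handle.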

XML, HTML

______________________ (eXtensible Markup Language) is a metalanguage (a language for describing other languages) that enables designers to create their own customized tags to provide functionality not available with ______________________ . XML is a restricted form of SGML designed as a less complex markup language than SGML that is, at the same time, network-aware.;

ADO

______________________ .NET is the next version of ADO with new classes that expose data access services to the programmer. ADO.NET is one component of the .NET Framework that was designed to address three main weaknesses with ADO: providing a disconnected data access model that is required in the Web environment, providing compatibility with the .NET Framework class library, and providing extensive support for XML.;

object

______________________ OIDs, a level of indirection is required to look up the physical address of the object on disk.

Intellectual property

______________________ ______________________ (IP) includes inventions, inventive ideas, designs, patents and patent applications, discoveries, improvements, trademarks, designs and design rights, written work, and know-how devised, developed, or written by an individual or set of individuals.;

Unnormalized Form

______________________ ______________________ (UNF) is a table that contains one or more repeating groups.;

Similar time sequence discovery

______________________ ______________________ ______________________ ______________________ of link analysis is used, for example, in the discovery of links between two sets of data that are time-dependent, and is based on the degree of similarity between the patterns that both time series demonstrate.;

Sequential pattern discovery

______________________ ______________________ ______________________ of link analysis finds patterns between events such that the presence of one set of items is followed by another set of items in a database of events over a period of time.

Link analysis

______________________ ______________________ aims to establish links, called associations, between the individual records, or sets of records, in a database.

requirements collection and analysis

______________________ ______________________ and ______________________ is the process of collecting and analyzing information about the part of the organization that is to be supported by the database system, and using this information to identify the requirements for the new system. There are three main approaches to managing the requirements for a database system that has multiple user views: the centralized approach, the view integration approach, and a combination of both approaches.;

predictive modeling

______________________ ______________________ can be used to analyze an existing database to determine some essential characteristics (model) about the data set.

Cost estimation

______________________ ______________________ depends on statistical information held in the system catalog. Typical statistics include the cardinality of each base relation, the number of blocks required to store a relation, the number of distinct values for each attribute, the selection cardinality of each attribute, and the number of levels in each multilevel index.;

operational maintenance

______________________ ______________________ is the process of monitoring and maintaining the system following installation.;

database segmentation

______________________ ______________________ partitions a database into an unknown number of segments, or clusters, of similar records.

client-server

______________________ architecture refers to the way in which software components interact. There is a client process that requires some resource, and a server that provides the resource. In the two-tier model, the client handles the user interface and business processing logic and the server handles the database functionality. In the Web environment, the traditional two-tier model has been replaced by a three-tier model, consisting of a user interface layer (the client), a business logic and data processing layer (the application server), and a DBMS (the database server), distributed over different machines.;

subprograms

______________________ are named PL/SQL blocks that can take parameters and be invoked. PL/SQL has two types of subprograms called (stored) procedures and functions. Procedures and functions can take a set of parameters given to them by the calling program and perform a set of actions. Both can modify and return data passed to them as a parameter.

Relations

______________________ are physically represented as tables, with the rows corresponding to individual tuples and the columns to attributes.;

Checkpoints

______________________ are used to improve database recovery. At a checkpoint, all modified buffer blocks, all log records, and a checkpoint record identifying all active transactions are written to disk. If a failure occurs, the checkpoint record identifies which transactions need to be redone.;

Cloud

______________________ computing is the use of computing software or hardware resources that are delivered over a network and accessed typically from a Web browser or mobile application.;

Concurrency

______________________ control is the process of managing simultaneous operations on the database without having them interfere with one another. Database recovery is the process of restoring the database to a correct state after a failure. Both protect the database from inconsistencies and data loss.;

Internal

______________________ controls are a set of measures that an organization adopts to ensure that policies and procedures are not violated, data is properly secured and reliable, and operations can be carried out efficiently.;

Semistructured

______________________ data is data that has some structure, but the structure may not be rigid, regular, or complete and generally the data does not conform to a fixed schema. Sometimes the term schema-less or self-describing is used to describe such data.;

object-based

______________________ data models include the Entity-Relationship, semantic, functional, and object-oriented models. Record-based data models include the relational, network, and hierarchical models.;

logical

______________________ database design is the process of constructing a model of the data used in an enterprise based on a specific data model (such as the relational model), but independent of a particular DBMS and other physical considerations. Logical database design translates the conceptual data model into a logical data model of the enterprise.;

logical

______________________ database design is the process of constructing a model of the data used in an enterprise based on a specific data model, but independent of a particular DBMS and other physical considerations.;

conceptual

______________________ database design is the process of constructing a model of the data used in an enterprise, independent of all physical considerations.;

physical

______________________ database design is the process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.;

Distributed

______________________ deadlock involves merging local wait-for graphs together to check for cycles. If a cycle is detected, one or more transactions must be aborted and restarted until the cycle is broken. There are three common methods for handling deadlock detection in distributed DBMSs: centralized, hierarchical, and distributed deadlock detection.;
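Merging local wait-for graphs and checking for a cycle can be sketched as follows. The graph representation (a dict mapping each transaction to the set of transactions it waits for) and both function names are assumptions for the example; it corresponds most closely to the centralized approach.

```python
def merge_wait_for_graphs(*local_graphs):
    """Union the edges of the local wait-for graphs into one global graph."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged


def has_cycle(graph):
    """Depth-first search; meeting a node that is still being explored means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY or (state == WHITE and visit(nxt)):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in graph)


# No single site sees a cycle, but the merged graph does: T1 -> T2 -> T3 -> T1.
site1 = {"T1": {"T2"}}
site2 = {"T2": {"T3"}}
site3 = {"T3": {"T1"}}
global_graph = merge_wait_for_graphs(site1, site2, site3)
```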

CORBA

______________________ defines the architecture of ORB-based environments. This architecture is the basis of any OMG component, defining the parts that form the ORB and its associated structures. Using GIOP or IIOP, a CORBA-based program can interoperate with another CORBA-based program across a variety of vendors, platforms, operating systems, programming languages, and networks. Some of the elements of CORBA are an implementation-neutral Interface Definition Language (IDL), a type model, an Interface Repository, methods for getting the interfaces and specifications of objects, and methods for transforming OIDs to and from strings.;

cardinality

______________________ describes the maximum number of possible relationship occurrences for an entity participating in a given relationship type.;

Deviation

______________________ detection of link analysis is often a source of true discovery because it identifies outliers, which express deviation from some previously known expectation and norm. This operation can be performed using statistics and visualization techniques or as a by-product of data mining.;

participation

______________________ determines whether all or only some entity occurrences participate in a given relationship.;

secondary

______________________ indexes provide a mechanism for specifying an additional key for a base relation that can be used to retrieve data more efficiently. However, there is an overhead involved in the maintenance and use of secondary indexes that has to be balanced against the performance improvement gained when retrieving data.;

Background

______________________ intellectual property is IP that already exists before an activity takes place.;

Foreground

______________________ intellectual property is IP that is generated during an activity.;

prototyping

______________________ involves building a working model of the database system, which allows the designers or users to visualize and evaluate the system.;

Service-Oriented Architecture (SOA)

______________________ is a business-centric software architecture for building applications that implement business processes as sets of services published at a granularity relevant to the service consumer.;

Cloud computing

______________________ is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. The three main service models are: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). Cloud-based database solutions fall into two basic categories: Data as a Service (DaaS) and Database as a Service (DBaaS).;

Java

______________________ is a simple, object-oriented, distributed, interpreted, robust, secure, architecture-neutral, portable, high-performance, multithreaded, and dynamic language from Sun Microsystems. Java applications are compiled into bytecodes, which are interpreted and executed by the Java Virtual Machine. Java can be connected to an ODBC-compliant DBMS through, among other mechanisms, JDBC or SQLJ, Container-Managed Persistence (CMP), Java Data Objects (JDO), or Java Persistence API (JPA).;

composition

______________________ is a specific form of aggregation that represents an association between entities, where there is a strong ownership and coincidental lifetime between the "whole" and the "part.";

normalization

______________________ is a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. Normalization is a formal method that can be used to identify relations based on their keys and the functional dependencies among their attributes.;

middleware

______________________ is computer software that connects software components or applications. Middleware types include RPC (synchronous and asynchronous), publish/subscribe, message-oriented middleware (MOM), object-request brokers (ORB), and database middleware.;

encryption

______________________ is the encoding of the data by a special algorithm that renders the data unreadable by any program without the decryption key.;

multiplicity

______________________ is the number (or range) of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship.;

implementation

______________________ is the physical realization of the database and application designs.;

conceptual

______________________ is the process of constructing a detailed architecture for a database that is independent of implementation details, such as the target DBMS, application programs, programming languages, or any other physical considerations. The design of the conceptual schema is critical to the overall success of the system. It is worth spending the time and effort necessary to produce the best possible conceptual design.;

Replication

______________________ is the process of generating and reproducing multiple copies of data at one or more sites. It is an important mechanism, because it enables organizations to provide users with access to current data where and when they need it.;

specialization

______________________ is the process of maximizing the differences between members of an entity by identifying their distinguishing features.;

generalization

______________________ is the process of minimizing the differences between entities by identifying their common features.;

testing

______________________ is the process of running the database system with the intent of finding errors.;

Snapshot

______________________ isolation has been shown to be a good fit for replication techniques, and group communication protocols ensure the delivery of messages in a total order.;

dimensionality

______________________ modeling is a design technique that aims to present the data in a standard, intuitive form that allows for high-performance access.;

Properties

______________________ of database relations are: each cell contains exactly one atomic value, attribute names are distinct, attribute values come from the same domain, attribute order is immaterial, tuple order is immaterial, and there are no duplicate tuples.;

Local

______________________ optimization is performed at each site involved in the query.;

Global

______________________ optimization takes account of statistical information to find a near-optimal execution plan.

Orthogonal

______________________ persistence is based on three fundamental principles: persistence independence, data type orthogonality, and transitive persistence.;

Eager

______________________ replication is the immediate updating of the replicated target data following an update to the source data. This is achieved typically using the 2PC (two-phase commit) protocol. Lazy replication is when the replicated target database is updated at some time after the update to the source database. The delay in regaining consistency between the source and target database may range from a few seconds to several hours or even days. However, the data eventually synchronizes to the same value at all sites.;

aggregation

______________________ represents a "has-a" or "is-part-of" relationship between entity types, where one represents the "whole" and the other the "part.";

inference

______________________ rules can be used to identify the set of all functional dependencies associated with a relation. This set of dependencies can be very large for a given relation.;
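In practice, the full set of dependencies is rarely enumerated; a common shortcut is to compute the closure of a set of attributes under the given dependencies. The sketch below is one such illustration, with `closure` and the sample dependencies A → B and B → C chosen for the example.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left-hand side is already determined, so is the right-hand side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
closure_of_a = closure({"A"}, fds)
closure_of_b = closure({"B"}, fds)
```

That C appears in the closure of A reflects the transitivity rule: A → B and B → C together give A → C.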

star, denormalized

______________________ schema is a dimensional data model that has a fact table in the center, surrounded by ______________________ dimension tables.;

Persistence

______________________ schemes include checkpointing, serialization, explicit paging, and orthogonal persistence.

Communication

______________________ takes place over a network, which may be a local area network (LAN) or a wide area network (WAN). LANs are intended for short distances and provide faster communication than WANs. A special case of the WAN is a metropolitan area network (MAN), which generally covers a city or suburb.;

Classical

______________________ transaction models may not be appropriate for a mobile environment. Disconnection is a major problem, particularly when transactions are long-lived and there are a large number of disconnections. Frequent disconnections make reliability a primary requirement for transaction processing in a mobile environment. Further, as mobile hosts can move from one cell to another, a mobile transaction can hop through a collection of visited sites.;

Concurrency

______________________ control is needed when multiple users are allowed to access the database simultaneously. Without it, problems of lost update, uncommitted dependency, and inconsistent analysis can arise. Serial execution means executing one transaction at a time, with no interleaving of operations. A schedule shows the sequence of the operations of transactions. A schedule is serializable if it produces the same results as some serial schedule.;
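The lost update problem and the "same results as some serial schedule" test can be illustrated with a toy schedule executor. The step format `(txn, op, amount)` and the function `run` are assumptions for this sketch; it models a single data item only.

```python
def run(schedule, balance):
    """Execute a schedule of (txn, op, amount) steps against one data item."""
    local = {}  # each transaction's private copy of the value it read
    for txn, op, amount in schedule:
        if op == "read":
            local[txn] = balance
        elif op == "add":
            local[txn] += amount
        elif op == "write":
            balance = local[txn]
    return balance


# T1 deposits 100, T2 withdraws 10.
serial = [
    ("T1", "read", 0), ("T1", "add", 100), ("T1", "write", 0),
    ("T2", "read", 0), ("T2", "add", -10), ("T2", "write", 0),
]
interleaved = [
    ("T1", "read", 0), ("T2", "read", 0),
    ("T1", "add", 100), ("T2", "add", -10),
    ("T1", "write", 0), ("T2", "write", 0),
]

serial_result = run(serial, 100)
interleaved_result = run(interleaved, 100)
```

Starting from 100, either serial order ends at 190, while the interleaved schedule ends at 90 because T2 overwrites T1's deposit. Since no serial schedule produces 90, the interleaved schedule is not serializable.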

authorization, authentication

______________________ is the granting of a right or privilege that enables a subject to have legitimate access to a system or a system's object. ______________________ is a mechanism that determines whether a user is who he or she claims to be.;

Deadlock

______________________ occurs when two or more transactions are waiting to access data the other transaction has locked. The only way to break deadlock once it has occurred is to abort one or more of the transactions.;

Language

There is the Microsoft .NET Framework, consisting of the Common ______________________ Runtime (CLR) and the .NET Framework Class Library.;

attributes

Vertical fragments are subsets of ______________________.
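A small sketch of vertical fragmentation follows. The `staff` relation and fragment names are invented for the example, and it assumes the usual practice of repeating the primary key in every fragment so the original relation can be rebuilt by a join.

```python
staff = [
    {"staff_no": "S1", "name": "Ann", "salary": 30000},
    {"staff_no": "S2", "name": "Bob", "salary": 24000},
]


def vertical_fragment(relation, attrs):
    """Project the relation onto a subset of its attributes."""
    return [{a: row[a] for a in attrs} for row in relation]


# Each fragment keeps the primary key (staff_no) plus some of the other attributes.
frag_payroll = vertical_fragment(staff, ["staff_no", "salary"])
frag_directory = vertical_fragment(staff, ["staff_no", "name"])

# Reconstruction: join the fragments on the primary key.
by_key = {row["staff_no"]: row for row in frag_directory}
reconstructed = [{**by_key[row["staff_no"]], **row} for row in frag_payroll]
```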

