Databases Comprehensive

The SDLC phase in which database processing programs are created is the ________ phase.

Implementation

A method of capturing only the changes that have occurred in the source data since the last capture is called ________ extract.

Incremental
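
A minimal sketch of an incremental extract, assuming every source row carries a last-modified timestamp. The `source_rows` data and the `incremental_extract` helper are hypothetical names used only for illustration.

```python
from datetime import datetime

# Hypothetical source rows; the "modified" timestamp column is an assumption.
source_rows = [
    {"id": 1, "name": "Ann", "modified": datetime(2024, 1, 5)},
    {"id": 2, "name": "Bob", "modified": datetime(2024, 2, 10)},
    {"id": 3, "name": "Cy", "modified": datetime(2024, 3, 1)},
]

def incremental_extract(rows, last_capture):
    """Return only the rows that changed after the previous capture."""
    return [row for row in rows if row["modified"] > last_capture]

# Only rows modified since 1 February 2024 are captured.
changed = incremental_extract(source_rows, datetime(2024, 2, 1))
changed_ids = [row["id"] for row in changed]
```

A static extract, by contrast, would copy all three rows regardless of when they changed.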

In the figure below, which attribute is derived?

Years_Employed

Operational and informational systems are generally separated because of which of the following factors?

A data warehouse centralizes data that are scattered throughout disparate operational systems and makes them readily available for decision support applications.

What results would the following SQL statement produce? select owner, table_name from dba_tables where table_name = 'CUSTOMER';

A listing of the owner of the customer table

An entity cluster is:

A set of one or more entity types and associated relationships grouped into a single abstract entity type

Which of the following is NOT a component of a repository system architecture?

An informational model

The SDLC phase in which the detailed conceptual data model is created is the ________ phase.

Analysis

A property or characteristic of an entity type that is of interest to the organization is called a(n):

Attribute

Which of the following data-mining techniques identifies clusters of observations with similar characteristics?

Clustering and signal processing

________ is/are a new technology which trade(s) off storage space savings for computing time.

Column databases

A(n) ________ constraint is a type of constraint that addresses whether an instance of a supertype must also be an instance of at least one subtype.

Completeness

An attribute that uniquely identifies an entity and consists of a composite attribute is called a(n):

Composite identifier

A primary key that consists of more than one attribute is called a:

Composite key

Which of the following is true of data visualization?

Correlations and clusters in data can be easily identified

A technique using pattern recognition to upgrade the quality of raw data is called:

Data scrubbing

With HDFS it is less expensive to move the execution of computation to data than to move the:

Data to computation

The role of a ________ emphasizes integration and coordination of metadata across many data sources.

Data warehouse administrator

A data mart is a(n):

Data warehouse that is limited in scope

________ is a technical function responsible for database design, security, and disaster recovery.

Database administration

Which of the following is software used to create, maintain, and provide controlled access to databases?

Database management system (DBMS)

The value a field will assume unless the user enters an explicit value for an instance of that field is called a:

Default value
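
A small sketch of a default value in practice, using Python's built-in sqlite3 module. The `Customer_T` columns and the default of `'CA'` are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CustomerState falls back to 'CA' when no explicit value is entered.
conn.execute(
    "CREATE TABLE Customer_T ("
    " CustomerID INTEGER PRIMARY KEY,"
    " CustomerName TEXT,"
    " CustomerState TEXT DEFAULT 'CA')"
)
# No state supplied: the default applies.
conn.execute("INSERT INTO Customer_T (CustomerID, CustomerName) VALUES (1, 'Ann')")
# Explicit state supplied: the default is ignored.
conn.execute("INSERT INTO Customer_T VALUES (2, 'Bob', 'NY')")

states = conn.execute(
    "SELECT CustomerID, CustomerState FROM Customer_T ORDER BY CustomerID"
).fetchall()
```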

Allowing users to dive deeper into the view of data with online analytical processing (OLAP) is an important part of:

Descriptive analytics

The oldest form of analytics is:

Descriptive analytics

A star schema contains both fact and ________ tables.

Dimension

The ________ rule specifies that an entity can be a member of only one subtype at a time.

Disjoint

A ________ addresses whether an instance of a supertype may simultaneously be a member of two or more subtypes.

Disjointedness constraint

________ is the most popular RDBMS data model notation.

ERD

In order to embed SQL inside of another language, the ________ statement must be placed before the SQL in the host language.

EXEC SQL

A database action that results from a transaction is called a(n):

Event

The goal of data mining related to analyzing data for unexpected relationships is:

Exploratory

A contiguous section of disk storage space is called a(n):

Extent

Datatype conflicts is an example of a(n) ________ reason for deteriorated data quality.

External data source

A disadvantage of partitioning is:

Extra space and update time

A file organization is a named portion of primary memory.

False

A join index is a combination of two or more indexes.

False

A synonym is an attribute that may have more than one meaning.

False

A ternary relationship is equivalent to three binary relationships.

False

A transversal dependency is a functional dependency between two or more nonkey attributes.

False

An extent is a named portion of secondary memory allocated for the purpose of storing physical records.

False

Horizontal partitioning refers to the process of combining several smaller relations into a larger table.

False

In the figure below, Name would be an ideal identifier.

False

In the figure shown below, a person has to be married.

False

In the figure shown below, a rental unit can be both a house and an apartment.

False

It is desirable that no two attributes across all entity types have the same name.

False

Master data management is the disciplines, technologies and methods to ensure the currency, meaning and quality of data within one subject area.

False

Most data outages in organizations are caused by hardware failures.

False

Operational metadata are derived from the enterprise data model.

False

Quality data are not essential for well-run organizations.

False

Reduced uptime is a disadvantage of partitioning.

False

Specifying the attribute names in the SELECT statement will make it easier to find errors in queries and also correct for problems that may occur in the base system.

False

Subqueries can only be used in the WHERE clause.

False

The degree of a relationship is the number of attributes that are associated with it.

False

The following figure is an example of total specialization.

False

The following queries produce the same results. Select DISTINCT customer_name, customer_city from customer, salesman where customer.salesman_id = salesman.salesman_id and salesman.lname = 'SMITH'; select customer_name, customer_city from customer where customer.salesman_id = (select salesman_id from salesman where lname = 'SMITH');

False
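
One way the two queries can differ, sketched with Python's built-in sqlite3 module: the join version applies DISTINCT, the subquery version does not, so duplicate name/city pairs survive in the second result. The table layout and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesman (salesman_id INTEGER PRIMARY KEY, lname TEXT);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                       customer_city TEXT, salesman_id INTEGER);
INSERT INTO salesman VALUES (1, 'SMITH');
-- Two distinct customers that share a name and a city.
INSERT INTO customer VALUES (10, 'Acme', 'Tampa', 1);
INSERT INTO customer VALUES (11, 'Acme', 'Tampa', 1);
""")

# Join with DISTINCT: the duplicate name/city pair collapses to one row.
join_rows = conn.execute("""
    SELECT DISTINCT customer_name, customer_city
    FROM customer, salesman
    WHERE customer.salesman_id = salesman.salesman_id
      AND salesman.lname = 'SMITH'
""").fetchall()

# Subquery without DISTINCT: both rows are returned.
subquery_rows = conn.execute("""
    SELECT customer_name, customer_city
    FROM customer
    WHERE customer.salesman_id =
          (SELECT salesman_id FROM salesman WHERE lname = 'SMITH')
""").fetchall()
```

The subquery form can also behave differently when more than one salesman is named SMITH, because `=` expects a single value; how that case is handled depends on the DBMS.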

The internal schema consists of the physical schema and the enterprise data model.

False

Older systems that often contain data of poor quality are called ________ systems.

Legacy

A database is an organized collection of ________ related data.

Logically

________ tools commonly load data into intermediate hypercube structures.

MOLAP

Which of the following is NOT true of poor data and/or database administration?

Maintaining a secure server

A student can attend five classes, each with a different professor. Each professor has 30 students. The relationship of students to professors is a ________ relationship.

Many-to-many

When an organization must decide on optimization and simulation tools to make things happen it is using:

Prescriptive analytics

An attribute (or attributes) that uniquely identifies each row in a relation is called a:

Primary Key

A two-dimensional table of data sometimes is called a:

Relation

________ are established between entities in a well-structured database so that the desired information can be retrieved.

Relationships

Packaged data models:

Require customization

An attribute that must have a value for every entity (or relationship) instance is a(n):

Required Attribute

A centralized knowledge base of all data definitions, data relationships, screen and report formats, and other system components is called a(n):

Repository

The ________ rule specifies that each entity instance of the supertype must be a member of some subtype in the relationship.

Total specialization

A method for handling missing data is to:

Track missing data with special reports

A candidate key is an attribute, or combination of attributes, that uniquely identifies a row in a relation.

True

A data quality audit helps an organization understand the extent and nature of data quality problems.

True

A data warehouse contains summarized and historical information.

True

A good data definition is always accompanied by diagrams, such as the entity-relationship diagram.

True

A hashing algorithm is a routine that converts a primary key value into a relative record number.

True
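
A minimal sketch of a hashing routine, assuming the division-remainder method (one common choice, not the only one). The function name and the file size of 997 slots are hypothetical.

```python
def relative_record_number(primary_key: int, file_size: int = 997) -> int:
    """Map a primary key value to a slot number in the range 0..file_size-1.

    Division-remainder hashing: the remainder after dividing the key by
    the number of available slots (a prime is a typical choice).
    """
    return primary_key % file_size

slot = relative_record_number(12396)   # 12396 = 12 * 997 + 432
other = relative_record_number(1429)   # 1429 = 1 * 997 + 432
```

Both keys land in slot 432, which illustrates why a hashed file also needs a strategy for handling collisions.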

A method of capturing data in a snapshot at a point in time is called static extract.

True

A partial functional dependency is a functional dependency in which one or more nonkey attributes are functionally dependent on part (but not all) of the primary key.

True

A physical schema contains the specifications for how data from a conceptual schema are stored in a computer's secondary memory.

True

A pointer is a field of data that can be used to locate a related field or record of data.

True

A primary key is an attribute that uniquely identifies each row in a relation.

True

A single occurrence of an entity is called an entity instance.

True

A tablespace is a named set of disk storage elements in which physical files for the database tables may be stored.

True

Data scrubbing is a technique using pattern recognition and other artificial intelligence techniques to upgrade the quality of raw data before transforming and moving the data to the data warehouse.

True

Data structures include data organized in the form of tables with rows and columns.

True

ETL is short for Extract, Transform, Load.

True

For the relationship represented in the figure below, a department can have more than one employee.

True

HBASE is a wide-column store database that runs on top of HDFS (modeled after Google).

True

Human intervention is an important part of big data analytics.

True

Improving data capture process is a fundamental step in data quality improvement.

True

In order to find out what customers have not placed an order for a particular item, one might use the NOT qualifier along with the IN qualifier.

True
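
A sketch of the NOT IN pattern using Python's built-in sqlite3 module. The simplified schema (a `ProductID` stored directly on `Order_T`) and the sample rows are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                      ProductID INTEGER);
INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Order_T VALUES (1001, 1, 5), (1002, 2, 7);
""")

# Customers who have NOT placed an order for product 5.
not_ordered = conn.execute("""
    SELECT CustomerName
    FROM Customer_T
    WHERE CustomerID NOT IN
          (SELECT CustomerID FROM Order_T WHERE ProductID = 5)
    ORDER BY CustomerName
""").fetchall()
```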

Informational systems are designed to support decision making based on historical point-in-time and prediction data.

True

One of the biggest challenges of the extraction process is managing changes in the source system.

True

One property of a relation is that each attribute within a relation has a unique name.

True

Packaged data models are as flexible as possible, because all supertype/subtype relationships allow the total specialization and overlap rules.

True

Participation in a relationship may be optional or mandatory.

True

Security is one advantage of partitioning.

True

The UNION clause is used to combine the output from multiple queries into a single result table.

True

The first requirement for building a user-friendly interface is a set of metadata that describes the data in the data mart in business terms that users can easily understand.

True

The need for data warehousing in an organization is driven by its need for an integrated view of high-quality data.

True

The relationship between the instances of two entity types is called a binary relationship.

True

The three major types of analytics are: descriptive, predictive, and prescriptive.

True

________ includes concern about data quality issues.

Veracity

______ partitioning distributes the columns of a table into several separate physical records.

Vertical

The three 'v's commonly associated with big data include:

Volume, variety, velocity

An entity type whose existence depends on another entity type is called a ________ entity.

Weak

A relation that contains minimal redundancy and allows easy use is considered to be:

Well-structured

Data federation is a technique which:

provides a virtual view of integrated data without actually creating one centralized database.

A ________ is a DBMS module that restores the database to a correct condition when a failure occurs.

recovery manager

Relational databases establish the relationships between entities by means of common fields included in a file called a(n):

relation

A DBMS may perform checkpoints automatically or in response to commands in user application programs.

True

A business transaction is a sequence of steps that constitute some well-defined business activity.

True

A join in which the joining condition is based on equality between values in the common column is called an equi-join.

True

An SQL query that implements an outer join will return rows that do not have matching values in common columns.

True
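
A sketch of an outer join keeping unmatched rows, using Python's built-in sqlite3 module. A LEFT OUTER JOIN is used here; the tables and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Order_T VALUES (1001, 1);
""")

# Bob has no orders, yet his row still appears, with NULL for OrderID.
outer_rows = conn.execute("""
    SELECT Customer_T.CustomerID, OrderID
    FROM Customer_T LEFT OUTER JOIN Order_T
         ON Customer_T.CustomerID = Order_T.CustomerID
    ORDER BY Customer_T.CustomerID
""").fetchall()
```

sqlite3 reports the SQL NULL as Python's `None`.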

An enterprise data warehouse that accepts near-real time feeds of transactional data and immediately transforms and loads the appropriate data is called a real-time data warehouse.

True

SQL statements can be included in another language, such as C or Java.

True

Smartphones can produce millions of observations per second making them Business Intelligence and Analytics 3.0.

True

The external schema contains a subset of the conceptual schema relevant to a particular group of users.

True

An optimistic approach to concurrency control is called:

Versioning

A transaction that terminates abnormally is called a(n) ________ transaction.

aborted

While views promote security by restricting user access to data, they are not adequate security measures because:

an unauthorized person may gain access to a view through experimentation

At a basic level, analytics refers to:

analysis and interpretation of data

One way to improve the data capture process is to:

check entered data immediately for quality against data in the database

The actions that must be taken to ensure data integrity is maintained during multiple simultaneous transactions are called ________ actions.

concurrency control

Organizing the database in computer disk storage is done in the ________ phase.

design

Embedded SQL consists of:

hard-coded SQL statements included in a program written in another language

Data quality is important for all of the following reasons EXCEPT:

it provides a stream of profit

The process of combining data from various sources into a single table or view is called:

joining

Data that describe the properties of other data are:

metadata

Data quality ROI stands for:

risk of incarceration

A(n) ________ is a procedure for acquiring the necessary locks for a transaction where all necessary locks are acquired before any are released.

two-phase lock

Establishing IF-THEN-ELSE logical processing within an SQL statement can be accomplished by:

using the CASE keyword in a statement
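
A sketch of IF-THEN-ELSE logic with CASE, run through Python's built-in sqlite3 module. The `Product_T` table, the price threshold, and the tier labels are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product_T (ProductID INTEGER PRIMARY KEY, Price REAL);
INSERT INTO Product_T VALUES (1, 250.0), (2, 40.0);
""")

# IF Price >= 100 THEN 'Premium' ELSE 'Standard'
tiers = conn.execute("""
    SELECT ProductID,
           CASE WHEN Price >= 100 THEN 'Premium'
                ELSE 'Standard'
           END AS Tier
    FROM Product_T
    ORDER BY ProductID
""").fetchall()
```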

The following figure shows an example of:

A composite attribute

For the relationship represented in the figure below, which of the following is true?

A department can have more than one employee

Which of the following advances in information systems contributed to the emergence of data warehousing?

Advances in middleware products that enabled enterprise database connectivity across heterogeneous platforms.

The SDLC phase in which every data attribute is defined, every category of data is listed and every business relationship between data entities is defined is called the ________ phase.

Analysis

A ________ defines or constrains some aspect of the business.

Business rule

Which of the following factors drive the need for data warehousing?

Businesses need an integrated view of company information

Which of the following criteria should be considered when selecting an identifier?

Choose an identifier that doesn't have large composite attributes

An attribute that can be broken down into smaller parts is called a(n) ________ attribute.

Composite

________ takes a value of TRUE if a subquery returns an intermediate results table which contains one or more rows.

EXISTS
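
A sketch of EXISTS with a correlated subquery, using Python's built-in sqlite3 module. The tables and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Order_T VALUES (1001, 1);
""")

# EXISTS is TRUE for a customer when the subquery returns at least one row.
with_orders = conn.execute("""
    SELECT CustomerName
    FROM Customer_T
    WHERE EXISTS (SELECT * FROM Order_T
                  WHERE Order_T.CustomerID = Customer_T.CustomerID)
""").fetchall()
```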

In the following diagram, which of the answers below is true?

Each patient has one or more patient histories

An advantage of partitioning is:

Efficiency

A primary key whose value is unique across all relations is called a(n):

Enterprise key

A person, place, object, event, or concept about which the organization wishes to maintain data is called a(n):

Entity

The logical representation of an organization's data is called a(n):

Entity-relationship model

A constraint is a rule in a database system that can be violated by users.

False

A data mart is a data warehouse that contains data that can be used across the entire organization.

False

A default value is the value that a field will always assume, regardless of what the user enters for an instance of that field.

False

Creating a data model from a packaged data model requires much more skill than creating one from scratch.

False

Databases are generally the property of a single department within an organization.

False

Denormalization is the process of transforming relations with variable-length fields into those with fixed-length fields.

False

Generalization is a top-down process.

False

Hadoop is considered a relational database management system.

False

There are two principal types of authorization tables: one for subjects and one for facts.

False

There can be multivalued attributes in a relation.

False

Transient data are never changed.

False

The smallest unit of application data recognized by system software is a:

Field

A(n) ________ is a technique for physically arranging the records of a file on secondary storage devices.

File organization

A(n) ________ is a field of data used to locate a related field or record.

Pointer

An audit trail of database changes is kept by a:

Journalizing facility

The entity integrity rule states that:

No primary key attribute can be null

Which of the following are properties of relations?

No two rows in a relation are identical

According to your text, NoSQL stands for:

Not Only Structured Query Language

In a supertype/subtype hierarchy, each subtype has:

Only one supertype

The most commonly used form of join operation is the:

Natural join

In the figure below, which of the following is a subtype of patient?

Outpatient

The ________ rule states that an entity instance can simultaneously be a member of two (or more) subtypes.

Overlap

Which type of file is most efficient with storage space?

Sequential

In the figure below, which attribute is multivalued?

Skill

________ are examples of Business Intelligence and Analytics 3.0 because they have millions of observations per second.

Smartphones

The process of defining one or more subtypes of a supertype and forming relationships is called:

Specialization

Which of the following is an entity that exists independently of other entity types?

Strong

________ is a tool even non-programmers can use to access information from a database.

Structured Query Language

The characteristic that indicates that a data warehouse is organized around key high-level entities of the enterprise is:

Subject-oriented

A type of query that is placed within a WHERE or HAVING clause of another query is called a:

Subquery

The following code is an example of a: SELECT CustomerName, CustomerAddress, CustomerCity, CustomerState, CustomerPostalCode FROM Customer_T WHERE Customer_T.CustomerID = (SELECT Order_T.CustomerID FROM Order_T WHERE OrderID = 1008);

Subquery

An attribute of the supertype that determines the target subtype(s) is called the:

Subtype discriminator

Which of the following is a generic entity type that has a relationship with one or more subtypes?

Supertype

Which of the following is a basic method for single field transformation?

Table lookup

An open-source DBMS is:

A free source-code RDBMS that provides the functionality of an SQL-compliant DBMS

An alternative name for an attribute is called a(n):

Alias

The following code would include: SELECT Customer_T.CustomerID, CustomerName, OrderID FROM Customer_T RIGHT OUTER JOIN Order_T ON Customer_T.CustomerID = Order_T.CustomerID;

All rows of the Order_T Table regardless of matches with the Customer_T Table

The property by which subtype entities possess the values of all attributes of a supertype is called:

Attribute inheritance

Which of the following is a type of network security?

Authentication of the client workstation

A method to allow adjacent secondary memory space to contain rows from several tables is called:

Clustering

The real-time data warehouse is characterized by which of the following?

Data are immediately transformed and loaded into the warehouse

A repository of information about a database that documents data elements of a database is called a:

Data dictionary

Including data capture controls (i.e., dropdown lists) helps reduce ________ deteriorated data problems.

Data entry

________ is a component of the relational data model included to specify business rules to maintain the integrity of data when they are manipulated.

Data integrity

The Hadoop Distributed File System (HDFS) is the foundation of a ________ infrastructure of Hadoop.

Data management

A graphical system used to capture the nature and relationships among data is called a(n):

Data model

________ duplicates data across databases.

Data propagation

Which of the following functions do cost/benefit models?

Database planning

The process of managing simultaneous operations against a database so that data integrity is maintained is called completeness control.

False

The schema on write and schema on read are considered synonymous approaches.

False

The status of data is the representation of the data after an event has occurred.

False

When all multivalued attributes have been removed from a relation, it is said to be in:

First normal form
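
A minimal sketch of removing a multivalued attribute to reach first normal form. The employee records, the `skills` attribute, and the helper name are hypothetical.

```python
# Unnormalized records: "skills" holds several values per employee.
employees = [
    {"emp_id": 1, "name": "Ann", "skills": ["SQL", "Java"]},
    {"emp_id": 2, "name": "Bob", "skills": ["C"]},
]

def to_first_normal_form(records):
    """Emit one row per (employee, skill) so every cell holds a single value."""
    return [
        (record["emp_id"], record["name"], skill)
        for record in records
        for skill in record["skills"]
    ]

rows_1nf = to_first_normal_form(employees)
```

In the flattened relation, `emp_id` alone no longer identifies a row; the combination of `emp_id` and skill does.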

An attribute in a relation of a database that serves as the primary key of another relation in the same database is called a:

Foreign Key

Loading data into a data warehouse does NOT involve:

Formatting the hard drive

A constraint between two attributes is called a(n):

Functional dependency

The process of defining a more general entity type from a set of more specialized entity types is called:

Generalization

In which type of file is multiple key retrieval not possible?

Hashed

An attribute that may have more than one meaning is called a(n):

Homonym

The Hadoop framework consists of the ________ algorithm to solve large scale problems.

MapReduce

The methods to ensure the quality of data across various subject areas are called:

Master data management

A functional dependency in which one or more nonkey attributes are functionally dependent on part, but not all, of the primary key is called a ________ dependency.

Partial functional

The ________ rule specifies that an entity instance of a supertype is allowed not to belong to any subtype.

Partial specialization

In the figure below, which of the following apply to both OUTPATIENTs and RESIDENT_PATIENTs?

Patient_Name

All of the following are applications for big data and analytics EXCEPT:

Personal finances

________ is arguably the most common concern by individuals regarding big data analytics.

Personal privacy

Descriptive, predictive, and ________ are the three main types of analytics.

Prescriptive

One of the most popular RAD methods is:

Prototyping

Conformed dimensions allow users to do the following:

Query across fact tables with consistency

Data that are detailed, current, and intended to be the single, authoritative source of all decision support applications are called ________ data.

Reconciled

Which of the following is NOT an advantage of database systems?

Reduced program maintenance

While triggers run automatically, ________ do not and have to be called.

Routines

One field or combination of fields for which more than one record may have the same combination of values is called a(n):

Secondary Key

The traditional methodology used to develop, maintain and replace information systems is called the:

Systems Development Life Cycle.

Data is represented in the form of:

Tables

Security measures for dynamic Web pages are different from static HTML pages because:

The connection requires full access to the database for dynamic pages

The following figure shows an example of:

The overlap rule

Subtypes should be used when:

There are attributes that apply to some but not all instances of an entity type.

A trigger can be used as a security measure in which of the following ways?

To cause special handling procedures to be executed

A referential integrity constraint is a rule that maintains consistency among the rows of two relations.

True
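
A sketch of a referential integrity constraint being enforced, using Python's built-in sqlite3 module. SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`; the tables and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE Order_T ("
    " OrderID INTEGER PRIMARY KEY,"
    " CustomerID INTEGER NOT NULL REFERENCES Customer_T(CustomerID))"
)
conn.execute("INSERT INTO Customer_T VALUES (1)")
conn.execute("INSERT INTO Order_T VALUES (1001, 1)")  # matching customer exists

try:
    # Customer 99 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO Order_T VALUES (1002, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Order_T").fetchone()[0]
```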

Triggers have three parts: the event, the condition, and the action.

True
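
A sketch of the three trigger parts in SQLite syntax, run through Python's built-in sqlite3 module. The `LogPriceDrop` trigger, the audit table, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product_T (ProductID INTEGER PRIMARY KEY, Price REAL);
CREATE TABLE PriceAudit_T (ProductID INTEGER, OldPrice REAL, NewPrice REAL);

CREATE TRIGGER LogPriceDrop
AFTER UPDATE OF Price ON Product_T      -- event
WHEN NEW.Price < OLD.Price              -- condition
BEGIN                                   -- action
    INSERT INTO PriceAudit_T VALUES (OLD.ProductID, OLD.Price, NEW.Price);
END;

INSERT INTO Product_T VALUES (1, 100.0);
UPDATE Product_T SET Price = 120.0 WHERE ProductID = 1;  -- increase: no audit row
UPDATE Product_T SET Price = 90.0 WHERE ProductID = 1;   -- drop: audit row written
""")

audit = conn.execute("SELECT * FROM PriceAudit_T").fetchall()
```

The trigger fires automatically on the update; nothing in the application calls it by name.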

When two or more attributes describe the same characteristic of an entity, they are synonyms.

True

The ________ operator is used to combine the output from multiple queries into a single result table.

UNION
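
A sketch of UNION combining two query results into one table, using Python's built-in sqlite3 module. The tables, column names, and cities are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_T (CustomerCity TEXT);
CREATE TABLE Salesman_T (SalesmanCity TEXT);
INSERT INTO Customer_T VALUES ('Tampa'), ('Miami');
INSERT INTO Salesman_T VALUES ('Miami'), ('Ocala');
""")

# Both queries must return the same number of columns.
# Plain UNION also removes the duplicate 'Miami'.
cities = conn.execute("""
    SELECT CustomerCity FROM Customer_T
    UNION
    SELECT SalesmanCity FROM Salesman_T
    ORDER BY 1
""").fetchall()
```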

A generic or template data model that can be reused as a starting point for a data modeling project is called a(n):

Universal data model

Regarding big data value, the primary focus is on:

Usefulness

A(n) ________ is often developed by identifying a form or report that a user needs on a regular basis.

User View

In the figure below, to which of the following entities are the entities "CAR" and "TRUCK" generalized?

Vehicle

A procedure is:

called by name

A join operation:

causes two tables with a common domain to be combined into a single table or view

A logical data mart is a(n):

data mart created by a relational view of a slightly denormalized data warehouse.

A technique using artificial intelligence to upgrade the quality of raw data is called:

data scrubbing

A detailed coding scheme recognized by system software for representing organizational data is called a(n):

data type

The coding or scrambling of data so that humans cannot read them is called:

encryption

The following code is an example of a(n): SELECT Customer_T.CustomerID, Order_T.CustomerID, CustomerName, OrderID FROM Customer_T, Order_T WHERE Customer_T.CustomerID = Order_T.CustomerID;

equi-join
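
The query above can be tried with Python's built-in sqlite3 module. The sample rows are assumptions, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Order_T VALUES (1001, 1), (1002, 1);
""")

# Equality on the common column; both CustomerID columns are kept in the result.
equi_rows = conn.execute("""
    SELECT Customer_T.CustomerID, Order_T.CustomerID, CustomerName, OrderID
    FROM Customer_T, Order_T
    WHERE Customer_T.CustomerID = Order_T.CustomerID
    ORDER BY OrderID
""").fetchall()
```

The repeated `CustomerID` value in each row is what distinguishes an equi-join from a natural join, which keeps only one copy of the common column. Bob has no orders, so he does not appear.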

Data governance can be defined as:

high-level organizational groups and processes that oversee data stewardship

Most data outages in organizations are caused by:

human error

The best place to improve data entry across all applications is:

in the database definitions

The analysis of summarized data to support decision making is called:

informational processing

An operational data store (ODS) is a(n):

integrated, subject-oriented, updateable, current-valued, detailed database designed to serve the decision support needs of operational users

A dependent data mart:

is filled exclusively from the enterprise data warehouse with reconciled data.

Dynamic SQL:

is used to generate appropriate SQL code on the fly as an application is processing
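
A minimal sketch of dynamic SQL, assuming the application assembles the statement at run time from whichever filters a user supplied. The `build_query` helper, the column whitelist, and the sample table are hypothetical.

```python
import sqlite3

# Hypothetical whitelist: column names are spliced into the SQL text,
# so only known names are accepted; values go through ? placeholders.
ALLOWED_COLUMNS = {"CustomerCity", "CustomerState"}

def build_query(filters):
    """Generate a SELECT statement and its parameter list on the fly."""
    unknown = set(filters) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unsupported filter columns: {sorted(unknown)}")
    sql = "SELECT CustomerName FROM Customer_T"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    return sql + " ORDER BY CustomerName", list(filters.values())

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_T (CustomerName TEXT, CustomerCity TEXT, CustomerState TEXT);
INSERT INTO Customer_T VALUES ('Ann', 'Tampa', 'FL'), ('Bob', 'Albany', 'NY');
""")

sql, params = build_query({"CustomerState": "FL"})
rows = conn.execute(sql, params).fetchall()
```

Embedded SQL, by contrast, would have the full statement hard-coded in the program text.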

Big Data includes:

large volumes of data with many different data types that are processed at very high speeds.

The approach in which the organization of data for reporting and analysis is determined when the data is used is called:

schema on read

External data sources present problems for data quality because:

there is a lack of control over data quality

User-defined transactions can improve system performance because:

transactions are processed as sets, reducing system overhead

A relatively small team of people who collaborate on the same project is called a:

workgroup

