DB Ch10-14

The process of transforming data from a detailed to a summary level is called:

aggregating.
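
As a minimal sketch of aggregating, the Python below rolls detail-level sales rows up to one summary total per region. The field names `region` and `amount` and the sample rows are made up for illustration.

```python
from collections import defaultdict

def aggregate_sales(detail_rows):
    """Roll detail-level rows up to one total per region (summary level)."""
    totals = defaultdict(float)
    for row in detail_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

# Hypothetical detail-level data: one row per sale.
detail = [
    {"region": "East", "amount": 120.0},
    {"region": "West", "amount": 80.0},
    {"region": "East", "amount": 30.0},
]
summary = aggregate_sales(detail)
```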

A distributed database must:

all of the above

A model of a real world application and associated properties are created in the:

analysis phase

Replication should be used when:

no or few triggers

An organization using HDFS realizes that hardware failure is a(n):

norm

According to your text, NoSQL stands for:

not only sql

A graph of instances that are compatible within a class diagram is called a(n):

object diagram

Data replication allowing for each transition to proceed without coordination is called:

node decoupling

NoSQL focuses on:

flexibility

An organization should have one data warehouse administrator for every:

100 gigabytes of data in the warehouse

Hive is a(n) ________ data warehouse software.

Apache

Which of the following are key steps in a data quality program?

Apply TQM principles and practices

A trigger can be used as a security measure in which of the following ways?

Cause handling procedures to be executed

Which of the following functions model business rules?

DB analysis

The best place to improve data entry across all applications is:

In the database definitions

Which type of index is commonly used in data warehousing environments?

Join and bit-mapped index

________ tools commonly load data into intermediate hypercube structures.

MOLAP

The methods to ensure the quality of data across various subject areas are called:

Master Data Management

An organization that decides to adopt the most popular NoSQL database management system would select:

MongoDB

An organization that requires a graph database that is highly scalable would select the ________ database management system.

Neo4j

TQM stands for:

Total Quality Management.

Which type of operation has side effects?

Update

A design goal for distributed databases that states that although a distributed database runs many transactions, it appears that a given transaction is the only one in the system is called:

concurrency transparency

All of the following are tasks of data cleansing EXCEPT:

creating foreign keys.

Snapshot replication is most appropriate for:

data warehouse

In the ________ approach, one consolidated record is maintained, and all applications draw on that one actual "golden" record.

persistent

Loading data into a data warehouse does NOT involve:

formatting the hard drive.

An audit trail of database changes is kept by a:

journalizing facility

Which of the following factors in deciding on database distribution strategies is related to autonomy of organizational units?

organizational forces

NoSQL systems enable automated ________ to allow distribution of the data among multiple nodes to allow servers to operate independently on the data located on it.

sharding
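
One common way to picture sharding is hash-based placement: each key is hashed and assigned to one of a fixed number of nodes. The sketch below assumes that scheme; the function names `shard_for` and `distribute` are hypothetical, and real NoSQL systems use their own partitioning logic.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a key to a shard number using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def distribute(keys, num_shards):
    """Group keys by the shard each one hashes to."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards

# Hypothetical customer keys spread over three nodes.
placement = distribute(["cust-1", "cust-2", "cust-3", "cust-4"], 3)
```

Because the hash is deterministic, the same key always lands on the same shard, so each server can work independently on the data it holds.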

External data sources present problems for data quality because:

there is a lack of control over data quality.

Quality data can be defined as being:

unique.

________ includes NoSQL accommodation of various data types.

variety

Although volume, variety, and velocity are considered the initial three v dimensions, two additional Vs of big data were added and include:

veracity and value

The three 'v's commonly associated with big data include:

volume, variety, and velocity.

Apache Cassandra is a leading producer of ________ NoSQL database management systems.

wide-column

The NoSQL model that incorporates 'column families' is called a:

wide-column store

Which of the following functions develop integrity controls?

DB design

Which of the following functions do cost/benefit models?

DB planning

Including data capture controls (e.g., dropdown lists) helps reduce ________ deteriorated data problems.

Data entry

________ duplicates data across databases.

Data propagation

All of the following are popular architectures for Master Data Management EXCEPT

Normalization

The W3C standard for Web privacy is called:

Platform for privacy preferences

An organization that requires a sole focus on performance with the ability for keys to include strings, hashes, lists, and sorted sets would select ________ database management system.

Redis

________ is the most popular key-value store NoSQL database management system.

Redis

Data may be loaded from the staging area into the warehouse by following:

SQL Commands (Insert/Update).

Research shows that if an online customer does not get the service he or she expects within a few ________, the customer will switch to a competitor.

Seconds

________ are examples of Business Intelligence and Analytics 3.0 because they have millions of observations per second.

Smartphones

Which of the following is a principal type of authorization table?

Subject

Which of the following is a basic method for single field transformation?

Table lookup
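
A table lookup transforms one source field by finding its value in a separate table. The sketch below uses a small in-memory dictionary as the lookup table; the state names, codes, and the function name `transform_state` are made up for illustration.

```python
# Hypothetical lookup table: full state name -> two-letter code.
STATE_CODES = {"California": "CA", "Hawaii": "HI", "Texas": "TX"}

def transform_state(source_value, lookup=STATE_CODES, default=None):
    """Transform a single field by looking its cleaned value up in a table."""
    cleaned = source_value.strip().title()
    return lookup.get(cleaned, default)

code = transform_state(" hawaii ")
missing = transform_state("Narnia")
```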

Which of the following is true of distributed databases?

better local control

One way to improve the data capture process is to

check entered data immediately for quality against data in the database

A diagram that shows the static structure of an object-oriented model is called a(n):

class diagram

First degree or complete price discrimination relates to:

the company charges each customer the maximum that customer is willing to pay

A part object which belongs to only one whole object and which lives and dies with the whole object is called a:

composition

A characteristic of reconciled data that means the data reflect an enterprise-wide view is

comprehensive

Data quality problems can cascade when:

data are copied from legacy systems.

Conformance means that

data are stored, exchanged or presented in a format that is specified by its metadata.

Which of the following is true about horizontal partitioning?

data can be stored to optimize

The Hadoop Distributed File System (HDFS) is the foundation of a ________ infrastructure of Hadoop.

data management

All of the following are ways to consolidate data EXCEPT:

data rollup and integration

A technique using pattern recognition to upgrade the quality of raw data is called:

data scrubbing

A technique using artificial intelligence to upgrade the quality of raw data is called

data scrubbing.

With HDFS it is less expensive to move the execution of computation to data than to move the:

data to computation

Which of the following is NOT a component of a repository system architecture?

data transformation

Converting data from the format of its source to the format of its destination is called:

data transformation.

The role of a ________ emphasizes integration and coordination of metadata across many data sources.

data warehouse administrator

________ is an application that can effectively employ snapshot replication in a distributed environment.

data warehousing

The oldest form of analytics is:

descriptive

When online analytical processing (OLAP) studies last year's sales, this represents:

descriptive (past)

Allowing users to dive deeper into the view of data with online analytical processing (OLAP) is an important part of:

descriptive analytics

A researcher trying to explain why sales of garden supplies in Hawaii have decreased would be an example of ________ data mining.

explanatory

The goal of data mining related to analyzing data for unexpected relationships is:

exploratory

Datatype conflicts are an example of a(n) ________ reason for deteriorated data quality

external data source

Getting poor data from a supplier is a(n) ________ reason for deteriorated data quality

external data source.

Which of the following is true of data replication?

fast response, node decoupling, additional storage

The step in which a distributed database decides the order in which to execute the distributed query is called:

global optimization

The NoSQL model that is specifically designed to maintain information regarding the relationships (often real-world instances of entities) between data items is called a:

graph-oriented database.

Data governance can be defined as:

high-level organizational groups and processes that oversee data stewardship

Data that are accurate, consistent, and available in a timely fashion are considered:

high-quality.

A method of capturing only the changes that have occurred in the source data since the last capture is called ________ extract.

incremental
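
A minimal sketch of an incremental extract, assuming each source row carries an `updated_at` timestamp (an assumption; sources may instead expose change logs or flags). Only rows changed after the last capture are returned.

```python
from datetime import datetime

def incremental_extract(source_rows, last_capture):
    """Return only the rows changed since the last capture time."""
    return [row for row in source_rows if row["updated_at"] > last_capture]

# Hypothetical source data with a last-modified timestamp per row.
source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
changed = incremental_extract(source, last_capture=datetime(2024, 1, 4))
```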

The process of combining data from various sources into a single table or view is called:

joining.

The NoSQL model that includes a simple pair of a key and an associated collection of values is called a:

key-value store

An optimization strategy that allows sites that can update to proceed and other sites to catch up is called:

lazy commit

Informational and operational data differ in all of the following ways EXCEPT:

level of detail.

Big data requires effectively processing:

many data types

The Hadoop framework consists of the ________ algorithm to solve large scale problems.

mapreduce
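
The MapReduce idea can be sketched in plain Python as a word count: a map step emits (word, 1) pairs, a shuffle step groups the pairs by word, and a reduce step sums each group. This runs in a single process and only illustrates the pattern; it is not the Hadoop API, and the function names are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values for each key to get the final counts."""
    return {key: sum(values) for key, values in grouped.items()}

documents = ["big data big value", "data variety"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
```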

Object-oriented model objects differ from E-R models because:

objects vs relations

The process of replacing a method inherited from a superclass by a more specific implementation of the method in a subclass is called:

overriding
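
A short sketch of overriding, using made-up classes: `SalesEmployee` inherits `pay` from `Employee` and replaces it with a more specific implementation.

```python
class Employee:
    def __init__(self, name, base_pay):
        self.name = name
        self.base_pay = base_pay

    def pay(self):
        """General implementation inherited by subclasses."""
        return self.base_pay


class SalesEmployee(Employee):
    def __init__(self, name, base_pay, commission):
        super().__init__(name, base_pay)
        self.commission = commission

    def pay(self):
        """Overrides Employee.pay with a more specific calculation."""
        return self.base_pay + self.commission


# Hypothetical staff list; the same call runs a different method per class.
staff = [Employee("Ana", 1000), SalesEmployee("Ben", 1000, 250)]
payroll = [person.pay() for person in staff]
```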

In the ________ approach, one consolidated record is maintained from which all applications draw data.

persistent

________ is arguably the most common concern by individuals regarding big data analytics.

personal privacy

________ means that the same operation can apply to two or more classes in different ways.

polymorphism

Application of statistical and computational methods to predict data events is:

predictive analytics.

Descriptive, predictive, and ________ are the three main types of analytics.

prescriptive

When an organization must decide on optimization and simulation tools to make things happen it is using:

prescriptive analytics

Regarding big data value, the primary focus is on:

privacy

Data federation is a technique which

provides a virtual view of integrated data without actually creating one centralized database

Event-driven propagation:

pushes data to duplicate sites as an event occurs.

The major advantage of data propagation is:

real-time cascading of data changes throughout the organization.

An approach to filling a data warehouse that employs bulk rewriting of the target data periodically is called

refresh mode

A design goal for distributed databases to allow programmers to treat a data item replicated at several sites as though it were at one site is called:

replication transparency

Data quality ROI stands for:

return on investment

NoSQL systems allow ________ by incorporating commodity servers that can be easily added to the architectural solution.

scaling out

Determining how data are organized for reporting and analysis at the time the data are used is called:

schema on read.

It is true that in an HDFS cluster the NameNode is the:

single master server

It is true that in an HDFS cluster the DataNodes are the:

slaves

One simple task of a data quality audit is to:

statistically profile all files

The end of an association where it connects to a class is called a(n):

association role

Security measures for dynamic Web pages are different from static HTML pages because:

the connection requires full access to the database for dynamic pages

One characteristic of quality data which pertains to the expectation for the time between when data are expected and when they are available for use is:

timeliness

________ generally processes the largest quantities of data.

transaction processing

User interaction integration is achieved by creating fewer ________ that feed different systems

user interfaces

