CPSC 231 Final Review

Which of the following factors drive the need for data warehousing? Select one: a. Businesses need an integrated view of company information. b. Informational data must be kept together with operational data. c. Reduce virus and Trojan horse threats. d. Data warehouses generally have better security.

a. Businesses need an integrated view of company information.

________ is/are a new technology which trade(s) off storage space savings for computing time. Select one: a. Column databases b. Snowflake schemas c. Dimensional modeling d. Fact tables

a. Column databases

The real-time data warehouse is characterized by which of the following? Select one: a. Data are immediately transformed and loaded into the warehouse. b. It provides periodic access for the transaction processing systems to an enterprise data warehouse. c. It accepts batch feeds of transaction data. d. It is based on Oracle technology.

a. Data are immediately transformed and loaded into the warehouse.

Human intervention is an important part of big data analytics. Select one: a. True b. False

a. True

Improving the data capture process is a fundamental step in data quality improvement. Select one: a. True b. False

a. True

One way to improve the data capture process is to: Select one: a. check entered data immediately for quality against data in the database. b. provide little or no training to data entry operators. c. not use any automatic data entry routines. d. allow all data to be entered manually.

a. check entered data immediately for quality against data in the database.

A technique using artificial intelligence to upgrade the quality of raw data is called: Select one: a. data scrubbing. b. dumping. c. completion backwards updates. d. data reconciliation.

a. data scrubbing.

Datatype conflicts are an example of a(n) ________ reason for deteriorated data quality. Select one: a. external data source. b. data entry problem c. lack of organizational commitment d. inconsistent metadata

a. external data source.

Data governance can be defined as: Select one: a. high-level organizational groups and processes that oversee data stewardship. b. a means to slow down the speed of data. c. a means to increase the speed of data. d. a government task force for defining data quality.

a. high-level organizational groups and processes that oversee data stewardship.

The analysis of summarized data to support decision making is called: Select one: a. informational processing. b. data scrubbing. c. artificial intelligence. d. operational processing.

a. informational processing.

An optimistic approach to concurrency control is called: Select one: a. versioning. b. denormalization. c. HappyControl. d. deadlock resolution.

a. versioning.

An SQL query that implements an outer join will return rows that do not have matching values in common columns. Select one: a. False b. True

b. True
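
To see the outer-join behavior concretely, here is a minimal sketch using Python's built-in sqlite3 module. The two-row sample data is invented for illustration, and a LEFT OUTER JOIN is used because it runs on any SQLite version.

```python
import sqlite3

# In-memory database with made-up sample rows (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Order_T VALUES (1001, 1);
""")

# Bob has no order, yet the outer join still returns his row with a NULL OrderID.
outer_rows = conn.execute("""
    SELECT Customer_T.CustomerID, CustomerName, OrderID
    FROM Customer_T LEFT OUTER JOIN Order_T
      ON Customer_T.CustomerID = Order_T.CustomerID
    ORDER BY Customer_T.CustomerID
""").fetchall()
print(outer_rows)
```

An inner join over the same data would drop the unmatched customer entirely.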

Data scrubbing is a technique using pattern recognition and other artificial intelligence techniques to upgrade the quality of raw data before transforming and moving the data to the data warehouse. Select one: a. False b. True

b. True

HBase is a wide-column store database that runs on top of HDFS (modeled after Google's Bigtable). Select one: a. False b. True

b. True

Informational systems are designed to support decision making based on historical point-in-time and prediction data. Select one: a. False b. True

b. True

One of the biggest challenges of the extraction process is managing changes in the source system. Select one: a. False b. True

b. True

SQL statements can be included in another language, such as C or Java. Select one: a. False b. True

b. True

Smartphones can produce millions of observations per second making them Business Intelligence and Analytics 3.0. Select one: a. False b. True

b. True

The three major types of analytics are: descriptive, predictive, and prescriptive. Select one: a. False b. True

b. True

At a basic level, analytics refers to: Select one: a. conducting a needs analysis. b. analysis and interpretation of data. c. normalizing data. d. collecting data.

b. analysis and interpretation of data.

The actions that must be taken to ensure data integrity is maintained during multiple simultaneous transactions are called ________ actions. Select one: a. transaction authorization b. concurrency control c. logging d. multiple management

b. concurrency control

Including data capture controls (i.e., dropdown lists) helps reduce ________ deteriorated data problems. Select one: a. external data source b. data entry c. lack of organizational commitment d. inconsistent metadata

b. data entry

The Hadoop Distributed File System (HDFS) is the foundation of a ________ infrastructure of Hadoop. Select one: a. Java b. data management c. relational database management system d. DBBMS

b. data management

The role of a ________ emphasizes integration and coordination of metadata across many data sources. Select one: a. data architect b. data warehouse administrator c. data administrator d. database administrator

b. data warehouse administrator

Allowing users to dive deeper into the view of data with online analytical processing (OLAP) is an important part of: Select one: a. prescriptive analytics. b. descriptive analytics. c. predictive analytics. d. comparative analytics.

b. descriptive analytics.

A star schema contains both fact and ________ tables. Select one: a. narrative b. dimension c. cross functional d. starter

b. dimension

A database action that results from a transaction is called a(n): Select one: a. log entry. b. event. c. transition. d. journal happening.

b. event.

Most data outages in organizations are caused by: Select one: a. software failures. b. human error. c. electrical outages. d. hardware failures.

b. human error.

A dependent data mart: Select one: a. participates in a relationship with an entity. b. is filled exclusively from the enterprise data warehouse with reconciled data. c. is filled with data extracted directly from the operational system. d. is dependent upon an operational system.

b. is filled exclusively from the enterprise data warehouse with reconciled data.

External data sources present problems for data quality because: Select one: a. there are poor data capture controls. b. there is a lack of control over data quality. c. data are not always available. d. data are unformatted.

b. there is a lack of control over data quality.

A(n) ________ is a procedure for acquiring the necessary locks for a transaction where all necessary locks are acquired before any are released. Select one: a. authorization rule b. two-phase lock c. exclusive lock d. record controller

b. two-phase lock

What results would the following SQL statement produce? select owner, table_name from dba_tables where table_name = 'CUSTOMER'; Select one: a. An error message b. A listing of all customers in the customer table c. A listing of the owner of the customer table d. A listing of the owner of the customer table as well as customers

c. A listing of the owner of the customer table

In order to embed SQL inside of another language, the ________ statement must be placed before the SQL in the host language. Select one: a. SQL SQL b. GET SQL c. EXEC SQL d. RUN SQL

c. EXEC SQL

According to your text, NoSQL stands for: Select one: a. No Structured Query Language. b. Numeric Only Structured Query Language. c. Not Only Structured Query Language. d. Numbered Structured Query Language.

c. Not Only Structured Query Language.

________ is arguably the most common concern by individuals regarding big data analytics. Select one: a. Saving money b. Processing time c. Personal privacy d. Taking up large amounts of computer storage

c. Personal privacy

Which of the following is a basic method for single field transformation? Select one: a. Cross-linking entities b. Cross-linking attributes c. Table lookup d. Field-to-field communication

c. Table lookup

The ________ operator is used to combine the output from multiple queries into a single result table. Select one: a. INTERSECT b. DIVIDE c. UNION d. COLLATE

c. UNION
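
A small sketch of UNION, run through Python's sqlite3 module. The tables `east` and `west` and their rows are hypothetical, chosen so that one name appears in both.

```python
import sqlite3

# Hypothetical tables; 'Bob' appears in both on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE east (name TEXT);
    CREATE TABLE west (name TEXT);
    INSERT INTO east VALUES ('Ann'), ('Bob');
    INSERT INTO west VALUES ('Bob'), ('Cy');
""")

# UNION merges both result sets into one table and removes duplicate rows.
union_rows = conn.execute(
    "SELECT name FROM east UNION SELECT name FROM west ORDER BY name"
).fetchall()
print(union_rows)
```

UNION ALL would keep the repeated row instead of collapsing it.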

________ includes concern about data quality issues. Select one: a. Vigilant b. Variety c. Veracity d. Velocity

c. Veracity

A repository of information about a database that documents data elements of a database is called a: Select one: a. view. b. schema. c. data dictionary. d. subschema.

c. data dictionary.

A logical data mart is a(n): Select one: a. integrated, subject-oriented, detailed database designed to serve operational users. b. centralized, integrated data warehouse. c. data mart created by a relational view of a slightly denormalized data warehouse. d. data mart consisting of only logical data.

c. data mart created by a relational view of a slightly denormalized data warehouse.

With HDFS it is less expensive to move the execution of computation to data than to move the: Select one: a. data to systems analysis. b. data to processes. c. data to computation. d. data to hardware.

c. data to computation.

The oldest form of analytics is: Select one: a. comparative analytics. b. predictive analytics. c. descriptive analytics. d. prescriptive analytics.

c. descriptive analytics.

The coding or scrambling of data so that humans cannot read them is called: Select one: a. encoding. b. demarcation. c. encryption. d. hiding.

c. encryption.

Embedded SQL consists of: Select one: a. SQL translated to a lower-level language. b. SQL encapsulated inside of other SQL statements. c. hard-coded SQL statements included in a program written in another language. d. SQL written into a front-end application.

c. hard-coded SQL statements included in a program written in another language.

An operational data store (ODS) is a(n): Select one: a. representation of the operational data. b. place to store all unreconciled data. c. integrated, subject-oriented, updateable, current-valued, detailed database designed to serve the decision support needs of operational users. d. small-scale data mart.

c. integrated, subject-oriented, updateable, current-valued, detailed database designed to serve the decision support needs of operational users.

All of the following are applications for big data and analytics EXCEPT: Select one: a. security and public health. b. business. c. personal finances. d. science and technology.

c. personal finances.

When an organization must decide on optimization and simulation tools to make things happen it is using: Select one: a. descriptive analytics. b. comparative analytics. c. prescriptive analytics. d. predictive analytics.

c. prescriptive analytics.

Which of the following is a type of network security? Select one: a. Password naming conventions b. Guidelines for frequency of password changes c. Random password guessing d. Authentication of the client workstation

d. Authentication of the client workstation

A trigger can be used as a security measure in which of the following ways? Select one: a. To conduct a DFD analysis b. To design a database c. To check for viruses d. To cause special handling procedures to be executed

d. To cause special handling procedures to be executed

An open-source DBMS is: Select one: a. an object-oriented database management system. b. source code for a commercial RDBMS. c. a beta release of a commercial RDBMS. d. a free source-code RDBMS that provides the functionality of an SQL-compliant DBMS.

d. a free source-code RDBMS that provides the functionality of an SQL-compliant DBMS.

The following code would include: SELECT Customer_T.CustomerID, CustomerName, OrderID FROM Customer_T RIGHT OUTER JOIN Order_T ON Customer_T.CustomerID = Order_T.CustomerID; Select one: a. only rows that match both Customer_T and Order_T Tables. b. only rows that don't match both Customer_T and Order_T Tables. c. all rows of the Customer_T Table regardless of matches with the Order_T Table. d. all rows of the Order_T Table regardless of matches with the Customer_T Table.

d. all rows of the Order_T Table regardless of matches with the Customer_T Table.

A procedure is: Select one: a. unable to be modified. b. stored outside the database. c. given a reserved SQL name. d. called by name.

d. called by name.

Data federation is a technique which: Select one: a. creates a distributed database. b. provides a real-time update of shared data. c. creates an integrated database from several separate databases. d. provides a virtual view of integrated data without actually creating one centralized database.

d. provides a virtual view of integrated data without actually creating one centralized database.

Data that are detailed, current, and intended to be the single, authoritative source of all decision support applications are called ________ data. Select one: a. subject b. derived c. detailed d. reconciled

d. reconciled

The characteristic that indicates that a data warehouse is organized around key high-level entities of the enterprise is: Select one: a. nonvolatile. b. integrated. c. time-variant. d. subject-oriented.

d. subject-oriented.

A type of query that is placed within a WHERE or HAVING clause of another query is called a: Select one: a. multi-query. b. master query. c. superquery. d. subquery.

d. subquery.

a. Advances in middleware products that enabled enterprise database connectivity across heterogeneous platforms. b. The invention of the iPad. c. Increase in viruses and other computer threats. d. Improvements in monitor technologies.

a. Advances in middleware products that enabled enterprise database connectivity across heterogeneous platforms.

A data mart is a data warehouse that contains data that can be used across the entire organization. Select one: a. False b. True

a. False

Databases are generally the property of a single department within an organization. Select one: a. False b. True

a. False

Hadoop is considered a relational database management system. Select one: a. False b. True

a. False

Master data management is the disciplines, technologies and methods to ensure the currency, meaning and quality of data within one subject area. Select one: a. False b. True

a. False

Most data outages in organizations are caused by hardware failures. Select one: a. False b. True

a. False

Operational metadata are derived from the enterprise data model. Select one: a. False b. True

a. False

Quality data are not essential for well-run organizations. Select one: a. False b. True

a. False

Specifying the attribute names in the SELECT statement will make it easier to find errors in queries and also correct for problems that may occur in the base system. Select one: a. False b. True

a. False

Subqueries can only be used in the WHERE clause. Select one: a. False b. True

a. False

The process of managing simultaneous operations against a database so that data integrity is maintained is called completeness control. Select one: a. False b. True

a. False

The schema on write and schema on read are considered synonymous approaches. Select one: a. False b. True

a. False

The status of data is the representation of the data after an event has occurred. Select one: a. False b. True

a. False

Transient data are never changed. Select one: a. False b. True

a. False

Which of the following is NOT true of poor data and/or database administration? Select one: a. Maintaining a secure server b. Unknown meanings of stored data c. Multiple entity definitions d. Data timing problems

a. Maintaining a secure server

________ are examples of Business Intelligence and Analytics 3.0 because they have millions of observations per second. Select one: a. Smartphones b. Web-based interaction logs c. Web-based customer platforms d. Administrative systems

a. Smartphones

The following code is an example of a: SELECT CustomerName, CustomerAddress, CustomerCity, CustomerState, CustomerPostalCode FROM Customer_T WHERE Customer_T.CustomerID = (SELECT Order_T.CustomerID FROM Order_T WHERE OrderID = 1008); Select one: a. Subquery. b. Correlated subquery. c. JOIN. d. FULL OUTER JOIN.

a. Subquery.
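
The query in this card can be tried out with Python's sqlite3 module. The sketch below trims the column list to CustomerName and uses invented sample rows; the inner query runs once and does not refer to the outer row, which is what makes it a plain (noncorrelated) subquery.

```python
import sqlite3

# Invented sample rows; only the columns the query needs are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Order_T VALUES (1007, 1), (1008, 2);
""")

# The inner SELECT yields one CustomerID; the outer query then filters on it.
subquery_rows = conn.execute("""
    SELECT CustomerName
    FROM Customer_T
    WHERE Customer_T.CustomerID =
          (SELECT Order_T.CustomerID FROM Order_T WHERE OrderID = 1008)
""").fetchall()
print(subquery_rows)
```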

A data quality audit helps an organization understand the extent and nature of data quality problems. Select one: a. True b. False

a. True

A join in which the joining condition is based on equality between values in the common column is called an equi-join. Select one: a. True b. False

a. True

A method of capturing data in a snapshot at a point in time is called static extract. Select one: a. True b. False

a. True

An enterprise data warehouse that accepts near-real time feeds of transactional data and immediately transforms and loads the appropriate data is called a real-time data warehouse. Select one: a. True b. False

a. True

ETL is short for Extract, Transform, Load. Select one: a. True b. False

a. True

In order to find out what customers have not placed an order for a particular item, one might use the NOT qualifier along with the IN qualifier. Select one: a. True b. False

a. True
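
One possible shape for such a query, sketched with Python's sqlite3 module. The `OrderLine_T` table, the product number 7, and all rows are assumptions made up for this example.

```python
import sqlite3

# Hypothetical schema: OrderLine_T links orders to the products on them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER NOT NULL);
    CREATE TABLE OrderLine_T (OrderID INTEGER, ProductID INTEGER);
    INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Order_T VALUES (1001, 1), (1002, 2);
    INSERT INTO OrderLine_T VALUES (1001, 7), (1002, 3);
""")

# The subquery lists customers who ordered product 7; NOT IN keeps everyone else.
not_in_rows = conn.execute("""
    SELECT CustomerName
    FROM Customer_T
    WHERE CustomerID NOT IN
          (SELECT o.CustomerID
           FROM Order_T o JOIN OrderLine_T l ON o.OrderID = l.OrderID
           WHERE l.ProductID = 7)
    ORDER BY CustomerName
""").fetchall()
print(not_in_rows)
```

Note that NOT IN behaves unexpectedly if the subquery can return NULL, which is why CustomerID is declared NOT NULL in this sketch.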

The UNION clause is used to combine the output from multiple queries into a single result table. Select one: a. True b. False

a. True

The first requirement for building a user-friendly interface is a set of metadata that describes the data in the data mart in business terms that users can easily understand. Select one: a. True b. False

a. True

The following queries produce the same results. Select DISTINCT customer_name, customer_city from customer, salesman where customer.salesman_id = salesman.salesman_id and salesman.lname = 'SMITH'; select customer_name, customer_city from customer where customer.salesman_id = (select salesman_id from salesman where lname = 'SMITH'); Select one: a. True b. False

a. True

The need for data warehousing in an organization is driven by its need for an integrated view of high-quality data. Select one: a. True b. False

a. True

The best place to improve data entry across all applications is: Select one: a. in the database definitions. b. in the users. c. in the data entry operators. d. in the level of organizational commitment.

a. in the database definitions.

A method of capturing only the changes that have occurred in the source data since the last capture is called ________ extract. Select one: a. incremental b. update-driven c. static d. partial

a. incremental
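
One common way to implement an incremental extract is to filter on a last-modified marker. The sketch below assumes a hypothetical `UpdatedAt` column holding a simple integer clock and a remembered `last_capture` value; real systems may instead rely on logs or change data capture features.

```python
import sqlite3

# Hypothetical source table; UpdatedAt is an integer "clock" for simplicity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Source_T (ID INTEGER PRIMARY KEY, Value TEXT, UpdatedAt INTEGER);
    INSERT INTO Source_T VALUES (1, 'a', 1), (2, 'b', 2), (3, 'c', 5), (4, 'd', 6);
""")

last_capture = 2  # time of the previous extract (assumed to be stored somewhere)

# Only rows changed since the last capture are pulled, not the whole table.
changed_rows = conn.execute(
    "SELECT ID, Value FROM Source_T WHERE UpdatedAt > ? ORDER BY ID",
    (last_capture,),
).fetchall()
print(changed_rows)
```

A static extract, by contrast, would select every row regardless of when it changed.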

Dynamic SQL: Select one: a. is used to generate appropriate SQL code on the fly as an application is processing. b. is not used widely on the Internet. c. creates a less flexible application. d. is quite volatile.

a. is used to generate appropriate SQL code on the fly as an application is processing.
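
As a rough illustration of generating SQL on the fly, the sketch below builds a WHERE clause at run time from whatever filters the caller supplies. The helper `build_query`, the allowed-column list, and the sample rows are all hypothetical; values are passed as bound parameters rather than pasted into the string.

```python
import sqlite3

ALLOWED_COLUMNS = {"CustomerCity", "CustomerState"}  # hypothetical whitelist

def build_query(filters):
    """Assemble a SELECT statement at run time from a dict of column filters."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unexpected column: {column}")
        clauses.append(f"{column} = ?")  # value is bound later, never inlined
        params.append(value)
    sql = "SELECT CustomerName FROM Customer_T"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY CustomerName", params

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_T (CustomerName TEXT, CustomerCity TEXT, CustomerState TEXT);
    INSERT INTO Customer_T VALUES ('Ann', 'Austin', 'TX'), ('Bob', 'Fresno', 'CA');
""")

dynamic_sql, dynamic_params = build_query({"CustomerState": "TX"})
dynamic_rows = conn.execute(dynamic_sql, dynamic_params).fetchall()
print(dynamic_sql)
print(dynamic_rows)
```

Embedded SQL, by comparison, hard-codes the statement text in the host program before it runs.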

Big Data includes: Select one: a. large volumes of data with many different data types that are processed at very high speeds. b. large volumes of entity relationship diagrams (ERD) with many different data types that are processed at very high speeds. c. large volumes of entity relationship diagrams (ERD) with a single data type processed at very high speeds. d. large volumes of data entry with a single data type processed at very high speeds.

a. large volumes of data with many different data types that are processed at very high speeds.

Conformed dimensions allow users to do the following: Select one: a. query across fact tables with consistency. b. delete correlated data. c. fix viruses in html documents. d. identify viruses in web sites.

a. query across fact tables with consistency.

Which of the following is true of data visualization? Select one: a. It is more difficult to observe trends and patterns in data. b. Correlations and clusters in data can be easily identified. c. It is often used in conjunction with poems. d. It is generally not helpful for decision making.

b. Correlations and clusters in data can be easily identified.

________ is a technical function responsible for database design, security, and disaster recovery. Select one: a. Tech support b. Database administration c. Data administration d. Operations

b. Database administration

Which of the following functions develops cost/benefit models? Select one: a. Database analysis b. Database planning c. Database design d. Operations

b. Database planning

There are two principal types of authorization tables: one for subjects and one for facts. Select one: a. True b. False

b. False

The methods to ensure the quality of data across various subject areas are called: Select one: a. Variable Data Management. b. Master Data Management. c. Joint Data Management. d. Managed Data Management.

b. Master Data Management.

A DBMS may perform checkpoints automatically or in response to commands in user application programs. Select one: a. False b. True

b. True

A business transaction is a sequence of steps that constitute some well-defined business activity. Select one: a. False b. True

b. True

Triggers have three parts: the event, the condition, and the action. Select one: a. False b. True

b. True
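
The three parts can be pointed out in a small SQLite trigger, run here through Python's sqlite3 module. The tables, the trigger name `LargePriceIncrease`, and the 50% threshold are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical tables and trigger; comments mark the three parts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product_T (ProductName TEXT, Price REAL);
    CREATE TABLE PriceAudit_T (ProductName TEXT, OldPrice REAL, NewPrice REAL);
    INSERT INTO Product_T VALUES ('Desk', 100);

    CREATE TRIGGER LargePriceIncrease
    AFTER UPDATE OF Price ON Product_T          -- event
    WHEN NEW.Price > OLD.Price * 1.5            -- condition
    BEGIN                                       -- action
        INSERT INTO PriceAudit_T VALUES (OLD.ProductName, OLD.Price, NEW.Price);
    END;
""")

# A 20% increase does not satisfy the condition, so nothing is logged.
conn.execute("UPDATE Product_T SET Price = 120 WHERE ProductName = 'Desk'")
# A jump from 120 to 300 does, so the action fires automatically.
conn.execute("UPDATE Product_T SET Price = 300 WHERE ProductName = 'Desk'")

audit_rows = conn.execute("SELECT * FROM PriceAudit_T").fetchall()
print(audit_rows)
```

No code calls the trigger by name; the UPDATE event alone causes it to run.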

While views promote security by restricting user access to data, they are not adequate security measures because: Select one: a. views are not possible to create in most DBMS. b. an unauthorized person may gain access to a view through experimentation. c. all users can read any view. d. a view's data does not change.

b. an unauthorized person may gain access to a view through experimentation.

Loading data into a data warehouse does NOT involve: Select one: a. purging data that have become obsolete or were incorrectly loaded. b. formatting the hard drive. c. appending new rows to the tables in the warehouse. d. updating existing rows with new data.

b. formatting the hard drive.

The three 'v's commonly associated with big data include: Select one: a. viewable, volume, and variety. b. volume, variety, and velocity. c. vigilant, viewable, and verified. d. verified, variety, and velocity.

b. volume, variety, and velocity.

________ takes a value of TRUE if a subquery returns an intermediate results table which contains one or more rows. Select one: a. IN b. HAVING c. EXISTS d. EXTENTS

c. EXISTS
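
A brief sketch of EXISTS using Python's sqlite3 module, with invented sample rows. The subquery is evaluated per customer, and EXISTS is true as soon as it returns at least one row.

```python
import sqlite3

# Invented sample data: Ann has an order, Bob does not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_T (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT);
    CREATE TABLE Order_T (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
    INSERT INTO Customer_T VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Order_T VALUES (1001, 1);
""")

# Keep a customer only when the subquery finds one or more matching orders.
exists_rows = conn.execute("""
    SELECT CustomerName
    FROM Customer_T c
    WHERE EXISTS (SELECT 1 FROM Order_T o WHERE o.CustomerID = c.CustomerID)
    ORDER BY CustomerName
""").fetchall()
print(exists_rows)
```

Swapping in NOT EXISTS would return the customers with no orders instead.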

The following code is an example of a(n): SELECT Customer_T.CustomerID, Order_T.CustomerID, CustomerName, OrderID FROM Customer_T, Order_T WHERE Customer_T.CustomerID = Order_T.CustomerID; Select one: a. Right Outer JOIN. b. Full Outer JOIN. c. equi-join. d. subquery.

c. equi-join.

The process of combining data from various sources into a single table or view is called: Select one: a. selecting. b. updating. c. joining. d. extracting.

c. joining.

A ________ is a DBMS module that restores the database to a correct condition when a failure occurs. Select one: a. transaction logger b. restart facility c. recovery manager d. backup facility

c. recovery manager

Data quality ROI stands for: Select one: a. return on installation. b. rough outline inclusion. c. risk of incarceration. d. rate of installation.

c. risk of incarceration.

While triggers run automatically, ________ do not and have to be called. Select one: a. trapdoors b. selects c. routines d. updates

c. routines

The approach in which the organization of the data for reporting and analysis is determined when the data are used is called: Select one: a. schema binding. b. entity relationship diagram. c. schema on read. d. cognitive schema.

c. schema on read.

User-defined transactions can improve system performance because: Select one: a. speed is decreased due to query optimization. b. all triggers delete rows. c. transactions are processed as sets, reducing system overhead. d. transactions are mapped to SQL statements.

c. transactions are processed as sets, reducing system overhead.

Which of the following is NOT a component of a repository system architecture? Select one: a. An informational model b. The repository database c. The repository engine d. A data transformation process

d. A data transformation process

Operational and informational systems are generally separated because of which of the following factors? Select one: a. Only operational systems allow SQL statements. b. A separate data warehouse increases contention for resources. c. A properly designed data warehouse decreases value to data. d. A data warehouse centralizes data that are scattered throughout disparate operational systems and makes them readily available for decision support applications.

d. A data warehouse centralizes data that are scattered throughout disparate operational systems and makes them readily available for decision support applications.

Which of the following data-mining techniques identifies clusters of observations with similar characteristics? Select one: a. Case reasoning b. Neural nets c. Rule discovery d. Clustering and signal processing

d. Clustering and signal processing

________ duplicates data across databases. Select one: a. Data duplication b. A replication server c. Redundant replication d. Data propagation

d. Data propagation

________ tools commonly load data into intermediate hypercube structures. Select one: a. TLAP b. ROLAP c. OLAP d. MOLAP

d. MOLAP

The Hadoop framework consists of the ________ algorithm to solve large scale problems. Select one: a. MapCluster b. MapSystem c. MapComponent d. MapReduce

d. MapReduce
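
The idea behind MapReduce can be sketched in a few lines of plain Python with the classic word-count example. This is a single-process toy, not Hadoop's API; the function names and input strings are made up, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict

def map_phase(document):
    # Emit a (word, 1) pair for every word in one input split.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    # Group all emitted values by key, as happens between the two phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Collapse each key's list of values into a single total.
    return {key: sum(values) for key, values in grouped.items()}

splits = ["big data big", "data lake"]  # stand-ins for blocks stored in HDFS
pairs = [pair for split in splits for pair in map_phase(split)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```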

A transaction that terminates abnormally is called a(n) ________ transaction. Select one: a. deleted b. terminated c. completed d. aborted

d. aborted

A join operation: Select one: a. causes two disparate tables to be combined into a single table or view. b. is used to combine indexing operations. c. brings together data from two different fields. d. causes two tables with a common domain to be combined into a single table or view.

d. causes two tables with a common domain to be combined into a single table or view.

A technique using pattern recognition to upgrade the quality of raw data is called: Select one: a. data scrounging. b. data analysis. c. data gouging. d. data scrubbing.

d. data scrubbing.

A data mart is a(n): Select one: a. enterprisewide data warehouse. b. generic on-line shopping site. c. smaller system built upon file processing technology. d. data warehouse that is limited in scope.

d. data warehouse that is limited in scope.

The goal of data mining related to analyzing data for unexpected relationships is: Select one: a. laboratory. b. explanatory. c. confirmatory. d. exploratory.

d. exploratory.

Data quality is important for all of the following reasons EXCEPT: Select one: a. it helps to expand the customer base. b. it aids in making timely business decisions. c. it minimizes project delay. d. it provides a stream of profit.

d. it provides a stream of profit.

An audit trail of database changes is kept by a: Select one: a. subschema. b. before image. c. change control device. d. journalizing facility.

d. journalizing facility.

The most commonly used form of join operation is the: Select one: a. equi-join. b. outer join. c. union join. d. natural join.

d. natural join.

Descriptive, predictive, and ________ are the three main types of analytics. Select one: a. adaptive b. decisive c. comparative d. prescriptive

d. prescriptive

Security measures for dynamic Web pages are different from static HTML pages because: Select one: a. HTML is more complex than dynamic Web pages. b. dynamic Web pages are built "on the fly." c. static Web pages contain more sensitive data. d. the connection requires full access to the database for dynamic pages.

d. the connection requires full access to the database for dynamic pages.

Regarding big data value, the primary focus is on: Select one: a. quantity. b. speed. c. variety. d. usefulness.

d. usefulness.

Establishing IF-THEN-ELSE logical processing within an SQL statement can be accomplished by: Select one: a. using the immediate if statement. b. using the if-then-else construct. c. using a subquery. d. using the CASE keyword in a statement.

d. using the CASE keyword in a statement.
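
A short sketch of CASE acting as IF-THEN-ELSE inside a single SELECT, run with Python's sqlite3 module. The `Product_T` rows, the 500 cutoff, and the tier labels are assumptions for illustration.

```python
import sqlite3

# Invented product rows and an arbitrary price cutoff.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product_T (ProductName TEXT, Price REAL);
    INSERT INTO Product_T VALUES ('Desk', 650), ('Lamp', 40);
""")

# CASE picks a label per row: IF Price >= 500 THEN 'premium' ELSE 'standard'.
case_rows = conn.execute("""
    SELECT ProductName,
           CASE WHEN Price >= 500 THEN 'premium'
                ELSE 'standard'
           END AS Tier
    FROM Product_T
    ORDER BY ProductName
""").fetchall()
print(case_rows)
```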

