ISBC 217 Final pt 1
Which of the following statements is true of a data warehouse? A) A data warehouse is larger than a data mart. B) A data warehouse functions like a retail store in a supply chain. C) Users in a data warehouse obtain data pertaining to a business function from a data mart. D) Data analysts who work with a data warehouse are experts in a particular business function.
A) A data warehouse is larger than a data mart.
________ is an unsupervised data mining technique in which statistical techniques identify groups of entities that have similar characteristics. A) Cluster analysis B) Content indexing C) Regression analysis D) Cloud computing
A) Cluster analysis
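A minimal sketch of the idea on this card, assuming k-means as the clustering method (the card does not name a specific algorithm) and using made-up customer spend figures. No labels are supplied; the groups emerge from the data alone.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move each center to its group's mean."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical yearly spend per customer (illustrative values only).
spend = [12, 15, 14, 210, 198, 205]
centers, groups = k_means_1d(spend, centers=[0, 100])
print(groups)  # [[12, 15, 14], [210, 198, 205]]
```

The starting centers (0 and 100) are arbitrary guesses; the analyst interprets what the two resulting groups mean only after the analysis runs.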
________ are reports produced when something out of predefined bounds occurs. A) Exception reports B) Static reports C) Dynamic reports D) Subscriptions
A) Exception reports
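As a rough illustration of "out of predefined bounds", the sketch below reports only the values that fall outside a fixed range. The function name, the sales figures, and the bounds are all hypothetical.

```python
def exception_report(readings, low, high):
    """Return only the readings that fall outside the predefined [low, high] bounds."""
    return [(name, value) for name, value in readings if value < low or value > high]

# Hypothetical daily sales totals and bounds.
daily_sales = [("Mon", 950), ("Tue", 1020), ("Wed", 310), ("Thu", 1800)]
report = exception_report(daily_sales, low=500, high=1500)
print(report)  # [('Wed', 310), ('Thu', 1800)]
```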
Which of the following statements is true of data with granularity? A) Granularity refers to the level of detail represented by the data. B) If granularity is too coarse, data can be made finer by summing and combining. C) The granularity of clickstream data is too coarse. D) If granularity is too coarse, data can be separated into constituent parts using regression.
A) Granularity refers to the level of detail represented by the data.
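Option B above is false because it has the direction reversed: data that is too fine can be made coarser by summing and combining, but data that is too coarse cannot be split back into its parts. A small sketch with made-up click records, assuming one row per click:

```python
from collections import defaultdict

# Hypothetical fine-grained data: one row per click (date, page).
clicks = [
    ("2024-01-01", "home"), ("2024-01-01", "cart"),
    ("2024-01-01", "home"), ("2024-01-02", "home"),
]

# Coarsen the data by summing: clicks per day.
clicks_per_day = defaultdict(int)
for day, _page in clicks:
    clicks_per_day[day] += 1

print(dict(clicks_per_day))  # {'2024-01-01': 3, '2024-01-02': 1}
```

Given only the daily totals, there is no way to recover which pages were clicked.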
________ is an open-source program supported by the Apache Foundation that manages thousands of computers and implements MapReduce. A) Hadoop B) BigData C) Linux D) Apache Wave
A) Hadoop
________ are user requests for particular business intelligence results on a particular schedule or in response to particular events. A) Subscriptions B) Third-party cookies C) Static reports D) Dynamic reports
A) Subscriptions
A ________ is a data collection that addresses the needs of a particular department or functional area of a business. A) data mart B) data room C) datasheet D) dataspace
A) data mart
The source, format, assumptions, constraints, and other facts concerning certain data are called ________. A) metadata B) data structures C) microdata D) network packets
A) metadata
Which of the following statements is true of unsupervised data mining? A) Analysts apply data mining techniques to estimate the parameters of a developed model. B) Analysts create hypotheses only after performing an analysis. C) Regression analysis is the most commonly used unsupervised data mining technique. D) Data miners develop models prior to performing an analysis.
B) Analysts create hypotheses only after performing an analysis.
________ refers to the level of detail represented by data. A) Abstraction B) Granularity C) Dimensionality D) Aggregation
B) Granularity
Which of the following statements is true of business intelligence (BI) publishing alternatives? A) The skills required to publish static content are extremely high. B) Publishing dynamic BI is more difficult than publishing static content. C) For static content, the skill required to create a publishing application is high. D) For Web servers, push options are manual.
B) Publishing dynamic BI is more difficult than publishing static content.
Which of the following is a fundamental category of business intelligence (BI) analysis? A) data acquisition B) data mining C) push publishing D) pull publishing
B) data mining
Which of the following refers to data in the form of rows and columns? A) granulated data B) structured data C) nonintegrated data D) problematic data
B) structured data
In the case of ________, data miners develop models prior to conducting analyses and then apply statistical techniques to data to estimate parameters of the models. A) pull publishing techniques B) supervised data mining C) push publishing techniques D) unsupervised data mining
B) supervised data mining
The more attributes there are in sample data, the easier it is to build a model that fits the sample data but is worthless as a predictor. Which of the following best explains this phenomenon? A) the free rider problem B) the curse of dimensionality C) the tragedy of the commons D) the zero-sum game
B) the curse of dimensionality
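A toy sketch of this effect, assuming a deliberately naive "model" that memorizes the majority label for each combination of attributes. The data, labels, and function names are hypothetical. With one attribute the model cannot fit the sample perfectly; with three attributes every row is unique, so the fit is perfect, yet the model has nothing to say about a new row.

```python
from collections import Counter, defaultdict

def fit_lookup(rows, labels, n_attrs):
    """Memorize the majority label for each combination of the first n_attrs attributes."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[row[:n_attrs]][label] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

def accuracy(model, rows, labels, n_attrs):
    """Fraction of rows whose memorized label matches the actual label."""
    hits = sum(model.get(row[:n_attrs]) == label for row, label in zip(rows, labels))
    return hits / len(rows)

# Hypothetical sample: (region, age group, weekday) with an arbitrary outcome.
rows = [("east", "young", "mon"), ("east", "young", "tue"),
        ("east", "old", "mon"), ("west", "old", "tue")]
labels = ["buy", "skip", "buy", "skip"]

for n in (1, 3):
    model = fit_lookup(rows, labels, n)
    print(n, accuracy(model, rows, labels, n))  # prints "1 0.75" then "3 1.0"

# The 3-attribute model fits the sample perfectly but cannot score an unseen row.
print(model.get(("west", "young", "mon")))  # None
```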
________ process operational and other data in organizations to analyze past performance and make predictions.
Business intelligence systems
________ is the application of statistical techniques to find patterns and relationships among data for classification and prediction. A) Data encryption B) Push publishing C) Data mining
C) Data mining
________ techniques emerged from the combined discipline of statistics, mathematics, artificial intelligence, and machine-learning. A) Push publishing B) Pull publishing C) Data mining D) Exception reporting
C) Data mining
Which of the following statements is true of Hadoop? A) Hadoop is written in C++ and runs on Linux. B) Hadoop includes a query language entitled Big. C) Hadoop is an open-source program that implements MapReduce. D) Technical skills are not required to run and use Hadoop.
C) Hadoop is an open-source program that implements MapReduce.
The results generated in the Map phase are combined in the ________ phase. A) Pig B) control C) Reduce D) construct
C) Reduce
________ are business intelligence documents that are fixed at the time of creation and do not change. A) Subscriptions B) Third-party cookies C) Static reports D) Exception reports
C) Static reports
A ________ is a facility for managing an organization's business intelligence data. A) datasheet B) dataspace C) data warehouse D) data room
C) data warehouse
Which of the following problems is particularly common for data that have been gathered over time? A) wrong granularity B) lack of integration C) lack of consistency D) missing values
C) lack of consistency
Which of the following activities in the business intelligence process involves delivering business intelligence to the knowledge workers who need it? A) data acquisition B) BI analysis C) publish results D) data mining
C) publish results
The goal of ________, a type of business intelligence analysis, is to create information about past performance. A) push publishing B) data mining C) reporting D) BigData
C) reporting
The use of an organization's operational data as the source data for a BI system is not usually recommended because it ________. A) is not possible to create reports based on operational data B) is not possible to perform business intelligence analyses on operational data C) requires considerable processing and can drastically reduce system performance D) considers only external data and not internal data regarding an organization's functions
C) requires considerable processing and can drastically reduce system performance
Regression analysis is used in ________. A) static reporting B) exception reporting C) supervised data mining D) unsupervised data mining
C) supervised data mining
Which of the following statements is true of BigData? A) BigData contains only structured data. B) BigData has low velocity and is generated slowly. C) BigData cannot store graphics, audio, and video files. D) BigData refers to data sets that are at least a petabyte in size.
D) BigData refers to data sets that are at least a petabyte in size.
Which of the following statements is true of business intelligence (BI) systems? A) Business intelligence systems are primarily used for developing software systems and data mining applications. B) The four standard components of business intelligence systems are software, procedures, applications, and programs. C) The software component of a business intelligence system is called an intelligence database. D) Business intelligence systems analyze an organization's past performance to make predictions.
D) Business intelligence systems analyze an organization's past performance to make predictions.
________ is the process of obtaining, cleaning, organizing, relating, and cataloging source data. A) Data interpretation B) Pull publishing C) Push publishing D) Data acquisition
D) Data acquisition
________ reports are business intelligence documents that are updated at the time they are requested. A) Subscriptions B) Third-party cookies C) Static D) Dynamic
D) Dynamic
In the ________ phase, a BigData collection is broken into pieces and hundreds or thousands of independent processors search these pieces for something of interest. A) crash B) control C) Pig D) Map
D) Map
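The Map phase on this card, together with the Reduce phase that combines its results, can be sketched as follows. This is a single-process illustration only: the pieces are searched one after another here, whereas Hadoop distributes them across many machines. The function names and the sample text are hypothetical.

```python
from collections import Counter
from functools import reduce

def map_phase(piece, term):
    """One worker searches its own piece and counts occurrences of the term."""
    return Counter(word for word in piece.split() if word == term)

def reduce_phase(partials):
    """Combine the partial counts produced by the Map phase."""
    return reduce(lambda a, b: a + b, partials, Counter())

# Hypothetical collection already broken into pieces.
pieces = ["bi data mart", "data warehouse data", "hadoop"]
partials = [map_phase(p, "data") for p in pieces]  # run serially in this sketch
total = reduce_phase(partials)
print(total["data"])  # 3
```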
________ is used to measure the impact of a set of variables on another variable during data mining. A) Cluster analysis B) Context indexing C) Cloud computing D) Regression analysis
D) Regression analysis
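A minimal sketch of the simplest case, assuming a single input variable and an ordinary least-squares line (the card speaks of a set of variables; one is used here for brevity). The advertising and sales numbers are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend vs. sales.
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]
slope, intercept = fit_line(ad_spend, sales)
print(slope, intercept)  # 2.0 1.0
```

The slope is the estimated impact: each extra unit of ad spend is associated with two more units of sales in this sample.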
________ is the process of sorting, grouping, summing, filtering, and formatting structured data. A) Push publishing B) Pull publishing C) Cloud computing D) Reporting analysis
D) Reporting analysis
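The five operations named on this card can be shown on a small table of rows and columns. The sales rows and the minimum-amount filter below are hypothetical.

```python
from itertools import groupby

# Hypothetical structured data: (region, product, amount).
sales = [
    ("West", "A", 120), ("East", "B", 40), ("West", "B", 75),
    ("East", "A", 200), ("West", "A", 5),
]

filtered = [row for row in sales if row[2] >= 10]        # filtering
ordered = sorted(filtered, key=lambda row: row[0])       # sorting
totals = {region: sum(r[2] for r in rows)                # grouping and summing
          for region, rows in groupby(ordered, key=lambda row: row[0])}
report = [f"{region:<5}{total:>6}" for region, total in totals.items()]  # formatting
print("\n".join(report))
```

Note that `groupby` only groups adjacent rows, which is why the rows are sorted by region first.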
The purpose of a ________ is to extract data from operational systems and other sources, clean the data, and store and catalog that data for processing by business intelligence tools. A) data mart B) data center C) data room D) data warehouse
D) data warehouse
Users in a data mart obtain data that pertain to a particular business function from a ________. A) data room B) data center C) datasheet D) data warehouse
D) data warehouse
Problematic data are referred to as ________. A) rough data B) clickstream data C) granular data D) dirty data
D) dirty data
The ________ of business intelligence servers maintains metadata about the authorized allocation of business intelligence results to users. A) exception report B) dynamic report C) delivery function D) management function
D) management function
T/F A printed sales analysis is an example of a dynamic report.
False
T/F An advantage of data warehouses is the low cost required to create, staff, and operate them.
False
T/F BigData has low velocity and is generated slowly.
False
T/F Cluster analysis measures the impact of a set of variables on another variable.
False
T/F Data marts are usually larger than data warehouses.
False
T/F Data mining is the process of obtaining, cleaning, organizing, relating, and cataloging source data.
False
T/F External data purchased from outside resources are not included in data warehouses.
False
T/F For dynamic content, the skills required to create a publishing application are low.
False
T/F If the granularity of certain data is too coarse, the data can be separated into constituent parts using statistical techniques.
False
T/F Project management is one of the few domains in which business intelligence is rarely used.
False
T/F Pull publishing delivers business intelligence to users without any request from the users.
False
T/F Regression analysis is used to identify groups of entities that have similar characteristics.
False
T/F Reporting analysis is used primarily for classifying and predicting BI data.
False
T/F Static reports are business intelligence documents that are updated at the time they are requested.
False
T/F The curse of dimensionality states that the more attributes there are, the more difficult it is to build a model that fits the sample data.
False
T/F The granularity in clickstream data is too coarse.
False
T/F The software component of a business intelligence system is called a business intelligence database.
False
T/F Using BI for identifying changes in the purchasing patterns of customers is a labor-intensive process.
False
________ requires users to request business intelligence results.
Pull publishing
________ is the process of delivering business intelligence to users without any request from the users.
Push publishing
T/F BigData has volume, velocity, and variation characteristics that far exceed those of traditional reporting and data mining.
True
T/F BigData refers to data sets that are at least a petabyte in size.
True
T/F Business intelligence enables police departments to better utilize their personnel through predictive-policing.
True
T/F Business intelligence systems are information systems that process operational and other data to analyze past performance and to make predictions.
True
T/F Data analysts who work with data warehouses are not usually experts in a given business function.
True
T/F Data granularity refers to the level of detail represented by data.
True
T/F Data marts are data collections that address the needs of a particular department or functional area of a business.
True
T/F Facts about data, such as its source, format, assumptions, and constraints, are called metadata.
True
T/F MapReduce is a technique for harnessing the power of thousands of computers working in parallel.
True
T/F Problematic data are termed dirty data.
True
T/F Publish results is the process of delivering business intelligence to the knowledge workers who need it.
True
T/F Push publishing delivers business intelligence to users without any request from the users.
True
T/F Push options are manual when emails or collaboration tools are used for BI publishing.
True
T/F Structured data is data in the form of rows and columns.
True
T/F The data that an organization purchases from data vendors can act as the source data for a business intelligence system.
True
T/F The management function of BI servers maintains metadata about the authorized allocation of BI results to users.
True
T/F Users in a data mart obtain data that pertain to a particular business function from a data warehouse.
True
T/F With unsupervised data mining, analysts do not create a model or hypothesis before running the analysis.
True