MIS 4300
Understanding customers better has helped Amazon and others become more successful. The understanding comes primarily from A) collecting data about customers and transactions. B) developing a philosophy that is data analytics-centric. C) analyzing the vast data amounts routinely collected. D) asking the customers what they want.
Analyzing the vast data amounts routinely collected
The data mining algorithm type used for classification somewhat resembling the biological neural networks in the human brain is A) association rule mining. B) cluster analysis. C) decision trees. D) artificial neural networks
Artificial neural networks
Kaplan and Norton developed a report that presents an integrated view of success in the organization called A) metric management reports. B) balanced scorecard-type reports. C) dashboard-type reports. D) visual reports
Balanced scorecard-type reports
Why is a performance management system superior to a performance measurement system? A) because performance measurement systems are only in their infancy B) because measurement automatically leads to problem solution C) because performance management systems cost more D) because measurement alone has little use without action
Because measurement alone has little use without action
Which kind of chart is described as an enhanced variant of a scatter plot? A) heat map B) bullet C) pie chart D) bubble chart
Bubble Chart
Which data mining process/methodology is thought to be the most comprehensive, according to kdnuggets.com rankings? A) SEMMA B) proprietary organizational methodologies C) KDD Process D) CRISP-DM
CRISP-DM
In which stage of extraction, transformation, and load (ETL) into a data warehouse are anomalies detected and corrected? A) transformation B) extraction C) load D) cleanse
Cleanse
Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features? A) associations B) visualization C) classification D) clustering
Clustering
In the Cabela's case study, what types of models helped the company understand the value of customers, using a five-point scale? A) reporting and association models B) simulation and geographical models C) simulation and regression models D) clustering and association models
Clustering and association models
Which of the following is a data mining myth? A) Data mining is a multistep process that requires deliberate, proactive design and use. B) Data mining requires a separate, dedicated database. C) The current state-of-the-art is ready to go for almost any business. D) Newer Web-based tools enable managers of all educational levels to do data mining
Data mining requires a separate, dedicated database
The "single version of the truth" embodied in a data warehouse such as Capri Casinos' means all of the following EXCEPT A) decision makers get to see the same results to queries. B) decision makers have the same data available to support their decisions. C) decision makers get to use more dependable data for their decisions. D) decision makers have unfettered access to all data in the warehouse
Decision makers have unfettered access to all data in the warehouse
When querying a dimensional database, a user went from summarized data to its underlying details. The function that served this purpose is A) dice. B) slice. C) roll-up. D) drill down
Drill Down
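As a rough illustration of the drill-down idea above, the sketch below uses plain Python on a made-up sales list (the regions, cities, and amounts are hypothetical, not from any case study). Roll-up summarizes detail rows to one total per region; drill-down goes from one summarized value back to the details beneath it.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, city, sales amount)
SALES = [
    ("East", "Boston", 120),
    ("East", "New York", 300),
    ("West", "Seattle", 150),
    ("West", "Portland", 80),
]

def roll_up(rows):
    """Summarize detail rows to one total per region."""
    totals = defaultdict(int)
    for region, _city, amount in rows:
        totals[region] += amount
    return dict(totals)

def drill_down(rows, region):
    """Return the city-level detail behind one region's total."""
    return {city: amount for r, city, amount in rows if r == region}

summary = roll_up(SALES)                 # summarized view
east_detail = drill_down(SALES, "East")  # underlying details for "East"
```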
What is the fundamental challenge of dashboard design? A) ensuring that users across the organization have access to it B) ensuring that the organization has the appropriate hardware onsite to support it C) ensuring that the organization has access to the latest web browsers D) ensuring that the required information is shown clearly on a single screen
Ensuring that the required information is shown clearly on a single screen
Which approach to data warehouse integration focuses more on sharing process functionality than data across systems? A) extraction, transformation, and load B) enterprise application integration C) enterprise information integration D) enterprise function integration
Enterprise Application Integration
For those executives who do not have the time to go through lengthy reports, the best alternative is the A) last page of the report. B) raw data that informed the report. C) executive summary. D) charts in the report
Executive Summary
Due to the fact that business environments are now more complex than ever, trial-and-error is an effective means of arriving at acceptable solutions. True False
FALSE
The ETL process in data warehousing usually takes up a small portion of the time in a data-centric project. True False
FALSE
Two-tier data warehouse/BI infrastructures offer organizations more flexibility but cost more than three-tier ones. True False
FALSE
A Six Sigma deployment can be deemed effective even if the number of defects is not reduced to 3.4 defects per million. True False
True
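For reference, the 3.4 figure in the card above is usually stated as defects per million opportunities (DPMO). A minimal sketch of that arithmetic, with a hypothetical function name and made-up counts:

```python
SIX_SIGMA_TARGET_DPMO = 3.4

def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects * 1_000_000 / (units * opportunities_per_unit)

# Hypothetical process: 17 defects found in 1,000 units,
# each unit having 5 opportunities for a defect.
current = dpmo(defects=17, units=1000, opportunities_per_unit=5)
meets_target = current <= SIX_SIGMA_TARGET_DPMO
```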
A well-designed data warehouse means that user requirements do not have to change as business needs change True False
False
Because the recession has raised interest in low-cost open source software, it is now set to replace traditional enterprise software True False
False
Bill Inmon advocates the data mart bus architecture whereas Ralph Kimball promotes the hub-and-spoke architecture, a data mart bus architecture with conformed dimensions True False
False
Data is the contextualization of information, that is, information set in context. True False
False
Data mining can be very useful in detecting patterns such as credit card fraud, but is of little help in improving sales. True False
False
Data mining requires specialized data analysts to ask ad hoc questions and obtain answers quickly from the system. True False
False
Data that is collected, stored, and analyzed in data mining is often private and personal. There is no way to maintain individuals' privacy other than being very careful about physical data security. True False
False
Data warehouse administrators (DWAs) do not need strong business insight since they only handle the technical aspect of the infrastructure. True False
False
For best results when deploying visual analytics environments, focus only on power users and management to get the best return on your investment True False
False
In the Cabela's case study, the SAS/Teradata solution enabled the direct marketer to better identify likely customers and market to them based mostly on external data sources. True False
False
In the Dallas Cowboys case study, the focus was on using data analytics to decide which players would play every week. True False
False
In the Isle of Capri case, the only capability added by the new software was increased speed of processing reports True False
False
In the Memphis Police Department case study, predictive analytics helped to identify the best schedule for officers in order to pay the least overtime. True False
False
In the cancer research case study, data mining algorithms that predict cancer survivability with high predictive power are good replacements for medical professionals. True False
False
Large companies, especially those with revenue upwards of $500 million, consistently reap substantial cost savings through the use of hosted data warehouses. True False
False
Market basket analysis is a useful and entertaining way to explain data mining to a technologically less savvy audience, but it has little business significance. True False
False
Moving the data into a data warehouse is usually the easiest part of its creation True False
False
OLTP systems are designed to handle ad hoc analysis and complex queries that deal with many data items. True False
False
Organizations seldom devote a lot of effort to creating metadata because it is not important for the effective use of data warehouses. True False
False
Ratio data is a type of categorical data True False
False
Statistics and data mining both look for data sets that are as large as possible. True False
False
Subject oriented databases for data warehousing are organized by detailed subjects such as disk drives, computers, and networks. True False
False
The BPM development cycle is essentially a one-shot process where the requirement is to get it right the first time True False
False
The balanced scorecard is a type of report that is based solely on financial metrics True False
False
The dashboard for the WebFOCUS BI platform in the Travel and Transport case study required client side software to operate True False
False
The data storage component of a business reporting system builds the various reports and hosts them for, or disseminates them to users. It also provides notification, annotation, collaboration, and other services True False
False
The entire focus of the predictive analytics systems in the Infinity P&C case was on detecting and handling fraudulent claims for the company's benefit. True False
False
When telling a story during a presentation, it is best to avoid describing hurdles that your character must overcome, to avoid souring the mood True False
False
When training a data mining model, the testing dataset is always larger than the training dataset. True False
False
With the balanced scorecard approach, the entire focus is on measuring and managing specific financial goals based on the organization's strategy True False
False
The best key performance indicators are derived independently from the company's strategic goals to enable developers to "think outside of the box." True False
False
Which type of visualization tool can be very helpful when a data set contains location data? A) bar chart B) geographic map C) highlight table D) tree map
Geographic Map
Which of the following is LEAST related to data/information visualization? A) information graphics B) scientific visualization C) statistical graphics D) graphic artwork
Graphic Artwork
How does the use of cloud computing affect the scalability of a data warehouse? A) Cloud computing vendors bring as much hardware as needed to users' offices. B) Hardware resources are dynamically allocated as use increases. C) Cloud vendors are mostly based overseas where the cost of labor is low. D) Cloud computing has little effect on a data warehouse's scalability
Hardware resources are dynamically allocated as use increases
Which data warehouse architecture uses a normalized relational warehouse that feeds multiple data marts? A) independent data marts architecture B) centralized data warehouse architecture C) hub-and-spoke data warehouse architecture D) federated architecture
Hub-and-spoke data warehouse architecture
Data warehouses provide direct and indirect benefits to using organizations. Which of the following is an indirect benefit of data warehouses? A) better and more timely information B) extensive new analyses performed by users C) simplified access to data D) improved customer service
Improved Customer Service
Which kind of data warehouse is created separately from the enterprise data warehouse by a department and not reliant on it for updates? A) sectional data mart B) public data mart C) independent data mart D) volatile data mart
Independent Data Mart
Identifying and preventing incorrect claim payments and fraudulent activities falls under which type of data mining applications? A) insurance B) retailing and logistics C) customer relationship management D) computer hardware and software
Insurance
All of the following are true about in-database processing technology EXCEPT A) it pushes the algorithms to where the data is. B) it makes the response to queries much faster than conventional databases. C) it is often used for apps like credit card fraud detection and investment risk management. D) it is the same as in-memory storage technology.
It is the same as in-memory storage technology
Which of the following statements is more descriptive of active data warehouses in contrast with traditional data warehouses? A) strategic decisions whose impacts are hard to measure B) detailed data available for strategic use only C) large numbers of users, including operational staffs D) restrictive reporting with daily and weekly data currency
Large numbers of users, including operational staffs
What is the management feature of a dashboard? A) operational data that identify what actions to take to resolve a problem B) summarized dimensional data to analyze the root cause of problems C) summarized dimensional data to monitor key performance metrics D) graphical, abstracted data to monitor key performance metrics
Operational data that identify what actions to take to resolve a problem
Which of the following BEST enables a data warehouse to handle complex queries and scale up to handle many more requests? A) use of the web by users as a front-end B) parallel processing C) Microsoft Windows D) a larger IT staff
Parallel Processing
Which of the following online analytical processing (OLAP) technologies does NOT require the precomputation and storage of information? a. MOLAP b. SQL c. ROLAP d. HOLAP
ROLAP
Prediction problems where the variables have numeric values are most accurately defined as A) classifications. B) regressions. C) associations. D) computations.
Regressions
Active data warehousing can be used to support the highest level of decision making sophistication and power. The major feature that enables this in relation to handling the data is A) country of (data) origin. B) nature of the data. C) speed of data transfer. D) source of the data.
Speed of Data Transfer
When representing data in a data warehouse, using several dimension tables that are each connected only to a fact table means you are using which warehouse structure? A) star schema B) snowflake schema C) relational schema D) dimensional schema
Star Schema
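A minimal sketch of the star schema described above, using Python's built-in sqlite3 module. The table names, columns, and rows are hypothetical. Each dimension table connects only to the central fact table, and a query joins the fact table to whichever dimension it needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two dimension tables, each linked only to the fact table
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

-- Central fact table holding the measures and the dimension keys
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    amount     REAL
);

INSERT INTO dim_product VALUES (1, 'Tent'), (2, 'Boots');
INSERT INTO dim_store   VALUES (10, 'Denver'), (20, 'Omaha');
INSERT INTO fact_sales  VALUES (1, 10, 200.0), (2, 10, 90.0), (1, 20, 150.0);
""")

# Total sales per product: one join from the fact table to one dimension
rows = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```

In a snowflake schema, by contrast, a dimension such as dim_store would itself be split into further linked tables (for example, a separate city or region table).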
Operational or transaction databases are product oriented, handling transactions that update the database. In contrast, data warehouses are A) subject-oriented and nonvolatile. B) product-oriented and nonvolatile. C) product-oriented and volatile. D) subject-oriented and volatile
Subject Oriented and Nonvolatile
One way an operational data store differs from a data warehouse is the recency of their data. True False
TRUE
The hub-and-spoke data warehouse model uses a centralized warehouse feeding dependent data marts. True False
TRUE
In the Target case study, why did Target send maternity ads to a teen? A) Target's analytic model confused her with an older woman with a similar name. B) Target was sending ads to all women in a particular neighborhood. C) Target's analytic model suggested she was pregnant based on her buying habits. D) Target was using a special promotion that targeted all teens in her geographical area
Target's analytic model suggested she was pregnant based on her buying habits
All of the following are true about external reports between businesses and the government EXCEPT A) they can include tax and compliance reporting. B) they can be filed nationally or internationally. C) they are standardized for the most part to reduce the regulatory burden. D) their primary focus is government.
Their primary focus is government
A Web client that connects to a Web server, which is in turn connected to a BI application server, is reflective of a A) one tier architecture. B) two tier architecture. C) three tier architecture. D) four tier architecture.
Three tier architecture
Because of performance and data quality issues, most experts agree that the federated architecture should supplement data warehouses, not replace them. True False
True
During classification in data mining, a false positive is an occurrence classified as true by the algorithm while being false in reality True False
True
Google Maps has set new standards for data visualization with its intuitive Web mapping software True False
True
If using a mining analogy, "knowledge mining" would be a more appropriate term than "data mining." True False
True
In data mining, classification models help in prediction True False
True
In the 2degrees case study, the main effectiveness of the new analytics system was in dissuading potential churners from leaving the company True False
True
In the FEMA case study, the BureauNet software was the primary reason behind the increased speed and relevance of the reports FEMA employees received True False
True
Information density is a key characteristic of performance dashboards. True False
True
One comparison typically made when data is presented in business intelligence systems is a comparison against historical values. True False
True
PCs and, increasingly, mobile devices are the most common means of providing managers with information to directly support decision making, instead of using IT staff intermediaries True False
True
The "islands of data" problem in the 1980s describes the phenomenon of unconnected data being stored in numerous locations within an organization. True False
True
The WebFOCUS BI platform in the Travel and Transport case study decreased clients' reliance on the IT function when seeking system reports. True False
True
The cost of data storage has plummeted recently, making data mining feasible for more firms. True False
True
The data warehousing maturity model consists of six stages: prenatal, infant, child, teenager, adult, and sage True False
True
The main difference between service level agreements and key performance indicators is the audience. True False
True
The number of users of free/open source data mining software now exceeds that of users of commercial software versions. True False
True
Using data mining on data about imports and exports can help to detect tax avoidance and money laundering True False
True
Visualization differs from traditional charts and graphs in complexity of data sets and use of multiple dimensions and measures. True False
True
When a problem has many attributes that impact the classification of different patterns, decision trees may be a useful approach. True False
True
With key performance indicators, driver KPIs have a significant effect on outcome KPIs, but the reverse is not necessarily true True False
True
Without middleware, different BI programs cannot easily connect to the data warehouse. True False
True
In the Starwood Hotels case, up-to-date data and faster reporting helped hotel managers better manage their occupancy rates. True False
True
Contextual metadata for a dashboard includes all the following EXCEPT A) whether any high-value transactions that would skew the overall trends were rejected as a part of the loading process. B) which operating system is running the dashboard server software. C) whether the dashboard is presenting "fresh" or "stale" information. D) when the data warehouse was last refreshed
Which operating system is running the dashboard server software
Which type of question does visual analytics seek to answer? a. What is happening today? b. Why did it happen? c. What happened yesterday? d. When did it happen?
Why did it happen?
Which type of question does visual analytics seek to answer? A) Why did it happen? B) What happened yesterday? C) What is happening today? D) When did it happen?
Why did it happen?
All of the following may be viewed as decision support systems EXCEPT a. a knowledge management system to guide decision makers. b. a retail sales system that processes customer sales transactions. c. a system that helps to manage the organization's supply chain management. d. an expert system to diagnose a medical condition
a retail sales system that processes customer sales transactions
When you tell a story in a presentation, all of the following are true EXCEPT A) a story should make sense and order out of a lot of background noise. B) a well-told story should have no need for subsequent discussion. C) stories and their lessons should be easy to remember. D) the outcome and reasons for it should be clear at the end of your story
a well-told story should have no need for subsequent discussion
In data mining, finding an affinity of two products to be commonly together in a shopping cart is known as a. cluster analysis b. artificial neural networks c. decision trees d. association rule mining
association rule mining
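To make the shopping-cart affinity idea above concrete, here is a small sketch that computes support and confidence, two common association rule measures, on a made-up set of baskets. The items, baskets, and function names are hypothetical, and this is only the scoring step, not a full rule-mining algorithm such as Apriori.

```python
# Hypothetical market baskets (each set is one shopping cart)
BASKETS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that
    also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

# Rule under test: {diapers} -> {beer}
rule_support = support({"diapers", "beer"}, BASKETS)
rule_confidence = confidence({"diapers"}, {"beer"}, BASKETS)
```

With these four baskets, diapers and beer appear together in three of them, and every basket with diapers also has beer.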
Why is the customer perspective important in the balanced scorecard methodology? A) because dissatisfied customers will eventually hurt the bottom line B) because customers should always be included in any design methodology C) because customers understand best how the firm's internal processes should work D) because companies need customer input into the design of the balanced scorecard
because dissatisfied customers will eventually hurt the bottom line
What is the main reason parallel processing is sometimes used for data mining? a. because the hardware exists in most organizations and it is available to use b. because most of the algorithms used for data mining require it c. because any strategic application requires parallel processing d. because of the massive data amounts and search efforts involved.
because of the massive data amounts and search efforts involved.
All of the following statements about data mining are true EXCEPT a. understanding the business goal is critical b. building the model takes the most time and effort. c. data is typically preprocessed and/or cleaned before use. d. understanding the data, e.g., the relevant variables, is critical to success
building the model takes the most time and effort
Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes? a. associations b. visualization c. clustering d. classification
classification
Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features? a. visualization b. classification c. associations d. clustering
clustering
Business Intelligence (BI) can be characterized as a transformation of a. data to information to decisions to actions b. data to processing to information to actions c. big data to data to information to decisions d. actions to decisions to feedback to information
data to information to decisions to actions
Which of the following activities permeates nearly all managerial activity? a. controlling b. decision-making c. planning d. directing
decision-making
For those executives who do not have the time to go through lengthy reports, the best alternative is the a. raw data that informed the report b. executive summary c. last page of the report d. charts in the report
executive summary
Which data warehouse architecture uses metadata from existing data warehouses to create a hybrid logical data warehouse comprised of data from the other warehouses? a. hub-and-spoke data warehouse architecture b. centralized data warehouse architecture c. independent data marts architecture d. federated architecture
federated architecture
Which of the following is NOT an example that falls within the four major categories of business environment factors for today's organizations? a. increased pool of customers b. globalization c. fewer government regulations d. increased competition
fewer government regulations
All of the following statements about metadata are true EXCEPT a. for most organizations, data warehouse metadata are an unnecessary expense b. there may be ethical issues involved in the creation of metadata c. metadata gives context to reported data d. metadata helps to describe the meaning and structure of data
for most organizations, data warehouse metadata are an unnecessary expense
In answering the question "Which customers are likely to be using fake credit cards?", you are most likely to use which of the following analytic applications? a. customer profitability b. fraud detection c. customer segmentation d. channel optimization
fraud detection
Which type of visualization tool can be very helpful when a data set contains location data? a. highlight table b. bar chart c. geographic map d. tree map
geographic map
All of the following are benefits of hosted data warehouses EXCEPT a. better quality hardware b. greater control of data c. frees up in-house systems d. smaller upfront investment
greater control of data
What does the scalability of a data mining method refer to? A) its ability to predict the outcome of a previously unknown data set accurately B) its speed of computation and computational costs in using the model C) its ability to construct a prediction model efficiently given a large amount of data D) its ability to overcome noisy data to make somewhat accurate predictions
its ability to construct a prediction model efficiently given a large amount of data
What does the robustness of a data mining method refer to? A) its ability to predict the outcome of a previously unknown data set accurately B) its speed of computation and computational costs in using the model C) its ability to construct a prediction model efficiently given a large amount of data D) its ability to overcome noisy data to make somewhat accurate predictions
its ability to overcome noisy data to make somewhat accurate predictions
Which of the following statements is more descriptive of active data warehouses in contrast with traditional data warehouses? a. restrictive reporting with daily and weekly data currency b. large numbers of users, including operational staffs c. detailed data available for strategic use only d. strategic decisions whose impacts are hard to measure
large numbers of users, including operational staff
The Internet emerged as a new medium for visualization and brought all the following EXCEPT a. worldwide digital distribution of visualization b. new graphics displays through PC displays c. new forms of computation of business logic d. immersive environments for consuming data
new forms of computation of business logic
The data field "ethnic group" can be best described as A) nominal data. B) interval data. C) ordinal data. D) ratio data
nominal data
In the Magpie Sensing case study, the automated collection of temperature and humidity data on shipped goods helped with various types of analytics. Which of the following is an example of prescriptive analytics? a. location of the shipment b. warning of an open shipment seal c. optimal temperature setting d. real time reports of the shipment's temperature
optimal temperature setting
Which of the following BEST enables a data warehouse to handle complex queries and scale up to handle many more requests? a. parallel processing b. Microsoft Windows c. use of the web by users as a front-end d. a larger IT staff
parallel processing
Which type of visualization tool can be very helpful when the intention is to show relative proportions of dollars per department allocated by a university administration? a. pie chart b. heat map c. bubble chart d. bullet
pie chart
In answering the question "Which customers are most likely to click on my online ads and purchase my goods?", you are most likely to use which of the following analytic applications? a. customer attrition b. channel optimization c. customer profitability d. propensity to buy
propensity to buy
The data field "salary" can be best described as A) nominal data. B) interval data. C) ordinal data. D) ratio data
ratio data
Third party providers of publicly available datasets protect the anonymity of the individuals in the data set primarily by A) asking data users to use the data ethically. B) leaving in identifiers (e.g., name), but changing other variables. C) removing identifiers such as names and social security numbers. D) letting individuals in the data know their data is being accessed.
removing identifiers such as names and social security numbers
All of the following statements about balanced scorecards and dashboards are true EXCEPT a. scorecards are less preferred at operational and tactical levels b. scorecards are preferred for tracking the achievement of strategic goals c. dashboards would be the preferred choice to monitor production quality. d. scorecards are best for real-time tracking of a marketing campaign.
scorecards are best for real-time tracking of a marketing campaign
All of the following statements about data mining are true EXCEPT a. the novel aspect means that previously unknown patterns are discovered b. the valid aspect means that the discovered patterns should hold true on new data c. the potentially useful aspect means that results should lead to some business benefit d. the process aspect means that data mining should be a one-step process to results
the process aspect means that data mining should be a one-step process to results.
Big Data often involves a form of distributed storage and processing using Hadoop and MapReduce. One reason for this is a. the processing power needed for the centralized model would overload a single computer b. Big Data systems have to match the geographical spread of social media c. the "Big" in Big Data necessitates over 10,000 processing nodes d. centralized storage creates too many vulnerabilities.
the processing power needed for the centralized model would overload a single computer
In estimating the accuracy of data mining (or other) classification models, the true positive rate is A) the ratio of correctly classified positives divided by the total positive count. B) the ratio of correctly classified negatives divided by the total negative count. C) the ratio of correctly classified positives divided by the sum of correctly classified positives and incorrectly classified positives. D) the ratio of correctly classified positives divided by the sum of correctly classified positives and incorrectly classified negatives
the ratio of correctly classified positives divided by the total positive count
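The true positive rate above can be worked through with a small confusion-matrix sketch. The labels below are made up for illustration, and the function name is hypothetical. The total positive count is the number of actual positives, that is, true positives plus false negatives (positives the model missed).

```python
def confusion_counts(actual, predicted):
    """Count TP, FN, FP, TN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    return tp, fn, fp, tn

# Hypothetical ground truth and model output
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fn, fp, tn = confusion_counts(actual, predicted)

true_positive_rate = tp / (tp + fn)   # correctly classified positives / all positives
true_negative_rate = tn / (tn + fp)   # correctly classified negatives / all negatives
false_positive_rate = fp / (fp + tn)  # negatives wrongly classified as positive
```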
Dashboards can be presented at all the following levels EXCEPT A) the visual dashboard level. B) the static report level. C) the visual cube level. D) the self-service cube level
the visual cube level
Benefits of the latest visual analytics tools, such as SAS Visual Analytics, include all of the following EXCEPT a. there is less demand on IT departments for reports b. mobile platforms such as the iPhone are supported by these products c. it is easier to spot useful patterns and trends in the data d. they explore massive amounts of data in hours, not days
they explore massive amounts of data in hours, not days