BI Exam
What is Six Sigma?
A project based approach for improving effectiveness and efficiency
Today, many vendors offer diversified tools, some of which are completely preprogrammed (called shells). How are these shells utilized?
All a user needs to do is insert the numbers
Which of the following is NOT an assumption used by an LP allocation problem?
All data are unknown with decision making under uncertainty. Total returns cannot be compared
A newly popular unit of data in the Big Data era is the petabyte (PB) which is
10^15 bytes.
In what decade did disjointed information systems begin to be integrated?
1980s
Relational databases began to be used in the
1980s
What is Big Data's relationship to the cloud?
Amazon and Google have working Hadoop cloud offerings
In sentiment analysis, which of the following is an implicit opinion?
The customer service I got for my TV was laughable
In the opening vignette, the architectural system that supported Watson used all the following elements EXCEPT
a core engine that could operate seamlessly in another domain without changes
In a Hadoop "stack," what is a slave node?
a node where data is stored and processed
When you tell a story in a presentation, all of the following are true EXCEPT
a well-told story should have no need for subsequent discussion
This measure of central tendency is the sum of all the values/observations divided by the number of observations in the data set.
arithmetic mean
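The definition in this card can be sketched in a few lines of Python. The function name and the sample values are illustrative assumptions, not part of the exam material.

```python
def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


sample = [4, 8, 15, 16, 23, 42]  # made-up observations
print(arithmetic_mean(sample))   # 108 / 6 = 18.0
```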
Why is a performance management system superior to a performance measurement system?
because measurement alone has little use without action
What is the main reason parallel processing is sometimes used for data mining?
because of the massive data amounts and search efforts involved
This plot is a graphical illustration of several descriptive statistics about a given data set.
box-and-whiskers plot
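The descriptive statistics a box-and-whiskers plot usually draws are the minimum, first quartile, median, third quartile, and maximum. The sketch below computes them with the standard library's `statistics.quantiles` (Python 3.8+). The helper name is hypothetical, and the "inclusive" quartile method is an assumption; other tools may use slightly different quartile conventions.

```python
import statistics


def five_number_summary(values):
    """Return the five statistics typically drawn in a box-and-whiskers plot."""
    data = sorted(values)
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```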
Which kind of chart is described as an enhanced version of a scatter plot?
bubble chart
Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features?
clustering
How are enterprise resource planning (ERP) systems related to supply chain management (SCM) systems?
complementary systems
If a simulation result does NOT match the intuition or judgment of the decision maker, what can occur?
confidence gap
Which characteristic of data requires that the variables and data values be defined at the lowest (or as low as required) level of detail for the intended use of the data?
data granularity
A data mining study is specific to addressing a well-defined business task, and different business tasks require
different sets of data
This method calculates the values of the inputs necessary to achieve a desired level of an output.
goal seek
This method calculates the values of the inputs necessary to generate a zero profit outcome.
goal seek
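Goal seek as described in the cards above works backward from a desired output to the input that produces it. A minimal sketch, assuming a continuous single-input model and using bisection as the search method (spreadsheet tools may use a different algorithm). The `profit` model and all of its numbers are hypothetical.

```python
def goal_seek(f, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] such that f(x) is close to target, by bisection.

    Assumes f is continuous and that f(lo) - target and f(hi) - target
    have opposite signs (the target is bracketed).
    """
    g_lo = f(lo) - target
    g_hi = f(hi) - target
    if g_lo * g_hi > 0:
        raise ValueError("target is not bracketed by lo and hi")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        g_mid = f(mid) - target
        if abs(g_mid) < tol or (hi - lo) / 2 < tol:
            return mid
        if g_lo * g_mid <= 0:
            hi = mid                 # solution lies in the lower half
        else:
            lo, g_lo = mid, g_mid    # solution lies in the upper half
    return (lo + hi) / 2


def profit(units, price=25.0, unit_cost=15.0, fixed_cost=5000.0):
    """Hypothetical profit model: contribution margin minus fixed cost."""
    return units * (price - unit_cost) - fixed_cost


# Zero-profit (break-even) case: which sales volume gives profit == 0?
break_even = goal_seek(profit, target=0.0, lo=0, hi=10_000)
print(round(break_even, 3))  # 500.0, since 500 * 10 = 5000
```

Setting `target` to any other profit level gives the general "desired level of an output" case.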
All of the following are benefits of hosted data warehouses EXCEPT
greater control of data
A(n) ________ is a graphical representation of a model.
influence diagram
Key performance indicators (KPIs) are metrics typically used to measure
internal results
What does the scalability of a data mining method refer to?
its ability to construct a prediction model efficiently given a large amount of data
What does the robustness of a data mining method refer to?
its ability to overcome noisy data to make somewhat accurate predictions
Intermediate result variables reflect intermediate outcomes in
mathematical models
Oper marts are created when operational data needs to be analyzed
multidimensionally
The Internet emerged as a new medium for visualization and brought all the following EXCEPT
new forms of computation of business logic
The data field "ethnic group" can be best described as
nominal data
What is the management feature of a dashboard?
operational data that identify what actions to take to resolve a problem
A Web client that connects to a Web server, which is in turn connected to a BI application server, is reflective of a
three-tier architecture
The need for more versatile reporting than what was available in 1980s era ERP systems led to the development of what type of system?
Executive information systems
How does the use of cloud computing affect the scalability of a data warehouse?
Hardware resources are dynamically allocated as use increases
How does Hadoop work?
It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
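The split-process-combine idea in this answer can be illustrated on one machine. This is a toy sketch, not Hadoop's API: the function names are hypothetical, threads stand in for separate computers, and a simple word count stands in for the analysis.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(lines, n_parts):
    """Break the input into roughly equal parts (a stand-in for data blocks)."""
    return [lines[i::n_parts] for i in range(n_parts)]


def map_part(part):
    """Map step: count words in one part, independently of the other parts."""
    counts = Counter()
    for line in part:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-part counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, n_parts=3):
    parts = split_into_parts(lines, n_parts)
    # Each part is handled by its own worker, so parts are processed concurrently.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(map_part, parts))
    return reduce_counts(partials)


docs = ["big data is big", "data is everywhere", "big ideas"]
print(word_count(docs))
```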
What does advanced analytics for social media do?
It examines the content of online conversations.
All of the following statements about MapReduce are true EXCEPT
MapReduce runs without fault tolerance
Which of the following statements about Big Data is true?
Pure Big Data systems do not involve fault tolerance
Which of the following sources is likely to produce Big Data the fastest?
RFID tags
Which of the following is NOT a disadvantage of a simulation?
Simulation is often the only DSS modeling method that can readily handle relatively unstructured problems.
Which of the following is NOT a characteristic displayed by an LP allocation problem?
The problem is not bound by constraints
Which of the following is NOT a characteristic displayed by an LP allocation problem?
There is a single way in which the resources can be used.
What do voice of the market (VOM) applications of sentiment analysis do?
They examine customer sentiment at the aggregate level
What is one major way in which Web-based social media differs from traditional publishing media?
They have different costs to own and operate
Search engine optimization (SEO) is a means by which
Web site developers can increase Web site search rankings
Web site usability may be rated poor if
Web site visitors download few of your offered PDFs and videos.
Which of the following is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies?
BI
Natural language processing (NLP) is associated with which of the following areas?
all of these
What does Web content mining involve?
analyzing the unstructured content of Web pages
In text mining, tokenizing is the process of
categorizing a block of text in a sentence
When the decision maker knows exactly what the outcome of each course of action will be, this is decision making under
certainty
Which of the following is NOT a component of a quantitative model?
classes
Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes?
classification
A more general form of an influence diagram is called a(n)
cognitive map
This technique makes no a priori assumption of whether one variable is dependent on the other(s) and is not concerned with the relationship between variables; instead it gives an estimate on the degree of association between the variables.
correlation
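One common estimate of the degree of association is the Pearson correlation coefficient. The card does not name a specific coefficient, so Pearson's r is an assumption here, as are the function name and sample data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: +1 perfect positive, -1 perfect negative, 0 no linear association."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Neither variable is treated as dependent: swapping the arguments gives the same value.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```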
In a network analysis, what connects nodes?
edges
The very design that makes an OLTP system efficient for transaction processing makes it inefficient for
end-user ad hoc reports, queries, and analysis
What is the fundamental challenge of dashboard design?
ensuring that the required information is shown clearly on a single screen
Which data warehouse architecture uses metadata from existing data warehouses to create a hybrid logical data warehouse comprised of data from the other warehouses?
federated architecture
Which of the following is LEAST related to data/information visualization?
graphic artwork
The most common method for solving a risk analysis problem is to select the alternative with the
greatest expected value
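Selecting the alternative with the greatest expected value can be sketched as follows. Each alternative is a list of (probability, payoff) pairs; the alternative names and all numbers are made up for illustration.

```python
def expected_value(outcomes):
    """Probability-weighted sum of payoffs for one alternative."""
    return sum(prob * payoff for prob, payoff in outcomes)


def best_alternative(alternatives):
    """Return the name of the alternative with the greatest expected value."""
    return max(alternatives, key=lambda name: expected_value(alternatives[name]))


# Hypothetical payoffs (in %) under three states of nature with probabilities 0.5, 0.3, 0.2
alternatives = {
    "bonds":  [(0.5, 12.0), (0.3, 6.0), (0.2, 3.0)],   # EV = 6.0 + 1.8 + 0.6 = 8.4
    "stocks": [(0.5, 15.0), (0.3, 3.0), (0.2, -2.0)],  # EV = 7.5 + 0.9 - 0.4 = 8.0
    "cds":    [(1.0, 6.5)],                            # EV = 6.5
}
print(best_alternative(alternatives))  # bonds
```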
Which data warehouse architecture uses a normalized relational warehouse that feeds multiple data marts?
hub-and-spoke data warehouse architecture
Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called?
in-memory analytics
All of the following are true about in-database processing technology EXCEPT
it is the same as in-memory
Which of the following developments is NOT contributing to facilitating growth of decision support and analytics?
locally concentrated workforces
A decision tree can be cumbersome if there are
many alternatives
Breaking up a Web page into its components to identify worthy words/terms and indexing them using a set of rules is called
parsing the documents
Which type of visualization tool can be very helpful when the intention is to show relative proportions of dollars per department allocated by a university administration?
pie chart
Important spreadsheet features for modeling include all of the following EXCEPT
pivot tables
What type of analytics seeks to determine what is likely to happen in the future?
predictive
What type of analytics seeks to recognize what is going on as well as the likely forecast and make decisions to achieve the best performance possible?
prescriptive
Prediction problems where the variables have numeric values are most accurately defined as
regressions
Third party providers of publicly available data sets protect the anonymity of the individuals in the data set primarily by
removing identifiers such as names and social security numbers.
When the decision maker must consider several possible outcomes for each alternative, each with a given probability of occurrence, this is decision making under
risk
Which of the following is NOT an example of transaction processing?
sales report
In a Hadoop "stack," what node periodically replicates and stores data from the Name Node should it fail?
secondary node
In the Wimbledon case study, the tournament used data for each match in real time to highlight
significant events
Clustering partitions a collection of things into segments whose members share
similar characteristics
What types of documents are BEST suited to semantic labeling and aggregation to determine sentiment orientation?
small- to medium-sized documents
Real-time data warehousing can be used to support the highest level of decision making sophistication and power. The major feature that enables this in relation to handling the data is
speed of data transfer
This measure of dispersion is calculated by simply taking the square root of the variance.
standard deviation
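As a quick illustration, the standard deviation is the square root of the variance, which is the average squared deviation from the mean. The sketch below supports both the sample (n - 1) and population (n) denominators. The card does not say which one is meant, so that choice is left as a parameter, and the data are made up.

```python
import math


def variance(values, sample=True):
    """Average squared deviation from the mean (n - 1 denominator for a sample)."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough observations")
    mean = sum(values) / n
    squared_devs = sum((v - mean) ** 2 for v in values)
    return squared_devs / (n - 1 if sample else n)


def standard_deviation(values, sample=True):
    """Square root of the variance."""
    return math.sqrt(variance(values, sample=sample))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, squared deviations sum to 32
print(standard_deviation(data, sample=False))  # sqrt(32 / 8) = 2.0
```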
When representing data in a data warehouse, using several dimension tables that are each connected only to a fact table means you are using which warehouse structure?
star schema
What type of VIM models display a visual image of the result of one decision alternative at a time?
static
Companies with the largest revenues from Big Data tend to be
the largest computer and IT services firms
What has caused the growth of the demand for instant, on-demand access to dispersed information?
the more pressing need to close the gap between the operational data and strategic objectives
In the research literature case study, the researchers analyzing academic papers extracted information from which source?
the paper abstract
All of the following statements about data mining are true EXCEPT
the process aspect means that data mining should be a one-step process to results
Big Data often involves a form of distributed storage and processing using Hadoop and MapReduce. One reason for this is
the processing power needed for the centralized model would overload a single computer
In estimating the accuracy of data mining (or other) classification models, the true positive rate is
the ratio of correctly classified positives divided by the total positive count
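The ratio in this answer is TP / (TP + FN): correctly classified positives over all actual positives. A minimal sketch with made-up labels; the function name and the `positive` parameter are illustrative.

```python
def true_positive_rate(actual, predicted, positive=1):
    """Correctly classified positives divided by the total number of actual positives."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no actual positives in the data")
    return tp / (tp + fn)


actual    = [1, 1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 1]
print(true_positive_rate(actual, predicted))  # 3 of 4 actual positives found: 0.75
```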
Traditional data warehouses have not been able to keep up with
the variety and complexity of data
Under which of the following requirements would it be more appropriate to use Hadoop over a data warehouse?
unrestricted, ungoverned sandbox explorations
What is the Hadoop Distributed File System (HDFS) designed to handle?
unstructured and semistructured non-relational data
Sentiment analysis projects require a lexicon for use. If a project in English is undertaken, you must generally make sure to
use an English lexicon appropriate to the project at your discretion
Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?
variability
Which of the following is a data mining myth?
Data mining requires a separate, dedicated database
Which of the following is the order of simulation methodology?
Define the problem, Construct the simulation model, Test and validate the model, Design the experiment, Conduct the experiment, Evaluate the results, Implement the results.
___________ is an evolving tool space that promises real-time data integration from a variety of sources, such as relational databases, Web services, and multidimensional databases.
Enterprise information integration (EII)
Which type of question does visual analytics seek to answer?
Why did it happen?
Online transaction processing (OLTP) systems handle a company's routine ongoing business. In contrast, a data warehouse is typically
a distinct system that provides storage for data that will be made use of in analysis
A large storage location that can hold vast quantities of data (mostly unstructured) in its native/raw format for future/potential analytics consumption is referred to as a(n)
data lake
Business applications have moved from transaction processing and monitoring to other activities. Which of the following is NOT one of those activities?
data monitoring
Which characteristic of data means that all the required data elements are included in the data set?
data richness
All of the following are challenges associated with natural language processing EXCEPT
dividing up a text into individual words in English
When querying a dimensional database, a user went from summarized data to its underlying details. The function that served this purpose is
drill down
A(n) ________ spreadsheet model represents behavior over time
dynamic