ISDS 415
Use Excel to calculate the standard deviation for Y.
=std.dev.s
Today, many vendors offer diversified tools, some of which are completely preprogrammed (called shells). How are these shells utilized?
All a user needs to do is insert the numbers.
This measure of central tendency is the sum of all the values/observations divided by the number of observations in the data set.
Arithmetic mean
What is the main reason parallel processing is sometimes used for data mining?
because of the massive data amounts and search efforts involved
Which characteristic of data requires that the variables and data values be defined at the lowest (or as low as required) level of detail for the intended use of the data?
data granularity
All of the following statements about data mining are true EXCEPT
he process aspect means that data mining should be a one-step process to results.
Big Data often involves a form of distributed storage and processing using Hadoop and MapReduce. One reason for this is
he processing power needed for the centralized model would overload a single computer
Which type of visualization tool can be very helpful when the intention is to show relative proportions of dollars per department allocated by a university administration?
pie chart
What type of analytics seeks to determine what is likely to happen in the future?
predictive
What type of analytics seeks to recognize what is going on as well as the likely forecast and make decisions to achieve the best performance possible?
prescriptive
The competitive imperatives for BI include all of the following EXCEPT
right user
What has caused the growth of the demand for instant, on-demand access to dispersed information?
the more pressing need to close the gap between the operational data and strategic objectives
In estimating the accuracy of data mining (or other) classification models, the true positive rate is
the ratio of correctly classified positives divided by the total positive count
In the Opening Vignette on Sports Analytics, what was adjusted to drive one-time ticket sales?
ticket prices
Regression Models of _________________ data focus on predicting the future
time series
A light bulb manufacturer uses descriptive analytics
to present supply chain to managers visually.
Which of the following is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies?
BI
This plot is a graphical illustration of several descriptive statistics about a given data set.
Box-and-whiskers
This plot is a graphical illustration of several descriptive statistics about a given data set.
Box_and_whiskers plot
Organizations using BI systems are typically____ seeking to Answer the gap between the operational data and strategic objectives has become more pressing.
Close
Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features?
Clustering
Data mining can be very useful in detecting patterns such as credit card fraud, but is of little help in improving sales.
False
Data source reliability means that data are correct and are a good match for the analytics problem.
False
Due to industry consolidation, the analytics ecosystem consists of only a handful of players across several functional areas.
False
In the Miami-Dade Police Department case study, predictive analytics helped to identify the best schedule for officers in order to pay the least overtime
False
Nominal data represent the labels of multiple classes used to divide a variable into specific groups.
False
Open-source data mining tools include applications such as IBM SPSS Modeler and Dell Statistica.
False
Statistics and data mining both look for data sets that are as large as possible.
False
Which type of visualization tool can be very helpful when a data set contains location data?
Geographic Map
How are enterprise resources planning (ERP) systems related to supply chain management (SCM) systems?
How are enterprise resources planning (ERP) systems related to supply chain management (SCM) systems?
Identifying and preventing incorrect claim payments and fraudulent activities falls under which type of data mining applications?
Insurance
What is the management feature of a dashboard?
Operational data that identify what actions to take to resolve a problem.
Which of the following statements about Big Data is true?
Pure Big Data systems do not involve fault tolerance
Due to the expansion of information technology coupled with the need for improved competitiveness in business, there has been an increase in the use of computing power to produce unified reports that join different views of the enterprise in one place.
Rapid
Describe the difference between simple and multiple regression.
Simple Regression requires only 1 variable to predict the outcome or value. Multiple Regression uses two variables or more in order to predict the outcome or value.
Benefits of the latest visual analytics tools, such as SAS Visual Analytics, include all of the following EXCEPT
They explore massive data in hours not days
The cost of data storage has plummeted recently, making data mining feasible for more firms.
True
There are basic chart types and specialized chart types. A Gantt chart is a specialized chart type.
True
Using data mining on data about imports and exports can help to detect tax avoidance and money laundering
True
Visualization differs from traditional charts and graphs in complexity of data sets and use of multiple dimensions and measures.
True
Which type of question does visual analytics seeks to answer?
Why is it happening?
Which characteristic of data means that all the required data elements are included in the data set?
Data Richness
Which of the following is a data mining myth?
Data mining requires a separate, dedicated database
Business applications have moved from transaction processing and monitoring to other activities. Which of the following is NOT one of those activities?
Data monitoring
A(n) ______Answer is a major component of a Business Intelligence (BI) system that holds source data.
Data warehouse
How does Amazon.com use predictive analytics to respond to product searches by the customer?
Depending on the Search Amazon is able to identify possible items that may suit similar needs or are used as accessories. By using historical data the algorithms employed by amazon can look at the customers search historically and observe trends or relations among that search and purchasing behavior. Perhaps individuals who buy a certain item are more susceptible to certain kinds of product placement. This helps Amazon deliver the right ads to the right customer. Furthermore, by using geographic information Amazon can focus on ensuring steady supplies to regions based on time of the year and demographics. For instance in rural areas near lakes and rivers, nearby distribution centers should have more survival/ camping equipment stocked and ready for spring and summer when the hunting season beings and more people are likely to go out into the woods. Locations in cities or near the beach would require much different products to be placed as extra inventory.
Business intelligence (BI) is a specific term that describes architectures and tools only.
False
Dashboards provide visual displays of important information that is consolidated and arranged across several screens to maintain data order.
False
Data mining requires specialized data analysts to ask ad hoc questions and obtain answers quickly from the system.
False
Data that is collected, stored, and analyzed in data mining is often private and personal. There is no way to maintain individuals' privacy other than being very careful about physical data security.
False
Demands for instant, on-demand access to dispersed information decrease as firms successfully integrate BI into their operations.
False
The data storage component of a business reporting system builds the various reports and hosts them for, or disseminates them to users. It also provides notification, annotation, collaboration, and other services.
False
Visual analytics is aimed at answering, "What is it happening?" and is usually associated with business analytics.
False
Key performance indicators (KPIs) are metrics typically used to measure
Internal Results
The programing algorithm developed by Google to handle Big Data computational challenges is known as
Mapreduce
When you tell a story in a presentation, all of the following are true EXCEPT
No need for Subsequent Discussion Should BE , easy to remember lesson, clear outcome and reason, makes sense and orders out background noise
A regression model that involves a single independent variable is called
Simple Regression
Data accessibility means that the data are easily and readily obtainable.
True
Data is the main ingredient for any BI, data science, and business analytics initiative.
True
Descriptive statistics is all about describing the sample data on hand.
True
If using a mining analogy, "knowledge mining" would be a more appropriate term than "data mining."
True
Interval data are variables that can be measured on interval scales
True
Structured data is what data mining algorithms use and can be classified as categorical or numeric.
True
When a problem has many attributes that impact the classification of different patterns, decision trees may be a useful approach.
True
A(n) ______Answer is a major component of a Business Intelligence (BI) system that is often browser based and often presents a portal or dashboard.
User interface
Which type of question does visual analytics seeks to answer?
Why did It Happen
Online transaction processing (OLTP) systems handle a company's routine ongoing business. In contrast, a data warehouse is typically
a distinct system that provides storage for data that will be made use of in analysis.
BI applications must be integrated with
all of these
The user interface of a BI system is often referred to as a(n) Answer
dashboard
Business applications have moved from transaction processing and monitoring to other activities. Which of the following is NOT one of those activities?
data monitoring
A data mining study is specific to addressing a well-defined business task, and different business tasks require
different sets of data
Online transaction processing (OLTP) systems handle a company's routine ongoing business. In contrast, a data warehouse is typically
distinct system that provides storage for data that will be made use of in analysis
The very design that makes an OLTP system efficient for transaction processing makes it inefficient for
end-user ad hoc reports, queries, and analysis
What is the fundamental challenge of dashboard design
ensuring required information is shown clearly on a single screen
Data generation is a precursor, and is not included in the analytics ecosystem.
false
In the Opening Vignette on Sports Analytics, what type of modeling was used to predict offensive tactics?
heat maps
Benefits of the latest visual analytics tools, such as SAS Visual Analytics, include all of the following EXCEPT
hey explore massive amounts of data in hours, not days.
What does the scalability of a data mining method refer to?
its ability to construct a prediction model efficiently given a large amount of data
What does the robustness of a data mining method refer to?
its ability to overcome noisy data to make somewhat accurate predictions
Which of the following developments is NOT contributing to facilitating growth of decision support and analytics?
locally concentrated workforces
Data generation is a precursor, and is not included in the analytics ecosystem.
False
This measure of dispersion is calculated by simply taking the square root of the variations.
Standard deviation
Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes
classification