Exam 1 (MKTG 6300)

The lift ratio of an association rule with a confidence value of 0.45 and in which the consequent occurs in 6 out of 10 cases is

.75
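The arithmetic behind this answer can be sketched as follows. The function name `lift_ratio` is an illustrative choice, not something from the course material; the benchmark confidence is taken to be the share of cases that contain the consequent.

```python
def lift_ratio(confidence: float, consequent_support: float) -> float:
    """Lift = rule confidence / benchmark confidence (support of the consequent)."""
    return confidence / consequent_support


# Confidence of 0.45; the consequent occurs in 6 out of 10 cases.
lift = lift_ratio(0.45, 6 / 10)
print(round(lift, 2))
```

A lift below 1 means the rule identifies the consequent less often than simply guessing it at its base rate.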

Compute the 50th percentile for the following data. 10, 15, 17, 21, 25, 12, 16, 11, 13, 22

15.5
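A quick check of this answer, assuming the 50th percentile is read as the median (with an even number of values, the average of the two middle values after sorting):

```python
import statistics

data = [10, 15, 17, 21, 25, 12, 16, 11, 13, 22]

# Sorted: 10, 11, 12, 13, 15, 16, 17, 21, 22, 25 -> middle values are 15 and 16.
median = statistics.median(data)
print(median)
```

Other percentile conventions exist, but for the 50th percentile of this data set they agree with the median.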

The College Board originally scaled SAT scores so that the scores for each section were approximately normally distributed with a mean of 500 and a standard deviation of 100. Assuming scores follow a bell-shaped distribution, use the empirical rule to find the percentage of students who scored greater than 700.

2.5%
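The reasoning can be sketched in a few lines: 700 is two standard deviations above the mean, the empirical rule puts about 95% of values within two standard deviations, and the remaining 5% is split evenly between the two tails. The comparison against an exact normal curve is an addition for context and is not part of the original question.

```python
from statistics import NormalDist

mean, sd = 500, 100
z = (700 - mean) / sd  # number of standard deviations above the mean

# Empirical rule: about 95% of values fall within 2 standard deviations.
share_within_two_sd = 0.95
empirical_upper_tail = (1 - share_within_two_sd) / 2

# For comparison only: the tail area under an exact normal curve.
exact_upper_tail = 1 - NormalDist(mean, sd).cdf(700)

print(z, empirical_upper_tail, exact_upper_tail)
```

The exact normal tail is slightly smaller (about 2.3%) because the empirical rule rounds the 2-standard-deviation coverage to 95%.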

In a survey of patients in a local hospital, 62.42% of the respondents indicated that the health care providers needed to spend more time with each patient. What is the population being studied?

All patients in a local hospital

______ are visual methods of displaying data.

Charts

To generate a scatter chart matrix, we use

Excel Add-In XLMiner.

Which of the following is true of Euclidean distances?

It is commonly used as a method of measuring dissimilarity between quantitative observations.

Which statement is true of an association rule?

It is ultimately judged on how actionable it is and how well it explains the relationship between item sets.

Which one of the following statements is not true concerning PivotTables in Excel?

PivotTables summarize only categorical and quantitative data.

_______________ analytics are techniques that use models, constructed from past data, to predict the future or to ascertain the impact of one variable on another.

Predictive

Which of the following gives the proportion of items in each bin?

Relative frequency
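As an illustration, relative frequency is each bin's count divided by the total number of items. The grade list below is made-up example data, not data from the course.

```python
from collections import Counter

# Hypothetical grades, for illustration only.
grades = ["A", "A", "B", "B", "B", "C", "C", "C", "C", "D"]

counts = Counter(grades)
relative_frequency = {grade: count / len(grades) for grade, count in counts.items()}
print(relative_frequency)
```

The relative frequencies across all bins always sum to 1.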

____________________ are used in the pharmaceutical industry to assess the risk of introducing a new drug.

Simulations

Which of the following reasons contributes to the increase in the use of data-mining techniques in business?

The ability to electronically warehouse data

The extraction of information on the number of shipments, how much was included in each shipment, the date each shipment was sent, and so on from the manufacturing plant's database exemplifies

data queries.

To identify patterns across transactions, we can use

association rules.

When a decision maker is faced with several alternatives and an uncertain set of future events, s/he uses ______ to develop an optimal strategy.

decision analysis

In order to visualize three variables in a two-dimensional graph, we use a

bubble chart.

Jaccard's coefficient is different from the matching coefficient in that the former

does not count matching zero entries while the latter does.
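A minimal sketch of the difference, assuming each observation is a list of 0/1 dummy values. The two example observations are hypothetical.

```python
def matching_coefficient(a, b):
    """Share of variables on which the two observations agree (1-1 or 0-0)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def jaccard_coefficient(a, b):
    """Like the matching coefficient, but 0-0 agreements are ignored entirely."""
    both_one = sum(x == 1 and y == 1 for x, y in zip(a, b))
    both_zero = sum(x == 0 and y == 0 for x, y in zip(a, b))
    # Undefined when every variable is a 0-0 match (the denominator would be 0).
    return both_one / (len(a) - both_zero)


# Hypothetical observations: one shared 1, three shared 0s, one mismatch.
u = [1, 0, 0, 1, 0]
v = [1, 0, 0, 0, 0]
matching = matching_coefficient(u, v)
jaccard = jaccard_coefficient(u, v)
print(matching, jaccard)
```

Here the three shared zeros raise the matching coefficient to 0.8, while Jaccard's coefficient, which drops them, is 0.5.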

In a business, the values indicating the business's current operating characteristics, such as its financial position, the inventory on hand, and customer service metrics, are typically known as

key performance indicators.

The strength of the association rule is known as ____________ and is calculated as the ratio of the confidence of an association rule to the benchmark confidence.

lift

The simplest measure of variability is the

range

Compute the relative frequencies for students who earned a C shown in the table of grades below.

0.43 (number of students who earned a C / total number of students)

The strength of a cluster can be measured by comparing the average distance in a cluster to the distance between cluster centroids. One rule of thumb is that the ratio for between-cluster distance to within-cluster distance should exceed what value for useful clusters?

1

Euclidean distance can be used to calculate the dissimilarity between two observations. Let u = (25, $350) correspond to a 25-year-old customer who spent $350 at Store A in the previous fiscal year. Let v = (53, $420) correspond to a 53-year-old customer who spent $420 at Store A in the previous fiscal year. Calculate the dissimilarity between these two observations using Euclidean distance.

75.39
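The calculation is the square root of the summed squared differences, using v = (53, $420) as defined in the question: sqrt((53 - 25)^2 + (420 - 350)^2) = sqrt(784 + 4900). A short check:

```python
import math

u = (25, 350)  # (age, dollars spent)
v = (53, 420)

distance = math.dist(u, v)  # sqrt(28**2 + 70**2)
print(round(distance, 2))
```

The spending difference contributes far more to the distance than the age difference, simply because of its larger scale.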

_______________ act(s) as a representative of the population.

A sample

Which of the following best exemplifies big data?

Cellphone owners around the world generate vast amounts of data by calling, texting, tweeting, and browsing the Web on a daily basis.

__________________ is a measure of calculating dissimilarity between clusters by considering only the two most dissimilar observations in the two clusters.

Complete linkage

____________________ are collected from several entities at the same point in time.

Cross-sectional data

A retail store owner offers a discount on product A and predicts that the customers would purchase products B and C in addition to product A. Identify the technique used to make such a prediction.

Data mining

In which of the following data-mining process steps is the data manipulated to make it suitable for formal modeling?

Data preparation

____________________ are analytical tools that describe what has happened.

Descriptive analytics

The software package most commonly used for creating simple charts is

Excel.

An increase in data ____________________ would help to protect stored data from destructive forces or unauthorized users.

security

Which one of the following is used in predictive analytics?

Linear regression

Which of the following are necessary to be determined to define the classes for a frequency distribution with quantitative data?

Number of nonoverlapping bins, width of each bin, and bin limits

In the spectrum of business analytics, which is the most complex?

Prescriptive

Which of the following analytical techniques helps us arrive at the best decision?

Prescriptive analytics

_____ merges maps and statistics to present data collected over different geographies.

The geographic information system

Which is NOT a primary option for addressing missing data?

To generate random data to replace the missing values

_______________ approaches are designed to describe patterns and relationships in large data sets with many observations of many variables.

Unsupervised learning

____________________ assigns values to outcomes based on the decision maker's attitude toward risk, loss, and other factors.

Utility theory

Hierarchical clustering using ____________ results in a sequence of aggregated clusters that minimizes the loss of information between the individual observation level and the cluster level.

Ward's method

___________________ can be used to partition observations in a manner to obtain clusters with the least amount of information loss due to the aggregation.

Ward's method

In which of the following scenarios would it be appropriate to use hierarchical clustering?

When binary or ordinal data needs to be clustered.

A data visualization tool that updates in real time and gives multiple outputs is called

a data dashboard.

The charts that are helpful in making comparisons between categorical variables are

bar charts and column charts.

A better understanding of consumer behavior through analytics directly leads to

better pricing strategies.

Data that are too large or too complex to be handled by standard data-processing techniques and typical desktop software are called _______________________.

big data

Data preparation includes all of the following except which task?

calculating the confidence ratio for all association rules

The data preparation technique used in market segmentation to divide consumers into different homogeneous groups is called

cluster analysis.

An alternative for a stacked column chart when comparing more than a couple of quantitative variables in each category is a

clustered column chart.

A PivotChart, in few instances, is the same as a

clustered-column chart.

Complete linkage can be used to measure the distance between _________ in cluster analysis.

clusters

Average linkage is a measure of calculating dissimilarity between two clusters by

computing the average distance between every pair of observations between two clusters.

Single linkage is a measure of calculating dissimilarity between clusters by

considering only the two most similar observations in the two clusters.
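Single, complete, and average linkage can all be read off the same list of pairwise distances between the two clusters. The sketch below uses two small hypothetical clusters and Euclidean distance.

```python
import math
from itertools import product

# Hypothetical clusters of 2-D observations.
cluster_a = [(1.0, 1.0), (2.0, 1.0)]
cluster_b = [(5.0, 1.0), (8.0, 1.0)]

# Distance between every observation in A and every observation in B.
pairwise = [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

single_linkage = min(pairwise)                   # two most similar observations
complete_linkage = max(pairwise)                 # two most dissimilar observations
average_linkage = sum(pairwise) / len(pairwise)  # mean over all pairs
print(single_linkage, complete_linkage, average_linkage)
```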

In preparing categorical variables for analysis, it is usually best to

convert the categories to binary, dummy variables.

The data dashboard for a marketing manager may have KPIs related to

current sales measures and sales by region.

Corporate-level managers use ______ to summarize sales by region, current inventory levels, and other company-wide metrics all in a single screen.

data dashboards

The U.S. Internal Revenue Service uses _____________ to identify patterns that distinguish questionable annual personal income tax filings.

data mining

The use of analytical techniques for better understanding patterns and relationships that exist in large data sets is ______________.

data mining

A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a

dendrogram

Data dashboards are a type of _________ analytics.

descriptive

In order to manage an organization's human resource activities, such as hiring employees, tracking, and influencing employee retention, HR personnel use

descriptive and predictive analytics.

The process of eliminating variables from formal analysis without losing any crucial information is called

dimension reduction.

A cluster's _____________ can be measured by the difference between the distance value at which a cluster is originally formed and the distance value at which it is merged with another cluster in a dendrogram.

durability

In a(n) ________________, one or more variables are identified and controlled or manipulated so that data can be obtained about how they influence the variable of interest identified first.

experimental study

You would _________________ a table if you wanted to display only data that match specific criteria.

filter

Fields may be chosen to represent all of the following except ____________ in the body of a PivotTable.

filters

A two-dimensional graph representing the data using different shades of color to indicate magnitude is called a

heat map.

The __________ the lift ratio, the ____________ the association rule.

higher; stronger

Bar charts use

horizontal bars to display the magnitude of the quantitative variable.

Deleting the grid lines in a table and the horizontal lines in a chart

increases the data-ink ratio.

Data-ink is the ink used in a table or chart that

is necessary to convey the meaning of the data to the audience.

An analysis of items frequently co-occurring in transactions is known as

market basket analysis.

When clustering only by dummy variables that represent categorical variables, the simplest measure of similarity between two observations is called the

matching coefficient.

We may replace missing values with the variable's mode, mean, or median, but only if the variable values are

missing completely at random (MCAR).

A dashboard is a collection of tables, charts, and maps to help management ____________ selected aspects of the company's performance.

monitor

Complete linkage can be used to measure the distance between clusters that are the _________________ in cluster analysis.

most different

Single linkage can be used to measure the distance between clusters that are the _______________ in cluster analysis.

most similar

The endpoint of a k-means clustering algorithm occurs when

no further changes are observed in cluster structure and number.
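The stopping rule can be sketched with a toy one-dimensional k-means loop. This is a simplified illustration under stated assumptions (fixed k, hypothetical points and starting centroids, absolute difference as the distance), not the procedure used by any particular software package.

```python
def k_means_1d(points, initial_centroids, max_iter=100):
    """Toy 1-D k-means: stop when cluster assignments no longer change."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid (ties go to the lower index).
        new_assignments = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # endpoint: the cluster structure did not change
        assignments = new_assignments
        # Recompute each centroid as the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return assignments, centroids


# Hypothetical data and starting centroids.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
assignments, centroids = k_means_1d(points, [1.0, 3.0])
print(assignments, centroids)
```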

The data collected from the customers in restaurants about the quality of food is an example of a(n)

observational study.

Euclidean distance can be used to measure the distance between ________________ in cluster analysis.

observations

A decision concerned with how the organization is run from day to day is known as a(n) _______________.

operational decision

A mathematical model that gives the best decision, subject to the situation's constraints, is a(n) _________________.

optimization model

k-means clustering is the process of

organizing observations into distinct groups based on a measure of similarity.

A ______________ is used for examining data with more than two variables, and it includes a different vertical axis for each variable.

parallel-coordinates plot

A forecast that helps direct police officers to areas where crimes are likely to occur based on past data is an example of

predictive analytics.

Advanced analytics generally refers to

predictive and prescriptive analytics

_______________ analytics use techniques that take input data and yield a best course of action.

prescriptive

A data ______________ is a request to obtain information with certain characteristics from a database.

query

The act of collecting data that are representative of the population data is called

random sampling

The difference between the largest and the smallest data values is the __________.

range

In many cases, white space in a chart can improve

readability

Data-driven decision making tends to decrease a firm's

risk.

A _____________ is a graphical presentation of the relationship between two quantitative variables.

scatter chart

Observation refers to the

set of recorded values of variables associated with a single entity

The use of probability and statistics to construct a computer model to study the impact of uncertainty on the decision at hand is called _____________________.

simulation

When working with large spreadsheets with many rows of data, it can be helpful to ____________ the data to better find, view, or manage subsets of data.

sort and filter

A line chart that has no axes but is used to provide information on overall trends for time series data is called a

sparkline.

To avoid problems in interpreting the differences in color in a heat map, ____________ can be added.

sparklines

A method for modifying variables that reduces bias prior to cluster analysis is

standardization.
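One common way to standardize is to convert each variable to z-scores (subtract the mean, divide by the standard deviation) so that variables measured on large scales do not dominate distance calculations. The sketch below uses hypothetical age and spending values and the sample standard deviation; other scaling choices are possible.

```python
import statistics


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical customers: ages in years and spending in dollars.
ages = [25, 53, 39]
spending = [350, 420, 385]

z_ages = standardize(ages)
z_spending = standardize(spending)
print(z_ages, z_spending)
```

After standardization both variables are on the same unitless scale, so neither drives the clustering merely because of its units.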

A ____________________ decision is concerned with how the organization should achieve the goals and objectives set by its strategy.

tactical

If a model's implications depend on the inclusion or exclusion of outliers, one should spend additional time to track down

the cause of the outliers.

Tables should be used instead of charts when

the values being displayed have different units or very different magnitudes

If covariance between two variables is near 0, it implies that

the variables are not linearly related.
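The word "linearly" matters here: two variables can be strongly related and still have a covariance of 0. A small sketch with hypothetical data where y is exactly x squared:

```python
def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Hypothetical data: y depends on x perfectly, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

cov_xy = sample_covariance(x, y)
print(cov_xy)
```

The covariance is 0 even though y is fully determined by x, so a near-zero covariance rules out only a linear relationship.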

Using multiple lines on a line chart or employing multiple charts is an alternative to a

three-dimensional chart.

Simulation optimization helps

to find good decisions in highly complex and highly uncertain settings.

Utility theory is the study of the __________________ or relative desirability of a particular outcome that reflects the decision maker's attitude toward a collection of factors, such as profit, loss, and risk.

total worth

A _____________ is a line that provides an approximation of the relationship between the variables.

trendline

Veracity has to do with how much __________________ is in the data.

uncertainty

The goal regarding using an appropriate number of bins is to show the

variation in the data.

The difference in a variable measured over observations (time, customers, items, etc.) is known as

variation.

