Business Analytics 2 ch1-3


Compute the coefficient of variation for the following sample data. 32, 41, 36, 24, 29, 30, 40, 22, 25, 37

21.36%
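
The arithmetic can be checked with a minimal sketch, assuming the coefficient of variation is the sample standard deviation (n - 1 denominator) divided by the mean, expressed as a percentage. The variable names are illustrative only.

```python
import statistics

data = [32, 41, 36, 24, 29, 30, 40, 22, 25, 37]

mean = statistics.mean(data)   # 31.6
sd = statistics.stdev(data)    # sample standard deviation (n - 1 denominator)
cv_percent = sd / mean * 100   # coefficient of variation as a percentage

print(round(cv_percent, 2))
```

Unrounded, the ratio comes to about 21.37%; rounding the standard deviation to 6.75 before dividing gives 21.36%.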

Compute the median of the following data. 32, 41, 36, 24, 29, 30, 40, 22, 25, 37

31

What is the mode of the data set given below? 35, 47, 65, 47, 22

47

The heights of a sample of 13 adult males are listed below. Find the range of the data. 70, 72, 71, 70, 69, 73, 69, 68, 70, 71, 67, 71, 74

7
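
The median, mode, and range answers above can be reproduced with a short sketch using Python's standard library. The variable names are illustrative only.

```python
import statistics

values = [32, 41, 36, 24, 29, 30, 40, 22, 25, 37]
# With an even number of values, the median is the average of the two middle ones.
median_value = statistics.median(values)

mode_value = statistics.mode([35, 47, 65, 47, 22])

heights = [70, 72, 71, 70, 69, 73, 69, 68, 70, 71, 67, 71, 74]
# Range = largest value minus smallest value.
height_range = max(heights) - min(heights)

print(median_value, mode_value, height_range)
```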

Which of the following is not part of the CIA triad, the framework we can use to think about how best to protect data security?

Accountability

Data-ink ratio

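Data-ink refers to the ink used in a table or chart that is necessary to convey the meaning of the data to the audience; the data-ink ratio is the proportion of the total ink in the table or chart that is data-ink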

Data preparation/treatment of missing values

Discard observations (rows or columns) with any missing values; fill in missing entries with estimated values; or apply a data-mining algorithm that can handle missing values

Measuring similarity/dissimilarity between observations

Euclidean distance measures the dissimilarity between a pair of observations with quantitative variables. Standardizing helps remove bias due to differences in measurement units, and variable weighting allows the analyst to introduce appropriate bias based on the business context. For binary variables, use the matching coefficient or Jaccard's coefficient. Jaccard's coefficient is often preferred over the matching coefficient because it does not count matching zero values as evidence of similarity.
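
The three measures can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names and sample vectors are hypothetical, and returning 0.0 when two binary observations are all zeros is an assumption, since Jaccard's coefficient is undefined in that case.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two quantitative observations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def matching_coefficient(u, v):
    """Share of binary variables on which the two observations agree (1-1 or 0-0)."""
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Like the matching coefficient, but 0-0 matches are ignored."""
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    denominator = len(u) - both_zero
    # Assumption: treat the all-zero case as 0.0 rather than raising an error.
    return both_one / denominator if denominator else 0.0

# Hypothetical observations for illustration.
dist = euclidean_distance((3, 4), (0, 0))
obs_a = [1, 0, 0, 1, 0]
obs_b = [1, 0, 0, 0, 0]
match = matching_coefficient(obs_a, obs_b)
jaccard = jaccard_coefficient(obs_a, obs_b)
print(dist, match, jaccard)
```

In this example the two observations agree on four of five variables, but three of those agreements are shared zeros, so Jaccard's coefficient comes out lower than the matching coefficient.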

Why did Dr. Li say orange and blue are good colors to choose for data visualization?

Good color contrast while considering the color-blind audience

__________ is an open-source programming environment that supports big data processing through distributed storage and distributed processing on clusters of computers.

Hadoop

__________ is the most critical step of the decision-making process.

Identifying and defining the problem

K-Means Clustering

In k-means clustering, the analyst must specify the number of clusters, k. If k is not clearly established by the context of the business problem, the k-means clustering algorithm can be repeated for several values of k. Each iteration calculates the cluster centroids and assigns every observation to the cluster with the nearest centroid; the algorithm repeats this process until there is no change in the clusters or a specified maximum number of iterations is reached. In general, the larger the ratio of the distance between a pair of cluster centroids to the within-cluster distance, the more distinct the two clusters in the pair are.
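
The assign-and-update loop can be sketched in plain Python. This is a minimal illustration under stated assumptions: the starting centroids are supplied by the caller (real implementations usually pick them at random), distance is Euclidean, and the function name and sample points are hypothetical.

```python
import math

def k_means(points, initial_centroids, max_iterations=100):
    """Return (assignments, centroids) after iterating until the clusters stop changing."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iterations):
        # Assign each observation to the cluster with the nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no change in the clusters, so stop
        assignments = new_assignments
        # Recalculate each centroid as the mean of its assigned observations.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignments, centroids

# Hypothetical data: two visibly separate groups, k = 2.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
assignments, centroids = k_means(points, [(0, 0), (10, 10)])
print(assignments)
print(centroids)
```

Running the function for several values of k means calling it with a different number of starting centroids each time and comparing the resulting clusters.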

Examples of Predictive Analytics

Linear regression; time-series analysis; data mining (used to find patterns or relationships among elements of the data in a large database); simulation (the use of probability and statistics to construct a computer model to study the impact of uncertainty on a decision)

To summarize and analyze data with both a cross tabulation and charting, Excel typically pairs

PivotCharts with PivotTables.

Which one of the following statements is not true concerning PivotTables in Excel?

PivotTables can be built using data arrayed in rows.

Types of data

Population, sample, observations, variables; quantitative data vs. categorical data; cross-sectional data vs. time-series data; frequency distribution, histogram

Which of the following gives the proportion of items in each bin?

Relative frequency
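
As a small illustration, relative frequency is each bin's count divided by the total number of items. The sketch below reuses the ten sample values from the first card; the bin width of 5 is an assumption chosen only for the example.

```python
from collections import Counter

data = [32, 41, 36, 24, 29, 30, 40, 22, 25, 37]
bin_width = 5  # assumption: bins 20-24, 25-29, 30-34, 35-39, 40-44

# Label each bin by its lower limit and count the items that fall in it.
counts = Counter((x // bin_width) * bin_width for x in data)

# Relative frequency = bin count / total number of items.
relative_frequency = {low: n / len(data) for low, n in sorted(counts.items())}

for low, share in relative_frequency.items():
    print(f"{low}-{low + bin_width - 1}: {share:.2f}")
```

The relative frequencies across all bins always sum to 1.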

Examples of Descriptive Analytics

Reports; descriptive statistics; data visualization (including data dashboards); data-mining techniques; basic what-if spreadsheet models

Bill, the manager of Columbus Café, schedules twice the usual number of waiters and cooks on holidays. Which of the following is the approach Bill used in his decision-making?

Rules of thumb

Types of charts

Scatter chart (presents the relationship between two quantitative variables); trendline (a line that provides an approximation of the relationship between variables); sparkline; bar charts; column charts; pie charts; bubble charts; heat map; stacked column chart; clustered column chart; scatter chart matrix; PivotChart

The 4 Vs of big data

Veracity, velocity, variety, volume

Hierarchical clustering:

a bottom-up hierarchical clustering approach starts with each observation in its own cluster and then iteratively combines the two clusters that are the most similar into a single cluster

A data visualization tool that updates in real time and gives multiple outputs is called

a data dashboard

Data query

a request for information with certain characteristics from a database

MISM 3116 summer class students will be chosen to represent Turner College at the campus-wide modeling competition. Relative to all students in Turner College, the MISM 3116 students are _______________________

a sample

Big data

a set of data that cannot be managed, processed or analyzed with commonly available software in a reasonable amount of time

Be familiar with how to read clustered bar charts

and how to interpret data from them

A chart that is recommended as an alternative to a pie chart is a

bar chart

Optimization models:

best decision subject to constraints of the situation, e.g. portfolio, supply network design models, price markdown models, etc.

A better understanding of consumer behavior through analytics directly leads to

better marketing strategies

The correlation coefficient will always take values

between -1 and 1
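
A minimal sketch of the Pearson correlation coefficient shows the two extremes and a value in between. The function name and data are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their individual variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2 * v + 1 for v in x])   # perfect positive linear relationship
r_down = pearson_r(x, [-v for v in x])        # perfect negative linear relationship
r_mixed = pearson_r(x, [2, 1, 4, 3, 5])       # positive but imperfect relationship
print(r_up, r_down, r_mixed)
```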

Simulation optimization:

combining the use of probability and statistics to model uncertainty with optimization techniques to find the best decisions in highly complex and highly uncertain situations

The financial dashboard on the second floor of CCT building is a type of _________ analytics.

descriptive

Descriptive analytics

encompasses the set of techniques that describes what has happened in the past

Tactical decisions are concerned with

how the organization should achieve the goals and objectives set by its strategy.

Deleting the grid lines in a table and the horizontal lines in a chart

increases the data-ink ratio

Prescriptive analytics

indicates the best course of action to take

The letter grades (A, B, C, D, F) of business analysis students are recorded by a professor. This variable's classification

is categorical data

Data-ink is the ink used in a table or chart that

is necessary to convey the meaning of the data to the audience

A disadvantage of stacked-column charts and stacked-bar charts is that

it can be difficult to perceive small differences in areas

In a business, the values indicating the business's current operating characteristics, such as its financial position, the inventory on hand, and customer service metrics, are typically known as

key performance indicators (KPIs)

A time series plot is also known as a

line chart

A set of values corresponding to a set of variables is defined as a(n)

observation

Dr. Bill plans to open a donut store near CSU. Collecting data from the students about their favorite flavors of donuts is an example of a(n)

observational study

Any data value with a z-score less than -3 or greater than +3 is considered to be a(n)

outlier
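
The z-score rule can be sketched as follows, assuming z = (value - mean) / sample standard deviation. The function names and data are hypothetical. Note that a sample needs more than ten observations before any value can reach a z-score beyond 3 in absolute value.

```python
import statistics

def z_scores(values):
    """Number of sample standard deviations each value lies from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def find_outliers(values, threshold=3.0):
    """Values whose z-score is less than -threshold or greater than +threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Hypothetical data: nineteen typical values and one extreme value.
sample = [10] * 19 + [100]
outliers = find_outliers(sample)
print(outliers)
```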

Predictive analytics

predicting the future or ascertaining the impact of one variable on another

Advanced analytics generally refers to

predictive and prescriptive analytics

Data-driven decision making tends to decrease a firm's

risk

Williams & Lee Inc. is an Internet-based retail seller of hiking boots and mountaineering gear. The company decides to open retail stores across the major cities of Georgia to help complement its Internet-based strategy. This activity would be categorized as a(n)

strategic decision

The decisions concerning an organization's goals and future plans are called

strategic decisions

__________ merges maps and statistics to present data collected over different geographies.

a geographic information system (GIS)

Dimension reduction

the process of removing variables from the analysis without losing any crucial information

Simulation optimization helps

to find good decisions in highly complex and highly uncertain settings.

A _____________ is a line that provides an approximation of the relationship between the variables.

trendline

Centroid linkage:

uses the averaging concept of cluster centroids to define between-cluster similarity
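
Centroid linkage can be combined with the bottom-up approach from the hierarchical clustering card in a short sketch. This is an illustration under stated assumptions: similarity is measured as the Euclidean distance between cluster centroids, merging stops at a caller-chosen number of clusters, and the function names and points are hypothetical.

```python
import math

def centroid(cluster):
    """Average of the observations in a cluster, variable by variable."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def agglomerate(points, target_clusters):
    """Start with one cluster per observation and merge the two closest until the target is met."""
    clusters = [[p] for p in points]
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Centroid linkage: distance between the two cluster centroids.
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical observations.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
result = agglomerate(points, 3)
print(result)
```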

A quantity of interest that can take on different values is known as a(n)

variable

