Business Analytics Final

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

A mathematical model that gives the best decision, subject to the situation's constraints, is an a(n)

Optimization Model

An alternative for a stacked column chart when comparing more than a couple of quantitative variables in each category is a

clustered column chart

A graphical presentation that uses vertical bars to display the magnitude of quantitative data is known as a

column chart

When working with data sets in Excel, __________ can be used to automatically highlight cells that meet specified requirements.

conditional formatting

In a linear regression model, y = b0 + b1x + e the parameter b1 represents the

slope of the regression line

A __________ is a graphical summary of data previously summarized in a frequency distribution

histogram

Bar charts use

horizontal bars to display the magnitude of the quantitative variable.

The process of making estimates and drawing conclusions about one or more characteristics of a population through analysis of sample data drawn from the population is known as

statistical inference

The graph of the simple linear regression equation is a(n)

straight line

The __________ is a measure of the error that results from using the estimated regression equation to predict the values of the dependent variable in the sample.

sum of squares due to error (SSE)

In the k-nearest neighbors method, when the value of k is set to 1

the classification or prediction of a new observation is based solely on the single most similar observation from the training set.

A procedure for using sample data to find the estimated regression equation is

the least squares method

The "logit" is

the natural log of an odds ratio.

The sum of frequencies for all classes will always equal _____.

the number of elements in a data set

The adjusted R2 adjusts R2 for

the number of explanatory variables in a multiple regression model

In the k-Nearest Neighbor technique in data mining, the "k" refers to

the number of neighbors used.

In linear regression, the fitted value is

the predicted value of the dependent variable

The equation that describes how the dependent variable (y) is related to the independent variable (x) is called _____.

the regression model

If we use linear regression for predicting a categorical variable y (event happens y=1, event not happens y=0), then

the results are not interpretable as predicted y is not restricted to 0 and 1

In a cumulative frequency distribution, the last class will always have a cumulative frequency equal to _____.

the total number of elements in the data set

If the probability is 0.75, then log odds measure is

odds that event happens is 3:1 1/1-p

A __________ is used for examining data with more than two variables, and it includes a different vertical axis for each variable

parallel-coordinates plot

Multicollinearity

refers to the degree of correlation among independent variables in a regression model.

A _____________ is a graphical presentation of the relationship between two quantitative variables

scatter chart

If a data set produces SSR = 400 and SSE = 100, then the coefficient of determination is

.80

What would be the coefficient of determination if the total sum of squares (SST) is 23.29 and the sum of squares due to regression (SSR) is 10.03?

0.43 The coefficient of determination r2 = SSR/SST. Substituting the given values, we get r2 = 0.43

A regression model between sales (y in $1000), unit price (x1 in dollars), and television advertisement (x2 in dollars) resulted in the following function:ŷ = 7 - 3x1 + 5x2​​For this model, SSR = 3500, SSE = 1500, and the sample size is 18. The adjusted R-squared or adjusted coefficient of determination for this problem is

0.66

k-nearest neighbor uses the features of the closest distance neighbor to classify a new observation. This distance is

Euclidean distance

refers to the use of sample data to calculate a range of values that is believed to include the value of the population parameter

Interval estimation

_________ attempts to classify a categorical outcome as a linear function of explanatory variables.

Logistic regression

Which of the following is true about Residuals ?

Lower is better

Which of the following are necessary to be determined to define the classes for a frequency distribution with quantitative data?

Number of nonoverlapping bins, width of each bin, and bin limits

Which of the following gives the proportion of items in each bin?

Relative frequency

In a regression analysis, if r2 = 1, then _____.

SSE must be equal to 0

R^2

SSR/SST

__________ merges maps and statistics to present data collected over different geographies

The geographic information system

Big data is data that is too large or too complex. What are the four Vs that describe such data?

Volume, Veracity, Velocity, Variety

The difference between the observed value of the dependent variable and the value predicted by using the estimated regression equation is called _____.

a residual

We ran a regression to study the effects of education in years (x) on wages in thousand dollars (y). We fit the regression line: ŷ = b0+b1x and found b0=1.2 and b1=0.35. Which of these statements are true?

b1 is the slope of regression line and it suggests that 1 more year of education increases wages by 0.35 thousand dollars.

Frequency distributions can be made for _____.

both categorical and quantitative data

In order to visualize three variables in a two-dimensional graph, we use a

bubble chart

The U.S. Internal Revenue Service uses _____________ to identify patterns that distinguish questionable annual personal income tax filings.

data mining

Data dashboards are a type of _________ analytics.

descriptive

The __________ is the range of values of the independent variables in the data used to estimate the regression model

experimental region

ŷ = b0 +b1x1 + b2x2 is called

least square regression line equation

It is possible for the coefficient of determination to be

less than 1

If the log odds measure suggest that log odds = 0.67 + 0.45 x1 then which statement is true?

log odds of y=1 or event happening increases by 0.45 due to one unit increase in x1

A least squares regression line ______.

may be used to predict a value of y if the corresponding x value is given

In multiple regression analysis

there can be several independent variables, but only one dependent variable

A __________ is useful for visualizing hierarchical data along multiple dimensions

treemap

In the graph of the simple linear regression equation, the parameter b0 represents the ___________ of the true regression line

y-intercept


Set pelajaran terkait

The Stress Emotions: Anger, Fear, and Joy/ Stress Prone Personality Traits

View Set

RN Pharmacology Online Practice 2019 A

View Set

A&P Ch.1 Introduction to the Human Body

View Set

CH 1- ETHICAL REASONING. IMPLICATIONS FOR ACCOUNTING

View Set

Real Estate Investment Trusts {REIT's}

View Set

Life Insurance Policy Provisions, Options and Riders

View Set

Developmental Psyc Adolescence Body & Mind (Psyc 110 Ch. 9)

View Set