BSIS FINAL

_____ is a statistical procedure used to develop an equation showing how two variables are related.

Regression analysis

The following regression model has been proposed to predict sales at a gas station: ŷ = 10 - 4x1 + 7x2 + 18x3, where x1 = competitor's previous day's sales (in $1,000s), x2 = population within 5 miles (in 1,000s), x3 = 1 if any form of advertising was used, 0 otherwise, and ŷ = sales (in $1,000s). Predict sales (in dollars) for a store with a competitor's previous day's sales of $3,000, a population of 10,000 within 5 miles, and six radio advertisements.

$86,000. ŷ = 10 - 4(3) + 7(10) + 18(1) = 86 (in $1,000s), which is $86,000. The six radio advertisements only set x3 = 1.
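The arithmetic in this answer can be checked with a short Python sketch. The coefficients are taken from the worked answer; the function name `predict_sales_thousands` is made up for illustration.

```python
def predict_sales_thousands(x1, x2, x3):
    """Estimated sales in $1,000s, using the coefficients from the worked answer."""
    return 10 - 4 * x1 + 7 * x2 + 18 * x3

# Competitor's sales of $3,000 -> x1 = 3 (in $1,000s)
# Population of 10,000       -> x2 = 10 (in 1,000s)
# Any advertising was used   -> x3 = 1 (six ads still code as 1)
y_hat = predict_sales_thousands(3, 10, 1)
sales_dollars = y_hat * 1000  # convert from $1,000s to dollars
print(sales_dollars)  # 86000
```

The main trap in the question is the units: the inputs and the output are in thousands, and the dummy variable is 0 or 1 regardless of how many advertisements ran.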

What would be the coefficient of determination if the total sum of squares (SST) is 30 and the sum of squares due to regression (SSR) is 27?

0.90. The coefficient of determination = SSR/SST = 27/30 = 0.9.

What would be the value of the sum of squares due to regression (SSR) if the total sum of squares (SST) is 22.21 and the sum of squares due to error (SSE) is 6.89?

15.32. SST = SSR + SSE, so SSR = SST - SSE = 22.21 - 6.89 = 15.32.
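The two sum-of-squares cards above rest on the identities SST = SSR + SSE and r² = SSR/SST. A minimal Python sketch, with helper names invented for illustration:

```python
def ssr_from_sst_sse(sst, sse):
    """Sum of squares due to regression, from SST = SSR + SSE."""
    return sst - sse

def r_squared(ssr, sst):
    """Coefficient of determination, SSR / SST."""
    return ssr / sst

# Values from the two cards above.
print(round(ssr_from_sst_sse(22.21, 6.89), 2))  # 15.32
print(round(r_squared(27, 30), 2))              # 0.9
```

When only SSE and SSR are given, SST is their sum, so r² = SSR / (SSR + SSE).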

The Excel operator for "not equal to" is

<>

Which formula is used for counting cells where the value in column "D" is less than $40,000?

=COUNTIF(D2:D17,"<40000")

Which formula is used for counting cells where the value in column "D" is at least $70,000?

=COUNTIF(D2:D17,">=70000")

For the following table, which Excel formula determines the number of "Quantity" cells that have a value of at most 50?

=COUNTIF(D2:D8,"<=50")

For the following table, which Excel formula determines the number of "Quantity" cells that have a value greater than 10?

=COUNTIF(D2:D8,">10")
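For readers who think in code, the COUNTIF cards above amount to counting the items in a range that satisfy a condition. The Python sketch below mirrors that idea; the `countif` helper and the salary figures are hypothetical, not the data from the original tables.

```python
def countif(values, predicate):
    """Count the values for which predicate(value) is true."""
    return sum(1 for v in values if predicate(v))

salaries = [35000, 40000, 72000, 70000, 39999]  # made-up sample data

below_40k = countif(salaries, lambda v: v < 40000)      # like "<40000"
at_least_70k = countif(salaries, lambda v: v >= 70000)  # like ">=70000"
print(below_40k, at_least_70k)  # 2 2
```

Note how "at least" maps to >= and "at most" maps to <=, while "less than" and "greater than" are strict comparisons.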

What Excel formula could be used to perform the following computation? B5*B17 + C5*C17 + D5*D17 + B6*B18 + C6*C18 + D6*D18 + E6*E18 + B7*B19 + C7*C19 + D7*D19 + E7*E19

=SUMPRODUCT(B5:E7,B17:E19)

Using the following table, which formula calculates the total amount (in dollars) the company will have to pay in Cost of Living Adjustments?

=SUMPRODUCT(D2:D17,E2:E17)

Given this scenario (x in column D, and y in E), find ∑xy.

=SUMPRODUCT(D2:D19,E2:E19)
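SUMPRODUCT multiplies corresponding entries of equally sized ranges and adds the products, which is why it computes ∑xy directly. A small Python sketch of the same computation, with a hypothetical helper name and made-up numbers:

```python
def sumproduct(xs, ys):
    """Sum of pairwise products; both sequences must have the same length."""
    if len(xs) != len(ys):
        raise ValueError("ranges must have the same size")
    return sum(x * y for x, y in zip(xs, ys))

x = [1, 2, 3]  # made-up x values
y = [4, 5, 6]  # made-up y values
print(sumproduct(x, y))  # 1*4 + 2*5 + 3*6 = 32
```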

Regression analysis involving one independent variable and one dependent variable is referred to as

simple linear regression.

In the simple linear regression equation, the parameter β1 is the _______ of the true regression line.

slope

Natalie needs to compare values across different categories. Which of the following charts should Natalie use?

column (bar) chart

When we use the estimated regression equation to develop an interval that can be used to predict the mean for ALL units that meet a particular set of given criteria, that interval is called a

confidence interval.

A tabular summary of data for two variables is referred to as a

crosstabulation.

A variable used to model the effect of categorical independent variables is called a(n)

dummy variable.

In the simple linear regression model, the ________ accounts for the variability in the dependent variable that cannot be explained by the linear relationship between x and y.

error term

The term in the multiple regression model that accounts for the variability in y that cannot be explained by the linear effect of the q independent variables is the

error term, ε.

The _____ is the range of values of the independent variables in the data used to estimate the regression model.

experimental region

Prediction of the value of the dependent variable outside the experimental region is called

extrapolation.

You would ________ a table if you wanted to display only data that matches specific criteria.

filter

A method to find a specific value in a cell by adjusting the value of another cell is

goal seek.

A bubble chart is a graphical presentation that

has two axes that represent two variables, and the magnitude of the third variable is given by the size of the bubble.

The graph of the simple linear regression equation is a(n) _____.

line

DJ needs to display data over time. Which of the following charts should DJ use?

line chart

The Golf Course manager must report the number of visits to the course over the last 12 months. These data are shown in the table. How could these data be best displayed?

line chart (time series)

Which of the following is NOT possible with VLOOKUP?

look up values located in a column to the left of the column that contains the lookup value

A Geographic Information System (GIS)

merges maps and statistics to present data collected over different geographies.

The study of how a dependent variable y is related to two or more independent variables is called

multiple linear regression.

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the error term ε follows a ________ distribution for all values of x.

normal

VLOOKUP assumes that the first column of the table is

sorted in ascending order.

Which of the following can be used to show overall trend?

sparklines

The process of making estimates and drawing conclusions about one or more characteristics of a population through analysis of sample data drawn from the population is known as

statistical inference.

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the variance of ε, denoted by σ², is

the same for all values of x.

Which of the following types of graphs is useful for visualizing hierarchical data along multiple dimensions?

treemap

An approximation of the linear relationship between variables in a chart can be represented with a

trendline.

The defined range for a VLOOKUP formula must have at least ________ rows or columns.

two

A data dashboard is a visualization tool that

updates in real time and gives multiple outputs.

A bar chart is a graphical presentation that

uses horizontal bars to display the magnitude of quantitative data.

Goal seek is part of a suite of data tools used for

what-if analysis.

The process of changing the values in cells to see how those changes affect the outcome of formulas on a worksheet is called

what-if analysis.

Using various interest rates to determine the amount of loan payments is an example of

what-if analysis.

Of the options below, which graphical display can be used to compare categorical data?

bar chart

The KPIs displayed in a data dashboard should do all of the following except

be displayed across multiple screens.

Which of the following options guarantees that the best model for a given number of variables will be found?

best subsets regression.

Which of the following options is NOT an iterative variable selection procedure?

best subsets regression.

In a regression analysis, if SSE = 200 and SSR = 300, then the coefficient of determination is

0.6.

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the error term ε is a random variable with a mean or expected value of

0.

Which of the following methods would NOT improve the readability of a table?

Put the numerical values in italic font.

A logical function that counts the cells that meet specific criteria in a specified range is ________.

COUNTIF

If a column of data holds the marital status for members of a gym, the ________ function could be used to determine how many members are "single".

COUNTIF

Found below is Excel output from a quadratic regression analysis based upon the number of cars each employee sold during the most recent sales period and the number of months each salesperson has been employed by the company... Which of the following gives the correct quadratic regression model?

E(B0) + E(B1)*(months employed) + E(B2)*(months employed)^2

The following table is used to look up information on a specific product. The Product ID is entered into cell B2 and the information is returned in the shaded box. To return the information for the Product named in B2, the formula in cell C5 would be =VLOOKUP($B$2,$A$14:$F$44,2,FALSE).

False. The formula is =VLOOKUP(value, table, index, range).

The following table is used to look up information on a specific product. The Product ID is entered into cell B2 and the information is returned in the shaded box. To return the information for the Product named in B2, the formula in cell C5 would be =LOOKUP($B$2,$A$14:$F$44,3,FALSE).

False. The formula is =VLOOKUP(value, table, index, range).

_________ can be used to determine the maximum loan amount with payments that stay within your budget.

Goal seek
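The idea behind goal seek can be sketched in Python: adjust one input until a formula's output hits a target. The sketch below uses bisection, which is a stand-in technique and not necessarily how Excel's Goal Seek works internally. The budget, interest rate, and loan term are assumed values, and both function names are hypothetical.

```python
def monthly_payment(principal, annual_rate, months):
    """Standard level-payment loan formula with monthly compounding."""
    r = annual_rate / 12
    if r == 0:
        return principal / months
    return principal * r / (1 - (1 + r) ** -months)

def goal_seek(f, target, lo, hi, tol=1e-6):
    """Bisection for an increasing f: largest x in [lo, hi] with f(x) <= target."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            hi = mid   # payment too high, shrink the loan
        else:
            lo = mid   # payment within budget, try a larger loan
    return lo

budget = 500.0  # assumed monthly budget
max_loan = goal_seek(lambda p: monthly_payment(p, 0.06, 60), budget, 0.0, 1_000_000.0)
print(round(max_loan, 2))
```

Here the "changing cell" is the loan amount and the "set cell" is the monthly payment, matching the card's scenario.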

_____ refers to the use of sample data to calculate a range of values that is believed to include the unknown value of a population parameter.

Interval estimation

________ functions search for a corresponding value in a defined range of cells located in another part of the workbook.

Lookup

___________ functions look up a value in a defined range of cells located in another part of the corresponding workbook to find a value.

Lookup

The following table is used to look up information on a specific product. The Product ID is entered into cell B2 and the information is returned in the shaded box. If "Trace Dependents" were used in the Formula Auditing group on the Formulas tab and cell C9 were then cleared, would that generate an error in cell C6?

No error would be generated in Cell C6.

Which of the following is a logical function that will add values in multiple ranges that meet multiple criteria?

SUMIFS

The following table is used to look up information on a specific product. The Product ID is entered into cell B2 and the information is returned in the shaded box. There are several formulas in the table. How could you look at the cells that are linked to cell B2?

Select "Trace Dependents" in the Formula Auditing Group on the Formulas tab.

Suppose an estimated regression equation has a coefficient of determination (r2) of 0.866. Interpret this value.

The estimated regression equation explains approximately 86.6% of the variation in the dependent variable.

Which statement is NOT true of influence diagrams?

The influence diagram is a means of gaining influence over competitors.

The formula = SUMIF(B2:B10, ">0", C2:C10) for a particular cell will find:

The sum of all of the values in the range C2:C10 if the corresponding value in the range B2:B10 is greater than zero.
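The SUMIF behavior described in this answer pairs each cell of the criteria range with the cell in the same position of the sum range. A rough Python equivalent, using a hypothetical `sumif` helper and made-up values for B2:B10 and C2:C10:

```python
def sumif(criteria_range, predicate, sum_range):
    """Add sum_range entries whose matching criteria_range entry passes the test."""
    return sum(s for c, s in zip(criteria_range, sum_range) if predicate(c))

b = [5, -2, 0, 3]      # made-up stand-in for B2:B10
c = [10, 20, 30, 40]   # made-up stand-in for C2:C10

print(sumif(b, lambda v: v > 0, c))  # 10 + 40 = 50
```

The zero in `b` is excluded because the criterion ">0" is strict.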

Found below is Excel output from a quadratic regression analysis based upon the number of cars each employee sold during the most recent sales period and the number of months each salesperson has been employed by the company... Interpret the coefficient of determination for this regression model.

This regression model explains approximately R^2 of the variation in cars sold for our sample data.

Using the influence diagram given below, which of the following would be a likely mathematical expression for Total Cost?

Total Cost = Fixed Cost + Total Variable Cost

In a make-versus-buy decision, companies have to decide whether they should manufacture a product or outsource its production to another firm.

True

Using only "VLOOKUP" to transfer the Rank1 name from VData tab into the "Individual Sales" table in the Table1 tab, the formula would be =VLOOKUP(A4,VData!$B$2:$C$11,2,FALSE).

True. The formula is =VLOOKUP(value, table, index, range), where the fourth argument, range, is TRUE (approximate match) or FALSE (exact match).

An Excel function that searches values in a table array arranged in columns is the ______________ function.

VLOOKUP

If you have defined a two-column range of cells containing names and phone numbers, you would use the ________ function to match a name with a number.

VLOOKUP
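An exact-match VLOOKUP (fourth argument FALSE) scans the first column of the table for the lookup value and returns the entry from the requested column of that row. The Python sketch below imitates that behavior; the helper name and the phone-book data are invented, and returning `None` stands in for Excel's #N/A.

```python
def vlookup_exact(lookup_value, table, col_index):
    """Return column col_index (1-based) of the first row whose first
    column equals lookup_value, or None when nothing matches."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None

phone_book = [("Ana", "555-0101"), ("Ben", "555-0102")]  # made-up two-column range

print(vlookup_exact("Ben", phone_book, 2))   # 555-0102
print(vlookup_exact("Cara", phone_book, 2))  # None
```

Because the search always runs on the first column and returns a column at or to the right of it, this also shows why VLOOKUP cannot return values from a column to the left of the lookup column.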

The following table is used to look up information on a specific product. The Product ID is entered into cell B2 and the information is returned in the shaded box. If "Trace Precedents" were used in the Formula Auditing group on the Formulas tab and cell A14 were then cleared, would that generate an error in cell C6?

Yes, it would generate an error of #N/A in Cell C6 because it has dependencies on Cell A14.

The mathematical equation relating the expected value of the dependent variable to the value of the independent variables, which has the form E(y) = β0 + β1x1 + β2x2 + ... + βqxq, is called

a multiple regression model.

A one-way data table summarizes

a single input's impact on the output of interest.
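A one-way data table can be pictured as evaluating one model over a list of values for a single input. The Python sketch below does that for the Total Cost = Fixed Cost + Total Variable Cost relationship from an earlier card. The cost figures and function names are assumptions for illustration only.

```python
def total_cost(units, fixed_cost=1000.0, variable_cost_per_unit=2.5):
    """Total Cost = Fixed Cost + Total Variable Cost (assumed cost figures)."""
    return fixed_cost + variable_cost_per_unit * units

def one_way_table(model, input_values):
    """Pair each trial input value with the model output it produces."""
    return [(x, model(x)) for x in input_values]

table = one_way_table(total_cost, [0, 100, 200])
for units, cost in table:
    print(units, cost)
# 0 1000.0
# 100 1250.0
# 200 1500.0
```

Only one input (units produced) varies; everything else in the model is held fixed, which is what distinguishes a one-way table from a two-way table.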

Edward Tufte introduced the idea of the data-ink ratio, as a way of quantifying the proportion of "data-ink" to the total amount of ink used in a table or chart. Which of the following options would increase the data-ink ratio of a table?

adding a title to the table

Suppose a residual plot of x versus the residuals, y - ŷ, shows a nonconstant variance. In particular, as the values of x increase, suppose that the value of the residuals also increases. This means that

as the values of x get larger, the ability to predict y becomes less accurate.

The process of making a conjecture about the value of a population parameter, collecting sample data that can be used to assess this conjecture, measuring the strength of the evidence against the conjecture that is provided by the sample, and using these results to draw a conclusion about the conjecture is known as

hypothesis testing.

Two variables have a positive linear correlation. As the dependent variable increases, the independent variable will

increase.

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the values of ε are

independent.

A(n) _______ is a visual representation that shows which entities affect others in a model.

influence diagram

A PivotTable

is a crosstabulation created in Excel that is interactive.

A PivotChart

is a graphical presentation created in Excel that functions similarly to a PivotTable.

A key performance indicator (KPI)

is a metric that is crucial for understanding the current performance of an organization.

A heat map

is a two-dimensional graphical presentation of data in which color shadings indicate magnitudes.

The coefficient of determination

is defined as SSR/SST.

The phenomenon by which the value of an estimate generally gets closer to the value of the parameter being estimated as the sample size grows is called the _____.

law of large numbers

When there are many independent variables to consider, special procedures are sometimes employed to select the independent variables to include in the regression model. All of the following are examples of variable selection procedures except for

overfitting.

A graphical presentation used to examine more than two variables in which each variable is represented by a different vertical axis is called a

parallel coordinates plot.

A(n) ________ refers to a measurable factor that defines a characteristic of a population, process, or system.

parameter

With reference to a spreadsheet model, an uncontrollable model input is known as a __________.

parameter

A crosstabulation in Excel is called a

pivot table.

When we use the estimated regression equation to develop an interval that can be used to predict the value of y for a specific unit that meets a particular set of given criteria, that interval is called a

prediction interval.

If developing a regression model to make future predictions, the selection of the independent variables to include in the regression model should be based on the _____ on observations that have not been used to train the model.

predictive accuracy

What type of regression model should be used when there is a nonlinear relationship between the independent and dependent variables that is fit by including the independent variable and the square of the independent variable?

quadratic regression model

The COUNTIF function takes two arguments. What are they (in proper order)?

range of cells to check and logical condition

The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is known as the

residual.

A _____ is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables.

scatter chart

To better understand the relationship between advertising dollars spent and the subsequent sales, you could create a _____ chart.

scatter chart

When determining the best estimated regression equation to model a set of data, the procedure that uses an iterative variable selection procedure that considers adding an independent variable and removing an independent variable at each step is called

stepwise selection.

Simple linear regression refers to the type of regression analysis for which the relationship between the independent variable and dependent variable is approximated by a(n)

straight line.

The _____ is a measure of the error that results from using the estimated regression equation to predict the values of the dependent variable in a sample.

sum of squares due to error (SSE)
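Residuals and SSE are closely linked: each residual is an observed value minus its predicted value, and SSE is the sum of the squared residuals. A minimal Python sketch with made-up observed and predicted values (the function names are hypothetical):

```python
def residuals(observed, predicted):
    """Residual for each observation: y minus y-hat."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]

def sse(observed, predicted):
    """Sum of squares due to error: the sum of squared residuals."""
    return sum(e ** 2 for e in residuals(observed, predicted))

y = [2.0, 4.0, 7.0]        # made-up observed values
y_hat = [2.5, 4.0, 6.5]    # made-up predictions from some fitted equation

print(residuals(y, y_hat))  # [-0.5, 0.0, 0.5]
print(sse(y, y_hat))        # 0.5
```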

Increasing the "white space" in a table by removing unnecessary lines increases all of the following except the

table's size.

The procedure of using sample data to find the estimated regression equation is better known as

the least squares method.
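For simple linear regression, the least squares method picks the slope and intercept that minimize SSE. The closed-form estimates are b1 = ∑(x - x̄)(y - ȳ) / ∑(x - x̄)² and b0 = ȳ - b1x̄. The Python sketch below applies them to made-up data that lie exactly on a line; the function name is hypothetical.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the estimated equation y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx           # estimated slope
    b0 = y_bar - b1 * x_bar  # estimated y-intercept
    return b0, b1

xs = [1, 2, 3, 4]  # made-up data on the line y = 1 + 2x
ys = [3, 5, 7, 9]
b0, b1 = least_squares(xs, ys)
print(b0, b1)  # 1.0 2.0
```

Because these points fall exactly on a line, every residual is zero; with real data the same formulas return the line with the smallest possible SSE.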

If a residual plot of x versus the residuals, y - ŷ, shows a non-linear pattern, then we should conclude that

the regression model is not an adequate representation of the relationship between the variables.

In the simple linear regression equation, the parameter β0 represents the _____ of the true regression line.

y-intercept

When the mean value of the response variable is independent of variation in the predictor variable, the slope of the regression line is

zero.

Consider the Excel output for a simple linear regression model. What is the regression model?

ŷ = B0 + B1x

The following data show the results of an aptitude test and the grade point average of 10 students. If GPA and Aptitude Test Scores are linearly related, which of the following must be true?

β1 ≠ 0

