BA test 2

With reference to exponential forecasting models, a parameter that provides the weight given to the most recent time series value in the calculation of the forecast value is known as the

smoothing constant

The process of making estimates and drawing conclusions about one or more characteristics of a population through analysis of sample data drawn from the population is known as

statistical inference

The graph of the simple linear regression equation is a(n)

straight line

Data mining methods for classifying or estimating an outcome based on a set of input variables are referred to as

supervised learning

The basis for using a normal probability distribution to approximate the sampling distributions of the sample mean and sample proportion is

the central limit theorem

A procedure for using sample data to find the estimated regression equation is

the least squares method
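As a rough illustration of the least squares method, the sketch below computes the intercept and slope estimates for simple linear regression from the standard closed-form formulas. The function name `least_squares` and the data are hypothetical, chosen only for this example.

```python
def least_squares(x, y):
    """Return (b0, b1) minimizing the sum of squared residuals for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x.
b0, b1 = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```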

A set of observations on a variable measured at successive points in time or over successive periods of time constitutes a

time series

A parameter is a numerical measure from a population, such as

μ

When the expected value of the point estimator equals the population parameter, we say the point estimator is

unbiased

A positive forecast error indicates that the forecasting method _________ the dependent variable.

underestimated

The moving averages method refers to a forecasting method that

uses the average of the most recent data values in the time series as the forecast for the next period
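A minimal sketch of a moving averages forecast, assuming a window of k periods. The function name `moving_average_forecast` and the demand figures are hypothetical.

```python
def moving_average_forecast(series, k):
    """Forecast the next period as the mean of the k most recent values."""
    if k <= 0 or k > len(series):
        raise ValueError("k must be between 1 and len(series)")
    recent = series[-k:]
    return sum(recent) / k


# Hypothetical demand history; forecast the next period with k = 3.
demand = [17, 21, 19, 23, 18]
forecast = moving_average_forecast(demand, 3)
print(forecast)
```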

A characteristic or quantity of interest that can take on different values is a

variable

A pizza shop advertises that they deliver in 30 minutes or less or it is free. People who live in homes that are located on the opposite side of town believe it will take the pizza shop longer than 30 minutes to make and deliver the pizza. A random sample of 50 deliveries to homes across town was taken and the mean time was computed to be 32 minutes. What is the appropriate symbol to represent the value, 32?

x̄ = 32

In the graph of the simple linear regression equation, the parameter β0 represents the ______ of the true regression line.

y-intercept

When the mean value of the dependent variable is independent of variation in the independent variable, the slope of the regression line is

zero

In interval estimation, as the sample size becomes larger, the interval estimate

becomes narrower
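One way to see why the interval narrows is to look at the margin of error for a mean when the population standard deviation is known, z·σ/√n. The sketch below assumes that setting; σ = 10 and z = 1.96 are illustrative values, and `margin_of_error` is a hypothetical helper.

```python
import math


def margin_of_error(sigma, n, z=1.96):
    """Margin of error for a mean with known sigma (assumed for illustration)."""
    return z * sigma / math.sqrt(n)


# Interval width is twice the margin of error; it shrinks as n grows.
widths = {n: 2 * margin_of_error(10.0, n) for n in (25, 100, 400)}
for n, width in widths.items():
    print(n, round(width, 2))
```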

As the number of degrees of freedom for a t distribution increases, the difference between the t distribution and the standard normal distribution

becomes smaller

Using α = 0.04, a confidence interval for a population proportion is determined to be 0.65 to 0.75. If the level of significance is decreased, the interval for the population proportion

becomes wider

Which is not true regarding trend patterns?

can result when business conditions shift to a new level at some point in time

The ___________ is a measure of the goodness of fit of the estimated regression equation. It can be interpreted as the proportion of the variability in the dependent variable y that is explained by the estimated regression equation.

coefficient of determination

A ___________ matrix displays a model's correct and incorrect classifications.

confusion

Assessing the regression model on data other than the sample data that was used to generate the model is known as

cross-validation

A ________ refers to a model input that can be controlled in a spreadsheet model.

decision variable

The mean absolute error, mean squared error, and mean absolute percentage error are all methods to measure the accuracy of a forecast. These methods measure forecast accuracy by

determining how well a particular forecasting method is able to reproduce the time series data that are already available.
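A small sketch of the three accuracy measures, taking the forecast error as the actual value minus the forecast. The function name `forecast_accuracy` and the numbers are hypothetical, and MAPE as written assumes no actual value is zero.

```python
def forecast_accuracy(actual, forecast):
    """Return (MAE, MSE, MAPE) for paired actual and forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    # MAPE is expressed as a percentage of the actual values.
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return mae, mse, mape


# Hypothetical actual and forecast values for two periods.
mae, mse, mape = forecast_accuracy([10, 20], [8, 25])
print(mae, mse, mape)
```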

A time series with a seasonal pattern can be modeled by treating the season as a

dummy variable

A variable used to model the effect of categorical independent variables in a regression model is known as a

dummy variable
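As an illustration of dummy variables, the sketch below encodes a categorical variable with k levels as k - 1 indicator columns, leaving one level as the baseline. The helper `dummy_encode` and the quarter labels are hypothetical.

```python
def dummy_encode(values, baseline):
    """Encode categories as 0/1 columns, omitting the baseline level."""
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return rows, levels


# Hypothetical quarterly seasons with Q4 as the baseline (all zeros).
rows, levels = dummy_encode(["Q1", "Q2", "Q3", "Q4", "Q1"], baseline="Q4")
print(levels)
print(rows)
```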

Classifying a record as belonging to one class when it belongs to another class is referred to as a

error

In the simple linear regression model, the ______ accounts for the variability in the dependent variable that cannot be explained by the linear relationship between the variables

error term

A test set is the data set used to

estimate performance of the final model on unseen data

Determine a freshman's likely first-year grade point average from the student's Scholastic Aptitude Test (SAT) score, high school grade point average, and number of extra-curricular activities. This is an example of

estimation of a continuous outcome

The ______ is the range of values of the independent variables in the data used to estimate the regression model

experimental region

Prediction of the mean value of the dependent variable y for values of the independent variables x1, x2, ..., xq that are outside the experimental range is called

extrapolation

Prediction of the value of the dependent variable outside the experimental region is called

extrapolation

Regression analysis involving one dependent variable and more than one independent variable is known as

multiple regression

A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size

n has the same probability of being selected

The set of recorded values of variables associated with a single entity is a

observation

The percent of misclassified records out of the total records in the validation data is known as the

overall error rate
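A minimal sketch of a confusion matrix and the overall error rate for a two-class problem. The labels and the function names `confusion_matrix` and `overall_error_rate` are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count records by (actual class, predicted class) pair."""
    return Counter(zip(actual, predicted))


def overall_error_rate(actual, predicted):
    """Share of records whose predicted class differs from the actual class."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


# Hypothetical validation labels.
actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
cm = confusion_matrix(actual, predicted)
rate = overall_error_rate(actual, predicted)
print(dict(cm), rate)
```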

Fitting a model too closely to sample data, resulting in a model that does not accurately reflect the population is termed as

overfitting

Two approaches to drawing a conclusion in a hypothesis test are

p-value and critical value

With reference to a spreadsheet model, an uncontrollable model input is known as a

parameter

A simple random sample of 31 observations was taken from a large population. The sample mean equals 5. Five is a

point estimate

The population parameter value and the point estimate differ because a sample is not a census of the entire population, but it is being used to develop the

point estimate

A forecast is defined as a

prediction of future values of a time series.

Which statement is not true?

rejecting the null hypothesis when it is true is a type II error.

Causal models

relate a time series to other variables that are believed to explain or cause its behavior

The difference between the observed value of the dependent variable and the value predicted using the estimated regression equation is known as the

residual

The value of the ______ is used to estimate the value of population parameter.

sample statistic

A ______ is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables.

scatter chart

A time series that shows a recurring pattern over one year or less is said to follow a

seasonal pattern

A regression analysis involving one independent variable and one dependent variable is referred to as a

simple linear regression

In the graph of the simple linear regression equation, the parameter β1 is the ________ of the true regression line.

slope

In a simple linear regression analysis the quantity that gives the amount by which the dependent variable changes for a unit change in the independent variable is called the

slope of the regression line

In a simple linear regression model, y = β0 + β1x + ε, the parameter β1 represents the

slope of the true regression line

Which of the following regression models is used to model a nonlinear relationship between the independent and dependent variables by including the independent variable and the square of the independent variable in the model?

Quadratic regression model

Spreadsheet models are referred to as what-if models because they

allow easy instantaneous recalculation for a change in model inputs

A normally distributed error term with a mean of zero would

allow more accurate modeling

The ________ button provides an automatic means of checking for mathematical errors within formulas of a worksheet.

Error checking

___________ uses a weighted average of past time series values as the forecast.

Exponential smoothing
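A rough sketch of exponential smoothing using the usual recursion F(t+1) = α·y(t) + (1 − α)·F(t). Starting with the first observed value as the forecast for period 2 is a common convention assumed here; the function name and data are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts for periods 2 through n+1 of the given series."""
    # Assumed convention: the forecast for period 2 equals the first observation.
    forecasts = [series[0]]
    for y in series[1:]:
        # Weight alpha on the latest actual value, 1 - alpha on the prior forecast.
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical two-period series with smoothing constant 0.5.
forecasts = exponential_smoothing([10, 20], alpha=0.5)
print(forecasts)
```

Because each forecast folds in the previous forecast, every past value carries some weight, with weights shrinking for older observations.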

_____________ is the amount by which the predicted value differs from the observed value of the time series variable.

Forecast error

The purpose of statistical inference is to make estimates or draw conclusions about a

Population based upon information obtained from the sample

A random sample selected from an infinite population is a sample selected such that each element selected comes from the same ___ and each element is selected ___

Population; independently

For a population with an unknown distribution, the form of the sampling distribution of the sample mean is

approximately normal for large sample sizes
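The simulation below is one way to see this behavior. It draws repeated samples from an exponential distribution (a skewed population chosen only for illustration) and checks that the sample means center on the population mean with spread close to σ/√n. The sample size, repetition count, and seed are arbitrary assumptions.

```python
import random
import statistics


def sample_means(n, reps, seed=0):
    """Means of `reps` samples of size n from an exponential(1) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


# Exponential(1) has mean 1 and standard deviation 1.
means = sample_means(n=50, reps=2000)
print(statistics.fmean(means))   # close to the population mean of 1
print(statistics.stdev(means))   # close to 1 / sqrt(50)
```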

A one-way data table summarizes

A single input's impact on the output of interest

The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. What is the point estimate of the proportion of the population that logged onto Facebook that day?

0.35

The CEO of a company wants to estimate the percent of employees that use company computers to go on Facebook during work hours with 95% confidence. He selects a random sample of 150 of the employees and finds that 53 of them logged onto Facebook that day. What is the estimate of the standard error of the proportion?

0.039
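The two Facebook answers can be checked with a short calculation: the point estimate is the sample proportion, and the estimated standard error is √(p̂(1 − p̂)/n). The variable names are illustrative.

```python
import math

n = 150          # employees sampled
successes = 53   # employees who logged onto Facebook

# Point estimate of the population proportion.
p_hat = successes / n

# Estimated standard error of the sample proportion.
std_error = math.sqrt(p_hat * (1 - p_hat) / n)

print(round(p_hat, 2), round(std_error, 3))
```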

What would be the value of the sum of squares due to regression SSR if the total sum of squares SST is 25.32 and the sum of squares due to error SSE is 6.89?

18.43
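The answer follows from the identity SST = SSR + SSE. The sketch below also computes the coefficient of determination, SSR/SST, as an extra step that the question itself does not ask for.

```python
sst = 25.32  # total sum of squares
sse = 6.89   # sum of squares due to error

# Sum of squares due to regression.
ssr = sst - sse

# Coefficient of determination: share of variability explained.
r_squared = ssr / sst

print(round(ssr, 2), round(r_squared, 3))
```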

Demand for a product and the forecasting department's forecast for a product are shown below. Compute the mean absolute error.

2

If the forecasted value of the time series variable for period 2 is 22.5 and the actual value observed for period 2 is 25, what is the forecast error in period 2?

2.5

In order to determine an interval for the mean of a population with unknown standard deviation, a sample of 24 items is selected. The mean of the sample is determined to be 23. The number of degrees of freedom for reading the t value is

23

If the expected value of a sample statistic is equal to the population parameter being estimated, the sample statistic is said to

be an unbiased estimator of the population parameter

_______ is used to test the hypothesis that the values of the regression parameters β1, β2, ..., βq are all zero

An F test

A ________ classifies a categorical outcome variable by splitting observations into groups via a sequence of hierarchical rules.

Classification tree

The modeling process begins with the framing of a _______ model that shows the relationships between the various parts of the problem being modeled.

Conceptual

___________ involves descriptive statistics, data visualization, and clustering.

Data exploration

___________ is dividing the sample data into three sets for training, validation, and testing of the data mining algorithm performance.

Data partitioning
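A minimal sketch of splitting sample data into training, validation, and test sets. The 60/20/20 proportions, the seed, and the function name `split_three_ways` are assumptions made for the example, not values from the course material.

```python
import random


def split_three_ways(records, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle the records and split them into train, validation, and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    # Whatever remains after training and validation becomes the test set.
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Ten hypothetical record IDs.
train, valid, test = split_three_ways(range(10))
print(len(train), len(valid), len(test))
```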

___________ is the step in data mining that includes addressing missing and erroneous data, reducing the number of variables, defining new variables, and data exploration.

Data preparation

____________ is the manipulation of the data with the goal of putting it in a form suitable for formal modeling.

Data preparation

_________ is a method of extracting data relevant to the business problem under consideration. It is the first step in the data mining process.

Data sampling

In a linear regression model, the variable that is being predicted or explained is known as __________. It is denoted by y and is often referred to as the response variable.

Dependent variable

A student wants to determine if pennies are really fair when flipped, meaning equally likely to land heads up or tails up. He flips a random sample of 50 pennies and finds that 28 of them land heads up. If p denotes the true probability of a penny landing heads up when flipped, what are the appropriate null and alternative hypotheses?

H0: p = 0.5, Ha: p ≠ 0.5
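The question only asks for the hypotheses, but the two-tailed test of p = 0.5 could be carried out as sketched below. This assumes the large-sample normal approximation for a one-proportion z test; the function name is hypothetical, and the 0.05 comparison level mentioned afterwards is illustrative.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-tailed z test for a proportion using the normal approximation."""
    p_hat = successes / n
    # The standard error uses the hypothesized proportion p0.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# 28 heads in 50 flips, testing p = 0.5.
z, p_value = one_proportion_z_test(28, 50, 0.5)
print(round(z, 2), round(p_value, 2))
```

With these numbers the p-value is well above 0.05, so under this approach the null hypothesis would not be rejected.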

A pizza shop advertises that they deliver in 30 minutes or less or it is free. People who live in homes that are located on the opposite side of town believe it will take the pizza shop longer than 30 minutes to make and deliver the pizza. Write the null and alternative hypotheses that can be used to conduct a significance test.

H0: μ ≤ 30, Ha: μ > 30

The owners of a fast food restaurant have automatic drink dispensers to help fill orders more quickly. When the 12 ounce button is pressed, they would like for exactly 12 ounces of beverage to be dispensed. There is however, some variation in this amount. The company does not want the machine to systematically over fill or under fill the cups. Which of the following gives the correct set of hypotheses?

H0: μ = 12, Ha: μ ≠ 12

The average number of hours for a random sample of mail order pharmacists from Company A was 50.1 hours last year. It is believed that changes to medical insurance have led to a reduction in the average work week. To test the validity of this belief, the hypotheses are

H0: μ ≥ 50.1, Ha: μ < 50.1

The __________ function is used for the conditional computation of expressions in Excel.

IF

In a linear regression model, the variable used for predicting or explaining values of the response variable is known as the ________. It is denoted by x.

Independent variable

_________ is a generalization of linear regression for predicting a categorical outcome variable.

Logistic regression

__________ attempts to classify a categorical outcome as a linear function of explanatory variables.

Logistic regression

Which of the following measures of forecast accuracy is susceptible to the problem of positive and negative forecast errors offsetting one another?

Mean forecast error

The degree of correlation among independent variables in a regression model is called

Multicollinearity

__________ refers to the degree of correlation among independent variables in a regression model.

Multicollinearity

________ refers to the scenario in which the analyst builds a model that does a great job of explaining the sample of data on which it is based but fails to accurately predict outside the sample data.

Overfitting

___________ is a statistical procedure used to develop an equation showing how two variables are related.

Regression analysis

What are the two decisions that you can make from performing a hypothesis test?

Reject the null hypothesis; Fail to reject the null hypothesis.

Determine whether the alternative hypothesis is left-tailed, right-tailed, or two-tailed: H0: μ = 11, Ha: μ > 11.

Right tailed

The __________ function pairs each element of the first array with its counterpart in the second array, multiplies the elements of the pairs together, and adds the results.

SUMPRODUCT
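For comparison, the same pair-multiply-and-add calculation can be sketched in Python. The function `sumproduct` is a hypothetical stand-in written for this example, not Excel's implementation; it rejects arrays of different lengths.

```python
def sumproduct(first, second):
    """Multiply paired elements of two equal-length arrays and add the results."""
    if len(first) != len(second):
        raise ValueError("arrays must have the same dimension")
    return sum(a * b for a, b in zip(first, second))


# Hypothetical unit costs and quantities: 1*4 + 2*5 + 3*6.
total = sumproduct([1, 2, 3], [4, 5, 6])
print(total)
```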

The _______ button in the Formula Auditing group allows the user to inspect each formula in detail in its cell location.

Show Formulas

The ______ is a measure of the error that results from using the estimated regression equation to predict the values of the dependent variable in the sample.

Sum of squares due to error SSE

_______ is a category of data mining techniques in which an algorithm learns how to classify or estimate an outcome variable of interest.

Supervised learning

With reference to the SUMPRODUCT function, which of the following statements is true?

The arrays that appear as arguments must be of the same dimension.

Which of the following approaches is a good way to proceed with the influence diagram building for a problem?

The influence diagram for a portion of the problem is built first and then expanded until the total problem is conceptually modeled.

__________ is used to test the hypothesis that the values of the regression parameters β1, β2, ..., βq are all zero

An F test

Trend refers to

The long-run shift or movement in the time series observable over several periods of time.

Which of the following states the objective of the moving averages and exponential smoothing methods?

To smooth out random fluctuations in the time series.

Which of the following states the objective of time series analysis?

To uncover a pattern in a time series and then extrapolate the pattern into the future.

Larger values of α have the disadvantage of increasing the probability of making a

Type I error

The ___________ function allows the user to pull a subset of data from a larger table of data based on some criterion.

VLOOKUP

A sample of 37 AA batteries had a mean lifetime of 584 hours. A 95% confidence interval for the population mean was 579.2 < μ < 588.8. Which statement is the correct interpretation of the results?

We are 95% confident that the mean lifetime of all the batteries in the population is between 579.2 hours and 588.8 hours.

The proportion of dental procedures that are extractions is 0.16. Which of the following exemplifies a Type I error in this situation?

We reject the claim that the proportion of dental procedures that are extractions is 0.16 when the proportion is actually 0.16

A Type I error is committed when

a true null hypothesis is rejected

The SUM function in Excel

adds up all the numbers in a range of cells

The conceptual model

helps in organizing the data requirements

A one-tailed test is a hypothesis test in which the rejection region is

in one tail of the sampling distribution

A _________ is a visual representation that shows which entities affect others in a model.

influence diagram

An estimate of a population parameter that provides an interval of values believed to contain the value of the parameter is known as the

interval estimate

The coefficient of determination

is used to evaluate the goodness of fit

A null and alternative hypothesis for a one-proportion z test are given as H0: p = 0.8, Ha: p < 0.8. This hypothesis test is

lower-tailed.

A ________ decision is one in which companies have to decide whether they should manufacture a product or outsource production to another firm.

make-versus-buy

Statistical significance at the 0.01 level is _______ than significance at the 0.05 level.

more difficult to achieve

You are _____ to commit a Type I error using the 0.05 level of significance than using the 0.01 level of significance.

more likely

