Exam #2 Review BSIS

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

_____ is a statistical procedure used to develop an equation showing how two variables are related.

Regression analysis

Danah is responsible for reporting the status of sales for his company. The following pie chart shows the percentages of closed sales in each of the top 7 cities. Which of the following is NOT a problem with using a pie chart to display these data?

"it provides general information at a glance"

In the simple linear regression equation, the parameter B1 is the _______ of the true regression line. slope x-intercept y-intercept end-point

slope

What would be the coefficient of determination if the total sum of squares (SST) is 30 and the sum of squares due to regression (SSR) is 27?

coefficient of determination = SSR/SST

Natalie needs to compare values across different categories. Which of the following charts should Natalie use? pie chart column (bar) chart bubble chart heat map

column (bar) chart key word category

When we use the estimated regression equation to develop an interval that can be used to predict the mean for ALL units that meet a particular set of given criteria, that interval is called a

confidence interval

The process of making conjecture about the value of a population parameter, collecting sample data that can be used to assess this conjecture, measuring the strength of the evidence against the conjecture that is provided by the sample, and using these results to draw a conclusion about the conjecture is known as

hypothesis testing

A variable used to model the effect of categorical independent variables is called a(n)

dummy variable

In the simple linear regression model, the ________ accounts for the variability in the dependent variable that cannot be explained by the linear relationship between x and y.

error term

The term in the multiple regression model that accounts for the variability in y that cannot be explained by the linear effect of the q independent variables is the

error term, E

The _____ is the range of values of the independent variables in the data used to estimate the regression model. confidence interval codomain experimental region validation set

experimental region

Prediction of the value of the dependent variable outside the experimental region is called

extrapolation

_____ refers to the use of sample data to calculate a range of values that is believed to include the unknown value of a population parameter.

hypothesis testing

The phenomenon by which the value of an estimate generally gets closer to the value of the parameter being estimated as the sample size grows is called the _____.

law of large numbers

The procedure of using sample data to find the estimated regression equation is better known as point estimation. interval estimation. the least squares method. extrapolation.

least squares method

DJ needs to display data over time. Which of the following charts should DJ use?

line chart

The Golf Course manager must report the number of visits to the course over the last 12 months. These data are shown in the table. How could these data be best displayed?

line chart

The study of how a dependent variable y is related to two or more independent variables is called

multiple linear regression

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the error term ε follows a ________ distribution for all values of x.

normal

When there are many independent variables to consider, special procedures are sometimes employed to select the independent variables to include in the regression model. All of the following are examples of variable selection procedures except for backward elimination. forward selection. best subsets. overfitting.

overfitting

When determining the best estimated regression equation to model a set of data, the procedure that uses an iterative variable selection procedure that considers adding an independent variable and removing an independent variable at each step is called

stepwise selection

An approximation of the linear relationship between variables in a chart can be represented with a

trendline

A bar chart is a graphical presentation that uses horizontal bars to display the magnitude of quantitative data. has two axes that represent two variables, and the magnitude of the third variable is given by the size of the bubble. updates in real time and gives multiple outputs. indicates the trend of data but not magnitude.

uses horizontal bars to display the magnitude of quantitative data

Which of the following options is NOT an iterative variable selection procedure? backward elimination. best subsets regression. forward selection. stepwise regression.

best subsets regression at each step of an iterative variable selection procedure, a single independent variable is added or deleted and the new model is evaluated

Which of the following options guarantees that the best model for a given number of variables will be found? backward elimination. best subsets regression. forward selection. stepwise regression.

best subsets regression guarantees that the best model for a given number of variables will be found because it compares all possible models with a specified number of independent variables

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the error term ε is a random variable with a mean or expected value of

0

Which of the following charts is appropriate for displaying the data given below? Level Frequency

Bar graph Heat maps are not good for categorical data

Danah is responsible for reporting the status of sales for his company. The following pie chart shows the percentages of closed sales in each of the top 7 cities, but a scatter chart would be preferable to show data of this type.

False. Pie charts are useful for categorical data

What would be the value of the sum of squares due to regression (SSR) if the total sum of squares (SST) is 22.21 and the sum of squares due to error (SSE) is 6.89?

SST = SSR + SSE 22.21 = SSR + 6.89

Suppose an estimated regression equation has a coefficient of determination (r^2) of 0.866. Interpret this value.

The estimated regression equation explains approximately 86.6% of the variation in the dependent variable. The coefficient of determination is the proportion of the variability in the dependent variable y that is explained by the estimated regression equation

Where to find coefficient of determination in regression model and interpret:

Under regression statistics, "R square"

When the mean value of the response variable is independent of variation in the predictor variable, the slope of the regression line is

Zero.

Edward Tufte introduced the idea of the data-ink ratio, as a way of quantifying the proportion of "data-ink" to the total amount of ink used in a table or chart. Which of the following options would increase the data-ink ratio of a table? adding table borders increasing the row heights and column widths for all rows and columns in the table rounding off the data to reduce excessive decimal precision adding a title to the table

adding a title to the table (provides additional information)

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the values of ε are

independent

A graphical presentation used to examine more than two variables in which each variable is represented by a different vertical axis is called a

parallel coordinates plot

A _____ interval is an interval estimate of an individual y value given values of the independent variables.

prediction

When we use the estimated regression equation to develop an interval that can be used to predict the mean for a specific unit that meets a particular set of given criteria, that interval is called a

prediction interval

If developing a regression model to make future predictions, the selection of the independent variables to include in the regression model should be based on the _____ on observations that have not been used to train the model.

predictive accuracy

To better understand the relationship between advertising dollars spent and the subsequent sales, you could create a _____ chart.

scatter chart

What type of regression model should be used when there is a nonlinear relationship between the independent and dependent variables that is fit by including the independent variable and the square of the independent variable?

quadratic regression model

In a regression analysis, if SSE = 200 and SSR = 300, then the coefficient of determination is

r^2 = SSR / (SSE + SSR) 300/(200+300) = .6

A _____ is used to visualize sample data graphically and to draw preliminary conclusions about the possible relationship between the variables. contingency table scatter chart bar chart pie chart

scatter chart

The _____ is a measure of the error that results from using the estimated regression equation to predict the values of the dependent variable in a sample.

sum of squares due to error (SSE)

The tests of significance in regression analysis are based on assumptions about the error term ε. One such assumption is that the variance of ε, denoted by σ^2 , is

the same for all values of x

In the simple linear regression equation, the parameter B0 represents the _____ of the true regression line.

y-intercept


Ensembles d'études connexes

NCLEX Pediatric Drugs, Integumentary, Sensory perception, NMNC 1110 EAQ 6: Pain, Clinical judgement-Adaptive Assessments (Nursing Concepts Online for RN 2.0, 2nd Edition), Chapter 13 fundies, Nursing Process PrepU (with explanations), Module 1 Exam,...

View Set

Chapter 42: Drug Therapy for Hyperthyroidism and Hypothyroidism

View Set

ATR 4132 Quiz 1-4, Exam 2-3 Material Human Injuries

View Set