BI310 Final

___________ refers to describing the important aspects of a set of measurements.

Descriptive statistics

In a simple linear regression model, the coefficient of determination not only indicates the strength of the relationship between the independent and dependent variables, but also shows whether the relationship is positive or negative.

False

In simple linear regression analysis, if the error terms exhibit a positive or negative autocorrelation over time, then the assumption of constant variance is violated.

False

Selecting many different samples and running many different tests can eventually produce a result that makes a desired conclusion be true.

False

The error term is the difference between an individual value of the dependent variable and the corresponding mean value of the dependent variable.

True

The experimental region is the range of the previously observed values of the dependent variable.

False

The least squares simple linear regression line minimizes the sum of the vertical deviations between the line and the data points.

False

The science of describing the important aspects of a set of measures is called statistical inference.

False

When there is positive autocorrelation, over time, negative error terms are followed by positive error terms and positive error terms are followed by negative error terms.

False

When using simple regression analysis, if there is a strong correlation between the independent and dependent variables, then we can conclude that an increase in the value of the independent variable causes an increase in the value of the dependent variable.

False

The point estimate of the variance in a regression model is

MSE

Sampling error occurs because the mean of a random sample cannot be expected to exactly equal the population mean that we are attempting to estimate.

True

The estimated simple linear regression equation minimizes the sum of the squared deviations between each value of Y and the line

True

The number of sick days per month taken by employees for the last 10 years at Apex Co. is an example of time series data.

True

The residual is the difference between the observed value of the dependent variable and the predicted value of the dependent variable.

True
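
The cards above on the least squares line, the residual, and MSE can be tied together in a minimal Python sketch. The data values and the function name `fit_line` are hypothetical, made up for illustration; the formulas are the standard least squares point estimates.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # SSxy and SSxx are the usual sums of cross-products and squares.
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data, not taken from the course material.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)

# Residual = observed y minus predicted y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# SSE is the quantity the least squares line minimizes.
sse = sum(e ** 2 for e in residuals)

# MSE = SSE / (n - 2) is the point estimate of the error variance.
mse = sse / (len(x) - 2)

print(b0, b1, sse, mse)
```

Note that the residuals sum to (approximately) zero, which is why the line is defined by the sum of squared deviations rather than the plain sum of deviations.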

An example of manipulating a graphical display to distort reality is ___________.

stretching the axis

If the Durbin-Watson statistic is less than dL, then we conclude that

there is significant positive autocorrelation

Which of the following is the best analytic dashboard graphical method for visualizing hierarchical information?

treemap

If r = −1, then we can conclude that there is a perfect relationship between X and Y.

true

Stem-and-leaf display is best used to ___________.

display the shape of the distribution

an ______________ is one unit of a population.

element

A data set provides information about some group of individual _____________.

elements

All of the following are assumptions of the error terms in the simple linear regression model except

error terms are dependent on each other.

The _____________ is the range of the previously observed values of x.

experimental region

In simple regression analysis, the quantity $\sum (\hat{y}_i - \bar{y})^2$ is called the __________ sum of squares.

explained
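
As a rough illustration of the explained sum of squares, the sketch below splits the total variation in y into explained and unexplained parts for a small hypothetical data set. The variable names are illustrative assumptions; the identity total = explained + unexplained holds for any least squares line with an intercept.

```python
# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# Explained variation: predicted values versus the mean of y.
explained = sum((yh - y_bar) ** 2 for yh in y_hat)
# Unexplained variation (SSE): observed values versus predicted values.
unexplained = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
# Total variation: observed values versus the mean of y.
total = sum((yi - y_bar) ** 2 for yi in y)

# Coefficient of determination: share of total variation that is explained.
r_squared = explained / total

print(explained, unexplained, total, r_squared)
```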

A population that consists of all the customers who will use the drive-thru of the local fast food restaurant is called a(n) _____________.

finite population

When the assumption of __________ residuals (error terms) is violated, the Durbin-Watson statistic is used to test to determine if there is significant _____________ among the residuals.

independent, autocorrelation

__________________ assigns a value to a variable

measurement

A person's telephone area code is an example of a(n) _____________ variable.

nominative

A _____________________ is bell-shaped and symmetric about the high point of the curve.

normal curve

If successive values of the residuals are close together, then there is a ___________ autocorrelation and the value of the Durbin-Watson statistic is _________.

positive, small

As a measure of variation, the sample ___________ is easy to understand and compute. It is based on the two extreme values and is therefore a highly unstable measure.

range

A statistical model is a set of assumptions based solely on the sample data that have been selected.

False

Business analytics uses methods that are not part of traditional statistics to look at big data

False

The Durbin-Watson test statistic ranges from

0 to 4

What value of the Durbin-Watson statistic indicates that there is no autocorrelation present in time-ordered data?

2
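
The Durbin-Watson cards can be checked numerically with a short sketch. The function name `durbin_watson` and both residual series are hypothetical; the formula is the standard one, the sum of squared successive differences divided by the sum of squared residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals whose neighbors are close together
# (the pattern associated with positive autocorrelation).
smooth = [1.0, 0.9, 0.8, -0.7, -0.8, -0.9]

# Hypothetical residuals that flip sign every period
# (the pattern associated with negative autocorrelation).
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

d_smooth = durbin_watson(smooth)
d_alternating = durbin_watson(alternating)

print(d_smooth)       # well below 2
print(d_alternating)  # well above 2
```

Values near 0 point toward positive autocorrelation, values near 4 toward negative autocorrelation, and values near 2 toward none. Whether a given value is significant still depends on the tabled bounds dL and dU.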

Bullet graphs are a method of ____________

Descriptive analytics

___________ are graphical summaries of data intended to aid the understanding of up-to-the-minute information about the operational status of a business.

Descriptive analytics

___________ is a set of tools for finding unusual observations in a data set. These observations may merit investigation.

Anomaly (outlier) detection

A ____________ variable can have values that are numbers on the real number line

quantitative

The dollar amount on an accounts receivable invoice.

quantitative

The national debt of the United States in 2015.

quantitative

The net profit for a company in 2015.

quantitative

An examination of all of the population units is called a ___________.

Census

___________ is a set of techniques for assigning observations to the most appropriate of several pre-specified categories.

Classification

___________ is a set of techniques for finding inherent groupings or clusters within a data set without having to pre-specify a set of categories.

Cluster Detection

The ______________ is a quantity that measures the variation of a population or sample relative to its mean.

Coefficient of variation
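
A minimal sketch of the coefficient of variation, assuming the common definition (standard deviation divided by the mean, times 100). The function name and both data sets are hypothetical.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100

# Two hypothetical samples with the same standard deviation (2)
# but very different means.
small_mean = [10, 12, 14]
large_mean = [100, 102, 104]

cv_small = coefficient_of_variation(small_mean)
cv_large = coefficient_of_variation(large_mean)

print(cv_small, cv_large)
```

Both samples have the same spread in absolute terms, but the spread is far larger relative to the mean in the first sample, which is what the coefficient of variation captures.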

___________ should not be used to make valid statistical inferences about a population.

Convenience, voluntary, and judgment sampling

___________ is the use of predictive analytics, algorithms, and information system techniques to extract useful knowledge from huge amounts of data.

Data Mining

___________ is a set of techniques for reducing a large number of correlated variables to a smaller group of underlying factors describing the essential aspects of a situation.

Factor Detection

A graphical portrayal of a quantitative data set that divides the data into classes and gives the frequency of each class is a(n) ___________.

Histogram

The least squares regression line minimizes the sum of the

squared differences between actual and predicted Y values.

Statistical ____________ refers to using a sample of measurements and making generalizations about the important aspects of a population

Inference

Door choice on Let's Make A Deal: Door #1, Door #2

Nominative

Personal computer ownership: Yes, No

Nominative

Income tax filing status: Married filing jointly, Married filing separately

Nominative

Restaurant rating: *****, ****, ***, **, *

Ordinal

Statistics course letter grade: A, B, C, D, F

Ordinal

Television show classifications: TV-G, TV-PG, TV-14, TV-MA

Ordinal

___________ is the use of anomalies, patterns, and associations to predict future outcomes or their probabilities.

Prediction

___________ are methods for finding anomalies, patterns, and associations in data which can be used to predict future outcomes.

Predictive Analytics

Factor detection, outlier detection, and association learning are all methods of ___________________.

Predictive analytics

___________ is the generation of courses of action based upon results from predictive analytics, supplemented by values of relevant variables.

Prescriptive analytics

___________ sampling is where we know the chance that each element will be included in the sample, which allows us to make statistical inferences about the sampled population.

Probability

The simple linear regression (least squares method) minimizes

SSE

When error terms exhibit a positive or negative autocorrelation over time, the assumption of independence is violated

True

A relative frequency curve having a long tail to the right is said to be ___________.

Skewed to the right

If we sample without replacement, we do not place the unit chosen on a particular selection back into the population.

True

A simple linear regression model is an equation that describes the straight-line relationship between a dependent variable and an independent variable

True

By taking a systematic sample in which we select every 100th shopper arriving at a specific store, we are approximating a random sample of shoppers.

True

In a simple linear regression model, the slope term is the change in the mean value of y associated with _____________ in x.

a one-unit increase

Any characteristic of a population unit is a(n)

Variable

A flaw possessed by a population or sample unit is ___________.

a defect

Which of the following is a violation of the independence assumption?

all of the following: a pattern of cyclical error terms over time; a pattern of alternating error terms over time; negative autocorrelation; positive autocorrelation

___________ is a method of finding characteristics that tend to occur together and finding descriptions of how these characteristics are associated.

association learning

The general term for a graphical display of categorical data made up of vertical or horizontal bars is called a(n) ___________.

bar chart

______________ and _____________ are used to describe qualitative (categorical) data.

bar charts, pie charts

As a general rule, when creating a stem-and-leaf display, there should be ______ stem values.

between 5 and 20
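
A stem-and-leaf display can be sketched in a few lines of Python. This is an illustrative construction for non-negative whole numbers only, splitting each value into a stem (tens and above) and a leaf (ones digit). The function name and the scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit and above) to its sorted leaves (ones digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

# Hypothetical exam scores, made up for illustration.
scores = [52, 61, 64, 67, 70, 73, 75, 78, 79, 81, 85, 93]

display = stem_and_leaf(scores)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

The printed rows form a sideways picture of the distribution's shape, here with five stems. One simplification of this sketch is that a stem with no leaves is skipped rather than printed as an empty row.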

Which of the following is not a method of predictive analytics?

bullet graphs

Pie charts, Pareto charts, and bar charts are used with ___________________/___________________ data

categorical/ qualitative

The ____________ assumption requires that all variation around the regression line should be equal at all possible values (levels) of the ___________variable.

constant variance, independent

The _____________ measures the strength of the linear relationship between the dependent variable and the independent variable.

correlation coefficient

Which of the following is a measure of the strength of the linear relationship between x and y that is dependent on the units in which x and y are measured?

covariance
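
The unit dependence of covariance, in contrast to the correlation coefficient, can be seen in a small sketch. The heights and weights below are hypothetical values, and the function names are illustrative.

```python
import math

def sample_covariance(x, y):
    """Sample covariance with the usual n - 1 divisor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Sample correlation: covariance divided by both standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Hypothetical data: the same heights expressed in meters and in centimeters.
height_m = [1.5, 1.6, 1.7, 1.8]
weight_kg = [50, 58, 66, 80]
height_cm = [100 * h for h in height_m]

cov_m = sample_covariance(height_m, weight_kg)
cov_cm = sample_covariance(height_cm, weight_kg)
r_m = correlation(height_m, weight_kg)
r_cm = correlation(height_cm, weight_kg)

print(cov_m, cov_cm)  # covariance grows by a factor of 100
print(r_m, r_cm)      # correlation is unchanged
```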

__________________________ looks at data collected at the same point in time.

cross-sectional analysis

________________ is the science of describing the important aspects of a set of measurements.

descriptive statistics

Which of the following is not a supervised learning technique in predictive analytics?

factor analysis

If we examine some of the population measurements, we are conducting a census of the population.

false

When the constant variance assumption holds, a plot of the residual versus x

forms a horizontal band pattern

The number of measurements falling within a class interval is called the ___________.

frequency

The ___________ the $r^2$ and the __________ the s (standard error), the stronger the relationship between the dependent variable and the independent variable.

higher, lower

Which one of the following graphical tools is used with quantitative data?

histogram

If there is significant autocorrelation present in a data set, the ________________ assumption is violated.

independence of error terms

Temperature (in degrees Fahrenheit) is an example of a(n) __________ variable

interval

Any value of the error term in a regression model _____________ any other value of the error term

is independent of

The __________ the $r^2$, the better the prediction model.

larger

the __________________ direction defines the skewness of the graph, in this case skewed to the right.

long tail

When using simple linear regression, we would like to use confidence intervals for the ___________ and prediction intervals for the ___________ at a given value of x.

mean y-value, individual y-value
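
For reference, the two intervals are usually written as follows in simple linear regression, assuming the standard notation: $\hat{y}$ is the point prediction at $x_0$, $s$ is the standard error, $SS_{xx} = \sum (x_i - \bar{x})^2$, and $t_{\alpha/2}$ is based on $n - 2$ degrees of freedom.

```latex
% Confidence interval for the mean value of y at x_0
\hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}

% Prediction interval for an individual value of y at x_0
\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}
```

The extra 1 under the square root makes the prediction interval wider than the confidence interval at the same $x_0$, because an individual value varies around the mean in addition to the uncertainty in estimating the mean itself.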

Another name for 50th percentile

median

In simple regression analysis, the standard error is ___________ greater than the standard deviation of y values.

never

__________ is a necessary component of a runs plot.

observation over time

Measurements from a population are called

observations

An identification of police officers by rank would represent a(n) ____________ level of measurement.

ordinal

______ & ___________ are used for a single qualitative variable

pie charts and bar charts

A set of all elements we wish to study is called a ____________.

population

One method of determining whether a sample being studied can be used to make statistical inferences about the population is to

produce a runs plot

The change in the daily price of a stock is what type of variable?

quantitative

A(n) ____________ variable can have values that indicate into which of several categories of a population it belongs.

qualitative

The advertising medium (radio, television, or print) used to promote a product.

qualitative

The stock exchange on which a company's stock is traded.

qualitative

If one of the assumptions of the regression model is violated, performing data transformations on the ____________ can remedy the situation.

response variable

A ________ and ___________ both look at data over time

runs plot, time series analysis

A ___________ is a subset of the units in a population.

sample

The point estimate of the _______________ is the positive square root of the sample variance.

sample standard deviation

When we are choosing a random sample and we do not place chosen units back into the population, we are:

sampling without replacement

A _________________ is a graphical display of the relationship between two variables

scatter plot

A ______________ shows the relationship between two variables.

scatter plot

Data collected for a particular study are referred to as a data ____________.

set

After plotting the data points on a scatter diagram, we have observed an inverse relationship between the independent variable (X) and the dependent variable (Y). Therefore, we can expect both the sample ___________ and the sample _____________ to be negative values.

slope, correlation coefficient

_______________ is the science of using a sample of measurements to make generalizations about the population of measurements.

statistical inference

_____________ is a set of assumptions about how the sample data are selected and about the population from which the sample data are selected.

statistical model

____________ is the science of describing aspects of a set of measurements

statistics

________________ & _________________ are used for displaying a single quantitative variable.

stem-and-leaf displays / histograms

The _____ distribution is used for testing the significance of the slope term

t
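
A minimal sketch of the t statistic for the slope, assuming the usual formula t = b1 / s_b1 with s_b1 = s / sqrt(SSxx) and n - 2 degrees of freedom. The data are hypothetical and the variable names are illustrative.

```python
import math

# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares point estimates.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Standard error s = sqrt(SSE / (n - 2)).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope estimate, then the test statistic.
s_b1 = s / math.sqrt(ss_xx)
t_stat = b1 / s_b1
df = n - 2

print(t_stat, df)
```

The statistic is compared with a critical value from the t distribution with n - 2 degrees of freedom. With 3 degrees of freedom the two-sided 5% critical value is 3.182, so the slope in this made-up example would not be judged significant at that level.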

If we collect data on the number of wins the Dallas Cowboys earned each of the past 10 years, we have _____________ data.

time series

Which of the following is not an example of unethical statistical practices?

using graphs to make statistical inferences

____________ is a characteristic of an element

variable

_____________ are characteristics of elements in a population.

variables

Which of the following is a categorical variable?

whether a person has a traffic violation

The ___________ of the simple linear regression model is the mean value of y when x is zero.

y-intercept

