BUS 245 exam 3
which of the following sample correlation coefficients shows the strongest linear association between x and y?
-0.95
the sample correlation coefficient takes a value between
-1 and 1
a test in which the null hypothesis is rejected only on one side of the hypothesized value of the population parameter
one-tailed test
a test in which the null hypothesis can be rejected on either side of the hypothesized value of the population parameter
two-tailed test
we do not reject the null hypothesis when the p-value is
>= α
y=B0+B1x+E which symbol represents the intercept?
B0
y=B0+B1x+E which symbol represents the slope?
B1
a type II error occurs when we
Do not reject the null hypothesis when it is actually false
which symbol represents random error in the linear regression model? y=B0+B1x+E
E
the alternative hypothesis for a one-sided test looks like:
Ha: μ < μ0
in a simple linear regression, a downward sloping trend line suggests which of the following?
a negative linear relationship between x and y
in a simple linear regression, an upward sloping trend line suggests which of the following?
a positive linear relationship between x and y
the significance level is the probability of making
a type I error
which of the following statements best defines a test statistic in a hypothesis test?
a variable upon which the decision in hypothesis testing is based
contradicts the default state or status quo specified in the null hypothesis
alternative hypothesis
the _________ _________ typically contests the status quo and may suggest a corrective action if true
alternative hypothesis
which of the following identifies the range for a correlation coefficient?
any value between -1 and 1, inclusive
which of the following is a possible advantage of using multiple tools to judge the validity of a regression model?
avoid the risk of using the wrong model
which of the following are the estimated model coefficients of the simple linear regression equation?
b0 and b1
in a simple linear regression model, which of the coefficients in the estimated sample regression equation indicates the change in the predicted value of y when x increases by one unit?
b1
the coefficient of determination can assume which of the following values?
between zero and one
in practice, we use a stochastic model over a deterministic model because
certain variables that impact the response variable are not included in the model
an important final conclusion to a statistical test is to
clearly interpret the results in terms of the initial claim
the ____ Se is to 0, the better the model fits the data
closer
The goodness-of-fit measure that quantifies the proportion of the variation in the response variable that is explained by the sample regression equation is the
coefficient of determination
the proportion of the sample variation in the response variable that is explained by the sample regression equation
coefficient of determination
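As a rough illustration of the card above, the coefficient of determination can be computed as R^2 = 1 - SSE/SST. The data and the fitted values below are made up for this sketch (they assume the fitted line y^ = 2.2 + 0.6x on x = 1..5), and the function name `r_squared` is hypothetical.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: R^2 = 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    # SSE: unexplained variation (squared residuals)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    # SST: total variation in y around its mean
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


# Illustrative (made-up) observations and fitted values
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(round(r_squared(y, y_hat), 4))  # 0.6
```

Here 60% of the sample variation in y is explained by the sample regression equation.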
what are the goodness-of-fit measures?
coefficient of determination, adjusted coefficient of determination, standard error of the estimate
an alternative hypothesis
contradicts the status quo
what is the difference between correlation and causation?
correlation means that two variables are related, but causation means that one variable causes another to happen
this approach is used when a computer is not available and all calculations must be done by hand
critical value approach
matched pairs sampling is an example of
dependent sampling
in regression analysis, the response variable is also called the
dependent variable
when the response variable is uniquely determined by the explanatory variable, the relationship is
deterministic
unlike R^2, adjusted R^2 can be used to compare regression models with
different numbers of explanatory variables
one limitation of correlation analysis is that it
does not imply causation
a measure of degree of variability that exists even if all population means are the same; measures the unexplained variation in the response variable
error sum of squares (SSE)
we can generally reduce both type I and type II errors simultaneously by
increasing the sample size
two or more samples are _______ if the process that generates one sample is completely separate from the process that generates the other sample
independent
two or more random samples are considered independent if the process that generates one sample is completely separate from the other sample
independent random samples
in hypothesis tests about the population correlation coefficient, the alternative hypothesis of not equal to zero is used when testing whether two variables are
linearly related
a confidence interval for the mean difference μD follows the general format of a point estimate ±
margin of error
if the correlation between the response variable and the explanatory variables is sufficiently low, then adjusted R^2
may be negative
one limitation of correlation analysis is that it
may not be a reliable measure when outliers are present in one or both of the variables
For matched-pairs sampling, the parameter of interest is referred to as the
mean difference
allows us to examine how the response variability is influenced by two or more explanatory variables
multiple linear regression model
if the value of the sample covariance between the two random variables X and Y equals -150 then we can conclude that X and Y have a
negative linear relationship
a binomial distribution can be approximated by a _______ distribution for large sample sizes
normal
statistical inference concerning the difference in population means is based on the condition that the sampling distribution of x̄1 - x̄2 follows a
normal distribution
when testing μ, the p-value is the probability of obtaining a sample mean at least as large or at least as small as the one derived from a given sample, assuming the ___ hypothesis is true
null
corresponds to the presumed default state of nature or status quo
null hypothesis
the p-value is calculated assuming the
null hypothesis is true
what values can the standard error of the estimate se assume?
0 <= se < infinity
one limitation of correlation analysis is that it
only captures a linear relationship between two variables
a regression technique for fitting a straight line whereby the error sum of squares is minimized
ordinary least squares
name the mathematical method that produces the "best fitting trend line"
ordinary least squares
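A minimal sketch of ordinary least squares for one explanatory variable, using the standard closed-form formulas b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)^2 and b0 = ȳ - b1·x̄. The function name `ols_fit` and the data are assumptions made for illustration only.

```python
def ols_fit(x, y):
    """Return (b0, b1), the intercept and slope that minimize SSE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = ols_fit(x, y)
print(round(b0, 4), round(b1, 4))  # 2.2 0.6
```

The positive slope b1 corresponds to an upward sloping trend line, so the predicted y rises by about 0.6 for each one-unit increase in x.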
a few extreme high or low values in the data set are called
outliers
we can reject the null hypothesis when the
p-value < α
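The decision rule of the p-value approach can be sketched in a few lines. The function name `decide` and the default significance level of 0.05 are assumptions chosen for illustration, not part of the course material.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule for a hypothesis test."""
    # Reject H0 only when the p-value is strictly below alpha;
    # otherwise (p-value >= alpha) we do not reject H0.
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"


print(decide(0.03))  # reject H0
print(decide(0.05))  # do not reject H0
```

Note that the significance level is passed in rather than derived from the data, reflecting that it is chosen before the test is conducted.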
two approaches for a hypothesis test:
p-value and critical value approach
this approach is used by most researchers and practitioners; requires statistical software package
p-value approach
the two equivalent methods to solve a hypothesis test are the:
p-value approach and critical value approach
when comparing two population proportions, the parameter of interest is
p1-p2
hypothesis testing is used to resolve conflicts between 2 competing hypotheses on what?
particular parameter of interest
a pooled sample proportion can be computed when testing to see if two population proportions are equal. the pooled value represents an estimate of the unknown
population proportion
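A hedged sketch of the pooled sample proportion and the resulting z statistic for testing H0: p1 - p2 = 0. The function name `pooled_proportion_z` and the counts are hypothetical; the formulas are the usual p̄ = (x1 + x2)/(n1 + n2) and z = (p̂1 - p̂2) / sqrt(p̄(1 - p̄)(1/n1 + 1/n2)).

```python
from math import sqrt


def pooled_proportion_z(x1, n1, x2, n2):
    """Return (p_bar, z) for testing whether two proportions are equal."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled estimate of the common (unknown) population proportion
    p_bar = (x1 + x2) / (n1 + n2)
    std_err = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / std_err
    return p_bar, z


# Illustrative (made-up) counts: 60 of 100 versus 40 of 100 successes
p_bar, z = pooled_proportion_z(60, 100, 40, 100)
print(p_bar, round(z, 4))  # 0.5 2.8284
```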
what type of relationship exists between two variables if as one increases, the other increases?
positive
if the value of the sample covariance between the two random variables x and y equals 14.67, we can conclude that x and y have a
positive linear relationship
unlike the mean and standard deviation, the population proportion p is a descriptive summary measure that can be used for data that are _______
qualitative
the difference between the observed and the predicted values of y
residual e
the standard error of the estimate is the standard deviation of the
residuals
the competing hypotheses H0: p1-p2 <= d0 versus Ha: p1-p2 > d0 define a
right tailed test
when testing μ and σ is known, H0 can never be rejected if z <= 0 for a
right tailed test
in inferential statistics, we use ____________ information to make inferences about an unknown population parameter
sample
gauges the direction and strength of the linear relationship between two variables
sample correlation coefficient
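For reference, a small sketch of how the sample correlation coefficient follows from the sample covariance: r = s_xy / (s_x · s_y). The sign of the covariance gives the direction, and dividing by the two standard deviations scales the result into the range -1 to 1. The function name and data are made up for illustration.

```python
from math import sqrt


def sample_correlation(x, y):
    """Sample correlation coefficient r = s_xy / (s_x * s_y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominators)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Illustrative (made-up) data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(sample_correlation(x, y), 4))  # 0.7746
```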
the point estimate for the difference between two population means is represented by the difference between two:
sample means
measures the direction of the linear relationship
sample covariance
numerical measure that gauges dispersion from the sample regression equation
sample variance of the residual
shows the relationship between two variables
scatterplot
the allowed probability of making a type I error (100α%)
significance level
in regression analysis one explanatory variable is used to explain the variability in the response variable
simple linear regression model
steps of p-value approach to hypothesis testing in order:
specify the null and alternative hypotheses, calculate the value of the test statistic and its p-value, state the conclusion and interpret the results
a numerical measure that gauges the dispersion of data points from the sample regression equation is referred to as the
standard error of the estimate
standard deviation of the residual; used as a goodness-of-fit measure for regression analysis
standard error of the estimate
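A short sketch of the standard error of the estimate, assuming the usual definition se = sqrt(SSE / (n - k - 1)), where k is the number of explanatory variables. The function name, the data, and the fitted values are hypothetical.

```python
from math import sqrt


def standard_error_of_estimate(y, y_hat, k=1):
    """se = sqrt(SSE / (n - k - 1)) for a model with k explanatory variables."""
    n = len(y)
    # SSE: sum of squared residuals (observed minus predicted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return sqrt(sse / (n - k - 1))


# Illustrative (made-up) observations and fitted values
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
print(round(standard_error_of_estimate(y, y_hat), 4))  # 0.8944

# If every point lies on the sample regression line, se is zero
print(standard_error_of_estimate([1, 2, 3], [1, 2, 3]))  # 0.0
```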
margin of error =
critical value times the standard error
when the value of the response variable is not uniquely determined by the explanatory variable, the relationship is said to be
stochastic
if r=0.83, we can conclude that x and y have a relatively
strong, positive linear relationship
sample-based measure used in hypothesis testing
test-statistic
the sample variance of the residual is defined as
the average of the squared differences between yi and ŷi
the residual e represents
the difference between an observed and predicted value of the response variable at a given value of the explanatory variable
the multiple regression model is used when?
the researcher believes that two or more explanatory variables influence the response variable
for which of the following situations is a simple linear regression model appropriate?
the response variable y is influenced by one explanatory variable
unlike R^2, adjusted R^2 explicitly accounts for
the sample size and the number of explanatory variables
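A minimal sketch of adjusted R^2, assuming the standard formula adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), with n the sample size and k the number of explanatory variables. The function name and the input values are illustrative only.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 penalizes R^2 for the number of explanatory variables k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Illustrative values: R^2 = 0.6 with n = 5 observations and k = 1 variable
print(round(adjusted_r_squared(0.6, 5, 1), 4))    # 0.4667

# With a weak fit and several variables, the result can drop below zero
print(round(adjusted_r_squared(0.05, 10, 3), 4))  # -0.425
```

The second call shows why adjusted R^2 may be negative when the explanatory variables explain very little of the variation in the response.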
statistical inference concerning the mean difference based on matched-pairs sampling requires one of two conditions. what are the two conditions?
either the sample size n >= 30, or both x1 and x2 are normally distributed
in evaluating a regression model, why is a scatterplot a useful tool?
the scatterplot can be used to assess the linearity of the relationship
SST represents what?
total variation in y
true or false: for a given sample size n, a type I error can only be reduced at the expense of a higher type II error
true
true or false: in a two-tailed test, we can reject the null hypothesis on either side of the hypothesized value of the population parameter
true
true or false: the optimal values of type I and type II errors require a compromise in balancing the costs of each type of error
true
true or false: we choose a value for α before conducting a hypothesis test
true
the hypotheses H0: μ1-μ2 = d0 versus Ha: μ1-μ2 ≠ d0 indicate a
two tailed test
we specify the alternative hypothesis as Ha: ρ < 0 when we want to test if
two variables are negatively linearly related
committed when we reject the null hypothesis and it is actually true
type I error
made when we do not reject the null hypothesis when the null hypothesis is actually false
type II error
we calculate a pooled estimate of the common variance by
using a weighted average of the sample variances
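The weighted average mentioned above can be sketched as follows, assuming the usual formula sp^2 = ((n1 - 1)s1^2 + (n2 - 1)s2^2) / (n1 + n2 - 2), where each sample variance is weighted by its degrees of freedom. The function name and the numbers are made up for illustration.

```python
def pooled_variance(n1, var1, n2, var2):
    """Pooled estimate of a common population variance from two samples."""
    # Each sample variance is weighted by its degrees of freedom (n - 1)
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


# Illustrative values: n1 = 10 with s1^2 = 4, n2 = 15 with s2^2 = 9
print(round(pooled_variance(10, 4.0, 15, 9.0), 4))  # 7.0435
```

The result lies between the two sample variances and sits closer to the variance of the larger sample.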
if sample evidence is inconsistent with the null hypothesis,
we reject the null hypothesis
when conducting a hypothesis test, we determine
whether the sample data support the alternative hypothesis
y=B0+B1x+E which symbol represents the explanatory variable?
x
what does the model y=B0+B1x+E tell us about the relationship between the variables x and y?
x and y are linearly related, but the relationship is inexact or stochastic
which symbol represents the response variable in the linear regression model? y=B0+B1x+E
y
in a simple linear regression model , if all of the data points fall on the sample regression line, then the standard error of the estimate is
zero
in most applications, the hypothesized difference between two population means is
zero
the hypothesized difference between two population means μ1 and μ2 is
zero