Test 2
The linear correlation coefficient is ALWAYS between _______ and ______.
-1, 1
If E and F are disjoint, then P(E and F) = _____
0
The coefficient of determination is a number between ______ and _____
0, 1
The probability of any event E, P(E), must be greater than or equal to ____ and less than or equal to ____
0, 1
The probability of any event must be between _____ and ____
0, 1
The sum of the probabilities of all outcomes must equal ____
1
The sum of probabilities of all outcomes in the sample space must equal ___
1
If the probability of an event is less than ___%, the event is considered unusual.
5
The units of measure for x and y play __________ ___________ in the interpretation of r.
NO ROLE
Pattern in the residual plot = linear model is
NOT appropriate
The probability that two events E and F both occur is P(E and F)=
P(E)*P(F|E)
If E and F are disjoint events then P(E or F) =
P(E)+P(F), add probabilities
If E and F are not disjoint then P(E or F) =
P(E)+P(F)-P(E and F)
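As a quick check of the general addition rule, here is a minimal Python sketch that assumes one roll of a fair six-sided die. The events E (roll is even) and F (roll is greater than 3) are illustrative choices, not part of the card set.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (assumed example).
S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}   # hypothetical event: roll is even
F = {4, 5, 6}   # hypothetical event: roll is greater than 3

def prob(event, sample_space=S):
    """Classical probability: outcomes in the event / outcomes in S."""
    return Fraction(len(event & sample_space), len(sample_space))

# General addition rule: P(E or F) = P(E) + P(F) - P(E and F)
p_or = prob(E) + prob(F) - prob(E & F)

print(p_or)          # 2/3
print(prob(E | F))   # 2/3, counted directly from the union
```

E and F share the outcomes 4 and 6, so they are not disjoint and the overlap has to be subtracted once.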
If an experiment has n equally likely outcomes and if the number of ways that an event E can occur is m then the probability of E is
P(E)=m/n
__________ is a measure of the likelihood of a random phenomenon or chance behavior
Probability
_____% of the variation in y can be explained by x
R² (expressed as a percent)
________ ________ represents a situation in which an association between two variables inverts or goes away when a third variable is introduced to the analysis
Simpson's paradox
If E and F are disjoint then ______ probabilities
add
No pattern in the residual plot = linear model is
appropriate
A high correlation DOES NOT imply ____________
causation
If an event is a ______, the probability of the event is 1
certainty
The _________ __ __________ (R²) measures the proportion of total variation in the response variable that is explained by the least squares regression line
coefficient of determination
A ________ ________ lists the relative frequency of each category of the response variable, given a specific value of the explanatory variable in the contingency table
conditional distribution
The strict requirement of the linear model that is violated when the spread of the residuals changes is called ________ __________ ________
constant error variance
A table that relates two categories of data is called a ______ _______ or ______ ____ _______
contingency table, two way table
Two events E and F are __________ if the occurrence of event E in a probability experiment affects the probability of event F.
dependent
Two events are ______ if they have no outcomes in common
disjoint
An r close to 0 ...
does not imply no relation, just no linear relation
An _______ is any collection of outcomes from a probability experiment.
event
________ are what we are trying to find the probability of
events
In probability, an __________ is any process that can be repeated in which the results are uncertain
experiment
The difference between the predicted value of the response variable and the mean value of the response variable is called the ________ _________
explained deviation
R²=1 means that the line _______ _______ __ ___ _______ _______ _______ _______ ______.
explains 100 percent of the variation in the response variable
P(E)=relative frequency of E=
frequency of E/number of trials of experiment
The statistical term for constant error variance is ________________
homoscedasticity
The explanatory variable is plotted on the _________ _____________.
horizontal axis
If an event is ______, the probability of the event is 0
impossible
As the number of repetitions of a probability experiment ________, the proportion with which a certain outcome is observed gets _______ to the probability of the outcome
increases, closer
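The idea on this card can be illustrated with a small simulation. The sketch below assumes a fair coin (probability of heads 0.5) and a fixed random seed so the run is repeatable. The function name `proportion_of_heads` is made up for this example.

```python
import random

def proportion_of_heads(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the proportion of heads."""
    rng = random.Random(seed)          # fixed seed: an assumption for repeatability
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# As the number of repetitions increases, the observed proportion
# tends to settle near the true probability of 0.5.
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_heads(n))
```

Short runs can land well away from 0.5, while long runs typically stay close to it.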
Two events E and F are _________ if the occurrence of event E in a probability experiment DOES NOT affect the probability of event F.
independent
An _________ _________ is an observation that significantly affects the least squares regression line's slope and/or y-intercept or the value of the correlation coefficient
influential observation
The experiment is a binomial experiment if ...
it is performed a fixed number of times, trials are independent, there are two mutually exclusive (disjoint) outcomes (success or failure), probability is fixed for each trial
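When these four conditions hold, the probability of x successes in n trials is given by the standard binomial formula C(n, x) * p^x * (1 - p)^(n - x). That formula is not stated on the cards. The Python sketch below uses it under that assumption, with a hypothetical example of 4 trials and success probability 0.5.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 4, 0.5   # hypothetical values: 4 trials, fixed success probability 0.5

for x in range(n + 1):
    print(x, binomial_pmf(x, n, p))

# The probabilities over all possible counts of successes sum to 1.
print(sum(binomial_pmf(x, n, p) for x in range(n + 1)))
```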
The ___________ __________ ________ _________ is a line that minimizes the sum of squared errors (or residuals)
least squares regression line
The ___________ __________ __________ ______ minimizes the sum of the squared vertical distance between the observed values of y and those predicted by the line (y-hat)
least squares regression line
To determine R² for the linear regression model simply square the value of the _________ _________ _________.
linear correlation coefficient
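A minimal Python sketch of this step: compute r from its definition, then square it. The five data points are invented for illustration, and the function name `pearson_r` is an assumption of this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient of paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]   # made-up explanatory values
ys = [2, 4, 5, 4, 5]   # made-up response values

r = pearson_r(xs, ys)
r_squared = r ** 2      # coefficient of determination for the linear model

print(round(r, 4))          # 0.7746
print(round(r_squared, 4))  # 0.6
```

Here about 60% of the variation in y is explained by the least squares regression line.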
The _____________ _______________ ______________ or ______________ _________ ___________ __________ _________ is a measure of the strength and direction of the linear relation between two quantitative variables.
linear correlation coefficient, Pearson product moment correlation coefficient
If r is close to 0 then _________ to _________ evidence exists of a linear relation.
little, no
The ______ _______ _______ with which a certain outcome is observed is the probability of that outcome
long term proportion
A ____________ _________ is related to both the explanatory and response variables.
lurking variable
Another way that two variables can be related even though there is not a causal relation is through a _________ _________.
lurking variable
A ___________ __________ of the variable is a frequency (tally marks) or relative frequency (percentages) distribution of either the row or column variable in the contingency table
marginal distribution
If a plot of the residuals against the predictor variable shows a discernible pattern such as a curve, then the response and predictor variables _______ _______ ______ _______ ________
may not be linearly related
If E and F are independent events, then _______ probabilities
multiply
Disjoint events are also referred to as _______ ________ events
mutually exclusive
The closer r is to -1 the stronger the evidence of ____________ association between two variables.
negative
R²=0 means the line has _____ _____ _______.
no explanatory value
The correlation coefficient is ___________ _________.
not resistant
If S is the sample space of this experiment, P(E)=N(E)/N(S) where N(E) is the ____________ and N(S) is the ____________
number of outcomes in E, number of outcomes in the sample space
An event may consist of _____ outcome or ____ _____ _____ outcome.
one, more than one
Probability describes the long term proportion with which a certain ________ will occur in situations with short term uncertainty
outcome
Residuals help to check for ___________
outliers
If r= -1 then a ____________ ____________ __________ __________ exists between two variables.
perfect negative linear relation
If r= +1 then a ________ ____________ _______ _______ exists between two variables.
perfect positive linear relation
The closer r is to +1 the stronger the evidence of ___________ association between the two variables.
positive
Residuals help to determine whether a linear model is appropriate to describe the relation between the __________ and __________ ___________
predictor, response variables
A __________ ________ provides the possible values of the random variable X and their corresponding probabilities
probability distribution
A ______ ______ lists the possible outcomes of a probability experiment and each outcome's probability.
probability model
A _______ ________ is a numerical measure of the outcome from a probability experiment so its value is determined by chance.
random variable
_____________ play an important role in determining the adequacy of the linear model
residuals
The ___________ variable is the variable whose value can be explained by the value of the ______________ variable.
response, explanatory
A probability model must satisfy all of the ______ ___ _______.
rules of probability
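The two rules stated on the earlier cards (each probability is between 0 and 1, and the probabilities of all outcomes sum to 1) can be checked mechanically. This is a minimal Python sketch. The function name `is_valid_probability_model` and the example models are assumptions for illustration.

```python
from math import isclose

def is_valid_probability_model(model):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    probs = list(model.values())
    each_in_range = all(0 <= p <= 1 for p in probs)
    # Small tolerance because floating-point sums are rarely exact.
    sums_to_one = isclose(sum(probs), 1.0, abs_tol=1e-9)
    return each_in_range and sums_to_one

fair_die = {face: 1 / 6 for face in range(1, 7)}   # hypothetical model

print(is_valid_probability_model(fair_die))                # True
print(is_valid_probability_model({"a": 0.5, "b": 0.6}))    # False: sums to 1.1
print(is_valid_probability_model({"a": 1.2, "b": -0.2}))   # False: outside [0, 1]
```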
r represents the _________ ____________ _________.
sample correlation coefficient
The ____ ________ (S) of a probability experiment is the collection of all possible outcomes.
sample space
A ___________ __________ shows the relationship between two quantitative variables measured on the same individual.
scatter diagram
Probability deals with experiments that yield random __________ ________ ________ or results yet reveal _______ ________ __________
short term outcomes, long term predictability
Events with one outcome are sometimes called ______ events.
simple
__________ _________ is based on your opinion
subjective probability
The notation P(F|E) is read "______________"
the probability of event F given event E
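To make the notation concrete, the Python sketch below computes P(F|E) by counting, assuming one roll of a fair six-sided die. The events E (roll is even) and F (roll is greater than 3) and the helper names are invented for this example. It uses the standard relation P(F|E) = P(E and F) / P(E).

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for a fair die (assumed example)
E = {2, 4, 6}            # hypothetical event: roll is even
F = {4, 5, 6}            # hypothetical event: roll is greater than 3

def prob(event):
    """Classical probability with equally likely outcomes."""
    return Fraction(len(event), len(S))

def conditional(f, e):
    """P(F | E): probability of F given that E has occurred."""
    return prob(f & e) / prob(e)

p_f_given_e = conditional(F, E)
print(p_f_given_e)                        # 2/3
print(prob(E) * p_f_given_e)              # 1/3, equal to P(E and F)
print(p_f_given_e == prob(F))             # False, so E and F are dependent
```

Knowing the roll is even raises the probability of F from 1/2 to 2/3, which is what makes the two events dependent.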
The difference between the observed value of the response variable and the mean value of the response variable is called the ___________ ________
total deviation
The difference between the observed value of the response variable and the predicted value of the response variable is called the _________ ________
unexplained deviation
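The three deviations fit together: for every observation, total deviation = explained deviation + unexplained deviation. The Python sketch below checks this on five made-up data points and also shows that the squared deviations give R² as explained variation divided by total variation. All variable names are assumptions of this example.

```python
xs = [1, 2, 3, 4, 5]   # made-up explanatory values
ys = [2, 4, 5, 4, 5]   # made-up response values

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept, then the predicted values (y-hat).
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b1 * x + b0 for x in xs]

total = [y - y_bar for y in ys]                      # observed - mean
explained = [yh - y_bar for yh in y_hat]             # predicted - mean
unexplained = [y - yh for y, yh in zip(ys, y_hat)]   # observed - predicted

total_variation = sum(d ** 2 for d in total)
explained_variation = sum(d ** 2 for d in explained)
unexplained_variation = sum(d ** 2 for d in unexplained)
r_squared = explained_variation / total_variation

print(round(total_variation, 4))        # 6.0
print(round(explained_variation, 4))    # 3.6
print(round(unexplained_variation, 4))  # 2.4
print(round(r_squared, 4))              # 0.6
```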
The linear correlation coefficient is __________.
unitless
An _________ event is an event that has a low probability of occurring
unusual
Residuals help to determine whether the __________ of the residuals is constant
variance
The response variable is plotted on the ____________ __________.
vertical axis
If a plot of the residuals against the explanatory variable shows the spread of the residuals increasing or decreasing as the explanatory variable increases then a strict requirement of the linear model is _________
violated
Variation in y can be explained by ___
x
The equation for the least squares regression line is given by
ŷ=b₁x+b₀
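A minimal Python sketch of how the slope b₁ and intercept b₀ might be computed, using the standard least squares formulas b₁ = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b₀ = ȳ - b₁x̄. The function names and data are invented for illustration.

```python
def least_squares_line(xs, ys):
    """Return (b1, b0): slope and intercept minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b1, b0

def predict(x, b1, b0):
    """Predicted response (y-hat) for a given x."""
    return b1 * x + b0

xs = [1, 2, 3, 4, 5]   # made-up explanatory values
ys = [2, 4, 5, 4, 5]   # made-up response values

b1, b0 = least_squares_line(xs, ys)
print(round(b1, 4), round(b0, 4))       # 0.6 2.2
print(round(predict(3, b1, b0), 4))     # 4.0
```

The line always passes through the point (x̄, ȳ), which is why the prediction at x = 3 equals the mean of the y values.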