STP 231 Exam 3
The y(hat) symbol represents _______ value of ________.
a predicted, body temperature
Paired sample data may include one or more _______, which are points that strongly affect the graph of the regression line.
influential points
A straight line satisfies the _________ if the sum of the squares of the residuals is the smallest sum possible.
least-squares property
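The least-squares property can be checked numerically. The sketch below uses invented data (not from the course) and the standard slope and intercept formulas, then confirms that nudging the line in any direction makes the sum of squared residuals larger.

```python
# Hypothetical paired data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def sum_sq_residuals(b0, b1):
    """Sum of squared residuals (y - y_hat)^2 for the line y_hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Standard least-squares slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
b0 = y_bar - b1 * x_bar

best = sum_sq_residuals(b0, b1)

# Try eight nearby lines; each should have a larger sum of squares.
nudged = [
    sum_sq_residuals(b0 + d0, b1 + d1)
    for d0 in (-0.1, 0, 0.1)
    for d1 in (-0.1, 0, 0.1)
    if (d0, d1) != (0, 0)
]
print(best, min(nudged))
```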
Because the P-value is ______ the significance level 0.05, _____ the null hypothesis. There _____ sufficient evidence to support the claim that there is a linear correlation between the number of chirps in 1 minute and the temperature in °F for a significance level of 0.05.
less than or equal to, reject, is
Because the P-value is ___ than the significance level of 0.05, there ____ sufficient evidence to support the claim that there is a linear correlation between lemon imports and crash fatality rates for a significance level of α = 0.05.
less, is
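The P-value decision rule used in these cards can be written as a small function. This is a sketch only: the function name `decide` and the P-values passed to it are made up for illustration.

```python
def decide(p_value, alpha=0.05):
    """Apply the P-value rule for a test of linear correlation (H0: no linear correlation)."""
    if p_value <= alpha:
        return "reject H0: there is sufficient evidence of a linear correlation"
    return "fail to reject H0: there is not sufficient evidence of a linear correlation"

# Hypothetical P-values, not taken from any data set in these notes.
print(decide(0.001))
print(decide(0.30))
```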
The _______ measures the strength of the linear correlation between the paired quantitative x- and y-values in a sample.
linear correlation coefficient
If the confidence interval contains _____, _____ the null hypothesis
zero, fail to reject
If the confidence interval does not contain ______, _____ the null hypothesis
zero, reject
If it is found that r = 0, does that indicate that there is no association between these two variables?
No, because while there is no linear correlation, there may be a relationship that is not linear.
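A quick way to see this is a perfect parabola. In the sketch below (invented data), y is completely determined by x, yet r works out to 0 because the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y = x^2 on x-values symmetric about 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # no linear correlation, even though y depends entirely on x
```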
If we find that there is a linear correlation between the concentration of carbon dioxide in our atmosphere and the global temperature, does that indicate that changes in the concentration of carbon dioxide cause changes in the global temperature?
No. The presence of a linear correlation between two variables does not imply that one of the variables is the cause of the other variable.
Given an assumption, what two criteria must be met to satisfy the assumption?
The assumptions are satisfied when the expected frequencies are all 1 or more, and none are less than 5
Find the Linear Correlation Coefficient
Stat -> Edit -> Input Data -> Stat -> Tests -> LinRegTTest
The value of r is estimated to be ___, because it is likely that there is no correlation between body temperature and head circumference
0
Which of the following is NOT true of the goodness-of-fit test? A: Goodness-of-fit hypothesis tests may be left-tailed, right-tailed, or two-tailed B: If expected frequencies are equal, then we can determine them by E=n/k, where n is the total number of observations and k is the number of categories C: Expected frequencies need not be whole numbers D: If expected frequencies are not all equal, then we can determine them by E=np for each individual category, where n is the total number of observations and p is the probability for the category
A: Goodness-of-fit hypothesis tests may be left-tailed, right-tailed, or two-tailed. (They are always right-tailed.)
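The two expected-frequency rules in this card (E = n/k and E = np) and the chi-square test statistic, the sum of (O - E)^2 / E, can be sketched as follows. The observed counts and probabilities are invented, and the function names are illustrative.

```python
def expected_equal(n, k):
    """Expected frequencies when all k categories are equally likely: E = n/k."""
    return [n / k] * k

def expected_from_probs(n, probs):
    """Expected frequencies when categories have different probabilities: E = n*p."""
    return [n * p for p in probs]

def chi_square_stat(observed, expected):
    """Test statistic: sum of (O - E)^2 / E. It is never negative, so the test is right-tailed."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical observed counts for four categories.
observed = [18, 22, 30, 30]
n = sum(observed)
expected = expected_equal(n, len(observed))
chi2 = chi_square_stat(observed, expected)
print(expected, chi2)

# Expected frequencies need not be whole numbers.
print(expected_equal(10, 4))
```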
Are the expected frequencies variables? A: The expected frequencies are not variables, as they are determined by the sample size and the distribution in the null hypothesis B: The expected frequencies are variables, as they are determined by the sample size and the distribution in the alternative hypothesis C: The expected frequencies are not variables, as they are determined by the sample size and the distribution in the alternative hypothesis D: The expected frequencies are variables, as they are determined by the sample size and the distribution in the null hypothesis
A: The expected frequencies are not variables, as they are determined by the sample size and the distribution in the null hypothesis
What is NOT a property of the chi-square distribution? A: The mean of the chi-square distribution is 0 B: The chi-square distribution is different for each number of degrees of freedom, df = n-1 C: The chi-square distribution is not symmetric D: The values of chi-square can be zero or positive, but they cannot be negative.
A: The mean of the chi-square distribution is 0
In general what does µd represent? A: The mean of differences from the population of matched data B: The difference of the population means of the two populations C: The mean of the means of each matched pair from the population of matched data D: The mean value of the difference for the paired sample data
A: The mean of the differences from the population of matched data
Choose the correct answer below: A: The value of r does not change, because r is not affected by converting all values of a variable to a different scale. B: The value of r does not change, because r is not affected by the choice of x or y. C: The value of r does not change, because r is not affected by relationships that are not linear. D: The value of r changes, because r is affected by converting all values of a variable to a different scale.
A: The value of r does not change because r is not affected by converting all values of a variable to a different scale
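This property is easy to verify. In the sketch below, the temperatures and the paired y-values are invented; converting every x-value from Celsius to Fahrenheit leaves r unchanged (up to floating-point rounding).

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data: temperatures in Celsius and some response y.
celsius = [10, 15, 20, 25, 30]
ys = [3, 5, 4, 8, 9]

# Convert every x-value to a different scale (Fahrenheit).
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

r_celsius = pearson_r(celsius, ys)
r_fahrenheit = pearson_r(fahrenheit, ys)
print(r_celsius, r_fahrenheit)
```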
Which of the following is NOT one of the three common errors involving correlation? A: The conclusion that correlation implies causality B: Correlation does not imply causality C: Mistaking no linear correlation with no correlation D: The use of data based on averages
B: Correlation does not imply causality
Which of the following is NOT a requirement of testing a claim about the mean of the differences from dependent samples? A: The samples are simple random samples B: The degrees of freedom are n-2 C: The sample data are dependent D: Either the number of pairs of sample data is larger than 30 or the pairs have differences that are from a population having a distribution that is approximately normal, or both
B: The degrees of freedom are n-2
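For matched pairs, the test statistic uses the differences d and has n - 1 degrees of freedom, where n is the number of pairs. The sketch below uses invented before/after values and assumes the null hypothesis µd = 0.

```python
import math
import statistics

# Hypothetical matched pairs (for example, a measurement before and after a treatment).
before = [200, 180, 190, 210, 205]
after = [195, 178, 185, 204, 203]

# Work with the differences within each pair.
d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar = statistics.mean(d)   # mean of the sample differences
s_d = statistics.stdev(d)    # sample standard deviation of the differences

mu_d = 0                     # assumed value of the mean difference under H0
t = (d_bar - mu_d) / (s_d / math.sqrt(n))
df = n - 1                   # degrees of freedom for matched pairs, not n - 2
print(d, t, df)
```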
What is meant by saying that a variable has a chi-square distribution? A: The distribution of the variable has the shape of a special type of symmetric curve B: The distribution of the variable has the shape of a special type of right-skewed curve C: The distribution of the variable has the shape of a special type of bimodal curve D: The distribution of the variable has the shape of a special type of left-skewed curve
B: The distribution of the variable has the shape of a special type of right-skewed curve
Which of the following is NOT a property of the linear correlation coefficient r? A: the value of r is not affected by the choice of x or y B: the linear correlation coefficient r is robust. That is, a single outlier will not affect the value of r C: the value of r is always between -1 and 1 inclusive D: the value of r measures the strength of a linear relationship
B: the linear correlation coefficient r is robust. That is, a single outlier will not affect the value of r
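To see that r is not robust, compare r before and after adding one outlier. The data below are invented: five points on a perfect line, then a sixth point far from that line.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_without = pearson_r(xs, ys)

# Add a single outlier at (6, 0).
r_with = pearson_r(xs + [6], ys + [0])
print(r_without, r_with)
```

With these made-up values, one extra point drops r from a perfect correlation to about 0.14.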
Which of the following is NOT a requirement of conducting a hypothesis test for independence between the row variable and column variable in a contingency table? A: For every cell in the contingency table, the expected frequency, E, is at least 5 B: The sample data are represented as frequency counts in a two-way table C: For every cell in the contingency table, the observed frequency, O, is at least 5 D: The sample data are randomly selected
C: For every cell in the contingency table, the observed frequency, O, is at least 5. (It is the expected frequency, E, that must be at least 5.)
Based on these results, does it appear that police can use a shoe print length to estimate the height of a male? A: No, because shoe print length and height appear to be correlated B: Yes, because shoe print length and height do not appear to be correlated C: No, because shoe print length and height do not appear to be correlated D: Yes, because shoe print length and height appear to be correlated
C: No, because shoe print length and height do not appear to be correlated
Which of the following is NOT true for conducting a hypothesis test for independence between the row variable and column variable in a contingency table? A: The null hypothesis is that the row and column variables are independent of each other B: The number of degrees of freedom is (r-1)(c-1), where r is the number of rows and c is the number of columns C: Small values of the χ² test statistic reflect significant differences between observed and expected frequencies D: Tests of independence with a contingency table are always right-tailed
C: Small values of the χ² test statistic reflect significant differences between observed and expected frequencies. (Large values do.)
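A small worked example of the test of independence, using an invented 2 x 3 table. It relies on the standard rule that each expected frequency is (row total)(column total)/(grand total), with (r-1)(c-1) degrees of freedom; large values of the statistic indicate large gaps between observed and expected counts.

```python
# Hypothetical 2 x 3 contingency table of observed frequencies.
table = [
    [10, 20, 30],
    [20, 20, 20],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected frequency for each cell under independence.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

# Chi-square test statistic: sum of (O - E)^2 / E over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1)(columns - 1).
df = (len(table) - 1) * (len(table[0]) - 1)
print(expected, chi2, df)
```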
Are the observed frequencies variables? A: The observed frequencies are variables, as they do not vary from sample to sample B: The observed frequencies are not variables, as they vary from sample to sample C: The observed frequencies are variables, as they vary from sample to sample D: The observed frequencies are not variables, as they do not vary from sample to sample
C: The observed frequencies are variables, as they vary from sample to sample
Which of the following is NOT a requirement of testing a claim about two population means when sigma 1 and sigma 2 are unknown and not assumed to be equal? A: Both samples are simple random samples B: The two samples are independent C: The two samples are dependent D: Either the two sample sizes are large (n1 > 30 and n2 > 30) or both samples come from populations having normal distributions, or both of these conditions are satisfied
C: The two samples are dependent
When making predictions based on regression lines, which of the following is NOT listed as a consideration? A: Use the regression equation for predictions only if the graph of the regression line on the scatterplot confirms that the regression line fits the points reasonably well B: Use the regression equation for predictions only if the linear correlation coefficient r indicates that there is a linear correlation between the two variables C: Use the regression line for predictions only if the data go far beyond the scope of the available sample data D: If the regression equation does not appear to be useful for making predictions, the best predicted value of a variable is its point estimate
C: Use the regression line for predictions only if the data go far beyond the scope of the available sample data
A residual is: A: a value that is determined exactly, without any error B: a point that has a strong effect on the regression equation C: a value of y-y(hat) which is the difference between an observed value of y and predicted value of y D: the amount that one variable changes when the other variable changes by exactly one unit
C: a value of y-y(hat) which is the difference between an observed value of y and predicted value of y
Choose the correct answer below: A: r is a statistic that represents the value of the linear correlation coefficient computed from the paired sample data, and ρ is a parameter that represents the proportion of the variation in head circumference that can be explained by variation in body temperature B: r is a parameter that represents the value of the linear correlation coefficient that would be computed by using all of the paired data in the population of all statistics students, and ρ is a statistic that represents the value of the linear correlation coefficient computed from the paired sample data C: r is a statistic that represents the value of the linear correlation coefficient computed from the paired sample data, and ρ is a parameter that represents the value of the linear correlation coefficient that would be computed by using all of the paired data in the population of all statistics students D: r is a statistic that represents the proportion of the variation in head circumference that can be explained by variation in body temperature, and ρ is a parameter that represents the value of the linear correlation coefficient that would be computed by using all of the paired data in the population of all statistics students
C: r is a statistic that represents the value of the linear correlation coefficient computed from the paired sample data, and ρ is a parameter that represents the value of the linear correlation coefficient that would be computed by using all of the paired data in the population of all statistics students
Which of the following statements about correlation is true? A: we say that there is a negative correlation between x and y if the x-values increase as the corresponding y-values increase B: we say that there is a positive correlation between x and y if there is no distinct pattern in the scatterplot C: we say that there is a positive correlation between x and y if the x-values increase as the corresponding y-values increase D: we say that there is a positive correlation between x and y if the x-values increase as the corresponding y-values decrease
C: we say that there is a positive correlation between x and y if the x-values increase as the corresponding y-values increase
Which of the following is NOT a requirement to conduct a goodness-of-fit test? A: The data have been randomly selected B: The sample data consist of frequency counts for each of the different categories C: For each category, the expected frequency is at least 5 D: For each category, the observed frequency is at least 5
D: For each category, the observed frequency is at least 5
Do the results suggest that imported lemons cause car fatalities? A: The results suggest that imported lemons cause car fatalities B: The results suggest that an increase in imported lemons causes car fatality rates to remain the same C: The results suggest that an increase in imported lemons causes an increase in car fatality rates D: The results do not suggest any cause-effect relationship between the two variables
D: The results do not suggest any cause-effect relationship between the two variables
What is the relationship between the linear correlation coefficient r and the slope b1 of a regression line? A: the value of r will always have the opposite sign of the value of b1 B: the value of r will always be smaller than the value of b1 C: the value of r will always be larger than the value of b1 D: The value of r will always have the same sign as the value of b1
D: The value of r will always have the same sign as the value of b1
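The reason r and b1 share a sign is that both have the same numerator, the sum of (x - x(bar))(y - y(bar)), divided by a positive quantity. A minimal sketch with invented data, one upward-trending set and one downward-trending set:

```python
import math

def r_and_slope(xs, ys):
    """Return (r, b1). Both share the numerator sxy, so they share its sign."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)   # denominator is positive
    b1 = sxy / sxx                   # denominator is positive
    return r, b1

# Hypothetical data sets.
xs = [1, 2, 3, 4, 5]
r_up, b1_up = r_and_slope(xs, [2, 3, 5, 4, 6])
r_down, b1_down = r_and_slope(xs, [6, 4, 5, 3, 2])
print(r_up, b1_up)
print(r_down, b1_down)
```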
Which of the following is not equivalent to the other three? A: Dependent variable B: Independent variable C: Predictor variable D: Explanatory variable
A: Dependent variable
Find the Linear Correlation Coefficient
The linear correlation coefficient r measures the strength of the linear correlation between the paired x- and y-values in a sample. Step 1: Sum the values in the first column of data to get ∑x. Step 2: Sum the values in the second column of data to get ∑y. Step 3: Multiply each x by its paired y and sum the products to get ∑xy. Step 4: Square each x-value and sum the squares to get ∑x². Step 5: Square each y-value and sum the squares to get ∑y². Step 6: Fill in the computed values, with n the number of pairs, in r = [n∑xy - (∑x)(∑y)] / [√(n∑x² - (∑x)²) · √(n∑y² - (∑y)²)] and simplify.
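The hand calculation can be sketched in code. The data are invented, and the function name `r_from_sums` is illustrative; it uses the standard computational formula r = [n∑xy - (∑x)(∑y)] / [√(n∑x² - (∑x)²) · √(n∑y² - (∑y)²)].

```python
import math

def r_from_sums(xs, ys):
    """Compute r from the five sums used in the hand calculation."""
    n = len(xs)
    sum_x = sum(xs)                                # sum of x
    sum_y = sum(ys)                                # sum of y
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # sum of x*y
    sum_x2 = sum(x ** 2 for x in xs)               # sum of x squared
    sum_y2 = sum(y ** 2 for y in ys)               # sum of y squared
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = r_from_sums(xs, ys)
print(round(r, 3))
```

For these made-up values the sums are ∑x = 15, ∑y = 20, ∑xy = 66, ∑x² = 55, and ∑y² = 86, giving r of about 0.775.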
Natural pairing; no natural pairing
The samples are dependent; the samples are independent
When finding the best predicted value and the P-value is greater than the significance level, use
the given y(bar), the mean of the sample y-values
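The rule for choosing the best predicted value can be sketched as a function. This is a simplification that looks only at the P-value (a full check would also confirm that the line fits the scatterplot and that x is within the scope of the sample data); the function name and all numbers are hypothetical.

```python
def best_predicted_y(x, b0, b1, y_bar, p_value, alpha=0.05):
    """Use the regression equation only when there is a significant linear correlation."""
    if p_value <= alpha:
        return b0 + b1 * x   # y_hat from the regression equation
    return y_bar             # otherwise, the mean of the sample y-values

# Hypothetical regression results.
print(best_predicted_y(x=10, b0=1.0, b1=2.0, y_bar=50.0, p_value=0.01))
print(best_predicted_y(x=10, b0=1.0, b1=2.0, y_bar=50.0, p_value=0.40))
```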
A ____________ is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution
goodness-of-fit test
Because the P-value of the linear correlation coefficient is ____ the significance level, there _____ sufficient evidence to support the claim that there is a linear correlation between shoe print lengths and heights of males
greater than, is not
Two samples are ______ if the sample values from one population are not related to or somehow naturally paired or matched with the sample values from the other population
independent
A __________ is a table in which frequencies correspond to two variables
contingency table
A ________ exists between two variables when the values of one variable are somehow associated with the values of the other variable
correlation
Two samples are _______ if the sample values are paired
dependent
In a scatterplot, a(n) ______ is a point lying far away from the other data points.
outlier
Given a collection of paired sample data, the ________ y(hat) = b0 + b1x algebraically describes the relationship between the two variables, x and y
regression equation
For a pair of sample x- and y-values, the __________ is the difference between the observed sample value of y and the y-value that is predicted by using the regression equation.
residual
When determining whether there is a correlation between two variables, one should use a _________ to explore the data visually
scatterplot
The regression line has the property that the ______ of the residuals is the _____ possible sum
sum of squares, lowest
The value of p(hat) is ________. The value of q(hat) is ________. The value of n is ________. The value of E is ________. The value of p is ________.
the sample proportion; found from evaluating 1 - p(hat); the sample size; the margin of error; the population proportion
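These five symbols typically appear together in a confidence interval for a population proportion. Assuming that context, the sketch below uses the standard margin of error E = z·√(p(hat)·q(hat)/n) with an invented sample; the counts and the 95% confidence level are assumptions, not values from these notes.

```python
import math
from statistics import NormalDist

# Hypothetical sample: 40 successes out of 100 observations.
x = 40
n = 100                      # the sample size

p_hat = x / n                # the sample proportion
q_hat = 1 - p_hat            # found from evaluating 1 - p_hat

confidence = 0.95            # assumed confidence level
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value, about 1.96

E = z * math.sqrt(p_hat * q_hat / n)   # the margin of error

# The population proportion p is unknown; the interval estimates it.
lower, upper = p_hat - E, p_hat + E
print(round(E, 4), round(lower, 4), round(upper, 4))
```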