Stats 1430.01 Quizzes
Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?
-0.90
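A quick way to check this kind of question is to compare absolute values, since the sign of a correlation only gives the direction of the relationship. A minimal sketch using the four correlations from the question:

```python
# Correlations of the four data sets given in the question.
correlations = [-0.10, 0.25, -0.90, 0.80]

# Strength is judged by |r|; the sign only tells you uphill vs. downhill.
strongest = max(correlations, key=abs)
print(strongest)  # -0.9
```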
What is the standard deviation of the data set 1, 1, 1, 1?
0
If you could choose four numbers from 1, 2, 3, 4 and repeated numbers were allowed (such as 1, 1, 3, 2), which set of four numbers would give you the largest standard deviation? (No calculations needed.) A. 1, 1, 4, 4 B. 1, 2, 3, 4 C. No answer text provided. D. 1, 1, 1, 4
A. 1, 1, 4, 4
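The question says no calculation is needed, but the intuition (values pushed as far from the mean as possible) can be confirmed numerically. This sketch assumes the sample standard deviation (`statistics.stdev`); the ranking is the same with the population version.

```python
import statistics

# The three real answer choices from the question.
candidates = {
    "A": [1, 1, 4, 4],
    "B": [1, 2, 3, 4],
    "D": [1, 1, 1, 4],
}

# Sample standard deviation of each candidate set.
sds = {label: statistics.stdev(values) for label, values in candidates.items()}
largest = max(sds, key=sds.get)

for label, sd in sds.items():
    print(label, round(sd, 3))
print("largest:", largest)
```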
A listing of all possible values in a data set and how often they occurred is called a data _____________________. A. percentage B. frequency C. distribution D. variable
C. distribution
Which of the following summary measures cannot be directly calculated from a boxplot? A. Sample size B. Mean C. Standard deviation D. none of these choices can be calculated from a box plot
D. none of these choices can be calculated from a box plot
If you add the same positive number to every value of a data set what happens to the standard deviation?
Does not change
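Adding a constant shifts every value and the mean by the same amount, so the distances from the mean are untouched. A small check on a made-up data set (the numbers are illustrative only):

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical data
shifted = [x + 10 for x in data]     # add the same positive number to every value

sd_before = statistics.stdev(data)
sd_after = statistics.stdev(shifted)
mean_shift = statistics.mean(shifted) - statistics.mean(data)

print(math.isclose(sd_before, sd_after))  # the spread is unchanged
print(mean_shift)                         # the center moves by the constant
```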
A flat histogram (with a line straight across) contains no variability whatsoever, according to our definition. (T/F)
False
If the mean of a data set is large, the standard deviation has to be large also. (T/F)
False
If there are a few very small values in a data set compared to the rest of the data, the mean will be larger than the median. (T/F)
False
In a boxplot you can tell the exact pattern of the data set (beyond just whether the data is skewed or symmetric.) (T/F)
False
Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = .30. Should we go on and try to predict milk prices from gasoline prices using a straight line? (TRUE/FALSE)
False
Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 x 3 (since you multiply yards by 3 to convert to feet). (TRUE/FALSE)
False
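Correlation has no units, so rescaling both variables by a positive constant leaves it unchanged. The sketch below converts yards to feet (3 feet per yard) on made-up game totals; `pearson_r` is a helper written here from the textbook definition, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-game totals, in yards.
yards_rushing = [80, 120, 95, 150, 60]
yards_passing = [210, 250, 190, 300, 180]

# Same games, measured in feet.
feet_rushing = [3 * y for y in yards_rushing]
feet_passing = [3 * y for y in yards_passing]

r_yards = pearson_r(yards_rushing, yards_passing)
r_feet = pearson_r(feet_rushing, feet_passing)
print(math.isclose(r_yards, r_feet))  # the unit change does not alter r
```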
The median must be one of the numbers in the data set. (T/F)
False
Your boss gives you the following regression equation, where X = square feet and Y = selling price: Selling price = $5,240 + $33.80 (Number of Square Feet). Does it make sense to interpret the Y-intercept for this equation? (TRUE/FALSE)
False
Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8. (TRUE/FALSE)
False (X and Y are interchangeable in a correlation)
Mike marks down the gas mileage of his two cars every time he fills them up with gas for 6 months straight. At the end he notes that his Mustang gets better mileage than his Corvette. Is this an experiment or an observational study? A. Observational study B. Experiment
Observational study
What kind of sample occurs when you put an ad in the newspaper and ask readers to take your survey?
Self-selected sample
When a difference in treatment is decided to be due to more than random chance, what do you call the results? A. Avoid or minimize bias B. Statistically different C. Statistically extrapolated D. Statistically significant
D. Statistically significant
What is the most common observational study?
Survey
If a data set is skewed to the left, how will the mean and median compare?
The mean will be less than the median
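A few unusually small values drag the mean down while the median barely moves. A tiny illustration with a made-up left-skewed data set:

```python
import statistics

# Hypothetical left-skewed data: most values are high, one is very low.
scores = [1, 8, 9, 9, 10, 10, 10]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)

print(round(mean_score, 2), median_score)
print(mean_score < median_score)  # the low tail pulls the mean below the median
```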
A researcher is trying to use January temperatures to predict latitude. This means January temperature is the X (independent) variable and latitude is the Y (dependent) variable. (TRUE/FALSE)
True
Suppose your data represent revenues from a group of 20 stores in a retail chain across the country, and revenue is measured in millions of dollars. The standard deviation of this data set would also be measured in millions of dollars. (T/F)
True
The slices on a pie chart represent relative frequencies. (T/F)
True
You can have two data sets with the same mean but different standard deviations. (T/F)
True
If a residual is negative, then that data point lies _________________ the regression line. A. Above B. Below C. Exactly on
B. Below
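A residual is the observed value minus the predicted value, so a negative residual means the observed point sits under the line. The sketch reuses the selling-price equation from an earlier card; the house size and sale price are hypothetical.

```python
def predicted_price(square_feet):
    """Regression equation from the earlier card: $5,240 + $33.80 per square foot."""
    return 5240 + 33.80 * square_feet

square_feet = 1_000       # hypothetical house size
observed_price = 35_000   # hypothetical sale price

# residual = observed - predicted
residual = observed_price - predicted_price(square_feet)

if residual < 0:
    position = "below"
elif residual > 0:
    position = "above"
else:
    position = "on"

print(round(residual, 2), position)
```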
Which type of graph is best for COMPARING two or more quantitative data sets, a boxplot or a histogram?
boxplot
Which type of graph is made from the 5-number summary?
boxplot
Standard deviation has no units. (T/F)
False
If you add 10 to every value of a data set, what happens to the SD?
It stays the same
What does SSE stand for?
sum of squares for error
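SSE adds up the squared residuals (observed minus predicted) over all data points. A minimal sketch, assuming a made-up line and three made-up points; the line is only an example and is not claimed to be the least-squares fit.

```python
# Hypothetical data points (x, y).
points = [(1, 2.0), (2, 3.0), (3, 5.0)]

def predict(x):
    """Hypothetical line y = 0.5 + 1.5x, used only for illustration."""
    return 0.5 + 1.5 * x

# Residual for each point, then the sum of their squares.
residuals = [y - predict(x) for x, y in points]
sse = sum(e ** 2 for e in residuals)

print(residuals)  # [0.0, -0.5, 0.0]
print(sse)        # 0.25
```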
An outlier in a data set can significantly affect the value of the mean but not the median. (T/F)
true
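The mean uses every value, so one extreme number moves it a lot; the median depends only on the middle position. A small made-up example:

```python
import statistics

typical = [10, 12, 13, 14, 15]         # hypothetical data
with_outlier = [10, 12, 13, 14, 150]   # same data, largest value replaced by an outlier

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)      # 12.8 39.8
print(median_before, median_after)  # 13 13
```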
Correlation is affected by outliers. (T/F)
true
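One point far from the pattern can change r dramatically. The sketch below starts from a perfectly linear made-up data set and adds a single outlier; `pearson_r` is a helper written here from the definition, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying exactly on the line y = 2x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r_clean = pearson_r(x, y)

# Add one outlier at (6, 0), far below the pattern.
r_outlier = pearson_r(x + [6], y + [0])

print(round(r_clean, 3), round(r_outlier, 3))  # 1.0 0.143
```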
As we heard in lecture, the "average distance from the mean" is measured by the __________________________. A. standard deviation B. standard average C. standard distance D. average deviation
A. standard deviation
A five number summary contains the min, max, Q1, Q3, and what other value? A. Q2 B. All of these answers are correct. C. the 50th percentile D. the Median
B. All of these answers are correct.
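Q2, the 50th percentile, and the median are three names for the same value. A sketch using `statistics.quantiles` on a made-up data set; note that quartile conventions vary between textbooks and software, so Q1 and Q3 may differ slightly from a hand calculation, while the middle value matches the median here.

```python
import statistics

data = sorted([7, 1, 9, 3, 12, 4, 6])  # hypothetical data

# Cut points that split the data into four equal-probability groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary)
print(q2 == statistics.median(data))  # Q2 is the median
```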
Which is more affected by skewness, the IQR or standard deviation? A. Neither one is affected. B. Standard Deviation C. IQR D. Both are affected the same amount.
B. Standard Deviation
A researcher is trying to predict the linear relationship between January revenue and yearly revenue for her company. The correlation turns out to be .60. How does she interpret this correlation? A. There is a very strong positive linear relationship between January revenue and yearly revenue. B. There is a moderate positive linear relationship between January revenue and yearly revenue. C. There is a weak positive linear relationship between January revenue and yearly revenue.
B. There is a moderate positive linear relationship between January revenue and yearly revenue.
The personnel department keeps records on all employees in a company. Here is the information they keep in one of their data files: - Employee identification number - Last name - First name - Middle initial - Department - Number of years with the company - Salary ($) - Education Level (high school, some college, or college degree) - Age (years) Which of the following combinations of variables would be appropriate to examine with a scatterplot? A. Education Level and Age. B. Salary and Education Level. C. Age and Salary. D. All of these choices are correct.
C. Age and Salary.
What should the residual plot look like if the regression line fits the data well? A. random patterns B. points fall around the horizontal line Y = 0 C. all of these choices are correct D. no fan shapes
C. all of these choices are correct
Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.
negative
The starting point can affect the way a graph looks. (T/F)
true
Suppose: 1. all the points on a scatterplot lie perfectly on a straight line going uphill; 2. the mean of X and the mean of Y are both 2; 3. the standard deviations of X and Y are exactly the same. Can you find the equation of the best fitting line with this information? (Hint: Think of the '5 number' way of finding the best-fitting line.) (yes / no)
yes
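The five numbers are r, the two means, and the two standard deviations: slope = r × (SD of Y / SD of X) and intercept = (mean of Y) − slope × (mean of X). A perfect uphill line means r = 1. In this sketch the common standard deviation of 1.5 is an arbitrary placeholder, since only the ratio matters.

```python
def line_from_summary(r, mean_x, mean_y, sd_x, sd_y):
    """Best-fitting line from the five summary numbers."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# r = 1 (perfect uphill line), both means are 2, equal SDs (1.5 is a placeholder).
slope, intercept = line_from_summary(r=1.0, mean_x=2, mean_y=2, sd_x=1.5, sd_y=1.5)

print(slope, intercept)  # 1.0 0.0, i.e. the line y = x
```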
Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation? A. As square feet increase by 1, selling price increases by $33.80 B. As selling price increases by $1, square feet increases by 5,240. C. As square feet increases by 1, selling price increases by $5,240. D. As selling price increases by $1, square feet increases by $33.80
A. As square feet increase by 1, selling price increases by $33.80
Bob wants to estimate the percentage of people who own a dog in his town, and he goes to all the apartment buildings to carry out his survey. He leaves out all the houses in the town. What kind of bias is this? A. Bias due to undercoverage B. Nonresponse bias C. Response bias
A. Bias due to undercoverage
Which of the following is the X variable in an experiment? A. Independent variable, or factor B. Dependent variable, or response C. Confounding variable
A. Independent variable, or factor
Bob is interested in examining the relationship between the number of bedrooms in a home and its selling price. After downloading a valid data set from the internet, he calculates the correlation. The correlation value he calculates is only 0.05. What does Bob conclude? A. Bob continues his research because even though there is no linear relationship here, there could be a different relationship. B. Bob gives up on his research because r = .05 means there is no relationship of any kind between bedrooms and selling price.
A. Bob continues his research because even though there is no linear relationship here, there could be a different relationship.
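Correlation measures only linear association, so r near 0 does not rule out a strong curved relationship. The made-up example below has Y completely determined by X (Y = X²) and still gives r = 0; `pearson_r` is a helper written here from the definition, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped (quadratic) relationship.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(r)  # essentially 0 even though y depends entirely on x
```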
Which of the following can never be negative? A. mean B. standard deviation C. all of these choices can be negative D. median
B. standard deviation
Boxplot A and Boxplot B are drawn on the same axes. The box part of Boxplot A is shorter in length than the box part of Boxplot B. What can you tell about the two data sets? A. Boxplot B has to contain more data than Boxplot A. B. They have to contain the same amount of data. C. You cannot tell anything from the information provided. D. Boxplot A has to contain more data than Boxplot B.
C. You cannot tell anything from the information provided.