STATS MIDTERM
If the mean of a data set is large, the standard deviation has to be large also.
false
What is the most common type of observational study?
survey
A researcher is trying to use January temperatures to predict latitude. This means January temperature is the X (independent) variable and latitude is the Y (dependent) variable.
true
Correlation is affected by outliers.
true
The slices on a pie chart represent relative frequencies.
true
Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?
-0.90
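The strength of a linear relationship depends on how far the correlation is from 0, not on its sign. A small sketch of that comparison, using the four correlations from the card above (the variable names are illustrative only):

```python
# Correlations of the four data sets from the card above.
correlations = [-0.10, 0.25, -0.90, 0.80]

# Strength is judged by absolute value; the sign only gives the direction.
strongest = max(correlations, key=abs)
print(strongest)  # -0.9
```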
What is the standard deviation of the data set 1, 1, 1, 1?
0
If you could choose four numbers from 1, 2, 3, 4 and repeated numbers were allowed (such as 1, 1, 3, 2), which set of four numbers would give you the largest standard deviation? (No calculations needed.)
1,1,4,4
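The "no calculations needed" reasoning is that the SD grows as values sit farther from the mean, so splitting the values between the two extremes gives the most spread. As a rough check, the sketch below tries every possible set of four numbers from 1 to 4 and uses the population SD (`pstdev`); the sample SD ranks the sets the same way.

```python
from itertools import combinations_with_replacement
from statistics import pstdev

# Every way to choose four numbers from 1-4 when repeats are allowed.
all_sets = list(combinations_with_replacement([1, 2, 3, 4], 4))

# The set whose values are, on average, farthest from their mean.
best = max(all_sets, key=pstdev)
print(best)                    # (1, 1, 4, 4)
print(pstdev((1, 1, 1, 1)))    # 0.0, because identical values have no spread
```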
Correlation of Rainfall and Corn = 0.60. What percentage of the variability in bushels per acre is due to inches of rainfall?
36%. Square the correlation: 0.60 x 0.60 = 0.36.
Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation?
For each additional square foot, the predicted selling price increases by $33.80.
If a residual is negative, then that data point lies _________________ the regression line.
Below the regression line
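The two regression cards above can be tied together with a short sketch. The equation is the one from the selling-price card; the 2,000 square foot home and its $70,000 sale price are made-up values used only to illustrate a negative residual.

```python
def predicted_price(square_feet):
    """Selling price = $5,240 + $33.80 * (number of square feet)."""
    return 5240 + 33.80 * square_feet

# Slope: one extra square foot changes the prediction by about $33.80.
slope_check = predicted_price(2001) - predicted_price(2000)

# Residual = observed value - predicted value.
observed_price = 70_000  # hypothetical sale price, not from the card
residual = observed_price - predicted_price(2000)

# A negative residual means the observed point lies below the line.
print(round(slope_check, 2), round(residual, 2))
```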
A listing of all possible values in a data set and how often they occurred is called a data _____________________.
Distribution
If you add the same positive number to every value of a data set what happens to the standard deviation?
Does not change
A flat histogram (with a line straight across) contains no variability whatsoever, according to our definition.
False
If there are a few very small values in a data set compared to the rest of the data, the mean will be larger than the median.
False
In a boxplot you can tell the exact pattern of the data set (beyond just whether the data is skewed or symmetric.)
False
The SD has no units
False
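Your boss gives you the following regression equation, where X = square feet and Y = selling price: Selling price = $5,240 + $33.80 (Number of Square Feet). Does it make sense to interpret the Y-intercept for this equation?
No. The Y-intercept is the predicted price of a home with 0 square feet, which is not a meaningful value of X.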
Which of the following is the X variable in an experiment?
Independent variable, or factor
If you add 10 to every value of a data set, what happens to the standard deviation?
It stays the same
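Adding a constant shifts every value and the mean by the same amount, so the distances from the mean do not change. A minimal check on a made-up data set (the values are hypothetical):

```python
from statistics import mean, stdev

data = [3, 7, 8, 12, 20]           # hypothetical values
shifted = [x + 10 for x in data]   # add 10 to every value

# The center moves by 10, but the spread is unchanged.
print(mean(data), mean(shifted))
print(stdev(data), stdev(shifted))
```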
Which of the following summary measures cannot be directly calculated from a boxplot?
Mean, SD, Sample size
Mike marks down the gas mileage of his two cars every time he fills them up with gas for 6 months straight. At the end he notes that his Mustang gets better mileage than his Corvette. Is this an experiment or an observational study?
Observational study
Which of the following can never be negative?
SD
As we heard in lecture, the "average distance from the mean" is measured by the __________________________.
Standard Deviation
When a difference between treatments is judged to be due to more than random chance, what do you call the results?
Statistically significant
What does SSE stand for?
Sum of Squares for Error
If a data set is skewed to the left, how will the mean and median compare?
The mean will be less than the median.
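In a left-skewed data set, a few small values pull the mean down while the median stays near the bulk of the data. A small illustration with made-up numbers:

```python
from statistics import mean, median

# Hypothetical left-skewed data: most values near 20, one far to the left.
left_skewed = [2, 15, 18, 19, 20, 20, 21]

skew_mean = mean(left_skewed)
skew_median = median(left_skewed)

# The low value drags the mean below the median.
print(skew_mean < skew_median)  # True
```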
A five number summary contains the min, max, Q1, Q3, and what other value?
The median (also called the 50th percentile, or Q2)
Correlation of Rainfall and Corn = 0.608. How do you interpret this correlation?
There is a moderate linear relationship between rainfall and bushels
Boxplot A and Boxplot B are drawn on the same axes. The box part of Boxplot A is shorter in length than the box part of Boxplot B. What can you tell about the two data sets?
You cannot tell anything from the information provided.
Data was collected on amount of rainfall (inches) and amount of corn produced (bushels per acre) for a number of years in Kansas. The output is shown below. Assume the scatterplot looks good. What are the units of slope in this situation?
Predictor   Coef      SE Coef   T       P
Constant    89.543    6.703     13.36   0.000
Rainfall    0.12800   0.01375   9.31    0.000
Correlation of Rainfall and Corn = 0.608
Bushels per acre per inch of rainfall (units of Y per unit of X)
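Suppose that (1) all the points on a scatterplot lie perfectly on a straight line going uphill, (2) the mean of X and the mean of Y are both 2, and (3) the standard deviations of X and Y are exactly the same. Can you find the equation of the best-fitting line with this information? (Hint: Think of the '5 number' way of finding the best-fitting line.)
Yes. The points lie perfectly on an uphill line, so r = 1. The slope is r(SD of Y / SD of X) = 1, and the intercept is 2 - 1(2) = 0, so the line is y = x.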
What should the residual plot look like if the regression line fits the data well?
No fan shapes or other patterns; the points should scatter randomly around the horizontal line at residual = 0.
Which is more affected by skewness, the IQR or standard deviation?
The standard deviation (SD)
What kind of sample occurs when you put an ad in the newspaper and ask readers to take your survey?
Self-selected sample
Suppose your data represent revenues from a group of 20 stores in a retail chain across the country, and revenue is measured in millions of dollars. The standard deviation of this data set would also be measured in millions of dollars.
True
The starting point can affect the way a graph looks.
True
Bob wants to estimate the percentage of people who own a dog in his town, and he goes to all the apartment buildings to carry out his survey. He leaves out all the houses in the town. What kind of bias is this?
Bias due to undercoverage
Bob is interested in examining the relationship between the number of bedrooms in a home and its selling price. After downloading a valid data set from the internet, he calculates the correlation. The correlation value he calculates is only 0.05. What does Bob conclude?
Bob continues his research because even though there is no linear relationship here, there could be a different relationship.
Which type of graph is made from the 5-number summary?
Boxplot
Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = .30. Should we go on and try to predict milk prices from gasoline prices using a straight line?
No. A correlation of .30 indicates only a weak linear relationship.
Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8.
False. It would be the same (.8); correlation does not depend on which variable is called X.
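Swapping X and Y leaves the correlation unchanged because the formula treats the two variables symmetrically. A minimal sketch, using a hand-written `pearson_r` helper (a name chosen here for illustration) and made-up data that happen to have r = 0.8:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 1, 4, 3, 5]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)
print(r_xy, r_yx)  # 0.8 0.8
```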
The median must be one of the numbers in the data set.
False
A researcher is trying to predict the linear relationship between January revenue and yearly revenue for her company. The correlation turns out to be .60. How does she interpret this correlation?
There is a moderate positive linear relationship between January revenue and yearly revenue.
Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 x 3 (since you multiply yards by 3 to convert to feet).
False. Correlation has no units, so it stays .6 when the units of measurement change.
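Correlation is unit-free, so multiplying either variable by a positive conversion factor leaves it unchanged. The sketch below converts made-up yardage figures to feet (1 yard = 3 feet) and recomputes the correlation; the data and the `pearson_r` helper are illustrative assumptions, not part of the card.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

FEET_PER_YARD = 3

rushing_yards = [80, 120, 95, 150, 60]     # hypothetical game totals
passing_yards = [210, 250, 190, 300, 220]  # hypothetical game totals

rushing_feet = [v * FEET_PER_YARD for v in rushing_yards]
passing_feet = [v * FEET_PER_YARD for v in passing_yards]

r_yards = pearson_r(rushing_yards, passing_yards)
r_feet = pearson_r(rushing_feet, passing_feet)
print(abs(r_yards - r_feet) < 1e-9)  # True: same correlation in either unit
```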
An outlier in a data set can significantly affect the value of the mean but not the median.
True
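A quick way to see this is to replace one value with an extreme one and compare both measures of center. The numbers below are made up for illustration:

```python
from statistics import mean, median

typical = [10, 11, 12, 13, 14]        # hypothetical data
with_outlier = [10, 11, 12, 13, 140]  # largest value replaced by an outlier

# The mean jumps, but the middle value is still 12.
print(mean(typical), mean(with_outlier))      # 12 37.2
print(median(typical), median(with_outlier))  # 12 12
```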
You can have two data sets with the same mean but different standard deviations.
True
Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.
a negative
The personnel department keeps records on all employees in a company. Here is the information they keep in one of their data files: employee identification number; last name; first name; middle initial; department; number of years with the company; salary ($); education level (high school, some college, or college degree); age (years). Which of the following combinations of variables would be appropriate to examine with a scatterplot?
Age and Salary.