MIDTERM
Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?
-0.90
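Strength depends on how far the correlation is from zero, not on its sign. A minimal Python sketch of that comparison, using the four correlations from the question:

```python
# Correlations of the four data sets from the question
correlations = [-0.10, 0.25, -0.90, 0.80]

# Strength is judged by distance from zero, so compare absolute values
strongest = max(correlations, key=abs)
print(strongest)  # -0.9
```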
Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation?
For each additional square foot, the predicted selling price increases by $33.80.
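One way to check this reading of the slope is to plug two house sizes that differ by one square foot into the equation. The function name and the sizes 1500 and 1501 below are illustrative assumptions, not part of the question:

```python
def predicted_price(square_feet):
    """Predicted selling price from the regression equation in the question."""
    return 5240 + 33.80 * square_feet

# 1500 and 1501 are illustrative sizes; any pair one square foot apart works
difference = predicted_price(1501) - predicted_price(1500)
print(round(difference, 2))  # 33.8
```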
Suppose the probability of someone being female and voting for Candidate A is 30%. What is the notation for this probability?
P(Female and Voting for A) = 0.30
Suppose 90% of the patients who test positive for a disease actually have the disease. Write this as a probability.
P(have disease | test positive) = 0.90
If a residual is negative, then that data point lies _________________ the regression line.
below
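A residual is the observed value minus the predicted value, so a negative residual means the observed point sits under the line. The sketch below reuses the selling-price equation from the earlier question; the observed price and house size are hypothetical values chosen only for illustration:

```python
def predicted_price(square_feet):
    """Regression equation from the selling-price question."""
    return 5240 + 33.80 * square_feet

# Hypothetical observation, not from the original questions
square_feet = 1500
observed_price = 50000

# residual = observed - predicted
residual = observed_price - predicted_price(square_feet)
position = "below" if residual < 0 else "on or above"
print(round(residual, 2), position)
```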
Bob wants to estimate the percentage of people who own a dog in his town, and he goes to all the apartment buildings to carry out his survey. He leaves out all the houses in the town. What kind of bias is this?
bias due to undercoverage
Which type of graph is best for COMPARING two or more quantitative data sets, a boxplot or a histogram?
boxplot
Which type of graph is made from the 5-number summary?
boxplot
If there are a few very small values in a data set compared to the rest of the data, the mean will be larger than the median. T or F?
false
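A few very small values pull the mean down toward them, while the median barely moves. A small sketch with made-up data (the values are an assumption for illustration):

```python
import statistics

# Hypothetical data: mostly values in the 50s plus two very small values
data = [2, 3, 50, 52, 54, 55, 57]

mean = statistics.mean(data)      # mean is 39
median = statistics.median(data)  # median is 52
print(mean < median)  # True
```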
The standard deviation has no units. T or F?
false
Which of the following is the X variable in an experiment?
independent variable
A __________ distribution summarizes the information from one variable ONLY, without considering ANY information from another variable.
marginal
If a data set is skewed to the left, how will the mean and median compare?
mean will be less than the median
Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.
negative
What kind of sample occurs when you put an ad in the newspaper and ask readers to take your survey?
self-selected sample
If you add 10 to every value of a data set, what happens to the standard deviation?
stays the same
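Adding a constant shifts every value and the mean by the same amount, so the distances from the mean do not change. A quick check on hypothetical data:

```python
import math
import statistics

data = [4, 8, 15, 16, 23, 42]        # hypothetical values
shifted = [x + 10 for x in data]     # add 10 to every value

sd_original = statistics.stdev(data)
sd_shifted = statistics.stdev(shifted)

# The mean moves up by 10, but the spread is unchanged
print(statistics.mean(shifted) - statistics.mean(data))  # 10
print(math.isclose(sd_original, sd_shifted))             # True
```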
What is the most common type of observational study?
survey
If two corresponding conditional distributions in a two-way table are different from each other, we know that the variables are related. T or F?
true
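To see the difference between conditional and marginal distributions, it can help to compute both from a small two-way table. The counts below are hypothetical, and the helper name is an assumption made for this sketch:

```python
# Hypothetical two-way table of counts: rows are gender, columns are candidate
table = {
    "Female": {"A": 30, "B": 20},
    "Male": {"A": 20, "B": 30},
}

def conditional_distribution(row):
    """Distribution of candidate choice within one row (one gender)."""
    total = sum(row.values())
    return {candidate: count / total for candidate, count in row.items()}

female = conditional_distribution(table["Female"])
male = conditional_distribution(table["Male"])

# The marginal distribution of candidate choice ignores gender entirely
grand_total = sum(sum(row.values()) for row in table.values())
marginal = {
    c: sum(row[c] for row in table.values()) / grand_total
    for c in ("A", "B")
}

print(female)    # {'A': 0.6, 'B': 0.4}
print(male)      # {'A': 0.4, 'B': 0.6}
print(marginal)  # {'A': 0.5, 'B': 0.5}

# Different conditional distributions suggest the variables are related
related = female != male
print(related)   # True
```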
If A and B are independent, then P(A or B) becomes P(A) + P(B) - P(A)P(B). T or F?
true
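The formula follows from P(A or B) = P(A) + P(B) - P(A and B), with P(A and B) = P(A)P(B) under independence. The sketch below checks it by listing every outcome of a hypothetical coin flip and die roll (the events are chosen only for illustration):

```python
from fractions import Fraction
from itertools import product

# Hypothetical independent events: flip a fair coin and roll a fair die
outcomes = list(product(["H", "T"], range(1, 7)))  # 12 equally likely outcomes

def prob(event):
    """Probability of an event as favorable outcomes over total outcomes."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def is_heads(o):   # event A
    return o[0] == "H"

def is_six(o):     # event B
    return o[1] == 6

p_a = prob(is_heads)
p_b = prob(is_six)
p_a_or_b = prob(lambda o: is_heads(o) or is_six(o))

print(p_a_or_b)               # 7/12
print(p_a + p_b - p_a * p_b)  # 7/12
```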
You can have two data sets with the same mean but different standard deviations.
true
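A short illustration with two made-up data sets that share a center but differ in spread:

```python
import statistics

# Hypothetical data sets, both centered at 10
tight = [9, 10, 11]
spread = [0, 10, 20]

# Both means are 10
print(statistics.mean(tight), statistics.mean(spread))

# The sample standard deviations are 1 and 10
sd_tight = statistics.stdev(tight)
sd_spread = statistics.stdev(spread)
print(sd_tight, sd_spread)
```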
As we heard in lecture, the "average distance from the mean" is measured by the __________________________.
standard deviation
Which is more affected by skewness, the IQR or standard deviation?
standard deviation
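The IQR only looks at the middle half of the data, so one extreme value in the tail leaves it alone, while the standard deviation uses every value. A sketch on hypothetical data, using `statistics.quantiles` (Python 3.8 or later) with its default method; other quartile conventions can give slightly different cut points:

```python
import statistics

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

# Hypothetical data: the second set swaps the largest value for an extreme one
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
skewed = [1, 2, 3, 4, 5, 6, 7, 8, 90]

print(iqr(symmetric), iqr(skewed))  # the IQR is the same for both
print(statistics.stdev(symmetric) < statistics.stdev(skewed))  # True
```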
Which of the following can never be negative?
standard deviation
When a difference between treatments is judged to be due to more than random chance, what do you call the results?
statistically significant
Correlation is affected by outliers. T or F?
true
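A single outlier can change the correlation dramatically. The sketch below computes Pearson's r by hand on hypothetical data, first for points that fall exactly on a line and then with one added outlier; the function name is an assumption for this example:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying exactly on the line y = 2x
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r_clean = pearson_r(x, y)

# Same data with one outlier added at (6, 0)
r_outlier = pearson_r(x + [6], y + [0])

print(round(r_clean, 2), round(r_outlier, 2))  # 1.0 0.14
```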
A two-way table allows us to examine the relationship between _____________________ variables.
two categorical
When is P(A or B) = P(A) + P(B)? For ____________________________________.
disjoint events
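For disjoint events there is no overlap to subtract, so the probabilities simply add. A small check on a hypothetical fair die roll (the two events are chosen only for illustration):

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die
sample_space = set(range(1, 7))
a = {1, 2}   # event A: roll a 1 or 2
b = {5, 6}   # event B: roll a 5 or 6

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

print(a.isdisjoint(b))                  # True
print(prob(a | b), prob(a) + prob(b))   # 2/3 2/3
```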
A listing of all possible values in a data set and how often they occurred is called a data _____________________.
distribution
A flat histogram (with a line straight across) contains no variability whatsoever, according to our definition. T or F?
false
Suppose the correlation between two variables X and Y is 0.8. That means the correlation between Y and X is -0.8. T or F?
false
The median must be one of the numbers in the data set.
false
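With an even number of values, the median is the average of the two middle values and may not appear in the data at all. A quick example on hypothetical data:

```python
import statistics

# Hypothetical data set with an even number of values
data = [1, 2, 3, 4]

# The median averages the two middle values, 2 and 3
median = statistics.median(data)
print(median, median in data)  # 2.5 False
```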
Mike marks down the gas mileage of his two cars every time he fills them up with gas for 6 months straight. At the end he notes that his Mustang gets better mileage than his Corvette. Is this an experiment or an observational study?
observational study