Stats Midterm 1

Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.

A negative

A flat histogram (with a line straight across) contains no variability whatsoever, according to our definition.

False

If variables A and B are related in a certain way in a two way table (with 2 variables), no matter how many other variables you look at in addition to these two, the relationship will always stay the same.

False

In a boxplot you can tell the exact pattern of the data set (beyond just whether the data is skewed or symmetric.)

False

Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8.

False

Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 x 3 (since you multiply yards by 3 to convert to feet).

False
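
The two correlation cards above can be checked numerically. The sketch below uses made-up yardage (not data from the course) and a hand-written `pearson_r` helper, which is an illustrative name. It shows that swapping X and Y, or converting both variables from yards to feet, leaves r unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up yardage for five games (illustrative only).
rushing_yards = [80, 120, 95, 150, 110]
passing_yards = [210, 250, 190, 300, 260]
feet_per_yard = 3

r_yards = pearson_r(rushing_yards, passing_yards)
# Swapping the roles of X and Y does not change r.
r_swapped = pearson_r(passing_yards, rushing_yards)
# Rescaling both variables to feet does not change r either.
r_feet = pearson_r(
    [feet_per_yard * v for v in rushing_yards],
    [feet_per_yard * v for v in passing_yards],
)

print(r_yards, r_swapped, r_feet)
```

All three printed values should agree up to floating-point rounding, because correlation has no units and does not depend on which variable is called X.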

The median must be one of the numbers in the data set.

False

The standard deviation has no units.

False

You can compare two marginal distributions to see if the corresponding two variables are related.

False

What is the X variable in an experiment?

Independent variable

If you add 10 to every value of a data set, what happens to the standard deviation?

It stays the same
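
A quick check of this card, as a minimal sketch with an arbitrary made-up data set: adding 10 to every value moves the mean by 10 but leaves the spread around the mean, and therefore the standard deviation, unchanged.

```python
from statistics import mean, pstdev

# Arbitrary illustrative data; any data set behaves the same way.
data = [4, 8, 15, 16, 23, 42]
shifted = [x + 10 for x in data]

sd_original = pstdev(data)
sd_shifted = pstdev(shifted)

print(mean(data), mean(shifted))   # the mean moves up by 10
print(sd_original, sd_shifted)     # the standard deviation does not move
```

The same holds for the sample standard deviation (`statistics.stdev`), since both versions depend only on distances from the mean.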

Suppose your data represent revenues from a group of 20 stores in a retail chain across the country, and revenue is measured in millions of dollars. The standard deviation of this data set would also be measured in millions of dollars.

True

The slices on a pie chart represent relative frequencies.

True

The starting point can affect the way a graph looks.

True

Bob wants to estimate the percentage of people who own a dog in his town, and he goes to all the apartment buildings to carry out his survey. He leaves out all the houses in the town. What kind of bias is this?

Bias due to undercoverage

Which type of graph is best for COMPARING two or more quantitative data sets, a boxplot or a histogram?

Boxplot

Boxplot A and Boxplot B are drawn on the same axes. The box part of Boxplot A is shorter in length than the box part of Boxplot B. What can you tell about the two data sets?

Can't tell anything

In the hospital example in your lecture notes, you found hospital A didn't do as well as hospital B when all the data was in one two-way table, but when an additional variable was examined, hospital A was better in all cases. What was that additional (confounding) variable that ended up reversing the results?

Condition of Patient

What should the residual plot look like if the regression line fits the data well?

Random scatter with no pattern and no fan shape; points fall around the line y = 0

Suppose: (1) all the points on a scatterplot lie perfectly on a straight line going uphill, (2) the mean of X and the mean of Y are both 2, and (3) the standard deviations of X and Y are exactly the same. Can you find the equation of the best fitting line with this information? (Hint: Think of the '5 number' way of finding the best-fitting line.)

Yes
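
To see why the answer is yes, here is a small sketch of the five-number approach, assuming the usual formulas slope = r * (SD of Y / SD of X) and intercept = (mean of Y) - slope * (mean of X). Points lying perfectly on an uphill line give r = 1. The common standard deviation is not stated on the card, so the value 1.5 below is an arbitrary placeholder; it cancels out. The function name `line_from_summary` is illustrative.

```python
def line_from_summary(r, mean_x, mean_y, sd_x, sd_y):
    """Best-fitting line from the five summary numbers."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# r = 1 (perfect uphill line), both means equal 2,
# and an assumed common standard deviation of 1.5.
slope, intercept = line_from_summary(
    r=1.0, mean_x=2.0, mean_y=2.0, sd_x=1.5, sd_y=1.5
)

print(f"y = {intercept} + {slope}x")
```

Under these assumptions the slope is 1 and the intercept is 0, so the best-fitting line is y = x.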

What kind of sample occurs when you put an ad in the newspaper and ask readers to take your survey?

A self-selected sample

What is the most common observational study?

A survey

Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation?

As square feet increase by 1, selling price increases by $33.80

If a residual is negative, then that data point lies _________________ the regression line.

Below
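
The residual is the observed value minus the predicted value. The sketch below reuses the selling-price equation from the earlier card; the house itself (2,000 square feet, sold for $70,000) is a hypothetical example, not course data.

```python
def predicted_price(square_feet):
    """Selling price predicted by the equation on the earlier card."""
    return 5240 + 33.80 * square_feet

# Hypothetical house: 2,000 square feet, actually sold for $70,000.
square_feet = 2000
actual_price = 70_000

# Residual = observed - predicted.
residual = actual_price - predicted_price(square_feet)
position = "below" if residual < 0 else "on or above"

print(f"predicted: {predicted_price(square_feet):.2f}")
print(f"residual:  {residual:.2f} -> point lies {position} the line")

# The slope is the change in predicted price per extra square foot.
print(f"{predicted_price(1001) - predicted_price(1000):.2f}")
```

Because the house sold for less than the line predicts, its residual is negative and its point sits below the regression line.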

A bar graph using StatCrunch is a good way to visualize a marginal or a conditional distribution.

True

What is the standard deviation of the data set 1, 1, 1, 1?

0

If 2 corresponding conditional distributions are different from each other in a two-way table, we know that the variables are related.

True

When a difference in treatment is decided to be due to more than random chance, what do you call the results?

Statistically significant

If a data set is skewed to the left, how will the mean and median compare?

The mean will be less than the median

What does SSE stand for?

sum of squares for error

If you could choose four numbers from 1, 2, 3, 4 and repeated numbers were allowed (such as 1, 1, 3, 2), which set of four numbers would give you the largest standard deviation? (No calculations needed.)

1, 1, 4, 4
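
No calculation is needed for the card, but a brute-force check is short. This sketch tries every set of four values drawn from 1 to 4 with repeats allowed and keeps the one with the largest standard deviation. It uses the population version, `pstdev`; the sample version picks the same set.

```python
from itertools import combinations_with_replacement
from statistics import pstdev

# Every way to pick four values from 1..4, order ignored, repeats allowed.
candidates = combinations_with_replacement([1, 2, 3, 4], 4)

# Keep the set whose values sit farthest from their own mean.
best = max(candidates, key=pstdev)

print(best, pstdev(best))
```

The winner puts half the values at each extreme, which maximizes the distance of every value from the mean.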

A five number summary contains the min, max, Q1, Q3, and what other value?

Median

Which of the following summary measures cannot be directly calculated from a boxplot? Mean, sample size, SD

None of them can be calculated from a boxplot

Which of the following can never be negative? Mean, SD, median

SD

An outlier in a data set can significantly affect the value of the mean but not the median.

True
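
A small illustration with made-up numbers: replacing the largest value with an extreme outlier drags the mean upward while the median stays put.

```python
from statistics import mean, median

# Made-up values (illustrative only).
values = [40, 42, 45, 47, 50]
# Same data, but the largest value becomes an extreme outlier.
with_outlier = values[:-1] + [500]

print(mean(values), median(values))
print(mean(with_outlier), median(with_outlier))
```

The median depends only on the middle of the ordered data, so a single extreme value at either end does not move it. The same reasoning explains why a few very small values pull the mean below the median.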

A __________ distribution summarizes the information from one variable ONLY, without considering ANY information from another variable.

Marginal

A researcher is trying to predict the linear relationship between January revenue and yearly revenue for her company. The correlation turns out to be .60. How does she interpret this correlation?

Moderate positive linear relationship

Suppose 35% of the women in a poll of Americans support candidate A for president (and 65% do not.) These results make up what kind of distribution?

Conditional distribution of support (yes/no) given women

Mike marks down the gas mileage of his two cars every time he fills them up with gas for 6 months straight. At the end he notes that his Mustang gets better mileage than his Corvette. Is this an experiment or an observational study?

Observational Study

As we heard in lecture, the "average distance from the mean" is measured by the __________________________.

Standard Deviation

          Smoker  Nonsmoker  Total
Male         125      _____    376
Female       104        233    337
Total        229        484    713

Above is a two-way table examining the relationship between gender and whether or not a person smokes. What number belongs in the missing cell?

251
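
The missing cell can be recovered from either margin of the table. A minimal arithmetic check, using only the numbers given on the card:

```python
# Values taken from the two-way table on the card.
male_smokers = 125
male_total = 376
female_nonsmokers = 233
nonsmoker_total = 484

# Row check: the Male row must add up to its total.
missing_from_row = male_total - male_smokers
# Column check: the Nonsmoker column must add up to its total.
missing_from_column = nonsmoker_total - female_nonsmokers

print(missing_from_row, missing_from_column)
```

Both routes give the same count, which is a useful consistency check on the table.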

A listing of all possible values in a data set and how often they occurred is called a data _____________________.

Distribution

If you add the same positive number to every value of a data set what happens to the standard deviation?

Does not change

If the mean of a data set is large, the standard deviation has to be large also.

False

If there are a few very small values in a data set compared to the rest of the data, the mean will be larger than the median.

False

Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = .30. Should we go on and try to predict milk prices from gasoline prices using a straight line?

No

Your boss gives you the following regression equation, where X = square feet and Y = selling price: Selling price = $5,240 + $33.80 (Number of Square Feet). Does it make sense to interpret the Y-intercept for this equation?

No

Which is more affected by skewness, the IQR or standard deviation?

SD

Correlation is affected by outliers.

True

Suppose 35% of the women in a poll support candidate A for president (and 65% do not). In the same poll 45% of the men support candidate A for president (and 55% do not). Are gender and support (yes/no) related?

Yes
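
The comparison on this card can be sketched as follows. The counts are hypothetical, chosen only so that the percentages match the card (200 women and 200 men); the helper name `conditional_distribution` is illustrative.

```python
# Hypothetical counts chosen to match the percentages on the card.
poll = {
    "women": {"yes": 70, "no": 130},
    "men": {"yes": 90, "no": 110},
}

def conditional_distribution(counts):
    """Turn one row of counts into proportions that sum to 1."""
    total = sum(counts.values())
    return {answer: count / total for answer, count in counts.items()}

women = conditional_distribution(poll["women"])
men = conditional_distribution(poll["men"])

# In the descriptive sense used on these cards, the variables are
# related when the conditional distributions differ.
related = women != men

print(women, men, related)
```

If the two rows had produced identical proportions, support would not depend on gender in this table.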

You can have two data sets with the same mean but different standard deviations.

True

A two-way table allows us to examine the relationship between _____________________ variables.

two categorical

The personnel department keeps records on all employees in a company. Here is the information they keep in one of their data files: employee identification number; last name; first name; middle initial; department; number of years with the company; salary ($); education level (high school, some college, or college degree); age (years). Which of the following combinations of variables would be appropriate to examine with a scatterplot?

Age and Salary

Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?

-0.90
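
Strength is judged by the absolute value of r; the sign only gives the direction. A one-line check using the four correlations from the card:

```python
correlations = [-0.10, 0.25, -0.90, 0.80]

# The strongest linear relationship is the r farthest from 0.
strongest = max(correlations, key=abs)

print(strongest)
```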

A researcher is trying to use January temperatures to predict latitude. This means January temperature is the X (independent) variable and latitude is the Y (dependent) variable.

True

A two-way table has this name because it contains rows going one way, and columns going the other way.

True

Which type of graph is made from the 5-number summary?

boxplot

