Stats midterm
What type of graph is best for comparing two or more quantitative data sets: a boxplot or a histogram?
Boxplot
Conditional Probabilities appear in the 4 cells of a two way table
False
If variables A and B are related in a certain way in a two way table (with 2 variables), no matter how many other variables you look at in addition to these two, the relationship will always stay the same
False
If you have all the information filled in a two way table you can fill in all the information on a tree. But not the other way around.
False
Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = 0.30. Should we go on and try to make predictions for milk using gasoline prices using a straight line?
No (r = 0.30 is too weak a linear relationship to make useful predictions)
Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8
False
The median must be one of the numbers in the data set
False
You can compare two marginal distributions to see if the corresponding two variables are related
False
In a boxplot you can tell the exact pattern of the data set
False
Which of the following is the x variable in an experiment
Independent variable or factor
If you add 10 to every value of a data set what happens to the standard deviation
It stays the same
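A quick numerical check of this card, as a minimal sketch: the data values below are made up for illustration and are not from the course. Shifting every value by the same constant moves the mean but leaves each distance from the mean unchanged, so the standard deviation should not move.

```python
import statistics

# Illustrative values only (an assumption, not course data).
data = [2, 4, 4, 4, 5, 5, 7, 9]
shifted = [x + 10 for x in data]  # add 10 to every value

# Population standard deviation before and after the shift.
sd_original = statistics.pstdev(data)
sd_shifted = statistics.pstdev(shifted)

print(sd_original, sd_shifted)
```

The same check works with the sample standard deviation (`statistics.stdev`), since both depend only on deviations from the mean.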
Can standard deviation be negative
No
Suppose P(A and B) = .2, P(A) = .3, and P(B) = .4. Then P(A|B) = .5
True
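The card above follows from the definition P(A|B) = P(A and B) / P(B). A minimal sketch of that calculation; the function name `conditional` is just a label chosen for this example.

```python
def conditional(p_a_and_b, p_b):
    """Return P(A|B) = P(A and B) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive to condition on B")
    return p_a_and_b / p_b

# Values from the card: P(A and B) = 0.2, P(B) = 0.4.
p_a_given_b = conditional(0.2, 0.4)
print(p_a_given_b)
```

Note that P(A) = .3 is not needed for P(A|B); it would be the denominator for P(B|A) instead.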
The slices on a pie chart represent relative frequencies
True
The starting point can affect the way a graph looks
True
There is a 70% chance that Bob will watch the Packers game. 50% of the time Bob watches the game, the Packers win. 60% of the time Bob doesn't watch the game, the Packers win. Suppose the Packers won the game. Was Bob more likely watching or not?
Watching
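One way to see why "watching" is the answer is to compare the two branches of the tree that end in a win, then apply Bayes' Rule. This is a sketch using only the numbers on the card; the variable names are illustrative.

```python
p_watch = 0.70
p_win_given_watch = 0.50
p_win_given_no_watch = 0.60

# Joint probabilities for the two "Packers win" branches of the tree.
joint_watch = p_watch * p_win_given_watch
joint_no_watch = (1 - p_watch) * p_win_given_no_watch

# Total probability of a win, then Bayes' Rule for P(watch | win).
p_win = joint_watch + joint_no_watch
p_watch_given_win = joint_watch / p_win

print(joint_watch > joint_no_watch, round(p_watch_given_win, 3))
```

Because both posteriors share the same denominator P(win), comparing the two joint probabilities is enough to decide which is larger.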
If 60% of male-owned businesses are successful in their first year, and 60% of female-owned businesses are successful in their first year, are gender and having a successful business in their first year independent?
Yes
Suppose 35% of women in a poll of Americans support candidate A for president. In the same poll, 45% of men support candidate A. Are gender and support related?
Yes
Mike marks down the gas mileage of his two cars every time he fills them up with gas for 6 months straight. At the end he notes that his Mustang gets better mileage than his Corvette. Is this an experiment or an observational study?
observational study
What is the most common observational study?
survey
Which technique should you use when you have P(B|A) and some other probabilities but you are looking for P(A|B)
Bayes Rule
Which type of graph is made from the 5-number summary
Boxplot
When is P(A or B) = P(A) + P(B)
Disjoint Events
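The correlation between rainfall and corn yield is r = 0.60. What proportion of the variability in corn yield is explained by rainfall?
R^2 = 0.36 (that is, 0.60 squared, or 36%)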
What summary measures can you not directly calculate from a boxplot
Standard Deviation, Mean, Sample Size
Suppose: 1) All the points on a scatterplot lie perfectly on a straight line going uphill 2)The mean of x and the mean of y are both 2 3) The standard deviations of X and Y are exactly the same Can you find the equation of the best fitting line with this information
yes
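The reasoning behind "yes" can be sketched with the standard least-squares formulas: slope = r × (SD of y / SD of x) and intercept = mean of y − slope × mean of x. Points lying perfectly on an uphill line give r = 1. The shared standard deviation of 3.0 below is an arbitrary assumption; any positive value gives the same result because only the ratio matters.

```python
def least_squares_line(r, mean_x, mean_y, sd_x, sd_y):
    """Return (slope, intercept) of the best-fitting line."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# r = 1 (perfect uphill line), both means are 2, equal SDs (3.0 is assumed).
slope, intercept = least_squares_line(r=1.0, mean_x=2.0, mean_y=2.0,
                                      sd_x=3.0, sd_y=3.0)
print(slope, intercept)
```

Under these conditions the line works out to y = x.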
If you could choose four numbers from 1,2,3,4 and repeated numbers were allowed (such as 1,1,3,2), which set of four numbers would give you the largest standard deviation?
1,1,4,4
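The answer can be checked by brute force, since there are only a few ways to pick four numbers from 1 to 4 when order does not matter. A minimal sketch; it uses the population standard deviation, and the sample version would rank the sets the same way because every set has four values.

```python
from itertools import combinations_with_replacement
import statistics

# Every multiset of four values drawn from 1..4 (repeats allowed).
candidates = combinations_with_replacement([1, 2, 3, 4], 4)

# Pick the one with the largest standard deviation.
best = max(candidates, key=statistics.pstdev)
print(best, statistics.pstdev(best))
```

The winner puts half the values at each extreme, which maximizes the distances from the mean.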
Suppose 20% of OSU students are business majors, and of the business majors, 60% have internships over the summer. Of those who are not business majors, 30% have internships over the summer. What percentage of ALL students have internships over the summer?
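36% (0.20 × 0.60 + 0.80 × 0.30 = 0.12 + 0.24; 12% is only the share who are business majors with internships)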
If a residual is negative then the data point lies ______ the regression line
Below
A researcher is trying to predict the linear relationships between January revenue and yearly revenue for her company. The correlation turns out to be 0.60. How does she interpret this correlation
There is a moderate positive linear relationship between January revenue and yearly revenue
The Pew Research Center reports the following: In the 2016 POTUS election, 63% of females voted and 59% of males voted. If research shows that 45% of registered voters are female and 55% of registered voters are male, what % of all registered voters actually voted in 2016?
.61
Two randomly chosen customers are buying laundry soap at the store. 40% of the laundry soap on the shelves is Tide brand. What is the chance that at least one of them purchase Tide laundry soap
.64
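The laundry soap cards ("at least one", "only one", "both") can all be checked by listing the four possible outcomes for two customers. This sketch assumes the two customers choose independently and that each buys Tide with probability 0.40, which is how the cards treat the shelf share.

```python
from itertools import product

p_tide = 0.40  # chance a single customer buys Tide (from the card)

def prob(outcome):
    """Probability of one outcome, e.g. (True, False) = first buys Tide only."""
    p = 1.0
    for buys_tide in outcome:
        p *= p_tide if buys_tide else 1 - p_tide
    return p

# All four outcomes for two independent customers.
outcomes = list(product([True, False], repeat=2))

p_at_least_one = sum(prob(o) for o in outcomes if sum(o) >= 1)
p_exactly_one = sum(prob(o) for o in outcomes if sum(o) == 1)
p_both = sum(prob(o) for o in outcomes if sum(o) == 2)

print(round(p_at_least_one, 2), round(p_exactly_one, 2), round(p_both, 2))
```

The "at least one" case is also the complement of "neither buys Tide", which is 1 − 0.6 × 0.6.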
70% of male students at OSU wear a backpack and 60% of female students at OSU wear a backpack. Assuming 50% of OSU students are male and 50% female, what total percentage of OSU students wear a backpack?
.65
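Cards like this one use the law of total probability: weight each group's rate by that group's share of the population and add the results. A minimal sketch; `total_probability` is a helper name made up for this example.

```python
def total_probability(groups):
    """Sum of (group share) * (rate within group) over all groups."""
    return sum(share * rate for share, rate in groups)

# Backpack card: 50% male at 70%, 50% female at 60%.
p_backpack = total_probability([(0.50, 0.70), (0.50, 0.60)])

# Voting card: 45% female at 63%, 55% male at 59%.
p_voted = total_probability([(0.45, 0.63), (0.55, 0.59)])

print(round(p_backpack, 2), round(p_voted, 3))
```

The group shares must add to 1 for the result to be a valid overall percentage.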
What is the standard deviation of the data set 1,1,1,1
0
Conditional Probabilities are present on a tree
True
When a difference in treatment is decided to be due to more than random chance, what do you call the results?
Statistically significant
A five number summary contains the min, max, Q1, Q3, and what other value
The median, the 50th percentile, Q2
Y = corn (bushels) X = inches of rain What is the unit of the slope
bushels per inch
A listing of all possible values in a data set and how often they occurred is called a data ___
distribution
As we heard in lecture, the "average distance from the mean" is measured by the ___________
standard deviation
What does SSE stand for?
sum of squares for error
A hotel has two restaurants, A and B. Restaurant A gets 60% of the business and restaurant B gets 40% of the business. 80% of Restaurant A customers are satisfied and 90% of Restaurant B customers are satisfied. What are the chances that someone ate at Restaurant B and was satisfied?
.36
Two randomly chosen customers are buying laundry soap at the store. 40% of the laundry soap on the shelves is Tide brand. What is the chance that only one of them purchase Tide laundry soap
.48
A bank has 3 branches: A, B, and C. Branch A does 50% of the business, Branch B does 30% of the business, and Branch C does 20% of the business. If someone goes to Branch A, the chance they are depositing money is 20%. If someone goes to Branch B, the chance they are depositing money is 40%. If they go to Branch C, the chance is 60%. Suppose someone deposited money at a branch. Which branch were they most likely at?
B or C
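The tie between B and C shows up when Bayes' Rule is applied to each branch. The sketch below uses exact fractions so the tie is not blurred by floating-point rounding; the dictionary names are illustrative.

```python
from fractions import Fraction

# Share of business per branch and P(deposit | branch), from the card.
share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
p_deposit_given = {"A": Fraction(20, 100), "B": Fraction(40, 100), "C": Fraction(60, 100)}

# Joint probability P(branch and deposit) for each branch.
joint = {b: share[b] * p_deposit_given[b] for b in share}

# Bayes' Rule: P(branch | deposit) = joint / P(deposit).
p_deposit = sum(joint.values())
posterior = {b: joint[b] / p_deposit for b in joint}

for branch, p in posterior.items():
    print(branch, p)
```

Branch A handles the most business but has the lowest deposit rate, so it ends up with the smallest posterior.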
Suppose 35% of women in a poll of Americans support candidate A for president. These results make up what kind of distribution?
Conditional distribution of support given women
If you add the same positive number to every value of a data set what happens to the standard deviation
Does not change
Your boss gives you the following regression line: y = 5240 + 33.80x. How do you interpret the slope?
For each 1-unit increase in x, the predicted y increases by 33.80
A flat histogram contains no variability whatsoever, according to our definition.
False
If the mean of a data set is large, the standard deviation has to be large also
False
Suppose you make 10 telemarketer calls. Which of the following is the complement of "at most" 9 sales?
More than 9 sales, that is, all 10
Suppose the equation Y = 3.45 - 2.58 (X) represents a valid regression equation: From this information, we know that X and Y have a ________correlation.
Negative
The Pew Research Center reports the following: In the 2016 POTUS election, 63% of females voted and 59% of males voted. Does this mean gender and voting are independent or not?
Not independent
What kind of sample occurs when you put an ad in the newspaper and ask readers to take your survey
Self-Selected sample
If the data set is skewed to the left, how will the mean and median compare?
The mean will be less than the median
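A small made-up data set shows the effect: one very small value creates a long left tail and drags the mean down, while the median barely notices. The values are an assumption for illustration only.

```python
import statistics

# Left-skewed example: most values are high, one value is far below the rest.
left_skewed = [1, 7, 8, 8, 9, 9, 10]

mean_value = statistics.mean(left_skewed)
median_value = statistics.median(left_skewed)

print(round(mean_value, 2), median_value)
```

Replacing the 1 with an even smaller number would lower the mean further and leave the median unchanged.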
If 2 corresponding conditional distributions are different from each other in a two-way table, we know that the variables are related
True
If A and B are independent then P(A or B) becomes P(A) +P(B) - P(A)*P(B)
True
If there are a few very small values in a data set compared to the rest of the data, the mean will be larger than the median
False (a few very small values pull the mean below the median)
Starting with the multiplication rule, you can show that P(B|A) = P(A and B)/P(A)
True
A two-way table allows us to examine the relationship between ______ variables
Two categorical
Bob wants to estimate the percentage of people who own a dog in his town, and he goes to all the apartment buildings to carry out his survey. He leaves out all the houses in the town. What kind of bias is this?
bias due to undercoverage
The standard deviation has no units
false
A _____ distribution summarizes the info from only one variable, without considering any information from another variable
marginal
A hotel has two restaurants, A and B. Restaurant A gets 60% of the business and restaurant B gets 40% of the business. 80% of Restaurant A customers are satisfied and 90% of Restaurant B customers are satisfied. What percentage of Restaurant B customers are not satisfied?
.10
P(A) =.5, P(B)=.4, and P(A and B) = .1 what is P(B|A)
.2
Suppose you have 4 data sets whose scatterplots all show a possible linear relationship. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which correlation indicates the strongest linear relationship?
-0.90
Suppose a school figures that 70% of adults will purchase a candy bar from a 6th grader during a fund-raiser. A sixth grader randomly selects 10 adults. What's the chance that at least one of them will buy a candy bar?
1-0.3^10
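The expression comes from the complement rule: "at least one buys" is the opposite of "none of the 10 buy". A minimal sketch, assuming the 10 adults decide independently as the card implies.

```python
p_buy = 0.70   # chance a single adult buys a candy bar
n_adults = 10

# P(none buy) = 0.3 multiplied by itself 10 times.
p_none = (1 - p_buy) ** n_adults

# Complement rule: P(at least one) = 1 - P(none).
p_at_least_one_buys = 1 - p_none

print(p_at_least_one_buys)
```

The result is extremely close to 1, since it is very unlikely that all ten adults refuse.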
A hotel has two restaurants, A and B. Restaurant A gets 70% of the total business and Restaurant B gets 30% of the total business. We know 30% of A's customers are women and 60% of B's customers are women. What % of all customers are women?
39% (0.70 × 0.30 + 0.30 × 0.60 = 0.21 + 0.18)
Bob is interested in examining the relationship between the number of bedrooms in a home and its selling price. After downloading a valid data set from the internet, he calculates the correlation. The correlation value he calculates is only 0.05. What does Bob conclude
Bob continues his research because even though there is no linear relationship here, there could be a different relationship
In the hospital example in your lecture notes, you found hospital A didn't do as well as hospital B when all the data was in one two-way table, but when an additional variable was examined, hospital A was better in all cases. What was the additional confounding variable that ended up reversing the results?
Condition of patient(good/poor)
The Pew Research Center reports the following: In the 2016 POTUS election, 63% of females voted and 59% of males voted. Research shows that 45% of registered voters are female and 55% of registered voters are male. If someone selected at random actually voted in the election, were they more likely to be male or female?
Male
Bob and Bill live in an apartment together. Bob is in the apartment 30% of the time overall. But when Bill is in the apartment, Bob is only there 10% of the time. Let Event A = "Bill is in the apartment", and let Event B = "Bob is in the apartment." Are Events A and B independent?
No
Boxplot A and Boxplot B are drawn on the same axes. If Boxplot A is shorter in length than boxplot B, what can you tell about the size of the data sets
Nothing
Which is more affected by skewness, the IQR or standard deviation
Standard Deviation
Speeds were recorded by a police officer in a speed trap and summarized in the boxplot shown. Which of the following statements is true about the data? The top line of the boxplot is longer and the median is closer to Q1
The data is skewed to the right
Two randomly chosen customers are buying laundry soap at the store. 40% of the laundry soap on the shelves is Tide brand. What is the chance that both of them purchase Tide laundry soap
16%
Your boss gives you the following regression line: y = 5240 + 33.80x. Does it make sense to interpret the y-intercept for this equation (Y = selling price per square foot)?
No
A student has applied to two graduate schools but she will only enroll in one of them. She has a 60% chance of getting into grad school A, and a 40% chance of getting into grad school B. If grad school A lets her in first, there is an 80% chance she will enroll. If grad school B lets her in first, there is a 90% chance she will enroll. How do you write .80 in probability notation?
P(Enroll | A)
A hotel has two restaurants, A and B. Restaurant A gets 70% of the total business and Restaurant B gets 30% of the total business. We know 30% of A's customers are women and 60% of B's customers are women. Suppose a randomly chosen customer is a woman. Is she more likely to have eaten at Restaurant A or Restaurant B?
Restaurant A
An outlier in a data set can significantly affect the value of the mean but not the median
True
Suppose your data represent revenues from a group of 20 stores in a retail chain across the country, and revenue is measured in millions of dollars. The standard deviation of this data set would also be measured in millions of dollars
true
A bank has 3 branches: A, B, and C. Branch A does 50% of the business, Branch B does 30% of the business, and Branch C does 20% of the business. If someone goes to Branch A, the chance they are depositing money is 20%. If someone goes to Branch B, the chance they are depositing money is 40%. If they go to Branch C, the chance is 60%. What is the chance that a randomly selected person didn't deposit money if they went to Branch A?
.8
What should the residual plot look like if the regression line fits the data well
No fan shapes, Points fall around the horizontal line y=0, random patterns
Suppose the probability of someone being female and voting for candidate A is 30%. What is the notation for this probability
P(Female and Voting for A) = 0.30
Suppose 90% of patients who test positive for a disease actually have the disease. Write this as a probability.
P(Have disease | test positive) = 0.90
P(A) = P(A and B) + P(A and Bc) where Bc means...
B complement (the event that B does not occur)
A researcher is trying to use January temperatures to predict latitude. This means January temperature is the X variable and latitude is the Y variable
True
Correlation is affected by outliers
True
A bank has 3 branches: A, B, and C. Branch A does 50% of the business, Branch B does 30% of the business, and Branch C does 20% of the business. If someone goes to Branch A, the chance they are depositing money is 20%. If someone goes to Branch B, the chance they are depositing money is 40%. If they go to Branch C, the chance is 60%. What are the chances that somebody deposited money at any of these branches?
34% (0.50 × 0.20 + 0.30 × 0.40 + 0.20 × 0.60 = 0.10 + 0.12 + 0.12)
You can have two data sets with the same mean but different standard deviations
True
Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 × 12
False
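Correlation has no units, so rescaling both variables (1 yard = 3 feet) should leave it unchanged. The sketch below checks this with a hand-written Pearson correlation; the yardage numbers are invented for illustration and are not from the course.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up game totals in yards (assumption for this example).
rushing_yards = [80, 120, 95, 150, 60]
passing_yards = [210, 250, 190, 300, 220]

r_yards = pearson_r(rushing_yards, passing_yards)

# Convert every value to feet and recompute.
r_feet = pearson_r([3 * x for x in rushing_yards],
                   [3 * y for y in passing_yards])

print(math.isclose(r_yards, r_feet))
```

The scale factors cancel between the numerator and the denominator, which is why the two correlations agree.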
There is a 70% chance that Bob will watch the Packers game. 50% of the time Bob watches the game, the Packers win. 60% of the time Bob doesn't watch the game, the Packers win. What is the chance that the Packers will win the game?
.53 (0.70 × 0.50 + 0.30 × 0.60 = 0.35 + 0.18)