Stat Quizzes Midterm 1
When given two variables, and it says "at least one"
"or" probability
A researcher is trying to determine the January temperature in regions of the United States using the degrees of latitude. After collecting data, she creates the scatterplot above. Which of the following is the most likely value of r, the correlation coefficient, for the data?
-0.87
If events are disjoint, the and probability is equal to ____
0
Which of the following is the complement of "at most 1"?
"More than 1"
Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?
-0.90
A hotel has two restaurants, A and B. Restaurant A gets 60% of the business and Restaurant B gets 40% of the business. 80% of Restaurant A customers are satisfied and 90% of Restaurant B customers are satisfied. What percentage of Restaurant B customers are not satisfied?
.10
SSE is for what line going through the data
ANY line
The personnel department keeps records on all employees in a company. Here is the information they keep in one of their data files: employee identification number; last name; first name; middle initial; department; number of years with the company; salary; education (coded as high school, some college, or college degree); age. Which of the following combinations of variables would be appropriate to examine with a scatterplot?
Age and Salary
P(one exactly) when given two options =
P(A and Bc) + P(Ac and B)
What are the 3 rules of independence
P(A given B) = P(A); P(A given B) = P(A given Bc); P(A and B) = P(A)*P(B)
When measuring 10 variables, and it says "none"
P(Ac)^10
Undercoverage happens during which stage of the sampling process?
When the sample is being selected
Coefficient of determination is always _______(neg or pos)
positive (strictly, never negative: it ranges from 0 to 1)
Scatterplot is made of 2 _______ variables
quantitative
How to solve for SSE
Square the residuals and add them together
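The SSE recipe can be sketched in a few lines of Python. This is only an illustration: the data points and the helper names `sse` and `least_squares_line` are made up here, not taken from the course. It also shows that SSE can be computed for any line, with the least-squares line giving the smallest value.

```python
def sse(xs, ys, intercept, slope):
    """Sum of squared residuals (observed y minus predicted y) for one line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Intercept and slope of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

best_intercept, best_slope = least_squares_line(xs, ys)
sse_best = sse(xs, ys, best_intercept, best_slope)  # SSE of the best-fit line
sse_other = sse(xs, ys, 1.0, 1.0)                   # SSE of an arbitrary line y = 1 + x
```

Any other intercept and slope can be passed to `sse` in the same way; the value for the least-squares line should never be larger.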
Synonym of response in an experiment
the dependent variable
The variable whose effects you want to study in an experiment is called ________
the factor
If the correlation coefficient, r, between two variables is 0, we can conclude that there is no relationship between the two variables.
False
Event in probability
Subset of Sample space (written as A=1 for example)
If a data set is skewed to the left, how will the mean and median compare?
The mean will be less than the median.
A researcher is trying to determine the January temperature in regions of the United States using the degrees of latitude. After collecting data, she creates the scatterplot above. Which of the following is the correct interpretation of the correlation depicted in the scatterplot?
There is a moderately strong negative linear relationship between temperature and latitude.
If A and B are independent, all you need is P(A) and P(B) to calculate P(A or B).
True
True or False: The multiplication rule can test independence
True
True or False? Correlation is affected by outliers and skewness.
True
Residual tells you
Yi - Y-hat (the observed value minus the predicted value)
A hotel has two restaurants, A and B. Restaurant A gets 60% of the business and Restaurant B gets 40% of the business. 80% of Restaurant A customers are satisfied and 90% of Restaurant B customers are satisfied. What's the chance that someone ate at Restaurant B and was satisfied?
0.36
When given two variables, and it says "at most one"
1 - "and" probability
When measuring 10 variables, and it says "at least one person"
1 - "none" (1 - P(Ac)^10)
When given two variables, and it says "not either"
1 - "or" probability
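The two-event phrases in this deck ("at least one", "at most one", "exactly one", "not either") can be checked against a small joint probability table. The four probabilities below are hypothetical numbers chosen only so that they sum to 1; the variable names are likewise just for this sketch.

```python
# Hypothetical joint probabilities for two events A and B (they sum to 1).
p_a_and_b = 0.10
p_a_only = 0.20      # P(A and Bc)
p_b_only = 0.30      # P(Ac and B)
p_neither = 0.40     # P(Ac and Bc)

p_a_or_b = p_a_and_b + p_a_only + p_b_only

at_least_one = p_a_or_b            # the "or" probability
at_most_one = 1 - p_a_and_b        # everything except "both"
exactly_one = p_a_only + p_b_only  # one happens, the other does not
not_either = 1 - p_a_or_b          # complement of the "or" probability
```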
At least one when given 10 variables
1 - P(none) = 1 - P(Ac)^10, because "at least one" is everything besides getting none
Suppose 40% of OSU students have internships over the summer, and of those who have internships, 60% of them are business majors. What percentage of all OSU students are business students and have internships over the summer?
24%
Above is a two-way table examining the relationship between gender and whether or not a person smokes. What number belongs in the missing cell?
251
Data are collected on the fuel economy of 2016 model midsize cars. The information measured in average miles per gallon (mpg) is summarized below: The government is interested in cars using less fuel, so they decide to provide a 'gas saver tax credit' for those who own vehicles that get good gas mileage. What is the smallest value for MPG if you want only 25% of the cars to receive this tax incentive?
33.750 mpg
Suppose 20% of OSU students are business majors, and of the business majors, 60% have internships over the summer. Of those who are not business majors, 30% have internships over the summer. What percentage of ALL students have internships over the summer?
36%
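The 36% comes from weighting each group's conditional percentage by the size of the group and adding the pieces. A minimal sketch, where the helper name `total_probability` is made up for this example; the same function also reproduces the backpack card elsewhere in this set.

```python
def total_probability(groups):
    """Sum P(group) * P(event | group) over groups that split the whole population."""
    return sum(p_group * p_event_given_group for p_group, p_event_given_group in groups)


# Internship card: 20% business majors (60% intern), 80% other majors (30% intern).
p_internship = total_probability([(0.20, 0.60), (0.80, 0.30)])

# Backpack card: 50% male (70% wear one), 50% female (60% wear one).
p_backpack = total_probability([(0.50, 0.70), (0.50, 0.60)])
```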
A list of the data that occurred in a data set and how often it occurred is called ___________
A data distribution
Which of the following best describes a confounding variable?
A variable you did not include in the study that may have had an effect on the results.
Complement Rule
Explains why P(Ac)=1-P(A)
Suppose 70% of Facebook users have Twitter accounts. Write this as a probability.
P(Twitter | Facebook) = 0.70
Bob wants to do a telephone survey based on 100 people. Knowing that some people won't answer the phone, he selects a random sample of 200 names to be safe, so if someone isn't home, he can just call the next person on the list. He continues this way until he gets 100 responses. Will this sampling method create bias in Bob's data?
Yes
70% of male students at OSU wear a backpack and 60% of female students at OSU wear a backpack. Assuming 50% of OSU students are male and 50% are female, what total percentage of OSU students wear a backpack?
.65
When measuring 10 variables, and it says at most 9
1 - P(S)^10, because it is 1 minus the probability of everybody getting it, which leaves 9 or fewer
Thomas wants to know what percentage of females buy his product. Thomas knows his customers are 50% male and 50% female, and that 30% of all his customers buy his product. He collects data on people who have already bought his product, and asks their gender. He finds that 70% of the people who bought his product were female. What method would you use to answer Thomas' original question?
Bayes' Rule
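Plugging Thomas's numbers into Bayes' Rule might look like the sketch below. It assumes his question means P(buy given female); the function name `bayes` and the variable names are chosen here for illustration.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' Rule: P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


p_female = 0.50            # marginal: half of all customers are female
p_buy = 0.30               # marginal: 30% of all customers buy
p_female_given_buy = 0.70  # conditional: 70% of buyers are female

# A = "buys the product", B = "is female"
p_buy_given_female = bayes(p_female_given_buy, p_buy, p_female)
```

With these inputs the result works out to 0.42, so about 42% of female customers buy the product.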
What does it mean for a sample to be truly random?
Every sample of the same size has the same chance of being selected
A conditional distribution summarizes the information from one variable ONLY, without considering ANY information from another variable.
False
A flat histogram contains no variability whatsoever, according to our definition.
False
A researcher is trying to determine the January temperature in regions of the United States using the degrees of latitude. After collecting data, she creates a scatterplot. Given the relationship the researcher is trying to predict, the latitude is the dependent variable and the temperature is the independent variable.
False
Changing the number of bins will never change the shape of a histogram.
False
Suppose a school figures that 70% of adults will purchase a candy bar from a 6th grader during a fund-raiser. A sixth grader randomly selects 10 adults. What's the chance that at least one of them will buy a candy bar?
None of the answers provided.
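The answer choices are not reproduced in this set, so the sketch below only shows the complement-rule calculation itself. It assumes the 10 adults decide independently of one another; the variable names are made up for this example.

```python
p_buy = 0.70   # chance that one adult buys a candy bar
n_adults = 10

# "None buy" means all 10 adults decline, each with probability 1 - 0.70.
p_none = (1 - p_buy) ** n_adults

# "At least one" is the complement of "none".
p_at_least_one = 1 - p_none
```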
Use Bayes' Rule when you have _____ probability and ______ probability
conditional and marginal
Regression analysis gives you _____ column that corresponds to a ____ (y int) and a _______ (slope)
constant and a variable
Simpson's paradox definition
A pattern you see within the individual groups can reverse when the groups are combined and the overall percentages are compared, so your original conclusions may be reversed
If you add 10 to every value of a data set, which of the following will also increase by 10?
Both the median and mean will increase by 10.
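A quick check of this shift rule, using a small made-up data set and Python's standard `statistics` module:

```python
from statistics import mean, median

data = [3, 7, 8, 12, 20]           # hypothetical values
shifted = [x + 10 for x in data]   # add 10 to every value

mean_change = mean(shifted) - mean(data)
median_change = median(shifted) - median(data)
```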
Match the boxplots below to their Five Number Summaries (Min, Q1, Median, Q3, Max): #1: 10, 12, 20, 22, 30; #2: 10, 18, 20, 28, 30; #3: 10, 15, 20, 25, 30
Boxplot C1 goes with Five Number Summary #2, C2 with #1, and C3 with #3.
Outliers significantly affect the value of the median.
False
Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = .40. Then the correlation between the price of a HALF gallon of milk and the price of a HALF gallon of gas must be r = .4/2 = .20.
False
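The reason is that rescaling a variable does not change r. The sketch below computes the correlation by hand on made-up prices (the numbers and the function name `correlation` are assumptions for illustration) and compares per-gallon prices, half-gallon prices, and the same data with X and Y switched.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-gallon prices, made up for illustration.
gas_gallon = [2.0, 2.5, 3.0, 3.5, 4.0]
milk_gallon = [3.1, 2.9, 3.6, 3.4, 4.0]

# Half-gallon prices: every value is simply divided by 2.
gas_half = [p / 2 for p in gas_gallon]
milk_half = [p / 2 for p in milk_gallon]

r_gallon = correlation(gas_gallon, milk_gallon)
r_half = correlation(gas_half, milk_half)          # same as r_gallon
r_swapped = correlation(milk_gallon, gas_gallon)   # switching X and Y: same r
```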
The units of r, the correlation coefficient, are the same as the X variable.
False
True or False? You can ALWAYS interpret both the slope and the y - intercept of a regression line.
False
True or false: An influential point is defined to be a point with a large residual.
False
True or false: SSE is only for the best line going through the data
False
Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). It makes sense to interpret the Y-intercept for this equation.
False
Above is a two-way table examining the relationship between gender and whether or not a person smokes. What is the marginal distribution of gender?
Females: 337 / 713 = 0.473 Males: 376 / 713 = 0.527
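The marginal distribution uses only the totals for each gender divided by the grand total. A short sketch using the two totals quoted in the answer; the individual cells of the table are not reproduced here, so none are used.

```python
# Gender totals from the answer above.
counts = {"Females": 337, "Males": 376}

total = sum(counts.values())

# Each marginal proportion is a gender total over the grand total.
marginal = {gender: count / total for gender, count in counts.items()}
```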
A veterinarian collects data on 100 of his patients who come in every year for their annual check-ups. After 5 years, he compares the health status of the dogs to the cats. What type of study is this?
Observational Study
formula for coefficient of determination
R^2
Which of the following is not one of the criteria for a good experiment?
Select a random sample of individuals to participate.
An experimenter compares a single brand of popcorn to see how much popcorn is popped using different time settings on the same microwave. The time settings are 1.5 minutes, 2 minutes, 2.5 minutes, and 3 minutes. In this situation, what is the factor?
Time Setting
A STAT 1430 student is interested in examining the relationship between the number of bedrooms in a home and its selling price. After downloading a valid data set from the internet, the student creates a scatterplot and calculates the correlation. The correlation value they calculate is 0.67. This implies that the selling price of a house tends to increase as the number of bedrooms increases.
True
A taxi cab company takes a random sample of 50 of its taxis and notes their miles per gallon (mpg) on a test run. The computer output is shown below. From this output we can find the value of the interquartile range.
True
If there are a few very large values in a data set compared to the rest of the data, the mean will be larger than the median.
True
Suppose your data represent revenues from a group of 20 stores in a retail chain across the country, and revenue is measured in millions of dollars. The first quartile of this data set would also be measured in millions of dollars.
True
The mean is influenced by outliers (values that are much larger or much smaller than the rest of the data.)
True
True or false: A confounding variable can cause the results of a two - way table to reverse when added to the data set
True
True or false: Suppose A and B are disjoint. Then P(A|B) = 0
True
True or false: Suppose P(A and B) = .2 and P(A)=.3 and P(B) = .4. Then P(A|B) = .5
True
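Both this card and the disjoint card just before it follow from the definition P(A|B) = P(A and B) / P(B). A minimal sketch; the function name `conditional` is made up for this example.

```python
def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b


# This card: P(A and B) = .2 and P(B) = .4.
p_a_given_b = conditional(0.2, 0.4)

# Disjoint events have P(A and B) = 0, so P(A | B) is 0 whenever P(B) > 0.
p_a_given_b_disjoint = conditional(0.0, 0.4)
```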
True or false: the difference between an experiment and an observational study is an experiment randomly assigns subjects to treatments
True
Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). The residuals have units of dollars.
True
If 60% of male-owned businesses are successful in their first year, and 60% of female-owned businesses are successful in their first year, are gender and having a successful business in their first year independent?
Yes
A manager of a retail store is interested in the relationship between a person's annual income and their total purchase amount. Could he measure this relationship by finding the correlation?
Yes, because income and total purchase amount are quantitative variables.
P(A) = .3, P(B) = .2, what is P(A and B)?
cannot tell
Statistic
characteristic of a sample
Parameter
characteristic of population
R squared measures __________ relationships and correlation measures _______ relationships
coefficient of determination measures any kind of relationship and correlation measures linear relationships
a good residual (does/doesn't) have a pattern
doesn't
You can tell two things are independent when given two conditional probabilities that are
equal, for example P(A given female) = P(A given male)
Making comparisons, avoiding bias, and having enough data are three criteria for a good ____________
experiment
true or false: if the coefficient of determination is .81, the value of correlation must be .9 - explain why
False, because the correlation could also be -0.9 (both 0.9 and -0.9 square to 0.81)
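A two-line check that both signs are consistent with a coefficient of determination of 0.81; the variable names are chosen here for illustration.

```python
from math import sqrt

r_squared = 0.81

# Squaring removes the sign, so both roots are candidates for r.
possible_r = (sqrt(r_squared), -sqrt(r_squared))
```

Only the direction of the relationship (for example, the sign of the regression slope) tells you which of the two values applies.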
coefficient of determination definition
how much variability of y is explained by relationship with x
If there is no relationship between the two variables, the variables are
independent
Synonym of the factor in an experiment
independent variable
Correlation is a descriptive statistic for _______ relationships
linear
If you are comparing two conditional distributions and they are equal (for example, P(A given B) = P(A given Bc)), what can you conclude?
there is no relationship between the two variables
True or false: If P(A|B) = P(A) then A and B are independent.
true
A two-way table allows us to examine the relationship between categorical variables.
True
P(A) measures
long term chance
The Multiplication Rule
Explains the formula for joint probabilities
Addition Rule
Gives you the "or" probability: P(A or B) = P(A) + P(B) - P(A and B). For disjoint events you just add P(A) and P(B), because the joint probability is 0
Speeds were recorded by a police officer in a speed trap and summarized in the boxplot shown. Which of the following statements is true about the data?
Half the drivers were driving between about 70 and about 80 miles per hour.
Which of the following summary measures can be directly calculated from a boxplot?
IQR
How do you tell if variables are independent in a survey
If you are given a random sample for the survey
Sample Space in probability
List of possible outcomes
Descriptive statistics give you ______ and _______ for 2 variables
Mean and StDev
Bob and Bill live in an apartment together. Bob is in the apartment 30% of the time overall. But when Bill is in the apartment, Bob is only there 10% of the time. Let Event A = "Bill is in the apartment", and let Event B = "Bob is in the apartment." Are Events A and B independent?
No
P(A|B) = .6, P(A) = .5, and P(B) = .4. Are A and B independent?
No
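The independence cards in this set all come down to comparing a conditional probability with the matching marginal probability. A small sketch of that comparison; the function name `is_independent` and the tolerance are assumptions made for this example.

```python
def is_independent(p_a_given_b, p_a, tol=1e-9):
    """A and B are independent when P(A | B) equals P(A)."""
    return abs(p_a_given_b - p_a) <= tol


# Bob and Bill: Bob is home 30% of the time overall, but only 10% when Bill is home.
bob_and_bill = is_independent(0.10, 0.30)

# This card: P(A|B) = .6 versus P(A) = .5.
a_and_b = is_independent(0.6, 0.5)

# First-year businesses: 60% succeed for both genders, so the overall rate is
# also 60% and the conditional equals the marginal.
gender_and_success = is_independent(0.60, 0.60)
```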
Suppose 40% of all OSU students own a Tablet PC and an iPhone. Write this as a probability.
None of the answers provided.
Disjoint events are ___________ (independent or dependent)
dependent
If you switch X and Y the sign of the correlation changes.
False
Bob and Fred share an apartment. Bob is there 40% of the time, and Fred is there 50% of the time. When Bob is in the apartment, Fred is there 70% of the time. Are Bob's and Fred's time in the apartment independent? True or false?
False
Boxplot A and Boxplot B are drawn on the same axes. If Boxplot A is shorter in length than boxplot B, it also has to contain less data than Boxplot B.
False
If the conditional distribution is different from the corresponding marginal distribution in a two-way table, we know that the variables are NOT related.
False
In thinking about the 5-number summary, the percentage of data below Q1 and above Q3 combined is the same as the percentage of data in the IQR.
True
Suppose the equation Y = 3.45 - 2.58 (X) represents a valid regression equation: From this information, we know that X and Y have a negative correlation.
True
You can put ____ probabilities in a pie chart
And
Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). What is the correct interpretation of the slope of this equation?
For every additional square foot, we expect a home's selling price to increase by $33.80.
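The slope interpretation can be checked by predicting the price at two sizes one square foot apart. The observed price below is a hypothetical value, included only to show that a residual is a difference of two dollar amounts; the names are made up for this sketch.

```python
INTERCEPT = 5240.0   # dollars
SLOPE = 33.80        # dollars per square foot


def predicted_price(square_feet):
    """Predicted selling price from the regression equation, in dollars."""
    return INTERCEPT + SLOPE * square_feet


price_1000 = predicted_price(1000)
price_1001 = predicted_price(1001)

# One additional square foot changes the prediction by the slope.
slope_effect = price_1001 - price_1000

# Hypothetical observed price for a 1000 square foot home.
observed_price = 40000.0
residual = observed_price - price_1000   # observed minus predicted, in dollars
```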
A hotel has two restaurants, A and B. Restaurant A gets 60% of the business and Restaurant B gets 40% of the business. 80% of Restaurant A customers are satisfied and 90% of Restaurant B customers are satisfied. How do you write .80 in probability notation?
P(S|A)