Stats 1430.02 Midterm

True or false: You cannot see the mean on a boxplot.

True

If the correlation is 0 you know there is no relationship between X and Y. a. True b. False

b. False

If you add the same value to every single number in a data set, the standard deviation also changes by that same value. a. True b. False

b. False
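
A quick numerical check of why the answer is False. This is a minimal sketch; the data set and the shift of 10 are hypothetical values chosen only for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
shifted = [x + 10 for x in data]   # add the same value to every number

sd_original = statistics.stdev(data)
sd_shifted = statistics.stdev(shifted)
mean_change = statistics.mean(shifted) - statistics.mean(data)

# The standard deviations agree; only the mean moves by the added value.
print(sd_original, sd_shifted, mean_change)
```

Adding a constant slides every value the same distance, so the spread around the mean is unchanged.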

Which is best if you want to compare several data sets regarding shape, center, and variability? a. Histogram b. Boxplot

b. Boxplot

All good samples are _____________ a. Simple b. Stratified c. Random d. Biased

c. Random

Suppose you are using cereal price to predict milk price; they have a correlation of -.70. The average cereal price is $3.00 with standard deviation $0.50 and the average milk price (gallon) is $2.50 with standard deviation $0.25. What is the slope of the regression line? Choose the closest answer. a. -1.40 b. .245 c. 1.40 d. -.35 e. .35

d. -.35
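
The slope comes from the summary statistics as slope = r x (SD of Y / SD of X). A minimal sketch of that arithmetic, using the numbers in the question; the variable names are illustrative only, and the intercept is not asked for but follows from the same five statistics.

```python
# Slope of the least-squares line from summary statistics: b = r * (s_y / s_x).
r = -0.70                     # correlation between cereal price (X) and milk price (Y)
sd_x, sd_y = 0.50, 0.25       # standard deviations of X and Y
mean_x, mean_y = 3.00, 2.50   # means of X and Y

slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x   # the line passes through (mean_x, mean_y)

print(round(slope, 2), round(intercept, 2))
```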

A researcher collected data on 210 people who fly, noting the type of flight (connecting or direct) and whether they brought carry-on luggage (yes or no). The data is summarized in the table below. Show the conditional distribution of type of flight given the person brought carry-on luggage using the appropriate graph. Use proper labels on the graph.

Conditional distribution of type of flight given carry-on (n = 120): P(Connecting | Carry-On) = 80/120 = 2/3; P(Direct | Carry-On) = 40/120 = 1/3

A correlation of -.6 is considered to be what? a. Weak b. Moderately strong c. No linear relationship d. Strong e. Moderate

b. Moderately strong

Extrapolation is what? a. None of the other choices is correct b. Another word for Simpson's paradox c. Guessing an answer by plugging X into the equation of a line d. Plugging in X values outside the range of the data

d. Plugging in X values outside the range of the data

When finding the correlation, if you are given R-squared, you take the square root first. Then what do you look at to determine the sign of the correlation? a. Neither choice is correct b. The sign of the y-intercept c. The sign on the slope d. Both choices are correct

c. The sign on the slope
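
A minimal sketch of that two-step recipe. The R-squared value and the slope below are hypothetical numbers chosen for illustration.

```python
import math

r_squared = 0.49   # hypothetical coefficient of determination
slope = -1.2       # hypothetical slope of the fitted regression line

# |r| is the square root of R-squared; r takes the same sign as the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(round(r, 2))
```

The intercept plays no role here: it depends on where the line crosses the Y axis, not on the direction of the relationship.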

Which of the following is NOT a property of standard deviation? a. Standard deviation is never negative. b. Standard deviation has no units. c. Standard deviation is affected by outliers and skewness. d. All of the above are properties of standard deviation.

b. Standard deviation has no units.

If you want to ask the question "How is the view from your seat?" and your population is everyone in OSU's football stadium, what kind of sample should you use? a. Volunteer Sample b. Stratified Random Sample c. Simple Random Sample d. Choose one section and select everyone from that section

b. Stratified Random Sample

If you are predicting gas price using temperature, which is the X variable? a. Gas price b. Temperature c. Cannot tell without more information

b. Temperature

Which measure of variability measures the concentration of the data around the mean? a. The IQR b. The standard deviation c. The correlation d. All of the above

b. The standard deviation

How do the residuals relate to the SSE? a. The sum of the residuals equals SSE. b. The sum of the squared residuals equals SSE. c. The square root of the sum of the squared residuals equals SSE. d. The residuals do not relate to SSE.

b. The sum of the squared residuals equals SSE.

Which of the following is NOT a property of correlation? a. It has no units. b. Switching X and Y does not change its value c. It is not affected by outliers and skewness. d. All of the above are properties of correlation.

c. It is not affected by outliers and skewness.

Which five descriptive statistics do you need to find the equation of the best fitting line? a. Median of X, Median of Y, Mean of X, Mean of Y, and r. b. Min, Q1, Median, Q3, Max c. Mean of X, Mean of Y, SD of X, SD of Y, and r. d. None of the above

c. Mean of X, Mean of Y, SD of X, SD of Y, and r.

You send out an email to all the students in Stat 1430 and you tell them to go to your website and do a survey. 100 students come forward. What kind of sample is this? a. Undercoverage b. Involuntary sample c. Self-Selected sample d. Simple Random sample

c. Self-Selected sample

If P(A) = .2, P(B) = .3, and P(A|B) = .1, what is P(A and B)? a. .06 b. .1 c. .02 d. .03 e. None of the above

d. .03

What is statistical significance? a. A statistical anomaly b. A result with significant digits c. An important result d. A result due to more than chance

d. A result due to more than chance

Which can affect the way a histogram looks? a. The number of bars used b. The scale on the Y axis c. The starting point on the Y axis d. All of the above

d. All of the above

Which of the following is NOT in the same units as the original data? a. standard deviation b. Q1 c. y-intercept of the regression line d. All of the above are in the same units as the original data.

d. All of the above are in the same units as the original data.

Which of the following statistics can NEVER be negative? a. Correlation b. Slope of the regression line c. Y-intercept of the regression line d. All of the above can be negative, if the data permits.

d. All of the above can be negative, if the data permits.

If you switch X and Y, which of the following will change? a. The correlation b. The slope of the regression line c. The Y-intercept of the regression line d. Both b and c will change e. All of a, b, and c will change

d. Both b and c will change

Suppose the best fitting line is Y = 3 + 20X, where X is hours studied and Y is exam score. How do you interpret the slope of the line? a. As hours studied increases by 3, exam score increases by 20. b. As exam score increases by 1, hours studied increases by 20. c. As hours studied increases by 1, exam score increases by 3. d. None of the above.

d. None of the above.

The correlation between study time for an exam (in minutes) and exam score is 0.79. If we convert study time to hours, the correlation will a. Increase by a factor of 60 b. Decrease by a factor of 60 c. Switch signs to become -0.79 d. Stay the same

d. Stay the same
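
A small check that changing units leaves the correlation alone. This is a sketch under stated assumptions: the study times and scores are made-up values, and `pearson_r` is a helper written here from the definition of the correlation, not a function from the course notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical study times (minutes) and exam scores.
minutes = [30, 60, 90, 120, 150, 180]
scores = [55, 62, 70, 68, 81, 90]
hours = [m / 60 for m in minutes]   # unit conversion only

r_minutes = pearson_r(minutes, scores)
r_hours = pearson_r(hours, scores)
r_swapped = pearson_r(scores, minutes)   # switching X and Y

print(r_minutes, r_hours, r_swapped)
```

All three values agree, which matches two properties of correlation: it has no units, and switching X and Y does not change it.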

Suppose X and Y have a correlation of .9 and the regression line is Y = 2X + 3. If you increase X by 5 what happens to Y? a. None of the other choices is correct b. Increase by 13 c. None of these answers are correct. d. Increase by 10

d. Increase by 10

Suppose the correlation between speed (mph) and gas mileage (mpg) is -.65 and the slope of the regression line is -.123. That means if you increase your speed by 10 mph, what will happen to your gas mileage? a. It will decrease by 1.23 mpg b. It will stay the same; these variables do not have a strong correlation. c. It will decrease by 12.3 mpg d. It will be 12.3 mpg

a. It will decrease by 1.23 mpg

If the correlation is .2, what does that tell you about using a regression line to fit your data? a. It's a linear relationship; go ahead and do a regression line, it will fit well. b. It's a weak positive linear relationship; do not proceed with a regression line. c. It is a weak positive linear relationship; proceed with caution with your regression line.

b. It's a weak positive linear relationship; do not proceed with a regression line.

A confidential survey is one in which they cannot link you to your data. a. True b. False

b. False

IQR is affected by outliers. a. True b. False

b. False
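
To see the contrast with the standard deviation, here is a minimal sketch using hypothetical data in which one value is replaced by an extreme outlier. It assumes the default quartile method of Python's `statistics.quantiles`; other quartile conventions give slightly different numbers but behave the same way.

```python
import statistics

def iqr(values):
    """Interquartile range, Q3 - Q1, using statistics.quantiles' default method."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical data: identical except that the largest value becomes an outlier.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]

print(iqr(clean), iqr(with_outlier))                            # IQR does not move
print(statistics.stdev(clean), statistics.stdev(with_outlier))  # SD changes a lot
```

The IQR looks only at the middle 50% of the data, so a single extreme value in the tail does not change it in this example, while the standard deviation is pulled up sharply.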

In our lecture notes is an example involving two hospitals, A and B. If you compare patient outcomes for the hospitals, B is safer (has a lower death rate). But if you look only at the patients in poor condition, A is safer, and if you only look at the patients in good condition, A is safer. What is going on with this example? a. Bayes Rule b. The Law of Reverse Probability c. Simpson's Paradox d. There must have been some mistake in the data.

c. Simpson's Paradox

An experiment gives 3 different dosage levels of a drug to 3 groups of people. The first dosage level is a fake pill (or placebo) for comparison. We measure the blood pressure of the participants before and after the study and write down the amount by which blood pressure changed. What is the response variable? a. drug b. dosage level c. blood pressure change d. blood pressure

c. blood pressure change

You randomly choose 100 students from Stat 1350 to take a survey. 60 of them take the survey. What can occur with the other 40 people? a. response bias b. undercoverage c. nonresponse bias

c. nonresponse bias

Your company operates in 4 regions and your boss numbers them 1, 2, 3, 4. Is this variable quantitative or categorical? a. Categorical b. Quantitative

a. Categorical

The third quartile is the same thing as the _____________ percentile. a. 3rd b. 75th c. .75 d. 30th

b. 75th

What is the statistical definition of a random sample? Choose the best answer. a. Each individual has the same chance of being selected. b. Every sample of that same size has an equal chance of being selected.

b. Every sample of that same size has an equal chance of being selected.

A flat histogram indicates no variability in the data. a. True b. False

b. False

When an individual in the sample responds but does not give the correct data, this is called: a. Response bias b. Response rate c. Nonresponse d. Undercoverage

a. Response bias

P(A) = .20, P(B) = .30, P(A and B) = .06. Are A and B independent? a. No b. Not enough information to tell c. Yes

c. Yes
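
The check behind this answer is whether P(A and B) equals P(A) x P(B). A minimal sketch with the numbers from the question; `math.isclose` is used only to avoid floating-point surprises.

```python
import math

p_a, p_b, p_a_and_b = 0.20, 0.30, 0.06

# A and B are independent exactly when P(A and B) = P(A) * P(B).
independent = math.isclose(p_a * p_b, p_a_and_b)

# Equivalent check: P(A|B) should equal P(A).
p_a_given_b = p_a_and_b / p_b

print(independent, round(p_a_given_b, 2))
```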

If A and B are independent events with P(A) = 0.20 and P(B) = 0.60, then P(A|B) is: a. 0.20 b. 0.60 c. 0.12 d. None of the above / Can't tell without more information.

a. 0.20

Suppose 35% of OSU students own an iPad, 25% own a laptop, and 10% own both. What percentage of OSU students own at least one of those items? a. 50% b. 60% c. 70% d. None of the above

a. 50%
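
The answer uses the addition rule, P(A or B) = P(A) + P(B) - P(A and B). A minimal sketch with the numbers from the question; the variable names are illustrative.

```python
p_ipad = 0.35
p_laptop = 0.25
p_both = 0.10

# Subtract the overlap once so students who own both are not counted twice.
p_at_least_one = p_ipad + p_laptop - p_both

print(round(p_at_least_one, 2))
```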

Suppose the probability of having a female in your stat class is .6, and of the females in your stat class, 30% are accounting majors. Of the males in your class, 40% are accounting majors. What percentage of all students in your stat class are accounting majors? a. 70% b. 34% c. 12% d. Can't tell without more information.

b. 34%

If there is no relationship between two variables in a two-way table, then the two variables are said to be: a. Independent b. Dependent (also known as Not Independent) c. Not enough information to tell.

a. Independent

Two buses (Bus A and Bus B) take all the children home from their school. Bus A takes 40% of the children; Bus B takes 60% of the children. Of those who ride Bus A, 20% are in kindergarten. Of those who ride Bus B, 10% are in kindergarten. Suppose a bus rider is in kindergarten. Are they more likely to ride Bus A or Bus B? a. More likely to ride Bus A. b. More likely to ride Bus B. c. Equally likely to ride either Bus. d. Can't tell with the information given.

a. More likely to ride Bus A.
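
The comparison can be worked out by multiplying along each branch of the tree and then dividing by the total. A minimal sketch using the numbers in the question; the variable names are illustrative.

```python
p_bus_a, p_bus_b = 0.40, 0.60
p_k_given_a, p_k_given_b = 0.20, 0.10   # P(kindergarten | bus)

# "And" probabilities for each branch of the tree.
p_a_and_k = p_bus_a * p_k_given_a
p_b_and_k = p_bus_b * p_k_given_b

# Total probability that a rider is in kindergarten.
p_k = p_a_and_k + p_b_and_k

# Reverse the conditioning: P(bus | kindergarten).
p_a_given_k = p_a_and_k / p_k
p_b_given_k = p_b_and_k / p_k

print(round(p_a_given_k, 3), round(p_b_given_k, 3))
```

Bus A carries fewer children overall, but its higher kindergarten rate gives it the larger share of kindergarteners (.08 versus .06 of all riders).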

SSE is equal to what? a. The Sum of Squares for Error for any line going through the data. b. The Sum of Squares for Error for only the best line going through the data.

a. The Sum of Squares for Error for any line going through the data.

A and A^c (the complement of A) are disjoint events. (Hint: what does it mean for events to be disjoint?) a. True b. False

a. True

A confounding variable can cause the results of a two-way table to reverse when it is added to the data set. a. True b. False

a. True

If you multiply every single number in a data set by the same value, the standard deviation is also multiplied by that same value. a. True b. False

a. True
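
A quick numerical check of the scaling rule. This is a minimal sketch; the data set and the multiplier of 3 are hypothetical values chosen for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
scaled = [3 * x for x in data]   # multiply every number by the same value

sd_original = statistics.stdev(data)
sd_scaled = statistics.stdev(scaled)

print(sd_original, sd_scaled, sd_scaled / sd_original)
```

Multiplying stretches every distance from the mean by the same factor, so the standard deviation is stretched by that factor too. For a negative multiplier the standard deviation is multiplied by its absolute value, since a standard deviation is never negative.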

Which type of probabilities are in each of the 4 cells of a two-way table of probabilities? a. Conditional probabilities b. "And" probabilities c. Marginal probabilities d. None of the above

b. "And" probabilities

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15; Branch B: chance of running out of single dollars in a day is .05; Branch C: chance of running out of single dollars in a day is .10. What is the chance that you go to Branch A and they will have run out of single dollars? Choose the closest answer. a. .15 b. .03

b. .03

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15; Branch B: chance of running out of single dollars in a day is .05; Branch C: chance of running out of single dollars in a day is .10. What is the chance that you go to any branch of this business and they will have run out of single dollars? Choose the closest answer. a. .15 b. .05 c. .10 d. .20

c. .10

P(A) = .3, P(B|A) = .4, P(B) = .5. What is P(A and B)? a. .20 b. None of these choices is correct. c. .15 d. .12

d. .12

P(A) = .2 and P(B) = .3. Suppose A and B are independent. What is P(A or B)? Choose the closest answer. a. Not enough information to tell b. .50 c. .40 d. .30

c. .40

P(A) = .2 and P(B) = .3. Suppose A and B are disjoint. What is P(A or B)? Choose the closest answer. a. .50 b. Not enough information to tell c. .10 d. .30

a. .50

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15; Branch B: chance of running out of single dollars in a day is .05; Branch C: chance of running out of single dollars in a day is .10. Which Branch is most likely to run out of single dollars in a day? a. Branch A b. Branch C c. Branch B

a. Branch A

A business has 3 branches, A, B, and C. Branch A gets 20% of the business, Branch B gets 50%, and Branch C gets 30%. We know the following information: Branch A: chance of running out of single dollars in a day is .15; Branch B: chance of running out of single dollars in a day is .05; Branch C: chance of running out of single dollars in a day is .10. Suppose a Branch has run out of single dollars and you get the phone call. Is it most likely to be Branch A, Branch B, or Branch C? a. A or C b. A c. C d. B or C e. A or B f. B g. All have the same chance

a. A or C (they tie: .20 x .15 = .03 and .30 x .10 = .03, versus .50 x .05 = .025 for B)
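
The branch questions all come from the same tree: multiply along each branch for the "and" probabilities, add them for the total, then divide to reverse the conditioning. A minimal sketch using the numbers given; the dictionary names are illustrative only.

```python
# Share of business (prior) and chance of running out of singles (likelihood).
share = {"A": 0.20, "B": 0.50, "C": 0.30}
p_out_given_branch = {"A": 0.15, "B": 0.05, "C": 0.10}

# "And" probabilities: P(branch and runs out) = P(branch) * P(runs out | branch).
joint = {b: share[b] * p_out_given_branch[b] for b in share}

# Total probability of finding no singles at whichever branch you visit.
p_out = sum(joint.values())

# Bayes' rule: P(branch | ran out) = P(branch and ran out) / P(ran out).
posterior = {b: joint[b] / p_out for b in share}

for b in share:
    print(b, round(joint[b], 3), round(posterior[b], 3))
print("P(run out) =", round(p_out, 3))
```

The total works out to .085, which is why .10 is the closest listed choice, and the posteriors for A and C are equal, which is why the phone call is equally likely to come from either.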

To find the best fitting line, you find the line with the ________ SSE. a. Largest b. Smallest c. It depends on the data set

b. Smallest

Which type of graph of quantitative data fits the following description: It shows skewed vs. symmetric shapes; it's easy to determine center and variability; it's good for skewed data sets; and it's easy to compare data sets: a. Histogram b. Boxplot c. Pie chart d. None of the above

b. Boxplot

Bob runs an experiment to see which brand of paper towel is more absorbent: Brand A or Brand B. He takes a random sample of 10 sheets from each brand of paper towel and puts each sheet in a cup of water and measures how much water was absorbed by the sheet by squeezing it tightly for 10 seconds and weighing the water that comes out. What is the independent variable? a. Weight of the water squeezed out b. Which brand is more absorbent in the end c. Brand of paper towel (A or B) d. None of the above

c. Brand of paper towel (A or B)

The median is not affected by outliers. a. True b. False

a. True

The wording of a survey question can affect the results: True or False?

True

Correlation is in the same units as X and Y. a. True b. False

b. False

Correlation measures the strength and direction of any relationship between X and Y. a. True b. False

b. False

Standard deviation has no units. a. True b. False

b. False

Suppose the correlation between X and Y is .3. If you double all the X values and double all the Y values, the correlation between 2X and 2Y is .6. a. True b. False

b. False

The 4 cells of a two-way table contain conditional probabilities. a. True b. False

b. False

The second-level branches on a tree are marginal probabilities. a. True b. False

b. False

Which is better to use to see the clearest pattern in the data? a. Histogram b. Boxplot

a. Histogram

A researcher collected data on 210 people who fly, noting the type of flight (connecting or direct) and whether they brought carry-on luggage (yes or no). The data is summarized in the table below. Show the marginal distribution of type of flight using the appropriate graph. Use proper labels on the graph.

Marginal distribution of type of flight (n = 210): P(Connecting) = 100/210 = 47.62%; P(Direct) = 110/210 = 52.38%

Suppose 40% of the new employees at your company are males and 60% of the "old employees" are males. Are gender and type of employee (new/old) independent? a. Yes b. No c. Not enough information to tell

b. No

Suppose 40% of the new employees at your company are males and 30% of the "old employees" are males. What percentage of ALL the employees are male? a. .35 b. Not enough information to tell c. .70

b. Not enough information to tell

If you are predicting U.S. movie box office revenue by using Opening Weekend Revenue, which variable is X and which is Y? a. U.S. box office revenue is X and Opening Weekend Revenue is Y. b. Opening Weekend Revenue is X and U.S. Box office revenue is Y. c. Not enough information to tell.

b. Opening Weekend Revenue is X and U.S. Box office revenue is Y.

Bob is a telemarketer and he makes a sale 20% of the time. Suppose he makes ten calls. What is the chance he makes AT LEAST ONE sale? Show your work.

P(at least 1) = 1 - P(none) = 1 - P(No, No, ..., No) = 1 - P(No)^10 = 1 - .8^10 = .8926
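
The complement rule above can be sketched as a small function. It assumes the calls are independent, which the multiplication step in the worked answer also relies on; `p_at_least_one` is a helper name introduced here for illustration.

```python
def p_at_least_one(p_success, n_trials):
    """P(at least one success in n independent trials) = 1 - P(no successes)."""
    p_none = (1 - p_success) ** n_trials
    return 1 - p_none

ten_calls = p_at_least_one(0.20, 10)   # Bob's ten calls
two_calls = p_at_least_one(0.10, 2)    # same rule with p = .10 and two calls

print(round(ten_calls, 4), round(two_calls, 2))
```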

A recent Maryland highway safety study looked at two variables involving highway accidents: 1) whether the driver was wearing a seat belt; and 2) whether the driver avoided serious injury. They found that: 1) 77% of the time the driver was wearing a seat belt; 2) 92% of the drivers who were wearing a seat belt avoided serious injury, and 3) 63% of the drivers who didn't wear a seat belt avoided serious injury. Find the total percentage of drivers who avoided serious injury. Show work and notation for full credit.

Let S = wore a seat belt, NS = no seat belt, A = avoided serious injury. P(S) = .77, P(A|S) = .92, P(A|NS) = .63. P(A) = P(S)P(A|S) + P(NS)P(A|NS) = .77 x .92 + (1 - .77) x .63 = .77 x .92 + .23 x .63 = .8533

A recent Maryland highway safety study looked at two variables involving highway accidents: 1) whether the driver was wearing a seat belt; and 2) whether the driver avoided serious injury. They found that: 1) 77% of the time the driver was wearing a seat belt; 2) 92% of the drivers who were wearing a seat belt avoided serious injury, and 3) 63% of the drivers who didn't wear a seat belt avoided serious injury. Continuing the previous problem: suppose a driver escaped serious injury. What is the chance they were wearing a seat belt? Show all work and notation for full credit.

P(S|A) = P(S and A)/P(A) = P(S)P(A|S)/P(A) = (.77 x .92)/.8533 = .8302, using P(A) = .8533 from the previous problem
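
Both seat-belt answers can be reproduced in a few lines: the law of total probability gives P(A), and Bayes' rule then gives P(S|A). A minimal sketch; the variable names mirror the notation in the worked answers and are otherwise illustrative.

```python
p_s = 0.77            # P(S): driver wore a seat belt
p_a_given_s = 0.92    # P(A|S): avoided serious injury given seat belt
p_a_given_ns = 0.63   # P(A|NS): avoided serious injury given no seat belt

# Law of total probability: P(A) = P(S)P(A|S) + P(NS)P(A|NS).
p_a = p_s * p_a_given_s + (1 - p_s) * p_a_given_ns

# Bayes' rule: P(S|A) = P(S and A) / P(A).
p_s_given_a = (p_s * p_a_given_s) / p_a

print(round(p_a, 4), round(p_s_given_a, 4))
```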

Suppose the regression line for X and Y is y = 2x+1 and the data points are (1, 3); (1, 4); (2, 3); and (3, 6). What is the SSE for this line? (Hint: Another name for Error is Residual.) You should get one number in the end. Show all work.

Point (1, 3): observed = 3, predicted = 2(1) + 1 = 3; residual = observed - predicted = 3 - 3 = 0
Point (1, 4): observed = 4, predicted = 2(1) + 1 = 3; residual = 4 - 3 = 1
Point (2, 3): observed = 3, predicted = 2(2) + 1 = 5; residual = 3 - 5 = -2
Point (3, 6): observed = 6, predicted = 2(3) + 1 = 7; residual = 6 - 7 = -1
SSE = sum of squared residuals = 0^2 + 1^2 + (-2)^2 + (-1)^2 = 0 + 1 + 4 + 1 = 6
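
The same calculation as a short script, using the line and the four points from the question. The function name `predict` is introduced here for illustration.

```python
def predict(x):
    """Predicted Y from the given line y = 2x + 1."""
    return 2 * x + 1

points = [(1, 3), (1, 4), (2, 3), (3, 6)]

# Residual = observed - predicted, one per data point.
residuals = [y - predict(x) for x, y in points]

# SSE = sum of the squared residuals.
sse = sum(e ** 2 for e in residuals)

print(residuals, sse)
```

Note that the plain sum of these residuals is -2, not 6; squaring first is what keeps positive and negative errors from cancelling.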

A boxplot is a one-dimensional graph. a. True b. False

a. True

A listing of all the possible values of a data set and how often they occur is called a distribution. a. True b. False

a. True

A longer box in the boxplot means more variability in the data. a. True b. False

a. True

If the median is closer to Q1 than it is to Q3 then the data is skewed right. a. True b. False

a. True

25% of people read the paper every day. 30% of women read it, and 20% of men read it. Are gender and reading the paper independent? a. Yes b. No c. Can't tell without more information.

b. No

The equation of a regression line is Y = 20 + 5X where X = hours studied and Y = exam score. Study time data ranged from 8 to 15 hours. Should we interpret the Y-intercept here? a. Yes. If someone studies 0 hours, they are expected to get 20 points. b. No. You should not interpret the Y-intercept in this situation. c. Not enough information to tell.

b. No. You should not interpret the Y-intercept in this situation.

The probability that a person will buy something when a telemarketer calls is .10. A telemarketer calls two people at random. What is the probability of getting at least one person to buy something? a. .11 b. .19 c. .01 d. None of the above

b. .19

If r = -.7, what is the value of the coefficient of determination? a. -.49 b. .49 c. .70 d. -.70 e. None of these / not enough information to tell.

b. .49

What does it mean for a sample to be truly random, according to our notes? a. Every individual in the sample has the same chance of being selected. b. Every sample of the same size has the same chance of being selected. c. Every individual in the population has the same chance of being selected. d. None of the above.

b. Every sample of the same size has the same chance of being selected.

Bob picks a name from the phone book using a random number generator, and then takes the first 100 names that come after that to make a sample. Is Bob's sample random? a. True b. False

b. False

There can be different amounts of data in each section of a boxplot. a. True b. False

b. False

Undercoverage means you had a lot of nonresponse in your sample. a. True b. False

b. False

The five-number summary of a single data set of 100 numbers would be which of the following? a. The 5 numbers that are marked off on a boxplot b. The two means, two standard deviations, and the correlation c. The mean, median, standard deviation, Q1, and Q3

a. The 5 numbers that are marked off on a boxplot

