stat 200 PSU

[Fill in the blank] In an experiment, a person guesses which one of three different cards a researcher has randomly picked (and hidden from the person who guesses). This is repeated four times, replacing the cards each time. Let X = number of correct guesses in the four tries. The probability distribution for X, assuming the person is just guessing, is partially provided above. What is the value of the missing probability P(X = 4)?

0.01 (rounded; all four guesses correct has probability (1/3)^4 = 1/81 ≈ 0.012)
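The missing probability can be checked with the binomial formula. This is a minimal sketch, assuming each guess is independent with a 1-in-3 chance of being correct; the helper name `binom_pmf` is illustrative and not part of the course materials.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumption: 3 equally likely cards and 4 independent tries
p_all_four = binom_pmf(4, 4, 1 / 3)
print(round(p_all_four, 4))  # 0.0123
```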

Which test would be most appropriate for the following: Is average semester book cost for PSU students more than $400?

Test of One Mean. The data would be gathered by recording the semester book cost for each sampled student and comparing the mean of this sample to $400, which makes this a test of one mean.

Which test would be most appropriate for the following: Is there a difference in two repair shops in their estimates to repair damaged cars?

Test of Paired Means. The data would be gathered by taking a sample of damaged cars to both repair shops and recording the estimate each shop gives for each car. Because the two estimates are compared car by car, this is a test of paired means.

Which test would be most appropriate for the following: Is there a gender difference in the amount of money one is willing to spend on a date?

Test of Two Independent Means. The data would be gathered by asking study participants their Gender, Male or Female, and how much money they are willing to spend on a date; a quantitative response. The mean dollar amount from the Female responses would be compared to the mean dollar amount from the Male responses making this a test of two independent means.

[Fill in the blank] Ellen is taking 4 courses for the semester. She believes that the probability distribution function for X = the number of courses for which she will get an A grade is given above. What is the expected number of A's she will get? E(X) = ________ A's

1.75

Which test would be most appropriate for the following: Is there a gender difference in attitude toward dating someone with a great personality even if you did not find that person attractive?

Test of Two Proportions. The data would be gathered by asking study participants their Gender, Male or Female, and whether they would date someone with a great personality even if they did not find that person attractive, Yes or No. The proportion of Female Yes responses would be compared to the proportion of Male Yes responses, making this a test of two proportions.

Identify the following random variable as either discrete or continuous. The number of new accounts opened at a bank during a certain month A) Discrete B) Continuous

A

Correctly identify whether the following situations satisfy the conditions required to conduct a Binomial experiment. Five percent of all VCRs manufactured by a large electronics company are defective. Three VCRs are randomly selected from the production line of this company. The selected VCRs are inspected to determine whether each of them is defective or good. A) Binomial B) NOT Binomial

A

In a past General Social Survey, 87% of a random sample of n = 990 respondents answered yes to the question "Would you approve of an adult male punching a stranger if the stranger had broken into the man's house?" A 90% confidence interval for the proportion of all Americans who approve of punching an intruder is A) 0.852 to 0.888 B) 0.849 to 0.891 C) 0.845 to 0.895 D) 0.842 to 0.898

A A confidence interval is found by sample statistic ± Z multiplier × Standard Error, where the standard error of p-hat is sqrt[p-hat(1 - p-hat)/n]. With p-hat of 0.87, Z multiplier of 1.65 and n = 990, the 90% confidence interval is 0.87 ± 1.65(0.0107), or 0.852 to 0.888

In a survey of n = 950 randomly selected individuals, 17% answered yes to the question "Do you think the use of marijuana should be made legal or not?" A 90% confidence interval for the proportion of all Americans in favor of legalizing marijuana is A) 0.150 to 0.190 B) 0.146 to 0.194 C) 0.142 to 0.198 D) 0.139 to 0.201

A A confidence interval is found by sample statistic ± Z multiplier × Standard Error. With p-hat of 0.17, Z multiplier of 1.65 and n = 950, the 90% confidence interval is 0.150 to 0.190

In a nationwide survey of n = 1,030 adults, 6% answered yes to the question "During the last year did anyone break into or somehow illegally get into your home or apartment?" A 90% confidence interval for the proportion of all Americans who had their homes broken into is A) 0.048 to 0.072 B) 0.045 to 0.075 C) 0.043 to 0.077 D) 0.041 to 0.079

A A confidence interval is found by sample statistic ± Z multiplier × Standard Error. With p-hat of 0.06, Z multiplier of 1.65 and n = 1030, the 90% confidence interval is 0.048 to 0.072
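The three survey intervals above all use the same recipe, so they can be checked together. This is a sketch under the cards' own assumption of a Z multiplier of 1.65 for 90% confidence; the function name `proportion_ci` is hypothetical.

```python
from math import sqrt

def proportion_ci(p_hat, n, z=1.65):
    """Confidence interval for a proportion: p_hat +/- z * standard error."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se
    return (p_hat - margin, p_hat + margin)

# (p-hat, n) pairs taken from the three survey questions
for p_hat, n in [(0.87, 990), (0.17, 950), (0.06, 1030)]:
    low, high = proportion_ci(p_hat, n)
    print(f"{low:.3f} to {high:.3f}")
# 0.852 to 0.888
# 0.150 to 0.190
# 0.048 to 0.072
```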

Which statement is correct about a p-value? A) The smaller the p-value the stronger the evidence in favor of the alternative hypothesis. B) The smaller the p-value the stronger the evidence in favor of the null hypothesis. C) Whether a small p-value provides evidence in favor of the null hypothesis depends on whether the test is one-sided or two-sided. D) Whether a small p-value provides evidence in favor of the alternative hypothesis depends on whether the test is one-sided or two-sided.

A A small p-value provides stronger evidence that the results produced by the sample did not occur by random chance but instead were the result of the null hypothesis being incorrect. Therefore, the smaller the p-value the stronger the evidence in support of the alternative hypothesis, Ha

Null and alternative hypotheses are statements about A) population parameters. B) sample parameters. C) sample statistics. D) it depends: sometimes population parameters and sometimes sample statistics.

A As with confidence intervals, hypothesis statements are statements made toward the population, thus use population parameters

According to a 2001 study of college students by Harvard University's School of Public Health, 19.3% of those included in the study abstained from drinking (USA TODAY, April 3, 2002). Suppose that of all current college students in the United States, 20% abstain from drinking. A random sample of four college students is selected with the following binomial results. Binomial with n = 4 and p = 0.2: P(X = 0) = 0.4096, P(X = 1) = 0.4096, P(X = 2) = 0.1536, P(X = 3) = 0.0256, P(X = 4) = 0.0016. From the table above, what is the probability that at least four college students in this sample abstain from drinking? A) 0.0016 B) 0.9984 C) 1.00

A At least 4 students means 4 or more. In this scenario this would mean only 4 since there were only 4 students in the sample. The P(X=4) is 0.0016
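The binomial table in this question can be regenerated from the formula, which also makes "at least four" easy to check. This is an illustrative sketch; `binom_pmf` is a hypothetical helper, not output from the course software.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.2
table = {k: round(binom_pmf(k, n, p), 4) for k in range(n + 1)}
print(table)  # {0: 0.4096, 1: 0.4096, 2: 0.1536, 3: 0.0256, 4: 0.0016}

# "At least four" out of four students means X = 4 only
p_at_least_four = sum(binom_pmf(k, n, p) for k in range(4, n + 1))
print(round(p_at_least_four, 4))  # 0.0016
```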

Which one of the following statements is true? A) Increasing the sample size of a survey decreases the margin of error. B) Increasing the sample size of a survey increases the margin of error. C) Increasing the sample size of a survey decreases the impact of response bias. D) Increasing the sample size of a survey increases the impact of response bias.

A Consider the equation for margin of error; it involves dividing by the square root of the sample size n, so as n increases the margin of error decreases. Also, the idea of a confidence interval is to estimate the true parameter value: the larger the sample size, the closer the sample comes to the population, so the error should decrease

A survey asked people how often they exceed speed limits. The data are then categorized into the above contingency table of counts showing the relationship between age group and response. Among people with age over 30, what's the "risk" of always exceeding the speed limit? A) 0.20 B) 0.40 C) 0.33 D) 0.50

A Feedback: Risk is found by the number in the group of interest divided by the total in that group: 40/200 = 0.20

What percent of the sample were males? A) 43.6% B) 48.5% C) 56.4% D) 77.2%

A Feedback: Take 217 divided by 498 times 100% for 43.6%

The time taken for a computer to boot up, X, follows a normal distribution with mean 30 seconds and standard deviation 5 seconds. What is the standardized score (z-score) for a boot-up time of x = 20 seconds? A) -2.0 B) 0.0 C) 1.0 D) 2.0

A Feedback: The z-score is found by (observed - mean)/SD = (20 - 30)/5 = -2.0
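The standardization step is a one-line calculation. A minimal sketch, with `z_score` as an illustrative helper name:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Boot-up time of 20 seconds, mean 30, standard deviation 5
print(z_score(20, 30, 5))  # -2.0
```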

A statistics class has 4 teaching assistants (TAs): three female assistants (Lauren, Rona, and Leila) and one male assistant (Josh). Each TA teaches one discussion section. Two students, Bill and Tom, who don't know each other, each pick a discussion section. The two events B = {Bill's TA is Lauren} and T = {Tom's TA is a woman} are A) independent events. B) disjoint (mutually exclusive) events. C) each simple events. D) None of the above.

A Feedback: These events are independent since one event does not influence the other event

Regression Analysis: Weight versus Height
The regression equation is: Weight = 21.2 + 1.94 Height
Predictor  Coef    SE Coef  T     P
Constant   21.19   22.05    0.96  0.338
Height     1.9419  0.3247   5.98  0.000
S = 28.6963   R-Sq = 13.8%
Based on the p-value for HEIGHT, we can reject the null hypothesis and conclude that there exists a linear relationship between Height and Weight. A) True B) False

A Feedback: True. Since the reported p-value is ≈ 0, it is less than any common significance level. Thus we would reject Ho that the slope is 0 and conclude that a statistically significant linear relationship exists between Height and Weight.

Suppose that the money spent by students at PSU on a Friday night is a normal random variable with a mean of 25 dollars and a standard deviation of 2 dollars. Using the Standard Normal Table, find the probability that a student will spend more than 28 dollars on a Friday night. A) 0.0668 B) 0.9332 C) 0.1469 D) 0.8531

A First you need to find the z-score by taking the observed value of 28 and subtracting from it the mean of 25, giving you a difference of 3. Then divide this difference by the standard deviation of 2, resulting in a z-score of 1.5. Find the probability associated with this score in Table A1 and subtract this value from 1 since you want "more than". The answer is 1 - 0.9332 = 0.0668
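The table lookup can be cross-checked with Python's standard library, which provides the normal cumulative distribution directly. This is a sketch using `statistics.NormalDist` in place of Table A1; the variable names are illustrative.

```python
from statistics import NormalDist

# Friday-night spending: normal with mean 25 and standard deviation 2
spending = NormalDist(mu=25, sigma=2)

# "More than 28" is the upper tail, so subtract the cumulative probability from 1
p_more_than_28 = 1 - spending.cdf(28)
print(round(p_more_than_28, 4))  # 0.0668
```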

Your roommate is a statistical dork. They took a physics exam, scored 50%, and are upset with this score. You, however, know that these exam scores can be considered a normal random variable. You find out that the mean of this exam is 46% with a variance of 16%. Using the Standard Normal Table, what is the approximate percentile of your roommate's score? A) 54th percentile B) 60th percentile C) 16th percentile D) 40th percentile

A First you need to recognize that the variance was given and not the standard deviation, so you have to take the square root of the variance to get the SD: with the variance written as a proportion, 0.16, this gives SD = 0.4. Now you can find the percentile by calculating the z-score = (0.50 - 0.46)/0.4 = 0.10. From Table A1 the probability for z = 0.10 is 0.5398, meaning that your roommate scored roughly in the 54th percentile on this exam.

For which of the following situations would the Central Limit Theorem not apply? A) A random sample of size 20 is drawn from a skewed population. B) A random sample of size 50 is drawn from a skewed population. C) A random sample of size 20 is drawn from a bell-shaped population. D) A random sample of size 50 is drawn from a bell-shaped population.

A For sample sizes of 30 or more, the Central Limit Theorem states that even for a skewed population the distribution of the sample mean will approximate a normal distribution; and if the population is bell-shaped, then so too will be the distribution of the sample means, regardless of sample size. Thus the response that does not follow these rules is the sample of size 20 from a skewed population

You are applying to graduate schools and need to take the Graduate Record Examination (GRE). The quantitative (math) portion of this exam is a normal random variable with a mean of about 600 and standard deviation of 150. What score on this quantitative section of the GRE do you need in order to fall in the 67th percentile? A) 666 B) 712 C) 544 D) 650

A Here you need to solve for the observed value for a given z-score. The z-score associated with the 67th percentile is 0.44. To find the observed value: z*SD + Mean = (0.44)*(150) + 600 = 666.
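Going from a percentile back to a score reverses the usual lookup. The sketch below uses `statistics.NormalDist.inv_cdf` as a stand-in for reading the Standard Normal Table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

# z-score whose cumulative probability is 0.67 (the 67th percentile)
z = NormalDist().inv_cdf(0.67)

# Convert back to the GRE scale: mean 600, standard deviation 150
score = 600 + z * 150
print(round(z, 2), round(score))  # 0.44 666
```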

Determine if the statement is a typical null hypothesis (Ho) or alternative hypothesis (Ha). There is no difference between the proportion of overweight men and overweight women in America. A) Null hypothesis B) Alternative hypothesis

A Ho refers to no difference or change or equal to. Ha will be the research hypothesis that involves either a difference, greater than, or less than.

A pop quiz in a class resulted in the following eight quiz scores: 0, 60, 66, 78, 82, 96, 98, 100. A five number summary for these test scores is A) 0, 63, 80, 97, 100. B) 66, 78, 82, 96, 98. C) 0, 66, 82, 98, 100. D) 0, 25, 50, 75, 100.

A Min = 0; Q1 = 63; Med = 80; Q3 = 97; Max = 100
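The quartiles here are the medians of the lower and upper halves of the sorted data. That convention is an assumption inferred from the answer; other software may use slightly different quartile rules and return different values. A minimal sketch with a hypothetical helper `five_number_summary`:

```python
from statistics import median

def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [0, 60, 66, 78, 82, 96, 98, 100]
print(five_number_summary(scores))  # (0, 63.0, 80.0, 97.0, 100)
```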

For the given situation, decide if the random variable described is a discrete random variable or a continuous random variable. Random variable X = the number of letters in a word picked at random out of the dictionary. A) Discrete random variable B) Continuous random variable

A Number of letters is discrete: you cannot, say, have 1.5 letters

The table above shows the responses from a sample of 680 people in the General Social Survey to the question, "Do you sometimes drink more than you think you should?" What is the odds ratio for women thinking they drank more than they should compared to men? A) 0.41 B) 0.57 C) 1.76 D) 2.41

A Odds ratio compares the odds (e.g. the number of women saying yes to number of women saying no) of one group to the odds of another. Here we are comparing women's odds to men's odds ( 92/260)/(151/177) = 0.41
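The odds ratio arithmetic can be written out as a small function. The counts are the ones quoted in the feedback (92 yes and 260 no for women, 151 yes and 177 no for men); the function name `odds_ratio` is illustrative.

```python
def odds_ratio(yes_1, no_1, yes_2, no_2):
    """Odds of 'yes' in group 1 divided by odds of 'yes' in group 2."""
    return (yes_1 / no_1) / (yes_2 / no_2)

# Women's odds compared to men's odds
women_vs_men = odds_ratio(92, 260, 151, 177)
print(round(women_vs_men, 2))  # 0.41
```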

Statistic is to sample as parameter is to A) population. B) sample size. C) mean. D) estimate.

A Remember: Statistic is to Sample as Parameter is to Population

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.40 Ha: ρ < 0.40 Use the Standard Normal Table to calculate the p-value for this hypothesis test. For z-statistic = -1.50 the p-value is: A) 0.0668 B) 0.1469 C) 0.8531 D) 0.9332

A Since Ha is "<" we find the p-value by P(Z < z) where z is the z-statistic: P(Z < -1.50) = 0.0668

In the survey of a random sample of students at a university, two questions were "How many hours per week do you usually study?" and "Have you smoked marijuana in the past six months?" An analysis of the results produced the above output. A research question of interest is whether students who have smoked marijuana (group 1) in the past 6 months study fewer hours on average per week than those who have not (group 2). Based on the information given in the output, what conclusion can be made about the difference in time spent studying for the two groups? A) There is a statistically significant difference. B) There is not a statistically significant difference. C) It is impossible to know if there is a statistically significant difference because no p-value is provided. D) It is impossible to know if there is a statistically significant difference because the test is one-sided and the information provided is two-sided.

A Since the 95% confidence interval does not contain 0 we would reject the Ho and conclude that there is a statistically significant difference.

A researcher examined the folklore that women can predict the sex of their unborn child better than chance would suggest. She asked 104 pregnant women to predict the sex of their unborn child, and 57 guessed correctly. Using these data, the researcher created the above output. Which choice describes how the p-value was computed in this situation? A) The probability that a z-score would be greater than or equal to 0.98. B) The probability that a z-score would be less than or equal to 0.98. C) The total of the probabilities that a z-score is greater than or equal to 0.98 and less than or equal to -0.98. D) The probability that a z-score would be between -0.98 and 0.98.

A Since the alternative hypothesis is Ha: ρ > 0.5 the p-value is calculated by P(Z > 0.98)
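The z-score of 0.98 comes from the one-proportion test statistic, with the standard error computed under the null value 0.5. This is a sketch, not the software output the question refers to; `one_proportion_z` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """z statistic for Ho: p = p0, using the null standard error."""
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se_null

z = one_proportion_z(57, 104, 0.5)
# Upper-tail probability, matching the ">" alternative
p_value = 1 - NormalDist().cdf(z)
print(round(z, 2))  # 0.98
```

The resulting p-value is roughly 0.16, so it would not be small enough to reject Ho at the usual 0.05 level.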

Is the given percent a statistic or a parameter? A customs inspector sampled 5 boxes among 20 boxes being shipped from out of the country. He found that one of the five boxes (20%) contained an illegal food item. A) Statistic B) Parameter

A Since the percent is from a sample it is a statistic

A researcher wants to assess if there is a difference in the average age of onset of a certain disease for men and women who get the disease. Let μ1 = average age of onset for women and μ2 = average age of onset for men. A random sample of 30 women with the disease showed an average age of onset of 83 years, with a sample standard deviation of 11.5 years. A random sample of 20 men with the disease showed an average age of onset of 77 years, with a sample standard deviation of 4.5 years. Assume that ages at onset of this disease are normally distributed for each gender; do not assume the population variances are equal. What are the appropriate null and alternative hypotheses? A) Ho: μ1 = μ2 and Ha: μ1 ≠ μ2 B) Ho: μ1 ≠ μ2 and Ha: μ1 = μ2 C) Ho: μ1 = μ2 and Ha: μ1 < μ2 D) Ho: μ1 = μ2 and Ha: μ1 > μ2

A Since the researcher is interested in detecting only a difference this would imply that any difference will do, thus the Ha is ≠

Three playing cards are chosen at random, without replacement, from a standard 52-card deck. What is the probability of getting an Ace, a King and a Queen in that order? A) 8/16575 B) 1/2197 C) 6/35152

A Since these are done without replacement, the probability of getting Ace, a King and a Queen in order is equal to (4/52)* (4/51)*(4/50) = (1/13)*(4/51)*(2/25) = 8/16575
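Exact fractions make it easy to confirm the simplification. A small sketch using Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# Without replacement, the deck shrinks by one card after each draw
p_ace = Fraction(4, 52)
p_king_given_ace = Fraction(4, 51)
p_queen_given_ace_king = Fraction(4, 50)

p_in_order = p_ace * p_king_given_ace * p_queen_given_ace_king
print(p_in_order)  # 8/16575
```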

Statistical inference takes information from the ___________ and makes inferences to the ____________. A) sample; population B) population; sample C) neither

A Statistical inference uses samples to infer about a population

The table above shows the opinions of 953 respondents in the General Social Survey to the question "If your party nominated a woman for President, would you vote for her if she were qualified for the job?" What percent of females would vote for a woman president? A) 88.1% B) 51.2% C) 59.3% D) 11.9%

A Take 488 divided by 554 times 100% for 88.1%

The table above shows the counts by gender and highest degree attained for 498 respondents in the General Social Survey. What percent of the sample were males with no high school degree? A) 9.8% B) 20.3% C) 22.6% D) 48.5%

A Take 49 divided by 498 times 100% for 9.8%

The correlation between two variables is given by r = 0.0. What does this mean? A) The best straight line through the data is horizontal. B) There is a perfect positive relationship between the two variables C) There is a perfect negative relationship between the two variables. D) All of the points must fall exactly on a horizontal straight line.

A The best straight line would be horizontal since the slope would be zero.

Regression Analysis: Weight versus Height
The regression equation is: Weight = 21.2 + 1.94 Height
Predictor  Coef    SE Coef  T     P
Constant   21.19   22.05    0.96  0.338
Height     1.9419  0.3247   5.98  0.000
S = 28.6963   R-Sq = 13.8%
From the regression output, what is the value of the correlation between Height and Weight? A) 0.3715 B) 0.612 C) 0.612 D) 0.612 or 0.612

A The correct answer is 0.3715. We find this by taking the square root of R-Sq = 0.138; the correlation is positive since the slope (1.94) in our regression equation is positive.
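In simple linear regression, the correlation is the square root of R-squared with the sign of the slope attached. A minimal sketch of that step, using the values from the output; the variable names are illustrative.

```python
from math import sqrt, copysign

r_squared = 0.138  # R-Sq = 13.8% from the regression output
slope = 1.94       # coefficient on Height

# The square root gives the magnitude; the slope supplies the sign
r = copysign(sqrt(r_squared), slope)
print(round(r, 4))  # 0.3715
```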

Which of the following correlations indicates a stronger linear relationship between two variables? A) -0.90 B) 0.75 C) 0.50 D) 1.25

A The correct answer is -0.90. Recall that the absolute value of the correlation indicates the strength, and the correlation cannot be less than -1 nor greater than +1 (thus ruling out 1.25 as an answer). The sign just indicates the direction of the relationship (positive or negative) and has no bearing on the strength of the relationship.

If the target population is all U.S. adults, then a telephone directory would not make a good sampling frame because A) people with unlisted numbers or no phones are not included. B) not everyone can understand English. C) some people have more than one phone number. D) some people will not answer their phone.

A The directory would not include those people whose phone numbers were not listed.

A 95% confidence interval for the proportion of women that has ever dozed off while driving is 0.07 to 0.14. For men, a 95% confidence interval for the proportion that has ever dozed off while driving is 0.19 to 0.25. Assume both intervals were computed using large random samples. What conclusion can be made about the population proportions that have dozed off while driving? A) It is reasonable to conclude that there is a difference between men and women. B) It is not reasonable to conclude that there is a difference between men and women. C) It is reasonable to conclude that there is a difference of .05 between men and women. D) No conclusion is possible because we don't know the margin of error.

A The intervals do not overlap indicating that a statistically significant difference exists

Which of the following is not a correct way to state a null hypothesis? A) Ho: B) Ho: μd = 10 C) Ho: μ = 0 D) Ho: μ = 0.5

A The null (and alternative) hypotheses need to have a reference to a population, thus simply writing Ho would be incorrect.

The table above shows the opinions of 953 respondents in the General Social Survey to the question "Everything considered, would you say that in general, you approve or disapprove of wiretapping?" The purpose of examining the data is to see if there is a gender difference in how people would respond to this question. State the null hypothesis for this study. A) There is no relationship in the population between gender and approval of wiretapping B) There is no relationship in the sample between gender and approval of wiretapping C) There is a relationship in the population between gender and approval of wiretapping D) There is a relationship in the sample between gender and approval of wiretapping

A The null hypothesis speaks of no relationship between the variables in the population

If the correlation between a response variable Y and explanatory variable X is 0.5, what is the value that defines how much variation in Y is explained by X? A) 25% B) 5% C) 25% D) 5%

A The phrase in the question "what is the value that defines how much variation in Y is explained by X" refers to the Coefficient of Determination, or r². Since r = 0.5, r² would be equal to (0.5)*(0.5) = 0.25 or 25%

If the correlation between a response variable Y and explanatory variable X is -0.7, what is the value that defines how much variation in Y is explained by X? A) 49% B) 7% C) 49% D) 7%

A The phrase in the question "what is the value that defines how much variation in Y is explained by X" refers to the Coefficient of Determination, or r². Since r = -0.7, r² would be equal to (-0.7)*(-0.7) = 0.49 or 49%

A thumbtack is tossed repeatedly and observed to see if the point lands resting on the floor or sticking up in the air. The goal is to estimate the probability that a thumbtack would land with the point up. That probability is an example of A) a relative frequency probability based on long-run observation. B) a relative frequency probability based on physical assumptions. C) a personal probability. D) a probability based on measuring a representative sample and observing relative frequencies that fall into various categories.

A The probability will be based on the number of outcomes after observing numerous trials; i.e. a relative frequency probability based on long-run observation

According to a 2001 study of college students by Harvard University's School of Public Health, 19.3% of those included in the study abstained from drinking (USA TODAY, April 3, 2002). Suppose that of all current college students in the United States, 20% abstain from drinking. A random sample of four college students is selected with the following binomial results. Binomial with n = 4 and p = 0.2: P(X = 0) = 0.4096, P(X = 1) = 0.4096, P(X = 2) = 0.1536, P(X = 3) = 0.0256, P(X = 4) = 0.0016. From the table above, what is the probability that at least one student BUT no more than three students in this sample abstain from drinking? A) 0.5888 B) 0.1792 C) 0.5632

A The question asks for at least 1 but no more than 3, meaning either 1, 2 or 3 students abstain. The answer is found by adding P(X=1) + P(X=2) + P(X=3) = 0.5888

Suppose on a highway with a speed limit of 65 mph, the speeds of cars are independent and normally distributed with an average speed = 65 mph and standard deviation = 5 mph. What is the standard deviation for the sample mean speed in a random sample of n = 10 cars? A) 1.58 B) 13 C) 32.5

A The standard deviation of the sample mean equals SD/sqrt(n) = 5/sqrt(10) = 1.58

Suppose on a highway with a speed limit of 65 mph, the speeds of cars are independent and normally distributed with an average speed = 65 mph and standard deviation = 5 mph. What is the standard deviation for the sample mean speed in a random sample of n = 100 cars? A) 0.5 B) 13 C) 3.25

A The standard deviation of the sample mean equals SD/sqrt(n) = 5/sqrt(100) = 0.50
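The two highway questions differ only in n, which shows how the standard deviation of the sample mean shrinks as the sample grows. A short sketch with an illustrative helper `se_mean`:

```python
from math import sqrt

def se_mean(sd, n):
    """Standard deviation of the sample mean: sd divided by sqrt(n)."""
    return sd / sqrt(n)

print(round(se_mean(5, 10), 2))   # 1.58
print(round(se_mean(5, 100), 2))  # 0.5
```

Going from 10 cars to 100 cars multiplies n by 10 but only cuts the standard deviation of the mean by a factor of sqrt(10), about 3.16.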

Based on the 2000 Census, the proportion of the California population aged 15 years old or older who are married is p = 0.524. Suppose n = 1000 persons are to be sampled from this population and the sample proportion of married persons (ρhat) is to be calculated. What is the standard deviation of the sampling distribution of ρhat? A) 0.0158 B) 0.0166 C) 0.2494 D) 0.5240

A The standard error (or standard deviation of the sampling distribution) is found by taking the square root of [(0.524) (0.476)/1000] = 0.0158

In a random sample of 1000 students, 80% were in favor of longer hours at the school library. The standard error of ρhat is approximately: A) 0.013 B) 0.160 C) 0.640. D) 0.800

A The standard error is found by taking the square root of [(0.80)(0.20)/1000] = 0.013
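Both standard errors for a proportion follow the same formula, sqrt[p(1 - p)/n]. A minimal sketch covering the California and library examples; `se_proportion` is a hypothetical helper name.

```python
from math import sqrt

def se_proportion(p, n):
    """Standard error of a sample proportion."""
    return sqrt(p * (1 - p) / n)

print(round(se_proportion(0.524, 1000), 4))  # 0.0158
print(round(se_proportion(0.80, 1000), 3))   # 0.013
```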

Which one of the following statements is false? A) The standard error measures the variability of a population parameter. B) The standard error of a sample statistic measures, roughly, the average difference between the values of the statistic and the population parameter. C) Assuming a fixed value of s = sample standard deviation, the standard error of the mean decreases as the sample size increases. D) The standard error of a sample proportion decreases as the sample size increases

A The standard error measures the variability of a sample statistic not a population parameter

A safety officer wants to prove that μ = the average speed of cars driven by a school is less than 25 mph. Suppose that a random sample of 14 cars shows an average speed of 24.0 mph, with a sample standard deviation of 2.2 mph. Assume that the speeds of cars are normally distributed. What are the appropriate null and alternative hypotheses? A) Ho: μ = 25 and Ha: μ < 25 B) Ho: μ = 25 and Ha: μ > 25 C) Ho: μ = 25 and Ha: μ ≠ 25 D) Ho: μ ≠ 25 and Ha: μ = 25 E) Ho: xbar = 24 and Ha: xbar < 24 F) Ho: xbar= 24 and Ha: xbar > 24 G) Ho: xbar = 24 and Ha: xbar≠ 24 H) Ho: xbar ≠ 24 and Ha: xbar = 24

A The word less is the key term in determining the correct Ha expression. It implies that the investigator is only interested in whether the true population mean is less than 25. The value of 24 is the sample mean, and hypotheses are stated about the population parameter μ, not about xbar

Let us say, for example, that I decide to make the final grading for this class random. I put into a bowl the following: 5 pieces of paper with "A", 2 pieces of paper with "B" and 3 pieces of paper with "C". If I randomly draw one piece of paper from the bowl to determine your final grade, what is the probability that your grade is either an A or a C? A) 4/5 or 0.80 B) 1/2 or 0.50 C) 7/10 or 0.70

A There are a total of 10 pieces of paper in the bowl. From that you have a 5/10 chance of getting an A and a 3/10 chance of getting a C. To find P(Getting an A or Getting a C) we use P(A) + P(C) - P(A and C). Since you cannot get both an A and a C, the two events are mutually exclusive (disjoint), meaning P(A and C) is zero. Therefore, P(A or C) is simply P(A) + P(C) = 5/10 + 3/10 = 8/10 or 4/5 [i.e. 0.80]
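The addition rule for the bowl can be written out with exact fractions. A sketch with illustrative variable names, assuming a single draw so that A and C cannot both occur:

```python
from fractions import Fraction

counts = {"A": 5, "B": 2, "C": 3}
total = sum(counts.values())  # 10 pieces of paper

p_a = Fraction(counts["A"], total)
p_c = Fraction(counts["C"], total)
p_a_and_c = Fraction(0)  # one draw cannot be both an A and a C

# General addition rule: P(A or C) = P(A) + P(C) - P(A and C)
p_a_or_c = p_a + p_c - p_a_and_c
print(p_a_or_c)  # 4/5
```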

A scatterplot of y = left forearm length (cm) and x = height for 55 college students is given above. Is there a positive association or a negative association? Explain what this association means in the context of this situation. A) There is a positive association, which means that as height increases, left forearm length also tends to increase. B) There is a negative association, which means that as height increases, left forearm length also tends to increase. C) There is a positive association, which means that as left forearm length increases, height tends to increase. D) There is a negative association, which means that as left forearm length increases, height tends to increase.

A There is a positive association, which means that as height increases, left forearm length also tends to increase. We always speak of Y in relationship to X.

The table above shows the opinions of 321 respondents from the General Social Survey by whether they owned a gun (or not) and whether they favored (or opposed) a law requiring a permit to own a gun. What is the correct null hypothesis for this survey? A) There is no relationship in the population between gun ownership and opinion regarding gun law permit B) There is no relationship in the sample between gun ownership and opinion regarding gun law permit C) There is a relationship in the population between gun ownership and opinion regarding gun law permit D) There is a relationship in the sample between gun ownership and opinion regarding gun law permit

A There is no relationship in the population between gun ownership and opinion regarding gun law permit

From Standard Normal Table, the probability of an observation being greater than 0.5 is A) 0.3085 B) 0.6915

A The question asks for a "greater than" probability, so you subtract the table probability for 0.5 from 1. This comes to 1 − 0.6915 = 0.3085

The level of significance associated with a significance test is the probability A) of rejecting a true null hypothesis. B) of not rejecting a true null hypothesis. C) that the null hypothesis is true. D) that the alternative hypothesis is true.

A The level of significance, commonly set to α = 0.05, is the maximum probability of rejecting a true null hypothesis that the researcher is willing to accept; it serves as the cutoff for the p-value.

A random sample of 30 airline flights during a storm had an average delay of 40 minutes. The standard deviation was 5 minutes, and the standard error of the mean is 0.9129. Calculate a 98% confidence interval for the average delay for all flights during a storm. A) (37.8, 42.2) B) (38.2, 41.8) C) (27.7, 52.3)

A Using degrees of freedom equal to 30 − 1 = 29 gives a t* of 2.46 for a confidence level of 98%. The standard error is equal to s/√n = 5/√30 = 0.9129 [given!]. The interval then is 40 ± 2.46*0.9129 = (37.8, 42.2)
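
As a quick check of the interval arithmetic, here is a minimal Python sketch. The function name `t_interval` is an illustration only, and the t* multiplier is assumed to be read from a t-table, as in the answer above, rather than computed.

```python
import math

def t_interval(mean, sd, n, t_star):
    """Confidence interval for a mean: mean ± t* × (sd / sqrt(n))."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_star * se        # margin of error
    return mean - margin, mean + margin

# Storm-delay question: mean 40, sd 5, n 30, t* = 2.46 (98%, df = 29, from the table)
low, high = t_interval(40, 5, 30, 2.46)
print(round(low, 1), round(high, 1))
```

Swapping in a different table multiplier (for example the 90% value) only changes the `t_star` argument.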

When the p-value is less than or equal to the designated level of 0.05, the result is called a A) statistically significant result. B) test statistic. C) significance level

A When the p-value is ≤ α we reject Ho, which means that the test is statistically significant

Suppose the significance level for a hypothesis test is α = 0.05. If the p-value is 0.049, the decision is to A) reject the null hypothesis. B) accept the null hypothesis. C) not reject the null hypothesis.

A With p-value ≤ α our decision is to reject the null hypothesis Ho

Suppose the significance level for a hypothesis test is α = 0.05. If the p-value is 0.001, the decision is to A) reject the null hypothesis. B) accept the null hypothesis. C) not reject the null hypothesis.

A With p-value ≤ α our decision is to reject the null hypothesis Ho

study wore a special pair of shoes with the sole of one shoe made from Material A and the sole on the other shoe made from Material B. The sole types were randomly assigned to account for systematic differences in wear between the left and right foot. After three months, the shoes are measured for wear. Let Ho: μd = 0 versus Ha: μd ≠ 0. From this random sample of 10 boys, the sample mean difference was 0.41 and Sd was 0.387. If the p-value for this test is 0.009, then for a significance level of alpha = 0.05 which of the following is an appropriate conclusion? A) There is a statistically significant difference in population mean wear of the two shoe materials B) The population difference is not statistically significant: there is not enough evidence to conclude that the population mean wear of the two shoe materials is different. C) The population mean wear difference is .41 between the two shoe materials.

A With the p-value of 0.009 < 0.05, we reject Ho: there is a statistically significant difference in population mean wear of the two shoe materials

According to the Penn State Fact Book for Enrollment by Ethnicity Fall 2005 [University Park], in a class of 30 students, there would be 17 males and 13 females. Let us say that out of these 30 students five are A students, three of which are males. If a student is chosen at random, what is the probability of choosing a male or an A student? A) 19/30 B) 11/15 C) 17/180

A You are asked to find P(M or A) = P(M) + P(A) − P(A and M). Drawing a Venn diagram helps tremendously in seeing how to solve this problem. P(M) = 17/30, P(A) = 5/30 and P(M and A) = 3/30. Substituting these into the equation for P(M or A) results in 17/30 + 5/30 − 3/30 = 19/30.
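
The general addition rule used here can be sketched in a few lines of Python. The helper name `prob_union` is hypothetical; exact fractions are used so the result matches the answer choices without rounding.

```python
from fractions import Fraction

def prob_union(p_a, p_b, p_both):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

# Class of 30: 17 males, 5 A students, 3 students who are both male and A students
p_male_or_a = prob_union(Fraction(17, 30), Fraction(5, 30), Fraction(3, 30))
print(p_male_or_a)
```

When the two events are disjoint, `p_both` is zero and the rule reduces to a simple sum.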

Scores on an achievement test had an average of 70 and a standard deviation of 10. Serena's score was 85. Using Standard Normal Table and assuming the scores have approximately a normal distribution, about what proportion of students scored lower than Serena? A) 0.93 B) 0.07 C) 0.84 D) 0.68

A You are asked to find P(X < 85), which you convert to P(Z < 1.5) by using z-score = (observed − mean)/SD = (85 − 70)/10 = 1.5. From Table A1, look up 1.5 in the left column combined with 0.00 across the top row to get 0.9332, or about 0.93
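
If no table is at hand, the same lookup can be approximated with Python's standard library. This is only a sketch: `NormalDist` from the `statistics` module stands in for Table A1, and the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd, score = 70, 10, 85

# Standardize: z = (observed - mean) / SD
z = (score - mean) / sd

# Area to the left of z under the standard normal curve
p_below = NormalDist().cdf(z)
print(z, round(p_below, 4))
```

For a "greater than" question, the complement `1 - p_below` gives the area to the right.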

A study is conducted comparing a student's height versus the height of their father. The correlation between father's heights and student's heights for 79 male students was r = 0.669. What is the proportion of variation in son's heights explained by the linear relationship with father's heights? A) 44.8% B) 82.0% C) ± 44.8% D) ± 82.0%

A You are asked to find R², which is never negative and is found by squaring r: 0.669² ≈ 0.448. Answer is 44.8%

Which of the following regression equations best depicts the relationship in the graph above? A) y = b0 + b1x B) y = b0 − b1x C) x = b0 + b1y D) x = b0 − b1y

A y = b0 + b1x is correct. The relationship is positive and we refer to Y in terms of X.

According to a recent Gallup poll, about 60% of all American adults owned a cell phone at the time of the poll. The results are based on telephone interviews with a randomly selected national sample of 998 adults, 18 years and older, conducted March 30-April 2, 2001. The 95% margin of error was reported to be 3.5%. Which of the following statements correctly interprets the reported margin of error of 3.5%? A) In about 95% of all random samples of this size from the same population, the difference between the sample percent and the population percent will be less than 3.5%. B) In about 3.5% of all random samples of this size from the same population, the sample percent will equal the population percent. C) The probability that a 95% confidence interval based on this poll does not cover the population proportion is 3.5%. D) In about 95% of all random samples of this size from the same population, the difference between the sample percent and the population percent will be more than 3.5%.

A) Keep in mind that confidence intervals are used to estimate the true parameter value, in this case the true population proportion. The margin of error then would mean that the true proportion would be within the margin of error of the sample proportion

Correctly identify the following random variable as either discrete or continuous. A gallon of orange juice. A) Discrete B) Continuous

B

Correctly identify the following random variable as either discrete or continuous. The time taken to run a marathon. A) Discrete B) Continuous

B

Correctly identify whether the following situations satisfy the conditions required to conduct a Binomial experiment. Selecting a few voters from a very large population of voters and observing whether or not each of them favors a certain proposition in an election when 54% of all voters are known to be in favor of this proposition. A) Binomial B) NOT Binomial

B

For the given situation, decide if the random variable described is a discrete random variable or a continuous random variable. Random variable X = the weight (in pounds) a dieter will lose after following a two week weight loss program. A) Discrete random variable B) Continuous random variable

B

A standard 52-card deck is shuffled and 5 cards are picked, without replacement, from the top of the deck. The probability that the first four cards are a red suit (Heart or Diamond) and the last card is a black suit (Spade or Club) is A) 2.53% B) 2.99% C) 3.13% D) 50.0%

B (26/52)*(25/51)*(24/50)*(23/49)*(26/48) = 2.99%
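
The product above can be reproduced with exact fractions. This is a small sketch under the question's assumptions (26 red and 26 black cards, drawing without replacement); the variable names are illustrative.

```python
from fractions import Fraction

red, black, deck = 26, 26, 52

# First four draws are red: one fewer red card and one fewer card in the deck each time
p = Fraction(1)
for i in range(4):
    p *= Fraction(red - i, deck - i)

# Fifth draw is black: all 26 black cards remain among the 48 cards left
p *= Fraction(black, deck - 4)

print(round(float(p) * 100, 2))
```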

What is the primary purpose of doing a chi-square test? To determine if there is a significant relationship between A) two quantitative variables B) two categorical variables C) two continuous variables D) a qualitative variable and quantitative variable

B A chi-square test is used to analyze categorical (or qualitative) variables, with the primary purpose of discovering whether a relationship exists between the variables.

In a past General Social Survey, 87% of a random sample of n = 990 respondents answered yes to the question "Would you approve of an adult male punching a stranger if the stranger had broken into the man's house?" A 95% confidence interval for the proportion of all Americans who approve of punching an intruder is A) 0.852 to 0.888 B) 0.849 to 0.891 C) 0.845 to 0.895 D) 0.842 to 0.898

B A confidence interval is found by sample statistic ± z-multiplier * standard error. With p-hat of 0.87, z-multiplier of 1.96 and n = 990, the 95% confidence interval is 0.849 to 0.891
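
The formula in this answer can be checked with a short Python sketch. The function name `proportion_interval` is hypothetical, and the z-multiplier is assumed to come from a table (1.96 for 95%).

```python
import math

def proportion_interval(p_hat, n, z_star):
    """Confidence interval for a proportion: p-hat ± z* × sqrt(p-hat(1 - p-hat)/n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
    margin = z_star * se
    return p_hat - margin, p_hat + margin

# General Social Survey question: p-hat = 0.87, n = 990, 95% confidence
low, high = proportion_interval(0.87, 990, 1.96)
print(round(low, 3), round(high, 3))
```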

A student wanted to test whether there was a difference in the mean daily hours of study for students living in four different dormitories. She selected a random sample of 50 students from each of the four dormitories. What is the null hypothesis for this situation? A) The mean daily hours of study is 3 hours for each dormitory. B) The mean daily hours of study is the same for each dormitory. C) The mean daily hours of study is different for each of the 200 students in the sample. D) The mean daily hours of study is not the same for all four dormitories

B ANOVA tests a null hypothesis that the means (average) are equal

supermarkets. She selected a random sample of 20 shoppers from each of the five supermarkets. What is the null hypothesis for this situation? A) The average waiting time to check out is 25 minutes for all five supermarkets. B) The average waiting time to check out is the same for all five supermarkets. C) The average waiting time for each of the 100 shoppers is different. D) The average waiting time to check out is not the same for all five supermarkets

B ANOVA tests a null hypothesis that the means (average) are equal

Which statement is true about x-bar and p-hat? A) They are both parameters. B) They are both statistics. C) x-bar is a parameter and p-hat is a statistic. D) p-hat is a parameter and x-bar is a statistic

B Both x-bar and p-hat represent statistics.

Which one of the following statements involving correlation as we have discussed is possible and reasonable? A) The correlation between hair color and eye color is 0.80. B) The correlation between the height of a father and the height of his first son is 0.6 C) The correlation between left foot length and right foot length is 2.35. D) The correlation between hair color and age is positive.

B Correlation, per our discussions, is used to report the linear association between two quantitative variables and ranges from −1 ≤ r ≤ 1, so a correlation of 0.6 between the height of a father and his first son is correct

For the following statement, determine if it is true or false. If events A and B are known to be independent and P(A) = 0.2 and P(B) =0.3, then P(A and B) = 0.5. A) True B) False

B False; if independent then P(A) times P(B) equals P(A and B) and here 0.2 times 0.3 does not equal 0.5

A student doing an internship at a large research firm collected the following data, representing all of the studies the firm had conducted over the past 3 years. Define the events E = {the study was an experiment}, U = {the study used randomization} and R = {the study was a retrospective observational study}. Which of the following sets of events are disjoint? A) E and U B) E and R C) U and R D) All three E) None of the above

B Feedback: Being an experiment and being a retrospective observational study are disjoint

Based on 1988 census data for the 50 States in the United States, the correlation between the number of churches per State and the number of violent crimes per State was 0.85. What can we conclude? A) There is a causal relationship between the number of churches and the number of violent crimes committed in a city. B) The correlation is spurious because of the confounding variable of population size: both number of churches and number of violent crimes are related to the population size. C) Since the data comes from a census, or nearly complete enumeration of the United States, there must be a causal relationship between the number of churches and the number of violent crimes. D) The relationship is not causal because only correlations of +1 or −1 show causal relationships

B Feedback: Keep in mind the effect of confounding variables!

Correctly identify whether the following situations satisfy the conditions required to conduct a Binomial experiment. Rolling a die many times and observing whether the number obtained is even or odd A) Binomial B) NOT Binomial

B Feedback: Not binomial - n is not fixed

Is the given percent a statistic or a parameter? 75% of all students at a school are in favor of more bicycle parking spaces on campus. A) Statistic B) Parameter

B Feedback: Since the percent refers to the population (i.e. "all students") it is a parameter.

According to a recent Gallup poll, about 60% of all American adults owned a cell phone at the time of the poll. The results are based on telephone interviews with a randomly selected national sample of 998 adults, 18 years and older, conducted March 30-April 2, 2001. The margin of error was reported to be 3.5%. What was the population of interest in this Gallup Poll? A) All American adults who own cell phones. B) All American adults. C) The 998 adults who participated in the survey. D) The participants in the survey who owned cell phones.

B Feedback: The population of interest is all American Adults

Michael wants to take French or Spanish, or both. But classes are closed, and he must apply and get accepted to be allowed to enroll in a language class. He has a 50% chance of being admitted to French, a 50% chance of being admitted to Spanish, and a 20% chance of being admitted to both French and Spanish. If he applies to both French and Spanish, the probability that he will be enrolled in either French or Spanish (or possibly both) is A) 70% B) 80% C) 90% D) 100%

B Feedback: The question asks you to find P(A union B) = P(A) + P(B) − P(A and B) = 0.5 + 0.5 − 0.2 = 0.8 or 80%

The relative risk of ever smoking for a 17-year-old versus a 12-year-old is 3.6. What is the risk of smoking for a 12-year-old (i.e. what was the percentage of 12-year-olds who ever tried smoking)? A) 14.1% B) 15.6% C) 50.0% D) 56.2%

B Feedback: You need to use algebra to solve: Relative risk (3.6) = risk in one group of interest (17-year-olds who tried smoking = 56.2%) divided by risk in the other group of interest (12-year-olds who tried smoking, which is unknown). Solving for the unknown gives 0.562/3.6 = 0.156 or 15.6%
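
The algebra can be sketched in Python as follows. The variable names are illustrative, and the 56.2% risk for 17-year-olds is the value quoted in the feedback above.

```python
relative_risk = 3.6   # risk for 17-year-olds divided by risk for 12-year-olds
risk_17 = 0.562       # proportion of 17-year-olds who ever tried smoking

# relative_risk = risk_17 / risk_12, so risk_12 = risk_17 / relative_risk
risk_12 = risk_17 / relative_risk
print(round(risk_12, 3))
```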

Determine if the statement is a typical null hypothesis (Ho) or alternative hypothesis (Ha). The proportion of overweight men is greater than the proportion of overweight women in America. A) Null hypothesis B) Alternative hypothesis

B Ho refers to no difference or change or equal to. Ha will be the research hypothesis that involves either a difference, greater than, or less than.

According to a 2001 study of college students by Harvard University's School of Public Health, 19.3% of those included in the study abstained from drinking (USA TODAY, April 3, 2002). Suppose that of all current college students in the United States, 20% abstain from drinking. A random sample of four college students is selected with the following binomial results (binomial with n = 4 and p = 0.2): P(X = 0) = 0.4096, P(X = 1) = 0.4096, P(X = 2) = 0.1536, P(X = 3) = 0.0256, P(X = 4) = 0.0016. From the table above, what is the probability that only two college students in this sample abstain from drinking? A) 0.4096 B) 0.1536 C) 0.8192 D) 0.9728

B Just need to locate the probability for P(x=2) which is 0.1536
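
For reference, the table values themselves can be regenerated from the binomial formula. This is a minimal sketch using only the standard library; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# n = 4 students, p = 0.2 chance that each abstains
table = {x: round(binomial_pmf(x, 4, 0.2), 4) for x in range(5)}
print(table)
```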

Which of the following is an example of a binomial random variable? A) The number of games your favorite baseball team will win this coming season. B) The number of questions you would get correct on a 15 question true-false test if you randomly guessed on all questions. C) The number of siblings a randomly selected student has. D) The number of coins a randomly selected student is carrying.

B Needing a fixed number of trials with a "success" or "failure" on any one trial, and a fixed, common probability for each success leads to the correct answer of randomly guessing on a 15 question true/false test

Joan has noticed that the probability distribution for X = number of students in line to use the campus ATM machine when she shows up to use it is shown above. What is the probability that there will be no more than 1 student in line when Joan shows up? A) 0.10 B) 0.20 C) 0.70 D) 0.90

B No more than implies P(X ≤ 1) = P(X =0) plus P(X = 1) = 0.10 plus 0.10 which is 0.20

Which of the following statements is true about a parameter and a statistic for samples taken from the same population? A) The value of the parameter varies from sample to sample. B) The value of the statistic varies from sample to sample. C) Both A and B are true. D) Neither A nor B are true

B Only the value of the statistics will vary; the parameter is fixed

A regression between foot length (response variable in cm) and height (explanatory variable in inches) for 33 students resulted in the following regression equation: y = 10.9 + 0.23x One student in the sample was 73 inches tall with a foot length of 29 cm. What is the predicted foot length for this student? A) 17.57 cm B) 27.69 cm C) 29 cm D) 33 cm

B Plugging 73 into the equation results in a predicted foot length of 27.69cm
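
Plugging a value into a fitted line is easy to script. The sketch below hard-codes the intercept and slope from the question; `predict_foot_length` is a hypothetical helper name.

```python
def predict_foot_length(height_in):
    """Predicted foot length (cm) from height (inches) using y = 10.9 + 0.23x."""
    return 10.9 + 0.23 * height_in

predicted = predict_foot_length(73)
print(round(predicted, 2))
```

Note that the prediction uses only the student's height; the observed foot length of 29 cm does not enter the calculation.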

Suppose a 95% confidence interval for the proportion of Americans who exercise regularly is 0.29 to 0.37. Which one of the following statements is false? A) It is reasonable to say that more than 25% of Americans exercise regularly. B) It is reasonable to say that more than 40% of Americans exercise regularly. C) An acceptable hypothesis is that about 33% of Americans exercise regularly. D) It is reasonable to say that fewer than 40% of Americans exercise regularly.

B Since 40% is not in the interval (it lies above the upper limit of 0.37), it is not reasonable to say that more than 40% of Americans exercise regularly

A null hypothesis is that the probability is 0.7 that a new drug will provide relief in a randomly selected patient. The alternative is that the probability of relief is greater than 0.7. Suppose the treatment is used on 500 patients and there are 380 successes. How would a p-value be calculated in this situation? A) Find the chance of 380 or more successes, calculated assuming the true success rate is greater than 0.7. B) Find the chance of 380 or more successes, calculated assuming the true success rate is 0.7. C) Find the chance of fewer than 380 successes, calculated assuming the true success rate is greater than 0.7. D) Find the chance of fewer than 380 successes, calculated assuming true success rate is 0.7.

B Since Ha would be p > 0.7, the p-value would be found by finding the probability of 380 or more successes, assuming the true rate is 0.7

A sample of n = 200 college students is asked if they believe in extraterrestrial life and 120 of these students say that they do. The data are used to test Ho: p = 0.5 versus Ha: p > 0.5, where p is the population proportion of college students who say they believe in extraterrestrial life. From this sample, the above output was obtained. What is the correct description of the area that equals the p-value for this problem? A) The area to the right of 0.60 under a standard normal curve. B) The area to the right of 2.83 under a standard normal curve. C) The area to the right of 2.83 under the standard normal curve. D) The area between 0.532105 and 0.667895 under a standard normal curve

B Since the alternative hypothesis, Ha, is ">", we define the p-value as the area to the right of the z-value under the standard normal curve (i.e. Table A1). That is, we find the p-value as P(Z > z)

Is the given percent a statistic or a parameter? Based on the 2000 Census, 39.5% of the California population of residents who are over 5 years old speak languages other than English at home. A) Statistic B) Parameter

B Since the percent is referring to the California population it is a parameter

A researcher examined the folklore that women can predict the sex of their unborn child better than chance would suggest. She asked 104 pregnant women to predict the sex of their unborn child, and 57 guessed correctly. Using these data, the researcher created the above output. Based on the information in the output and using α = 0.05, what is the appropriate conclusion the researcher can make about p = proportion of pregnant women who can correctly predict the sex of their unborn child? A) There is statistically significant evidence against the null hypothesis that p = 0.5. B) There is not statistically significant evidence against the null hypothesis that p = 0.5. C) There is statistically significant evidence against the null hypothesis that p = 0.548. D) There is not statistically significant evidence against the null hypothesis that p = 0.548

B Since the p-value is > 0.05 we would not reject Ho. Therefore, the result does not provide statistically significant evidence against the null hypothesis that p = 0.5

A study compared grade point averages (GPA) among students in 4 different majors (English, History, Statistics, and Art) using analysis of variance. A total sample size of 20 students (5 in each major) was studied. The Error Sum of Squares is SS Error = 64. What is the Mean Square Error (MS Error)? A) 3.2 B) 4 C) 16

B The MS Error is equal to the SS Error divided by the Error degrees of freedom (which are equal to the total sample size minus the number of group levels: 20 − 4 = 16). Therefore, the answer is 64/16 = 4
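
The same calculation as a short Python sketch; the function name `mean_square_error` is illustrative and the inputs are the values given in the question.

```python
def mean_square_error(ss_error, n_total, n_groups):
    """MS Error = SS Error / error degrees of freedom, where df = N - k."""
    df_error = n_total - n_groups
    return ss_error / df_error

# 20 students across 4 majors, SS Error = 64
ms_error = mean_square_error(64, 20, 4)
print(ms_error)
```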

There are five cities in a politician's district but redistricting has been proposed for the state. The politician would like to know which city he should try to remove from his district. He plans to conduct a survey to find out his approval rating. Which one of the following sampling plans would be most useful for his purposes? A) Take a stratified sample with political parties as the strata. B) Take a stratified sample with the five cities as the strata. C) Take a simple random sample across his district. D) Take a cluster sample with the five cities as clusters

B The best sampling method would be one that provides a sample that represents all five cities. The best technique would be to use a stratified sample based on strata of the five cities

Joan has noticed that the probability distribution for X = number of students in line to use the campus ATM machine when she shows up to use it is shown above. What is the expected value of X, E(X)? A) 2.0 B) 2.2 C) 2.5 D) 3.0

B The expected value of X, i.e. E(X) is found by taking each outcome times its respective probability and then summing these products. Thus, (0)(0.10) + (1)(0.10) + (2)(0.40) + (3)(0.30) + (4)(0.10) = 2.2
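
The sum of outcome-times-probability products can be written as a one-line Python expression. The distribution below is the one spelled out in the answer above; the variable names are illustrative.

```python
# Probability distribution for X = number of students in line at the ATM
dist = {0: 0.10, 1: 0.10, 2: 0.40, 3: 0.30, 4: 0.10}

# E(X): each outcome times its probability, summed
expected = sum(x * p for x, p in dist.items())

# P(X <= 1): "no more than 1 student in line"
p_at_most_one = sum(p for x, p in dist.items() if x <= 1)

print(round(expected, 2), round(p_at_most_one, 2))
```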

sided die 100 times. Each individual student determined the proportion of his or her 100 rolls for which the result was a "1". The instructor plans to draw a histogram of the 1,100 sample proportions. What will be the approximate mean for the 1,100 sample proportions? A) 1/100 B) 1/6 C) 6/100 D) 6

B The mean of the sample proportions should equal the expected proportion of 1/6

The regression line for a set of points is given by y = 12 − 6x. What is the slope of the line? A) −12 B) −6 C) 6 D) 12

B The slope is the coefficient of X, which is −6.

Based on the 2000 Census, 31.8% of grandparents in California are the primary caregivers for their grandchildren. Suppose n = 1000 persons are to be sampled from this population and the sample proportion of grandparents as primary caregivers (p-hat) is to be calculated. What is the standard deviation of the sampling distribution of p-hat? A) 0.0002 B) 0.0147 C) 0.2169 D) 0.3180

B The standard error (or standard deviation of the sampling distribution) is found by taking the square root of [(0.318) (0.682)/1000] = 0.0147

Sleep apnea is a condition involving irregular breathing during sleep. Suppose that about 20% of a random sample of n = 64 men experience sleep apnea. What is the standard error of the sample proportion? A) 0.125 B) 0.05 C) 0.10 D) 0.20

B The standard error is found by taking the square root of [(0.20)(0.80)/64] = 0.05
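
Both standard error questions above use the same formula, which can be sketched in Python as follows (`proportion_se` is a hypothetical helper name).

```python
import math

def proportion_se(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p)/n)."""
    return math.sqrt(p * (1 - p) / n)

se_caregivers = proportion_se(0.318, 1000)   # California grandparents question
se_apnea = proportion_se(0.20, 64)           # sleep apnea question
print(round(se_caregivers, 4), round(se_apnea, 4))
```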

A statistics class has 4 teaching assistants (TAs): three female assistants (Lauren, Rona, and Leila) and one male assistant (Josh). Each TA teaches one discussion section. A student picks a discussion section. The two events W = {the TA is a woman} and M = {the TA is a man} are A) independent events. B) disjoint (mutually exclusive) events. C) each simple events. D) none of the above

B These events are mutually exclusive since both events cannot occur at the same time.

If events A and B are mutually exclusive (disjoint) then A) they must also be independent. B) they cannot also be independent. C) they must also be complements. D) they cannot also be complements

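B They cannot be independent: if A occurs then B cannot occur, so knowing that A happened changes the probability of B (assuming both events have nonzero probability).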

An investigator wants to assess whether the mean μ = the average weight of passengers flying on small planes exceeds the FAA guideline of average total weight of 185 pounds (passenger weight including shoes, clothes, and carry-on). Suppose that a random sample of 51 passengers showed an average total weight of 200 pounds with a sample standard deviation of 59.5 pounds. Assume that passenger total weights are normally distributed. Using the t-table, for a significance level of α = 0.05, are the results statistically significant? A) No, results are not statistically significant because the p-value < 0.05. B) Yes, results are statistically significant because the p-value < 0.05. C) No, results are not statistically significant because the p-value > 0.05 D) Yes, results are statistically significant because the p-value > 0.05.

B This is a one-sample t-test, so the test statistic t is found by taking the difference between the sample mean (200) and the hypothesized mean (185) and dividing by the standard error of the mean (S/√n = 59.5/√51 = 8.332). The t-value is then t = 15/8.332 = 1.80. Then with degrees of freedom of 51 − 1 = 50, the t-table gives a p-value between 0.025 and 0.050, which is less than 0.05. Thus we would reject Ho and find the results to be statistically significant.

A counselor wants to show that for men who are married by the time they are 30, μ = average age when the men are married is not 21 years old. A random sample of 10 men who were married by age 30 showed an average age at marriage of 22.2, with a sample standard deviation of 1.9 years. Assume that the age at which this population of men gets married for the first time is normally distributed. What is the value of the test statistic? A) t = 1.80 B) t = 2.00 C) t = 2.33

B This is a one-sample t-test, so the test statistic t is found by taking the difference between the sample mean (22.2) and the hypothesized mean (21) and dividing by the standard error of the mean (S/√n = 1.9/√10 = 0.601). The t-value is then t = 1.2/0.601 = 2.00
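
The test statistic in the two one-sample t-test questions above can be reproduced with a small Python sketch. The function name `one_sample_t` is illustrative; the p-value itself would still be read from a t-table using the returned degrees of freedom.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sd, n):
    """Return the one-sample t statistic and its degrees of freedom (n - 1)."""
    se = sd / math.sqrt(n)   # standard error of the mean
    return (sample_mean - hypothesized_mean) / se, n - 1

t_marriage, df_marriage = one_sample_t(22.2, 21, 1.9, 10)
t_weight, df_weight = one_sample_t(200, 185, 59.5, 51)
print(round(t_marriage, 2), df_marriage)
print(round(t_weight, 2), df_weight)
```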

For the given situation, decide if the random variable described is a discrete random variable or a continuous random variable. Random variable X = the time (in seconds) it takes one email to travel between a sender and receiver. A) Discrete random variable B) Continuous random variable

B Time is continuous. Imagine that an email is said to have taken 2 seconds. This "2 seconds" is actually any time from 1.5 seconds to less than 2.5 seconds

A random sample of 30 airline flights during a storm had a mean delay of 40 minutes. The standard deviation was 5 minutes, and the standard error of the mean is 0.9129. Calculate a 90% confidence interval for the average delay for all flights during a storm. A) (38.2, 41.8) B) (38.4, 41.6) C) (31.5, 48.5)

B Using degrees of freedom equal to 30 − 1 = 29 gives a t* of 1.70 for a confidence level of 90%. The standard error is equal to s/√n = 5/√30 = 0.9129 [given!]. The interval then is 40 ± 1.70*0.9129 = (38.4, 41.6)

Past data has shown that the regression line relating the final exam score and the midterm exam score for students who take statistics from a certain professor is final exam = 50 + 0.5 × midterm Midterm exam scores could range from 0 to 100. Based on the equation, final exam scores are predicted to range from: A) 0 to 100. B) 50 to 100. C) 50 to 75. D) 0 to 75.

B Using this range of midterm scores (0 to 100), the minimum and maximum predicted final exam scores are 50 + 0.5(0) = 50 and 50 + 0.5(100) = 100.

A null hypothesis is that the mean nose lengths of men and women are the same. The alternative hypothesis is that men have a longer mean nose length than women. A statistical test is performed for assessing if men have a longer mean nose length than women. The p-value is 0.225. Which of the following is the most appropriate way to state the conclusion? A) The mean nose lengths of the populations of men and women are identical. B) There is not enough statistical evidence to say that the populations of men and women have different mean nose lengths. C) Men have a greater mean nose length. D) The probability is 0.225 that men and women have the same mean nose length.

B With a p-value of 0.225, which is greater than 0.05, we would not reject Ho. This would mean that there is not enough statistical evidence to say that the populations of men and women have different mean nose lengths.

The average time taken to complete an exam, X, follows a normal probability distribution with mean = 60 minutes and standard deviation 30 minutes. Using Standard Normal Table, what is the probability that a randomly chosen student will take at least an hour to complete the exam? A) 0.1587 B) 0.5000 C) 0.8413 D) 0.9772

B You are asked to find P(X > 60), which you convert to P(Z > 0.0) by using z-score = (observed − mean)/SD = (60 − 60)/30 = 0.0. From Table A1, look up 0.0 in the left column combined with 0.00 across the top row to get 0.5000. Using the complement rule, subtracting this value from one produces P(Z > 0.0) = 0.5000

Proportion of people who intend to vote in the next presidential election. Sample: 100 baseball fans at a baseball game. Population: All voters in the next presidential election. A) Representative B) Not representative

B) Not representative: fans at a baseball game may not be old enough to vote or otherwise eligible to vote, and they are not a random sample of all voters.

A survey taker randomly selected 1000 students who were studying in the library and found that 90% of these students were in favor of longer library hours. The results of this study, if applied to all students in the university, are questionable because of A) lack of accuracy. B) selection bias. C) nonresponse bias. D) response bias.

B) The survey sampled only students who were studying in the library, who would not necessarily represent ALL students, so the study is questionable due to selection bias.

A student is randomly selected from a large college campus. Define the events H = {the student has blond hair} and E = {the student has blue eyes}. The chance that a blond haired student has blue eyes equals 75%. How do we write this probability? A) P(H) = 0.75 B) P(E) = 0.75 C) P(E|H) = 0.75 D) P(H|E) = 0.75

C

Past data has shown that the regression line relating the final exam score and the midterm exam score for students who take statistics from a certain professor is final exam = 50 + 0.5 × midterm. For a student with a midterm score of 50, the predicted final exam score is: A) 50. B) 50.5. C) 75. D) 100.

C

The z* multiplier for a 98% confidence interval is A) 1.65 B) 1.96 C) 2.33 D) 2.58

C

The percent of data which lie between the lower and upper quartiles is A) 10%. B) 25%. C) 50%. D) 75%.

C 50% of observations lie between Q1 (lower quartile) and Q3 (upper quartile).

In a survey of n = 950 randomly selected individuals, 17% answered yes to the question "Do you think the use of marijuana should be made legal or not?" A 98% confidence interval for the proportion of all Americans in favor of legalizing marijuana is A) 0.150 to 0.190 B) 0.146 to 0.194 C) 0.142 to 0.198 D) 0.139 to 0.201

C A confidence interval is found by sample statistic ± z* multiplier × standard error, where the standard error of a sample proportion is √(p̂(1 − p̂)/n). With p̂ = 0.17, a z* multiplier of 2.33, and n = 950, the 98% confidence interval is 0.142 to 0.198.
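The arithmetic behind this interval can be sketched as follows, using the values from the question and the table multiplier 2.33. Variable names are illustrative.

```python
from math import sqrt

p_hat = 0.17   # sample proportion answering yes
n = 950        # sample size
z_star = 2.33  # table multiplier for 98% confidence

# Standard error of a sample proportion
se = sqrt(p_hat * (1 - p_hat) / n)

# Interval: statistic +/- multiplier * standard error
lower = p_hat - z_star * se
upper = p_hat + z_star * se
print(round(lower, 3), round(upper, 3))
```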

A null hypothesis is that the mean cholesterol level is 200 in a certain age group. The alternative is that the mean is not 200. Assuming that sample standard deviation is the same for each sample, which of the following is the most significant evidence against the null and in favor of the alternative? A) For a sample of n = 25, the sample mean is 220. B) For a sample of n = 10, the sample mean is 220. C) For a sample of n = 50, the sample mean is 180. D) For a sample of n = 20, the sample mean is 180.

C All of the given sample means differ from the hypothesized population mean of 200 by the same amount (i.e. the difference is ±20). The smallest p-value results from the test statistic that is largest in absolute value. This occurs for the sample with the smallest standard error of the mean (here s/√n), which is when n = 50.
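To see the effect of sample size, the test statistic can be computed for each option. The question gives no standard deviation, so the value `s = 40.0` below is purely a hypothetical placeholder; any common positive value gives the same ranking.

```python
from math import sqrt

s = 40.0   # hypothetical common sample SD; the question gives no value
mu0 = 200  # hypothesized mean cholesterol level

# option label -> (sample size, sample mean), taken from the question
samples = {"A": (25, 220), "B": (10, 220), "C": (50, 180), "D": (20, 180)}

# t = (sample mean - hypothesized mean) / (s / sqrt(n))
t_stats = {label: (mean - mu0) / (s / sqrt(n)) for label, (n, mean) in samples.items()}

# The strongest evidence against Ho is the statistic largest in absolute value.
strongest = max(t_stats, key=lambda label: abs(t_stats[label]))
print(strongest)
```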

A sample of n = 200 college students is asked if they believe in extraterrestrial life and 120 of these students say that they do. The data are used to test Ho: p = 0.5 versus Ha: p > 0.5, where p is the population proportion of college students who say they believe in extraterrestrial life. From this sample, the above output was obtained. Suppose that the alternative hypothesis had been Ha: p ≠ 0.5. What would have been the p-value of the test? A) 0.002 B) 0.001 C) 0.004 D) 0.5

C Changing Ha from a one-sided test (i.e. > or <) to a two-sided test (i.e. ≠) doubles the p-value. Conversely, changing the test from two-sided to one-sided (in the direction of the observed result) cuts the p-value in half.

A recent poll reported that 62% of all college students believe there is extraterrestrial life. The 95% margin of error for the poll was 4%. Which of the following statements is correct? A) We can be certain that the percentage of all college students who believe there is extraterrestrial life is in the interval 58% to 66%. B) The chance is 95% that at least 62% of all college students believe there is extraterrestrial life. C) With 95% confidence we can say that the percentage of all college students who believe there is extraterrestrial life is between 58% and 66%. D) The chance is 5% that at least 62% of all college students believe there is extraterrestrial life.

C Confidence intervals are exactly that: a measure of how confident you are in your estimate of the true parameter value. The interval itself is either correct in its estimate [i.e. the true parameter value is in the interval] or incorrect [the interval does not capture the true parameter value]. Therefore the probability that a particular interval is right is either 1 or 0, which is why the statement is made in terms of 95% confidence rather than a 95% chance.

The average time taken to complete an exam, X, follows a normal probability distribution with mean = 60 minutes and standard deviation 30 minutes. Using Standard Normal Table, by what time should 99% of the students have finished the exam? (i.e. What is the 99th percentile for X?) A) 85.2 minutes B) 89.4 minutes C) 129.9 minutes D) 150.0 minutes

C Converting 99% to 0.9900 and referencing Table A1, we look inside the table for the z-score that has cumulative probability closest to 0.9900. This z-score is 2.33, and we use it to solve for "observed" in the equation z-score = (observed − mean)/SD. Rearranging gives observed = SD × z-score + mean. Substituting, observed = (30)(2.33) + 60, which equals 129.9.
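The percentile calculation can be sketched in code. The exact inverse-normal value differs slightly from the table-based answer because the table rounds z to 2.33; both are shown. Names are illustrative.

```python
from statistics import NormalDist

exam_time = NormalDist(mu=60, sigma=30)

# Table-based route: z = 2.33, then observed = SD * z + mean
z_table = 2.33
x_table = exam_time.stdev * z_table + exam_time.mean

# Exact route: invert the cumulative distribution directly
z_exact = NormalDist().inv_cdf(0.99)
x_exact = exam_time.inv_cdf(0.99)

print(round(x_table, 1), round(z_exact, 4), round(x_exact, 2))
```

The exact percentile is a little under 129.9 minutes, which is still closest to option C.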

Suppose that for X = net amount won or lost in a lottery game, the expected value is E(X) = −$0.50. What is the correct interpretation of this value? A) The most likely outcome of a single play is a net loss of 50 cents. B) A player will have a net loss of 50 cents every single time he or she plays this lottery game. C) Over a large number of plays the average outcome for plays is a net loss of 50 cents. D) A mistake must have been made because it's impossible for an expected value to be negative.

C Expected values are the mean results expected over a long run (i.e. large number) of trials.

If the sample size (n) is large, and the sample is a random sample, then the distribution of the sample mean xbar is approximately a A) binomial distribution. B) uniform distribution. C) normal distribution. D) none of the above.

C Feedback: Based on theory, the sampling distribution of xbar when sample size is large will follow a normal distribution. We typically refer to large as being of sample size 30 or more. This is also known as the central limit theorem.

The above histogram shows the distribution of the difference between the actual and 'ideal' weights for 119 female college students. Notice that percent is given on the vertical axis. Ideal weights are responses to the question "What is your ideal weight?" The difference = actual − ideal. (Source: idealwtwomen dataset on the CD.) What is the approximate shape of the distribution? A) Nearly symmetric. B) Skewed to the left. C) Skewed to the right. D) Bimodal (has more than one peak).

C Feedback: Note how the data "stacks" to the left and then "tails" to the right. This implies skewed to the right, or right skewed

The table above shows the responses from a sample of 680 people in the General Social Survey to the question, "Do you sometimes drink more than you think you should?" What is the risk (or percentage) of men thinking they drank more than they should? A) 22.2% B) 35.7% C) 46.0% D) 62.1%

C Feedback: Remember that risk is the number in the group of interest divided by the group total: 151/328 = 46.0%

Heights of college women have a distribution that can be approximated by a normal curve with a mean of 65 inches and a standard deviation equal to 3 inches. Using Standard Normal Table, about what proportion of college women are between 65 and 68 inches tall? A) 0.5000 B) 0.8413 C) 0.3413 D) 1.3413

C First, recognize that the question asks you to find P(65 < X < 68) and that you need to convert 65 and 68 to z-scores by the equation (observed − mean)/SD. Doing so results in a z-score of zero for 65 and one for 68. From Table A1, P(0 < Z < 1) equals P(Z < 1) − P(Z < 0) = 0.8413 − 0.5000 = 0.3413

A shoe company wants to compare two materials, A and B, for use on the soles of boys' shoes. In this example, each of ten boys in a study wore a special pair of shoes with the sole of one shoe made from Material A and the sole on the other shoe made from Material B. The sole types were randomly assigned to account for systematic differences in wear between the left and right foot. After three months, the shoes are measured for wear. Let Ho: μd = 0 versus Ha: μd ≠ 0. From this random sample of 10 boys, the sample mean difference was 0.41 and Sd was 0.387. What is the value of the test statistic? A) t = 0.00 B) t = 3.18 C) t = 3.35 D) t = 4.74

C First, recognize that this is a paired t-test. The t test statistic is 3.35 and is found by taking the sample mean difference minus the hypothesized value, then dividing by the standard error (not the standard deviation!). This is (0.41 − 0)/(0.387/√10), or 0.41/0.122 = 3.35
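The paired t statistic can be reproduced from the summary values in the question. This is a small sketch with illustrative variable names.

```python
from math import sqrt

mean_diff = 0.41    # sample mean of the paired differences
sd_diff = 0.387     # sample SD of the differences
n = 10              # number of boys (pairs)
hypothesized = 0    # value of mu_d under Ho

# Standard error of the mean difference, then the t statistic
se = sd_diff / sqrt(n)
t = (mean_diff - hypothesized) / se
print(round(se, 3), round(t, 2))
```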

Using Standard Normal Table, what is the probability that Z is between −1 and 1, P(−1 < Z < 1)? A) 0.1587 B) 0.3174 C) 0.6826 D) 0.8413

C From Table A1 we need to apply P(Z < 1) − P(Z < −1). From the table, look up 1.0 in the left column combined with 0.00 across the top row to get 0.8413, and look up −1.0 in the left column combined with 0.00 across the top row to get 0.1587. Subtracting, 0.8413 − 0.1587 equals 0.6826
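The same subtraction of cumulative probabilities can be sketched with the standard library's normal distribution. The result agrees with the table answer up to rounding in the fourth decimal place.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

# P(-1 < Z < 1) = P(Z < 1) - P(Z < -1)
p_below_1 = z.cdf(1)
p_below_minus_1 = z.cdf(-1)
p_between = p_below_1 - p_below_minus_1
print(round(p_below_1, 4), round(p_below_minus_1, 4), round(p_between, 4))
```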

Which statement is not true about hypothesis tests? A) Hypothesis tests are only valid when the sample is representative of the population for the question of interest. B) Hypotheses are statements about the population represented by the samples. C) Hypotheses are statements about the sample (or samples) from the population. D) Conclusions are statements about the population represented by the samples.

C Hypothesis tests are NOT about the sample, but instead we use the sample to draw conclusions about the population

In the past five years, only 5% of preschool children did not improve their swimming skills after taking a Beginner Swimmer Class at a certain Recreation Center. What is the probability that a preschool child who is taking this swim class will improve his/her swimming skills? A) 5% B) 10% C) 95%

C If 5% did not improve then 95% did improve since you can either improve or not improve.

Suppose that a student needs to buy 10 books for her history course. The number of books that she will be able to find used is a binomial random variable X with n = 10 and p = 0.30. In other words, the probability that she will find any given book used is 0.30, and is independent from one book to the next. What is the probability that she will find more than 2 used books? A) 0.382 B) 0.256 C) 0.618 D) 0.744

C Open Minitab and go to Calc > Probability Distributions > Binomial. Since we want more than 2, choose Cumulative Probability and enter 10 for number of trials, 0.3 for Probability of Success, click Input Constant and enter 2. This gives the cumulative probability P(X ≤ 2) = 0.382. Since we want more than 2, subtract this probability from 1, resulting in P(X > 2) = 0.618
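For readers without Minitab, the same complement calculation can be sketched directly from the binomial formula. The helper `binom_pmf` is written here for illustration. The unrounded result is about 0.617, which differs from the card's 0.618 only by rounding of the intermediate value and is still closest to option C.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.30

# Cumulative probability P(X <= 2) = P(0) + P(1) + P(2)
p_at_most_2 = sum(binom_pmf(k, n, p) for k in range(3))

# Complement rule: P(X > 2) = 1 - P(X <= 2)
p_more_than_2 = 1 - p_at_most_2
print(round(p_at_most_2, 4), round(p_more_than_2, 4))
```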

A standard 52card deck is shuffled and 2 cards are picked, without replacement, from the top of the deck. The probability that the first card is a Heart and the second card is a Spade is A) 5.9% B) 6.25% C) 6.37% D) 25.0%

C P(Heart) times P(Spade) = 13/52 times 13/51 = 6.37%
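Because the cards are drawn without replacement, the second factor uses 51 remaining cards. A short sketch with exact fractions makes that visible; the variable names are illustrative.

```python
from fractions import Fraction

# First card: 13 hearts among 52 cards
p_heart_first = Fraction(13, 52)

# Second card: all 13 spades remain, but only 51 cards are left
p_spade_second_given_heart = Fraction(13, 51)

p_heart_then_spade = p_heart_first * p_spade_second_given_heart
print(p_heart_then_spade, round(float(p_heart_then_spade) * 100, 2))
```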

A regression equation for left palm length (y variable) and right palm length (x variable) for 55 college students gave an error sum of squares (SSE) of 10.7 and a total sum of squares (SSTO) of 85.2. The proportion of variation explained by x, R2, is: A) 11.2%. B) 12.6%. C) 87.4%. D) 88.8%.

C R² is found by (SSTO − SSE)/SSTO = (85.2 − 10.7)/85.2 = 0.874. Then multiply by 100% to get 87.4%.
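The proportion of variation explained can be checked with the two sums of squares from the question. A minimal sketch:

```python
sse = 10.7    # error sum of squares
ssto = 85.2   # total sum of squares

# R^2 = (SSTO - SSE) / SSTO, the share of total variation explained by x
r_squared = (ssto - sse) / ssto
print(round(r_squared * 100, 1))
```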

For a normal random variable (Using Standard Normal Table), the probability of an observation being less than the median is: A) 0.16 B) 0.34 C) 0.50 D) 0.68

C Recall that for a normal random variable, the mean is equal to the median. Thus, for a normal random variable, the total probability above the mean (or median) is 0.5, as is the probability below the mean (or median).

A student is randomly selected from a large college. Define the events C = {the student owns a cell phone} and I = {the student owns an iPod}. Suppose you know that your friend's sister, who attends this college, owns a cell phone and you are wondering what would be the chance that she owns an iPod. What type of probability would this be? A) A probability of independent events. B) A probability of dependent events. C) A conditional probability. D) A probability of disjoint events.

C Since some event is given (owns a cell phone), you are interpreting a conditional probability: P(I | C).

8 each and a different training program is assigned to each group. After two months, the improvement in endurance is recorded for each participant. A one-way analysis of variance is used to compare the five training programs, and the resulting p-value is 0.023. At a significance level of 0.05, the appropriate conclusion about mean improvement in endurance is that it A) is the same for the five training programs. B) is different for each of the five training programs. C) differs for at least two of the five training programs. D) is significantly better for one of the training programs than for the other four.

C Since the p-value is less than 0.05 we would reject the null hypothesis that all five population means are equal and conclude that they are not all equal (i.e. at least two of the five means differ).

A null hypothesis is that the average pulse rate of adults is 70. For a sample of 64 adults, the average pulse rate is 71.8. A significance test is done and the p-value is 0.02. What is the most appropriate conclusion based on α of 0.05? A) Conclude that the population average pulse rate is 70. B) Conclude that the population average pulse rate is 71.8. C) Reject the hypothesis that the population average pulse rate is 70. D) Reject the hypothesis that the sample average pulse rate is 70.

C Since the p-value is less than α, we would reject Ho, the null hypothesis that the population average pulse rate is 70.

The table above shows the number of Olympic medals won by the three countries with the most medals during the 2000 Olympics in Sydney, Australia. There were a total of 244 medals won by the three countries. What percent of the medals won by the USA were gold? A) 39.4% B) 39.8% C) 40.2% D) 40.6%

C Take 39 divided by 97 times 100% for 40.2%

A study compared testosterone levels among athletes in four sports: soccer, track, Lacrosse, and water polo. The total sample size was n = 30 (10 soccer, 10 track, 5 Lacrosse, and 5 water polo). A one-way analysis of variance was used to compare the population mean levels for the four sports. The sum of squared errors is SS Error = 100. What is the value of the Mean Square Error (MS Error)? A) 10 B) 3.45 C) 3.85

C The MS Error is equal to the SS Error divided by the Error degrees of freedom (which are equal to the total sample size minus the number of group levels: 30 − 4 = 26). Therefore, the answer is 100/26 = 3.85

Based on her past experience, a professor knows that the probability distribution for X = number of students who come to her office hours on Wednesday is given above. What is the value of the cumulative probability distribution at 2, i.e. P(X ≤ 2)? A) 0.50 B) 0.70 C) 0.80 D) 0.90

C The cumulative distribution gives the probability of each outcome plus all previous outcomes. From the cumulative probability table you can see that the cumulative probability P(X ≤ 2) is 0.80

Which one of the following choices describes a problem for which an analysis of variance would be appropriate? A) Comparing the proportion of successes for three different treatments of anxiety. Each treatment is tried on 100 patients. B) Analyzing the relationship between high school GPA and college GPA. C) Comparing the mean birth weights of newborn babies for three different racial groups. D) Analyzing the relationship between gender and opinion about capital punishment (favor or oppose).

C The dependent (response) variable needs to be continuous and the different levels of the independent variable need to be mutually exclusive and categorical. This leads to the correct answer of mean birth weights (continuous response) across three racial groups (mutually exclusive, categorical).

The probability distribution shown above is for the random variable X = number of classes for which full time students at a university are enrolled in a semester. What is the expected value of courses taken per student? A) 4 B) 5 C) 5.2 D) 5.5

C The expected value is found by taking each outcome times its respective probability and then summing these products. The expected value, then, is (4)(0.2) + (5)(0.5) + (6)(0.2) + (7)(0.1) = 5.2 classes
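The sum of outcome-times-probability products can be sketched as below, using the four outcomes and probabilities that appear in the answer. The dictionary name is illustrative.

```python
# outcome (number of classes) -> probability, as used in the answer above
distribution = {4: 0.2, 5: 0.5, 6: 0.2, 7: 0.1}

# A valid probability distribution must sum to 1.
total_probability = sum(distribution.values())

# E(X) = sum of x * P(X = x)
expected_classes = sum(x * p for x, p in distribution.items())
print(round(total_probability, 10), round(expected_classes, 10))
```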

Suppose that a 95% confidence interval for the proportion of firstyear students at a school who played in intramural sports is 35% plus or minus 5%. The confidence level for the confidence interval is A) 5% B) 35% C) 95%

C The level of confidence is 95%

equally likely. This is repeated for 200 trials. The null hypothesis is that the subject is guessing, while the alternative is that the subject has ESP and can guess at higher than the chance rate. What is the correct statement of the null hypothesis that the person does not have ESP? A) p = 0.5 B) p = 4/200 C) p = 1/4 D) p > 1/4

C The null hypothesis refers to no difference or "equal to", so Ho is p = 1/4, since each of the four cards is equally likely to be drawn.

A study compared grade point averages (GPA) among students in 4 different majors (English, History, Statistics, and Art) using analysis of variance. A total sample size of 20 students (5 in each major) was studied. What are the numerator and denominator degrees of freedom for the ANOVA F-test? A) 5 for numerator and 20 for denominator. B) 4 for numerator and 79 for denominator. C) 3 for numerator and 16 for denominator.

C The numerator degrees of freedom are found by taking the number of group levels minus 1 (this case 4 − 1 = 3) and the denominator degrees of freedom are found by taking the total sample size minus the number of group levels (20 − 4 = 16)

A study compared testosterone levels among athletes in four sports: soccer, track, Lacrosse, and water polo. The total sample size was n = 30 (10 soccer, 10 track, 5 Lacrosse, and 5 water polo). A one-way analysis of variance was used to compare the population mean levels for the four sports. What are the numerator and denominator degrees of freedom for the ANOVA F-test? A) 10 for numerator and 30 for denominator. B) 3 for numerator and 29 for denominator. C) 3 for numerator and 26 for denominator.

C The numerator degrees of freedom are found by taking the number of group levels minus 1 (this case 4 − 1 = 3) and the denominator degrees of freedom are found by taking the total sample size minus the number of group levels (30 − 4 = 26)

A study compared grade point averages (GPA) for students in a class: students were divided by 6 locations where they usually sat during lecture (i.e. left or right front, left or right center, left or right rear). A total sample size of 12 students was studied (2 students from each section) using one-way analysis of variance. What are the numerator and denominator degrees of freedom for the ANOVA F-test? A) 6 for numerator and 12 for denominator. B) 5 for numerator and 11 for denominator. C) 5 for numerator and 6 for denominator.

C The numerator degrees of freedom are found by taking the number of group levels minus 1 (this case 6 − 1 = 5) and the denominator degrees of freedom are found by taking the total sample size minus the number of group levels (12 − 6 = 6)
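The degrees-of-freedom rule used in the ANOVA cards above can be sketched as one small function. The name `anova_df` is made up for illustration. The last lines apply the same error degrees of freedom to the Mean Square Error card (SS Error = 100 with 4 sports and n = 30).

```python
def anova_df(num_groups: int, total_n: int) -> tuple[int, int]:
    """Return (numerator df, denominator df) for a one-way ANOVA F-test."""
    return num_groups - 1, total_n - num_groups

print(anova_df(4, 20))   # four majors, 20 students
print(anova_df(4, 30))   # four sports, 30 athletes
print(anova_df(6, 12))   # six seating locations, 12 students

# MS Error = SS Error / error (denominator) degrees of freedom
ss_error = 100
ms_error = ss_error / anova_df(4, 30)[1]
print(round(ms_error, 2))
```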

A store manager is trying to decide whether to price oranges by weight, with a fixed cost per pound, or by the piece, with a fixed cost per orange. He is concerned that customers will choose the largest ones if there is a fixed price per orange. For one week the oranges are priced by the piece rather than by weight, and during this time the mean weight of the oranges purchased is recorded for all customers who buy 4 of them. The manager knows the population of weights of individual oranges is bell shaped with mean of 8 ounces and a standard deviation of 1.6 ounces. If the 4 oranges each customer chooses are equivalent to a random sample, what should be the approximate mean and standard deviation of the distribution of the mean weight of 4 oranges? A) mean = 32 ounces, standard deviation = 6.2 ounces B) mean = 8 ounces, standard deviation = 1.6 ounces C) mean = 8 ounces, standard deviation = 0.8 ounces D) mean = 2 ounces, standard deviation = 0.4 ounces

C The sampling distribution of the sample mean has a mean equal to the population mean (8 ounces) and a standard deviation of σ/√n = 1.6/√4 = 1.6/2 = 0.8 ounces
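A minimal sketch of this calculation follows. The helper name `sd_of_sample_mean` is hypothetical, and the code assumes, as the question does, that the 4 oranges behave like a random sample.

```python
from math import sqrt

def sd_of_sample_mean(sd: float, n: int) -> float:
    """Standard deviation (standard error) of the mean of n independent observations."""
    return sd / sqrt(n)

population_mean = 8.0   # ounces
population_sd = 1.6     # ounces
n = 4                   # oranges per customer

# The mean of the sample mean equals the population mean;
# its spread shrinks by a factor of sqrt(n).
mean_of_sample_mean = population_mean
sd_sample_mean = sd_of_sample_mean(population_sd, n)
print(mean_of_sample_mean, sd_sample_mean)
```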

Suppose that a polling organization surveys n = 400 people about whether they think the federal government should give financial aid to the airlines to help them avoid bankruptcy. In the poll, 300 people say that the government should provide aid to the airlines. Which choice gives the correct notation and value for the sample proportion, p̂, in this survey? A) p̂ = 0.30 B) p = 0.30 C) p̂ = 0.75 D) p = 0.75

C The sample proportion, p̂, would equal 300/400 = 0.75

The designated level (typically set at 0.05) to which the p-value is compared in order to decide whether the null hypothesis is rejected or not is called a A) statistically significant result. B) test statistic. C) significance level.

C The significance level, symbolized by α, is the value to which the p-value is compared

Heights for a sample of n = 4 women are measured. For the sample, the mean is 64 inches and the standard deviation is 3 inches. What is the standard error of the mean? A) 3/8 B) 0.75 C) 1.5 D) 3

C The standard error equals s/√n = 3/√4 = 3/2 = 1.5

For a random sample of 9 women, the average resting pulse rate is x̄ = 76 beats per minute, and the sample standard deviation is s = 5. The standard error of the sample mean is A) 0.557 B) 0.745 C) 1.667 D) 2.778

C The standard error equals s/√n = 5/√9 = 5/3 = 1.667

The table above shows the opinions of 321 respondents from the General Social Survey by whether they owned a gun (or not) and whether they favored (or opposed) a law requiring a permit to own a gun. What is the correct alternative hypothesis for this survey? A) There is no relationship in the population between gun ownership and opinion regarding gun law permit B) There is no relationship in the sample between gun ownership and opinion regarding gun law permit C) There is a relationship in the population between gun ownership and opinion regarding gun law permit D) There is a relationship in the sample between gun ownership and opinion regarding gun law permit

C There is a relationship in the population between gun ownership and opinion regarding gun law permit

From our Class Survey, 52% of the students reported having tried marijuana, and 24% of students reported that they had tried marijuana and still smoke marijuana. What is the probability that a student still smokes marijuana given that the student has tried marijuana? A) 0.125 B) 0.76 C) 0.46

C This is a conditional probability question, meaning you need to find P(Still smokes | Tried marijuana) = P(Still and Tried)/P(Tried) = 0.24/0.52 = 0.46 or 46%.
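The conditional probability rule P(A | B) = P(A and B)/P(B) can be sketched with the two survey percentages. Variable names are illustrative.

```python
p_tried = 0.52            # P(tried marijuana)
p_tried_and_still = 0.24  # P(tried and still smokes)

# P(still smokes | tried) = P(tried and still) / P(tried)
p_still_given_tried = p_tried_and_still / p_tried
print(round(p_still_given_tried, 2))
```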

A counselor wants to show that for men who are married by the time they are 30, μ = average age when the men are married is not 21 years old. A random sample of 10 men who were married by age 30 showed an average age at marriage of 22.2, with a sample standard deviation of 1.9 years. Assume that the age at which this population of men gets married for the first time is normally distributed. Using the t-table, what is the approximate p-value? A) p-value ≈ 0.022 B) p-value ≈ 0.043 C) p-value ≈ 0.076

C This is a one-sample t-test, so the test statistic t is found by taking the difference between the sample mean (22.2) and the hypothesized mean (21) and dividing by the standard error of the mean (s/√n = 1.9/√10 = 0.601). The t-value is then t = 1.2/0.601 = 2.00. With degrees of freedom of 10 − 1 = 9, the t-table gives a one-sided p-value between 0.025 and 0.050, but since Ha is two-sided (i.e. ≠) we need to double this p-value to get a final range between 0.050 and 0.100. The p-value of 0.076 falls in this range.
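The table reasoning can be sketched in code. The two critical values below are the usual one-tail entries for 9 degrees of freedom in a standard t-table (1.833 for 0.05 and 2.262 for 0.025); they are typed in by hand here rather than computed.

```python
from math import sqrt

x_bar, mu0 = 22.2, 21   # sample mean and hypothesized mean
s, n = 1.9, 10          # sample SD and sample size

se = s / sqrt(n)        # standard error of the mean
t = (x_bar - mu0) / se  # one-sample t statistic
df = n - 1

# One-tail critical values for df = 9, copied from a standard t-table
t_crit_05, t_crit_025 = 1.833, 2.262

# If t falls between them, the one-sided p-value is between 0.025 and 0.05,
# so the two-sided p-value is between 0.05 and 0.10.
between_table_rows = t_crit_05 < t < t_crit_025
print(round(se, 3), round(t, 2), df, between_table_rows)
```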

A result is called statistically significant whenever A) the null hypothesis is true. B) the alternative hypothesis is true. C) the p-value is less than or equal to the significance level. D) the p-value is larger than the significance level.

C When we reject Ho we have a result that is statistically significant, and we reject Ho when the p-value is less than or equal to the level of significance, α (which is usually 0.05).

For the variable "Time spent watching TV in Typical Day," the results of a two-sample t-procedure that compares a random sample of men and women at a college are shown above. Which of the following is the correct conclusion about these results using a 5% significance level? A) The mean TV watching times of men and women at the college are equal. B) There is a statistically significant difference between the mean TV watching times of men and women at the college. C) There is not a statistically significant difference between the mean TV watching times of men and women at the college. D) There is not enough information to judge statistical significance here.

C With a p-value of 0.14, which is greater than 0.05, we would not reject Ho and conclude that there is not a statistically significant difference between the mean TV watching times of men and women at the college.

Which statement is true about p and p̂? A) They are both parameters. B) They are both statistics. C) p is a parameter and p̂ is a statistic. D) p̂ is a parameter and p is a statistic.

C p̂ is a statistic and p is a parameter

A national polling organization wants to estimate the percentage of all teenagers who believe social security will 'be there' for them. The organization surveys a random sample of 1500 teenagers, and 37% of this sample says that they believe social security will 'be there' for them. In this survey, what is the population of interest? A) The 1500 teenagers who were surveyed. B) Teenagers who believe social security will 'be there' for them. C) All teenagers. D) The people in the sample who believe social security will 'be there' for them.

C) All teenagers would be the population of interest

For a survey of American diets a random sample of 1000 people were contacted. Of the 1000 people, 340 people completed the questionnaire. The results of this study, if applied to all Americans, are questionable because of A) a large margin of error. B) selection bias. C) nonresponse bias. D) response bias

C) With 660 people not completing the questionnaire the results would be questionable due to nonresponse bias

We want to test for a relationship between race and employment status (employed or unemployed)

Chi-square test. Race and employment status are both categorical

The z* multiplier for a 99% confidence interval is A) 1.65 B) 1.96 C) 2.33 D) 2.58

D

What procedure is used to test whether or not three or more population means are equal? A) Analysis of correlation B) 3-sample t-test C) Chi-square test D) Analysis of variance

D ANOVA is used to test the equality of 3 or more population means

A five-number summary for a data set is 35, 50, 60, 70, 90. About what percent of the observations are between 35 and 90? A) 25% B) 50% C) 95% D) 100%

D All, or 100%, would lie between the minimum and maximum

Which of the following statements best describes the relationship between a parameter and a statistic? A) A parameter has a sampling distribution with the statistic as its mean. B) A parameter has a sampling distribution that can be used to determine what values the statistic is likely to have in repeated samples. C) A parameter is used to estimate a statistic. D) A statistic is used to estimate a parameter.

D An underlying theme of statistics is to use statistics to estimate a parameter

An outlier is a data value that A) is larger than 1 million. B) equals the minimum value in a set of data. C) equals the maximum value in a set of data. D) is not consistent with the bulk of the data.

D Feedback: An outlier is inconsistent with the bulk of the data

Which one of the following probability statements would represent a cumulative probability? A) The probability that there are exactly 4 people with Type O+ blood in a sample of 10 people. B) The probability of exactly 3 heads in 6 flips of a coin. C) The probability that the accumulated annual rainfall in a certain city next year, rounded to the nearest inch, will be 18 inches. D) The probability that a randomly selected woman's height is 67 inches or less

D Feedback: Cumulative probability implies "from the beginning to some point", or in other words the probability from some value or less.

Students in a statistics class were asked, "With whom do you find it easier to make friends: person of the same sex, person of opposite sex, or no preference?" A table summarizing the responses by gender is given below. Results for a chi-square test for these data were: Chi-Sq = 7.15 DF = 2 P-value = 0.028. Assume these students represent a random sample of all students. Based on the chi-square test, what conclusion can be made about the relationship between gender and response to the question about friends, using α = 0.05? In the population: A) The relationship is not statistically significant so there is no relationship. B) The relationship is not statistically significant so there is a relationship. C) The relationship is statistically significant so there is no relationship. D) The relationship is statistically significant so there is a relationship

D Feedback: Since the p-value is less than 0.05 we conclude that there is a statistically significant relationship
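
The reported p-value can be checked by hand in this case. For a chi-square variable with exactly 2 degrees of freedom, the upper-tail probability has the closed form exp(-x/2). The sketch below uses that fact; the function name is only illustrative.

```python
import math

def chi2_upper_tail_df2(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)

alpha = 0.05
p_value = chi2_upper_tail_df2(7.15)   # Chi-Sq = 7.15, DF = 2 from the question
significant = p_value < alpha

print(round(p_value, 3), significant)
```

The closed form only holds for DF = 2; other degrees of freedom need a table or a statistics library.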

A reviewer rated a sample of fifteen wines on a score from 1 (very poor) to 7 (excellent). A correlation of 0.92 was obtained between these ratings and the cost of the wines at a local store. In plain English, this means that A) in general, the reviewer liked the cheaper wines better. B) having to pay more caused the reviewer to give a higher rating. C) wines with low ratings are likely to be more expensive (probably because fewer will be sold). D) in general, as the cost went up so did the rating

D Feedback: Since the correlation is positive this means that as the cost went up so did the rating

Which of the following statements is correct about a parameter and a statistic associated with repeated random samples of the same size from the same population? A) Values of a parameter will vary from sample to sample but values of a statistic will not. B) Values of both a parameter and a statistic may vary from sample to sample. C) Values of a parameter will vary according to the sampling distribution for that parameter. D) Values of a statistic will vary according to the sampling distribution for that statistic.

D Feedback: The population parameter does not vary. The values of the statistic, however, vary from sample to sample according to the sampling distribution of that statistic

Students who live in the dorms at a college get free T.V. service in their rooms, but only receive 6 stations. On a certain evening, a student wants to watch T.V. and the six stations are broadcasting separate shows on baseball, football, basketball, local news, national news, and international news. The student is too tired to check which channels the shows are playing on, so the student picks a channel at random. The two events F = {the student watches football} and A = {the student watches an athletic event} are A) independent events. B) disjoint (mutually exclusive) events. C) each simple events. D) None of the above.

D None of the descriptions are accurate. The two events are not mutually exclusive because they share an outcome: football is the only outcome in F and is also one of the three outcomes in A (along with baseball and basketball). They are not independent either, because knowing that the student is watching an athletic event changes the probability of football: P(F) = 1/6, but P(F given A) = 1/3. Finally, the event "watching an athletic event" has more than one outcome and therefore is not a simple event.
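
A small sketch of the same reasoning, treating the six shows as equally likely outcomes. The variable names are illustrative, not part of the original question.

```python
from fractions import Fraction

shows = ["baseball", "football", "basketball",
         "local news", "national news", "international news"]
F = {"football"}                              # student watches football
A = {"baseball", "football", "basketball"}    # student watches an athletic event

def prob(event):
    """Probability of an event when all six channels are equally likely."""
    return Fraction(len(event), len(shows))

p_f = prob(F)                                   # 1/6
p_f_given_a = prob(F & A) / prob(A)             # 1/3
disjoint = len(F & A) == 0                      # False: football is in both
independent = prob(F & A) == prob(F) * prob(A)  # False: 1/6 != 1/12

print(p_f, p_f_given_a, disjoint, independent)
```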

Which one of these variables is a categorical variable? A) Number of ear pierces a person has B) Height of a person C) Weight of a person D) Opinion about legalization of marijuana

D Number of ear pierces is quantitative, as are height and weight. Opinions, by themselves, are just opinions. They become categorical (qualitative) variables once they are classified (e.g. negative, positive, neutral).

A medical treatment has a success rate of 0.8. Two patients will be treated with this treatment. Assuming the results are independent for the two patients, what is the probability that neither one of them will be successfully cured? A) 0.5 B) 0.36 C) 0.2 D) 0.04

D Probability of cured is 0.8 so probability of not cured is 0.2. Independence means multiply so probability that both are not cured is 0.2*0.2 = 0.04

The table above summarizes, by gender of respondent, the responses from 1,033 people to the question, "Do you smoke?" What are the odds of smoking (to not smoking) for a man? A) 0.14 B) 0.32 C) 0.45 D) 0.47

D Recall that odds compare the number in the group of interest to the number not in the group of interest: 142/302 ≈ 0.47
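
The table is not reproduced here, so the sketch below assumes the counts quoted in the explanation (142 men who smoke, 302 who do not). It contrasts odds with risk, since the two are easy to confuse.

```python
smokers = 142       # men who smoke (count taken from the explanation above)
non_smokers = 302   # men who do not smoke

odds = smokers / non_smokers               # smokers per non-smoker
risk = smokers / (smokers + non_smokers)   # proportion of men who smoke

print(round(odds, 2), round(risk, 2))
```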

When a representative sample is selected but respondents give answers that are different from their true opinions, the problem is called A) lack of accuracy. B) selection bias. C) nonresponse bias. D) response bias.

D Response bias is the result of respondents providing answers that are not necessarily their true opinions but instead are opinions they believe the researcher wants to hear or when respondents are afraid to say their true thoughts

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.40 Ha: ρ > 0.40 Use the Standard Normal Table to calculate the p-value for this hypothesis test. For z-statistic = -1.50 the p-value is: A) 0.0668 B) 0.1469 C) 0.8531 D) 0.9332

D Since Ha is ">" we find the p-value by P(Z > z) where z is the z-statistic: P(Z > -1.50) = 0.9332

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.70 Ha: ρ ≠ 0.70 Use the Standard Normal Table to calculate the p-value for this hypothesis test. For z-statistic = 0.40 the p-value is: A) 0.3446 B) 0.4000 C) 0.6554 D) 0.6892

D Since Ha is "≠" we find the p-value by twice P(Z > |z|) where z is the z-statistic: 2 × 0.3446 = 0.6892
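
As a rough check on the table lookups, the standard normal CDF can be written with the error function from Python's standard library. The helper names and the `alternative` labels below are illustrative choices, not anything the course prescribes.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def p_value(z, alternative):
    """p-value for a z-statistic under the given alternative hypothesis."""
    if alternative == "greater":      # Ha uses ">"
        return 1 - normal_cdf(z)
    if alternative == "less":         # Ha uses "<"
        return normal_cdf(z)
    if alternative == "two-sided":    # Ha uses "≠"
        return 2 * (1 - normal_cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

print(round(p_value(0.40, "two-sided"), 4))
```

Because the two-sided case uses |z|, a z-statistic of 0.40 and one of -0.40 give the same p-value.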

analysis of variance. A total sample size of 20 students (5 in each major) was studied. The p-value for the F-test is 0.013. If the significance level, α, is 0.05, what is the conclusion from the analysis of variance? A) The null hypothesis is not rejected; the population means are not significantly different. B) The null hypothesis is not rejected; the population means are significantly different. C) The null hypothesis is rejected; the population means are not significantly different. D) The null hypothesis is rejected; the population means are significantly different. E) The null hypothesis is not rejected; the sample means are not significantly different. F) The null hypothesis is not rejected; the sample means are significantly different. G) The null hypothesis is rejected; the sample means are not significantly different. H) The null hypothesis is rejected; the sample means are significantly different.

D Since the p-value is less than α = 0.05, we would reject the null hypothesis that all 4 group population means are equal and conclude that these population means are significantly different, i.e. they are not all equal.

In a newspaper article about whether the regular use of Vitamin C reduces the risk of getting a cold, a researcher is quoted as saying that Vitamin C performed better than placebo in an experiment, but the difference was not larger than what could be explained by chance. In statistical terms, the researcher is saying the results are _______ A) due to nonsampling errors. B) definitely due to chance. C) statistically significant. D) not statistically significant.

D Since the difference could be due to chance, the outcome is not statistically significant.

Suppose a 95% confidence interval for ρ, the proportion of drivers who admit that they sometimes run red lights when no one is around, is 0.29 to 0.38. Which of the following statements is false? A) A test of Ho: ρ = 0.3 versus Ha: ρ ≠ 0.3 would not be rejected using α = 0.05. B) A test of Ho: ρ = 0.5 versus Ha: ρ ≠ 0.5 would be rejected using α = 0.05. C) It is plausible that about 37% of all drivers would admit that they sometimes run red lights when no one is around. D) It is plausible that a majority of all drivers would admit that they sometimes run red lights when no one is around.

D Since the interval covers 0.3 we would not reject Ho: ρ = 0.3. Since the interval does not cover 0.5 we would reject Ho: ρ = 0.5. Since confidence intervals are primarily used to estimate true parameter values, it is plausible that about 37% of all drivers would admit that they sometimes run red lights when no one is around. However, the whole interval lies below 0.5, so we could not say that a majority would.
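
The rule used above (a null value inside the 95% interval is not rejected at α = 0.05, one outside is rejected) can be sketched as a simple membership check. The function name is hypothetical, and the rule is the rule of thumb the explanation applies rather than an exact equivalence for every test.

```python
def rejects_two_sided(null_value, ci_low, ci_high):
    """True if the null value falls outside the confidence interval."""
    return not (ci_low <= null_value <= ci_high)

ci_low, ci_high = 0.29, 0.38   # 95% interval from the question

reject_030 = rejects_two_sided(0.30, ci_low, ci_high)   # statement A
reject_050 = rejects_two_sided(0.50, ci_low, ci_high)   # statement B
majority_plausible = ci_high > 0.50                     # statement D

print(reject_030, reject_050, majority_plausible)
```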

If the size of a randomly selected sample from a population is increased from n = 100 to n = 400, then the standard deviation of ρ-hat will A) remain the same. B) increase by a factor of 4. C) decrease by a factor of 4. D) decrease by a factor of 2.

D Since the standard error involves dividing by the square root of n, and here n was increased fourfold, the resulting standard error would decrease by a factor of √4 = 2
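
A quick numeric illustration of the square-root effect. The question gives no value for the proportion, so p = 0.5 below is an arbitrary assumption; the ratio is the same for any p strictly between 0 and 1.

```python
import math

def se_proportion(p, n):
    """Standard deviation of the sample proportion for sample size n."""
    return math.sqrt(p * (1 - p) / n)

p = 0.5   # assumed value, not given in the question
ratio = se_proportion(p, 100) / se_proportion(p, 400)

print(ratio)
```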

A magazine printed a survey in its monthly issue and asked readers to fill it out and send it in. Over 1000 readers did so. This type of sample is called a A) cluster sample. B) stratified sample. C) simple random sample. D) self-selected sample

D Since readers decided for themselves whether to fill out and return the survey, the sample is a self-selected (volunteer) sample

A shopper wanted to test whether there was a difference in the average waiting times at the checkout counter among 5 different supermarkets. She selected a random sample of 20 shoppers from each of the five supermarkets. What is the alternative hypothesis for this situation? A) The average waiting time to check out is 25 minutes for all five supermarkets. B) The average waiting time to check out is the same for all five supermarkets. C) The average waiting time for each of the 100 shoppers is different. D) The average waiting time to check out is not the same for all five supermarkets

D The alternative hypothesis for an ANOVA test is that the means are not all equal, i.e. at least one mean differs from the others.

A student wanted to test whether there was a difference in the mean daily hours of study for students living in four different dormitories. She selected a random sample of 50 students from each of the four dormitories. What is the alternative hypothesis for this situation? A) The mean daily hours of study is 3 hours for each dormitory. B) The mean daily hours of study is the same for each dormitory. C) The mean daily hours of study is different for each of the 200 students in the sample. D) The mean daily hours of study is not the same for all four dormitories.

D The alternative hypothesis for an ANOVA test is that the means are not all equal, i.e. at least one mean differs from the others.

Researchers want to see if men have a higher blood pressure than women do. A study is planned in which the blood pressures of 50 men and 50 women will be measured. What is the most appropriate alternative hypothesis about the means of the men and women? A) The sample means are the same. B) The sample mean will be higher for men. C) The population means are the same. D) The population mean is higher for men than for women

D The alternative hypothesis, Ha, would indicate that there is a difference and that this difference would take place in the population. From the responses provided, the only alternative that indicates a difference in the population is the response that the population mean is higher for men than for women. The sample is used to test for this population difference

Suppose that a difference between two groups is examined. In the language of statistics, the alternative hypothesis is a statement that there is __________ A) no difference between the groups for the samples. B) a difference between the groups for the samples. C) no difference between the groups for the populations. D) a difference between the groups for the populations.

D The alternative hypothesis, Ha, would indicate that there is a difference and that this difference would take place in the population. The sample is used to test for this population difference.

The probability distribution for X = number of heads in 4 tosses of a fair coin is given in the table above. What is the value of the cumulative distribution function at 3, i.e. P(X ≤ 3)? A) 6/16 B) 10/16 C) 11/16 D) 15/16

D The cumulative distribution means to find the probability for each outcome plus all previous outcomes. The cumulative probability P(X ≤ 3) is 1/16 + 4/16 + 6/16 + 4/16 = 15/16

A regression was done for 20 cities with latitude as the explanatory variable (x) and average January temperature as the response variable (y). The latitude is measured in degrees and average January temperature in degrees Fahrenheit. The latitudes ranged from 26 (Miami) to 47 (Duluth). The regression equation is y = 49.4 - 0.313x. The city of Miami, Florida has latitude 26 degrees with average January temperature of 67 degrees Fahrenheit. 1. What is the estimated average January temperature for Miami, and 2. based on the regression equation, what is the residual? A) Estimated January temperature is 36.88 and the residual is 11.88 B) Estimated January temperature is 36.88 and the residual is 11.88 C) Estimated January temperature is 41.3 and the residual is -25.7 D) Estimated January temperature is 41.3 and the residual is 25.7

D The estimated January temperature is 49.4 - 0.313(26) ≈ 41.3 degrees Fahrenheit. The residual is 67 - 41.3 = 25.7 degrees (residual = observed minus predicted)
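
The arithmetic can be sketched as below. It assumes the slope is negative, i.e. y = 49.4 - 0.313x, which is the only reading that reproduces the stated estimate of 41.3; the function name is illustrative.

```python
def predict_january_temp(latitude):
    """Predicted average January temperature (°F), assuming y = 49.4 - 0.313x."""
    return 49.4 - 0.313 * latitude

observed = 67                            # Miami's actual average January temperature
predicted = predict_january_temp(26)     # Miami's latitude is 26 degrees
residual = observed - predicted          # residual = observed minus predicted

print(round(predicted, 1), round(residual, 1))
```

A positive residual means the observed value sits above the regression line, so Miami is warmer than its latitude alone would predict.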

Based on the 2000 Census, the proportion of the California population aged 15 years old or older who are married is ρ = 0.524. Suppose n = 1000 persons are to be sampled from this population and the sample proportion of married persons (ρ-hat) is to be calculated. What is the mean of the sampling distribution of ρ-hat? A) 0.0158 B) 0.0166 C) 0.2494 D) 0.5240

D The mean of the sampling distribution of ρ-hat equals the population proportion, 0.524
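
For reference, a short sketch of both the mean and the standard deviation of the sampling distribution of the sample proportion, using the values from the question. The standard deviation formula √(p(1 - p)/n) is the usual one for a sample proportion.

```python
import math

p = 0.524   # population proportion from the question
n = 1000    # sample size

mean_p_hat = p                          # the sampling distribution is centered at p
sd_p_hat = math.sqrt(p * (1 - p) / n)   # standard deviation of the sample proportion

print(mean_p_hat, round(sd_p_hat, 4))
```

The standard deviation comes out near 0.0158, which matches distractor A, so that option appears to be the standard deviation rather than the mean.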

Based on the 2000 Census, 31.8% of grandparents in California are the primary caregivers for their grandchildren. Suppose n = 1000 persons are to be sampled from this population and the sample proportion of grandparents as primary caregivers (ρ-hat) is to be calculated. What is the mean of the sampling distribution of ρ-hat? A) 0.0002 B) 0.0147 C) 0.2169 D) 0.3180

D The mean of the sampling distribution of ρ-hat is 31.8% or 0.3180

A company has 500 employees and would like to select a simple random sample of 25 of them for a study. Of the following, only one fits the definition of a simple random sample. Which one is it? A) Randomly choose one person whose last name begins with A, one with B, and so on, omitting X because it's least common. B) Randomly choose 25 pages from the employee directory, then choose the first person listed on each of those pages. C) Number the employees from 1 to 500 based on seniority and randomly choose one person from the first 20 names on the list, one from the next 20, and so on. D) Number the employees from 1 to 500 in random order and choose the first 25 names on the list.

D The only random sampling defined by the given responses is where all the employees are randomly ordered from 1 to 500 and then the first 25 are picked. The random ordering gives everyone an equal chance of being in the top 25

A pair of dice is rolled. What is the probability that the sum of the two dice from this roll is two? A) 1/6 B) 1/3 C) 1/4 D) 1/36

D The only way to get a sum of 2 is to roll two ones. The probability of rolling a one is 1/6. The probability of rolling a 1 two times is (1/6)*(1/6) = 1/36

The primary purpose of a significance test is to A) estimate the p-value of a sample. B) estimate the p-value of a population. C) decide whether there is enough evidence to support a research hypothesis, Ha, about a sample D) decide whether there is enough evidence to support a research hypothesis, Ha, about a population.

D The primary purpose of confidence intervals is to estimate a population parameter. Hypothesis tests are primarily used to determine if enough evidence has been presented to reject a null hypothesis and thus support Ha, the research hypothesis. Since Ho and Ha always refer to the population (i.e. use parameter notation) we are interested in supporting Ha about a population.

The probability distribution for X = number of heads in 4 tosses of a fair coin is given in the table above. What is the probability of getting at least one head? A) 1/16 B) 4/16 C) 5/16 D) 15/16

D The probability of getting at least one head is P(X ≥ 1), which is the sum of the probabilities of getting 1, 2, 3 or 4 heads, or 15/16 [note that these events are mutually exclusive, i.e. you cannot at the same time get 2 heads and 3 heads]. Alternatively, you could apply the complement rule, where the complement is getting fewer than 1 head, i.e. no heads (all tails). This probability is 1/16, so by the complement rule P(X ≥ 1) = 1 - P(X < 1) = 1 - 1/16 = 15/16.
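
The table itself is not reproduced here, but it can be rebuilt from the binomial formula. This sketch uses exact fractions so the results match the sixteenths in the answer choices.

```python
from fractions import Fraction
from math import comb

n = 4                  # number of tosses
p = Fraction(1, 2)     # probability of heads for a fair coin

# P(X = k) for k = 0, 1, 2, 3, 4 heads
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

p_at_least_one = 1 - pmf[0]                      # complement rule
p_at_most_three = sum(pmf[k] for k in range(4))  # cumulative probability P(X <= 3)

print(p_at_least_one, p_at_most_three)
```

Both results are 15/16, for different reasons: one leaves out only "no heads" and the other leaves out only "four heads", and each excluded outcome has probability 1/16.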

From a Class Survey, approximately 32% of the students responding said that they have driven a vehicle while under the influence of drugs or alcohol. If you got into a car with two students from this class, what is the probability that both of them would have previously driven while under the influence? A) 64% B) 32% C) 22% D) 10%

D Assuming the two students are independent, the probability that both have driven while under the influence is 0.32 times 0.32, which equals 0.1024, or about 10%

Sara is a frequent business traveler. For security purposes, 10% of all people boarding airplanes are randomly selected for additional screening just prior to boarding. What is the probability that the first time Sara is selected for screening is on her third flight? A) (0.1)(0.1) B) (0.9)(0.9) C) (0.1)(0.1)(0.9) D) (0.9)(0.9)(0.1)

D The probability would be equal to P(not selected on first flight) times P(not selected on second flight) times P(selected on third flight) or (0.9)(0.9)(0.1)
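
The same pattern (a run of failures followed by the first success) generalizes to any flight number. A minimal sketch, assuming each screening decision is independent; the function name is illustrative.

```python
def first_selected_on(flight_number, p_selected=0.10):
    """Probability that the first selection happens on the given flight."""
    p_not_selected = 1 - p_selected
    return p_not_selected ** (flight_number - 1) * p_selected

p_third = first_selected_on(3)   # (0.9)(0.9)(0.1)

print(round(p_third, 3))
```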

For a random sample of 10 men, the mean head circumference is x = 57.3 cm and the sample standard deviation is s = 2 cm. The standard error of the sample mean is A) 0.200 B) 0.447 C) 0.500 D) 0.632

D The standard error equals s/√n = 2/√10 = 2/3.16 = 0.632
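
A one-line check of the calculation, with illustrative variable names.

```python
import math

s = 2      # sample standard deviation in cm
n = 10     # sample size

standard_error = s / math.sqrt(n)

print(round(standard_error, 3))
```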

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.10 Ha: ρ ≠ 0.10 If the sample size n = 500 and sample proportion ρ-hat = 0.20, then the z-statistic is: A) -7.45 B) 5.59 C) 5.59 D) 7.45

D The z-statistic is found by (ρ-hat - ρ)/√[ρ(1 - ρ)/n], where ρ is the null value: (0.20 - 0.10)/√[(0.10)(0.90)/500] ≈ 7.45
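
The formula can be sketched as a small function. The standard error uses the null value of the proportion, not the sample proportion; the function and argument names are illustrative.

```python
import math

def z_statistic(p_hat, p_null, n):
    """One-proportion z-statistic, with the standard error computed under Ho."""
    standard_error = math.sqrt(p_null * (1 - p_null) / n)
    return (p_hat - p_null) / standard_error

z = z_statistic(p_hat=0.20, p_null=0.10, n=500)

print(round(z, 2))
```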

Suppose that a 95% confidence interval for the proportion of first-year students at a school who played in intramural sports is 35% plus or minus 5%. The 95% confidence interval for the proportion of students playing intramural sports is A) 25% to 45% B) 30% to 35% C) 35% to 40% D) 30% to 40%

D With a margin of error equal to 5% and ρ-hat of 35%, the 95% confidence interval is 35% - 5% = 30% to 35% + 5% = 40%

A sample of n = 200 college students is asked if they believe in extraterrestrial life and 120 of these students say that they do. The data are used to test Ho: ρ = 0.5 versus Ha: ρ > 0.5, where ρ is the population proportion of college students who say they believe in extraterrestrial life. From this sample, the above output was obtained. Using a 5% significance level, what is the correct decision for this significance test? A) Fail to reject the null hypothesis because the p-value is greater than 0.05. B) Fail to reject the null hypothesis because the p-value is less than 0.05. C) Reject the null hypothesis because the p-value is greater than 0.05. D) Reject the null hypothesis because the p-value is less than 0.05.

D With a p-value of 0.002, which is less than 0.05, we would reject the null hypothesis
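
The software output is not shown, but a p-value close to the quoted 0.002 can be reproduced from the counts in the question. This is a sketch under the assumption that the output used the usual normal-approximation z-test for one proportion.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n = 200
believers = 120
p_null = 0.5

p_hat = believers / n                                      # 0.60
z = (p_hat - p_null) / math.sqrt(p_null * (1 - p_null) / n)
p_value = 1 - normal_cdf(z)                                # upper tail, since Ha uses ">"
reject = p_value < 0.05

print(round(z, 2), round(p_value, 3), reject)
```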

The weights of a sample of n = 8 college men will be used to create a 95% confidence interval for the mean weight of all college men. Using the t-table, what is the correct t* multiplier involved in calculating the interval? A) 1.89 B) 2.00 C) 2.31 D) 2.36

D With degrees of freedom equal to n - 1 = 8 - 1 = 7 and looking under confidence level 0.95, t* is 2.36

A hypothesis test is done in which the alternative hypothesis states that more than 10% of a population is left-handed. The p-value for the test is calculated to be 0.25. Which statement is correct? A) We can conclude that more than 10% of the population is left-handed. B) We can conclude that more than 25% of the population is left-handed. C) We can conclude that exactly 25% of the population is left-handed. D) We cannot conclude that more than 10% of the population is left-handed.

D With such a high p-value (greater than 0.05), we would not reject Ho and therefore could not conclude that more than 10% of the population is left-handed

The relative risk of allergies for children of parents who smoke compared to children of parents who don't smoke is 3.0. Suppose that the risk of allergies for the children of nonsmokers is 0.15 (15%). What is the risk of allergies for the children of smokers? A) 3% B) 5% C) 30% D) 45%

D You are given the relative risk (3.0), the baseline group (children of parents who do not smoke, i.e. the "compared to" group), and the risk for this baseline group (15%). Using this information and algebra we solve for the alternate group risk in the equation: relative risk = (alternate group risk)/(baseline risk). Doing so produces 3.0 × 0.15 = 0.45 or 45%

Which of the following is not one of the steps for hypothesis testing? A) Determine the null and alternative hypotheses. B) Verify data conditions and calculate a test statistic. C) Assuming the null hypothesis is true, find the p-value. D) Assuming the alternative hypothesis is true, find the p-value.

D The p-value is found on the presumption that the null hypothesis is true, not the alternative.

Suppose we select a random sample of n = 100 students and find that the proportion of students who said they believe in love at first sight is 0.43. Which statement is not necessarily true? A) There were 43 students in the sample who said they believe in love at first sight. B) Based on the information provided by the sample, we cannot determine exactly what proportion of the population would say they believe in love at first sight. C) ρ-hat = 0.43 D) ρ = 0.43

D ρ = 0.43 is a parameter representing the population proportion which would not necessarily be known just from the data given.

The probability is p = 0.80 that a patient with a certain disease will be successfully treated with a new medical treatment. Suppose that the treatment is used on 40 patients. What is the expected value of the number of patients who are successfully treated? A) 40 B) 20 C) 8 D) 32

D Recognize this as a binomial random variable with n = 40 and p = 0.80, then the expected value is equal to np or 32

The primary purpose of a confidence interval is to:

Estimate the population parameter

A research group wants to show that their new drug performs better in treating a certain medical condition than the current drug. The current drug success rate is 65%. The research group conducted a hypothesis test which produced a p-value of 0.019 from which they could conclude that their new drug has a success rate that is:

Greater than 65%. Since the p-value of 0.019 is less than 0.05, we would reject the null hypothesis and conclude that the new drug has a success rate that is greater than 65%.

A research group wants to show that their new drug performs better in treating a certain medical condition than the current drug. The current drug success rate is 65%. From a random sample of 200 patients, 144 or 72%, had a successful recovery with the new drug. Which of the following would be the correct null and alternative hypotheses?

Ho: p = 0.65 Ha: p > 0.65. From the statement the 65% is the claimed null value and 72% would be the sample proportion value. Since hypothesis statements always include population notation and this is a problem regarding proportions, the correct answer Ho is p = 0.65 and Ha is p > 0.65

We take random samples of professional athletes in baseball, basketball, and football to analyze if mean salaries differ among these groups:

One-way Analysis of Variance (ANOVA). The question wants to compare a quantitative variable (salary) across more than 2 levels (the 3 sports) of one categorical variable (sport).

Is the following percent a statistic or a parameter? 47% of all students at University Park are female.

Parameter

We own a business and want to see if our money spent on advertising each month is helping to increase monthly sales:

Regression. Regression is used to find a relationship between a quantitative dependent variable (sales dollars) and another quantitative variable (adv dollars).

Is the following percent a statistic or a parameter? Of 200 students sampled from the student database at Penn State, 200 or 100% said they would like the university to offer free legal services to students.

Statistic

If the p-value is 0.003 for the Chi-Square Analysis of this data, which of the following is the BEST conclusion?

Statistically significant in the population

A five-number summary for hours studied in a week was 5, 12, 14, 18, and 20. What was the shortest number of hours studied by anyone? A) 5 hours B) 12 hours C) 14 hours D) 18 hours E) 20 hours

a

The z* multiplier for a 90% confidence interval is A) 1.65 B) 1.96 C) 2.33 D) 2.58

a

A survey asked people how often they exceed speed limits. The data are then categorized into the above contingency table of counts showing the relationship between age group and response. What is the relative risk of always exceeding the speed limit for people under 30 compared to people over 30? A) 2.5 B) 0.4 C) 0.5 D) 30%

a Remember that relative risk compares the risk of one group to the risk of another group. Since the wording refers to exceeding the speed limit of "Under 30 to over 30" we compare the under 30 risk (100/200) to the over 30 risk (40/200) = 2.5
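
The contingency table is not reproduced here, so this sketch uses the counts quoted in the explanation (100 of 200 under 30, 40 of 200 over 30). The function name is illustrative.

```python
def relative_risk(cases_a, total_a, cases_b, total_b):
    """Risk in group A divided by risk in group B (the baseline group)."""
    return (cases_a / total_a) / (cases_b / total_b)

# Group A: under 30; group B (baseline): over 30
rr = relative_risk(100, 200, 40, 200)

print(rr)
```

Swapping the groups gives 0.4, which is distractor B, so the order of comparison matters.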

In the survey of a random sample of students at a university, two questions were "How many hours per week do you usually study?" and "Have you smoked marijuana in the past six months?" An analysis of the results produced the above output. A research question of interest is whether students who have smoked marijuana (group 1) in the past 6 months study fewer hours on average per week than those who have not (group 2). Based on the information given in the output, what conclusion can be made about the difference in time spent studying for the two groups? A) There is a statistically significant difference. B) There is not a statistically significant difference. C) It is impossible to know if there is a statistically significant difference because no p-value is provided. D) It is impossible to know if there is a statistically significant difference because the test is one-sided and the information provided is two-sided.

a Since the 95% confidence interval does not contain 0 we would reject the Ho and conclude that there is a statistically significant difference.

The table above shows the counts by gender and highest degree attained for 498 respondents in the General Social Survey. What percent of the sample were males with no high school degree? A) 9.8% B) 20.3% C) 22.6% D) 48.5%

a Take 49 divided by 498 times 100% for 9.8%

Which of the following best describes the standardized (z) score for an observation? A) It is the number of standard deviations the observation falls from the mean. B) It is the most common score for that type of observation. C) It is one standard deviation more than the observation. D) It is the center of the list of scores from which the observation was taken.

a A score is standardized to represent how many standard deviations the observation falls from the mean

A researcher examined the folklore that women can predict the sex of their unborn child better than chance would suggest. She asked 104 pregnant women to predict the sex of their unborn child, and 57 guessed correctly. Using these data, the researcher created the above output. Which choice describes how the p-value was computed in this situation? A) The probability that a z-score would be greater than or equal to 0.98. B) The probability that a z-score would be less than or equal to 0.98. C) The total of the probabilities that a z-score is greater than or equal to 0.98 and less than or equal to -0.98. D) The probability that a z-score would be between -0.98 and 0.98.

a Feedback: Since the alternative hypothesis is Ha: ρ > 0.5 the p-value is calculated by P(Z > 0.98)

Which one of these statistics is unaffected by outliers? A) Interquartile range B) Mean C) Standard deviation D) Range

a Feedback: The Interquartile Range (IQR) is resistant to outliers. IQR is based on Q3 and Q1, both of which depend on the positions of the observations in the ordered data rather than on the size of the extreme values.

Which of the following is not a correct way to state a null hypothesis? A) Ho: B) Ho: μd = 10 C) Ho: μ = 0 D) Ho: μ = 0.5

a Feedback: The null (and alternative) hypotheses need to have a reference to a population, thus simply writing Ho would be incorrect.

A newspaper article reported that "Children who routinely compete in vigorous after-school sports on smoggy days are three times more likely to get asthma than their non-athletic peers." (Sacramento Bee, Feb 1, 2002, p. A1) The newspaper also reported that "The number of children in the study who contracted asthma was relatively small - 265 of 3,535." From this information and the information given in the original quote, which of the following could not be computed? A) The baseline risk of getting asthma without participating in after-school sports. B) The overall risk of getting asthma for the children in this study. C) The relative risk of getting asthma for children who routinely participate in vigorous after-school sports on smoggy days and their non-athletic peers. D) All of the above could be computed.

a Feedback: We couldn't get the baseline risk of getting asthma without participating in after school sports.

When a one-way analysis of variance test is done, what probability distribution is used to find the p-value? A) F-distribution B) normal distribution C) Chi-square distribution D) t-distribution

a Feedback: We use an F-test

A pop quiz in a class resulted in the following eight quiz scores: 0, 60, 66, 78, 82, 96, 98, 100. A five-number summary for these test scores is A) 0, 63, 80, 97, 100. B) 66, 78, 82, 96, 98. C) 0, 66, 82, 98, 100. D) 0, 25, 50, 75, 100.

a Min = 0; Q1 = 63; Med = 80; Q3 = 97; Max = 100
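
A minimal sketch of how this five-number summary can be computed, assuming the convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data. The function name is illustrative, and other quartile conventions can give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves.

    Assumes at least two values. For an odd count, the middle value is
    left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # lower half of the sorted data
    upper = data[-half:]   # upper half of the sorted data
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [0, 60, 66, 78, 82, 96, 98, 100]
summary = five_number_summary(scores)
print(summary)
```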

Consider the above scatterplots for two quantitative variables y and x. One point has been labeled in each plot: Point A and Point B. Which point would produce the larger residual? A) Point A B) Point B

a Point A would have the larger residual as this point would be further from the regression line than would point B. Point B is an influential outlier meaning that the regression line would be "drawn" to it. In that regard point B would be closer to the best fit regression line than point A.

A six-sided die is made that has four Green sides and two Red sides, all equally likely to land face up when the die is tossed. The die is tossed three times. Which of these sequences (in the order shown) has the highest probability? A) Green, Green, Green B) Green, Green, Red C) Green, Red, Red D) They are all equally likely

a Since Green is more likely than Red on every toss, the all-Green sequence is the most likely: (4/6)^3 = 8/27, versus 4/27 for Green, Green, Red and 2/27 for Green, Red, Red.
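
The three sequence probabilities can be compared directly. This is a small sketch assuming independent tosses, with illustrative names; exact fractions avoid rounding.

```python
from fractions import Fraction

# Four of six sides are Green, two are Red
side_prob = {"G": Fraction(4, 6), "R": Fraction(2, 6)}

def sequence_probability(sequence):
    """Multiply the per-toss probabilities (tosses assumed independent)."""
    prob = Fraction(1)
    for color in sequence:
        prob *= side_prob[color]
    return prob

for seq in ("GGG", "GGR", "GRR"):
    print(seq, sequence_probability(seq))
```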

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.40, Ha: ρ > 0.40. Use the Standard Normal Table to calculate the p-value for this hypothesis test. For z-statistic = 2.00 the p-value is: A) 0.0228 B) 0.9332 C) 0.9545 D) 0.9772

a Since Ha is ">" we find the p-value by P(Z > z) where z is the z-statistic. Here P(Z > 2.00) = 1 − 0.9772 = 0.0228.

In a past General Social Survey, 22% of n = 1006 respondents answered yes to the question "Are you a member of any sports groups?" A 95% confidence interval for the population proportion of Americans who belonged to a sports group at that time is 19.4% to 24.6%. Based on these results, you can reasonably conclude that A) less than 50% of all Americans belong to sports clubs. B) more than 50% of all Americans belong to sports clubs. C) 22% of all Americans belong to sports clubs.

a Since confidence intervals are used to estimate the true parameter value, in this problem since the interval does not contain 50% and the limits (or bounds) of the interval are less than 50% we can conclude that the true proportion is less than 50%

The level of significance associated with a significance test is the probability A) of rejecting a true null hypothesis. B) of not rejecting a true null hypothesis. C) that the null hypothesis is true. D) that the alternative hypothesis is true.

a This level of significance, commonly set to α equal to 0.05, is used to set the cut-off as the maximum probability a researcher would use in order to reject a true null hypothesis.

Which one of the following statements is most correct about a skewed dataset? A) The mean and median will usually be different. B) The mean and median will usually be the same. C) The mean will always be higher than the median. D) Whether the mean and median are the same depends on whether the data set is skewed to the right or to the left.

a When data sets are symmetric the mean and median are about equal. Otherwise these two statistics differ.

Decide if the probability described is a subjective (personal) probability or a relative frequency probability: Among 5000 new tires sold by a tire company, 20% (1000/5000) lasted more than 100,000 miles. The 20% chance that a new tire will last more than 100,000 miles is a A) subjective probability. B) relative frequency probability.

b

Decide if the probability described is a subjective (personal) probability or a relative frequency probability: In a sample of 1000 students majoring in the humanities, 660 were female. The 66% (660/1000) chance of a humanities major being female is a A) subjective probability. B) relative frequency probability

b

In the past five years, only 5% of preschool children did not improve their swimming skills after taking a Beginner Swimmer Class at a certain Recreation Center. Decide if the probability described is a subjective (personal) probability or a relative frequency probability: The probability of 5% that a preschool child who is taking a Beginner Swimmer Class will not improve his/her swimming skills is a A) subjective probability. B) relative frequency probability.

b

The z* multiplier for a 95% confidence interval is A) 1.65 B) 1.96 C) 2.33 D) 2.58

b

Which one of the following is not appropriate for studying the relationship between two quantitative variables? A) Scatterplot B) Bar chart C) Correlation D) Regression

b

A study on the use of seat belts versus belted booster seats for children ages 4 and 5 reported that "Using seat belts instead of booster seats was associated with increased risk for serious injury in an accident; the relative risk was 2.4." Based on this, it can be concluded that for this study: A) Children ages 4 and 5 in a booster seat were 2.4 times more likely to have serious injuries in an accident than were children wearing seatbelts B) Children ages 4 and 5 wearing seatbelts were 2.4 times more likely to have serious injuries in an accident than were children in a booster seat. C) The percent of children ages 4 and 5 in a booster seat was 2.4 times higher than the percent of children wearing seatbelts. D) The percent of children ages 4 and 5 wearing seatbelts was 2.4 times higher than the percent of children in a booster seat.

b From the study the baseline group is the children wearing the belted booster seats. So the 2.4 would indicate that children ages 4 and 5 wearing seat belts are 2.4 times more likely to be injured in a car accident than children ages 4 and 5 who wear belted booster seats.

A five-number summary for hours studied in a week was 5, 12, 14, 18, and 20. What is the value such that 75% of the students studied longer than that value? A) 5 hours B) 12 hours C) 14 hours D) 18 hours E) 20 hours

b The first quartile, Q1, splits the distribution from 25% at/below and 75% at/above. So 12 would be the answer.

Based on the National Household Survey on Drug Abuse, the percentage of 17-year olds who ever tried cigarette smoking is 56.2%. The relative risk of ever smoking for a 17-year old versus a 12-year old is 3.6. What is the risk of smoking for a 12-year-old (i.e. what was the percentage of 12-year olds who ever tried smoking)? A) 14.1% B) 15.6% C) 50.0% D) 56.2%

b You need to use algebra to solve: Relative risk (3.6) = risk for one group of interest (17-year-olds who tried smoking = 56.2%) divided by the risk for the other group (12-year-olds who tried smoking, which is unknown). Solving for the unknown gives 0.562/3.6 = 0.156 or 15.6%
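
The algebra here is a single division, since relative risk is the ratio of the two risks. A minimal sketch with illustrative variable names:

```python
risk_17 = 0.562       # risk for 17-year-olds (given)
relative_risk = 3.6   # risk_17 / risk_12 (given)

# relative_risk = risk_17 / risk_12, so risk_12 = risk_17 / relative_risk
risk_12 = risk_17 / relative_risk
print(f"{risk_12:.1%}")
```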

For the following statement, determine if it is true or false. The probability of the intersection of two events A and B, and the probability of the union of A and B can never be equal. A) True B) False

b False; consider the possibility that the two events completely overlap. For example, if A = getting an A and B = female, the union and intersection would be equal if all the students were female and they all got an A. However, the probability of the intersection can never be greater than the probability of the union.

For the following statement, determine if it is true or false. If two events A and B are independent, they must also be mutually exclusive. A) True B) False

b False; independent does not imply mutually exclusive, only that the outcome of the one event does not affect the probability of the outcome for another event.

A student wanted to test whether there was a difference in the mean daily hours of study for students living in four different dormitories. She selected a random sample of 50 students from each of the four dormitories. What is the null hypothesis for this situation? A) The mean daily hours of study is 3 hours for each dormitory. B) The mean daily hours of study is the same for each dormitory. C) The mean daily hours of study is different for each of the 200 students in the sample. D) The mean daily hours of study is not the same for all four dormitories.

b Feedback: ANOVA tests a null hypothesis that the means (average) are equal.

The table above shows the opinions of 321 respondents from the General Social Survey by whether they owned a gun (or not) and whether they favored (or opposed) a law requiring a permit to own a gun. Based on the chi-square statistic and p-value, one can conclude that A) the relationship between the support for the gun law between gun owners and non-gun owners is not statistically significant in the sample. B) the relationship between the support for the gun law between gun owners and non-gun owners is statistically significant in the population. C) the relationship between the support for the gun law between gun owners and non-gun owners is not practically significant. D) the relationship between the support for the gun law between gun owners and non-gun owners is practically significant

b Feedback: Since the p-value is less than 0.05 we conclude that there is a statistically significant relationship in the population between gun ownership and view toward gun law permit.

A group of adults aged 20 to 80 were tested to see how far away they could first hear an ambulance coming towards them. An equation describing the relationship between distance (in feet) and age was found to be: Distance = 600 - 3 × Age How much does the estimated distance change when age is increased by 1? A) It goes down by 1 foot. B) It goes down by 3 feet. C) It goes up by 1 foot. D) It goes up by 3 feet.

b Feedback: The slope gives the increase (or decrease) in the response as X increases by 1 unit. So if Age (x) increases by 1 year, then distance (y) decreases, since slope is negative by 3 feet.

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.10, Ha: ρ ≠ 0.10. If the sample size n = 100 and sample proportion ρ-hat = 0.10, then the z-statistic is: A) -1.00 B) 0.00 C) 0.10 D) 1.00

b Feedback: The z-statistic is found by (ρ-hat - ρ)/√[(ρ*(1-ρ))/n]. Since ρ-hat equals the null value of 0.10, the numerator is 0 and so z = 0.00.

In the simple linear regression equation y = b0 + b1x, the term b0 represents the A) estimated or predicted response. B) estimated intercept. C) estimated slope. D) explanatory variable

b Feedback: This is analogous to the algebra equivalent y = mx + b where b is the y-intercept of the line.

A safety officer wants to prove that μ = the average speed of cars driven by a school is less than 25 mph. Suppose that a random sample of 14 cars shows an average speed of 24.0 mph, with a sample standard deviation of 2.2 mph. Assume that the speeds of cars are normally distributed. Using the T-table, what is the p-value? A) 0.001 < p-value < 0.005 B) 0.050 < p-value < 0.100 C) 0.025 < p-value < 0.050

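b Feedback: The t-statistic is (24.0 − 25)/(2.2/√14) = −1.70 with 14 − 1 = 13 degrees of freedom, and the T-table gives 0.050 < p-value < 0.100. Because the p-value is greater than 0.05, the results are not statistically significant.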

A five number summary given for the fastest ever driving speeds reported by 102 women was: 30, 80, 89, 95, 130. Approximately 25% of the women reported a fastest ever driving speed of at most _____ mph. A) 30 B) 80 C) 89 D) 95

b Q1 divides the distribution into the lower 25% and upper 75%. So about 25% of the women reported a fastest speed at or below Q1, which is 80 mph.

Regression Analysis: Weight versus Height The regression equation is: Weight = 21.2 + 1.94 Height Predictor Coef SE Coef T P Constant 21.19 22.05 0.96 0.338 Height 1.9419 0.3247 5.98 0.000 S = 28.6963 R-Sq = 13.8% What statistic from the output provides us with an estimate of the amount of variability in Weight that is being explained by Height? A) S B) R-Sq C) The Coefficient for Height

b R-squared (R-Sq) is called the "coefficient of determination" whose definition is "the percent of variation in the dependent variable (Weight) that is explained by the independent variable (Height)."

The above output is for a simple regression in which y = grade point average (GPA) and x = number of classes missed in a typical week. The results were determined using self-reported data for a sample of n = 1,673 at a large northeastern university. What value is given in the output for R2 and based on this output is the relationship between GPA and classes missed per week positive or negative? A) R2 is 3.1% and the relationship is positive B) R2 is 3.1% and the relationship is negative C) R2 is 2.7% and the relationship is positive D) R2 is 2.7% and the relationship is negative

b R2 = 3.1% and the negative slope indicates a negative relationship.

Which one of the following statistics would be affected by an outlier? A) Median B) Standard deviation C) Lower quartile D) Upper quartile

b Standard deviation. This statistic depends on the value of all the observations while the other responses are more dependent on the number of observations in the data set

Suppose a 95% confidence interval for the proportion of Americans who exercise regularly is 0.29 to 0.37. Which one of the following statements is NOT true? A) It is reasonable to say that more than 25% of Americans exercise regularly. B) It is reasonable to say that more than 40% of Americans exercise regularly. C) An "acceptable" hypothesis is that about 33% of Americans exercise regularly. D) It is reasonable to say that fewer than 40% of Americans exercise regularly.

b The upper bound of the interval is less than 40%, making "fewer than 40%" true and thus "more than 40%" false. The lower bound being greater than 25% makes "more than 25%" acceptable, and the interval covers 33%, making "about 33%" possible.

Which of the following statements is most correct about a confidence interval for a mean? A) It provides a range of values, any of which is a good guess at the possible value of the sample mean. B) It provides a range of values, any of which is a good guess at the possible value of the population mean. C) It provides a good guess for the range of values the sample mean is likely to have in repeated samples. D) It provides a good guess for the range of values the population mean is likely to have in repeated samples.

b The population parameter is fixed, thus there is no range of values for it. Also, the confidence interval is used to estimate a population parameter.

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.10, Ha: ρ ≠ 0.10. If the sample size n = 500 and sample proportion ρ-hat = 0.04, then the z-statistic is: A) -6.84 B) -4.47 C) 4.47 D) 6.84

b The z-statistic is found by (ρ-hat - ρ)/√[(ρ*(1-ρ))/n] = (0.04 - 0.10)/√[(0.10*0.90)/500] = -0.06/0.0134 = -4.47
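
The z-statistic formula on this card translates into a few lines of code. This sketch uses the null value in the standard error, as the formula does; the function name is illustrative.

```python
from math import sqrt

def proportion_z_statistic(p_hat, p0, n):
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n), using the null value p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

z = proportion_z_statistic(p_hat=0.04, p0=0.10, n=500)
print(round(z, 2))
```

The sign is negative because the sample proportion falls below the null value.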

Students who live in the dorms at a college get free T.V. service in their rooms, but only receive 6 stations. On a certain evening, a student wants to watch T.V. and the six stations are broadcasting separate shows on baseball, football, basketball, local news, national news, and international news. The student is too tired to check which channels the shows are playing on, so the student picks a channel at random. The two events A = {the student watches an athletic event} and N = {the student watches a news broadcast} are A) independent events. B) disjoint (mutually exclusive) events. C) each simple events. D) None of the above.

b These events are mutually exclusive since both events cannot occur at the same time

A statistics class has 4 teaching assistants (TAs): three female assistants (Lauren, Rona, and Leila) and one male assistant (Josh). Each TA teaches one discussion section. A student picks a discussion section. The two events W = {the TA is a woman} and J = {the TA is Josh} are A) independent events. B) disjoint (mutually exclusive) events. C) each simple events. D) None of the above.

b These events are mutually exclusive since both events cannot occur at the same time.

The heights of a random sample of 100 women are recorded. The sample mean is 65.3 inches and the sample standard deviation is 3 inches. Which of the following provides a 90% confidence interval for the population mean? A) 65.3 ±(1.66)(0.03) B) 65.3 ± (1.66)(0.3) C) 65.3 ± (1.66)(3) D) 65.3 ± (1.66)(30)

b Using degrees of freedom equal to 100 - 1 gives a t* of 1.66 for confidence level of 90% (recall to use the closest without exceeding). The standard error is equal to s/√n = 3/√100 = 0.3

The cholesterol levels of a random sample of 100 men are measured. The sample mean is 188 and the sample standard deviation is 40. Which of the following provides a 95% confidence interval for the population mean? A) 188 ± (1.99)(0.4) B) 188 ± (1.99)(4) C) 188 ± (1.99)(40) D) 188 ± (1.99)(4000)

b Using degrees of freedom equal to 100 - 1 gives a t* of 1.99 for confidence level of 95% (recall to use the closest without exceeding). The standard error is equal to s/√n = 40/√100 = 4
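
A short sketch of the interval arithmetic. The multiplier t* = 1.99 is taken from the card as given rather than computed, and the variable names are illustrative.

```python
from math import sqrt

n = 100
sample_mean = 188.0
sample_sd = 40.0
t_star = 1.99   # multiplier quoted on the card (from a t-table)

# Standard error of the mean: s / sqrt(n)
standard_error = sample_sd / sqrt(n)
margin = t_star * standard_error
lower, upper = sample_mean - margin, sample_mean + margin
print(standard_error, round(lower, 2), round(upper, 2))
```

Using the standard deviation (40) instead of the standard error (4) in the margin is the mistake that options C and D are built around.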

A null hypothesis is that the mean nose lengths of men and women are the same. The alternative hypothesis is that men have a longer mean nose length than women. A statistical test is performed for assessing if men have a longer mean nose length than women. The p-value is 0.225. Which of the following is the most appropriate way to state the conclusion? A) The mean nose lengths of the populations of men and women are identical. B) There is not enough statistical evidence to say that the populations of men and women have different mean nose lengths. C) Men have a greater mean nose length. D) The probability is 0.225 that men and women have the same mean nose length.

b With a p-value of 0.225, which is greater than 0.05, we would not reject Ho. This means that there is not enough statistical evidence to say that the populations of men and women have different mean nose lengths.

What is the effect of an outlier on the value of a correlation coefficient? A) An outlier will always decrease a correlation coefficient. B) An outlier will always increase a correlation coefficient. C) An outlier might either decrease or increase a correlation coefficient, depending on where it is in relation to the other points. D) An outlier will have no effect on a correlation coefficient.

c

A counselor wants to show that for men who are married by the time they are 30, μ = average age when the men are married is not 21 years old. A random sample of 10 men who were married by age 30 showed an average age at marriage of 22.2, with a sample standard deviation of 1.9 years. Assume that the age at which this population of men gets married for the first time is normally distributed. What are the appropriate null and alternative hypotheses? A) Ho: μ = 21 and Ha: μ < 21 B) Ho: μ = 21 and Ha: μ > 21 C) Ho: μ = 21 and Ha: μ ≠ 21 D) Ho: μ ≠ 21 and Ha: μ = 21 E) Ho: x-bar = 21 and Ha: x-bar < 21 F) Ho: x-bar = 21 and Ha: x-bar > 21 G) Ho: x-bar = 21 and Ha: x-bar ≠ 21 H) Ho: x-bar ≠ 21 and Ha: x-bar = 21

c

Which one of the following choices describes a problem for which an analysis of variance would be appropriate? A) Comparing the proportion of successes for three different treatments of anxiety. Each treatment is tried on 100 patients. B) Analyzing the relationship between high school GPA and college GPA. C) Comparing the mean birth weights of newborn babies for three different racial groups. D) Analyzing the relationship between gender and opinion about capital punishment (favor or oppose).

c The dependent (response) variable needs to be continuous and the different levels of the independent variable need to be mutually exclusive and categorical. This leads to the correct answer of mean birth weights (continuous response) across three racial groups (mutually exclusive, categorical).

In a survey of n = 950 randomly selected individuals, 17% answered yes to the question "Do you think the use of marijuana should be made legal or not?" A 98% confidence interval for the proportion of all Americans in favor of legalizing marijuana is A) 0.150 to 0.190 B) 0.146 to 0.194 C) 0.142 to 0.198 D) 0.139 to 0.201

c Feedback: A confidence interval is found by sample statistic ± Zmultiplier*StandardError. With p-hat of 0.17, Zmultiplier of 2.33 and n = 950, the 98% confidence interval is 0.142 to 0.198
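
The interval can be reproduced with the formula in the feedback. This sketch takes the multiplier 2.33 from the card and assumes the usual standard error for a sample proportion; names are illustrative.

```python
from math import sqrt

n = 950
p_hat = 0.17
z_multiplier = 2.33   # 98% multiplier quoted on the card

# Standard error of a sample proportion
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z_multiplier * standard_error
lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 3), round(upper, 3))
```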

A hypothesis test for a population proportion ρ is given below: Ho: ρ = 0.40, Ha: ρ > 0.40. Use the Standard Normal Table to calculate the p-value for this hypothesis test. For z-statistic = 0.00 the p-value is: A) 0.0000 B) 0.3085 C) 0.5000 D) 0.6915

c Feedback: Since Ha is ">" we find the p-value by P(Z > z) where z is the z-statistic. Here P(Z > 0.00) = 0.5000, since half of the standard normal distribution lies above 0.

The table above shows the opinions of 953 respondents in the General Social Survey to the question "Everything considered, would you say that in general, you approve or disapprove of wiretapping?" The purpose of examining the data is to see if there is a gender difference in how people would respond to this question. State the alternative hypotheses for this study. A) There is no relationship in the population between gender and approval of wiretapping B) There is no relationship in the sample between gender and approval of wiretapping C) There is a relationship in the population between gender and approval of wiretapping D) There is a relationship in the sample between gender and approval of wiretapping

c Feedback: The alternative hypothesis speaks of a relationship between the variables in the population.

The smaller the p-value, the A) stronger the evidence against the alternative hypothesis. B) stronger the evidence for the null hypothesis. C) stronger the evidence against the null hypothesis.

c Feedback: The smaller the p-value then the stronger the evidence against the null hypothesis Ho

A university administrator writes a report in which he states that at least 75% of all students have driven while under the influence of drugs or alcohol. Many others think the correct percent is more than 75%. What are appropriate null and alternative hypotheses in this situation? A) Ho: ρ-hat = 0.75 vs. Ha: ρ-hat > 0.75 B) Ho: ρ-hat = 0.75 vs. Ha: ρ-hat ≠ 0.75 C) Ho: ρ = 0.75 vs. Ha: ρ > 0.75 D) Ho: ρ = 0.75 vs. Ha: ρ ≠ 0.75

c Hypothesis statements always use parameter notation, and the null hypothesis will include the symbol "=". The alternative, Ha, is then based on what is being researched, in this case greater than 75% or 0.75.

Which of the following would indicate that a dataset is skewed to the right? A) The interquartile range is larger than the range. B) The range is larger than the interquartile range. C) The mean is much larger than the median. D) The mean is much smaller than the median.

c In a skewed data set the mean will "chase the tail", meaning that in a skewed right data set the mean will be to the right (i.e. larger) than the median.

If one card is randomly picked from a standard deck of 52 cards, the probability that the card will be a number from 2 through 10, or a Heart, or both, is A) 51.9% (27/52) B) 69.2% (36/52) C) 76.9% (40/52) D) 94.2% (49/52)

c P(2 through 10) = 36/52. The P(heart) = 13/52 and P(2-10 and heart) = 9/52. To find P(2-10 or heart) = P(2 thru 10) +P(heart)-P(both) = 36/52 + 13/52- 9/52 = 40/52
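
The addition rule can be double-checked by counting cards directly. A minimal sketch that builds a standard 52-card deck; the rank and suit labels are illustrative.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

number_ranks = {str(k) for k in range(2, 11)}   # "2" through "10"

# A card counts once even if it is both a number card and a Heart
favorable = [card for card in deck
             if card[0] in number_ranks or card[1] == "Hearts"]
probability = Fraction(len(favorable), len(deck))
print(len(favorable), float(probability))
```

Counting this way avoids the double-counting that the subtraction of P(both) corrects for in the formula.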

If an exam was worth 100 points, and your score was at the 80th percentile, then A) your score was 80 out of 100. B) 80% of the class had scores at or above your score. C) 20% of the class had scores at or above your score. D) 20% of the class had scores at or below your score.

c The 80th percentile would mean 80% of the class scored at or below your score, or that 20% scored at or above your score.

A five-number summary for hours studied in a week was 5, 12, 14, 18, and 20. What is the value such that 50% of the students studied longer than that value? A) 5 hours B) 12 hours C) 14 hours D) 18 hours E) 20 hours

c The median splits the data into 50% halves: so 50% at/above 14 and 50% at/below 14.

A study compared grade point averages (GPA) for students in a class: students were divided by 6 locations where they usually sat during lecture (i.e. left or right front, left or right center, left or right rear). A total sample size of 12 students was studied (2 students from each section) using one-way analysis of variance. What are the numerator and denominator degrees of freedom for the ANOVA F-test? A) 6 for numerator and 12 for denominator. B) 5 for numerator and 11 for denominator. C) 5 for numerator and 6 for denominator.

c The numerator degrees of freedom are found by taking the number of group levels minus 1 (this case 6 − 1 = 5) and the denominator degrees of freedom are found by taking the total sample size minus the number of group levels (12 − 6 = 6)
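
The two degrees-of-freedom rules can be written as a tiny helper. This is an illustrative sketch for one-way ANOVA; the function name is an assumption.

```python
def anova_degrees_of_freedom(num_groups, total_n):
    """Return (numerator df, denominator df) for a one-way ANOVA F-test."""
    numerator_df = num_groups - 1          # groups minus 1
    denominator_df = total_n - num_groups  # total sample size minus groups
    return numerator_df, denominator_df

print(anova_degrees_of_freedom(num_groups=6, total_n=12))
```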

Calculate the standard error of the sample statistic: A randomly selected sample of 30 students spent an average amount of $40.00 on a date, with a standard deviation of $5.00. The standard error of the sample mean is A) 0.063 B) 0.167 C) 0.913 D) 5.000

c The standard error is found by taking the standard deviation and dividing by the square root of the sample size: 5/√30 = 0.913

A safety officer wants to prove that μ = the average speed of cars driven by a school is less than 25 mph. Suppose that a random sample of 14 cars shows an average speed of 24.0 mph, with a sample standard deviation of 2.2 mph. Assume that the speeds of cars are normally distributed. Using the T-table, which of the following is an appropriate conclusion? A) The results are statistically significant so the average speed appears to be greater than 25 mph. B) The results are statistically significant so the average speed appears to be less than 25 mph. C) The results are not statistically significant: there is not enough evidence to conclude the average speed is less than 25 mph.

c This is a one-sample t-test, so the test statistic t is found by taking the sample mean (24) minus the hypothesized mean (25) and dividing by the standard error of the mean (s/√n = 2.2/√14 = 0.588). The t-value is then t = −1/0.588 = −1.70. With degrees of freedom of 14 − 1 = 13, the T-table gives 0.050 < p-value < 0.100. Since the p-value is greater than 0.05, the results are not statistically significant: there is not enough evidence to conclude the average speed is less than 25 mph.
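
A sketch of the t-statistic and the table lookup. The two critical values are standard one-tail t-table entries for 13 degrees of freedom, hard-coded here as an assumption in place of a printed table; variable names are illustrative.

```python
from math import sqrt

n = 14
sample_mean = 24.0
sample_sd = 2.2
null_mean = 25.0

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - null_mean) / standard_error
df = n - 1

# One-tail critical values for df = 13, copied from a standard t-table
t_crit_010 = 1.350   # tail area 0.10
t_crit_005 = 1.771   # tail area 0.05

# |t| between the two critical values means 0.05 < p-value < 0.10
p_between_005_and_010 = t_crit_010 < abs(t_stat) < t_crit_005
print(round(t_stat, 2), df, p_between_005_and_010)
```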

Olivia wants to learn a foreign language. To get an idea of how satisfied other students were after taking a foreign language course, she decides to take a random sample of 20 students. If Olivia randomly selects one class among all the foreign language classes taught that year, and then interviews all students in that class, the sampling method is a A) simple random sample. B) stratified random sample. C) cluster sample. D) systematic sample.

c This sampling method is a cluster technique as she samples a cluster; the one class from among all foreign language classes

The mean hours of sleep that students get per night is 7 hours, the standard deviation of hours of sleep is 1.7 hours, and the distribution is approximately normal. Complete the following sentence. For about 95% of students, nightly amount of sleep is between ______. A) 5.3 and 8.7 hrs B) 5 and 9 hrs C) 3.6 and 10.4 hrs D) 1.9 and 12.1 hrs

c Using the Empirical Rule, about 95% of values lie within mean ± 2*standard deviation: 7 ± 2*(1.7), or 3.6 to 10.4
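
The Empirical Rule range for this card is the mean plus or minus two standard deviations. A minimal sketch with illustrative names:

```python
mean_sleep = 7.0   # hours
sd_sleep = 1.7     # hours

# About 95% of a bell-shaped distribution lies within 2 SDs of the mean
low = round(mean_sleep - 2 * sd_sleep, 1)
high = round(mean_sleep + 2 * sd_sleep, 1)
print(low, high)
```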

Which statement is not true about the 95% confidence level? A) Confidence intervals computed by using the same procedure will include the true population value for 95% of all possible random samples taken from the population. B) The procedure that is used to determine the confidence interval will provide an interval that includes the population parameter with probability of 0.95. C) The probability that the true value of the population parameter falls between the bounds of an already computed confidence interval is roughly 95%. D) If we consider all possible randomly selected samples of the same size from a population, the 95% is the percentage of those samples for which the confidence interval includes the population parameter.

c All confidence intervals either correctly contain the true parameter or they do not. Therefore, the only probability that can be attached to an already computed interval would be either 1, because the interval is correct, or 0, because it is not.

The five numbers in a five-number summary are the A) lowest value, mean, median, mode, and the highest value. B) lowest value, lower margin of error, median, upper margin of error, and the highest value. C) lowest value, lower quartile, median, upper quartile, and the highest value. D) lowest value, 2nd lowest value, middle value, 2nd highest value, and the highest value.

c lowest value (Minimum), lower quartile (Q1), median, upper quartile (Q3), and the highest value (Maximum).

The table above shows the number of Olympic medals won by the three countries with the most medals during the 2000 Olympics in Sydney, Australia. There were a total of 244 medals won by the three countries. What percent of the medals won by China were silver? A) 6.6% B) 24.2% C) 27.1% D) 28.3%

c take 16 divided by 59 times 100% for 27.1%

A shoe company wants to compare two materials, A and B, for use on the soles of boys' shoes. In this example, each of ten boys in a study wore a special pair of shoes with the sole of one shoe made from Material A and the sole on the other shoe made from Material B. The sole types were randomly assigned to account for systematic differences in wear between the left and right foot. After three months, the shoes are measured for wear. Let Ho: μd = 0 versus Ha: μd ≠ 0. From this random sample of 10 boys, the sample mean difference was 0.41 and Sd was 0.387. What is the value of the test statistic? A) t = 0.00 B) t = 3.18 C) t = 3.35 D) t = 4.74

c First, recognize that this is a paired t-test. Then the t test statistic is 3.35 and is found by taking the sample mean difference minus the hypothesized value, then dividing by the standard error (not standard deviation!). This, then, is (0.41 - 0)/(0.387/√10) or 0.41/0.122 = 3.35
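
The paired t-statistic can be verified from the summary numbers alone. A minimal sketch with illustrative variable names:

```python
from math import sqrt

n = 10                 # boys (pairs of shoes)
mean_diff = 0.41       # sample mean of the differences
sd_diff = 0.387        # sample standard deviation of the differences
null_diff = 0.0        # hypothesized mean difference under Ho

# Standard error of the mean difference: s_d / sqrt(n)
standard_error = sd_diff / sqrt(n)
t_stat = (mean_diff - null_diff) / standard_error
print(round(standard_error, 3), round(t_stat, 2))
```

Dividing by the standard deviation instead of the standard error would give about 1.06, which is why the feedback stresses the difference.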

A reviewer rated a sample of fifteen wines on a score from 1 (very poor) to 7 (excellent). A correlation of 0.92 was obtained between these ratings and the cost of the wines at a local store. In plain English, this means that A) in general, the reviewer liked the cheaper wines better. B) having to pay more caused the reviewer to give a higher rating. C) wines with low ratings are likely to be more expensive (probably because fewer will be sold). D) in general, as the cost went up so did the rating.

d Since the correlation is positive this means that as the cost went up so did the rating.

A short quiz has two true-false questions and one multiple-choice question with four choices. A student guesses at each question. Assuming the choices are all equally likely, what is the probability that the student gets all three correct? A) 1/32 B) 1/3 C) 1/8 D) 1/16

d (1/2)*(1/2)*(1/4) = 1/16

Which of the following is not a term used for a quantitative variable? A) Measurement variable B) Numerical variable C) Continuous variable D) Categorical variable

d Categorical variables are used to describe qualitative variables

Suppose that amount spent by students on textbooks this semester has approximately a bell-shaped distribution. The mean amount spent was $300 and the standard deviation is $100. What amount spent on textbooks has a standardized score equal to 0.5? A) $150 B) $250 C) $300.50 D) $350

d Feedback: Use algebra to solve for Observed in the equation Z = (Observed - Mean)/SD. This gives Observed = Z*SD + Mean = (0.5)*(100) + 300 = 350 dollars
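The rearranged z-score formula can be sketched as a small function. The name `observed_from_z` is an assumption used only for illustration.

```python
def observed_from_z(z, mean, sd):
    """Invert Z = (observed - mean) / SD to recover the observed value."""
    return z * sd + mean

# Textbook spending: mean $300, SD $100, standardized score 0.5.
amount = observed_from_z(z=0.5, mean=300, sd=100)
print(amount)  # 350.0
```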

A researcher reports that the correlation between two quantitative variables is r = 0.8. Which of the following statements is correct? A) The average value of y changes by 0.8 when x is increased by 1. B) The average value of x changes by 0.8 when y is increased by 1. C) The explanatory variable (x) explains 80% of the variation in the response variable (y). D) The explanatory variable (x) explains 64% of the variation in the response variable (y).

d Feedback: The "explanation of..." refers to the Coefficient of Determination, or R², which is the square of the correlation. So 0.8 squared is 0.64, or 64%
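As a quick check of the squaring step, here is a minimal sketch. Note that the sign of r does not matter, since squaring removes it.

```python
r = 0.8  # correlation reported in the question

# Coefficient of determination: proportion of variation in y explained by x.
r_squared = round(r ** 2, 2)
print(r_squared)  # 0.64

# A correlation of -0.8 would explain the same proportion of variation.
assert round((-r) ** 2, 2) == r_squared
```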

Which of the following measures is not a measure of spread? A) Variance B) Standard deviation C) Interquartile range D) Median

d The median is a measure of location (center), not a measure of spread.

A student wanted to test whether there was a difference in the mean daily hours of study for students living in four different dormitories. She selected a random sample of 50 students from each of the four dormitories. What is the alternative hypothesis for this situation? A) The mean daily hours of study is 3 hours for each dormitory. B) The mean daily hours of study is the same for each dormitory. C) The mean daily hours of study is different for each of the 200 students in the sample. D) The mean daily hours of study is not the same for all four dormitories.

d The alternative hypothesis for an ANOVA test is that the means are not all equal, i.e. at least one population mean differs from the others. It does not claim that every mean is different from every other mean.

Researchers want to see if men have a higher blood pressure than women do. A study is planned in which the blood pressures of 50 men and 50 women will be measured. What is the most appropriate alternative hypothesis about the means of the men and women? A) The sample means are the same. B) The sample mean will be higher for men. C) The population means are the same. D) The population mean is higher for men than for women.

d The alternative hypothesis, Ha, would indicate that there is a difference and that this difference would take place in the population. From the responses provided, the only alternative that indicates a difference in the population is the response that the population mean is higher for men than for women. The sample is used to test for this population difference.

Which statistic is not resistant to an outlier in the data? A) Lower quartile B) Upper quartile C) Median D) Mean

d The mean is not resistant. Every value is used when calculating a mean, so one extreme value pulls it. The median and the quartiles depend on the position of the observations in the ordered data, not on how large the extreme values are.
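A small illustration of resistance, using Python's standard `statistics` module. The data values are made up for this sketch and do not come from the course material.

```python
import statistics

# Hypothetical data: the second list replaces the largest value with an outlier.
data = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 100]

mean_before = statistics.mean(data)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

# The mean shifts with the outlier; the median stays where it was.
print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

Here the mean moves from 6 to 24 while the median stays at 6.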

On a survey conducted at a university, students were asked how they felt about their weight (about right, overweight, or underweight), and also were asked to record their grade point average (GPA). There were 235 responses, with 160 saying their weight was about right, 50 said they were overweight, and 17 underweight. The question of interest is whether mean GPA is the same or differs for different weight attitude populations. Output for the study is given above. What is the appropriate conclusion to draw from this analysis? A) If a student is underweight he (she) should gain weight in order to raise his (her) GPA. B) Overweight students should go on a diet to lose weight because this will result in a higher GPA. C) There is no significant difference among the mean GPAs of students in the three weight attitude groups. D) There is a significant difference among the mean GPAs of students in the three weight attitude groups.

d With a p-value less than 0.05 we would reject the null hypothesis that the three population means are equal and conclude that there is a significant difference among the mean GPAs of students in the three weight attitude groups.

Which of the following variables COULD be used in a Chi-Square analysis? A) Gender B) Political Party Affiliation C) Race D) Age E) Course Section Number F) All of the above

f

Suppose we select a random sample of n = 100 students and find that the proportion of students who said they believe in love at first sight is 0.43. Which statement is not necessarily true? A) There were 43 students in the sample who said they believe in love at first sight. B) Based on the information provided by the sample, we cannot determine exactly what proportion of the population would say they believe in love at first sight. C) ρhat = 0.43 D) ρ = 0.43

d ρ = 0.43 is a statement about the parameter, the population proportion, which cannot be known exactly from the sample data alone. The sample value 0.43 is only an estimate of it.

