Mini Quiz Test
What is the intuitive addition rule? a. "or" means add, avoid double counting b. "and" means add, avoid double counting c. "and" means add, remember to double count d. "or" means add, remember to double count
a. "or" means add, avoid double counting
The probability of an event must be a number between: a. 0 and 1 b. 0 and 10 c. 0 and 100 d. 1 and 100 e. 1 and 10
a. 0 and 1
For any event A, P(Aᶜ) = ____ a. 1 - P(A) b. P(A) c. 0 d. 1 e. P(A) - 1
a. 1 - P(A)
Which of the following are properties of a probability distribution? Select ALL that apply. a. All probabilities add up to 1 b. All probabilities between 0 and 1, inclusive c. No overlap of events d. No probabilities can be zero e. All probabilities must be expressed as fractions
a. All probabilities add up to 1 b. All probabilities between 0 and 1, inclusive c. No overlap of events
What is the name for sorting the experimental units into a few groups with similar characteristics before randomly assigning the treatments? a. Blocking b. Controlling c. Combining d. Isolating e. Fracking
a. Blocking
What are the two names for a variable that may create a false sense of association between the explanatory and response variables? Select TWO answers. a. Confounding variable b. Lurking variable c. Controlling variable d. Hidden variable
a. Confounding variable b. Lurking variable
Which of the following are elements of a well-designed experiment? Select ALL that apply. a. Control of potential confounding variables b. Blocking of variables into homogeneous groups c. Replication of treatments to multiple units in each group d. Comparison of at least two treatment groups e. Random assignment of treatments to experimental units
a. Control of potential confounding variables c. Replication of treatments to multiple units in each group d. Comparison of at least two treatment groups e. Random assignment of treatments to experimental units
What is the general name for all things that are assigned treatments in an experimental study? a. Experimental units b. Participants c. Individuals d. Subjects e. Members
a. Experimental units
What is another name for the explanatory variable? a. Factor b. Level c. Control d. Response e. Treatment
a. Factor
According to the law of large numbers, how can the true probability of something be estimated? a. Find the relative frequency of many trials b. Find the sample size of a few trials c. Find the relative frequency of a few trials d. Find the sample size of many trials
a. Find the relative frequency of many trials
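A minimal simulation sketch of the idea above, assuming a hypothetical event with true probability 0.5 (a fair coin). The function name, the seed, and the trial counts are illustrative choices, not part of the quiz.

```python
import random


def relative_frequency(trials, p_success=0.5, seed=0):
    """Simulate independent trials and return the observed proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(rng.random() < p_success for _ in range(trials))
    return successes / trials


# A few trials can land far from 0.5; many trials should land close to it.
few = relative_frequency(10)
many = relative_frequency(100_000)
print(few, many)
```

With many trials the relative frequency settles near the assumed true probability, while the estimate from a few trials is much less dependable.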
The median of the first half of an ordered data set is called what? a. First quartile b. Second quartile c. Fourth quartile d. Interquartile range e. Third quartile
a. First quartile
What happens to the width of a confidence interval if you increase the sample size? a. It gets narrower. b. It gets wider. c. It cannot be determined. d. It stays the same.
a. It gets narrower.
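As a rough sketch of why this happens, the width of an interval is twice the margin of error, and the margin of error has the square root of n in its denominator. The sample standard deviation of 4 is a made-up value, and z* = 1.96 is used for simplicity even though an interval for a mean would normally use t*.

```python
import math

Z_STAR_95 = 1.96  # critical value for 95% confidence


def interval_width(sample_sd, n, critical_value=Z_STAR_95):
    """Width of a confidence interval: 2 * critical value * standard error."""
    margin_of_error = critical_value * sample_sd / math.sqrt(n)
    return 2 * margin_of_error


# Hypothetical sample standard deviation of 4 at two sample sizes.
width_25 = interval_width(4, 25)
width_100 = interval_width(4, 100)
print(width_25, width_100)
```

Under these assumptions, quadrupling the sample size cuts the width in half.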
The mean of a data set is 10 and the standard deviation is 4. Every point in the data set then has 2 added to it. What are the mean and standard deviation of this transformed data set? a. Mean = 12; Standard deviation = 4 b. Mean = 10; Standard deviation = 4 c. Mean = 10; Standard deviation = 6 d. Mean = 12; Standard deviation = 6
a. Mean = 12; Standard deviation = 4
The mean of a data set is 10 and the standard deviation is 4. Every point in the data set is then multiplied by 2. What are the mean and standard deviation of this transformed data set? a. Mean = 20; Standard deviation = 8 b. Mean = 10; Standard deviation = 4 c. Mean = 20; Standard deviation = 4 d. Mean = 10; Standard deviation = 8
a. Mean = 20; Standard deviation = 8
The mean of a data set is 10 and the standard deviation is 4. Every point in the data set then has 2 subtracted from it. What are the mean and standard deviation of this transformed data set? a. Mean = 8; Standard deviation = 4 b. Mean = 10; Standard deviation = 2 c. Mean = 8; Standard deviation = 2 d. Mean = 10; Standard deviation = 4
a. Mean = 8; Standard deviation = 4
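The three transformation questions above can be checked numerically. This is a small sketch using a made-up data set chosen so that its mean is 10 and its population standard deviation is 4. The data values and helper name are assumptions for illustration.

```python
from statistics import mean, pstdev

# Hypothetical data set with mean 10 and population standard deviation 4.
data = [6, 14, 6, 14]


def summarize(values):
    """Return (mean, population standard deviation) of the values."""
    return mean(values), pstdev(values)


original = summarize(data)
added = summarize([x + 2 for x in data])       # shift up by 2
multiplied = summarize([x * 2 for x in data])  # rescale by 2
subtracted = summarize([x - 2 for x in data])  # shift down by 2
print(original, added, multiplied, subtracted)
```

Adding or subtracting a constant moves the mean but leaves the spread alone. Multiplying by a constant rescales both the mean and the standard deviation.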
A(n) __________ is when we observe individuals and measure variables of interest without imposing any treatment. a. Observational study b. Survey c. Experiment d. Sample
a. Observational study
For which of the following confidence intervals would you use t* for the critical value? Select ALL that apply. a. One mean (in most cases) b. Difference of two proportions c. One proportion d. Slope of a regression line e. Difference of two means (in most cases) f. Mean difference (in most cases)
a. One mean (in most cases) d. Slope of a regression line e. Difference of two means (in most cases) f. Mean difference (in most cases)
For which of the following confidence intervals would you use z* for the critical value? Select ALL that apply. a. One proportion b. Difference of two means (in most cases) c. Mean difference (in most cases) d. Difference of two proportions e. One mean (in most cases) f. Slope of a regression line
a. One proportion d. Difference of two proportions
For which of the following do you compare a statistic in a population against a claim about the population? a. One sample situation b. Two sample situation
a. One sample situation
If events A and B are independent, then P(A ∩ B) = a. P(A) * P(B) b. P(A) + P(B) c. 0 d. 1 e. P(A) * P(A|B)
a. P(A) * P(B)
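One way to see the rule is to enumerate a small sample space exactly. The sketch below assumes two fair dice and two hypothetical events, A (first die is even) and B (second die shows 6). The event definitions are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability: favorable outcomes divided by total outcomes."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))


def a(o):
    return o[0] % 2 == 0  # A: first die is even


def b(o):
    return o[1] == 6  # B: second die shows 6


p_a = prob(a)
p_b = prob(b)
p_a_and_b = prob(lambda o: a(o) and b(o))
p_a_given_b = p_a_and_b / p_b  # conditional rule: divide by the given
print(p_a, p_b, p_a_and_b, p_a_given_b)
```

Because the two dice do not affect each other, P(A ∩ B) matches P(A) * P(B), and P(A|B) equals P(A).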
What acronym should be used to answer a FRQ on confidence intervals? a. PANIC b. PICNIC c. PHANTA d. PHANTOMS e. POTATO
a. PANIC
A(n) __________ observational study follows a set of individuals into the future collecting data. a. Prospective b. Historical c. Retrospective d. Introspective
a. Prospective
Which of the following are elements of a well-designed experiment? Select ALL that apply. a. Random assignment of treatments to experimental units b. Control of potential confounding variables c. Use of a placebo to keep things consistent between groups d. Comparison of at least two treatment groups e. Replication of treatments to multiple units in each group
a. Random assignment of treatments to experimental units b. Control of potential confounding variables d. Comparison of at least two treatment groups e. Replication of treatments to multiple units in each group
Which variable is the outcome variable, on which comparisons are made? a. Response b. Explanatory c. Categorical d. Discrete e. Quantitative
a. Response
A(n) __________ observational study examines current or past data for a set of individuals. a. Retrospective b. Historical c. Prospective d. Introspective
a. Retrospective
Which type of bias results when individuals inaccurately report things about themselves? a. Self-reported response bias b. Undercoverage bias c. Question wording bias d. Nonresponse bias e. Voluntary response bias
a. Self-reported response bias
The completely randomized design of experiments is analogous to which of the following sampling methods? a. Simple random sample b. Systematic random sample c. Stratified random sample d. Cluster random sample
a. Simple random sample
Which of these is a number that describes a sample? a. Statistic b. Variable c. Population d. Distribution e. Parameter
a. Statistic
The randomized block design of experiments is analogous to which of the following sampling methods? a. Stratified random sample b. Cluster random sample c. Systematic random sample d. Simple random sample
a. Stratified random sample
Which random sampling technique has the lowest variability? a. Stratified random sample b. Cluster random sample c. Systematic random sample d. Simple random sample
a. Stratified random sample
When subtracting one random variable from another one, you should do which of the following? a. Subtract their means and add their variances. b. Subtract their means and add their standard deviations. c. Subtract their means and subtract their variances. d. Subtract their means and subtract their standard deviations.
a. Subtract their means and add their variances.
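A short exact check of this rule, assuming X and Y are two independent fair dice (a hypothetical choice made only for illustration). Note that the rule for variances relies on the two variables being independent.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)


def mean_and_variance(values):
    """Exact mean and variance of a list of equally likely outcomes."""
    values = list(values)
    n = len(values)
    mu = Fraction(sum(values), n)
    var = sum((v - mu) ** 2 for v in values) / n
    return mu, var


# One die on its own, then the difference X - Y over all 36 pairs.
mu_x, var_x = mean_and_variance(faces)
mu_diff, var_diff = mean_and_variance(x - y for x, y in product(faces, faces))
print(mu_x, var_x, mu_diff, var_diff)
```

The mean of the difference is the difference of the means, while the variance of the difference is the sum of the two variances. Standard deviations do not add this way.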
What is the main difference between the binomial distribution and the geometric distribution? a. The binomial setting involves counting the number of successes in a fixed number of trials, and the geometric setting involves counting the number of trials until the first success. b. The binomial setting involves counting the number of trials until the first success, and the geometric setting involves counting the number of successes in a fixed number of trials. c. The binomial setting uses a different probability for each trial, and the geometric setting uses the same probability for each trial. d. The binomial setting uses the same probability for each trial, and the geometric setting uses a different probability for each trial.
a. The binomial setting involves counting the number of successes in a fixed number of trials, and the geometric setting involves counting the number of trials until the first success.
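The contrast can be sketched with the two standard probability formulas. The function names and the example values (5 trials, success probability 0.5) are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in a fixed number n of independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(the first success happens on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


# Hypothetical fair-coin setting.
p_two_heads_in_five = binomial_pmf(2, 5, 0.5)  # fixed number of trials
p_first_head_on_third = geometric_pmf(3, 0.5)  # trials until first success
print(p_two_heads_in_five, p_first_head_on_third)
```

In the binomial formula the number of trials n is an input. In the geometric formula the number of trials is the random quantity itself.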
Which of the following conditions must be true to make inferences about a difference in population proportions? Select ALL four answers. a. The samples must be random. b. The samples must NOT be random. c. Each sample size must be less than 10% of the population size. d. Each sample size must be greater than 10% of the population size. e. Both np and n(1-p) must be greater than or equal to 5 for both samples. f. Both np and n(1-p) must be greater than or equal to 10 for both samples. g. The two populations must be dependent on each other. h. The two populations must be independent of each other.
a. The samples must be random. c. Each sample size must be less than 10% of the population size. f. Both np and n(1-p) must be greater than or equal to 10 for both samples. h. The two populations must be independent of each other.
Which of the following conditions must be true to make inferences about a population proportion? Select ALL three answers. a. The samples must be random. b. The samples must NOT be random. c. The sample size must be less than 10% of the population size. d. The sample size must be greater than 10% of the population size. e. Both np and n(1-p) must be greater than or equal to 5. f. Both np and n(1-p) must be greater than or equal to 10.
a. The samples must be random. c. The sample size must be less than 10% of the population size. f. Both np and n(1-p) must be greater than or equal to 10.
Which of the following conditions must be true to make inferences about a population mean? Select ALL three answers. a. The samples must be random. b. The samples must NOT be random. c. The sample size must be less than 10% of the population size. d. The sample size must be greater than 10% of the population size. e. The population must be normally distributed AND the sample size must be at least 30. f. The population must be normally distributed OR the sample size must be at least 30.
a. The samples must be random. c. The sample size must be less than 10% of the population size. f. The population must be normally distributed OR the sample size must be at least 30.
Which of the following conditions must be true to make inferences about a difference in population means? Select ALL four answers. a. The samples must be random. b. The samples must NOT be random. c. The sample size must be less than 10% of the population size. d. The sample size must be greater than 10% of the population size. e. The population must be normally distributed AND the sample size must be at least 30. f. The population must be normally distributed OR the sample size must be at least 30. g. The two populations must be dependent on each other. h. The two populations must be independent of each other.
a. The samples must be random. c. The sample size must be less than 10% of the population size. f. The population must be normally distributed OR the sample size must be at least 30. h. The two populations must be independent of each other.
What is a sample space? a. The set of all possible outcomes b. An event that has a probability of 0 c. The complement of the outcome of interest d. A particular outcome of interest e. An event that has a probability of 1
a. The set of all possible outcomes
What is an event space? a. The set of outcomes for a given event b. The set of all possible outcomes c. An event that has a probability of 1 d. An event that has a probability of 0 e. The complement of the outcome of interest
a. The set of outcomes for a given event
Which of the following is used to model the conditional probabilities for successive events? a. Tree diagram b. Experimental design diagram c. Two-way table d. Conditional table e. Venn diagram
a. Tree diagram
Which of the following is the most appropriate way to interpret a 95 percent confidence interval? a. We are 95 percent confident that the interval captures the true population parameter. b. We have a 95 percent guarantee of the population parameter being in the interval. c. There is a 95 percent chance that the population parameter lies within the interval. d. There is a probability of 0.95 that the population parameter is in the interval.
a. We are 95 percent confident that the interval captures the true population parameter.
Which of these should be used as the degrees of freedom when conducting a linear regression t interval for slopes? a. n - 2 b. n + 1 c. n + 2 d. n - 1 e. n
a. n - 2
Which of the following statements is true regarding P and p̂? a. p̂ is the sample statistic, which estimates the population parameter P b. p̂ is the population parameter, which estimates the sample statistic P c. P is the population parameter, which estimates the sample statistic p̂ d. P is the sample statistic, which estimates the population parameter p̂
a. p̂ is the sample statistic, which estimates the population parameter P
If events A and B are mutually exclusive, then P(A ∩ B) = a. P(A) + P(B) b. 0 c. P(A) * P(B) d. 1 e. P(A) * P(A|B)
b. 0
What critical value of z should be used to get a confidence interval of exactly 95 percent? a. 2.00 b. 1.96 c. 1.88 d. 1.69
b. 1.96
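The critical value can be recovered from the standard normal distribution. This sketch uses `NormalDist` from Python's standard library; the helper name `z_star` is an illustrative choice.

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical z value that leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


z_90 = z_star(0.90)
z_95 = z_star(0.95)
z_99 = z_star(0.99)
print(round(z_90, 3), round(z_95, 3), round(z_99, 3))
```

A higher confidence level gives a larger critical value, which is why raising the confidence level widens the interval.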
Which of the following criteria must be true in order to use the normal distribution to approximate the binomial distribution? a. Both np and np(1-p) must be greater than 10. b. Both np and n(1-p) must be greater than 5. c. Both np and n(1-p) must be greater than 10. d. Both np and np(1-p) must be greater than 5.
b. Both np and n(1-p) must be greater than 5.
What is another name for mutually exclusive events? a. Separate events b. Disjoint events c. Independent events d. Dependent events e. Conditional events
b. Disjoint events
What happens to the width of a confidence interval if you raise the confidence level, such as from 95% to 99%? a. It stays the same. b. It gets wider. c. It gets narrower. d. It cannot be determined.
b. It gets wider.
When constructing a confidence interval for the slope of a regression line, what is the significance of the interval containing 0? a. The confidence interval is inconclusive, and another one must be created. b. It is NOT plausible to assume that a relationship exists between the two variables. c. It is plausible to assume that a relationship exists between the two variables.
b. It is NOT plausible to assume that a relationship exists between the two variables.
When constructing a confidence interval for a difference in proportions or means, what is the significance of 0 being outside the interval? a. It is plausible to assume the two proportions or the two means are the same. b. It is NOT plausible to assume the two proportions or the two means are the same. c. The confidence interval is inconclusive, and another one must be created.
b. It is NOT plausible to assume the two proportions or the two means are the same.
When constructing a confidence interval for a difference in proportions or means, what is the significance of the interval containing 0? a. The confidence interval is inconclusive, and another one must be created. b. It is plausible to assume the two proportions or the two means are the same. c. It is NOT plausible to assume the two proportions or the two means are the same.
b. It is plausible to assume the two proportions or the two means are the same.
What happens to the t distribution curve as the sample size is decreased? a. It looks more and more like the z (normal) distribution. b. It looks less and less like the z (normal) distribution. c. Nothing. The t distribution does not depend on sample size. d. It changes from unimodal to bimodal. e. It changes from bimodal to unimodal.
b. It looks less and less like the z (normal) distribution.
Which one of the following statements is true if two events A and B are independent? a. P(A ∩ B) = 0 b. P(A) = P(A|B) c. P(A) = P(B) d. P(A) = P(B|A) e. P(A) = 0 or P(B) = 0
b. P(A) = P(A|B)
In an AP exam question, if you see the phrase "convincing statistical evidence," what should you do? a. Add a best fit line to a scatterplot b. Perform a hypothesis test c. Construct a confidence interval d. Describe a data distribution e. Interpret a regression output
b. Perform a hypothesis test
What is it called when a subject responds to a fake treatment with an actual measured response? a. Double-blinding b. Placebo effect c. Confounding d. Control e. Response variable
b. Placebo effect
Which type of bias results when survey questions are confusing or leading? a. Voluntary response bias b. Question wording bias c. Nonresponse bias d. Self-reported response bias e. Undercoverage bias
b. Question wording bias
Which of the following are elements of a well-designed experiment? Select ALL that apply. a. Ability to conduct in a double-blind manner b. Replication of treatments to multiple units in each group c. Control of potential confounding variables d. Comparison of at least two treatment groups e. Random assignment of treatments to experimental units
b. Replication of treatments to multiple units in each group c. Control of potential confounding variables d. Comparison of at least two treatment groups e. Random assignment of treatments to experimental units
Which of the following is true about the null and alternative hypotheses? a. The null hypothesis is assumed to be false from the start, but we must collect evidence to support the alternative hypothesis. b. The null hypothesis is assumed to be true from the start, but we must collect evidence to support the alternative hypothesis. c. The null hypothesis is assumed to be true from the start, but we must collect evidence to reject the alternative hypothesis. d. The null hypothesis is assumed to be false from the start, but we must collect evidence to reject the alternative hypothesis.
b. The null hypothesis is assumed to be true from the start, but we must collect evidence to support the alternative hypothesis.
How does the z-distribution (normal) curve compare to the t-distribution curve? a. The z curve is shorter and narrower than the t curve. b. The z curve is taller and narrower than the t curve. c. The z curve is shorter and wider than the t curve. d. The z curve is taller and wider than the t curve.
b. The z curve is taller and narrower than the t curve.
Which of these gives the degrees of freedom for a t-distribution used to make inferences about a mean or a difference of two means? a. n+1 b. n-1 c. n d. n-2
b. n-1
We must take a representative ______________ in order to draw conclusions about a _______________. a. population; sample b. sample; population c. sample; data value d. data value; sample
b. sample; population
Which one of the following statements is true about two mutually exclusive events A and B? a. A and B don't affect each other b. P(A) or P(B) must equal 0 c. A and B can't happen at the same time d. P(A ∪ B) = P(A) * P(B) e. P(A ∩ B) = P(A) * P(B)
c. A and B can't happen at the same time
According to the Central Limit Theorem, if a sample size is large enough, what will be the shape of the sampling distribution of the mean of a random variable? a. A uniform distribution b. A skewed left distribution c. A normal distribution d. A skewed right distribution e. The same shape as the sample distribution
c. A normal distribution
Which random sampling technique has the highest variability? a. Stratified random sample b. Simple random sample c. Cluster random sample d. Systematic random sample
c. Cluster random sample
How do you calculate the probability of an event? a. Subtract event size from sample size b. Multiply event size by sample size c. Divide event size by sample size d. Divide sample size by event size e. Subtract sample size from event size
c. Divide event size by sample size
In P(A|B), the | means _______. a. Complement b. Independent c. Given d. Or e. And
c. Given
What happens to the width of a confidence interval if you lower the confidence level, such as from 95% to 90%? a. It stays the same. b. It gets wider. c. It gets narrower. d. It cannot be determined.
c. It gets narrower.
What happens to the width of a confidence interval if you decrease the sample size? a. It cannot be determined. b. It gets narrower. c. It gets wider. d. It stays the same.
c. It gets wider.
The mean of a data set is 10 and the standard deviation is 4. Every point in the data set is then divided by 2. What are the mean and standard deviation of this transformed data set? a. Mean = 10; Standard deviation = 2 b. Mean = 5; Standard deviation = 4 c. Mean = 5; Standard deviation = 2 d. Mean = 10; Standard deviation = 4
c. Mean = 5; Standard deviation = 2
Which type of bias results when individuals chosen for a sample cannot give data or refuse to respond? a. Undercoverage bias b. Self-reported response bias c. Nonresponse bias d. Question wording bias e. Voluntary response bias
c. Nonresponse bias
Setting up a confidence interval for a mean difference of matched pairs data is done the same way as which of the following confidence intervals? a. Difference of two proportions b. Difference of two means c. One mean d. One proportion e. Slope of a regression line
c. One mean
Which of these defines the sampling distribution of a statistic? a. The values of the variable for all individuals in all possible samples b. The values of the variable for all individuals in the population c. The distribution of values taken by the statistic in all samples of the same size from the population d. The values of the variable for all individuals in the sample e. The mean of the values of the variable for all individuals in the population
c. The distribution of values taken by the statistic in all samples of the same size from the population
What is A ∩ B? a. The intersection of A and B, which refers to anything that is in A or in B or in both A and B b. The union of A and B, which refers to anything that is in A or in B or in both A and B c. The intersection of A and B, which refers to anything that is only in both A and B d. The union of A and B, which refers to anything that is only in both A and B
c. The intersection of A and B, which refers to anything that is only in both A and B
How does the t-distribution curve compare to the z-distribution (normal) curve? a. The t curve is taller and narrower than the z curve. b. The t curve is shorter and narrower than the z curve. c. The t curve is shorter and wider than the z curve. d. The t curve is taller and wider than the z curve.
c. The t curve is shorter and wider than the z curve.
What is A ∪ B? a. The intersection of A and B, which refers to anything that is in A or in B or in both A and B b. The intersection of A and B, which refers to anything that is only in both A and B c. The union of A and B, which refers to anything that is in A or in B or in both A and B d. The union of A and B, which refers to anything that is only in both A and B
c. The union of A and B, which refers to anything that is in A or in B or in both A and B
Which of the following describes a limitation of observational studies as compared to experiments, and gives a valid reason why? a. The results cannot be analyzed in a double-blinded manner since an observational study involves observation. b. They cannot have things randomized because there is no manipulation of variables. c. They cannot show cause and effect because they don't control for confounding variables. d. They cannot compare multiple groups at once because they don't control for confounding variables.
c. They cannot show cause and effect because they don't control for confounding variables.
What are each of the different possible options for the explanatory variable called? a. Levels b. Controls c. Treatments d. Responses e. Constants
c. Treatments
If the p-value is __________ than the significance level, we __________ the null hypothesis. Select TWO answers. a. lower; accept b. lower; fail to reject c. greater; fail to reject d. greater; reject e. lower; reject f. greater; accept
c. greater; fail to reject e. lower; reject
To draw conclusions about a _____________, we must take a representative ______________. a. data value; sample b. sample; population c. population; sample d. sample; data value
c. population; sample
What is the intuitive multiplication rule? a. "or" means multiply b. "given" means multiply c. "not" means multiply d. "and" means multiply
d. "and" means multiply
When adding two random variables together, you should do which of the following? a. Add their means and subtract their standard deviations. b. Add their means and add their standard deviations. c. Add their means and subtract their variances. d. Add their means and add their variances.
d. Add their means and add their variances.
What is the intuitive conditional rule? a. Always multiply by the given. b. Divide by the given only for mutually exclusive events. c. Divide by the given only for independent events. d. Always divide by the given.
d. Always divide by the given.
A(n) __________ is where a treatment is imposed on individuals in order to observe their responses. a. Sample b. Survey c. Observational study d. Experiment
d. Experiment
Which variable is manipulated by the experimenter, to form groups to be compared with respect to the treatment? a. Discrete b. Response c. Categorical d. Explanatory e. Quantitative
d. Explanatory
When asked to find the probability of at least one occurrence of successive independent events, what is the easiest way to do it? a. Find the probability of exactly half of the events occurring and double it. b. Find the probability of every possible number of events occurring and then add them all together. c. Find the probability of all of the events occurring and subtract it from 1. d. Find the probability of none of the events occurring and subtract it from 1. e. Find the probability of exactly one of the events occurring and subtract it from 1.
d. Find the probability of none of the events occurring and subtract it from 1.
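A small sketch of the complement shortcut, checked against brute-force enumeration. The scenario (at least one 6 in four rolls of a fair die) is a hypothetical example, and the function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def p_at_least_one(p_single, n):
    """P(at least one success in n independent trials) = 1 - P(none)."""
    return 1 - (1 - p_single) ** n


# Shortcut: at least one 6 in four rolls of a fair die.
shortcut = p_at_least_one(Fraction(1, 6), 4)

# Brute force: list all 6**4 rolls and count those containing a 6.
rolls = list(product(range(1, 7), repeat=4))
brute_force = Fraction(sum(1 for r in rolls if 6 in r), len(rolls))
print(shortcut, brute_force)
```

Both approaches agree, but the shortcut needs one calculation instead of adding up the cases for exactly one, two, three, and four successes.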
Which of the following is the most appropriate way to interpret a standard deviation? a. It is the typical amount of the true population parameter based on different sample sizes. b. It is the typical value of the true population parameter. c. It is the typical margin of error from the true population parameter for different confidence levels. d. It is the typical distance of all estimates from the true population parameter.
d. It is the typical distance of all estimates from the true population parameter.
What is the name for something that looks like the treatment but has no effect? a. Level b. Factor c. Response variable d. Placebo e. Control
d. Placebo
In an ideal situation, what should random sampling do to bias and variability? a. Increase bias and increase variability b. Reduce bias and increase variability c. Increase bias and reduce variability d. Reduce bias and reduce variability
d. Reduce bias and reduce variability
What is the main difference between the geometric distribution and the binomial distribution? a. The geometric setting uses the same probability for each trial, and the binomial setting uses a different probability for each trial. b. The geometric setting involves counting the number of successes in a fixed number of trials, and the binomial setting involves counting the number of trials until the first success. c. The geometric setting uses a different probability for each trial, and the binomial setting uses the same probability for each trial. d. The geometric setting involves counting the number of trials until the first success, and the binomial setting involves counting the number of successes in a fixed number of trials.
d. The geometric setting involves counting the number of trials until the first success, and the binomial setting involves counting the number of successes in a fixed number of trials.
A statistic used to estimate a parameter is an unbiased estimator if... a. The sampling distribution is identical to the population distribution b. The mean of a single sample is equal to the true value of the parameter being estimated c. The sampling distribution and population distributions have equal standard deviations d. The mean of its sampling distribution is equal to the true value of the parameter being estimated e. The mean of the population distribution is equal to the mean of a single sample
d. The mean of its sampling distribution is equal to the true value of the parameter being estimated
Which of these correctly defines the p-value? a. The proportion of successes in a trial b. The probability of observing a measured statistic value, assuming that HA is true c. The proportion of successes in a population d. The probability of observing a measured statistic value, assuming that H0 is true
d. The probability of observing a measured statistic value, assuming that H0 is true
What is the benefit of blinding the subjects in an experiment? a. To increase bias b. To provide statistical significance c. To prevent confounding d. To prevent the placebo effect
d. To prevent the placebo effect
What is the benefit of blinding the researchers in an experiment? a. To provide statistical significance b. To prevent the placebo effect c. To increase bias d. To reduce bias
d. To reduce bias
Which type of bias results when part of the population has a reduced chance of being in the sample? a. Question wording bias b. Nonresponse bias c. Self-reported response bias d. Undercoverage bias e. Voluntary response bias
d. Undercoverage bias
What is the relationship between statistics and parameters? a. Parameters and statistics are both examples of geometric random variables b. There is no relationship c. We use parameters to predict or estimate statistics d. We use statistics to predict or estimate parameters e. We can make sampling distributions from both parameters and statistics
d. We use statistics to predict or estimate parameters
Which of these symbols represents the significance level? a. z b. p̂ c. μ d. α e. σ
d. α
What happens to the t distribution curve as the sample size is increased? a. Nothing. The t distribution does not depend on sample size. b. It looks less and less like the z (normal) distribution. c. It changes from bimodal to unimodal. d. It changes from unimodal to bimodal. e. It looks more and more like the z (normal) distribution.
e. It looks more and more like the z (normal) distribution.
What acronym should be used to answer a FRQ on hypothesis testing? a. PANIC b. PHANTA c. POTATO d. PICNIC e. PHANTOMS
e. PHANTOMS
Which of these is a number that describes a population? a. Sample b. Statistic c. Distribution d. Variable e. Parameter
e. Parameter
Suppose you want to determine the mean parent/guardian income for all students at Deep Run. What is the only way to get the true value of this mean? a. Taking a simple random sample b. Taking a cluster random sample c. Taking a stratified random sample d. Posting an online poll e. Using a census
e. Using a census
Which type of bias results when the sample is composed entirely of people who choose to participate? a. Self-reported response bias b. Nonresponse bias c. Undercoverage bias d. Question wording bias e. Voluntary response bias
e. Voluntary response bias
What is extrapolation, and why is it unreliable? a. Predicting values outside the range of given data, which is unreliable because it is uncertain whether the trend will continue. b. Predicting values within the range of given data, which is unreliable due to uncertainty in the measurements. c. Predicting values within the range of given data, which is unreliable due to fluctuations in the data. d. Predicting values outside the range of given data, which is unreliable due to fluctuations in the data.
a. Predicting values outside the range of given data, which is unreliable because it is uncertain whether the trend will continue.
Which of the following can be used to calculate the interquartile range of a data set? a. Q3 minus Q1 b. Q4 minus Q1 c. Q3 minus Q2 d. Q2 minus Q1 e. Q4 minus Q2
a. Q3 minus Q1
Which sampling method involves taking every nth element off a numbered list? a. Convenience Sample b. Cluster Random Sample c. Simple Random Sample d. Systematic Random Sample e. Voluntary Response Sample f. Stratified Random Sample
d. Systematic Random Sample
If you know the coefficient of determination, how can you find the correlation coefficient? a. Double the coefficient of determination. b. Halve the coefficient of determination. c. You can't. d. Take the square root of the coefficient of determination. e. Square the coefficient of determination.
d. Take the square root of the coefficient of determination.
What percent of the data falls below Quartile 3? a. 75% b. 50% c. 25% d. 100%
a. 75%
A linear model for a scatterplot is considered a good fit if the residual plot has which of the following? Select TWO answers. a. Apparent randomness b. No clear pattern c. A distinct pattern such as a curve d. Data points trending in a certain direction
a. Apparent randomness b. No clear pattern
Which type of quantitative variable can take on infinitely many values with no gaps between the values? a. Continuous b. Discrete c. Constant d. Distinct e. Discontinuous
a. Continuous
The following quantitative variables can be used to describe a person. Which of these variables are discrete? Select ALL that apply. a. Digital scale reading b. Number of tattoos c. Body weight d. Foot length e. Shoe size
a. Digital scale reading b. Number of tattoos e. Shoe size
Which type of quantitative variable can take on a countable number of values, with gaps between the values? a. Discrete b. Constant c. Distinct d. Continuous e. Discontinuous
a. Discrete
Which of the following types of graphs should be used to plot a discrete variable but NOT a continuous variable? Select TWO answers. a. Dotplot b. Stem and leaf plot c. Histogram d. Bar chart e. Pie chart
a. Dotplot b. Stem and leaf plot
Data is collected on every vehicle in a used car lot. Which of the following is a categorical variable? Select ALL that apply. a. Exterior color b. Number of seats c. Odometer reading (miles driven) d. Vehicle identification number (VIN) e. Type of engine
a. Exterior color d. Vehicle identification number (VIN) e. Type of engine
The following quantitative variables can be used to describe a person. Which of these variables are continuous? Select ALL that apply. a. Foot length b. Shoe size c. Number of tattoos d. Digital scale reading e. Body weight
a. Foot length e. Body weight
Data is collected on every student in a math class. Which of the following is a categorical variable? Select ALL that apply. a. Home zip code b. Whether or not they like math c. Brand of cell phone d. Height e. Number of siblings
a. Home zip code b. Whether or not they like math c. Brand of cell phone
Which is most likely true for a skew right distribution? a. Mean > Median b. Not enough information to determine c. Mean = Median d. Mean < Median
a. Mean > Median
An explanatory and a response variable are found to have very high correlation. Which of the following must be true? a. Neither of these variables must have caused a change in the other variable. b. The explanatory variable must have caused a change in the response variable. c. One of the variables must have caused a change in the other variable, but it cannot be known which variable caused the change without doing further analysis. d. The response variable must have caused a change in the explanatory variable.
a. Neither of these variables must have caused a change in the other variable.
The median of an ordered data set is also known as what? a. Second quartile b. Mean c. Third quartile d. Average e. First quartile
a. Second quartile
Which of the following is a way of displaying two categorical variables in either table or graphical form? Select ALL that apply. a. Side-by-side bar graph b. One-way table c. Double histogram d. Dual pie chart e. Two-way table
a. Side-by-side bar graph e. Two-way table
Which of the following terms is used to describe the data distribution shown in the above dotplot? a. Skewed left b. Skewed right c. Bimodal d. Symmetric e. Uniform
a. Skewed left
In the statistical linear regression equation, which letters represent the slope and the y-intercept? a. Slope = b; y-intercept = a b. Slope = m; y-intercept = b c. Slope = b; y-intercept = m d. Slope = a; y-intercept = b
a. Slope = b; y-intercept = a
Which of the following is an important characteristic that must be included in the description of a distribution? Select ALL that apply. a. Spread or variability b. Center c. Unusual features d. Shape e. Number of data points
a. Spread or variability b. Center c. Unusual features d. Shape
Which of the following is an important characteristic that must be included in the description of a distribution? Select ALL that apply. a. Spread or variability b. Shape c. Unusual features d. Center e. Direction
a. Spread or variability b. Shape c. Unusual features d. Center
Which of the following measures are considered nonresistant to outliers? Select ALL that apply. a. Standard deviation b. Median c. Mean d. IQR e. Range
a. Standard deviation c. Mean e. Range
Which of the following is an important characteristic that must be included in the description of a scatterplot? Select ALL that apply. a. Strength b. Unusual features c. Variability d. Direction e. Form
a. Strength b. Unusual features d. Direction e. Form
What things can be determined from the correlation coefficient WITHOUT seeing the scatterplot? Select ALL that apply. a. Strength b. Unusual features c. Direction d. Form
a. Strength c. Direction
When given a two-way table, if asked to calculate a joint relative frequency, you should divide: a. The number in a single cell by the total for the entire table b. A row or column total by the total for the entire table c. The number in a single cell by a row or column total
a. The number in a single cell by the total for the entire table
What is the physical meaning of the coefficient of determination? a. The percent of variation in the response variable that can be explained by the explanatory variable b. The percent of variation in the correlation coefficient that can be explained by the slope of a line c. The percent of variation in the explanatory variable that can be explained by the response variable d. The percent of variation in the slope of a line that can be explained by the correlation coefficient
a. The percent of variation in the response variable that can be explained by the explanatory variable
In statistics, what is the physical meaning of the y-intercept of a line in terms of the explanatory and response variables? a. The value of the response variable when the explanatory variable is zero b. The amount that the response variable increases or decreases for every increase of 1 in the explanatory variable c. The amount that the explanatory variable increases or decreases for every increase of 1 in the response variable d. The value of the explanatory variable when the response variable is zero
a. The value of the response variable when the explanatory variable is zero
When determining if a point is an outlier, select the correct formulas for calculating the upper and lower fences (you must choose BOTH correct formulas): a. Upper: Q3 + 1.5*IQR b. Lower: Min - 1.5*IQR c. Lower: Q1 - 1.5*IQR d. Upper: Max + 1.5*IQR
a. Upper: Q3 + 1.5*IQR c. Lower: Q1 - 1.5*IQR
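The fence formulas can be tried on a small data set. The following is a minimal sketch, assuming the median-of-halves convention for quartiles (other conventions can give slightly different Q1 and Q3); the data values and the helper names `quartiles` and `fences` are made up for illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values; the middle value is left out of both
    halves when the count is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # values below the median position
    upper = data[-half:]    # values above the median position
    return median(lower), median(upper)

def fences(values):
    """Return (lower fence, upper fence) using the 1.5*IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [2, 4, 5, 7, 8, 9, 10, 12, 30]   # made-up data
low, high = fences(data)
outliers = [x for x in data if x < low or x > high]
print(low, high, outliers)   # -5.25 20.75 [30]
```

Here Q1 = 4.5 and Q3 = 11, so the IQR is 6.5 and only the value 30 falls outside the fences.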
When using the z table, to find the proportion to the left of a z-score, you should do which of the following? a. Use the proportion directly from the z table b. Use 1 minus the proportion from the z table c. Use 1 plus the proportion from the z table
a. Use the proportion directly from the z table
For a scatterplot that shows positive direction, as x values increase, y values tend to: a. increase b. be random c. stay constant d. decrease
a. increase
For a symmetric distribution, the best measure of center is the __________ and the best measure of spread is the ______________________. a. mean; standard deviation b. median; interquartile range c. median; standard deviation d. mean; interquartile range
a. mean; standard deviation
If a residual is negative, the predicted response value from the linear model ___________ the actual response value. a. overestimates b. underestimates c. perfectly estimates
a. overestimates
In a normal distribution, approximately what percent of the data lies within 1 standard deviation of the mean? a. 50% b. 68% c. 95% d. 98% e. 99.7%
b. 68%
A linear model for a scatterplot is considered a bad fit if the residual plot has which of the following? Select TWO answers. a. No clear pattern b. A distinct pattern such as a curve c. Apparent randomness d. Data points trending in a certain direction
b. A distinct pattern such as a curve d. Data points trending in a certain direction
How do you calculate a residual for a given data point? a. Predicted y-value minus actual y-value b. Actual y-value minus predicted y-value c. Predicted x-value minus actual x-value d. Actual x-value minus predicted x-value
b. Actual y-value minus predicted y-value
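As a quick check of the sign convention, here is a small sketch that computes residuals for a line written as ŷ = a + bx. The intercept, slope, and data points are hypothetical values chosen only for illustration.

```python
def predict(x, a, b):
    """Predicted response from the line y-hat = a + b*x."""
    return a + b * x

def residual(x, y, a, b):
    """Residual = actual y minus predicted y."""
    return y - predict(x, a, b)

a, b = 2.0, 3.0                   # hypothetical y-intercept and slope

r_pos = residual(4, 15, a, b)     # prediction is 14, actual is 15
r_neg = residual(4, 12, a, b)     # prediction is 14, actual is 12

print(r_pos)   # 1.0  -> positive: the model underestimates
print(r_neg)   # -2.0 -> negative: the model overestimates
```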
Data is collected on every vehicle in a used car lot. Which of the following is a quantitative variable? Select ALL that apply. a. Number of seats b. Exterior color c. Type of engine d. Odometer reading (miles driven) e. Vehicle identification number (VIN)
a. Number of seats d. Odometer reading (miles driven)
The interquartile range of a data set is calculated by taking the difference between which two quartiles? a. Second and fourth b. First and third c. First and second d. Second and third e. First and fourth
b. First and third
Data is collected on every student in a math class. Which of the following is a quantitative variable? Select ALL that apply. a. Home zip code b. Height c. Number of siblings d. Brand of cell phone e. Whether or not they like math
b. Height c. Number of siblings
What is the second quartile of an ordered data set more commonly known as? a. Mean b. Median c. Interquartile range d. Mode e. Average
b. Median
Which of the following gives the five number summary in the correct order? a. Maximum, median, minimum, Q3, Q1 b. Minimum, Q1, median, Q3, maximum c. Minimum, maximum, Q1, median, Q3 d. Maximum, Q3, median, Q1, minimum e. Minimum, median, maximum, Q1, Q3
b. Minimum, Q1, median, Q3, maximum
Which of the following is a circular representation of the percentage of each category for a categorical variable? a. Stem and leaf plot b. Pie chart c. Dotplot d. Bar chart e. Histogram
b. Pie chart
Which sampling method has been used here? Agent Mulder wants to know other FBI agents' opinions on the existence of extraterrestrials. He thinks opinion may differ based on age group, so he first sorts the agents into different age brackets: 25-40, 41-55, and 56 and up. He then takes a random sample of 25 agents from each age group, and records their opinions. a. Simple Random Sample b. Stratified Random Sample c. Voluntary Response Sample d. Cluster Random Sample e. Systematic Random Sample f. Convenience Sample
b. Stratified Random Sample
Which of the following is an important characteristic that must be included in the description of a scatterplot? Select ALL that apply. a. Spread b. Strength c. Form d. Direction e. Unusual features
b. Strength c. Form d. Direction e. Unusual features
When given a two-way table, if asked to calculate a conditional relative frequency, you should divide: a. The number in a single cell by the total for the entire table b. The number in a single cell by a row or column total c. A row or column total by the total for the entire table
b. The number in a single cell by a row or column total
Which sampling method has been used here? Agent Mulder wants to know other FBI agents' opinions on the existence of extraterrestrials. He posts a flyer in the FBI cafeteria, asking for people to come stop by his office and share their opinions. a. Simple Random Sample b. Voluntary Response Sample c. Systematic Random Sample d. Cluster Random Sample e. Convenience Sample f. Stratified Random Sample
b. Voluntary Response Sample
If a high-leverage point is removed from a scatterplot, there is usually a large effect on which of the following? Select ALL that apply. a. correlation (strength) b. slope c. y-intercept
b. slope c. y-intercept
The z-score tells you how many __________ above or below the __________ a certain value is. a. means; standard deviation b. standard deviations; mean c. medians; standard deviation d. standard deviations; median
b. standard deviations; mean
If a residual is positive, the predicted response value from the linear model ___________ the actual response value. a. perfectly estimates b. underestimates c. overestimates
b. underestimates
In the statistical linear regression equation, which letters represent the y-intercept and the slope? a. y-intercept = b; slope = m b. y-intercept = a; slope = b c. y-intercept = m; slope = b d. y-intercept = b; slope = a
b. y-intercept = a; slope = b
What percent of the data falls below Quartile 1? a. 100% b. 75% c. 25% d. 50%
c. 25%
In a normal distribution, approximately what percent of the data lies within 2 standard deviations of the mean? a. 50% b. 68% c. 95% d. 98% e. 99.7%
c. 95%
When given a two-way table, if asked to calculate a marginal relative frequency, you should divide: a. The number in a single cell by the total for the entire table b. The number in a single cell by a row or column total c. A row or column total by the total for the entire table
c. A row or column total by the total for the entire table
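The three kinds of relative frequency differ only in what is divided by what. The sketch below works through joint, marginal, and conditional relative frequencies on a made-up two-way table; the counts, labels, and function names are assumptions for illustration.

```python
# Made-up two-way table: rows are opinions, columns are grade levels.
table = {
    "likes math":    {"9th": 30, "10th": 20},
    "dislikes math": {"9th": 10, "10th": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())

def joint(row, col):
    """Single cell divided by the total for the entire table."""
    return table[row][col] / grand_total

def marginal_row(row):
    """Row total divided by the total for the entire table."""
    return sum(table[row].values()) / grand_total

def conditional_given_row(row, col):
    """Single cell divided by its row total."""
    return table[row][col] / sum(table[row].values())

print(joint("likes math", "9th"))                  # 0.3
print(marginal_row("likes math"))                  # 0.5
print(conditional_given_row("likes math", "9th"))  # 0.6
```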
Which of the following is a graphical representation of the frequency or relative frequency of a categorical variable? a. Histogram b. Pie chart c. Bar chart d. Dotplot e. Stem and leaf plot
c. Bar chart
When an explanatory variable and a response variable are found to be highly correlated, which variable must have caused the change in the other variable? a. The response variable caused the change. b. The explanatory variable caused the change. c. It is possible that neither of these variables caused a change in the other variable. d. Either variable could have caused the change, but further analysis would need to be done to show which one caused it.
c. It is possible that neither of these variables caused a change in the other variable.
Which type of experimental design often involves grouping two very similar subjects, and randomly assigning one to the treatment group and one to the control group? a. Systematic b. Cluster c. Matched pairs d. Stratified random e. Completely randomized
c. Matched pairs
Which is most likely true for a skew left distribution? a. Mean = Median b. Not enough information to determine c. Mean < Median d. Mean > Median
c. Mean < Median
When determining if a point is an outlier, what is the formula to calculate the lower fence? a. Q1 + 1.5*IQR b. 1.5*Q1 - IQR c. Q1 - 1.5*IQR d. 1.5*Q1 + IQR
c. Q1 - 1.5*IQR
When determining if a point is an outlier, what is the formula to calculate the upper fence? a. 1.5*Q3 - IQR b. Q3 - 1.5*IQR c. Q3 + 1.5*IQR d. 1.5*Q3 + IQR
c. Q3 + 1.5*IQR
Which sampling method involves randomly sampling individuals in such a way that every individual has the same chance of being selected? a. Stratified Random Sample b. Cluster Random Sample c. Simple Random Sample d. Convenience Sample e. Voluntary Response Sample f. Systematic Random Sample
c. Simple Random Sample
Which sampling method involves grouping similar individuals, then taking a random sample from each group? a. Simple Random Sample b. Systematic Random Sample c. Stratified Random Sample d. Cluster Random Sample e. Convenience Sample f. Voluntary Response Sample
c. Stratified Random Sample
Two variables are found to have a correlation coefficient of r = 0.9. How can you describe the relationship between the two variables? a. Weak and negative b. Weak and positive c. Strong and positive d. Strong and negative
c. Strong and positive
Which sampling method has been used here? Agent Mulder wants to know other FBI agents' opinions on the existence of extraterrestrials. He creates a numbered list of agents, and then chooses every 50th person off that list. He asks those agents if they believe in extraterrestrials. a. Simple Random Sample b. Convenience Sample c. Systematic Random Sample d. Stratified Random Sample e. Voluntary Response Sample f. Cluster Random Sample
c. Systematic Random Sample
Which of the following terms is used to describe a data distribution that has a single peak value? a. Symmetric b. Skewed c. Unimodal d. Uniform e. Bimodal
c. Unimodal
When using the z table, to find the proportion to the right of a z-score, you should do which of the following? a. Use the proportion directly from the z table b. Use 1 plus the proportion from the z table c. Use 1 minus the proportion from the z table
c. Use 1 minus the proportion from the z table
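A z table lists the proportion to the left of a z-score, which is the cumulative distribution function of the standard normal. The sketch below uses Python's `statistics.NormalDist` in place of a printed table; the value, mean, and standard deviation are made up for illustration.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / sd

z = z_score(85, 70, 10)       # made-up value, mean, and standard deviation

left = NormalDist().cdf(z)    # proportion to the left (the z table entry)
right = 1 - left              # proportion to the right

print(z)                      # 1.5
print(round(left, 4), round(right, 4))
```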
Two variables are found to have a correlation coefficient of r = -0.2. How can you describe the relationship between the two variables? a. Weak and positive b. Strong and negative c. Weak and negative d. Strong and positive
c. Weak and negative
Two variables are found to have a correlation coefficient of r = 0.2. How can you describe the relationship between the two variables? a. Weak and negative b. Strong and negative c. Weak and positive d. Strong and positive
c. Weak and positive
If an outlier is removed from a scatterplot, there is usually a large effect on which of the following? a. slope b. y-intercept c. correlation (strength)
c. correlation (strength)
For a scatterplot that shows negative direction, as x values increase, y values tend to: a. stay constant b. be random c. decrease d. increase
c. decrease
Which type of variable is used to predict or explain changes in another variable, and on which axis should it be graphed? a. explanatory; y-axis b. response; y-axis c. explanatory; x-axis d. response; x-axis
c. explanatory; x-axis
For a skewed distribution, the best measure of center is the __________ and the best measure of spread is the ______________________. a. mean; interquartile range b. mean; standard deviation c. median; interquartile range d. median; standard deviation
c. median; interquartile range
Which of the following is used to calculate a residual for a data point? a. X̂ - X b. ŷ - y c. y - ŷ d. X - X̂
c. y - ŷ
The line of best fit will always pass through which point? a. (x̄, 0) b. (0, 0) c. (0, ȳ) d. (x̄, ȳ)
d. (x̄, ȳ)
What percent of the data falls below the median? a. 100% b. 75% c. 25% d. 50%
d. 50%
Which of the following terms is used to describe a data distribution that has two peak values? a. Uniform b. Symmetric c. Unimodal d. Bimodal e. Skewed
d. Bimodal
Which type of variable takes on values that are category names or labels? a. Qualitative b. Quantitative c. Explicit d. Categorical e. Numerical
d. Categorical
Which sampling method has been used here? Agent Mulder wants to know other FBI agents' opinions on the existence of extraterrestrials. There are 11 floors in the building where he works - he uses a random number generator to generate 2 numbers between 1 and 11. He gets 3 and 7. So he surveys all the agents on the 3rd and 7th floors. a. Stratified Random Sample b. Voluntary Response Sample c. Systematic Random Sample d. Cluster Random Sample e. Simple Random Sample f. Convenience Sample
d. Cluster Random Sample
Which sampling method has been used here? Agent Mulder wants to know other FBI agents' opinions on the existence of extraterrestrials. He asks the first 50 agents he sees coming into work if they believe in extraterrestrials, and records their responses. a. Stratified Random Sample b. Systematic Random Sample c. Voluntary Response Sample d. Convenience Sample e. Cluster Random Sample f. Simple Random Sample
d. Convenience Sample
Which sampling method involves selecting individuals who are easiest to reach? a. Voluntary Response Sample b. Stratified Random Sample c. Cluster Random Sample d. Convenience Sample e. Simple Random Sample f. Systematic Random Sample
d. Convenience Sample
Which of the following types of graphs should be used to plot a discrete or a continuous variable? a. Stem and leaf plot b. Bar chart c. Pie chart d. Histogram e. Dotplot
d. Histogram
Which of the following measures are considered resistant to outliers? Select ALL that apply. a. Standard deviation b. Range c. Mean d. IQR e. Median
d. IQR e. Median
Which type of variable takes on numerical values for a measured or counted quantity? a. Qualitative b. Explicit c. Numerical d. Quantitative e. Categorical
d. Quantitative
Which type of experimental design involves first dividing subjects into homogeneous groups such as gender and then randomly assigning them into treatment/control groups? a. Matched pairs b. Stratified random c. Completely randomized d. Randomized block e. Cluster
d. Randomized block
Which of the following terms is used to describe the data distribution shown in the above dotplot? a. Uniform b. Skewed left c. Symmetric d. Skewed right e. Bimodal
d. Skewed right
If you know the correlation coefficient, how can you find the coefficient of determination? a. Halve the correlation coefficient. b. You can't. c. Take the square root of the correlation coefficient. d. Square the correlation coefficient. e. Double the correlation coefficient.
d. Square the correlation coefficient.
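The two conversions between r and r² can be sketched as follows. One caveat worth noting: the square root of the coefficient of determination gives only the magnitude of r, so this sketch assumes the sign is taken from the slope of the regression line. The function names and numbers are illustrative.

```python
import math

def r_squared(r):
    """Coefficient of determination from the correlation coefficient."""
    return r ** 2

def r_from_r_squared(r2, slope):
    """Correlation coefficient from r^2.

    The square root gives |r|; the sign is copied from the slope of the
    regression line (an assumption of this sketch).
    """
    return math.copysign(math.sqrt(r2), slope)

print(round(r_squared(-0.9), 2))                # 0.81
print(round(r_from_r_squared(0.81, -2.5), 2))   # -0.9
```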
Two variables are found to have a correlation coefficient of r = -0.9. How can you describe the relationship between the two variables? a. Strong and positive b. Weak and positive c. Weak and negative d. Strong and negative
d. Strong and negative
In statistics, what is the physical meaning of the slope of a line in terms of the explanatory and response variables? a. The amount that the explanatory variable increases or decreases for every increase of 1 in the response variable b. The value of the response variable when the explanatory variable is zero c. The value of the explanatory variable when the response variable is zero d. The amount that the response variable increases or decreases for every increase of 1 in the explanatory variable
d. The amount that the response variable increases or decreases for every increase of 1 in the explanatory variable
Which of the following is the line of best fit? a. The line that maximizes the sum of the residuals b. The line that minimizes the sum of the residuals c. The line that maximizes the sum of the squared residuals d. The line that minimizes the sum of the squared residuals
d. The line that minimizes the sum of the squared residuals
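To illustrate what minimizing the sum of the squared residuals means, the sketch below fits ŷ = a + bx with the standard least-squares formulas and compares it against slightly shifted lines. The data points and function names are made up for illustration. It also checks that the fitted line passes through (x̄, ȳ).

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept: forces the line through (x-bar, y-bar)
    return a, b

def sse(xs, ys, a, b):
    """Sum of squared residuals for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]           # made-up data
ys = [2, 4, 5, 4, 5]
a, b = least_squares(xs, ys)

best = sse(xs, ys, a, b)
print(round(a, 2), round(b, 2))           # 2.2 0.6
print(best < sse(xs, ys, a + 0.1, b))     # True: shifting the line up is worse
print(best < sse(xs, ys, a, b - 0.1))     # True: changing the slope is worse
```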
What additional information is conveyed in a mosaic plot that cannot be conveyed in a side-by-side or segmented bar graph? a. The mean of each group b. The mean of each category within a group c. The frequency of each category within a group d. The size of each group
d. The size of each group
For a scatterplot that shows no direction, as x values increase, y values tend to: a. stay constant b. decrease c. increase d. be random
d. be random
Percentile refers to the percent of data values ____________ a given value. a. greater than or equal to b. equal to c. less than d. less than or equal to e. greater than
d. less than or equal to
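Under this definition, the percentile of a value is a simple count. A minimal sketch, with made-up scores and a hypothetical helper name:

```python
def percentile_of(value, data):
    """Percent of data values that are less than or equal to `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [55, 60, 70, 70, 80, 85, 90, 95, 100, 100]   # made-up scores

print(percentile_of(80, scores))    # 50.0 (5 of the 10 scores are <= 80)
print(percentile_of(100, scores))   # 100.0
```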
Which type of variable varies due to changes in another variable, and on which axis should it be graphed? a. response; x-axis b. explanatory; x-axis c. explanatory; y-axis d. response; y-axis
d. response; y-axis
In a normal distribution, approximately what percent of the data lies within 3 standard deviations of the mean? a. 50% b. 68% c. 95% d. 98% e. 99.7%
e. 99.7%
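The 68%, 95%, and 99.7% figures are rounded values that can be recovered from the standard normal distribution. A short sketch using Python's `statistics.NormalDist` (the helper name `within` is made up):

```python
from statistics import NormalDist

def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()   # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within(k), 1))   # about 68.3, 95.4, and 99.7 percent
```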
What graphical representation is used to display the five number summary? a. Dotplot b. Stem and leaf plot c. Histogram d. Bar chart e. Boxplot
e. Boxplot
Which sampling method involves grouping individuals, randomly selecting one or more groups, then surveying everyone in the selected groups? a. Voluntary Response Sample b. Stratified Random Sample c. Convenience Sample d. Simple Random Sample e. Cluster Random Sample f. Systematic Random Sample
e. Cluster Random Sample
If nonlinear data is successfully transformed to achieve linearity, what will be true of the residual plot? a. The points will be roughly horizontal b. The points will be in a straight line c. The points will trend in a certain direction d. There will be a clear pattern in the data e. The points will show apparent randomness
e. The points will show apparent randomness
The median of the second half of an ordered data set is called what? a. Second quartile b. First quartile c. Interquartile range d. Fourth quartile e. Third quartile
e. Third quartile
Which of the following terms is used to describe a data distribution where every data point occurs with approximately the same frequency? a. Skewed right b. Bimodal c. Skewed left d. Unimodal e. Uniform
e. Uniform
Which sampling method involves individuals choosing to participate in a survey? a. Systematic Random Sample b. Convenience Sample c. Cluster Random Sample d. Simple Random Sample e. Voluntary Response Sample f. Stratified Random Sample
e. Voluntary Response Sample
Which sampling method has been used here? Agent Mulder wants to know other FBI agents' opinions on the existence of extraterrestrials. He assigns each agent a number between 1 and 1000, and uses a random number generator to choose 50 numbers. He asks those 50 agents if they believe in extraterrestrials. a. Systematic Random Sample b. Cluster Random Sample c. Voluntary Response Sample d. Stratified Random Sample e. Convenience Sample f. Simple Random Sample
f. Simple Random Sample
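The four random sampling methods in the Agent Mulder questions can be compared side by side in code. This is a rough sketch using Python's `random` module. The population of 1000 numbered agents, the way agents are split into age brackets and floors, and all function names are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Choose n individuals at random from the whole population."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every step-th individual."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, rng):
    """Take a random sample from every group (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole groups (clusters) and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(0)              # seeded so the run is repeatable
agents = list(range(1, 1001))       # hypothetical agents numbered 1 to 1000

srs = simple_random_sample(agents, 50, rng)
systematic = systematic_sample(agents, 50, rng)      # every 50th agent

# Hypothetical age brackets and floors, built by slicing the numbered list.
strata = {"25-40": agents[:400], "41-55": agents[400:800], "56+": agents[800:]}
stratified = stratified_sample(strata, 25, rng)

floors = {f: agents[(f - 1) * 90: f * 90] for f in range(1, 12)}
clustered = cluster_sample(floors, 2, rng)           # everyone on 2 random floors

print(len(srs), len(systematic), len(clustered))     # 50 20 180
```

The stratified sample draws some agents from every age bracket, while the cluster sample takes all agents from only a few floors.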