STA 301 Final Exam

A survey found that the sample average hotel room rate in New Orleans is $88.42 and the sample average room rate in Phoenix is $80.61. Assume that the data were obtained from two independent random samples of 25 hotels each and that the sample standard deviations are $5.62 and $4.83, respectively. An investigator is interested in testing whether there is a significant difference in the true mean hotel room rates. Choose the appropriate target population parameter from the following list. A. (μ1 - μ2) B. (p1-p2) C. (sample mean 1-sample mean2) D. (X1-X2)

A. (μ1 - μ2)
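
As a rough illustration of how the sample figures relate to the parameter μ1 - μ2, the sketch below computes the point estimate and a two-sample test statistic. The unpooled (Welch-style) standard error is an assumption made for this example; the card itself does not say which form the course uses.

```python
import math

# Summary statistics from the question (sample 1 = New Orleans, sample 2 = Phoenix).
mean1, sd1, n1 = 88.42, 5.62, 25
mean2, sd2, n2 = 80.61, 4.83, 25

# Point estimate of the target parameter mu1 - mu2.
diff = mean1 - mean2

# Standard error of the difference, without pooling the variances
# (an assumption for this sketch, not something the card specifies).
se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Test statistic for H0: mu1 - mu2 = 0.
t_stat = diff / se

print(f"estimate = {diff:.2f}, SE = {se:.3f}, t = {t_stat:.2f}")
```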

Which statement best describes a random phenomenon? A. A phenomenon is random if we know what outcomes could happen, but not which particular values will happen in any given trial. B. A random phenomenon is exactly the same as sample mean for a categorical variable. C. A random phenomenon is exactly the same as conditional probability or marginal probability. D. A phenomenon is random if we exactly know which particular value(s) will happen, but not what outcomes could happen in any given trial.

A. A phenomenon is random if we know what outcomes could happen, but not which particular values will happen in any given trial.

An insurance company sets up a statistical test with a null hypothesis that the true average time for processing a claim is 7 days, and an alternative hypothesis that the true average time for processing a claim is greater than 7 days. The hypothesis test rejected the null hypothesis and concluded that the average time exceeds 7 days at α (level of significance) = 0.05. However, at the annual audit it was learned that the true mean processing time is really 7 days. What type of error has been committed in the statistical test? A. A type I error has been committed. B. An error committed cannot be determined without knowing the first quartile of the data. C. Both type I and type II errors have been committed at the same time. D. A type II error has been committed.

A. A type I error has been committed.

A company is interested in studying the several behaviors of all employees in the Midwest region of USA. The company surveyed a random sample of 9,500 employees in the Midwest region of USA. One question they asked was, "If your employer provides you with mentoring opportunities are you likely to remain in your job for the next five years?" They found that 630 members of the sample said yes. Identify the population of interest to the company. A. All employees in the Midwest region of USA of a company. B. The 630 employees who answered yes. C. The 9,500 employees surveyed. D. All people in the world.

A. All employees in the Midwest region of USA of a company.

Which of the following is NOT the required assumption for a valid Chi-Square goodness-of-fit test? A. All observed counts must be at least 5. B. All expected counts must be at least 5. C. Data are obtained by randomization.

A. All observed counts must be at least 5.

Suppose a recent study of 1,000 teenagers in the U.S. found that 33% of them do babysitting to earn extra money. Which of the following describes the population for this example? A. All teenagers in the U.S. B. The 67% of teenagers who do NOT do babysitting to earn extra money. C. The 1,000 teenagers who participated in the study. D. The 33% of teenagers who do babysitting to earn extra money

A. All teenagers in the U.S.

The variable eye color of a human is A. Categorical (qualitative). B. Quantitative (discrete). C. Quantitative (continuous). D. Can be both quantitative and categorical depending on the day of the week.

A. Categorical (qualitative).

Which of the following Chi-Square tests is a generalization of the two independent proportions z-test? A. Chi-Square Test of Homogeneity B. All of the Chi-Square Tests mentioned here C. Chi-Square Goodness of Fit Test D. Chi-Square Test of Independence

A. Chi-Square Test of Homogeneity

The theoretical long-run average value of a random variable which measures the center of the probability distribution is known as A. Expected value. B. Probability density function (PDF). C. Probability mass function (PMF). D. Cumulative distribution function (CDF).

A. Expected value.

In One-way Analysis of Variance (One-way ANOVA), the probability of making at least one type I error among a family of comparisons is known as: A. Experiment-wise error rate (αe). B. Comparison-wise error rate (αc). C. Type II error rate (β). D. Mean square error (MSE).

A. Experiment-wise error rate (αe).
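
A small sketch of why the experiment-wise rate grows with the number of comparisons. It uses the textbook relation αe = 1 - (1 - αc)^c, which assumes the c comparisons are independent; the function name and the three-group setup are illustrative choices, not part of the card.

```python
import math


def experimentwise_error_rate(alpha_c: float, num_comparisons: int) -> float:
    """P(at least one type I error) across independent comparisons."""
    return 1 - (1 - alpha_c) ** num_comparisons


# With 3 groups there are C(3, 2) = 3 pairwise comparisons.
num_groups = 3
num_comparisons = math.comb(num_groups, 2)

rate = experimentwise_error_rate(0.05, num_comparisons)
print(f"{num_comparisons} comparisons at 0.05 each -> experiment-wise rate {rate:.4f}")
```

Even with each comparison held at 0.05, the chance of at least one false rejection across the family is noticeably larger, which is the motivation for procedures such as Tukey's HSD.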

Which distribution should be used to model the waiting time until the first event occurs? A. Exponential distribution B. Normal distribution C. Standard normal distribution D. Gamma distribution

A. Exponential distribution

"Do you travel over 10 miles to go to work?" This survey question was posed to a random sample (n = 400) of Indiana government employees. The answers were yes for 284 and no for 116. Is there convincing evidence that the true proportion of Indiana government employees who travel over 10 miles to go to work is > 2/3? State null and alternative hypotheses for an appropriate hypothesis test at LaTeX: \alpha α (level of significance) = 0.05 (You do NOT need to perform the hypothesis test). A. Ho: P=2/3 , Ha: P>2/3 B. Ho: µ=10 , Ha:µ≠10 C. Ho: µ=2/3 , Ha: µ>2/3 D. Ho: µ=10 , Ha:µ>10

A. Ho: P=2/3 , Ha: P>2/3

Choose an incorrect relationship among mean, median, and mode for a normal distribution. A. Mean < Median B. Mean = Median C. Mode = Median D. Mean = Mode

A. Mean < Median

"Do you travel over 10 miles to go to work?" This survey question was posed to a random sample (n = 400) of Indiana government employees. The answers were yes for 284 and no for 116. Is there convincing evidence that more than two-thirds of Indiana government employees travel over 10 miles to go to work? Given that test statistic z = 1.838 and p-value = 0.0329, interpret the p-value on the context of the problem (You do NOT need to perform the hypothesis test). A. Over repeated random sampling of 400 Indiana government employees, we expect the z-statistic to exceed 1.838 about 3.29% of time, if indeed exactly 2/3 (or less) of Indiana government employees travel over 10 miles to go to work. B. The probability of rejecting null hypothesis is 0.0329. C. All of the provided answers are the correct interpretations of p-value on the context. D. The probability of fail to reject null hypothesis is 0.0329.

A. Over repeated random sampling of 400 Indiana government employees, we expect the z-statistic to exceed 1.838 about 3.29% of time, if indeed exactly 2/3 (or less) of Indiana government employees travel over 10 miles to go to work.
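
The quoted test statistic and p-value can be reproduced, approximately, with a one-proportion z-test. This is a minimal sketch using the normal approximation and Python's standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, yes = 400, 284
p0 = 2 / 3                 # value of P under H0
p_hat = yes / n            # sample proportion, 0.71

# Standard error computed under the null hypothesis.
se = math.sqrt(p0 * (1 - p0) / n)

z = (p_hat - p0) / se

# Upper-tail p-value because Ha is P > 2/3.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The result should land close to the z = 1.838 and p-value of about 0.033 given in the question.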

Which distribution should be used to model the number of events that occur over a given interval of time? A. Poisson distribution B. Gamma distribution C. Normal distribution D. Z score

A. Poisson distribution

Representatives of the insurance industry wished to investigate the monetary loss resulting from earthquake damage to single-family dwellings in Northridge, California, in January 1994. From the set of all single-family homes in Northridge, California 100 homes were selected for inspection. In the problem described above, the sample is: A. The 100 single-family homes in Northridge, California that were selected for inspection in January 1994. B. All single-family homes that existed in Northridge, California in January 1994. C. All single-family homes that were damaged by the earthquake. D. The single-family homes that were inspected that showed damage in Ohio.

A. The 100 single-family homes in Northridge, California that were selected for inspection in January 1994.

A study of elementary school children, ages 6 to 11, finds a high positive correlation between shoe size x and score y on a test of reading comprehension. The observed correlation is most likely due to: A. The effect of a lurking variable, such as an age. B. Reverse cause and effect relationship (i.e. higher reading comprehension causes larger shoe size). C. A mistake, since the correlation must be negative. D. Cause and effect relationship (i.e. larger shoe size causes higher reading score).

A. The effect of a lurking variable, such as an age.

An insurance company studies amount of claims for property damage due to fires. The company finds a high positive correlation between number of firemen who responded to the fire x and amount of damage y to the property. The observed correlation is most likely due to: A. The effect of a lurking variable, such as the size of the fire. B. A mistake, since the insurance company can never know the meaning of correlation. C. A mistake, since the correlation can never be positive for any data. D. Cause and effect (more firemen causes more damage).

A. The effect of a lurking variable, such as the size of the fire.

The correlation coefficient measures: A. The strength and direction of a linear relation between two quantitative variables. B. Whether a cause and effect relation exists between two quantitative variables. C. Whether there is quadratic relation exists between two quantitative variables. D. Whether there is non-linear relation exists between two quantitative variables.

A. The strength and direction of a linear relation between two quantitative variables.

What is the purpose of Chi-Square Goodness of Fit Test? A. To test the distribution of a qualitative (categorical) variable. B. To determine if the true mean is equal to the true proportion. C. To compare more than two population means. D. To determine whether an association exists between two categorical variables. E. To compare the distributions of a variable of two or more populations.

A. To test the distribution of a qualitative (categorical) variable.

In One-way Analysis of Variance (One-way ANOVA), for pairwise comparison, why is Tukey's HSD (Honestly Significant Difference) method more popular? A. Tukey's HSD Method is more popular because it not only controls the experiment wise type I error rates but also maintains the high power of the test. B. Tukey's HSD Method is more popular because it is completely error free method. C. Tukey's HSD Method is more popular because it never controls the experiment wise type I error rates and it never maintains the high power of the test. D. Tukey's HSD Method is more popular because it is the only one method available in the world for pairwise comparison.

A. Tukey's HSD Method is more popular because it not only controls the experiment wise type I error rates but also maintains the high power of the test.

Label the random variable G as either discrete or continuous based on its description. Let G = The amount of milk produces yearly by a particular cow (gallons). A. Variable G is a continuous random variable. B. Variable G is a discrete random variable. C. If the cow is brown in color the variable G is a discrete random variable. D. All of the possible answers presented in this question are correct answers on the context.

A. Variable G is a continuous random variable.

In One-way Analysis of Variance (One-way ANOVA), the residual versus fitted plot can be used to assess: A. both the constant variance and linearity assumptions. B. normality assumption. C. independence assumption. D. normality and independence assumptions.

A. both the constant variance and linearity assumptions.

When the outcome of the first event affects the outcome of the second event, the two events are said to be ...... A. dependent events. B. marginal probability. C. conditional probability. D. independent events.

A. dependent events.

In One-way Analysis of Variance (One-way ANOVA), a categorical explanatory variable is known as A. factor. B. normality assumption. C. sample mean. D. population mean.

A. factor.

A type of probability that gives the probability of two events occurring together is known as....... A. joint probability. B. theorem of total probability or rule of elimination. C. conditional probability. D. marginal probability.

A. joint probability.

An ordered arrangement of a set of objects is known as....... A. permutation. B. sample mean or sample standard deviation. C. combination. D. disjoint events with probability = 1.9999

A. permutation.

A Chi-Square probability density function curve is _______. A. right-skewed B. always normal C. left-skewed D. symmetric

A. right-skewed

A ....... is the most basic outcome of an experiment. A. sample space B. sample point C. union and intersection D. compound event

B. sample point

The natural tendency of randomly drawn samples to differ from one another is known as A. sampling variability or sampling error. B. biased sample. C. standard error. D. census.

A. sampling variability or sampling error.

Using data on the appraised value of diamonds, a jewelry store owner computes the least-squares regression line for predicting a diamond's price in dollars from its size in carats. The equation of the least-squares regression line is: ŷ = $3453.36 + 725.22(x), where y represents a diamond's value in dollars and x is the size of the diamond in carats. Suppose the jeweler is looking to purchase a 2 carat diamond for his store. How much would he expect to pay for the diamond? Note: ŷ = predicted y value. A. $3,453.36 B. $4,903.80 C. $725.22 D. $4,178.58

B. $4,903.80
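
The prediction is a direct substitution of x = 2 into the fitted line. A minimal sketch, with the function name chosen only for illustration:

```python
# Coefficients of the fitted least-squares line from the question.
INTERCEPT = 3453.36   # dollars
SLOPE = 725.22        # dollars per carat


def predict_price(carats: float) -> float:
    """Predicted diamond value in dollars for a given size in carats."""
    return INTERCEPT + SLOPE * carats


price = predict_price(2)
print(f"${price:,.2f}")
```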

What is the implication of a small p-value in the context of hypothesis testing? A. A small P-value says that the data we have observed are consistent with the model from the null hypothesis, and we have no reason to reject the null hypothesis. Formally, we say that the test "fails to reject" the null hypothesis. B. A small P-value says that the data we have observed would be very unlikely if our null hypothesis were true, and we have a reasonable reason to reject the null hypothesis.

B. A small P-value says that the data we have observed would be very unlikely if our null hypothesis were true, and we have a reasonable reason to reject the null hypothesis.

Which of the following is true about an outlier? A. Any observation less than Q1 - (IQR) is considered an outlier. B. Any observation whose z-score is greater than 3 in absolute value is considered an outlier. C. Any observation greater than Q3 + (IQR) is considered an outlier. D. Any observation whose Q1 = 35 is considered an outlier for any data.

B. Any observation whose z-score is greater than 3 in absolute value is considered an outlier.

What does it mean for an estimator to be consistent? A. As the sample size gets smaller and smaller the estimates ultimately get closer and closer (i.e. converge) to the true value of the parameter. B. As the sample size gets larger and larger the estimates ultimately get closer and closer (i.e. converge) to the true value of the parameter. C. The expected value (theoretical mean) of the estimator equals the parameter we are estimating. D. Of all possible unbiased estimators for a population parameter, the estimator that is the minimum variance unbiased estimator (MVUE) has the largest variance.

B. As the sample size gets larger and larger the estimates ultimately get closer and closer (i.e. converge) to the true value of the parameter.

A special type of discrete random variable that results in a dichotomy (only two possible outcomes) is known as A. Probability density function (PDF). B. Bernoulli Random Variable. C. Cumulative distribution function (CDF). D. Continuous Random Variable.

B. Bernoulli Random Variable.

Select the correct null and alternative hypotheses (in general) for Chi-Square Test of Independence. A. H0: The variable has the specified distribution, HA: The variable does not have the specified distribution. B. H0: The two variables are independent, HA: The two variables are dependent (=associated). C. H0: The means are all equal, HA: The means are not all equal (at least two of the means differ). D. H0: The populations are homogeneous with respect to the variable, HA: The populations are not homogeneous with respect to the variable. E. H0: The true mean is equal to the true proportion, HA: The true mean is not equal to the true proportion.

B. H0: The two variables are independent, HA: The two variables are dependent (=associated).

Explain what the phrase 95% confident means when we interpret a 95% confidence interval for population mean. A. 95% of the observations in the population fall within the bounds of the calculated interval. B. In repeated sampling, 95% of similarly constructed confidence intervals contain the value of the population mean. C. The probability that the sample mean falls in the calculated interval is 0.95. D. 95% of similarly constructed confidence intervals would contain the value of the sampled mean.

B. In repeated sampling, 95% of similarly constructed confidence intervals contain the value of the population mean.

A sample of 36 commuters in Chicago showed the sample average of the commuting times was 33.2 minutes and the sample standard deviation was 8.3 minutes. A researcher is interested in finding a 99% confidence interval of true average commuting times in Chicago. What is an appropriate population parameter in this setting? A. Sample mean B. Population mean C. Sample proportion D. Population proportion

B. Population mean

A university was interested in student reaction to a proposal to spend more on athletic scholarships and less on academic scholarships. 35 student athletes were intentionally surveyed, ignoring students who are NOT student athletes. What type of sampling bias may have occurred? A. Non-response bias B. Selection bias C. None D. Measurement error

B. Selection bias

Molly's Reach, a regional restaurant and gift shop, has recently launched an increased sales campaign, particularly focused on their gift shop. To determine if their proposal is of interest, they plan to survey a random sample of their regular customers. Customers are grouped into four age categories (under 21, 21 to 35, 36 to 50, and over 50). Randomly select 10 regular customers in each age category. Name the sampling strategy (design). A. Cluster sampling B. Stratified sampling C. Simple random sampling D. Systematic sampling

B. Stratified sampling

The Central Limit Theorem states that the sampling distribution of the sample mean is approximately normal under certain conditions. Which of the following conditions is necessary for the Central Limit Theorem to be used? A. The population size must be large such as at least 30. B. The sampling must be done randomly and the sample size must be large (e.g., at least 30). C. The sample size must be very small such as less than 10. D. The population from which we are sampling must have a known variance.

B. The sampling must be done randomly and the sample size must be large (e.g., at least 30).

A sociologist is concerned about the effectiveness of a training course designed to get more drivers to use seat belts in automobiles. She is interested in testing the following hypotheses: H0: The training course is NOT effective. HA: The training course is effective. Explain what it would mean to commit a Type I Error in this context. A. The test concludes that the training course is NOT effective, but, in reality, it is effective. B. The test concludes that the training course is effective, but, in reality, it is NOT effective.

B. The test concludes that the training course is effective, but, in reality, it is NOT effective.

Choose the best statement that is true about the standardized z-score of a value of a normal random variable X, which has mean µ and standard deviation σ. A. The z-score has a mean = 1 and variance = 0. B. The z-score has a mean equal to 0, the z-score has a standard deviation equal to 1, the distribution of z-scores is a normal distribution, and the z-score tells us by how many multiples of σ the original X observation fall away from the mean. C. The z-score has a mean = 1 and variance = 1 D. The z-score has a mean equal to 0, the z-score has a standard deviation equal to 1, the distribution of z-scores is a normal distribution, and the z-score tells us by how many multiples of σ the original X observation fall away from the zero.

B. The z-score has a mean equal to 0, the z-score has a standard deviation equal to 1, the distribution of z-scores is a normal distribution, and the z-score tells us by how many multiples of σ the original X observation fall away from the mean.

What is the total area or probability under any probability density function curve for a continuous random variable such as normal distribution curve? A. Total area or probability = 0.25 B. Total area or probability = 1 C. Total area or probability = 0 D. Total area or probability = 0.5

B. Total area or probability = 1

When we study every unit of a population of interest, it is called A. a statistic. B. a census. C. a biased sample. D. a random sample.

B. a census.

Given H0: P = 0.25, Ha: P ≠ 0.25, and P-value = 0.094. The test rejected the null hypothesis. Which α (level of significance) was used for making a decision in this hypothesis test? A. alpha = 1% B. alpha = 10% C. alpha = 5% D. alpha = 4%

B. alpha = 10%
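
The decision rule behind this answer is "reject H0 when the p-value is below α". A short sketch that applies the rule to each offered level:

```python
p_value = 0.094

# Significance levels offered in the answer choices.
alphas = [0.01, 0.04, 0.05, 0.10]

# Reject H0 exactly when the p-value falls below alpha.
decisions = {alpha: p_value < alpha for alpha in alphas}

for alpha, reject in decisions.items():
    verdict = "reject H0" if reject else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: {verdict}")
```

Only the 10% level leads to rejection, which matches the stated outcome of the test.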

The sampling distribution of a statistic is a probability distribution A. of all values in the population. B. of all values the statistic can take, calculated from all possible random samples of a specific size (n) taken from a population. C. of the maximum value the statistic can take in all possible samples. D. of all values in the sample.

B. of all values the statistic can take, calculated from all possible random samples of a specific size (n) taken from a population.

When comparing two independent population means, if n1 < 30 and n2 < 30, the samples are known as A. large samples. B. small samples. C. power of the hypothesis test. D. population parameters.

B. small samples.

The central limit theorem says that when a random sample of size n is drawn from any population with mean μ and standard deviation σ, then when n is sufficiently large (n ≥ 30) A. the distribution of the sample mean is exactly normal. B. the distribution of the sample mean is approximately normal. C. the standard error of the sample mean is: (sigma square)/n D. the distribution of the population is exactly normal.

B. the distribution of the sample mean is approximately normal.

Which of the following target population parameters represents the difference in two population proportions for testing two independent proportions? A. (p1+p2) B. (μ1+μ2) C. (p1-p2) D. (μ1-μ2)

C. (p1-p2)

A water hydrant dispenses water at a rate described by a uniform continuous distribution over the interval 50 to 70 gallons per minute. Find the probability that between 58 gallons and 62 gallons are dispensed during a randomly selected minute. Formula: P(c< x < d) = (d-c) / (b - a) A. 0.50 B. 0.02 C. 0.20 D. 0.10

C. 0.20
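
The formula in the question, P(c < x < d) = (d - c) / (b - a), translates directly into code. A minimal sketch; the function name and the range check are additions for illustration:

```python
def uniform_interval_prob(c: float, d: float, a: float, b: float) -> float:
    """P(c < X < d) for X uniform on [a, b], assuming a <= c <= d <= b."""
    if not (a <= c <= d <= b):
        raise ValueError("expected a <= c <= d <= b")
    return (d - c) / (b - a)


# Hydrant rate is uniform on 50 to 70 gallons per minute.
prob = uniform_interval_prob(58, 62, 50, 70)
print(prob)
```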

A commuter must pass through five traffic lights on her way to work and will have to stop at each one that is red. After keeping a record for several months, she developed the following probability mass function (PMF) for X, the number of red lights she hits. x: 0, 1, 2, 3, 4, 5. P(X = x): 0.05, 0.25, 0.35, 0.15, 0.15, 0.05. Find P(1 < X < 4). A. 0.9 B. None C. 0.5 D. 1.0

C. 0.5
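
Because the inequality is strict, only x = 2 and x = 3 contribute. A short sketch that stores the PMF as a dictionary and sums the qualifying probabilities:

```python
# PMF from the question: number of red lights -> probability.
pmf = {0: 0.05, 1: 0.25, 2: 0.35, 3: 0.15, 4: 0.15, 5: 0.05}

# A valid PMF must sum to 1 (up to floating-point rounding).
total = sum(pmf.values())

# P(1 < X < 4) uses strict inequalities, so x = 1 and x = 4 are excluded.
prob = sum(p for x, p in pmf.items() if 1 < x < 4)

print(f"total = {total:.2f}, P(1 < X < 4) = {prob:.2f}")
```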

Indiana University administration reported that 56% of all faculty and staff members donated to the United Way campaign. A survey of a random sample of 100 faculty and staff members found that 60% have donated to United Way campaign. In this setting, A. 60% and 56% are both parameter values. B. 60% is a parameter value and 56% is a statistic value. C. 56% is a parameter value and 60% is a statistic value. D. 56% and 60% are both statistic values.

C. 56% is a parameter value and 60% is a statistic value.

Which of the following statements is false in the context of simple linear regression and correlation? A. Before fitting a linear regression model on our data, we should always first examine a scatterplot of independent variable versus dependent variable. B. A regression line is used to estimate the expected value of the dependent variable when the value of the independent variable is given in the range of x-values. C. A negative correlation between two variables means that the values of the two variables decrease at the same time. D. A positive relationship exists when both independent and dependent variables increase or decrease at the same time. In this pattern, the plot runs from the lower left to the upper right.

C. A negative correlation between two variables means that the values of the two variables decrease at the same time.

An insurance company sets up a statistical test with a null hypothesis that the true average payment per claim is $175, and an alternative hypothesis that the true average payment per claim is greater than $175. After completing the statistical test using a sample of 250 claims, it is concluded that the true average payment per claim is $175 (failed to reject null hypothesis) at α (level of significance) = 0.05. However, at the annual audit it was learned that the true average payment per claim is really greater than $175. What type of error occurred in the statistical test? A. No error has been committed. B. Error committed cannot be determined without knowing the third quartile of the data. C. A type II error has been committed. D. A type I error has been committed.

C. A type II error has been committed.

The bar graph is appropriate to display A. Quantitative variable. B. Quantitative continuous variable. C. Categorical variable. D. Numerical variable. E. Quantitative discrete variable.

C. Categorical variable.

When conducting One-way Analysis of Variance (One-way ANOVA), we use A. Geometric distribution. B. Binomial distribution. C. F distribution. D. Poisson distribution.

C. F distribution.

Which of the following is a possible form for a null hypothesis? A. H0: population characteristic < hypothesized value B. H0: population characteristic > hypothesized value C. H0: population characteristic = hypothesized value D. H0: population characteristic ≠ hypothesized value

C. H0: population characteristic = hypothesized value

Which statement best describes inferential statistics? A. A boxplot (a graphical method to describe quantitative data) is an inferential statistic. B. A frequency table (a numerical method to describe a categorical data) is an inferential statistic. C. Inferential statistics utilize sample data to make estimates, decision, predictions, or other generalizations about a population. D. Inferential statistics utilize numerical and graphical methods to look for patterns in a data set, to summarize the information revealed in a data set, and to present the information in a convenient form.

C. Inferential statistics utilize sample data to make estimates, decision, predictions, or other generalizations about a population.

Which of the following is NOT a measure of the variability of a distribution of quantitative variable? A. Standard deviation B. Range C. Mean D. Interquartile range (IQR)

C. Mean

Which of the following statistics (numerical summaries) can be used to describe the center of the distribution for both quantitative and categorical variables? A. Mean B. Median C. Mode D. Standard deviation

C. Mode

Explain whether or not the following number could be an example of a probability. P(A) = 8 A. Yes, because probability can be > 1 for very large number of trials. B. No, because probability must be negative. C. No, because probability cannot be > 1. D. Yes, because probability must be > 1 for any experiment.

C. No, because probability cannot be > 1.

Explain whether or not the following number could be an example of a probability. P(B) = - 0.4567 A. No, because probability must be > 1 for any experiment. B. Yes, because probability can be negative. C. No, because probability cannot be negative. D. Yes, because probability must be negative.

C. No, because probability cannot be negative.

Which of the following graphical methods cannot be used to describe categorical variables? A. Bar chart B. Histogram C. Pareto diagram D. Pie chart

B. Histogram

To qualify for a police academy, candidates must score in the top 10% on a general abilities test. The test has a mean of 200 and a standard deviation of 20. Find and interpret the lowest possible score to qualify for a police academy. Assume the test scores are normally distributed. The relevant R code and output are given below: > qnorm(0.90, mean = 200, sd = 20) [1] 225.63 A. The standard deviation possible score to qualify for a police academy for candidates is 225.63 on a general abilities test. B. The average possible score to qualify for a police academy for candidates is 225.63 on a general abilities test. C. The lowest possible score to qualify for a police academy for candidates is 225.63 on a general abilities test. D. The highest possible score to qualify for a police academy for candidates is 225.63 on a general abilities test.

C. The lowest possible score to qualify for a police academy for candidates is 225.63 on a general abilities test.
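
For readers without R, the same 90th-percentile cutoff can be found with Python's standard library. This sketch uses `statistics.NormalDist.inv_cdf` as a counterpart to the `qnorm` call quoted in the question; it is an equivalent calculation, not the course's own code.

```python
from statistics import NormalDist

# Test scores are assumed normal with mean 200 and standard deviation 20.
scores = NormalDist(mu=200, sigma=20)

# Top 10% means the cutoff is the 90th percentile.
cutoff = scores.inv_cdf(0.90)

print(f"lowest qualifying score: {cutoff:.2f}")
```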

Continuous random variables that appear to have equally likely outcomes (evenly distributed) over their range of possible values should be modeled using which distribution? A. Gamma distribution B. Binomial distribution C. Uniform continuous distribution D. Bernoulli distribution

C. Uniform continuous distribution

Label the random variable F as either discrete or continuous based on its description. Let F = Number of defective parts of a machine. A. If the part is made in USA it is continuous random variable. B. Variable F is a continuous random variable. C. Variable F is a discrete random variable. D. All of the possible answers presented in this question are correct answers on the context.

C. Variable F is a discrete random variable.

A random sample of 250 students at Indiana University finds that these students take an average of 15.6 credit hours per semester with a standard deviation of 2.1 credit hours. The 98% confidence interval for the true mean is 15.6 ± 0.309 (i.e. sample mean ± margin of error). Interpret the confidence interval. A. We are 98% confident that the average number of credit hours per semester of the sampled students falls in the interval 15.291 to 15.909 hours. B. The probability that a student takes 15.291 to 15.909 credit hours in a semester is 0.98. C. We are 98% confident that the true average number of credit hours per semester taken by Indiana University students falls in the interval 15.291 to 15.909 hours. D. 98% of the students take between 15.291 to 15.909 credit hours per semester.

C. We are 98% confident that the true average number of credit hours per semester taken by Indiana University students falls in the interval 15.291 to 15.909 hours.
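
The margin of error of 0.309 can be checked from the summary statistics. This sketch assumes a normal (z) critical value, which is reasonable for n = 250 and reproduces the quoted margin; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, sample_mean, sample_sd = 250, 15.6, 2.1
confidence = 0.98

# Two-sided interval: put (1 - confidence) / 2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_star * sample_sd / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"margin = {margin:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```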

The Central Limit Theorem is considered powerful in statistics because A. it works for any population distribution provided the population mean is known. B. it works for any sample size provided the population is normal. C. it works for any population distribution provided the sample size from a random sample is sufficiently large. D. it works for any sample provided the population distribution is known.

C. it works for any population distribution provided the sample size from a random sample is sufficiently large.

A sample space of an experiment is.......... A. the collection of sample points that have negative probability. B. the collection of all sample points that have a probability of at least 50%. C. the collection of all its sample points. D. the collection of all sample points that have a probability of at most 50%.

C. the collection of all its sample points.

A random sample of 16 measurements selected from an approximately normally distributed population produced sample mean = 97.94 and sample standard deviation = 12.64. If we construct 80% and 95% confidence intervals (CI) for the population mean from the data, which statement is true? A. The width of both of these CIs for population mean will be exactly the same. B. 80% CI will be wider than the 95% CI for population mean. C. 80% CI will be narrower than the 95% CI for population mean.

C. 80% CI will be narrower than the 95% CI for population mean.
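
A rough check of the two widths, assuming a t interval with n - 1 = 15 degrees of freedom. The critical values are hard-coded from a standard t-table rather than computed, and the helper name is illustrative.

```python
from math import sqrt

# Two-sided t critical values for 15 degrees of freedom (standard t-table values).
T_CRITICAL_DF15 = {0.80: 1.341, 0.95: 2.131}

def t_interval_width(sd, n, t_crit):
    """Full width of a t interval: 2 * t * s / sqrt(n)."""
    return 2 * t_crit * sd / sqrt(n)

width_80 = t_interval_width(12.64, 16, T_CRITICAL_DF15[0.80])
width_95 = t_interval_width(12.64, 16, T_CRITICAL_DF15[0.95])
print(width_80 < width_95)  # True
```

Only the critical value changes between the two intervals, so the lower confidence level always gives the narrower interval.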

In One-way Analysis of Variance (One-way ANOVA), for comparing three population means, if sample mean of first population = sample mean of second population = sample mean of third population = overall sample mean of three populations, what will be the value of the test statistic (F-value)? A. 1.99 B. 0.5 C. 1 D. 0

D. 0
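
To see why F is 0 here, the sketch below computes the one-way ANOVA F statistic by hand for three made-up groups that all share the same sample mean. The data and function name are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical samples: every group mean equals the overall mean of 5.
groups = [[4, 5, 6], [3, 5, 7], [1, 5, 9]]
print(one_way_f(groups))  # 0.0
```

When every group mean equals the overall mean, the between-group sum of squares is 0, so the numerator of F is 0.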

Probability mass function (PMF) is used to describe the behavior of which type of a random variable? A. Unknown random variable if probability = 1.99. B. Continuous random variable if probability = - 1.99. C. Continuous random variable. D. Discrete random variable.

D. Discrete random variable.

A weapons manufacturer uses a liquid propellant to produce gun cartridges. During the manufacturing process, the propellant can get mixed with another liquid to produce a contaminated cartridge. A statistician found that 23% of the cartridges in a particular lot were contaminated. Suppose you randomly sample (without replacement) gun cartridges from this lot until you find the first contaminated one. Let X be the number of cartridges sampled until the first contaminated one is found. Which distribution best describes the context? A. Binomial distribution B. Normal distribution C. Poisson distribution D. Geometric distribution

D. Geometric distribution
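
A minimal sketch of the geometric probabilities for this card, with p = 0.23. It assumes the lot is large enough that sampling without replacement behaves like independent trials, which is an approximation.

```python
def geometric_pmf(k, p):
    """P(X = k): the first success occurs on trial k (k = 1, 2, ...)."""
    return (1 - p) ** (k - 1) * p

p = 0.23  # proportion of contaminated cartridges from the card
for k in range(1, 6):
    print(k, round(geometric_pmf(k, p), 4))
```

Each extra trial multiplies the probability by (1 - p), so the first contaminated cartridge is most likely to turn up early.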

If all else remains the same, which of the following will make a confidence interval for the population mean narrower? I. Decrease the confidence level II. Decrease the sample size III. Decrease the margin of error A. I only B. II and III C. II only D. I and III

D. I and III

If all else remains the same, which of the following will make a confidence interval for the population mean wider? I. Increase the confidence level II. Increase the sample size III. Decrease the margin of error A. I and II B. I and III C. II only D. I only

D. I only

You wish to survey Miami students about their viewpoints on whether or not the first class of the day should start at 10:00 A.M. instead of 8:30 A.M. Which of the following sampling methods will lead to a representative sample? I. You stand outside of the Rec Center at 7:30 A.M., and ask every tenth person who walks by to participate in a survey. II. You stratify the student body by major and give the survey to students studying political science. III. You assign a number to each student at Miami University and use a random number generator such as random.org to select 100 numbers. You give the survey to the students whose numbers were chosen. IV. You assign a number to each student at Miami University and you and your friends pick your favorite numbers. You sample the students whose numbers were chosen. V. You stratify the student body by college (for example, Arts and Science; Education, Health & Society; Engineering, etc.) and randomly select 50 students from each college to participate in a survey. A. III, V B. I, II, V C. II, IV D. III, IV, V E. I, II, IV

A. III, V

In general, which of the following statements is true about the sampling distribution of the sample mean? Formula: Standard error of sample mean = σ/√n A. Increasing the sample size does not change the standard error. B. Increasing the sample size increases the standard error. C. Standard error will be negative by increasing the sample size. D. Increasing the sample size decreases the standard error.

D. Increasing the sample size decreases the standard error.

As the number of observations of a random variable increases, the average of the observations converges to the expected value. In other words, if we repeat an experiment a large number of times our observed average should be fairly close to the expected value. This gives the definition of A. Law of very small numbers. B. Law of permutation. C. Chebyshev's law. D. Law of large numbers.

D. Law of large numbers.
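
A small simulation sketch of the law of large numbers, using a fair six-sided die (expected value 3.5) as an assumed example. The function name and seed are illustrative.

```python
import random

def average_die_roll(num_rolls, seed=0):
    """Average of num_rolls simulated fair die rolls (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(rng.randint(1, 6) for _ in range(num_rolls))
    return total / num_rolls

# The observed average tends to settle near 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, average_die_roll(n))
```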

Which of the following is not affected by outliers? A. Variance B. Range C. Mean D. Median

D. Median

The least squares criterion in the simple linear regression model: A. Maximizes the sum of the residuals. B. Maximizes the sum of the squared residuals. C. Minimizes the sum of the residuals. D. Minimizes the sum of the squared residuals.

D. Minimizes the sum of the squared residuals.
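
The criterion can be illustrated with the closed-form least squares estimates for a simple linear regression. The data points below are made up, and the function names are illustrative.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares_fit(xs, ys)
best = sse(xs, ys, b0, b1)
# Nudging either coefficient away from the fitted value increases the SSE.
print(best < sse(xs, ys, b0 + 0.1, b1), best < sse(xs, ys, b0, b1 + 0.1))
```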

Two events are said to be ....... events if they have no outcomes in common. A. Independent B. Dependent C. Non-mutually exclusive D. Mutually exclusive or disjoint

D. Mutually exclusive or disjoint

The normal probability plot (QQ plot) of residual can be used to assess which of the assumptions (LINE) necessary for a simple linear regression model? A. Linearity (L) B. Equal variance (E) C. Independence (I) D. Normality (N)

D. Normality (N)

Statistical inference is concerned with making decisions or predictions about A. Only the non-response bias. B. Sample characteristics. C. Datum. D. Population characteristics.

D. Population characteristics.

Suppose you want to survey the opinions of the residents of an apartment complex. The complex contains a total of 800 units. You decide to sample 160 of the units. Suppose you start at a random starting point, and select every 5th unit to be in your sample, until you sample 160 units. What sampling method did you use? A. Stratified sampling B. Cluster sampling C. Simple random sampling D. Systematic sampling

D. Systematic sampling

Assume each newborn baby had a probability of approximately 0.54 of being female and 0.46 of being male. For a family of four children, let X = number of children who are female. Which of the following statements correctly describes the required conditions of a binomial distribution? A. The n trials are independent, each trial has at least two possible outcomes, and each trial has the same probability of a success. B. The n trials are dependent, there are two trials, and each trial has two possible outcomes. C. The n trials are dependent, each trial has the same probability of a success, and each trial has two possible outcomes. D. The n trials are independent, each trial has the same probability of a success, and each trial has two possible outcomes.

D. The n trials are independent, each trial has the same probability of a success, and each trial has two possible outcomes.
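
Under those conditions, X follows a binomial distribution with n = 4 and p = 0.54. A minimal sketch of its probabilities (the function name is illustrative):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 4, 0.54  # four children, P(female) = 0.54 from the card
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(pmf):
    print(k, round(prob, 4))
```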

Suppose students' ages follow a right skewed distribution with a mean of 25 years old and a standard deviation of 5 years. If we randomly sample 100 students, which of the following statements about the sampling distribution of the sample mean age is incorrect? Formula: Standard error of sample mean = σ/√n A. The mean of the sampling distribution of sample mean is 25 years. B. The shape of the sampling distribution of sample mean is approximately normal. C. The sample size is 100. D. The standard error of the sampling distribution is equal to 5 years.

D. The standard error of the sampling distribution is equal to 5 years.
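
Applying the formula shows why statement D is the incorrect one: with σ = 5 and n = 100 the standard error is 0.5 years, not 5. A minimal sketch (the function name is illustrative):

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

print(standard_error(5, 100))  # 0.5
```

The same function applies to any card that uses this formula, and it shows the standard error shrinking as n grows.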

An urn contains 3 red balls, 2 blue balls, and 5 white balls. A ball is selected and its color noted, then it is replaced. A second ball is selected and its color noted. Selecting 1 blue ball and then 1 white ball in succession, with the first ball replaced, consists of A. events with conditional probability = 2.00 B. dependent events. C. events with negative probability. D. independent events.

D. independent events.
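
Because the first ball is replaced, the urn is unchanged for the second draw, so the joint probability is just the product of the two marginal probabilities. A short sketch with exact fractions:

```python
from fractions import Fraction

# Urn contents from the card: 3 red, 2 blue, 5 white (10 balls in total).
p_blue = Fraction(2, 10)
p_white = Fraction(5, 10)

# With replacement, the second draw does not depend on the first.
p_blue_then_white = p_blue * p_white
print(p_blue_then_white)  # 1/10
```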

In a large population of high school students who participated in the Indiana University High School Math Contest, the mean IQ is 120 with a standard deviation of 20. The distribution of IQ scores is normal. Suppose that 25 participants are chosen at random to be invited to a reception. The distribution of the sample mean IQ of the invitees is Formula: Standard error of sample mean = σ/√n A. normal with mean 120, standard error 5.0. B. normal with mean 120, standard error 20. C. normal with mean 120, standard error 0.8. D. normal with mean 120, standard error 4.0. E. approximately normal with mean 120, standard error 20.

D. normal with mean 120, standard error 4.0.

Select the correct null and alternative hypotheses (in general) for Chi-Square Test of Homogeneity. A. H0: The variable has the specified distribution, HA: The variable does not have the specified distribution. B. H0: The true mean is equal to the true proportion, HA: The true mean is not equal to the true proportion. C. H0: The means are all equal, HA: The means are not all equal (at least two of the means differ). D. H0: The two variables are independent, HA: The two variables are dependent (=associated). E. H0: The populations are homogeneous with respect to the variable, HA: The populations are not homogeneous with respect to the variable.

E. H0: The populations are homogeneous with respect to the variable, HA: The populations are not homogeneous with respect to the variable.

Select the correct null and alternative hypotheses (in general) for Chi-Square Goodness of Fit Test. A. H0: The two variables are independent, HA: The two variables are dependent (=associated). B. H0: The true mean is equal to the true proportion, HA: The true mean is not equal to the true proportion. C. H0: The populations are homogeneous with respect to the variable, HA: The populations are not homogeneous with respect to the variable. D. H0: The means are all equal, HA: The means are not all equal (at least two of the means differ). E. H0: The variable has the specified distribution, HA: The variable does not have the specified distribution.

E. H0: The variable has the specified distribution, HA: The variable does not have the specified distribution.

What is the purpose of Chi-Square Test of Homogeneity? A. To determine whether an association exists between two categorical variables. B. To test the distribution of a qualitative (categorical) variable. C. To compare more than two population means. D. To determine if the true mean is equal to the true proportion. E. To compare the distributions of a variable of two or more populations.

E. To compare the distributions of a variable of two or more populations.

What is the purpose of Chi-Square Test of Independence? A. To compare more than two population means. B. To compare the distributions of a variable of two or more populations. C. To determine if the true mean is equal to the true proportion. D. To test the distribution of a qualitative (categorical) variable. E. To determine whether an association exists between two categorical variables.

E. To determine whether an association exists between two categorical variables.

A statistic is biased if the mean of the sampling distribution is equal to the parameter it is intended to estimate. True False

False

Correlation and simple linear regression both treat X and Y symmetrically i.e. the value of correlation coefficient and regression parameters will NOT be affected by treating X as Y and vice versa. True False

False

If 0 (i.e. null hypothesized population value) falls within a Tukey's HSD (Honestly Significant Difference) confidence interval, the test concludes that the two population means significantly differ. Context: pairwise comparisons in one-way ANOVA. True False

False

If a confidence interval for the difference between two independent population proportions contains 0 (H0: p1 = p2, Ha: p1 ≠ p2), we have enough evidence to believe that the two population proportions differ. True False

False

Of all possible unbiased estimators for a population parameter, the estimator that is the minimum variance unbiased estimator (MVUE) has the largest variance. True False

False

Scatterplots are useful to describe both qualitative and quantitative variables graphically. True False

False

We should use Chebyshev's rule to compute probabilities of a random variable even if we know the exact distribution of the random variable. True False

False

We use the acronym SOCS (Shape, Outlier, Center, and Spread) when describing the distribution for categorical data. True False

False

A Chi-Square test works by comparing the observed counts to the expected counts. In other words, Chi-square test is a measure of discrepancy between data and model. True False

True
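
The comparison of observed and expected counts can be sketched as the usual chi-square statistic, the sum of (observed - expected)² / expected over all categories. The counts below are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for four categories, with equal expected counts.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
stat = chi_square_statistic(observed, expected)
print(round(stat, 2))  # 12.32
```

A statistic of 0 means the data match the model exactly; larger values indicate a larger discrepancy.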

A One-way Analysis of Variance (One-way ANOVA) involves comparing the variation between the population means to the variation within the populations. True False

True

A point estimator of a population parameter is a rule or formula which tells us how to use sample data to calculate a single number that can be used as an estimate of the population parameter. True False

True

Adding or subtracting a constant from the random variable shifts the mean but doesn't change the variance or standard deviation. True False

True
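
A quick numerical check of this property on a made-up data set, using population variance from the standard library:

```python
from statistics import mean, pvariance

# Hypothetical data; the shift constant 10 is arbitrary.
data = [2, 4, 4, 4, 5, 5, 7, 9]
shifted = [x + 10 for x in data]

print(mean(data), mean(shifted))            # the mean moves by the constant
print(pvariance(data), pvariance(shifted))  # the variance is unchanged
```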

Continuous random variable uses integration, discrete random variable uses summation for finding cumulative distribution function (CDF). True False

True

Correlation coefficient has no units of measurement. True False

True

If α (alpha) increases, β (beta) decreases in a test of hypothesis. True False

True

In most situations, the true mean and true standard deviation of a population are unknown (unobserved) quantities that have to be estimated from sample data. True False

True

In one-way ANOVA, if variation between the population means is significantly greater than variation within the populations, then we conclude that not all of the means of the populations are equal. True False

True

In the context of the paired t-test (two dependent small samples t-test for means), observations between two subjects are independent. True False

True

One-Way analysis of variance (One-Way ANOVA) is a generalization of the two small independent random samples exact t-test. True False

True

The mean (expected value) of a Chi-Square distribution is equal to its number of degrees of freedom (df). True False

True
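
A simulation sketch of this fact, relying on the standard result that a chi-square variable with df degrees of freedom is a sum of df squared standard normal variables. The function name, seed, and number of draws are illustrative.

```python
import random

def simulate_chi_square_mean(df, num_draws=20_000, seed=0):
    """Average of simulated chi-square(df) draws, each a sum of df squared N(0, 1) values."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_draws):
        total += sum(rng.gauss(0, 1) ** 2 for _ in range(df))
    return total / num_draws

# The simulated average should land close to df.
print(simulate_chi_square_mean(5))
```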

Using Central Limit Theorem, when the sample sizes are large, the sampling distribution of (p̂1 − p̂2), i.e. (sample proportion 1 - sample proportion 2), will be approximately normal. True False

True

When the population variances are NOT equal, we should use an approximate t-test for testing two independent population means as opposed to an exact t-test for testing two independent population means. True False

True

When the population variances are equal, we should use an exact t-test for testing two independent population means as opposed to an approximate t-test for testing two independent population means. True False

True

When using a (1 − α)100% confidence interval to perform a hypothesis test, we reject the null hypothesis if the H0 value (null hypothesized value) is outside the confidence interval. True False

True

When a distribution is left-skewed, the mean is typically _______ the median. equal to greater than less than ten times more than

less than
