STAT 215 Questions
Which of the following statements about ANOVA is false? - To apply the F‐Test in ANOVA the sample size must be the same for all groups. - A statistically significant F value indicates that there are differences among the group means but does not tell you specifically which means are different from each other. - To apply the F‐Test in ANOVA the variances or standard deviations must be homogeneous among the groups. - The underlying distribution of individual observations within a group does not need to be normally distributed due to the Central Limit Theorem.
- The underlying distribution of individual observations within a group does not need to be normally distributed due to the Central Limit Theorem.
Rejecting the null hypothesis when the null hypothesis is true is ... a) The correct decision b) A Type I error (alpha) c) A Type II error (Beta) d) Power (1‐Beta)
A Type I error (alpha)
A statistical interval within which, with some confidence level, a specified proportion of a sampled population falls.
Tolerance Interval
Which of the following would introduce Lack‐of‐Fit in a linear regression model? Populations with large variance Extremely precise measurements True relationship between 𝑥 and 𝑦 is non‐linear Measurements with a constant bias
True relationship between 𝑥 and 𝑦 is non‐linear
Which of the following is a consequence of accepting the alternative hypothesis (H1)? a) No action will be taken (Status Quo) b) The conjecture has been supported by the observed statistic c) More data are needed to support the conclusion.
b) The conjecture has been supported by the observed statistic
Which of the following does not impact the width of a confidence interval? a) The sample size b) The point estimate c) The standard deviation d) The choice of confidence coefficient
b) The point estimate
This is what it means to be "95% confident" when we create a 95% confidence interval for the population mean: If we take repeated random samples of the same size 𝑛 from our population and calculate all of the sample means and all of the corresponding 95% confidence intervals, we expect __________ of those intervals will contain the true population mean (μ). a) 5% b) 95% c) Cannot know with any confidence
b) 95%
Statistical significance necessarily implies: a) An insufficient sample size was used in the study b) A suitable sample size was used to detect a difference large enough to be considered unlikely under the null hypothesis c) The statistical difference is meaningful in practice
b) A suitable sample size was used to detect a difference large enough to be considered unlikely under the null hypothesis
One‐tailed vs. two‐tailed test refers to the... a) Null hypothesis b) Alternative hypothesis
b) Alternative hypothesis
If we do not reject the null hypothesis, we... a) Do have evidence to say the alternative hypothesis is true b) Do not have evidence to say the alternative hypothesis is true c) Have proven the null hypothesis is true
b) Do not have evidence to say the alternative hypothesis is true
A report states "there is no significant evidence that the median income has increased over the past year." The implied alternative hypothesis is: a) Not applicable since it involves the median b) Median income has increased c) Median income has not changed d) Insufficient information to decide
b) Median income has increased
When sample size 𝑛 is large, a confidence interval for 𝑝 can be constructed using the __________ distribution. a) Uniform distribution b) Normal distribution c) t‐distribution d) Chi‐Square distribution
b) Normal distribution
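A minimal sketch of the large‐sample interval for 𝑝, using only the Python standard library. The function name, the counts (120 successes out of 400), and the "both counts at least 5" check are assumptions made for illustration, not values from the course.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (normal approximation) confidence interval for a proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Assumed rule of thumb: both observed counts should be at least 5.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("sample too small for the normal approximation")
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(successes=120, n=400)  # hypothetical data
print(round(low, 3), round(high, 3))
```

The interval is centered at p‐hat, and its half‐width is the critical value times the standard error.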
You are interested in comparing sleep habits of elementary school children. Suppose you have a group of eighth graders. You asked them in July to estimate about how many hours of sleep they get per night, on average. You asked them the same question again in November, during the school year. The most suitable inference test is: a) Independent two‐samples t‐test b) Paired t‐test c) Z‐score d) Analysis of Variance e) Chi‐Square test for Independence
b) Paired t‐test
Which test is used to compare the observed data for one categorical variable with two or more groups against a proposed probability distribution? a) Analysis of Variance b) Chi‐squared goodness‐of‐fit test c) Chi‐squared test of independence d) Chi‐squared test of homogeneity
b) Chi‐squared goodness‐of‐fit test
Which of the following is NOT a possible outcome of a hypothesis test? a) Rejecting the null hypothesis b) Not rejecting the null hypothesis c) Accepting the alternate hypothesis d) Proving the null hypothesis is true
d) Proving the null hypothesis is true
When performing an Analysis of Variance (ANOVA) how many treatment means are being evaluated? One Two Three Only More than Two
more than two
When finding the p‐value using a chi‐squared test statistic, the p‐value is always the area to the __________ of the test statistic under the chi‐squared curve. a) Right b) Left
right
Which of the following is an unbiased statistic for the dispersion or spread of a normal distribution? a) Variance b) Standard Deviation c) Mean d) Median
variance
When comparing the ratios of two sample variances
F distribution
What would be the actual significance level (α) if 4 (four) group means were tested using independent pairwise t‐tests, each with a 95% confidence level? a) 0.200 b) 0.226 c) 0.265 d) 0.185 e) 0.050
c) 0.265. (4 choose 2) = 6 pairwise tests, so 1 − (0.95)^6 ≈ 0.265. (The value 0.185 is 1 − (0.95)^4, which counts only 4 tests.)
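The arithmetic can be checked with a short sketch. It assumes the pairwise tests are independent and that each uses α = 0.05; the variable names are illustrative.

```python
from math import comb

groups = 4
alpha_per_test = 0.05  # each test run at a 95% confidence level

n_tests = comb(groups, 2)  # number of distinct pairs of groups
familywise = 1 - (1 - alpha_per_test) ** n_tests

# For comparison only: the error rate if just 4 tests were run.
four_tests = 1 - (1 - alpha_per_test) ** 4

print(n_tests, round(familywise, 3), round(four_tests, 3))  # 6 0.265 0.185
```

With six comparisons, the chance of at least one false rejection is roughly 26%, which is why ANOVA is preferred over repeated t‐tests.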
True or False: Population proportion 𝑝 is a parameter. a) True b) False
true
Let Z be a standard normal random variable. Use the z‐table to find P(Z > ‐1.30). a) 0.0968 b) 0.0668 c) 0.9332 d) 0.9032
0.9032
Using the z‐table find P(Z < 1.41). a) 0.9192 b) 0.9207 c) 0.0808 d) 0.0793
0.9207
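Both z‐table lookups above can be reproduced in software. This sketch uses `statistics.NormalDist` from the Python standard library in place of a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_right = 1 - Z.cdf(-1.30)  # P(Z > -1.30): area to the right
p_left = Z.cdf(1.41)        # P(Z < 1.41): area to the left

print(round(p_right, 4), round(p_left, 4))
```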
What is the standard deviation of the standard normal distribution? a) 0 b) 1 c) Any positive number
1
Suppose X is normally distributed with a mean of 10 and a standard deviation of 2. What is the value of the z‐score when X = 13? a) 13 b) 1.5 c) 2 d) 3
b) 1.5, since z = (x − μ)/σ = (13 − 10)/2
Suppose the population has a normal distribution with a true mean of 100 and a standard deviation of 50. You are taking repeated samples of size 25. What is the expected mean of the sampling distribution?
100. The mean of the sampling distribution equals the population mean, regardless of the standard deviation or sample size.
When you are doing an independent samples t‐test, how many populations are under consideration? a) 0 b) 1 c) 2 d) More than 2
2
Suppose the observed value of y is 20 and the predicted value of y is 18. What is the value of the residual? a) ‐2 b) 2 c) 18 d) 20
b) 2 (residual = observed − predicted = 20 − 18)
Suppose you have a normal distribution with a mean of 10 and a standard deviation of 2. Approximately what percent of the data is between 8 and 12? a) 50% b) 95% c) 68% d) 99%
c) 68%
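The z‐score card and the 68% card both use a normal distribution with mean 10 and standard deviation 2, so one sketch can check them together. The variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=2)

# z-score for the value 13: distance from the mean in standard deviations.
z = (13 - X.mean) / X.stdev

# Probability of falling within one standard deviation of the mean (8 to 12).
within_one_sd = X.cdf(12) - X.cdf(8)

print(z, round(within_one_sd, 4))
```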
Which of the statements regarding Analysis of Variance is false? - ANOVA is an omnibus test which simultaneously compares 3 or more group means. - ANOVA is used in place of multiple t‐tests to avoid falsely accepting the null hypothesis. - Groups are treated as a categorical variable. - The null hypothesis is that all group means are equal. - The alternate hypothesis is that one or more group means are different.
ANOVA is used in place of multiple t‐tests to avoid falsely accepting the null hypothesis
Method to compare 3 or more group means (normally distributed) for a categorical variable.
Analysis of Variance
True or False: The area under any normal curve equals 1.
true
Distribution of a sample variable approximates a normal distribution (i.e., a "bell curve") as the sample size becomes larger, assuming that all samples are identical in size, and regardless of the population's actual distribution shape.
Central Limit Theorem
Which distribution do you use to find critical values for confidence intervals (n = 20) for a variance when the population standard deviation sigma is unknown? The normal distribution The t‐distribution Chi‐Square distribution F‐distribution
Chi‐Square distribution
Which statistic is used to quantify the strength of a linear relationship between two quantitative variables? Slope Intercept Mean Square Error Coefficient of Determination (R2)
Coefficient of Determination (R2)
Using linear regression or ANOVA models on a large database, like the US Historical Census data, would not be expected to be useful for ? Identifying changes in income levels over time Determining the cause of population demographic shifts Identifying linear patterns among continuous variables Identifying changes in mean values for categorical variables
Determining the cause of population demographic shifts
Used to test for equality of variances from two normal populations.
F‐Distribution
Mechanism in statistics used to determine if a particular claim is statistically significant.
Inference Test
Approach for modelling the relationship between a response variable and one or more explanatory variables.
Linear Regression
Critical area of test statistic is only located in the left tail of the probability distribution.
One‐sided confidence Interval
Interested in the difference between two variables for the same subject.
Paired t‐test
An estimate of an interval in which a future observation will fall, with a certain probability, given what has already been observed.
Prediction Interval
True or False: If your p‐value is less than your level of significance alpha, you reject the null hypothesis. a) True b) False
true
Using ANOVA, a set of sample means is more likely to result in rejection of the null hypothesis when... The number of groups means is larger The number of group means is smaller The variability within the groups is smaller The variability within the groups is larger
The variability within the groups is smaller
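A small sketch of why smaller within‐group variability makes rejection more likely. The two data sets are hypothetical: they share the same group means (10, 15, 20) but differ in spread. The helper `f_statistic` is an illustrative name that computes the one‐way ANOVA ratio MS_between / MS_within by hand.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F ratio: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Same group means, different within-group spread (hypothetical data).
tight = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]
loose = [[5, 10, 15], [10, 15, 20], [15, 20, 25]]

f_tight = f_statistic(tight)
f_loose = f_statistic(loose)
print(f_tight, f_loose)
```

The between‐group sum of squares is identical in both cases, so the larger F for the tight data comes entirely from the smaller denominator.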
Critical area of test statistic is shared in both tails of the probability distribution.
Two‐Sided Confidence Interval
Which of the following Confidence Intervals are incorrect? Circle Letter.
a
We can convert a probability about any normal random variable X to a probability about a standard normal random variable Z using ... a) The 68‐95‐99.7 Rule b) A z‐score c) A t‐value d) The binomial distribution
b) A z‐score
Which test is used to determine if a population has a specified theoretical distribution? a) Chi‐squared goodness‐of‐fit test b) Chi‐squared test of independence c) Analysis of Variance d) Linear Regression
a) Chi‐squared goodness‐of‐fit test
When making probability statements related to a population from a SINGLE SAMPLE, which of the following are potentially relevant? a) F‐Distribution b) Central Limit Theorem c) T‐Distribution d) Standard Normal Distribution (z) e) Chi‐Square (χ 2) Distribution
a) F‐Distribution no b) Central Limit Theorem no c) T‐Distribution yes d) Standard Normal Distribution (z) yes e) Chi‐Square (χ 2) Distribution yes
Suppose you have a group of fourth graders and a group of eighth graders, and you want to compare the average amount of sleep per night, in hours, for fourth graders vs. eighth graders. Which is the most appropriate test to use? a) Independent two‐samples t‐test b) Paired t‐test for differences c) Chi‐Square test for independence d) Analysis of Variance
a) Independent two‐samples t‐test
Indicate whether each statistic below is a measure of Centrality for a probability distribution? a) Mode b) Mean c) Standard Deviation d) Median e) Variance
a) Mode yes b) Mean yes c) Standard Deviation no d) Median yes e) Variance no
For all of the examples in this course, the "equal to" part is always in the ____________ hypothesis. a) Null (H0) b) Alternative (H1)
a) Null (H0)
When we do a hypothesis test, we are testing a claim about a ... a) Population parameter b) Sample statistic c) Point estimate from a sample
a) Population parameter
If the p‐value is less than the level of significance (α), we... a) Reject the null hypothesis b) Do not reject the null hypothesis c) Accept the null hypothesis
a) Reject the null hypothesis
Do the Critical Values for the following Probability Distributions depend on Sample Size? a) Standard Normal (Z) Distribution b) t‐Distribution c) Chi‐Square (χ²) Distribution d) F‐Distribution
a) Standard Normal (Z) Distribution: no b) t‐Distribution: yes c) Chi‐Square (χ²) Distribution: yes d) F‐Distribution: yes
When population variances are known, which distribution do we use for finding the critical value when constructing a confidence interval for the population mean (μ)? a) The normal distribution b) The t‐distribution c) The Chi‐Square distribution
a) The normal distribution
A given confidence interval for the mean is centered at ... a) The population mean (μ) b) The sample mean (𝑥̅)
b) The sample mean (𝑥̅)
Suppose we have a population and we take 1,000 random samples from that population each of size 75. For each sample, we calculate the sample mean. This distribution is called ... a) The sampling distribution of sample mean, 𝑥̅ b) The sampling distribution of X c) The sampling distribution of population mean, μ
a) The sampling distribution of sample mean, 𝑥̅
For the chi‐squared test of independence, what is null hypothesis? a) Two categorical variables are independent in the population b) Two categorical variables are not independent in the population c) Two categorical variables are equal
a) Two categorical variables are independent in the population
Indicate the statistical errors represented by shaded areas A and B below.
a. type 2 b. B c. type 1 d. a
Which of the following statements follow from the Central Limit Theorem? a) Sampling Distribution for Variance of Normal RV when n > 100 b) Sampling Distribution for the Mean for ALL Distributions when n is Large c) Sampling Distribution for Mean of a Uniform Distribution for n > 25 d) Probability Distribution for Individual Observations f(x) for n = 1 e) Sampling Distribution for Variance from a Highly Skewed RV
a) yes b) yes c) yes d) no e) no
Match the following numbers on the figure with the correct statement.
a4 b5 c1 d3 e2
Match the numbered sums of squares to the correct regression statistic.
a5 b2 c3 d4 e1
Which sampling distribution does not approximate a Normal Distribution for sample sizes exceeding 1000? a) Sampling distribution for the mean of a Binomial random variable with p = 0.25 b) Sampling distribution for the variance of a Uniform random variable c) Sampling distribution for the mean of a t‐distributed random variable d) Sampling distribution for the variance of a Normal random variable
b) Sampling distribution for the variance of a Uniform random variable
The sample proportion 𝑝̂ (p‐hat) is a ... a) Parameter b) Statistic c) Constant
b) Statistic
In a hypothesis test the resulting p‐value is 0.043. This means that we can find statistical significance at: a) Both the 0.05 and 0.01 levels b) The 0.05 but not at the 0.01 level c) The 0.01 but not at the 0.05 level d) Neither the 0.05 nor the 0.01 level
b) The 0.05 but not at the 0.01 level
Suppose the plot below is a Q‐Q plot of our sample data. What can we conclude? a) The population has a normal distribution b) The population does not have a normal distribution
b) The population does not have a normal distribution
Which distribution do you use to find critical values for confidence intervals (n = 20) of a mean when the population standard deviation sigma is unknown? a) The normal distribution b) The t‐distribution c) Chi‐Square distribution d) F‐distribution
b) The t‐distribution
Suppose you are doing an independent samples t‐test, you do not know the population standard deviations, and you cannot assume the population standard deviations are equal. What do we call this situation? a) The pooled standard deviation case b) The unpooled standard deviation case c) Analysis of Variance (ANOVA) d) The z‐score test
b) The unpooled standard deviation case
A researcher tests the null hypothesis that the percentage of uninsured children has not changed, using α = 0.01. H1: percentage has changed. The distribution is normally approximated. She finds Z = ‐3.44. She should conclude that: a) There is sufficient evidence that the percentage has increased b) There is sufficient evidence that the percentage has changed c) There is sufficient evidence that the percentage has decreased
b) There is sufficient evidence that the percentage has changed
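A quick check of this decision, sketched with the standard library. It takes 0.01 as the significance level and treats the test as two‐sided because H1 says "has changed"; the variable names are illustrative.

```python
from statistics import NormalDist

z_obs = -3.44
alpha = 0.01  # significance level assumed from the question

# Two-sided p-value: area in both tails beyond |z_obs|.
p_two_sided = 2 * NormalDist().cdf(-abs(z_obs))
reject_null = p_two_sided < alpha

print(round(p_two_sided, 5), reject_null)
```

The p‐value is far below 0.01, so the null hypothesis is rejected in favor of "has changed". The two‐sided alternative does not itself state a direction.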
When do we use the t‐distribution? a) When the population variance or standard deviation is known b) When the population variance or standard deviation is unknown c) When the sample size is greater than 120
b) When the population variance or standard deviation is unknown
If two random variables, 𝑋 and 𝑌 are not independent and negatively correlated, the variance of their sum... a) Would be greater than Var(𝑋) + Var(𝑌) b) Would be less than Var(𝑋) + Var(𝑌) c) Would equal Var(𝑋) + Var(𝑌)
b) Would be less than Var(𝑋) + Var(𝑌)
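The reason is the identity Var(X + Y) = Var(X) + Var(Y) + 2·Cov(X, Y): a negative covariance pulls the total down. The sketch below checks this on a tiny hypothetical data set, treating the four pairs as a whole population.

```python
from statistics import pvariance

x = [1, 2, 3, 4]
y = [8, 6, 5, 1]  # hypothetical values that fall as x rises
total = [a + b for a, b in zip(x, y)]

var_x = pvariance(x)
var_y = pvariance(y)
var_sum = pvariance(total)

# Population covariance computed directly from its definition.
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

print(var_x + var_y, var_sum, cov_xy)
```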
What is the null hypothesis for Analysis of Variance when comparing categorical group means? a. All the group means are equal to zero b. All the group means are the same c. All the group means are different from each other d. The slope equals zero
b. All the group means are the same
In the context of linear regression, extrapolation is ... a. Safely used to predict values of 𝑌 for values of 𝑋 that were part of the regression model b. Not recommended to predict values of 𝑌 for values of 𝑋 that were not part of the regression model
b. Not recommended to predict values of 𝑌 for values of 𝑋 that were not part of the regression model
When we want to do a hypothesis test about a proportion 𝑝 and have a small sample, which distribution do we need to use? a) Binomial b) Normal c) t distribution d) Chi‐Square test for proportion
a) Binomial
Which is the least precise (larger width of Confidence Interval) for a data set with n = 10, Mean = 25 and standard deviation = 5. Assume a normal distribution. a) A 99% CI for the sample mean. b) A 90% CI for the sample mean. c) A 99% CI for the sample variance. d) A 90% CI for the sample variance.
c) A 99% CI for the sample variance.
Which of the following probability distributions describes a sum of squares of a random normal variable? a) Normal Distribution b) t‐Distribution c) Chi‐Square Distribution d) F‐Distribution
c) Chi‐Square Distribution
The level of significance (alpha) or Type I Error is selected by the investigator on the basis of: a) Power of the test b) Sample size c) Expected losses from Type I error d) Expected losses from Type II error
c) Expected losses from Type I error
A confidence interval is a range of reasonable values for a ... a) Random variable b) Constant c) Population parameter d) Sample statistic
c) Population parameter
If the null hypothesis is actually false, which of these statements characterizes a situation where the value of the test statistic falls in the rejection region? a) A Type I Error has occurred b) A Type II Error has occurred c) The correct decision of rejecting the null hypothesis will be made d) Insufficient information has been given to make a decision
c) The correct decision of rejecting the null hypothesis will be made
Based on the discussion in Class, what is the alternative hypothesis (H1) in the legal court system? a) The defendant is found innocent b) The defendant is found not guilty c) The defendant is found guilty
c) The defendant is found guilty
When we don't know if the population has a normal distribution, the Central Limit Theorem says that the sampling distribution is approximately normally distributed when ... a) The standard deviation is small b) The population is sufficiently large c) The sample size is sufficiently large
c) The sample size is sufficiently large
Which of the following distributions has a mean of zero and its shape depends on degrees of freedom. a) The F‐distribution b) The Chi‐Square distribution c) The t‐distribution d) The normal distribution
c) The t‐distribution
When constructing a confidence interval for population proportion 𝑝, what does "large 𝑛" indicate? a) Sample size 𝑛 is greater than or equal to 30 b) The sampling distribution for 𝑝 follows a t‐distribution c) 𝑛 times 𝑝̂ (p‐hat) is ≥ 5 and 𝑛 times 𝑞̂ (q‐hat) is ≥ 5
c) 𝑛 times 𝑝̂ (p‐hat) is ≥ 5 and 𝑛 times 𝑞̂ (q‐hat) is ≥ 5
For sample variance or standard deviation
chi square distribution
Which of the following statements is not true because of the Central Limit Theorem: a) The standard deviation of the sampling distribution for X‐Bar is reduced by 1/√𝑛 b) The sampling distribution for X‐Bar becomes normally distributed as the sample size is increased c) The sampling distribution of X‐Bar is normally distributed regardless of sample size when the parent population is normally distributed. d) When the population of interest is not normally distributed taking a large sample from it will not change the underlying distribution of the individual observations.
d) When the population of interest is not normally distributed taking a large sample from it will not change the underlying distribution of the individual observations.
When we want to do a hypothesis test about a proportion 𝑝 and have a large sample, which distribution do we use? a) Uniform distribution b) Chi‐Square test for proportion c) t‐distribution d) Normal distribution only e) Either the normal or binomial distributions
e) Either the normal or binomial distributions
For a single sample t‐test against a specified reference mean, the degrees of freedom are used to compute the sample mean. a) True b) False
false
If every observation is multiplied by 2, then the t‐statistic is multiplied by 2. a) True b) False
false
True or False: Outliers do not bias the least‐squares regression line. a) True b) False
false
True or False: The Chi‐Square distribution is symmetric. a) True b) False
false
True or False: The t‐distribution is symmetric with less area in the tail regions compared to the Standard Normal distribution when 𝑛 < 120. a) True b) False
false
True or False: We can use Analysis of Variance to compare 3 or more group proportions where each group has 𝑛 = 5. a) True b) False
false
Identify each of the following components of a Confidence Interval.
First: point estimate (𝑥̅1 − 𝑥̅2). Second: ± z(α/2), the confidence coefficient. Last: standard error (the square‐root term).
If we increase the confidence level, say from 90% to 99% confidence, the width of the confidence interval will: a) Remain unchanged b) Increase c) Decrease
increase
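To see the effect of the confidence level on width, this sketch compares 90% and 99% z‐intervals for a mean. The values σ = 5 and n = 25 are hypothetical, and `ci_width` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Full width of a two-sided z-interval for the mean (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

w90 = ci_width(sigma=5, n=25, confidence=0.90)
w99 = ci_width(sigma=5, n=25, confidence=0.99)

print(round(w90, 2), round(w99, 2))
```

Only the critical value changes between the two calls (about 1.645 versus 2.576), so the 99% interval is wider.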
In testing the null hypothesis that p = 0.3 against the alternative that p ≠ 0.3, the probability of a Type II Error is ______ when the true 𝑝 = 0.4 than when 𝑝 = 0.6.
larger
Population proportion 𝑝 is a .... a) Random Variable b) Parameter c) Statistic d) Constant
parameter
Which best describes the shape of the uniform distribution? a) Bell‐shaped b) Rectangular c) Bimodal
rectangular
For sample average when variance or standard deviation are known
standard normal distribution (Z)
The Central Limit Theorem uses this distribution for the average when the standard deviation is known
standard normal distribution (Z)
For sample average when variance or standard deviation are unknown
t distribution
True or False: Just like the sample mean, the sample variance takes on different values depending on the particular sample you take, and, therefore, the sample variance has a sampling distribution. a) True b) False
true
True or False: P(X > x) is the same as P(Z > z) when z is the corresponding z‐score.
true
TRUE or FALSE. (Circle One) Both Regression and ANOVA are the statistical models which are used to predict the continuous outcome but in case of the regression, continuous outcome is predicted on basis of the one or more than one continuous predictor variables whereas in case of ANOVA continuous outcome is predicted on basis of the one or more than one categorical predictor variables.
true
The variance (σ²) of a binomial random variable (the count of successes) is 𝑛𝑝𝑞 or 𝑛𝑝(1‐𝑝) a) True b) False
true
True or False. A least squares regression line always passes through the means of the Y and X values. a) True b) False
true
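This property follows from the intercept formula b0 = ȳ − b1·x̄. The sketch below fits a line by hand to hypothetical data and confirms that the prediction at x̄ equals ȳ.

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical observations

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Least-squares slope and intercept from the usual sums of squares.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

predicted_at_mean = b0 + b1 * x_bar
print(round(b1, 2), round(b0, 2), round(predicted_at_mean, 2), round(y_bar, 2))
```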
True or False. Analysis of Variance is a method that is used to identify differences among more than 2 group means. a) True b) False
true
True or False. Monte Carlo simulation is a computer intensive technique that can be used to study the behavior of any sampling distribution for selected sample sizes and is a useful tool to demonstrate the Central Limit Theorem. a) True b) False
true
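A minimal Monte Carlo sketch of the idea, assuming a Uniform(0, 1) parent population, samples of size 30, and 2,000 repetitions. The seed and function name are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

random.seed(215)  # arbitrary seed so the run is repeatable

def sample_means(n, reps=2000):
    """Means of `reps` random samples of size n from Uniform(0, 1)."""
    return [mean([random.random() for _ in range(n)]) for _ in range(reps)]

means_n30 = sample_means(30)

center = mean(means_n30)   # should be near the population mean, 0.5
spread = stdev(means_n30)  # should be near sigma / sqrt(n)
theory = (1 / 12) ** 0.5 / 30 ** 0.5  # Uniform(0, 1) has variance 1/12

# Share of sample means within one theoretical standard error of 0.5;
# a normal sampling distribution would put roughly 68% there.
share = sum(abs(m - 0.5) <= theory for m in means_n30) / len(means_n30)

print(round(center, 3), round(spread, 4), round(theory, 4), round(share, 3))
```

Even though individual observations are rectangular, the simulated sample means cluster around 0.5 with a spread close to σ/√𝑛.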
True or False. The t‐distribution essentially converges to the Normal distribution for sample sizes of 120. The tail regions of the t‐distribution converge more slowly with increasing 𝑛 relative to center of the distribution. a) True b) False
true
True or False. The variance of a constant is zero. a) True b) False
true
True or False: A 99% confidence interval has a larger margin of error than the corresponding 95% confidence interval. a) True b) False
true
True or False: A normal distribution is symmetric about a vertical axis through its mean. a) True b) False
true
True or False: If the population is known to have a normal distribution, then the sampling distribution must also be normal. a) True b) False
true
True or False: If your alternative hypothesis has "greater than" in it, then the area under the curve greater (to the right) than your test statistic is equal to the p‐value. a) True b) False
true
True or False: If your data are matched pairs, an independent samples t‐test would be invalid. a) True b) False
true
Which of the following equations best describes linear regression?
y = b0 + b1·x + e
Which of the following equations best describes ANOVA?
y_ij = μ + τ_i + ε_ij