Statistics Review
In a table that summarizes the results of an analysis of variance, the word "Between" might be replaced by
"Treatment."
When the rejection of a true null hypothesis has horrendous consequences, use a level of significance equal to
.001
A researcher would prefer to report an exact p-value of
.005
In real-life applications, unless there are obvious reasons for selecting either a larger or smaller level of significance, select a level of significance equal to
.05
When consulting power curves, the default value for the level of significance is .05, while the default value for power is
.80
Which one of the following numbers couldn't possibly be a value of r?
any number outside the range -1.00 to +1.00 (0.00 itself is a legitimate value of r)
Given the following incomplete ANOVA summary table,

SOURCE         SS     df    MS     F
_____________________________________
Column         100     2    50   0.50
Row             60     1    60   0.60
Interaction     80
Within        2400    24   100
Total         2640

the value of F for the above interaction equals
0.40
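The missing entries in the table above can be filled in with a few lines of arithmetic. The sketch below is a minimal illustration, assuming the usual two-factor relationships (interaction df equals column df times row df, each MS equals SS divided by df, and each F equals MS divided by MS within). The variable names are illustrative and do not come from the source.

```python
# Values copied from the ANOVA summary table; names are illustrative.
ss_interaction = 80
df_column, df_row = 2, 1
ss_within, df_within = 2400, 24

# Interaction df is the product of the column and row df.
df_interaction = df_column * df_row

# Each mean square is a sum of squares divided by its df.
ms_interaction = ss_interaction / df_interaction
ms_within = ss_within / df_within

# F for the interaction uses MS within as its denominator.
f_interaction = ms_interaction / ms_within

print(df_interaction, ms_interaction, f_interaction)
```

The resulting F of 0.40 is well below 1, which is consistent with retaining the null hypothesis for the interaction.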
If a least squares equation predicts annual income from years of education, as a college student you would prefer that the value of b, the slope of the equation, equal
1.6
In an experiment involving four different groups, each consisting of 5 subjects, the degrees of freedom for between groups equals
3
Given Y' = .005(X) + .40 for predicting college GPA from SAT critical reading scores, the predicted GPA for a student with an SAT score of 600 would be
3.40
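The prediction above is a direct substitution into the least squares equation. A minimal sketch, assuming the slope (.005) and intercept (.40) given in the card; the function name `predict_gpa` is hypothetical.

```python
def predict_gpa(sat_score, slope=0.005, intercept=0.40):
    """Predicted GPA from Y' = slope * X + intercept."""
    return slope * sat_score + intercept

# Predicted GPA for an SAT critical reading score of 600.
predicted = predict_gpa(600)
print(round(predicted, 2))
```

Because the slope is positive, each additional SAT point raises the predicted GPA by .005.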
Indicate which one of the following sets of observations --each having a mean of 50--has the larger standard deviation. (Calculations aren't necessary to answer this question.)
40, 40, 50, 60, 60
Which one of the following does not illustrate two related samples?
College freshmen are split into two groups.
When using a scatterplot, the customary direction of prediction is from
X to Y.
To describe the average daily load of clients for psychotherapists working at a mental health clinic, a spokesperson reports a mean load of 5.2 clients and a median load of 7.5 clients. This implies that, among the staff of psychotherapists,
a few carry relatively light loads.
If a survey of grade-school children reveals that the distribution of daily TV-viewing times has a mean of 2.3 hours and a standard deviation of 1.0 hours, this implies that
a majority of children watch TV between 1.3 and 3.3 hours.
A dot cluster that tilts from upper left to lower right reflects
a negative relationship.
Power refers to the probability of detecting
a particular effect.
The importance of a statistically significant result might be clarified with
a point estimate based on the observed difference between means, a confidence interval for the difference between population means, and Cohen's standardized effect estimate, d.
A decision rule specifies precisely when the null hypothesis should be rejected because the observed z qualifies as
a rare outcome.
Two samples are related if each observation in one sample is paired with
a single observation in the other sample.
You're told that, for a large group of students, an r of .23 describes the relationship between anxiety score and time to complete a problem‑solving task. This implies that there probably is
a slight tendency for more anxious students to be slower problem solvers.
Metal detectors at airports are used to determine whether passengers are carrying weapons. If the null hypothesis states that a passenger isn't carrying a weapon, a type II error would occur whenever
a weapon-carrying passenger passes the detector without activating the alarm.
Metal detectors at airports are used to determine whether passengers are carrying weapons. If the null hypothesis states that a passenger isn't carrying a weapon, a type I error would occur whenever
a weapon-free passenger passes the detector and activates the alarm.
An F test of the null hypothesis is based on the notion that if the null hypothesis is true, the numerator of the F ratio tends to be
about the same as its denominator.
Unlike the mode or median, the mean is affected by the arithmetic value of
all observations.
No useful measure of variability is produced by taking the mean of all deviations (both positive and negative) about the mean for the original observations because the resulting measure
always equals zero.
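A short check of why the mean deviation is useless as a measure of variability, and why squaring helps. This sketch reuses the set 40, 40, 50, 60, 60 (mean of 50) that appears earlier in this review; the variable names are illustrative.

```python
scores = [40, 40, 50, 60, 60]
mean = sum(scores) / len(scores)

# Deviations about the mean: positive and negative values cancel.
deviations = [x - mean for x in scores]
mean_deviation = sum(deviations) / len(deviations)

# Squaring removes the negative signs, so the sum no longer cancels.
sum_of_squares = sum(d ** 2 for d in deviations)

print(mean_deviation, sum_of_squares)
```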
Use Cohen's d to estimate the effect size of
any significant difference between pairs of means.
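A minimal sketch of Cohen's d for two independent groups, assuming the common form in which the mean difference is divided by a pooled standard deviation. The scores and the function name `cohens_d` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import statistics
from math import sqrt

def cohens_d(group_a, group_b):
    """Standardized mean difference using a pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # n - 1 in the denominator
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_diff / sqrt(pooled_var)

# Hypothetical scores: means of 14 and 11, each with a variance of 4.
d = cohens_d([12, 14, 16], [9, 11, 13])
print(d)
```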
As variability within groups increases, the one observed mean difference is less
apparent, stable, and likely to be viewed as real.
A treatment effect exists if differences exist between
at least one pair of population means.
Rejection of the overall null hypothesis indicates that
at least one population mean differs from all others.
In analysis of variance, observed mean differences appear, somewhat disguised, as variability
between groups.
A rare outcome
cannot be readily attributed to variability and leads to the rejection of the null hypothesis.
A student concentrates on raising his SAT score in order to raise his college GPA. This strategy is questionable because it assumes that the relationship between SAT scores and college GPAs is
cause‑effect.
Critical z scores separate
common and rare outcomes.
If the null hypothesis is really false and we reject the hypothesis, we have made a
correct decision.
Having made a decision to either retain or reject the null hypothesis, we don't know definitely whether this decision is
correct or incorrect.
Retention of the null hypothesis implies that the null hypothesis
could be true.
The dream times for each of twenty volunteers are measured during two sleep periods: once after an evening when no alcohol was consumed and once after an evening when alcohol was consumed. It would be most important that subjects experience the two conditions in
counterbalanced order and on nonconsecutive nights (after any alcohol effect disappears).
When subjects perform double duty in both conditions of an experiment, the order in which conditions are experienced should be
counterbalanced.
Tukey's HSD controls the
cumulative probability of a type I error.
Given Y' = .005(X) + .40 for predicting college GPA from SAT critical reading scores, the coefficient of X (.005) indicates that the correlation between college GPAs and SAT I scores is
positive.
When the range of possible X and Y scores is restricted, the value of the correlation coefficient usually
decreases.
The level of significance indicates the
degree of rarity required to reject the null hypothesis.
The squared curvilinear correlation indicates the proportion of variance in
the dependent variable attributable to the independent variable.
Scatterplots
depict relationships as dot clusters, provide a preview of the fully measured relationship, and show any pronounced curvilinearity.
If sample size is excessively large, an effect usually will be
detected even though it lacks importance.
Reject the null hypothesis if the observed z
deviates too far into the tails of the sampling distribution.
When constructing a graph for a possible interaction, each dot in the graph should be based on the mean (or total) for
each group of subjects that receives the same combination of treatments.
A partial squared curvilinear correlation coefficient can be used to estimate the size of the effect associated with
each significant F.
When calculating the sum of squares with the definition formula, each deviation is squared in order to
eliminate negative signs from deviation scores.
If the sum of squares for between groups equals 50 and that for within groups equals 70, the sum of squares for total variability must
equal 120.
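The additivity of sums of squares can be verified on a small data set. The sketch below uses three hypothetical groups of three scores each (not from the source) and computes the between, within, and total sums of squares from their definitions, followed by the variance estimates and the F ratio.

```python
# Hypothetical scores for three groups; chosen for easy arithmetic.
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# Total: every score's squared deviation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Between: squared deviation of each group mean, weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within: every score's squared deviation from its own group mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# Variance estimates are sums of squares divided by degrees of freedom.
df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(ss_between, ss_within, ss_total, f_ratio)
```

Here the between and within sums of squares (42 and 6) add up to the total (48), just as 50 and 70 must add up to 120 in the card above.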
The t test for two independent samples assumes that both underlying populations are normally distributed with equal variances. You needn't be too concerned about violations of these assumptions as long as both sample sizes are
equal and fairly large.
In two-factor ANOVA, it's important that the size of all groups be
equal.
If an investigator reports that factor A has an inconsistent effect for the various values of factor B, this implies that
factor B also has an inconsistent effect for the various values of factor A.
You needn't be too concerned about violations of assumptions for F tests in a two-factor ANOVA, particularly if all group sizes are equal and each is
fairly large.
It would be unwise to completely eliminate the rejection regions because then a
false hypothesis never would be rejected.
Assume that r equals -.43 for the relationship between years of heavy smoking and life expectancy. This signifies that
heavy smokers tend to have shorter life expectancies.
The occurrence of an interaction often
highlights pertinent issues for future research.
A sample mean qualifies as a common outcome if the difference between its value and that of the
hypothesized population mean is small enough to be viewed as merely another random outcome.
A predicted college GPA is unlikely to coincide with a student's true GPA because the relationship between SAT score and college GPAs is
imperfect.
The null hypothesis presumes that any observed difference between sample means
is due to variability.
The probability of a type I error equals the level of significance, given that the null hypothesis
is true.
Insofar as regression toward the mean occurs, a student who made the lowest score on the first statistics exam would be expected to score
less poorly on the second exam.
Given the following data for the various mean rehabilitation scores earned by prison inmates,

                        NO. OF CELLMATES
                        0      1      2     Row Mean
RECREATION       0      0     60     60        40
PRIVILEGES       2     60     60     60        60
                 4    120     60     60        80
_____________________________________________________
Column Mean            60     60     60     Grand Mean = 60

the three row means (40, 60, and 80) reflect the
main effect of privileges.
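The row, column, and grand means in the table above can be recomputed from the nine cell means. A minimal sketch, assuming equal group sizes so that simple averages of cell means are appropriate; the variable names are illustrative.

```python
# Cell means: rows are recreation privileges (0, 2, 4),
# columns are number of cellmates (0, 1, 2).
cells = [
    [0, 60, 60],
    [60, 60, 60],
    [120, 60, 60],
]

# Row means vary, which reflects the main effect of privileges.
row_means = [sum(row) / len(row) for row in cells]

# Column means are identical, so there is no main effect of cellmates.
column_means = [sum(col) / len(col) for col in zip(*cells)]

grand_mean = sum(sum(row) for row in cells) / 9

print(row_means, column_means, grand_mean)
```

The effect of privileges appears only in the first column (0 cellmates), so these cell means also illustrate an interaction: the effect of one factor is not consistent across the values of the other.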
Which average -- the mode, median, or mean -- probably serves as the basis for the following statement? As of March 2009, the average American's share of the national debt equals approximately $35,000.
mean
Type I errors are sometimes called
false alarms or wild goose chases.
Type II errors are sometimes referred to as
misses.
Given that the relationship between GPA and IQ is stronger in high school than in college, the dot cluster for high school (compared to that for college) should
more closely approximate a straight line.
As variability within groups decreases, differences between group means become
more detectable.
Use analysis of variance rather than a t test whenever the null hypothesis makes a claim about
more than two population means.
Given an observed difference between a sample mean of 42 and a hypothesized population mean of 50, you
must determine whether this observed difference can reasonably be attributed to chance.
When estimating the population standard deviation, always use the version of the sample standard deviation where
n - 1 appears in the denominator.
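Python's standard library offers both versions of the standard deviation, which makes the distinction easy to see. The sketch below reuses the set 40, 40, 50, 60, 60 from earlier in this review; `statistics.stdev` divides the sum of squares by n - 1, while `statistics.pstdev` divides by n.

```python
import statistics

scores = [40, 40, 50, 60, 60]

# Sample version (n - 1): used to estimate the population value.
sample_sd = statistics.stdev(scores)

# Population version (n): describes only the scores at hand.
population_sd = statistics.pstdev(scores)

print(sample_sd, population_sd)
```

The n - 1 version is always the larger of the two, which offsets the tendency of a sample to underestimate variability in the population.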
Given the following incomplete ANOVA summary table,

SOURCE         SS     df    MS     F
_____________________________________
Column         100     2    50   0.50
Row             60     1    60   0.60
Interaction     80
Within        2400    24   100
Total         2640

it would be reasonable to conclude that for the null hypotheses corresponding to the two listed F values,
neither should be rejected.
Prior to taking a written test of self-esteem (scored from a low of 0 to a high of 100), shy volunteers are randomly assigned to participate in weekend workshops dealing with either assertive behavior or group recreation. After analyzing their subsequent performance on the test of self-esteem, the investigator reports a 95 percent confidence interval of -5 to 14 points, tilted in favor of the group that had the workshop on assertive behavior. The boundaries for the 95 percent confidence interval (-5 to 14) require the conclusion that there is
no consistent effect.
In an experiment to determine whether TV cartoons produce more aggressive behavior in grade school children, the null hypothesis states that TV cartoons have
no effect on aggressive behavior.
If there is a correlation between the age of schoolchildren and their reading comprehension, you could speculate about the basis for this relationship. Probably the most reasonable speculation is that
none of the above
It would be appropriate to report the median for a distribution of
none of the above (no middle if no sequence)
Given that the null hypothesis for interaction has been rejected, the corresponding graph of this interaction should contain two or more _______________________ lines.
nonparallel.
Interaction occurs whenever the effects of one factor are
not consistent for all values of the second factor.
In two-factor ANOVA, an F ratio is calculated for each different
null hypothesis.
Smaller p-values tend to discredit the
null hypothesis.
The mode reflects the value
of the most frequently occurring observation.
Given a concern only that meditation improves college grade point averages, the alternative hypothesis should be a
one-tailed test with the upper tail critical.
Regardless of whether the null hypothesis is true or false, variability within groups reflects
only random error.
If a sample correlation coefficient qualifies as a rare outcome under the null hypothesis, then you can conclude that the
population correlation coefficient differs from zero.
Dots in scatterplots that deviate conspicuously from the main dot cluster are viewed as
potential outliers.
Statistical significance indicates that the null hypothesis is
probably false.
Rejection of the null hypothesis implies that the null hypothesis
probably is false.
In both one- and two-factor ANOVA, F ratios always consist of a numerator that reflects
random error and, if present, a treatment effect.
In both one- and two-factor ANOVA, F ratios always consist of a denominator that reflects
random error.
If the observed F equals or exceeds the critical F, the experimental outcome is
rare, and the null hypothesis is rejected.
If a published report of an F test specified that p < .01, you could conclude that the test result is
rare, supporting the research hypothesis.
The regression fallacy is committed whenever regression toward the mean is interpreted as an effect that is
real.
Given critical z values of ±1.96 and an observed z value of -2.40, the appropriate decision is to
reject the null hypothesis.
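The decision rule for a two-tailed z test can be written as a single comparison. A minimal sketch, assuming critical values of ±1.96; the function name `decide` is hypothetical.

```python
def decide(observed_z, critical_z=1.96):
    """Two-tailed decision rule: reject if z falls in either tail."""
    if abs(observed_z) >= critical_z:
        return "reject the null hypothesis"
    return "retain the null hypothesis"

print(decide(-2.40))
print(decide(1.20))
```

An observed z of -2.40 lies beyond -1.96, so it qualifies as a rare outcome, while a z of 1.20 would be a common outcome.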
There is a negative correlation between years of heavy smoking and life expectancy. Therefore, for someone who has smoked heavily for many years, you would predict a life expectancy that is
relatively short.
Which one of the following does not describe the median?
represents the value midway between the largest and smallest observations
If the null hypothesis is seriously false, there is a high probability that this hypothesis will be
rejected.
If it's important to detect even a relatively small effect, increase the
sample size.
You needn't be too concerned about violating the assumptions of the F test so long as
sample sizes are equal and larger than 10.
To determine whether it qualifies as a common or rare outcome under the null hypothesis, the one observed difference between sample means is viewed as originating from the
sampling distribution for the differences between sample means.
Compared to the standard deviation for the distribution of salaries for all new college graduates, the standard deviation for the distribution of salaries for all new electrical engineers should be
smaller.
Compared to the standard error for two independent samples, that for two related samples will tend to be
smaller.
A sample mean qualifies as a rare outcome if it appears to emerge from the
sparse concentration of possible sample means in either tail of the sampling distribution.
Unlike a p-value, a level of significance is a degree of rarity
specified before the test result has been observed.
Which one of the following pairs fails to reflect the relationship between a sample and its population, respectively?
students in college; students in class
The t test for two related samples should be used if
subjects are matched or subjects are measured twice.
Variability between groups is based on the variation among the scores of
subjects treated differently.
In analysis of variance, variance estimates consist of
sum of squares divided by degrees of freedom.
Once pairs of observations have been converted to difference scores, the t test for two related samples can be viewed as a
t test for a single sample.
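The equivalence above can be sketched in a few lines: convert each pair to a difference score, then run the single-sample t computation on the differences against a hypothesized mean difference of zero. The paired scores below are hypothetical, and the variable names are illustrative.

```python
import statistics
from math import sqrt

# Hypothetical paired scores for five subjects measured twice.
first_condition = [10, 12, 9, 11, 13]
second_condition = [12, 13, 9, 14, 15]

# Step 1: one difference score per pair.
differences = [b - a for a, b in zip(first_condition, second_condition)]

# Step 2: single-sample t on the differences (null: mean difference = 0).
n = len(differences)
mean_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences) / sqrt(n)
t = (mean_difference - 0) / standard_error

# Degrees of freedom depend on the number of pairs, not of scores.
df = n - 1

print(differences, round(t, 2), df)
```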
Grade-school children are randomly assigned to reading groups of either four or eight children throughout the school year. At the end of the school term, scores are obtained for each child on a standardized reading achievement test. The appropriate test (to evaluate any difference between reading group size) would be a
t test for two independent samples.
A medical researcher compares the blood pressure readings for a group of high-risk patients both before and after the administration of a new tension-reducing drug. The appropriate test would be a
t test for two related samples.
If an investigator reports that main effects exist for both factors, this implies
that an interaction might or might not be present.
The value of the mode identifies
that which is "fashionable."
Rejection of the overall null hypothesis usually raises some additional questions regarding
the estimated size of the overall effect and which differences between population means cause rejection of the overall null hypothesis.
Among frequency distributions for physical stamina scores, the greatest variability probably would occur in the distribution for
the general population.
A modification of Tukey's test can be used to pinpoint important differences between pairs of column or row means, given that (1) the corresponding null hypotheses have been rejected and (2) interpretations aren't compromised by inconsistencies associated with
the interaction.
The dream times were measured for each of twenty volunteers during two sleep periods: once after an evening when no alcohol was consumed and once after an evening when alcohol was consumed. To analyze the test results, use a t test for two related samples because
the same subject is measured twice.
Published reports often include parenthetical statements that summarize
the statistical analysis, including a p-value and an estimate of effect size.
Prior to taking a written test of self-esteem (scored from a low of 0 to a high of 100), shy volunteers are randomly assigned to participate in weekend workshops dealing with either assertive behavior or group recreation. After analyzing their subsequent performance on the test of self-esteem, the investigator reports a 95 percent confidence interval of -5 to 14 points, tilted in favor of the group that had the workshop on assertive behavior. The boundaries for the 95 percent confidence interval (-5 to 14) indicate that
the true population mean difference is probably between -5 and 14 points.
The dream times for each of twenty volunteers are measured during two sleep periods: once after an evening when no alcohol was consumed and once after an evening when alcohol was consumed. It would be appropriate to test the null hypothesis with a
two-tailed test at the .05 level of significance.
When all possible differences between pairs of population means are evaluated not with an F test, but with a series of regular t tests, the probability of at least one
type I error is larger than the specified level of significance.
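A rough illustration of how the probability of at least one type I error grows with the number of t tests. The sketch assumes, for simplicity, that the tests are independent, which is not strictly true for t tests that share groups, so the result should be read as an approximation. The function name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(num_groups, alpha=0.05):
    """Approximate probability of at least one type I error
    across all pairwise tests, assuming independent tests."""
    num_tests = comb(num_groups, 2)  # number of possible pairs
    return 1 - (1 - alpha) ** num_tests

# With four groups there are six pairwise comparisons.
print(round(familywise_error(4), 3))
```

With four groups, the approximate probability is already several times the .05 level specified for each individual test.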
In a study with 30 grade-school children per group, students in the experimental reading program do 10 points better, on the average, than do students in the regular program. Therefore you can conclude that this difference is
undecipherable without additional information.
The null hypothesis
usually asserts that nothing special is happening with respect to some population characteristic.
If a treatment effect exists,
variability between groups will tend to exceed variability within groups.
In a scatterplot, predictive errors are associated with
vertical discrepancies between dots and the regression line.
Measures of variability for qualitative data are
virtually nonexistent.
If the level of significance equals .05 and the null hypothesis is true, a correct decision will occur
with probability .95.
Which one of the following most likely describes a positive relationship?
years of education and lifetime earnings
If the distribution of ages for college students has a mean of 23.74 and a standard deviation of 3.19, the latter number (3.19) is expressed in units of
years.
The appropriate decision rule for a one-tailed test, lower tail critical, at the .05 level of significance, is to reject if
z ≤ -1.65
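The critical value above can be checked against the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05

# z score that cuts off the lowest 5 percent of the standard normal curve.
critical_z = NormalDist().inv_cdf(alpha)

print(critical_z)
```

The computed value is close to -1.645, which textbook tables commonly round to -1.65.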
The symbol for the population mean is
μ
