stat online (all those practice quizzes)
The best estimate of the standard deviation of the men's weights displayed in this dotplot is
25 pounds
In the dataset represented by the following stemplot, how many times does the number "28" occur? Leaf unit = 1.0.
3
The first step in analyzing numeric bivariate data is to
Create a scatterplot
Which of the following are measures of the center of a distribution (circle all that apply): I. Mean II. Variance III. Standard deviation IV. Median
I and IV
Consider the following data: 43 54 55 63 67 68 69 77 85. Suppose that the last data point is actually 115 instead of 85. What effect would this new maximum have on our value for the mean of the dataset?
Increase the value of the mean.
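The arithmetic behind this answer can be checked with a minimal Python sketch. It uses the nine values from the question; the variable names are illustrative only.

```python
from statistics import mean

original = [43, 54, 55, 63, 67, 68, 69, 77, 85]
# Same data with the maximum changed from 85 to 115.
modified = original[:-1] + [115]

mean_original = mean(original)   # 581 / 9
mean_modified = mean(modified)   # 611 / 9

print(round(mean_original, 2), round(mean_modified, 2))
```

Raising one value by 30 raises the sum by 30, so the mean of nine values rises by 30 / 9.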
Which measure of spread is resistant to extreme values?
Interquartile range
To model a relationship with a regression line, a number of conditions need to be checked and met. Which of the following need not be checked?
Nearly Normal condition
Look at the scatterplot below. Choose which description BEST fits the plot.
No relationship
Which one of the following is a FALSE statement about density curves?
Total area under the curve depends on the shape of the curve.
Variables that are numbers are always quantitative.
false
When comparing two groups, use different scales, if necessary, for clarity and sizing.
false
Which variable is categorical?
gender
A hidden variable that stands behind a relationship and determines it by simultaneously affecting the other two variables is called a ______ variable.
lurking
A regression analysis of students' college grade point averages (GPA) and their high school GPAs found r² = 0.311. Which of the following is true? I. High school GPA accounts for 31.1% of college GPA. II. 31.1% of college GPAs can be correctly predicted with this model. III. 31.1% of the variance in high school GPA can be accounted for by the model.
none
What is the approximate interquartile range of the Male Wrist Girth dataset shown below?
none of these
Which of the following best gives a quick impression of how a whole group is partitioned into smaller groups?
pie chart
All the cases we wish we knew about is the:
population
The side-by-side boxplots below show cumulative GPAs for sophomores, juniors and seniors taking intro stats course in Autumn 2003. Which class had the highest median GPA?
sophomore
Do the following show Simpson's paradox?
yes
The correlation between X and Y is r = 0.35. If we double each X value, decrease each Y by 0.20, and interchange the variables (put X on the Y-axis and vice versa), the new correlation is
0.35
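A minimal Python sketch of why the answer stays the same: correlation is unchanged by multiplying a variable by a positive constant, by adding a constant, or by swapping the two variables. The data below are hypothetical, invented only for illustration (their correlation is not 0.35), and `pearson_r` is a helper written for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 1.9, 3.5, 2.8, 4.9, 3.7]

r_before = pearson_r(x, y)

x_doubled = [2 * v for v in x]        # double each X value
y_shifted = [v - 0.20 for v in y]     # decrease each Y by 0.20
# Interchange the roles of the two variables.
r_after = pearson_r(y_shifted, x_doubled)

print(round(r_before, 4), round(r_after, 4))
```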
Look at the following least-squares regression line. The Y-intercept tells us the predicted waist girth for someone weighing how many pounds?
0
Look at the following least-squares regression line. If a person increased his/her weight by 10 pounds, by how much (in inches) would one expect to see their waist girth increase?
1.332
An outlier is a point more than _____ IQR from either end of the box in a boxplot.
1.5
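The 1.5 × IQR rule can be sketched in Python as follows. This is an illustration under stated assumptions: quartiles are taken as the medians of the lower and upper halves (other quartile conventions give slightly different fences), and the data are nine values borrowed from another card in this set with the maximum set to 115.

```python
from statistics import median

def fences(values):
    """Return (lower, upper) 1.5*IQR fences; quartiles are medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[-half:]   # the middle value is left out when n is odd
    q1, q3 = median(lower_half), median(upper_half)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [43, 54, 55, 63, 67, 68, 69, 77, 115]
low, high = fences(data)
# Points beyond either fence are flagged as outliers.
outliers = [v for v in data if v < low or v > high]

print(low, high, outliers)
```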
Rachael got a 670 on the analytical portion of the Graduate Record Exam (GRE). If GRE scores are normally distributed and have mean μ = 600 and standard deviation σ = 30, what is her standardized (Z) score?
2.33
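The calculation is z = (x − μ) / σ. A minimal Python sketch with the numbers from the question (the function name `z_score` is just an illustrative choice):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

z = z_score(670, 600, 30)   # (670 - 600) / 30
print(round(z, 2))
```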
How many baseball players report a salary of less than $1,441,000?
220
What percentage watched for the commercials only?
23.5%
What percent of the variation in the sisters' heights can be explained by the heights of the brothers?
31.14%
What percentage of the cancer patients who survived were treated at Clinic B?
320 / 710 = 45%
What percentage of the cancer patients at Clinic A survived?
390 / 600 = 65%
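The two clinic questions divide by different totals: the first conditions on survivors, the second on Clinic A patients. A minimal Python sketch using only the counts that appear in the answers above (variable names are illustrative):

```python
survived_a = 390      # Clinic A patients who survived
survived_b = 320      # Clinic B patients who survived
patients_a = 600      # all Clinic A patients

survived_total = survived_a + survived_b   # all survivors

# Among survivors, the share treated at Clinic B.
pct_b_among_survivors = 100 * survived_b / survived_total
# Among Clinic A patients, the share who survived.
pct_survived_at_a = 100 * survived_a / patients_a

print(round(pct_b_among_survivors), round(pct_survived_at_a))
```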
Given that a viewer did not watch the Super Bowl, what percentage were male?
45.2%
What percentage of the Super Bowl viewers (Game) were male?
58.2%
Find the median of the following 9 numbers: 43 54 55 63 67 68 69 77 85
67
Attendance at a university's basketball games follows a normal distribution with mean μ = 8,000 and standard deviation σ = 1,000. Estimate the percentage of games that have between 7,000 and 9,000 people in attendance.
68%
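The interval 7,000 to 9,000 is one standard deviation either side of the mean, which is where the 68% figure comes from. A minimal Python sketch that checks it with the standard library's `statistics.NormalDist` (Python 3.8 or later):

```python
from statistics import NormalDist

attendance = NormalDist(mu=8000, sigma=1000)

# Area under the normal curve between mu - sigma and mu + sigma.
p_within_one_sd = attendance.cdf(9000) - attendance.cdf(7000)

print(round(100 * p_within_one_sd, 1))
```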
A researcher wants to know if taking increasing amounts of ginkgo biloba will result in increased capacities of memory ability for different students. He administers it to the students in doses of 250 milligrams, 500 milligrams, and 1000 milligrams. What is the explanatory variable in this study?
Amount of ginkgo biloba given to each student.
Look at the following scatterplot. What would be a correct interpretation of the slope?
As we increase our CO content by 1 mg, we increase the tar content by 1.01 mg.
If we want to discuss any gaps and clusters in the data set, which of the following should not be chosen to display the data set?
Boxplot
If you knew that the μ = 0 and σ = 3, which density curve would match the data?
Dataset 2
Choose which description BEST fits the plot. (shark)
Direction: positive, form: linear, strength: strong
Choose which description BEST fits the plot. (alligator)
Direction: positive, form: non-linear, strength: strong
If you analyze data with a severe outlier you should:
Exclude the outlier but discuss why.
Look at the side-by-side boxplots and compare the female and male thigh girth.
Females and males have about the same thigh girths.
Look at the side-by-side histograms and compare the female and male shoulder girth. (female)
Females have a typically smaller shoulder girth than males
Look at the side-by-side boxplots and compare the female and male shoulder girth.
Females have a typically smaller shoulder girth than males.
Which of the following is not an objective of this class?
Give formulas to memorize
A residuals plot is useful because: I. It will help us to see whether a linear model makes sense. II. It might show a pattern in the data that was hard to see in the original scatterplot.
I and II
Which of the following data summaries are changed by adding a constant to each data value? I. the mean II. the median III. the standard deviation
I and II
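Adding a constant shifts every value by the same amount, so measures of center move while measures of spread do not. A minimal Python sketch, using nine values that appear elsewhere in this set and a hypothetical shift of 10:

```python
from statistics import mean, median, stdev

data = [43, 54, 55, 63, 67, 68, 69, 77, 85]
shift = 10  # hypothetical constant, chosen only for illustration
shifted = [v + shift for v in data]

mean_change = mean(shifted) - mean(data)        # moves by the shift
median_change = median(shifted) - median(data)  # moves by the shift
sd_change = stdev(shifted) - stdev(data)        # spread is unaffected

print(mean_change, median_change, sd_change)
```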
We might choose to display data with a stem-and-leaf plot rather than a boxplot because a stem-and-leaf plot ... I. reveals the shape of a distribution. II. is better for large data sets. III. displays the actual data.
I and III
For families who live in apartments the correlation between the family's income and the amount of rent they pay is r = 0.60. Which is true? I. In general, families with higher incomes pay more rent. II. On average, families spend 60% of their income on rent. III. The regression line passes through 60% of the (income$, rent$) data points.
I only
Which of the following are measures of the spread of a distribution (circle all that apply): I. Mean II. Variance III. Standard deviation IV. Median
II and III
Which is true of the data shown in the histogram? I. The distribution is skewed to the right. II. The mean is smaller than the median. III. We should use the median and IQR to summarize these data.
II and III only
Which one of the following is a FALSE statement about a standardized value (z-score)?
It is measured in the same units as the variable.
What is a FALSE statement about r, the correlation coefficient?
It is measured in units of the X variable.
Which one of the following is a FALSE statement about the standard normal curve?
Its standard deviation σ can vary with different datasets.
The shape of the boxplot below can be described as:
Left-skewed
Look at the density curve below. Which would be larger, the median (M) or the mean (μ)?
M would be smaller than μ
What is a plausible set of values for the five-number summary?
Min = 1, Q1 = 8.5, Median = 12.6, Q3 = 15, Max = 17
What are statistics?
Particular calculations made from data
A study was conducted to determine whether the amount of time students spend practicing concepts on the computer is associated with their math score on the Iowa Test of Basic Skills (ITBS). Computer time and ITBS math score were recorded for students at three schools. • Overall, the association between computer time and math score was negative. • Within each school, the association between computer time and math score was positive. Given the results of this study, what can you conclude?
Simpson's paradox is present.
Look at the following scatterplot. What could we say about the relationship between r and the slope of the regression line?
Since b is positive, r must also be positive.
Which of the following statements is TRUE?
Standard deviation is inflated by outliers.
All but one of the statements below contain a mistake. Which one could be true?
The correlation between blood alcohol level and reaction time is r = 0.73.
All but one of the following statements contain a mistake. Which one could be true?
The correlation between height and weight is 0.568.
The residuals plot for a linear model is shown. Which is true?
The linear model is no good because of the curve in the residuals.
Which one of the following is a FALSE statement about the standard normal distribution?
The mean is greater than the median.
For which of the following situations would it be appropriate to calculate r, the correlation coefficient?
Time spent studying for statistics exam and score on the exam.
What are data?
Values along with context
Suppose that a Normal model describes fuel economy (miles per gallon) for automobiles and that a Saturn has a standardized (z-score) of +2.2. This means that Saturns . . .
achieve fuel economy that is 2.2 standard deviations better than the average car
statistics
all of the above
We collect these data from the Tour de France. Which variable is quantitative?
average speed
The SPCA collects data about the dogs they house. Which is categorical?
breed
Which scatterplot shows a strong association between two variables even though the correlation is probably near zero?
c
A(n) ______ is an individual about whom or which we have data.
case
We collect these data from 50 students. Which variable is categorical?
eye color
An outlier is defined to be a point more than 1.0 IQR from either end of the box in a boxplot.
false
In a Normal model, about 68% of the data fall within 2 standard deviations of the mean.
false
True or False? Computing r as a measure of the strength of the relationship between X and Y is appropriate for the data in the following scatterplot:
false
You should use a histogram to display categorical data:
false
School administrators collect data on the students attending the school. Which variable is quantitative?
grade point average
Environmental researchers have collected data on rain acidity for years. Suppose that a Normal model describes the acidity (pH) of rainwater, and that water tested after last week's storm had a z-score of 1.8. This means that the acidity of that rain . . .
had a pH 1.8 standard deviations higher than that of average rainwater.
In a contingency table, when the distribution of one variable is the same for all categories of another, we say the variables are:
independent
The side-by-side boxplots below show cumulative GPAs for sophomores, juniors and seniors taking intro stats course in Autumn 2003. Which class had the lowest cumulative GPA?
junior
It takes a while for new employees to master a complex process. During the first month new employees work, a company tracks the number of days they have been on the job and the length of time it takes them to complete an assembly. The correlation is most likely to be
near -0.6
A regression analysis of company profits and the amount of money each company spent on advertising found R² = 0.72. Which of these is true? I. This model can correctly predict the profit for 72% of the companies. II. On average, about 72% of a company's profit results from advertising. III. On average, companies spend about 72% of their profits on advertising.
none
A regression analysis of students' AP Statistics test scores and the number of hours they spent doing homework found r² = 0.32. Which of the following is true? I. 32% of student test scores can be correctly predicted with this model. II. Homework accounts for 32% of your grade in AP stats. III. There is a 32% chance that you will get the score this model predicts for you.
none
What is the approximate range of the Male Wrist Girth dataset shown below?
none of these
Here is the data from the previous question: 43 54 55 63 67 68 69 77 85. Suppose that the last data point is actually 115 instead of 85. What effect would this new maximum have on our value for the median of the dataset?
not change the value of the median.
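The median depends only on the middle of the sorted data, so moving the maximum leaves it alone. A minimal Python sketch with the nine values from the question (variable names are illustrative):

```python
from statistics import median

original = [43, 54, 55, 63, 67, 68, 69, 77, 85]
# Same data with the maximum changed from 85 to 115.
modified = original[:-1] + [115]

median_original = median(original)   # 5th of the 9 sorted values
median_modified = median(modified)   # still the 5th sorted value

print(median_original, median_modified)
```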
Suppose the lengths of sport-utility vehicles (SUV) are normally distributed with mean μ = 190 inches and standard deviation σ = 5 inches. Marshall just bought a brand-new SUV that is 194.5 inches long and he is interested in knowing what percentage of SUVs is longer than his. Using his statistical knowledge, he drew a normal curve and labeled the appropriate area of interest.
plot B
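The plots are not reproduced here, but the area Marshall is interested in is the right-hand tail above 194.5 inches. As a rough check, a minimal Python sketch that computes that area with `statistics.NormalDist`, using the mean and standard deviation from the question:

```python
from statistics import NormalDist

suv_length = NormalDist(mu=190, sigma=5)

# Standardized score of Marshall's SUV.
z = (194.5 - 190) / 5
# Proportion of SUVs longer than 194.5 inches (area to the right).
p_longer = 1 - suv_length.cdf(194.5)

print(round(z, 1), round(100 * p_longer, 1))
```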
Which of the following scatterplots displays the stronger linear relationship? (They are the same except for one point.)
plot B
Which scatterplot would give a larger value for r?
plot B
Which of the following displays percentages rather than counts?
relative frequency table
Which of the following is not a step in doing statistics right?
repeat
What shape would you say the data take?
right-skewed
The cases we actually examine to understand a larger group is the:
sample
Which of the following is not part of the 5-number summary:
the mean
Look at the side-by-side histograms and compare the female and male shoulder girth. (Spread)
the spread of Distribution A is greater than the spread of Distribution B.
The five-number summary of credit hours for 24 students in an introductory statistics class is: From this we know that
there are no outliers in the data.
A correlation of zero between two quantitative variables means that
there is no linear association between the two variables.
Statistics is a way of reasoning.
true
A quantity or amount adopted as a standard of measurement is a(n):
unit
A(n) ______ holds information about the same characteristic for many cases.
variable
