Stats Unit 2


A taxi company monitoring the safety of its cabs kept track of the number of miles tires have been driven (in thousands) and the depth of the tread remaining (in mm). They found the equation of the least squares regression line to be: tread = 36 - 0.6 miles, with R² = 0.74. There appears to be a moderately strong, negative, linear association between miles driven and tread depth. r = ?

-0.86
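
As a quick check of the answer above: in a regression with a single predictor, r has magnitude sqrt(R²) and takes the sign of the slope. The sketch below assumes that setting; the function name `r_from_r_squared` is only illustrative.

```python
import math

def r_from_r_squared(r_squared: float, slope: float) -> float:
    """Return r for a one-predictor regression: sqrt(R^2) with the slope's sign."""
    magnitude = math.sqrt(r_squared)
    return math.copysign(magnitude, slope)

# Values from the taxi card: R^2 = 0.74, slope = -0.6
r = r_from_r_squared(0.74, -0.6)
print(round(r, 2))  # -0.86
```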

Variables X and Y have r = 0.40. If we decrease each X value by 0., double each Y value, and then interchange them, the new correlation will be

0.40 b/c r is not affected by shifts, change of scale, or switching variables
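
The invariance can be checked numerically. This is a minimal sketch with made-up data (so r here is not 0.40) and an arbitrary shift of 5; both are assumptions for illustration only. `pearson_r` is a hand-written helper, not a library call.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only to have a non-trivial correlation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

original = pearson_r(xs, ys)

shifted_xs = [x - 5.0 for x in xs]   # shift X by an arbitrary constant
doubled_ys = [2.0 * y for y in ys]   # rescale Y by a positive factor
transformed = pearson_r(doubled_ys, shifted_xs)  # swap the roles of X and Y

print(math.isclose(original, transformed))
```

Note that multiplying by a negative constant would flip the sign of r; the card's doubling is a positive rescaling, so r is unchanged.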

A consumer group wants to see if a new education program will improve the spending habits of college students. Students in an economics class are randomly assigned to 3 different courses on spending habits. How many factors are there?

1 factor: courses

Common bias errors include

1. relying on voluntary response 2. undercoverage of the population 3. non-response bias 4. response bias

Ceramics engineers are testing a new formulation for the material used to make insulators for powerlines. They will try baking the insulators at 4 different temperatures, followed by either slow or rapid cooling. They want to try every combination of the baking and cooling options to see which produced insulators least likely to break during adverse weather conditions. How many factors are there?

2 factors: temperature & cooling

A consumer group wants to see if a new education program will improve the spending habits of college students. Students in an economics class are randomly assigned to 3 different courses on spending habits. How many treatments are there?

3 treatments: 3 different courses

A taxi company monitoring the safety of its cabs kept track of the number of miles tires have been driven (in thousands) and the depth of the tread remaining (in mm). They found the equation of the least squares regression line to be: tread = 36 - 0.6 miles, with R² = 0.74. Explain (in context) what R² means.

74% of the variation in tread depth is explained by this model.

Ceramics engineers are testing a new formulation for the material used to make insulators for powerlines. They will try baking the insulators at 4 different temperatures, followed by either slow or rapid cooling. They want to try every combination of the baking and cooling options to see which produced insulators least likely to break during adverse weather conditions. How many treatments are there?

8 treatments: 4 temperatures * 2 cooling rates
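
With two crossed factors, the treatments are all combinations of their levels. A small sketch of that count; the temperature labels `T1`..`T4` are placeholders, since the card does not give the actual temperatures.

```python
from itertools import product

temperatures = ["T1", "T2", "T3", "T4"]  # placeholder labels for the 4 baking temperatures
cooling_rates = ["slow", "rapid"]

# Each treatment is one (temperature, cooling rate) combination.
treatments = list(product(temperatures, cooling_rates))

print(len(treatments))  # 8
```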

A taxi company monitoring the safety of its cabs kept track of the number of miles tires have been driven (in thousands) and the depth of the tread remaining (in mm). They found the equation of the least squares regression line to be: tread = 36 - 0.6 miles, with R² = 0.74. In this context, what does a negative residual mean?

A car has tires with tread depth that is less than predicted.

A taxi company monitoring the safety of its cabs kept track of the number of miles tires have been driven (in thousands) and the depth of the tread remaining (in mm). They found the equation of the least squares regression line to be: tread = 36 - 0.6 miles, with R² = 0.74. Explain (in context) what the y-intercept of the line means.

Tires that have been driven 0 miles are predicted to have a tread depth of 36 mm.

For families who live in apartments the correlation between the family's income and the amount of rent they pay is r = 0.60. Which is true? A. In general, families with higher incomes pay more in rent B. On average, families spend 60% of their income on rent C. The regression line passes through 60% of the data points

A. In general, families with higher incomes pay more in rent. b/c r is + so slope is +

Which is not a critical part of designing a good experiment? A. Random selection of subjects B. Random assignment of subjects to treatments C. Control of known sources of variability D. Replication of the experiment on a sufficient number of subjects E. All of these are important.

A. Random selection of subjects

Which is important in designing a good experiment? A. Randomization in assigning subjects to treatments. B. Control of potentially confounding variables. C. Replication of the experiment on a sufficient number of subjects.

A. Randomization in assigning subjects to treatments. B. Control of potentially confounding variables. C. Replication of the experiment on a sufficient number of subjects.

Which statement about influential points is true? A. Removal of an influential point changes the regression line. B. Data points that are outliers in the horizontal direction are more likely to be influential than points that are outliers in the vertical direction. C. Influential points have large residuals.

A. Removal of an influential point changes the regression line. B. Data points that are outliers in the horizontal direction are more likely to be influential than points that are outliers in the vertical direction.

The owner of a car dealership planned to develop strategies to increase sales. He hoped to learn the reasons why many people who visit his car lot do not eventually buy a car from him. For 1 month he asked his sales staff to keep a list of the names and addresses of everyone who came in to test drive a car. At the end of the month he sent surveys to the people who did not buy the car, asking them why. About 1/3 of them returned the survey, with 44% of those indicating that they found a lower price elsewhere. Which is true? A. The population of interest is all potential car buyers. B. This survey design suffered from non-response bias. C. Because it comes from a sample 44% is a parameter, not a statistic.

B. This survey design suffered from non-response bias. (A is not true b/c the population of interest is the people who visit his lot and do not buy, not all potential car buyers.)

Hoping to get information that would allow them to negotiate new rates with their advertisers, Natural Health magazine phones a random sample of 600 subscribers. 64% of those polled said they use nutritional supplements. Which is true? A. The population of interest is the people who read this magazine. B. "64%" is not a statistic; it's the parameter of interest C. This sampling design should provide the company with a reasonably accurate estimate of the % of all subscribers who use supplements.

A. The population of interest is the people who read this magazine. C. This sampling design should provide the company with a reasonably accurate estimate of the % of all subscribers who use supplements.

Which is true about sampling? A. An attempt to take a census will always result in less bias than sampling. B. Sampling error is usually reduced when the sample size is larger. C. Sampling error is the result of random variations and is always present.

B. Sampling error is usually reduced when the sample size is larger. C. Sampling error is the result of random variations and is always present.

Researchers who wanted to see if drinking grape juice could help people lower their blood pressure got 120 non-smokers to volunteer for a study. They measured each person's blood pressure and then randomly divided the subjects into two groups. One group drank a glass of grape juice every day while the other did not. After sixty days, the researchers measured everyone's blood pressure again. They reported the differences in change in blood pressure between the groups were not statistically significant. Other researchers now plan to replicate this study using both smokers and non-smokers. Briefly describe the design strategy you think they should use.

Block by smoking status, randomly assigning people in each block to the treatments. Compare blood pressure after 60 days within each block.

Researchers who wanted to see if drinking grape juice could help people lower their blood pressure got 120 non-smokers to volunteer for a study. They measured each person's blood pressure and then randomly divided the subjects into two groups. One group drank a glass of grape juice every day while the other did not. After sixty days, the researchers measured everyone's blood pressure again. They reported the differences in change in blood pressure between the groups were not statistically significant. Briefly explain why the researcher studied only non-smokers.

By studying only non-smokers, the researchers removed smoking as a source of variation in blood pressure, so its effect could not be confounded with the effect of grape juice. A new study that includes smokers should be blocked by smoking status, with half of each block randomly assigned to the grape juice group and half to the no grape juice group.

A regression model examining the amount of weight a football player can bench press found that 10 cm differences in chest size are associated with 8 kg differences in weight pressed. Which is true? A. The correlation between chest size and weight pressed is r = 0.80. B. As a player gets stronger and presses more weight his chest will get bigger. C. A positive residual means that the player pressed more than predicted.

C. A positive residual means that the player pressed more than predicted.

If we wish to compare the average PSAT scores of boys and girls taking stats at a high school, which would be the best way to gather this data? A. observational study B. stratified sample C. census D. experiment E. SRS

C. census

Does donating blood lower cholesterol levels? 50 volunteers have a cholesterol test, then donate blood, and then have another cholesterol test. Which aspect of experimental design is present? A. a placebo B. randomization C. none of these D. control group E. blinding

C. none of these

All but one of these statements contain a mistake. Which could be true? A. The correlation between a football player's weight and the position he plays is 0.54. B. There is a high correlation (1.09) between height of a corn stalk and its age in weeks. C. The correlation between a car's length and its fuel efficiency is 0.71 miles per gallon. D. The correlation between the amount of fertilizer used and the yield of beans is 0.42. E. There is a correlation of 0.63 between gender and political party.

D. The correlation between the amount of fertilizer used and the yield of beans is 0.42. b/c r must be between -1.0 and 1.0, r does not have units, and r cannot be used with categorical variables

A couple of years ago, a local newspaper published research results claiming a positive association between the number of years high school students had taken instrumental music lessons and their performances in school (GPA). A group of parents then went to the School Board demanding more funding for music programs as a way to improve student chances for academic success in high school. As a statistician, do you agree or disagree with their reasoning? Explain.

Disagree, association does not mean cause and effect. Perhaps the greater parental commitment that supports music lessons also encourages higher grades. (or higher SES enhances both, or people who are better students anyways take music)

A consumer group collected information on HDTVs. They created a linear model to estimate the cost of an HDTV (in $) based on the screen size (in inches). Which is the most likely value of the slope of the line of best fit? A. 700 B. 7000 C. 0.70 D. 7 E. 70

E. 70

It takes a while for new factory workers to master a complex assembly process. During the first month new employees work, the company tracks the number of days they have been on the job and the length of time it takes them to complete an assembly. The correlation (r) is most likely to be.... A. exactly +1.0 B. near 0 C. near +0.6 D. exactly -1.0 E. near -0.6

E. near -0.6 b/c the longer the worker has worked = more experience = less assembly time

A consumer group wants to see if a new education program will improve the spending habits of college students. Students in an economics class are randomly assigned to 3 different courses on spending habits. What are the experimental units?

Economics students

Researchers who wanted to see if drinking grape juice could help people lower their blood pressure got 120 non-smokers to volunteer for a study. They measured each person's blood pressure and then randomly divided the subjects into two groups. One group drank a glass of grape juice every day while the other did not. After sixty days, the researchers measured everyone's blood pressure again. They reported the differences in change in blood pressure between the groups were not statistically significant. Was this an experiment or an observational study? Explain.

Experiment - there was an application of a treatment, grape juice, to groups containing randomly assigned subjects with a comparison of the blood pressure across the treatment groups

A couple of years ago, a local newspaper published research results claiming a positive association between the number of years high school students had taken instrumental music lessons and their performances in school (GPA). What does "positive association" mean in this context?

In general, students who studied music longer had higher GPAs.

Data points whose x-values are far from the mean of x are said to exert ______.

Leverage. High-leverage points pull the line close to them, and so they can have a large effect on the line, sometimes completely determining the slope and intercept. With high enough leverage, their residuals can be deceptively small.

A variable that is not explicitly part of a model but affects the way the variables in the model appear to be related.

Lurking Variable

Suppose the state decides to randomly test high school wrestlers for steroid use. There are 16 teams in the league, and each team has 20 wrestlers. State investigators plan to test 32 of these athletes by randomly choosing 2 wrestlers from each team. Is this a simple random sample?

No, b/c not all possible groups of 32 wrestlers could have been sampled. This is a stratified sample or multistage sample.
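
The difference between the two designs can be seen by drawing both. This is a sketch only: the roster of (team, wrestler) pairs is hypothetical, and the fixed seed is there just to make the run repeatable.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical roster: (team, wrestler) pairs for 16 teams of 20 wrestlers.
roster = [(team, wrestler) for team in range(16) for wrestler in range(20)]

# Simple random sample: every group of 32 out of the 320 is equally likely.
srs = rng.sample(roster, 32)

# The state's plan: 2 wrestlers chosen at random from every team.
per_team = []
for team in range(16):
    team_members = [w for w in roster if w[0] == team]
    per_team.extend(rng.sample(team_members, 2))

# Under the state's plan every team count is exactly 2; an SRS has no such guarantee.
print(Counter(team for team, _ in per_team))
print(Counter(team for team, _ in srs))
```

A group of 32 that includes, say, three wrestlers from one team can occur in the SRS but never under the state's plan, which is why the plan is not an SRS.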

To check the effect of cold temperature on the battery's ability to start a car, researchers purchased a battery from Sears and one from NAPA. They disabled a car so it would not start, put the car in a warm garage, and installed the Sears battery. They tried to start the car repeatedly, keeping track of the total time that elapsed before the battery could no longer turn the engine over. Then they moved the car outdoors where the temperature was below 0. After the car had chilled there for several hours the researchers installed the NAPA battery and repeated the test. Is this a good experimental design?

No, b/c temperature is confounded with brand

Which is true? A. Random scatter in the residuals indicates a model with high predictive power. B. If two variables are very strongly associated, then the correlation between them will be near +1.0 or -1.0. C. The higher the correlation between two variables the more likely the association is based in cause and effect.

None

A regression analysis of students' AP statistics test scores and the number of hours they spent doing homework found R² = 0.32. Which of these is true? A. 32% of students test scores can be correctly predicted with this model. B. Homework accounts for 32% of your grade in AP stats. C. There's a 32% chance that you'll get the scores this model predicts for you.

None, b/c R² = 0.32 means that 32% of the variation in test scores can be accounted for by the model using homework hours.

A taxi company monitoring the safety of its cabs kept track of the number of miles tires have been driven (in thousands) and the depth of the tread remaining (in mm). They found the equation of the least squares regression line to be: tread = 36 - 0.6 miles, with R² = 0.74. Explain (in context) what the slope of the line means.

On average, the model predicts about 0.6 mm less tread for each additional 1000 miles the tires have been driven.
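
The intercept, slope, and residual interpretations for the taxi model can be tied together in a few lines. The fitted line is from the card; the observed tread depth of 28.5 mm is a made-up value used only to illustrate a negative residual.

```python
def predicted_tread(miles_thousands: float) -> float:
    """Predicted tread depth (mm) from the card's line: tread = 36 - 0.6 * miles."""
    return 36.0 - 0.6 * miles_thousands

def residual(observed_tread: float, miles_thousands: float) -> float:
    """Residual = observed value minus predicted value."""
    return observed_tread - predicted_tread(miles_thousands)

print(predicted_tread(0))    # intercept: predicted tread at 0 miles
print(predicted_tread(10))   # 10 thousand miles lowers the prediction by about 6 mm
print(residual(28.5, 10))    # hypothetical observation below the line: negative residual
```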

A polling organization is investigating public opinion about cloning. They phone a random sample of 1200 adults, asking each person one of these questions (randomly chosen): A. "Do you favor allowing doctors to use cloned cells in attempts to find cures for such terrible diseases as Alzheimer's, diabetes, and Parkinson's?" B. "Should research scientists be allowed to use cloned human embryos in their experiments?" Which question do you expect will elicit greater support for cloning. What kind of bias is this?

Question A will elicit greater support for cloning because the wording of the question appeals to the emotions of the respondent. Question B conjures up images of scientists experimenting on humans and cloned embryos, and will elicit less support. This is a form of response bias.

Researchers who wanted to see if drinking grape juice could help people lower their blood pressure got 120 non-smokers to volunteer for a study. They measured each person's blood pressure and then randomly divided the subjects into two groups. One group drank a glass of grape juice every day while the other did not. After sixty days, the researchers measured everyone's blood pressure again. They reported the differences in change in blood pressure between the groups were not statistically significant. Briefly explain why the researchers randomly assigned the subjects to the groups.

Randomly assigning subjects to groups allows us to equalize the effects of unknown or uncontrollable sources of variation.

Researchers who wanted to see if drinking grape juice could help people lower their blood pressure got 120 non-smokers to volunteer for a study. They measured each person's blood pressure and then randomly divided the subjects into two groups. One group drank a glass of grape juice every day while the other did not. After sixty days, the researchers measured everyone's blood pressure again. They reported the differences in change in blood pressure between the groups were not statistically significant. Since everyone's blood pressure was measured at the beginning and at the end of the study, the researchers could have simply looked at the grape juice drinkers to see if their blood pressure changed. Briefly explain why the researchers bothered to include the control group.

The control group provided a basis for comparison after the 60 days to determine if the group drinking grape juice had lower blood pressure because of the juice. Maybe everyone's blood pressure naturally drops at a certain time of the year.

Double-blinding in experiments is important so that

The evaluators do not know which treatment group the participants are in & the participants do not know which treatment group they are in

Researchers who wanted to see if drinking grape juice could help people lower their blood pressure got 120 non-smokers to volunteer for a study. They measured each person's blood pressure and then randomly divided the subjects into two groups. One group drank a glass of grape juice every day while the other did not. After sixty days, the researchers measured everyone's blood pressure again. They reported the differences in change in blood pressure between the groups were not statistically significant. Briefly explain what "not statistically significant" means in this context.

The difference in blood pressure change between the group that drank grape juice and the group that did not was small enough that it could reasonably be attributed to random variation (or sampling error).

A taxi company monitoring the safety of its cabs kept track of the number of miles tires have been driven (in thousands) and the depth of the tread remaining (in mm). They found the equation of the least squares regression line to be: tread = 36 - 0.6 miles, with R² = 0.74. What is the explanatory variable?

Miles driven (in thousands), plotted on the x-axis

Suppose a school district decides to randomly test high school students for attention deficit disorder (ADD). There are 3 high schools in the district, each with grades 9-12. The school board pools all of the students together & randomly samples 250 students. Is this a simple random sample?

Yes, b/c they could have chosen any 250 students from the district.

In an experiment the primary purpose of blinding is to reduce

bias

Any systematic failure of a sampling method to represent its population is ____

bias. Biased sampling methods tend to over- or underestimate parameters. It is almost impossible to recover from bias.

Any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups is said to be...

blinded

When groups of experimental units are similar in a way that is not a factor under study, it is often a good idea to gather them into _______ and then randomize the assignment of treatments within each _____. By doing this, we isolate the variability attributable to the differences between the ______ so that we can see the differences caused by the treatments more clearly.

block

Ceramics engineers are testing a new formulation for the material used to make insulators for powerlines. They will try baking the insulators at 4 different temperatures, followed by either slow or rapid cooling. They want to try every combination of the baking and cooling options to see which produced insulators least likely to break during adverse weather conditions. What is the response variable?

breakage of the insulators

A sample that consists of the entire population

census

We wish to compare the average ages of the math and science teachers at your high school. What is the best way to collect that data?

census

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Randomly select several city blocks; interview all the adults living on each block.

cluster

A sampling design in which entire groups are chosen at random. Is usually selected as a matter of convenience, practicality, or cost. Groups are heterogeneous and should be representative of the population

cluster sampling

When the levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated, we say that these two factors are...

confounded

The experimental units assigned to a baseline treatment level, typically either the default treatment or a null, placebo treatment. Their responses provide a basis for comparison.

control group

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Have each council member survey 50 friends, neighbors, or co-workers.

convenience

Consists of individuals who are conveniently available. Often fail to be representative b/c every individual in the population is not equally convenient to sample

convenience sampling

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Go to a downtown street corner, a grocery store, and a shopping mall; interview 100 typical shoppers at each location.

convenience sample

For each pair of variables, indicate what association you expect: positive, negative, curved, or none. The amount of rainfall during growing season; the crop yield (bushels per acre)

curved

An ___________ manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels.

experiment

A researcher wants to compare the performance of 3 types of pain relievers in volunteers suffering from arthritis. Because people of different ages may suffer arthritis of varying degrees of severity, the subjects are split into 2 groups: younger than 60 and older than 60. Subjects in each group are randomly assigned to take one of the medications. 20 minutes later they rate their levels of pain. This experiment

has one factor (medication) blocked by age

A point that, if omitted from the data, results in a very different regression model

influential point

Ceramics engineers are testing a new formulation for the material used to make insulators for powerlines. They will try baking the insulators at 4 different temperatures, followed by either slow or rapid cooling. They want to try every combination of the baking and cooling options to see which produced insulators least likely to break during adverse weather conditions. What are the experimental units?

insulators

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Randomly pick several city blocks, then randomly pick 10 residents from each block.

multistage

Sampling schemes that combine several sampling methods.

multistage sampling

For each pair of variables, indicate what association you expect: positive, negative, curved, or none. The price charged for fund-raiser candy bars; number of candy bars sold

negative

For each pair of variables, indicate what association you expect: positive, negative, curved, or none. The number of miles a student lives from school; the student's GPA.

none

Bias introduced when a large fraction of those sampled fail to respond. Those who do respond are likely to not represent the entire population.

nonresponse bias

A study based on data in which no manipulation of factors has been employed.

observational study

A treatment known to have no effect, administered so that all groups experience the same conditions.

placebo

A numerically valued attribute of a model for a population. We rarely expect to know the true value, but we do hope to estimate it from sampled data. For example, the mean income of all employed people in the county is a....

population parameter

For each pair of variables, indicate what association you expect: positive, negative, curved, or none. A person's blood alcohol level; time it takes the person to solve a maze.

positive

For each pair of variables, indicate what association you expect: positive, negative, curved, or none. Weekly sales of hot chocolate at a Montana diner; the number of auto accidents that week in that town.

positive

An observational study in which subjects are followed to observe future outcomes. Because no treatments are deliberately applied, this is not an experiment. These typically focus on estimating differences among groups that might appear as the groups are followed during the course of the study.

prospective study

A researcher identified 100 men over 40 who were not exercising and another 100 men over 40 with similar medical histories who were exercising regularly. She followed all the men for several years to see if there was any difference between the 2 groups in the rate of heart attacks. This is a...

prospective study

Does regular exercise decrease the risk of cancer? A researcher finds 200 women over 50 who exercise regularly, pairs each with a woman who has a similar medical history but does not exercise, then follows subjects for 10 years to see which group develops more cancer. This is a

prospective study

The best defense against bias is ________, in which each individual is given a fair, random chance of selection.

randomization

20 dogs and 20 cats were subjects in an experiment to test the effectiveness of a new flea control chemical. 10 of the dogs were randomly assigned to an experimental group that wore a collar containing the chemical, while the others wore a similar collar without the chemical. The same was done with the cats. After 30 days vets were asked to inspect the animals for fleas & evidence of flea bites. This experiment is

randomized block, blocked by species
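
Random assignment within blocks can be sketched as follows. The animal labels and the helper `assign_within_blocks` are hypothetical, and the seed is fixed only for repeatability; the point is that shuffling happens separately inside each species.

```python
import random

def assign_within_blocks(blocks, rng):
    """Randomly split each block in half: chemical collar vs. plain collar."""
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)          # randomize order inside this block only
        half = len(shuffled) // 2
        for unit in shuffled[:half]:
            assignment[unit] = "chemical collar"
        for unit in shuffled[half:]:
            assignment[unit] = "plain collar"
    return assignment

# Hypothetical labels for the 20 dogs and 20 cats.
blocks = {
    "dogs": [f"dog{i}" for i in range(20)],
    "cats": [f"cat{i}" for i in range(20)],
}
assignment = assign_within_blocks(blocks, random.Random(1))

for species, units in blocks.items():
    treated = sum(assignment[u] == "chemical collar" for u in units)
    print(species, treated)  # 10 treated animals in each species
```

Because each species is split 10/10, differences between dogs and cats cannot be mistaken for an effect of the collar.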

The difference between the observed value and its associated predicted value is called the

residual

________ tells us how far off the model's prediction is at that point.

residual value

Anything in a survey design that influences responses falls under the heading of ____________.

response bias

An observational study in which subjects are selected and then their previous conditions or behaviors are determined. These need not be based on random samples and they usually focus on estimating differences between groups or associations between variables.

retrospective study

A company sponsoring a new internet search engine wants to collect data on the ease of using it. What is the best way to collect the data?

sample survey

A study that asks questions of a sample drawn from some population in the hope of learning something about the entire population. Polls taken to assess voter preferences are common examples.

sample survey

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Have the Board of Elections assign each voter a number, then select 400 of them using a random number table.

simple random sample (SRS)

A sample of size n in which each set of n elements in the population has an equal chance of selection

simple random sampling

A consumer group wants to see if a new education program will improve the spending habits of college students. Students in an economics class are randomly assigned to 3 different courses on spending habits. What is the response variable?

spending habits

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Randomly pick 50 voters from each election district.

stratified

A sampling design in which the population is divided into several subpopulations, or strata, and random samples are then drawn from each stratum. If the strata are homogeneous, but are different from each other, a ________ may yield more consistent results than an SRS

stratified sampling

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Call every 500th person in the phone book.

systematic

A sample drawn by selecting individuals systematically from a sampling frame. When there is no relationship between the order of the sampling frame and the variables of interest, this can be representative

systematic sampling
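
A systematic sample takes every k-th entry of the sampling frame. The sketch below assumes a hypothetical frame of 5000 numbered entries and adds a random starting point, which is a common convention but is not stated on the card.

```python
import random

def systematic_sample(frame, step, rng):
    """Take every `step`-th entry, starting from a random position below `step`."""
    start = rng.randrange(step)
    return frame[start::step]

frame = list(range(5000))  # hypothetical phone-book entries, numbered 0..4999
sample = systematic_sample(frame, 500, random.Random(2))

print(len(sample))  # 10
```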

A correlation of 0 between 2 quantitative variables means that

there is no linear association between the 2 variables

A sampling scheme that biases the sample in a way that gives a part of the population less representation in the sample than it has in the population suffers from ____________________.

undercoverage

In an experiment the primary purpose of blocking is to reduce

variation

A member of the City Council has proposed a resolution opposing construction of a new state prison there. The council members decide they want to assess public opinion before they vote on this resolution. Below is the method that is proposed to sample local residents to determine the level of public support for the resolution. What is this sampling technique? Place an announcement in the newspaper asking people to call their council.

voluntary response

Bias introduced to a sample when individuals can choose on their own whether to participate in the sample

voluntary response bias

A sample in which individuals can choose on their own whether to participate. Samples based on this are always invalid and cannot be recovered, no matter how large the sample size.

voluntary response sampling

