Statistics Exam 2
Will the following variables have positive correlation, negative correlation, or no correlation? IQ and annual salary
Positive
The _______, R², measures the proportion of total variation in the response variable that is explained by the least squares regression line.
coefficient of determination
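A minimal sketch of how R² could be computed by hand, assuming a small made-up data set (the values of `x` and `y` below are hypothetical and not from the course material). It fits the least-squares line and then compares the unexplained variation to the total variation.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope (b1) and intercept (b0).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Predicted values on the regression line.
y_hat = [b0 + b1 * xi for xi in x]

# Total variation in y, and the part left unexplained by the line.
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_resid = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# Proportion of total variation explained by the regression line.
r_squared = 1 - ss_resid / ss_total
print(round(r_squared, 3))
```

For these made-up values the line explains 60% of the variation in y, so R² is 0.6.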
What is a closed question? What is an open question?
A closed question has fixed choices for answers, whereas an open question is a free-response question.
What is a confounding variable?
A confounding variable is an explanatory variable that was considered in a study whose effect cannot be distinguished from that of a second explanatory variable in the study.
What is a designed experiment?
In a designed experiment, a researcher assigns individuals to a certain group, intentionally changes the value of an explanatory variable, and then records the value of the response variable for each group.
What is a frame?
A frame is a list of the individuals in the population being studied.
What is a lurking variable?
A lurking variable is an explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. In addition, lurking variables are typically related to explanatory variables in the study.
What does it mean when a part of the population is under-represented?
A part of the population is under-represented when it is proportionally smaller in a sample than in its population.
What does it mean when an observational study is prospective?
A prospective study collects the data over time, following the individuals forward from the start of the study.
What is a residual? What does it mean when a residual is positive?
A residual is the difference between an observed value of the response variable y and the predicted value of y. If it is positive, then the observed value is greater than the predicted value.
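As a small illustration of the sign of a residual, the sketch below assumes a hypothetical fitted line ŷ = 2.2 + 0.6x (the coefficients and the two observations are invented for this example).

```python
def predict(x):
    """Predicted y on a hypothetical regression line y_hat = 2.2 + 0.6 * x."""
    return 2.2 + 0.6 * x


def residual(x, y_observed):
    """Residual = observed y minus predicted y."""
    return y_observed - predict(x)


# Observed point (2, 4): predicted value is 3.4, so the point lies above the line.
r_above = residual(2, 4)

# Observed point (1, 2): predicted value is 2.8, so the point lies below the line.
r_below = residual(1, 2)

print(r_above > 0, r_below < 0)
```

A positive residual means the line under-predicted that observation; a negative residual means it over-predicted.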
What does it mean when an observational study is retrospective?
A retrospective study requires that individuals look back in time or requires the researcher to look at existing records.
What is an observational study?
An observational study measures the value of the response variable without attempting to influence the value of either the response or explanatory variables.
Discuss the advantages and disadvantages of each type of question.
Closed questions are easier to analyze, but limit the responses. Open questions allow respondents to state exactly how they feel, but are harder to analyze due to the variety of answers and possible misinterpretation of answers.
Researchers wanted to determine if there was an association between the level of satisfaction of an individual and their risk of breast cancer. The researchers studied 1718 people over the course of 12 years. During this 12-year period, they interviewed the individuals and asked questions about their daily lives and the hassles they face. In addition, hypothetical scenarios were presented to determine how each individual would handle the situation. These interviews were videotaped and studied to assess the emotions of the individuals. The researchers also determined which individuals in the study experienced any type of breast cancer over the 12-year period. After their analysis, the researchers concluded that the satisfied individuals were less likely to experience breast cancer. Complete parts (a) through (c). (a) What type of observational study was this? Explain.
Cohort study, because information was collected about a group of individuals by observing them over a long period of time.
What is meant by confounding?
Confounding in a study occurs when the effects of two or more explanatory variables are not separated. Therefore, any relation that may exist between an explanatory variable and the response variable may be due to some other variable or variables not accounted for in the study.
What is a cross-sectional study?
Cross-sectional studies are observational studies that collect information about individuals at a specific point in time or over a very short period of time.
Explain what each point on the least-squares regression line represents.
Each point on the least-squares regression line represents the predicted y-value at the corresponding value of x.
True or false: Correlation implies causation.
False
Explain the difference between a single-blind and a double-blind experiment.
In a single-blind experiment, the subject does not know which treatment is received. In a double-blind experiment, neither the subject nor the researcher in contact with the subject knows which treatment is received.
What is the difference between univariate data and bivariate data?
In univariate data, a single variable is measured on each individual. In bivariate data, two variables are measured on each individual.
Why is it rare for frames to be completely accurate?
It is rare for frames to be completely accurate because frames are obtained only periodically, whereas populations are constantly changing.
If the linear correlation between two variables is negative, what can be said about the slope of the regression line?
Negative. The slope of the regression line is also negative.
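The sign of the slope matches the sign of r because of the standard formula for the least-squares slope. Here s_x and s_y denote the sample standard deviations of x and y (symbols introduced for this illustration), which are positive whenever the data vary:

```latex
b_1 = r \cdot \frac{s_y}{s_x},
\qquad s_x > 0,\ s_y > 0
\;\Longrightarrow\;
\operatorname{sign}(b_1) = \operatorname{sign}(r)
```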
Will the following variables have positive correlation, negative correlation, or no correlation? outside temperature and the number of people wearing coats
Negative
Which is the superior observational study? Why?
Neither study is always superior to the other. Both have advantages and disadvantages that depend on the situation.
What does it mean if r=0?
No linear relationship exists between the variables.
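Note that r = 0 rules out only a linear relationship. The sketch below uses made-up data where y is exactly x², so the variables are perfectly related, yet the linear correlation coefficient comes out as zero. The helper `pearson_r` is written here for illustration and is not a library function.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient computed from its definition."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfect parabola, symmetric about x = 0.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)
```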
Distinguish between nonsampling error and sampling error.
Nonsampling error is the error that results from undercoverage, nonresponse bias, response bias, or data-entry errors. Sampling error is the error that results because a sample is being used to estimate information about a population.
What does it mean when sampling is done without replacement?
Once an individual is selected, the individual cannot be selected again.
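A short sketch of the difference, using Python's standard `random` module on a hypothetical five-person population (the names are invented). `sample` draws without replacement, while `choices` draws with replacement.

```python
import random

# Hypothetical population, invented for illustration.
population = ["Ann", "Ben", "Cal", "Dee", "Eve"]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Without replacement: no individual can appear twice.
without_replacement = rng.sample(population, k=3)

# With replacement: the same individual may be selected more than once.
with_replacement = rng.choices(population, k=3)

print(without_replacement)
print(with_replacement)
```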
What is replication in an experiment?
Replication is applying each treatment to more than one experimental unit.
Which sampling method does not require a frame?
Systematic
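One way to see why no frame is needed: a systematic sample can be taken from individuals as they arrive, by picking a random starting point and then every k-th individual after it. The sketch below assumes a hypothetical stream of 20 numbered customers and k = 5; the function name `systematic_sample` is made up for this example.

```python
import random


def systematic_sample(stream, k, rng):
    """Yield every k-th individual from a stream, starting at a random offset."""
    start = rng.randrange(k)  # random starting position among the first k
    for index, individual in enumerate(stream):
        if index % k == start:
            yield individual


# Hypothetical stream of customers; no complete list is required in advance.
customers = range(20)
picked = list(systematic_sample(customers, k=5, rng=random.Random(1)))
print(picked)
```

Because the function only iterates over the stream once, it never needs to know the full population ahead of time.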
Do the two variables have a linear relationship?
The data points do not have a linear relationship because they do not lie mainly in a straight line.
Do the two variables have a linear relationship?
The data points have a linear relationship because they lie mainly in a straight line.
If the relationship is linear do the variables have a positive or negative association?
The relationship is not linear.
What are the advantages of having a presurvey with open questions to assist in constructing a questionnaire that has closed questions?
The researcher can learn common answers.
(c) In the report, the researchers stated that "the research team also hasn't ruled out that a common factor like genetics could be causing both the emotions and the breast cancer." Explain what this sentence means. Choose the correct answer below.
The researchers may be concerned with confounding that occurs when the effects of two or more explanatory variables are not separated or when there are some explanatory variables that were not considered in a study, but that affect the value of the response variable.
What is the response variable? What is the explanatory variable?
The response variable is whether or not breast cancer was contracted, because it is the variable of interest. The explanatory variable is level of satisfaction, because it affects the other variable.
What does it mean to say that two variables are negatively associated?
There is a linear relationship between the variables, and whenever the value of one variable increases, the value of the other variable decreases.
What does it mean to say that two variables are positively associated?
There is a linear relationship between the variables, and whenever the value of one variable increases, the value of the other variable increases.
Determine whether the following statement is true or false. Generally, the goal of an experiment is to determine the effect that the treatment will have on the response variable.
True
Is the statement below true or false? The least-squares regression line always travels through the point (x̄, ȳ).
True
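The point in question is the point of sample means. A short derivation, using the usual least-squares notation b_0 for the intercept and b_1 for the slope (notation assumed here, since these notes do not name them), shows why the line must pass through it:

```latex
\hat{y} = b_0 + b_1 x,
\qquad b_0 = \bar{y} - b_1 \bar{x}
\quad\Longrightarrow\quad
\hat{y}\,\big|_{x=\bar{x}} = (\bar{y} - b_1 \bar{x}) + b_1 \bar{x} = \bar{y}
```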
Determine whether the following statement is true or false. Explain. Inferences based on voluntary response samples are generally not reliable.
True, because it is often the case that the individuals who volunteer do not accurately represent the population.
Which allows the researcher to claim causation between an explanatory variable and a response variable?
a designed experiment
Match the linear correlation coefficient to the scatter diagram. The scales on the x- and y-axis are the same for each scatter diagram. r=0.946
all points are closer together
Match the linear correlation coefficient to the scatter diagram. The scales on the x- and y-axis are the same for each scatter diagram. r=0.787
all points are more spread out
Match the linear correlation coefficient to the scatter diagram. The scales on the x- and y-axis are the same for each scatter diagram. r=1
all points make a line
Grouping together similar experimental units and then randomly assigning the experimental units within each group to a treatment is called
blocking
A ____________________ is obtained by dividing the population into groups and selecting all individuals from within a random sample of the groups.
cluster sample
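A minimal sketch of cluster sampling, assuming a hypothetical population of households grouped by city block (all names below are invented). Whole clusters are chosen at random, and every individual in a chosen cluster is included.

```python
import random

# Hypothetical clusters: city blocks and the households on each.
blocks = {
    "block_A": ["A1", "A2", "A3"],
    "block_B": ["B1", "B2"],
    "block_C": ["C1", "C2", "C3", "C4"],
    "block_D": ["D1", "D2"],
}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick n_clusters clusters and keep all individuals in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]


sample = cluster_sample(blocks, n_clusters=2, rng=random.Random(0))
print(sample)
```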
Determine whether the study depicts an observational study or an experiment.
Experiment. The researchers control one variable to determine the effect on the response variable.
Does the description correspond to an observational study or an experiment? Twenty patients with Parkinson's disease are divided into two groups. One group uses a drug and the other uses a placebo. After one year, memory recall is measured.
Experiment. The researchers control one variable to determine the effect on the response variable.
Determine whether the study depicts an observational study or an experiment. A poll is conducted by a school's math department in which 3rd grade students are asked if they prefer to be in their math or science class.
Observational study. The study examines individuals in a sample but does not try to influence the response variable.
What are some solutions to nonresponse?
Offer rewards and incentives, and attempt callbacks.
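Match the linear correlation coefficient to the scatter diagram. The scales on the x- and y-axis are the same for each scatter diagram. r=-0.025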
points are widely spaced out
Will the following variables have positive correlation, negative correlation, or no correlation? Years of education and annual salary
positive
A _______ is a scatter diagram with the residuals on the vertical axis and the explanatory variable on the horizontal axis.
residual plot
A __________________________ is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.
stratified sample
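A minimal sketch of stratified sampling, assuming a hypothetical student population split into class-year strata (all names below are invented). Unlike a cluster sample, every group is represented, and only some individuals are drawn from each.

```python
import random

# Hypothetical strata: students grouped by class year.
strata = {
    "freshman": ["F1", "F2", "F3", "F4"],
    "sophomore": ["S1", "S2", "S3", "S4"],
    "junior": ["J1", "J2", "J3", "J4"],
}


def stratified_sample(groups, n_per_group, rng):
    """Draw a simple random sample of n_per_group individuals from each stratum."""
    sample = []
    for name in sorted(groups):
        sample.extend(rng.sample(groups[name], n_per_group))
    return sample


chosen = stratified_sample(strata, n_per_group=2, rng=random.Random(0))
print(chosen)
```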
What is a case-control study?
Case-control studies are observational studies that are retrospective, meaning that they require individuals to look back in time or require the researcher to look at existing records.
Determine whether the following statement is true or false. Explain. A simple random sample is always preferred because it obtains the same information as other sampling plans but requires a smaller sample size.
False, because other sampling techniques may provide more information for less cost than a simple random sample.
Researchers wanted to know if there is a link between proximity to high-tension wires and the rate of leukemia in children. To conduct the study, researchers compared the rate of leukemia for children who lived within 1/2 mile of high-tension wires to the rate of leukemia for children who did not live within 1/2 mile of high-tension wires. The researchers found that the rate of leukemia for children near high-tension wires was higher than the rate for those not near high-tension wires. Can the researchers conclude that proximity with high-tension wires causes leukemia in children?
No, because this is an observational study.
Determine whether the following statement is true or false. Explain. When conducting a cluster sample, it is better to have fewer clusters with more individuals when the clusters are heterogeneous.
True, because when the clusters are heterogeneous, they are scaled-down versions of the population.