Week 4 Homework
The linear correlation coefficient is always between _______ and _______, inclusive.
-1, 1
What is the probability of an event that is impossible?
0
Define the complement of an event E.
All of the outcomes in the sample space that are not outcomes in the event E.
Explain the Law of Large Numbers. How does this law apply to gambling casinos?
As the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome. Casinos rely on this law: each game gives the house a small statistical advantage, so over a very large number of bets the casino makes a profit in the long run.
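The long-run behavior described above can be illustrated with a short simulation. This is a minimal sketch and not part of the original homework: the function name `proportion_of_successes`, the fixed seed, and the fair-coin probability 0.5 are assumptions chosen for illustration.

```python
import random

def proportion_of_successes(p, n_trials, seed=0):
    """Simulate n_trials independent trials with success probability p
    and return the observed proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(1 for _ in range(n_trials) if rng.random() < p)
    return successes / n_trials

# The observed proportion tends to settle near p as the number of trials grows.
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_successes(0.5, n))
```

For small n the proportion can be far from 0.5; for large n it is typically very close.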
Brad and Allison have three girls. Brad tells Allison that he would like one more child because they are due to have a boy. What do you think of Brad's logic?
Brad's logic is incorrect; he is relying on the nonexistent Law of Averages. The fact that Brad and Allison have had three girls in a row does not matter. The probability that the next child is a boy is still about 0.5.
T/F: When two events are disjoint, they are also independent.
False. Two events are disjoint if they have no outcomes in common. In other words, the events are disjoint if, knowing that one of the events occurs, we know the other event did not occur. Independence means that one event occurring does not affect the probability of the other event occurring. Therefore, knowing two events are disjoint means that the events are not independent.
According to a center for disease control, the probability that a randomly selected person has hearing problems is 0.153. The probability that a randomly selected person has vision problems is 0.088. Can we compute the probability of randomly selecting a person who has hearing problems or vision problems by adding these probabilities? Why or why not?
No, because hearing and vision problems are not mutually exclusive. So, some people have both hearing and vision problems. These people would be included twice in the probability.
Suppose that a probability is approximated to be zero based on empirical results. Does this mean that the event is impossible?
No. When a probability is based on an empirical experiment, a probability of zero does not mean that the event cannot occur. The probability of an event E is approximately the number of times event E is observed divided by the number of repetitions of the experiment. Just because the event was not observed does not mean that the event is impossible.
How do you calculate residual?
Observed y minus predicted y.
Suppose that events E and F are independent, P(E)=0.6, and P(F)=0.6. What is the P(E and F)?
P(E and F) = P(E) * P(F) = 0.6*0.6 = 0.36
If P(E)=0.50, P(E or F)=0.60, and P(E and F)=0.05, find P(F). If P(E)=0.60, P(E or F)=0.70, and P(E and F)=0.10, find P(F).
P(F) = P(E or F) - P(E) + P(E and F). First case: P(F) = 0.60 - 0.50 + 0.05 = 0.15. Second case: P(F) = 0.70 - 0.60 + 0.10 = 0.20.
Find the probability of the indicated event if P(E)=0.20 and P(F)=0.45: Find P(E or F) if P(E and F)=0.05.
P(E or F) = P(E) + P(F) - P(E and F) = 0.20 + 0.45 - 0.05 = 0.60
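The General Addition Rule and its rearrangement for P(F) can be checked with a few lines of code. This is a small sketch; the function names `prob_or` and `solve_p_f` are made up for illustration, and the inputs are the numbers from the two questions above.

```python
def prob_or(p_e, p_f, p_and):
    """General Addition Rule: P(E or F) = P(E) + P(F) - P(E and F)."""
    return p_e + p_f - p_and

def solve_p_f(p_e, p_or, p_and):
    """Rearranged rule: P(F) = P(E or F) - P(E) + P(E and F)."""
    return p_or - p_e + p_and

print(round(prob_or(0.20, 0.45, 0.05), 2))    # 0.6
print(round(solve_p_f(0.50, 0.60, 0.05), 2))  # 0.15
print(round(solve_p_f(0.60, 0.70, 0.10), 2))  # 0.2
```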
If E and F are disjoint events, then P(E or F) = _______.
P(E) + P(F)
What is the Complement Rule?
P(E^c) = 1 - P(E)
The word _______ suggests an unpredictable result or outcome.
Random
A _______ represents scenarios where the outcome of any particular trial of an experiment is unknown, but the proportion (or relative frequency) a particular outcome is observed approaches a specific value.
Random process
Which of the following is true about the least squares regression line, y = b₁x + b₀?
The least-squares regression line always contains the point (x̄, ȳ). The predicted value of y, ŷ, is an estimate of the mean value of the response variable for that particular value of the explanatory variable. The least-squares regression line minimizes the sum of squared residuals. The sign of the linear correlation coefficient, r, and the sign of the slope of the least-squares regression line, b₁, are the same.
What is the least-squares regression line?
The line of best fit; it minimizes the sum of the squared residuals.
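Two of the properties above (the line passes through the point of means, and it minimizes the sum of squared residuals) can be checked numerically. The sketch below uses the standard formulas b₁ = Sxy / Sxx and b₀ = ȳ - b₁x̄; the data set and the function names are hypothetical, invented only for illustration.

```python
def least_squares(xs, ys):
    """Return (b1, b0) for the least-squares line y-hat = b1*x + b0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b1, b0

def sum_squared_residuals(xs, ys, b1, b0):
    """Sum of (observed y - predicted y)^2 for the line y-hat = b1*x + b0."""
    return sum((y - (b1 * x + b0)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, not taken from the homework.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b1, b0 = least_squares(xs, ys)
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)

print(b1, b0)
# The fitted line passes through (x-bar, y-bar).
print(abs((b1 * x_bar + b0) - y_bar) < 1e-9)
# Nudging the slope away from b1 makes the sum of squared residuals larger.
print(sum_squared_residuals(xs, ys, b1, b0) < sum_squared_residuals(xs, ys, b1 + 0.1, b0))
```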
Why is the following not a probability model?
This is not a probability model because at least one probability is less than 0.
What does it mean to say that we should not use the regression model to make predictions outside the scope of the model?
We should not use the regression model to make predictions for values of the explanatory variable (x) that are much larger or smaller than those observed.
With a probability of 0.026, would you consider it unusual to find a college student who never wears a seat belt when riding in a car driven by someone else?
Yes, because P(never)<0.05.
About 18% of the population of a large country is nervous around strangers. a) If two people are randomly selected, what is the probability both are nervous around strangers? b) What is the probability at least one is nervous around strangers?
a) 0.18*0.18 = 0.0324 b) P(not nervous) = 1 - 0.18 = 0.82, so P(at least one) = 1 - 0.82*0.82 = 0.3276
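The "at least one" shortcut can be cross-checked by listing all four outcomes for the two people. This is a sketch that assumes the two selections are independent (reasonable when sampling from a large population); the variable names are illustrative only.

```python
from itertools import product

p = 0.18  # P(a randomly selected person is nervous around strangers)

both = p * p
at_least_one = 1 - (1 - p) ** 2  # complement of "neither is nervous"

# Cross-check: add up the probabilities of every outcome with at least one nervous person.
enumerated = sum(
    (p if first else 1 - p) * (p if second else 1 - p)
    for first, second in product([True, False], repeat=2)
    if first or second
)

print(round(both, 4), round(at_least_one, 4), round(enumerated, 4))
```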
Determine whether the events E and F are independent or dependent. Justify your answer. a) E: A person having an at-fault accident. F: The same person being prone to road rage. b) E: A randomly selected person accidentally killing a spider. F: A different randomly selected person accidentally swallowing a spider. c) E: The war in a major oil-exporting country. F: The price of gasoline.
a) E and F are dependent because being prone to road rage can affect the probability of a person having an at-fault accident. b) E cannot affect F and vice versa because the people were randomly selected, so the events are independent. c) The war in a major oil-exporting country could affect the price of gasoline, so E and F are dependent.
Is there a relation between the age difference between husbands and wives and the percent of a country that is literate? Researchers found the least-squares regression between age difference (husband age minus wife age), y, and literacy rate (percent of the population that is literate), x, is y = −0.0498x + 6.8. The model applied for 17≤x≤100. a) Interpret the slope. b) Does it make sense to interpret the y-intercept? Explain. c) Predict the age difference between husband and wife in a country where the literacy rate is 30 percent. d) Would it make sense to use this model to predict the age difference between husband and wife in a country where the literacy rate is 12%? e) The literacy rate in a country is 97% and the age difference between husbands and wives is 1.5 years. Is this age difference above or below the average age difference among all countries whose literacy rate is 97%?
a) For every one-percentage-point increase in literacy rate (x), the age difference (y) falls by 0.0498 years, on average. b) No, it does not make sense to interpret the y-intercept because an x-value of 0 is outside the scope of the model (17≤x≤100). c) y = (-0.0498*30) + 6.8 = 5.3 years d) No, it does not make sense because an x-value of 12 is outside the scope of the model. e) Below. The average age difference among all countries whose literacy rate is 97% is (-0.0498*97) + 6.8 = 2.0 years, and 1.5 is less than 2.0.
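The idea of staying inside the scope of the model can be made explicit in code. The sketch below wraps the regression equation from the problem in a function that refuses inputs outside 17≤x≤100; the function name and the choice to raise `ValueError` are assumptions made for illustration.

```python
X_MIN, X_MAX = 17, 100  # scope of the model stated in the problem

def predict_age_difference(literacy_rate):
    """Predicted age difference (years) from y = -0.0498x + 6.8, within scope only."""
    if not X_MIN <= literacy_rate <= X_MAX:
        raise ValueError(f"literacy rate {literacy_rate} is outside the model's scope")
    return -0.0498 * literacy_rate + 6.8

print(round(predict_age_difference(30), 1))  # 5.3
print(round(predict_age_difference(97), 1))  # 2.0
```

Calling `predict_age_difference(12)` or `predict_age_difference(0)` raises an error, which mirrors the answers to parts (b) and (d).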
The data below represent commute times (in minutes) and scores on a well-being survey. a) Find the least-squares regression line treating the commute time, x, as the explanatory variable and the index score, y, as the response variable. b) Interpret the slope. c) Interpret the y-intercept. d) Predict the well-being index of a person whose commute time is 30 minutes. e) Suppose Barbara has a 15-minute commute and scores 66.7 on the survey. Is Barbara more "well-off" than the typical individual who has a 15-minute commute?
a) ŷ = -0.096x + 69.025 b) For each additional minute of commute time, the index score falls by 0.096, on average. c) For a commute time of zero minutes, the index score is predicted to be 69.025. d) ŷ = (-0.096*30) + 69.025 = 66.1 e) No, Barbara is less well-off because the typical individual who has a 15-minute commute scores (-0.096*15) + 69.025 = 67.6, which is higher than her score of 66.7.
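Part (e) is a residual calculation: observed y minus predicted y. A short sketch using the regression equation from part (a) follows; the function and variable names are illustrative.

```python
def predicted_index(commute_minutes):
    """Predicted well-being index from y-hat = -0.096x + 69.025."""
    return -0.096 * commute_minutes + 69.025

observed = 66.7                    # Barbara's survey score
predicted = predicted_index(15)    # typical score for a 15-minute commute
residual = observed - predicted    # observed y minus predicted y

print(round(predicted, 3), round(residual, 3))
print("above typical" if residual > 0 else "below typical")
```

A negative residual means the observed value is below the predicted value, which is why Barbara is less well-off than the typical individual with the same commute.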
What is the slope-intercept form?
y = mx + b, where m is the slope and b is the y-intercept.
If events E and F are disjoint and the events F and G are disjoint, must the events E and G necessarily be disjoint? Give an example to illustrate your opinion.
No, events E and G are not necessarily disjoint. For example, E={0,1,2}, F={3,4,5}, and G={2,6,7} show that E and F are disjoint events, F and G are disjoint events, and E and G are events that are not disjoint.
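The counterexample can be verified directly by treating each event as a set of outcomes. This sketch uses the same three sets given in the answer above.

```python
E = {0, 1, 2}
F = {3, 4, 5}
G = {2, 6, 7}

print(E.isdisjoint(F))  # True
print(F.isdisjoint(G))  # True
print(E.isdisjoint(G))  # False
print(E & G)            # {2}
```

E and G share the outcome 2, so disjointness does not carry over from the pairs (E, F) and (F, G) to the pair (E, G).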
A(n) _______ is any collection of outcomes from a probability experiment.
Event
In probability, a(n) ________ is any process that can be repeated in which the results are uncertain.
Experiment
In a scatter diagram, the _______ variable is plotted on the horizontal axis and the _______ variable is plotted on the vertical axis.
Explanatory, response
T/F: If r is close to 0, then little or no evidence exists of a relation between the two quantitative variables.
False. A value of r close to zero does not imply no relation, just no linear relation.
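A quick way to see this is to compute r for data that follow a perfect but nonlinear pattern. The sketch below uses the hypothetical data y = x² on a symmetric set of x-values and a hand-written `pearson_r` helper; neither comes from the homework.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: y is completely determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)
```

Here r is zero even though y is an exact function of x, so a value of r near 0 only rules out a linear relation.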
What does it mean when a residual is positive?
If it is positive, then the observed value is greater than the predicted value.
What is the difference between univariate data and bivariate data?
In univariate data, a single variable is measured on each individual. In bivariate data, two variables are measured on each individual.
Two events E and F are ________ if the occurrence of event E in a probability experiment does not affect the probability of event F.
Independent
What does it mean if r=0?
No linear relationship exists between the variables.
_______ is a technique used to recreate a random event.
Simulation
The closer r is to +1, the _______ the evidence is of _______ association between the two variables.
Stronger, positive
Why should the cutoff for identifying unusual events not always be 0.05?
The choice of a cutoff should consider the context of the problem.
What is a residual?
The difference between an observed value of the response variable y and the predicted value of y.
Describe the difference between classical and empirical probability.
The empirical method obtains an approximate empirical probability of an event by conducting a probability experiment. The classical method of computing probabilities does not require that a probability experiment actually be performed. Rather, it relies on counting techniques, and requires equally likely outcomes.
In a certain card game, the probability that a player is dealt a particular hand is 0.47. Explain what this probability means. If you play this card game 100 times, will you be dealt this hand exactly 47 times? Why or why not?
The probability 0.47 means that approximately 47 out of every 100 dealt hands will be that particular hand. No, you will not be dealt this hand exactly 47 times since the probability refers to what is expected in the long-term, not short-term.
Why should correlations always be reported with scatter diagrams?
The scatter diagram is needed to see if the correlation coefficient is being affected by the presence of outliers.
What does it mean to say that two variables are negatively associated?
There is a linear relationship between the variables, and whenever the value of one variable increases, the value of the other variable decreases.
What does it mean to say that two variables are positively associated?
There is a linear relationship between the variables, and whenever the value of one variable increases, the value of the other variable increases.
Suppose Ari wins 44% of all bingo games. (a) What is the probability that Ari wins two bingo games in a row? (b) What is the probability that Ari wins six bingo games in a row? (c) When events are independent, their complements are independent as well. Use this result to determine the probability that Ari wins six bingo games in a row, but does not win seven in a row.
a) 0.44² = 0.1936 b) 0.44⁶ = 0.0073 c) 0.44⁶ * (1-0.44) = 0.0041
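The three answers follow from the Multiplication Rule for independent events and can be reproduced in a few lines. This sketch assumes, as the problem does, that the outcomes of separate bingo games are independent; the variable names are illustrative.

```python
p_win = 0.44

two_in_a_row = p_win ** 2
six_in_a_row = p_win ** 6
# Six wins followed by a seventh game that is not a win.
six_but_not_seven = six_in_a_row * (1 - p_win)

print(round(two_in_a_row, 4), round(six_in_a_row, 4), round(six_but_not_seven, 4))
# 0.1936 0.0073 0.0041
```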
Suppose you toss a coin 100 times and get 83 heads and 17 tails. Based on these results, what is the probability that the next flip results in a head?
0.83
Before interpreting a y-intercept, what two questions must be asked?
1. Is 0 a reasonable value for the explanatory variable (x)? 2. Do any observations near x=0 exist in the data set?
What is the linear correlation coefficient?
A measure of the strength and direction of the linear relationship between two variables.
What does it mean for an event to be unusual?
An event is unusual if it has a low probability of occurring.
A student at a junior college conducted a survey of 20 randomly selected full-time students to determine the relation between the number of hours of video game playing each week, x, and grade-point average, y. She found that a linear relation exists between the two variables. The least-squares regression line that describes this relation is y = -0.0572x + 2.9205 a) Predict the grade-point average of a student who plays video games 8 hours per week. b) Interpret the slope. c) If appropriate, interpret the y-intercept.
a) y = (-0.0572*8) + 2.9205 = 2.46 b) For each additional hour that a student spends playing video games in a week, the grade-point average decreases by 0.0572 points, on average. c) The predicted grade-point average of a student who does not play video games (x=0) is 2.9205.
Explain what each point on the least-squares regression line represents.
Each point on the least-squares regression line represents the predicted y-value at the corresponding value of x.