STAT 201 Final

Kelsey is playing a dice rolling game. If she rolls a 5 or 6, she wins $6. If she rolls less than a 5, she has to pay $3. The expected value of this game is __________. Use exact probabilities in your calculations.

$0 Expected value = $6*(1/3) - $3*(2/3) = $2 - $2 = $0.
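
The same calculation can be checked with exact fractions. This is only a sketch; the `payoff` function name is an assumption made for illustration.

```python
from fractions import Fraction

def payoff(face):
    """Dollar payoff for one roll: 5 or 6 wins $6, anything lower costs $3."""
    return 6 if face >= 5 else -3

# Each face of a fair six-sided die has exact probability 1/6.
expected_value = sum(Fraction(1, 6) * payoff(face) for face in range(1, 7))
print(expected_value)  # 0
```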

300 people were asked the following question: How many times have you accessed the internet this week? - none - once or twice - three or four times - more than four times Which is the BEST method of displaying the data produced from that question?

Bar graph While this variable initially looks like a quantitative variable, we have grouped the responses into 4 distinct groups, making a bar chart the best choice

Which of the following makes no distinction between explanatory and response variables?

Correlation makes no distinction between explanatory and response variables. The distinction between explanatory and response variables is very important in regression.

The three basic requirements of a(n) _____ experiment are: the presence of an institutional review board, subject confidentiality, and informed consent.

Ethical

The box in a boxplot spans between which two values?

First quartile and third quartile The third quartile is part of the five-number summary.

Which type of plot is best to use to display the distribution of a single quantitative variable?

Histogram

Which is used for prediction? a) correlation b) regression c) alignment d) slope

In regression we model the relationship between x and y and we can use the model to make predictions.

______ squares regression makes the sum of the squared prediction errors as small as possible.

Least The least squares line is the line that has the smallest sum of squared prediction errors. A prediction error is the vertical distance from a data point to the line.
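
A small sketch of this idea, using made-up data and the standard least squares formulas for slope and intercept (the data and function names are assumptions, not part of the card): moving the fitted line away from the least squares values makes the sum of squared errors larger.

```python
# Made-up (x, y) data, used only to illustrate the idea.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def sum_squared_errors(intercept, slope):
    """Sum of squared vertical distances from the points to y = intercept + slope*x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Standard least squares formulas for the slope and intercept.
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

best = sum_squared_errors(intercept, slope)
# Nudging the intercept or the slope makes the total squared error larger.
print(best < sum_squared_errors(intercept + 0.1, slope))  # True
print(best < sum_squared_errors(intercept, slope + 0.1))  # True
```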

Institutional review boards (IRBs) are tasked with reviewing all studies involving human subjects to protect their rights and welfare. One criticism of the IRB process is that when they are overworked they may be tempted to categorize some studies as ______ risk when they shouldn't be.

Minimal Placing a study in the minimal risk category speeds up the process

The mistake of believing that, in 10 tosses of a fair coin, the sequence THTHHTTTHH is more likely than TTTTTHHHHH (where T is tail and H is head) is an example of the _____.

Myth of short run regularity Chance behavior is unpredictable in the short run and shows a regular pattern only in the long run. Thus, both sequences are equally likely.
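
To see why the two sequences are equally likely, one can multiply the probability of each toss. A minimal sketch, assuming a fair coin and independent tosses (`sequence_probability` is a name chosen for illustration):

```python
from fractions import Fraction

def sequence_probability(sequence, p_heads=Fraction(1, 2)):
    """Probability of one specific sequence of independent coin tosses."""
    prob = Fraction(1)
    for toss in sequence:
        prob *= p_heads if toss == "H" else 1 - p_heads
    return prob

print(sequence_probability("THTHHTTTHH"))  # 1/1024
print(sequence_probability("TTTTTHHHHH"))  # 1/1024
```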

The mistake of believing that, in 6 spins of the roulette wheel, the sequence RBRBBR is more likely than RRRBBB (where R is red and B is black) is an example of the _____.

Myth of short run regularity Chance behavior is unpredictable in the short run and shows a regular pattern only in the long run. Thus, both sequences are equally likely.

Susan is trying to predict if her next baby will be a boy or a girl. She currently has 5 kids and all are girls. For this next pregnancy, she says that the baby HAS to be a boy because of all of the previous girls. What myth is Susan falling victim to?

Myth of the law of averages She is mistakenly assuming that if a girl is born several times in a row, then it becomes increasingly more likely that the next one should be a boy. This is the myth of the law of averages.

Lily claims that she can feel it in her bones that the chances of a particular candidate winning the next election is 95%. Lily's belief is an example of a _____.

Personal probabilities This number is based on judgement, not long-term models.

The odds of Stephanie winning the 100 meter dash are F to S. This corresponds to a probability of _____.

S/(F+S) There are S wins for every F losses. Therefore, the probability is the number of expected wins over the total number of outcomes.

In clinical trials it is _____ acceptable to allocate half the subjects to a placebo.

Sometimes

Standard Scores

Standard scores do not measure area under a Normal distribution; they measure value in a distribution. If the distribution is Normal, once you have the standard score, you can find the proportion less than the value of interest using the standard Normal table.

An investigation of the relationship between height of a child and age (in months) found r^2 = 0.64. Knowing this, we can say _____________.

The age of a child explains 64% of the variation in height. As children grow older, they grow in both height and weight, so we can say that age explains r^2 = 64% of the variation in height.

What does r^2 tell us?

The percentage of variation in the y's that is explained by the least squares regression of y on x.

The law of large numbers states that, as the number of times you roll a fair die increases, the proportion of times that you roll a 6 approaches _________. Please give your answer in fraction form.

The proportion approaches the probability of rolling a 6. Thus, the proportion of 6's will be approximately 1/6
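
A quick simulation can illustrate this. The sketch below assumes a fair die simulated with Python's `random` module; the seed and the roll counts are arbitrary choices. The printed proportions vary from seed to seed, but they should settle near 1/6 (about 0.167) as the number of rolls grows.

```python
import random

random.seed(201)  # arbitrary seed so the run is repeatable

def proportion_of_sixes(num_rolls):
    """Roll a simulated fair die num_rolls times and return the share of sixes."""
    rolls = [random.randint(1, 6) for _ in range(num_rolls)]
    return rolls.count(6) / num_rolls

for num_rolls in (10, 1_000, 100_000):
    print(num_rolls, proportion_of_sixes(num_rolls))
```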

What does the area under a density curve above a range of values tell us?

The proportion of all possible observations in the range.

What is the variable of the sampling distribution for the mean comprehensive score from all possible samples of size 500 of students taking the ACT exam (a national college admissions exam) in 2010?

The statistic: mean comprehensive score in all possible samples of 500 students This is the variable of the sampling distribution.

What is the variable of the sampling distribution for the mean number of friends from all possible samples of size 1200 Facebook pages?

The statistic: mean number of friends of all possible samples of 1200 Facebook pages

The _____ of a sampling distribution is a statistic.

The variable of a sampling distribution is a statistic. For example, a sampling distribution could be the distribution of all the means from all possible samples from a population distribution.

Thirty percent of observations from a distribution are _____ the value of the 30th percentile of that distribution.

Thirty percent of observations are less than the value of the 30th percentile.

Which of the following correctly explains why this statement is true or false? "Too much precision or regularity can lead to a suspicion that the data is being manipulated".

True because this is a possible indication of fraud.

Is it sometimes OK to report an increase of over 100%?

Yes because it is possible to have more than a 100% increase

Is it ethical to give half of the subjects in a clinical trial a placebo when the other half of subjects get the medical treatment?

Yes, because the placebo does not have harmful side effects and it is necessary in order to compare the treatment to a baseline.

The distribution of a random variable is __________.

equal to the values it can take and how often it takes them

Density Curve

smoothed histogram "idealized"

Jimmy is proofreading a colleague's experimental report where he tested 100 different students on a standardized test. The students were split up into 4 groups, with 25 students in each group. Each of the four groups received a different treatment before the test. The percentage of students who passed in each group was 63%, 55%, 80%, and 79%. Jimmy said that this is an error because _______________.

the numbers are not consistent with each other. If there were 25 students in each group, each student would count for 4%, so the corresponding percentages would have to be multiples of 4. 63%, 55%, and 79% are not multiples of 4.
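
Jimmy's check can be written as a short test of whether each percentage corresponds to a whole number of students. This is a sketch under the assumption that the percentages were reported exactly rather than rounded; the function name is made up for illustration.

```python
def is_possible_percentage(percent, group_size):
    """True if `percent` of `group_size` people is a whole number of people."""
    return (percent * group_size) % 100 == 0

reported = [63, 55, 80, 79]
impossible = [p for p in reported if not is_possible_percentage(p, 25)]
print(impossible)  # [63, 55, 79]
```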

Chance behavior is _____ in the short run.

unpredictable But over the long run, chance behaviors have predictable patterns.

From a computer simulation of rolling a six-sided die ten times, the following data were collected based on the number of spots showing: 5 5 1 3 2 1 5 6 5 1 This means the probability of rolling 4 is 1/_____ .

1/6 Not having seen a four on these 10 rolls does not mean it won't happen. The die is balanced so the probability is 1/6 which is approximately 0.17.

Dan has been saving $8 each week in a box under his bed. The equation that predicts y (how much money he has at week x) is ŷ = 20 +8x. The value of r^2 for this relationship is ______.

100% There is no variability in the amount saved each week so the prediction is exact.
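
One way to check this is to compute r^2 as 1 minus the ratio of unexplained variation to total variation, which applies here because the line fits every point exactly. The range of weeks used below (0 through 10) is an assumption for illustration.

```python
weeks = list(range(11))                   # x: week number (assumed range 0 to 10)
savings = [20 + 8 * x for x in weeks]     # y: dollars actually in the box
predicted = [20 + 8 * x for x in weeks]   # predictions from y-hat = 20 + 8x

mean_y = sum(savings) / len(savings)
# Variation left unexplained by the line (sum of squared prediction errors).
sse = sum((y - p) ** 2 for y, p in zip(savings, predicted))
# Total variation of y around its mean.
sst = sum((y - mean_y) ** 2 for y in savings)

r_squared = 1 - sse / sst
print(r_squared)  # 1.0
```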

High blood cholesterol increases your risk of heart attack and stroke. Cholesterol levels in young women are approximately Normal with mean 189 mg/dl and standard deviation 40 mg/dl. About 34% of women will have levels between _____.

189 and 229 In a Normal distribution, about 68% of all observations are within one standard deviation of the mean. 229 is one standard deviation above the mean and 189 is the mean. Half of the 68% will be between 189 and 229.
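
The 34% figure comes from the 68-95-99.7 rule, which is an approximation. As a rough check, Python's `statistics.NormalDist` can compute the area directly for the mean and standard deviation given in the question:

```python
from statistics import NormalDist

cholesterol = NormalDist(mu=189, sigma=40)
# Area under the Normal curve between the mean and one standard deviation above it.
share = cholesterol.cdf(229) - cholesterol.cdf(189)
print(round(share, 3))  # 0.341
```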

The price of a certain product rose from $80 to $100. The percentage increase is ___________.

25.0% % change = 20/80 * 100 = 25.0%
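
The percent change formula used here, written as a small helper (the function name is an assumption for illustration):

```python
def percent_change(old, new):
    """Percent change from old to new, relative to the old value."""
    return (new - old) / old * 100

print(percent_change(80, 100))  # 25.0
```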

The odds for Billy's Goats to win the local grass eating competition are 12 to 5. This corresponds to a probability of _____.

5/17 There are 5 wins to every 12 losses, out of every 17 outcomes.
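
A sketch of the conversion, following the convention these cards use (odds of 12 to 5 read as 12 losses for every 5 wins). The function and parameter names are assumptions for illustration.

```python
from fractions import Fraction

def odds_to_probability(losses, wins):
    """Probability of winning when the odds are `losses` to `wins`."""
    return Fraction(wins, losses + wins)

print(odds_to_probability(12, 5))  # 5/17
```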

If a coin is flipped four times and we are interested in the event "two heads and two tails," the event will have _____ outcomes in it

6 They are HHTT, HTHT, HTTH, TTHH, THHT, and THTH.
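
The outcomes can also be listed by brute force. A minimal sketch using `itertools.product` to generate every sequence of four flips and keep those with exactly two heads:

```python
from itertools import product

# All sequences of four flips, each flip being H or T.
all_outcomes = ["".join(flips) for flips in product("HT", repeat=4)]
# The event "two heads and two tails".
event = [outcome for outcome in all_outcomes if outcome.count("H") == 2]

print(len(all_outcomes))  # 16
print(len(event))         # 6
print(event)
```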

In April's class, the children averaged a 60 on the first test. After the second test, she told them that they raised the average by 20%. The average percentage on the second test was _________.

72% 20 = (change in value)/60 * 100, so the change in value = 20*60/100 = 12. The second-test average is 60 + 12 = 72.

If a coin is flipped three times, the number of all possible outcomes is _______.

8 Each flip has two possible outcomes (heads or tails). The possible scenarios are: HHH, HHT, HTH, HTT, TTT, THT, THH, and TTH.

Suppose that the time to complete a test is Normally distributed with mean 45 minutes and standard deviation 15 minutes. If the test period is 75 minutes long, we'll expect about _____ % of the students to finish.

97.5 75 minutes is two standard deviations (2*15) above the mean of 45 minutes. Normal distributions have about 95% of the area within two standard deviations of the mean. 5% of the area is left for the tails, which is divided by 2 because of symmetry. About 97.5% of students would be able to finish.
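
The 97.5% answer relies on the 68-95-99.7 rule. As a hedged comparison, the sketch below computes both the rule-of-thumb value and the area given by `statistics.NormalDist`, which is slightly larger because "95% within two standard deviations" is itself rounded.

```python
from statistics import NormalDist

finish_time = NormalDist(mu=45, sigma=15)

# Rule of thumb: 95% in the middle plus half of the remaining 5% (the lower tail).
rule_of_thumb = 0.95 + (1 - 0.95) / 2
# Area below 75 minutes under the Normal curve.
computed = finish_time.cdf(75)

print(round(rule_of_thumb, 3))  # 0.975
print(round(computed, 3))       # 0.977
```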

Three Normal distributions all have mean 20. Distribution A has standard deviation 1, distribution B has standard deviation 5, and distribution C has standard deviation 10. The distribution with the sharpest peak is Distribution _____.

A Because this distribution has the smallest standard deviation, it means that more observations will be clustered near the mean.

The median of any distribution is the _____ th percentile.

Fifty percent of observations are less than the median, so the median is the 50th percentile.

What can we say about the bars of a histogram?

For visual accuracy, the bars must be of equal width.

One measure of the physical abilities of football players is the time in which they run a 40 yard dash. In one league, the mean time is 4.4 seconds with standard deviation 0.15 seconds. The fastest anyone has ever run the 40 yard dash in this league is 4.2 seconds. Which of the following statements is true of the variable "Time to run the 40 yard dash"?

It is not normally distributed. If it were Normally distributed, the variable would have to obey the 68-95-99.7 Rule. The fastest time ever is only 1.33 standard deviations below the mean time of 4.4 seconds.
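
The standard score in this explanation can be reproduced as follows. The sketch also asks what share of times would fall below the record if the variable really were Normal; treating the league's mean and standard deviation as exact is an assumption.

```python
from statistics import NormalDist

def standard_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

z = standard_score(4.2, mean=4.4, sd=0.15)
print(round(z, 2))  # -1.33

# Under a Normal model, this share of times would be faster than the record.
share_faster = NormalDist().cdf(z)
print(round(share_faster, 2))  # 0.09
```

A Normal model would put roughly 9% of times below the fastest time ever recorded, which is why the card rejects it.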

Data was collected every ten years over a hundred and fifty year period on x = number of Methodist ministers for the year and y = number of barrels of rum imported for the same year. The correlation between x and y is r = +1.0. How should we classify this correlation?

Nonsense because there is no direct connection between number of ministers and number of barrels of rum imported.

Which is not true of a histogram?

a number can be included in only two classes A number from the data set belongs in just one class

When examining the change in price over time, it can be misleading to plot the _______ increase from time period to time period.

actual It can be misleading to plot the actual increase, so plot the percent change from the previous time instead.

In a clinical trial, it is agreed that the interests of the subjects must _____ prevail over society and science.

always

The general form for a linear equation is given as: y = a + bx. In the equation, _______ tells us about how y changes when x changes.

b The quantity that tells us the change in y as x increases by one unit is the slope, b. A regression line is only valid when the relationship between x and y is linear.

Pictogram

bar chart where pictures replace the bars

The measured value is equal to the true value plus _____________.

both bias and random error

Robert wants to remodel his kitchen. He measures the width of the kitchen 3 times with a tape measure and gets a measurement of 10.2 ft, 10.4 ft, and 10.3 ft. The true width of his kitchen according to a professional is 11.1 ft. The difference between the true kitchen width and his measurements can be attributed to ___________.

both bias and random error Because his measurements systematically understate the true width, the measurements are biased. However, there is also random error in Robert's measurements because repeated measurements of the same width give different results.

Ryan weighs himself on the scale five times and the scale says that he weighs 180, 180, 180, 180, and 180. He just went to the doctor and found his exact weight to be 176 lbs. Which of the following terms can be used to describe the measurements given by this scale?

both biased and reliable measurements The results systematically overstate Ryan's weight, so the results are biased. The measurements are also reliable because they give the same weight every time (even though it is an incorrect weight).

It is unethical to have patients participate in _____ trials even when the active drug (not the placebo) is not believed to help the patients taking the drug.

clinical Medical treatments can be tested in clinical trials only when there is reason to hope that they will help the patients who are subjects in the trials.

Points scored by a professional basketball player in his first 35 games of the season are given in the following stemplot (n = 35). The shape of the distribution of these scores is approximately symmetric. For these data, the mean is _____ the median.

0|5
1|01233
1|6678
2|00111234
2|5555667889
3|0014
3|56
4|2

close to The mean is close to the median when the shape is approximately symmetric.

The correlation _____ r has no units.

coefficient Because r is based on the product of standard scores, it has no units.

The mistake of believing that, in 4 tosses of a fair die, getting a 1123 on November 23rd is more than a random occurrence is an example of the myth of surprising _____.

coincidence Chance behavior is unpredictable in the short run and shows a regular pattern only in the long run. Therefore, while unusual, this result is not as unlikely as one may suspect. The myth at work here is the myth of surprising coincidence.

Rates are more valid measurements than ________.

counts Rates are more valid than counts because they take into account the number of people.

The first step in analyzing numeric data involving two variables is to ____________.

create a scatterplot. The first step in any data analysis should be to plot the data. Because we have numeric data involving two variables, a scatterplot is appropriate.

Fred is playing blackjack. The probability of getting an Ace on the first card dealt to him is _____ of the probability of getting a King on the second card.

dependent These are dependent because the card Fred gets on the first draw directly affects the probability of drawing a king on the second card

When examining a single variable, we look for shape, center, spread, and outliers. When examining a relationship between two numeric variables, we look at form, ________ , strength, and outliers.

direction We want to know if the relationship is increasing, decreasing, or neither.

The correlation coefficient, r, measures the ______ and strength of a linear relationship.

direction The correlation coefficient measures the direction and strength of a linear relationship.

A(n) _____________ is a set of outcomes of interest in a random process.

event

To establish clearly that an explanatory variable causes changes in a response variable, we need to perform a(n) ___________.

experiment Lurking variables are always potential problems in observational studies. A valid experiment is necessary to draw conclusions about the explanatory variable causing changes in the response variable.

A researcher wants to know if taking increasing amounts of ginkgo biloba will result in increased capacities of memory ability for different students. They administer it to the students in doses of 250 milligrams, 500 milligrams, and 1000 milligrams. The amount of ginkgo will be plotted on the x axis because _____ variables are plotted on the x axis

explanatory Here, the amount of ginkgo is the explanatory variable that the researchers hope will affect memory ability. Explanatory variables are plotted on the x axis.

Weather forecasters tell people where a hurricane might strike so that they can evacuate and save their lives. But sometimes people evacuate and the hurricane misses the area, making them angry at the forecasters and civil defense authorities who ordered the evacuation. Although the forecasters' models are becoming better and better, the orders to evacuate an area are still based on _____________.

extrapolation Forecasters and civil defense authorities still have to order evacuations well in advance of the storm's arrival. This is extrapolation, but here it is used in the interests of people's safety.

One of the reasons to use a line graph is to:

help us visualize any cycles or trends in the data Line graphs tell us how a quantitative variable changes with time. Both the x axis (time) and the y axis are quantitative.

A simulation model assumes that each outcome is _____ of each other.

independent Outcomes must be independent of one another or simulations will not work.

One principle of ethical studies that is not always necessary in behavioral experiments is _____________.

informed consent

Taking a measurement involves using a unit of measurement, using a variable, and using a(n) _____________ of measurement.

instrument

A researcher decides to measure athletic ability based on an IQ test. This measurement is a(n) ____________ measurement.

invalid

A new synthetic oil company claims a 110% reduction in engine gunk. Which of the following is True?

it doesn't make sense to talk about a 110% reduction You cannot take away more than 100% of anything.

If deception occurs in a behavioral or social science experiment, _____.

it must be explained as soon as possible

The linear correlation coefficient, r, measures the direction and strength of a ________ relationship.

linear

Before computing the correlation coefficient, r, we need to know the _______ and standard deviation of both variables.

mean Because r is calculated based on z-scores, we need the mean and standard deviation to be able to calculate these.

A statistics class has 245 students. To find the median score on the first midterm, you should first order the exam scores and then find the score in the _____ position.

middle (123rd) (n + 1)/2 = (245 + 1)/2 = 123. So, the median is the midterm score in the 123rd position.
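
A sketch of the same rule on hypothetical data: the 245 scores below are randomly generated stand-ins, not real exam results, and the seed is arbitrary. The score in position (n + 1)/2 of the ordered list matches the median that Python's `statistics.median` reports.

```python
import random
from statistics import median

random.seed(245)  # arbitrary seed so the run is repeatable
scores = [random.randint(40, 100) for _ in range(245)]  # hypothetical exam scores

ordered = sorted(scores)
position = (len(ordered) + 1) // 2   # (245 + 1) / 2 = 123
print(position)  # 123

# Lists are indexed from 0, so the 123rd score is at index 122.
print(ordered[position - 1] == median(scores))  # True
```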

Subjects in clinical trials _____.

must be a priority over the interests of science and society

Jim is interested in testing the efficacy of a new drug on breast cancer. He follows 10 women who have received the drug and records their reactions to the drug. This is an example of a(n) _____________.

observational study This is an observational study because Jim is only observing patients who are receiving the drug but he was not the one who imposed those treatments. This is not a carefully designed experiment which contains a control group.

The APA requires consent unless a study merely _____ behavior in a public place.

observes

Every time a hurricane approaches, some people choose to "ride it out" despite warnings from government officials and weather forecasters. These people are basing their decisions in part on guesswork, such as believing the storm will not be as bad as forecasted. The basis for this type of personal decision can be called _____ probability.

personal They are combining known facts with their instincts, beliefs about the strength of the storm, and other personal attitudes like not wanting to abandon their property. In many cases, their decisions are foolish.

If the area to the right of a standard score is less than 0.5, the standard score is _____.

positive. If the area to the right is less than 0.5, the value is above the mean.

A tree diagram shows the possible outcomes and their _____ at each stage of a simulation.

probability A tree diagram is used to give the probability model a graphical form, showing the probabilities of each stage.

Simulations of _____ use a probability model.

probability Simulations need a probability model in order to randomly generate outcomes.

An event is _____ if individual outcomes are uncertain but happen in a predictable manner through time

random Randomness means that we do not know what will happen on any one trial, but over the long run, a pattern of sorts emerges.

Ryan weighs himself on the scale five times and the scale says that he weighs 171, 181, 174, 176, and 177. His true weight is 176. This is an example of _______________________.

random error The results do not seem to systematically overstate or understate Ryan's weight. They give different results both above and below the true weight.
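
Bias and random error can be separated numerically. The sketch below compares these readings with the earlier card in which the scale read 180 five times; using the mean minus the true value for bias and the standard deviation for random error is one reasonable choice, and the helper name is an assumption.

```python
from statistics import mean, stdev

def bias_and_spread(readings, true_value):
    """Return (bias, spread): average over- or understatement, and variation between readings."""
    return mean(readings) - true_value, stdev(readings)

true_weight = 176
steady_scale = [180, 180, 180, 180, 180]   # biased but reliable
varying_scale = [171, 181, 174, 176, 177]  # little bias, noticeable random error

for name, readings in (("steady", steady_scale), ("varying", varying_scale)):
    bias, spread = bias_and_spread(readings, true_weight)
    print(name, round(bias, 1), round(spread, 1))
```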

Simulation is the best way to study _____.

randomness Simulations, whether physical (such as tossing a die), or done on a computer, are a good way to study randomness.

When the explanatory and response variables switch axes in a scatterplot, the correlation _____

remains the same If we reverse our choice of which variables to call x and which to call y, the correlation does not change.
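
This can be checked by computing r as the average product of standard scores. The ages and heights below are made-up numbers for illustration only. The same sketch also shows that changing units (centimeters to inches) leaves r unchanged, since r has no units.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of standard scores, divided by n - 1."""
    mx, my, sx, sy = mean(xs), mean(ys), stdev(xs), stdev(ys)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (len(xs) - 1)

ages_months = [24, 36, 48, 60, 72]               # made-up data
heights_cm = [86.0, 95.0, 102.5, 109.0, 115.5]   # made-up data
heights_in = [h / 2.54 for h in heights_cm]      # same heights in inches

r_xy = correlation(ages_months, heights_cm)
r_yx = correlation(heights_cm, ages_months)      # x and y swapped
r_inches = correlation(ages_months, heights_in)  # different units for height

print(abs(r_xy - r_yx) < 1e-12)      # True
print(abs(r_xy - r_inches) < 1e-9)   # True
```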

The standard deviation determines the _____ of the normal curve.

shape The standard deviation controls the spread and shape of the distribution.

The key to successful _____ is thinking carefully about the probability model.

simulation Getting the probability model correct in the first place is essential to getting the simulation correct.

Using software or other random procedures to imitate chance behavior is called _____

simulation This is what we use to imitate taking many, many random samples to investigate the behavior of the sample mean

A quantity that measures the amount of variation in y explained by a regression model is the ____________ of the correlation coefficient.

square r^2 measures the amount of variation in y that is explained by the regression model.

It would be against basic data ethics principles to publish individual's results, but _____ results are publishable.

summary

Why is a pictogram misleading?

To show a difference between groups, the image is magnified in both height and width to avoid distorting the picture. The area of the picture then grows faster than the value it represents, which exaggerates the difference.

When using a tree diagram, one will have to do many different stages of simulations because _____.

the model is not independent. Because the model is not independent, we have to conduct many different stages of simulations in order to produce proper simulations.

Institutional review boards (IRBs) are tasked with reviewing all studies involving human subjects to protect their rights and welfare. One criticism of the IRB process is that ___________.

they can be overworked Overworking an IRB calls into question its ability to fully protect the rights and welfare of the human subjects.

When examining a single variable, we look for shape, center, spread, and outliers. When examining a scatterplot, we look to _____.

understand the form, direction, strength, and outliers We must have two numeric (quantitative) variables to see these qualities.

A variable is a(n) _________ measure of a property if it is relevant as a representation of that property.

valid Def: valid measurement

Institutional review boards (IRBs) are tasked with reviewing all studies involving human subjects to protect their rights and _____.

welfare

