STAT FINAL

Z = (X - μ) / σ

(sample proportion - population proportion) / SD of the sampling distribution

sample proportion (p-hat)

- Proportion (or percentage) of successes in a sample or - the number of individuals in a sample with a certain characteristic, divided by the sample size.

If the sample size is large enough in a sampling distribution, we know that: The sampling distribution is approximately ________________. The center of the sampling distribution is equal to the population parameter (____________________________________). The standard deviation of the sampling distribution can be determined by solving this equation:_____________________

Normal in shape; the population proportion (in this case, the claimed population proportion, or p); SD = √(p(1-p)/n)

What is the probability your sample proportion of orange candies will be 0.63 or larger? p=0.5 n=100

0.47%
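
A minimal sketch of how the 0.47% can be checked, assuming Python's standard-library `statistics.NormalDist` stands in for the Normal table; the variable names are illustrative and not part of the original card.

```python
from math import sqrt
from statistics import NormalDist

p = 0.5    # claimed population proportion
n = 100    # sample size

# SD of the sampling distribution: sqrt(p(1-p)/n)
sd = sqrt(p * (1 - p) / n)

# z-score for a sample proportion of 0.63
z = (0.63 - p) / sd

# Upper-tail probability: P(p-hat >= 0.63)
prob = 1 - NormalDist().cdf(z)

print(f"SD = {sd:.3f}, z = {z:.1f}, probability = {prob:.2%}")
```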

The probability that an event does not occur is

1 minus the probability that the event does occur (complement rule)

Confidence Interval Interpretations should contain

1. CI 2. Population 3. proportion/mean

In what percentage of samples would we expect to see 61% or fewer Americans handing out candy on Halloween? mean = 64%

6.68%

Can the amount of protein in a fast food sandwich help predict the amount of calories in that sandwich? Data was gathered from a random sample of 126 fast food sandwiches. From each sandwich, researchers obtained the amount of protein(in grams, or g) and the number of calories. Predicted Calories = 207.59 + 12.96 (Protein) r= 0.79 _____________________of the variability in calories can be explained by the regression equation, while___________________of the variability in calories cannot be explained by the regression equation.

62.41% and 37.59% (r² = 0.79² = 0.6241, and 100% - 62.41% = 37.59%)
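
As a quick check on this card, the explained share of variability is r squared and the unexplained share is whatever is left over. The short sketch below only squares the r given in the question; the variable names are illustrative.

```python
r = 0.79                      # correlation given in the question

explained = r ** 2            # r-squared: share of variability explained
unexplained = 1 - explained   # share not explained by the regression

print(f"explained:   {explained:.2%}")
print(f"unexplained: {unexplained:.2%}")
```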

In what percentage of samples would we expect to see 63% or more Americans handing out candy on Halloween? mean = 64%

69.15%

probability

= # of occurrences / # of trials

confidence interval formula for a population proportion

= p-hat +/- z*√(p-hat(1-p-hat)/n)

One- vs. Two-sided Hypotheses

A one-sided (or one-tailed) hypothesis test uses a > or a < sign. A two-sided (or two-tailed) hypothesis test uses a ≠ sign. The null hypothesis (H0) always gets the = sign. The alternative hypothesis (Ha) can have a >, <, or ≠ sign.

7. Which of the following values can sometimes be negative? A. A population proportion B. A test statistic C. A p-value D. None of the above can ever be negative. E. All of the above can sometimes be negative.

A proportion can never be negative. The smallest it can be is 0 and the largest it can be is 1. The same goes for a p-value, which is a probability. A test statistic is another name for a standard score or a z-score. Z-scores can be negative or positive.

Confidence Interval (CI)

A range of values, calculated from the sample observations, that is believed, with a particular probability, to contain the true value of a population parameter. A 95% confidence interval, for example, implies that were the estimation process repeated again and again, then 95% of the calculated intervals would be expected to contain the true parameter value

What happens to our sample proportions as we increase sample size? The graph to the right shows what happened when we drew 10,000 samples, each of size n = 45, and then displayed the proportion of nouns from each sample. Again, in the upper left-hand corner of the graph, you can see what the mean (or center) of the distribution of sample proportions is, and you can see what the standard deviation (or SD) of that distribution is. Along the y-axis, we see our frequencies, and it's far less likely, when we increase the sample size, to see sample proportions as low as 0. A. How would you describe the shape of the distribution of sample proportions above, from samples of size n = 45? B. The mean of the distribution of sample proportions is equal to what value? C. The sample proportions appear to range from _____________, with a standard deviation (or SD) equal to _______________

A. This distribution appears symmetric, and close to Normal. B. The mean is equal to 0.157. C. 0 to 0.40, with a standard deviation (or SD) equal to 0.050.

The graph to the right shows what happened when we drew 10,000 samples, each of size n = 10, and then displayed the proportion of nouns from each sample. Notice in the upper left-hand corner of the graph, you can see what the mean (or center) of the distribution of sample proportions is, and you can see what the standard deviation (or SD) of that distribution is. If you look along the y-axis in the graph, you'll see that a little less than 2000 of the sample proportions were 0, meaning a little less than 2000 of the random samples ended up with no nouns at all. A. How would you describe the shape of the distribution of sample proportions above, from samples of size n = 10? B. The mean of the distribution of sample proportions is equal to what value? C. The sample proportions appear to range from ___________ with a standard deviation ____________________

A. The distribution has a shape that is skewed to the right. B. The mean is equal to 0.157. C. 0 to 0.8, with a standard deviation (or SD) equal to 0.112.

A 95% confidence interval is 34%±6%. A 90% confidence interval based on this same sample would have: A. The same center and a larger margin of error. B. The same center and a smaller margin of error. C. A larger margin of error and probably a different center. D. A smaller margin of error and probably a different center. E. The same center, but the margin of error changes randomly.

B. The same center and a smaller margin of error.

Many states sell instant lottery tickets in which you scratch off certain areas to see if you have won a prize. You have obtained a random sample of 1000 people who have purchased such tickets, and you are examining the proportion of these people who have won prizes. You know the sampling distribution (for samples of size n = 1000) of the proportion of winners of prizes from these tickets is approximately Normal, with a mean of 0.20 and a standard deviation of 0.01. What percentage of samples chosen at random will have a proportion of winners greater than .22? What percentage of samples will have a proportion of winners less than .19? What percentage of samples would you expect to have a proportion of winners less than 0.183? Greater than 0.226?

Begin by sketching the sampling distribution. Again, we know it will be centered at 0.20, with a standard deviation of 0.01. We also know it will be NORMAL.
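
The four percentages asked for can be read off that sketch. Below is a minimal check, assuming `statistics.NormalDist` from the Python standard library in place of a Normal table; the variable names are illustrative. Note that the Empirical Rule gives rounder values for the first two (about 2.5% and 16%).

```python
from statistics import NormalDist

# Sampling distribution of the proportion of winners (n = 1000)
dist = NormalDist(mu=0.20, sigma=0.01)

p_gt_022 = 1 - dist.cdf(0.22)    # z = +2.0
p_lt_019 = dist.cdf(0.19)        # z = -1.0
p_lt_0183 = dist.cdf(0.183)      # z = -1.7
p_gt_0226 = 1 - dist.cdf(0.226)  # z = +2.6

print(f"greater than 0.22:  {p_gt_022:.2%}")
print(f"less than 0.19:     {p_lt_019:.2%}")
print(f"less than 0.183:    {p_lt_0183:.2%}")
print(f"greater than 0.226: {p_gt_0226:.2%}")
```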

A 90% confidence interval for a population proportion calculated using data from a random sample of size n = 200 is (0.36, 0.50). Which of the following is the 99% confidence interval calculated from the same data? A. (0.43, 0.53) B. (0.40, 0.46) C. (0.33, 0.53) D. (0.42, 0.44) E. (0.25, 0.61)

C. (0.33, 0.53)
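
One way to see why C is the best choice: the 99% interval keeps the same center (0.43) and stretches the 90% margin of error by the ratio of the two z* values. The sketch below assumes z* comes from `statistics.NormalDist.inv_cdf`; the `mismatch` helper is illustrative only and simply picks the option closest to the rescaled interval.

```python
from statistics import NormalDist

lower90, upper90 = 0.36, 0.50
center = (lower90 + upper90) / 2   # sample proportion, 0.43
moe90 = (upper90 - lower90) / 2    # 90% margin of error, 0.07

z90 = NormalDist().inv_cdf(0.95)   # z* for 90% confidence, about 1.645
z99 = NormalDist().inv_cdf(0.995)  # z* for 99% confidence, about 2.576
moe99 = moe90 * z99 / z90          # same standard error, larger z*

options = {
    "A": (0.43, 0.53),
    "B": (0.40, 0.46),
    "C": (0.33, 0.53),
    "D": (0.42, 0.44),
    "E": (0.25, 0.61),
}

def mismatch(bounds):
    """Distance of an option from the expected center and margin of error."""
    lo, hi = bounds
    return abs((lo + hi) / 2 - center) + abs((hi - lo) / 2 - moe99)

best = min(options, key=lambda k: mismatch(options[k]))
print(f"expected 99% CI: ({center - moe99:.2f}, {center + moe99:.2f}); closest option: {best}")
```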

According to a recent news report, 78.8% of dog owners bring their dogs to the veterinarian at least once a year for routine or preventative care. Imagine that we choose a simple random sample of 350 dog owners and we ask these dog owners if they bring their dogs to the veterinarian at least once a year. We know the percentage of dog owners who answer "yes" to this question will vary, or be different, from sample to sample, if the sampling method is repeated. In fact, if we look at the resulting sampling distribution (based on samples of size n = 350), we will see a distribution that is Normal in shape, with a mean (or center) of 78.8% and a standard deviation of 2.2%. Because the distribution has a Normal shape, we know that approximately 68% of the values in this distribution will be between A. 78.8% and 81%. B. 76.6% and 78.8%. C. 76.6% and 81%. D. 74.4% and 83.2%. E. 65.8% and 70.2%.

C. 76.6% and 81%.

You conduct a hypothesis test with a sample of size n = 40 and you observe values for the sample mean and sample standard deviation that do not lead to the rejection of the null hypothesis. In fact, you determine the p-value is 0.0667. What would you expect to happen to the p-value if the sample size was larger? Assume here the sample mean and sample standard deviation would not change with the larger sample size. A. The p-value would stay the same B. The p-value would increase (or get bigger) C. The p-value would decrease (or get smaller) D. This cannot be answered since p-values are not affected at all by sample size

C. As the sample size gets larger, the denominator for the test statistic gets smaller, and this leads the test statistic to get larger. A larger test statistic is further out in the tail of the sampling distribution, and values further out in the tail have a smaller probability of occurring. This is why the p-value gets smaller as the sample size increases.
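
A rough illustration of this effect, under two assumptions that are not stated in the question: the test is one-sided (upper tail), and the p-value comes from a Normal table. With the sample mean and standard deviation held fixed, the test statistic grows in proportion to √n, so the p-value shrinks. The sample sizes 80 and 160 are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

n0 = 40
# z-score that corresponds to the stated p-value of 0.0667 (about 1.5),
# ASSUMING a one-sided, upper-tail test
z0 = std_normal.inv_cdf(1 - 0.0667)

p_values = {}
for n in (40, 80, 160):              # 80 and 160 are hypothetical sample sizes
    z = z0 * sqrt(n / n0)            # same mean and SD, so z scales with sqrt(n)
    p_values[n] = 1 - std_normal.cdf(z)
    print(f"n = {n:3d}: z = {z:.2f}, p-value = {p_values[n]:.4f}")
```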

You ask a simple random sample of 200 college students whether they have ever changed their major. Suppose that in the population, 54% of all college students have changed their majors at least one time. The sampling distribution (based on samples of size n = 200) of the sample proportion who say they have changed their major is Normal, with a mean (or center) of 0.54 and a standard deviation of 0.04. Approximately what percentage of samples would have a sample proportion between 0.46 and 0.58? Choose the answer below that is closest to what you calculate. A. 27% B. 47.5% C. 68% D. 81.5% E. 95%

D. 81.5%. Applying the Empirical Rule can help us answer this question. We know that 0.46 is two standard deviations below the mean. If we go out two standard deviations in either direction of the mean, we have 95% of our data. The percentage of values that fall between 0.46 and 0.54 would be found by taking 95% and dividing it in half to give us 47.5%. We know that 0.58 is one standard deviation above the mean. If we go out one standard deviation in either direction of the mean, we have 68% of our data. The percentage of values between 0.54 and 0.58 would be found by dividing 68% by two to get 34%. If we now add 47.5% and 34%, we get our answer: 81.5%. Another way to approach this is to use Table B. We know that as a z-score, 0.46 is -2, and 0.58 is +1. The percentiles that correspond to z-scores of -2 and +1 are, respectively, 2.27 and 84.13. If we take the larger percentage and subtract the smaller one, we get 84.13 - 2.27 = 81.86.
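
Both routes can be checked with a short sketch. It assumes `statistics.NormalDist` in place of Table B, so the exact figure differs slightly from the Empirical Rule's rounded 81.5%; the variable names are illustrative.

```python
from statistics import NormalDist

# Sampling distribution of the sample proportion (n = 200)
dist = NormalDist(mu=0.54, sigma=0.04)

z_low = (0.46 - 0.54) / 0.04    # -2
z_high = (0.58 - 0.54) / 0.04   # +1

# Empirical Rule: half of 95% plus half of 68%
empirical = 0.95 / 2 + 0.68 / 2

# Normal-table style calculation: percentile at +1 minus percentile at -2
exact = dist.cdf(0.58) - dist.cdf(0.46)

print(f"z-scores: {z_low:.1f} and {z_high:.1f}")
print(f"Empirical Rule: {empirical:.1%}, Normal calculation: {exact:.2%}")
```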

Masha plans to toss a fair coin 1000 times in the hope that it will lead her to a deeper understanding of the laws of probability. Which of the following statements is true? A. It would be impossible for all 1000 tosses to land on the tails side. B. Whether the 500th coin toss results in a head will depend on whether the 499th toss was a head. C. The proportion of tails in these tosses would be considered a parameter. D. The number of tails Masha observes should be exactly equal to 500. E. The proportion of tails should be close to 0.5.

E. The proportion of tails should be close to 0.5

T/F The population parameter is always found in the confidence interval

F

T/F The population parameter will always be between the lower and upper bounds of the confidence interval.

F

True or False? When constructing a confidence interval to estimate a population mean, the size of the sample standard deviation has no impact on the margin of error.

FALSE. To determine the margin of error, we need to know the sample standard deviation, or s. As s gets larger, the margin of error gets larger.

A phenomenon is called ____________if individual outcomes are uncertain but there is a regular distribution of outcomes in a large number of repetitions.

Here, you are essentially given the definition of what random means in statistics; individual outcomes are uncertain, but, in a large number of repetitions, a regular pattern begins to emerge.

if the data has a tail on the right

Median < Mean

if the data has a tail on the left

Median > Mean

Suppose you were to draw five more random samples. Each sample would contain 10 words from the Gettysburg Address. Since your first sample had a proportion of 0.10 nouns, would you expect the next five random samples to also have sample proportions of 0.10? Please explain why or why not

No, I would not expect all of the sample proportions to be 0.10

Is a correlation of r = 0.9 stronger than a correlation of r = -0.9? Please explain why or why not.

No. A correlation of 0.9 is just as strong as a correlation of -0.9. The difference between the two correlations has to do with direction, not strength.

how you'd expect the 95% confidence interval to change if you had been able to take a larger sample

Our margin of error will get smaller as the sample size increases. This will then lead to a more narrow confidence interval.

What proportion of college students prefer to use digital textbooks? Allan reports that, with 95% confidence, the proportion of all college students who prefer to use digital textbooks is between 0.37 and 0.45. Allan constructed this interval after gathering data from a random sample of college students. From this information, we can conclude the margin of error is A. 0.08. B. 1.96. C. 0.04. D. 0.05. E. It's impossible to answer this question without more information.

Recall that the sample statistic is always right in the center of the interval, and the width of the interval is determined by the margin of error. If we are given an interval, we can easily work backward to find the sample statistic and the margin of error. The interval in this case is 0.37 to 0.45. The center, or the sample statistic, is in the middle of the interval. We can find it by averaging 0.37 and 0.45 to get 0.41. Now, if we figure out what we'd need to add to and subtract from 0.41 to get 0.45 and 0.37, we'll have our margin of error. Finding that 0.45 - 0.41 = 0.04 and 0.41 - 0.37 = 0.04 tells us that 0.41 is 0.04 units away from 0.37 and 0.45. Our margin of error is therefore 0.04. Putting the pieces together, we get the interval as follows: 0.41 ± 0.04. Remember that another way to figure out the margin of error is to find the distance between the lower and upper bounds of the interval and to divide that distance by 2. This would be (0.45 - 0.37)/2 = 0.08/2 = 0.04.
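
The working-backward steps can be sketched in a few lines; the variable names are illustrative.

```python
lower, upper = 0.37, 0.45

# The sample statistic sits in the middle of the interval
center = (lower + upper) / 2

# The margin of error is half the width of the interval
margin_of_error = (upper - lower) / 2

print(f"interval = {center:.2f} ± {margin_of_error:.2f}")
```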

T/F The sample statistic will always be between the lower and upper bounds of the confidence interval.

T

T/F If you construct a confidence interval for a population mean, the size of the sample mean will have no effect on the size of the margin of error.

T. The sample mean determines the center of the interval, not the size of the margin of error. You might be tempted to think the size of the sample mean would make a difference since we need to use the sample mean to find the sample standard deviation, but the SIZE of the sample mean has no bearing on what the sample standard deviation ends up being; it's how far data values fall from the sample mean that affects the standard deviation.

T/F The sample statistic is always found within the confidence interval

T because it is derived from the sample statistic

what happens to a confidence interval as you increase your level of confidence?

The interval gets wider when the confidence level is increased. This is because the z* value—which is based on the level of confidence—gets bigger as you increase the level of confidence. This, in turn, leads to a larger margin of error.
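
To make the growth in z* concrete, the sketch below looks up z* for the three usual confidence levels. It assumes `statistics.NormalDist.inv_cdf` as a substitute for a printed table; the dictionary name is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

z_star = {}
for level in (0.90, 0.95, 0.99):
    # The middle `level` of the curve leaves (1 - level) / 2 in each tail
    z_star[level] = std_normal.inv_cdf(1 - (1 - level) / 2)
    print(f"{level:.0%} confidence: z* = {z_star[level]:.3f}")
```

Because the margin of error is z* times the standard error, a larger z* gives a wider interval when everything else stays the same.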

Imagine you have data in the form of means, and you have conducted a hypothesis test. You obtain a p-value of 0.0001. Which of the following statements must be true regarding this p-value? A. Since the p-value is so small, the results will have practical importance as well as statistical significance. B. The difference between μ and x̄ must be huge. C. The sample standard deviation, or s, must be very small. D. All of the above. E. None of the above.

The key word in the question is "must." It's certainly possible that B and/or C could be leading to a small p-value, but it's not necessary for either B or C to be happening in order for the p-value to be small. None of the options say anything about sample size, and we know this too can lead to a small p-value, even when there is not that big of a difference between the sample mean and the claimed population mean. Further, option A is one that can be easily ruled out since results can be statistically significant (i.e., have a small p-value that has led to rejection of the null hypothesis) without being practically important (or vice versa). We need to look beyond the p-value when it comes to judging practical importance!

p-value

The probability of observing a test statistic as extreme as, or more extreme than, the statistic obtained from a sample, under the assumption that the null hypothesis is true.

Is the true population proportion a parameter or a statistic?

The true population proportion would be a parameter since it is a characteristic of the population.

If you conduct a two-sided test, you get your p-value

by doubling the one-sided p-value.

sample statistic determines the

center of the confidence interval

As we take a bigger random sample, we expect to get a sample proportion that is _____________________________________________

close to the true population proportion

It has been reported that 45% of all college students use Twitter. If we survey a simple random sample of n = 225 college students and ask if they use Twitter, the percentage who say "yes" will vary if the sampling method is repeated. In fact, the sampling distribution of the percentage who say they use Twitter, in many samples of size n = 225, will be Normal in shape, with mean (or center) of 45% and standard deviation of 3.3%. Based on this information, we know that the probability of obtaining a sample (of size n = 225) where 39% or fewer students say they use Twitter is A. 0.0227. B. 0.0359. C. 0.0446. D. 0.0500. E. 0.1357.

To answer this question, we must convert 39% to a z-score and then use Table B. This gives us: z = (39% - 45%) / 3.3% ≈ -1.8. From Table B, we see -1.8 corresponds to a percentile of 3.59%. Since the question is asking for a probability (which is a number between 0 and 1), we need to divide 3.59 by 100 to get the final answer of 0.0359.
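
A minimal sketch of the same lookup, assuming `statistics.NormalDist` in place of Table B. The z-score is rounded to one decimal place to mirror the table; without rounding the probability comes out slightly smaller.

```python
from statistics import NormalDist

mean, sd = 45.0, 3.3      # sampling distribution, in percentage points

z = (39.0 - mean) / sd    # about -1.82
z_table = round(z, 1)     # -1.8, as a one-decimal table lookup would use

prob_table = NormalDist().cdf(z_table)
prob_unrounded = NormalDist().cdf(z)

print(f"z = {z:.3f}, rounded z = {z_table}")
print(f"P(39% or fewer) = {prob_table:.4f} (unrounded z gives {prob_unrounded:.4f})")
```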

The average number of minutes spent per day using social media by a population of college seniors is 42.7 minutes. If we take a random sample of size n = 65 from this population and find that the sample standard deviation is 9.6 minutes, we know the sampling distribution of the sample mean in this case would have a standard deviation equal to A. 0.15 minutes. B. 9.60 minutes. C. 5.30 minutes. D. 1.19 minutes. E. 2.87 minutes.

To find the standard deviation of the sampling distribution when the focus is on means, we simply need to know the sample standard deviation (s) and the sample size (n): SD = s/√n = 9.6/√65 ≈ 1.19 minutes.
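
The calculation is s divided by the square root of n; a short sketch with illustrative variable names:

```python
from math import sqrt

s = 9.6   # sample standard deviation, in minutes
n = 65    # sample size

# SD of the sampling distribution of the sample mean
sd_of_mean = s / sqrt(n)

print(f"SD of the sampling distribution = {sd_of_mean:.2f} minutes")
```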

You want to estimate the proportion of students at OSU who have ever used an online dating app. You select a random sample of 140 OSU students and find that 43% have used an online dating app. If you want to construct a 99% confidence interval, what will the margin of error be? Try not to do a lot of rounding along the way until you get to the end of your calculations. Choose the answer below that is closest to your final answer. A. 7.1% B. 4.2% C. 5.0% D. 8.5% E. 10.8%

To obtain the margin of error, we use the following formula: MOE = z*√(p-hat(1-p-hat)/n). With z* ≈ 2.576 for 99% confidence, this gives 2.576 × √(0.43(0.57)/140) ≈ 0.108. If we now multiply 0.108 by 100% to convert it to a percentage, we get 10.8%.
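
A sketch of the margin-of-error calculation for this card, assuming z* for 99% confidence is taken from `statistics.NormalDist.inv_cdf` (about 2.576) rather than from a printed table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 0.43   # sample proportion
n = 140        # sample size

# z* for 99% confidence leaves 0.5% in each tail
z_star = NormalDist().inv_cdf(0.995)

# Margin of error: z* times the standard error of p-hat
moe = z_star * sqrt(p_hat * (1 - p_hat) / n)

print(f"z* = {z_star:.3f}, margin of error = {moe:.1%}")
```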

True or False? Switching "x" and "y" will lead the regression equation to change.

True. What you make "x" and "y" has a direct impact on the slope and intercept since these values are computed based on the means and standard deviations of the "x" and "y" variables.

proper form of CI

We are 95% confident that the true proportion of the population is between the lower bound and the upper bound

Which one of the following statements about p-values is correct? A. If the p-value for a hypothesis test is 0.15, this means the probability that the null hypothesis is true is 0.15. B. The smaller the p-value, the more evidence there is against the null hypothesis. C. If a p-value is 0.04 and the alpha-level is 0.01, our decision should be to reject the null hypothesis. D. A p-value that is less than 0 means that we have an outcome that is extremely rare. E. All of the above statements are correct.

We can rule out option A since a p-value is not telling us anything about the probability of the null being true. Option C is not correct because we'd fail to reject the null hypothesis in that case since 0.04 is larger than 0.01. Option D is incorrect because a p-value can never be less than 0. Since A, C, and D are not correct, E is also not correct. Option B is correct because one interpretation of a smaller p-value is that it's providing us with more evidence against the null hypothesis. A smaller p-value makes it more likely that we will be able to reject the null hypothesis.

A recent article stated that college students spend $20 per week on coffee. Suppose you think that value is too low for OSU students. You collect some data in order to conduct a hypothesis test. You obtain a random sample of 130 OSU students and ask each student how much is spent on coffee each week. You find that your sample mean is $21.45 and the sample standard deviation is $14.80. What will the p-value be? Choose the answer below that is closest to what you calculate. A. 0.8643 B. 0.0808 C. 0.5398 D. 0.1357 E. Less than .001.

We first need to compute the test statistic: z = (21.45 - 20) / (14.80/√130) ≈ 1.1. From Table B, we see that a z of 1.1 corresponds with a percentile of 86.43. We must subtract this from 100 to get the upper tail percentage since we believe college students spend MORE than $20 per week on coffee. This gives us 100 - 86.43 = 13.57. Converting this to a probability gives us 13.57/100 = 0.1357.
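
The same steps as a minimal sketch, assuming `statistics.NormalDist` in place of Table B and rounding z to one decimal place the way a table lookup would; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

claimed_mean = 20.0   # dollars per week, from the article
sample_mean = 21.45
s = 14.80             # sample standard deviation
n = 130

# Test statistic: (sample mean - claimed mean) / (s / sqrt(n))
z = (sample_mean - claimed_mean) / (s / sqrt(n))
z_table = round(z, 1)   # 1.1

# One-sided (upper tail) p-value, since the claim is that $20 is too low
p_value = 1 - NormalDist().cdf(z_table)

print(f"z = {z:.3f}, rounded z = {z_table}, p-value = {p_value:.4f}")
```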

Match.com is an online dating service. Those who manage the Match.com site claim that 25% of the people who use their services are under the age of 30. You do not believe this claim. In fact, you think that the true percentage of individuals in this age group who use the Match.com online dating services is different from 25%. You are able to survey a random sample of 94 individuals under the age of 30. In your sample, 21 individuals indicate they have used Match.com online dating services. You now want to test the following hypotheses: H0: p = 0.25, Ha: p ≠ 0.25. What will your test statistic be? A. -2.6 B. -1.2 C. -0.6 D. -0.3 E. None of the above

We first need to figure out the sample proportion. This will be 21/94 or approximately 0.223. Assuming the null hypothesis is true, our test statistic can be calculated as z = (0.223 - 0.25) / √(0.25(0.75)/94) ≈ -0.6.
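
A short sketch of this test statistic. The standard deviation uses the claimed proportion from the null hypothesis (0.25), not the sample proportion; the variable names are illustrative.

```python
from math import sqrt

p0 = 0.25        # claimed proportion under the null hypothesis
n = 94
p_hat = 21 / n   # sample proportion, about 0.223

# SD of the sampling distribution, assuming the null hypothesis is true
sd = sqrt(p0 * (1 - p0) / n)

z = (p_hat - p0) / sd

print(f"p-hat = {p_hat:.3f}, SD = {sd:.4f}, z = {z:.2f}")
```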

z*

determined by the confidence level

a _________________ shows us all values a variable can take on and how often it can take on different values.

distribution

if p value is greater than alpha

fail to reject the null

in a sampling distribution, as the sample size ______________ the distribution will look more normal even if the population distribution is not normal

gets larger

If an outcome has a probability of 1

it always happens

If an outcome has a probability of zero

it never happens

p-value keeps getting smaller as z

keeps getting bigger

random variability is taken into account by the

margin of error

Confidence statement has two parts

margin of error and level of confidence

center of sampling distribution

mean

The bigger the sample size, the ________ the confidence interval (assuming the same confidence level).

narrower

the bigger the sample size the ___________________the confidence interval assuming the same confidence level

narrower

the mean of the sampling distribution does ________________as the sample size increases a. increases b. stays the same c. decreases

not change

Suppose we are given a bag (or a sample) of 100 candies, and we want to determine the proportion of candies that are orange. Out of 100 candies, suppose that exactly 48 are orange. Our sample proportion is then:

p(hat)=48/100 = 0.48

confidence statements are always written about the

population parameter

σ

population standard deviation

A _______________________________ shows all possible outcomes of a random phenomenon and the probabilities of each outcome.

probability model

Can the average number of hours of sleep a mammal gets per day help to predict that mammal's lifespan (in years)? Researchers studied the sleep patterns and lifespans of a large sample of mammals. All of the mammals in the sample slept, on average, between 2.9 and 19.7 hours per day. The form of the relationship is linear, and the following information was computed based on this data set. Predicted lifespan = 36.93 -1.64 (average hours of sleep) r-squared (or R-squared or r2) = 42.21% Based on the above information, what is the value of the correlation coefficient, or r?

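r = -√0.4221 ≈ -0.65 (r takes the negative square root because the slope of the regression line, -1.64, is negative)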

if p value is less than alpha

reject the null hypothesis

When attempting to estimate a population mean, which of the following values will always be within the 95% confidence interval limits? A. population mean B. sample size C. sample mean D. sample standard deviation

sample mean

S

sample standard deviation

If a bag of 100 candies is a sample, the proportion of orange candies in that bag is a

sample statistic. It is a sample proportion.

A __________________ is a collection of the statistics of all possible samples of a particular size taken from a particular population. If we want to use one sample to make an inference about an unknown parameter, we need to understand how our one sample would compare to other samples we could have drawn randomly from the population

sampling distribution

A __________________ shows us how sample statistics can vary. It assigns probabilities to the values the sample statistic can take.

sampling distribution

measurement of variability in sampling distribution

standard deviation

outliers will have an impact on the

standard deviation and the mean

Imagine that you take a random sample of n = 10 words from the population, and you find that 10% of the sample is composed of nouns. Converting this number to a sample proportion gives us 0.10. We would call this value a _____________________ since it describes a sample

statistic

How does the size of the standard deviation affect the MOE?

the larger the SD the larger the MOE

if the data has a normalized curve

the median = mean

The name for the pattern of values that a statistic takes when we sample repeatedly from the same population is known as

the sampling distribution of the statistic

The hypothesis test is designed to assess

the strength of the evidence against the null hypothesis.

If two events have no outcomes in common, the probability that one or the other occurs is

the sum of their individual probabilities (addition rule)

A claim is made that 33% of adults have ended a relationship because of arguments over household chores. You believe this claim is incorrect, and you decide to conduct a hypothesis test with a two-tailed or two-sided alternative hypothesis. You are able to survey a random sample of 226 adults, and you obtain a test statistic of 2.4. If we assume the significance (or alpha) level is α = 0.01, what should your decision be? A. Reject the alternative hypothesis. B. Fail to reject the alternative hypothesis. C. Reject the null hypothesis. D. Fail to reject the null hypothesis. E. More information is needed in order to determine what decision is most appropriate.

We have a positive test statistic of 2.4. This means we first need to look up 2.4 in Table B, find the percentile associated with that test statistic, and subtract the percentile from 100. This gives us 100 - 99.18 = 0.82. If we then divide 0.82 by 100, we get a probability of 0.0082. If we had a one-sided or one-tailed alternative hypothesis, this would be our final p-value, and we'd reject the null hypothesis since that p-value is clearly smaller than 0.01. Because we have a two-sided or two-tailed alternative hypothesis, we need to multiply the probability of 0.0082 by 2. This gives us 0.0082 x 2 = 0.0164. Since 0.0164 is larger than the alpha level of 0.01, we fail to reject the null hypothesis.
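
The doubling step and the decision can be sketched as follows, assuming `statistics.NormalDist` in place of Table B; the variable names are illustrative.

```python
from statistics import NormalDist

z = 2.4        # test statistic from the question
alpha = 0.01   # significance level

# Upper-tail probability beyond z (the one-sided p-value)
one_sided = 1 - NormalDist().cdf(z)

# Two-sided alternative: count both tails by doubling
two_sided = 2 * one_sided

reject_null = two_sided < alpha

print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
print("reject the null" if reject_null else "fail to reject the null")
```

With a one-sided alternative the same test statistic would lead to rejection at α = 0.01, which is why the direction of the alternative hypothesis matters here.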

If the level of confidence was changed from 90% to 99%, how would you expect the width of the confidence interval to change?

We would expect the width to increase or get larger because a larger level of confidence leads to a larger value of z*, and this will make the margin of error bigger. The margin of error is what determines the width of the interval.

When asked if she thinks she got a passing grade on the exam in her business finance course, Daisy says "I think I have a 75% chance of getting a passing grade on this exam." In other words, Daisy is assigning a probability of 0.75 to getting a passing grade. We would call Daisy's judgment of the likelihood of getting a passing grade a A. probability model. B. conditional probability. C. margin of error. D. personal probability. E. sampling distribution.

What Daisy is doing relates to the idea of personal probability. A personal probability is a number between 0 and 1 that expresses an individual's judgment of how likely the outcome is.

as we increase the sample size, how does the width of the confidence interval change?

When the sample size gets larger, the margin of error gets smaller, and this will lead to a more narrow confidence interval.

Suppose you were to draw five more random samples. Each sample would contain 10 words from the Gettysburg Address. Is it possible for all the sample proportions to be the same?

Yes, it is, but given sampling variability, it's likely we'll see some differences from sample to sample.

if an essay is considered to be a population, the proportion of nouns in that essay is

a parameter

sampling distribution of the sample mean

a probability distribution of all possible sample means of a given sample size

Since we get closer and closer to the true parameter with a larger sample size, we see less variability, as evidenced by ___________________________, among the sample statistics for the larger samples, when compared to the variability seen among the statistics for the smaller samples.

a smaller standard deviation and a smaller overall range

confidence intervals allow us

to use sample data to estimate an unknown population parameter

hypothesis tests allow us

to use sample data to test a claim about a population parameter

The bigger the confidence level (90%, 95%, 99%), the _______ the confidence interval.

wider

the bigger the confidence level (90, 95, 99) the ___________________ the confidence interval

wider

margin of error determines the ________________of the CI

width

in any given sampling distribution, the MOE of a CI

would stay the same for each sample taken

equation for the standard deviation of the sampling distribution of the sample mean

σ/√n

