Statistics 1430 Exam 2

True

The 80th percentile of the standard normal (Z) distribution is .84.

True

True or False?

True

True or False? Suppose X and Y are independent. Then is the following TRUE or FALSE?

False

True or false? Let X be a continuous random variable. Then f(x) must always lie between 0 and 1.

Time spent waiting at a bank

Uncountably Infinite

Percentile

Value of x where a certain percentage lies below it

When finding ____________ for a continuous random variable, you only have to set up the integrals, not calculate them out

Variance

2.6

What is the mean of X+Y?

Number of accidents in one year at an intersection

Which of the following is NOT a continuous random variable? - Number of accidents in one year at an intersection - Length of time between accidents at an intersection - Time waiting on the phone for tech support (in seconds) - All of the choices are continuous random variables

True

With a continuous random variable, P(a < X < b) is the area under the curve f(x) between a and b.

.841

X is a normally distributed random variable with mean 50 and standard deviation 5. What is the probability that X will be less than 55?
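The table lookup behind this answer can be checked with Python's standard library. This is only a sketch; the variable names are illustrative and not part of the course material.

```python
from statistics import NormalDist

# X ~ Normal(mean 50, standard deviation 5), as in the card above
x_dist = NormalDist(mu=50, sigma=5)

# P(X < 55) is the area under the curve to the left of 55
p_less_than_55 = x_dist.cdf(55)

# Same answer through the Z-score: Z = (x - mean) / standard deviation
z = (55 - 50) / 5
p_via_z = NormalDist().cdf(z)

print(round(p_less_than_55, 3))  # 0.841
```

Standardizing first (Z = 1) and then looking up the area is the by-hand version of the same calculation.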

If x and y are independent, is variance of x - y = variance of y - x?

Yes

1/2

f(x) = 1/2 between 0 and 2. What is the probability that X is between 1 and 3?
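For a flat density like this one, the probability is just height times the width of the part of (a, b) that overlaps the interval where f(x) is nonzero. A small sketch of that idea follows; the function name is made up for illustration.

```python
def uniform_probability(low, high, a, b):
    """P(a < X < b) for X uniform on (low, high): height times overlapping width."""
    height = 1 / (high - low)  # the density f(x) on (low, high)
    overlap = max(0.0, min(b, high) - max(a, low))
    return height * overlap

# f(x) = 1/2 on (0, 2); only the piece from 1 to 2 contributes
print(uniform_probability(0, 2, 1, 3))      # 0.5

# A density can exceed 1: on (0, 0.5) the height is 2, yet the total area is still 1
print(uniform_probability(0, 0.5, 0, 0.5))  # 1.0

# A single point has zero width, so its probability is 0
print(uniform_probability(0, 2, 1, 1))      # 0.0
```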

Which of the following terms represents "all possible values of a random variable and how often you expect them to occur?"

probability distribution

(n x), read "n choose x"

n! / (x! (n-x)!) =
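The "n choose x" count is n! divided by both x! and (n-x)!. A short sketch that computes it from factorials and compares it with Python's built-in `math.comb`; the helper name `n_choose_x` is illustrative.

```python
from math import comb, factorial

def n_choose_x(n: int, x: int) -> int:
    """Number of ways to get x yeses in n trials: n! / (x! * (n - x)!)."""
    return factorial(n) // (factorial(x) * factorial(n - x))

print(n_choose_x(4, 3))                  # 4
print(n_choose_x(10, 2) == comb(10, 2))  # True
```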

Z

(x - μx)/ σx =

What are the types of random variables?

- Discrete Random Variable - Continuous Random Variable

Binomial probability table

- For many different values of n and p - Shows probability X is less than or equal to x

The integral of e^-x =

-e^-x

P(X = x) =

0

Rules for variance:

1. Linear transformations: σ²(aX + b) = a²σ²(X) (the constant b adds 0 to the variance)
2. Sums and differences, X and Y independent: σ²(X + Y) = σ²(X) + σ²(Y) and σ²(X − Y) = σ²(X) + σ²(Y)
3. Sums and differences, general case: σ²(X + Y) = σ²(X) + σ²(Y) + 2ρσ(X)σ(Y) and σ²(X − Y) = σ²(X) + σ²(Y) − 2ρσ(X)σ(Y), where ρ is the correlation between X and Y (ρ = 0 if X and Y are independent)

Rules of means:

1. Linear transformation: if Y = aX + b, then μ(Y) = aμ(X) + b
2. Sums and differences: μ(X + Y) = μ(X) + μ(Y) and μ(X − Y) = μ(X) − μ(Y)
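The rules for means and variances can be checked by brute force on a small example. The sketch below builds the exact distributions of X + Y and X − Y for two independent discrete variables; both distributions are made up for illustration.

```python
from itertools import product

# Two small, hypothetical discrete distributions: value -> probability
x_dist = {0: 0.5, 1: 0.5}
y_dist = {1: 0.2, 2: 0.3, 3: 0.5}

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum((v - m) ** 2 * p for v, p in dist.items())

def combine(dx, dy, op):
    """Exact distribution of op(X, Y) when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        key = op(x, y)
        out[key] = out.get(key, 0.0) + px * py  # independence: multiply probabilities
    return out

total = combine(x_dist, y_dist, lambda x, y: x + y)
diff = combine(x_dist, y_dist, lambda x, y: x - y)
scaled = {3 * v + 2: p for v, p in x_dist.items()}  # the variable 3X + 2

print(mean(diff), mean(x_dist) - mean(y_dist))          # means subtract
print(variance(total), variance(diff))                  # both equal Var(X) + Var(Y)
print(variance(scaled), 9 * variance(x_dist))           # a^2 times Var(X); b drops out
```

The variances add for both the sum and the difference only because the two variables were combined as independent.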

A _________ is uncountably infinite (it can take any value on an interval)

Continuous Random Variable

r =

Correlation in the sample between x and y

$12 per hour

Everyone at Bob's diner makes $12.00 per hour. Let X = salary of an employee at Bob's diner. What is the mean of X?

True

Everyone at Bob's diner makes $12.00 per hour. Let X = salary of an employee at Bob's diner. With this information, can you find the variance of X?

Suppose X is a discrete random variable with mean 10 and variance 4, and Y is a discrete random variable with mean 20 and variance 9. We do not know whether X and Y are independent. True/False: We can find the variance of X+Y with only the information we are given.

False

σ x+y = σx + σy

False

If x and y are independent, is variance of x - y = variance of x - variance of y?

No

f(x) is _____________ the same as p(x)

Not

- Always integrate over the entire interval for X - Don't forget to put the "x" in your integral. If you forgot to put the "x" in the integral, the result would be the integral of f(x) dx =1 - Your answer is not a probability. It's the expected average value of X - Your answer should be somewhere between the minimum and maximum possible values of X - The mean of X is in units of X

Notes on Means

A list of all possible values of x and how often you expect them to occur (looking into the future)

Probability distribution of a discrete random variable

4/9

Suppose X is a continuous random variable with the pdf defined as below: What is f(2)?

2.25

Suppose X is a continuous random variable with the pdf defined as below: What is the mean of X?

.7

Suppose X is a continuous random variable with the pdf defined as below: What is the probability that x is greater than 2?

False

Suppose f(x) = 1/2 where 0 < X < 2. True or false? The median of X is 0.50.

Binomial Distribution

- A discrete variable (discrete distribution) - Used when counting the number of times an event occurs out of n trials - E.g., X = number of yeses that occurred

The probability for X within a certain interval is the integral of f(x) over that ____________

Interval

Years of employment

If you were trying to predict someone's age using their years of employment, which variable would be the X variable?

1.645

z-value for a 90% confidence interval =

If you don't know σ, use t and use s for SD

Assuming you have a normal or at least a symmetric distribution.

The integral of f(x) from 2 to 2

Let f(x) =1/3 be a density function for a random variable X where X is between 0 and 3. Which of the following represents the probability that X is 2?

z * Square root (p-hat(1 - p-hat) / n)

Margin of error in a confidence interval for p:

Intuition behind the central limit theorem

No matter what the population values are shaped like, as n increases the values of x(bar) look more normal and are more concentrated around μx

False

P(Z < -1) = 1- P(Z > 1) True or False?

p

Parameter for proportions:

True

The p-values can change from sample to sample.

µ

The parameter for means:

Coefficient of Determination

The percentage of variability in y that is due to x - Notation: R^2 - Calculate as r^2 in a linear regression line

Approximately normal if n is large enough

The shape of the sampling distribution of x(bar) is ______________________

False

The significance level is the same as a Type 2 error.

False

True or false? σx^2 - σy^2 = σy^2 - σx^2 when X and Y are independent

.01

Use the quick and dirty formula to find the approximate margin of error if you surveyed 10,000 randomly selected individuals and asked them a yes/no question. Choose the closest answer.
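The card does not spell out the formula. The sketch below assumes the "quick and dirty" margin of error is the usual 1/√n rule of thumb, which comes from taking z ≈ 2 and p-hat = 1/2 in the margin of error formula for a proportion.

```python
from math import sqrt

def quick_margin_of_error(n: int) -> float:
    """Rough 95% margin of error for a yes/no question: 1 / sqrt(n)."""
    return 1 / sqrt(n)

def worst_case_margin_of_error(n: int, z: float = 1.96) -> float:
    """Full formula z * sqrt(p_hat * (1 - p_hat) / n) at p_hat = 1/2."""
    return z * sqrt(0.5 * 0.5 / n)

print(quick_margin_of_error(10_000))                 # 0.01
print(round(worst_case_margin_of_error(10_000), 4))  # 0.0098
```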

True

We are 100% confident that our SAMPLE mean is in every confidence interval we make.

Sum of Squares for Error

What does SSE stand for?

p-hat

Statistic for proportions:

Significance Level

What is the name for the cutoff value that lets you decide between rejecting Ho and failing to reject Ho with your p-value?

We use statistics to estimate or test parameters

What is the relationship between parameters and statistics?

p-hat (Proportion in the sample that have the characteristic)

What is the sample statistic when estimating p?

Square root (p-hat(1 - p-hat) / n)

What is the standard error when estimating p?

The standard error of X-bar decreases

What is true about the standard error of the random variable X-bar as n increases?

False

Suppose X = the number of rolls needed until you get a 1 on a single die. Then X is a binomial random variable.

False

Suppose X is a continuous random variable. Then f(x) must be between 0 and 1 for all values of X.

Integrate f(x) from 0 to 5.

Suppose a random variable X is continuous with a certain density function f(x) where x > 0. You want to find P(X ≤ 5). What do you do?

Hypothesis test for a proportion.

Suppose someone reports that 70% of registered voters voted in the presidential election, and you think it's more than that. What technique do you use?

Hypothesis Test

Suppose someone reports that the average age of all Pokemon players is 13. You believe it's higher than that because your statistics professor plays Pokemon. You decide to challenge the claim. Do you use a confidence interval or a hypothesis test?

No

Suppose we are doing a hypothesis test and α is set at 0.05. If our p-value turns out to be 0.07, we cannot reject Ho. However, would it be OK after the fact to change the significance level to something like 0.10, to make it so we could reject Ho?

Pounds

Suppose you are using height (inches) to predict weight (pounds) in a regression situation. What are the units of the Y-intercept?

Type 1 Error

Suppose you conducted a hypothesis test and you rejected Ho. Which type of error could you have made?

-.90

Suppose you have 4 data sets whose scatterplots all show possible linear relationships. The four data sets have correlations of -0.10, +0.25, -0.90, and +0.80, respectively. Which of the correlations shows the strongest linear relationship?

Smallest SSE

The criteria for the best fitting line

False

The density function for any continuous random variable X must be between 0 and 1.

degrees of freedom

tn-1 is a value on the t distribution with ______________________________ n-1.

Confidence Interval

- Formulas in this section need X to be normal or need a large n (n>30) - Bias is not included because it isn't measurable - Outliers affect the confidence interval (because they affect x bar)

When you know σ

- If you have a normal distribution or n > 30 use Z - If n< 30 and not normal, can use t as long as your distribution is relatively symmetric.

Confidence Intervals for p

- What proportion of all U.S. children play video games more than 3 hours per day? - What proportion of Americans distrust the media? - What % of the market does our product get? - What proportion of U.S. workers dread going back to work on Mondays? - What % of businesses use social media to promote their products?

Sometimes true

A confidence interval for the mean gives you a range of values that contains the exact population mean.

Two main types of inferences in statistics

1. Estimate a population parameter that's unknown using your data (confidence interval) 2. Test a claimed or reported value for a population parameter using your data (hypothesis/significance test)

Steps in a statistical inference

1. Take a sample 2. Get a sample statistic 3. Don't stop there. Sample results may vary. Measure and take into account the variability from sample to sample. Use this info to draw conclusions about the population based on your sample.

Hypothesis Test

1. What parameter is the statement about? 2. What is their claim about the value of that parameter? 3. You are challenging the claim. Do you think the value of the parameter is more than the claim, less than the claim, or simply not equal to the claim?

1.28

z-value for an 80% confidence interval =

1.96

z-value for a 95% confidence interval =

2.576

z-value for a 99% confidence interval =
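These z-values are the points on the standard normal curve that leave half of the leftover probability in each tail. A sketch of how they can be recovered without a table; `z_star` is an illustrative name.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """z-value that leaves (1 - confidence) / 2 in each tail of the Z distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

for level in (0.80, 0.90, 0.95, 0.99):
    print(level, round(z_star(level), 3))
```

Rounded to three decimals this gives 1.282, 1.645, 1.960 and 2.576 for the 80%, 90%, 95% and 99% levels.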

Increases

A 95% confidence interval for a population proportion is determined to be .75 to .86. As p-hat approaches ½ from either direction (above or below), what happens to the margin of error?

False

A Type 2 error is described as a false alarm.

t-distribution

A _______________ is flatter and fatter than a z-distribution

No linear relationship

r ≈ ±0.0

Weak linear relationship

r ≈ ±0.3

Moderate linear relationship

r ≈ ±0.5

Strong linear relationship

r ≈ ±0.7

He disagrees with the claim

Suppose a pizza place claims its average pizza delivery time is 10 minutes. Bob makes a 95% confidence interval for the average pizza delivery time for this pizza place and gets (15, 20). What does Bob think about the pizza place's claim?

Ho: u = 30 and Ha: u > 30

Suppose a pizza place claims their pizzas are delivered in 30 minutes and Bob believes it's more than that. Bob samples 49 pizza delivery times at random and gets an average of 35 and sample standard deviation 10 min. What are the hypotheses in this problem?

30

Suppose a pizza place claims their pizzas are delivered in 30 minutes and Bob believes it's more than that. Bob samples 49 pizza delivery times at random and gets an average of 35 and sample standard deviation 10 min. What is the value of u0 in this problem? (That's mu sub 0).

Hypothesis test for mean using Z

Suppose it is assumed that the weights of yogurt containers have a normal distribution with mean 8 ounces and standard deviation 1 ounce. You believe the average weight is not 8 ounces. What technique do you use?

.40

Suppose it is reported that 30% of cars on the road are speeding. You take a random sample of 100 cars and find 40 are speeding. What is the value of p^? (Note p^ stands for p-hat.)

True

Suppose n = 10 and you want the value of t to put in your 95% confidence interval for the mean, where you don't know the population standard deviation. The value of t that you find on the t-table in this case is 2.2622.

H0: µ = 2 vs. Ha: µ >2

Suppose someone claims the average delivery time for U.S. packages to be delivered during December is 2 days. You believe it takes longer than that. You conduct a hypothesis test; what are your hypotheses?

False

Suppose the correlation between X = price of a gallon of gasoline and Y = price of a gallon of milk is r = .30. Should we go on and try to use a straight line to predict milk prices from gasoline prices?

False

Suppose the correlation between two variables X and Y is .8. That means the correlation between Y and X is -.8.

False

Suppose the correlation between yards rushing and yards passing is .6. That means the correlation between feet rushing and feet passing is .6 x 3 (since you multiply yards by 3 to convert to feet).
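Correlation has no units, so rescaling either variable leaves it unchanged, and so does swapping X and Y. The sketch below checks this on made-up game data, converting yards to feet by multiplying by 3; the numbers and names are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation r between the paired lists xs and ys."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical yards rushing and yards passing for five games
rushing_yd = [90, 120, 75, 150, 110]
passing_yd = [210, 260, 180, 240, 250]

r_yards = correlation(rushing_yd, passing_yd)
r_feet = correlation([3 * v for v in rushing_yd], [3 * v for v in passing_yd])
r_swapped = correlation(passing_yd, rushing_yd)

# All three values agree: changing units or swapping X and Y does not change r
print(round(r_yards, 4), round(r_feet, 4), round(r_swapped, 4))
```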

a negative

Suppose the equation y = 3.45 - 2.58x represents a valid regression equation and X can be used to predict Y. From this information, we know that X and Y have _____________ correlation.

0

Suppose the probability density function f(x) for a uniform random variable X on the interval [0, 10]. What is the probability that X equals 5?

It stays the same

Suppose you take a random sample of size n and look at the average. As n increases, which of the following statements is true about the mean of X-bar?

Confidence Interval

Suppose you want to estimate the average age of all Pokemon players. Would you use a confidence interval or a hypothesis test?

95%

Suppose you want to estimate the average price of gas in Ohio and your 95% confidence interval was (1.20, 1.50). How confident are you that the POPULATION MEAN is in this interval?

100%

Suppose you want to estimate the average price of gas in Ohio and your 95% confidence interval was (1.20, 1.50). How confident are you that the SAMPLE MEAN is in this interval?

All OSU students

Suppose you want to estimate the percentage of all OSU students who will take classes this summer. You take a random sample of 200 OSU students and find that 50 of them will take classes this summer. What is the population in this problem?

25%

Suppose you want to estimate the percentage of all OSU students who will take classes this summer. You take a random sample of 200 OSU students and find that 50 of them will take classes this summer. What is the statistic in this problem?

n > 30

Suppose you want to make a confidence interval for the average gas price in Ohio. What condition(s) do you need to check, if any, to do this problem?

hypothesis test for u (where u represents mu)

Suppose you want to test a claim about the average life span of a Subaru in average mpg (miles per gallon). Which of the following would you use?

Yes

Suppose: 1. all the points on a scatterplot lie perfectly on a straight line going uphill; 2. the mean of X and the mean of Y are both 2; 3. the standard deviations of X and Y are exactly the same. Can you find the equation of the best fitting line with this information? (Hint: Think of the '5 number' way of finding the best-fitting line.)

(p-hat - p0) / square root (p0 * (1-p0) / n)

Test statistic for proportions:
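As a worked instance of this formula, the sketch below plugs in the speeding example from this set (claimed p0 = .30, 40 speeders in a sample of 100). The function name is illustrative.

```python
from math import sqrt

def proportion_test_statistic(p_hat, p0, n):
    """(p-hat - p0) divided by the standard error computed under the claim p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

n = 100
p0 = 0.30
p_hat = 40 / n

# Conditions: n * p0 and n * (1 - p0) should both be at least 10
conditions_met = n * p0 >= 10 and n * (1 - p0) >= 10

z = proportion_test_statistic(p_hat, p0, n)
print(conditions_met, round(z, 2))  # True 2.18
```

The standard error uses p0, not p-hat, because the test statistic is computed assuming the claim is true.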

Shape

The Central Limit Theorem is an important result that pertains only to the _______ of the distribution of the random variable.

The shape (type) of the distribution of X-bar

The Central Limit theorem tells us important results that pertain to:

False

The NORMAL PROBABILITY PLOT should look like a bell-shaped curve if the data have a normal distribution.

less than

The best fitting regression line has an SSE that is _____________ any other possible line that goes through the data points.

Lowest

The best fitting regression line is the one that has the _______________ sum of squares for error.

Shape

The central limit theorem only applies to __________, not mean or standard error

True

The coefficient of determination measures the % of variability in Y that is explained by X.

n > 30 or normal distribution with any n

The conditions for means:

Extrapolation

What is the term for making predictions outside the range of the data values for X?

The mean of X-bar stays the same

What is true about the mean of the random variable X-bar as n increases?

Confidence Interval

What percentage of children under 18 have gone to an R rated movie?

Straight line going uphill

What shape does the normal probability plot have if the data have a normal distribution?

Straight line

What shape should a normal probability plot look like if you have a normal distribution?

All of these choices are correct

What should the residual plot look like if the regression line fits the data well? - all of these choices are correct - no fan shapes - points fall around the horizontal line Y = 0 - random patterns

x-bar

What symbol do we use for the average of all the values in a data set?

Alternative Hypothesis

What you believe if the claim (H0) is incorrect. Ha: μx > μ0, μx < μ0, or μx ≠ μ0

- No pattern (should have random scatter about the regression line) - No systematic changes as x increases (e.g., y-values fanning out as x increases) - No or few unusually large residuals (outliers in the y direction) - No influential points (outliers in the x direction)

When examining the residuals, if the line fits well the residuals should have:

p-hat +- z * Square root (p-hat(1 - p-hat) / n)

When the two conditions are met, then the 95% confidence interval for p =
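A sketch of the whole calculation, using the sample from this set in which 50 of 200 OSU students plan to take summer classes. The function name and the error message are illustrative, not from the course.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Confidence interval for p: p-hat +- z * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = successes / n
    # Conditions: at least 10 yeses and at least 10 nos in the sample
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(50, 200)
print(round(low, 2), round(high, 2))  # 0.19 0.31
```

Here p-hat is .25 and the margin of error is about .06, so the interval runs from roughly 19% to 31%.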

Reject H0

When there is enough evidence based on my data to say the claim is false (p-value is small)

Fail to reject H0

When there is not enough evidence based on my data to say the claim is false (p-value is large)

t

When you have a distribution that isn't normal and n<30, use a _________ distribution

Margin of Error

When you have a higher confidence level for a confidence interval, your z increases, and your __________________ increases

Sample Size (n)

When you have a larger _____________ for your confidence interval, your _____________ decreases (more data, so a more precise, narrower interval)

Margin of Error

When you solve for n, use the ________ formula (and always round up!)

Sampling distribution

When you take all possible sample means from all possible samples of size n from a population and put them all together, what do you have?

Test Statistic

When you take the given information and form a Z-score in a hypothesis test for the mean, what is the special name of that result?

You needed np and n(1-p) to be at least 10 and you get an approximate answer.

When you use the normal distribution to find a probability for a binomial random variable, which of the following is true? - None of the other choices is correct - You needed np and n(1-p) to be less than 10. - You needed np and n(1-p) to be at least 10 and you get an approximate answer. - You are making a mistake.

Increases

When your confidence level increases in a confidence interval for p, your z increases, and your margin of error ______________

Decreases

When your sample proportion p-hat is close to 0 or 1, your margin of error ____________

Increases

When your sample proportion p-hat is close to 1/2, your margin of error __________

Decreases

When your sample size increases in a confidence interval for p, your n increases, and your margin of error _____________

t

Which distribution has a larger standard deviation, the Z or the t?

All of these choices are conditions of the binomial

Which of the following is NOT a condition of the binomial distribution? - independent trials - all of these choices are conditions of the binomial - fixed number of trials (n) - same probability of yes on each trial (p) - two possible outcomes on each trial (yes/no)

p-hat

Which of the following is a sample statistic? - p - p-hat - p0

Standard error

Which of the following is part of the margin of error formula for a confidence interval for the population mean? a. The sample mean b. Standard error c. Both a and b are part of the margin of error d. None of these answers is correct.

Var (X-Y) = Var(X) + Var(Y)

Which of the following requires independence? - Mean (X-Y)= Mean X - Mean Y - Var (X-Y) = Var(X) + Var(Y) - SD(X+Y) = Sqrt(Variance of (X+Y)) - All of these choices require independence

X does not have a normal distribution (or is unknown) and n > 30

Which of the following situations involves the use of the Central Limit Theorem?

Flipping a coin 10 times and counting the number of heads.

Which of the following variables has a binomial distribution? - Neither of these choices has a binomial distribution. - Flipping a coin 10 times and counting the number of heads. - Flipping a coin until you get 10 heads and counting the number of flips needed. - Both of these choices have binomial distributions.

x-bar +- z(σx/square root n)

With a confidence interval, you want to estimate the mean of the population with a confidence interval for µ. Doing so, the formula used is ________________, and you needed x to be normal or CLT n > 30

Binomial

X is _______________ if you are counting the number of YESes in a set of YES/NO trials.

Use a t distribution with n-1 degrees of freedom.

You are estimating the population mean and the standard deviation of the population is NOT known. Which distribution do you use in the confidence interval formula?

True

You can have a line that does not go through any of the data points but has the best SSE.

True

You must use a t instead of a Z in your confidence interval formula for a mean when you don't know the value of sigma. (Assume you have a normal distribution.)

Confidence Interval

You want to estimate the average time for packages to be delivered by your company

n * p-hat >= 10 and n * (1 - p-hat) >= 10

You want to estimate the proportion of all students who will go home for winter break using a confidence interval. Which condition(s) do you need to check in order to do this problem?

As square feet increase by 1, selling price increases by $33.80

Your boss gives you the following regression equation. Selling price = $5,240 + $33.80 (Number of Square Feet). How do you interpret the slope for this equation?
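The slope interpretation can be seen by comparing two predictions that differ by one square foot. The sketch below also computes a residual (observed minus predicted) for one made-up house; the sale price of $70,000 is hypothetical.

```python
INTERCEPT = 5240.0  # dollars, from the equation on the card
SLOPE = 33.80       # dollars per square foot

def predicted_price(square_feet):
    return INTERCEPT + SLOPE * square_feet

def residual(observed_price, square_feet):
    """Observed minus predicted; negative means the point lies below the line."""
    return observed_price - predicted_price(square_feet)

# One extra square foot changes the prediction by exactly the slope
print(round(predicted_price(2001) - predicted_price(2000), 2))  # 33.8

# A hypothetical 2,000 sq ft house that sold for $70,000
print(round(residual(70_000, 2000), 2))  # -2840.0
```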

False

Your boss gives you the following regression equation. X = square feet and Y = selling price Selling price = $5,240 + $33.80 (Number of Square Feet). Does it make sense to interpret the Y-intercept for this equation?

µ0

Your friend claims the average price of a burger in the Columbus Gateway Area is $10.00. You believe it's lower than that. What symbol represents the $10 in this scenario?

Population

____________ must have a normal distribution (or sample size is large enough that the t value is the same as the Z value.)

tn-1

_________________ is a value on the t-distribution with n-1 degrees of freedom.

Confidence interval for a population mean

x bar +- z(σx/square root n) =

False

x-bar = μx

Margin of error for a population mean

z(σx/square root n) =

Claimed value for population mean

μ0

Population mean (real value)

μx

μx

μx(bar) =

σx / square root n

σx(bar) =
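A simulation can make these two facts concrete. The sketch below draws many samples of size 36 from a strongly skewed population (exponential with mean 1 and standard deviation 1, chosen only for illustration) and looks at the resulting sample means; the seed and the number of repetitions are arbitrary.

```python
import random
from statistics import mean, stdev

random.seed(1430)  # arbitrary seed so the run is repeatable

n = 36       # sample size (larger than 30)
reps = 2000  # number of simulated samples

# Population: exponential with mean 1 and standard deviation 1 (right-skewed)
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

print(round(mean(sample_means), 2))   # close to the population mean, 1
print(round(stdev(sample_means), 2))  # close to 1 / sqrt(36), about 0.17
```

The center of the sample means stays at the population mean, while their spread shrinks to roughly σx divided by the square root of n.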

Standard error of x-bar

σx/square root n =

Statistic

A company has developed a new battery, but the average lifetime of all of the batteries it makes is unknown. In order to estimate this average, a sample of 500 batteries is tested and the average lifetime of this sample is found to be 225 hours. The 225 hours is the value of a:

True

A confidence interval for the mean is known as a range of likely values for the population mean.

True

A confidence interval gets wider when you don't know the population standard deviation because you have to use t instead of Z.

False

A confidence interval is narrower if you use t than if you use Z. (Assume all else is the same.)

Percent

A confidence interval means that you are a certain _________ confident that the population parameter is in there. You are not 100% sure

Hypothesis Test

A newspaper reports that 10% of children under 18 have gone to an R rated movie. You believe it's more than that. What does your data show?

False

A p-value in hypothesis testing means the same thing as the sample proportion.

There is a moderate positive linear relationship between January revenue and yearly revenue.

A researcher is trying to predict the linear relationship between January revenue and yearly revenue for her company. The correlation turns out to be .60. How does she interpret this correlation?

True

A researcher is trying to use January temperatures to predict latitude. This means January temperature is the X (independent) variable and latitude is the Y (dependent) variable.

False

A residual can be calculated anywhere along the regression line, not just where the data points are.

Bob continues his research because even though there is no linear relationship here, there could be a different relationship.

Bob is interested in examining the relationship between the number of bedrooms in a home and its selling price. After downloading a valid data set from the internet, he calculates the correlation. The correlation value he calculates is only 0.05. What does Bob conclude?

400

Bob makes a 95% confidence interval with a random sample of 100 and his margin of error is plus or minus 18. In order to cut the margin of error in half next time, what should Bob's sample size be? (Hint: You CAN determine the answer without knowing the standard deviation.)

16

Bob wants a 95% confidence interval for the population mean. Previous research shows the population standard deviation is 8. He wants his margin of error to be no more than 4. What should his sample size be?
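Solving the margin of error formula z(σ/√n) for n gives n = (zσ / margin of error)², rounded up. A sketch of that calculation for Bob's numbers; the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin_of_error (always round up)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)

print(sample_size_for_mean(sigma=8, margin_of_error=4))  # 16

# Cutting the margin of error in half takes 4 times the sample size,
# because n sits under a square root in the formula.
print(100 * 2 ** 2)  # 400
```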

P(X ≥ 3)

Bob wants to find the probability that X is at least 3. What probability does Bob need to find?

Bob's score is 0.25 standard deviations above the mean

Bob's Z-score on an exam is 0.25. What is the correct interpretation?

.0401

Bob's alternative hypothesis is Ha: u > 8 and his test statistic is 1.75. What is Bob's p-value?
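For a "greater than" alternative, the p-value is the area to the right of the test statistic on the Z curve. A short sketch using the standard library; for a "not equal" alternative the tail area is doubled.

```python
from statistics import NormalDist

z = 1.75

upper_tail = 1 - NormalDist().cdf(z)  # p-value when Ha is "greater than"
two_sided = 2 * upper_tail            # p-value when Ha is "not equal to"

print(round(upper_tail, 4))  # 0.0401
print(round(two_sided, 4))   # 0.0801
```

Doubling the rounded table value .0401 by hand gives .0802, which differs from the unrounded result only in the last digit.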

Ha: µ not equal to 8

Bottles of cream have a normal distribution with mean 8 ounces. You would stop the process if the containers were being filled incorrectly. What is your alternative hypothesis?

-2.05

Bottles of cream have a normal distribution with mean 8 ounces. You would stop the process if the containers were being filled incorrectly. Your sample of 30 containers has an average weight of 7.7 ounces. Assume the standard deviation is .8. What is your test statistic?
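The test statistic is the distance between the sample mean and the claimed mean, measured in standard errors. A sketch with the numbers from this card; the function name is illustrative.

```python
from math import sqrt

def z_test_statistic(x_bar, mu0, sigma, n):
    """(x-bar - mu0) divided by the standard error sigma / sqrt(n)."""
    return (x_bar - mu0) / (sigma / sqrt(n))

z = z_test_statistic(x_bar=7.7, mu0=8, sigma=0.8, n=30)
print(round(z, 2))  # -2.05
```

The negative sign only says the sample mean fell below the claimed 8 ounces.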

Yes. A higher confidence level makes z larger, but a larger n brings the margin of error back down

Can you have a high level of confidence, yet have a small margin of error?

True

Coefficient of determination works for any type of relationship.

n * p0 >= 10 n * (1-p0) >= 10

Conditions for proportions:

True

Correlation is affected by outliers.

True

Correlation measures the strength and direction of linear relationships only.

Linear

Correlation only works for ______________ relationships, but the coefficient of determination (R-squared) works for any relationship.

Population

Entire group of interest (e.g., all U.S. voters)

Statistical Inference

Estimate/guess or test/challenge a population parameter using info from a sample

False

Failing to reject Ho and Accepting Ho are both appropriate conclusions that mean the same thing.

Sampling Distribution

Find the distribution of all possible values of the sample statistic (from all possible samples of size n) - Find its center (mean), spread (standard error), and shape - Use this information to make inferences about the population parameter

When p^ is close to 0 or 1

For what value(s) of p^ is the margin of error the smallest in a confidence interval for p? Choose the best answer.

Confidence Interval

Give a range of likely values for the parameter (a number that summarizes the population) General Formula: Sample Statistic +- Margin of Error

Sum of squares for residuals

How do the residuals relate to the SSE

It has a larger standard deviation and hence thicker tails than the Z distribution.

How does the t distribution differ from the Z distribution?

n > 30

How large does n generally have to be in order for the Central Limit Theorem to take effect? (Assume X does not have a normal distribution.)

True

If Ha is not equals, you have to double the probability of being beyond your test statistic to get your p-value.

Dependent

If X and Y are ______________ then Var(X-Y) is a longer formula involving rho (Greek small r), whose last term doesn't drop out (see formula sheet.)

X-bar is approximately normal if n > 30

If X does not have a normal distribution, what is true about the shape (distribution) of the random variable X-bar?

X-bar is exactly normal for any n

If X has a normal distribution, what is true about the shape (distribution) of the random variable X-bar?

n > 30

If X has any distribution (not normal) then the shape of the sampling distribution of x(bar) is approximately normal as long as _____________ (called the central limit theorem)

Discrete Finite

If X is a binomial random variable then X is also a __________________ random variable.

No

If X is a binomial random variable with n = 15 and p = 0.3, we can use the normal approximation?

.019

If X is a binomial random variable with n = 4 and p = 0.18. What is P(X = 3)?

False (can be greater than 1)

If X is a continuous random variable and its probability density function is f(x), then we know that the values of f(x) must always lie between 0 and 1.

False

If X is binomial with n = 5 and p = 0.05, the chance that X is at least 1 is 0.7738
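Both binomial cards above (P(X = 3) with n = 4, p = 0.18, and "at least 1" with n = 5, p = 0.05) can be checked with the binomial formula. A sketch; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = (n choose x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_three = binomial_pmf(3, n=4, p=0.18)

# "At least 1" is the complement of "none"
p_none = binomial_pmf(0, n=5, p=0.05)
p_at_least_one = 1 - p_none

print(round(p_three, 3))         # 0.019
print(round(p_none, 4))          # 0.7738
print(round(p_at_least_one, 4))  # 0.2262
```

The value 0.7738 in the card is P(X = 0), which is why the statement about "at least 1" is false.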

30

If X is a random variable with a standard deviation of 10, then 3X has a standard deviation equal to

Point is below the line

If a data point has a negative residual, where is the point compared to the line?

Below

If a residual is negative, then that data point lies _________________ the regression line.

False

If everyone at your workplace got a 10% raise, the standard deviation of salaries would decrease.

True

If you are counting the number of RED M&Ms in a bag you have a binomial distribution.

Approximation

If you are using the central limit theorem, your answer is an ____________

False

If you count the number of M&Ms of each color you have a binomial distribution.

CI for u, involving t

If you do not know the population standard deviation but you want to estimate the average number of hours you study per week, what technique do you use?

s

If you don't know the population standard deviation, what do you use as a substitute in your test statistic?

t

If you don't know what σ is, use __________ with n-1 degrees of freedom and s (sample standard deviation)

Type 2 Error

If you failed to reject H0, you could be wrong. H0 could have been false. This mistake is called a _______________. Also known as a missed detection.

Find the 70th percentile

If you have a normal distribution and you want to find the cutoff for the top 30 percent how do you attack the problem?

It decreases

If you increase n, what happens to the margin of error of your confidence interval for p?

It decreases

If you increase the sample size, what happens to the margin of error in a confidence interval?

True

If you reject H0 in a hypothesis test for the mean, the value of µ0 is guaranteed not to be in the corresponding confidence interval. (Assume Ha is "not equal to", alpha = .05, confidence level = 95%)

True

If you reject Ho in a hypothesis test for the mean, the value of µ0 is guaranteed not to be in the corresponding confidence interval. (Assume Ha is "not equal to", alpha = .05, confidence level = 95%).

Type 1 Error

If you reject Ho, which type of error could you be making?

Type 1 Error

If you rejected H0, you could be wrong. Ho could have been true. This mistake is called a _______________. Also known as a false alarm.

No

If you switch X and Y in a regression problem, does the correlation change?

Yes

If you switch X and Y in a regression situation, does the slope of the regression line change?

Z

If you want to estimate the mean, and you KNOW what σ is, use ____________

x-bar +/- t(n-1) * (s / square root of n), where t(n-1) is the t-value with n-1 degrees of freedom

If you want to estimate the population mean and you don't know σ, use the following formula:

Find a confidence interval for µ

If you want to estimate the population mean, which technique do you use?

Confidence interval for p

If you want to estimate the proportion of teenagers who drive a Subaru, which method do you use?

Normal

If your answer is exact, it is a _____________ distribution

1.96

If your confidence interval is 95%, what is the value of Z that goes into the confidence interval formula?

Marginal result (let reader decide)

If your p-value = alpha, then ______________

Fail to reject H0

If your p-value is .10, and the significance level is .05, what do you conclude?

Fail to reject H0

If your p-value is .11, and the significance level is .05, what do you conclude?

False

If your test statistic is negative, so is your p-value.

The value of α

In a hypothesis test, which of the following represents the significance level of the test?

t

In order to use ____________, the population must have a normal distribution or at least a symmetric distribution (or sample size is large enough that the t value is the same as the Z value.)

.0802

In testing the hypotheses H0: µ = 800 vs. Ha: µ ≠ 800, if the value of the test statistic equals 1.75, then the p-value is:
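A minimal sketch of the two-sided p-value calculation for this card, assuming the test statistic is a Z value and using Python's standard-library `statistics.NormalDist` (the variable names are illustrative). The exact result is about .0801; the card's .0802 comes from a Z-table rounded to four decimals.

```python
from statistics import NormalDist

z_stat = 1.75                      # test statistic from the card
std_normal = NormalDist()          # Z distribution: mean 0, standard deviation 1

# Two-sided alternative: double the area in the upper tail beyond |z|.
upper_tail = 1 - std_normal.cdf(abs(z_stat))
p_value = 2 * upper_tail

print(round(p_value, 4))
```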

If the p-value is less than α

In what situation do you reject Ho in a hypothesis test?

Sample Proportion

Is p-hat the sample proportion or the population proportion?

Hypothesis Test

Last year the average delivery time for our packages was 2 days. This year we tried to decrease this time. Are we meeting this goal?

Statistic

Number that describes the sample (e.g., sample mean = x-bar)

Parameter

Number that summarizes the population (e.g., population mean = μ)

False. You can only find a residual where there is a data point. A residual is observed minus predicted: predicted is the value on the line, and observed is the value of the data point.

On a regression line, you can find a residual for any value anywhere on the entire line. True or false?

False

R-squared measures the % of points that lie exactly on the regression line.

H0: μ = 60 and Ha: μ < 60

Researchers determined that 60 Kleenex tissues is the average number of tissues used during a cold. You think it's less than that. Suppose a random sample of 100 Kleenex users used an average number of 54 tissues used during their colds. What are the null and alternative hypotheses in this situation?

np0 and n(1-p0) at least 10

Someone reports 90% of students own an iPad. You believe it's more than that. You take a random sample of 200 students and find 185 own an iPad. What condition(s) do you check to use Z in this hypothesis test to find the test statistic?

Sample

Subset of the population that you select

50

Suppose 20% of Ohio residents support the legalization of marijuana. If you randomly select n people and would like to use the normal approximation to answer questions, what does your sample size have to be, at minimum?

76.4%

Suppose 40% of college students plan to vote in the election. Now suppose we randomly select 200 college students and ask them if they plan to vote. What is the probability that more than 75 of the 200 students sampled plan to vote?
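The 76.4% answer can be reproduced with the normal approximation to the binomial. This is a rough sketch that assumes no continuity correction (which matches the card's answer) and uses only the standard library; the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.40

# Check the conditions first: np and n(1-p) must both be at least 10.
assert n * p >= 10 and n * (1 - p) >= 10

mean_x = n * p                     # np = 80
sd_x = sqrt(n * p * (1 - p))       # square root of np(1-p), about 6.93

# P(X > 75) under the approximating normal distribution.
prob_more_than_75 = 1 - NormalDist(mean_x, sd_x).cdf(75)

print(round(prob_more_than_75, 3))
```

A Z-table lookup with z rounded to -0.72 gives .7642, so small differences in the third decimal are expected.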

True

Suppose Bob makes a 95% confidence interval and his result is (70, 90). True or false: We can tell what Bob's sample mean must have been.

Type 2

Suppose Bob questions whether cereal boxes are being filled correctly, and he believes they are being underfilled. He collects his data, finds his test statistic and p-value, and ends up failing to reject Ho. He didn't have enough evidence to say the cereal boxes were being underfilled. What Type of error could Bob have made?

µ0

Suppose Forbes reports the average starting salary for a business graduate is $70,000 and your sample of 100 randomly selected business graduates has an average of $73,000. What label do you use to describe $70,000 in this problem?

80, 4

Suppose X = number of women in a sample of size 100 from a population that has 80% women and 20% men. X has an approximate normal distribution with mean _____ and standard deviation ______.

No

Suppose X and Y are independent. Is it true that the standard deviation of X+Y equals the standard deviation of X plus the standard deviation of Y?

Reject H0

Suppose X has a normal distribution and you run a hypothesis test with Ho: u = 8 and Ha: u > 8. You collect a random sample of n = 11 values, and you find your test statistic is t = 2.50. What is your decision?

.01 < p value < .025

Suppose X has a normal distribution and you run a hypothesis test with Ho: u = 8 and Ha: u > 8. You collect a random sample of n = 11 values, and you find your test statistic is t = 2.50. What two values does the p-value lie between on the t-table?

Wider

Suppose a 95% confidence interval for the mean is (10, 12) when you know the value of the population standard deviation (sigma). If you HAD NOT known the value of sigma, would your confidence interval have been wider, narrower, or the same?

Considerations for a binomial

1. You have a series of n observations 2. Each observation is independent (not related) 3. Each observation has only two possible categories 4. The probability of success equals p, and is the same for each observation

The two requirements of a continuous random variable probability density function:

1. f(x) >= 0 (Never negative, but can be >1) 2. Total area under the curve = 1 (using integral notation)

P(X > x) _______ P(X >= x) - Not true if x is discrete

=

Uncountably Infinite

A continuous random variable has a(n) __________________________ number of possible values.

The number of heads for n coin flips

Finite

np (1-p)

If X has a binomial distribution with large enough n, X can be approximately modeled by a normal distribution. 1. np >= 10 2. n(1-p)>= 10 σ^2x = __________

Square Root np(1-p)

If X has a binomial distribution with large enough n, X can be approximately modeled by a normal distribution. 1. np >= 10 2. n(1-p)>= 10 σx = __________

np

If X has a binomial distribution with large enough n, X can be approximately modeled by a normal distribution. 1. np >= 10 2. n(1-p)>= 10 μx = __________

1.5 standard deviations below the mean

If an observation has a z-score of -1.5, how many standard deviations is it above or below the mean?

160

If the variance of X is 10, what is the variance of 4X + 20?

Discrete random variables use Σ, while continuous random variables use ____________

Integrals from a to b

True

Let f(x) = 3x^2 for 0 < x < 1. True or false? P(X < 1/2) = P(X <=1/2)

.35

The mean cost of a box of Cheerios Oat Crunch is $4.00 and the variance is .35. If the price increases by 25 cents a box, what is the new variance?

$4.40

The mean cost of a box of Cheerios Oat Crunch is $4.00. If the price increases by 10% what is the new mean?

Notation: μx Formula: μx = Integral from - infinity to positive infinity x*f(x) dx

The mean of a continuous random variable

Which of the statements below is correct? - The standard deviation of X is the square root of the variance of X. - The standard deviation of X is the same as the variance of X. - None of the other choices is correct - The standard deviation of X is the square of the variance of X.

The standard deviation of X is the square root of the variance of X.

The square root of the variance; roughly the typical deviation from the mean (same units as x)

The standard deviation of a discrete random variable

78.4 minutes

The time (X) to complete a standardized exam is approximately normal with a mean of 70 minutes and standard deviation of 10 minutes. How much time should be given to complete the exam so that 80% of the students will complete the exam in the time given?
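Percentile questions like this one run the normal table "backwards": start from the probability and solve for x. A short sketch using the standard library's `NormalDist.inv_cdf` (names are illustrative):

```python
from statistics import NormalDist

exam_time = NormalDist(mu=70, sigma=10)

# 80th percentile: the time x with P(X < x) = 0.80.
z_80 = NormalDist().inv_cdf(0.80)        # about 0.84
cutoff = exam_time.inv_cdf(0.80)         # same as 70 + z_80 * 10

print(round(z_80, 2), round(cutoff, 1))
```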

False

The time to complete a standardized exam is approximately normal with a mean of 70 minutes and a standard deviation of 10 minutes. The percentage of the students which take longer than 80 minutes to complete the exam is 84.13%.

A weighted average of the squared deviations from the mean (units are squared, so they are not directly meaningful)

The variance of a discrete random variable

If X and Y are independent, then True/False: Variance of (X-Y) = Variance of (Y-X)

True

Mean and variance of the binomial distribution

- Mean of X: μx = Σ x*p(x) = np - Variance of X: σ^2x = Σ (x - μx)^2 * p(x) = np * (1-p) - Standard deviation of X: Square Root of np * (1-p)

Rules of means and variances for continuous random variables:

- Same rules for means, variances, and standard deviation apply, whether you have discrete or continuous random variables (E.X: μx+y = μx + μy)

Normal Distribution

- Type of variable: Continuous - Shape: Bell-shaped curve - Center: μx (any number) - Spread: σx (standard deviation; the variance is σx^2)

Characteristics of a continuous random variable:

- X is a continuous random variable if it takes on values that are in an interval on the real number line - Probability is area under a curve - The curve is called a probability density function (pdf) - Noted by f(x) - P(a < x < b)

Binomial probability formula:

- n independent trials - p = probability of success on each trial - X = number of successes - P(x) = probability of x yeses out of n trials - P(x) = (n choose x) * p^x * (1-p)^(n-x)

True

A probability density function f(x) tells you how much probability is in the area near x.

False

Suppose X is a continuous random variable with the pdf defined as below: True or false? The median of X is 1.5

A function that tells you how much probability is in the area near x - Notation: f(x) - Height tells you how much probability there is around or near x (not probability at x)

Continuous Random Variable

________________ requires integration of functions such as polynomials, a constant function, and the exponential function e^-x

Continuous Random Variables

ρxy =

Correlation in the population between x and y

The number of flips until 100 heads from coin flips

Countably Infinite

0

Suppose X is a continuous random variable with the pdf defined as below: What is P(X = 2)?

A _________________ is either finite or countably infinite Finite: X = 1,2,3,......n Countably Infinite: X = 1,2,3,.........

Discrete Random Variable

A list of all possible values of X and how often you expect them to occur (looking into the future) Notation: P(X) Two Requirements: 1. 0 ≤ P(X) ≤ 1 2. Σ P(X) = 1

Probability Distribution of a Discrete Random Variable

109

Let X = time waiting in line at a restaurant to get a seat. Mean of X = 20 min, Variance of X = 25. Let Y = time waiting for your food to arrive. Mean of Y = 30 min, Variance of Y = 36. X and Y are correlated and the correlation is 0.80. What is the variance of the TOTAL TIME waiting?
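The 109 comes from the rule for the variance of a sum of correlated variables, σ^2x+y = σ^2x + σ^2y + 2ρxyσxσy. A small arithmetic sketch (variable names are illustrative):

```python
from math import sqrt

var_x, var_y = 25, 36
rho_xy = 0.80

sd_x, sd_y = sqrt(var_x), sqrt(var_y)    # 5 and 6

# Variance of the total waiting time X + Y when X and Y are correlated.
var_total = var_x + var_y + 2 * rho_xy * sd_x * sd_y

# For comparison: if X and Y were independent, rho would be 0.
var_if_independent = var_x + var_y

print(var_total, var_if_independent)
```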

1/8

Let f(x) = 3x^2 for 0 < x < 1. What is P(X < 1/2)? Choose the closest answer!

3/4

Let f(x) = 3x^2 for 0 < x < 1. What is the mean of X?

.8

Let f(x) = 3x^2 for 0 < x < 1. What is the median of X?
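The three cards on f(x) = 3x^2 can be checked numerically. This is only a sketch: `integrate` is a hypothetical helper using the midpoint rule, not something from the course, and the median uses the fact that the area to the left of x is x^3.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width

def f(x):
    return 3 * x**2                # density on 0 < x < 1

total_area = integrate(f, 0, 1)                     # must be 1 for a legitimate pdf
prob_below_half = integrate(f, 0, 0.5)              # P(X < 1/2)
mean_x = integrate(lambda x: x * f(x), 0, 1)        # integral of x * f(x)

# Median m solves (area to the left of m) = 0.5, i.e. m^3 = 0.5.
median_x = 0.5 ** (1 / 3)

print(round(prob_below_half, 3), round(mean_x, 2), round(median_x, 2))
```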

1/2

Let f(x) = kx where x is between 0 and 2. What is the value of k that makes this a legitimate density function?

Y = ax + b, which means μy = aμx + b

Linear Transformation

A weighted average (expected value) of the possible outcomes; the weights are the probabilities. It is the long-run average you expect, i.e., the population mean - Possible outcomes: x1, x2, ... xk - Probabilities: p1, p2, ... pk Notation: μx Formula: μx = Σ x * p(x)

Mean of a Discrete Random Variable

Which of the following is NOT a discrete random variable? - All of these answers are discrete random variables. - One that is countably infinite. - One that is finite. - One that is uncountably infinite.

One that is uncountably infinite.

A characteristic that you can measure, count, or categorize Notation: X, Y, Z Examples: - X = Number of heads on 2 coin flips - X = Number of customers in a queue at the bank - X = Time it takes to serve a customer at a help desk

Random Variable

- The standard deviation is the square root of the variance - Its units are the same as for X - Not expected to calculate, just set up - Notation: σx

Standard Deviation of a Continuous Random Variable

- The square root of the variance (roughly the typical deviation from the mean) - Notation: σx - Formula: σx = Square Root (σx^2) = Square Root (Σ (x - μx)^2 * p(x)) - Same units as x

Standard deviation of a discrete random variable

1.28

Suppose X = exam score and X has a normal distribution with mean 80 and standard deviation 5. Bob scored at the 90th percentile. What is Bob's Z-value (aka Z-score)?

Square root of (12^2 x 64)

Suppose X = number of 1 foot long steps taken in one hour by a person working in a bank. Suppose the mean of X is 300 and the variance is 64. If you change the units from feet to inches, you multiply by 12. What is the new standard deviation?

False

Suppose X and Y are independent. Then is the following true or false?

2 standard deviations below the mean

Suppose X has a normal distribution with mean 80 and standard deviation 5. How many standard deviations above or below the mean is 70?

86.4

Suppose X has a normal distribution with mean 80 and standard deviation 5. What is the 90th percentile of X?

True

Suppose X has a uniform distribution on the interval [0, 10]. True or false, the mean of X is 5.

1/10

Suppose X has a uniform distribution on the interval [0, 10]. What is f(x)?

False

True or false? When you integrate e^-x, you get e^-2

False

True/False: The mean of x = 2

Notation: σ^2x Formula: σ^2x = the integral from - infinity to positive infinity of (x - μx)^2 * f(x) dx - If it is discrete, σ^2x = Σ (x - μx)^2 * p(x)

Variance of a continuous random variable

A weighted average of the squared deviations from the mean - Notation: σx^2 - Formula: σx^2 = Σ (x - μx)^2 * p(x) - Units are squared, so they are not directly meaningful

Variance of a discrete random variable

.90

What is the probability that X is at least 2?

.006

What is the probability that a normally distributed random variable X is greater than 85 if μx=75 and σx=4?
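A quick check of the .006 answer: standardize 85 to a Z value, then take the area to the right. The sketch below uses the standard library's `NormalDist`; the names are illustrative.

```python
from statistics import NormalDist

mu_x, sigma_x = 75, 4

z = (85 - mu_x) / sigma_x                  # (x - μx) / σx = 2.5
prob_above_85 = 1 - NormalDist().cdf(z)    # area to the right of z

print(z, round(prob_above_85, 3))
```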

.25

What is the variance of X?

-1.28

Where is the 10th percentile of the Z distribution?

Fail to reject H0

If your p-value > alpha, then ______________

Quantitative

The type of data for means:

yes/no

The type of data for proportions:

About population mean

The type of question for means:

Proportion of yeses

The type of question for proportions:

False

The value of α (alpha) will change if you get a new data set.

True

The variance of X+Y equals the variance of X plus the variance of Y if X and Y are independent.

Mu with subscript X-bar

What is the symbol we use to represent the mean of the random variable X-bar?

Normal approximation to binomial

If n is large enough we can use a Z value. Where did this idea come from?

Reject H0

If your p-value < alpha, then ______________

It stays the same

As n increases, what happens to μx-bar

Increases

As the population standard deviation increases, the margin of error:

Population Standard Deviation (σx)

As your _______________ increases for a confidence interval, your margin of error increases (if the population is diverse, sample results will be diverse)

σx(bar)

As your sample size increases, your ____________ decreases

σx(bar)

As your σx increases, your _____________ increases

True

Before working with percentages in confidence intervals and hypothesis tests for p, change them to proportions by dividing by 100, then put the proportions in the formulas.

b1 = r * (Sy / Sx) - r: correlation - Sy: standard deviation of y values - Sx: Standard deviation of x values

Best slope:

b0 = y-bar - b1*x-bar - y-bar: mean of y values - x-bar: mean of x values - b1: slope

Best y-intercept:
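The slope and intercept formulas on the two cards above can be tried on a tiny data set. The data here is hypothetical, made up purely for illustration; only the formulas b1 = r * (Sy / Sx) and b0 = y-bar - b1*x-bar come from the cards.

```python
from statistics import mean, stdev

# Hypothetical (x, y) data, invented for this example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)              # sample standard deviations

# Sample correlation r.
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b1 = r * (s_y / s_x)                       # best slope
b0 = y_bar - b1 * x_bar                    # best y-intercept

print(round(r, 3), round(b1, 2), round(b0, 2))
```

Swapping the roles of x and y in this sketch leaves r unchanged but changes the slope, which matches the cards on switching X and Y.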

.04

What is the margin of error if we find a 95% confidence interval for the percentage of Corvettes on the road, if your random sample of 100 cars contains 5 Corvettes? Choose the closest answer.
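The .04 answer follows from the margin of error formula for a proportion, Z * square root of (p-hat(1 - p-hat)/n). A minimal sketch with illustrative variable names; note that with only 5 Corvettes in the sample, the n*p-hat >= 10 condition is not actually met, so the sketch just reproduces the card's arithmetic.

```python
from math import sqrt
from statistics import NormalDist

n = 100
p_hat = 5 / n                              # sample proportion = 0.05

z_star = NormalDist().inv_cdf(0.975)       # about 1.96 for 95% confidence
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

print(round(margin_of_error, 3))
```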

Because of the Central Limit Theorem

The formula for a confidence interval for p involves a Z-value. Why is this? (Assume n is large.)

False

The formula for the margin of error for a confidence interval for a proportion is Z*sigma/(sqrt n).

MOE = 1 / Square root of n

The formula for the quick and dirty method to estimate the margin of error:

t-distribution

The larger n is, the closer the __________________ looks to the one and only z-distribution

True

The larger n is, the closer the t distribution looks to the Z distribution.

False

The margin of error increases if the sample size increases.

True

The margin of error is larger if the standard deviation of the population increases (assume all else stays the same.)

True

The margin of error of a confidence interval gets larger if the confidence level increases (assume all else stays the same.)

False. The rule for means always holds, whether or not X and Y are independent.

The mean of X+Y = mean of X + mean of Y ONLY when X and Y are independent.

y = b0 + b1x

The model/general equation to find the best-fitting line:

.0463

The national proportion of adults who are concerned about nutrition is 0.4. You take a random sample of 10 adults and count the number who are concerned about nutrition (call it X). What is the probability that X is less than 2?
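With n = 10 the normal approximation does not apply (np = 4 is less than 10), so this card uses the exact binomial formula. A short sketch with `math.comb`; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = (n choose x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.4

# "Less than 2" means X = 0 or X = 1.
prob_less_than_2 = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)

print(round(prob_less_than_2, 4))
```

The exact value rounds to .0464; the card's .0463 comes from adding table entries that were already rounded.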

np and n(1 - p) are both greater than or equal to 10

The normal distribution is used to approximate a binomial distribution only if:

Null Hypothesis

The original claim about the parameter (the one you believe is incorrect) H0: μx = μ0

True

The p-value will change if you get a new data set.

He would be wrongly accusing the pizza place of taking too long to deliver pizzas.

The pizza place down Bob's street says they deliver their pizzas in 30 minutes or less. Bob thinks it takes longer. Bob conducts a hypothesis test. Suppose he makes a Type I error. What is the impact of this error?

Significance Level

The pre-set cutoff value you determine before you collect any data

Estimating p

The proportion of yesses in the population = _____________ - Use a range of likely values: Confidence Interval - Sample statistic +/- Margin of Error - Sample statistic +/- Certain Number of SEs - SE Stands for: Standard Error

Sampling distribution of X-bar

The set of all possible sample means from all possible samples of size n from the population is known as the:

x-bar

The statistic for means:

False

The t distribution is taller and thinner (more concentrated around the mean) than the Z distribution.

0

The t-distribution has a mean of ____________

True

The t-distribution is symmetric

Normal

The t-distribution looks more and more like a _____________ distribution as the degrees of freedom increase.

True

The t-distribution looks more and more like the Z distribution as n gets larger and larger.

z = (x-bar - µ0) / (σ / square root n)

The test statistic for means:

1. n*p-hat >= 10 2. n(1 - p-hat) >= 10

The two conditions that must be met in order to estimate p:

1.645

To be 90% confident add/subtract ____________ standard errors

1.96

To be 95% confident add/subtract ____________ standard errors

2.58

To be 99% confident add/subtract ____________ standard errors
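The three multipliers above are the Z values that leave half of the leftover probability in each tail. A brief sketch with the standard library (`z_star` is an illustrative helper name):

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Z value with (1 - confidence_level) / 2 area in each tail."""
    tail_area = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail_area)

z_90, z_95, z_99 = z_star(0.90), z_star(0.95), z_star(0.99)

print(round(z_90, 3), round(z_95, 2), round(z_99, 2))
```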

tn-1 = (x-bar - μ0) / (s / square root n)

To find the test statistic when you don't know σ, use this formula for a t-test:

z = (x-bar - μx) / (σx/square root n)

To test H0, use this formula:

Z and t can both be negative

True or False? Z and t can both take on negative numbers.

False

True or False? x-bar = µx

False

True or false? µx - µy = µy - µx when X and Y are independent

1. As your sample size increases, your σx(bar) decreases 2. As your σx increases, your σx(bar) increases

What affects the standard error and how in a sampling distribution of x bar

- Two quantitative variables only - Linear relationships only - Has no units (unit-free) - Switching x and y doesn't matter, you get the same correlation - Affected by outliers and skewness because it's based on the mean and standard deviation

What are the properties of correlation?

np and n(1-p) at least 10

What conditions need to be checked before you use Z to solve a binomial problem with a large n?

n*p-hat and n(1 - p-hat) must both be at least 10

What conditions need to be met in order to use the Z value in the formula for the confidence interval for a proportion? (Remember the data is binomial, why can we use Z?)

np0 and n(1-p0) must both be at least 10

What conditions need to be met in order to use the Z value in the formula for the test statistic in a hypothesis test for a proportion? (Remember the data is binomial, why can we use Z?)

Shape only

What element of X-bar is addressed by the Central limit theorem, the shape, mean, or standard deviation?

It goes to 0

What happens to the standard error of X-bar as n increases to infinity?

No meaningful units

Let X = time waiting in line at a restaurant to get a seat in minutes. What are the units of the variance of X? Choose the best answer.

