Stats Final
Median of a continuous random variable:
The equal-areas point: the point that divides the area under the probability distribution in half. It is the 50th percentile of the probability distribution.
Mean of a continuous random variable:
the point at which its probability distribution would balance if made of solid material.
Likely sources of error
Respondents may lie; older or newer technology reaches only certain age groups (e.g., landlines); people with strong opinions may be overrepresented; and in a survey, question wording can sway results. The MOE does not account for bias, only natural sampling variability.
Skewed left density curve
The long left tail pulls the mean to the left of the median. B (the median) is the equal-areas point of the distribution. The mean will be less than the median because of the left-skewed shape.
Skewed right density curve
The long right tail pulls the mean to the right of the median. B (the median) is the equal-areas point of the distribution. The mean will be greater than the median because of the right-skewed shape.
Symmetric density curve
Mean and median are equal
Calculating prob with p̂
z = (p̂ − p) / σp̂, where σp̂ = √(p(1 − p)/n). Sketch the normal model, then find P(z > z-score) or P(z < z-score).
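As a rough illustration of the z-score calculation for p̂, here is a minimal Python sketch. The function name `prob_phat_above` and the numbers (p = 0.50, n = 100, p̂ = 0.60) are made up for the example, and it assumes the Large Counts condition holds so that a normal model is reasonable.

```python
from math import sqrt
from statistics import NormalDist

def prob_phat_above(p, n, phat):
    """Approximate P(sample proportion > phat) using a normal model.

    Assumes np >= 10 and n(1 - p) >= 10 (Large Counts condition).
    """
    sd = sqrt(p * (1 - p) / n)      # standard deviation of the sampling distribution of p-hat
    z = (phat - p) / sd             # z-score for the observed sample proportion
    return 1 - NormalDist().cdf(z)  # area to the right of z

# Hypothetical numbers: p = 0.50, n = 100, observed p-hat = 0.60 (z = 2)
print(round(prob_phat_above(0.50, 100, 0.60), 4))  # about 0.0228
```

For a "less than" probability, use the area to the left instead, `NormalDist().cdf(z)`.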
Calculating prob with x̅
z = (x̄ − μ) / σx̄, where σx̄ = σ/√n. Sketch the normal model, then find P(z > z-score) or P(z < z-score).
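The same idea for a sample mean can be sketched in Python. The function name `prob_xbar_below` and the numbers (μ = 100, σ = 15, n = 25, x̄ = 94) are hypothetical, and the sketch assumes the sampling distribution of x̄ is approximately normal (normal population or n ≥ 30).

```python
from math import sqrt
from statistics import NormalDist

def prob_xbar_below(mu, sigma, n, xbar):
    """Approximate P(sample mean < xbar) using a normal model."""
    sd = sigma / sqrt(n)        # standard deviation of the sampling distribution of x-bar
    z = (xbar - mu) / sd        # z-score for the observed sample mean
    return NormalDist().cdf(z)  # area to the left of z

# Hypothetical numbers: mu = 100, sigma = 15, n = 25, observed x-bar = 94 (z = -2)
print(round(prob_xbar_below(100, 15, 25, 94), 4))  # about 0.0228
```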
How to calculate μx̅
μx̄ = μ (the mean of the sampling distribution of x̄ equals the population mean)
x̄ (the sample mean) estimates...
μ (the population mean)
Population Parameter
μ (the population mean) p (the population proportion) σ (the population SD)
Mean of p̂
μp̂ = p
Mean of the sampling distribution of X
μX = np
s (the sample SD) estimates...
σ (the population SD)
The standard deviation of the sampling distribution of X
σX = √(np(1 − p)), or equivalently √(npq), where q = 1 − p
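A quick way to check the mean and standard deviation of a sample count X is to compute them directly. This is a small sketch with a hypothetical helper name (`count_mean_sd`) and made-up values n = 100, p = 0.2.

```python
from math import sqrt

def count_mean_sd(n, p):
    """Mean and standard deviation of the sample count X."""
    mean = n * p                # mean of X is np
    sd = sqrt(n * p * (1 - p))  # SD of X is sqrt(np(1 - p))
    return mean, sd

# Hypothetical numbers: n = 100, p = 0.2 gives a mean near 20 and an SD near 4
mean, sd = count_mean_sd(100, 0.2)
print(round(mean, 2), round(sd, 2))
```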
How to Interpret a Confidence Interval
"We are C% confident that the true proportion of [parameter in context] lies between ____ and ____."
Based on your answer to probability question, what should __ conclude if they select an SRS of size n= and finds p̂= ?
(answer question in context) Based on the sample, if the true proportion of __ was p, the likelihood of getting a sample proportion of __ or more/less is only (probability you found) by chance alone.
Density curve validity:
- Is always on or above the horizontal axis - Has an area of exactly 1 underneath it
How to Reduce MOE and drawbacks:
1. Decrease the confidence level: this reduces the MOE, but it also reduces the likelihood that the interval captures the population parameter. 2. Increase the sample size: this reduces the MOE, but a larger sample takes longer to collect and costs more money to obtain.
How to Check the Normal/Large Sample Condition
1. The data come from a normally distributed population. 2. The sample size is large (n ≥ 30). 3. When the sample size is small and the shape of the population distribution is unknown, a graph of the sample data shows no strong skewness or outliers. Make sure to sketch in the PLAN step.
How to Find t*
1. Using Table B, find the correct confidence level at the bottom of the table. 2. On the left side of the table, find the correct number of degrees of freedom (df). For this type of confidence interval, df = n − 1. 3. In the body of the table, find the value of t* that corresponds to the confidence level and df. 4. If the correct df isn't listed, use the greatest df available that is less than the correct df. On the calculator: menu 6-5-6, with area = the % in the tails and degrees of freedom = n − 1 (ignore the negative sign on the result).
Finding probabilities involving x̄.
1. Justify that the sampling distribution of x̄ is approximately normal (normal population or CLT). 2. Find the mean and standard deviation. 3. Find the z-score. 4. Sketch the model and shade. 5. Write the probability notation.
90% Confidence interval
1.645
95% Confidence interval
1.96
Typical values
Within 2 standard deviations of the mean
98% Confidence interval
2.33
99% Confidence interval
2.576
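The z* critical values on these cards (1.645, 1.96, 2.33, 2.576) can be reproduced with Python's standard library. This is only a sketch; the helper name `z_star` is made up for the example.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* with `confidence` of the standard normal area between -z* and z*."""
    tail = (1 - confidence) / 2             # area in each tail
    return NormalDist().inv_cdf(1 - tail)   # z-value with that much area to its right

for c in (0.90, 0.95, 0.98, 0.99):
    print(f"{c:.0%}: z* = {z_star(c):.3f}")
# 90%: z* = 1.645
# 95%: z* = 1.960
# 98%: z* = 2.326
# 99%: z* = 2.576
```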
Confidence interval
A confidence interval gives an interval of plausible values for a parameter.
Point estimate
A point estimate is a single-value estimate of a population parameter.
Unbiased estimator
A statistic used to estimate a parameter is an unbiased estimator if the mean of its sampling distribution is equal to the value of the parameter being estimated.
Calculating point estimate & MOE (given interval):
Point estimate = (lower bound + upper bound) / 2. MOE = point estimate − lower bound.
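A tiny sketch of recovering the point estimate and MOE from an interval. The function name `from_interval` and the interval (0.56, 0.62) are hypothetical.

```python
def from_interval(lower, upper):
    """Recover the point estimate and margin of error from a confidence interval."""
    point = (lower + upper) / 2  # midpoint of the interval
    moe = point - lower          # distance from the midpoint to either bound
    return point, moe

# Hypothetical interval: 0.56 to 0.62 gives a point estimate near 0.59 and an MOE near 0.03
point, moe = from_interval(0.56, 0.62)
print(round(point, 2), round(moe, 2))
```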
Describe the shape of the sampling distribution of x̄ for SRSs of size n = 2 from the population of pennies. Justify your answer.
Because n = 2 < 30, the sampling distribution of x̄ will be skewed to the left, but not quite as strongly as the population.
Describe the shape of the sampling distribution of x̄ for SRSs of size n = 50 from the population of pennies. Justify your answer.
Because n = 50 ≥ 30, the sampling distribution of x̄ will be approximately normal by the central limit theorem.
Justify that the sampling dist. of p̂ is approximately normal.
Because np≥10 and nq≥10, the sampling distribution of p̂ is approximately normal. Therefore, it is appropriate to use a normal model.
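The Large Counts check is simple enough to write as a one-line test. A minimal sketch, with a hypothetical function name and made-up inputs:

```python
def large_counts_ok(n, p):
    """True when both np and n(1 - p) are at least 10."""
    return n * p >= 10 and n * (1 - p) >= 10

print(large_counts_ok(100, 0.5))  # True: np = 50 and nq = 50
print(large_counts_ok(50, 0.1))   # False: np = 5 is less than 10
```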
Based on prob that you found for x̅, what would you conclude about the mean ___ for (units)?
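If the result is unlikely to happen by chance alone (probability < 5%), we have convincing evidence that the mean ___ of all (units) is greater than or less than (value). If the result is likely, we do not have convincing evidence.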
DO
Calculations and MOE: do them on the calculator to be safe, but write out your work and the decimals you use.
Increasing sample size:
Confidence interval becomes narrower because increasing the sample size decreases MOE
Increasing confidence level:
Confidence interval becomes wider because increasing the confidence level increases MOE
Central Limit Theorem
Draw an SRS of size n from any population with mean μ and finite standard deviation σ. The central limit theorem (CLT) says that when n is large, the sampling distribution of the sample mean is approximately normal.
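One way to see the CLT in action is a small simulation. The sketch below assumes a made-up population (exponential with mean 1 and SD 1, which is strongly skewed right) and draws many samples of size n = 50. The function name, seed, and number of repetitions are arbitrary choices for illustration.

```python
import random
from statistics import mean, stdev

def simulate_sample_means(n, reps=2000, seed=1):
    """Draw `reps` samples of size n from a right-skewed population; return the sample means."""
    rng = random.Random(seed)
    # Assumed population: exponential with mean 1 and SD 1 (strongly skewed right).
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

means = simulate_sample_means(n=50)
# The sample means should center near the population mean (1)
# with a spread near sigma / sqrt(n) = 1 / sqrt(50), about 0.14.
print(round(mean(means), 2), round(stdev(means), 2))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.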
Describe the shape of the sampling distribution of x̄ for SRSs of size n=60 from this population of movies. Justify your answer.
Even though the population distribution is strongly skewed right, the sampling distribution of x̄ is approximately normal based on the CLT since n = 60 ≥ 30.
Density Curve
Find and interpret P(x < 1): A = ½(1)(0.5) = 1/4. There is a 25% chance that an enemy appears in less than 1 minute.
Interpret the standard deviation
If ___ took many samples of size n, the number of ___ would typically vary by about ___ from the mean of __.
How to Interpret a Confidence Level
If we were to select many random samples from a population and construct a C% confidence interval using each sample, about C% of the intervals would capture the [parameter in context]. Example: If the Pew Project took many random samples of U.S. adults and constructed a 95% confidence interval using each sample, about 95% of these intervals would capture the true proportion of all U.S. adults who use Twitter or another service to share updates about themselves or see updates about others.
Interpret the standard deviation of x̄ (σx̄)
In SRSs of size n, the sample mean number of ___ will typically vary by about ___ from the population mean of ___.
Interpret Standard deviation of p̂
In SRSs of size n, the sample proportion of ___ will typically vary by about ___ from the true proportion p = ___.
There is one dot on the graph at p̂ = .38 . Explain what this dot represents.
In one SRS of size n, 38% of __ were ___.
Practical consequence of increasing sample size:
More reliable and precise - closer to the true mean
Conditions met/not met example:
Not met: No. The sample size is small and the (graph) shows strong left skewness and possible outliers. Met: Yes. Although the sample size is small, the (graph) shows no strong skewness or outliers.
A Pew Research Center poll asked 1102 12- to 17-year-olds in the United States if they have a cell phone. Of the respondents, 71% said "Yes."
Population: all 12- to 17-year-olds in the United States. Parameter: p = the proportion of all 12- to 17-year-olds with cell phones. Sample: the 1,102 12- to 17-year-olds contacted. Statistic: the sample proportion with a cell phone, p̂ = 0.71.
Tom is roasting a large turkey breast for a holiday meal. He wants to be sure that the turkey is safe to eat, which requires a minimum internal temperature of 165°F. Tom uses a thermometer to measure the temperature of the turkey breast at four randomly chosen points. The minimum reading he gets is 170°F.
Population: all possible locations in the turkey breast. Parameter: the true minimum temperature in all possible locations. Sample: the four randomly chosen locations. Statistic: the sample minimum, 170°F.
PLAN
Random and Large Counts conditions met; proceed with a one-sample z-interval for p.
How to Check the Conditions for Constructing a Confidence Interval for p
Random: The data come from a random sample from the population of interest. Large Counts: both np̂ and nq̂ are at least 10. Conditions met; proceed with a one proportion z-interval.
Four - Step Process: Confidence Intervals
STATE: State the parameter you want to estimate and the confidence level. PLAN: Identify the appropriate inference method and check conditions. DO: If the conditions are met, perform calculations. CONCLUDE: Interpret your interval in the context of the problem.
How to Calculate Sample Size for a Desired Margin of Error
Set the MOE formula less than or equal to the desired ME and solve the inequality for n, where p̂ is a guessed value for the sample proportion. The margin of error will always be less than or equal to ME if you use p̂ = 0.5. In general, we round up to the next highest integer when solving for sample size to make sure the margin of error is less than or equal to the desired value.
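Solving z*·√(p̂(1 − p̂)/n) ≤ ME for n gives n ≥ (z*/ME)²·p̂(1 − p̂). A short sketch of that calculation, with a hypothetical function name and a made-up target of ME = 0.03 at 95% confidence:

```python
from math import ceil
from statistics import NormalDist

def sample_size(me, confidence=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1 - p) / n) <= me, using a guessed p."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    n = (z / me) ** 2 * p_guess * (1 - p_guess)         # inequality solved for n
    return ceil(n)                                      # always round up

# Hypothetical target: ME = 0.03 at 95% confidence with the conservative guess 0.5
print(sample_size(0.03))  # 1068
```

Using `p_guess=0.5` is the conservative choice because p̂(1 − p̂) is largest there.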
Justification for normal models with sample means:
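Since the population follows a normal distribution, the sampling distribution of x̄ is normal for any sample size, so a sample size of 15 is large enough. (n ≥ 30 is needed only when the population distribution is not normal.)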
The Large Counts condition
Suppose X is the number of successes in a random sample of size n from a population with proportion of successes p. The Large Counts condition says that the distribution of X will be approximately normal when np ≥ 10 and n(1 − p) ≥ 10 (that is, nq ≥ 10).
The Large Counts condition
Suppose p̂ is the proportion of successes in a random sample of size n from a population with proportion of successes p. The Large Counts condition says that the distribution of p̂ will be approximately normal when np≥ 10 and nq≥10
Normal/Large Sample condition
The Normal/Large Sample condition says that the distribution of x̄ will be approximately normal when either of the following is true: • The population distribution is approximately normal. This is true no matter what the sample size n is. • The sample size is large. If the population distribution is not normal, the sampling distribution of x̄ will be approximately normal in most cases if n ≥ 30.
Confidence level
The confidence level C gives the long-run success rate of confidence intervals calculated with C % confidence. That is, in C % of all possible samples, the interval computed from the sample data will capture the true parameter value.
Describe the shape of the sampling distribution of x̄ for SRSs of size n=10 from this population of movies. Justify your answer.
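Because n = 10 < 30, the CLT does not apply. The population distribution is strongly skewed right, so the sampling distribution of x̄ will also be skewed right, but not as strongly as the population.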
Margin of error
The margin of error of an estimate describes how far, at most, we expect the estimate to vary from the true population value. That is, in a C% confidence interval, the distance between the point estimate and the true parameter value will be less than the margin of error in C% of all samples.
Decreasing sampling variability
The sampling distribution of any statistic will have less variability when the sample size is larger.
What would happen to the sampling distribution of the sample mean if the sample size were n = 50 instead? Justify. What is the practical consequence of this change in sample size?
The sampling distribution of the sample mean will be more variable because the sample size is smaller. The estimated mean ___ will typically be farther away from the true mean ___. In other words, the estimate will be less precise.
Sampling distribution of the sample proportion p̂
The sampling distribution of the sample proportion p̂ describes the distribution of values taken by the sample proportion p̂ in all possible samples of the same size from the same population.
CONCLUDE
We are C% confident that the interval from _ to _ captures the true proportion of ___.
Interpreting confidence intervals
We are __% confident that the true population parameter of (context) lies between (lower bound) and (upper bound).
STATE
We want to estimate p= the proportion of _______ at a C% confidence level.
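Convincing evidence?
Yes or No: check whether the value they give lies below, within, or above the interval you calculated. A value inside the interval is plausible, so there is no convincing evidence against it; a value outside the interval is not plausible. Answer the question in context too.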
How to Calculate a Confidence Interval for μ
When the Random and Normal/Large Sample conditions are met, a C% confidence interval for the unknown mean μ is x̄ ± t* · s/√n, where t* is the critical value for a t distribution with df = n − 1 and C% of its area between −t* and t*.
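A one-sample t interval, x̄ ± t*·s/√n, can be sketched as follows. The data are made up, the function name `t_interval` is hypothetical, and t* is passed in by hand (looked up in Table B) because Python's standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_star):
    """Confidence interval for mu: x-bar +/- t* * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)            # sample mean
    se = stdev(data) / sqrt(n)   # standard error of x-bar, s / sqrt(n)
    moe = t_star * se            # margin of error
    return xbar - moe, xbar + moe

# Hypothetical sample of n = 10 values; t* = 2.262 is the Table B value for 95% and df = 9
data = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
low, high = t_interval(data, t_star=2.262)
print(round(low, 3), round(high, 3))  # about 12.369 and 14.631
```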
Standard error of x̄
SEx̄ = s/√n: an estimate of the standard deviation of the sampling distribution of x̄. It estimates how much x̄ typically varies from μ.
Unusual values
anything more than 2 standard deviations away from the mean
MOE
critical value × standard deviation of the statistic (or its standard error)
sampling distribution of the sample count X
describes the distribution of values taken by the sample count X in all possible samples of the same size from the same population.
Sampling distribution of the sample mean x̅
describes the distribution of values taken by the sample mean x̄ in all possible samples of the same size from the same population.
sampling distribution of a statistic
is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
Unbiased estimator:
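The mean of the statistic's sampling distribution equals the value of the parameter being estimated (for example, the mean of x̄ equals the population mean).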
Bias
means that our aim is off and we consistently miss the bullseye in the same direction. That is, our sample values do not center on the population value.
High Variability
means that repeated shots are widely scattered on the target. In other words, repeated samples do not give very similar results.
When it is not appropriate to use a normal model:
When np ≥ 10 and n(1 − p) ≥ 10 (that is, nq ≥ 10) are not both true. Example: It is not appropriate to use a normal distribution model because np is less than 10.
p̂ (the sample proportion) estimates...
p (the population proportion)
Normal distribution
symmetric, single-peaked, bell-shaped density curve. Any normal distribution is completely specified by two numbers: its mean μ and standard deviation σ.
Calculating confidence interval for p
p̂ ± z* · √(p̂(1 − p̂)/n), where z* is the critical value for the standard normal curve with C% of its area between −z* and z*. Or use the calculator: menu 6-6-5.
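The one-proportion z interval, p̂ ± z*·√(p̂(1 − p̂)/n), can be sketched in Python. The example reuses the Pew poll numbers from the cell phone card (n = 1102, p̂ = 0.71); the 95% confidence level and the function name are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(phat, n, confidence=0.95):
    """Confidence interval for p: p-hat +/- z* * sqrt(p-hat(1 - p-hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    se = sqrt(phat * (1 - phat) / n)                    # standard error of p-hat
    moe = z * se                                        # margin of error
    return phat - moe, phat + moe

# Pew poll card: n = 1102 and p-hat = 0.71; the 95% level is assumed here
low, high = one_prop_z_interval(0.71, 1102)
print(round(low, 3), round(high, 3))  # about 0.683 and 0.737
```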
Sample Statistics
x̄ (the sample mean) p̂ (the sample proportion) s (the sample SD)
How to Check the Conditions for Constructing a Confidence Interval for μ
•Random: The data come from a random sample from the population of interest. •Normal/Large Sample: The data come from a normally distributed population or the sample size is large (n ≥ 30).