STAT 104 - Chapter 3

How Bootstrapping works:

- A random sample is representative of the population
- Use the sample as a proxy for the population
- Draw re-samples (bootstrap samples) from the sample!

Importance of Random Sampling

- Gives us representative samples
- If we take random samples, our sampling distribution will center around the population parameter
- If we do NOT take random samples, our samples are likely biased and the sampling distribution may not be centered around the population parameter!

How is the confidence LEVEL interpreted?

95% of all samples yield intervals that contain the true parameter.

Sample Variability

CASES within a sample vary. This variability can be summarized using the standard deviation.

Sample Distribution

Describes ONE sample

Sampling Distribution

Describes statistics for MANY samples

Margin of Error - Why 2?

In a bell-shaped sampling distribution, about 95% of statistics will be within 2*SE of the true parameter value

Statistics vs. Parameters

Use Statistics (known) to estimate Parameters (usually unknown)

Population Proportion

p

Confidence Interval Equation

statistic +/- 2*(Standard Error)
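
As a worked illustration of the formula above, here is a minimal sketch. The function name `confidence_interval_95` and the numbers (a sample proportion of 0.42 with a standard error of 0.03) are made up for the example and do not come from the course.

```python
def confidence_interval_95(statistic, standard_error):
    """Return (lower, upper) for statistic +/- 2 * (standard error)."""
    margin_of_error = 2 * standard_error
    return statistic - margin_of_error, statistic + margin_of_error


# Hypothetical values: sample proportion 0.42, standard error 0.03
lower, upper = confidence_interval_95(0.42, 0.03)
print(f"{lower:.2f} to {upper:.2f}")
```

With these made-up values the margin of error is 2 * 0.03 = 0.06, so the interval runs from 0.36 to 0.48.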

Common MISINTERPRETATIONS of confidence intervals include:

- "A 95% confidence interval contains 95% of the data in the population."
- "I am 95% sure that the mean of a sample will fall within a 95% confidence interval for the mean."
- "95% of all sample means will fall within this 95% confidence interval."
- "The probability that the population parameter is in this particular 95% confidence interval is 0.95."

Sampling with Replacement

- Many times it is not possible to sample repeatedly from the actual population...
- But we can sample repeatedly from the sample!
- To get statistics that vary, sample with replacement (each case can be selected more than once)
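
The difference between the two ways of drawing can be sketched in a few lines. The five-value `sample` and the fixed seed are assumptions chosen only to make the example repeatable.

```python
import random

rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable

sample = [3, 7, 8, 12, 15]  # hypothetical original sample

# WITH replacement: each case can be selected more than once,
# so the resample (and its mean) can differ from the original.
resample = rng.choices(sample, k=len(sample))

# WITHOUT replacement at the full sample size: this only reorders the
# same cases, so a statistic such as the mean would never change.
shuffled = rng.sample(sample, k=len(sample))

print("with replacement:   ", resample)
print("without replacement:", shuffled)
```

Because the second draw always contains exactly the original cases, it cannot produce statistics that vary, which is why the bootstrap draws with replacement.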

Standard Error

- Measures how much the statistic varies from sample to sample
- It is the typical (roughly average) distance from the statistic to the parameter
- It is calculated in the same way as the standard deviation, which was the typical (roughly average) distance from an observation to the mean

Margin of Error

- Reflects the precision of the sample statistic as a point estimate for the parameter
- One form of margin of error for a 95% confidence interval is: Margin of error = 2*(standard error)

Sampling Distribution - Summary

- A sampling distribution is a collection of many statistics from a population, each computed from a sample of the same size, n
- Its width is measured by the standard error
- Larger sample size = smaller standard error
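
The effect of sample size on the standard error can be checked by simulation. This is a rough sketch: the uniform `population`, the seed, and the helper name `standard_error` are all assumptions made for the illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed (assumption) for repeatability

# Hypothetical population; in practice the population is rarely available.
population = [rng.uniform(0, 100) for _ in range(20_000)]


def standard_error(n, num_samples=1000):
    """Standard deviation of the sample means over many samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]
    return statistics.stdev(means)


se_n25 = standard_error(25)
se_n100 = standard_error(100)
print(f"SE with n=25:  {se_n25:.2f}")
print(f"SE with n=100: {se_n100:.2f}")
```

For a sample mean, quadrupling the sample size should cut the standard error roughly in half, so the second value is expected to be about half of the first.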

Steps to create a Sampling Distribution

- Suppose we have a population
- Take a random sample and compute a statistic
- Take a different random sample and compute a statistic
- Take another random sample and compute a statistic (and so on)...
- Graph the statistics of many samples, and we get our sampling distribution
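
These steps can be sketched as a short simulation. The normally distributed `population`, the choice of the mean as the statistic, and the function name `sampling_distribution` are assumptions for illustration. The graphing step is left out; the center and spread are printed instead.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed (assumption) for repeatability

# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]
parameter = statistics.fmean(population)


def sampling_distribution(population, n, num_samples, rng):
    """Take many random samples of size n and record each sample mean."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]


sample_means = sampling_distribution(population, n=40, num_samples=2000, rng=rng)

center = statistics.fmean(sample_means)          # should sit near the parameter
standard_error = statistics.stdev(sample_means)  # spread of the statistics

print(f"parameter: {parameter:.2f}")
print(f"center of sampling distribution: {center:.2f}")
print(f"standard error: {standard_error:.2f}")
```

Passing `sample_means` to any histogram function would complete the last step.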

What is Bootstrapping?

- Suppose you have a population and take one random sample, Sample A.
- If a sample is randomly selected, it should be representative of your population.
- Now, imagine we take Sample A and make many copies of it. We can consider this a guess of what our population looks like.
- Then, we can take many re-samples from our "guess of what our population looks like."
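
A minimal sketch of this procedure, assuming the statistic of interest is the mean. Drawing with replacement from Sample A stands in for drawing from "many copies" of it. The data in `sample_a`, the seed, and the function name `bootstrap_statistics` are hypothetical.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed (assumption) for repeatability

# Hypothetical original sample ("Sample A") with n = 30 cases.
sample_a = [rng.gauss(100, 15) for _ in range(30)]
sample_statistic = statistics.fmean(sample_a)


def bootstrap_statistics(sample, num_bootstrap_samples, rng, stat=statistics.fmean):
    """Resample with replacement (same size as the sample) and record the statistic."""
    n = len(sample)
    return [stat(rng.choices(sample, k=n)) for _ in range(num_bootstrap_samples)]


boot_means = bootstrap_statistics(sample_a, 5000, rng)
boot_center = statistics.fmean(boot_means)

print(f"sample statistic: {sample_statistic:.2f}")
print(f"center of bootstrap distribution: {boot_center:.2f}")
```

The two printed values should be close, since a bootstrap distribution is centered around the sample statistic rather than the population parameter.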

Point Estimate

- The sample statistic of interest is a point estimate for a population parameter
- It will not necessarily equal the parameter
- Referred to as the "best estimate" from the sample

Center of a bootstrap distribution

- The sampling distribution is centered around the population parameter
- The bootstrap distribution is centered around the sample statistic

Standard Error from a bootstrap distribution

- The variability of the bootstrap statistic is similar to the variability of the sample statistic in a sampling distribution
- The standard error of a statistic can be estimated using the standard deviation of the bootstrap distribution!
- Note that the standard deviation of a bootstrap distribution is called a standard error
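
The sketch below estimates a standard error as the standard deviation of a bootstrap distribution. The `sample` data and seed are hypothetical. As a sanity check only, it also prints the textbook formula for the standard error of a mean, s / sqrt(n), which is not part of these notes.

```python
import math
import random
import statistics

rng = random.Random(3)  # fixed seed (assumption) for repeatability

sample = [rng.gauss(100, 15) for _ in range(50)]  # hypothetical sample
n = len(sample)

# Bootstrap distribution of the mean: resample with replacement, same size n.
boot_means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(5000)]

# Standard error = standard deviation of the bootstrap distribution.
bootstrap_se = statistics.stdev(boot_means)

# Sanity check (outside these notes): formula-based SE for a sample mean.
formula_se = statistics.stdev(sample) / math.sqrt(n)

print(f"bootstrap SE: {bootstrap_se:.2f}")
print(f"formula SE:   {formula_se:.2f}")
```

The two estimates should agree closely for a mean; the bootstrap version has the advantage of working the same way for statistics that have no simple formula.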

Where does the standard error come from?

- There are different methods of estimating standard errors, such as BOOTSTRAPPING

Interval Estimate

- This estimate shows how far off the parameter may be from the point estimate
- It gives plausible values for the parameter (Plausible doesn't always mean possible)
- e.g. confidence interval

How is the confidence INTERVAL interpreted?

- We are "95% confident" that an interval contains the truth/parameter
- Always include the context of the problem! That includes the variable and measurement units when appropriate.

95%

- Approx. 95% of 95% confidence intervals will contain the true parameter value
- The 95% is NOT the probability that one particular interval contains the parameter
- The parameter is fixed, but the statistic and the interval are random (the interval depends on the statistic)
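
The "95% of intervals" idea can be checked by simulation when the population is known. In this sketch the `population`, the seed, and the sample size are assumptions, and the standard error is taken from the simulated sampling distribution itself.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed (assumption) for repeatability

# Hypothetical population with a known, fixed parameter (its mean).
population = [rng.gauss(70, 12) for _ in range(5000)]
parameter = statistics.fmean(population)

n, num_samples = 25, 1000
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]

# Standard error: standard deviation of the sampling distribution.
standard_error = statistics.stdev(sample_means)

# Build statistic +/- 2*SE for every sample and count how many capture the parameter.
hits = sum(
    1
    for m in sample_means
    if m - 2 * standard_error <= parameter <= m + 2 * standard_error
)
coverage = hits / num_samples
print(f"share of intervals containing the parameter: {coverage:.1%}")
```

Each individual interval either contains the parameter or it does not. The 95% describes how often the method succeeds across many samples, so the printed share should land near 95%.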

Population Distribution

Describes the population

Percentile Method

If the bootstrap distribution is approximately symmetric, we can construct a confidence interval by finding the percentiles in the bootstrap distribution so that the proportion of bootstrap statistics between the percentiles matches the desired confidence level.
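
A minimal sketch of the percentile method for a mean. The `sample` data, the seed, and the helper name `percentile_interval` are hypothetical, and the index-based cutoffs are one simple way to locate the percentiles, not necessarily the one used in the course software.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed (assumption) for repeatability

sample = [rng.gauss(100, 15) for _ in range(40)]  # hypothetical sample
n = len(sample)

# Bootstrap distribution of the mean.
boot_means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(5000)]


def percentile_interval(boot_stats, confidence=0.95):
    """Chop (1 - confidence) / 2 of the bootstrap statistics off each tail."""
    ordered = sorted(boot_stats)
    cut = round((1 - confidence) / 2 * len(ordered))  # statistics removed per tail
    return ordered[cut], ordered[len(ordered) - cut - 1]


lower, upper = percentile_interval(boot_means, confidence=0.95)
print(f"95% percentile interval: {lower:.2f} to {upper:.2f}")
```

For a 95% interval this keeps the middle 95% of the bootstrap statistics, that is, the values between the 2.5th and 97.5th percentiles.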

Parameter

Numerical summary of the POPULATION

Statistic

Numerical summary of the SAMPLE DATA

Three Types of Distributions

Population Distribution Sample Distribution Sampling Distribution

Sampling Variability

STATISTICS from many samples vary. This variability can be summarized using the standard error, which is the standard deviation of a sampling distribution.

Sample Size vs Bootstrap samples

Sample Size:
- How many cases you have in one sample
- Sample size WILL affect the standard error
Bootstrap Samples:
- The number of bootstrap samples is how many times you run the bootstrap simulation.
- Each bootstrap sample should have the same sample size as the original/real sample
- The number of bootstrap samples will NOT affect the standard error (more bootstrap samples only make the estimate of it more stable)
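
The contrast can be sketched by changing one quantity at a time. The two hypothetical samples, the seed, and the helper name `bootstrap_se` are assumptions for the illustration.

```python
import random
import statistics

rng = random.Random(6)  # fixed seed (assumption) for repeatability


def bootstrap_se(sample, num_bootstrap_samples, rng):
    """Standard deviation of bootstrap means; each resample has the original size."""
    n = len(sample)
    boot = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(num_bootstrap_samples)]
    return statistics.stdev(boot)


# Hypothetical samples drawn from the same kind of population, different sizes.
small_sample = [rng.gauss(0, 10) for _ in range(25)]
large_sample = [rng.gauss(0, 10) for _ in range(400)]

# Same sample, different number of bootstrap samples: SE stays about the same.
se_small_1000 = bootstrap_se(small_sample, 1000, rng)
se_small_5000 = bootstrap_se(small_sample, 5000, rng)

# Larger sample size: SE shrinks.
se_large_1000 = bootstrap_se(large_sample, 1000, rng)

print(f"n=25,  1000 bootstrap samples: {se_small_1000:.2f}")
print(f"n=25,  5000 bootstrap samples: {se_small_5000:.2f}")
print(f"n=400, 1000 bootstrap samples: {se_large_1000:.2f}")
```

The first two values should be nearly equal, while the third should be clearly smaller.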

Bootstrap Distribution

The distribution of many bootstrap statistics
- Can be used to estimate the shape and spread of the sampling distribution

95% Confidence Interval

An interval computed from a sample by a method that captures the true parameter for about 95% of all samples
- Can be calculated using the following formula if the sampling distribution is approx. bell shaped: statistic +/- margin of error

Bootstrap Statistic

The statistic computed on a bootstrap sample - Use the re-sampled cases in the bootstrap sample to compute the bootstrap statistic of interest

What is the purpose of a Sampling Distribution?

To show us how the sample statistic varies from sample to sample

Difference in population proportions

p1 - p2

Population Mean

μ

Difference in population means

μ1 - μ2

