Stat 212 - Ch. 13, Sampling Distributions

Mean and Standard Deviation of a Sample Mean

Suppose that x̄ is the mean of an SRS of size n drawn from a large population with mean µ and standard deviation σ. Then the sampling distribution of x̄ has mean µ and standard deviation σ/√n.
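
A small simulation can make this fact concrete. The sketch below is only an illustration under stated assumptions: the population (the integers 1 through 1000), the sample size, and the helper name `simulate_sample_means` are all hypothetical, and samples are drawn with replacement to mimic a population that is very large relative to the sample.

```python
import math
import random
import statistics

def simulate_sample_means(population, n, reps, seed=0):
    """Return the means of `reps` random samples of size n."""
    rng = random.Random(seed)
    # choices() draws with replacement, which mimics a population that is
    # very large relative to the sample (an assumption of this sketch).
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]

# Hypothetical population: the integers 1 through 1000.
population = list(range(1, 1001))
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 25
sample_means = simulate_sample_means(population, n, reps=20_000)

print("mean of sample means:", round(statistics.fmean(sample_means), 2),
      "vs mu =", mu)
print("sd of sample means:  ", round(statistics.pstdev(sample_means), 2),
      "vs sigma/sqrt(n) =", round(sigma / math.sqrt(n), 2))
```

The two printed pairs should agree closely, though not exactly, because the simulation uses a finite number of samples.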

Simulation

Using software to imitate chance behavior.

We can describe the behavior of a sample statistic by a probability model that answers the question

What would happen if we did this many times?

Central Limit Theorem

When randomly sampling from any population with mean µ and standard deviation σ, when n is large enough, the sampling distribution of x̄ is approximately Normal: N(µ, σ/√n). The larger the sample size n, the better the approximation. This is useful in inference because, for large samples, x̄ can be treated as approximately Normal even when the population itself is not.
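
One way to see this is to sample from a clearly non-Normal population and watch the shape of the sample means change with n. The sketch below assumes an exponential population (strongly right-skewed) and uses skewness as a rough measure of non-Normality; both choices, and the function names, are illustrative assumptions rather than part of the definition.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score; near 0 for a symmetric, Normal-like shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def skewness_of_sample_means(n, reps=10_000, seed=1):
    """Skewness of the sample mean over `reps` samples of size n
    drawn from an exponential population (hypothetical choice)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return skewness(means)

results = {n: skewness_of_sample_means(n) for n in (1, 5, 50)}
for n, skew in results.items():
    print(f"n = {n:>2}: skewness of sample means = {skew:.2f}")
```

The skewness should shrink toward 0 as n grows, which is the sampling distribution of x̄ moving toward a Normal shape even though the population stays skewed.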

The Law of Large Numbers

As the number of randomly drawn observations n in a sample increases, the sample mean x̄ gets closer to the population mean µ (quantitative data) and the sample proportion p̂ gets closer to the population proportion p (categorical data).
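
The running average of fair die rolls gives a simple picture of this. The sketch below assumes a fair six-sided die (population mean 3.5); the function name `running_averages` and the seed are hypothetical.

```python
import random

def running_averages(n_rolls, seed=2):
    """Return the average of the first i rolls, for i = 1..n_rolls."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # one roll of a fair die
        averages.append(total / i)
    return averages

averages = running_averages(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>6} rolls: running average = {averages[n - 1]:.4f}")
```

Early averages can wander noticeably, while the later ones should settle near 3.5.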

The results of large samples are less variable than the results of small samples.

Averages are less variable than individual observations, and averages of more observations vary less than averages of fewer.

If the population is much larger than the sample, X ("successes") in an SRS of size n is

approximately binomial, B(n, p), with mean µ = np and standard deviation σ = √(npq) = √(np(1−p)), where q = 1 − p.
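
As a quick worked example of the binomial mean and standard deviation, the sketch below plugs in n = 100 and p = 0.3; these values and the helper name `binomial_mean_sd` are made up for illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial count X ~ B(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

mean, sd = binomial_mean_sd(n=100, p=0.3)
print(f"mean = {mean:.1f}, sd = {sd:.2f}")
```

With these values the mean is 30 successes and the standard deviation is √21, about 4.58.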

The standard deviation of the sampling distribution gets smaller only at the rate √n.

Because the sampling distribution of x̄ is N(µ, σ/√n), cutting the standard deviation of x̄ by a factor of 10 takes 100 times as many observations, not just 10 times as many. Sample sizes that large are not always an option.
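
The arithmetic can be checked directly. In the sketch below, σ = 20 and the starting sample size of 25 are hypothetical values chosen to give round numbers.

```python
import math

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

sigma = 20.0  # hypothetical population standard deviation
for n in (25, 250, 2500):
    print(f"n = {n:>4}: sd of sample mean = {sd_of_sample_mean(sigma, n):.3f}")
```

Going from 25 to 2500 observations (100 times as many) takes the standard deviation from 4 down to 0.4, while 10 times as many observations only brings it to about 1.26.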

Don't forget the three main ways of describing a distribution

Shape, Center, and Spread.

X is a count of

the occurrences of "successes" in a fixed number of observations.

The sampling distribution of a statistic is

the probability distribution of that statistic for samples of a given size n taken from a given population.

How do we know if a population is Normal or not?

Sometimes we are told that a variable has an approximately Normal distribution (for example, based on large studies). Most of the time we don't know, and we must summarize the data with a histogram and describe its shape. If the sample is random, the shape of the histogram should be similar to the shape of the population distribution.

You should not use the Normal approximation to get the distribution of p̂ when the sample size n is small. Additionally, the formula given for the standard deviation of p̂ is not accurate unless the population is much larger than the sample.

...

Population Distribution

A distribution that shows how the measurements vary within the population.

Statistic

A number describing a characteristic of a sample. Statistics are often used to estimate unknown population parameters.

Parameter

A number that describes a characteristic of the population. The value of a parameter is usually not known because we cannot examine the entire population.

Sampling Distribution of x̄

Describes how the statistic x̄ varies in all possible SRSs of the same size from the same population.

Sampling distribution of a sample proportion

Consider an SRS of size n from a large population that contains a proportion p of successes, and let p̂ be the sample proportion of successes. Then:
- the mean of the sampling distribution is p.
- the standard deviation of the sampling distribution is √(p(1−p)/n).
- as the sample size increases, the sampling distribution of p̂ becomes approximately Normal.
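
These facts about p̂ can be checked by simulation. The sketch below assumes p = 0.3 and n = 200 (hypothetical values) and treats each observation as an independent success with probability p, which matches the large-population assumption; the helper name `simulate_p_hats` is also made up.

```python
import math
import random
import statistics

def simulate_p_hats(p, n, reps, seed=3):
    """Return `reps` sample proportions, each from n independent trials."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

p, n = 0.3, 200
p_hats = simulate_p_hats(p, n, reps=5_000)
theory_sd = math.sqrt(p * (1 - p) / n)

print("mean of p-hat:", round(statistics.fmean(p_hats), 4), "vs p =", p)
print("sd of p-hat:  ", round(statistics.pstdev(p_hats), 4),
      "vs sqrt(p(1-p)/n) =", round(theory_sd, 4))
```

The simulated mean and standard deviation should land close to p and √(p(1−p)/n), with small differences due to simulation noise.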

Sample Proportion p̂

If the number of observations is n, the sample proportion is p̂ = count of successes in sample / size of sample = X/n

Sampling distribution of a sample mean for a Normal distribution

If the population has a N(µ, σ) distribution, then the sampling distribution of x̄ is exactly N(µ, σ/√n), for any sample size n.

The central limit theorem applies to sampling distributions, not to the distribution of a single sample.

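Many students mistakenly believe that larger sample sizes yield more Normal sample histograms. A larger sample only makes the histogram of that one sample look more like the population distribution. What becomes more Normal as n grows is the distribution of x̄ across many samples of size n.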

How large a sample size n is needed for x̄ to be close to Normal depends on the population distribution. More observations are required if the shape of the population distribution is far from Normal.

Means of random samples are less variable than individual observations. Means of random samples are more Normal than individual observations.

How large a sample size?

More observations are required if the population distribution is far from Normal. A sample size of 25 or more usually suffices when the data are strongly skewed with few or mild outliers. A sample size of 40 or more usually suffices when the data are extremely skewed or have stronger or more numerous outliers.

If n is large and p is not too close to 0 or 1, the sampling distribution of p̂ is approximately

N(µ = p, σ = √(p(1−p)/n))

Law of Large Numbers vs. Sampling Distribution

The law of large numbers describes what would happen if we took samples of increasing size n. A sampling distribution describes what would happen if we took all possible random samples of a fixed size n.

The standard deviation of the sampling distribution of x̄ is σ/√n

The standard deviation of the sampling distribution measures how much the sample statistic x̄ varies from sample to sample. Averages are less variable than individual observations.

The mean of the sampling distribution of x̄ is µ

There is no tendency for a sample average to fall systematically above or below µ, even if the population distribution is skewed. Thus, x̄ is an unbiased estimate of the population mean µ.

Sampling distributions are theoretical concepts

They're not actually built

The Central Limit Theorem is valid as long as we are sampling many small random events, even if they have different distributions.

This explains why so many variables are Normally distributed.

The mean of the sampling distribution of p̂ is p, and therefore p̂ is an unbiased estimator of p.

p̂ has no systematic bias to go above or below p.

Because good samples are chosen randomly, statistics such as x̄ are

random variables.

