Confidence Intervals
Confidence Intervals and Randomness
- if we imagine a meta-study in which many random samples are drawn and a 95% confidence interval is constructed from each one, then about 95% of those intervals will contain mu, the population mean
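The meta-study idea can be checked with a small simulation. This is only a sketch under assumptions that are not in the notes: a normal population with mu = 50 and sigma = 10, samples of size 25, and sigma treated as known so that the 1.96 multiplier applies. The function name `coverage_rate` is hypothetical.

```python
import math
import random

def coverage_rate(mu=50.0, sigma=10.0, n=25, trials=2000, seed=1):
    """Fraction of simulated 95% intervals that contain mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # sigma assumed known, so 1.96 is the multiplier
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        y_bar = sum(sample) / n
        # Does this sample's interval capture the true population mean?
        if y_bar - 1.96 * se <= mu <= y_bar + 1.96 * se:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(rate)  # should land near 0.95, not exactly on it
```

Any single interval either contains mu or does not; the 95% describes the long-run fraction across many samples.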
Confidence Coefficients: t(0.025)
- this is a quantity called the "two-tailed 5% critical value" -the interval between -t(0.025) and +t(0.025) has 95% of the area underneath the t curve, and the area in the two tails is 5% -two "pieces" or tails of 0.025 each make up 0.05 (5%)
Standard Error of Two Means
-(y bar one - y bar two): the difference between two sample means (estimates of the population means) -Standard Error of (y bar one - y bar two) = the square root of [(standard deviation one squared divided by sample size one) plus (standard deviation two squared divided by sample size two)] -we add the two terms because each random sample contributes its own variability to the difference
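As a quick check of the formula, here is a minimal sketch. The function name `se_difference` and the example numbers are made up for illustration.

```python
import math

def se_difference(s1, n1, s2, n2):
    """SE of (y bar one - y bar two): the two variance terms add."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical samples: s1 = 3 with n1 = 9, s2 = 4 with n2 = 16.
# Each term is 1, so the SE is the square root of 2.
se = se_difference(3.0, 9, 4.0, 16)
print(round(se, 4))
```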
Student's t distribution
-If data came from a normal population and we want to replace sigma with s, we must also replace the 1.96 in front of SE with a different number. -If it is a 95% interval this number is t(0.025) -t distributions are theoretical, continuous distributions used to construct confidence intervals
Degrees of Freedom if n=1
-a sample of size n provides only df = (n-1) pieces of information about variability, that is, about sigma (SD) -we can not use Student's t method for n=1 because df=0, so the sample SD can not be computed
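The n=1 case can be seen directly in Python's standard library, where `statistics.stdev` refuses a single observation. The helper name `sample_sd_or_none` is hypothetical; this is a sketch, not part of the notes.

```python
import statistics

def sample_sd_or_none(values):
    """Return the sample SD, or None when there are too few observations."""
    try:
        return statistics.stdev(values)  # divides by n - 1
    except statistics.StatisticsError:
        return None                      # n = 1 gives df = 0, so no SD

single = sample_sd_or_none([5.0])      # one observation: no information about spread
pair = sample_sd_or_none([4.0, 6.0])   # two observations: df = 1
print(single, pair)
```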
Comparing Two Means
-comparing two populations: means, sd, shapes
Confidence Interval for (mu1-mu2)
-constructing a confidence interval for the difference in population means compares the two sample means -remember y bar+-tSE so... (y bar1- y bar2)+-tSE, where SE is the standard error of the difference (t is found from the confidence level and df)
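A minimal sketch of the (y bar1 - y bar2)+-tSE calculation follows. The function name `ci_difference` and the summary numbers are hypothetical, and the t multiplier is passed in by hand: 2.086 is the t(0.025) table value for df = 20, which is assumed here only for illustration.

```python
import math

def ci_difference(y_bar1, s1, n1, y_bar2, s2, n2, t_mult):
    """Confidence interval for mu1 - mu2 from summary statistics."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # SE of the difference
    diff = y_bar1 - y_bar2
    return diff - t_mult * se, diff + t_mult * se

# Hypothetical summaries: group 1 (mean 10, s 3, n 9), group 2 (mean 8, s 4, n 16).
low, high = ci_difference(10.0, 3.0, 9, 8.0, 4.0, 16, t_mult=2.086)
print(round(low, 3), round(high, 3))
```

An interval that includes 0, as this one does, means the data are compatible with no difference between the population means.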
One Sided Confidence Intervals
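-if we only care about one boundary (an upper or a lower limit) then only one side of the interval is needed -the interval then has one finite endpoint and extends to infinity on the other side -all of the tail area goes on one side, so a 95% one-sided interval uses t(0.05) instead of t(0.025)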
Constructing a Confidence Interval of mu
-Student's method for constructing a confidence interval for mu based on a random sample from a normal population 1) choose a confidence level like 95% 2) find degrees of freedom: df=n-1 3) find the t multiplier for that level and df in chart 4 4) compute the upper and lower limits of the interval Limits: y bar-(t)SE and y bar+(t)SE or y bar+-(t)(s/sq. root n)
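The steps above can be sketched as follows. The function name `t_interval` and the five data values are made up for illustration, and the t multiplier is read from a t table by hand: 2.776 is t(0.025) for df = 4.

```python
import math
import statistics

def t_interval(sample, t_mult):
    """Student's t interval for mu: y bar +- t * s / sqrt(n)."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # s uses n - 1 in the denominator
    return y_bar - t_mult * se, y_bar + t_mult * se

sample = [4.0, 5.0, 6.0, 7.0, 8.0]           # n = 5, so df = 4
low, high = t_interval(sample, t_mult=2.776)  # 95% level
print(round(low, 3), round(high, 3))
```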
Intervals
-the higher the confidence level, the wider the confidence interval for a fixed sample size -as n increases, intervals become narrower for a fixed confidence level
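Both effects can be seen numerically. This sketch makes a simplifying assumption that is not in the notes: sigma is treated as known (sigma = 10), so normal z multipliers are used in place of t. The function name `z_interval_width` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval_width(sigma, n, level):
    """Full width of a two-sided z interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 when level = 0.95
    return 2 * z * sigma / math.sqrt(n)

# Fixed n = 25: width grows with the confidence level.
widths_by_level = [z_interval_width(10.0, 25, lv) for lv in (0.90, 0.95, 0.99)]
# Fixed 95% level: width shrinks as n grows.
widths_by_n = [z_interval_width(10.0, n, 0.95) for n in (25, 100, 400)]
print(widths_by_level)
print(widths_by_n)
```

Because the width depends on the square root of n, quadrupling the sample size only halves the width.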
"degrees of freedom"
-the shape of a t-distribution depends on the degrees of freedom (df) -a t-curve is symmetric and bell-shaped like a normal curve but has a larger standard deviation -as df increases, t curves approach the normal curve (normal curve: df is infinite) -df comes from the fact that the deviations from the sample mean must sum to 0, so only (n-1) of them can vary freely
Critical Values of Student's t distribution
-these are the values of t that cut off a given area underneath a t distribution curve -the values of t decrease as df increases and approach the normal (z) values as df goes to infinity -can confirm this in the df = infinity row: t(0.025)= 1.960, and on a z scale z=1.96 gives a 95% confidence interval
Validating the SE Formula
1) Population size must be large compared to sample size 2) Observations must be independent of one another (no hierarchical structure in the data, which is common in life science experiments)
Three Ways to Find df for a Two-Sample Confidence Interval
1) calculate degrees of freedom from the formula---best one 2) use the smaller of (n1-1) or (n2-1)---conservative 3) approximate with df= n1+n2-2---liberal
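The three choices can be compared side by side. The notes do not say which formula is meant in option 1; this sketch assumes it is the Welch-Satterthwaite approximation, which is the usual one for two samples with unequal SDs. The function name `welch_df` and the sample summaries are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate df for a two-sample interval (Welch-Satterthwaite, assumed)."""
    v1 = s1**2 / n1  # squared SE of the first sample mean
    v2 = s2**2 / n2  # squared SE of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical summaries: s1 = 3 with n1 = 9, s2 = 4 with n2 = 16.
df_formula = welch_df(3.0, 9, 4.0, 16)  # option 1
df_conservative = min(9 - 1, 16 - 1)    # option 2: smaller df, wider interval
df_liberal = 9 + 16 - 2                 # option 3: larger df, narrower interval
print(df_conservative, round(df_formula, 2), df_liberal)
```

The formula value falls between the conservative and liberal choices, which is why those two are described as bounds on either side.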
Validity of Confidence Interval
1) data from two independent, random samples 2) normal population if n is small
Types of Interval Form
1) just as __+-__ 2) as endpoints 3) compactly (__,__) 4) a statement ___<mu<___
Conditions for Student t Distribution
1) must be a random sample (most important) with observations that are independent of one another 2) if n is small, the population must be normal
SE vs. SD in Sample Size
As the sample size increases: SD becomes closer to the population SD (sigma); it does not tend to get smaller or larger. SE decreases because the sample mean becomes a more precise estimate of the population mean.
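A short simulation makes the contrast concrete. It is only a sketch, assuming a normal population with mu = 50 and sigma = 10 (values chosen for illustration); the function name `sd_and_se` is hypothetical.

```python
import math
import random
import statistics

def sd_and_se(n, mu=50.0, sigma=10.0, seed=7):
    """Draw one random sample of size n and return (SD, SE of the mean)."""
    rng = random.Random(seed)
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s = statistics.stdev(sample)
    return s, s / math.sqrt(n)

results = {n: sd_and_se(n) for n in (10, 100, 10000)}
for n, (s, se) in results.items():
    # SD settles near sigma = 10; SE keeps shrinking like 1 / sqrt(n).
    print(n, round(s, 2), round(se, 3))
```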
Standard Error of the Mean
Defined as: SE(y bar)= s/square root n -an estimate of the SD of the sampling distribution of y bar, which is sigma(y bar)=sigma/square root n -a measure of reliability or precision of the sample mean as an estimate of the population mean (small SE = precise estimate) -variability of observations and sample size are the only factors that affect reliability
Which interval to pick SD or SE
Do you want to emphasize a comparison of the means? Choose SE. Do you want to show a summary of the variability in your observed data? Choose SD.
SD vs. SE Definition
SD: describes dispersion of the data in a sample or population (deviation from the mean) SE: describes the unreliability, due to sampling error, of the sample mean as an estimate of the population mean (variability of the mean itself)
Confidence Interval for Population Mean
Standard Error indicates how far the sample mean is likely to be from the population mean mu. Confidence Interval: If we can only see the sample mean and not the population mean, we can use the standard error to build an interval around the sample mean in which the population mean is likely to lie. We must have a properly drawn random sample (a biased sample will not work)
Statistical Estimation
We use data to determine an estimate of some feature of the population and assess how close that estimate is likely to be to the population value.
95% Confidence Interval
sample mean+-1.96SE, where SE= sigma/square root n -we are 95% confident that the population mean falls in this interval -if we do not know the population SD, sigma, we can use s to calculate the interval, but then the confidence level is no longer exactly 95% unless 1.96 is replaced by t(0.025)
Intervals of SD and SE
sample mean+-SD: shows the mean value within each group and the variability of the values within each group; the SD does not systematically shrink as the group (sample size) gets bigger. sample mean+-SE: shows the mean value in each group, and the reliability of each group mean as an estimate of the population mean