BUAD 310 Chapters 8,9
One-sided Alternative Hypothesis
A one-sided alternative hypothesis claims either that the true parameter value is greater than the value stated in the null hypothesis or that it is less than that value. The equal sign always goes with the null. The direction of the test is indicated by which way the inequality symbol points in Ha: Ha with < ⇒ left-tailed test; Ha with > ⇒ right-tailed test.
Standard Error Sample Proportion
As the sample size increases, the distribution of the sample proportion p = x/n approaches a normal distribution with mean π and standard error se(p) = √(π(1 − π)/n).
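A minimal Python sketch of this standard error formula; the values of n and π below are made up for illustration, not course data:

```python
import math

# Minimal sketch (hypothetical numbers): standard error of a sample proportion.
n = 200           # sample size (assumed)
pi = 0.30         # population proportion (assumed for illustration)
se = math.sqrt(pi * (1 - pi) / n)
print(se)         # ~0.0324
```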
T-stat for one-sided/two-sided
The t statistic is t = (X̄ − μ0)/(s/√n), and the p-value is the tail area of the sampling distribution beyond the observed test statistic (the shaded region in the course slides). These p-values are exact if the population distribution is normal and are approximately correct for large n otherwise. A test statistic that falls in the rejection tail (left tail, right tail, or either of the two tails, depending on Ha) leads to rejection of the null hypothesis!!
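A minimal Python sketch of the one-sided and two-sided p-values; the sample data and hypothesized mean of 50 are made up, not from the course:

```python
import numpy as np
from scipy import stats

# Minimal sketch with made-up data: the t statistic and the left-, right-,
# and two-tailed p-values for H0: mu = 50.
rng = np.random.default_rng(1)
sample = rng.normal(loc=52, scale=10, size=40)     # hypothetical sample
mu0 = 50

t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(len(sample)))
df = len(sample) - 1
p_right = stats.t.sf(t_stat, df)                   # Ha: mu > 50 (right-tailed)
p_left = stats.t.cdf(t_stat, df)                   # Ha: mu < 50 (left-tailed)
p_two = 2 * stats.t.sf(abs(t_stat), df)            # Ha: mu != 50 (two-tailed)
print(t_stat, p_right, p_left, p_two)
```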
Fact about CI and two sided tests
For a two-sided test at significance level α, reject H0: μ = μ0 exactly when μ0 falls outside the (1 − α) × 100% confidence interval; equivalently, reject only when the test statistic lands in one of the two tails whose combined area is α.
𝑋 ̅ bias
Consider using the sample mean, X̄, to estimate the true (unknown) population mean, μ. We want the estimate to neither underestimate nor overestimate the parameter on a consistent basis; that is, we want it to be unbiased. Is X̄ an unbiased estimate of μ? Yes: E(X̄) = μ, so X̄ is an unbiased estimator. ***Note: deviations measured from X̄ are systematically smaller than deviations measured from the true population mean μ, which is why the sample variance divides by n − 1 rather than n. For X̄ to be (approximately) normally distributed: 1. either the sample size is greater than 30, or 2. the underlying population is normal.
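A minimal simulation sketch of E(X̄) = μ; the values of μ, σ, and n are hypothetical:

```python
import numpy as np

# Minimal sketch: simulate many samples (hypothetical mu, sigma, n) and check that
# the average of the sample means is close to mu, i.e. E(X-bar) = mu.
rng = np.random.default_rng(0)
mu, sigma, n = 100, 15, 30
xbars = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)
print(xbars.mean())            # close to 100
```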
Type 1/Type 2 error
Type I error: rejecting the null hypothesis when it is true (convicting an innocent person). Type II error: failing to reject the null hypothesis when it is false (letting a guilty person go free). → The probability of a Type I error (rejecting a true null hypothesis) is denoted α (the lowercase Greek letter "alpha"); statisticians refer to α as the level of significance. → The probability of a Type II error (not rejecting a false null hypothesis) is denoted β (the lowercase Greek letter "beta"). Trade-off: making it harder to convict an innocent person (lowering the chance of a Type I error) makes it more likely that a guilty person goes free (raising the chance of a Type II error).
Two-sided Alternative Hypothesis
claims that the true parameter value is not equal to the value claimed in the null hypothesis
Z-stat for one-sided/two-sided
z = (p − π0)/√(π0(1 − π0)/n), where p = x/n and π0 is the value of π claimed under H0. Sample size guideline: nπ0 and n(1 − π0) should both be larger than 10.
Ch. 9 Hypothesis Test
(Test of significance.) A common type of statistical inference: assess the evidence provided by the data relative to some claim about the population (affirm or reject a hypothesis through evidence/data). It is a formal procedure for comparing observed data with a statement about the parameters in a population or model.
Summary: steps in significance testing
1. State the null hypothesis H0 and the alternative hypothesis Ha. The test is designed to assess the strength of the evidence against H0 in favor of Ha.
2. Calculate the value of the test statistic (e.g., the t-stat), a measure of how close the data are to H0 → it measures the difference between the sample mean and the benchmark μ0 in terms of the standard error of the mean.
3. Assuming that the null hypothesis is true, find the probability that the test statistic is at least as extreme as the value you just calculated, e.g., P(T ≥ t). This probability is called the p-value.
4. Compare the p-value to a pre-specified significance level α. If the p-value is less than α (e.g., α = 5%), you may reject H0 at significance level α; otherwise, you do not have sufficient evidence to reject H0 at level α.
Step 4 is optional; reported results may simply state the actual p-value or whether the result is significant at a specified level. A worked sketch of these steps appears below. Warning: do not decide on a significance level after you have seen the results.
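A minimal Python walk-through of the four steps; the payment-time numbers below are made up to mirror the payment-time example later in these notes, not actual course data:

```python
import numpy as np
from scipy import stats

# Minimal sketch of the four steps, using made-up payment-time data.
data = np.array([18.2, 19.1, 17.8, 20.4, 18.9, 19.3, 17.5, 18.8, 19.0, 18.4])

# Step 1: H0: mu = 19.5  vs  Ha: mu < 19.5 (left-tailed).
mu0 = 19.5

# Step 2: test statistic = (xbar - mu0) / (s / sqrt(n)).
t_stat = (data.mean() - mu0) / (data.std(ddof=1) / np.sqrt(len(data)))

# Step 3: p-value assuming H0 is true (left tail for Ha: mu < 19.5).
p_value = stats.t.cdf(t_stat, df=len(data) - 1)

# Step 4: compare the p-value with a pre-specified alpha.
alpha = 0.05
print(t_stat, p_value, "reject H0" if p_value < alpha else "fail to reject H0")
```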
Common confusions: wrong interpretations
Wrong: "95% of all customers will keep a balance of $1,516 to $2,465" → the CI is for the average (the mean), not the whole distribution of balances. Wrong: "The mean balance of 95% of samples of 140 accounts will fall between $1,516 and $2,465" → the CI does not describe other samples; it describes the population mean µ. Wrong: "The population mean balance is between $1,516 and $2,465" → this is not a 100% confidence interval, so there are no guarantees. CORRECT statement: We are 95% confident that the mean balance is between $1,516 and $2,465.
Interpretation
A particular confidence interval either does or does not contain the population mean → because X̄ comes from a random sample, the interval may miss or cover μ. The confidence level quantifies this uncertainty: with a 95% confidence level, out of 100 confidence intervals, approximately 95 will contain the population mean and approximately 5 will not. We want a procedure that covers the mean 95% of the time at this confidence level! Once the corresponding confidence interval has been calculated from X̄, 1 − α (1 − 5% = 95%) is no longer a probability but your level of confidence that the interval contains the population mean μ.
What is a Confidence Interval?
A confidence interval is used to estimate the population parameter µ. The interval is calculated from the sample data: point estimate (X̄) ± margin of error. We construct a confidence interval for the unknown mean µ by adding and subtracting a margin of error from X̄ (the mean of our random sample). 1 − α → 1 minus the total probability in the two tails; α/2 → the area in each tail.
Bias/Consistency
Bias = E(X̄) − µ. Using =STDEV.P(data) on sample data gives a biased (divide-by-n) estimate of the standard deviation; =STDEV.S divides by n − 1 instead (see the sketch after this entry). Sampling error is random whereas bias is systematic. Sampling errors are inevitable, and you cannot know whether you have a sampling error without knowing the population parameter (and if we knew the population parameter we would not be taking a sample). An unbiased estimator avoids systematic error: it neither overstates nor understates the true parameter on average. A consistent estimator converges toward the parameter being estimated as the sample size increases, so a larger sample size gives a closer estimate of the population mean and standard deviation.
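A minimal simulation sketch of biased vs. unbiased variance estimators (analogous to =STDEV.P vs. =STDEV.S); the population variance of 25 is hypothetical:

```python
import numpy as np

# Minimal sketch: the divide-by-n estimator (what =STDEV.P computes) is biased when
# applied to a sample, while the divide-by-(n-1) estimator (=STDEV.S) is unbiased
# for the variance.
rng = np.random.default_rng(2)
true_var = 25                                       # hypothetical population variance
samples = rng.normal(0, np.sqrt(true_var), size=(20_000, 10))
print(samples.var(ddof=0, axis=1).mean())           # ~22.5, understates 25
print(samples.var(ddof=1, axis=1).mean())           # ~25, on target
```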
Choosing a Confidence Interval
Factors that affect the CI: A larger s → wider CI (a larger standard deviation means more spread). A larger n → narrower CI (the standard error σ/√n is smaller with larger n). A 99% CI is wider than a 95% or 90% CI. The confidence interval can be made narrower by increasing the sample size or decreasing the confidence level (lower value of t/z). The width of the confidence interval (margin of error z·σ/√n on each side) is not affected by the sample mean; it depends only on the confidence level, the standard deviation, and the sample size. A higher confidence level leads to a wider confidence interval, which is not necessarily a "better" estimate: to gain confidence you must accept a wider range of possible values for the mean µ. Greater confidence implies a loss of precision (a greater margin of error). 95% confidence is used most often because it is a reasonable compromise between confidence and precision. The sketch below shows how each factor moves the margin of error.
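A minimal Python sketch of how the margin of error responds to s, n, and the confidence level; all numbers are hypothetical:

```python
import numpy as np
from scipy import stats

# Minimal sketch with hypothetical values: the margin of error t * s / sqrt(n)
# widens with s and with the confidence level, and narrows with n.
def margin_of_error(s, n, conf):
    t = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)
    return t * s / np.sqrt(n)

print(margin_of_error(s=10, n=30, conf=0.95))   # baseline
print(margin_of_error(s=20, n=30, conf=0.95))   # larger s -> wider
print(margin_of_error(s=10, n=120, conf=0.95))  # larger n -> narrower
print(margin_of_error(s=10, n=30, conf=0.99))   # higher confidence -> wider
```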
Statement of Hypothesis
Hypothesis testing is designed to assess the strength of evidence against what we call the null hypothesis (notation: H0). Typically, H0 asserts that there is nothing unusual: nothing unexpected, no effect, no difference, etc. The statement we suspect is true instead of H0 is called the alternative hypothesis, Ha. Thus the null hypothesis represents current thinking, the status quo, the accepted theory: what we are trying to disprove (innocent until proven guilty).
Credit Card Example (NOT proportion but population)
IF they give you s (the sample standard deviation), use a t score, not a z score. Here s = $2,833.33. The point estimate for μ is X̄ = $1,990.50, the sample mean. se(X̄) = $2,833.33/√140 ≈ $239.46. Margin of error = t(s/√n). The 95% confidence interval is $1,990.50 ± 1.98($239.46) = [$1,516.37, $2,464.63], where 1.98 is the t score with 139 degrees of freedom.
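A minimal Python sketch reproducing this interval from the numbers quoted in the notes (scipy uses the exact t value rather than the rounded 1.98):

```python
import numpy as np
from scipy import stats

# Minimal sketch reproducing the credit card interval from the numbers in the notes.
xbar, s, n = 1990.50, 2833.33, 140
se = s / np.sqrt(n)                      # ~239.46
t = stats.t.ppf(0.975, df=n - 1)         # ~1.977 (the notes round to 1.98)
print(xbar - t * se, xbar + t * se)      # ~[1517, 2464], matching the notes up to rounding
```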
Statistical Significance
If the p-value is smaller than a pre-specified level α (e.g., the 1% level), we say the evidence against H0 provided by the data is statistically significant at level α. If the p-value is larger than α, we "fail to reject the null." With a smaller α, the decision maker makes it harder to reject the null hypothesis. The level α may be specified beforehand, so that any result for the test statistic at least as extreme as that threshold is accepted as significant evidence against the null hypothesis.
Shape?
If the population is exactly normal, then the sample mean follows a normal distribution. As the sample size n increases, the distribution of sample means narrows around the population mean µ (it becomes taller and more concentrated).
Random Variable
If we randomly sample from the population, the sample statistic calculated from the sample is a random variable. All random variables have probability distributions described by a mean, standard deviation, and shape. We distinguish between how the individual values in the population are distributed and how the many different possible values of X̄ are distributed. The distribution of X̄ is called a sampling distribution → it describes the distribution of a statistic obtained by repeatedly taking samples and computing X̄.
When Can We Assume Normality of X?
If σ is KNOWN and the population is normal, then we can safely use the Z-SCORE to compute the confidence interval. If σ is known and we do not know if the population is normal, a common rule of thumb is that n > 30 is sufficient to use the Z-SCORE as long as the distribution is approximately symmetric with no outliers. Larger n may be needed to assume normality if you are sampling from a strongly skewed population or one with outliers!
Confidence Interval for Mean (u) with unknown σ → What if we don't know σ, the population standard deviation?
Instead of using a z score of (X̄ − μ)/(σ/√n), we substitute the sample standard deviation s for σ and calculate a t score of (X̄ − μ)/(s/√n). The distribution of t is known as Student's t distribution. The t score has more variability than the z score: the sample standard deviation changes from sample to sample, so the t score is a ratio of two random variables, whereas the z score involves only one random variable (X̄). Using another sample statistic leads to more variability with t than with z, so t critical values are larger than z critical values and the confidence intervals are wider. (In Excel, you now use T.DIST/T.INV rather than NORM.S.DIST/NORM.S.INV.) s/√n is the estimated standard error of the mean. The t distribution assumes a normal population → it is reliable as long as the population is not badly skewed and the sample size is not too small. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.
Student's t Distribution
Observe that as the sample size increases, the t distribution converges to the z distribution. Its standard deviation is greater than 1 because there is more variability than in the z score (the extra variability comes from also estimating the standard deviation with s). t distributions are SYMMETRIC and shaped like the standard normal distribution (mean = 0): they are CENTERED AROUND 0, SPECIFIED BY THE NUMBER OF DEGREES OF FREEDOM, APPROACH THE NORMAL DISTRIBUTION AS N INCREASES, and HAVE MORE PROBABILITY IN THE TAILS THAN Z. The t distribution depends on the sample size n through its degrees of freedom, n − 1; the degrees of freedom determine which t value is used in the confidence interval formula and tell us how many observations were used to calculate s (the sample standard deviation). T-distributions are specified by the number of degrees of freedom (they are NOT specified by one parameter, the mean). t is always larger than z (especially for very small sample sizes), so the confidence interval is always wider than if z were used → "A (1 − α) confidence interval using the t distribution is as wide as or wider than one constructed using the z distribution" (this statement is always true). For very small samples, t values differ substantially from z scores; as sample sizes increase, the t values approach the normal z values. For a 90 percent confidence interval with df = 30, we would use t = 1.697, which is only slightly larger than z = 1.645. The sketch below shows this convergence.
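A minimal Python sketch of this convergence, using the same 90% confidence level quoted above:

```python
from scipy import stats

# Minimal sketch: t critical values for a 90% CI shrink toward z = 1.645 as df grows.
for df in (5, 10, 30, 100, 1000):
    print(df, round(stats.t.ppf(0.95, df), 3))
print("z:", round(stats.norm.ppf(0.95), 3))      # 1.645
```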
Applying the Central Limit theorem
Rule of thumb → n ≥ 30 ensures an approximately normal distribution for the sample mean X̄, BUT a smaller n will suffice if the population is symmetric. As long as the sample size n is large enough, the Central Limit Theorem tells us that X̄ is normally distributed, which in turn permits us to define an interval within which the sample means are expected to fall. This, in turn, enables us to create an interval that we can be "confident" covers the true population mean μ, as shown in the document Confidence Intervals in Blackboard. The standard error of the mean, σ/√n, decreases as n increases. The mean of the sample means converges to the true population mean μ as n increases. Typical confidence levels and z multipliers are 90% (1.645), 95% (1.96), and 99% (2.576); a quick check of these values appears below.
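A minimal Python sketch verifying the quoted z multipliers:

```python
from scipy import stats

# Minimal sketch: the z multipliers quoted for 90%, 95%, and 99% confidence.
for conf in (0.90, 0.95, 0.99):
    print(conf, round(stats.norm.ppf(1 - (1 - conf) / 2), 3))   # 1.645, 1.96, 2.576
```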
Standard Error
Since X̄ is a random variable whose value changes whenever we take a different sample, as long as our samples are RANDOM samples the only type of error in the estimating process is sampling error. The sampling error of the sample means is described by their standard deviation ⇒ the standard error of the mean. We can show that the standard deviation/standard error of X̄ is σ/√n (the proof can be found in the file Mean and standard deviation of the sample mean on Blackboard). This is called the STANDARD ERROR of the mean.
Statistical Hypothesis/Hypothesis Test
Statistical hypothesis: statement about the value of a population parameter that we are interested in (mean, proportion, variance) Hypothesis test: decision between two competing, mutually exclusive, and collectively exhaustive hypotheses about the value of the parameter
z-test for a proportion
Suppose we have a simple random sample of size n from a population with an unknown population proportion π. To test the hypothesis H0: π = π0, use the test statistic z = (p − π0)/√(π0(1 − π0)/n), where p = x/n is the sample proportion.
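A minimal Python sketch of this z-test; the counts and the hypothesized π0 = 0.50 are made up for illustration:

```python
import math
from scipy import stats

# Minimal sketch with hypothetical counts: z-test of H0: pi = 0.50 vs Ha: pi > 0.50.
n, x, pi0 = 400, 225, 0.50
p = x / n                                          # sample proportion
z = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
p_value = stats.norm.sf(z)                         # right-tailed p-value
print(z, p_value)                                  # 2.5, ~0.006
```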
Test Statistic
A test statistic measures compatibility between the null hypothesis and the data (some results can happen by chance). In general it is a random variable with a known distribution, and we compute its specific value from the sample. Whether a difference matters depends on the variation in the data: in the plastic bag example, 55 compared with 50 can be a large or a small difference depending on how many standard errors it is from the hypothesized mean (if the data range from 35 to 65, 55 may not be very drastic). In other words: how many standard errors away from the hypothesized value were you? Reminders about the t distribution: it is symmetric around 0, specified by the number of degrees of freedom (NOT by one parameter, the mean), approaches a normal distribution, and has more probability in the tails than the standard normal distribution.
P-value
The evidence against H0 is quantified using a p-value: the probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually observed. That is, the p-value is the probability of this particular sample result (or one more extreme) assuming that H0 is true. Smaller p-values indicate stronger evidence against the null hypothesis. Because the p-value is a probability, it falls in the range [0, 1]. Using the p-value, we reject H0 if p-value < α.
Population Mean
The mean of X̄ is µ, the population mean: the average or expected value of X̄ is µ. The standard deviation of X̄ is also determined from the population, whose standard deviation is σ. As the sample size n increases, the standard deviation of X̄ (σ/√n) decreases → it is smaller than the deviation of individual observations because big and small values average each other out, so X̄ tends to deviate less around the mean.
When to use one or the other?
Use a one-sided alternative hypothesis when you have some preconceived notion about the direction of the difference; otherwise, use a two-sided alternative. Payment time: a consulting firm uses hypothesis testing to provide strong evidence that the new electronic billing system has reduced the mean payment time to below 19.5 days. Null hypothesis: no reduction in mean payment time (equal to or greater than 19.5 days); alternative hypothesis: below 19.5 days (one-sided). Camshaft: an automobile manufacturer uses hypothesis testing to study an important quality characteristic affecting V6 engine camshafts; the goal is to check whether the mean "hardness depth" differs significantly from its desired target value of 4.5 mm. Null hypothesis: µ equals 4.5 mm; alternative hypothesis: µ is not equal to 4.5 mm (above or below) ⇒ a two-sided alternative.
Ch. 8 Estimating Parameters
We use a sample statistic to estimate a population parameter. (Recall that a statistic is a numerical measure computed from a sample; the larger the sample used to compute the statistic, the better it will estimate the corresponding population parameter.) We care about the mean of a population, a proportion (e.g., the proportion of defective elevator rails from an assembly line, or the proportion of the population that votes Republican), and the standard deviation.
Sampling Error
Sampling error = X̄ − µ. Sampling errors exist because different samples yield different values of X̄.
Credit Card Example (population proportion)
Standard error for a sample proportion: with n = 100 and x = 14, p = x/n = 0.14, and se(p) = √(p(1 − p)/n) = √(0.14 × 0.86/100) ≈ 0.035.
Confidence Interval for a Proportion (x)
→ ALWAYS a z score with a proportion. "What proportion of the population prefers Coke to Pepsi?" (always between 0 and 1); "What proportion of the population votes Democrat versus Republican?" Rule of thumb: the sample proportion p = x/n can be assumed to be normal if both nπ ≥ 10 and n(1 − π) ≥ 10 → to assume normality, the sample should have at least 10 "successes" and at least 10 "failures" (we want x ≥ 10 and n − x ≥ 10) ⇒ can check by multiplying n by the probability and seeing whether it is at least 10. See HW 7 problem 7 for a worked confidence interval for a proportion; a sketch follows below.
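A minimal Python sketch of a z-based confidence interval for a proportion; the counts are hypothetical, not from HW 7:

```python
import math
from scipy import stats

# Minimal sketch with hypothetical counts: 95% confidence interval for a proportion.
n, x = 250, 90                                     # 90 "successes" out of 250
p = x / n
se = math.sqrt(p * (1 - p) / n)
z = stats.norm.ppf(0.975)                          # 1.96
print(p - z * se, p + z * se)                      # ~[0.30, 0.42]
```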
Central Limit Theorem
→ If a random sample of size n is drawn from a population with mean µ and standard deviation σ, the distribution of the sample mean X̄ approaches a normal distribution with mean µ and standard deviation σx̄ = σ/√n as the sample size increases ⇒ virtually any population leads to a bell-shaped curve for X̄ ⇒ the central limit theorem. If you take a big enough sample, the shape of the distribution of X̄ turns out to be (approximately) normal. With a sample size of 1, the sampling distribution is just the population distribution (e.g., for a uniform population it is a flat line, and X̄ equals the single observed value). A small simulation sketch follows below.
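A minimal simulation sketch of the CLT using a skewed population; the exponential population and its mean are hypothetical choices for illustration:

```python
import numpy as np

# Minimal sketch: sample means from a skewed (exponential) population look roughly
# normal for moderately large n, with mean mu and standard deviation sigma/sqrt(n).
rng = np.random.default_rng(3)
mu = 2.0                       # exponential with scale 2 has mean 2 and sd 2
n = 50
xbars = rng.exponential(scale=mu, size=(10_000, n)).mean(axis=1)
print(xbars.mean(), xbars.std())     # ~2.0 and ~2/sqrt(50) ≈ 0.283
```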
How to calculate t-score on excel!
α = 0.05 for a 95% CI: =T.INV.2T(0.05, 139) = 1.977, or equivalently =T.INV(0.975, 139) = 1.977.
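A minimal Python equivalent of the Excel formulas above:

```python
from scipy import stats

# Minimal sketch: the same t critical value computed with scipy instead of Excel.
print(stats.t.ppf(0.975, df=139))    # ~1.977, matches =T.INV.2T(0.05, 139)
```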