Quant 1 - Section 2
Define alternative hypothesis:
An alternative hypothesis is a hypothesis to be considered as an alternative to the null hypothesis. We use the symbol Ha to represent the alternative hypothesis. The choice of the alternative hypothesis depends on and should reflect the purpose of the hypothesis test. Three choices are possible for the alternative hypothesis: Ha: μ ≠ μ0 (two-tailed), Ha: μ < μ0 (left-tailed), and Ha: μ > μ0 (right-tailed).
Define null hypothesis:
A null hypothesis is a hypothesis to be tested. We use the symbol H0 to represent the null hypothesis. Traditionally, the null in null hypothesis stood for "no difference" or the "difference is null".
Define unbiased estimator
A statistic is called an unbiased estimator of a parameter if the mean of all its possible values equals the parameter; otherwise, the statistic is called a biased estimator. We want our statistic to be unbiased and to have a small standard error, because then the chances are relatively good that our point estimate (i.e. the value of the statistic in question) will be close to the parameter of interest.
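As a minimal sketch of this definition, the following enumerates every possible sample of size 2 (drawn with replacement) from a tiny made-up population and checks that the mean of all possible sample means equals the population mean. The population values and the sample size are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population; the values are made up for illustration.
population = [2, 4, 6, 12]
n = 2  # sample size

# The parameter: the population mean, kept exact with Fraction.
mu = Fraction(sum(population), len(population))

# Every possible sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# The statistic under study: the sample mean of each possible sample.
sample_means = [Fraction(sum(s), n) for s in samples]

# Mean of all possible values of the statistic.
mean_of_sample_means = sum(sample_means) / len(sample_means)

print("population mean:", mu)
print("mean of all sample means:", mean_of_sample_means)
```

Because the two numbers agree exactly, the sample mean behaves as an unbiased estimator of the population mean in this toy setting.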
Any decision we make based on a hypothesis test may be ________.
Any decision we make based on a hypothesis test may be incorrect because we have used partial information obtained from a sample to draw conclusions about the entire population. There are two types of incorrect decisions: Type I and Type II Errors.
List the four basic properties of the t-curve:
Basic Properties of t-Curves: 1. The total area under a t-curve equals 1. 2. A t-curve extends indefinitely in both directions, approaching but never touching the horizontal axis as it does so. 3. A t-curve is symmetric about 0. 4. As the number of degrees of freedom becomes larger, t-curves look increasingly like the standard normal curve. Percentages (and probabilities) for a variable having a t-distribution equal areas under the variable's associated t-curve. tα denotes the t-value having area α to its right under a t-curve. Because of the way the t-table is set up, it gives tα directly from the right-tail area α; we do not need to subtract α from 1 first, as we do when finding zα from a z-table.
If the primary concern of a hypothesis test is to decide whether a population mean, μ, is less than a specified value μ0, we express the alternative hypothesis as
If the primary concern of a hypothesis test is to decide whether a population mean, μ, is less than a specified value μ0, we express the alternative hypothesis as Ha: μ < μ0. A hypothesis test whose alternative hypothesis has this form is called a one-tailed test, specifically, a left-tailed test.
Another way we can calculate a particular confidence interval (CI) if we are given E (Margin of Error) is ___________.
If we are given the Margin of Error, we can determine the confidence interval quickly (i.e. without finding zα/2) by simply taking the sample mean plus or minus the margin of error. Note that this simpler method might result in a more conservative (i.e. wider) confidence interval, because the sample size is rounded up if you "worked backwards" to find the needed sample size.
In this book, the null hypothesis for a hypothesis test concerning a population mean, μ, always ______________. We express the null hypothesis as:
In this book, the null hypothesis for a hypothesis test concerning a population mean, μ, always specifies a single value for that parameter. Hence we express the null hypothesis as H0: μ = μ0, where μ0 is some number that represents our belief concerning the world.
The critical value approach to hypothesis testing: Define the rejection region, non rejection region and critical value:
Rejection region: The set of values for the test statistic that leads to rejection of the null hypothesis. Non-rejection region: The set of values for the test statistic that leads to non-rejection of the null hypothesis. Critical value: The value or values of the test statistic that separate the rejection and non-rejection regions. A critical value is considered part of the rejection region.
Obtaining critical values for a one-mean z-test:
Remember that α is equal to our significance level.
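A sketch of how these critical values can be computed, assuming the usual rule for a one-mean z-test: ±zα/2 for a two-tailed test, −zα for a left-tailed test, and zα for a right-tailed test. The function name and the tail labels are hypothetical; `NormalDist` comes from Python's standard library.

```python
from statistics import NormalDist


def z_critical_values(alpha, tail):
    """Return the critical value(s) of a one-mean z-test as a tuple.

    tail is "two", "left", or "right" (labels invented for this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        return (-z, z)
    if tail == "left":
        return (-std_normal.inv_cdf(1 - alpha),)  # -z_alpha
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)  # z_alpha
    raise ValueError("tail must be 'two', 'left', or 'right'")


for tail in ("two", "left", "right"):
    print(tail, [round(c, 3) for c in z_critical_values(0.05, tail)])
```

At α = 0.05 this gives roughly ±1.96 for the two-tailed case and ∓1.645 for the one-tailed cases, matching the usual table values.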
Suppose a hypothesis test is conducted with a small significance level, α. What conclusions can we draw when we reject the H0 and when we fail to reject the H0?
Suppose that a hypothesis test is conducted with a small significance level, α: 1. If the null hypothesis is rejected, we conclude that the data provide sufficient evidence to support the alternative hypothesis. 2. If the null hypothesis is not rejected, we conclude that the data do not provide sufficient evidence to support the alternative hypothesis. We DO NOT conclude that the data provide sufficient evidence to support the null hypothesis, because we typically do not know the probability, ß, of making a Type II Error, that is, of not rejecting a false null hypothesis.
What are the conditions for obtaining a critical value?
Suppose that a hypothesis test is to be performed at a significance level of α. Then the critical value(s) must be chosen so that, if the null hypothesis is true, the probability is α that the test statistic will fall in the rejection region.
Studentized Version of the Sample Mean: Define.
Suppose that a variable x of a population is normally distributed with mean μ. Then, for samples of size n, the variable t = (x̄ - μ)/(s/√n) has the t-distribution with n-1 degrees of freedom. A variable with a t-distribution has an associated curve, called a t-curve.
Confidence Interval Estimate
The Confidence Interval Estimate consists of the confidence interval (CI) and the confidence level. A confidence interval estimate for a parameter provides a range of numbers along with a percentage confidence that the parameter lies in that range.
What is the Margin of Error (labeled E)?
The Margin of Error, E, is equal to half the length of the confidence interval. The Margin of Error for the estimate of μ is E = Zα/2 * σ/√n. The Margin of Error is also known as the maximum error of the estimate, because we are 100(1-α)% sure that our error in estimating μ is AT MOST Zα/2 * σ/√n.
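A small sketch of the margin of error formula, assuming σ is known. The function name and the numbers (σ = 12, n = 36, 95% confidence) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence_level):
    """E = z_{alpha/2} * sigma / sqrt(n), for a known population sigma."""
    alpha = 1 - confidence_level
    z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return z_half_alpha * sigma / sqrt(n)


# Hypothetical inputs: sigma = 12, n = 36, 95% confidence level.
E = margin_of_error(sigma=12.0, n=36, confidence_level=0.95)
print(round(E, 2))  # roughly 1.96 * 12 / 6 = 3.92
```

The full confidence interval then has length 2E, which is why E is described as half the length of the interval.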
Describe the basic logic of Hypothesis testing:
The basic logic of hypothesis testing is as follows: Take a random sample from the population. If the sample data are consistent with the null hypothesis, do not reject the null hypothesis; if the sample data are inconsistent with the null hypothesis and supportive of the alternative hypothesis, reject the null hypothesis in favor of the alternative hypothesis. In practice, we must develop a precise criterion for deciding whether to reject the null hypothesis. This precise criterion involves a test statistic, a statistic calculated from the data that is used as a basis for deciding whether H0 should be rejected.
The formula for determining sample size (n) involves the population standard deviation, σ, which is usually unknown. In such cases we can ______________ to get an answer if σ is unknown.
The formula for determining sample size (n) involves the population standard deviation, σ, which is usually unknown. In such cases we can take a preliminary large sample, say, of size 30 or more, and use the sample standard deviation, s, in place of σ. Remember the formula for s is s = √( Σ(xi - x̄)² / (n-1) ).
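The sample standard deviation mentioned here can be sketched as follows, using the usual definition with n-1 in the denominator and comparing it with `statistics.stdev` from the standard library. The data are made up, and the sample is far smaller than the 30 or more suggested above; it is kept tiny only so the arithmetic is easy to follow.

```python
from math import sqrt
from statistics import stdev


def sample_std_dev(data):
    """s = sqrt( sum((x_i - x_bar)^2) / (n - 1) )."""
    n = len(data)
    x_bar = sum(data) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return sqrt(squared_deviations / (n - 1))


# Hypothetical preliminary sample (illustration only).
sample = [10.0, 12.0, 9.0, 11.0, 13.0]

s_manual = sample_std_dev(sample)
s_library = stdev(sample)  # the standard library uses the same n - 1 formula
print(round(s_manual, 4), round(s_library, 4))
```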
What does the length of a confidence interval tell us?
The length (i.e. the size of the range of values) of the confidence interval (CI) indicates the precision of the estimate, or how well we have "pinned down" the population mean, μ. Long confidence intervals indicate poor precision and short confidence intervals indicate good precision.
Explain the relationship between Margin of Error (E), Precision, and Sample Size:
The length of the confidence interval for a population mean, μ, and therefore the precision with which x̄ estimates μ, is determined by the margin of error, E. For a fixed confidence level, increasing the sample size improves the precision, and vice versa.
Part of evaluating the effectiveness of a hypothesis test involves analyzing the chances of making an incorrect decision. How do we denote the probability of making Type I and Type II errors?
The probability of making a Type I Error, that is, of rejecting a true null hypothesis, is called the significance level of the hypothesis test and is denoted α. A Type II Error occurs if a false null hypothesis is not rejected; the probability of a Type II Error occurring is denoted ß (beta).
How does the confidence level affect the length of the confidence interval?
The relationship between confidence and precision: For a fixed sample size, decreasing the confidence level improves the precision, and vice versa. In other words, decreasing the confidence level decreases the length of the confidence interval, and vice versa. (Essentially, there is a trade-off between precision and confidence.)
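To make the trade-off concrete, here is a sketch that computes the length of a z-based confidence interval, 2 ⋅ zα/2 ⋅ σ/√n, at three confidence levels for the same sample size. The values σ = 12 and n = 36 are assumptions for illustration, and σ is treated as known.

```python
from math import sqrt
from statistics import NormalDist


def ci_length(sigma, n, confidence_level):
    """Length of the z-interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence_level
    z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z_half_alpha * sigma / sqrt(n)


# Same hypothetical sigma and n; only the confidence level changes.
lengths = {level: ci_length(12.0, 36, level) for level in (0.90, 0.95, 0.99)}
for level, length in lengths.items():
    print(f"{level:.0%} confidence: length {length:.2f}")
```

The interval gets longer as the confidence level rises, so higher confidence is paid for with lower precision.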
The standardized (z) version of x̄ has the _______________ distribution. The studentized (t) version of x̄ has the ___________ distribution. What is the main defining characteristic of this new distribution?
The standardized version of x̄ has the standard normal distribution; the studentized version of x̄ has a distribution known as the t-distribution. (Note that the variable itself is normally distributed, but the studentized version of x̄ does not have a normal distribution.) There is a different t-distribution for each sample size. We identify a particular t-distribution by its number of degrees of freedom (df). For the studentized version of x̄, the number of degrees of freedom is 1 less than the sample size, which we indicate symbolically by df = n-1.
The symbol zα (alpha) is used to denote ___________.
The symbol zα (alpha) is used to denote the z-score that has an area of α to its right under the standard normal curve. For example, z(.05) represents the z-score that has an area of .05 to its right under the standard normal curve. Because the area to its right is .05, the area to its left is 1 - .05, or .95. To find the z-score that corresponds to z(.05), use the z-table to find the z-score that corresponds to 1 - .05, since z-tables show the area to the left.
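The table lookup described above can be mirrored in code: to get zα, ask for the z-score whose area to the left is 1 - α. This is a minimal sketch using `NormalDist.inv_cdf` from the standard library; the helper name `z_alpha` is hypothetical.

```python
from statistics import NormalDist


def z_alpha(alpha):
    """z-score with area alpha to its right under the standard normal curve."""
    # inv_cdf works with left-tail area, so pass 1 - alpha.
    return NormalDist().inv_cdf(1 - alpha)


print(round(z_alpha(0.05), 3))   # about 1.645
print(round(z_alpha(0.025), 2))  # about 1.96
```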
What is the test statistic for a one-mean z-test?
The test statistic for a one-mean z-test is z = (x̄ - μ0)/(σ/√n). The one-mean z-test is the procedure we use to perform a hypothesis test for one population mean when the population standard deviation is known and the variable under consideration is normally distributed (or, given the CLT, the sample size is sufficiently large).
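A sketch of the standard one-mean z statistic, z = (x̄ - μ0)/(σ/√n). The numbers are invented: they reuse the 454 g advertised weight as μ0 and assume a sample mean of 450 g, σ = 7.8 g, and n = 25.

```python
from math import sqrt


def one_mean_z_statistic(x_bar, mu_0, sigma, n):
    """z = (x_bar - mu_0) / (sigma / sqrt(n)), with sigma assumed known."""
    return (x_bar - mu_0) / (sigma / sqrt(n))


# Hypothetical data: sample mean 450 g from n = 25 bags, sigma = 7.8 g.
z = one_mean_z_statistic(x_bar=450.0, mu_0=454.0, sigma=7.8, n=25)
print(round(z, 2))  # (450 - 454) / (7.8 / 5), about -2.56
```

The value of z is then compared with the critical value(s) for the chosen significance level.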
If we are given the margin of error and the confidence level, we must determine the sample size necessary to meet those specifications. To find the formula for the required sample size we must ________.
To find the formula for the required sample size, we solve the margin of error formula, E = Zα/2 ⋅ σ/√n, for n, which gives n = (Zα/2 ⋅ σ / E)², rounded up to the nearest whole number.
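Solving E = Zα/2 ⋅ σ/√n for n gives n = (Zα/2 ⋅ σ / E)². The sketch below applies that formula and rounds up, so that the achieved margin of error is no larger than the one requested. The inputs (σ = 12, E = 2, 95% confidence) are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma, E, confidence_level):
    """n = (z_{alpha/2} * sigma / E)^2, rounded up to a whole number."""
    alpha = 1 - confidence_level
    z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z_half_alpha * sigma / E) ** 2)


# Hypothetical request: margin of error 2 with sigma = 12 at 95% confidence.
n = required_sample_size(sigma=12.0, E=2.0, confidence_level=0.95)
print(n)  # (1.96 * 12 / 2)^2 is about 138.3, so n = 139
```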
If we want to improve precision without decreasing our confidence level, we can ________.
To improve precision without decreasing our confidence level, we must decrease the margin of error, E. Because the sample size, n, occurs in the denominator of the formula for E (MoE), we can decrease E by increasing the sample size. This should make intuitive sense since we expect more precise information from larger samples.
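A quick numerical sketch of this point: holding the confidence level at 95% (zα/2 taken as the usual table value 1.96) and σ at a made-up value of 10, the margin of error shrinks as n grows.

```python
from math import sqrt


def margin_of_error(sigma, n, z_half_alpha=1.96):
    """E = z_{alpha/2} * sigma / sqrt(n); 1.96 corresponds to 95% confidence."""
    return z_half_alpha * sigma / sqrt(n)


# Hypothetical sigma = 10; only the sample size changes.
errors = {n: margin_of_error(10.0, n) for n in (25, 100, 400)}
for n, E in errors.items():
    print(n, round(E, 2))
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.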
In what noticeable way will the t variable differ from the standard z variable, and why does this difference typically occur?
Typically, the studentized (t) version has more spread (i.e. variance) than the standardized (z) version. This difference is not surprising because the variation in the possible values of the standardized version is due solely to the variation of sample means, whereas that of the studentized version is due to the variation of both sample means AND sample standard deviations.
We can say that 100(1-α)% of all samples of size n have means within _________.
We can say that 100(1-α)% of all samples of size n have means within Zα/2 ⋅ σ/√n of μ. Equivalently, we can say that 100(1-α)% of all samples of size n have the property that the interval from x̄ - Zα/2 ⋅ σ/√n to x̄ + Zα/2 ⋅ σ/√n contains μ.
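The interval described above can be sketched as a small function that returns the endpoints x̄ ± Zα/2 ⋅ σ/√n. It assumes σ is known; the function name and the inputs (x̄ = 50, σ = 12, n = 36) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_mean_z_interval(x_bar, sigma, n, confidence_level):
    """Return (lower, upper) = x_bar -/+ z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence_level
    z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    E = z_half_alpha * sigma / sqrt(n)  # margin of error
    return (x_bar - E, x_bar + E)


# Hypothetical sample summary: x_bar = 50, sigma = 12, n = 36.
lower, upper = one_mean_z_interval(50.0, 12.0, 36, 0.95)
print(round(lower, 2), round(upper, 2))  # roughly 46.08 and 53.92
```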
Previously we determined a confidence interval when σ (the population standard deviation) is known, which is based on the standardized version of x̄. However, it is unlikely that we would know σ. If we do not know σ, then we cannot base our confidence interval (CI) procedure on the standardized version of x̄. What can we do then to find the confidence interval to estimate the population mean, μ?
When we do not know σ, the best we can do is estimate the population standard deviation, σ, by the sample standard deviation, s. In other words, we replace σ by s and base our confidence interval procedure on the resulting variable, called the studentized version of x̄: t = (x̄ - μ)/(s/√n).
Give the procedure to obtain a confidence interval for a population mean when σ is unknown:
When σ is unknown, we invoke a t-distribution instead of the standard normal distribution to obtain a confidence interval for a population mean. This procedure is called the One-Mean t-Interval Procedure. The confidence interval for μ is from x̄ - tα/2 ⋅ s/√n to x̄ + tα/2 ⋅ s/√n, where tα/2 is found using df = n-1. The assumptions that must be made to invoke this procedure are: 1. Simple random sample taken 2. Normal population or large sample taken 3. σ is unknown.
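A minimal sketch of the one-mean t-interval, x̄ ± tα/2 ⋅ s/√n with df = n-1. Python's standard library has no t-distribution, so this sketch assumes tα/2 is looked up in a t-table and passed in by hand. The five data values are made up; 2.776 is the table value of t0.025 for df = 4.

```python
from math import sqrt
from statistics import mean, stdev


def one_mean_t_interval(sample, t_half_alpha):
    """Return (lower, upper) = x_bar -/+ t_{alpha/2} * s / sqrt(n).

    t_half_alpha must be looked up for df = len(sample) - 1.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 denominator)
    E = t_half_alpha * s / sqrt(n)
    return (x_bar - E, x_bar + E)


# Hypothetical sample of size 5, so df = 4 and t_{0.025} = 2.776 (table value).
sample = [10.0, 12.0, 9.0, 11.0, 13.0]
low, high = one_mean_t_interval(sample, t_half_alpha=2.776)
print(round(low, 3), round(high, 3))
```

With a statistics package that provides the t-distribution, the table lookup could be replaced by an inverse-CDF call, but that is outside this sketch.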
With the critical-value approach to hypothesis testing, we choose a "cut-off point" (or cutoff points) based on __________.
With the critical-value approach to hypothesis testing, we choose a "cut-off point" (or cutoff points) based on the significance level of the hypothesis test. The criterion for deciding whether to reject the null hypothesis involves a comparison of the value of the test statistic to the cut-off point(s).
Define Confidence Interval (CI)
A Confidence Interval (CI) is an interval of numbers obtained from a point estimate of a parameter.
Define Hypothesis: Define Hypothesis Test:
A Hypothesis is a statement that something is true. For example, "the mean weight of all bags of pretzels packaged differs from the advertised weight of 454 g" is a hypothesis. We often use inferential statistics to make decisions or judgments about the value of a parameter, such as a population mean. One of the most commonly used methods for making such decisions or judgments is to perform a hypothesis test. The problem or goal of a hypothesis test is to decide whether the null hypothesis should be rejected in favor of the alternative hypothesis.
Define Point Estimate
A Point Estimate of a parameter is the value of a statistic used to estimate the parameter. A point estimate consists of a single number, or point. Roughly speaking, a point estimate of a parameter is our best guess for the value of the parameter based on available sample data. The term point estimate applies to the value of the statistic used to estimate the parameter. (Parameter example: the population mean is a parameter.)
Define Type I Error: Define Type II Error:
A Type I Error occurs when we reject the null hypothesis when it is in fact true. A Type II Error occurs when we do not reject the null hypothesis when it is in fact false.
Define Confidence Level
Confidence Level is the confidence we have that the parameter lies in the confidence interval (i.e. that the confidence interval contains the parameter). It is usually expressed as a percentage. The confidence level of a confidence interval for a population mean, μ, signifies the confidence we have that μ actually lies in that interval.
Define and describe t-curves
Every variable with a t-distribution has an associated curve, called a t-curve. Although there is a different t-curve for each number of degrees of freedom, all t-curves are similar and resemble the standard normal curve. T-curves are more spread out than the standard normal curve. This property follows from the fact that, for a t-curve with ν ("nu") degrees of freedom where ν > 2, the standard deviation is √(ν/(ν-2)). This quantity always exceeds 1, which is the standard deviation of the standard normal curve.
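The standard deviation of a t-curve is the square root of ν/(ν-2) for ν > 2. A short sketch evaluates it for a few degrees of freedom to show that it always exceeds 1 and moves toward 1 as ν grows. The function name and the chosen values of ν are arbitrary.

```python
from math import sqrt


def t_curve_sd(df):
    """Standard deviation of a t-distribution: sqrt(df / (df - 2)), df > 2."""
    if df <= 2:
        raise ValueError("the formula requires more than 2 degrees of freedom")
    return sqrt(df / (df - 2))


sds = [t_curve_sd(df) for df in (3, 5, 10, 30, 100)]
for df, sd in zip((3, 5, 10, 30, 100), sds):
    print(df, round(sd, 4))
```

The shrinking values are consistent with t-curves looking more and more like the standard normal curve as the degrees of freedom increase.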
Describe the relationship between Type I and Type II Error Probabilities
For a fixed sample size, the smaller we specify the significance level, α, the larger will be the probability, ß, of not rejecting a false null hypothesis.
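To illustrate this trade-off, the sketch below computes ß for a right-tailed one-mean z-test under one specific assumed true mean. For H0: μ = μ0 versus Ha: μ > μ0, we fail to reject when z < zα, so ß = Φ(zα - (μtrue - μ0)/(σ/√n)), where Φ is the standard normal cumulative distribution function. All numbers (μ0 = 454, μtrue = 457, σ = 7.8, n = 25) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_probability(alpha, mu_0, mu_true, sigma, n):
    """Beta for a right-tailed z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)          # critical value
    shift = (mu_true - mu_0) / (sigma / sqrt(n))     # mean of z under mu_true
    return std_normal.cdf(z_alpha - shift)           # P(z < z_alpha)


# Same hypothetical test, three significance levels.
betas = {
    alpha: type_ii_error_probability(alpha, 454.0, 457.0, 7.8, 25)
    for alpha in (0.10, 0.05, 0.01)
}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f}, beta = {beta:.3f}")
```

With the sample size held fixed, ß grows as α is made smaller.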
Explain the difference in rejection regions for two-tailed, right-tailed and left-tailed tests:
For a two-tailed test, H0 is rejected when the test statistic is either too small or too large. For a left-tailed test, H0 is rejected only when the test statistic is too small. For a right-tailed test, H0 is rejected only when the test statistic is too large.
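These three decision rules can be sketched for a z-test as follows. The function name and tail labels are hypothetical, and the comparisons use "or equal" because a critical value counts as part of the rejection region.

```python
from statistics import NormalDist


def reject_null(z, alpha, tail):
    """Decide whether a z test statistic falls in the rejection region."""
    std_normal = NormalDist()
    if tail == "two":
        # Too small or too large: |z| at or beyond z_{alpha/2}.
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    if tail == "left":
        # Too small: z at or below -z_alpha.
        return z <= -std_normal.inv_cdf(1 - alpha)
    if tail == "right":
        # Too large: z at or above z_alpha.
        return z >= std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")


# Hypothetical test statistic z = -1.8 at the 5% significance level.
decisions = {tail: reject_null(-1.8, 0.05, tail) for tail in ("two", "left", "right")}
print(decisions)
```

Here the same test statistic leads to rejection only in the left-tailed test, since -1.8 lies below -1.645 but inside ±1.96.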
If the primary concern of a hypothesis test is to decide whether a population mean, μ is different from a specific value, we express the alternative hypothesis as
If the primary concern of a hypothesis test is to decide whether a population mean, μ, is different from a specified value μ0, we express the alternative hypothesis as Ha: μ ≠ μ0. A hypothesis test whose alternative hypothesis has this form is called a two-tailed test.
If the primary concern of a hypothesis test is to decide whether a population mean, μ, is greater than a specified value μ0, we express the alternative hypothesis as
If the primary concern of a hypothesis test is to decide whether a population mean, μ, is greater than a specified value μ0, we express the alternative hypothesis as Ha: μ > μ0. A hypothesis test whose alternative hypothesis has this form is called a one-tailed test, specifically, a right-tailed test.
The larger the sample size, the ________ the sampling error. And with knowledge of confidence intervals, we can determine _____________.
The larger the sample size, the smaller the sampling error tends to be in estimating a population mean, μ, by a sample mean, x̄. With knowledge of confidence intervals, we can determine exactly how sample size affects the accuracy of an estimate.