Chapter 23: Foundations of Statistical Inference Part 1
What are the assumptions in inferential statistics based on?
Two important concepts of statistical reasoning: probability and sampling error
What does probability imply?
Uncertainty
What is a point estimate?
a single value obtained by direct calculation from sample data
What is the aim of superiority trials?
to demonstrate that one treatment is "superior" to another
Which confidence intervals are used most often?
90%, 95%, and 99%
What is a confidence interval (CI)?
A confidence interval (CI) is a range of scores with specific boundaries, or confidence limits, that should contain the population mean.
What does a non-inferiority trial focus on?
A non-inferiority trial focuses on demonstrating that the effect of a new intervention is the same as standard care, in an effort to show that it is an acceptable substitute. This approach is based on the intent to show no difference or, more precisely, that the new treatment is "no worse" than standard care.
What is the difference between a nondirectional alternative hypothesis and a directional alternative hypothesis?
A nondirectional alternative hypothesis does not specify which group is expected to be larger, and a directional alternative hypothesis indicates the expected direction of difference between sample means
What do we expect to see according to the concept of sampling error?
According to the concept of sampling error, we would expect to see some differences between groups even when a treatment is not at all effective.
What happens to the samples as sample sizes increase?
As sample size increases, samples become more representative of the population, and their means are likely to be closer to the population mean, which means that their sampling error will be smaller.
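The relationship between sample size and sampling error can be checked with a small simulation. This is only a sketch: the population mean of 100, the standard deviation of 15, the normal shape of the population, the sample sizes, and the function name `average_abs_sampling_error` are all assumptions chosen for illustration.

```python
import random
from statistics import mean

def average_abs_sampling_error(n, pop_mean=100.0, pop_sd=15.0, repeats=200, seed=7):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(repeats):
        # Draw one random sample of size n from the assumed population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        errors.append(abs(mean(sample) - pop_mean))
    return mean(errors)

small_n_error = average_abs_sampling_error(n=10)
large_n_error = average_abs_sampling_error(n=400)
print(small_n_error, large_n_error)
```

The second number printed should be clearly smaller than the first, since means of larger samples tend to sit closer to the population mean.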
What is an interval estimate, and how do we use it?
Because a single sample value will most likely contain some degree of error, it is more meaningful to use an interval estimate, by which we specify a range within which we believe the population parameter will lie.
Why are sampling distributions only theoretical?
Because no one actually goes through the process of drawing every possible sample from a population and plotting the means in practice.
What can you use sampling distribution for?
Because of the predictable properties of the normal curve, we can use the concept of the sampling distribution to formulate a basis for drawing inferences from sample data.
What can we never be sure of even when truly random samples are used?
Even when truly random samples are used, we cannot be sure that one's sample characteristics will be identical to those of the population.
How is the alternative hypothesis stated in equation form? What does it predict?
H1: μA ≠ μB (H1: μA − μB ≠ 0); it predicts that the observed difference between the two population means is not due to chance, or that the likelihood that the difference is due to chance is small.
How can the null hypothesis be stated formally as an equation? What does this predict?
H0: μA = μB (H0: μA − μB = 0); it predicts that the mean of Population A is not different from the mean of Population B, or that there will be no treatment effect.
What does it mean if the difference is within the non-inferiority margin?
If the difference between treatments is within that margin, then non-inferiority is established. The new treatment would be considered no worse than standard care, and therefore its preference for use would be based on safety, convenience, cost, or comfort.
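One common way to apply this decision rule is to compare the lower confidence limit of the treatment difference with the margin. The sketch below assumes that the difference is computed as new minus standard, that higher scores are better, and that the margin is given as a positive number. The function name `is_non_inferior` and the example values are hypothetical.

```python
def is_non_inferior(diff_ci_lower, margin):
    """Return True when the new treatment can be called "no worse".

    diff_ci_lower: lower confidence limit of (new - standard); assumes
                   higher scores mean a better outcome.
    margin:        non-inferiority margin, stated as a positive number.
    """
    # The new treatment may be somewhat worse, but not by more than the margin.
    return diff_ci_lower > -margin

# Hypothetical results with a margin of 2.0 points.
print(is_non_inferior(diff_ci_lower=-1.5, margin=2.0))  # within the margin
print(is_non_inferior(diff_ci_lower=-2.5, margin=2.0))  # beyond the margin
```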
For sampling distribution of means, what would happen if we plotted all the sample means?
If we plotted all the sample means, we would find that the distribution would take the shape of a normal curve, and that the mean of the sample means would be equal to the population mean.
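For a very small population, every possible sample can be listed, which makes it possible to check directly that the mean of the sample means equals the population mean. The five scores and the sample size of 2 below are made up for illustration, and a population this small is not enough to show the normal shape.

```python
from itertools import combinations
from math import isclose
from statistics import mean

population = [2, 4, 6, 8, 10]  # hypothetical population of five scores
sample_size = 2

# Every possible sample of the chosen size, and the mean of each one.
sample_means = [mean(s) for s in combinations(population, sample_size)]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)
print(population_mean, mean_of_sample_means)
print(isclose(population_mean, mean_of_sample_means))
```

Individual sample means range from 3 to 9 here, yet their average matches the population mean.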
In practice, why is sampling error unpredictable?
In practice, sampling error is unpredictable because it occurs strictly by chance, by virtue of who happens to get picked for any one sample.
What are the roles of the null and alternative hypothesis in a non-inferiority trial?
In the scheme of a non-inferiority trial, the roles of the alternative and null hypotheses are actually reversed. The null says the standard treatment is better than the new approach, while the alternative states that the new treatment is not inferior to the standard treatment. In this case, it is the non-inferiority margin that must be considered and the confidence interval around that margin.
What is inferential statistics?
Inferential statistics involve a decision-making process that allows us to estimate unknown population characteristics from sample data.
What does probability predict?
It is predictive in that it reflects what should happen over the long run, not necessarily what will happen for any given trial or event
What is probability?
Likelihood that any one event will occur, given all the possible outcomes
Is an event probable once it has occurred?
No. It either happened as predicted or not.
What proportion does probability apply to?
Probability applies to the proportion of time we can expect a given outcome to occur in the idealized "long run"
In the real world, what numbers does probability usually fall between?
Probability for most events falls somewhere between 0 and 1.
What is sampling error of the mean equal to?
Sampling error of the mean for any single sample is equal to the difference between the sample mean and the population mean.
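A minimal sketch of this calculation follows. The population scores, the chosen sample, and the function name `sampling_error` are hypothetical, and in real research the population mean is usually unknown, so this value cannot be computed directly.

```python
from statistics import mean

def sampling_error(sample, population):
    """Sampling error of the mean: sample mean minus population mean."""
    return mean(sample) - mean(population)

# Hypothetical population of 10 scores and one sample of 4 drawn from it.
population = [2, 4, 4, 5, 6, 7, 8, 9, 10, 15]
sample = [4, 6, 9, 15]

error = sampling_error(sample, population)
print(error)  # sample mean 8.5 minus population mean 7
```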
How are statistical hypotheses formally stated?
Statistical hypotheses are formally stated in terms of the population parameter, μ, even though the actual statistical tests will be based on sample data.
How can we view probability statistically?
Statistically, we can view probability as a system of rules for analyzing a complete set of possible outcomes.
What does the concept of 95% confidence limit indicate?
The concept of 95% confidence limits indicates that if we were to draw 100 random samples, each with n = 50, and construct a confidence interval from each one, we would expect 95 of the intervals to contain the true population mean and 5 would not.
How do we correctly interpret the confidence interval?
The correct interpretation of a confidence interval is that if we were to repeat sampling many times, 95% of the time our confidence interval would contain the true population mean.
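The repeated-sampling idea can be illustrated with a simulation that builds many 95% intervals and counts how often they contain the true mean. This is a sketch under stated assumptions: a normally distributed population with mean 100 and standard deviation 15, samples of n = 50, and z = 1.96 applied to the estimated standard error. The function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def coverage(pop_mean=100.0, pop_sd=15.0, n=50, repeats=1000, z=1.96, seed=1):
    """Proportion of simulated confidence intervals that contain pop_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        m = mean(sample)
        se = stdev(sample) / sqrt(n)  # estimated standard error of the mean
        if m - z * se <= pop_mean <= m + z * se:
            hits += 1
    return hits / repeats

print(coverage())
```

The printed proportion should land near 0.95, although any single run will differ a little by chance.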
How is the degree of confidence usually expressed?
The degree of confidence is expressed as a probability percentage, typically 95% or 99% confidence.
What is the estimation of population characteristics based on?
The estimation of population characteristics from sample data is based on the assumption that samples are random and valid representatives of the population.
What does an interval estimate take into consideration?
The interval estimate takes into consideration not only the value of a single sample statistic but the relative accuracy of that statistic as well.
Since, because of sampling error, groups will probably not have exactly equal means even when there is no true treatment effect, what is the null hypothesis really indicating?
The null hypothesis really indicates that observed differences are sufficiently small to be considered pragmatically equivalent to zero.
What does the alternative hypothesis state?
The alternative hypothesis states that the treatment is effective and that the effect is too large to be considered a result of chance alone.
What is the null hypothesis?
The observed difference between the groups occurred by chance; states that the group means are not different, which means that the groups come from the same population
What is the estimated standard error of the mean, calculated from the sample data, based on?
The standard deviation and the sample size
When establishing the variance properties of a sampling distribution of means, what is the standard deviation called, and what is it considered to be?
The standard deviation of the sampling distribution is called the standard error of the mean, and this value is considered an estimate of the population standard deviation.
What does the standard deviation indicate?
The standard deviation of the sampling distribution is an indicator of the degree of sampling error, reflecting how accurately the various sample means estimate the population mean.
What happens to the standard error as n increases?
The standard error of the mean decreases. With larger samples, the sampling distribution is expected to be less variable, and therefore, a statistic based on a larger sample is considered a better estimate of a population parameter than one based on a smaller sample.
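The usual estimate of the standard error of the mean is the sample standard deviation divided by the square root of n. The sketch below applies that formula with an assumed standard deviation of 10 and three made-up sample sizes. The function name `sem_from_sd` is hypothetical.

```python
from math import sqrt

def sem_from_sd(sd, n):
    """Estimated standard error of the mean: standard deviation / sqrt(n)."""
    return sd / sqrt(n)

# Same assumed standard deviation, increasing sample sizes.
for n in (25, 100, 400):
    print(n, sem_from_sd(10, n))
```

With the standard deviation held at 10, the standard error falls from 2.0 to 1.0 to 0.5, so each fourfold increase in n cuts it in half.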
What causes inferential statistics to be successful?
The success of this process requires that we make certain assumptions about how well a sample represents a population.
What does it mean when we propose a wider confidence interval?
The wider the interval we propose, the more confident we will be that the true population mean will fall within it.
How can we be more confident in the accuracy of a confidence interval?
To be more confident in the accuracy of a confidence interval, we could use 99% as our reference, allowing only a 1% risk that the interval we propose will not contain the true population mean.
Can we prove the null hypothesis based on the sample data?
We can never actually prove the null hypothesis based on sample data, so the purpose of a study is to give the data a chance of disproving it
How should we appropriately express the outcome of the statistical decision?
We either reject or do not reject the null hypothesis. If we reject the null hypothesis, we can accept the alternative hypothesis.
Why is there always a 5% chance that the population mean is not included in the obtained interval (5% chance the interval is one of the incorrect ones)?
We construct only one confidence interval based on the data from one sample. Therefore, we cannot know if our one sample would produce one of the correct intervals or one of the incorrect ones.
What symbol is used to signify probability, and how is it expressed?
We use a lowercase p to signify probability, expressed as a ratio or decimal.
Why do we use probability in research?
We use probability in research as a guideline for making decisions about how well sample data estimate the characteristics of a population. We also use probabilities to determine if observed effects are likely to have occurred by chance. We try to estimate what would happen to others in the population on the basis of our limited sample.
Do a sample mean and its standard error help us imagine what the sampling distribution curve would look like?
Yes
In probability, are the outcomes a matter of chance, which would be representing random events?
Yes
Is probability complex but essential to understanding inferential statistics?
Yes
Is the population mean always constant and only the confidence interval varies around it?
Yes
Is the researcher's goal always to statistically test the null hypothesis, no matter how the research hypothesis is stated?
Yes
Do the means of small samples tend to vary, and if yes, how wide a curve should we expect?
Yes; we should expect a wide curve because of the great variability.
What is the process called for deciding if an observed effect is or is not likely to be chance variation?
hypothesis testing
When it comes to the null hypothesis, what should the means be called instead of equal?
not statistically different
What is being sacrificed when the confidence interval is 99% instead of 95%?
precision
What is sampling error?
tendency for sample values to differ from population values
What is a non-inferiority margin?
the largest difference between the two treatments that would still be considered functionally equivalent
What does the choice of the confidence interval depend on?
the nature of the variables being studied and the researcher's desired level of accuracy
What does the alternative hypothesis usually represent?
the research hypothesis
What are the boundaries of a confidence interval based on?
the sample mean and its standard error
What are the corresponding z values for 90%, 95%, and 99%?
z= 1.645 for 90%, z= 1.96 for 95%, and z= 2.576 for 99%
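These z values can be recovered from the standard normal distribution and used to build an interval as the sample mean plus or minus z times the standard error. The sketch below uses Python's `statistics.NormalDist`. The sample mean of 50 and standard error of 2 are made-up values, and the function names are hypothetical. It assumes a z-based interval, as in the values above, rather than a t-based one.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed critical z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def confidence_interval(sample_mean, standard_error, confidence=0.95):
    """Confidence limits: sample mean plus or minus z times the standard error."""
    margin = z_critical(confidence) * standard_error
    return (sample_mean - margin, sample_mean + margin)

# Hypothetical sample mean of 50 with a standard error of 2.
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(50.0, 2.0, level)
    print(f"{level:.0%}: z = {z_critical(level):.3f}, CI = ({low:.2f}, {high:.2f})")
```

The 99% interval comes out wider than the 95% interval for the same data, which is the loss of precision that comes with greater confidence.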