Statistics Ch. 7 Clarifying the Concepts
When we look up a z score on the z table, what information can we report?
We can report the area between the mean and the z score, or the area from the z score to the tail.
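As a rough check on a z table lookup, the two areas can be computed directly. This is only a sketch: the function name z_table_areas is made up for illustration, and it uses Python's standard-library NormalDist in place of a printed table.

```python
from statistics import NormalDist

def z_table_areas(z):
    """Return (area between the mean and z, area in the tail beyond z) as proportions."""
    std_normal = NormalDist()          # standard normal: mean 0, standard deviation 1
    below = std_normal.cdf(abs(z))     # the curve is symmetric, so use |z| like a z table does
    between_mean_and_z = below - 0.5   # half of the curve lies below the mean
    in_tail = 1.0 - below
    return between_mean_and_z, in_tail

between, tail = z_table_areas(1.00)
print(f"{between:.4f} {tail:.4f}")     # 0.3413 0.1587
```

The two areas always sum to .50, which is one half of the curve.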
What are three ways to deal with missing data?
1) Replace the missing data point with the mode or mean for that variable (based on other participants' responses), 2) replace the missing data point with the mode or mean based on that participant's responses to other similar questions, or 3) replace the missing data point with a random number within the possible range of numbers.
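The three replacement strategies might be sketched like this. The function impute, the None marker for a missing response, and the 1 to 7 response range are all assumptions made for the example. Pass one variable's responses across participants for option 1, or one participant's responses to similar questions for option 2.

```python
import random
from statistics import mean, mode

def impute(values, strategy="mean", low=1, high=7, rng=None):
    """Fill None entries using the mean, the mode, or a random in-range number."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "mode":
        fill = mode(observed)
    elif strategy == "random":
        rng = rng or random.Random()
        # Each missing point gets its own random integer within the possible range.
        return [rng.randint(low, high) if v is None else v for v in values]
    else:
        raise ValueError("strategy must be 'mean', 'mode', or 'random'")
    return [fill if v is None else v for v in values]

responses = [4, 5, None, 5, 6]       # hypothetical data with one skipped question
print(impute(responses, "mean"))
print(impute(responses, "mode"))
print(impute(responses, "random"))
```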
What are the six steps of hypothesis testing?
1) Identify the populations, comparison distribution, and assumptions, which helps you decide which test to use, 2) state the null and research hypotheses, 3) determine the characteristics of the comparison distribution (aka, find the numbers you need for equations), 4) determine your critical values, 5) calculate the test statistic, and 6) make a decision about the null hypothesis
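One way to see the six steps together is a small two-tailed z test. This is a sketch under assumptions: the z_test function and the numbers (population mean 100, standard deviation 15, a sample of 25 with mean 106) are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n, p_level=0.05):
    """Two-tailed z test; returns (z, critical value, reject_null)."""
    # Step 1: populations, comparison distribution (distribution of means),
    #         and assumptions are settled before any arithmetic.
    # Step 2: null hypothesis: the sample's population mean equals pop_mean;
    #         research hypothesis: it does not.
    # Step 3: characteristics of the comparison distribution.
    standard_error = pop_sd / sqrt(n)
    # Step 4: critical values, with the p level split between the two tails.
    critical = NormalDist().inv_cdf(1 - p_level / 2)
    # Step 5: test statistic.
    z = (sample_mean - pop_mean) / standard_error
    # Step 6: decision about the null hypothesis.
    reject_null = abs(z) > critical
    return z, critical, reject_null

z, critical, reject = z_test(sample_mean=106, pop_mean=100, pop_sd=15, n=25)
print(f"z = {z:.2f}, critical = +/-{critical:.2f}, reject null: {reject}")
```

With these made-up numbers the standard error is 3, so z is 2.00, which falls just beyond the cutoff of 1.96.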
What is a percentile?
A percentile is the percentage of scores that fall below a certain point on a distribution.
How can misleading data result in missing data?
A researcher may decide to exclude misleading data points and thus be left with missing data.
What does statistically significant mean to statisticians?
A statistically significant finding is one in which we have rejected the null hypothesis because the pattern in the data differed from what we would have expected by chance. The probability of finding an outcome like ours (or one more extreme) if the null hypothesis were true is very low, so we can be more confident that our sample differs from the comparison population.
Using everyday language, rather than statistical language, explain why the words critical region may have been chosen.
The term critical region may have been chosen because it describes the area beneath the normal curve in which values of a test statistic represent a statistically significant result. It is the region that is critical to finding significance.
What are the critical values and the critical region?
Critical values are the test statistic values beyond which we reject the null hypothesis. The critical region refers to the area in the tails of the distribution in which the null hypothesis will be rejected if the test statistic falls there.
What is the standard size of the critical region used by statisticians?
Five percent, or .05.
Why was the word cutoff chosen?
For cutoff, we can think about which test statistics make the cutoff and which ones don't. In other words, we ask which ones cross a mark that we see as relevant to our decision making.
Define assumptions
In statistics, assumptions are the characteristics we ideally require the population from which we are sampling to have so that we can make accurate inferences.
How is calculating a percentile for a mean from a distribution of means different from doing so for a score from a distribution of scores?
In terms of calculating the actual percentile and using the z table, there are no differences between working with a distribution of scores or a distribution of means. However, to get the initial z score for a distribution of means, we need to make sure that we use the standard error (instead of standard deviation) for our calculation.
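The only change between the two cases is the denominator of the z score, which a short sketch can make concrete. The function names and the numbers (population mean 100, standard deviation 15, sample size 25) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def percentile_of_score(score, pop_mean, pop_sd):
    """Percentile of a single score, using the standard deviation."""
    z = (score - pop_mean) / pop_sd
    return 100 * NormalDist().cdf(z)

def percentile_of_mean(sample_mean, pop_mean, pop_sd, n):
    """Percentile of a sample mean, using the standard error instead."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    return 100 * NormalDist().cdf(z)

# The same value of 106 is unremarkable as a score but extreme as a mean of 25 scores.
print(f"{percentile_of_score(106, 100, 15):.1f}")     # 65.5
print(f"{percentile_of_mean(106, 100, 15, 25):.1f}")  # 97.7
```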
What is the difference between parametric tests and non-parametric tests?
Parametric tests are statistical analyses based on a set of assumptions about the population. By contrast, nonparametric tests are statistical analyses that are not based on assumptions about the population. For example, one assumption you often see for parametric tests is that the population should be normally distributed. A related nonparametric test wouldn't have this same assumption, so you could use it even with a population that had a skewed distribution.
What is the difference between a one-tailed and a two-tailed hypothesis test in terms of critical regions?
Researchers typically use a two-tailed test rather than a one-tailed test because they either don't know the direction of the expected difference or do not want to make assumptions about the difference. For a one-tailed test, the critical region (usually 5% or a p level of .05) is placed in only one tail of the distribution; for a two-tailed test, the critical region must be split in half and shared between both tails (usually 2.5% or .025 in each tail).
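To see how splitting the critical region moves the cutoffs, here is a sketch for a z distribution. The function critical_z is hypothetical, and the one-tailed case assumes the critical region sits in the upper tail.

```python
from statistics import NormalDist

def critical_z(p_level=0.05, tails=2):
    """Return the critical z value(s) for a one- or two-tailed test."""
    std_normal = NormalDist()
    if tails == 1:
        # The whole critical region sits in one tail (assumed here: the upper tail).
        return (std_normal.inv_cdf(1 - p_level),)
    if tails == 2:
        # The critical region is split in half, one half in each tail.
        upper = std_normal.inv_cdf(1 - p_level / 2)
        return (-upper, upper)
    raise ValueError("tails must be 1 or 2")

print([round(c, 3) for c in critical_z(0.05, tails=1)])  # [1.645]
print([round(c, 2) for c in critical_z(0.05, tails=2)])  # [-1.96, 1.96]
```

The two-tailed cutoffs are farther from the mean, so a test statistic has to be more extreme to reach the critical region.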
What sample size is recommended in order to meet the assumption of a normal distribution, even when the population of interest is not normal?
Samples of at least 30 observations are recommended. Statisticians have determined that if we repeatedly draw samples of at least 30 observations, the distribution of the means of those samples approximates a normal curve.
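A quick simulation can illustrate the idea. This is a sketch only: the skewed population (an exponential distribution with mean 1 and standard deviation 1), the 5,000 samples, and the seed are arbitrary choices made for the example.

```python
import random
from math import sqrt
from statistics import mean, stdev

def sample_means(n=30, num_samples=5000, seed=42):
    """Draw many samples of size n from a skewed population and return their means."""
    rng = random.Random(seed)
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])  # one sample mean
        for _ in range(num_samples)
    ]

means = sample_means()
print(f"mean of the sample means: {mean(means):.3f}")   # should be close to 1
print(f"spread of the sample means: {stdev(means):.3f}")
print(f"predicted standard error:  {1 / sqrt(30):.3f}")
```

Even though the individual scores are strongly skewed, the means pile up around the population mean with a spread close to the standard error.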
What do these symbolic expressions mean: H0: μ1 = μ2 and H1: μ1 ≠ μ2?
These are just the symbolic ways of presenting our null and research hypotheses. The first (null) hypothesis states that we expect the mean of population 1 to be the same as the mean of population 2. In other words, there are no differences between the two groups. The second (research) hypothesis says that the two populations aren't equivalent, that is, that there is a difference between the two groups.
What are the three kinds of dirty data, and what are their possible sources?
Three kinds of dirty data are missing data, misleading data, and outliers. Missing data may result from participants either intentionally or accidentally skipping some questions or from participants answering only initial parts of a survey and then skipping the rest. Misleading data may come either from participants purposefully providing answers that are not accurate or from poor test design, such that participants do not fully understand how to answer. Outliers refer to extreme data points. These may reflect participants who actually are extreme in the distribution, but they also may be signs of misleading data (e.g., participants not performing the experimental task as directed).
How do we calculate the percentage of scores below a particular positive z score?
We add the percentage between the mean and the z score to 50%, which is the percentage of scores below the mean.