Statistics
When is a difference reliable?
1. The people tested are representative of the population, so it matters who was selected for the research. 2. The scores do not vary much. Outliers make the average less predictable and the data less consistent. 3. The number of people tested is high. With only a few participants, the data is less reliable because it cannot represent many of the people in the group.
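The third point can be checked with a small simulation. This is only a sketch: the population of scores is made up (roughly mean 100, SD 15), and the helper name `spread_of_sample_means` is hypothetical. It draws many samples of a given size and measures how much the sample averages bounce around.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, trials=2000, seed=0):
    """Return the standard deviation of the means of many random samples."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

# Hypothetical population of 10,000 scores (assumed values, not real data).
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(10_000)]

small = spread_of_sample_means(population, sample_size=5)
large = spread_of_sample_means(population, sample_size=100)

# Averages from small samples vary more from sample to sample.
print(f"spread of averages with 5 people:   {small:.2f}")
print(f"spread of averages with 100 people: {large:.2f}")
```

The averages from samples of 100 cluster much more tightly than the averages from samples of 5, which is why a larger number of participants gives a more reliable result.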
Describing Data
A meaningful description of data is important in research. Misrepresentation may lead to incorrect conclusions.
What makes averages more reliable? Why?
Averages are more reliable if the scores vary less. This is because a single outlier can pull the mean away from the typical score. When scores cluster closely together, no one score distorts the average, so the average is more predictable and therefore more reliable.
How is calling data statistically significant like calling a correlation strong?
Both indicate that the variables tested may be related. Neither means that the relationship is important; each means only that a relationship is possible.
Which one is most influenced by extreme scores?
The mean is most influenced by an extreme score. If there is one outlier, the median and mode stay relatively unaffected: the outlier will not be the most frequent score (mode), nor will it throw off the midpoint very far (median). However, when it is added to the other scores, it may greatly change the total and so affect the mean.
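A quick check of this claim, using made-up scores and Python's standard `statistics` module. The data values are assumptions chosen for illustration.

```python
import statistics

# Hypothetical test scores, then the same scores plus one extreme score.
scores = [70, 72, 75, 75, 78, 80]
with_outlier = scores + [200]

for label, data in [("original", scores), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)
    print(f"{label}: mean={mean:.1f} median={median:.1f} mode={mode}")

# original: mean=75.0 median=75.0 mode=75
# with outlier: mean=92.9 median=75.0 mode=75
```

Only the mean moves when the score of 200 is added; the median and mode stay at 75.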
p-value
The probability of obtaining results at least as extreme as those observed if chance alone were at work (that is, if the null hypothesis were true).
mean
This is also known as the average. It is the sum of the scores divided by the number of scores.
Range
This is the highest score minus the lowest score. It is easily thrown off by outliers, because it is built from the highest and lowest scores, which is exactly where outliers fall.
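A minimal sketch of the calculation, with made-up scores; the function name `score_range` is hypothetical.

```python
def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

scores = [70, 72, 75, 75, 78, 80]   # hypothetical data
print(score_range(scores))          # 10
print(score_range(scores + [200]))  # 130
```

A single extreme score changes the range from 10 to 130.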
median
This is the midpoint of the data when the scores are put in order. Half of the scores fall above it and half fall below it.
mode
This is the most common number in the data.
normal distribution
a distribution of scores that produces a bell-shaped curve (symmetrical)
skewed distribution
a representation of scores that lacks symmetry around their average value
statistical significance
a statistical statement of how likely it is that an obtained result occurred by chance
standard deviation
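a measure of how much scores typically differ from the mean; computed as the square root of the average squared difference between each score and the mean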
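The calculation can be followed step by step. This sketch uses made-up scores and the population form of the standard deviation (dividing by the number of scores); a sample standard deviation would divide by one less than that number instead.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mean = sum(scores) / len(scores)                       # 5.0
squared_diffs = [(x - mean) ** 2 for x in scores]      # distance from mean, squared
variance = sum(squared_diffs) / len(scores)            # 4.0
sd = math.sqrt(variance)                               # 2.0

print(mean, variance, sd)

# The standard library gives the same population standard deviation.
print(statistics.pstdev(scores))
```

A standard deviation of 2.0 here means a typical score sits about 2 points from the mean of 5.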
type 2 error
false negative
Type 1 error
false positive
inferential statistics
involves using what is observed in a sample to draw conclusions about the characteristics of the population it came from; hypotheses and results must be tested for significance
When is data significant?
When the likelihood of a difference being due to chance is less than 5% (p < .05, the conventional cutoff).
measures of central tendency
mean, median, mode
descriptive statistics
numerical data used to measure and describe characteristics of groups. Includes measures of central tendency and measures of variation.
measures of variability
range and standard deviation
Statistical Reasoning
statistical procedures analyze and interpret data allowing us to see what the unaided eye misses
null hypothesis
the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
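One way to see the null hypothesis at work is a permutation test, sketched below. This technique is not part of the definitions above; it is brought in only as an illustration. The two groups of scores are made up, and the function name `permutation_p_value` is hypothetical. If the null hypothesis is true, the group labels do not matter, so the scores are shuffled between groups many times to see how often chance alone produces a difference at least as large as the one observed.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, trials=5000, seed=0):
    """Estimate how often shuffled labels give a mean difference
    at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # pretend the group labels are arbitrary
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / trials

# Hypothetical scores for two groups (assumed values, not real data).
treatment = [85, 88, 90, 92, 95, 91, 89, 94]
control = [78, 82, 80, 85, 79, 83, 81, 84]

p = permutation_p_value(treatment, control)
print(f"estimated p-value: {p:.4f}")
print("significant at the 5% level:", p < 0.05)
```

Because these two made-up groups barely overlap, shuffling almost never reproduces so large a difference, and the estimated p-value falls below 5%. Under the usual convention, the null hypothesis would be rejected.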