Ch 6A, B, C & D
Lower and Upper quartile
Remember that they are values, but they are based on position in the ordered list, not on the size of the other values. They are the boundaries where the lowest quarter of the data ends and the highest quarter begins. To find them, line up all your scores from lowest to highest and divide them in two, like finding a median; then find the median of each half. Do not take the average of the scores to try to find a quartile.
Measures of central tendency
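are single values that describe the center of the data: the mean, median or mode. How much the data values are spread around the mean is a measure of variation, not central tendency; spread is part of the 'shape' of the data.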
A higher deviation
means the data is more spread out, so you can't really make reliable inferences from it.
Distribution
of a variable or data set describes the values taken on by the variable and the frequency (how many times each value shows up) of these values. In any distribution, it's always true that the range is at least as large as the standard deviation.
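A minimal sketch of a frequency distribution, assuming a small made-up list of scores (the data and variable names are illustrative only, not from the chapter). It counts how often each value shows up and compares the range with the standard deviation.

```python
from collections import Counter
import statistics

# Hypothetical scores, made up for illustration
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6]

# The distribution: each value and how many times it shows up
frequency = Counter(scores)
for value in sorted(frequency):
    print(value, "shows up", frequency[value], "time(s)")

# Range = highest value minus lowest value
data_range = max(scores) - min(scores)

# Population standard deviation of the same scores
std_dev = statistics.pstdev(scores)

print("range:", data_range)
print("standard deviation:", round(std_dev, 2))
print("range >= standard deviation:", data_range >= std_dev)
```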
Mean
the sum of all scores or values divided by the number of values or scores. (The mean is very sensitive to outliers.)
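A small sketch of the calculation, assuming made-up test scores; the helper name `mean` is just for illustration. Adding one extreme score shows how far a single outlier can drag the mean.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

# Hypothetical scores, made up for illustration
scores = [70, 75, 80, 85, 90]
with_outlier = scores + [400]  # one extreme score added

print(mean(scores))        # 80.0
print(mean(with_outlier))  # about 133.3, pulled far above most of the scores
```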
The two types of statistics
1. Descriptive, which describes the sample you actually collected. 2. Inferential, which makes inferences about the population based on what you saw in the sample.
The 5-number summary
1. Lowest value, 2. lower quartile, 3. median (middle), 4. upper quartile & 5. highest value. Taken together with the range, these give you a good overview of the data.
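The summary can be sketched as below. This assumes the "median of each half" way of finding quartiles, leaving the median itself out of both halves when the number of scores is odd; other textbooks and software use slightly different conventions. The function name `five_number_summary` and the data are made up for illustration, and the sketch needs at least two scores.

```python
import statistics

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # With an odd count, skip the middle score so it is in neither half
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        ordered[0],
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
        ordered[-1],
    )

# Hypothetical scores, made up for illustration
scores = [1, 3, 4, 6, 7, 8, 10, 12]
summary = five_number_summary(scores)
print(summary)                    # (1, 3.5, 6.5, 9.0, 12)
print("range:", summary[4] - summary[0])
```

Here the lower quartile falls between 3 and 4, so the two are added and divided by 2 to get 3.5.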
How can you characterize the distribution of data?
1. Measurements of central tendency: how you can sum up a set of values in one sentence, the "average". It can be the mean, the median or the mode. 2. Measurements of variation.
Descriptive
1. Measures of Central Tendency which are the mean, median & mode 2. Measures of Dispersion or Variation which are: Range, Quartiles & standard deviation
Inferential
1. Normal Distribution 2. Statistical Significance
Standard Deviation
Best described as the spread of the data values around the mean. It's roughly a typical distance of every single data value (its deviation) from the mean; not the median or mode! Each data point has its own deviation. Positive or negative doesn't matter, because each deviation gets squared. Subtract the mean from each data point, square the differences, average them (divide by n - 1 for a sample), then take the square root.
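A step-by-step sketch of the usual calculation, assuming made-up scores and treating them as a whole population (divide by n). Squaring is what makes the sign of each deviation stop mattering; the raw deviations always add up to zero, so they can't simply be averaged.

```python
import math
import statistics

# Hypothetical scores, made up for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)           # 5.0

# Step 1: how far each data point is from the mean
deviations = [x - mean for x in scores]

# Step 2: square each deviation so negatives don't cancel positives
squared = [d ** 2 for d in deviations]

# Step 3: average the squared deviations (this is the variance)
variance = sum(squared) / len(scores)      # 4.0

# Step 4: take the square root to get back to the original units
std_dev = math.sqrt(variance)              # 2.0

print("sum of raw deviations:", sum(deviations))
print("standard deviation:", std_dev)
print("matches statistics.pstdev:", math.isclose(std_dev, statistics.pstdev(scores)))
```

For a sample rather than a whole population, the same steps apply but step 3 divides by n - 1; `statistics.stdev` does that.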
Low, moderate & high variation
How widely dispersed the data is. The more spread out the data is the higher the variation.
REVIEW QUESTIONS FROM EACH SECTION
GREAT PREPARATION FOR TEST!
Shape of the Data distribution
Graphs of data with the same mode can have very different shapes, depending on the frequency of the other values. "Shape" refers to the shape of a probability distribution, and it most often arises in questions of finding an appropriate distribution to use to model the statistical properties of a population, given a sample from that population.
Statistical Significance
Inference is trying to go from the sample and say with confidence that it means (fill in the blank) about my population.
Measurements of Dispersion- or variation
Range, Variance and Standard Deviation. Dispersion, or variation, is variability or spread in a variable or a probability distribution. Common examples of measures of statistical dispersion are the variance, standard deviation and interquartile range.
Explain what it means to be statistically significant
Something happened to a greater degree than you would expect if it had happened by chance alone.
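One way to picture "more than chance": a sketch assuming a fair coin tossed 100 times. It works out how likely a given number of heads (or more) is by chance alone. The function name `prob_at_least` is made up for illustration, and the 0.05 cutoff mentioned in the comments is a common convention, not something stated in these notes.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Chance of getting k or more successes in n tries by luck alone."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical experiment: 100 tosses of a fair coin
p_55 = prob_at_least(55, 100)
p_60 = prob_at_least(60, 100)

# 55 or more heads happens fairly often by chance, so it is not surprising
print("P(55 or more heads):", round(p_55, 3))

# 60 or more heads is rare by chance (under the common 0.05 cutoff),
# so it would usually be called statistically significant
print("P(60 or more heads):", round(p_60, 3))
```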
Symmetrical and Skewed graphs
Symmetrical = mode, median & mean are all in the middle. Skewed = outliers are either high or low; values are more spread out on the left or right side of the mode. Left skewed (the graph trails off to the left) means the mean and the median are to the left of the mode; right skewed (it trails off to the right) means they are to the right of the mode.
Range
The highest value minus the lowest value. Gives you an idea of how spread out the data is. For a function, it means the limits of the values the function can take.
Mode
The most common score, or the most frequent value in a collection of values. Bi-modal means two different values tie for the highest frequency. By the rule used here, if no value repeats (0 modes) or more than 3 values tie, there's no mode. The mode is stable, or unaffected by outliers, so it will not be dragged lower or higher by an outlier.
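A short sketch using Python's `statistics.multimode` (Python 3.8 or later), which returns every value tied for the highest frequency. The data sets are made up for illustration.

```python
import statistics

# Hypothetical scores, made up for illustration
one_mode = [1, 2, 2, 3, 3, 3, 4]
two_modes = [1, 2, 2, 3, 3, 4]        # 2 and 3 each show up twice
with_outlier = one_mode + [100]       # one extreme score added

print(statistics.multimode(one_mode))      # [3]
print(statistics.multimode(two_modes))     # [2, 3]  (bi-modal)
print(statistics.multimode(with_outlier))  # [3]  (the outlier changes nothing)
```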
Symmetrical
When a graph is symmetrical the mode, median and mean will all align in the middle. When it's right or left skewed the mean is most affected by outliers & will be pulled furthest, then the median. The mode isn't affected at all.
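A quick check of that ordering, assuming two small made-up data sets: one symmetrical and one that trails off to the right.

```python
import statistics

# Hypothetical data sets, made up for illustration
symmetrical = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]  # high outliers on the right

for name, data in [("symmetrical", symmetrical), ("right skewed", right_skewed)]:
    print(
        name,
        "| mode:", statistics.mode(data),
        "| median:", statistics.median(data),
        "| mean:", statistics.mean(data),
    )
```

In the symmetrical set all three come out to 3. In the right-skewed set the mode stays at 2, the median is 3.0, and the mean is pulled furthest, to 5.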
Variation
a difference between one experience and another.
Significance
above and beyond what would have happened 'by chance'.
Outlier
an extreme score: a data value that is much higher or lower than the rest. Outliers can really distort the mean.
Quartiles
are not based on the size of the values. They're based on the positions of the data points in the ordered list. They are actually the boundaries between quarters of the data; if a boundary falls between two numbers, add them together then divide by 2.
Descriptive & Inferential
are related to a frequency table
Normal
if you were to plot all the data points the graph would be symmetrical. The mean, median & mode would line up in the middle, about 68% of values would fall within + or - one standard deviation of the mean & about 95% would fall within + or - two standard deviations.
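The percentages for a normal distribution can be checked with `statistics.NormalDist` (Python 3.8 or later). The mean of 100 and standard deviation of 15 are made-up numbers for illustration; the shares come out the same for any normal distribution. The helper name `share_within` is also just illustrative.

```python
from statistics import NormalDist

# Hypothetical normal distribution, made up for illustration
mu, sigma = 100, 15
dist = NormalDist(mu=mu, sigma=sigma)

def share_within(k):
    """Share of values within k standard deviations of the mean."""
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

print("within 1 standard deviation: ", round(share_within(1), 2))  # 0.68
print("within 2 standard deviations:", round(share_within(2), 2))  # 0.95
```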
Median
the middle score; the value that 50% of the cases fall below and 50% fall above. It's ALWAYS the middle score once the scores are lined up in order (with an even number of scores, average the two middle ones). An outlier will only affect the median by moving it at most one position over.
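A minimal sketch of finding the median by hand, assuming made-up scores; the helper name `median` is illustrative. Adding one extreme score barely moves the result.

```python
def median(values):
    """Middle score of the ordered list; average the two middle scores if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:                      # odd count: one middle score
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical scores, made up for illustration
scores = [70, 75, 80, 85, 90]
with_outlier = scores + [400]      # one extreme score added

print(median(scores))        # 80
print(median(with_outlier))  # 82.5
```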