Ch. 3 and 4: Measures of Central Tendency and Variability; Standard Scores and the Normal Distribution

Calculate Sample Variance

Four Steps 1) Subtract the mean from each score in order to calculate deviation scores 2) Take each deviation score and square it 3) Add up all the squared deviation scores (Called sum of squares...Abbreviated SS) 4) Take the sum of the squared deviation scores and divide it by the number of cases minus 1 to find sample variance
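
The four steps can be sketched in Python. This is only an illustration: the function name `sample_variance` and the scores are made up for the example, not taken from the text.

```python
def sample_variance(scores):
    """Sample variance: deviations, squares, sum of squares, SS / (N - 1)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # step 1: deviation scores
    squared = [d ** 2 for d in deviations]    # step 2: square each one
    ss = sum(squared)                         # step 3: sum of squares (SS)
    return ss / (n - 1)                       # step 4: divide by N - 1


scores = [2, 4, 4, 6, 9]          # hypothetical scores, mean = 5
print(sample_variance(scores))    # SS = 28, so 28 / 4 = 7.0
```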

Central Tendency

Single value used to represent the typical score in a set of scores - Mean, median, mode

Sample Standard Deviation (s)

Take the square root of the sample variance

Standard Scores (z-scores)

- A raw score expressed in terms of how many standard deviations it falls away from the mean. AKA z-score.
Purpose:
- Transforms scores into a common unit of measurement (the standard deviation).
- Allows comparison between different scales or variables, e.g. "Carlos is more above average in his level of extroversion than he is in his level of intelligence." This comparison can be made because the scores have been standardized.
- Standard scores allow different kinds of measurements to be compared on a common scale, such as the juiciness of an orange vs. the crispiness of an apple.
Sign of a z score (Figure 4.1):
- A positive value means that the score falls above the mean.
- A negative value means that the score falls below the mean.
- A value of 0 means that the score falls right at the mean.
Standard deviation vs. z score:
- Standard deviation is essentially a reflection of the amount of variability within a given data set.
- The z score, by contrast, is the number of standard deviations a given data point lies from the mean.
- In investing, standard deviation and z score can be useful tools in determining market volatility. As the standard deviation increases, it indicates that price action varies widely within the established time frame. Given this information, the z score of a particular price indicates how typical or atypical this movement is based on previous performance.

For standard deviations that aren't whole numbers...

- Can't use 34%, 14%, or 2% properties - Use z Scores - Calculate z Score - Use z-table (page A-1 back of book) to get percentage under the curve
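
As a stand-in for the printed z-table, the area under the curve for a non-whole z score can be computed with Python's standard library. This is a sketch: the raw score, mean, and standard deviation below are hypothetical, and the helper names are made up for the example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)


def area_below(z):
    """Proportion of the normal curve that falls below z."""
    return STANDARD_NORMAL.cdf(z)


def area_mean_to_z(z):
    """Proportion of the curve between the mean (z = 0) and z."""
    return abs(STANDARD_NORMAL.cdf(z) - 0.5)


# Hypothetical case: raw score 110, mean 100, standard deviation 8
z = (110 - 100) / 8                      # 1.25, not a whole number
print(f"z = {z:.2f}")
print(f"below z:   {area_below(z):.2%}")
print(f"mean to z: {area_mean_to_z(z):.2%}")
```

Multiplying either proportion by 100 gives the percentage a z-table would list.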

Percentile Ranks

- Percentage of cases with scores at or below a given level in a frequency distribution - Can be computed from z scores - Express the same information but in a different format e.g. Joshua scored in the 75th percentile
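
A percentile rank can be computed either directly from a set of scores or from a z score. The sketch below shows both; the quiz scores and function names are invented for illustration, and the z-score version assumes the scores are normally distributed.

```python
from statistics import NormalDist


def percentile_rank_from_scores(scores, value):
    """Percentage of cases with scores at or below `value`."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)


def percentile_rank_from_z(z):
    """Percentile rank of a z score, assuming a normal distribution."""
    return 100 * NormalDist().cdf(z)


quiz_scores = [55, 60, 65, 70, 75, 80, 85, 90]        # hypothetical data
print(percentile_rank_from_scores(quiz_scores, 80))   # 6 of 8 cases -> 75.0
print(percentile_rank_from_z(0))                      # score at the mean
```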

Middle and Extreme Sections of the Normal Distribution

- Percentage of the normal distribution found around the midpoint, evenly divided into two chunks, one just above the mean and one just below it.
- z scores of ±0.84 mark off the middle 60%. The extreme 40% is two-tailed, split 20% in each tail.
- z scores of ±1.96 mark off the middle 95% of scores and also the extreme 5% of scores (2.5% in each tail). 1.96 appears in many chapters, so remember what it represents.
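
The cutoffs 0.84 and 1.96 can be recovered numerically. A minimal sketch, assuming a standard normal distribution; `middle_cutoff` is a made-up helper name.

```python
from statistics import NormalDist


def middle_cutoff(middle_proportion):
    """Positive z score such that +/- z marks off the given middle proportion."""
    if not 0 < middle_proportion < 1:
        raise ValueError("proportion must be between 0 and 1")
    tail = (1 - middle_proportion) / 2      # area left in each tail
    return NormalDist().inv_cdf(1 - tail)   # z with that much area above it


print(round(middle_cutoff(0.60), 2))   # 0.84
print(round(middle_cutoff(0.95), 2))   # 1.96
```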

Mode

- Score that occurs with the greatest frequency - Can be used for nominal, ordinal, interval, or ratio measurements - Example: sex of students in a psychology class. If the most common value is female, then female is the mode.
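
Because the mode only counts how often each value occurs, it works on nominal labels as well as numbers. A small sketch with an invented class roster; `modes` is a hypothetical helper that returns every value tied for the highest frequency.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the greatest frequency."""
    if not values:
        raise ValueError("need at least one value")
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


students = ["female", "male", "female", "female", "male", "female"]  # made up
print(modes(students))   # ['female']
```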

Normal Distribution

- Specific bell-shaped curve (AKA bell curve) that is defined by the percentage of cases that fall in specific areas under the curve.
- Symmetrical, so the midpoint is the mean, median, and mode.
- As scores move away from the midpoint, the frequency of their occurrence decreases symmetrically.
- About 68% of cases fall within 1 standard deviation of the mean.
- About 95% fall within 2 standard deviations (96% when using the rounded 34% and 14% figures).
- Nearly 100% (about 99.7%) fall within 3 standard deviations.
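
The percentages for 1, 2, and 3 standard deviations are rounded figures; the more precise values can be checked with the standard library. A sketch, with `proportion_within` as a made-up helper name.

```python
from statistics import NormalDist


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.2%}")
# within 1 SD: 68.27%
# within 2 SD: 95.45%
# within 3 SD: 99.73%
```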

Population Variance

- Variance is a great measure of variability because it uses information from every case in the data set. This is a big advantage over the range and IQR, which use only limited information.
- Population variance is the mean of the squared deviation scores.
- Abbreviated σ^2, pronounced "sigma squared." σ is the lowercase form of the Greek letter sigma (the uppercase form is Σ).
- The problem with samples is that they don't capture all the variability that a population would.
- To calculate the variance for a population, deviation scores are used. *Deviation scores serve as a measure of variability* (bigger deviation scores = more variability).
- How to summarize deviation scores? They can't simply be added up and divided by the number of cases (averaged), because the sum of a set of deviation scores is always zero.
- To get around this problem, square the deviation scores, which makes them all non-negative, and then find the average of the squared deviation scores. The result is the variance.
- Standard deviation is the square root of the variance and tells roughly the average distance by which raw scores deviate from the mean.
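
A short sketch of why the deviation scores are squared: the raw deviations cancel out to zero, while the squared ones do not. The scores and function names are hypothetical, and the data are treated as a complete population.

```python
import math


def deviation_scores(scores):
    """Distance of each score from the mean of the set."""
    mu = sum(scores) / len(scores)
    return [x - mu for x in scores]


def population_variance(scores):
    """Mean of the squared deviation scores."""
    squared = [d ** 2 for d in deviation_scores(scores)]
    return sum(squared) / len(scores)


scores = [2, 4, 4, 6, 9]                    # made-up population, mean = 5
print(sum(deviation_scores(scores)))        # deviations cancel: 0.0
print(population_variance(scores))          # 28 / 5 = 5.6
print(math.sqrt(population_variance(scores)))   # standard deviation
```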

Area A under a curve

-A positive z score falls above the midpoint, so the area below a positive z score also includes the 50% of cases that fall below the midpoint. -The area below a positive z score will always be greater than 50% as it includes both the 50% of cases that fall below the mean, and an additional percentage that falls above the mean.

Area B under a curve

-Can never be greater than 50% because it can't include more than half of the normal distribution. -Shows the symmetry between positive and negative z scores

Formula for Sample Variance (s^2)

- Population variance and SD are almost never known because it is rare that a researcher has access to all cases. Researchers study samples, but they want to draw conclusions about populations. To do so, they need to use population values in their equations.
- To make this possible, statisticians have developed a correction so that the sample values are better approximations of the population values.
- The *correction makes the sample variance and standard deviation a little bit larger*. This is because there is more variability in a population than in a sample.
- The correction is to divide the sum of squares by N - 1 (the degrees of freedom) instead of N, which controls bias.
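
The effect of the correction can be seen by running the population and sample formulas on the same scores. This sketch uses Python's `statistics` module, where `pvariance` and `pstdev` divide by N while `variance` and `stdev` divide by N - 1; the scores are invented.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [2, 4, 4, 6, 9]            # hypothetical sample, sum of squares = 28

uncorrected_var = pvariance(scores)   # SS / N       = 28 / 5
corrected_var = variance(scores)      # SS / (N - 1) = 28 / 4

print(uncorrected_var, corrected_var)    # corrected value is larger
print(pstdev(scores), stdev(scores))     # same pattern for the SD
```

The gap between the two shrinks as N grows, since dividing by N - 1 and by N become nearly the same.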

Sum of deviation scores is equal to

0

4 Measures of Variability

1. Range 2. Interquartile range 3. Variance 4. Standard deviation

Mean

Abbreviated M - Because the mean takes distance between cases into account, it can only be used with interval- or ratio-level numbers. The mean has no mathematical meaning for nominal or ordinal data (e.g., male vs. female).

Choosing a Measure of Central Tendency

Always choose the measure of central tendency that conveys the most info.

Formula for Population Variance

Four steps:
1. Create deviation scores for each case in the population by subtracting the population mean from each raw score.
2. Square each of these deviation scores.
3. Add up all the squared deviation scores.
4. Divide this sum by the number of cases in the population.
- Deviation score = (X - M): the distance a score falls from the mean. With data from a whole population, deviation scores are calculated by subtracting the population mean, μ (mu), from the raw scores.

Variability

How much variety (spread or dispersion) exists in a set of scores

How to report IQR

How to report:
- "The interquartile range for IQ in the sample is 90 to 120." or
- "The interquartile range for IQ in the sample is 30."
The IQR provides information about both variability and central tendency:
- Knowing the interval, 90 to 120, tells where the average scores fall.
- Knowing how wide the interval is tells how much variability is present. Wider intervals = larger variability.
The IQR is a single number, the distance between two points. However, like the range, the IQR is usually reported as two scores. It is a great descriptive statistic. When the IQR is reported as a single number, it gives useful comparative information about variability.
Example: SAT scores
- Upper-ranked schools have an IQR of 2,100 to 2,380, or 280.
- Lower-ranked schools have an IQR of 1,170 to 1,530, or 360.
- The upper-ranked schools' IQR (280) is narrower than the lower-ranked schools' IQR (360). This indicates more variability in SAT performance among the students at the lower-ranked schools. Professors at the lower-ranked schools can expect a wider range of intellectual abilities in their classrooms than would be found at an upper-ranked school.

Range

Measure of variability for interval- or ratio-level data - Distance from the lowest score to the highest score - Depends on only 2 scores in the data set - Influenced by outliers
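
Because the range depends on only the two most extreme scores, a single outlier can change it dramatically. A brief sketch using the baby-speech ages from this set's interquartile range example; the added value of 60 is hypothetical.

```python
def score_range(scores):
    """Distance from the lowest score to the highest score."""
    return max(scores) - min(scores)


ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]
print(score_range(ages))            # 26 - 8 = 18

with_outlier = ages + [60]          # one hypothetical extreme case
print(score_range(with_outlier))    # 60 - 8 = 52
```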

Percentiles

- Q1 = 25th percentile. Find the median of the first half of the list.
- Q2 = 50th percentile.
- Q3 = 75th percentile. Find the median of the second half of the list.
- The most common solution is to trim the bottom 25% of scores and the top 25% of scores. Then the range is calculated for the middle 50% of the scores.
- IQR: measure of variability for interval- or ratio-level data. It is the distance covered by the middle 50% of scores. IQR = Q3 - Q1
- Any point that falls outside the interval from Q1 - 1.5(IQR) to Q3 + 1.5(IQR) is an outlier.

Interquartile Range Example

Twelve babies spoke for the first time at the following ages (in months): 8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26. Find Q1, Q2, Q3, the range, and the IQR. The values of the min, Q1, Q2, Q3, and the max make up what is called the five-number summary.
1. Range = the difference between the maximum and the minimum values in the list = Hi - Lo = 26 - 8 = 18
2. Q2 is the median (middle of the list). Number each value; if there are 2 values in the middle, average those two. Q2 = (13 + 15) / 2 = 14
3. Q1: look at the first half of the list and find its middle. If there are 2 values, average them. Q1 = (10 + 11) / 2 = 10.5
4. Q3: look at the second half of the list and find its middle. If there are 2 values, average them. Q3 = (18 + 20) / 2 = 19
5. IQR = Q3 - Q1 = 19 - 10.5 = 8.5
6. Outliers: the IQR is used to determine which data are classified as outliers. Outliers can occur by chance or be measurement errors, so it is important to identify them. Any point that falls outside the interval from Q1 - 1.5(IQR) to Q3 + 1.5(IQR) is an outlier. Here, 10.5 - 1.5(8.5) = -2.25 and 19 + 1.5(8.5) = 31.75, so the interval is [-2.25, 31.75]. There are no outliers, as no values fall outside this interval.
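
The same worked example can be reproduced in code. This sketch follows the median-of-halves method used above; the function names are made up, and leaving the middle score out of both halves when the list has an odd length is an assumption, since the example only covers an even-length list. Library quantile functions may use other conventions and give slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # Assumption: with an odd count, the middle score belongs to neither half.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return min(data), median(lower), median(data), median(upper), max(data)


def outlier_fences(q1, q3):
    """Interval from Q1 - 1.5(IQR) to Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]
low, q1, q2, q3, high = five_number_summary(ages)
fence_low, fence_high = outlier_fences(q1, q3)
outliers = [a for a in ages if a < fence_low or a > fence_high]

print(q1, q2, q3)                # 10.5 14.0 19.0
print(high - low, q3 - q1)       # range 18, IQR 8.5
print(fence_low, fence_high)     # -2.25 31.75
print(outliers)                  # []
```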

Standard Scores Calculation

Two Steps 1) Subtract the mean from the raw score to get the deviation score 2) Divide the deviation score by the standard deviation and round to 2 decimal places - Turning a raw score into a standard score standardizes it, so it can be compared with scores measured in other units
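
The two steps translate directly into code. In this sketch the function name and all of the numbers (scores, means, standard deviations) are hypothetical; they only illustrate how two different scales become comparable once standardized.

```python
def z_score(raw, mean, sd):
    """Standard score: deviation from the mean in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    deviation = raw - mean             # step 1: deviation score
    return round(deviation / sd, 2)    # step 2: divide by SD, round to 2 places


# Hypothetical scores on two scales with different units
extroversion_z = z_score(30, mean=20, sd=5)       # 2.0
intelligence_z = z_score(115, mean=100, sd=15)    # 1.0

print(extroversion_z, intelligence_z)
print(extroversion_z > intelligence_z)   # further above average in extroversion
```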

Calculating Raw Score from z score

X = M + (z × s), where X is the raw score, M is the mean, and s is the standard deviation
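
Converting back from a z score is the reverse of standardizing. A minimal sketch; the function name and the mean of 100 with a standard deviation of 15 are assumptions chosen for illustration.

```python
def raw_from_z(z, mean, sd):
    """Raw score recovered from a z score: X = M + (z * s)."""
    return mean + z * sd


print(raw_from_z(1.0, mean=100, sd=15))    # 115.0
print(raw_from_z(-0.5, mean=100, sd=15))   # 92.5
print(raw_from_z(0, mean=100, sd=15))      # 100, right at the mean
```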

Deviation Score

X - M = deviation score - Measure of how far a score falls from the mean - Raw scores above the mean have positive deviation scores - Raw scores below the mean have negative deviation scores - Scores exactly at the mean have deviation scores of zero - The further the raw score is from the mean, the bigger the deviation score (in absolute value)

Median

Abbreviated Mdn - The median is focused on direction (more/less) and ignores information about distance between scores. Because of this, the median can be used with ordinal data (unlike the mean). - Less influenced by outliers than the mean. - Find it by putting the scores in order, computing (N+1)/2, and taking the score at that position; if the position falls between two scores, average them.
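
The (N+1)/2 rule gives the position of the median in the ordered list, not the median itself. A sketch of that lookup; the function name is made up, and the example reuses the baby-speech ages from this set plus one invented list with an outlier.

```python
def median_by_position(scores):
    """Median found by locating position (N + 1) / 2 in the ordered scores."""
    data = sorted(scores)
    position = (len(data) + 1) / 2        # 1-based position in the list
    if position.is_integer():
        return data[int(position) - 1]    # odd N: a single middle score
    below = data[int(position) - 1]       # even N: the two scores around
    above = data[int(position)]           # the position are averaged
    return (below + above) / 2


ages = [8, 9, 10, 11, 12, 13, 15, 15, 18, 20, 20, 26]
print(median_by_position(ages))               # position 6.5 -> (13 + 15) / 2 = 14.0
print(median_by_position([1, 2, 3, 4, 100]))  # position 3 -> 3, outlier ignored
```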

