Descriptive Statistics (Continued)

Considering the area under the normal curve in terms of probabilities is

very useful for researchers.

Many kinds of standard scores exist other than

z and T scores.

Box plots

(sometimes called box-and-whisker diagrams) graphically display the five-number summary of a distribution.

Outliers

scores or measurements that differ by such large amounts from those of other individuals in a group that they must be given careful consideration as special cases.

The normal curve is

symmetrical and bell-shaped.

The median is the most appropriate average to calculate when

the data result in a skewed distribution.

scatter plots illustrate

the degree of relationship between variables.

The five-number summary consists of the lowest score, Q1, the median, Q3, and the highest score. The interquartile range (IQR) is

the difference between the third and first quartiles (Q3 - Q1 = IQR).
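
As a rough illustration of the five-number summary and the IQR, the sketch below uses Python's standard `statistics` module. The function name `five_number_summary` is hypothetical, and the "inclusive" quartile method is an assumption: textbooks and software use several quartile conventions, so Q1 and Q3 may differ slightly elsewhere. The scores are distribution B from the comparison of two sets of scores in this set.

```python
import statistics

def five_number_summary(scores):
    """Return (lowest, Q1, median, Q3, highest) for a list of numbers."""
    ordered = sorted(scores)
    # Assumption: the "inclusive" method; other quartile conventions
    # can give slightly different values for Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

scores = [2, 3, 25, 30, 75]
low, q1, med, q3, high = five_number_summary(scores)
iqr = q3 - q1  # interquartile range: Q3 - Q1

print(low, q1, med, q3, high)  # 2 3.0 25.0 30.0 75
print(iqr)                     # 27.0
```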

In the normal distribution, the mean, median, and mode are identical so the mean falls at

the exact center of the curve.

In all, 99.7% of the observations fall within

three standard deviations of the mean.

Five-number summary

useful way to describe a skewed distribution.

What would the z score be for a raw score of 52 in a distribution with a mean raw score of 50 and a standard deviation of 2?

z = +1

The mode

does not present very much useful information and is not often used in educational research.

The median, in short, is the

midpoint.

Negative

relationship indicated when high scores on one variable are accompanied by low scores on the other variable or when low scores on one are accompanied by high scores on the other variable.

When a distribution is skewed, the shape of the distribution and the variability can be described by

reporting several percentiles.

In a normal distribution,

the large majority of the scores are concentrated in the middle and the scores decrease in frequency the further away they are.

Almost all the scores in a normal distribution lie between

the mean and plus or minus three standard deviations.

The standard deviation and its "brother",

the variance, both measure the spread of scores from the mean.

Median

the point below and above which 50 percent of the scores in a distribution fall.

If a distribution is normal and we know the mean and the standard deviation of the distribution,

we can determine the percentage of scores that lie above or below any given score.
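
A minimal sketch of this idea, using `NormalDist` from Python's standard `statistics` module. The helper names `percent_below` and `percent_above` are hypothetical, and the mean of 100 and standard deviation of 15 are borrowed from the worked example elsewhere in this set. The result only holds if the scores really are normally distributed.

```python
from statistics import NormalDist

def percent_below(score, mean, sd):
    """Percentage of a normal distribution that falls below `score`."""
    return 100 * NormalDist(mu=mean, sigma=sd).cdf(score)

def percent_above(score, mean, sd):
    """Percentage of a normal distribution that falls above `score`."""
    return 100 - percent_below(score, mean, sd)

# A score of 115 is one standard deviation above a mean of 100 (SD = 15).
print(round(percent_below(115, 100, 15), 1))  # 84.1
print(round(percent_above(115, 100, 15), 1))  # 15.9
```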

Example of a percentile:

You receive a score of 630 on the GRE, which places you at the 84th percentile, meaning that 84% of those who took the test scored lower than you did.

For example: we have previously shown that approximately 34 percent of the area in a normal distribution lies between the mean and 1 SD above it. Because 50 percent of the scores fall above the mean, roughly 16 percent of the scores lie more than 1 SD above the mean (50 - 34 = 16).

If we express 16 percent as a decimal and interpret it as a probability, we can say that the probability of randomly selecting an individual from the population who has a score 1 SD or more above the mean is .16.

Here we do not just connect a series of dots that represent the actual frequencies of the observed distribution.

Instead, we show a generalized distribution of scores that is not limited to one specific set of data.

Look at the following scores: A: 19, 20, 25, 32, 39 B: 2, 3, 25, 30, 75

The mean in both is 27. The median in both is 25.
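
The two sets of scores can be checked with a short sketch using Python's standard `statistics` module. It also prints the range and the standard deviation to show how differently the two sets are spread out. Using `pstdev` (which treats each set as a whole population rather than a sample) is an assumption made here for illustration.

```python
import statistics

a = [19, 20, 25, 32, 39]
b = [2, 3, 25, 30, 75]

for name, scores in (("A", a), ("B", b)):
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    score_range = max(scores) - min(scores)
    # Assumption: population standard deviation (divide by N).
    sd = statistics.pstdev(scores)
    print(name, mean, median, score_range, round(sd, 1))
```

Both sets share a mean of 27 and a median of 25, but B has a much larger range and standard deviation than A.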

The total area under the normal curve represents

all the scores in a normal distribution.

Probability

another important characteristic of the normal distribution is that the percentages associated with the areas under the curve can be thought of as probabilities.

The distribution of some human traits (e.g., height, weight)

approximates this normal curve.

Many other human traits, such as spatial ability, manual dexterity and creativity

are often assumed to do so.

Correlation coefficients

designated by the symbol r, they express the degree of relationship that exists between two sets of scores.

Researchers need some way to

determine whether a relationship exists in data.

Box plots are especially useful for

displaying two or more distributions.

Correlation coefficients range

from -1.00 to +1.00. The closer we get to either of these extremes, the stronger the relationship between the two variables. The closer we get to .00, the weaker the relationship.
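
To make the -1.00 to +1.00 range concrete, here is a minimal sketch that computes a Pearson r by hand. The function name `pearson_r` and the sample scores are hypothetical, and the code assumes neither variable is constant (otherwise the denominator would be zero).

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cross / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4]          # hypothetical scores on one variable
high_with_high = [2, 4, 6, 8]  # rises as hours rise
high_with_low = [8, 6, 4, 2]   # falls as hours rise

print(pearson_r(hours, high_with_high))  # 1.0
print(pearson_r(hours, high_with_low))   # -1.0
```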

If a set of scores is normally distributed, we can interpret any score if we know

how far, in standard deviation units, it is from the mean.

Example of probability:

If an event will occur 25 percent of the time, it is said to have a probability of .25.

An important point about the standard deviation is that

in a normal distribution of scores, the mean plus or minus three standard deviations will encompass more than 99 percent (about 99.7 percent) of all the scores in the distribution.

Pearson product-moment correlation

the most commonly used correlation coefficient, represented by a lowercase r.

Mode

most frequent score in a distribution. Score attained by more students than any other score.

Standard Deviation

most useful estimate of variability. It is a single number that represents the spread of a distribution.

To convert a z score to a T score:

multiply the z score by 10 and add 50.

Raw scores below the mean in a distribution convert to

negative z scores, which can be awkward to work with.

This is crucial to converting z scores to

percentages and probabilities.

scatter plot

pictorial representation of the relationship between two quantitative variables.

In a distribution that contains an odd number of scores, the median is

the middlemost score (provided the scores are listed in order).

Doing this is based on the assumption that

the trait being scored does distribute according to the normal curve.

Measures of central tendency are useful for summarizing scores in a distribution; but

they are not sufficient.

When actual data do not approximate the curve,

they can be changed to do so.

Because the curve is symmetrical,

50 percent of the scores fall on each side of the mean.

The median is the

50th Percentile

In a distribution that contains an even number of scores, the median is the point halfway between the two middlemost scores. For example, in the distribution 70, 74, 82, 86, 88, 90, the median is

84
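
A quick check of the median for both cases, using Python's standard `statistics.median`. The even-count scores come from the example above; the odd-count list is a hypothetical variant with the last score dropped.

```python
import statistics

even_scores = [70, 74, 82, 86, 88, 90]  # six scores: average the middle two
odd_scores = [70, 74, 82, 86, 88]       # hypothetical: five scores

print(statistics.median(even_scores))  # 84.0, halfway between 82 and 86
print(statistics.median(odd_scores))   # 82, the middlemost score
```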

The distributions, however, differ considerably in what statisticians call variability, so

averages can be misleading.

The mode, median, and mean

Each of these represents a type of average or typical score attained by a group of individuals on some measure.

We can also determine how an individual's score compares to all the other scores in a normal distribution.

For example: if a person's score lies exactly one standard deviation above the mean, we know that approximately 84 percent of all the other scores in the distribution fall below the individual's score (the 50 percent below the mean plus the 34 percent between the mean and one standard deviation above it).

Some ways to summarize categorical data:

Frequency Table

In any normal distribution, 68 percent of the scores fall within one standard deviation of the mean.

Half of these, 34 percent, will fall within one standard deviation above the mean and the other half (34 percent) will fall within one standard deviation below the mean.

There needs to be a way to measure the

Spread or variability that exists within a distribution.

Percentile in a set of numbers is a value such that

a certain percentage of the numbers fall below and the rest of the numbers fall above it.

A more common way to describe a numerical distribution is

a combination of the mean (a measure of center) and the standard deviation (a measure of spread).

T scores have

a mean of 50 and a standard deviation of 10.

measures of central tendency (averages) -

enable researchers to communicate scores in a frequency distribution with a single number.

Standard deviation: as with the mean,

every score in the distribution is used in its calculation.

All of the percentages associated with areas under a normal curve can be

expressed in decimal form and viewed as probability statements.

Boxplots illustrate another way that

graphs can effectively convey information

Z scores, example: not all z scores fall exactly at one or two (etc.) standard deviations from the mean. We can use the following formula to calculate these kinds of z scores:

z score = (raw score - mean) / standard deviation
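
The calculation, that is, the raw score minus the mean, divided by the standard deviation, can be sketched as a small Python function. The name `z_score` is hypothetical. The first call reuses the example from this set (raw score 52, mean 50, SD 2); the second shows a score that does not land on a whole number of standard deviations.

```python
def z_score(raw_score, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / sd

print(z_score(52, 50, 2))  # 1.0
print(z_score(53, 50, 2))  # 1.5
print(z_score(47, 50, 2))  # -1.5, a raw score below the mean
```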

Another 27 percent of the observed scores fall between one and two standard deviations from the mean.

Hence 95 percent (68 plus 27 percent) fall within two standard deviations of the mean.
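
The rounded percentages quoted in these cards can be compared with the exact areas under a standard normal curve. This is a sketch using `NormalDist` from Python's standard `statistics` module; the helper name `percent_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def percent_within(k):
    """Percentage of a normal distribution within k SDs of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))

for k in (1, 2, 3):
    print(k, round(percent_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```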

Two very different distributions might have the same median. For example: 98, 90, 84, 82, 76 and 90, 87, 84, 64, 41

Here the median of both is 84.

example of Range

subtract 11 (lowest score) from 89 (highest score) to get a range of 78.

Percentages Under the Normal Curve

This is one of the most useful characteristics of the normal distribution.

A Probability is

a percent stated in decimal form and refers to the likelihood of an event occurring.

The normal curve is based on

a precise mathematical equation

The Range gives

a quick but rough estimate of variability.

Mean

add up all the scores in a distribution and divide this sum by the total number of scores

One way to avoid negative z scores is to

convert them to T scores.

Researchers often draw a smooth curve instead of

a series of straight lines in a frequency polygon

A raw score that is exactly one standard deviation above the mean represents

a z score of +1.

A raw score that is exactly at the mean represents

a z score of zero.

Standard Scores

derived score that uses a common scale to indicate how an individual compares to other individuals in a group.

T scores, for example:

Suppose z = -2. To convert to a T score, multiply -2 by 10 to get -20, then add 50 to get 30. T score = 30.
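
The conversion rule (multiply the z score by 10 and add 50) can be sketched as a one-line Python function. The name `t_score` is hypothetical.

```python
def t_score(z):
    """Convert a z score to a T score (mean 50, standard deviation 10)."""
    return 10 * z + 50

print(t_score(-2))  # 30
print(t_score(0))   # 50, a raw score exactly at the mean
print(t_score(1))   # 60
```

Because T scores are centered at 50, typical below-average scores stay positive, which is why the conversion avoids negative values.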

The mode, median, and mean: which is best?

It depends. The mean is the only one of the three that uses all the information in a distribution, and it is generally preferred over the other two. However, the mean is unduly influenced by extreme scores. In these cases the median gives a more accurate indication of the typical score in a distribution.

For example: if the mean of a normal distribution is 100 and the standard deviation is 15, what scores lie one standard deviation above the mean and one standard deviation below the mean?

One standard deviation above the mean: 115. One standard deviation below the mean: 85.

Scores in a distribution might have identical means and medians but

be quite different in other ways.

Range

distance between the highest and lowest scores in a distribution.

The smooth curve is known as

a distribution curve.

bimodal distribution -

distribution with two modes
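
A small sketch of finding the modes of a set of scores with Python's standard `statistics.multimode`, which returns every value tied for most frequent. The scores are hypothetical.

```python
import statistics

scores = [70, 75, 75, 80, 85, 90, 90, 95]  # hypothetical scores

modes = statistics.multimode(scores)
print(modes)            # [75, 90]
print(len(modes) == 2)  # True: this set of scores is bimodal
```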

Normal distribution

a distribution of data that tends to follow a certain specific shape.

Researchers use such probability statements to

precisely state the probability of an observed score relative to other scores in a normal distribution.

Z scores permit comparison of

raw scores on different tests.

Positive

relationship indicated when high scores on one variable are accompanied by high scores on the other variable or when low scores on one are accompanied by low scores on the other variable.

Most randomly selected samples will have scores that

resemble the normal distribution.

Frequency Table

shows the frequency with which each type of category is mentioned, for example, on a questionnaire (e.g., a table titled "Frequency and Percentage of Responses to Questionnaire").
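
As an illustration, a frequency table with counts and percentages can be built with `collections.Counter` from Python's standard library. The questionnaire responses below are made up for the example.

```python
from collections import Counter

# Hypothetical questionnaire responses (categorical data).
responses = ["agree", "agree", "disagree", "neutral",
             "agree", "disagree", "agree", "neutral"]

counts = Counter(responses)
total = sum(counts.values())

# One row per category: (category, frequency, percentage of all responses).
table = [(category, n, 100 * n / total)
         for category, n in counts.most_common()]

for category, n, percent in table:
    print(f"{category:<10}{n:>3}{percent:>8.1f}%")
```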

Z Scores

the simplest of the standard scores; a z score represents how far a raw score is from the mean in standard deviation units.

T Scores

simply z scores expressed in a different form.

Other important percentiles are

the 25th percentile, also known as the first quartile (Q1) and the 75th percentile, the third quartile (Q3).

The three most common measures of central tendency (averages) are

the mode, median, and mean.

