Stats Test 2

Percentile of data value =

(# of values less than this data value / total # of values in data set) × 100%
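
The percentile formula above can be sketched in a few lines of Python. The function name `percentile_of` and the list of scores are made up for illustration.

```python
def percentile_of(value, data):
    """Percentile = (# of values less than `value`) / (total # of values) * 100."""
    below = sum(1 for x in data if x < value)
    return below / len(data) * 100

# Hypothetical exam scores, invented for this example
scores = [55, 62, 70, 70, 81, 88, 93, 97, 99, 100]

# 5 of the 10 scores are below 88
print(percentile_of(88, scores))  # 50.0
```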

s² = variance

(standard deviation)²

conditions for a normal distribution

- most data values are clustered near the mean, giving the distribution a well-defined single peak.
- data values are spread evenly around the mean, making the distribution symmetric.
- larger deviations from the mean become increasingly rare, producing the tapering tails of the distribution.
- individual data values result from a combination of many different factors, such as genetic and environmental factors.

68-95-99.7 Rule

- About 68% (more precisely, 68.3%), or just over 2/3, of the data points fall within 1 standard deviation of the mean.
- About 95% (more precisely, 95.4%) of the data points fall within 2 standard deviations of the mean.
- About 99.7% of the data points fall within 3 standard deviations of the mean.
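
The three percentages in the rule can be checked numerically. This is a minimal sketch that assumes an exactly normal distribution, for which the fraction within k standard deviations of the mean is erf(k/√2). The helper name `fraction_within` is made up for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```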

Normal Distribution

- distribution is single peaked
- distribution is symmetric around the single peak
- distribution is spread out in a way that makes it resemble the shape of a bell
- peak corresponds to the mean, median, and mode of the distribution
- variation can be characterized by the standard deviation of the distribution

When does the Central Limit Theorem apply?

1. applies for suitably large sample sizes. A common threshold is n > 30.
2. applies to variables with any distribution (not necessarily a normal distribution).

Central Limit Theorem

1. The distribution of means will be approximately a normal distribution for large sample sizes.
2. The mean of the distribution of means approaches the population mean, µ, for large sample sizes.
3. The standard deviation of the distribution of means approaches σ/√n for large sample sizes, where σ is the standard deviation of the population.
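
A small simulation can make these three statements concrete. The sketch below assumes a deliberately non-normal population (exponential, with mean 1 and standard deviation 1) and a sample size of n = 36. The function name, seed, and number of trials are arbitrary choices for illustration.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Draw `trials` random samples of size n and return the mean of each one."""
    return [
        statistics.mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 36
means = sample_means(n, 2000, rng)

mu = 1.0      # population mean of an exponential distribution with rate 1
sigma = 1.0   # population standard deviation of the same distribution

print("mean of sample means:   ", round(statistics.mean(means), 3), "vs mu =", mu)
print("std dev of sample means:", round(statistics.stdev(means), 3),
      "vs sigma/sqrt(n) =", round(sigma / n ** 0.5, 3))
```

Plotting a histogram of `means` should show a roughly bell-shaped curve even though the population itself is strongly right-skewed.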

unimodal

A distribution with one mode

trimodal

A distribution with three modes

bimodal

A distribution with two modes

Normal distribution

A specific category of distributions that are symmetric and bell shaped with a single peak. The peak corresponds to the mean, median, and mode of such a distribution.

Boxplot

Draw a number line that spans all the values in the data set. Enclose the values from the lower to the upper quartile in a box. Draw a line through the box at the median. Add "whiskers" extending to the low and high values.
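
The five numbers a boxplot is drawn from (low value, lower quartile, median, upper quartile, high value) can be computed as in the sketch below. It uses the quartile rule from this set's quartile cards: take the median of each half, excluding the middle value when the number of data points is odd. The function name and data are made up for illustration, and statistics packages often use slightly different quartile rules.

```python
import statistics

def five_number_summary(data):
    """Return (low, Q1, median, Q3, high) for a list of numbers."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Skip the middle value when n is odd
    upper = values[half + 1:] if n % 2 else values[half:]
    return (
        values[0],
        statistics.median(lower),
        statistics.median(values),
        statistics.median(upper),
        values[-1],
    )

print(five_number_summary([1, 3, 4, 6, 7, 9, 12]))  # (1, 3, 6, 9, 12)
```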

In a recent year, the 952 players in a certain sports league had salaries with the characteristics below. The mean was $4,596,061. The median was $1,525,000. The salaries ranged from a low of $508,000 to a high of $30,000,000. a. Describe the shape of the distribution of salaries. Is the distribution symmetric? Is it left-skewed? Is it right-skewed?

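It is right-skewed: the mean ($4,596,061) is far above the median ($1,525,000), pulled up by a small number of very high salaries.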

What does the area under the normal distribution curve represent? What is the total area under the normal distribution curve?

The area that lies under the normal distribution curve corresponding to a range of values on the horizontal axis is the total relative frequency of those values. Because the total relative frequency for all values must be 1 (100%), the total area under the normal distribution curve must equal 1 (100%)

Suppose that many random samples of size n for a variable are taken and the distribution of means of each sample is recorded. What is true regarding the Central Limit Theorem?

- The standard deviation of the distribution of means approaches σ/√n, where σ is the standard deviation of the population.
- The distribution of means will be approximately a normal distribution.
- The mean of the distribution of means approaches the population mean, µ.

Does this make sense? The distribution of grades was left skewed, but the mean, median, and mode were all the same.

This does not make sense because the mean and median should lie somewhere to the left of the mode if the distribution is left skewed.

My professor graded the final score on a curve, and she gave a grade of A to anyone who had a standard score of 2 or more..... Does this make sense? Why?

This makes sense because a standard score of 2 or more corresponds to roughly the 97th percentile and above. Only the top 2-3% of scores earn an A, so the grade is reserved for the very highest scores.
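
The percentile that goes with a standard score can be computed directly, assuming the scores follow a normal distribution. The function name `percentile_from_z` is made up for illustration.

```python
import math

def percentile_from_z(z):
    """Percent of a normal distribution lying below standard score z."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(percentile_from_z(2), 1))  # 97.7
print(round(percentile_from_z(0), 1))  # 50.0
```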

weighted mean

Accounts for variations in the relative importance of data values. Each data value is assigned a weight, and weighted mean = sum of (each data value × its weight) / sum of all weights
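
As a worked example of the weighted mean formula, suppose a course grade is built from three scores weighted 20%, 30%, and 50%. The scores and weights below are invented for illustration.

```python
def weighted_mean(values, weights):
    """Sum of (each value x its weight) divided by the sum of all weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

scores = [90, 80, 70]    # homework, midterm, final (hypothetical)
weights = [20, 30, 50]   # percent of the course grade for each

# (90*20 + 80*30 + 70*50) / 100 = 7700 / 100
print(weighted_mean(scores, weights))  # 77.0
```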

upper quartile (third quartile) or Q₃

divides the lowest 3/4 of a data set from the upper 1/4. it is the median of the data values in the upper half of a data set. (exclude the middle value in the data set if the # of data points is odd.)

All data values in a uniform distribution have the same frequency, whereas a distribution with one or more modes

has one or more values that occur most frequently.

outlier

in a data set is a value that is much higher or lower than almost all others

mean

mean = sum of values/total # of values

range

of a set of data values is the difference between its highest and lowest data values range = highest value (max) - lowest value (min)

An ______ in a data set is a value that is much higher or lower than almost all other values. An _______ can change the mean of a data set but does not affect the median or mode.

outlier, outlier

The standard deviation is approximately related to the range of a distribution by the

range rule of thumb

range rule of thumb

standard deviation(s) ≈ range / 4

(S) Standard deviation =

s = √[ sum of (deviations from the mean)² / (total # of data values − 1) ]: sum the squared deviations, divide by one less than the number of data values, then take the square root.
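
The steps of the standard deviation formula can be followed in code. This sketch also prints the range rule of thumb estimate (range / 4) for comparison. The data values and the function name `sample_std` are made up for illustration.

```python
import math
import statistics

def sample_std(data):
    """Square root of: sum of squared deviations from the mean / (n - 1)."""
    mean = sum(data) / len(data)
    squared_deviations = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / (len(data) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set

s = sample_std(data)
rule_of_thumb = (max(data) - min(data)) / 4

print("by formula:         ", round(s, 3))
print("statistics.stdev:   ", round(statistics.stdev(data), 3))
print("range rule of thumb:", rule_of_thumb)
```

The rule of thumb gives only a rough estimate, so it will usually differ somewhat from the value computed by the formula.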

relative frequency

the area that lies under the normal distribution curve corresponding to a range of values on the horizontal axis - because the total relative frequency must be 1, the total area under the normal distribution curve must equal 1, or 100%

lower quartile (first quartile) or Q₁

the median of the data values in the lower half of a data set. (exclude the middle value in the data set if the # of data points is odd) divides the lowest 1/4 of a data set from the upper 3/4.

median

the middle value in the sorted data set (or halfway between the two middle values if the # of values is even)

mode

the most common value (or group of values) in a data set.
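
Mean, median, and mode can all be computed with Python's standard `statistics` module. The sketch below also adds one extreme value to a made-up data set to show how each measure reacts to it.

```python
import statistics

data = [1, 2, 2, 2, 3, 4]        # hypothetical data set
with_outlier = data + [100]      # same data plus one extreme value

for label, values in (("original", data), ("with outlier", with_outlier)):
    print(
        label,
        "mean =", round(statistics.mean(values), 2),
        "median =", statistics.median(values),
        "mode =", statistics.mode(values),
    )
```

In this example the extreme value pulls the mean up sharply, while the median and mode stay at 2.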

standard score

the number of standard deviations a data value lies above or below the mean

middle quartile (second quartile) or Q₂

the overall median

All data values in a ______ ______ have the same frequency, whereas a distribution with one or more modes has one or more values that occur most frequently.

uniform distribution

unusual values

values that are more than 2 standard deviations from the mean.

Simpson's paradox

when a set of data gives different results for each of several group comparisons than it does when the groups are taken together, this phenomenon is known as _____ _______.
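
A small numerical example may help show how this can happen. In the sketch below, option A has the higher success rate in each of two groups, yet option B has the higher rate when the groups are combined, because the two options have very different numbers of attempts in each group. All counts are invented for illustration.

```python
def rate(successes, attempts):
    return successes / attempts

# (successes, attempts) in each group; every count here is made up
a = {"group 1": (9, 10), "group 2": (30, 100)}
b = {"group 1": (80, 100), "group 2": (2, 10)}

def overall(results):
    """Success rate with all groups pooled together."""
    successes = sum(s for s, _ in results.values())
    attempts = sum(n for _, n in results.values())
    return rate(successes, attempts)

for group in a:
    print(group, "A:", rate(*a[group]), "B:", rate(*b[group]))

print("combined", "A:", round(overall(a), 3), "B:", round(overall(b), 3))
```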

z=standard score

z = (data value − mean) / standard deviation

Computing Standard Scores

z = standard score = (data value − mean) / standard deviation
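
The standard score calculation can be sketched as follows, together with the "more than 2 standard deviations from the mean" test for unusual values. The function names and the numbers (mean 70, standard deviation 10) are made up for illustration.

```python
def standard_score(value, mean, std_dev):
    """z = (data value - mean) / standard deviation."""
    return (value - mean) / std_dev

def is_unusual(value, mean, std_dev):
    """True when the value is more than 2 standard deviations from the mean."""
    return abs(standard_score(value, mean, std_dev)) > 2

print(standard_score(85, 70, 10))  # 1.5
print(is_unusual(85, 70, 10))      # False
print(is_unusual(45, 70, 10))      # True (z = -2.5)
```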

