AP Stats: Chapter 2
34%, 13.5%, 2.35%, 0.15%
% of observations in each band on one side of the mean of a Normal distribution curve (0 to 1, 1 to 2, 2 to 3, and beyond 3 standard deviations)
density curve
-An idealized description of the overall pattern of a distribution that smooths out the irregularities in the actual data -Always remains above the horizontal axis -Has total area 1 underneath it -An area under a density curve gives the proportion of observations that fall in a range of values -The curve is an approximation that is easy to use and accurate enough for practical use
example of complete answer for percentiles
-The Colorado Rockies, who won 92 games, are at the 83rd percentile, since 25 of the 30 teams had the same number of wins or fewer.
Normal distribution
-described by a Normal density curve, which is symmetric, single-peaked, and bell-shaped -described by giving its mean & standard deviation -abbreviated as N (µ,σ)
right
Normal probability plot: In a right-skewed distribution, the largest observations fall distinctly to the ___ of a line drawn through the main body of points
percentile
the pth percentile of a distribution is the value with p percent of the observations less than or equal to it -"percent correct" does not measure the same thing as "percentile" -should be whole numbers; if necessary, round to nearest integer
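The definition above can be sketched as a short calculation: count the observations at or below a value, divide by the total, and round to a whole number. The function name `percentile_of` and the quiz scores are hypothetical, chosen only for illustration.

```python
import math

def percentile_of(value, data):
    """Percent of observations less than or equal to value, as a whole number."""
    at_or_below = sum(1 for x in data if x <= value)
    percent = 100 * at_or_below / len(data)
    # Round half up to the nearest integer (percentiles are reported as whole numbers).
    return math.floor(percent + 0.5)

# Hypothetical quiz scores, for illustration only.
scores = [3, 5, 5, 6, 7, 7, 8, 8, 9, 10]
print(percentile_of(8, scores))  # 8 of the 10 scores are <= 8
```

The same arithmetic gives the Rockies example: 25 of 30 teams at or below is 83.3%, reported as the 83rd percentile.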
(smallest value of the next class, cumulative relative frequency) (ex. (45, 4.5) = age 45 is the 4.5th percentile of the inauguration age distribution)
values plotted on a cumulative relative frequency graph
left
Normal probability plot: In a left-skewed distribution, the smallest observations fall distinctly to the ___ of a line drawn through the main body of points
standardizing
Converting observations from original values to standard deviation units (z-scores)
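A minimal sketch of standardizing, assuming the usual formula z = (x - mean) / standard deviation. The function name `z_score` and the test-score numbers are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical class: mean score 80, standard deviation 6.
print(z_score(86, 80, 6))  # one standard deviation above the mean
print(z_score(74, 80, 6))  # one standard deviation below the mean
```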
add the counts in the frequency column for the current class and all classes with smaller values of the variable (2, 9, 22, 34, 41, 44)
Ex: Frequency table that summarizes the ages of the first 44 US presidents when they were inaugurated: -Age 40-44: 2 (frequency) -45-49: 7 -50-54: 13 -55-59: 12 -60-64: 7 -65-69: 3 How to find the cumulative frequency:
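The running total described in this card can be reproduced in a few lines. This is only a sketch; the variable names are illustrative, and the counts come from the frequency table above.

```python
from itertools import accumulate

# Frequencies for the age classes 40-44, 45-49, 50-54, 55-59, 60-64, 65-69.
frequencies = [2, 7, 13, 12, 7, 3]

# Each entry is the count for the current class plus all smaller classes.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [2, 9, 22, 34, 41, 44]
```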
mean < median (mean is on the left of median)
For a left skewed density curve, what is the relationship between mean & median?
mean > median (mean is on the right of the median; the long right tail pulls the mean to the right)
For a right skewed density curve, what is the relationship between mean & median?
mean = median
For a symmetric density curve, what is the relationship between mean & median?
-Plot the data (dotplot, stemplot, or histogram); if the graph is approximately symmetric and bell-shaped -> most likely Normal; if the data are clearly skewed, have multiple peaks, or aren't bell-shaped, there's evidence that the distribution is not Normal -Check whether the data follow the 68-95-99.7 rule (find the mean & standard deviation; compute mean +/- 1, 2, and 3 std. dev.; for each interval, count of observations inside/total number = %) -If the percents are close to 68%, 95%, and 99.7% -> Normal distribution -Can use a Normal probability plot (linear shape = Normal); clear curvature in the graph shows that the data do not follow a Normal distribution
How to determine if the data is close to Normal
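The 68-95-99.7 check above can be sketched as follows. The function name `fraction_within` and the data values are hypothetical, and the sample standard deviation (`statistics.stdev`) is an assumption about which standard deviation is intended.

```python
from statistics import mean, stdev

def fraction_within(data, k):
    """Fraction of observations within k standard deviations of the mean."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation (assumed)
    inside = sum(1 for x in data if m - k * s <= x <= m + k * s)
    return inside / len(data)

# Hypothetical measurements, for illustration only.
data = [61, 64, 66, 67, 68, 68, 69, 70, 70, 71, 72, 73, 74, 76, 79]
for k in (1, 2, 3):
    print(k, round(100 * fraction_within(data, k), 1), "%")
```

Compare the three printed percents with 68%, 95%, and 99.7%; large gaps are evidence against Normality.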
divide the entries in the cumulative frequency column by 44 (total # of presidents)
How to find the cumulative relative frequency:
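As a sketch of this step, each cumulative frequency is divided by the total of 44 and shown as a percent. Pairing each result with the smallest value of the next class gives the points plotted on the graph. Variable names are illustrative.

```python
from itertools import accumulate

# Frequencies for the age classes 40-44, 45-49, ..., 65-69.
frequencies = [2, 7, 13, 12, 7, 3]
next_class_starts = [45, 50, 55, 60, 65, 70]

total = sum(frequencies)  # 44 presidents
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency, as a percent rounded to one decimal place.
cumulative_relative = [round(100 * c / total, 1) for c in cumulative]

# Points for the cumulative relative frequency graph.
points = list(zip(next_class_starts, cumulative_relative))
print(points)
```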
State: Express the problem in terms of the observed variable x Plan: Draw a picture of the distribution and shade the area of interest under the curve Do: Perform calculations: -Standardize x to restate the problem in terms of a standard Normal variable z -Use Table A and the fact that the total area under the curve is 1 to find the required area under the standard Normal curve Conclude: Write your conclusion in the context of the problem
How to solve problems involving Normal distributions:
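The "Do" step can be sketched in code. Here Python's `statistics.NormalDist().cdf` stands in for Table A (a substitution, not the card's method), and the N(100, 15) distribution and cutoffs are hypothetical.

```python
from statistics import NormalDist

def area_below(x, mu, sigma):
    """Area under the N(mu, sigma) curve to the left of x."""
    z = (x - mu) / sigma          # standardize x
    return NormalDist().cdf(z)    # standard Normal area to the left of z

# Hypothetical distribution N(100, 15).
mu, sigma = 100, 15

below_130 = area_below(130, mu, sigma)            # z = 2
above_130 = 1 - below_130                         # total area under the curve is 1
between_85_and_130 = below_130 - area_below(85, mu, sigma)

print(round(below_130, 4), round(above_130, 4), round(between_85_and_130, 4))
```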
variable
Multiplying/dividing by a ____ changes the shape of the distribution
68% (34% above, 34% below)
Normal distribution: 1 standard deviation above & below
95%
Normal distribution: 2 standard deviations above & below
99.7%
Normal distribution: 3 standard deviations above & below
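The three percentages in the 68-95-99.7 rule are rounded areas under the standard Normal curve. A quick sketch using `statistics.NormalDist` shows the unrounded values; the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

def area_within(k):
    """Area within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2), "%")
```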
25%, 50%, 75%
Percentages that Q1, Q2 (median), and Q3 represent
Normal probability plot
Provides a good assessment of whether a data set follows a Normal distribution -look for shapes that show clear departures from Normality -if the points lie close to a straight line (linear nature), the plot indicates that the data are Normal; systematic deviations from a straight line indicate a non-Normal distribution -plots each x-value against its expected z-score
Changes: mean, median, value of quartiles, value of percentiles Does not change: range, IQR, standard deviation; shape of distribution
What changes/does not change when adding/subtracting the same number to/from every data value?
Changes: mean, median, quartiles, percentiles; range, IQR, standard deviation (by absolute value of number) Does not change shape of distribution
What changes/does not change when multiplying/dividing every data value by the same number?
steepest part of graph = where the most observations are concentrated; flattest part = where the fewest observations are Ex. Graph/table shows the distribution of median household incomes for the 50 states and the District of Columbia -> at the steepest part of the graph, more states have a median household income in that range. Therefore, the steepest parts of the graph tell you where the most common median household incomes are
What does the steepness of the cumulative relative frequency graph tell you about the distribution?
-Adds a to the mean, median, quartiles, and percentiles -Does not change range, IQR, or standard deviation -Does not change the shape of the distribution
What happens to the data when adding/subtracting the same number a to/from every value?
-Multiplies mean, median, quartiles, and percentiles by b -Multiplies range, IQR, and standard deviation by |b| -Does not change the shape of the distribution
What happens to the data when multiplying/dividing every value by the same number b?
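Both transformation rules can be checked on a small data set. This is a sketch only: the data values, `a`, and `b` are hypothetical, and the population standard deviation (`pstdev`) is used for simplicity.

```python
from statistics import mean, median, pstdev

def summary(values):
    """Center and spread measures for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "sd": pstdev(values),
    }

# Hypothetical data and constants, for illustration only.
data = [2, 4, 4, 6, 9]
a, b = 10, -3

shifted = [x + a for x in data]  # adds a to center, leaves spread alone
scaled = [x * b for x in data]   # multiplies center by b, spread by |b|

print(summary(data))
print(summary(shifted))
print(summary(scaled))
```

A negative `b` is used on purpose: the mean and median pick up the sign, while the range and standard deviation are multiplied by |b| and stay positive.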
cumulative relative frequency graph (ogive)
a way to display cumulative information graphically; shows the number, percentage, or proportion of observations in a data set that are less than or equal to particular values -can be used to describe the position of an individual within a distribution or to locate a specified percentile of the distribution
z-scores (standardized value)
indicates how many standard deviations the observation is above or below the mean -is directional; if positive, the observation is above average; if negative, the observation is below average -distribution does not have to be Normal -when comparing z-scores from different distributions, it is important that the distributions have roughly the same shape