AP Stat Ch. 10

Re-expressing Goal 2: Make the spread of several groups (as seen in side-by-side boxplots) more alike, even if their centers differ

Groups with a common spread are easier to compare; taking logs often makes the individual boxplots more symmetric and gives them more nearly equal spreads; it can also reveal problems in the data
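
As a rough illustration of how logs can even out spread, the sketch below compares two made-up groups, one of which is simply the other multiplied by 10. The group names and values are hypothetical, and population standard deviation stands in for the boxplot's spread.

```python
import math
import statistics

# Hypothetical groups: group_b is group_a scaled by a factor of 10.
group_a = [1, 2, 4, 8]
group_b = [10, 20, 40, 80]

# On the raw scale the second group is far more spread out.
raw_spreads = (statistics.pstdev(group_a), statistics.pstdev(group_b))

# Taking logs turns the factor of 10 into a shift of 1, which does not affect spread.
log_a = [math.log10(v) for v in group_a]
log_b = [math.log10(v) for v in group_b]
log_spreads = (statistics.pstdev(log_a), statistics.pstdev(log_b))

print("raw spreads:", raw_spreads)
print("log spreads:", log_spreads)
```

The centers still differ after the re-expression (by 1 on the log scale), which is fine for Goal 2: only the spreads need to match.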

When none of the data values is zero or negative...

LOGARITHMS. Try taking logs of the x-variable, the y-variable, or both. Then model the re-expressed data with the exponential, logarithmic, or power model.

Ways to model/summarize data

Require that: -the data have simple structure -symmetry, to summarize center and spread and to use a Normal model -equal variation across groups, when we compare groups with boxplots or want to compare their centers -linear shape in a scatterplot, so we can use correlation to summarize the scatter and regression to fit a linear model

Reasons to consider re-expression:

-Make the distribution of a variable more symmetric -Make the spread across different groups more similar -Make the form of a scatterplot straighter -Make the scatter around the scatterplot more consistent

Ways to handle bent relationships

-Straighten the data, then fit a line -Use the calculator shortcut to create a curve

Don't choose a model based on R^2 alone.

A high R^2 does NOT mean the pattern is straight. MAKE A PICTURE. Before you fit a line, always look at the pattern in the scatterplot. After you fit the line, check for linearity again by plotting the residuals.
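
A small sketch of why R^2 alone can mislead: the made-up data below follow y = x^2 exactly, yet a straight-line fit still earns a high R^2. The helper `fit_line` is a hypothetical name for an ordinary least-squares fit written out by hand.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical curved data: y is exactly x squared.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

mean_y = sum(ys) / len(ys)
r_squared = 1 - sum(r ** 2 for r in residuals) / sum((y - mean_y) ** 2 for y in ys)

print("R^2:", round(r_squared, 3))
print("residuals:", [round(r, 1) for r in residuals])
```

The residuals are positive at both ends and negative in the middle, a bend that the R^2 value by itself would never reveal.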

Watch out for data far from 1.

Data values that are all very far from 1 are barely affected by re-expression unless their range is very large -re-expressing numbers between 1 and 100 has a much greater effect than re-expressing numbers between 100,001 and 100,100 -Subtract a constant to bring the values back near 1 >For years, consider re-expressing "years since 1950" instead of the year itself >Unless the data start at 1950, in which case use "years since 1949" to avoid creating a zero
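
One way to see this numerically: the sketch below measures how much the log curve bends over an interval. The function `relative_bend` is a made-up measure for illustration (how far the log of the midpoint sits above the straight line joining the logs of the endpoints, as a fraction of the log range), not a standard statistic.

```python
import math

def relative_bend(lo, hi):
    """Bend of log10 over [lo, hi], as a fraction of the total log range."""
    mid = (lo + hi) / 2
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    chord_mid = (log_lo + log_hi) / 2  # straight-line value at the midpoint
    return (math.log10(mid) - chord_mid) / (log_hi - log_lo)

near_one = relative_bend(1, 100)             # values near 1: logs bend a lot
far_away = relative_bend(100_001, 100_100)   # values far from 1: logs are nearly straight
shifted = relative_bend(100_001 - 100_000, 100_100 - 100_000)  # subtract a constant first

print(near_one, far_away, shifted)
```

With almost no bend, taking logs of the far-away values changes their shape very little; subtracting 100,000 first restores the full effect.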

Don't expect your model to be perfect.

We are not looking for the RIGHT MODEL...we are looking for a USEFUL model.

Watch out for negative data values

you cannot re-express negative values with logarithms or fractional powers, or zeros with logarithms or negative powers; add a constant (such as 1/2 or 1/6 when the smallest value is zero) to bring all the data values above zero

Ladder of Powers: When you take a negative power, the

direction of the relationship will change; you can always change the sign of the response variable if you want to keep the same direction
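
A minimal sketch of the direction change, using made-up increasing y-values and the reciprocal (power -1); the helper `is_increasing` is a hypothetical name.

```python
def is_increasing(values):
    """True when each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

ys = [2, 4, 5, 10]                    # hypothetical response values, increasing
reciprocal = [y ** -1 for y in ys]    # power -1 reverses the order
negated = [-(y ** -1) for y in ys]    # changing the sign restores the direction

print(is_increasing(ys), is_increasing(reciprocal), is_increasing(negated))
```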

Re-expressing Goal 3: Make form of scatterplot more linear

a linear form is easier to model; taking logs often makes a relationship more linear

Re-expressing Goal 1: Make distribution of variable more symmetric

easier to summarize the center of a symmetric distribution; for nearly symmetric distributions, use the mean and standard deviation -if the distribution is also unimodal --> the resulting distribution may be closer to the Normal model --> can use the 68-95-99.7 Rule

Ladder of Powers orders:

the effects that the re-expressions have on data. Ex: if taking the square roots of all the values in a variable helps but not quite enough, move farther down the ladder to the logarithm or reciprocal root; those re-expressions have a similar but even stronger effect on the data. If you go too far, you can always back up the ladder.
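
The sketch below steps down the ladder on a made-up right-skewed data set and tracks a moment-based skewness. The names `re_express` and `skewness` are hypothetical helpers, and using skewness as the yardstick is an assumption for illustration; in practice you would look at a histogram.

```python
import math

def re_express(value, power):
    """Apply one rung of the ladder; power 0 stands for the logarithm."""
    return math.log10(value) if power == 0 else value ** power

def skewness(values):
    """Third central moment divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data (all values positive).
data = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

# Rungs tried: raw data (1), square root (1/2), logarithm (0).
skews = {power: skewness([re_express(v, power) for v in data]) for power in (1, 0.5, 0)}
for power, skew in skews.items():
    print(f"power {power}: skewness {skew:.2f}")
```

Each step down the ladder pulls in the long right tail a little more, so the skewness shrinks toward zero.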

Ladder of Powers

the farther you move from the original data (the "1" position), the greater the effect of the re-expression on the data

Re-expressing Goal 4: Make the scatter in scatterplot spread out rather than thickening at one end

having even scatter is a condition of many methods of Statistics

We re-express data to:

improve symmetry, make scatter around a line more constant, or make a scatterplot more linear

The Ladder of Powers or the...

log-log approach can help us find a good re-expression

Ladder of Powers: "0"

logs -for measurements that CANNOT be negative and values that grow by percentage increases (salaries, populations) -When in doubt, start here -If your data have zeros, try adding a small constant to all values before finding the logs.
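
A short sketch of the zero problem and the small-constant fix. The data and the shift of 0.5 are assumptions for illustration; the card does not prescribe a particular constant.

```python
import math

counts = [0, 3, 12, 150]  # hypothetical data containing a zero

# The log of zero is undefined, so Python raises an error.
try:
    math.log10(counts[0])
    zero_log_failed = False
except ValueError:
    zero_log_failed = True

shift = 0.5  # assumed small constant added to every value
logged = [math.log10(c + shift) for c in counts]

print("log10(0) failed:", zero_log_failed)
print("shifted logs:", [round(v, 3) for v in logged])
```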

We seek a useful...

model, not perfection (or even "the best")

Ladder of Powers

places in order the effects that many re-expressions have on the data

Ladder of Powers: Power "1"

raw data; no change at all; "home base" -the farther you step from here up or down the ladder, the greater the effect -data that can take on both + and - values with no bounds are less likely to benefit from re-expression.

Re-expression

re-express data by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values of a variable

Models won't be perfect, but that...

re-expression can lead to a useful model

Beware of multiple modes.

re-expression can make a skewed unimodal histogram more nearly symmetric, but it cannot pull separate modes together; it may make the separation of the modes clearer, making it easier to analyze each group individually

Watch out for scatterplots that turn around

re-expression cannot straighten a scatterplot that turns around or oscillates; do not analyze such data with methods that require a linear form

Ladder of Powers: Power "1/2"

square root of y -for counted data, start here

Don't stray too far from the ladder.

taking y-values to an extremely high power may artificially inflate R^2, but the result will not be a useful or meaningful model -use powers between -2 and 2

Power

x-axis: log(x) y-axis: log(y) -Goldilocks model: when one of the ladder's powers is too big and the next is too small -Re-expression Equation: log ŷ = a + b log x -Calculator's Curve: >PwrReg >ŷ = ax^b
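
A minimal sketch of the log-log approach, assuming made-up data that follow y = 3x^2 exactly. A line is fitted to (log x, log y), and the intercept is converted back to the a in ŷ = ax^b. The helper `fit_line` is a hypothetical hand-written least-squares fit, not the calculator's PwrReg routine.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for y-hat = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data following y = 3 * x^2.
xs = [1, 2, 4, 8, 16]
ys = [3 * x ** 2 for x in xs]

# Fit log y-hat = intercept + b * log x.
intercept, b = fit_line([math.log10(x) for x in xs], [math.log10(y) for y in ys])

# Undo the log: y-hat = a * x^b with a = 10^intercept.
a = 10 ** intercept
print(f"y-hat = {a:.3f} * x^{b:.3f}")
```

The slope of the log-log line is the power b, which is why this approach can land between two rungs of the ladder.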

Logarithmic

x-axis: log(x) y-axis: y -a wide range of x-values, or a scatterplot descending rapidly at the left but leveling off toward the right -Re-expression Equation: ŷ = a + b log x -Calculator's Curve: >LnReg >ŷ = a + b ln x
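
A sketch of the logarithmic model on made-up data following y = 5 + 2 log x (base 10). It also shows how the slope changes if natural logs are used instead, as in the ln form above; `fit_line` is a hypothetical hand-written least-squares helper.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for y-hat = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data following y = 5 + 2 * log10(x).
xs = [1, 10, 100, 1000]
ys = [5 + 2 * math.log10(x) for x in xs]

# Fit y-hat = a + b * log10(x): only x is re-expressed.
a, b = fit_line([math.log10(x) for x in xs], ys)

# Same curve written with natural logs: y-hat = a + b_ln * ln(x).
b_ln = b / math.log(10)
prediction_at_100 = a + b_ln * math.log(100)

print(f"y-hat = {a:.3f} + {b:.3f} log x = {a:.3f} + {b_ln:.3f} ln x")
```

Both forms describe the same curve; only the slope is rescaled by ln(10).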

Exponential

x-axis: x y-axis: log(y) -is the "0" power in the ladder approach, useful for values that grow by percentage increases -Re-expression Equation: log ŷ = a + bx -Calculator's Curve: >Command: ExpReg >ŷ = ab^x
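
A sketch of the exponential model on made-up data that grow by 50% per step (y = 2 * 1.5^x). A line is fitted to (x, log y), and both coefficients are converted back to the ŷ = ab^x form; `fit_line` is a hypothetical hand-written least-squares helper, not the calculator's ExpReg routine.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for y-hat = intercept + slope*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data following y = 2 * 1.5^x.
xs = [0, 1, 2, 3, 4]
ys = [2 * 1.5 ** x for x in xs]

# Fit log y-hat = intercept + slope * x: only y is re-expressed.
intercept, slope = fit_line(xs, [math.log10(y) for y in ys])

# Undo the log: y-hat = a * b^x with a = 10^intercept and b = 10^slope.
a = 10 ** intercept
b = 10 ** slope
print(f"y-hat = {a:.3f} * {b:.3f}^x")
```

The recovered b is the growth factor per unit of x, which matches the percentage-increase reading of this model.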

Ladder of Powers: Power "2"

y^2; try this for unimodal distributions that are skewed to the left

