The Practice of Statistics - Chapter 3

Lakukan tugas rumah & ujian kamu dengan baik sekarang menggunakan Quizwiz!

Regression line

A line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.

Residual plot

A scatterplot of the regression residuals against the explanatory variable (or equivalently, against the predicted y-values). Residual plots help us assess how well a regression line fits the data.

Explanatory variable

A variable that may help explain or influences changes in a response variable.

Response variable

A variable that measures an outcome of a study

Influential

An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least-squares regression line.

Outlier

An observation that lies outside the overall pattern of the other observations. Points that are outliers in the y direction but not the x direction of a scatterplot have large residuals. Other outliers may not have large residuals.

Standard deviation of the residuals (s)

If we use a least-squares line to predict the values of a response variable y from an explanatory variable x, the standard deviation of the residuals (s) is given by ()22ˆresiduals22iyysnn−==−−ΣΣ This value gives the approximate size of a "typical" or "average" prediction error (residual). <<<Sorry - doesn't copy well>>>

Overall pattern

In any graph of data, look for the overall pattern and for striking departures from that pattern. You can describe the overall pattern of a scatterplot by the direction, form, and strength of the relationship.

Correlation

Measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.

Scatterplot

Shows the relationship between two quantitative variables measured on the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as a point in the graph.

Slope

Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form ˆy(hat) = a + bx. In this equation, b is the slope, the amount by which y is predicted to change when x increases by one unit.

y intercept

Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form ˆy(hat) = a + bx. In this equation, the number a is the y intercept, the predicted value of y when x = 0.

Residual

The difference between an observed value of the response variable and the value predicted by the regression line. That is, residual = observed y - predicted y = ˆy(hat)

Coefficient of determination r2

The fraction of the variation in the values of y that is accounted for by the least-squares regression line of y on x. We can calculate r2 using the following formula: r2(square) =1-SSE/SST where SSE = (Sigma) residual(square) = (Sigma)(yi-y(hat))(square) and SST = (Sigma) (yi - Y(hat) )(square).

Least-squares regression line

The least-squares regression line of y on x is the line that makes the sum of the squared vertical distances of the data points from the line as small as possible.

Extrapolation

The use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate.

Positive association

When above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.

Negative association

When above-average values of one variable tend to accompany below-average values of the other, and vice versa.

Predicted value

y (read "y hat") is the predicted value of the response variable y for a given value of the explanatory variable x.

Equation of the least-squares regression line

y(hat) = a+bx with slope b=r sy/sx and y intercept a=y(bar) - bx(bar)


Set pelajaran terkait

SYG2000 Chapter 6. Social Control and Deviance InQuizitive

View Set

Chapter 16 Real Estate Appraisal

View Set

CH. 5 BOOK INFO: METHODS FOR ASSESSING AND SELECTING EMPLOYEES

View Set

marketing research quiz questions

View Set

Chapter 2,4 Assignment Questions

View Set