BLESSING Chap 5 vocab
Pearson's sample correlation coefficient
A measure of the extent to which sample x and y values are linearly related; -1 ≤ r ≤ 1, so values close to 1 or -1 indicate a strong linear relationship.
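A minimal sketch of how r might be computed by hand, assuming the usual formula r = Sxy / sqrt(Sxx * Syy). The function name pearson_r and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squared deviations about the sample means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Note: this divides by zero if all x values or all y values are equal.
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```

For this made-up sample r is about 0.77, a fairly strong positive linear relationship.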
transformation
A simple function of the x and/or y variable, which is then used in a regression.
power transformation
An exponent, or power, p, is first specified, and then new (transformed) data values are calculated as transformed value = (original value)^p. A logarithmic transformation is identified with p = 0. When the scatterplot of the original data exhibits curvature, a power transformation of x and/or y will often result in a scatterplot that has a linear appearance.
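A small sketch of a power transformation, assuming the natural log is used for the p = 0 case (the base is an assumption; the card does not specify one). The function name and data are hypothetical.

```python
import math

def power_transform(values, p):
    """Return (original value)^p for each value; p = 0 means a log transformation."""
    if p == 0:
        # Assumption: natural log. Requires strictly positive values.
        return [math.log(v) for v in values]
    return [v ** p for v in values]

# Hypothetical curved data: y grows like x squared.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

sqrt_ys = power_transform(ys, 0.5)   # square-root transformation of y
log_ys = power_transform(ys, 0)      # logarithmic transformation of y
print(list(zip(xs, sqrt_ys)))
```

Here the (x, sqrt(y)) pairs fall on a straight line even though the original (x, y) scatterplot is curved.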
Population correlation coefficient
Analogous measure of how strongly x and y are related in the entire population of pairs from which the sample was obtained
Scatterplot
A picture of bivariate numerical data in which each observation (x, y) is represented as a point located with respect to a horizontal x axis and vertical y axis.
Explanatory variable*
Attempts to explain the observed outcomes, x value, independent
Danger of extrapolation
Extrapolation is the calculation of the value of a function outside the range of known values; the danger is that the fitted relationship may not hold there.
Explanatory variable *
Helps explain the observed outcomes, independent, x value
Response variable *
Measures an outcome of a study, dependent, y value, depends on the explanatory variable
Response variable*
Measures the outcome of a study, y value, dependent
Sum of squared deviations***
Most widely used criterion for measuring the goodness of fit of a line y = a + bx to bivariate data (x1, y1), ..., (xn, yn): the sum of the squared vertical deviations about the line, Σ[y - (a + bx)]^2.
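As a rough illustration of this criterion, the sketch below evaluates Σ[y - (a + bx)]^2 for two candidate lines on made-up data. The function name, the data, and both candidate lines are hypothetical.

```python
def sum_squared_deviations(xs, ys, a, b):
    """Sum of squared vertical deviations of the points from the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]

ss_guess = sum_squared_deviations(xs, ys, a=1, b=1)      # an arbitrary candidate line
ss_best = sum_squared_deviations(xs, ys, a=2.2, b=0.6)   # the least-squares line for this data
print(ss_guess, round(ss_best, 4))
```

The smaller the sum, the better the line fits; here the least-squares line gives about 2.4 against 4 for the arbitrary guess.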
Correlation coefficient
Numerical assessment of the strength of relationship between the x and y values in a set of (x, y) pairs
predicted (fitted) values, ....
Obtained by substituting the x value for each observation into the least-squares line; [...]
residuals
Obtained by subtracting each predicted value from the corresponding observed y value: residual = y - ŷ, where ŷ is the predicted value. These are the vertical deviations from the least-squares line.
residual plot
Scatterplot of the (x, residual) pairs. Isolated points or a pattern of points in a residual plot are indicative of potential problems.
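A short sketch of how the residuals and the (x, residual) pairs for a residual plot could be computed. The data are made up, and the line y = 2.2 + 0.6x is taken as given (it is the least-squares line for this particular sample).

```python
def residuals(xs, ys, a, b):
    """Observed y minus predicted y (a + b*x) for each observation."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6        # least-squares intercept and slope for this data

res = residuals(xs, ys, a, b)
plot_pairs = list(zip(xs, res))   # the points a residual plot would display

for x, e in plot_pairs:
    print(f"x = {x}, residual = {e:+.2f}")
```

Residuals from a least-squares line sum to zero (up to rounding), so what matters in the plot is whether they show a pattern or isolated points.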
logistic equation
The graph of this equation is an S-shaped curve. The logistic regression equation is used to describe the relationship between the probability of success and a numerical predictor variable.
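A minimal sketch of the S-shaped curve, assuming the common form p = e^(a + bx) / (1 + e^(a + bx)). The card does not state the equation, so this form and the coefficient values below are assumptions chosen for illustration.

```python
import math

def logistic(x, a, b):
    """Probability of success at x; algebraically equal to e^(a+bx) / (1 + e^(a+bx))."""
    # Adequate for moderate values of a + b*x; extreme values can overflow math.exp.
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

a, b = -3.0, 1.5   # hypothetical coefficients
probs = [logistic(x, a, b) for x in range(0, 5)]

for x, p in zip(range(0, 5), probs):
    print(f"x = {x}, p = {p:.3f}")
```

The probabilities rise slowly, then steeply, then level off, and always stay between 0 and 1.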
Least squares line
The least-squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
principle of least squares
The method used to select a line that summarizes an approximate linear relationship between x and y. The least-squares line is the line that minimizes the sum of the squared vertical deviations for the points in the scatterplot.
coefficient of determination
The proportion of variation in observed y's that can be attributed to an approximate linear relationship.
standard deviation about the least-squares line
The size of a "typical" deviation from the least-squares line.
total sum of squares
The sum of squared deviations from the sample mean is a measure of total variation in the observed y values.
residual (error) sum of squares
The sum of the squared residuals is a measure of y variation that cannot be attributed to an approximate linear relationship (unexplained variation).
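The four quantities above fit together as sketched below, assuming the usual textbook relationships r^2 = 1 - SSResid/SSTo and s_e = sqrt(SSResid / (n - 2)). The data are hypothetical, and y = 2.2 + 0.6x is the least-squares line for this sample.

```python
import math

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6        # least-squares intercept and slope for this data

n = len(ys)
y_bar = sum(ys) / n

# Total sum of squares: variation of the observed y values about their mean.
ss_total = sum((y - y_bar) ** 2 for y in ys)

# Residual (error) sum of squares: variation left unexplained by the line.
ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Coefficient of determination and standard deviation about the line.
r_squared = 1 - ss_resid / ss_total
s_e = math.sqrt(ss_resid / (n - 2))

print(f"SSTo = {ss_total:.1f}, SSResid = {ss_resid:.1f}")
print(f"r^2 = {r_squared:.2f}, s_e = {s_e:.3f}")
```

For this sample about 60% of the variation in y is attributed to the linear relationship, and a typical deviation from the line is roughly 0.9.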
Regression analysis
a method of predicting sales based on finding a relationship between past sales and one or more variables
Intercept
a, the height of the line above the value x = 0 (Y-intercept / vertical intercept)
Slope
b, the amount by which y increases when x increases by 1 unit
the intercept of the least-squares line
formula
the slope of the least-squares line
formula
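The two cards above do not write the formulas out. A sketch using the standard textbook expressions, b = Σ(x - x_bar)(y - y_bar) / Σ(x - x_bar)^2 and a = y_bar - b * x_bar, is given here; the function name and data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b): intercept and slope of the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope (undefined if all x values are equal)
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.1f} + {b:.1f}x")
```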
Sample regression line
provides an estimate of the population regression line; used to predict the EXPECTED value of y for a given value of x; the line that minimizes the sum of squared deviations.