Dev Stats Ch 4 Vocab
Determine whether the given points are on the graph of the equation. 4x + 5y = 13
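(a) (7, -3); (b) (2, 2); (c) (3, 5). To check each point, substitute its x and y values into the equation and see whether the left side equals 13; each check comes out true or false. (a) 4(7) + 5(-3) = 13, so the point is on the graph. (b) 4(2) + 5(2) = 18 and (c) 4(3) + 5(5) = 37, so neither point is on the graph.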
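The substitution check above can be sketched in a few lines of Python. The function name `on_graph` and the dictionary layout are illustrative choices, not part of the original problem.

```python
def on_graph(x, y):
    """Return True if the point (x, y) satisfies 4x + 5y = 13."""
    return 4 * x + 5 * y == 13

# The three candidate points from the problem.
points = {"a": (7, -3), "b": (2, 2), "c": (3, 5)}

# Substitute each point into the equation and record True or False.
results = {label: on_graph(x, y) for label, (x, y) in points.items()}
print(results)
```

Only point (a) satisfies the equation, so only (a) lies on the graph.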
lurking variable
A variable that has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied
If r = +1, then a perfect positive linear relation exists between the two variables.
...
If r = −1, then a perfect negative linear relation exists between the two variables.
...
If r is close to 0, then little or no evidence exists of a linear relation between the two variables. So r close to 0 does not imply no relation, just no linear relation.
...
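A small sketch of why r close to 0 rules out only a linear relation: the points below lie exactly on the curve y = x², yet r works out to 0. The data and the helper name `correlation` are assumptions made for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data with a perfect, but nonlinear, relation: y = x^2.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(r)
```

Every y value is completely determined by x here, but because the pattern is symmetric rather than straight, the linear correlation coefficient does not detect it.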
Is there another way two variables can be correlated without there being a causal relation? Yes—through a lurking variable. A lurking variable is related to both the explanatory variable and response variable. For example, as air-conditioning bills increase, so does the crime rate. Does this mean that folks should turn off their air conditioners so that crime rates decrease? Certainly not! In this case, the lurking variable is air temperature. As air temperatures rise, both air-conditioning bills and crime rates rise.
...
Since $s_y$ and $s_x$ must both be positive, the sign of the linear correlation coefficient, $r$, and the sign of the slope of the least-squares regression line, $b_1$, are the same. For example, if $r$ is positive, then $b_1$ will also be positive.
...
The Least-Squares Regression Line: The equation of the least-squares regression line is given by $\hat{y} = b_1 x + b_0$, where $b_1 = r \cdot \frac{s_y}{s_x}$ is the slope of the least-squares regression line and $b_0 = \bar{y} - b_1 \bar{x}$ is the y-intercept of the least-squares regression line. Note: $\bar{x}$ is the sample mean and $s_x$ is the sample standard deviation of the explanatory variable x; $\bar{y}$ is the sample mean and $s_y$ is the sample standard deviation of the response variable y.
...
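The slope and intercept formulas on the card above can be sketched with Python's standard `statistics` module. The sample data and the function name `least_squares_line` are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (b1, b0) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    # Linear correlation coefficient r.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b1, b0

# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b1, b0 = least_squares_line(xs, ys)
print(f"y-hat = {b1:.1f}x + {b0:.1f}")
```

For this data the line is approximately y-hat = 0.6x + 2.2, and substituting the mean of x (3) returns the mean of y (4), as expected for a least-squares line.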
The closer r is to +1, the stronger is the evidence of positive association between the two variables.
...
The closer r is to −1, the stronger is the evidence of negative association between the two variables.
...
The correlation coefficient is not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient.
...
The linear correlation coefficient is a unitless measure of association. So the unit of measure for x and y plays no role in the interpretation of r.
...
The linear correlation coefficient is always between −1 and 1, inclusive. That is, −1 ≤ r ≤ 1.
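To illustrate that r is unitless, the sketch below computes r for the same hypothetical measurements in two unit systems. The heights, weights, and the helper name `correlation` are assumptions; the conversion factors (2.54 cm per inch, 0.45359237 kg per pound) are the standard definitions.

```python
from math import sqrt, isclose

def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical measurements in inches and pounds.
heights_in = [60, 62, 65, 68, 70]
weights_lb = [115, 120, 140, 155, 160]

# The same measurements converted to centimeters and kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

r_imperial = correlation(heights_in, weights_lb)
r_metric = correlation(heights_cm, weights_kg)
print(isclose(r_imperial, r_metric, rel_tol=1e-9))
```

Changing units rescales both the numerator and the denominator of r by the same factor, so the value is unchanged apart from floating-point rounding.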
...
The notation $\hat{y}$ is used in the least-squares regression line to remind us that it is a predicted value of y for a given value of x. The least-squares regression line, $\hat{y} = b_1 x + b_0$, always contains the point $(\bar{x}, \bar{y})$. This property can be useful when drawing the least-squares regression line by hand.
...
b1 = slope of the least squares regression line
...
lurking variables are related to both the explanatory and response variables in a study.
...
response variable
A variable that measures an outcome of a study
slope
A measure of the steepness of a line. Given two points with coordinates (x1, y1) and (x2, y2) on a line, the slope, m, of the line is given by m = rise/run = (y2 - y1)/(x2 - x1), provided x1 ≠ x2.
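A minimal sketch of the rise-over-run formula. Returning `None` for a vertical line is a design choice made here to signal an undefined slope; the function name `slope` is illustrative.

```python
def slope(p1, p2):
    """Slope of the line through p1 = (x1, y1) and p2 = (x2, y2)."""
    (x1, y1), (x2, y2) = p1, p2
    run = x2 - x1
    if run == 0:
        return None  # vertical line: the slope is undefined
    return (y2 - y1) / run

print(slope((1, 2), (3, 6)))   # rise 4 over run 2
print(slope((0, 5), (4, 5)))   # horizontal line
print(slope((2, 1), (2, 7)))   # vertical line
```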
contingency table
A two-variable table with cross-tabulated data.
linear relation
A type of relationship that exists between two variables whose graphed data points lie on a straight line
An equation in two variables can have more than one solution.
A. The statement is false. An equation in two variables has no solution. B. The statement is true. An equation in two variables is satisfied by all real numbers. C. The statement is false. An equation in two variables has only one solution. D. The statement is true. There can be infinitely many choices of the variables that satisfy the given equation. Correct answer: D.
bi-variate data
Data collected on two variables for each individual in a study.
uni-variate data
Data collected on a single variable. For a quantitative variable, these are data you can summarize with the mean, median, and standard deviation.
conditional distribution
Distribution of values of one variable among individuals who have a specific value of the other variable. (Totals in the cells)
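A sketch of a conditional distribution computed from a two-way table of counts. The table, its labels, and the function name `conditional_distribution` are all hypothetical.

```python
# Hypothetical two-way table of counts: the row variable is class year,
# the column variable is the answer to a yes/no question.
table = {
    "freshman": {"yes": 30, "no": 20},
    "senior": {"yes": 10, "no": 40},
}

def conditional_distribution(table, row):
    """Distribution of the column variable among individuals in one row."""
    row_total = sum(table[row].values())
    return {col: count / row_total for col, count in table[row].items()}

freshman_dist = conditional_distribution(table, "freshman")
senior_dist = conditional_distribution(table, "senior")
print(freshman_dist)
print(senior_dist)
```

Each conditional distribution uses the cell counts of a single row divided by that row's total, so its proportions sum to 1.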
Negatively associated
High values of one variable tend to occur with low values of the other variable
The graph of a linear equation is a ______.
Line
scatter diagram
A graph that shows the relationship between data items using x and y axes.
If the linear correlation between two variables is negative, is the slope of the regression line negative or positive?
Negative
Testing for a Linear Relation
Step 1 Determine the absolute value of the correlation coefficient. Step 2 Find the critical value in Table II from Appendix A for the given sample size. Step 3 If the absolute value of the correlation coefficient is greater than the critical value, we say a linear relation exists between the two variables. Otherwise, no linear relation exists.
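The three steps above reduce to one comparison once the critical value has been looked up. In this sketch the critical value is passed in as a parameter because it must come from Table II for the given sample size; the numbers in the usage lines are hypothetical placeholders, not values taken from that table.

```python
def linear_relation_exists(r, critical_value):
    """Apply the test: a linear relation exists if |r| exceeds the critical value."""
    return abs(r) > critical_value

# Hypothetical inputs for illustration; look up the real critical value
# in the table for your sample size.
print(linear_relation_exists(-0.9, 0.5))
print(linear_relation_exists(0.3, 0.5))
```

Taking the absolute value first means a strong negative correlation passes the test just as a strong positive one does.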
predictor variable
The explanatory (independent) variable in a correlational study; its value is used to predict the score on another variable.
least-squares regression
The line with the smallest sum of squared residuals, where each squared error is (observed y - predicted y)².
marginal distribution
The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table. (Totals in the margins)
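A sketch of the marginal distributions of a two-way table, computed from the totals in the margins. The table and the function names are hypothetical.

```python
# Hypothetical two-way table of counts (rows: class year, columns: answer).
table = {
    "freshman": {"yes": 30, "no": 20},
    "senior": {"yes": 10, "no": 40},
}

def row_marginal(table):
    """Marginal distribution of the row variable: row totals / grand total."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {name: sum(row.values()) / grand_total for name, row in table.items()}

def column_marginal(table):
    """Marginal distribution of the column variable: column totals / grand total."""
    grand_total = sum(sum(row.values()) for row in table.values())
    columns = next(iter(table.values())).keys()
    return {col: sum(row[col] for row in table.values()) / grand_total for col in columns}

print(row_marginal(table))
print(column_marginal(table))
```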
coefficient of determination
The number found by squaring the linear correlation coefficient. It represents the proportion of the variation in the response variable that is explained by the least-squares regression line.
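A sketch showing the two readings of this definition agreeing numerically: squaring r gives the same value as the fraction of variation explained by the least-squares line (1 minus the sum of squared residuals over the total variation in y). The data and variable names are hypothetical.

```python
from math import sqrt

# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y

# Route 1: square the linear correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)
r_squared = r ** 2

# Route 2: fraction of variation explained by the least-squares line.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
ss_res = sum((y - (b1 * x + b0)) ** 2 for x, y in zip(xs, ys))
explained_fraction = 1 - ss_res / s_yy

print(round(r_squared, 4), round(explained_fraction, 4))
```

For this data both routes give about 0.6, meaning roughly 60% of the variation in y is explained by the line.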
y-intercept
The y-coordinate of a point where a graph crosses the y-axis.
The least-squares regression line is the line that minimizes the sum of the squared errors (or residuals)
This line minimizes the sum of the squared vertical distances between the observed values of y and those predicted by the line, $\hat{y}$ (read "y-hat"). We represent this as "minimize $\sum \text{residuals}^2$".
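A sketch of the quantity being minimized. It compares the sum of squared residuals for the least-squares line of a small hypothetical data set (slope 0.6, intercept 2.2) with that of a nearby alternative line; the data and the function name are assumptions.

```python
def sum_squared_residuals(xs, ys, b1, b0):
    """Sum of (observed y - predicted y)^2 for the line y-hat = b1*x + b0."""
    return sum((y - (b1 * x + b0)) ** 2 for x, y in zip(xs, ys))

# Hypothetical sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares line for this data versus an arbitrary alternative line.
ssr_least_squares = sum_squared_residuals(xs, ys, 0.6, 2.2)
ssr_alternative = sum_squared_residuals(xs, ys, 0.5, 2.5)

print(round(ssr_least_squares, 4), round(ssr_alternative, 4))
```

The least-squares line gives about 2.4 while the alternative gives 2.5; any other choice of slope and intercept would likewise produce a larger sum.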
column variable
Variable that describes the columns of the table
simpson's paradox
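An association between two variables that holds within each of several groups can weaken, disappear, or reverse direction when the data from the groups are combined, so averages within groups can appear to contradict the overall averages.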
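A sketch of the reversal using made-up counts. Treatment A has the higher success rate inside each group, yet treatment B has the higher rate once the groups are pooled, because the two treatments were applied to the groups in very different proportions. All numbers and labels are hypothetical.

```python
# Hypothetical (successes, trials) for two treatments in two groups.
data = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (20, 100), "B": (1, 10)},
}

# Success rate of each treatment within each group.
within = {
    group: {t: s / n for t, (s, n) in treatments.items()}
    for group, treatments in data.items()
}

# Success rate of each treatment after pooling the groups.
overall = {}
for t in ("A", "B"):
    successes = sum(data[g][t][0] for g in data)
    trials = sum(data[g][t][1] for g in data)
    overall[t] = successes / trials

print(within)
print(overall)
```

Here the group itself acts as a lurking variable: most of A's trials fall in the group where success is rare for both treatments.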
The slope of a horizontal line is_______ , while the slope of a vertical line is_________
Zero; undefined
explanatory variable
a variable that we think explains or causes changes in the response variable
If a pediatrician wants to use height to predict head circumference, which variable is explanatory and which is response?
The explanatory variable is height (x); the response variable is head circumference (y).
linear correlation coefficient
Measures the strength and direction of the linear relationship between two quantitative variables, x and y; denoted r.
A linear correlation coefficient close to 0 does not imply that there is no relation, just no linear relation.
page 194
Perfect positive linear relation: r = 1
page 194
Scatter Diagram (2) from the book
A scatter diagram is a graph that shows the relationship between two quantitative variables measured on the same individual. Each individual in the data set is represented by a point in the scatter diagram. The explanatory variable is plotted on the horizontal axis, and the response variable is plotted on the vertical axis.
explained deviation
The deviation between the predicted value and the mean value of the response variable: $\hat{y} - \bar{y}$.
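For context, the explained deviation is usually presented as one part of a decomposition of how far an observed y lies from the mean. The labels "total deviation" and "unexplained deviation" below are standard terminology but are not defined elsewhere in this set, so treat them as an assumption about the course's wording.

```latex
\underbrace{y - \bar{y}}_{\text{total deviation}}
  = \underbrace{\hat{y} - \bar{y}}_{\text{explained deviation}}
  + \underbrace{y - \hat{y}}_{\text{unexplained deviation}}
```

The unexplained deviation is the residual: the part of the observed value that the regression line does not account for.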
row variable
The variable that describes the rows of the table; its value stays constant going horizontally across a row.
positively associated
Above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.