STT Exam 2

leverage

Data points whose x-values are far from the mean of x exert leverage on a linear model

What does a regression equation tell?

- Predict the y-value at a particular x
- Estimate the slope between y and x
- Estimate whether the linear association is positive or negative

What are 4 things to know about outliers and correlation?

- Outliers can distort the correlation dramatically
- They can make a weak correlation look strong or hide a strong correlation
- They can give a positive association a negative correlation
- Report the correlation with and without the outlier
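
The effect of a single outlier can be checked numerically. The sketch below uses made-up data (five points on a perfect positive trend plus one hypothetical outlier) and a small helper, `correlation`, written here for illustration only; it reports the correlation with and without the outlier.

```python
def mean(values):
    return sum(values) / len(values)

def correlation(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up data: a perfect positive linear trend.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]

# Hypothetical outlier, far from the rest in both x and y.
outlier = (10, -10)

r_without = correlation(xs, ys)
r_with = correlation(xs + [outlier[0]], ys + [outlier[1]])

print(f"r without outlier: {r_without:.2f}")
print(f"r with outlier:    {r_with:.2f}")
```

In this example the one added point is enough to turn a positive association into a negative correlation, which is why both values should be reported.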

What can a scatter plot & regression line be used for?

- Determine whether any (x, y) pairs are outliers
- Predict y at a specific x
- Estimate the average y at a specific x

A correlation of 0.9 means

r^2 = 0.9^2 = 0.81, so 81% of the variation in the y-values can be explained by the linear relationship with the explanatory variable, x
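
The arithmetic can be sketched in a few lines. The helper name `explained_variation` is hypothetical, introduced only for this illustration; the two r values are ones that appear on these cards.

```python
def explained_variation(r):
    """Return r^2, the proportion of variation in y explained by x."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation must lie between -1 and 1")
    return r ** 2

# Squaring removes the sign, so a negative r still gives a positive proportion.
for r in (0.9, -0.5):
    print(f"r = {r:+.1f} -> r^2 = {explained_variation(r):.2f}")
```

Because the sign of r is lost when squaring, r^2 says how much variation is explained but not whether the association is positive or negative.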

What are the 4 properties of the coefficient of determination?

- 0 ≤ r^2 ≤ 1
- r^2 = 1 only if all points lie exactly on a line
- It does not change if the units of measurement change
- It measures the strength of the linear relationship between y and x
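
The claim that r^2 does not depend on units can be checked with a small sketch. The heights and weights below are made-up values, and `r_squared` is a hypothetical helper written for this example; the only change between the two calls is converting x from inches to centimeters.

```python
def mean(values):
    return sum(values) / len(values)

def r_squared(xs, ys):
    """Coefficient of determination for a simple linear relationship."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Made-up data: heights in inches and weights in pounds.
heights_in = [60, 62, 65, 68, 70, 72]
weights_lb = [115, 120, 140, 155, 160, 180]

# Same heights expressed in centimeters (1 inch = 2.54 cm).
heights_cm = [h * 2.54 for h in heights_in]

r2_inches = r_squared(heights_in, weights_lb)
r2_cm = r_squared(heights_cm, weights_lb)

print(f"r^2 using inches:      {r2_inches:.4f}")
print(f"r^2 using centimeters: {r2_cm:.4f}")
```

The two printed values agree up to floating-point rounding, and both fall between 0 and 1.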

For which one of these relationships could we use a regression analysis? Only one choice is correct. a. Relationship between weight and height. b. Relationship between political party membership and opinion about abortion. c. Relationship between gender and whether person has a tattoo. d. Relationship between eye color (blue, brown, etc.) and hair color (blond, etc.).

A

Which of the following correlation values indicates the strongest linear relationship between two quantitative variables? a. r = −0.65 b. r = −0.30 c. r = 0.00 d. r = 0.50

A

A correlation of zero between two quantitative variables means that A. we have done something wrong in our calculation of r. B. there is no association between the two variables. C. there is no linear association between the two variables. D. re-expressing the data will guarantee a linear association between the two variables. E. None of the above.

C

The value of a correlation is reported by a researcher to be r = −0.5. Which of the following statements is correct? A. The x-variable explains 50% of the variability in the y-variable. B. The x-variable explains −50% of the variability in the y-variable. C. The x-variable explains 25% of the variability in the y-variable. D. The x-variable explains −25% of the variability in the y-variable.

C

Which of the following sets of variables is most likely to have a negative association? A. the number of bedrooms and the number of bathrooms in a house B. the number of rooms in a house and the time it takes to vacuum the house C. the age of a house and the cleanliness of the carpets inside D. the size of a house and its selling price

C

Which of the following sets of variables is most likely to have a negative association? A. the height of the son and the height of the father B. the age of the wife and the age of the husband C. the age of the mother and the number of children in the family D. the age of the mother and the ability to have children

D

What is the best way to start observing the relationship between two quantitative variables?

A scatterplot

what is the correlation coefficient?

a numerical measure of the direction & strength of a linear association

what is a residual plot?

A scatterplot of the residuals against the explanatory variable. For an adequate linear model, the plot should:
- Stretch horizontally
- Show about the same amount of scatter throughout
- Have no bends
- Have no outliers

Correlation

a statistic that measures the strength and direction of a linear relationship between two quantitative variables.

In ^y = a + bx, what do a and b represent?

a: y-intercept; b: slope

What does analyzing residuals do?

Assesses the adequacy of a model and identifies outliers

Which of the following is a deterministic relationship? a. The relationship between hair color and eye color. b. The relationship between father's height and son's height. c. The relationship between height in inches and height in centimeters. d. The relationship between height as determined with a ruler and height as determined by a tape measure.

c

What is the straight enough condition?

Correlation measures the strength of an association only if the form is straight enough to be treated as a linear relationship

Regression equation

describes the average relationship between a quantitative response and explanatory variable.

what are the 4 things to look for in a scatterplot?

direction, form, strength, unusual features

what is the symbol for residual?

e

what does b0 represent?

estimated intercept

what does b1 represent?

estimated slope

Which variable goes on the horizontal axis?

explanatory

what does x represent?

explanatory variable

What does it mean if the correlation coefficient between 2 quantitative variables is positive?

High values on one variable are associated with high values on the other

What should you look for to determine the strength of the relationship in a scatterplot?

How closely the points follow the trend, and whether there are outliers

influential point

A point that, if omitted from the data, results in a very different regression model

what does slope tell?

Indicates how much the predicted value of y changes when x increases by 1 unit
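
The slope interpretation can be illustrated with a small least-squares fit. This is a sketch on made-up data; `fit_line` and `predict` are hypothetical helpers, and the formulas used (slope = sum of cross-deviations divided by sum of squared x-deviations, intercept = mean of y minus slope times mean of x) are the standard least-squares ones rather than anything stated on the cards.

```python
def mean(values):
    return sum(values) / len(values)

def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line ^y = b0 + b1*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx             # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated intercept
    return b0, b1

def predict(b0, b1, x):
    return b0 + b1 * x

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
change = predict(b0, b1, 4) - predict(b0, b1, 3)

print(f"intercept b0 = {b0:.1f}, slope b1 = {b1:.1f}")
print(f"predicted y changes by {change:.1f} when x goes from 3 to 4")
```

For this data the fitted line is ^y = 2.2 + 0.6x, so each 1-unit increase in x raises the predicted y by 0.6.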

4 types of scatterplot trends

linear, curved, clusters, no pattern

what does it mean if a scatterplot has an exotic form?

nonlinear with sharp points

What are 2 types of unusual features of a scatterplot?

outliers & subgroups

What would the graph of the equation ^y = b0 + b1x look like?

A straight line with intercept b0 and slope b1 (positive linear when b1 > 0)

What are the 3 types of relationships that can be determined from a scatterplot?

positive, negative, curvilinear

3 types of scatterplot directions

positive, negative, no direction

high leverage points

- Pull the line close to them
- Have a large effect on the fit
- May determine the slope and y-intercept
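
A sketch of how one high-leverage point can pull the line: the data and the extra point are made up, and `slope` is a hypothetical helper that returns the least-squares slope. The added point has an x-value far from the mean of x, which is what gives it leverage.

```python
def mean(values):
    return sum(values) / len(values)

def slope(xs, ys):
    """Least-squares slope of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data with a mild positive trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Hypothetical high-leverage point: x = 20 is far from the mean of x.
xs_with = xs + [20]
ys_with = ys + [0]

slope_without = slope(xs, ys)
slope_with = slope(xs_with, ys_with)

print(f"slope without the point: {slope_without:.2f}")
print(f"slope with the point:    {slope_with:.2f}")
```

Here the single point changes the sign of the slope, so it is also an influential point: omitting it gives a very different regression model.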

what is the equation for correlation coefficient?

r = Σ (xi - mean of x)(yi - mean of y) / ((n - 1) Sx Sy), where the sum runs over all n data points
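
The formula can be turned into a short calculation. This is a minimal sketch on made-up data; the function names are hypothetical, and the standard deviations use the n - 1 divisor so that they match the (n - 1) in the denominator.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """r = sum of (xi - mean x)(yi - mean y), divided by (n - 1) * Sx * Sy."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / ((n - 1) * s_x * s_y)

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(f"r = {r:.3f}")  # prints r = 0.775
```

For this data the sum of cross-deviations is 6, Sx^2 = 10/4 and Sy^2 = 6/4, so r = 6 / sqrt(60), about 0.775.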

what is the symbol for the coefficient of determination?

r^2

what are the 3 ways to study the relationship between 2 quantitative variables?

scatterplot, correlation, regression

what do Sx and Sy stand for?

standard deviation of X and Y

What does a correlation of r=0.0 between 2 variables mean?

the best straight line through the data is horizontal
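
A correlation of zero rules out a linear association, not every association. The sketch below uses made-up points that follow y = x^2 exactly, with hypothetical helpers written for this example; the relationship is perfectly predictable, yet r is 0 and the least-squares line is horizontal.

```python
def mean(values):
    return sum(values) / len(values)

def correlation_and_slope(xs, ys):
    """Return (r, b1): the correlation and the least-squares slope."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5, sxy / sxx

# Made-up data: a symmetric curve, y = x^2.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r, b1 = correlation_and_slope(xs, ys)
print(f"r = {r:.2f}, slope = {b1:.2f}")
```

Both values come out as zero even though y is completely determined by x, which is why a scatterplot should be checked before relying on r.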

residual

the difference between an OBSERVED value and the PREDICTED value (residual = observed y - predicted y)

what does ^y represent?

the predicted response

negative residual

the predicted value overestimates the actual data value

positive residual

the predicted value underestimates the actual data value
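
Residuals and their signs can be sketched as follows. The data are made up, `fit_line` is a hypothetical helper using the standard least-squares formulas, and each residual is computed as observed y minus predicted y.

```python
def mean(values):
    return sum(values) / len(values)

def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, residuals):
    kind = "underestimate" if e > 0 else "overestimate"
    print(f"x = {x}: observed {y}, predicted {b0 + b1 * x:.1f}, "
          f"residual {e:+.1f} ({kind})")
```

A positive residual means the point lies above the line (the prediction was too low), and a negative residual means it lies below. For a least-squares line with an intercept, the residuals sum to zero up to rounding.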

what does the coefficient of determination measure?

the proportion of variation in the response variable that is explained by the independent (explanatory) variable

extrapolation

the use of a regression line for prediction outside the range of observed x-values; such predictions can't be trusted

interpolation

the use of a regression line for prediction within the range of observed x-values
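
One way to keep the two cases apart in practice is to check each new x against the observed range before predicting. This is a minimal sketch with made-up data; `prediction_type` is a hypothetical helper, not a standard function.

```python
def prediction_type(x_new, observed_xs):
    """Label a prediction by whether x_new lies inside the observed x range."""
    if min(observed_xs) <= x_new <= max(observed_xs):
        return "interpolation"
    return "extrapolation"

# Made-up x-values that a regression line was fitted to.
observed_xs = [1, 2, 3, 4, 5]

for x_new in (3.5, 50):
    print(f"x = {x_new}: {prediction_type(x_new, observed_xs)}")
```

The regression equation will return a number for x = 50 just as readily as for x = 3.5, so the range check has to be done separately.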

what does the regression line predict?

the value for the response variable (y) as a straight-line function of the value x of the explanatory variable

Two variables have a positive relationship when

the values of one increase as the other increases

What is a scatterplot?

two-dimensional graph of data values

lurking variable

a variable, usually unobserved, that influences the association between the variables of primary interest

confounding

when 2 explanatory variables are both associated with a response variable and with each other

what is the general equation of a regression line?

y= a + bx + error
