chapter 14

correlation coefficient

The correlation coefficient computed from the sample data measures the *strength and direction* of a linear relationship between two variables.
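
As a rough illustration of "strength and direction", the sketch below computes the sample correlation coefficient r from paired data. The data values and the function name `correlation` are hypothetical, chosen only for this example.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample (Pearson) correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical sample data, not taken from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)  # sign gives direction, magnitude gives strength
```

A value of r near +1 or -1 suggests a strong linear relationship; a value near 0 suggests a weak one.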

If the assumption that the variance of ε is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then

The residual plot should give an overall impression of a horizontal band of points

regression line/ line of best fit

We want to determine the equation of the regression line, which is the line of best fit. Best fit means that the sum of the squared vertical distances from each point to the line is at a minimum.
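
A minimal sketch of finding the line of best fit, assuming the usual closed-form least squares formulas b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2) and b0 = ybar - b1 * xbar. The data and the name `least_squares_fit` are hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) minimizing the sum of squared vertical distances."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # y intercept
    return b0, b1

# Hypothetical sample data, not taken from the chapter.
b0, b1 = least_squares_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

For this made-up sample the fitted line is approximately yhat = 2.2 + 0.6x.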

confidence interval for B1 equation

b1 ± t_(alpha/2) * s_b1, where b1 = point estimator and t_(alpha/2) * s_b1 = margin of error
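
The interval can be sketched in code as follows. This assumes the standard error of the slope is s_b1 = s / sqrt(sum((x - xbar)^2)), a formula not spelled out in this set, and it takes the critical t value as an argument (read from a t table) rather than computing it. The data and function name are hypothetical.

```python
from math import sqrt

def slope_confidence_interval(xs, ys, t_crit):
    """Return (low, high) for b1 +/- t_crit * s_b1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))   # standard error of the estimate
    s_b1 = s / sqrt(s_xx)     # assumed formula for the standard error of b1
    margin = t_crit * s_b1    # margin of error
    return b1 - margin, b1 + margin

# Hypothetical data; 3.182 is the t table value for alpha/2 = 0.025
# with n - 2 = 3 degrees of freedom.
low, high = slope_confidence_interval([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], t_crit=3.182)
contains_zero = low <= 0 <= high
```

Here the interval contains 0, so this made-up sample would not lead to rejecting H0: B1 = 0 at the 0.05 level.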

regression analysis is appropriate when the

dependent variable is continuous

coefficient of determination (r^2) measures

goodness of fit of the estimated regression equation.

good patterns of residual plots

horizontal band is desired

If the assumptions about the error term ε appear questionable,

1. hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid 2. the residuals provide the best information about epsilon

B1 = 0

if B1 = 0, we can conclude that the mean value of y does not depend on the value of x, i.e. *x and y are not linearly related*

test of significance holy grail *t test*

if Ho (the null hypothesis) is rejected, a *statistically significant relationship exists between the 2 variables*; if Ho is not rejected, there is insufficient evidence to state that a statistically significant relationship exists between the 2 variables
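
A minimal sketch of the t test for the slope, assuming the test statistic t = b1 / s_b1 with n - 2 degrees of freedom (the formula itself is not stated in this set). The data, function name, and the table value 3.182 are illustrative assumptions.

```python
from math import sqrt

def slope_t_statistic(xs, ys):
    """Return t = b1 / s_b1 for testing Ho: B1 = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_b1 = sqrt(sse / (n - 2)) / sqrt(s_xx)
    return b1 / s_b1

# Hypothetical data; reject Ho when |t| exceeds the table value
# t_(alpha/2) for n - 2 degrees of freedom (3.182 for alpha = 0.05, 3 df).
t = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
reject_h0 = abs(t) > 3.182
```

For this sample |t| is below the critical value, so there is insufficient evidence of a significant relationship.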

simple linear regression model

y = Bo + B1x + E, where Bo and B1 = parameters of the model and E = epsilon = random variable referred to as the error term

estimated simple linear regression equation

yhat = bo + b1x, where yhat = point estimator of E(y), the estimated value of y for a given x value; bo = y intercept of the line; b1 = slope of the line

coefficient of determination

provides a measure of the goodness of fit for the estimated regression equation

rejection rule for confidence interval for B1

reject Ho if 0 is not included in the confidence interval for B1

MSE (equation)

s^2 = MSE = SSE / (n - 2)
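
A short sketch of the calculation, given coefficients b0 and b1 that have already been estimated. The data and coefficient values are hypothetical; the divisor n - 2 reflects the two estimated parameters.

```python
from math import sqrt

def mean_square_error(xs, ys, b0, b1):
    """Return MSE = SSE / (n - 2) for the fitted line yhat = b0 + b1 * x."""
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return sse / (len(xs) - 2)

# Hypothetical data with its least squares coefficients (b0 = 2.2, b1 = 0.6).
mse = mean_square_error([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], b0=2.2, b1=0.6)
s = sqrt(mse)  # standard error of the estimate
```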

regression analysis

statistical procedure used to develop an equation showing how the 2 variables are related

SSR

sum of squares due to regression = measures how much the yhat values on the estimated regression line deviate from ybar

SST

tells you how much variation there is in the dependent variable

Least squares model (theory)

the best fitting line for the observed data is calculated by minimizing the sum of squared residuals (residual = yi - yhati), i.e. the squared differences between observed and predicted values

simple linear regression model is

the equation that describes how y is related to x and an error term

SST

total sum of squares

(least squares method (theory) )

uses sample data to provide the values of bo and b1 that minimize the "sum of squares of the deviations" between the observed values of dependent variable yi and the predicted values of dependent variable yhat i

independent variable

variable used to predict the value of the dependent variable

sigma^2 =

variance of E, epsilon, in the regression model

To test for a significant regression relationship,

we must conduct a hypothesis test to *determine whether the value of β1 is zero.*

least squares method (equation) aka SSE

min sum of (yi - yhati)^2, where yi = observed value of the dependent variable and yhati = estimated value of the dependent variable

epsilon/ error term

The error term accounts for the variability in y that cannot be explained by the linear relationship between x and y.

regression equation for simple linear regression

E(y) = Bo + B1x. The graph of the simple linear regression equation is a straight line: Bo = y intercept of the regression line, B1 = slope, E(y) = mean/expected value of y for a given value of x

Coefficient of determination

Used to evaluate goodness of fit for the estimated regression equation; takes values between 0 and 1 and is often expressed as a percent

SSE

The value of SSE is a measure of the error in using the estimated regression equation to predict the values of the dependent variable in the sample.

coefficient of determination explanation

ranges from 0 to 1 = *percent of total sum of squares that can be explained* by the regression line (i.e. indicates a strong or weak relationship)

regression equation

The equation that describes how the expected (or mean) value of y, denoted E(y), is related to x is called the _____

Confidence interval for B1

1. *if Ho is rejected, the hypothesized value of B1 is not included in the confidence interval for B1* 2. to reject Ho, 0 cannot be included in the confidence interval

correlation coefficient cont'd

1. Correlation coefficient is restricted to linear relationship between two variables. 2. Coefficient of determination can be used for nonlinear relationship and for relationships that have two or more independent variables.

assumptions about E(epsilon) in the regression model

1. The error epsilon is a random variable with a mean of 0 2. The variance of epsilon, denoted by sigma^2, is the same for all values of the independent variable 3. The values of epsilon are independent 4. The error epsilon is a normally distributed random variable

simple linear regression

1. The goal of linear regression is to describe the relationship between two variables as a straight line. 2. To achieve this, linear regression attempts to model the relationship by fitting a linear equation (a straight line) to observed data.

interpolation

1. a linear regression model is built on sample data covering a specific range 2. interpolate within that range; don't extrapolate beyond it
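
One way to enforce this rule in practice is to refuse predictions outside the sampled range of x. The sketch below is only an illustration; the function name `predict` and its parameters are hypothetical.

```python
def predict(b0, b1, x, x_min, x_max):
    """Return yhat = b0 + b1 * x, but only inside the sampled x range."""
    if not (x_min <= x <= x_max):
        raise ValueError(f"x={x} is outside the sampled range [{x_min}, {x_max}]")
    return b0 + b1 * x

# Hypothetical model fitted on x values from 1 to 5.
y_hat = predict(2.2, 0.6, 3, x_min=1, x_max=5)   # fine: 3 lies within [1, 5]
# predict(2.2, 0.6, 10, x_min=1, x_max=5) would raise ValueError (extrapolation)
```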

cautions of interpretation of significance tests

1. rejecting Ho: B1 = 0 and concluding that x and y have a significant relationship DOES NOT establish that a cause-and-effect relationship is present between x and y 2. just because you can reject Ho: B1 = 0 does not enable us to conclude that the relationship between x and y is LINEAR

MSE

= mean square error == s^2 == provides estimate of sigma^2 (variance of E)

standard error of the estimate

= point estimate of sigma = s

residual observation of i

yi - yhati. In other words, the ith residual is the error resulting from using the estimated regression equation to predict the value of the dependent variable

question answered by coefficient of determination

Question = How well does the estimated regression equation fit the data?

relationship between all sum of squares

SST = SSR + SSE
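
The identity can be checked numerically. The sketch below fits a line by least squares to hypothetical data, computes the three sums of squares, and derives the coefficient of determination as r^2 = SSR / SST; all names and data values are assumptions made for illustration.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSE) for the least squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    b0 = y_bar - b1 * x_bar
    y_hats = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                   # total variation in y
    ssr = sum((yh - y_bar) ** 2 for yh in y_hats)             # explained by the line
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))     # left unexplained
    return sst, ssr, sse

# Hypothetical sample data, not taken from the chapter.
sst, ssr, sse = sums_of_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_squared = ssr / sst  # coefficient of determination
```

For this sample, 60% of the total sum of squares is explained by the estimated regression line.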
