Linear Regression

Independent variables

Variables being used to predict the value of the dependent variable

Least squares method

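- A procedure for using sample data to find the estimated regression equation. It chooses $b_0$ and $b_1$ to minimize the sum of squared differences between the observed and predicted values of the dependent variable:
$$\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \min \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = predicted value of the dependent variable for the ith observation
$x_i$ = value of the independent variable for the ith observation
$n$ = total number of observations (the number of terms in the sum)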
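
As a minimal sketch of the minimization above: for simple linear regression, the least squares criterion has a standard closed-form solution, b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and b0 = ȳ − b1·x̄. These formulas are not stated on the card, and the function name and data below are hypothetical, chosen only for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: forces the fitted line through the point (x_bar, y_bar).
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear_regression(x, y)
```

Because the hypothetical points fall exactly on a line, the fit recovers that line's intercept and slope; with real sample data the residuals would not all be zero.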

Estimated regression equation

- The parameter values are usually not known and must be estimated using sample data.
- Sample statistics denoted $b_0$ and $b_1$ are computed as estimates of the population parameters $\beta_0$ and $\beta_1$.
- Substituting the values of the sample statistics $b_0$ and $b_1$ for $\beta_0$ and $\beta_1$ in the regression equation and dropping the error term, we obtain the estimated regression equation for simple linear regression:
$$\hat{y} = b_0 + b_1 x$$
$\hat{y}$ = estimate of the mean value of y corresponding to a given value of x
$b_0$ = estimated y-intercept
$b_1$ = estimated slope
- The graph of the estimated simple linear regression equation is called the estimated regression line.
- In general, $\hat{y}$ is the point estimator of $E(y \mid x)$, the mean value of y for a given value of x.

Simple linear regression

A regression analysis for which any one unit change in the independent variable, x, is assumed to result in the same change in the dependent variable, y

Interpreting least squares method

Estimated slope: $b_1 = 0.0678$
Estimated y-intercept: $b_0 = 1.2739$
The estimated simple linear regression equation, with x the driving distance in miles and $\hat{y}$ the travel time in hours:
$$\hat{y}_i = 1.2739 + 0.0678 x_i$$
- Interpretation of $b_1$: If the length of a driving assignment were 1 unit (1 mile) longer, the mean travel time for that driving assignment would be 0.0678 units (0.0678 hours, or approximately 4 minutes) longer.
- Interpretation of $b_0$: If the driving distance for a driving assignment were 0 units (0 miles), the mean travel time would be 1.2739 units (1.2739 hours, or approximately 76 minutes).
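
The slope interpretation can be checked with a little arithmetic. The sketch below uses the two coefficients from this card; the distances of 100 and 101 miles and the function name are hypothetical, picked only to show that one extra mile adds 0.0678 hours to the estimated mean travel time.

```python
B0 = 1.2739  # estimated y-intercept, in hours (value from the card)
B1 = 0.0678  # estimated slope, in hours per mile (value from the card)


def predict_travel_time(miles):
    """Estimated mean travel time in hours for a given driving distance."""
    return B0 + B1 * miles


# Hypothetical distances one mile apart.
t_100 = predict_travel_time(100)
t_101 = predict_travel_time(101)

# The difference between the two predictions equals the slope,
# converted here from hours to minutes.
extra_minutes = (t_101 - t_100) * 60
```

Here `t_100` is 1.2739 + 6.78 = 8.0539 hours, and `extra_minutes` is 0.0678 × 60 = 4.068, matching the "approximately 4 minutes" reading of the slope.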

Extrapolation

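Prediction of the value of the dependent variable outside the experimental region. It is risky because the sample data provide no evidence that the estimated relationship holds outside that region.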

Assessing the Fit of the Simple Linear Regression Model

Total sum of squares (SST): the sum of squared deviations obtained by using the sample mean $\bar{y}$ to predict each value of the dependent variable in the sample:
$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
Butler Trucking example: the sample mean $\bar{y} = 6.7$ is used to predict the travel time in hours for each driving assignment in the sample. For the ith driving assignment, the difference $y_i - \bar{y}$ provides a measure of the error involved in using $\bar{y}$ to predict travel time for that assignment.
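
This quantity is commonly called the total sum of squares (SST). A minimal sketch of the computation follows; the three travel times are hypothetical and are not the Butler Trucking sample.

```python
def total_sum_of_squares(y):
    """Sum of squared deviations of each y_i from the sample mean."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)


# Hypothetical travel times in hours; their mean is 7.0.
y = [5.0, 7.0, 9.0]
sst = total_sum_of_squares(y)
```

For this hypothetical sample the deviations from the mean are −2, 0, and 2, so the sum of their squares is 8.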

The sums of squares

Sum of squares due to error (SSE): The value of SSE is a measure of the error in using the estimated regression equation to predict the values of the dependent variable in the sample.
$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
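
A short sketch of the SSE calculation for a given line. The data and the coefficients b0 = 0 and b1 = 2 are hypothetical; they are not a least squares fit, just a line chosen so the residuals are easy to read.

```python
def sum_of_squared_errors(x, y, b0, b1):
    """Sum of squared residuals y_i - (b0 + b1 * x_i) over the sample."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical sample and hypothetical coefficients.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
sse = sum_of_squared_errors(x, y, b0=0.0, b1=2.0)
```

The predicted values are 2, 4, and 6, so the residuals are 0, 0, and 1, and the SSE is 1.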

Experimental region

The range of values of the independent variables in the data used to estimate the model. The regression model is valid only over this region.
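
One simple way to guard against extrapolation with a single independent variable is to check a new x value against the range of x values in the sample. The helper name and the sample distances below are hypothetical, and with several independent variables the experimental region is more involved than a per-variable range check.

```python
def in_experimental_region(x_new, x_sample):
    """True if x_new lies within the range of x values used to fit the model."""
    return min(x_sample) <= x_new <= max(x_sample)


# Hypothetical driving distances (miles) used to estimate a model.
x_sample = [50, 60, 75, 90, 100]

inside = in_experimental_region(80, x_sample)    # within 50..100
outside = in_experimental_region(150, x_sample)  # would be extrapolation
```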

Dependent variable

Variable being predicted

Multiple Linear Regression

Regression analysis involving two or more independent variables

Simple linear regression model

$$y = \beta_0 + \beta_1 x + \varepsilon$$
$\beta_0$, $\beta_1$ = parameters (population characteristics)
$\varepsilon$ = random variable called the error term (accounts for the variability in y that cannot be explained by the linear relationship between x and y)
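
To see the role of the error term, the model can be simulated. This sketch assumes the error is normally distributed with mean 0 and standard deviation `sigma`; the card only says the error is a random variable, so that distribution, the parameter values, and the function name are all assumptions made for illustration.

```python
import random


def simulate_observations(beta0, beta1, xs, sigma, seed=0):
    """Draw y = beta0 + beta1 * x + error for each x in xs.

    Assumption: the error is normal with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# With noise: the points scatter around the line y = 1 + 2x.
ys = simulate_observations(beta0=1.0, beta1=2.0, xs=xs, sigma=0.5)

# With sigma = 0 the error term vanishes and every point lies on the line.
noiseless = simulate_observations(beta0=1.0, beta1=2.0, xs=xs, sigma=0.0)
```

Comparing `ys` with `noiseless` shows the variability in y that the linear relationship alone does not explain.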

