SCM 200
simple linear regression
analysis where you attempt to find the line that best estimates the relationship between two variables
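As a minimal sketch of what "finding the best line" means, the snippet below computes the least-squares intercept b0 and slope b1 for a small made-up data set. The function name `fit_line` and the numbers are hypothetical and only for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx               # slope
    b0 = y_bar - b1 * x_bar      # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(f"y_hat = {b0:.1f} + {b1:.1f} * x")
```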
r=0
no linear relationship in the sample
the sampling distribution of x̄1 - x̄2 is approximately normal for large samples
True
Se=0
all data points fall on the regression line
we use the slope of the regression equation to determine how strong the relationship is between y and x
false
seasonality
(a type of cycle) a special case of cycles that recurs in a regular fashion each year. It is common to use dummy variables in the regression model to deal with seasonality.
Expected Value of the error term in regression?
0
What is the value that is expected for the error term in regression
0
When dealing with multiple regression models, adjusted R square will usually be larger than the unadjusted r square
False. Adjusted R2 is never larger than R2 (R2 adj ≤ R2), and it is usually smaller.
in a linear regression, the units of Se are the units of the independent predictor variable
False. Se is an absolute measurement, always in the units of the dependent variable y.
The slope of a linear regression equation is an example of a correlation coefficient? T/f
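False. The slope has the same sign as the correlation coefficient, but it is measured in units of y per unit of x and is not limited to the range -1 to +1.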
The ___ the R Square, and the ___ the standard error of the estimate (Se), the stronger the relationship between the dependent variable and the independent variable.
Higher, lower
When you are dealing with a multiple regression model, an adjusted R square value can never be greater than the corresponding unadjusted R square value.
True
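The relationship between the two measures can be sketched with the usual adjustment formula, R2 adj = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The function name and the example values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictor variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical model: R2 = 0.80 with 30 observations and 3 predictors
adj = adjusted_r2(0.80, n=30, k=3)
print(round(adj, 4))

# The two values coincide only in edge cases such as a perfect fit
perfect = adjusted_r2(1.0, n=30, k=3)
```

Because (n - 1)/(n - k - 1) is greater than 1 whenever k is at least 1, the adjusted value cannot exceed the unadjusted one.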
What component of time series analysis consists of erratic and unsystematic fluctuation in time series data?
Random variation
As the value of R 2 approaches 1, the relationship between x and y becomes stronger
True
The error term in a regression model describes the effects of all factors other than the independent variables on y.
True
The error term is defined as the difference between the original data value (y) and the estimated value of y (y hat)
True
When the slope of the linear regression equation is equal to zero, the correlation coefficient must be equal to zero.
True
correlation coefficient
a number between -1 and +1 calculated to represent the linear dependence of two variables
correlation coefficient
a number between -1 and +1 calculated to represent the linear dependence of two variables. ρ = population correlation coefficient; r = sample correlation coefficient
paired t-test
a one-sample t-test conducted on the n differences. We want to know whether the mean difference in the population is different from zero.
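A minimal sketch of the calculation, assuming hypothetical before/after measurements on five subjects: the test statistic is the mean of the differences divided by its standard error, with n - 1 degrees of freedom. The function name `paired_t` is illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    d = [a - b for a, b in zip(after, before)]   # the n differences
    n = len(d)
    t = mean(d) / (stdev(d) / sqrt(n))           # one-sample t on the differences
    return t, n - 1

# Hypothetical paired measurements
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 15]
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```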
intercept coefficient (sample)
b₀ best estimate for β₀
slope coefficient (sample)
b₁ best estimate for β₁
EXCEL: STDEV.S
calculates the standard deviation from sample data; use this when you have sample data rather than the whole population. =STDEV.S(number1, [number2], ...)
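For comparison outside Excel, Python's standard library offers the same sample calculation (dividing by n - 1) as `statistics.stdev`, while `statistics.pstdev` divides by n for a full population. The data values below are made up.

```python
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample

sample_sd = stdev(data)        # divides by n - 1, like STDEV.S
population_sd = pstdev(data)   # divides by n, shown only for contrast

print(round(sample_sd, 3), round(population_sd, 3))
```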
when unadjusted R2 gets a new term added it
can increase or stay the same
when adjusted R2 gets a new term added it
can increase, decrease or stay the same.
cycles
changes due to economic conditions of prosperity and recession with a duration of more than one year
A student performed a simple linear regression. The p-value associated with a test of the slope coefficient was equal to 0.088. What is the correct conclusion at a 0.05 level of significance?
fail to reject the null hypothesis: there is not enough evidence to conclude that the true slope for the population differs from zero
type 1 error
concluding that the alternative hypothesis is correct when the null is actually correct
type 2 error
failing to reject the null when the alternative is actually correct
Strength of the linear relationship between the dependent and independent variable in a regression?
correlation coefficient
What measures the strength of the linear relationship between the dependent and independent variable?
correlation coefficient
Paired data
data that have been observed in natural pairs. The same measurement is taken twice on each person, under different conditions or at different times of day. Researchers are trying to understand the differences, not the original observations.
null hypothesis, independent-samples t-test for difference of population means
the difference between the two population means is 0, meaning the two populations have the same mean
In a regression equation, the slope b1 is always positive
false
if x and y are correlated in a regression, we can conclude that x causes y
false
the y-intercept b0 in a regression equation can never be less than 0.
false
when dealing with a simple linear regression model, the coefficient of determination indicates the strength of the relationship between the independent and the dependent variable, and it also shows whether the relationship is positive or negative.
false
When dealing with multiple regression models, adjusted R Square will usually be equal to the unadjusted R Square
false. R^2adj ≤ R^2, and it is usually smaller
a correlation of +0.75 is stronger than -0.91
false. Strength is judged by absolute value, and |-0.91| is greater than |+0.75|
it is never possible for regression coefficients to be less than -1
false. b0 and b1 can be less than -1
The coefficient of determination R2 can take any value between -1 and 1
false. between 0 and 1
even though it isn't common, it is possible to have a data set where the slope of the linear regression equation is positive and at the same time the correlation coefficient is negative.
false. The slope must have the same sign as the correlation coefficient
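The reason the signs always agree can be sketched in a few lines: the slope b1 = Sxy/Sxx and the correlation r = Sxy/sqrt(Sxx * Syy) share the same numerator, and both denominators are positive. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def slope_and_r(x, y):
    """Return (b1, r); both are built from the same cross-product sum Sxy."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx               # sign comes from Sxy
    r = sxy / sqrt(sxx * syy)    # sign also comes from Sxy
    return b1, r

# Hypothetical data with a downward pattern
x = [1, 2, 3, 4, 5]
y = [9, 7, 8, 4, 3]
b1, r = slope_and_r(x, y)
print(round(b1, 2), round(r, 3))
```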
A multiple linear regression has multiple response variables
false. It has a single response variable y and multiple predictor variables.
Advantage of paired hypothesis for difference of two population means
gives opportunity to get rid of unwanted variation
trend
growth or decline in the mean of the forecasted variable y over time. Regression analysis is used, and time is always the independent variable x.
What happens to the value of R2 when a new term is added to a regression model?
increases or remains the same
blocking
isolating the effect of background variables in order to reduce random variation
When using simple linear regression, we use confidence intervals for the ___ and prediction intervals for the ___ at a given level of x
mean y values, individual y-value
coefficient of determination
measures the percentage of variation in the values of the dependent variable y that can be explained by the change in the independent variable x. ρ² = population coefficient of determination; R² = sample coefficient of determination. Ranges from 0 to +1.
standard error of the estimate
measures the residual standard deviation. Used to construct prediction intervals and to test hypotheses concerning the usefulness of a regression line.
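A minimal sketch for simple linear regression, assuming the usual formula Se = sqrt(SSE / (n - 2)), where SSE is the sum of squared residuals. The helper name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def standard_error_of_estimate(x, y):
    """Se = sqrt(SSE / (n - 2)) for a simple linear regression of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual = observed y minus fitted y_hat
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return sqrt(sse / (n - 2))

# Hypothetical sample data; Se comes out in the units of y
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
se = standard_error_of_estimate(x, y)
print(round(se, 4))
```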
population correlation coefficient
ρ
sample statistic
p
r=-1
perfect inverse relationship in the sample
differences in two population means
populations may be represented by two categories of a categorical variable, such as male and female, or they may be two hypothetical populations that could be represented by different treatment groups in an experiment. When it is not possible to block and use the paired t-test, run the t-test for independent samples.
regression models
provide insight into relationships: used for testing and interpreting the slope. Estimation: estimate the average value of y for a given level of x. Prediction: predict the next value of y for a given value of x.
sample correlation coefficient
r
disadvantage of paired hypothesis test for the difference of two population means
reduction in degrees of freedom, compared to using a two sample t-test on independent samples
factors that affect probability of making a type 2 error
Sample size: increasing the sample size will reduce the probability of a type 2 error. Level of significance: increasing alpha will reduce the probability of a type 2 error. Actual value of the population parameter: as the population parameter moves away from the null value, the probability of a type 2 error decreases.
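These three effects can be checked numerically. The sketch below assumes a simplified setting that is not part of the cards: an upper-tailed z-test of H0: mu = mu0 with a known population standard deviation. The function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type2_error(mu0, mu_true, sigma, n, alpha):
    """Probability of a type 2 error for an upper-tailed z-test of H0: mu = mu0.

    Assumes sigma is known and the true mean is mu_true (> mu0).
    """
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this cutoff
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Type 2 error: the sample mean falls below the cutoff although mu = mu_true
    return NormalDist(mu_true, se).cdf(cutoff)

base = type2_error(mu0=100, mu_true=103, sigma=10, n=25, alpha=0.05)
bigger_n = type2_error(mu0=100, mu_true=103, sigma=10, n=100, alpha=0.05)
bigger_alpha = type2_error(mu0=100, mu_true=103, sigma=10, n=25, alpha=0.10)
farther_mean = type2_error(mu0=100, mu_true=106, sigma=10, n=25, alpha=0.05)

print(round(base, 3), round(bigger_n, 3), round(bigger_alpha, 3), round(farther_mean, 3))
```

Each of the three changes lowers the probability relative to the base case.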
What component of time series analysis refers to the fluctuations associated with climate, holidays, and related activities?
seasonality
When Standard Error of the estimate Se is equal to zero what is true?
the coefficient of determination R2 is equal to one.
What is true when Se=0
The correlation coefficient r is equal to -1 or +1, depending on the slope of the regression equation. The coefficient of determination (R^2) is equal to 1. The linear regression model explains all of the variation in the sample data.
independent variable (x) sample
the same value as the value used in the theoretical regression model because we know the value of the independent variable
slope coefficient population
the true slope of regression equation β₁
dependent variable (ŷ) sample
the value of the dependent variable predicted by the sample regression equation
Se=Sy
there is too much error so the regression equation doesn't help
If R2=1, the linear regression model explains all of the variation in the sample data
true
If you were forecasting the trend in sales from time series data with a simple regression equation, the independent variable would be time.
true
It is possible to incorporate qualitative variables into a regression model T/f
true
R2 can also be referred to as the coefficient of determination
true
We use the sign of the slope in a regression equation to indicate if the relationship between x and y is positive or negative T/f
true
When the correlation coefficient r is equal to -1.0, the standard error of the estimate Se must equal zero
true
When the value of the population correlation coefficient is -0.5, there is a strong inverse relationship in the population; when the value of the population correlation coefficient is 0, there is not a linear relationship in the population
true
a multiple linear regression has multiple predictor variables.
true
dummy variables are a way to incorporate qualitative data into a linear regression model
true
if you fail to reject the null hypothesis that the population slope is equal to zero, you can conclude that the X variable is not a statistically significant predictor.
true
the variable b1 (slope) represents the estimated average change in y associated with a one-unit increase in x when all other factors are the same
true
when dealing with time series analysis, observations must be taken at regular intervals over time
true
standard error
true value of π
independent variable (x) population
variable that determines the value of the dependent variable; the predictor variable
dependent variable (y) population
variable you are trying to predict; the response variable
scatterplot
vertical axis is the y axis (dependent variable); horizontal axis is the x axis (independent variable)
dummy variable
when dealing with seasonality, the value of the dummy variable is one if the observation falls in that season and zero if it falls in a different season
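The 0/1 coding can be sketched as follows. Dropping one season as a baseline (so that four seasons need only three dummy variables) is a common convention assumed here, not something the cards state; the names `SEASONS` and `season_dummies` are hypothetical.

```python
SEASONS = ["winter", "spring", "summer", "fall"]

def season_dummies(season, baseline="winter"):
    """Return 0/1 dummy variables for one observation's season.

    The baseline season gets no dummy of its own (assumed convention):
    it is represented by all dummies being 0.
    """
    return {s: int(s == season) for s in SEASONS if s != baseline}

summer_row = season_dummies("summer")
winter_row = season_dummies("winter")
print(summer_row)
print(winter_row)
```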
When using a regression model for forecasting the next value of y, the prediction interval will be
wider than the confidence interval for the mean
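The difference in width can be sketched by comparing the two standard errors at a chosen x0. Both intervals use the same t multiplier (n - 2 degrees of freedom), which is omitted here, so the larger standard error gives the wider interval. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def interval_standard_errors(x, y, x0):
    """Return (se_mean, se_pred) at x0 for a simple linear regression.

    se_mean feeds the confidence interval for the mean of y at x0;
    se_pred feeds the prediction interval for an individual y at x0.
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = sqrt(sse / (n - 2))                       # standard error of the estimate
    spread = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = se * sqrt(spread)
    se_pred = se * sqrt(1 + spread)                # extra 1 accounts for individual error
    return se_mean, se_pred

# Hypothetical sample data, evaluated at x0 = 3
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
se_mean, se_pred = interval_standard_errors(x, y, x0=3)
print(round(se_mean, 4), round(se_pred, 4))
```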
Test statistic
z or t
intercept coefficient population
β₀ The true intercept of the regression equation
population parameter
π
null value (population proportion)
π₀