Categorical and dummy variables
Example of dummy variables: suppose there are three levels of weight category: normal, overweight and obese. If normal is the base category, what would the two dummy variables be called, and which categories would be coded 0 and 1 for each of them?
- Dummy variable 1: normal_overweight, where normal = 0, overweight = 1 and obese = 0
- Dummy variable 2: normal_obese, where normal = 0, overweight = 0 and obese = 1
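A minimal sketch of this coding in Python, assuming the categories are stored as plain strings. The function name make_dummies is hypothetical and used only for illustration:

```python
# Hypothetical helper: code a weight category as two 0/1 dummy variables,
# with "normal" as the base category (0 on both dummies).
def make_dummies(category):
    if category not in ("normal", "overweight", "obese"):
        raise ValueError(f"unknown category: {category}")
    return {
        "normal_overweight": 1 if category == "overweight" else 0,
        "normal_obese": 1 if category == "obese" else 0,
    }

for cat in ("normal", "overweight", "obese"):
    print(cat, make_dummies(cat))
```

Note that the base category needs no dummy of its own: it is identified by a 0 on both variables.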
When coding a categorical variable, what should the control (base) category be labelled as?
0
Why can you not enter categories into a multiple linear regression as 0, 1, 2 and 3 (etc)?
Because that would assume the categories are equally spaced, so that each one-step increase in category changes the outcome by the same amount (a linear relationship across the categories)
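The problem can be seen with a small Python sketch. The SBP values below are made up for illustration, with one point per category standing in for each group mean; a straight line is fitted through the integer codes:

```python
# Made-up group mean SBP values (mmHg), one point per category for simplicity.
codes = [0, 1, 2]                 # normal, overweight, obese entered as integers
mean_sbp = [120.0, 128.0, 150.0]  # illustrative values only

# Ordinary least-squares line through the three points.
x_bar = sum(codes) / len(codes)
y_bar = sum(mean_sbp) / len(mean_sbp)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(codes, mean_sbp)) / sum(
    (x - x_bar) ** 2 for x in codes
)
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in codes]

# The fitted steps are forced to be equal, although the real steps are 8 and 22.
print("fitted:", [round(v, 2) for v in fitted])
print("fitted steps:", round(fitted[1] - fitted[0], 2), round(fitted[2] - fitted[1], 2))
```

With a single slope, the model cannot reproduce unequal gaps between categories; separate 0/1 variables give each non-base category its own coefficient.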
Entering categorical variable data into a regression is mathematically the same as doing what?
Conducting an unpaired t-test (when the variable has two categories)
If we have three levels of weight category (normal, overweight and obese) and are assessing their potential influence on SBP, with normal as the base category, what would be the equation for this relationship?
SBP = b0 (constant) + (b1 x normal_overweight) + (b2 x normal_obese)
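As a rough illustration, the additive form of this model, in which each dummy's coefficient is added to the constant, can be written as a small Python function. The coefficient values are hypothetical and chosen only to show how the terms combine:

```python
# Hypothetical coefficients, chosen only for illustration (mmHg).
B0 = 120.0  # constant: mean SBP in the base (normal) category
B1 = 8.0    # overweight minus normal
B2 = 30.0   # obese minus normal

def predict_sbp(normal_overweight, normal_obese):
    # SBP = b0 + (b1 x normal_overweight) + (b2 x normal_obese)
    return B0 + B1 * normal_overweight + B2 * normal_obese

print("normal:    ", predict_sbp(0, 0))
print("overweight:", predict_sbp(1, 0))
print("obese:     ", predict_sbp(0, 1))
```

Because at most one dummy equals 1 for any person, the prediction is the constant alone for the base category, or the constant plus exactly one coefficient otherwise.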
The slope coefficient output by a multiple linear regression for a dummy variable represents what?
The difference between the mean of that category and the mean of the base category
The slope coefficient output by a multiple regression for a categorical variable is mathematically equal to what?
The mean difference given by an unpaired t-test
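This equality can be checked numerically. The sketch below uses made-up SBP readings for two groups and fits an ordinary least-squares line to a single 0/1 dummy variable, then compares the slope with the plain difference in group means:

```python
# Made-up SBP readings (mmHg) for two groups; illustrative values only.
normal = [118.0, 122.0, 120.0, 116.0]
obese = [148.0, 152.0, 150.0, 146.0, 154.0]

# Dummy variable: 0 for the base category (normal), 1 for obese.
x = [0] * len(normal) + [1] * len(obese)
y = normal + obese

# Ordinary least-squares slope and intercept for a single predictor.
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

mean_difference = sum(obese) / len(obese) - sum(normal) / len(normal)
print("slope:", round(slope, 6))
print("mean difference:", round(mean_difference, 6))
print("intercept:", round(intercept, 6))
```

The slope matches the difference in means, and the intercept matches the mean of the base category, which is why the base category is coded 0.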
When are dummy variables used?
When entering categorical data in which there are more than 2 categories