Categorical Covariates and Interaction Terms (Optional Module)

If a covariate to be “corrected for” is categorical, it is represented by dummy variables, one for each category. Each dummy variable takes the value “1” if the observation belongs to its corresponding category and “0” otherwise. When the regression is performed, the last category’s dummy variable is normally dropped: the full set of dummy variables sums to one for every observation, so keeping all of them alongside an intercept term would make the regression matrix rank-deficient.
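The dummy coding described above can be sketched in a few lines of Python. This is only an illustration: the function name dummy_code, the example covariate site, and the choice to order categories alphabetically are all assumptions made for the sketch, not part of the module itself.

```python
def dummy_code(values, drop_last=True):
    """Return (kept_categories, rows) of 0/1 indicators for a categorical covariate.

    Categories are ordered alphabetically (an assumption of this sketch);
    with drop_last=True the last category's indicator is omitted.
    """
    categories = sorted(set(values))
    kept = categories[:-1] if drop_last else categories
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


# Hypothetical categorical covariate with three categories.
site = ["A", "B", "C", "A", "C"]

kept, rows = dummy_code(site)
# kept == ["A", "B"]
# rows == [[1, 0], [0, 1], [0, 0], [1, 0], [0, 0]]

# Without dropping, every row sums to 1, which duplicates an intercept column.
_, full_rows = dummy_code(site, drop_last=False)
row_sums = [sum(r) for r in full_rows]
# row_sums == [1, 1, 1, 1, 1]
```

An observation in the dropped category (here "C") is coded as all zeros, so that category acts as the reference level against which the others are compared.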

It is sometimes desirable to “correct for” first-order interactions between non-genetic covariates. A first-order interaction term is a “new” covariate created from the product of two non-genetic covariates. If one of the two covariates is categorical, each of its dummy variables may be multiplied by the other covariate, giving one first-order interaction term per dummy variable. If both covariates are categorical, each dummy variable of one may be multiplied by each dummy variable of the other.
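As a rough sketch of how such product terms might be built, the following Python forms the elementwise product of every column of one covariate with every column of the other. The helper interaction_columns and the example covariates (age, site, sex) are hypothetical, and the dummy columns are assumed to have already had their last category dropped.

```python
def interaction_columns(cols_a, cols_b):
    """Elementwise products of each column in cols_a with each column in cols_b.

    A quantitative covariate is passed as a single column; a categorical
    covariate is passed as its list of dummy-variable columns.
    """
    return [[a * b for a, b in zip(col_a, col_b)]
            for col_a in cols_a
            for col_b in cols_b]


# Hypothetical data for five observations.
age = [30.0, 45.0, 52.0, 61.0, 38.0]   # quantitative covariate
site_dummies = [[1, 0, 0, 1, 0],       # indicator for site A
                [0, 1, 0, 0, 1]]       # indicator for site B (site C dropped)
sex_dummies = [[1, 1, 0, 0, 1]]        # indicator for one sex (other dropped)

# Quantitative x categorical: one interaction term per dummy variable.
age_by_site = interaction_columns([age], site_dummies)
# [[30.0, 0.0, 0.0, 61.0, 0.0], [0.0, 45.0, 0.0, 0.0, 38.0]]

# Categorical x categorical: one term per pair of dummy variables.
site_by_sex = interaction_columns(site_dummies, sex_dummies)
# [[1, 0, 0, 0, 0], [0, 1, 0, 0, 1]]
```

Each resulting column would then be added to the regression as an additional covariate next to the original ones.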