(taken from GSOC proposal)
...
contrast_interact: This codes interaction terms. In a dataframe with columns ‘a’ and ‘b’, ‘a:b’ is an interaction term. Again, we need to code this term to produce some number of variables, but the coding here is somewhat different. I’ll explain how to code ‘a:b’ with an example, from which the general behavior follows. Let’s say column ‘a’ has m categories and ‘b’ has n categories. If ‘a’ has been mentioned in our regression expression, then we code column ‘b’ with n-1 variables; similarly, if ‘b’ has been mentioned in the regression expression, then we code column ‘a’ with m-1 variables. If ‘a’ hasn’t been mentioned in our regression expression, then ‘b’ is coded with n variables; similarly, if ‘b’ hasn’t been mentioned, then ‘a’ is coded with m variables.
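The two-column rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the proposed contrast_interact implementation: the function name `two_way_coding_widths` and its boolean parameters are hypothetical, and the final product step assumes the usual convention that the interaction columns are the pairwise products of the two coded blocks.

```python
def two_way_coding_widths(m, n, a_in_formula, b_in_formula):
    """Return how many variables code 'a' and 'b' inside the term 'a:b'.

    m, n         -- number of categories in columns 'a' and 'b'
    a_in_formula -- True if 'a' is mentioned in the regression expression
    b_in_formula -- True if 'b' is mentioned in the regression expression
    (hypothetical helper, named here for illustration only)
    """
    # 'a' loses one variable only when 'b' has been mentioned.
    width_a = m - 1 if b_in_formula else m
    # 'b' loses one variable only when 'a' has been mentioned.
    width_b = n - 1 if a_in_formula else n
    return width_a, width_b


# Example: 'a' has 3 categories, 'b' has 4, and both are mentioned.
width_a, width_b = two_way_coding_widths(3, 4, True, True)
# Assumption: the interaction contributes one column per pair of coded variables.
print(width_a, width_b, width_a * width_b)
```

Note that each column's width depends on whether the other column is mentioned, which is why the two flags are crossed inside the function.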
Here’s a general rule to follow when an interaction involves more than two columns. Say we have ‘a:b:c’ and we need to decide whether to code ‘a’ w