Question to the community (not directly related to...
# ask-questions
s
Question to the community (not directly related to GrowthBook): Can I use multi-factorial ANOVA to test for the presence of an interaction effect in my A/B tests? Say I have two tests running on the same page with 2 variations each (Control and Treatment), which leaves us with 4 possible combinations of variants, if I’m not mistaken. Is it even reasonable to perform ANOVA for detecting interaction in this setting? Would love to hear an opinion on that.
f
Hi Yevhen - this kind of interaction effect detection is on our roadmap. As for the specific technique, I’ll have to let our data science team weigh in.
@helpful-application-7107 ^
h
Yes, but it depends on what you look at in the ANOVA context. ANOVA and regression are identical in many ways, so I can easily point out how to do this with a regression. In a regression context you want to run the following regression:
y = b0 + b1 * t1 + b2 * t2 + b3 * t1 * t2
where t1 is 0/1 for a/b in A/B test 1 and t2 is 0/1 for a/b in A/B test 2. Then if b3 is significantly different from 0, there is evidence of an interaction effect. There's an analog in ANOVA, but I find that people describe ANOVA models in different ways, which makes communicating about them harder than communicating about regression models.
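A minimal sketch of the regression above, assuming a continuous outcome and made-up data. Because t1 and t2 are both 0/1, the model has one parameter per combination, so the OLS coefficients can be read straight off the four cell means. The function name `interaction_from_cells`, the data, and the choice of an unequal-variance standard error with a normal approximation are illustrative assumptions, not something stated in the thread.

```python
import math
import statistics


def interaction_from_cells(cells):
    """cells maps (t1, t2) -> list of outcomes for users in that combination."""
    mean = {k: statistics.fmean(v) for k, v in cells.items()}
    # Variance of each cell mean, s^2 / n (cells may have unequal variances).
    var_of_mean = {k: statistics.variance(v) / len(v) for k, v in cells.items()}

    # With two 0/1 factors plus their product, OLS reproduces the cell means.
    b0 = mean[(0, 0)]
    b1 = mean[(1, 0)] - mean[(0, 0)]
    b2 = mean[(0, 1)] - mean[(0, 0)]
    b3 = mean[(1, 1)] - mean[(1, 0)] - mean[(0, 1)] + mean[(0, 0)]

    # b3 is a +/- combination of four independent means, so variances add.
    se_b3 = math.sqrt(sum(var_of_mean.values()))
    z = b3 / se_b3
    # Two-sided p-value from the normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return {"b0": b0, "b1": b1, "b2": b2, "b3": b3, "se_b3": se_b3, "z": z, "p": p}


# Made-up outcomes, keyed by (t1, t2). Far too small for a real test.
cells = {
    (0, 0): [10.0, 12.0, 11.0, 9.0],
    (1, 0): [12.0, 14.0, 13.0, 11.0],
    (0, 1): [11.0, 13.0, 12.0, 10.0],
    (1, 1): [16.0, 18.0, 17.0, 15.0],
}
result = interaction_from_cells(cells)
print(result)
```

With realistic sample sizes the normal approximation is usually adequate; for small samples a t distribution would be the more careful choice.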
s
Hi! I meant multi-factor ANOVA, which specifically tests for interaction, as described here: https://onlinestatbook.com/2/analysis_of_variance/multiway.html I heard that the same results can be obtained through regression though. And I imagine the same approach can be used not only for testing for an interaction effect between different experiments on the same page (for instance), but also for testing for the presence of novelty and primacy effects? For example, I could segment my users by new vs. returning and see if the effect of my change is the same for both new and returning users (by testing these four combinations)
old_control, new_control, old_treatment, new_treatment
Sorry if something isn’t entirely clear. I’m still relatively new to statistics and just trying to find my way around.
h
"I heard that same results can be obtained through regression though."
Yes, I'm just stating it in regression format because I find it easier to communicate about given my background. Multi-factor ANOVA will be identical to regression in this setting.
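One way to see the equivalence on a toy example: in a balanced 2x2 design, the ANOVA F statistic for the interaction equals the square of the regression t statistic for b3. The sketch below assumes equal cell sizes and the classical pooled-variance standard error; the function name and the data are made up for illustration.

```python
import statistics


def interaction_f_and_t_squared(cells):
    """cells maps (a, b) in {0,1}x{0,1} -> equally sized lists of outcomes."""
    n = len(cells[(0, 0)])
    assert all(len(v) == n for v in cells.values()), "balanced design assumed"
    mean = {k: statistics.fmean(v) for k, v in cells.items()}

    # Pooled within-cell variance (the ANOVA mean squared error), df = 4n - 4.
    sse = sum((y - mean[k]) ** 2 for k, v in cells.items() for y in v)
    mse = sse / (4 * n - 4)

    # ANOVA route: interaction sum of squares with 1 degree of freedom.
    grand = statistics.fmean(list(mean.values()))
    row = {a: (mean[(a, 0)] + mean[(a, 1)]) / 2 for a in (0, 1)}
    col = {b: (mean[(0, b)] + mean[(1, b)]) / 2 for b in (0, 1)}
    ss_inter = n * sum(
        (mean[(a, b)] - row[a] - col[b] + grand) ** 2
        for a in (0, 1)
        for b in (0, 1)
    )
    f_stat = ss_inter / mse

    # Regression route: b3 and its classical (equal-variance) standard error.
    b3 = mean[(1, 1)] - mean[(1, 0)] - mean[(0, 1)] + mean[(0, 0)]
    t_stat = b3 / (mse * 4 / n) ** 0.5
    return f_stat, t_stat ** 2


# Made-up balanced data, three observations per cell.
cells = {
    (0, 0): [1.0, 2.0, 3.0],
    (1, 0): [2.0, 4.0, 6.0],
    (0, 1): [2.0, 3.0, 4.0],
    (1, 1): [5.0, 8.0, 8.0],
}
f_stat, t_squared = interaction_f_and_t_squared(cells)
print(f_stat, t_squared)  # the two values agree up to floating-point error
```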
"but also for testing the presence of novelty and primacy effects? In case I decide to segment my users by new vs. returning and see if the effect of my change is the same for both new and returning users (by testing these four combinations)"
Generally yes, this is the rough approach to estimating "heterogeneous treatment effects" where you look at treatment effects within dimension slices, and then specifically run a test for whether the effect in group A is different from the effect in group B.
s
Hi @helpful-application-7107. Just need some clarification regarding using regression for estimating heterogeneous treatment effects. I got familiar with estimating them using linear regression (just the way you showed earlier), but now I’m stuck on doing the same for testing proportions (the classic linear equation seems to be appropriate mostly for continuous outcome variables). After a bit of research, I figured that what I need is binomial logistic regression with two categorical predictors and their interaction. So, basically, the equation for estimating the proportion would look like the one in the image: a novelty effect would give us a negative interaction coefficient, and a primacy effect would give us a positive interaction coefficient. Am I on the right track here?
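The image is not part of this log, so as an assumption, take the standard form log(p / (1 - p)) = b0 + b1 * treatment + b2 * new_user + b3 * treatment * new_user. With two 0/1 predictors and their product, this model is saturated, so its maximum-likelihood fit reproduces the observed cell proportions and b3 can be computed directly from the four log-odds. The sketch below uses made-up counts and a hypothetical function name; the standard error is the usual Wald formula for a contrast of log-odds.

```python
import math


def logit_interaction(counts):
    """counts maps (treatment, new_user) -> (conversions, users)."""

    def log_odds(key):
        conv, n = counts[key]
        return math.log(conv / (n - conv))

    # Interaction on the log-odds scale: log of the ratio of two odds ratios.
    b3 = (
        log_odds((1, 1)) - log_odds((1, 0))
        - log_odds((0, 1)) + log_odds((0, 0))
    )
    # Wald standard error: sum of 1/successes + 1/failures over all cells.
    # This breaks down if any cell has zero conversions or zero non-conversions.
    se = math.sqrt(sum(1 / conv + 1 / (n - conv) for conv, n in counts.values()))
    z = b3 / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return b3, se, z, p


# Made-up counts: (treatment, new_user) -> (conversions, users).
counts = {
    (0, 0): (100, 1000),  # control, returning users
    (1, 0): (110, 1000),  # treatment, returning users
    (0, 1): (100, 1000),  # control, new users
    (1, 1): (150, 1000),  # treatment, new users
}
b3, se, z, p = logit_interaction(counts)
print(b3, se, z, p)
```

The sign of b3 depends on which group is coded as 1, so it is worth fixing the coding before interpreting a positive or negative coefficient.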
h
Yes. However, using a linear model is fine most of the time, is much easier to use, and for experiments will work well if you're only using it to look within the range of your covariates. Economists regularly prefer the "linear probability model" (i.e. standard OLS on the 0/1 outcome) for experiment effects and heterogeneous effects because it is simple, and for any reasonable model (or if your covariates are all categorical) your predictions will all still fall within [0, 1].
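A minimal sketch of the linear probability model for the new vs. returning case, assuming made-up counts. With treatment, segment, and their interaction all coded 0/1, the OLS fitted values are just the four observed conversion rates, which is why they stay within [0, 1]. The interaction coefficient is then the difference between the two within-segment effects. The function name and the unpooled binomial standard error are illustrative choices.

```python
import math


def lpm_interaction(counts):
    """counts maps (treatment, new_user) -> (conversions, users)."""
    rate = {k: conv / n for k, (conv, n) in counts.items()}
    # Binomial variance of each observed rate.
    var = {k: rate[k] * (1 - rate[k]) / n for k, (conv, n) in counts.items()}

    effect_returning = rate[(1, 0)] - rate[(0, 0)]
    effect_new = rate[(1, 1)] - rate[(0, 1)]
    # Interaction coefficient: how much the effect differs between segments.
    b3 = effect_new - effect_returning

    se = math.sqrt(sum(var.values()))
    z = b3 / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return {
        "fitted": rate,  # fitted probabilities of the saturated model
        "effect_returning": effect_returning,
        "effect_new": effect_new,
        "b3": b3,
        "se": se,
        "z": z,
        "p": p,
    }


# Made-up counts: (treatment, new_user) -> (conversions, users).
counts = {
    (0, 0): (100, 1000),
    (1, 0): (110, 1000),
    (0, 1): (100, 1000),
    (1, 1): (150, 1000),
}
lpm = lpm_interaction(counts)
print(lpm)
```

Here b3 is measured in percentage points of conversion rather than log-odds, which tends to be easier to communicate.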