Hi everyone. First off, GrowthBook is an extremely impressive system. I previously worked on a high-scale proprietary experimentation platform, and GB looks like a fantastic, credible implementation. My question is: to what extent is GB run-aware?
Let's say I start an experiment at 4/1/2023 00:00:00 and keep it running for exactly 3 days. We find that the results are highly skewed towards the control, and on analysis this turns out to be caused by a bug in the experiment. So the experiment is stopped, the bug is fixed, and the experiment is restarted at 4/10/2023 00:00:00. I now have two runs: 4/1/2023 -> 4/4/2023, and 4/10/2023 -> present.
What is the default behavior going to be when calculating results? I want to make sure that only data from the current run is analyzed, because the results from the original run were biased by the faulty implementation. From my initial reading, runs and phases seem to be essentially synonymous, and I would have to manually delete a phase if it was affected by a faulty implementation. Is this correct? And is there a way to force the system to work only with data collected during the current phase? Typically the only reasons I have for stopping and restarting an experiment are bugs, bad data collection, or bias in traffic allocation: essentially, reasons that invalidate the data collected during earlier phases / runs.
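To make the behavior I'm after concrete, here is a minimal sketch in Python. The event records, the `RUNS` list, and `in_current_run` are all hypothetical and only illustrate the filtering I'd expect; this is not GrowthBook's data model or API.

```python
from datetime import datetime

# Hypothetical run windows taken from the example above.
# An end of None means the run is still in progress.
RUNS = [
    (datetime(2023, 4, 1), datetime(2023, 4, 4)),
    (datetime(2023, 4, 10), None),
]

def in_current_run(ts, runs=RUNS):
    """Return True if the timestamp falls inside the most recent run."""
    start, end = runs[-1]
    return ts >= start and (end is None or ts < end)

# Hypothetical exposure events; u1 was exposed during the buggy first run.
events = [
    {"user": "u1", "ts": datetime(2023, 4, 2, 12)},
    {"user": "u2", "ts": datetime(2023, 4, 10, 9)},
    {"user": "u3", "ts": datetime(2023, 4, 11, 7)},
]

# Only events from the current run should reach the analysis.
current = [e for e in events if in_current_run(e["ts"])]
print([e["user"] for e in current])  # ['u2', 'u3']
```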
04/11/2023, 7:25 AM
Hi Peter, Thanks!
We have two ways to deal with that. One is to set the ‘phase’ of the experiment so that it ignores events from the first trial run. The other is to change the tracking key when you restart the experiment, which rebuckets users with a new hash key.
We are working on tighter integration between the experiment phases and the feature flags, but at the moment this step is manual.
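To illustrate why a new tracking key rebuckets users, here is a rough sketch in Python. It is not GrowthBook's actual hashing code: SHA-256 stands in for the real hash function, and `assign` and the key names `checkout-test` / `checkout-test-v2` are made up for the example.

```python
import hashlib

def assign(user_id, tracking_key, variations=("control", "treatment")):
    """Deterministically map a user to a variation for a given tracking key.

    Assumption: SHA-256 is used purely for illustration, not because the
    real SDK uses it.
    """
    digest = hashlib.sha256(f"{tracking_key}:{user_id}".encode()).hexdigest()
    # Turn the first 32 bits of the digest into a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return variations[int(bucket * len(variations))]

users = [f"user-{i}" for i in range(1000)]

# The same user and the same key always give the same variation.
stable = all(assign(u, "checkout-test") == assign(u, "checkout-test") for u in users)

# With a new key the hash input changes, so assignments are redrawn
# independently of the earlier run.
moved = sum(assign(u, "checkout-test") != assign(u, "checkout-test-v2") for u in users)

print(stable, moved)
```

Because the key is part of the hash input, a user's variation in the restarted experiment does not depend on what they saw in the first run, and exposures logged under the old key no longer match the experiment being analyzed.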