Hello 👋 Are there any plans to implement ROPE (Region of Practical Equivalence) on top of the "minimum threshold" in metrics to avoid early stopping of an experiment?
01/26/2022, 4:09 PM
How are you envisioning that would work? Would you want the "chance to beat baseline" to be configurable so you can define what "beating" means (e.g. at least 1% better)?
01/26/2022, 4:27 PM
I see it as something we could layer on top of expected loss in the stopping criteria. With small sample sizes / early in an experiment, noise can produce low expected loss and a high win rate, but the ROPE decision rule would still say "keep testing"
01/26/2022, 4:39 PM
Currently the results already have too many numbers for non-data-scientist users to understand. We're exploring splitting results into different views, one with just the basic top-level data for non-technical stakeholders, and another with all the raw details for the data team. I think when we do that we can look at adding more things like this.
01/26/2022, 4:42 PM
The main concern I have at the moment (which comes up with our PMs and non-analyst users) is that they monitor the test very frequently and call it as soon as win rate & expected loss hit the "threshold"
So I guess my question is: how can we make it more robust against early stopping or peeking? One approach we've learned about is applying ROPE on top of the existing criteria (which could maybe be implemented as internal logic, so it doesn't mean more numbers for the end user?)
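For reference, the ROPE decision rule discussed above could be sketched roughly like this. This is only an illustrative sketch, not the product's actual implementation: the function name, the ±1% ROPE bounds, and the use of posterior samples of the metric difference are all assumptions. It follows the standard HDI-plus-ROPE rule: stop only when the highest-density interval of the difference falls entirely inside or entirely outside the region of practical equivalence; otherwise keep testing.

```python
import numpy as np

def rope_decision(diff_samples, rope=(-0.01, 0.01), hdi_prob=0.95):
    """ROPE decision rule on posterior samples of the metric difference.

    rope: (low, high) bounds of the region of practical equivalence,
          e.g. +/-1% lift -- a hypothetical default for illustration.
    Returns "stop: practically equivalent" if the HDI lies entirely
    inside the ROPE, "stop: meaningful difference" if it lies entirely
    outside, and "keep testing" otherwise.
    """
    samples = np.sort(np.asarray(diff_samples))
    n = len(samples)
    width = int(np.ceil(hdi_prob * n))
    # HDI estimated as the narrowest interval containing hdi_prob of the samples
    starts = samples[: n - width + 1]
    ends = samples[width - 1:]
    i = int(np.argmin(ends - starts))
    lo, hi = starts[i], ends[i]
    if rope[0] <= lo and hi <= rope[1]:
        return "stop: practically equivalent"
    if hi < rope[0] or lo > rope[1]:
        return "stop: meaningful difference"
    return "keep testing"
```

Because an early, noisy posterior is usually wide, its HDI tends to overlap the ROPE and the rule returns "keep testing" even when win rate and expected loss already look good, which is exactly the peeking protection described above. The decision could be surfaced as a single status string rather than extra numbers.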