rhythmic-napkin-630990 (9/13/2023, 9:21 AM)
brief-honey-456100 (9/13/2023, 2:29 PM)
rhythmic-napkin-630990 (9/13/2023, 4:37 PM)
brief-honey-456100 (9/13/2023, 7:32 PM)
The platform is very intuitive and easy to use, so I would like to hand the tool to business people for self-service live experiment monitoring. What do you think about the peeking problem in this situation?
I read in your white paper that there is no fixed horizon with the Bayesian approach; is that also true with uninformative priors? What about Robinson's post?

Peeking is a concern in both the frequentist and Bayesian engines. We are working on a blog article about peeking, our position on it, and solutions to it. While peeking is usually discussed in the context of frequentist testing, because frequentist testing offers devices like p-values that claim to control false positives, using a Bayesian engine to make shipping decisions as soon as the results cross some threshold can also cause more bad decisions than you might otherwise realize. "More bad decisions than you might otherwise realize" is just a softer way of saying you will have a higher false positive rate, and this applies to the Bayesian engine even though Bayesian statistics don't claim to give you control over the false positive rate. To this end, we strongly agree with the position taken in David Robinson's article linked above.

So what are the solutions?
• Build a culture that respects the weight of evidence before making decisions (though this is hard to do at scale, and at some point someone has to decide).
• Use GrowthBook's "minimum sample size" setting to ensure at least X conversions are reached before statistics are returned.
• Use Sequential Testing in the frequentist engine, which costs some power but guarantees that no matter how often you peek, you will only get a false positive on 5% of tests (if your p-value threshold is 0.05).
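To make the peeking point concrete, here is a minimal simulation sketch (not GrowthBook code; the A/A setup and peek cadence are illustrative assumptions) showing that repeatedly testing as data arrives, and stopping at the first "significant" result, inflates the false positive rate well above the nominal 5%:

```python
# Minimal A/A simulation: both arms have the SAME conversion rate, so any
# rejection is a false positive. Peeking repeatedly inflates that rate.
import math
import random

def z_test_p(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test p-value (normal approximation)."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def run_experiment(peek_every=50, max_n=1000, rate=0.1, alpha=0.05):
    """Returns True if ANY peek 'rejects' -- a false positive, since A == B."""
    a = b = 0
    for i in range(1, max_n + 1):
        a += random.random() < rate
        b += random.random() < rate
        if i % peek_every == 0 and z_test_p(a, i, b, i) < alpha:
            return True
    return False

random.seed(1)
trials = 500
fp_peeking = sum(run_experiment() for _ in range(trials)) / trials
fp_fixed = sum(run_experiment(peek_every=1000) for _ in range(trials)) / trials
print(f"false positive rate, peeking every 50 users:   {fp_peeking:.3f}")
print(f"false positive rate, single fixed-horizon look: {fp_fixed:.3f}")
```

The fixed-horizon run stays near the nominal 5%, while the peeking run rejects far more often, despite there being no true effect in either case.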
What are the differences between the Bayesian approach with uninformative priors and the frequentist approach?

There are many differences, so this is hard to answer briefly, but the asker may be (correctly) hinting at the fact that in practice the Bayesian engine with uninformative priors tends to produce results similar to the frequentist engine. That said, the frequentist engine has some additional tools (CUPED, Sequential Testing) that the Bayesian engine doesn't, while the Bayesian engine returns more intuitive quantities like "Chance to Win" rather than more opaque values like p-values.
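A quick sketch of why the two engines tend to agree under uninformative priors (this is an illustrative Monte Carlo calculation, not GrowthBook's implementation): with flat Beta(1, 1) priors on a conversion metric, the Bayesian "chance to win" closely tracks one minus the one-sided frequentist p-value.

```python
# Compare a flat-prior Bayesian "chance to win" against the frequentist
# one-sided result on the same (made-up) conversion data.
import math
import random

def chance_to_win(conv_a, n_a, conv_b, n_b, draws=100_000):
    """Monte Carlo estimate of P(rate_B > rate_A) under flat Beta(1, 1) priors."""
    wins = 0
    for _ in range(draws):
        rate_a = random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

def one_sided_p(conv_a, n_a, conv_b, n_b):
    """One-sided two-proportion z-test p-value for H1: rate_B > rate_A."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))

random.seed(7)
ctw = chance_to_win(100, 1000, 125, 1000)  # hypothetical data: 10% vs 12.5%
p = one_sided_p(100, 1000, 125, 1000)
print(f"Bayesian chance to win: {ctw:.3f}")
print(f"1 - one-sided p-value:  {1 - p:.3f}")
```

The two numbers land within a fraction of a percent of each other, which is the practical sense in which the engines "agree" when the prior carries no information.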
Is setting a risk threshold the only metric you suggest for ending an experiment?

Risk is only used in the Bayesian engine. It isn't a metric we require you to use, but it can be used in conjunction with your chance to win data. For example, suppose one metric has a 99% chance to win (with low risk), while a second metric has only a 50% chance to win, so you aren't sure whether the variation is better on it. If that second metric also has low risk, then maybe you don't need to wait for more data: even if the variation is actually a little worse on that metric, the amount it's worse is small and not worth extra weeks of experimentation time to figure out.
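To illustrate how a near coin-flip "chance to win" can coexist with low risk, here is a sketch that models risk as Bayesian expected relative loss under flat priors (an assumed definition for illustration, not GrowthBook source code):

```python
# "Risk" modeled as expected relative loss of shipping B: averaged over the
# posterior, how much do you lose in the draws where B is actually worse?
import random

def risk_of_shipping_b(conv_a, n_a, conv_b, n_b, draws=100_000):
    """Expected relative loss from choosing B, under flat Beta(1, 1) priors."""
    total_loss = 0.0
    for _ in range(draws):
        rate_a = random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        # loss is only incurred on draws where B is actually worse than A
        total_loss += max(rate_a - rate_b, 0) / rate_a
    return total_loss / draws

random.seed(3)
# Hypothetical data where the arms are nearly tied (chance to win ~50%):
risk = risk_of_shipping_b(500, 5000, 501, 5000)
print(f"risk of shipping B: {risk:.4f}")
```

Even though you can't tell which arm is better, the expected loss from shipping B is small, which is the sense in which "low risk" can justify a decision without waiting for more data.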
rhythmic-napkin-630990 (9/14/2023, 10:59 AM)
brief-honey-456100 (9/14/2023, 2:40 PM)
rhythmic-napkin-630990 (9/14/2023, 4:03 PM)
brief-honey-456100 (9/14/2023, 4:04 PM)