Something looks a bit odd with one of the metrics ...
# announcements
s
Something looks a bit odd with one of the metrics (the second one): the numbers do not align with Chance to Beat Control.
w
Thanks for reporting this. This does indeed look off. We will look into it and get back to you.
h
It looks like in this case we cannot compute the variance because of the log-approximation method that we use in our Bayesian engine. If you can share the statistics from your View Queries modal I can confirm this, but this is likely just bad UX on our side: we don't return an error when we fail to compute the variance properly. (It isn't a temporary error; it only resolves with new or slightly different data.)
s
I assume you meant these stats.
main_sum_squares looks over the top
h
Yeah, so this is a case where our log approximation is indeed breaking down: the variance around the control mean is so large that the mean could conceivably be negative. You may want to consider a capped version of this metric to help deal with some potentially very large outliers.
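To make the check above concrete, here is a minimal sketch of how the variance can be recovered from summary statistics and compared against the mean. The parameter names `count` and `main_sum` are assumptions for illustration (only `main_sum_squares` appears in this thread), and this is not GrowthBook's actual code. A small mean-to-standard-error ratio means zero or negative values are plausible for the mean, which is where a log-based approximation stops being reliable.

```python
import math

def sample_variance(count, main_sum, main_sum_squares):
    """Unbiased sample variance from count, sum, and sum of squares."""
    if count < 2:
        raise ValueError("need at least two observations")
    return (main_sum_squares - main_sum ** 2 / count) / (count - 1)

def mean_to_se_ratio(count, main_sum, main_sum_squares):
    """How many standard errors the sample mean sits away from zero."""
    mean = main_sum / count
    se = math.sqrt(sample_variance(count, main_sum, main_sum_squares) / count)
    if se == 0:
        return math.inf
    return mean / se

if __name__ == "__main__":
    # Hypothetical numbers: 4 users with values 1, 2, 3, 4.
    print(sample_variance(4, 10.0, 30.0))
    print(mean_to_se_ratio(4, 10.0, 30.0))
```

A very large `main_sum_squares` relative to `main_sum` pushes the variance up and the ratio down toward zero.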
s
I understand there’s a chance it may still stabilize. By the way, I’m currently trying to wrap my head around this log approximation thing. Are my assumptions correct:
• you do log approximation for continuous metrics in the Bayesian engine
• you do it for measuring relative uplift
• you do it to “normalize” distributions that can potentially be non-normal (as is probably the case with page load times and revenue per user)
h
The first two are correct. The third doesn't really apply: the normality assumption belongs to an earlier step of the process and concerns the variation averages (before we even talk about the relative uplift). Sample means tend towards normal distributions (this is the common invocation of the central limit theorem) even if the underlying distribution is pretty skewed. The log approximation is then used to compute the relative uplift, which is a ratio of these two underlying sample means.
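For intuition, a log approximation for a ratio of two sample means can be sketched with the generic delta method, as below. This is an illustration under stated assumptions (independent groups, approximately normal sample means, a flat prior), not necessarily the exact computation in the GrowthBook engine, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def log_ratio_estimate(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Delta-method estimate of log(mean_b / mean_a) and its variance."""
    if mean_a <= 0 or mean_b <= 0:
        # The log of the ratio is undefined for non-positive means.
        raise ValueError("both means must be positive")
    log_uplift = math.log(mean_b / mean_a)
    variance = var_a / (n_a * mean_a ** 2) + var_b / (n_b * mean_b ** 2)
    return log_uplift, variance

def chance_to_beat_control(log_uplift, variance):
    """P(log ratio > 0) under a normal approximation."""
    dist = NormalDist(mu=log_uplift, sigma=math.sqrt(variance))
    return 1.0 - dist.cdf(0.0)

if __name__ == "__main__":
    # Hypothetical inputs: control mean 10, variation mean 11.
    lu, var = log_ratio_estimate(10.0, 400.0, 1000, 11.0, 400.0, 1000)
    print(math.exp(lu) - 1.0, chance_to_beat_control(lu, var))
```

The variance terms divide by the squared means, so a control mean close to zero relative to its spread makes the approximation unstable.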
s
Thanks, got it. Our revenue per user is extremely skewed, ranging from 0 to 50 000 USD per user, and there are serious concerns that the CLT won’t apply in our case (I’ve run some simulations). So we are considering some log transformation and such, and I wondered if this was already being done in GrowthBook.
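A simulation of this kind can be sketched roughly as follows. The distribution here (90% zeros, otherwise lognormal, capped at 50 000) and all parameter values are assumptions chosen for illustration, not our real data. The idea is to draw many samples of a given size and look at how skewed the resulting sample means still are.

```python
import random

def draw_revenue(rng, p_zero=0.9, mu=3.0, sigma=2.0, cap=50_000.0):
    """One hypothetical user's revenue: mostly zero, otherwise lognormal."""
    if rng.random() < p_zero:
        return 0.0
    return min(rng.lognormvariate(mu, sigma), cap)

def sample_means(n_users, reps, seed=0):
    """Mean revenue per user for `reps` independent samples of `n_users`."""
    rng = random.Random(seed)
    return [
        sum(draw_revenue(rng) for _ in range(n_users)) / n_users
        for _ in range(reps)
    ]

def skewness(values):
    """Moment-based skewness; close to 0 for a roughly normal sample."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5

if __name__ == "__main__":
    for n_users in (100, 1_000, 5_000):
        means = sample_means(n_users, reps=100)
        print(n_users, round(skewness(means), 3))
```

If the skewness of the sample means stays far from zero at realistic sample sizes, the normal approximation for the mean is doubtful for that metric.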
h
Yeah, we aren't doing that, because it would change the underlying effect: the result would no longer be an estimate of the uplift in revenue per user. You could consider a capped version of the metric to estimate alongside the uncapped version, and that could be helpful for you!
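As a rough illustration of what capping does to a metric like this, the sketch below clips each user's value at a fixed ceiling and compares the mean and variance before and after. The cap of 1 000 and the sample values are made up, and `cap_values` is a hypothetical helper, not a GrowthBook function.

```python
from statistics import mean, variance

def cap_values(values, cap):
    """Clip every value at `cap`; values at or below the cap are unchanged."""
    return [min(v, cap) for v in values]

if __name__ == "__main__":
    # Hypothetical revenue per user with one extreme outlier.
    revenue = [0.0, 0.0, 10.0, 20.0, 50_000.0]
    capped = cap_values(revenue, cap=1_000.0)
    print("uncapped:", mean(revenue), variance(revenue))
    print("capped:  ", mean(capped), variance(capped))
```

The capped metric answers a slightly different question (uplift in revenue with outliers truncated), which is why it is worth tracking next to the uncapped one rather than instead of it.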