Hi. My understanding is that statistical significance for non-binomial metrics is calculated somewhat differently and uses different statistical approaches. Is that correct?
Both binomial and non-binomial metrics use the same basic statistical approach, just with different Bayesian priors. Binomial metrics use beta-binomial priors and everything else uses Gaussian priors.
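To make the difference between the two priors concrete, here is a minimal sketch of the textbook conjugate updates: a beta prior updated with binomial data, and a Gaussian prior on the mean updated with a sample mean. This is an illustration only, not necessarily how the tool computes its results. The function names, the flat `Beta(1, 1)` prior, and the very wide Gaussian prior are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Posterior mean of the conversion rate.
        return self.alpha / (self.alpha + self.beta)


def update_beta(successes, trials, prior_alpha=1.0, prior_beta=1.0):
    """Beta prior + binomial data -> beta posterior (hypothetical helper)."""
    return BetaPosterior(prior_alpha + successes,
                         prior_beta + (trials - successes))


def update_gaussian(sample_mean, sample_var, n,
                    prior_mean=0.0, prior_var=1e6):
    """Gaussian prior on the mean + sample mean -> Gaussian posterior.

    Treats the sample variance as known, which is an approximation
    that leans on the central limit theorem (hypothetical helper).
    """
    prior_precision = 1.0 / prior_var
    data_precision = n / sample_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


# Binomial metric: 30 conversions out of 100 users.
conversion = update_beta(successes=30, trials=100)

# Non-binomial metric: mean revenue of 10.0 with variance 4.0 over 100 users.
revenue_mean, revenue_var = update_gaussian(10.0, 4.0, 100)
```

With a prior this wide, the Gaussian posterior mean sits almost exactly on the sample mean, and the posterior variance is close to `sample_var / n`.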
breezy-crowd-53224
12/13/2021, 1:30 PM
Sorry to ‘hijack’ this question, but should we assume any particular distribution for continuous data? It shouldn't matter, if I’m correct?
future-teacher-7046
12/13/2021, 2:10 PM
With enough samples, the central limit theorem should apply. If your data is extremely skewed, we recommend adding a cap to the value (e.g. normal orders are $10, but you occasionally get a $1000 bulk order).
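As a rough illustration of what a cap does to the $10 / $1000 example, here is a small sketch. The `cap_values` helper and the cap of 50 are hypothetical choices for the example, not settings taken from the tool.

```python
from statistics import mean


def cap_values(values, cap):
    """Replace every value above `cap` with `cap` (hypothetical helper)."""
    return [min(v, cap) for v in values]


# 99 normal $10 orders plus one $1000 bulk order.
orders = [10.0] * 99 + [1000.0]
capped = cap_values(orders, cap=50.0)

print(mean(orders))  # 19.9
print(mean(capped))  # 10.4
```

A single bulk order nearly doubles the uncapped mean, while the capped mean stays close to the typical order value.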
breezy-crowd-53224
12/13/2021, 2:42 PM
OK, that’s helpful. Using a frequentist approach, I do notice quite a difference between the Mann-Whitney U test and a regular t-test. So the central limit theorem apparently isn't the complete story, or doesn’t apply somehow 😉
I was wondering how this is handled in Bayesian testing.
I presume capping should be done in our data / SQL, right?
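One way to see why a Mann-Whitney U test and a t-test can disagree on skewed data: the t-test compares means, while Mann-Whitney compares ranks. The sketch below uses made-up numbers and two hand-written helpers (`welch_t` and `mann_whitney_u` are hypothetical names, and only the test statistics are computed, not p-values).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic: difference in means over its standard error."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a)
                                      + variance(b) / len(b))


def mann_whitney_u(a, b):
    """U statistic for `a`: pairs where a's value beats b's (ties count 0.5)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)


# Made-up data: the variant is lower on 4 of 5 users, plus one huge outlier.
control = [10.0, 11.0, 12.0, 13.0, 14.0]
variant = [1.0, 2.0, 3.0, 4.0, 1000.0]

t_stat = welch_t(variant, control)         # positive: variant mean is higher
u_stat = mann_whitney_u(variant, control)  # 5.0 of 25 pairs: ranks are lower
```

Here the means say the variant is higher, while the ranks say it is lower in most pairwise comparisons, so the two tests are answering different questions. Capping the outlier pulls the mean-based answer back toward the rank-based one.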
future-teacher-7046
12/13/2021, 2:51 PM
You can set a capped value in the metric behavior settings. We're looking into adding more prior options for extremely skewed distributions