#announcements

some-planet-44104

07/17/2023, 10:07 AM
I’m also getting this error when trying to run the notebook I downloaded:

`cannot import name 'diff_for_daily_time_series' from 'gbstats.gbstats' (/Users/kralich/Library/jupyterlab-desktop/jlab_server/lib/python3.8/site-packages/gbstats/gbstats.py)`

I am already on the latest version of gbstats.

helpful-application-7107

07/17/2023, 3:36 PM
Oof, so sorry about this. This looks like a bug. One second.

OK, a bugfix is on the way, and new notebooks should not have this issue once it lands (I'll ping you here). For now, you should be able to safely delete `diff_for_daily_time_series` from the import statement in the top code cell, and it should work as expected.

some-planet-44104

07/18/2023, 9:24 AM
Thanks. It’s not urgent, so I’ll probably wait for the fix.

helpful-application-7107

07/18/2023, 6:31 PM
The fix should have landed now, so you will get it if you download a new notebook. Please let me know if you get a chance to check it out!

some-planet-44104

07/19/2023, 12:45 PM
Checked it. It seems to work without hiccups.
By the way, I’m now assessing the statistical model you use with synthetic data, running hundreds of tests through the model in the Jupyter notebook. What strikes me as odd is that even with small sample sizes of 500 or 1000 users, the credible interval holds exceptionally well, even though the original population distributions are extremely skewed.
I’d expect credible intervals to be less reliable with small samples taken from extremely skewed population distributions, but everything I’ve seen so far performs well and I can’t explain it.
I tried sampling from the distribution, and given the skew of the population and the sample size, the distribution of the sample mean has not converged to normal. Still, the credible intervals look reliable.
I’m surprised that this is the case.
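
The kind of check described above can be sketched as a small coverage simulation. This is only an illustration under stated assumptions, not the gbstats model and not the notebook's code: the lognormal population, the `simulate_coverage` helper, and the plain normal-approximation interval for a single mean are all stand-ins chosen for the example.

```python
import numpy as np


def simulate_coverage(n=500, n_sims=2000, sigma=1.0, z=1.96, seed=0):
    """Share of simulated samples whose ~95% normal-approximation interval
    for the mean contains the true population mean.

    Assumption: the skewed population is lognormal(mu=0, sigma); this is a
    hypothetical stand-in for the synthetic data mentioned in the thread.
    """
    rng = np.random.default_rng(seed)
    true_mean = np.exp(sigma**2 / 2)  # exact mean of lognormal(0, sigma)

    # One row per simulated experiment, n users per row.
    samples = rng.lognormal(mean=0.0, sigma=sigma, size=(n_sims, n))
    means = samples.mean(axis=1)
    std_errs = samples.std(axis=1, ddof=1) / np.sqrt(n)

    # An interval "covers" when the true mean lies within z standard errors.
    covered = np.abs(means - true_mean) <= z * std_errs
    return float(covered.mean())


coverage = simulate_coverage()
print(f"empirical coverage at n=500: {coverage:.3f}")
```

Raising `sigma` makes the population more skewed, so rerunning with larger values is one way to see how far coverage drifts from the nominal 95% at a given sample size.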

helpful-application-7107

07/19/2023, 4:57 PM
Beyond the fact that the CLT is pretty strong even at small sample sizes, the Bayesian engine explicitly models the data as beta distributed (for binomial metrics) or normally distributed (for count metrics). This is an assumption of our Bayesian engine (and of basically all A/B testing platforms that have a Bayesian engine).
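
To make the beta assumption for binomial metrics concrete, here is a minimal sketch of a beta posterior comparison between two variations. It is not the gbstats API: the `chance_b_beats_a` helper, the flat Beta(1, 1) prior, and the example counts are all assumptions made for illustration.

```python
import numpy as np


def chance_b_beats_a(conv_a, n_a, conv_b, n_b, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under beta posteriors.

    Assumption: a flat Beta(1, 1) prior on each conversion rate, so the
    posterior is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = np.random.default_rng(seed)
    post_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, size=n_draws)
    post_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, size=n_draws)
    return float((post_b > post_a).mean())


# Hypothetical counts: 50/500 conversions for A, 70/500 for B.
p = chance_b_beats_a(conv_a=50, n_a=500, conv_b=70, n_b=500)
print(f"estimated P(B beats A): {p:.3f}")
```

Because the posterior depends only on the conversion counts, the shape of any underlying revenue or usage distribution does not enter this calculation for binomial metrics.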