# announcements
h
Hi @fresh-football-47124 @future-teacher-7046 Has the bug related to ratio metrics been fixed? https://growthbookusers.slack.com/archives/C01T6PKD9C3/p1667921148551759?thread_ts=1667916912.725549&cid=C01T6PKD9C3
h
Hi @handsome-library-89124, I was investigating this today. I'm wondering whether it's possible your numerator or your denominator metric has any NULL values in it? Could you check?
h
Hi @helpful-application-7107 It was an old experiment and I ran it without a denominator. I am just asking whether the bug mentioned by @future-teacher-7046 has been fixed; if so, I will use ratio metrics in the future.
@helpful-application-7107 The values can't be NULL because of the condition in the query; they can't even be zero.
h
One potential source of this bug is NULL values, and we're discussing solutions for that. I'll try to follow up when that is resolved.
h
Thanks, I will wait to hear back from you!
g
Hi team! Is there any update here? We’re having a similar issue with some ratio metrics (however, for other ratio metrics it seems to be working fine). Thanks! cc: @kind-airplane-15946
👍 1
h
@green-jordan-83609 You're still seeing negative variance right now?
I would love to see some screenshots of your results and the intermediate results that are proving problematic.
For example, the screenshots that the user provided here would be helpful for your case and could help us debug the issue: https://growthbookusers.slack.com/archives/C01T6Q0TEG3/p1666343634581349
g
Sure! The metric with the issue shows up like this in the experiment dashboard. It’s really similar to the case you attached: a static 50% chance to beat control and no violin plot at all. The metric has the following characteristics:
• defined as a count
• user aggregation is SUM(value)
• denominator is another count metric
If I retrieve the query that the experiment is using (under the view queries option), I can run the following:
...all previous steps... ,
__stats as (
    -- One row per variation/dimension with aggregations
    SELECT
      d.variation,
      d.dimension,
      SUM(IFF(m.value is null, 1, 0)) as num_nulls_numerator,
      SUM(IFF(d.value is null, 1, 0)) as num_nulls_denominator
    FROM
      __userDenominator d
      LEFT JOIN __userMetric m ON (d.user_id = m.user_id)
    GROUP BY
      d.variation,
      d.dimension
  )
  
  SELECT *
  FROM __stats
This checks whether the final join produces any NULLs (i.e. some users might be in the denominator but not in the numerator, therefore producing a NULL in m.value). That’s not the case: this query returns 0 for both counts.
The funny thing is that we have another ratio metric which uses the same denominator (i.e. the same count metric as the denominator), and for that one the violin plot and the stats work well. I’ve checked, and the final values calculated by the query:
• variation
• dimension
• users
• count
• mean
• stddev
are more or less in the same proportion in both cases; the stddev is roughly 3 times the mean for both. The only difference I’m able to see is that the non-working metric has a mean closer to 0 (around 0.07), but I’m not sure that would be the root of the issue.
Apart from that, I just wonder how these new ratio metrics work with the Bayesian statistical framework. Are we still defining a Gaussian posterior with mean = query_mean and std = query_stddev / sqrt(num_users)? Thanks a lot and let us know if we can help with anything else!
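As a minimal sketch of the construction being asked about here (treating each variation's mean as Gaussian with standard error stddev / sqrt(n), and deriving chance to beat control from the difference of the two Gaussians), something like the following; the function name and the numbers are made up for illustration, and this is not necessarily what gbstats actually does:

# Treat each variation's mean as normal with standard error stddev / sqrt(n),
# then compute "chance to beat control" from the difference of the two normals.
from scipy.stats import norm

def chance_to_beat_control(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """P(variation B > variation A) under independent Gaussian approximations."""
    se_a = std_a / n_a ** 0.5
    se_b = std_b / n_b ** 0.5
    diff_mean = mean_b - mean_a
    diff_se = (se_a ** 2 + se_b ** 2) ** 0.5
    return norm.sf(0, loc=diff_mean, scale=diff_se)  # P(diff > 0)

# Made-up numbers in the same ballpark as this thread (mean ~0.07, stddev ~3x the mean):
print(chance_to_beat_control(0.07, 0.21, 5000, 0.075, 0.21, 5000))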
h
Can you share the intermediate values from the metric queries? If you go to view queries, can you share the results from the ratio metric? Just copy the cell below the ratio metric query along with the ratio metric query itself, and I'll be able to take a closer look.
@green-jordan-83609 shared some more details in a private DM. The issue in this case is that we use log approximations in our Gaussian tests in gbstats, and the values in your case are small (mean) and uncertain (variance); the log approximation can suffer when the mean values are close to zero. We have a check built into gbstats that returns no meaningful information for the posterior distribution when that check fails, since the log approximation could be inexact.
If the stddev of the ratio metric decreases as you collect more data, this could go away. If not, we could help you do a bespoke analysis in a Jupyter notebook. In any case, we definitely need better error messaging and handling when cases like this occur. I'll open an issue for that.
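To illustrate why a log-style approximation degrades here (just a sketch, not the actual gbstats check): moment-matching a log-normal to a metric's mean and stddev gives sigma^2 = ln(1 + stddev^2 / mean^2), so when the stddev is several times the mean that term is already large and the fit becomes very sensitive to a mean near zero. The function below is an assumption for illustration only.

import math

def lognormal_params(mean, stddev):
    """Moment-match a log-normal to a positive metric's mean and stddev."""
    if mean <= 0:
        raise ValueError("log-normal approximation needs a positive mean")
    cv2 = (stddev / mean) ** 2        # squared coefficient of variation
    sigma2 = math.log(1.0 + cv2)      # large when stddev >> mean
    mu = math.log(mean) - sigma2 / 2.0
    return mu, sigma2

# With numbers like the ones in this thread (mean ~0.07, stddev ~3x the mean),
# cv2 is ~9 and sigma^2 is ~ln(10) ~= 2.3, i.e. a very spread-out approximation.
print(lognormal_params(0.07, 0.21))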
g
Understood, thanks a lot! Would changing the scale of the metric help, or would that just make it worse? Anyway, we need to capture more data for sure.
h
I'm not sure if changing the scale of the metric would help. It would change the metric itself, I would imagine, but you could try! I think the main issue here is that the standard deviations you shared with me are really large relative to the mean. However, that standard deviation depends on both the numerator and the denominator, since this is a ratio metric.
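As a rough illustration of that last point (a delta-method sketch with made-up data, not the gbstats implementation): the variance of a ratio metric depends on the numerator, the denominator, and their covariance, and rescaling the numerator by a constant scales the estimate and its standard error together, so the relative uncertainty stays the same.

import numpy as np

def ratio_mean_and_se(numerator, denominator):
    """Per-user numerator/denominator values -> ratio estimate and its standard error."""
    n = len(numerator)
    m_bar, d_bar = np.mean(numerator), np.mean(denominator)
    cov = np.cov(numerator, denominator, ddof=1)  # 2x2 sample covariance matrix
    var_ratio = (
        cov[0, 0] / d_bar**2
        - 2 * m_bar * cov[0, 1] / d_bar**3
        + m_bar**2 * cov[1, 1] / d_bar**4
    ) / n
    return m_bar / d_bar, var_ratio ** 0.5

# Multiplying the numerator by a constant multiplies both the estimate and its
# standard error by that constant, so the signal-to-noise ratio is unchanged:
rng = np.random.default_rng(0)
num = rng.poisson(0.5, size=2000).astype(float)
den = rng.poisson(5.0, size=2000).astype(float) + 1.0  # keep the denominator positive
print(ratio_mean_and_se(num, den))
print(ratio_mean_and_se(100 * num, den))  # ~100x the mean AND ~100x the standard error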