have encountered a discrepancy in the results of an experime GrowthBook Users #ask-questions

have encountered a discrepancy in the results of a...

abundant-air-78515

02/10/2024, 11:09 AM

have encountered a discrepancy in the results of an experiment where the same metric is measured in two different ways. When the metric is calculated using a Count approach, the result shows a decrease of 15%. However, when it is calculated using a binomial approach, there is an increase of 10%. In both cases, I apply a delimiter to evaluate the total mass. How can such a difference occur?

helpful-application-7107

02/12/2024, 4:54 PM

In both cases, I apply a delimiter to evaluate the total mass.

Denominators (if that is what you mean) are different for cont and binomial metrics. What are you choosing as your denominator?

helpful-application-7107

02/12/2024, 4:55 PM

You can read more about that here: https://docs.growthbook.io/app/metrics#denominator-ratio--funnel-metrics. The UI could be improved here, but they are treated separately when you apply a custom denominator. My follow-up question for you would be, what is your denominator that you're using?

abundant-air-78515

02/12/2024, 7:11 PM

its no difference , its only Count or Binominal thats the difference

abundant-air-78515

02/12/2024, 7:11 PM

between them

abundant-air-78515

02/12/2024, 7:12 PM

but show totally different result

abundant-air-78515

02/12/2024, 7:12 PM

as i wrote

helpful-application-7107

02/12/2024, 7:36 PM

In both cases, I apply a delimiter to evaluate the total mass

What do you mean by this?

helpful-application-7107

02/12/2024, 7:38 PM

Can you share the SQL for these two metrics as well as the denominator field? Screenshots would be very helpful.

abundant-air-78515

02/12/2024, 7:42 PM

Thanks, absolutely

abundant-air-78515

02/12/2024, 7:43 PM

its the same sql :SELECT c.id AS grade, c.created_at AS timestamp, MAX(c.employee_account_id) AS user_id, MAX(c.company_id) as company_id FROM demando-data.prod_public.connection c INNER JOIN demando-data.prod_public.account AS a ON a.id = c.employee_account_id LEFT JOIN demando-data.prod_public.candidate AS ca ON c.candidate_id = ca.id LEFT JOIN ( SELECT candidate_id, score AS latest_score, MIN(created_at) AS min_created_at FROM demando-data.prod_public.candidate_activity_score GROUP BY candidate_id, score ) AS cas ON cas.candidate_id = ca.id AND cas.min_created_at > c.created_at WHERE c.assistant_id IS NULL AND c.initiated_by = ‘employee’ AND c.is_open = TRUE GROUP BY c.id, c.created_at

abundant-air-78515

02/12/2024, 7:44 PM

And the denominator to this is company_accept_grade_sent from this sql : SELECT 1 AS grade, c.created_at AS timestamp, MAX(c.employee_account_id) AS user_id, MAX(c.company_id) as company_id FROM demando-data.prod_public.connection c INNER JOIN demando-data.prod_public.account AS a ON a.id = c.employee_account_id LEFT JOIN demando-data.prod_public.candidate AS ca ON c.candidate_id = ca.id LEFT JOIN ( SELECT candidate_id, score AS latest_score, MIN(created_at) AS min_created_at FROM demando-data.prod_public.candidate_activity_score GROUP BY candidate_id, score ) AS cas ON cas.candidate_id = ca.id AND cas.min_created_at > c.created_at WHERE c.assistant_id IS NULL AND c.initiated_by = ‘employee’ GROUP BY c.id, c.created_at

abundant-air-78515

02/12/2024, 7:46 PM

So the first sql i have as COUNT and Binominal in two different metrics pointng to same denominator. But showing very diff results..

helpful-application-7107

02/12/2024, 7:49 PM

For the Count metric, the SQL will have to be different because it will need a

value

column. What are you returning as the

value

column?

helpful-application-7107

02/12/2024, 7:50 PM

What's probably happening is for

binomial

metrics the numerator is 0 or 1 for users, but for

count

metrics we sum the

value

column. It seems possible from your above queries that you could end up with multiple rows per

user_id

which would result in different numerator values.

abundant-air-78515

02/12/2024, 8:58 PM

Ah sorry here is for the COUNT: SELECT 1 AS value, c.created_at AS timestamp, MAX(c.employee_account_id) AS user_id, MAX(c.company_id) as company_id FROM demando-data.prod_public.connection c INNER JOIN demando-data.prod_public.account AS a ON a.id = c.employee_account_id LEFT JOIN demando-data.prod_public.candidate AS ca ON c.candidate_id = ca.id LEFT JOIN ( SELECT candidate_id, score AS latest_score, MIN(created_at) AS min_created_at FROM demando-data.prod_public.candidate_activity_score GROUP BY candidate_id, score ) AS cas ON cas.candidate_id = ca.id AND cas.min_created_at > c.created_at WHERE c.assistant_id IS NULL AND c.initiated_by = ‘employee’ AND c.is_open = TRUE GROUP BY c.id, c.created_at

abundant-air-78515

02/12/2024, 9:00 PM

What different values should i put when its COUNT or Binominal? Dont know whats happening under the hood here after my sql..

abundant-air-78515

02/12/2024, 9:01 PM

Can you explain what i need to do andhow it works?

helpful-application-7107

02/12/2024, 9:14 PM

After your query we aggregate at the

user_id

level (or whatever identifier type you are using for your experiment). For Count we

SUM(value)

and for binomial we just look for any row for a user and count that as

. So ideally you shouldn't change your count definition to match the binomial and can just rely on the binomial metric. If you want to make them match, you could use the "Custom Aggregation" field of simply

rather than the default

SUM(value)

and then the results should be similar (the statistics won't be, because we model the two metrics differently, but they should be close, and the uplift percents should be the same).

abundant-air-78515

02/13/2024, 8:04 AM

So i change the count logic ? Shoould i use User Value Aggregation field and set SUM(value) on the 1 value?

abundant-air-78515

02/13/2024, 8:05 AM

And whats the difference between seting for example SUM in User Value Aggregation field or in the query

abundant-air-78515

02/13/2024, 8:08 AM

SELECT 1 AS value, c.created_at AS timestamp, MAX(c.employee_account_id) AS user_id, MAX(c.company_id) as company_id FROM demando-data.prod_public.connection c INNER JOIN demando-data.prod_public.account AS a ON a.id = c.employee_account_id LEFT JOIN demando-data.prod_public.candidate AS ca ON c.candidate_id = ca.id LEFT JOIN ( SELECT candidate_id, score AS latest_score, MIN(created_at) AS min_created_at FROM demando-data.prod_public.candidate_activity_score GROUP BY candidate_id, score ) AS cas ON cas.candidate_id = ca.id AND cas.min_created_at > c.created_at WHERE c.assistant_id IS NULL AND c.initiated_by = ‘employee’ AND c.is_open = TRUE GROUP BY c.id, c.created_at

abundant-air-78515

02/14/2024, 9:03 AM

@helpful-application-7107

helpful-application-7107

02/14/2024, 4:31 PM

We look at all rows for each user returned by your query and apply the user value aggregation. So what you want is to use

as the User Value Aggregation.

helpful-application-7107

02/14/2024, 4:32 PM

You can see how this gets used eventually if you click the three dots in the results table and click "View Queries"

abundant-air-78515

02/15/2024, 11:55 AM

So 1 in the query and SUM(value) in User Value Aggregation

abundant-air-78515

02/15/2024, 11:55 AM

Ill test it see if i get different result

abundant-air-78515

02/15/2024, 2:24 PM

First is the changed logic for COUNT and the other is for the binominal. Its still a very different result -30% vs +5%

abundant-air-78515

02/15/2024, 2:24 PM

We track the same result but in % and Number..

helpful-application-7107

02/15/2024, 6:45 PM

No, as I said above: So what you want is to use

as the User Value Aggregation.

helpful-application-7107

02/15/2024, 6:45 PM

The

value

column in the query can honestly be anything.

74 Views

Open in Slack

Previous Next