# give-feedback
q
Hey all, I noticed a discrepancy in a few of our metrics that I think is caused by a slight error in a query.
When a metric has a denominator other than the base experiment query, the query joins the denominator events to the experiment events before deduplicating and taking the first timestamp that the user joined the experiment. Hence, if the user converted after the appropriate window, but they’d also fired another experiment event later on, they will end up in the data set.
It’s this section of the query (for a binomial metric that has a denominator other than the total experiment group):
__denominatorUsers as (
    SELECT
      initial.analytics_id,
      t0.conversion_start as conversion_start,
      t0.conversion_end as conversion_end
    FROM
      __experiment initial
      JOIN __denominator0 t0 ON (t0.analytics_id = initial.analytics_id)
    WHERE
      t0.timestamp >= initial.conversion_start
      AND t0.timestamp <= initial.conversion_end
  ),
f
@helpful-application-7107 thoughts?
h
Hmm, I think you're right that this probably is not the expected behavior, but I think I know why it works this way, and I also think it's a tiny bit different from what you're suggesting.
• Our expected behavior if you choose "First Exposure" should probably be to look for people who had a denominator event within the first conversion window after exposure.
• Our current behavior includes users who had a denominator event within the conversion window after any exposure.
"Hence, if the user converted after the appropriate window, but they’d also fired another experiment event later on, they will end up in the data set."
So I think the more accurate representation here is: "if the user converted after the first window closed, but they converted within the conversion window of a later exposure, they will end up in the data set." I am curious if you agree with this @quaint-window-98285, and if this explains the discrepancies you're seeing.
Why is it this way? Our denominators tend to act like activation metrics, where the denominator event acts as a kind of "first exposure" rather than the experiment event itself. This was probably built that way so that a user who hit an exposure a bunch of times, but wasn't considered "exposed" until some activation event fired, would still be included.
What should we do? I agree that we should support the case, especially for denominators, if not for activation metrics, where you want to look only at denominator events in that first exposure window. We have a TODO to improve some of our ratio metric functionality and I'll add this to that list. I also agree that this is more of a "bug" than our other planned improvements, so I'll treat it with higher priority.
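To make the "any exposure" versus "first exposure" distinction concrete, here is a minimal Python sketch. It is not the actual query: the 72-hour window, the timestamps, and both function names are assumptions made up for illustration, and it only mirrors the behavior as described in this thread.

```python
from datetime import datetime, timedelta

# Assumed conversion window length (hypothetical value).
WINDOW = timedelta(hours=72)

# Hypothetical data for one user: two exposures and one denominator event
# that falls outside the first window but inside the second.
exposures = [datetime(2023, 1, 1), datetime(2023, 1, 10)]
denominator_events = [datetime(2023, 1, 11)]


def in_denominator_any_exposure(exposures, events, window=WINDOW):
    """Behavior described as current: an event counts if it falls inside
    the conversion window of ANY exposure."""
    return any(exp <= ev <= exp + window for exp in exposures for ev in events)


def in_denominator_first_exposure(exposures, events, window=WINDOW):
    """Behavior described as expected for "First Exposure": only the window
    that starts at the earliest exposure counts."""
    first = min(exposures)
    return any(first <= ev <= first + window for ev in events)


print(in_denominator_any_exposure(exposures, denominator_events))    # True
print(in_denominator_first_exposure(exposures, denominator_events))  # False
```

In this toy case the user lands in the denominator only because of the second exposure, which is the situation discussed above.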
q
Hey Luke, yes, this explains the discrepancy perfectly. I see what you mean, that in some cases, you might want to count the first exposure to the denominator as the first exposure. But, as in my scenario, we want to count the denominator events made within the initial exposure window, so having this be configurable would be ideal.
Just for some context, in case it is helpful - my use case is an experiment where we are testing a price increase. For this test, our target KPI is average revenue per user (ARPU), but as secondary metrics we want to see the change in conversion rate and paid ARPU (that is, the average amount spent by users who converted). I set up paid ARPU as the average value spent with a denominator of converted users (so, I used the conversion rate metric for this). The problem now is that the % change in conversion rate and the % change in paid ARPU don’t add up to the % change in ARPU.
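For reference, the consistency check behind this can be sketched in a few lines of Python. Since ARPU = conversion rate × paid ARPU, the relative changes compound rather than strictly add: (1 + ARPU lift) = (1 + conversion lift) × (1 + paid ARPU lift), and that only holds when both metrics count the same converters. All numbers and function names below are hypothetical, chosen only to illustrate the arithmetic.

```python
def lift(variant, control):
    """Relative change of variant over control, e.g. -0.2 for -20%."""
    return variant / control - 1


def decompose(users, converters, revenue, paid_denominator=None):
    """Return (conversion rate, paid ARPU, ARPU).

    paid_denominator lets the paid-ARPU metric use a different converter
    count than the conversion-rate metric, mimicking a mismatch like the
    one discussed in this thread.
    """
    if paid_denominator is None:
        paid_denominator = converters
    return converters / users, revenue / paid_denominator, revenue / users


def decomposition_gap(variant, control):
    """(1 + CR lift) * (1 + paid ARPU lift) - (1 + ARPU lift); ~0 if consistent."""
    cr, paid, arpu = (lift(v, c) for v, c in zip(variant, control))
    return (1 + cr) * (1 + paid) - (1 + arpu)


# Hypothetical control and variant groups.
control = decompose(users=1000, converters=100, revenue=1000.0)

# Same converters in both metrics: the lifts compound exactly.
consistent_gap = decomposition_gap(decompose(1000, 80, 960.0), control)

# Paid ARPU divided by 90 users (10 extra picked up via a later exposure)
# while the conversion rate still counts 80: the identity breaks.
mismatched_gap = decomposition_gap(
    decompose(1000, 80, 960.0, paid_denominator=90), control
)

print(f"consistent gap: {consistent_gap:.4f}")
print(f"mismatched gap: {mismatched_gap:.4f}")
```

A gap near zero means the two secondary metrics are counting the same users; a clearly non-zero gap points to a denominator mismatch of the kind described here.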