# give-feedback

prehistoric-beard-84272

01/15/2024, 10:16 AM
Hi, I'm getting an error message on one of my queries that seems to be new.
-- Average Order Value (AOV) (revenue)
WITH
  __rawExperiment AS (
    SELECT
      user_pseudo_id as anonymous_id,
      --(SELECT value.string_value FROM UNNEST(event_params) where key = "gbuuid") as anonymous_id,
      TIMESTAMP_MICROS(event_timestamp) as timestamp,
      (
        SELECT
          value.string_value
        FROM
          UNNEST (event_params)
        where
          key = "experiment_id"
      ) as experiment_id,
      (
        SELECT
          value.int_value
        FROM
          UNNEST (event_params)
        where
          key = "variation_id"
      ) as variation_id,
      geo.country as country,
      traffic_source.source as source,
      traffic_source.medium as medium,
      device.category as device,
      device.web_info.browser as browser,
      device.operating_system as os,
      (
        SELECT
          value.string_value
        FROM
          UNNEST (event_params)
        where
          key = "visitor_type"
      ) as visitor_type,
    FROM
      `bigquery-xxx`.`analytics_xxx`.`events_*`
    WHERE
      REGEXP_EXTRACT(_TABLE_SUFFIX, r'[0-9]+') BETWEEN '20231218' AND '20240115'
      AND event_name = 'experiment_viewed'
      AND user_pseudo_id is not null
  ),
  __experimentExposures AS (
    -- Viewed Experiment
    SELECT
      e.anonymous_id as anonymous_id,
      cast(e.variation_id as string) as variation,
      CAST(e.timestamp as DATETIME) as timestamp,
      e.device AS dim_device
    FROM
      __rawExperiment e
    WHERE
      e.experiment_id = 'dac23-076'
      AND e.timestamp >= '2023-12-18 14:00:51'
      AND e.timestamp <= '2024-01-15 10:12:35'
  ),
  __activationMetric as ( -- Metric (dac23-076_viewed)
    SELECT
      anonymous_id as anonymous_id,
      1 as value,
      CAST(m.timestamp as DATETIME) as timestamp
    FROM
      (
        SELECT
          user_id,
          user_pseudo_id as anonymous_id,
          TIMESTAMP_MICROS(event_timestamp) as timestamp
        FROM
          `bigquery-xxx`.`analytics_xxx`.`events_*`
        WHERE
          event_name = 'dac23-076_viewed'
          AND (
            (_TABLE_SUFFIX BETWEEN '20231218' AND '20240118')
            OR (
              _TABLE_SUFFIX BETWEEN 'intraday_20231218' AND 'intraday_20240118'
            )
          )
      ) m
    WHERE
      m.timestamp >= '2023-12-18 14:00:51'
      AND m.timestamp <= '2024-01-18 10:12:35'
  ),
  __experimentUnits AS (
    -- One row per user
    SELECT
      e.anonymous_id AS anonymous_id,
      (
        CASE
          WHEN count(distinct e.variation) > 1 THEN '__multiple__'
          ELSE max(e.variation)
        END
      ) AS variation,
      MIN(e.timestamp) AS first_exposure_timestamp,
      SUBSTRING(
        MIN(
          CONCAT(
            SUBSTRING(format_datetime("%F %T", e.timestamp), 1, 19),
            coalesce(
              cast(e.dim_device as string),
              cast('__NULL_DIMENSION' as string)
            )
          )
        ),
        20,
        99999
      ) AS dim_exp_device,
      MIN(
        (
          CASE
            WHEN a.timestamp >= e.timestamp
            AND a.timestamp <= DATETIME_ADD(e.timestamp, INTERVAL 72 HOUR) THEN a.timestamp
            ELSE NULL
          END
        )
      ) AS first_activation_timestamp
    FROM
      __experimentExposures e
      LEFT JOIN __activationMetric a ON (a.anonymous_id = e.anonymous_id)
    GROUP BY
      e.anonymous_id
  ),
  __distinctUsers AS (
    SELECT
      anonymous_id,
      dim_exp_device AS dimension,
      variation,
      first_activation_timestamp AS timestamp,
      date_trunc(first_exposure_timestamp, DAY) AS first_exposure_date
    FROM
      __experimentUnits
    WHERE
      first_activation_timestamp IS NOT NULL
  ),
  __metric as ( -- Metric (Average Order Value (AOV))
    SELECT
      anonymous_id as anonymous_id,
      m.value as value,
      CAST(m.timestamp as DATETIME) as timestamp
    FROM
      (
        SELECT
          user_id,
          user_pseudo_id as anonymous_id,
          TIMESTAMP_MICROS(event_timestamp) as timestamp,
          ecommerce.purchase_revenue_in_usd as value
        FROM
          `bigquery-xxx`.`analytics_xxx`.`events_*`
        WHERE
          event_name = 'purchase'
          AND REGEXP_EXTRACT(_TABLE_SUFFIX, r'[0-9]+') BETWEEN '20231218' AND '20240124'
      ) m
    WHERE
      m.timestamp >= '2023-12-18 14:00:51'
      AND m.timestamp <= '2024-01-24 10:12:35'
  ),
  __denominator0 as ( -- Metric (Count of purchase)
    SELECT
      anonymous_id as anonymous_id,
      m.value as value,
      CAST(m.timestamp as DATETIME) as timestamp
    FROM
      (
        SELECT
          user_id,
          user_pseudo_id as anonymous_id,
          TIMESTAMP_MICROS(event_timestamp) as timestamp,
          1 as value
        FROM
          `bigquery-xxx.analytics_xxx.events_*`
        WHERE
          event_name = 'purchase'
          AND REGEXP_EXTRACT(_TABLE_SUFFIX, r'[0-9]+') BETWEEN '20231218' AND '20240124'
      ) m
    WHERE
      m.timestamp >= '2023-12-18 14:00:51'
      AND m.timestamp <= '2024-01-24 10:12:35'
  ),
  __userMetricJoin as (
    SELECT
      d.variation AS variation,
      d.dimension AS dimension,
      d.anonymous_id AS anonymous_id,
      (
        CASE
          WHEN m.timestamp >= d.timestamp
          AND m.timestamp <= DATETIME_ADD(d.timestamp, INTERVAL 72 HOUR) THEN m.value
          ELSE NULL
        END
      ) as value
    FROM
      __distinctUsers d
      LEFT JOIN __metric m ON (m.anonymous_id = d.anonymous_id)
  ),
  __userMetricAgg as (
    -- Add in the aggregate metric value for each user
    SELECT
      variation,
      dimension,
      anonymous_id,
      SUM(COALESCE(value, 0)) as value
    FROM
      __userMetricJoin
    GROUP BY
      variation,
      dimension,
      anonymous_id
  ),
  __userDenominatorAgg AS (
    SELECT
      d.variation AS variation,
      d.dimension AS dimension,
      d.anonymous_id AS anonymous_id,
      SUM(COALESCE(value, 0)) as value
    FROM
      __distinctUsers d
      JOIN __denominator0 m ON (m.anonymous_id = d.anonymous_id)
    WHERE
      m.timestamp >= d.timestamp
      AND m.timestamp <= DATETIME_ADD(d.timestamp, INTERVAL 72 HOUR)
    GROUP BY
      d.variation,
      d.dimension,
      d.anonymous_id
  )
  -- One row per variation/dimension with aggregations
SELECT
  m.variation AS variation,
  m.dimension AS dimension,
  COUNT(*) AS users,
  SUM(COALESCE(m.value, 0)) AS main_sum,
  SUM(POWER(COALESCE(m.value, 0), 2)) AS main_sum_squares,
  SUM(COALESCE(d.value, 0)) AS denominator_sum,
  SUM(POWER(COALESCE(d.value, 0), 2)) AS denominator_sum_squares,
  SUM(COALESCE(d.value, 0) * COALESCE(m.value, 0)) AS main_denominator_sum_product
FROM
  __userMetricAgg m
  LEFT JOIN __userDenominatorAgg d ON (d.anonymous_id = m.anonymous_id)
GROUP BY
  m.variation,
  m.dimension
The curious thing is that it runs without errors when I copy-paste the query directly into the BigQuery SQL console. I have also been using this query for many other experiments, and other queries are working without errors.
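For context, the per-variation sums in the query's final SELECT (`main_sum`, `main_sum_squares`, `denominator_sum`, etc.) are exactly what a ratio-metric stats engine needs to compute a mean and a delta-method variance. A minimal sketch with illustrative names and formulas, not GrowthBook's actual stats-engine code:

```python
def ratio_metric_stats(users, main_sum, main_sum_squares,
                       denominator_sum, denominator_sum_squares,
                       main_denominator_sum_product):
    """Turn per-variation sums into a ratio-metric mean and a
    delta-method variance. Illustrative sketch only."""
    n = users
    mean_m = main_sum / n                    # mean revenue per user
    mean_d = denominator_sum / n             # mean purchase count per user
    var_m = main_sum_squares / n - mean_m ** 2
    var_d = denominator_sum_squares / n - mean_d ** 2
    cov_md = main_denominator_sum_product / n - mean_m * mean_d

    mean = mean_m / mean_d                   # AOV = revenue / purchases
    # Delta-method variance of a ratio of two means
    variance = (var_m / mean_d ** 2
                - 2 * cov_md * mean_m / mean_d ** 3
                + var_d * mean_m ** 2 / mean_d ** 4)
    return mean, variance
```

Note that every term divides by `mean_d`, so a variation/dimension row with `denominator_sum = 0` immediately produces a division-by-zero (NaN/Infinity) here.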
image.png

future-teacher-7046

01/15/2024, 2:14 PM
It looks like that error is coming from our Python stats engine. So the query is running fine, but there's an error when processing the results. We did make a bunch of changes there recently. Some quick searching suggests it happens when there are NaNs or Infinity values in the data. I do see a couple of rows in the BigQuery response with all zeros, which is suspicious. I'll see if I'm able to reproduce the error.
One thing you could test is adding a custom SQL filter under analysis settings to remove smart tv and see if that helps.
device <> 'smart tv'
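Those all-zero rows are a plausible culprit: dividing a zero revenue sum by a zero purchase count yields NaN, which NumPy-based stats code can then choke on. A small illustration with made-up numbers (the all-zero row mimics a tiny segment such as 'smart tv'):

```python
import numpy as np

# Hypothetical per-row aggregates; the last row is all zeros.
main_sum = np.array([120.0, 95.0, 0.0])         # revenue per row
denominator_sum = np.array([12.0, 10.0, 0.0])   # purchase counts per row

with np.errstate(invalid="ignore", divide="ignore"):
    aov = main_sum / denominator_sum  # 0/0 evaluates to NaN

print(aov)               # last element is NaN
print(np.isfinite(aov))  # flags the bad row as non-finite
```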

curved-dawn-87799

01/16/2024, 5:28 PM
I'm getting similar errors when using any dimension to cut results. It started within the past few days; I hadn't had issues with this before.

helpful-application-7107

01/16/2024, 5:37 PM
I'm investigating now, thanks.
Bryan, can you share the query modal where you're seeing this error? The error may not be tied to the first query in the list as it can be produced by any query in the analysis.

prehistoric-beard-84272

01/16/2024, 7:21 PM
Not sure if you fixed something today, but I can't reproduce the error anymore.

helpful-application-7107

01/16/2024, 7:21 PM
No, we haven't fixed it yet, but the issue was stemming from that one row where device was 'smart tv'. We have identified the issue and will be addressing it today.

prehistoric-beard-84272

01/16/2024, 7:22 PM
OK, thank you

curved-dawn-87799

01/16/2024, 7:38 PM
@helpful-application-7107 - do you still need the example query?
image.png

helpful-application-7107

01/16/2024, 7:49 PM
Is it a ratio metric, @curved-dawn-87799?
And are there some rows where the denominator_sum is 0?
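A `denominator_sum` of 0 turns the ratio into 0/0 = NaN, so one way a fix can guard against it is to skip those rows before they reach the stats computation. A hypothetical sketch, not the actual fix that shipped:

```python
import math

def ratio_or_none(main_sum: float, denominator_sum: float):
    """Return main_sum / denominator_sum, or None when the denominator
    is zero -- the 0/0 case that produces NaN and can crash downstream
    stats code. Hypothetical guard for illustration."""
    if denominator_sum == 0:
        return None
    value = main_sum / denominator_sum
    return value if math.isfinite(value) else None

ratio_or_none(120.0, 12.0)  # a normal row yields the ratio
ratio_or_none(0.0, 0.0)     # an all-zero row is skipped
```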

curved-dawn-87799

01/16/2024, 7:51 PM
Possibly. We have about 6 metrics we look at, but it doesn't tell me which one failed. After removing some that wouldn't have a denominator of 0, the dimension segmentation works.

helpful-application-7107

01/16/2024, 7:52 PM
No worries.
Hey Bryan. Could you rerun those queries now? We just landed what I think should fix it.

curved-dawn-87799

01/16/2024, 8:21 PM
All set, confirmed no errors now. Thank you!

helpful-application-7107

01/17/2024, 4:47 PM
Sascha, let us know if you see this error again. We've landed the fix.

prehistoric-beard-84272

01/18/2024, 10:17 AM
Thank you. Yes, I will let you know.