https://www.growthbook.io/ logo
#ask-questions
Title
# ask-questions
f

few-train-25941

11/22/2023, 11:12 AM
Hi all, I was trying to import my existing running experiment data via databricks as source The query looks something like this
Copy code
SELECT user_id,
       exp_time::timestamp AS timestamp,
       experiment_id ,
       experiment_name,
       get_json_object(params_raw, concat('$.', json_object_keys(params_raw)[0])) AS variation_name,
       get_json_object(params_raw, concat('$.', json_object_keys(params_raw)[0])) AS variation_id
FROM   main.artemis.legos_experimentation
where experiment_id=1042
Upon going to the Add Experiment tab the query that growth-book produces is this
Copy code
-- Past Experiments
    WITH
      
        __exposures0 as (
          SELECT 
            cast('user_id' as string) as exposure_query,
            experiment_id,
            MIN(experiment_name) as experiment_name,
            cast(variation_id as string) as variation_id,
            MIN(variation_name) as variation_name,
            date_trunc('day', timestamp) as date,
            count(distinct user_id) as users
          FROM
            (
              SELECT user_id,
       exp_time::timestamp AS timestamp,
       experiment_id ,
       experiment_name,
       get_json_object(params_raw, concat('$.', json_object_keys(params_raw)[0])) AS variation_name,
       get_json_object(params_raw, concat('$.', json_object_keys(params_raw)[0])) AS variation_id
FROM   main.artemis.legos_experimentation
where experiment_id=1042
            ) e0
          WHERE
            timestamp > TIMESTAMP'2023-10-22T09:46:12.800Z'
          GROUP BY
            experiment_id,
            variation_id,
            date_trunc('day', timestamp)
        ),
      __experiments as (
        SELECT * FROM __exposures0
      ),
      __userThresholds as (
        SELECT
          exposure_query,
          experiment_id,
          MIN(experiment_name) as experiment_name,
          variation_id,
          MIN(variation_name) as variation_name,
          -- It's common for a small number of tracking events to continue coming in
          -- long after an experiment ends, so limit to days with enough traffic
          max(users)*0.05 as threshold
        FROM
          __experiments
        WHERE
          -- Skip days where a variation got 5 or fewer visitors since it's probably not real traffic
          users > 5
        GROUP BY
        exposure_query, experiment_id, variation_id
      ),
      __variations as (
        SELECT
          d.exposure_query,
          d.experiment_id,
          MIN(d.experiment_name) as experiment_name,
          d.variation_id,
          MIN(d.variation_name) as variation_name,
          MIN(d.date) as start_date,
          MAX(d.date) as end_date,
          SUM(d.users) as users
        FROM
          __experiments d
          JOIN __userThresholds u ON (
            d.exposure_query = u.exposure_query
            AND d.experiment_id = u.experiment_id
            AND d.variation_id = u.variation_id
          )
        WHERE
          d.users > u.threshold
        GROUP BY
          d.exposure_query, d.experiment_id, d.variation_id
      )
    SELECT
      *
    FROM
      __variations
    WHERE
      -- Skip experiments at start of date range since it's likely missing data
      datediff(day, TIMESTAMP'2023-10-22T09:46:12.800Z', start_date) > 2
    ORDER BY
      experiment_id ASC, variation_id ASC
In the above query the data is filtered, such that assignment data only for the last month is present and later in the same query it has a filter condition which filters out all the variants of a experiment based on the difference in the start date of experiment and the current date of previous month These 2nd filter condition always filter out all the data as the start date of experiment in the query is taken as minimum of all the assignments My question here is, that is this query that is generated by growth-book customisable or can be configured to change these filter conditions
r

rhythmic-agent-34208

11/22/2023, 11:12 AM
Our official support hours are 6:30am - 5pm Pacific Time, Monday through Friday. You may occasionally hear from us outside of these hours. Your support request has been logged in our system. Our support team will get back to you very soon!
Hi Shubham, yes, the SQL query that is generated by GrowthBook when importing existing experiment data is customizable. This can be done by editing the Experiment Assignment Query from the data source page. You can write your own SQL to match your existing database format and rename columns as necessary to fit the expected format by GrowthBook.
To edit the Experiment Assignment Query in GrowthBook, you should navigate to the data source configuration in your account. Once there, look for the "Experiment Assignment Queries" section. This is where you can edit the SQL Query to match your specific dataset and table structure. Please check our User Permissions documentation to ensure you have the correct permission level to make these changes.
4 Views