https://www.growthbook.io/ logo
#ask-questions
Title
# ask-questions
c

calm-pilot-30179

04/03/2024, 3:53 AM
Hi team. I am setting up AWS Athena as the data source for my metrics in GrowthBook. I noticed that there is no explicit requirement of the schema of the resulting SQL from athena. How does GB metrics associate event records with ongoing experiments? Is there any extra setup needed?
f

fresh-football-47124

04/03/2024, 3:57 AM
The SQL is adjustable
once you connect to the data source, you'll be able to edit the query to pull the assignment information - which users were exposed to experiments, and what variation they got
c

calm-pilot-30179

04/03/2024, 4:06 AM
so I will need to write the experiments and variations assigned to the user, at the time of the event, in Athena?
f

fresh-football-47124

04/03/2024, 4:07 AM
yes, in some way, there needs to be a way to pull that information. On the assignment side, we have the 'trackingCallback' which is where you would record this info
c

calm-pilot-30179

04/03/2024, 4:28 AM
Does that means we have to merge the events in the table in Athena with another data source that is populated from tracking Callbacks. Would it do any harm if we were to record all the ongoing experiments and variations, along with the user_id and timestamp, and all other into Athena?
I am still a little confused how to join the user experiments assignments and our events data in Athena.
can you provide an example of a SQL query that has all the information to display AB test results in metrics?
f

fresh-football-47124

04/03/2024, 4:29 AM
the trackingCallback is up to you to define what to do with that info. Usually people use their same event tracking to record that information
so there are two queries you need
One is the assignment query, which is which users were exposed to which experiments (and should return the randomization unit, experiment ID, variation ID, and time it happened)
then you add metric queries, which can be anything, but needs to return the same randomization unit as used for exposure, and the value for the metric
then GrowthBook will do all the joining and statistical analysis
Here are some examples:
Copy code
SELECT
  userid as user_id,
  timestamp as timestamp,
  experimentid as experiment_id,
  variationid as variation_id,
  browser,
  country,
  date_trunc('week', timestamp) as weeknumber
FROM
  sample.experiment_viewed
(this is an example from segment, that has a few other columns returned which can be used as dimensions for experiment analysis)
then a metric might be something like this: Orders per user:
Copy code
SELECT
  userId as user_id,
  anonymousId as anonymous_id,
  timestamp as timestamp,
  1 as value
FROM
  sample.orders
c

calm-pilot-30179

04/03/2024, 4:38 AM
i see. that makes more sense. Let me play with it a bit. Thanks a lot for the clear explanation
Hi @fresh-football-47124, is it possible to join the user expeirment assignment query with the metrics from 2 different data sources?
4 Views