Hey everyone! :wave: I’m setting up GrowthBook an...
# ask-questions
r
Hey everyone! 👋 I’m setting up GrowthBook and running my first A/A test. I’ve connected it to GA4 & BigQuery. Initially, I used the daily export, but two days ago, I switched to streaming export.
Since switching to streaming, I noticed that data is now being populated in events_intraday_ instead of events_. This means that events like experiment_viewed, add_to_cart, and others I need for GrowthBook no longer appear in events_. I’ve already updated my fact tables in GrowthBook to pull data from events_intraday_ instead of events_.
When I check the query running for my A/A test metric (add_to_cart), I see that it’s scanning both events_ and events_intraday_, and I don’t understand why.
• Is it really necessary for GrowthBook to query both tables?
• It feels like an unnecessarily large query. How can I avoid this?
• If I only want to query events_intraday_, how should I configure GrowthBook to do that?
Would love some insights on best practices here! Thanks in advance. 😊
h
Is it really necessary for GrowthBook to query both tables?
Yes, from: https://support.google.com/analytics/answer/7029846?hl=en
If the Streaming export option is enabled, a table named events_intraday_YYYYMMDD is created. This table is populated continuously as events are recorded throughout the day. This table is deleted at the end of each day once events_YYYYMMDD is complete.
Basically, Google deletes the intraday table after one day has passed, once the daily table is complete. This is why our default auto-generated GA4 queries include both the daily and the intraday tables.
It feels like an unnecessarily large query—how can I avoid this?
If I only want to query events_intraday_, how should I configure GrowthBook to do that?
It should not be unnecessarily large: because the intraday table is deleted once the daily table is complete, the query scans the same data as usual plus the most recent day. However, if you want to avoid this or only query one table, all you have to do is modify the SQL you use for your Experiment Assignment Query on the Datasource page (the base SQL used for exposure events) and for your Fact Tables (the base SQL used for metrics). Keep in mind that if you only use intraday, you will only be scanning the latest day of data, so we do not recommend this!
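To make the "modify the SQL" step concrete, here is a minimal sketch of how the table filter could be narrowed. It assumes the base SQL reads from the `events_*` wildcard and filters on `_TABLE_SUFFIX`; the project and dataset name, the `anonymous_id` alias, the hard-coded dates, and the helper `build_fact_table_sql` are all hypothetical placeholders, not GrowthBook's actual generated query. Use whatever identifier columns and date filter your existing query already has.

```python
from datetime import date


def build_fact_table_sql(
    table_prefix: str,
    event_name: str,
    start: date,
    end: date,
    include_daily: bool = True,
    include_intraday: bool = True,
) -> str:
    """Build a BigQuery query over the GA4 export tables for one event.

    Hypothetical helper: it only illustrates how the _TABLE_SUFFIX filter
    decides whether daily tables, intraday tables, or both are read.
    """
    if not (include_daily or include_intraday):
        raise ValueError("select at least one of daily or intraday tables")

    start_s, end_s = start.strftime("%Y%m%d"), end.strftime("%Y%m%d")
    suffix_filters = []
    if include_daily:
        # Matches events_YYYYMMDD tables.
        suffix_filters.append(f"_TABLE_SUFFIX BETWEEN '{start_s}' AND '{end_s}'")
    if include_intraday:
        # Matches events_intraday_YYYYMMDD tables.
        suffix_filters.append(
            f"_TABLE_SUFFIX BETWEEN 'intraday_{start_s}' AND 'intraday_{end_s}'"
        )
    suffix_clause = " OR ".join(f"({f})" for f in suffix_filters)

    return (
        "SELECT\n"
        "  user_pseudo_id AS anonymous_id,\n"
        "  TIMESTAMP_MICROS(event_timestamp) AS timestamp\n"
        f"FROM `{table_prefix}.events_*`\n"
        f"WHERE event_name = '{event_name}'\n"
        f"  AND ({suffix_clause})"
    )


# Intraday-only variant for the add_to_cart metric (placeholder names).
print(
    build_fact_table_sql(
        "my-project.analytics_123456",
        "add_to_cart",
        date(2024, 1, 1),
        date(2024, 1, 7),
        include_daily=False,
    )
)
```

Leaving both flags on keeps the default behaviour described above; turning one off drops the corresponding `_TABLE_SUFFIX` condition from the query.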
r
Thanks for taking the time to answer! I also understood it as data being deleted from intraday_ and "moved" to events_ on a daily basis. What confuses me is that since I switched to streaming 2 days ago, I have no data at all in events_; everything I have is in events_intraday_.
I came to the conclusion that it must be because I only have streaming activated in GA4 (see image attached), but when I read what you say, I get confused again 😄 That's why it seems unnecessary to scan both, since I can't find any data at all in events_ from the last 2 days.
This is really not my area of expertise, and I'm currently trying to learn, so I'm probably misunderstanding something.
h
Oh I see, you have daily off.
In that case, if all your data is in events_intraday_, it doesn't matter that the SQL is scanning both events_ and events_intraday_.
Because events_ is empty, it won't take any extra effort to scan it.
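As a rough illustration of the two setups discussed here, the sketch below simulates which tables a query with both a daily and an intraday suffix filter would actually read. The function `tables_scanned` and the table lists are hypothetical; it only mimics the filter logic in plain Python and does not call BigQuery.

```python
def tables_scanned(existing_tables, start, end):
    """Return the tables a daily + intraday suffix filter would select.

    `start` and `end` are YYYYMMDD strings. Only tables that exist can be
    read, so a missing daily table adds nothing to the scan.
    """
    selected = []
    for name in existing_tables:
        if not name.startswith("events_"):
            continue
        suffix = name[len("events_"):]
        in_daily = start <= suffix <= end
        in_intraday = f"intraday_{start}" <= suffix <= f"intraday_{end}"
        if in_daily or in_intraday:
            selected.append(name)
    return selected


# Daily + streaming: past days are in events_, the current day is in intraday.
daily_and_streaming = [
    "events_20240101",
    "events_20240102",
    "events_intraday_20240103",
]

# Streaming only (the setup in this thread): no events_YYYYMMDD tables exist.
streaming_only = [
    "events_intraday_20240101",
    "events_intraday_20240102",
    "events_intraday_20240103",
]

print(tables_scanned(daily_and_streaming, "20240101", "20240103"))
print(tables_scanned(streaming_only, "20240101", "20240103"))
```

In both cases the same filter picks up exactly the tables that exist for the date range, which is why the streaming-only setup does not pay extra for the daily condition.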
r
OK, cool! 👍 Thanks again. Out of curiosity: why would one want both daily and streaming on? I mean, in intraday_ I get the daily and historical data as well; I just also get it during the day rather than once per day. Am I missing something?
h
From the docs: https://support.google.com/analytics/answer/9358801?hl=en&utm_id=ad
BigQuery streaming export does not include the following user-attribution data for new users:
• traffic_source.name (reporting dimension: User campaign)
• traffic_source.source (reporting dimension: User source)
• traffic_source.medium (reporting dimension: User medium)
User-attribution data for existing users is included but that data requires ~24 hours to fully process, so we recommend not relying on that data from the streaming export and instead getting user-attribution data from the full daily export.
There may be other minimal data cleaning/processing they do in the daily update.
r
Great, thanks!! I think i got it covered now 🙂