# announcements
w
Hey there, am I correct in thinking that the built-in Date dimensional analysis includes only behavior that happened within a user's first conversion window on a given day? Any reason it can't include all of a user's behavior on a given day?
cc @ambitious-apartment-58735 @orange-magician-36994
h
I think if you switch to the "Multiple Exposures" attribution model then it should work somewhat like you suggest. However, we're currently considering an overhaul of the date dimensional analysis (since it can be prone to carryover bias in its current format) and deprecating Multiple Exposures in favor of another approach. The new date dimensional analysis will use only the date of a user's first exposure. Using later dates as independent units can bias those later analyses if one variation causes more users, or users of a certain type, to return to the experiment more frequently. The replacement for the "Multiple Exposures" attribution model will simply look at all conversions after any potential conversion delay (i.e. from exposure date + conversion delay until the end of the experiment). This is a much simpler query that should be more performant.
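To make the windowing concrete, here's a toy pandas sketch of the two per-user metric windows (purely illustrative, with a made-up schema; not the actual query):

```python
import pandas as pd

# Toy illustration of the two metric windows (hypothetical schema):
# the windowed model keeps only the first conversion window, while the
# replacement model keeps everything from exposure (+ delay) to experiment end.
exposures = pd.DataFrame({
    "user_id": [1, 2],
    "first_exposure": pd.to_datetime(["2023-01-02", "2023-01-05"]),
})
events = pd.DataFrame({
    "user_id": [1, 1, 2],
    "timestamp": pd.to_datetime(["2023-01-03", "2023-01-20", "2023-01-06"]),
    "value": [10.0, 5.0, 7.0],
})

conversion_delay = pd.Timedelta(days=0)   # assumed: no conversion delay
conversion_window = pd.Timedelta(days=3)  # only used by the windowed model
experiment_end = pd.Timestamp("2023-02-01")

df = events.merge(exposures, on="user_id")
start = df["first_exposure"] + conversion_delay

# Windowed model: conversions inside [start, start + conversion_window)
windowed = df[(df["timestamp"] >= start) & (df["timestamp"] < start + conversion_window)]

# Replacement model: all conversions in [start, experiment_end]
full_history = df[(df["timestamp"] >= start) & (df["timestamp"] <= experiment_end)]

print(windowed.groupby("user_id")["value"].sum())      # user 1: 10.0, user 2: 7.0
print(full_history.groupby("user_id")["value"].sum())  # user 1: 15.0, user 2: 7.0
```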
Eventually we want to support more date analyses (e.g. a time series of total effects as of day X, rather than the effects for users bucketed on day X, which is what our new approach will produce).
LMK what you think and what kind of analysis would be really helpful for you.
w
From our PoV, we do not need the conversion window. After a user is exposed to a test, we do not want to filter out any data. To do this, we could use first exposure with a large conversion window (at least as long as the test). But we also look at metrics over time for each of our tests, and I believe the current daily analysis captures all conversions within the window that starts on day X, so a long conversion window wouldn't work for the daily analysis. For now, we use all exposures and make sure the conversion windows don't filter out any data. We look at the metrics over time to try to identify novelty effects and to run data QA checks. We aren't doing any statistical testing, so I don't think introducing bias/dependent observations is a concern for us (but please correct me if I'm wrong). It would be very helpful to be able to track metrics per cell partitioned by day, meaning all the user behavior during day X is included for each day in the test (rough sketch below).
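Roughly, the series we have in mind looks like this toy pandas sketch (hypothetical column names, just to illustrate):

```python
import pandas as pd

# Toy sketch of "all behavior on day X per cell" (hypothetical column names).
events = pd.DataFrame({
    "user_id":   [1, 1, 2, 2],
    "timestamp": pd.to_datetime(["2023-01-02", "2023-01-09", "2023-01-03", "2023-01-03"]),
    "value":     [5.0, 2.0, 1.0, 4.0],
})
exposures = pd.DataFrame({
    "user_id":        [1, 2],
    "variation":      ["A", "B"],
    "first_exposure": pd.to_datetime(["2023-01-01", "2023-01-03"]),
})

df = events.merge(exposures, on="user_id")
# keep every event on or after the user's first exposure -- nothing filtered out
df = df[df["timestamp"] >= df["first_exposure"]]

daily = (
    df.assign(day=df["timestamp"].dt.floor("D"))
      .groupby(["day", "variation"])["value"]
      .sum()
)
print(daily)  # one row per (day, cell), including all behavior on that day
```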
h
> From our PoV, we do not need the conversion window. After a user is exposed to a test, we do not want to filter out any data.
Good news: our refactor will handle this use case out of the box, and on our test data it runs in about 80% of the current runtime. We are finalizing the naming, but it will be something like "Full History" or "First Exposure until End of Experiment", and we intend it to replace "Multiple Exposures".

Bad news: we don't really have an existing way to do the analysis you want. You're correct about the way it works now; each day has all users exposed on that day + the conversion window starting on that day (each day a user is bucketed on kind of treats that user as if they were a totally new user). After our refactor, each day will contain just the users bucketed for the first time on that day, and their metrics will come from either their first conversion window or from the start of that window until the end of the experiment, depending on whether you choose "First Exposure" or "Full History". Therefore, the time series will show you how effects have changed by when a user first entered the experiment, but not the average difference between group A and group B (for all users in those groups) on day X.
👍 1
However, supporting that last use case is definitely on our roadmap, as it is probably the most common time series of interest (besides how many users enter the experiment on each day); see the sketch below.
👍 1
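To make that distinction concrete, here's a toy contrast of the two time series (illustrative pandas only, hypothetical schema): the by-entry-cohort series the refactor will produce vs. the per-calendar-day difference requested above.

```python
import pandas as pd

# Toy data (hypothetical schema): one row per (user, day) metric observation.
df = pd.DataFrame({
    "user_id":        [1, 1, 2, 2, 3, 4],
    "variation":      ["A", "A", "B", "B", "A", "B"],
    "first_exposure": pd.to_datetime(["2023-01-01", "2023-01-01", "2023-01-01",
                                      "2023-01-01", "2023-01-02", "2023-01-02"]),
    "day":            pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-01",
                                      "2023-01-02", "2023-01-02", "2023-01-02"]),
    "value":          [1.0, 2.0, 3.0, 1.0, 2.0, 4.0],
})

# Refactor's series: sum each user's full history, then group users by the
# day they first entered the experiment (their entry cohort).
per_user = (
    df.groupby(["user_id", "variation", "first_exposure"])["value"].sum()
      .reset_index()
)
cohort = (
    per_user.groupby(["first_exposure", "variation"])["value"].mean()
            .unstack("variation")
)
print(cohort["B"] - cohort["A"])  # effect by entry cohort

# Requested series: B minus A across *all* exposed users on each calendar day.
daily = df.groupby(["day", "variation"])["value"].mean().unstack("variation")
print(daily["B"] - daily["A"])    # effect on day X; note it tells a different story
```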