Hi all, we're using GrowthBook for our A/B testing and have a question about variance estimation for one of our key metrics.
Our metric is a proportion: `p_watched` (defined as `sum(user watched >= 25s of video) / count(user saw video)`), specifically for videos shown in the first position of algorithmically ranked video playlists.
The core concern is that our data isn't strictly i.i.d. due to clustering. A small number of specific `video_ids` make up a large percentage of the impressions for this first playlist position, and these videos have inherently different `p_watched` rates (say, ranging from 0.1 for some videos to 0.25 for others). This means observations (views) are correlated within each `video_id` cluster. Often, the control and treatment groups have different videos at the top position.
Classical variance formulas for proportions (like `p̂(1-p̂)/n`) assume independent observations. In this scenario they appear to underestimate the true standard error and so produce overly narrow confidence intervals.
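To make the concern concrete, here is a minimal sketch of the comparison I have in mind, using only the Python standard library. The counts and the names `naive_se`, `cluster_robust_se`, and `example` are made up for illustration, and the cluster-robust formula is the textbook sandwich estimator for a pooled mean with a `G/(G-1)` correction. I'm not claiming this is what GrowthBook does internally.

```python
from math import sqrt

def naive_se(clusters):
    """Binomial SE sqrt(p(1-p)/n), treating every impression as independent."""
    watched = sum(w for w, _ in clusters.values())
    n = sum(m for _, m in clusters.values())
    p = watched / n
    return sqrt(p * (1 - p) / n)

def cluster_robust_se(clusters):
    """Cluster-robust SE of the pooled proportion, clustering on video_id."""
    watched = sum(w for w, _ in clusters.values())
    n = sum(m for _, m in clusters.values())
    p = watched / n
    g = len(clusters)
    # Sum of residuals within each cluster is (watched_g - p * impressions_g);
    # squaring the per-cluster sums keeps the within-cluster correlation.
    ss = sum((w - p * m) ** 2 for w, m in clusters.values())
    # G/(G-1) is a common small-sample correction (an assumption here).
    return sqrt(g / (g - 1) * ss) / n

# Hypothetical data: video_id -> (watched >= 25s, impressions).
example = {
    "vid_a": (100, 1000),  # p_watched = 0.10
    "vid_b": (250, 1000),  # p_watched = 0.25
    "vid_c": (100, 1000),
    "vid_d": (250, 1000),
}

print(naive_se(example), cluster_robust_se(example))
```

With these made-up counts the naive SE is about 0.006 while the cluster-robust SE is about 0.043, roughly seven times larger. Only four clusters is far too few for the estimator to be reliable in practice, but it shows the direction of the problem.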
My question is specifically about the standard error calculation for the treatment effect (lift or absolute difference) itself. Does GrowthBook's frequentist or Bayesian engine incorporate adjustments for this type of data clustering (similar to Cluster Robust Standard Errors - CRSE) when calculating the variance/SE for proportion metrics? Or does it primarily rely on the user-level i.i.d. assumption?
Have you encountered this specific clustering scenario before with proportion metrics? Any insights into how GrowthBook handles it, or recommended best practices within the platform, would be greatly appreciated.
Thank you,
Teodor