mysterious-iron-16289
05/23/2024, 12:40 PM
helpful-application-7107
05/23/2024, 3:05 PM
\Delta_cv is the CUPED lift estimate, and \Delta_cv is going to be different from \Delta (introduced right below equation 1 in section 2.1 in their paper). This is because the raw variation averages (written as Ybar in their paper) are different from the adjusted variation averages (written as Yhat_cv in their paper). That's just to help you see how the quantities are different in their paper.
Intuitively, for CUPED to work, our estimate of the lift itself has to move. Think of it this way: in some respects CUPED reduces variance by eliminating "improbable" randomizations. If you end up with a randomization that is working correctly but, by chance, puts slightly less-active users in your test variation, then CUPED uses the pre-experiment knowledge that these were less-active users to reduce that imbalance's impact on the estimate. That means shifting the estimate to account for this chance imbalance.
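Here is a minimal simulation sketch of that intuition, in case it helps. It is not our actual implementation: the function names (`cuped_lifts`, `simulate`), the data-generating numbers, and the choice to pool theta across both variations are all assumptions made for illustration. It computes the raw and CUPED-adjusted absolute lift for one simulated run, then repeats the experiment to compare how much each estimate varies.

```python
import numpy as np


def cuped_lifts(y_c, x_c, y_t, x_t):
    """Return (raw_lift, cuped_lift) as absolute differences in means.

    y_* are in-experiment values, x_* are pre-experiment values.
    Assumption: theta is estimated on data pooled across both variations.
    """
    x = np.concatenate([x_c, x_t])
    y = np.concatenate([y_c, y_t])
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    x_bar = x.mean()
    # Adjusted variation averages: shift each raw mean by how far that
    # variation's pre-experiment mean sits from the overall mean.
    adj_c = y_c.mean() - theta * (x_c.mean() - x_bar)
    adj_t = y_t.mean() - theta * (x_t.mean() - x_bar)
    raw_lift = y_t.mean() - y_c.mean()
    return raw_lift, adj_t - adj_c


def simulate(rng, n=2000, true_lift=0.5):
    """One hypothetical experiment with n users per variation."""
    x = rng.normal(10, 3, size=2 * n)        # pre-experiment activity
    y = x + rng.normal(0, 1, size=2 * n)     # in-experiment metric
    in_test = rng.permutation(2 * n) < n     # random 50/50 assignment
    y = y + true_lift * in_test
    return cuped_lifts(y[~in_test], x[~in_test], y[in_test], x[in_test])


rng = np.random.default_rng(0)

# A single run: the two estimates differ because of chance imbalance in x.
raw, cuped = simulate(rng)
print(f"single run: raw={raw:.3f} cuped={cuped:.3f}")

# Many runs: compare the average and spread of each estimator.
runs = np.array([simulate(rng) for _ in range(500)])
raw_mean, cuped_mean = runs.mean(axis=0)
raw_sd, cuped_sd = runs.std(axis=0)
print(f"mean over runs: raw={raw_mean:.3f} cuped={cuped_mean:.3f}")
print(f"sd over runs:   raw={raw_sd:.3f} cuped={cuped_sd:.3f}")
```

The difference between the two lifts in any one run is exactly theta times the gap in pre-experiment means between the variations, which is the "chance imbalance" being corrected.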
Because this imbalance is due to chance, you're correct that, across many runs of the same experiment, the CUPED estimate would on average be the same as the non-CUPED estimate. However, in a single run, the estimate will always change. (BTW, this applies whether you use relative or absolute effects, because the shift occurs in the variation averages, which are used for both estimates.)
helpful-application-7107
05/23/2024, 3:08 PM
\mu_T and \mu_C and that affects both the relative and absolute effects, \Delta_r and \Delta_a, respectively.
helpful-application-7107
05/23/2024, 3:10 PM
\sigma_T and \sigma_C in our notation above) by understanding how much of the variance is explained by fixed factors (e.g. the pre-experiment values).
mysterious-iron-16289
05/24/2024, 10:20 AM
helpful-application-7107
05/24/2024, 1:22 PM
helpful-application-7107
05/24/2024, 1:23 PM
helpful-application-7107
05/24/2024, 8:59 PM
Yes, your math looks right and is a clear exposition! I will probably update our CUPED documentation now that we've updated the statistics details docs (the one I linked above). Would you mind if I borrowed a bit from your exposition? I think it's clear, and I could add it to a FAQ on the CUPED page.
mysterious-iron-16289
05/27/2024, 7:14 AM
mysterious-iron-16289
05/27/2024, 1:41 PM
mysterious-iron-16289
05/31/2024, 3:13 PM
helpful-application-7107
05/31/2024, 3:15 PM
mysterious-iron-16289
06/03/2024, 8:19 AM