# ask-questions
c
Hello, we're running back-to-back A/A tests. Yesterday we launched the 4th test, with no gap between tests, by starting a new phase with re-randomization. After one day it's showing significant negative results. I hope it flattens out in 3-4 days; if it stays significant, that will be a big red flag that something is wrong with the setup, since the 3rd test was also significant.

We found a major issue reviewing Tests 1 and 2: 98.1% of returning users got identical assignments to their previous test. That's sticky assignment, not fresh randomization. We stopped Test 1 and relaunched it with a new phase. Tests 2 and 3 had proper randomization, with fresh 50/50 splits for returning users. Maybe we did the relaunch wrong? Should we create a new experiment instead of just changing the status back to "Running" after the test was stopped? Since starting a new phase looks like it works, we should be able to confirm this after the 4th test.

Additional questions:
• Are there known issues with Cloudflare Worker + GrowthBook randomization?
• Could re-randomization cause systematic bias?
• Is there anything else we should know about re-launching a test that isn't mentioned on the GrowthBook website?
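(For intuition on why a fresh phase re-rolls returning users: deterministic bucketing is a pure function of a seed and the user id. Below is a minimal sketch using a generic SHA-256 hash; it is not GrowthBook's exact algorithm, which uses FNV-1a hashing.)

```js
// Minimal sketch of seeded deterministic bucketing (illustrative only).
// The bucket is a pure function of (seed, userId): reusing the old seed
// keeps every returning user in the same variation ("sticky"), while a
// fresh seed re-rolls them 50/50.
const crypto = require("crypto");

function bucket(seed, userId) {
  const hash = crypto.createHash("sha256").update(`${seed}__${userId}`).digest();
  // Map the first 4 bytes of the hash to [0, 1) and split 50/50.
  return hash.readUInt32BE(0) / 2 ** 32 < 0.5 ? "control" : "variant";
}

console.log(bucket("phase-1-seed", "user-123")); // same answer every call
console.log(bucket("phase-2-seed", "user-123")); // independent 50/50 re-roll
```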
c
I'm not positive about the behaviour of starting a new phase though
s
> After one day it's showing significant negative results. I hope it flattens out in 3-4 days; if it stays significant, that will be a big red flag that something is wrong with the setup, since the 3rd test was also significant.
Hi, I am a data scientist at GrowthBook. How many metrics are in your A/A test?
> We found a major issue reviewing Tests 1 and 2: 98.1% of returning users got identical assignments to their previous test.
Did you find the issue causing this?
c
Hey! @cool-house-41256, if sticky assignment is enabled then yes; if not, it should re-randomize all returning users, if I understand correctly
Hey @steep-dog-1694! We have two related metrics in the test. The A/A has now flattened back out as expected; I'm 90% confident it will end flat for both metrics.
>> We found a major issue reviewing Tests 1 and 2: 98.1% of returning users got identical assignments to their previous test.
> Did you find the issue causing this?
Looking at the current Test 4, re-randomization works. The only one that didn't work was Test 2. The only explanation I've found is that maybe the "seed" wasn't generated: either it failed because of how the test was launched, or it was a random error that I don't have a good explanation for. Here is the history and steps from Test 1 to Test 2:
• Test 1 was stopped on August 11; the A/A test showed the expected non-significant results. It was stopped by clicking "Stop experiment" and choosing the result "inconclusive".
• On August 14 we decided to launch a series of back-to-back tests.
• We used the same experiment, clicked Edit phase -> New phase, and changed the status from "Stopped" to "Running".
• That started a new phase, but apparently without re-randomization; whether it was a mis-click or something else, I'm not sure. Looking at the Audit logs, I don't see any change to the "seed" for that phase, while in the most recent phase changes I can see that line changed.
If I understand correctly, whenever re-randomization is done correctly a new "seed" should be generated?
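(For reference, the kind of check behind a number like 98.1% can be sketched like this; `assignA`/`assignB` are hypothetical exports of per-user assignments for two consecutive tests, not anything GrowthBook provides directly.)

```js
// Minimal sketch for measuring how many returning users kept the same
// variation across two tests. `assignA` / `assignB` are Maps of
// userId -> variation for Test A and Test B (hypothetical exports).
function identicalAssignmentRate(assignA, assignB) {
  let returning = 0;
  let identical = 0;
  for (const [userId, variation] of assignA) {
    if (!assignB.has(userId)) continue; // only users seen in both tests
    returning += 1;
    if (assignB.get(userId) === variation) identical += 1;
  }
  return returning ? identical / returning : NaN;
}

// With proper re-randomization this should land near 0.5; a value near
// 1.0 (like the 98.1% observed here) means assignments were sticky.
```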
s
Hi Marius, yes, a new seed is generated with re-randomization. It sounds like, other than Test 2 (where something went wrong and re-randomization didn't occur, which can bias the results), the A/A test results are reasonable?
c
Hi @steep-dog-1694, still in progress, as we needed to restart because a dev accidentally removed the A/A and we haven't observed it for a couple of days 😄
Hi @steep-dog-1694, yesterday we got our 2nd significant result out of 4 A/A tests, which is a bit alarming. Context:
• Test run time was the same as for all the A/A tests: 1 week.
• Same product area, same triggers.
• Same metrics (CR and click_pay), and both metrics moved in the same direction.
• We don't use any priors for Bayesian (not even the default 0.3 that GB offers), but this won't change the final result since the sample size is not small. I understand there would be more movement while the sample is small, especially at the beginning.
• While the 1st test's confidence was a bit weaker (CI including zero), the 2nd test's effect looks more suspicious.
• Re-randomization from the previous Test 4 to Test 5 worked as expected; we observed no issues.
Any suggestions?
One thing we found is that there is a Safari bias: in every test there were significant results for these users. Safari has a cookie limit, and we observed that when it's hit, cookies get deleted. Could there be any other issues with the Cloudflare Worker that we need to be aware of for these users?
@strong-mouse-55694 pinging for more visibility
s
Hi @cold-nightfall-57392, would you be comfortable sharing screenshots of the results page for the 4 AA tests?
c
Could I send them to you in a DM?
s
yes
s
Hey. I'm not seeing anything jumping out at me that could be an issue, but Safari does include several tracking protections that can interfere with analytics calls. Are you able to share the page where the test is occurring?
c
Hey @strong-mouse-55694, the test is currently running as an A/A on checkout, but we also have an A/A test on the eldorado homepage
From what we've read about ITP and Safari, we should be compliant, unless there is something with cross-domain tracking that we still haven't observed
s
You said you're using Cloudflare Workers. Is the tracking call firing in the worker or on the browser?
c
Browser
s
Thanks. When I go to the URL, I see some GA4 calls. Is that what you're using for event tracking? I don't see any experiment callbacks. Is the experiment still running?
c
Yeah, we're using GA4 for event tracking. If you're revisiting the website you only get `experiment_viewed` once; that's why you might not be able to see it. Launching in incognito should give you clean cookies and you should get all the experiment events. We are sending them in the areas where the test is set up.
f
@brief-honey-45610 @flaky-noon-11399 Do you mind helping Marius with this?
thankyou 1
f
Hi Marius, just got caught up on the thread. From what I can see, there could be several factors at play:
1. The ITP issue mentioned above. When Safari hits its cookie limit, it may delete cookies, leading to users being re-bucketed or assigned new identifiers.
2. You mentioned the trackingCallback being triggered in the browser, but the recommendation is for this to be handled on the edge.
3. For the SRM errors, have you reviewed our help doc here that walks you through the causes and solutions? In particular, check that the identifiers match.
All in all, since you consistently see bias only for Safari users, this is a strong indicator that cookie persistence is the root cause (see the sketch below for one common mitigation). It may be worth testing the user flow in Safari to see whether the assignment identifier remains consistent throughout the user journey (when they accept cookies, etc.) and to make sure users are not re-randomized mid-flow.
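(One common ITP mitigation, sketched below under the assumptions that the worker proxies the page response and that `gbuuid` is the assignment cookie name: set the cookie from the edge as an HTTP header, since Safari's 7-day cap applies to cookies written by client-side JavaScript, not to first-party cookies set in the HTTP response.)

```js
// Hedged sketch: persisting the assignment id via an HTTP Set-Cookie
// header in a Cloudflare Worker, so Safari's ITP cap on script-written
// cookies doesn't apply. Cookie name and TTL are illustrative.
export default {
  async fetch(request, env, ctx) {
    const cookies = request.headers.get("Cookie") || "";
    let uuid = (cookies.match(/gbuuid=([^;]+)/) || [])[1];
    const response = await fetch(request); // pass through to the origin
    if (uuid) return response; // cookie already persisted; nothing to do
    uuid = crypto.randomUUID();
    // Clone the response so its headers are mutable, then attach the cookie.
    const withCookie = new Response(response.body, response);
    withCookie.headers.append(
      "Set-Cookie",
      `gbuuid=${uuid}; Path=/; Max-Age=${400 * 24 * 3600}; Secure; SameSite=Lax`
    );
    return withCookie;
  },
};
```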
c
Hi @flaky-noon-11399, thanks for the quick response!
1. We couldn't confirm the Safari cookie limit; we tried multiple times to replicate it and couldn't. Maybe it deletes cookies if the website was not visited within 7 days (we will try to confirm this soon), but sadly that doesn't apply to our A/A test anyway.
2. After discussing with the dev team: our site is a Single Page Application (Angular). The Cloudflare Worker only executes on the initial page request; after that, all navigation happens client-side via JavaScript routing.
   a. Example flow:
      i. User lands on eldorado.gg/category → the Cloudflare Worker runs and the user gets bucketed.
      ii. User clicks through to the homepage → the JavaScript router changes the URL to /homepage.
      iii. No request hits the Cloudflare Worker → the edge worker has no idea the user navigated to the homepage.
      iv. If we only tracked on the edge, this user would never be recorded for a homepage experiment.
   b. The goal of this approach was to avoid flicker. Not sure if it's the best approach, but it's what the team used. Since we have systematic biases, I'm trying to question everything.
3. We will review it further, but this doesn't seem to be the issue.
Are there any SPA-specific recommendations in GrowthBook's documentation that we might have missed? Or something not in the official documentation that would work for our use case? Thank you in advance!
f
Hi @cold-nightfall-57392, good morning and Happy Friday 👋 Thank you for your speedy response and the additional information. We do have SPA-related documentation here, but it's quite bare. Let me cross-check with our Edge expert on a recommendation here
c
Happy Friday @flaky-noon-11399, thanks! It does look a bit bare, and we don't use the GB SDK in our front-end anyway. Let me know if the Edge expert has any recommendations. 🙏
🙌 1
f
Thank you Marius, I've submitted a query, just waiting for the team to come online for the day and review. Will keep you posted on updates 🌻
thankyou 1
Hi @cold-nightfall-57392, the team has asked: how are you firing your tracking callback currently? They can see experiment events coming through on the dataLayer, but it doesn't look like you're using the edge to write the events to the DOM
c
Hi @flaky-noon-11399, can you clarify what you mean by "using the edge to write events to the DOM"? Our current implementation:
• Edge worker (Cloudflare): handles user bucketing only.
• Browser: we use a custom `gtag` script to fire `experiment_viewed` events to GA4 when a user navigates to a target page (e.g., checkout); the call looks roughly like the sketch below.
• We are not using GrowthBook's browser SDK or any GrowthBook tracking callbacks.
Our site is a Single Page Application (Angular).
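(For reference, a hedged sketch of that browser-side GA4 call; the event and parameter names follow GrowthBook's documented GA4 convention, while `experimentId`/`variationId` are assumed variables holding the assignment the edge worker computed.)

```js
// Hedged sketch of the browser-side GA4 exposure event. experimentId and
// variationId are assumed to come from the assignment data the edge
// worker made available to the page (names illustrative).
gtag("event", "experiment_viewed", {
  experiment_id: experimentId,
  variation_id: variationId,
});
```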
f
Apologies for the delay here. I believe he means injecting information (like experiment assignments or tracking data) directly into the HTML response before it reaches the browser. This is typically done by mutating the DOM server-side, for example by injecting a fully-bootstrapped JavaScript SDK or experiment assignment data into the rendered HTML. It seems that your approach relies on the browser to handle all tracking, rather than having the edge worker pre-populate the page with experiment assignment or tracking logic. Let me cross-check with our Edge expert that that's correct, and confirm whether the solution here is for you to modify your edge worker to inject experiment assignment data or tracking scripts into the HTML response, so the browser can immediately access this information on page load.
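(For illustration, a minimal sketch of that idea using Cloudflare's HTMLRewriter; `window.__gbAssignments` is a made-up name, and the payload depends on whatever the worker computed when bucketing the user.)

```js
// Hedged sketch of "writing events to the DOM" from the edge: the worker
// injects the user's experiment assignments into the HTML response so the
// SPA can read them (and fire tracking) on any client-side route change.
function injectAssignments(response, assignments) {
  return new HTMLRewriter()
    .on("head", {
      element(el) {
        el.append(
          `<script>window.__gbAssignments = ${JSON.stringify(assignments)};</script>`,
          { html: true }
        );
      },
    })
    .transform(response);
}
```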
c
Thank you @flaky-noon-11399!
❤️ 1
f
Please can you share your edge settings / env vars?
You can DM me here with them, or respond to the Live chat thread where we spoke last time (so the Edge expert can also see them there too)
I'll message you "here" on the other thread so you can find it on your end 🙏
🙏 1
c
Received this from devs:
const gbContext = {
  enableDevMode: true,                 // allow GrowthBook DevTools to inspect this instance
  disableVisualExperiments: true,      // don't run visual editor experiments
  disableJsInjection: true,            // don't let experiments inject custom JavaScript
  disableUrlRedirectExperiments: true, // don't run URL redirect experiments
};
Is this what you asked for, @flaky-noon-11399?
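(For context, options like these are typically spread into the GrowthBook JS SDK constructor; a hedged sketch, with the apiHost and clientKey values as placeholders, not anything from this thread.)

```js
// Hedged sketch: how a context like gbContext is typically passed to the
// GrowthBook JS SDK constructor. apiHost/clientKey are placeholders.
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook({
  apiHost: "https://cdn.growthbook.io",
  clientKey: "sdk-XXXX", // placeholder
  ...gbContext,
});
```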