# ask-questions
p
Hi team, I have run A/A tests a few times now and keep running into the issue that they become statistically significant (lost or won). As far as I understand, this should not happen, right? I set them up using the Visual Editor so that control and variation each log a different message to the console. Everything else is equal. Here's the JSON object loaded by the website:
```json
{
  "status": 200,
  "features": {},
  "experiments": [
    {
      "key": "test_aa_v2",
      "status": "running",
      "variations": [
        {
          "css": "",
          "js": "console.log(\"Version0\");",
          "domMutations": []
        },
        {
          "css": "",
          "js": "console.log(\"Version1\");",
          "domMutations": []
        }
      ],
      "hashVersion": 2,
      "hashAttribute": "deviceId",
      "urlPatterns": [
        {
          "include": true,
          "type": "regex",
          "pattern": ".*\\/products\\/.*"
        }
      ],
      "weights": [
        0.5,
        0.5
      ],
      "filters": [],
      "seed": "test_aa_v2",
      "phase": "0",
      "coverage": 1,
      "meta": [
        {
          "key": "0"
        },
        {
          "key": "1"
        }
      ]
    }
  ],
  "dateUpdated": "2023-09-18T16:50:28.162Z"
}
```
This test runs on PDPs (product detail pages) and I just compare how often the ATC (add-to-cart) button is clicked, which statistically should be very even across a high number of events. Any ideas why the results lean to one side?
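As an aside for anyone reproducing this setup: assuming the GrowthBook JS SDK's inline-experiment run() API (a sketch only, not the exact Visual Editor code path, and with a hypothetical deviceId), the assignment implied by that payload is deterministic per device:

```ts
// Minimal sketch, assuming the GrowthBook JS SDK; the deviceId is hypothetical and
// the experiment fields mirror the payload above (Visual Editor experiments are
// normally applied automatically, so this is only for illustration).
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook({
  attributes: { deviceId: "device-123" },
});

const result = gb.run({
  key: "test_aa_v2",
  variations: ["Version0", "Version1"],
  weights: [0.5, 0.5],
  coverage: 1,
  hashAttribute: "deviceId",
  hashVersion: 2,
  seed: "test_aa_v2",
});

// The same deviceId always hashes to the same 50/50 bucket for this key/seed.
console.log(result.inExperiment, result.variationId, result.value);
```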
f
Hi Sasha
sorry for the delay
Let me follow up with the team, @helpful-application-7107, do you have thoughts?
👍 1
h
Hi, there's a long discussion of A/A tests here: https://linen.growthbook.io/t/13142527/hi-growthbook-team-having-the-same-issue-as-i-described-here#76332a3f-3c02-4ee5-b01a-b3de79fdc260 I gave answers as helpful-application-7107 in that thread. Please take a look and let me know if you have follow-up questions.
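For readers who don't follow the link: the core point in that thread is that a fraction of perfectly healthy A/A tests will still cross a significance threshold by chance. A minimal, self-contained simulation of that effect is sketched below; it uses a plain two-proportion z-test with a single look, not GrowthBook's actual statistics engine, so the numbers are only illustrative:

```ts
// Simulate many A/A tests with identical click rates and count how often a
// naive two-proportion z-test calls them "significant" at alpha = 0.05.
// (Illustrative sketch only; not GrowthBook's statistics engine.)

// Abramowitz-Stegun approximation of the error function (|error| < 1.5e-7).
function erf(x: number): number {
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
      0.284496736) * t + 0.254829592) * t;
  return 1 - poly * Math.exp(-x * x);
}

// Two-sided p-value for the difference between two observed proportions.
function zTestPValue(c1: number, n1: number, c2: number, n2: number): number {
  const pooled = (c1 + c2) / (n1 + n2);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  const z = (c1 / n1 - c2 / n2) / se;
  const cdf = 0.5 * (1 + erf(Math.abs(z) / Math.SQRT2));
  return 2 * (1 - cdf);
}

const trueRate = 0.1;       // same add-to-cart rate in both arms
const usersPerArm = 10_000;
const runs = 1_000;
let falsePositives = 0;

for (let i = 0; i < runs; i++) {
  let c1 = 0;
  let c2 = 0;
  for (let u = 0; u < usersPerArm; u++) {
    if (Math.random() < trueRate) c1++;
    if (Math.random() < trueRate) c2++;
  }
  if (zTestPValue(c1, usersPerArm, c2, usersPerArm) < 0.05) falsePositives++;
}

// Expect roughly 5% of these A/A tests to look "significant" with a single look;
// peeking at the results repeatedly over time pushes that rate higher.
console.log(`false positive rate ~ ${(falsePositives / runs).toFixed(3)}`);
```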
p
Thanks for the link to the discussion. I understand the general idea that A/A tests can become significant, but it still feels "off". I ran another A/A test for the same scenario (A/A, just with console.log()) and it shows the same results. I'll let it keep running to see what happens, but -20% at 99.7% significance on the losing side feels quite strong.
h
Hmm, having two tests with results this strong, both trending in the same direction, does make me a little suspicious as well. Let me check in with the team that owns the Visual Editor.
👍 1
What happened in the v1 that you ran?
p
It had multiple-exposure issues, so I removed it
I restarted the test as v3 in a new phase (I deleted the previous phase). It looks better, but still extreme:
[screenshot of results]
h
Did you choose to re-randomize? New phases can suffer from carry-over bias if you don't re-randomize.
p
Sorry, what do you mean by re-randomize?
h
Hmm, maybe that isn't an option in the Visual Editor at the moment. Did you start the new phase by clicking the three dots in the far top right and editing the phases on the Experiment page?
p
Yes, I deleted phase 1, which reset the experiment to "draft"
h
I see.
So that might not give you a fully new test, as it will not re-randomize devices, and past user behavior could influence who enters the new test. The safest thing is to re-run the A/A test with a new experiment key. If you want to set that up while I get some more input from our side, that could be informative.
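To make the "new experiment key" suggestion concrete, here is a small sketch, assuming the GrowthBook JS SDK and a hypothetical deviceId: with the same key (and therefore the same default hash seed) a device always lands in the same bucket, so a new phase reuses the old split, while a new key gives an independent draw.

```ts
// Minimal sketch, assuming the GrowthBook JS SDK; deviceId and keys are hypothetical.
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook({ attributes: { deviceId: "device-123" } });

// Same key (and therefore same default seed): the device hashes to the same
// bucket every time, no matter how many phases the experiment goes through.
console.log(
  gb.run({ key: "test_aa_v2", variations: [0, 1], hashAttribute: "deviceId" }).variationId,
  gb.run({ key: "test_aa_v2", variations: [0, 1], hashAttribute: "deviceId" }).variationId
);

// A new experiment key re-hashes the device: a genuinely fresh randomization.
console.log(
  gb.run({ key: "test_aa_v3", variations: [0, 1], hashAttribute: "deviceId" }).variationId
);
```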
p
Sure, I'll set that up.
OK, recreated the experiment from scratch - will send you an update tomorrow
h
Any update here?
p
Yes, it looks much better now. My takeaway is to create new experiments rather than deleting phases and starting over with existing experiments. Thanks for the help, Luke!
h
Yeah, a new phase does not re-randomize unless you explicitly choose to do so (only available in some flows), so it will be just as "unlucky" as the previous draw for A/A experiments, and in real experiments it can cause carry-over bias.
b
Ah, I wonder if that's why I am seeing weird values for my experiments? I have an existing A/B experiment, which I modified to use a namespace, and I created an A/A experiment in the other half of the namespace. About half of the users are in the A/B experiment and they ALL have the associated feature set to 'true'. The other half are in the A/A experiment and their associated feature is 'false'.
I'm using the SDK, NOT the visual editor.
If it is this issue, @helpful-application-7107, do I need to create new features as well, or only new experiments using the existing features? Thank you!
h
> I have an existing A/B experiment, and I modified it to use a namespace
This is an unsafe operation, and it can be difficult to make guarantees about how the user experience will change over time. If we don't already, we should make it clear in the app that there isn't a safe way to do this without creating new features/re-randomizing.
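For anyone untangling the same setup: a namespace in the SDK is an [id, rangeStart, rangeEnd] tuple, and two experiments that share the id but cover disjoint ranges split the traffic between them. Below is a hedged sketch, assuming the GrowthBook JS SDK with hypothetical keys and deviceId; the takeaway is that adding a namespace to an experiment that is already running changes which devices are even eligible for it, which is why it calls for new features or re-randomization.

```ts
// Minimal sketch, assuming the GrowthBook JS SDK; keys and deviceId are hypothetical.
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook({ attributes: { deviceId: "device-123" } });

// The original A/B experiment, now restricted to the first half of the namespace.
const ab = gb.run({
  key: "existing_ab_test",
  variations: [false, true],
  hashAttribute: "deviceId",
  namespace: ["checkout", 0, 0.5],
});

// The new A/A experiment in the other half of the same namespace.
const aa = gb.run({
  key: "new_aa_test",
  variations: [false, false],
  hashAttribute: "deviceId",
  namespace: ["checkout", 0.5, 1],
});

// A given device can be in at most one of the two experiments; devices that
// were in the A/B test before the namespace was added may now fall outside it.
console.log(ab.inExperiment, aa.inExperiment);
```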