# give-feedback
w
I can't mark an experiment as "flat", only as "inconclusive", which is not the same: https://chatgpt.com/share/68d56dce-7788-800c-8528-6a60b10a1fd0
s
Thanks for the feedback. I can't access the ChatGPT convo, but as far as I understand, inconclusive is the larger category: it includes flat results (where metrics don't move in either direction) and divergent results (where some metrics are up and others are down). What benefit are you looking for in being able to mark results as flat?
w
Let me share the convo:

The distinction is about why an experiment doesn’t give you a clear, actionable outcome:

Inconclusive experiment
• The experiment was run, but the data doesn’t provide enough evidence to favor one hypothesis over another.
• Causes:
  ◦ Too little traffic/sample size (low statistical power).
  ◦ High variance in the data.
  ◦ Poorly chosen metrics (not sensitive enough).
• Interpretation: You can’t say whether the treatment works or not; you need more or better data.

Flat experiment
• The experiment ran with enough power, and the metrics are stable, but the treatment effect is essentially zero.
• Causes:
  ◦ The tested change genuinely doesn’t move the measured outcomes.
  ◦ The feature might not impact the chosen metric (or at all).
• Interpretation: The experiment confirms the null hypothesis: there’s no meaningful difference between control and variant.

In short:
• Inconclusive = lack of evidence (data issue).
• Flat = clear evidence of no effect (treatment issue).

Do you want me to also show you how these are usually treated differently in decision-making frameworks (e.g., whether to re-run vs. discard the idea)?
b
I'll go ahead and call a bit of AI hallucination on this one. "Inconclusive" also applies when you've run a good test.
Proving that something is truly "flat" is practically impossible. So you'd have to define what "flat" means before you can demonstrate it with a test.
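One way to make "define flat first" concrete is to agree on a margin up front and only call a result flat when the whole interval for the lift sits inside it. The sketch below is a rough illustration under that assumption, not how GB labels results: `Interval`, `classify`, and `flat_margin` are hypothetical names, and the lift is assumed to be relative (0.02 means +2%).

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """Confidence (or credible) interval for the relative lift."""
    lower: float
    upper: float


def classify(ci: Interval, flat_margin: float) -> str:
    """Label a finished experiment given a margin agreed on before the test.

    flat_margin is a hypothetical parameter: the largest absolute lift
    the team is willing to treat as "no meaningful effect".
    """
    if flat_margin <= 0:
        raise ValueError("flat_margin must be positive")
    if ci.lower > 0:
        return "won"   # whole interval above zero
    if ci.upper < 0:
        return "lost"  # whole interval below zero
    if -flat_margin <= ci.lower and ci.upper <= flat_margin:
        return "flat"  # interval spans zero but stays inside the margin
    return "inconclusive"  # interval spans zero and is too wide to rule out a real effect


print(classify(Interval(-0.004, 0.006), flat_margin=0.01))
print(classify(Interval(-0.03, 0.05), flat_margin=0.01))
```

With a 1% margin, the first call comes out as flat and the second as inconclusive, because the second interval still leaves room for a lift of several percent in either direction. Note that this sketch checks for a win or loss first, so a tiny but clearly positive lift is still labeled "won"; a team could reasonably order those checks differently.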
w
Looks like there's no consensus from what I'm seeing 🙂
I still like to think of them as being different, though
b
How are they different for you? Under what circumstances?
w
Inconclusive: we stopped the test early because it had a bug or whatever
Flat: we reached the sample size and it didn't win or lose
(as an example)
b
Well there is a "stopped did not finish" designation in GB.
w
That's right!
I forgot about it