# ask-questions
c
Hi, we're investigating a case of segment ratio mismatch (SRM). When we segment users randomly, the resulting segments do not pass a chi-squared test, so there is significant SRM.
How we do it: we generate a random value, store it in a cookie, and on every request evaluate experiments using the cookie value.
What we have observed so far is that the shorter the segmentation attribute value is, the more prone it is to SRM. We think this happens because the string value is iterated and each character is added into the resulting hash value, so a shorter string means fewer of those operations.
We've changed the attribute values to 32 base-62 symbols (before, it was a random int in the range [1;100000)). That improved things a bit, but we still measure significant SRM.
Have there been any known issues, or is there any guidance on this? Has any consideration been given to the effect the experiment key might have on the hash value? The key is always prepended to the ID attribute, so it might introduce some bias. Any chance the experiment key (a GUID value) is introducing the bias?
Thanks in advance.
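For anyone following along, the chi-squared check described above can be sketched roughly as follows. This is a minimal illustration, not the check the customer actually ran: the function name `chiSquared` and the counts are hypothetical, and the hard-coded threshold 3.841 is the 5% critical value for one degree of freedom, so it only applies to a two-segment split.

```go
package main

import "fmt"

// chiSquared returns the goodness-of-fit statistic for observed segment
// counts against the intended traffic weights (weights must sum to 1).
func chiSquared(observed []int, weights []float64) float64 {
	total := 0
	for _, o := range observed {
		total += o
	}
	stat := 0.0
	for i, o := range observed {
		expected := float64(total) * weights[i]
		diff := float64(o) - expected
		stat += diff * diff / expected
	}
	return stat
}

func main() {
	// Hypothetical counts for an intended 50/50 split.
	stat := chiSquared([]int{5000, 5200}, []float64{0.5, 0.5})
	// 3.841 is the 5% critical value for 1 degree of freedom (two segments).
	fmt.Printf("chi2=%.3f srm=%v\n", stat, stat > 3.841)
}
```

With more than two segments the degrees of freedom change, so the threshold would need to come from a chi-squared table or a statistics library rather than the constant used here.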
s
I think I understand your process but need some more detail. For experiment assignment, are you using the SDK? If so, which language/version?
c
Yes, the SDK, both Go and JS/TS. Latest versions, I think. Experiments are definitely evaluated using the v2 hashing function.
s
Thanks for the follow up. I'm going to share with the team, but do you have any way we can easily replicate?
Specifically, some of the IDs, experiment seeds, and how you tested.
c
Coming back to this topic. I took the implementation of hashing v2 from the JS SDK and reimplemented it in Go. I ran a bunch of experiments and found an obvious correlation between string value length and segment ratio mismatch: the shorter the value, the more prone it is to bias.
What I was also finally able to discover is that, most likely, something is causing the _ga cookie to be removed from the browser while our custom first-party cookie that stores the segmentation value remains. Particular clients then end up reusing the segmentation value under different Google Analytics client IDs, leading to bias that's more prominent on iOS.
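A replication along the lines described above might look something like the sketch below. It rests on assumptions that are not confirmed in this thread: the hash is modelled as a double pass of 32-bit FNV-1a with the experiment key prepended to the ID, which is only a stand-in for the SDK's v2 function, and the names `fnv32a`, `bucket`, `splitCounts`, and the experiment key are hypothetical. It splits the integer IDs from the old [1;100000) range into two equal segments and prints the counts with the chi-squared statistic.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// fnv32a hashes a string with the standard library's 32-bit FNV-1a.
func fnv32a(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// bucket maps (key, id) to a value in [0, 1).
// ASSUMPTION: key prepended to id, then a double FNV-1a pass. This is a
// stand-in for the SDK's v2 hash, not a copy of it.
func bucket(key, id string) float64 {
	first := fnv32a(key + id)
	second := fnv32a(strconv.FormatUint(uint64(first), 10))
	return float64(second%10000) / 10000
}

// splitCounts assigns the integer IDs in [1, n) to two equal segments.
func splitCounts(key string, n int) [2]int {
	var counts [2]int
	for i := 1; i < n; i++ {
		if bucket(key, strconv.Itoa(i)) < 0.5 {
			counts[0]++
		} else {
			counts[1]++
		}
	}
	return counts
}

func main() {
	// Hypothetical experiment key; the IDs mirror the old [1;100000) range.
	counts := splitCounts("hypothetical-experiment-key", 100000)
	total := float64(counts[0] + counts[1])
	diff := float64(counts[0]) - total/2
	// With two equal expected counts, both terms of the statistic are equal.
	chi2 := 2 * diff * diff / (total / 2)
	fmt.Printf("counts=%v chi2=%.3f\n", counts, chi2)
}
```

Swapping the ID generator for longer random strings and repeating the run across several experiment keys would be one way to compare short and long attribute values under the same test.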
s
Thanks for the follow up. We're investigating the bias issue on our end. The cookie issue is different, of course. Have you been able to find out the cause of its removal?
c
No, but I suspect it has to do with Safari's privacy features.
s
Requested access. Thanks
c
Asking out of curiosity: have you reached any conclusions on this?
s
Sorry, yes. We took an initial look and ran our own tests, but we didn't see any issues with biases/SRM. I was going to take another pass before getting back to you but that was our preliminary conclusion.