# ask-questions
p
Hello! I've found that quite a few customers in an experiment have been assigned multiple variants (which was helpfully flagged in the UI, thank you), but in digging into it I've identified that customers are getting multiple variants if they are associated with multiple anonymous_ids. The experiment and feature are set to assign based on customer_id, so I'm quite confused as to why this is happening. Screenshots and an example in the thread.
Screenshot 2023-12-15 at 1.17.47 PM.png,Screenshot 2023-12-15 at 1.17.30 PM.png,Screenshot 2023-12-15 at 1.05.30 PM.png
f
where is the anonymous_id coming from?
p
From Rudderstack
It's an external ID that we use to stitch customers outside the product to authenticated users
f
and the customer id?
p
That's an internal ID that we generate (or call) when a user authenticates
f
I wonder if an ad blocker is causing the Rudderstack cookie to not be as sticky as your internal cookies
what percentage of your users had this?
p
About 5% currently, but it is increasing over time (likely due to more opportunities to be assigned another variant)
f
one thing you can check is the cookie lifetime
I wonder if Rudderstack's cookie is more short-lived
p
But shouldn't the Growthbook assignment not even know/care about the anonymous_id when assigning, given we've said to assign based on customer_id?
f
the fact that you're getting this multiple assignment error means that you're not doing that
are you sure that you're not assigning by anon_id and doing the report by customer_id?
p
Given the feature in product says split users by customer_id, I assumed that would mean assigning by customer_id - is there another place we should check?
Screenshot 2023-12-15 at 1.27.55 PM.png
f
on the top of the experiment reports
you should see a bar, that has some info about the experiment data
to the right of "Analysis settings"
p
That also shows logged in visitors (meaning with customer_id)
I wonder if perhaps the Growthbook assignment is being called in a race condition at login, and perhaps the customer_id isn't always available when it's called - would it default to the alternative assignment ID if the customer_id is null?
f
if you call our sdk to get a value of a feature flag that includes an experiment on an attribute which is not defined, it will return the default value, but the trackingCallback is not going to be fired
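Roughly, a minimal sketch of that behaviour with the JS SDK (an illustrative inline experiment, not your actual setup; the package and methods are the standard `@growthbook/growthbook` ones):
```typescript
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook({
  // note: customer_id is deliberately missing here
  attributes: { anonymous_id: "anon-123" },
  trackingCallback: (experiment, result) => {
    console.log("exposure", experiment.key, result.variationId);
  },
});

// Inline experiment hashed on customer_id
const result = gb.run({
  key: "example-experiment",
  variations: ["control", "treatment"],
  hashAttribute: "customer_id",
});

// Because the hash attribute is undefined, the user is excluded:
// result.inExperiment === false, result.value === "control",
// and trackingCallback never fires for this evaluation.
console.log(result.inExperiment, result.value);
```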
p
hm, strange - so it shouldn't be assigning based on the other attribute even in the absence of customer_id
f
sometimes users can get this with SDK setups like `id = some_id || another_id`
that can cause this
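For example, a setup along these lines (illustrative only, not taken from your code) will hash on whichever id happens to be defined at the time:
```typescript
import { GrowthBook } from "@growthbook/growthbook";

// Hypothetical helpers standing in for however the app exposes these ids
declare function getCustomerId(): string | undefined;
declare function getAnonymousId(): string;

const gb = new GrowthBook();

// Anti-pattern: if customer_id isn't loaded yet (e.g. during a login race),
// the hash input silently becomes the anonymous id, so the same person can
// later be bucketed into a different variant once customer_id appears.
gb.setAttributes({
  id: getCustomerId() || getAnonymousId(),
});
```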
p
I will confirm the implementation and keep you posted!
So it does not appear that we're ever sending the anonymous_id in place of or as a default or alternative for the customer_id... any other suggestions of where we might look to see why it seems to be using that alternative id?
Is it possible that because we've set both as identifiers in Growthbook (broadly, not for this experiment) it is concatenating them somewhere along the way? and then assigning based on both?
f
it might be. If you added a metric that is marked for anon_id and the experiment is on customer_id, it will use the join query
p
We've run into this issue again, and it's becoming an increasingly large concern for us. If we aren't able to consistently show a customer the same variant within an experiment the whole experiment is a bit in jeopardy. None of the metrics used include the anonymous_id, the experiment is set to target customer_id at both the experiment level and in the feature. Is there any further support that you can provide?
r
Hi, hoping to chat this through
We're still unfortunately (on another experiment) having double-ups despite it being a consistent customer_id, and the dev I'm working with isn't quite sure how to proceed.
f
Can you share the SDK implementation code?
Do you know SQL well enough to pull a list of records for the doubly exposed folks?
r
Yes have a list of the exposures, and can get the SDK implementation code one moment!
What data on the multiple exposures would be useful?
This is where we initialise Growthbook
And the specific trackingCallback set up
And for the specific experiment that's currently hitting this issue we do `useFeatureValue(growthbookTopTenCarouselKey, 'control')` - how we're loading the feature. (growthbookTopTenCarouselKey is `'explore-top-ten-carousel'`)
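For context, a generic React SDK wiring along these lines (illustrative names, keys, and component structure only, not the actual implementation from the screenshots):
```tsx
import * as React from "react";
import { GrowthBook } from "@growthbook/growthbook";
import { GrowthBookProvider, useFeatureValue } from "@growthbook/growthbook-react";

const growthbook = new GrowthBook({
  apiHost: "https://cdn.growthbook.io", // placeholder endpoint
  clientKey: "sdk-placeholder",         // placeholder client key
  trackingCallback: (experiment, result) => {
    // exposure tracking (e.g. a Rudderstack track call) would go here
  },
});

// e.g. after auth resolves:
// growthbook.setAttributes({ customer_id, anonymous_id });

const growthbookTopTenCarouselKey = "explore-top-ten-carousel";

function TopTenCarousel() {
  // Falls back to 'control' if the feature doesn't resolve to another value
  const variant = useFeatureValue(growthbookTopTenCarouselKey, "control");
  return <div>{variant}</div>;
}

export function App() {
  return (
    <GrowthBookProvider growthbook={growthbook}>
      <TopTenCarousel />
    </GrowthBookProvider>
  );
}
```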
h
The issue happens because the id that you are using as your Identifier Type in your analysis (this is determined by the Experiment Assignment Table you select for your experiment) is different from the one you're using to hash users. So if your hash attribute is the customer_id, you need to make sure that exact id is being tracked in rudderTrack and is the id your Experiment Assignment Query is pointed to on your Datasource page.
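Concretely, something like this sketch keeps the hashed id and the tracked id identical (illustrative; the event and property names are hypothetical, and `rudderanalytics.track` is the standard Rudderstack JS call):
```typescript
import { GrowthBook } from "@growthbook/growthbook";

// Ambient declaration for the Rudderstack JS SDK's track() call
declare const rudderanalytics: {
  track: (event: string, properties?: Record<string, unknown>) => void;
};

// Hypothetical id, set once the user is authenticated
const customerId = "cust-123";

const gb = new GrowthBook({
  attributes: { customer_id: customerId },
  trackingCallback: (experiment, result) => {
    // Track the same customer_id that was used as the hash attribute, so the
    // Experiment Assignment Query can select exactly this id from the event.
    rudderanalytics.track("experiment_viewed", {
      customer_id: customerId,
      experiment_id: experiment.key,
      variation_id: result.variationId,
    });
  },
});
```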
p
We have used customer_id across the metrics, and the analysis settings, as well as in the Feature set up so I don't believe this could be the cause
Screenshot 2024-03-25 at 9.14.02 AM.png
h
The `s.identity?.id` that you set as an attribute may not always be the same as what rudderTrack is setting as the `customer_id`. I wonder if there's a way to explicitly add this `s.identity?.id` to your rudderTrack to check if they ever are different from the `customer_id`. This is almost always the reason for multiple exposures.
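For example (illustrative only; the extra property name is hypothetical), the tracking callback could send both values so they can be diffed in the warehouse:
```typescript
// Ambient declarations standing in for the Rudderstack SDK and for whatever
// state object (`s`) holds the authenticated user in your app.
declare const rudderanalytics: {
  track: (event: string, properties?: Record<string, unknown>) => void;
};
declare const s: { identity?: { id?: string } };

const trackingCallback = (
  experiment: { key: string },
  result: { variationId: number }
) => {
  rudderanalytics.track("experiment_viewed", {
    experiment_id: experiment.key,
    variation_id: result.variationId,
    // Debug-only field: the exact value GrowthBook hashes on, so it can be
    // compared against the customer_id Rudderstack resolves for this event.
    growthbook_hash_id: s.identity?.id,
  });
};
```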
p
Hi again, We've recently rolled out another experiment and are seeing 29% of customers receiving multiple exposures. We've again narrowed this down to the scenario above, where Growthbook seems to be assigning different variants when the anonymous_id changes, despite a consistent customer_id. We have confirmed the setup again is all set to use customer_id, and we have also since added the `s.identity?.id` to our Ruddertrack call to ensure this is matching customer_id (and it is). We need assistance as to what else might be going wrong, as essentially we're unable to use Growthbook for experiments with this level of multiple exposures.
Is it possible to have a call to get some support for this? Or are there any other suggestions we can look at? Happy to provide examples of the events and data we have for those seeing multiple exposures if that's helpful for troubleshooting.
f
Hi Jordan - can you share the code where you set the attributes for the user?
p
f
it looks like you're using local storage - if that is disabled, users will get new anonymousIds every time
I wonder if that could do it
as you're using rudderstack, do you see any commonalities in the browser or device?
p
Anonymous ID is changing and yes does correlate to browser and device (we've seen that a new anonymous_id is generated when a customer goes from a native page to a web page that we have wrapped in the native wrapper). But my concern is that Growthbook should essentially not even care about anonymous ID as I understand it. It should only be assigning based on customer_id as that's what we've set throughout the feature and experiment - so why would changes to anonymous ID have any impact?
f
oh, you're assigning only by customer_id?
p
For this experiment (and the others we've had issues with) yes
Screenshot 2024-09-13 at 10.58.17 AM.png
f
are you using sticky bucketing?
p
How can I check that? I'm not sure
f
I don't see it
for the analysis - are you also using customer_id or another attribute?
p
just customer_id
f
Did you change the split percentage while this test has been running?
p
We have not, no - had a read on Sticky Bucketing but yeah, we have not made any adjustments since going live
(we've stopped the experiment now, but while it was live no changes)
Any other suggestions? Happy to book in a call to troubleshoot if that might help as well. We're currently running this experiment manually outside of Growthbook due to the issue 😭
f
Hi Jordan, I will chase up a follow-up with the team on this thread and we shall get back to you shortly. Thank you for your ongoing patience and apologies for any inconvenience caused.
thankyou 1
h
Hey Jordan, sorry for the trouble here. The core of the issue is that there isn't a clear reason why GrowthBook isn't assigning purely based on customer_id. To help us pin this down: Are the variations correlating with that anonymous ID? Or some other ID? In other words, is there evidence GrowthBook is assigning by some other ID that you can see? Can you share your Experiment Assignment Query/Table for the customer id logged in users option?
p
It does appear to be correlating with anonymous ID (which is a secondary ID we sometimes do use for experiments, but not in use for these). I've grabbed the query here:
```sql
SELECT customer_id
     , abtest_experiment_received_at AS timestamp
     , experiment_name AS experiment_id
     , result_variation_id AS variation_id
     , (CASE WHEN device_type ILIKE 'Tablet' THEN 'Tablet/Desktop'
             WHEN device_type ILIKE 'Desktop' THEN 'Tablet/Desktop'
             ELSE device_type END
       ) AS device
     , device_browser
     , device_os AS os
FROM smarts.reporting.rpt_abtest_event
WHERE customer_id IS NOT NULL
```
Could the fact this query has device type/browser/OS mean it's not recognising that the customer_id has already been assigned on another browser/OS?
h
No, those are just extra columns that are in your tracked event that can help with slicing and dicing. How does customer_id end up in that rpt_abtest_event table? I don't see it in your tracking callback, and we don't automatically track all of your attributes for you.
👍 1
p
That table is based on the ruddertrack event that is fired (and that does encompass a number of default identity attributes)
h
Is there a chance the customer_id being set to GrowthBook differs from the one in the track event?
Is there a way to add `s.identity.user?.id` directly to the tracking callback?
That way you can compare that value to the one in `customer_id`?
p
I believe it is the exact field we set as customer_id in the identify process for Rudderstack, but will just triple check that with the devs and come back to you!
h
Is the `anonymous_id` column in that table? Can you create an Experiment Assignment Query for that id and analyze the experiment by that ID and see what happens then?
If it is, you could also just run the following SQL directly in your data warehouse:
```sql
WITH exposures AS (
  SELECT anonymous_id
       , COUNT(DISTINCT result_variation_id) AS n_variations
  FROM smarts.reporting.rpt_abtest_event
  WHERE customer_id IS NOT NULL AND experiment_name = 'shared-lunch-in-app'
  GROUP BY anonymous_id
)

SELECT
  CASE WHEN n_variations > 1 THEN 'multiple' ELSE 'single' END AS exposures
  , COUNT(*)
FROM exposures
GROUP BY 1
```
That will show how many multiple exposures there are according to `anonymous_id`. You may want to add date filters or something if you know the range of this experiment, but it shouldn't be necessary if you only ran it once in one phase.
p
Have confirmed it is same customer_id on web - just confirming with Mobile devs as well and will let you know
h
Oh, you already did this check I think:
> we also have since added the `s.identity?.id` to our Ruddertrack call to ensure this is matching customer_id (and it is)
p
On that it is much more single variant - definitely seems to be assigning based on that ID (which also corresponds to when a user goes between native/web wrapper or devices)
Screenshot 2024-09-19 at 9.20.31 AM.png
Still a few thousand multiple, but vast majority single variant
h
Ok, way fewer multiple exposures, but still some.
p
have now also triple checked with Mobile - it's all the same id
h
Got it, thanks.
Does this experiment have multiple Phases?
Have you changed the phase dates at all?
p
No phases, and no date changes
h
Has this happened with other experiments, or just this one?
p
It has happened with other experiments, although not to this extent (we have seen up to 10% multiple exposures before though)
h
Can you share the screenshot of your SDK Attributes page and of your Identifier Types in the Data Source? I'm honestly at a bit of a loss.
Is there a second `setAttributes` call somewhere in your codebase that could somehow be overwriting this one?
p
Here's the screenshots - I'll ask the team to do a search just in case... but I don't think so
h
Ok, everything looks fine there as well.
Does `useAppSelector` do something I should be aware of?
Probably not, just really at a loss here.
My leading hypothesis is that somehow `customer_id` is being set incorrectly in the `growthbook` instance. Either because:
• `s.identity.user?.id` is somehow falling back to `anonymousId` in the `useAppSelector`, or something weird like that
• We are falling back to `anonymousId` when it's missing for some reason, maybe a race condition, but this should only happen if you have sticky bucketing and a fallback attribute set up, which it doesn't look like you do.
• Somehow the attributes aren't being set correctly or are being overridden (see the sketch below).
That's why I'm asking about other places you set attributes, or how `useAppSelector` actually provides those ids.
It seems like it's tracking alright to your warehouse since the anonymous_id multiple exposures were low, and it seems from your end that the customer_id : anonymous_id mapping is right. Do your `anonymous_id` experiments have any multiple exposure issues? Maybe just run the above query for another `experiment_name` that you know ran on `anonymous_id`?
Why is the tracking callback being set somewhere other than the place where the first screenshot is happening? What's the context for having the callback set in a different place to where the attributes are set?
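For reference, a minimal sketch of how an attribute overwrite could happen (assuming the JS SDK's `setAttributes`/`getAttributes`, where `setAttributes` replaces the whole attributes object rather than merging; ids here are placeholders):
```typescript
import { GrowthBook } from "@growthbook/growthbook";

const gb = new GrowthBook();

// Somewhere early: both ids are set
gb.setAttributes({ customer_id: "cust-123", anonymous_id: "anon-456" });

// A later call elsewhere that only knows about the anonymous id silently
// drops customer_id, because the whole attributes object is replaced...
gb.setAttributes({ anonymous_id: "anon-789" });

// ...so a safer pattern when updating a single attribute is to merge:
gb.setAttributes({ ...gb.getAttributes(), anonymous_id: "anon-789" });
```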
p
Had a look at an anonymous_id experiment and that is attached. Similar to the customer_id experiment, with nearly all single and a small % of exceptions. I'm wondering if it would be possible to organise a chat, so we can run through some of the specific examples we're seeing, and have devs on the call to answer any questions and have you take a look at the setup. Would that be possible?
f
Hi Jordan, good afternoon. A call will be handy but unfortunately, Luke will be unavailable as he is out on PTO. I'll now co-ordinate with the team who can assist here and get back to you shortly 🙏
🙌 1
thankyou 1
p
Hi, any further update or timing options for a meeting? We'd really like to get this sorted ASAP. What hours are you available (in what timezone) and we can take a look at what might suit?
r
Hi, just following up here as well as Slack seems to have dropped off. I apologise for being the squeaky wheel, but we're approaching 2 weeks since I raised this again and we're unable to run this type of experiment with any confidence currently.
Is it possible to get this picked up again?
b
Hi Jordan, August here from the Support Engineering team. We're at a loss as to what could be going on here. Both Graham and Luke have gone through all of the troubleshooting steps we are aware of. I can arrange a time to have a screen share with you, but just to level-set, it's likely that we will use the call to gather information and unlikely that we can get this resolved live during the call. What is your availability for the rest of this week? Most of our team is in the US Pacific time zone.