blue-agent-15067 07/05/2023, 9:58 AM
• I have a
that is generated once some step is done
is the key used to identify conversions
• so i set up a join table like
here are the doubts:
• Is there a smart way that I am missing to handle repetition? example:
1. I am on device 1 (so I have
SELECT client_id, lead_id FROM table_relation
) and generate a
2. then I log in on device 2 (so
so same lead_id from point 1
3. then i do a binomial conversion with my
4. when doing analysis, I would have for both
a conversion (because both are linked to
• Is there a way to label these phenomena in analysis (ideally, I would also like to count distinct conversions)?
• Do I have to create some attribution logic to handle it? e.g. only last
helpful-application-7107 07/05/2023, 10:48 PM
mapping to one set of conversions, and you randomize on
, then you are potentially pooling together conversions because both
, who is potentially in your treatment, and
who is potentially in your control, are both getting the same conversion values.
• Is there a way to label these phenomena in analysis (ideally, I would also like to count distinct conversion)?
What do you mean by "label"? There's no way for any analysis system to attribute conversions to specific clients if you only give it conversion data with
. You would need to analyze at the
level, which would involve tracking conversions by clients. If many users are using multiple clients, then you may have some issue with spillovers, but these can be very hard to track. However, unless you have many, many users with multiple clients, I would consider this strategy instead.
Do I have to create some attribution logic to handle it? e.g. only last
I'm not sure what exactly you are proposing here, but if you say more then maybe there could be some middle ground solution we could discuss.
if multiple...
blue-agent-15067 07/06/2023, 7:36 AM
across the dataset to get a sense of the spillover. Ideally, if I have a high share of "count > 1", the spillover could be worrying.
Regarding the attribution, it is something I would not consider, but I would like to know your thoughts about it: basically, in the join table you create some logic to exclude all the multiple occurrences of
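A minimal sketch of that count, assuming it means "number of distinct `client_id` values per `lead_id`" on the `table_relation` join table from earlier in the thread. The sample rows are made up for illustration, and SQLite is used only so the query runs self-contained:

```python
import sqlite3

# In-memory stand-in for the join table; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_relation (client_id TEXT, lead_id TEXT)")
conn.executemany(
    "INSERT INTO table_relation VALUES (?, ?)",
    [
        ("client_1", "lead_a"),  # device 1
        ("client_2", "lead_a"),  # device 2, same lead
        ("client_3", "lead_b"),  # single-device lead
    ],
)

# Number of distinct client_ids attached to each lead_id.
per_lead = conn.execute(
    """
    SELECT lead_id, COUNT(DISTINCT client_id) AS n_clients
    FROM table_relation
    GROUP BY lead_id
    ORDER BY lead_id
    """
).fetchall()

# Share of lead_ids linked to more than one client_id ("count > 1").
multi_share = sum(1 for _, n in per_lead if n > 1) / len(per_lead)
print(per_lead, multi_share)
```

A high `multi_share` would suggest that many conversions can be credited to more than one `client_id`, which is the spillover concern discussed above.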
(associated to multiple
) with some timestamp logic (e.g. I give 100% of the credit for the conversion to the last, or first, relationship
based on the timestamp)
client_id_n <-> lead_id_x
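A rough sketch of what such a timestamp rule could look like, assuming the join table also carries a timestamp for when each relationship was created. The `linked_at` column and the sample rows are hypothetical and not part of the table described in the thread:

```python
import sqlite3

# In-memory stand-in for the join table, with a hypothetical linked_at column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_relation (client_id TEXT, lead_id TEXT, linked_at TEXT)"
)
conn.executemany(
    "INSERT INTO table_relation VALUES (?, ?, ?)",
    [
        ("client_1", "lead_x", "2023-07-01 09:00:00"),
        ("client_2", "lead_x", "2023-07-03 18:30:00"),
        ("client_3", "lead_y", "2023-07-02 12:00:00"),
    ],
)

# Last-touch: keep only the most recent client_id <-> lead_id relationship
# for each lead_id. Swap MAX for MIN to get first-touch instead.
last_touch = conn.execute(
    """
    SELECT r.client_id, r.lead_id
    FROM table_relation AS r
    JOIN (
        SELECT lead_id, MAX(linked_at) AS last_linked_at
        FROM table_relation
        GROUP BY lead_id
    ) AS m
      ON m.lead_id = r.lead_id
     AND m.last_linked_at = r.linked_at
    ORDER BY r.lead_id
    """
).fetchall()
print(last_touch)
```

With this rule each `lead_id` maps to a single `client_id`, so its conversion is counted once. Two relationships with identical timestamps would both survive the join, so a tie-breaker would be needed in that case.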