Hi all! I'm trying to do a count distinct metric. ...
# announcements
b
Hi all! I'm trying to do a count distinct metric. We're trying to measure "unique user follows" aka, for a given user in a control/treatment group in this experiment, how many new, unique follows do they add w/in the experiment time window. There's probably a way to smooth out repeat follows that gets close-ish to a count distinct (e.g. some kind of rolling window that grabs the first follower/followee event in a given 7-day span or something similar). But we also have another use case for this type of metric where that kind of pre-growthbook modeling wouldn't work so well. Any ideas about how to do this in growthbook? And/or I'd love to add a feature request for this type of metric!
f
Hi Quinn. Are you trying to de-duplicate multiple of the same follow during an experiment? Or are you trying to exclude follows that are repeats of ones done before the experiment started?
b
Multiples of the same follow during the experiment, most of which are from people following/unfollowing/refollowing in quick succession.
I think we have several use cases for this though where we conceivably could do some kind of sorta-complicated smoothing/modeling before the data gets to growthbook but it'd be annoying & maybe compute-intensive, and a count distinct over the whole experiment period would make it way easier
The other one I can think of off the top of my head is that we like to measure "distinct chat-channels a user sent a message in", which isn't as conducive to a simple "de-dupe over x time period / rolling window" kind of solution, bc a user might be jumping around and posting in those different chat channels across different time scales.
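To make the difference concrete, here is a minimal Python sketch comparing a true count distinct over the whole experiment window with a rolling-window de-dupe. The event tuples, the 7-day window, and both function names are hypothetical and only for illustration; this is not how GrowthBook computes anything.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical events: (user_id, channel_id, timestamp)
events = [
    ("u1", "general", datetime(2023, 1, 1)),
    ("u1", "random", datetime(2023, 1, 2)),
    ("u1", "general", datetime(2023, 1, 10)),  # repeat, more than 7 days later
    ("u2", "general", datetime(2023, 1, 3)),
]

def count_distinct(events):
    """Distinct channels per user over the whole period."""
    seen = defaultdict(set)
    for user, channel, _ in events:
        seen[user].add(channel)
    return {user: len(channels) for user, channels in seen.items()}

def count_with_window_dedupe(events, window=timedelta(days=7)):
    """Count an event unless the same (user, channel) pair was
    already counted less than `window` ago (assumed de-dupe rule)."""
    last_counted = {}
    counts = defaultdict(int)
    for user, channel, ts in sorted(events, key=lambda e: e[2]):
        key = (user, channel)
        if key not in last_counted or ts - last_counted[key] >= window:
            counts[user] += 1
            last_counted[key] = ts
    return dict(counts)

distinct_counts = count_distinct(events)
windowed_counts = count_with_window_dedupe(events)
```

With this sample data the windowed version counts the "general" channel twice for `u1`, because the repeat falls outside the 7-day window, while the count distinct does not.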
f
I think this is almost possible in GrowthBook. We let you specify the per-user aggregation (defaults to `SUM(value)`). You could in theory do something like `COUNT(distinct follow_id)` instead. However, we currently don't let you add custom columns like `follow_id`. I think adding support for that would be fairly simple though.
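As a rough illustration of the two per-user aggregations mentioned here, the sketch below runs both against an in-memory SQLite table. The `follows` table and its rows are made up for the example, and the query is not the SQL that GrowthBook actually generates; it only shows how `SUM(value)` and `COUNT(DISTINCT follow_id)` differ when a user repeats the same follow.

```python
import sqlite3

# Hypothetical table: one row per follow event, value = 1 per event.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE follows (user_id TEXT, follow_id TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO follows VALUES (?, ?, ?)",
    [
        ("u1", "a", 1),
        ("u1", "a", 1),  # u1 unfollowed and re-followed "a"
        ("u1", "b", 1),
        ("u2", "a", 1),
    ],
)

# Compare the default-style aggregation with a count distinct.
rows = conn.execute(
    """
    SELECT user_id,
           SUM(value)                AS total_follows,
           COUNT(DISTINCT follow_id) AS unique_follows
    FROM follows
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

for user_id, total_follows, unique_follows in rows:
    print(user_id, total_follows, unique_follows)
```

Here `u1` has three follow events but only two unique follows, which is the gap the count distinct aggregation is meant to close.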
b
Ah cool! Where can I change the per-user aggregation? I don't see it in the "edit metric" view. EDIT: Ah, found it! You have to switch to the "Query Builder" in Edit metric > Query Settings.
Aha okay so I think you do support this right now actually, right?(!) Haha had just never poked around in the "query builder" screen before.
I believe this does what I want it to do, which is count the distinct users-followed w/in the experiment's time-frame