I am trying to figure out the segments functionali...
# announcements
o
I am trying to figure out the segments functionality. While I've managed to set up a segment successfully, I am somewhat concerned about how exactly they work. I see that users are joined with their segment data by user_id only, but user properties change over time, sometimes with every pageview.
JOIN segment s ON (s.user_id = u.user_id)
In my data source configuration, I already have some user properties that are used in the experiment analysis and are pageview-level. Wouldn't it make sense to have user properties as a part of the pageview query?
SELECT
  user_id,
  timestamp,
  url,
  has_orders,
  lang,
  country
FROM pageviews
In which case it would be much easier to calculate the estimates for an experiment based on pageviews by just filtering pageviews with certain user properties. It looks like I can get more precision by editing this query:
SELECT
  u.user_id,
  MIN(u.conversion_end) AS conversion_end,
  MIN(u.session_start) AS session_start,
  MIN(u.actual_start) AS actual_start
FROM
  __users u
JOIN
  segment s
ON
  (s.user_id = u.user_id)
WHERE
  s.date <= u.actual_start
GROUP BY
  u.user_id
When I change "<=" to a plain "=", I only join user properties that were observed exactly when the specific page was viewed, and not before that.
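To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 with made-up rows. The table layouts, sample users, and dates are assumptions for illustration only; just the join and the WHERE clause mirror the query above. Dates are stored as ISO strings so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE __users (user_id TEXT, actual_start TEXT);  -- hypothetical stand-in
CREATE TABLE segment (user_id TEXT, date TEXT);          -- hypothetical layout
INSERT INTO __users VALUES
  ('a', '2023-01-10'), ('b', '2023-01-10'), ('c', '2023-01-10');
INSERT INTO segment VALUES
  ('a', '2023-01-10'),  -- property observed exactly at the page view
  ('b', '2023-01-05'),  -- property observed before the page view
  ('c', '2023-01-15');  -- property observed after the page view
""")

def matched_users(op):
    """Return user_ids kept by the join when comparing with op ('<=' or '=')."""
    sql = f"""
        SELECT u.user_id
        FROM __users u
        JOIN segment s ON (s.user_id = u.user_id)
        WHERE s.date {op} u.actual_start
        GROUP BY u.user_id
        ORDER BY u.user_id
    """
    return [row[0] for row in conn.execute(sql)]

on_or_before = matched_users("<=")  # keeps users 'a' and 'b'
exact_only = matched_users("=")     # keeps only user 'a'
print(on_or_before, exact_only)
```

With "<=", any segment row dated on or before actual_start qualifies the user, even if the property has changed since. With "=", only a row recorded at actual_start itself does.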
f
Yes, segments are only really useful for user properties that are stable over time, not things that change per request or session. We have custom SQL filters for the experiment queries. We could add something similar for the impact estimate queries.
o
I need something that could cover scenarios like experiment sample estimates based on "users with more than 3 sessions who visited this particular page" or "users who visited this page and had more than 3 transactions". These kinds of properties change all the time but can still be called user properties (at least that's the philosophy platforms like Amplitude use).
I am using dimensions when analyzing experiment results, and it's actually beautiful how it's implemented (the value distribution between groups, in particular). I expected something similar in the estimates report, where you could filter out certain pageviews based on their dimensions.
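Roughly what I have in mind, as a sketch only: the columns come from my pageview query above, while the sample rows and the '/pricing' URL are made up for illustration, and sqlite3 is just a convenient way to run it locally. The estimate would count only the pageviews that match the chosen pageview-level properties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pageviews (
  user_id TEXT, timestamp TEXT, url TEXT,
  has_orders INTEGER, lang TEXT, country TEXT
);
-- Made-up sample rows, for illustration only
INSERT INTO pageviews VALUES
  ('a', '2023-01-01 10:00:00', '/pricing', 1, 'en', 'US'),
  ('a', '2023-01-02 11:00:00', '/pricing', 1, 'en', 'US'),
  ('b', '2023-01-01 12:00:00', '/pricing', 0, 'en', 'US'),
  ('c', '2023-01-03 09:00:00', '/home',    1, 'de', 'DE');
""")

# Filter on properties as they were at pageview time,
# then count the users and pageviews that remain.
eligible_users, eligible_pageviews = conn.execute(
    """
    SELECT COUNT(DISTINCT user_id), COUNT(*)
    FROM pageviews
    WHERE url = ? AND has_orders = 1
    """,
    ("/pricing",),
).fetchone()

print(eligible_users, eligible_pageviews)
```

Here only user 'a' counts, because 'b' had no orders when viewing the page and 'c' never visited it.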