Hi, I'm new to Growthbook and I'm just trying to a...
# ask-questions
r
Hi, I'm new to Growthbook and I'm just trying to add a metric or filter to a Fact Table I created, but can't seem to do either. Here's a screenshot of the failure to add a simple filter. I can add a metric to the raw databricks soruce table, but when I try to convert to the imported fact table it just errors. What's the syntax here for using the fact table by name?
Copy code
SELECT
  user_id,
  anonymous_id,
  timestamp,
  1 as value
FROM
    main.analytics.events -- replace this with a fact table called AnalyticsEvents
    where event in ('Application Opened')
f
Hi Jason, can you open the developer console and see if there is a more descriptive error?
r
I did, and unfortunately there's not -
Copy code
{
  "status": 400,
  "message": "The operation failed due to an error",
  "errorId": "36143d91749c45368a4e69013ba464ad"
}
f
let me check with the team - you are using the cloud right?
r
yes
f
Can you double check the connection to Databricks?
I’m wondering if the connection to Databricks is failing and causing this
r
I can definitely query databricks through here
The "test query" button for the fact table works and returns data
Copy code
SELECT
user_id, timestamp as client_ts, received_at as timestamp, anonymous_id,
  event, *
FROM
    main.analytics.events
where received_at > '2023-10-16'
*Sample 5 Rows* Succeeded in 3372ms
Do you know the answer to the 2nd part, how to refer to the fact tables in other parts of growthbook
I'm not sure why the schema browser isn't working though. It's retuning a 404 rather than a permission denied error
I tried creating a new databricks datasource connection, same issue. I can read and preview data, but schema browser doesn't work
f
@future-teacher-7046 thoughts?
r
@fresh-football-47124 or @future-teacher-7046 any updates here? We're a new customer trying to get this integration off the ground for a priority project and I'm blocked at the moment on moving forward while we're unable to use this data source
f
Do you know if you're using the Unity Catalog in Databricks? I think we might require that in order for our schema browser to work
r
Yes, we are
I set that up myself
f
hmm. I wonder if it's a permissions issue with the database user? We just run pretty standard
information_schema
queries to populate our schema browser
r
The principal I created has reader perms on the whole catalog
Is there a way to get a more detailed error message? Right now it's just saying
The operation failed due to an error
f
let me check our Cloud logs and see if I can see any more details
I'm not seeing anything in our logs. Have you tried creating a filter recently?
r
Several times on Friday, I can try again now
Yep, same error
Copy code
{
  "status": 400,
  "message": "The operation failed due to an error",
  "errorId": "b73ff6ddd391423db1d536440392aaf5"
}
f
yeah, I see it now, but unfortunately, can't see any additional info on my end. We're logging the entire Databricks error message
r
That's the entire error from databricks?
What query is it trying to execute?
f
We wrap the query in the CTE and add a LIMIT clause. So something like this:
Copy code
WITH _table as (
  SELECT timestamp from ([factTable.sql]) t WHERE [filter.sql]
)
SELECT * FROM _table LIMIT 5
r
Do you have any suggestions I can try to get this working?
f
It looks like our DataBricks SDK version is out-of-date and they have improved error handling code since then. We can try updating that on our end and see if it helps.
In the meantime, you should be able to skip using Fact Tables for now and just define metrics directly.
r
If I wasn't using a fact table (that does some high level filtering and column aliasing), I'd have to recreate that in each of the dimensions, metrics, etc or is there an easier way
f
yeah, you'd have to copy/paste that same SQL for every metric. That's the exact thing Fact Tables is trying to solve, but it's still in Beta and has some rough edges, which I think you're running into
r
Well, I found out why the fact table stuff wasn't working, I had
select user_id, foo as timestamp, *
which somehow worked in the preview and save for the fact table SQL, but weren't valid SQL due to duplicate column names. Luckily databricks has (
select * except (...)
) so this is now working with the fact table.
Odd that the schema browser doesn't work though
f
ah, so I guess it wasn't an error until we explicitly tried to select one of the duplicate column names.
r
Well it must have run the sql when the fact table was defined because it showed preview data
f
I believe for that, we run a
SELECT *
and we don't explicitly do
SELECT timestamp, ...