Skip to main content

Modeling your data with dbt

Unleash the power of your behavioral data
If you're looking for more guided approaches that contains information about tracking and modeling your data, check out all our Data Product Accelerators!
πŸ‘‰ Take me there! πŸ‘ˆ

dbt enables analytics engineers to transform data in their warehouses by simply writing select statements. Snowplow has written and maintain a number of dbt packages to model your snowplow data for various purposes and produce derived tables for use in analytics, AI, ML, BI, or reverse ETL tools.

Using Snowplow's dbt packages means you can draw insight and value from your data quicker, easier, and cheaper than building your own modeling from scratch.

Snowplow Data Modeling PackagesSnowplow Data Modeling Packages

To setup dbt, Snowplow open source users can start with the dbt User Guide and then we have prepared some introduction videos for working with the Snowplow dbt packages.

For Snowplow BDP customers, dbt projects can be configured and scheduled in the console meaning you can get started running dbt models alongside your Snowplow pipelines.

Snowplow dbt Packages

Our dbt packages come with powerful built-in features such as an optimization to the incremental materialization to save you cost on warehouse compute resources compared to the standard method, a custom incremental logic to ensure we process just the required data for each run and keep your models in sync, plus the ability to build your own custom models using both of these!

There are 5 core snowplow dbt packages:

The Snowplow Media Player package is designed to be used with the Snowplow Web package and not as a standalone package.

Each package comes with a set of standard models to take your Snowplow tracker data and produce tables aggregated to different levels, or to perform analysis for you. You can also add your own models on top, see the page on custom modules for more information on how to do this.

In addition to the other packages, there is a Normalize package that makes it easy for you to build models that transform your events data into a different structure that may be better suited for downstream consumers.

The supported data warehouses per version can be seen below:

snowplow-web versiondbt versionsBigQueryDatabricksRedshiftSnowflakePostgres
0.16.0>=1.5.0 to <2.0.0βœ…βœ…βœ…βœ…βœ…^
0.15.2>=1.4.0 to <2.0.0βœ…βœ…βœ…βœ…βœ…^
0.13.3*>=1.3.0 to <2.0.0βœ…βœ…βœ…βœ…βœ…
0.11.0>=1.0.0 to <1.3.0βœ…βœ…βœ…βœ…βœ…
0.5.1>=0.20.0 to <1.0.0βœ…βŒβœ…βœ…βœ…
0.4.1>=0.18.0 to <0.20.0βœ…βŒβœ…βœ…βŒ

^ Since version 0.15.0 of snowplow_web at least version 15.0 of Postgres is required, otherwise you will need to overwrite the default_channel_group macro to not use the regexp_like function.

* From version v0.13.0 onwards we use the load_tstamp field so you must be using RDB Loader v4.0.0 and above, or BigQuery Loader v1.0.0 and above. If you do not have this field because you are not using these versions, or you are using the Postgres loader, you will need to set snowplow__enable_load_tstamp to false in your dbt_project.yml and will not be able to use the consent models.

dbt Version Compatibility Checker​

You may be using other dbt packages that require you to use a specific version of dbt, or a specific version of the dbt_utils package. Or you simply may not wish to upgrade your existing setup.

Here’s a way to check the latest version of our packages you can install for a given setup. Simply enter your dbt version and (optionally) dbt_utils version below.

Please enter a valid dbt version.
Was this page helpful?