dbt Integration
Overview
Hashboard can import models directly from dbt by specifying measures in your dbt yaml file. Hashboard will automatically sync dbt documentation and keep measures in sync with downstream dashboards (whether they are checked into version control or not). The Hashboard dbt integration can also be integrated in CI/CD workflows.
Configure your dbt_project.yml
You can use the glean-defaults
key to specify the glean version (which is required) and connection information.
models:
+meta:
# glean-defaults settings will be merged into each Hashboard model
glean-defaults:
glean: "1.0"
# preventUpdatesFromUI defaults to true, but can be set to false
preventUpdatesFromUI: false
source:
connectionName: bigquery-prod # the name of the conneciton in Hashboard
Define model information
The simplest example of a Hashboard model just defines a measure. Hashboard models must include the glean
key in the dbt model.
version: 2
models:
#
- name: order_attribution
description: This model describes how each order is attributed to different marketing touchpoints.
columns:
- name: attribution_id
description: Unique ID for the attribution.
- name: order_id
description: Order ID, matches the order_id in the sales model.
- name: touchpoint_id
description: ID of the marketing touchpoint which led to the order.
meta:
glean:
cols:
- id: row_count
type: metric
name: Order Attribution Records
aggregate: row_count
Hashboard previously used the term metric to refer to what is now called a
measure. At the moment our configuration files still refer to type: metric
when creating measures inside a data model. We plan to update the
schema of these configuration files in the future to use the term measure
instead of metric.
Build your Hashboard models
This is the simplest way to preview your models, there are more advanced options below.
# assuming you've already built your tables with dbt run
hb preview --dbt
# and later
hb deploy --dbt
Model specification
The glean-defaults
connection configuration in the example above gets merged into the Hashboard configuration inside each dbt model. If you don't specify the defaults, you must add the complete connection information in the meta
key of each dbt model you want imported to Hashboard. Only dbt models that specify a glean
key will get built into Hashboard models.
version: 2
models:
- name: a_model
description: model docs here
meta:
glean:
glean: "1.0" # Hashboard dbt integration version
source:
connectionName: snowflake_production # which Hashboard data connection to use, can also be specified globally if you use glean-defaults
columns:
- name: a_str
description: some column documentation
# ... your dbt columns
Some notes on how models get built from dbt:
- Hashboard will read the dbt manifest file and find all models with a
glean
key specified within themeta
key. - The name of dbt model is imported into Hashboard as the model name (this can be overriden).
- Columns specified explicitly in the dbt model will become attributes in Hashboard.
- Descriptions on the model will get imported as model documentation.
- You can specify additional attribute configuration in a
meta
key for a column.
How data model source is determined:
source
has the same options as the corresponding field of Hashboard data models.- Unless otherwise specified, Hashboard automatically infers defaults for the source
physicalName
andschema
. If noconnectionName
orconnectionId
is specified, Hashboard will try to use a connection whose name is the same as the dbt target database (e.g."bigquery"
or"snowflake"
).
You may then use dbt-core to produce a Manifest JSON file (opens in a new tab), which should be included with any Hashboard deploy (or preview):
hb preview --dbt-manifest=./target/manifest.json
Hashboard will include each dbt column as an attribute in the corresponding Hashboard model. You can also add additional fields to the dbt columns to take advantage of Hashboard's more advanced modeling features:
# ... as above
columns:
- name: my_dbt_attribute
documentation: this is a dbt attribute
tests:
- unique
- not_null
# add additional configuration to this column
meta:
glean:
primaryKey: true
To add Hashboard measures (or other attributes not present in the dbt model) to the imported model, you can add a cols
field to the dbt model's meta.glean
configuration:
models:
- # ... dbt schema configuration, as above
meta:
glean:
cols:
- id: daily_active_users
type: metric
name: Daily Active Users
sql: COUNT(DISTINCT customer_id) / COUNT(DISTINCT create_at)
Hashboard previously used the term metric to refer to what is now called a
measure. At the moment our configuration files still refer to type: metric
when creating measures inside a data model. We plan to update the
schema of these configuration files in the future to use the term measure
instead of metric.
Hashboard configuration in your dbt schema file has access to the same templating and macros (opens in a new tab) as your dbt configuration. You could, for example, configure the Hashboard data connection based on your dbt target:
models:
- # ... dbt model configuration
meta:
glean:
glean: "1.0"
name: "customers"
source:
# configure the source based on the dbt target
connectionName: "{{ target.database }}"
Imported dbt models are exclusively controlled by code and cannot be saved from the web UI, though you are still free to prototype and experiment with dbt models in the Hashboard model editor.
Column descriptions will be automatically synced from dbt to the corresponding Hashboard model during deploys.
Global configuration
If all models use the same Hashboard source
, you can configure a default source
in the dbt_project.yml
file:
models:
the_models:
+meta: # will be applied to every model in models/the_models
glean-defaults:
source:
connectionName: # which Hashboard data connection to use
To import all dbt models in a folder as Hashboard models, simply add
models:
the_models:
+meta:
glean-defaults:
glean: "1.0"
source:
connectionName: # which Hashboard data connection to use
# schema is always optional because Hashboard will infer it
schema: "{{ target.schema }}" # can use jinja substitution here
This configuration will add the meta.glean
object to every model in the_models
, so they will all be imported into Hashboard. You can still add additional metadata for the specific models (such as adding additional metrics).