DBT
Status: experimental
The DBT plugin ingests metadata from dbt (Data Build Tool) projects, including models, sources, seeds, and lineage relationships. It reads dbt's generated artifacts to understand your data transformation layer and how it connects to your warehouse.
Prerequisites
Generate dbt Artifacts
Before Marmot can ingest your dbt project, you need to generate the artifact files. These are created in your project's target/ directory.
Required: Run dbt compile or dbt build to generate manifest.json:
dbt compile
# or
dbt build
Recommended: Run dbt docs generate to create catalog.json with column types and statistics:
dbt docs generate
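After running the commands above, you may want to confirm that the artifacts actually landed in the target directory. The following is a minimal sketch, not part of Marmot or dbt: the function name `check_artifacts` and its return shape are made up for illustration, and only the two file names come from this page.

```python
from pathlib import Path

# File names taken from this page; everything else here is illustrative.
REQUIRED_ARTIFACTS = ("manifest.json",)
OPTIONAL_ARTIFACTS = ("catalog.json",)


def check_artifacts(target_path):
    """Return (missing_required, missing_optional) for a dbt target directory."""
    target = Path(target_path)
    missing_required = [
        name for name in REQUIRED_ARTIFACTS if not (target / name).is_file()
    ]
    missing_optional = [
        name for name in OPTIONAL_ARTIFACTS if not (target / name).is_file()
    ]
    return missing_required, missing_optional
```

Calling `check_artifacts("/path/to/dbt/project/target")` returns two lists. A non-empty first list means `dbt compile` or `dbt build` still needs to be run, and a non-empty second list means `dbt docs generate` has not been run yet.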
Artifact Files
| File | Required | Generated By | Contains |
|---|---|---|---|
| manifest.json | Yes | dbt compile, dbt build, dbt run | Models, sources, seeds, lineage, SQL |
| catalog.json | No | dbt docs generate | Column types, table stats, owner info |
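To get a feel for what manifest.json holds, the sketch below counts resources by type and lists lineage edges. It relies on the `nodes`, `sources`, and `parent_map` top-level keys that dbt writes to the manifest. The helper names `load_manifest` and `summarize_manifest` are hypothetical, and this is not necessarily how the plugin itself reads the file.

```python
import json
from collections import Counter
from pathlib import Path


def load_manifest(target_path):
    """Parse manifest.json from a dbt target directory."""
    with open(Path(target_path) / "manifest.json", encoding="utf-8") as fh:
        return json.load(fh)


def summarize_manifest(manifest):
    """Return (counts by resource type, list of (parent, child) lineage edges)."""
    counts = Counter(
        node.get("resource_type", "unknown")
        for node in manifest.get("nodes", {}).values()
    )
    # Sources live under their own top-level key, not under "nodes".
    counts["source"] += len(manifest.get("sources", {}))
    # parent_map maps each unique_id to the unique_ids it depends on.
    edges = [
        (parent, child)
        for child, parents in manifest.get("parent_map", {}).items()
        for parent in parents
    ]
    return dict(counts), edges
```

A typical call is `summarize_manifest(load_manifest("/path/to/dbt/project/target"))`. Each edge reads as "parent feeds child", which is the direction lineage is usually drawn in.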
Example Configuration
target_path: "/path/to/dbt/project/target"
project_name: "analytics"
environment: "production"
tags:
- "dbt"
- "analytics"
Configuration
The following configuration options are available:
| Property | Type | Required | Description |
|---|---|---|---|
| discover_models | bool | false | Discover DBT models |
| discover_sources | bool | false | Discover DBT sources |
| discover_tests | bool | false | Discover DBT tests |
| environment | string | false | Environment name (e.g., production, staging) |
| external_links | []ExternalLink | false | External links to show on all assets |
| include_catalog | bool | false | Include catalog.json for table/column descriptions |
| include_manifest | bool | false | Include manifest.json for model definitions |
| include_run_results | bool | false | Include run_results.json for test results |
| include_sources_json | bool | false | Include sources.json for source definitions |
| model_filter | plugin.Filter | false | Filter configuration for models |
| project_name | string | false | DBT project name |
| tags | TagsConfig | false | Tags to apply to discovered assets |
| target_path | string | false | Path to DBT target directory containing manifest.json, catalog.json, etc. |
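Since every property in the table is optional, a misspelled key is easy to miss. The sketch below flags keys that do not appear in the table. It is purely illustrative: `unknown_keys` is a hypothetical helper, the configuration is assumed to be already parsed into a dictionary, and how Marmot itself treats unrecognised keys is not covered here.

```python
# Property names copied from the configuration table on this page.
KNOWN_PROPERTIES = {
    "discover_models",
    "discover_sources",
    "discover_tests",
    "environment",
    "external_links",
    "include_catalog",
    "include_manifest",
    "include_run_results",
    "include_sources_json",
    "model_filter",
    "project_name",
    "tags",
    "target_path",
}


def unknown_keys(config):
    """Return the top-level config keys that are not listed in the table, sorted."""
    return sorted(key for key in config if key not in KNOWN_PROPERTIES)
```

For example, `unknown_keys({"target_path": "/path/to/dbt/project/target", "projct_name": "analytics"})` reports the misspelled `projct_name`.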
Available Metadata
The following metadata fields are available:
| Field | Type | Description |
|---|---|---|
| adapter_type | string | Database adapter type (postgres, snowflake, bigquery, etc.) |
| alias | string | Table alias if different from model name |
| catalog_comment | string | Comment from database catalog |
| column_comment | string | Column comment from database catalog |
| column_description | string | Column description from DBT |
| column_name | string | Column name |
| column_tags | []string | Tags applied to this column |
| config_enabled | bool | Whether model is enabled |
| config_full_refresh | bool | Whether to perform full refresh |
| config_materialized | string | Materialization strategy from config |
| config_on_schema_change | string | Behavior when schema changes (append_new_columns, fail, ignore) |
| config_persist_docs | bool | Whether to persist documentation to database |
| config_tags | string | Tags from config |
| data_type | string | Column data type |
| database | string | Source database name |
| database | string | Target database name |
| dbt_materialized | string | Materialization type (table, view, incremental, ephemeral) |
| dbt_original_path | string | Original path in the DBT project |
| dbt_package | string | DBT package name |
| dbt_path | string | Path to the model file |
| dbt_unique_id | string | DBT's unique identifier for this resource (source, node, or seed) |
| dbt_version | string | DBT version used to generate this model |
| environment | string | Deployment environment (dev, prod, etc.) |
| freshness_checked | bool | Whether freshness checks are configured |
| fully_qualified_name | string | Fully qualified name (database.schema.table) |
| identifier | string | Physical table identifier |
| last_run_execution_time | float64 | Execution time of last run in seconds |
| last_run_failures | int | Number of failures in last run |
| last_run_message | string | Message from last DBT run |
| last_run_status | string | Status of the last DBT run (success, error, skipped) |
| loaded | bool | Whether source was loaded at time of DBT execution |
| model_name | string | DBT model name |
| owner | string | Table/view owner from database catalog |
| project_name | string | DBT project name |
| raw_sql | string | Raw SQL before compilation |
| schema | string | Source schema name |
| schema | string | Target schema name |
| seed_path | string | Path to seed CSV file |
| source_name | string | DBT source name |
| stat_approximate_count | int64 | Approximate row count |
| stat_bytes | int64 | Size in bytes |
| stat_last_modified | string | Last modification timestamp |
| stat_num_rows | int64 | Number of rows (alternative) |
| stat_row_count | int64 | Number of rows |
| stat_size | float64 | Table size |
| table_name | string | Physical table/view name in database |
| table_name | string | Source table name |
| table_name | string | Seed table name |
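Several of the fields above relate to one another: `fully_qualified_name` is described as database.schema.table, while `alias` and `identifier` can override the physical table name. The sketch below shows one plausible way to assemble the name from a manifest entry. It assumes the entry carries dbt's `database`, `schema`, `name`, and optionally `alias` (models, seeds) or `identifier` (sources) keys. The precedence order is an assumption for illustration and may differ from what the plugin does.

```python
def fully_qualified_name(node):
    """Build database.schema.table from a manifest node or source entry."""
    # Assumed precedence: a source's identifier, then a model/seed alias,
    # then the plain resource name.
    table = node.get("identifier") or node.get("alias") or node.get("name")
    parts = [node.get("database"), node.get("schema"), table]
    # Skip missing parts so adapters without a database level still work.
    return ".".join(part for part in parts if part)
```

With a model entry such as `{"database": "analytics", "schema": "marts", "name": "orders_v2", "alias": "orders"}`, the alias is used in place of the model name.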