OpenLineage
Overview
OpenLineage is an open standard for data lineage collection and analysis. It provides a unified way to track data flows across different tools and platforms by emitting standardized events during job execution.
Marmot integrates with OpenLineage to automatically discover assets and lineage relationships from your data pipelines, eliminating manual catalog maintenance.
How Marmot Uses OpenLineage
Asset Discovery
OpenLineage events automatically create assets in Marmot's catalog:
- Jobs/Tasks: Airflow DAGs, DBT models, Spark jobs
- Datasets: Tables, files, topics from various data sources
- Lineage: Relationships between jobs and datasets
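To make these three categories concrete, the sketch below builds a minimal OpenLineage run event and pulls out the job, the datasets, and the lineage edges between them. The top-level field names (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`, `schemaURL`) follow the OpenLineage specification. The namespaces, the producer URL, and the helper names `build_run_event` and `extract_lineage` are hypothetical. This illustrates the idea only and is not Marmot's actual implementation.

```python
from datetime import datetime, timezone
from uuid import uuid4


def build_run_event(event_type, job_name, inputs, outputs):
    """Build a minimal OpenLineage-style run event as a plain dict."""
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical producer URI, for illustration only.
        "producer": "https://example.com/hypothetical-producer",
        # Illustrative schema URL; use the one matching your producer's spec version.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid4())},
        "job": {"namespace": "example_namespace", "name": job_name},
        "inputs": [{"namespace": "example_db", "name": n} for n in inputs],
        "outputs": [{"namespace": "example_db", "name": n} for n in outputs],
    }


def extract_lineage(event):
    """Return (job, datasets, edges) found in a single run event."""
    job = f'{event["job"]["namespace"]}/{event["job"]["name"]}'
    ins = [f'{d["namespace"]}/{d["name"]}' for d in event.get("inputs", [])]
    outs = [f'{d["namespace"]}/{d["name"]}' for d in event.get("outputs", [])]
    # Inputs flow into the job; the job flows into its outputs.
    edges = [(d, job) for d in ins] + [(job, d) for d in outs]
    return job, ins + outs, edges
```

Calling `extract_lineage(build_run_event("COMPLETE", "daily_orders", ["raw.orders"], ["analytics.orders"]))` yields one job, two datasets, and two edges: one from the input table to the job and one from the job to the output table.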
Asset Types
Marmot maps OpenLineage events to specific asset types:
- DAG: Airflow workflows
- Task: Individual Airflow tasks
- Model: DBT models
- Project: DBT projects
- Table: Database tables
- File: Data files
- Topic: Kafka topics
Stub Assets
Assets discovered for the first time via OpenLineage are marked as "stub assets" until they are enhanced by other integrations. This allows lineage tracking even for undocumented datasets without polluting the catalog with potentially bad data.
Run History
All OpenLineage events are stored as run history, providing:
- Execution timeline and status
- Input/output data volumes
- Error messages and debugging info
- Performance metrics
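As a rough illustration of how a run history entry could be derived from stored events, the sketch below collapses the events of one run into a status, a duration, and a timeline. It assumes events are plain dicts in the OpenLineage shape with ISO 8601 `eventTime` values, and that failures carry the standard `errorMessage` run facet. The function name `summarize_run` and the returned keys are hypothetical and do not reflect Marmot's internal storage.

```python
from datetime import datetime


def summarize_run(events):
    """Collapse all events of a single run into one summary dict."""
    # Order events chronologically by their ISO 8601 timestamp.
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["eventTime"]))
    first, last = ordered[0], ordered[-1]
    start = datetime.fromisoformat(first["eventTime"])
    end = datetime.fromisoformat(last["eventTime"])
    # Assumption: failed runs report details in the errorMessage run facet.
    error = last["run"].get("facets", {}).get("errorMessage", {}).get("message")
    return {
        "run_id": first["run"]["runId"],
        "status": last["eventType"],  # the latest event decides the status
        "duration_seconds": (end - start).total_seconds(),
        "error": error,
        "timeline": [(e["eventTime"], e["eventType"]) for e in ordered],
    }
```

Taking the status from the most recent event keeps the summary correct even when events arrive out of order, because they are sorted by timestamp first.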
Authentication
The OpenLineage endpoint requires authentication via an API key.
Generate API Key
- Navigate to Profile → API Keys
- Click New Key
- Copy the generated key
- Configure your OpenLineage producer to send the key with each request
Endpoint URL
POST /api/v1/lineage
Authorization: X-API-Key <your-api-key>
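For producers that are not covered by a ready-made integration, a request to this endpoint can be prepared by hand. The sketch below uses only the Python standard library and builds the request without sending it. It assumes the key travels in an `X-API-Key` header, which is one reading of the line above; adjust the header if your Marmot instance expects a different form. The function name `build_lineage_request` is hypothetical.

```python
import json
import urllib.request


def build_lineage_request(base_url, api_key, event):
    """Prepare (but do not send) a POST to the lineage endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/lineage",
        data=json.dumps(event).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is sent in an X-API-Key header.
            "X-API-Key": api_key,
        },
    )
```

To send the event, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes it easy to inspect the URL and headers before anything goes over the network.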
Configuration Examples
Airflow
Configure the OpenLineage provider in airflow.cfg:
[openlineage]
transport = http
url = https://your-marmot-instance.com/api/v1/lineage
api_key = your-api-key
DBT
Add to your profiles.yml:
your_profile:
  outputs:
    prod:
      # ... your connection details
  vars:
    openlineage:
      url: https://your-marmot-instance.com/api/v1/lineage
      api_key: your-api-key
Spark
Set environment variables:
export OPENLINEAGE_URL=https://your-marmot-instance.com/api/v1/lineage
export OPENLINEAGE_API_KEY=your-api-key
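Before launching a job, it can help to confirm that both variables are actually set in the process environment. The following is a minimal sketch of such a check. The function name `load_openlineage_env` is hypothetical and is not part of OpenLineage or Marmot.

```python
import os


def load_openlineage_env(environ=None):
    """Read the OpenLineage settings from the environment, failing fast if any is missing."""
    env = os.environ if environ is None else environ
    required = ("OPENLINEAGE_URL", "OPENLINEAGE_API_KEY")
    # Treat unset and empty values the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"url": env["OPENLINEAGE_URL"], "api_key": env["OPENLINEAGE_API_KEY"]}
```

Accepting an optional mapping instead of always reading `os.environ` keeps the check easy to exercise in tests.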
For detailed OpenLineage configuration, see the official documentation.