Version: Preview

Claude Agent SDK

The Claude Agent SDK integration ships in the Python and TypeScript SDKs. It has two halves:

The Marmot MCP server (built into your Marmot instance at /api/v1/mcp) gives the agent catalog-aware tools: discover_data, find_ownership, lookup_term. Point the SDK at it via mcpServers and it just shows up as a tool source.
MarmotAgentTracker plugs into the SDK's hook system. The first time any hook fires it registers the agent as an asset of type Agent, captures every tool output for mrn:// references, and writes one batched lineage call when the session ends.

Install

pip install "marmot-sdk[claude-agent]"

The claude-agent extra adds claude-agent-sdk.

Quick start

A minimal agent that searches the catalog, registers itself and writes lineage:

Python TypeScript

import asyncio
from marmot import Client, resolve
from marmot.integrations.claude_agent import MarmotAgentTracker
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions

async def main():
    base_url, credential = resolve(base_url=None)
    client = Client(base_url=base_url, credential=credential)
    tracker = MarmotAgentTracker(
        client,
        name="catalog-explorer",
        model="claude-sonnet-4-5",
        owner="data-eng",
    )

    options = ClaudeAgentOptions(
        mcp_servers={
            "marmot": {
                "type": "http",
                "url": f"{base_url}/api/v1/mcp",
                "headers": {"Authorization": f"Bearer {credential.token}"},
            }
        },
        hooks=tracker.hooks(),
        permission_mode="bypassPermissions",
    )

    async with ClaudeSDKClient(options=options) as agent:
        await agent.query("Find a postgres table about orders and summarise it.")
        async for _ in agent.receive_response():
            pass

    print("agent registered as:", tracker.agent_mrn)

asyncio.run(main())

import { Client, resolve } from "@marmotdata/sdk";
import { MarmotAgentTracker } from "@marmotdata/sdk/claude-agent";
import { query } from "@anthropic-ai/claude-agent-sdk";

const { baseUrl, credential } = await resolve({});
const client = new Client({ baseUrl, credential });

const tracker = new MarmotAgentTracker(client, {
  name: "catalog-explorer",
  model: "claude-sonnet-4-5",
  owner: "data-eng",
});

for await (const msg of query({
  prompt: "Find a postgres table about orders and summarise it.",
  options: {
    mcpServers: {
      marmot: {
        type: "http",
        url: `${baseUrl}/api/v1/mcp`,
        headers: { Authorization: `Bearer ${credential.token}` },
      },
    },
    hooks: tracker.hooks(),
    permissionMode: "bypassPermissions",
  },
})) {
  // stream messages as you wish
}

console.log("agent registered as:", tracker.agentMrn);

After the first run the agent appears in Marmot as type=Agent, service=ClaudeAgent, name=catalog-explorer, with lineage edges from every asset it touched.

Catalog tools (via MCP)

The Marmot MCP server exposes the catalog as tools, namespaced mcp__marmot__*:

Tool	Purpose
`discover_data`	Find or browse assets — by name, type, provider, tags, or MRN
`find_ownership`	Resolve who owns an asset, or what a team/user owns
`lookup_term`	Search the business glossary for term definitions

Tool responses include MRNs (often inside markdown content blocks). The tracker walks both structured objects and free text for mrn:// URIs, so lineage is captured automatically.

To restrict the agent to a subset:

Python TypeScript

options = ClaudeAgentOptions(
    mcp_servers={"marmot": {...}},
    hooks=tracker.hooks(),
    allowed_tools=["mcp__marmot__discover_data", "mcp__marmot__find_ownership"],
)

Custom tools

Two ways to attribute lineage from tools you ship as your own MCP server alongside Marmot's.

MRNs in tool output

If your tool returns objects with mrn fields — { mrn, ... } or { results: [{ mrn, ... }] } — or text bodies that mention mrn://... URIs, the tracker picks them up on every PostToolUse hook. This is the same mechanism that captures MRNs from Marmot's MCP responses.

Manual `record_source`

Use this when the upstream is only known at runtime — for example a tool that picks one of several tables:

Python TypeScript

def query_table(table: str, sql: str) -> list[dict]:
    tracker.record_source(f"mrn://table/postgres/{table}")
    return run_sql(sql)

Install​

Quick start​

Catalog tools (via MCP)​

Custom tools​

MRNs in tool output​

Manual record_source​

Install

Quick start

Catalog tools (via MCP)

Custom tools

MRNs in tool output

Manual `record_source`