Databricks

First-party Databricks integration — Mosaic AI model serving, Unity Catalog governance, RAG, Lakebase, deploy targets, and cost reconciliation.

@fabric-harness/databricks is a first-party integration for building governed agents on Databricks. One databricks() call wires the agent's brain (Mosaic AI Model Serving), tools (governed SQL, Unity Catalog, AI Functions, Genie, Lakeflow, Vector Search, Feature Serving), state (Lakebase), governance (Unity Catalog), and cost controls — under a single principal. Everything is layered on top of Unity Catalog; it never re-implements UC permissions.

Status: controlled-pilot ready with mocked unit tests and live-gated tests. The REST/serving API shapes are written against current Databricks contracts; smoke-test against a live workspace before GA.

The bundle

import { databricks } from '@fabric-harness/databricks';

const dbx = databricks({
  host: process.env.DATABRICKS_HOST!,
  // UC enforces this principal's grants on every call.
  principal: {
    kind: 'service-principal',
    host: process.env.DATABRICKS_HOST!,
    clientId: process.env.DATABRICKS_CLIENT_ID!,
    clientSecret: process.env.DATABRICKS_CLIENT_SECRET!,
  },
  model: 'databricks-meta-llama-3-3-70b-instruct',
  warehouseId: process.env.DATABRICKS_WAREHOUSE_ID!,
  vectorSearch: { index: 'main.kb.docs_index', textColumn: 'chunk', idColumn: 'id' }, // RAG
  aiFunctions: true,                 // ai_query()
  genie: { spaceId: 'space-123' },   // NL → SQL analytics
  lakeflow: true,                    // pipeline tools
  consumption: true,                 // System-Tables cost reporting
  featureServing: { endpoint: 'user-features' },
  governance: { stewardAudience: 'data-steward', onLineage: (r) => console.log('[lineage]', r) },
});

// dbx.modelProvider, dbx.tools, dbx.policy, dbx.retriever, dbx.consumption, dbx.store, dbx.identity

databricks/<endpoint> model refs also resolve through the SDK, so FABRIC_MODEL=databricks/<endpoint> works directly.
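To make the databricks/<endpoint> convention concrete, here is a minimal sketch of how such a ref could be split into a provider and an endpoint name. parseModelRef and ModelRef are hypothetical names used for illustration only; the SDK's actual resolver is not shown on this page and may behave differently.

```typescript
// Hypothetical helper, not the SDK's resolver: splits "provider/endpoint".
interface ModelRef {
  provider: string;
  endpoint: string;
}

function parseModelRef(ref: string): ModelRef | null {
  const slash = ref.indexOf('/');
  // Require a non-empty provider before the slash and a non-empty endpoint after it.
  if (slash <= 0 || slash === ref.length - 1) return null;
  return { provider: ref.slice(0, slash), endpoint: ref.slice(slash + 1) };
}

// Example: a value you might place in FABRIC_MODEL.
const exampleRef = parseModelRef('databricks/databricks-meta-llama-3-3-70b-instruct');
```

Only the first slash is treated as the separator, so an endpoint name that itself contains a slash would be kept whole in this sketch.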

Model serving

Databricks serving endpoints expose an OpenAI-compatible API, so databricksFoundationModelProvider({ host, token }) is a thin wrapper that inherits streaming, tool-calls, and reasoning effort. The same token threads through every REST and data call.
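Because the endpoints accept the OpenAI chat-completions shape, the request itself can be sketched without any SDK. buildChatRequest is a hypothetical helper, not part of @fabric-harness/databricks. The path assumes the workspace's OpenAI-compatible base URL is <host>/serving-endpoints and that host already includes the https:// scheme; verify both against your workspace.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface HttpRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds (but does not send) a chat-completions request.
function buildChatRequest(
  host: string,
  token: string,
  endpoint: string,
  messages: ChatMessage[],
  stream = false,
): HttpRequest {
  const base = host.replace(/\/+$/, ''); // drop trailing slashes
  return {
    // Assumed OpenAI-compatible route under the workspace host.
    url: `${base}/serving-endpoints/chat/completions`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    // The serving endpoint name goes in the OpenAI "model" field.
    body: JSON.stringify({ model: endpoint, messages, stream }),
  };
}
```

The same bearer token appears here as in the REST and data calls, which is what lets one principal cover every request.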

Identity & Unity Catalog governance

databricksIdentity() produces a rotating token from a PAT, from an OAuth service principal (cached and refreshed), or on behalf of a specific end user. UC enforces that principal's table, row, and column grants natively: the agent cannot read anything it lacks SELECT on. On top of UC, Fabric adds:

  • Lineage/audit: withGovernance() stamps every tool call (principal, service, catalog/schema) to an onLineage sink (secrets redacted).
  • Approval routing: databricksGovernancePolicy() routes sensitive (write/execute) tools to a steward audience via CapabilityPolicy.approvalRules.
  • Egress allowlist: outbound network pinned to the workspace host.
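The lineage stamping in the first bullet can be pictured as a wrapper around each tool. The sketch below is an illustration under stated assumptions: withLineage, LineageRecord, and redactSecrets are hypothetical names, the record fields are a guess at a useful minimum, and the real withGovernance() signature is not shown on this page.

```typescript
// Hypothetical record shape; the real onLineage payload may differ.
interface LineageRecord {
  principal: string;
  service: string;
  tool: string;
  args: Record<string, unknown>;
}

type ToolFn = (args: Record<string, unknown>) => unknown;

// Keys that look like credentials are masked before the record leaves the process.
const SECRET_KEY = /token|secret|password/i;

function redactSecrets(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    // Shallow redaction only: nested objects are passed through unchanged.
    out[key] = SECRET_KEY.test(key) ? '[redacted]' : args[key];
  }
  return out;
}

function withLineage(
  tool: string,
  principal: string,
  service: string,
  fn: ToolFn,
  onLineage: (record: LineageRecord) => void,
): ToolFn {
  return (args) => {
    // Emit the audit record first, then run the tool with the original args.
    onLineage({ principal, service, tool, args: redactSecrets(args) });
    return fn(args);
  };
}
```

The wrapped tool still receives the unredacted arguments; only the copy sent to the sink is masked.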

With channel actor propagation, an agent can act on behalf of the human who triggered it, so UC enforces that user's grants — per-user data boundaries with no per-user policy code.
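The "cached + refreshed" token behavior mentioned at the start of this section can be sketched as a small provider that refetches shortly before expiry. cachedTokenProvider, Token, and the 60-second skew are assumptions made for illustration; the fetch function is injected, so the sketch says nothing about how databricksIdentity() actually obtains tokens.

```typescript
interface Token {
  value: string;
  expiresAtMs: number; // absolute expiry, in epoch milliseconds
}

// Hypothetical helper: caches a token and refreshes it `skewMs` before expiry.
function cachedTokenProvider(
  fetchToken: () => Promise<Token>,
  now: () => number = Date.now,
  skewMs = 60000,
): () => Promise<string> {
  let cached: Token | null = null;
  return async () => {
    // Reuse the cached token while it is comfortably inside its lifetime.
    if (cached !== null && now() < cached.expiresAtMs - skewMs) {
      return cached.value;
    }
    // Otherwise fetch a fresh one (e.g. an OAuth client-credentials exchange).
    cached = await fetchToken();
    return cached.value;
  };
}
```

Injecting the clock keeps the refresh rule testable. A production version would also want to share one in-flight fetch between concurrent callers, which this sketch omits.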

Tools

| Tool | What |
| --- | --- |
| databricks_sql | Run SQL on a warehouse (governed). |
| databricks_unity_catalog_tables / databricks_table_info | Discover + describe UC tables. |
| search (Vector Search) | RAG retrieval over a Mosaic AI Vector Search index. |
| databricks_ai_query | In-warehouse ai_query() model inference (parameterized, injection-safe). |
| databricks_genie_ask | NL → SQL analytics over an AI/BI Genie space. |
| databricks_pipeline_* | List / status (read) + start / stop (execute) Lakeflow pipelines. |
| databricks_feature_lookup | Low-latency feature lookup from a Feature Serving endpoint. |
| databricks_consumption | Real DBUs + list cost from system.billing System Tables. |
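To show what "parameterized, injection-safe" can mean for databricks_ai_query, the sketch below builds a SQL statement request in which the user's prompt travels as a named parameter instead of being spliced into the SQL text. buildAiQueryStatement is a hypothetical helper. The field names (warehouse_id, statement, parameters) assume the request shape of the Databricks SQL Statement Execution API, and the tool's real internals are not shown on this page.

```typescript
interface StatementRequest {
  warehouse_id: string;
  statement: string;
  parameters: { name: string; value: string; type: 'STRING' }[];
}

// Hypothetical helper: builds (but does not send) an ai_query() statement.
function buildAiQueryStatement(
  warehouseId: string,
  endpoint: string,
  prompt: string,
): StatementRequest {
  // The endpoint name is inlined as a literal, so restrict it to a safe alphabet.
  if (!/^[A-Za-z0-9_-]+$/.test(endpoint)) {
    throw new Error(`invalid endpoint name: ${endpoint}`);
  }
  return {
    warehouse_id: warehouseId,
    // :prompt is a named parameter marker; the value is bound separately.
    statement: `SELECT ai_query('${endpoint}', :prompt) AS response`,
    parameters: [{ name: 'prompt', value: prompt, type: 'STRING' }],
  };
}
```

Whatever the prompt contains, the statement text stays the same, so quote characters in user input cannot change the query's structure.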

Job, notebook, and MLflow tools (databricksRunJobTool, databricksNotebookTool, databricksMlflowLogMetricTool) and the SQL-warehouse SandboxEnv (databricksSqlSandbox, @fabric-harness/databricks/sql-sandbox) remain available as building blocks.

State — Lakebase

Lakebase is managed Postgres. lakebaseClient() builds a Postgres client whose password is the rotating OAuth token (evaluated per connection), so it drops into PostgresSessionStore — agent sessions and the dispatch journal live in the lakehouse under the same principal.
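The "evaluated per connection" detail is what makes a rotating token usable as a database password. A rough sketch, assuming a Postgres client that accepts a function for password (node-postgres does): lakebaseConfigSketch and PgConfigLike are hypothetical names, and the real lakebaseClient() options are not shown on this page.

```typescript
// Minimal stand-in for a Postgres client config; not the real pg types.
interface PgConfigLike {
  host: string;
  database: string;
  user: string;
  ssl: boolean;
  // Called each time a new connection is opened.
  password: () => string | Promise<string>;
}

function lakebaseConfigSketch(
  opts: { host: string; database: string; user: string },
  getToken: () => string | Promise<string>,
): PgConfigLike {
  return {
    host: opts.host,
    database: opts.database,
    user: opts.user,
    ssl: true, // assumption: connections to the managed instance use TLS
    // Defer to the token provider instead of capturing one token up front.
    password: () => getToken(),
  };
}
```

Because the password is a callback and not a string, a connection opened after a token rotation picks up the new token without rebuilding the pool.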

Deploy targets

Build a deployable artifact with fabric-harness build --target <name>:

  • databricks-app — runs the agent in-workspace as a Databricks App (the app's service principal is the acting UC identity). Emits app.yaml (bridges DATABRICKS_APP_PORT) + deploy docs.
  • databricks-serving — wrapper-only: packages an MLflow ChatAgent proxy to an agent deployed elsewhere, registers it to Unity Catalog, and creates a serving endpoint that Agent Bricks and the Playground consume as a model/tool.
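For the databricks-app target, "bridging" DATABRICKS_APP_PORT amounts to making the agent's HTTP server listen on the port the platform assigns. A minimal sketch of that lookup follows. resolveAppPort and the fallback of 8000 are assumptions for illustration; the generated app.yaml may wire the port differently.

```typescript
// Hypothetical helper: picks the listen port from the app environment.
function resolveAppPort(
  env: Record<string, string | undefined>,
  fallback = 8000,
): number {
  const raw = env['DATABRICKS_APP_PORT'];
  // Outside a Databricks App the variable is absent, so use the fallback.
  if (raw === undefined || raw.trim() === '') return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid DATABRICKS_APP_PORT: ${raw}`);
  }
  return port;
}
```

In a Node entry point this would typically be called as resolveAppPort(process.env) before starting the server.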

Agents are runtime-agnostic: the same agent can also run on Temporal/Node/Cloudflare and simply consume Databricks as a backend.

Cost reconciliation

databricksTenantCostLimit() enforces a perScope budget against real Databricks spend from System Tables (estimates still guard perCall/perSession). See Cost attribution.
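The split between real spend and estimates can be illustrated with a small budget check: actual cost rows (as might be read from System Tables) are summed per scope, and the estimate for the next call is added on top. checkScopeBudget, SpendRow, and the field names are hypothetical; databricksTenantCostLimit()'s real options and query are not shown on this page.

```typescript
// Hypothetical row shape for already-billed spend, keyed by scope (e.g. tenant).
interface SpendRow {
  scope: string;
  listCostUsd: number;
}

type BudgetDecision =
  | { allowed: true; remainingUsd: number }
  | { allowed: false; reason: string };

function checkScopeBudget(
  scope: string,
  limitUsd: number,
  rows: SpendRow[],
  estimatedNextCallUsd = 0,
): BudgetDecision {
  // Real spend: only rows that belong to this scope count.
  const spent = rows
    .filter((row) => row.scope === scope)
    .reduce((sum, row) => sum + row.listCostUsd, 0);
  // The estimate covers the call that has not been billed yet.
  const projected = spent + estimatedNextCallUsd;
  if (projected > limitUsd) {
    return {
      allowed: false,
      reason: `scope ${scope} would reach ${projected} USD of a ${limitUsd} USD budget`,
    };
  }
  return { allowed: true, remainingUsd: limitUsd - projected };
}
```

Billing data usually lags behind live usage, which is one reason to keep estimate-based perCall and perSession guards alongside a reconciled perScope limit.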

See also

  • Examples: with-databricks (analytics copilot), with-databricks-rag (support agent), with-databricks-dataeng (pipelines), with-databricks-cost-attribution.
  • Model providers · Channels · Cost attribution