Databricks
SQL Warehouse, Jobs, notebooks, Unity Catalog, MLflow, and workspace files for Fabric Harness agents.
Fabric Harness ships Databricks integration helpers in @fabric-harness/databricks. They are designed for data, analytics, ML, and governance agents that need to query warehouse data, trigger jobs, inspect Unity Catalog, run notebooks, or write MLflow telemetry without putting Databricks credentials in model context.
Status: integration helper package with mocked unit tests and live-gated tests. It is ready for controlled pilots when you provide a Databricks host/token and opt into live tests. A SQL Warehouse-backed SandboxEnv (databricksSqlSandbox) ships in v0.6; cluster/notebook-backed SandboxEnv modes remain future work.
Install
npm install @fabric-harness/databricks @fabric-harness/sdk
Client
import { createDatabricksClient } from '@fabric-harness/databricks';
const databricks = createDatabricksClient({
host: process.env.DATABRICKS_HOST!,
token: process.env.DATABRICKS_TOKEN!,
});
Use a token provider function when running with managed credentials:
const databricks = createDatabricksClient({
host: process.env.DATABRICKS_HOST!,
token: async () => getDatabricksTokenFromVault(),
});
SQL Warehouse tool
import { databricksSqlTool } from '@fabric-harness/databricks';
const sql = databricksSqlTool(databricks, {
warehouseId: process.env.DATABRICKS_WAREHOUSE_ID!,
});
const fabric = await init({ tools: [sql] });
const session = await fabric.session();
await session.prompt('Query revenue by month. Use only SELECT statements.');
The tool calls /api/2.0/sql/statements. For long-running statements, use waitForDatabricksStatement() in trusted code.
import { waitForDatabricksStatement } from '@fabric-harness/databricks';
const submitted = await sql.execute?.({ statement: 'select 1' }) as { statement_id?: string };
if (submitted?.statement_id) {
await waitForDatabricksStatement(databricks, submitted.statement_id);
}
SQL Warehouse sandbox
Run an entire agent against a SQL Warehouse, with exec() mapped to SQL statement execution:
import { init } from '@fabric-harness/sdk';
import { databricksSqlSandbox } from '@fabric-harness/databricks/sql-sandbox';
const fabric = await init({
sandbox: databricksSqlSandbox({
host: process.env.DATABRICKS_HOST!,
token: process.env.DATABRICKS_TOKEN!,
warehouseId: process.env.DATABRICKS_WAREHOUSE_ID!,
catalog: 'main', // optional Unity Catalog
schema: 'analytics', // optional schema
resultFormat: 'jsonl', // 'jsonl' (default) or 'csv'
}),
});
const session = await fabric.session();
const result = await session.shell('SELECT customer_id, SUM(amount) FROM main.analytics.orders GROUP BY 1 ORDER BY 2 DESC LIMIT 10');
console.log(result.stdout); // → newline-delimited JSON rows
File operations are stored in an in-memory map for the session, which is useful for staging small CSVs or Markdown summaries the agent emits. For real data files, mount with databricksVolumeSource (below) instead.
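For example, an end-to-end run can let the agent stage its own summary. The prompt wording below is illustrative; the file the agent writes lives only in the session's in-memory map and is never written back to Databricks.
// Sketch: the agent queries through exec() (SQL in this sandbox) and stages a
// small artifact in the session's in-memory file map.
const session = await fabric.session();
await session.prompt(
  'Summarize last month\'s revenue by region from main.analytics.orders. ' +
  'Use only SELECT statements and write a short Markdown summary to summary.md.'
);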
Unity Catalog volumes
Mount a UC volume as files inside any sandbox:
import { databricksVolumeSource } from '@fabric-harness/connectors/databricks-volume';
await session.mount('/mnt/landing', databricksVolumeSource({
host: process.env.DATABRICKS_HOST!,
token: process.env.DATABRICKS_TOKEN!,
volumePath: '/Volumes/main/landing/raw',
}));
await session.shell('grep -c error /mnt/landing/2026/01/jan.log');
Jobs and notebooks
Trigger existing Jobs:
import { databricksRunJobTool } from '@fabric-harness/databricks';
const tools = [databricksRunJobTool(databricks)];
Submit a one-off notebook run:
import { databricksNotebookTool } from '@fabric-harness/databricks';
const notebook = databricksNotebookTool(databricks, {
existingClusterId: process.env.DATABRICKS_CLUSTER_ID,
});
Wait for runs in trusted code:
import { waitForDatabricksRun } from '@fabric-harness/databricks';
const run = await databricksRunJobTool(databricks).execute?.({ jobId: 123 }) as { run_id?: number };
if (run?.run_id) await waitForDatabricksRun(databricks, run.run_id);
Unity Catalog discovery
import { unityCatalogTablesTool } from '@fabric-harness/databricks';
const uc = unityCatalogTablesTool(databricks);
const fabric = await init({ tools: [uc] });
This gives agents read-only discovery over catalog/schema table metadata. Combine it with policy prompts that require explicit table names and approved query shapes before invoking SQL.
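For example, discovery and SQL can be registered together so the agent resolves exact table names before querying. The prompt wording below is illustrative and reuses the databricks client and warehouse ID from above.
import { init } from '@fabric-harness/sdk';
import { databricksSqlTool, unityCatalogTablesTool } from '@fabric-harness/databricks';
// Discovery first, SQL second: the agent must name real tables before querying.
const fabric = await init({
  tools: [
    unityCatalogTablesTool(databricks),
    databricksSqlTool(databricks, { warehouseId: process.env.DATABRICKS_WAREHOUSE_ID! }),
  ],
});
const session = await fabric.session();
await session.prompt(
  'List the tables in main.analytics, then query only tables you found there. ' +
  'Use SELECT statements only.'
);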
MLflow logging
import {
databricksMlflowLogMetricTool,
databricksMlflowLogParamTool,
} from '@fabric-harness/databricks';
const tools = [
databricksMlflowLogMetricTool(databricks),
databricksMlflowLogParamTool(databricks),
];
Use these for evaluation or data-profiler agents that should write run metrics back to MLflow.
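A minimal wiring for such an agent might look like the sketch below; the prompt and metric naming are illustrative, and how runs are targeted depends on the tools' actual parameters.
import { init } from '@fabric-harness/sdk';
import {
  databricksMlflowLogMetricTool,
  databricksMlflowLogParamTool,
} from '@fabric-harness/databricks';
// Data-profiler agent: inspects data elsewhere, reports findings to MLflow.
const fabric = await init({
  tools: [
    databricksMlflowLogMetricTool(databricks),
    databricksMlflowLogParamTool(databricks),
  ],
});
const session = await fabric.session();
await session.prompt(
  'Log the null rate you measured for each column as MLflow metrics, ' +
  'and record the profiled table name as a param.'
);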
Workspace files as a filesystem source
Mount exported workspace notebooks/files into the Fabric sandbox so agents can read, grep, and glob them like local files:
import {
databricksWorkspaceSource,
} from '@fabric-harness/databricks';
import { withFilesystemSources } from '@fabric-harness/sdk';
const sandbox = withFilesystemSources('virtual', [{
mountAt: '/workspace/databricks',
source: databricksWorkspaceSource(databricks, '/Repos/acme/analytics'),
}]);
const fabric = await init({ sandbox });
The source uses /api/2.0/workspace/list and /api/2.0/workspace/export.
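Once mounted, exported notebooks and files behave like local files; the shell commands below are illustrative usage against the mount point from the example above.
// Read-only exploration of the mounted workspace export.
const session = await fabric.session();
await session.shell('ls /workspace/databricks');
await session.shell('grep -rl "spark.read" /workspace/databricks');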
Security model
- Keep DATABRICKS_TOKEN in env, Key Vault, or your runtime secret manager.
- Prefer least-privilege service principals.
- Use Unity Catalog permissions as the primary data governance layer.
- Treat SQL and Jobs tools as execute effects and gate risky writes with Fabric policy/approvals (see the sketch after this list).
- Do not pass tokens, PATs, warehouse IDs, or cluster IDs in prompts.
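One lightweight way to enforce a SELECT-only posture in trusted code is to wrap the SQL tool's execute before registering it. This is an illustrative sketch, not a Fabric policy API; it assumes the tool object exposes the same execute method used earlier in this page.
import { init } from '@fabric-harness/sdk';
import { databricksSqlTool } from '@fabric-harness/databricks';
const rawSql = databricksSqlTool(databricks, {
  warehouseId: process.env.DATABRICKS_WAREHOUSE_ID!,
});
// Illustrative guard: reject anything that is not a plain SELECT before the
// statement ever reaches the SQL Warehouse.
const readOnlySql = {
  ...rawSql,
  execute: async (input: { statement: string }) => {
    if (!/^\s*select\b/i.test(input.statement)) {
      throw new Error('Blocked: only SELECT statements are allowed');
    }
    return rawSql.execute?.(input);
  },
};
const fabric = await init({ tools: [readOnlySql] });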
Live tests
Live tests are skipped unless explicitly enabled:
FABRIC_DATABRICKS_TEST=1 \
DATABRICKS_HOST=https://dbc-...cloud.databricks.com \
DATABRICKS_TOKEN=... \
pnpm --filter @fabric-harness/databricks test
Optional resources unlock deeper checks:
DATABRICKS_WAREHOUSE_ID=...
DATABRICKS_CATALOG=main
DATABRICKS_SCHEMA=default
DATABRICKS_JOB_ID=...
DATABRICKS_NOTEBOOK_PATH=/Repos/acme/smoke
DATABRICKS_CLUSTER_ID=...
DATABRICKS_MLFLOW_RUN_ID=...
DATABRICKS_WORKSPACE_ROOT=/Repos/acme
What is not implemented yet
- A Databricks cluster/notebook-backed SandboxEnv (the SQL Warehouse-backed databricksSqlSandbox ships; see above).
- A first-class deployment target that packages and deploys Fabric agents as Databricks Jobs.
- Unity Catalog lineage writeback helpers.
See docs/ROADMAP.md for status.