Cost Attribution

Enforce budgets against real Databricks spend and attribute cost by agent, user, and tenant.

The SDK cost-budget tracks per-call USD from a static price table. That is good for immediate, synchronous protection, but it does not know your real Databricks spend: model serving, SQL, Vector Search, AI Functions, Genie, Lakeflow, and Feature Serving are all measured in DBUs and reported in System Tables. Cost reconciliation closes that gap by enforcing perScope budgets against actuals while keeping estimates for perCall and perSession.

Hybrid enforcement

The key idea is to map each existing limit tier to the information it can actually have at enforcement time:

  • perCall / perSession stay estimate-based → immediate, synchronous soft caps. They catch a runaway agent now.
  • perScope (tenant / agent / user) becomes the real-spend hard cap via an external ActualCostSource → retroactive, cross-process. It catches sustained overspend on the next call.

System Tables lag 15–60 minutes, so enforcement against actuals is intentionally lagged, and the estimate tiers cover that window. A model call is never blocked on a warehouse query: actuals are fetched asynchronously after the response and cached (default 60s), and a violation is enforced before the next call.
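
The flow above can be sketched roughly as follows. This is a minimal illustration, not the SDK's implementation: LaggedScopeGate, FetchActualUsd, and every method name here are hypothetical, and the real ActualCostSource interface may look different. The point it shows is that the pre-call check reads only the cache, while the warehouse fetch happens after the response.

```typescript
// Hypothetical sketch: none of these names come from the SDK.
type FetchActualUsd = (scopeKey: string) => Promise<number>;

interface CachedActual {
  usd: number;
  fetchedAt: number; // epoch milliseconds
}

class LaggedScopeGate {
  private cache = new Map<string, CachedActual>();
  private limitUsd: number;
  private cacheTtlMs: number;
  private now: () => number;

  constructor(limitUsd: number, cacheTtlMs = 60_000, now: () => number = Date.now) {
    this.limitUsd = limitUsd;
    this.cacheTtlMs = cacheTtlMs;
    this.now = now;
  }

  // Synchronous pre-call check: reads only the cache, never the warehouse.
  assertWithinBudget(scopeKey: string): void {
    const cached = this.cache.get(scopeKey);
    if (cached && cached.usd >= this.limitUsd) {
      throw new Error(`cost limit exceeded for ${scopeKey}: USD ${cached.usd} >= USD ${this.limitUsd}`);
    }
  }

  // True when nothing is cached yet or the cached value is older than the TTL.
  isStale(scopeKey: string): boolean {
    const cached = this.cache.get(scopeKey);
    return !cached || this.now() - cached.fetchedAt >= this.cacheTtlMs;
  }

  // Store a freshly fetched actual together with its fetch time.
  record(scopeKey: string, usd: number): void {
    this.cache.set(scopeKey, { usd, fetchedAt: this.now() });
  }

  // Call this after the model response has been returned to the caller.
  async refreshAfterResponse(scopeKey: string, fetchActualUsd: FetchActualUsd): Promise<void> {
    if (!this.isStale(scopeKey)) return;
    this.record(scopeKey, await fetchActualUsd(scopeKey));
  }
}
```

In this sketch a caller would run assertWithinBudget before each model call and refreshAfterResponse after it, so an overspend discovered by one refresh blocks the following call rather than the current one.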

Databricks tenant limit

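import { databricksTenantCostLimit } from '@fabric-harness/databricks';

// `client`, `warehouseId`, `tenantId`, and `init` are assumed to be in scope.
const costLimit = databricksTenantCostLimit(client, warehouseId, tenantId, {
  perDayUsd: 50,      // daily real-spend cap for this tenant, in USD
  cacheTtlMs: 60_000, // reuse fetched actuals for 60s before querying again
});

const fabric = await init({ costLimit /* …model, tools, policy */ });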

This returns a CostLimit with scopeSource: 'external' and a Databricks ActualCostSource. The source queries system.billing.usage joined to system.billing.list_prices, filters by fabric.* custom tags, and caches results per scope key. Existing tenantCostLimit / postgresCostBudgetStore usage is unchanged, because scopeSource defaults to 'incremental'.
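
To make the join concrete, here is a sketch of the kind of query such a source might issue. It is an assumption, not the package's actual SQL: the tag key fabric.tenant, the function name buildTenantSpendQuery, and the parameter names are all hypothetical. The table and column names follow the documented system.billing schema, and the tenant ID is passed as a named parameter rather than interpolated into the SQL text.

```typescript
// Hypothetical sketch: the tag key 'fabric.tenant' and all names below are assumptions.
interface ParameterizedQuery {
  sql: string;
  params: Record<string, string>;
}

function buildTenantSpendQuery(
  tenantId: string,
  startDate: string, // 'YYYY-MM-DD'
  endDate: string,   // 'YYYY-MM-DD'
): ParameterizedQuery {
  // Join each usage record to the list price that was in effect for its SKU.
  const sql = `
SELECT
  SUM(u.usage_quantity) AS actual_units,
  SUM(u.usage_quantity * p.pricing.default) AS actual_usd
FROM system.billing.usage AS u
JOIN system.billing.list_prices AS p
  ON u.cloud = p.cloud
 AND u.sku_name = p.sku_name
 AND u.usage_start_time >= p.price_start_time
 AND (p.price_end_time IS NULL OR u.usage_end_time <= p.price_end_time)
WHERE u.custom_tags['fabric.tenant'] = :tenant_id
  AND u.usage_date BETWEEN CAST(:start_date AS DATE) AND CAST(:end_date AS DATE)`.trim();

  return {
    sql,
    params: { tenant_id: tenantId, start_date: startDate, end_date: endDate },
  };
}
```

Because the price comes from list_prices, the resulting figure reflects list price and does not account for any negotiated discount.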

Attribution dimensions

Spend is attributed across agent · user · tenant · model · provider · session · turn. Stamped onto each ModelUsage and tool-call entry, these answer "which agent/user/tenant spent what":

const breakdown = await databricksActualCostSource(client, warehouseId).queryAttribution({
  tenantId: 'acme',
  startDate: '2026-06-01',
  endDate: '2026-06-23',
  groupBy: ['agentId', 'userId'],
});
// → [{ agentId, userId, tenantId, period, estimatedUsd, actualUsd, actualDbus, deltaUsd, calls }]

queryAttribution is for offline/ops/dashboard use (it may scan large windows), not inline enforcement. cost_limit events carry actualsFetchedAt / cacheTtlMs so consumers understand the lag.
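
For dashboard use, rows like the ones above can be rolled up further on the client. The following is an illustrative sketch only: AttributionRow models just a subset of the fields listed in the comment above, and totalsByAgent is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: models only some of the fields shown in the result comment above.
interface AttributionRow {
  agentId: string;
  userId: string;
  estimatedUsd: number;
  actualUsd: number;
  calls: number;
}

interface AgentTotal {
  agentId: string;
  estimatedUsd: number;
  actualUsd: number;
  calls: number;
}

// Sum per-(agent, user) rows into per-agent totals, largest actual spend first.
function totalsByAgent(rows: AttributionRow[]): AgentTotal[] {
  const totals = new Map<string, AgentTotal>();
  for (const row of rows) {
    const total = totals.get(row.agentId) ??
      { agentId: row.agentId, estimatedUsd: 0, actualUsd: 0, calls: 0 };
    total.estimatedUsd += row.estimatedUsd;
    total.actualUsd += row.actualUsd;
    total.calls += row.calls;
    totals.set(row.agentId, total);
  }
  return [...totals.values()].sort((a, b) => b.actualUsd - a.actualUsd);
}
```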

Attribution contract

Attributing spend to a specific agent/tenant requires Databricks resources to carry fabric.* custom tags (which flow into system.billing.usage.custom_tags). The Databricks deploy targets can tag the resources they create; for untagged / bring-your-own setups, scope coarsely by warehouse/endpoint + time window.
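
As an illustration of the tagging contract, a tag map for a resource could be built like this. The exact tag keys are not specified above, so fabric.tenant, fabric.agent, and fabric.user are assumptions, and fabricTags is a hypothetical helper rather than something the deploy targets are known to export.

```typescript
// Hypothetical sketch: the tag keys below are assumed, not confirmed by the package.
interface FabricScope {
  tenantId: string;
  agentId?: string;
  userId?: string;
}

// Build the custom-tag map to attach to a Databricks resource for one scope.
function fabricTags(scope: FabricScope): Record<string, string> {
  const tags: Record<string, string> = { 'fabric.tenant': scope.tenantId };
  if (scope.agentId) tags['fabric.agent'] = scope.agentId;
  if (scope.userId) tags['fabric.user'] = scope.userId;
  return tags;
}
```

Tags applied this way would then be available for filtering in system.billing.usage.custom_tags, as described above.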

See also

  • Multi-tenancy · Databricks
  • Example: examples/with-databricks-cost-attribution (runs against a mocked warehouse).