Production Readiness Checklist

What to validate before running Fabric Harness agents in production.

Fabric Harness v1.13.0 is ready for controlled production pilots on the Node, virtual, Docker, and Temporal paths. Cloudflare Containers/Sandbox remains pilot/live-gated. Cloudflare Shell Workspace is pilot-ready only after manual live validation, so keep the live smoke workflow green in your Cloudflare account before production rollout. Treat other hosted or cloud backends as pilot or experimental until they have live deployment coverage in your environment.

Release baseline

Before promoting a release:

  • Run pnpm install --frozen-lockfile.
  • Run pnpm run check.
  • Run pnpm run build.
  • Run pnpm run test.
  • Pack publishable packages and inspect manifests for semver dependency ranges, not workspace:*.
  • Install @fabric-harness/sdk@latest, @fabric-harness/node@latest, and @fabric-harness/cli@latest in a clean ESM consumer project.
  • Verify fh --help and SDK subpath imports.
  • Deploy the Fumadocs site and verify https://harness.fabric.pro/docs plus llms.txt.
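
The manifest inspection step above can be automated. The following is a minimal TypeScript sketch, not part of Fabric Harness: `findWorkspaceRanges` and the `Manifest` type are hypothetical names, and the sample manifest is invented for illustration. It assumes each packed `package.json` has already been parsed into an object.

```typescript
// Hypothetical helper for the release baseline; not a Fabric Harness API.
interface Manifest {
  name: string;
  dependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
  optionalDependencies?: Record<string, string>;
}

// Dependency fields that consumers resolve when they install a published package.
const PUBLISHED_FIELDS = ["dependencies", "peerDependencies", "optionalDependencies"] as const;

// Returns one entry per dependency whose range still uses the workspace: protocol.
function findWorkspaceRanges(manifest: Manifest): string[] {
  const offenders: string[] = [];
  for (const field of PUBLISHED_FIELDS) {
    const deps: Record<string, string> = manifest[field] ?? {};
    for (const dep of Object.keys(deps)) {
      const range = deps[dep] ?? "";
      if (/^workspace:/.test(range)) {
        offenders.push(`${field}.${dep}@${range}`);
      }
    }
  }
  return offenders;
}

// Invented sample manifest: two ranges were left unresolved by the pack step.
const sampleManifest: Manifest = {
  name: "@example/app",
  dependencies: { "@example/core": "workspace:*", "left-pad": "^1.3.0" },
  peerDependencies: { "@example/types": "workspace:^" },
};

console.log(findWorkspaceRanges(sampleManifest));
```

A release script could fail the build whenever the returned array is non-empty for any packed package.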

Recommended deployment path

  1. Start with the node target and virtual sandbox for simple support/routing/extraction agents.
  2. Use local only when CI/repo automation intentionally needs host tools.
  3. Use Docker sandbox for untrusted shell or data-analysis work.
  4. Use Temporal for long-running tasks, approval delays, and restart durability.
  5. Use Cloudflare for edge/serverless pilots after live smoke passes in your account.
  6. Keep AKS, ACA, Databricks, E2B, Daytona, and Modal behind live-gated smoke tests until your environment validates them.
  7. Use policies and approvals for every mutating command, tool, or external integration.
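
The selection rules in the numbered list can be sketched as a small decision function. This is illustrative only: the `Workload` and `Recommendation` types, the split into a runtime choice and a sandbox choice, and the precedence between rules are assumptions made for the sketch, not Fabric Harness configuration. The checklist does not state which runtime and sandbox pairings are supported, so verify any combination in your environment.

```typescript
// Hypothetical types; the labels mirror the list above, not a Fabric Harness API.
interface Workload {
  runsUntrustedCode: boolean;     // untrusted shell or data-analysis work
  needsHostTools: boolean;        // CI/repo automation that intentionally uses host tools
  longRunning: boolean;           // long tasks, approval delays, restart durability
  edgeDeployment: boolean;        // edge/serverless pilot
  cloudflareSmokePassed: boolean; // live smoke is green in your Cloudflare account
}

interface Recommendation {
  runtime: "node" | "temporal" | "cloudflare";
  sandbox: "virtual" | "local" | "docker";
}

// Assumed precedence: durability needs win over edge placement,
// and untrusted code always forces the Docker sandbox.
function recommend(w: Workload): Recommendation {
  const runtime = w.longRunning
    ? "temporal"
    : w.edgeDeployment && w.cloudflareSmokePassed
      ? "cloudflare"
      : "node";
  const sandbox = w.runsUntrustedCode
    ? "docker"
    : w.needsHostTools
      ? "local"
      : "virtual";
  return { runtime, sandbox };
}

// A simple routing agent falls through to the default first-run path.
const simpleAgent: Workload = {
  runsUntrustedCode: false,
  needsHostTools: false,
  longRunning: false,
  edgeDeployment: false,
  cloudflareSmokePassed: false,
};

console.log(recommend(simpleAgent)); // { runtime: "node", sandbox: "virtual" }
```

Note that an edge workload without a passing live smoke run falls back to the node runtime in this sketch, which matches the instruction to pilot Cloudflare only after live smoke passes.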

Security gate

  • Define explicit filesystem, command, tool, and network policies.
  • Prefer deny-by-default command and network policy for unattended agents.
  • Route risky actions through requireApproval or custom approval rules.
  • Store secrets outside prompts and logs; verify redaction on representative failures.
  • Prefer Docker or a remote sandbox for untrusted code. Avoid the host-local sandbox for untrusted workloads.
  • Configure tenant IDs when running multiple users/customers through the same deployment.
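
To make the deny-by-default idea concrete, here is a minimal sketch of a command policy evaluator. It is not the Fabric Harness policy API: `CommandPolicy`, `needsApproval`, and `evaluateCommand` are hypothetical names, and matching on the first token of the command line is a simplifying assumption (a real policy would also need to consider arguments, shell operators, and paths).

```typescript
// Hypothetical policy shape; the real Fabric Harness policy types may differ.
type Decision = "allow" | "require-approval" | "deny";

interface CommandPolicy {
  allow: string[];         // executables that may run unattended
  needsApproval: string[]; // executables that must go through a human approval
}

// Deny-by-default: anything not explicitly listed is rejected.
function evaluateCommand(policy: CommandPolicy, commandLine: string): Decision {
  const executable = commandLine.trim().split(/\s+/)[0];
  if (!executable) return "deny";
  // Approval is checked first so a command listed in both sets is never auto-allowed.
  if (policy.needsApproval.indexOf(executable) !== -1) return "require-approval";
  if (policy.allow.indexOf(executable) !== -1) return "allow";
  return "deny";
}

const examplePolicy: CommandPolicy = {
  allow: ["ls", "cat", "grep"],
  needsApproval: ["rm", "git"],
};

console.log(evaluateCommand(examplePolicy, "ls -la"));             // allow
console.log(evaluateCommand(examplePolicy, "rm -rf ./build"));     // require-approval
console.log(evaluateCommand(examplePolicy, "curl example.invalid")); // deny
```

The useful property to test in your own policies is the last case: a command nobody thought to list is refused, not silently permitted.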

Runtime gate

  • Set request body limits, auth, and rate limits for every public HTTP surface.
  • Use durable session stores for production deployments; avoid in-memory stores outside tests.
  • Back up session/artifact stores according to your recovery objectives.
  • Verify cancellation, retry, and timeout behavior on your target runtime.
  • For Temporal, test worker restart, workflow retry, approval delay, and namespace/auth configuration.
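
The first item in this gate (auth, body limits, and rate limits on every public HTTP surface) can be sketched as an ordered guard. This is a framework-neutral illustration under stated assumptions: `IncomingRequest`, `GateConfig`, and `gateRequest` are hypothetical names, the auth check only tests for the presence of a credential, and the per-client counter stands in for a real windowed rate limiter backed by a shared store.

```typescript
// Hypothetical request summary; a real server would derive this from its framework.
interface IncomingRequest {
  authorization?: string; // raw Authorization header, if any
  contentLength: number;  // declared body size in bytes
  clientId: string;       // key used for rate limiting
}

interface GateConfig {
  maxBodyBytes: number;
  maxRequestsPerWindow: number;
}

// Returns the HTTP status the surface should respond with.
// Rejected requests return before the counter is touched, so they do not consume quota here.
function gateRequest(
  req: IncomingRequest,
  config: GateConfig,
  counts: Record<string, number>,
): number {
  if (!req.authorization) return 401;                      // Unauthorized
  if (req.contentLength > config.maxBodyBytes) return 413; // Content Too Large
  const seen = (counts[req.clientId] ?? 0) + 1;
  counts[req.clientId] = seen;
  if (seen > config.maxRequestsPerWindow) return 429;      // Too Many Requests
  return 200;
}

const gateConfig: GateConfig = { maxBodyBytes: 1024 * 1024, maxRequestsPerWindow: 2 };
const windowCounts: Record<string, number> = {};

console.log(gateRequest({ clientId: "tenant-a", contentLength: 10 }, gateConfig, windowCounts)); // 401
```

Resetting `windowCounts` at the end of each window is left out of the sketch; in production that state belongs in a durable or shared store, for the same reason the checklist warns against in-memory session stores.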

Observability gate

  • Export traces through OpenTelemetry, App Insights, Langfuse, or your chosen collector.
  • Track token usage, cost, duration, tool calls, approvals, and task lifecycle events.
  • Add dashboards/alerts for model failures, budget exhaustion, approval timeouts, rate limits, and sandbox failures.
  • Keep audit exports for compliance-sensitive agents.
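
As an illustration of the usage tracking and budget alerting described above, the sketch below aggregates per-task usage events and flags budget exhaustion. The `UsageEvent` shape, its field names, and `summarizeUsage` are assumptions for this example; the events that Fabric Harness actually emits may carry different fields.

```typescript
// Hypothetical event shape; map your exported telemetry onto it.
interface UsageEvent {
  taskId: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
  durationMs: number;
  toolCalls: number;
}

interface UsageSummary {
  totalTokens: number;
  totalCostUsd: number;
  totalDurationMs: number;
  totalToolCalls: number;
  budgetExhausted: boolean;
}

// Sums usage across events and marks the budget as exhausted once cost reaches it.
function summarizeUsage(events: UsageEvent[], budgetUsd: number): UsageSummary {
  const summary: UsageSummary = {
    totalTokens: 0,
    totalCostUsd: 0,
    totalDurationMs: 0,
    totalToolCalls: 0,
    budgetExhausted: false,
  };
  for (const event of events) {
    summary.totalTokens += event.inputTokens + event.outputTokens;
    summary.totalCostUsd += event.costUsd;
    summary.totalDurationMs += event.durationMs;
    summary.totalToolCalls += event.toolCalls;
  }
  summary.budgetExhausted = summary.totalCostUsd >= budgetUsd;
  return summary;
}

// Invented sample data for illustration.
const sampleEvents: UsageEvent[] = [
  { taskId: "t1", inputTokens: 100, outputTokens: 50, costUsd: 0.25, durationMs: 1200, toolCalls: 2 },
  { taskId: "t2", inputTokens: 200, outputTokens: 25, costUsd: 0.5, durationMs: 800, toolCalls: 1 },
];

console.log(summarizeUsage(sampleEvents, 1).budgetExhausted); // false
```

An alert rule would typically fire before `budgetExhausted` becomes true, for example at a fixed fraction of the budget; that threshold is a policy choice left to the operator.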

Live integration gate

Run live-gated tests for every backend you plan to support:

  • Real model provider smoke tests.
  • Docker sandbox execution.
  • Postgres/Redis/session stores under expected concurrency.
  • Temporal local or cloud namespace tests, using a real gRPC readiness probe before integration tests.
  • Cloudflare Worker/R2 deployment smoke tests, including Shell Workspace code tool execution when using mode: 'shell-workspace'.
  • Azure OpenAI, Foundry, AKS, ACA, or ARM tests as applicable.
  • Databricks SQL/Jobs/Volumes tests as applicable.
  • E2B, Daytona, Modal, Kubernetes, or other sandbox provider tests as applicable.
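
One common way to live-gate a test suite is an explicit opt-in flag per backend, so that live tests never run by accident in a default CI job. The sketch below assumes a hypothetical `LIVE_TEST_<BACKEND>=1` environment variable convention; Fabric Harness may use different names, and `liveFlagName`, `shouldRunLive`, and `planLiveRuns` are invented for this example.

```typescript
// Hypothetical convention: LIVE_TEST_<BACKEND>=1 opts a backend into live tests.
type Env = Record<string, string | undefined>;

// Normalizes a backend label such as "cloudflare-r2" into LIVE_TEST_CLOUDFLARE_R2.
function liveFlagName(backend: string): string {
  return "LIVE_TEST_" + backend.toUpperCase().replace(/[^A-Z0-9]+/g, "_");
}

// A backend runs live only when its flag is exactly "1"; anything else is a skip.
function shouldRunLive(env: Env, backend: string): boolean {
  return env[liveFlagName(backend)] === "1";
}

// Splits the supported backends into those that will run and those that will be skipped,
// so the CI log shows which coverage is missing.
function planLiveRuns(env: Env, backends: string[]): { run: string[]; skipped: string[] } {
  const run: string[] = [];
  const skipped: string[] = [];
  for (const backend of backends) {
    (shouldRunLive(env, backend) ? run : skipped).push(backend);
  }
  return { run, skipped };
}

// Invented environment for illustration.
const exampleEnv: Env = { LIVE_TEST_DOCKER: "1", LIVE_TEST_CLOUDFLARE_R2: "0" };

console.log(planLiveRuns(exampleEnv, ["docker", "cloudflare-r2", "temporal"]));
```

In a Node test runner you would pass `process.env` as the first argument. Reporting the `skipped` list explicitly helps keep a backend from being treated as validated when its live tests never ran.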

API and release policy gate

Before declaring a GA/v1 runtime:

  • Publish and follow a SemVer/deprecation policy.
  • Mark experimental subpaths and deployment targets clearly.
  • Provide migration guides for minor versions with behavioral changes.
  • Generate public API reference from JSDoc.
  • Keep example projects runnable against the published npm packages.

Current v1.13.0 status

  • Production-pilot ready: SDK, CLI, Node target, virtual sandbox first-run path, Docker sandbox, Temporal controlled deployments, policies, approvals, artifacts, costs, rate limiting, audit export, session.fs / agent.fs, admin/OpenAPI aliases, and docs publishing.
  • Pilot/live-gated: Cloudflare Workers/Sandbox/R2, repeated CI/live deployment coverage for Cloudflare Shell Workspace, Postgres at scale, AKS/Kubernetes, Databricks, E2B, Daytona, Modal, and provider live matrices.
  • Still pending for GA: broader cloud backend certification, more live CI coverage, generated API reference, independent production deployments, and stronger operator UI/notification surfaces for approvals.