Sandbox lifecycle
Auto-suspend, snapshot/fork, and sandbox refs — managing long-running compute durably.
Long-running sandboxes (AKS pods, E2B sandboxes, Daytona workspaces) cost money while they're sitting idle. Fabric Harness v0.7 introduces lifecycle primitives — modeled after Temporal's sandbox-orchestration-harness — to let agents reclaim that cost without losing state.
Auto-suspend on idle (v0.7 first slice)
Pass idleSuspendMs to init() (or any sandbox factory option). After that many milliseconds with no operation, the sandbox is suspended. The next operation transparently resumes it.
import { init } from '@fabric-harness/sdk';
const fabric = await init({
sandbox: 'docker',
idleSuspendMs: 5 * 60_000, // suspend after 5 minutes idle
autoResumeOnAccess: true, // default — auto-resume on next call
});
Backends that don't implement suspend() / resume() ignore the option (no-op pass-through). Currently supported:
| Backend | Native support |
|---|---|
| daytonaSandbox | ✅ via stop / start |
| e2bSandbox | ✅ via pause / resume |
| modalSandbox | ⏳ snapshot fallback (Phase B) |
| kubernetesSandbox / aksSandbox | ⏳ scale-to-0 (planned) |
| local / docker / empty / virtual | n/a (no value in suspending) |
How it works
createSandboxEnv() wraps the env in a SuspendingSandboxEnv decorator when idleSuspendMs > 0 and the inner env supports suspend(). The decorator:
- Tracks lastAccess on every operation.
- Schedules a setTimeout for idleSuspendMs from the last access.
- Calls inner.suspend() when the timer fires.
- On the next op, calls inner.resume() first (auto), then dispatches.
- Clears the timer in cleanup().
The timer is unref()'d so it never keeps the Node process alive on its own.
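The decorator steps above can be sketched as follows. This is a minimal illustration and not the SDK's SuspendingSandboxEnv: the SuspendableEnv interface, the IdleSuspendSketch class, and the exec() method are hypothetical names chosen for the example, and the sketch ignores edge cases such as concurrent calls arriving during a resume or the timer firing while a long operation is still running.

```typescript
// Hypothetical minimal shape; the SDK's real SandboxEnv has more methods.
interface SuspendableEnv {
  exec(cmd: string): Promise<string>;
  suspend(): Promise<void>;
  resume(): Promise<void>;
}

class IdleSuspendSketch {
  lastAccess = 0;
  private inner: SuspendableEnv;
  private idleSuspendMs: number;
  private timer: ReturnType<typeof setTimeout> | undefined;
  private suspended = false;
  private pendingSuspend: Promise<void> = Promise.resolve();

  constructor(inner: SuspendableEnv, idleSuspendMs: number) {
    this.inner = inner;
    this.idleSuspendMs = idleSuspendMs;
    this.touch();
  }

  // Record the access and restart the idle countdown from now.
  private touch(): void {
    this.lastAccess = Date.now();
    if (this.timer !== undefined) clearTimeout(this.timer);
    const timer = setTimeout(() => {
      this.suspended = true;
      this.pendingSuspend = this.inner.suspend();
    }, this.idleSuspendMs);
    // unref() (Node only) so this timer alone never keeps the process alive.
    (timer as unknown as { unref?: () => void }).unref?.();
    this.timer = timer;
  }

  async exec(cmd: string): Promise<string> {
    if (this.suspended) {
      await this.pendingSuspend; // let an in-flight suspend settle first
      await this.inner.resume();
      this.suspended = false;
    }
    this.touch();
    return this.inner.exec(cmd);
  }

  // Stop the idle countdown; the inner env's own teardown is out of scope here.
  cleanup(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
  }
}
```

Restarting the timer on every access, rather than polling lastAccess, keeps the wrapper to a single pending timer per sandbox.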
Manual control
You can call session.sandbox.suspend() / resume() directly when the sandbox reports the suspend capability:
const sandbox = await session.sandbox;
if (sandbox.capabilities?.suspend) {
await sandbox.suspend();
}
Strict mode
Set autoResumeOnAccess: false to receive a FabricError with code SANDBOX_UNAVAILABLE when an operation reaches a suspended sandbox, instead of a transparent resume. This is useful when you want to gate operations explicitly on suspension state.
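To make the difference between the two modes concrete, here is a small self-contained model of the gate that runs before an operation is dispatched. It does not call the SDK: GateState, gateAccess, and runStrict are hypothetical names, and the error is a plain Error carrying a code field, standing in for FabricError.

```typescript
// Hypothetical state for the sketch; the real decorator tracks this internally.
interface GateState {
  suspended: boolean;
  resumeCount: number;
}

function sandboxUnavailable(message: string): Error {
  // Stand-in for FabricError; only the `code` value comes from the docs.
  return Object.assign(new Error(message), { code: 'SANDBOX_UNAVAILABLE' });
}

// Runs before every operation: resume transparently, or refuse in strict mode.
function gateAccess(state: GateState, autoResumeOnAccess: boolean): void {
  if (!state.suspended) return;
  if (!autoResumeOnAccess) {
    throw sandboxUnavailable('sandbox is suspended');
  }
  state.suspended = false; // stands in for `await inner.resume()`
  state.resumeCount += 1;
}

// Caller-side pattern for strict mode: catch, decide, resume explicitly, continue.
function runStrict(state: GateState, op: () => string): string {
  try {
    gateAccess(state, false);
  } catch (err) {
    if ((err as { code?: string }).code !== 'SANDBOX_UNAVAILABLE') throw err;
    // Policy decision point: this caller chooses to resume and proceed.
    state.suspended = false;
    state.resumeCount += 1;
  }
  return op();
}
```

In strict mode the resume becomes a visible decision in the caller's code, which is where a budget check or an approval step could go.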
Snapshot-and-fork (v0.7 Phase B)
session.fork(label?) captures the sandbox state and returns a SandboxFork handle. Calling fork.attach() spins up an independent session whose sandbox is a fork of the captured state — writes in branch A are invisible to branch B and to the origin.
const session = await fabric.session('experiment');
await session.shell('npm install && npm test');
const fork = await session.fork('after-install');
// Run two parallel experiments from the same checkpoint:
const branchA = await fork.attach();
const branchB = await fork.attach();
await Promise.all([
branchA.shell('node bench.js --variant fast'),
branchB.shell('node bench.js --variant safe'),
]);
Capability
const sandbox = await session.sandbox;
if (sandbox.capabilities?.fork) {
// session.fork() will work
}
Backend support
| Backend | Fork supported |
|---|---|
| empty / virtual | ✅ in-memory snapshot clone |
| local | ✅ on-disk snapshot copied to a fresh workspace temp dir |
| docker | ✅ on-disk snapshot copied to a fresh container workspace |
| e2bSandbox / modalSandbox (provider-managed) | ✅ via RemoteSandboxApi.fork() when the underlying provider supports start-from-snapshot |
| daytonaSandbox | ❌ Daytona doesn't expose start-from-snapshot |
| databricksSqlSandbox | ❌ stateless SQL; a fork would be identical to the origin, so it is not implemented |
| cloudflareSandbox | ❌ depends on R2 + DO model |
session.fork() throws FabricError { code: 'SANDBOX_UNAVAILABLE' } when the sandbox doesn't support fork.
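Code that has to run across several backends may prefer a helper that returns undefined rather than throwing when fork is unavailable. The sketch below is one way to write it, under assumptions: SessionLike and ForkHandleLike are hypothetical minimal shapes that mirror only the members used on this page, and forkIfSupported is not an SDK function.

```typescript
// Hypothetical minimal shapes; only the members used below are modeled.
interface ForkHandleLike {
  attach(): Promise<unknown>;
}

interface SessionLike {
  sandbox: Promise<{ capabilities?: { fork?: boolean } }>;
  fork(label?: string): Promise<ForkHandleLike>;
}

// Returns a fork handle, or undefined when the backend cannot fork.
async function forkIfSupported(
  session: SessionLike,
  label?: string,
): Promise<ForkHandleLike | undefined> {
  const sandbox = await session.sandbox;
  // Cheap check first: skip the call entirely when the capability is absent.
  if (!sandbox.capabilities?.fork) return undefined;
  try {
    return await session.fork(label);
  } catch (err) {
    // Treat the documented error code as "not forkable"; rethrow anything else.
    if ((err as { code?: string }).code === 'SANDBOX_UNAVAILABLE') return undefined;
    throw err;
  }
}
```

A caller can then fall back to running variants sequentially in the origin session when the result is undefined.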
Cleanup
Each forked session owns its own sandbox; calling branch.sandbox.cleanup() (or letting it fall out of scope) tears down only that branch. The origin is unaffected.
For local/docker, fork creates a fresh temp workspace directory. On Docker, the fork's cleanup() removes it; on Local, the directory is left in place and removing it is the caller's responsibility. The shared snapshotRoot is not auto-removed; this is intentional, so that other forks of the same snapshot remain attachable.
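The isolation and cleanup rules can be modeled with an in-memory snapshot clone, similar in spirit to what the table lists for the empty / virtual backends. The classes below are illustrative only: MemoryEnvSketch and MemoryForkSketch are hypothetical names, and a Map of path to content stands in for the workspace.

```typescript
// In-memory stand-in for a sandbox workspace: path -> file content.
class MemoryEnvSketch {
  private files: Map<string, string>;

  constructor(files?: Map<string, string>) {
    // Always copy, so no two envs ever share the same Map instance.
    this.files = new Map<string, string>(files);
  }

  write(path: string, content: string): void {
    this.files.set(path, content);
  }

  read(path: string): string | undefined {
    return this.files.get(path);
  }

  // Capture state now; later writes to this env do not leak into the fork.
  fork(): MemoryForkSketch {
    return new MemoryForkSketch(new Map(this.files));
  }

  // Tears down only this env's own state.
  cleanup(): void {
    this.files.clear();
  }
}

class MemoryForkSketch {
  private snapshot: Map<string, string>;

  constructor(snapshot: Map<string, string>) {
    this.snapshot = snapshot;
  }

  // Each attach() yields an independent env seeded from the same snapshot.
  attach(): MemoryEnvSketch {
    return new MemoryEnvSketch(this.snapshot);
  }
}
```

Because the snapshot outlives any single branch, cleaning up one branch leaves the snapshot attachable, which mirrors why snapshotRoot is kept around.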
Sandbox refs (v0.7 Phase C)
Sometimes you want two sessions sharing the same running sandbox — not a fork, the actual instance — for multi-agent coordination, observability shadowing, or test isolation. session.sandboxRef() returns an opaque handle; pass it to attachSandbox(ref) in another session to participate in the same sandbox without taking ownership of its lifecycle.
import { init, attachSandbox } from '@fabric-harness/sdk';
// Owner: creates the sandbox, owns cleanup.
const ownerAgent = await init({ sandbox: 'docker', cwd: '/tmp/work' });
const owner = await ownerAgent.session('orchestrator');
await owner.shell('npm install');
const ref = await owner.sandboxRef();
// Attached: shares the same sandbox, does NOT own cleanup.
const workerAgent = await init({ sandbox: attachSandbox(ref) });
const worker = await workerAgent.session('worker-1');
await worker.shell('npm test'); // runs in owner's sandbox
// worker.cleanup() is a no-op for the underlying sandbox.
// owner.cleanup() tears down for everyone.
Lifecycle rules
- The first session to register a ref is the owner. Owner cleanup tears down the underlying sandbox.
- Any session can attachSandbox(ref) to participate. Attached sessions are non-owning: their cleanup unregisters the attachment but doesn't touch the underlying sandbox.
- After the owner cleanup, attached sessions see FabricError { code: 'SANDBOX_UNAVAILABLE' } on subsequent operations. The error message says "no longer alive (owner cleanup ran)" so debugging is unambiguous.
- Refs are in-process only in v0.7. Cross-process refs (encoded with provider routing data) are tracked for v0.8+.
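The ownership rules above can be pictured as a small in-process registry. This is a sketch of the semantics only, not the SDK's registerSandbox / unregisterSandbox implementation: SharedSandbox, registerOwned, and attachTo are hypothetical names, and a plain Error with a code field stands in for FabricError.

```typescript
// Hypothetical stand-in for a running sandbox.
interface SharedSandbox {
  run(cmd: string): string;
  destroy(): void;
}

interface RefEntry {
  sandbox: SharedSandbox;
  alive: boolean;
  attached: number;
}

// In-process registry: ref ids only mean something inside this process.
const refRegistry = new Map<string, RefEntry>();
let nextRefId = 0;

function unavailable(): Error {
  return Object.assign(new Error('sandbox no longer alive (owner cleanup ran)'), {
    code: 'SANDBOX_UNAVAILABLE',
  });
}

// Owner side: registers the sandbox and is the only party that destroys it.
function registerOwned(sandbox: SharedSandbox): { ref: string; cleanup(): void } {
  const ref = `ref-${nextRefId++}`;
  const entry: RefEntry = { sandbox, alive: true, attached: 0 };
  refRegistry.set(ref, entry);
  return {
    ref,
    cleanup(): void {
      if (!entry.alive) return;
      entry.alive = false;
      entry.sandbox.destroy();
      refRegistry.delete(ref);
    },
  };
}

// Attached side: shares the live sandbox but never destroys it.
function attachTo(ref: string): { run(cmd: string): string; cleanup(): void } {
  const entry = refRegistry.get(ref);
  if (!entry || !entry.alive) throw unavailable();
  entry.attached += 1;
  let detached = false;
  return {
    run(cmd: string): string {
      if (!entry.alive) throw unavailable();
      return entry.sandbox.run(cmd);
    },
    cleanup(): void {
      if (detached) return;
      detached = true;
      entry.attached -= 1; // unregister the attachment only
    },
  };
}
```

Keeping the alive flag on the shared entry is what lets an attached handle fail fast after the owner has cleaned up, instead of calling into a destroyed sandbox.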
When to use
- Multi-agent coordination: an orchestrator and a worker both operating on the same workspace.
- Observability shadow: a metrics/log-collecting session attached to a primary session's sandbox.
- Test scaffolding: a setup phase that pre-populates a sandbox and hands the ref to the test session.
When you want independent branches instead, use session.fork() (Phase B, above). Forks copy state at a point in time; refs share live state.
See docs/ROADMAP.md for status.
Reference
- SandboxEnv.suspend?(): Promise<void>
- SandboxEnv.resume?(): Promise<void>
- SandboxEnv.fork?(snapshot): Promise<SandboxEnv>
- SandboxCapabilities.suspend?: boolean
- SandboxCapabilities.resume?: boolean
- SandboxCapabilities.fork?: boolean
- SandboxFactoryOptions.idleSuspendMs?: number
- SandboxFactoryOptions.autoResumeOnAccess?: boolean
- FabricSession.fork(label?): Promise<SandboxFork>
- SandboxFork.attach(options?): Promise<FabricSession>
- FabricSession.sandboxRef(): Promise<SandboxRef>
- attachSandbox(ref): SandboxFactory. Pass to init({ sandbox }).
- registerSandbox(env, options?) / unregisterSandbox(refId): direct registry access for advanced cases.
- SuspendingSandboxEnv (class): the suspend/resume decorator.
- withIdleSuspend(env, { idleSuspendMs }): helper that wraps when applicable, returns env unchanged otherwise.