
Context Compaction

Automatic, event-emitting context compaction for long sessions. The default import enables it; the /strict import leaves it off until you opt in explicitly. Compaction triggers in two modes: threshold and overflow.

When a session's message history grows past the model's context window, an agent will start failing or thrashing. Fabric Harness solves this with automatic compaction: older session entries are summarized into a single compact entry, recent entries are kept verbatim, and the agent keeps running.

Defaults

| SDK | Default | Why |
| --- | --- | --- |
| Default import (@fabric-harness/sdk) | enabled: true | Headless agents are often webhook/serverless handlers. Failing at the context window is the wrong default. |
| Strict import (@fabric-harness/sdk/strict) | not injected (undefined until you set it) | Required for Temporal replay determinism: auto-compaction is non-deterministic across replays. Opt in only when you don't need replay safety. |

⚠️ Selecting runtime: 'temporal' from the default import emits a one-time console.warn, because auto-compaction is non-deterministic across Temporal replays. Either switch to /strict or pass compaction: { enabled: false } explicitly.

Trigger modes

  1. Threshold — before each model call, the harness estimates token usage from session history. When estimated tokens exceed compactAtTokens (auto-derived as contextWindowTokens - reserveTokens if not set), it compacts before sending the next prompt.
  2. Overflow — if the model returns a context overflow error anyway, the harness compacts and retries the prompt once. Set recoverFromOverflow: false to disable.

Both paths emit a typed compaction event with reason: 'threshold' | 'overflow', messagesBefore, messagesAfter, and tokensBefore.
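
The two trigger modes can be sketched as follows. This is a minimal illustration under stated assumptions, not the harness's actual implementation: the helper names (resolveCompactAtTokens, shouldCompact, promptWithOverflowRecovery) and the isContextOverflow, send, and compact callbacks are hypothetical. Only the field names and the documented defaults come from this page.

```typescript
// Hypothetical mirror of the documented settings; not imported from the SDK.
interface TriggerSettings {
  compactAtTokens?: number;
  reserveTokens?: number;        // documented default: 4096
  recoverFromOverflow?: boolean; // documented default: true
}

// Threshold: an explicit compactAtTokens wins; otherwise it is derived as
// contextWindowTokens - reserveTokens.
function resolveCompactAtTokens(contextWindowTokens: number, s: TriggerSettings): number {
  return s.compactAtTokens ?? contextWindowTokens - (s.reserveTokens ?? 4096);
}

// Checked before each model call: compact only when the estimate exceeds the threshold.
function shouldCompact(estimatedTokens: number, contextWindowTokens: number, s: TriggerSettings): boolean {
  return estimatedTokens > resolveCompactAtTokens(contextWindowTokens, s);
}

// Overflow: if the provider still reports a context overflow, compact and
// retry exactly once. A second failure propagates to the caller.
async function promptWithOverflowRecovery<T>(
  send: () => Promise<T>,                        // hypothetical: performs the model call
  compact: () => Promise<void>,                  // hypothetical: runs compaction
  isContextOverflow: (err: unknown) => boolean,  // hypothetical: classifies provider errors
  s: TriggerSettings,
): Promise<T> {
  try {
    return await send();
  } catch (err) {
    if (s.recoverFromOverflow === false || !isContextOverflow(err)) throw err;
    await compact();
    return await send();
  }
}
```

With a 200,000-token context window and the default reserveTokens of 4096, this sketch derives a threshold of 195,904 tokens.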

Configuration

import { agent } from '@fabric-harness/sdk';

export default agent<{ message: string }>({
  name: 'long-running',
  run: async ({ init, input }) => {
    // Compaction is on by default. Override only if you need to.
    const session = await (await init({
      compaction: {
        enabled: true,
        reserveTokens: 8192,         // default 4096
        keepRecentEntries: 30,       // default 20
        // compactAtTokens is derived as contextWindowTokens - reserveTokens if not set
      },
    })).session();
    return { reply: await session.prompt<string>(input.message) };
  },
});

To disable compaction, pass init({ compaction: { enabled: false } }).

With the /strict import, compaction stays off until you opt in explicitly:

import { agent, schema } from '@fabric-harness/sdk/strict';

export default agent({
  name: 'long-running',
  input: schema.object({ message: schema.string() }),
  run: async ({ init, input }) => {
    // Off by default in /strict — opt in explicitly. Avoid for Temporal.
    const session = await (await init({
      runtime: 'inline',
      sandbox: 'local',
      compaction: {
        enabled: true,
        reserveTokens: 16_384,
        keepRecentEntries: 30,
        recoverFromOverflow: true,
      },
    })).session();
    return await session.prompt(input.message);
  },
});

CompactionSettings reference

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| enabled | boolean | Default import: true; /strict: not injected | Master switch. |
| compactAtTokens | number | derived from contextWindowTokens - reserveTokens | Trigger threshold. |
| reserveTokens | number | 4096 | Headroom kept below the model's context window. |
| keepRecentEntries | number | 20 | Recent entries kept verbatim during compaction. Older entries get folded into the summary. |
| recoverFromOverflow | boolean | true | Catch provider context-overflow errors, compact, and retry once. |

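The tuning fields can also be set per prompt via session.prompt(text, { compactAtTokens, reserveTokens, compactionKeepRecentEntries, recoverContextOverflow }). Note that two per-prompt option names differ from their CompactionSettings counterparts: compactionKeepRecentEntries corresponds to keepRecentEntries, and recoverContextOverflow corresponds to recoverFromOverflow. Per-prompt values take precedence over session-level settings.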
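
The precedence rule can be sketched as a small merge function. This is an illustrative sketch only: effectiveSettings and the two interfaces are hypothetical and are not exported by the SDK. The field names and defaults are the ones documented on this page.

```typescript
// Hypothetical mirrors of the documented option shapes.
interface SessionCompaction {
  compactAtTokens?: number;
  reserveTokens?: number;
  keepRecentEntries?: number;
  recoverFromOverflow?: boolean;
}

interface PromptCompaction {
  compactAtTokens?: number;
  reserveTokens?: number;
  compactionKeepRecentEntries?: number;
  recoverContextOverflow?: boolean;
}

interface ResolvedCompaction {
  compactAtTokens: number | undefined; // undefined means "derive from the context window"
  reserveTokens: number;
  keepRecentEntries: number;
  recoverFromOverflow: boolean;
}

// Per-prompt value first, then the session-level value, then the documented default.
function effectiveSettings(session: SessionCompaction, prompt: PromptCompaction): ResolvedCompaction {
  return {
    compactAtTokens: prompt.compactAtTokens ?? session.compactAtTokens,
    reserveTokens: prompt.reserveTokens ?? session.reserveTokens ?? 4096,
    keepRecentEntries: prompt.compactionKeepRecentEntries ?? session.keepRecentEntries ?? 20,
    recoverFromOverflow: prompt.recoverContextOverflow ?? session.recoverFromOverflow ?? true,
  };
}

// Session keeps 30 recent entries; this one prompt narrows it to 5.
const resolved = effectiveSettings(
  { reserveTokens: 8192, keepRecentEntries: 30 },
  { compactionKeepRecentEntries: 5 },
);
```

Here resolved.keepRecentEntries is 5 (the per-prompt value), while resolved.reserveTokens stays at the session-level 8192.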

Subscribing to compaction events

import { init, isEvent, type AgentEvent } from '@fabric-harness/sdk';

const fabric = await init({
  onEvent: (event: AgentEvent) => {
    if (isEvent(event, 'compaction')) {
      console.log(
        `compacted ${event.data.messagesBefore} → ${event.data.messagesAfter}`,
        `(reason: ${event.data.reason}, ${event.data.tokensBefore} tokens)`,
      );
    }
  },
});

See the Events reference for the full event taxonomy.
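
Beyond logging, the event payload lends itself to simple bookkeeping. The sketch below is hypothetical: CompactionStats is not part of the SDK, and the CompactionEventData interface is a local mirror that assumes event.data carries exactly the fields named on this page.

```typescript
// Local mirror of the documented compaction payload (assumed shape).
interface CompactionEventData {
  reason: 'threshold' | 'overflow';
  messagesBefore: number;
  messagesAfter: number;
  tokensBefore: number;
}

// Hypothetical helper: counts compactions per reason and tracks how many
// messages were folded into summaries over the life of the process.
class CompactionStats {
  private counts = { threshold: 0, overflow: 0 };
  private messagesRemoved = 0;

  record(data: CompactionEventData): void {
    this.counts[data.reason] += 1;
    this.messagesRemoved += data.messagesBefore - data.messagesAfter;
  }

  snapshot(): { threshold: number; overflow: number; messagesRemoved: number } {
    return { ...this.counts, messagesRemoved: this.messagesRemoved };
  }
}

const stats = new CompactionStats();
stats.record({ reason: 'threshold', messagesBefore: 120, messagesAfter: 31, tokensBefore: 190_000 });
stats.record({ reason: 'overflow', messagesBefore: 40, messagesAfter: 21, tokensBefore: 201_000 });
```

In an onEvent handler, the call would be stats.record(event.data) inside the isEvent(event, 'compaction') branch. A high share of overflow compactions may indicate that the token estimate runs low for your workload, in which case a larger reserveTokens is worth trying.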

Manual compaction

You can also force compaction outside the auto-loop:

const result = await session.compact({
  keepRecentEntries: 10,
  generateSummary: true,
  reason: 'manual',
});
console.log(result.summary, result.compactedEntries);

How the summary is generated

The summary is produced by the configured model with a deterministic fallback when no model is available. Compaction never breaks an active tool-call/tool-result pair — those move atomically into the kept-recent window. References to files read or modified are appended as a separate block so the agent can re-discover them after compaction.

The summary entry replaces the compacted slice in the session history; subsequent prompts see only the summary plus the recent entries you chose to keep.
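
One way to picture the atomic handling of tool-call/tool-result pairs is as a split point that moves backward until no kept result is separated from its call. The sketch below is an assumption-laden illustration: the Entry shape, the callId field, and splitForCompaction are hypothetical and do not reflect the harness's real session format.

```typescript
// Hypothetical session entry shape, for illustration only.
type Entry =
  | { kind: 'message'; text: string }
  | { kind: 'tool-call'; callId: string }
  | { kind: 'tool-result'; callId: string };

function splitForCompaction(
  entries: Entry[],
  keepRecentEntries: number,
): { toSummarize: Entry[]; kept: Entry[] } {
  // Start by keeping the last keepRecentEntries entries verbatim.
  let cut = Math.max(0, entries.length - Math.max(0, keepRecentEntries));

  // Move the cut back while any kept tool-result has its tool-call in the
  // older slice, so that the pair lands in the kept window together.
  for (;;) {
    const keptResultIds = new Set<string>();
    for (const e of entries.slice(cut)) {
      if (e.kind === 'tool-result') keptResultIds.add(e.callId);
    }
    const earliestCall = entries.findIndex(
      (e) => e.kind === 'tool-call' && keptResultIds.has(e.callId),
    );
    if (earliestCall === -1 || earliestCall >= cut) break;
    cut = earliestCall;
  }

  return { toSummarize: entries.slice(0, cut), kept: entries.slice(cut) };
}

const history: Entry[] = [
  { kind: 'message', text: 'start' },
  { kind: 'tool-call', callId: 'a' },
  { kind: 'tool-result', callId: 'a' },
  { kind: 'message', text: 'middle' },
  { kind: 'tool-call', callId: 'b' },
  { kind: 'tool-result', callId: 'b' },
  { kind: 'message', text: 'end' },
];
const split = splitForCompaction(history, 2);
```

With keepRecentEntries set to 2, a naive cut would keep only the result for call b and the final message. The sketch widens the kept window to three entries so the call and its result stay together, leaving four entries to be summarized.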

When not to enable compaction

  • Short prompts that always fit (one-shot classifiers).
  • Agents that need an exact, replayable conversation history (legal/audit). Use Temporal-backed durability instead.
  • Tasks where summary loss could change the outcome (e.g., the agent must remember a specific identifier mentioned 50 turns ago). In that case, persist key facts as artifacts and reference them by name.

See also