---
id: "framework-harness-derivation"
type: "framework"
source_timestamps: ["§ Working Backwards from Desired Agent Behavior to Harness Engineering"]
tags: ["design-patterns", "methodology"]
related: ["concept-agent-harness", "concept-filesystem-primitive", "concept-bash-general-tool", "concept-ralph-loop"]
steps: ["Identify a behavior we want the agent to perform (e.g., maintain durable state, execute code).", "Recognize the model's out-of-the-box limitation (e.g., models only output text, have fixed context windows).", "Design and implement a Harness feature to bridge the gap (e.g., provide a filesystem abstraction, wrap in a while loop, provide a sandbox)."]
---
# Working Backwards: Deriving Harness Features

## The Framework

The author proposes a methodology for designing agent systems by **working backwards** from a desired behavior — or, equivalently, from a known model deficiency — to a specific harness feature. Instead of arbitrarily adding features, engineers should run the following three-step derivation:

### Step 1 — Identify the Desired Behavior

Name the capability the agent should exhibit. Examples:

- *Maintain durable state across sessions* → leads to [[concept-filesystem-primitive]].
- *Solve arbitrary problems without pre-built tools* → leads to [[concept-bash-general-tool]].
- *Continue working on long tasks without early stopping* → leads to [[concept-ralph-loop]].

### Step 2 — Recognize the Model's Out-of-the-Box Limitation

What does the raw model fail to do? Examples:

- Models only output text — they cannot persist state on their own.
- Models have fixed context windows — they cannot operate over arbitrarily long horizons.
- Models stop generating when they think they're done — they exhibit early stopping.

### Step 3 — Design the Harness Feature to Bridge the Gap

Build the smallest harness primitive that closes the gap. Examples:

- Provide a **filesystem abstraction** + `read_file` / `write_file` tools.
- Wrap the agent in a **while loop with a clean-context reinjection hook** (a Ralph Loop).
- Provide a **sandbox** + `bash` tool.

## Why It Works

This framework keeps the harness ([[concept-agent-harness]]) **principled**: every primitive can be traced back to a concrete model limitation it is solving. It avoids the common failure mode of bolting on features that don't address a real model deficiency.

It also makes the harness debuggable — if a behavior fails, you can ask which step of the derivation broke.
