# Glossary

All defined terms in the vault, one line each. Click through for full notes.

## Concepts

- **[[concept-agent-harness|Agent Harness]]** — Every piece of code, configuration, and execution logic surrounding a raw model that gives it state, tool execution, feedback loops, and enforceable constraints.
- **[[concept-filesystem-primitive|Filesystem as a Harness Primitive]]** — A foundational harness abstraction providing durable storage, enabling agents to offload context, persist work across sessions, and collaborate.
- **[[concept-bash-general-tool|Bash + Code Execution as a General Purpose Tool]]** — Providing agents with bash and code execution capabilities so they can autonomously write scripts to solve problems rather than relying on pre-built tools.
- **[[concept-context-rot|Context Rot]]** — The degradation of a model's reasoning and task completion capabilities as its context window becomes increasingly full.
- **[[concept-compaction|Context Compaction]]** — Summarizing and offloading parts of a nearly full context window so the agent can keep working instead of hitting context-length API errors.
- **[[concept-tool-call-offloading|Tool Call Offloading]]** — A context management technique that retains only the head and tail tokens of large tool outputs in context, saving the full output to the filesystem.
- **[[concept-progressive-disclosure|Progressive Disclosure (Skills)]]** — A harness primitive that prevents early context rot by dynamically loading tool or skill descriptions into context only when needed.
- **[[concept-ralph-loop|Ralph Loops]]** — A harness pattern that intercepts a model's exit attempt and reinjects the prompt in a clean context window to force task continuation.
- **[[concept-harness-model-coevolution|Coupling of Model Training and Harness Design]]** — The feedback loop where models are post-trained within specific harnesses, improving native capabilities but risking overfitting to specific tool logic.
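
As a rough illustration of the Tool Call Offloading entry above, the sketch below keeps only the head and tail of a large tool output in context and writes the full text to disk. It is a minimal sketch under stated assumptions, not the implementation of any particular harness: the function name `offload_tool_output`, the `store_dir` and `call_id` parameters, and the use of character counts as a stand-in for token counts are all hypothetical.

```python
from pathlib import Path


def offload_tool_output(output: str, store_dir: Path, call_id: str,
                        head: int = 200, tail: int = 200) -> str:
    """Return a shortened view of `output`; save the full text to disk if truncated.

    Assumption: lengths are measured in characters, not tokens, to keep the
    sketch free of any tokenizer dependency.
    """
    if len(output) <= head + tail:
        return output  # small outputs stay in context untouched

    # Persist the complete output so the agent can read it back later.
    path = store_dir / f"{call_id}.txt"
    path.write_text(output, encoding="utf-8")

    omitted = len(output) - head - tail
    tail_text = output[len(output) - tail:]  # safe even when tail == 0
    return (
        f"{output[:head]}\n"
        f"[... {omitted} characters omitted; full output saved to {path} ...]\n"
        f"{tail_text}"
    )
```

The placeholder line tells the model where the full output lives, so it can reopen the file with its filesystem tools only if the omitted middle turns out to matter.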

## Contrarian Insights

- **[[contrarian-harness-optimization|Native Harnesses Aren't Always Best]]** — First-party harnesses are not necessarily optimal; custom harnesses can outperform them on specific benchmarks.
- **[[contrarian-harness-longevity|Harness Engineering Will Survive Model Advancements]]** — Even as models gain native planning and verification, harness engineering will remain critical for safety, state, and observability.

## Claims

- **[[claim-agent-equation|Agent = Model + Harness]]** — A raw model is not an agent; an agent is strictly the combination of a model and a harness.
- **[[claim-long-horizon-compounds|Long-horizon execution requires compounding harness primitives]]** — Robust long-horizon work emerges from combining durable state, continuation forcing, and self-verification, not from any single trick.
- **[[claim-harness-overfitting|Post-training with a harness creates tool-logic overfitting]]** — Training models with a fixed harness in the loop bakes in tool-protocol priors that hurt generalization.

## Frameworks

- **[[framework-harness-derivation|Working Backwards: Deriving Harness Features]]** — Three-step methodology: identify a desired behavior, recognize the model's limitation, design a harness feature to bridge the gap.

## Entities

- **[[entity-vivek-trivedy|Vivek Trivedy]]** — Author of the article; harness-engineering practitioner associated with LangChain and deepagents.
- **[[entity-langchain|LangChain]]** — Organization hosting the blog; develops harness libraries and tooling. https://www.langchain.com/
- **[[entity-deepagents|deepagents]]** — LangChain's harness-building library used for advanced harness research.
- **[[entity-langsmith|LangSmith]]** — LangChain's platform for tracing, debugging, evaluating, and deploying agents. https://smith.langchain.com/
- **[[entity-claude-code|Claude Code]]** — Anthropic's coding-agent product; canonical example of model–harness co-evolution.
- **[[entity-codex-5-3|Codex-5.3]]** — Versioned OpenAI Codex snapshot whose `apply_patch` behavior is cited as evidence of tool-logic overfitting.
- **[[entity-opus-4-6|Opus 4.6]]** — Specific Claude Opus version used as the fixed-model variable in the Terminal Bench 2.0 example.
- **[[entity-terminal-bench-2-0|Terminal Bench 2.0]]** — Terminal-based coding-agent leaderboard cited to show harness-dependent performance swings.
- **[[entity-context7|Context7]]** — MCP tool for fetching up-to-date documentation beyond model training cutoffs.

## Quotes

- **[[quote-harness-definition]]** — *“If you're not the model, you're the harness.”*
- **[[quote-context-engineering]]** — *“Harnesses today are largely delivery mechanisms for good context engineering.”*
- **[[quote-intelligence-vs-usefulness]]** — *“The model contains the intelligence and the harness is the system that makes that intelligence useful.”*

## Action Items

- **[[action-implement-filesystem]]** — Equip agents with filesystem abstractions and tools for durable storage and context offloading.
- **[[action-provide-bash]]** — Ship harnesses with a bash tool to allow models to autonomously write and execute code.
- **[[action-tool-call-offloading]]** — Truncate large tool outputs in context to head/tail tokens, saving the full output to disk.
- **[[action-use-ralph-loops]]** — Intercept model exit attempts and reinject prompts in clean context windows to force task continuation.
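
The Ralph-loop action above might be sketched roughly as follows. This is an illustrative outline only: `run_agent` and `is_done` are hypothetical callables, each call to `run_agent` is assumed to start from a clean context window, and the harness-side completion check and iteration cap are assumptions added so the loop can terminate.

```python
from typing import Callable


def ralph_loop(prompt: str,
               run_agent: Callable[[str], object],
               is_done: Callable[[], bool],
               max_iterations: int = 10) -> int:
    """Re-run `run_agent` on the same prompt until `is_done()` reports success.

    Progress is assumed to carry over only through external state (for
    example files on disk), since every run starts with a fresh context.
    Returns the number of iterations used.
    """
    for iteration in range(1, max_iterations + 1):
        run_agent(prompt)      # the model works, then attempts to exit
        if is_done():          # harness-side check, not the model's own claim
            return iteration
        # Not done: loop again, reinjecting the original prompt.
    raise RuntimeError(f"task not complete after {max_iterations} iterations")
```

Keeping the completion check outside the model is the design choice that matters here: the model's attempt to exit is treated as a signal to verify, not as proof that the task is finished.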

## Prerequisites

- **[[prereq-react-loop|ReAct Loop]]** — Reasoning + Acting cycle in which a model reasons, acts via a tool call, observes the result, and repeats in a while loop until it produces a final answer.
- **[[prereq-mcp|Model Context Protocol (MCP)]]** — Open standard introduced by Anthropic for connecting AI models to data sources and tools via MCP servers and clients.
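
For readers new to the ReAct loop listed above, the following sketch shows the reason, act, observe cycle with a stubbed model. Everything here is hypothetical scaffolding: the `model` callable, the tuple-based step format, the `tools` dictionary, and the `max_steps` budget are assumptions for illustration, not part of any real library.

```python
from typing import Callable, Dict, List, Tuple

# Assumed step format: ("final", answer) or ("tool", tool_name, tool_input).
Step = Tuple[str, ...]


def react_loop(task: str,
               model: Callable[[List[str]], Step],
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 8) -> str:
    """Run a reason/act/observe loop until the model returns a final answer."""
    transcript = [f"task: {task}"]
    steps = 0
    while steps < max_steps:
        step = model(transcript)            # reason: pick the next step
        if step[0] == "final":
            return step[1]
        _, name, arg = step
        observation = tools[name](arg)      # act: execute the tool call
        transcript.append(f"action: {name}({arg})")
        transcript.append(f"observation: {observation}")  # observe
        steps += 1
    raise RuntimeError("step budget exhausted without a final answer")
```

In a real harness the `model` callable would wrap an LLM call and the transcript would be the message history; the step budget is one simple guard against a loop that never terminates.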

## Open Questions

- **[[question-orchestrating-hundreds|Orchestrating Hundreds of Agents]]** — How to coordinate many agents working in parallel on a shared codebase.
- **[[question-self-analyzing-traces|Agents Analyzing Own Traces]]** — How agents can analyze their own execution traces to fix harness-level failures.
- **[[question-jit-tool-assembly|Just-In-Time Tool Assembly]]** — How harnesses can dynamically assemble tools and context per task.