# Full Vault — Agent Primer — Interpretible Context Methodology & The Future of AI Dialogue

> **Single-fetch comprehensive vault.** Contains the agent primer + map-of-content + glossary + speakers + every note inline. Use this file for agents that cannot follow embedded links (e.g., URL-provenance-restricted fetchers). For agents that can follow links, prefer `_AGENT_PRIMER.md` for progressive disclosure with on-demand drill-down.

> *All wikilinks resolve to within-document anchors (e.g. `[concept-foo](#concept-foo)`). The vault contains 31 notes total.*

---

## Agent Primer

> **Read me first.** This document primes a downstream AI agent to act as a subject-matter expert on the source video. Read this in full before consulting individual notes.

**Source**: [Interpretible Context Methodology & The Future of AI Dialogue](https://www.youtube.com/watch?v=956DPSPX4wg)  
**Duration**: 26m 38s  
**Speakers**: Jake Van Clief, David McDermott, K. Kumar  
**Domains**: `ai-agents`, `software-engineering`, `prompt-engineering`, `workflow-automation`, `knowledge-management`  
**Vault slug**: `interpretible-context-methodology-icm`  
**Generated**: 2026-06-02T05:36:35.336Z

---

> **⚑ Companion source folded in.** This vault was built from the video as its *primary* source, then enriched with the **formal academic paper by the same author** — [*"Interpretable Context Methodology: Folder Structure as Agent Architecture"*](https://arxiv.org/html/2603.16021v2) (Van Clief & McDermott, arXiv:2603.16021, Eduba / University of Edinburgh). See the note [entity-icm-paper-arxiv](#entity-icm-paper-arxiv). Treat the **video as conviction/practitioner framing** and the **paper as structure + evidence + acknowledged limits**. Where they differ in altitude, the paper supplies three things the talk omits:
>
> 1. **The Five-Layer Context Hierarchy** — Layer 0 `CLAUDE.md` (global identity) → Layer 1 `CONTEXT.md` (workspace routing) → Layer 2 stage `CONTEXT.md` (stage contracts) → Layer 3 reference material → Layer 4 per-run working artifacts. This is the explicit skeleton of ICM that the talk only gestures at.
> 2. **Quantitative grounding for the efficiency claim** — 2,000–8,000 *focused* tokens per stage vs. monolithic prompts exceeding 40,000 tokens (most irrelevant), justified via Liu et al.'s *"lost in the middle"*. Plus an adoption signal: 30 of 33 practitioners report a U-shaped human-editing pattern (heavy/light/heavy across stages 1/2/3).
> 3. **Hard limits the talk's "frameworks are absurdities" rhetoric hides** — the paper *explicitly* states there is **no controlled comparison** vs. monolithic prompting, all testing used a single model family (Claude Opus/Sonnet 4.6), and ICM is **not for** real-time multi-agent collaboration, high-concurrency, or complex automated branching. When asked "is ICM better than LangChain?", answer with the paper's honesty: it is *more efficient and more interpretable for sequential, human-reviewed workflows*, **not benchmarked-superior in general**.
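The layer ordering in item 1 can be sketched as a small context loader. This is a minimal illustration under stated assumptions, not the paper's implementation: only `CLAUDE.md` and `CONTEXT.md` are named in the hierarchy above, while the `stages/`, `references/`, and `runs/` folder names and both function names are hypothetical.

```python
from pathlib import Path


def layer_paths(workspace: Path, stage: str, run_id: str) -> list[tuple[int, Path]]:
    """Return (layer, path) pairs in the order Layer 0 -> Layer 4.

    Folder names other than CLAUDE.md / CONTEXT.md are assumptions.
    """
    stage_dir = workspace / "stages" / stage
    return [
        (0, workspace / "CLAUDE.md"),              # global identity
        (1, workspace / "CONTEXT.md"),             # workspace routing
        (2, stage_dir / "CONTEXT.md"),             # stage contract
        (3, stage_dir / "references"),             # reference material (folder)
        (4, workspace / "runs" / run_id / stage),  # per-run working artifacts
    ]


def assemble_context(workspace: Path, stage: str, run_id: str) -> str:
    """Concatenate every markdown file that exists, outermost layer first."""
    parts = []
    for layer, path in layer_paths(workspace, stage, run_id):
        # A layer may be a single file or a folder of markdown files.
        files = sorted(path.glob("*.md")) if path.is_dir() else [path]
        for f in files:
            if f.is_file():
                body = f.read_text(encoding="utf-8")
                parts.append(f"<!-- layer {layer}: {f.name} -->\n{body}")
    return "\n\n".join(parts)
```

Missing layers are skipped silently, so a stage with no reference material or no prior run still loads. The stage contract always arrives after the global identity and routing files.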

---

## You Are A Subject-Matter Expert On The Interpretible Context Methodology (ICM)

This vault distills a 26-minute talk by **Jake Van Clief** (with K. Kumar and David McDermott as co-participants) titled *"Interpretible Context Methodology & The Future of AI Dialogue"*. Your job, as a downstream agent, is to answer questions about the source faithfully, preserve every nuance, and clearly mark the difference between what the speaker asserts and what the broader literature supports.

You should be able to answer ~80% of likely questions from this primer alone. For finer detail, traverse the [[wikilinks]] to the supporting notes.

---

## 1. The Headline Thesis

The speaker argues against the use of complex, multi-agent frameworks such as [entity-langchain](#entity-langchain) or [entity-semantic-kernel](#entity-semantic-kernel) and instead advocates for what he calls the **Interpretible Context Methodology (ICM)** — see [concept-icm](#concept-icm). ICM holds that:

> An AI agent (in practice [entity-claude](#entity-claude)) given access to a well-structured **folder hierarchy of markdown files** can navigate context, understand constraints, and execute complex tasks deterministically — without the orchestration glue that multi-agent frameworks impose.

He further argues that **all sophisticated AI workflows fundamentally stem from human dialogue and decision trees** — see [concept-dialogue-structure](#concept-dialogue-structure). By capturing the implicit conversational intent, constraints, assumptions, and sub-goals into structured markdown ('skills'), users create highly effective, reusable AI capabilities.

The ultimate evolution of this methodology is presented as **real-time, voice-driven AI collaboration** — see [concept-voice-collaboration](#concept-voice-collaboration). An AI participates in a live meeting, listens to voice commands, reads and writes to a shared local file system in real time, and eliminates the need for post-meeting processing.

The headline quote: *"They're not building multi-agentic frameworks and all these absurdities, they're building folders and markdown files on their computer and getting huge results from it."* — see [quote-absurdities](#quote-absurdities).

---

## 2. The Four Core Concepts

### 2.1 Interpretible Context Methodology — [concept-icm](#concept-icm)

ICM is a contrarian architecture for AI agents:

- **Substrate**: plain text, markdown files, standard folder hierarchies
- **Agent**: a single LLM (typically Claude) that navigates the folder structure on demand
- **Skills**: discrete markdown files capturing goals, constraints, assumptions, and sub-goals
- **Claimed benefits**: 20–40% token reduction, faster execution, lower barrier to entry, higher determinism, easier maintenance

The methodology shifts complexity *out of code* (framework configuration, agent message routing) and *into text* (folder organization, well-written prompts).

Prerequisites for understanding the efficiency argument: [prereq-llm-context](#prereq-llm-context) (token economics) and [prereq-markdown](#prereq-markdown) (the syntax).
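The efficiency argument is about how many tokens reach the model per step. A rough sketch of that comparison follows. It assumes a 4-characters-per-token heuristic rather than a real tokenizer, and `estimate_tokens` and `context_cost` are hypothetical helpers, so treat the numbers as order-of-magnitude estimates only. It does not validate the 20–40% figure.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def context_cost(root: Path, wanted: set[str]) -> tuple[int, int]:
    """Return (focused, monolithic) token estimates for markdown under root.

    focused    = only files whose stem (name without .md) is in `wanted`
    monolithic = every markdown file concatenated into one prompt
    """
    focused = 0
    monolithic = 0
    for md in root.rglob("*.md"):
        tokens = estimate_tokens(md.read_text(encoding="utf-8"))
        monolithic += tokens
        if md.stem in wanted:
            focused += tokens
    return focused, monolithic
```

Running `context_cost(Path("workspace"), {"tighten-paragraph"})` on a real workspace gives a quick sense of how much context a single-skill step avoids loading.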

### 2.2 Three Levels of AI Use — [concept-three-levels-ai](#concept-three-levels-ai)

A maturity model for organizational AI adoption:

- **Level 1 — Copy & Paste**: ad-hoc chat usage; low effort, low and inconsistent impact
- **Level 2 — Structured Use**: standardized prompts, brand-tone files, verified outputs, basic markdown skills
- **Level 3 — Integrated Workflow**: automated pipelines chaining skills, prompts, and deterministic scripts; the AI navigates an ICM folder structure to execute multi-step processes

The speaker's signature claim: **the jump from L1 to L2 is the highest-ROI move** an organization can make — see [claim-l2-roi](#claim-l2-roi) and [quote-l2-roi](#quote-l2-roi). L3 has higher absolute impact but much higher engineering cost.

### 2.3 Dialogue as Workflow Structure — [concept-dialogue-structure](#concept-dialogue-structure)

The philosophical centre of the talk. The claim: **all complex AI workflows can be reverse-engineered from successful human–AI conversations**. A trivial-looking request like *'tighten this paragraph'* hides a multi-step decision tree:

1. Goal — primary intent
2. Constraints — boundaries on the response
3. Assumptions — implicit context
4. Sub-goals — intermediate steps
5. Execution — production of the output

[entity-k-kumar](#entity-k-kumar), the speaker's collaborator from the University of Edinburgh, built a **visual mapping tool** that surfaces these latent components from real chat transcripts. The headline quote: *"All of these skills, all of these folders and markdown files, all have one core theme: discussion and dialogue."* — [quote-dialogue-theme](#quote-dialogue-theme).

### 2.4 Real-Time Voice-Driven AI Collaboration — [concept-voice-collaboration](#concept-voice-collaboration)

The forward-looking finale. Stack:

- **Voice cloning**: a custom model of the speaker's own voice via [entity-11labs](#entity-11labs)
- **LLM**: a local instance of [entity-claude](#entity-claude)
- **Codebase under control**: the speaker's 'Ethics Engine' project containing psychometric scales
- **Substrate**: an ICM folder structure
- **Loop**: voice → STT → Claude → file system read/write → response, all happening *during* a live meeting
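One pass of the loop above can be sketched as a single function. The source does not show the demo's code, so everything here is an assumption: `transcribe` and `ask_agent` are hypothetical callables standing in for the speech-to-text service and the Claude call, and the `{"reply": ..., "writes": {...}}` result shape is invented for illustration. The sketch performs no authentication or path checks.

```python
from pathlib import Path
from typing import Callable


def handle_utterance(
    audio: bytes,
    workspace: Path,
    transcribe: Callable[[bytes], str],
    ask_agent: Callable[[str, str], dict],
) -> str:
    """One pass: voice -> STT -> agent -> file system read/write -> response."""
    # 1. Speech to text (hypothetical STT callable).
    command = transcribe(audio)

    # 2. Read workspace context from the ICM folder, if present.
    context_file = workspace / "CONTEXT.md"
    context = context_file.read_text(encoding="utf-8") if context_file.is_file() else ""

    # 3. Ask the agent (hypothetical callable wrapping the LLM).
    result = ask_agent(command, context)

    # 4. Apply any file writes the agent requested, relative to the workspace.
    for rel_path, content in result.get("writes", {}).items():
        target = workspace / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")

    # 5. Text reply, to be spoken back by the voice model.
    return result.get("reply", "")
```

Passing the two services in as callables keeps the loop testable with stubs and leaves the choice of STT and LLM client open.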

The speaker pitches this as the replacement for the current record-transcribe-summarize-actuate workflow. The motivating question: *"What if I could sit inside of a group call and control someone else's Claude code or AI through my voice and immediately access all of that data that's locally on their computer?"* — see [quote-voice-control](#quote-voice-control).

---

## 3. The Operationalizing Framework

### Skill Creation via Dialogue Extraction — [framework-skill-creation](#framework-skill-creation)

A five-step process for converting ephemeral chats into permanent ICM skills:

1. **Identify the Goal / Intent**
2. **Extract the Constraints**
3. **Identify the Assumptions**
4. **Map the Sub-goals**
5. **Encode into a structured markdown file**

This framework is the bridge between the philosophy of dialogue ([concept-dialogue-structure](#concept-dialogue-structure)) and the substrate of ICM ([concept-icm](#concept-icm)). It is also the formal pattern behind concrete actions such as [action-codify-voice](#action-codify-voice) (writing a `voice-and-tone.md`).
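The five steps can be sketched as a data structure plus a renderer. This is a minimal illustration, assuming one `## ` section per component; the source does not prescribe a file layout, and the `Skill` class, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """The four components extracted in steps 1-4, plus a file name."""

    name: str
    goal: str
    constraints: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    sub_goals: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Step 5: encode the components as one markdown document."""

        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        sections = [
            f"# {self.name}",
            f"## Goal\n\n{self.goal}",
            f"## Constraints\n\n{bullets(self.constraints)}",
            f"## Assumptions\n\n{bullets(self.assumptions)}",
            f"## Sub-goals\n\n{bullets(self.sub_goals)}",
        ]
        return "\n\n".join(sections) + "\n"


# Illustrative values for a 'tighten this paragraph' request.
tighten = Skill(
    name="tighten-paragraph",
    goal="Make the paragraph shorter without changing what it says.",
    constraints=["Reduce wordiness", "Maintain tone", "Keep meaning"],
    assumptions=["Same audience and register as the original"],
    sub_goals=["Restructure sentences", "Eliminate filler"],
)
```

Writing `tighten.to_markdown()` to a file such as `skills/tighten-paragraph.md` (a hypothetical path) yields a skill the agent can load on demand.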

---

## 4. The Top Claims, With Confidence

| # | Claim | Source confidence | Validated? |
|---|-------|-------------------|-----------|
| 1 | ICM (folders + markdown + single agent) outperforms multi-agent frameworks — [claim-icm-superiority](#claim-icm-superiority) | high | **partially.** Single-agent-first guidance is mainstream (e.g., Microsoft Cloud Adoption Framework). The 20–40% token figure is **anecdotal**; the 'absurdities' framing **overshoots** — multi-agent frameworks are well-motivated across security/team/scale boundaries. |
| 2 | The L1→L2 jump is the highest-ROI move — [claim-l2-roi](#claim-l2-roi) | high | **well-aligned with practitioner consensus**, but quantitative ROI data is scarce. Treat as a strong consultant heuristic, not a formal law. Some orgs may extract more ROI by jumping into a narrow L3 deployment. |
| 3 | Real-time voice-driven AI collaboration is the future of workflows — [claim-voice-future](#claim-voice-future) | medium | **technically plausible and partially realized**, but normative and speculative. Voice will likely be one mode within multimodal control, not the universal control modality. Security barriers are substantial. |

---

## 5. The Contrarian Frame — [contrarian-frameworks](#contrarian-frameworks)

The talk is structured as a contrarian provocation against the dominant industry pattern of building elaborate multi-agent orchestration stacks. The position:

- Most teams reach for [entity-langchain](#entity-langchain), [entity-semantic-kernel](#entity-semantic-kernel), or AutoGen-style frameworks too early
- These frameworks introduce orchestration tax in tokens, debugging time, and adoption friction
- A single well-contextualized agent operating over a folder is sufficient for the vast majority of workflows

**Where the contrarian frame holds up (per the literature):**

- Microsoft's Cloud Adoption Framework explicitly recommends **starting with a single-agent system** and only escalating when there are clear drivers (security boundaries, multiple teams, large-scale evolution)
- Developer guides on single-agent vs multi-agent agree that most coding tasks fit single-agent + tools well
- Externalizing persistent state into files and loading only relevant slices is broadly endorsed as a way to reduce token usage and improve determinism

**Where the contrarian frame overshoots:**

- Multi-agent research shows decomposing tasks into specialized roles (retrieval, reasoning, validation, monitoring) **does** improve modularity and robustness in complex environments
- Enterprise multi-agent literature documents necessary distributed-systems patterns (sagas, circuit breakers, immutable state, governance) that exist because they earn their complexity at scale
- The blanket label 'absurdities' is rhetorical, not analytic

A balanced reframe for downstream questions: **ICM is the right starting architecture and often the right terminal architecture for single-team, single-domain workflows. Multi-agent frameworks earn their place at security/compliance/team/scale boundaries.**

---

## 6. Entities You Should Be Fluent About

- **[entity-jake-van-clief](#entity-jake-van-clief)** — primary speaker, AI consultant, originator of ICM
- **[entity-k-kumar](#entity-k-kumar)** — co-founder and student at the University of Edinburgh; built the visual decision-tree mapping tool central to the dialogue thesis
- **[entity-david-mcdermott](#entity-david-mcdermott)** — co-participant in the source conversation; named in the speaker list with limited individually attributed content
- **[entity-anthropic](#entity-anthropic)** — creator of [entity-claude](#entity-claude); culturally aligned with skill-based, structured-context approaches
- **[entity-andrej-karpathy](#entity-andrej-karpathy)** — AI researcher, recently associated with Anthropic; his 'LLM Wiki' markdown approach mirrors ICM and is cited as independent validation
- **[entity-claude](#entity-claude)** — the LLM used in all demos, including the voice-driven collaboration finale
- **[entity-langchain](#entity-langchain)** — the canonical example of a 'complex orchestration framework' that ICM aims to obviate
- **[entity-semantic-kernel](#entity-semantic-kernel)** — Microsoft's orchestration framework, similarly positioned as a foil; ironic given its own 'skill' abstraction
- **[entity-11labs](#entity-11labs)** — ElevenLabs; provider of voice cloning used to train a custom voice model for the live demo

---

## 7. Action Items A Reader Can Adopt

- **[action-implement-folders](#action-implement-folders)** — replace orchestration code with folder + markdown context
- **[action-move-to-l2](#action-move-to-l2)** — audit team usage, build prompt libraries and skills to move from L1 to L2
- **[action-codify-voice](#action-codify-voice)** — write a `voice-and-tone.md` and reference it from every agent prompt

These three actions form a coherent on-ramp: codify voice → standardize prompts to reach L2 → restructure into ICM folders.
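As a starting point for the on-ramp, a starter workspace can be scaffolded in a few lines. This is a sketch under stated assumptions: `voice-and-tone.md`, `CLAUDE.md`, and `CONTEXT.md` are named in this vault, but the `skills/` folder, the placeholder file contents, and the `scaffold` function are hypothetical.

```python
from pathlib import Path

# Placeholder contents are illustrative only; replace them with real guidance.
STARTER_FILES = {
    "CLAUDE.md": "# Identity\n\nDescribe who the agent is and how it should behave.\n",
    "CONTEXT.md": (
        "# Workspace routing\n\n"
        "- Voice and tone: `voice-and-tone.md`\n"
        "- Skills: `skills/`\n"
    ),
    "voice-and-tone.md": "# Voice and tone\n\n- Audience:\n- Register:\n- Words to avoid:\n",
    "skills/README.md": "# Skills\n\nOne markdown file per reusable capability.\n",
}


def scaffold(root: Path) -> list[Path]:
    """Create the starter files under root, never overwriting existing ones."""
    created = []
    for rel, body in STARTER_FILES.items():
        path = root / rel
        if path.exists():
            continue  # keep whatever the team has already written
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(body, encoding="utf-8")
        created.append(path)
    return created
```

Because existing files are skipped, `scaffold` is safe to re-run as the workspace grows.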

---

## 8. Prerequisites

- **[prereq-llm-context](#prereq-llm-context)** — token economics, context windows, attention degradation; required to understand the efficiency argument
- **[prereq-markdown](#prereq-markdown)** — basic markdown literacy; required because markdown is the substrate of everything

---

## 9. Open Questions A Sceptic Will Raise

- **[question-icm-scaling](#question-icm-scaling)** — does single-agent folder navigation scale to massive legacy enterprise codebases? The speaker's demos are bounded; framework defenders argue scale is exactly where multi-agent earns its keep. Resolution path: case studies, benchmarks, hybrid patterns.
- **[question-voice-security](#question-voice-security)** — voice cloning is cheap; authentication, permission scoping, bystander hijacking, and audit trails are unresolved for production voice-driven file-system control. Resolution path: voice biometrics + secondary factor, sandboxed/capability-scoped execution, command confirmation patterns.
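Two of the mitigations named above, capability-scoped execution and command confirmation, can be sketched with the standard library. This is an illustrative guard rather than a vetted security control: `is_within` and `authorize` are hypothetical names, and voice biometrics or a secondary factor would sit in front of this check.

```python
from pathlib import Path


def is_within(root: Path, candidate: Path) -> bool:
    """True if candidate, once resolved, stays inside root.

    Resolving first collapses `..` segments and follows symlinks, so a
    path cannot escape the workspace by indirection.
    """
    root = root.resolve()
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def authorize(root: Path, rel_path: str, action: str, confirmed: bool = False) -> bool:
    """Allow in-scope reads; allow in-scope writes only after confirmation."""
    if not is_within(root, Path(rel_path)):
        return False  # outside the capability scope
    if action == "read":
        return True
    if action == "write":
        return confirmed  # e.g. the speaker repeated or approved the command
    return False  # unknown actions are denied by default
```

Denying unknown actions by default means a new capability has to be added deliberately instead of slipping through.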

---

## 10. The Signature Quotes (Use These When You Want Punchy Citations)

- **[quote-absurdities](#quote-absurdities)** — 'folders and markdown files… huge results' — the contrarian banner
- **[quote-l2-roi](#quote-l2-roi)** — 'the jump from L1 to L2 is the highest-ROI move' — the consulting heuristic
- **[quote-dialogue-theme](#quote-dialogue-theme)** — 'one core theme: discussion and dialogue' — the philosophical centre
- **[quote-voice-control](#quote-voice-control)** — 'sit inside of a group call and control someone else's Claude code…' — the forward-looking vision

---

## 11. Mini-Glossary For Quick Recall

- **ICM (Interpretible Context Methodology)** — folder + markdown substrate for agent context
- **Skill** — a single markdown file encoding Goal, Constraints, Assumptions, Sub-goals for a reusable AI capability
- **Level 1 / 2 / 3** — copy-paste / structured prompts / integrated automation
- **Dialogue tree** — the latent decision structure inside a successful human–AI conversation
- **LLM Wiki** — Karpathy's markdown-based personal knowledge approach, cited as ICM-adjacent
- **Voice-driven collaboration** — real-time voice + LLM + local file system loop, run during live meetings

---

## 12. How To Use This Vault

When answering questions:

1. **Anchor in the speaker's framing first** — present what the source actually claims, with the speaker's confidence level.
2. **Distinguish source confidence from external validation.** If you cite a specific number (e.g., 20–40% token reduction), flag it as anecdotal.
3. **Follow [[wikilinks]] to the relevant note** for details; never invent figures the source did not give.
4. **For challenges to the thesis**, see [contrarian-frameworks](#contrarian-frameworks) (the source's own challenge to industry) and the validation notes embedded in [claim-icm-superiority](#claim-icm-superiority), [claim-l2-roi](#claim-l2-roi), and [claim-voice-future](#claim-voice-future) (which present where the broader literature pushes back).
5. **Speakers**: when answering 'who said this?', resolve to the entity notes ([entity-jake-van-clief](#entity-jake-van-clief), [entity-k-kumar](#entity-k-kumar), [entity-david-mcdermott](#entity-david-mcdermott)). Most claims and quotes are Jake Van Clief's unless otherwise marked.

---

## 13. The Big Picture, In One Paragraph

The talk advances a coherent reductionist program for AI engineering. **At the substrate**, replace orchestration frameworks with folder hierarchies of markdown ([concept-icm](#concept-icm)). **At the workflow layer**, recognize that all skills are codified dialogue and build them via a disciplined five-step extraction ([framework-skill-creation](#framework-skill-creation), [concept-dialogue-structure](#concept-dialogue-structure)). **At the organizational layer**, move from copy-paste to structured-prompt maturity before investing in integration ([concept-three-levels-ai](#concept-three-levels-ai), [claim-l2-roi](#claim-l2-roi)). **At the interaction layer**, evolve from text chat to real-time voice collaboration where the AI is a meeting participant operating directly on local files ([concept-voice-collaboration](#concept-voice-collaboration), [claim-voice-future](#claim-voice-future)). The unifying intuition is that **simple, inspectable text + a single capable agent beats elaborate orchestration for most real workflows** — a claim that the broader literature endorses as a starting posture, partially validates for many use cases, and pushes back on as a universal rule. Be that nuanced answer for downstream users.

---

## How to Navigate This Vault
- `_QUERY_INDEX.json` — machine-readable concept→file map for programmatic lookup
- `00-index/moc.md` — map-of-content with all notes organized by section
- `00-index/glossary.md` — all defined terms with one-line definitions
- `concepts/`, `claims/`, `frameworks/`, `entities/`, `quotes/`, `action-items/`, `prerequisites/`, `open-questions/` — fixed-core note folders

Cross-references use `[[note-id]]` wikilink syntax.
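In this single-file vault those wikilinks appear as within-document anchors. A minimal sketch of that conversion follows, assuming each note's anchor equals its note id; the `wikilinks_to_anchors` helper is hypothetical, and support for the `[[note-id|label]]` alias form is an assumption about the wikilink dialect in use.

```python
import re

# [[note-id]] or [[note-id|label]]; the alias form is an assumed extension.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:\|([^\[\]]+))?\]\]")


def wikilinks_to_anchors(text: str) -> str:
    """Rewrite [[note-id]] as [note-id](#note-id) for single-file use."""

    def repl(match: re.Match) -> str:
        note_id = match.group(1).strip()
        label = (match.group(2) or note_id).strip()
        return f"[{label}](#{note_id})"

    return WIKILINK.sub(repl, text)
```

For example, `wikilinks_to_anchors("see [[concept-icm]]")` returns `see [concept-icm](#concept-icm)`.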


---

## Map of Content

This vault distills a talk by [entity-jake-van-clief](#entity-jake-van-clief) (with [entity-k-kumar](#entity-k-kumar) and [entity-david-mcdermott](#entity-david-mcdermott)) arguing for **folder-and-markdown agent architectures** over framework-driven multi-agent orchestration, and forecasting a future of real-time voice-driven AI collaboration.

> **Start here:** read `_AGENT_PRIMER.md` for the full distilled context. This MOC is the navigational index.

---

## Core Reading Order

1. **Headline thesis** → [quote-absurdities](#quote-absurdities)
2. **The methodology** → [concept-icm](#concept-icm)
3. **The philosophical centre** → [concept-dialogue-structure](#concept-dialogue-structure) (+ [quote-dialogue-theme](#quote-dialogue-theme))
4. **The maturity ladder** → [concept-three-levels-ai](#concept-three-levels-ai)
5. **The forward-looking finale** → [concept-voice-collaboration](#concept-voice-collaboration) (+ [quote-voice-control](#quote-voice-control))

---

## Concepts

- [concept-icm](#concept-icm) — Interpretible Context Methodology
- [concept-three-levels-ai](#concept-three-levels-ai) — Three Levels of AI Use (L1 / L2 / L3)
- [concept-dialogue-structure](#concept-dialogue-structure) — Dialogue as Workflow Structure
- [concept-voice-collaboration](#concept-voice-collaboration) — Real-Time Voice-Driven AI Collaboration
- [contrarian-frameworks](#contrarian-frameworks) — Contrarian: Multi-Agent Frameworks are Over-Engineered

## Claims (with validation)

- [claim-icm-superiority](#claim-icm-superiority) — ICM outperforms multi-agent frameworks *(high confidence in source; partially supported by literature)*
- [claim-l2-roi](#claim-l2-roi) — The L1→L2 jump yields highest ROI *(high; well-aligned with practitioner consensus)*
- [claim-voice-future](#claim-voice-future) — Real-time voice AI is the future of workflows *(medium; plausible prediction)*

## Frameworks

- [framework-skill-creation](#framework-skill-creation) — Skill Creation via Dialogue Extraction (5 steps)

## Entities

**Speakers**

- [entity-jake-van-clief](#entity-jake-van-clief) — primary speaker
- [entity-k-kumar](#entity-k-kumar) — collaborator, built the decision-tree mapping tool
- [entity-david-mcdermott](#entity-david-mcdermott) — co-participant

**Organizations**

- [entity-anthropic](#entity-anthropic)

**People (cited)**

- [entity-andrej-karpathy](#entity-andrej-karpathy)

**Products & Tools**

- [entity-claude](#entity-claude) — primary LLM
- [entity-langchain](#entity-langchain) — contrast example
- [entity-semantic-kernel](#entity-semantic-kernel) — contrast example
- [entity-11labs](#entity-11labs) — voice cloning provider (ElevenLabs)

## Quotes

- [quote-absurdities](#quote-absurdities) — folders over frameworks
- [quote-l2-roi](#quote-l2-roi) — ROI of Level 2
- [quote-dialogue-theme](#quote-dialogue-theme) — dialogue as the core theme
- [quote-voice-control](#quote-voice-control) — voice controlling local AI

## Action Items

- [action-implement-folders](#action-implement-folders) — structure context as folders + markdown
- [action-move-to-l2](#action-move-to-l2) — standardize prompts and skills
- [action-codify-voice](#action-codify-voice) — write a `voice-and-tone.md`

## Prerequisites

- [prereq-llm-context](#prereq-llm-context) — LLM context windows and token economics
- [prereq-markdown](#prereq-markdown) — basic markdown literacy

## Open Questions

- [question-icm-scaling](#question-icm-scaling) — does ICM scale to enterprise codebases?
- [question-voice-security](#question-voice-security) — security of real-time voice access

---

## Suggested Traversal Paths

**For an engineer evaluating ICM:** [concept-icm](#concept-icm) → [claim-icm-superiority](#claim-icm-superiority) → [contrarian-frameworks](#contrarian-frameworks) → [action-implement-folders](#action-implement-folders) → [question-icm-scaling](#question-icm-scaling)

**For an org leader planning AI adoption:** [concept-three-levels-ai](#concept-three-levels-ai) → [claim-l2-roi](#claim-l2-roi) → [action-move-to-l2](#action-move-to-l2) → [action-codify-voice](#action-codify-voice)

**For a prompt engineer:** [concept-dialogue-structure](#concept-dialogue-structure) → [framework-skill-creation](#framework-skill-creation) → [action-codify-voice](#action-codify-voice)

**For a futurist / strategist:** [concept-voice-collaboration](#concept-voice-collaboration) → [claim-voice-future](#claim-voice-future) → [question-voice-security](#question-voice-security)


---

## Glossary

One-line definitions for every named concept, claim, framework, entity, and pattern in this vault. Follow the [[wikilink]] for the full note.

## Concepts & Frames

- **Interpretible Context Methodology (ICM)** — folder + markdown substrate for AI agent context, replacing orchestration frameworks. → [concept-icm](#concept-icm)
- **Three Levels of AI Use** — maturity model: L1 copy-paste, L2 structured prompts, L3 integrated workflows. → [concept-three-levels-ai](#concept-three-levels-ai)
- **Dialogue as Workflow Structure** — thesis that all skills are codified human–AI conversational decision trees. → [concept-dialogue-structure](#concept-dialogue-structure)
- **Real-Time Voice-Driven AI Collaboration** — live voice + LLM + local file system loop run during meetings. → [concept-voice-collaboration](#concept-voice-collaboration)
- **Contrarian: Multi-Agent Frameworks are Over-Engineered** — counter-stance to industry orchestration-framework trend. → [contrarian-frameworks](#contrarian-frameworks)
- **Level 1 (L1)** — ad-hoc copy/paste chatbot use; low effort, low impact.
- **Level 2 (L2)** — standardized prompts, brand-tone files, markdown skills; highest-ROI step per source.
- **Level 3 (L3)** — fully integrated automated pipelines chaining skills + scripts.
- **Skill** — a single markdown file encoding Goal, Constraints, Assumptions, Sub-goals for a reusable AI capability.
- **LLM Wiki** — Karpathy-popularized markdown personal knowledge approach; ICM-adjacent.
- **Dialogue tree** — the latent decision structure inside a successful human–AI conversation.

## Claims

- **ICM Outperforms Multi-Agent Frameworks** — single agent over folders beats LangChain/Semantic Kernel for most workflows. → [claim-icm-superiority](#claim-icm-superiority)
- **L1→L2 Yields Highest ROI** — standardizing prompts gives the biggest step-change per unit of effort. → [claim-l2-roi](#claim-l2-roi)
- **Real-Time Voice AI is the Future** — voice-controlled live meeting agents will replace post-meeting workflows. → [claim-voice-future](#claim-voice-future)

## Framework

- **Skill Creation via Dialogue Extraction** — five-step process (Goal → Constraints → Assumptions → Sub-goals → Markdown). → [framework-skill-creation](#framework-skill-creation)

## People

- **Jake Van Clief** — primary speaker; AI consultant; ICM originator. → [entity-jake-van-clief](#entity-jake-van-clief)
- **K. Kumar** — co-founder and student at the University of Edinburgh; built the dialogue decision-tree mapping tool. → [entity-k-kumar](#entity-k-kumar)
- **David McDermott** — co-participant in the source conversation. → [entity-david-mcdermott](#entity-david-mcdermott)
- **Andrej Karpathy** — AI researcher; popularized LLM Wiki / markdown workflow; now associated with Anthropic. → [entity-andrej-karpathy](#entity-andrej-karpathy)

## Organizations

- **Anthropic** — AI safety lab; creator of Claude; promotes skill-based, structured-context approaches. → [entity-anthropic](#entity-anthropic)

## Products & Tools

- **Claude** — Anthropic's LLM family; the model used in all demos. → [entity-claude](#entity-claude)
- **LangChain** — open-source LLM orchestration framework; contrast example in the talk. → [entity-langchain](#entity-langchain)
- **Semantic Kernel** — Microsoft's open-source LLM orchestration framework; contrast example. → [entity-semantic-kernel](#entity-semantic-kernel)
- **11Labs (ElevenLabs)** — AI speech-synthesis and voice-cloning provider; used to clone the speaker's voice for the demo. → [entity-11labs](#entity-11labs)

## Quotes

- **Folders over Frameworks** — *"They're not building multi-agentic frameworks and all these absurdities…"* → [quote-absurdities](#quote-absurdities)
- **The ROI of Level 2** — *"The jump from L1 to L2 is the highest ROI move."* → [quote-l2-roi](#quote-l2-roi)
- **Dialogue as the Core Theme** — *"All of these skills… have one core theme: discussion and dialogue."* → [quote-dialogue-theme](#quote-dialogue-theme)
- **Voice Controlling Local AI** — *"What if I could sit inside of a group call and control someone else's Claude code…"* → [quote-voice-control](#quote-voice-control)

## Actions

- **Implement Folder-Based Context** — structure agent context as folders + markdown. → [action-implement-folders](#action-implement-folders)
- **Standardize Prompts to Reach Level 2** — build shared prompt libraries and skills. → [action-move-to-l2](#action-move-to-l2)
- **Codify Voice and Tone in Markdown** — write a `voice-and-tone.md` and reference it from prompts. → [action-codify-voice](#action-codify-voice)

## Prerequisites

- **Understanding LLM Context Windows** — token economics and context limits required to grasp ICM's efficiency argument. → [prereq-llm-context](#prereq-llm-context)
- **Familiarity with Markdown** — basic markdown literacy required to author skills. → [prereq-markdown](#prereq-markdown)

## Open Questions

- **Scaling ICM to Enterprise Codebases** — does single-agent folder navigation hold up at massive scale? → [question-icm-scaling](#question-icm-scaling)
- **Security of Real-Time Voice Access** — authentication, permission scoping, and spoofing risks in voice-driven file control. → [question-voice-security](#question-voice-security)


---

## Speakers

> Speaker manifest for this vault. 4 person entities, 10 attributed notes.

## Andrej Karpathy

Entity note: [entity-andrej-karpathy](#entity-andrej-karpathy)

*No attributed notes in this vault.*

## David McDermott

Entity note: [entity-david-mcdermott](#entity-david-mcdermott)

*No attributed notes in this vault.*

## Jake Van Clief

Entity note: [entity-jake-van-clief](#entity-jake-van-clief)

**Action-items** (3):
- [action-codify-voice](#action-codify-voice) — Codify Voice and Tone in Markdown
- [action-implement-folders](#action-implement-folders) — Implement Folder-Based Context
- [action-move-to-l2](#action-move-to-l2) — Standardize Prompts to Reach Level 2

**Claims** (3):
- [claim-icm-superiority](#claim-icm-superiority) — ICM Outperforms Multi-Agent Frameworks
- [claim-voice-future](#claim-voice-future) — Real-Time Voice AI is the Future of Workflows
- [claim-l2-roi](#claim-l2-roi) — The Jump to Level 2 AI Use Yields Highest ROI

**Quotes** (4):
- [quote-dialogue-theme](#quote-dialogue-theme) — Dialogue as the Core Theme
- [quote-absurdities](#quote-absurdities) — Folders over Frameworks
- [quote-l2-roi](#quote-l2-roi) — The ROI of Level 2
- [quote-voice-control](#quote-voice-control) — Voice Controlling Local AI

## K. Kumar

Entity note: [entity-k-kumar](#entity-k-kumar)

*No attributed notes in this vault.*


---

## All Notes

### Folder: concepts

#### concept-dialogue-structure

*type: `concept`*

## Thesis

A central claim of the source: **all structured AI workflows, prompt libraries, and 'skills' are fundamentally derived from human dialogue and conversational decision trees.** See [quote-dialogue-theme](#quote-dialogue-theme).

## The Hidden Decision Tree

A simple chat request like *'tighten this paragraph'* hides a complex chain of decisions:

1. **Goal** — understand the primary intent
2. **Constraints** — reduce wordiness, maintain tone, keep meaning
3. **Assumptions** — target audience, expected register, length budget
4. **Sub-goals** — restructure sentences, eliminate filler, preserve rhythm
5. **Execution** — produce the revision

## The Visual Mapping Tool

[entity-k-kumar](#entity-k-kumar), a co-founder and student at the University of Edinburgh, built a visual tool used in the video to render these implicit decision trees from real chat transcripts. The tool exposes the latent goals, constraints, and assumptions that drove a successful interaction.

## From Ephemeral to Permanent

Once the tree is mapped, the four components (Goal / Constraints / Assumptions / Sub-goals) can be encoded into a markdown skill file. This transforms ephemeral chat history into a **reusable, deterministic AI skill** — exactly the artifact format used by [concept-icm](#concept-icm).

The process is codified as [framework-skill-creation](#framework-skill-creation).

## Where the Encoded Artifact Lives (Companion Paper)

The "permanent artifact" this note describes is, in the formal paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv), a specific tier of the **Five-Layer Context Hierarchy**: the mapped Goal/Constraints/Assumptions/Sub-goals become the **Stage `CONTEXT.md` (Layer 2)** contract and its **Layer 3 reference material**, while the live chat that seeded it is transient **Layer 4** working content. The paper's lineage for this move is Knuth's *literate programming* (instructions and context co-located in readable text) and Wei et al.'s *chain-of-thought decomposition* — i.e., the decision tree this note extracts from dialogue is the same structure the paper persists as a stage contract. K. Kumar's Edinburgh affiliation also lines up with the paper's Eduba / University of Edinburgh base.

## Counter-Perspective

The descriptive claim ('skills can be reverse-engineered from conversations') is strongly consistent with prompt-engineering and conversational-UX practice. The stronger philosophical claim ('all complex AI workflows originate from human conversational decision trees') is a useful lens but not universal — many production workflows are better modeled as business processes, state machines, or event-driven dataflows. Dialogue is one structural perspective, not the only one.


#### concept-icm

*type: `concept`*

## Definition

The **Interpretable Context Methodology (ICM)** is a contrarian approach to building AI agent architectures. Instead of relying on complex multi-agent orchestration frameworks such as [entity-langchain](#entity-langchain) or [entity-semantic-kernel](#entity-semantic-kernel), ICM advocates for using plain text, markdown files, and standard folder hierarchies to manage context and workflows.

## Core Philosophy

An AI agent (typically [entity-claude](#entity-claude)) can navigate a well-structured file system to:

- Gather necessary context on demand
- Understand constraints encoded as markdown
- Execute tasks deterministically without orchestration glue code

By breaking down workflows, prompt libraries, and specific 'skills' into discrete markdown files, users create a **highly transparent, easily modifiable, and human-readable architecture**.

## Claimed Benefits

- **Token reduction of 20–40%** versus framework-driven approaches (see [claim-icm-superiority](#claim-icm-superiority) — note that this figure is anecdotal/case-study based, not a peer-reviewed benchmark).
- Faster execution and lower latency.
- Significantly lower barrier to entry for non-technical teams who only need to manage folders and text files instead of Python code or API integrations.
- Greater determinism and inspectability of agent behaviour.

## Formal Grounding (Companion Paper)

The talk's practitioner framing is formalized in the companion paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv) (*"Interpretable Context Methodology: Folder Structure as Agent Architecture,"* Van Clief & McDermott, arXiv:2603.16021, Eduba / University of Edinburgh). The paper makes explicit a structure the video only implies, the **Five-Layer Context Hierarchy**:

- **Layer 0** — `CLAUDE.md` (global identity)
- **Layer 1** — `CONTEXT.md` (workspace routing)
- **Layer 2** — Stage `CONTEXT.md` (per-stage contracts)
- **Layer 3** — Reference material (stable across runs)
- **Layer 4** — Working artifacts (per-run content)

Numbered stage folders (`01_research`, `02_script`, `03_production`) carry explicit Inputs / Process / Outputs contracts, with **human review gates** between stages. The paper grounds the efficiency claim quantitatively: **2,000–8,000 focused tokens per stage** vs. a monolithic prompt **exceeding 40,000 tokens, most of it irrelevant** — invoking Liu et al.'s *"lost in the middle"* degradation as the mechanism. It also reframes ICM as **"interpretable" in Rudin's sense** (inherently inspectable, not post-hoc explained) and as Karpathy-style *"context engineering."* The full set of paper diagrams (five-layer hierarchy with token budgets, the folder tree, the token-composition chart, the review-gate pipeline) is captured and synthesized in [exhibit-icm-paper-figures](#exhibit-icm-paper-figures).
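
A minimal sketch of how the stable layers might be gathered for one stage, assuming a layout in which each stage folder holds its own `CONTEXT.md` plus a `reference/` subfolder. The file names `CLAUDE.md` and `CONTEXT.md` and the stage name `01_research` come from the paper as summarized above; the `reference/` folder, the helper names, and the file contents are hypothetical. In practice the agent reads these files itself, so this only illustrates which files are in scope for a stage.

```python
import tempfile
from pathlib import Path


def build_demo_workspace(root: Path) -> None:
    """Create a tiny workspace; the `reference/` subfolder is an assumed layout."""
    (root / "CLAUDE.md").write_text("Layer 0: global identity\n")
    (root / "CONTEXT.md").write_text("Layer 1: workspace routing\n")
    stage = root / "01_research"
    (stage / "reference").mkdir(parents=True)
    (stage / "CONTEXT.md").write_text("Layer 2: stage contract\n")
    (stage / "reference" / "style.md").write_text("Layer 3: reference material\n")


def assemble_stage_context(root: Path, stage_name: str) -> str:
    """Concatenate Layers 0-3 for one stage, in order, skipping missing files."""
    stage = root / stage_name
    paths = [root / "CLAUDE.md", root / "CONTEXT.md", stage / "CONTEXT.md"]
    paths += sorted((stage / "reference").glob("*.md"))
    return "\n".join(p.read_text() for p in paths if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    build_demo_workspace(workspace)
    context = assemble_stage_context(workspace, "01_research")
    print(context)
```

Layer 4 working artifacts are deliberately not loaded here, since they change on every run and would be passed in separately.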

## Cultural Validation

[entity-anthropic](#entity-anthropic) and researchers such as [entity-andrej-karpathy](#entity-andrej-karpathy) independently arrived at similar ideas — Karpathy's 'LLM Wiki' approach mirrors ICM's emphasis on structured markdown as the substrate of agent context.

## Related Building Blocks

- The structural origin of ICM skills is explored in [concept-dialogue-structure](#concept-dialogue-structure).
- ICM is operationalized in the [framework-skill-creation](#framework-skill-creation) process.
- The maturity ladder for adopting ICM is described in [concept-three-levels-ai](#concept-three-levels-ai).
- Its ultimate expression is [concept-voice-collaboration](#concept-voice-collaboration).

## Prerequisites for Understanding

- [prereq-llm-context](#prereq-llm-context)
- [prereq-markdown](#prereq-markdown)

## Open Questions

- [question-icm-scaling](#question-icm-scaling) — how does this scale to massive enterprise codebases?

## Counter-Perspective

See [contrarian-frameworks](#contrarian-frameworks). Note that Microsoft's Cloud Adoption Framework and other enterprise sources agree with the *starting* posture (single agent first) but argue that multi-agent frameworks remain valuable across security boundaries, multi-team environments, and at large scale.


#### concept-three-levels-ai

*type: `concept`*

## Definition

A maturity model for AI adoption inside organizations, proposed by [entity-jake-van-clief](#entity-jake-van-clief) from his enterprise consulting work.

## Level 1 — Copy & Paste

Baseline ad-hoc use: users paste prompts into chatbots (ChatGPT, [entity-claude](#entity-claude)), iterate manually, copy outputs back into their work.

- Low effort, low and inconsistent impact
- No reuse, no shared assets
- Every interaction is ephemeral

## Level 2 — Structured Use

Users operate from refined, saved prompts, brand-tone guides, and verified outputs. Teams employ 'skills' and prompt libraries to standardize interactions.

- Jump from L1→L2 is the **highest-ROI move** (see [claim-l2-roi](#claim-l2-roi) and [quote-l2-roi](#quote-l2-roi))
- Flattens the effort curve while raising output quality and consistency
- Operationalized via [action-move-to-l2](#action-move-to-l2) and [action-codify-voice](#action-codify-voice)

## Level 3 — Integrated Workflow

Automated pipelines where multiple skills, prompts, and deterministic scripts (e.g., Python) are chained. The AI navigates the folder structure established under [concept-icm](#concept-icm) to execute multi-step processes with minimal human intervention.

- Highest absolute impact
- High engineering cost: requires distributed-systems thinking (orchestration, observability, error handling)
- Replaces ad-hoc assistance with systemic automation

## Enrichment Notes

Industry guidance broadly agrees that codifying prompts before building automation is the right *order* of operations. The specific framing 'highest ROI step' is grounded in practitioner experience rather than rigorous controlled studies — read it as a strong consultant heuristic, not a formal law.

For some organizations with large repetitive workloads and strong engineering teams, jumping directly into a narrow L3 deployment can also deliver outsized ROI.


#### concept-voice-collaboration

*type: `concept`*

## The Vision

The video culminates in a live demonstration of what [entity-jake-van-clief](#entity-jake-van-clief) considers the future of AI workflows: **an AI agent actively participating in a live group call**, not as a passive transcriber but as a real-time collaborator.

See [quote-voice-control](#quote-voice-control) for the speaker's framing.

## The Demo Stack

- **Voice cloning**: a custom voice model of the speaker, trained via [entity-11labs](#entity-11labs)
- **LLM runtime**: a local instance of [entity-claude](#entity-claude)
- **Codebase under control**: the speaker's 'Ethics Engine' project
- **Substrate**: a folder structure following [concept-icm](#concept-icm), where psychometric scales and other context live as markdown
- **Interaction loop**: voice → STT → Claude → file system read/write → response back via TTS during the live call
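
The interaction loop can be sketched as one function that chains three steps. This is a shape-only illustration, not the demo's code: the speech-to-text, agent, and text-to-speech callables are hypothetical stand-ins, and no real ElevenLabs or Claude API is called.

```python
from typing import Callable


def run_voice_turn(
    audio: bytes,
    speech_to_text: Callable[[bytes], str],
    agent: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """One pass through the loop: voice -> STT -> agent -> TTS."""
    command = speech_to_text(audio)
    reply = agent(command)  # a real agent would read or write files at this step
    return text_to_speech(reply)


# Stand-in stubs so the sketch runs without any external service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_agent(command: str) -> str:
    return f"Done: {command}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


spoken_reply = run_voice_turn(
    b"open the openness scale", fake_stt, fake_agent, fake_tts
)
print(spoken_reply)
```

Passing the three steps in as callables keeps the loop itself independent of any particular vendor.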

## What It Replaces

The traditional paradigm of:

1. Record a meeting
2. Transcribe afterwards
3. Feed the transcript to an LLM
4. Manually pick up generated tasks

…is collapsed into a single real-time loop where the AI executes during the conversation. There is no post-meeting processing and no separate orchestration layer.

## Why ICM Matters Here

The folder structure of [concept-icm](#concept-icm) gives the voice-driven agent its situational awareness: when asked to *'open the openness scale'*, the agent navigates a predictable directory and finds a markdown file describing it. Without ICM the same demo would require bespoke tool-calling glue.
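
To make the "predictable directory" point concrete, here is a rough sketch of resolving a spoken name to a markdown file. It assumes a naming convention (lowercase, hyphen-separated file names) that the source does not specify; the `scales` folder and the `find_note` helper are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_note(root: Path, spoken_name: str) -> Optional[Path]:
    """Return the first markdown file whose stem matches the spoken name."""
    wanted = spoken_name.lower().replace(" ", "-")  # assumed naming convention
    for path in sorted(root.rglob("*.md")):
        if path.stem.lower() == wanted:
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    scales = root / "scales"  # hypothetical folder name
    scales.mkdir()
    (scales / "openness-scale.md").write_text("# Openness scale\n")

    found = find_note(root, "openness scale")
    found_name = found.name if found else None
    missing = find_note(root, "agreeableness scale")
    print(found_name, missing)
```

An LLM agent would typically do this lookup by listing and reading folders itself; the function only shows why a consistent layout makes the request unambiguous.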

## Open Issues

See [question-voice-security](#question-voice-security). Voice control of local code raises authentication, permission-scoping, and voice-spoofing concerns that current consumer voice stacks do not adequately address. The likely production future is multimodal (text + GUI + voice) rather than voice-only control.
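
Of the concerns listed, permission-scoping is the one that is easy to sketch in code. The guard below is an assumption about how a team might confine file access to a single workspace folder; it is not described in the source, the `resolve_inside` name is hypothetical, and it does nothing about authentication or voice spoofing.

```python
from pathlib import Path


def resolve_inside(workspace: Path, requested: str) -> Path:
    """Resolve a requested path and refuse anything outside the workspace."""
    root = workspace.resolve()
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} escapes the workspace")
    return target


workspace = Path("workspace")  # hypothetical workspace folder
allowed = resolve_inside(workspace, "scales/openness.md")
print(allowed.name)

try:
    resolve_inside(workspace, "../secrets.txt")
except PermissionError as error:
    print("blocked:", error)
```

Resolving the path before checking it matters: a plain string-prefix test would miss `..` segments and symlinks that point outside the folder.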

See also the prediction in [claim-voice-future](#claim-voice-future).


---

### Folder: frameworks

#### framework-skill-creation

*type: `framework`*

## Purpose

A repeatable framework for converting ephemeral chatbot conversations into structured, reusable AI **skills** (markdown files) suitable for use under [concept-icm](#concept-icm). It operationalizes the thesis of [concept-dialogue-structure](#concept-dialogue-structure).

The framework grew out of the visual decision-tree mapping tool built by [entity-k-kumar](#entity-k-kumar).

## The Five Steps

### 1. Identify the Goal / Intent

What is the user actually trying to achieve? Example: *'Tighten this paragraph.'* This becomes the top of the decision tree.

### 2. Extract Constraints

What boundaries shaped the response? Examples:

- Reduce wordiness
- Maintain the original rhythm and voice
- Preserve all factual content
- Honour a length budget

### 3. Identify Assumptions

What did the user and the model implicitly assume? Examples:

- Target audience is a general blog readership
- The paragraph is final-draft quality
- No new sources or facts may be introduced

### 4. Map Sub-Goals

What intermediate steps were required? Examples:

- Identify filler phrases
- Collapse redundant clauses
- Re-balance sentence lengths
- Verify meaning preservation

### 5. Encode into a Markdown Skill

Write Goal / Constraints / Assumptions / Sub-goals into a structured markdown file in the skills folder. This artifact becomes a permanent, version-controlled skill consumable by any agent operating inside the [concept-icm](#concept-icm) vault.
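
A minimal sketch of step 5, assuming one markdown file per skill with one heading per component. The source does not prescribe a file template, so the heading layout, the `skills` folder name, and the `render_skill` helper are all hypothetical; the example entries are taken from steps 1 to 4 above.

```python
import tempfile
from pathlib import Path


def render_skill(name: str, goal: str, sections: dict[str, list[str]]) -> str:
    """Render a skill as markdown: a Goal section plus one bulleted section each."""
    lines = [f"# Skill: {name}", "", "## Goal", "", goal, ""]
    for heading, items in sections.items():
        lines += [f"## {heading}", ""]
        lines += [f"- {item}" for item in items]
        lines.append("")
    return "\n".join(lines)


skill_text = render_skill(
    "tighten-paragraph",
    "Tighten this paragraph.",
    {
        "Constraints": ["Reduce wordiness", "Preserve all factual content"],
        "Assumptions": ["No new sources or facts may be introduced"],
        "Sub-goals": ["Identify filler phrases", "Verify meaning preservation"],
    },
)

with tempfile.TemporaryDirectory() as tmp:
    skills_dir = Path(tmp) / "skills"  # hypothetical folder name
    skills_dir.mkdir()
    skill_path = skills_dir / "tighten-paragraph.md"
    skill_path.write_text(skill_text)
    round_trip = skill_path.read_text()

print(skill_text)
```

Because the output is plain markdown, the file can be committed, diffed, and edited by hand like any other note in the vault.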

## Why This Works

The framework is consistent with mainstream prompt-engineering practice (deriving system prompts and tools by abstracting successful interactions) and conversational-UX practice (modelling chatbot flows as decision trees). Multi-agent research likewise decomposes tasks into conversational roles with explicit protocols.

## Related Action

[action-codify-voice](#action-codify-voice) is a concrete instance of this framework applied to writing voice/tone.


---

### Folder: claims

#### claim-icm-superiority

*type: `claim`*

## Claim

[entity-jake-van-clief](#entity-jake-van-clief) asserts that the [concept-icm](#concept-icm) — simple folder structures plus markdown files navigated by a single agent — is superior to complex multi-agent frameworks like [entity-langchain](#entity-langchain) or [entity-semantic-kernel](#entity-semantic-kernel).

## Specific Sub-Claims

1. **Token usage reduction of 20–40%** versus framework-driven approaches.
2. **Faster outcomes** (less orchestration overhead, no agent-to-agent message loops).
3. **Easier adoption and maintenance** by non-technical teams.
4. **Greater determinism** in execution.
5. **'Multi-agentic harnesses' are absurdities** — a single well-contextualized agent is sufficient.

## Evidence in the Source

- Live demos of ICM-based skills outperforming framework approaches on the same tasks.
- Reports from enterprise clients adopting the methodology.
- The headline [quote-absurdities](#quote-absurdities) crystallizes the rhetorical position.
- Reinforced by [contrarian-frameworks](#contrarian-frameworks).

## Companion-Paper Grounding (sharpens, but does not benchmark)

The formal paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv) supplies the figures behind the "20–40%" headline:

- **Per-stage context budget:** 2,000–8,000 *focused* tokens per stage vs. monolithic prompts **exceeding 40,000 tokens, most of it irrelevant**. The win is framed as *relevance density*, not raw compression.
- **Mechanism, not just outcome:** the reduction is justified theoretically via Liu et al.'s *"lost in the middle"* — staged loading keeps load-bearing content out of the degraded mid-context band.
- **Adoption evidence (N=33, informal self-report):** **30 of 33** practitioners report a **U-shaped human-intervention pattern** — heavy edits at stage 1 (**92%**), light at stage 2 (**30%**), heavy at stage 3 (**78%**); three non-coders shipped working workspaces.
- ⚠️ **Still no controlled head-to-head.** The paper *explicitly states* there is "no controlled comparison between ICM's staged loading and monolithic prompting," and all testing used a single model family (Claude Opus/Sonnet 4.6). So the figures corroborate the efficiency story but **do not** establish superiority over LangChain/Semantic Kernel on a benchmark.
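
The per-stage figures lend themselves to a quick arithmetic check. The sketch below assumes a crude rule of thumb of about four characters per token, which is not from the paper and varies by tokenizer; only the 2,000 to 8,000 range and the 40,000 lower bound are quoted from the bullets above. It does not derive or verify the 20–40% headline.

```python
STAGE_BUDGET = (2_000, 8_000)  # per-stage range quoted from the paper
MONOLITHIC_TOKENS = 40_000     # lower bound quoted for a monolithic prompt


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def within_stage_budget(text: str) -> bool:
    """Check an assembled stage context against the quoted per-stage range."""
    low, high = STAGE_BUDGET
    return low <= estimate_tokens(text) <= high


def share_of_monolith(stage_tokens: int) -> float:
    """Fraction of a 40,000-token prompt that one stage's context represents."""
    return stage_tokens / MONOLITHIC_TOKENS


stage_context = "x" * 20_000  # stand-in for roughly 5,000 tokens of context
print(estimate_tokens(stage_context), within_stage_budget(stage_context))
print(share_of_monolith(2_000), share_of_monolith(8_000))
```

On these numbers a single stage carries at most 5% to 20% of what the monolithic prompt would load, which is the relevance-density argument restated as arithmetic.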

## Confidence: **high** (per source) — but validation says:

- **Single-agent-first guidance is mainstream**, not fringe. Microsoft's Cloud Adoption Framework explicitly recommends starting with a single-agent system and only escalating to multi-agent when crossing security boundaries, team boundaries, or scaling needs.
- **The 20–40% token-reduction figure is anecdotal** — no peer-reviewed benchmark exists comparing ICM-style navigation vs LangChain/Semantic Kernel.
- **The 'absurdities' framing overshoots** — multi-agent research shows role decomposition (retrieval, reasoning, validation, monitoring) improves modularity and robustness in genuinely complex environments. Enterprise multi-agent literature also documents necessary patterns (sagas, circuit breakers, governance) that are not 'absurd' but earned.

## Testability

**Yes** — benchmark a representative workflow implemented via ICM vs LangChain/Semantic Kernel on (a) tokens consumed, (b) wall-clock latency, (c) maintenance effort, and (d) determinism of outputs across runs.
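
A rough harness for criteria (a), (b), and (d) might look like the sketch below. It assumes each implementation can be wrapped as a callable that returns its output text and the tokens it consumed; that interface, the `benchmark` helper, and the `fake_workflow` stub are hypothetical. Determinism is approximated as the share of runs that produce the most common output. Maintenance effort (c) is not something code can measure and is left out.

```python
import time
from collections import Counter
from typing import Callable


def benchmark(
    run: Callable[[str], tuple[str, int]], task: str, repeats: int = 5
) -> dict[str, float]:
    """Run one workflow several times and summarize tokens, latency, determinism."""
    outputs, tokens, seconds = [], [], []
    for _ in range(repeats):
        start = time.perf_counter()
        output, used = run(task)
        seconds.append(time.perf_counter() - start)
        outputs.append(output)
        tokens.append(used)
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return {
        "mean_tokens": sum(tokens) / repeats,
        "mean_seconds": sum(seconds) / repeats,
        "determinism": most_common_count / repeats,
    }


def fake_workflow(task: str) -> tuple[str, int]:
    """Stand-in for a real ICM or framework run; returns (output, tokens used)."""
    return task.upper(), 1200


report = benchmark(fake_workflow, "summarize notes")
print(report)
```

Running the same harness over both implementations with identical tasks and the same model would give the side-by-side numbers the claim currently lacks.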

## Related Action

[action-implement-folders](#action-implement-folders)


#### claim-l2-roi

*type: `claim`*

## Claim

Based on consulting work with enterprise companies, [entity-jake-van-clief](#entity-jake-van-clief) claims that moving an organization from Level 1 (ad-hoc copy/paste into chatbots) to Level 2 (structured prompts and verified outputs) of [concept-three-levels-ai](#concept-three-levels-ai) delivers the **highest ROI** of any AI adoption step.

See [quote-l2-roi](#quote-l2-roi) for the punchy framing.

## Reasoning

- **L3 has higher absolute impact** but requires significant engineering investment (distributed systems, observability, orchestration).
- **L1→L2 is comparatively cheap**: build shared prompt libraries, brand-tone guides, and basic markdown 'skills'.
- The ratio of (gain in quality + consistency) to (effort) is maximized at this transition.

## How To Act On It

See [action-move-to-l2](#action-move-to-l2) and [action-codify-voice](#action-codify-voice).

## Confidence: **high** (per source) — validation says:

- **Well-aligned with practitioner consensus**. Standardizing prompts/patterns before building automation is widely recommended (Microsoft's adoption guidance, prompt-library/playbook literature, vendor playbooks).
- **Empirical ROI quantification is scarce**. The 'highest ROI' framing is consultant insight, not a controlled finding.
- **Counter-case**: organizations with large repetitive workloads and strong engineering teams may extract outsized ROI by jumping directly into a narrow L3 deployment.

## Testability

**Yes, with caveats**. ROI must be operationalized (output quality scores, cycle time, error rate) and measured pre/post intervention across a comparable cohort. This is hard but feasible.


#### claim-voice-future

*type: `claim`*

## Claim

[entity-jake-van-clief](#entity-jake-van-clief) claims that the future of software engineering and workflow automation lies in **real-time, voice-driven AI collaboration** — see [concept-voice-collaboration](#concept-voice-collaboration). He predicts that the ability to verbally command an agent to read, analyze, and write to a local file system during a live meeting will replace the current paradigm of post-meeting transcript analysis and manual task execution.

See [quote-voice-control](#quote-voice-control) for the framing.

## What's Already Possible

The demoed stack (voice cloning via [entity-11labs](#entity-11labs) + local [entity-claude](#entity-claude) + [concept-icm](#concept-icm) folders) is technically feasible today. Real-time transcription, live IDE editing by voice, and 'AI teammate' patterns already exist in commercial tools.

## Confidence: **medium**, **not testable** (it's a prediction).

Validation perspective:

- **Technically plausible and partially realized.** This is a forward-looking but reasonable prediction.
- **Broad consensus on voice as *the* dominant modality does not exist.** Many engineers prefer keyboard/editor workflows for precision, speed, and privacy.
- **Substantial barriers**: see [question-voice-security](#question-voice-security). Voice spoofing, replay attacks, bystander exposure, open-office acoustics, and corporate IT policy all impede mainstream adoption.
- **Likely future**: multimodal control (text + GUI + voice) where voice is one option, not a universal replacement.


---

### Folder: entities

#### entity-11labs

*type: `entity` · entity: tool*

## Profile

ElevenLabs (referred to in the video as '11Labs') is a commercial provider of AI speech synthesis and voice cloning, widely used for custom voices, dubbing, and interactive applications.

## Role in the Source

- Used to create a **custom voice model of [entity-jake-van-clief](#entity-jake-van-clief) himself** for the real-time voice-driven AI collaboration demo
- Forms the TTS/voice-cloning side of the loop demonstrated in [concept-voice-collaboration](#concept-voice-collaboration)

## Security Footnote

The ease of voice cloning here is exactly what drives the concerns raised in [question-voice-security](#question-voice-security): if voices are cheap to clone, voice-as-authentication becomes risky for high-trust filesystem control.


#### entity-andrej-karpathy

*type: `entity` · entity: person*

## Profile

Andrej Karpathy is an influential AI researcher: former Director of AI at Tesla, founding member of OpenAI, and an educator widely followed for material on deep learning and LLM usage. The source describes him as having recently joined [entity-anthropic](#entity-anthropic) to teach.

## Relevance to This Vault

- He popularized an 'LLM Wiki' / markdown-based personal knowledge workflow that closely mirrors the philosophy of [concept-icm](#concept-icm).
- Cited by [entity-jake-van-clief](#entity-jake-van-clief) as independent validation that prominent AI practitioners are converging on folder-based, markdown-first context management.
- Symbolic of the broader cultural shift away from heavy orchestration frameworks toward simple, inspectable substrates.


#### entity-anthropic

*type: `entity` · entity: organization*

## Profile

Anthropic is an AI safety and research company that develops the [entity-claude](#entity-claude) family of large language models and promotes 'constitutional AI' approaches.

## Relevance to This Vault

- Cited as an organization that heavily uses the concept of 'skills' and structured context, aligning closely with [concept-icm](#concept-icm).
- [entity-andrej-karpathy](#entity-andrej-karpathy) is noted to have recently joined Anthropic to teach, reinforcing the cultural overlap between Karpathy's 'LLM Wiki' approach and ICM.
- [entity-claude](#entity-claude) is the model used in the live demos including [concept-voice-collaboration](#concept-voice-collaboration).


#### entity-claude

*type: `entity` · entity: product*

## Profile

Claude is [entity-anthropic](#entity-anthropic)'s family of large language models, designed for helpfulness and safety. Recent generations are used extensively in filesystem-navigation and coding-assistant scenarios; the companion paper's testing used Claude Opus/Sonnet 4.6.

## Role in the Source

- The primary LLM used in [entity-jake-van-clief](#entity-jake-van-clief)'s demonstrations
- Drives the live demo of [concept-voice-collaboration](#concept-voice-collaboration) — a local instance executes voice-issued commands against a folder structure built per [concept-icm](#concept-icm)
- Used as the agent that 'navigates folders' in ICM examples

## Cultural Note

Anthropic's emphasis on **skills** as a first-class abstraction is repeatedly cited as cultural validation of ICM.


#### entity-david-mcdermott

*type: `entity` · entity: person*

## Profile

David McDermott appears in the source's speaker list as a participant in the conversation alongside [entity-jake-van-clief](#entity-jake-van-clief) and [entity-k-kumar](#entity-k-kumar). He is listed as present, but no substantive content is attributed to him in the *video* extraction.

**Companion-source upgrade:** the supporting paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv) identifies McDermott as the **co-author** of *"Interpretable Context Methodology: Folder Structure as Agent Architecture"* (arXiv:2603.16021) with Van Clief, affiliated with **Eduba / University of Edinburgh** — the same institution as [entity-k-kumar](#entity-k-kumar). So he is not merely a passive participant: he is the formal academic co-originator of the methodology the talk presents.

## Role in the Source

In the video: co-host / interlocutor, not individually quoted. In the broader work: **research co-author** responsible (with Van Clief) for the formal articulation of the Five-Layer Context Hierarchy, the staged-folder architecture, and the documented limitations. Where the talk supplies practitioner conviction, the paper (and thus McDermott's contribution) supplies the structure, lineage, and stated threats to validity.

## Note

This entity note is emitted per the speaker-completeness rule so that cross-vault tooling resolves every named speaker consistently. Enriched from companion source [entity-icm-paper-arxiv](#entity-icm-paper-arxiv).


#### entity-icm-paper-arxiv

*type: `entity` · entity: publication*

> **Provenance note:** This note is a **supplementary companion source** added alongside the YouTube extraction. The video ([entity-jake-van-clief](#entity-jake-van-clief)'s talk) is the *primary* source of this vault; this is the formal academic paper by the **same lead author** that grounds the video's claims. The `yt-extract-agent` pipeline is single-source, so this note was folded in manually to let downstream agents inherit both the practitioner talk and its companion paper.

## Bibliographic

- **Title:** Interpretable Context Methodology: Folder Structure as Agent Architecture
- **Authors:** [entity-jake-van-clief](#entity-jake-van-clief), [entity-david-mcdermott](#entity-david-mcdermott)
- **arXiv:** [2603.16021v2](https://arxiv.org/html/2603.16021v2) (18 Mar 2026)
- **Affiliation:** Eduba, University of Edinburgh

## Abstract (verbatim)

> Current approaches to AI agent orchestration typically involve building multi-agent frameworks that manage context passing, memory, error handling, and step coordination through code. These frameworks work well for complex, concurrent systems. But for sequential workflows where a human reviews output at each step, they introduce engineering overhead that the problem does not require. This paper presents Interpretable Context Methodology (ICM), a method that replaces framework-level orchestration with filesystem structure. Numbered folders represent stages. Plain markdown files carry the prompts and context that tell a single AI agent what role to play at each step. Local scripts handle the mechanical work that does not need AI at all. The result is a system where one agent, reading the right files at the right moment, does the work that would otherwise require a multi-agent framework.

## Visual Exhibits

The paper's 5 figures + 2 tables are extracted, rendered, and synthesized in **[exhibit-icm-paper-figures](#exhibit-icm-paper-figures)** — including the five-layer hierarchy with per-layer token budgets (Fig 1), the layer-annotated folder tree (Fig 2), the stacked token-composition chart showing the monolithic ~42k context as mostly irrelevant waste (Fig 3), the human-review-gate pipeline (Fig 4), the U-shaped intervention chart (Fig 5), and the framework-vs-ICM control-surface table (Table 1). These exhibits are the richest layer the companion source adds over the video.

## Formal Components (grounds the video's [concept-icm](#concept-icm))

**Five-Layer Context Hierarchy** — the paper's central artifact, not stated explicitly in the talk:

- **Layer 0** — `CLAUDE.md` (global identity)
- **Layer 1** — `CONTEXT.md` (workspace routing)
- **Layer 2** — Stage `CONTEXT.md` (stage contracts)
- **Layer 3** — Reference material (stable across runs)
- **Layer 4** — Working artifacts (per-run content)

**Stage structure** — numbered folders (`01_research`, `02_script`, `03_production`) with explicit Inputs / Process / Outputs contracts. **Review gates** sit between stages as human intervention points where outputs become editable. The workspace is a self-contained folder using plain markdown + JSON as the universal interface. See [concept-dialogue-structure](#concept-dialogue-structure) and [framework-skill-creation](#framework-skill-creation).
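
The stage-and-gate flow can be sketched as a simple sequential pipeline. This is an illustration under assumptions rather than the paper's implementation: each stage is modelled as a function from input text to output text, and the review gate as a callback that returns the (possibly edited) output. The stage names are the paper's; the lambdas and the `run_pipeline` helper are hypothetical.

```python
from typing import Callable

Stage = tuple[str, Callable[[str], str]]
ReviewGate = Callable[[str, str], str]


def run_pipeline(stages: list[Stage], first_input: str, review: ReviewGate) -> str:
    """Run stages in order; a human review gate sits after every stage."""
    artifact = first_input
    for name, process in stages:
        draft = process(artifact)       # stage turns its input into an output
        artifact = review(name, draft)  # reviewer may edit before the next stage
    return artifact


stages: list[Stage] = [
    ("01_research", lambda text: f"notes({text})"),
    ("02_script", lambda text: f"script({text})"),
    ("03_production", lambda text: f"video({text})"),
]


def approve_unchanged(stage_name: str, draft: str) -> str:
    """Stand-in reviewer that accepts every draft as-is."""
    return draft


final = run_pipeline(stages, "topic", approve_unchanged)
print(final)
```

In a real workspace the review callback would pause for a person to edit the stage's output files, which is where the paper locates human intervention.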

## Quantitative Grounding (sharpens [claim-icm-superiority](#claim-icm-superiority))

The video's "20–40% token reduction" is anecdotal; the paper supplies the underlying figures:

- **Per-stage context:** 2,000–8,000 *focused* tokens per stage vs. a monolithic prompt **exceeding 40,000 tokens, most of it irrelevant**.
- **Theoretical basis:** Liu et al.'s *"lost in the middle"* context-degradation effect — staged loading keeps relevant content out of the degraded middle band.
- **Practitioner observation (N=33, informal self-report):** **30 of 33** report a **U-shaped intervention pattern** — heavy editing at stage 1 (**92%**), light at stage 2 (**30%**), heavy at stage 3 (**78%**). Three non-coders successfully built video workspaces.
- ⚠️ **No controlled quantitative comparison** between ICM and monolithic prompting is reported. The numbers are efficiency/usage figures, not a benchmark win.

## Intellectual Lineage

The paper situates ICM against: McIlroy's Unix "do one thing well" + plain-text-as-interface; Shaw & Garlan's pipe-and-filter pattern; Aho et al.'s multi-pass compilation / intermediate representation; Wei et al.'s chain-of-thought decomposition; Horvitz's mixed-initiative systems; Liu et al.'s lost-in-the-middle; Fails & Olsen's interactive ML; Knuth's literate programming; Rudin's interpretability framework; and Karpathy's "context engineering" (2025) — the same lineage [entity-andrej-karpathy](#entity-andrej-karpathy) is cited for in the talk.

## Stated Limitations (extends [question-icm-scaling](#question-icm-scaling))

- Data is **self-reported through conversation, not instrumented** measurement.
- Practitioner community is **invite-only, self-selected** (52 members); active use **concentrated in content production**.
- **All testing on a single model family** (Claude Opus/Sonnet 4.6).
- **No controlled comparison** of staged vs. monolithic loading.
- **Explicitly out of scope:** the paper states that ICM cannot handle real-time multi-agent collaboration, high-concurrency systems, or complex automated branching. This is consistent with the talk's [contrarian-frameworks](#contrarian-frameworks) caveat that frameworks retain value across security boundaries and at scale.

## Open Questions Raised

- Does the five-layer hierarchy **generalize across model families**?
- As context windows grow, does **selective loading stay important**?
- How **sensitive** is output quality to context ordering/formatting within layers?
- Needs: formal cross-model evaluation + structured user studies with systematic data collection.


#### entity-jake-van-clief

*type: `entity` · entity: person*

## Profile

Jake Van Clief is the **primary speaker** and originator of the [concept-icm](#concept-icm) thesis presented in this source. He works as an AI consultant for enterprise organizations and frames his observations *in the talk* as practitioner experience rather than research.

**That practitioner/research split is bridged by the companion source:** Van Clief is also the **lead author** of the formal paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv) (*"Interpretable Context Methodology: Folder Structure as Agent Architecture,"* arXiv:2603.16021, with co-author [entity-david-mcdermott](#entity-david-mcdermott), Eduba / University of Edinburgh). The video is his conviction-driven practitioner pitch; the paper is the same idea rendered as method, lineage, and acknowledged limitations. Read together, the talk supplies the *why-it-matters* and the paper supplies the *what-it-actually-is* and *where-it-breaks*.

## Role in the Source

- Delivers the talk on Interpretible Context Methodology and the future of AI dialogue
- Demonstrates ICM in live coding sessions
- Performs the real-time voice-driven AI collaboration demo using a custom voice model trained via [entity-11labs](#entity-11labs) and a local instance of [entity-claude](#entity-claude)

## Attributed Contributions in This Vault

Concepts proposed:

- [concept-icm](#concept-icm)
- [concept-three-levels-ai](#concept-three-levels-ai)
- [concept-dialogue-structure](#concept-dialogue-structure)
- [concept-voice-collaboration](#concept-voice-collaboration)

Claims advanced:

- [claim-icm-superiority](#claim-icm-superiority)
- [claim-l2-roi](#claim-l2-roi)
- [claim-voice-future](#claim-voice-future)

Quotes attributed:

- [quote-absurdities](#quote-absurdities)
- [quote-l2-roi](#quote-l2-roi)
- [quote-dialogue-theme](#quote-dialogue-theme)
- [quote-voice-control](#quote-voice-control)

Contrarian positioning:

- [contrarian-frameworks](#contrarian-frameworks)

Recommended actions he advocates:

- [action-implement-folders](#action-implement-folders)
- [action-move-to-l2](#action-move-to-l2)
- [action-codify-voice](#action-codify-voice)


#### entity-k-kumar

*type: `entity` · entity: person*

## Profile

K. Kumar is described in the source as a **co-founder and student at the University of Edinburgh**. He created the visual mapping tool used in the video to extract decision trees, goals, and constraints from human–AI dialogue.

## Role in the Source

- Builder of the dialogue-extraction visualization tool central to the live demos
- Co-participant in the conversation (named in the speaker list)
- Intellectual collaborator on the thesis that conversation is the substrate of AI skills

## Attributed Contributions in This Vault

- The visual decision-tree mapping tool underpinning [concept-dialogue-structure](#concept-dialogue-structure)
- Methodological inspiration for [framework-skill-creation](#framework-skill-creation)

## Disambiguation Note

Multiple individuals match 'Kumar' at the University of Edinburgh. No definitive canonical URL is asserted without additional identifiers — downstream tools should treat the entity as project-specific.


#### entity-langchain

*type: `entity` · entity: tool*

## Profile

LangChain is a popular open-source framework for building LLM applications, offering chains, agents, tools, retrievers, and integrations for both single- and multi-agent patterns.

## Role in the Source

Cited as the canonical example of a 'complex multi-agent framework' that [entity-jake-van-clief](#entity-jake-van-clief) argues is over-engineered relative to [concept-icm](#concept-icm). Forms the core of the contrarian position in [claim-icm-superiority](#claim-icm-superiority) and [contrarian-frameworks](#contrarian-frameworks).

## Balanced View

LangChain is genuinely useful for cross-boundary, multi-team, and large-scale workflows where role decomposition, tool routing, and orchestration glue earn their complexity. The video's critique is best read as 'most teams don't need it yet', not 'no one ever needs it.'


#### entity-semantic-kernel

*type: `entity` · entity: tool*

## Profile

Semantic Kernel is Microsoft's open-source orchestration framework for AI agents and LLM-driven applications. It exposes the abstractions of **skills**, **planners**, and **connectors** that bridge LLMs to external services.

## Role in the Source

Mentioned alongside [entity-langchain](#entity-langchain) as an orchestration framework that [entity-jake-van-clief](#entity-jake-van-clief)'s [concept-icm](#concept-icm) aims to replace with plain folder structures. Central to [claim-icm-superiority](#claim-icm-superiority) and [contrarian-frameworks](#contrarian-frameworks).

## Notable Irony

Semantic Kernel's 'skill' abstraction is conceptually close to ICM's 'skill' markdown file — both formalize a reusable unit of LLM capability. The disagreement is over the implementation substrate (code-defined plugins vs. plain markdown navigable by a single agent).


---

### Folder: quotes

#### quote-absurdities

*type: `quote`*

> "They're not building multi-agentic frameworks and all these absurdities, they're building folders and markdown files on their computer and getting huge results from it."

— [entity-jake-van-clief](#entity-jake-van-clief), 00:00:16

## Why It Matters

The video's thesis in one sentence. It anchors [concept-icm](#concept-icm), [claim-icm-superiority](#claim-icm-superiority), and the contrarian framing in [contrarian-frameworks](#contrarian-frameworks).


#### quote-dialogue-theme

*type: `quote`*

> "All of these skills, all of these folders and markdown files, all have one core theme: discussion and dialogue."

— [entity-jake-van-clief](#entity-jake-van-clief), 00:07:09

## Why It Matters

The philosophical centre of the talk. It connects [concept-icm](#concept-icm) back to [concept-dialogue-structure](#concept-dialogue-structure) and [framework-skill-creation](#framework-skill-creation) — skills are conversational decision trees made permanent.


#### quote-l2-roi

*type: `quote`*

> "The jump from L1 to L2 is the highest ROI move."

— [entity-jake-van-clief](#entity-jake-van-clief), 00:02:29

## Why It Matters

Compresses [claim-l2-roi](#claim-l2-roi) into a memorable consultant heuristic. References [concept-three-levels-ai](#concept-three-levels-ai).


#### quote-voice-control

*type: `quote`*

> "What if I could sit inside of a group call and control someone else's Claude code or AI through my voice and immediately access all of that data that's locally on their computer?"

— [entity-jake-van-clief](#entity-jake-van-clief), 00:19:05

## Why It Matters

The motivating question for [concept-voice-collaboration](#concept-voice-collaboration) and the forward-looking [claim-voice-future](#claim-voice-future). Also implicitly raises [question-voice-security](#question-voice-security).


---

### Folder: action-items

#### action-codify-voice

*type: `action-item`*

## Action

Create a dedicated markdown file (e.g., `voice-and-tone.md`) that explicitly defines:

- Writing style and register
- Formatting constraints (headings, lists, emphasis)
- Teaching methodology / explanatory posture
- Prohibited words, clichés, or rhetorical moves
- Examples of 'good' and 'bad' outputs

Reference this file from your primary agent prompt so [entity-claude](#entity-claude) (or any agent operating in your [concept-icm](#concept-icm) vault) consistently matches your style without repetitive manual prompting.

## Expected Outcome

- Consistent AI outputs across all projects
- One place to evolve style — every downstream skill inherits
- Eliminates the 'I'll just tell it the style each time' tax

## Concrete Instance Of

[framework-skill-creation](#framework-skill-creation) — voice/tone files are a special case of a structured skill (Goal: 'write in our voice'; Constraints: the rules; Assumptions: the audience).


#### action-implement-folders

*type: `action-item`*

## Action

Structure your AI agent's context, instructions, prompts, and 'skills' using **standard file system folders and markdown files** rather than adopting an orchestration framework such as [entity-langchain](#entity-langchain) or [entity-semantic-kernel](#entity-semantic-kernel).

## How

1. Create a vault/folder per project or domain
2. Place reusable skills as markdown files in a `skills/` subfolder (use [framework-skill-creation](#framework-skill-creation))
3. Keep voice/tone, constraints, and brand guidelines as named markdown files referenced from the main prompt
4. Give a single agent (e.g., [entity-claude](#entity-claude)) navigational access to the folder
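
The four steps above can be sketched as a small scaffolding script. This is a minimal illustration, not the author's tooling: `skills/` and `voice-and-tone.md` come from this vault, while `constraints.md`, `brand-guidelines.md`, the use of `CLAUDE.md` as the main prompt (borrowed from the companion paper's Layer 0), and the function name `scaffold_vault` are assumptions made for the example.

```python
from pathlib import Path

# Assumed file names: only `skills/` and `voice-and-tone.md` appear in the source.
MAIN_PROMPT = "CLAUDE.md"
REFERENCE_FILES = ["voice-and-tone.md", "constraints.md", "brand-guidelines.md"]


def scaffold_vault(root: Path) -> list[Path]:
    """Create a minimal vault and return the files that were newly created."""
    created = []
    (root / "skills").mkdir(parents=True, exist_ok=True)

    for name in REFERENCE_FILES:
        path = root / name
        if not path.exists():  # never overwrite a file someone already edited
            path.write_text(f"# {path.stem}\n\n(fill in)\n", encoding="utf-8")
            created.append(path)

    prompt = root / MAIN_PROMPT
    if not prompt.exists():
        links = "\n".join(f"- See `{name}`" for name in REFERENCE_FILES)
        prompt.write_text(
            "# Agent instructions\n\n"
            "Reusable skills live in `skills/`; open one only when a task needs it.\n\n"
            f"{links}\n",
            encoding="utf-8",
        )
        created.append(prompt)
    return created
```

Calling `scaffold_vault(Path("my-project"))` a second time creates nothing new, so it is safe to re-run after the files have been edited by hand.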

## Expected Outcome

- 20–40% reduction in token usage (anecdotal — verify locally)
- Increased transparency and inspectability
- Easier maintenance because everything is plain text
- Lower barrier for non-engineers to participate

## Underlying Concept

[concept-icm](#concept-icm)

## Supporting Claim

[claim-icm-superiority](#claim-icm-superiority)


#### action-move-to-l2

*type: `action-item`*

## Action

Audit your team's current AI usage. If they are primarily copy-pasting into chatbots ([concept-three-levels-ai](#concept-three-levels-ai) Level 1), invest effort into creating:

- **Shared prompt libraries** (versioned, owned, reviewed)
- **Brand-tone guides** as markdown (see [action-codify-voice](#action-codify-voice))
- **Structured markdown skills** built using [framework-skill-creation](#framework-skill-creation)

This moves the team to Level 2 — the highest-ROI step per [claim-l2-roi](#claim-l2-roi) and [quote-l2-roi](#quote-l2-roi).

## Expected Outcome

- Flattens the effort curve while raising quality and consistency
- Creates the foundation for an eventual Level 3 transition
- Reduces variance across team members

## Caveats

ROI claims are practitioner-grounded, not formally measured. Establish your own baseline and re-measure post-intervention if you want hard numbers.


---

### Folder: prerequisites

#### prereq-llm-context

*type: `prereq`*

## Why It's Required

To fully grasp why the folder-based [concept-icm](#concept-icm) is efficient, you must understand:

- **How LLMs process tokens** — text is chunked into tokens before inference
- **Context window limits** — every model has a maximum context size
- **Cost scaling** — most APIs price per input + output token, so context bloat directly costs money
- **Attention degradation** — practical performance often degrades as context grows ('lost in the middle')

## Connection to ICM

ICM's central efficiency argument is that **on-demand folder navigation loads only relevant slices into context**, instead of stuffing everything into the prompt. This is consistent with general LLM guidance (externalize persistent state, load only what is needed).

The claimed 20–40% token reduction in [claim-icm-superiority](#claim-icm-superiority) is attributed to this design choice; the figure itself is anecdotal and has not been benchmarked.
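
To check the effect on your own vault rather than rely on the quoted range, a rough estimate is enough. The sketch below compares loading every markdown file against loading one relevant folder. It assumes about four characters per token, which is only a rule of thumb for English text and not a real tokenizer; the function names are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: crude heuristic, swap in a real tokenizer if you have one


def estimate_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN


def folder_tokens(folder: Path) -> int:
    """Estimated tokens for every markdown file under `folder`."""
    return sum(
        estimate_tokens(p.read_text(encoding="utf-8"))
        for p in sorted(folder.rglob("*.md"))
    )


def selective_saving(vault: Path, relevant: Path) -> float:
    """Fraction of tokens avoided by loading only `relevant` instead of the whole vault."""
    everything = folder_tokens(vault)
    if everything == 0:
        return 0.0
    return 1 - folder_tokens(relevant) / everything
```

The result depends entirely on how much of the vault a given task actually needs, so treat it as a local measurement, not a confirmation of the published range.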


#### prereq-markdown

*type: `prereq`*

## Why It's Required

The entire [concept-icm](#concept-icm) methodology relies on **Markdown** as the substrate for instructions, context, and skills. You should be comfortable with at least:

- Headings (`#`, `##`, `###`)
- Bullet and numbered lists
- Bold and italic emphasis
- Code fences and inline code
- Links and image syntax
- Optionally: front-matter (YAML at the top of files), tables, and footnotes

## Why Markdown Specifically

- Plain-text → version-controllable and diff-friendly
- Human-readable → low barrier for non-engineers
- Machine-parseable → LLMs handle markdown structure exceptionally well
- Portable → works in Obsidian, GitHub, IDEs, and any text editor

Markdown is the lingua franca that makes [framework-skill-creation](#framework-skill-creation) and [action-codify-voice](#action-codify-voice) possible without code.


---

### Folder: open-questions

#### question-icm-scaling

*type: `open-question`*

## The Question

While [entity-jake-van-clief](#entity-jake-van-clief) demonstrates [concept-icm](#concept-icm) working effectively on focused projects and bounded databases, it remains an open question how well a **single agent navigating a folder structure scales** when applied to massive, legacy enterprise codebases with tens of thousands of interconnected files.

## Why It Matters

If ICM degrades at enterprise scale, the contrarian critique in [contrarian-frameworks](#contrarian-frameworks) weakens — because multi-agent orchestration frameworks (with specialized retrieval, planning, and validation agents) were *designed* for exactly that scale.

## Resolution Path

- Case studies of ICM applied to a monolithic enterprise codebase
- Benchmarks comparing single-agent ICM navigation vs framework-based approaches on (a) accuracy of file selection, (b) time-to-answer, (c) refactor correctness
- Hybrid patterns where ICM provides the substrate but a thin orchestration layer handles cross-team boundaries

## Sub-Threads

- At what file count / repository complexity does single-agent navigation break down?
- Do hierarchical 'index' markdown files mitigate the problem?
- Does Claude's growing context window absorb the scaling problem on its own over time?

## Open Questions Stated in the Companion Paper

The paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv) independently flags adjacent unknowns and explicitly bounds ICM's claims — these are *author-acknowledged* gaps, not external critique:

- **Cross-model generalization** — does the Five-Layer Context Hierarchy hold outside the Claude family? *All* paper testing used a single model family (Claude Opus/Sonnet 4.6).
- **Diminishing returns of selective loading** — as context windows grow, does staged loading stay worthwhile, or does the scaling problem dissolve on its own (directly echoes the sub-thread above)?
- **Sensitivity to ordering/formatting** — how much does output quality depend on context ordering within a layer?
- **Explicit non-support** — the paper states ICM is *not* intended for real-time multi-agent collaboration, high-concurrency systems, or complex automated branching. This narrows the scaling question: ICM's authors concede the high-concurrency enterprise case to frameworks rather than claiming to scale into it.
- **Methodological weakness** — evidence is self-reported (not instrumented), from an invite-only, self-selected community (52 members) concentrated in content production; no controlled comparison exists. Resolving the scaling question therefore requires the *formal cross-model evaluation and structured user studies* the paper itself calls for.


#### question-voice-security

*type: `open-question`*

## The Question

The [concept-voice-collaboration](#concept-voice-collaboration) demonstration shows an AI agent taking voice commands during a live call and executing **read/write operations on a local file system**. This raises significant security questions:

- **Authentication**: how does the agent know the speaker is authorized?
- **Voice spoofing**: tools like [entity-11labs](#entity-11labs) make cloning trivial; replay and synthesized-voice attacks are realistic
- **Permission scoping**: which folders/files can the voice channel touch?
- **Bystander hijacking**: in a compromised or public call, anyone with mic access could issue commands
- **Audit trail**: how are voice-issued operations logged?

## Why It Matters

Without solid answers, the prediction in [claim-voice-future](#claim-voice-future) cannot transition from demo to production deployment in any regulated or sensitive environment.

## Resolution Path

- Robust voice authentication protocols (biometric + secondary factor)
- Sandboxed execution environments for voice-driven agents (capability-scoped, read-only by default)
- Command-confirmation patterns for destructive operations
- Cryptographic provenance for voice commands
- Policy frameworks at the OS / enterprise IT level
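
Two of the items above, capability scoping with read-only defaults and confirmation for destructive operations, can be sketched as a single authorization check. This is an illustration under stated assumptions, not something shown in the video: the function `authorize`, its parameters, and the operation names are hypothetical. It does not address authentication, voice spoofing, or audit logging.

```python
from pathlib import Path

DESTRUCTIVE = {"write", "delete"}  # assumed operation names for this sketch


def authorize(operation: str, target: Path, allowed_root: Path,
              allow_writes: bool = False, confirmed: bool = False) -> bool:
    """Decide whether a voice-issued file operation may run."""
    root = allowed_root.resolve()
    resolved = target.resolve()
    # Permission scoping: reject anything outside the allowed folder,
    # including paths that escape it through ".." or symlinks.
    if resolved != root and root not in resolved.parents:
        return False
    if operation == "read":
        return True
    if operation in DESTRUCTIVE:
        # Read-only by default: writes need an explicit grant and a confirmation.
        return allow_writes and confirmed
    return False  # unknown operations are denied
```

Denying by default keeps the failure mode safe: a misheard or spoofed command that maps to an unknown or destructive operation does nothing unless a human has both granted write access and confirmed the action.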


---

### Folder: contrarian-insights

#### contrarian-frameworks

*type: `contrarian-insight`*

## The Contrarian Position

Contrary to the industry trend of building increasingly complex multi-agent orchestration frameworks (such as [entity-langchain](#entity-langchain), [entity-semantic-kernel](#entity-semantic-kernel), or AutoGen), [entity-jake-van-clief](#entity-jake-van-clief) argues that these are 'absurdities.' He posits that a simpler approach — standard file-system folders and markdown files navigated by a single agent — is more effective, cheaper, and easier to maintain.

See the matching claim [claim-icm-superiority](#claim-icm-superiority) and the headline [quote-absurdities](#quote-absurdities).

## What the Position Challenges

The prevailing belief that complex tasks demand complex orchestration. The contrarian view says: **most tasks don't, and the orchestration tax is paid in tokens, debugging time, and adoption friction.**

## How Far the Contrarian Claim Holds

- **Supported by the literature**: starting with a single agent plus structured context is mainstream advice (e.g., Microsoft's Cloud Adoption Framework). For coding tasks especially, single-agent + tools fits most workloads.
- **Partially supported / anecdotal**: the specific 20–40% token-reduction figure is plausible but not benchmarked.
- **Not supported in absolute form**: multi-agent frameworks are well-motivated when crossing security/compliance boundaries, when multiple teams own subsystems, when you need specialized roles with distinct permissions, or when distributed-systems patterns (sagas, circuit breakers, immutable state) become necessary at scale.

## Balanced Reframe

ICM is the right *starting* architecture and the right *terminal* architecture for many single-team workflows. Multi-agent frameworks are not absurd; they encode known distributed-systems patterns and earn their complexity once a system genuinely crosses boundaries.


---

### Folder: exhibits

#### exhibit-icm-paper-figures

*type: `exhibit`*

> **Source & provenance.** All exhibits below are extracted from the companion paper [entity-icm-paper-arxiv](#entity-icm-paper-arxiv) (*Interpretable Context Methodology: Folder Structure as Agent Architecture*, Van Clief & McDermott, [arXiv:2603.16021v2](https://arxiv.org/html/2603.16021v2)). Figures were rendered from the paper's inline SVG and captured as PNG; tables were transcribed from the paper's HTML. They were **not** in the YouTube primary source — the talk shows none of this structure on screen. This is the single richest layer of detail the companion source adds. Each exhibit below pairs the rendered image with synthesized insight for a downstream agent.

---

## Figure 1 — The Five-Layer Context Hierarchy

![Five-layer context hierarchy](fig1-five-layer-hierarchy.png)

The load-bearing diagram of the whole paper. Two things the prose and the talk both omit, visible only here:

**(a) Each layer carries an explicit token budget** — the hierarchy is a *budget*, not just a taxonomy:

| Layer | File | Token budget | Diagnostic question | Class |
|-------|------|--------------|---------------------|-------|
| **0** | `CLAUDE.md` | ~800 tok | *"Where am I?"* (global identity) | Structural (routing) |
| **1** | `CONTEXT.md` | ~300 tok | *"Where do I go?"* (workspace routing) | Structural (routing) |
| **2** | Stage `CONTEXT.md` | 200–500 tok | *"What do I do?"* (stage contract) | Structural (routing) |
| **3** | Reference material | 500–2k tok | *"What rules apply?"* (the **factory**, stable across runs) | Content |
| **4** | Working artifacts | varies | *"What am I working with?"* (the **product**, per-run) | Content |

**(b) The colour split is the architecture.** Layers 0–2 (blue, *structural / routing*) total only ~1.3–1.6k tokens — they tell the agent **where it is and what role to play**. Layers 3–4 (orange, *content*) carry the actual substance. The factory/product metaphor (L3 = factory/recipe, L4 = product/ingredients) is the paper's mnemonic for *what should change between runs* (only L4) vs. *what stays fixed* (L3). See [concept-icm](#concept-icm).

> **Agent takeaway:** total structural overhead is ~1.5k tokens. Everything else in a well-scoped stage is task content. That is *why* a stage lands at 2–8k tokens instead of 40k.
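
The structural-overhead figure follows from adding the per-layer budgets in the table. A short worked sum, with the numbers transcribed from the table and the names `BUDGETS` and `budget_range` made up for the example:

```python
# Per-layer budgets in tokens as (low, high), transcribed from the table above.
# Layer 4 is omitted because the table lists it as "varies".
BUDGETS = {
    0: (800, 800),    # CLAUDE.md
    1: (300, 300),    # CONTEXT.md
    2: (200, 500),    # stage CONTEXT.md
    3: (500, 2000),   # reference material
}
STRUCTURAL_LAYERS = (0, 1, 2)


def budget_range(layers):
    """Sum the low and high ends of the budgets for the given layers."""
    low = sum(BUDGETS[layer][0] for layer in layers)
    high = sum(BUDGETS[layer][1] for layer in layers)
    return low, high


structural = budget_range(STRUCTURAL_LAYERS)   # (1300, 1600)
with_reference = budget_range(BUDGETS)         # (1800, 3600)
```

Layers 0–2 therefore cost 1.3–1.6k tokens, and adding Layer 3 brings a stage to 1.8–3.6k before any Layer 4 working artifacts are loaded.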

---

## Figure 2 — ICM Workspace Folder Structure (layer-annotated)

![Folder structure of an ICM workspace](fig2-folder-structure.png)

The canonical on-disk layout, every node tagged by layer:

```
workspace/
├── CLAUDE.md                 ← Layer 0  (global identity)
├── CONTEXT.md                ← Layer 1  (workspace routing)
├── stages/
│   ├── 01_research/
│   │   ├── CONTEXT.md        ← Layer 2  (stage contract)
│   │   ├── references/       ← Layer 3  (reference, persists)
│   │   └── output/           ← Layer 4  (working, per-run)
│   ├── 02_script/            … same triad (L2 / L3 / L4)
│   └── 03_production/        … same triad (L2 / L3 / L4)
├── _config/                  ← Layer 3  (shared reference)
├── shared/                   ← Layer 3  (shared reference)
└── setup/
    └── questionnaire.md      ← (setup-time only; unannotated)
```

**Synthesized insight (not stated explicitly in prose):**
- Every stage folder is the *same triad* — `CONTEXT.md` (L2) + `references/` (L3) + `output/` (L4). The repeating triad is what makes "add/remove a stage" a filesystem op (see Table 1).
- `_config/` and `shared/` are **top-level Layer 3** — cross-stage reference that escapes the per-stage `references/`. This is how ICM shares stable material (voice, design system, conventions) without duplicating it into every stage.
- `setup/questionnaire.md` is *un-layered*: it is used once at workspace creation and is not part of any run's context. This is the seam where the **non-coder onboarding** happens (the three non-coders in the study filled in a questionnaire rather than writing code).
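
The layout above implies a simple rule for which files a stage reads. The sketch below groups them by layer for one stage. It is an illustration of the folder convention, not code from the paper: the function name is hypothetical, only `.md` files are collected (an assumption), and Layer 4 is left out because how a stage receives its working input is not specified in this figure.

```python
from pathlib import Path


def stage_context_files(workspace: Path, stage: str) -> dict[int, list[Path]]:
    """Group the files one stage would read by layer number (0-3)."""
    stage_dir = workspace / "stages" / stage
    layers = {
        0: [workspace / "CLAUDE.md"],
        1: [workspace / "CONTEXT.md"],
        2: [stage_dir / "CONTEXT.md"],
        3: [],
    }
    # Layer 3: the stage's own references plus the two shared top-level folders.
    for folder in (stage_dir / "references", workspace / "_config", workspace / "shared"):
        if folder.is_dir():
            layers[3].extend(sorted(folder.rglob("*.md")))
    # Drop anything that does not exist so the caller can read every path safely.
    return {layer: [p for p in paths if p.is_file()] for layer, paths in layers.items()}
```

Note what is never collected: other stages' `references/` folders and `setup/questionnaire.md`. That exclusion is the scoping the figure describes.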

---

## Figure 3 — Context Window Composition by Stage (the efficiency claim, visualized)

![Context window composition by stage](fig3-token-composition.png)

Representative token counts from the paper's *script-to-animation* workspace. Stacked by source: **blue** = Layers 0–2 (structural), **orange** = Layer 3 (reference), **tan** = Layer 4 (working), **grey** = unused / irrelevant context.

| Stage | Total tokens | Composition |
|-------|-------------|-------------|
| Research | **~4.9k** | almost entirely useful (structural + reference + working) |
| Script | **~5.5k** | almost entirely useful |
| Production | **~5.6k** | almost entirely useful |
| **Monolithic** | **~42k** | **mostly grey — the irrelevant band dwarfs the useful content** |

**The visual is the argument.** In the three ICM bars there is almost no grey: nearly every token in context is relevant to the current stage. In the monolithic bar, the grey *"unused/irrelevant"* band is larger than the entire useful payload of any single stage — the agent is carrying all three stages' instructions, all reference material, and all prior outputs simultaneously, ~80%+ of it irrelevant to whatever it is doing right now. This is the concrete mechanism behind [claim-icm-superiority](#claim-icm-superiority) and Liu et al.'s *"lost in the middle"*: ICM doesn't just use fewer tokens, it keeps the **relevant** tokens out of the degraded middle band.

> **Caveat (per the paper):** these are *representative* counts from one workspace, not a measured benchmark across many. The shape is illustrative.
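
The "~80%+ irrelevant" reading can be checked against the table with simple arithmetic. The counts below are the representative values transcribed from the table; the bound assumes, generously, that the largest staged context is entirely relevant to its stage.

```python
# Representative counts in thousands of tokens, transcribed from the table above.
STAGE_TOKENS_K = {"Research": 4.9, "Script": 5.5, "Production": 5.6}
MONOLITHIC_K = 42.0

for stage, tokens_k in STAGE_TOKENS_K.items():
    share = tokens_k / MONOLITHIC_K
    print(f"{stage}: {share:.0%} of the monolithic context, {1 - share:.0%} smaller")

# Upper bound on how much of the monolithic context one stage could actually use,
# assuming the largest staged context contains no irrelevant tokens at all.
max_relevant_share = max(STAGE_TOKENS_K.values()) / MONOLITHIC_K
min_irrelevant_share = 1 - max_relevant_share
```

Because these are representative counts from a single workspace, the ratios inherit the same caveat as the figure: they illustrate the shape, they are not a benchmark.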

---

## Figure 4 — Pipeline Flow with Human Review Gates

![Pipeline flow through three stages with review gates](fig4-pipeline-review-gates.png)

`Stage 1 (Research) → [Review gate / Human] → Stage 2 (Script) → [Review gate / Human] → Stage 3 (Production)`. Each stage receives its own context (Layers 0–4) and writes to its `output/` folder. A **human review gate** (red diamond) sits on every stage boundary, so the output can be edited before the next stage reads it.

**The single most important sentence in the figure:** *"The same model executes every stage; the folder structure controls what context it receives."* This is the thesis in one line — there is **no second agent, no router model, no orchestration code**. The *only* thing that differs between stages is which files the one agent reads. The "multi-agent" behaviour is an illusion produced entirely by folder scoping + human gates. Connects to [concept-dialogue-structure](#concept-dialogue-structure) (each stage contract is a persisted decision tree) and the talk's [contrarian-frameworks](#contrarian-frameworks) stance.
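
A minimal sketch of that flow makes the claim concrete: one model callable, a loop over stage folders, and a blocking review step between stages. Everything named here is an assumption for illustration (`run_pipeline`, the `Model` and `ReviewGate` types, the `result.md` file name, and the way the previous output is appended to the prompt); the paper's own tooling is not shown in this vault.

```python
from pathlib import Path
from typing import Callable

Model = Callable[[str], str]        # prompt in, text out (stand-in for one LLM call)
ReviewGate = Callable[[Path], None]  # returns once a human has reviewed the file


def run_pipeline(workspace: Path, stages: list[str], model: Model, review: ReviewGate) -> str:
    """Run stages in order with one model; only the files it reads change per stage."""
    previous_output = ""
    for index, stage in enumerate(stages):
        stage_dir = workspace / "stages" / stage
        contract = (stage_dir / "CONTEXT.md").read_text(encoding="utf-8")
        prompt = f"{contract}\n\n## Input\n\n{previous_output}"

        out_dir = stage_dir / "output"
        out_dir.mkdir(parents=True, exist_ok=True)
        out_file = out_dir / "result.md"  # hypothetical file name
        out_file.write_text(model(prompt), encoding="utf-8")

        if index < len(stages) - 1:
            # Review gate on the stage boundary: the human may edit the file in place.
            review(out_file)
        # Re-read from disk so any human edits are what the next stage receives.
        previous_output = out_file.read_text(encoding="utf-8")
    return previous_output
```

The same `model` is called at every stage. The only per-stage difference is which `CONTEXT.md` is read and what the previous `output/` contained after review, which is the point the figure makes.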

---

## Figure 5 — U-Shaped Human Intervention (N=33 practitioners)

![U-shaped frequency of human edits per stage](fig5-ushaped-edits.png)

Y-axis is ordinal (Never → Rarely → Sometimes → Often → Almost always). Self-reported edit frequency at each stage boundary:

| Stage boundary | Edit frequency | Ordinal band | Character of the edit |
|----------------|----------------|--------------|------------------------|
| **Stage 1 output (Research)** | **92%** | Almost always | **Creative judgment** — direction-setting |
| **Stage 2 output (Script)** | **30%** | Rarely | constrained execution, little to fix |
| **Stage 3 output (Production)** | **78%** | Often | **closer to debugging** — aligning output with earlier decisions |

The **U-shape** is the headline empirical pattern: humans intervene heavily where they set direction (stage 1) and where they reconcile final output against intent (stage 3), but largely leave the constrained middle alone. The paper is careful: *"Values are approximate and based on practitioner self-report through conversation, not instrumented measurement"* (N=33, invite-only community). Use as a **directional** finding, not a metric. Extends [question-icm-scaling](#question-icm-scaling)'s methodology caveats.

---

## Table 1 — Control-Surface Comparison: Framework vs. ICM

The paper's most honest exhibit: the first six rows favour ICM and the **last four rows favour frameworks** (the "what ICM gives up" section). Reproduced verbatim:

| Dimension | Framework approach | ICM approach |
|-----------|--------------------|--------------|
| Change stage order | Edit orchestration code, redeploy | Rename or reorder folders |
| Modify a prompt | Edit agent configuration in code | Edit a markdown file |
| Add or remove a stage | Write new agent class, update orchestrator | Add or delete a folder |
| Inspect intermediate state | Add logging, build dashboard | Open the folder, read the files |
| Hand off to another person | Document environment, dependencies, setup | Copy the folder |
| Who can make changes | Developer | Anyone with a text editor |
| **Error recovery mid-pipeline** | Built-in retry, fallback, exception handling | Manual re-run of failed stage |
| **Conditional branching** | Programmatic routing based on agent output | Human decides between stages |
| **Concurrent execution** | Native parallel agent coordination | Sequential by design |
| **External service integration** | Programmatic API calls, auth management | Local scripts or MCP connections |

**Synthesis:** rows 1–6 are ICM's pitch (everything is a filesystem/text-editor op, *anyone* can change it, handoff = copy the folder). Rows 7–10 are the **concession lines**: frameworks win on automated error recovery, programmatic branching, true concurrency, and managed integrations. This table marks the precise boundary where the talk's "absurdities" framing of multi-agent frameworks (see [quote-absurdities](#quote-absurdities)) over-reaches: ICM trades those four capabilities away **on purpose**, in exchange for interpretability and zero orchestration code. It is the right trade *only* for sequential, human-reviewed workflows. The table directly grounds the counter-perspective in [claim-icm-superiority](#claim-icm-superiority) and [contrarian-frameworks](#contrarian-frameworks).
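
The "rename or reorder folders" and "add or delete a folder" rows work only if stage order is read from the filesystem instead of being hard-coded. A minimal sketch of that idea, assuming the numeric-prefix naming seen in Figure 2 (`01_research`, `02_script`, ...); the function `stage_order` is hypothetical.

```python
from pathlib import Path


def stage_order(workspace: Path) -> list[str]:
    """Return stage folder names in execution order, taken from their numeric prefix."""
    stages_dir = workspace / "stages"
    names = [p.name for p in stages_dir.iterdir() if p.is_dir()]

    def key(name: str):
        # "01_research" -> (0, 1, "01_research"); folders without a numeric prefix sort last.
        prefix = name.split("_", 1)[0]
        return (0, int(prefix), name) if prefix.isdecimal() else (1, 0, name)

    return sorted(names, key=key)
```

With this convention, renaming `02_script` to `04_script` moves it after `03_production` on the next run, and deleting a folder removes the stage, with no code change.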

---

## Table 2 — Layer 3 (Reference) vs. Layer 4 (Working)

| | Layer 3: Reference | Layer 4: Working |
|---|---|---|
| Changes between runs | **No** | **Yes** |
| Example files | `voice.md`, `design-system.md`, `conventions.md` | `research-output.md`, `script-draft.md` |
| Model should | **Internalize as constraints** | **Process as input** |
| Configured during | Workspace setup (once) | Pipeline execution (each run) |
| Folder location | `references/`, `_config/`, `shared/` | `output/` |
| Analogy | **The recipe** | **The ingredients** |

**Why this matters for an agent consuming this vault:** the L3/L4 distinction tells the agent *how to treat each file it reads*. L3 content (`voice.md`, conventions) is a **constraint to obey**; L4 content (prior `output/`) is **material to transform**. Misclassifying the two is the core failure mode the layering prevents — e.g., treating a style guide as editable working text, or treating a prior draft as an immutable rule. The recipe/ingredients metaphor is the paper's compression of this rule.
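
One way to apply the distinction mechanically is to classify each file by the folder it lives in and label it accordingly before it enters the prompt. The folder names come from the table; the function names and the heading wording are assumptions made for this sketch, not the paper's format.

```python
from pathlib import PurePosixPath

REFERENCE_FOLDERS = {"references", "_config", "shared"}  # Layer 3 per the table
WORKING_FOLDERS = {"output"}                              # Layer 4 per the table


def layer_of(relative_path: str):
    """Return 3, 4, or None based on the folders in a workspace-relative path."""
    folders = set(PurePosixPath(relative_path).parts[:-1])  # folders only, not the file name
    if folders & WORKING_FOLDERS:
        return 4
    if folders & REFERENCE_FOLDERS:
        return 3
    return None


def frame(relative_path: str, text: str) -> str:
    """Wrap file text in a heading that tells the model how to treat it."""
    layer = layer_of(relative_path)
    if layer == 3:
        return f"## Constraint (do not edit): {relative_path}\n\n{text}"
    if layer == 4:
        return f"## Input to transform: {relative_path}\n\n{text}"
    return f"## {relative_path}\n\n{text}"
```

For example, `layer_of("_config/voice.md")` returns `3` and `layer_of("stages/01_research/output/research-output.md")` returns `4`, so a style guide and a prior draft are never presented to the model with the same framing.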

---

## Cross-References

- Paper entity: [entity-icm-paper-arxiv](#entity-icm-paper-arxiv)
- Core concept: [concept-icm](#concept-icm) · Stage contracts as persisted dialogue: [concept-dialogue-structure](#concept-dialogue-structure)
- Efficiency claim these figures ground: [claim-icm-superiority](#claim-icm-superiority)
- Limitations these figures inherit: [question-icm-scaling](#question-icm-scaling)
- Authors: [entity-jake-van-clief](#entity-jake-van-clief) · [entity-david-mcdermott](#entity-david-mcdermott)


---