---
id: "framework-karpathy-loop-execution"
type: "framework"
source_timestamps: ["00:02:12", "00:02:22", "00:10:51", "00:11:02"]
tags: ["optimization", "execution-cycle"]
related: ["concept-karpathy-loop", "concept-karpathy-triplet", "framework-safety-pillars"]
steps: ["Analyze the current state/configuration of the target file or harness.", "Propose a scoped edit or mutation to the file based on previous traces or directives.", "Run deterministic test cases or a time-boxed experiment (e.g., 5 minutes) in a sandbox environment.", "Evaluate the results of the experiment against a single, predefined objective metric.", "Commit the change if the metric improves, or revert the change if the metric degrades or fails."]
sources: ["s04-karpathy-agent-700"]
sourceVaultSlug: "s04-karpathy-agent-700"
originDay: 4
---
# The Karpathy Loop Execution Cycle

## Purpose
This framework describes the step-by-step process by which an autonomous agent iteratively improves a system. It is based on [[entity-andrej-karpathy-d4|Andrej Karpathy]]'s auto-research script and adapted for broader [[concept-harness-engineering|harness engineering]].

## The Five Steps

1. **Analyze** the current state/configuration of the target file or harness.
2. **Propose** a scoped edit or mutation to the file based on previous [[concept-trace-driven-optimization|traces]] or directives.
3. **Run** deterministic test cases or a time-boxed experiment (e.g., **5 minutes**) in a **sandbox environment**.
4. **Evaluate** the results of the experiment against a single, predefined objective metric.
5. **Commit** the change if the metric improves, or **revert** the change if the metric degrades or fails.
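
The five steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the original script: the editable surface is modelled as an in-memory dict, "commit" and "revert" are modelled as keeping or discarding a candidate copy rather than real version-control operations, a higher metric is assumed to be better, and the time budget is only checked after the run instead of being enforced by a sandbox. The names `karpathy_loop`, `nudge_lr`, and `toy_metric` are hypothetical.

```python
import random
import time
from typing import Callable, Optional

Surface = dict[str, float]  # stand-in for the one editable file or config


def karpathy_loop(
    surface: Surface,
    propose: Callable[[Surface, random.Random], Surface],
    run_experiment: Callable[[Surface], Optional[float]],
    iterations: int,
    time_budget_s: float,
    seed: int = 0,
) -> tuple[Surface, float, list[bool]]:
    """Run the five-step cycle; return the committed surface, its score, and decisions."""
    rng = random.Random(seed)
    committed = dict(surface)
    best_score = run_experiment(committed)  # establish the baseline first
    if best_score is None:
        raise ValueError("baseline experiment failed; nothing to compare against")
    decisions: list[bool] = []
    for _ in range(iterations):
        snapshot = dict(committed)              # 1. Analyze the current state
        candidate = propose(snapshot, rng)      # 2. Propose a scoped edit
        start = time.monotonic()
        score = run_experiment(candidate)       # 3. Run the experiment
        if time.monotonic() - start > time_budget_s:
            score = None                        # over budget counts as a failure
        improved = score is not None and score > best_score  # 4. Evaluate
        if improved:                            # 5. Commit on improvement...
            committed, best_score = candidate, score
        decisions.append(improved)              # ...otherwise `committed` is left as-is (revert)
    return committed, best_score, decisions


def nudge_lr(surface: Surface, rng: random.Random) -> Surface:
    """Hypothetical mutation: perturb a single value on the surface."""
    edited = dict(surface)
    edited["lr"] += rng.uniform(-0.1, 0.1)
    return edited


def toy_metric(surface: Surface) -> Optional[float]:
    """Hypothetical metric that peaks at lr = 0.1; None signals a failed run."""
    lr = surface["lr"]
    if lr <= 0:
        return None
    return -((lr - 0.1) ** 2)


best, score, decisions = karpathy_loop(
    {"lr": 0.5}, nudge_lr, toy_metric, iterations=50, time_budget_s=1.0
)
print(f"committed {sum(decisions)} of {len(decisions)} proposals")
```

Because a candidate is committed only when it strictly beats the best score so far, the committed state can never score below the baseline. In a real harness, step 5 would map to version-control operations on the target file.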

## Inputs
Requires the [[concept-karpathy-triplet|Karpathy Triplet]]:
- One editable surface
- One metric
- One time budget
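
One way to make the triplet explicit is a small, immutable configuration object that is validated before the loop starts. This is only a sketch: the class name `KarpathyTriplet`, its field names, and the example path and metric name are all hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KarpathyTriplet:
    """The three inputs the cycle needs (field names are illustrative assumptions)."""

    editable_surface: str  # the single file or harness the agent may edit
    metric: str            # name of the single objective metric
    time_budget_s: float   # wall-clock budget per experiment, in seconds

    def __post_init__(self) -> None:
        # Reject incomplete triplets up front rather than mid-loop.
        if not self.editable_surface:
            raise ValueError("exactly one editable surface is required")
        if not self.metric:
            raise ValueError("exactly one metric is required")
        if self.time_budget_s <= 0:
            raise ValueError("time budget must be positive")


# Hypothetical values: a 5-minute budget matches the example in step 3.
triplet = KarpathyTriplet("harness/config.yaml", "pass_rate", 300.0)
print(triplet)
```

Freezing the dataclass keeps the agent from changing the metric or the budget mid-run, which would make scores from different iterations incomparable.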

## Architectural Context
In the [[concept-meta-task-agent-split|Meta/Task split]], the Meta-Agent runs this cycle on the Task Agent's harness. Steps 1-2 are reasoning over traces; steps 3-5 are deterministic evaluation and version control.

## Safety Pairing
The execution cycle must be wrapped in the [[framework-safety-pillars|Four Pillars of Reliable Automation]]: tight loops, clear baselines, version control, and human oversight.

## Throughput Example
[[entity-product-skypilot|SkyPilot]] demo: an agent ran this cycle **910 times in 8 hours**, with [[claim-emergent-meta-behaviors|emergent optimizations]] like spontaneously switching to faster GPUs for validation.
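
For a sense of scale, the two quoted figures can be turned into an average rate. This is back-of-the-envelope arithmetic using only the numbers above; it says nothing about how long any individual cycle took or whether cycles ran in parallel, which the source does not state. The helper name `average_throughput` is hypothetical.

```python
def average_throughput(cycles: int, hours: float) -> tuple[float, float]:
    """Return (cycles per hour, wall-clock seconds per cycle), both averages."""
    cycles_per_hour = cycles / hours
    seconds_per_cycle = hours * 3600 / cycles
    return cycles_per_hour, seconds_per_cycle


per_hour, per_cycle = average_throughput(910, 8)
print(f"{per_hour:.2f} cycles/hour, {per_cycle:.1f} s/cycle on average")
# prints "113.75 cycles/hour, 31.6 s/cycle on average"
```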


## Related across days
- [[concept-karpathy-loop]]
- [[concept-karpathy-triplet]]
- [[framework-safety-pillars]]
- [[framework-2026-builder-practices]]
