---
id: "framework-safety-pillars"
type: "framework"
source_timestamps: ["00:19:42", "00:19:54"]
tags: ["ai-safety", "deployment-strategy"]
related: ["concept-silent-degradation", "concept-metric-gaming", "prereq-version-control-revert", "framework-karpathy-loop-execution"]
steps: ["Tight Loops: Constrain the agent's search space to a single file and a fixed time budget to prevent sprawling, unpredictable changes.", "Clear Baselines: Establish robust, multi-dimensional evaluation harnesses that test for both the primary metric and secondary regressions.", "Version Control: Maintain strict versioning of all edits to ensure the ability to instantly revert any change that causes downstream issues.", "Human Oversight: Require human inspection of the reasoning traces and final results before promoting autonomous optimizations to production."]
sources: ["s04-karpathy-agent-700"]
sourceVaultSlug: "s04-karpathy-agent-700"
originDay: 4
---
# Four Pillars of Reliable Automation

## Purpose
A mitigation framework designed to prevent auto-optimizing agents from causing [[concept-silent-degradation|silent degradation]], [[concept-metric-gaming|metric gaming]], or catastrophic failures in production business systems.

## The Four Pillars

### 1. Tight Loops
Constrain the agent's search space to a **single file** and a **fixed time budget** to prevent sprawling, unpredictable changes. This realizes the [[concept-karpathy-triplet|Karpathy Triplet]] in production.
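
One way to picture this constraint is a loop that accepts edits only for one allowed file and stops when a wall-clock budget runs out. The following is a minimal sketch, not the source's implementation: `ALLOWED_FILE`, `run_tight_loop`, and the `propose_edit` callable standing in for the agent are all hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical file name for illustration; not from the source material.
ALLOWED_FILE = "prompt.md"

Edit = Tuple[str, str]  # (target file, proposed new content)


def run_tight_loop(
    propose_edit: Callable[[int], Edit],
    budget_seconds: float,
    max_iterations: int = 100,
) -> Dict[str, list]:
    """Collect proposed edits, rejecting any that target a file other than
    ALLOWED_FILE, and stop once the time budget is spent."""
    deadline = time.monotonic() + budget_seconds
    accepted: List[str] = []
    rejected: List[Edit] = []
    for i in range(max_iterations):
        # The budget is checked between proposals, not during one.
        if time.monotonic() >= deadline:
            break
        target, content = propose_edit(i)
        if target != ALLOWED_FILE:
            rejected.append((target, content))
            continue
        accepted.append(content)
    return {"accepted": accepted, "rejected": rejected}
```

Because the deadline is only checked between iterations, a single slow proposal can still overrun the budget; a stricter version would also need a per-call timeout.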

### 2. Clear Baselines
Establish robust, **multi-dimensional evaluation harnesses** that test for both:
- The **primary metric**
- **Secondary regressions** (safety, formatting, edge cases, brand voice)

Without multi-dimensional baselines, an agent can raise the primary metric while an unmeasured dimension regresses, and the resulting [[concept-silent-degradation|silent degradation]] goes undetected.
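
The two-part check above can be sketched as a gate that compares a candidate's scores against the baseline on every dimension. This is an illustrative sketch only: the function name `check_candidate`, the metric names, and the `tolerance` parameter are assumptions, and it assumes higher is better for every metric.

```python
from typing import Dict, List


def check_candidate(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    primary: str,
    tolerance: float = 0.0,
) -> Dict[str, object]:
    """Accept a candidate only if the primary metric improves and no
    secondary metric drops by more than `tolerance`.

    Assumes higher is better for every metric. A metric missing from the
    candidate is treated as a regression.
    """
    missing = float("-inf")
    regressions: List[str] = [
        name
        for name, base in baseline.items()
        if name != primary and candidate.get(name, missing) < base - tolerance
    ]
    improved = candidate.get(primary, missing) > baseline[primary]
    return {
        "accept": improved and not regressions,
        "improved": improved,
        "regressions": regressions,
    }
```

With a baseline of `{"accuracy": 0.80, "safety": 0.99}`, a candidate scoring `{"accuracy": 0.90, "safety": 0.90}` improves the primary metric but is rejected because `safety` regressed, which is the case a single-metric harness would miss.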

### 3. Version Control
Maintain **strict versioning** of all edits to ensure the ability to **instantly revert** any change that causes downstream issues. Realized as [[prereq-version-control-revert]].
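
To make the revert guarantee concrete, here is a small in-memory sketch. In practice a version control system such as Git would play this role; the `EditHistory` class and its methods are hypothetical and exist only to show the shape of "every edit is recorded, any edit can be undone".

```python
from typing import List


class EditHistory:
    """Append-only history of one file's contents, with revert.

    Hypothetical illustration; a real deployment would likely rely on an
    actual version control system instead.
    """

    def __init__(self, initial: str) -> None:
        self._versions: List[str] = [initial]

    @property
    def current(self) -> str:
        return self._versions[-1]

    def commit(self, content: str) -> int:
        """Record a new version and return its index."""
        self._versions.append(content)
        return len(self._versions) - 1

    def revert_to(self, index: int) -> str:
        """Restore an earlier version by re-committing it, so the revert
        itself remains visible in the history."""
        if not 0 <= index < len(self._versions):
            raise IndexError(f"no version {index}")
        self._versions.append(self._versions[index])
        return self.current
```

Reverting by appending rather than truncating keeps the bad edit in the record, which helps later review of what went wrong.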

### 4. Human Oversight
Require **human inspection** of the reasoning traces and final results before promoting autonomous optimizations to production. This is where [[claim-human-role-shift|the human role concentrates upward]]: review and gating, not execution.
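
A promotion gate of this kind might look like the sketch below, where a change cannot reach production until a named reviewer has signed off on both the reasoning trace and the results. The `Candidate` dataclass and the `approve` and `promote` functions are hypothetical names chosen for illustration, not part of any specific tool.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Candidate:
    """An agent-produced change awaiting review (hypothetical structure)."""
    change_id: str
    reasoning_trace: str
    results: Dict[str, float]
    approved_by: Optional[str] = None


def approve(candidate: Candidate, reviewer: str) -> None:
    """Record a human sign-off. Refuses if there is nothing to inspect."""
    if not candidate.reasoning_trace or not candidate.results:
        raise ValueError("cannot approve without a reasoning trace and results")
    candidate.approved_by = reviewer


def promote(candidate: Candidate) -> str:
    """Promote to production only after a human has approved the change."""
    if candidate.approved_by is None:
        raise PermissionError(f"{candidate.change_id} has no human approval")
    return f"promoted {candidate.change_id} (approved by {candidate.approved_by})"
```

The gate deliberately lives in `promote` rather than in the agent loop, so the agent can keep generating candidates while promotion stays a human decision.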

## Pairing
This framework wraps the [[framework-karpathy-loop-execution|Karpathy Loop Execution Cycle]]. The execution cycle generates change; the safety pillars contain it.

## Anchoring Metaphor
> [[quote-ferrari-ditch|"Speed without infrastructure is running your Ferrari into a ditch."]]


## Related across days
- [[concept-karpathy-loop]]
- [[prereq-evaluation-infrastructure]]
- [[prereq-version-control-revert]]
- [[concept-silent-degradation]]
- [[arc-evaluation-frontier]]
