---
id: "open-question-agent-monitoring"
type: "open-question"
source_timestamps: ["00:05:58", "00:06:10"]
tags: ["observability", "ai-safety"]
related: ["concept-long-running-agents", "action-prepare-agent-monitoring"]
resolutionPath: "Development of new AI observability and telemetry tools specifically designed for agentic work-in-progress."
sources: ["s35-compounding-gap"]
sourceVaultSlug: "s35-compounding-gap"
originDay: 35
---
# How do we monitor long-running agents effectively?

### The problem
If an agent is tasked with a **week-long job**, how do humans monitor its progress to ensure it hasn't gone off the rails by day three — **without having to manually review all its intermediate steps**?

### Why this is hard
- The volume of intermediate steps in a multi-day run is too large for manual review
- Drift from the original specification can compound silently
- Today's logging and tracing tools were designed for short-lived processes, not week-long autonomous workflows

### Why it matters
This is a structural blocker for the [[concept-long-running-agents]] prediction. Without good monitoring, organizations will either avoid long-running agents (forfeiting the productivity gains) or deploy them blindly (incurring catastrophic failures).

### Resolution path
Development of new **AI observability and telemetry tools** specifically designed for agentic work-in-progress. Early prototypes exist as plugins for CrewAI, AutoGen, and LangGraph.
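One shape such tooling might take is checkpoint-level telemetry: instead of a human reviewing every intermediate step, the agent periodically emits a short self-summary plus a score for how well recent work aligns with the original spec, and a monitor flags both threshold breaches and sustained decline (the compounding-drift case). A minimal sketch, assuming a hypothetical `spec_alignment` score in [0, 1] produced by some external scorer; the class names and thresholds here are illustrative, not taken from CrewAI, AutoGen, or LangGraph:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Checkpoint:
    """A periodic summary the agent emits instead of raw step logs."""
    step: int
    summary: str            # agent's own one-line description of recent work
    spec_alignment: float   # 0.0-1.0 score vs. the original task spec (hypothetical scorer)
    timestamp: float = field(default_factory=time.time)

class AgentMonitor:
    """Watches checkpoint telemetry for silent drift across a long run.

    Alerts when alignment drops below a hard threshold, or when it
    declines across several consecutive checkpoints (compounding drift
    that no single checkpoint would flag on its own).
    """
    def __init__(self, threshold: float = 0.7, trend_window: int = 3):
        self.threshold = threshold
        self.trend_window = trend_window
        self.history: list[Checkpoint] = []

    def record(self, cp: Checkpoint) -> list[str]:
        """Store a checkpoint and return any alerts it triggers."""
        self.history.append(cp)
        alerts = []
        if cp.spec_alignment < self.threshold:
            alerts.append(
                f"step {cp.step}: alignment {cp.spec_alignment:.2f} below threshold"
            )
        recent = self.history[-self.trend_window:]
        if len(recent) == self.trend_window and all(
            a.spec_alignment > b.spec_alignment for a, b in zip(recent, recent[1:])
        ):
            alerts.append(
                f"step {cp.step}: alignment declining for {self.trend_window} checkpoints"
            )
        return alerts

# Usage: a run that starts on-spec, then drifts into a side quest.
monitor = AgentMonitor()
run = [
    ("drafted data pipeline per spec", 0.95),
    ("refactored pipeline, added caching", 0.85),
    ("exploring unrelated optimization", 0.72),
    ("deep in side quest", 0.55),
]
for step, (summary, score) in enumerate(run):
    for alert in monitor.record(Checkpoint(step, summary, score)):
        print(alert)
```

The design choice worth noting is the trend check: it catches drift that compounds silently, where each individual checkpoint still looks acceptable but the trajectory does not.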

### Recommended action
See [[action-prepare-agent-monitoring]].
