---
id: "action-reduce-demand"
type: "action-item"
source_timestamps: ["§ The Incumbent's Energy Playbook", "¶12"]
tags: ["optimization", "model-routing", "efficiency"]
related: ["concept-intelligence-per-watt", "entity-pinterest", "entity-pixie", "prereq-llm-operations"]
speakers: ["Yinuo Tang", "Eric Yanfei Zhao"]
action: "Route simple tasks to smaller models, cache queries, compress prompts, and batch nonurgent inference."
outcome: "Reduces unnecessary energy consumption and lowers AI operating costs without requiring physical infrastructure control."
sources: ["futures"]
sourceVaultSlug: "hbr-seg-futures"
originDay: 2
articleStem: "hbr-nm-101-energy-strategy-ai"
sourceUrl: "https://hbr.org/2026/06/your-company-needs-an-energy-strategy-for-ais-next-phase"
sourceTitle: "Your Company Needs an Energy Strategy for AI’s Next Phase"
---
# Optimize AI workloads to reduce energy demand

## Action
Route simple tasks to smaller models, cache queries, compress prompts, and batch nonurgent inference.

## Detail
Require AI engineering teams to:
- **Route simple tasks** (like customer-service summaries) to smaller models rather than frontier models.
- **Cache** repeated queries.
- **Compress prompts**.
- **Quantize models** where appropriate.
- **Batch nonurgent inference** tasks.
- **Shift flexible workloads** to lower-cost times or regions.
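
As a rough illustration of how the first few items could fit together in code, here is a minimal sketch, not an implementation from the source article. Everything named in it is hypothetical: the model names, the `SIMPLE_TASKS` set, the character threshold, the `InferenceGateway` class, and the `fake_backend` stand-in for a real model client. "Compression" is reduced to whitespace collapsing, and "batching" to deferring nonurgent requests until `flush()` is called; a production system would likely use a real prompt-compression step and a batched inference endpoint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical model names and routing rule; substitute your own.
SMALL_MODEL = "small-model"
FRONTIER_MODEL = "frontier-model"
SIMPLE_TASKS = {"summarize", "classify", "extract"}


def compress_prompt(prompt: str) -> str:
    """Crude stand-in for prompt compression: collapse runs of whitespace."""
    return " ".join(prompt.split())


def pick_model(task_type: str, prompt: str, max_simple_chars: int = 4000) -> str:
    """Route short, simple tasks to the small model; everything else to the frontier model."""
    if task_type in SIMPLE_TASKS and len(prompt) <= max_simple_chars:
        return SMALL_MODEL
    return FRONTIER_MODEL


@dataclass
class InferenceGateway:
    """Wraps a model client with routing, caching, and a queue for nonurgent work."""
    call_model: Callable[[str, str], str]  # (model, prompt) -> response
    cache: Dict[Tuple[str, str], str] = field(default_factory=dict)
    pending: List[Tuple[str, str]] = field(default_factory=list)
    model_calls: int = 0  # counts real backend calls, i.e. cache misses

    def ask(self, task_type: str, prompt: str) -> str:
        """Answer immediately, reusing a cached response when one exists."""
        prompt = compress_prompt(prompt)
        key = (pick_model(task_type, prompt), prompt)
        if key not in self.cache:
            self.cache[key] = self.call_model(*key)
            self.model_calls += 1
        return self.cache[key]

    def enqueue(self, task_type: str, prompt: str) -> None:
        """Defer a nonurgent request until the next flush."""
        self.pending.append((task_type, prompt))

    def flush(self) -> List[str]:
        """Process all deferred requests together, e.g. from an off-peak scheduled job."""
        results = [self.ask(task_type, prompt) for task_type, prompt in self.pending]
        self.pending.clear()
        return results


def fake_backend(model: str, prompt: str) -> str:
    """Placeholder for a real model client; echoes which model was chosen."""
    return f"[{model}] {len(prompt)} chars"


gateway = InferenceGateway(call_model=fake_backend)
first = gateway.ask("summarize", "Customer   cannot\nlog in.")
second = gateway.ask("summarize", "Customer cannot log in.")  # served from cache
```

Because the cache key is built after compression and routing, the two `ask` calls above resolve to the same entry and only one backend call is made. Running `flush()` from a scheduled job is one simple way to move nonurgent work to lower-cost times.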

This action requires the technical fluency described in [[prereq-llm-operations]]. The cited case study of "reduce demand before buying supply" is [[entity-pinterest]] and its real-time recommendation system, [[entity-pixie]].

## Outcome
Reduces unnecessary energy consumption and lowers AI operating costs **without requiring physical infrastructure control**. This is Step 2 of [[framework-incumbent-energy-playbook]] and the cheapest lever on [[concept-intelligence-per-watt]].
