---
id: "concept-shiftable-vs-latency-sensitive"
type: "concept"
source_timestamps: ["§ The Incumbent's Energy Playbook", "¶11", "¶12"]
tags: ["cloud-architecture", "workload-management"]
related: ["action-redesign-compute-location", "question-latency-vs-shiftable-threshold"]
definition: "The distinction between AI tasks that require instant, localized processing and those whose timing and physical processing location can be flexibly moved to optimize energy costs."
sources: ["futures"]
sourceVaultSlug: "hbr-seg-futures"
originDay: 2
articleStem: "hbr-nm-101-energy-strategy-ai"
sourceUrl: "https://hbr.org/2026/06/your-company-needs-an-energy-strategy-for-ais-next-phase"
sourceTitle: "Your Company Needs an Energy Strategy for AI’s Next Phase"
---
# Shiftable vs. Latency-Sensitive Workloads

## Definition
The distinction between AI tasks that require instant, localized processing and those whose timing and physical processing location can be flexibly moved to optimize energy costs.

## The Two Categories
- **Latency-sensitive workloads** (e.g., real-time customer-service interactions) must respond immediately and often need to run geographically close to the user.
- **Shiftable workloads** (e.g., compliance searches, nonurgent batch inference, model training, and analytics) have flexible timing and location requirements.

Identifying and separating the two categories lets companies route shiftable workloads to lower-cost, cooler, or lower-carbon data-center regions, reducing energy costs and grid exposure. This is the technical basis for [[action-redesign-compute-location]].
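
The routing idea can be illustrated with a minimal sketch. It is not taken from the source: the `Region` and `Workload` types, the region names, and the relative `energy_cost` figures are all hypothetical, and a real placement decision would also weigh data residency, capacity, and network cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    energy_cost: float  # hypothetical relative energy cost (lower is cheaper)


@dataclass(frozen=True)
class Workload:
    name: str
    latency_sensitive: bool
    user_region: str  # region closest to the workload's users


def route(workload: Workload, regions: list[Region]) -> str:
    """Return the name of the region where the workload should run."""
    if workload.latency_sensitive:
        # Latency-sensitive work stays close to the user.
        return workload.user_region
    # Shiftable work goes wherever energy is cheapest.
    return min(regions, key=lambda r: r.energy_cost).name


# Illustrative values only; these are assumptions, not data from the source.
REGIONS = [
    Region("us-east", 1.0),
    Region("nordics", 0.6),
    Region("ap-south", 1.3),
]

chat = Workload("support-chat", latency_sensitive=True, user_region="us-east")
training = Workload("model-training", latency_sensitive=False, user_region="us-east")

print(route(chat, REGIONS))
print(route(training, REGIONS))
```

In this sketch the single boolean `latency_sensitive` stands in for whatever classification rule a company adopts; the source does not specify one.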

## Open Problem
The source does not define the specific latency thresholds (in milliseconds) that determine when a workload truly must stay near users — see [[question-latency-vs-shiftable-threshold]].

## Enrichment (external validation)
The distinction is standard cloud-architecture practice. Brookings reports that hyperscalers already schedule non-urgent, energy-intensive tasks (training, background processing) for periods when renewable energy is abundant or the grid is underutilized, while handling real-time services differently.
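
Shifting in time, rather than in location, can be sketched in a similar way. The example below is an assumption-laden illustration, not a description of how any hyperscaler actually schedules work: `pick_start_hour` is a hypothetical helper, and the hourly carbon-intensity forecast is made up.

```python
def pick_start_hour(forecast: list[float], duration_hours: int, deadline_hour: int) -> int:
    """Pick the start hour that minimizes total grid carbon intensity.

    forecast: hypothetical hourly carbon-intensity values, index 0 = now.
    duration_hours: how long the shiftable job runs.
    deadline_hour: the job must finish by this hour index (exclusive).
    """
    latest_start = min(deadline_hour, len(forecast)) - duration_hours
    if duration_hours <= 0 or latest_start < 0:
        raise ValueError("job does not fit before the deadline")
    # Compare every feasible window and keep the cleanest one.
    return min(
        range(latest_start + 1),
        key=lambda start: sum(forecast[start:start + duration_hours]),
    )


# Illustrative forecast only (arbitrary units).
forecast = [500, 450, 200, 180, 300, 520]

print(pick_start_hour(forecast, duration_hours=2, deadline_hour=6))
```

A latency-sensitive request has no such window to search: it runs when it arrives, which is why only the shiftable category benefits from this kind of scheduling.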
