---
id: "contrarian-bigger-data-better"
type: "contrarian-insight"
source_timestamps: ["§ Applying Gen AI to Proprietary Data"]
tags: ["big-data", "machine-learning"]
related: ["concept-data-saturation-point"]
challenges: "The 'bigger is always better' dogma regarding data collection for machine learning and AI training."
sources: ["spine"]
sourceVaultSlug: "hbr-seg-spine"
originDay: 1
articleStem: "hbr-cl-96-ai-no-sustainable-advantage"
sourceUrl: "https://hbr.org/2024/09/ai-wont-give-you-a-new-sustainable-advantage"
sourceTitle: "AI Won’t Give You a New Sustainable Advantage"
---
# Massive Data Scale Yields Diminishing Returns

**Contrarian insight.** Big tech emphasizes ever-larger datasets, but the authors argue that for *strategic business patterns*, **1 billion data points may offer no advantage over 50 million** — once the AI has identified the core pattern, additional volume does not change the strategic output (see [[concept-data-saturation-point]]).
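
**Illustration (sketch).** One way to see why 50 million points can already be "enough" is to treat the core pattern as a single estimated rate, which is an assumption made here for illustration and not the authors' model. Under that assumption the sampling error shrinks with the square root of the sample size, so 20x more data cuts the error by only about 4.5x, and the decision it feeds stays the same. The function names, the 12% rate, and the 10% decision threshold below are all hypothetical.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)


def decision_is_stable(p: float, threshold: float, n: int, z: float = 1.96) -> bool:
    """True when the ~95% interval around p sits entirely on one side of threshold.

    Hypothetical helper: models a strategic call of the form "act if rate > threshold".
    """
    margin = z * standard_error(p, n)
    return abs(p - threshold) > margin


if __name__ == "__main__":
    rate, threshold = 0.12, 0.10  # illustrative values, not from the source
    for n in (50_000_000, 1_000_000_000):
        se = standard_error(rate, n)
        stable = decision_is_stable(rate, threshold, n)
        print(f"n={n:>13,}  standard error={se:.2e}  decision stable={stable}")
```

At both sample sizes the uncertainty is orders of magnitude smaller than the gap between the rate and the threshold, so the extra 950 million points do not move the strategic output in this toy setting.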

**What it challenges:** The 'bigger is always better' dogma of data collection for ML/AI training.

**Counter-perspective (enrichment):** Diminishing returns apply to *more of the same signal*, a pattern that is well recognized in ML. The nuance the authors underweight is that continuous product-usage data can deliver *new, differentiated* signal (data network effects) that keeps compounding. Scale beyond saturation is wasteful; *fresh, differentiated* data at scale is not.
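
**Illustration (sketch).** The contrast between more rows and a new signal can be made concrete with a toy linear model. It uses the textbook approximation that a least-squares fit with `k` features on `n` rows has expected error of roughly `floor * (1 + k / n)`, where `floor` is the variance the available features cannot explain. Every number and name below is an assumption for illustration; none comes from the source.

```python
def expected_error(floor_variance: float, n_features: int, n_rows: int) -> float:
    """Approximate expected squared error of a least-squares fit (toy model).

    floor_variance: variance left unexplained by the features in use.
    The n_features / n_rows term is the part that more rows can shrink.
    """
    return floor_variance * (1.0 + n_features / n_rows)


# Hypothetical setup: the outcome depends on two independent unit-variance
# signals plus unit-variance noise. With only signal 1 available, signal 2
# acts as extra noise, so the unexplained variance is 1 + 1 = 2.
FLOOR_ONE_SIGNAL = 2.0
FLOOR_TWO_SIGNALS = 1.0

baseline = expected_error(FLOOR_ONE_SIGNAL, n_features=1, n_rows=50_000_000)
more_rows = expected_error(FLOOR_ONE_SIGNAL, n_features=1, n_rows=1_000_000_000)
new_signal = expected_error(FLOOR_TWO_SIGNALS, n_features=2, n_rows=50_000_000)

gain_from_rows = baseline - more_rows      # 20x more of the same signal
gain_from_signal = baseline - new_signal   # same row count, one new signal

if __name__ == "__main__":
    print(f"gain from 20x more rows:   {gain_from_rows:.2e}")
    print(f"gain from one new signal:  {gain_from_signal:.2e}")
```

In this toy setting, past saturation the extra rows buy an improvement that is negligible next to what one genuinely new signal delivers at the original row count.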