---
id: "question-solving-model-collapse"
type: "open-question"
source_title: "Don't Let AI Slop Muck Up Your Company's Processes"
source_url: "https://hbr.org/2026/06/dont-let-ai-slop-muck-up-your-companys-processes"
source_timestamps: ["¶19"]
tags: ["model-training", "future-of-ai"]
related: ["concept-generative-inbreeding"]
resolutionPath: "Creating closed-loop data partnerships with enterprises that rigorously track data provenance, or developing new model architectures that are resilient to synthetic data degradation."
sources: ["execution"]
sourceVaultSlug: "hbr-seg-execution"
originDay: 8
articleStem: "hbr-sig-54-ai-slop-processes"
sourceUrl: "https://hbr.org/2026/06/dont-let-ai-slop-muck-up-your-companys-processes"
sourceTitle: "Don’t Let AI Slop Muck Up Your Company’s Processes"
---
# How will the AI industry solve model collapse?

**Open question.** If a large share of the internet is already AI-generated, and models inevitably degrade when trained on synthetic data ([[concept-generative-inbreeding]]), how will foundation-model providers secure enough high-quality, human-generated ground truth to train future generations of LLMs?
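As a toy illustration of the degradation this question assumes, the sketch below repeatedly fits a Gaussian to samples drawn from the previous generation's fit. It is not a claim about how real LLM training behaves: the function name `simulate_collapse`, the sample size, and the single-Gaussian "model" are all assumptions chosen for brevity.

```python
import random
import statistics


def simulate_collapse(generations=50, sample_size=20, seed=0):
    """Return the fitted standard deviation after each generation.

    Hypothetical toy model: every generation is "trained" (a Gaussian fit)
    only on samples produced by the generation before it.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 stands in for human-generated data
    history = [sigma]
    for _ in range(generations):
        # The new model sees only synthetic samples from the previous model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # "Training" is a maximum-likelihood fit of mean and spread.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    for gen, sigma in enumerate(simulate_collapse()):
        if gen % 10 == 0:
            print(f"generation {gen:3d}: fitted std = {sigma:.3f}")
```

In this setup the fitted spread tends to drift toward zero over many generations because each fit slightly underestimates the variance of its parent, although any single run is noisy. Mixing fresh human-generated samples into each generation would be the toy analogue of the ground-truth supply this question asks about.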

**Possible resolution path.** Either closed-loop data partnerships with enterprises that rigorously track data provenance ([[action-track-provenance]]), which is the very reason [[contrarian-ai-providers-need-enterprises|AI providers need enterprises to restrict AI]], or new model architectures that are resilient to synthetic-data degradation.
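One way to picture the provenance-tracking half of this path is a record attached to every document, with a filter that admits only attested human-authored material into a training set. This is a minimal sketch under stated assumptions: `ProvenanceRecord`, its fields, the origin labels, and `select_human_ground_truth` are hypothetical names, not part of any real partnership scheme or library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical provenance metadata attached to one document."""
    doc_id: str
    origin: str                  # assumed labels: "human", "synthetic", "unknown"
    attested_by: Optional[str]   # who vouches for the origin label, if anyone


def select_human_ground_truth(records):
    """Keep only documents labeled human-authored AND attested by someone."""
    return [
        r for r in records
        if r.origin == "human" and r.attested_by is not None
    ]


# Illustrative records only; none of these reflect real data.
records = [
    ProvenanceRecord("memo-001", "human", "legal-team"),
    ProvenanceRecord("memo-002", "synthetic", "legal-team"),
    ProvenanceRecord("memo-003", "human", None),      # unattested, so excluded
    ProvenanceRecord("memo-004", "unknown", None),
]

selected = select_human_ground_truth(records)
print([r.doc_id for r in selected])
```

The filter is deliberately strict: a document labeled human but lacking an attestation is dropped, on the assumption that an unverified label is no better than an unknown one.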

**Enrichment caveat.** The premise's figure that 50% of internet content is AI-generated is speculative and unverified. Model collapse is a recognized risk, but there is limited public evidence of it happening at scale today, partly because leading labs actively curate and filter training data.
