---
id: "concept-generative-inbreeding"
type: "concept"
source_title: "Don't Let AI Slop Muck Up Your Company's Processes"
source_url: "https://hbr.org/2026/06/dont-let-ai-slop-muck-up-your-companys-processes"
source_timestamps: ["§ Knowledge Entropy", "¶19"]
tags: ["model-training", "synthetic-data"]
related: ["concept-knowledge-entropy", "claim-ai-providers-need-ground-truth"]
definition: "The degradation of an AI model's accuracy and variability caused by training it on synthetic data generated by other AI models."
sources: ["execution"]
sourceVaultSlug: "hbr-seg-execution"
originDay: 8
articleStem: "hbr-sig-54-ai-slop-processes"
sourceUrl: "https://hbr.org/2026/06/dont-let-ai-slop-muck-up-your-companys-processes"
sourceTitle: "Don’t Let AI Slop Muck Up Your Company’s Processes"
---
# Generative Inbreeding (Model Collapse)

Also called 'model collapse,' generative inbreeding occurs when large language models (LLMs) are trained on synthetic data, meaning data created by another LLM or by a previous version of the same model. Over successive generations, this recursive training loop severely degrades the model's accuracy, variability, and overall performance. It is the training-time endpoint of [[concept-knowledge-entropy]].
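The loss of variability can be illustrated with a toy simulation. This is a minimal sketch, not the article's method and not how LLMs are actually trained: the "model" is just a normal distribution fitted to a small sample, and each generation is trained only on samples drawn from the previous generation's fit. The function name `simulate_collapse` and its parameters are hypothetical, chosen for illustration.

```python
import random
import statistics


def simulate_collapse(n_samples=10, generations=200, seed=0):
    """Toy recursive-training loop (illustrative assumption, not a real LLM).

    Returns the spread (standard deviation) of the data at each generation,
    starting with the original "human" data at index 0.
    """
    rng = random.Random(seed)

    # Generation 0: stand-in for human-written data, drawn from N(0, 1).
    data = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    spread = [statistics.pstdev(data)]

    for _ in range(generations):
        # "Train" a model: estimate mean and spread from the current data.
        mu = statistics.mean(data)
        sigma = statistics.pstdev(data)

        # The next generation sees only synthetic samples from that model.
        data = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        spread.append(statistics.pstdev(data))

    return spread


if __name__ == "__main__":
    history = simulate_collapse()
    print(f"spread at generation 0:   {history[0]:.4f}")
    print(f"spread at generation {len(history) - 1}: {history[-1]:.3e}")
```

Because every generation re-estimates the distribution from a finite sample of the previous generation's output, estimation error compounds and the spread tends to shrink toward zero. Under these toy assumptions, that shrinkage stands in for the loss of variability described above.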

The authors argue that because an estimated 50% of current internet and social-media content is already AI-generated, this synthetic data will inevitably become training data for future models. That creates the paradox in [[claim-ai-providers-need-ground-truth]] and [[contrarian-ai-providers-need-enterprises]]: preventing knowledge decay is just as critical for the companies building AI as for the enterprises using it. The unresolved industry problem is captured in [[question-solving-model-collapse]].

IMPORTANT NUANCE (enrichment overlay): model collapse is a recognized risk, supported by both theory and experiments, when models train predominantly on their own outputs. However, the specific '50% of internet content is AI-generated' figure is NOT substantiated by the cited governance sources and should be treated as speculative and likely overstated. Leading labs actively curate datasets and apply filters, so there is limited public evidence of foundation models collapsing at scale today.
