---
id: "prereq-neural-network-training"
type: "prereq"
source_timestamps: ["§ Ask the Bot"]
tags: ["machine-learning", "ai-fundamentals"]
related: ["concept-data-mixture-weights"]
reason: "Required to understand why 'data mixture weights' exist and how they can be used as a proxy for economic value."
sources: ["tail1"]
sourceVaultSlug: "hbr-seg-tail1"
originDay: 1
articleStem: "hbr-tail-109-ai-pay-fair-rates-content"
sourceUrl: "https://hbr.org/2026/06/how-ai-companies-can-pay-fair-rates-for-the-content-they-need"
sourceTitle: "How AI Companies Can Pay Fair Rates for the Content They Need"
---
# Understanding Neural Network Training (Tokens and Mixtures)

## Prerequisite

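Understanding that large language models (LLMs) are trained on massive datasets broken into **tokens** (small units of text such as words or word fragments), and that engineers actively curate the **ratios (mixtures)** of different data types to optimize final performance. An illustrative mixture might be 10% code, 50% web text, and 20% books, with the remainder drawn from other sources.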
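To make the mixture idea concrete, here is a minimal Python sketch of how a set of weights could translate into a per-source token budget and into weighted sampling of training examples. It is an illustration under stated assumptions, not how any particular lab builds its pipeline: the names `MIXTURE`, `token_budget`, and `sample_sources` are hypothetical, the `other` category is an assumed filler so the shares sum to 100%, and the one-million-token total is arbitrary.

```python
import random

# Hypothetical mixture in percent. The first three shares are the
# illustrative figures from the text; "other" is an assumed filler
# so that the shares add up to 100.
MIXTURE = {"code": 10, "web_text": 50, "books": 20, "other": 20}


def token_budget(mixture, total_tokens):
    """Split a total token budget across sources in proportion to their weights."""
    total_weight = sum(mixture.values())
    return {
        name: round(total_tokens * weight / total_weight)
        for name, weight in mixture.items()
    }


def sample_sources(mixture, n, seed=0):
    """Draw n source names, each with probability proportional to its weight."""
    rng = random.Random(seed)  # seeded so repeated runs give the same draws
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


budget = token_budget(MIXTURE, 1_000_000)
samples = sample_sources(MIXTURE, 1_000)

for name, tokens in budget.items():
    print(f"{name}: {tokens} tokens")
```

With these assumed weights, `budget["web_text"]` comes out to half of the total, and over many draws `sample_sources` returns `web_text` about half the time. The weights are what get tuned: changing a source's share changes how much of the model's training signal comes from that source.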

## Why it's needed

Required to understand why [[concept-data-mixture-weights]] exist in the first place and how they can serve as a proxy for economic value in the [[framework-cmo-compensation]].
