---
id: "action-curate-and-license"
type: "action-item"
source_timestamps: ["§ Lessons for Rightsholders", "¶11"]
tags: ["monetization", "data-licensing"]
related: ["concept-curated-training-datasets", "framework-rightsholder-defense"]
action: "Package IP into clean, reliable, curated datasets tailored for specific AI model training and offer licenses."
outcome: "Generates new revenue streams by capitalizing on AI developers' need to avoid legal risks and scraping delays."
sources: ["tail2"]
sourceVaultSlug: "hbr-seg-tail2"
originDay: 2
articleStem: "hbr-tail-126-genai-copyright"
sourceUrl: "https://hbr.org/2025/07/can-gen-ai-and-copyright-coexist"
sourceTitle: "Can Gen AI and Copyright Coexist?"
---
# Curate and License Clean Datasets

**Action (rightsholders):** Package IP into clean, reliable, curated datasets tailored for specific AI-model training and offer licenses.

**Expected outcome:** New revenue streams by capitalizing on AI developers' need to avoid legal risk and scraping delays.

**Why it works:** A market for [[concept-curated-training-datasets]] is already active (70+ deals reported — HarperCollins, Universal Music, Reddit, Shutterstock, WSJ). It is step 3 of [[framework-rightsholder-defense]] and the mirror of the AI-side move to "sign proactive licenses" in [[framework-gen-ai-risk-mitigation]]. Pairs naturally with paywalling (see [[action-rethink-freemium]]).
