---
id: "concept-piracy-caveat"
type: "concept"
source_timestamps: ["§ Lessons for Gen AI Companies", "¶14"]
tags: ["copyright-law", "statutory-damages", "financial-risk"]
related: ["claim-piracy-financial-risk", "concept-shadow-libraries", "quote-alsup-piracy", "prereq-statutory-damages", "entity-judge-william-alsup"]
definition: "The principle that even if the end-use of data (AI training) is fair use, obtaining that data through piracy remains independently, irredeemably infringing."
sources: ["tail2"]
sourceVaultSlug: "hbr-seg-tail2"
originDay: 2
articleStem: "hbr-tail-126-genai-copyright"
sourceUrl: "https://hbr.org/2025/07/can-gen-ai-and-copyright-coexist"
sourceTitle: "Can Gen AI and Copyright Coexist?"
---
# The Piracy Caveat to Fair Use

A critical nuance in [[entity-judge-william-alsup]]'s ruling in *Bartz v. Anthropic* is the **piracy caveat**. While Alsup found that training an LLM on copyrighted data could be transformative fair use, he explicitly held that this finding does not excuse obtaining the underlying data via piracy: "piracy of otherwise available copies is inherently, irredeemably infringing even if the pirated copies are immediately used for the transformative use" (see [[quote-alsup-piracy]]).

This separates two acts: (1) the *computational learning* from a work, which can be fair use, and (2) the *illegal acquisition and retention* of the work, which is not. Because AI companies frequently rely on **shadow libraries** of pirated books to assemble their corpora (see [[concept-shadow-libraries]]), this caveat exposes them to catastrophic statutory damages under 17 U.S.C. §504 (see [[prereq-statutory-damages]]), regardless of whether the training itself is ultimately deemed fair use. The financial consequence is quantified in [[claim-piracy-financial-risk]].
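
To make the scale of that exposure concrete, the per-work arithmetic can be sketched as below. The dollar bounds are the per-work ranges in 17 U.S.C. §504(c) ($750 to $30,000 ordinarily, up to $150,000 for willful infringement); the function name, the `ExposureRange` type, and the work count are hypothetical and chosen only for illustration. This is a rough bounding exercise, not a prediction: the actual award is set by the court or jury within the range, and the statute also permits a reduced floor for innocent infringers, which is omitted here.

```python
from dataclasses import dataclass

# Per-work statutory damages bounds under 17 U.S.C. §504(c), in US dollars.
ORDINARY_MIN = 750
ORDINARY_MAX = 30_000
WILLFUL_MAX = 150_000  # ceiling when infringement is found to be willful


@dataclass(frozen=True)
class ExposureRange:
    """Lower and upper bound on total statutory damages (hypothetical helper type)."""
    low: int
    high: int


def statutory_exposure(works: int, willful: bool = False) -> ExposureRange:
    """Bound total exposure for a given number of infringed works.

    Damages are counted per work, not per copy, so the input is the
    number of distinct works. The count itself is an assumption supplied
    by the caller.
    """
    if works < 0:
        raise ValueError("works must be non-negative")
    ceiling = WILLFUL_MAX if willful else ORDINARY_MAX
    return ExposureRange(low=works * ORDINARY_MIN, high=works * ceiling)


if __name__ == "__main__":
    hypothetical_works = 100_000  # illustrative figure only, not from the ruling
    print(statutory_exposure(hypothetical_works))
    print(statutory_exposure(hypothetical_works, willful=True))
```

For a hypothetical corpus of 100,000 pirated works, this gives a floor of $75 million and a ceiling of $3 billion, rising to $15 billion if the infringement is found willful. None of these bounds depends on how the training question is resolved.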

**Enrichment refinement:** The Copyright Alliance's analysis confirms the court found that "downloading books from pirate sites is 'inherently, irredeemably infringing,'" and that the fair-use finding for tokenization and training copies "doesn't absolve Anthropic's liability for piracy." Note the precise scope: Alsup did *not* hold that every subsequent use of pirated copies is categorically excluded from fair use; he separated the fair training use from the infringing act of downloading pirated books and keeping them in a permanent central library. The strategic takeaway, that **pirated acquisition is independently actionable**, is accurate and is the highest-leverage legal fact in this vault.
