Encyclopaedia Britannica Sues OpenAI Over Alleged Use of Its Content to Train AI Models

Chicago/Washington, 16 March 2026 – Encyclopaedia Britannica has filed a lawsuit against OpenAI, accusing the artificial-intelligence firm of unlawfully using tens of thousands of its copyrighted articles to train generative AI systems such as ChatGPT.

The legal complaint alleges that OpenAI copied nearly 100,000 Britannica articles during the training process for its large language models, enabling AI systems to reproduce or closely mirror the reference publisher’s content.

According to the lawsuit, AI systems can generate responses that include “near-verbatim” passages from Britannica's encyclopedia entries and dictionary definitions. The publisher argues that this practice diverts users away from its own websites, undermining the traffic and subscription revenues that support its editorial operations.

The complaint also accuses OpenAI of trademark misuse, claiming that AI-generated responses sometimes cite Britannica as a source in incorrect or fabricated contexts, a phenomenon commonly referred to as AI “hallucinations.” Britannica argues that such citations could mislead users into believing the information originates from or is endorsed by the company.

Britannica is seeking unspecified monetary damages and a court order to prevent further use of its content in AI training without authorization.

The case is the latest in a rapidly expanding series of legal disputes between publishers and artificial-intelligence companies over the use of copyrighted material for AI training. Content owners—including authors, media companies and data providers—have increasingly argued that AI developers are building powerful models using proprietary information without consent or compensation.

Earlier this month, Nielsen subsidiary Gracenote also sued OpenAI, alleging that the company used its proprietary entertainment metadata without permission to train AI models.

Britannica itself has previously pursued similar claims against AI search startup Perplexity AI, accusing it of copying encyclopedia entries and definitions in AI-generated answers.

AI developers, including OpenAI, have consistently argued that training models on large datasets constitutes “fair use” under copyright law because the systems transform original material into new outputs rather than reproducing it directly.

Publishers, however, contend that AI systems rely heavily on copyrighted works and may reproduce them in ways that compete directly with the original sources.

Legal experts say the outcome of such lawsuits could have far-reaching consequences for the global AI industry. If courts rule that large-scale data scraping for AI training violates copyright law, technology companies may be forced to negotiate licensing agreements with publishers or significantly alter how AI models are trained.

A Defining Moment for the AI Economy

The dispute highlights the increasingly complex relationship between the AI industry and the global knowledge economy.

Reference publishers like Encyclopaedia Britannica argue that their carefully curated databases represent decades of editorial investment, while AI developers claim that access to large volumes of information is essential for building intelligent systems.

As courts begin to weigh these competing interests, the outcome could shape the future of generative AI, digital publishing and the economics of knowledge creation in the years ahead.

Author

  • Steven is a writer focused on science and technology, with a keen eye on artificial intelligence, emerging software trends, and the innovations shaping our digital future.
