Publishers Move to Join Lawsuit Challenging Google’s AI Training Practices

Major US publishers are seeking to join an ongoing lawsuit against Google, escalating legal pressure over how artificial intelligence systems are trained using copyrighted material. Hachette Book Group and Cengage Group asked a federal court in California for permission to intervene in a proposed class action that accuses Google of using protected works without authorization to develop its AI models. The publishers argue that large-scale copying of books and textbooks formed a core input into Google’s AI capabilities, amounting to widespread copyright infringement. If approved, their participation could significantly expand the scope of the case and increase potential damages. The move reflects growing resistance from content owners who say rapid AI development has outpaced legal safeguards designed to protect intellectual property.

The publishers contend that Google used works from well-known authors and educational texts to train its Gemini large language model without consent or compensation. They described the alleged conduct as one of the most extensive copyright violations ever undertaken, asserting that publishers are uniquely positioned to address questions around licensing, market harm, and the economic value of written content. The case originally focused on visual artists who claimed their images were improperly used to train an AI-powered image generator. By seeking to intervene, book and textbook publishers aim to broaden the legal challenge beyond visual media, highlighting that generative AI systems rely on diverse creative inputs. Google did not immediately respond to a request for comment, though the company has previously argued that its AI training practices fall within lawful use.

The lawsuit is part of a wider wave of legal action confronting the technology sector as AI adoption accelerates. Authors, artists, music labels, and publishers have filed cases against multiple AI developers, arguing that their works were ingested without permission during model training. Some companies have chosen to settle rather than litigate, underscoring the financial and legal risks involved. Last year, Anthropic agreed to a $1.5 billion settlement with authors over similar claims related to its chatbot technology. These disputes are shaping how courts interpret copyright law in the context of machine learning, an area where precedent remains limited and outcomes uncertain.

At the center of the Google case is the question of whether training AI systems on copyrighted material constitutes infringement or qualifies as transformative use under existing law. The publishers are seeking unspecified monetary damages on behalf of themselves and a broader class of authors, arguing that unlicensed AI training undermines traditional publishing markets. A federal judge will decide whether to allow the publishers to formally join the lawsuit, a decision that could influence parallel cases across the technology industry. For markets and policymakers, the dispute highlights mounting legal and regulatory risks surrounding AI development, with potential implications for innovation costs, licensing frameworks, and the balance between technological progress and intellectual property rights.