Penguin Random House: No AI Training via Our Publications, Books

Li Nguyen
AI Needs Data to Learn, Evolve, and Mature

Penguin Random House, one of the largest global publishers, is taking a firm stance against artificial intelligence (AI) companies using its books for AI training. This decisive action is in response to growing concerns over how AI firms source the data they use to train models, often scraping various materials like books, social media, and news articles. The publisher is amending its copyright terms to prevent its titles from being used for AI training purposes.

What’s Happening & Why This Matters

Penguin Random House has updated the copyright notice on all its books, explicitly stating that no part of its publications can be used to train AI systems. This change affects both newly released and reprinted titles across all its imprints worldwide. The publisher joins a growing movement within the publishing industry to protect intellectual property rights in the face of AI technology’s rapid advancement. The decision aligns with a European Union directive that lets copyright holders opt their work out of being used for AI training. Penguin’s move underscores the broader debate about how AI companies acquire the vast amounts of data their models need.

While Penguin Random House has taken a firm stance, not all publishers are following suit. Several academic and technical publishers, including Wiley, Oxford University Press, and Taylor & Francis, have struck deals to allow their content to be used for AI training under specific conditions. Meanwhile, media outlets remain divided. In late 2023, The New York Times sued OpenAI and Microsoft for allegedly using millions of its articles without permission, while other organizations have signed agreements with tech firms.

TF Summary: What’s Next

Penguin Random House’s actions may indicate the positions content creators and publishers will take in managing their intellectual property rights against AI. As AI models develop, each will require more data, and other major players may follow Penguin’s lead.

However, the divide within the publishing and media sectors is not ironclad; approaches to handling AI’s growing influence differ from one organization to the next. TF expects future developments to involve legal battles or additional agreements between content creators and AI companies that establish clear boundaries and fair compensation.

By Li Nguyen “TF Emerging Tech”
Background:
Liam ‘Li’ Nguyen is a persona characterized by his deep involvement in the world of emerging technologies and entrepreneurship. With a Master's degree in Computer Science specializing in Artificial Intelligence, Li transitioned from academia to the entrepreneurial world. He co-founded a startup focused on IoT solutions, where he gained invaluable experience in navigating the tech startup ecosystem. His passion lies in exploring and demystifying the latest trends in AI, blockchain, and IoT.