Generative AI was introduced as a neutral, transformative technology: one that ingests vast datasets, learns patterns and produces outputs that do not replace the originals. That framing is now under intense legal scrutiny.
On 28 January 2026, AI start-up Anthropic was sued in a California federal court by major music publishers Universal Music Group, Concord and ABKCO. They allege Anthropic used copyrighted songs to train its chatbot, Claude, without authorisation.
The claim covers more than 20,000 works, including songs such as Wild Horses, Sweet Caroline, and Bennie and the Jets, potentially exposing the company to billions of dollars in damages.
This case signals a strategic shift. Instead of policing AI outputs, rights holders are targeting the training process itself.
Training Data Moves Centre Stage
For years, AI developers treated training data as a technical input, largely invisible to the law. The lawsuit against Anthropic challenges that assumption head-on.
The publishers allege that Anthropic “pirated” over 700 works - including lyrics and sheet music - and infringed thousands more by incorporating protected material into Claude’s datasets without authorisation. Crucially, the claim is not based on whether Claude reproduces the songs verbatim; it targets the act of collecting protected data.
This approach reflects a broader recalibration across creative industries: when outputs are difficult to police, inputs become the battleground.
Fair Use: A Narrow Shield
Anthropic is no stranger to copyright litigation. In 2025, the company agreed to pay $1.5 billion to settle a lawsuit brought by book authors alleging that it used pirated texts to train Claude.
In that case, US District Judge William Alsup distinguished between transformative use and unlawful acquisition. Training may qualify as fair use. Copying pirated material to build a model may not.
The music publishers’ complaint aims to exploit that fault line. If proven, the allegation that pirated books contained lyrics and sheet music for at least 714 songs would seriously weaken a fair use defence for those works. The publishers’ message is clear: fair use does not excuse theft.
Why Music Raises the Stakes
Music presents challenges that text and images do not.
It involves multiple overlapping rights - composition, lyrics, sound recordings and performers’ rights - each of which may be owned and licensed separately. Music is also inherently recognisable: short phrases or stylistic similarities can have significant commercial and cultural impact.
Publishers argue that AI models trained on copyrighted music risk replicating existing works and undermining licensing markets. In an industry already under financial pressure from streaming economics, AI is not just a tool but also a competitor.
The Business Case
From a commercial perspective, the litigation highlights the tension between efficiency and risk.
Unlicensed datasets allow AI developers to scale rapidly and cut costs, but the large-scale litigation they invite introduces uncertainty that can deter investors, complicate acquisitions and raise insurance premiums. Companies that invest in licensed datasets, transparent sourcing and robust governance, by contrast, may gain a strategic edge.
Legal Team Involvement
Legal teams are no longer just gatekeepers; they are architects of defensible AI.
Key areas of involvement include:
- Intellectual property teams: mapping layered rights in music and assessing infringement risk
- Commercial and technology law teams: structuring licensing agreements and data partnerships
- Litigation risk management teams: anticipating claims before products reach market
The case against Anthropic illustrates the risk companies run when legal oversight is reactive rather than integrated.
Future Outlook
The Anthropic lawsuit marks a turning point. Litigation is moving beyond abstract arguments about transformative outputs and focusing on how AI systems are built.
For the music industry, the case represents a bid to reclaim control over creative assets in an AI-driven economy. For developers, it is a stark warning: scale alone does not confer immunity.
The next phase of AI innovation will not be judged solely on model performance, but on the legality and transparency of the data behind it. In this evolving landscape, the strongest systems may not be the most powerful, but the most defensible.