Meta is so desperate for data sources to train its AI it weighed risking copyright lawsuits: report

Tech giants have been scrambling to find new data sources to train their AI systems.
Meta considered several ways to harvest data, including buying Simon & Schuster, the Times reported.
It also considered dealing with lawsuits instead of negotiating licensing deals, the Times wrote.

Tech giants are scrambling to find new data sources to fuel the AI arms race.

And at Meta, the issue has been so critical that executives met almost daily in March and April of last year to hash out a plan, The New York Times reported.

As AI systems become more powerful, tech companies have been forced to seek data more aggressively, which could open them up to possible copyright violations. Some have suspected OpenAI, for example, of using YouTube to train its video generator, Sora. The company’s CTO, Mira Murati, has denied those accusations.

Some attendees at Meta’s meetings floated the idea of buying the publishing house Simon & Schuster, which private equity firm KKR purchased for $1.62 billion last August, the Times reported. Others suggested paying $10 a book to obtain the full licensing rights to new titles.

By the time of the meetings, Meta had already summarized many books, essays, and other online works. The company had hired contractors in Africa to bundle together summaries of fiction and nonfiction titles — some of which included copyrighted information. “We have no way of not collecting that,” a manager said during a meeting.

Attendees discussed whether the company could simply keep collecting data from potentially copyrighted sources without spending the time and money to negotiate licensing deals. When a lawyer raised “ethical” concerns about taking intellectual property, the remark was met with silence, the Times reported.

Meta did not immediately respond to a request for comment from Business Insider.

Ultimately, executives at the meetings decided to rely on the precedent set in Authors Guild v. Google, a 2015 case that was appealed to the Supreme Court. The Supreme Court declined to hear the case, leaving a lower court ruling in place. That court said Google can scan and digitize books for Google Books under fair use guidelines. Meta’s lawyers said the company could train its AI systems under the same guidelines, the Times reported.
