Study Suggests OpenAI Trained AI Models on Copyrighted Content

a day ago

Summary

A new study suggests that OpenAI trained at least some of its AI models on copyrighted content, sparking fair use concerns. The study's authors used a new method to identify training data 'memorized' by the models and found signs of memorization in GPT-4 and GPT-3.5 models.

Key Points

The study proposes a new method for identifying training data 'memorized' by AI models
GPT-4 and GPT-3.5 models showed signs of having memorized portions of popular fiction books and New York Times articles
OpenAI has long advocated for looser restrictions on developing models using copyrighted data

Why It Matters

The study's findings highlight the need for greater transparency about AI training data and raise fair use concerns about how AI models are developed.

Author

Kyle Wiggers

More Headlines

San Francisco Mayor Wants to Attract Tech Leaders

San Francisco Mayor Daniel Lurie wants to bring back the city's glory days by addressing a drug and homelessness crisis, streamlining permitting, and attracting tech leaders with tax breaks. He has spent much of his first 100 days in office walking troubled neighborhoods and rolling back a program that handed out free pipes, foil, and straws used for ingesting drugs.

OpenAI's ChatGPT struggles to turn momentum into revenue in India

OpenAI's ChatGPT has seen rapid growth in India, but the company is struggling to turn this momentum into revenue. One factor may be the lack of local pricing for India, with OpenAI's cheapest plan costing $20 per month.

Week in Review: Epic Wins, Zelle Shuts Down, Plaid Valuation

In this week's Week in Review, Epic Games won a case against Google over app store practices. Zelle, a person-to-person payment platform, shut down its stand-alone app as most users accessed it through their banks. Plaid, a fintech company, raised $575 million at a valuation of $6.1 billion.

Minecraft Movie Brings in $58 Million on Opening Friday

The Minecraft movie adaptation brought in $58 million on its opening day, making it one of the biggest openings of the year. The film is expected to bring in a total of $135 million over the weekend, surpassing 'Captain America: Brave New World' and providing a much-needed boost to the theatrical box office.

Deel's Communications Chief Resigns

Deel's head of communications, Elisabeth Diana, has resigned amid allegations that the company planted a spy at rival workforce management platform Rippling. The news follows a lawsuit filed by Rippling against Deel, accusing the company of violating the RICO Act, misappropriating trade secrets, and engaging in unfair competition.

Meta Releases New Collection of AI Models

Meta has released a new collection of AI models, called Llama 4. These reasoning models fact-check their answers and generally respond to questions more reliably, but as a consequence take longer than traditional models to deliver answers.

Elon Musk's DOGE Plans Hackathon

Elon Musk's DOGE is planning a hackathon to create a 'mega API' that will provide access to taxpayer data. The event, organized by two DOGE staffers at the Internal Revenue Service (IRS), aims to make it easy for cloud providers to access IRS data, including names, addresses, social security numbers, tax returns, and employment information. This has raised concerns about the potential security risks and privacy breaches.

Tech Layoff Wave Still Kicking in 2025

The tech layoff wave continues in 2025 as several startups and companies announce layoffs. Pandion, Icon, Altruist, Aqua Security, and SolarEdge Technologies have all announced job cuts, impacting hundreds of employees.

IPO Window May Close Again

In the wake of President Trump's sweeping tariffs, two highly anticipated IPOs are hitting pause. Klarna and StubHub had released public documents for their respective IPOs last month, each hoping to raise at least $1 billion in their debuts. Both were set to launch their road shows next week, talking to potential investors about their IPOs, but decided to postpone.

Trump Extends TikTok Deadline

President Donald Trump has extended the deadline for a deal to save TikTok in the US by 75 days, citing progress in negotiations. The move comes just one day before the original deadline was set to expire. ByteDance, the Chinese company that owns TikTok, must still finalize a sale of its US operations.

GitHub Copilot's Costly Upgrade

GitHub announced 'premium requests' for its AI coding assistant, GitHub Copilot. The new system imposes rate limits on tasks and actions with newer models. Customers on Pro and Business plans will receive 300 and 1,000 monthly premium requests, respectively.

Meta to Abandon Fact-Checkers in US

Meta announced that it would be abandoning its fact-checking program in the US, following a cultural tipping point towards prioritizing free speech. This change comes after Meta's founder and CEO Mark Zuckerberg attended President Trump's inauguration and added Dana White, a longtime Trump ally, to Meta's board.

Nintendo Delays Switch 2 Preorders Due to Tariffs

Nintendo has delayed the start of preorders for its newly announced Switch 2 console in the U.S. due to concerns over President Donald Trump's tariffs. The launch date of June 5, 2025, is still on track. The company unveiled the Switch 2 on Tuesday; the console will be available for purchase on its own or in a bundle that includes Mario Kart World.

Turbine Raises $22M to Provide Liquidity to Venture Capital Investors

Turbine, a debt platform for limited partners in private equity and VC, has raised $22 million in equity funding. The company provides liquidity to investors by using their fund stakes as collateral, similar to a home equity line of credit.

Microsoft Copilot gets major upgrades for its 50th birthday

For its 50th birthday, Microsoft is teaching its AI-powered Copilot chatbot a few new tricks. The bot can now take action on 'most websites,' enabling it to book tickets, reserve restaurants, and more.