A widely used AI image training database contained explicit pictures of children. Experts warn that’s just the tip of the iceberg

By Rachyl Jones
December 21, 2023, 4:38 PM ET
AI image generators can exacerbate the issue of child exploitation.

Artificial intelligence has quickly become intertwined with consumers’ work and personal lives, with some Big Tech leaders lauding its potential to reshape society like nothing the world has ever seen. But a new Stanford study paints a bleak picture of what AI can do when safety measures fall through the cracks.

On Wednesday, Stanford University’s Cyber Policy Center published a report saying it had found more than 1,000 illegal images depicting child sexual abuse in an open-source database used to train popular image-generation tools like Stable Diffusion. LAION, the nonprofit that assembled the database, used web-crawling tools to build datasets containing more than 5 billion links to online images, which companies can draw on as training data for their own AI models.

While 1,000 images is just a fraction of the total, the presence of child sexual abuse material in training data nevertheless helps image-generation models produce realistic, explicit images of children. So how did this happen? Experts who spoke with Coins2Day blame a race to innovate and a lack of accountability in the AI space. What’s more, they add, other illegal or objectionable material almost certainly exists in training data elsewhere.

“It was a matter of time,” Merve Hickok, president at the Center for AI and Digital Policy, told Coins2Day. “We opened the floodgates at this time last year, when one company after another released their models without safeguards in place, and the consequence of that race to market—we will have that for a very long time.” 

This isn’t the first case of child sexual exploitation through AI. Just last month, New Jersey police began investigating an incident in which male high school students used AI to create and share fake nude images of their female classmates. In September alone, 24 million unique visitors went to websites that can “undress” pictured individuals using AI, social media analytics firm Graphika found. Ads for these services appear on mainstream social media platforms, making them more accessible, Graphika reported. Bad actors can use such images to extort, blackmail, and damage the reputations of ordinary people, experts warned. And the ability to create explicit images of children using AI—even if they don’t depict a specific person—can put children at risk in the real world.

“We are in the early innings here, and I’m afraid it can get much worse,” said Yaron Litwin, chief marketing officer of Canopy, a company using AI to filter out inappropriate content for children.  

LAION has temporarily taken down its datasets and will ensure they are safe before republishing them, it said in an emailed statement. The nonprofit claimed it has “rigorous filters to detect and remove illegal content…before releasing them.” How 1,000 explicit images bypassed those filters is unclear, and LAION did not respond to additional questions. 

How does this happen? 

Child safety “is not an issue people necessarily think about when starting their projects,” said David Thiel, the former Facebook engineer and Stanford researcher who authored the report. “My impression is that the original dataset was built by AI enthusiasts who didn’t have a ton of experience with the various kinds of safety measures you would want to put in place.”

Thiel first began working on the project in September after being tipped off by someone else in the field. Another researcher had reason to believe child sexual abuse material might exist in a public dataset after viewing keywords in the descriptions of image entries. Thiel then designed a process for finding individual illegal images in large databases using PhotoDNA, a Microsoft-created technology that matches pictures against known existing ones. While other public datasets are also used to train AI models, Stanford scanned only the LAION one for this report, so explicit images of children may exist in other public databases as well.
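PhotoDNA itself is proprietary and licensed only to vetted organizations, but the general technique behind it, perceptual hashing, can be illustrated with open-source tools. The sketch below is a minimal example in Python using the open-source imagehash library rather than PhotoDNA: it hashes a candidate image and flags it if it falls within a small Hamming distance of any hash in a set of known images. The hash value and distance threshold here are hypothetical.

```python
# A minimal sketch of perceptual-hash matching, the general idea behind
# tools like PhotoDNA. PhotoDNA itself is proprietary; this example uses
# the open-source imagehash library instead (pip install imagehash pillow).
from PIL import Image
import imagehash

# Perceptual hashes of known images to screen against (hypothetical value).
KNOWN_HASHES = {imagehash.hex_to_hash("fa5c7e1d9b304a62")}

def is_near_match(path: str, max_distance: int = 5) -> bool:
    """Flag an image whose perceptual hash lies within `max_distance`
    bits (Hamming distance) of any known hash."""
    candidate = imagehash.phash(Image.open(path))  # 64-bit perceptual hash
    # Subtracting two ImageHash objects yields their Hamming distance.
    return any(candidate - known <= max_distance for known in KNOWN_HASHES)

if __name__ == "__main__":
    print(is_near_match("example.jpg"))
```

Because the hash captures visual structure rather than raw bytes, near-duplicates that have been resized or lightly edited can still match, which is what makes the approach practical for scanning billion-entry datasets.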

“Like much of the technology sector, there are a lot of things that are overlooked in a rush to get things out there,” Thiel told Coins2Day. “That’s something I believe happened here as well. It has echoes of ‘move fast and break things,’” he said, referencing the early-Facebook ideology.

What’s missing here is accountability and regulation, experts agreed. Already, consumers have become less forgiving of companies scraping the internet for training data. “Most people have realized the ‘crawl the whole web’ methodology is fraught for a number of reasons,” Thiel said. “There’s a shift towards training things that have been licensed.” A number of news organizations have partnered with AI companies to license their content for training purposes, most recently German media giant Axel Springer, which owns Politico and E&E News in the U.S.

While this shift in mindsets offers a positive outlook for the future of AI regulation, Thiel said, “The damage done by those early models will be with us for a bit.” 
