If data is the new oil, these companies are the new Baker Hughes

By Jeremy Kahn, Editor, AI
February 4, 2020, 7:00 AM ET

Artificial intelligence runs on data. And today, in most cases, that data needs to be labeled by humans.

This is particularly true when using computer vision to identify tumors on medical scans, spot roof damage from aerial photography or figure out whether an object crossing in front of your self-driving car is a plastic bag or a mother pushing a stroller. But it’s also true for speech recognition: To train the software, someone must provide an accurate transcript to match an audio recording.

Data labeling for machine learning has spawned an entirely new industry, and the companies springing up to help businesses label their data are among the hottest “picks and shovels” investment plays for venture capitalists hoping to cash in on the current A.I. gold rush.

The latest data point in this data labeling boom: Labelbox, a San Francisco startup that operates a software platform for helping companies manage their data labeling tasks, on Tuesday announced it had received $25 million in additional venture capital funding.

The money is from prominent Silicon Valley venture capital firm Andreessen Horowitz, whose managing partner Peter Levine is joining Labelbox’s board; Google’s A.I.-focused venture capital fund, Gradient Ventures; and Kleiner Perkins, another of the Valley’s best-known firms.

The investment, which is Labelbox’s Series B, or second round of institutional financing, brings the total that the not-quite-two-year-old startup has raised to $39 million.

Labelbox competes with a number of other labeling companies. There’s Scale AI, another San Francisco data labeling platform that has raised $122 million since its founding three years ago. There are also companies that specialize in running teams of human data labelers on a project basis, such as Hive, Cloudfactory, and Samasource. The latter was founded by Leila Janah, who died last month at age 37 and who saw data labeling as a way to bring decent wages and skilled work to people in the developing world.

Alexandr Wang, the 23-year-old founder and CEO of Scale AI, which has worked with a number of self-driving car companies, says that the “dirty secret” of artificial intelligence is that getting the software to work well in the real world requires a large amount of high-quality data.

“Where the rubber hits the road is what does the data these A.I. systems are trained on look like?” he says. “Is that data biased? Is that data high quality? Does that data have noise? Is that data comprehensive?”

Providing labels can be relatively low-skilled work (identifying “cats” in videos) performed by thousands of contractors in traditional outsourcing hubs such as India, Romania, or the Philippines, or it can be much higher-skilled work performed by radiologists (outlining the exact contours of a tumor on a medical scan) or lawyers (identifying a non-compete clause in a contract). Often companies need both general and more expert labeling and employ a combination of outsourcing firms, freelancers, and in-house experts to affix these annotations. The labels can take the form of bounding boxes around objects, visual or text tags on items in photographs, or classifications entered into a separate text-based database that accompanies the original data.
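
The label formats described above can be pictured as simple records. The Python sketch below is an assumption for illustration only; the class names, field names, and file name are hypothetical and do not come from any particular labeling tool.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """A rectangle drawn around an object in an image, in pixel coordinates."""
    label: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class Classification:
    """A label stored separately from the data it describes."""
    source_file: str
    label: str


# Hypothetical examples: a box around a cat, and a tagged contract clause.
box = BoundingBox(label="cat", x=40, y=60, width=120, height=80)
clause = Classification(source_file="contract_17.pdf", label="non-compete clause")
```

A bounding box ties a label to a region of an image, while a classification record only points back to the file it describes.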

Wang says that with such complex workflows, data governance (how companies track what data they are using, who’s using it, and what they are doing with it) is critical. “It isn’t sexy, but it really matters,” he says. Companies trying to deploy machine learning are often slowed because they don’t have systems in place to manage data labeling efficiently, he says.

Both Scale AI and Labelbox provide tools to help companies’ machine learning and data science teams analyze the data once it is labeled, allowing them to identify blind spots and biases. For example, are men overrepresented in your X-ray data (a bias)? Or did you have too few examples of cats running across the road to train your self-driving algorithm to brake for them (a blind spot)? “Every A.I. company needs tools to edit, manage, and review labels,” Manu Sharma, Labelbox’s co-founder and CEO, says.
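
One simple way to picture a blind-spot check is to count how often each category appears in a labeled data set and flag the rare ones. The Python sketch below is illustrative only and is not based on Scale AI's or Labelbox's actual tools; the function name, the sample labels, and the 5% threshold are all assumptions.

```python
from collections import Counter


def underrepresented(labels, min_share=0.05):
    """Return the share of each label that makes up less than min_share of the data."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: count / total
        for label, count in counts.items()
        if count / total < min_share
    }


# Hypothetical labels for objects seen in road footage.
road_labels = ["car"] * 70 + ["pedestrian"] * 27 + ["cat"] * 3

print(underrepresented(road_labels))
```

In this made-up sample, cats account for 3% of the labels and are the only category flagged. The same counts could be read the other way to spot a category that dominates the data.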

Michael Phillippi, vice president of technology at Lytx, a San Diego company that sells systems that allow trucking businesses to assess and track drivers’ behavior through cameras and sensor data, says it takes about 10,000 hours of labeled 20-second video clips to train a prototype A.I. system to detect something like driver distraction. To put that system into actual production, though, requires four to five million hours of video, he says. That is a lot of labeling.
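
To give a sense of scale, the hours Phillippi cites can be converted into clip counts. The short Python calculation below is only a back-of-the-envelope sketch: it assumes every clip is exactly 20 seconds long, and the function name is illustrative.

```python
SECONDS_PER_HOUR = 3600
CLIP_SECONDS = 20  # clip length cited by Lytx


def clips_for_hours(hours, clip_seconds=CLIP_SECONDS):
    """Number of fixed-length clips needed to fill the given hours of video."""
    return int(hours * SECONDS_PER_HOUR // clip_seconds)


prototype_clips = clips_for_hours(10_000)
production_low = clips_for_hours(4_000_000)
production_high = clips_for_hours(5_000_000)

print(f"Prototype: {prototype_clips:,} clips")
print(f"Production: {production_low:,} to {production_high:,} clips")
```

Under that assumption, the prototype works out to 1.8 million clips, and a production system to between 720 million and 900 million.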

John-Isaac Clark is the CEO of Arturo.ai, a spinout from American Family Insurance that specializes in machine learning software to analyze images, including satellite and aerial photography, for the insurance industry. He says that large, well-labeled data sets are especially important for training A.I. software to correctly identify “edge cases,” meaning unusual or rare situations.

Humans can often use common sense to deal with these situations, even when they haven’t encountered them before. Most A.I. systems, in contrast, need to have seen multiple examples during training to correctly handle them.

Both Arturo and Lytx are Labelbox customers. Clark says Labelbox enabled Arturo to reduce the number of employees it needed to supervise its data labeling contractors from four to just one.

Sharma and his co-founder Brian Rieger, who is now Labelbox’s chief operating officer, met when they both worked in the aeronautics industry, helping to design and test flight control systems. Sharma later worked for Planet Labs, a company that analyzes gigantic datasets of satellite images, where he realized the difficulty companies had with managing labeling tasks for A.I. training data and began thinking of creating a company to address this problem. His other co-founder, Dan Rasmuson, now Labelbox’s chief technology officer, had encountered similar problems working at a company that sold drone imagery.

Labelbox’s software supplies a set of labeling tools for both images and text, as well as a way to distribute data so that multiple labelers can work on the same data set simultaneously without duplicating any labels.
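
The idea of handing out data so that no two labelers receive the same item can be sketched with a shared queue. This Python example is a simplified illustration, not Labelbox's implementation; the class, method, and file names are hypothetical.

```python
from collections import deque


class LabelingQueue:
    """Hands each unlabeled item to exactly one labeler."""

    def __init__(self, items):
        self._pending = deque(items)
        self.assignments = {}  # item -> labeler who claimed it

    def claim(self, labeler):
        """Give the next unclaimed item to a labeler, or None if nothing is left."""
        if not self._pending:
            return None
        item = self._pending.popleft()
        self.assignments[item] = labeler
        return item


queue = LabelingQueue(["img_001.jpg", "img_002.jpg", "img_003.jpg"])
first = queue.claim("labeler_a")
second = queue.claim("labeler_b")
```

Because an item leaves the queue as soon as it is claimed, two labelers never receive the same one. This sketch runs in a single process; a system shared by many labelers would also need some form of locking or a central service.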

Some companies in the labeling space, such as Scale AI and Hive, provide labeling services themselves. In fact, Scale AI uses its own A.I. software to automatically generate labels for certain kinds of data. These labels are then checked by humans to ensure accuracy, Wang says.

Automatic labeling, he says, allows Scale AI’s customers to benefit from the work Scale AI has done in the past—if it has already built a system to detect cars in videos, for instance, customers may not need to train their own system from scratch. Even in cases where customers want to build their own models, he says, automatic labeling makes the process more efficient.

Labelbox, meanwhile, has taken a different approach. It doesn’t perform any labeling itself. Instead, it’s a tool for managing labeling projects and data across different contract labelers, who often work for large outsourcing firms. The software also allows Labelbox’s customers to audit the quality of labeling contractors. Labelbox gets paid based on how much data a customer runs through the software.

Andreessen Horowitz’s Levine compares Labelbox to GitHub, the software code repository that many companies use to manage their code. GitHub, which Microsoft acquired for $7.5 billion in 2018, was an Andreessen Horowitz investment. “Labelbox has the potential to fill a similar role for data in the AI/ML world,” Levine writes in response to emailed questions, using shorthand for artificial intelligence and machine learning. He says the platform can serve as “a single source of truth” for training data across an organization.

This story has been updated to correct the spelling of Labelbox chief technology officer Dan Rasmuson’s last name.

About the Author
Jeremy Kahn is the AI editor at Coins2Day, spearheading the publication's coverage of artificial intelligence. He also co-authors Eye on AI, Coins2Day’s flagship AI newsletter.