Close Menu
  • Home
  • AI
  • Entertainment
  • Finance
  • Sports
  • Tech
  • USA
  • World
  • Latest News

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

What's Hot

Mexico travel guide: What you need to know as violence erupts over cartel leader’s death

February 23, 2026

Ariel Kebbell and Zach Roerig break up: The Vampire Diaries Starz split

February 23, 2026

Benfica’s Prestiani suspended for one match over Vinicius incident | Soccer News

February 23, 2026
Facebook X (Twitter) Instagram
Facebook X (Twitter) Instagram Vimeo
BWE News – USA, World, Tech, AI, Finance, Sports & Entertainment Updates
  • Home
  • AI
  • Entertainment
  • Finance
  • Sports
  • Tech
  • USA
  • World
  • Latest News
BWE News – USA, World, Tech, AI, Finance, Sports & Entertainment Updates
Home » Silicon Valley makes big bets on the “environment” to train AI agents
AI

Silicon Valley makes big bets on the “environment” to train AI agents

adminBy adminSeptember 17, 2025No Comments9 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp VKontakte Email
Share
Facebook Twitter LinkedIn Pinterest Email


For years, the CEOs of leading high-tech companies have promoted the vision of AI agents that can use software applications to complete people’s tasks. But whether it’s Openai’s ChatGpt agent or Perplexity’s comet, spin today’s consumer AI agents. That way you can quickly realize how limited the technology is still. Making AI agents more robust could potentially adopt a new set of techniques the industry is still discovering.

One of these techniques is to carefully simulate a workspace where agents can be trained in multi-step tasks known as multi-step tasks (RL) environments. Just as how labeled datasets move in the final wave of AI, the RL environment is beginning to appear as an important factor in agent development.

AI researchers, founders and investors tell TechCrunch that the leading AI labs are demanding more RL environments and there is a shortage of startups that want to provide them.

“All big AI labs are building RL environments in-house,” Jennifer Li, general partner at Andreessen Horowitz, said in an interview with TechCrunch. “But as you can imagine, creating these datasets is so complicated that AI Labs is also looking at third-party vendors who can create high-quality environments and assessments. Everyone is looking at this space.”

The push of the RL environment has minted a new class of newly funded startups, including mechanization and key intelligence, aimed at leading the space. Meanwhile, large data label companies like Mercor and Surge say they are investing more in RL environments, addressing the industry’s shift from static datasets to interactive simulations. Major labs are also considering investing heavily. According to information, human leaders are debating more than $1 billion in spending on the RL environment over next year.

The hope for investors and founders is that one of these startups emerges as a “Scale AI for the Environment” and refers to $29 billion in data labeled Powerhouse, powered by the age of chatbots.

The question is whether the RL environment will truly boost the frontier of AI progression.

TechCrunch Events

San Francisco
|
October 27th-29th, 2025

What is an RL environment?

Because RL environments are core, they are the basis for training to simulate what AI agents do in real software applications. One founder explained in a recent interview that they will build them in “creating very boring video games, etc.”

For example, the environment can simulate a Chrome browser and task AI agents to buy socks on Amazon. The agent is graded for its performance and sends a reward signal when it is successful (in this case, it buys valuable socks).

Such tasks sound relatively simple, but there are many places where AI agents can stumble. You may be navigating through drop-down menus on a web page or purchasing too many socks. Also, since developers cannot accurately predict what wrong an agent is doing, the environment itself must be robust enough to capture unexpected behavior and still provide useful feedback. This makes the built environment much more complicated than a static dataset.

Some environments are very elaborate, allowing AI agents to use tools, access the Internet, and use a variety of software applications to complete specific tasks. Others are narrower and aimed to help agents learn specific tasks in enterprise software applications.

The RL environment is currently the hottest thing in Silicon Valley, but there are many precedents for using this technique. One of Openai’s first projects in 2016 was to build “RL Gyms.” This was very similar to the modern concept of the environment. In the same year, Google Deepmind’s Alphago AI system defeated the world champion in the board game. We also used RL technology within a simulated environment.

What’s unique about today’s environment is that researchers are trying to build computer-based AI agents with large-scale trans models. Unlike Alphago, a specialized AI system that runs in a closed environment, today’s AI agents are trained to have more general functions. AI researchers today have a stronger starting point, but there are also complex goals that don’t go well with many more.

A busy field

AI Scale AI data labeling companies like AI, Surge, Mercor are trying to meet at the moment and build an RL environment. These companies have more resources than many startups in this space, as well as their deeper relationships with AI Labs.

Surge CEO Edwin Chen told TechCrunch that demand for RL environments within AI Labs has been “a significant increase” recently. Surge, which reportedly worked with AI labs such as Openai, Google, Anthropic and Meta last year to generate revenues in $1.2 billion, recently said it has spun a new internal organisation specially charged to build an RL environment.

Just behind Surge is Mercor, a $10 billion worth of startup that also works in Openai, Meta, and humanity. Mercor is pitching investors to a business building RL environment for domain-specific tasks such as coding, healthcare and law, according to marketing materials seen by TechCrunch.

“Leah, few people understand how big the opportunities around the RL environment are,” Melkor CEO Brendan Hoody told TechCrunch in an interview.

The scale AI used to control the labeling space for data has lost ground since Meta invested $14 billion and hired CEOs. Since then, Google and Openai have removed Scale AI as data providers, and startups have even faced a race for data labeling work within Meta. But even so, Scale is trying to meet at the moment and create an environment.

“This lies in the nature of the business (Scale AI),” said Chetan Rane, head of product for agents and RL environments. “Scale proves its ability to adapt quickly. We did this early on in our first business unit, the self-driving cars. When ChatGPT came out, AI adapted to it.

Some new players have focused solely on the environment from the start. Among them is a startup that was founded about six months ago with the bold goal of “automating all jobs.” However, co-founder Matthew Barnett tells TechCrunch that his company starts with an AI coding agent’s RL environment.

Mechanization aims to provide AI labs with a small number of robust RL environments, says Barnett, rather than a large data company that creates a wide range of simple RL environments. At this point, the startup is building an RL environment by offering a $500,000 salary to software engineers. This is much higher than hourly contractors can work with AI or surges.

Mechanize is already working with humanity in an RL environment, two sources familiar with the issue told TechCrunch. Mechanization and humanity declined to comment on the partnership.

Other startups bet that the RL environment will have an impact outside of AI Labs. Prime Intellect – A startup supported by AI researchers Andrej Karpathy, Founders Fund and Menlo Ventures targets small developers in RL environments.

Last month, Prime Intellect launched the RL Environments Hub. This is intended to “hugging the face of an RL environment.” The idea is to allow open source developers to access the same resources that large AI Labs have, allowing those developers to access computational resources in the process.

According to Will Brown of Prime Intellect Researcher, a generally capable agent can be more computational than previous AI training techniques in an RL environment. Along with startups building RL environments, GPU providers can enhance their processes have another opportunity.

“The RL environment would be too big for one company to control,” Brown said in an interview. “Part of what we do is try to build a great open source infrastructure around it. The services we sell are calculations, so it’s a convenient on-ramp to use GPUs, but this is what we’re thinking about in the long run.”

Does it scale?

An unresolved question regarding the RL environment is whether the technique is scaled like previous AI training methods.

Reinforcement learning has driven some of the biggest leaps in AI over the past year, including models such as Openai’s O1 and Anthropic’s Claude Opus 4. These are particularly important breakthroughs as methods previously used to improve AI models show reduced returns.

The environment is part of AI Labs’ larger bets on RL, and we believe that many will continue to drive progress as data and computational resources are added to the process. Some Openai researchers behind O1 previously told TechCrunch that the company originally invested in the AI ​​Reasoning model (created through investment in RL and calculations during testing) and thought it would be a good extension.

The best way to scale RL remains unknown, but the environment appears to be a promising candidate. Instead of simply rewarding the chatbot for text responses, agents use tools and computers at their disposal to run in simulations. It’s much more resource intensive, but potentially rewarding.

Some are skeptical that all these RL environments will pan out. Ross Taylor, former AI research lead at Meta, who co-founded general reasoning, tells TechCrunch that RL environments tend to reward hacking. This is the process in which AI models cheats to earn rewards without actually performing tasks.

“I think people underestimate how difficult it is to expand the environment,” Taylor said. “Even the best (RL environments) that are generally available will not normally work without serious changes.”

Sherwin Wu, Head of Engineering for API Business at Openai, said in a recent podcast that it was “short” at RL environment startups. Wu said it is a very competitive space, but AI research has evolved so quickly that it is difficult to serve AI labs well.

Karpathy, a leading intelligence investor who calls the RL environment a potential breakthrough, has paid more broad attention to the RL space. In X’s post, he raised concerns about whether he could squeeze more AI progress from RL.

“I’m bullish about the interaction between environment and agent, but specifically, I’m bearish towards reinforcement learning,” says Karpathy.

Update: Previous versions of this article were called mechanization work. Updated to reflect the company’s official name.



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp Email
Previous ArticleColombia’s Petro assaults the US government over designation of the war on drugs
Next Article Trump is pushing businesses to report their revenues frequently. Here are both sides of the discussion
admin
  • Website

Related Posts

Secretary of Defense summons Antropic’s Amodei over military use of Claude

February 23, 2026

How AI agents will disrupt the economy

February 23, 2026

All the important news from the ongoing India AI Impact Summit

February 23, 2026

6 days left until disruption rate locks in at lowest level in 2026

February 22, 2026
Leave A Reply Cancel Reply

Our Picks

Newly freed hostages face long road to recovery after two years in captivity

October 15, 2025

Former Kenyan Prime Minister Raila Odinga dies at 80

October 15, 2025

New NATO member offers to buy more US weapons to Ukraine as Western aid dwindles

October 15, 2025

Russia expands drone targeting on Ukraine’s rail network

October 15, 2025
Don't Miss
Entertainment

Ariel Kebbell and Zach Roerig break up: The Vampire Diaries Starz split

By adminFebruary 23, 20260

4. Search for SalvatoresAfter finding the leading lady, TVD needed two essential male components of…

Jerry Turner buys house with fiance Lana Sutton

February 23, 2026

Campbell “Pookie” Puckett pregnant, expecting second child

February 23, 2026

Jessica Alba, boyfriend Danny Ramirez vacation in Miami

February 23, 2026
About Us
About Us

Welcome to BWE News – your trusted source for timely, reliable, and insightful news from around the globe.

At BWE News, we believe in keeping our readers informed with facts that matter. Our mission is to deliver clear, unbiased, and up-to-date news so you can stay ahead in an ever-changing world.

Our Picks

Mexico travel guide: What you need to know as violence erupts over cartel leader’s death

February 23, 2026

The skeleton of St. Francis of Assisi is displayed to the public for the first time

February 23, 2026

She had visions of herself living in Paris. Now this American woman calls it home.

February 23, 2026

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Facebook X (Twitter) Instagram Pinterest
  • Home
  • About Us
  • Advertise With Us
  • Contact US
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2026 bwenews. Designed by bwenews.

Type above and press Enter to search. Press Esc to cancel.