Close Menu
  • Home
  • AI
  • Entertainment
  • Finance
  • Sports
  • Tech
  • USA
  • World
  • Latest News

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

What's Hot

British fighter jets fly around Poland as part of NATO’s “Eastern Sentry” operation

September 21, 2025

India beat Pakistan with 6 wickets in the Asian Cup Super Four | Cricket News

September 21, 2025

Schumer urges Trump to negotiate ahead of the closing deadline

September 21, 2025
Facebook X (Twitter) Instagram
Facebook X (Twitter) Instagram Vimeo
BWE News – USA, World, Tech, AI, Finance, Sports & Entertainment Updates
  • Home
  • AI
  • Entertainment
  • Finance
  • Sports
  • Tech
  • USA
  • World
  • Latest News
BWE News – USA, World, Tech, AI, Finance, Sports & Entertainment Updates
Home » Silicon Valley makes big bets on the “environment” to train AI agents
AI

Silicon Valley makes big bets on the “environment” to train AI agents

adminBy adminSeptember 21, 2025No Comments9 Mins Read
Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp VKontakte Email
Share
Facebook Twitter LinkedIn Pinterest Email


For years, the CEOs of leading high-tech companies have promoted the vision of AI agents that can use software applications to complete people’s tasks. But whether it’s Openai’s ChatGpt agent or Perplexity’s comet, spin today’s consumer AI agents. That way you can quickly realize how limited the technology is still. Making AI agents more robust could potentially adopt a new set of techniques the industry is still discovering.

One of these techniques is to carefully simulate a workspace where agents can be trained in multi-step tasks known as multi-step tasks (RL) environments. Just as how labeled datasets move in the final wave of AI, the RL environment is beginning to appear as an important factor in agent development.

AI researchers, founders and investors tell TechCrunch that the leading AI labs are demanding more RL environments and there is a shortage of startups that want to provide them.

“All big AI labs are building RL environments in-house,” Jennifer Li, general partner at Andreessen Horowitz, said in an interview with TechCrunch. “But as you can imagine, creating these datasets is so complicated that AI Labs is also looking at third-party vendors who can create high-quality environments and assessments. Everyone is looking at this space.”

The push of the RL environment has minted a new class of newly funded startups, including mechanization and key intelligence, aimed at leading the space. Meanwhile, large data label companies like Mercor and Surge say they are investing more in RL environments, addressing the industry’s shift from static datasets to interactive simulations. Major labs are also considering investing heavily. According to information, human leaders are debating more than $1 billion in spending on the RL environment over next year.

The hope for investors and founders is that one of these startups emerges as a “Scale AI for the Environment” and refers to $29 billion in data labeled Powerhouse, powered by the age of chatbots.

The question is whether the RL environment will truly boost the frontier of AI progression.

TechCrunch Events

San Francisco
|
October 27th-29th, 2025

What is an RL environment?

Because RL environments are core, they are the basis for training to simulate what AI agents do in real software applications. One founder explained in a recent interview that they will build them in “creating very boring video games, etc.”

For example, the environment can simulate a Chrome browser and task AI agents to buy socks on Amazon. The agent is graded for its performance and sends a reward signal when it is successful (in this case, it buys valuable socks).

Such tasks sound relatively simple, but there are many places where AI agents can stumble. You may be navigating through drop-down menus on a web page or purchasing too many socks. Also, since developers cannot accurately predict what wrong an agent is doing, the environment itself must be robust enough to capture unexpected behavior and still provide useful feedback. This makes the built environment much more complicated than a static dataset.

Some environments are very elaborate, allowing AI agents to use tools, access the Internet, and use a variety of software applications to complete specific tasks. Others are narrower and aimed to help agents learn specific tasks in enterprise software applications.

The RL environment is currently the hottest thing in Silicon Valley, but there are many precedents for using this technique. One of Openai’s first projects in 2016 was to build “RL Gyms.” This was very similar to the modern concept of the environment. In the same year, Google Deepmind’s Alphago AI system defeated the world champion in the board game. We also used RL technology within a simulated environment.

What’s unique about today’s environment is that researchers are trying to build computer-based AI agents with large-scale trans models. Unlike Alphago, a specialized AI system that runs in a closed environment, today’s AI agents are trained to have more general functions. AI researchers today have a stronger starting point, but there are also complex goals that don’t go well with many more.

A busy field

AI Scale AI data labeling companies like AI, Surge, Mercor are trying to meet at the moment and build an RL environment. These companies have more resources than many startups in this space, as well as their deeper relationships with AI Labs.

Surge CEO Edwin Chen told TechCrunch that demand for RL environments within AI Labs has been “a significant increase” recently. Surge, which reportedly worked with AI labs such as Openai, Google, Anthropic and Meta last year to generate revenues in $1.2 billion, recently said it has spun a new internal organisation specially charged to build an RL environment.

Just behind Surge is Mercor, a $10 billion worth of startup that also works in Openai, Meta, and humanity. Mercor is pitching investors to a business building RL environment for domain-specific tasks such as coding, healthcare and law, according to marketing materials seen by TechCrunch.

“Leah, few people understand how big the opportunities around the RL environment are,” Melkor CEO Brendan Hoody told TechCrunch in an interview.

The scale AI used to control the labeling space for data has lost ground since Meta invested $14 billion and hired CEOs. Since then, Google and Openai have removed Scale AI as data providers, and startups have even faced a race for data labeling work within Meta. But even so, Scale is trying to meet at the moment and create an environment.

“This lies in the nature of the business (Scale AI),” said Chetan Rane, head of product for agents and RL environments. “Scale proves its ability to adapt quickly. We did this early on in our first business unit, the self-driving cars. When ChatGPT came out, AI adapted to it.

Some new players have focused solely on the environment from the start. Among them is a startup that was founded about six months ago with the bold goal of “automating all jobs.” However, co-founder Matthew Barnett tells TechCrunch that his company starts with an AI coding agent’s RL environment.

Mechanization aims to provide AI labs with a small number of robust RL environments, says Barnett, rather than a large data company that creates a wide range of simple RL environments. At this point, the startup is building an RL environment by offering a $500,000 salary to software engineers. This is much higher than hourly contractors can work with AI or surges.

Mechanize is already working with humanity in an RL environment, two sources familiar with the issue told TechCrunch. Mechanization and humanity declined to comment on the partnership.

Other startups bet that the RL environment will have an impact outside of AI Labs. Prime Intellect – A startup supported by AI researchers Andrej Karpathy, Founders Fund and Menlo Ventures targets small developers in RL environments.

Last month, Prime Intellect launched the RL Environments Hub. This is intended to “hugging the face of an RL environment.” The idea is to allow open source developers to access the same resources that large AI Labs have, allowing those developers to access computational resources in the process.

According to Will Brown of Prime Intellect Researcher, a generally capable agent can be more computational than previous AI training techniques in an RL environment. Along with startups building RL environments, GPU providers can enhance their processes have another opportunity.

“The RL environment would be too big for one company to control,” Brown said in an interview. “Part of what we do is try to build a great open source infrastructure around it. The services we sell are calculations, so it’s a convenient on-ramp to use GPUs, but this is what we’re thinking about in the long run.”

Does it scale?

An unresolved question regarding the RL environment is whether the technique is scaled like previous AI training methods.

Reinforcement learning has driven some of the biggest leaps in AI over the past year, including models such as Openai’s O1 and Anthropic’s Claude Opus 4. These are particularly important breakthroughs as methods previously used to improve AI models show reduced returns.

The environment is part of AI Labs’ larger bets on RL, and we believe that many will continue to drive progress as data and computational resources are added to the process. Some Openai researchers behind O1 previously told TechCrunch that the company originally invested in the AI ​​Reasoning model (created through investment in RL and calculations during testing) and thought it would be a good extension.

The best way to scale RL remains unknown, but the environment appears to be a promising candidate. Instead of simply rewarding the chatbot for text responses, agents use tools and computers at their disposal to run in simulations. It’s much more resource intensive, but potentially rewarding.

Some are skeptical that all these RL environments will pan out. Ross Taylor, former AI research lead at Meta, who co-founded general reasoning, tells TechCrunch that RL environments tend to reward hacking. This is the process in which AI models cheats to earn rewards without actually performing tasks.

“I think people underestimate how difficult it is to expand the environment,” Taylor said. “Even the best (RL environments) that are generally available will not normally work without serious changes.”

Sherwin Wu, Head of Engineering for API Business at Openai, said in a recent podcast that it was “short” at RL environment startups. Wu said it is a very competitive space, but AI research has evolved so quickly that it is difficult to serve AI labs well.

Karpathy, a leading intelligence investor who calls the RL environment a potential breakthrough, has paid more broad attention to the RL space. In X’s post, he raised concerns about whether he could squeeze more AI progress from RL.

“I’m bullish about the interaction between environment and agent, but specifically, I’m bearish towards reinforcement learning,” says Karpathy.

Update: Previous versions of this article were called mechanization work. Updated to reflect the company’s official name.



Source link

Share. Facebook Twitter Pinterest LinkedIn Tumblr WhatsApp Email
Previous ArticleTrump pushes Pam Bondi to pursue lawsuits against his enemy
Next Article Jake Bongiovie’s Millie Bobby Brown celebrates her 1st anniversary
admin
  • Website

Related Posts

6 days left for regular bird savings to disrupt the 2025 pass

September 21, 2025

Collider Fellows at Lincoln Center explore how technology can change the performing arts

September 20, 2025

Studio, YouTube Live, the new Gen AI tools, and all other updates announced on YouTube

September 20, 2025

Only 7 days left to save up to $668 to destroy tickets for 2025

September 20, 2025
Leave A Reply Cancel Reply

Our Picks

Israeli drone strike kills 5 in southern Lebanon, including three children

September 21, 2025

Trump reveals that Murdoch and Dell could join the Tiktok deal

September 21, 2025

Promoting two-state solutions in a Middle Eastern conflict could backfire

September 21, 2025

Fallout from check-in systems cyberattacks at three European airports continues on day 2

September 21, 2025
Don't Miss
Entertainment

Jake Bongiovie’s Millie Bobby Brown celebrates her 1st anniversary

By adminSeptember 21, 20250

For Millie Bobby Brown, a lot can happen in a year. The Stranger Things actress…

About Alec Baldwin and Hilaria Baldwin’s family

September 21, 2025

Jerry Roll, Bunny XO’s love story

September 21, 2025

Inside the family world of Nicole Richie and Joel Madden

September 21, 2025
About Us
About Us

Welcome to BWE News – your trusted source for timely, reliable, and insightful news from around the globe.

At BWE News, we believe in keeping our readers informed with facts that matter. Our mission is to deliver clear, unbiased, and up-to-date news so you can stay ahead in an ever-changing world.

Our Picks

British fighter jets fly around Poland as part of NATO’s “Eastern Sentry” operation

September 21, 2025

Brazilian protest bills that could lead to amnesty for Bolsonaro and allies

September 21, 2025

When we support Netanyahu vows to respond to a country that recognizes Palestinian state

September 21, 2025

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Facebook X (Twitter) Instagram Pinterest
  • Home
  • About Us
  • Advertise With Us
  • Contact US
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 bwenews. Designed by bwenews.

Type above and press Enter to search. Press Esc to cancel.