OpenAI introduces a safety model that other sites can use to classify hazards

Sam Altman, CEO of OpenAI, attended the annual Allen & Company Sun Valley Media & Technology Conference held at Sun Valley Resort in Sun Valley, ID on July 8, 2025.

David A. Grogan | CNBC

OpenAI on Wednesday announced two inference models that developers can use to classify various online safety hazards on the platform.

The artificial intelligence models are called gpt-oss-safeguard-120b and gpt-oss-safeguard-20b, and their names reflect their size. These are tweaked, or adapted, versions of OpenAI’s gpt-oss model, which the company announced in August.

OpenAI introduces them as so-called open weight models. This means that its parameters, the factors that improve the output and predictions during training, are exposed. Although open weight models can provide transparency and control, they differ from open source models, which allow users to customize and modify the complete source code.

OpenAI says organizations can configure new models to suit specific policy needs. Also, because these are inference models that demonstrate their work, developers can gain more direct insight into how to arrive at a particular output.

For example, OpenAI said product review sites can develop policies and use the gpt-oss-safeguard model to screen out potentially fake reviews. Similarly, video game discussion forums can categorize posts that discuss cheating.

OpenAI developed the model in partnership with Robust Open Online Safety Tools (ROOST), an organization dedicated to building safety infrastructure for AI. Discord and SafetyKit also helped test the model. These are initially being provided as a research preview, and OpenAI said it will seek feedback from researchers and members of the safety community.

As part of the launch, ROOST is establishing a modeling community for researchers and practitioners using AI models to secure online spaces.

The announcement may help appease some critics who accuse OpenAI of commercializing and scaling too quickly at the expense of AI ethics and safety. The startup is valued at $500 billion and its consumer chatbot, ChatGPT, has more than 800 million weekly active users.

OpenAI announced Tuesday that it has completed a capital increase, strengthening its position as a nonprofit organization with control of its for-profit businesses. OpenAI was founded as a nonprofit research institute in 2015, but has emerged as the most valuable US technology startup in the years since releasing ChatGPT in late 2022.

“As AI becomes more powerful, safety tools and basic safety research need to evolve at a similar speed, and they need to be accessible to everyone,” ROOST President Camille Francois said in a statement.

OpenAI says eligible users can download model weights on Hugging Face.

Spotlight: OpenAI finalizes capital restructuring plan

Source link

What's Hot

Dolly Parton praises Ozzy Osbourne

Are you worried about the job market or stuck in a toxic workplace? These two movies can feel ‘cathartic’

Iranian regime pressures families of murdered protesters to bury truth behind crackdown

Three themes driving Wall Street’s frenetic week and the new US-Iran conflict wild card

Anthropic’s Claude ranks 2nd on Apple’s Top Free Apps list

Xiaomi 17 and 17 Ultra launched amid memory chip shortage

The AI just leveled up and there are no guardrails anymore

Newly freed hostages face long road to recovery after two years in captivity

Former Kenyan Prime Minister Raila Odinga dies at 80

New NATO member offers to buy more US weapons to Ukraine as Western aid dwindles

Russia expands drone targeting on Ukraine’s rail network

Dolly Parton praises Ozzy Osbourne

Harry Styles’ red carpet fashion look

Bridgerton showrunner Phoebe Dynevor talks about recasting Regé-Jean Page

Graham Norton talks about Taylor Swift and Travis Kelsey’s wedding

Our Picks

Iranian regime pressures families of murdered protesters to bury truth behind crackdown

Iranian leader Ayatollah Khamenei has died, according to President Trump and Israeli officials. Here’s what we know:

The almost forgotten history of a 1,700-year-old gigantic structure

Subscribe to Updates

What's Hot

OpenAI introduces a safety model that other sites can use to classify hazards

Related Posts

Subscribe to Updates