Cybersecurity researchers aren't satisfied with Anthropic Fable's guardrails - BWE News

Anthropic released its latest model, Fable, on Tuesday, billing it as a public, limited version of Mythos, its powerful and highly touted cybersecurity model.

However, not everyone is happy with the restrictions, and many cybersecurity researchers and experts have voiced their complaints online.

“[Fable] denies any request that might have something to do with cyber, even something as innocuous as reading a blog post,” said Valentina “Chompy” Palmiotti, a prominent security researcher who works at IBM X-Force.

If the prompt triggers a guardrail, Fable will pause the chat and say, “Due to safety precautions, this message has been flagged as a cybersecurity or biology topic.”

The guardrails were put in place to limit the risk of Fable being used to develop malware or compromise software, a long-standing concern within Anthropic. Restrictions on biology stem from similar concerns regarding the development of biological weapons.

When the AI giant released Mythos in April, it restricted the model to a limited number of businesses and organizations through an initiative called Project Glasswing, an effort to deploy the model to protect critical software and infrastructure. Last week, Anthropic expanded access to Mythos to hundreds of organizations in 15 countries.

But despite good intentions, many cybersecurity experts remain uncomfortable with the haphazard nature of the restrictions. “If you ask them to write secure code, they’ll think it’s cybersecurity work rather than software engineering best practices, and they’ll demote it,” cybersecurity veteran Matt Suiche told TechCrunch. (Fable is programmed to fall back to Claude Opus 4.8 if it hits a guardrail.) “It seems to be keyword-based, so anything in the vocabulary area of ‘cybersecurity’ will trigger guardrails.”


Do you have more information about how hackers are using AI? Or about how cybersecurity companies are leveraging AI? We’d love to hear from you. You can contact Lorenzo Franceschi-Bicchierai securely from a non-work device or network on Signal (+1 917 257 1382), on Telegram and Keybase @lorenzofb, or by email.

“But we’re still in the early stages and they’re still adapting the guardrails, so that’s understandable. I’m sure it will evolve over time as Anthropic and other frontier model companies collaborate more with today’s new generation of cybersecurity companies,” said Suiche, a member of the technical staff at AI cybersecurity startup Tolmo. He added that with a release like this, it is better to catch too much at first and loosen the guardrails over time than to catch too little.

Another researcher complained on X that even asking for a code review would trigger Fable’s guardrails.

Anthropic did not immediately respond to a request for comment.

Aside from the guardrails in its models, Anthropic also requires cybersecurity professionals to apply for a cyber validation program. If approved, applicants face fewer restrictions when using Claude for cybersecurity work. OpenAI has a similar program called Trusted Access for Cyber.
