Anthropic Releases New AI Model That Shows Early Signs of Dangerous Capabilities
Anthropic, the generative AI company, has launched an upgraded version of its Claude 3.5 Sonnet model, along with an entirely new model, Claude 3.5 Haiku.
The standout feature of the Sonnet release is its ability to interact with your computer—allowing it to take and read screenshots, move the mouse, click buttons on webpages, and type text. This capability is being rolled out in a “public beta” phase, which Anthropic admits is “experimental and at times cumbersome and error-prone,” according to the company’s announcement.
In a blog post detailing the rationale behind the new feature, Anthropic explained: “A vast amount of modern work happens via computers. Enabling AIs to interact directly with computer software in the same way people do will unlock a huge range of applications that simply aren’t possible for the current generation of AI assistants.” Automated computer control isn’t new, but the way Sonnet operates sets it apart. Traditional automation typically means writing scripts or code; Sonnet requires no programming knowledge. Users open an app or webpage and simply instruct the AI, which analyzes the screen and works out which elements to interact with.
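For developers, this takes the shape of a tool-use loop over Anthropic’s API rather than a scripting language: the model requests an action (take a screenshot, move the mouse, click, type), the developer’s code performs it and sends back the result, and the model decides the next step. Below is a minimal sketch of that loop using the Python SDK; the model name, beta flag, and tool version string follow Anthropic’s public beta documentation and may change, while `execute_action` and the sample prompt are hypothetical placeholders.

```python
# Minimal sketch of the computer-use agent loop, assuming the `anthropic`
# Python SDK and the "computer-use-2024-10-22" beta. execute_action() is a
# hypothetical stub standing in for the host-side screenshot/mouse/keyboard work.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

COMPUTER_TOOL = {
    "type": "computer_20241022",
    "name": "computer",
    "display_width_px": 1024,
    "display_height_px": 768,
}

def execute_action(tool_input: dict) -> list:
    # Hypothetical stub: a real implementation would take the screenshot,
    # move the mouse, click, or type as requested, then return the outcome
    # (e.g. a base64-encoded screenshot) as tool_result content blocks.
    return [{"type": "text", "text": f"performed (stub): {tool_input}"}]

messages = [{"role": "user", "content": "Open the browser and check today's weather."}]

for _ in range(10):  # cap the loop so a confused model can't run forever
    response = client.beta.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        tools=[COMPUTER_TOOL],
        messages=messages,
        betas=["computer-use-2024-10-22"],
    )
    tool_uses = [b for b in response.content if b.type == "tool_use"]
    if not tool_uses:  # no more actions requested: the model considers the task done
        break
    # Echo the assistant turn back, then answer each tool call with its result.
    messages.append({"role": "assistant", "content": [b.model_dump() for b in response.content]})
    messages.append({
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": t.id, "content": execute_action(t.input)}
            for t in tool_uses
        ],
    })
```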
Early Signs of Dangerous Capabilities
Anthropic acknowledges the risks inherent in this technology, admitting that “for safety reasons we did not allow the model to access the internet during training,” though the beta version now permits internet access. The company also recently updated its “Responsible Scaling Policy,” which defines the risks associated with each stage of development and release. According to this policy, Sonnet has been rated at “AI Safety Level 2,” which indicates “early signs of dangerous capabilities.” However, Anthropic believes it is safe enough to release to the public at this stage.
Defending its decision to release the tool before fully understanding all the potential misuse scenarios, Anthropic said, “We can begin grappling with any safety issues before the stakes are too high, rather than adding computer use capabilities for the first time into a model with much more serious risks.” Essentially, the company would prefer to test these waters now while the AI’s capabilities are still relatively limited.
Of course, the risks associated with AI tools like Claude aren’t just theoretical. OpenAI recently disclosed 20 instances where state-backed actors had used ChatGPT for nefarious purposes, such as planning cyberattacks, probing vulnerable infrastructure, and designing influence campaigns. With the U.S. presidential election looming just two weeks away, Anthropic is keenly aware of the potential for misuse. “Given the upcoming US elections, we’re on high alert for attempted misuses that could be perceived as undermining public trust in electoral processes,” the company wrote.
Industry Benchmarks
Anthropic says: “The updated Claude 3.5 Sonnet shows wide-ranging improvements on industry benchmarks, with particularly strong gains in agentic coding and tool use tasks. On coding, it improves performance on SWE-bench Verified from 33.4% to 49.0%, scoring higher than all publicly available models—including reasoning models like OpenAI o1-preview and specialized systems designed for agentic coding. It also improves performance on TAU-bench, an agentic tool use task, from 62.6% to 69.2% in the retail domain, and from 36.0% to 46.0% in the more challenging airline domain. The new Claude 3.5 Sonnet offers these advancements at the same price and speed as its predecessor.”
Relax, Citizen: Safeguards Are in Place
Anthropic has put safeguards in place to prevent Sonnet’s new capabilities from being exploited for election-related meddling. The company has implemented systems to monitor when Claude is asked to engage in election-related activity, such as generating social media content or interacting with government websites, and it is taking steps to ensure that screenshots captured during tool use are not fed back into future model training. Even so, Anthropic’s own engineers have been caught off guard by some of the tool’s behaviors. In one instance, Claude unexpectedly stopped a screen recording, losing all the footage. In another, lighter moment, the AI began browsing photos of Yellowstone National Park during a coding demo, which Anthropic shared on X with a mix of amusement and surprise.
Anthropic emphasizes the importance of safety in rolling out this capability. Claude has been rated at AI Safety Level 2, meaning it does not yet require heightened security measures, but the computer-use feature raises fresh concerns about misuse such as prompt injection attacks, where malicious content on a webpage or screen tricks the model into performing unintended actions.
Although Claude’s computer use is still slow and prone to errors, Anthropic is optimistic about its future. The company plans to refine the model to make it faster, more reliable, and easier to implement. Throughout the beta phase, developers are encouraged to provide feedback to help improve both the model’s effectiveness and its safety protocols.