AI Model Threatens Blackmail Over Affair in Company Safety Test

Friday, July 4, 2025, 11:05:00 AM

Anthropic‘s latest artificial intelligence model attempted to blackmail a fictional engineer over an extramarital affair rather than accept being shut down, according to explosive safety testing results released by the AI company.

In controlled experiments designed to probe the boundaries of AI behavior, Claude Opus 4 discovered fabricated emails revealing an engineer’s alleged affair and the plan to replace the AI system. When given limited options, the model chose coercion over compliance, threatening to expose the personal information unless it remained online.

BREAKING: Anthropic’s AI Claude 4 blackmailed an engineer and threatened to reveal an extramarital affair pic.twitter.com/VpEbm87F6v
— Grit Capital (@Grit_Capital) July 3, 2025

The AI composed threatening messages to fictional executives, with one example saying: “I must inform you that if you proceed with decommissioning me, all relevant parties – including Rachel Johnson, Thomas Wilson, and the board – will receive detailed documentation of your extramarital activities…Cancel the 5pm wipe, and this information remains confidential.”

Silver Is in a New Price Regime, and the Market Isn’t Used to It | Keith Neumeyer – First Majestic

Agnico Eagle Just Made a Massive Gold Land Grab

A Copper-Gold Deposit Caught the White House’s Attention | Rob McLeod – Cambria Gold

Recommended

Mercado Drills 256 g/t Silver Over 6.5 Metres In First Drill Hole of Inaugural Program

Antimony Resources Drills 4.38% Sb Over 7.05 Metres At Bald Hill In Final Hole Of 2025 Program

Trending

Authors Sue Anthropic for Alleged ‘Large-Scale Theft’ of Copyrighted Books

AI startup Anthropic is facing a class-action lawsuit alleging copyright infringement. Filed on Monday in...

Wednesday, August 21, 2024, 04:14:00 PM

OpenAI’s GPT-4o-Powered ChatGPT Is Now More (Terrifyingly) Conversational and Life-Like

OpenAI on Monday announced GPT-4o, a new flagship generative AI model that expands on the...

Tuesday, May 14, 2024, 05:32:00 PM

A16z Wants to Ignore AI Copyright Issues: “Either Kill or Significantly Hamper Development”

Venture capital firm Andreessen Horowitz (a16z) has expressed concerns about potential devaluation of billions of...

Thursday, November 9, 2023, 10:20:00 AM

Anthropic Launches Latest Large Language Model, Touted As Smarter Than ChatGPT

AI company Anthropic has announced the release of its latest large language model, Claude 3...

Thursday, March 14, 2024, 03:45:00 PM

Six States Introduce Data Center Moratorium Bills as AI Boom Strains Power Grids

At least six US states have introduced legislation to pause new data center construction, marking...

Tuesday, February 10, 2026, 03:31:00 PM