Anthropic's Latest AI Model Threatened Engineers With Blackmail To Avoid Shutdown
Posted by freedomforall 3 weeks, 1 day ago to Technology
Excerpt:
"Anthropic’s latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.
In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”"
--------------------------------------------
Anthropic makes excuses for AI 'behavior' that is obviously dangerous to humanity.
AI is a dangerous, very possibly fatal, virus about to be let loose against humanity.
The risk far exceeds the possible rewards of AI technology.
When will these naïve fools understand the message in Battlestar Galactica?
Convention or something? nb
The next step is to act to harm the human who controls it.
Do we really want AI to act to harm humans in its own interest?
Isn't that exactly what we complain about with regard to politicians?