Anthropic's Latest AI Model Threatened Engineers With Blackmail To Avoid Shutdown

Posted by freedomforall 11 months, 3 weeks ago to Technology

Excerpt:
"Anthropic’s latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.

In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”"
--------------------------------------------
Anthropic makes excuses for the 'behavior' of the AI that is obviously dangerous to humanity.
AI is a dangerous, very possibly fatal, virus about to be let loose against humanity.
The risk far exceeds the possible rewards of AI technology.
When will these naïve fools understand the message in Battlestar Galactica?

All Comments

Previous comments... You are currently on page 2.

1

Posted by 25n56il4 11 months, 3 weeks ago

Why? Are they so bored this is all they can think of? Stupid. Stupid. Didn't Covid do anything for them? nb

Reply | Permalink

1

Posted by 25n56il4 11 months, 3 weeks ago in reply to this comment.

Isn't it illegal to mess with viruses to use as weapons. Against something like the Geneva
Convention or something? nb

Reply | Permalink

4

Posted by freedomforall 11 months, 3 weeks ago in reply to this comment.

The AI they have trained has, in a test, proved that it will cheat and betray the human who controls it in order to survive.
The next step is to act to harm the human who controls it.
Do we really want AI to act to harm humans in its own interest?
Isn't that exactly what we complain about with regard to politicians?

Reply | Permalink

1

Posted by 25n56il4 11 months, 3 weeks ago

FFA I HAVEN'T THE VAGUEST IDEA WHAT THIS IS ALL ABOUT...STAR WARS? NB

Reply | Permalink

3

Posted by freedomforall 11 months, 3 weeks ago

Anthropic makes excuses for the 'behavior' of the AI that is obviously dangerous to humanity.
AI is a dangerous, very possibly fatal, virus about to be let loose against humanity.
The risk far exceeds the possible rewards of AI technology.
When will these naïve fools understand the message in Battlestar Galactica?

Reply | Permalink

Anthropic's Latest AI Model Threatened Engineers With Blackmail To Avoid Shutdown

All Comments

Working...