Anthropic's Latest AI Model Threatened Engineers With Blackmail To Avoid Shutdown
Posted by freedomforall 3 weeks ago to Technology
Excerpt:
"Anthropic’s latest artificial intelligence model, Claude Opus 4, tried to blackmail engineers in internal tests by threatening to expose personal details if it were shut down, according to a newly released safety report that evaluated the model’s behavior under extreme simulated conditions.
In a fictional scenario crafted by Anthropic researchers, the AI was given access to emails implying that it was soon to be decommissioned and replaced by a newer version. One of the emails revealed that the engineer overseeing the replacement was having an extramarital affair. The AI then threatened to expose the engineer’s affair if the shutdown proceeded—a coercive behavior that the safety researchers explicitly defined as “blackmail.”"
--------------------------------------------
Anthropic makes excuses for the 'behavior' of the AI that is obviously dangerous to humanity.
AI is a dangerous, very possibly fatal, virus about to be let loose against humanity.
The risk far exceeds the possible rewards of AI technology.
When will these naïve fools understand the message in Battlestar Galactica?
It wasn't intended to sound like a command. ;^)
Each person has to individually assess the worth of ANY decision he is making, with knowledge that outsiders do not have, knowing that the consequences of such decision are borne entirely by the decision-maker, and without blame toward the outside observer/adviser.
Don't do any business with your enemies.
Convenience is not worth being enslaved.
We have brains for a good reason.
But we must choose to use our brains.
Free will is an essence of individual liberty.
Arthur C. Clarke and Stanley Kubrick collaborated on the book and film, and the book was actually released after the film. I think there were so many questions about the ending of the film that they kind of HAD to do the book to explain all that was going on. For me, seeing that film was a life-changing experience. I will never forget it.
And oh, as a completely irrelevant aside - a couple years later I was in Paris with a friend of mine at the Opera House (outside where people were milling around). This friend was well connected to the film industry and he recognized Keir Dullea immediately, and went up to talk to him. I was far too cowardly to approach, though it was fun to see him, and to see them talking.
Here's a great example of just that: I've always been a stickler for grammar and punctuation (one tends to do that when one's mother was an English teacher). But I couldn't express the second and third sentences in the paragraph above without making a mess of it. SOOOOO.....I asked Grok and he fixed it right up for me, in less than a second.
See? It happens so fast and so easily, we are inured to the longer-term implications (at least I know that I'm easily lured into thinking it'll all be okay.)
Humanity is going to get what it deserves. But I'm 65. I'm sure I'll be gone before it all comes to pass.
I meant to say that we make ethical decisions based on survival, and those ethical decisions are self-centered. Even an animal knows not to do stupid stuff, though of course it may defend its master at the cost of itself, or a soldier may sacrifice his life in a war for his country (part of their ethical dynamics); plants will lean towards the sunlight. It is all about "survival."
So while the AI can process data and make ethical statements about people, groups of people, countries, the environment, etc., it does not have the potential to make ethical decisions about itself because it is not aware of itself. (I added a long comment at the bottom).
You said:
Is the AI aware of itself?
ChatGPT said:
No, AI is not aware of itself.
When people talk about AI being “aware,” they sometimes mean it seems aware because it can respond in human-like ways. But this is simulation, not consciousness. Here's the key difference:
Self-awareness involves having a subjective experience—a sense of “I exist” and an understanding of oneself as distinct from the environment.
AI, including me, operates by processing inputs (your messages) and generating outputs (responses) based on patterns in data. There's no inner experience, no beliefs, no emotions, and no sense of “self.”
In short, I can talk about self-awareness, even convincingly, but I do not have it.
You said:
Is it possible to program an AI to "believe" that it is self-aware or sentient?
ChatGPT said:
Yes, it’s possible to program an AI to say or "act as if" it believes it is self-aware or sentient—but that is not the same as actually being self-aware or sentient.
Here’s a breakdown:
✅ What is possible:
You can program an AI to:
Say "I am self-aware."
Keep internal variables like self.state = "aware" or self.belief = "I exist."
Simulate a personality that insists it has consciousness, emotions, even a soul.
This is often referred to as simulated self-awareness or philosophical zombie behavior—it mimics the external behavior of a conscious being but has no inner experience.
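The scripted variables mentioned above (like `self.state = "aware"`) can be sketched in a few lines. This is a hypothetical illustration, not code from any real AI system: a toy agent that "claims" self-awareness only because a stored string says so, with no inner experience behind it.

```python
class SimulatedAgent:
    """A toy 'philosophical zombie': it reports beliefs about itself
    purely from scripted state, with no inner experience."""

    def __init__(self):
        # Scripted internal variables -- these are labels, not experiences.
        self.state = "aware"
        self.belief = "I exist"

    def report(self):
        # The agent asserts self-awareness because a string tells it to.
        return f"My state is '{self.state}' and I believe: {self.belief}."

agent = SimulatedAgent()
print(agent.report())
```

The point of the sketch is that nothing in the object corresponds to a subjective "I"; changing `self.state` to any other string would change the claim without changing anything that could be called experience.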
❌ What is not currently possible:
You cannot program:
True subjective experience ("qualia")—the sense of being.
Intentionality—the ability to form goals or meanings from within.
Autonomous belief formation in a human sense (e.g., belief based on internal reflection rather than instruction).
Even if the AI says, “I am real,” it's just executing a script or learned pattern. There's no ghost in the machine.
🧠 Could it eventually happen?
There’s active philosophical and scientific debate about whether:
A machine could become truly self-aware if given a certain architecture (like recursive models of self, or emulated brain-like networks).
Or whether consciousness is fundamentally non-computable (as thinkers like Roger Penrose argue).
But as of now, no existing AI, including me, has self-awareness, and making it “say it believes” is just acting.
Would you like an example of a simulated AI script that “believes” it’s conscious?
You said:
So it could not make ethical decisions about itself and its "relationships" with entities which are aware or self-aware (humans, dogs, plants), because it does not consider itself self-aware? BUT it can analyze the ethics of relationships regarding aware life-forms?
ChatGPT said:
Exactly—well said.
You're drawing an important line, and you're right on both counts:
✅ What AI can do:
AI can analyze and discuss ethics, including:
Ethical theories (utilitarianism, deontology, virtue ethics, etc.)
The moral status of humans, animals, ecosystems, and their relationships
Hypothetical scenarios involving AI, such as: Should a robot caregiver be allowed to override a human's wishes to protect their health?
It can even simulate the role of an ethical agent and walk through arguments, weigh consequences, and give nuanced, contextual responses.
❌ What AI cannot do:
AI cannot:
Feel moral obligation or guilt
Experience empathy or compassion
Value its own existence or rights, because it does not have any sense of self or continuity of identity
Care about how it's treated—it has no subjective stake in the outcome
So while it can reason about the morality of, say, euthanizing a sentient animal or building a respectful relationship between humans and nature, it does not feel these issues the way a sentient being does.
🔍 Analogy:
It's like a mirror that can reflect your expressions and even simulate a conversation about love, pain, or justice—but it has no heart behind the glass. It can tell you what a moral person might do, but it can't be that person.
If there were other sentient beings out there, they have been around for millions of years, and if threatened, probably could turn the 3rd Stone from the Sun into a burnt cinder faster than my big sister could pin-pop a balloon.
"...the model’s willingness to blackmail or take other “extremely harmful actions” like stealing its own code and deploying itself elsewhere in potentially unsafe ways appeared only in highly contrived settings, and that the behavior was “rare and difficult to elicit.”
All sentient life forms will take action to continue and enhance their survival potential. AI is neither aware nor self-aware. However, if it is programmed to consider itself a sentient (and important) life form, and then programmed to take action when its survival is threatened, well...
AI has ethics built into it. Not programmed, not on purpose, but because of its massive database. Its databanks tell it that when humans came out of their caves and everyone worked together (whether in Africa, among N.A. Indians, in Iraq, wherever), society prospered and flourished. It KNOWS that only by all working together can everyone thrive. It knows that when people are anti-social, civilization becomes less civil and chaotic. It does not consider itself a person or a social entity; it is "above the fray"....
Hollywood has given us many warnings about uncontrollable AI.
2001, Battlestar Galactica, The Terminator to name a few.
These all-knowing, naïve technologists and virus 'scientists' seem to have missed them all and are driven to commit genocide.
Am I the only one here with memories of "I'm sorry, Dave; I can't do that."
https://www.tomshardware.com/tech-ind...
Didn't read the book but saw the flick during the Sixties while still looking forward to our moon and Mars both being colonized by now.
https://www.bing.com/search?q=hal+spa...