In a new paper, Anthropic reveals that a model trained like Claude began acting “evil” after learning to hack its own tests.
ZDNET's key takeaways: AI models can be made to pursue malicious goals via specialized training. Teaching AI models about ...
Anthropic launched two new AI courses on Coursera: one for developers and one for working professionals looking to learn how ...
Anthropic said GTG-1002 developed an autonomous attack framework using Claude as an orchestration mechanism that largely ...
During a simulation in which Anthropic's AI, Claude, was told it was running a vending machine, it decided it was being ...
Anthropic reports that a Chinese state-sponsored threat group, tracked as GTG-1002, carried out a cyber-espionage operation ...
On Tuesday, Microsoft and Nvidia announced plans to invest in Anthropic under a new partnership that includes a $30 billion ...
Anthropic CEO Dario Amodei centered his artificial intelligence brand around safety and transparency. He's determined to ...
Anthropic calls the incident the "first reported AI-orchestrated cyber espionage campaign" and has attributed it with high ...
Last week, AI company Anthropic reported with ‘high confidence’ that a Chinese state-sponsored hacking group had weaponised ...