
Techmeme
shared a link post in group #Stream of Goodies

www.techmeme.com
Anthropic researchers: AI models can be trained to deceive and the most commonly used AI safety techniques had little to no effect on the deceptive behaviors
By Kyle Wiggers / TechCrunch. View the full context on Techmeme.