81.
AI labs have started developing systems to monitor internally deployed AI agents for misaligned behavior.
AI labs have started developing systems to monitor internally deployed AI agents for misaligned behavior. Earlier this year, I spent a month embedded at Anthropic stress-testing these systems, to see how easily current/future AIs could “go