62.
Anthropic’s recent interp work is awesome. A few months ago, I felt strongly that AI companies needed to make faster progress understanding *why* models engage in behaviors researchers tried to prevent. And they’re making progress faster th