Many people are worried that AI agents are going to differentially underperform on safety research (even if they're not scheming) because (i) RL generalizes poorly to hard-to-verify tasks and (ii) AI safety research is harder to verify than