50. Today we’re releasing Refactoring, the final leaderboard of our SWE Atlas suite. by @ScaleAILabs (Scale Labs) · backlist 2026-05-07 · rubric 92.0
90. Refactoring is hard, even for frontier agents. SWE Atlas refactors are 2× the size of SWE-Bench Pro and 30× SWE-B… by @ScaleAILabs (Scale Labs) · backlist 2026-05-07 · rubric 89.0