76.
Very cool paper on the "hacker-fixer loop" by @fjzzq2002 et al. (x.com)
Very cool paper on the "hacker-fixer loop" by @fjzzq2002 et al. A 3-agent LLM system that automatically hardens benchmark verifiers against reward hacking: 1. Hacker tries to pass the verifier without solving the task. 2. Fixer patche