70.
This is awesome! This behavior is exactly what we benchmark in http://CodeClash.ai, where LMs play against each other in 7 different arenas by writing code. I think there's *so* much more to do in this research direction, and the impacts w