So you were asking whether gains from coding would generalize to other domains? We found GLM-5.2 to be no better than GLM-5.1 on FutureSim. The gap between open-weights and closed-weights models here is massive! Also, despite Fable-5 being contaminate