Back in March, I tested whether Qwen3.5 4B was as good as GPT-4o for general chat. The result was that they were roughly similar. But lots of models have come out since then (Qwen3.6, Gemma4), so how do they stack up? The answer is decisive: they beat GPT-4o.