37.
Quantized JetBrains Mellum2-12B-A2.5B-Thinking to MXFP4 for Apple Silicon.
12B MoE with 2.5B active parameters; fits in 6.2 GB on disk and 7 GB peak memory.

On M5 Pro:
- Decode: 130 tok/s
- MATH-500: 80%
- HumanEval: 93%
- MMLU: 90%

Needs the open mlx-lm