43.
I ported HRM-Text-1B to Apple MLX
On an M4 Max:

PyTorch MPS BF16: 22 tok/s
HRM-mlx BF16: 28 tok/s
HRM-mlx 4-bit: 53 tok/s

That's about 2.4x faster single-response decode for the 4-bit MLX build compared with the PyTorch MPS BF16 baseline (53 vs 22 tok/s). Hosted MLX BF16 and 4-bit checkpoints are available.
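For anyone checking the arithmetic, here is a small sketch that derives the speedup ratios from the three tok/s figures above. The `speedups` helper and the dictionary layout are my own illustration, not part of the HRM-mlx code, and the numbers are simply the ones reported in this post, not a fresh measurement.

```python
# Decode throughput as reported in the post, in tokens per second (M4 Max).
tok_per_s = {
    "PyTorch MPS BF16": 22,
    "HRM-mlx BF16": 28,
    "HRM-mlx 4-bit": 53,
}


def speedups(results, baseline):
    """Return each configuration's throughput relative to the baseline entry."""
    base = results[baseline]
    return {name: rate / base for name, rate in results.items()}


# Hypothetical helper usage: compare everything against the PyTorch MPS run.
ratios = speedups(tok_per_s, "PyTorch MPS BF16")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

With these inputs the BF16 MLX port comes out around 1.27x and the 4-bit build around 2.41x, so the headline 2.4x figure combines the port and the quantization rather than reflecting the port alone.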