39.
Tested HRM (looped hierarchical transformer) vs a standard stacked transformer for speech generation at equal params.

Setup:
• ~15M params
• EnCodec audio tokens
• 20h LibriTTS-R
• 3 seeds
• ~$15 compute

Results:
• Stacked transformer: 3.