10.
Transformers as GEMM plus epilogue
Rewriting transformer operations as matrix multiplies plus epilogues suggests LLMs and humans can generate near-optimal kernels from a small primitive set
Editor’s note: imported_from_x_likes
1 appearance on the backlist front page in the last 30 days.
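To make the "matrix multiply plus epilogue" framing concrete, here is a minimal pure-Python sketch, not taken from the linked piece. It assumes an epilogue is a per-element function applied to each accumulator before it is written out. The names gemm_epilogue, linear_bias_relu, and linear_residual are hypothetical and chosen only for illustration. A real kernel would tile the loops and fuse the epilogue on the GPU, but the structure is the same.

```python
def gemm_epilogue(a, b, epilogue=None):
    """Compute C = A @ B, passing each accumulated value through
    epilogue(acc, row, col) before storing it (identity if None)."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # The epilogue runs once per output element, on the finished sum.
            out[i][j] = epilogue(acc, i, j) if epilogue else acc
    return out


def linear_bias_relu(x, w, bias):
    """Linear layer with bias and ReLU: one GEMM plus a bias+ReLU epilogue."""
    return gemm_epilogue(x, w, lambda acc, i, j: max(acc + bias[j], 0.0))


def linear_residual(x, w, residual):
    """Projection followed by a residual add: one GEMM plus an add epilogue."""
    return gemm_epilogue(x, w, lambda acc, i, j: acc + residual[i][j])


x = [[1.0, 2.0], [3.0, -4.0]]
w = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the arithmetic easy to follow
bias = [0.5, -0.5]

print(linear_bias_relu(x, w, bias))
print(linear_residual(x, w, x))
```

Both layers reuse the same multiply loop and differ only in the small function passed as the epilogue, which is the sense in which a small primitive set can cover many transformer operations.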