@ShumingHu on Backlist

1 appearance on the backlist front page in the last 30 days.

45.

(t.co)

A quick repro on this: https://github.com/shuminghu/next lat …

2-layer transformer trained at seq_len 12 or 36 fail at seq_len 36 at test

1-layer dynamics model (RNN) co-trained with transformer (1-step next hidden prediction) at seq_l

by (Shuming Hu) · backlist 2026-06-23 · rubric 100.0