35.
Happy to share our new ICML oral in Pretraining: OPUS!
It asks: which tokens should the model train on at each step? Instead of static data filtering, OPUS dynamically selects tokens based on optimizer-shaped updates. Less “more d