implemented q-chunking on top of it. offline only for now, but it already converges significantly faster: 84% at 50k steps vs 56% for vanilla FQL. online fine-tuning + harder envs coming next.
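for anyone unfamiliar with q-chunking, here is a rough sketch of the core idea as I understand it: the critic scores a whole chunk of h actions, so the TD target sums the h rewards collected while the chunk runs and bootstraps from the Q-value at the state h steps later. this is not the code from the post. the function name `chunked_td_target`, its arguments, and the zero-padding convention for rewards after termination are all assumptions made for illustration.

```python
import numpy as np


def chunked_td_target(rewards, next_q, gamma=0.99, done=None):
    """Hypothetical helper: h-step TD target for a critic over action chunks.

    rewards: array of shape (batch, h), rewards collected while executing one chunk
             (assumed zero-padded after the episode terminates).
    next_q:  array of shape (batch,), Q-value of the next chunk at the state h steps later.
    done:    optional array of shape (batch,), 1.0 if the episode ended inside the chunk.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    next_q = np.asarray(next_q, dtype=np.float64)
    h = rewards.shape[1]

    # Discounted sum of the rewards inside the chunk: sum_i gamma^i * r_{t+i}.
    discounts = gamma ** np.arange(h)
    chunk_return = (rewards * discounts).sum(axis=1)

    # Bootstrap from the next chunk's Q-value unless the episode terminated.
    if done is None:
        not_done = np.ones_like(next_q)
    else:
        not_done = 1.0 - np.asarray(done, dtype=np.float64)
    return chunk_return + not_done * (gamma ** h) * next_q
```

with h = 1 this reduces to the usual one-step TD target, which is a quick sanity check when wiring it into an existing FQL training loop.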