RQL is a new, clean algorithm for (offline) flow RL! The main idea is to treat flow steps as MDP steps, and use "reversed" flows to generate hindsight flow trajectories for off-policy data.