68.
Can LLMs reason in superposition? We introduce MUX, a method that turns text CoT into latent continuous reasoning. Instead of one-hot vectors as in CoT, the model now learns to predict weighted averages of several one-hot vectors, that we
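To make the contrast between a one-hot target and a weighted average of one-hot vectors concrete, here is a minimal sketch in plain Python. It is only an illustration of the general idea, not the MUX implementation: the vocabulary size, the weights, the toy `EMBEDDINGS` table, and the helper names `one_hot`, `superpose`, and `embed` are all assumptions made up for this example.

```python
# Illustrative sketch only; all names and numbers are hypothetical.

def one_hot(index, vocab_size):
    """Standard CoT-style target: all mass on a single token."""
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]


def superpose(token_weights, vocab_size):
    """Weighted average of several one-hot vectors.

    token_weights maps token index -> non-negative weight; the weights
    are normalized so the result sums to 1.
    """
    total = sum(token_weights.values())
    mixed = [0.0] * vocab_size
    for index, weight in token_weights.items():
        for i, value in enumerate(one_hot(index, vocab_size)):
            mixed[i] += (weight / total) * value
    return mixed


def embed(vocab_vector, embedding_table):
    """Map a vocabulary-sized vector to embedding space (one row per token)."""
    dim = len(embedding_table[0])
    return [
        sum(vocab_vector[t] * embedding_table[t][d] for t in range(len(vocab_vector)))
        for d in range(dim)
    ]


# Toy embedding table (assumed): 4 tokens, 2 dimensions.
EMBEDDINGS = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 3.0]]

cot_target = one_hot(2, 4)                     # a single discrete token
mixed_target = superpose({0: 3.0, 1: 1.0}, 4)  # tokens 0 and 1, weighted 3:1

print(embed(cot_target, EMBEDDINGS))    # the embedding row of token 2
print(embed(mixed_target, EMBEDDINGS))  # a blend of the rows for tokens 0 and 1
```

Because the embedding step is linear, feeding in a weighted average of one-hot vectors is the same as taking the weighted average of the corresponding token embeddings, which is one way to picture a continuous, "superposed" reasoning step.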