Reasoning & Thinking Updates

2/10/2026

There's recent research I want to cover and write up, on how we can build:

  • TD-MPC2-like models, where there's a world model and an explicit planning & thinking loop, (ArXiv)[]
  • Dreamer-like models, which learn how to reason during training, but the validator-type world model gets thrown away once training is finished, (ArXiv)[]
  • Continuous Thought Machines, which have a decoupled internal time dimension so they can spend arbitrarily long compute per output token. They can dynamically change their own attention maps. They are more expensive to train, but they are a close high-level representation of how the human brain works. (ArXiv)[]
  • Latent space planning in LLMs, (ArXiv)[https://arxiv.org/pdf/2601.21598]
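
To make the "world model plus explicit planning loop" idea from the first bullet concrete, here is a minimal sketch. It is not TD-MPC2's actual method: it swaps in plain random-shooting model-predictive control, and `world_model`, `plan`, and `run_episode` are hypothetical names invented for this illustration. The "world model" is a hand-written toy function standing in for a learned dynamics and reward model.

```python
import random


def world_model(state, action):
    """Hypothetical stand-in for a learned dynamics + reward model.

    The state is a 1-D position. The reward is higher the closer
    the next state is to the origin.
    """
    next_state = state + action
    reward = -abs(next_state)
    return next_state, reward


def plan(state, horizon=5, num_candidates=256, rng=None):
    """Random-shooting planner (an assumption, not the TD-MPC2 planner).

    Samples candidate action sequences, rolls each one out inside the
    world model, and returns the first action of the best sequence.
    """
    rng = rng or random.Random(0)
    best_return = float("-inf")
    best_first_action = 0.0
    for _ in range(num_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        imagined_state, total = state, 0.0
        for action in actions:
            imagined_state, reward = world_model(imagined_state, action)
            total += reward
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action


def run_episode(start=3.0, steps=10, seed=0):
    """Replan at every step and execute only the first planned action."""
    rng = random.Random(seed)
    state = start
    trajectory = [state]
    for _ in range(steps):
        action = plan(state, rng=rng)
        # In this toy setup the "real" environment is the model itself.
        state, _ = world_model(state, action)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(run_episode())
```

The "thinking" here is the inner rollout loop in `plan`: more candidates or a longer horizon means more compute spent per action. A Dreamer-style setup would, roughly, use rollouts like these only during training and act from a learned policy afterwards, with no `plan` call at inference time.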

Then there are some even more implicit/passive strategies like:
