Qwen-AgentWorld: Language World Models for General Agents
A regular LLM acts as a "policy," mapping the current state to an action (state → action). Qwen's new LLM acts as a "world model," mapping the current state and a chosen action to a predicted next state ((state, action) → next state). Instead of deciding "what to do," its explicit objective is to predict the exact environment observation that will result from the interaction history and the agent's current action. I assumed at first that it was trained on synthetic data, but th...
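To make the policy versus world-model distinction concrete, here is a minimal Python sketch. It is not Qwen's code: the counter environment, `toy_policy`, `toy_world_model`, and `rollout` are all hypothetical stand-ins, with plain functions taking the place of the LLM calls. It only illustrates the two signatures and how they can be chained.

```python
from typing import Callable, List, Tuple

# Assumption: states are ints and actions are strings, purely for illustration.
State = int
Action = str
History = List[Tuple[State, Action]]

# A policy maps the current state to an action: state -> action.
Policy = Callable[[State], Action]

# A world model maps (interaction history, current state, action)
# to the predicted next state: (history, state, action) -> next state.
WorldModel = Callable[[History, State, Action], State]


def toy_policy(state: State) -> Action:
    """Hypothetical policy: increment until the counter reaches 3."""
    return "inc" if state < 3 else "noop"


def toy_world_model(history: History, state: State, action: Action) -> State:
    """Hypothetical world model: predicts what the environment returns.

    A real language world model would condition on the full history;
    this toy environment is simple enough that the history goes unused.
    """
    return state + 1 if action == "inc" else state


def rollout(
    policy: Policy, world_model: WorldModel, initial_state: State, steps: int
) -> Tuple[History, State]:
    """Alternate policy and world model without touching a real environment."""
    history: History = []
    state = initial_state
    for _ in range(steps):
        action = policy(state)                              # "what to do"
        next_state = world_model(history, state, action)    # "what happens next"
        history.append((state, action))
        state = next_state
    return history, state


if __name__ == "__main__":
    history, final_state = rollout(toy_policy, toy_world_model, 0, 5)
    print(history)
    print(final_state)
```

In this sketch the world model replaces the environment entirely, so the policy's actions are evaluated against predicted states rather than real ones.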