Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning


Abstract: We argue that sixth-generation (6G) intelligence is not fluent token prediction but the capacity to imagine and choose: to simulate future scenarios, weigh trade-offs, and act with calibrated uncertainty. We reframe open radio access network (O-RAN) near-real-time (Near-RT) control via counterfactual dynamics and a world modeling (WM) paradigm that learns an action-conditioned generative state space. This enables quantitative "what-if" forecasting that goes beyond using large language models (LLMs) as the primary modeling primitive. Actions such as physical resource block (PRB) allocations are treated as first-class control inputs in a causal world model, and both aleatoric and epistemic uncertainty are modeled for prediction and what-if analysis. An agentic, model predictive control (MPC)-based cross-entropy method (CEM) planner operates over short horizons, using prior-mean rollouts within data-driven PRB bounds to maximize a deterministic reward. The model couples multi-scale structured state-space mixtures (MS3M) with a compact stochastic latent to form WM-MS3M, which summarizes key performance indicator (KPI) histories and predicts next-step KPIs under hypothetical PRB sequences. On realistic O-RAN traces, WM-MS3M cuts mean absolute error (MAE) by 1.69% versus MS3M with 32% fewer parameters and similar latency, and achieves 35-80% lower root mean squared error (RMSE) than attention/hybrid baselines with 2.3-4.1x faster inference, enabling rare-event simulation and offline policy screening.
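
The planning loop described in the abstract (a short-horizon CEM search over PRB sequences, scored by deterministic rollouts and clipped to PRB bounds) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the scalar KPI state, the function `toy_prior_mean_step` standing in for the learned WM-MS3M prior-mean transition, the reward shape, and all numeric values, including the PRB bounds, which the paper derives from data.

```python
import numpy as np


def toy_prior_mean_step(state, prb):
    """Hypothetical stand-in for a learned prior-mean transition.

    state: normalized KPI (e.g. throughput); prb: allocated PRBs.
    More PRBs raise the KPI with diminishing returns.
    """
    return 0.8 * state + 0.2 * np.log1p(prb)


def reward(state, prb, prb_cost=0.01):
    """Assumed deterministic reward: KPI level minus a linear PRB cost."""
    return state - prb_cost * prb


def cem_plan(state0, horizon=5, prb_min=1.0, prb_max=100.0,
             pop=200, n_elite=20, iters=10, seed=0):
    """Cross-entropy method over PRB sequences of length `horizon`."""
    rng = np.random.default_rng(seed)
    mean = np.full(horizon, 0.5 * (prb_min + prb_max))
    std = np.full(horizon, 0.5 * (prb_max - prb_min))
    for _ in range(iters):
        # Sample candidate PRB sequences and clip them to the allowed range.
        cand = rng.normal(mean, std, size=(pop, horizon))
        cand = np.clip(cand, prb_min, prb_max)
        # Deterministic rollout of every candidate from the same start state.
        state = np.full(pop, state0, dtype=float)
        returns = np.zeros(pop)
        for t in range(horizon):
            state = toy_prior_mean_step(state, cand[:, t])
            returns += reward(state, cand[:, t])
        # Refit the sampling distribution to the best-scoring candidates.
        elite = cand[np.argsort(returns)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6
    return mean


plan = cem_plan(state0=0.5)
# MPC-style use: apply only the first planned action, then replan.
first_action = float(plan[0])
```

In an MPC setting only `first_action` would be applied before replanning from the next observed state; a real planner would replace `toy_prior_mean_step` with rollouts of the trained world model.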


Comments:	13 Pages, 3 Figures, 4 Tables
Subjects:	Networking and Internet Architecture (cs.NI); Machine Learning (cs.LG)
Cite as:	arXiv:2511.02748 [cs.NI]
	(or arXiv:2511.02748v1 [cs.NI] for this version)
	https://doi.org/10.48550/arXiv.2511.02748 arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Farhad Rezazadeh [v1] Tue, 4 Nov 2025 17:22:22 UTC (766 KB)
