Demis Hassabis, Google DeepMind CEO, just told the AI world that ChatGPT’s path needs a world model. OpenAI, Google, xAI, and Anthropic are all using the LLM (large language model) approach. Google’s Genie 3 system, released last August, generates interactive 3D environments from text.
CHAPTERS:
05:26 Demis Hassabis on AI’s Current Capabilities and Future
09:56 Demis Explains How AI Learns Physical Reality
12:09 Demis Predicts AGI in 5-10 Years, Addresses Energy
15:25 Societal Transformation, Disruption, and Personal Unwind
22:06 Hosts Debate AI’s Future: Language, Physics and Robots
26:28 Google’s Comeback in the Intense AI Race
28:28 Demis on Market Exuberance and Google’s Financial Strength
32:19 Geopolitics: China’s Rapid Catch-Up in AI
34:27 Hosts Analyze China’s AI and Google’s Financial Edge
38:08 How DeepMind Powers Google’s AI Products and Edge Devices
41:17 Demis Reflects on Google’s Vision and NVIDIA Partnership
45:05 Demis’s Vision for AI’s Golden Age of Discovery
47:13 Hosts Discuss Google AI’s Advantage and OpenAI Pressure
Some reports claim that AI agents trained inside simulated worlds can outperform regular agents by roughly 20-30% on reasoning tasks.
The interview covers Hassabis’s views on world models, their necessity for AGI, and his definition and vision of AGI, along with related topics: the limitations of current LLMs, the breakthroughs still needed, and timelines.
Tesla’s FSD (especially v12 onward) relies heavily on end-to-end neural networks and on what are often called, both internally and externally, world models: embodied AI systems that simulate and predict real-world physics, scenes, trajectories, and actions from vision data for driving planning and control.
Tesla has discussed world models in contexts like occupancy networks, synthetic data generation, and simulation for training (e.g., Ashok Elluswamy’s keynotes at the CVPR 2023 and ICCV 2025 workshops, where Tesla explored world models for autonomous driving and potential robotics extensions).
Demis Hassabis on AI’s Current Capabilities and Future
Hassabis said scaling laws still remain effective: more compute, more data, and larger models yield meaningful capability gains. The gains are slower than in the peak years but are still far from zero returns.
Current AI systems show jagged intelligence—excelling in some areas (language, certain reasoning) but failing inconsistently on others (simple tasks if phrased differently).
Missing capabilities for true generality: continual/online learning, true originality/innovation, consistent performance, long-term planning, and better reasoning.
To reach AGI, scaling alone may not suffice. One or two major innovations (like AlphaGo-level breakthroughs) are likely still needed beyond current architectures.
Demis Explains How AI Learns Physical Reality
World models are a core passion and a likely missing piece. AI must build internal simulations of the world’s physics, causality (how one thing affects another), and higher-level domains (biology, economics) to understand reality deeply.
LLMs/foundation models (Gemini) handle multimodal data (text, images, video, audio) but lack true understanding of physics, causality, long-term planning, or hypothesis testing via mental simulation.
Humans (especially scientists) use intuitive physics and mental simulations to test ideas; current AI cannot generate novel hypotheses or new scientific conjectures independently.
DeepMind’s work includes early/embryonic world models like Genie (interactive world generation) and video models (Veo), which imply understanding—if realistic generation is possible, the model grasps world dynamics.
Vision: Future AGI will converge foundation models (like Gemini) with world model capabilities for integrated, powerful systems (not world models fully superseding LLMs, but enhancing them).
Demis Predicts AGI in 5-10 Years, Addresses Energy
His AGI definition/vision: a system exhibiting all human cognitive capabilities—true innovation/creativity, planning, reasoning, consistent/general performance across domains, continual learning, and the ability to understand and explain the world (new scientific theories via simulation and hypothesis testing). It goes beyond passive prediction to active understanding, invention, and autonomous action.
Still 5–10 years away (consistent with DeepMind’s 2010 founding vision of a roughly 20-year mission); progress has accelerated dramatically.
The bottlenecks are compute/chip shortages and energy constraints (intelligence is increasingly tied to energy availability).
AI itself aids the solutions: efficiency gains, better materials and solar, fusion plasma control via collaborations like Commonwealth Fusion, and possibly room-temperature superconductors.
For efficiency, models like Gemini Flash use distillation (big models teach smaller ones), yielding roughly 10x better performance per watt annually.
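The distillation idea mentioned above can be sketched in a few lines. This is a minimal toy illustration, not Google's actual training pipeline: the student is trained to match the teacher's temperature-softened output distribution, following the standard formulation. The logits and temperature values are made-up examples.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about non-top classes
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL divergence between softened teacher and student distributions,
    # scaled by T^2 so gradients stay comparable across temperatures
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

teacher = [4.0, 1.0, -2.0]   # confident "big model" logits (hypothetical)
student = [1.0, 0.5, 0.0]    # smaller model, still uncertain
loss = distillation_loss(teacher, student)  # > 0 until the student matches
```

In practice this loss is minimized by gradient descent over the student's weights, usually mixed with a standard cross-entropy term on ground-truth labels.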
Societal Transformation, Disruption, and Personal Unwind
AGI’s impact will be transformative, like the Industrial Revolution but 10x bigger and 10x faster, with massive benefits (curing diseases via the Isomorphic Labs/AlphaFold spinout, energy breakthroughs, solving climate/poverty/water/health/aging).
The AGI risks: economic disruption (new economic models needed), bad actors misusing AI, and loss of control of autonomous/agentic systems (guardrails are essential).
Hassabis is cautiously optimistic. He believes in human ingenuity and a safety focus (DeepMind planned for powerful systems from 2010 and applies the scientific method to understanding and deploying them responsibly). He does not advocate a slowdown, given geopolitical and corporate race dynamics; instead, push the frontier responsibly and serve as a role model.
Hosts Debate AI’s Future: Language, Physics and Robots
LLMs are strong on language but weak on physical-world understanding (causality, the needs of robotics).
World models are rising in importance for robotics and autonomous driving; convergence with LLMs could enable true generality.
Criticism of LLMs: Limitations in novelty/original ideas (echoing LeCun’s views). World models address this by enabling simulation-based hypothesis testing.
The robotics challenge is training agents; world models are key for autonomous operation beyond teleoperated puppets.
Google’s Comeback in the Intense AI Race
Google has caught up with, or surpassed, rivals in some areas (the Gemini series on leaderboards). A reorg integrated research (Google Brain + DeepMind under Hassabis), with scrappier commercialization and a tight loop with Sundar Pichai for roadmap alignment.
Demis on Market Exuberance and Google’s Financial Strength
The AI bubble question is not binary: there is some overvaluation and there are seed rounds with little substance, but fundamentals are strong. It is a transformative technology, like the internet and electricity.
Google has a strong balance sheet, cash flow, and user products (Gemini integration across the ecosystem) to weather any correction.
China’s Rapid Catch-Up in AI
China is closer than many thought (months behind the frontier models), with DeepSeek and Alibaba leading in open source.
Innovation beyond the frontier (new architectures like the Transformer) is harder; the prevailing mentality/culture favors scaling over exploratory breakthroughs.
China is fast-moving and full of experts, and chip restrictions may not halt its progress long-term. Google remains resilient via cash and products.
How DeepMind Powers Google’s AI Products and Edge Devices
Google DeepMind is the engine room: all AI research is diffused into products (e.g., fast shipping of Gemini into Search).
There is strong interest in efficient models for phones and glasses (universal assistants), with partnerships (Samsung, Warby Parker).
Demis Reflects on Google’s Vision and NVIDIA Partnership
He has no regrets about Google’s 2014 acquisition of DeepMind. Google’s backing enabled breakthroughs (AlphaGo, AlphaFold), and it was a natural fit with the mission.
Demis admires Jensen Huang. AI-for-science is important. Google uses both TPUs (internal scaling) and GPUs (exploration).
Demis’s Vision for AI’s Golden Age of Discovery
He sees dozens of AlphaFold-like revolutions (AlphaFold being the Nobel Prize-winning protein-folding system) in materials, physics, math, and weather.
2026 will see reliable agentic/autonomous systems, robotics advances (Gemini Robotics), and on-device AI, plus further world-model efficiency for planning and integration.
If progress and safety are handled well, there will be a golden age of science.
Yann LeCun Has Been Saying World Models Are Needed for AGI
Yann LeCun, Meta’s former Chief AI Scientist and a pioneer in deep learning, left Meta at the end of 2025 after more than a decade there. He founded and directed FAIR for five years and served as Chief AI Scientist for seven. It was a voluntary exit, driven by fundamental disagreements with Meta’s AI direction: Zuckerberg made a heavier bet on large language models (LLMs), which LeCun has long criticized as a “dead end” for achieving human-level or advanced AI.
LeCun advocates for world models, but Meta sidelined much of his work at FAIR in favor of LLM-focused efforts. Tensions escalated around issues like alleged benchmark manipulation on Llama 4.
Zuckerberg reportedly lost confidence in parts of the GenAI org, leading to restructurings.
In 2026, LeCun is now Executive Chairman at his new startup, Advanced Machine Intelligence Labs (aka AMI Labs). This venture directly continues his Advanced Machine Intelligence research program, focusing on world models and related architectures for systems that “understand the physical world, have persistent memory, can reason, and can plan complex action sequences.”
The company hired Alex LeBrun (formerly CEO of Nabla) as CEO and is seeking a high valuation in the $3–5B+ range, per late-2025 discussions. Recent updates indicate the startup will be based in Paris.
LeCun’s specific world model pursuits (like JEPA architectures and energy-based models) lost priority and resources at Meta before his exit. Meta has expressed interest in partnering with his new firm, suggesting some ongoing collaboration rather than full severance.
World models represent one of the most active frontiers in AI research, widely viewed as a critical missing ingredient for advancing toward AGI (as emphasized by Demis Hassabis at DeepMind and Yann LeCun at AMI Labs). These models aim to learn latent representations of the physical world—capturing intuitive physics, causality, object permanence, dynamics, spatial reasoning, and higher-level abstractions (e.g., biology/economics)—enabling simulation, prediction, planning, and embodied action far beyond text-prediction in LLMs.
Current world models are mostly video/video-generation-based or latent dynamics learners (e.g., autoregressive frame prediction, diffusion in latent space, or JEPA-style predictive embeddings). They learn implicit physics from data (videos, robotics trajectories) rather than encoding explicit rules. Truly integrated, persistent, multimodal, long-horizon world models for general planning and embodiment remain nascent, with major labs racing toward convergence (foundation models + world models + agents).
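To make the "latent dynamics learner" idea concrete, here is a deliberately tiny sketch in the spirit of JEPA-style predictive embeddings: observations are encoded into a latent space, and a dynamics map is fit so that the prediction loss lives in latent space rather than raw observation (pixel) space. Everything here is a toy assumption — a linear world, a frozen random encoder, and least squares standing in for learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy world: observation is (position, velocity) in 2-D; the true dynamics
# are linear (pos' = pos + vel, vel' = vel), but the model never sees them.
def true_step(obs):
    pos, vel = obs[:2], obs[2:]
    return np.concatenate([pos + vel, vel])

# Frozen encoder: maps a raw observation to a latent embedding.
# (A real JEPA-style model learns the encoder jointly; a fixed random
# projection keeps this sketch short.)
W_enc = rng.normal(size=(4, 4))
def encode(obs):
    return W_enc @ obs

# Collect (latent, next-latent) pairs and fit a latent dynamics map by
# least squares -- the "prediction in representation space" step.
Z, Z_next = [], []
for _ in range(200):
    obs = rng.normal(size=4)
    Z.append(encode(obs))
    Z_next.append(encode(true_step(obs)))
Z, Z_next = np.array(Z), np.array(Z_next)
P, *_ = np.linalg.lstsq(Z, Z_next, rcond=None)  # latent dynamics matrix

pred_err = float(np.mean((Z @ P - Z_next) ** 2))  # near zero for this linear toy
```

Because the toy world and encoder are both linear, an exact latent dynamics map exists and the residual collapses to numerical noise; real world models face nonlinear dynamics and must learn the encoder and predictor jointly, with tricks to avoid representation collapse.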
xAI Grok’s Position on Physics/World Models
xAI is actively pursuing world models as a strategic priority, explicitly to overcome LLM limitations in physical understanding. Reports from late 2025 indicate xAI hired ex-NVIDIA specialists for this, with applications in gaming (AI-generated interactive 3D environments) and robotics (likely tied to Tesla’s ecosystem via shared data/compute). Grok models (Grok 4/5) incorporate multimodal data, including video and robotics trajectories, to build causal/physics awareness. Grok also benefits indirectly from Tesla’s massive real-world data (FSD fleet videos, sensor streams), which trains implicit world simulations for physics/causality in driving and robotics.
Elon Musk has claimed Grok could “discover new physics” by 2026, with Grok 5 (Jan 2026 release) positioned as potentially AGI-capable with strong real-world grounding.
There is no public standalone Grok world model yet (unlike DeepMind’s Genie), but xAI’s focus is on large-scale, physics-grounded multimodal systems for agents, robots, and games.
Tesla’s FSD and Optimus lead in embodied physics. Tesla uses a unified neural world simulator (physics-realistic, general-purpose) to generate synthetic data and videos for training both. It learns dynamics from fleet data, enabling transfer (e.g., Optimus navigation in simulated factories). This is state-of-the-art for real-world physics in robotics/autonomy and far ahead in deployment scale, though narrower (vehicle/humanoid tasks) than general-purpose models.
Google DeepMind — Leading in interactive/general world models. Genie 3 (2025) is the first real-time, action-controllable 3D foundation world model (autoregressive, learns physics from observation, consistent for minutes at 24fps/720p). Used for agent training (SIMA). Veo 3.1 adds audio/video consistency. Strongest in scalable, emergent physics simulation.
OpenAI — Pioneered video-as-world-simulation with Sora (2024 onward). Sora/Sora 2/3 treat scaled video generation as “general purpose simulators of the physical world.” Rumors of Genie-like interactive extensions; strong implicit physics but criticized for inconsistencies in complex dynamics.
Anthropic — Lags in explicit world models/physics; Claude focuses on reasoning/safety in text/multimodal. Some vision/physics benchmarks improving (e.g., figure interpretation), but no dedicated world model push—more tool/LLM-centric.
Fei-Fei Li’s World Labs — Commercial leader with Marble (Nov 2025 launch): Multimodal (text/image/video/3D inputs) generative world model for persistent, editable/downloadable 3D environments (Unity/Unreal compatible, VR support). Focuses on spatial intelligence for storytelling/creativity/robotics; positions as “first step” toward true spatial reasoning.
Others leading: Meta (pre-LeCun exit): V-JEPA 2 (Jan 2026) excels in visual understanding and robotics (65-80% pick-and-place success with minimal data).
Runway: GWM-1 (Dec 2025) for explorable environments and robotics training.
NVIDIA: Cosmos/GR00T open models and datasets lead robotics downloads on Hugging Face; focus on physical AI.
Yann LeCun’s AMI Labs (post-Meta): JEPA-based, physics-grounded systems; early but high-potential for predictive world understanding.
World Model Research
World models address core LLM flaws: lack of grounding, poor long-horizon causality, and no mental simulation for hypothesis testing or planning. SOTA (Jan 2026) shows emergent intuitive physics (gravity, collisions, permanence) from video/robot data, but explicit reasoning and planning over long horizons remain weak—models “understand” via prediction, not symbolic manipulation.
Scaling video + robotics data + latent architectures (autoregressive transformers, diffusion, JEPA) drives progress; interaction (Genie-style) is the leap for agents.
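The planning role a world model plays can be illustrated with a minimal sketch: an agent "mentally simulates" candidate action sequences inside its model and keeps the best one. This is a random-shooting planner over hand-coded toy dynamics; in a real system the `model_step` function would be a learned world model, and the goal, horizon, and candidate count here are all made-up values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in "world model": a 1-D point mass; the action is an acceleration,
# the state is (position, velocity). In practice this hand-coded physics
# would be replaced by a learned latent dynamics model.
def model_step(state, action):
    pos, vel = state
    vel = vel + 0.1 * action
    return (pos + 0.1 * vel, vel)

def rollout_cost(state, actions, goal=5.0):
    # Mentally simulate the whole action sequence, then score how close
    # the final position lands to the goal (lower is better)
    for a in actions:
        state = model_step(state, a)
    return abs(state[0] - goal)

def plan(state, horizon=20, candidates=256):
    # Random-shooting planner: sample candidate action sequences, simulate
    # each inside the model, and keep the best -- hypothesis testing by
    # simulation rather than acting in the real world
    seqs = rng.uniform(-1.0, 1.0, size=(candidates, horizon))
    costs = [rollout_cost(state, s) for s in seqs]
    best = int(np.argmin(costs))
    return seqs[best], costs[best]

best_seq, best_cost = plan((0.0, 0.0))  # beats doing nothing (cost 5.0)
```

Stronger planners (cross-entropy method, MPC, Monte Carlo tree search) refine the same loop; the key point is that every candidate is evaluated cheaply inside the model instead of in the real environment.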
2026 expected progress: rapid commercialization (Marble-like tools in gaming/AR/VR).
Interactive horizons extend (5-10+ min consistent worlds). Integration with agents (Gemini + Genie for embodied tasks). Robotics breakthroughs via sim-to-real transfer (Tesla leads deployment). Chinese open models (Qwen/DeepSeek multimodal) surge on HF.
2027-2028
Convergence toward unified foundation world models (text + vision + action + persistent memory). Reliable long-horizon planning/simulation for AGI steps. Embodied AGI prototypes (humanoid-level manipulation in novel environments). If scaling + new architectures (eJEPA hybrids) succeed, physics/causality could reach near-human levels, enabling scientific discovery/robust autonomy.
Key Hugging Face/trending papers (2025-2026)
These include JEPA/V-JEPA evaluations, DreamerV3 (the Nature-published agent-imagination work), IntPhys benchmarks (intuitive physics), and robotics-focused work (NVIDIA/GR00T integrations). The field is shifting from LLM scaling to “embodied/world-grounded” scaling. DeepMind and Tesla lead deployment; World Labs and AMI innovate conceptually.

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.