Imagine an AI that’s more than a query-answering machine; it’s a confidant that remembers our stories, suggests playlists based on last month’s mood, and evolves with our quirks without constant reminders.
This could be the beginning of personal AI, and it could split the market in two. At the core of this change are State Space Models (SSMs), a new architecture that can bridge today's massive but forgetful cloud models and tomorrow's always-on digital companions.
State Space Models
SSMs combine the efficiency of RNNs with the scalability of Transformers. At their core, SSMs process a sequence by updating a compact, fixed-size hidden state, i.e., a single vector that summarizes the entire conversation so far. Unlike Transformers, which attend over all previous tokens at every step (O(n²) cost over a sequence), SSMs scale linearly (O(n)), making them well suited to real-time, on-device AI.
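To make that concrete, here is a minimal sketch of such a scan in plain NumPy. It is not Mamba, S4, or any production SSM; the matrices are random stand-ins for learned parameters, and the point is only that each token costs one fixed-size state update.

```python
import numpy as np

# Minimal sketch of a linear state-space scan (illustrative only: not Mamba,
# S4, or any production SSM; the matrices are random stand-ins for learned ones).
rng = np.random.default_rng(0)

d_state, d_in, d_out = 16, 8, 8                   # fixed-size state, input/output dims
A = rng.normal(size=(d_state, d_state)) * 0.05    # state transition, kept small for stability
B = rng.normal(size=(d_state, d_in))              # input projection
C = rng.normal(size=(d_out, d_state))             # output projection

def ssm_scan(xs):
    """One O(1) state update per token, so O(n) for the whole sequence."""
    h = np.zeros(d_state)                         # compact summary of everything seen so far
    ys = []
    for x in xs:
        h = A @ h + B @ x                         # linear recurrence
        ys.append(np.tanh(C @ h))                 # nonlinearity outside the recurrence
    return np.stack(ys)

# A 10,000-token "conversation" still only carries the 16-number state forward.
tokens = rng.normal(size=(10_000, d_in))
print(ssm_scan(tokens).shape)                     # (10000, 8)
```

Whether the sequence has a hundred tokens or a million, the memory carried forward is just that one small state vector.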
Traditional RNNs struggled with vanishing gradients during training, making long-term memory nearly impossible. SSMs solve this by keeping the recurrence linear and applying nonlinearities outside the loop. This design allows gradients to flow stably through time, enabling models to learn from events millions of tokens ago.
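A toy scalar example shows why keeping the recurrence linear matters for gradients. With an (illustrative) transition value close to 1, an early token's influence decays slowly in the linear case, while a tanh-style recurrence squashes it toward zero:

```python
import numpy as np

# Toy scalar comparison: how much does the very first input still influence the
# state after T steps? (Purely illustrative numbers, not a trained model.)
T = 1000
a = 0.999                                   # transition value close to 1

# Linear recurrence h_t = a*h_{t-1} + x_t:
# the sensitivity of h_T to x_1 is exactly a**(T-1), which decays slowly.
linear_sensitivity = a ** (T - 1)

# tanh recurrence h_t = tanh(a*h_{t-1} + x_t):
# each step multiplies the gradient by a * tanh'(.) < 1, so it shrinks fast.
rng = np.random.default_rng(0)
h, grad = 0.0, 1.0
for x in rng.normal(size=T - 1):
    h = np.tanh(a * h + x)
    grad *= a * (1.0 - h ** 2)              # chain rule through the nonlinearity

print(f"linear recurrence: {linear_sensitivity:.3e}")   # roughly 3.7e-01
print(f"tanh recurrence:   {grad:.3e}")                 # vanishes toward 0
```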
Just a hypothesis: the coming AI split
I think AI will bifurcate into two markets:
1. The collective giants (stateless). Transformer-based systems like ChatGPT and Gemini will dominate general knowledge. They become proficient in reasoning and knowledge encoding, but after every session, they forget everything.
2. The personal confidants (stateful). SSMs will unlock a new market: personal models with persistent hidden states. These confidants may remember our history, evolve with us, and run locally for privacy and speed.
Why SSMs may fuel this market
These models offer linear scaling (long conversations without quadratic cost), a persistent hidden state, and a lightweight, privacy-first footprint that suits phones and wearables; a sketch of that persistent state follows the list below. Early sparks are already here, e.g.:
- IBM Granite 4.0: Blueprint for digital twin agents.
- NVIDIA Nemotron Nano 9B v2: On-device multimodal companions.
- Cartesia Sonic-3: Voice AI with vibe memory.
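As a hypothetical illustration of the "persistent hidden state" idea (the file name, dimensions, and step function below are made up for this sketch, not taken from any of the products above), a stateful companion could carry its state across sessions like this:

```python
import numpy as np
from pathlib import Path

# Hypothetical sketch of a "stateful companion": only the fixed-size hidden
# state is persisted between sessions, so personal context survives restarts
# without replaying the full conversation history.
STATE_FILE = Path("companion_state.npy")
D_STATE = 16

def load_state() -> np.ndarray:
    return np.load(STATE_FILE) if STATE_FILE.exists() else np.zeros(D_STATE)

def save_state(h: np.ndarray) -> None:
    np.save(STATE_FILE, h)

def step(h: np.ndarray, x: np.ndarray, A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """One linear state update; in a real model A and B would be learned."""
    return A @ h + B @ x

rng = np.random.default_rng(1)
A = rng.normal(size=(D_STATE, D_STATE)) * 0.05
B = rng.normal(size=(D_STATE, 4))

h = load_state()                          # resume from the last session
for x in rng.normal(size=(100, 4)):       # today's interaction, as toy inputs
    h = step(h, x, A, B)
save_state(h)                             # a few hundred bytes, easy to keep on-device
```

The state is tiny, so persisting it on a phone or wearable is cheap, and no conversation transcript ever needs to leave the device.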
Conclusion
I believe that SSMs won't dethrone Transformers but will instead redefine the market by enabling a new class of device. If this split holds, we'll see a bifurcation by deployment and architecture: powerful but forgetful collective tools will keep using pure, cloud-scale Transformer architectures for maximum reasoning power, while private, stateful companions will be unlocked by SSM-Transformer hybrids (or models with SSM-like efficiency) that provide persistent, low-cost memory on-device. This split gives users both the collective intelligence of the giant models and the personal history of the confidants.