
If there is one thing that the AI model builders and the neoclouds both agree that they do not want to worry about, it is storage. The hyperscalers and cloud builders have created their own disparate kinds of storage and think they already know everything. They know a lot, to be sure, and they know their own workloads better than anyone else does, and how to run them at tremendous scale, but they don’t know everything.
So for any upstart to get big-time traction in the GenAI revolution, one could argue that selling to the neoclouds and the model builders – who set the terms of the infrastructure they have others stand up so they can rent it – is the best way to build a business and then get the attention of the hyperscalers and the cloud builders.
Which is why we have seen Vast Data, DataDirect Networks, and WekaIO all chasing deals with the neoclouds and model builders, with Pure Storage and even IBM with Spectrum Scale (formerly known as GPFS) in the hunt as well. The big clouds – Amazon Web Services, Microsoft Azure, and Google – all have managed Lustre parallel file system services that they are aiming at HPC and sometimes AI workloads. Oracle, which is not as big as the hyperscalers but is bigger than the neoclouds, also has its own managed Lustre service, but also has a partnership with WekaIO to hedge its bets and to appeal to customers who have had just about enough of Lustre, which has a reputation for being cranky.
Vast Data has partnerships with the main neoclouds – CoreWeave, Crusoe, Lambda Labs, Nebius, and Nscale – and more will no doubt follow as niche neoclouds emerge to serve niches and nationalities.
But as of this week, CoreWeave has become Customer Number One for Vast Data, even surpassing the importance of the deployment of Vast Data storage by xAI for its “Colossus” GPU cluster in its Memphis datacenter. Arguably, it was that deal with xAI in late 2024 that was the tipping point for Vast Data, given that the initial Colossus system had over 100,000 Nvidia “Hopper” H100 GPUs and reportedly well over an exabyte of flash storage to train xAI’s Grok family of large language models. But the $1.17 billion deal that Vast Data has inked with CoreWeave is taking this all to a new level.
The thing to remember about this deal is that it spans multiple years – our guess is five years, but it could be shorter or longer, and Vast Data is not saying. The revenue agreement covers software licenses for the company’s “universal storage” layer, which runs atop disaggregated flash servers, as well as for the higher level checkpointing, KV caching, streaming, database, and other data platform services that create what the company calls “the AI operating system” and that the industry typically refers to as a data platform. CoreWeave is going to have to go to an OEM or ODM to get servers, storage, and networking hardware to run Vast Data’s software, which implies a total value of data platform investments by CoreWeave of several billion dollars.
Somebody is going to get some orders for a lot of flash-laden and CPU core-laden servers. . . .
Like scale-out networking, storage is a relatively small portion of the AI cluster budget these days, but we think this might be changing. As we pointed out earlier this week in More Upward Revisions On AI Infrastructure Spending, where we analyzed the most recent AI hardware, software, and services spending forecasts from IDC, it looks like only 1.9 percent of AI spending will be for storage. That figure was for 2029, the year for which we had enough details to make a guess based on what IDC said about spending in other areas and in general. This seemed a little light to us, and according to Vast Data co-founder Jeff Denworth, it is.
“I would say three percent to five percent is the average for a neocloud,” Denworth tells The Next Platform. “The reason for this is that a lot of the neoclouds do not have the comprehensive data processing platform that is often found at one of the tier one clouds, which have all these fancy data services built out. This is one of the reasons why these neoclouds like Vast Data: product managers are looking for ways to sell more than just flops by the hour. Our capability allows them to not have to stitch together a dozen different things.”
As a case in point, Denworth says that Vast Data was working with one of the big AI labs – what we call model builders, though which one he was not at liberty to say – that was doing reinforcement learning on a popular AI service. That service generates 100 GB/sec of event streams, which is not a lot of I/O for a storage system but which is a lot for Kafka streams. In this case, the Kafka event I/O was so intense that the lab could not build a Kafka cluster big enough to support it, so it was considering engineering its own streaming system. It made more sense to use the Vast Event Broker API, which makes Kafka applications think they are talking to Kafka when they are actually talking through a compatible API to the native underlying universal, disaggregated, shared everything flash array. On like-for-like server hardware, the Vast Event Broker can handle 10X the Kafka streams, as the company told us when this layer of the AI operating system was unveiled back in February.
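To make the compatibility point concrete, here is a minimal sketch of what that looks like from the application side, written in Python with the standard confluent-kafka client; the broker address, topic name, and event payloads are our own illustrative placeholders, not anything published by Vast Data:

```python
# Minimal sketch: a stock Kafka producer pointed at a Kafka-compatible
# endpoint. Nothing below is specific to Vast Data -- and that is the
# point: the application cannot tell it is not talking to real Kafka.
from confluent_kafka import Producer

# Hypothetical address; in a real deployment this would be the
# Kafka-compatible endpoint exposed by the event broker service.
producer = Producer({"bootstrap.servers": "event-broker.example.internal:9092"})

def on_delivery(err, msg):
    # Standard Kafka delivery callback, fired once the broker acks the write.
    if err is not None:
        print(f"delivery failed: {err}")

# Produce events exactly as you would against a vanilla Kafka cluster.
for i in range(1000):
    producer.produce("rl-events", value=f"event-{i}".encode(), callback=on_delivery)

producer.flush()  # Block until all queued messages are acknowledged.
```

Because the compatibility lives at the protocol and API level, existing Kafka tooling – producers, consumers, connectors, monitoring – carries over unchanged; the scalability claim rests on the storage layer underneath, not on anything the client does differently.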
Another Vast Data feature – fast checkpointing – will help any customer of a neocloud avoid what would otherwise be very expensive downtime when a failed GPU, a flaky network card, or a software bug takes down a job on a large AI cluster. In an AI training run, when one GPU can’t do math, the training run stops dead. That Vast Data can support KV caching (which boosts AI inference performance), database tables (including vectors created from input tokens), block storage, and object storage means that neoclouds like CoreWeave can offer more services on the same disaggregated storage.
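The economics of checkpointing are easy to sketch: the faster a checkpoint can be written to shared storage, the more often one can be taken, and the less GPU work has to be replayed after a failure. Here is a minimal, PyTorch-flavored sketch of the pattern, with the mount point and interval as our own illustrative assumptions:

```python
# Minimal sketch of periodic checkpointing to a shared file system.
# The mount point and interval are illustrative assumptions; the faster
# the storage absorbs the write, the shorter the interval can be and
# the less compute is lost when a run has to restart.
import torch

CHECKPOINT_PATH = "/mnt/shared-flash/ckpt.pt"  # hypothetical shared mount
CHECKPOINT_EVERY = 100                         # steps between checkpoints

def save_checkpoint(model, optimizer, step):
    # Training stalls while this write is in flight, which is why
    # checkpoint bandwidth translates directly into GPU utilization.
    torch.save(
        {"step": step,
         "model": model.state_dict(),
         "optimizer": optimizer.state_dict()},
        CHECKPOINT_PATH,
    )

def train(model, optimizer, data_loader):
    for step, batch in enumerate(data_loader):
        loss = model(batch).mean()  # stand-in for a real loss computation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        if step % CHECKPOINT_EVERY == 0:
            save_checkpoint(model, optimizer, step)
```

If a failure costs, on average, half a checkpoint interval of lost work across the whole cluster, then cutting checkpoint write time from minutes to seconds lets that interval shrink and pays for itself very quickly on a cluster of expensive GPUs.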
Without being specific, Denworth indicated that this is the plan for CoreWeave, which launched an object storage platform a few weeks ago that will, in fact, be running on Vast Data software. He says that Vast Data and CoreWeave have inked a collaboration agreement between the engineering teams of the two companies to come up with services that CoreWeave can sell or bundle into its offerings and that, presumably, will give Vast Data improvements to its existing product or entirely new features.
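To show what an S3-compatible object service buys a customer in practice, here is a minimal sketch using the stock boto3 client; the endpoint URL, bucket, and object key are hypothetical placeholders of our own, not details of CoreWeave’s actual service:

```python
# Minimal sketch of talking to an S3-compatible object store.
# The endpoint, bucket, and key are illustrative assumptions; the point
# is that unmodified S3 tooling works against a compatible backend.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://object.example-neocloud.net",  # hypothetical endpoint
)

# Standard S3 calls, unchanged from what you would run against AWS.
s3.put_object(Bucket="training-data", Key="shard-0001.tar", Body=b"example bytes")
obj = s3.get_object(Bucket="training-data", Key="shard-0001.tar")
print(obj["ContentLength"])
```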
“I’m not going to speak on behalf of CoreWeave – this is their service offering,” says Denworth. “But there is a lot of active work going on at the engineering level on many different dimensions.”
Which brings us back to a point we were making five years ago, several years before this GenAI boom happened. We were arguing back then that AI system architects needed to think more about storage and the networks linking it to GPU compute engines if they ever hoped to drive up the utilization of these very expensive compute engines. And they needed to do this for economic as well as technical reasons, because the only thing as expensive as an Nvidia GPU in the datacenter is a single core on IBM’s System z mainframes. (See Storage Can’t Be An Afterthought With AI Systems for more on that.) No one can afford to have GPU systems underutilized given their high cost. All credit is due to Big Blue here: most mainframe shops run at well over 98 percent CPU utilization for years at a time without downtime, and that is because the auxiliary I/O subsystems are very wide and very fast and matched precisely to the memory and compute subsystems in the “main frame” and to the I/O-intensive batch and OLTP workloads run on this big iron.
As GPU systems scale up and scale out, the need for storage that can keep pace is all the more important.
Which is why we think that somewhere between 3 percent and 5 percent of the cost of an AI cluster still seems like a low-ball amount of spending for a data platform underpinning AI compute. That said, 3 percent to 5 percent of something on the order of $3 trillion to $4 trillion in AI cluster spending between now and the end of the decade, which was the last number being touted by Nvidia co-founder and chief executive officer Jensen Huang back in August, is still a very big number – somewhere between a low of $90 billion and a high of $200 billion for data platforms attached to AI systems over five years. The OEM disk and flash array market as a whole will generate about $35.2 billion this year, according to IDC estimates, and spanning 2025 through 2030 inclusive at current growth rates of around 2.5 percent a year, that works out to $225 billion for traditional storage. So, AI system storage will make up somewhere between 30 percent and 50 percent of total storage revenues worldwide, but that AI storage spending will still be utterly dwarfed by AI compute spending.
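For those who want to check the back of the envelope, the arithmetic behind those figures runs like this, with every input taken from the numbers cited above:

```python
# Back-of-the-envelope math from the figures cited above.

# AI data platform spending: 3 to 5 percent of $3T to $4T in AI cluster spending.
ai_low = 0.03 * 3.0e12   # $90 billion
ai_high = 0.05 * 4.0e12  # $200 billion

# Traditional OEM disk/flash arrays: $35.2B this year, growing ~2.5% a year,
# summed over 2025 through 2030 inclusive (six years).
traditional = sum(35.2e9 * 1.025 ** year for year in range(6))  # ~$225 billion

print(f"AI data platforms: ${ai_low / 1e9:.0f}B to ${ai_high / 1e9:.0f}B")
print(f"Traditional storage, 2025-2030: ${traditional / 1e9:.0f}B")

# AI storage as a share of the combined storage pie.
print(f"AI share of total: {ai_low / (ai_low + traditional):.0%} "
      f"to {ai_high / (ai_high + traditional):.0%}")  # roughly 29% to 47%
```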
Unless something changes, as we think it might. Imagine, if you will, an ultraconverged platform that brings AI storage and AI compute literally together under the same skins. . . .