Built by the team at Intuit that created ArgoCD, Numaflow is a Kubernetes-based open source stream processing engine with a UI that allows engineers to easily compose data processing pipelines. No experience in Kubernetes necessary.
Built for high-throughput workloads, Numaflow connects to Kafka, Pulsar and SQS, and can analyze, filter or process the stream of data before sending it along to its destination. Easily scalable, it will work as fast as you need.
Last week, at the Kubecrash 2025 virtual conference, two Intuit team members on the project described how Numaflow could be used for running AI pipelines.
The Role of Stream Processing in AI
Think of stream processing as the backbone of AI.
Turns out there is a lot of event processing in AI: feature engineering, where features are computed and fed to the model; inferencing, where a trained model makes predictions; and, of course, training, where models are updated with the latest data.
A real-time stream processing platform is essential if “you want to understand or process events and then try to respond as they’re happening,” said Sriharsha Yayi, Numaflow product manager for Intuit. For instance, user behavior could be tracked in real time to provide recommendations. Fraudulent activity can be thwarted while it is still going on.
Yet building data processing pipelines can be a thorny task, let alone making them scalable and real time.
Common Challenges in Event Processing on Kubernetes
Numaflow set out to solve a number of challenges with event processing on Kubernetes, Yayi said.
For one, data engineers who know procedural logic often weren't familiar with the Java and Scala platforms they had to build on, and neither were the many other developers who wanted to tie into a stream engine.
“We have observed where people wanted to have a stream processing capability or framework that is beyond Java,” Yayi said.
Also, setting up an entire data stream for processing involved writing a lot of boilerplate code, such as the integration logic that gets duplicated across multiple messaging queues.
“If I’m a developer or maybe an ML [machine learning] guy, why should I really spend a lot of time writing these integrations again and again whenever I write these new pipelines or consumers?” Yayi asked.
Lastly, scaling is a hurdle. In event processing, the need to scale is measured by the event backlog, but it had to be expressed to the Kubernetes Horizontal Pod Autoscaler as the number of additional pods needed at that moment. Some users even hand-tuned pod counts when traffic surged.
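To picture the pain, the old approach meant maintaining something like the following HorizontalPodAutoscaler. This is a hedged sketch: the consumer deployment name and backlog metric are hypothetical, and it assumes a metrics adapter is already exposing queue lag as an external metric.

```yaml
# A sketch of the HPA users had to maintain themselves: translating
# "messages waiting in the queue" into a pod count, then hand-tuning the bounds.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: event-consumer-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: event-consumer              # hypothetical consumer deployment
  minReplicas: 2                      # hand-tuned floor
  maxReplicas: 20                     # hand-tuned ceiling for traffic surges
  metrics:
    - type: External
      external:
        metric:
          name: kafka_consumergroup_lag   # assumes an adapter exposes backlog externally
        target:
          type: AverageValue
          averageValue: "1000"            # target backlog per pod
```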
How Numaflow Solves Common Stream Processing Challenges
“Numaflow is a serverless platform for stream processing,” explained Krithika Vijayakumar, Intuit senior software engineer. It was designed to hide (“abstract”) all the infrastructure bits away from the data engineers.
Numaflow allows ML engineers to “focus just on their stream processing or inferencing, and eliminate the need for them understanding the underlying infrastructure,” Vijayakumar said.
It also whisks away the need to learn the event processing complexities behind concepts such as sinks and sources, abstracting them down to a single data object.
“We realize that ML engineers are focused largely on the payload, and they don’t really care about where they are reading the data from. ‘Is it Kafka? Or is it Pulsar? Or is it HTTP?’” Vijayakumar elaborated.
So, details around the sinks and sources are hidden from the engineers, who can get back to worrying about their inferencing and processing logic. Users write their inference logic as user-defined functions (UDFs).
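To make that concrete, here is a hedged sketch of what a source swap looks like in a Numaflow pipeline spec: changing from HTTP to Kafka touches only the source vertex, while the UDF holding the inference logic stays exactly as it was. The broker address and topic name below are hypothetical.

```yaml
# Only the source vertex changes; the UDF vertex is untouched.
- name: in
  source:
    kafka:                  # was: http: {}
      brokers:
        - my-broker:9092    # hypothetical broker address
      topic: images         # hypothetical topic
```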
Also, the platform automatically scales based on traffic coming in. No more spinning up pods manually!
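As a rough sketch of what that looks like in practice (the field names follow the Numaflow vertex spec as publicly documented, and the bounds are illustrative), a vertex can carry its own scale settings and the platform handles the rest:

```yaml
# Numaflow watches the backlog and scales pods between these bounds.
- name: predict
  udf:
    container:
      image: example/predict-udf:v1   # hypothetical UDF image
  scale:
    min: 0    # can scale down to zero when the pipeline is idle
    max: 10   # upper bound on pods under peak traffic
```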
Building an AI Pipeline With Numaflow: A Demo
Vijayakumar ran a demo of a simple image recognition task. Numaflow comes bundled with a UI, so you can watch the pipelines as you build and run them.
The data is pulled from the source and sent to a prediction vertex. A vertex is a core computational component; in this case, it returns a written description of the image's contents to the sink, an HTTP endpoint. The vertex itself runs a local natural language processing model.
The pipelines themselves are defined in YAML, a declarative language.
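As a hedged illustration of the shape of such a definition (the vertex names and UDF image below are hypothetical; the field layout follows the Numaflow Pipeline custom resource as publicly documented), a three-vertex pipeline like the demo's might look like this:

```yaml
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: image-recognition
spec:
  vertices:
    - name: in
      source:
        http: {}                          # HTTP source feeding images in
    - name: predict
      udf:
        container:
          image: example/predict-udf:v1   # hypothetical container running the model
    - name: out
      sink:
        log: {}                           # log sink standing in for the HTTP endpoint
  edges:
    - from: in
      to: predict
    - from: predict
      to: out
```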
She also showed a glimpse of a working anomaly-detection pipeline, one that has been in production for a year. Pipelines can have multiple sources, sinks and UDFs, and UDFs can be written in a mix of Python and Java. In the GUI, vertices display the number of pods they are running. Vertices work independently, so each can scale according to its own incoming workload.
A ‘Pretty Impressive’ Data Stack
“If you are a native Kubernetes shop, this is the way to go,” said data engineer Dan Young in his walkthrough video of Numaflow. He suggested that Numaflow, along with Argo, could be used to build a “pretty impressive data processing stack.”
If you want to learn more, the Numaflow engineers will also be presenting at the upcoming All Things Open and KubeCon North America conferences.