arXiv:2601.19160v1 Announce Type: new Abstract: FaaS platforms rely on cluster managers like Kubernetes for resource management. Kubernetes is popular due to its state-centric APIs that decouple the control plane into modular controllers. However, to scale out a burst of FaaS instances, message passing becomes the primary bottleneck as controllers have to exchange extensive state through the API Server. Existing solutions opt for a clean-slate redesign of cluster managers, but at the expense of compatibility with existing ecosystem and substantial engineering effort. We present KUBEDIRECT, a Kubernetes-based cluster manager for FaaS. We find that there exists a common narrow waist across FaaS platform that allows us to achieve both efficiency and external compatibility. Our insight is that…
arXiv:2601.19160v1 Announce Type: new Abstract: FaaS platforms rely on cluster managers like Kubernetes for resource management. Kubernetes is popular due to its state-centric APIs that decouple the control plane into modular controllers. However, to scale out a burst of FaaS instances, message passing becomes the primary bottleneck as controllers have to exchange extensive state through the API Server. Existing solutions opt for a clean-slate redesign of cluster managers, but at the expense of compatibility with existing ecosystem and substantial engineering effort. We present KUBEDIRECT, a Kubernetes-based cluster manager for FaaS. We find that there exists a common narrow waist across FaaS platform that allows us to achieve both efficiency and external compatibility. Our insight is that the sequential structure of the narrow waist obviates the need for a single source of truth, allowing us to bypass the API Server and perform direct message passing for efficiency. However, our approach introduces a set of ephemeral states across controllers, making it challenging to enforce end-to-end semantics due to the absence of centralized coordination. KUBEDIRECT employs a novel state management scheme that leverages the narrow waist as a hierarchical write-back cache, ensuring consistency and convergence to the desired state. KUBEDIRECT can seamlessly integrate with Kubernetes, adding ~150 LoC per controller. Experiments show that KUBEDIRECT reduces serving latency by 26.7x over Knative, and has similar performance as the state-of-the-art clean-slate platform Dirigent.