Teams running Node.js in Kubernetes (or containers in general) often face a frustrating paradox. Autoscaling reacts too slowly for bursty traffic, leaving applications scrambling during peak loads. Resource limits force an impossible choice between overprovisioning (wasting money) and underprovisioning (risking crashes). The “elastic” cloud becomes surprisingly rigid, with costs rising faster than traffic while idle resources accumulate “just in case.”
The root cause? Kubernetes was designed for traditional multi-threaded applications, not Node.js’s single-threaded, asynchronous model. The defaults that work for Java or .NET actively work against Node.js. CPU and memory metrics miss what matters: event loop utilization, heap usage, and real-time responsiveness. By the time Kubernetes scales, your traffic spike has passed, and customers have moved on.
This isn’t just a technical problem; it’s a business problem. Slow scaling means customer churn and lost revenue; overprovisioning means bloated budgets. The companies that win will be those willing to challenge the assumption that Kubernetes defaults are good enough.
The question isn’t whether Kubernetes can run Node.js; it’s whether you’re willing to rethink how you run it. Because accepting the myths means accepting the costs. And in today’s market, that’s a luxury few can afford. We discussed a lot of the problems in our last article: “The Myths (and Costs) of Running Node.js on Kubernetes”.
Today, we are set to change this.
Why does Node.js need a Command Center?
Modern Node.js apps encounter three main issues that traditional cloud infrastructure fails to address effectively:
1. Your container autoscaler is too slow because, unless you've done some custom tuning, it's polling on the wrong metrics. This means the data it needs to make a decision is seconds or minutes late.
2. Over-provisioning is driving up your cloud bill, and not just a little. We're seeing teams over-provision pods by as much as 40% without realizing it.
3. The metrics and data that matter are never where you need them, with performance metrics, caching efficiency, and profiling data spread across different tool sets.

ICC addresses these issues by offering predictive autoscaling, comprehensive observability, and smart resource optimization in a unified platform powered by Watt (wattpm), the Platformatic Node.js application server. Watt's key design decision is running applications inside separate worker threads and enabling fast, transparent communication between them. It also supports local scaling, allowing users to spin up multiple instances of the same application within a single process. Thanks to this multi-threaded design, Watt can monitor the health of your applications without being affected by them, unlocking deeper Kubernetes integrations.
Intelligent Autoscaler with Kubernetes Integration
The traditional Horizontal Pod Autoscaler (HPA) responds to high CPU or memory usage, which, for Node.js, isn't what matters most when scaling. Moreover, the HPA polls metrics at a fixed interval from your metrics servers, which themselves receive the data with additional delay. The HPA's decision-making time is often measured in minutes, requiring significant overprovisioning (and waste) to cover the transition period; in fact, many companies disable the autoscaler entirely.
Instead, the Watt and ICC combination focuses on runtime-specific metrics to scale services (both horizontally and vertically, with Watt) before performance degrades. Thanks to its multi-threaded design, Watt can monitor your Node.js applications via low-level runtime hooks and use this data to signal ICC as metrics start degrading, getting the latest information to ICC in seconds instead of the minutes required by traditional solutions. The result is immediate scaling. Platformatic's ICC monitors the Node.js Event Loop Utilization and heap usage as early indicators of potential problems, and combines utilization trends, variability, and historical successes to scale resources predictively.
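To make this signal concrete, here's a minimal sketch (not ICC's implementation) of sampling Event Loop Utilization with Node's built-in perf_hooks API; the one-second interval and 0.9 threshold are illustrative assumptions:

```typescript
import { performance } from 'node:perf_hooks';

const ELU_THRESHOLD = 0.9; // illustrative: treat the process as saturated above 90%
let previous = performance.eventLoopUtilization();

setInterval(() => {
  // Passing the previous sample returns the utilization delta for this window only.
  const delta = performance.eventLoopUtilization(previous);
  previous = performance.eventLoopUtilization();

  if (delta.utilization > ELU_THRESHOLD) {
    // In a real setup this signal would feed an autoscaler; here we just log it.
    console.warn(`Event loop saturated: ELU=${delta.utilization.toFixed(2)}`);
  }
}, 1000);
```

Because ELU measures how busy the event loop is rather than how busy the CPU is, it rises before request latency does, which is what makes it a useful early-warning signal.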
Comprehensive Metrics and Monitoring: you can't scale what you can't see
Effective scaling and optimization require deep visibility into application and runtime behavior, not just infrastructure metrics. This is why ICC not only monitors Kubernetes pod lifecycle events, such as pod creation, scaling, and termination, but also tracks CPU, memory, Event Loop Utilization, and heap usage metrics across all pods, automatically discovering and monitoring new services as they are deployed.
ICC integrates with Prometheus to enable flexible querying and monitoring of application and infrastructure-specific metrics, and triggers alerting actions to initiate scaling events based on custom metric thresholds.
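As an illustration of the kind of runtime metric this relies on, here's a hedged sketch of exposing ELU as a Prometheus gauge with the prom-client package; the metric name nodejs_app_elu and port 9464 are invented for the example, and this is not ICC's internal code:

```typescript
import { createServer } from 'node:http';
import { performance } from 'node:perf_hooks';
import client from 'prom-client';

let previous = performance.eventLoopUtilization();

// Gauge with a collect() hook: prom-client invokes it on every scrape,
// so the gauge always reports utilization since the last scrape.
const eluGauge = new client.Gauge({
  name: 'nodejs_app_elu', // hypothetical metric name
  help: 'Event Loop Utilization since the last scrape (0..1)',
  collect() {
    const delta = performance.eventLoopUtilization(previous);
    previous = performance.eventLoopUtilization();
    this.set(delta.utilization);
  },
});

// Minimal /metrics endpoint for Prometheus to scrape.
createServer(async (req, res) => {
  if (req.url === '/metrics') {
    res.setHeader('Content-Type', client.register.contentType);
    res.end(await client.register.metrics());
  } else {
    res.statusCode = 404;
    res.end();
  }
}).listen(9464);
```

Once a gauge like this is scraped, threshold-based alerting rules can fire scaling actions on it the same way they would on any other Prometheus metric.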
We're also bringing direct flamegraph creation for Node.js applications to ICC, offering detailed, continuous performance profiling and visualization. ICC stores performance analysis results per user profile, keeps historical data for trend analysis, and provides comprehensive profiling integration for deep performance debugging and optimization.
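For context, the raw data behind a flamegraph can be captured with Node's built-in inspector protocol. The sketch below shows that generic mechanism (it is not how ICC implements profiling); the resulting .cpuprofile file can be rendered by flamegraph viewers such as speedscope:

```typescript
import { Session } from 'node:inspector';
import { writeFile } from 'node:fs/promises';

// Capture a V8 CPU profile of the current process for `ms` milliseconds.
async function captureCpuProfile(ms: number, outFile: string): Promise<void> {
  const session = new Session();
  session.connect();

  // Promisified wrapper around the callback-based session.post().
  const post = (method: string, params?: object) =>
    new Promise<any>((resolve, reject) =>
      session.post(method, params, (err, result) =>
        err ? reject(err) : resolve(result)));

  await post('Profiler.enable');
  await post('Profiler.start');
  await new Promise((resolve) => setTimeout(resolve, ms));
  const { profile } = await post('Profiler.stop');
  await writeFile(outFile, JSON.stringify(profile));
  session.disconnect();
}

captureCpuProfile(5000, 'app.cpuprofile').catch(console.error);
```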
Advanced Caching Infrastructure
Caching is essential for performance, but most teams lack insight into cache effectiveness and face challenges with cache invalidation strategies. To solve this, ICC offers comprehensive cache observability through real-time monitoring of cache performance across all applications, enabling teams to browse, inspect, and manage individual cache entries while understanding dependencies for smarter invalidation.
Specifically, ICC links cache performance to application response times and integrates with Prometheus to gather cache hit and miss metrics. It provides application-specific views for filtering cache statistics by individual application, lets developers locate specific cache entries anywhere in their estate, and exposes a cache invalidation API for programmatically invalidating particular cache entries or patterns.
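To give a feel for programmatic invalidation, here is a purely hypothetical client sketch; the endpoint path, payload shape, and auth header are invented for illustration and are not ICC's documented API:

```typescript
// Hypothetical client for a cache invalidation endpoint.
// Endpoint, payload, and header names are assumptions, not ICC's real API.
async function invalidateCache(keys: string[], tags: string[] = []): Promise<void> {
  const res = await fetch('https://icc.example.com/api/cache/invalidate', {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      authorization: `Bearer ${process.env.ICC_TOKEN}`,
    },
    body: JSON.stringify({ keys, tags }),
  });
  if (!res.ok) {
    throw new Error(`Invalidation failed: ${res.status} ${await res.text()}`);
  }
}

// Example: drop one specific entry plus everything tagged "products".
invalidateCache(['GET/products/42'], ['products']).catch(console.error);
```

Tag-based invalidation like this is what makes dependency-aware strategies practical: instead of guessing which keys are stale, you invalidate every entry linked to the data that changed.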
Use Cases
For cost optimization, ICC and Watt give teams an inherent compute-density advantage (via Watt) combined with intelligent scale-down features (ICC) that lower resource use by up to 30% during low-traffic periods while improving p95 latency. This also solves the problem of applications experiencing performance drops before traditional metrics trigger scaling: Event Loop Utilization (ELU) and heap monitoring provide early warning signals, allowing proactive scaling before users are affected.
For cache optimization, ICC tackles inefficient cache strategies that cause unnecessary database load and slow response times by providing real-time cache analytics that help improve cache strategies and spot opportunities for better performance.
For operational visibility, the platform addresses the lack of a complete view of application health across different areas by offering a unified dashboard that combines scaling events, performance metrics, and cache efficiency for comprehensive operational awareness.
We’ve assembled a quick at-a-glance table to see how ICC compares with the systems you’re likely already familiar with.
| Feature / Aspect | Platformatic Command Center | KEDA | HPA (Horizontal Pod Autoscaler) | OpenShift Autoscaler | AWS ECS Autoscaler | Knative Autoscaler (KPA/HPA) |
| --- | --- | --- | --- | --- | --- | --- |
| Scaling Basis | Event-Loop Utilization, Heap Usage | Event-driven (e.g., Kafka, RabbitMQ, Prometheus, custom metrics) | CPU, memory, or custom metrics via Metrics API | Extends HPA + integrates with Cluster Autoscaler | CPU, memory, or custom CloudWatch metrics | Request-based concurrency (QPS), latency, CPU/memory |
| Trigger Sources | Event-Loop Utilization, Heap Usage | 60+ built-in scalers (Kafka, SQS, Prometheus, HTTP, etc.) | Resource metrics (CPU/memory), some custom metrics | Same as HPA + OpenShift monitoring ecosystem | CloudWatch metrics (ECS Service, ALB, SQS, custom) | HTTP request concurrency, custom metrics, latency, CPU |
| Granularity | Per deployment, event-driven | Per deployment or job, fine-grained event-driven | Per deployment, reactive to average metrics | Cluster-wide + per deployment | Per ECS service or task definition | Per Knative service (per revision) |
| Scale to Zero | ❌ No | ❌ No (but possible with custom HTTP Proxy) | ❌ No (needs add-ons) | Partial (via Knative/KEDA integration) | ❌ No (stops tasks when demand drops) | ✅ Yes (core feature, request-driven) |
| Cluster Autoscaling | Built-in Intelligent Cluster Autoscaler | Works with Kubernetes Cluster Autoscaler | Works with Kubernetes Cluster Autoscaler | Built-in with OpenShift Cluster Autoscaler | Native to ECS + EC2/Auto Scaling Groups/Fargate | Works with Kubernetes Cluster Autoscaler |
| Custom Metrics Support | No, Node.js specific (Event-Loop Utilization, Heap Usage) | Rich ecosystem of event sources, extensible | Requires Metrics Server + Custom Metrics API | OpenShift Monitoring + Custom Metrics | CloudWatch custom metrics | Concurrency, latency, Prometheus/custom metrics |
| Complex Workloads | Analyzes scaling patterns and automatically generates scaling predictions | Supports jobs, queue processing, serverless-style workloads | Best for stateless apps with predictable CPU/memory scaling | Suited for enterprise workloads, integrates with OpenShift operators | Best for ECS workloads, containers in AWS | Ideal for serverless apps, web APIs, and bursty workloads |
| Ease of Setup | Installs like a regular K8s application to the cluster | Extra CRDs + operator, flexible but added complexity | Native to Kubernetes, simple to set up | Integrated with OpenShift, easier for Red Hat ecosystem users | Integrated with AWS ECS, easy if fully on AWS | More complex (needs Knative Serving), opinionated |
| Best Fit | Node.js microservices with variable traffic and performance bottlenecks | Event-driven apps, message queues, serverless-like workloads on Kubernetes | Standard Kubernetes workloads needing simple CPU/memory scaling | Enterprises on OpenShift needing integrated autoscaling | AWS-native workloads on ECS/Fargate | Serverless workloads, APIs with spiky traffic patterns |
Getting Started with ICC
For all its benefits, getting started with ICC is straightforward. ICC can be run locally for testing by following this guide, or in the cloud on AWS. The complete ICC documentation is available at https://icc.platformatic.dev/.
Check out our repositories:
1. https://github.com/platformatic/intelligent-command-center
2. https://github.com/platformatic/machinist
3. https://github.com/platformatic/watt-extra
4. https://github.com/platformatic/helm
5. https://github.com/platformatic/desk
Why are we making ICC Open Source?
Open Source has been at the core of my proudest moments as an engineer because it helped me build more than just a product—it helped me build a community.
Collectively, the team here at Platformatic has decades (I'd say centuries, but I don't want to age us…) of experience helping enterprise teams run Node.js at incredible scale, under some of the most rigorous performance and stability demands you can think of. After seeing the problems that ICC solves come up repeatedly throughout our careers, we realized that keeping it closed-source wasn't just a detriment to these enterprise teams but a serious impediment to our own growth.
With this launch today, we are growing more than just our open source ecosystem here at Platformatic: we are growing our community.
Join our Discord community for support and to connect with other developers running Node.js in production!