We've all been there. You've got your Kubernetes app humming along nicely on EKS, and then it happens. A sudden traffic surge leaves your pods hanging in Pending because you're out of nodes. Or maybe it's the opposite, and you're staring at your AWS bill wondering why you're paying for a ghost town of idle instances.
You might be using the usual suspects: HPA and the Cluster Autoscaler. They're okay, I guess. But HPA is obsessed with just CPU and memory, which is like trying to drive a car by only looking at the speedometer. And the Cluster Autoscaler? It moves at a glacial pace and doesn't give you much say in the new machines it spins up. As for scaling by hand... let's not even go there.
What if I told you there's a better way to handle this? A way to let your cluster manage itself, intelligently. That's where two amazing tools, KEDA and Karpenter, come into the picture.
Why Traditional Auto Scaling Doesn't Work
Think about it: what happens when scaling isn't about CPU, but about something real? Like a mountain of messages piling up in your SQS queue, or a sudden flood of API calls? Your standard tools just shrug. They weren't built for that kind of real-world event.
Meet KEDA: Your Application's New Best Friend
KEDA is all about making your applications scale based on events. It connects to all sorts of things (Kafka, RabbitMQ, AWS SQS, you name it) and watches for signs of work piling up.
So how does it work? You create a simple resource called a ScaledObject in Kubernetes. This little YAML file tells KEDA a few things: which app (Deployment) to watch, and what Trigger to look for. That trigger could be "scale up if my SQS queue has more than 10 messages." Behind the scenes, KEDA's operator does the math and feeds these custom metrics to the normal Kubernetes HPA, telling it when to add or remove pods. It's a super clever way to make your scaling truly reflect your application's needs, not just its resource usage.
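To make that concrete, here's a minimal ScaledObject sketch using KEDA's aws-sqs-queue trigger. The deployment name, queue URL, and region below are made-up placeholders for illustration, not values used later in this walkthrough:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-consumer                 # hypothetical name, for illustration only
spec:
  scaleTargetRef:
    name: sqs-worker                 # the Deployment KEDA should watch (placeholder)
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue   # placeholder
        queueLength: "10"            # aim for roughly 10 messages per replica
        awsRegion: us-east-1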
KEDA is a champ at scaling your pods up and down. But that's only half the battle. What if KEDA wants to spin up a bunch of pods, but you're out of nodes? All those new pods just get stuck in that dreaded Pending state. That's the exact gap Karpenter fills.
Karpenter, an open-source autoscaler from AWS, is all about getting you the right nodes at the right time. It watches for those stranded pods and jumps in to provision or decommission EC2 instances based on what your cluster actually needs, right now.
Key Features That Make Karpenter Awesome
Speed, for one. It bypasses a lot of the usual Kubernetes layers and talks straight to the AWS EC2 API, which means it can have a new node ready for you in the time it takes to grab a coffee. But it's not just fast; it's clever. It looks at the pods that are waiting and plays a game of Tetris to find the most cost-effective EC2 instance that fits them perfectly. No more paying for oversized nodes!
And when the traffic dies down, Karpenter turns into a neat freak. It'll gracefully shuffle your pods onto fewer nodes and get rid of the empty ones, which is a beautiful thing for your cloud bill. Since it's an AWS native tool, all the IAM stuff just works without a headache. You just tell it the rules of the playground (what instance types are fair game, which zones to play in) using a NodePool, and its controller takes care of everything else.
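Here's roughly what those playground rules look like in a NodePool. This is a sketch against the Karpenter v1 API; the instance categories, zones, and CPU limit are example choices, and it assumes an EC2NodeClass named default already exists:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                          # assumes this EC2NodeClass exists
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]              # which instance families are fair game
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a", "us-east-1b"] # which zones to play in
  limits:
    cpu: "100"                                 # hard cap on total provisioned CPU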
The Dream Team: KEDA + Karpenter
This KEDA and Karpenter combo is the real dream team. Think of KEDA as the brains of the operation: the lookout in the crow's nest who spots the approaching wave of work and yells, "More hands on deck!" Then you've got Karpenter as the muscle, the one who instantly builds out the extra deck space (the new nodes) just in time for the new crew (the pods) to get to work. It's this beautiful, symbiotic relationship that creates a truly elastic infrastructure. Your cluster just expands and contracts exactly when it needs to, without you lifting a finger.
Ready to Give It a Shot?
Getting KEDA and Karpenter set up on your EKS cluster does take a little bit of work upfront, but believe me, the time you'll save later is massive. The official docs are your best friend here, since they'll always have the latest and greatest steps.
Installing Karpenter on AWS EKS
Follow these steps to install Karpenter as outlined in the official guide:
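The full walkthrough covers the IAM roles and an EC2NodeClass for your cluster; the controller itself goes in via Helm. As a rough sketch (the namespace, cluster name, and role ARN below are placeholders, and flag names shift a bit between chart versions, so defer to the guide for exact values):

helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --set settings.clusterName=<your-cluster-name> \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=<karpenter-controller-role-arn>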
Installing KEDA on AWS EKS
Follow these steps to install KEDA on AWS EKS using the official guide:
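KEDA's side is much lighter; the officially documented route is the kedacore Helm chart, which drops the operator and metrics server into their own namespace:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace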
Installing Redis
Install Redis using Helm:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install redis bitnami/redis --namespace redis --create-namespace \
--set auth.enabled=false
This puts Redis in the redis namespace without authentication, perfect for testing.
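Before pointing KEDA at it, a quick sanity check that the Redis pods came up doesn't hurt:

kubectl get pods -n redis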
Creating Your First Auto Scaling Workload
Let's build a sample workload that demonstrates KEDA and Karpenter working together.
Deploying a Worker Application
Create a file called scaledobject.yaml:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redisconsumer
spec:
  scaleTargetRef:
    name: worker
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
    - type: redis
      metadata:
        address: redis-master.redis.svc.cluster.local:6379
        listName: jobs
        listLength: "5"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 0
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
        - name: worker
          image: busybox
          command: ["sh", "-c", "while true; do sleep 10; done"]
          resources:
            requests:
              memory: "250Mi"
              cpu: "1000m"
            limits:
              memory: "250Mi"
              cpu: "1000m"
Apply this configuration:
kubectl apply -f scaledobject.yaml
This setup creates a deployment named worker and a ScaledObject that monitors a Redis list called jobs. KEDA will automatically scale the worker pods based on the length of this list.
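If you want to confirm the wiring, KEDA creates a regular HPA (named keda-hpa-redisconsumer) behind the scenes to do the actual scaling, so both of these should show up:

kubectl get scaledobject redisconsumer
kubectl get hpa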
Testing Your Auto Scaling Setup
Now it's time to see your auto scaling infrastructure in action.
Step 1: Connect to Redis
First, connect to your Redis instance:
kubectl exec -it -n redis $(kubectl get pod -n redis -l app.kubernetes.io/name=redis -o jsonpath="{.items[0].metadata.name}") -- redis-cli
Step 2: Add Workload
Push some messages to the Redis list to simulate workload:
LPUSH jobs "task-1"
LPUSH jobs "task-2"
LPUSH jobs "task-3"
LPUSH jobs "task-4"
LPUSH jobs "task-5"
LPUSH jobs "task-6"
Step 3: Watch the Magic Happen
Monitor your pods scaling up:
kubectl get pods -l app=worker -w
(demo video)
So, what just happened behind the scenes? It started the moment you pushed items to the Redis list. KEDA's trigger saw the list growing and knew it needed more pods to handle the work. It told the Kubernetes HPA to scale up the worker deployment, which created new pod replicas. But with no room to run, those pods were just sitting there in Pending. That's when Karpenter stepped in. It saw the stranded pods, figured out exactly what kind of node they needed, and launched a new EC2 instance to join the cluster. Once the node was ready, Kubernetes scheduled the pods, and they got to work. A perfect, automated chain reaction.
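If you'd like to watch the node side of that chain reaction as well, keep a second terminal on the nodes and on Karpenter's logs (this assumes Karpenter is running in the karpenter namespace with the chart's default labels):

kubectl get nodes -w
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f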
A Few Tips for Going Live
When you're ready to take this into production, just keep a couple of things in mind. For KEDA, make sure you set sane min and max replica counts so your app doesn't disappear when idle or scale to the moon. And don't be afraid to combine a few different triggers to create some really smart scaling rules. On the Karpenter side of things, give it a good variety of instance types to choose from; flexibility is its superpower. Definitely turn on consolidation to let it clean up underused nodes and save you some cash. And why not let it use Spot instances for workloads that can handle interruptions? It's a great way to slash your EC2 bill.
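Both of those last two knobs live in the NodePool spec. A fragment like this captures the idea (field names follow the Karpenter v1 API; adjust for your version):

spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # let Karpenter repack and remove underused nodes
    consolidateAfter: 1m                            # give things a minute to settle first
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]             # allow Spot, with on-demand as a fallback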
The Benefits You'll Get
So, what do you get out of all this? For your operations team, it means less babysitting. You're not manually managing nodes or tweaking replica counts anymore. This leads to a more reliable system, because your apps can handle sudden traffic spikes without falling over. From a cost perspective, you're only paying for what you use. Karpenter's ability to consolidate nodes and use Spot instances means you're not over-provisioning for worst-case scenarios. And finally, the performance boost is real. Your apps respond faster to actual business needs, not just CPU load, leading to a much better experience for your users.