We've all been there. You've got your Kubernetes app humming along nicely on EKS, and then it happens. A sudden traffic surge leaves your pods hanging in Pending because you're out of nodes. Or maybe it's the opposite, and you're staring at your AWS bill wondering why you're paying for a ghost town of idle instances.
You might be using the usual suspects: HPA and the Cluster Autoscaler. They're okay, I guess. But HPA is obsessed with just CPU and memory, which is like trying to drive a car by only looking at the speedometer. And the Cluster Autoscaler? It moves at a glacial pace and doesn't give you much say in the new machines it spins up. As for scaling by hand... let's not even go there.
What if I told you there's a better way to handle this? A way to let your cluster manage itself, intelligently. That's where two amazing tools, KEDA and Karpenter, come into the picture.
Why Traditional Auto Scaling Doesn't Work
Think about it: what happens when scaling isn't about CPU, but about something real? Like a mountain of messages piling up in your SQS queue, or a sudden flood of API calls? Your standard tools just shrug. They weren't built for that kind of real-world event.
Meet KEDA: Your Application's New Best Friend
KEDA is all about making your applications scale based on events. It connects to all sorts of things (Kafka, RabbitMQ, AWS SQS, you name it) and watches for signs of work piling up.
So how does it work? You create a simple resource called a ScaledObject in Kubernetes. This little YAML file tells KEDA a few things: which app (Deployment) to watch, and what Trigger to look for. That trigger could be "scale up if my SQS queue has more than 10 messages." Behind the scenes, KEDA's operator does the math and feeds these custom metrics to the normal Kubernetes HPA, telling it when to add or remove pods. It's a super clever way to make your scaling truly reflect your application's needs, not just its resource usage.
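To make that concrete, here's a minimal ScaledObject sketch using KEDA's aws-sqs-queue trigger. The deployment name, queue URL, and region below are made-up placeholders for illustration, not values used later in this walkthrough:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-consumer                 # hypothetical name, for illustration only
spec:
  scaleTargetRef:
    name: sqs-worker                 # the Deployment KEDA should watch (placeholder)
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/my-queue   # placeholder
        queueLength: "10"            # aim for roughly 10 messages per replica
        awsRegion: us-east-1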
KEDA is a champ at scaling your pods up and down. But that's only half the battle. What if KEDA wants to spin up a bunch of pods, but you're out of nodes? All those new pods just get stuck in that dreaded Pending state. That's the exact gap Karpenter fills.
Karpenter, an open-source autoscaler from AWS, is all about getting you the right nodes at the right time. It watches for those stranded pods and jumps in to provision or decommission EC2 instances based on what your cluster actually needs, right now.
Key Features That Make Karpenter Awesome
Speed, for one. It bypasses a lot of the usual Kubernetes layers and talks straight to the AWS EC2 API, which means it can have a new node ready for you in the time it takes to grab a coffee. But it's not just fast; it's clever. It looks at the pods that are waiting and plays a game of Tetris to find the most cost-effective EC2 instance that fits them perfectly. No more paying for oversized nodes!
And when the traffic dies down, Karpenter turns into a neat freak. It'll gracefully shuffle your pods onto fewer nodes and get rid of the empty ones, which is a beautiful thing for your cloud bill. Since it's an AWS native tool, all the IAM stuff just works without a headache. You just tell it the rules of the playground (what instance types are fair game, which zones to play in) using a NodePool, and its controller takes care of everything else.
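Here's roughly what those playground rules look like in a NodePool. This is a sketch against the Karpenter v1 API; the instance categories, zones, and CPU limit are example choices, and it assumes an EC2NodeClass named default already exists:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                          # assumes this EC2NodeClass exists
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]              # which instance families are fair game
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a", "us-east-1b"] # which zones to play in
  limits:
    cpu: "100"                                 # hard cap on total provisioned CPU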
The Dream Team: KEDA + Karpenter
This KEDA and Karpenter combo is the real dream team. Think of KEDA as the brains of the operation: the lookout in the crow's nest who spots the approaching wave of work and yells, "More hands on deck!" Then you've got Karpenter as the muscle, the one who instantly builds out the extra deck space (the new nodes) just in time for the new crew (the pods) to get to work. It's this beautiful, symbiotic relationship that creates a truly elastic infrastructure. Your cluster just expands and contracts exactly when it needs to, without you lifting a finger.
Ready to Give It a Shot?
Getting KEDA and Karpenter set up on your EKS cluster does take a little bit of work upfront, but believe me, the time you'll save later is massive. The official docs are your best friend here, since they'll always have the latest and greatest steps.
Installing Karpenter on AWS EKS
Follow these steps to install Karpenter as outlined in the official guide:
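The full walkthrough covers the IAM roles and an EC2NodeClass for your cluster; the controller itself goes in via Helm. As a rough sketch (the namespace, cluster name, and role ARN below are placeholders, and flag names shift a bit between chart versions, so defer to the guide for exact values):

helm install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter --create-namespace \
  --set settings.clusterName=<your-cluster-name> \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=<karpenter-controller-role-arn>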
Installing KEDA on AWS EKS
Follow these steps to install KEDA on AWS EKS using the official guide:
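KEDA's side is much lighter; the officially documented route is the kedacore Helm chart, which drops the operator and metrics server into their own namespace:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace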
Installing Redis
Install Redis using Helm:
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm install redis bitnami/redis --namespace redis --create-namespace \
--set auth.enabled=false
This puts Redis in the redis namespace without authentication, perfect for testing.
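Before pointing KEDA at it, a quick sanity check that the Redis pods came up doesn't hurt:

kubectl get pods -n redis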
Creating Your First Auto Scaling Workload
Let's build a sample workload that demonstrates KEDA and Karpenter working together.
Deploying a Worker Application
Create a file called scaledobject.yaml:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redisconsumer
spec:
  scaleTargetRef:
    name: worker
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
    - type: redis
      metadata:
        address: redis-master.redis.svc.cluster.local:6379
        listName: jobs
        listLength: "5"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 0
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
        - name: worker
          image: busybox
          command: ["sh", "-c", "while true; do sleep 10; done"]
          resources:
            requests:
              memory: "250Mi"
              cpu: "1000m"
            limits:
              memory: "250Mi"
              cpu: "1000m"
Apply this configuration:
kubectl apply -f scaledobject.yaml
This setup creates a deployment named worker and a ScaledObject that monitors a Redis list called jobs. KEDA will automatically scale the worker pods based on the length of this list.
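If you want to confirm the wiring, KEDA creates a regular HPA (named keda-hpa-redisconsumer) behind the scenes to do the actual scaling, so both of these should show up:

kubectl get scaledobject redisconsumer
kubectl get hpa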
Testing Your Auto Scaling Setup
Now it's time to see your auto scaling infrastructure in action.
Step 1: Connect to Redis
First, connect to your Redis instance:
kubectl exec -it -n redis $(kubectl get pod -n redis -l app.kubernetes.io/name=redis -o jsonpath="{.items[0].metadata.name}") -- redis-cli
Step 2: Add Workload
Push some messages to the Redis list to simulate workload:
LPUSH jobs "task-1"
LPUSH jobs "task-2"
LPUSH jobs "task-3"
LPUSH jobs "task-4"
LPUSH jobs "task-5"
LPUSH jobs "task-6"
Step 3: Watch the Magic Happen
Monitor your pods scaling up:
kubectl get pods -l app=worker -w
(demo video)
So, what just happened behind the scenes? It started the moment you pushed items to the Redis list. KEDA's trigger saw the list growing and knew it needed more pods to handle the work. It told the Kubernetes HPA to scale up the worker deployment, which created new pod replicas. But with no room to run, those pods were just sitting there in Pending. That's when Karpenter stepped in. It saw the stranded pods, figured out exactly what kind of node they needed, and launched a new EC2 instance to join the cluster. Once the node was ready, Kubernetes scheduled the pods, and they got to work. A perfect, automated chain reaction.
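If you'd like to watch the node side of that chain reaction as well, keep a second terminal on the nodes and on Karpenter's logs (this assumes Karpenter is running in the karpenter namespace with the chart's default labels):

kubectl get nodes -w
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f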
A Few Tips for Going Live
When you're ready to take this into production, just keep a couple of things in mind. For KEDA, make sure you set sane min and max replica counts so your app doesn't disappear when idle or scale to the moon. And don't be afraid to combine a few different triggers to create some really smart scaling rules. On the Karpenter side of things, give it a good variety of instance types to choose from; flexibility is its superpower. Definitely turn on consolidation to let it clean up underused nodes and save you some cash. And why not let it use Spot instances for workloads that can handle interruptions? It's a great way to slash your EC2 bill.
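Both of those last two knobs live in the NodePool spec. A fragment like this captures the idea (field names follow the Karpenter v1 API; adjust for your version):

spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized   # let Karpenter repack and remove underused nodes
    consolidateAfter: 1m                            # give things a minute to settle first
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]             # allow Spot, with on-demand as a fallback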
The Benefits You'll Get
So, what do you get out of all this? For your operations team, it means less babysitting. You're not manually managing nodes or tweaking replica counts anymore. This leads to a more reliable system, because your apps can handle sudden traffic spikes without falling over. From a cost perspective, you're only paying for what you use. Karpenter's ability to consolidate nodes and use Spot instances means you're not over-provisioning for worst-case scenarios. And finally, the performance boost is real. Your apps respond faster to actual business needs, not just CPU load, leading to a much better experience for your users.