Introduction
Kubernetes 1.33 introduced a game-changing feature: in-place vertical pod scaling, now in beta and enabled by default. As a cloud engineer, I was eager to test its ability to dynamically adjust CPU and memory for running pods without restarts, a potential win for resource optimization. Inspired by its promise, I set up a Proof of Concept (PoC) on AWS EC2 using Minikube to explore its practical applications.
In this guide, I’ll walk you through my step-by-step process, from cluster setup to scaling automation, and share insights for leveraging this feature in your own test environments.
Let’s dive in!
What is In-Place Vertical Pod Scaling?
Kubernetes 1.33 brings in-place vertical pod scaling as a default feature that lets you adjust a pod’s CPU and memory resources on the fly.
Unlike traditional scaling, this approach avoids pod restarts, making it ideal for maintaining application availability. I was intrigued by its potential to optimize my customer's applications efficiently.
This guide will explain how this works and why it matters for your Kubernetes journey.
Prerequisites for the PoC
Before diving into my PoC, you’ll need the right tools to replicate my setup on AWS EC2. This includes installing Minikube with Kubernetes 1.33 and setting up the Metrics Server for resource monitoring. I ensured these were in place to make the scaling process smooth and measurable.
Let’s cover the essentials to get you started!
Note:
I won't go into how to install a Minikube cluster: there are lots of good tutorials on the Internet, and it is a very simple step.
In my PoC, I used an EC2 instance on AWS, but this is not a limitation: you can use any other environment, there is no dependency in this respect.
For the Minikube instance I chose an EC2 instance type (t3a.medium) with the following "hardware" parameters: vCPU: 2, RAM: 4 GiB, storage: 40 GB (with the default storage type)
More information: https://aws.amazon.com/ec2/instance-types/t3/
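Before starting Minikube, it can save you some head-scratching to confirm the host actually has the resources you think it does. Below is a hedged sanity-check sketch (the `host_ok` helper and its thresholds are my own, mirroring the t3a.medium specs above; Linux-only, since it reads /proc/meminfo):

```shell
#!/bin/bash
# Hypothetical helper: check the host meets minimum CPU/RAM before
# starting Minikube. Defaults mirror the t3a.medium specs used in this PoC.
host_ok() {
  local min_cpus="${1:-2}" min_mem_mb="${2:-3500}"
  local cpus mem_mb
  cpus=$(nproc)
  mem_mb=$(awk '/^MemTotal/{print int($2/1024)}' /proc/meminfo)
  echo "host: ${cpus} vCPU, ${mem_mb} MB RAM (want >= ${min_cpus} vCPU, ${min_mem_mb} MB)"
  [ "$cpus" -ge "$min_cpus" ] && [ "$mem_mb" -ge "$min_mem_mb" ]
}

host_ok || echo "host is below the recommended minimum for this PoC"
```

If the check fails, pick a larger instance type; an undersized node makes resize experiments misleading because the scheduler has no headroom to grant bigger requests.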
Setting Up the Test Environment
Creating a solid test environment was the first step in my PoC journey with Kubernetes 1.33. I set up a Minikube cluster on AWS EC2 and deployed a Metrics Server to track pod resources. This setup allowed me to test the vertical scaling feature with a sample Nginx pod.
Here, I’ll guide you through building this foundation.
Start a Minikube cluster
Note: There are two key points here.
- We need to use containerd instead of Docker: in my tests, resizing the Pod failed with the Docker runtime. If you need more information about this issue, contact me; I won't go into detail here, because I don't want this story to get too long.
- I also need to mention that, after extensive testing and checking, I found that although this feature is "supposed to be" enabled by default in this version, it works a bit differently in Minikube: it is not enabled for all components. The easiest way around this is to start the cluster with an additional command-line option: "--feature-gates=InPlacePodVerticalScaling=true"
Start the Minikube:
minikube start --kubernetes-version=v1.33.1 --container-runtime=containerd --feature-gates=InPlacePodVerticalScaling=true
Before we go deeper, make sure that your Minikube cluster is up and running and you are interacting with this cluster (kubeconfig is configured for the Minikube)
minikube status
alias k=kubectl
k config current-context
If your kubeconfig is not configured correctly, run these commands:
kubectl config use-context minikube
k get nodes
Expected output is:
NAME       STATUS   ROLES           AGE     VERSION
minikube   Ready    control-plane   5m30s   v1.33.1
Enable Metrics server
We need metrics, of course. We could install the Metrics Server by downloading its manifest file and applying it (with the kubectl apply -f command), but the easiest way is to enable the addon:
minikube addons enable metrics-server
Expected output is:
* metrics-server is an addon maintained by Kubernetes. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
- Using image registry.k8s.io/metrics-server/metrics-server:v0.7.2
* The 'metrics-server' addon is enabled
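Instead of just waiting "a few minutes", you can also block until the addon's deployment reports ready (assuming it lands in the default kube-system namespace):

```shell
# Block until the metrics-server deployment is ready, or time out
kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s
```

Once this returns, `kubectl top` should start serving numbers shortly after.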
Wait a few minutes and make sure that you have metrics:
k top pods -A
Expected output is:
NAMESPACE NAME CPU(cores) MEMORY(bytes)
kube-system coredns-674b8bbfcf-m7jqr 2m 12Mi
kube-system etcd-minikube 16m 26Mi
kube-system kindnet-2n29r 1m 7Mi
kube-system kube-apiserver-minikube 33m 205Mi
kube-system kube-controller-manager-minikube 14m 41Mi
kube-system kube-proxy-jh6gm 1m 11Mi
kube-system kube-scheduler-minikube 7m 19Mi
kube-system metrics-server-7fbb699795-bsb8v 3m 15Mi
kube-system storage-provisioner 2m 7Mi
test test-pod 0m 2Mi
Create namespace and manifest file
Next, create a new namespace and start a Pod within it:
kubectl create namespace test
Create a file named "test-pod.yaml" and save it:
apiVersion: v1
kind: Pod
metadata:
name: test-pod
namespace: test
spec:
containers:
- name: nginx
image: nginx
resources:
requests:
cpu: "200m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
and apply it:
kubectl apply -f test-pod.yaml
Important!: don't forget that your kubectl version should match the running Kubernetes version! (There is some tolerance for skew, but I always suggest keeping them at the same version level.) That means you should avoid a situation like this:
$ kubectl version
Client Version: v1.28.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.33.1
WARNING: version difference between client (1.28) and server (1.33) exceeds the supported minor version skew of +/-1
Confirm that the test pod is up and running:
k get pods -ntest
Expected output is:
NAME       READY   STATUS    RESTARTS   AGE
test-pod   1/1     Running   0          17m
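One optional manifest addition worth knowing about (I did not use it in my PoC manifest): alongside in-place resizing, containers can declare a resizePolicy that controls, per resource, whether a change is applied in place or triggers a container restart. A sketch of what that could look like in test-pod.yaml:

```yaml
# Optional per-resource resize behaviour (sketch, not part of the PoC manifest).
# NotRequired (the default) resizes in place; RestartContainer restarts the
# container whenever that resource is changed.
spec:
  containers:
  - name: nginx
    image: nginx
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
```

This matters for runtimes and apps that cannot pick up a new memory limit at runtime: for those, you'd set memory to RestartContainer and keep CPU in-place.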
Testing manual resizing
Validate that the resizing works well: first, we will test it manually. Later, we will move forward and automate these steps.
First, run these commands:
kubectl patch pod test-pod -n test --subresource resize --patch '{"spec": {"containers": [{"name": "nginx", "resources": {"requests": {"cpu": "300m", "memory": "256Mi"}, "limits": {"cpu": "600m", "memory": "512Mi"}}}]}}'
kubectl get pod test-pod -n test -o yaml
kubectl get pod test-pod -n test
Make sure that the pod wasn’t restarted and limits and requests are updated correctly:
kubectl get pod test-pod -n test
NAME       READY   STATUS    RESTARTS   AGE
test-pod   1/1     Running   0          30m
k get pods -ntest -oyaml |grep -i -E "limits|requests" -A4
Expected output is:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-pod","namespace":"test"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"500m","memory":"256Mi"},"requests":{"cpu":"200m","memory":"128Mi"}}}]}}
creationTimestamp: "2025-05-28T11:54:23Z"
generation: 2
name: test-pod
namespace: test
limits:
cpu: 600m
memory: 512Mi
requests:
cpu: 300m
memory: 256Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
limits:
cpu: 600m
memory: 512Mi
requests:
cpu: 300m
memory: 256Mi
restartCount: 0
started: true
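Grepping the whole YAML works, but there is a more targeted check: with in-place resize, the pod status also reports the resources the kubelet has actually applied, so you can compare spec vs. status directly (the status field assumes a 1.33 cluster with the feature active):

```shell
# What the spec asks for vs. what the kubelet has actually applied
kubectl get pod test-pod -n test -o jsonpath='{.spec.containers[0].resources}'; echo
kubectl get pod test-pod -n test -o jsonpath='{.status.containerStatuses[0].resources}'; echo
```

If the two disagree for a while, the resize is still pending (or infeasible on the node).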
Configuring Access and Permissions
Set up RBAC permissions to enable your monitoring script to access pod metrics and perform resize operations. This is necessary in our case because it supports our automation goal: a CronJob running a monitor.sh script that resizes pods based on resource usage.
First, we need to “restore” the previous state regarding our test pod (we already updated it):
Delete the pod and recreate it:
k delete pod test-pod -ntest && k apply -f test-pod.yaml
pod/test-pod created
and
k get pods -ntest -oyaml |grep -i -E "limits|requests" -A4
Expected output is:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-pod","namespace":"test"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"500m","memory":"256Mi"},"requests":{"cpu":"200m","memory":"128Mi"}}}]}}
creationTimestamp: "2025-05-28T11:54:23Z"
generation: 1
name: test-pod
namespace: test
--
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 200m
memory: 128Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
--
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 200m
memory: 128Mi
restartCount: 0
started: true
Next, create the necessary Kubernetes objects. Create and save these manifest files, then apply them:
Name: pod-scaler-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: pod-scaler
namespace: test
and Name: pod-scaler-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: pod-scaler-role
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "patch"]
- apiGroups: [""]
resources: ["pods/resize"]
verbs: ["get", "patch"]
resourceNames: ["test-pod"]
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "podmetrics"]
verbs: ["get", "list"]
and Name: pod-scaler-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: pod-scaler-binding
subjects:
- kind: ServiceAccount
name: pod-scaler
namespace: test
roleRef:
kind: ClusterRole
name: pod-scaler-role
apiGroup: rbac.authorization.k8s.io
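After applying the three manifests, you can sanity-check the RBAC from the outside before testing from inside a pod, by impersonating the ServiceAccount with kubectl auth can-i (both commands should print "yes"; note that the resize rule is restricted to the test-pod resource name, which is why the pod name appears in the second check):

```shell
# Verify the pod-scaler ServiceAccount's permissions via impersonation
kubectl auth can-i list pods -n test --as=system:serviceaccount:test:pod-scaler
kubectl auth can-i patch pods/test-pod -n test --subresource=resize --as=system:serviceaccount:test:pod-scaler
```

A "no" here means the ClusterRoleBinding is miswired and the CronJob would fail later with a 403.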
Automating Scaling with a Monitor Script
As the next step, we will verify the RBAC setup from inside a pod, to simulate the environment where monitor.sh will run in the CronJob.
This ensures the ServiceAccount token is correctly mounted and can access the Kubernetes API (both metrics.k8s.io for metrics and pods/resize for resizing) from within a pod, mimicking the CronJob’s runtime behaviour.
This is critical to confirm the automation will work end-to-end.
First, create a new pod, named “test-access”:
Name: “test-access-pod.yaml”
apiVersion: v1
kind: Pod
metadata:
name: test-access
namespace: test
spec:
serviceAccountName: pod-scaler
containers:
- name: test
image: bitnami/kubectl:1.33
command: ["sleep", "infinity"]
Apply it and make sure that both pods are up and running:
k apply -f test-access-pod.yaml
and
k get pods -ntest
NAME          READY   STATUS    RESTARTS   AGE
test-access   1/1     Running   0          8s
test-pod      1/1     Running   0          66m
Validate it:
Availability of metrics
Jump into the pod and run these commands to make sure you get metrics:
kubectl exec -it test-access -n test -- bash
and within the pod run the commands:
apt-get update && apt-get install -y curl
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sSk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/apis/metrics.k8s.io/v1beta1/namespaces/test/pods/test-pod
Expected output is:
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 Packages [8793 kB]
Get:5 http://deb.debian.org/debian bookworm-updates/main amd64 Packages [512 B]
Get:6 http://deb.debian.org/debian-security bookworm-security/main amd64 Packages [261 kB]
Fetched 9309 kB in 3s (3079 kB/s)
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
curl is already the newest version (7.88.1-10+deb12u12).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
{
"kind": "PodMetrics",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"name": "test-pod",
"namespace": "test",
"creationTimestamp": "2025-05-28T13:01:02Z"
},
"timestamp": "2025-05-28T12:59:54Z",
"window": "1m4.584s",
"containers": [
{
"name": "nginx",
"usage": {
"cpu": "0",
"memory": "3004Ki"
}
}
]
}root@test-access:/#
Test Pod resizing
Jump into the second pod, "test-access", again:
kubectl exec -it test-access -n test -- bash
and within the pod run these commands:
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
kubectl --token=$TOKEN patch pod test-pod -n test --subresource resize --patch '{"spec": {"containers": [{"name": "nginx", "resources": {"requests": {"cpu": "400m", "memory": "384Mi"}, "limits": {"cpu": "800m", "memory": "768Mi"}}}]}}'
kubectl --token=$TOKEN get pod test-pod -n test -o jsonpath='{.spec.containers[0].resources}'
Expected output is:
pod/test-pod patched
{"limits":{"cpu":"800m","memory":"768Mi"},"requests":{"cpu":"400m","memory":"384Mi"}}
I have no name!@test-access:/$
and
curl -sSk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/apis/metrics.k8s.io/v1beta1/namespaces/test/pods/test-pod
Expected output is:
{
"kind": "PodMetrics",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"name": "test-pod",
"namespace": "test",
"creationTimestamp": "2025-05-28T13:24:42Z"
},
"timestamp": "2025-05-28T13:24:01Z",
"window": "1m20.148s",
"containers": [
{
"name": "nginx",
"usage": {
"cpu": "0",
"memory": "3004Ki"
}
}
]
}I have no name!@test-access:/$ exit
Check the pods:
k get pods -ntest
NAME          READY   STATUS    RESTARTS   AGE
test-access   1/1     Running   0          9m59s
test-pod      1/1     Running   0          91m
You can see that the affected pod's configuration has been patched successfully and that the pod wasn't restarted.
Check it again:
k get pods test-pod -ntest -oyaml |grep -i -E "limits|requests" -A4
Expected output is:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-pod","namespace":"test"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"500m","memory":"256Mi"},"requests":{"cpu":"200m","memory":"128Mi"}}}]}}
creationTimestamp: "2025-05-28T13:55:30Z"
generation: 2
name: test-pod
namespace: test
--
limits:
cpu: 800m
memory: 768Mi
requests:
cpu: 400m
memory: 384Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
--
limits:
cpu: 800m
memory: 768Mi
requests:
cpu: 400m
memory: 384Mi
restartCount: 0
started: true
Before we move on, we need to "restore" the original values of the affected pod. Delete and recreate it.
k delete -f test-pod.yaml
pod "test-pod" deleted
and
k apply -f test-pod.yaml
pod/test-pod created
and
k get pods -ntest
NAME          READY   STATUS    RESTARTS   AGE
test-access   1/1     Running   0          48m
test-pod      1/1     Running   0          8m41s
Create a monitor.sh script file
We need a shell script, so let's create it.
Goal: Create a monitor.sh script to check test-pod’s resource usage via metrics.k8s.io and resize it if usage exceeds thresholds (e.g., CPU > 80% of request). We will use this script via a CronJob resource, with the pod-scaler ServiceAccount.
Name: “monitor.sh”
Content:
#!/bin/bash
set -e
# Install curl, jq, and kubectl
apt-get update && apt-get install -y curl jq wget
wget -q https://dl.k8s.io/release/v1.33.0/bin/linux/amd64/kubectl
chmod +x kubectl
mv kubectl /usr/local/bin/
# Configuration
POD_NAME="test-pod"
NAMESPACE="test"
CPU_THRESHOLD=320000000 # 80% of 400m (in nanocores)
NEW_REQUESTS_CPU="600m"
NEW_REQUESTS_MEMORY="512Mi"
NEW_LIMITS_CPU="1200m"
NEW_LIMITS_MEMORY="1024Mi"
# Get metrics
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
METRICS=$(curl -sSk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/apis/metrics.k8s.io/v1beta1/namespaces/$NAMESPACE/pods/$POD_NAME)
# Parse CPU usage (in nanocores)
CPU_USAGE=$(echo "$METRICS" | jq -r '.containers[0].usage.cpu' | sed 's/n$//')
if [ -z "$CPU_USAGE" ]; then
echo "Error: Could not retrieve CPU usage"
exit 1
fi
echo "CPU Usage: $CPU_USAGE nanocores, Threshold: $CPU_THRESHOLD nanocores"
# Check if CPU usage exceeds threshold
if [ "$CPU_USAGE" -gt "$CPU_THRESHOLD" ]; then
echo "CPU usage exceeds threshold, resizing pod..."
cat <<EOF > patch.json
{
"spec": {
"containers": [
{
"name": "nginx",
"resources": {
"requests": {
"cpu": "$NEW_REQUESTS_CPU",
"memory": "$NEW_REQUESTS_MEMORY"
},
"limits": {
"cpu": "$NEW_LIMITS_CPU",
"memory": "$NEW_LIMITS_MEMORY"
}
}
}
]
}
}
EOF
kubectl --token=$TOKEN patch pod $POD_NAME -n $NAMESPACE --subresource resize --patch-file patch.json
if [ $? -eq 0 ]; then
echo "Pod resized successfully"
else
echo "Error resizing pod"
exit 1
fi
else
echo "CPU usage below threshold, no action needed."
fi
Note: focus on the specified variables, modify them if necessary.
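One caveat in the script above: it assumes the metrics API reports CPU with an "n" (nanocores) suffix, which the sed strips. Kubernetes quantities can in principle carry other unit suffixes, so a more defensive conversion could look like this (a hypothetical helper of my own, not part of the original monitor.sh):

```shell
#!/bin/bash
# Hypothetical helper: normalize a Kubernetes CPU quantity to nanocores.
# Handles "n" (nano), "u" (micro), "m" (milli) suffixes and bare core counts.
to_nanocores() {
  local q="$1"
  case "$q" in
    *n) echo "${q%n}" ;;                      # already nanocores
    *u) echo "$(( ${q%u} * 1000 ))" ;;        # microcores -> nanocores
    *m) echo "$(( ${q%m} * 1000000 ))" ;;     # millicores -> nanocores
    *)  echo "$(( q * 1000000000 ))" ;;       # whole cores -> nanocores
  esac
}

to_nanocores "990630785n"   # -> 990630785
to_nanocores "500m"         # -> 500000000
to_nanocores "0"            # -> 0
```

Swapping the script's sed line for CPU_USAGE=$(to_nanocores "$(echo "$METRICS" | jq -r '.containers[0].usage.cpu')") would make the threshold comparison robust to whichever unit the server returns.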
Make it executable with this command:
chmod +x monitor.sh
Create a configmap and store the monitor.sh file within it:
k create configmap monitor-script --from-file=monitor.sh -n test
configmap/monitor-script created
In the next step, create a CronJob manifest file.
Name: "pod-scaler-cronjob.yaml"
Content:
apiVersion: batch/v1
kind: CronJob
metadata:
name: pod-scaler-cronjob
namespace: test
spec:
schedule: "* * * * *" # Every minute
jobTemplate:
spec:
template:
spec:
serviceAccountName: pod-scaler
containers:
- name: scaler
image: debian:bookworm-slim
command: ["/bin/bash", "/scripts/monitor.sh"]
volumeMounts:
- name: script
mountPath: /scripts
volumes:
- name: script
configMap:
name: monitor-script
restartPolicy: OnFailure
and apply it:
k apply -f pod-scaler-cronjob.yaml
cronjob.batch/pod-scaler-cronjob created
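A couple of CronJob knobs are worth considering for this pattern (a sketch with my own suggested values, not settings from the PoC): concurrencyPolicy: Forbid prevents overlapping runs if one check takes longer than a minute, and the history limits keep completed job pods from piling up in the namespace.

```yaml
# Optional hardening of the CronJob spec (suggested values, not from the PoC)
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Forbid       # skip a run if the previous one is still going
  successfulJobsHistoryLimit: 3   # keep only the last few completed pods
  failedJobsHistoryLimit: 1
```

With a one-minute schedule and apt-get running inside each job, Forbid in particular avoids two scalers patching the same pod at once.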
Use Case: Dynamic Pod Resizing in Action
Testing the in-place vertical pod scaling feature in action was a key goal of my PoC. I used it to dynamically resize an Nginx pod based on CPU thresholds, simulating real-world demand. This experiment showcased its practicality for test environments.
Let’s explore how you can apply this in your own setup!
Generate workload
We will simulate workload (for 2 minutes) on the affected pod, named “test-pod”.
Jump into the pod:
kubectl exec -it test-pod -n test -- bash
and run these commands:
apt-get update && apt-get install -y stress
stress --cpu 2 --timeout 120
Expected output is:
Hit:1 http://deb.debian.org/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian bookworm-updates InRelease
Hit:3 http://deb.debian.org/debian-security bookworm-security InRelease
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
stress is already the newest version (1.0.7-1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
stress: info: [257] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
stress: info: [257] successful run completed in 120s
root@test-pod:/# exit
exit
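While stress is running, you can watch the usage climb from a second terminal (assuming watch is available on your workstation):

```shell
# Refresh pod metrics every 5 seconds during the stress run
watch -n 5 kubectl top pod test-pod -n test
```

You should see the CPU column approach the pod's limit before the next CronJob run fires.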
Validation
Check the pods (note that test-pod was not restarted):
kubectl get pods -n test
NAME                                READY   STATUS    RESTARTS   AGE
pod-scaler-cronjob-29140986-z2h98   1/1     Running   0          12s
test-access                         1/1     Running   0          5h50m
test-pod                            1/1     Running   0          22m
and run it again
kubectl get pods -n test
NAME                                READY   STATUS      RESTARTS   AGE
pod-scaler-cronjob-29140986-z2h98   0/1     Completed   0          14s
test-access                         1/1     Running     0          5h50m
test-pod                            1/1     Running     0          22m
Check the logs:
k logs pod-scaler-cronjob-29140986-z2h98 -n test
...
CPU Usage: 990630785 nanocores, Threshold: 320000000 nanocores
CPU usage exceeds threshold, resizing pod...
pod/test-pod patched (no change)
Pod resized successfully
Check the limits and requests values of affected pod:
kubectl describe pod test-pod -n test | grep -i -E "limits|requests" -A4
Expected output is:
Limits:
cpu: 1200m
memory: 1Gi
Requests:
cpu: 600m
memory: 512Mi
Environment: <none>
Mounts:
Great! This is what we wanted!
Limitations and Production Considerations
While my PoC was exciting, it highlighted some limitations of in-place scaling in its beta state. It works well for test environments but requires refinement for production use.
I plan to share these insights to help with advance planning.
Conclusion and Next Steps
My PoC with Kubernetes 1.33’s in-place vertical pod scaling opened my eyes to its potential and challenges. This guide walked you through the process, from setup to automation, with real-world insights.
About the Author I’m Róbert Zsótér, Kubernetes & AWS architect. If you’re into Kubernetes, EKS, Terraform, and cloud-native security, follow my latest posts here:
- LinkedIn: Róbert Zsótér
- Substack: CSHU
Let’s build secure, scalable clusters, together.
Originally published on Medium: Hands-On with Kubernetes 1.33: My PoC on In-Place Vertical Scaling