Introduction
Kubernetes 1.33 introduced a game-changing feature: in-place vertical pod scaling, now in beta and enabled by default. As a cloud engineer, I was eager to test its ability to dynamically adjust CPU and memory for running pods without restarts, a potential win for resource optimization. Inspired by its promise, I set up a Proof of Concept (PoC) on AWS EC2 using Minikube to explore its practical applications.
In this guide, I’ll walk you through my step-by-step process, from cluster setup to scaling automation, and share insights for leveraging this feature in your own test environments.
Let’s dive in!
What is In-Place Vertical Pod Scaling?
Kubernetes 1.33 brings in-place vertical pod scaling as a default feature that lets you adjust a pod’s CPU and memory resources on the fly.
Unlike traditional scaling, this approach avoids pod restarts, making it ideal for maintaining application availability. I was intrigued by its potential to optimize my customer's applications efficiently.
This guide will explain how this works and why it matters for your Kubernetes journey.
Prerequisites for the PoC
Before diving into my PoC, you’ll need the right tools to replicate my setup on AWS EC2. This includes installing Minikube with Kubernetes 1.33 and setting up the Metrics Server for resource monitoring. I ensured these were in place to make the scaling process smooth and measurable.
Let’s cover the essentials to get you started!
Note:
I won't go into how to install a Minikube cluster: there are lots of good tutorials on the Internet, and it is a very simple step.
In my PoC, I used an EC2 instance on AWS, but this is not a limitation: you can use any other environment, there is no dependency in this respect.
For the Minikube instance I chose an EC2 instance type (t3a.medium) with the following "hardware" parameters: vCPU: 2, RAM: 4 GiB, storage: 40 GB (with the default storage type)
More information: https://aws.amazon.com/ec2/instance-types/t3/
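Before starting Minikube, it can save you some head-scratching to confirm the host actually has the resources you think it does. Below is a hedged sanity-check sketch (the `host_ok` helper and its thresholds are my own, mirroring the t3a.medium specs above; Linux-only, since it reads /proc/meminfo):

```shell
#!/bin/bash
# Hypothetical helper: check the host meets minimum CPU/RAM before
# starting Minikube. Defaults mirror the t3a.medium specs used in this PoC.
host_ok() {
  local min_cpus="${1:-2}" min_mem_mb="${2:-3500}"
  local cpus mem_mb
  cpus=$(nproc)
  mem_mb=$(awk '/^MemTotal/{print int($2/1024)}' /proc/meminfo)
  echo "host: ${cpus} vCPU, ${mem_mb} MB RAM (want >= ${min_cpus} vCPU, ${min_mem_mb} MB)"
  [ "$cpus" -ge "$min_cpus" ] && [ "$mem_mb" -ge "$min_mem_mb" ]
}

host_ok || echo "host is below the recommended minimum for this PoC"
```

If the check fails, pick a larger instance type; an undersized node makes resize experiments misleading because the scheduler has no headroom to grant bigger requests.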
Setting Up the Test Environment
Creating a solid test environment was the first step in my PoC journey with Kubernetes 1.33. I set up a Minikube cluster on AWS EC2 and deployed a Metrics Server to track pod resources. This setup allowed me to test the vertical scaling feature with a sample Nginx pod.
Here, I’ll guide you through building this foundation.
Start a Minikube cluster
Note: There are two key points here.
- We need to use containerd instead of Docker: in my tests, resizing the Pod failed with the Docker runtime. If you need more information about this issue, contact me; I won't go into detail here, because I don't want this story to get too long.
- I also need to mention that, after extensive testing and checking, I found that although this feature is "supposed to be" enabled by default in this version, it works a bit differently in Minikube: it is not enabled for all components. The easiest way around this is to start the cluster with an additional command-line option: "--feature-gates=InPlacePodVerticalScaling=true"
Start the Minikube:
minikube start --kubernetes-version=v1.33.1 --container-runtime=containerd --feature-gates=InPlacePodVerticalScaling=true
Before we go deeper, make sure that your Minikube cluster is up and running and you are interacting with this cluster (kubeconfig is configured for the Minikube)
minikube status
alias k=kubectl
k config current-context
If your kubeconfig is not configured correctly, run these commands:
kubectl config use-context minikube
k get nodes
Expected output is:
NAME       STATUS   ROLES           AGE     VERSION
minikube   Ready    control-plane   5m30s   v1.33.1
Enable Metrics server
We need metrics, of course. We could install the Metrics Server by downloading its manifest file and applying it (with the kubectl apply -f command), but the easiest way is to enable the addon:
minikube addons enable metrics-server
Expected output is:
* metrics-server is an addon maintained by Kubernetes. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
- Using image registry.k8s.io/metrics-server/metrics-server:v0.7.2
* The 'metrics-server' addon is enabled
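Instead of just waiting "a few minutes", you can also block until the addon's deployment reports ready (assuming it lands in the default kube-system namespace):

```shell
# Block until the metrics-server deployment is ready, or time out
kubectl -n kube-system rollout status deployment/metrics-server --timeout=120s
```

Once this returns, `kubectl top` should start serving numbers shortly after.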
Wait a few minutes and make sure that you have metrics:
k top pods -A
Expected output is:
NAMESPACE NAME CPU(cores) MEMORY(bytes)
kube-system coredns-674b8bbfcf-m7jqr 2m 12Mi
kube-system etcd-minikube 16m 26Mi
kube-system kindnet-2n29r 1m 7Mi
kube-system kube-apiserver-minikube 33m 205Mi
kube-system kube-controller-manager-minikube 14m 41Mi
kube-system kube-proxy-jh6gm 1m 11Mi
kube-system kube-scheduler-minikube 7m 19Mi
kube-system metrics-server-7fbb699795-bsb8v 3m 15Mi
kube-system storage-provisioner 2m 7Mi
test test-pod 0m 2Mi
Create namespace and manifest file
Next, create a new namespace and start a Pod within it:
kubectl create namespace test
Create a file named "test-pod.yaml" and save it:
apiVersion: v1
kind: Pod
metadata:
name: test-pod
namespace: test
spec:
containers:
- name: nginx
image: nginx
resources:
requests:
cpu: "200m"
memory: "128Mi"
limits:
cpu: "500m"
memory: "256Mi"
and apply it:
kubectl apply -f test-pod.yaml
Important!: don't forget that your kubectl version should match the running Kubernetes version! (There is some tolerance for skew, but I always suggest keeping them at the same version level.) That means you should avoid a situation like this:
$ kubectl version
Client Version: v1.28.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.33.1
WARNING: version difference between client (1.28) and server (1.33) exceeds the supported minor version skew of +/-1
Confirm that the test pod is up and running:
k get pods -ntest
Expected output is:
NAME       READY   STATUS    RESTARTS   AGE
test-pod   1/1     Running   0          17m
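One optional manifest addition worth knowing about (I did not use it in my PoC manifest): alongside in-place resizing, containers can declare a resizePolicy that controls, per resource, whether a change is applied in place or triggers a container restart. A sketch of what that could look like in test-pod.yaml:

```yaml
# Optional per-resource resize behaviour (sketch, not part of the PoC manifest).
# NotRequired (the default) resizes in place; RestartContainer restarts the
# container whenever that resource is changed.
spec:
  containers:
  - name: nginx
    image: nginx
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
```

This matters for runtimes and apps that cannot pick up a new memory limit at runtime: for those, you'd set memory to RestartContainer and keep CPU in-place.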
Testing manual resizing
Validate that the resizing works well: first, we will test it manually. Later, we will move forward and automate these steps.
First, run these commands:
kubectl patch pod test-pod -n test --subresource resize --patch '{"spec": {"containers": [{"name": "nginx", "resources": {"requests": {"cpu": "300m", "memory": "256Mi"}, "limits": {"cpu": "600m", "memory": "512Mi"}}}]}}'
kubectl get pod test-pod -n test -o yaml
kubectl get pod test-pod -n test
Make sure that the pod wasn’t restarted and limits and requests are updated correctly:
kubectl get pod test-pod -n test
NAME       READY   STATUS    RESTARTS   AGE
test-pod   1/1     Running   0          30m
k get pods -ntest -oyaml |grep -i -E "limits|requests" -A4
Expected output is:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-pod","namespace":"test"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"500m","memory":"256Mi"},"requests":{"cpu":"200m","memory":"128Mi"}}}]}}
creationTimestamp: "2025-05-28T11:54:23Z"
generation: 2
name: test-pod
namespace: test
limits:
cpu: 600m
memory: 512Mi
requests:
cpu: 300m
memory: 256Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
limits:
cpu: 600m
memory: 512Mi
requests:
cpu: 300m
memory: 256Mi
restartCount: 0
started: true
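Grepping the whole YAML works, but there is a more targeted check: with in-place resize, the pod status also reports the resources the kubelet has actually applied, so you can compare spec vs. status directly (the status field assumes a 1.33 cluster with the feature active):

```shell
# What the spec asks for vs. what the kubelet has actually applied
kubectl get pod test-pod -n test -o jsonpath='{.spec.containers[0].resources}'; echo
kubectl get pod test-pod -n test -o jsonpath='{.status.containerStatuses[0].resources}'; echo
```

If the two disagree for a while, the resize is still pending (or infeasible on the node).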
Configuring Access and Permissions
Set up RBAC permissions to enable your monitoring script to access pod metrics and perform resize operations. This is necessary in our case because it supports our automation goal: a CronJob running a monitor.sh script that resizes pods based on resource usage.
First, we need to “restore” the previous state regarding our test pod (we already updated it):
Delete the pod and recreate it:
k delete pod test-pod -ntest && k apply -f test-pod.yaml
pod/test-pod created
and
k get pods -ntest -oyaml |grep -i -E "limits|requests" -A4
Expected output is:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-pod","namespace":"test"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"500m","memory":"256Mi"},"requests":{"cpu":"200m","memory":"128Mi"}}}]}}
creationTimestamp: "2025-05-28T11:54:23Z"
generation: 1
name: test-pod
namespace: test
--
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 200m
memory: 128Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
--
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 200m
memory: 128Mi
restartCount: 0
started: true
Next, create the necessary Kubernetes objects. Create and save these manifest files, then apply them:
Name: pod-scaler-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: pod-scaler
namespace: test
and Name: pod-scaler-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: pod-scaler-role
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "patch"]
- apiGroups: [""]
resources: ["pods/resize"]
verbs: ["get", "patch"]
resourceNames: ["test-pod"]
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "podmetrics"]
verbs: ["get", "list"]
and Name: pod-scaler-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: pod-scaler-binding
subjects:
- kind: ServiceAccount
name: pod-scaler
namespace: test
roleRef:
kind: ClusterRole
name: pod-scaler-role
apiGroup: rbac.authorization.k8s.io
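After applying the three manifests, you can sanity-check the RBAC from the outside before testing from inside a pod, by impersonating the ServiceAccount with kubectl auth can-i (both commands should print "yes"; note that the resize rule is restricted to the test-pod resource name, which is why the pod name appears in the second check):

```shell
# Verify the pod-scaler ServiceAccount's permissions via impersonation
kubectl auth can-i list pods -n test --as=system:serviceaccount:test:pod-scaler
kubectl auth can-i patch pods/test-pod -n test --subresource=resize --as=system:serviceaccount:test:pod-scaler
```

A "no" here means the ClusterRoleBinding is miswired and the CronJob would fail later with a 403.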
Automating Scaling with a Monitor Script
As the next step, we will verify the RBAC setup from inside a pod, to simulate the environment where monitor.sh will run in the CronJob.
This ensures the ServiceAccount token is correctly mounted and can access the Kubernetes API (both metrics.k8s.io for metrics and pods/resize for resizing) from within a pod, mimicking the CronJob’s runtime behaviour.
This is critical to confirm the automation will work end-to-end.
First, create a new pod, named “test-access”:
Name: “test-access-pod.yaml”
apiVersion: v1
kind: Pod
metadata:
name: test-access
namespace: test
spec:
serviceAccountName: pod-scaler
containers:
- name: test
image: bitnami/kubectl:1.33
command: ["sleep", "infinity"]
Apply it and make sure that both pods are up and running:
k apply -f test-access-pod.yaml
and
k get pods -ntest
NAME          READY   STATUS    RESTARTS   AGE
test-access   1/1     Running   0          8s
test-pod      1/1     Running   0          66m
Validate it:
Availability of metrics
Jump into the pod and run these commands to make sure you get metrics:
kubectl exec -it test-access -n test -- bash
and within the pod run the commands:
apt-get update && apt-get install -y curl
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sSk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/apis/metrics.k8s.io/v1beta1/namespaces/test/pods/test-pod
Expected output is:
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Get:4 http://deb.debian.org/debian bookworm/main amd64 Packages [8793 kB]
Get:5 http://deb.debian.org/debian bookworm-updates/main amd64 Packages [512 B]
Get:6 http://deb.debian.org/debian-security bookworm-security/main amd64 Packages [261 kB]
Fetched 9309 kB in 3s (3079 kB/s)
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
curl is already the newest version (7.88.1-10+deb12u12).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
{
"kind": "PodMetrics",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"name": "test-pod",
"namespace": "test",
"creationTimestamp": "2025-05-28T13:01:02Z"
},
"timestamp": "2025-05-28T12:59:54Z",
"window": "1m4.584s",
"containers": [
{
"name": "nginx",
"usage": {
"cpu": "0",
"memory": "3004Ki"
}
}
]
}root@test-access:/#
Test Pod resizing
Jump into the second pod, "test-access", again:
kubectl exec -it test-access -n test -- bash
and within the pod run these commands:
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
kubectl --token=$TOKEN patch pod test-pod -n test --subresource resize --patch '{"spec": {"containers": [{"name": "nginx", "resources": {"requests": {"cpu": "400m", "memory": "384Mi"}, "limits": {"cpu": "800m", "memory": "768Mi"}}}]}}'
kubectl --token=$TOKEN get pod test-pod -n test -o jsonpath='{.spec.containers[0].resources}'
Expected output is:
pod/test-pod patched
{"limits":{"cpu":"800m","memory":"768Mi"},"requests":{"cpu":"400m","memory":"384Mi"}}
I have no name!@test-access:/$
and
curl -sSk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/apis/metrics.k8s.io/v1beta1/namespaces/test/pods/test-pod
Expected output is:
{
"kind": "PodMetrics",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"name": "test-pod",
"namespace": "test",
"creationTimestamp": "2025-05-28T13:24:42Z"
},
"timestamp": "2025-05-28T13:24:01Z",
"window": "1m20.148s",
"containers": [
{
"name": "nginx",
"usage": {
"cpu": "0",
"memory": "3004Ki"
}
}
]
}I have no name!@test-access:/$ exit
Check the pods:
k get pods -ntest
NAME          READY   STATUS    RESTARTS   AGE
test-access   1/1     Running   0          9m59s
test-pod      1/1     Running   0          91m
You can see that the affected pod's configuration has been patched successfully and that the pod wasn't restarted.
Check it again:
k get pods test-pod -ntest -oyaml |grep -i -E "limits|requests" -A4
Expected output is:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-pod","namespace":"test"},"spec":{"containers":[{"image":"nginx","name":"nginx","resources":{"limits":{"cpu":"500m","memory":"256Mi"},"requests":{"cpu":"200m","memory":"128Mi"}}}]}}
creationTimestamp: "2025-05-28T13:55:30Z"
generation: 2
name: test-pod
namespace: test
--
limits:
cpu: 800m
memory: 768Mi
requests:
cpu: 400m
memory: 384Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
--
limits:
cpu: 800m
memory: 768Mi
requests:
cpu: 400m
memory: 384Mi
restartCount: 0
started: true
Before we move on, we need to "restore" the original values of the affected pod. Delete and recreate it.
k delete -f test-pod.yaml
pod "test-pod" deleted
and
k apply -f test-pod.yaml
pod/test-pod created
and
k get pods -ntest
NAME          READY   STATUS    RESTARTS   AGE
test-access   1/1     Running   0          48m
test-pod      1/1     Running   0          8m41s
Create a monitor.sh script file
We need a shell script, so let's create it.
Goal: Create a monitor.sh script to check test-pod’s resource usage via metrics.k8s.io and resize it if usage exceeds thresholds (e.g., CPU > 80% of request). We will use this script via a CronJob resource, with the pod-scaler ServiceAccount.
Name: “monitor.sh”
Content:
#!/bin/bash
set -e
# Install curl, jq, and kubectl
apt-get update && apt-get install -y curl jq wget
wget -q https://dl.k8s.io/release/v1.33.0/bin/linux/amd64/kubectl
chmod +x kubectl
mv kubectl /usr/local/bin/
# Configuration
POD_NAME="test-pod"
NAMESPACE="test"
CPU_THRESHOLD=320000000 # 80% of 400m (in nanocores)
NEW_REQUESTS_CPU="600m"
NEW_REQUESTS_MEMORY="512Mi"
NEW_LIMITS_CPU="1200m"
NEW_LIMITS_MEMORY="1024Mi"
# Get metrics
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
METRICS=$(curl -sSk -H "Authorization: Bearer $TOKEN" https://kubernetes.default.svc/apis/metrics.k8s.io/v1beta1/namespaces/$NAMESPACE/pods/$POD_NAME)
# Parse CPU usage (in nanocores)
CPU_USAGE=$(echo "$METRICS" | jq -r '.containers[0].usage.cpu' | sed 's/n$//')
if [ -z "$CPU_USAGE" ]; then
echo "Error: Could not retrieve CPU usage"
exit 1
fi
echo "CPU Usage: $CPU_USAGE nanocores, Threshold: $CPU_THRESHOLD nanocores"
# Check if CPU usage exceeds threshold
if [ "$CPU_USAGE" -gt "$CPU_THRESHOLD" ]; then
echo "CPU usage exceeds threshold, resizing pod..."
cat <<EOF > patch.json
{
"spec": {
"containers": [
{
"name": "nginx",
"resources": {
"requests": {
"cpu": "$NEW_REQUESTS_CPU",
"memory": "$NEW_REQUESTS_MEMORY"
},
"limits": {
"cpu": "$NEW_LIMITS_CPU",
"memory": "$NEW_LIMITS_MEMORY"
}
}
}
]
}
}
EOF
kubectl --token=$TOKEN patch pod $POD_NAME -n $NAMESPACE --subresource resize --patch-file patch.json
if [ $? -eq 0 ]; then
echo "Pod resized successfully"
else
echo "Error resizing pod"
exit 1
fi
else
echo "CPU usage below threshold, no action needed."
fi
Note: focus on the specified variables, modify them if necessary.
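One caveat in the script above: it assumes the metrics API reports CPU with an "n" (nanocores) suffix, which the sed strips. Kubernetes quantities can in principle carry other unit suffixes, so a more defensive conversion could look like this (a hypothetical helper of my own, not part of the original monitor.sh):

```shell
#!/bin/bash
# Hypothetical helper: normalize a Kubernetes CPU quantity to nanocores.
# Handles "n" (nano), "u" (micro), "m" (milli) suffixes and bare core counts.
to_nanocores() {
  local q="$1"
  case "$q" in
    *n) echo "${q%n}" ;;                      # already nanocores
    *u) echo "$(( ${q%u} * 1000 ))" ;;        # microcores -> nanocores
    *m) echo "$(( ${q%m} * 1000000 ))" ;;     # millicores -> nanocores
    *)  echo "$(( q * 1000000000 ))" ;;       # whole cores -> nanocores
  esac
}

to_nanocores "990630785n"   # -> 990630785
to_nanocores "500m"         # -> 500000000
to_nanocores "0"            # -> 0
```

Swapping the script's sed line for CPU_USAGE=$(to_nanocores "$(echo "$METRICS" | jq -r '.containers[0].usage.cpu')") would make the threshold comparison robust to whichever unit the server returns.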
Make it executable with this command:
chmod +x monitor.sh
Create a configmap and store the monitor.sh file within it:
k create configmap monitor-script --from-file=monitor.sh -n test
configmap/monitor-script created
In the next step, create a CronJob manifest file.
Name: "pod-scaler-cronjob.yaml"
Content:
apiVersion: batch/v1
kind: CronJob
metadata:
name: pod-scaler-cronjob
namespace: test
spec:
schedule: "* * * * *" # Every minute
jobTemplate:
spec:
template:
spec:
serviceAccountName: pod-scaler
containers:
- name: scaler
image: debian:bookworm-slim
command: ["/bin/bash", "/scripts/monitor.sh"]
volumeMounts:
- name: script
mountPath: /scripts
volumes:
- name: script
configMap:
name: monitor-script
restartPolicy: OnFailure
and apply it:
k apply -f pod-scaler-cronjob.yaml
cronjob.batch/pod-scaler-cronjob created
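A couple of CronJob knobs are worth considering for this pattern (a sketch with my own suggested values, not settings from the PoC): concurrencyPolicy: Forbid prevents overlapping runs if one check takes longer than a minute, and the history limits keep completed job pods from piling up in the namespace.

```yaml
# Optional hardening of the CronJob spec (suggested values, not from the PoC)
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Forbid       # skip a run if the previous one is still going
  successfulJobsHistoryLimit: 3   # keep only the last few completed pods
  failedJobsHistoryLimit: 1
```

With a one-minute schedule and apt-get running inside each job, Forbid in particular avoids two scalers patching the same pod at once.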
Use Case: Dynamic Pod Resizing in Action
Testing the in-place vertical pod scaling feature in action was a key goal of my PoC. I used it to dynamically resize an Nginx pod based on CPU thresholds, simulating real-world demand. This experiment showcased its practicality for test environments.
Let’s explore how you can apply this in your own setup!
Generate workload
We will simulate workload (for 2 minutes) on the affected pod, named “test-pod”.
Jump into the pod:
kubectl exec -it test-pod -n test -- bash
and run these commands:
apt-get update && apt-get install -y stress
stress --cpu 2 --timeout 120
Expected output is:
Hit:1 http://deb.debian.org/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian bookworm-updates InRelease
Hit:3 http://deb.debian.org/debian-security bookworm-security InRelease
Reading package lists... Done
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
stress is already the newest version (1.0.7-1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
stress: info: [257] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
stress: info: [257] successful run completed in 120s
root@test-pod:/# exit
exit
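While stress is running, you can watch the usage climb from a second terminal (assuming watch is available on your workstation):

```shell
# Refresh pod metrics every 5 seconds during the stress run
watch -n 5 kubectl top pod test-pod -n test
```

You should see the CPU column approach the pod's limit before the next CronJob run fires.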
Validation
Check the pods (note that test-pod was not restarted):
kubectl get pods -n test
NAME                                READY   STATUS    RESTARTS   AGE
pod-scaler-cronjob-29140986-z2h98   1/1     Running   0          12s
test-access                         1/1     Running   0          5h50m
test-pod                            1/1     Running   0          22m
and run it again
kubectl get pods -n test
NAME                                READY   STATUS      RESTARTS   AGE
pod-scaler-cronjob-29140986-z2h98   0/1     Completed   0          14s
test-access                         1/1     Running     0          5h50m
test-pod                            1/1     Running     0          22m
Check the logs:
k logs pod-scaler-cronjob-29140986-z2h98 -n test
...
CPU Usage: 990630785 nanocores, Threshold: 320000000 nanocores
CPU usage exceeds threshold, resizing pod...
pod/test-pod patched (no change)
Pod resized successfully
Check the limits and requests values of affected pod:
kubectl describe pod test-pod -n test | grep -i -E "limits|requests" -A4
Expected output is:
Limits:
cpu: 1200m
memory: 1Gi
Requests:
cpu: 600m
memory: 512Mi
Environment: <none>
Mounts:
Great! This is what we wanted!
Limitations and Production Considerations
While my PoC was exciting, it highlighted some limitations of in-place scaling in its beta state. It works well for test environments but requires refinement for production use.
I plan to share these insights to help with advance planning.
Conclusion and Next Steps
My PoC with Kubernetes 1.33’s in-place vertical pod scaling opened my eyes to its potential and challenges. This guide walked you through the process, from setup to automation, with real-world insights.
About the Author I’m Róbert Zsótér, Kubernetes & AWS architect. If you’re into Kubernetes, EKS, Terraform, and cloud-native security, follow my latest posts here:
- LinkedIn: Róbert Zsótér
- Substack: CSHU
Let’s build secure, scalable clusters, together.
Originally published on Medium: Hands-On with Kubernetes 1.33: My PoC on In-Place Vertical Scaling