Autoscaling
Overview
Autoscaling on this platform operates across three independent axes:
- Horizontal autoscaling — change the number of pods (or Kubernetes Jobs) in response to load. The platform delivers this via KEDA, which generalises the native `HorizontalPodAutoscaler` to over 60 external event sources (queues, streams, Prometheus queries, cron, CPU/memory, cloud services).
- Vertical autoscaling — change the CPU and memory requests on existing pods. The platform ships the Vertical Pod Autoscaler (VPA), which continuously analyzes actual resource usage and recommends right-sized requests.
- Node-level autoscaling — provision new worker nodes to fit pending pods. This is handled by Karpenter.
| Layer | What it scales | Platform component |
|---|---|---|
| Cluster (nodes) | Worker node count and shape | Karpenter |
| Workload (replicas) | Number of pods / parallel Jobs | KEDA |
| Pod (resource requests) | CPU and memory requests per pod | VPA |
Why all three?
Each layer solves a different problem:
- Karpenter ensures you have enough compute capacity for all pods
- KEDA scales the number of pods based on external events (queues, streams, time-based schedules)
- VPA right-sizes each pod's resource requests based on actual usage
Running all three together means the right number of right-sized pods on the right amount of compute.
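As an illustration of the KEDA layer, the sketch below scales a workload on two external signals at once. It is a minimal example under stated assumptions, not the platform's shipped configuration: the Deployment name `orders-worker`, the Prometheus address, the query, and the thresholds are all hypothetical placeholders.

```yaml
# Hypothetical ScaledObject: every name, address, and number is a placeholder.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: orders-worker-scaler
spec:
  scaleTargetRef:
    name: orders-worker            # assumed Deployment in the same namespace
  minReplicaCount: 0               # allows scale-to-zero when no trigger is active
  maxReplicaCount: 20
  triggers:
    # Event-driven trigger: scale on a Prometheus query result.
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090   # assumed address
        query: sum(orders_queue_depth)                         # assumed metric
        threshold: "50"            # target value per replica
    # Time-based trigger: hold a floor of replicas during business hours.
    - type: cron
      metadata:
        timezone: Europe/London
        start: 0 8 * * 1-5         # 08:00 Monday to Friday
        end: 0 18 * * 1-5          # 18:00 Monday to Friday
        desiredReplicas: "3"
```

When several triggers are defined, KEDA scales to the highest replica count any of them requests. If the extra pods do not fit on existing nodes they stay pending, which is the signal Karpenter acts on to provision capacity.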
Safe integration: VPA + KEDA
VPA and KEDA are complementary but must be coordinated to avoid pod churn:
VPA + KEDA — Prevent Conflicts
When using VPA on workloads scaled by KEDA:
- Use VPA in `Off` mode (recommendations only, no automatic pod evictions)
- Apply recommendations manually during planned maintenance windows
- This prevents VPA evictions from disrupting KEDA's scaling decisions
See VPA with KEDA for the full pattern.
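A recommendation-only VPA object could look like the sketch below. It assumes a hypothetical Deployment named `orders-worker` that is also the target of a KEDA ScaledObject; the object name is likewise a placeholder.

```yaml
# Hypothetical VPA in recommendation-only mode; names are placeholders.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: orders-worker-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders-worker            # assumed workload, also scaled by KEDA
  updatePolicy:
    updateMode: "Off"              # compute recommendations, never evict pods
```

With `updateMode: "Off"`, the recommender still fills in `status.recommendation.containerRecommendations` (including `lowerBound`, `target`, and `upperBound` per container), which can be inspected with `kubectl describe vpa orders-worker-vpa` and copied into the workload's resource requests during a maintenance window.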
Enabling autoscaling
Each addon is feature-flagged on the cluster definition in the tenant repository:
metadata:
  labels:
    enable_vpa: "true"                  # Vertical scaling — right-size pod requests
    enable_keda: "true"                 # Horizontal scaling — scale replicas on events
    enable_karpenter_nodepools: "true"  # Node scaling — auto-provision worker nodes
Next steps
- VPA — enabling VPA, reading recommendations, applying changes safely, and safe integration with KEDA.
- KEDA — enabling KEDA, customising Helm values, Prometheus integration, and end-to-end examples (scale-to-zero, cron, queue depth, CPU, hybrid patterns).
- Karpenter — node-level autoscaling for AWS/EKS.