Kubernetes makes scaling easy, but uncontrolled growth can lead to surprising cloud bills. Here are proven strategies to optimize Kubernetes costs.
## The Cost Problem

Common Kubernetes cost issues:

- **Over-provisioning**: Pods request more resources than they need
- **Cluster autoscaler lag**: Nodes running at low utilization
- **Lack of visibility**: No clear cost attribution by team or app
- **Inefficient storage**: Unused persistent volumes
- **Data transfer costs**: Cross-AZ or cross-region traffic
## Right-Sizing Workloads

### Set Appropriate Resource Requests

Don't guess; use metrics:

```yaml
resources:
  requests:
    cpu: 100m       # Based on p50 usage
    memory: 256Mi   # Based on p95 usage
  limits:
    cpu: 500m       # Allow burst for spikes
    memory: 512Mi   # OOM prevention
```
Tools: VPA (Vertical Pod Autoscaler), Goldilocks, KRR
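As a sketch, a VPA in recommendation-only mode looks like this (assumes the VPA components are installed in the cluster; `my-app` is a placeholder name):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # Recommend only; never evict pods to apply changes
```

With `updateMode: "Off"`, recommendations appear under `kubectl describe vpa my-app` and can be applied to resource requests manually.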
### Use Horizontal Pod Autoscaling

Scale pods based on actual demand:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
## Cluster-Level Optimization

### 1. Node Autoscaling

Configure the Cluster Autoscaler for right-sized nodes:

- Use multiple node groups with different instance types
- Enable scale-down to remove underutilized nodes
- Set an appropriate scale-down delay (10-15 minutes)
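The scale-down behavior above maps to flags on the cluster-autoscaler container. The values here are an illustrative sketch, not a recommendation for every cluster:

```yaml
# Excerpt from a cluster-autoscaler Deployment spec
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --balance-similar-node-groups
      - --scale-down-enabled=true
      - --scale-down-delay-after-add=10m        # Wait after a node is added
      - --scale-down-unneeded-time=10m          # Node must stay underutilized this long
      - --scale-down-utilization-threshold=0.5  # Below 50% usage counts as underutilized
```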
### 2. Spot/Preemptible Instances

Run non-critical workloads on spot instances (typically 60-80% cheaper than on-demand):

```yaml
# Spot labels and taints vary by provider; this sketch assumes an EKS managed
# node group, which labels spot nodes eks.amazonaws.com/capacityType=SPOT.
nodeSelector:
  eks.amazonaws.com/capacityType: SPOT
tolerations:
  - key: "node.kubernetes.io/spot"   # Match whatever taint your spot nodes carry
    operator: "Exists"
```

Best for: batch jobs, development environments, and stateless apps with multiple replicas.
### 3. Bin Packing

Improve node utilization with better pod placement:

- Use pod priority classes
- Configure the scheduler to pack pods tightly
- Avoid anti-affinity rules unless required
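By default the scheduler spreads pods onto the emptiest nodes; a scheduler profile can invert that scoring to favor bin packing. A hedged sketch (requires access to the kube-scheduler configuration, which managed control planes may not expose):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    pluginConfig:
      - name: NodeResourcesFit
        args:
          scoringStrategy:
            type: MostAllocated   # Prefer nodes that are already fuller
            resources:
              - name: cpu
                weight: 1
              - name: memory
                weight: 1
```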
## Storage Optimization

### Clean Up Unused Volumes

List orphaned PVs:

```shell
kubectl get pv | grep Released
```
### Use Storage Classes Wisely

- **General purpose**: For most workloads (gp3 on AWS)
- **High IOPS**: Only for databases that need it (io2)
- **Cold storage**: For backups (S3/GCS)
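For example, a default gp3 StorageClass on AWS might look like this (assumes the EBS CSI driver is installed):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
allowVolumeExpansion: true               # Grow volumes later instead of over-provisioning upfront
volumeBindingMode: WaitForFirstConsumer  # Create the volume in the consuming pod's AZ
```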
### Snapshot Policies

Implement retention policies; don't keep snapshots forever.
## Network Cost Reduction

### Minimize Cross-AZ Traffic

- Use topology-aware routing
- Colocate related services in the same AZ when possible
- Cache frequently accessed data
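Topology-aware routing can be enabled per Service with an annotation (Kubernetes 1.27+; older versions use the `service.kubernetes.io/topology-aware-hints` annotation instead). `my-app` is a placeholder:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    service.kubernetes.io/topology-mode: Auto  # Prefer endpoints in the caller's zone
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```

Note that traffic falls back to cross-zone endpoints when a zone has too few of them, so this is a hint, not a guarantee.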
### Use Service Mesh Efficiently

Service meshes add overhead; make sure the benefits justify the cost:

- Sidecar resource consumption
- Control-plane infrastructure
- Additional network hops
## Monitoring & Attribution

### Implement Cost Allocation

Label resources for cost attribution:

```yaml
labels:
  team: platform
  environment: production
  cost-center: engineering
```
Tools: Kubecost, OpenCost, CloudHealth
### Set Up Alerts

Alert on cost anomalies:

- Unexpected cluster scale-up
- High data transfer costs
- Rapid storage growth
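As a sketch, a Prometheus rule for the first anomaly could look like this (assumes the prometheus-operator CRDs and kube-state-metrics are installed; the 30% threshold is illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-anomalies
spec:
  groups:
    - name: cost
      rules:
        - alert: UnexpectedClusterScaleUp
          # Fires when the node count is 30% above where it was an hour ago
          expr: count(kube_node_info) > 1.3 * count(kube_node_info offset 1h)
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Node count grew more than 30% in the last hour"
```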
## Case Study

For an enterprise client, we reduced Kubernetes costs by 58%:

**Actions:**
- Right-sized 200+ deployments using VPA recommendations (-22%)
- Implemented spot instances for dev/staging (-30%)
- Enabled cluster autoscaler with proper configuration (-15%)
- Cleaned up 2TB of orphaned storage (-8%)
- Moved logs to cheaper storage tier (-5%)
- Optimized cross-AZ traffic patterns (-10%)
**Tools Used:**
- Kubecost for visibility
- Goldilocks for recommendations
- Cluster Autoscaler
- VPA
- Cost allocation tags
## Quick Wins Checklist
- [ ] Delete unused namespaces and resources
- [ ] Set resource requests and limits
- [ ] Enable Cluster Autoscaler
- [ ] Clean up old persistent volumes
- [ ] Review storage classes
- [ ] Implement HPA for variable workloads
- [ ] Add cost allocation labels
- [ ] Use spot instances for non-critical workloads
- [ ] Set up cost monitoring dashboard
- [ ] Schedule autoscaling for dev/test environments
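The last item can be as simple as a CronJob that scales dev deployments to zero outside working hours. This sketch assumes a `dev` namespace and a hypothetical `scaler` ServiceAccount that has been granted RBAC permission to scale deployments:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dev-scale-down
  namespace: dev
spec:
  schedule: "0 19 * * 1-5"   # 19:00 on weekdays
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler   # Needs permission to patch deployments/scale
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command: ["kubectl", "scale", "deployment", "--all", "--replicas=0", "-n", "dev"]
```

A mirror CronJob with a morning schedule scales the environment back up.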
## Conclusion

Kubernetes cost optimization is an ongoing process. Start with visibility (what is costing what?), then right-size workloads, and finally optimize the cluster configuration. A 30-50% reduction is achievable for most organizations without impacting reliability.