Kubernetes makes scaling easy, but uncontrolled growth can lead to surprising cloud bills. Here are proven strategies to optimize Kubernetes costs.
## The Cost Problem

Common Kubernetes cost issues:

- **Over-provisioning**: Pods request more resources than they need
- **Cluster autoscaler lag**: Nodes running at low utilization
- **Lack of visibility**: No clear cost attribution by team or app
- **Inefficient storage**: Unused persistent volumes
- **Data transfer costs**: Cross-AZ or cross-region traffic
## Right-Sizing Workloads

### Set Appropriate Resource Requests

Don't guess; use metrics:

```yaml
resources:
  requests:
    cpu: 100m       # Based on p50 usage
    memory: 256Mi   # Based on p95 usage
  limits:
    cpu: 500m       # Allow burst for spikes
    memory: 512Mi   # OOM prevention
```
Tools: VPA (Vertical Pod Autoscaler), Goldilocks, KRR
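As a sketch, a VPA in recommendation-only mode looks like this (assumes the VPA components are installed in the cluster; `my-app` is a placeholder name):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # Recommend only; never evict pods to apply changes
```

With `updateMode: "Off"`, recommendations appear under `kubectl describe vpa my-app` and can be applied to resource requests manually.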
### Use Horizontal Pod Autoscaling

Scale pods based on actual demand:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
## Cluster-Level Optimization

### 1. Node Autoscaling

Configure the Cluster Autoscaler for right-sized nodes:

- Use multiple node groups with different instance types
- Enable scale-down to remove underutilized nodes
- Set an appropriate scale-down delay (10-15 minutes)
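The scale-down behavior above maps to flags on the cluster-autoscaler container. The values here are an illustrative sketch, not a recommendation for every cluster:

```yaml
# Excerpt from a cluster-autoscaler Deployment spec
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --balance-similar-node-groups
      - --scale-down-enabled=true
      - --scale-down-delay-after-add=10m        # Wait after a node is added
      - --scale-down-unneeded-time=10m          # Node must stay underutilized this long
      - --scale-down-utilization-threshold=0.5  # Below 50% usage counts as underutilized
```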
### 2. Spot/Preemptible Instances

Run non-critical workloads on spot instances (typically 60-80% cheaper than on-demand):

```yaml
# Spot labels and taints vary by provider; this sketch assumes an EKS managed
# node group, which labels spot nodes eks.amazonaws.com/capacityType=SPOT.
nodeSelector:
  eks.amazonaws.com/capacityType: SPOT
tolerations:
  - key: "node.kubernetes.io/spot"   # Match whatever taint your spot nodes carry
    operator: "Exists"
```

Best for: batch jobs, development environments, and stateless apps with multiple replicas.
### 3. Bin Packing

Improve node utilization with better pod placement:

- Use pod priority classes
- Configure the scheduler to pack pods tightly
- Avoid anti-affinity rules unless required
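By default the scheduler spreads pods onto the emptiest nodes; a scheduler profile can invert that scoring to favor bin packing. A hedged sketch (requires access to the kube-scheduler configuration, which managed control planes may not expose):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    pluginConfig:
      - name: NodeResourcesFit
        args:
          scoringStrategy:
            type: MostAllocated   # Prefer nodes that are already fuller
            resources:
              - name: cpu
                weight: 1
              - name: memory
                weight: 1
```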
## Storage Optimization

### Clean Up Unused Volumes

List orphaned PVs:

```shell
kubectl get pv | grep Released
```
### Use Storage Classes Wisely

- **General purpose**: For most workloads (gp3 on AWS)
- **High IOPS**: Only for databases that need it (io2)
- **Cold storage**: For backups (S3/GCS)
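For example, a default gp3 StorageClass on AWS might look like this (assumes the EBS CSI driver is installed):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
allowVolumeExpansion: true               # Grow volumes later instead of over-provisioning upfront
volumeBindingMode: WaitForFirstConsumer  # Create the volume in the consuming pod's AZ
```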
### Snapshot Policies

Implement retention policies; don't keep snapshots forever.
## Network Cost Reduction

### Minimize Cross-AZ Traffic

- Use topology-aware routing
- Colocate related services in the same AZ when possible
- Cache frequently accessed data
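Topology-aware routing can be enabled per Service with an annotation (Kubernetes 1.27+; older versions use the `service.kubernetes.io/topology-aware-hints` annotation instead). `my-app` is a placeholder:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
  annotations:
    service.kubernetes.io/topology-mode: Auto  # Prefer endpoints in the caller's zone
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```

Note that traffic falls back to cross-zone endpoints when a zone has too few of them, so this is a hint, not a guarantee.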
### Use Service Mesh Efficiently

Service meshes add overhead; make sure the benefits justify the cost:

- Sidecar resource consumption
- Control-plane infrastructure
- Additional network hops
## Monitoring & Attribution

### Implement Cost Allocation

Label resources for cost attribution:

```yaml
labels:
  team: platform
  environment: production
  cost-center: engineering
```
Tools: Kubecost, OpenCost, CloudHealth
### Set Up Alerts

Alert on cost anomalies:

- Unexpected cluster scale-up
- High data transfer costs
- Rapid storage growth
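As a sketch, a Prometheus rule for the first anomaly could look like this (assumes the prometheus-operator CRDs and kube-state-metrics are installed; the 30% threshold is illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cost-anomalies
spec:
  groups:
    - name: cost
      rules:
        - alert: UnexpectedClusterScaleUp
          # Fires when the node count is 30% above where it was an hour ago
          expr: count(kube_node_info) > 1.3 * count(kube_node_info offset 1h)
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Node count grew more than 30% in the last hour"
```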
## Case Study

For an enterprise client, we reduced Kubernetes costs by 58%:

**Actions:**
- Right-sized 200+ deployments using VPA recommendations (-22%)
- Implemented spot instances for dev/staging (-30%)
- Enabled cluster autoscaler with proper configuration (-15%)
- Cleaned up 2TB of orphaned storage (-8%)
- Moved logs to cheaper storage tier (-5%)
- Optimized cross-AZ traffic patterns (-10%)
**Tools Used:**
- Kubecost for visibility
- Goldilocks for recommendations
- Cluster Autoscaler
- VPA
- Cost allocation tags
## Quick Wins Checklist
- [ ] Delete unused namespaces and resources
- [ ] Set resource requests and limits
- [ ] Enable Cluster Autoscaler
- [ ] Clean up old persistent volumes
- [ ] Review storage classes
- [ ] Implement HPA for variable workloads
- [ ] Add cost allocation labels
- [ ] Use spot instances for non-critical workloads
- [ ] Set up cost monitoring dashboard
- [ ] Schedule autoscaling for dev/test environments
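The last item can be as simple as a CronJob that scales dev deployments to zero outside working hours. This sketch assumes a `dev` namespace and a hypothetical `scaler` ServiceAccount that has been granted RBAC permission to scale deployments:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dev-scale-down
  namespace: dev
spec:
  schedule: "0 19 * * 1-5"   # 19:00 on weekdays
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler   # Needs permission to patch deployments/scale
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command: ["kubectl", "scale", "deployment", "--all", "--replicas=0", "-n", "dev"]
```

A mirror CronJob with a morning schedule scales the environment back up.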
## Conclusion

Kubernetes cost optimization is an ongoing process. Start with visibility (what is costing what?), then right-size workloads, and finally optimize the cluster configuration. A 30-50% reduction is achievable for most organizations without impacting reliability.