What Nobody Tells You When You Adopt Kubernetes
Every engineering team that adopts Kubernetes gets the same pitch: it scales, it is cloud-agnostic, it handles failover automatically, and it is what all the mature companies use. That pitch is not wrong. Kubernetes does all of those things.
What the pitch leaves out is the bill that comes with it.
The K8s Tax is not a single line item. It is eight separate cost mechanisms that none of your monitoring tools label "K8s Tax." They show up as EC2 charges, data transfer fees, EBS storage costs, CloudWatch metrics costs, and SRE time that never appears on a cloud invoice. Together they routinely add 35 to 60 percent to what a cluster should cost if it were running efficiently.
This guide breaks down each component with the specific math, shows you how to calculate your actual K8s Tax today, and gives you the exact changes that eliminate the largest portions of it. Not generic advice. Specific configurations, specific commands, and specific numbers.
The 8 K8s Tax Components Nobody Tracks Separately
Component 1: The Control Plane Fee You Pay for Doing Nothing
If you run Kubernetes on EKS, you pay $0.10 per hour per cluster for the control plane. That is $72 per month before a single pod runs.
Most engineering teams run more than one cluster: development, staging, and production, often with production split across two AWS regions. Five clusters is not unusual for a team of 15 engineers.
Five EKS clusters: $360 per month. $4,320 per year. Just for the control plane. Before nodes. Before workloads. Before data transfer.
Compare this to AKS (Azure), which charges nothing for the control plane on its Free tier. GKE charges the same $0.10 per hour management fee as EKS for both Standard and Autopilot clusters, but Autopilot bills for pod resource requests rather than node-hours. For bursty workloads with significant idle time, Autopilot eliminates the idle-node cost even though the management fee remains.
Teams running multiple EKS clusters who have not re-evaluated this decision recently are paying thousands per year for a tax that is lower or absent on competing platforms.
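If you are not sure how many control planes you are paying for, a quick audit sketch (assumes the AWS CLI is configured with read access in each region):

# List every EKS cluster in every enabled region — each one is $72/month before workloads
for region in $(aws ec2 describe-regions --query 'Regions[].RegionName' --output text); do
  echo "== $region"
  aws eks list-clusters --region "$region" --query 'clusters' --output text
done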
Component 2: The System Pod Tax That Shrinks Every Small Cluster
Every Kubernetes cluster runs system pods that consume real compute resources before your applications get a single CPU cycle. On EKS these include aws-node (the VPC CNI), kube-proxy, CoreDNS, metrics-server, and any cluster add-ons you have installed, such as the EBS CSI driver or the Cluster Autoscaler itself.
On a well-populated 50-node cluster, system overhead is approximately 3 to 5 percent of total cluster resources. Manageable. On a 3-node development cluster of m5.xlarge instances (4 vCPU, 16 GB each), system pods consume roughly 1.5 vCPU and 3 GB of RAM. That is 12.5 percent of your total cluster compute before your developers deploy anything.
The math that makes small clusters expensive: you are paying for 12 vCPU across those 3 nodes but effectively have 10.5 vCPU available for workloads. The system overhead per dollar of compute you actually use is 14 percent on a 3-node cluster. The same overhead on a 30-node cluster is under 2 percent.
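You can measure this overhead directly on your own cluster. The first command assumes metrics-server is installed; the second works on any cluster:

# Live CPU and memory consumption of the system pods you are paying for
kubectl top pods -n kube-system

# Requests already reserved on each node, including system DaemonSets
kubectl describe nodes | grep -A 8 "Allocated resources:"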
This is why small "cheap" clusters are disproportionately expensive on a per-workload-CPU basis, and why consolidating development environments into fewer, slightly larger clusters almost always saves money without reducing capability.
Component 3: Cross-AZ Data Transfer, the 10 Percent Invisible Tax
AWS charges $0.01 per GB in each direction for data transferred between availability zones, so a full hop effectively costs $0.02 per GB. This sounds small. In a Kubernetes cluster it compounds constantly, because kube-proxy spreads traffic across all pod replicas by default, regardless of which availability zone those replicas live in.
Every service-to-service call in your cluster potentially crosses an AZ boundary. In a microservices architecture with 20 services talking to each other, the cumulative cross-AZ traffic adds up faster than intuition suggests.
For a data-intensive application generating 500 GB per month of cross-AZ inter-service traffic: about $10 per month. Negligible.
For a microservices platform pushing 5 TB per month of inter-service traffic (not unusual for data pipelines, ML inference chains, or API platforms with heavy middleware): roughly $100 per month in data transfer that shows up on your bill as generic EC2-to-EC2 traffic with no Kubernetes label attached.
The fix, Topology Aware Routing (beta since Kubernetes 1.23, originally shipped as Topology Aware Hints): the EndpointSlice controller attaches zone hints that kube-proxy uses to keep traffic inside the originating availability zone whenever it safely can. Enable it with a single annotation on your Service:
apiVersion: v1
kind: Service
metadata:
  name: my-service
  annotations:
    # On clusters older than 1.27, use service.kubernetes.io/topology-aware-hints: "auto"
    service.kubernetes.io/topology-mode: "auto"
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
Teams that enable this across their services report 60 to 80 percent reductions in cross-AZ traffic charges. One annotation, zero application changes. The only caveat: the hints are applied only when each zone has enough endpoints to absorb its share of traffic, so very small Deployments may see no change.
Component 4: NAT Gateway Fees Hiding in Plain Sight
Most EKS clusters follow the security best practice of running worker nodes in private subnets. Any outbound internet traffic from those nodes routes through NAT gateways. AWS NAT gateway pricing: $0.045 per hour plus $0.045 per GB of data processed.
In 2026, applications call external APIs constantly: LLM inference APIs, payment processors, third-party SaaS services, telemetry endpoints. Every container image pull from a public registry routes through NAT. Every call to the OpenAI API from a pod in a private subnet routes through NAT.
A 20-node cluster with active workloads commonly generates 500 GB to 2 TB of NAT-processed traffic per month. At $0.045 per GB: $22.50 to $90 per month in data processing fees plus the $32.40 per month hourly NAT fee per gateway (most clusters have one per AZ, so 3 x $32.40 = $97.20 per month just in hourly fees).
Total NAT cost for this cluster: $120 to $190 per month that shows up as "VPC NatGateway" on your AWS bill with no Kubernetes connection visible.
The partial fix: VPC endpoints for AWS services eliminate NAT fees for traffic to S3, DynamoDB, ECR, CloudWatch, STS, and SSM. The ECR VPC endpoints alone eliminate NAT fees for every container image pull across all nodes (pair them with an S3 gateway endpoint, since ECR serves image layers from S3). For a 50-node cluster pulling images regularly, this saves $40 to $120 per month.
# Create a VPC endpoint for ECR (eliminates NAT fees for image pulls)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-XXXXXXXX \
  --service-name com.amazonaws.us-east-1.ecr.dkr \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-XXXXXXXX subnet-YYYYYYYY \
  --security-group-ids sg-XXXXXXXX
# Repeat for com.amazonaws.us-east-1.ecr.api, and add a Gateway endpoint for
# com.amazonaws.us-east-1.s3 — ECR image layers are downloaded from S3.
Component 5: The LoadBalancer Service Multiplication Problem
Each Kubernetes Service with type: LoadBalancer provisions a cloud load balancer. On AWS this is a Classic Load Balancer with the legacy in-tree controller, or an NLB with the AWS Load Balancer Controller. On GCP, an external L4 load balancer. On Azure, a Standard Load Balancer.
AWS NLB pricing: $0.0225 per hour plus $0.006 per Load Balancer Capacity Unit. At minimal traffic, a single NLB costs $16 to $22 per month.
A team with 12 microservices each exposed as a LoadBalancer service: 12 x $20 = $240 per month in load balancers for a cluster that may only cost $300 per month in compute.
The fix that most teams have not implemented: replace individual LoadBalancer services with a single Ingress controller. One NLB or ALB at the cluster edge, routing to all services via path-based or host-based rules. The AWS Load Balancer Controller or NGINX Ingress Controller handles this cleanly.
From 12 load balancers ($240/month) to 1 Ingress controller ($22/month): $218 per month saved from one architectural decision. The engineering effort to make this change is one sprint. The savings persist forever.
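A minimal sketch of the consolidated shape, assuming the NGINX Ingress Controller is installed; host, path, and service names are placeholders:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: edge
  namespace: default
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-auth-api   # now a ClusterIP Service, no longer LoadBalancer
                port:
                  number: 80
          - path: /billing
            pathType: Prefix
            backend:
              service:
                name: billing-api
                port:
                  number: 80

Each backend Service switches from type: LoadBalancer to ClusterIP; only the Ingress controller's own Service keeps a cloud load balancer.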
Component 6: PersistentVolumeClaim Orphan Accumulation
Kubernetes StatefulSets do not automatically delete PersistentVolumeClaims when pods are deleted. This is intentional behavior designed to prevent accidental data loss. The side effect is that PVCs accumulate silently over the lifetime of an active cluster.
The lifecycle that creates orphans: a developer spins up a StatefulSet for testing a new database configuration. The test runs. The StatefulSet is deleted. The pods disappear. The PVCs remain, bound to EBS volumes that continue billing at $0.10 per GB per month.
For a team actively developing with 50 test StatefulSets created and destroyed over 18 months, assuming each had a 20 GB PVC: 50 PVCs x 20 GB x $0.10 = $100 per month in storage that nobody is using and nobody is looking at.
Find yours right now:
# Find all PVCs not mounted by any running pod (the claims that bill silently)
kubectl get pods -A -o json | jq -r '.items[] | .metadata.namespace as $ns |
  .spec.volumes[]? | select(.persistentVolumeClaim) |
  "\($ns)/\(.persistentVolumeClaim.claimName) "' | sort -u > /tmp/pvcs-in-use.txt
kubectl get pvc -A -o json | jq -r '.items[] |
  "\(.metadata.namespace)/\(.metadata.name) - \(.spec.resources.requests.storage)"' |
  grep -vFf /tmp/pvcs-in-use.txt
Any claim this prints is bound to an EBS volume that is still billing but that no pod is mounting. Confirm the data is disposable, then delete the PVC; with the default Delete reclaim policy the underlying volume is removed with it. Separately, check kubectl get pv for volumes stuck in the Released or Available phase — those have already lost their claim and are almost always safe to remove. Audit both monthly.
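To stop new orphans from accumulating in the first place, newer clusters can let the StatefulSet clean up its own claims. A minimal sketch, assuming Kubernetes 1.27 or later (where the StatefulSetAutoDeletePVC feature is on by default); names and the image are placeholders:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: config-test-db
spec:
  serviceName: config-test-db
  replicas: 1
  # Delete the PVCs when this StatefulSet is deleted, but retain them
  # if replicas are merely scaled down. Ignored on clusters older than 1.27.
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete
    whenScaled: Retain
  selector:
    matchLabels:
      app: config-test-db
  template:
    metadata:
      labels:
        app: config-test-db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: "test-only-placeholder"
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi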
Component 7: Container Image Storage Creep in ECR
AWS ECR charges $0.10 per GB per month for stored images. Machine learning container images with CUDA libraries commonly reach 10 to 25 GB. Standard application images with comprehensive dependency stacks reach 2 to 5 GB.
Teams that do not set ECR lifecycle policies accumulate every image version ever built. If you build and push 10 times per day (CI/CD for active development), and your image is 3 GB, and you retain 30 days of builds: 300 image versions x 3 GB = 900 GB x $0.10 = $90 per month in ECR storage for one service.
For a team with 15 services and no lifecycle policies: $1,350 per month in ECR storage that serves no purpose. Every image older than the last 5 tagged versions is essentially dead weight.
The lifecycle policy that fixes this immediately:
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Remove untagged images after 1 day",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 1
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Keep only last 10 tagged images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["v"],
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": { "type": "expire" }
    }
  ]
}
Apply this to every ECR repository. For most teams, the first application of this policy deletes hundreds of gigabytes immediately.
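A sketch of rolling the policy out account-wide, assuming the JSON above is saved as lifecycle-policy.json and the AWS CLI has ECR write access:

# Apply the lifecycle policy to every repository in the account
for repo in $(aws ecr describe-repositories --query 'repositories[].repositoryName' --output text); do
  aws ecr put-lifecycle-policy \
    --repository-name "$repo" \
    --lifecycle-policy-text file://lifecycle-policy.json
done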
Component 8: CloudWatch Metrics Cost Explosion
Kubernetes generates an enormous volume of metrics. Every pod, every container, every node produces CPU, memory, network, and disk metrics at regular intervals. When teams ship these to CloudWatch as custom metrics for dashboarding, the cost compounds quickly.
AWS CloudWatch custom metric pricing: $0.30 per metric per month beyond the free tier of 10 metrics.
A 30-node EKS cluster using the CloudWatch Container Insights agent generates approximately 5,000 to 8,000 custom metrics (metrics per pod, per container, per node). At $0.30 per metric: $1,500 to $2,400 per month in CloudWatch metrics costs for one cluster.
This number surprises almost every team when they see it isolated on their AWS Cost Explorer. It hides under "CloudWatch" rather than "EKS" or "Kubernetes," so the connection is not obvious.
The fix: run Prometheus inside the cluster for the full metrics volume (free, stores in cluster), and only export the subset of metrics you actually alert on to CloudWatch. For most teams, that is 50 to 100 metrics total rather than 5,000 to 8,000. Cost reduction: 95 percent.
The Prometheus Operator with a kube-prometheus-stack Helm chart deploys the full monitoring stack in 20 minutes and is the standard approach for teams serious about Kubernetes observability without the CloudWatch bill.
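A minimal install sketch using the upstream community chart; the release name, namespace, and retention value are illustrative:

# Deploy Prometheus, Alertmanager, Grafana, and the standard exporters in one chart
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --set prometheus.prometheusSpec.retention=15d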
How to Calculate Your Actual K8s Tax Today
Most teams know their total cloud bill. Almost no team knows how much of that bill is K8s Tax versus necessary infrastructure cost. Here is how to calculate it.
Step 1: Pull your AWS Cost Explorer data for the last 30 days, filtered by these services:
- EC2: instances in your cluster's VPC (filter by tag if you tag nodes)
- EBS: volumes attached to your cluster nodes
- Data Transfer: filter by your cluster VPC
- ELB: load balancers with your cluster's tag
- ECR: storage for your application repositories
- CloudWatch: your account total (most of this is often Kubernetes metrics)
- VPC: NatGateway charges
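The same 30-day breakdown can be pulled through the Cost Explorer API; a hedged sketch with placeholder dates (group further by tag once your nodes are tagged):

# Monthly unblended cost grouped by service — the raw inputs for the table below
aws ce get-cost-and-usage \
  --time-period Start=2025-01-01,End=2025-01-31 \
  --granularity MONTHLY \
  --metrics "UnblendedCost" \
  --group-by Type=DIMENSION,Key=SERVICE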
Step 2: For each category, identify what percentage is pure K8s Tax:
| Cost Category | Typical K8s Tax % | What to Fix |
|---|---|---|
| EC2 compute | 20-40% (system pods + buffer) | Right-size requests, tune autoscaler |
| EBS storage | 30-60% (orphaned PVCs + old snapshots) | PVC lifecycle policies |
| Data transfer | 40-70% (cross-AZ service calls) | Topology-aware routing |
| NAT Gateway | 20-50% (routable via VPC endpoints) | VPC endpoints for AWS services |
| Load Balancers | 60-85% (replace with Ingress) | Single Ingress controller |
| ECR storage | 50-90% (image accumulation) | ECR lifecycle policies |
| CloudWatch | 70-95% (metrics volume) | Prometheus + selective export |
Step 3: Add up the K8s Tax across categories.
For a team spending $15,000 per month on cloud infrastructure running three EKS clusters, a realistic K8s Tax breakdown looks like this:
- EC2 waste (30% of $8,000 compute): $2,400
- Cross-AZ traffic (50% of $800 data transfer): $400
- NAT Gateway (30% of $600): $180
- Orphaned EBS (60% of $400 storage): $240
- Excess load balancers (75% of $300): $225
- ECR image accumulation (70% of $200): $140
- CloudWatch metrics (80% of $500): $400
Total K8s Tax: approximately $3,985 per month out of $15,000 total. 26.6 percent of the entire cloud bill.
All of it fixable with configuration changes and lifecycle policies, no architectural overhaul required.
Fixing the Cluster Autoscaler Settings Most Teams Get Wrong
The Kubernetes Cluster Autoscaler has default settings that are tuned for safety over cost. Those defaults leave money on the table for most teams.
The two settings that matter most:
--scale-down-utilization-threshold (default: 0.50)
A node will not be removed if its utilization is above 50%. In practice, this means nodes stay alive even when mostly idle. Raising this to 0.65 or 0.70 makes the autoscaler more aggressive about consolidation.
--scale-down-delay-after-add (default: 10 minutes)
After a new node is added, the autoscaler waits 10 minutes before considering any scale-down. For bursty workloads that spike briefly, this means nodes added for a 2-minute demand peak sit running for at least 12 minutes. Reducing this to 2 to 3 minutes for development clusters, or 5 minutes for production, captures real savings.
# In your cluster-autoscaler deployment args
- --scale-down-utilization-threshold=0.65
- --scale-down-delay-after-add=3m
- --scale-down-unneeded-time=5m
- --skip-nodes-with-local-storage=false
These four settings together on an active cluster commonly reduce the "always-on" node count by 15 to 25 percent without impacting availability.
The Honest Assessment: When to Fix K8s vs When to Leave It
Kubernetes is not always the right tool. Before investing in optimizing a K8s deployment, it is worth being honest about whether K8s is the right platform for your workloads.
Fix your K8s deployment when:
- You have 10 or more services with independent scaling requirements
- You have stateful workloads (databases, ML model servers) that benefit from K8s scheduling
- Your team has genuine K8s expertise or is committed to building it
- Your workloads have complex resource requirements (GPU scheduling, custom networking)
- You are already past the early optimization stage and just need to tune
Move workloads off K8s when:
- You have stateless HTTP services that scale based on request volume (GCP Cloud Run or AWS App Runner handles this with zero infrastructure overhead and scale-to-zero)
- You have batch jobs with no concurrency requirements (AWS Lambda or Step Functions is significantly cheaper)
- You have fewer than 5 services with similar scaling patterns (ECS Fargate with no control plane complexity)
- Your team spends more than 25 percent of engineering time on cluster operations rather than product
The cost difference between running a stateless API on GCP Cloud Run versus EKS is significant. Cloud Run bills per 100ms of compute at $0.00002400 per vCPU-second and $0.00000250 per GB-second, with scale to zero when there are no requests. An API handling 1 million requests per month at 200ms average response time on 0.5 vCPU costs approximately $2.40 per month on Cloud Run. The same workload on a dedicated EKS node costs $72+ per month in control plane fees alone before the compute.
For workloads that fit this profile, the K8s Tax is not a tax to optimize. It is a signal to migrate.
The Namespace Cost Attribution Fix That Changes Team Behavior
None of the technical optimizations in this guide stick long-term without cost visibility at the team and workload level. Engineers do not optimize what they cannot see attributed to their work.
The minimum viable Kubernetes cost attribution setup:
Apply three labels to every Deployment, StatefulSet, DaemonSet, and CronJob in your cluster:
metadata:
  labels:
    team: platform-engineering
    service: user-auth-api
    environment: production
A cost-allocation tool such as Kubecost uses these labels to break spend down by namespace and label. Without them, you see cluster-level spend. With them, you see that the data-processing team is responsible for 43 percent of cluster spend, the product team is responsible for 31 percent, and platform infrastructure is 26 percent. That visibility changes conversations at budget reviews from "our cloud bill went up" to "the data-processing team's new ML pipeline added $4,800 per month."
When engineers see their team's spend attributed directly, right-sizing requests moves from a FinOps initiative to self-directed engineering work. The social accountability is more powerful than any tooling.
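A quick audit sketch for finding the attribution gaps, assuming the three labels above are your convention:

# Deployments missing the team label — the workloads nobody's budget owns
kubectl get deployments -A -o json | jq -r '.items[] |
  select(.metadata.labels.team == null) |
  "\(.metadata.namespace)/\(.metadata.name)"'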
For teams wanting structured support implementing cost attribution and FinOps governance across Kubernetes workloads, FinOps consulting provides the frameworks to make this systematic from the start.
Connecting K8s Cost Optimization to the Broader Infrastructure Picture
The K8s Tax is one layer of a multi-layered cost challenge for modern engineering teams. The infrastructure layer (Kubernetes, node sizing, autoscaling) intersects with the application layer (container image sizes, connection pooling, observability volume) and the cloud commitment layer (Savings Plans, Reserved Instances, Committed Use Discounts).
Optimizing Kubernetes costs without also reviewing your cloud commitments is like fixing the fuel efficiency of your car while leaving the engine running overnight. The AWS cost optimization playbook covers the commitment layer, including how EKS Savings Plans interact with Karpenter-managed node selection.
For teams running AI workloads on top of Kubernetes, the specific GPU waste patterns are different from general K8s waste and are covered in depth in the Kubernetes cost optimization for AI workloads guide.
The real-time cloud cost monitoring tools guide covers how to instrument the visibility layer that makes all of these optimizations measurable and maintainable over time.
For cloud operations support in implementing these changes in production clusters without disrupting running workloads, the operational patterns here translate directly to real infrastructure changes that your team can execute incrementally.
Frequently Asked Questions About the Hidden K8s Tax
What is the K8s Tax and how much does it typically cost?
The K8s Tax is the collection of cloud costs that Kubernetes generates beyond what your workloads actually require: control plane fees, system pod overhead, cross-AZ data transfer from round-robin load balancing, NAT gateway fees for pods in private subnets, orphaned PersistentVolumeClaims, accumulated container images in ECR, and CloudWatch metrics costs from shipping all Kubernetes metrics to paid monitoring. For a team spending $15,000 to $20,000 per month on cloud infrastructure with three or more EKS clusters, the K8s Tax commonly represents 25 to 40 percent of total spend. That means $3,750 to $8,000 per month in recoverable waste.
How do I find orphaned PersistentVolumeClaims in my cluster?
Start with kubectl get pv and look for volumes in the Released phase (their claim was deleted but the volume remains) or the Available phase (never bound); these are almost always safe to delete after confirming the data is not needed. Then find PVCs that are still Bound but not mounted by any running pod by cross-referencing kubectl get pods -A -o json with your PVC list, as shown in the audit commands earlier in this guide. Running this audit monthly and deleting orphaned PVCs is one of the fastest cost wins available in a mature cluster.
Why is EKS more expensive than AKS or GKE for the same workloads?
EKS charges $0.10/hour ($72/month) per cluster for the managed control plane. AKS charges nothing for the control plane on its Free tier, and GKE charges the same management fee as EKS, though GKE Autopilot bills for pod resource requests rather than node-hours. For teams running multiple clusters, the control plane fee difference between EKS and AKS is $72/month per cluster, roughly $864 per year per cluster and $2,592 per cluster over three years. For teams where multi-cloud or AWS lock-in is not a concern, AKS and GKE Autopilot have structural cost advantages for multi-cluster deployments.
What is the fastest way to reduce the K8s Tax without changing application code?
Five configuration changes that require no application code changes: (1) Enable topology-aware routing on your Services to reduce cross-AZ data transfer by 60 to 80 percent. (2) Create ECR lifecycle policies to stop accumulating old images. (3) Run kubectl get pvc -A and delete PVCs in Released status. (4) Create a single Ingress controller and replace LoadBalancer Services with ClusterIP plus Ingress rules. (5) Create VPC endpoints for ECR, S3, and CloudWatch to eliminate NAT Gateway fees for AWS service traffic. Most teams can implement all five in one sprint and see savings on their next billing cycle.
Is Kubernetes worth it for a startup with fewer than 10 services?
For most startups with under 10 services, the operational overhead and K8s Tax of Kubernetes does not justify the benefits at that scale. ECS Fargate on AWS (no control plane, pay per task-second, no node management), GCP Cloud Run (scale to zero, per 100ms billing, fully managed), or Azure Container Apps (built on KEDA, managed Kubernetes without the management overhead) are all significantly simpler and cheaper for stateless HTTP services at sub-enterprise scale. Kubernetes becomes clearly justified when you have complex scheduling requirements, stateful workloads, or multi-tenant platform needs that exceed what managed container services can handle.
How does the Cluster Autoscaler default configuration waste money?
The Cluster Autoscaler's default --scale-down-utilization-threshold of 0.50 means a node is only considered for removal once its own utilization drops below 50%, so lightly loaded nodes that hover just above that line never get consolidated. The default --scale-down-delay-after-add of 10 minutes means newly added nodes sit running for 10 minutes after the demand spike that triggered them has passed. For clusters with bursty workloads, these defaults result in 2 to 4 extra nodes running perpetually as buffer. On m5.2xlarge instances at $0.384/hour, 3 extra buffer nodes cost roughly $829/month. Tuning --scale-down-utilization-threshold to 0.65 and --scale-down-delay-after-add to 3 minutes eliminates most of this buffer while maintaining availability for genuine demand spikes.
What Kubernetes metrics should I stop sending to CloudWatch?
Stop sending per-pod, per-container, and per-namespace metrics from Container Insights to CloudWatch. These are the metrics that generate 5,000 to 8,000 custom metrics on a mid-sized cluster at $0.30/metric/month. Instead, run Prometheus inside the cluster (free, retains 15 days of history locally), use Grafana connected to Prometheus for dashboarding (free for self-hosted), and only send to CloudWatch the metrics you actually alert on: node-level CPU and memory thresholds, cluster-level pod scheduling failures, and any custom business metrics. The CloudWatch bill for Kubernetes metrics drops from hundreds or thousands per month to under $30/month for the small set of alerting-critical metrics.
Reclaiming Your Cloud Budget, One Layer at a Time
The K8s Tax is not a Kubernetes problem. It is an instrumentation and awareness problem. Every component of it is visible once you know what to look for, and every component has a documented fix that the Kubernetes community has figured out and shared.
The teams paying the full K8s Tax are not doing anything wrong with their applications. They adopted Kubernetes for good reasons. They just never had someone sit down and show them the specific line items that accumulate silently in the background.
Now you have that list. Eight cost components, specific calculations for each, and specific fixes with YAML and CLI commands. The question is which one you start with this sprint.
For teams that want to compress this discovery and remediation process with structured guidance, cloud cost optimization and FinOps consulting from LeanOps provides the audit, the prioritization, and the implementation support to turn these individual fixes into a sustainable cost discipline.
For the full picture on Kubernetes cost optimization including GPU workloads and AI-specific patterns, see the Kubernetes cost optimization for AI workloads guide.
External resources: Kubernetes topology-aware routing documentation, CNCF FinOps for Kubernetes white paper, and AWS VPC endpoints documentation.
