Most platform teams pick their Kubernetes tenancy model for the wrong reason: cost. They look at the price of a second control plane, choose namespace isolation by default, layer on RBAC and ResourceQuotas, and convince themselves that is equivalent to real isolation. Then, six months later, a noisy tenant takes down shared ingress, an engineer discovers they can enumerate secrets across namespaces because no one scoped the RBAC properly, or a compliance audit surfaces that two customer environments share a data plane. The bill from getting it wrong is reliably larger than the bill for getting it right up front.
The real question is not "can we afford separate clusters?" It is "what do these tenants need to be isolated from?" That question has a technical answer — and it drives the architecture.
| Metric | Share | Change |
|---|---|---|
| Use namespaces to separate apps | 88% | up 16pp YoY |
| Use cluster-level isolation | 65% | up 2pp YoY |
| Running Kubernetes in production | 80% | up from 66% in 2023 |

Source: CNCF Annual Survey, 2024
What "tenant" actually means — and why the definition drives the architecture
"Tenant" is not a single concept. It covers everything from a squad in a startup to an enterprise customer with a compliance audit scope, and the trust model differs by an order of magnitude between those cases. Getting precise about what you are isolating matters before you pick the mechanism.
| Tenant type | Trust level | Primary concern |
|---|---|---|
| Internal engineering squad | High | Accidental resource contention, cost attribution |
| Internal team with sensitive data (finance, PII) | Medium | Data isolation, audit trail |
| External B2B SaaS customer | Low | Security boundary, blast-radius SLA, compliance |
| Third-party workload (partner integration) | Very low | Code you do not own running in your infrastructure |
| Regulated environment (HIPAA, PCI-DSS, SOC 2) | Depends on audit scope | Hard cryptographic and network boundary required by certification body |
These are not equivalent problems. Running three engineering squads in namespaces is a different problem from running 50 enterprise customers in a shared cluster. The mistake is applying the same model to both.
The CNCF 2024 Annual Survey found 88% of organisations using namespaces to separate applications — up 16 percentage points year over year. But "use namespaces" covers everything from a labelling convention to a hardened policy configuration. The number describes adoption, not effectiveness.
Namespace isolation: what you actually get, and what you don't
A Kubernetes namespace is an API partitioning primitive. It scopes names, resource quotas, and RBAC bindings. It does not, by itself, provide any of the following:
Network isolation: pods in namespace A can communicate freely with pods in namespace B unless NetworkPolicies explicitly block them. The default is no NetworkPolicies, which means a full mesh between all pods across all namespaces.
Node isolation: workloads from different namespaces co-schedule on the same physical nodes. A container-level exploit reaches the underlying node regardless of which namespace the pod runs in. CVE-2019-5736 (runc process overwrite) and CVE-2022-0185 (Linux kernel heap overflow enabling privilege escalation) both demonstrated this path — neither was mitigated by Kubernetes namespace boundaries.
Secret isolation: a cluster-admin can read any secret in any namespace by default. A namespace-scoped user with a loosely scoped RoleBinding — particularly one that references a ClusterRole rather than a Role — can often access more than intended.
Control-plane isolation: all namespace-isolated tenants share the same API server, etcd, and scheduler. A workload running tight list-watch loops on large resources can degrade API server performance cluster-wide. A misconfigured admission webhook applied at cluster scope can block admission across all namespaces simultaneously.
To turn a namespace into a real soft-tenancy boundary, you need all five of the controls below. Missing one leaves a visible gap that is, in practice, eventually triggered or exploited.
Control 1: NetworkPolicy — default-deny, explicit allow
The baseline NetworkPolicy for any tenant namespace is a default-deny on both ingress and egress:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: tenant-acme
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```

Apply this first, then add explicit allow rules for every service pair that needs to communicate. Do not start with allow-all and narrow down: that approach drifts and never closes.
Two points that catch teams regularly:
First, NetworkPolicies are enforced by the CNI plugin, not the API server. On EKS with default VPC CNI, NetworkPolicy resources exist in etcd and are returned by kubectl get networkpolicy, but have no effect on traffic unless you have additionally installed a policy-capable CNI such as Cilium, Calico, or the AWS VPC CNI Network Policy Controller (available since EKS 1.25). Verify with actual traffic probes, not resource inspection.
Second, DNS traffic to kube-dns must be explicitly allowed after a default-deny policy, or pods silently fail name resolution:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: tenant-acme
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Control 2: RBAC — namespace-scoped only, no cluster-wide shortcuts
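Every tenant needs a RoleBinding scoped to their namespace, not a ClusterRoleBinding. The difference is significant: a ClusterRoleBinding referencing the edit ClusterRole gives read and write access to Secrets in every namespace in the cluster, not just the intended one. Even the built-in view ClusterRole, which deliberately excludes Secrets, exposes every other tenant's workloads and ConfigMaps when bound cluster-wide.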
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-acme-developer
  namespace: tenant-acme
subjects:
  - kind: Group
    name: acme-developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  # A RoleBinding may reference a ClusterRole; the permissions then apply
  # only inside the binding's own namespace.
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
```

Audit every cluster for ClusterRoleBindings that reference cluster-admin, admin, or edit. In most organically grown clusters, you will find bindings added as a short-term debugging fix and never revoked:

```shell
# List every ClusterRoleBinding that grants one of the broad built-in roles.
kubectl get clusterrolebindings -o json \
  | jq '.items[]
      | select(.roleRef.name == "cluster-admin" or .roleRef.name == "admin" or .roleRef.name == "edit")
      | {name: .metadata.name, role: .roleRef.name, subjects: .subjects}'
```

Control 3: ResourceQuota and LimitRange — enforce at namespace creation
ResourceQuotas prevent one tenant from consuming all cluster capacity. LimitRanges set defaults so every container receives a resource request and limit even if the developer did not declare one — which matters because the cluster autoscaler makes scale-up decisions based on pending requests, not actual CPU or memory usage at runtime.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-acme-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    pods: "50"
    services.loadbalancers: "0"
    persistentvolumeclaims: "10"
```

The services.loadbalancers: "0" line is worth noting: without it, a tenant can provision cloud load balancers directly, bypassing your shared ingress layer and generating uncapped cloud spend.
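The quota above caps the namespace total; the LimitRange described earlier supplies per-container defaults. A minimal sketch of one follows. The object name and every CPU and memory value are illustrative assumptions, not recommendations, and should be sized against the quota tier you actually use.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-acme-defaults     # hypothetical name
  namespace: tenant-acme
spec:
  limits:
    - type: Container
      # Applied when a container declares no request of its own.
      defaultRequest:
        cpu: 100m
        memory: 128Mi
      # Applied when a container declares no limit of its own.
      default:
        cpu: 500m
        memory: 512Mi
      # Containers requesting less than this are rejected at admission.
      min:
        cpu: 50m
        memory: 64Mi
      # Containers asking for more than this are rejected at admission.
      max:
        cpu: "4"
        memory: 8Gi
```

Setting min alongside defaultRequest matters: the default only fills in missing values, while min is what rejects a request that is declared but too small.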
Control 4: Pod Security Standards — enforce Restricted or Baseline
Pod Security Standards, enforced by the built-in Pod Security Admission controller, replaced PodSecurityPolicy, which was removed in Kubernetes 1.25. They are applied per namespace via labels:
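```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-acme
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.30
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

The restricted profile requires containers to run as non-root, prohibits privilege escalation, requires dropping all Linux capabilities (only NET_BIND_SERVICE may be added back), requires a RuntimeDefault or Localhost seccomp profile, and inherits the baseline profile's ban on host network, PID, and IPC namespace access. It does not require a read-only root filesystem; enforce that separately through an admission policy if you need it. Most stateless application workloads are compatible with restricted after small securityContext changes. Pod Security Admission has no per-pod exception mechanism, so legacy images that cannot comply need a namespace enforced at baseline instead.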
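For orientation, here is a sketch of a pod spec intended to pass admission in a namespace labelled as above. The pod name and image reference are placeholders, and the image is assumed to run as a non-root user already.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-example                          # hypothetical name
  namespace: tenant-acme
spec:
  securityContext:
    runAsNonRoot: true                              # kubelet refuses to start a root container
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/acme/app:1.0.0    # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
        readOnlyRootFilesystem: true                # root filesystem mounted read-only
```

Because the warn and audit labels are set as well, a dry-run apply of an existing workload's manifests against the namespace surfaces violations before enforcement blocks a rollout.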
Control 5: Audit logging and event alerting
Audit logs provide the forensic trail. Configure the API server audit policy to record Metadata level for reads and Request level for writes in tenant namespaces. Without it, you cannot distinguish "no one accessed that secret" from "someone accessed it and you have no record."
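A sketch of an audit policy along those lines is below. It assumes a self-managed API server, since managed control planes generally do not let you supply your own policy file. The first rule, which holds Secrets and ConfigMaps at Metadata level so their payloads never reach the log, is an addition of this sketch rather than something the text above prescribes. Rules are evaluated in order and the first match wins.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived
rules:
  # Assumption: never log Secret or ConfigMap bodies, even on writes.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Writes in tenant namespaces: record the request body.
  - level: Request
    verbs: ["create", "update", "patch", "delete", "deletecollection"]
    namespaces: ["tenant-acme"]
  # Reads in tenant namespaces: record who, what, and when.
  - level: Metadata
    verbs: ["get", "list", "watch"]
    namespaces: ["tenant-acme"]
  # Assumption: everything else is kept at Metadata level.
  - level: Metadata
```

The namespaces field takes literal names, so the tenant list has to be generated from the same source of truth that creates the namespaces.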
Cluster isolation: the real blast-radius boundary
Separate clusters provide guarantees that namespace isolation structurally cannot, regardless of how rigorously the namespace controls are applied:
Separate control planes: an API server outage, etcd compaction event, or runaway admission webhook in one cluster cannot affect another. Each cluster upgrades on its own schedule without forcing all tenants to move simultaneously.
Separate network planes: cluster networking is a fully isolated island. Cross-tenant communication requires an explicit, audited path — VPN peering, service mesh federation, or an API gateway. There is no accidental cross-tenant pod-to-pod route.
Independent upgrade schedules: if two tenants have different Kubernetes version requirements — one locked at 1.29 due to a third-party operator, one ready to move to 1.31 for a needed feature — cluster isolation handles this cleanly. Namespace isolation on a shared cluster forces everyone onto the same version simultaneously.
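Separate node pools with different hardware profiles: workloads with different security requirements (PCI-scoped versus standard) must not co-schedule on the same physical nodes. A namespace on its own cannot enforce this; on a shared cluster it takes taints, node selectors, and admission policy working together, and one misconfiguration breaks the boundary. A separate cluster with a separate node group enforces it structurally.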
The cost structure is materially different from shared namespace isolation. Each cluster carries fixed overhead: the managed control plane fee (approximately $70–100/month on EKS or GKE Standard) plus a minimum viable HA data plane of three nodes to tolerate an availability zone failure. Before any application workloads run, a minimal cluster costs roughly $400–600/month in most cloud configurations.
The cost math: density versus overhead
Cast.ai's 2026 State of Kubernetes Optimization report — analysing full-year 2025 workloads — found average Kubernetes CPU utilisation at 8% and memory utilisation at 20%. These numbers reflect that most teams overprovision for headroom, and shared clusters are no exception — but namespace isolation at least allows multiple tenants to share that headroom rather than paying for it redundantly on separate clusters.
Worked example: 10 internal teams, namespace isolation versus cluster isolation
Assumptions: 10 teams, each with 4 vCPU and 8 GiB memory in steady-state requests, running on AWS in us-east-1. Pricing is approximate on-demand and will vary by commitment level.
Shared cluster (namespace isolation):
- Total workload requests: 40 vCPU, 80 GiB memory
- With 50% buffer for burst and scheduler overhead: approximately 60 vCPU, 120 GiB provisioned
- 8 × m5.2xlarge (8 vCPU, 32 GiB, approximately $280/month each): approximately $2,240 in nodes
- 1 × EKS control plane (~$72/month): $72
- Shared cluster total: approximately $2,300/month
10 separate clusters (cluster isolation):
- Each cluster: same 4 vCPU workload, but minimum HA requires 3 nodes across AZs
- 3 × m5.xlarge (4 vCPU, 16 GiB, approximately $140/month each): $420 in nodes plus $72 control plane per cluster
- Cost per isolated cluster: approximately $490/month
- 10 isolated clusters total: approximately $4,900/month
The ratio is approximately 2.1x — real money, but bounded. At 10-team scale that is roughly $2,600/month more for cluster isolation. Whether that premium is justified depends on your risk tolerance and the cost of a security or compliance failure, not just on the absolute figure.
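The arithmetic behind those figures, restated using only the prices assumed in the worked example (monthly USD, before rounding):

```latex
\begin{aligned}
C_{\text{shared}}   &= 8 \times 280 + 72 = 2312 \approx 2300 \\
C_{\text{cluster}}  &= 3 \times 140 + 72 = 492 \approx 490 \\
C_{\text{isolated}} &= 10 \times C_{\text{cluster}} = 4920 \approx 4900 \\
\frac{C_{\text{isolated}}}{C_{\text{shared}}} &= \frac{4920}{2312} \approx 2.1 \\
C_{\text{isolated}} - C_{\text{shared}} &= 4920 - 2312 = 2608 \approx 2600
\end{aligned}
```

The per-cluster floor of three nodes plus a control plane dominates the isolated case: each cluster provisions 12 vCPU to run a 4 vCPU workload.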
Namespace isolation is the right default for teams who trust each other. Cluster isolation is the right default for tenants who do not — or cannot be required to.
Decision framework: match the boundary to the trust model
Namespace isolation
- Internal teams in the same trust domain
- Cost-sensitive workloads where density is a genuine priority
- Policy enforced via OPA/Kyverno — all five controls applied
- Compliance scope permits a shared control plane
- Team has the operational maturity to maintain ongoing compliance
Cluster isolation
- External customers or code you do not own
- Regulated workloads (PCI-DSS, HIPAA, FedRAMP) with scoped audit boundaries
- Tenants with independent Kubernetes version requirements
- Workloads requiring dedicated hardware (GPU, high-memory, bare-metal)
- Zero blast-radius tolerance between tenants
The hybrid model: clusters by trust boundary, namespaces within
Neither extreme is the right answer for most organisations at scale. The most common production pattern sits between the two:
- Production customer cluster: one cluster per customer tier (or per large customer), strict PSS enforcement, dedicated node pools for compliance-scoped workloads, and cluster-per-customer for regulated verticals.
- Internal engineering cluster: all squads in namespaces with the full five-control configuration, managed with GitOps, cost-centre labels on every namespace.
- Non-production cluster: staging and development for all teams in one cluster with the same quota and network structure as production — to catch policy issues early — but with relaxed PSS for legacy tooling.
This gives you namespace density where the trust model allows it, and cluster isolation exactly where it is required.
Virtual clusters as a middle tier
For teams that want cluster-level API semantics without full cluster overhead, virtual clusters, such as the open-source vCluster project from Loft Labs, run a virtual control plane inside a namespace of a host cluster. Tenants receive their own API server, CRD space, and admission webhook stack, while workloads schedule onto a shared underlying node pool.
This resolves several namespace isolation gaps — particularly CRD and admission webhook isolation, which standard namespace controls do not address — while preserving most of the cost density of namespace isolation. The hard limit: virtual clusters do not provide node-level blast-radius isolation. A kernel-level container escape still reaches the host node. For internal teams at medium trust level, virtual clusters are a strong middle option. For external customers under compliance audit scope, they are not a substitute for physically separate clusters.
Real-world pitfalls — each with a fix
These failure modes appear consistently in production environments, not hypothetical audits.
Pitfall 1: The permanent "temporary" cluster-admin binding. A developer needed debug access during a production incident, received a ClusterRoleBinding to cluster-admin as a quick fix, and it was never revoked. It is now eight months old and that person has since left the company.
Fix: use a just-in-time access system — Teleport, HashiCorp Boundary, or a lightweight controller pattern — that issues time-bounded ClusterRoleBindings with a maximum TTL of four hours and revokes them automatically. Run the audit query from Control 2 as a CI check on every commit to the cluster config repository.
Pitfall 2: NetworkPolicies that exist on paper but not in practice. A team applied a default-deny policy on EKS with default VPC CNI and no additional policy controller. The policy objects were present in etcd and returned by kubectl get networkpolicy. Traffic between namespaces was not blocked.
Fix: verify network policies with actual traffic probes, not resource inspection. A simple CI step using kubectl exec to attempt a cross-namespace TCP connection is sufficient. Cilium's connectivity-check and netpol-verify provide more comprehensive coverage. Never trust the existence of a NetworkPolicy resource as proof of enforcement.
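One way to package such a probe, as an alternative to a kubectl exec step, is a Job that succeeds only when the cross-namespace connection is refused or times out. This is a sketch: the target namespace tenant-globex, the service name web, and port 8080 are all hypothetical, and the busybox image is assumed to be pullable in your environment.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: netpol-probe              # hypothetical name
  namespace: tenant-acme
spec:
  backoffLimit: 0                 # one attempt; a failure should fail the CI step
  template:
    spec:
      restartPolicy: Never
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: probe
          image: busybox:1.36
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          command:
            - sh
            - -c
            - |
              # Hypothetical target in another tenant's namespace.
              if nc -z -w 3 web.tenant-globex.svc.cluster.local 8080; then
                echo "FAIL: cross-namespace connection succeeded"
                exit 1
              fi
              echo "OK: connection was blocked"
```

A DNS failure also makes the connection attempt fail, so this probe alone cannot tell "blocked" from "unresolvable". Pair it with a positive control, a connection that is expected to succeed, before trusting the result.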
Pitfall 3: Shared ingress as a single point of failure. A misconfigured Ingress resource in one namespace caused the shared NGINX controller to reload every 8 seconds, introducing a brief packet-loss window on each reload cycle. Tenants in unrelated namespaces experienced elevated error rates they could not diagnose.
Fix: separate ingress controllers per trust boundary using distinct IngressClass resources. The Kubernetes Gateway API — now generally available — provides better per-tenant traffic isolation than the legacy Ingress object. Configure an upper bound on the controller's reload frequency via Helm values to limit the blast radius of bad Ingress configurations.
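A sketch of what distinct IngressClass resources look like, with one tenant Ingress pinned to its class. The class names, controller strings, hostname, and backend service are hypothetical. Each controller installation must be configured to claim its own controller string, and how that is set depends on the controller you run.

```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx-internal                          # hypothetical class for internal squads
spec:
  controller: k8s.io/ingress-nginx-internal     # must match the internal controller's setting
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: nginx-customers                         # hypothetical class for customer traffic
spec:
  controller: k8s.io/ingress-nginx-customers    # must match the customer controller's setting
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: acme-web
  namespace: tenant-acme
spec:
  ingressClassName: nginx-customers             # only the customer controller reconciles this
  rules:
    - host: acme.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```

With this split, a bad Ingress in an internal namespace triggers reloads only on the internal controller.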
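Pitfall 4: Service account tokens auto-mounted and accessible. By default, every pod gets a service account token mounted automatically. Since Kubernetes 1.22 that token is a time-bound projected token rather than a long-lived Secret, and 1.24 stopped auto-generating Secret-based tokens, but the mount itself remains on by default. A compromised container in a tenant namespace can use that token to enumerate and read secrets from the namespace via the API server wherever RBAC permits it, often more data than intended, particularly when RBAC has drifted from its initial design.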
Fix: disable auto-mounting on the default service account for every tenant namespace, and mount tokens only for workloads that require API server access, using projected token volumes with bounded lifetimes:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: tenant-acme
automountServiceAccountToken: false
```

Pitfall 5: Zero-request pods that break the autoscaler. A tenant's deployment launched pods without resource requests. The scheduler placed all pods on two nodes that became CPU-saturated. Because there were no requests, the cluster autoscaler saw no unschedulable pods and did not provision additional capacity. Other tenants' workloads on those same nodes experienced throttling they could not explain.
Fix: LimitRanges with a minimum request are the primary control, but they do not prevent a pod with a request of exactly zero if the LimitRange minimum is not set. Enforce non-zero resource requests via a Kyverno policy or OPA Gatekeeper constraint as a hard admission block — warn and audit modes are insufficient for this failure mode.
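As a starting point, the sketch below follows the shape of the commonly used Kyverno "require requests" pattern and blocks pods whose containers declare no CPU or memory request. The policy name is hypothetical. Note the limit of this sketch: the "?*" pattern checks that a value is present, so rejecting an explicit request of 0 needs an additional comparison that is not shown here.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-requests      # hypothetical name
spec:
  # Hard block at admission rather than audit-only reporting.
  # Newer Kyverno releases also accept this setting per rule.
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-cpu-and-memory-requests
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Every container must declare CPU and memory requests."
        pattern:
          spec:
            containers:
              # "?*" matches any non-empty value.
              - resources:
                  requests:
                    cpu: "?*"
                    memory: "?*"
```

LimitRange defaults are applied during admission before validating webhooks run, so in a namespace that has a LimitRange this policy mainly acts as a backstop for namespaces where one is missing.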
Pitfall 6: A shared StorageClass that pins volumes to one availability zone. Two tenants in a multi-AZ cluster provisioned PersistentVolumeClaims from a shared StorageClass hardcoded to a single AZ. A node failure in that AZ unmounted volumes for both tenants simultaneously, converting an isolated node failure into a cross-tenant incident.
Fix: set volumeBindingMode: WaitForFirstConsumer on all StorageClasses. This defers volume binding until the pod is scheduled, placing the volume in the AZ where the pod actually lands rather than a hardcoded zone. Use separate StorageClasses per trust boundary for regulated workloads where storage isolation is in scope.
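A minimal StorageClass with that binding mode might look like the following. The class name is hypothetical, and the provisioner and gp3 volume type assume the AWS EBS CSI driver, in line with the AWS examples elsewhere in this article.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: tenant-standard                 # hypothetical name
provisioner: ebs.csi.aws.com            # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Delete
# Delay provisioning until a pod using the claim is scheduled, so the
# volume is created in the zone of the node the pod lands on.
volumeBindingMode: WaitForFirstConsumer
```

Deferred binding only helps if nothing else pins the zone, so an existing class with a hardcoded zone in allowedTopologies needs that restriction removed as well.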
The operator dimension: GitOps, policy-as-code, and fleet upgrades
No tenancy model survives contact with reality without automation. Manual per-namespace configuration drifts within weeks, and a fleet of clusters managed by hand is not a fleet — it is a collection of independently diverging snowflakes.
- 01. Request via pull request: A namespace manifest directory is submitted to the platform repository by a developer or automation. It includes declared tenant tier, owner team, cost-centre labels, and quota tier.
- 02. Policy linting in CI: OPA Gatekeeper or Kyverno policies run against the manifests in CI. The pipeline blocks any PR missing NetworkPolicy, ResourceQuota, LimitRange, or PSS labels. No exceptions without a named senior approver in the PR review.
- 03. GitOps sync: ArgoCD or Flux applies the namespace and all five control objects atomically. A namespace never exists in a partially configured state because the sync unit is the entire directory, not individual resources.
- 04. Continuous drift detection: ArgoCD runs with prune disabled but alerts on out-of-sync state within 5 minutes. A nightly reconciliation job audits live cluster state against the Git source of truth and opens an incident on any detected drift.
- 05. Namespace lifecycle management: Every namespace carries an owner annotation and a TTL for non-production environments. An expiry controller sends a notification 7 days before deletion, requiring a renewal action. Abandoned namespaces are reaped on schedule without manual intervention.
Source: ClimsTech Engineering
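The sync and drift-detection steps can be sketched as an ArgoCD Application that treats one tenant directory as its unit. The repository URL, path, and project are hypothetical, and the alerting itself (for example through ArgoCD notifications) is configured separately and not shown.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: tenant-acme
  namespace: argocd
spec:
  project: default                                        # assumption: default project
  source:
    repoURL: https://git.example.com/platform/tenants.git # hypothetical repository
    targetRevision: main
    path: tenants/tenant-acme                             # directory holding the namespace and its control objects
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: false      # never delete live objects automatically
      selfHeal: false   # manual changes show up as OutOfSync instead of being reverted silently
```

With selfHeal off, a manual change in the cluster leaves the Application OutOfSync, which is the state the drift alert keys on.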
For cluster fleets, Cluster API (CAPI) is the closest thing to a provisioning standard — it lets you declare cluster configurations as Kubernetes objects and manage them with the same GitOps tooling you use for applications. For managed clusters on AWS EKS or GCP GKE, Crossplane or Terraform module libraries provide equivalent lifecycle management with a smaller operational surface.
Policy-as-code: OPA Gatekeeper versus Kyverno
Both tools are production-proven. The practical tradeoffs in one place:
| Criterion | OPA Gatekeeper | Kyverno |
|---|---|---|
| Policy language | Rego (purpose-built) | Kubernetes-native YAML |
| Expressiveness | Higher for complex cross-resource conditions | Sufficient for most admission use cases |
| Learning curve | Steeper; requires Rego expertise to maintain | Lower; readable by most Kubernetes operators |
| Mutation support | Limited (via mutations.gatekeeper.sh) | First-class: generate, mutate, validate in one policy |
| Image verification | Separate Cosign integration required | Built-in Cosign and Notary v2 support |
| Recommendation | Teams with existing OPA investment or complex policy logic | Teams starting fresh or optimising for operator speed |
Upgrading a cluster fleet without dropping tenants
Kubernetes minor-version releases happen approximately every four months. A fleet of clusters on diverging versions is a maintenance burden; a fleet on identical versions concentrates upgrade risk — all tenants move simultaneously.
A workable model: maintain a supported window of two minor versions (for example, 1.29 and 1.30). Run a canary cluster on the newest version for 30 days before rolling it to the fleet. Require application teams to smoke-test against the canary before the fleet upgrade proceeds. Use CAPI or your provisioning tool to perform rolling node upgrades — drain, replace, uncordon — with a maximum unavailability of one node per pool.
Missing this discipline is how you end up with clusters still on 1.27 when 1.27 reaches end-of-life, and a forced emergency upgrade across all tenants simultaneously — the single worst outcome for a team that chose cluster isolation to gain independent upgrade schedules.