## CISSP Domain Mapping
| Domain | Relevance |
|---|---|
| Domain 3 — Security Architecture | Cluster API for declarative cluster lifecycle; Crossplane for cloud resource governance |
| Domain 7 — Security Operations | GitOps audit trail as compliance evidence; ArgoCD + Flux for policy-driven deployment |
| Domain 8 — Software Security | AI/ML workload supply chain; GPU operator image provenance |
## Introduction
By 2023, the question had shifted from “how do we run Kubernetes?” to “how do we let other engineers run their workloads on Kubernetes without becoming a bottleneck?”
This is the platform engineering problem. And it drove the tooling that defined 2023–2025: GitOps as the deployment standard, Cluster API for Kubernetes-on-Kubernetes provisioning, AI/ML workloads forcing new scheduling capabilities, and the Kubernetes project itself shedding more weight to become faster to release and operate.
## GitOps: Principle Becomes Practice
GitOps as a term was coined by Weaveworks in 2017. By 2023, it was no longer a debate — it was the default deployment model for organizations running Kubernetes at scale.
The principle: the desired state of your cluster lives in Git. A controller watches the repository and reconciles the cluster state to match. Every deployment is a PR merge. The audit trail is the Git history.
Flux v2 and ArgoCD (both CNCF graduated) became the two dominant implementations:
```yaml
# Flux: GitRepository + Kustomization
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: production-config
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/org/k8s-config
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: production-apps
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/production
  prune: true  # Remove resources deleted from Git
  sourceRef:
    kind: GitRepository
    name: production-config
  healthChecks:
  - apiVersion: apps/v1
    kind: Deployment
    name: api
    namespace: production
```
The `prune: true` behavior is critical: resources deleted from Git are deleted from the cluster. This is what makes GitOps a security control — anything running in the cluster that isn't declared in Git gets removed. No more accumulation of forgotten test deployments, rogue debug pods, or unauthorized configuration changes that outlive the engineer who made them.
ArgoCD’s Application model added a UI, synchronization policies, and multi-cluster management:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-api
  namespace: argocd
spec:
  project: production
  source:
    repoURL: https://github.com/org/apps
    targetRevision: HEAD
    path: api/production
  destination:
    server: https://kubernetes.default.svc
    namespace: api
  syncPolicy:
    automated:
      prune: true
      selfHeal: true  # Revert manual kubectl changes
    syncOptions:
    - CreateNamespace=true
```
The `selfHeal: true` option is where GitOps becomes enforceable: any manual change made with kubectl is automatically reverted within the sync interval. For compliance-sensitive environments, this is a configuration drift prevention control.
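The `project: production` reference points at an ArgoCD AppProject, which constrains what the Application may deploy and from where. A minimal sketch (the repository URL and namespace are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: production
  namespace: argocd
spec:
  sourceRepos:
  - https://github.com/org/apps        # only this repo may be deployed
  destinations:
  - server: https://kubernetes.default.svc
    namespace: api                     # only into this namespace
  clusterResourceWhitelist: []         # no cluster-scoped resources allowed
```

Combined with self-heal and pruning, the project boundary limits the blast radius: even a compromised application repository can only touch the namespaces its project is scoped to.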
## Cluster API: Kubernetes Managing Kubernetes
Cluster API (kubernetes-sigs/cluster-api) flipped the usual model: instead of using tools like Terraform or Ansible to provision Kubernetes clusters, Cluster API lets you manage Kubernetes clusters as Kubernetes resources — using a management cluster to provision and manage workload clusters.
```yaml
# Create a new Kubernetes cluster as a Kubernetes resource
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: workload-cluster-prod
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: AWSCluster
    name: workload-cluster-prod
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: workload-cluster-prod-control-plane
```
Cluster API reconciliation handles cluster provisioning, scaling, upgrades, and deletion — all through the Kubernetes API, with all the tooling (RBAC, audit logging, GitOps integration) that entails. Multi-cluster platform teams could now manage hundreds of workload clusters from a single management cluster.
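Worker nodes follow the same pattern: a MachineDeployment manages machines the way a Deployment manages pods. A sketch (resource names and the Kubernetes version are illustrative):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: workload-cluster-prod-md-0
spec:
  clusterName: workload-cluster-prod
  replicas: 3                    # scale workers by editing this field
  template:
    spec:
      clusterName: workload-cluster-prod
      version: v1.28.0           # roll node upgrades by bumping this
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: workload-cluster-prod-md-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSMachineTemplate
        name: workload-cluster-prod-md-0
```

Scaling workers is a one-line `replicas` change in Git, and a version bump triggers a rolling replacement of nodes, exactly like a Deployment rollout.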
## Kubernetes 1.28 — Sidecar Containers Alpha (August 2023)
Sidecar containers had been a Kubernetes pattern since 2015 — a helper container in the same pod as the main application. But there was no native sidecar lifecycle management. Sidecars were just regular init containers or additional containers, which meant:
- Init containers had to run to completion before the application started, so a long-running sidecar couldn't be one
- Regular container sidecars had no startup ordering guarantees
- At pod termination, sidecars could die before the application finished draining
1.28 introduced native sidecar support: a new `restartPolicy` field for init containers:
```yaml
spec:
  initContainers:
  - name: log-collector
    image: fluentbit:latest
    restartPolicy: Always  # This makes it a sidecar
    # Starts before main containers, stays running, stops after main containers exit
  containers:
  - name: application
    image: myapp:latest
```
A sidecar container (an init container with `restartPolicy: Always`):
- Starts before application containers
- Stays running throughout the pod lifecycle
- Terminates automatically after all main containers exit
- Restarts if it crashes (unlike regular init containers)
This solved the service mesh sidecar problem: Istio and Linkerd injected Envoy proxies as regular containers, leading to race conditions where the proxy hadn’t started when the application tried to make outbound connections. Native sidecar lifecycle guarantees the proxy is ready before the application starts.
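Native sidecars also honor startup probes: the kubelet does not start subsequent containers until a sidecar's startup probe succeeds, so a mesh proxy can gate application startup. A sketch (the proxy image and probe port are illustrative):

```yaml
spec:
  initContainers:
  - name: proxy
    image: envoyproxy/envoy:v1.28.0   # illustrative proxy image
    restartPolicy: Always             # native sidecar
    startupProbe:                     # app containers wait for this to pass
      httpGet:
        path: /ready
        port: 15021
  containers:
  - name: application
    image: myapp:latest
```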
Also in 1.28:
- Retroactive default StorageClass assignment: Existing PVCs without a StorageClass get the cluster default applied retroactively — useful for migrations
- Non-graceful node shutdown stable: Handle node power failures without manual pod cleanup
- Recovery from volume expansion failure: Previously, a failed volume expansion left the PVC in a broken state; 1.28 introduced a recovery mechanism
## AI/ML Workloads Force New Kubernetes Capabilities
The LLM wave of 2023 drove GPU workloads onto Kubernetes at a scale and urgency the project hadn’t anticipated. Running LLM inference on Kubernetes required solving problems that CPU-centric cluster scheduling hadn’t encountered:
GPU topology awareness: Inference across multiple GPUs requires GPUs connected by NVLink or on the same PCIe switch, not arbitrary GPUs from different nodes or different PCIe buses. The Dynamic Resource Allocation API (1.26 alpha) was designed exactly for this.
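As a sketch of the DRA model (using the beta API shape from resource.k8s.io; the claim name and device class are illustrative), a workload claims devices instead of requesting an opaque counted resource:

```yaml
# DRA sketch — beta API shape; names are illustrative
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: llm-inference-gpus
spec:
  devices:
    requests:
    - name: gpus
      deviceClassName: gpu.example.com
      allocationMode: ExactCount
      count: 2   # two devices, allocated together by the driver
```

A pod then references the claim under `spec.resourceClaims`, and the DRA driver — not the scheduler's counting logic — decides which physical devices satisfy it. That is where topology constraints like NVLink adjacency can be expressed.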
Fractional GPU allocation: NVIDIA’s time-slicing and MIG (Multi-Instance GPU) allow multiple pods to share a single GPU. The GPU operator (NVIDIA) manages this at the node level:
```shell
# Check GPU resources visible to Kubernetes
kubectl get nodes -o custom-columns=\
"NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
# NODE          GPU
# gpu-node-1    8
# gpu-node-2    8
```
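The fractional-sharing side is configured through the device plugin rather than the pod spec. With NVIDIA's time-slicing configuration (a sketch; the replica count is illustrative), one physical GPU is advertised as several allocatable `nvidia.com/gpu` units:

```yaml
# NVIDIA device plugin sharing config, applied via the GPU operator
version: v1
sharing:
  timeSlicing:
    resources:
    - name: nvidia.com/gpu
      replicas: 4   # each physical GPU appears as 4 schedulable GPUs
```

Note that time-slicing provides no memory isolation between pods sharing a GPU; MIG partitions the hardware itself and is the safer choice for multi-tenant clusters.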
Batch scheduling for training jobs: Training runs require all workers to start simultaneously — a single missing GPU makes the entire job stall. The Kubernetes Job API doesn’t guarantee this. Projects like Volcano (CNCF incubating) and Kueue (Kubernetes SIG Scheduling) added gang scheduling: a job only starts when all requested resources are available.
```yaml
# Kueue: queue AI training jobs with resource quotas
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: gpu-queue
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["nvidia.com/gpu", "cpu", "memory"]
    flavors:
    - name: a100-80gb
      resources:
      - name: nvidia.com/gpu
        nominalQuota: 16
```
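Workloads enter the queue through a namespaced LocalQueue and a label on the Job. Kueue keeps the Job suspended until the full quota is available — the gang-scheduling guarantee. A sketch (the namespace, job name, and image are illustrative):

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: ml-team-queue
  namespace: ml-team
spec:
  clusterQueue: gpu-queue
---
apiVersion: batch/v1
kind: Job
metadata:
  name: llm-finetune
  namespace: ml-team
  labels:
    kueue.x-k8s.io/queue-name: ml-team-queue
spec:
  suspend: true        # Kueue unsuspends only when quota admits the job
  parallelism: 8
  completions: 8
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: trainer
        image: trainer:latest
        resources:
          requests:
            nvidia.com/gpu: 1
          limits:
            nvidia.com/gpu: 1
```

All eight workers start together or not at all — no more half-started training jobs holding GPUs while waiting for stragglers.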
## Kubernetes 1.29 — Sidecar to Beta, Load Balancer IP Mode (December 2023)
- Sidecar containers beta: The lifecycle semantics were refined based on 1.28 alpha feedback
- Load balancer IP mode alpha: Distinguish between load balancers that use virtual IPs (kube-proxy handles the traffic) vs. those that handle traffic directly (no need for kube-proxy rules) — important for eBPF-based load balancers
- ReadWriteOncePod volume access stable
## Kubernetes 1.30 — Structured Authorization Config (April 2024)
- Structured authorization configuration beta: Define multiple authorization webhooks with explicit ordering, failure modes, and connection settings — replacing the flat `--authorization-mode` flag
- Sidecar containers beta continues
- Node memory swap support beta: Allow pods to use swap memory — controversial but necessary for workloads with bursty memory patterns that prefer using swap over OOM kill
```yaml
# Node with swap enabled — kubelet config
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
memorySwap:
  swapBehavior: LimitedSwap
```
The swap support feature reversed a long-standing Kubernetes hard stance: swap had been disabled since 1.0 because its interaction with Kubernetes memory accounting was unpredictable. The 1.30 approach adds proper accounting and policies.
## Kubernetes 1.31 — Cloud Provider Code Removal Complete (August 2024)
1.31 marked the completion of the cloud provider code removal — the 1.5 million line migration that had been running since 1.26. Core binaries are 40% smaller. The API server, controller manager, and scheduler no longer contain vendor-specific code.
Also in 1.31:
- Persistent Volume health monitor stable
- AppArmor support stable: AppArmor profiles for pods using the native Kubernetes field (not annotations)
- Traffic distribution for Services beta: Express topology preferences for Service routing (prefer local node, prefer same zone)
```yaml
# Traffic distribution: prefer endpoints in the same zone
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  trafficDistribution: PreferClose
  selector:
    app: api
  ports:
  - port: 80
    targetPort: 8080
```
## Kubernetes 1.32 — DRA Beta, Sidecars Near Stable (December 2024)
- Sidecar containers: Still beta in 1.32; the pattern graduated to stable in 1.33 (April 2025), finally becoming a first-class Kubernetes primitive after nearly a decade of workarounds
- Dynamic Resource Allocation beta: GPU and specialized hardware scheduling ready for production evaluation
- Job API improvements: Success and failure policies for indexed jobs — granular control over batch workload behavior
- Custom Resource field selectors: Filter CRDs on arbitrary fields — making large CRD-based systems more efficient to query
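Field selectors on custom resources require the CRD to declare which fields are selectable. A sketch (the `Database` CRD and its `environment` field are hypothetical):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  names:
    kind: Database
    plural: databases
    singular: database
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    selectableFields:
    - jsonPath: .spec.environment   # now usable with --field-selector
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              environment:
                type: string
```

With that declared, `kubectl get databases --field-selector spec.environment=production` filters on the server side instead of listing every object and filtering in the client.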
## Crossplane: Kubernetes as the Control Plane for Everything
Crossplane (CNCF incubating) extended the Kubernetes API model beyond the cluster itself. Using CRDs and controllers, Crossplane lets you manage cloud resources (RDS databases, S3 buckets, VPCs, IAM roles) as Kubernetes resources — provisioned, updated, and deleted through the Kubernetes API.
```yaml
# Crossplane: provision an RDS PostgreSQL instance as a Kubernetes resource
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: production-db
spec:
  forProvider:
    region: us-east-1
    dbInstanceClass: db.r6g.xlarge
    masterUsername: admin
    engine: postgres
    engineVersion: "15"
    allocatedStorage: 100
    multiAZ: true
  writeConnectionSecretToRef:
    name: production-db-credentials
    namespace: production
```
For platform teams, Crossplane means a single control plane — the Kubernetes API — for both compute workloads and cloud infrastructure. GitOps tools (Flux, ArgoCD) manage both.
## Key Takeaways
- GitOps (Flux, ArgoCD) became the production deployment standard — not for ideological reasons, but because the audit trail, drift detection, and self-healing properties solve real operational and compliance problems
- Cluster API made Kubernetes cluster lifecycle (provisioning, upgrades, deletion) a Kubernetes-native operation — the same API, tooling, and audit trail
- Native sidecar containers (alpha in 1.28, stable in 1.33) finally resolved the lifecycle ordering problem that service meshes and log collectors had worked around for years
- AI/ML workloads drove new scheduling capabilities (DRA, gang scheduling via Kueue/Volcano) and made GPU topology awareness a first-class concern
- Crossplane generalized the Kubernetes API model to cloud infrastructure — the cluster is now a control plane for everything, not just containers
## What’s Next
← EP06: The Runtime Reckoning | EP08: Kubernetes Today →
Series: Kubernetes: From Borg to Platform Engineering | linuxcent.com