Kubernetes CRD CEL Validation: Replace Admission Webhooks for Schema Rules

Kubernetes CRDs & Operators: Extending the API, Episode 5


TL;DR

  • Kubernetes CRD CEL validation (x-kubernetes-validations) lets you write custom validation rules directly in the CRD schema, with no admission webhook needed
    (CEL = Common Expression Language, a lightweight expression language embedded in the API server; CRD validation rules are on by default since Kubernetes 1.25 (beta) and GA since 1.29, and they replace most reasons you would write a validating admission webhook)
  • CEL rules are evaluated by the API server at admit time — the same place as OpenAPI schema validation, before etcd
  • self refers to the current object’s field; oldSelf refers to the previous value (for update rules)
  • Cross-field validation: “if storageClass is premium, retentionDays must be ≤ 90”, which is impossible with plain OpenAPI schema and trivial with CEL
  • Immutable fields: a self == oldSelf rule rejects any update that changes the value after creation
  • CEL rules run in ~microseconds inside the API server; no external service, no TLS, no latency budget to manage

The Big Picture

  CEL VALIDATION: WHERE IT FITS IN THE ADMISSION CHAIN

  kubectl apply -f backup.yaml
         │
         ▼
  API Server admission chain
  ┌────────────────────────────────────────────────────┐
  │                                                    │
  │  1. Mutating admission webhooks (modify object)    │
  │  2. Schema validation (OpenAPI types, required,    │
  │     minimum/maximum, pattern)                      │
  │  3. CEL validation (x-kubernetes-validations)  ←  │ THIS EPISODE
  │  4. Validating admission webhooks (external)       │
  │                                                    │
  └────────────────────────────────────────────────────┘
         │
         ▼ (passes all checks)
  etcd storage

Kubernetes CRD CEL validation sits between schema validation and external webhooks. For most validation requirements, CEL eliminates the need for a webhook entirely — which means no separate deployment to maintain, no TLS certificates to rotate, no availability dependency between your CRD and a webhook server.


Why CEL Replaces Most Admission Webhooks

Before CEL validation rules (beta and on by default in Kubernetes 1.25, GA in 1.29), the only way to express “if field A has value X, field B must be present” was an admission webhook: a separate HTTP server that Kubernetes called synchronously during every API request.

Webhooks work, but they have real costs:

  • Availability dependency: if the webhook is down, creates/updates for that resource type fail
  • TLS management: webhook endpoints require valid TLS certs that must be rotated
  • Deployment overhead: another Deployment, Service, and certificate to manage
  • Latency: every API operation waits for an HTTP round-trip

CEL runs inside the API server process. There is no network call, no certificate, no separate deployment. Rules are compiled once and evaluated in microseconds.

The trade-off: CEL cannot make network calls or access state outside the object being validated. For rules that need to look up other resources (e.g., “does this referenced Secret exist?”), you still need a webhook or a controller that validates via status conditions.


CEL Syntax Basics

CEL expressions are small programs. In Kubernetes CRD validation, the key variables are:

  Variable   Meaning
  self       The value the rule is attached to (the root object for a top-level rule)
  oldSelf    The previously stored value of the same field; a rule that references it is a transition rule and is evaluated only on update, never on create

CEL returns true (validation passes) or false (validation fails, API returns error).

Common patterns:

# String not empty
self.size() > 0

# String matches format
self.matches('^[a-z][a-z0-9-]*$')

# Integer in range
self >= 1 && self <= 365

# Field present (for optional fields)
has(self.fieldName)

# Conditional: if A then B
!has(self.premium) || self.retentionDays <= 90

# List not empty
self.size() > 0

# All items in list satisfy condition
self.all(item, item.namespace.size() > 0)

# Cross-field: access sibling field via parent
self.retentionDays >= self.minRetentionDays
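
What self means depends on where the rule is attached. The fragment below is a minimal sketch of that scoping, reusing the BackupPolicy field names; minRetentionDays is a hypothetical field added only for illustration.

```yaml
spec:
  type: object
  required: ["retentionDays"]
  # Rule on the parent object: self is spec, so sibling fields are reachable.
  x-kubernetes-validations:
    - rule: "!has(self.minRetentionDays) || self.retentionDays >= self.minRetentionDays"
      message: "retentionDays must be at least minRetentionDays"
  properties:
    retentionDays:
      type: integer
      # Rule on a single field: self is the integer value itself.
      x-kubernetes-validations:
        - rule: "self >= 1 && self <= 365"
          message: "retentionDays must be between 1 and 365"
    minRetentionDays:        # hypothetical field, not part of the BackupPolicy CRD
      type: integer
```

A rule can only see the value it is attached to and that value's children, which is why the cross-field check sits on spec rather than on either field.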

Adding CEL Rules to the BackupPolicy CRD

Start from the CRD built in EP04. Add x-kubernetes-validations at the levels where you need them.

Rule 1: Cron expression validation

The OpenAPI pattern field could hold the same regex, but on failure it reports the raw pattern back to the user. A CEL rule lets you pair the regex with a readable message:

spec:
  type: object
  required: ["schedule", "retentionDays"]
  x-kubernetes-validations:
    - rule: "self.schedule.matches('^(\\\\*|[0-9,\\\\-\\\\/]+) (\\\\*|[0-9,\\\\-\\\\/]+) (\\\\*|[0-9,\\\\-\\\\/]+) (\\\\*|[0-9,\\\\-\\\\/]+) (\\\\*|[0-9,\\\\-\\\\/]+)$')"
      message: "schedule must be a valid 5-field cron expression"

Rule 2: Cross-field validation

spec:
  type: object
  x-kubernetes-validations:
    - rule: "!(self.storageClass == 'premium') || self.retentionDays <= 90"
      message: "premium storage class supports at most 90 days retention"
    - rule: "!self.suspended || !has(self.pausedBy) || self.pausedBy.size() > 0"
      message: "when suspended is true, pausedBy must be non-empty if provided"

Rule 3: Immutable fields

Once a BackupPolicy is created, the schedule field should not be changeable without deleting and recreating:

schedule:
  type: string
  x-kubernetes-validations:
    - rule: "self == oldSelf"
      message: "schedule is immutable after creation"

reason field: The optional reason field accepts FieldValueInvalid (default), FieldValueForbidden, FieldValueRequired, and FieldValueDuplicate. Immutable is not a supported value. The rule above relies on the default, so a rejected update returns HTTP 422 with the rule's message.
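
A field-level self == oldSelf rule only runs when the field exists in both the old and the new object, so it does not stop an optional field from being added or removed later. One way to cover that case, sketched here for encryptionKeyRef and assuming you want it frozen once set, is a transition rule on the parent object:

```yaml
spec:
  type: object
  x-kubernetes-validations:
    # Transition rule: it references oldSelf, so it is evaluated only on updates.
    - rule: "!has(oldSelf.encryptionKeyRef) || (has(self.encryptionKeyRef) && self.encryptionKeyRef == oldSelf.encryptionKeyRef)"
      message: "encryptionKeyRef cannot be changed or removed once set"
  properties:
    encryptionKeyRef:
      type: string
```

The rule still allows the field to be set for the first time on an update. Drop the leading !has(oldSelf.encryptionKeyRef) clause if that should be rejected as well.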

Rule 4: Conditional required field

If storageClass is encrypted, then encryptionKeyRef must be present:

spec:
  type: object
  x-kubernetes-validations:
    - rule: "self.storageClass != 'encrypted' || has(self.encryptionKeyRef)"
      message: "encryptionKeyRef is required when storageClass is 'encrypted'"

Rule 5: List element validation

Ensure each target namespace is a valid RFC 1123 DNS label:

targets:
  type: array
  items:
    type: object
    x-kubernetes-validations:
      - rule: "self.namespace.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?$')"
        message: "namespace must be a valid DNS label"
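
The same check can also be attached to the array itself with all(). This is a sketch of that alternative, not a replacement for the per-item rule; the maxLength value is an assumption based on the 63-character DNS label limit.

```yaml
targets:
  type: array
  maxItems: 20            # bounds how many items all() may iterate
  x-kubernetes-validations:
    - rule: "self.all(t, t.namespace.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'))"
      message: "every target namespace must be a valid DNS label"
  items:
    type: object
    required: ["namespace"]
    properties:
      namespace:
        type: string
        maxLength: 63     # assumed bound: DNS labels are at most 63 characters
```

With the array-level form the error is reported against spec.targets as a whole, while the per-item rule points at the offending index, which is usually easier to act on.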

The Complete Updated CRD with CEL

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backuppolicies.storage.example.com
spec:
  group: storage.example.com
  scope: Namespaced
  names:
    plural:     backuppolicies
    singular:   backuppolicy
    kind:       BackupPolicy
    shortNames: [bp]
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required: ["spec"]
          properties:
            spec:
              type: object
              required: ["schedule", "retentionDays"]
              x-kubernetes-validations:
                - rule: "!(self.storageClass == 'premium') || self.retentionDays <= 90"
                  message: "premium storage class supports at most 90 days retention"
              properties:
                schedule:
                  type: string
                  x-kubernetes-validations:
                    - rule: "self == oldSelf"
                      message: "schedule is immutable after creation"
                retentionDays:
                  type: integer
                  minimum: 1
                  maximum: 365
                storageClass:
                  type: string
                  default: "standard"
                  enum: ["standard", "premium", "encrypted", "archive"]
                encryptionKeyRef:
                  type: string
                targets:
                  type: array
                  maxItems: 20
                  items:
                    type: object
                    required: ["namespace"]
                    x-kubernetes-validations:
                      - rule: "self.namespace.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?$')"
                        message: "namespace must be a valid DNS label"
                    properties:
                      namespace:
                        type: string
                      includeSecrets:
                        type: boolean
                        default: false
                suspended:
                  type: boolean
                  default: false
            status:
              type: object
              x-kubernetes-preserve-unknown-fields: true
      subresources:
        status: {}
      additionalPrinterColumns:
        - name: Schedule
          type: string
          jsonPath: .spec.schedule
        - name: Retention
          type: integer
          jsonPath: .spec.retentionDays
        - name: Ready
          type: string
          jsonPath: .status.conditions[?(@.type=='Ready')].status
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp

Testing CEL Rules

Apply the updated CRD:

kubectl apply -f backuppolicies-crd-cel.yaml

Test cross-field validation:

kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: premium-long
  namespace: demo
spec:
  schedule: "0 2 * * *"
  retentionDays: 180          # violates: premium + > 90 days
  storageClass: premium
EOF

The BackupPolicy "premium-long" is invalid:
  spec: Invalid value: "object":
    premium storage class supports at most 90 days retention

Test immutability:

# Create valid policy
kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: immutable-test
  namespace: demo
spec:
  schedule: "0 2 * * *"
  retentionDays: 30
EOF

# Try to change the schedule
kubectl patch bp immutable-test -n demo \
  --type=merge -p '{"spec":{"schedule":"0 3 * * *"}}'

The BackupPolicy "immutable-test" is invalid:
  spec.schedule: Invalid value: "string":
    schedule is immutable after creation

Test list element validation:

kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: bad-namespace
  namespace: demo
spec:
  schedule: "0 2 * * *"
  retentionDays: 7
  targets:
    - namespace: "UPPERCASE_IS_INVALID"
EOF

The BackupPolicy "bad-namespace" is invalid:
  spec.targets[0]: Invalid value: "object":
    namespace must be a valid DNS label
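
It is worth confirming that a well-formed object still gets through. The manifest below is a sketch of one that should satisfy every rule in the CRD above; the object name and the team-a namespace are placeholders.

```yaml
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: premium-short          # placeholder name
  namespace: demo
spec:
  schedule: "0 2 * * *"
  retentionDays: 60            # within the 90-day limit for premium
  storageClass: premium
  targets:
    - namespace: team-a        # lowercase, valid DNS label
```

Apply it the same way as the failing examples, by piping it to kubectl apply -f -.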

CEL Cost and Limits

CEL expressions are evaluated at admission time in the API server. Kubernetes imposes cost limits to prevent expressions from consuming excessive CPU:

  • Each expression is assigned a cost based on its operations (string matches, list iteration, etc.)
  • If the expression cost exceeds the per-validation limit, the API server rejects the CRD itself when you apply it
  • Complex all() over large lists is the most common way to hit cost limits

If you hit a cost limit error:

CustomResourceDefinition is invalid: spec.validation.openAPIV3Schema...
  CEL expression cost exceeds budget

Solutions:
  • Reduce list traversal in CEL rules; enforce list length with maxItems instead
  • Split one expensive rule into multiple simpler rules
  • Move the expensive validation to a controller (status condition) rather than admission
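
The cost estimate is driven by the worst-case size of the data a rule touches, so declaring upper bounds in the schema is often enough to bring a rule under budget. A sketch for the BackupPolicy fields follows; the maxLength values are assumptions, not part of the CRD above.

```yaml
schedule:
  type: string
  maxLength: 100          # assumed bound; caps the estimated cost of matches()
targets:
  type: array
  maxItems: 20            # already present in the CRD above
  items:
    type: object
    properties:
      namespace:
        type: string
        maxLength: 63     # assumed bound: DNS labels are at most 63 characters
```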


⚠ Common Mistakes

Guarding oldSelf against create. Any rule that references oldSelf is a transition rule, and the API server evaluates it only on updates where the field exists in both the old and the new object. It is skipped on create, so self == oldSelf needs no null guard. The flip side is that a field-level transition rule on an optional field does not fire when the field is added or removed; if that matters, put the rule on the parent object and check has(oldSelf.fieldName) there.

Forgetting has() checks for optional fields. If encryptionKeyRef is optional (not in required) and you write a rule like self.encryptionKeyRef.size() > 0, it will fail with a “no such key” error when the field is absent. Always guard optional field access with has(self.fieldName).
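
As a sketch of the difference, here is the unguarded rule (commented out) next to a guarded version, using the encryptionKeyRef field from the CRD above:

```yaml
spec:
  type: object
  x-kubernetes-validations:
    # Unguarded: evaluation errors when encryptionKeyRef is absent.
    # - rule: "self.encryptionKeyRef.size() > 0"
    # Guarded: passes when the field is absent, checks it when present.
    - rule: "!has(self.encryptionKeyRef) || self.encryptionKeyRef.size() > 0"
      message: "encryptionKeyRef must not be empty when provided"
```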

Overloading CEL for what a controller should do. CEL validates fields at admission. If your rule needs to verify that a referenced Secret actually exists, CEL cannot do that — it only sees the object being submitted. Use a controller status condition for existence checks, not CEL.
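
For illustration, a controller that fails such an existence check might report it through the Ready condition that the CRD's printer column already reads. The reason and message values below are hypothetical.

```yaml
status:
  conditions:
    - type: Ready
      status: "False"
      reason: EncryptionKeyNotFound      # hypothetical reason string
      message: "Secret referenced by encryptionKeyRef does not exist"
      lastTransitionTime: "2024-01-01T00:00:00Z"
```

The object is still accepted and stored; the condition tells the user why it is not being acted on.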


Quick Reference: Common CEL Patterns

# String not empty
self.size() > 0

# String matches regex
self.matches('^[a-z][a-z0-9-]{1,62}$')

# Optional field guard
!has(self.fieldName) || self.fieldName.size() > 0

# Conditional requirement
!(condition) || has(self.requiredWhenConditionIsTrue)

# Immutable field (update only)
self == oldSelf

# All list items satisfy condition
self.all(item, item.namespace.size() > 0)

# At least one list item satisfies condition
self.exists(item, item.type == 'primary')

# Cross-field comparison
self.minReplicas <= self.maxReplicas

# Enum-style check
self in ['standard', 'premium', 'archive']

Key Takeaways

  • x-kubernetes-validations with CEL rules replaces most validating admission webhooks for CRD-specific logic
  • CEL runs inside the API server — no external service, no TLS, no separate deployment
  • Cross-field validation, immutable fields, and conditional requirements are all expressible in CEL
  • Use has() guards for optional fields; rules that reference oldSelf are transition rules and run only on update, not on create
  • CEL has cost limits — avoid unbounded list iteration; use maxItems to bound lists first

What’s Next

EP06: The Kubernetes Controller Reconcile Loop explains how a controller watches BackupPolicy objects and acts on them — the mechanism that makes CRDs useful beyond validated configuration storage. Before writing code in EP07, you need to understand the reconcile loop conceptually.
