Kubernetes CRD Versioning: From v1alpha1 to v1 Without Breaking Clients

Reading Time: 6 minutes

Kubernetes CRDs & Operators: Extending the API, Episode 8
What Is a CRD? · CRDs You Already Use · CRD Anatomy · Write Your First CRD · CEL Validation · Controller Loop · Build an Operator · CRD Versioning · Admission Webhooks · CRDs in Production


TL;DR

  • Kubernetes CRD versioning lets you evolve your API from v1alpha1 to v1 without deleting existing custom resources or breaking clients still using the old version
    (storage version = the version etcd actually stores objects in; served versions = the versions the API server responds to; you can serve v1alpha1 and v1 simultaneously while migrating)
  • The hub-and-spoke model is the recommended conversion architecture: one “hub” version (usually v1) that every other version converts to/from
  • Without a conversion webhook (strategy: None), the API server rewrites only the apiVersion on served objects and converts no fields; serving multiple versions with schema differences requires a webhook
  • The kube-storage-version-migrator (or a manual re-apply) migrates existing objects from the old storage version to the new one after you update storage: true
  • Changing field names between versions without a conversion webhook corrupts data silently — always test conversion round-trips before promoting a version

The Big Picture

  CRD VERSION LIFECYCLE

  Stage 1: Alpha                 Stage 2: Beta              Stage 3: Stable
  ──────────────────             ──────────────             ──────────────
  v1alpha1                       v1alpha1 (deprecated)      v1alpha1 (removed)
    served: true                   served: true               served: false
    storage: true                  storage: false             storage: false
                                 v1beta1                    v1beta1 (deprecated)
                                   served: true               served: true
                                   storage: false             storage: false
                                 v1                         v1
                                   served: true               served: true
                                   storage: true              storage: true

  Clients using v1alpha1:         The API server converts     Eventually remove
  still work via conversion       on the fly                  old served versions
  webhook

Kubernetes CRD versioning is what allows you to ship BackupPolicy v1alpha1 today, learn from real usage, evolve the schema to v1 with renamed fields and new constraints, and keep existing clusters running without a migration window.


Why Versioning Is Necessary

When BackupPolicy v1alpha1 shipped, the spec used retentionDays. After six months of production use, the team learned:

  • retentionDays should be renamed to retention.days (nested under a retention object for future extensibility)
  • A new required field backupFormat needs to be added with a default of tar.gz
  • The targets field should be renamed to includedNamespaces

These are breaking changes. Clients (GitOps repos, Helm charts, other operators) still have YAML referencing v1alpha1 with the old field names. You cannot simply rename the fields.

The solution: add v1 with the new schema, run both versions simultaneously via a conversion webhook, migrate objects to the new storage version, then deprecate v1alpha1.


Simple Case: Non-Breaking Addition (No Webhook Needed)

If you only add new optional fields to the schema — no renames, no removals — you can add a new version without a conversion webhook, as long as only one version is served at a time: the default conversion strategy (None) rewrites nothing but the apiVersion, so serving both schemas at once would let the old schema's structural pruning silently drop the new field.

versions:
  - name: v1alpha1
    served: false      # stop serving old version
    storage: false
    schema: ...
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        properties:
          spec:
            properties:
              schedule:
                type: string
              retentionDays:
                type: integer
              backupFormat:          # new optional field
                type: string
                default: "tar.gz"

Existing objects stored as v1alpha1 are served as v1 with the new field defaulted. This works for purely additive changes because the stored bytes are compatible with the new schema.

When this is not enough: field renames, type changes, field removal, or structural reorganization all require a conversion webhook.


The Hub-and-Spoke Model

For breaking schema changes, the API server needs a conversion webhook. The recommended architecture is hub-and-spoke:

  HUB-AND-SPOKE CONVERSION

       v1alpha1
          │
          ▼ convert to hub
         v1  (hub)
          ▲
          │ convert to hub
       v1beta1

  Every version converts TO the hub and FROM the hub.
  The hub is always the storage version.
  Two-version conversion: v1alpha1 → v1 → v1beta1
  Never directly: v1alpha1 → v1beta1

This means you write one ConvertTo/ConvertFrom pair per non-hub version (N−1 pairs for N versions) rather than a converter for every version pair. As you add versions, the conversion complexity grows linearly rather than quadratically.
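In controller-runtime terms, hub-and-spoke is two small interfaces. The sketch below trims them for illustration (the real ones in sigs.k8s.io/controller-runtime/pkg/conversion also embed runtime.Object); the hub version merely marks itself, while each spoke owns its two conversion functions:

```go
package main

import "fmt"

// Trimmed sketch of controller-runtime's conversion interfaces.
type Hub interface{ Hub() }

type Convertible interface {
	ConvertTo(dst Hub) error
	ConvertFrom(src Hub) error
}

// The hub version only marks itself; each spoke owns its two conversions.
type V1 struct{}

func (*V1) Hub() {}

type V1alpha1 struct{}

func (*V1alpha1) ConvertTo(dst Hub) error   { return nil } // v1alpha1 → v1
func (*V1alpha1) ConvertFrom(src Hub) error { return nil } // v1 → v1alpha1

func main() {
	var _ Hub = &V1{}
	var _ Convertible = &V1alpha1{}
	// 3 versions, hub-and-spoke: 2 spokes × 1 pair each = 2 pairs.
	// 3 versions, pairwise:      3 × 2 / 2 = 3 pairs, and growing quadratically.
	fmt.Println("hub-and-spoke wired")
}
```

The API server never sees these interfaces; they are how controller-runtime's webhook handler decides which conversion functions to call for a given ConversionReview.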


Writing a Conversion Webhook

The conversion webhook is an HTTPS endpoint that the API server calls when it needs to convert an object between versions.
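Concretely, the API server POSTs a ConversionReview object and expects the converted objects back in the same order. The shape is abbreviated below; the uid placeholder stands for whatever the API server sends, and the response must echo it verbatim:

```yaml
# Request body POSTed to /convert:
apiVersion: apiextensions.k8s.io/v1
kind: ConversionReview
request:
  uid: "<request-uid>"
  desiredAPIVersion: storage.example.com/v1
  objects:
    - apiVersion: storage.example.com/v1alpha1
      kind: BackupPolicy
      # ... full object as stored ...

# Expected response:
apiVersion: apiextensions.k8s.io/v1
kind: ConversionReview
response:
  uid: "<request-uid>"       # must match request.uid
  result:
    status: Success
  convertedObjects:
    - apiVersion: storage.example.com/v1
      kind: BackupPolicy
      # ... converted object, same order as request.objects ...
```

With kubebuilder you never hand-write this exchange; the scaffolded webhook server decodes the review and dispatches to your ConvertTo/ConvertFrom functions.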

1. Define the conversion hub

In the kubebuilder project, mark v1 as the hub:

In api/v1/backuppolicy_conversion.go:

package v1

// Hub marks this type as the conversion hub.
func (*BackupPolicy) Hub() {}

2. Implement conversion in v1alpha1

In api/v1alpha1/backuppolicy_conversion.go:

package v1alpha1

import (
    "fmt"
    v1 "github.com/example/backup-operator/api/v1"
    "sigs.k8s.io/controller-runtime/pkg/conversion"
)

// ConvertTo converts v1alpha1 BackupPolicy to v1 (the hub).
func (src *BackupPolicy) ConvertTo(dstRaw conversion.Hub) error {
    dst := dstRaw.(*v1.BackupPolicy)

    // Metadata
    dst.ObjectMeta = src.ObjectMeta

    // Field mapping: v1alpha1 → v1
    dst.Spec.Schedule     = src.Spec.Schedule
    dst.Spec.StorageClass = src.Spec.StorageClass
    dst.Spec.Suspended    = src.Spec.Suspended

    // New field: default for old objects, but honor the preservation
    // annotation written by ConvertFrom so v1 → v1alpha1 → v1 is lossless.
    dst.Spec.BackupFormat = "tar.gz"
    if v, ok := src.Annotations["storage.example.com/backup-format"]; ok {
        dst.Spec.BackupFormat = v
    }

    // Renamed field: retentionDays → retention.days
    dst.Spec.Retention = v1.RetentionSpec{
        Days: src.Spec.RetentionDays,
    }

    // Renamed field: targets → includedNamespaces
    for _, t := range src.Spec.Targets {
        dst.Spec.IncludedNamespaces = append(dst.Spec.IncludedNamespaces,
            v1.NamespaceTarget{
                Namespace:      t.Namespace,
                IncludeSecrets: t.IncludeSecrets,
            })
    }

    dst.Status = v1.BackupPolicyStatus(src.Status)
    return nil
}

// ConvertFrom converts v1 (hub) BackupPolicy back to v1alpha1.
func (dst *BackupPolicy) ConvertFrom(srcRaw conversion.Hub) error {
    src := srcRaw.(*v1.BackupPolicy)

    dst.ObjectMeta = src.ObjectMeta

    dst.Spec.Schedule      = src.Spec.Schedule
    dst.Spec.StorageClass  = src.Spec.StorageClass
    dst.Spec.Suspended     = src.Spec.Suspended
    dst.Spec.RetentionDays = src.Spec.Retention.Days

    for _, n := range src.Spec.IncludedNamespaces {
        dst.Spec.Targets = append(dst.Spec.Targets, BackupTarget{
            Namespace:      n.Namespace,
            IncludeSecrets: n.IncludeSecrets,
        })
    }

    // backupFormat cannot be round-tripped to v1alpha1 (no such field)
    // Store it in an annotation to preserve the value if the object is
    // re-converted back to v1.
    if src.Spec.BackupFormat != "" && src.Spec.BackupFormat != "tar.gz" {
        if dst.Annotations == nil {
            dst.Annotations = make(map[string]string)
        }
        dst.Annotations["storage.example.com/backup-format"] = src.Spec.BackupFormat
    }

    dst.Status = BackupPolicyStatus(src.Status)
    return nil
}

3. Register the webhook

kubebuilder create webhook \
  --group storage \
  --version v1alpha1 \
  --kind BackupPolicy \
  --conversion

This generates the webhook server setup. Deploy it with a TLS certificate; the kubebuilder scaffolding includes cert-manager manifests (a Certificate for the webhook service plus a CA-injection patch on the CRD) that manage this automatically.
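The cert-manager wiring boils down to one annotation on the CRD: cert-manager's CA injector watches for it and copies the serving CA into spec.conversion.webhook.clientConfig.caBundle. The namespace and Certificate name below follow the kubebuilder naming convention for this project and are assumptions:

```yaml
# Sketch of the kubebuilder CA-injection patch for this CRD:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    # cert-manager's CA injector fills in
    # spec.conversion.webhook.clientConfig.caBundle automatically.
    cert-manager.io/inject-ca-from: backup-operator-system/backup-operator-serving-cert
  name: backuppolicies.storage.example.com
```

Without a valid caBundle the API server rejects the webhook's TLS certificate, and every read or write of a non-storage version fails.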


Updating the CRD to Reference the Webhook

spec:
  conversion:
    strategy: Webhook
    webhook:
      clientConfig:
        service:
          name: backup-operator-webhook-service
          namespace: backup-operator-system
          path: /convert
      conversionReviewVersions: ["v1", "v1beta1"]
  versions:
    - name: v1alpha1
      served: true
      storage: false
      schema: ...
    - name: v1
      served: true
      storage: true
      schema: ...

Once applied, kubectl get backuppolicies.v1alpha1.storage.example.com/nightly and kubectl get backuppolicies.v1.storage.example.com/nightly both work — the API server converts transparently.


Migrating Existing Objects to the New Storage Version

After changing storage: true from v1alpha1 to v1, existing objects in etcd are still stored as v1alpha1 bytes. They are served correctly (via conversion) but are not yet migrated.

Migrate them:

# Option 1: Manual re-apply (works for small object counts)
# Note: -o name drops the namespace, so list namespace + name explicitly.
kubectl get backuppolicies -A --no-headers \
  -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name |
while read ns name; do
  kubectl get backuppolicy "$name" -n "$ns" -o yaml | kubectl apply -f -
done

# Option 2: Storage Version Migrator (automated, for large clusters)
# Install: https://github.com/kubernetes-sigs/kube-storage-version-migrator
kubectl apply -f storageVersionMigration.yaml

After migration, all objects in etcd are stored as v1. You can then set v1alpha1 served: false to stop serving the old version.
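One step the migrator does not do for you: the CRD's .status.storedVersions still lists v1alpha1 after migration, and a version cannot be removed from the CRD while it is listed there. Once every object has been rewritten as v1, patch the stale entry out (requires kubectl with --subresource support, v1.24+):

```shell
# Clear the stale storedVersions entry after migration completes:
kubectl patch crd backuppolicies.storage.example.com \
  --subresource=status --type=merge \
  -p '{"status":{"storedVersions":["v1"]}}'
```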


Storage Version Migration Checklist

  SAFE VERSION PROMOTION CHECKLIST

  □ New version (v1) has served: true, storage: true
  □ Old version (v1alpha1) has served: true, storage: false
  □ Conversion webhook deployed and healthy
  □ Round-trip conversion tested (v1alpha1 → v1 → v1alpha1 preserves all data)
  □ kubectl get backuppolicies works at both versions
  □ Existing objects migrated (re-applied or migration job run)
  □ Old version dropped from the CRD's .status.storedVersions
  □ Old version set to served: false (stop serving)
  □ Old version removed from CRD after N release cycles

⚠ Common Mistakes

Changing the storage version without a conversion webhook. If you flip storage: true from v1alpha1 to v1 while still serving v1alpha1 under strategy: None, the API server converts nothing: stored objects are served with only the apiVersion rewritten, so renamed fields come back empty and structural pruning drops their data on the next write. Always deploy the conversion webhook before changing the storage version.

Lossy conversion. If ConvertFrom (v1 → v1alpha1) drops a field that exists in v1, objects are silently corrupted when a v1alpha1 client reads and re-saves them. Round-trip test every conversion: original → hub → original must produce identical objects (or use annotations to preserve fields that cannot round-trip).
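The round-trip check can be sketched with simplified stand-in types. This is illustration only: the real test would exercise the generated API structs, but the shape of the property under test (v1 → v1alpha1 → v1 preserves every v1 field, using the preservation annotation for fields v1alpha1 lacks) is the same:

```go
package main

import "fmt"

// Minimal stand-ins for the generated API types; only the fields involved
// in the rename and the preservation annotation are modeled.
const formatAnnotation = "storage.example.com/backup-format"

type Alpha struct {
	Annotations   map[string]string
	Schedule      string
	RetentionDays int
}

type Retention struct{ Days int }

type HubV1 struct {
	Annotations  map[string]string
	Schedule     string
	Retention    Retention
	BackupFormat string
}

// toHub mirrors ConvertTo: rename retentionDays -> retention.days and
// restore backupFormat from the preservation annotation if present.
func toHub(a Alpha) HubV1 {
	h := HubV1{
		Annotations:  a.Annotations,
		Schedule:     a.Schedule,
		Retention:    Retention{Days: a.RetentionDays},
		BackupFormat: "tar.gz",
	}
	if v, ok := a.Annotations[formatAnnotation]; ok {
		h.BackupFormat = v
	}
	return h
}

// fromHub mirrors ConvertFrom: flatten retention.days and stash a
// non-default backupFormat in an annotation so it survives the trip.
func fromHub(h HubV1) Alpha {
	a := Alpha{
		Annotations:   h.Annotations,
		Schedule:      h.Schedule,
		RetentionDays: h.Retention.Days,
	}
	if h.BackupFormat != "" && h.BackupFormat != "tar.gz" {
		if a.Annotations == nil {
			a.Annotations = map[string]string{}
		}
		a.Annotations[formatAnnotation] = h.BackupFormat
	}
	return a
}

func main() {
	orig := HubV1{Schedule: "0 2 * * *", Retention: Retention{Days: 30}, BackupFormat: "zstd"}
	rt := toHub(fromHub(orig))
	lossless := rt.Schedule == orig.Schedule &&
		rt.Retention == orig.Retention &&
		rt.BackupFormat == orig.BackupFormat
	fmt.Println("lossless round-trip:", lossless)
}
```

Drop the annotation logic from either function and the check fails: backupFormat comes back as the tar.gz default, which is exactly the silent corruption this mistake describes.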

Forgetting to migrate existing objects. After changing the storage version, existing objects are still stored in the old format. They convert on read, but etcd still holds old bytes. Until migrated, your etcd backup/restore story is broken — restoring from backup would restore old-format bytes that need conversion.


Quick Reference

# Check which version is currently the storage version
kubectl get crd backuppolicies.storage.example.com \
  -o jsonpath='{.status.storedVersions}'
# output: ["v1alpha1"]  or  ["v1alpha1","v1"]  or  ["v1"]

# Verify conversion webhook is reachable
kubectl get crd backuppolicies.storage.example.com \
  -o jsonpath='{.spec.conversion.webhook.clientConfig}'

# Read an object at a specific version
kubectl get backuppolicies.v1alpha1.storage.example.com/nightly -n demo -o yaml
kubectl get backuppolicies.v1.storage.example.com/nightly -n demo -o yaml

# Check CRD conditions (NamesAccepted, Established)
kubectl describe crd backuppolicies.storage.example.com | grep -A5 Conditions

Key Takeaways

  • CRD versioning lets you evolve the schema without a migration window — old and new versions coexist via a conversion webhook
  • The hub-and-spoke model minimizes conversion code: N functions, not N² — the hub version is always the storage version
  • Never change the storage version without a deployed conversion webhook for breaking schema changes
  • Conversion must be lossless — fields that cannot round-trip should be preserved in annotations
  • Migrate existing objects to the new storage version after promoting it, then deprecate the old served version

What’s Next

EP09: Admission Webhooks completes the Kubernetes extension picture — validating and mutating webhooks that intercept API requests before they reach etcd, when to use them alongside CRDs, and how they differ from CEL validation.

Get EP09 in your inbox when it publishes → subscribe at linuxcent.com
