Write Your First Kubernetes CRD: A Hands-On YAML Walkthrough

Reading Time: 6 minutes

Kubernetes CRDs & Operators: Extending the API, Episode 4
What Is a CRD? · CRDs You Already Use · CRD Anatomy · Write Your First CRD · CEL Validation · Controller Loop · Build an Operator · CRD Versioning · Admission Webhooks · CRDs in Production


TL;DR

  • Writing a Kubernetes CRD requires three YAML files: the CRD itself, an RBAC manifest with ClusterRoles for controllers, editors, and viewers, and a sample custom resource
  • The BackupPolicy CRD built in this episode is the running example throughout the rest of the series — operators, versioning, and production patterns all use it
  • Apply the CRD, verify it with kubectl get crds, create a custom resource, and watch the API server validate your spec
  • RBAC for CRDs follows the same Role/ClusterRole model as built-in resources — rules reference the CRD's group in apiGroups and its plural name in resources
  • Schema validation fires at apply time: bad field types, missing required fields, and out-of-range values all return clear errors before anything reaches etcd
  • Without a controller, a BackupPolicy is stored in etcd but nothing acts on it — that is the topic of EP06 and EP07

The Big Picture

  WHAT WE'RE BUILDING IN THIS EPISODE

  1. backuppolicies-crd.yaml        ← registers the BackupPolicy type
  2. backuppolicies-rbac.yaml       ← controls who can create/view/delete
  3. nightly-backup.yaml            ← our first custom resource instance

  After applying:

  kubectl get crds | grep backup      ← BackupPolicy type exists
  kubectl get backuppolicies -n demo  ← nightly instance exists
  kubectl describe bp nightly -n demo ← spec visible, status empty
  kubectl apply -f bad-backup.yaml    ← schema validation rejects bad data

Writing your first Kubernetes CRD is the step that bridges understanding CRDs conceptually to operating them in a real cluster. This episode is hands-on — every block of YAML is something you apply and verify.


Prerequisites

You need a running Kubernetes cluster and kubectl configured. Any of these work:

# Local options
kind create cluster --name crd-demo
# or
minikube start

# Verify cluster access
kubectl cluster-info
kubectl get nodes

Step 1: Write the CRD

Save this as backuppolicies-crd.yaml:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backuppolicies.storage.example.com
spec:
  group: storage.example.com
  scope: Namespaced
  names:
    plural:     backuppolicies
    singular:   backuppolicy
    kind:       BackupPolicy
    shortNames:
      - bp
    categories:
      - storage
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required: ["spec"]
          properties:
            spec:
              type: object
              required: ["schedule", "retentionDays"]
              properties:
                schedule:
                  type: string
                  description: "Cron expression (e.g. '0 2 * * *' for 02:00 daily)"
                retentionDays:
                  type: integer
                  minimum: 1
                  maximum: 365
                  description: "How many days to retain backup snapshots"
                storageClass:
                  type: string
                  default: "standard"
                  description: "StorageClass to use for backup volumes"
                targets:
                  type: array
                  description: "Namespaces and resources to include in the backup"
                  maxItems: 20
                  items:
                    type: object
                    required: ["namespace"]
                    properties:
                      namespace:
                        type: string
                      includeSecrets:
                        type: boolean
                        default: false
                suspended:
                  type: boolean
                  default: false
                  description: "Set to true to pause backup execution"
            status:
              type: object
              x-kubernetes-preserve-unknown-fields: true
      subresources:
        status: {}
      additionalPrinterColumns:
        - name: Schedule
          type: string
          jsonPath: .spec.schedule
        - name: Retention
          type: integer
          jsonPath: .spec.retentionDays
        - name: Suspended
          type: boolean
          jsonPath: .spec.suspended
        - name: Ready
          type: string
          jsonPath: .status.conditions[?(@.type=='Ready')].status
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp

Apply it:

kubectl apply -f backuppolicies-crd.yaml

Verify it registered correctly:

kubectl get crds backuppolicies.storage.example.com
NAME                                    CREATED AT
backuppolicies.storage.example.com      2026-04-25T08:00:00Z

Check the API server now knows about it:

kubectl api-resources | grep backuppolic
backuppolicies    bp    storage.example.com/v1alpha1    true    BackupPolicy

Check it is Established:

kubectl get crd backuppolicies.storage.example.com \
  -o jsonpath='{.status.conditions[?(@.type=="Established")].status}'
True

If you see False or empty output, wait a few seconds and retry — the API server takes a moment to register new CRDs. Alternatively, block until it is ready with kubectl wait --for=condition=established --timeout=60s crd/backuppolicies.storage.example.com.


Step 2: Write RBAC

CRDs follow the same RBAC model as built-in resources. In RBAC rules, apiGroups takes the CRD's group (storage.example.com) and resources takes the plural name (backuppolicies); subresources are addressed as backuppolicies/status and backuppolicies/finalizers. (The {plural}.{group} form is the name of the CRD object itself, not the RBAC resource.)

Save this as backuppolicies-rbac.yaml:

# ClusterRole for operators/controllers that manage BackupPolicy objects
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backuppolicy-controller
rules:
  - apiGroups: ["storage.example.com"]
    resources: ["backuppolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["storage.example.com"]
    resources: ["backuppolicies/status"]
    verbs: ["get", "update", "patch"]
  - apiGroups: ["storage.example.com"]
    resources: ["backuppolicies/finalizers"]
    verbs: ["update"]
---
# Role for application teams to manage BackupPolicies in their namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backuppolicy-editor
rules:
  - apiGroups: ["storage.example.com"]
    resources: ["backuppolicies"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# Read-only role for auditors
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backuppolicy-viewer
rules:
  - apiGroups: ["storage.example.com"]
    resources: ["backuppolicies"]
    verbs: ["get", "list", "watch"]

Apply it:

kubectl apply -f backuppolicies-rbac.yaml

Verify the roles exist:

kubectl get clusterrole | grep backuppolicy
backuppolicy-controller   2026-04-25T08:01:00Z
backuppolicy-editor       2026-04-25T08:01:00Z
backuppolicy-viewer       2026-04-25T08:01:00Z

Note on backuppolicies/status: The separate status RBAC rule is only meaningful if you enabled the status subresource (we did). Without it, status and spec share the same update path.
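
The ClusterRoles above grant nothing until they are bound to a subject. The standard pattern is to define a ClusterRole once and grant it per namespace with a RoleBinding. Here is a sketch granting backuppolicy-editor inside a namespace such as demo; the group name app-team is a hypothetical example, so substitute a subject from your own identity provider:

```yaml
# Hypothetical binding: grants backuppolicy-editor inside the demo
# namespace only. The "app-team" group is an example subject.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: backuppolicy-editor-binding
  namespace: demo
subjects:
  - kind: Group
    name: app-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: backuppolicy-editor
  apiGroup: rbac.authorization.k8s.io
```

Referencing a ClusterRole from a RoleBinding keeps the permission definition in one place while scoping the grant to a single namespace.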


Step 3: Create a Namespace and Your First Custom Resource

kubectl create namespace demo

Save this as nightly-backup.yaml:

apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: nightly
  namespace: demo
  labels:
    app.kubernetes.io/managed-by: manual
spec:
  schedule: "0 2 * * *"
  retentionDays: 30
  storageClass: standard
  targets:
    - namespace: production
      includeSecrets: false
    - namespace: staging
      includeSecrets: false
  suspended: false

Apply it:

kubectl apply -f nightly-backup.yaml

Get it back:

kubectl get backuppolicies -n demo
NAME      SCHEDULE    RETENTION   SUSPENDED   READY   AGE
nightly   0 2 * * *   30          false       <none>  5s

The Ready column is <none> because there is no controller writing status yet. The custom resource exists and is stored in etcd, but nothing is acting on it.

Describe it:

kubectl describe bp nightly -n demo
Name:         nightly
Namespace:    demo
Labels:       app.kubernetes.io/managed-by=manual
Annotations:  <none>
API Version:  storage.example.com/v1alpha1
Kind:         BackupPolicy
Metadata:
  Creation Timestamp:  2026-04-25T08:05:00Z
  ...
Spec:
  Retention Days:  30
  Schedule:        0 2 * * *
  Storage Class:   standard
  Suspended:       false
  Targets:
    Include Secrets:  false
    Namespace:        production
    Include Secrets:  false
    Namespace:        staging
Status:
Events:  <none>

Step 4: Test Schema Validation

The API server now validates every BackupPolicy against the schema. Try creating an invalid one:

kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: bad-policy
  namespace: demo
spec:
  schedule: "not-a-cron"
  retentionDays: 500
EOF
The BackupPolicy "bad-policy" is invalid:
  spec.retentionDays: Invalid value: 500:
    spec.retentionDays in body should be less than or equal to 365

Missing required field:

kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: missing-schedule
  namespace: demo
spec:
  retentionDays: 7
EOF
The BackupPolicy "missing-schedule" is invalid:
  spec.schedule: Required value

Wrong type:

kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: wrong-type
  namespace: demo
spec:
  schedule: "0 2 * * *"
  retentionDays: "thirty"
EOF
The BackupPolicy "wrong-type" is invalid:
  spec.retentionDays: Invalid value: "string":
    spec.retentionDays in body must be of type integer: "string"

All validation fires at the API boundary — before etcd, before any controller sees the object.
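
One gap worth noting: the schedule "not-a-cron" was accepted in the first example because the schema only checks that the field is a string; OpenAPI types cannot verify cron semantics (a pattern constraint or EP05's CEL rules can tighten this). A controller would typically validate the expression itself. A minimal Python sketch of such a check, as a hypothetical helper rather than code from any real controller:

```python
# Minimal cron-expression sanity check, roughly what a BackupPolicy
# controller might run before scheduling. Hypothetical helper: the
# CRD schema itself cannot enforce this.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]  # min hour dom mon dow

def is_valid_cron(expr: str) -> bool:
    fields = expr.split()
    if len(fields) != 5:
        return False
    for field, (lo, hi) in zip(fields, FIELD_RANGES):
        for part in field.split(","):
            # strip a step suffix such as */5 or 1-10/2
            part, _, step = part.partition("/")
            if step and not step.isdigit():
                return False
            if part == "*":
                continue
            start, sep, end = part.partition("-")
            bounds = [start, end] if sep else [start]
            if not all(b.isdigit() and lo <= int(b) <= hi for b in bounds):
                return False
    return True

print(is_valid_cron("0 2 * * *"))   # True  -- the nightly schedule
print(is_valid_cron("not-a-cron"))  # False -- passes the schema, fails here
```

This covers lists, ranges, and steps but not named months or @daily shortcuts; real controllers usually delegate to a cron parsing library.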


Step 5: Verify Default Values Apply

The schema gives storageClass a default of "standard" and suspended a default of false. Verify they are applied even when not specified:

kubectl apply -f - <<'EOF'
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: minimal
  namespace: demo
spec:
  schedule: "0 0 * * 0"
  retentionDays: 7
EOF

kubectl get bp minimal -n demo -o jsonpath='{.spec.storageClass}'
standard
kubectl get bp minimal -n demo -o jsonpath='{.spec.suspended}'
false

Defaults are injected by the API server at admission time. They appear in etcd and in every kubectl get -o yaml output — the stored object includes the defaults even if the user did not specify them.
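
For the minimal object above, the stored spec would therefore look roughly like this, combining the two user-supplied fields with the two injected defaults:

```yaml
# Approximate stored form of the "minimal" BackupPolicy after admission
apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: minimal
  namespace: demo
spec:
  schedule: "0 0 * * 0"     # user-supplied
  retentionDays: 7          # user-supplied
  storageClass: standard    # injected default
  suspended: false          # injected default
```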


Step 6: Explore the API Endpoints

Your custom resource is now available at standard REST endpoints:

kubectl proxy --port=8001 &

# List all BackupPolicies in the demo namespace
curl -s http://localhost:8001/apis/storage.example.com/v1alpha1/namespaces/demo/backuppolicies \
  | jq '.items[].metadata.name'
"nightly"
"minimal"

# Get a specific BackupPolicy
curl -s http://localhost:8001/apis/storage.example.com/v1alpha1/namespaces/demo/backuppolicies/nightly \
  | jq '.spec'

This is how controllers discover and watch custom resources — via the same API server endpoints, using informers that wrap these REST calls with efficient list-and-watch semantics.
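
The endpoint layout is mechanical: group, version, namespace, and plural compose the path. A small Python sketch of that composition (cr_path is a hypothetical helper for illustration; real controllers rely on generated clients that build these paths for them):

```python
from typing import Optional

def cr_path(group: str, version: str, plural: str,
            namespace: Optional[str] = None, name: Optional[str] = None) -> str:
    """Compose the REST path the API server serves a custom resource at."""
    path = f"/apis/{group}/{version}"
    if namespace:
        path += f"/namespaces/{namespace}"   # omit for cluster-scoped resources
    path += f"/{plural}"
    if name:
        path += f"/{name}"                   # omit to list the collection
    return path

# Matches the URL queried through kubectl proxy above
print(cr_path("storage.example.com", "v1alpha1", "backuppolicies",
              namespace="demo", name="nightly"))
# /apis/storage.example.com/v1alpha1/namespaces/demo/backuppolicies/nightly
```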


Step 7: Clean Up

kubectl delete namespace demo
kubectl delete -f backuppolicies-rbac.yaml
kubectl delete -f backuppolicies-crd.yaml   # WARNING: deletes all BackupPolicy instances first

⚠ Common Mistakes

metadata.name does not match {plural}.{group}. The most common error. If you name the CRD backuppolicy.storage.example.com (singular) but the spec says plural: backuppolicies, the API server rejects it. The name must always be {plural}.{group}.
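
This mistake is cheap to catch before kubectl ever runs. A hypothetical pre-apply lint in Python, where the dict literals stand in for a parsed manifest:

```python
# Hypothetical lint: verify metadata.name == "{plural}.{group}",
# the rule the API server enforces on every CRD.
def check_crd_name(crd: dict) -> bool:
    expected = f'{crd["spec"]["names"]["plural"]}.{crd["spec"]["group"]}'
    return crd["metadata"]["name"] == expected

good = {"metadata": {"name": "backuppolicies.storage.example.com"},
        "spec": {"group": "storage.example.com",
                 "names": {"plural": "backuppolicies"}}}
bad = {**good, "metadata": {"name": "backuppolicy.storage.example.com"}}

print(check_crd_name(good))  # True
print(check_crd_name(bad))   # False -- singular name, API server would reject
```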

No required fields on spec. Without required constraints, kubectl apply accepts an empty spec: {}. The controller then receives objects with no configuration and has to handle the nil case. Define required fields in the schema.

Forgetting subresources: status: {}. Without this, controllers writing .status also overwrite .spec on full PUT updates. This causes status updates to reset user edits. Enable the status subresource from day one.

Not testing validation errors. Schema validation is the first line of defense. Always explicitly test that your required fields are required, types are enforced, and range constraints work — before deploying the controller.


Quick Reference

# All kubectl operations work on custom resources
kubectl get      backuppolicies -n demo
kubectl get      bp -n demo                  # shortName
kubectl describe bp nightly -n demo
kubectl edit     bp nightly -n demo
kubectl delete   bp nightly -n demo

# Output formats
kubectl get bp -n demo -o yaml
kubectl get bp -n demo -o json
kubectl get bp -n demo -o jsonpath='{.items[*].metadata.name}'

# Watch for changes
kubectl get bp -n demo -w

# List across all namespaces
kubectl get bp -A

# Patch spec
kubectl patch bp nightly -n demo \
  --type=merge -p '{"spec":{"suspended":true}}'

Key Takeaways

  • A working CRD deployment needs: the CRD YAML, RBAC ClusterRoles, and at least one sample custom resource
  • The API server validates all custom resources against the schema at apply time — errors are surfaced immediately, not inside the controller
  • Default values in the schema are injected at admission time and appear in every stored object
  • RBAC for custom resources uses {plural}.{group} as the resource name — status and finalizers are separate sub-resources
  • Without a controller, custom resources are stored in etcd and serve as validated configuration — nothing acts on them until a controller is deployed

What’s Next

EP05: Kubernetes CRD CEL Validation extends schema validation beyond simple type and range checks — cross-field rules (“if storageClass is premium, retentionDays must be at most 90”), regex validation beyond pattern, and immutable field enforcement. All without an admission webhook.

Get EP05 in your inbox when it publishes → subscribe at linuxcent.com

What Is a Kubernetes CRD? How Custom Resources Extend the API

Reading Time: 6 minutes

Kubernetes CRDs & Operators: Extending the API, Episode 1
What Is a CRD? · CRDs You Already Use · CRD Anatomy · Write Your First CRD · CEL Validation · Controller Loop · Build an Operator · CRD Versioning · Admission Webhooks · CRDs in Production


TL;DR

  • A Kubernetes CRD (Custom Resource Definition) is how you add new resource types to the Kubernetes API — the same way Deployment and Service exist natively, you can make BackupPolicy or Certificate exist too
    (CRD = the schema/blueprint; Custom Resource = an instance of that schema, just like a Pod is an instance of the Pod schema)
  • Every kubectl get crds on a real cluster shows dozens of them — cert-manager, KEDA, Prometheus Operator, Crossplane all ship their own CRDs
  • CRDs are served by the same API server as built-in resources — kubectl, RBAC, watches, and events all work identically
  • A CRD alone does nothing — a controller watches the custom resources and acts on them; together they form an Operator
  • CRDs live in etcd just like Pods and Deployments — they survive API server restarts and cluster upgrades
  • You do not need to modify Kubernetes source code or restart the API server to add a CRD

The Big Picture

  HOW KUBERNETES CRDs EXTEND THE API

  ┌──────────────────────────────────────────────────────────────┐
  │  Kubernetes API Server                                       │
  │                                                              │
  │  Built-in resources          Custom resources (via CRD)      │
  │  ─────────────────           ──────────────────────────      │
  │  Pod                         Certificate     (cert-manager)  │
  │  Deployment                  ScaledObject    (KEDA)          │
  │  Service                     ExternalSecret  (ESO)           │
  │  ConfigMap                   BackupPolicy    (your team)     │
  │  ...                         ...                             │
  │                                                              │
  │  All resources: same API, same kubectl, same RBAC, same etcd │
  └──────────────────────────────────────────────────────────────┘
            ▲                          ▲
            │ built in                 │ registered at runtime
            │                          │
         Kubernetes              CustomResourceDefinition
          binary                    (a YAML you apply)

What is a Kubernetes CRD? It is a resource that defines resources — a schema registration that teaches the API server about a new object type you want to use in your cluster.


What Problem CRDs Solve

Kubernetes ships with roughly 50 resource types: Pods, Deployments, Services, ConfigMaps, Secrets, PersistentVolumes, and so on. These cover the general-purpose building blocks for running containerized workloads.

But the moment you operate real infrastructure, you hit the edges. You want to express:

  • “This database should have three replicas with point-in-time recovery enabled” — not a Deployment
  • “This TLS certificate for api.example.com should renew 30 days before expiry” — not a Secret
  • “This queue consumer should scale to zero when the queue is empty” — not a HorizontalPodAutoscaler

Before CRDs (pre-2017), the only options were: use ConfigMaps as a poor substitute (no schema, no validation, no dedicated RBAC), or fork Kubernetes and add the resource natively (impractical for everyone outside the core team).

CRDs, introduced as stable in Kubernetes 1.16, solved this by letting you register a new resource type with the API server at runtime — without touching Kubernetes source code, without restarting the API server, without any special access beyond being able to create cluster-scoped resources.


The Kubernetes API: A Brief Mental Model

Before CRDs make sense, the API model needs to be clear.

  KUBERNETES API STRUCTURE

  apiVersion: apps/v1       ← API group (apps) + version (v1)
  kind: Deployment          ← resource type
  metadata:
    name: web               ← instance name
    namespace: default      ← namespace scope
  spec:
    replicas: 3             ← desired state

Every Kubernetes resource has:
– A group (e.g., apps, batch, networking.k8s.io) — or no group for core resources
– A version (e.g., v1, v1beta1)
– A kind (e.g., Deployment, Pod)
– A scope: namespaced or cluster-wide

The API server is a registry. Each group/version/kind combination maps to a Go struct that knows how to validate, store, and serve that resource type.

A CRD registers a new entry in that registry. You supply the group, version, kind, and schema. The API server handles everything else — serving it via REST, storing it in etcd, exposing it to kubectl.


What a CRD Looks Like

Here is the smallest possible CRD — it creates a new BackupPolicy resource type in the storage.example.com API group:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backuppolicies.storage.example.com
spec:
  group: storage.example.com
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string
                retentionDays:
                  type: integer
  scope: Namespaced
  names:
    plural: backuppolicies
    singular: backuppolicy
    kind: BackupPolicy
    shortNames:
      - bp

Apply it:

kubectl apply -f backuppolicy-crd.yaml

Now create an instance:

apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: nightly
  namespace: default
spec:
  schedule: "0 2 * * *"
  retentionDays: 30

Apply and query it:

kubectl apply -f nightly-backup.yaml
kubectl get backuppolicies
kubectl get bp            # shortName works
kubectl describe bp nightly

The API server validates the spec against the schema, stores it in etcd, and returns it via all the standard API endpoints — all without a single line of custom code.


CRD vs Built-In Resource: What Is Different?

Not much, deliberately.

Capability                        Built-in resource   Custom resource (CRD)
───────────────────────────────────────────────────────────────────────────
kubectl get / describe / delete   Yes                 Yes
RBAC (Roles, ClusterRoles)        Yes                 Yes
Watch (informers, events)         Yes                 Yes
Stored in etcd                    Yes                 Yes
OpenAPI schema validation         Yes                 Yes (you define the schema)
Admission webhooks                Yes                 Yes
Status subresource                Yes                 Optional (you enable it)
Scale subresource                 Yes                 Optional (you enable it)
Built-in controller behavior      Yes                 No — you write the controller

The last row is the critical one. When you create a Deployment, the deployment controller immediately starts managing ReplicaSets. When you create a BackupPolicy, nothing happens — until you write and deploy a controller that watches BackupPolicy objects and acts on them.

That controller + the CRD is what people call an Operator.


A Real Cluster: What You Actually See

Run this on any cluster running cert-manager, Prometheus Operator, or any other tooling:

kubectl get crds

Sample output (abbreviated):

NAME                                                  CREATED AT
certificates.cert-manager.io                          2024-11-01T08:12:00Z
certificaterequests.cert-manager.io                   2024-11-01T08:12:00Z
issuers.cert-manager.io                               2024-11-01T08:12:00Z
clusterissuers.cert-manager.io                        2024-11-01T08:12:00Z
scaledobjects.keda.sh                                 2024-11-01T08:13:00Z
scaledjobs.keda.sh                                    2024-11-01T08:13:00Z
externalsecrets.external-secrets.io                   2024-11-01T08:14:00Z
prometheuses.monitoring.coreos.com                    2024-11-01T08:15:00Z
servicemonitors.monitoring.coreos.com                 2024-11-01T08:15:00Z

Every tool that ships as a CRD-based system registers its resource types here first. The count often surprises engineers: a production cluster with a typical toolchain easily has 40–80 CRDs.

Check how many are on your cluster:

kubectl get crds --no-headers | wc -l

How the API Server Handles a CRD

When you apply a CRD, the API server does three things:

  CRD REGISTRATION FLOW

  kubectl apply -f my-crd.yaml
          │
          ▼
  1. API server validates the CRD manifest
     (is the schema valid OpenAPI v3? are names correct?)
          │
          ▼
  2. CRD stored in etcd
     (under /registry/apiextensions.k8s.io/customresourcedefinitions/)
          │
          ▼
  3. New REST endpoints activated immediately:
     GET  /apis/storage.example.com/v1alpha1/namespaces/{ns}/backuppolicies
     POST /apis/storage.example.com/v1alpha1/namespaces/{ns}/backuppolicies
     ...

From this point, any kubectl get backuppolicies or API call to those endpoints is handled exactly like a built-in resource call — the API server serves it from etcd, applies RBAC, runs admission webhooks, and returns standard JSON.

No restart required. The new endpoints appear within seconds.


The Difference Between CRD and CR

Two terms that are easily confused:

  • CRD (CustomResourceDefinition) — the schema/blueprint. There is one CRD per resource type. certificates.cert-manager.io is a CRD.
  • CR (Custom Resource) — an instance of a CRD. Every Certificate object you create is a custom resource. You can have thousands of CRs per CRD.

  CRD (one)          →  Custom Resource (many)
  ─────────             ─────────────────────
  certificates          web-tls           (namespace: production)
  .cert-manager.io      api-tls           (namespace: production)
                        admin-tls         (namespace: staging)
                        ...

The CRD is applied once (usually by the tool’s Helm chart). Custom resources are created by your users, your CI pipeline, or your GitOps system throughout the life of the cluster.


Where CRDs Fit in the Kubernetes Extension Model

CRDs are one of three ways to extend Kubernetes:

  KUBERNETES EXTENSION MECHANISMS

  1. CRDs + Controllers (Operators)
     Add new resource types + behavior
     → cert-manager, KEDA, Argo CD, Crossplane
     Used for: domain-specific abstractions, infrastructure management

  2. Admission Webhooks
     Intercept API requests to validate or mutate objects
     → OPA/Gatekeeper, Kyverno, Istio injection
     Used for: policy enforcement, sidecar injection, defaulting

  3. API Aggregation (AA)
     Register a fully separate API server behind the main API server
     → metrics-server, custom autoscalers
     Used for: when you need non-CRUD semantics (e.g. exec, attach, streaming)

For 95% of use cases, CRDs + controllers are the right mechanism. API aggregation is complex and only warranted for non-standard API semantics. Admission webhooks are complementary to CRDs, not an alternative.


⚠ Common Mistakes

Confusing the CRD with the controller. The CRD is just a schema registration — it does not execute code. If you apply a CRD but do not deploy its controller, creating custom resources will succeed (the API server accepts them) but nothing will happen. This catches many people the first time they try to use cert-manager by only applying the CRDs without installing the cert-manager controller.

Assuming CRD deletion is safe. Deleting a CRD deletes all custom resources of that type from etcd. There is no “are you sure?” prompt. If you delete the certificates.cert-manager.io CRD, every Certificate object in every namespace is gone.

Treating CRDs as ConfigMap replacements. Some teams store configuration in CRDs purely to get schema validation. This works, but without a controller, the custom resources are inert data. If you only need configuration storage with validation, a CRD is viable — just be explicit that there is no reconciliation loop.


Quick Reference

# List all CRDs in the cluster
kubectl get crds

# Inspect a specific CRD's schema
kubectl get crd certificates.cert-manager.io -o yaml

# List all custom resources of a type
kubectl get certificates -A

# Get details on a specific custom resource
kubectl describe certificate web-tls -n production

# Delete a CRD (WARNING: deletes all instances)
kubectl delete crd backuppolicies.storage.example.com

# Check if a CRD is established (ready to use)
kubectl get crd backuppolicies.storage.example.com \
  -o jsonpath='{.status.conditions[?(@.type=="Established")].status}'
# Returns: True

Key Takeaways

  • A Kubernetes CRD registers a new resource type with the API server — no source code changes, no restart required
  • Custom resources behave identically to built-in resources: kubectl, RBAC, watches, etcd, admission webhooks all work the same way
  • The CRD is just the schema; a controller gives custom resources behavior — together they form an Operator
  • Every production cluster running modern tooling already uses dozens of CRDs
  • Deleting a CRD deletes all its instances — treat CRDs as production-critical objects

What’s Next

EP02: CRDs You Already Use makes this concrete before we go deeper — we walk through cert-manager’s Certificate, KEDA’s ScaledObject, and External Secrets’ ExternalSecret as working examples, so you understand what a well-designed CRD looks like from a user’s perspective before you design your own.

Get EP02 in your inbox when it publishes → subscribe at linuxcent.com