The Kubernetes Controller Reconcile Loop: How CRDs Come Alive at Runtime

Reading Time: 7 minutes

Kubernetes CRDs & Operators: Extending the API, Episode 6
What Is a CRD? · CRDs You Already Use · CRD Anatomy · Write Your First CRD · CEL Validation · Controller Loop · Build an Operator · CRD Versioning · Admission Webhooks · CRDs in Production


TL;DR

  • The Kubernetes controller reconcile loop is the mechanism that makes CRDs do something — it watches custom resources, compares desired state (spec) to actual state, and takes actions to close the gap
    (reconcile = “make actual match desired”; the loop runs repeatedly because the world is not static — things drift, fail, and change)
  • Controllers do not receive events like webhooks — they receive object names from a work queue, then re-read the full object from the API server cache
  • The reconcile function is idempotent: calling it ten times with the same object must produce the same result as calling it once
  • controller-runtime is the Go library that provides the informer cache, work queue, and reconciler interface — kubebuilder scaffolds controllers on top of it
  • Kubernetes uses the same reconcile loop internally — the Deployment controller, ReplicaSet controller, and node lifecycle controller all follow this exact pattern
  • A failed reconcile returns an error or explicit requeue request; the controller retries with exponential backoff, not an infinite tight loop

The Big Picture

  THE KUBERNETES CONTROLLER RECONCILE LOOP

  etcd
   │ change event
   ▼
  Informer cache
  (list+watch against the API server,
   local in-memory replica)
   │ cache update → enqueue object name
   ▼
  Work queue
  (rate-limited, deduplicating)
   │ dequeue: "demo/nightly"
   ▼
  Reconcile(ctx, Request{Name, Namespace})
   │
   ├── 1. Fetch object from cache
   │        if not found → ignore (already deleted)
   │
   ├── 2. Read spec (desired state)
   │
   ├── 3. Read actual state
   │        (check child resources, external systems)
   │
   ├── 4. Compare: actual vs desired
   │
   ├── 5. Act: create/update/delete child resources
   │        OR update external system
   │
   └── 6. Update status with outcome
           └── return Result{}, nil      → done
               return Result{Requeue}, nil → retry after delay
                return Result{}, err     → retry with exponential backoff

The Kubernetes controller reconcile loop is what separates a CRD (validated storage) from an operator (automated behavior). Understanding this loop is the prerequisite for writing controllers that work correctly under failure, partial completion, and concurrent modification.


What “Reconcile” Actually Means

Reconcile means: look at what the user asked for (spec), look at what actually exists, and do whatever is needed to make actual match desired.

The key insight is that this is not event-driven in the traditional sense. A controller does not receive a “diff” — it receives a name. It reads the full current state of the object and acts accordingly.

This matters because:

  1. Multiple events get deduplicated. If a BackupPolicy is updated five times in one second, the work queue delivers one reconcile call, not five.
  2. The reconcile is stateless. The controller should not maintain in-memory state about what it “did last time.” It re-reads everything on each reconcile.
  3. Partial failure is safe. If the reconcile fails halfway through, the next reconcile re-reads actual state and continues from where it left off.

The Informer Cache

Controllers do not call the API server directly for every read. They use an informer — a list-and-watch mechanism that maintains a local in-memory copy of all objects of a given type.

  HOW THE INFORMER CACHE WORKS

  Controller startup:
  ┌─────────────────────────────────────────────────────┐
  │ 1. List all BackupPolicies from API server          │
  │    → populate local cache                           │
  │ 2. Establish a Watch stream                         │
  │    → receive incremental updates                    │
  │ 3. For each update: update cache + enqueue object   │
  └─────────────────────────────────────────────────────┘

  On reconcile:
  ┌─────────────────────────────────────────────────────┐
  │ controller reads from LOCAL cache (not API server)  │
  │ → fast, no network round-trip per reconcile         │
  │ → cache is eventually consistent                    │
  └─────────────────────────────────────────────────────┘

Cache consistency: After writing a change (creating a child Secret, for example), re-reading from the cache may return the old state for a brief period. This is normal and expected. Well-written controllers handle this by returning a requeue rather than assuming the write is immediately visible.


Walking Through a Reconcile for BackupPolicy

Suppose a user creates this BackupPolicy:

apiVersion: storage.example.com/v1alpha1
kind: BackupPolicy
metadata:
  name: nightly
  namespace: demo
spec:
  schedule: "0 2 * * *"
  retentionDays: 30
  targets:
    - namespace: production

The controller’s reconcile function runs. Here is what it does conceptually:

Reconcile(ctx, {Namespace: "demo", Name: "nightly"})

Step 1: Fetch BackupPolicy "demo/nightly" from cache
  → found; spec.schedule = "0 2 * * *", spec.retentionDays = 30

Step 2: Check if a CronJob for this BackupPolicy exists
  → Get CronJob "nightly-backup" in namespace "demo" (from the cache)
  → not found

Step 3: Gap detected: CronJob should exist but doesn't
  → Create CronJob "nightly-backup" in namespace "demo"
    spec.schedule = "0 2 * * *"
    spec.jobTemplate.spec.template.spec.containers[0].args = ["--retention=30"]

Step 4: Set owner reference on CronJob pointing to BackupPolicy
  → CronJob is now garbage-collected if BackupPolicy is deleted

Step 5: Update BackupPolicy status
  → conditions: [{type: Ready, status: True, reason: CronJobCreated}]
  → lastScheduleTime: null (not yet run)

Step 6: Return Result{}, nil   → reconcile complete

Next time the BackupPolicy is modified (e.g., suspended: true):

Reconcile(ctx, {Namespace: "demo", Name: "nightly"})

Step 1: Fetch → spec.suspended = true

Step 2: Fetch CronJob "nightly-backup"
  → found; spec.suspend = false  ← actual state

Step 3: Gap: CronJob.spec.suspend should be true but is false
  → Patch CronJob: set spec.suspend = true

Step 4: Update status
  → conditions: [{type: Ready, status: True, reason: Suspended}]

Step 5: Return Result{}, nil
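In Go with controller-runtime, both walkthroughs above collapse into one function. This is a hedged sketch, not a finished implementation: the BackupPolicy type, its Suspended field, and the buildCronJob helper are assumptions from this series.

```go
import (
	"context"

	batchv1 "k8s.io/api/batch/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

func (r *BackupPolicyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Step 1: fetch the BackupPolicy from the informer cache.
	policy := &storagev1alpha1.BackupPolicy{}
	if err := r.Get(ctx, req.NamespacedName, policy); err != nil {
		// Not found means the object was deleted: nothing to do.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Steps 2-3: read desired state (spec) and actual state (the CronJob).
	existing := &batchv1.CronJob{}
	key := types.NamespacedName{Name: policy.Name + "-backup", Namespace: policy.Namespace}
	err := r.Get(ctx, key, existing)

	switch {
	case apierrors.IsNotFound(err):
		// Steps 4-5: gap detected, create the child CronJob.
		desired := buildCronJob(policy) // hypothetical helper
		if err := controllerutil.SetControllerReference(policy, desired, r.Scheme); err != nil {
			return ctrl.Result{}, err
		}
		if err := r.Create(ctx, desired); err != nil {
			return ctrl.Result{}, err
		}
	case err != nil:
		// Transient read failure: retry with backoff.
		return ctrl.Result{}, err
	case existing.Spec.Suspend == nil || *existing.Spec.Suspend != policy.Spec.Suspended:
		// Gap: suspend flag drifted; patch it back to the desired value.
		existing.Spec.Suspend = &policy.Spec.Suspended
		if err := r.Update(ctx, existing); err != nil {
			return ctrl.Result{}, err
		}
	}

	// Step 6: record the outcome on status.
	meta.SetStatusCondition(&policy.Status.Conditions, metav1.Condition{
		Type: "Ready", Status: metav1.ConditionTrue, Reason: "Reconciled",
	})
	return ctrl.Result{}, r.Status().Update(ctx, policy)
}
```

Notice that the same function handles both "CronJob missing" and "CronJob drifted" — the reconciler never knows which event woke it up.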

Idempotency: The Essential Property

The reconcile function must be idempotent. If it runs ten times with the same object state, the result must be the same as if it ran once.

Why? Because the controller framework delivers at-least-once semantics — your reconcile function will be called more than once for the same object state, especially at startup (the informer re-lists all objects) and after controller restarts.

Non-idempotent (wrong):

// Creates a new CronJob every time, even if one already exists
err := r.Create(ctx, cronJob)

Idempotent (correct):

// Only creates if it doesn't exist; updates if it does
existing := &batchv1.CronJob{}
err := r.Get(ctx, types.NamespacedName{Name: jobName, Namespace: ns}, existing)
if apierrors.IsNotFound(err) {
    err = r.Create(ctx, cronJob)
} else if err == nil {
    // update if spec differs
    existing.Spec = cronJob.Spec
    err = r.Update(ctx, existing)
}

The get-before-create pattern is the most basic idempotency mechanism. controller-runtime provides CreateOrUpdate helpers that codify this.


Requeue and Retry Semantics

The reconcile function returns a (Result, error) pair:

return Result{}, nil
  → Reconcile succeeded. Re-run only if object changes again.

return Result{RequeueAfter: 5 * time.Minute}, nil
  → Reconcile succeeded, but requeue in 5 minutes regardless.
  → Used for: polling external system, TTL-based refresh.

return Result{Requeue: true}, nil
  → Requeue immediately (with rate limiting).
  → Used for: cache not yet consistent after a write.

return Result{}, err
  → Reconcile failed. Retry with exponential backoff.
  → Used for: API errors, transient failures.

  RETRY BEHAVIOR (controller-runtime defaults)

  First failure  → retry after ~5ms
  Second failure → retry after ~10ms
  Third failure  → retry after ~20ms
  ...doubling on each consecutive failure...
  Max backoff    → 1000s (~16.7 min)

  Successful reconcile → backoff counter resets; a new object version from the informer is enqueued immediately, without the backoff delay

Do not return Result{Requeue: true}, nil in a tight loop — this saturates the work queue and starves other objects. If you need to poll, use RequeueAfter with a meaningful interval.


Watches: What Triggers a Reconcile

The controller does not only watch the primary resource (BackupPolicy). It also watches child resources and maps child changes back to the parent:

  WATCH CONFIGURATION (conceptual)

  Controller watches:
    BackupPolicy (primary) → reconcile when BackupPolicy changes
    CronJob (child/owned)  → reconcile BackupPolicy owner when CronJob changes
    ConfigMap (watched)    → reconcile BackupPolicy when referenced ConfigMap changes

If a user accidentally deletes the CronJob that the controller created:

  1. CronJob deletion event arrives in the informer
  2. Controller maps the deleted CronJob → its owner BackupPolicy
  3. BackupPolicy is enqueued
  4. Reconcile runs, detects missing CronJob, recreates it

This “self-healing” behavior — where controllers reconcile the world back to desired state — is the core operational value of operators. It is not magic; it is the result of watching child resources and re-running reconcile when they drift.
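With controller-runtime's builder (v0.15+ signatures), the watch configuration above is expressed declaratively. The BackupPolicy type and the policiesReferencingConfigMap mapping function are assumptions for illustration:

```go
import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/handler"
)

func (r *BackupPolicyReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		// Primary resource: reconcile when a BackupPolicy changes.
		For(&storagev1alpha1.BackupPolicy{}).
		// Owned child: a CronJob event enqueues its owner BackupPolicy.
		Owns(&batchv1.CronJob{}).
		// Arbitrary watch: map ConfigMap events to the BackupPolicies
		// that reference them (hypothetical mapping function).
		Watches(&corev1.ConfigMap{},
			handler.EnqueueRequestsFromMapFunc(r.policiesReferencingConfigMap)).
		Complete(r)
}
```

Owns() is what makes the CronJob-deletion scenario above work: it installs an event handler that follows the owner reference back to the BackupPolicy and enqueues it.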


Level-Triggered vs Edge-Triggered

Kubernetes controllers are level-triggered, not edge-triggered. This distinction matters:

  EDGE-TRIGGERED (not what Kubernetes uses)
  → "BackupPolicy was updated FROM retained-30 TO retained-7"
  → If event is lost, the update is lost forever

  LEVEL-TRIGGERED (what Kubernetes uses)
  → "BackupPolicy exists with retentionDays=7"
  → On every reconcile, the controller reads the current level (state)
  → Missing an event is safe — the next reconcile corrects the state

Level-triggered design is why controllers survive restarts, network partitions, and lost events gracefully. The reconcile does not need to track “what changed” — it only needs to know “what is the desired state right now.”


The Same Pattern in Kubernetes Core

Every built-in Kubernetes controller follows this loop:

  Controller                  Watches      Manages          Reconciles
  ───────────────────────────────────────────────────────────────────────────────
  Deployment controller       Deployment   ReplicaSets      desired replicas ↔ actual ReplicaSet count
  ReplicaSet controller       ReplicaSet   Pods             desired replicas ↔ running Pod count
  Node lifecycle controller   Node         Node conditions  NotReady nodes → taint, evict pods
  Service controller (cloud)  Service      LoadBalancer     cloud LB exists ↔ Service spec

The BackupPolicy controller you will build in EP07 follows exactly the same structure as the Deployment controller.


⚠ Common Mistakes

Reading from the API server directly instead of the cache. Every reconcile reading directly from the API server (not the informer cache) creates N×M load on the API server as the number of objects and reconcile frequency grows. Always read via the controller’s cached client.

Not handling “not found” on object fetch. If a reconcile is triggered but the object has been deleted by the time reconcile runs, the cache returns “not found.” This is normal — the correct response is to return Result{}, nil, not an error.

Tight requeue loop on recoverable error. Returning Result{Requeue: true}, nil or Result{}, err on every call creates an infinite busy-loop. Use RequeueAfter for expected wait conditions, and only return errors for unexpected failures that should back off.

Mutable reconcile state. Do not store reconcile state in struct fields on the reconciler. The reconciler is shared across goroutines; mutable fields cause race conditions. Everything transient must be local to the reconcile function.


Quick Reference

Reconcile input:
  ctx context.Context
  req ctrl.Request   → {Namespace: "demo", Name: "nightly"}

Reconcile output:
  (ctrl.Result, error)

Common returns:
  Result{}, nil                        → done, wait for next change
  Result{Requeue: true}, nil           → retry now (rate limited)
  Result{RequeueAfter: 5*time.Minute}, nil → retry in 5 minutes
  Result{}, err                        → retry with backoff

Key operations:
  r.Get(ctx, req.NamespacedName, &obj)     → fetch from cache
  r.Create(ctx, &obj)                      → create in API server
  r.Update(ctx, &obj)                      → full update
  r.Patch(ctx, &obj, patch)                → partial update
  r.Delete(ctx, &obj)                      → delete
  r.Status().Update(ctx, &obj)             → update status only

Key Takeaways

  • The reconcile loop reads desired state from spec, reads actual state from the cluster, and closes the gap — on every trigger, not just on changes
  • Controllers use an informer cache for reads — fast, eventually consistent, does not hammer the API server
  • Idempotency is not optional: the reconcile function will be called multiple times with the same state
  • Level-triggered design means missing events is safe — the next reconcile corrects any drift
  • Return values from reconcile control retry behavior: RequeueAfter for polling, err for failures, nil for success

What’s Next

EP07: Build a Simple Kubernetes Operator with controller-runtime puts the reconcile loop into practice — kubebuilder scaffold, a complete reconciler for BackupPolicy, RBAC markers, and running the operator locally against a real cluster.

Get EP07 in your inbox when it publishes → subscribe at linuxcent.com
