Building Custom Kubernetes Controllers and Operators
Extend Kubernetes with custom controllers and operators to automate management of complex stateful applications beyond built-in workload types.
Kubernetes ships with built-in controllers for Deployments, StatefulSets, DaemonSets, and Jobs. These cover many use cases, but sometimes you need custom behavior. Maybe you want to manage a database that requires coordinated initialization, or an external service that needs lifecycle management. Custom controllers and operators let you extend Kubernetes to handle these scenarios.
This post explains the controller pattern, Custom Resource Definitions (CRDs), and how to build operators using controller-runtime.
If you are new to Kubernetes, start with the Kubernetes fundamentals post. For core workload types, see the Kubernetes Workload Resources post.
When to Use / When Not to Use
Custom controllers solve real problems, but they add operational overhead. Here is when they make sense and when they do not.
When to reach for custom controllers
Reach for a custom controller when you need to manage something Kubernetes cannot natively track. External databases that require coordinated initialization, certificate authorities that need renewal workflows, or multi-tenant platforms where each tenant needs isolated resources all fit this category. If you find yourself running kubectl exec into pods to manually handle state that should be automated, a controller can encode that operational knowledge.
Platform teams building internal developer platforms often need this. Rather than expecting every developer to know how to provision a database correctly, you give them a Database CRD and let the controller handle the details.
When to use operators specifically
An operator is a controller with domain knowledge baked in. Use one when your application has operational procedures that should be automated but currently live in runbooks or wikis. Database failover logic, schema migration pipelines, backup orchestration with retention policies: these are all good candidates. The key question is whether your app has stateful operational knowledge worth encoding.
When to skip them
Do not reach for controllers just because they are interesting. A Deployment with proper configuration handles most stateless workloads fine. If you need something done once, a Job or CronJob is simpler and has fewer moving parts. And if your “automation” is really just running a Helm template with pre-determined values, a controller adds unnecessary complexity.
The operational burden is real. Controllers run somewhere, need monitoring, can have bugs, and require updates when Kubernetes APIs change. Make sure the benefit justifies this before adding a custom controller to your cluster.
Reconciliation Loop Flow
flowchart TD
A[Watch API Server<br/>for custom resource changes] --> B[Get current state<br/>of managed resources]
B --> C{Desired state<br/>== Current state?}
C -->|Yes| A
C -->|No| D[Reconcile:<br/>Create/Update/Delete]
D --> E[Update resource status<br/>and conditions]
E --> A
Controllers follow a declarative reconciliation loop: watch for changes, compare desired vs actual state, act to bring actual toward desired, update status, repeat.
Kubernetes Controller Pattern
A controller is a loop that watches the desired state and reconciles the actual state toward it. Kubernetes ships with many built-in controllers. The Deployment controller watches Deployments and creates ReplicaSets. The ReplicaSet controller creates Pods. The scheduler places Pods onto nodes. The kubelet on each node ensures containers are running.
The reconciliation loop follows this pattern:
watch(current_state) -> compare(desired, current) -> act(bring current to desired)
Controllers use the Kubernetes API to watch resources and create or modify other resources. If a Deployment requests 3 replicas, the controller ensures 3 pods exist. If a pod dies, the controller notices the difference and creates a replacement.
Custom Resources and CRDs
A Custom Resource Definition (CRD) extends the Kubernetes API with new resource types. Once you define a CRD, you can create instances of your custom resource just like built-in resources.
Defining a CRD
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  names:
    kind: Database
    plural: databases
    shortNames:
      - db
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      subresources:
        status: {} # enables /status, used by the controller's Status().Update calls
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum: ["postgres", "mysql", "mongodb"]
                version:
                  type: string
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 5
              required:
                - engine
                - version
                - replicas
            status:
              type: object
              properties:
                phase:
                  type: string
                endpoint:
                  type: string
After applying this CRD, you can create Database resources:
apiVersion: example.com/v1
kind: Database
metadata:
  name: my-postgres
spec:
  engine: postgres
  version: "15"
  replicas: 3
Without a controller, these resources sit idle. You need a controller to watch Database resources and do something with them.
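Even before the controller exists, the CRD makes the new type a first-class citizen of the CLI. Assuming the manifests above are saved as `database-crd.yaml` and `my-postgres.yaml` (filenames are illustrative):

```shell
# Register the CRD, then create an instance of the new type
kubectl apply -f database-crd.yaml
kubectl apply -f my-postgres.yaml

# The short name from the CRD works too
kubectl get db
kubectl describe database my-postgres

# Status fields appear once a controller updates them
kubectl get database my-postgres -o jsonpath='{.status.phase}'
```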
Controller-Runtime Library
controller-runtime is the standard library for building Kubernetes controllers. It handles API client setup, informer caching, and reconciliation loops.
Project setup
mkdir my-operator && cd my-operator
go mod init my-operator
go get sigs.k8s.io/controller-runtime@v0.17.0
Main entry point
package main

import (
	"log"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/apimachinery/pkg/runtime"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"

	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/client/config"
	"sigs.k8s.io/controller-runtime/pkg/manager"
	"sigs.k8s.io/controller-runtime/pkg/manager/signals"

	examplev1 "my-operator/api/v1" // generated Database types (path is project-specific)
)

func main() {
	// Register built-in and custom types so the manager's client understands them
	scheme := runtime.NewScheme()
	_ = clientgoscheme.AddToScheme(scheme)
	_ = examplev1.AddToScheme(scheme)

	// Load kubeconfig, falling back to in-cluster config when running as a pod
	mgr, err := manager.New(config.GetConfigOrDie(), manager.Options{Scheme: scheme})
	if err != nil {
		log.Fatal(err)
	}

	// Watch Database resources (and the StatefulSets they own) and route
	// events to the reconciler
	err = builder.ControllerManagedBy(mgr).
		Named("database-controller").
		For(&examplev1.Database{}).
		Owns(&appsv1.StatefulSet{}).
		Complete(&ReconcileDatabase{client: mgr.GetClient(), scheme: mgr.GetScheme()})
	if err != nil {
		log.Fatal(err)
	}

	log.Fatal(mgr.Start(signals.SetupSignalHandler()))
}
Reconciliation Loops and Idempotency
The reconciler compares desired state with actual state and takes action. Reconcilers must be idempotent: applying the same reconciliation multiple times produces the same result.
Reconcile implementation
type ReconcileDatabase struct {
	client client.Client
	scheme *runtime.Scheme
}

func (r *ReconcileDatabase) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	log.Printf("Reconciling Database %s/%s", req.Namespace, req.Name)

	// Fetch the Database instance
	db := &examplev1.Database{}
	err := r.client.Get(ctx, req.NamespacedName, db)
	if err != nil {
		// Deleted between enqueue and reconcile: nothing to do
		return reconcile.Result{}, client.IgnoreNotFound(err)
	}

	// Create or update backend resources based on spec
	if db.Spec.Replicas > 0 {
		result, err := r.ensureStatefulSet(ctx, db)
		if err != nil {
			return result, err
		}
	}

	// Update status
	db.Status.Phase = "Running"
	db.Status.Endpoint = fmt.Sprintf("%s.%s.svc.cluster.local:5432", db.Name, db.Namespace)
	err = r.client.Status().Update(ctx, db)
	if err != nil {
		return reconcile.Result{}, err
	}

	return reconcile.Result{}, nil
}
The reconciler fetches the Database resource, ensures a StatefulSet exists with the right spec, and updates the status. If the spec changes, the next reconciliation pass updates the StatefulSet.
Idempotency in practice
If the StatefulSet already exists with the correct spec, the reconcile loop does nothing. If it needs updating, the controller updates it. If it does not exist, the controller creates it. Running reconciliation 100 times produces the same result as running it once.
Operator Pattern for Stateful Apps
An operator extends the controller pattern with domain knowledge. It encodes operational procedures for managing a specific application. The operator pattern combines CRDs with custom controllers to handle application lifecycle events like backups, failover, and upgrades.
Database operator example
func (r *ReconcileDatabase) ensureStatefulSet(ctx context.Context, db *examplev1.Database) (reconcile.Result, error) {
	ss := &appsv1.StatefulSet{}
	err := r.client.Get(ctx, types.NamespacedName{
		Name:      db.Name,
		Namespace: db.Namespace,
	}, ss)
	if apierrors.IsNotFound(err) {
		// Create new StatefulSet
		ss = r.buildStatefulSet(db)
		err = r.client.Create(ctx, ss)
		return reconcile.Result{}, err
	}
	if err != nil {
		return reconcile.Result{}, err
	}

	// Update if the replica count drifted from the spec. Compare values,
	// not pointers: reflect.DeepEqual on mismatched pointer types would
	// report a diff on every pass and trigger endless updates.
	desired := int32(db.Spec.Replicas)
	if ss.Spec.Replicas == nil || *ss.Spec.Replicas != desired {
		ss.Spec.Replicas = &desired
		err = r.client.Update(ctx, ss)
		return reconcile.Result{}, err
	}

	return reconcile.Result{}, nil
}
func (r *ReconcileDatabase) buildStatefulSet(db *examplev1.Database) *appsv1.StatefulSet {
	replicas := int32(db.Spec.Replicas)
	return &appsv1.StatefulSet{
		ObjectMeta: metav1.ObjectMeta{
			Name:      db.Name,
			Namespace: db.Namespace,
			// Owning the StatefulSet means Kubernetes garbage-collects it
			// when the Database is deleted
			OwnerReferences: []metav1.OwnerReference{
				*metav1.NewControllerRef(db, examplev1.GroupVersion.WithKind("Database")),
			},
		},
		Spec: appsv1.StatefulSetSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": db.Name},
			},
			ServiceName: db.Name,
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: map[string]string{"app": db.Name},
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{
							Name:  "database",
							Image: fmt.Sprintf("%s:%s", db.Spec.Engine, db.Spec.Version),
							Ports: []corev1.ContainerPort{
								{ContainerPort: 5432},
							},
						},
					},
				},
			},
		},
	}
}
The operator uses OwnerReferences to link the StatefulSet to the Database resource. When the Database is deleted, Kubernetes garbage collects the StatefulSet automatically.
Client-Go Basics
client-go is the Go client library for Kubernetes. controller-runtime uses client-go under the hood, but you may need client-go directly for more control or for operators not using controller-runtime.
Direct client usage
import (
	"context"
	"fmt"
	"log"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	kubeconfig := os.Getenv("KUBECONFIG")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatal(err)
	}

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	// List pods
	pods, err := clientset.CoreV1().Pods("default").List(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, pod := range pods.Items {
		fmt.Printf("Pod: %s\n", pod.Name)
	}
}
client-go provides typed clients for all built-in Kubernetes resource types. clientset.CoreV1().Pods(namespace) returns an interface for pod operations in that namespace.
Choosing Your Approach: Trade-off Comparison
| Approach | Control | Complexity | Best For |
|---|---|---|---|
| controller-runtime | Medium | Low-Medium | Most custom controllers, standard reconciliation |
| client-go directly | High | High | Very specific needs, learning K8s internals |
| Operator SDK | Medium | Medium | Production operators with OLM packaging |
| Kubebuilder | Medium | Medium | controller-runtime projects with best practices |
controller-runtime is the default choice for most controllers. It handles caching, watching, and retry logic that you would otherwise have to write yourself. client-go directly gives you more control but more boilerplate. Operator SDK adds scaffolding on top of controller-runtime for production operators.
Production Failure Scenarios
Reconciliation Loop Falling Behind
Under heavy cluster load, the controller can fall behind its watch on the API server. This means spec changes take longer to propagate to actual resources.
The tell is resources converging slowly after a change, sometimes several minutes. You might also see the controller pod consuming more CPU than usual.
Fix this by watching your controller's work-queue depth (controller-runtime exports it as the workqueue_depth metric). If it keeps growing, raise the number of concurrent reconcile workers or add field indexes so lookups do not scan the whole cache.
API Server Timeout During Reconciliation
Network blips or API server overload cause reconciliation operations to time out. The error message is context deadline exceeded.
This can leave resources partially updated. A pod might exist but not have its final labels set.
Use exponential backoff on retry. Do not just requeue immediately with the same delay.
Leader Election Failures
In HA setups with multiple controller replicas, leader election failures cause two controllers to think they are both in charge. Both then try to manage the same resources.
The symptom is duplicate resources or conflicting updates appearing in quick succession.
controller-runtime has built-in leader election. Make sure your lease duration is long enough for your typical restart time.
Anti-Patterns
Ignoring Deletion
If your reconcile only handles creates and updates, deleted custom resources leave their child resources behind. The controller never cleans up.
Add finalizers to your resources. In the reconcile delete handling, remove the finalizer last after cleaning up dependents.
Skipping Status Updates
Your users have no idea what the controller actually did. Did it create the StatefulSet? Is it still trying? Did it fail?
Update .status on every reconciliation pass. Use conditions to communicate transient states.
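As a sketch of what that might look like on the Database resource (field values here are illustrative), using the standard condition shape of type, status, reason, and message:

```yaml
status:
  phase: Provisioning
  conditions:
    - type: Ready
      status: "False"
      reason: StatefulSetNotReady
      message: "2 of 3 replicas available"
      lastTransitionTime: "2024-01-15T10:00:00Z"
    - type: Progressing
      status: "True"
      reason: ScalingUp
      message: "Waiting for replica 3 to join"
      lastTransitionTime: "2024-01-15T10:00:00Z"
```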
Missing Exponential Backoff
When the API server is overloaded and your controller keeps retrying immediately, you make the problem worse. Every failing controller adds load.
Set up exponential backoff with a reasonable initial interval (seconds, not milliseconds). Let the API server recover.
Security Checklist
RBAC for Controllers
Controllers need permissions to manage the resources they watch. Apply least privilege:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: database-controller-role
rules:
  - apiGroups: ["example.com"]
    resources: ["databases", "databases/status"] # status subresource needed for Status().Update
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
Never give a controller cluster-admin. If your controller only manages resources in one namespace, use a Role and RoleBinding instead of ClusterRole and ClusterRoleBinding.
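For a controller scoped to a single namespace, the equivalent namespaced Role might look like this (namespace name is illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: database-controller-role
  namespace: databases
rules:
  - apiGroups: ["example.com"]
    resources: ["databases", "databases/status"]
    verbs: ["get", "list", "watch", "update", "patch"]
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```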
Service Account for the Controller
Run your controller with a dedicated ServiceAccount:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: database-controller-sa
  namespace: controllers
---
# ClusterRoleBinding, because this controller watches Databases in all namespaces
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: database-controller-rolebinding
subjects:
  - kind: ServiceAccount
    name: database-controller-sa
    namespace: controllers
roleRef:
  kind: ClusterRole
  name: database-controller-role
  apiGroup: rbac.authorization.k8s.io
Kubernetes mounts the ServiceAccount token into the controller pod automatically, and in-cluster client configuration (rest.InClusterConfig, or controller-runtime's config.GetConfig) picks it up without any extra kubeconfig setup.
General Security Practices
- Run controllers in a dedicated namespace, isolated from application workloads
- Do not let controllers manage their own deployment (circular dependency risk)
- Use readOnlyRootFilesystem: true in the controller pod security context where possible
- Enable RBAC audit logging to track controller permission usage
- Do not embed credentials in controller code — use ServiceAccounts or external secrets solutions
Testing Strategies
Unit Tests
Test reconciliation logic in isolation without a real API server:
func TestReconcileDatabaseCreatesStatefulSet(t *testing.T) {
	// Register both built-in and custom types with the scheme
	s := runtime.NewScheme()
	_ = clientgoscheme.AddToScheme(s)
	_ = examplev1.AddToScheme(s)

	// Setup fake client pre-loaded with a Database CR
	cl := fake.NewClientBuilder().
		WithScheme(s).
		WithStatusSubresource(&examplev1.Database{}).
		WithObjects(&examplev1.Database{
			ObjectMeta: metav1.ObjectMeta{
				Name:      "test-db",
				Namespace: "default",
			},
			Spec: examplev1.DatabaseSpec{
				Engine:   "postgres",
				Replicas: 3,
			},
		}).Build()

	r := &ReconcileDatabase{client: cl}
	req := reconcile.Request{
		NamespacedName: types.NamespacedName{
			Name:      "test-db",
			Namespace: "default",
		},
	}

	if _, err := r.Reconcile(context.Background(), req); err != nil {
		t.Fatalf("reconcile failed: %v", err)
	}

	// Verify the StatefulSet was created
	ss := &appsv1.StatefulSet{}
	if err := cl.Get(context.Background(), req.NamespacedName, ss); err != nil {
		t.Errorf("StatefulSet not created: %v", err)
	}
}
Use sigs.k8s.io/controller-runtime/pkg/reconcile for the reconcile interface and sigs.k8s.io/controller-runtime/pkg/client/fake for the fake client.
Integration Tests
Test against a real API server using envtest, Kubebuilder’s test environment:
var testEnv *envtest.Environment

func TestMain(m *testing.M) {
	testEnv = &envtest.Environment{
		// Point envtest at your CRD manifests so the API server knows the Database type
		CRDDirectoryPaths: []string{"config/crd"},
	}
	if _, err := testEnv.Start(); err != nil {
		log.Fatal(err)
	}

	// Do not defer the Stop before os.Exit: deferred calls never run after os.Exit
	code := m.Run()
	_ = testEnv.Stop()
	os.Exit(code)
}
func TestReconcileWithAPIServer(t *testing.T) {
	s := runtime.NewScheme()
	_ = clientgoscheme.AddToScheme(s)
	_ = examplev1.AddToScheme(s)

	cl, err := client.New(testEnv.Config, client.Options{Scheme: s})
	if err != nil {
		t.Fatal(err)
	}

	db := &examplev1.Database{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "integration-test",
			Namespace: "default",
		},
		Spec: examplev1.DatabaseSpec{
			Engine:   "postgres",
			Replicas: 1,
		},
	}
	if err := cl.Create(context.Background(), db); err != nil {
		t.Fatal(err)
	}

	r := &ReconcileDatabase{client: cl}
	_, err = r.Reconcile(context.Background(), reconcile.Request{
		NamespacedName: types.NamespacedName{
			Name:      "integration-test",
			Namespace: "default",
		},
	})
	if err != nil {
		t.Errorf("reconcile failed: %v", err)
	}
}
envtest starts a real kube-apiserver and etcd locally; no built-in controllers or kubelets run, so only your reconciler acts on the cluster. This catches bugs that unit tests with fake clients miss.
Advanced Scenarios
Leader Election
For HA deployments with multiple controller replicas, use leader election to ensure only one replica acts at a time:
func main() {
	mgr, err := manager.New(config.GetConfigOrDie(), manager.Options{
		LeaderElection:          true,
		LeaderElectionID:        "database-controller-leader",
		LeaderElectionNamespace: "controllers",
	})
	if err != nil {
		log.Fatal(err)
	}

	// With multiple replicas running, only the elected leader
	// processes reconciliation events
	log.Fatal(mgr.Start(signals.SetupSignalHandler()))
}
Set LeaseDuration, RenewDeadline, and RetryPeriod based on your expected restart time. If your controller typically restarts in 30 seconds, set LeaseDuration to 60 seconds and RenewDeadline to 15 seconds.
Finalizers
Finalizers block deletion of a resource until the controller has cleaned up dependent resources:
// Finalizer names must be qualified: domain/name
const finalizerName = "example.com/database-cleanup"

// Add the finalizer on first reconcile
if !controllerutil.ContainsFinalizer(db, finalizerName) {
	controllerutil.AddFinalizer(db, finalizerName)
	if err := r.client.Update(ctx, db); err != nil {
		return reconcile.Result{}, err
	}
	return reconcile.Result{}, nil
}

// Handle deletion
if !db.ObjectMeta.DeletionTimestamp.IsZero() {
	// Clean up external resources
	if err := r.cleanupExternalResources(ctx, db); err != nil {
		return reconcile.Result{}, err
	}
	// Remove the finalizer last so deletion can complete
	controllerutil.RemoveFinalizer(db, finalizerName)
	if err := r.client.Update(ctx, db); err != nil {
		return reconcile.Result{}, err
	}
	return reconcile.Result{}, nil
}
Without finalizers, deleting a Database custom resource would leave its StatefulSet and PVCs behind.
Owner References and Garbage Collection
Owner references tell Kubernetes to clean up child resources when the owner is deleted:
// ptr is k8s.io/utils/ptr
ownerRef := metav1.OwnerReference{
	APIVersion:         "example.com/v1",
	Kind:               "Database",
	Name:               db.Name,
	UID:                db.UID,
	Controller:         ptr.To(true),
	BlockOwnerDeletion: ptr.To(true),
}

ss := &appsv1.StatefulSet{
	ObjectMeta: metav1.ObjectMeta{
		Name:            db.Name,
		Namespace:       db.Namespace,
		OwnerReferences: []metav1.OwnerReference{ownerRef},
	},
	// ... spec ...
}
Set BlockOwnerDeletion: true if callers use foreground deletion: the owner then stays in a terminating state until this dependent is removed, rather than disappearing while children still exist.
Quick Recap Checklist
- Defined CRDs for your domain resources
- Reconciliation loop follows watch -> compare -> act
- Finalizers handle deletion of dependent resources
- Reconciliation is idempotent (calling it N times = once)
- Owner references link child resources for garbage collection
- Leader election set up for HA deployments
- Exponential backoff on reconciliation failures
- Status conditions reflect actual state
- Tested with simulated API server timeouts
- Controller queue depth monitored
Conclusion
Custom controllers extend Kubernetes beyond built-in workload types. CRDs define new resource types. Controllers watch those resources and reconcile actual state toward desired state. Operators encode domain knowledge to handle application-specific lifecycle management.
controller-runtime simplifies controller development by handling caching, API watching, and reconciliation loops. client-go provides the underlying client functionality for direct Kubernetes API access.
Building operators requires understanding of the controller pattern, Go, and Kubernetes internals. For teams running complex stateful applications on Kubernetes, custom operators can automate operational tasks that would otherwise require manual intervention.
For more advanced Kubernetes topics, see the Advanced Kubernetes post.