Optimising NGINX Ingress Controller Startup Performance

NGINX Ingress Controller 5.5 significantly improves startup times!

A few months ago, a community member noticed that NGINX Ingress Controller deployments with a large number of Ingress resources were experiencing longer-than-expected startup times. In clusters with hundreds or thousands of resources spread across many namespaces, the controller could take several minutes to become ready after a restart, delaying traffic routing for that period.

After examining this issue further, we found three key areas where we could improve NGINX Ingress Controller startup performance. This post documents how we identified them and how we optimised the code in each one.

What “Becoming Ready” Actually Means

First, some context. When the NGINX Ingress Controller starts, it does not immediately begin routing traffic. It first builds up a complete picture of the cluster by processing a snapshot of every Ingress, VirtualServer, Policy, and other resource it knows about. Only once that picture is complete does it generate the NGINX configuration, reload NGINX, and signal to Kubernetes that it is ready to serve traffic.

We refer to this initial pass over the resource snapshot as the startup queue drain. It is expected and necessary, but we found that it was also the reason startup performance degraded in larger clusters.
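To make the ordering concrete, here is a minimal sketch in Go of the startup phases described above. It is not the controller's actual code: the controller type, its fields, and the phase names are hypothetical, chosen only to show that readiness comes last.

```go
package main

import "fmt"

// controller is a hypothetical, heavily simplified stand-in for the real controller.
type controller struct {
	queue  []string        // snapshot of resource keys taken at startup
	model  map[string]bool // in-memory picture of the cluster
	phases []string        // order in which startup phases ran, for illustration
	ready  bool
}

func newController(keys []string) *controller {
	return &controller{queue: keys, model: make(map[string]bool)}
}

// start runs the startup phases in the order described in the text.
func (c *controller) start() {
	// Phase 1: drain the queue, building the in-memory model only.
	for len(c.queue) > 0 {
		key := c.queue[0]
		c.queue = c.queue[1:]
		c.model[key] = true
	}
	c.phases = append(c.phases, "drain")

	// Phase 2: generate configuration from the complete model, then reload NGINX.
	c.phases = append(c.phases, "generate-config", "reload-nginx")

	// Phase 3: only now signal readiness to Kubernetes.
	c.ready = true
	c.phases = append(c.phases, "ready")
}

func main() {
	c := newController([]string{"default/cafe", "default/tea", "shop/cart"})
	c.start()
	fmt.Println(c.phases, "resources:", len(c.model), "ready:", c.ready)
}
```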

Optimisation 1: Deferred Host Conflict Resolution

Every time the controller processed an Ingress during startup, it invoked an internal function, rebuildHosts(). This function resolves host conflicts: it works out which Ingress wins when two or more compete for the same hostname.

rebuildHosts() scans every resource the controller knows about to do its job. During the queue drain it was being called once per resource, and each call scanned every resource seen so far, so with N Ingresses the total work scaled as O(N²).

The fix was straightforward once identified: skip rebuildHosts() entirely during the queue drain. During startup, the controller is only building its in-memory model; it is not generating any configuration yet. The final state is identical whether conflicts are resolved as each resource arrives or deferred to the end. So we deferred all conflict resolution to a single rebuildHosts() call once the queue was empty, reducing total work from O(N²) to O(N).
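The difference is easy to see by counting scans. The sketch below is a toy model in Go, not the controller's real rebuildHosts(): the model type, the load helper, and the "earliest arrival wins" rule are assumptions made for illustration. It loads the same resources eagerly and deferred, and checks that both end with the same winners.

```go
package main

import "fmt"

// model is a hypothetical in-memory store of the hosts claimed by Ingresses.
type model struct {
	hosts   []string       // host claimed by each known Ingress, in arrival order
	winners map[string]int // host -> index of the winning Ingress
	scanned int            // entries visited by rebuildHosts, counted for illustration
}

// rebuildHosts rescans every known resource. In this toy model the earliest
// claimant of a host wins (an assumption, not the real rule).
func (m *model) rebuildHosts() {
	m.winners = make(map[string]int)
	for i, h := range m.hosts {
		m.scanned++
		if _, taken := m.winners[h]; !taken {
			m.winners[h] = i
		}
	}
}

// load adds n Ingresses, resolving conflicts after every item (eager) or
// once after the last item (deferred).
func load(n int, deferred bool) *model {
	m := &model{}
	for i := 0; i < n; i++ {
		// Reuse host names so that some Ingresses genuinely conflict.
		m.hosts = append(m.hosts, fmt.Sprintf("app-%d.example.com", i%(n/2+1)))
		if !deferred {
			m.rebuildHosts()
		}
	}
	if deferred {
		m.rebuildHosts()
	}
	return m
}

func main() {
	eager, lazy := load(1000, false), load(1000, true)
	fmt.Println("eager scans:   ", eager.scanned) // 1+2+...+1000 = 500500
	fmt.Println("deferred scans:", lazy.scanned)  // 1000
	fmt.Println("same number of winners:", len(eager.winners) == len(lazy.winners))
}
```

With 1,000 resources the eager path visits 500,500 entries and the deferred path visits 1,000, while the final winners map is the same in both cases.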

At the same time, we found a related inefficiency inside rebuildHosts() itself: finding the minion Ingresses for a given master required scanning all Ingresses on every call. We added a minionsByHost index, a map maintained incrementally as resources arrive, so those lookups became O(1) instead of O(N). Together, these two changes eliminated the algorithmic bottleneck entirely.
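An incrementally maintained index of this kind might look like the following Go sketch. Only the name minionsByHost comes from the change described above; the ingress struct, the index type, and its methods are hypothetical simplifications.

```go
package main

import "fmt"

// ingress is a hypothetical, simplified resource: a key, a host, and a role.
type ingress struct {
	key      string // namespace/name
	host     string
	isMinion bool
}

// index groups minions by host so that a master never has to scan all Ingresses.
type index struct {
	minionsByHost map[string]map[string]ingress // host -> key -> minion
}

func newIndex() *index {
	return &index{minionsByHost: make(map[string]map[string]ingress)}
}

// add records a minion under its host. It is called once as each resource arrives.
func (ix *index) add(ing ingress) {
	if !ing.isMinion {
		return
	}
	if ix.minionsByHost[ing.host] == nil {
		ix.minionsByHost[ing.host] = make(map[string]ingress)
	}
	ix.minionsByHost[ing.host][ing.key] = ing
}

// remove drops a minion and cleans up the host bucket once it is empty.
func (ix *index) remove(ing ingress) {
	bucket := ix.minionsByHost[ing.host]
	delete(bucket, ing.key) // deleting from a nil map is a no-op
	if len(bucket) == 0 {
		delete(ix.minionsByHost, ing.host)
	}
}

// minionsFor is a single map lookup instead of a scan over every Ingress.
func (ix *index) minionsFor(host string) map[string]ingress {
	return ix.minionsByHost[host]
}

func main() {
	ix := newIndex()
	for _, ing := range []ingress{
		{key: "default/cafe-master", host: "cafe.example.com"},
		{key: "default/coffee", host: "cafe.example.com", isMinion: true},
		{key: "default/tea", host: "cafe.example.com", isMinion: true},
		{key: "shop/cart", host: "shop.example.com", isMinion: true},
	} {
		ix.add(ing)
	}
	fmt.Println("minions for cafe.example.com:", len(ix.minionsFor("cafe.example.com"))) // 2
}
```

The trade-off is the usual one for an index: a little extra bookkeeping on every add and remove in exchange for constant-time lookups.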

Optimisation 2: Deferred Status Writes

Kubernetes provides a status field on Ingress and custom resources like VirtualServer. The controller is responsible for writing to this field to report things like which load-balancer IP was assigned, whether the resource is valid, and whether there are any configuration warnings.

Before our changes, the controller wrote each resource’s status immediately as it processed that resource during the queue drain. With a large number of resources, that meant a large number of sequential calls to the Kubernetes API, each one blocking the next. Those calls ran one at a time on the main controller goroutine, the same goroutine responsible for everything else.

Defer, Then Flush in Parallel

We separated when status is written from when the resource is processed.

During the queue drain, instead of writing status to the API, the controller now appends each resource to an in-memory slice. Once the queue is empty and NGINX has been reloaded and marked ready, a background goroutine dispatches all deferred statuses using a pool of 10 concurrent workers.

Status propagation now happens in the background, after the pod is already serving traffic, rather than blocking on the critical path.
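The defer-then-flush pattern can be sketched in Go as a slice of pending keys drained by a fixed pool of workers. This is an illustration under stated assumptions, not the controller's implementation: statusWriter and flushStatuses are hypothetical names, and the write callback stands in for the real Kubernetes API call. Only the pool size of 10 comes from the description above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// statusWriter stands in for the call that writes one resource's status
// to the Kubernetes API (hypothetical signature).
type statusWriter func(key string) error

// flushStatuses writes every deferred status using a fixed pool of workers
// and returns the number of writes that failed.
func flushStatuses(deferred []string, workers int, write statusWriter) int {
	jobs := make(chan string)
	var failures int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range jobs {
				if err := write(key); err != nil {
					atomic.AddInt64(&failures, 1)
				}
			}
		}()
	}

	for _, key := range deferred {
		jobs <- key
	}
	close(jobs) // lets each worker's range loop end
	wg.Wait()
	return int(atomic.LoadInt64(&failures))
}

func main() {
	// During the queue drain: remember the resource instead of calling the API.
	var deferred []string
	for i := 0; i < 500; i++ {
		deferred = append(deferred, fmt.Sprintf("default/ingress-%d", i))
	}

	// After NGINX is reloaded and the pod is ready: flush with 10 workers.
	var written int64
	failed := flushStatuses(deferred, 10, func(key string) error {
		atomic.AddInt64(&written, 1) // a real writer would call the API here
		return nil
	})
	fmt.Println("written:", atomic.LoadInt64(&written), "failed:", failed)
}
```

In the controller the flush runs on a background goroutine, so a caller would typically start it with the go keyword rather than wait for it as main does here.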

The one exception is resources flagged as ConfigurationProblems (host conflicts, orphaned minions, and orphaned VirtualServerRoutes), which are written synchronously. This distinction is explained below.

Why Conflict Resources Are Handled Differently

When host conflict resolution runs at the end of the queue drain, it may find that some Ingresses lost a conflict: two Ingresses claimed the same hostname, and only one is allowed to win. The losing Ingress needs its load-balancer IP cleared, not set.

This matters because the deferred flush only knows how to call the “set IP” API. If conflict resources were deferred into the same flush, the flush would end up setting a load-balancer IP on an Ingress that should have none, which would be a correctness bug.

The solution was to keep ConfigurationProblem resources on the direct path, writing their status synchronously via processProblems(). This does not undo the optimisation: the number of problem resources is bounded by misconfiguration, not by total resource count. In a correctly configured cluster that number is zero. The direct API calls for problem resources are negligible compared to the savings from deferring everything else.
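The routing decision can be pictured with a small Go sketch. The resource and startupStatus types and the record method are hypothetical; in the controller, the synchronous branch corresponds to processProblems() and the other branch to the deferred slice.

```go
package main

import "fmt"

// resource is a hypothetical, simplified view of a processed resource.
type resource struct {
	key     string
	problem bool // host conflict, orphaned minion, or orphaned VirtualServerRoute
}

// startupStatus tracks which status path each resource took during startup.
type startupStatus struct {
	deferred []string // flushed later in the background via the "set IP" call
	direct   []string // written synchronously, where the IP can be cleared
}

// record routes a resource to the direct path if it is a configuration
// problem, and to the deferred path otherwise.
func (s *startupStatus) record(r resource) {
	if r.problem {
		// Stand-in for the synchronous processProblems() path.
		s.direct = append(s.direct, r.key)
		return
	}
	s.deferred = append(s.deferred, r.key)
}

func main() {
	var s startupStatus
	for _, r := range []resource{
		{key: "default/cafe"},
		{key: "default/cafe-duplicate", problem: true}, // lost a host conflict
		{key: "default/tea"},
	} {
		s.record(r)
	}
	fmt.Println("deferred:", s.deferred)
	fmt.Println("direct:  ", s.direct)
}
```

Because the deferred slice never contains a problem resource, the flush can stay simple and only ever set an IP.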

Optimisation 3: Decoupled Readiness from Status Propagation

Even with the parallel flush, the pod was still only marked as ready after all status updates were complete. The pod could not receive traffic until every resource had its status written, which added noticeable delay at scale.

The fix here was conceptual: status metadata is informational. It tells operators which IP was assigned and whether a resource is valid. It does not affect whether traffic actually routes correctly. NGINX was already configured and reloaded with all the right rules before the status flush began.

So we moved the readiness signal (isNginxReady = true) to immediately after the NGINX reload, before the status flush starts. The pod becomes ready, starts receiving traffic, and status propagates asynchronously in the background.
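A minimal Go sketch of that ordering follows. The isNginxReady name comes from the change described above; everything else (the controller type, the /ready path, the probe and finishStartup helpers) is hypothetical, and atomic.Bool assumes Go 1.19 or later. The point is that the readiness probe succeeds while the flush is still pending.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// controller is a hypothetical stand-in holding only the readiness flag.
type controller struct {
	isNginxReady atomic.Bool
}

// readyz is a readiness handler: 503 until NGINX has been reloaded, 200 after.
func (c *controller) readyz(w http.ResponseWriter, _ *http.Request) {
	if !c.isNginxReady.Load() {
		w.WriteHeader(http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probe issues one in-process request against readyz and returns the status code.
// The "/ready" path is illustrative, not the controller's real endpoint.
func probe(c *controller) int {
	rec := httptest.NewRecorder()
	c.readyz(rec, httptest.NewRequest(http.MethodGet, "/ready", nil))
	return rec.Code
}

// finishStartup reloads NGINX, flips readiness, and only then starts the
// status flush in the background. The returned channel closes when the flush ends.
func (c *controller) finishStartup(reload, flush func()) <-chan struct{} {
	reload()
	c.isNginxReady.Store(true) // ready before any deferred status is written

	done := make(chan struct{})
	go func() {
		defer close(done)
		flush()
	}()
	return done
}

func main() {
	c := &controller{}
	fmt.Println("before reload:", probe(c)) // 503

	release := make(chan struct{})
	done := c.finishStartup(func() {}, func() { <-release }) // flush blocks until released
	fmt.Println("during flush: ", probe(c)) // 200

	close(release)
	<-done
}
```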

Leader Election and the Status Flush

In a multi-replica NGINX Ingress Controller deployment with high availability enabled, only one pod holds the leader lease at any given time. Only the leader is permitted to write status; followers skip those writes to avoid conflicts.

Leader election does not complete instantaneously. A pod can become ready and begin the status flush before it knows whether it is the leader.

The flush goroutine handles this by polling the leader election state in a loop. If the pod wins leadership within a bounded window (60 seconds), the flush proceeds. If the window elapses without the pod winning leadership, meaning it is a follower, the flush exits without writing anything. If that follower later takes over as leader, the leader-election callback writes all statuses at that point.
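A bounded polling loop of this shape can be sketched in Go with a context timeout and a ticker. The function names, the isLeader callback, and the polling interval are assumptions for illustration; only the idea of a bounded window (60 seconds in the controller) comes from the text. The demo uses much shorter timings so it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForLeadership polls isLeader until it returns true or the window elapses.
// It reports whether leadership was won in time.
func waitForLeadership(ctx context.Context, isLeader func() bool, window, interval time.Duration) bool {
	ctx, cancel := context.WithTimeout(ctx, window)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		if isLeader() {
			return true
		}
		select {
		case <-ctx.Done():
			return false // window elapsed: treat this pod as a follower
		case <-ticker.C:
			// poll again
		}
	}
}

// simulate runs the wait with short, illustrative timings. The pod "wins" on
// the given poll number; 0 means it never wins (a follower).
func simulate(winsOnPoll int) bool {
	polls := 0
	isLeader := func() bool {
		polls++
		return winsOnPoll > 0 && polls >= winsOnPoll
	}
	return waitForLeadership(context.Background(), isLeader, 100*time.Millisecond, 5*time.Millisecond)
}

func main() {
	if simulate(3) {
		fmt.Println("leader: flushing deferred statuses")
	}
	if !simulate(0) {
		fmt.Println("follower: window elapsed, nothing written")
	}
}
```

Checking isLeader before the first tick means a pod that is already the leader starts flushing immediately instead of waiting one interval.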

This design means the flush goroutine is the sole owner of deferred startup status writes. The only other writes that happen during startup are the synchronous processProblems() calls, which complete before the flush goroutine is launched. This avoids two independent sets of parallel workers hitting the Kubernetes API server simultaneously.

Results

Across all tested configurations, startup time dropped from several minutes to under 30 seconds. Both small and large deployments benefit: all three inefficiencies were always present and simply became more pronounced at scale.

What Did Not Change

Post-startup behaviour is identical to before these changes. Once the controller is running and ready:

Every resource update triggers an immediate status write, exactly as before.
Host conflict resolution runs on every change, exactly as before.
All other paths (Secret sync, ConfigMap sync, leader election callbacks) are untouched.

The changes are entirely confined to the startup path, gated behind two flags (startupComplete in the Configuration layer and isNginxReady in the controller) that flip to their permanent values as soon as the pod becomes ready.

Takeaway

The longer startup time was caused by three individually reasonable design decisions that compounded at scale:

Conflict resolution was correct to run on every change, but did not need to run on every individual item during a bulk initial load.
Status writes were correct to be synchronous in steady state, but did not need to block readiness or run serially during startup.
Readiness was correctly gated on configuration completeness, but “configuration complete” does not mean “status metadata propagated”.

None of these were bugs in the traditional sense. The original design was correct and sufficient until the deployment grew large enough to expose its quadratic and serial costs. The fix was to recognise that startup is a distinct phase with different constraints, and treat it accordingly.

We would also love to thank our community member hugolevino for contributing ideas and extensive testing that helped shape this work!