Control Plane Design

The control plane is the brain of Nantian Gateway. It watches Kubernetes resources, translates them into configuration the data plane can understand, manages the lifecycle of data plane connections, and writes status back to Gateway API resources. It runs as a Go binary built on controller-runtime and grpc-go.

The control plane boots in a carefully ordered sequence to avoid serving partial configuration. The entry point is gateway/cmd/manager/main.go, which calls run() in app.go:

  1. Load configuration from the YAML file specified by --config
  2. Build the Kubernetes scheme, registering all Gateway API types, custom CRDs, and infrastructure types
  3. Create the controller-runtime manager with leader election, health probes, and a metrics server
  4. Initialize core services: metrics registry, IR snapshot store, and node status registry backed by Kubernetes Leases
  5. Create the translator with resource limits that protect against runaway translation
  6. Create the status writer with the controller name and advertised addresses
  7. Set up the reconciler runner with scoped reconcilers for infrastructure and status
  8. Create the snapshot syncer that watches Kubernetes resources, feeds them to the translator, and publishes snapshots
  9. Create the admin API server with optional TLS and bearer token authentication
  10. Create the gRPC xDS server with TLS/mTLS and runtime configuration
  11. Assemble all components under the lifecycle supervisor
  12. Run the supervisor: all components start in parallel, and the startup gate blocks readiness until everything is healthy

    Load Config --> Build Scheme --> Create Manager --> Init Services
                                                              |
                                                              v
                                              Translator + Status + Reconciler
                                                              |
                                                              v
                                                    Syncer + Admin + gRPC
                                                              |
                                                              v
                                           Supervisor (start all, gate readiness)

The translator lives at gateway/internal/translator/translator.go. Its primary method is Build(ctx, client), which reads all relevant Kubernetes resources through the controller-runtime client and produces a complete ir.Snapshot.

The translator fetches these Kubernetes resource types in a single build cycle:

  • Gateway API resources: Gateway, ListenerSet, HTTPRoute, GRPCRoute, TCPRoute, UDPRoute, TLSRoute, ReferenceGrant, BackendTLSPolicy
  • Nantian custom resources: AIService, TokenPolicy, WasmPlugin, BackendLBPolicy
  • Kubernetes primitives: Service, Endpoints, EndpointSlice, Secret, ConfigMap, Namespace, Node
  • Multi-cluster: ServiceImport (from the MCS API)

The translator follows a pipeline that processes resources in dependency order:

  1. Gateways and listeners are read first. They define the ports, protocols, and TLS configuration that the data plane listens on
  2. Routes are matched to gateways. Each route type (HTTPRoute, GRPCRoute, TCPRoute, UDPRoute, TLSRoute) is processed by its own translation function
  3. Backend references are resolved. The translator follows backendRefs from routes to Services, EndpointSlices, and custom backends
  4. Policies are applied. BackendTLSPolicy, BackendLBPolicy, TokenPolicy, and WasmPlugin are attached to the relevant route or listener
  5. AI service configuration is translated. AIService resources are resolved into backend clusters with provider-specific configuration
  6. Limits are enforced. The translator checks MaxInputObjects, MaxSnapshotObjects, and MaxSnapshotEndpoints and aborts if any limit is exceeded

The translator accepts configurable limits to prevent resource exhaustion:

Limit                 Default        Description
MaxInputObjects       0 (unlimited)  Maximum total Kubernetes objects read during a build
MaxSnapshotObjects    0 (unlimited)  Maximum objects in the output IR snapshot
MaxSnapshotEndpoints  0 (unlimited)  Maximum endpoint entries in the output snapshot

When a limit is exceeded, the translator returns an error and the build is considered failed. The snapshot store retains the last successful snapshot, so data planes continue serving traffic with the previous configuration.

The translator registers several field indexes with the controller-runtime manager to speed up resource lookups. These are defined in gateway/internal/translator/indexes.go and registered via SetupIndexes() during startup. Without these indexes, the translator would need to perform expensive full-list scans for every build.

The snapshot store (gateway/internal/ir/store.go) is the central distribution point for translated configuration. It holds the current ir.Snapshot and manages a subscriber list. Each subscriber represents a data plane that needs to receive configuration updates.

When the translator publishes a new snapshot, the store replaces the current snapshot and fans it out to all subscribers. If a subscriber is still processing the previous snapshot, the store coalesces: it replaces that subscriber's pending snapshot with the newest one instead of queueing both. As a result, a slow data plane cannot block the translator.

The store exposes hooks for metrics collection. The OnSubscriberQueueReplace hook increments the nantian_gateway_controlplane_xds_snapshot_fanout_coalesced_total counter when a pending snapshot is replaced.
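One common way to get this coalescing behavior in Go is a single-slot mailbox per subscriber. The sketch below assumes that approach; the real store may be implemented differently. `Store`, `Subscribe`, `Publish`, and the `OnReplace` field are illustrative names (the latter stands in for the OnSubscriberQueueReplace hook), and `Snapshot` is a stand-in for ir.Snapshot.

```go
package main

import (
	"fmt"
	"sync"
)

// Snapshot stands in for ir.Snapshot; only a version is modeled here.
type Snapshot struct{ Version int }

// Store holds the current snapshot and one single-slot mailbox per subscriber.
type Store struct {
	mu        sync.Mutex
	current   *Snapshot
	subs      map[string]chan *Snapshot
	OnReplace func(nodeID string) // hypothetical hook: a pending snapshot was replaced
}

func NewStore() *Store {
	return &Store{subs: make(map[string]chan *Snapshot)}
}

// Subscribe registers a node and returns its mailbox (capacity 1).
// If a snapshot already exists, it is delivered immediately.
func (s *Store) Subscribe(nodeID string) <-chan *Snapshot {
	s.mu.Lock()
	defer s.mu.Unlock()
	ch := make(chan *Snapshot, 1)
	if s.current != nil {
		ch <- s.current
	}
	s.subs[nodeID] = ch
	return ch
}

// Publish replaces the current snapshot and fans it out without blocking.
func (s *Store) Publish(snap *Snapshot) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.current = snap
	for id, ch := range s.subs {
		select {
		case ch <- snap: // mailbox was empty
		default:
			// Mailbox full: discard the stale pending snapshot if it is still there.
			select {
			case <-ch:
				if s.OnReplace != nil {
					s.OnReplace(id)
				}
			default: // the subscriber drained it in the meantime
			}
			// The mailbox is now empty and only Publish sends, so this cannot block.
			ch <- snap
		}
	}
}

func main() {
	store := NewStore()
	coalesced := 0
	store.OnReplace = func(string) { coalesced++ }
	ch := store.Subscribe("node-a")
	for v := 1; v <= 3; v++ {
		store.Publish(&Snapshot{Version: v}) // the subscriber never reads in between
	}
	fmt.Println("received version:", (<-ch).Version, "coalesced:", coalesced)
}
```

Because each mailbox holds at most one snapshot, memory per subscriber stays constant no matter how far a data plane falls behind, and the subscriber always sees the most recent configuration when it next reads.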

The gRPC server (gateway/internal/grpcserver/server.go) implements the ConfigurationDiscoveryService gRPC service defined in proto/gateway/control/v1/. Data planes connect via a bidirectional streaming RPC and exchange DiscoveryRequest and DiscoveryResponse messages.

  1. A data plane opens a StreamConfiguration RPC and sends a DiscoveryRequest identifying itself by node ID
  2. The server validates the request, records the stream in the active stream registry, and subscribes the node to the snapshot store
  3. The server sends the current snapshot to the data plane
  4. The data plane responds with an ACK (snapshot applied) or a NACK (snapshot rejected)
  5. When the snapshot store publishes a new snapshot, the server sends it to all active streams
  6. The data plane sends status reports (DataplaneStatusReport) on the same stream, providing health and configuration state
  7. The stream terminates when the data plane disconnects, the server shuts down, or a timeout occurs

The server tracks why each stream ended and records it in the nantian_gateway_controlplane_xds_stream_terminations_total metric with the reason label:

Reason             Description
shutdown           Server is shutting down
client_disconnect  Data plane closed the stream
stream_error       gRPC stream encountered an error
send_timeout       Sending a snapshot timed out
ack_timeout        No ACK/NACK received within the timeout
superseded         A new stream from the same node replaced this one
invalid_request    The initial request was malformed
other              Any other reason

The gRPC server receives DataplaneStatusReport messages from data planes and forwards them to the node status registry. Reports are validated before being applied. Rejected reports are counted in the nantian_gateway_controlplane_xds_status_report_rejections_total metric with rejection reasons like shutdown, invalid_request, unknown_node, or other.

The control plane uses a custom reconciler runner (gateway/internal/controller/reconciler_runner.go) rather than the default controller-runtime reconciler loop. The runner supports:

  • Scoped reconciliation: Infrastructure and status are reconciled independently with different scopes
  • Settle delay: Changes are debounced to avoid excessive reconciliation during rapid resource updates
  • Retry with backoff: Failed reconciliations are retried with exponential backoff
  • Immediate trigger: Node status changes can trigger immediate infrastructure reconciliation

The runner emits detailed metrics for monitoring: queue depth, trigger counts, deduplication counts, settle state, and retry state.

The control plane uses Kubernetes leader election through the controller-runtime manager. The leader election configuration is defined in the control plane config:

Parameter                     Default                  Description
leaderElection.enabled        true                     Enable leader election
leaderElection.id             nantian-gw-controlplane  Lease identity
leaderElection.leaseDuration  15s                      How long a lease is held
leaderElection.renewDeadline  10s                      How long the leader can attempt to renew
leaderElection.retryPeriod    2s                       How long candidates wait between acquisition attempts

Only the leader runs the translator, reconciler runner, and snapshot syncer. The standby replicas serve the Admin API and metrics endpoint but do not watch Kubernetes resources or build snapshots. If the leader fails, a standby acquires the lease once it expires and takes over translation, typically within about one lease duration (15s by default).

The status writer (gateway/internal/status/) writes status back to Gateway API resources. It updates:

  • Gateway status: Listener status (ready, warning, failed), addresses, and conditions
  • Route status: Parent gateway acceptance, route conditions, and resolved refs
  • Policy status: Ancestor references and conditions for BackendTLSPolicy, BackendLBPolicy, AIService, TokenPolicy, and WasmPlugin

The status writer is triggered by the reconciler runner after each infrastructure reconciliation. It uses the controller-runtime client to patch status subresources, respecting the standard Gateway API status conventions.

The node status system (gateway/internal/nodestatus/) tracks the health and configuration state of each data plane instance. It uses Kubernetes Lease objects for persistence, storing node status as JSON in the lease’s spec fields. The node status registry supports:

  • Debounced persistence: Status updates are batched and flushed after a configurable debounce window
  • Bounded backlog: The persistence queue has a configurable maximum depth to prevent unbounded memory growth
  • Immediate and debounced updates: Critical updates can bypass the debounce window

Node status metrics include queue depth, pending nodes, enqueued/dropped totals, and flush duration histograms.
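The bounded backlog can be pictured as a map of pending updates keyed by node ID with a maximum size. The sketch below is an assumption about the general shape, not the registry's real code: `Backlog`, `Enqueue`, `Flush`, and `NodeStatus` are hypothetical names, and the debounce timer that decides when to call `Flush` is left out.

```go
package main

import "fmt"

// NodeStatus is a simplified stand-in for a data plane status record.
type NodeStatus struct {
	NodeID  string
	Healthy bool
	Version string
}

// Backlog buffers the latest status per node until the next flush, up to a
// maximum number of distinct nodes.
type Backlog struct {
	maxDepth int
	pending  map[string]NodeStatus
	order    []string // node IDs in first-arrival order
	Dropped  int      // updates rejected because the backlog was full
}

func NewBacklog(maxDepth int) *Backlog {
	return &Backlog{maxDepth: maxDepth, pending: make(map[string]NodeStatus)}
}

// Enqueue records the latest status for a node. An update for a node that is
// already pending overwrites it in place; an update for a new node is dropped
// when the backlog is full. It reports whether the update was accepted.
func (b *Backlog) Enqueue(st NodeStatus) bool {
	if _, ok := b.pending[st.NodeID]; !ok {
		if len(b.pending) >= b.maxDepth {
			b.Dropped++
			return false
		}
		b.order = append(b.order, st.NodeID)
	}
	b.pending[st.NodeID] = st
	return true
}

// Flush returns the pending updates in arrival order and empties the backlog.
func (b *Backlog) Flush() []NodeStatus {
	out := make([]NodeStatus, 0, len(b.order))
	for _, id := range b.order {
		out = append(out, b.pending[id])
	}
	b.pending = make(map[string]NodeStatus)
	b.order = nil
	return out
}

func main() {
	b := NewBacklog(2)
	b.Enqueue(NodeStatus{NodeID: "node-a", Healthy: true, Version: "v1"})
	b.Enqueue(NodeStatus{NodeID: "node-b", Healthy: true, Version: "v1"})
	b.Enqueue(NodeStatus{NodeID: "node-a", Healthy: true, Version: "v2"}) // overwrites
	b.Enqueue(NodeStatus{NodeID: "node-c", Healthy: true, Version: "v1"}) // dropped
	fmt.Println("flushed:", b.Flush(), "dropped:", b.Dropped)
}
```

Keying the backlog by node ID means repeated reports from one node cost a single slot and a single Lease write per flush, so the bound limits the number of nodes waiting to be persisted rather than the number of reports received.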

The lifecycle supervisor (gateway/internal/lifecycle/supervisor.go) manages the startup and shutdown of all control plane components. Components are registered with a name and a run function. The supervisor:

  1. Starts all components in parallel
  2. Waits for all components to signal readiness (or for a startup timeout)
  3. Marks the startup gate as ready, allowing Kubernetes readiness probes to succeed
  4. On shutdown, cancels the context for all components and waits for graceful termination

Components include the controller-runtime manager, admin HTTP server, metrics HTTP server, gRPC server, and optionally the pprof debug server.