Split-Plane Architecture

Nantian Gateway uses a split-plane architecture in which the control plane and the data plane run as separate processes and communicate over gRPC. This separation shapes how the gateway is deployed, operated, and scaled.

Why Split-Plane?

Many gateway implementations embed the proxy directly within the controller process. This simplifies deployment but creates operational constraints:

Scaling is coupled — scaling the control plane to handle more Kubernetes resources also scales the data plane, even when unnecessary
Failure domains overlap — a control plane crash can take down the proxy alongside it
Resource contention — management workloads compete with data path workloads for CPU and memory

A split-plane architecture separates these concerns. The control plane focuses on configuration management and translation, while the data plane focuses on request processing. Each plane can be scaled, monitored, and debugged independently.

Architecture Overview

┌───────────────────────────────────────────────────────────────┐
│                      Kubernetes Cluster                       │
│                                                               │
│  ┌───────────────────┐       ┌──────────────────────────────┐ │
│  │   Control Plane   │       │      Data Plane (Rust)       │ │
│  │       (Go)        │       │                              │ │
│  │                   │ gRPC  │  ┌─────────────────────────┐ │ │
│  │  • Watch K8s API  │◄─────►│  │   Proxy Runtime         │ │ │
│  │  • Translate CRDs │  xDS  │  │  • TLS termination      │ │ │
│  │  • Push config    │       │  │  • HTTP routing         │ │ │
│  │  • Admin API      │       │  │  • Rate limiting        │ │ │
│  │  • Web Dashboard  │       │  │  • AI gateway           │ │ │
│  └───────────────────┘       │  │  • Wasm extensions      │ │ │
│                              │  └─────────────────────────┘ │ │
│                              │                              │ │
│                              │  ┌─────────────────────────┐ │ │
│                              │  │   xDS Client            │ │ │
│                              │  │  • Receive config       │ │ │
│                              │  │  • Apply snapshots      │ │ │
│                              │  │  • Health reporting     │ │ │
│                              │  └─────────────────────────┘ │ │
│                              └──────────────────────────────┘ │
│                                                               │
│                         ┌──────────┐                          │
│                         │ Backend  │                          │
│                         │ Services │                          │
│                         └──────────┘                          │
└───────────────────────────────────────────────────────────────┘

Control Plane (Go)

The control plane is written in Go and runs as a Kubernetes deployment. It has the following responsibilities:

Watch Kubernetes API — monitors Gateway API resources (Gateway, HTTPRoute, etc.) and Nantian Gateway custom resources (AIService, TokenPolicy, WasmPlugin, BackendLBPolicy)
Translate configuration — converts Kubernetes resources into an internal configuration model that the data plane understands
Push configuration via xDS — streams configuration snapshots to connected data plane instances over gRPC bidirectional streams
Serve Admin API — provides an HTTP API for operational queries and management
Host Web Dashboard — serves a Next.js-based admin interface for monitoring and configuration
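
As a rough illustration of the translation step, the sketch below flattens simplified route resources into an ordered route table. The `HTTPRouteSpec` and `RouteEntry` types, their fields, and the `translate` function are hypothetical stand-ins, not Nantian Gateway's actual internal model, and Python is used only for brevity (the control plane itself is written in Go).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HTTPRouteSpec:
    """Hypothetical, heavily simplified stand-in for a Gateway API HTTPRoute."""
    name: str
    hostnames: Tuple[str, ...]
    path_prefix: str
    backend: str  # "service:port"


@dataclass(frozen=True)
class RouteEntry:
    """Hypothetical internal route record that would be pushed to the data plane."""
    host: str
    path_prefix: str
    upstream: str


def translate(routes: List[HTTPRouteSpec]) -> Tuple[RouteEntry, ...]:
    """Flatten watched routes into a deterministic, ordered route table."""
    entries = [
        RouteEntry(host=h, path_prefix=r.path_prefix, upstream=r.backend)
        for r in routes
        for h in r.hostnames
    ]
    # Longest prefix first within each host, so a first-match lookup picks the
    # most specific route. The remaining keys only make the order stable.
    entries.sort(key=lambda e: (e.host, -len(e.path_prefix), e.path_prefix, e.upstream))
    return tuple(entries)


routes = [
    HTTPRouteSpec("web", ("example.com",), "/", "web:80"),
    HTTPRouteSpec("api", ("example.com",), "/api", "api:8080"),
]
for entry in translate(routes):
    print(entry.host, entry.path_prefix, "->", entry.upstream)
```

Producing the same output regardless of the order in which resources are observed is a deliberate choice in this sketch: it lets two control plane replicas that watch the same resources arrive at the same table.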

The control plane runs with multiple replicas for high availability. If the control plane becomes unavailable, running data plane instances keep serving traffic with the last configuration they received; new configuration changes are not applied until the control plane recovers.

Data Plane (Rust)

The data plane is written in Rust and runs as a separate deployment. Each instance receives its configuration from the control plane and handles the actual request path:

Receive configuration — establishes a gRPC connection to the control plane and receives configuration snapshots
Apply configuration — activates the received configuration atomically, ensuring consistent routing behavior
Process requests — terminates TLS, routes HTTP and gRPC traffic, applies rate limits, transforms headers, and forwards requests to backend services
Run AI gateway — handles AI provider protocol adaptation, token counting, and PII masking within the proxy
Report health — sends metrics and health status back to the control plane
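
The "apply configuration atomically" step can be pictured as building the new routing table off to the side and then swapping a single reference. The following is a minimal sketch of that idea, not the gateway's implementation: `ConfigHolder`, `route`, and the prefix-to-upstream table shape are all hypothetical, and Python is used for brevity even though the data plane is written in Rust.

```python
import threading
from types import MappingProxyType
from typing import Mapping, Optional


class ConfigHolder:
    """Holds the active routing table; readers never see a half-applied update."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active: Mapping[str, str] = MappingProxyType({})

    def apply(self, routes: Mapping[str, str]) -> None:
        # Build a complete, read-only copy first, then replace one reference.
        new_table = MappingProxyType(dict(routes))
        with self._lock:  # serializes concurrent writers; readers take no lock
            self._active = new_table

    def current(self) -> Mapping[str, str]:
        # A request grabs the table once and uses it for its whole lifetime.
        return self._active


def route(table: Mapping[str, str], path: str) -> Optional[str]:
    """Longest-prefix match (plain string prefixes, a simplification)."""
    matches = [prefix for prefix in table if path.startswith(prefix)]
    return table[max(matches, key=len)] if matches else None


holder = ConfigHolder()
holder.apply({"/": "web:80"})
in_flight = holder.current()  # table captured by a request already in progress
holder.apply({"/": "web-v2:80", "/api": "api:8080"})
print(route(in_flight, "/api/users"))         # still routed by the old table
print(route(holder.current(), "/api/users"))  # new requests use the new table
```

A request that started before the swap finishes against the table it captured, and every later request sees the complete new table, so no request is routed by a mixture of old and new rules.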

The data plane is designed for high throughput. Rust’s zero-cost abstractions allow performance comparable to C++ proxies, while its memory safety guarantees reduce the risk of memory-related vulnerabilities.

xDS Communication

The control plane and data plane communicate over gRPC bidirectional streams using the xDS protocol. This protocol defines how configuration is serialized, versioned, and delivered.

Key properties of the xDS communication:

Bidirectional — the data plane can send acknowledgments and health reports back to the control plane
Incremental — only changed configuration is transmitted, reducing bandwidth
Versioned — each configuration snapshot has a version, allowing the data plane to detect and reject stale updates
Atomic — configuration snapshots are applied atomically, preventing partial configuration states
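
To make the versioned and atomic properties concrete, here is a minimal sketch of a receiver that rejects stale or invalid snapshots and acknowledges good ones. It rests on several assumptions: versions are modeled as increasing integers (xDS implementations commonly treat the version as an opaque string), and `Snapshot`, `Response`, and `SnapshotReceiver` are hypothetical names rather than Nantian Gateway types. Python is used for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    version: int  # assumption: monotonically increasing integer
    routes: Tuple[Tuple[str, str], ...]  # (path_prefix, upstream) pairs


@dataclass(frozen=True)
class Response:
    version: int
    accepted: bool  # True plays the role of an ACK, False of a NACK
    reason: str = ""


class SnapshotReceiver:
    """Hypothetical data-plane side of the stream."""

    def __init__(self) -> None:
        self.active: Optional[Snapshot] = None

    def receive(self, snap: Snapshot) -> Response:
        # Versioned: anything not newer than the active snapshot is stale.
        if self.active is not None and snap.version <= self.active.version:
            return Response(snap.version, False, "stale version")
        # Validate the whole snapshot before touching the active one.
        if any(not prefix.startswith("/") for prefix, _ in snap.routes):
            return Response(snap.version, False, "invalid path prefix")
        # Atomic: the snapshot is adopted in full or not at all.
        self.active = snap
        return Response(snap.version, True)


rx = SnapshotReceiver()
print(rx.receive(Snapshot(2, (("/", "web:80"),))))
print(rx.receive(Snapshot(1, (("/", "old:80"),))))   # rejected as stale
print(rx.receive(Snapshot(3, (("api", "api:80"),)))) # rejected, version 2 stays active
```

Because validation happens before the assignment, a rejected snapshot leaves the previously active configuration untouched, and the response lets the sender see which version each instance is actually running.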

When a data plane instance connects, it receives the full current configuration snapshot. Subsequent changes are delivered incrementally. If the connection is lost, the data plane keeps serving with its current configuration, then reconnects and receives the latest snapshot.
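
The connect, update, and reconnect sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the actual wire protocol: `Update`, `ConfigServer`, and `ConfigClient` are hypothetical, resources are reduced to name/value strings, and the server is simply handed the client's known state, whereas a real server would track that per stream. Python is used for brevity.

```python
from typing import Dict, List, NamedTuple


class Update(NamedTuple):
    version: int
    full: bool                # True: replace everything; False: apply as a delta
    upserts: Dict[str, str]   # resource name -> serialized config (hypothetical)
    removals: List[str]


class ConfigServer:
    """Hypothetical control-plane side: owns the current resource set."""

    def __init__(self) -> None:
        self.version = 0
        self.resources: Dict[str, str] = {}

    def publish(self, resources: Dict[str, str]) -> None:
        self.resources = dict(resources)
        self.version += 1

    def full_snapshot(self) -> Update:
        # Sent on first connect and on every reconnect.
        return Update(self.version, True, dict(self.resources), [])

    def delta_for(self, known: Dict[str, str]) -> Update:
        # Sent on a live stream: only what differs from what the client holds.
        upserts = {k: v for k, v in self.resources.items() if known.get(k) != v}
        removals = sorted(k for k in known if k not in self.resources)
        return Update(self.version, False, upserts, removals)


class ConfigClient:
    """Hypothetical data-plane side: keeps the last applied resource set."""

    def __init__(self) -> None:
        self.version = 0
        self.resources: Dict[str, str] = {}

    def apply(self, update: Update) -> None:
        merged = {} if update.full else dict(self.resources)
        merged.update(update.upserts)
        for name in update.removals:
            merged.pop(name, None)
        # Adopt the merged result in one step.
        self.resources, self.version = merged, update.version


server, client = ConfigServer(), ConfigClient()
server.publish({"route/a": "v1", "route/b": "v1"})
client.apply(server.full_snapshot())               # initial connect
server.publish({"route/a": "v2", "route/c": "v1"})
client.apply(server.delta_for(client.resources))   # live incremental update
server.publish({"route/d": "v1"})                  # missed while disconnected
client.apply(server.full_snapshot())               # reconnect resynchronizes
print(client.version, client.resources)
```

Between the missed update and the reconnect, the client in this sketch simply keeps the last resource set it applied, which mirrors the behavior described for a lost connection.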

Independent Scaling

The split-plane design allows each plane to scale based on its own requirements:

Concern           Control Plane                   Data Plane
Scaling trigger   Number of Kubernetes resources  Request throughput
Typical replicas  2 (HA)                          2+ (load-dependent)
Resource profile  CPU for config translation      Network and CPU for traffic
Failure impact    Cannot apply new config         Cannot process traffic

As a result, a traffic spike that calls for more data plane replicas does not require scaling the control plane, and a large number of Kubernetes resources, which loads the control plane, does not slow request processing.
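
The sketch below shows the idea of sizing each plane from its own signal. The `replicas_needed` and `plan` functions are hypothetical, and the per-replica capacities passed in are illustrative placeholders rather than measured figures; only the floor of two replicas comes from the table above.

```python
import math


def replicas_needed(load: float, capacity_per_replica: float, min_replicas: int = 2) -> int:
    """Replicas required to carry `load`, never fewer than the HA floor."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    return max(min_replicas, math.ceil(load / capacity_per_replica))


def plan(resource_count: int, resources_per_cp_replica: int,
         requests_per_second: float, rps_per_dp_replica: float) -> dict:
    # Each plane is sized from its own trigger; neither input affects the other.
    return {
        "control_plane": replicas_needed(resource_count, resources_per_cp_replica),
        "data_plane": replicas_needed(requests_per_second, rps_per_dp_replica),
    }


# Illustrative numbers only: doubling traffic changes the data plane count
# while the control plane count stays the same.
print(plan(800, 1000, 45_000, 10_000))
print(plan(800, 1000, 90_000, 10_000))
```

In a cluster, the same separation would typically be expressed by giving each deployment its own replica count or autoscaling policy.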

Next Steps

Gateway API Concepts — understand the Kubernetes resources the control plane watches
Architecture Details — deep dive into control plane and data plane internals
Quick Start — deploy the split-plane architecture in your cluster