Installation Overview

Nantian Gateway deploys into a Kubernetes cluster as three components: a Go control plane, Rust data plane proxies, and a management dashboard. You have two primary installation paths: Helm (recommended) and Kustomize. Both produce the same running system, so the choice depends on how your team manages Kubernetes workloads.

This section covers everything you need to get Nantian Gateway running, from a quick Helm install to a hardened production deployment with high availability.

| Method | Best for | Complexity |
| --- | --- | --- |
| Helm | Most teams. Templating, upgrades, rollbacks, and value overrides built in. | Low |
| Kustomize | GitOps workflows, teams that need patch-based overlays without templating. | Medium |

Helm is the recommended path. The chart is distributed via the Helm repository as nantian-gw/nantian-gw and includes templates for all three components, RBAC, certificates, and optional ServiceMonitor resources for Prometheus.

A default Helm installation creates these resources in the nantian-gw namespace:

| Resource | Replicas | Purpose |
| --- | --- | --- |
| Control plane | 2 | Watches Gateway API resources, translates to xDS config, pushes to data planes |
| Data plane | 2 | Rust proxy that terminates TLS, routes traffic, enforces policies |
| Dashboard | 1 | Web UI for monitoring gateways, routes, and data plane status |
| GatewayClass | 1 | Registers the gateway.networking.k8s.io/nantian-gw controller |

The control plane communicates with data planes over gRPC bidirectional streams using the xDS protocol. Data planes connect to the control plane’s gRPC service (port 18080 by default) and receive configuration snapshots as deltas.

Nantian Gateway runs inside your Kubernetes cluster as three distinct workloads. The diagram below shows how they relate to each other and to external traffic.

                  ┌──────────────────────────────────────────────┐
                  │              Kubernetes Cluster              │
                  │                                              │
Client Traffic    │  ┌──────────┐    ┌──────────────────┐        │
──────────────►   │  │ Service  │    │    Dashboard     │        │
:80/:443          │  │ (LB/NP)  │    │   (Port 3000)    │        │
                  │  └────┬─────┘    └────────┬─────────┘        │
                  │       │                   │                  │
                  │  ┌────▼───────────────────▼───────────┐      │
                  │  │          Data Plane Pods           │      │
                  │  │     (Rust, ports 10080/10443)      │      │
                  │  │   ┌──────┐  ┌──────┐  ┌──────┐     │      │
                  │  │   │ DP-1 │  │ DP-2 │  │ DP-3 │     │      │
                  │  │   └──┬───┘  └──┬───┘  └──┬───┘     │      │
                  │  └──────┼─────────┼─────────┼─────────┘      │
                  │         │ gRPC/xDS│(port    │                │
                  │         │         │ 18080)  │                │
                  │  ┌──────▼─────────▼─────────▼─────────┐      │
                  │  │           Control Plane            │      │
                  │  │      (Go, ports 18080/18082)       │      │
                  │  │     Leader election via Lease      │      │
                  │  └─────────────────┬──────────────────┘      │
                  │                    │                         │
                  │         Watches Gateway API CRDs             │
                  │                    │                         │
                  │  ┌─────────────────▼──────────────────┐      │
                  │  │       Kubernetes API Server        │      │
                  │  └────────────────────────────────────┘      │
                  │                                              │
                  └──────────────────────────────────────────────┘

Traffic flow: Client requests hit a Kubernetes Service (LoadBalancer or NodePort) that routes to data plane pods. Each data plane pod terminates TLS, inspects the request, and forwards it to the backend Service matching the HTTPRoute rules.

Control flow: The control plane watches Gateway API resources (Gateway, HTTPRoute, etc.) through the Kubernetes API. It translates these into xDS configuration snapshots and pushes them to every data plane over persistent gRPC bidirectional streams.

Dashboard: Reads Gateway and HTTPRoute status from the Kubernetes API and polls the control plane’s admin endpoint (port 18082) for data plane health. The dashboard itself is stateless.

All namespaced resources install into the nantian-gw namespace; the GatewayClass is cluster-scoped. A default deployment creates these Kubernetes resources:

| Resource | Kind | Name | Component Label |
| --- | --- | --- | --- |
| Namespace | Namespace | nantian-gw | (N/A) |
| Control plane workload | Deployment | nantian-gw-controlplane | controlplane |
| Data plane workload | Deployment | nantian-gw-dataplane | dataplane |
| Dashboard workload | Deployment | nantian-gw-dashboard | dashboard |
| Control plane gRPC | Service (ClusterIP) | nantian-gw-controlplane | controlplane |
| Data plane ingress | Service | nantian-gw-dataplane | dataplane |
| Dashboard UI | Service (ClusterIP) | nantian-gw-dashboard | dashboard |
| Gateway registration | GatewayClass | nantian-gw | (none) |
| Leader election | Lease | nantian-gw-controller-leader | (none) |

All resources carry these standard Kubernetes recommended labels:

  • app.kubernetes.io/name: nantian-gw
  • app.kubernetes.io/part-of: nantian-gw
  • app.kubernetes.io/managed-by: Helm (or kustomize, depending on install method)

The component label (app.kubernetes.io/component) is the primary selector for Services and PodDisruptionBudgets. If you write NetworkPolicies or additional RBAC rules, target this label to stay compatible with upgrades.

ServiceAccount, ClusterRole, and ClusterRoleBinding resources are also created. The control plane gets read access to Gateway API resources and write access to Gateway status subresources. Data plane and dashboard pods run with a restricted ServiceAccount that has no API access.

Each component exposes a specific set of ports. All services are ClusterIP by default. You control external exposure through Helm values or Kustomize overlays.

| Component | Port | Protocol | Service Type | External? | Purpose |
| --- | --- | --- | --- | --- | --- |
| Control plane | 18080 | gRPC | ClusterIP | No | xDS configuration push to data planes |
| Control plane | 18082 | HTTP | ClusterIP | No | Admin API (health, metrics, debug) |
| Data plane | 10080 | HTTP | Configurable | Optional | Incoming HTTP traffic |
| Data plane | 10443 | HTTPS | Configurable | Optional | Incoming HTTPS traffic (TLS termination) |
| Dashboard | 3000 | HTTP | ClusterIP | No | Web UI (use port-forward or Ingress to expose) |

If you use a NetworkPolicy controller, allow these flows:

| Source | Destination | Port | Reason |
| --- | --- | --- | --- |
| Data plane pods | Control plane pods | 18080 | xDS/gRPC config stream |
| Dashboard pod | Control plane pods | 18082 | Admin API queries |
| kubelet | All pods | health endpoints | Readiness and liveness probes |
| External clients | Data plane Service | 10080, 10443 | Incoming application traffic |

The control plane’s gRPC port (18080) and admin port (18082) should never be exposed outside the cluster. The data plane ports (10080 and 10443) are the only ones that need external reachability, and you control how they’re exposed through the Service type setting (LoadBalancer, NodePort, or ClusterIP behind an Ingress).
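As a rough illustration of the in-cluster flows listed above, the following NetworkPolicy sketch restricts ingress to the control plane pods. It assumes your CNI enforces NetworkPolicy and that pods carry the app.kubernetes.io/component label described earlier. The policy name is hypothetical, and the sketch does not cover kubelet probes, data plane ingress, or a Prometheus instance scraping port 18082 from another namespace; those need their own rules in your environment.

```yaml
# Sketch only: the policy name is hypothetical; labels and ports come from this page.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: controlplane-ingress-example   # hypothetical name
  namespace: nantian-gw
spec:
  # Applies to the control plane pods.
  podSelector:
    matchLabels:
      app.kubernetes.io/component: controlplane
  policyTypes:
    - Ingress
  ingress:
    # Data plane pods open the xDS/gRPC stream on 18080.
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: dataplane
      ports:
        - protocol: TCP
          port: 18080
    # The dashboard polls the admin API on 18082.
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: dashboard
      ports:
        - protocol: TCP
          port: 18082
```

Because the selectors use only the component label, the policy should keep matching after upgrades that change pod names or hashes.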

Nantian Gateway is stateless. It does not require a database, persistent volumes, or external storage of any kind.

Configuration state: All Gateway API resources (Gateway, HTTPRoute, GatewayClass, etc.) live in the Kubernetes API server’s etcd store. The control plane reads them through watches and keeps an in-memory representation. There is no separate config database to manage or back up.

Leader election: The control plane uses a Kubernetes Lease object (nantian-gw-controller-leader) for leader election. The active leader holds the lease and pushes configuration to data planes. Standby replicas sit idle and take over if the leader stops renewing. The lease duration is 15 seconds with a 10-second renew deadline, so in the worst case failover completes about 15 seconds after the leader's last successful renewal.
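To relate these timings to what the cluster shows, you can read the Lease with kubectl get lease nantian-gw-controller-leader -n nantian-gw -o yaml. The sketch below shows the general shape of a coordination.k8s.io/v1 Lease and which fields to look at. The holder identity and timestamps are made-up example values, and it is an assumption that the control plane records the 15-second duration in leaseDurationSeconds.

```yaml
# Illustrative shape of the leader-election Lease; not captured output.
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: nantian-gw-controller-leader
  namespace: nantian-gw
spec:
  holderIdentity: nantian-gw-controlplane-example-pod   # hypothetical holder identity
  leaseDurationSeconds: 15                              # lease duration stated above (assumed to be recorded here)
  acquireTime: "2024-01-01T00:00:00.000000Z"            # example value: when the current leader took over
  renewTime: "2024-01-01T00:05:00.000000Z"              # example value: keeps advancing while the leader is healthy
  leaseTransitions: 1                                   # increments each time leadership changes hands
```

A renewTime that stops advancing, or a holderIdentity that changes frequently, is usually the first sign of a leader election problem.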

Data plane state: Data plane proxies keep their configuration in memory, received from the control plane over gRPC/xDS streams. No persistent storage is attached to data plane pods. If a pod restarts, it reconnects to the control plane, receives the current configuration snapshot, and resumes handling traffic.

Recovery behavior: A full cluster restart (all pods down, all pods back up) recovers cleanly. The control plane starts, wins leader election, watches Gateway API resources from the API server (backed by etcd), recomputes the configuration, and pushes it to data planes. The entire process typically takes 30 to 60 seconds from the first control plane pod starting.

Metrics: Prometheus metrics are scraped from the control plane’s admin endpoint (port 18082). No metrics are stored locally. You need an external Prometheus instance, and optionally Grafana, to collect and visualize them.

The default resource requests in the Helm chart are tuned for evaluation and development. Production workloads need higher requests and limits.

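| Component | Env | Request CPU | Request Memory | Limit CPU | Limit Memory |
| --- | --- | --- | --- | --- | --- |
| Control plane | Dev | 100m | 128Mi | 500m | 512Mi |
| Control plane | Prod | 200m | 256Mi | 1 | 1Gi |
| Data plane | Dev | 500m | 256Mi | 1 | 1Gi |
| Data plane | Prod | 2 | 512Mi | (unset) | 2Gi |
| Dashboard | Dev/Prod | 50m | 64Mi | 200m | 256Mi |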

The data plane carries the traffic load, so its CPU request should reflect expected throughput. For production data planes, the CPU limit is intentionally left unset so the proxy can burst to available node CPU during traffic spikes. The memory limit is the hard cap that prevents a runaway proxy from exhausting node memory.
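For reference, the production data plane figures above correspond to a container resources stanza like the sketch below. This is the standard Kubernetes container field. Where it sits in the Helm values or in a Kustomize patch depends on the chart layout, which this page does not show.

```yaml
# Container-level resources for a production data plane pod (values from the table above).
resources:
  requests:
    cpu: "2"
    memory: 512Mi
  limits:
    # No cpu limit: the proxy may burst to whatever CPU the node has free.
    memory: 2Gi   # hard cap that protects the node from a runaway proxy
```

With a CPU request but no CPU limit, the container is not subject to CFS throttling, so the throttling metric is mainly relevant for components that do set a CPU limit.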

What to monitor:

  • CPU throttling: Check container_cpu_cfs_throttled_seconds_total for the control plane and data plane. Frequent throttling means the CPU limit is too low.
  • Memory usage: Watch container_memory_working_set_bytes against the memory limit. The data plane should stay well below its limit under normal load.
  • gRPC stream health: The control plane exposes nantian_gw_xds_streams_active on its admin endpoint. This should equal the number of data plane replicas. A drop means a data plane disconnected.
  • Leader election: The Lease object should always have a holder. Check with kubectl get lease -n nantian-gw.
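The first three signals above could be turned into Prometheus alerting rules along the lines of the sketch below. It assumes that cAdvisor and kube-state-metrics are being scraped, that summing nantian_gw_xds_streams_active across control plane replicas is the right aggregation, and that pod names start with the Deployment name. The alert names and thresholds are illustrative, not recommendations from the project.

```yaml
# Sketch of a Prometheus rules file; alert names and thresholds are illustrative.
groups:
  - name: nantian-gw-example
    rules:
      - alert: NantianGwCpuThrottled
        # Seconds spent throttled per second, per pod, averaged over 5 minutes.
        expr: |
          sum by (pod) (
            rate(container_cpu_cfs_throttled_seconds_total{namespace="nantian-gw"}[5m])
          ) > 0.1
        for: 10m
      - alert: NantianGwDataPlaneMemoryHigh
        # Working set above 80% of the container memory limit.
        expr: |
          max by (pod) (container_memory_working_set_bytes{namespace="nantian-gw", pod=~"nantian-gw-dataplane-.*", container!=""})
            /
          max by (pod) (container_spec_memory_limit_bytes{namespace="nantian-gw", pod=~"nantian-gw-dataplane-.*", container!=""})
            > 0.8
        for: 10m
      - alert: NantianGwXdsStreamsBelowReplicas
        # Fewer active xDS streams than desired data plane replicas.
        expr: |
          sum(nantian_gw_xds_streams_active)
            <
          max(kube_deployment_spec_replicas{namespace="nantian-gw", deployment="nantian-gw-dataplane"})
        for: 5m
```

The for: durations keep short-lived dips, such as a rolling restart of the data plane, from firing alerts.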

For a full breakdown of production resource planning, see the Production deployment guide.

Before installing, make sure you have:

  • Kubernetes 1.24 or later (the Gateway API v1.0 resources gateway.networking.k8s.io/v1 must be available)
  • Gateway API CRDs installed (see Prerequisites)
  • kubectl configured to talk to your cluster
  • helm 3.x (for Helm installs) or kubectl with Kustomize built in (for Kustomize installs)

Start with the Helm installation guide if you’re new to the project. It covers the default install, common value overrides, and how to inspect the chart before deploying.

Use the Kustomize guide if your team uses GitOps tooling like Argo CD or Flux and you prefer raw manifests over Helm templating.

Once you’re comfortable with the basics, work through the Production deployment and High availability guides. These cover hardening, resource planning, anti-affinity rules, TLS everywhere, and multi-zone deployment patterns.

After installation, run through these checks to confirm everything is working:

1. Check pod status

kubectl get pods -n nantian-gw

All pods should show Running with every container ready (for example, 1/1 in the READY column), and the number of pods for each component should match its replica count.

2. Verify the GatewayClass

kubectl get gatewayclass

You should see nantian-gw listed with Accepted: True.

3. Check leader election

kubectl get lease -n nantian-gw

The lease nantian-gw-controller-leader should have a non-empty HOLDER column. This confirms the control plane leader is active.

4. Create a test Gateway and HTTPRoute

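Save the following to test-gateway.yaml. The backend name test-backend is a placeholder; point it at a Service in the same namespace if you want to send traffic through the route: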

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-gateway
spec:
  gatewayClassName: nantian-gw
  listeners:
    - name: http
      port: 80
      protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: test-route
spec:
  parentRefs:
    - name: test-gateway
  rules:
    - backendRefs:
        - name: test-backend
          port: 80

Apply it:

kubectl apply -f test-gateway.yaml

Then check that the Gateway shows Programmed: True:

kubectl get gateway test-gateway

If the test Gateway is accepted and programmed, the control plane and data planes are communicating correctly.

5. Inspect logs (if something is wrong)

kubectl logs -n nantian-gw deployment/nantian-gw-controlplane --tail=50
kubectl logs -n nantian-gw deployment/nantian-gw-dataplane --tail=50

The control plane logs show Gateway API watch events and xDS snapshot publications. The data plane logs show gRPC connection status and configuration acceptance.

After you’ve confirmed everything is running, clean up the test resources:

kubectl delete -f test-gateway.yaml

After installation, head to the Quick Start to verify everything is running and create your first route. If you’re deploying to production, read the Production deployment guide before exposing the gateway to live traffic.