Production-Grade K3s HA Cluster: 3-Node Setup with Embedded etcd, Longhorn, and Nginx Ingress

February 10, 2025

8 min read

kubernetes · k3s · devops · longhorn · nginx-ingress

Introduction

K3s is a lightweight, CNCF-certified Kubernetes distribution from Rancher that packages everything into a single binary under 100MB. While it was originally designed for edge and IoT workloads, it has matured into a solid choice for production clusters — especially when you want the full Kubernetes API surface without the operational overhead of kubeadm or managed cloud offerings.

In this guide, we'll build a 3-node HA K3s cluster using embedded etcd (no external datastore required), back it with Longhorn for distributed block storage, and expose workloads through the Nginx Ingress Controller. By the end, you'll have a cluster that survives a single control-plane node failure and provides persistent volumes that replicate across nodes.


Prerequisites

Before you start, make sure you have the following in place:

Infrastructure:

  • 3 Linux nodes (Ubuntu 22.04 LTS recommended) — each with at least 2 vCPU, 4 GB RAM, and 40 GB disk
  • A dedicated data disk (e.g., /dev/sdb, 50+ GB) on each node for Longhorn, formatted and mounted at the Longhorn data path (/var/lib/longhorn by default)
  • Static IPs or stable DNS names for all three nodes
  • SSH access with sudo privileges

Networking:

  • Nodes must be able to reach each other on ports 6443 (Kubernetes API), 2379-2380 (etcd), 10250 (kubelet), and 8472/UDP (Flannel VXLAN)
  • A load balancer or VIP in front of port 6443 for HA API access (HAProxy, kube-vip, or a cloud LB)
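If you go the HAProxy route for the API VIP, a minimal TCP-mode config is sketched below. It assumes the VIP 192.168.1.100 and the node IPs from the table that follows; adjust to your environment.

```
# /etc/haproxy/haproxy.cfg (sketch, not a complete production config)
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h   # long timeouts keep kubectl watch streams alive
    timeout server  1h

frontend k3s_api
    bind 192.168.1.100:6443
    default_backend k3s_servers

backend k3s_servers
    balance roundrobin
    option tcp-check
    server k3s-node-1 192.168.1.10:6443 check
    server k3s-node-2 192.168.1.11:6443 check
    server k3s-node-3 192.168.1.12:6443 check
```

TCP mode (not HTTP) matters here: the Kubernetes API uses mutual TLS, so the load balancer must pass connections through without terminating them.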

Tools on your workstation:

  • kubectl v1.28+
  • helm v3.12+
  • curl and ssh

Node naming convention used in this guide:

k3s-node-1  192.168.1.10  (first server, bootstraps the cluster)
k3s-node-2  192.168.1.11  (second server)
k3s-node-3  192.168.1.12  (third server)

Setting Up the 3-Node Cluster with Embedded etcd

Step 1: Prepare All Nodes

Run the following on all three nodes before installing K3s:

# Disable swap (required by Kubernetes)
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab

# Load required kernel modules
sudo modprobe overlay
sudo modprobe br_netfilter

cat <<EOF | sudo tee /etc/modules-load.d/k3s.conf
overlay
br_netfilter
EOF

# Set sysctl params
cat <<EOF | sudo tee /etc/sysctl.d/99-k3s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                = 1
EOF

sudo sysctl --system

# Install open-iscsi for Longhorn (required)
sudo apt-get update && sudo apt-get install -y open-iscsi nfs-common
sudo systemctl enable --now iscsid

Step 2: Bootstrap the First Server Node

On k3s-node-1, initialize the cluster with embedded etcd:

export K3S_TOKEN="your-strong-cluster-secret-here"
export VIP="192.168.1.100"  # Your load balancer / VIP address

curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-init \
  --token="${K3S_TOKEN}" \
  --tls-san="${VIP}" \
  --tls-san="k3s-node-1" \
  --tls-san="192.168.1.10" \
  --disable=traefik \
  --disable=servicelb \
  --node-name="k3s-node-1" \
  --write-kubeconfig-mode=644

Key flags explained:

  • --cluster-init — bootstraps a new etcd cluster (only on the first node)
  • --tls-san — adds SANs to the API server certificate; include your VIP and all node IPs/names
  • --disable=traefik — we'll use Nginx Ingress instead
  • --disable=servicelb — removes the built-in load balancer (use MetalLB or cloud LB instead)
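Instead of passing everything on the command line, the same options can live in /etc/rancher/k3s/config.yaml, which K3s reads automatically on startup. A sketch mirroring the install command above (flag names lose their leading dashes; values are otherwise identical):

```yaml
# /etc/rancher/k3s/config.yaml on k3s-node-1
cluster-init: true
token: your-strong-cluster-secret-here
tls-san:
  - 192.168.1.100
  - k3s-node-1
  - 192.168.1.10
disable:
  - traefik
  - servicelb
node-name: k3s-node-1
write-kubeconfig-mode: "644"
```

Keeping the config in a file makes it easy to version-control and keeps the token out of your shell history.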

Wait for the node to become Ready:

sudo kubectl get nodes
# NAME         STATUS   ROLES                       AGE   VERSION
# k3s-node-1   Ready    control-plane,etcd,master   60s   v1.28.x+k3s1

Step 3: Join the Remaining Server Nodes

On k3s-node-2 and k3s-node-3, join as additional server nodes, not agents; running servers on all three machines is what gives you an HA control plane:
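Three servers is the minimum for HA because embedded etcd stays writable only while a majority (quorum) of members is healthy. The arithmetic is simple enough to sketch:

```shell
# etcd quorum arithmetic: a cluster of n members needs floor(n/2)+1
# healthy members to keep accepting writes.
n=3
quorum=$(( n / 2 + 1 ))       # 2 of 3 members must be up
tolerated=$(( n - quorum ))   # so exactly 1 server can fail
echo "quorum=$quorum tolerated=$tolerated"
```

Note that a 2-node cluster tolerates zero failures (quorum is still 2), which is why adding a second server alone buys you nothing.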

export K3S_TOKEN="your-strong-cluster-secret-here"
export SERVER_URL="https://192.168.1.10:6443"  # First node's IP
export VIP="192.168.1.100"

# Run on k3s-node-2
curl -sfL https://get.k3s.io | sh -s - server \
  --server="${SERVER_URL}" \
  --token="${K3S_TOKEN}" \
  --tls-san="${VIP}" \
  --disable=traefik \
  --disable=servicelb \
  --node-name="k3s-node-2"

# Run on k3s-node-3
curl -sfL https://get.k3s.io | sh -s - server \
  --server="${SERVER_URL}" \
  --token="${K3S_TOKEN}" \
  --tls-san="${VIP}" \
  --disable=traefik \
  --disable=servicelb \
  --node-name="k3s-node-3"

After both nodes join, verify the cluster:

sudo kubectl get nodes -o wide
# NAME         STATUS   ROLES                       AGE    VERSION
# k3s-node-1   Ready    control-plane,etcd,master   5m     v1.28.x+k3s1
# k3s-node-2   Ready    control-plane,etcd,master   2m     v1.28.x+k3s1
# k3s-node-3   Ready    control-plane,etcd,master   1m     v1.28.x+k3s1

Step 4: Configure kubectl on Your Workstation

# Copy the kubeconfig from node-1
scp user@192.168.1.10:/etc/rancher/k3s/k3s.yaml ~/.kube/k3s-config

# Update the server address to point to your VIP
sed -i 's/127.0.0.1/192.168.1.100/g' ~/.kube/k3s-config

export KUBECONFIG=~/.kube/k3s-config
kubectl get nodes

Installing Longhorn Storage

Longhorn is a cloud-native distributed block storage system for Kubernetes. It creates replicated volumes across your nodes, so a disk or node failure doesn't take down your persistent data.

Step 1: Verify Prerequisites

# Check open-iscsi is running on all nodes
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | \
  xargs -I{} ssh {} "systemctl is-active iscsid"

# Run the Longhorn environment check script
curl -sSfL https://raw.githubusercontent.com/longhorn/longhorn/v1.6.0/scripts/environment_check.sh | bash

Step 2: Install via Helm

helm repo add longhorn https://charts.longhorn.io
helm repo update

helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --version 1.6.0 \
  --set defaultSettings.defaultReplicaCount=3 \
  --set defaultSettings.storageMinimalAvailablePercentage=15 \
  --set defaultSettings.defaultDataPath="/var/lib/longhorn"

Wait for all Longhorn pods to be Running:

kubectl -n longhorn-system rollout status deploy/longhorn-driver-deployer
kubectl -n longhorn-system get pods

Step 3: Set Longhorn as the Default StorageClass

# Patch the local-path provisioner to not be default
kubectl patch storageclass local-path \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

# Verify Longhorn is now the default
kubectl get storageclass
# NAME                 PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
# longhorn (default)   driver.longhorn.io   Delete          Immediate           true                   2m
# local-path           rancher.io/local-path Delete         WaitForFirstConsumer false                 10m
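You aren't limited to the default class. Longhorn StorageClasses accept parameters such as numberOfReplicas and staleReplicaTimeout (values must be quoted strings), so you can define a cheaper class for less critical data. A sketch, with the class name longhorn-2r chosen here for illustration:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2r        # hypothetical name: 2-replica class
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "2"    # one fewer copy than the cluster default
  staleReplicaTimeout: "30"
```

PVCs opt in by setting storageClassName: longhorn-2r instead of longhorn.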

Step 4: Test a Persistent Volume

# test-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi

kubectl apply -f test-pvc.yaml
kubectl get pvc longhorn-test-pvc
# NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
# longhorn-test-pvc   Bound    pvc-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx   1Gi        RWO            longhorn       30s

Configuring Nginx Ingress Controller

Step 1: Install via Helm

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --version 4.9.0 \
  --set controller.service.type=LoadBalancer \
  --set controller.replicaCount=2 \
  --set controller.nodeSelector."kubernetes\.io/os"=linux \
  --set controller.admissionWebhooks.enabled=true

If you're not using a cloud load balancer, set the service type to NodePort and use your VIP:

helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=NodePort \
  --set controller.service.nodePorts.http=30080 \
  --set controller.service.nodePorts.https=30443 \
  --set controller.replicaCount=2
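If that VIP is the same HAProxy instance suggested in the prerequisites, the NodePorts can be wired in with another TCP frontend/backend pair. A sketch assuming the node IPs used throughout this guide (repeat the pattern on port 443 → 30443 for HTTPS):

```
frontend nginx_http
    bind 192.168.1.100:80
    mode tcp
    default_backend nginx_http_nodes

backend nginx_http_nodes
    mode tcp
    balance roundrobin
    server k3s-node-1 192.168.1.10:30080 check
    server k3s-node-2 192.168.1.11:30080 check
    server k3s-node-3 192.168.1.12:30080 check
```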

Step 2: Verify the Controller

kubectl -n ingress-nginx get pods
kubectl -n ingress-nginx get svc ingress-nginx-controller

Step 3: Deploy a Test Application with Ingress

# test-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
        - name: echo-server
          image: ealen/echo-server:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: echo-server
spec:
  selector:
    app: echo-server
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echo-server
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: echo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: echo-server
                port:
                  number: 80

kubectl apply -f test-app.yaml
kubectl get ingress echo-server

Verification Steps

Cluster Health

# All nodes Ready
kubectl get nodes

# List etcd snapshots (run on any server node)
sudo k3s etcd-snapshot ls

# Check etcd member list. K3s runs etcd embedded in the k3s binary, so
# there is no etcd pod to exec into; install etcdctl on a server node
# and point it at the embedded etcd's client certs instead.
sudo etcdctl member list \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/k3s/server/tls/etcd/server-client.key

Longhorn Health

# All volumes healthy
kubectl -n longhorn-system get volumes.longhorn.io

# Check replica distribution
kubectl -n longhorn-system get replicas.longhorn.io -o wide

Ingress Health

# Controller logs
kubectl -n ingress-nginx logs -l app.kubernetes.io/name=ingress-nginx --tail=50

# Test routing (replace with your node IP and NodePort if not using LB)
curl -H "Host: echo.example.com" http://192.168.1.100/

Simulate a Node Failure

To validate HA, cordon and drain one node and confirm the API server remains reachable:

kubectl cordon k3s-node-2
kubectl drain k3s-node-2 --ignore-daemonsets --delete-emptydir-data

# API server should still respond via VIP
kubectl get nodes

# Uncordon when done
kubectl uncordon k3s-node-2

Production Hardening Tips

A few things worth doing before you call this production-ready:

  • etcd snapshots — K3s takes automatic etcd snapshots every 12 hours by default. Use --etcd-snapshot-dir to write snapshots to an NFS-backed path, or the --etcd-s3 flags to upload them to an S3-compatible store for off-node backups.
  • cert-manager — Install cert-manager with a Let's Encrypt ClusterIssuer to automate TLS for your Ingress resources.
  • Network policies — Flannel (K3s default CNI) doesn't enforce NetworkPolicy. Consider switching to Calico or Cilium if you need policy enforcement.
  • Resource limits — Always set requests and limits on your workloads. K3s doesn't enforce this by default.
  • Upgrade strategy — Use Rancher's system-upgrade-controller to perform rolling K3s upgrades without manual SSH.
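For the snapshot point above, the relevant options can go straight into the K3s config file. A sketch; the schedule, retention, and S3 values here are placeholders for your own backup target:

```yaml
# Append to /etc/rancher/k3s/config.yaml on the server nodes,
# then restart the k3s service.
etcd-snapshot-schedule-cron: "0 */6 * * *"   # every 6 hours instead of 12
etcd-snapshot-retention: 10
etcd-s3: true
etcd-s3-endpoint: s3.example.com             # placeholder endpoint
etcd-s3-bucket: k3s-etcd-backups             # placeholder bucket
etcd-s3-access-key: <access-key>
etcd-s3-secret-key: <secret-key>
```

Test a restore path at least once; a backup you've never restored is a hope, not a backup.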

Conclusion

You now have a 3-node K3s HA cluster with embedded etcd, Longhorn distributed storage, and Nginx Ingress — all without a managed Kubernetes service or an external datastore. The cluster can tolerate a single control-plane node failure while keeping the API server and workloads available.

This setup is a solid foundation for self-hosted production workloads. From here, you can layer on cert-manager for TLS automation, Prometheus and Grafana for observability, or Argo CD for GitOps-driven deployments.

If you're running this on bare metal or a homelab, kube-vip is worth a look as a lightweight VIP solution that integrates cleanly with K3s without needing an external load balancer.