Kubeadm Cluster Creation

Clouder uses kubeadm to bootstrap production-grade Kubernetes clusters on cloud VMs with built-in support for CRIU (Checkpoint/Restore In Userspace). This design document describes how Clouder orchestrates kubeadm across Azure (and other cloud providers) to create clusters optimized for pod checkpoint and restore workflows.

Overview

┌─────────────────────────────────────────────────────────┐
│                       Clouder CLI                       │
│                                                         │
│   clouder k8s create <cluster-name> --provider azure    │
└───────────────────────┬─────────────────────────────────┘
                        │
           ┌────────────▼────────────┐
           │  1. Provision VMs       │
           │  (Azure / OVH / ...)    │
           └────────────┬────────────┘
                        │
           ┌────────────▼────────────┐
           │  2. Install prereqs     │
           │  (containerd, CRIU,     │
           │   kubeadm, kubelet)     │
           └────────────┬────────────┘
                        │
           ┌────────────▼────────────┐
           │  3. kubeadm init        │
           │  (control plane node)   │
           └────────────┬────────────┘
                        │
           ┌────────────▼────────────┐
           │  4. kubeadm join        │
           │  (worker nodes)         │
           └────────────┬────────────┘
                        │
           ┌────────────▼────────────┐
           │  5. Post-setup          │
           │  (CNI, CRIU config,     │
           │   storage, monitoring)  │
           └─────────────────────────┘

Cluster Topology

A Clouder-managed cluster consists of:

| Role          | Count           | Purpose                                         | Recommended VM Size            |
|---------------|-----------------|-------------------------------------------------|--------------------------------|
| Control Plane | 1 (or 3 for HA) | API server, etcd, scheduler, controller manager | Standard_B4ms (4 vCPUs, 16 GB) |
| Worker        | 1+              | Run application pods                            | Standard_B4ms or larger        |
| GPU Worker    | 0+              | ML/AI workloads                                 | Standard_NC6s_v3 (V100 GPU)    |

Step 1: VM Provisioning

Clouder provisions VMs using the cloud provider API (e.g., clouder azure vm-create). For a cluster, multiple VMs are created with appropriate naming and tagging:

# Create a 3-node cluster on Azure
clouder k8s create my-cluster \
  --provider azure \
  --region eastus \
  --control-plane-size Standard_B4ms \
  --worker-size Standard_B4ms \
  --worker-count 2

This will:

  1. Create VMs: my-cluster-cp-1, my-cluster-worker-1, my-cluster-worker-2
  2. Tag all VMs with clouder-cluster=my-cluster and their role
  3. Set up a shared virtual network and subnet
  4. Configure security groups for Kubernetes ports
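The node-naming scheme above can be sketched as a simple loop. This is a hypothetical illustration of the convention, not Clouder's actual provisioning code:

```shell
# Hypothetical sketch of the node-naming convention; the variable names and
# loop are illustrative only, not Clouder's actual code.
CLUSTER=my-cluster
WORKER_COUNT=2

# One control-plane node, then numbered workers
NODES="${CLUSTER}-cp-1"
for i in $(seq 1 "$WORKER_COUNT"); do
  NODES="$NODES ${CLUSTER}-worker-$i"
done

echo "$NODES"  # → my-cluster-cp-1 my-cluster-worker-1 my-cluster-worker-2
```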

Required Ports

| Port        | Protocol | Purpose                 |
|-------------|----------|-------------------------|
| 6443        | TCP      | Kubernetes API server   |
| 2379-2380   | TCP      | etcd client/peer        |
| 10250       | TCP      | kubelet API             |
| 10259       | TCP      | kube-scheduler          |
| 10257       | TCP      | kube-controller-manager |
| 30000-32767 | TCP      | NodePort Services       |

Step 2: Node Preparation

Clouder connects to each VM via SSH and runs preparation scripts:

Container Runtime (containerd)

# Install containerd (containerd.io from Docker's apt repo) and CRIU
apt-get update && apt-get install -y containerd.io criu

# Configure containerd for checkpoint/restore
cat > /etc/containerd/config.toml <<EOF
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF

systemctl restart containerd
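For debugging on the nodes it can also help to point crictl at containerd's CRI socket. This is an optional aid, not part of Clouder's preparation script:

```yaml
# /etc/crictl.yaml — optional: let crictl talk to containerd for node-level
# debugging (not installed by the Clouder preparation script)
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 10
```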

Kubernetes Components

# Install kubeadm, kubelet, kubectl (the repo's minor version must match the
# kubernetesVersion used at init, v1.32)
apt-get install -y apt-transport-https ca-certificates curl gpg
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | \
  gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /" | \
  tee /etc/apt/sources.list.d/kubernetes.list

apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

CRIU Prerequisites

# Verify CRIU installation
criu check

# Enable the kubelet feature gate for checkpoint/restore. Note that kubeadm
# init/join regenerates /var/lib/kubelet/config.yaml, so the feature gate is
# also passed via kubeletExtraArgs in the kubeadm configuration (Step 3).
cat > /var/lib/kubelet/config.yaml <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  ContainerCheckpoint: true
EOF

Step 3: Control Plane Initialization

Clouder generates a kubeadm init configuration and runs it on the control plane node:

Kubeadm Configuration

# clouder-kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.32.0
controlPlaneEndpoint: "<control-plane-ip>:6443"
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
apiServer:
  # v1beta4 expects extraArgs as a list of name/value pairs
  extraArgs:
    - name: feature-gates
      value: "ContainerCheckpoint=true"
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  kubeletExtraArgs:
    - name: feature-gates
      value: "ContainerCheckpoint=true"

Initialization

# On the control plane node
kubeadm init --config clouder-kubeadm-config.yaml --upload-certs

# Save the join token for worker nodes
kubeadm token create --print-join-command > /tmp/join-command.sh

Step 4: Worker Node Join

Clouder retrieves the join command from the control plane and executes it on each worker:

# On each worker node
kubeadm join <control-plane-ip>:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>

Step 5: Post-Setup

CNI Installation

Clouder installs Flannel as the CNI plugin for pod networking:

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Flannel uses VXLAN overlay networking with the pod network CIDR 10.244.0.0/16 (matching the podSubnet in the kubeadm configuration), which works reliably on Azure VNets where all nodes share the same subnet.
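For reference, the part of the applied manifest that encodes this choice is Flannel's ConfigMap. The excerpt below is approximate; check the release manifest for the exact content:

```yaml
# Excerpt (approximate) of kube-flannel.yml: the VXLAN backend and the pod
# network, which must match the cluster's podSubnet (10.244.0.0/16)
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
data:
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
```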

CRIU Configuration

After the cluster is running, Clouder configures the checkpoint/restore infrastructure:

  1. Enable the Kubelet Checkpoint API on all nodes
  2. Configure checkpoint storage (local disk or S3-compatible)
  3. Install the CRIU operator (optional, for automated checkpointing)
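Once the feature gate is active, a checkpoint can be triggered through the kubelet's checkpoint endpoint. A minimal sketch of the call, where the node, namespace, pod, and container names are placeholders:

```shell
# Build the kubelet checkpoint API URL; the endpoint shape is
# /checkpoint/{namespace}/{pod}/{container}. All values below are placeholders.
NODE=localhost
NAMESPACE=default
POD=my-app
CONTAINER=main
URL="https://${NODE}:10250/checkpoint/${NAMESPACE}/${POD}/${CONTAINER}"
echo "$URL"  # → https://localhost:10250/checkpoint/default/my-app/main

# On a real node, invoke it with the kubelet client certificate, e.g.:
#   curl -sk -X POST \
#     --cert /var/lib/kubelet/pki/kubelet-client-current.pem \
#     --key /var/lib/kubelet/pki/kubelet-client-current.pem \
#     "$URL"
```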

See the CRIU documentation for details on checkpoint and restore workflows.

Storage Provisioners

For dynamic persistent volume provisioning, clouder kubeadm setup automatically installs:

  • Azure Disk CSI driver v1.30.3 — for block storage (managed-csi StorageClass, StandardSSD_LRS)
  • Azure File CSI driver v1.30.6 — for NFS shared filesystem (azure-nfs StorageClass, Premium_LRS)

Both require an Azure service principal with Contributor access to the cluster's resource group. Clouder auto-creates one if AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET environment variables are not set.
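As an illustration, a managed-csi StorageClass backed by StandardSSD_LRS would look roughly like the following. This is a sketch; the class Clouder actually installs may set additional parameters:

```yaml
# Sketch of a StorageClass for the Azure Disk CSI driver; parameters beyond
# skuName are illustrative defaults, not necessarily Clouder's exact settings
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi
provisioner: disk.csi.azure.com
parameters:
  skuName: StandardSSD_LRS
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```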

See the kubeadm CLI reference for details on manual installation if the automatic step was skipped.

CLI Commands

Create a Cluster

# Interactive mode - prompts for all settings
clouder k8s create my-cluster

# Full specification
clouder k8s create my-cluster \
  --provider azure \
  --region eastus \
  --control-plane-size Standard_B4ms \
  --worker-size Standard_B4ms \
  --worker-count 2 \
  --kubernetes-version v1.32.0 \
  --cni flannel \
  --enable-criu
Note: CRIU support is now built into clouder kubeadm setup by default. The --enable-criu flag above is reserved for the future clouder k8s create managed interface.

Get Kubeconfig

# Download kubeconfig for the cluster
clouder k8s kubeconfig my-cluster

# Set as current context
clouder k8s use my-cluster

Add Worker Nodes

# Add a GPU worker node pool
clouder k8s create-nodepool my-cluster \
  gpu-workers \
  --flavor Standard_NC6s_v3 \
  --min 0 --desired 1 --max 5 \
  --roles jupyter --xpu gpu-cuda

Cluster Lifecycle

# List clusters
clouder k8s ls

# Scale workers
clouder k8s update-nodepool my-cluster workers --desired 5

# Delete cluster (removes all VMs and resources)
clouder k8s delete my-cluster

Architecture Decisions

Why kubeadm?

  • Full control: Unlike managed Kubernetes (AKS, EKS), kubeadm gives full control over the kubelet configuration, which is required for CRIU feature gates
  • CRIU support: Managed Kubernetes services don't yet support the ContainerCheckpoint feature gate; kubeadm allows enabling it at init time
  • Portability: Same cluster setup works across Azure, OVH, or bare metal
  • Cost: No managed Kubernetes control plane fees

Why not Managed Kubernetes?

Managed Kubernetes services (AKS, EKS, GKE) abstract away the control plane, which means:

  • Cannot enable experimental feature gates like ContainerCheckpoint
  • Cannot configure containerd for CRIU checkpoint/restore
  • Cannot access the kubelet checkpoint API directly

Once CRIU support reaches GA in Kubernetes, Clouder may add managed Kubernetes backends as well.

Roadmap

  • Phase 1: Single control-plane cluster creation on Azure
  • Phase 2: HA control plane (3 nodes with etcd)
  • Phase 3: Multi-cloud support (OVH, bare metal)
  • Phase 4: Automated CRIU checkpoint scheduling
  • Phase 5: Cluster upgrades via kubeadm upgrade