Kubeadm Cluster Creation
Clouder uses kubeadm to bootstrap production-grade Kubernetes clusters on cloud VMs with built-in support for CRIU (Checkpoint/Restore In Userspace). This design document describes how Clouder orchestrates kubeadm across Azure (and other cloud providers) to create clusters optimized for pod checkpoint and restore workflows.
Overview
┌─────────────────────────────────────────────────────────┐
│ Clouder CLI │
│ │
│ clouder k8s create <cluster-name> --provider azure │
└───────────────────────┬─────────────────────────────────┘
│
┌────────────▼────────────┐
│ 1. Provision VMs │
│ (Azure / OVH / ...) │
└────────────┬────────────┘
│
┌────────────▼────────────┐
│ 2. Install prereqs │
│ (containerd, CRIU, │
│ kubeadm, kubelet) │
└────────────┬────────────┘
│
┌────────────▼────────────┐
│ 3. kubeadm init │
│ (control plane node) │
└────────────┬────────────┘
│
┌────────────▼────────────┐
│ 4. kubeadm join │
│ (worker nodes) │
└────────────┬────────────┘
│
┌────────────▼────────────┐
│ 5. Post-setup │
│ (CNI, CRIU config, │
│ storage, monitoring) │
└─────────────────────────┘
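The five steps above can be sketched as a driver script. The function names and bodies here are placeholders for illustration, not Clouder's actual implementation, which calls cloud APIs and SSH under the hood:

```shell
#!/usr/bin/env bash
# Sketch of the orchestration order shown in the diagram above.
# All function names and bodies are illustrative placeholders.
set -euo pipefail

provision_vms()      { echo "1. provision VMs"; }
install_prereqs()    { echo "2. install containerd, CRIU, kubeadm, kubelet"; }
init_control_plane() { echo "3. kubeadm init"; }
join_workers()       { echo "4. kubeadm join"; }
post_setup()         { echo "5. CNI, CRIU config, storage, monitoring"; }

create_cluster() {
  provision_vms
  install_prereqs
  init_control_plane
  join_workers
  post_setup
}

create_cluster
```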
Cluster Topology
A Clouder-managed cluster consists of:
| Role | Count | Purpose | Recommended VM Size |
|---|---|---|---|
| Control Plane | 1 (or 3 for HA) | API server, etcd, scheduler, controller | Standard_B4ms (4 vCPUs, 16 GB) |
| Worker | 1+ | Run application pods | Standard_B4ms or larger |
| GPU Worker | 0+ | ML/AI workloads | Standard_NC6s_v3 (V100 GPU) |
Step 1: VM Provisioning
Clouder provisions VMs using the cloud provider API (e.g., clouder azure vm-create). For a cluster, multiple VMs are created with appropriate naming and tagging:
# Create a 3-node cluster on Azure
clouder k8s create my-cluster \
--provider azure \
--region eastus \
--control-plane-size Standard_B4ms \
--worker-size Standard_B4ms \
--worker-count 2
This will:
- Create VMs: `my-cluster-cp-1`, `my-cluster-worker-1`, `my-cluster-worker-2`
- Tag all VMs with `clouder-cluster=my-cluster` and their role
- Set up a shared virtual network and subnet
- Configure security groups for Kubernetes ports
Required Ports
| Port | Protocol | Purpose |
|---|---|---|
| 6443 | TCP | Kubernetes API server |
| 2379-2380 | TCP | etcd client/peer |
| 10250 | TCP | kubelet API |
| 10259 | TCP | kube-scheduler |
| 10257 | TCP | kube-controller-manager |
| 30000-32767 | TCP | NodePort Services |
| 8472 | UDP | Flannel VXLAN overlay (see Step 5) |
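One way to express the security-group setup from the table is a small generator that emits the corresponding `az` CLI calls. This is a sketch only: the function, NSG name, and resource group names are illustrative, not part of Clouder:

```shell
#!/usr/bin/env bash
# Emit the az CLI calls that would open the required Kubernetes ports
# on an Azure network security group. Names below are placeholders.
set -euo pipefail

NSG_NAME="${NSG_NAME:-my-cluster-nsg}"
RESOURCE_GROUP="${RESOURCE_GROUP:-my-cluster-rg}"

emit_nsg_rules() {
  # port-range:priority pairs, one rule per entry from the ports table
  local rules="6443:100 2379-2380:110 10250:120 10259:130 10257:140 30000-32767:150"
  local rule range priority
  for rule in $rules; do
    range="${rule%%:*}"
    priority="${rule##*:}"
    echo az network nsg rule create \
      --resource-group "$RESOURCE_GROUP" \
      --nsg-name "$NSG_NAME" \
      --name "k8s-${range}" \
      --priority "$priority" \
      --protocol Tcp \
      --destination-port-ranges "$range" \
      --access Allow
  done
}

emit_nsg_rules
```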
Step 2: Node Preparation
Clouder connects to each VM via SSH and runs preparation scripts:
Container Runtime (containerd)
# Install containerd (the containerd.io package from the Docker repo) and CRIU
apt-get update && apt-get install -y containerd.io criu
# Configure containerd to use the runc runtime with systemd cgroups,
# matching the kubelet's cgroup driver
cat > /etc/containerd/config.toml <<EOF
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF
systemctl restart containerd
Kubernetes Components
# Install kubeadm, kubelet, kubectl from the repo matching the target
# cluster version (v1.32, see Step 3)
apt-get install -y apt-transport-https ca-certificates curl gpg
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | \
  gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /" | \
  tee /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
CRIU Prerequisites
# Verify the CRIU installation on each node
criu check
# Enable the kubelet feature gate for checkpoint/restore. Note that kubeadm
# regenerates /var/lib/kubelet/config.yaml during init/join, so the gate is
# also passed via kubeletExtraArgs in the kubeadm config (Step 3).
cat > /var/lib/kubelet/config.yaml <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  ContainerCheckpoint: true
EOF
Step 3: Control Plane Initialization
Clouder generates a kubeadm init configuration and runs it on the control plane node:
Kubeadm Configuration
# clouder-kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.32.0
controlPlaneEndpoint: "<control-plane-ip>:6443"
networking:
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
apiServer:
  # v1beta4 expects extraArgs as a list of name/value pairs
  extraArgs:
    - name: feature-gates
      value: "ContainerCheckpoint=true"
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  kubeletExtraArgs:
    - name: feature-gates
      value: "ContainerCheckpoint=true"
Initialization
# On the control plane node
kubeadm init --config clouder-kubeadm-config.yaml --upload-certs
# Save the join token for worker nodes
kubeadm token create --print-join-command > /tmp/join-command.sh
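The saved join command can later be parsed back into its components for storage in Clouder's cluster state. A minimal sketch (the helper name and example values are hypothetical; the command format itself is what `kubeadm token create --print-join-command` produces):

```shell
#!/usr/bin/env bash
# Split a saved kubeadm join command into token and discovery hash.
# Function name and example values are illustrative.
set -euo pipefail

parse_join_command() {
  local cmd="$1"
  local token hash
  token="$(sed -n 's/.*--token \([^ ]*\).*/\1/p' <<<"$cmd")"
  hash="$(sed -n 's/.*--discovery-token-ca-cert-hash \([^ ]*\).*/\1/p' <<<"$cmd")"
  echo "token=${token} hash=${hash}"
}

# Example with placeholder values:
parse_join_command \
  "kubeadm join 10.0.0.4:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:deadbeef"
```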
Step 4: Worker Node Join
Clouder retrieves the join command from the control plane and executes it on each worker:
# On each worker node
kubeadm join <control-plane-ip>:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
Step 5: Post-Setup
CNI Installation
Clouder installs Flannel as the CNI plugin for pod networking:
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Flannel uses VXLAN overlay networking and expects the pod CIDR 10.244.0.0/16 (the podSubnet set in the kubeadm configuration in Step 3), which works reliably on Azure VNets where all nodes share the same subnet.
CRIU Configuration
After the cluster is running, Clouder configures the checkpoint/restore infrastructure:
- Enable the Kubelet Checkpoint API on all nodes
- Configure checkpoint storage (local disk or S3-compatible)
- Install the CRIU operator (optional, for automated checkpointing)
See the CRIU documentation for details on checkpoint and restore workflows.
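The kubelet Checkpoint API is invoked with `POST /checkpoint/{namespace}/{pod}/{container}` on the kubelet port; that endpoint path is upstream Kubernetes, while the helper below is an illustrative sketch of how Clouder might build the request URL:

```shell
#!/usr/bin/env bash
# Build the kubelet Checkpoint API URL for a container.
# The endpoint path is the upstream Kubernetes kubelet API;
# the helper function name is illustrative.
set -euo pipefail

checkpoint_url() {
  local node_ip="$1" namespace="$2" pod="$3" container="$4"
  echo "https://${node_ip}:10250/checkpoint/${namespace}/${pod}/${container}"
}

# Example: checkpoint container "nginx" in pod "web" on node 10.0.0.5.
# The real request must authenticate to the kubelet, e.g.:
#   curl -k --cert admin.crt --key admin.key -X POST \
#     "$(checkpoint_url 10.0.0.5 default web nginx)"
checkpoint_url 10.0.0.5 default web nginx
```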
Storage Provisioners
For dynamic persistent volume provisioning, clouder kubeadm setup automatically installs:
- Azure Disk CSI driver v1.30.3 for block storage (`managed-csi` StorageClass, StandardSSD_LRS)
- Azure File CSI driver v1.30.6 for NFS shared filesystems (`azure-nfs` StorageClass, Premium_LRS)
Both require an Azure service principal with Contributor access to the cluster's resource group. Clouder auto-creates one if AZURE_TENANT_ID, AZURE_CLIENT_ID, and AZURE_CLIENT_SECRET environment variables are not set.
See the kubeadm CLI reference for details on manual installation if the automatic step was skipped.
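As an illustration, the `managed-csi` StorageClass backed by the Azure Disk CSI driver could look like the following sketch (field values inferred from the description above; the exact manifest Clouder installs may differ):

```yaml
# Illustrative StorageClass for the Azure Disk CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi
provisioner: disk.csi.azure.com
parameters:
  skuName: StandardSSD_LRS
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```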
CLI Commands
Create a Cluster
# Interactive mode - prompts for all settings
clouder k8s create my-cluster
# Full specification
clouder k8s create my-cluster \
--provider azure \
--region eastus \
--control-plane-size Standard_B4ms \
--worker-size Standard_B4ms \
--worker-count 2 \
--kubernetes-version v1.32.0 \
--cni flannel \
--enable-criu
CRIU support is now built into `clouder kubeadm setup` by default. The `--enable-criu` flag above is reserved for the future managed `clouder k8s create` interface.
Get Kubeconfig
# Download kubeconfig for the cluster
clouder k8s kubeconfig my-cluster
# Set as current context
clouder k8s use my-cluster
Add Worker Nodes
# Add a GPU worker node pool
clouder k8s create-nodepool my-cluster \
gpu-workers \
--flavor Standard_NC6s_v3 \
--min 0 --desired 1 --max 5 \
--roles jupyter --xpu gpu-cuda
Cluster Lifecycle
# List clusters
clouder k8s ls
# Scale workers
clouder k8s update-nodepool my-cluster workers --desired 5
# Delete cluster (removes all VMs and resources)
clouder k8s delete my-cluster
Architecture Decisions
Why kubeadm?
- Full control: Unlike managed Kubernetes (AKS, EKS), kubeadm gives full control over the kubelet configuration, which is required for CRIU feature gates
- CRIU support: Managed Kubernetes services don't yet support the `ContainerCheckpoint` feature gate; kubeadm allows enabling it at init time
- Portability: The same cluster setup works across Azure, OVH, or bare metal
- Cost: No managed Kubernetes control plane fees
Why not Managed Kubernetes?
Managed Kubernetes services (AKS, EKS, GKE) abstract away the control plane, which means:
- Cannot enable experimental feature gates like `ContainerCheckpoint`
- Cannot configure containerd for CRIU checkpoint/restore
- Cannot access the kubelet checkpoint API directly
Once CRIU support reaches GA in Kubernetes, Clouder may add managed Kubernetes backends as well.
Roadmap
- Phase 1: Single control-plane cluster creation on Azure
- Phase 2: HA control plane (3 nodes with etcd)
- Phase 3: Multi-cloud support (OVH, bare metal)
- Phase 4: Automated CRIU checkpoint scheduling
- Phase 5: Cluster upgrades via `kubeadm upgrade`