
Tutorial: How to Migrate to Cilium (Part 1)

Nico Vibert

A network migration is a daunting task that most network engineers will eventually have to face in their career. I remember once having to plan a major network migration program. It was a Herculean challenge that took months to complete, and while it was eventually successful, we ran into many hiccups along the way (I may have caused a minor routing loop on a major service provider network when making a BGP change back in 2011 – oops!).

I am not sure whether the era of cloud native has made it any easier: on one hand, we now have programmable networks, documented APIs, and a culture of automation that should make large-scale migrations far more repeatable and consistent.

On the other hand, we’ve added multiple layers of networking – Linux networking stack, container networking, virtual network overlay – to the already complex physical network.

But whether you are looking at migrating to a new data center network or to a different Container Network Interface (CNI), the challenges remain the same:

  • How can we migrate from one network to another, with minimal disruption?
  • How do we ensure connectivity between workloads on the old network and the new network during the transition?
  • How can we repeat the migration process across a number of network nodes?

In this multi-part series, we will look at answering these questions and explore how we can elegantly migrate from any CNI to Cilium. In this first part, we will consider various migration approaches and walk through a migration from Flannel to Cilium, using a recently released feature that makes migrations easier for users.

In the next part, we will migrate from Calico to Cilium, but the same approach should apply to other CNIs too.

Note that we won’t explore why you should move to Cilium, but if you’re not convinced yet, I suggest you head over to our networking, security, and observability pages to learn more.

Migrating to Cilium Lab

If you'd rather do a migration yourself instead of reading about it, start the free hands-on lab!


CNI Migration Considerations and Approaches

Before we talk about CNI migration, we should review what a CNI does and how it actually works.

When the kubelet creates a Pod’s sandbox, the CNI specified in the configuration file located in the /etc/cni/net.d/ directory is called.

The CNI will handle the networking for a Pod – including:

  • allocating an IP address,
  • creating & configuring a network interface,
  • and (potentially) establishing an overlay network.
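If you are curious what this configuration looks like on the current cluster, you can inspect it directly on a node. The command below is only an illustration (it assumes a kind cluster, where each node is a Docker container); the exact file name and contents depend on the CNI and the installation.

# Illustrative: show the CNI configuration Flannel wrote on a node
# (file names and contents vary by CNI and installation)
docker exec kind-worker cat /etc/cni/net.d/10-flannel.conflist

We will come back to this directory later, as Cilium eventually writes its own configuration file there.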

When migrating CNIs, there are several possible approaches, each with its own pros and cons.

Migration Approaches

  1. The ideal scenario would be to build a brand new cluster and to migrate workloads (ideally, using a GitOps approach). But this can involve a lot of preparation and potential disruptions.
  2. Another method is to reconfigure /etc/cni/net.d/ to point to Cilium. However, any existing Pods will still have been configured by the old network plugin, while any new Pods will be configured by the newer CNI. To complete the migration, all Pods on the cluster that are configured by the old CNI must be recycled so that they become managed by the new CNI.
  3. A naive approach to migrating a CNI would be to reconfigure all nodes with a new CNI and then gradually restart each node in the cluster, thus replacing the CNI when the node is brought back up and ensuring that all Pods are part of the new CNI. This simple migration, while effective, comes at the cost of disrupting cluster connectivity during the rollout. Unmigrated and migrated nodes would be split into two “islands” of connectivity, and Pods would be randomly unable to reach one another until the migration is complete.

In this blog post, you will learn about a new hybrid approach.

Hybrid Migration Mode

Cilium supports a hybrid mode, where two separate overlays are established across the cluster. While Pods on a given node can only be attached to one network, they have access to both Cilium and non-Cilium Pods while the migration is taking place, as long as Cilium and the existing network use separate IP ranges.

Migration Overview

The migration process utilizes the per-node configuration feature to selectively enable Cilium CNI. This allows for a controlled rollout of Cilium without disrupting existing workloads.

Cilium will be installed, first, in a mode where it establishes an overlay but does not provide CNI networking for any pods. Then, individual nodes will be migrated.

In summary, the process looks like:

  1. Prepare the cluster and install Cilium in “secondary” mode.
  2. Cordon, drain, migrate, and reboot each node.
  3. Remove the existing network provider.
  4. (Optional) Reboot each node again.

Requirements

This approach to our migration requires the following:

  • A new, distinct Cluster CIDR for Cilium to use.
  • Use of the Cluster Pool IPAM mode.
  • A distinct network overlay, either a different protocol (Geneve instead of VXLAN for example) or port.
  • An existing network plugin that uses the Linux routing stack, such as Flannel or Calico.

Let’s now go through a migration.

Step 1 – Check the existing cluster

First, let’s have a look at migrating away from Flannel.

Flannel is a very popular and simple CNI with widespread adoption in home lab environments. It has however limited routing and security features (it does not support the use of Network Policies, does not support Ingress/Gateway API, does not benefit from the performance gains from eBPF, etc…).

Let’s first look at our Kubernetes cluster (deployed via kind). It’s made up of two worker nodes and one control plane node.

root@server:~# kubectl get nodes
NAME                 STATUS   ROLES           AGE     VERSION
kind-control-plane   Ready    control-plane   3m32s   v1.24.0
kind-worker          Ready    <none>          3m11s   v1.24.0
kind-worker2         Ready    <none>          3m11s   v1.24.0

Flannel is deployed and running with no issues:

root@server:~# kubectl get ds/kube-flannel-ds -n kube-flannel
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-flannel-ds   3         3         3       3            3           <none>          3m33s

Let’s check the PodCIDR (the IP address range from which the Pods will pick up an IP) on each node. It’s from the 10.244.0.0/16 range – take note of this, as it will be important later.

root@server:~# kubectl get node -o jsonpath="{range .items[*]}{.metadata.name} {.spec.podCIDR}{'\n'}{end}" | column -t
kind-control-plane  10.244.0.0/24
kind-worker         10.244.1.0/24
kind-worker2        10.244.2.0/24

Just to illustrate the connectivity during the migration process, we’ve deployed a Deployment of 10 nginx Pods.
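The post doesn’t show how that Deployment was created, but if you want to follow along, a minimal equivalent would look something like the command below (the image and replica count are assumptions based on the output that follows).

# Hypothetical equivalent of the Deployment used in this walkthrough
kubectl create deployment nginx-deployment --image=nginx --replicas=10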

root@server:~# kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-544dc8b7c4-d6s4r   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-dgrc2   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-dtrbd   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-gfzwm   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-ksz2q   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-mmfv9   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-n8rbj   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-pzzsl   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-vdgnv   1/1     Running   0          40s
nginx-deployment-544dc8b7c4-zdn2l   1/1     Running   0          40s
root@server:~# kubectl get deployments.apps nginx-deployment 
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   10/10   10           10          44s

The ten Pods have been distributed across both worker nodes. Flannel should have allocated each Pod an IP address from the PodCIDR of the node it runs on. Let’s verify that:

root@server:~# kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE     IP           NODE           NOMINATED NODE   READINESS GATES
nginx-deployment-544dc8b7c4-d6s4r   1/1     Running   0          9m10s   10.244.2.4   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-dgrc2   1/1     Running   0          9m10s   10.244.2.6   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-dtrbd   1/1     Running   0          9m10s   10.244.2.5   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-gfzwm   1/1     Running   0          9m10s   10.244.1.7   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-ksz2q   1/1     Running   0          9m10s   10.244.1.5   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-mmfv9   1/1     Running   0          9m10s   10.244.2.3   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-n8rbj   1/1     Running   0          9m10s   10.244.1.8   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-pzzsl   1/1     Running   0          9m10s   10.244.2.2   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-vdgnv   1/1     Running   0          9m10s   10.244.1.9   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-zdn2l   1/1     Running   0          9m10s   10.244.1.6   kind-worker    <none>           <none>

Step 2 – Prepare for the Migration

First, we need to select a new CIDR for Pods. It must be distinct from all other CIDRs in use; using a different CIDR is what lets us maintain connectivity during the migration.

For kind clusters, the default is 10.244.0.0/16 and is the one in use as we saw earlier. So, for this example, we will use 10.245.0.0/16.

Next, we need to select a different encapsulation protocol (Geneve instead of VXLAN for example) or a distinct encapsulation port. For this example, we will use VXLAN with a non-default port of 8473 (the default is 8472).
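If you want to double-check what the existing Flannel overlay is using before picking these values, you can look at its configuration. The command below assumes the standard kube-flannel manifest (ConfigMap kube-flannel-cfg in the kube-flannel namespace, as deployed in this cluster); adjust it for your own installation.

# Assumes the standard kube-flannel manifest: print Flannel's network config
# (shows the Pod network CIDR and the backend type, e.g. vxlan)
kubectl -n kube-flannel get configmap kube-flannel-cfg -o jsonpath='{.data.net-conf\.json}'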

We will now create a Cilium configuration file that we will use during the installation of Cilium. It will be based on a combination of the parameters below (defined in the Helm configuration file values-migration.yaml) and parameters specific to your own environment.

---
operator:
  unmanagedPodWatcher:
    restart: false
tunnel: vxlan
tunnelPort: 8473
cni:
  customConf: true
  uninstall: false
ipam:
  mode: "cluster-pool"
  operator:
    clusterPoolIPv4PodCIDRList: ["10.245.0.0/16"]
policyEnforcementMode: "never"
bpf:
  hostLegacyRouting: true

Let’s review some of the key parameters first:

operator:
  unmanagedPodWatcher:
    restart: false

This prevents Cilium from restarting Pods that it does not manage (we don’t want to disrupt the Pods still managed by Flannel rather than Cilium).

tunnelPort: 8473

As highlighted earlier, this setting specifies the non-default encapsulation port for VXLAN.

cni:
  customConf: true
  uninstall: false

The first setting above temporarily skips writing the CNI configuration (customConf: true). This is to prevent Cilium from taking over immediately. Note that customConf will be switched back to its default of false at the end of the migration.

The second setting (uninstall: false) prevents the CNI configuration file and plugin binaries from being removed, which is recommended during the migration.

ipam:
  mode: "cluster-pool"
  operator:
    clusterPoolIPv4PodCIDRList: ["10.245.0.0/16"]

As highlighted earlier, we recommend the use of cluster-pool IPAM mode and a distinct PodCIDR during the migration.

policyEnforcementMode: "never"

The above disables the enforcement of network policy until the migration is completed. We will enforce network policies post-migration.

bpf:
  hostLegacyRouting: true

This flag routes traffic via the host stack to provide connectivity during the migration. We will verify along the way that Flannel-managed Pods and Cilium-managed Pods can reach each other.

We now need to take these settings and adapt them to our own environment. For this, let’s use the Cilium CLI.

We saw in a previous tutorial how cilium-cli can be used to install Cilium. In this instance, we will use it to auto-detect settings specific to the underlying cluster platform (kind in this particular post but could be minikube, GKE, AKS, EKS, etc…) and use helm to install Cilium.
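If the Cilium CLI is not installed on your machine yet, the Cilium documentation describes how to download it from the project’s release artifacts. As a rough sketch (assuming a Linux amd64 host; check docs.cilium.io for the current, authoritative instructions):

# Sketch only - verify the exact commands and URLs in the official Cilium docs
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin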

With the following command, we can:

  1. Create a new Helm values file called values-initial.yaml
  2. Pull from values-migration.yaml the non-default values
  3. Fill in the missing values through the use of the helm-auto-gen-values flag
root@server:~# cilium install --helm-values values-migration.yaml --helm-auto-gen-values values-initial.yaml
🔮 Auto-detected Kubernetes kind: kind
✨ Running "kind" validation checks
✅ Detected kind version "0.14.0"
ℹ️  Using Cilium version 1.12.0
🔮 Auto-detected cluster name: kind-kind
🔮 Auto-detected datapath mode: tunnel
ℹ️  helm template --namespace kube-system cilium cilium/cilium --version 1.12.0 --set bpf.hostLegacyRouting=true,cluster.id=0,cluster.name=kind-kind,cni.customConf=true,cni.uninstall=false,encryption.nodeEncryption=false,ipam.mode=cluster-pool,ipam.operator.clusterPoolIPv4PodCIDRList[0]=10.245.0.0/16,kubeProxyReplacement=disabled,operator.replicas=1,operator.unmanagedPodWatcher.restart=false,policyEnforcementMode=never,serviceAccounts.cilium.name=cilium,serviceAccounts.operator.name=cilium-operator,tunnel=vxlan,tunnelPort=8473
ℹ️  Storing helm values file in kube-system/cilium-cli-helm-values Secret
ℹ️  Generated helm values file "values-initial.yaml" successfully written

Let’s review the created file. It is a combination of the values pulled from the values-migration.yaml file and the ones auto-generated by the Cilium CLI.

root@server:~# cat values-initial.yaml
bpf:
  hostLegacyRouting: true
cluster:
  id: 0
  name: kind-kind
cni:
  customConf: true
  uninstall: false
encryption:
  nodeEncryption: false
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
    - 10.245.0.0/16
kubeProxyReplacement: disabled
operator:
  replicas: 1
  unmanagedPodWatcher:
    restart: false
policyEnforcementMode: never
serviceAccounts:
  cilium:
    name: cilium
  operator:
    name: cilium-operator
tunnel: vxlan
tunnelPort: 8473

Step 3 – Install Cilium as a second overlay

Let’s now install Cilium using helm and the values we have just generated.

root@server:~# helm repo add cilium https://helm.cilium.io/
"cilium" has been added to your repositories
root@server:~# helm install cilium cilium/cilium --namespace kube-system --values values-initial.yaml
NAME: cilium
LAST DEPLOYED: Mon Jun  5 13:32:02 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.

Your release version is 1.13.3.

For any further help, visit https://docs.cilium.io/en/v1.13/gettinghelp

At this point, we have a cluster with Cilium installed and an overlay established, but no Pods managed by Cilium itself.

Cilium deployed as a second overlay

Let’s verify this with the cilium status command.

root@server:~# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet         cilium             Desired: 3, Ready: 3/3, Available: 3/3
Containers:       cilium             Running: 3
                  cilium-operator    Running: 1
Cluster Pods:     0/13 managed by Cilium
Image versions    cilium             quay.io/cilium/cilium:v1.13.3@sha256:77176464a1e11ea7e89e984ac7db365e7af39851507e94f137dcf56c87746314: 3
                  cilium-operator    quay.io/cilium/operator-generic:v1.13.3@sha256:fa7003cbfdf8358cb71786afebc711b26e5e44a2ed99bd4944930bba915b8910: 1

Note that none of the 13 Pods are currently managed by Cilium. That’s to be expected. You can also confirm this by checking the CNI Configuration on the node:

root@server:~# docker exec kind-worker ls /etc/cni/net.d/
10-flannel.conflist

As you can see, the Cilium CNI configuration file has not been written yet.

Step 4 – Deploy the Cilium Node Config

To migrate gradually and to minimize the disruption during the migration, we are going to be using a new feature introduced in Cilium 1.13: the CiliumNodeConfig object.

The Cilium agent process supports setting configuration on a per-node basis instead of using a single constant configuration across the cluster. This allows the global Cilium config to be overridden for a node or set of nodes, and it is managed through CiliumNodeConfig objects.

A CiliumNodeConfig object consists of a set of configuration fields and a label selector. The label selector defines which nodes the configuration applies to.

Let’s now create a per-node config that will instruct Cilium to “take over” CNI networking on the node.

root@server:~# cat <<EOF | kubectl apply --server-side -f -
apiVersion: cilium.io/v2alpha1
kind: CiliumNodeConfig
metadata:
  namespace: kube-system
  name: cilium-default
spec:
  nodeSelector:
    matchLabels:
      io.cilium.migration/cilium-default: "true"
  defaults:
    write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
    custom-cni-conf: "false"
    cni-chaining-mode: "none"
    cni-exclusive: "true"
EOF
ciliumnodeconfig.cilium.io/cilium-default serverside-applied

Initially, this will not apply to any nodes.

As you can see in the spec.nodeSelector section, the CiliumNodeConfig only applies to nodes with the io.cilium.migration/cilium-default: "true" label. We will gradually migrate nodes by applying the label to each node, one by one.
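Since the label hasn’t been applied to any node yet, you can quickly confirm that the selector currently matches nothing:

# No node carries the migration label yet, so this should return no resources
kubectl get nodes -l io.cilium.migration/cilium-default=true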

Once the node is reloaded, the custom Cilium configuration will be applied, the CNI configuration will be written and the CNI functionality will be enabled.

Step 5 – Start the Migration

Remember that we deployed 10 replicas of an nginx image earlier. You should see Pods spread across both worker nodes.

root@server:~# kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE           NOMINATED NODE   READINESS GATES
nginx-deployment-544dc8b7c4-2qv9g   1/1     Running   0          11m   10.244.2.2   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-2tqjq   1/1     Running   0          11m   10.244.1.8   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-4hs2n   1/1     Running   0          11m   10.244.2.5   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-7vbwf   1/1     Running   0          11m   10.244.1.5   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-9qjsh   1/1     Running   0          11m   10.244.1.6   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-b2x5h   1/1     Running   0          11m   10.244.1.7   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-cqkx6   1/1     Running   0          11m   10.244.1.9   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-jmfxl   1/1     Running   0          11m   10.244.2.3   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-mgm6h   1/1     Running   0          11m   10.244.2.6   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-zhs6j   1/1     Running   0          11m   10.244.2.4   kind-worker2   <none>           <none>

Cordon and Drain the Node

It is recommended to always cordon and drain the node at the beginning of the migration process, so that end users are not impacted by any potential issues.

Let’s remind ourselves of the differences between “cordon” and “drain”:

  • Cordoning a node will prevent new Pods from being scheduled on the node.
  • Draining a node will gracefully evict all the running Pods from the node. This ensures that the Pods are not abruptly terminated and that their workload is gracefully handled by other available nodes.

Let’s get started with kind-worker:

root@server:~# NODE="kind-worker"
root@server:~# kubectl cordon $NODE
node/kind-worker cordoned

To show that the node has been cordoned off, let’s scale the deployment from 10 to 12 replicas with the following command:

root@server:~# kubectl scale deployment nginx-deployment --replicas=12
deployment.apps/nginx-deployment scaled
root@server:~# kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP           NODE           NOMINATED NODE   READINESS GATES
nginx-deployment-544dc8b7c4-2qv9g   1/1     Running   0          18m   10.244.2.2   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-2tqjq   1/1     Running   0          18m   10.244.1.8   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-4hs2n   1/1     Running   0          18m   10.244.2.5   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-7vbwf   1/1     Running   0          18m   10.244.1.5   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-972w9   1/1     Running   0          9s    10.244.2.8   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-9qjsh   1/1     Running   0          18m   10.244.1.6   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-b2x5h   1/1     Running   0          18m   10.244.1.7   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-cqkx6   1/1     Running   0          18m   10.244.1.9   kind-worker    <none>           <none>
nginx-deployment-544dc8b7c4-jmfxl   1/1     Running   0          18m   10.244.2.3   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-mgm6h   1/1     Running   0          18m   10.244.2.6   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-nssjj   1/1     Running   0          9s    10.244.2.7   kind-worker2   <none>           <none>
nginx-deployment-544dc8b7c4-zhs6j   1/1     Running   0          18m   10.244.2.4   kind-worker2   <none>           <none>

As you can see, no new nginx instance is deployed on kind-worker as it’s cordoned off (that’s why we have 7 Pods on kind-worker2 and 5 on kind-worker).

Let’s now drain the node. Note that we use the --ignore-daemonsets flag, as several DaemonSets are still required to run. Be aware that draining a node automatically cordons it; we cordoned it first in this instance to make the migration process clearer.

root@server:~# kubectl drain $NODE --ignore-daemonsets
node/kind-worker already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-6rgkh, kube-system/cilium-hkktv, kube-system/install-cni-plugins-rqt5q, kube-system/kube-proxy-wnd27
evicting pod local-path-storage/local-path-provisioner-9cd9bd544-9g5ll
evicting pod default/nginx-deployment-544dc8b7c4-9qjsh
evicting pod default/nginx-deployment-544dc8b7c4-cqkx6
evicting pod default/nginx-deployment-544dc8b7c4-2tqjq
evicting pod default/nginx-deployment-544dc8b7c4-b2x5h
evicting pod kube-system/coredns-6d4b75cb6d-5gc9x
evicting pod kube-system/coredns-6d4b75cb6d-d2frk
evicting pod default/nginx-deployment-544dc8b7c4-7vbwf
pod/nginx-deployment-544dc8b7c4-b2x5h evicted
pod/nginx-deployment-544dc8b7c4-cqkx6 evicted
pod/nginx-deployment-544dc8b7c4-7vbwf evicted
pod/nginx-deployment-544dc8b7c4-2tqjq evicted
pod/nginx-deployment-544dc8b7c4-9qjsh evicted
pod/coredns-6d4b75cb6d-d2frk evicted
pod/coredns-6d4b75cb6d-5gc9x evicted
pod/local-path-provisioner-9cd9bd544-9g5ll evicted
node/kind-worker drained

Let’s verify no Pods are running on the drained node.

root@server:~# kubectl get pods -o wide | grep -c kind-worker2
12
First node cordoned and drained before migration

The 12 Pods are all running on kind-worker2. We can now label the kind-worker node: this causes the CiliumNodeConfig to apply to it.

root@server:~# kubectl label node $NODE --overwrite "io.cilium.migration/cilium-default=true"
node/kind-worker labeled

Let’s restart Cilium on the node. This will trigger the creation of the CNI configuration file.

root@server:~# kubectl -n kube-system delete pod --field-selector spec.nodeName=$NODE -l k8s-app=cilium
pod "cilium-hkktv" deleted
root@server:~# kubectl -n kube-system rollout status ds/cilium -w
Waiting for daemon set "cilium" rollout to finish: 2 of 3 updated pods are available...
daemon set "cilium" successfully rolled out

Finally, we can reboot the node. As we are using Kind, simulating a node reboot is as simple as restarting the Docker container.

root@server:~# docker restart $NODE
kind-worker

Let’s take another look at the CNI configuration file:

root@server:~# docker exec kind-worker ls /etc/cni/net.d/
05-cilium.conflist
10-flannel.conflist.cilium_bak

Note how there is now a Cilium configuration file present!

Let’s deploy a Pod and verify that Cilium allocates an IP to it.

Remember that we rolled out Cilium in cluster-pool IPAM mode, where Cilium assigns a PodCIDR to each node and allocates IPs on that node. The Cilium operator manages the per-node PodCIDRs via the CiliumNode resource.

The following command will check the CiliumNode resource and will show us the Pod CIDRs used to allocate IP addresses to the pods:

root@server:~# kubectl get cn kind-worker -o jsonpath='{.spec.ipam.podCIDRs[0]}'
10.245.1.0/24

Let’s verify that, when we deploy a Pod on the migrated node, the Pod picks up an IP from the Cilium CIDR. The command below deploys a temporary Pod on the node and outputs the Pod’s IP details (filtering on the Cilium Pod CIDR 10.245). Note that we use a toleration to override the cordon.

root@server:~# kubectl run --attach --rm --restart=Never verify  --overrides='{"spec": {"nodeName": "'$NODE'", "tolerations": [{"operator": "Exists"}]}}'   --image alpine -- /bin/sh -c 'ip addr' | grep 10.245 -B 2
8: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP qlen 1000
    link/ether 82:69:c6:96:da:7c brd ff:ff:ff:ff:ff:ff
    inet 10.245.1.234/32 scope global eth0

As you can see, the temporary Pod picks up an IP from the new range.

Let’s test connectivity between Pods on the existing overlay and the new Cilium overlay. Let’s first get the IP of one of the nginx Pods that were initially deployed. This Pod should still be on the Flannel network.

root@server:~# NGINX=($(kubectl get pods -l app=nginx -o=jsonpath='{.items[0].status.podIP}'))
echo $NGINX
10.244.1.16

This command will spin up a temporary container on the Cilium-managed network and use curl to connect to one of the nginx Pods. We use grep to filter the response so that we only see the response code.

root@server:~# kubectl run --attach --rm --restart=Never verify  --overrides='{"spec": {"nodeName": "'$NODE'", "tolerations": [{"operator": "Exists"}]}}'   --image alpine/curl --env NGINX=$NGINX -- /bin/sh -c 'curl -I $NGINX | grep HTTP'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--      0   615    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
HTTP/1.1 200 OK
pod "verify" deleted

As the HTTP response code is 200, we’ve just established that we have connectivity between the two networks during the migration!

Successful connectivity between the 2 networks!

We can finally uncordon the migrated node with:

root@server:~# kubectl uncordon $NODE
node/kind-worker uncordoned

Step 6 – Repeat for the next node(s)

We can now proceed to the migration of the next worker node.

Let’s cordon and drain the node:

root@server:~# NODE="kind-worker2"
kubectl cordon $NODE
node/kind-worker2 cordoned
root@server:~# kubectl drain $NODE --ignore-daemonsets
node/kind-worker2 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-l94bb, kube-system/cilium-rn925, kube-system/install-cni-plugins-p8vpm, kube-system/kube-proxy-dp2gv
evicting pod kube-system/coredns-6d4b75cb6d-hptw7
evicting pod default/nginx-deployment-544dc8b7c4-49h7g
evicting pod default/nginx-deployment-544dc8b7c4-sp6m9
evicting pod default/nginx-deployment-544dc8b7c4-4hs2n
evicting pod default/nginx-deployment-544dc8b7c4-jmfxl
evicting pod default/nginx-deployment-544dc8b7c4-nssjj
evicting pod kube-system/cilium-operator-6695774dc-tlj5m
evicting pod default/nginx-deployment-544dc8b7c4-972w9
evicting pod default/nginx-deployment-544dc8b7c4-vs8b4
evicting pod default/nginx-deployment-544dc8b7c4-zhs6j
evicting pod default/nginx-deployment-544dc8b7c4-2qv9g
evicting pod default/nginx-deployment-544dc8b7c4-mgm6h
evicting pod default/nginx-deployment-544dc8b7c4-wdg2d
evicting pod default/nginx-deployment-544dc8b7c4-9669p
I0605 13:53:30.328064   43580 request.go:601] Waited for 1.128178103s due to client-side throttling, not priority and fairness, request: GET:https://127.0.0.1:36085/api/v1/namespaces/default/pods/nginx-deployment-544dc8b7c4-sp6m9
pod/nginx-deployment-544dc8b7c4-vs8b4 evicted
pod/nginx-deployment-544dc8b7c4-wdg2d evicted
pod/nginx-deployment-544dc8b7c4-zhs6j evicted
pod/cilium-operator-6695774dc-tlj5m evicted
pod/nginx-deployment-544dc8b7c4-jmfxl evicted
pod/nginx-deployment-544dc8b7c4-nssjj evicted
pod/nginx-deployment-544dc8b7c4-49h7g evicted
pod/nginx-deployment-544dc8b7c4-972w9 evicted
pod/nginx-deployment-544dc8b7c4-4hs2n evicted
pod/nginx-deployment-544dc8b7c4-9669p evicted
pod/nginx-deployment-544dc8b7c4-mgm6h evicted
pod/nginx-deployment-544dc8b7c4-2qv9g evicted
pod/nginx-deployment-544dc8b7c4-sp6m9 evicted
pod/coredns-6d4b75cb6d-hptw7 evicted
node/kind-worker2 drained

Let’s verify no Pods are running on the drained node (they should have been recreated on the already-migrated node and should all be in the 10.245.0.0/16 range):

root@server:~# kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP             NODE          NOMINATED NODE   READINESS GATES
nginx-deployment-544dc8b7c4-2ssnt   1/1     Running   0          50s   10.245.1.144   kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-4npn2   1/1     Running   0          50s   10.245.1.45    kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-6xpfd   1/1     Running   0          50s   10.245.1.69    kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-9rh66   1/1     Running   0          50s   10.245.1.254   kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-dbtl9   1/1     Running   0          50s   10.245.1.177   kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-dp8tj   1/1     Running   0          50s   10.245.1.205   kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-fdss6   1/1     Running   0          50s   10.245.1.225   kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-fzrwp   1/1     Running   0          50s   10.245.1.136   kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-gp249   1/1     Running   0          50s   10.245.1.16    kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-gq6t8   1/1     Running   0          49s   10.245.1.85    kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-k2gt2   1/1     Running   0          50s   10.245.1.34    kind-worker   <none>           <none>
nginx-deployment-544dc8b7c4-zrfzs   1/1     Running   0          50s   10.245.1.36    kind-worker   <none>           <none>

The drained Pods have been restarted on the already-migrated node.

Second Node drained and Pods restarted on Cilium-managed node

It’s therefore no surprise that Cilium is now managing most of the Pods.

root@server:~# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

DaemonSet         cilium             Desired: 3, Ready: 3/3, Available: 3/3
Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Containers:       cilium-operator    Running: 1
                  cilium             Running: 3
Cluster Pods:     13/15 managed by Cilium
Image versions    cilium             quay.io/cilium/cilium:v1.13.3@sha256:77176464a1e11ea7e89e984ac7db365e7af39851507e94f137dcf56c87746314: 3
                  cilium-operator    quay.io/cilium/operator-generic:v1.13.3@sha256:fa7003cbfdf8358cb71786afebc711b26e5e44a2ed99bd4944930bba915b8910: 1

We can now label the node, restart Cilium on it, reboot the node and uncordon it like we did earlier.

root@server:~# kubectl label node $NODE --overwrite "io.cilium.migration/cilium-default=true"
node/kind-worker2 labeled
root@server:~# 
root@server:~# kubectl -n kube-system delete pod --field-selector spec.nodeName=$NODE -l k8s-app=cilium
kubectl -n kube-system rollout status ds/cilium -w
pod "cilium-rn925" deleted
Waiting for daemon set "cilium" rollout to finish: 2 of 3 updated pods are available...
daemon set "cilium" successfully rolled out
root@server:~# 
root@server:~# docker restart $NODE
kind-worker2
root@server:~# kubectl uncordon $NODE
node/kind-worker2 uncordoned

You can now repeat the same process on the remaining nodes until the cluster is completely migrated.
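If you have several nodes left to migrate, the per-node sequence can be wrapped in a simple loop. This is only a sketch based on the commands we ran above: the node list is specific to this kind cluster (only the control-plane node is left here), and docker restart only simulates a reboot on kind; use your platform’s own reboot mechanism elsewhere.

# Sketch: repeat the migration sequence for each remaining node.
# Node names and the reboot step are environment-specific.
for NODE in kind-control-plane; do
  kubectl cordon "$NODE"
  kubectl drain "$NODE" --ignore-daemonsets
  kubectl label node "$NODE" --overwrite "io.cilium.migration/cilium-default=true"
  kubectl -n kube-system delete pod --field-selector spec.nodeName="$NODE" -l k8s-app=cilium
  kubectl -n kube-system rollout status ds/cilium -w
  docker restart "$NODE"        # kind only: simulates a node reboot
  kubectl uncordon "$NODE"
done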

At the end, the status of Cilium should be OK and all pods should be managed by Cilium:

root@server:~# cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet         cilium             Desired: 3, Ready: 3/3, Available: 3/3
Containers:       cilium-operator    Running: 1
                  cilium             Running: 3
Cluster Pods:     15/15 managed by Cilium
Image versions    cilium-operator    quay.io/cilium/operator-generic:v1.13.3@sha256:fa7003cbfdf8358cb71786afebc711b26e5e44a2ed99bd4944930bba915b8910: 1
                  cilium             quay.io/cilium/cilium:v1.13.3@sha256:77176464a1e11ea7e89e984ac7db365e7af39851507e94f137dcf56c87746314: 3

Step 7 – Clean-up post-migration

Now that the migration has been completed, let’s update the Cilium configuration to support Network Policies and remove the previous network plugin.

Now that Cilium is healthy, let’s first generate the updated configuration file.

root@server:~# cilium install --helm-values values-initial.yaml --helm-set operator.unmanagedPodWatcher.restart=true --helm-set cni.customConf=false --helm-set policyEnforcementMode=default --helm-auto-gen-values values-final.yaml
🔮 Auto-detected Kubernetes kind: kind
✨ Running "kind" validation checks
✅ Detected kind version "0.14.0"
ℹ️  Using Cilium version 1.12.0
🔮 Auto-detected cluster name: kind-kind
🔮 Auto-detected datapath mode: tunnel
ℹ️  helm template --namespace kube-system cilium cilium/cilium --version 1.12.0 --set bpf.hostLegacyRouting=true,cluster.id=0,cluster.name=kind-kind,cni.customConf=false,cni.uninstall=false,encryption.nodeEncryption=false,ipam.mode=cluster-pool,ipam.operator.clusterPoolIPv4PodCIDRList[0]=10.245.0.0/16,kubeProxyReplacement=disabled,operator.replicas=1,operator.unmanagedPodWatcher.restart=true,policyEnforcementMode=default,serviceAccounts.cilium.name=cilium,serviceAccounts.operator.name=cilium-operator,tunnel=vxlan,tunnelPort=8473
ℹ️  Storing helm values file in kube-system/cilium-cli-helm-values Secret
ℹ️  Generated helm values file "values-final.yaml" successfully written

Again, we are using the cilium-cli to generate an updated Helm config file. As you can see from checking the differences between the two files, we are only changing three parameters.

root@server:~# diff values-initial.yaml values-final.yaml
7c7
<   customConf: true
---
>   customConf: false
20,21c20,21
<     restart: false
< policyEnforcementMode: never
---
>     restart: true
> policyEnforcementMode: default

We are:

  1. Enabling Cilium to write the CNI configuration file.
  2. Enabling Cilium to restart unmanaged Pods.
  3. Enabling Network Policy Enforcement.

Let’s apply it:

root@server:~# helm upgrade --namespace kube-system cilium cilium/cilium --values values-final.yaml
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Mon Jun  5 14:01:47 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.

Your release version is 1.13.3.

For any further help, visit https://docs.cilium.io/en/v1.13/gettinghelp
root@server:~# kubectl -n kube-system rollout restart daemonset cilium
daemonset.apps/cilium restarted
root@server:~# cilium status --wait

    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         disabled
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet         cilium             Desired: 3, Ready: 3/3, Available: 3/3
Containers:       cilium             Running: 3
                  cilium-operator    Running: 1
Cluster Pods:     15/15 managed by Cilium
Image versions    cilium             quay.io/cilium/cilium:v1.13.3@sha256:77176464a1e11ea7e89e984ac7db365e7af39851507e94f137dcf56c87746314: 3
                  cilium-operator    quay.io/cilium/operator-generic:v1.13.3@sha256:fa7003cbfdf8358cb71786afebc711b26e5e44a2ed99bd4944930bba915b8910: 1

Let’s remove Flannel as it is no longer needed:

root@server:~# kubectl delete -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
namespace "kube-flannel" deleted
serviceaccount "flannel" deleted
clusterrole.rbac.authorization.k8s.io "flannel" deleted
clusterrolebinding.rbac.authorization.k8s.io "flannel" deleted
configmap "kube-flannel-cfg" deleted
daemonset.apps "kube-flannel-ds" deleted

And we are done!

Migration completed!

Conclusion

Migrating CNIs is not a task most users look forward to, but we think this new method gives users the option to gracefully migrate their clusters to Cilium. We also think the experience can be improved even further by leveraging the new CRD and building tooling around it to facilitate migrations for larger clusters.

In the next post, coming soon, we will look at a migration from Calico to Cilium and some of the gotchas to consider.

We would love your feedback – you can find us on the Cilium Slack channel!

Nico Vibert, Senior Staff Technical Marketing Engineer
