A network migration is a daunting task that most network engineers will eventually face in their career. I remember once having to plan a major network migration program. It was a Herculean challenge that took months to complete, and while it was eventually successful, we ran into many hiccups along the way (I may have caused a minor routing loop on a major service provider network when making a BGP change back in 2011 – oops!).
I am not sure whether the era of cloud native has made it any easier: on the one hand, we now have programmable networks and documented APIs, and we have adopted a culture of automation that should make large-scale migrations far more repeatable and consistent.
On the other hand, we’ve added multiple layers of networking – Linux networking stack, container networking, virtual network overlay – to the already complex physical network.
But whether you are looking at migrating to a new data center network or to a different Container Network Interface (CNI), the challenges remain the same:
How can we migrate from one network to another, with minimal disruption?
How can we ensure connectivity between workloads on the old network and the new network during the transition?
How can we repeat the migration process across a number of network nodes?
In this blog post, we will look at answering these questions and explore how we can elegantly migrate from any CNI to Cilium. We will consider various migration approaches and walk through a migration from Flannel to Cilium, using a recently released feature that makes migrations easier for users. While this blog post focuses on migrating from Flannel, the same approach should apply to other CNIs too.
Note that we won’t explore why you should move to Cilium, but if you’re not convinced yet, I suggest you head over to our networking, security and observability pages to learn more.
Migrating to Cilium Lab
If you'd rather do a migration yourself instead of reading about it, start the free hands-on lab below!
Before we talk about CNI migration, we should review what a CNI does and how it actually works.
When the kubelet creates a Pod’s sandbox, the CNI specified in the configuration file located in the /etc/cni/net.d/ directory is called.
The CNI will handle the networking for a Pod – including:
allocating an IP address,
creating & configuring a network interface,
and (potentially) establishing an overlay network.
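To make this concrete, here is a minimal sketch of what such a configuration file can look like. This is a representative Flannel conflist for illustration, not one taken from the cluster used in this post:
{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}
The kubelet calls the first plugin in the list (flannel here) to wire up the Pod's interface and IP, then chains into the others (portmap here) for additional capabilities.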
When migrating CNIs, there are several approaches, each with pros and cons.
Migration Approaches
The ideal scenario would be to build a brand new cluster and to migrate workloads (ideally, using a GitOps approach). But this can involve a lot of preparation and potential disruptions.
Another method consists of reconfiguring /etc/cni/net.d/ to point to Cilium. However, any existing Pods will still have been configured by the old network plugin, while any new Pods will be configured by the new CNI. To complete the migration, all Pods on the cluster that were configured by the old CNI must be recycled so that they are managed by the new CNI.
A naive approach to migrating a CNI would be to reconfigure all nodes with a new CNI and then gradually restart each node in the cluster, thus replacing the CNI when the node is brought back up and ensuring that all pods are part of the new CNI. This simple migration, while effective, comes at the cost of disrupting cluster connectivity during the rollout. Unmigrated and migrated nodes would be split into two “islands” of connectivity, and pods would be randomly unable to reach one another until the migration is complete.
In this blog post, you will learn about a new hybrid approach.
Hybrid Migration Mode
Cilium supports a hybrid mode, where two separate overlays are established across the cluster. While Pods on a given node can only be attached to one network, they have access to both Cilium and non-Cilium Pods while the migration is taking place, as long as Cilium and the existing network use separate IP ranges.
Migration Overview
The migration process utilizes the per-node configuration feature to selectively enable Cilium CNI. This allows for a controlled rollout of Cilium without disrupting existing workloads.
Cilium will first be installed in a mode where it establishes an overlay but does not provide CNI networking for any pods. Then, individual nodes will be migrated.
In summary, the process looks like:
Prepare the cluster and install Cilium in “secondary” mode.
Cordon, drain, migrate, and reboot each node.
Remove the existing network provider.
(Optional) Reboot each node again.
Requirements
This approach to our migration requires the following:
A distinct network overlay, using either a different protocol (Geneve instead of VXLAN, for example) or a different port.
An existing network plugin that uses the Linux routing stack, such as Flannel or Calico.
Let’s now go through a migration.
Step 1 – Check the existing cluster
First, let’s have a look at migrating away from Flannel.
Flannel is a very popular and simple CNI with widespread adoption in home lab environments. However, it has limited routing and security features: it does not support Network Policies or Ingress/Gateway API, and it does not benefit from the performance gains of eBPF.
Let’s first look at our Kubernetes cluster (deployed via kind). It’s made up of two worker nodes and one control plane node.
root@server:~# kubectl get nodes
NAME                 STATUS   ROLES           AGE     VERSION
kind-control-plane   Ready    control-plane   3m32s   v1.24.0
kind-worker          Ready    <none>          3m11s   v1.24.0
kind-worker2         Ready    <none>          3m11s   v1.24.0
Flannel is deployed and running with no issues:
root@server:~# kubectl get ds/kube-flannel-ds -n kube-flannel
NAME              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kube-flannel-ds   3         3         3       3            3           <none>          3m33s
Let’s check the PodCIDR (the IP address range from which the Pods on each node get their IP addresses) on each node. It’s from the 10.244.0.0/16 range – take note of this, as it will be important later.
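A quick way to check this is to list each node’s PodCIDR. The command below is one option among several, and the output shown is what a typical kind cluster using the default range would report:
root@server:~# kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR
NAME                 PODCIDR
kind-control-plane   10.244.0.0/24
kind-worker          10.244.1.0/24
kind-worker2         10.244.2.0/24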
The ten NGINX Pods we deployed earlier have been distributed across both worker nodes. Flannel will have allocated their IP addresses from the PodCIDRs of the nodes where the Pods are scheduled. Let’s verify that:
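One way to verify this (a sketch; the app=nginx label matches the selector used later in this post) is to list the Pods with their IPs and confirm they all fall within the 10.244.0.0/16 range:
root@server:~# kubectl get pods -l app=nginx -o wide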
Step 2 – Prepare the Migration
First, we need to select a new CIDR for Pods. It must be distinct from all other CIDRs in use, and choosing a different CIDR will enable us to maintain connectivity during the migration.
For kind clusters, the default is 10.244.0.0/16 and is the one in use as we saw earlier. So, for this example, we will use 10.245.0.0/16.
Next, we need to select a different encapsulation protocol (Geneve instead of VXLAN for example) or a distinct encapsulation port. For this example, we will use VXLAN with a non-default port of 8473 (the default is 8472).
We will now create a Cilium configuration file to use during the installation of Cilium. It will be based on a combination of the parameters below (defined in the Helm configuration file values-migration.yaml) and parameters specific to your own environment.
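The first of these parameters (a sketch consistent with the Cilium migration guide; check it against your own values-migration.yaml) is:
operator:
  unmanagedPodWatcher:
    restart: false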
This is there to prevent Cilium from restarting Pods that are not being managed by Cilium (we don’t want to disrupt the Pods that are managed by Flannel and not by Cilium).
tunnelPort: 8473
As highlighted earlier, this setting specifies the non-default encapsulation port for VXLAN.
cni:
  customConf: true
  uninstall: false
The first setting above temporarily skips writing the CNI configuration (customConf: true). This prevents Cilium from taking over immediately. Note that customConf will be switched back to its default of false at the end of the migration.
The second setting above prevents the CNI configuration file and plugin binaries from being removed (uninstall: false), which is recommended during the migration.
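The IPAM parameters (again, a sketch based on the Cilium migration guide, using the 10.245.0.0/16 range we selected earlier) look like:
ipam:
  mode: "cluster-pool"
  operator:
    clusterPoolIPv4PodCIDRList: ["10.245.0.0/16"]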
As highlighted earlier, we recommend the use of cluster-pool IPAM mode and a distinct PodCIDR during the migration.
policyEnforcementMode: "never"
The above disables the enforcement of network policy until the migration is completed. We will enforce network policies post-migration.
bpf:
  hostLegacyRouting: true
This flag routes traffic via the host stack to provide connectivity during the migration. We will verify during the migration that Flannel-managed Pods and Cilium-managed Pods have connectivity.
We now need to take these settings and combine them with values specific to our own environment. For this, let’s use the Cilium CLI.
We saw in a previous tutorial how cilium-cli can be used to install Cilium. In this instance, we will use it to auto-detect settings specific to the underlying cluster platform (kind in this particular post, but it could be minikube, GKE, AKS, EKS, etc.) and use Helm to install Cilium.
With the following command, we can:
Create a new Helm values file called values-initial.yaml
Pull from values-migration.yaml the non-default values
Fill in the missing values through the use of the helm-auto-gen-values flag
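A sketch of that command, adapted from the Cilium migration guide (the exact flags may vary slightly between cilium-cli versions):
root@server:~# cilium install --helm-values values-migration.yaml --helm-auto-gen-values values-initial.yaml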
Let’s review the created file. It is a combination of the values pulled from the values-migration.yaml file and the ones auto-generated by the Cilium CLI.
Step 3 – Install Cilium in Secondary Mode
Let’s now install Cilium using Helm and the values we have just generated.
root@server:~# helm repo add cilium https://helm.cilium.io/
"cilium" has been added to your repositories
root@server:~# helm install cilium cilium/cilium --namespace kube-system --values values-initial.yaml
NAME: cilium
LAST DEPLOYED: Mon Jun  5 13:32:02 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble.
Your release version is 1.13.3.
For any further help, visit https://docs.cilium.io/en/v1.13/gettinghelp
At this point, we have a cluster with Cilium installed and an overlay established, but no Pods managed by Cilium itself.
Note that none of the 13 Pods are currently managed by Cilium. That’s to be expected. You can also confirm this by checking the CNI Configuration on the node:
root@server:~# docker exec kind-worker ls /etc/cni/net.d/
10-flannel.conflist
As you can see, the Cilium CNI configuration file has not been written yet.
Step 4 – Deploy the Cilium Node Config
To migrate gradually and to minimize the disruption during the migration, we are going to be using a new feature introduced in Cilium 1.13: the CiliumNodeConfig object.
The Cilium agent process supports setting configuration on a per-node basis instead of using one constant configuration across the cluster. This allows the global Cilium config to be overridden for a node or set of nodes, and it is managed via CiliumNodeConfig objects.
A CiliumNodeConfig object consists of a set of fields and a label selector. The label selector defines to which nodes the configuration applies.
Let’s now create a per-node config that will instruct Cilium to “take over” CNI networking on the node.
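The manifest below is a sketch consistent with the Cilium 1.13 migration guide. The defaults shown, such as the 05-cilium.conflist filename, match what we will observe later in this post, but verify them against the documentation for your Cilium version:
apiVersion: cilium.io/v2alpha1
kind: CiliumNodeConfig
metadata:
  namespace: kube-system
  name: cilium-default
spec:
  nodeSelector:
    matchLabels:
      io.cilium.migration/cilium-default: "true"
  defaults:
    # Write the CNI configuration only once the agent is ready on the node
    write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
    custom-cni-conf: "false"
    cni-chaining-mode: "none"
    cni-exclusive: "true"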
As you can see in the spec.nodeSelector section, the CiliumNodeConfig only applies to nodes with the io.cilium.migration/cilium-default: "true" label. We will gradually migrate nodes by applying the label to each node, one by one.
Once the node is reloaded, the custom Cilium configuration will be applied, the CNI configuration will be written and the CNI functionality will be enabled.
Step 5 – Start the Migration
Remember that we deployed 10 replicas of an nginx image earlier. You should see Pods spread across both worker nodes.
It is recommended to always cordon and drain at the beginning of the migration process, so that end-users are not impacted by any potential issues.
Let’s remind ourselves of the differences between “cordon” and “drain”:
Cordoning a node will prevent new Pods from being scheduled on the node.
Draining a node will gracefully evict all the running Pods from the node. This ensures that the Pods are not abruptly terminated and that their workload is gracefully handled by other available nodes.
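A sketch of this step follows. The node name matches the drain output shown below; the scale-up of the nginx Deployment from 10 to 12 replicas is inferred from the Pod counts in this walkthrough:
root@server:~# NODE="kind-worker"
root@server:~# kubectl cordon $NODE
node/kind-worker cordoned
root@server:~# kubectl scale deployment nginx-deployment --replicas=12
deployment.apps/nginx-deployment scaled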
As you can see, after the scale-up, no new nginx instance is deployed on kind-worker as it’s cordoned off (that’s why we have 7 Pods on kind-worker2 and still only 5 on kind-worker).
Let’s now drain the node. Note that we use the --ignore-daemonsets flag as several DaemonSets still need to run. Note also that draining a node automatically cordons it; we cordoned first in this instance to provide clarity in the migration process.
root@server:~# kubectl drain $NODE --ignore-daemonsets
node/kind-worker already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-6rgkh, kube-system/cilium-hkktv, kube-system/install-cni-plugins-rqt5q, kube-system/kube-proxy-wnd27
evicting pod local-path-storage/local-path-provisioner-9cd9bd544-9g5ll
evicting pod default/nginx-deployment-544dc8b7c4-9qjsh
evicting pod default/nginx-deployment-544dc8b7c4-cqkx6
evicting pod default/nginx-deployment-544dc8b7c4-2tqjq
evicting pod default/nginx-deployment-544dc8b7c4-b2x5h
evicting pod kube-system/coredns-6d4b75cb6d-5gc9x
evicting pod kube-system/coredns-6d4b75cb6d-d2frk
evicting pod default/nginx-deployment-544dc8b7c4-7vbwf
pod/nginx-deployment-544dc8b7c4-b2x5h evicted
pod/nginx-deployment-544dc8b7c4-cqkx6 evicted
pod/nginx-deployment-544dc8b7c4-7vbwf evicted
pod/nginx-deployment-544dc8b7c4-2tqjq evicted
pod/nginx-deployment-544dc8b7c4-9qjsh evicted
pod/coredns-6d4b75cb6d-d2frk evicted
pod/coredns-6d4b75cb6d-5gc9x evicted
pod/local-path-provisioner-9cd9bd544-9g5ll evicted
node/kind-worker drained
Let’s verify no Pods are running on the drained node.
root@server:~# kubectl get pods -o wide | grep -c kind-worker2
12
The 12 pods are all running on kind-worker2. We can now label the node: this causes the CiliumNodeConfig to apply to this node.
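The label matches the nodeSelector of the CiliumNodeConfig we created earlier:
root@server:~# kubectl label node $NODE --overwrite "io.cilium.migration/cilium-default=true"
node/kind-worker labeled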
Let’s restart Cilium on the node. That will trigger the creation of the CNI configuration file.
root@server:~# kubectl -n kube-system delete pod --field-selector spec.nodeName=$NODE -l k8s-app=cilium
pod "cilium-hkktv" deleted
root@server:~# kubectl -n kube-system rollout status ds/cilium -w
Waiting for daemon set "cilium" rollout to finish: 2 of 3 updated pods are available...
daemon set "cilium" successfully rolled out
Finally, we can reboot the node. As we are using Kind, simulating a node reboot is as simple as restarting the Docker container.
root@server:~# docker restart $NODE
kind-worker
Let’s take another look at the CNI configuration file:
root@server:~# docker exec kind-worker ls /etc/cni/net.d/
05-cilium.conflist
10-flannel.conflist.cilium_bak
Note how there is now a Cilium configuration file present!
Let’s deploy a Pod and verify that Cilium allocates the IP to the Pod.
Remember that we rolled out Cilium in cluster-pool IPAM mode, where Cilium assigns per-node PodCIDRs and allocates IPs on each node. The Cilium operator manages the per-node PodCIDRs via the CiliumNode resource.
The following command will check the CiliumNode resource and will show us the Pod CIDRs used to allocate IP addresses to the pods:
root@server:~# kubectl get cn kind-worker -o jsonpath='{.spec.ipam.podCIDRs[0]}'
10.245.1.0/24
Let’s verify that, when we deploy a Pod on the migrated node, the Pod picks up an IP from the Cilium CIDR. The command below deploys a temporary Pod on the node and outputs the Pod’s IP details (filtering on the Cilium Pod CIDR 10.245). Note that we use a toleration to override the cordon.
root@server:~# kubectl run --attach --rm --restart=Never verify --overrides='{"spec": {"nodeName": "'$NODE'", "tolerations": [{"operator": "Exists"}]}}' --image alpine -- /bin/sh -c 'ip addr' | grep 10.245 -B 2
8: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP qlen 1000
    link/ether 82:69:c6:96:da:7c brd ff:ff:ff:ff:ff:ff
    inet 10.245.1.234/32 scope global eth0
As you can see, the temporary Pod picks up an IP from the new range.
Let’s test connectivity between Pods on the existing overlay and the new Cilium overlay. First, let’s get the IP of one of the NGINX Pods that was initially deployed. This Pod should still be on the Flannel network.
root@server:~# NGINX=($(kubectl get pods -l app=nginx -o=jsonpath='{.items[0].status.podIP}'))
root@server:~# echo $NGINX
10.244.1.16
This command will spin up a temporary container on the Cilium-managed network that will connect with curl to one of the nginx pods. We use grep to filter the response so that we only see the response code.
root@server:~# kubectl run --attach --rm --restart=Never verify --overrides='{"spec": {"nodeName": "'$NODE'", "tolerations": [{"operator": "Exists"}]}}' --image alpine/curl --env NGINX=$NGINX -- /bin/sh -c 'curl -I $NGINX | grep HTTP'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0   615    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
HTTP/1.1 200 OK
pod "verify" deleted
As the HTTP response code is a successful 200, we’ve just established that we have successful connectivity during the migration!
We can now proceed to the migration of the next worker node.
Let’s cordon and drain the node:
root@server:~# NODE="kind-worker2"
root@server:~# kubectl cordon $NODE
node/kind-worker2 cordoned
root@server:~# kubectl drain $NODE --ignore-daemonsets
node/kind-worker2 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-flannel/kube-flannel-ds-l94bb, kube-system/cilium-rn925, kube-system/install-cni-plugins-p8vpm, kube-system/kube-proxy-dp2gv
evicting pod kube-system/coredns-6d4b75cb6d-hptw7
evicting pod default/nginx-deployment-544dc8b7c4-49h7g
evicting pod default/nginx-deployment-544dc8b7c4-sp6m9
evicting pod default/nginx-deployment-544dc8b7c4-4hs2n
evicting pod default/nginx-deployment-544dc8b7c4-jmfxl
evicting pod default/nginx-deployment-544dc8b7c4-nssjj
evicting pod kube-system/cilium-operator-6695774dc-tlj5m
evicting pod default/nginx-deployment-544dc8b7c4-972w9
evicting pod default/nginx-deployment-544dc8b7c4-vs8b4
evicting pod default/nginx-deployment-544dc8b7c4-zhs6j
evicting pod default/nginx-deployment-544dc8b7c4-2qv9g
evicting pod default/nginx-deployment-544dc8b7c4-mgm6h
evicting pod default/nginx-deployment-544dc8b7c4-wdg2d
evicting pod default/nginx-deployment-544dc8b7c4-9669p
I0605 13:53:30.328064   43580 request.go:601] Waited for 1.128178103s due to client-side throttling, not priority and fairness, request: GET:https://127.0.0.1:36085/api/v1/namespaces/default/pods/nginx-deployment-544dc8b7c4-sp6m9
pod/nginx-deployment-544dc8b7c4-vs8b4 evicted
pod/nginx-deployment-544dc8b7c4-wdg2d evicted
pod/nginx-deployment-544dc8b7c4-zhs6j evicted
pod/cilium-operator-6695774dc-tlj5m evicted
pod/nginx-deployment-544dc8b7c4-jmfxl evicted
pod/nginx-deployment-544dc8b7c4-nssjj evicted
pod/nginx-deployment-544dc8b7c4-49h7g evicted
pod/nginx-deployment-544dc8b7c4-972w9 evicted
pod/nginx-deployment-544dc8b7c4-4hs2n evicted
pod/nginx-deployment-544dc8b7c4-9669p evicted
pod/nginx-deployment-544dc8b7c4-mgm6h evicted
pod/nginx-deployment-544dc8b7c4-2qv9g evicted
pod/nginx-deployment-544dc8b7c4-sp6m9 evicted
pod/coredns-6d4b75cb6d-hptw7 evicted
node/kind-worker2 drained
Let’s verify that no Pods are running on the drained node (they should have been recreated on the already-migrated node and should all have IPs in the 10.245.0.0/16 range):
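A sketch of that check, filtering on the new CIDR as we did earlier:
root@server:~# kubectl get pods -o wide | grep 10.245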
After kind-worker2 is migrated with the same sequence (label, restart Cilium, reboot), we can finalize the migration. Again, we are using the cilium-cli to generate an updated Helm config file. As you can see from checking the differences between the two files, we are only changing three parameters.
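That command is sketched below, based on the Cilium migration guide (the three parameters being reverted are the ones we overrode for the migration; parameter names may vary by version):
root@server:~# cilium install --helm-values values-initial.yaml --helm-set operator.unmanagedPodWatcher.restart=true --helm-set cni.customConf=false --helm-set policyEnforcementMode=default --helm-auto-gen-values values-final.yaml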
Migrating CNIs is not a task most users look forward to, but we think this new method gives users the option to gracefully migrate their clusters to Cilium. We also think the experience can be improved even further by leveraging the new CRD and building tooling around it to facilitate migrations for larger clusters.
We would love your feedback – you can find us on the Cilium Slack channel!
Nico Vibert is a Senior Staff Technical Marketing Engineer at Isovalent, the company behind the open-source cloud-native solution Cilium.
Prior to joining Isovalent, Nico worked in many different roles—operations and support, design and architecture, and technical pre-sales—at companies such as HashiCorp, VMware, and Cisco.
In his current role, Nico focuses primarily on creating content to make networking a more approachable field and regularly speaks at events like KubeCon, VMworld, and Cisco Live.
Nico has held over 15 networking certifications, including the Cisco Certified Internetwork Expert CCIE (# 22990).
Nico is now the Lead Subject Matter Expert on the Cilium Certified Associate (CCA) certification.
Outside of Isovalent, Nico is passionate about intentional diversity & inclusion initiatives and is Chief DEI Officer at the Open Technology organization OpenUK. You can find out more about him on his blog.
Migrating to Cilium from another CNI is a very common task. But how do we minimize the impact during the migration? How do we ensure Pods on the legacy CNI can still communicate with Cilium-managed Pods during the migration? How do we execute the migration safely, while avoiding an overly complex approach or a separate tool such as Multus?
With the use of the new Cilium CRD CiliumNodeConfig, running clusters can be migrated on a node-by-node basis, without disrupting existing traffic or requiring a complete cluster outage or rebuild.
In this lab, you will migrate your cluster from an existing CNI to Cilium. While we use Flannel in this simple lab, you can leverage the same approach for other CNIs.